How to do cpuid and rdtsc on 32 bit and 64 bit Windows

By Stephen Kellett
4 June, 2010

With the introduction of WIN64, the C++ compiler has many improvements and certain restrictions. One of those restrictions is that the 64-bit compiler does not support inline assembly code. For those few of us who write hooking software this is a real inconvenience. Inline assembly is also useful for adding little snippets of code to access hardware registers that are not so easy to access from C or C++.

The 64-bit compiler also introduces some intrinsics which are defined in intrin.h. These intrinsics allow you to add calls to low-level functionality in your 64-bit code. Such functionality includes setting breakpoints, getting CPU and hardware information (cpuid instruction) and reading the hardware timestamp counter (rdtsc instruction).

In this article, I’ll show you how you can use the same code for both 32-bit and 64-bit builds to have access to these intrinsics on both platforms.

__debugbreak()

The 64-bit compiler provides a convenient way for you to hard code breakpoints into your code, which is often very useful during testing. The __debugbreak() intrinsic provides this functionality.

Older 32-bit compilers have no __debugbreak();

For those compilers, you have to know a little 80386 assembly. The breakpoint instruction is int 3 (opcode 0xcc). The inline assembly for this is __asm int 3;

	#define __debugbreak()				__asm { int 3 }

cpuid – 32 bit

On 32-bit systems the inline assembler may not recognise the cpuid mnemonic (older compilers do not), so you have to use the emit directive to insert the opcode bytes (0x0f 0xa2) directly.

	#define cpuid	__asm __emit 0fh __asm __emit 0a2h

and then you can use cpuid anywhere you need to.

	void doCpuid()
	{
		__asm pushad;		// save all the registers - cpuid trashes EAX, EBX, ECX, EDX
		__asm mov eax, 0; 	// get simplest cpuid data
		
		cpuid;			// call cpuid, results returned in EAX, EBX, ECX, EDX
		
		// read cpuid results here before restoring the registers...

		__asm popad;		// restore registers
	}

cpuid – 64 bit

On 64-bit systems you have to use the intrinsic __cpuid(registers, 0) provided in the intrin.h file. The results are stored in the array in the order EAX, EBX, ECX, EDX.

	void doCpuid()
	{
	    int registers[4];

	    __cpuid(registers, 0);
	}
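
As an illustration of what you can do with the registers array, here is a minimal sketch that decodes the CPU vendor string. For cpuid type 0 the twelve characters are returned in EBX, EDX, ECX (in that order), which are registers[1], registers[3] and registers[2]. The helper name vendorFromRegisters is my own invention for this example and is not part of intrin.h. It takes the array as a parameter so that you can try it with sample values.

```cpp
#include <string>

// Hypothetical helper (not part of intrin.h): builds the CPU vendor string
// from the four values that __cpuid(registers, 0) stores in the array.
// registers[0] = EAX, [1] = EBX, [2] = ECX, [3] = EDX.
inline std::string vendorFromRegisters(const int registers[4])
{
    // The vendor string is stored in EBX, EDX, ECX order.
    const int order[3] = { 1, 3, 2 };

    std::string vendor;
    for (int i = 0; i < 3; ++i)
    {
        const unsigned int value = static_cast<unsigned int>(registers[order[i]]);

        // Each register holds four characters, lowest byte first.
        for (int byte = 0; byte < 4; ++byte)
            vendor += static_cast<char>((value >> (8 * byte)) & 0xff);
    }
    return vendor;
}
```

In a real build you would call __cpuid(registers, 0) first and then pass registers to the helper.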

rdtsc – 32 bit

On 32-bit systems, the inline assembler may not recognise the rdtsc mnemonic either, so again you have to use the emit directive to insert the opcode bytes (0x0f 0x31).

	#define rdtsc	__asm __emit 0fh __asm __emit 031h

and then you can use rdtsc anywhere you need to.

	__int64 getTimeStamp()
	{
	    LARGE_INTEGER li;

	    rdtsc;

	    __asm	mov	li.LowPart, eax;
	    __asm	mov	li.HighPart, edx;
	    return li.QuadPart;
	}
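
If the LARGE_INTEGER trick looks opaque, the sketch below shows the same combination written out with shifts: rdtsc leaves the low 32 bits of the counter in EAX and the high 32 bits in EDX. The helper name combineTimeStamp is hypothetical and exists only for this illustration. It uses standard C++ types rather than __int64 so it can be tried with any compiler.

```cpp
// Hypothetical helper: joins the two 32-bit halves that rdtsc leaves in
// EAX (low) and EDX (high) into one 64-bit value. This is the same value
// you read from LARGE_INTEGER::QuadPart after filling LowPart and HighPart.
inline unsigned long long combineTimeStamp(unsigned int lowPart,
                                           unsigned int highPart)
{
    // Widen the high half before shifting so no bits are lost.
    return (static_cast<unsigned long long>(highPart) << 32) | lowPart;
}
```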

rdtsc – 64 bit

On 64-bit systems, you have to use the intrinsic __rdtsc() provided in the intrin.h file.

	__int64 getTimeStamp()
	{
	    return __rdtsc();
	}
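
A typical use for the timestamp is measuring how long a piece of code takes. The sketch below is one way you might do that. The name measureTicks and its parameters are assumptions made for this example. The counter is passed in as a callable, so the same code works with getTimeStamp() above or with a fake counter when testing.

```cpp
// Hypothetical helper: measures how many counter ticks a piece of work takes.
// readCounter is any callable returning a 64-bit count (for example a
// function that wraps __rdtsc()); work is the code being timed.
template <typename Counter, typename Work>
unsigned long long measureTicks(Counter readCounter, Work work)
{
    const unsigned long long start = readCounter();
    work();
    const unsigned long long end = readCounter();

    // Unsigned subtraction still gives the right answer if the counter
    // wrapped once between the two reads.
    return end - start;
}
```

Treat the result as an approximation: rdtsc is not a serialising instruction, and the thread may move between processors while the work is running.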

That’s not very portable, is it?

The problem with the above approach is that you end up having two implementations for these functions – one for your 32-bit build and one for your 64-bit build. It would be much more elegant to have a drop-in replacement that you can use in your 32-bit code that will compile in the same manner as the 64-bit code that uses the intrinsics defined in intrin.h.

Here is how you do it. Put all of the code below into a header file and #include that header file wherever you need access to __debugbreak(), __cpuid() or __rdtsc().

#ifdef _WIN64
#include <intrin.h>
#else	// _WIN64
	// x86 architecture

	// __debugbreak()

	#if     _MSC_VER >= 1300
		// Win32, __debugbreak defined for VC2005 onwards
	#else	//_MSC_VER >= 1300
		// define for before VC 2005

		#define __debugbreak()				__asm { int 3 }
	#endif	//_MSC_VER >= 1300

	// __cpuid(registers, type)
	//		registers is int[4], 
	//		type = 0

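	// DO NOT add ";" inside these macro definitions - in inline assembly ";" starts
	// a comment, so the rest of the macro would be ignored and the code generation
	// would be wrong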

	#define rdtsc	__asm __emit 0fh __asm __emit 031h
	#define cpuid	__asm __emit 0fh __asm __emit 0a2h

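	inline void __cpuid(int	cpuInfo[4],
						int	cpuType)
	{
		if (cpuInfo == NULL)
			return;

		__asm pushad;
		__asm mov	eax, cpuType;

		cpuid;

		// cpuInfo is a pointer, so load it into a register and store through it.
		// Inline assembly does not scale array indexes: cpuInfo[1] would mean
		// one byte past the pointer variable itself, not the second int.
		__asm mov	esi, cpuInfo;
		__asm mov	[esi], eax;
		__asm mov	[esi + 4], ebx;
		__asm mov	[esi + 8], ecx;
		__asm mov	[esi + 12], edx;

		__asm popad;
	}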

	// __rdtsc()
	//		LARGE_INTEGER requires windows.h to be included before this header

	inline unsigned __int64 __rdtsc()
	{
		LARGE_INTEGER	li;

		rdtsc;

		__asm	mov	li.LowPart, eax;
		__asm	mov	li.HighPart, edx;
		return li.QuadPart;
	}

#endif	// _WIN64

Now you can just use the 64-bit style intrinsics in your code and not have to worry about any messy conditional code for choosing between the 32-bit inline assembly and the 64-bit intrinsics. Much neater, more elegant, more readable and more maintainable.

Additional Information

If you wish to know more about __cpuid(), the parameters it takes and the values returned in the registers array, Microsoft has a __cpuid() intrinsic description which explains everything in great detail.
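
As a small taste of what those values contain, the sketch below checks whether the processor reports that it has a timestamp counter at all. For cpuid type 1, bit 4 of EDX (registers[3]) is the TSC feature flag. The helper name hasTimeStampCounter is hypothetical and is not a Microsoft API. It takes the array as a parameter so it can be tried with sample values.

```cpp
// Hypothetical helper: reports whether the processor advertises the
// timestamp counter. Expects the array filled in by __cpuid(registers, 1).
// registers[3] is EDX, and bit 4 of EDX is the TSC feature flag.
inline bool hasTimeStampCounter(const int registers[4])
{
    const unsigned int edx = static_cast<unsigned int>(registers[3]);
    return (edx & (1u << 4)) != 0;
}
```

You could call this once at startup, after __cpuid(registers, 1), before relying on __rdtsc().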

Cupid instruction

When I wrote this article I kept typing cupid instead of cpuid. I’m sure my mind was on something else :-). How would the cupid instruction be implemented…
