Category: Thread

Detecting Abandoned Critical Sections

By , March 6, 2020 4:24 pm

Multithreading is a powerful way to improve the processing throughput and responsiveness of your software. We use it to great effect at Software Verify. In order to manage multithreading successfully it’s necessary to use some form of synchronization between each thread that wishes to read/write data. Get that synchronization wrong and deadlocks can result. The main cause of deadlocks is two or more locks (a critical section being an example) acquired in different orders on different threads. This has been the subject of much writing, so I won’t repeat that topic here.

There is another cause of deadlock which is less well known: the abandoned critical section.

In this article I’m going to describe how to detect abandoned critical sections. But first I need to describe them to you and explain how abandoned critical sections get created.

What is an Abandoned Critical Section?

An abandoned critical section is a critical section that has been locked but then the thread that owns the lock ends without unlocking the critical section. This creates a critical section that cannot be unlocked, and is thus permanently locked. If any other thread attempts to enter the critical section it will wait forever, in a deadlock caused by an infinite wait.

How does this happen?

There are several ways that a critical section can become abandoned.

  • Incorrect code.
  • Incorrect exception handling.
  • Terminate Thread.

Incorrect code

This is where the thread code enters a critical section to do some work and forgets to unlock the critical section. Then the thread exits. If you use object oriented code (CSingleLock for example) to manage the lifetime of critical section ownership then this problem should never happen. But if you manually control the locking, using say, CCriticalSection::Lock() and CCriticalSection::Unlock(), or EnterCriticalSection(&cs) and LeaveCriticalSection(&cs), then it’s possible for you to forget to leave a locked critical section, or for a logic failure to result in a critical section not being unlocked.

If you’re using object oriented synchronization locking methods you might want to look at Thread Lock Checker to automate checking for some simple and common errors that can happen.

DWORD doThread(void	*param)
{
	EnterCriticalSection(&dataCS);
	
	doWork(data);
	
	return 0;	// forgot to call LeaveCriticalSection(&dataCS);
}
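To illustrate why scoped lock objects avoid this mistake, here is a minimal sketch of a guard class. It is written against standard C++ (std::mutex) so that it compiles anywhere; ScopedLock, dataLock, sharedData and doThreadGuarded are hypothetical names, not part of MFC or the Win32 API. For a CRITICAL_SECTION the same shape would call EnterCriticalSection() in the constructor and LeaveCriticalSection() in the destructor.

```cpp
#include <mutex>

// Minimal scoped guard: locks in the constructor, unlocks in the destructor.
// Lock can be any type with lock()/unlock(), such as std::mutex.
template <typename Lock>
class ScopedLock
{
public:
    explicit ScopedLock(Lock &l) : lock_(l) { lock_.lock(); }
    ~ScopedLock() { lock_.unlock(); }

    // Copying a guard would unlock twice, so forbid it.
    ScopedLock(const ScopedLock &) = delete;
    ScopedLock &operator=(const ScopedLock &) = delete;

private:
    Lock &lock_;
};

std::mutex dataLock;      // hypothetical stand-in for dataCS
int        sharedData = 0; // hypothetical shared data

// Same shape as doThread() above, but every return path releases the lock.
unsigned long doThreadGuarded(bool exitEarly)
{
    ScopedLock<std::mutex> guard(dataLock);

    if (exitEarly)
        return 1;         // guard's destructor unlocks here

    ++sharedData;         // stands in for doWork(data)
    return 0;             // ...and here
}
```

Because the unlock lives in the destructor, there is no return statement that can forget it.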

Incorrect exception handling

This is where some code in a thread is protected by an exception handler (you’re calling a 3rd party library, or working with data of unknown integrity) and a critical section is locked when an exception is thrown. In an ideal world the exception handler will leave that locked critical section. Unfortunately the writer of the exception handler may not know about the critical section, or they may have forgotten about it – either way the locked critical section doesn’t get unlocked. As with the previous case, if you use object oriented access to critical sections (CCriticalSection, CSingleLock) the process of unwinding the stack during the exception handling should automatically unlock these locks. This won’t happen if you’re using CRITICAL_SECTIONs with the Win32 API.

DWORD doThread(void	*param)
{
	__try
	{
		EnterCriticalSection(&dataCS);
	
		doWork(data);	// something inside here throws an exception. 
	
		LeaveCriticalSection(&dataCS);
	}
	__except(EXCEPTION_EXECUTE_HANDLER)
	{
		// forgot to call LeaveCriticalSection(&dataCS);
	}
	
	return 0;	
}
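The stack unwinding behaviour described above can be sketched with standard C++ exceptions. This is an illustration only: it uses try/catch and std::lock_guard rather than the SEH __try/__except and CRITICAL_SECTION shown in the sample, and workLock, doWorkThatMayThrow and doThreadWithGuard are hypothetical names.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex workLock;      // hypothetical stand-in for dataCS

// Hypothetical worker that can fail part way through.
void doWorkThatMayThrow(bool fail)
{
    if (fail)
        throw std::runtime_error("doWork failed");
}

// Returns true if the work completed, false if an exception was caught.
bool doThreadWithGuard(bool fail)
{
    try
    {
        std::lock_guard<std::mutex> guard(workLock);

        doWorkThatMayThrow(fail);
        return true;      // guard unlocks on the normal path
    }
    catch (const std::exception &)
    {
        // guard was destroyed while the stack unwound out of the try
        // block, so the handler has nothing to unlock
        return false;
    }
}
```

The handler does not need to know that a lock was ever taken, which removes the opportunity to forget it.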

Terminate Thread

This is where a thread that is doing some work that has accessed some critical sections is killed by another thread calling TerminateThread(). There are occasions where TerminateThread() can be useful, but this is a last ditch method for dealing with threads. If your code is using TerminateThread() to manage your own threads why not spend some time to work out how not to use TerminateThread and to make your threads end normally (by exiting the thread or calling ExitThread()).

// correctly written thread

DWORD WINAPI doThread(void	*param)
{
	EnterCriticalSection(&dataCS);
	
	doWork(data);
	
	LeaveCriticalSection(&dataCS);
	
	return 0;
}

void mainThread()
{
	HANDLE	hThread;
	DWORD	threadId;
	
	hThread = CreateThread(NULL, 0, doThread, NULL, 0, &threadId);
	if (hThread != NULL)
	{
		doSomeWork();
		
		TerminateThread(hThread, 0); // this is a bit brutal
		CloseHandle(hThread);
	}
}
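One common alternative to TerminateThread() is to ask the thread to stop and then wait for it to exit normally. The sketch below shows that idea in portable C++ (std::thread and std::atomic) rather than with Win32 calls; stopRequested, jobLock, itemsProcessed, workerLoop and runAndStopWorker are hypothetical names. With the Win32 API the same pattern is typically built from an event object and WaitForSingleObject() on the thread handle.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

std::atomic<bool> stopRequested(false);   // hypothetical shutdown flag
std::mutex        jobLock;                // hypothetical stand-in for dataCS
int               itemsProcessed = 0;     // hypothetical shared data

// Worker loop: does one unit of work per iteration under the lock and
// checks the flag between iterations, so it never exits holding the lock.
void workerLoop()
{
    while (!stopRequested.load())
    {
        {
            std::lock_guard<std::mutex> guard(jobLock);
            ++itemsProcessed;
        }
        std::this_thread::yield();
    }
}

// Starts the worker, asks it to stop, then waits for it to finish.
void runAndStopWorker()
{
    stopRequested.store(false);
    std::thread worker(workerLoop);

    // ... doSomeWork() would go here ...

    stopRequested.store(true);   // ask, rather than TerminateThread()
    worker.join();               // wait for a normal exit
}
```

The worker only checks the flag when it is outside the locked region, so the lock is always free once join() returns.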

How to detect Abandoned Critical Sections?

We have two ways to detect Abandoned Critical Sections.

  • Thread Wait Chain Inspector
  • Thread Validator

Thread Wait Chain Inspector

Thread Wait Chain Inspector is a free software tool that we wrote that uses the Win32 Wait Chain API to identify various wait chain states of the locks and waits in a given application. Just select the application in question and look at the results.


This tool tells you process ids and thread ids, but it can’t give you symbols, filenames and line numbers. It will provide thread names if you’re working on Windows 10 and you’ve named your threads using the SetThreadDescription() API.

Thread Validator

Thread Validator is our thread analysis software tool for analysing thread synchronization problems, deadlocks, busy locks, slow locks, contended locks and recursing locks. We’ve recently added some reporting options to Thread Validator that will help you identify the location of abandoned critical sections.

I’ve used the tvExample demonstration application that ships with Thread Validator (you’ll need to build it) to deliberately create two abandoned critical sections. From the test menu choose “Exit thread with a locked critical section” and “Terminate thread with a locked critical section”.

The summary display will show an abandoned count of 2 in the Errors panel.


The various locks displays will colour the abandoned thread dark purple and list the Lock status as Abandoned.


If you click the Abandoned bar in the Errors panel, the display will move to the Analysis tab and the callstacks for the abandoned critical sections will be displayed.


Expanding each entry reveals the callstacks so that you can see where each critical section is abandoned. Note that each entry shows two callstacks. The first is where the critical section was created. The second is where the critical section was abandoned. You can expand any entry on any callstack to see the source code.

Abandoned because of thread exit


Abandoned because of TerminateThread()


Expanding the callstack entries to reveal the source code…


Conclusion

Abandoned Critical Sections are bad news. They cause deadlocks. But they don’t need to be hard to track down when you’ve got the right tools to put to work.

Monitoring a service with the NT Service API

By , February 11, 2020 5:21 pm

Debugging services is a pain. There is a lot that can go wrong and very little you can do to find out what went wrong. Perfect! Just what you need for an easy day at work. Services run in a restricted environment, and these days you also need to be Administrator to do anything with them. Getting your favourite software tool (one that isn’t a debugger) working with them is hard. I remember years ago seeing the list of things you needed to do to get NuMega’s BoundsChecker to work with services. It was a couple of web pages of instructions, each line containing a detailed step. You had to do all of the actions correctly in order to set things up to work with services.

These days Microsoft have changed the security landscape and it’s no longer possible to launch your data monitoring software tool from a service as that ability is correctly regarded as a security vulnerability. It’s also pretty much impossible to inject into a service from a GUI application. As a result the correct way to work with services is to add a few lines of glue code, in the form of calls to an API that sets up communications with an already running user interface.

We’ve described our updated NT Service API in a previous article, so in this article I’m going to talk about using the API to track errors in the service code calling the API and also describe how you use the user interface to work with services. This article will focus on C++ Memory Validator, but the techniques described here will also work for C++ Coverage Validator, C++ Performance Validator and C++ Thread Validator. If you’re using a .Net service, or a mixed mode service with a .Net entry point, you don’t need to use the API, but the GUI parts of this article will still apply to you. If you’re using a native service or a mixed mode service with a native entry point, all of this article applies to you.

Monitoring a Service

Before we get into the error codes and error handling in the GUI, let’s first take a tour of how things should work if everything goes to plan. This will provide some context for the errors I’m going to describe later. I’m going to assume you’ve built both the example service and the example service client, and that you’ve installed the service (serviceMV.exe -install in an Administrator mode command prompt). The service client passes a string to the service, which reverses it and passes it back to the client. The service also deliberately leaks some memory for testing purposes.

Here’s a video of the process.

From the Launch menu, choose Monitor a Service.


The Monitor a Service dialog is displayed.


Enter the full path to your service and click OK to start monitoring. The Validator will now set up some environment variables and some data in the registry that will be used by the service API. After a few seconds the Start your Service dialog appears.


Click OK, then start your service (you’ll need to do this from an Administrator command prompt).

serviceMV -start

The Validator attaches to the service and after a few moments various status information in the Validator title bar and the Validator status bar is updated.

You may see an informational dialog about debug information. You can dismiss this (it can be viewed later from the Validator Tools menu). To change how symbols are found you’ll need to look at the Symbol Server and File Locations parts of the Validator settings dialog.


Next a dialog is displayed informing you that Administrator Privileges may be required.


For some services you may find that the Validator gets better data, or sends data to the GUI faster if the Validator is run in Administrator mode. If that is the case you’ll need to restart the Validator with Administrator privileges (and also stop and restart the service, etc).

For this particular example service, we don’t need Administrator privileges so we’ll continue without them.

Now we can interact with the service from the service client by sending a string to the service. The service reverses it and sends it back.

serviceClient "Hello World"


Once we’re done working with the service we can stop it (you’ll need to do this from an Administrator command prompt).

serviceMV -stop

The Validator disconnects from the service and displays all the data it has collected from the service.


That’s how it looks when everything goes according to plan.

What happens when things go wrong? That’s what the next section is about.

Tracking errors in the service

The various API functions return a SVL_SERVICE_ERROR error code. We’ve extended this code so that you can detect when the user has forgotten to do something prior to starting the service, or you can detect if various other error conditions have occurred. Some of these error codes are internal error codes and should never be seen by a customer, but we’re documenting them here for completeness.


  • SVL_FAIL_PATHS_DO_NOT_MATCH. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_INCORRECT_PRODUCT_PREFIX. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_X86_VALIDATOR_FOUND_EXPECTED_X64_VALIDATOR. Looks like you’re monitoring a 64 bit service with a 32 bit Validator. You need to use a 64 bit Validator.

  • SVL_FAIL_X64_VALIDATOR_FOUND_EXPECTED_X86_VALIDATOR. Looks like you’re monitoring a 32 bit service with a 64 bit Validator with the svl*VStubService.lib library. You need to use a 64 bit Validator with the svl*VStubService6432.lib.

  • SVL_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_ENV_VAR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ENV_VAR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ID_NOT_SPECIFIED. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ID_NOT_A_PROCESS. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

To aid in debugging we strongly recommend that you log all error codes (success or failure) from Software Verify API calls. This will allow you to track down errors rapidly, rather than through a series of trial-and-error code changes or a back and forth with support that gives support no information to work with. We added all of the above error codes after 3 customers reported similar, but different, problems with using the service API. All of their problems would have been solved if these error codes had been available.

Error codes can be logged with this call.

void writeToLogFile(const wchar_t     *fileName,
                    SVL_SERVICE_ERROR errCode);

Helpful messages can be logged with this call.

void writeToLogFile(const wchar_t *fileName,
                    const wchar_t *text);

Error codes can be turned into human readable messages with this call.

const wchar_t *getTextForErrorCode(SVL_SERVICE_ERROR	errorCode);

And if you need to log Windows error codes, use this call.

void writeToLogFileLastError(const wchar_t *fileName,
                             DWORD         errCode);

See the help documentation for all the available API calls.

Tracking errors in the GUI

There are a couple of mistakes that can be made in the user interface. These are related to monitoring the wrong type of service, and the location of the service. Where it is possible to identify these errors in the GUI, we will do so. Where it is not, the error codes described above will help you understand the mistake that has been made.

64 bit Service, 32 bit GUI

If you try to monitor a 64 bit service with a 32 bit GUI that will fail. We can detect this and prevent this. When this error happens you will be shown an error dialog similar to this.


Note that monitoring a 32 bit service with a 64 bit GUI is OK, but you need to use the svl*VStubService6432.lib not the svl*VStubService.lib. We can’t detect this from the GUI, which is why the SVL_FAIL_X64_VALIDATOR_FOUND_EXPECTED_X86_VALIDATOR error code exists – you will get this if you are linked to svl*VStubService.lib when you should be linked to svl*VStubService6432.lib.

Service on a network share

Windows won’t let you start a service on a network share. And yet I’ve lost count of the number of times I’ve tried to do this. This is typically because I have the solution working on machine X (where I wrote it) and wish to test on machine Y, and I just use a network share to map it across. This works for applications and fails for services. This can be a real time waster and Windows isn’t exactly helpful about this, and of course it’s in a service’s startup code, so fun debugging that.

To make this failure easier to detect we check the path of the service you specify in the Monitor a Service dialog and determine if the service is on a network share. If it is we tell you we can’t work with it. This then alerts you to the fact you’ll need to copy that service locally to run tests on it. Probably an hour or two of your time saved, right there.
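If you want a similar sanity check in your own install or test scripts, a purely textual test for UNC paths is a reasonable first step. The sketch below is an assumption about how such a check could look, not a description of what the Validator does; looksLikeUncPath is a hypothetical name. It cannot spot a mapped drive letter such as Z:\, for which you would need to ask the OS (for example GetDriveTypeW() returning DRIVE_REMOTE).

```cpp
#include <string>

// Returns true if 'path' is written in UNC form (\\server\share\...).
// This is only a textual check; it does not touch the file system.
bool looksLikeUncPath(const std::wstring &path)
{
    // \\?\UNC\server\share\... is the long-path spelling of a UNC path
    if (path.compare(0, 8, L"\\\\?\\UNC\\") == 0)
        return true;

    // \\?\C:\... and \\.\device are local, despite the leading backslashes
    if (path.compare(0, 4, L"\\\\?\\") == 0 ||
        path.compare(0, 4, L"\\\\.\\") == 0)
        return false;

    // anything else that starts with two backslashes names a server
    return path.size() > 2 && path.compare(0, 2, L"\\\\") == 0;
}
```

For example, looksLikeUncPath(L"\\\\buildbox\\share\\serviceMV.exe") is true, while a path on C:\ is not flagged.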


Conclusion

Working with services can be fraught with problems, but if you log your error codes you can easily and quickly identify any errors made configuring your use of the NT Service API that we were unable to catch with the Validator user interface.

Thread Wait Chain Inspector

By , August 22, 2019 10:53 am

Since Windows Vista the Windows operating system has included functionality to iterate across the waiting objects that form a chain between threads. I’m waiting for thread A, which is waiting for thread B, which is waiting for process Y. That sort of thing. Waits come in the form of EnterCriticalSection, WaitForSingleObject, WaitForMultipleObjects, etc. All documented in Microsoft’s Synchronization API. If you get these waits wrong, you can get deadlocks, or waits that wait forever. Either way, it’s game over for your program if that happens.
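As a conceptual model of what a wait chain is, the sketch below follows "is waiting for" links from one thread to the next and reports a deadlock when the chain loops back on itself. It is deliberately simplified: WaitGraph and followWaitChain are hypothetical names, and the real Wait Chain Traversal API (OpenThreadWaitChainSession(), GetThreadWaitChain(), CloseThreadWaitChainSession()) reports nodes for locks and other objects as well as threads.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical model: waitsFor[t] is the thread that thread t is blocked on.
// A thread with no entry is not waiting on anything.
typedef std::map<int, int> WaitGraph;

// Follows the chain from 'start'. Returns the threads visited, in order, and
// sets 'deadlocked' to true if the chain loops back onto itself.
std::vector<int> followWaitChain(const WaitGraph &waitsFor,
                                 int             start,
                                 bool            &deadlocked)
{
    std::vector<int> chain;
    std::set<int>    seen;
    int              current = start;

    deadlocked = false;
    for (;;)
    {
        chain.push_back(current);
        seen.insert(current);

        WaitGraph::const_iterator it = waitsFor.find(current);
        if (it == waitsFor.end())
            break;                  // end of chain: this thread can run

        current = it->second;
        if (seen.count(current) != 0)
        {
            deadlocked = true;      // cycle: nobody in the loop can proceed
            break;
        }
    }

    return chain;
}
```

A chain that ends at a runnable thread is just a wait; a chain that revisits a thread is a deadlock.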

The Wait Chain Traversal API was added with Windows Vista, but only made public recently. Before that, access to wait chain information was only via Resource Monitor, and more recently via Task Manager. A detailed article by a Microsoft field engineer, Faik Hakan Bilgen, documents the history of the Wait Chain user interfaces and then provides a console program (with source code on GitHub) to provide a wait chain dump to a text file. Unfortunately this isn’t very easy to use as it relies on decoding thread ids and process ids to understand what is happening. Also because it’s a text file, to get an update you need to run the tool again.

We decided to take inspiration from our Thread Status Monitor tool and create a version specifically for wait chains – Thread Wait Chain Inspector.


Select the process you are interested in from the upper window. Wait chains for each thread are shown in the window below. Select a thread and it, along with any related threads (in the same wait chain), will be highlighted in yellow. Deadlocked threads are shown in red. Process names are displayed and thread names are taken from the GetThreadDescription() API (Windows 10 only).

If you wish to debug a process you can create a minidump for any process in a wait chain. Right click on the process of interest in the wait chain and choose Create a Minidump….

Thread Wait Chain Inspector is a free tool, complementing our other threading tools, Thread Validator, Thread Lock Checker and Thread Status Monitor.

Thread naming

By , June 19, 2019 7:21 pm

Multi-threading is becoming quite common these days. It’s a useful way to provide a responsive user interface while performing work at the same time. Our tools report data per thread where that is warranted (per-thread code coverage doesn’t seem to be a thing – no one has requested it in 17 years). In this article I’m going to discuss thread naming, OS support for thread naming, and additional support for thread naming that our tools automatically provide.

The Default Thread Display

Threads are represented by a thread handle and a thread id. Typically the thread id is what will be used to represent the thread in a user interface reporting thread related data. Thread ids are numeric. For example: 341. For trivial programs you can often infer which thread is which by looking at the data allocated on each thread. However that doesn’t scale very well to more complex applications. You can end up with thread displays like this:


Microsoft Thread Naming Exception

Before Windows 10, the Win32 API did not provide any functions to allow you to name a thread. To handle this oversight, Microsoft use a convention that allows a program to communicate a thread name to its debugger. This is done by means of an exception that the program throws (and then catches, to prevent its propagation terminating the application). The debugger also catches this exception and with the help of ReadProcessMemory() can retrieve the thread name from the program. Here’s how that works.

In your program

The exception code that identifies a thread naming exception is 0x406D1388. To pass the thread name to the debugger you need to create a struct of type THREADNAME_INFO (definition shown in the code sample), populate it with the appropriate data, then raise an exception with the specified exception code. You’ll need to use SEH __try/__except to surround the RaiseException() call to prevent crashing the program.

typedef struct tagTHREADNAME_INFO
{
	DWORD	dwType;		// must be 0x1000
	LPCSTR	szName;		// pointer to name (in user addr space) buffer must be 8 chars + 1 terminator
	DWORD	dwThreadID;	// thread ID (-1 == caller thread)
	DWORD	dwFlags;	// reserved for future use, must be zero
} THREADNAME_INFO;

#define MS_VC_EXCEPTION 0x406D1388

void nameThread(const DWORD	threadId,
                const char	*name)
{
	// You can name your threads by using the following code. 
	// Thread Validator will intercept the exception and pass it along (so if you are also running
	// under a debugger, the debugger will also see the exception and read the thread name).

	// NOTE: this is for 'unmanaged' C++ ONLY!

	#define BUFFER_LEN		16

	THREADNAME_INFO	ThreadInfo;
	char		szSafeThreadName[BUFFER_LEN];	// buffer can be any size, just make sure it is large enough!
	
	memset(szSafeThreadName, 0, sizeof(szSafeThreadName));	// ensure all characters are NULL before
	strncpy(szSafeThreadName, name, BUFFER_LEN - 1);	// copying name
	//szSafeThreadName[BUFFER_LEN - 1] = '\0';

	ThreadInfo.dwType = 0x1000;
	ThreadInfo.szName = szSafeThreadName;
	ThreadInfo.dwThreadID = threadId;
	ThreadInfo.dwFlags = 0;

	__try
	{
		RaiseException(MS_VC_EXCEPTION, 0, 
                               sizeof(ThreadInfo) / sizeof(DWORD_PTR), 
                               (DWORD_PTR *)&ThreadInfo); 
	}
	__except(EXCEPTION_EXECUTE_HANDLER)
	{
		// do nothing, just catch the exception so that you don't terminate the application
	}
}

In the debugger

The THREADNAME_INFO struct is communicated to the debugger in the event.u.Exception.ExceptionRecord.ExceptionInformation array. Cast the address of ExceptionInformation[0] to a THREADNAME_INFO *, check the number of parameters is correct (4 on x86, 3 on x64), check the dwType and dwFlags are correct, then use the pointer in the szName field. This is a pointer inside the other process, which means you can’t read it directly; you need to use ReadProcessMemory(). Once the debugger continues and this event goes out of scope, the ability to read the thread name no longer exists. If you want to read the thread name you must read it immediately, storing it for future use.

	if ((de->dwDebugEventCode == EXCEPTION_DEBUG_EVENT) &&
	    (de->u.Exception.ExceptionRecord.ExceptionCode == SVL_THREAD_NAMING_EXCEPTION))
	{
#ifdef _WIN64
		if (de->u.Exception.ExceptionRecord.NumberParameters == 3)
#else	//#ifdef _WIN64
		if (de->u.Exception.ExceptionRecord.NumberParameters == 4)
#endif	//#ifdef _WIN64
		{
			THREADNAME_INFO	*tni;

			tni = (THREADNAME_INFO *)&de->u.Exception.ExceptionRecord.ExceptionInformation[0];
			if (tni->dwType == 4096 &&
			    tni->dwFlags == 0)
			{
				void	*ptr = (void *)tni->szName;
				DWORD	threadId = tni->dwThreadID;

				if (ptr != NULL)
				{
					char	buffer[1000];
					int		bRet;
					SIZE_T	dwRead = 0;

					memset(buffer, 0, 1000);
					bRet = ReadProcessMemory(hProcess,
								 ptr,
								 buffer,
								 1000 - 1,	// remove one so there is always a terminating zero
								 &dwRead);

					if (buffer[0] != '\0')
						setThreadName(threadId, buffer);
				}
			}
		}
	}

Our tools intercept this exception, so if you use this mechanism we can name your threads.

The problem with this mechanism is that it puts the onus for naming the threads on the creator of the application. If that involves 3rd party components that spawn their own threads, and OS threads then almost certainly not all threads will be named, which results in some threads having a name and other threads being represented by a thread id.


CreateThread() And Symbols

In an attempt to name threads automatically, without having to rely on the author of a program using thread naming exceptions, we started monitoring CreateThread() to get the thread start address. We then resolve that address into a symbol name (assuming debugging symbols are available) and use that symbol name to represent the thread. After some tests (done with our own tools – which are heavily multithreaded – dog-fooding) we concluded this was a useful way to name threads.

However this doesn’t solve the problem as many threads are now named _threadstartex().


_beginthread(), _beginthreadex() And Symbols

The problem with the CreateThread() approach is that any threads created by calls to the C runtime functions _beginthread() or _beginthreadex() will result in the thread being called _threadstart() or _threadstartex(). This is because these functions pass their own thread function to CreateThread(). The way to solve this problem is to also monitor _beginthread() and _beginthreadex() functions to get the thread start address, then resolve that address into a symbol name.


SetThreadDescription

Starting with Windows 10, a name can be associated with a thread handle via the API SetThreadDescription(). The name associated with a thread handle can be retrieved using GetThreadDescription(). We use these functions to provide names for your threads if you have used these functions.
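If your program also has to run on versions of Windows that lack SetThreadDescription(), one way to use it safely is to look the function up at run time. The sketch below assumes that approach; nameCurrentThread is a hypothetical helper name, and on non-Windows builds it simply reports failure so that the sample still compiles.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Names the calling thread via SetThreadDescription() if the OS provides it.
// Returns true if the OS accepted the name, false otherwise.
bool nameCurrentThread(const wchar_t *name)
{
#ifdef _WIN32
    typedef HRESULT (WINAPI *SetThreadDescription_FUNC)(HANDLE, PCWSTR);

    // kernel32.dll is always loaded, so GetModuleHandleW() is sufficient
    HMODULE hKernel = GetModuleHandleW(L"kernel32.dll");
    if (hKernel == NULL)
        return false;

    // look the function up at run time so older systems still load the program
    SetThreadDescription_FUNC pSetThreadDescription =
        reinterpret_cast<SetThreadDescription_FUNC>(
            GetProcAddress(hKernel, "SetThreadDescription"));
    if (pSetThreadDescription == NULL)
        return false;           // API not available on this version of Windows

    return SUCCEEDED(pSetThreadDescription(GetCurrentThread(), name));
#else
    (void)name;                 // not Windows: nothing to call
    return false;
#endif
}
```

A thread would typically call nameCurrentThread(L"worker") as the first statement of its thread function.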

Un-named Threads

Some threads still end up without names – typically these are transient threads created by the OS to do a short amount of work. If the call to create the thread was internal to Kernel32.dll then CreateThread() will not be called via the import address table (IAT) and will not be monitored, resulting in the thread not getting named automatically. This isn’t ideal, but ultimately you don’t control these threads, which means you can’t affect the data reported by these threads, so it’s not that important.

Thread Naming Priority

We’ve made these changes to all our Validator tools and updated the user interfaces to represent threads by both thread id and thread name.

If a thread is named by a thread naming exception, SetThreadDescription(), or via a Software Verify API call, that name will take precedence over any automatically generated name. This is useful in cases where the same function is used by many threads – you can, if you wish, give each thread a unique name to prevent multiple threads having the same name.

Here are some of the displays that have benefited from named threads.

Bug Validator


Memory Validator


Performance Validator


Thread Validator



64 bit C++ software tool Beta Tests are complete.

By , January 9, 2014 1:33 pm

We recently closed the beta tests for the 64 bit versions of C++ Coverage Validator, C++ Memory Validator, C++ Performance Validator and C++ Thread Validator.

We launched the software on 2nd January 2014. A soft launch, no fanfare, no publicity. We just wanted to make the software available and then contact all the beta testers so that we could honour our commitments made at the start of the beta test.

Those commitments were to provide a free single user licence to any beta tester that provided feedback, usage reports, bug reports, etc. about the software. This doesn’t include anyone that couldn’t install the software because they used the wrong licence key!

We’ve written a special app here that we can use to identify all email from beta test participants and allow us to evaluate that email for beta test feedback criteria. It’s saved us a ton of time and drudge work even though writing this extension to the licence manager software took a few days. It was interesting using the tool and seeing who provided feedback and how much.

We’ve just sent out the licence keys and download instructions to all those beta testers that were kind enough to take the time to provide feedback, bug reports etc. to us. A few people went the extra mile. These people bombarded us with email containing huge bugs, trivial items and everything in between. For two of them, we were on the verge of flying out to their offices when we found some useful software that allowed us to remotely debug their software. Special mentions go to:

Bengt Gunne (Mimer.com)
Ciro Ettorre (Mechworks.com)
Kevin Ernst (Bentley.com)

We’re very grateful for everyone taking part in the beta test. Thank you very much.

Why didn’t I get a free licence?

If you didn’t receive a free licence and you think you did provide feedback, please contact us. It’s always possible that a few people slipped through our process of identifying people.

Dang! I knew I should’ve provided feedback

If you didn’t provide us with any feedback, check your inbox. You’ll find a 50% off coupon for the tool that you tested.

Thread Validator x64 enters BETA

By , August 6, 2010 8:57 am

Thread Validator x64 is now available for beta testing.

Thread locking history

Thread Validator x64 is the 64 bit version of our successful 32 bit Thread Validator software tool that runs on Microsoft Windows operating systems. Thread Validator x64 is a deadlock detection and thread analysis software tool, running on Windows 7 64 bit, Windows Vista 64 bit and Windows XP 64 bit.

Thread Validator has multiple displays to provide you with different perspectives onto the data you have collected.

What does Thread Validator do?

Thread Validator x64 identifies thread deadlocks, potential deadlocks and locks with a high contention rate.

Thread deadlocks usually mean that one or more threads can no longer function correctly because they are waiting on a lock that will never be released. This is an error condition and usually manifests as an unresponsive computer program.

Potential deadlocks are locking sequences that have not triggered a deadlock but may lead to a deadlock under slightly different conditions.
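To make the idea of a potential deadlock concrete, the sketch below records the order in which pairs of locks are acquired and flags an acquisition that reverses an order seen earlier. This is an illustration of the general lock-order technique under simplified assumptions, not a description of how Thread Validator works internally; LockOrderRecorder and recordAcquire are hypothetical names, and locks are identified by plain integers.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical recorder: remembers every "held A, then acquired B" pair
// and reports when the opposite order (B, then A) has already been seen.
class LockOrderRecorder
{
public:
    // 'held' lists the lock ids the calling thread already owns,
    // 'acquiring' is the lock it is about to take.
    // Returns true if this acquisition inverts an order seen earlier.
    bool recordAcquire(const std::vector<int> &held, int acquiring)
    {
        bool inversion = false;

        for (std::size_t i = 0; i < held.size(); ++i)
        {
            if (held[i] == acquiring)
                continue;   // recursive acquire, not an ordering

            // has any thread taken these two locks the other way round?
            if (seen_.count(std::make_pair(acquiring, held[i])) != 0)
                inversion = true;

            seen_.insert(std::make_pair(held[i], acquiring));
        }

        return inversion;
    }

private:
    std::set<std::pair<int, int> > seen_;
};
```

If one thread takes lock 1 then lock 2, and later another thread takes lock 2 then lock 1, the second sequence is flagged even though neither thread actually blocked.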

High contention rate locks result in your program spending too much time waiting for access to a lock. A different program design can often reduce a high contention rate to a less demanding contention rate.

How does Thread Validator work?

Thread Validator instruments your computer program so that it can monitor the appropriate synchronization APIs used to control access to locks, mutexes, semaphores and wait conditions. Using the information gained from monitoring these APIs, Thread Validator can calculate deadlock conditions, potential deadlock conditions and detect locks with high contention rates.

Thread Validator gathers data for all locks, all threads, all mutexes, all semaphores and all wait conditions. The data is organised into various displays allowing you to view information:

  • All active locks.
  • All active locks, organized by thread.
  • All locks that are locked at a given time.
  • Allocation information for all allocated synchronization objects, showing callstack and source code.
  • Thread locking history. View all threads, see what each thread is doing and when.
  • Thread lock order. View the order locks are acquired across threads for a given lock sequence.
  • List of all application objects that can be used in wait conditions.

How Thread Validator helps you be more productive

Thread Validator x64 can help you:

  • Identify deadlocks in your application – quickly identify and fix hard threading problems.
  • Identify potential deadlocks in your application – prevent problems before they get serious.
  • Identify busy contended critical sections in your application – improve performance.
  • View thread locking behaviour in real time.
  • Improve your software quality by modifying your threading behaviour.
  • View all open handles that your application can wait on.

Join the beta test

If you are developing 64 bit software and have some multi-threading problems you would like to analyze, please join the beta, analyze your multi-threading problems and let us know your thoughts.

Thread monitoring made easy

By , April 30, 2010 10:36 am

Tools like Thread Validator are great for delving into the details of why a thread deadlock has occurred. You get all the gory details: DLL, filename, line number, lock, mutex, wait, and the sequence of lock acquisition on each thread. You can work out why the failure occurred.

But sometimes this is overkill, more than you need, for this particular bug. Perhaps you don’t think the bug is real, but just an artifact of how things are working. You don’t want to spend time in the heavyweight tool, not today, but perhaps next week.

What you want is a tool with a lighter impact on the system. A tool that will just tell you whether what you think is a problem really is a problem. If it’s not a problem you can just move on to the next topic; if it is a problem, it’s time to get the serious tools out and investigate.

That is where Thread Status Monitor is very useful. We wrote an early version of this tool some time ago. It was hosted on Object Media where we publish source code for some of our tools. We have improved the tool a bit and decided that it is a better fit on the Software Verification website.

Thread Status Monitor

Thread Status Monitor tells you everything you need to know about each thread in a process: the thread id, its wait status, the wait reason, the number of context switches, the thread priority, how long the thread has spent waiting, how long the thread has been executing and how much CPU the thread is currently consuming.

Where appropriate, bar graphs are used to provide visual indicators as to which items are largest. Colour coding (pink/blue) is also used to indicate values increasing and decreasing, drawing your attention to the latest changes.

Thread Status Monitor Detail

Simply select the process you wish to monitor in the list at the top of the user interface and view the threads in the lower pane. You can change the refresh rate using the combo box and sort by clicking on the column header you want to sort.

Thread Status Monitor will be available in the next few days.

Thread Lock Checker now available

By , April 22, 2010 6:20 pm

We’ve just released Thread Lock Checker.

It took a bit longer than we anticipated (sorry about that) due to some website maintenance work. Anyway, it’s available now, so go and give your source code some TLC and find any latent lock errors in your code!

Thread Lock Checker

Improving how you use CSingleLock

By , April 2, 2010 9:44 pm

This posting covers a brief background:

  • Win32 critical sections.
  • How CCriticalSection and CSingleLock can be used instead of Win32 critical sections.
  • An improved way to use CSingleLock.
  • Some ways CSingleLock can be used that do not have the desired effect.

Critical Sections in Win32

The Win32 API uses InitializeCriticalSection, EnterCriticalSection, LeaveCriticalSection and DeleteCriticalSection to manage critical sections (CRITICAL_SECTION). Using these APIs is not particularly hard, but nonetheless it is possible to use critical sections that have not been initialized or that have been deleted. It is also possible to forget to leave a critical section that has been entered. In addition, any exceptions that get thrown may result in a critical section being left in its locked state.

This can cause serious performance problems as locks are held for too long, or in the case of a lock not being released, it can prevent other threads gaining access to the resource the lock was protecting, possibly resulting in a deadlock.

Example Win32 usage (assume critical section initialized in a different function):

void someFunc()
{
	doWork();

	EnterCriticalSection(&cs);

	doWorkEx();

	LeaveCriticalSection(&cs);
}
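
The exception case mentioned above can be sketched without any Win32 code. The example below is an illustration only: FakeCriticalSection, enterFakeCS() and leaveFakeCS() are hypothetical stand-ins that simply count entries, so that the effect can be observed from a single thread. They are not the real CRITICAL_SECTION APIs.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for CRITICAL_SECTION: it only counts how many
// times it has been entered without being left.
struct FakeCriticalSection
{
	int	depth = 0;
};

void enterFakeCS(FakeCriticalSection* cs) { assert(cs != nullptr); ++cs->depth; }
void leaveFakeCS(FakeCriticalSection* cs) { assert(cs != nullptr); --cs->depth; }

FakeCriticalSection	cs;

// Simulates work that fails part way through.
void doWorkEx()
{
	throw std::runtime_error("doWorkEx failed");
}

void someFunc()
{
	enterFakeCS(&cs);

	doWorkEx();			// throws, so the line below is never reached

	leaveFakeCS(&cs);
}

// Calls someFunc(), swallows the exception and reports the depth left behind.
int depthAfterFailure()
{
	cs.depth = 0;

	try
	{
		someFunc();
	}
	catch (const std::runtime_error&)
	{
		// The caller recovers, but nobody left the critical section.
	}

	return cs.depth;
}
```

Here depthAfterFailure() reports a depth of 1 rather than 0: the section was entered but never left. With a real critical section, the next thread to enter would wait on it.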

Why use CSingleLock and CMultiLock?

When using critical sections in MFC you use the CCriticalSection class instead of CRITICAL_SECTION objects.

You can directly call Lock() and Unlock() on the CCriticalSection, but it is recommended that you use CSingleLock and CMultiLock to manage your CCriticalSection objects.

The benefits of using a class such as CSingleLock (and its related class CMultiLock) are that:

  • The CSingleLock manages the activities of entering and leaving the critical section – you do not have to think about the critical section at all.
  • Any CCriticalSection object used with CSingleLocks will automatically be initialized before the CSingleLock gets to work with it.
  • The CSingleLock is automatically unlocked (if it was locked) when the CSingleLock is destroyed, so the CCriticalSection associated with it is not left locked.
  • If an exception is thrown, C++ objects are cleaned up by the exception handling chain, thus automatically deleting any CSingleLock objects and releasing any locks they hold.
  • CSingleLock can be used to lock and unlock critical sections just like the old Win32 methods, allowing for easy conversion of code from Win32 style to CSingleLock style.
  • It is possible to create a CSingleLock that is automatically locked. This is very useful for set-and-forget critical section management. Just put the CSingleLock in the right place and you can ignore it in the rest of the code. Very neat, convenient and elegant.
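
The exception safety point in the list above can be illustrated with a small sketch. This is not MFC code: CountingSection and ScopedLock are hypothetical stand-ins for CCriticalSection and CSingleLock, written so the example is self-contained and the lock state can be inspected afterwards.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for CCriticalSection: counts outstanding locks.
struct CountingSection
{
	int	depth = 0;

	void Lock()   { ++depth; }
	void Unlock() { --depth; }
};

// Hypothetical stand-in for CSingleLock: unlocks in its destructor.
class ScopedLock
{
public:
	explicit ScopedLock(CountingSection* section, bool initiallyLocked = false)
		: m_section(section), m_locked(false)
	{
		assert(section != nullptr);

		if (initiallyLocked)
			Lock();
	}

	~ScopedLock() { Unlock(); }

	void Lock()   { if (!m_locked) { m_section->Lock();   m_locked = true;  } }
	void Unlock() { if (m_locked)  { m_section->Unlock(); m_locked = false; } }

	ScopedLock(const ScopedLock&) = delete;
	ScopedLock& operator=(const ScopedLock&) = delete;

private:
	CountingSection*	m_section;
	bool				m_locked;
};

CountingSection	csSect;

// Simulates work that fails part way through.
void doWorkEx()
{
	throw std::runtime_error("doWorkEx failed");
}

void someFunc()
{
	ScopedLock	lock(&csSect, true);

	doWorkEx();		// throws; stack unwinding destroys 'lock', which unlocks
}

// Calls someFunc(), swallows the exception and reports the depth left behind.
int depthAfterException()
{
	try
	{
		someFunc();
	}
	catch (const std::runtime_error&)
	{
	}

	return csSect.depth;
}
```

Because the destructor runs during stack unwinding, the depth is back to 0 after the exception, with no cleanup code written in someFunc().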

One way of using CSingleLock

As described above a typical style of using CSingleLocks echoes the Win32 style of using critical sections.

void someFunc()
{
	CSingleLock	lock(&csSect);

	doWork();

	lock.Lock();
	doWorkEx();
	lock.Unlock();
}

As you can see, the CSingleLock lock manager is created, the doWork() function is called outside of the protected area, the lock is locked, doWorkEx() is called, then the lock is unlocked. This is a very similar style of writing to the Win32 equivalent.

A better way of using CSingleLock

The problem with the previous way of using CSingleLock is that most of the power and convenience of CSingleLock is ignored. Lock management has been made explicit via calls to Lock() and Unlock(). This means there is potential for forgetting to lock the CSingleLock, or for unlocking the CSingleLock later than desirable.

An improved way of using CSingleLock is to always create CSingleLocks in the locked state and to create them as close as possible to the resource they need to protect.

The following example shows the same function written using a CSingleLock that is automatically locked, created just before it is required and automatically destroyed at the end of the function.

void someFunc()
{
	doWork();

	CSingleLock	lock(&csSect, TRUE);

	doWorkEx();
}

If I wanted to do some more work after doWorkEx(), but didn’t want that work protected by the lock, I could do it by using C++’s scoping capabilities. I simply create a new scope and place the CSingleLock in there. At the end of the scope the CSingleLock is destroyed and the lock is unlocked.

void someFunc()
{
	doWork();

	{
		CSingleLock	lock(&csSect, TRUE);

		doWorkEx();
	}

	doMoreWork();
}
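
The ordering that scoping gives you can be checked with a portable sketch. It uses std::lock_guard from the standard library in place of CSingleLock, and a hypothetical LoggingLock type (an assumption for this example, not an MFC class) that records each lock and unlock call so the sequence can be inspected.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

std::vector<std::string>	events;		// records what happened, in order

// Hypothetical lock type that only records calls. It provides lock() and
// unlock(), which is all std::lock_guard requires.
struct LoggingLock
{
	bool	held = false;

	void lock()   { held = true; events.push_back("lock"); }
	void unlock() { assert(held); held = false; events.push_back("unlock"); }
};

LoggingLock	csSect;

void doWork()     { events.push_back("doWork"); }
void doWorkEx()   { events.push_back("doWorkEx"); }
void doMoreWork() { events.push_back("doMoreWork"); }

void someFunc()
{
	doWork();

	{
		std::lock_guard<LoggingLock>	lock(csSect);

		doWorkEx();
	}	// 'lock' is destroyed here, which calls csSect.unlock()

	doMoreWork();
}

// Runs someFunc() once and checks the recorded sequence.
bool lockCoversOnlyDoWorkEx()
{
	events.clear();
	someFunc();

	const std::vector<std::string>	expected = { "doWork", "lock", "doWorkEx", "unlock", "doMoreWork" };

	return events == expected;
}
```

The recorded sequence shows that only doWorkEx() runs between the lock and the unlock; doMoreWork() runs after the closing brace has released the lock.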

Some problems we have seen…

During the development of code for the software tools at Software Verification and our private tools we’ve found a few interesting mistakes. These mistakes are often made not through poor design, but through a simple typing oversight, possibly due to tiredness of the person working on the code. They are the type of mistake you can only put down to the fact that humans do make mistakes, no matter how talented they are in any given field.

Where possible we like to use the CSingleLock lock(&csSect, TRUE) automatic locking style coupled with tight scoping to make the lock lifetime short. As a result we are interested in finding the following coding constructs, which do not behave the way the code appears to intend:

  • CSingleLock created without a lock argument. This defaults to an unlocked CSingleLock.
    CSingleLock	lock(&csSect);
  • CSingleLock created with a FALSE lock argument. This is an unlocked CSingleLock.
    CSingleLock	lock(&csSect, FALSE);
  • CSingleLock created without a variable name. This compiles but creates a temporary lock that is immediately destroyed. All three variants are interesting as none of them are useful, but all compile OK.
    CSingleLock(&csSect);
    CSingleLock(&csSect, FALSE);
    CSingleLock(&csSect, TRUE);
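
The effect of these constructs can be seen in a small portable sketch. It does not use MFC: CountingSection and ScopedLock are hypothetical stand-ins for CCriticalSection and CSingleLock, with a constructor that follows the same (object, initially locked) shape. doWorkEx() reports whether the section is held at the moment it runs.

```cpp
#include <cassert>

// Hypothetical stand-in for CCriticalSection: counts outstanding locks.
struct CountingSection
{
	int	depth = 0;
};

// Hypothetical stand-in for CSingleLock.
class ScopedLock
{
public:
	explicit ScopedLock(CountingSection* section, bool initiallyLocked = false)
		: m_section(section), m_locked(initiallyLocked)
	{
		assert(section != nullptr);

		if (m_locked)
			++m_section->depth;
	}

	~ScopedLock()
	{
		if (m_locked)
			--m_section->depth;
	}

	ScopedLock(const ScopedLock&) = delete;
	ScopedLock& operator=(const ScopedLock&) = delete;

private:
	CountingSection*	m_section;
	bool				m_locked;
};

CountingSection	csSect;

// Returns true if the section is held while the work is being done.
bool doWorkEx()
{
	return csSect.depth > 0;
}

bool unlockedByDefault()
{
	ScopedLock	lock(&csSect);		// no lock argument: never locked

	return doWorkEx();
}

bool unnamedTemporary()
{
	ScopedLock(&csSect, true);		// temporary: locked and unlocked on this one line

	return doWorkEx();
}

bool namedAndLocked()
{
	ScopedLock	lock(&csSect, true);	// held until the end of the function

	return doWorkEx();
}
```

In this sketch only namedAndLocked() does its work under the lock. The other two functions compile and run, but doWorkEx() executes unprotected in both.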

Thread Lock Checker

The problem with the examples we show above is that looking for them is hard work because humans often read what they expect to read (this is part of the predictive pattern recognition built into how we process shapes and text). As a result you may be looking right at an error and not see it, but you may see the error the next time you come to the code (having forgotten all about it).

To aid in the discovery of these types of lock usage (for CSingleLock, CMultiLock and any named classes that have the same style of behaviour) we have written a software tool, Thread Lock Checker.

We use Thread Lock Checker before we release any software. We use it to scan our codebase looking for any mistakes not identified by our software engineers. It’s a very useful tool. We hope that you will also find Thread Lock Checker useful. Please check back next week for your free download.

We will be releasing Thread Lock Checker during the week of 5 April to 9 April.

Support for MinGW and QtCreator

By , December 4, 2009 5:01 pm

Everyone uses Visual Studio to write C and C++ software, don’t they? Yes! you all chorus. Apart from some guys at the back who like to use gcc and g++. They use MinGW when working on Windows. And they may even use Emacs, or, perish the thought, vi!

Up until now we haven’t been able to cater to the needs of gcc and g++ users. We’d get email every month asking when we were going to support MinGW or if we supported QtCreator. It was frustrating admitting we couldn’t support that environment. Even more so as the founders of Software Verification wrote large GIS applications using gcc and g++ back in the early 1990s.

During October we integrated support for MinGW and QtCreator into Coverage Validator, Memory Validator, Performance Validator and Thread Validator. Both COFF and STABS debug formats are supported, which provides some flexibility in how you choose to handle your symbols.

We’ll continue to add support for additional compilers to our tools as long as there is interest from you, the kind people that use our software tools.
