
Identifying crashes with the Windows Event Log

By , February 12, 2020 4:43 pm

It’s an unfortunate and inevitable fact that software sometimes crashes during development. It also happens, hopefully very infrequently, in production code. Each time a crash happens Windows stores some information about it in the Windows Event Log, along with the multitude of other events it logs.

In this article I’m going to explain two event log entry types which encode crashes, and how to read them. Then I’ll also introduce some tools that take the drudgery out of converting this information into symbol, filename and line number.

The Windows Event Log

The Windows event log can be viewed using Microsoft’s Event Viewer. Just type “Event Viewer” in the start menu search box and press return. That should start it. Crash information is stored in the sub category “Application” under “Windows Logs”. The two event sources that describe crashes are Windows Error Reporting and Application Error.


The image above shows a Windows Error Reporting event has been selected. The human readable form is shown below in the General tab. Although I say human readable, it really is unintelligible gibberish. None of the fields are identified and you have nothing to work with. The details tab isn’t any better – the raw data is present in text or XML form. Here’s the XML for the crash shown above.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Windows Error Reporting" /> 
    <EventID Qualifiers="0">1001</EventID> 
    <Level>4</Level> 
    <Task>0</Task> 
    <Keywords>0x80000000000000</Keywords> 
    <TimeCreated SystemTime="2020-02-12T10:09:34.000000000Z" /> 
    <EventRecordID>260507</EventRecordID> 
    <Channel>Application</Channel> 
    <Computer>hydra</Computer> 
    <Security /> 
  </System>
  <EventData>
    <Data>2023787729086567941</Data> 
    <Data>1</Data> 
    <Data>APPCRASH</Data> 
    <Data>Not available</Data> 
    <Data>0</Data> 
    <Data>testDeliberateCrash.exe</Data> 
    <Data>1.0.0.1</Data> 
    <Data>5e419525</Data> 
    <Data>testDeliberateCrash.exe</Data> 
    <Data>1.0.0.1</Data> 
    <Data>5e419525</Data> 
    <Data>c0000005</Data> 
    <Data>000017b2</Data> 
    <Data /> 
    <Data /> 
    <Data>C:\Users\stephen\AppData\Local\Temp\WERC24C.tmp.WERInternalMetadata.xml</Data> 
  <Data>C:\Users\stephen\AppData\Local\Microsoft\Windows\WER\ReportArchive\AppCrash_testDeliberateCr_c31b903842d94a84d4621dceaac377462674f7a_eb589596_139ec4bd</Data> 
    <Data /> 
    <Data>0</Data> 
    <Data>c3d360b2-4d7f-11ea-83d3-001e4fdb3956</Data> 
    <Data>0</Data> 
    <Data>54756af49aec84f97c15f03794ffd605</Data> 
  </EventData>
</Event>

There’s quite a bit of data in here, the purpose of each field implied, not stated. Towards the end is some information related to minidumps, but if you go searching for it, the minidump will no longer be present.

The format for Application Error crashes is different.

Windows Error Reporting

The event log data for a Windows Error Reporting event contains many fields that we don’t need if we’re just investigating a crash address. Each event starts with an <Event> tag and ends with an </Event> tag.

We need to correctly identify the event. Inside the event is a <System> tag which contains a tag with an attribute “Provider Name” set to “Windows Error Reporting”.
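As a rough illustration of that identification step, here is a minimal sketch that pulls the provider name and the <Data> values out of pasted event XML using plain string searching. The function names (providerName, isCrashProvider, dataValues) are hypothetical, and this is not how our tools parse the XML; a production parser would use a real XML library.

```cpp
#include <string>
#include <vector>

// Returns the Name attribute of the <Provider> tag, or "" if it is not found.
// Accepts either double or single quotes around the attribute value.
std::string providerName(const std::string &xml)
{
    const std::size_t npos = std::string::npos;

    std::size_t tagStart = xml.find("<Provider");
    if (tagStart == npos)
        return "";

    std::size_t tagEnd = xml.find('>', tagStart);
    if (tagEnd == npos)
        return "";

    const std::string attr = " Name=";
    std::size_t attrPos = xml.find(attr, tagStart);
    if (attrPos == npos || attrPos > tagEnd)
        return "";

    std::size_t quotePos = attrPos + attr.size();
    if (quotePos >= tagEnd)
        return "";

    char quote = xml[quotePos];
    if (quote != '"' && quote != '\'')
        return "";

    std::size_t valueEnd = xml.find(quote, quotePos + 1);
    if (valueEnd == npos || valueEnd > tagEnd)
        return "";

    return xml.substr(quotePos + 1, valueEnd - quotePos - 1);
}

// True if the provider is one of the two crash event sources in this article.
bool isCrashProvider(const std::string &name)
{
    return name == "Windows Error Reporting" || name == "Application Error";
}

// Collects the text of every <Data> tag in document order.
// An empty tag (<Data />) is stored as an empty string so indexes stay aligned.
std::vector<std::string> dataValues(const std::string &xml)
{
    std::vector<std::string> values;
    const std::string open = "<Data";
    const std::string close = "</Data>";
    std::size_t pos = 0;

    while ((pos = xml.find(open, pos)) != std::string::npos)
    {
        std::size_t tagEnd = xml.find('>', pos);
        if (tagEnd == std::string::npos)
            break;

        if (xml[tagEnd - 1] == '/')
        {
            values.push_back("");
            pos = tagEnd + 1;
            continue;
        }

        std::size_t valueEnd = xml.find(close, tagEnd);
        if (valueEnd == std::string::npos)
            break;

        values.push_back(xml.substr(tagEnd + 1, valueEnd - tagEnd - 1));
        pos = valueEnd + close.size();
    }

    return values;
}
```

Keeping empty tags as empty strings matters because the fields are identified purely by their position.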

Once the event is identified we need to find the <EventData> tag inside the event. The <EventData> contains a series of <Data> tags (22 in the example above), of which the first 14 are the ones that matter here. In order, they are:


  • 1. Timestamp.

  • 2. Number of data items.

  • 3. Information Type.

  • 4. Information Status.

  • 5. Unknown.

  • 6. Crashing executable.

  • 7. Executable version.

  • 8. Executable timestamp.

  • 9. Crashing DLL. This will be the same as 6 if the crash is in the .exe.

  • 10. DLL version. This will be the same as 7 if the crash is in the .exe.

  • 11. DLL timestamp. This will be the same as 8 if the crash is in the .exe.

  • 12. Exception code.

  • 13. Fault offset.

  • 14. Class. This may or may not be present

Information Type is normally “APPCRASH”. In this case we’re interested in tags 9, 12 and 13.

If Information Type is “BEX”, the data is different:


  • 1. Timestamp.

  • 2. Number of data items.

  • 3. Information Type.

  • 4. Information Status.

  • 5. Unknown.

  • 6. Crashing executable.

  • 7. Executable version.

  • 8. Executable timestamp.

  • 9. Crashing DLL. This will be the same as 6 if the crash is in the .exe.

  • 10. DLL version. This will be the same as 7 if the crash is in the .exe.

  • 11. DLL timestamp. This will be the same as 8 if the crash is in the .exe.

  • 12. Fault offset.

  • 13. Exception code.

  • 14. Class. This may or may not be present

Note that the order of the fault offset and exception code has been reversed compared to APPCRASH.

Of these tags we’re interested in tags 9, 12 and 13.

If we want to version the crashing DLL we also need tags 10 and 11.
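To make the field selection concrete, here is a minimal sketch that assumes the <Data> values have already been collected into a vector in document order (so tag 1 is index 0). The struct and function names are hypothetical. It handles the APPCRASH and BEX layouts described above and converts the hexadecimal exception code and fault offset to numbers.

```cpp
#include <cstdint>
#include <exception>
#include <string>
#include <vector>

// Hypothetical container for the fields we care about.
struct WerCrashInfo
{
    std::string   module;           // tag 9: crashing DLL (or .exe)
    std::string   moduleVersion;    // tag 10
    std::string   moduleTimestamp;  // tag 11
    std::uint32_t exceptionCode;    // tag 12 (APPCRASH) or tag 13 (BEX)
    std::uint64_t faultOffset;      // tag 13 (APPCRASH) or tag 12 (BEX)
};

// Parses a hexadecimal string such as "c0000005". Returns false on failure.
static bool parseHex(const std::string &text, std::uint64_t &value)
{
    try
    {
        value = std::stoull(text, nullptr, 16);
        return true;
    }
    catch (const std::exception &)
    {
        return false;
    }
}

// Fills 'info' from the <Data> values of a Windows Error Reporting event.
// Returns false if the event is too short, of an unknown type, or malformed.
bool crashInfoFromWer(const std::vector<std::string> &data, WerCrashInfo &info)
{
    if (data.size() < 13)
        return false;

    std::size_t codeIndex;
    std::size_t offsetIndex;

    if (data[2] == "APPCRASH")
    {
        codeIndex = 11;     // tag 12
        offsetIndex = 12;   // tag 13
    }
    else if (data[2] == "BEX")
    {
        offsetIndex = 11;   // tag 12
        codeIndex = 12;     // tag 13
    }
    else
    {
        return false;
    }

    std::uint64_t code = 0;
    std::uint64_t offset = 0;
    if (!parseHex(data[codeIndex], code) || !parseHex(data[offsetIndex], offset))
        return false;

    info.module = data[8];
    info.moduleVersion = data[9];
    info.moduleTimestamp = data[10];
    info.exceptionCode = static_cast<std::uint32_t>(code);
    info.faultOffset = offset;
    return true;
}
```

For the example event shown earlier this yields module testDeliberateCrash.exe, exception code 0xc0000005 and fault offset 0x17b2.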

Application Error

The event log data for an Application Error event contains many fields that we don’t need if we’re just investigating a crash address. Each event starts with an <Event> tag and ends with an </Event> tag.

We need to correctly identify the event. Inside the event is a <System> tag which contains a tag with an attribute “Provider Name” set to “Application Error”.

Once the event is identified we need to find the <EventData> tag inside the event. The <EventData> normally contains at least 12 <Data> tags, although some of them may be missing or empty. In order, the first 12 are:


  • 1. Crashing executable.

  • 2. Executable version.

  • 3. Executable timestamp.

  • 4. Crashing DLL. This will be the same as 1 if the crash is in the .exe.

  • 5. DLL version. This will be the same as 2 if the crash is in the .exe.

  • 6. DLL timestamp. This will be the same as 3 if the crash is in the .exe.

  • 7. Exception code.

  • 8. Fault offset.

  • 9. Process id.

  • 10. Application start timestamp.

  • 11. Application path.

  • 12. Module path.

Of these tags we’re interested in tags 7, 8 and 12.

If we want to version the crashing DLL we also need tags 5 and 6. If 12 isn’t available, use 4.
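The same idea for Application Error events can be sketched as follows, again assuming the <Data> values are already in a vector in document order. The names are hypothetical. The only wrinkle is the fallback from tag 12 (module path) to tag 4 (crashing DLL) when tag 12 is missing or empty.

```cpp
#include <cstdint>
#include <exception>
#include <string>
#include <vector>

// Hypothetical container for the fields we care about.
struct AppErrorCrashInfo
{
    std::string   module;           // tag 12 if available, otherwise tag 4
    std::string   moduleVersion;    // tag 5
    std::string   moduleTimestamp;  // tag 6
    std::uint32_t exceptionCode;    // tag 7
    std::uint64_t faultOffset;      // tag 8
};

// Parses a hexadecimal string. Returns false on failure.
static bool parseHexField(const std::string &text, std::uint64_t &value)
{
    try
    {
        value = std::stoull(text, nullptr, 16);
        return true;
    }
    catch (const std::exception &)
    {
        return false;
    }
}

// Fills 'info' from the <Data> values of an Application Error event.
// Returns false if the mandatory fields (tags 4 to 8) are not usable.
bool crashInfoFromApplicationError(const std::vector<std::string> &data,
                                   AppErrorCrashInfo &info)
{
    if (data.size() < 8)
        return false;

    std::uint64_t code = 0;
    std::uint64_t offset = 0;
    if (!parseHexField(data[6], code) || !parseHexField(data[7], offset))
        return false;

    // Prefer the full module path (tag 12); fall back to the DLL name (tag 4).
    if (data.size() >= 12 && !data[11].empty())
        info.module = data[11];
    else
        info.module = data[3];

    info.moduleVersion = data[4];
    info.moduleTimestamp = data[5];
    info.exceptionCode = static_cast<std::uint32_t>(code);
    info.faultOffset = offset;
    return true;
}
```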

Removing the drudgery

The previous two sections have described which fields to extract data from. If you’re doing this manually this is tedious and error prone. You have to select the correct values from the correct fields and then use another application to turn them into a symbol, filename and line number. Our tools DbgHelpBrowser and MapFileBrowser are designed to take a crash offset inside a DLL and turn it into a human readable symbol, filename and line number. But that still requires you to do the hard work of fishing the correct data out of the XML dump.

Now there is a better way. We’ve added an extra option to these tools that allows you to paste the entire XML data from a crash event. The tool then extracts the data it needs and shows you the symbol, filename and line number.

DbgHelpBrowser

Load the crashing exe (or DLL) into DbgHelpBrowser. This will cause the symbols to be loaded for the DLL (assuming symbols have been created and can be found). We’re not covering versioning the DLL as most likely you will have your own methods for this.

Choose the option Find Symbol from Event Viewer XML crash log… on the Query menu. The Event Viewer Crash Data dialog is displayed.


Paste the XML data into the dialog and click OK.


The main display will select the appropriate symbol in the main grid and display the relevant symbol, filename, line number and source code in the source code viewer below.


MapFileBrowser

Load the MAP file for the crashing exe (or DLL) into MapFileBrowser. We’re not covering versioning the DLL as most likely you will have your own methods for this.

Choose the option Find Symbol from Event Viewer XML crash log… on the Query menu. The Event Viewer Crash Data dialog is displayed.


Paste the XML data into the dialog and click OK.


The main display will select the appropriate symbol in the main grid and display the relevant symbol, filename, line number and source code in the source code viewer below.


Conclusion

Windows Event Logs can be hard to read and error prone to use. However when paired with suitable tools you can quickly and easily turn event log crashes into useful symbol, filename and line number information to inform your debugging efforts.

Monitoring a service with the NT Service API

By , February 11, 2020 5:21 pm

Debugging services is a pain. There is a lot that can go wrong and very little you can do to find out what went wrong. Perfect! Just what you need for an easy day at work. Services run in a restricted environment, these days you also need to be Administrator to do anything with them, and getting your favourite software tool which isn’t a debugger working with them is hard. I remember years ago seeing the list of things you needed to do to get NuMega’s BoundsChecker to work with services. It was a couple of web pages of instructions, each line containing a detailed step. You had to do all of the actions correctly in order to set things up to work with services.

These days Microsoft have changed the security landscape and it’s no longer possible to launch your data monitoring software tool from a service as that ability is correctly regarded as a security vulnerability. It’s also pretty much impossible to inject into a service from a GUI application. As a result the correct way to work with services is to add a few lines of glue code, in the form of calls to an API that set up communications with an already running user interface.

We’ve described our updated NT Service API in a previous article, so in this article I’m going to talk about using the API to track errors in the service code calling the API, and also describe how you use the user interface to work with services. This article will focus on C++ Memory Validator, but the techniques described here will also work for C++ Coverage Validator, C++ Performance Validator and C++ Thread Validator. If you’re using a .Net service, or a mixed mode service with a .Net entry point, you don’t need to use the API, but the GUI parts of this article will still apply to you. If you’re using a native service or a mixed mode service with a native entry point, all of this article applies to you.

Monitoring a Service

Before we get into the error codes and error handling in the GUI, let’s first take a tour of how things should work if everything goes to plan. This will provide some context for the errors I’m going to describe later. I’m going to assume you’ve built both the example service and the example service client, and that you’ve installed the service (serviceMV.exe -install in an Administrator mode command prompt). The service client passes a string to the service, which reverses it and passes it back to the client. The service also deliberately leaks some memory for testing purposes.

Here’s a video of the process.

From the Launch menu, choose Monitor a Service.


The Monitor a Service dialog is displayed.


Enter the full path to your service and click OK to start monitoring. The Validator will now set up some environment variables and some data in the registry that will be used by the service API. After a few seconds the Start your Service dialog appears.


Click OK, then start your service (you’ll need to do this from an Administrator command prompt).

serviceMV -start

The Validator attaches to the service and after a few moments various status information in the Validator title bar and the Validator status bar is updated.

It is possible that you may get a debug information informational dialog displayed. You can dismiss this (it can be viewed from the Validator Tools menu). To change how symbols are found you’ll need to look at the Symbol Server and File Locations parts of the Validator settings dialog.


Next a dialog is displayed informing you that Administrator Privileges may be required.


For some services you may find that the Validator gets better data, or sends data to the GUI faster if the Validator is run in Administrator mode. If that is the case you’ll need to restart the Validator with Administrator privileges (and also stop and restart the service, etc).

For this particular example service, we don’t need Administrator privileges so we’ll continue without them.

Now we can interact with the service from the service client by sending a string to the service. The service reverses it and sends it back.

serviceClient "Hello World"


Once we’re done working with the service we can stop it (you’ll need to do this from an Administrator command prompt).

serviceMV -stop

The Validator disconnects from the service and displays all the data it has collected from the service.


That’s how it looks when everything goes according to plan.

What happens when things go wrong? That’s what the next section is about.

Tracking errors in the service

The various API functions return a SVL_SERVICE_ERROR error code. We’ve extended this code so that you can detect when the user has forgotten to do something prior to starting the service, or you can detect if various other error conditions have occurred. Some of these error codes are internal error codes and should never be seen by a customer, but we’re documenting them here for completeness.


  • SVL_FAIL_PATHS_DO_NOT_MATCH. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_INCORRECT_PRODUCT_PREFIX. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_X86_VALIDATOR_FOUND_EXPECTED_X64_VALIDATOR. Looks like you’re monitoring a 64 bit service with a 32 bit Validator. You need to use a 64 bit Validator.

  • SVL_FAIL_X64_VALIDATOR_FOUND_EXPECTED_X86_VALIDATOR. Looks like you’re monitoring a 32 bit service with a 64 bit Validator with the svl*VStubService.lib library. You need to use a 64 bit Validator with the svl*VStubService6432.lib.

  • SVL_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_ENV_VAR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ENV_VAR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ID_NOT_SPECIFIED. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ID_NOT_A_PROCESS. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

To aid in debugging we strongly recommend that you log all error codes (success or failure) from Software Verify API calls. This will allow you to track down errors rapidly, rather than through a series of trial and error coding mistakes, or a back and forth with support that has no information to work with. We added all of the above error codes after 3 customers reported similar, but different, problems with using the service API. All of their problems would have been solved if these error codes had been available.

Error codes can be logged with this call.

void writeToLogFile(const wchar_t     *fileName,
                    SVL_SERVICE_ERROR errCode);

Helpful messages can be logged with this call.

void writeToLogFile(const wchar_t *fileName,
                    const wchar_t *text);

Error codes can be turned into human readable messages with this call.

const wchar_t *getTextForErrorCode(SVL_SERVICE_ERROR	errorCode);

And if you need to log Windows error codes, use this call.

void writeToLogFileLastError(const wchar_t *fileName,
                             DWORD         errCode);

See the help documentation for all the available API calls.
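As a sketch of the "log everything" advice, the helper below records the outcome of each API step, success or failure, together with the name of the step. To keep the sketch buildable without the Validator headers, the enum and the two API functions are replaced with stand-ins: the enumerator SVL_OK, the message texts, and the in-memory log are all assumptions made for illustration. In a real service you would delete the stand-ins and use the declarations supplied with the API. The helper name logStep is hypothetical.

```cpp
#include <string>
#include <vector>

// ---- Stand-ins (assumptions) so this sketch builds without the real API ----
enum SVL_SERVICE_ERROR
{
    SVL_OK,                                             // assumed name for success
    SVL_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR   // named in this article
};

// Stand-in for the log file: each logged line is appended here.
static std::vector<std::wstring> g_standInLog;

// Stand-in with the documented signature; the message texts are invented.
const wchar_t *getTextForErrorCode(SVL_SERVICE_ERROR errorCode)
{
    switch (errorCode)
    {
    case SVL_OK:
        return L"OK";
    case SVL_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR:
        return L"Did you Monitor a Service from the Validator?";
    }
    return L"Unknown error";
}

// Stand-in with the documented signature; ignores the file name.
void writeToLogFile(const wchar_t * /*fileName*/, const wchar_t *text)
{
    g_standInLog.push_back(text);
}
// ---- End of stand-ins ----

// Logs the result of one API step and passes the error code back to the
// caller, so it can wrap a call without changing the control flow.
SVL_SERVICE_ERROR logStep(const wchar_t     *fileName,
                          const wchar_t     *step,
                          SVL_SERVICE_ERROR errCode)
{
    std::wstring message(step);

    message += L": ";
    message += getTextForErrorCode(errCode);
    writeToLogFile(fileName, message.c_str());

    return errCode;
}
```

Because logStep returns the error code unchanged, it can wrap each API call in place, and the log then shows the last step that succeeded as well as the first that failed.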

Tracking errors in the GUI

There are a couple of mistakes that can be made in the user interface. These are related to monitoring the wrong type of service, and the location of the service. Where it is possible to identify this error in the GUI, we will do so. Where it is not, the error codes described above will help you understand the mistake that has been made.

64 bit Service, 32 bit GUI

If you try to monitor a 64 bit service with a 32 bit GUI that will fail. We can detect this and prevent this. When this error happens you will be shown an error dialog similar to this.


Note that monitoring a 32 bit service with a 64 bit GUI is OK, but you need to use the svl*VStubService6432.lib not the svl*VStubService.lib. We can’t detect this from the GUI, which is why the SVL_FAIL_X64_VALIDATOR_FOUND_EXPECTED_X86_VALIDATOR error code exists – you will get this if you are linked to svl*VStubService.lib when you should be linked to svl*VStubService6432.lib.

Service on a network share

Windows won’t let you start a service on a network share. And yet I’ve lost count of the number of times I’ve tried to do this. This is typically because I have the solution working on machine X (where I wrote it) and wish to test on machine Y, and I just use a network share to map it across. This works for applications and fails for services. This can be a real time waster and Windows isn’t exactly helpful about this, and of course it’s in a service’s startup code, so fun debugging that.

To make this failure easier to detect we check the path of the service you specify in the Monitor a Service dialog and determine if the service is on a network share. If it is we tell you we can’t work with it. This then alerts you to the fact you’ll need to copy that service locally to run tests on it. Probably an hour or two of your time saved, right there.


Conclusion

Working with services can be fraught with problems, but if you log your error codes you can easily and quickly identify any errors made configuring your use of the NT Service API that we were unable to catch with the Validator user interface.

Exception Tracer

By , August 24, 2019 8:43 am

We’ve just released another of our in-house tools – Exception Tracer.

Exception Tracer started off life as an experiment and then, through repeatedly needing to capture debugging data for various customer problems, morphed into the exception tracing tool that it is today. Exception Tracer logs debugging events that are sent to debuggers. Most debuggers respond to these events in an interactive manner, breaking the code on exceptions (such as access violations and breakpoints), stepping into and out of functions, inspecting variables. Exception Tracer doesn’t do any of those things. Exception Tracer simply logs every event and stores callstacks associated with each event. You can save the entire trace and inspect it later (or on a different machine if you wish).

Exception Tracer is great for understanding what exceptions are thrown by applications that throw a lot of exceptions, whether that is by design, or because something is going wrong and the exception handling mechanism is being triggered a lot.

We’ve provided filtering so that you only collect the events you’re interested in – perhaps all you’re interested in is what DLLs load and unload and the order they load in, or maybe you only care about a custom exception that your program throws.

We’ve also provided the ability to create minidumps when exceptions are thrown – minidump for any exception, or just the exceptions you care about.

Lastly, we’ve also provided automatic single stepping support (if you want it, turned off by default) with some intelligent options to reduce the amount of redundant single stepping events that are collected. Because you can turn single stepping on and off during a trace you can run at full speed to where the problem area is, turn on single stepping and collect just the area you need in detail.

We used single stepping to great effect to understand the cause of a stack overflow when one of our tools was shutting down on a customer machine. Turns out the culprit was an anti-virus product on the customer machine that was triggering an unexpected sequence of events that would never happen outside of the shutdown phase. We couldn’t get near this bug with a traditional debugger like Visual Studio, but Exception Tracer got us there (that’s where the intelligent filtering came in – the traces contained so many events we had to reduce the data size just to make it manageable when you were inspecting the results).


Select an item in the top window to view the event data and callstack in the lower windows.

Thread names are taken from the GetThreadDescription() API (Windows 10), thread naming exceptions, and manual naming of threads (context menu).

Specific threads can be highlighted so that you can pick out related events on the same thread.

We’d love to hear about problems you’ve solved using Exception Tracer. Please let us know.

Thread Wait Chain Inspector

By , August 22, 2019 10:53 am

Since Windows Vista the Windows operating system has included functionality to iterate across the waiting objects that form a chain between threads. I’m waiting for thread A, which is waiting for thread B, which is waiting for process Y. That sort of thing. Waits come in the form of EnterCriticalSection, WaitForSingleObject, WaitForMultipleObjects, etc. All documented in Microsoft’s Synchronization API. If you get these waits wrong, you can get deadlocks, or waits that wait forever. Either way, it’s game over for your program if that happens.
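To illustrate the idea of a wait chain, here is a simplified model in portable C++. It is not the Windows Wait Chain Traversal API: real chains also contain the synchronization objects between the threads, whereas this sketch assumes each thread waits on at most one other thread. The names ThreadId, followWaitChain and the waitsFor map are hypothetical. A deadlock shows up as a chain that loops back onto a thread it has already visited.

```cpp
#include <map>
#include <set>
#include <vector>

using ThreadId = unsigned long;

// Follows the chain of waits starting at 'start'.
// 'waitsFor' maps a waiting thread to the thread it is waiting on.
// Returns the threads visited, in order. 'deadlocked' is set to true if the
// chain loops back onto a thread that is already part of the chain.
std::vector<ThreadId> followWaitChain(const std::map<ThreadId, ThreadId> &waitsFor,
                                      ThreadId                           start,
                                      bool                               &deadlocked)
{
    std::vector<ThreadId> chain;
    std::set<ThreadId>    seen;
    ThreadId              current = start;

    deadlocked = false;

    for (;;)
    {
        chain.push_back(current);
        seen.insert(current);

        std::map<ThreadId, ThreadId>::const_iterator it = waitsFor.find(current);
        if (it == waitsFor.end())
            break;              // this thread is not waiting: the chain ends here

        current = it->second;
        if (seen.count(current) != 0)
        {
            deadlocked = true;  // we have come back to a thread already in the chain
            break;
        }
    }

    return chain;
}
```

With thread 1 waiting on 2 and 2 waiting on 3, the chain from thread 1 is 1, 2, 3 and it is not deadlocked. Add 3 waiting on 1 and the same chain is reported as deadlocked.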

The Wait Chain Traversal API was added with Windows Vista, but only made public recently. Prior to the API, access to the Wait Chain API was only via Resource Monitor, and more recently via Task Manager. A detailed article by a Microsoft field engineer, Faik Hakan Bilgen, documents the history of the Wait Chain user interfaces and then provides a console program (with source code on GitHub) to provide a wait chain dump to a text file. Unfortunately this isn’t very easy to use as it relies on decoding thread ids and process ids to understand what is happening. Also because it’s a text file, to get an update you need to run the tool again.

We decided to take inspiration from our Thread Status Monitor tool and create a version specifically for wait chains – Thread Wait Chain Inspector.


Select the process you are interested in in the upper window. Wait chains for each thread are shown in the window below. Select a thread and it and any related threads (in the same wait chain) will be highlighted in yellow. Deadlocked threads are shown in red. Process names are displayed and thread names are taken from the GetThreadDescription() API (Windows 10 only).

If you wish to debug a process you can create a minidump for any process in a wait chain. Right click on the process of interest in the wait chain and choose Create a Minidump….

Thread Wait Chain Inspector is a free tool, complementing our other threading tools, Thread Validator, Thread Lock Checker and Thread Status Monitor.

Decompression and blue days

By , July 11, 2019 1:53 pm

It’s not uncommon for the founders of startup businesses to experience problems with motivation and problems with productivity as their business grows. I’m going to write about two issues I’ve run into over the years. They recur. You can’t stop them recurring. So the best thing to do is to understand them and accept them. They’re what I call decompression and blue days.

Decompression

Decompression is the word I use to identify the following pattern. You complete a major software release to the public. Then you find yourself unable to commit to any “serious” work for a period of time. For me, it’s typically one day. For you it could be an afternoon, a day, a week, maybe more.

So what is happening with decompression? I think it’s the process of your mind unwinding all the many layers of logic, dependencies, commitments and anxiety of %^&(ing up the release (it does happen!). During this decompression period I’ve found I can work on things tangentially related to the business, but not directly related to the business. As such I can work on side projects, read technical books, non-technical books, go for a walk, play musical instruments, provide mentorship, whatever. I just can’t work on the software or on marketing for the software during this period.

I’ve also found that I can’t do anything about this. Decompression needs to happen. Once it’s done I can get back to work with no distractions about any of the issues related to that previous software release. If I try to force it, by trying to work during a decompression period I just end up doing nothing, but getting frustrated that I’m doing nothing. That isn’t healthy, so I’ve come to the conclusion that the best thing to do is to accept that this happens and work with it. Do something else that is good for your mental health during one of these periods.

If you do this for yourself, cut your team some slack too. They’re probably going through their own version of the same thing.

Blue days

Blue days are different. These don’t come after any specific event. They just appear at random. You could be having a troubling business time or you could be having a great time, building the product, or you’ve already built it and have revenue pouring in, but then one day you’re thinking “I’m wasting my time. Why are we doing this? This will never succeed. Should I stop and spend my energy on (shiny! shiny!).” Typically this is accompanied by a very bleak outlook on life. Often this can be triggered by slow sales (which might mean you get this at certain times of year).

This is like a mini-depression, a very, very short duration depression. Emotionally it’s horrible. But they go away. After you’ve had this happen to you repeatedly you realise this is just hidden emotions bubbling to the surface and needing to be released. Being aware of this then makes the next time more bearable. Depending on your disposition what you do on a blue day will vary. You may bury yourself in work, or may need to leave all that behind and head off up a hill. Do what’s best for you. Mental health first.

Conclusion

Decompression and blue days both affect productivity and motivation. You can’t do much about them. But you can learn to recognise them, accept them for what they are, and that they will pass, and take action to make them bearable while they happen.

Hopefully if you’re reading this you recognise these two states and are now thinking “Someone else experiences this too. It’s normal!” and that’s a relief 🙂

There is also a small chance you ended up here because you’re seeking out articles on depression. If that’s the case you may find this wonderful talk about depression, given by Greg Baugues at Business of Software, of some help.

Stdout redirection and capture

By , July 11, 2019 10:34 am

We were recently asked if Memory Validator could handle monitoring a program that took its input from a file and wrote its output to a file, as shown in the following example.

redirect.exe < input.txt > output.txt

Our tools could handle this, but it wasn’t obvious that they could. Also for interactive use there was no easy way to do this via the launch dialog, unless you were using Coverage Validator. We had to make changes to most of our tools so that they could do what Coverage Validator could do. All tools had a new diagnostic sub-tab added so that stdout data that was captured (an option) could be viewed.

In this article I’m going to explain how to launch a program that reads from stdin and/or writes to stdout using our tools. Although I’m talking about Memory Validator, these techniques apply to all our Validator tools.

There are four ways of doing this.

  1. Start your program from a batch file, putting the redirect of stdin and stdout in the batch file.
  2. Start your program from the launch dialog/wizard, specifying the input and output files on the launch wizard.
  3. Start your program from the command line, specifying the input and output files on the command line.
  4. Use the Memory Validator API to start Memory Validator from inside the target program.

Batch File

Using a batch file to do this is easy. Simply create your batch file in the form shown in the example below.

e:\redirect\release\redirect.exe < e:\test\input.txt > e:\test\output.txt

Save the batch file with a known name. In this example I’ll save the batch file in e:\test\redirectTest.bat. Then launch the batch file from Memory Validator (or another Validator tool). The first program launched by the batch file will be monitored.


Launch Dialog/Wizard

We modified the launch wizard and the launch dialog to include fields for an optional input file and an optional output file. We also added an option that would capture stdout so that you could view the output on the diagnostic tab.

This example shows the program testAppTheReadsFromStdinAndWritesToStdout.exe being launched, reading from e:\test\input.txt, and writing to e:\test\output.txt.


The stdout capture checkbox has been selected. This will mean a copy of stdout will be captured and displayed on the diagnostic sub-tab stdout.


Command Line

For command line operation we need two new options -stdin and -stdout, each of which takes a filename for an argument.

There are two additional arguments that you can supply to tell Memory Validator to ignore missing input files and missing output files: -ignoreMissingStdin and -ignoreMissingStdout.

memoryValidator.exe -program e:\redirect\release\redirect.exe 
                    -directory e:\redirect\release 
                    -stdin e:\test\input.txt 
                    -stdout e:\test\output.txt
                    -showErrorsWithMessageBox

API

You can use the Memory Validator API to start Memory Validator each time the target program is run. In that case just running the following command on the command prompt would cause Memory Validator to monitor the target program redirect.exe.

e:\redirect\release\redirect.exe < e:\test\input.txt > e:\test\output.txt

To use the Memory Validator API with a particular application, the following steps outline the minimum steps required.

  1. Link to svlMemoryValidatorStub.lib (_x64.lib for x64)
  2. Link to svlMemoryValidatorStubLib.lib (_x64.lib for x64)
  3. #include "stublib.h"
  4. call startProfiler(); at the start of your program
  5. See documentation in the help file for more details.

For other Validator tools the library names will change. See the documentation (topic API) for the particular tool for details.

Conclusion

There are four ways you can work with stdin and stdout with our Validator tools.

You can work with batch files, launch interactively using the launch dialog/wizard, work from the command line, and use the Validator API.

Thread naming

By , June 19, 2019 7:21 pm

Multi-threading is becoming quite common these days. It’s a useful way to provide a responsive user interface while performing work at the same time. Our tools report data per thread where that is warranted (per-thread code coverage doesn’t seem to be a thing – no one has requested it in 17 years). In this article I’m going to discuss thread naming, OS support for thread naming, and additional support for thread naming that our tools automatically provide.

The Default Thread Display

Threads are represented by a thread handle and a thread id. Typically the thread id is what will be used to represent the thread in a user interface reporting thread related data. Thread ids are numeric. For example: 341. For trivial programs you can often infer which thread is which by looking at the data allocated on each thread. However that doesn’t scale very well to more complex applications. You can end up with thread displays like this:


Microsoft Thread Naming Exception

Before Windows 10, the Win32 API did not provide any functions to allow you to name a thread. To handle this oversight, Microsoft use a convention that allows a program to communicate a thread name to its debugger. This is done by means of an exception that the program throws (and then catches, to prevent its propagation terminating the application). The debugger also catches this exception and, with the help of ReadProcessMemory(), can retrieve the thread name from the program. Here’s how that works.

In your program

The exception code that identifies a thread naming exception is 0x406D1388. To pass the thread name to the debugger you need to create a struct of type THREADNAME_INFO (definition shown in the code sample), populate it with the appropriate data, then raise an exception with the specified exception code. You’ll need to surround the RaiseException() call with SEH __try/__except to prevent crashing the program.

typedef struct tagTHREADNAME_INFO
{
	DWORD	dwType;		// must be 0x1000
	LPCSTR	szName;		// pointer to name (in user addr space) buffer must be 8 chars + 1 terminator
	DWORD	dwThreadID;	// thread ID (-1 == caller thread)
	DWORD	dwFlags;	// reserved for future use, must be zero
} THREADNAME_INFO;

#define MS_VC_EXCEPTION 0x406D1388

void nameThread(const DWORD	threadId,
                const char	*name)
{
	// You can name your threads by using the following code. 
	// Thread Validator will intercept the exception and pass it along, so if you are also running
	// under a debugger the debugger will also see the exception and read the thread name.

	// NOTE: this is for 'unmanaged' C++ ONLY!

	#define BUFFER_LEN		16

	THREADNAME_INFO	ThreadInfo;
	char		szSafeThreadName[BUFFER_LEN];	// buffer can be any size, just make sure it is large enough!
	
	memset(szSafeThreadName, 0, sizeof(szSafeThreadName));	// ensure all characters are NULL before
	strncpy(szSafeThreadName, name, BUFFER_LEN - 1);	// copying name
	//szSafeThreadName[BUFFER_LEN - 1] = '\0';

	ThreadInfo.dwType = 0x1000;
	ThreadInfo.szName = szSafeThreadName;
	ThreadInfo.dwThreadID = threadId;
	ThreadInfo.dwFlags = 0;

	__try
	{
		RaiseException(MS_VC_EXCEPTION, 0, 
                               sizeof(ThreadInfo) / sizeof(DWORD_PTR), 
                               (DWORD_PTR *)&ThreadInfo); 
	}
	__except(EXCEPTION_EXECUTE_HANDLER)
	{
		// do nothing, just catch the exception so that you don't terminate the application
	}
}

In the debugger

The contents of the THREADNAME_INFO struct are communicated to the debugger in event.u.Exception.ExceptionRecord.ExceptionInformation. Cast the address of ExceptionInformation[0] to a THREADNAME_INFO *, check that the number of parameters is correct (4 on x86, 3 on x64), check that dwType and dwFlags are correct, then use the pointer in the szName field. This is a pointer inside the other process, which means you can’t read it directly; you need to use ReadProcessMemory(). Once the debugger continues and this event goes out of scope, the ability to read the thread name no longer exists. If you want the thread name you must read it immediately and store it for future use.

	if ((de->dwDebugEventCode == EXCEPTION_DEBUG_EVENT) &&
	    (de->u.Exception.ExceptionRecord.ExceptionCode == SVL_THREAD_NAMING_EXCEPTION))
	{
#ifdef _WIN64
		if (de->u.Exception.ExceptionRecord.NumberParameters == 3)
#else	//#ifdef _WIN64
		if (de->u.Exception.ExceptionRecord.NumberParameters == 4)
#endif	//#ifdef _WIN64
		{
			THREADNAME_INFO	*tni;

			tni = (THREADNAME_INFO *)&de->u.Exception.ExceptionRecord.ExceptionInformation[0];
			if (tni->dwType == 4096 &&
			    tni->dwFlags == 0)
			{
				void	*ptr = (void *)tni->szName;
				DWORD	threadId = tni->dwThreadID;

				if (ptr != NULL)
				{
					char	buffer[1000];
					int		bRet;
					SIZE_T	dwRead = 0;

					memset(buffer, 0, 1000);
					bRet = ReadProcessMemory(hProcess,
								 ptr,
								 buffer,
								 1000 - 1,	// remove one so there is always a terminating zero
								 &dwRead);

					if (buffer[0] != '\0')
						setThreadName(threadId, buffer);
				}
			}
		}
	}
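
The parameter counts checked above (4 on x86, 3 on x64) fall out of the layout of THREADNAME_INFO and the sizeof(ThreadInfo) / sizeof(DWORD_PTR) expression used in nameThread(). The following is a minimal sketch that models the struct with fixed-width integers so both layouts can be checked with any compiler. ThreadNameInfoLayout and threadNameParamCount are hypothetical names used only for this illustration, and alignas is an assumption standing in for the natural alignment of pointer-sized fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Model of THREADNAME_INFO with the pointer width as a template parameter.
// PtrT stands in for LPCSTR / DWORD_PTR: uint32_t models a 32-bit build,
// uint64_t models a 64-bit build.
template <typename PtrT>
struct ThreadNameInfoLayout
{
    std::uint32_t dwType;                 // must be 0x1000
    alignas(sizeof(PtrT)) PtrT szName;    // pointer to the name, naturally aligned
    std::uint32_t dwThreadID;             // thread ID (-1 == caller thread)
    std::uint32_t dwFlags;                // reserved, must be zero
};

// Mirrors sizeof(ThreadInfo) / sizeof(DWORD_PTR) from nameThread().
template <typename PtrT>
constexpr std::size_t threadNameParamCount()
{
    return sizeof(ThreadNameInfoLayout<PtrT>) / sizeof(PtrT);
}

// 32-bit: four 4-byte fields, 16 bytes, 16 / 4 == 4 parameters.
static_assert(threadNameParamCount<std::uint32_t>() == 4, "x86 layout");

// 64-bit: 4 bytes + 4 bytes padding + 8-byte pointer + 4 + 4 == 24 bytes,
// 24 / 8 == 3 parameters.
static_assert(threadNameParamCount<std::uint64_t>() == 3, "x64 layout");
```

The padding after dwType in the 64-bit layout is why the debugger sees one fewer parameter on x64, even though the struct has the same four fields.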

Our tools intercept this mechanism, so if your program names its threads this way we will display those names.

The problem with this mechanism is that it puts the onus for naming the threads on the creator of the application. If the application involves 3rd party components that spawn their own threads, or OS threads, then almost certainly not all threads will be named. The result is that some threads have a name and other threads are represented by a thread id.


CreateThread() And Symbols

To name threads automatically, without relying on the author of a program using thread naming exceptions, we started monitoring CreateThread() to get the thread start address. We then resolve that address into a symbol name (assuming debugging symbols are available) and use that symbol name to represent the thread. After some tests (dog-fooding with our own tools, which are heavily multithreaded) we concluded this was a useful way to name threads.

However this doesn’t solve the problem as many threads are now named _threadstartex().


_beginthread(), _beginthreadex() And Symbols

The problem with the CreateThread() approach is that any threads created by calls to the C runtime functions _beginthread() or _beginthreadex() will result in the thread being called _threadstart() or _threadstartex(). This is because these functions pass their own thread function to CreateThread(). The way to solve this problem is to also monitor _beginthread() and _beginthreadex() functions to get the thread start address, then resolve that address into a symbol name.


SetThreadDescription

Starting with Windows 10 (version 1607), a name can be associated with a thread handle via the API SetThreadDescription(). The name associated with a thread handle can be retrieved using GetThreadDescription(). If your program names its threads with SetThreadDescription(), we read those names and use them.

Un-named Threads

Some threads still end up without names – typically these are transient threads created by the OS to do a short amount of work. If the call to create the thread was internal to Kernel32.dll then CreateThread() will not be called by an IAT call and will not be monitored, resulting in the thread not getting named automatically. This isn’t ideal, but ultimately you don’t control these threads, which means you can’t affect the data reported by these threads, so it’s not that important.

Thread Naming Priority

We’ve made these changes to all our Validator tools and updated the user interfaces to represent threads by both thread id and thread name.

If a thread is named by a thread naming exception, SetThreadDescription(), or via a Software Verify API call, that name will take precedence over any automatically generated name. This is useful in cases where the same function is used by many threads: you can, if you wish, give each thread a unique name to prevent multiple threads having the same name.
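
The precedence rule can be sketched as a small lookup table. This is only an illustration of the rule described here, not how the Validator tools are implemented; ThreadNameRegistry, NameSource and the display format are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Where a thread name came from. Higher values take precedence.
enum class NameSource
{
    None = 0,
    Automatic = 1,  // derived from the thread start address symbol
    Explicit = 2    // naming exception, SetThreadDescription() or an API call
};

class ThreadNameRegistry
{
public:
    // Records a name unless a higher-priority name is already stored.
    // Returns true if the name was accepted.
    bool setName(std::uint32_t threadId, const std::string &name, NameSource source)
    {
        assert(source != NameSource::None);

        Entry &entry = m_names[threadId];
        if (source < entry.source)
            return false;   // an automatic name never replaces an explicit one

        entry.name = name;
        entry.source = source;
        return true;
    }

    // Shows name and id together, or just the id when no name is known.
    std::string displayName(std::uint32_t threadId) const
    {
        auto it = m_names.find(threadId);
        if (it == m_names.end())
            return std::to_string(threadId);

        return it->second.name + " (" + std::to_string(threadId) + ")";
    }

private:
    struct Entry
    {
        std::string name;
        NameSource  source = NameSource::None;
    };

    std::map<std::uint32_t, Entry> m_names;
};
```

With this arrangement, a thread first seen as _threadstartex and later given an explicit name keeps the explicit name, even if the automatic naming runs again.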

Here are some of the displays that have benefited from named threads.

Bug Validator


Memory Validator


Performance Validator


Thread Validator



Business of Software Europe 2019

By , April 18, 2019 11:00 am

Last week I attended the Business of Software Europe conference. This year the conference returned to Cambridge after a few years in Dublin and London.

This year the conference logo is the black squirrel, which is apparently a genetic mutation local to Cambridge.

As usual, an excellent conference with a wide range of topics and an interesting mix of speakers. I’m not going to go into detail on the talks; other folks will have already written them up in more detail than my sparse notes will allow. If you’re interested in high level strategic thought about how to run a software business, this is the conference for you.

But I did make a note of every book that a speaker or attendee mentioned.

Let It Go, by Dame Stephanie ‘Steve’ Shirley CH. This book was provided to all attendees by the conference.

Art of Profitability, by Adrian Slywotzky

Thinking, Fast and Slow, by Daniel Kahneman

Games People Play, by Eric Berne

Rest, by Alex Soojung-Kim Pang

What Got You Here Won’t Get You There, by Marshall Goldsmith

Skin In The Game, by Nassim Nicholas Taleb

Let My People Go Surfing, by Yvon Chouinard

This Is Marketing, by Seth Godin

Clean Code, by Robert C. Martin

Building A Story Brand, by Donald Miller

11 Laws Of Showrunning (PDF), by Javier Grillo-Marxuach. Also a podcast.

Powerful, by Patty McCord

The Art of Product Management, by Rich Mironov

What They Don’t Teach You At Harvard, by Mark H. McCormack

Chimp Paradox, by Professor Steve Peters

The Manager’s Path, by Camille Fournier

Radical Candor, by Kim Scott

Badass, by Kathy Sierra

Purple Cow, by Seth Godin

Film: Free Solo

Video: Steve Jobs marketing talk

Metcalfe’s Law, as the number of people rises, the number of connections gets out of control.

Conway’s Law, your designs are constrained by your communication structures. This graphic captures it all. Note the Microsoft graphic.

There’s more than one way to leak a GDI object

By , September 4, 2018 10:05 am

Working with GDI in Windows, whether you’re using Win32 calls or MFC, you’re concerned with pens, brushes, fonts, bitmaps and regions for drawing. You may also be concerned with palettes, although with our full colour displays these days working with palettes is something of a rarity.

The typical way to work with a GDI object, for example, a pen, is shown below. Create the pen, select it into the DC, do the drawing, select the original object back into the DC, delete the pen.

void CtestGDISelectObjectDlg::drawSomethingWithARedPen(HDC hDC)
{
	HPEN		hPen;
	HPEN		hPenOld;
	HGDIOBJ		retVal;

	hPen = ::CreatePen(PS_SOLID, 0, RGB(255, 0, 0));

	hPenOld = (HPEN)SelectObject(hDC, hPen);

	doDrawing1(hDC);

	retVal = SelectObject(hDC, hPenOld);

	DeleteObject(hPen);
}

Leaking GDI objects, type 1

The most common way people leak GDI objects is because they simply forget to delete them. Here’s the previous example, modified to leak the pen. (Don’t do this!).

void CtestGDISelectObjectDlg::drawSomethingWithARedPen(HDC hDC)
{
	HPEN		hPen;
	HPEN		hPenOld;
	HGDIOBJ		retVal;

	hPen = ::CreatePen(PS_SOLID, 0, RGB(255, 0, 0));

	hPenOld = (HPEN)SelectObject(hDC, hPen);

	doDrawing1(hDC);

	retVal = SelectObject(hDC, hPenOld);
}

The author of the code forgot to delete the pen using DeleteObject(hPen). This leak looks like this in Memory Validator.


Leaking GDI objects, type 2

There is another way to leak GDI objects, even when you think you’ve deleted all the objects you created. Take a look at this code. Does it leak?

	HPEN		hPen1;
	HPEN		hPen2;
	HPEN		hPenOld;
	HGDIOBJ		retVal;

	hPen1 = ::CreatePen(PS_SOLID, 0, RGB(255, 0, 0));
	hPen2 = ::CreatePen(PS_DASHDOTDOT, 0, RGB(255, 0, 0));

	hPenOld = (HPEN)SelectObject(hDC, hPen1);

	doDrawing1(hDC);

	retVal = SelectObject(hDC, hPen2);
	
	doDrawing2(hDC);

	DeleteObject(hPen1);
	DeleteObject(hPen2);

A quick examination shows that two pens are created, some work is done with each pen, then the pens are deleted.

No leaks, right?

Wrong! hPen2 was selected into the DC for use with doDrawing2() but was not deselected from the DC prior to the call to DeleteObject(hPen2). This means that the call to DeleteObject() will fail as the pen is still in use. hPen2 has been leaked.

Memory Validator can detect this (as of V7.38, released today). Here’s what that looks like:


Expand the source and you can easily see the failed DeleteObject() call and the SelectObject() call that means the object was still in use.


Here’s what the non-leaking code should look like:

	HPEN		hPen1;
	HPEN		hPen2;
	HPEN		hPenOld;
	HGDIOBJ		retVal;

	hPen1 = ::CreatePen(PS_SOLID, 0, RGB(255, 0, 0));
	hPen2 = ::CreatePen(PS_DASHDOTDOT, 0, RGB(255, 0, 0));

	hPenOld = (HPEN)SelectObject(hDC, hPen1);

	doDrawing1(hDC);

	retVal = SelectObject(hDC, hPen2);
	
	doDrawing2(hDC);

	SelectObject(hDC, hPenOld);

	DeleteObject(hPen1);
	DeleteObject(hPen2);

To ensure your object is not still selected into the DC you can select the value that was returned to you by the first call (hPenOld in the code above), or you can select a stock object into the DC instead. For example:

	SelectObject(hDC, GetStockObject(BLACK_PEN));
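
One way to make the deselect step hard to forget is to let a small scope guard put the original object back automatically. The following is a minimal sketch of that idea, not code from Memory Validator. To keep it self-contained, FakeDC, fakeSelect() and fakeDelete() are hypothetical stand-ins for an HDC, SelectObject() and DeleteObject(); fakeDelete() simply models the behaviour described above, where deletion fails while the object is still selected.

```cpp
#include <cassert>

// Selects an object on construction and restores the previous one on destruction.
template <typename DC, typename Obj, typename SelectFn>
class ScopedSelect
{
public:
    ScopedSelect(DC dc, Obj obj, SelectFn select)
        : m_dc(dc), m_select(select), m_old(select(dc, obj))
    {
    }

    ~ScopedSelect()
    {
        m_select(m_dc, m_old);   // put the original object back into the DC
    }

    ScopedSelect(const ScopedSelect &) = delete;
    ScopedSelect &operator=(const ScopedSelect &) = delete;

private:
    DC       m_dc;
    SelectFn m_select;
    Obj      m_old;
};

// Hypothetical stand-ins so the sketch compiles without windows.h.
struct FakeDC
{
    int selected;   // handle of the object currently selected into the DC
};

// Models SelectObject(): selects obj and returns the previous object.
inline int fakeSelect(FakeDC *dc, int obj)
{
    int old = dc->selected;
    dc->selected = obj;
    return old;
}

// Models DeleteObject(): fails while the object is still selected.
inline bool fakeDelete(const FakeDC *dc, int obj)
{
    return dc->selected != obj;
}

// Draw with a pen, then delete it. The guard deselects the pen first.
inline bool drawAndDelete(FakeDC *dc, int pen)
{
    assert(dc != nullptr);
    {
        ScopedSelect<FakeDC *, int, int (*)(FakeDC *, int)> guard(dc, pen, fakeSelect);
        // drawing with the pen would happen here
    }
    return fakeDelete(dc, pen);   // succeeds: the pen is no longer selected
}
```

With real GDI the same template could, in principle, be instantiated with HDC, HGDIOBJ and ::SelectObject. That wiring is an assumption of this sketch rather than something tested here.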

Conclusion

When working with GDI objects you need to keep track of two things:


  1. Creation and Deletion of GDI objects. For every pen, brush, font, bitmap, palette that you create you must delete those objects when you are finished with them. Delete objects using DeleteObject().

  2. You need to ensure that none of the objects that you create are selected into a DC when you try to delete them.

Detecting memory leaks in Visual Test unit tests

By , June 6, 2018 5:40 pm

Introduction

We recently had a request asking if C++ Memory Validator could detect memory leaks in unit tests managed by Microsoft’s Visual Test and Visual Test Explorer. They told us what they’d tried to do and that it had failed. Our response was to find out what was failing, fix it and then describe what you need to do to use our tools with Visual Test. That’s what this article is about.

There were a few bugs specific to working with Visual Test, plus a very novel environment variable data corruption that can only happen in very unusual circumstances (you almost certainly would not hit these in normal usage of our tools). We fixed these bugs, then experimented with several ways of working with Visual Test. We’re only going to talk about C++ Memory Validator here, but this article also applies to our Coverage, Performance and Thread tools.

You can run Visual Test from the command line, and also via Visual Test Explorer, which is a component of Visual Studio.

Monitoring Visual Test from the command line

The full details of how to work with Visual Test from the command line are documented in this article from Microsoft.

Using this information we know that we need to launch vstest.console.exe and pass the unit test DLL to that as an argument. For example:

vstest.console.exe unitTest.dll 

Using this information we can launch vstest.console.exe from C++ Memory Validator and pass it the appropriate DLL. You can set the startup directory to whatever you like. We chose to set it to the same directory as the unit test DLL.

Unit Test Code

namespace UnitTest1
{		
	TEST_CLASS(UnitTest1)
	{
	public:
		
		TEST_METHOD(TestMethod1)
		{
			// pretend this is a real unit test, 
			// exercising a real target class/function
			// and that it leaks some memory

			char *ptr;

			ptr = new char[123];
			strcpy_s(ptr, 123, "Excellent Adventure");

			ptr = (char *)malloc(456);
			strcpy_s(ptr, 456, "Bogus Journey");

			ptr = (char *)malloc(789);
			strcpy_s(ptr, 789, "Face the Music");

			Assert::AreNotEqual("Bill", "Ted");
		}

	};
}

You’ll notice that you don’t need to specify the TEST_MODULE_INITIALIZE(moduleInit) function, or the TEST_MODULE_CLEANUP(moduleCleanup) function. You only need to specify the unit tests you want tested. C++ Memory Validator does the rest.

Monitoring Visual Test Explorer

When working with Visual Test Explorer, Visual Test is launched from Visual Studio, and then loads the unit tests for testing from a DLL. Visual Test then stays running in the background and does not shut down. This means the “program has ended” signal that C++ Memory Validator needs doesn’t get sent. It also means that subsequent tests run with Visual Test Explorer won’t cause Visual Test to start up, meaning that the “program has started” signal that C++ Memory Validator needs doesn’t get sent.

To get around these problems we need to use the NT Service API (see help file, or online documentation for details) to contact C++ Memory Validator at the start of the unit tests, and also at the end of the unit tests. We do that using the TEST_MODULE_INITIALIZE(moduleInit) and TEST_MODULE_CLEANUP(moduleCleanup) functions.

Functions

We use svlMVStub_LoadMemoryValidator() to load C++ Memory Validator into the unit test, then svlMVStub_StartMemoryValidator() to start it monitoring the unit test and communicating with the user interface. At the end of the tests we use svlMVStub_UnloadMemoryValidatorAndTerminateProcess(1000) to shut down C++ Memory Validator and start a thread that, after a delay of 1000ms, terminates the vstest.executionengine.x86.exe (Visual Test) process. You may need to experiment with this delay on your machine.

Header Files

We need to include two header files from C++ Memory Validator: svlMVStubService.h and svlServiceError.h. You’ll find these in the svlMVStubService folder in the C++ Memory Validator install directory.

Libraries

We also need to link to svlMVStubService.lib (32 bit builds) or svlMVStubService_x64.lib (64 bit builds). You’ll find versions of these libraries in the svlMVStubService folder (one library per version of Visual Studio).

Unit Test Code

#include "svlMVStubService.h"
#include "svlServiceError.h"

namespace UnitTest1
{		
	TEST_MODULE_INITIALIZE(moduleInit)
	{
		SVL_SERVICE_ERROR	sse;

		sse = svlMVStub_LoadMemoryValidator();

		sse = svlMVStub_StartMemoryValidator();
	}

	TEST_MODULE_CLEANUP(moduleCleanup)
	{
		SVL_SERVICE_ERROR	sse;
		
		sse = svlMVStub_UnloadMemoryValidatorAndTerminateProcess(1000);
	}

	TEST_CLASS(UnitTest1)
	{
	public:
		
		TEST_METHOD(TestMethod1)
		{
			// pretend this is a real unit test, 
			// exercising a real target class/function
			// and that it leaks some memory

			char *ptr;

			ptr = new char[123];
			strcpy_s(ptr, 123, "Excellent Adventure");

			ptr = (char *)malloc(456);
			strcpy_s(ptr, 456, "Bogus Journey");

			ptr = (char *)malloc(789);
			strcpy_s(ptr, 789, "Face the Music");

			Assert::AreNotEqual("Bill", "Ted");
		}

	};
}

Having reworked the unit tests to support the NT Service API, we now need to launch devenv.exe from C++ Memory Validator, but with instructions to ignore devenv.exe and monitor Visual Test (vstest.executionengine.x86.exe). We do that from the launch dialog/wizard.

First choose devenv.exe to monitor using the Browse… button next to the Application to launch field. In this example we chose C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\Common7\IDE\devenv.exe


Next we need to setup the applications to monitor. Click the Edit… button next to the Application to monitor field. The applications to monitor dialog is displayed.


Choose devenv.exe using the Browse… button next to the Application to launch field.

Click the Add… button and add the vstest.executionengine.x86.exe that corresponds with the devenv.exe you selected. In this example we chose C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\Common7\IDE\CommonExtensions\Microsoft\TestWindow\vstest.executionengine.x86.exe. Be sure to choose the correct one – there are several with similar names to choose from.


Click OK to accept the application to monitor.


Click OK to accept the new definitions.

Now on the launch dialog, change the Application to monitor combo to select the vstest.executionengine.x86.exe entry.

Enter the name of your unit test dll(s) in the Arguments field, separated by spaces. If you specify the directory containing the DLLs as the startup directory, you can specify the DLL names without paths.

Running Your Tests

Now that we’ve setup C++ Memory Validator to monitor Visual Test when started from devenv, let’s get to work.

To start devenv, click the Go! button. Devenv will start. Load the solution that contains your unit tests. When you choose Run Selected Tests or Debug Selected Tests from Visual Test Explorer, the tests will run and will be monitored by C++ Memory Validator at the same time.
