Setting up ISAPI on IIS 10

By , April 3, 2020 11:27 am

Introduction

OK so it’s 2020 and how many people are developing ISAPI extensions? More than you might imagine. Yeah, Ruby on Rails and Rust are all the rage these days, but some people still need to work with ISAPI for a bunch of business reasons. I recently had to set up IIS 10 to work with ISAPI on Windows 10. I read a lot of articles on how to do it. None of them were complete, so I had to read several articles to get something working. I put this together mainly for my own benefit (because I really don’t need to spend that much time doing this again!). I’m sharing it so you don’t have to go through this.

There’s an interesting gotcha if you’re developing a 32 bit ISAPI extension. Don’t worry, I cover that at the end.

I was trying to get a simple ISAPI extension to work before trying anything else. My guess is most of you are working on legacy code, but a few of you may have been instructed to write a new ISAPI. Here’s a good starting point for a simple ISAPI extension if you haven’t already written one.

Creating an ISAPI extension: https://www.codeproject.com/Articles/1432/What-is-an-ISAPI-Extension

Installing IIS components

IIS components are installed via the Windows features dialog.

In the Windows 10 search box type “Turn Windows features on or off”. When Windows shows you the matching result, press return (or click it).



The feature selection box is displayed. Select the items highlighted red in the image shown below. Click OK.



If you’ve already got partway through configuring IIS Manager and have realised you don’t have all the required components installed, that’s OK. Just install them, then close IIS Manager and reopen it. (I found that if I didn’t do that, not all the component parts would show in IIS Manager, making it impossible to find, say, ISAPI and CGI Restrictions.)

Configuring IIS Manager

Start Internet Information Services Manager.

Website

First of all we need a website to work with. If you’ve already got one skip the next few lines.

Add a test website. Right click on “Sites” in the left hand menu and choose “Add Website…”

Choose a website name. For example: “test”.

Choose a location for the website. For example: C:\testISAPIWebsite

Change the port number (just for testing) so that it doesn’t conflict with any other sites you have. For example: 81.

Handler Mappings

Select the server node on the left hand side and double click on Handler Mappings on the right hand side.



The handler mappings are displayed.



Right click in empty space and choose “Edit Feature Permissions…”.

The Edit Feature Permissions dialog is displayed. Enable Read, Script and Execute permissions. When you select the Execute check box you’ll notice the entry for ISAPI DLLs is added to the displayed Handler Mappings. Click OK.



ISAPI and CGI Restrictions

Select the server node on the left hand side and double click on “ISAPI and CGI Restrictions” on the right hand side.

Right click in empty space and choose “Add…”.

Add the path to your ISAPI dll, a description and select the check box so that it is allowed to execute. Click OK.



This will place a web.config in the directory that contains the DLL. It will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <handlers accessPolicy="Read, Execute, Script">
            <remove name="ISAPI-dll" />
            <add name="ISAPI-dll" path="*.dll" verb="*" modules="IsapiModule" scriptProcessor="C:\testISAPIWebsite\validate.dll" resourceType="File" requireAccess="Execute" allowPathInfo="true" preCondition="bitness32" />
        </handlers>
    </system.webServer>
</configuration>

32 bit ISAPI extensions

If your ISAPI is 32 bit you’ll need to enable 32 bit applications for the application pool that hosts it. Go to Application Pools (under the server node), select the application pool that your website is in, right click, choose “Advanced Settings…”. Change the “Enable 32-Bit Applications” setting to True.


Authentication problems

If when trying to view your web pages you get strange error messages, select the server node on the left then go to “Feature Delegation” and turn any entries that are “Read only” to “Read/Write”. Then restart the server (top of the right hand bar).

Note that I’m assuming you’re working on a Dev machine. If you’re working on a production machine you might want to be a bit less cavalier than just turning all settings to Read/Write – work through them one at a time to find out what you need and change only that.

Trying out the website

If we assume your ISAPI is called validate.dll you should be able to test your ISAPI in a browser using http://localhost:81/validate.dll?12345678

The unintended consequence of not paying sick pay

By , March 6, 2020 4:49 pm

Setting the scene

I wrote this several years ago but it was never published. Reading it now it still seems to be valid. Even more so with coronavirus in the news.

Odds are that if you are reading this and you are an IT professional this situation doesn’t apply to you. You are either employed in a full time permanent position (you receive sick pay) or you are a contractor (you provide your own sick pay – it’s a risk you take). This article is the result of a discussion with a friend. We are both based in the UK. In your country the legal aspects may be different but the principle remains.

However if you are a part time employee you may not be paid sick pay. I found this out when chatting with a friend of mine about one of their part time jobs (they have several, one main one to provide an income backbone and two others to ring the changes so they don’t do one type of work all the time).

I know this person really well. Conscientious, hard working, honest, caring, won’t accept jobs they can’t do well, etc. The type of person you want working for you. A string of events have happened to them recently that make them doubt their current main employer. First the employer won’t delegate and gets over stressed as a result. This then leads to the employer bad mouthing the staff to other people (which it turns out have been friends of some of the staff and in one case a customer – ooops!). This gets back to the staff and is seriously demoralising, especially when it’s unjustified.

Then recently my friend was ill for about a week. When they received their pay they found out that they had not been paid for the days they took sick. You can imagine how this made them feel – not very valued by their employer. This has caused a lot of upset (and financial harm) to this person.

This raises some interesting questions.


  1. What do you gain/lose by paying sick pay?

  2. What do you gain/lose by not paying sick pay?

I’ve posed the questions in this way because this is surely the reasoning behind choosing not to pay sick pay. That paying it is a cost to be avoided and not paying it saves you money. Let us examine this:

What do you gain/lose by paying sick pay?

There is undoubtedly a risk that some of the people you employ are not going to be as honest as some others. Some put the hours in they are required to do and go home. Some do that and also do more and also do personal development. And some put the time in but take every opportunity to shirk off their work, be lazy and take sick days even when they are not sick.

It’s an unfortunate fact that some people will game the system to their advantage. If they know they’ll get paid for being ‘ill’ while they are really at the local cinema watching the latest flick on the first day of opening, then they think that is a risk worth taking and they "pull a sicky".

But on the flip side, what you gain is loyalty from those staff that don’t fall into that group. They value the fact that you will look after them when they are ill. These people rarely abuse the sick day provision.

What do you gain/lose by not paying sick pay?

So there are pros and cons to paying sick pay. All reasonably obvious. But what about the consequences of not paying sick pay?

Let’s start with the obvious consequence: By not paying sick pay you save money not only by not paying for illegitimate sick day claims, but you also save money when hard working staff are ill too. Ca-Ching! Well, not really as you’ve just demonstrated to the hard working staff that you don’t value them. Pretty stupid move.

Any not so obvious consequences? Yes. If you are not going to be paid when you are sick, how is that any different to unpaid leave? Apart from the chances of dismissal if caught being dishonest, it’s the same. By not paying sick pay you remove any incentive to be honest about why you were not at work (were you ill or you just couldn’t be bothered or you thought tarring and feathering the local tramp was a better idea?).

"I fancy doing some decorating today? I’ll pull a sicky. I won’t get paid, but I will get this annoying job done. Which is more important to me today? My life is, Decorating it is then." And to hell with the business today. Instant holiday. No permission asked.

This one could leave some businesses in the lurch when they find out with no notice that someone isn’t coming in and the only reason is "I’m sick" (but what they don’t know is this person doesn’t care because they know they are not valued).

A lot of businesses employ part time staff. Many of them are businesses that start early in the morning and close late at night, resulting in a day longer than the typical 7.5 or 8 hours. The last thing these businesses need is to find out at 2pm as the second shift starts that a key team player isn’t coming in today. And by not paying sick pay they increase the likelihood of such a situation.

Working ill

The other unintended consequence of not paying sick pay is that people that are ill and should not attend work, may well choose to attend work because they need the money more than they need to stay in bed recovering. This is especially true of people on zero hours contracts and low pay contracts.

Coronavirus is in the news these days, and to stem this pandemic we really need ill people to do the right thing. But the right thing for society is in competition with the right thing for an individual on a low income who probably has negligible savings. They’re going to come to work with their illness as the short term gain for them (pay) outweighs the long term harm to others (some people may get ill). Humans discount future events, so that harm is in the distance and also not to them, as they’re already ill.

This latter scenario is seriously reduced as an outcome if people are paid enough to be able to take time off work when they are ill.

My friend – their choice

I’m not sure what they are going to do. What is clear, though, is that having realised this unintended consequence and how their employer feels about them (despite customer comments to the contrary) their loyalty to the business has evaporated. I wouldn’t be at all surprised to find some unpaid "sick leave" happens to allow my friend to do various things they need to do in their personal life at the expense of the business they would otherwise be working for.

I guess some people are going to read this and think WTF? But the point is this consequence only happens when you have good conscientious staff and then you don’t value them. If they were paid sick pay they wouldn’t feel unvalued or even think of pulling a sicky, let alone an unpaid sicky (which is in this situation the same, but more wilful).

The really sad thing? My friend likes to do a good job. Wants to work where a good job and good attitude is valued. But it seems that for the types of work they do such employers are rare beasts (not like the IT world where to keep staff you have to be a good employer).

I’m not really sure if my words have done justice to what I’m trying to explain here.

Detecting Abandoned Critical Sections

By , March 6, 2020 4:24 pm

Multithreading is a powerful way to improve the processing throughput and responsiveness of your software. We use it to great effect at Software Verify. In order to manage multithreading successfully it’s necessary to use some form of synchronization between each thread that wishes to read/write data. Deadlocks can result. The main cause of deadlocks is two or more locks (critical section being an example) accessed in different orders on each thread. This has been the subject of much writing, so for now I won’t repeat that topic here.

There is another cause of deadlock which is less well known. The abandoned critical section.

In this article I’m going to describe how to detect abandoned critical sections. But first I need to describe them to you and explain how abandoned critical sections get created.

What is an Abandoned Critical Section?

An abandoned critical section is a critical section that has been locked but then the thread that owns the lock ends without unlocking the critical section. This creates a critical section that cannot be unlocked, and is thus permanently locked. If any other thread attempts to enter the critical section it will wait forever, in a deadlock caused by an infinite wait.

How does this happen?

There are several ways that a critical section can become abandoned.

  • Incorrect code.
  • Incorrect exception handling.
  • Terminate Thread.

Incorrect code

This is where the thread code enters a critical section to do some work and forgets to unlock the critical section. Then the thread exits. If you use object oriented code (CSingleLock for example) to manage the lifetime of critical section ownership then this problem should never happen. But if you manually control the locking, using say, CCriticalSection::Lock() and CCriticalSection::Unlock(), or EnterCriticalSection(&cs) and LeaveCriticalSection(&cs) then it’s possible for you to forget to leave a locked CS, or for a logic failure to result in a critical section not being unlocked.

If you’re using object oriented synchronization locking methods you might want to look at Thread Lock Checker to automate checking for some simple and common errors that can happen.

DWORD WINAPI doThread(LPVOID param)
{
	EnterCriticalSection(&dataCS);
	
	doWork(data);
	
	return 0;	// forgot to call LeaveCriticalSection(&dataCS);
}
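
If you’d rather not rely on remembering to call LeaveCriticalSection(), the usual fix is to let a destructor do it, which is the idea behind CSingleLock. Here’s a minimal sketch of that idea. It isn’t the MFC implementation: ScopedLock, CountingLock and doThreadGuarded are names I’ve made up for illustration, and CountingLock is only a stand-in for a CRITICAL_SECTION so that the example builds anywhere.

```cpp
#include <cassert>

// Hypothetical stand-in for a CRITICAL_SECTION: it only counts how many
// times it is currently held, so the effect of the guard can be observed.
struct CountingLock
{
	int held = 0;

	void lock()   { ++held; }
	void unlock() { assert(held > 0); --held; }
};

// Hypothetical scoped guard: locks in the constructor, unlocks in the
// destructor, so every exit path from the enclosing scope releases the lock.
template <typename Lock>
class ScopedLock
{
public:
	explicit ScopedLock(Lock &l) : lockRef(l) { lockRef.lock(); }
	~ScopedLock() { lockRef.unlock(); }

	ScopedLock(const ScopedLock &) = delete;
	ScopedLock &operator=(const ScopedLock &) = delete;

private:
	Lock &lockRef;
};

// Thread body with an early return. There is no explicit unlock to forget.
unsigned long doThreadGuarded(CountingLock &dataLock, bool bailOutEarly)
{
	ScopedLock<CountingLock> guard(dataLock);

	if (bailOutEarly)
		return 1;	// guard's destructor releases the lock here

	// ... doWork(data) would go here ...

	return 0;	// ... and here
}
```

With a real CRITICAL_SECTION the guard’s constructor would call EnterCriticalSection() and its destructor would call LeaveCriticalSection().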

Incorrect exception handling

This is where some code in a thread is protected by an exception handler (you’re calling a 3rd party library, or working with data of unknown integrity) and a critical section is locked when an exception is thrown. In an ideal world the exception handler will leave that locked critical section. Unfortunately the writer of the exception handler may not know about the critical section, or they may have forgotten about it – either way the locked critical section doesn’t get unlocked. As with the previous case, if you use object oriented access to critical sections (CCriticalSection, CSingleLock) the process of unwinding the stack during the exception handling should automatically unlock these locks. This won’t happen if you’re using CRITICAL_SECTIONs with the Win32 API.

DWORD WINAPI doThread(LPVOID param)
{
	__try
	{
		EnterCriticalSection(&dataCS);
	
		doWork(data);	// something inside here throws an exception. 
	
		LeaveCriticalSection(&dataCS);
	}
	__except(EXCEPTION_EXECUTE_HANDLER)
	{
		// forgot to call LeaveCriticalSection(&dataCS);
	}
	
	return 0;	
}
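
For comparison, here’s a sketch of the same thread written so that the handler has nothing to forget. Note the assumptions: it uses standard C++ try/catch and std::mutex with std::lock_guard rather than the __try/__except and CRITICAL_SECTION shown above, and dataMutex, doWorkThatThrows and doThreadExceptionSafe are hypothetical names. Structured exceptions interact with destructors differently depending on how the code is compiled, so treat this as an illustration of the C++ exception case only.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex dataMutex;	// portable stand-in for dataCS

// Hypothetical worker that fails part way through.
void doWorkThatThrows()
{
	throw std::runtime_error("bad data");
}

// Returns 0 on success, 1 if the work threw.
unsigned long doThreadExceptionSafe()
{
	try
	{
		std::lock_guard<std::mutex> guard(dataMutex);

		doWorkThatThrows();
	}
	catch (const std::exception &)
	{
		// Nothing to unlock here: stack unwinding has already destroyed
		// 'guard', which released dataMutex before this handler ran.
		return 1;
	}

	return 0;
}
```

After doThreadExceptionSafe() returns, another thread can take dataMutex straight away, which is exactly what doesn’t happen with the abandoned critical section above.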

Terminate Thread

This is where a thread that is doing some work that has accessed some critical sections is killed by another thread calling TerminateThread(). There are occasions where TerminateThread() can be useful, but this is a last ditch method for dealing with threads. If your code is using TerminateThread() to manage your own threads why not spend some time to work out how not to use TerminateThread and to make your threads end normally (by exiting the thread or calling ExitThread()).

// correctly written thread

DWORD WINAPI doThread(LPVOID param)
{
	EnterCriticalSection(&dataCS);
	
	doWork(data);
	
	LeaveCriticalSection(&dataCS);
	
	return 0;
}

void mainThread()
{
	HANDLE	hThread;
	DWORD	threadId;
	
	hThread = CreateThread(NULL, 0, doThread, NULL, 0, &threadId);
	if (hThread != NULL)
	{
		doSomeWork();
		
		TerminateThread(hThread, 0); // this is a bit brutal
		CloseHandle(hThread);
	}
}
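
A sketch of the alternative: the thread checks a flag between units of work and ends normally when asked to. This is portable C++ (std::thread, std::atomic and std::mutex) rather than the Win32 calls used above, and workerLoop, runAndStopWorker, stopRequested and workMutex are made-up names. The same shape should carry over to a thread started with CreateThread() that polls a flag or an event.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

std::mutex workMutex;	// portable stand-in for dataCS
std::atomic<bool> stopRequested{false};
std::atomic<int> iterations{0};

// Worker loop: takes the lock for each unit of work and checks the stop
// flag between units, so it never exits while holding the lock.
void workerLoop()
{
	while (!stopRequested.load())
	{
		{
			std::lock_guard<std::mutex> guard(workMutex);
			++iterations;	// stands in for doWork(data)
		}
		std::this_thread::sleep_for(std::chrono::milliseconds(1));
	}
}

// Starts the worker, lets it run briefly, then asks it to stop and waits
// for it to finish. Returns true if the lock is free afterwards.
bool runAndStopWorker()
{
	stopRequested = false;
	std::thread worker(workerLoop);

	std::this_thread::sleep_for(std::chrono::milliseconds(20));

	stopRequested = true;	// polite request instead of TerminateThread()
	worker.join();		// wait for the thread to end normally

	if (!workMutex.try_lock())
		return false;
	workMutex.unlock();
	return true;
}
```

Because the worker only ever holds the lock inside the inner scope, there is no point at which stopping it can leave the lock abandoned.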

How to detect Abandoned Critical Sections?

We have two ways to detect Abandoned Critical Sections.

  • Thread Wait Chain Inspector
  • Thread Validator

Thread Wait Chain Inspector

Thread Wait Chain Inspector is a free software tool that we wrote that uses the Win32 Wait Chain API to identify various wait chain states of the locks and waits in a given application. Just select the application in question and look at the results.


This tool tells you process ids and thread ids, but it can’t give you symbols, filenames and line numbers. It will provide thread names if you’re working on Windows 10 and you’ve named your threads using the SetThreadDescription() API.

Thread Validator

Thread Validator is our thread analysis software tool for analysing thread synchronization problems, deadlocks, busy locks, slow locks, contended locks and recursing locks. We’ve recently added some reporting options to Thread Validator that will help you identify the location of abandoned critical sections.

I’ve used the tvExample demonstration application that ships with Thread Validator (you’ll need to build it) to deliberately create two abandoned critical sections. From the test menu choose “Exit thread with a locked critical section” and “Terminate thread with a locked critical section”.

The summary display will show an abandoned count of 2 in the Errors panel.


The various locks displays will colour the abandoned thread dark purple and list the Lock status as Abandoned.


If you click the Abandoned bar in the Errors panel, the display will move to the Analysis tab and the callstacks for the abandoned critical sections will be displayed.


Expanding each entry reveals the callstacks so that you can see where each critical section is abandoned. Note that each entry shows two callstacks. The first is where the critical section was created. The second is where the critical section was abandoned. You can expand any entry on any callstack to see the source code.

Abandoned because of thread exit


Abandoned because of TerminateThread()


Expanding the callstack entries to reveal the source code…


Conclusion

Abandoned Critical Sections are bad news. They cause deadlocks. But they don’t need to be hard to track down when you’ve got the right tools to put to work.

Turbo Debugger Symbols Viewer

By , March 6, 2020 12:48 pm

If you’re using Delphi or 32 bit C++ Builder your compiler/linker produces symbols in TDS format. TDS means Turbo Debugger Symbols – it’s an old naming convention from the days of the Turbo compilers.

There are occasions when it’s useful to know what’s inside the symbol file. Is the symbol name as I expected? Or is it mangled to something else? Is the symbol name in the debug info? If not, then maybe the compiler optimised that function out of existence (it does happen).

We’d already written a symbol file viewer for Visual Studio PDB files. It seemed logical to write a similar tool for TDS symbols. That tool is TDS Browser.

TDS symbols can be stored in the executable to which they relate, or in a separate TDS file. The 64 bit version of Delphi doesn’t provide an option for symbols in an external TDS file, which is odd, as no one wants to ship symbols with their executable. But hey, Embarcadero must have their reasons, right? As a result to view symbols with TDS Browser you can specify the TDS file or the executable file, either way we’ll load the symbols if they exist.

Symbol name mangling is different for 64 bit symbols, so we’ve provided an option to see the raw, unmangled symbol name for the occasions when you need that.

You can sort the symbols by any column, and reverse the sort by clicking the same column again. This is useful for finding symbols with source files – sort by filename. Select a symbol and we’ll show you the source code and line numbers if we can find the source code (not easy for source files that don’t have complete paths).

We provided options for resolving addresses into symbols for the 4 main use cases:

  • Crash at absolute address in DLL.
  • Crash at relative address in DLL.
  • Crash at symbol relative address.
  • Crash data from Windows Event Log in XML format.

The above options are on the Query menu.
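
The difference between the first two cases is just where the address is measured from. As a sketch (absoluteToRelative is a made-up name and this isn’t TDS Browser’s code), a DLL relative address is the absolute address minus the address the DLL was loaded at:

```cpp
#include <cassert>
#include <cstdint>

// Converts an absolute crash address into an offset from the start of the
// module, given the base address the module was loaded at.
// Returns false if the address is below the module's base address.
bool absoluteToRelative(std::uint64_t crashAddress,
                        std::uint64_t moduleBase,
                        std::uint64_t &relative)
{
	if (crashAddress < moduleBase)
		return false;

	relative = crashAddress - moduleBase;
	return true;
}
```

For example, with a hypothetical load address of 0x10000000, a crash at 0x100017b2 is at offset 0x17b2 into the DLL.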


If the crash address can be resolved into a symbol, the symbol is selected, and the relevant source code and line number information is displayed. A context menu also provides additional options for highlighting symbols and copying symbol related information to the clipboard.


Here’s a short video showing how to use TDS Browser.

An easier way to view crashes in the Windows Event Log

By , March 6, 2020 11:45 am

In a previous article I wrote about how to identify crashes in the Windows Event Log.

You need to use the Windows Event Viewer, inspect each entry looking for some keywords then decode the XML data to get the information you want. All a bit slow, tedious and error prone.

So we wrote a tool to do that for you: Event Log Crash Browser.

Event Log Crash Browser scans your event log looking for crash events, then picks out only the information that is useful:

  • The executable.
  • The DLL that crashed (if it did crash in a DLL, rather than non-DLL memory).
  • The exception code.
  • The offset into the DLL of the crash location (or the location in memory for non-DLL crashes).
  • We also read the version information from the DLL so that we can identify the company responsible for the DLL that crashed.

You can sort on any column, and filter by exception type, executable and DLL.

It’s a really easy way to see what failures are happening on your machine. A lot more convenient than Windows Event Viewer. Looking at this machine I can see that the Visual Studio compiler, linker and IDE crash from time to time. I can also see that the WMI provider service dies quite often from a heap corruption – this is a core bit of Microsoft technology that has problems. Most of the other failures are related to the software under development and test on this machine.

Identifying crashes with the Windows Event Log

By , February 12, 2020 4:43 pm

It’s an unfortunate and inevitable fact that while developing software sometimes your software will crash. This also happens, sometimes, hopefully very infrequently, in production code. Each time this happens Windows stores some information about each crash in the Windows Event Log, along with a multitude of other event information it logs.

In this article I’m going to explain two event log entry types which encode crashes, and how to read them. Then I’ll also introduce some tools that take the drudgery out of converting this information into symbol, filename and line number.

The Windows Event Log

The Windows event log can be viewed using Microsoft’s Event Viewer. Just type “Event Viewer” in the start menu search box and press return. That should start it. Crash information is stored in the sub category “Application” under “Windows Logs”. The two event sources that describe crashes are Windows Error Reporting and Application Error.


The image above shows a Windows Error Reporting event has been selected. The human readable form is shown below in the General tab. Although I say human readable, it really is unintelligible gibberish. None of the fields are identified and you have nothing to work with. The details tab isn’t any better – the raw data is present in text or XML form. Here’s the XML for the crash shown above.

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Windows Error Reporting" /> 
    <EventID Qualifiers="0">1001</EventID> 
    <Level>4</Level> 
    <Task>0</Task> 
    <Keywords>0x80000000000000</Keywords> 
    <TimeCreated SystemTime="2020-02-12T10:09:34.000000000Z" /> 
    <EventRecordID>260507</EventRecordID> 
    <Channel>Application</Channel> 
    <Computer>hydra</Computer> 
    <Security /> 
  </System>
  <EventData>
    <Data>2023787729086567941</Data> 
    <Data>1</Data> 
    <Data>APPCRASH</Data> 
    <Data>Not available</Data> 
    <Data>0</Data> 
    <Data>testDeliberateCrash.exe</Data> 
    <Data>1.0.0.1</Data> 
    <Data>5e419525</Data> 
    <Data>testDeliberateCrash.exe</Data> 
    <Data>1.0.0.1</Data> 
    <Data>5e419525</Data> 
    <Data>c0000005</Data> 
    <Data>000017b2</Data> 
    <Data /> 
    <Data /> 
    <Data>C:\Users\stephen\AppData\Local\Temp\WERC24C.tmp.WERInternalMetadata.xml</Data> 
  <Data>C:\Users\stephen\AppData\Local\Microsoft\Windows\WER\ReportArchive\AppCrash_testDeliberateCr_c31b903842d94a84d4621dceaac377462674f7a_eb589596_139ec4bd</Data> 
    <Data /> 
    <Data>0</Data> 
    <Data>c3d360b2-4d7f-11ea-83d3-001e4fdb3956</Data> 
    <Data>0</Data> 
    <Data>54756af49aec84f97c15f03794ffd605</Data> 
  </EventData>
</Event>

There’s quite a bit of data in here, with the purpose of each field implied rather than stated. Towards the end is some information related to minidumps, but if you go searching for it, the minidump will no longer be present.
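
If you’d rather pull these values out in code than by eye, a simple scan is enough for event XML shaped like the example above. This is a sketch under assumptions: extractDataValues is a hypothetical helper, and it is a plain string search rather than a real XML parser (no entity decoding, no nested markup).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collects the text of every <Data> element, in document order.
// Empty elements written as <Data /> or <Data/> yield an empty string so
// that the positions of later fields are preserved.
std::vector<std::string> extractDataValues(const std::string &xml)
{
	std::vector<std::string> values;
	const std::string open = "<Data";
	const std::string close = "</Data>";
	std::string::size_type pos = 0;

	while ((pos = xml.find(open, pos)) != std::string::npos)
	{
		// Skip tags that merely start with "<Data", such as "<DataItem>".
		const std::string::size_type after = pos + open.size();
		const char next = (after < xml.size()) ? xml[after] : '\0';
		if (next != '>' && next != ' ' && next != '/')
		{
			pos = after;
			continue;
		}

		const std::string::size_type tagEnd = xml.find('>', pos);
		if (tagEnd == std::string::npos)
			break;

		if (xml[tagEnd - 1] == '/')
		{
			values.push_back("");	// self-closing, no content
			pos = tagEnd + 1;
			continue;
		}

		const std::string::size_type closePos = xml.find(close, tagEnd);
		if (closePos == std::string::npos)
			break;

		values.push_back(xml.substr(tagEnd + 1, closePos - tagEnd - 1));
		pos = closePos + close.size();
	}

	return values;
}
```

Keeping the empty entries matters because the fields are identified by position, not by name.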

The format for Application Error crashes is different.

Windows Error Reporting

The event log data for a Windows Error Reporting event contains many fields that we don’t need if we’re just investigating a crash address. Each event starts with an <Event> tag and ends with an </Event> tag.

We need to correctly identify the event. Inside the event is a <System> tag which contains a tag with an attribute “Provider Name” set to “Windows Error Reporting”.
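
A sketch of that identification step, assuming the event XML looks like the example above (Name is the first attribute of the Provider tag and is double quoted). providerName and isCrashEvent are hypothetical helpers, and this is a string search, not an XML parser.

```cpp
#include <cassert>
#include <string>

// Returns the value of the Name attribute of the <Provider> element, or an
// empty string if it can't be found.
std::string providerName(const std::string &xml)
{
	const std::string marker = "<Provider Name=\"";
	const std::string::size_type start = xml.find(marker);
	if (start == std::string::npos)
		return "";

	const std::string::size_type valueStart = start + marker.size();
	const std::string::size_type valueEnd = xml.find('"', valueStart);
	if (valueEnd == std::string::npos)
		return "";

	return xml.substr(valueStart, valueEnd - valueStart);
}

// True for the two event sources that describe crashes.
bool isCrashEvent(const std::string &xml)
{
	const std::string name = providerName(xml);
	return name == "Windows Error Reporting" || name == "Application Error";
}
```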

Once the event is identified we need to find the <EventData> tag inside the event. The <EventData> contains a sequence of <Data> tags (22 in the example above), of which the first 14 matter here. These tags are present:


  • 1. Timestamp.

  • 2. Number of data items.

  • 3. Information Type.

  • 4. Information Status.

  • 5. Unknown.

  • 6. Crashing executable.

  • 7. Executable version.

  • 8. Executable timestamp.

  • 9. Crashing DLL. This will be the same as 6 if the crash is in the .exe.

  • 10. DLL version. This will be the same as 7 if the crash is in the .exe.

  • 11. DLL timestamp. This will be the same as 8 if the crash is in the .exe.

  • 12. Exception code.

  • 13. Fault offset.

  • 14. Class. This may or may not be present.

Information Type is normally “APPCRASH”. In this case we’re interested in tags 9, 12 and 13.

If Information Type is “BEX”, the data is different:


  • 1. Timestamp.

  • 2. Number of data items.

  • 3. Information Type.

  • 4. Information Status.

  • 5. Unknown.

  • 6. Crashing executable.

  • 7. Executable version.

  • 8. Executable timestamp.

  • 9. Crashing DLL. This will be the same as 6 if the crash is in the .exe.

  • 10. DLL version. This will be the same as 7 if the crash is in the .exe.

  • 11. DLL timestamp. This will be the same as 8 if the crash is in the .exe.

  • 12. Fault offset.

  • 13. Exception code.

  • 14. Class. This may or may not be present.

Note that the order of the fault offset and exception code has been reversed compared to APPCRASH.

Of these tags we’re interested in tags 9, 12 and 13.

If we want to version the crashing DLL we also need tags 10 and 11.
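
The APPCRASH/BEX difference is easy to get wrong, so here’s a sketch of the selection logic. It assumes you already have the <Data> values in document order in a vector, so tag N in the lists above is data[N - 1]. CrashLocation and werCrashLocation are names I’ve made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct CrashLocation
{
	std::string module;		// crashing DLL (or the exe itself)
	std::string exceptionCode;	// hex text, e.g. "c0000005"
	std::string faultOffset;	// hex text, offset into the module
};

// Picks the crash fields out of the <Data> values of a Windows Error
// Reporting event. Returns false if there are too few values or the
// Information Type is not one of the two described above.
bool werCrashLocation(const std::vector<std::string> &data, CrashLocation &out)
{
	if (data.size() < 13)
		return false;

	const std::string &type = data[2];	// tag 3: Information Type

	out.module = data[8];			// tag 9: crashing DLL

	if (type == "APPCRASH")
	{
		out.exceptionCode = data[11];	// tag 12
		out.faultOffset   = data[12];	// tag 13
		return true;
	}

	if (type == "BEX")
	{
		out.faultOffset   = data[11];	// tag 12
		out.exceptionCode = data[12];	// tag 13
		return true;
	}

	return false;
}
```

Fed the values from the example event earlier in this article, this gives testDeliberateCrash.exe, exception code c0000005 and fault offset 000017b2.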

Application Error

The event log data for an Application Error event contains many fields that we don’t need if we’re just investigating a crash address. Each event starts with an <Event> tag and ends with an </Event> tag.

We need to correctly identify the event. Inside the event is a <System> tag which contains a tag with an attribute “Provider Name” set to “Application Error”.

Once the event is identified we need to find the <EventData> tag inside the event. The <EventData> contains at least 12 <Data> tags, some of which may not be present, or which may be empty. These tags are present:


  • 1. Crashing executable.

  • 2. Executable version.

  • 3. Executable timestamp.

  • 4. Crashing DLL. This will be the same as 1 if the crash is in the .exe.

  • 5. DLL version. This will be the same as 2 if the crash is in the .exe.

  • 6. DLL timestamp. This will be the same as 3 if the crash is in the .exe.

  • 7. Exception code.

  • 8. Fault offset.

  • 9. Process id.

  • 10. Application start timestamp.

  • 11. Application path.

  • 12. Module path.

Of these tags we’re interested in tags 7, 8 and 12.

If we want to version the crashing DLL we also need tags 5 and 6. If 12 isn’t available, use 4.
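
The same idea for Application Error events, including the fallback from tag 12 to tag 4. Again this is a sketch that assumes the <Data> values are already in a vector in document order. AppErrorCrash and appErrorCrash are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct AppErrorCrash
{
	std::string module;		// module path if available, else module name
	std::string exceptionCode;	// hex text
	std::string faultOffset;	// hex text, offset into the module
};

// Picks the crash fields out of the <Data> values of an Application Error
// event. Tag N in the list above is data[N - 1].
// Returns false if the exception code and fault offset are not present.
bool appErrorCrash(const std::vector<std::string> &data, AppErrorCrash &out)
{
	if (data.size() < 8)
		return false;

	out.exceptionCode = data[6];	// tag 7
	out.faultOffset   = data[7];	// tag 8

	// Prefer tag 12 (module path). If it is missing or empty, fall back to
	// tag 4 (crashing DLL).
	if (data.size() >= 12 && !data[11].empty())
		out.module = data[11];
	else
		out.module = data[3];

	return true;
}
```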

Removing the drudgery

The previous two sections have described which fields to extract data from. If you’re doing this manually this is tedious and error prone. You have to select the correct values from the correct fields and then use another application to turn them into a symbol, filename and line number. Our tools DbgHelpBrowser and MapFileBrowser are designed to take a crash offset inside a DLL and turn it into a human readable symbol, filename and line number. But that still requires you to do the hard work of fishing the correct data out of the XML dump.

Now there is a better way: we’ve added an extra option to these tools that allows you to paste the entire XML data from a crash event, and the tool then extracts the data it needs to show you the symbol, filename and line number.

DbgHelpBrowser

Load the crashing exe (or DLL) into DbgHelpBrowser. This will cause the symbols to be loaded for the DLL (assuming symbols have been created and can be found). We’re not covering versioning the DLL as most likely you will have your own methods for this.

Choose the option Find Symbol from Event Viewer XML crash log… on the Query menu. The Event Viewer Crash Data dialog is displayed.


Paste the XML data into the dialog and click OK.


The main display will select the appropriate symbol in the main grid and display the relevant symbol, filename, line number and source code in the source code viewer below.


MapFileBrowser

Load the MAP file for the crashing exe (or DLL) into MapFileBrowser. We’re not covering versioning the DLL as most likely you will have your own methods for this.

Choose the option Find Symbol from Event Viewer XML crash log… on the Query menu. The Event Viewer Crash Data dialog is displayed.


Paste the XML data into the dialog and click OK.


The main display will select the appropriate symbol in the main grid and display the relevant symbol, filename, line number and source code in the source code viewer below.


Conclusion

Windows Event Logs can be hard to read and error prone to use. However when paired with suitable tools you can quickly and easily turn event log crashes into useful symbol, filename and line number information to inform your debugging efforts.

Monitoring a service with the NT Service API

By , February 11, 2020 5:21 pm

Debugging services is a pain. There is a lot that can go wrong and very little you can do to find out what went wrong. Perfect! Just what you need for an easy day at work. Services run in a restricted environment, these days you also need to be Administrator to do anything with them, and getting your favourite software tool which isn’t a debugger working with them is hard. I remember years ago seeing the list of things you needed to do to get NuMega’s BoundsChecker to work with services. It was a couple of web pages of instructions, each line containing a detailed step. You had to do all of the actions correctly in order to set things up to work with services.

These days Microsoft have changed the security landscape and it’s no longer possible to launch your data monitoring software tool from a service as that ability is correctly regarded as a security vulnerability. It’s also pretty much impossible to inject into a service from a GUI application. As a result the correct way to work with services is to add a few lines of glue code, in the form of calls to an API that sets up communications with an already running user interface.

We’ve described our updated NT Service API in a previous article, so in this article I’m going to talk about using the API to track errors in the service code calling the API, and also describe how you use the user interface to work with services. This article will focus on C++ Memory Validator, but the techniques described here will also work for C++ Coverage Validator, C++ Performance Validator and C++ Thread Validator. If you’re using a .Net service, or a mixed mode service with a .Net entry point, you don’t need to use the API, but the GUI parts of this article will still apply to you. If you’re using a native service or a mixed mode service with a native entry point all of this article applies to you.

Monitoring a Service

Before we get into the error codes and error handling in the GUI, let’s first take a tour of how things should work if everything goes to plan. This will provide some context for the errors I’m going to describe later. I’m going to assume you’ve built both the example service and the example service client, and that you’ve installed the service (serviceMV.exe -install in an Administrator mode command prompt). The service client passes a string to the service, which reverses it and passes it back to the client. The service also deliberately leaks some memory for testing purposes.

Here’s a video of the process.

From the Launch menu, choose Monitor a Service.


The Monitor a Service dialog is displayed.


Enter the full path to your service and click OK to start monitoring. The Validator will now set up some environment variables and some data in the registry that will be used by the service API. After a few seconds the Start your Service dialog appears.


Click OK, then start your service (you’ll need to do this from an Administrator command prompt).

serviceMV -start

The Validator attaches to the service and after a few moments various status information in the Validator title bar and the Validator status bar is updated.

You may see an informational dialog about debug information. You can dismiss this (it can be viewed later from the Validator Tools menu). To change how symbols are found you’ll need to look at the Symbol Server and File Locations parts of the Validator settings dialog.


Next a dialog is displayed informing you that Administrator Privileges may be required.


For some services you may find that the Validator gets better data, or sends data to the GUI faster if the Validator is run in Administrator mode. If that is the case you’ll need to restart the Validator with Administrator privileges (and also stop and restart the service, etc).

For this particular example service, we don’t need Administrator privileges so we’ll continue without them.

Now we can interact with the service from the service client by sending a string to the service. The service reverses it and sends it back.

serviceClient "Hello World"


Once we’re done working with the service we can stop it (you’ll need to do this from an Administrator command prompt).

serviceMV -stop

The Validator disconnects from the service and displays all the data it has collected from the service.


That’s how it looks when everything goes according to plan.

What happens when things go wrong? That’s what the next section is about.

Tracking errors in the service

The various API functions return a SVL_SERVICE_ERROR error code. We’ve extended this code so that you can detect when the user has forgotten to do something prior to starting the service, or you can detect if various other error conditions have occurred. Some of these error codes are internal error codes and should never be seen by a customer, but we’re documenting them here for completeness.


  • SVL_FAIL_PATHS_DO_NOT_MATCH. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_INCORRECT_PRODUCT_PREFIX. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_X86_VALIDATOR_FOUND_EXPECTED_X64_VALIDATOR. Looks like you’re monitoring a 64 bit service with a 32 bit Validator. You need to use a 64 bit Validator.

  • SVL_FAIL_X64_VALIDATOR_FOUND_EXPECTED_X86_VALIDATOR. Looks like you’re monitoring a 32 bit service with a 64 bit Validator with the svl*VStubService.lib library. You need to use a 64 bit Validator with the svl*VStubService6432.lib.

  • SVL_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_ENV_VAR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ENV_VAR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ID_NOT_SPECIFIED. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_ID_NOT_A_PROCESS. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

  • SVL_FAIL_VALIDATOR_NOT_FOUND. Internal error. Looks like you forgot to Monitor a Service from the Launch Menu before starting the service.

To aid in debugging we strongly recommend that you log all error codes (success or failure) from Software Verify API calls. This will allow you to track down errors rapidly rather than through a series of trial and error coding mistakes, or a back and forth with support in which support has no information to work with. We added all of the above error codes after 3 customers all reported similar, but different problems with using the service API. All of their problems would have been solved if these error codes had been available.

Error codes can be logged with this call.

void writeToLogFile(const wchar_t     *fileName,
                    SVL_SERVICE_ERROR errCode);

Helpful messages can be logged with this call.

void writeToLogFile(const wchar_t *fileName,
                    const wchar_t *text);

Error codes can be turned into human readable messages with this call.

const wchar_t *getTextForErrorCode(SVL_SERVICE_ERROR errorCode);

And if you need to log Windows error codes, use this call.

void writeToLogFileLastError(const wchar_t *fileName,
                             DWORD         errCode);

See the help documentation for all the available API calls.
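As a rough illustration of the "log everything" advice above, here is a minimal sketch of a wrapper that logs the outcome of each call and then returns it. It does not use the real library: StandInServiceError, standInTextForErrorCode() and standInWriteToLogFile() are hypothetical stand-ins for SVL_SERVICE_ERROR, getTextForErrorCode() and writeToLogFile(), and the success value, numeric values and message wording are all assumptions made so the example can be compiled on its own.

```cpp
#include <string>
#include <vector>

// Stand-in for SVL_SERVICE_ERROR. The two failure names mirror codes listed
// above; STANDIN_OK and all numeric values are assumptions for this sketch.
enum StandInServiceError
{
    STANDIN_OK,
    STANDIN_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR,
    STANDIN_FAIL_X86_VALIDATOR_FOUND_EXPECTED_X64_VALIDATOR
};

// Stand-in for getTextForErrorCode(). The wording is paraphrased from the
// list above, it is not the library's own text.
const wchar_t *standInTextForErrorCode(StandInServiceError errCode)
{
    switch (errCode)
    {
    case STANDIN_OK:
        return L"OK";
    case STANDIN_FAIL_DID_YOU_MONITOR_A_SERVICE_FROM_VALIDATOR:
        return L"forgot to Monitor a Service before starting the service";
    case STANDIN_FAIL_X86_VALIDATOR_FOUND_EXPECTED_X64_VALIDATOR:
        return L"64 bit service monitored with a 32 bit Validator";
    }
    return L"unknown error code";
}

// Stand-in for writeToLogFile(fileName, text). Lines are kept in memory so
// the sketch runs anywhere; fileName is ignored here.
std::vector<std::wstring> g_standInLog;

void standInWriteToLogFile(const wchar_t *fileName,
                           const wchar_t *text)
{
    (void)fileName;
    g_standInLog.push_back(text);
}

// The pattern itself: log the outcome of every API call, success or failure,
// tagged with the name of the call, then hand the code back to the caller.
StandInServiceError logResult(const wchar_t       *fileName,
                              const wchar_t       *callName,
                              StandInServiceError errCode)
{
    std::wstring line(callName);

    line += L": ";
    line += standInTextForErrorCode(errCode);
    standInWriteToLogFile(fileName, line.c_str());

    return errCode;
}
```

In a real service you would swap the stand-ins for the calls shown above and pass the result of each Software Verify API call through something like logResult(), so that the log shows which call failed and why.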

Tracking errors in the GUI

There are a couple of mistakes that can be made in the user interface. These are related to monitoring the wrong type of service, and the location of the service. Where it is possible to identify this error in the GUI, we will do so. Where it is not, the error codes described above will help you understand the mistake that has been made.

64 bit Service, 32 bit GUI

If you try to monitor a 64 bit service with a 32 bit GUI that will fail. We can detect this and prevent this. When this error happens you will be shown an error dialog similar to this.


Note that monitoring a 32 bit service with a 64 bit GUI is OK, but you need to use the svl*VStubService6432.lib not the svl*VStubService.lib. We can’t detect this from the GUI, which is why the SVL_FAIL_X64_VALIDATOR_FOUND_EXPECTED_X86_VALIDATOR error code exists – you will get this if you are linked to svl*VStubService.lib when you should be linked to svl*VStubService6432.lib.
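If you want to check the bitness of a service yourself before choosing a Validator and stub library, one common approach is to read the Machine field from the executable’s PE header. The sketch below works on the bytes of the file already loaded into memory. This is an illustration only, not a description of how the Validator performs its check; bitnessFromPeImage() and the Bitness enum are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Bitness { Unknown, X86, X64 };

// Read a little-endian 16 bit value, returning false if it is out of range.
static bool readU16(const std::vector<std::uint8_t> &d, std::size_t off, std::uint16_t &out)
{
    if (off > d.size() || d.size() - off < 2)
        return false;

    out = static_cast<std::uint16_t>(d[off] | (d[off + 1] << 8));
    return true;
}

// Read a little-endian 32 bit value, returning false if it is out of range.
static bool readU32(const std::vector<std::uint8_t> &d, std::size_t off, std::uint32_t &out)
{
    if (off > d.size() || d.size() - off < 4)
        return false;

    out = static_cast<std::uint32_t>(d[off])
        | (static_cast<std::uint32_t>(d[off + 1]) << 8)
        | (static_cast<std::uint32_t>(d[off + 2]) << 16)
        | (static_cast<std::uint32_t>(d[off + 3]) << 24);
    return true;
}

// Classify a PE image as 32 bit x86 or 64 bit x64 from its header.
Bitness bitnessFromPeImage(const std::vector<std::uint8_t> &image)
{
    std::uint16_t mz = 0;
    if (!readU16(image, 0, mz) || mz != 0x5A4D)                // "MZ" DOS signature
        return Bitness::Unknown;

    std::uint32_t peOffset = 0;
    if (!readU32(image, 0x3C, peOffset))                       // e_lfanew: offset of PE header
        return Bitness::Unknown;

    std::uint32_t signature = 0;
    if (!readU32(image, peOffset, signature) || signature != 0x00004550)   // "PE\0\0"
        return Bitness::Unknown;

    std::uint16_t machine = 0;
    if (!readU16(image, static_cast<std::size_t>(peOffset) + 4, machine))  // first field after signature
        return Bitness::Unknown;

    if (machine == 0x014C)                                     // IMAGE_FILE_MACHINE_I386
        return Bitness::X86;
    if (machine == 0x8664)                                     // IMAGE_FILE_MACHINE_AMD64
        return Bitness::X64;

    return Bitness::Unknown;                                   // other architectures not handled here
}
```

Only the first few hundred bytes of the file are needed, so reading the start of the service executable into a vector is enough to feed this function.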

Service on a network share

Windows won’t let you start a service on a network share. And yet I’ve lost count of the number of times I’ve tried to do this. This is typically because I have the solution working on machine X (where I wrote it) and wish to test on machine Y, and I just use a network share to map it across. This works for applications and fails for services. This can be a real time waster and Windows isn’t exactly helpful about this, and of course it’s in a service’s startup code, so fun debugging that.

To make this failure easier to detect we check the path of the service you specify in the Monitor a Service dialog and determine if the service is on a network share. If it is we tell you we can’t work with it. This then alerts you to the fact you’ll need to copy that service locally to run tests on it. Probably an hour or two of your time saved, right there.
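If you want a similar guard in your own install or test scripts, here is a minimal sketch of one way to check a path. It is an assumption about how such a check could be written, not the Validator’s implementation; isUncPath() and isOnNetworkShare() are names made up for this example. UNC paths are recognised from the text of the path alone, and on Windows a mapped drive letter is checked with GetDriveTypeW().

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

static bool startsWith(const std::wstring &s, const std::wstring &prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

// True for UNC paths such as \\server\share\serviceMV.exe
bool isUncPath(const std::wstring &path)
{
    // Long path form of a UNC path: \\?\UNC\server\share\...
    if (startsWith(path, L"\\\\?\\UNC\\"))
        return true;

    // \\?\C:\... and \\.\device paths start with two backslashes but are local
    if (startsWith(path, L"\\\\?\\") || startsWith(path, L"\\\\.\\"))
        return false;

    return startsWith(path, L"\\\\");
}

// True if the path looks like it lives on a network share.
bool isOnNetworkShare(const std::wstring &path)
{
    if (isUncPath(path))
        return true;

#ifdef _WIN32
    // A drive letter may be mapped to a share, so ask Windows about the drive root.
    if (path.size() >= 2 && path[1] == L':')
    {
        const std::wstring root = path.substr(0, 2) + L"\\";

        return GetDriveTypeW(root.c_str()) == DRIVE_REMOTE;
    }
#endif

    return false;
}
```

On non-Windows builds only the UNC test is compiled, which keeps the sketch portable but means mapped drives are not detected there.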


Conclusion

Working with services can be fraught with problems, but if you log your error codes you can easily and quickly identify any errors made configuring your use of the NT Service API that we were unable to catch with the Validator user interface.

Exception Tracer

By , August 24, 2019 8:43 am

We’ve just released another of our in-house tools – Exception Tracer.

Exception Tracer started off life as an experiment and then, through repeatedly needing to capture debugging data for various customer problems, morphed into the exception tracing tool that it is today. Exception Tracer logs debugging events that are sent to debuggers. Most debuggers respond to these events in an interactive manner, breaking the code on exceptions (such as access violations and breakpoints), stepping into and out of functions, inspecting variables. Exception Tracer doesn’t do any of those things. Exception Tracer simply logs every event and stores callstacks associated with each event. You can save the entire trace and inspect it later (or on a different machine if you wish).

Exception Tracer is great for understanding what exceptions are thrown by applications that throw a lot of exceptions, whether that is by design, or because something is going wrong and the exception handling mechanism is being triggered a lot.

We’ve provided filtering so that you only collect the events you’re interested in – perhaps all you’re interested in is what DLLs load and unload and the order they load in, or maybe you only care about a custom exception that your program throws.

We’ve also provided the ability to create minidumps when exceptions are thrown – minidump for any exception, or just the exceptions you care about.

Lastly, we’ve also provided automatic single stepping support (if you want it, turned off by default) with some intelligent options to reduce the amount of redundant single stepping events that are collected. Because you can turn single stepping on and off during a trace you can run at full speed to where the problem area is, turn on single stepping and collect just the area you need in detail.

We used single stepping to great effect to understand the cause of a stack overflow when one of our tools was shutting down on a customer machine. Turns out the culprit was an anti-virus product on the customer machine that was triggering an unexpected sequence of events that would never happen outside of the shutdown phase. We couldn’t get near this bug with a traditional debugger like Visual Studio, but Exception Tracer got us there (that’s where the intelligent filtering came in – the traces contained so many events we had to reduce the data size just to make it manageable when you were inspecting the results).


Select an item in the top window to view the event data and callstack in the lower windows.

Thread names are taken from the GetThreadDescription() API (Windows 10), thread naming exceptions, and manual naming of threads (context menu).

Specific threads can be highlighted so that you can pick out related events on the same thread.

We’d love to hear about problems you’ve solved using Exception Tracer. Please let us know.

Thread Wait Chain Inspector

By , August 22, 2019 10:53 am

Since Windows Vista the Windows operating system has included functionality to iterate across the waiting objects that form a chain between threads. I’m waiting for thread A, which is waiting for thread B, which is waiting for process Y. That sort of thing. Waits come in the form of EnterCriticalSection, WaitForSingleObject, WaitForMultipleObjects, etc. All documented in Microsoft’s Synchronization API. If you get these waits wrong, you can get deadlocks, or waits that wait forever. Either way, it’s game over for your program if that happens.
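To make the idea of a wait chain concrete, here is a small portable sketch that follows a chain through a hypothetical "who is waiting for whom" map and reports whether the chain loops back on itself. It does not call the Windows API; the WaitChain struct, followWaitChain() and the string names for threads and processes are all invented for this illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

struct WaitChain
{
    std::vector<std::string> nodes;              // the chain in order, starting with the thread asked about
    bool                     deadlocked = false; // true if the chain runs into a cycle
};

// waitsFor maps each waiter to the single thing it is blocked on.
// Anything that does not appear as a key is not waiting, so the chain ends there.
WaitChain followWaitChain(const std::map<std::string, std::string> &waitsFor,
                          const std::string                        &start)
{
    WaitChain             chain;
    std::set<std::string> seen;
    std::string           current = start;

    for (;;)
    {
        // Seeing the same node twice means the waits form a loop: nobody can ever proceed.
        if (!seen.insert(current).second)
        {
            chain.deadlocked = true;
            break;
        }

        chain.nodes.push_back(current);

        const auto it = waitsFor.find(current);
        if (it == waitsFor.end())
            break;      // end of the chain, the last node is free to run

        current = it->second;
    }

    return chain;
}
```

With the example from the paragraph above (me waiting for thread A, A waiting for thread B, B waiting for process Y) the chain has four nodes and is not deadlocked. If A waits for B and B waits for A, the chain is reported as deadlocked.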

The Wait Chain Traversal API was added with Windows Vista, but only made public recently. Prior to the API being made public, access to wait chain information was only via Resource Monitor, and more recently via Task Manager. A detailed article by a Microsoft field engineer, Faik Hakan Bilgen, documents the history of the Wait Chain user interfaces and then provides a console program (with source code on GitHub) to provide a wait chain dump to a text file. Unfortunately this isn’t very easy to use as it relies on decoding thread ids and process ids to understand what is happening. Also because it’s a text file, to get an update you need to run the tool again.

We decided to take inspiration from our Thread Status Monitor tool and create a version specifically for wait chains – Thread Wait Chain Inspector.


Select the process you are interested in in the upper window. Wait chains for each thread are shown in the window below. Select a thread and it and any related threads (in the same wait chain) will be highlighted in yellow. Deadlocked threads are shown in red. Process names are displayed and thread names are taken from the GetThreadDescription() API (Windows 10 only).

If you wish to debug a process you can create a minidump for any process in a wait chain. Right click on the process of interest in the wait chain and choose Create a Minidump….

Thread Wait Chain Inspector is a free tool, complementing our other threading tools, Thread Validator, Thread Lock Checker and Thread Status Monitor.

Decompression and blue days

By , July 11, 2019 1:53 pm

It’s not uncommon for the founders of startup businesses to experience problems with motivation and problems with productivity as their business grows. I’m going to write about two issues I’ve run into over the years. They recur. You can’t stop them recurring. So the best thing to do is to understand them and accept them. They’re what I call decompression and blue days.

Decompression

Decompression is the word I use to identify the following pattern. You complete a major software release to the public. Then you find yourself unable to commit to any “serious” work for a period of time. For me, it’s typically one day. For you it could be an afternoon, a day, a week, maybe more.

So what is happening with decompression? I think it’s the process of your mind unwinding all the many layers of logic, dependencies, commitments and anxiety of %^&(ing up the release (it does happen!). During this decompression period I’ve found I can work on things tangentially related to the business, but not directly related to the business. As such I can work on side projects, read technical books, non-technical books, go for a walk, play musical instruments, provide mentorship, whatever. I just can’t work on the software or on marketing for the software during this period.

I’ve also found that I can’t do anything about this. Decompression needs to happen. Once it’s done I can get back to work with no distractions about any of the issues related to that previous software release. If I try to force it, by trying to work during a decompression period I just end up doing nothing, but getting frustrated that I’m doing nothing. That isn’t healthy, so I’ve come to the conclusion that the best thing to do is to accept that this happens and work with it. Do something else that is good for your mental health during one of these periods.

If you do this for yourself, cut your team some slack too. They’re probably going through their own version of the same thing.

Blue days

Blue days are different. These don’t come after any specific event. They just appear at random. You could be having a troubling business time or you could be having a great time, building the product, or you’ve already built it and have revenue pouring in, but then one day you’re thinking “I’m wasting my time. Why are we doing this? This will never succeed. Should I stop and spend my energy on (shiny! shiny!).” Typically this is accompanied by a very bleak outlook on life. Often this can be triggered by slow sales (which might mean you get this at certain times of year).

This is like a mini-depression, a very, very short duration depression. Emotionally it’s horrible. But they go away. After you’ve had this happen to you repeatedly you realise this is just hidden emotions bubbling to the surface and needing to be released. Being aware of this then makes the next time more bearable. Depending on your disposition what you do on a blue day will vary. You may bury yourself in work, or may need to leave all that behind and head off up a hill. Do what’s best for you. Mental health first.

Conclusion

Decompression and blue days both affect productivity and motivation. You can’t do much about them. But you can learn to recognise them, accept them for what they are, and that they will pass, and take action to make them bearable while they happen.

Hopefully if you’re reading this you recognise these two states and are now thinking “Someone else experiences this too. It’s normal!” and that’s a relief 🙂

There is also a small chance you ended up here because you’re seeking out articles on depression. If that’s the case you may find this wonderful talk about depression, given by Greg Baugues at Business of Software, of some help.
