
Posts tagged: C

Delete memory 5 times faster

By , March 2, 2010 12:56 pm

Memory management in C and C++ is typically done using either the malloc/realloc/free C runtime functions or the C++ operators new and delete. The C++ operators usually call down to the underlying malloc/free implementation to do the actual memory allocations.

This is great, it’s useful, it works, BUT it puts all the allocations in the same heap. So when you come to deallocate, the heap manager has to take into account all the other objects and allocations that are also in the heap but unrelated to the data you are deallocating. That adds overhead to the heap manager, causes memory fragmentation and slows down heap management.

There is a different way you can handle this situation – you can use your own heap for a given group of allocations.

	HANDLE	hHeap;

	hHeap = HeapCreate(0, 0x00010000, 0); // 64K growable heap
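
Allocations for that group of data then go through HeapAlloc()/HeapFree() on that heap rather than malloc()/free(). A minimal sketch (myData is a hypothetical type standing in for your own):

	myData	*data;

	// allocate from the private heap rather than the CRT heap
	data = (myData *)HeapAlloc(hHeap, 0, sizeof(myData));

	// ... use the data ...

	// and free it back to the same private heap
	HeapFree(hHeap, 0, data);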

The downside to this is that you have to remember to use the correct allocators and deallocators for these objects and not to use malloc/free etc. You can mitigate this in C++ by overriding new/delete to use your own heap.

void *myClass::operator new(size_t size)
{
	// allocate from the private heap created earlier by HeapCreate()
	return HeapAlloc(hHeap, 0, size);
}

void myClass::operator delete(void *ptr)
{
	// return the memory to the same private heap
	HeapFree(hHeap, 0, ptr);
}

The upside to this technique is that you can release all your allocations in one call by destroying the heap, rather than deleting each individual allocation. This is also about 5 times faster.

Old style:

	// ptrs[] holds all the allocations made from hHeap
	for(i = 0; i < count; i++)
	{
		HeapFree(hHeap, 0, ptrs[i]);
	}

New style:

	HeapDestroy(hHeap);
	hHeap = NULL;

There is another benefit to this technique: By deleting the heap and then re-creating a heap for new allocations you remove all fragmentation from the heap and start the new heap with 0% fragmentation.
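
A minimal sketch of the reset, reusing the creation parameters from the earlier example (the 64K growable heap is an assumption carried over from above):

	// release every allocation in the private heap in a single call
	HeapDestroy(hHeap);

	// start again with a fresh heap and 0% fragmentation
	hHeap = HeapCreate(0, 0x00010000, 0); // 64K growable heap, as before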

HeapDestroy timing demonstration

Download the source of the demonstration application. Project and solution files for Visual Studio 6.0 and Visual Studio 2008 are provided.

We use this technique for some of our tools where we want a high performance heap and zero fragmentation.

You will also want to ensure that whatever software tool you use to monitor memory allocations marks all entries in a destroyed heap as deallocated.


Speed up your MFC program without a profiler!

By , February 13, 2010 3:35 pm

I’m going to explain how to potentially improve the performance of MFC programs using CMap<> and related containers (CMapStringToPtr, etc) by a substantial amount.

We use MFC for quite a lot of our software. We also use STL and also some custom containers. It is easy to fall into certain programming habits of using a particular container for a particular task. Sometimes the result is not what you expected!

During the Christmas break I decided to write some software to analyse the Software Verification website log files. Initially I wanted to do keyword mining, but it soon blossomed into a stats package calculating different statistics for keywords, referrers, domains, URLs, favourite domains, bounce rates, evaluation download rates, etc.

I was also reading the wonderful "Web Analytics: An Hour a Day" book by Avinash Kaushik. And to top it all, I was ill. I’ve since been told by various people that I had Swine Flu.

The initial version didn’t calculate much and was quite fast, but when I wanted to scale up to calculating monthly data as well as yearly, that was 13 times as much calculation, and the inefficiencies in container choice started to show. I’d been lazy; this was a hobby project, nothing serious, so I’d chosen the MFC class CMap.

I wanted to map keyword strings to countData objects which would hold all the data on that keyword. Later we could discard the map and just use an array of keyword objects for random access and sorting.  I wanted data for unique keywords, so during processing of the data it seemed natural to want to map the keyword to the keyword data. I was doing similar things for referrers, referring domains, requested URLs, etc.
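
For reference, here is a minimal sketch of the kind of countData class the code below assumes. This is hypothetical; the real class holds considerably more per-keyword statistics:

class countData
{
public:
	countData(const CString &key, int count)
		: m_key(key),
		  m_count(count)
	{
	}

	// add to the number of times this item has been seen
	void addCount(int n)
	{
		m_count += n;
	}

private:
	CString	m_key;		// the keyword (or referrer, domain, URL, ...)
	int	m_count;	// number of times it has been seen
};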

A CMap<> is an MFC hash table. A declaration would look like this:

CMap<CString, LPCTSTR, countData *, countData *&> mfcData;

The data processing would be something like this (simplified):

	BOOL	b;

	while(TRUE)
	{
		CString	s;
			
		b = file.ReadString(s);
		if (!b)
			break;

		// lookup string

		countData	*cd = NULL;

		if (mfcData.Lookup(s, cd))
		{
			// if we know about the string update the count

			cd->addCount(1);
		}
		else
		{
			// otherwise create the string and update the count

			cd = new countData(s, 1);
			mfcData.SetAt(s, cd);
		}
	}

The problem with the CMap<> class is that its hash key calculation isn’t very sophisticated, so you get collisions reasonably quickly, which forces CMap to build linked lists of the colliding entries, and walking those linked lists on subsequent searches is linear time. Lookups get slower as you add more data.
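
If you want to stay with CMap<>, one mitigation is to size the hash table up front: the default bucket count is small (17 in the MFC sources we have seen) and CMap never rehashes into a bigger table, so the collision chains grow quickly. A sketch, with the bucket count a prime chosen to suit the expected number of unique keys:

	CMap<CString, LPCTSTR, countData *, countData *&> mfcData;

	// must be called while the map is still empty; pick a prime
	// comfortably larger than the number of unique keys you expect
	mfcData.InitHashTable(999983);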

After processing one particularly large set of data (for 2009) took 36 hours, I thought a better solution was required. Why did I let it run so long? Partly because once it got past a few hours I was curious to see how long it would take. By this time I was ill, so it didn’t really matter; I was spending much of this time in bed, alternately too hot or too cold 🙁

At the time, the STL’s hash table was non-standard, so we didn’t want to use that; instead we opted for the STL <map> class. The declaration would look like this:

typedef map<CString, countData *> MAP_STL;

MAP_STL stlData;

The data processing would be something like this (simplified):

	BOOL	b;

	while(TRUE)
	{
		if (stopScan)
			break;

		CString	s;
			
		b = file.ReadString(s);
		if (!b)
			break;

		// lookup string

		MAP_STL::iterator	iter;

		iter = stlData.find(s);
		if (iter != stlData.end())
		{
			// if we know about the string update the count

			iter->second->addCount(1);
		}
		else
		{
			// otherwise create the string and update the count

			countData	*cd;

			cd = new countData(s, 1);
			stlData[s] = cd;
		}
	}

This is not a hash table; it’s a balanced binary tree (typically a red-black tree), so it will use more memory than the hash map, and looking up unique entries should be slower than the hash table (O(log n) rather than O(1)), but entries that would have collided in the hash table should be faster. With this new style of map substituted all over the program, for all the many unique datatypes being calculated, the processing time dropped from 36 hours to 1 hour.

In other words, the STL version processed the data in about 2.8% of the time the MFC version took. That is a 97% improvement in processing time.

I am not saying that you will get this improvement. This improvement was achieved partly because of the type of statistics being calculated and partly because the input data caused so many collisions. The dataset for the above test was 17.1GB of Apache log files.

I have created an example MFC program that shows a trivial example of reading lines from a file, testing them for uniqueness, storing them and updating their counts. The program also does the same using STL. Some sample times are shown below:

Size (MB)   MFC        STL        Improvement
34          00:00:12   00:00:06   50%
2048        00:22:46   00:05:21   76%

Time comparison of reading a 34MB file with MFC and STL
For a sample 34MB input file, the MFC program processes the data in just over 12 seconds and the STL program processes the same data in just over 6 seconds. An approximately 50% improvement.

Time comparison of reading a 2GB file with MFC and STL
For a sample 2GB input file, the MFC program processes the data in 22 minutes 46 seconds and the STL program processes the same data in 5 minutes 21 seconds. An approximately 76% improvement.

The larger the dataset, or the more CMap<> instances you use in your calculations, the more likely it is that changing them to STL <map> will be highly effective in speeding up that part of your application. Note that if you are only loading a few tens of items into the map, it will not be worth making the change.

You can download the executable and the source code with project files so that you can test this yourself.

I don’t recommend changing all occurrences of CMap<> for STL <map>, but changing the right ones should make some easy and worthwhile gains for you.


Support for MinGW and QtCreator

By , December 4, 2009 5:01 pm

Everyone uses Visual Studio to write C and C++ software, don’t they? Yes! you all chorus. Apart from some guys at the back who like to use gcc and g++. They use MinGW when working on Windows. And they may even use Emacs or, perish the thought, vi!

Up until now we haven’t been able to cater to the needs of gcc and g++ users. We’d get email every month asking when we were going to support MinGW or if we supported QtCreator. It was frustrating admitting we couldn’t support that environment. Even more so as the founders of Software Verification wrote large GIS applications using gcc and g++ back in the early 1990s.

During October we integrated support for MinGW and QtCreator into Coverage Validator, Memory Validator, Performance Validator and Thread Validator. Both COFF and STABS debug formats are supported, which provides some flexibility in how you choose to handle your symbols.

We’ll continue to add support for additional compilers to our tools as long as there is interest from you, the kind people that use our software tools.


Got a crash with a strange pointer value?

By , December 4, 2009 5:00 pm

Ever had a crash with a pointer value of 0xcdcdcdcd or 0xdddddddd or 0xfeeefeee?

You have? Sooner or later, when you work with C or C++, you’ll bump into one of these crashes. You will no doubt try to reproduce the crash, and that’s when you’ll probably notice this unusual value has repeated itself. Perhaps the value is significant; perhaps it is telling you something?

I’ve just written an article about the unusual bit patterns you can find in variables on the stack, in the various Win32 heaps, or in the CRT heap. The article closes with some suggestions on how to prevent these errors from occurring.
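
These bit patterns are fill values written by the debug heaps, which is why they repeat so reliably. A minimal sketch that reproduces two of them under the Microsoft debug CRT (debug builds only; reading freed memory is undefined behaviour and is done here purely to show the fill patterns):

#include <stdio.h>
#include <stdlib.h>

struct node
{
	struct node	*next;
};

int main(void)
{
	struct node	*n;

	// the debug CRT fills freshly allocated memory with 0xCD
	n = (struct node *)malloc(sizeof(struct node));
	printf("uninitialised pointer: %p\n", n->next);	// typically 0xcdcdcdcd

	// on free, the debug CRT refills the block with 0xDD
	free(n);
	printf("freed pointer: %p\n", n->next);		// typically 0xdddddddd

	return 0;
}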

A few minutes spent reading about the bit patterns may help you see these bit patterns less frequently and dig you out of any 0xfeeefeee hole you may have fallen into.


New .Net software tools

By , January 22, 2007 5:14 pm

We’ve spent the last few years creating our software tools for C++, Java and all the funky scripting languages that are now getting the recognition they deserve (Python, Ruby, JavaScript, etc).

During all this time we’ve been asked if we have .Net versions of our tools, nearly always a question related to C#. We had to answer “No, but we will have .Net versions at some time in the future.” That time has come. We now have .Net versions of Memory Validator and Performance Validator available as beta products.

Users of our popular Memory Validator software tool (for C++, Delphi…) will notice that the UI for .Net is quite different. This is because detecting memory leaks in garbage collected environments requires different approaches to collecting and analysing data. We have some innovative ideas in .Net Memory Validator, including the Allocations view, which provides a breakdown of objects allocated per function name; the Objects view, which provides a breakdown of objects allocated per object type; and the Generations view, which provides an easy-to-read display of how many objects are allocated per generation, per object type. You can easily spot the trend graph of object usage and determine which objects are climbing or falling. A reference view allows you to view the object heap as a graph. The hotspots, memory, analysis, virtual and diagnostic tabs will be familiar to users of the original Memory Validator for C++.

.Net Memory Validator has the same user interface features you will find on our other Memory Validator products for other garbage collected languages (Java, JavaScript, Python, Ruby, etc), making switching languages a doddle. As with all our products, the .Net version, although different under the hood, has the same team behind it. You are probably familiar with the case of a company creating a software tool for language X and then porting it to support language Y, but the language Y version is lacking because it was a ground-up rewrite, often by people unfamiliar with the language X version. This leads to incompatibilities, different UI behaviour and new bugs. That isn’t how we create our new language versions. Each version has its own code base, which allows for the radical under-the-hood changes needed to accommodate each language, but it has the same team and keeps the same user interface elements where applicable. So even if the code under the hood changes, you get the same experience regardless of language.

This also allows our bug fixing to be improved. Many bugs from one version of a product apply to our other language versions. Thus if we find and fix a bug in say Java Memory Validator that bug fix can often be applied to JavaScript Memory Validator, Python Memory Validator, Ruby Memory Validator and .Net Memory Validator, etc. And our customers usually get the bug fix the very next day. This development method has been carried into the .Net line of software tools.

.Net Performance Validator continues this trend of the same user interface. If you know how to use any of our Performance Validator products (including the C++ version) you will know how to use .Net Performance Validator. It’s that easy.

The callstack view provides real-time insight into where a particular thread is running. Raw Statistics lets you inspect the raw data collected about performance, and Statistics lets you inspect this data in a more orderly fashion. Relations provides the same information but allows you to view which function was called from which function. Call Tree provides a call tree which you can expand and contract to view the performance data. Call Graph provides this information as a graph with each function listed as infrequently as possible. Call Graph is a very useful way to find an expensive function: right-click it, choose Goto Call Tree Node, and the first node in the call tree that relates to the same node in the Call Graph expands, with the node highlighted and source code displayed if available. Analysis allows complex queries on the data, and the diagnostic tab provides information about the instrumentation process.

Cheers for now, we have more .Net tools to work on.

Linux? MacOS X? Not right now, but some time in the future. Where have you read that before? 🙂
