The future of development environments is bubble shaped

By , March 17, 2010 1:01 pm

Researchers at Brown University have created a new way of interacting with source code and debuggers with their innovative Code Bubble based IDE.

The link includes a video (8 minutes, worth watching the whole thing) and a written description of what they hope to acheive. If you are really keen, you can participate in the beta.

Its an interesting look into the future. I think we’ll see some of these ideas sooner than Visual Studio 2035 though 🙂

What will Visual Studio 2035 be able to do?

By , March 17, 2010 12:53 pm

Will Visual Studio 2035 exist? Will we design and write software the same way in the future?

Read Bruce Eckel’s thought provoking article about “Programming in the mid future”.

What is the difference between a page and a paragraph?

By , March 8, 2010 11:40 am

In the real world we all know that pages contain paragraphs and that paragraphs are full of sentences created from words.

In the world of Microsoft Windows Operating Systems it is somewhat different – paragraphs contain pages!

In this article I’m to going to explain what a page is and what a paragraph is and how they relate to each other and why this information can be useful helping to identify and resolve certain memory related bugs.

Virtual Memory Pages
A virtual memory page is the smallest unit of memory that can be mapped by the CPU. In the case of 32 bit x86 processors such as the Intel Pentium and AMD Athlon, a page is 4Kb. When you make a call to VirtualProtect() or VirtualQuery() you will be setting or querying the memory protection for sizes that are multiples of a page.

The size of a page may vary from CPU type to CPU type. For example a 64 bit x86 CPU will have a page size of 8Kb.

You can determine the size of a page by calling GetSystemInfo() and reading the SYSTEM_INFO.dwPageSize value.

Virtual Memory Paragraphs
A virtual memory paragraph is the minimum amount of memory that can be commited/reserved using the VirtualAlloc() call. On 32 bit x86 CPUs this value is 64Kb (0x00010000). If you have ever used the debugger and looked at the load addresses of DLLs in the Modules list you may have noticed that DLLs always load on 64Kb boundaries. This is the reason – the area a DLL is loaded into is initialised by a call to VirtualAlloc to reserve the memory prior to the DLL being loaded.

Loaded Modules

You can determine the size of a paragraph by calling GetSystemInfo() and reading the SYSTEM_INFO.dwAllocationGranularity value.

Given these values, you can see that (on 32 bit 86 systems) a virtual memory paragraph is composed of 16 virtual memory pages.

How can I use this information?
If you are using VirtualAlloc() it is important to know the granularity at which the allocations will be returned. This is the size of a paragraph. This information is fundamental in deciding how you would implement a custom heap. You know there are fixed boundaries at which your data can exist. You can enumerate the list of possible paragraph locations very quickly (there are 32,768 possible locations in a 2GB space, as opposed to 2 billion locations if the paragraph could start anywhere).

Custom heaps
If you are writing a custom heap, a key indicator to keep track of is memory fragmentation and memory utilisation. Knowing your paragraph and page sizes you can inspect how each page and each paragraph of memory are used by the application and the custom heap to determine if there is wastage, what wastage there is and what form the wastage takes. This information could lead you to modify your heap algorithm to use pages differently to reduce fragmentation. See Delete memory 5 times faster for one simple technique, using HeapAlloc, the same principles apply here.

Loading large data files
Another use for this information is finding out why a certain large file will not load into memory despite Task Manager saying that you have 2GB of free memory. It is not uncommon to find a forum posting somewhere from someone that has a large image file (a satellite phote, MRI scan, etc) that is about 1GB in size. They wish to load it into memory, do in-memory processing on it, save the results, discard the memory then repeat the process, often for numerous images.
Typically on the third attempt to load a large file, the file will not load and the forum poster is left very confused.

The typical implementation is to allocate space for the large file using a call such as malloc() or operator new(). Both of which use the C runtime heap to allocate the memory.

The principle seems fine, but the problem is caused by memory fragmentation which results in a less accessible, totally usable, free space because the remaining free space blocks are separated into a many smaller regions, most of which are smaller than any forthcoming large allocation required by the application. Without the information about where pages and paragraphs are situated, how big they are and what their status is, identifying the cause of this failure could be very time consuming. Once you know the cause, you can think about allocating and managing your memory differently and prevent the bug from happening in the first place.

For situations like these, using HeapAlloc() with a dedicate heap (created using HeapCreate()) or even just directly using VirtualAlloc() will most likely lead to superior results than using the C runtime heap.

A first step in understanding such bugs is to be able to visualize the memory and to also inspect the various page and paragraph information.

To aid in these tasks we have just added a VM Pages view and VM Paragraphs view to VM Validator to make identifying such issues easier. VM Validator is a free download.

Memory Validator will also be updated with a VM Paragraphs view in the next release (Memory Validator already has a more detailed VM Pages view).

Thank you to Blake Miller of Invensys for suggesting an alternative wording for one paragraph of this article.

Delete memory 5 times faster

By , March 2, 2010 12:56 pm

Memory management in C and C++ is typically done using either the malloc/realloc/free C runtime functions or the C++ operators new and delete. Typically the C++ operators call down to the underlying malloc/free implementation to do the actual memory allocations.

This is great, its useful, it works, BUT it puts all the allocations in the same heap. So when you come around to deallocating the heap manager will need to take into account all the other objects and allocations that are also in the heap but unrelated to the data you are deallocating. That adds overhead to the heap manager, causing memory fragmentation and slower heap management.

There is a different way you can handle this situation – you can use your own heap for a given group of allocations.

	HANDLE	hHeap;

	hHeap = HeapCreate(0, 0x00010000, 0); // 64K growable heap

The downside to this is that you have to remember to use the correct allocators and deallocators for these objects and not to use malloc/free etc. You can mitigate this in C++ by overriding new/delete to use your own heap.

void *myClass::operator new(size_t size)
	return HeapAlloc(hHeap, 0, size);

void myClass::operator delete(void *ptr)
	HeapFree(hHeap, 0, ptr);

The upside to this technique is that you can delete all your deallocations in one call by deleting the heap and not bothering with deleting individual allocations. This is also about 5 times faster.

Old style:

	for(i = 0; i < count; i++)
		HeapFree(hHeap, 0, ptrs[i]);

New style:

	hHeap = NULL;

There is another benefit to this technique: By deleting the heap and then re-creating a heap for new allocations you remove all fragmentation from the heap and start the new heap with 0% fragmentation.

HeapDestroy timing demonstration

Download the source of the demonstration application. Project and solution files for Visual Studio 6.0 and Visual Studio 2008 are provided.

We use this technique for some of our tools where we want a high performance heap and zero fragmentation.

You will also want to ensure that whatever software tool you are using to monitor memory allocations will mark all entries in a heap that is destroyed as deallocated.

Panorama Theme by Themocracy