Wednesday, 8 August 2012

Determining global variables

A global variable is one that is likely to be locked at all times, i.e. it's accessed throughout the application, often indirectly.

The type of any memory area, with regard to lifetime, can be determined in many ways. The most obvious one is to count the memory accesses (loads and stores, henceforth called "ops") between new and free, and compare that to the total number of ops in the entire program. That will give us all memory areas that are allocated in the beginning and freed at the end, which is common for architectural-type objects that are visible from everywhere. Such objects usually also have a fairly low number of accesses of their own compared to the total number of accesses within their lifetime. Calculating (own ops + other ops) / total app ops, where own ops are accesses to the object itself and other ops are all other accesses made while it is alive, gives a number from 0 to 1, where 1 is an object allocated at the very beginning and freed at the very end.
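The lifetime number described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the trace format (a list of `(kind, handle)` tuples with kinds `new`, `free`, `load` and `store`), the function name `lifetime_ratios`, and the rule that never-freed objects live until the end of the trace are all hypothetical, not taken from any actual tool.

```python
def lifetime_ratios(trace):
    """Return {handle: lifetime in [0, 1]} for a list of (kind, handle) events.

    Assumes (hypothetically) that every allocation has a unique handle and
    that only "load" and "store" events count as ops.
    """
    ops_seen = 0   # loads and stores seen so far, across all objects
    born_at = {}   # handle -> op count at the time of "new"
    span = {}      # handle -> own ops + other ops between new and free

    for kind, handle in trace:
        if kind == "new":
            born_at[handle] = ops_seen
        elif kind == "free":
            span[handle] = ops_seen - born_at.pop(handle)
        elif kind in ("load", "store"):
            ops_seen += 1
        else:
            raise ValueError(f"unknown event kind: {kind!r}")

    # Assumption: objects that are never freed live until the end of the trace.
    for handle, start in born_at.items():
        span[handle] = ops_seen - start

    if ops_seen == 0:
        return {handle: 0.0 for handle in span}
    return {handle: ops / ops_seen for handle, ops in span.items()}


# Made-up example trace: "config" spans the whole run, "tmp" only part of it.
trace = [
    ("new", "config"),
    ("store", "config"),
    ("new", "tmp"),
    ("store", "tmp"),
    ("load", "tmp"),
    ("free", "tmp"),
    ("load", "config"),
    ("free", "config"),
]
ratios = lifetime_ratios(trace)
```

Here `config` gets a lifetime of 1.0 (allocated before the first op, freed after the last), while `tmp` gets 0.5 because two of the four ops happen while it is alive.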

Looking at the distribution of lifetime among pointers in a histogram for different applications, we see a clustering pattern for long-lived vs short-lived objects:

[Histogram: sort /etc/dictionaries-common/words]

[Histogram: uniq /etc/dictionaries-common/words]

[Histogram: soffice lorem-ipsum-10.odt]

The different apps show different behaviour.
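For anyone who wants to produce similar histograms, here is a rough sketch of how lifetime values between 0 and 1 could be bucketed. The function name `lifetime_histogram` and the choice of 10 bins are my own assumptions for illustration; they say nothing about how the plots in this post were actually generated.

```python
def lifetime_histogram(lifetimes, bins=10):
    """Count how many lifetime values (each in [0, 1]) fall into each bin."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    counts = [0] * bins
    for value in lifetimes:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"lifetime out of range: {value!r}")
        # A lifetime of exactly 1.0 goes into the last bin.
        index = min(int(value * bins), bins - 1)
        counts[index] += 1
    return counts


# Made-up sample: two short-lived, one medium, two long-lived objects.
counts = lifetime_histogram([0.0, 0.05, 0.5, 0.98, 1.0])
```

A clustering of long-lived versus short-lived objects would show up as high counts in the last and first bins with little in between.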

Starting with sort, no objects have a really long lifetime, and the lifetimes are approximately evenly distributed throughout the run of the application. There's a cut-off point somewhere between about 15% and 30% that separates longer from shorter life spans, but here a better way of determining lifetime needs to be used.

The second example, uniq, is on the other hand a very non-malloc-intensive application, where a bunch of memory is allocated in the beginning and then no more. It is safe to assume that this memory will be locked throughout the program's lifetime.

The last example, Star Office, shows a memory allocation pattern more typical of large-scale applications: a set of architectural objects is allocated in the beginning, with a lifetime of close to 100%, then a few allocations here and there, followed by a set of objects that have a very short lifetime, presumably for the current document itself. In this case, the first objects (i.e. long lifetime) would be locked all the time, whereas the later objects (i.e. short lifetime) would be locked just-in-time.
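The split between "locked all the time" and "locked just-in-time" could be sketched as a simple threshold on the lifetime number. Note that the cut-off value of 0.9 below is purely an assumption picked to match "close to 100%"; as the sort example shows, a single threshold will not work for every application. The names `locking_policy` and `long_lived_cutoff` are hypothetical as well.

```python
def locking_policy(lifetimes, long_lived_cutoff=0.9):
    """Map each handle to "always" or "just-in-time" based on its lifetime.

    long_lived_cutoff is an assumed, application-dependent threshold.
    """
    policy = {}
    for handle, lifetime in lifetimes.items():
        if not 0.0 <= lifetime <= 1.0:
            raise ValueError(f"lifetime out of range for {handle!r}")
        if lifetime >= long_lived_cutoff:
            policy[handle] = "always"        # architectural object: keep locked
        else:
            policy[handle] = "just-in-time"  # lock only around each use
    return policy


# Made-up lifetimes: one architectural object and one per-document object.
policy = locking_policy({"app_state": 0.98, "document_node": 0.05})
```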

Well, how do you determine the type of variable, then?

As noted above in the example with sort, that application displays an interesting behaviour that could possibly give a better understanding of applications with similar memory allocation patterns. Because sort is a small program, I might do a manual analysis of its memory usage to see how else we can determine whether an object is "global" (over a piece of code) and should be locked or not. This might be work for the future.
