-
Replicate the memory usage observed by the customer. Make
sure you are seeing exactly what the customer is claiming. If possible
get them to confirm that this is the case.
-
If possible, modify the test case so that its memory usage is consistent
each time you run it. Remove any mouse interaction and any 'random' behaviour.
-
Measure the memory usage, so that you have a baseline against
which you can compare future results. I would use gmemusage to do this,
since it is most likely what the customer is using.
Gmemusage, as it comes with 6.[345] up to 6.5.2m, isn't particularly
useful for recording the memory usage of an application.
To get around this, I have hacked it (RFE # 636712 has been filed;
N.B. this RFE was completed and incorporated into 6.5.4) so that when you
hit the 'y' key, it does the equivalent of pressing the 't' key at every
interval, as specified by the -i command line parameter; that is, it prints
the currently viewed information to stdout each time.
Because gmemusage accesses a lot of kernel structures, it varies somewhat
with OS version.
-
Record the memory usage in a sensible way. Recording
memory usage over time isn't necessarily a good way, since modifications/fixes
can affect the test case's performance, which may slow down the program
and, thus, the rate of memory consumption over time. I have found that measuring
usage per frame is a good way. If the app is paging a database, then measuring
the memory consumption per database page operation would be a good way.
To do this, you should fix the number of frames that the test case produces.
Make it a fairly substantial number, preferably stopping just before the test
case runs out of memory. Use your own judgement on this, since you may only require
a short test case, and the less time spent running test cases the better.
-
When you have a file of gmemusage output, you should
filter it so that you have a file of numbers indicating the increase in
memory usage.
I do something like this :-
-
run gmemusage, redirecting stdout to a file. Organise the experiment results
into one directory per experiment, and make a note of how you ran the test
case together with any modifications you had made to it.
iris$ gmemusage > experiment_original/gmemusage.out
-
run the test case, pressing 'y' in the gmemusage window at the start of
the test, and 'esc' when it finishes
-
change into the experiment directory
iris$ cd experiment_original
-
filter out the lines for the 'Res' column for 'testcase' - you should replace
'testcase' with whatever the test case is called :-
iris$ awk '/testcase/ { print $4 }' gmemusage.out > testcase.res
-
add a number to the start of each line in the file to indicate how far
through the test it is - here I've used 100 for a percentage; you could
use the total number of frames to get a usage/frame graph
iris$ awk -v TNR=`awk 'END { print NR }' testcase.res` '{ print 100*NR/TNR, $0 }' testcase.res > testcase.res.prop
Here I have used the suffix 'res' for 'resident memory' and 'prop'
for 'proportional'. Use whatever you want.
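The steps above leave absolute resident sizes in testcase.res. If you want the increase between consecutive samples instead, one more awk pass can produce it. This is only a sketch: the function name res_increase and the '.delta' suffix are my own inventions, and it assumes each line holds a single plain number.

```shell
# Hypothetical helper: print the change between consecutive samples.
# Reads one numeric resident size per line on stdin (as in testcase.res)
# and writes one difference per line, starting from the second sample.
res_increase() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}
```

It could be used like this :-
iris$ res_increase < testcase.res > testcase.res.delta
A steady leak should then show up as a run of positive numbers rather than as a slope.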
-
plot a graph using, for example, gnuplot.
iris$ gnuplot
gnuplot> set data style lines
gnuplot> plot "testcase.res.prop"
When you have made some modifications to the test case, you can plot
the two memory usage profiles on the same graph.
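Overlaying profiles only needs a plot command with one quoted file name per profile, separated by commas. The sketch below builds that command for any number of files. The function name overlay_plot_commands is hypothetical, and so is the 'experiment_amallopt' directory in the usage line. The 'set data style lines' command is copied from the session above; I believe later gnuplot releases spell it 'set style data lines', so adjust it if yours complains.

```shell
# Hypothetical helper: emit gnuplot commands that overlay several
# profiles on one graph. Each argument is a data file such as
# experiment_original/testcase.res.prop.
overlay_plot_commands() {
    echo 'set data style lines'
    sep='plot '
    for f in "$@"; do
        # First file is prefixed with 'plot ', later ones with ', '.
        printf '%s"%s"' "$sep" "$f"
        sep=', '
    done
    echo
}
```

For example :-
iris$ overlay_plot_commands experiment_original/testcase.res.prop experiment_amallopt/testcase.res.prop > compare.gp
gnuplot> load "compare.gp"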
-
'snapshot' the graph. This provides images which are useful for showing improvements
to customers and for describing the problem to engineering.
-
Try various fixes, measuring the memory usage each time :-
-
Make the test case single threaded by calling pfMultiprocess( PFMP_APPCULLDRAW ).
See pfMultiprocess(3pf).
-
Try the amallopt() calls. These reduce the amount of arena fragmentation.
I usually insert code such as the following :-
    if ( getenv( "AMALLOPT_FIX" ) != NULL )
    {
        /* AMALLOPT_FIX holds the value to pass to M_MXCHK */
        int size = atoi( getenv( "AMALLOPT_FIX" ) );
        fprintf( stderr, "Implementing AMALLOPT_FIX\n" );
        amallopt( M_MXCHK, size, pfGetSharedArena() );
        amallopt( M_FREEHD, 1, pfGetSharedArena() );
    }
which allows me to control the test case with an environment variable
instead of recompiling each time.
See amalloc(3c) for a detailed explanation of the amallopt() calls.
A brief explanation follows.
The amallopt() calls control how the arena memory allocation algorithm
works.
When areas of memory in the arena are freed (with afree(), arealloc()
or arecalloc()), their details (size and location) are recorded in a free
list. When new memory is requested, the free list is searched for an area
which will satisfy the request. By default, only the first 100 entries
of the free list are searched. If the allocator searches this number of entries
without finding one that satisfies the request, it allocates
new memory in the usual way (thereby increasing the amount of memory allocated
to the process). The number of free list entries that are searched can be
controlled by the M_MXCHK command. The higher the number, the more likely
it is that a suitable area of memory will be found, and the less likely it is
that a new area of memory will be allocated.
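To see why the search limit matters, here is a toy model of a limited first-fit search. It is emphatically not the real amalloc code: it never splits or merges blocks, and the function name arena_growth and its three arguments are made up for the illustration. It only shows that a suitable block sitting beyond the search limit is ignored, so the arena grows.

```shell
# Toy model only: NOT the real amalloc implementation.
# $1 = maximum number of free-list entries to examine (cf. M_MXCHK)
# $2 = sizes of the blocks on the free list, in list order
# $3 = sizes of the allocation requests, in order
# Prints the total size that had to be met with new memory.
arena_growth() {
    awk -v maxchk="$1" -v blocks="$2" -v reqs="$3" 'BEGIN {
        nfree = split(blocks, fl, " ")
        nreq = split(reqs, rq, " ")
        grown = 0
        for (r = 1; r <= nreq; r++) {
            limit = (nfree < maxchk) ? nfree : maxchk
            hit = 0
            # First fit, but give up after "limit" entries.
            for (i = 1; i <= limit; i++)
                if (fl[i] + 0 >= rq[r] + 0) { hit = i; break }
            if (hit) {
                # Reuse the block: drop it from the free list.
                for (j = hit; j < nfree; j++) fl[j] = fl[j + 1]
                delete fl[nfree]
                nfree--
            } else {
                # Nothing suitable within the limit: grow the arena.
                grown += rq[r]
            }
        }
        print grown
    }'
}
```

With a free list of '8 8 8 64' and a single request for 32, a limit of 2 gives up before reaching the 64 entry and reports 32 of new memory, whereas a limit of 100 reuses the 64 entry and reports 0.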
The M_FREEHD command controls behaviour when memory is freed. If the
command is set to '1' then the details of the memory being freed are placed
at the beginning of the free list (where the search starts). The default
is '0', which places them at the end of the free list.
The disadvantage of both these settings is that they produce sub-optimal
performance, which also explains why they are not set this way by default.
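Since the AMALLOPT_FIX environment variable picks the M_MXCHK value at run time, trying several values is easy to script. A minimal sketch follows. The function name amallopt_sweep is hypothetical, and the values in the usage line are arbitrary examples rather than recommendations.

```shell
# Hypothetical helper: run a command once per M_MXCHK value, passing the
# value through the AMALLOPT_FIX environment variable used above.
# $1 = space-separated list of values; remaining arguments = the command.
amallopt_sweep() {
    sizes=$1
    shift
    for size in $sizes; do
        # Report progress on stderr so stdout stays clean.
        echo "AMALLOPT_FIX=$size" >&2
        AMALLOPT_FIX=$size "$@"
    done
}
```

For example :-
iris$ amallopt_sweep "100 1000 10000" ./testcase
Each run still needs its own gmemusage recording and experiment directory, as described earlier, so that the profiles can be compared afterwards.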