Tips for Diagnosing Alleged Performer Memory Leaks

    1. Replicate the memory usage observed by the customer. Make sure you are seeing exactly what the customer is claiming. If possible, get them to confirm that this is the case.
    2. If possible, modify the test case so that its memory usage is consistent each time you run it. Remove any mouse interaction and 'random' behaviour.
    3. Measure the memory usage so that you have a baseline against which you can compare future results. I would use gmemusage for this, since it is most likely what the customer is using.


    4. Gmemusage, as it comes with IRIX 6.[345] up to 6.5.2m, isn't particularly useful for recording the memory usage of an application.

      To get around this, I have hacked it (RFE #636712 has been filed [N.B. this RFE was completed and incorporated into 6.5.4]) so that when you hit the 'y' key, it does the equivalent of pressing the 't' key; that is, it prints the currently viewed information to stdout, and it does so at every interval, as specified by the -i command line parameter.

      Because gmemusage accesses a lot of kernel structures, it varies somewhat with OS version.
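
      With the modified gmemusage, a typical invocation looks something like this (I am assuming the -i interval is in seconds; check gmemusage(1)) :-

        iris$ gmemusage -i 1 > gmemusage.out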

    5. Record the memory usage in a sensible way. Recording memory usage over time isn't necessarily a good approach, since modifications/fixes can affect the test case's performance, which may slow the program down and, thus, the memory consumption over time. I have found that measuring usage per frame is a good way. If the app is paging a database, then measuring the memory consumption per database page operation would be a good way. To do this, you should fix the number of frames that the test case produces - make it a fairly substantial number, preferably just short of the point where it runs out of memory (see the sketch below). Use your own judgement on this, since you may only require a short test case, and the less time spent running test cases the better.
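
      As a rough sketch, the main loop of a Performer test case can be bounded with a frame counter like this (NFRAMES is an arbitrary illustrative value; the loop body is whatever the test case already does each frame) :-

        #define NFRAMES 5000   /* illustrative - pick a count just short of memory exhaustion */

        int frame;
        for ( frame = 0; frame < NFRAMES; frame++ )
        {
            pfSync();          /* wait for the next frame boundary */
            /* ... the test case's usual per-frame processing ... */
            pfFrame();         /* trigger this frame's cull and draw */
        }
        pfExit();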
    6. When you have a file of gmemusage output, you should filter it so that you have a file of numbers indicating the increase in memory usage.
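
      For example, given a file with one resident-size number per line, an awk sketch like this prints the increase between successive samples (the file names are illustrative) :-

        iris$ awk 'NR > 1 { print $1 - prev } { prev = $1 }' testcase.res > testcase.delta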

    7. I do something like this :-
       

      1. run gmemusage, redirecting stdout to a file - organise the experiment results and make a note of how you ran the test case, together with any modifications you made to it.

        iris$ gmemusage > experiment_original/gmemusage.out
      2. run the test case, pressing 'y' in the gmemusage window at the start of the test, and 'esc' when it finishes
      3. change into the experiment directory

        iris$ cd experiment_original

      4. extract the 'Res' column values for 'testcase' - you should replace 'testcase' with whatever the test case is called :-

        iris$ awk '/testcase/ { print $4 }' gmemusage.out > testcase.res

      5. add a number to the start of each line in the file to indicate how far through the test it is - here I've used 100 for a percentage; you could use the total number of frames to get a usage/frame graph

        iris$ awk -v TNR=`awk 'END { print NR }' testcase.res` '{ print 100*NR/TNR, $0 }' testcase.res > testcase.res.prop

        Here I have used the suffix 'res' for 'resident memory' and 'prop' for 'proportional'. Use whatever you want.
      6. plot a graph using, for example, gnuplot.

        iris$ gnuplot
        gnuplot> set data style lines
        gnuplot> plot "testcase.res.prop"

        When you have made some modifications to the test case, you can plot the two memory usage profiles on the same graph.
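
        For instance (the second file name is just an illustration of a re-run with a fix applied) :-

        gnuplot> plot "testcase.res.prop", "testcase_fixed.res.prop"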
      7. 'snapshot' the graph. This provides images which are good for showing improvements to customers and for describing the problem to engineering.
    8. Try various fixes, measuring the memory usage each time :-
      1. Make the test case single-threaded by setting pfMultiprocess( PFMP_APPCULLDRAW ). See pfMultiprocess(3pf).
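
        A minimal sketch of the call placement - in OpenGL Performer, pfMultiprocess() must be called after pfInit() and before pfConfig() :-

            pfInit();
            /* run the APP, CULL and DRAW stages in a single process */
            pfMultiprocess( PFMP_APPCULLDRAW );
            pfConfig();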
      2. Try the amallopt() calls. These reduce the amount of arena fragmentation. I usually insert code such as the following :-

            if ( getenv( "AMALLOPT_FIX" ) != NULL )
            {
              int size = atoi( getenv( "AMALLOPT_FIX" ) );
              fprintf( stderr, "Implementing AMALLOPT_FIX\n" );
              /* search up to 'size' free list entries before growing the arena */
              amallopt( M_MXCHK, size, pfGetSharedArena() );
              /* put freed blocks at the head of the free list */
              amallopt( M_FREEHD, 1, pfGetSharedArena() );
            }

        which allows me to control the test case with an environment variable instead of recompiling each time.
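
        For example, to run the test case with the fix enabled (csh syntax; the value 10000 is just an illustration) :-

            iris$ setenv AMALLOPT_FIX 10000
            iris$ ./testcase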

        See amalloc(3c) for a detailed explanation of the amallopt() calls. I will give you a brief explanation here.

        The amallopt() calls control how the arena memory allocation algorithm works.

        When an area of memory in the arena is freed (with afree(), arealloc() or arecalloc()), its details (size and location) are recorded in a free list. When new memory is requested, the free list is searched for an area which will satisfy the request. By default, only the first 100 entries of the free list are searched. If the allocator searches this many entries without finding one that satisfies the request, it allocates new memory in the usual way (thereby increasing the amount of memory allocated to the process). The number of free list entries searched can be controlled with the M_MXCHK command. The higher the number, the more likely it is that an existing area of memory will be found, and the less likely it is that a new area will be allocated.

        The M_FREEHD command controls behaviour when memory is freed. If the command is set to '1', the details of memory being freed are placed at the beginning of the free list (where the search starts). The default is '0', which places them at the end of the free list.

        The disadvantage of both of these settings is that they can produce sub-optimal performance, which also explains why they are not set this way by default.