Unix/Perl script to monitor process memory usage

We have been running into problems with Solr/Lucene’s fieldCache lately at work. We have a lot of dynamic fields, and we sort on them. When you sort on a field, Solr populates the underlying Lucene field cache, which cannot be configured, with an entry for that field. The number of values stored is directly proportional to the number of documents in the index. In our case we have 14 million documents, so a sort on one field populates the field cache with 14 million integers, plus some more depending on how many distinct values the field can take. Since we have about 250 of these dynamic fields and can sort on any of them, the field cache builds up pretty quickly. After about 6 to 7 hours of operation, Tomcat was consuming about 90% of the memory on the 4 GB box it runs on, and we started getting OutOfMemory exceptions because the heap was full.
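
To get a feel for the numbers, here is a rough back-of-the-envelope estimate in Perl. It assumes each cached sort value is a 4-byte integer (my assumption, not something we measured) and ignores the extra per-field overhead for distinct values, so treat it as a lower bound.

```perl
use strict;
use warnings;

# Document and field counts are from the post.
# 4 bytes per entry is an assumption (the size of a Java int).
my $docs            = 14_000_000;
my $sortable_fields = 250;
my $bytes_per_entry = 4;

my $mib            = 1024 * 1024;
my $per_field_mib  = $docs * $bytes_per_entry / $mib;
my $all_fields_gib = $per_field_mib * $sortable_fields / 1024;

printf "One sorted field: about %.1f MiB\n", $per_field_mib;
printf "All %d fields: about %.1f GiB\n", $sortable_fields, $all_fields_gib;
```

Under those assumptions a single sorted field costs roughly 53 MiB, and sorting on all 250 fields would need around 13 GiB, far more than a 4 GB machine can hold.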

The only temporary workaround, until we rework our Solr schema, seemed to be restarting Tomcat automatically whenever its memory consumption goes over 80% or so. In our production environment this happens every 6 to 7 hours. We did not want a cron job that blindly restarts Tomcat every 6 or 7 hours. Instead, the job should monitor memory usage and restart Tomcat only if it is consuming 80% or more of memory. Of course, this assumes no other memory-intensive process is running on that machine.

When we tried the Unix free command, we ran into a problem: its numbers did not match what the top command was showing. For example, top said 76.7% of memory was being used by Tomcat, but
free -m
gave the following output:

             total       used       free     shared    buffers     cached
Mem:          4013       3993         19          0         10       1149
-/+ buffers/cache:       2833       1179
Swap:         1953         12       1940

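So how do we get 76.7% from this output? None of the ratios you can form from these rows gives that answer, because free reports memory usage for the whole system (including buffers and cache), not for a single process.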
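
For reference, the obvious ratios can be worked out with a few lines of Perl. The numbers are copied from the free -m output above; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Values in MB, copied from the `free -m` output above.
my %mem = (total => 4013, used => 3993, buffers => 10, cached => 1149);

# Raw "used" includes the kernel's buffers and page cache.
my $raw_pct = 100 * $mem{used} / $mem{total};

# The "-/+ buffers/cache" row subtracts them back out
# (free shows 2833 here; the 1 MB difference is rounding).
my $app_used = $mem{used} - $mem{buffers} - $mem{cached};
my $app_pct  = 100 * $app_used / $mem{total};

printf "used / total = %.1f%%\n", $raw_pct;
printf "(used - buffers - cached) / total = %.1f%%\n", $app_pct;
```

That works out to roughly 99.5% and 70.6%, and neither is the 76.7% that top reported.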

Instead, look at the 4th column (%MEM) of the Unix ps aux output, which gives the memory consumed by each process. Here is the output of
ps aux | grep tomcat

binuser    953  0.2 76.7 22136372 3152708 ?    Sl   Nov15   2:19 /usr/lib/jvm/java-7-oracle/bin/java -Djava.util.logging.config.file=/var/lib/tomcat6/conf/logging.properties -Dsolr.home=/var/solr -Xms1g -Xmx3600m -XX:+UseConcMarkSweepGC -Dcom.sun.management.jmxremote.port=7009 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=XXX.YY.ZZZ.WWW -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/usr/share/tomcat6/endorsed -classpath /usr/share/tomcat6/bin/bootstrap.jar -Dcatalina.base=/var/lib/tomcat6 -Dcatalina.home=/usr/share/tomcat6 -Djava.io.tmpdir=/tmp/tomcat6-tmp org.apache.catalina.startup.Bootstrap start
joe    17191  0.0  0.0   6156   656 pts/0    R+   16:25   0:00 grep --color=auto tomcat

OK, so that is two lines of output, and the first is the one we want. (The second is the grep command we just ran.) The 4th column of the first line reads 76.7, which is the percentage of physical memory this process is taking.
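
As a quick sanity check, the sketch below pulls the %MEM and RSS columns out of a ps aux line and recomputes the percentage as RSS divided by total physical memory. The sample line is shortened from the output above, the helper name parse_ps_line is made up for this example, and the total is approximated from the 4013 MB that free -m reported.

```perl
use strict;
use warnings;

# Hypothetical helper: split one `ps aux` line into the fields we care about.
# ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
sub parse_ps_line {
    my ($line) = @_;
    # A limit of 11 keeps the command (which contains spaces) in one piece.
    my @f = split ' ', $line, 11;
    return { user => $f[0], pid => $f[1], mem_pct => $f[3], rss_kb => $f[5] };
}

# Shortened copy of the tomcat line shown above.
my $sample = 'binuser    953  0.2 76.7 22136372 3152708 ?    Sl   Nov15   2:19 '
           . '/usr/lib/jvm/java-7-oracle/bin/java org.apache.catalina.startup.Bootstrap start';

my $proc = parse_ps_line($sample);

# Approximate total physical memory in KB, from the 4013 MB reported by free -m.
my $total_kb = 4013 * 1024;
my $computed = 100 * $proc->{rss_kb} / $total_kb;

printf "ps reports %%MEM = %s, RSS / total = %.1f%%\n", $proc->{mem_pct}, $computed;
```

Both values come out at 76.7, which suggests the 4th column is simply the resident set size (6th column, in KB) as a share of physical memory.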

So here is the Perl script to restart Tomcat when the memory it consumes exceeds 80%:

use strict;
use warnings;

# capture the output of the system command
my $output = `ps aux | grep tomcat`;

my @lines = split /\n/, $output;
for my $line (@lines) {
    # get the line about the tomcat process (this also skips the grep line)
    if ($line =~ m{\Qorg.apache.catalina\E}) {
        # ps aux columns: USER PID %CPU %MEM ...; index 3 is %MEM
        my @vals = split /\s+/, $line;
        my $mem_used_percent = $vals[3];
        if ($mem_used_percent > 80) {
            print "Used mem % = $mem_used_percent\n";
            my $now_string = localtime;
            print "Restarting tomcat at $now_string\n";
            system("/etc/init.d/tomcat6 restart");
            $now_string = localtime;
            print "Successfully restarted at $now_string\n\n";
        }
    }
}

Put that in root's crontab (since you are restarting a service) to run every minute, and log the output to a file so you can see when Tomcat was last restarted. So now I can go to sleep peacefully :-).
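
For the cron part, an entry along these lines should do it. The script path and log file name are placeholders, not the ones we actually use, so adjust them to your setup. It goes in root's crontab (for example via sudo crontab -e), since restarting the service needs root privileges.

```
# min hour day-of-month month day-of-week  command
* * * * * /usr/bin/perl /usr/local/bin/check_tomcat_mem.pl >> /var/log/tomcat_mem_check.log 2>&1
```

The five asterisks mean "every minute", and 2>&1 sends error messages to the same log file as the normal output.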
