Linux memory leak detection

Tracking down the source of a memory leak in Linux is not always straightforward…

Signs of a Memory Leak:

Typically, the first sign of a memory leak is the oom-killer.  If programs start dying inexplicably, check the system log (usually /var/log/messages, or the kernel ring buffer via dmesg) for evidence of the oom-killer in action.  This should be accompanied by low memory reported by your system resource monitor (e.g., System Monitor in GNOME).  On the command line, you can use top or free for a quick diagnosis.
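
For example, to look for evidence of the oom-killer (the exact log location varies by distribution – some systems use /var/log/syslog, or only keep it in the kernel ring buffer):

>dmesg | grep -i "out of memory"
>grep -i oom-killer /var/log/messages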

Low memory, however, may not indicate a memory leak – it could just be that you are running too many programs, that some memory-intensive processes are hogging space, or that the system is not configured optimally – e.g., it’s not taking full advantage of available memory, or the swap space is too small.  To confirm a memory leak, you should monitor memory usage over time, preferably with no changes to the set of running processes.  A simple way to do this is:

>free -lmt -s 300

which will output memory usage in some detail (in megabytes, with low/high memory and totals) every 5 minutes.  Note, however, that a system running optimally should be using nearly all available memory, as it’s more efficient to keep data in memory than to deallocate/reallocate and unload/reload the same data over and over.  An optimal system may therefore show high memory usage all the time.  Most of this memory is cache and buffers, which the kernel reclaims automatically when other processes need it.
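
If you want a record you can compare later, one simple sketch is to timestamp each sample and append it to a file (the interval and the file name here are arbitrary):

>while true; do date >> memlog.txt; free -m >> memlog.txt; sleep 300; done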

User-Space Leaks:

It is relatively easy to detect and eliminate user-space memory hogs, so this is an obvious next step.  User space is basically non-kernel space, and refers to memory used by running programs and processes – i.e., not memory used by the system kernel.  A quick method is to use your system monitor program, or if you prefer the command line, run:

>top

and hit “M” to sort by memory usage.  This will quickly tell you which programs are using the most memory on the system.  If a process’s memory usage keeps growing over time, you have a good lead on the source of a memory leak.  Another way to get much the same information is:

>ps -e o pid,command,pmem,rsz,vsz k +rsz
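
Once you have a suspect, you can confirm that its resident memory keeps growing by sampling it periodically – for example (1234 is a placeholder PID, and the 60-second interval is arbitrary):

>while true; do ps -o rsz=,vsz= -p 1234; sleep 60; done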

Kernel-Space Leaks:

The system kernel doesn’t show up as a process, so detecting non-user-space memory usage is tricky.  The most helpful tool to start with is the meminfo file:

>cat /proc/meminfo

If you suspect a kernel memory leak, then you should learn how to read meminfo.  From the manual (man 5 proc), you can see that a lot of memory is allocated by the system for temporary use (e.g., Buffers, Cached, Inactive, Virtual), and is released when needed, so high memory usage by these segments is usually a good thing.  In addition, the following hints may help you to understand where memory is allocated:

MemTotal = LowTotal + HighTotal
MemFree = LowFree + HighFree
Slab = SReclaimable + SUnreclaim
Active = Active(anon) + Active(file)
Inactive = Inactive(anon) + Inactive(file)
AnonPages + ?X? = Active(anon) + Inactive(anon)
Buffers + Cached = Active(file) + Inactive(file) + ?X?
AnonPages + Buffers + Cached = Active + Inactive
SwapTotal = SwapFree + SwapCached

If you do find something out of whack in meminfo (a segment being allocated too much memory, or growing too big), then you may have found a system configuration issue, or possibly even a kernel memory leak.  In some cases, additional information may be available, such as /proc/slabinfo for Slab, and vmstat or /proc/vmallocinfo for Virtual memory issues.
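
To watch the kernel-side numbers for steady growth, one option is to poll the fields you suspect; slabtop (if installed) also gives a live, sorted view of /proc/slabinfo.  For example:

>watch -n 60 'grep -E "Slab|SReclaimable|SUnreclaim|VmallocUsed|PageTables" /proc/meminfo'
>slabtop -o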

Other Leaks:

A common question is: why doesn’t adding up all the different memory segments come out to MemTotal?  MemTotal is the total amount of usable physical RAM; it excludes swap space (which is often loosely called virtual memory).  It is roughly equivalent to:

MemFree + Slab + PageTables + VmallocUsed + Buffers + Cached + User-Space memory

where user-space memory is the sum of the resident memory (RSZ) of all processes, as reported by the ps command above.  There is typically a (possibly large) discrepancy between this sum and MemTotal, partly because summing per-process RSS counts shared pages more than once, and partly because meminfo simply doesn’t keep track of all memory allocations.
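
As a rough sanity check, you can add these up yourself – a sketch (meminfo values are in kB, and summing per-process RSS double-counts shared pages, so don’t expect an exact match):

>awk '/^(MemFree|Slab|PageTables|VmallocUsed|Buffers|Cached):/ {s+=$2} /^MemTotal:/ {t=$2} END {print "accounted:", s, "kB of", t, "kB"}' /proc/meminfo
>ps -e -o rsz= | awk '{s+=$1} END {print "user-space RSS:", s, "kB"}'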

Since even meminfo doesn’t keep track of every memory allocation, it is possible for a memory leak to hide in memory that isn’t being accounted for.  This could be due to a user-space process, or a system component.  Unfortunately, your best bet for tracking down a memory leak here is the process of elimination – i.e., removing programs until the memory leak stops.  Options to consider (some commands for enumerating these are sketched after the list):

  • kernel modules – these are add-ons to the kernel that are written and maintained by 3rd-parties, and include hardware / device drivers
  • system services – also called daemons, these are continuously running processes, typically managed by the service command
  • the desktop – the many layers of software that run your desktop include the X server (X.org), the desktop environment (GNOME or KDE), the window manager (Metacity, Compiz), the window decorator (Emerald), and special effects (Compiz Fusion).
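
To enumerate the candidates above, the following may help (service management varies by distribution – newer systems use systemctl in place of the service command):

>lsmod
>service --status-all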
