Linux process anonymous memory

A server administrator trying to find the most RAM-hungry process on a system may fairly ask what memory consumption actually is. To answer that, we need to go deeper into the kernel realm and, where possible, avoid popular utilities such as ps and top, which merely display information that lives in the kernel.

Every process or task is represented in the kernel by a struct task_struct, which holds a great deal of process information. We are interested in one important field only: mm, a pointer to struct mm_struct. It describes the virtual memory layout of the process, which has almost nothing to do with physical RAM. A rough view of that layout can be seen in the /proc/<pid>/maps file of the process.

So, we have come to the virtual memory of a process. This is all the pages the process has. “To have” pages means that the process is allowed to read or write them, or to execute code in them. This, in turn, means that the kernel will respond properly to a page-fault hardware exception by altering the page tables of the process. If the faulting virtual address does not belong to the process, the kernel sends it SIGSEGV or another signal; otherwise it maps a physical page at that virtual address and, if necessary, loads the page contents from backing storage (a file or swap).
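The on-demand behaviour described above can be observed from user space. The following is a minimal sketch, assuming Linux and glibc: it maps anonymous pages, writes one byte to each so that the kernel has to handle a page fault per page, and then asks mincore() how many of them are resident. The helper names resident_pages() and touch_and_count() are made up for this illustration.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() and MAP_ANONYMOUS in glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count how many of the npages pages starting at addr are resident in RAM.
 * Returns -1 if mincore() fails or npages exceeds the local buffer. */
int resident_pages(void *addr, size_t npages)
{
    unsigned char vec[64];              /* one status byte per page */
    size_t page = (size_t) sysconf(_SC_PAGESIZE);

    if (npages > sizeof(vec) || mincore(addr, npages * page, vec) != 0)
        return -1;

    int count = 0;
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;            /* bit 0 set: page is resident */
    return count;
}

/* Map npages of anonymous memory, write one byte to each page and
 * return the number of resident pages afterwards (-1 on error). */
int touch_and_count(size_t npages)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, npages * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < npages; i++)
        p[i * page] = 1;                /* first write to a page faults it in */

    int n = resident_pages(p, npages);
    munmap(p, npages * page);
    return n;
}
```

Calling touch_and_count(2) from main() should normally report two resident pages, since both were just written to.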

A process can allocate memory in several ways: by extending the stack (a push operation), by extending the heap (an sbrk() call), or by direct allocation (e.g. mmap()). Functions like malloc() are built on top of these primitives: glibc's malloc(), for instance, typically serves small requests from the heap and falls back to mmap() for large ones. mmap stands for ‘memory map’; the call maps pages into the virtual memory of the process.
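To make the heap case concrete, here is a small sketch, assuming Linux and glibc, that moves the program break with sbrk() and reports how far it moved. The function name grow_heap() is an invention for this example; real programs would normally leave the break to malloc().

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in glibc */
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes` using sbrk() and return how far the program
 * break actually moved, or -1 if the kernel refused the request. */
long grow_heap(long bytes)
{
    char *before = sbrk(0);                     /* current program break */
    if (sbrk((intptr_t) bytes) == (void *) -1)
        return -1;
    char *after = sbrk(0);                      /* break after the extension */
    return (long) (after - before);
}
```

After a successful call the [heap] line in /proc/<pid>/maps should end at a correspondingly higher address.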

# cat /proc/$$/maps
00400000-004dd000 r-xp 00000000 08:01 197198                             /usr/bin/bash
006dc000-006dd000 r--p 000dc000 08:01 197198                             /usr/bin/bash
006dd000-006e6000 rw-p 000dd000 08:01 197198                             /usr/bin/bash
006e6000-006ec000 rw-p 00000000 00:00 0
02610000-02652000 rw-p 00000000 00:00 0                                  [heap]
7f8665bcb000-7f8665bd7000 r-xp 00000000 08:01 198477                     /usr/lib64/libnss_files-2.17.so
7f8665bd7000-7f8665dd6000 ---p 0000c000 08:01 198477                     /usr/lib64/libnss_files-2.17.so
7f8665dd6000-7f8665dd7000 r--p 0000b000 08:01 198477                     /usr/lib64/libnss_files-2.17.so
7f8665dd7000-7f8665dd8000 rw-p 0000c000 08:01 198477                     /usr/lib64/libnss_files-2.17.so
<...trimmed...>

Each mapping line shows the start and end virtual addresses, the permissions (the trailing p marks a private mapping), the offset in the mapped file, the major:minor numbers of the device where the file resides, the inode of the file, and the file name.
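The field layout just described is regular enough to be parsed with a single sscanf() call. The sketch below is one possible way to do it; struct map_entry, parse_maps_line() and map_size() are names chosen for this illustration, not anything provided by the kernel or libc.

```c
#include <stdio.h>

/* One parsed line of /proc/<pid>/maps. Field names are illustrative. */
struct map_entry {
    unsigned long start, end;           /* virtual address range [start, end) */
    char perms[5];                      /* e.g. "r-xp" */
    unsigned long offset;               /* offset in the mapped file */
    unsigned int dev_major, dev_minor;  /* device holding the file */
    unsigned long inode;                /* 0 when no file backs the mapping */
    char path[256];                     /* file name, "[heap]", or empty */
};

/* Parse a single maps line. Returns 0 on success, -1 on malformed input.
 * The path is optional, so seven converted fields are enough. */
int parse_maps_line(const char *line, struct map_entry *e)
{
    e->path[0] = '\0';
    int n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255[^\n]",
                   &e->start, &e->end, e->perms, &e->offset,
                   &e->dev_major, &e->dev_minor, &e->inode, e->path);
    return n >= 7 ? 0 : -1;
}

/* Size of the mapping in bytes. */
unsigned long map_size(const struct map_entry *e)
{
    return e->end - e->start;
}
```

Feeding it the first line of the bash listing above yields start 0x400000, end 0x4dd000, permissions r-xp, device 08:01 and inode 197198.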

A mapping can be backed by a file on disk or be anonymous. Anonymous memory is the memory we allocate and use ourselves (e.g. via malloc()). In the maps output above, the fourth and fifth mappings (006e6000-006ec000 and [heap]) are anonymous.
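A process can apply the same distinction to its own addresses by reading /proc/self/maps. The sketch below uses a deliberately simple rule, which is an assumption of this example: a mapping counts as anonymous when its inode is 0. That covers unnamed mappings, [heap] and [stack], but it also counts kernel-provided regions such as [vdso]. The function name is_anonymous() is hypothetical.

```c
#include <stdio.h>

/* Look up addr in /proc/self/maps.
 * Returns 1 if it lies in a mapping with no backing file (inode 0),
 *         0 if it lies in a file-backed mapping,
 *        -1 if no mapping contains it or the file cannot be read. */
int is_anonymous(const void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    unsigned long target = (unsigned long) addr, start, end, inode;
    char line[4096 + 128];              /* room for a long path */
    int result = -1;

    while (fgets(line, sizeof(line), f)) {
        /* start-end perms offset major:minor inode [path] */
        if (sscanf(line, "%lx-%lx %*s %*x %*x:%*x %lu",
                   &start, &end, &inode) != 3)
            continue;                   /* skip anything unparsable */
        if (target >= start && target < end) {
            result = (inode == 0);
            break;
        }
    }
    fclose(f);
    return result;
}
```

A local variable lives on the stack, so its address is expected to be reported as anonymous, while a string literal normally sits in the file-backed image of the executable.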

Let's create our own anonymous mapping:

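#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>     /* getpagesize(), sleep() */

int main(int argc, char *argv[]) {
        char *ptr;

        /* fd is -1: an anonymous mapping has no backing file */
        ptr = (char *) mmap((void *) 0x123000, 2 * getpagesize(), PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (ptr == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        printf("ptr is %p\n", (void *) ptr);

        sleep(100);     /* keep the process alive so we can inspect its maps file */
        return 0;
}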

The first argument is the virtual address at which we would like the pages to be mapped. The second argument is the length of the mapping; we use two pages. The third is a bitmask of the permissions of the mapping. The fourth says that the mapping is private and anonymous, that is, not backed by any file. The remaining two arguments, the file descriptor and the offset, matter only for file-backed mappings; Linux ignores the descriptor for an anonymous mapping, though passing -1 is the portable convention.

Compile and run it:

# ./a.out &
[1] 26140
ptr is 0x123000
# cat /proc/26140/maps
00123000-00125000 rw-p 00000000 00:00 0
00400000-00401000 r-xp 00000000 08:01 1253390                            /root/a.out
<...trimmed...>

We asked for a two-page mapping starting at virtual address 0x123000. Since memory is allocated in pages, the address should be a multiple of the page size, 4096 bytes in our case. Without MAP_FIXED the address is only a hint and the kernel does not guarantee to map the pages there, so normally we would pass 0 as the first argument to mmap() and let the kernel choose the address.
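The alignment rule and the usual "let the kernel choose" call can be sketched as follows. Both helpers, page_floor() and map_anywhere(), are names made up for this example, and the page size is queried at run time rather than assumed to be 4096.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS in glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to the start of the page that contains it. */
uintptr_t page_floor(uintptr_t addr)
{
    uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);
    return addr & ~(page - 1);          /* page size is a power of two */
}

/* Map len bytes of anonymous memory at a kernel-chosen address.
 * Returns NULL on failure instead of MAP_FAILED, for convenience. */
void *map_anywhere(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Whatever address map_anywhere() returns, page_floor() of that address should give the same value back, because the kernel only hands out page-aligned mappings.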

Our mapping comes first in the list. Its size is exactly two pages: 2 * 4096 = 8192 = 0x2000 bytes, which matches 0x125000 - 0x123000. It starts at exactly the address we asked for, and the permissions are, of course, the ones we requested.