Control groups in RHEL6

One new feature that I’m very enthusiastic about in RHEL6 is Control Groups (cgroups for short). It lets you create groups and allocate resources to them. You can then bunch your applications into groups to your heart’s content.

It’s relatively simple to set up, and configuration can be done in two different ways. You can use the supplied cgset command, or if you’re accustomed to doing it the usual way when dealing with kernel settings, you can simply echo values into the pseudo-files under the control group.
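As a rough sketch of the two approaches, assuming a hierarchy mounted at /cgroup/gen with the memory controller attached and an existing group called group1. The CGROUP_ROOT variable and the set_limit_by_echo helper are names made up for illustration, not part of any tool:

```shell
#!/bin/sh
# Sketch only: CGROUP_ROOT and set_limit_by_echo are illustrative names.
# /cgroup/gen is assumed to be a mounted hierarchy with the memory controller.
CGROUP_ROOT="${CGROUP_ROOT:-/cgroup/gen}"

# Way 1: the libcgroup tool (run as root):
#   cgset -r memory.limit_in_bytes=536870912 group1

# Way 2: write the value straight into the group's pseudo-file.
set_limit_by_echo() {
    group="$1"
    bytes="$2"
    echo "$bytes" > "$CGROUP_ROOT/$group/memory.limit_in_bytes"
}

# Example (as root): set_limit_by_echo group1 536870912
```

Both routes end up writing the same pseudo-file, so which one you pick is mostly a matter of taste.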

Here’s a control group in action:

[root@rhel6beta cgtest]# grep $$ /cgroup/gen/group1/tasks
1138
[root@rhel6beta cgtest]# cat /cgroup/gen/group1/memory.limit_in_bytes
536870912
[root@rhel6beta cgtest]# gcc alloc.c -o alloc && ./alloc
Allocating 642355200 bytes of RAM,,,
Killed
[root@rhel6beta cgtest]# echo `echo 1024*1024*1024| bc` > /cgroup/gen/group1/memory.limit_in_bytes
[root@rhel6beta cgtest]# ./alloc
Allocating 642355200 bytes of RAM,,,
Successfully allocated 642355200 bytes of RAM, captn' Erik...
[root@rhel6beta cgtest]#

The first line shows that the shell which launches the app is under the control of the cgroup group1, so all its child processes are subject to the same restrictions.
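Getting a shell into a group comes down to writing its PID into the group's tasks file. A minimal sketch, again assuming the /cgroup/gen layout; join_group and in_group are hypothetical helper names:

```shell
#!/bin/sh
# Sketch only: helper names and CGROUP_ROOT are illustrative.
CGROUP_ROOT="${CGROUP_ROOT:-/cgroup/gen}"

# Move a process into a group by writing its PID to the group's tasks file.
join_group() {
    group="$1"
    pid="$2"
    echo "$pid" > "$CGROUP_ROOT/$group/tasks"
}

# Succeeds if the PID is listed in the group's tasks file (whole-line match,
# so 113 does not match 1138).
in_group() {
    group="$1"
    pid="$2"
    grep -qx "$pid" "$CGROUP_ROOT/$group/tasks"
}

# Example (as root):
#   join_group group1 $$ && in_group group1 $$ && echo "shell is in group1"
```

Anything started from that shell afterwards lands in the same group.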

As you can also see, the initial memory limit in the group is 512M. alloc is a simple C app I wrote which calloc()s 612M of RAM (for demonstrative purposes, I’ve disabled swap on the system altogether). On the first run, the kernel kills the process in the same way it would if the whole system had run out of memory. The kernel message also indicates that the control group ran out of memory, and not the system as a whole:
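For the record, the byte values in the transcript work out as follows. This is just shell arithmetic as an alternative to piping through bc; the mib helper is a made-up name:

```shell
#!/bin/sh
# Convert a number of MiB to bytes (illustrative helper).
mib() { echo $(( $1 * 1024 * 1024 )); }

mib 512                               # initial limit: 536870912
mib 1024                              # raised limit (1G): 1073741824
echo $(( 642355200 / 1024 / 1024 ))   # what alloc asks for, in whole MiB: 612
```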

...
May 13 17:56:20 rhel6beta kernel: Memory cgroup out of memory: kill process 1710 (alloc) score 9861 or a child
May 13 17:56:20 rhel6beta kernel: Killed process 1710 (alloc)

Unfortunately it doesn’t indicate which cgroup the process belonged to. Maybe it should?

cgroups don’t just let you limit the amount of RAM; there are a lot of tunables. You can even set swappiness on a per-group basis! You can limit the devices applications are allowed to access, freeze processes, and tag outgoing network packets with a class ID in case you want to do shaping or profiling on your network. Perfect if you want to prioritise SSH traffic over anything else, so you can work comfortably even when your uplink is saturated. Furthermore, you can easily get an overview of memory usage, CPU accounting etc. of the applications in any given group.
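Getting that overview is mostly a matter of reading pseudo-files. A small sketch, assuming the same /cgroup/gen layout; which files actually exist depends on the controllers attached to the hierarchy, so the loop skips any that are missing. group_report is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: group_report and CGROUP_ROOT are illustrative names.
CGROUP_ROOT="${CGROUP_ROOT:-/cgroup/gen}"

# Print selected accounting and tunable files for one group, skipping any
# that the mounted controllers do not provide.
group_report() {
    group="$1"
    for f in memory.usage_in_bytes memory.limit_in_bytes memory.swappiness \
             cpuacct.usage freezer.state net_cls.classid; do
        path="$CGROUP_ROOT/$group/$f"
        if [ -r "$path" ]; then
            printf '%s: %s\n' "$f" "$(cat "$path")"
        fi
    done
}

# Example: group_report group1
```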

All this means you can clearly separate resources and, to quite a large extent, ensure that applications won’t starve the whole system, or each other, of resources. Very handy: no more waiting half an hour for swap to fill up and the OOM killer to kick in (and often choose the wrong PID) when a customer’s applications have run astray.

A very welcome addition to RHEL!

May 13th, 2010