… or why you should care about “OOMScoreAdjust” in your systemd-enabled docker-images
Recently I put together a docker image with “PostgreSQL 9.4” installed from the project’s software repository on “CentOS 7.1” to back a rails application. Unfortunately I was not able to run the “PostgreSQL” server in a container based on that image. It failed with exit code 206/OOM_ADJUST. In this article I’m going to describe the reason for the failure using a minimal failing example.
Preface
I normally run distributions which use “systemd” as PID 1: “Arch Linux” on workstations/servers and “CentOS” on servers. I like “systemd” as a process manager a lot. That’s why I decided to use it within the docker images as well.
The problem occurs with “systemd 208”, which is the packaged version for “CentOS 7.1.1503”. According to upstream the problem has long been fixed in one of the newer releases of “systemd”, but it seems that the fix has not been backported by CentOS so far. At the time of writing, I use “docker” 1.8.3 on all of my machines.
I’m going to reproduce the problem with a minimal failing example: just a so-called oneshot service which outputs a string and then exits.
Reproduce the issue
Set up a failing “docker” image
First, create a working directory and make it your current working directory.
mkdir -p my-failing-docker-image
cd my-failing-docker-image
After that, create a Dockerfile with the following content.
FROM feduxorg/centos:latest
MAINTAINER me@example.org
ADD test.service /etc/systemd/system/test.service
Then create the test.service file, which will be added to the image. It should contain the following lines.
[Unit]
Description=Test Service
[Service]
Type=oneshot
ExecStart=/usr/bin/echo asdf
OOMScoreAdjust=-1000
Build the image
After you have created the files, build the docker image.
docker build -t exampleorg/my-failing-docker-image .
Run the container
Since this is a “systemd”-enabled image, you need to invoke docker with the following command line. The most important part is -v /sys/fs/cgroup:/sys/fs/cgroup. Without that volume “systemd” will not start correctly and outputs an error message instead. The rest of the command line only makes sure that the container is removed after you have stopped it.
docker run -it --rm --name my-failing-docker-image-1 -v /sys/fs/cgroup:/sys/fs/cgroup exampleorg/my-failing-docker-image
# => systemd 208 running in system mode. (+PAM -LIBWRAP -AUDIT +SELINUX -IMA +SYSVINIT -LIBCRYPTSETUP -GCRYPT -ACL -XZ)
# => Detected virtualization 'docker'.
# => Welcome to CentOS Linux 7 (Core)!
# [...]
Don’t worry if you don’t see a login prompt after starting the container. The corresponding service console-getty is masked in “docker”’s default “CentOS” image; look for “masked” in systemd.unit (5) if you want to read more about that topic. You can use docker exec instead to get a shell inside the container; see the full command line below.
Open Shell within the container
To open a shell in the container run the following command in another terminal.
docker exec -it my-failing-docker-image-1 bash
# => [root@f410a6d1b93f /]#
Run the service
Now try to run the faulty test service.
systemctl start test.service
When you have a look at the status of the service you should see something similar to this. The exit status is 206/OOM_ADJUST and “systemd” reports Failed to start Test Service.
systemctl status test.service
# => test.service - Test Service
# => Loaded: loaded (/etc/systemd/system/test.service; static)
# => Active: failed (Result: exit-code) since Sat 2015-10-24 08:18:07 UTC; 12s ago
# => Process: 117 ExecStart=/usr/bin/echo asdf (code=exited, status=206/OOM_ADJUST)
# => Main PID: 117 (code=exited, status=206/OOM_ADJUST)
# =>
# => Oct 24 08:18:07 f410a6d1b93f systemd[1]: test.service: main process exited, code=exited, status=206/OOM_ADJUST
# => Oct 24 08:18:07 f410a6d1b93f systemd[1]: Failed to start Test Service.
# => Oct 24 08:18:07 f410a6d1b93f systemd[1]: Unit test.service entered failed state.
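The bug reports linked in the references point to a likely cause: to apply OOMScoreAdjust=-1000, “systemd” writes the value to the oom_score_adj file of the new process below /proc, and lowering that value requires the CAP_SYS_RESOURCE capability, which “docker” does not grant to containers by default. The following sketch probes this from a shell inside the container. The function name can_lower_oom_score is made up for illustration, and the explanation is a reading of those bug reports, not something verified against the “systemd 208” sources.

```shell
# Hypothetical helper: try to lower the OOM score adjustment of a
# throwaway child shell. Returns 0 if the kernel accepts the write,
# non-zero otherwise.
can_lower_oom_score() {
  # The redirection is opened by the child shell itself, so /proc/self
  # refers to that child and the calling shell is left untouched.
  sh -c 'echo -1000 > /proc/self/oom_score_adj' 2>/dev/null
}

if can_lower_oom_score; then
  echo "lowering oom_score_adj is permitted"
else
  echo "lowering oom_score_adj is denied"
fi
```

If the probe reports “denied” inside the container, a unit with a negative OOMScoreAdjust would be expected to fail in the same way as the test service.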
Fixing the Test Image
Stop the container
Now let’s move on and fix that. First of all you should stop the container using one of the following options: run systemctl poweroff from within the container and wait for the power-off to finish …
systemctl poweroff
… or stop the container from “the outside” with the docker kill command. Normally I choose docker kill, but both work fine, and of course you can use docker stop as well.
docker kill my-failing-docker-image-1
# or
docker stop my-failing-docker-image-1
Fix the service
After you have stopped the container, remove the line with OOMScoreAdjust from the test.service file …
[Unit]
Description=Test Service
[Service]
Type=oneshot
ExecStart=/usr/bin/echo asdf
… and build the image again. To read more about OOMScoreAdjust, have a look at the “systemd.exec (5)” manual.
docker build -t exampleorg/my-failing-docker-image .
When the build has finished you can run the fixed container.
docker run -it --rm --name my-failing-docker-image-1 -v /sys/fs/cgroup:/sys/fs/cgroup exampleorg/my-failing-docker-image
Run and check the service again
Please open a shell in the container …
docker exec -it my-failing-docker-image-1 bash
… and run the fixed service.
systemctl start test.service
When you have a look at the status of the service you should see something similar to the following output. Now the service works as expected.
systemctl status test.service
# => test.service - Test Service
# => Loaded: loaded (/etc/systemd/system/test.service; static)
# => Drop-In: /run/systemd/system/test.service.d
# => └─00-docker.conf
# => Active: inactive (dead)
# =>
# => Oct 24 08:41:40 f410a6d1b93f systemd[1]: Starting Test Service...
# => Oct 24 08:41:40 f410a6d1b93f echo[217]: asdf
# => Oct 24 08:41:40 f410a6d1b93f systemd[1]: Started Test Service.
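For a scripted smoke test of the image, the two outcomes shown in this article can be told apart by looking for the status=206/OOM_ADJUST marker in the status output. This is only a minimal sketch: the function name failed_with_oom_adjust is invented for this example, and the sample line is copied from the failing run shown earlier.

```shell
# Hypothetical helper: succeed if the text on stdin reports the
# OOM_ADJUST failure.
failed_with_oom_adjust() {
  grep -q -e 'status=206/OOM_ADJUST'
}

# Sample line copied from the failing "systemctl status" output
sample='Main PID: 117 (code=exited, status=206/OOM_ADJUST)'
if printf '%s\n' "$sample" | failed_with_oom_adjust; then
  echo "service hit the OOMScoreAdjust problem"
fi
```

Inside the container the input would come from systemctl status test.service instead of the sample variable.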
Fixing the “PostgreSQL”-image
As mentioned at the beginning, I had that issue with “PostgreSQL” 9.4. To fix the “PostgreSQL” image I added a RUN command similar to the following one to my Dockerfile.
RUN sed -i -e "s/OOMScoreAdjust/# OOMScoreAdjust/" /usr/lib/systemd/system/postgresql-9.4.service
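The substitution above matches OOMScoreAdjust anywhere in a line. A slightly stricter variant, sketched here as a filter that reads a unit file on stdin, only touches lines that start with the directive. The function name comment_out_oom_score_adjust and the sample input are illustrative and not taken from the “PostgreSQL” package.

```shell
# Hypothetical filter: comment out OOMScoreAdjust= assignments in a
# unit file read from stdin. "&" in the replacement stands for the
# text matched by the pattern.
comment_out_oom_score_adjust() {
  sed -e 's/^OOMScoreAdjust=/# &/'
}

# Example with a made-up unit file fragment
printf '%s\n' '[Service]' 'Type=forking' 'OOMScoreAdjust=-1000' \
  | comment_out_oom_score_adjust
```

The same sed expression should also work with sed -i in a RUN command, in place of the one shown above.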
Software Packages with “OOMScoreAdjust”
To my knowledge the following packages contain a .service file which includes OOMScoreAdjust and can be installed on “CentOS 7.1”. There might be others as well.
Software Package | Service | Source |
---|---|---|
PostgreSQL 9.4 | postgresql-9.4.service | PostgreSQL-Project |
dbus 1.6.12 | dbus.service | CentOS-Repository |
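To check which unit files in an image carry the directive, the unit directories can be searched with grep. This is a minimal sketch: the function name list_oom_units is made up, and the two directories are the usual unit locations on “CentOS 7”, so adjust them if your image differs.

```shell
# Hypothetical helper: print every file below the given directories
# that contains an active (uncommented) OOMScoreAdjust= assignment.
list_oom_units() {
  grep -rl -e '^OOMScoreAdjust=' "$@" 2>/dev/null
}

list_oom_units /usr/lib/systemd/system /etc/systemd/system \
  || echo "no unit file sets OOMScoreAdjust"
```

Run inside the container, this lists the candidates that may need the same treatment as the “PostgreSQL” unit.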
Conclusion
If you are going to use “systemd”-enabled “docker” images with “systemd 208”, make sure that none of the involved service unit files contains OOMScoreAdjust. Otherwise the affected services won’t start in your container(s). Alternatively, try to use a newer version of “systemd” in your “docker” image.
Thanks for reading!
References
- RHEL7 bug tracker: Negative OOMScoreAdjust kills any process in container
- LXC bug tracker: OOMScoreAdjust= in dbus.service on systemd-based unprivileged containers
- Systemd bug tracker: OOMScoreAdjust for systemd-service-units inside containers
Updates
Update 2015-10-28 #1
- Changed the title of this article to see if the new title attracts more interest
- Added link to the systemd.exec manual
Update 2015-10-28 #2
- Added references to bug trackers