Per-Process Namespaces
Posted by Utumno on Wed 18 Feb 2009 at 11:34
dev_server# ./blob_install
Your Linux distribution 'Debian GNU/Linux 5.0 \n \l' is unsupported.
Supported distributions: Fedora 8, 9 and 10; Ubuntu 8.04 and 8.10. Exiting.
Attack
The Blob's only function is to check out pre-compiled software from one of our subcontractor's SVN servers and create a workspace (read: create a bunch of directories, move some files around, maybe set up a few links, check the availability of some tools). In short: it sets up an environment for development. It is purely console software: it has no graphical parts, certainly does not interact with any other processes or daemons, and does not use any advanced libraries, so it should easily work on Debian or any other reasonably recent distribution.
So let's install it anyway: run it under strace and see it open /etc/issue:
dev_server# cat /etc/issue
Debian GNU/Linux 5.0 \n \l
Now open blob_install with a hex editor and notice that it expects "Ubuntu 8.10 \n \l" there. Temporarily change /etc/issue and watch The Blob install flawlessly.
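If no hex editor is at hand, the expected string can also be located programmatically. The following is only a sketch of that idea: the function name find_in_file is made up for illustration, and it assumes the binary is small enough to read into memory and that glibc's memmem() is available.

```c
#define _GNU_SOURCE   /* for memmem() on glibc */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the byte offset of the first occurrence of `needle` in the file
 * at `path`, or -1 if it is absent or the file cannot be read.
 * find_in_file is a hypothetical helper, not part of any tool. */
long find_in_file(const char *path, const char *needle) {
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    long result = -1;
    char *buf = NULL;

    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size > 0 && fseek(f, 0, SEEK_SET) == 0
            && (buf = malloc((size_t)size)) != NULL
            && fread(buf, 1, (size_t)size, f) == (size_t)size) {
            /* Plain byte search; works on binaries containing NUL bytes. */
            char *hit = memmem(buf, (size_t)size, needle, strlen(needle));
            if (hit)
                result = (long)(hit - buf);
        }
    }

    free(buf);
    fclose(f);
    return result;
}
```

Calling find_in_file("blob_install", "Ubuntu 8.10") would then report where in the binary the distribution string lives, or -1 if it is not stored in plain form.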
Blob Strikes Back
At this point I thought I could just change /etc/issue back to its Debian version and start using The Blob. Unfortunately, I had underestimated its stubbornness: it checks the distribution every single time it performs any action.
It thus became clear that I would either have to keep /etc/issue permanently in its Ubuntu form (which I don't want to do) or fool The Blob in some more clever way.
Regrouping
Fortunately I stumbled upon Mike Hommey's excellent write-up about per-process namespaces.
Per-process namespaces let a process selectively 'unshare' resources that were shared with its parent at the time it was created. Among other things, this allows each process to have a different set of mount points. Combined with bind mounts, this enables some useful setups: we can create a new namespace, bind-mount something and run a process, and only this process (and its children) will see the new mount. Furthermore, as soon as the last process in the namespace exits, the namespace and the mounts inside it disappear.
Let's see an example. First we have to create a new namespace. This can be achieved with the help of Mike's helper program 'newns':
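#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
    /* Leave the parent's mount namespace; requires root. */
    if (syscall(SYS_unshare, CLONE_NEWNS) != 0) {
        perror("unshare");
        return 1;
    }
    if (argc > 1)
        execvp(argv[1], &argv[1]);
    else
        execl("/bin/sh", "sh", (char *)NULL);
    perror("exec"); /* only reached if exec failed */
    return 1;
}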
Here we used the unshare(2) syscall directly to create the namespace. We have to do it this way because Lenny's glibc does not provide a wrapper for it.
Compile this, create two files
dev_server# echo FIRST > first
dev_server# echo SECOND > second
and try
dev_server# ./newns
dev_server# mount -n --bind second first
dev_server# cat first
SECOND
Now in a different console
dev_server# cat first
FIRST
You can see that the bind mount is only seen by the process which called 'unshare' and its children. A more concise way to see the above is
dev_server# ./newns sh -c "mount -n --bind second first; cat first"
SECOND
dev_server# cat first
FIRST
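The unshare-then-bind-mount sequence above can also be done entirely from C with mount(2). This is a rough sketch rather than anything from Mike's write-up: the function name bind_in_private_ns is invented here, it assumes a glibc recent enough to provide an unshare() wrapper, and the MS_PRIVATE step is an addition for systems where / is mounted with shared propagation (there the bind mount would otherwise propagate back to the original namespace).

```c
#define _GNU_SOURCE   /* for unshare() and CLONE_NEWNS */
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>

/* Hypothetical helper: give the calling process its own mount namespace
 * and bind-mount `src` over `dst` inside it. Returns 0 on success, -1 on
 * failure. Needs root (CAP_SYS_ADMIN). */
int bind_in_private_ns(const char *src, const char *dst) {
    /* Step 1: stop sharing the mount table with the parent. */
    if (unshare(CLONE_NEWNS) != 0) {
        perror("unshare");
        return -1;
    }
    /* Step 2 (assumption: / may be a shared mount): make every mount in
     * this namespace private so later mounts do not propagate outwards. */
    if (mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0) {
        perror("mount MS_PRIVATE");
        return -1;
    }
    /* Step 3: the equivalent of `mount -n --bind src dst`. */
    if (mount(src, dst, NULL, MS_BIND, NULL) != 0) {
        perror("mount MS_BIND");
        return -1;
    }
    return 0;
}
```

A caller would run bind_in_private_ns("second", "first") and then exec the program that should see the changed file; every other process keeps seeing FIRST.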
Counterattack
Armed with this knowledge we can attack The Blob again:
1) create /etc/issue.ubuntu with the proper Ubuntu string inside
2) move The Blob's binaries to /usr/local/bin/theblob
3) create a script /usr/local/bin/ubuntize:
#!/bin/sh
# $0 may carry a directory, so pass only its basename on as the binary to run
exec /usr/local/bin/newns sh -c 'mount -n --bind /etc/issue.ubuntu /etc/issue; exec "/usr/local/bin/theblob/$0" "$@"' "$(basename "$0")" "$@"
4) for each binary, create a link:
dev_server# ln -s /usr/local/bin/ubuntize /usr/local/bin/blob_install
dev_server# ln -s /usr/local/bin/ubuntize /usr/local/bin/blob_checkout
dev_server# ln -s /usr/local/bin/ubuntize /usr/local/bin/blob_create
... and so on ...
And we're done! The Blob's binaries think they are running on an Ubuntu 8.10 system, and everything else continues to work normally. Victory!
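The symlink trick works because the wrapper can tell from the name it was invoked under which real binary to run. If one wanted to write the wrapper in C instead of shell, that lookup might be sketched as below. The names blob_target and BLOB_DIR are made up for illustration; only the directory /usr/local/bin/theblob comes from the steps above.

```c
#include <stdio.h>
#include <string.h>

/* Directory holding the real binaries, as in step 2 above. */
#define BLOB_DIR "/usr/local/bin/theblob"

/* Hypothetical helper: map the invoked name (argv[0], which may include a
 * directory part) to the real binary of the same name under BLOB_DIR.
 * Writes the result to `out`; returns 0 on success, -1 if the name is
 * empty or the result does not fit in `outlen` bytes. */
int blob_target(const char *argv0, char *out, size_t outlen) {
    /* Keep only what follows the last '/', i.e. the basename. */
    const char *slash = strrchr(argv0, '/');
    const char *name = slash ? slash + 1 : argv0;
    if (*name == '\0')
        return -1;

    int n = snprintf(out, outlen, "%s/%s", BLOB_DIR, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Invoked through the blob_install link, argv[0] is typically /usr/local/bin/blob_install, and blob_target turns that into /usr/local/bin/theblob/blob_install, which the wrapper can then exec after setting up the bind mount.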
Posted by Steve
I admit I'd have used a hex-editor myself, but this per-process stuff looks very very cool.
I can think of lots of uses for it already.
Posted by Utumno
For a more advanced example of what per-process namespaces can do, take a look here:
http://glandium.org/blog/?p=224
Could this be an approach to setting up a protected environment for guest users or processes?
Something like a cheap chroot?
The vserver tools do basically exactly what you want: create namespaces (called "security contexts" here, but basically the same thing), chroot, and run the "init" process of any kernel-compatible distro you copied over. There's also a sweet "newvserver" script which installs a fresh Debian (stable, unstable, your choice) inside a new vserver and applies some cosmetics so it starts and shuts down without error messages.
You can, however, use every piece on its own as well, which comes in handy sometimes. You can run any process in any context with "chcontext" (available only on the root server) or limit a process to certain network addresses ("chbind").
In this example I run "iotop" in context 1 (special superior context) to see all processes of all vservers:
chcontext --ctx 1 iotop
Thus seeing where all that disk IO comes from.
You might have a look at vserver, as described earlier.
I have a certain program that was GPL and is still in the Debian packages, but it has contained a segfault since 2005.
Since then the commercial version has fixed it, but the GPL one has not.
And now the commercial one is not maintained anymore. Great.
Otherwise, I'll try to track down the bug.
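fake_issue.c :
#include <fcntl.h>
#include <stdarg.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
/* Interposed open(): redirect /etc/issue to /etc/fake_issue. */
int
open (const char *pathname, int flags, ...) {
    int mode = 0;
    va_list ap;
    va_start (ap, flags);
    if (flags & O_CREAT) mode = va_arg (ap, int); /* mode is only passed with O_CREAT */
    va_end (ap);
    return syscall (SYS_open, !strcmp (pathname, "/etc/issue") ?
                    "/etc/fake_issue" : pathname, flags, mode);
}
$ gcc -shared -fPIC -o fake_issue.so fake_issue.c
$ LD_PRELOAD=$PWD/fake_issue.so bad_program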
--
Charles Darke
http://digitalconsumption.com