    pidns: Support unsharing the pid namespace. · 50804fe3
    Eric W. Biederman authored
    
    
    Unsharing of the pid namespace, unlike unsharing of other namespaces,
    does not take effect immediately.  Instead it affects the children
    created with fork and clone.  The first of these children becomes the init
    process of the new pid namespace, the rest become oddball children
    of pid 0.  From the point of view of the new pid namespace the process
    that created it is pid 0, as its pid does not map into that namespace.
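
    The behaviour described above can be observed from user space.  The
    following is a minimal sketch, not part of this patch: the helper
    name first_child_pid_after_unshare() is made up for illustration,
    and it assumes a kernel that supports unshare(CLONE_NEWPID) and a
    caller with CAP_SYS_ADMIN.  Without that privilege unshare fails
    and the helper returns -1.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: unshare the pid namespace, fork once, and report
 * the pid that the first child sees for itself.
 * Returns that pid, or -1 if unshare, fork or waitpid failed (for
 * example when the caller lacks CAP_SYS_ADMIN).
 */
int first_child_pid_after_unshare(void)
{
    pid_t before = getpid();

    if (unshare(CLONE_NEWPID) != 0) {
        perror("unshare(CLONE_NEWPID)");
        return -1;
    }

    /* The caller itself is not moved: its own pid must be unchanged. */
    if (getpid() != before)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        perror("fork");
        return -1;
    }
    if (child == 0) {
        /* Child: report our own view of our pid via the exit status. */
        _exit(getpid() & 0xff);
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

    When the unshare succeeds, the helper should return 1, since the
    first child forked afterwards is the init process of the new pid
    namespace, while the caller keeps the pid it had before.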
    
    A couple of different semantics were considered but this one was
    settled on because it is easy to implement and it is usable from
    pam modules, which are the core reason for the existence of unshare.
    
    I took a survey of the callers of pam modules and the following
    appears to be a representative sample of their logic.
    {
    	setup stuff including pam
    	child = fork();
    	if (!child) {
    		setuid();
    		exec /bin/bash
    	}
    	waitpid(child);

    	pam and other cleanup
    }
    
    As you can see there is a fork to create the unprivileged user
    space process, which means that the unprivileged user space
    process will appear as pid 1 in the new pid namespace.  Further,
    most login processes do not cope with extraneous children, which
    means shifting the duty of reaping extraneous child processes to
    the creator of those extraneous children makes the system more
    comprehensible.
    
    The practical reason for this set of pid namespace semantics is
    that they are simple to implement and to verify.  An
    implementation that requires changing the struct pid on a process
    comes with a lot more races and pain, not the least of which is
    that glibc caches getpid().
    
    These semantics are implemented by having two notions
    of the pid namespace of a process.  There is task_active_pid_ns,
    which is the pid namespace the process was created with
    and the pid namespace that all pids are presented to
    that process in.  The task_active_pid_ns is stored
    in the struct pid of the task.
    
    Then there is the pid namespace that will be used for children;
    that pid namespace is stored in task->nsproxy->pid_ns.
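
    To make the two notions concrete, here is a toy user space model.
    It is only a sketch under loud assumptions: struct toy_pid_ns,
    struct toy_task, toy_unshare_pidns() and toy_fork() are invented
    for illustration and are not kernel code, and the model tracks a
    task's pid in its innermost namespace only.

```c
#include <stddef.h>

/* Toy model only: these types are illustrative, not kernel structures. */
struct toy_pid_ns {
    int next_pid;                       /* next pid to hand out here */
};

struct toy_task {
    struct toy_pid_ns *active_ns;       /* models task_active_pid_ns(task) */
    struct toy_pid_ns *ns_for_children; /* models task->nsproxy->pid_ns */
    int pid_in_active_ns;               /* pid as seen in active_ns */
};

/*
 * Model of unshare: only the namespace used for future children is
 * replaced.  The caller's active namespace and pid are left untouched.
 */
static void toy_unshare_pidns(struct toy_task *t, struct toy_pid_ns *fresh)
{
    fresh->next_pid = 1;                /* first child will become pid 1 */
    t->ns_for_children = fresh;
}

/*
 * Model of fork: the child is created in the parent's ns_for_children,
 * which becomes both its active namespace and the namespace for its
 * own children.
 */
static struct toy_task toy_fork(struct toy_task *parent)
{
    struct toy_task child;

    child.active_ns = parent->ns_for_children;
    child.ns_for_children = parent->ns_for_children;
    child.pid_in_active_ns = child.active_ns->next_pid++;
    return child;
}
```

    In this model a task that calls toy_unshare_pidns() keeps its
    active_ns and pid, the first toy_fork() afterwards yields a child
    with pid 1 in the fresh namespace, and the next one yields pid 2.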
    
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>