1. 26 Jun, 2006 1 commit
  2. 25 Jun, 2006 1 commit
    • KaiGai Kohei's avatar
      [PATCH] pacct: add pacct_struct to fix some pacct bugs. · 0e464814
      KaiGai Kohei authored
      The pacct facility need an i/o operation when an accounting record is
      generated.  There is a possibility to wake OOM killer up.  If OOM killer is
      activated, it kills some processes to make them release process memory
      regions.
      
      But acct_process() is called in the killed processes context before calling
      exit_mm(), so those processes cannot release own memory.  In the results, any
      processes stop in this point and it finally cause a system stall.
      0e464814
  3. 23 Jun, 2006 2 commits
  4. 01 May, 2006 1 commit
  5. 20 Apr, 2006 1 commit
  6. 19 Apr, 2006 1 commit
  7. 15 Apr, 2006 1 commit
  8. 31 Mar, 2006 3 commits
    • Kirill Korotaev's avatar
      [PATCH] wrong error path in dup_fd() leading to oopses in RCU · 42862298
      Kirill Korotaev authored
      
      
      Wrong error path in dup_fd() - it should return NULL on error,
      not an address of already freed memory :/
      
      Triggered by OpenVZ stress test suite.
      
      What is interesting is that it was causing different oopses in RCU like
      below:
      Call Trace:
         [<c013492c>] rcu_do_batch+0x2c/0x80
         [<c0134bdd>] rcu_process_callbacks+0x3d/0x70
         [<c0126cf3>] tasklet_action+0x73/0xe0
         [<c01269aa>] __do_softirq+0x10a/0x130
         [<c01058ff>] do_softirq+0x4f/0x60
         =======================
         [<c0113817>] smp_apic_timer_interrupt+0x77/0x110
         [<c0103b54>] apic_timer_interrupt+0x1c/0x24
        Code:  Bad EIP value.
         <0>Kernel panic - not syncing: Fatal exception in interrupt
      Signed-Off-By: default avatarPavel Emelianov <xemul@sw.ru>
      Signed-Off-By: default avatarDmitry Mishin <dim@openvz.org>
      Signed-Off-By: default avatarKirill Korotaev <dev@openvz.org>
      Signed-Off-By: default avatarLinus Torvalds <torvalds@osdl.org>
      42862298
    • Eric W. Biederman's avatar
      [PATCH] pidhash: Refactor the pid hash table · 92476d7f
      Eric W. Biederman authored
      
      
      Simplifies the code, reduces the need for 4 pid hash tables, and makes the
      code more capable.
      
      In the discussions I had with Oleg it was felt that to a large extent the
      cleanup itself justified the work.  With struct pid being dynamically
      allocated meant we could create the hash table entry when the pid was
      allocated and free the hash table entry when the pid was freed.  Instead of
      playing with the hash lists when ever a process would attach or detach to a
      process.
      
      For myself the fact that it gave what my previous task_ref patch gave for free
      with simpler code was a big win.  The problem is that if you hold a reference
      to struct task_struct you lock in 10K of low memory.  If you do that in a user
      controllable way like /proc does, with an unprivileged but hostile user space
      application with typical resource limits of 1000 fds and 100 processes I can
      trigger the OOM killer by consuming all of low memory with task structs, on a
      machine wight 1GB of low memory.
      
      If I instead hold a reference to struct pid which holds a pointer to my
      task_struct, I don't suffer from that problem because struct pid is 2 orders
      of magnitude smaller.  In fact struct pid is small enough that most other
      kernel data structures dwarf it, so simply limiting the number of referring
      data structures is enough to prevent exhaustion of low memory.
      
      This splits the current struct pid into two structures, struct pid and struct
      pid_link, and reduces our number of hash tables from PIDTYPE_MAX to just one.
      struct pid_link is the per process linkage into the hash tables and lives in
      struct task_struct.  struct pid is given an indepedent lifetime, and holds
      pointers to each of the pid types.
      
      The independent life of struct pid simplifies attach_pid, and detach_pid,
      because we are always manipulating the list of pids and not the hash table.
      In addition in giving struct pid an indpendent life it makes the concept much
      more powerful.
      
      Kernel data structures can now embed a struct pid * instead of a pid_t and
      not suffer from pid wrap around problems or from keeping unnecessarily
      large amounts of memory allocated.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      92476d7f
    • Andrew Morton's avatar
      [PATCH] resurrect __put_task_struct · 158d9ebd
      Andrew Morton authored
      
      
      This just got nuked in mainline.  Bring it back because Eric's patches use it.
      
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      158d9ebd
  9. 29 Mar, 2006 9 commits
  10. 27 Mar, 2006 1 commit
  11. 26 Mar, 2006 2 commits
  12. 24 Mar, 2006 1 commit
    • Paul Jackson's avatar
      [PATCH] cpuset memory spread slab cache optimizations · c61afb18
      Paul Jackson authored
      
      
      The hooks in the slab cache allocator code path for support of NUMA
      mempolicies and cpuset memory spreading are in an important code path.  Many
      systems will use neither feature.
      
      This patch optimizes those hooks down to a single check of some bits in the
      current tasks task_struct flags.  For non NUMA systems, this hook and related
      code is already ifdef'd out.
      
      The optimization is done by using another task flag, set if the task is using
      a non-default NUMA mempolicy.  Taking this flag bit along with the
      PF_SPREAD_PAGE and PF_SPREAD_SLAB flag bits added earlier in this 'cpuset
      memory spreading' patch set, one can check for the combination of any of these
      special case memory placement mechanisms with a single test of the current
      tasks task_struct flags.
      
      This patch also tightens up the code, to save a few bytes of kernel text
      space, and moves some of it out of line.  Due to the nested inlines called
      from multiple places, we were ending up with three copies of this code, which
      once we get off the main code path (for local node allocation) seems a bit
      wasteful of instruction memory.
      Signed-off-by: default avatarPaul Jackson <pj@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      c61afb18
  13. 23 Mar, 2006 2 commits
    • Jens Axboe's avatar
    • Eric Dumazet's avatar
      [PATCH] Shrinks sizeof(files_struct) and better layout · 0c9e63fd
      Eric Dumazet authored
      
      
      1) Reduce the size of (struct fdtable) to exactly 64 bytes on 32bits
         platforms, lowering kmalloc() allocated space by 50%.
      
      2) Reduce the size of (files_struct), using a special 32 bits (or
         64bits) embedded_fd_set, instead of a 1024 bits fd_set for the
         close_on_exec_init and open_fds_init fields.  This save some ram (248
         bytes per task) as most tasks dont open more than 32 files.  D-Cache
         footprint for such tasks is also reduced to the minimum.
      
      3) Reduce size of allocated fdset.  Currently two full pages are
         allocated, that is 32768 bits on x86 for example, and way too much.  The
         minimum is now L1_CACHE_BYTES.
      
      UP and SMP should benefit from this patch, because most tasks will touch
      only one cache line when open()/close() stdin/stdout/stderr (0/1/2),
      (next_fd, close_on_exec_init, open_fds_init, fd_array[0 ..  2] being in the
      same cache line)
      Signed-off-by: default avatarEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      0c9e63fd
  14. 22 Mar, 2006 1 commit
  15. 18 Mar, 2006 1 commit
  16. 17 Mar, 2006 1 commit
  17. 14 Mar, 2006 1 commit
    • GOTO Masanori's avatar
      [PATCH] Fix sigaltstack corruption among cloned threads · f9a3879a
      GOTO Masanori authored
      
      
      This patch fixes alternate signal stack corruption among cloned threads
      with CLONE_SIGHAND (and CLONE_VM) for linux-2.6.16-rc6.
      
      The value of alternate signal stack is currently inherited after a call of
      clone(...  CLONE_SIGHAND | CLONE_VM).  But if sigaltstack is set by a
      parent thread, and then if multiple cloned child threads (+ parent threads)
      call signal handler at the same time, some threads may be conflicted -
      because they share to use the same alternative signal stack region.
      Finally they get sigsegv.  It's an undesirable race condition.  Note that
      child threads created from NPTL pthread_create() also hit this conflict
      when the parent thread uses sigaltstack, without my patch.
      
      To fix this problem, this patch clears the child threads' sigaltstack
      information like exec().  This behavior follows the SUSv3 specification.
      In SUSv3, pthread_create() says "The alternate stack shall not be inherited
      (when new threads are initialized)".  It means that sigaltstack should be
      cleared when sigaltstack memory space is shared by cloned threads with
      CLONE_SIGHAND.
      
      Note that I chose "if (clone_flags & CLONE_SIGHAND)" line because:
        - If clone_flags line is not existed, fork() does not inherit sigaltstack.
        - CLONE_VM is another choice, but vfork() does not inherit sigaltstack.
        - CLONE_SIGHAND implies CLONE_VM, and it looks suitable.
        - CLONE_THREAD is another candidate, and includes CLONE_SIGHAND + CLONE_VM,
          but this flag has a bit different semantics.
      I decided to use CLONE_SIGHAND.
      
      [ Changed to test for CLONE_VM && !CLONE_VFORK after discussion --Linus ]
      Signed-off-by: default avatarGOTO Masanori <gotom@sanori.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Acked-by: default avatarLinus Torvalds <torvalds@osdl.org>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      f9a3879a
  18. 11 Mar, 2006 1 commit
  19. 15 Feb, 2006 2 commits
    • Oleg Nesterov's avatar
      [PATCH] fix kill_proc_info() vs fork() theoretical race · dadac81b
      Oleg Nesterov authored
      
      
      copy_process:
      
      	attach_pid(p, PIDTYPE_PID, p->pid);
      	attach_pid(p, PIDTYPE_TGID, p->tgid);
      
      What if kill_proc_info(p->pid) happens in between?
      
      copy_process() holds current->sighand.siglock, so we are safe
      in CLONE_THREAD case, because current->sighand == p->sighand.
      
      Otherwise, p->sighand is unlocked, the new process is already
      visible to the find_task_by_pid(), but have a copy of parent's
      'struct pid' in ->pids[PIDTYPE_TGID].
      
      This means that __group_complete_signal() may hang while doing
      
      	do ... while (next_thread() != p)
      
      We can solve this problem if we reverse these 2 attach_pid()s:
      
      	attach_pid() does wmb()
      
      	group_send_sig_info() calls spin_lock(), which
      	provides a read barrier. // Yes ?
      
      I don't think we can hit this race in practice, but still.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      dadac81b
    • Oleg Nesterov's avatar
      [PATCH] fix kill_proc_info() vs CLONE_THREAD race · 3f17da69
      Oleg Nesterov authored
      
      
      There is a window after copy_process() unlocks ->sighand.siglock
      and before it adds the new thread to the thread list.
      
      In that window __group_complete_signal(SIGKILL) will not see the
      new thread yet, so this thread will start running while the whole
      thread group was supposed to exit.
      
      I beleive we have another good reason to place attach_pid(PID/TGID)
      under ->sighand.siglock. We can do the same for
      
      	release_task()->__unhash_process()
      
      	de_thread()->switch_exec_pids()
      
      After that we don't need tasklist_lock to iterate over the thread
      list, and we can simplify things, see for example do_sigaction()
      or sys_times().
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      3f17da69
  20. 08 Feb, 2006 5 commits
  21. 01 Feb, 2006 1 commit
  22. 12 Jan, 2006 1 commit