1. 01 Sep, 2006 1 commit
    • [PATCH] task delay accounting fixes · 35df17c5
      Shailabh Nagar authored
      Cleanup allocation and freeing of tsk->delays used by delay accounting.
      This solves two problems reported for delay accounting:
      
      1. oops in __delayacct_blkio_ticks
      http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1844.html
      
      Currently tsk->delays is freed too early during task exit, so a read of
      /proc/<tgid>/stats can dereference a NULL tsk->delays.  The patch fixes
      this by freeing tsk->delays closer to the point where task_struct itself
      is freed.  As a result, it also eliminates tsk->delays_lock, which was
      only being used (inadequately) to safeguard access to tsk->delays while
      a task was exiting.
      
      2. Possible memory leak in kernel/delayacct.c
      http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1389.html
      
      
      
      The patch frees the tsk->delays allocation after a failed fork; this
      cleanup was missing earlier.
      
      The patch has been tested to fix the problems listed above and stress
      tested with rapid calls to delay accounting's taskstats command interface
      (which is the other path that can access the same data, besides the /proc
      interface causing the oops above).
      Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
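The lifetime rule described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `struct task`, `struct delay_info`, and every function name below are hypothetical stand-ins, and `calloc`/`free` take the place of the kernel allocators. The point it shows is that the delay record is released only together with the owning structure, and that a failed creation leaves nothing behind.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the kernel's code. */
struct delay_info {
    unsigned long blkio_ticks;
};

struct task {
    int exited;                 /* set once the task has run its exit path */
    struct delay_info *delays;  /* owned by the task for its whole lifetime */
};

/* Allocate a task and its delay record; undo everything on failure. */
struct task *task_create(void)
{
    struct task *t = calloc(1, sizeof(*t));
    if (!t)
        return NULL;
    t->delays = calloc(1, sizeof(*t->delays));
    if (!t->delays) {           /* the "bad fork" path: no partial state leaks */
        free(t);
        return NULL;
    }
    return t;
}

/* Exit does not free t->delays, so late readers still see valid data. */
void task_exit(struct task *t)
{
    t->exited = 1;
}

/* A reader such as a /proc handler; safe while the task structure exists. */
unsigned long task_read_blkio_ticks(const struct task *t)
{
    return t->delays ? t->delays->blkio_ticks : 0;
}

/* The delay record is released together with the task structure itself. */
void task_free(struct task *t)
{
    assert(t != NULL);
    free(t->delays);
    free(t);
}
```

Because the record and its owner share one lifetime in this sketch, no separate lock is needed just to guard the pointer against an early free.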
  2. 15 Jul, 2006 4 commits
    • [PATCH] per-task delay accounting taskstats interface: control exit data through cpumasks · f9fd8914
      Shailabh Nagar authored
      
      
      On systems with a large number of cpus, with even a modest rate of tasks
      exiting per cpu, the volume of taskstats data sent on thread exit can
      overflow a userspace listener's buffers.
      
      One approach to avoiding overflow is to allow listeners to get data for a
      limited and specific set of cpus.  By scaling the number of listeners
      and/or the cpus they monitor, userspace can handle the statistical data
      overload more gracefully.
      
      In this patch, each listener registers to listen to a specific set of cpus
      by specifying a cpumask.  The interest is recorded per-cpu.  When a task
      exits on a cpu, its taskstats data is unicast to each listener interested
      in that cpu.
      
      Thanks to Andrew Morton for pointing out the various scalability and
      general concerns of previous attempts and for suggesting this design.
      
      [akpm@osdl.org: build fix]
      Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
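A rough user-space sketch of the per-cpu registration scheme described above follows. The names (`register_listener`, `notify_exit`, `NR_CPUS`, `MAX_LISTENERS`), the 32-bit mask, and the fixed-size arrays are all assumptions made for illustration; the kernel uses its own cpumask type and netlink unicast, neither of which is modelled here.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8           /* hypothetical small system */
#define MAX_LISTENERS 4     /* hypothetical per-cpu limit */

/* Per-cpu list of listener ids interested in exits on that cpu. */
struct cpu_listeners {
    int pids[MAX_LISTENERS];
    int count;
};

static struct cpu_listeners listeners[NR_CPUS];

/* Record the interest of listener `pid` in every cpu set in `mask`
 * (bit n stands for cpu n). Returns 0 on success, or -1 without
 * changing anything if any requested cpu's list is already full. */
int register_listener(int pid, uint32_t mask)
{
    /* First pass: refuse the whole request if any requested cpu is full. */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if ((mask & (1u << cpu)) && listeners[cpu].count == MAX_LISTENERS)
            return -1;
    /* Second pass: record the interest per cpu. */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            listeners[cpu].pids[listeners[cpu].count++] = pid;
    return 0;
}

/* On task exit, deliver to each listener registered for that cpu.
 * Copies the recipient ids into `out` and returns how many there are. */
int notify_exit(int cpu, int out[MAX_LISTENERS])
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    for (int i = 0; i < listeners[cpu].count; i++)
        out[i] = listeners[cpu].pids[i];
    return listeners[cpu].count;
}
```

Keeping the interest per cpu means the exit path only walks the listeners for the cpu where the task exited, so adding listeners for other cpus does not add work there.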
    • [PATCH] delay accounting taskstats interface send tgid once · ad4ecbcb
      Shailabh Nagar authored
      
      
      Send per-tgid data only once during exit of a thread group instead of once
      with each member thread exit.
      
      Currently, when a thread exits, besides its per-tid data, the per-tgid data
      of its thread group is also sent out, if its thread group is non-empty.
      The per-tgid data sent consists of the sum of per-tid stats for all
      *remaining* threads of the thread group.
      
      This patch modifies this sending in two ways:
      
      - the per-tgid data is sent only when the last thread of a thread group
        exits.  This cuts down heavily on the overhead of sending/receiving
        per-tgid data, especially when other exploiters of the taskstats
        interface aren't interested in per-tgid stats
      
      - the semantics of the per-tgid data sent are changed.  Instead of being
        the sum of per-tid data for remaining threads, the value now sent is the
        true total accumulated statistics for all threads that are/were part of
        the thread group.
      
      The patch also addresses a minor issue where the failure of one accounting
      subsystem to fill in the taskstats structure prevented the taskstats data
      from being sent at all.
      
      The patch has been tested for stability, including running cerberus for
      over 4 hours on an SMP system.
      
      [akpm@osdl.org: bugfixes]
      Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
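The changed per-tgid semantics can be sketched in user space as follows. This is only an illustration under assumptions: `struct stats` is reduced to one hypothetical field, and `thread_exit` is an invented helper, not a kernel function. Each exiting thread folds its statistics into a group total, and the total is reported only when the last thread leaves.

```c
#include <assert.h>

/* Hypothetical, heavily reduced statistics record. */
struct stats {
    unsigned long cpu_delay;
};

struct thread_group {
    int live_threads;       /* threads that have not exited yet */
    struct stats total;     /* running total over exited threads */
};

/* Called when one thread exits. Folds the thread's stats into the group
 * total. Returns 1 and fills *out only when the last thread exits;
 * otherwise returns 0 and leaves *out untouched. */
int thread_exit(struct thread_group *tg, const struct stats *tid_stats,
                struct stats *out)
{
    assert(tg->live_threads > 0);
    tg->total.cpu_delay += tid_stats->cpu_delay;
    if (--tg->live_threads > 0)
        return 0;           /* per-tid data only; no per-tgid message yet */
    *out = tg->total;       /* covers every thread that was ever in the group */
    return 1;
}
```

With three threads reporting 5, 7, and 11, the first two exits produce no group message and the third reports 23, the sum over all threads that were part of the group.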
    • [PATCH] per-task-delay-accounting: taskstats interface · c757249a
      Shailabh Nagar authored
      
      
      Create a "taskstats" interface based on generic netlink (NETLINK_GENERIC
      family), for getting statistics of tasks and thread groups during their
      lifetime and when they exit.  The interface is intended for use by multiple
      accounting packages though it is being created in the context of delay
      accounting.
      
      This patch creates the interface without populating the fields of the data
      that is sent to the user in response to a command or upon the exit of a task.
      Each accounting package interested in using taskstats has to provide an
      additional patch to add its stats to the common structure.
      
      [akpm@osdl.org: cleanups, Kconfig fix]
      Signed-off-by: Shailabh Nagar <nagar@us.ibm.com>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
      Cc: Erich Focht <efocht@ess.nec.de>
      Cc: Levent Serinol <lserinol@gmail.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] per-task-delay-accounting: setup · ca74e92b
      Shailabh Nagar authored
      
      
      Initialization code for the collection of per-task "delay" statistics, which
      measure how long a task had to wait for cpu, sync block io, swapping, etc.
      The collection of statistics and the interface are in other patches.  This
      patch sets up the data structures and allows the statistics collection to be
      disabled through a kernel boot parameter.
      Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
      Cc: Erich Focht <efocht@ess.nec.de>
      Cc: Levent Serinol <lserinol@gmail.com>
      Cc: Jay Lan <jlan@engr.sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. 03 Jul, 2006 2 commits
    • [PATCH] sched: cleanup, remove task_t, convert to struct task_struct · 36c8b586
      Ingo Molnar authored
      
      
      cleanup: remove task_t and convert all the uses to struct task_struct. I
      introduced it for the scheduler anno and it was a mistake.
      
      Conversion was mostly scripted, the result was reviewed and all
      secondary whitespace and style impact (if any) was fixed up by hand.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] lockdep: better lock debugging · 9a11b49a
      Ingo Molnar authored
      
      
      Generic lock debugging:
      
       - generalized lock debugging framework. For example, a bug in one lock
         subsystem turns off debugging in all lock subsystems.
      
       - got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from
         the mutex/rtmutex debugging code: it caused way too much prototype
         hackery, and lockdep will give the same information anyway.
      
       - ability to do silent tests
      
       - check lock freeing in vfree too.
      
       - more fine-grained debugging options, to allow distributions to
         turn off more expensive debugging features.
      
      There's no separate 'held mutexes' list anymore - but there's a 'held locks'
      stack within lockdep, which unifies deadlock detection across all lock
      classes.  (this is independent of the lockdep validation stuff - lockdep first
      checks whether we are holding a lock already)
      
      Here are the current debugging options:
      
      CONFIG_DEBUG_MUTEXES=y
      CONFIG_DEBUG_LOCK_ALLOC=y
      
      which do:
      
       config DEBUG_MUTEXES
               bool "Mutex debugging, basic checks"

       config DEBUG_LOCK_ALLOC
               bool "Detect incorrect freeing of live mutexes"
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
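The "held locks" stack mentioned in this commit can be pictured with the following user-space sketch. It is a simplification under assumptions, not lockdep itself: `struct held_stack`, `track_acquire`, `track_release`, and the `MAX_HELD` limit are hypothetical, and locks are identified only by address. It shows the first check described above, namely whether the lock being taken is already held.

```c
#include <assert.h>

#define MAX_HELD 16         /* hypothetical depth limit */

/* One stack of currently held locks, shared by all lock types. */
struct held_stack {
    const void *locks[MAX_HELD];
    int depth;
};

/* Record an acquisition. Returns 0 on success, or -1 if the lock is
 * already held (a self-deadlock) or the stack is full. */
int track_acquire(struct held_stack *hs, const void *lock)
{
    assert(hs->depth >= 0 && hs->depth <= MAX_HELD);
    for (int i = 0; i < hs->depth; i++)
        if (hs->locks[i] == lock)
            return -1;
    if (hs->depth == MAX_HELD)
        return -1;
    hs->locks[hs->depth++] = lock;
    return 0;
}

/* Record a release; locks need not be released in stack order.
 * Returns 0 on success, or -1 if the lock was not held. */
int track_release(struct held_stack *hs, const void *lock)
{
    for (int i = hs->depth - 1; i >= 0; i--) {
        if (hs->locks[i] != lock)
            continue;
        for (int j = i; j < hs->depth - 1; j++)
            hs->locks[j] = hs->locks[j + 1];   /* close the gap */
        hs->depth--;
        return 0;
    }
    return -1;
}
```

Since the stack stores plain addresses, the same bookkeeping serves any kind of lock, which is the sense in which a single list replaces a separate per-type "held mutexes" list.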
  4. 30 Jun, 2006 1 commit
  5. 28 Jun, 2006 2 commits
  6. 26 Jun, 2006 1 commit
  7. 25 Jun, 2006 4 commits
  8. 23 Jun, 2006 1 commit
  9. 17 Jun, 2006 1 commit
  10. 01 May, 2006 1 commit
  11. 19 Apr, 2006 1 commit
  12. 11 Apr, 2006 1 commit
    • [PATCH] splice: add direct fd <-> fd splicing support · b92ce558
      Jens Axboe authored
      
      
      It's more efficient for sendfile() emulation. Basically we cache an
      internal private pipe and just use that as the intermediate area for
      pages. Direct splicing is not available from sys_splice(); it is only
      meant to be used for sendfile() emulation.
      
      Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at
      exit for the normal fast path.
      Signed-off-by: Jens Axboe <axboe@suse.de>
  13. 31 Mar, 2006 1 commit
    • [PATCH] task: RCU protect task->usage · 8c7904a0
      Eric W. Biederman authored
      
      
      A big problem with rcu protected data structures that are also reference
      counted is that you must jump through several hoops to increase the reference
      count.  I think someone finally implemented atomic_inc_not_zero(&count) to
      automate the common case.  Unfortunately this means you must special case the
      rcu access case.
      
      When data structures are only visible via rcu in a manner that is not
      determined by the reference count on the object (i.e.  tasks are visible until
      their zombies are reaped) there is a much simpler technique we can employ:
      simply delay the decrement of the reference count until the rcu interval is
      over.
      
      What that means is that the proc code that looks up a task and later
      wants to sleep can now do:
      
      rcu_read_lock();
      task = find_task_by_pid(some_pid);
      if (task) {
      	get_task_struct(task);
      }
      rcu_read_unlock();
      
      The effect on the rest of the kernel is that put_task_struct becomes cheaper
      and immediate: in the case where the task has been reaped it frees the
      task immediately instead of unnecessarily waiting until the rcu interval is
      over.
      
      Cleanup of task_struct does not happen when its reference count drops to
      zero, instead cleanup happens when release_task is called.  Tasks can only
      be looked up via rcu before release_task is called.  All rcu protected
      members of task_struct are freed by release_task.
      
      Therefore we can move call_rcu from put_task_struct into release_task.  And
      we can modify release_task to not immediately release the reference count
      but instead have it call put_task_struct from the function it gives to
      call_rcu.
      
      The end result:
      
      - get_task_struct is safe in an rcu context where we have just looked
        up the task.
      
      - put_task_struct() simplifies into its old pre rcu self.
      
      This reorganization also makes put_task_struct uncallable from modules, as
      it is not exported.  It does not appear to be called from any modules, so
      this should not be an issue, and is trivially fixed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
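The reference-counting scheme in this commit can be modelled in user space, with the caveat that nothing below is real RCU. The grace period is simulated by an explicit `grace_period_end` call, `freed` is a flag rather than an actual free, and all names (`struct task`, `get_task`, `put_task`, `lookup`) are hypothetical simplifications of the kernel functions the message mentions. The sketch shows why a reader that found the task before it was released can still take a reference safely: the task's own reference is dropped only after the grace period.

```c
#include <assert.h>
#include <stddef.h>

struct task {
    int usage;          /* reference count */
    int visible;        /* still findable by lookups */
    int rcu_pending;    /* a deferred put is queued */
    int freed;          /* set instead of calling free(), for inspection */
};

void get_task(struct task *t)
{
    t->usage++;
}

/* Plain, immediate put: the "old pre rcu self" of put_task_struct. */
void put_task(struct task *t)
{
    assert(t->usage > 0);
    if (--t->usage == 0)
        t->freed = 1;
}

/* Analogue of release_task: hide the task from lookups, but defer
 * dropping the task's own reference until the grace period ends. */
void release_task(struct task *t)
{
    t->visible = 0;
    t->rcu_pending = 1;
}

/* Stands in for the callback given to call_rcu. */
void grace_period_end(struct task *t)
{
    if (t->rcu_pending) {
        t->rcu_pending = 0;
        put_task(t);
    }
}

/* Lookup as it would be done inside a read-side critical section. */
struct task *lookup(struct task *t)
{
    return t->visible ? t : NULL;
}
```

In this model the count cannot reach zero while a reader that started before `release_task` is still inside its read-side section, so `get_task` needs no increment-if-not-zero special case.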
  14. 29 Mar, 2006 14 commits
  15. 27 Mar, 2006 2 commits
  16. 23 Mar, 2006 1 commit
  17. 18 Mar, 2006 1 commit
  18. 21 Feb, 2006 1 commit