1. 08 Jun, 2011 1 commit
    • writeback: split inode_wb_list_lock into bdi_writeback.list_lock · f758eeab
      Christoph Hellwig authored
      Split the global inode_wb_list_lock into a per-bdi_writeback list_lock,
      as it's currently the most contended lock in the system for metadata
      heavy workloads.  It won't help for single-filesystem workloads for
      which we'll need the I/O-less balance_dirty_pages, but at least we
      can dedicate a cpu to spinning on each bdi now for larger systems.
      
      Based on earlier patches from Nick Piggin and Dave Chinner.
      
      It reduces lock contentions to 1/4 in this test case:
      10 HDD JBOD, 100 dd on each disk, XFS, 6GB ram
      
      lock_stat version 0.3
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                    class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      vanilla 2.6.39-rc3:
                            inode_wb_list_lock:         42590          44433           0.12         147.74      144127.35         252274         886792           0.08         121.34      917211.23
                            ------------------
                            inode_wb_list_lock              2          [<ffffffff81165da5>] bdev_inode_switch_bdi+0x29/0x85
                            inode_wb_list_lock             34          [<ffffffff8115bd0b>] inode_wb_list_del+0x22/0x49
                            inode_wb_list_lock          12893          [<ffffffff8115bb53>] __mark_inode_dirty+0x170/0x1d0
                            inode_wb_list_lock          10702          [<ffffffff8115afef>] writeback_single_inode+0x16d/0x20a
                            ------------------
                            inode_wb_list_lock              2          [<ffffffff81165da5>] bdev_inode_switch_bdi+0x29/0x85
                            inode_wb_list_lock             19          [<ffffffff8115bd0b>] inode_wb_list_del+0x22/0x49
                            inode_wb_list_lock           5550          [<ffffffff8115bb53>] __mark_inode_dirty+0x170/0x1d0
                            inode_wb_list_lock           8511          [<ffffffff8115b4ad>] writeback_sb_inodes+0x10f/0x157
      
      2.6.39-rc3 + patch:
                      &(&wb->list_lock)->rlock:         11383          11657           0.14         151.69       40429.51          90825         527918           0.11         145.90      556843.37
                      ------------------------
                      &(&wb->list_lock)->rlock             10          [<ffffffff8115b189>] inode_wb_list_del+0x5f/0x86
                      &(&wb->list_lock)->rlock           1493          [<ffffffff8115b1ed>] writeback_inodes_wb+0x3d/0x150
                      &(&wb->list_lock)->rlock           3652          [<ffffffff8115a8e9>] writeback_sb_inodes+0x123/0x16f
                      &(&wb->list_lock)->rlock           1412          [<ffffffff8115a38e>] writeback_single_inode+0x17f/0x223
                      ------------------------
                      &(&wb->list_lock)->rlock              3          [<ffffffff8110b5af>] bdi_lock_two+0x46/0x4b
                      &(&wb->list_lock)->rlock              6          [<ffffffff8115b189>] inode_wb_list_del+0x5f/0x86
                      &(&wb->list_lock)->rlock           2061          [<ffffffff8115af97>] __mark_inode_dirty+0x173/0x1cf
                      &(&wb->list_lock)->rlock           2629          [<ffffffff8115a8e9>] writeback_sb_inodes+0x123/0x16f
      
      hughd@google.com: fix recursive lock when bdi_lock_two() is called with new the same as old
      akpm@linux-foundation.org: cleanup bdev_inode_switch_bdi() comment
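
      As a rough illustration of the structural change (a minimal sketch, not
      the actual patch; only the fields relevant here are shown, and the
      requeue_dirty helper name is illustrative):

              /* Before: one global lock serializes every bdi's dirty lists. */
              spinlock_t inode_wb_list_lock;

              /* After: each bdi's writeback state carries its own lock, so
               * dirty-list manipulation on different devices no longer
               * contends on a single lock. */
              struct bdi_writeback {
                      struct backing_dev_info *bdi;
                      spinlock_t list_lock;   /* protects the b_* lists below */
                      struct list_head b_dirty;
                      struct list_head b_io;
                      struct list_head b_more_io;
              };

              /* Callers that used to take the global lock take the per-bdi
               * one instead, e.g. when re-dirtying an inode: */
              static void requeue_dirty(struct inode *inode, struct bdi_writeback *wb)
              {
                      assert_spin_locked(&wb->list_lock);
                      list_move(&inode->i_wb_list, &wb->b_dirty);
              }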
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      f758eeab
  2. 27 May, 2011 1 commit
  3. 25 May, 2011 3 commits
    • vmscan: change shrinker API by passing shrink_control struct · 1495f230
      Ying Han authored
      Change each shrinker's API by consolidating the existing parameters into
      a shrink_control struct.  This simplifies adding further parameters later
      without having to touch every shrinker implementation.
      
      [akpm@linux-foundation.org: fix build]
      [akpm@linux-foundation.org: fix warning]
      [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
      [akpm@linux-foundation.org: fix xfs warning]
      [akpm@linux-foundation.org: update gfs2]
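
      For illustration, the consolidated callback looks roughly like this
      (a sketch; my_shrink, my_cache_count and my_cache_free are hypothetical
      names, not part of the patch):

              /* Parameters that used to be passed individually to every
               * shrinker are bundled into one struct, so new fields can be
               * added later without touching each shrinker's signature. */
              struct shrink_control {
                      gfp_t gfp_mask;
                      unsigned long nr_to_scan;       /* objects to try to free */
              };

              /* old: static int my_shrink(struct shrinker *s, int nr_to_scan,
               *                           gfp_t gfp_mask);                     */
              static int my_shrink(struct shrinker *s, struct shrink_control *sc)
              {
                      if (!sc->nr_to_scan)
                              return my_cache_count();        /* just report size */
                      return my_cache_free(sc->nr_to_scan);   /* reclaim objects  */
              }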
      Signed-off-by: Ying Han <yinghan@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: Pavel Emelyanov <xemul@openvz.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1495f230
    • mm: Convert i_mmap_lock to a mutex · 3d48ae45
      Peter Zijlstra authored
      Straightforward conversion of i_mmap_lock to a mutex.
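
      In essence (a minimal sketch of the conversion):

              struct address_space {
                      /* ... */
                      /* spinlock_t i_mmap_lock;      -- was a spinlock       */
                      struct mutex  i_mmap_mutex;     /* now a sleeping mutex */
                      /* ... */
              };

              /* and at the call sites (illustrative):
               *   spin_lock(&mapping->i_mmap_lock);   -> mutex_lock(&mapping->i_mmap_mutex);
               *   spin_unlock(&mapping->i_mmap_lock); -> mutex_unlock(&mapping->i_mmap_mutex);
               */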
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3d48ae45
    • mm: Remove i_mmap_lock lockbreak · 97a89413
      Peter Zijlstra authored
      Hugh says:
       "The only significant loser, I think, would be page reclaim (when
        concurrent with truncation): could spin for a long time waiting for
        the i_mmap_mutex it expects would soon be dropped? "
      
      Counter points:
       - cpu contention makes the spin stop (need_resched())
       - zap pages should be freeing pages at a higher rate than reclaim
         ever can
      
      I think the simplification of the truncate code is definitely worth it.
      
      Effectively reverts: 2aa15890 ("mm: prevent concurrent
      unmap_mapping_range() on the same inode") and takes out the code that
      caused its problem.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      97a89413
  4. 22 May, 2011 1 commit
  5. 05 Apr, 2011 1 commit
    • fs: export empty_aops · 7dcda1c9
      Jens Axboe authored
      With the ->sync_page() hook gone, we have a few users that
      add their own static address_space_operations without any
      functions defined.
      
      fs/inode.c already has an empty_aops that it uses for init
      purposes. Let's export that and use it in the places where
      an otherwise empty aops was defined.
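
      A minimal sketch of the export and a typical use (the alloc helper
      below is hypothetical, just to show the intent):

              /* fs/inode.c: the formerly file-local empty aops becomes
               * visible to filesystems that need a do-nothing table. */
              const struct address_space_operations empty_aops = {
              };
              EXPORT_SYMBOL(empty_aops);

              /* a user that previously declared its own empty table: */
              static struct inode *example_alloc_inode(struct super_block *sb)
              {
                      struct inode *inode = new_inode(sb);

                      if (inode)
                              inode->i_mapping->a_ops = &empty_aops;
                      return inode;
              }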
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      7dcda1c9
  6. 28 Mar, 2011 1 commit
  7. 25 Mar, 2011 8 commits
    • fs: simplify iget & friends · 0b2d0724
      Christoph Hellwig authored
      Merge get_new_inode/get_new_inode_fast into iget5_locked/iget_locked
      as those were the only callers.  Remove the internal ifind/ifind_fast
      helpers - ifind_fast only had a single caller, and ifind had two
      callers wanting it to do different things.  Also clean up the comments
      in this area to focus on information important to a developer trying
      to use it, instead of overloading them with implementation details.
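
      Roughly, the merged fast path now reads like this (heavily simplified
      sketch; locking and I_NEW handling are omitted, and insert_new_inode is
      a made-up name for the folded-in slow path):

              struct inode *iget_locked(struct super_block *sb, unsigned long ino)
              {
                      struct inode *inode;

                      /* fast path: already hashed (this was the ifind_fast helper) */
                      inode = find_inode_fast(sb, inode_hashtable + hash(sb, ino), ino);
                      if (inode)
                              return inode;

                      /* slow path: allocate and insert, re-checking for a racing
                       * insertion (this was the get_new_inode_fast helper) */
                      return insert_new_inode(sb, ino);
              }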
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      0b2d0724
    • fs: rename inode_lock to inode_hash_lock · 67a23c49
      Dave Chinner authored
      All that remains of the inode_lock is protecting the inode hash list
      manipulation and traversals. Rename the inode_lock to
      inode_hash_lock to reflect its actual function.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      67a23c49
    • fs: move i_wb_list out from under inode_lock · a66979ab
      Dave Chinner authored
      Protect the inode writeback list with a new global lock
      inode_wb_list_lock and use it to protect the list manipulations and
      traversals. This lock replaces the inode_lock as the inodes on the
      list can be validity checked while holding the inode->i_lock and
      hence the inode_lock is no longer needed to protect the list.
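
      The resulting pattern, roughly (sketch; the walker function is
      illustrative, not the actual code):

              static void example_walk_wb_list(struct bdi_writeback *wb)
              {
                      struct inode *inode;

                      spin_lock(&inode_wb_list_lock);         /* protects the list */
                      list_for_each_entry(inode, &wb->b_io, i_wb_list) {
                              spin_lock(&inode->i_lock);      /* protects i_state  */
                              if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
                                      spin_unlock(&inode->i_lock);
                                      continue;
                              }
                              /* ... hand the inode to writeback ... */
                              spin_unlock(&inode->i_lock);
                      }
                      spin_unlock(&inode_wb_list_lock);
              }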
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a66979ab
    • fs: move i_sb_list out from under inode_lock · 55fa6091
      Dave Chinner authored
      Protect the per-sb inode list with a new global lock
      inode_sb_list_lock and use it to protect the list manipulations and
      traversals. This lock replaces the inode_lock as the inodes on the
      list can be validity checked while holding the inode->i_lock and
      hence the inode_lock is no longer needed to protect the list.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      55fa6091
    • fs: remove inode_lock from iput_final and prune_icache · f283c86a
      Dave Chinner authored
      Now that inode state changes are protected by the inode->i_lock and
      the inode LRU manipulations by the inode_lru_lock, we can remove the
      inode_lock from prune_icache and the initial part of iput_final().
      
      Instead of using the inode_lock to protect the inode during
      iput_final, use the inode->i_lock instead. This protects the inode
      against new references being taken while we change the inode state
      to I_FREEING, as well as preventing prune_icache from grabbing the
      inode while we are manipulating it. Hence we no longer need the
      inode_lock in iput_final prior to setting I_FREEING on the inode.
      
      For prune_icache, we no longer need the inode_lock to protect the
      LRU list, and the inodes themselves are protected against freeing
      races by the inode->i_lock. Hence we can lift the inode_lock from
      prune_icache as well.
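
      In outline (a simplified sketch of the reworked iput_final; can_cache
      stands in for the real retain-or-drop decision):

              static void iput_final(struct inode *inode)
              {
                      /* called with inode->i_lock held once i_count hits zero */
                      if (can_cache(inode)) {
                              inode->i_state |= I_REFERENCED;
                              inode_lru_list_add(inode);
                              spin_unlock(&inode->i_lock);
                              return;
                      }

                      /* Once I_FREEING is set under i_lock, no new references
                       * can be taken and prune_icache will skip the inode, so
                       * the old inode_lock is no longer needed here. */
                      inode->i_state |= I_FREEING;
                      inode_lru_list_del(inode);
                      spin_unlock(&inode->i_lock);

                      evict(inode);
              }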
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      f283c86a
    • fs: Lock the inode LRU list separately · 02afc410
      Dave Chinner authored
      Introduce the inode_lru_lock to protect the inode_lru list. This
      lock is nested inside the inode->i_lock to allow the inode to be
      added to the LRU list in iput_final without needing to deal with
      lock inversions. This keeps iput_final() clean and neat.
      
      Further, when marking the inode I_FREEING and removing it from the
      LRU, the LRU list manipulation is moved inside the inode->i_lock to
      keep it consistent with iput_final. This also means
      that most of the open coded LRU list removal + unused inode
      accounting can now use the inode_lru_list_del() wrappers which
      cleans the code up further.
      
      However, this locking change means that the LRU traversal in
      prune_icache() inverts this lock ordering and needs to use trylock
      semantics on the inode->i_lock to avoid deadlocking. In these cases,
      if we fail to lock the inode we move it to the back of the LRU to
      prevent spinning on it.
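
      The inverted-ordering case then looks roughly like this (sketch;
      prune_icache_sketch is an illustrative stand-in for prune_icache):

              static void prune_icache_sketch(int nr_to_scan)
              {
                      spin_lock(&inode_lru_lock);
                      while (nr_to_scan--) {
                              struct inode *inode;

                              if (list_empty(&inode_lru))
                                      break;
                              inode = list_entry(inode_lru.next, struct inode, i_lru);

                              /* The LRU walk takes inode_lru_lock first, inverting
                               * the normal i_lock -> inode_lru_lock order, so only
                               * a trylock on i_lock is deadlock-safe here. */
                              if (!spin_trylock(&inode->i_lock)) {
                                      /* busy: rotate to the back so we don't spin on it */
                                      list_move_tail(&inode->i_lru, &inode_lru);
                                      continue;
                              }
                              /* ... check i_count/i_state and evict if truly unused ... */
                              spin_unlock(&inode->i_lock);
                      }
                      spin_unlock(&inode_lru_lock);
              }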
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      02afc410
    • fs: factor inode disposal · b2b2af8e
      Dave Chinner authored
      We have a couple of places that dispose of inodes.  Factor the
      disposal into evict() to isolate this code and make it simpler to
      peel away the inode_lock from the code.
      
      While doing this, change the logic flow in iput_final() to separate
      the different cases that need to be handled to make the transitions
      the inode goes through more obvious.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      b2b2af8e
    • fs: protect inode->i_state with inode->i_lock · 250df6ed
      Dave Chinner authored
      Protect inode state transitions and validity checks with the
      inode->i_lock. This enables us to make inode state transitions
      independently of the inode_lock and is the first step to peeling
      away the inode_lock from the code.
      
      This requires that __iget() is done atomically with i_state checks
      during list traversals so that we don't race with another thread
      marking the inode I_FREEING between the state check and grabbing the
      reference.
      
      Also remove the unlock_new_inode() memory barrier optimisation
      required to avoid taking the inode_lock when clearing I_NEW.
      Simplify the code by simply taking the inode->i_lock around the
      state change and wakeup. Because the wakeup is no longer tricky,
      remove the wake_up_inode() function and open code the wakeup where
      necessary.
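
      The traversal pattern described above, roughly (sketch; the walker
      function is illustrative):

              static void example_walk_sb_inodes(struct super_block *sb)
              {
                      struct inode *inode;

                      list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
                              spin_lock(&inode->i_lock);
                              /* the state check and __iget() are atomic under
                               * i_lock, so the inode cannot become I_FREEING
                               * between the check and taking the reference */
                              if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) {
                                      spin_unlock(&inode->i_lock);
                                      continue;
                              }
                              __iget(inode);
                              spin_unlock(&inode->i_lock);
                              /* ... use the inode, then iput() it ... */
                      }
              }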
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      250df6ed
  8. 24 Mar, 2011 2 commits
  9. 21 Mar, 2011 1 commit
  10. 16 Mar, 2011 1 commit
    • prune back iprune_sem · bab1d944
      Christoph Hellwig authored
      iprune_sem is continuously giving us lockdep warnings because we take it in
      read mode in the reclaim path, but we also do non-NOFS allocations while
      holding it in write mode.
      
      Taking a bit deeper look at it I think it's fixable quite trivially:
      
       - for invalidate_inodes we do not need iprune_sem at all.  We have an active
         reference on the superblock, so the filesystem is not going away until it
         has finished.
       - for evict_inodes we do need it, to make sure prune_icache has done its
         work before we tear down the superblock.  But there is no reason to
         hold it over the actual reclaim operation - it's enough to cycle through
         it after the actual reclaim to make sure we wait for any pending
         prune_icache to complete (see the sketch after this list).  We just
         have to remove the WARN_ON for otherwise busy inodes as they can
         actually happen now.
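
      A sketch of the "cycle through it" idea for evict_inodes (not the
      actual code; dispose_sb_inodes is a hypothetical helper):

              static void evict_inodes_sketch(struct super_block *sb)
              {
                      /* tear down our own inodes without holding iprune_sem */
                      dispose_sb_inodes(sb);

                      /* then merely cycle the semaphore: once we can take it for
                       * writing, any prune_icache already running (which holds it
                       * for reading) has finished with this superblock's inodes */
                      down_write(&iprune_sem);
                      up_write(&iprune_sem);
              }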
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      bab1d944
  11. 24 Feb, 2011 2 commits
    • Fix over-zealous flush_disk when changing device size. · 93b270f7
      NeilBrown authored
      There are two cases when we call flush_disk.
      In one, the device has disappeared (check_disk_change), so any
      data we hold becomes irrelevant.
      In the other, the device has changed size (check_disk_size_change),
      so data we hold may be irrelevant.
      
      In both cases it makes sense to discard any 'clean' buffers,
      so they will be read back from the device if needed.
      
      In the former case it makes sense to discard 'dirty' buffers
      as there will never be anywhere safe to write the data.  In the
      second case it does *not* make sense to discard dirty buffers,
      as that will lead to filesystem corruption when you simply enlarge
      the containing device.
      
      flush_disk calls __invalidate_device.
      __invalidate_device calls both invalidate_inodes and invalidate_bdev.
      
      invalidate_inodes *does* discard I_DIRTY inodes and this does lead
      to fs corruption.
      
      invalidate_bdev does *not* discard dirty pages, but I don't really care
      about that at present.
      
      So this patch adds a flag to __invalidate_device (calling it
      __invalidate_device2) to indicate whether dirty buffers should be
      killed, and this is passed to invalidate_inodes which can choose to
      skip dirty inodes.
      
      flush_disk then passes true from check_disk_change and false from
      check_disk_size_change.
      
      dm avoids tripping over this problem by calling i_size_write directly
      rather than using check_disk_size_change.
      
      md does use check_disk_size_change and so is affected.
      
      This regression was introduced by commit 608aeef1 which causes
      check_disk_size_change to call flush_disk, so it is suitable for any
      kernel since 2.6.27.
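
      Schematically (a sketch of the flag plumbing, following the naming
      above; the real code also shrinks the dcache and so on):

              int __invalidate_device2(struct block_device *bdev, bool kill_dirty)
              {
                      struct super_block *sb = get_super(bdev);
                      int busy = 0;

                      if (sb) {
                              /* invalidate_inodes() grows the same flag and leaves
                               * dirty inodes alone when kill_dirty is false */
                              busy = invalidate_inodes(sb, kill_dirty);
                              drop_super(sb);
                      }
                      invalidate_bdev(bdev);
                      return busy;
              }

              /* flush_disk(bdev, true)  from check_disk_change():      device is gone   */
              /* flush_disk(bdev, false) from check_disk_size_change(): device only grew */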
      
      Cc: stable@kernel.org
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Andrew Patterson <andrew.patterson@hp.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NeilBrown <neilb@suse.de>
      93b270f7
    • mm: prevent concurrent unmap_mapping_range() on the same inode · 2aa15890
      Miklos Szeredi authored
      Michael Leun reported that running parallel opens on a fuse filesystem
      can trigger a "kernel BUG at mm/truncate.c:475"
      
      Gurudas Pai reported the same bug on NFS.
      
      The reason is that unmap_mapping_range() is not prepared for more than
      one concurrent invocation per inode.  For example:
      
        thread1: going through a big range, stops in the middle of a vma and
           stores the restart address in vm_truncate_count.
      
        thread2: comes in with a small (e.g. single page) unmap request on
           the same vma, somewhere before restart_address, finds that the
           vma was already unmapped up to the restart address and happily
           returns without doing anything.
      
      Another scenario would be two big unmap requests, both having to
      restart the unmapping and each one setting vm_truncate_count to its
      own value.  This could go on forever without any of them being able to
      finish.
      
      Truncate and hole punching already serialize with i_mutex.  Other
      callers of unmap_mapping_range() do not, and it's difficult to get
      i_mutex protection for all callers.  In particular ->d_revalidate(),
      which calls invalidate_inode_pages2_range() in fuse, may be called
      with or without i_mutex.
      
      This patch adds a new mutex to 'struct address_space' to prevent
      running multiple concurrent unmap_mapping_range() on the same mapping.
      
      [ We'll hopefully get rid of all this with the upcoming mm
        preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
        lockbreak" patch in particular.  But that is for 2.6.39 ]
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Reported-by: Michael Leun <lkml20101129@newton.leun.net>
      Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
      Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2aa15890
  12. 07 Jan, 2011 4 commits
    • fs: avoid inode RCU freeing for pseudo fs · ff0c7d15
      Nick Piggin authored
      Pseudo filesystems whose inodes are not put on an RCU list and are not
      reachable by rcu-walk dentries do not need to RCU-free their inodes.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      ff0c7d15
    • fs: icache RCU free inodes · fa0d7e3d
      Nick Piggin authored
      RCU free the struct inode. This will allow:
      
      - Subsequent store-free path walking patch. The inode must be consulted for
        permissions when walking, so an RCU inode reference is a must.
      - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
        to take i_lock no longer need to take sb_inode_list_lock to walk the list in
        the first place. This will simplify and optimize locking.
      - Could remove some nested trylock loops in dcache code
      - Could potentially simplify things a bit in VM land. Do not need to take the
        page lock to follow page->mapping.
      
      The downside of this is the performance cost of using RCU. In a simple
      creat/unlink microbenchmark, performance drops by about 10% due to the inability to
      reuse cache-hot slab objects. As iterations increase and RCU freeing starts
      kicking over, this increases to about 20%.
      
      In cases where inode lifetimes are longer (ie. many inodes may be allocated
      during the average life span of a single inode), a lot of this cache reuse is
      not applicable, so the regression caused by this patch is smaller.
      
      The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
      however this adds some complexity to list walking and store-free path walking,
      so I prefer to implement this at a later date, if it is shown to be a win in
      real situations. I haven't found a regression in any non-micro benchmark so I
      doubt it will be a problem.
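
      The freeing path becomes, in outline (sketch):

              /* defer the actual kmem free to an RCU grace period, so lock-free
               * (rcu-walk) lookups may still dereference a just-unhashed inode */
              static void i_callback(struct rcu_head *head)
              {
                      struct inode *inode = container_of(head, struct inode, i_rcu);

                      kmem_cache_free(inode_cachep, inode);
              }

              static void destroy_inode(struct inode *inode)
              {
                      /* ... filesystem-specific teardown ... */
                      call_rcu(&inode->i_rcu, i_callback);    /* was an immediate free */
              }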
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      fa0d7e3d
    • fs: use fast counters for vfs caches · 3e880fb5
      Nick Piggin authored
      The percpu_counter library generates quite nasty code, so unless you need
      to dynamically allocate counters or take a fast approximate value, a
      simple per-cpu set of counters is much better (a C sketch of this simple
      approach follows the assembly listing below).
      
      The percpu_counter can never be made to work as well, because it has an
      indirection from pointer to percpu memory, and it can't use direct
      this_cpu_inc interfaces because it doesn't use static PER_CPU data, so
      code will always be worse.
      
      In the fastpath, it is the difference between this:
      
              incl %gs:nr_dentry      # nr_dentry
      
      and this:
      
              movl    percpu_counter_batch(%rip), %edx        # percpu_counter_batch,
              movl    $1, %esi        #,
              movq    $nr_dentry, %rdi        #,
              call    __percpu_counter_add    # (plus I clobber registers)
      
      __percpu_counter_add:
              pushq   %rbp    #
              movq    %rsp, %rbp      #,
              subq    $32, %rsp       #,
              movq    %rbx, -24(%rbp) #,
              movq    %r12, -16(%rbp) #,
              movq    %r13, -8(%rbp)  #,
              movq    %rdi, %rbx      # fbc, fbc
      #APP
      # 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1
              movq %gs:kernel_stack,%rax      #, pfo_ret__
      # 0 "" 2
      #NO_APP
              incl    -8124(%rax)     # <variable>.preempt_count
              movq    32(%rdi), %r12  # <variable>.counters, tcp_ptr__
      #APP
      # 78 "lib/percpu_counter.c" 1
              add %gs:this_cpu_off, %r12      # this_cpu_off, tcp_ptr__
      # 0 "" 2
      #NO_APP
              movslq  (%r12),%r13     #* tcp_ptr__, tmp73
              movslq  %edx,%rax       # batch, batch
              addq    %rsi, %r13      # amount, count
              cmpq    %rax, %r13      # batch, count
              jge     .L27    #,
              negl    %edx    # tmp76
              movslq  %edx,%rdx       # tmp76, tmp77
              cmpq    %rdx, %r13      # tmp77, count
              jg      .L28    #,
      .L27:
              movq    %rbx, %rdi      # fbc,
              call    _raw_spin_lock  #
              addq    %r13, 8(%rbx)   # count, <variable>.count
              movq    %rbx, %rdi      # fbc,
              movl    $0, (%r12)      #,* tcp_ptr__
              call    _raw_spin_unlock        #
      .L29:
      #APP
      # 216 "/home/npiggin/usr/src/linux-2.6/arch/x86/include/asm/thread_info.h" 1
              movq %gs:kernel_stack,%rax      #, pfo_ret__
      # 0 "" 2
      #NO_APP
              decl    -8124(%rax)     # <variable>.preempt_count
              movq    -8136(%rax), %rax       #, D.14625
              testb   $8, %al #, D.14625
              jne     .L32    #,
      .L31:
              movq    -24(%rbp), %rbx #,
              movq    -16(%rbp), %r12 #,
              movq    -8(%rbp), %r13  #,
              leave
              ret
              .p2align 4,,10
              .p2align 3
      .L28:
              movl    %r13d, (%r12)   # count,*
              jmp     .L29    #
      .L32:
              call    preempt_schedule        #
              .p2align 4,,6
              jmp     .L31    #
              .size   __percpu_counter_add, .-__percpu_counter_add
              .p2align 4,,15
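
      For contrast, a C sketch of the simple per-cpu approach referred to
      above (helper names are illustrative):

              static DEFINE_PER_CPU(unsigned int, nr_dentry);

              static inline void dentry_stat_inc(void)
              {
                      this_cpu_inc(nr_dentry);        /* the single "incl %gs:nr_dentry" */
              }

              static unsigned int dentry_stat_read(void)
              {
                      unsigned int sum = 0;
                      int cpu;

                      /* summing is slow, but only happens when the total is wanted */
                      for_each_possible_cpu(cpu)
                              sum += per_cpu(nr_dentry, cpu);
                      return sum;
              }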
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      3e880fb5
    • vfs: revert per-cpu nr_unused counters for dentry and inodes · 86c8749e
      Nick Piggin authored
      The nr_unused counters count the number of objects on an LRU, and as such they
      are synchronized with LRU object insertion and removal and scanning, and
      protected under the LRU lock.
      
      Making them per-cpu does not actually gain any concurrency because of
      this lock; summing the counter becomes much slower, and
      incrementing/decrementing it costs more code size and is slower too.
      
      These counters should stay per-LRU, which currently means global.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      86c8749e
  13. 26 Oct, 2010 14 commits