1. 22 Sep, 2016 2 commits
  2. 03 Aug, 2016 2 commits
  3. 05 Jul, 2016 1 commit
  4. 01 Jul, 2016 1 commit
  5. 30 Jun, 2016 1 commit
  6. 29 May, 2016 2 commits
  7. 27 May, 2016 1 commit
  8. 09 May, 2016 1 commit
  9. 01 May, 2016 1 commit
  10. 11 Apr, 2016 1 commit
  11. 10 Apr, 2016 1 commit
  12. 04 Apr, 2016 1 commit
    • Kirill A. Shutemov's avatar
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      This promise never materialized.  And unlikely will.
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      Let's stop pretending that pages in page cache are special.  They are
      The changes are pretty straight-forward:
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
       - page_cache_get() -> get_page();
       - page_cache_release() -> put_page();
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      virtual patch
      expression E;
      + E
      expression E;
      + E
      + PAGE_SHIFT
      + PAGE_SIZE
      + PAGE_MASK
      expression E;
      + PAGE_ALIGN(E)
      expression E;
      - page_cache_get(E)
      + get_page(E)
      expression E;
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  13. 31 Mar, 2016 1 commit
    • Andreas Gruenbacher's avatar
      posix_acl: Inode acl caching fixes · b8a7a3a6
      Andreas Gruenbacher authored
      When get_acl() is called for an inode whose ACL is not cached yet, the
      get_acl inode operation is called to fetch the ACL from the filesystem.
      The inode operation is responsible for updating the cached acl with
      set_cached_acl().  This is done without locking at the VFS level, so
      another task can call set_cached_acl() or forget_cached_acl() before the
      get_acl inode operation gets to calling set_cached_acl(), and then
      get_acl's call to set_cached_acl() results in caching an outdate ACL.
      Prevent this from happening by setting the cached ACL pointer to a
      task-specific sentinel value before calling the get_acl inode operation.
      Move the responsibility for updating the cached ACL from the get_acl
      inode operations to get_acl().  There, only set the cached ACL if the
      sentinel value hasn't changed.
      The sentinel values are chosen to have odd values.  Likewise, the value
      of ACL_NOT_CACHED is odd.  In contrast, ACL object pointers always have
      an even value (ACLs are aligned in memory).  This allows to distinguish
      uncached ACLs values from ACL objects.
      In addition, switch from guarding inode->i_acl and inode->i_default_acl
      upates by the inode->i_lock spinlock to using xchg() and cmpxchg().
      Filesystems that do not want ACLs returned from their get_acl inode
      operations to be cached must call forget_cached_acl() to prevent the VFS
      from doing so.
      (Patch written by Al Viro and Andreas Gruenbacher.)
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  14. 22 Jan, 2016 1 commit
    • Al Viro's avatar
      wrappers for ->i_mutex access · 5955102c
      Al Viro authored
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  15. 15 Jan, 2016 1 commit
    • Vladimir Davydov's avatar
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov authored
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  16. 09 Jan, 2016 1 commit
  17. 08 Jan, 2016 1 commit
  18. 30 Dec, 2015 1 commit
  19. 09 Dec, 2015 1 commit
    • Al Viro's avatar
      replace ->follow_link() with new method that could stay in RCU mode · 6b255391
      Al Viro authored
      new method: ->get_link(); replacement of ->follow_link().  The differences
      	* inode and dentry are passed separately
      	* might be called both in RCU and non-RCU mode;
      the former is indicated by passing it a NULL dentry.
      	* when called that way it isn't allowed to block
      and should return ERR_PTR(-ECHILD) if it needs to be called
      in non-RCU mode.
      It's a flagday change - the old method is gone, all in-tree instances
      converted.  Conversion isn't hard; said that, so far very few instances
      do not immediately bail out when called in RCU mode.  That'll change
      in the next commits.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  20. 08 Dec, 2015 1 commit
    • Al Viro's avatar
      9p: ->evict_inode() should kick out ->i_data, not ->i_mapping · 4ad78628
      Al Viro authored
      For block devices the pagecache is associated with the inode
      on bdevfs, not with the aliasing ones on the mountable filesystems.
      The latter have its own ->i_data empty and ->i_mapping pointing
      to the (unique per major/minor) bdevfs inode.  That guarantees
      cache coherence between all block device inodes with the same
      device number.
      Eviction of an alias inode has no business trying to evict the
      pages belonging to bdevfs one; moreover, ->i_mapping is only
      safe to access when the thing is opened.  At the time of
      ->evict_inode() the victim is definitely *not* opened.  We are
      about to kill the address space embedded into struct inode
      (inode->i_data) and that's what we need to empty of any pages.
      9p instance tries to empty inode->i_mapping instead, which is
      both unsafe and bogus - if we have several device nodes with
      the same device number in different places, closing one of them
      should not try to empty the (shared) page cache.
      Fortunately, other instances in the tree are OK; they are
      evicting from &inode->i_data instead, as 9p one should.
      Cc: stable@vger.kernel.org # v2.6.32+, ones prior to 2.6.36 need only half of that
      Reported-by: default avatar"Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Tested-by: default avatar"Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  21. 07 Dec, 2015 2 commits
  22. 14 Nov, 2015 2 commits
    • Andreas Gruenbacher's avatar
      9p: xattr simplifications · e409de99
      Andreas Gruenbacher authored
      Now that the xattr handler is passed to the xattr handler operations, we
      can use the same get and set operations for the user, trusted, and security
      xattr namespaces.  In those namespaces, we can access the full attribute
      name by "reattaching" the name prefix the vfs has skipped for us.  Add a
      xattr_full_name helper to make this obvious in the code.
      For the "system.posix_acl_access" and "system.posix_acl_default"
      attributes, handler->prefix is the full attribute name; the suffix is the
      empty string.
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ron Minnich <rminnich@sandia.gov>
      Cc: Latchesar Ionkov <lucho@ionkov.net>
      Cc: v9fs-developer@lists.sourceforge.net
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
    • Andreas Gruenbacher's avatar
      xattr handlers: Pass handler to operations instead of flags · d9a82a04
      Andreas Gruenbacher authored
      The xattr_handler operations are currently all passed a file system
      specific flags value which the operations can use to disambiguate between
      different handlers; some file systems use that to distinguish the xattr
      namespace, for example.  In some oprations, it would be useful to also have
      access to the handler prefix.  To allow that, pass a pointer to the handler
      to operations instead of the flags value alone.
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  23. 11 Nov, 2015 1 commit
  24. 09 Nov, 2015 1 commit
  25. 06 Nov, 2015 1 commit
  26. 22 Oct, 2015 1 commit
  27. 23 Aug, 2015 2 commits
  28. 12 Jul, 2015 1 commit
  29. 08 Jun, 2015 1 commit
    • Tejun Heo's avatar
      v9fs: fix error handling in v9fs_session_init() · 412a19b6
      Tejun Heo authored
      On failure, v9fs_session_init() returns with the v9fs_session_info
      struct partially initialized and expects the caller to invoke
      v9fs_session_close() to clean it up; however, it doesn't track whether
      the bdi is initialized or not and curiously invokes bdi_destroy() in
      both vfs_session_init() failure path too.
      A. If v9fs_session_init() fails before the bdi is initialized, the
         follow-up v9fs_session_close() will invoke bdi_destroy() on an
         uninitialized bdi.
      B. If v9fs_session_init() fails after the bdi is initialized,
         bdi_destroy() will be called twice on the same bdi - once in the
         failure path of v9fs_session_init() and then by
      A is broken no matter what.  B used to be okay because bdi_destroy()
      allowed being invoked multiple times on the same bdi, which BTW was
      broken in its own way - if bdi_destroy() was invoked on an initialiezd
      but !registered bdi, it'd fail to free percpu counters.  Since
      f0054bb1 ("writeback: move backing_dev_info->wb_lock and
      ->worklist into bdi_writeback"), this no longer work - bdi_destroy()
      on an initialized but not registered bdi works correctly but multiple
      invocations of bdi_destroy() is no longer allowed.
      The obvious culprit here is v9fs_session_init()'s odd and broken error
      behavior.  It should simply clean up after itself on failures.  This
      patch makes the following updates to v9fs_session_init().
      * @rc -> @retval error return propagation removed.  It didn't serve
        any purpose.  Just use @rc.
      * Move addition to v9fs_sessionlist to the end of the function so that
        incomplete sessions are not put on the list or iterated and error
        path doesn't have to worry about it.
      * Update error handling so that it cleans up after itself.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
  30. 15 May, 2015 1 commit
  31. 11 May, 2015 4 commits