1. 27 Nov, 2017 1 commit
    • Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds authored
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      The script to do this was:
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
                ACTIVE NOUSER"
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 16 Nov, 2017 1 commit
  3. 08 Nov, 2017 1 commit
  4. 04 Nov, 2017 1 commit
  5. 17 Jul, 2017 1 commit
    • VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) · bc98a42c
      David Howells authored
      Firstly by applying the following with coccinelle's spatch:
      	@@ expression SB; @@
      	-SB->s_flags & MS_RDONLY
      	+sb_rdonly(SB)
      to effect the conversion to sb_rdonly(sb), then by applying:
      	@@ expression A, SB; @@
      	(
      	-(!sb_rdonly(SB)) && A
      	+!sb_rdonly(SB) && A
      	|
      	-A != (sb_rdonly(SB))
      	+A != sb_rdonly(SB)
      	|
      	-A == (sb_rdonly(SB))
      	+A == sb_rdonly(SB)
      	|
      	-A && (sb_rdonly(SB))
      	+A && sb_rdonly(SB)
      	|
      	-A || (sb_rdonly(SB))
      	+A || sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) != A
      	+sb_rdonly(SB) != A
      	|
      	-(sb_rdonly(SB)) == A
      	+sb_rdonly(SB) == A
      	|
      	-(sb_rdonly(SB)) && A
      	+sb_rdonly(SB) && A
      	|
      	-(sb_rdonly(SB)) || A
      	+sb_rdonly(SB) || A
      	)
      	@@ expression A, B, SB; @@
      	(
      	-(sb_rdonly(SB)) ? 1 : 0
      	+sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) ? A : B
      	+sb_rdonly(SB) ? A : B
      	)
      to remove left over excess bracketage and finally by applying:
      	@@ expression A, SB; @@
      	-(A & MS_RDONLY) != sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) != sb_rdonly(SB)
      	-(A & MS_RDONLY) == sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) == sb_rdonly(SB)
      to make comparisons against the result of sb_rdonly() (which is a bool)
      work correctly.
      Signed-off-by: David Howells <dhowells@redhat.com>
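The last spatch step exists because sb_rdonly() returns a bool, while a masked flag word keeps the raw bit value. The following is a minimal user-space sketch of that pitfall, not kernel code: the `demo_*` names are hypothetical, and the flag value is deliberately chosen so that it is not bit 0. For a flag that happens to sit at bit 0 the uncast comparison would still give the right answer, so the cast is best read as making the comparison independent of the bit position.

```c
#include <stdbool.h>

/* Illustrative value only (assumption): a flag that is NOT bit 0. */
#define DEMO_FLAG_RDONLY 0x10UL

struct demo_sb {
	unsigned long s_flags;
};

/* Stand-in for sb_rdonly(): collapses the flag test to a bool (0 or 1). */
static bool demo_sb_rdonly(const struct demo_sb *sb)
{
	return sb->s_flags & DEMO_FLAG_RDONLY;
}

/* Uncast: the left side is 0 or 0x10, the right side is 0 or 1. */
static bool demo_differs_raw(unsigned long a, const struct demo_sb *sb)
{
	return (a & DEMO_FLAG_RDONLY) != demo_sb_rdonly(sb);
}

/* Cast: both sides are normalised to 0 or 1 before comparing. */
static bool demo_differs_cast(unsigned long a, const struct demo_sb *sb)
{
	return (bool)(a & DEMO_FLAG_RDONLY) != demo_sb_rdonly(sb);
}
```

With the flag set on both sides, `demo_differs_raw()` reports a difference that is not there, while `demo_differs_cast()` does not.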
  6. 20 Apr, 2017 1 commit
  7. 07 Oct, 2016 1 commit
  8. 20 Jun, 2016 1 commit
  9. 04 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with chunks bigger than PAGE_SIZE.
      This promise never materialized, and it is unlikely it ever will.
      There are many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE, and it's a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      Switching globally to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      Let's stop pretending that pages in the page cache are special.  They are
      not.
      The changes are pretty straight-forward:
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
       - page_cache_get() -> get_page();
       - page_cache_release() -> put_page();
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
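      virtual patch

      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E

      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E

      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT

      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE

      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK

      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)

      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)

      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)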
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 27 Jan, 2016 1 commit
  11. 15 Jan, 2016 1 commit
    • kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov authored
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 15 Apr, 2015 1 commit
  13. 25 Feb, 2015 1 commit
    • eCryptfs: ensure copy to crypt_stat->cipher does not overrun · 2a559a8b
      Colin Ian King authored
      The patch 237fead6: "[PATCH] ecryptfs: fs/Makefile and
      fs/Kconfig" from Oct 4, 2006, leads to the following static checker
      warning:
        fs/ecryptfs/crypto.c:846 ecryptfs_new_file_context()
        error: off-by-one overflow 'crypt_stat->cipher' size 32.  rl = '0-32'
      There is a mismatch between the size of ecryptfs_crypt_stat.cipher
      and ecryptfs_mount_crypt_stat.global_default_cipher_name, causing the
      copy of the cipher name to trigger an off-by-one string copy error. This
      fix ensures the space reserved for this string is the same size, including
      the trailing zero at the end, throughout ecryptfs.
      This fix avoids increasing the size of ecryptfs_crypt_stat.cipher
      and also ecryptfs_parse_tag_70_packet_silly_stack.cipher_string and instead
      reduces the value of ECRYPTFS_MAX_CIPHER_NAME_SIZE to 31 and includes the
      + 1 for the end of string terminator.
      NOTE: An overflow is not possible in practice since the value copied
      into global_default_cipher_name is validated by
      ecryptfs_code_for_cipher_string() at mount time. None of the allowed
      cipher strings are long enough to cause the potential buffer overflow
      fixed by this patch.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      [tyhicks: Added the NOTE about the overflow not being triggerable]
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
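The sizing rule described in this commit (31 name bytes plus one terminator, giving a 32-byte field) can be illustrated with a small user-space sketch. This is an assumption-laden illustration, not the eCryptfs code: the `demo_*` names are hypothetical, and the explicit length check is added here only to show where the off-by-one would otherwise occur.

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the sizes described above: 31 name bytes + 1 terminator = 32. */
#define DEMO_MAX_CIPHER_NAME_SIZE 31

struct demo_crypt_stat {
	char cipher[DEMO_MAX_CIPHER_NAME_SIZE + 1];
};

/*
 * Copy a cipher name into the fixed-size field, rejecting names that
 * would not fit together with their trailing '\0'.
 * Returns 0 on success, -1 if the name is too long.
 */
static int demo_set_cipher(struct demo_crypt_stat *cs, const char *name)
{
	size_t len = strlen(name);

	if (len > DEMO_MAX_CIPHER_NAME_SIZE)
		return -1;
	memcpy(cs->cipher, name, len + 1);	/* len + 1 copies the terminator */
	return 0;
}
```

Keeping the `+ 1` in the array declaration rather than in the constant means every buffer declared this way has room for the terminator by construction.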
  14. 20 Jan, 2015 1 commit
  15. 23 Oct, 2014 2 commits
    • fs: limit filesystem stacking depth · 69c433ed
      Miklos Szeredi authored
      Add a simple read-only counter to super_block that indicates how deep this
      is in the stack of filesystems.  Previously ecryptfs was the only stackable
      filesystem and it explicitly disallowed multiple layers of itself.
      Overlayfs, however, can be stacked recursively and also may be stacked
      on top of ecryptfs or vice versa.
      To limit the kernel stack usage we must limit the depth of the
      filesystem stack.  Initially the limit is set to 2.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
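A depth counter of this kind can be sketched as follows. The struct, field, and macro names are hypothetical stand-ins chosen for the example; only the limit of 2 comes from the commit message. The idea is that a stacked filesystem records a depth one greater than the layer beneath it and refuses to mount if that would exceed the limit.

```c
/* Hypothetical names; the commit text only says the initial limit is 2. */
#define DEPTH_MAX_STACK 2

struct depth_sb {
	int s_stack_depth;	/* 0 for a filesystem that is not stacked */
};

/*
 * Set the depth of a new stacked superblock to one more than the layer
 * beneath it. Returns 0 on success, -1 if the limit would be exceeded.
 */
static int depth_stack_on(struct depth_sb *upper, const struct depth_sb *lower)
{
	int depth = lower->s_stack_depth + 1;

	if (depth > DEPTH_MAX_STACK)
		return -1;
	upper->s_stack_depth = depth;
	return 0;
}
```

Under these assumptions, two stacked layers on top of a regular filesystem are accepted and a third is rejected.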
    • eCryptfs: Force RO mount when encrypted view is enabled · 332b122d
      Tyler Hicks authored
      The ecryptfs_encrypted_view mount option greatly changes the
      functionality of an eCryptfs mount. Instead of encrypting and decrypting
      lower files, it provides a unified view of the encrypted files in the
      lower filesystem. The presence of the ecryptfs_encrypted_view mount
      option is intended to force a read-only mount and modifying files is not
      supported when the feature is in use. See the following commit for more
      information:
       e77a56dd [PATCH] eCryptfs: Encrypted passthrough
      This patch forces the mount to be read-only when the
      ecryptfs_encrypted_view mount option is specified by setting the
      MS_RDONLY flag on the superblock. Additionally, this patch removes some
      broken logic in ecryptfs_open() that attempted to prevent modifications
      of files when the encrypted view feature was in use. The check in
      ecryptfs_open() was not sufficient to prevent file modifications using
      system calls that do not operate on a file descriptor.
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Reported-by: Priya Bansal <p.bansal@samsung.com>
      Cc: stable@vger.kernel.org # v2.6.21+: e77a56dd [PATCH] eCryptfs: Encrypted passthrough
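Enforcing read-only at the superblock level, rather than per open(), can be sketched in a few lines. This is an illustration with hypothetical `view_*` names and an illustrative flag value, not the eCryptfs mount code: once the option parser has seen the encrypted-view option, the read-only bit is set on the superblock flags, so every write path that consults the superblock is covered, including system calls that take a path instead of a file descriptor.

```c
#include <stdbool.h>

#define VIEW_SB_RDONLY 0x1UL	/* illustrative value (assumption) */

struct view_sb {
	unsigned long s_flags;
};

/* Apply parsed mount options: the encrypted view implies read-only. */
static void view_apply_options(struct view_sb *sb, bool encrypted_view)
{
	if (encrypted_view)
		sb->s_flags |= VIEW_SB_RDONLY;
}
```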
  16. 25 Oct, 2013 1 commit
  17. 10 Jul, 2013 1 commit
  18. 04 Mar, 2013 1 commit
    • fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman authored
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems, trivially
      making things safer at no real cost.
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives, allowing simple, safe,
      well-understood workarounds for known problematic software.
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant being autofs that lives in the module
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep,
      which means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
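The effect of the "fs-" prefix can be shown with a small user-space sketch that builds the name a mount request would ask the module loader for. The helper `fsmod_alias()` is hypothetical and exists only for this illustration; it is not a kernel function. Because the user-supplied filesystem type is always prefixed, a string such as "ext4" can only ever resolve to a module that declares a matching "fs-ext4" alias.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build the module alias that would be requested for a filesystem type,
 * e.g. "ext4" -> "fs-ext4". Returns 0 on success, -1 if buf is too small.
 */
static int fsmod_alias(char *buf, size_t size, const char *fstype)
{
	int n = snprintf(buf, size, "fs-%s", fstype);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The truncation check matters in a sketch like this because a silently shortened alias could match a different module than the one intended.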
  19. 03 Oct, 2012 1 commit
  20. 21 Sep, 2012 1 commit
  21. 14 Sep, 2012 1 commit
  22. 22 Jul, 2012 1 commit
  23. 14 Jul, 2012 2 commits
  24. 08 Jul, 2012 1 commit
    • eCryptfs: Copy up POSIX ACL and read-only flags from lower mount · 069ddcda
      Tyler Hicks authored
      When the eCryptfs mount options do not include '-o acl', but the lower
      filesystem's mount options do include 'acl', the MS_POSIXACL flag is not
      flipped on in the eCryptfs super block flags. This flag is what the VFS
      checks in do_last() when deciding if the current umask should be applied
      to a newly created inode's mode or not. When a default POSIX ACL mask is
      set on a directory, the current umask is incorrectly applied to new
      inodes created in the directory. This patch ignores the MS_POSIXACL flag
      passed into ecryptfs_mount() and sets the flag on the eCryptfs super
      block depending on the flag's presence on the lower super block.
      Additionally, it is incorrect to allow a writeable eCryptfs mount on top
      of a read-only lower mount. This missing check did not allow writes to
      the read-only lower mount because permissions checks are still performed
      on the lower filesystem's objects but it is best to simply not allow a
      rw mount on top of ro mount. However, a ro eCryptfs mount on top of a rw
      mount is valid and still allowed.
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Reported-by: Stefan Beller <stefanbeller@googlemail.com>
      Cc: John Johansen <john.johansen@canonical.com>
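The flag handling described here can be sketched as a pure function over flag words. This is one possible reading of the commit message, written with hypothetical `cu_*` names and illustrative flag values: the POSIX ACL bit requested by the caller is ignored and inherited from the lower superblock instead, and the read-only bit is copied up so that a read-write mount can never sit on top of a read-only one, while a read-only request on a read-write lower mount is preserved.

```c
/* Illustrative values only (assumption); not taken from kernel headers. */
#define CU_RDONLY	0x1UL
#define CU_POSIXACL	0x10000UL

/*
 * Compute the flags of the stacked (upper) superblock from the flags
 * requested at mount time and the flags of the lower superblock.
 */
static unsigned long cu_upper_flags(unsigned long requested,
				    unsigned long lower)
{
	/* Drop the caller's ACL bit; it is decided by the lower mount. */
	unsigned long upper = requested & ~CU_POSIXACL;

	/* Copy up both the ACL bit and the read-only bit from below. */
	upper |= lower & (CU_RDONLY | CU_POSIXACL);
	return upper;
}
```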
  25. 21 Mar, 2012 2 commits
  26. 10 Aug, 2011 1 commit
  27. 29 May, 2011 3 commits
  28. 25 Apr, 2011 1 commit
    • eCryptfs: Add reference counting to lower files · 332ab16f
      Tyler Hicks authored
      For any given lower inode, eCryptfs keeps only one lower file open and
      multiplexes all eCryptfs file operations through that lower file. The
      lower file was considered "persistent" and stayed open from the first
      lookup through the lifetime of the inode.
      This patch keeps the notion of a single, per-inode lower file, but adds
      reference counting around the lower file so that it is closed when not
      currently in use. If the reference count is at 0 when an operation (such
      as open, create, etc.) needs to use the lower file, a new lower file is
      opened. Since the file is no longer persistent, all references to the
      term persistent file are changed to lower file.
      Locking is added around the sections of code that open the lower file
      and assign the pointer in the inode info, as well as the code that fput()s
      the lower file when all eCryptfs users are done with it.
      This patch is needed to fix issues, when mounted on top of the NFSv3
      client, where the lower file is left silly renamed until the eCryptfs
      inode is destroyed.
      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
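The get/put pattern described in this commit can be sketched in user-space C. All `rc_*` names are hypothetical, the "file" is a plain struct rather than a real open file, and the locking the commit mentions is reduced to a comment; the sketch only shows that the lower file is opened on the 0 -> 1 transition, shared while the count is positive, and released on the 1 -> 0 transition.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the lower file and the per-inode info. */
struct rc_file {
	int open;
};

struct rc_inode_info {
	int lower_file_count;		/* current users of the lower file */
	struct rc_file *lower_file;	/* NULL while nobody uses it */
	struct rc_file backing;		/* storage for the demo "file" */
	int opens;			/* how many times it was (re)opened */
};

/* Take a reference, opening the lower file on the 0 -> 1 transition. */
static struct rc_file *rc_get_lower_file(struct rc_inode_info *ii)
{
	/* Real code would hold a per-inode lock across this section. */
	if (ii->lower_file_count++ == 0) {
		ii->backing.open = 1;
		ii->lower_file = &ii->backing;
		ii->opens++;
	}
	return ii->lower_file;
}

/* Drop a reference, closing the lower file on the 1 -> 0 transition. */
static void rc_put_lower_file(struct rc_inode_info *ii)
{
	/* Real code would hold the same lock here. */
	if (--ii->lower_file_count == 0) {
		ii->lower_file->open = 0;
		ii->lower_file = NULL;
	}
}
```

Two overlapping users share a single open; once both have dropped their reference, the next user triggers a fresh open, which is what lets the lower file be closed while the inode is idle.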
  29. 31 Mar, 2011 1 commit
  30. 28 Mar, 2011 3 commits
  31. 17 Jan, 2011 2 commits
  32. 14 Jan, 2011 1 commit