1. 04 Jan, 2019 1 commit
    • Linus Torvalds's avatar
      Remove 'type' argument from access_ok() function · 96d4f267
      Linus Torvalds authored
      Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
      of the user address range verification function since we got rid of the
      old racy i386-only code to walk page tables by hand.
      
      It existed because the original 80386 would not honor the write protect
      bit when in kernel mode, so you had to do COW by hand before doing any
      user access.  But we haven't supported that in a long time, and these
      days the 'type' argument is a purely historical artifact.
      
      A discussion about extending 'user_access_begin()' to do the range
      checking resulted this patch, because there is no way we're going to
      move the old VERIFY_xyz interface to that model.  And it's best done at
      the end of the merge window when I've done most of my merges, so let's
      just get this done once and for all.
      
      This patch was mostly done with a sed-script, with manual fix-ups for
      the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
      
      There were a couple of notable cases:
      
       - csky still had the old "verify_area()" name as an alias.
      
       - the iter_iov code had magical hardcoded knowledge of the actual
         values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
         really used it)
      
       - microblaze used the type argument for a debug printout
      
      but other than those oddities this should be a total no-op patch.
      
      I tried to fix up all architectures, did fairly extensive grepping for
      access_ok() uses, and the changes are trivial, but I may have missed
      something.  Any missed conversion should be trivially fixable, though.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      96d4f267
  2. 13 Dec, 2018 2 commits
  3. 28 Nov, 2018 1 commit
  4. 25 Nov, 2018 1 commit
  5. 23 Oct, 2018 3 commits
    • David Howells's avatar
      iov_iter: Add I/O discard iterator · 9ea9ce04
      David Howells authored
      Add a new iterator, ITER_DISCARD, that can only be used in READ mode and
      just discards any data copied to it.
      
      This is useful in a network filesystem for discarding any unwanted data
      sent by a server.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      9ea9ce04
    • David Howells's avatar
      iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells authored
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      then chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  Also, this can be more efficient as to implement a switch
      of small contiguous integers, the compiler can use ~50% fewer compare
      instructions than it has to use bitwise-and instructions.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      aa563d7b
    • David Howells's avatar
      iov_iter: Use accessor function · 00e23707
      David Howells authored
      Use accessor functions to access an iterator's type and direction.  This
      allows for the possibility of using some other method of determining the
      type of iterator than if-chains with bitwise-AND conditions.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      00e23707
  6. 15 Jul, 2018 3 commits
  7. 15 May, 2018 1 commit
  8. 02 May, 2018 2 commits
  9. 12 Oct, 2017 1 commit
  10. 21 Sep, 2017 1 commit
  11. 07 Jul, 2017 1 commit
    • Al Viro's avatar
      iov_iter: saner checks on copyin/copyout · 09fc68dc
      Al Viro authored
      * might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic
      codepath also needs might_fault() coverage)
      * we have already done object size checks
      * we have *NOT* done access_ok() recently enough; we rely upon the
      iovec array having passed sanity checks back when it had been created
      and not nothing having buggered it since.  However, that's very much
      non-local, so we'd better recheck that.
      
      So the thing we want does not match anything in uaccess - we need
      access_ok + kasan checks + raw copy without any zeroing.  Just define
      such helpers and use them here.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      09fc68dc
  12. 30 Jun, 2017 2 commits
  13. 09 Jun, 2017 1 commit
    • Dan Williams's avatar
      x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations · 0aed55af
      Dan Williams authored
      The pmem driver has a need to transfer data with a persistent memory
      destination and be able to rely on the fact that the destination writes are not
      cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
      (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
      to ensure data-writes have reached a power-fail-safe zone in the platform. The
      fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
      around and fence previous writes with an "sfence".
      
      Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
      memcpy_flushcache, that guarantee that the destination buffer is not dirty in
      the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
      will be used to replace the "pmem api" (include/linux/pmem.h +
      arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
      and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
      config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
      otherwise.
      
      This is meant to satisfy the concern from Linus that if a driver wants to do
      something beyond the normal nocache semantics it should be something private to
      that driver [1], and Al's concern that anything uaccess related belongs with
      the rest of the uaccess code [2].
      
      The first consumer of this interface is a new 'copy_from_iter' dax operation so
      that pmem can inject cache maintenance operations without imposing this
      overhead on other dax-capable drivers.
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      0aed55af
  14. 09 May, 2017 1 commit
    • Michal Hocko's avatar
      treewide: use kv[mz]alloc* rather than opencoded variants · 752ade68
      Michal Hocko authored
      There are many code paths opencoding kvmalloc.  Let's use the helper
      instead.  The main difference to kvmalloc is that those users are
      usually not considering all the aspects of the memory allocator.  E.g.
      allocation requests <= 32kB (with 4kB pages) are basically never failing
      and invoke OOM killer to satisfy the allocation.  This sounds too
      disruptive for something that has a reasonable fallback - the vmalloc.
      On the other hand those requests might fallback to vmalloc even when the
      memory allocator would succeed after several more reclaim/compaction
      attempts previously.  There is no guarantee something like that happens
      though.
      
      This patch converts many of those places to kv[mz]alloc* helpers because
      they are more conservative.
      
      Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
      Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
      Acked-by: David Sterba <dsterba@suse.com> # btrfs
      Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
      Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Colin Cross <ccross@android.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Santosh Raspatur <santosh@chelsio.com>
      Cc: Hariprasad S <hariprasad@chelsio.com>
      Cc: Yishai Hadas <yishaih@mellanox.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: "Yan, Zheng" <zyan@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      752ade68
  15. 08 May, 2017 1 commit
    • Al Viro's avatar
      fix braino in generic_file_read_iter() · 5b47d59a
      Al Viro authored
      Wrong sign of iov_iter_revert() argument.  Unfortunately, slipped through
      the testing, since most of the time we don't do anything to the iterator
      afterwards and potential oops on walking the iter->iov too far backwards
      is too infrequent to be easily triggered.
      
      Add a sanity check in iov_iter_revert() to catch bugs like this one;
      fortunately, the same braino hadn't happened in other callers, but we'd
      better have a warning if such thing crops up.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      5b47d59a
  16. 29 Apr, 2017 1 commit
  17. 02 Apr, 2017 1 commit
  18. 28 Mar, 2017 2 commits
  19. 15 Jan, 2017 1 commit
  20. 23 Dec, 2016 1 commit
    • Al Viro's avatar
      [iov_iter] fix iterate_all_kinds() on empty iterators · 33844e66
      Al Viro authored
      Problem similar to ones dealt with in "fold checks into iterate_and_advance()"
      and followups, except that in this case we really want to do nothing when
      asked for zero-length operation - unlike zero-length iterate_and_advance(),
      zero-length iterate_all_kinds() has no side effects, and callers are simpler
      that way.
      
      That got exposed when copy_from_iter_full() had been used by tipc, which
      builds an msghdr with zero payload and (now) feeds it to a primitive
      based on iterate_all_kinds() instead of iterate_and_advance().
      Reported-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Tested-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      33844e66
  21. 05 Dec, 2016 1 commit
    • Al Viro's avatar
      [iov_iter] new primitives - copy_from_iter_full() and friends · cbbd26b8
      Al Viro authored
      copy_from_iter_full(), copy_from_iter_full_nocache() and
      csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
      et.al., advancing iterator only in case of successful full copy
      and returning whether it had been successful or not.
      
      Convert some obvious users.  *NOTE* - do not blindly assume that
      something is a good candidate for those unless you are sure that
      not advancing iov_iter in failure case is the right thing in
      this case.  Anything that does short read/short write kind of
      stuff (or is in a loop, etc.) is unlikely to be a good one.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      cbbd26b8
  22. 17 Nov, 2016 1 commit
    • Abhi Das's avatar
      fix iov_iter_advance() for ITER_PIPE · 680bb946
      Abhi Das authored
      iov_iter_advance() needs to decrement iter->count by the number of
      bytes we'd moved beyond.  Normal flavours do that, but ITER_PIPE
      doesn't and ITER_PIPE generic_file_read_iter() for O_DIRECT files
      ends up with a bogus fallback to page cache read, resulting in incorrect
      values for file offset and bytes read.
      Signed-off-by: default avatarAbhi Das <adas@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      680bb946
  23. 01 Nov, 2016 1 commit
  24. 15 Oct, 2016 1 commit
  25. 11 Oct, 2016 1 commit
  26. 05 Oct, 2016 2 commits
    • Miklos Szeredi's avatar
      pipe: add pipe_buf_release() helper · a779638c
      Miklos Szeredi authored
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      a779638c
    • Al Viro's avatar
      new iov_iter flavour: pipe-backed · 241699cd
      Al Viro authored
      iov_iter variant for passing data into pipe.  copy_to_iter()
      copies data into page(s) it has allocated and stuffs them into
      the pipe; copy_page_to_iter() stuffs there a reference to the
      page given to it.  Both will try to coalesce if possible.
      iov_iter_zero() is similar to copy_to_iter(); iov_iter_get_pages()
      and friends will do as copy_to_iter() would have and return the
      pages where the data would've been copied.  iov_iter_advance()
      will truncate everything past the spot it has advanced to.
      
      New primitive: iov_iter_pipe(), used for initializing those.
      pipe should be locked all along.
      
      Running out of space acts as fault would for iovec-backed ones;
      in other words, giving it to ->read_iter() may result in short
      read if the pipe overflows, or -EFAULT if it happens with nothing
      copied there.
      
      In other words, ->read_iter() on those acts pretty much like
      ->splice_read().  Moreover, all generic_file_splice_read() users,
      as well as many other ->splice_read() instances can be switched
      to that scheme - that'll happen in the next commit.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      241699cd
  27. 27 Sep, 2016 1 commit
    • Al Viro's avatar
      get rid of separate multipage fault-in primitives · 4bce9f6e
      Al Viro authored
      * the only remaining callers of "short" fault-ins are just as happy with generic
      variants (both in lib/iov_iter.c); switch them to multipage variants, kill the
      "short" ones
      * rename the multipage variants to now available plain ones.
      * get rid of compat macro defining iov_iter_fault_in_multipage_readable by
      expanding it in its only user.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      4bce9f6e
  28. 17 Sep, 2016 1 commit
  29. 28 Jul, 2016 1 commit
    • Mikulas Patocka's avatar
      mm: optimize copy_page_to/from_iter_iovec · 3fa6c507
      Mikulas Patocka authored
      copy_page_to_iter_iovec() and copy_page_from_iter_iovec() copy some data
      to userspace or from userspace.  These functions have a fast path where
      they map a page using kmap_atomic and a slow path where they use kmap.
      
      kmap is slower than kmap_atomic, so the fast path is preferred.
      
      However, on kernels without highmem support, kmap just calls
      page_address, so there is no need to avoid kmap.  On kernels without
      highmem support, the fast path just increases code size (and cache
      footprint) and it doesn't improve copy performance in any way.
      
      This patch enables the fast path only if CONFIG_HIGHMEM is defined.
      
      Code size reduced by this patch:
        x86 (without highmem)	  928
        x86-64		  960
        sparc64		  848
        alpha			 1136
        pa-risc		 1200
      
      [akpm@linux-foundation.org: use IS_ENABLED(), per Andi]
      Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1607221711410.4818@file01.intranet.prod.int.rdu2.redhat.comSigned-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3fa6c507
  30. 09 Jun, 2016 1 commit
  31. 25 May, 2016 1 commit
    • Al Viro's avatar
      do "fold checks into iterate_and_advance()" right · 19f18459
      Al Viro authored
      the only case when we should skip the iterate_and_advance() guts
      is when nothing's left in the iterator, _not_ just when requested
      amount is 0.  Said guts will do nothing in the latter case anyway;
      the problem we tried to deal with in the aforementioned commit is
      that when there's nothing left *and* the amount requested is 0,
      we might end up deferencing one iovec too many; the value we fetch
      from there is discarded in that case, but theoretically it might
      oops if the iovec array ends exactly at the end of page with the
      next page not mapped.
      
      Bailing out on zero size requested had an unexpected side effect -
      zero-length segment in the beginning of iovec array ended up
      throwing do_loop_readv_writev() into infinite spin; we do not
      advance past the empty segment at all.  Reproducer is trivial:
      echo '#include <sys/uio.h>' >a.c
      echo 'main() {char c; struct iovec v[] = {{&c,0},{&c,1}}; readv(0,v,2);}' >>a.c
      cc a.c && ./a.out </proc/uptime
      
      which should end up with the process not hanging.  Probably ought to
      go into LTP or xfstests...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      19f18459