1. 15 Jun, 2018 1 commit
  2. 08 Jun, 2018 1 commit
    • fs/binfmt_misc.c: do not allow offset overflow · 5cc41e09
      Thadeu Lima de Souza Cascardo authored
      When registering a new binfmt_misc handler, it is possible to overflow
      the offset to get a negative value, which might crash the system, or
      possibly leak kernel data.
      
      Here is a crash log when 2500000000 was used as an offset:
      
        BUG: unable to handle kernel paging request at ffff989cfd6edca0
        IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
        PGD 1ef3e067 P4D 1ef3e067 PUD 0
        Oops: 0000 [#1] SMP NOPTI
        Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
        CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
        RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
        Call Trace:
          search_binary_handler+0x97/0x1d0
          do_execveat_common.isra.34+0x667/0x810
          SyS_execve+0x31/0x40
          do_syscall_64+0x73/0x130
          entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Use kstrtoint instead of simple_strtoul.  This works because the code
      already sets the delimiter byte to '\0', and the conversion is only
      done when the field is not empty.
      
      Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX.  Also tested
      with examples documented at Documentation/admin-guide/binfmt-misc.rst
      and other registrations from packages on Ubuntu.
      
      Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5cc41e09
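The overflow described in this commit can be reproduced in userspace. The following is a minimal sketch, not the kernel code: `parse_offset_loose` and `parse_offset_strict` are hypothetical names, `strtoul`/`strtoll` stand in for `simple_strtoul`/`kstrtoint`, and the wrap to a negative value relies on the usual two's-complement conversion that common compilers perform.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Loose parse: the unsigned long result is squeezed into an int, so
 * values above INT_MAX typically wrap around to a negative offset. */
int parse_offset_loose(const char *s)
{
	return (int)strtoul(s, NULL, 10);
}

/* Strict parse in the spirit of kstrtoint(): returns 0 on success,
 * -EINVAL for malformed input, -ERANGE if the value does not fit in int. */
int parse_offset_strict(const char *s, int *out)
{
	char *end;
	long long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoll(s, &end, 10);
	if (end == s || *end != '\0')
		return -EINVAL;
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;
	*out = (int)v;
	return 0;
}
```

With the strict variant, 2500000000 and UINT_MAX are rejected outright, while -1 parses successfully and would still need a separate "offset must not be negative" check by the caller.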
  3. 02 Apr, 2018 1 commit
  4. 13 Oct, 2017 1 commit
    • fs/binfmt_misc.c: node could be NULL when evicting inode · 7e866006
      Eryu Guan authored
      inode->i_private is assigned a Node pointer only after registering a
      new binary format, so it could be NULL if the inode was created by
      bm_fill_super() (or iput() was called by the error path in
      bm_register_write()), and this could result in a NULL pointer dereference
      when evicting such an inode.  For example, mount the binfmt_misc
      filesystem and then unmount it immediately:
      
        mount -t binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc
        umount /proc/sys/fs/binfmt_misc
      
      will result in
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000013
        IP: bm_evict_inode+0x16/0x40 [binfmt_misc]
        ...
        Call Trace:
         evict+0xd3/0x1a0
         iput+0x17d/0x1d0
         dentry_unlink_inode+0xb9/0xf0
         __dentry_kill+0xc7/0x170
         shrink_dentry_list+0x122/0x280
         shrink_dcache_parent+0x39/0x90
         do_one_tree+0x12/0x40
         shrink_dcache_for_umount+0x2d/0x90
         generic_shutdown_super+0x1f/0x120
         kill_litter_super+0x29/0x40
         deactivate_locked_super+0x43/0x70
         deactivate_super+0x45/0x60
         cleanup_mnt+0x3f/0x70
         __cleanup_mnt+0x12/0x20
         task_work_run+0x86/0xa0
         exit_to_usermode_loop+0x6d/0x99
         syscall_return_slowpath+0xba/0xf0
         entry_SYSCALL_64_fastpath+0xa3/0xa
      
      Fix it by making sure Node (e) is not NULL.
      
      Link: http://lkml.kernel.org/r/20171010100642.31786-1-eguan@redhat.com
      Fixes: 83f91827 ("exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()")
      Signed-off-by: Eryu Guan <eguan@redhat.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7e866006
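The shape of the fix in this commit can be sketched in userspace as a guard on the private pointer before it is dereferenced. The struct layouts, the flag value, and the function name `evict_inode` below are hypothetical stand-ins for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_FLAG_OPEN_FILE 0x1	/* hypothetical flag value */

struct node {
	unsigned long flags;
	int interp_open;	/* stand-in for an open interpreter file */
};

struct inode {
	void *i_private;	/* set only after a successful registration */
};

/* Returns 1 if an interpreter file was released, 0 otherwise. */
int evict_inode(struct inode *inode)
{
	struct node *e;

	assert(inode != NULL);
	e = inode->i_private;
	/* The "e &&" guard is the point: inodes created at mount time
	 * carry no node, so e may be NULL here. */
	if (e && (e->flags & NODE_FLAG_OPEN_FILE) && e->interp_open) {
		e->interp_open = 0;	/* stand-in for closing the file */
		return 1;
	}
	return 0;
}
```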
  5. 04 Oct, 2017 5 commits
  6. 04 Sep, 2017 1 commit
  7. 27 Apr, 2017 1 commit
  8. 02 Mar, 2017 1 commit
  9. 28 Sep, 2016 1 commit
  10. 29 May, 2016 1 commit
  11. 30 Mar, 2016 1 commit
  12. 22 Jan, 2016 1 commit
    • wrappers for ->i_mutex access · 5955102c
      Al Viro authored
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5955102c
  13. 17 Apr, 2015 1 commit
  14. 15 Apr, 2015 1 commit
  15. 17 Dec, 2014 1 commit
    • unfuck binfmt_misc.c (broken by commit e6084d4a) · 7d65cf10
      Al Viro authored
      scanarg(s, del) never returns s; the empty field results in s + 1.
      Restore the correct checks, and move NUL-termination into scanarg(),
      while we are at it.
      
      Incidentally, mixing "coding style cleanups" (for small values of cleanup)
      with functional changes is a Bad Idea(tm)...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7d65cf10
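The contract described in this commit (the scanner never returns its input pointer, and an empty field yields `s + 1`) can be illustrated with a standalone sketch. It is modeled on the description above rather than copied from the kernel; the end-of-input check is an addition for this sketch, since a standalone buffer is not guaranteed to contain a delimiter.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Scan one field ending at delimiter del, NUL-terminate it in place, and
 * return a pointer to the start of the next field.  Returns NULL on a
 * malformed \x escape or if the input ends before a delimiter is seen. */
char *scanarg(char *s, char del)
{
	char c;

	assert(s != NULL);
	while ((c = *s++) != del) {
		if (c == '\0')
			return NULL;
		if (c == '\\' && *s == 'x') {
			s++;
			if (!isxdigit((unsigned char)*s++))
				return NULL;
			if (!isxdigit((unsigned char)*s++))
				return NULL;
		}
	}
	s[-1] = '\0';	/* termination happens inside the scanner */
	return s;
}
```

A caller that wants to reject an empty field therefore compares the result against `s + 1`, not against `s`.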
  16. 13 Dec, 2014 1 commit
    • syscalls: implement execveat() system call · 51f39a1f
      David Drysdale authored
      This patchset adds execveat(2) for x86, and is derived from Meredydd
      Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).
      
      The primary aim of adding an execveat syscall is to allow an
      implementation of fexecve(3) that does not rely on the /proc filesystem,
      at least for executables (rather than scripts).  The current glibc version
      of fexecve(3) is implemented via /proc, which causes problems in sandboxed
      or otherwise restricted environments.
      
      Given the desire for a /proc-free fexecve() implementation, HPA suggested
      (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
      an appropriate generalization.
      
      Also, having a new syscall means that it can take a flags argument without
      back-compatibility concerns.  The current implementation just defines the
      AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
      added in future -- for example, flags for new namespaces (as suggested at
      https://lkml.org/lkml/2006/7/11/474).
      
      Related history:
       - https://lkml.org/lkml/2006/12/27/123 is an example of someone
         realizing that fexecve() is likely to fail in a chroot environment.
       - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
         documenting the /proc requirement of fexecve(3) in its manpage, to
         "prevent other people from wasting their time".
       - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
         problem where a process that did setuid() could not fexecve()
         because it no longer had access to /proc/self/fd; this has since
         been fixed.
      
      This patch (of 4):
      
      Add a new execveat(2) system call.  execveat() is to execve() as openat()
      is to open(): it takes a file descriptor that refers to a directory, and
      resolves the filename relative to that.
      
      In addition, if the filename is empty and AT_EMPTY_PATH is specified,
      execveat() executes the file to which the file descriptor refers.  This
      replicates the functionality of fexecve(), which is a system call in other
      UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and
      so relies on /proc being mounted).
      
      The filename fed to the executed program as argv[0] (or the name of the
      script fed to a script interpreter) will be of the form "/dev/fd/<fd>"
      (for an empty filename) or "/dev/fd/<fd>/<filename>", effectively
      reflecting how the executable was found.  This does however mean that
      execution of a script in a /proc-less environment won't work; also, script
      execution via an O_CLOEXEC file descriptor fails (as the file will not be
      accessible after exec).
      
      Based on patches by Meredydd Luff.
      Signed-off-by: David Drysdale <drysdale@google.com>
      Cc: Meredydd Luff <meredydd@senatehouse.org>
      Cc: Shuah Khan <shuah.kh@samsung.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Rich Felker <dalias@aerifal.cx>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      51f39a1f
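The fexecve()-style use of execveat() described above (empty filename plus AT_EMPTY_PATH) can be sketched through the raw syscall interface. This assumes a Linux system whose headers define `SYS_execveat`; the wrapper name `exec_by_fd` is hypothetical, and the fallback value for `AT_EMPTY_PATH` is only there for headers that hide the definition.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000	/* Linux value, for headers that hide it */
#endif

/* Execute the file that fd refers to.  Like execve(), this returns only
 * on failure, in which case the result is -1 and errno is set. */
int exec_by_fd(int fd, char *const argv[], char *const envp[])
{
	assert(argv != NULL && envp != NULL);
	return (int)syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
}
```

As the commit message notes, a descriptor opened with O_CLOEXEC is unsuitable when the target is a script, because the interpreter can no longer reach the file after the exec.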
  17. 11 Dec, 2014 4 commits
    • fs/binfmt_misc.c: use GFP_KERNEL instead of GFP_USER · f7e1ad1a
      Andrew Morton authored
      GFP_USER means "honour cpuset nodes-allowed beancounting".  These are
      regular old kernel objects and there seems no reason to give them this
      treatment.
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f7e1ad1a
    • binfmt_misc: clean up code style a bit · e6084d4a
      Mike Frysinger authored
      Clean up various coding style issues that checkpatch complains about.
      No functional changes here.
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e6084d4a
    • binfmt_misc: add comments & debug logs · 6b899c4e
      Mike Frysinger authored
      When trying to develop a custom format handler, the errors returned all
      effectively get bucketed as EINVAL with no kernel messages.  The other
      errors (ENOMEM/EFAULT) are internal/obvious and basic.  Thus any time a
      bad handler is rejected, the developer has to walk the dense code and
      try to guess where it went wrong.  Needing to dive into kernel code is
      itself a fairly high barrier for a lot of people.
      
      To improve this situation, let's deploy extensive pr_debug markers at
      logical parse points, and add comments to the dense parsing logic.  It
      lets you see exactly where the parsing aborts, the string the kernel
      received (useful when dealing with shell code), how it translated the
      buffers to binary data, and how it will apply the mask at runtime.
      
      Some example output:
        $ echo ':qemu-foo:M::\x7fELF\xAD\xAD\x01\x00:\xff\xff\xff\xff\xff\x00\xff\x00:/usr/bin/qemu-foo:POC' > register
        $ dmesg
        binfmt_misc: register: received 92 bytes
        binfmt_misc: register: delim: 0x3a {:}
        binfmt_misc: register: name: {qemu-foo}
        binfmt_misc: register: type: M (magic)
        binfmt_misc: register: offset: 0x0
        binfmt_misc: register: magic[raw]: 5c 78 37 66 45 4c 46 5c 78 41 44 5c 78 41 44 5c  \x7fELF\xAD\xAD\
        binfmt_misc: register: magic[raw]: 78 30 31 5c 78 30 30 00                          x01\x00.
        binfmt_misc: register:  mask[raw]: 5c 78 66 66 5c 78 66 66 5c 78 66 66 5c 78 66 66  \xff\xff\xff\xff
        binfmt_misc: register:  mask[raw]: 5c 78 66 66 5c 78 30 30 5c 78 66 66 5c 78 30 30  \xff\x00\xff\x00
        binfmt_misc: register:  mask[raw]: 00                                               .
        binfmt_misc: register: magic/mask length: 8
        binfmt_misc: register: magic[decoded]: 7f 45 4c 46 ad ad 01 00                          .ELF....
        binfmt_misc: register:  mask[decoded]: ff ff ff ff ff 00 ff 00                          ........
        binfmt_misc: register:  magic[masked]: 7f 45 4c 46 ad 00 01 00                          .ELF....
        binfmt_misc: register: interpreter: {/usr/bin/qemu-foo}
        binfmt_misc: register: flag: P (preserve argv0)
        binfmt_misc: register: flag: O (open binary)
        binfmt_misc: register: flag: C (preserve creds)
      
      The [raw] lines show us exactly what was received from userspace.  The
      lines after that show us how the kernel has decoded things.
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6b899c4e
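The relationship between the magic[decoded], mask[decoded] and magic[masked] lines in the example output can be reproduced with a byte-wise AND. The sketch below is an illustration of that arithmetic only; the function names are hypothetical and the kernel's own matching loop may be written differently.

```c
#include <assert.h>
#include <stddef.h>

/* AND each magic byte with the corresponding mask byte, in place. */
void apply_mask(unsigned char *magic, const unsigned char *mask, size_t len)
{
	size_t i;

	assert(magic != NULL && mask != NULL);
	for (i = 0; i < len; i++)
		magic[i] &= mask[i];
}

/* A file header matches when its masked bytes equal the masked magic,
 * so bytes whose mask is 00 are ignored entirely. */
int header_matches(const unsigned char *hdr, const unsigned char *masked_magic,
		   const unsigned char *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if ((hdr[i] & mask[i]) != masked_magic[i])
			return 0;
	return 1;
}
```

Fed the decoded magic `7f 45 4c 46 ad ad 01 00` and mask `ff ff ff ff ff 00 ff 00` from the log, `apply_mask` produces `7f 45 4c 46 ad 00 01 00`, which is the magic[masked] line.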
    • binfmt_misc: replace get_unused_fd() with get_unused_fd_flags(0) · c6cb898b
      Yann Droneaud authored
      This patch replaces calls to get_unused_fd() with equivalent calls to
      get_unused_fd_flags(0) to preserve current behavior for existing code.
      
      In a further patch, get_unused_fd() will be removed so that new code starts
      using get_unused_fd_flags(), with the hope that O_CLOEXEC could be used,
      either by default or chosen by userspace.
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c6cb898b
  18. 14 Oct, 2014 2 commits
    • binfmt_misc: work around gcc-4.9 warning · de8288b1
      Arnd Bergmann authored
      gcc-4.9 on ARM gives us a mysterious warning about the binfmt_misc
      parse_command function:
      
        fs/binfmt_misc.c: In function 'parse_command.part.3':
        fs/binfmt_misc.c:405:7: warning: array subscript is above array bounds [-Warray-bounds]
      
      I've managed to trace this back to the ARM implementation of memset,
      which is called from copy_from_user in case of a fault and which does
      
       #define memset(p,v,n)                                                  \
              ({                                                              \
                      void *__p = (p); size_t __n = n;                        \
                      if ((__n) != 0) {                                       \
                              if (__builtin_constant_p((v)) && (v) == 0)      \
                                      __memzero((__p),(__n));                 \
                              else                                            \
                                      memset((__p),(v),(__n));                \
                      }                                                       \
                      (__p);                                                  \
              })
      
      Apparently gcc gets confused by the check for "size != 0" and believes
      that the size might be zero when it gets to the line that does "if
      (s[count-1] == '\n')", so it would access data outside of the array.
      
      gcc is clearly wrong here, since this condition was already checked
      earlier in the function and the 'size' value can not change in the
      meantime.
      
      Fortunately, we can work around it and get rid of the warning by
      rearranging the function to check for zero size after doing the
      copy_from_user.  It is still safe to pass a zero size into
      copy_from_user, so it does not cause any side effects.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de8288b1
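The rearrangement described in this commit, checking for a zero size only after the copy, might look roughly like the following userspace sketch. `memcpy` stands in for `copy_from_user`, and the accepted strings and return codes are illustrative assumptions rather than a statement of the kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

int parse_command(const char *buffer, size_t count)
{
	char s[4];

	assert(buffer != NULL);
	if (count > 3)
		return -EINVAL;
	memcpy(s, buffer, count);	/* stand-in for copy_from_user() */
	if (!count)			/* zero-size check now follows the copy */
		return 0;
	if (s[count - 1] == '\n')	/* count is known to be nonzero here */
		count--;
	if (count == 1 && s[0] == '0')
		return 1;
	if (count == 1 && s[0] == '1')
		return 2;
	if (count == 2 && s[0] == '-' && s[1] == '1')
		return 3;
	return -EINVAL;
}
```

Copying zero bytes is harmless, so moving the check changes no behavior; it only places the `count != 0` test directly in front of the `s[count - 1]` access.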
    • binfmt_misc: expand the register format limit to 1920 bytes · bbaecc08
      Mike Frysinger authored
      The current code places a 256 byte limit on the registration format.
      This ends up being fairly limited when you try to do matching against a
      binary format like ELF:
      
       - the magic & mask formats cannot have any embedded NUL chars
         (string_unescape_inplace halts at the first NUL)
       - each escape sequence quadruples the size: \x00 is needed for NUL
       - trying to match bytes at the start of the file as well as further
         on leads to a lot of \x00 sequences in the mask
       - magic & mask have to be the same length (when decoded)
       - still need bytes for the other fields
       - impossible!
      
      Let's look at a concrete (and common) example: using QEMU to run MIPS
      ELFs.  The name field uses 11 bytes "qemu-mipsel".  The interp uses 20
      bytes "/usr/bin/qemu-mipsel".  The type & flags takes up 4 bytes.  We
      need 7 bytes for the delimiter (usually ":").  We can skip offset.  So
      already we're down to 107 bytes to use with the magic/mask instead of
      the real limit of 128 (BINPRM_BUF_SIZE).  If people use shell code to
      register (which they do the majority of the time), they're down to ~26
      possible bytes since the escape sequence must be \x##.
      
      The ELF format looks like (both 32 & 64 bit):
      
      	e_ident: 16 bytes
      	e_type: 2 bytes
      	e_machine: 2 bytes
      
      Those 20 bytes are enough for most architectures because they have so few
      formats in the first place, thus they can be uniquely identified.  That
      also means for shell users, since 20 is smaller than 26, they can sanely
      register a handler.
      
      But for some targets (like MIPS), we need to poke further.  The ELF fields
      continue on:
      
      	e_entry: 4 or 8 bytes
      	e_phoff: 4 or 8 bytes
      	e_shoff: 4 or 8 bytes
      	e_flags: 4 bytes
      
      We only care about e_flags here as that includes the bits to identify
      whether the ELF is O32/N32/N64.  But now we have to consume another 16
      bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the
      flags.  If every byte is escaped, we send 288 more bytes to the kernel
      ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4
      {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our
      budget.
      
      Even if we try to be clever and do the decoding ourselves (rather than
      relying on the kernel to process \x##), we still can't hit the mark --
      string_unescape_inplace treats mask & magic as C strings so NUL cannot
      be embedded.  That leaves us with having to pass \x00 for the 12/24
      entry/phoff/shoff bytes (as those will be completely random addresses),
      and that is a minimum requirement of 48/96 bytes for the mask alone.
      Add up the rest and we blow through it (this is for 64 bit ELFs):
      magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} +
             4 {e_flags} = 48              # ^^ See note below.
      mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} +
             4 {e_flags} = 120
      Remember above we had 107 left over, and now we're at 168.  This is of
      course the *best* case scenario -- you'll also want to have NUL bytes
      in the magic & mask too to match literal zeros.
      
      Note: the reason we can use 24 in the magic is that we can work off of the
      fact that for bytes the mask would clobber, we can stuff any value into
      magic that we want.  So when mask is \x00, we don't need the magic to also
      be \x00, it can be an unescaped raw byte like '!'.  This lets us handle
      more formats (barely) under the current 256 limit, but that's a pretty
      tall hoop to force people to jump through.
      
      With all that said, let's bump the limit from 256 bytes to 1920.  This way
      we support escaping every byte of the mask & magic field (which is 1024
      bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other
      fields.  Like long paths to the interpreter (when you have source in your
      /really/long/homedir/qemu/foo).  Since the current code stuffs more than
      one structure into the same buffer, we leave a bit of space to easily
      round up to 2k.  1920 is just as arbitrary as 256 ;).
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bbaecc08
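The byte budget worked through in this commit message can be checked mechanically. The sketch below reads the "107 bytes" figure as the per-field share (half of what remains for magic and mask together), which is an interpretation; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { ESCAPE_LEN = 4 };	/* each \x## sequence occupies four input bytes */

/* Bytes left for the magic and mask fields together, given the total limit. */
size_t magic_mask_budget(size_t limit, size_t name, size_t interp,
			 size_t type_flags, size_t delims)
{
	size_t fixed = name + interp + type_flags + delims;

	assert(fixed <= limit);
	return limit - fixed;
}

/* Input bytes needed to send n fully escaped bytes in both magic and mask. */
size_t escaped_cost(size_t n)
{
	return n * 2 * ESCAPE_LEN;
}
```

For the qemu-mipsel example, `magic_mask_budget(256, 11, 20, 4, 7)` leaves 214 bytes, or 107 per field and about 26 escaped bytes per field. `escaped_cost(20 + 12 + 4)` gives the 288 bytes quoted for a 32-bit ELF, and `escaped_cost(128)` gives the 1024 bytes that motivate the new limit.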
  19. 03 Apr, 2014 1 commit
  20. 01 May, 2013 1 commit
  21. 04 Mar, 2013 1 commit
    • fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman authored
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  This allows simple, safe,
      well-understood workarounds for known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant is autofs, which lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep,
      which means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      7f78e035
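The naming rule in this commit, where the requested module is the filesystem type with an "fs-" prefix, can be sketched as a small string helper. `fs_module_name` is a hypothetical userspace function for illustration, not a kernel API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the module name to request for a filesystem type, for example
 * "fs-binfmt_misc".  Returns 0 on success, -1 if out is too small. */
int fs_module_name(char *out, size_t outlen, const char *fstype)
{
	assert(out != NULL && fstype != NULL);
	/* sizeof("fs-") counts the prefix plus the terminating NUL. */
	if (strlen(fstype) + sizeof("fs-") > outlen)
		return -1;
	snprintf(out, outlen, "fs-%s", fstype);
	return 0;
}
```

Because only names of this form are requested, a user-supplied filesystem type can at most trigger loading of a module that declares a matching "fs-" alias.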
  22. 23 Feb, 2013 1 commit
  23. 21 Dec, 2012 1 commit
    • exec: do not leave bprm->interp on stack · b66c5984
      Kees Cook authored
      If a series of scripts are executed, each triggering module loading via
      unprintable bytes in the script header, kernel stack contents can leak
      into the command line.
      
      Normally execution of binfmt_script and binfmt_misc happens recursively.
      However, when modules are enabled, and unprintable bytes exist in the
      bprm->buf, execution will restart after attempting to load matching
      binfmt modules.  Unfortunately, the logic in binfmt_script and
      binfmt_misc does not expect to get restarted.  They leave bprm->interp
      pointing to their local stack.  This means on restart bprm->interp is
      left pointing into unused stack memory which can then be copied into the
      userspace argv areas.
      
      After additional study, it seems that both recursion and restart remain
      the desirable way to handle exec with scripts, misc, and modules.  As
      such, we need to protect the changes to interp.
      
      This changes the logic to require allocation for any changes to the
      bprm->interp.  To avoid adding a new kmalloc to every exec, the default
      value is left as-is.  Only when passing through binfmt_script or
      binfmt_misc does an allocation take place.
      
      For a proof of concept, see DoTest.sh from:
      
         http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: halfdog <me@halfdog.net>
      Cc: P J P <ppandit@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b66c5984
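The allocation rule described in this commit (the default interp aliases the filename, and any change gets its own heap copy) can be sketched in userspace as follows. The struct and the function name `change_interp` are simplified, hypothetical stand-ins, with `malloc`/`free` in place of kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct binprm {
	const char *filename;	/* name of the binary as seen by the caller */
	const char *interp;	/* defaults to filename; heap copy once changed */
};

/* Point bprm->interp at a private copy of interp.  Returns 0 or -ENOMEM. */
int change_interp(struct binprm *bprm, const char *interp)
{
	size_t len;
	char *copy;

	assert(bprm != NULL && interp != NULL);
	len = strlen(interp) + 1;
	copy = malloc(len);
	if (!copy)
		return -ENOMEM;
	memcpy(copy, interp, len);
	/* Free an earlier heap copy, but never the default alias of filename. */
	if (bprm->interp != bprm->filename)
		free((void *)bprm->interp);
	bprm->interp = copy;
	return 0;
}
```

Since the stored string no longer points into a handler's stack frame, a restarted lookup sees a valid interp, and the common exec path that never changes interp pays for no allocation.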
  24. 18 Dec, 2012 1 commit
    • exec: use -ELOOP for max recursion depth · d7402698
      Kees Cook authored
      To avoid an explosion of request_module calls on a chain of abusive
      scripts, fail maximum recursion with -ELOOP instead of -ENOEXEC. As soon
      as maximum recursion depth is hit, the error will fail all the way back
      up the chain, aborting immediately.
      
      This also has the side-effect of stopping the user's shell from attempting
      to reexecute the top-level file as a shell script. As seen in the
      dash source:
      
              if (cmd != path_bshell && errno == ENOEXEC) {
                      *argv-- = cmd;
                      *argv = cmd = path_bshell;
                      goto repeat;
              }
      
      The above logic was designed for running scripts automatically that lacked
      the "#!" header, not to re-try failed recursion. On a legitimate -ENOEXEC,
      things continue to behave as the shell expects.
      
      Additionally, when tracking recursion, the binfmt handlers should not be
      involved. The recursion being tracked is the depth of calls through
      search_binary_handler(), so that function should be exclusively responsible
      for tracking the depth.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: halfdog <me@halfdog.net>
      Cc: P J P <ppandit@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d7402698
  25. 29 Nov, 2012 2 commits
  26. 06 May, 2012 1 commit
  27. 23 Mar, 2012 1 commit
  28. 21 Mar, 2012 1 commit
  29. 07 Jan, 2012 1 commit
  30. 02 Nov, 2011 1 commit
  31. 20 Jul, 2011 1 commit