1. 23 May, 2017 2 commits
  2. 22 May, 2017 1 commit
    • Linus Torvalds's avatar
      x86: fix 32-bit case of __get_user_asm_u64() · 33c9e972
      Linus Torvalds authored
      The code to fetch a 64-bit value from user space was entirely buggered,
      and has been since the code was merged in early 2016 in commit
      b2f68038 ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit
      kernels").
      
      Happily the buggered routine is almost certainly entirely unused, since
      the normal way to access user space memory is just with the non-inlined
      "get_user()", and the inlined version didn't even historically exist.
      
      The normal "get_user()" case is handled by external hand-written asm in
      arch/x86/lib/getuser.S that doesn't have either of these issues.
      
      There were two independent bugs in __get_user_asm_u64():
      
       - it still did the STAC/CLAC user space access marking, even though
         that is now done by the wrapper macros, see commit 11f1a4b9
         ("x86: reorganize SMAP handling in user space accesses").
      
         This didn't result in a semantic error, it just means that the
         inlined optimized version was hugely less efficient than the
         allegedly slower standard version, since the CLAC/STAC overhead is
         quite high on modern Intel CPU's.
      
       - the double register %eax/%edx was marked as an output, but the %eax
         part of it was touched early in the asm, and could thus clobber other
         inputs to the asm that gcc didn't expect it to touch.
      
         In particular, that meant that the generated code could look like
         this:
      
              mov    (%eax),%eax
              mov    0x4(%eax),%edx
      
         where the load of %edx obviously was _supposed_ to be from the 32-bit
         word that followed the source of %eax, but because %eax was
         overwritten by the first instruction, the source of %edx was
         basically random garbage.
      
      The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark
      the 64-bit output as early-clobber to let gcc know that no inputs should
      alias with the output register.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@kernel.org   # v4.8+
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      33c9e972
  3. 21 May, 2017 3 commits
    • Linus Torvalds's avatar
      Clean up x86 unsafe_get/put_user() type handling · 334a023e
      Linus Torvalds authored
      Al noticed that unsafe_put_user() had type problems, and fixed them in
      commit a7cc722f ("fix unsafe_put_user()"), which made me look more
      at those functions.
      
      It turns out that unsafe_get_user() had a type issue too: it limited the
      largest size of the type it could handle to "unsigned long".  Which is
      fine with the current users, but doesn't match our existing normal
      get_user() semantics, which can also handle "u64" even when that does
      not fit in a long.
      
      While at it, also clean up the type cast in unsafe_put_user().  We
      actually want to just make it an assignment to the expected type of the
      pointer, because we actually do want warnings from types that don't
      convert silently.  And it makes the code more readable by not having
      that one very long and complex line.
      
      [ This patch might become stable material if we ever end up back-porting
        any new users of the unsafe uaccess code, but as things stand now this
        doesn't matter for any current existing uses. ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      334a023e
    • Al Viro's avatar
      osf_wait4(): fix infoleak · a8c39544
      Al Viro authored
      failing sys_wait4() won't fill struct rusage...
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      a8c39544
    • Al Viro's avatar
      fix unsafe_put_user() · a7cc722f
      Al Viro authored
      __put_user_size() relies upon its first argument having the same type as what
      the second one points to; the only other user makes sure of that and
      unsafe_put_user() should do the same.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      a7cc722f
  4. 19 May, 2017 14 commits
    • Radim Krčmář's avatar
      KVM: x86: prevent uninitialized variable warning in check_svme() · 92ceb767
      Radim Krčmář authored
      get_msr() of MSR_EFER is currently always going to succeed, but static
      checker doesn't see that far.
      
      Don't complicate stuff and just use 0 for the fallback -- it means that
      the feature is not present.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      92ceb767
    • Radim Krčmář's avatar
      KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh() · 34b0dadb
      Radim Krčmář authored
      Static analysis noticed that pmu->nr_arch_gp_counters can be 32
      (INTEL_PMC_MAX_GENERIC) and therefore cannot be used to shift 'int'.
      
      I didn't add BUILD_BUG_ON for it as we have a better checker.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Fixes: 25462f7f ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch")
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      34b0dadb
    • Radim Krčmář's avatar
      KVM: x86: zero base3 of unusable segments · f0367ee1
      Radim Krčmář authored
      Static checker noticed that base3 could be used uninitialized if the
      segment was not present (useable).  Random stack values probably would
      not pass VMCS entry checks.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Fixes: 1aa36616 ("KVM: x86 emulator: consolidate segment accessors")
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      f0367ee1
    • Wanpeng Li's avatar
      KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation · cbfc6c91
      Wanpeng Li authored
      Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation.
      
      - "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only,
        a read should be ignored) in guest can get a random number.
      - "rep insb" instruction to access PIT register port 0x43 can control memcpy()
        in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes,
        which will disclose the unimportant kernel memory in host but no crash.
      
      The similar test program below can reproduce the read out-of-bounds vulnerability:
      
      void hexdump(void *mem, unsigned int len)
      {
              unsigned int i, j;
      
              for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++)
              {
                      /* print offset */
                      if(i % HEXDUMP_COLS == 0)
                      {
                              printf("0x%06x: ", i);
                      }
      
                      /* print hex data */
                      if(i < len)
                      {
                              printf("%02x ", 0xFF & ((char*)mem)[i]);
                      }
                      else /* end of block, just aligning for ASCII dump */
                      {
                              printf("   ");
                      }
      
                      /* print ASCII dump */
                      if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1))
                      {
                              for(j = i - (HEXDUMP_COLS - 1); j <= i; j++)
                              {
                                      if(j >= len) /* end of block, not really printing */
                                      {
                                              putchar(' ');
                                      }
                                      else if(isprint(((char*)mem)[j])) /* printable char */
                                      {
                                              putchar(0xFF & ((char*)mem)[j]);
                                      }
                                      else /* other char */
                                      {
                                              putchar('.');
                                      }
                              }
                              putchar('\n');
                      }
              }
      }
      
      int main(void)
      {
      	int i;
      	if (iopl(3))
      	{
      		err(1, "set iopl unsuccessfully\n");
      		return -1;
      	}
      	static char buf[0x40];
      
      	/* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x40, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x41, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x42, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x43, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x44, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x45, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      
      	/* ins port 0x40 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x20, %rcx;");
      	asm volatile ("mov $0x40, %rdx;");
      	asm volatile ("rep insb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      
      	/* ins port 0x43 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x20, %rcx;");
      	asm volatile ("mov $0x43, %rdx;");
      	asm volatile ("rep insb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      	return 0;
      }
      
      The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation
      w/o clear after using which results in some random datas are left over in
      the buffer. Guest reads port 0x43 will be ignored since it is write only,
      however, the function kernel_pio() can't distigush this ignore from successfully
      reads data from device's ioport. There is no new data fill the buffer from
      port 0x43, however, emulator_pio_in_emulated() will copy the stale data in
      the buffer to the guest unconditionally. This patch fixes it by clearing the
      buffer before in instruction emulation to avoid to grant guest the stale data
      in the buffer.
      
      In addition, string I/O is not supported for in kernel device. So there is no
      iteration to read ioport %RCX times for string I/O. The function kernel_pio()
      just reads one round, and then copy the io size * %RCX to the guest unconditionally,
      actually it copies the one round ioport data w/ other random datas which are left
      over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by
      introducing the string I/O support for in kernel device in order to grant the right
      ioport datas to the guest.
      
      Before the patch:
      
      0x000000: fe 38 93 93 ff ff ab ab .8......
      0x000008: ab ab ab ab ab ab ab ab ........
      0x000010: ab ab ab ab ab ab ab ab ........
      0x000018: ab ab ab ab ab ab ab ab ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: f6 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
      0x000018: 30 30 20 33 20 20 20 20 00 3
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: f6 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
      0x000018: 30 30 20 33 20 20 20 20 00 3
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      After the patch:
      
      0x000000: 1e 02 f8 00 ff ff ab ab ........
      0x000008: ab ab ab ab ab ab ab ab ........
      0x000010: ab ab ab ab ab ab ab ab ........
      0x000018: ab ab ab ab ab ab ab ab ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: d2 e2 d2 df d2 db d2 d7 ........
      0x000008: d2 d3 d2 cf d2 cb d2 c7 ........
      0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........
      0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: 00 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 00 00 00 00 ........
      0x000018: 00 00 00 00 00 00 00 00 ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      Reported-by: default avatarMoguofang <moguofang@huawei.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Moguofang <moguofang@huawei.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      cbfc6c91
    • Wanpeng Li's avatar
      KVM: x86: Fix potential preemption when get the current kvmclock timestamp · e2c2206a
      Wanpeng Li authored
       BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809
       caller is __this_cpu_preempt_check+0x13/0x20
       CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13
       Call Trace:
        dump_stack+0x99/0xce
        check_preemption_disabled+0xf5/0x100
        __this_cpu_preempt_check+0x13/0x20
        get_kvmclock_ns+0x6f/0x110 [kvm]
        get_time_ref_counter+0x5d/0x80 [kvm]
        kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
        ? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
        ? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm]
        kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm]
        kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
        ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
        ? __fget+0xf3/0x210
        do_vfs_ioctl+0xa4/0x700
        ? __fget+0x114/0x210
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x23/0xc2
       RIP: 0033:0x7f9d164ed357
        ? __this_cpu_preempt_check+0x13/0x20
      
      This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/
      CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled.
      
      Safe access to per-CPU data requires a couple of constraints, though: the
      thread working with the data cannot be preempted and it cannot be migrated
      while it manipulates per-CPU variables. If the thread is preempted, the
      thread that replaces it could try to work with the same variables; migration
      to another CPU could also cause confusion. However there is no preemption
      disable when reads host per-CPU tsc rate to calculate the current kvmclock
      timestamp.
      
      This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both
      __this_cpu_read() and rdtsc() are not preempted.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      e2c2206a
    • Arnd Bergmann's avatar
      arm64: dts: rockchip: fix include reference · 6bf1c2d2
      Arnd Bergmann authored
      The way we handle include paths for DT has changed a bit, which
      broke a file that had an unconventional way to reference a common
      header file:
      
      arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts:47:10: fatal error: include/dt-bindings/input/linux-event-codes.h: No such file or directory
      
      This removes the leading "include/" from the path name, which fixes it.
      
      Fixes: d5d332d3 ("devicetree: Move include prefixes from arch to separate directory")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      6bf1c2d2
    • Arnd Bergmann's avatar
      ARM: remove duplicate 'const' annotations' · 0527873b
      Arnd Bergmann authored
      gcc-7 warns about some declarations that are more 'const' than necessary:
      
      arch/arm/mach-at91/pm.c:338:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
       static const struct of_device_id const ramc_ids[] __initconst = {
      arch/arm/mach-bcm/bcm_kona_smc.c:36:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
       static const struct of_device_id const bcm_kona_smc_ids[] __initconst = {
      arch/arm/mach-spear/time.c:207:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
       static const struct of_device_id const timer_of_match[] __initconst = {
      arch/arm/mach-omap2/prm_common.c:714:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
       static const struct of_device_id const omap_prcm_dt_match_table[] __initconst = {
      arch/arm/mach-omap2/vc.c:562:35: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
       static const struct i2c_init_data const omap4_i2c_timing_data[] __initconst = {
      
      The ones in arch/arm were apparently all introduced accidentally by one
      commit that correctly marked a lot of variables as __initconst.
      
      Fixes: 19c233b7 ("ARM: appropriate __init annotation for const data")
      Acked-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Acked-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarNicolas Pitre <nico@linaro.org>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Acked-by: default avatarKrzysztof Hałasa <khalasa@piap.pl>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      0527873b
    • Rob Herring's avatar
      arm64: defconfig: enable options needed for QCom DB410c board · f4e506c5
      Rob Herring authored
      Enable Qualcomm drivers needed to boot Dragonboard 410c with HDMI. This
      enables support for clocks, regulators, and USB PHY.
      
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      [Olof: Turned off _RPM configs per follow-up email]
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      f4e506c5
    • Rob Herring's avatar
      arm64: defconfig: sync with savedefconfig · eb1e6716
      Rob Herring authored
      Sync the defconfig with savedefconfig as config options change/move over
      time.
      
      Generated with the following commands:
      make defconfig
      make savedefconfig
      cp defconfig arch/arm64/configs/defconfig
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      eb1e6716
    • Linus Walleij's avatar
      ARM: configs: add a gemini defconfig · 877dd4b1
      Linus Walleij authored
      It makes sense to have a stripped-down defconfig for just Gemini, as
      it is a pretty small platform used in NAS etc, and will use appended
      device tree. It is also quick to compile and test. Hopefully this
      defconfig can be a good base for distributions such as OpenWRT.
      
      I plan to add in the config options needed for the different
      variants of Gemini as we go along.
      
      Cc: Janos Laube <janos.dev@gmail.com>
      Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com>
      Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      877dd4b1
    • Olof Johansson's avatar
      devicetree: Move include prefixes from arch to separate directory · d5d332d3
      Olof Johansson authored
      We use a directory under arch/$ARCH/boot/dts as an include path
      that has links outside of the subtree to find dt-bindings from under
      include/dt-bindings. That's been working well, but new DT architectures
      haven't been adding them by default.
      
      Recently there's been a desire to share some of the DT material between
      arm and arm64, which originally caused developers to create symlinks or
      relative includes between the subtrees. This isn't ideal -- it breaks
      if the DT files aren't stored in the exact same hierarchy as the kernel
      tree, and generally it's just icky.
      
      As a somewhat cleaner solution we decided to add a $ARCH/ prefix link
      once, and allow DTS files to reference dtsi (and dts) files in other
      architectures that way.
      
      Original approach was to create these links under each architecture,
      but it lead to the problem of recursive symlinks.
      
      As a remedy, move the include link directories out of the architecture
      trees into a common location. At the same time, they can now share one
      directory and one dt-bindings/ link as well.
      
      Fixes: 4027494a ('ARM: dts: add arm/arm64 include symlinks')
      Reported-by: default avatarRussell King <linux@armlinux.org.uk>
      Reported-by: default avatarOmar Sandoval <osandov@osandov.com>
      Reviewed-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Reviewed-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Tested-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Acked-by: default avatarRob Herring <robh@kernel.org>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: linux-arch <linux-arch@vger.kernel.org>
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      d5d332d3
    • Juergen Gross's avatar
      xen: make xen_flush_tlb_all() static · c71e6d80
      Juergen Gross authored
      xen_flush_tlb_all() is used in arch/x86/xen/mmu.c only. Make it static.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      c71e6d80
    • Juergen Gross's avatar
      xen: cleanup pvh leftovers from pv-only sources · 989513a7
      Juergen Gross authored
      There are some leftovers testing for pvh guest mode in pv-only source
      files. Remove them.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      989513a7
    • Michael Ellerman's avatar
      powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hash · e41e53cd
      Michael Ellerman authored
      virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on
      an address. What this means in practice is that it should only return true for
      addresses in the linear mapping which are backed by a valid PFN.
      
      We are failing to properly check that the address is in the linear mapping,
      because virt_to_pfn() will return a valid looking PFN for more or less any
      address. That bug is actually caused by __pa(), used in virt_to_pfn().
      
      eg: __pa(0xc000000000010000) = 0x10000  # Good
          __pa(0xd000000000010000) = 0x10000  # Bad!
          __pa(0x0000000000010000) = 0x10000  # Bad!
      
      This started happening after commit bdbc29c1 ("powerpc: Work around gcc
      miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition
      of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from
      the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give
      you something bogus back.
      
      Until we can verify if that GCC bug is no longer an issue, or come up with
      another solution, this commit does the minimal fix to make virt_addr_valid()
      work, by explicitly checking that the address is in the linear mapping region.
      
      Fixes: bdbc29c1 ("powerpc: Work around gcc miscompilation of __pa() on 64-bit")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Reviewed-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Tested-by: default avatarBreno Leitao <breno.leitao@gmail.com>
      e41e53cd
  5. 18 May, 2017 1 commit
  6. 17 May, 2017 6 commits
    • Liam R. Howlett's avatar
      sparc/ftrace: Fix ftrace graph time measurement · 48078d2d
      Liam R. Howlett authored
      The ftrace function_graph time measurements of a given function is not
      accurate according to those recorded by ftrace using the function
      filters.  This change pulls the x86_64 fix from 'commit 722b3c74
      ("ftrace/graph: Trace function entry before updating index")' into the
      sparc specific prepare_ftrace_return which stops ftrace from
      counting interrupted tasks in the time measurement.
      
      Example measurements for select_task_rq_fair running "hackbench 100
      process 1000":
      
                    |  tracing/trace_stat/function0  |  function_graph
       Before patch |  2.802 us                      |  4.255 us
       After patch  |  2.749 us                      |  3.094 us
      Signed-off-by: default avatarLiam R. Howlett <Liam.Howlett@Oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      48078d2d
    • Orlando Arias's avatar
      sparc: Fix -Wstringop-overflow warning · deba804c
      Orlando Arias authored
      Greetings,
      
      GCC 7 introduced the -Wstringop-overflow flag to detect buffer overflows
      in calls to string handling functions [1][2]. Due to the way
      ``empty_zero_page'' is declared in arch/sparc/include/setup.h, this
      causes a warning to trigger at compile time in the function mem_init(),
      which is subsequently converted to an error. The ensuing patch fixes
      this issue and aligns the declaration of empty_zero_page to that of
      other architectures. Thank you.
      
      Cheers,
      Orlando.
      
      [1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02308.html
      [2] https://gcc.gnu.org/gcc-7/changes.htmlSigned-off-by: default avatarOrlando Arias <oarias@knights.ucf.edu>
      
      --------------------------------------------------------------------------------
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      deba804c
    • Nitin Gupta's avatar
      sparc64: Fix mapping of 64k pages with MAP_FIXED · b6c41cb0
      Nitin Gupta authored
      An incorrect huge page alignment check caused
      mmap failure for 64K pages when MAP_FIXED is used
      with address not aligned to HPAGE_SIZE.
      
      Orabug: 25885991
      
      Fixes: dcd1912d ("sparc64: Add 64K page size support")
      Signed-off-by: default avatarNitin Gupta <nitin.m.gupta@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6c41cb0
    • Mark Rutland's avatar
      arm64/cpufeature: don't use mutex in bringup path · 63a1e1c9
      Mark Rutland authored
      Currently, cpus_set_cap() calls static_branch_enable_cpuslocked(), which
      must take the jump_label mutex.
      
      We call cpus_set_cap() in the secondary bringup path, from the idle
      thread where interrupts are disabled. Taking a mutex in this path "is a
      NONO" regardless of whether it's contended, and something we must avoid.
      We didn't spot this until recently, as ___might_sleep() won't warn for
      this case until all CPUs have been brought up.
      
      This patch avoids taking the mutex in the secondary bringup path. The
      poking of static keys is deferred until enable_cpu_capabilities(), which
      runs in a suitable context on the boot CPU. To account for the static
      keys being set later, cpus_have_const_cap() is updated to use another
      static key to check whether the const cap keys have been initialised,
      falling back to the caps bitmap until this is the case.
      
      This means that users of cpus_have_const_cap() gain should only gain a
      single additional NOP in the fast path once the const caps are
      initialised, but should always see the current cap value.
      
      The hyp code should never dereference the caps array, since the caps are
      initialized before we run the module initcall to initialise hyp. A check
      is added to the hyp init code to document this requirement.
      
      This change will sidestep a number of issues when the upcoming hotplug
      locking rework is merged.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarMarc Zyniger <marc.zyngier@arm.com>
      Reviewed-by: default avatarSuzuki Poulose <suzuki.poulose@arm.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Sewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      63a1e1c9
    • Ravikumar Kattekola's avatar
      ARM: dts: dra7: Reduce cpu thermal shutdown temperature · bca52388
      Ravikumar Kattekola authored
      On dra7, as per TRM, the HW shutdown (TSHUT) temperature is hardcoded
      to 123C and cannot be modified by SW. This means when the temperature
      reaches 123C HW asserts TSHUT output which signals a warm reset.
      This reset is held until the temperature goes below the TSHUT low (105C).
      
      While in SW, the thermal driver continuously monitors current temperature
      and takes decisions based on whether it reached an alert or a critical point.
      The intention of setting a SW critical point is to prevent force reset by HW
      and instead do an orderly_poweroff(). But if the SW critical temperature is
      greater than or equal to that of HW then it defeats the purpose. To address
      this and let SW take action before HW does keep the SW critical temperature
      less than HW TSHUT value.
      
      The value for SW critical temperature was chosen as 120C just to ensure
      we give SW sometime before HW catches up.
      
      Document reference
      SPRUI30C – DRA75x, DRA74x Technical Reference Manual - November 2016
      SPRUHZ6H - AM572x Technical Reference Manual - November 2016
      
      Tested on:
      DRA75x PG 2.0 Rev H EVM
      Signed-off-by: default avatarRavikumar Kattekola <rk@ti.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      bca52388
    • Michael Ellerman's avatar
      powerpc/mm: Fix crash in page table dump with huge pages · bfb9956a
      Michael Ellerman authored
      The page table dump code doesn't know about huge pages, so currently
      it crashes (or walks random memory, usually leading to a crash), if it
      finds a huge page. On Book3S we only see huge pages in the Linux page
      tables when we're using the P9 Radix MMU.
      
      Teaching the code to properly handle huge pages is a bit more involved,
      so for now just prevent the crash.
      
      Cc: stable@vger.kernel.org # v4.10+
      Fixes: 8eb07b18 ("powerpc/mm: Dump linux pagetables")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      bfb9956a
  7. 16 May, 2017 7 commits
  8. 15 May, 2017 6 commits
    • Ganapatrao Kulkarni's avatar
      arm64: perf: Ignore exclude_hv when kernel is running in HYP · 78a19cfd
      Ganapatrao Kulkarni authored
      commit d98ecdac ("arm64: perf: Count EL2 events if the kernel is
      running in HYP") returns -EINVAL when perf system call perf_event_open is
      called with exclude_hv != exclude_kernel. This change breaks applications
      on VHE enabled ARMv8.1 platforms. The issue was observed with HHVM
      application, which calls perf_event_open with exclude_hv = 1 and
      exclude_kernel = 0.
      
      There is no separate hypervisor privilege level when VHE is enabled, the
      host kernel runs at EL2. So when VHE is enabled, we should ignore
      exclude_hv from the application. This behaviour is consistent with PowerPC
      where the exclude_hv is ignored when the hypervisor is not present and with
      x86 where this flag is ignored.
      Signed-off-by: default avatarGanapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      [will: added comment to justify the behaviour of exclude_hv]
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      78a19cfd
    • Robin Murphy's avatar
      arm64: Remove redundant mov from LL/SC cmpxchg · 8df728e1
      Robin Murphy authored
      The cmpxchg implementation introduced by commit c342f782 ("arm64:
      cmpxchg: patch in lse instructions when supported by the CPU") performs
      an apparently redundant register move of [old] to [oldval] in the
      success case - it always uses the same register width as [oldval] was
      originally loaded with, and is only executed when [old] and [oldval] are
      known to be equal anyway.
      
      The only effect it seemingly does have is to take up a surprising amount
      of space in the kernel text, as removing it reveals:
      
         text	   data	    bss	    dec	    hex	filename
      12426658	1348614	4499749	18275021	116dacd	vmlinux.o.new
      12429238	1348614	4499749	18277601	116e4e1	vmlinux.o.old
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      8df728e1
    • Paolo Bonzini's avatar
      KVM: nVMX: fix EPT permissions as reported in exit qualification · 0780516a
      Paolo Bonzini authored
      This fixes the new ept_access_test_read_only and ept_access_test_read_write
      testcases from vmx.flat.
      
      The problem is that gpte_access moves bits around to switch from EPT
      bit order (XWR) to ACC_*_MASK bit order (RWX).  This results in an
      incorrect exit qualification.  To fix this, make pt_access and
      pte_access operate on raw PTE values (only with NX flipped to mean
      "can execute") and call gpte_access at the end of the walk.  This
      lets us use pte_access to compute the exit qualification with XWR
      bit order.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarXiao Guangrong <xiaoguangrong@tencent.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      0780516a
    • Wanpeng Li's avatar
      KVM: VMX: Don't enable EPT A/D feature if EPT feature is disabled · fce6ac4c
      Wanpeng Li authored
      We can observe eptad kvm_intel module parameter is still Y
      even if ept is disabled which is weird. This patch will
      not enable EPT A/D feature if EPT feature is disabled.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      fce6ac4c
    • Wanpeng Li's avatar
      KVM: x86: Fix load damaged SSEx MXCSR register · a575813b
      Wanpeng Li authored
      Reported by syzkaller:
      
         BUG: unable to handle kernel paging request at ffffffffc07f6a2e
         IP: report_bug+0x94/0x120
         PGD 348e12067
         P4D 348e12067
         PUD 348e14067
         PMD 3cbd84067
         PTE 80000003f7e87161
      
         Oops: 0003 [#1] SMP
         CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G           OE   4.11.0+ #8
         task: ffff92fdfb525400 task.stack: ffffbda6c3d04000
         RIP: 0010:report_bug+0x94/0x120
         RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202
          do_trap+0x156/0x170
          do_error_trap+0xa3/0x170
          ? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
          ? mark_held_locks+0x79/0xa0
          ? retint_kernel+0x10/0x10
          ? trace_hardirqs_off_thunk+0x1a/0x1c
          do_invalid_op+0x20/0x30
          invalid_op+0x1e/0x30
         RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
          ? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm]
          kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm]
          kvm_vcpu_ioctl+0x384/0x780 [kvm]
          ? kvm_vcpu_ioctl+0x384/0x780 [kvm]
          ? sched_clock+0x13/0x20
          ? __do_page_fault+0x2a0/0x550
          do_vfs_ioctl+0xa4/0x700
          ? up_read+0x1f/0x40
          ? __do_page_fault+0x2a0/0x550
          SyS_ioctl+0x79/0x90
          entry_SYSCALL_64_fastpath+0x23/0xc2
      
      SDM mentioned that "The MXCSR has several reserved bits, and attempting to write
      a 1 to any of these bits will cause a general-protection exception(#GP) to be
      generated". The syzkaller forks' testcase overrides xsave area w/ random values
      and steps on the reserved bits of MXCSR register. The damaged MXCSR register
      values of guest will be restored to SSEx MXCSR register before vmentry. This
      patch fixes it by catching userspace override MXCSR register reserved bits w/
      random values and bails out immediately.
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      a575813b
    • Dan Carpenter's avatar
      kvm: nVMX: off by one in vmx_write_pml_buffer() · 4769886b
      Dan Carpenter authored
      There are PML_ENTITY_NUM elements in the pml_address[] array so the >
      should be >= or we write beyond the end of the array when we do:
      
      	pml_address[vmcs12->guest_pml_index--] = gpa;
      
      Fixes: c5f983f6 ("nVMX: Implement emulated Page Modification Logging")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      4769886b