1. 11 Jan, 2018 1 commit
    • livepatch: Remove immediate feature · d0807da7
      Miroslav Benes authored
      The immediate flag has been used to disable per-task consistency and
      patch all tasks immediately. It could be useful if the patch doesn't
      change any function or data semantics.

      However, it causes problems on its own. The consistency model is
      currently broken with respect to immediate patches.
      
      func            a
      patches         1i
                      2i
                      3
      
      When patch 3 is applied, only function 2i is checked (by the stack
      checking facility). There might be a task sleeping in 1i, though. Such
      a task is migrated to 3, because we do not check 1i in
      klp_check_stack_func() at all, as shown in the sketch below.
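
      For illustration, a lightly abridged sketch of the current check,
      paraphrased from klp_check_stack_func() in
      kernel/livepatch/transition.c (comments added; the immediate-flag
      early return is omitted since this commit removes it). When patching
      forward, only the previous function on the ops->func_stack is
      compared against the stack, so older versions such as 1i are never
      examined:

        static int klp_check_stack_func(struct klp_func *func,
                                        struct stack_trace *trace)
        {
                unsigned long func_addr, func_size, address;
                struct klp_ops *ops;
                int i;

                for (i = 0; i < trace->nr_entries; i++) {
                        address = trace->entries[i];

                        if (klp_target_state == KLP_UNPATCHED) {
                                /* check for the to-be-unpatched function */
                                func_addr = (unsigned long)func->new_func;
                                func_size = func->new_size;
                        } else {
                                /* only the previous function is checked */
                                ops = klp_find_ops(func->old_addr);

                                if (list_is_singular(&ops->func_stack)) {
                                        /* original function */
                                        func_addr = func->old_addr;
                                        func_size = func->old_size;
                                } else {
                                        /* previously patched function (2i) */
                                        struct klp_func *prev;

                                        prev = list_next_entry(func,
                                                               stack_node);
                                        func_addr = (unsigned long)prev->new_func;
                                        func_size = prev->new_size;
                                }
                        }

                        if (address >= func_addr &&
                            address < func_addr + func_size)
                                return -EAGAIN;
                }

                return 0;
        }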
      
      The upcoming atomic replace feature would be easier to implement and
      more reliable without the immediate flag.
      
      Thus, remove the immediate feature completely and save us from these
      problems.
      
      Note that the force feature has a similar problem. However, it is
      considered a last resort. If used, the administrator should not apply
      any new live patches and should plan to reboot into an updated kernel.
      
      Architectures now need to provide HAVE_RELIABLE_STACKTRACE to fully
      support livepatch.
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  2. 07 Dec, 2017 1 commit
    • livepatch: force transition to finish · c99a2be7
      Miroslav Benes authored
      If a task sleeps in a set of patched functions without interruption,
      it can block the whole transition indefinitely. Thus it may be useful
      to clear its TIF_PATCH_PENDING to allow the transition to finish.
      
      The administrator can now do that by writing to the 'force' sysfs
      attribute in the livepatch sysfs directory. TIF_PATCH_PENDING is then
      cleared for all tasks and the transition can finish successfully; see
      the sketch below.
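
      A minimal sketch of what the forced transition amounts to (modeled on
      the new klp_force_transition() in kernel/livepatch/transition.c;
      simplified, comments added):

        static void klp_force_transition(void)
        {
                struct task_struct *g, *task;
                unsigned int cpu;

                pr_warn("forcing remaining tasks to the patched state\n");

                /* mark every user task and kthread as migrated */
                read_lock(&tasklist_lock);
                for_each_process_thread(g, task)
                        klp_update_patch_state(task);
                read_unlock(&tasklist_lock);

                /* idle tasks never exit the kernel; handle them explicitly */
                for_each_possible_cpu(cpu)
                        klp_update_patch_state(idle_task(cpu));

                /* from now on, removing any patch module is unsafe */
                klp_forced = true;
        }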
      
      Important note! The administrator should not use this feature without
      clearance from the patch distributor. It must be verified that the
      consistency model guarantees are not violated by doing so. Removal
      (rmmod) of patch modules is permanently disabled once the feature is
      used, because it cannot be guaranteed that no task is sleeping in
      such a module.
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  3. 04 Dec, 2017 1 commit
    • livepatch: send a fake signal to all blocking tasks · 43347d56
      Miroslav Benes authored
      The live patching consistency model is of the LEAVE_PATCHED_SET and
      SWITCH_THREAD kind. This means that all tasks in the system have to be
      marked, one by one, as safe to call a new patched function. A task is
      safe when it is not (sleeping) in the set of patched functions, that
      is, when no patched function is on the task's stack. Another clearly
      safe place is the boundary between kernel and userspace. The patching
      waits for all tasks to get outside of the patched set or to cross the
      boundary, and the transition is completed afterwards.
      
      The problem is that a task can block the transition for quite a long
      time, if not forever. It could, for example, sleep in a set of patched
      functions. Luckily, we can force such a task to leave the set by
      sending it a fake signal, that is, a signal with no data in the signal
      pending structures (no handler, no sign of a proper signal having been
      delivered). Suspend/freezer uses this to freeze tasks as well. The
      task gets TIF_SIGPENDING set and is woken up (if it has been sleeping
      in the kernel) or kicked by a rescheduling IPI (if it is running on
      another CPU). This causes the task to go to the kernel/userspace
      boundary, where the signal is handled and the task is marked as safe
      in terms of live patching.
      
      There are tasks which are not affected by this technique, though. The
      fake signal is not sent to kthreads; they have to be handled
      differently. They can be woken up so that they leave the patched set,
      and their TIF_PATCH_PENDING can then be cleared thanks to stack
      checking. The sketch below shows both cases.
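
      A condensed sketch of the signaling loop (modeled on the new
      klp_send_signals() in kernel/livepatch/transition.c; simplified,
      comments added):

        void klp_send_signals(void)
        {
                struct task_struct *g, *task;

                pr_notice("signaling remaining tasks\n");

                read_lock(&tasklist_lock);
                for_each_process_thread(g, task) {
                        if (!klp_patch_pending(task))
                                continue;

                        if (task->flags & PF_KTHREAD) {
                                /* wake a kthread sleeping interruptibly */
                                wake_up_state(task, TASK_INTERRUPTIBLE);
                        } else {
                                /* fake signal: only TIF_SIGPENDING is set */
                                spin_lock_irq(&task->sighand->siglock);
                                signal_wake_up(task, 0);
                                spin_unlock_irq(&task->sighand->siglock);
                        }
                }
                read_unlock(&tasklist_lock);
        }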
      
      For the sake of completeness: first, if the task is in the
      TASK_RUNNING state but not currently running on some CPU, it doesn't
      get the IPI, but it will eventually handle the signal anyway. Second,
      if the task runs in the kernel (in the TASK_RUNNING state), it gets
      the IPI, but the signal is not handled on return from the interrupt;
      it will be handled on a future return to userspace, when the fake
      signal is sent again. Stack checking deals with these cases in a
      better way.
      
      If the task was sleeping in a syscall, it is woken by our fake signal,
      checks whether TIF_SIGPENDING is set (by calling the signal_pending()
      predicate), and returns ERESTART* or EINTR. Syscalls with ERESTART*
      return values are restarted in the case of the fake signal (see
      do_signal()). EINTR is propagated back to the userspace program. This
      could disturb the program, but...
      
      * every process dealing with signals should react appropriately to
        EINTR return values.
      * syscalls returning EINTR are quite common in the system even when no
        fake signal is sent.
      * the freezer sends the fake signal and does not deal with EINTR in
        any way. Thus EINTR values are returned when the system is resumed
        anyway.
      
      The actual safe marking is done in the architectures' entry code on
      the syscall and interrupt/exception exit paths, and in the stack
      checking functions of livepatch. TIF_PATCH_PENDING is cleared, and the
      next recalc_sigpending() drops TIF_SIGPENDING. In connection with
      this, klp_update_patch_state() is also called before do_signal(), so
      that recalc_sigpending() in dequeue_signal() can clear
      TIF_PATCH_PENDING immediately and thus prevent a double call of
      do_signal().
      
      Note that the fake signal is not sent to stopped/traced tasks. Such a
      task blocks the patching from finishing until it runs again (is no
      longer traced).
      
      Last, sending the fake signal is not automatic. It is done only when
      the administrator requests it by writing 1 to the 'signal' sysfs
      attribute in the livepatch sysfs directory.
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: x86@kernel.org
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  4. 19 Oct, 2017 1 commit
    • livepatch: add (un)patch callbacks · 93862e38
      Joe Lawrence authored
      Provide livepatch modules with a klp_object (un)patching notification
      mechanism.  Pre- and post-(un)patch callbacks allow livepatch modules
      to set up or synchronize changes that would be difficult to support in
      only patched-or-unpatched code contexts.

      Callbacks can be registered for target-module or vmlinux klp_objects,
      but each implementation is klp_object-specific.
      
        - Pre-(un)patch callbacks run before any (un)patching transition
          starts.
      
        - Post-(un)patch callbacks run once an object has been (un)patched and
          the klp_patch fully transitioned to its target state.
      
      Example use cases include modification of global data and registration
      of newly available services/handlers.
      
      See Documentation/livepatch/callbacks.txt for details and
      samples/livepatch/ for examples; a brief sketch follows below.
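
      For illustration, a hedged sketch of wiring callbacks into a vmlinux
      klp_object (the demo_* names and the funcs array are hypothetical;
      see samples/livepatch/ for the real examples):

        /* hypothetical pre-patch callback; nonzero aborts the transition */
        static int demo_pre_patch(struct klp_object *obj)
        {
                pr_info("preparing to patch '%s'\n",
                        obj->name ? obj->name : "vmlinux");
                return 0;
        }

        /* hypothetical post-unpatch callback */
        static void demo_post_unpatch(struct klp_object *obj)
        {
                pr_info("object fully unpatched\n");
        }

        static struct klp_object objs[] = {
                {
                        .name = NULL,           /* NULL means vmlinux */
                        .funcs = funcs,         /* klp_func array, elsewhere */
                        .callbacks = {
                                .pre_patch = demo_pre_patch,
                                .post_unpatch = demo_post_unpatch,
                        },
                }, { }
        };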
      Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  5. 02 Oct, 2017 1 commit
  6. 14 Sep, 2017 1 commit
    • livepatch: introduce shadow variable API · 439e7271
      Joe Lawrence authored
      Add exported API for livepatch modules:
      
        klp_shadow_get()
        klp_shadow_alloc()
        klp_shadow_get_or_alloc()
        klp_shadow_free()
        klp_shadow_free_all()
      
      that implement "shadow" variables, which allow callers to associate
      new shadow fields with existing data structures.  This is intended to
      be used by livepatch modules seeking to emulate additions to data
      structure definitions.
      
      See Documentation/livepatch/shadow-vars.txt for a summary of the new
      shadow variable API, including a few common use cases.
      
      See samples/livepatch/livepatch-shadow-* for example modules that
      demonstrate shadow variables; a brief usage sketch follows below.
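
      A minimal usage sketch, assuming the signatures introduced here
      (klp_shadow_alloc(obj, id, data, size, gfp) copies *data into the new
      shadow variable; struct some_struct and the SV_COUNTER ID are
      hypothetical):

        #define SV_COUNTER      1       /* arbitrary caller-chosen ID */

        /* patched function: attach a new "field" to an existing object */
        static void patched_func(struct some_struct *obj)
        {
                unsigned long zero = 0;
                unsigned long *count;

                /* allocate on first use; safe against racing allocators */
                count = klp_shadow_get_or_alloc(obj, SV_COUNTER, &zero,
                                                sizeof(zero), GFP_KERNEL);
                if (count)
                        (*count)++;
        }

        /* patched destruction path: detach and free the shadow */
        static void patched_free(struct some_struct *obj)
        {
                klp_shadow_free(obj, SV_COUNTER);
                /* ... original freeing logic ... */
        }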
      
      [jkosina@suse.cz: fix __klp_shadow_get_or_alloc() comment as spotted by
       Josh]
      Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  7. 08 Mar, 2017 2 commits
    • livepatch: allow removal of a disabled patch · 3ec24776
      Josh Poimboeuf authored
      Currently we do not allow a patch module to unload, since there is no
      way to determine whether a task is still running in the patched code.
      
      The consistency model gives us such a way: when the unpatching
      finishes, we know that all tasks were marked as safe to call the
      original function. Thus every new call to the function invokes the
      original code, and at the same time no task can be anywhere in the
      patched code, because it had to leave that code to be marked as safe.
      
      We can safely let the patch module go after that.
      
      A completion is used for synchronization between module removal and
      the sysfs infrastructure, in a similar way to commit 942e4431
      ("module: Fix mod->mkobj.kobj potentially freed too early"); see the
      sketch below.
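
      A sketch of that completion-based synchronization (paraphrased and
      simplified from this commit):

        /* sysfs release callback: fires once all sysfs users are gone */
        static void klp_kobj_release_patch(struct kobject *kobj)
        {
                struct klp_patch *patch;

                patch = container_of(kobj, struct klp_patch, kobj);
                complete(&patch->finish);
        }

        int klp_unregister_patch(struct klp_patch *patch)
        {
                /* ... registration/enabled checks under klp_mutex ... */

                kobject_put(&patch->kobj);      /* klp_mutex must be dropped */
                wait_for_completion(&patch->finish);

                return 0;
        }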
      
      Note that we still do not allow the removal under the immediate model,
      that is, with no consistency model. The module refcount may increase
      in this case if somebody disables and enables the patch several times.
      This should not cause any harm.
      
      With this change, the call to try_module_get() is moved from
      klp_register_patch() to __klp_enable_patch() to make module reference
      counting symmetric (module_put() is in the patch disable path) and to
      allow taking a new reference to a disabled module when it is being
      enabled.
      
      Finally, we need to be very careful about possible races between the
      klp_unregister_patch() and kobject_put() functions and operations on
      the related sysfs files.
      
      kobject_put(&patch->kobj) must be called without klp_mutex. Otherwise,
      it might be blocked by enabled_store(), which needs the mutex as well.
      In addition, enabled_store() must check that the patch was not
      unregistered in the meantime.
      
      There is no need to do the same for the other kobject_put() call sites
      at the moment. Their sysfs operations neither take the lock nor access
      any data that might be freed in the meantime.
      
      There was an attempt to use kobjects the right way and prevent these
      races by design, but it made the patch definition more complicated and
      opened another can of worms. See
      https://lkml.kernel.org/r/1464018848-4303-1-git-send-email-pmladek@suse.com
      
      [Thanks to Petr Mladek for improving the commit message.]
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    • livepatch: change to a per-task consistency model · d83a7cb3
      Josh Poimboeuf authored
      Change livepatch to use a basic per-task consistency model.  This is the
      foundation which will eventually enable us to patch those ~10% of
      security patches which change function or data semantics.  This is the
      biggest remaining piece needed to make livepatch more generally useful.
      
      This code stems from the design proposal made by Vojtech [1] in November
      2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
      consistency and syscall barrier switching combined with kpatch's stack
      trace switching.  There are also a number of fallback options which make
      it quite flexible.
      
      Patches are applied on a per-task basis, when the task is deemed safe to
      switch over.  When a patch is enabled, livepatch enters into a
      transition state where tasks are converging to the patched state.
      Usually this transition state can complete in a few seconds.  The same
      sequence occurs when a patch is disabled, except the tasks converge from
      the patched state to the unpatched state.
      
      An interrupt handler inherits the patched state of the task it
      interrupts.  The same is true for forked tasks: the child inherits the
      patched state of the parent.
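
      A condensed sketch of how this per-task routing plays out in the
      ftrace handler (paraphrased from klp_ftrace_handler(); memory barriers
      and error handling omitted):

        static void notrace klp_ftrace_handler(unsigned long ip,
                                               unsigned long parent_ip,
                                               struct ftrace_ops *fops,
                                               struct pt_regs *regs)
        {
                struct klp_ops *ops = container_of(fops, struct klp_ops, fops);
                struct klp_func *func;

                rcu_read_lock();
                func = list_first_or_null_rcu(&ops->func_stack,
                                              struct klp_func, stack_node);
                if (!func)
                        goto unlock;

                /*
                 * During a transition, route each task according to its own
                 * patch_state: unpatched tasks keep calling the previous
                 * version of the function.
                 */
                if (unlikely(func->transition) &&
                    current->patch_state == KLP_UNPATCHED) {
                        func = list_next_entry(func, stack_node);
                        /* no previous patch: run the original function */
                        if (&func->stack_node == &ops->func_stack)
                                goto unlock;
                }

                klp_arch_set_pc(regs, (unsigned long)func->new_func);
        unlock:
                rcu_read_unlock();
        }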
      
      Livepatch uses several complementary approaches to determine when it's
      safe to patch tasks:
      
      1. The first and most effective approach is stack checking of sleeping
         tasks.  If no affected functions are on the stack of a given task,
         the task is patched.  In most cases this will patch most or all of
         the tasks on the first try.  Otherwise it'll keep trying
         periodically.  This option is only available if the architecture has
         reliable stacks (HAVE_RELIABLE_STACKTRACE).
      
      2. The second approach, if needed, is kernel exit switching.  A
         task is switched when it returns to user space from a system call, a
         user space IRQ, or a signal.  It's useful in the following cases:
      
         a) Patching I/O-bound user tasks which are sleeping on an affected
            function.  In this case you have to send SIGSTOP and SIGCONT to
            force it to exit the kernel and be patched.
         b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
            then it will get patched the next time it gets interrupted by an
            IRQ.
         c) In the future it could be useful for applying patches for
            architectures which don't yet have HAVE_RELIABLE_STACKTRACE.  In
            this case you would have to signal most of the tasks on the
            system.  However this isn't supported yet because there's
            currently no way to patch kthreads without
            HAVE_RELIABLE_STACKTRACE.
      
      3. For idle "swapper" tasks, since they don't ever exit the kernel, they
         instead have a klp_update_patch_state() call in the idle loop which
         allows them to be patched before the CPU enters the idle state (see
         the sketch after this list).
      
         (Note there's not yet such an approach for kthreads.)
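
      A sketch of the helper that the kernel-exit paths and the idle loop
      call (paraphrased from klp_update_patch_state(); the memory-barrier
      commentary is omitted):

        void klp_update_patch_state(struct task_struct *task)
        {
                rcu_read_lock();

                /* switch the task once its pending flag is consumed */
                if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
                        task->patch_state = READ_ONCE(klp_target_state);

                rcu_read_unlock();
        }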
      
      All the above approaches may be skipped by setting the 'immediate' flag
      in the 'klp_patch' struct, which will disable per-task consistency and
      patch all tasks immediately.  This can be useful if the patch doesn't
      change any function or data semantics.  Note that, even with this flag
      set, it's possible that some tasks may still be running with an old
      version of the function, until that function returns.
      
      There's also an 'immediate' flag in the 'klp_func' struct which allows
      you to specify that certain functions in the patch can be applied
      without per-task consistency.  This might be useful if you want to patch
      a common function like schedule(), and the function change doesn't need
      consistency but the rest of the patch does.
      
      For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
      must set patch->immediate which causes all tasks to be patched
      immediately.  This option should be used with care, only when the patch
      doesn't change any function or data semantics.
      
      In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
      may be allowed to use per-task consistency if we can come up with
      another way to patch kthreads.
      
      The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
      is in transition.  Only a single patch (the topmost patch on the stack)
      can be in transition at a given time.  A patch can remain in transition
      indefinitely, if any of the tasks are stuck in the initial patch state.
      
      A transition can be reversed and effectively canceled by writing the
      opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
      the transition is in progress.  Then all the tasks will attempt to
      converge back to the original patch state.
      
      [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz

      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Ingo Molnar <mingo@kernel.org>        # for the scheduler changes
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  8. 26 Jan, 2017 1 commit
  9. 11 Jan, 2017 1 commit
    • livepatch: doc: remove the limitation for schedule() patching · 372e2db7
      Miroslav Benes authored
      The Limitations section of the documentation describes the
      impossibility of live patching anything that is inlined into the
      __schedule() function. This was true until the 4.9 kernel. Thanks to
      commit 0100301b ("sched/x86: Rewrite the switch_to() code") from Brian
      Gerst, there is now a __switch_to_asm() function (implemented in
      assembly) which is called properly from context_switch(). RIP is thus
      saved on the stack, and a task returns to the proper version of the
      __schedule() et al. functions.
      
      Of course, __switch_to_asm() itself is not patchable, for the reason
      described in the section. But it contains no __fentry__ call, and I
      cannot imagine a reason to patch it anyway.
      
      Therefore, remove the paragraphs from the section.
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  10. 09 Dec, 2016 1 commit
  11. 18 Aug, 2016 1 commit
  12. 27 Apr, 2016 1 commit
    • livepatch: Add some basic livepatch documentation · 5e4e3844
      Petr Mladek authored
      The livepatch framework definitely deserves some documentation. This
      is an attempt to provide some basic information. I hope that it will
      be useful for both livepatch producers and potential developers of
      the framework itself.
      
      [jkosina@suse.cz:
       - incorporated feedback (grammar fixes) from
         Chris J Arges <chris.j.arges@canonical.com>
       - s/LivePatch/livepatch in changelog as pointed out by
         Josh Poimboeuf <jpoimboe@redhat.com>
       - incorporated part of feedback (grammar fixes / reformulations) from
         Balbir Singh <bsingharora@gmail.com>
      ]
      Acked-by: Jessica Yu <jeyu@redhat.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  13. 01 Apr, 2016 1 commit