Skip to content
  • Seiji Aguchi's avatar
    efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed · e0d59733
    Seiji Aguchi authored
    Currently, when mounting pstore file system, a read callback of
    efi_pstore driver runs mutiple times as below.
    
    - In the first read callback, scan efivar_sysfs_list from head and pass
      a kmsg buffer of a entry to an upper pstore layer.
    - In the second read callback, rescan efivar_sysfs_list from the entry
      and pass another kmsg buffer to it.
    - Repeat the scan and pass until the end of efivar_sysfs_list.
    
    In this process, an entry is read across the multiple read function
    calls. To avoid race between the read and erasion, the whole process
    above is protected by a spinlock, holding in open() and releasing in
    close().
    
    At the same time, kmemdup() is called to pass the buffer to pstore
    filesystem during it. And then, it causes a following lockdep warning.
    
    To make the dynamic memory allocation runnable without taking spinlock,
    holding off a deletion of sysfs entry if it happens while scanning it
    via efi_pstore, and deleting it after the scan is completed.
    
    To implement it, this patch introduces two flags, scanning and deleting,
    to efivar_entry.
    
    On the code basis, it seems that all the scanning and deleting logic is
    not needed because __efivars->lock are not dropped when reading from the
    EFI variable store.
    
    But, the scanning and deleting logic is still needed because an
    efi-pstore and a pstore filesystem works as follows.
    
    In case an entry(A) is found, the pointer is saved to psi->data.  And
    efi_pstore_read() passes the entry(A) to a pstore filesystem by
    releasing  __efivars->lock.
    
    And then, the pstore filesystem calls efi_pstore_read() again and the
    same entry(A), which is saved to psi->data, is used for resuming to scan
    a sysfs-list.
    
    So, to protect the entry(A), the logic is needed.
    
    [    1.143710] ------------[ cut here ]------------
    [    1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
    [    1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
    [    1.144058] Modules linked in:
    [    1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
    
    
    [    1.144058]  0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
    [    1.144058]  ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
    [    1.144058]  00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
    [    1.144058] Call Trace:
    [    1.144058]  [<ffffffff816614a5>] dump_stack+0x54/0x74
    [    1.144058]  [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
    [    1.144058]  [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
    [    1.144058]  [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
    [    1.144058]  [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
    [    1.144058]  [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
    [    1.144058]  [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
    [    1.144058]  [<ffffffff8115b260>] kmemdup+0x20/0x50
    [    1.144058]  [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
    [    1.144058]  [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
    [    1.144058]  [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
    [    1.144058]  [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
    [    1.144058]  [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
    [    1.144058]  [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
    [    1.158207]  [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
    [    1.158207]  [<ffffffff8128ce30>] ? parse_options+0x80/0x80
    [    1.158207]  [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
    [    1.158207]  [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
    [    1.158207]  [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
    [    1.158207]  [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
    [    1.158207]  [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
    [    1.158207]  [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
    [    1.158207]  [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
    [    1.158207]  [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
    [    1.158207]  [<ffffffff811cc373>] SyS_mount+0x83/0xc0
    [    1.158207]  [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
    [    1.158207] ---[ end trace 61981bc62de9f6f4 ]---
    
    Signed-off-by: default avatarSeiji Aguchi <seiji.aguchi@hds.com>
    Tested-by: default avatarMadper Xie <cxie@redhat.com>
    Cc: stable@kernel.org
    Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
    e0d59733