- Nov 18, 2021
-
-
Miquel Raynal authored
commit 6bcd2960 upstream. Following the introduction of the generic ECC engine infrastructure, it was necessary to reorganize the code and move the ECC configuration in the ->attach_chip() hook. Failing to do that properly lead to a first series of fixes supposed to stabilize the situation. Unfortunately, this only fixed the use of software ECC engines, preventing any other kind of engine to be used, including on-die ones. It is now time to (finally) fix the situation by ensuring that we still provide a default (eg. software ECC) but will still support different ECC engines such as on-die ECC engines if properly described in the device tree. There are no changes needed on the core side in order to do this, but we just need to leverage the logic there which allows: 1- a subsystem default (set to Host engines in the raw NAND world) 2- a driver specific default (here set to software ECC engines) 3- any type of engine requested by the user (ie. described in the DT) As the raw NAND subsystem has not yet been fully converted to the ECC engine infrastructure, in order to provide a default ECC engine for this driver we need to set chip->ecc.engine_type *before* calling nand_scan(). During the initialization step, the core will consider this entry as the default engine for this driver. This value may of course be overloaded by the user if the usual DT properties are provided. Fixes: d525914b ("mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()") Cc: stable@vger.kernel.org Cc: Jan Hoffmann <jan@3e8.eu> Cc: Kestrel seventyfour <kestrelseventyfour@gmail.com> Signed-off-by:
Miquel Raynal <miquel.raynal@bootlin.com> Tested-by:
Jan Hoffmann <jan@3e8.eu> Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-10-miquel.raynal@bootlin.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miquel Raynal authored
commit d707bb74 upstream. Following the introduction of the generic ECC engine infrastructure, it was necessary to reorganize the code and move the ECC configuration in the ->attach_chip() hook. Failing to do that properly lead to a first series of fixes supposed to stabilize the situation. Unfortunately, this only fixed the use of software ECC engines, preventing any other kind of engine to be used, including on-die ones. It is now time to (finally) fix the situation by ensuring that we still provide a default (eg. software ECC) but will still support different ECC engines such as on-die ECC engines if properly described in the device tree. There are no changes needed on the core side in order to do this, but we just need to leverage the logic there which allows: 1- a subsystem default (set to Host engines in the raw NAND world) 2- a driver specific default (here set to software ECC engines) 3- any type of engine requested by the user (ie. described in the DT) As the raw NAND subsystem has not yet been fully converted to the ECC engine infrastructure, in order to provide a default ECC engine for this driver we need to set chip->ecc.engine_type *before* calling nand_scan(). During the initialization step, the core will consider this entry as the default engine for this driver. This value may of course be overloaded by the user if the usual DT properties are provided. Fixes: 59d93473 ("mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()") Cc: stable@vger.kernel.org Signed-off-by:
Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-2-miquel.raynal@bootlin.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miquel Raynal authored
commit 9be1446e upstream. The introduction of the generic ECC engine API lead to a number of changes in various drivers which broke some of them. Here is a typical example: I expected the SM_ORDER option to be handled by the Hamming ECC engine internals. Problem: the fsmc driver does not instantiate (yet) a real ECC engine object so we had to use a 'bare' ECC helper instead of the shiny rawnand functions. However, when not intializing this engine properly and using the bare helpers, we do not get the SM ORDER feature handled automatically. It looks like this was lost in the process so let's ensure we use the right SM ORDER now. Fixes: ad9ffdce ("mtd: rawnand: fsmc: Fix external use of SW Hamming ECC helper") Cc: stable@vger.kernel.org Signed-off-by:
Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210928221507.199198-2-miquel.raynal@bootlin.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dong Aisheng authored
commit e90547d5 upstream. Usually the dash '-' is preferred in node name. So far, not dts in upstream kernel, so we just update node name in driver. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Fixes: 5e4c1243 ("remoteproc: imx_rproc: support remote cores booted before Linux Kernel") Reviewed-and-tested-by:
Peng Fan <peng.fan@nxp.com> Signed-off-by:
Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by:
Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-6-peng.fan@oss.nxp.com Signed-off-by:
Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by:
Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dong Aisheng authored
commit afe670e2 upstream. vdev regions are typically named vdev0buffer, vdev0ring0, vdev0ring1 and etc. Change to strncmp to cover them all. Fixes: 8f2d8961 ("remoteproc: imx_rproc: ignore mapping vdev regions") Reviewed-and-tested-by:
Peng Fan <peng.fan@nxp.com> Signed-off-by:
Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by:
Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-5-peng.fan@oss.nxp.com Signed-off-by:
Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by:
Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dong Aisheng authored
commit 970675f6 upstream. Currently the is_iomem is a random value in the stack which may be default to true even on those platforms that not use iomem to store firmware. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Fixes: 40df0a91 ("remoteproc: add is_iomem to da_to_va") Reviewed-and-tested-by:
Peng Fan <peng.fan@nxp.com> Signed-off-by:
Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by:
Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-3-peng.fan@oss.nxp.com Signed-off-by:
Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by:
Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Fan authored
commit 24acbd9d upstream. It seems luckliy work on i.MX platform, but it is wrong. Need use memcpy_toio, not memcpy_fromio. Fixes: 40df0a91 ("remoteproc: add is_iomem to da_to_va") Tested-by: Dong Aisheng <aisheng.dong@nxp.com> (i.MX8MQ) Reported-by:
kernel test robot <lkp@intel.com> Reported-by:
Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by:
Peng Fan <peng.fan@nxp.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210910090621.3073540-2-peng.fan@oss.nxp.com Signed-off-by:
Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by:
Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Halil Pasic authored
commit ad9a1451 upstream. Since commit 48720ba5 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers") we were supposed to make sure that virtio_ccw_release_dev() completes before the ccw device and the attached dma pool are torn down, but unfortunately we did not. Before that commit it used to be OK to delay cleaning up the memory allocated by virtio-ccw indefinitely (which isn't really intuitive for guys used to destruction happens in reverse construction order), but now we trigger a BUG_ON if the genpool is destroyed before all memory allocated from it is deallocated. Which brings down the guest. We can observe this problem, when unregister_virtio_device() does not give up the last reference to the virtio_device (e.g. because a virtio-scsi attached scsi disk got removed without previously unmounting its previously mounted partition). To make sure that the genpool is only destroyed after all the necessary freeing is done let us take a reference on the ccw device on each ccw_device_dma_zalloc() and give it up on each ccw_device_dma_free(). Actually there are multiple approaches to fixing the problem at hand that can work. The upside of this one is that it is the safest one while remaining simple. We don't crash the guest even if the driver does not pair allocations and frees. The downside is the reference counting overhead, that the reference counting for ccw devices becomes more complex, in a sense that we need to pair the calls to the aforementioned functions for it to be correct, and that if we happen to leak, we leak more than necessary (the whole ccw device instead of just the genpool). Some alternatives to this approach are taking a reference in virtio_ccw_online() and giving it up in virtio_ccw_release_dev() or making sure virtio_ccw_release_dev() completes its work before virtio_ccw_remove() returns. The downside of these approaches is that these are less safe against programming errors. Cc: <stable@vger.kernel.org> # v5.3 Signed-off-by:
Halil Pasic <pasic@linux.ibm.com> Fixes: 48720ba5 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers") Reported-by:
<bfu@redhat.com> Reviewed-by:
Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by:
Cornelia Huck <cohuck@redhat.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Harald Freudenberger authored
commit 3826350e upstream. When a queue is switched to soft offline during heavy load and later switched to soft online again and now used, it may be that the caller is blocked forever in the ioctl call. The failure occurs because there is a pending reply after the queue(s) have been switched to offline. This orphaned reply is received when the queue is switched to online and is accidentally counted for the outstanding replies. So when there was a valid outstanding reply and this orphaned reply is received it counts as the outstanding one thus dropping the outstanding counter to 0. Voila, with this counter the receive function is not called any more and the real outstanding reply is never received (until another request comes in...) and the ioctl blocks. The fix is simple. However, instead of readjusting the counter when an orphaned reply is detected, I check the queue status for not empty and compare this to the outstanding counter. So if the queue is not empty then the counter must not drop to 0 but at least have a value of 1. Signed-off-by:
Harald Freudenberger <freude@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sven Schnelle authored
commit 213fca9e upstream. commit 9c6c273a ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") changed the timer setup from init_timer_on_stack(() to timer_setup(), but missed to change the mod_timer() call. And while at it, use msecs_to_jiffies() instead of the open coded timeout calculation. Cc: stable@vger.kernel.org Fixes: 9c6c273a ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") Signed-off-by:
Sven Schnelle <svens@linux.ibm.com> Reviewed-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vineeth Vijayan authored
commit a4751f15 upstream. Check the validity of subchanel before reading other fields in the schib. Fixes: d3683c05 ("s390/cio: add dev_busid sysfs entry for each subchannel") CC: <stable@vger.kernel.org> Reported-by:
Cornelia Huck <cohuck@redhat.com> Signed-off-by:
Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by:
Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20211105154451.847288-1-vneethv@linux.ibm.com Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Richter authored
commit 9d48c7af upstream. When a CPU is hotplugged while the perf stat -e cycles command is running, a wrong (very large) value is displayed immediately after the CPU removal: Check the values, shouldn't be too high as in time counts unit events 1.001101919 29261846 cycles 2.002454499 17523405 cycles 3.003659292 24361161 cycles 4.004816983 18446744073638406144 cycles 5.005671647 <not counted> cycles ... The CPU hotplug off took place after 3 seconds. The issue is the read of the event count value after 4 seconds when the CPU is not available and the read of the counter returns an error. This is treated as a counter value of zero. This results in a very large value (0 - previous_value). Fix this by detecting the hotplugged off CPU and report 0 instead of a very large number. Cc: stable@vger.kernel.org Fixes: a029a4ea ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") Reported-by:
Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by:
Thomas Richter <tmricht@linux.ibm.com> Reviewed-by:
Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
commit 2aa36604 upstream. It is generally unsafe to call put_device() with dpm_list_mtx held, because the given device's release routine may carry out an action depending on that lock which then may deadlock, so modify the system-wide suspend and resume of devices to always drop dpm_list_mtx before calling put_device() (and adjust white space somewhat while at it). For instance, this prevents the following splat from showing up in the kernel log after a system resume in certain configurations: [ 3290.969514] ====================================================== [ 3290.969517] WARNING: possible circular locking dependency detected [ 3290.969519] 5.15.0+ #2420 Tainted: G S [ 3290.969523] ------------------------------------------------------ [ 3290.969525] systemd-sleep/4553 is trying to acquire lock: [ 3290.969529] ffff888117ab1138 ((wq_completion)hci0#2){+.+.}-{0:0}, at: flush_workqueue+0x87/0x4a0 [ 3290.969554] but task is already holding lock: [ 3290.969556] ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0 [ 3290.969571] which lock already depends on the new lock. [ 3290.969573] the existing dependency chain (in reverse order) is: [ 3290.969575] -> #3 (dpm_list_mtx){+.+.}-{3:3}: [ 3290.969583] __mutex_lock+0x9d/0xa30 [ 3290.969591] device_pm_add+0x2e/0xe0 [ 3290.969597] device_add+0x4d5/0x8f0 [ 3290.969605] hci_conn_add_sysfs+0x43/0xb0 [bluetooth] [ 3290.969689] hci_conn_complete_evt.isra.71+0x124/0x750 [bluetooth] [ 3290.969747] hci_event_packet+0xd6c/0x28a0 [bluetooth] [ 3290.969798] hci_rx_work+0x213/0x640 [bluetooth] [ 3290.969842] process_one_work+0x2aa/0x650 [ 3290.969851] worker_thread+0x39/0x400 [ 3290.969859] kthread+0x142/0x170 [ 3290.969865] ret_from_fork+0x22/0x30 [ 3290.969872] -> #2 (&hdev->lock){+.+.}-{3:3}: [ 3290.969881] __mutex_lock+0x9d/0xa30 [ 3290.969887] hci_event_packet+0xba/0x28a0 [bluetooth] [ 3290.969935] hci_rx_work+0x213/0x640 [bluetooth] [ 3290.969978] process_one_work+0x2aa/0x650 [ 3290.969985] worker_thread+0x39/0x400 [ 3290.969993] kthread+0x142/0x170 [ 3290.969999] ret_from_fork+0x22/0x30 [ 3290.970004] -> #1 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}: [ 3290.970013] process_one_work+0x27d/0x650 [ 3290.970020] worker_thread+0x39/0x400 [ 3290.970028] kthread+0x142/0x170 [ 3290.970033] ret_from_fork+0x22/0x30 [ 3290.970038] -> #0 ((wq_completion)hci0#2){+.+.}-{0:0}: [ 3290.970047] __lock_acquire+0x15cb/0x1b50 [ 3290.970054] lock_acquire+0x26c/0x300 [ 3290.970059] flush_workqueue+0xae/0x4a0 [ 3290.970066] drain_workqueue+0xa1/0x130 [ 3290.970073] destroy_workqueue+0x34/0x1f0 [ 3290.970081] hci_release_dev+0x49/0x180 [bluetooth] [ 3290.970130] bt_host_release+0x1d/0x30 [bluetooth] [ 3290.970195] device_release+0x33/0x90 [ 3290.970201] kobject_release+0x63/0x160 [ 3290.970211] dpm_resume+0x164/0x3e0 [ 3290.970215] dpm_resume_end+0xd/0x20 [ 3290.970220] suspend_devices_and_enter+0x1a4/0xba0 [ 3290.970229] pm_suspend+0x26b/0x310 [ 3290.970236] state_store+0x42/0x90 [ 3290.970243] kernfs_fop_write_iter+0x135/0x1b0 [ 3290.970251] new_sync_write+0x125/0x1c0 [ 3290.970257] vfs_write+0x360/0x3c0 [ 3290.970263] ksys_write+0xa7/0xe0 [ 3290.970269] do_syscall_64+0x3a/0x80 [ 3290.970276] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3290.970284] other info that might help us debug this: [ 3290.970285] Chain exists of: (wq_completion)hci0#2 --> &hdev->lock --> dpm_list_mtx [ 3290.970297] Possible unsafe locking scenario: [ 3290.970299] CPU0 CPU1 [ 3290.970300] ---- ---- [ 3290.970302] lock(dpm_list_mtx); [ 3290.970306] lock(&hdev->lock); [ 3290.970310] lock(dpm_list_mtx); [ 3290.970314] lock((wq_completion)hci0#2); [ 3290.970319] *** DEADLOCK *** [ 3290.970321] 7 locks held by systemd-sleep/4553: [ 3290.970325] #0: ffff888103bcd448 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xa7/0xe0 [ 3290.970341] #1: ffff888115a14488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x103/0x1b0 [ 3290.970355] #2: ffff888100f719e0 (kn->active#233){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1b0 [ 3290.970369] #3: ffffffff82661048 (autosleep_lock){+.+.}-{3:3}, at: state_store+0x12/0x90 [ 3290.970384] #4: ffffffff82658ac8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x9f/0x310 [ 3290.970399] #5: ffffffff827f2a48 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x4c/0x80 [ 3290.970416] #6: ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0 [ 3290.970428] stack backtrace: [ 3290.970431] CPU: 3 PID: 4553 Comm: systemd-sleep Tainted: G S 5.15.0+ #2420 [ 3290.970438] Hardware name: Dell Inc. XPS 13 9380/0RYJWW, BIOS 1.5.0 06/03/2019 [ 3290.970441] Call Trace: [ 3290.970446] dump_stack_lvl+0x44/0x57 [ 3290.970454] check_noncircular+0x105/0x120 [ 3290.970468] ? __lock_acquire+0x15cb/0x1b50 [ 3290.970474] __lock_acquire+0x15cb/0x1b50 [ 3290.970487] lock_acquire+0x26c/0x300 [ 3290.970493] ? flush_workqueue+0x87/0x4a0 [ 3290.970503] ? __raw_spin_lock_init+0x3b/0x60 [ 3290.970510] ? lockdep_init_map_type+0x58/0x240 [ 3290.970519] flush_workqueue+0xae/0x4a0 [ 3290.970526] ? flush_workqueue+0x87/0x4a0 [ 3290.970544] ? drain_workqueue+0xa1/0x130 [ 3290.970552] drain_workqueue+0xa1/0x130 [ 3290.970561] destroy_workqueue+0x34/0x1f0 [ 3290.970572] hci_release_dev+0x49/0x180 [bluetooth] [ 3290.970624] bt_host_release+0x1d/0x30 [bluetooth] [ 3290.970687] device_release+0x33/0x90 [ 3290.970695] kobject_release+0x63/0x160 [ 3290.970705] dpm_resume+0x164/0x3e0 [ 3290.970710] ? dpm_resume_early+0x251/0x3b0 [ 3290.970718] dpm_resume_end+0xd/0x20 [ 3290.970723] suspend_devices_and_enter+0x1a4/0xba0 [ 3290.970737] pm_suspend+0x26b/0x310 [ 3290.970746] state_store+0x42/0x90 [ 3290.970755] kernfs_fop_write_iter+0x135/0x1b0 [ 3290.970764] new_sync_write+0x125/0x1c0 [ 3290.970777] vfs_write+0x360/0x3c0 [ 3290.970785] ksys_write+0xa7/0xe0 [ 3290.970794] do_syscall_64+0x3a/0x80 [ 3290.970803] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3290.970811] RIP: 0033:0x7f41b1328164 [ 3290.970819] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 4a d2 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83 [ 3290.970824] RSP: 002b:00007ffe6ae21b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 3290.970831] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f41b1328164 [ 3290.970836] RDX: 0000000000000004 RSI: 000055965e651070 RDI: 0000000000000004 [ 3290.970839] RBP: 000055965e651070 R08: 000055965e64f390 R09: 00007f41b1e3d1c0 [ 3290.970843] R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000004 [ 3290.970846] R13: 0000000000000001 R14: 000055965e64f2b0 R15: 0000000000000004 Cc: All applicable <stable@vger.kernel.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Coly Li authored
commit 2878feae upstream. This reverts commit 2fd3e5ef. The above commit replaces page_address(bv->bv_page) by bvec_virt(bv) to avoid directly access to bv->bv_page, but in situation bv->bv_offset is not zero and page_address(bv->bv_page) is not equal to bvec_virt(bv). In such case a memory corruption may happen because memory in next page is tainted by following line in do_btree_node_write(), memcpy(bvec_virt(bv), addr, PAGE_SIZE); This patch reverts the mentioned commit to avoid the memory corruption. Fixes: 2fd3e5ef ("bcache: use bvec_virt") Signed-off-by:
Coly Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # 5.15 Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211103151041.70516-1-colyli@suse.de Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Coly Li authored
commit 8468f450 upstream. In bcache_device_free(), pointer disk is referenced still in ida_simple_remove() after blk_cleanup_disk() gets called on this pointer. This may cause a potential panic by use-after-free on the disk pointer. This patch fixes the problem by calling blk_cleanup_disk() after ida_simple_remove(). Fixes: bc70852f ("bcache: convert to blk_alloc_disk/blk_cleanup_disk") Signed-off-by:
Coly Li <colyli@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: stable@vger.kernel.org # v5.14+ Reviewed-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211103064917.67383-1-colyli@suse.de Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marek Vasut authored
commit 33a5471f upstream. The note in c2adda27 ("video: backlight: Add of_find_backlight helper in backlight.c") says that gpio-backlight uses brightness as power state. This has been fixed since in ec665b75 ("backlight: gpio-backlight: Correct initial power state handling") and other backlight drivers do not require this workaround. Drop the workaround. This fixes the case where e.g. pwm-backlight can perfectly well be set to brightness 0 on boot in DT, which without this patch leads to the display brightness to be max instead of off. Fixes: c2adda27 ("video: backlight: Add of_find_backlight helper in backlight.c") Cc: <stable@vger.kernel.org> # 5.4+ Cc: <stable@vger.kernel.org> # 4.19.x: ec665b75: backlight: gpio-backlight: Correct initial power state handling Signed-off-by:
Marek Vasut <marex@denx.de> Acked-by:
Noralf Trønnes <noralf@tronnes.org> Reviewed-by:
Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by:
Lee Jones <lee.jones@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jack Andersen authored
commit 313c84b5 upstream. This patch extends the DLN2 driver; adding cell for adc_dln2 module. The original patch[1] fell through the cracks when the driver was added so ADC has never actually been usable. That patch did not have ACPI support which was added in v5.9, so the oldest supported version this current patch can be backported to is 5.10. [1] https://www.spinics.net/lists/linux-iio/msg33975.html Cc: <stable@vger.kernel.org> # 5.10+ Signed-off-by:
Jack Andersen <jackoalan@gmail.com> Signed-off-by:
Noralf Trønnes <noralf@tronnes.org> Signed-off-by:
Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20211018112541.25466-1-noralf@tronnes.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rongwei Wang authored
commit 8468e937 upstream. When truncating pagecache on file THP, the private pages of a process should not be unmapped mapping. This incorrect behavior on a dynamic shared libraries which will cause related processes to happen core dump. A simple test for a DSO (Prerequisite is the DSO mapped in file THP): int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); } close(fd); return 0; } The test only to open a target DSO, and do nothing. But this operation will lead one or more process to happen core dump. This patch mainly to fix this bug. Link: https://lkml.kernel.org/r/20211025092134.18562-3-rongwei.wang@linux.alibaba.com Fixes: eb6ecbed ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by:
Rongwei Wang <rongwei.wang@linux.alibaba.com> Tested-by:
Xu Yu <xuyu@linux.alibaba.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Song Liu <song@kernel.org> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Collin Fijalkovich <cfijalkovich@google.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rongwei Wang authored
commit 55fc0d91 upstream. Patch series "fix two bugs for file THP". This patch (of 2): Transparent huge page has supported read-only non-shmem files. The file- backed THP is collapsed by khugepaged and truncated when written (for shared libraries). However, there is a race when multiple writers truncate the same page cache concurrently. In that case, subpage(s) of file THP can be revealed by find_get_entry in truncate_inode_pages_range, which will trigger PageTail BUG_ON in truncate_inode_page, as follows: page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0 flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head) raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000 head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000 page dumped because: VM_BUG_ON_PAGE(PageTail(page)) ------------[ cut here ]------------ kernel BUG at mm/truncate.c:213! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ... CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ... Hardware name: ECS, BIOS 0.0.0 02/06/2015 pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) Call trace: truncate_inode_page+0x64/0x70 truncate_inode_pages_range+0x550/0x7e4 truncate_pagecache+0x58/0x80 do_dentry_open+0x1e4/0x3c0 vfs_open+0x38/0x44 do_open+0x1f0/0x310 path_openat+0x114/0x1dc do_filp_open+0x84/0x134 do_sys_openat2+0xbc/0x164 __arm64_sys_openat+0x74/0xc0 el0_svc_common.constprop.0+0x88/0x220 do_el0_svc+0x30/0xa0 el0_svc+0x20/0x30 el0_sync_handler+0x1a4/0x1b0 el0_sync+0x180/0x1c0 Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000) This patch mainly to lock filemap when one enter truncate_pagecache(), avoiding truncating the same page cache concurrently. Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba.com Fixes: eb6ecbed ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by:
Xu Yu <xuyu@linux.alibaba.com> Signed-off-by:
Rongwei Wang <rongwei.wang@linux.alibaba.com> Suggested-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by:
Song Liu <song@kernel.org> Cc: Collin Fijalkovich <cfijalkovich@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michal Hocko authored
commit 60e2793d upstream. Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory. This can happen for 2 different reasons. a) Memcg is out of memory and we rely on mem_cgroup_oom_synchronize to perform the memcg OOM handling or b) normal allocation fails. The latter is quite problematic because allocation paths already trigger out_of_memory and the page allocator tries really hard to not fail allocations. Anyway, if the OOM killer has been already invoked there is no reason to invoke it again from the #PF path. Especially when the OOM condition might be gone by that time and we have no way to find out other than allocate. Moreover if the allocation failed and the OOM killer hasn't been invoked then we are unlikely to do the right thing from the #PF context because we have already lost the allocation context and restictions and therefore might oom kill a task from a different NUMA domain. This all suggests that there is no legitimate reason to trigger out_of_memory from pagefault_out_of_memory so drop it. Just to be sure that no #PF path returns with VM_FAULT_OOM without allocation print a warning that this is happening before we restart the #PF. [VvS: #PF allocation can hit into limit of cgroup v1 kmem controller. This is a local problem related to memcg, however, it causes unnecessary global OOM kills that are repeated over and over again and escalate into a real disaster. This has been broken since kmem accounting has been introduced for cgroup v1 (3.8). There was no kmem specific reclaim for the separate limit so the only way to handle kmem hard limit was to return with ENOMEM. In upstream the problem will be fixed by removing the outdated kmem limit, however stable and LTS kernels cannot do it and are still affected. This patch fixes the problem and should be backported into stable/LTS.] Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com Signed-off-by:
Michal Hocko <mhocko@suse.com> Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit 0b28179a upstream. Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It can be misused and allowed to trigger global OOM from inside a memcg-limited container. On the other hand if memcg fails allocation, called from inside #PF handler it triggers global OOM from inside pagefault_out_of_memory(). To prevent these problems this patchset: (a) removes execution of out_of_memory() from pagefault_out_of_memory(), becasue nobody can explain why it is necessary. (b) allow memcg to fail allocation of dying/killed tasks. This patch (of 3): Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory which in turn executes out_out_memory() and can kill a random task. An allocation might fail when the current task is the oom victim and there are no memory reserves left. The OOM killer is already handled at the page allocator level for the global OOM and at the charging level for the memcg one. Both have much more information about the scope of allocation/charge request. This means that either the OOM killer has been invoked properly and didn't lead to the allocation success or it has been skipped because it couldn't have been invoked. In both cases triggering it from here is pointless and even harmful. It makes much more sense to let the killed task die rather than to wake up an eternally hungry oom-killer and send him to choose a fatter victim for breakfast. Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Suggested-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit a4ebf1b6 upstream. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It is assumed that the amount of the memory charged by those tasks is bound and most of the memory will get released while the task is exiting. This is resembling a heuristic for the global OOM situation when tasks get access to memory reserves. There is no global memory shortage at the memcg level so the memcg heuristic is more relieved. The above assumption is overly optimistic though. E.g. vmalloc can scale to really large requests and the heuristic would allow that. We used to have an early break in the vmalloc allocator for killed tasks but this has been reverted by commit b8c8a338 ("Revert "vmalloc: back off when the current task is killed""). There are likely other similar code paths which do not check for fatal signals in an allocation&charge loop. Also there are some kernel objects charged to a memcg which are not bound to a process life time. It has been observed that it is not really hard to trigger these bypasses and cause global OOM situation. One potential way to address these runaways would be to limit the amount of excess (similar to the global OOM with limited oom reserves). This is certainly possible but it is not really clear how much of an excess is desirable and still protects from global OOMs as that would have to consider the overall memcg configuration. This patch is addressing the problem by removing the heuristic altogether. Bypass is only allowed for requests which either cannot fail or where the failure is not desirable while excess should be still limited (e.g. atomic requests). Implementation wise a killed or dying task fails to charge if it has passed the OOM killer stage. That should give all forms of reclaim chance to restore the limit before the failure (ENOMEM) and tell the caller to back off. In addition, this patch renames should_force_charge() helper to task_is_dying() because now its use is not associated witch forced charging. This patch depends on pagefault_out_of_memory() to not trigger out_of_memory(), because then a memcg failure can unwind to VM_FAULT_OOM and cause a global OOM killer. Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Suggested-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Shakeel Butt <shakeelb@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Matthew Wilcox (Oracle) authored
commit d417b49f upstream. It is not safe to check page->index without holding the page lock. It can be changed if the page is moved between the swap cache and the page cache for a shmem file, for example. There is a VM_BUG_ON below which checks page->index is correct after taking the page lock. Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org Fixes: 5c211ba2 ("mm: add and use find_lock_entries") Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by:
<syzbot+c87be4f669d920c76330@syzkaller.appspotmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dominique Martinet authored
commit 27eb4c31 upstream. Link: https://lkml.kernel.org/r/99338965-d36c-886e-cd0e-1d8fff2b4746@gmail.com Reported-by:
<syzbot+06472778c97ed94af66d@syzkaller.appspotmail.com> Cc: stable@vger.kernel.org Signed-off-by:
Dominique Martinet <asmadeus@codewreck.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Borkmann authored
[ Upstream commit 3dc20f47 ] Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT state and NTF_USE flag with a dynamic NUD state from a user space control plane. Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing neighbor entry in combination with NTF_USE flag. This is due to the latter directly calling into neigh_event_send() without any meta data updates as happening in __neigh_update(). Thus, to enable this use case, extend the latter with a NEIGH_UPDATE_F_USE flag where we break the NUD_PERMANENT state in particular so that a latter neigh_event_send() is able to re-resolve a neighbor entry. Before fix, NUD_PERMANENT -> NUD_* & NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] As can be seen, despite the admin-triggered replace, the entry remains in the NUD_PERMANENT state. After fix, NUD_PERMANENT -> NUD_* & NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] After the fix, the admin-triggered replace switches to a dynamic state from the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can transition back from there, if needed, into NUD_PERMANENT. Similar before/after behavior can be observed for below transitions: Before fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] After fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [..] Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Acked-by:
Roopa Prabhu <roopa@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Anatolij Gustschin authored
commit adec566b upstream. memset() and memcpy() on an MMIO region like here results in a lockup at startup on mpc5200 platform (since this first happens during probing of the ATA and Ethernet drivers). Use memset_io() and memcpy_toio() instead. Fixes: 2f9ea1bd ("bestcomm: core bestcomm support for Freescale MPC5200") Cc: stable@vger.kernel.org # v5.14+ Signed-off-by:
Anatolij Gustschin <agust@denx.de> Link: https://lore.kernel.org/r/20211014094012.21286-1-agust@denx.de Signed-off-by:
Vinod Koul <vkoul@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kishon Vijay Abraham I authored
commit eb91224e upstream. udma_get_*() checks if rchan/tchan/rflow is already allocated by checking if it has a NON NULL value. For the error cases, rchan/tchan/rflow will have error value and udma_get_*() considers this as already allocated (PASS) since the error values are NON NULL. This results in NULL pointer dereference error while de-referencing rchan/tchan/rflow. Reset the value of rchan/tchan/rflow to NULL if a channel request fails. CC: stable@vger.kernel.org Acked-by:
Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by:
Kishon Vijay Abraham I <kishon@ti.com> Link: https://lore.kernel.org/r/20211031032411.27235-3-kishon@ti.com Signed-off-by:
Vinod Koul <vkoul@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kishon Vijay Abraham I authored
commit 5c6c6d60 upstream. bcdma_get_*() checks if bchan is already allocated by checking if it has a NON NULL value. For the error cases, bchan will have error value and bcdma_get_*() considers this as already allocated (PASS) since the error values are NON NULL. This results in NULL pointer dereference error while de-referencing bchan. Reset the value of bchan to NULL if a channel request fails. CC: stable@vger.kernel.org Acked-by:
Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by:
Kishon Vijay Abraham I <kishon@ti.com> Link: https://lore.kernel.org/r/20211031032411.27235-2-kishon@ti.com Signed-off-by:
Vinod Koul <vkoul@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Namjae Jeon authored
commit b53ad810 upstream. When validating request length in ksmbd_check_message, 8byte alignment is not needed for compound request. It can cause wrong validation of request length. Fixes: e2f34481 ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org # v5.15 Acked-by:
Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by:
Namjae Jeon <linkinjeon@kernel.org> Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marios Makassikis authored
commit 78f1688a upstream. The validate_negotiate_info_req struct definition includes an extra field to access the data coming after the header. This causes the check in fsctl_validate_negotiate_info() to count the first element of the array twice. This in turn makes some valid requests fail, depending on whether they include padding or not. Fixes: f7db8fd0 ("ksmbd: add validation in smb2_ioctl") Cc: stable@vger.kernel.org # v5.15 Acked-by:
Namjae Jeon <linkinjeon@kernel.org> Acked-by:
Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by:
Marios Makassikis <mmakassikis@freebox.fr> Signed-off-by:
Steve French <stfrench@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shin'ichiro Kawasaki authored
commit 86399ea0 upstream. When BLKRESETZONE ioctl and data read race, the data read leaves stale page cache. The commit e5113505 ("block: Discard page cache of zone reset target range") added page cache truncation to avoid stale page cache after the ioctl. However, the stale page cache still can be read during the reset zone operation for the ioctl. To avoid the stale page cache completely, hold invalidate_lock of the block device file mapping. Fixes: e5113505 ("block: Discard page cache of zone reset target range") Signed-off-by:
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.15 Reviewed-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211111085238.942492-1-shinichiro.kawasaki@wdc.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shin'ichiro Kawasaki authored
commit 35e4c6c1 upstream. When BLKZEROOUT ioctl and data read race, the data read leaves stale page cache. To avoid the stale page cache, hold invalidate_lock of the block device file mapping. The stale page cache is observed when blktests test case block/009 is modified to call "blkdiscard -z" command and repeated hundreds of times. This patch can be applied back to the stable kernel version v5.15.y. Rework is required for older stable kernels. Fixes: 22dd6d35 ("block: invalidate the page cache when issuing BLKZEROOUT") Signed-off-by:
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.15 Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211109104723.835533-3-shinichiro.kawasaki@wdc.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shin'ichiro Kawasaki authored
commit 7607c44c upstream. When BLKDISCARD ioctl and data read race, the data read leaves stale page cache. To avoid the stale page cache, hold invalidate_lock of the block device file mapping. The stale page cache is observed when blktests test case block/009 is repeated hundreds of times. This patch can be applied back to the stable kernel version v5.15.y with slight patch edit. Rework is required for older stable kernels. Fixes: 351499a1 ("block: Invalidate cache on discard v2") Signed-off-by:
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.15 Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211109104723.835533-2-shinichiro.kawasaki@wdc.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Matthew Brost authored
commit fc30a676 upstream. Prior to this patch the blocked context counter was cleared on init_sched_state (used during registering a context & resets) which is incorrect. This state needs to be persistent or the counter can read the incorrect value resulting in scheduling never getting enabled again. Fixes: 62eaf0ae ("drm/i915/guc: Support request cancellation") Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-2-matthew.brost@intel.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gao Xiang authored
commit 86432a6d upstream. There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL before actual I/O submission. Thus, the decompression chain can be extended if the following pcluster chain hooks such tail pcluster. As the related comment mentioned, if some page is made of a hooked pcluster and another followed pcluster, it can be reused for in-place I/O (since I/O should be submitted anyway): _______________________________________________________________ | tail (partial) page | head (partial) page | |_____PRIMARY_HOOKED___|____________PRIMARY_FOLLOWED____________| However, it's by no means safe to reuse as pagevec since if such PRIMARY_HOOKED pclusters finally move into bypass chain without I/O submission. It's somewhat hard to reproduce with LZ4 and I just found it (general protection fault) by ro_fsstressing a LZMA image for long time. I'm going to actively clean up related code together with multi-page folio adaption in the next few months. Let's address it directly for easier backporting for now. Call trace for reference: z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs] z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs] z_erofs_runqueue+0x5f3/0x840 [erofs] z_erofs_readahead+0x1e8/0x320 [erofs] read_pages+0x91/0x270 page_cache_ra_unbounded+0x18b/0x240 filemap_get_pages+0x10a/0x5f0 filemap_read+0xa9/0x330 new_sync_read+0x11b/0x1a0 vfs_read+0xf1/0x190 Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org Fixes: 3883a79a ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Xiubo Li authored
commit 0e24421a upstream. If the max_mds is decreased in a cephfs cluster, there is a window of time before the MDSs are removed. If a map goes out during this period, the mdsmap may show the decreased max_mds but still shows those MDSes as in or in the export target list. Ensure that we don't fail the map decode in that case. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/52436 Fixes: d517b398 ("ceph: reconnect to the export targets on new mdsmaps") Signed-off-by:
Xiubo Li <xiubli@redhat.com> Reviewed-by:
Jeff Layton <jlayton@kernel.org> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dongliang Mu authored
commit 5429c9db upstream. if2fs_fill_super -> f2fs_build_segment_manager -> create_discard_cmd_control -> f2fs_start_discard_thread It invokes kthread_run to create a thread and run issue_discard_thread. However, if f2fs_build_node_manager fails, the control flow goes to free_nm and calls f2fs_destroy_node_manager. This function will free sbi->nm_info. However, if issue_discard_thread accesses sbi->nm_info after the deallocation, but before the f2fs_stop_discard_thread, it will cause UAF(Use-after-free). -> f2fs_destroy_segment_manager -> destroy_discard_cmd_control -> f2fs_stop_discard_thread Fix this by stopping discard thread before f2fs_destroy_node_manager. Note that, the commit d6d2b491 introduces the call of f2fs_available_free_memory into issue_discard_thread. Cc: stable@vger.kernel.org Fixes: d6d2b491 ("f2fs: allow to change discard policy based on cached discard cmds") Signed-off-by:
Dongliang Mu <mudongliangabcd@gmail.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daeho Jeong authored
commit 09631cf3 upstream. Need to include non-compressed blocks in compr_written_block to estimate average compression ratio more accurately. Fixes: 5ac443e2 ("f2fs: add sysfs nodes to get runtime compression stat") Cc: stable@vger.kernel.org Signed-off-by:
Daeho Jeong <daehojeong@google.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jaegeuk Kim authored
commit 92d602bc upstream. We use inline_dentry which requires to allocate dentry page when adding a link. If we allow to reclaim memory from filesystem, we do down_read(&sbi->cp_rwsem) twice by f2fs_lock_op(). I think this should be okay, but how about stopping the lockdep complaint [1]? f2fs_create() - f2fs_lock_op() - f2fs_do_add_link() - __f2fs_find_entry - f2fs_get_read_data_page() -> kswapd - shrink_node - f2fs_evict_inode - f2fs_lock_op() [1] fs_reclaim ){+.+.}-{0:0} : kswapd0: lock_acquire+0x114/0x394 kswapd0: __fs_reclaim_acquire+0x40/0x50 kswapd0: prepare_alloc_pages+0x94/0x1ec kswapd0: __alloc_pages_nodemask+0x78/0x1b0 kswapd0: pagecache_get_page+0x2e0/0x57c kswapd0: f2fs_get_read_data_page+0xc0/0x394 kswapd0: f2fs_find_data_page+0xa4/0x23c kswapd0: find_in_level+0x1a8/0x36c kswapd0: __f2fs_find_entry+0x70/0x100 kswapd0: f2fs_do_add_link+0x84/0x1ec kswapd0: f2fs_mkdir+0xe4/0x1e4 kswapd0: vfs_mkdir+0x110/0x1c0 kswapd0: do_mkdirat+0xa4/0x160 kswapd0: __arm64_sys_mkdirat+0x24/0x34 kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8 kswapd0: do_el0_svc+0x28/0xa0 kswapd0: el0_svc+0x24/0x38 kswapd0: el0_sync_handler+0x88/0xec kswapd0: el0_sync+0x1c0/0x200 kswapd0: -> #1 ( &sbi->cp_rwsem ){++++}-{3:3} : kswapd0: lock_acquire+0x114/0x394 kswapd0: down_read+0x7c/0x98 kswapd0: f2fs_do_truncate_blocks+0x78/0x3dc kswapd0: f2fs_truncate+0xc8/0x128 kswapd0: f2fs_evict_inode+0x2b8/0x8b8 kswapd0: evict+0xd4/0x2f8 kswapd0: iput+0x1c0/0x258 kswapd0: do_unlinkat+0x170/0x2a0 kswapd0: __arm64_sys_unlinkat+0x4c/0x68 kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8 kswapd0: do_el0_svc+0x28/0xa0 kswapd0: el0_svc+0x24/0x38 kswapd0: el0_sync_handler+0x88/0xec kswapd0: el0_sync+0x1c0/0x200 Cc: stable@vger.kernel.org Fixes: bdbc90fa ("f2fs: don't put dentry page in pagecache into highmem") Reviewed-by:
Chao Yu <chao@kernel.org> Reviewed-by:
Stanley Chu <stanley.chu@mediatek.com> Reviewed-by:
Light Hsieh <light.hsieh@mediatek.com> Tested-by:
Light Hsieh <light.hsieh@mediatek.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Guo Ren authored
commit 69ea4630 upstream. When using "devm_request_threaded_irq(,,,,IRQF_ONESHOT,,)" in a driver, only the first interrupt is handled, and following interrupts are never delivered (initially reported in [1]). That's because the RISC-V PLIC cannot EOI masked interrupts, as explained in the description of Interrupt Completion in the PLIC spec [2]: <quote> The PLIC signals it has completed executing an interrupt handler by writing the interrupt ID it received from the claim to the claim/complete register. The PLIC does not check whether the completion ID is the same as the last claim ID for that target. If the completion ID does not match an interrupt source that *is currently enabled* for the target, the completion is silently ignored. </quote> Re-enable the interrupt before completion if it has been masked during the handling, and remask it afterwards. [1] http://lists.infradead.org/pipermail/linux-riscv/2021-July/007441.html [2] https://github.com/riscv/riscv-plic-spec/blob/8bc15a35d07c9edf7b5d23fec9728302595ffc4d/riscv-plic.adoc Fixes: bb0fed1c ("irqchip/sifive-plic: Switch to fasteoi flow") Reported-by:
Vincent Pelletier <plr.vincent@gmail.com> Tested-by:
Nikita Shubin <nikita.shubin@maquefel.me> Signed-off-by:
Guo Ren <guoren@linux.alibaba.com> Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Atish Patra <atish.patra@wdc.com> Reviewed-by:
Anup Patel <anup@brainfault.org> [maz: amended commit message] Signed-off-by:
Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211105094748.3894453-1-guoren@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-