Author: Jeffrey Hugo <quic_jhugo@quicinc.com> Date: Fri Oct 4 13:52:09 2024 -0600 accel/qaic: Add AIC080 support [ Upstream commit b8128f7815ff135f0333c1b46dcdf1543c41b860 ] Add basic support for the new AIC080 product. The PCIe Device ID is 0xa080. AIC080 is a lower cost, lower performance SKU variant of AIC100. From the qaic perspective, it is the same as AIC100. Reviewed-by: Troy Hanson <quic_thanson@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004195209.3910996-1-quic_jhugo@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan Denose <jdenose@google.com> Date: Tue Nov 12 22:25:17 2024 +0000 ACPI: video: force native for Apple MacbookPro11,2 and Air7,2 [ Upstream commit 295991836b23c12ddb447f7f583a17fd3616ad7d ] There is a bug in the Macbook Pro 11,2 and Air 7,2 firmware similar to what is described in: commit 7dc918daaf29 ("ACPI: video: force native for Apple MacbookPro9,2") This bug causes their backlights not to come back after resume. Add DMI quirks to select the working native Intel firmware interface such that the backlght comes back on after resume. Signed-off-by: Jonathan Denose <jdenose@google.com> Link: https://patch.msgid.link/20241112222516.1.I7fa78e6acbbed56ed5677f5e2dacc098a269d955@changeid [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <hdegoede@redhat.com> Date: Sat Nov 9 23:00:28 2024 +0100 ACPI: x86: Add adev NULL check to acpi_quirk_skip_serdev_enumeration() [ Upstream commit 4a49194f587a62d972b602e3e1a2c3cfe6567966 ] acpi_dev_hid_match() does not check for adev == NULL, dereferencing it unconditional. Add a check for adev being NULL before calling acpi_dev_hid_match(). At the moment acpi_quirk_skip_serdev_enumeration() is never called with a controller_parent without an ACPI companion, but better safe than sorry. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241109220028.83047-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <hdegoede@redhat.com> Date: Sat Nov 16 10:58:24 2024 +0100 ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 8 A1-840 [ Upstream commit 82f250ed1a1dcde0ad2a1513f85af7f9514635e8 ] The Acer Iconia One 8 A1-840 (not to be confused with the A1-840FHD which is a different model) ships with Android 4.4 as factory OS and has the usual broken DSDT issues for x86 Android tablets. Add quirks to skip ACPI I2C client enumeration and disable ACPI battery/AC and ACPI GPIO event handlers. Also add the "INT33F5" HID for the TI PMIC used on this tablet to the list of HIDs for which not to skip i2c_client instantiation, since we do want an ACPI instantiated i2c_client for the PMIC. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241116095825.11660-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <hdegoede@redhat.com> Date: Sat Nov 16 10:58:25 2024 +0100 ACPI: x86: Clean up Asus entries in acpi_quirk_skip_dmi_ids[] [ Upstream commit bd8aa15848f5f21951cd0b0d01510b3ad1f777d4 ] The Asus entries in the acpi_quirk_skip_dmi_ids[] table are the only entries without a comment which model they apply to. Add these comments. The Asus TF103C entry also is in the wrong place for what is supposed to be an alphabetically sorted list. Move it up so that the list is properly sorted and add a comment that the list is alphabetically sorted. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241116095825.11660-2-hdegoede@redhat.com [ rjw: Changelog and subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <hdegoede@redhat.com> Date: Sat Nov 9 22:59:36 2024 +0100 ACPI: x86: Make UART skip quirks work on PCI UARTs without an UID [ Upstream commit 7f261203d7c2e0c06e668b25dfaaee091a79ab25 ] The Vexia EDU ATLA 10 tablet (9V version) which shipped with Android 4.2 as factory OS has the usual broken DSDT issues for x86 Android tablets. On top of that this tablet is special because all its LPSS island peripherals are enumerated as PCI devices rather then as ACPI devices as they typically are. For the x86-android-tablets kmod to be able to instantiate a serdev client for the Bluetooth HCI on this tablet, an ACPI_QUIRK_UART1_SKIP quirk is necessary. Modify acpi_dmi_skip_serdev_enumeration() to work with PCI enumerated UARTs without an UID, such as the UARTs on this tablet. Also make acpi_dmi_skip_serdev_enumeration() exit early if there are no quirks, since there is nothing to do then. And add the necessary quirks for the Vexia EDU ATLA 10 tablet. This should compile with CONFIG_PCI being unset without issues because dev_is_pci() is defined as "(false)" then. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241109215936.83004-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:00 2024 +0100 af_packet: avoid erroring out after sock_init_data() in packet_create() [ Upstream commit 46f2a11cb82b657fd15bab1c47821b635e03838b ] After sock_init_data() the allocated sk object is attached to the provided sock object. On error, packet_create() frees the sk object leaving the dangling pointer in the sock object on return. Some other code may try to use this pointer and cause use-after-free. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014153808.51894-2-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Tue Oct 8 14:02:30 2024 +0200 ALSA: hda/conexant: Use the new codec SSID matching [ Upstream commit 1f55e3699fc9ced72400cdca39fe248bf2b288a2 ] Now we can perform the codec ID matching primarily, and reduce the conditional application of the quirk for conflicting PCI SSID between System76 and Tuxedo devices. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241008120233.7154-3-tiwai@suse.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sahas Leelodharry <sahas.leelodharry@mail.mcgill.ca> Date: Mon Dec 2 03:28:33 2024 +0000 ALSA: hda/realtek: Add support for Samsung Galaxy Book3 360 (NP730QFG) commit e2974a220594c06f536e65dfd7b2447e0e83a1cb upstream. Fixes the 3.5mm headphone jack on the Samsung Galaxy Book 3 360 NP730QFG laptop. Unlike the other Galaxy Book3 series devices, this device only needs the ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET quirk. Verified changes on the device and compared with codec state in Windows. [ white-space fixes by tiwai ] Signed-off-by: Sahas Leelodharry <sahas.leelodharry@mail.mcgill.ca> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/QB1PR01MB40047D4CC1282DB7F1333124CC352@QB1PR01MB4004.CANPRD01.PROD.OUTLOOK.COM Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nazar Bilinskyi <nbilinskyi@gmail.com> Date: Sun Dec 1 01:16:31 2024 +0200 ALSA: hda/realtek: Enable mute and micmute LED on HP ProBook 430 G8 commit 3a83f7baf1346aca885cb83cb888e835fef7c472 upstream. HP ProBook 430 G8 has a mute and micmute LEDs that can be made to work using quirk ALC236_FIXUP_HP_GPIO_LED. Enable already existing quirk. Signed-off-by: Nazar Bilinskyi <nbilinskyi@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241130231631.8929-1-nbilinskyi@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chris Chiu <chris.chiu@canonical.com> Date: Mon Dec 2 22:46:59 2024 +0800 ALSA: hda/realtek: fix micmute LEDs don't work on HP Laptops commit 0d08f0eec961acdb0424a3e2cfb37cfb89154833 upstream. These HP laptops use Realtek HDA codec ALC3315 combined CS35L56 Amplifiers. They need the quirk ALC285_FIXUP_HP_GPIO_LED to get the micmute LED working. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> Reviewed-by: Simon Trimmer <simont@opensource.cirrus.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241202144659.1553504-1-chris.chiu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Colin Ian King <colin.i.king@gmail.com> Date: Thu Dec 5 10:28:33 2024 +0000 ALSA: hda/realtek: Fix spelling mistake "Firelfy" -> "Firefly" commit 20c3b3e5f2641eff3d85f33e6a468ac052b169bd upstream. There is a spelling mistake in a literal string in the alc269_fixup_tbl quirk table. Fix it. Fixes: 0d08f0eec961 ("ALSA: hda/realtek: fix micmute LEDs don't work on HP Laptops") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20241205102833.476190-1-colin.i.king@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Tue Oct 8 14:02:31 2024 +0200 ALSA: hda/realtek: Use codec SSID matching for Lenovo devices [ Upstream commit 504f052aa3435ab2f15af8b20bc4f4de8ff259c7 ] Now we can perform the codec ID matching primarily, and reduce the conditional application of the quirk for conflicting PCI SSIDs in various Lenovo devices. Here, HDA_CODEC_QUIRK() is applied at first so that the device with the codec SSID matching is picked up, followed by SND_PCI_QUIRK() for PCI SSID matching with the same ID number. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241008120233.7154-4-tiwai@suse.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Fri Oct 11 09:21:52 2024 +0200 ALSA: hda: Fix build error without CONFIG_SND_DEBUG commit 0ddf2784d6c29e59409a62b8f32dc5abe56135a4 upstream. The macro should have been defined without setting the non-existing name field in the case of CONFIG_SND_DEBUG=n. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/20241011131046.5eb3905a@canb.auug.org.au Fixes: 5b1913a79c3e ("ALSA: hda: Use own quirk lookup helper") Link: https://patch.msgid.link/20241011072152.14657-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Tue Oct 8 14:02:29 2024 +0200 ALSA: hda: Use own quirk lookup helper [ Upstream commit 5b1913a79c3e0518d9c5db343fa9fc4edcea041f ] For allowing the primary codec SSID matching (that works around the conflicting PCI SSID problems), introduce a new struct hda_quirk, which is compatible with the existing struct snd_pci_quirk along with new helper functions and macros. The existing snd_pci_quirk tables are replaced with hda_quirk tables accordingly, while keeping SND_PCI_QUIRK() entry definitions as is. This patch shouldn't bring any behavior change, just some renaming and shifting the code. The actual change for the codec SSID matching will follow after this. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241008120233.7154-2-tiwai@suse.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Thu Nov 28 18:04:22 2024 +0100 ALSA: seq: ump: Fix seq port updates per FB info notify [ Upstream commit aaa55faa2495320e44bc643a917c701f2cc89ee7 ] update_port_infos() is called when a UMP FB Info update notification is received, and this function is supposed to update the attributes of the corresponding sequencer port. However, the function had a few issues and it brought to the incorrect states. Namely: - It tried to get a wrong sequencer info for the update without correcting the port number with the group-offset 1 - The loop exited immediately when a sequencer port isn't present; this ended up with the truncation if a sequencer port in the middle goes away This patch addresses those bugs. Fixes: 4a16a3af0571 ("ALSA: seq: ump: Handle FB info update") Link: https://patch.msgid.link/20241128170423.23351-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Asahi Lina <lina@asahilina.net> Date: Mon Dec 2 22:17:15 2024 +0900 ALSA: usb-audio: Add extra PID for RME Digiface USB commit f09f0397db641f99f6c3e109283d82e3584bfb50 upstream. It seems there is an alternate version of the hardware with a different PID. User testing reveals this still works with the same interface as far as the kernel is concerned, so just add the extra PID. Thanks to Heiko Engemann for testing with this version. Due to the way quirks-table.h is structured, that means we have to turn the entire quirk struct into a macro to avoid duplicating it... Cc: stable@vger.kernel.org Signed-off-by: Asahi Lina <lina@asahilina.net> Link: https://patch.msgid.link/20241202-rme-digiface-usb-id-v1-1-50f730d7a46e@asahilina.net Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marie Ramlow <me@nycode.dev> Date: Sat Nov 30 17:52:40 2024 +0100 ALSA: usb-audio: add mixer mapping for Corsair HS80 commit a7de2b873f3dbcda02d504536f1ec6dc50e3f6c4 upstream. The Corsair HS80 RGB Wireless is a USB headset with a mic and a sidetone feature. It has the same quirk as the Virtuoso series. This labels the mixers appropriately, so applications don't move the sidetone volume when they actually intend to move the main headset volume. Signed-off-by: Marie Ramlow <me@nycode.dev> cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241130165240.17838-1-me@nycode.dev Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dan Carpenter <dan.carpenter@linaro.org> Date: Mon Dec 2 15:57:54 2024 +0300 ALSA: usb-audio: Fix a DMA to stack memory bug commit f7d306b47a24367302bd4fe846854e07752ffcd9 upstream. The usb_get_descriptor() function does DMA so we're not allowed to use a stack buffer for that. Doing DMA to the stack is not portable all architectures. Move the "new_device_descriptor" from being stored on the stack and allocate it with kmalloc() instead. Fixes: b909df18ce2a ("ALSA: usb-audio: Fix potential out-of-bound accesses for Extigy and Mbox devices") Cc: stable@kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/60e3aa09-039d-46d2-934c-6f123026c2eb@stanley.mountain Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Tue Nov 5 13:02:18 2024 +0100 ALSA: usb-audio: Make mic volume workarounds globally applicable [ Upstream commit d6e6b9218ced5249b9136833ef5ec3f554ec7fde ] It seems that many webcams have buggy firmware and don't expose the mic capture volume with the proper resolution. We have workarounds in mixer.c, but judging from the numbers, those can be better managed as global quirk flags. Link: https://patch.msgid.link/20241105120220.5740-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de> Date: Thu Nov 28 09:04:16 2024 +0100 ALSA: usb-audio: Notify xrun for low-latency mode [ Upstream commit 4f9d674377d090e38d93360bd4df21b67534d622 ] The low-latency mode of USB-audio driver uses a similar approach like the implicit feedback mode but it has an explicit queuing at the trigger start time. The difference is, however, that no packet will be handled any longer after all queued packets are handled but no enough data is fed. In the case of implicit feedback mode, the capture-side packet handling triggers the re-queuing, and this checks the XRUN. OTOH, in the low-latency mode, it just stops without XRUN notification unless any new action is taken from user-space via ack callback. For example, when you stop the stream in aplay, no XRUN is reported. This patch adds the XRUN check at the packet complete callback in the case all pending URBs are exhausted. Strictly speaking, this state doesn't match really with XRUN; in theory the application may queue immediately after this happens. But such behavior is only for 1-period configuration, which the USB-audio driver doesn't support. So we may conclude that this situation leads certainly to XRUN. A caveat is that the XRUN should be triggered only for the PCM RUNNING state, and not during DRAINING. This additional state check is put in notify_xrun(), too. Fixes: d5f871f89e21 ("ALSA: usb-audio: Improved lowlatency playback support") Reported-by: Leonard Crestez <cdleonard@gmail.com> Link: https://lore.kernel.org/25d5b0d8-4efd-4630-9d33-7a9e3fa9dc2b@gmail.com Link: https://patch.msgid.link/20241128080446.1181-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marc Zyngier <maz@kernel.org> Date: Sun Dec 1 09:27:02 2024 +0000 arch_numa: Restore nid checks before registering a memblock with a node commit 180bbad698641873120a48857bb3b9f3166bf684 upstream. Commit 767507654c22 ("arch_numa: switch over to numa_memblks") significantly cleaned up the NUMA registration code, but also dropped a significant check that was refusing to accept to configure a memblock with an invalid nid. On "quality hardware" such as my ThunderX machine, this results in a kernel that dies immediately: [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x431f0a10] [ 0.000000] Linux version 6.12.0-00013-g8920d74cf8db (maz@valley-girl) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #3872 SMP PREEMPT Wed Nov 27 15:25:49 GMT 2024 [ 0.000000] KASLR disabled due to lack of seed [ 0.000000] Machine model: Cavium ThunderX CN88XX board [ 0.000000] efi: EFI v2.4 by American Megatrends [ 0.000000] efi: ESRT=0xffce0ff18 SMBIOS 3.0=0xfffb0000 ACPI 2.0=0xffec60000 MEMRESERVE=0xffc905d98 [ 0.000000] esrt: Reserving ESRT space from 0x0000000ffce0ff18 to 0x0000000ffce0ff50. [ 0.000000] earlycon: pl11 at MMIO 0x000087e024000000 (options '115200n8') [ 0.000000] printk: legacy bootconsole [pl11] enabled [ 0.000000] NODE_DATA(0) allocated [mem 0xff6754580-0xff67566bf] [ 0.000000] Unable to handle kernel paging request at virtual address 0000000000001d40 [ 0.000000] Mem abort info: [ 0.000000] ESR = 0x0000000096000004 [ 0.000000] EC = 0x25: DABT (current EL), IL = 32 bits [ 0.000000] SET = 0, FnV = 0 [ 0.000000] EA = 0, S1PTW = 0 [ 0.000000] FSC = 0x04: level 0 translation fault [ 0.000000] Data abort info: [ 0.000000] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 0.000000] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 0.000000] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 0.000000] [0000000000001d40] user address but active_mm is swapper [ 0.000000] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.12.0-00013-g8920d74cf8db #3872 [ 0.000000] Hardware name: Cavium ThunderX CN88XX board (DT) [ 0.000000] pstate: a00000c5 (NzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 0.000000] pc : sparse_init_nid+0x54/0x428 [ 0.000000] lr : sparse_init+0x118/0x240 [ 0.000000] sp : ffff800081da3cb0 [ 0.000000] x29: ffff800081da3cb0 x28: 0000000fedbab10c x27: 0000000000000001 [ 0.000000] x26: 0000000ffee250f8 x25: 0000000000000001 x24: ffff800082102cd0 [ 0.000000] x23: 0000000000000001 x22: 0000000000000000 x21: 00000000001fffff [ 0.000000] x20: 0000000000000001 x19: 0000000000000000 x18: ffffffffffffffff [ 0.000000] x17: 0000000001b00000 x16: 0000000ffd130000 x15: 0000000000000000 [ 0.000000] x14: 00000000003e0000 x13: 00000000000001c8 x12: 0000000000000014 [ 0.000000] x11: ffff800081e82860 x10: ffff8000820fb2c8 x9 : ffff8000820fb490 [ 0.000000] x8 : 0000000000ffed20 x7 : 0000000000000014 x6 : 00000000001fffff [ 0.000000] x5 : 00000000ffffffff x4 : 0000000000000000 x3 : 0000000000000000 [ 0.000000] x2 : 0000000000000000 x1 : 0000000000000040 x0 : 0000000000000007 [ 0.000000] Call trace: [ 0.000000] sparse_init_nid+0x54/0x428 [ 0.000000] sparse_init+0x118/0x240 [ 0.000000] bootmem_init+0x70/0x1c8 [ 0.000000] setup_arch+0x184/0x270 [ 0.000000] start_kernel+0x74/0x670 [ 0.000000] __primary_switched+0x80/0x90 [ 0.000000] Code: f865d804 d37df060 cb030000 d2800003 (b95d4084) [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- while previous kernel versions were able to recognise how brain-damaged the machine is, and only build a fake node. Use the memblock_validate_numa_coverage() helper to restore some sanity and a "working" system. Fixes: 767507654c22 ("arch_numa: switch over to numa_memblks") Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241201092702.3792845-1-maz@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Catalin Marinas <catalin.marinas@arm.com> Date: Tue Dec 3 15:19:41 2024 +0000 arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs commit c0900d15d31c2597dd9f634c8be2b71762199890 upstream. Linux currently sets the TCR_EL1.AS bit unconditionally during CPU bring-up. On an 8-bit ASID CPU, this is RES0 and ignored, otherwise 16-bit ASIDs are enabled. However, if running in a VM and the hypervisor reports 8-bit ASIDs (ID_AA64MMFR0_EL1.ASIDBits == 0) on a 16-bit ASIDs CPU, Linux uses bits 8 to 63 as a generation number for tracking old process ASIDs. The bottom 8 bits of this generation end up being written to TTBR1_EL1 and also used for the ASID-based TLBI operations as the upper 8 bits of the ASID. Following an ASID roll-over event we can have threads of the same application with the same 8-bit ASID but different generation numbers running on separate CPUs. Both TLB caching and the TLBI operations will end up using different actual 16-bit ASIDs for the same process. A similar scenario can happen in a big.LITTLE configuration if the boot CPU only uses 8-bit ASIDs while secondary CPUs have 16-bit ASIDs. Ensure that the ASID generation is only tracked by bits 16 and up, leaving bits 15:8 as 0 if the kernel uses 8-bit ASIDs. Note that clearing TCR_EL1.AS is not sufficient since the architecture requires that the top 8 bits of the ASID passed to TLBI instructions are 0 rather than ignored in such configuration. Cc: stable@vger.kernel.org Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241203151941.353796-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yang Shi <yang@os.amperecomputing.com> Date: Mon Nov 25 09:16:50 2024 -0800 arm64: mm: Fix zone_dma_limit calculation commit 56a708742a8bf127eb66798bfc9c9516c61f9930 upstream. Commit ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") and subsequent patches changed how zone_dma_limit is calculated to allow a reduced ZONE_DMA even when RAM starts above 4GB. Commit 122c234ef4e1 ("arm64: mm: keep low RAM dma zone") further fixed this to ensure ZONE_DMA remains below U32_MAX if RAM starts below 4GB, especially on platforms that do not have IORT or DT description of the device DMA ranges. While zone boundaries calculation was fixed by the latter commit, zone_dma_limit, used to determine the GFP_DMA flag in the core code, was not updated. This results in excessive use of GFP_DMA and unnecessary ZONE_DMA allocations on some platforms. Update zone_dma_limit to match the actual upper bound of ZONE_DMA. Fixes: ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") Cc: <stable@vger.kernel.org> # 6.12.x Reported-by: Yutang Jiang <jiangyutang@os.amperecomputing.com> Tested-by: Yutang Jiang <jiangyutang@os.amperecomputing.com> Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Link: https://lore.kernel.org/r/20241125171650.77424-1-yang@os.amperecomputing.com [catalin.marinas@arm.com: some tweaking of the commit log] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mark Rutland <mark.rutland@arm.com> Date: Thu Dec 5 12:16:53 2024 +0000 arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMR commit f5d71291841aecfe5d8435da2dfa7f58ccd18bc8 upstream. Currently fpmr_set() doesn't initialize the temporary 'fpmr' variable, and a SETREGSET call with a length of zero will leave this uninitialized. Consequently an arbitrary value will be written back to target->thread.uw.fpmr, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing contents of FPMR will be retained. Before this patch: | # ./fpmr-test | Attempting to write NT_ARM_FPMR::fpmr = 0x900d900d900d900d | SETREGSET(nt=0x40e, len=8) wrote 8 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0x900d900d900d900d | | Attempting to write NT_ARM_FPMR (zero length) | SETREGSET(nt=0x40e, len=0) wrote 0 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0xffff800083963d50 After this patch: | # ./fpmr-test | Attempting to write NT_ARM_FPMR::fpmr = 0x900d900d900d900d | SETREGSET(nt=0x40e, len=8) wrote 8 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0x900d900d900d900d | | Attempting to write NT_ARM_FPMR (zero length) | SETREGSET(nt=0x40e, len=0) wrote 0 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0x900d900d900d900d Fixes: 4035c22ef7d4 ("arm64/ptrace: Expose FPMR via ptrace") Cc: <stable@vger.kernel.org> # 6.9.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mark Rutland <mark.rutland@arm.com> Date: Thu Dec 5 12:16:54 2024 +0000 arm64: ptrace: fix partial SETREGSET for NT_ARM_POE commit 594bfc4947c4fcabba1318d8384c61a29a6b89fb upstream. Currently poe_set() doesn't initialize the temporary 'ctrl' variable, and a SETREGSET call with a length of zero will leave this uninitialized. Consequently an arbitrary value will be written back to target->thread.por_el0, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing contents of POR_EL1 will be retained. Before this patch: | # ./poe-test | Attempting to write NT_ARM_POE::por_el0 = 0x900d900d900d900d | SETREGSET(nt=0x40f, len=8) wrote 8 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0x900d900d900d900d | | Attempting to write NT_ARM_POE (zero length) | SETREGSET(nt=0x40f, len=0) wrote 0 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0xffff8000839c3d50 After this patch: | # ./poe-test | Attempting to write NT_ARM_POE::por_el0 = 0x900d900d900d900d | SETREGSET(nt=0x40f, len=8) wrote 8 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0x900d900d900d900d | | Attempting to write NT_ARM_POE (zero length) | SETREGSET(nt=0x40f, len=0) wrote 0 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0x900d900d900d900d Fixes: 175198199262 ("arm64/ptrace: add support for FEAT_POE") Cc: <stable@vger.kernel.org> # 6.12.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mark Rutland <mark.rutland@arm.com> Date: Thu Dec 5 12:16:52 2024 +0000 arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL commit ca62d90085f4af36de745883faab9f8a7cbb45d3 upstream. Currently tagged_addr_ctrl_set() doesn't initialize the temporary 'ctrl' variable, and a SETREGSET call with a length of zero will leave this uninitialized. Consequently tagged_addr_ctrl_set() will consume an arbitrary value, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. As set_tagged_addr_ctrl() only accepts values where bits [63:4] zero and rejects other values, a partial SETREGSET attempt will randomly succeed or fail depending on the value of the uninitialized value, and the exposure is significantly limited. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing value of the tagged address ctrl will be retained. The NT_ARM_TAGGED_ADDR_CTRL regset is only visible in the user_aarch64_view used by a native AArch64 task to manipulate another native AArch64 task. As get_tagged_addr_ctrl() only returns an error value when called for a compat task, tagged_addr_ctrl_get() and tagged_addr_ctrl_set() should never observe an error value from get_tagged_addr_ctrl(). Add a WARN_ON_ONCE() to both to indicate that such an error would be unexpected, and error handlnig is not missing in either case. Fixes: 2200aa7154cb ("arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Uwe Kleine-König <ukleinek@debian.org> Date: Fri Nov 22 08:56:05 2024 +0100 ASoC: amd: yc: Add quirk for microphone on Lenovo Thinkpad T14s Gen 6 21M1CTO1WW [ Upstream commit cbc86dd0a4fe9f8c41075328c2e740b68419d639 ] Add a quirk for Tova's Lenovo Thinkpad T14s with product name 21M1. Suggested-by: Tova <blueaddagio@laposte.net> Link: https://bugs.debian.org/1087673 Signed-off-by: Uwe Kleine-König <ukleinek@debian.org> Link: https://patch.msgid.link/20241122075606.213132-2-ukleinek@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Far <anf1980@gmail.com> Date: Sat Nov 16 21:58:45 2024 +0300 ASoC: amd: yc: fix internal mic on Redmi G 2022 [ Upstream commit 67a0463d339059eeeead9cd015afa594659cfdaf ] This laptop model requires an additional detection quirk to enable the internal microphone Signed-off-by: Alex Far <anf1980@gmail.com> Link: https://patch.msgid.link/ZzjrZY3sImcqTtGx@RedmiG Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonas Karlman <jonas@kwiboo.se> Date: Fri Nov 15 04:43:44 2024 +0000 ASoC: hdmi-codec: reorder channel allocation list [ Upstream commit 82ff5abc2edcfba0c0f1a1be807795e2876f46e9 ] The ordering in hdmi_codec_get_ch_alloc_table_idx() results in wrong channel allocation for a number of cases, e.g. when ELD reports FL|FR|LFE|FC|RL|RR or FL|FR|LFE|FC|RL|RR|RC|RLC|RRC: ca_id 0x01 with speaker mask FL|FR|LFE is selected instead of ca_id 0x03 with speaker mask FL|FR|LFE|FC for 4 channels and ca_id 0x04 with speaker mask FL|FR|RC gets selected instead of ca_id 0x0b with speaker mask FL|FR|LFE|FC|RL|RR for 6 channels Fix this by reordering the channel allocation list with most specific speaker masks at the top. Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Link: https://patch.msgid.link/20241115044344.3510979-1-christianshewitt@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Date: Thu Oct 10 13:20:08 2024 +0200 ASoC: Intel: avs: Fix return status of avs_pcm_hw_constraints_init() commit a0aae96be5ffc5b456ca07bfe1385b721c20e184 upstream. Check for return code from avs_pcm_hw_constraints_init() in avs_dai_fe_startup() only checks if value is different from 0. Currently function can return positive value, change it to return 0 on success. Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> I've observed KASAN on our setups and while patch itself is correct regardless. Problem seems to be caused by recent changes to rates, as this started happening after recent patchsets and doesn't reproduce with those reverted https://lore.kernel.org/linux-sound/20240905-alsa-12-24-128-v1-0-8371948d3921@baylibre.com/ https://lore.kernel.org/linux-sound/20240911135756.24434-1-tiwai@suse.de/ I've tested using Mark tree, where they are both applied and for some reason snd_pcm_hw_constraint_minmax() started returning positive value, while previously it returned 0. I'm bit worried if it signals some potential deeper problem regarding constraints with above changes. Link: https://patch.msgid.link/20241010112008.545526-1-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mac Chiang <mac.chiang@intel.com> Date: Mon Oct 28 15:26:31 2024 +0800 ASoC: Intel: soc-acpi-intel-arl-match: Add rt722 and rt1320 support [ Upstream commit f193fb888d1da45365daa7d0ff7a964c8305d407 ] This patch adds support for the rt722 multi-function codec and the rt1320 amplifier in the ARL board configuration. Link 0: RT722 codec with three endpoints: Headset, Speaker, and DMIC. Link 2: RT1320 amplifier. Note: The Speaker endpoint on the RT722 codec is not used. Signed-off-by: Mac Chiang <mac.chiang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241028072631.15536-4-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Balamurugan C <balamurugan.c@intel.com> Date: Fri Oct 4 11:01:33 2024 +0800 ASoC: Intel: sof_rt5682: Add HDMI-In capture with rt5682 support for MTL. [ Upstream commit 0f5d2228a99a4733b2a6652e16255be9caf2616a ] Added match table entry on mtl machines to support HDMI-In capture with rt5682 I2S audio codec. also added the respective quirk configuration in rt5682 machine driver. Signed-off-by: Balamurugan C <balamurugan.c@intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241004030135.67968-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Charles Keepax <ckeepax@opensource.cirrus.com> Date: Wed Oct 16 11:03:43 2024 +0800 ASoC: Intel: sof_sdw: Add quirk for cs42l43 system using host DMICs [ Upstream commit ea657f6b24e11651a39292082be84ad81a89e525 ] Add quirk to inform the machine driver to not bind in the cs42l43 microphone DAI link. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016030344.13535-4-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Charles Keepax <ckeepax@opensource.cirrus.com> Date: Wed Oct 16 11:03:44 2024 +0800 ASoC: Intel: sof_sdw: Add quirks for some new Lenovo laptops [ Upstream commit 83c062ae81e89f73e3ab85953111a8b3daaaf98e ] Add some more sidecar amplifier quirks. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016030344.13535-5-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nícolas F. R. A. Prado <nfraprado@collabora.com> Date: Tue Dec 3 16:20:58 2024 -0300 ASoC: mediatek: mt8188-mt6359: Remove hardcoded dmic codec [ Upstream commit ec16a3cdf37e507013062f9c4a2067eacdd12b62 ] Remove hardcoded dmic codec from the UL_SRC dai link to avoid requiring a dmic codec to be present for the driver to probe, as not every MT8188-based platform might need a dmic codec. The codec can be assigned to the dai link through the dai-link property in Devicetree on the platforms where it is needed. No Devicetree currently relies on it so it is safe to remove without worrying about backward compatibility. Fixes: 9f08dcbddeb3 ("ASoC: mediatek: mt8188-mt6359: support new board with nau88255") Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20241203-mt8188-6359-unhardcode-dmic-v1-1-346e3e5cbe6d@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Charles Keepax <ckeepax@opensource.cirrus.com> Date: Wed Oct 16 11:03:42 2024 +0800 ASoC: sdw_utils: Add a quirk to allow the cs42l43 mic DAI to be ignored [ Upstream commit a6f7afb39362ef70d08d23e5bfc0a14d69fafea1 ] To support some systems using host microphones add a quirk to allow the cs42l43 microphone DAI link to be ignored. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016030344.13535-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mac Chiang <mac.chiang@intel.com> Date: Mon Oct 28 15:26:29 2024 +0800 ASoC: sdw_utils: Add quirk to exclude amplifier function [ Upstream commit 358ee2c1493e5d2c59820ffd8087eb0e367be4c6 ] When SKUs use the multi-function codec, which integrates Headset, Amplifier and DMIC. The corresponding quirks provide options to support internal amplifier/DMIC or not. In the case of RT722, this SKU excludes the internal amplifier and use an additional amplifier instead. Signed-off-by: Mac Chiang <mac.chiang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241028072631.15536-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Charles Keepax <ckeepax@opensource.cirrus.com> Date: Wed Oct 16 11:03:41 2024 +0800 ASoC: sdw_utils: Add support for exclusion DAI quirks [ Upstream commit 3d9b44d0972be1298400e449cfbcc436df2e988e ] The system contains a mechanism for certain DAI links to be included based on a quirk. Add support for certain DAI links to excluded based on a quirk, this is useful in situations where the vast majority of SKUs utilise a feature so it is easier to quirk on those that don't. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241016030344.13535-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bard Liao <yung-chuan.liao@linux.intel.com> Date: Wed Nov 27 17:29:54 2024 +0800 ASoC: SOF: ipc3-topology: Convert the topology pin index to ALH dai index [ Upstream commit e9db1b551774037ebe39dde4a658d89ba95e260b ] Intel SoundWire machine driver always uses Pin number 2 and above. Currently, the pin number is used as the FW DAI index directly. As a result, FW DAI 0 and 1 are never used. That worked fine because we use up to 2 DAIs in a SDW link. Convert the topology pin index to ALH dai index, the mapping is using 2-off indexing, iow, pin #2 is ALH dai #0. The issue exists since beginning. And the Fixes tag is the first commit that this commit can be applied. Fixes: b66bfc3a9810 ("ASoC: SOF: sof-audio: Fix broken early bclk feature for SSP") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://patch.msgid.link/20241127092955.20026-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Stable-dep-of: 6d544ea21d36 ("ASoC: SOF: ipc3-topology: fix resource leaks in sof_ipc3_widget_setup_comp_dai()") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <dan.carpenter@linaro.org> Date: Sat Nov 30 13:09:06 2024 +0300 ASoC: SOF: ipc3-topology: fix resource leaks in sof_ipc3_widget_setup_comp_dai() [ Upstream commit 6d544ea21d367cbd9746ae882e67a839391a6594 ] These error paths should free comp_dai before returning. Fixes: 909dadf21aae ("ASoC: SOF: topology: Make DAI widget parsing IPC agnostic") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/67d185cf-d139-4f8c-970a-dbf0542246a8@stanley.mountain Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Liequan Che <cheliequan@inspur.com> Date: Mon Dec 2 19:56:38 2024 +0800 bcache: revert replacing IS_ERR_OR_NULL with IS_ERR again commit b2e382ae12a63560fca35050498e19e760adf8c0 upstream. Commit 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") leads a NULL pointer deference in cache_set_flush(). 1721 if (!IS_ERR_OR_NULL(c->root)) 1722 list_add(&c->root->list, &c->btree_cache); >From the above code in cache_set_flush(), if previous registration code fails before allocating c->root, it is possible c->root is NULL as what it is initialized. __bch_btree_node_alloc() never returns NULL but c->root is possible to be NULL at above line 1721. This patch replaces IS_ERR() by IS_ERR_OR_NULL() to fix this. Fixes: 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") Signed-off-by: Liequan Che <cheliequan@inspur.com> Cc: stable@vger.kernel.org Cc: Zheng Wang <zyytlz.wz@163.com> Reviewed-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20241202115638.28957-1-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Damien Le Moal <dlemoal@kernel.org> Date: Thu Nov 7 15:42:59 2024 +0900 block: RCU protect disk->conv_zones_bitmap [ Upstream commit d7cb6d7414ea1b33536fa6d11805cb8dceec1f97 ] Ensure that a disk revalidation changing the conventional zones bitmap of a disk does not cause invalid memory references when using the disk_zone_is_conv() helper by RCU protecting the disk->conv_zones_bitmap pointer. disk_zone_is_conv() is modified to operate under the RCU read lock and the function disk_set_conv_zones_bitmap() is added to update a disk conv_zones_bitmap pointer using rcu_replace_pointer() with the disk zone_wplugs_lock spinlock held. disk_free_zone_resources() is modified to call disk_update_zone_resources() with a NULL bitmap pointer to free the disk conv_zones_bitmap. disk_set_conv_zones_bitmap() is also used in disk_update_zone_resources() to set the new (revalidated) bitmap and free the old one. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241107064300.227731-2-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danil Pylaev <danstiv404@gmail.com> Date: Mon Oct 21 12:22:44 2024 +0000 Bluetooth: Add new quirks for ATS2851 [ Upstream commit 94464a7b71634037b13d54021e0dfd0fb0d8c1f0 ] This adds quirks for broken extended create connection, and write auth payload timeout. Signed-off-by: Danil Pylaev <danstiv404@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiande Lu <jiande.lu@mediatek.com> Date: Mon Nov 4 22:59:31 2024 +0800 Bluetooth: btusb: Add 3 HWIDs for MT7925 [ Upstream commit de7dcf9d1df4b0009735756d0a2adff09c3f21d4 ] Add below HWIDs for MediaTek MT7925 USB Bluetooth chip. VID 0x0489, PID 0xe14f VID 0x0489, PID 0xe150 VID 0x0489, PID 0xe151 Patch has been tested successfully and controller is recognized device pair successfully. MT7925 module bring up message as below. Bluetooth: Core ver 2.22 Bluetooth: HCI device and connection manager initialized Bluetooth: HCI socket layer initialized Bluetooth: L2CAP socket layer initialized Bluetooth: SCO socket layer initialized Bluetooth: hci0: HW/SW Version: 0x00000000, Build Time: 20240816133202 Bluetooth: hci0: Device setup in 286558 usecs Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported. Bluetooth: hci0: AOSP extensions version v1.00 Bluetooth: BNEP (Ethernet Emulation) ver 1.3 Bluetooth: BNEP filters: protocol multicast Bluetooth: BNEP socket layer initialized Bluetooth: MGMT ver 1.22 Bluetooth: RFCOMM TTY layer initialized Bluetooth: RFCOMM socket layer initialized Bluetooth: RFCOMM ver 1.11 Signed-off-by: Jiande Lu <jiande.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hao Qin <hao.qin@mediatek.com> Date: Sat Oct 26 11:18:18 2024 +0800 Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925 [ Upstream commit faa5fd605d2081b6c9fa2355b59582d4ccd24b16 ] Add VID 0489 & PID e111 for MediaTek MT7925 USB Bluetooth chip. The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below. T: Bus=02 Lev=02 Prnt=02 Port=04 Cnt=02 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e111 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us Signed-off-by: Hao Qin <hao.qin@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan McCrohan <jmccrohan@gmail.com> Date: Sat Nov 2 01:10:14 2024 +0000 Bluetooth: btusb: Add new VID/PID 0489/e124 for MT7925 [ Upstream commit 679cb60fd60774798719c3e449874a168642a8e6 ] Add VID 0489 & PID e124 for MediaTek MT7925 USB Bluetooth chip. The information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below. T: Bus=01 Lev=01 Prnt=01 Port=08 Cnt=02 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e124 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us Signed-off-by: Jonathan McCrohan <jmccrohan@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hilda Wu <hildawu@realtek.com> Date: Tue Oct 1 16:37:29 2024 +0800 Bluetooth: btusb: Add RTL8852BE device 0489:e123 to device tables [ Upstream commit 69b84ffce260ff13826dc10aeb3c3e5c2288a552 ] Add the support ID 0489:e123 to usb_device_id table for Realtek RTL8852B chip. The device info from /sys/kernel/debug/usb/devices as below. T: Bus=01 Lev=01 Prnt=01 Port=07 Cnt=04 Dev#= 7 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e123 Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms Signed-off-by: Hilda Wu <hildawu@realtek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiande Lu <jiande.lu@mediatek.com> Date: Mon Sep 30 18:47:50 2024 +0800 Bluetooth: btusb: Add USB HW IDs for MT7920/MT7925 [ Upstream commit a94bc93a305bdcb20cc62978c334cace932b1be0 ] Add HW IDs for wireless module. These HW IDs are extracted from Windows driver inf file and the test for card bring up successful. MT7920 HW IDs test with below patch. https://patchwork.kernel.org/project/bluetooth/ patch/20240930081257.23975-1-chris.lu@mediatek.com/ Patch has been tested successfully and controller is recognized devices pair successfully. MT7920 module bring up message as below. Bluetooth: Core ver 2.22 Bluetooth: HCI device and connection manager initialized Bluetooth: HCI socket layer initialized Bluetooth: L2CAP socket layer initialized Bluetooth: SCO socket layer initialized Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: 20240930111457 Bluetooth: hci0: Device setup in 143004 usecs Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported. Bluetooth: hci0: AOSP extensions version v1.00 Bluetooth: hci0: AOSP quality report is supported Bluetooth: BNEP (Ethernet Emulation) ver 1.3 Bluetooth: BNEP filters: protocol multicast Bluetooth: BNEP socket layer initialized Bluetooth: MGMT ver 1.22 Bluetooth: RFCOMM TTY layer initialized Bluetooth: RFCOMM socket layer initialized Bluetooth: RFCOMM ver 1.11 MT7925 module bring up message as below. Bluetooth: Core ver 2.22 Bluetooth: HCI device and connection manager initialized Bluetooth: HCI socket layer initialized Bluetooth: L2CAP socket layer initialized Bluetooth: SCO socket layer initialized Bluetooth: hci0: HW/SW Version: 0x00000000, Build Time: 20240816133202 Bluetooth: hci0: Device setup in 286558 usecs Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported. Bluetooth: hci0: AOSP extensions version v1.00 Bluetooth: BNEP (Ethernet Emulation) ver 1.3 Bluetooth: BNEP filters: protocol multicast Bluetooth: BNEP socket layer initialized Bluetooth: MGMT ver 1.22 Bluetooth: RFCOMM TTY layer initialized Bluetooth: RFCOMM socket layer initialized Bluetooth: RFCOMM ver 1.11 Signed-off-by: Jiande Lu <jiande.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Markus Elfring <elfring@users.sourceforge.net> Date: Tue Oct 1 09:21:25 2024 +0200 Bluetooth: hci_conn: Reduce hci_conn_drop() calls in two functions [ Upstream commit d96b543c6f3b78b6440b68b5a5bbface553eff28 ] An hci_conn_drop() call was immediately used after a null pointer check for an hci_conn_link() call in two function implementations. Thus call such a function only once instead directly before the checks. This issue was transformed by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Date: Tue Oct 1 12:34:06 2024 -0400 Bluetooth: hci_conn: Use disable_delayed_work_sync [ Upstream commit 2b0f2fc9ed62e73c95df1fa8ed2ba3dac54699df ] This makes use of disable_delayed_work_sync instead cancel_delayed_work_sync as it not only cancel the ongoing work but also disables new submit which is disarable since the object holding the work is about to be freed. Reported-by: syzbot+2446dd3cb07277388db6@syzkaller.appspotmail.com Tested-by: syzbot+2446dd3cb07277388db6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2446dd3cb07277388db6 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Date: Tue Oct 8 10:16:48 2024 -0400 Bluetooth: hci_core: Fix not checking skb length on hci_acldata_packet [ Upstream commit 3fe288a8214e7dd784d1f9b7c9e448244d316b47 ] This fixes not checking if skb really contains an ACL header otherwise the code may attempt to access some uninitilized/invalid memory past the valid skb->data. Reported-by: syzbot+6ea290ba76d8c1eb1ac2@syzkaller.appspotmail.com Tested-by: syzbot+6ea290ba76d8c1eb1ac2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6ea290ba76d8c1eb1ac2 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:01 2024 +0100 Bluetooth: L2CAP: do not leave dangling sk pointer on error in l2cap_sock_create() [ Upstream commit 7c4f78cdb8e7501e9f92d291a7d956591bf73be9 ] bt_sock_alloc() allocates the sk object and attaches it to the provided sock object. On error l2cap_sock_alloc() frees the sk object, but the dangling pointer is still attached to the sock object, which may create use-after-free in other code. Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014153808.51894-3-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:02 2024 +0100 Bluetooth: RFCOMM: avoid leaving dangling sk pointer in rfcomm_sock_alloc() [ Upstream commit 3945c799f12b8d1f49a3b48369ca494d981ac465 ] bt_sock_alloc() attaches allocated sk object to the provided sock object. If rfcomm_dlc_alloc() fails, we release the sk object, but leave the dangling pointer in the sock object, which may cause use-after-free. Fix this by swapping calls to bt_sock_alloc() and rfcomm_dlc_alloc(). Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014153808.51894-4-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danil Pylaev <danstiv404@gmail.com> Date: Mon Oct 21 12:22:46 2024 +0000 Bluetooth: Set quirks for ATS2851 [ Upstream commit 677a55ba11a82c2835550a82324cec5fcb2f9e2d ] This adds quirks for broken ats2851 features. Signed-off-by: Danil Pylaev <danstiv404@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danil Pylaev <danstiv404@gmail.com> Date: Mon Oct 21 12:22:45 2024 +0000 Bluetooth: Support new quirks for ATS2851 [ Upstream commit 5bd3135924b4570dcecc8793f7771cb8d42d8b19 ] This adds support for quirks for broken extended create connection, and write auth payload timeout. Signed-off-by: Danil Pylaev <danstiv404@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Xu <dxu@dxuuu.xyz> Date: Wed Nov 27 15:58:29 2024 -0700 bnxt_en: ethtool: Supply ntuple rss context action [ Upstream commit be75cda92a65a13db242117d674cd5584477a168 ] Commit 2f4f9fe5bf5f ("bnxt_en: Support adding ntuple rules on RSS contexts") added support for redirecting to an RSS context as an ntuple rule action. However, it forgot to update the ETHTOOL_GRXCLSRULE codepath. This caused `ethtool -n` to always report the action as "Action: Direct to queue 0" which is wrong. Fix by teaching bnxt driver to report the RSS context when applicable. Fixes: 2f4f9fe5bf5f ("bnxt_en: Support adding ntuple rules on RSS contexts") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://patch.msgid.link/2e884ae39e08dc5123be7c170a6089cefe6a78f7.1732748253.git.dxu@dxuuu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Luczaj <mhal@rbox.co> Date: Mon Nov 18 22:03:41 2024 +0100 bpf, vsock: Fix poll() missing a queue [ Upstream commit 9f0fc98145218ff8f50d8cfa3b393785056c53e1 ] When a verdict program simply passes a packet without redirection, sk_msg is enqueued on sk_psock::ingress_msg. Add a missing check to poll(). Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://lore.kernel.org/r/20241118-vsock-bpf-poll-close-v1-1-f1b9669cacdc@rbox.co Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Luczaj <mhal@rbox.co> Date: Mon Nov 18 22:03:43 2024 +0100 bpf, vsock: Invoke proto::close on close() [ Upstream commit 135ffc7becc82cfb84936ae133da7969220b43b2 ] vsock defines a BPF callback to be invoked when close() is called. However, this callback is never actually executed. As a result, a closed vsock socket is not automatically removed from the sockmap/sockhash. Introduce a dummy vsock_close() and make vsock_release() call proto::close. Note: changes in __vsock_release() look messy, but it's only due to indent level reduction and variables xmas tree reorder. Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://lore.kernel.org/r/20241118-vsock-bpf-poll-close-v1-3-f1b9669cacdc@rbox.co Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hou Tao <houtao1@huawei.com> Date: Wed Nov 6 14:35:40 2024 +0800 bpf: Call free_htab_elem() after htab_unlock_bucket() [ Upstream commit b9e9ed90b10c82a4e9d4d70a2890f06bfcdd3b78 ] For htab of maps, when the map is removed from the htab, it may hold the last reference of the map. bpf_map_fd_put_ptr() will invoke bpf_map_free_id() to free the id of the removed map element. However, bpf_map_fd_put_ptr() is invoked while holding a bucket lock (raw_spin_lock_t), and bpf_map_free_id() attempts to acquire map_idr_lock (spinlock_t), triggering the following lockdep warning: ============================= [ BUG: Invalid wait context ] 6.11.0-rc4+ #49 Not tainted ----------------------------- test_maps/4881 is trying to lock: ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70 other info that might help us debug this: context-{5:5} 2 locks held by test_maps/4881: #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270 #1: ffff888149ced148 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80 stack backtrace: CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... Call Trace: <TASK> dump_stack_lvl+0x6e/0xb0 dump_stack+0x10/0x20 __lock_acquire+0x73e/0x36c0 lock_acquire+0x182/0x450 _raw_spin_lock_irqsave+0x43/0x70 bpf_map_free_id.part.0+0x21/0x70 bpf_map_put+0xcf/0x110 bpf_map_fd_put_ptr+0x9a/0xb0 free_htab_elem+0x69/0xe0 htab_map_update_elem+0x50f/0xa80 bpf_fd_htab_map_update_elem+0x131/0x270 htab_map_update_elem+0x50f/0xa80 bpf_fd_htab_map_update_elem+0x131/0x270 bpf_map_update_value+0x266/0x380 __sys_bpf+0x21bb/0x36b0 __x64_sys_bpf+0x45/0x60 x64_sys_call+0x1b2a/0x20d0 do_syscall_64+0x5d/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e One way to fix the lockdep warning is using raw_spinlock_t for map_idr_lock as well. However, bpf_map_alloc_id() invokes idr_alloc_cyclic() after acquiring map_idr_lock, it will trigger a similar lockdep warning because the slab's lock (s->cpu_slab->lock) is still a spinlock. Instead of changing map_idr_lock's type, fix the issue by invoking htab_put_fd_value() after htab_unlock_bucket(). However, only deferring the invocation of htab_put_fd_value() is not enough, because the old map pointers in htab of maps can not be saved during batched deletion. Therefore, also defer the invocation of free_htab_elem(), so these to-be-freed elements could be linked together similar to lru map. There are four callers for ->map_fd_put_ptr: (1) alloc_htab_elem() (through htab_put_fd_value()) It invokes ->map_fd_put_ptr() under a raw_spinlock_t. The invocation of htab_put_fd_value() can not simply move after htab_unlock_bucket(), because the old element has already been stashed in htab->extra_elems. It may be reused immediately after htab_unlock_bucket() and the invocation of htab_put_fd_value() after htab_unlock_bucket() may release the newly-added element incorrectly. Therefore, saving the map pointer of the old element for htab of maps before unlocking the bucket and releasing the map_ptr after unlock. Beside the map pointer in the old element, should do the same thing for the special fields in the old element as well. (2) free_htab_elem() (through htab_put_fd_value()) Its caller includes __htab_map_lookup_and_delete_elem(), htab_map_delete_elem() and __htab_map_lookup_and_delete_batch(). For htab_map_delete_elem(), simply invoke free_htab_elem() after htab_unlock_bucket(). For __htab_map_lookup_and_delete_batch(), just like lru map, linking the to-be-freed element into node_to_free list and invoking free_htab_elem() for these element after unlock. It is safe to reuse batch_flink as the link for node_to_free, because these elements have been removed from the hash llist. Because htab of maps doesn't support lookup_and_delete operation, __htab_map_lookup_and_delete_elem() doesn't have the problem, so kept it as is. (3) fd_htab_map_free() It invokes ->map_fd_put_ptr without raw_spinlock_t. (4) bpf_fd_htab_map_update_elem() It invokes ->map_fd_put_ptr without raw_spinlock_t. After moving free_htab_elem() outside htab bucket lock scope, using pcpu_freelist_push() instead of __pcpu_freelist_push() to disable the irq before freeing elements, and protecting the invocations of bpf_mem_cache_free() with migrate_{disable|enable} pair. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241106063542.357743-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com> Date: Tue Dec 3 20:47:53 2024 -0800 bpf: Don't mark STACK_INVALID as STACK_MISC in mark_stack_slot_misc [ Upstream commit 69772f509e084ec6bca12dbcdeeeff41b0103774 ] Inside mark_stack_slot_misc, we should not upgrade STACK_INVALID to STACK_MISC when allow_ptr_leaks is false, since invalid contents shouldn't be read unless the program has the relevant capabilities. The relaxation only makes sense when env->allow_ptr_leaks is true. However, such conversion in privileged mode becomes unnecessary, as invalid slots can be read without being upgraded to STACK_MISC. Currently, the condition is inverted (i.e. checking for true instead of false), simply remove it to restore correct behavior. Fixes: eaf18febd6eb ("bpf: preserve STACK_ZERO slots on partial reg spills") Acked-by: Andrii Nakryiko <andrii@kernel.org> Reported-by: Tao Lyu <tao.lyu@epfl.ch> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241204044757.1483141-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tao Lyu <tao.lyu@epfl.ch> Date: Mon Dec 2 16:02:37 2024 -0800 bpf: Ensure reg is PTR_TO_STACK in process_iter_arg [ Upstream commit 12659d28615d606b36e382f4de2dd05550d202af ] Currently, KF_ARG_PTR_TO_ITER handling missed checking the reg->type and ensuring it is PTR_TO_STACK. Instead of enforcing this in the caller of process_iter_arg, move the check into it instead so that all callers will gain the check by default. This is similar to process_dynptr_func. An existing selftest in verifier_bits_iter.c fails due to this change, but it's because it was passing a NULL pointer into iter_next helper and getting an error further down the checks, but probably meant to pass an uninitialized iterator on the stack (as is done in the subsequent test below it). We will gain coverage for non-PTR_TO_STACK arguments in later patches hence just change the declaration to zero-ed stack object. Fixes: 06accc8779c1 ("bpf: add support for open-coded iterator loops") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Tao Lyu <tao.lyu@epfl.ch> [ Kartikeya: move check into process_iter_arg, rewrite commit log ] Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241203000238.3602922-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hou Tao <houtao1@huawei.com> Date: Fri Dec 6 19:06:18 2024 +0800 bpf: Fix exact match conditions in trie_get_next_key() [ Upstream commit 27abc7b3fa2e09bbe41e2924d328121546865eda ] trie_get_next_key() uses node->prefixlen == key->prefixlen to identify an exact match, However, it is incorrect because when the target key doesn't fully match the found node (e.g., node->prefixlen != matchlen), these two nodes may also have the same prefixlen. It will return expected result when the passed key exist in the trie. However when a recently-deleted key or nonexistent key is passed to trie_get_next_key(), it may skip keys and return incorrect result. Fix it by using node->prefixlen == matchlen to identify exact matches. When the condition is true after the search, it also implies node->prefixlen equals key->prefixlen, otherwise, the search would return NULL instead. Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241206110622.1161752-6-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tao Lyu <tao.lyu@epfl.ch> Date: Tue Dec 3 20:47:54 2024 -0800 bpf: Fix narrow scalar spill onto 64-bit spilled scalar slots [ Upstream commit b0e66977dc072906bb76555fb1a64261d7f63d0f ] When CAP_PERFMON and CAP_SYS_ADMIN (allow_ptr_leaks) are disabled, the verifier aims to reject partial overwrite on an 8-byte stack slot that contains a spilled pointer. However, in such a scenario, it rejects all partial stack overwrites as long as the targeted stack slot is a spilled register, because it does not check if the stack slot is a spilled pointer. Incomplete checks will result in the rejection of valid programs, which spill narrower scalar values onto scalar slots, as shown below. 0: R1=ctx() R10=fp0 ; asm volatile ( @ repro.bpf.c:679 0: (7a) *(u64 *)(r10 -8) = 1 ; R10=fp0 fp-8_w=1 1: (62) *(u32 *)(r10 -8) = 1 attempt to corrupt spilled pointer on stack processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0. Fix this by expanding the check to not consider spilled scalar registers when rejecting the write into the stack. Previous discussion on this patch is at link [0]. [0]: https://lore.kernel.org/bpf/20240403202409.2615469-1-tao.lyu@epfl.ch Fixes: ab125ed3ec1c ("bpf: fix check for attempt to corrupt spilled pointer") Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Tao Lyu <tao.lyu@epfl.ch> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241204044757.1483141-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Date: Fri Nov 22 13:10:30 2024 +0100 bpf: fix OOB devmap writes when deleting elements commit ab244dd7cf4c291f82faacdc50b45cc0f55b674d upstream. Jordy reported issue against XSKMAP which also applies to DEVMAP - the index used for accessing map entry, due to being a signed integer, causes the OOB writes. Fix is simple as changing the type from int to u32, however, when compared to XSKMAP case, one more thing needs to be addressed. When map is released from system via dev_map_free(), we iterate through all of the entries and an iterator variable is also an int, which implies OOB accesses. Again, change it to be u32. Example splat below: [ 160.724676] BUG: unable to handle page fault for address: ffffc8fc2c001000 [ 160.731662] #PF: supervisor read access in kernel mode [ 160.736876] #PF: error_code(0x0000) - not-present page [ 160.742095] PGD 0 P4D 0 [ 160.744678] Oops: Oops: 0000 [#1] PREEMPT SMP [ 160.749106] CPU: 1 UID: 0 PID: 520 Comm: kworker/u145:12 Not tainted 6.12.0-rc1+ #487 [ 160.757050] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [ 160.767642] Workqueue: events_unbound bpf_map_free_deferred [ 160.773308] RIP: 0010:dev_map_free+0x77/0x170 [ 160.777735] Code: 00 e8 fd 91 ed ff e8 b8 73 ed ff 41 83 7d 18 19 74 6e 41 8b 45 24 49 8b bd f8 00 00 00 31 db 85 c0 74 48 48 63 c3 48 8d 04 c7 <48> 8b 28 48 85 ed 74 30 48 8b 7d 18 48 85 ff 74 05 e8 b3 52 fa ff [ 160.796777] RSP: 0018:ffffc9000ee1fe38 EFLAGS: 00010202 [ 160.802086] RAX: ffffc8fc2c001000 RBX: 0000000080000000 RCX: 0000000000000024 [ 160.809331] RDX: 0000000000000000 RSI: 0000000000000024 RDI: ffffc9002c001000 [ 160.816576] RBP: 0000000000000000 R08: 0000000000000023 R09: 0000000000000001 [ 160.823823] R10: 0000000000000001 R11: 00000000000ee6b2 R12: dead000000000122 [ 160.831066] R13: ffff88810c928e00 R14: ffff8881002df405 R15: 0000000000000000 [ 160.838310] FS: 0000000000000000(0000) GS:ffff8897e0c40000(0000) knlGS:0000000000000000 [ 160.846528] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 160.852357] CR2: ffffc8fc2c001000 CR3: 0000000005c32006 CR4: 00000000007726f0 [ 160.859604] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 160.866847] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 160.874092] PKRU: 55555554 [ 160.876847] Call Trace: [ 160.879338] <TASK> [ 160.881477] ? __die+0x20/0x60 [ 160.884586] ? page_fault_oops+0x15a/0x450 [ 160.888746] ? search_extable+0x22/0x30 [ 160.892647] ? search_bpf_extables+0x5f/0x80 [ 160.896988] ? exc_page_fault+0xa9/0x140 [ 160.900973] ? asm_exc_page_fault+0x22/0x30 [ 160.905232] ? dev_map_free+0x77/0x170 [ 160.909043] ? dev_map_free+0x58/0x170 [ 160.912857] bpf_map_free_deferred+0x51/0x90 [ 160.917196] process_one_work+0x142/0x370 [ 160.921272] worker_thread+0x29e/0x3b0 [ 160.925082] ? rescuer_thread+0x4b0/0x4b0 [ 160.929157] kthread+0xd4/0x110 [ 160.932355] ? kthread_park+0x80/0x80 [ 160.936079] ret_from_fork+0x2d/0x50 [ 160.943396] ? kthread_park+0x80/0x80 [ 160.950803] ret_from_fork_asm+0x11/0x20 [ 160.958482] </TASK> Fixes: 546ac1ffb70d ("bpf: add devmap, a map for storing net device references") CC: stable@vger.kernel.org Reported-by: Jordy Zomer <jordyzomer@google.com> Suggested-by: Jordy Zomer <jordyzomer@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20241122121030.716788-3-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hou Tao <houtao1@huawei.com> Date: Fri Dec 6 19:06:16 2024 +0800 bpf: Handle BPF_EXIST and BPF_NOEXIST for LPM trie [ Upstream commit eae6a075e9537dd69891cf77ca5a88fa8a28b4a1 ] Add the currently missing handling for the BPF_EXIST and BPF_NOEXIST flags. These flags can be specified by users and are relevant since LPM trie supports exact matches during update. Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation") Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241206110622.1161752-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hou Tao <houtao1@huawei.com> Date: Fri Dec 6 19:06:17 2024 +0800 bpf: Handle in-place update for full LPM trie correctly [ Upstream commit 532d6b36b2bfac5514426a97a4df8d103d700d43 ] When a LPM trie is full, in-place updates of existing elements incorrectly return -ENOSPC. Fix this by deferring the check of trie->n_entries. For new insertions, n_entries must not exceed max_entries. However, in-place updates are allowed even when the trie is full. Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation") Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241206110622.1161752-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leon Hwang <leon.hwang@linux.dev> Date: Tue Oct 15 23:02:06 2024 +0800 bpf: Prevent tailcall infinite loop caused by freplace [ Upstream commit d6083f040d5d8f8d748462c77e90547097df936e ] There is a potential infinite loop issue that can occur when using a combination of tail calls and freplace. In an upcoming selftest, the attach target for entry_freplace of tailcall_freplace.c is subprog_tc of tc_bpf2bpf.c, while the tail call in entry_freplace leads to entry_tc. This results in an infinite loop: entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc. The problem arises because the tail_call_cnt in entry_freplace resets to zero each time entry_freplace is executed, causing the tail call mechanism to never terminate, eventually leading to a kernel panic. To fix this issue, the solution is twofold: 1. Prevent updating a program extended by an freplace program to a prog_array map. 2. Prevent extending a program that is already part of a prog_array map with an freplace program. This ensures that: * If a program or its subprogram has been extended by an freplace program, it can no longer be updated to a prog_array map. * If a program has been added to a prog_array map, neither it nor its subprograms can be extended by an freplace program. Moreover, an extension program should not be tailcalled. As such, return -EINVAL if the program has a type of BPF_PROG_TYPE_EXT when adding it to a prog_array map. Additionally, fix a minor code style issue by replacing eight spaces with a tab for proper formatting. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20241015150207.70264-2-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andrii Nakryiko <andrii@kernel.org> Date: Fri Nov 1 11:17:52 2024 -0700 bpf: put bpf_link's program when link is safe to be deallocated [ Upstream commit f44ec8733a8469143fde1984b5e6931b2e2f6f3f ] In general, BPF link's underlying BPF program should be considered to be reachable through attach hook -> link -> prog chain, and, pessimistically, we have to assume that as long as link's memory is not safe to free, attach hook's code might hold a pointer to BPF program and use it. As such, it's not (generally) correct to put link's program early before waiting for RCU GPs to go through. More eager bpf_prog_put() that we currently do is mostly correct due to BPF program's release code doing similar RCU GP waiting, but as will be shown in the following patches, BPF program can be non-sleepable (and, thus, reliant on only "classic" RCU GP), while BPF link's attach hook can have sleepable semantics and needs to be protected by RCU Tasks Trace, and for such cases BPF link has to go through RCU Tasks Trace + "classic" RCU GPs before being deallocated. And so, if we put BPF program early, we might free BPF program before we free BPF link, leading to use-after-free situation. So, this patch defers bpf_prog_put() until we are ready to perform bpf_link's deallocation. At worst, this delays BPF program freeing by one extra RCU GP, but that seems completely acceptable. Alternatively, we'd need more elaborate ways to determine BPF hook, BPF link, and BPF program lifetimes, and how they relate to each other, which seems like an unnecessary complication. Note, for most BPF links we still will perform eager bpf_prog_put() and link dealloc, so for those BPF links there are no observable changes whatsoever. Only BPF links that use deferred dealloc might notice slightly delayed freeing of BPF programs. Also, to reduce code and logic duplication, extract program put + link dealloc logic into bpf_link_dealloc() helper. Link: https://lore.kernel.org/20241101181754.782341-1-andrii@kernel.org Tested-by: Jordan Rife <jrife@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hou Tao <houtao1@huawei.com> Date: Fri Dec 6 19:06:15 2024 +0800 bpf: Remove unnecessary kfree(im_node) in lpm_trie_update_elem [ Upstream commit 3d5611b4d7efbefb85a74fcdbc35c603847cc022 ] There is no need to call kfree(im_node) when updating element fails, because im_node must be NULL. Remove the unnecessary kfree() for im_node. Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241206110622.1161752-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: 532d6b36b2bf ("bpf: Handle in-place update for full LPM trie correctly") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amir Mohammadi <amirmohammadi1999.am@gmail.com> Date: Thu Nov 21 12:04:13 2024 +0330 bpftool: fix potential NULL pointer dereferencing in prog_dump() [ Upstream commit ef3ba8c258ee368a5343fa9329df85b4bcb9e8b5 ] A NULL pointer dereference could occur if ksyms is not properly checked before usage in the prog_dump() function. Fixes: b053b439b72a ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump") Signed-off-by: Amir Mohammadi <amiremohamadi@yahoo.com> Reviewed-by: Quentin Monnet <qmo@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20241121083413.7214-1-amiremohamadi@yahoo.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qu Wenruo <wqu@suse.com> Date: Tue Sep 24 12:52:17 2024 +0930 btrfs: avoid unnecessary device path update for the same device [ Upstream commit 2e8b6bc0ab41ce41e6dfcc204b6cc01d5abbc952 ] [PROBLEM] It is very common for udev to trigger device scan, and every time a mounted btrfs device got re-scan from different soft links, we will get some of unnecessary device path updates, this is especially common for LVM based storage: # lvs scratch1 test -wi-ao---- 10.00g scratch2 test -wi-a----- 10.00g scratch3 test -wi-a----- 10.00g scratch4 test -wi-a----- 10.00g scratch5 test -wi-a----- 10.00g test test -wi-a----- 10.00g # mkfs.btrfs -f /dev/test/scratch1 # mount /dev/test/scratch1 /mnt/btrfs # dmesg -c [ 205.705234] BTRFS: device fsid 7be2602f-9e35-4ecf-a6ff-9e91d2c182c9 devid 1 transid 6 /dev/mapper/test-scratch1 (253:4) scanned by mount (1154) [ 205.710864] BTRFS info (device dm-4): first mount of filesystem 7be2602f-9e35-4ecf-a6ff-9e91d2c182c9 [ 205.711923] BTRFS info (device dm-4): using crc32c (crc32c-intel) checksum algorithm [ 205.713856] BTRFS info (device dm-4): using free-space-tree [ 205.722324] BTRFS info (device dm-4): checking UUID tree So far so good, but even if we just touched any soft link of "dm-4", we will get quite some unnecessary device path updates. # touch /dev/mapper/test-scratch1 # dmesg -c [ 469.295796] BTRFS info: devid 1 device path /dev/mapper/test-scratch1 changed to /dev/dm-4 scanned by (udev-worker) (1221) [ 469.300494] BTRFS info: devid 1 device path /dev/dm-4 changed to /dev/mapper/test-scratch1 scanned by (udev-worker) (1221) Such device path rename is unnecessary and can lead to random path change due to the udev race. [CAUSE] Inside device_list_add(), we are using a very primitive way checking if the device has changed, strcmp(). Which can never handle links well, no matter if it's hard or soft links. So every different link of the same device will be treated as a different device, causing the unnecessary device path update. [FIX] Introduce a helper, is_same_device(), and use path_equal() to properly detect the same block device. So that the different soft links won't trigger the rename race. Reviewed-by: Filipe Manana <fdmanana@suse.com> Link: https://bugzilla.suse.com/show_bug.cgi?id=1230641 Reported-by: Fabian Vogt <fvogt@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qu Wenruo <wqu@suse.com> Date: Tue Sep 24 14:27:07 2024 +0930 btrfs: canonicalize the device path before adding it [ Upstream commit 7e06de7c83a746e58d4701e013182af133395188 ] [PROBLEM] Currently btrfs accepts any file path for its device, resulting some weird situation: # ./mount_by_fd /dev/test/scratch1 /mnt/btrfs/ The program has the following source code: #include <fcntl.h> #include <stdio.h> #include <sys/mount.h> int main(int argc, char *argv[]) { int fd = open(argv[1], O_RDWR); char path[256]; snprintf(path, sizeof(path), "/proc/self/fd/%d", fd); return mount(path, argv[2], "btrfs", 0, NULL); } Then we can have the following weird device path: BTRFS: device fsid 2378be81-fe12-46d2-a9e8-68cf08dd98d5 devid 1 transid 7 /proc/self/fd/3 (253:2) scanned by mount_by_fd (18440) Normally it's not a big deal, and later udev can trigger a device path rename. But if udev didn't trigger, the device path "/proc/self/fd/3" will show up in mtab. [CAUSE] For filename "/proc/self/fd/3", it means the opened file descriptor 3. In above case, it's exactly the device we want to open, aka points to "/dev/test/scratch1" which is another symlink pointing to "/dev/dm-2". Inside kernel we solve the mount source using LOOKUP_FOLLOW, which follows the symbolic link and grab the proper block device. But inside btrfs we also save the filename into btrfs_device::name, and utilize that member to report our mount source, which leads to the above situation. [FIX] Instead of unconditionally trust the path, check if the original file (not following the symbolic link) is inside "/dev/", if not, then manually lookup the path to its final destination, and use that as our device path. This allows us to still use symbolic links, like "/dev/mapper/test-scratch" from LVM2, which is required for fstests runs with LVM2 setup. And for really weird names, like the above case, we solve it to "/dev/dm-2" instead. Reviewed-by: Filipe Manana <fdmanana@suse.com> Link: https://bugzilla.suse.com/show_bug.cgi?id=1230641 Reported-by: Fabian Vogt <fvogt@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Boris Burkov <boris@bur.io> Date: Tue Oct 15 14:27:32 2024 -0700 btrfs: do not clear read-only when adding sprout device [ Upstream commit 70958a949d852cbecc3d46127bf0b24786df0130 ] If you follow the seed/sprout wiki, it suggests the following workflow: btrfstune -S 1 seed_dev mount seed_dev mnt btrfs device add sprout_dev mount -o remount,rw mnt The first mount mounts the FS readonly, which results in not setting BTRFS_FS_OPEN, and setting the readonly bit on the sb. The device add somewhat surprisingly clears the readonly bit on the sb (though the mount is still practically readonly, from the users perspective...). Finally, the remount checks the readonly bit on the sb against the flag and sees no change, so it does not run the code intended to run on ro->rw transitions, leaving BTRFS_FS_OPEN unset. As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and does no work. This results in leaking deleted snapshots until we run out of space. I propose fixing it at the first departure from what feels reasonable: when we clear the readonly bit on the sb during device add. A new fstest I have written reproduces the bug and confirms the fix. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johannes Thumshirn <johannes.thumshirn@wdc.com> Date: Tue Sep 10 09:55:01 2024 +0200 btrfs: don't take dev_replace rwsem on task already holding it [ Upstream commit 8cca35cb29f81eba3e96ec44dad8696c8a2f9138 ] Running fstests btrfs/011 with MKFS_OPTIONS="-O rst" to force the usage of the RAID stripe-tree, we get the following splat from lockdep: BTRFS info (device sdd): dev_replace from /dev/sdd (devid 1) to /dev/sdb started ============================================ WARNING: possible recursive locking detected 6.11.0-rc3-btrfs-for-next #599 Not tainted -------------------------------------------- btrfs/2326 is trying to acquire lock: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250 but task is already holding lock: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&fs_info->dev_replace.rwsem); lock(&fs_info->dev_replace.rwsem); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by btrfs/2326: #0: ffff88810f215c98 (&fs_info->dev_replace.rwsem){++++}-{3:3}, at: btrfs_map_block+0x39f/0x2250 stack backtrace: CPU: 1 UID: 0 PID: 2326 Comm: btrfs Not tainted 6.11.0-rc3-btrfs-for-next #599 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x5b/0x80 __lock_acquire+0x2798/0x69d0 ? __pfx___lock_acquire+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 lock_acquire+0x19d/0x4a0 ? btrfs_map_block+0x39f/0x2250 ? __pfx_lock_acquire+0x10/0x10 ? find_held_lock+0x2d/0x110 ? lock_is_held_type+0x8f/0x100 down_read+0x8e/0x440 ? btrfs_map_block+0x39f/0x2250 ? __pfx_down_read+0x10/0x10 ? do_raw_read_unlock+0x44/0x70 ? _raw_read_unlock+0x23/0x40 btrfs_map_block+0x39f/0x2250 ? btrfs_dev_replace_by_ioctl+0xd69/0x1d00 ? btrfs_bio_counter_inc_blocked+0xd9/0x2e0 ? __kasan_slab_alloc+0x6e/0x70 ? __pfx_btrfs_map_block+0x10/0x10 ? __pfx_btrfs_bio_counter_inc_blocked+0x10/0x10 ? kmem_cache_alloc_noprof+0x1f2/0x300 ? mempool_alloc_noprof+0xed/0x2b0 btrfs_submit_chunk+0x28d/0x17e0 ? __pfx_btrfs_submit_chunk+0x10/0x10 ? bvec_alloc+0xd7/0x1b0 ? bio_add_folio+0x171/0x270 ? __pfx_bio_add_folio+0x10/0x10 ? __kasan_check_read+0x20/0x20 btrfs_submit_bio+0x37/0x80 read_extent_buffer_pages+0x3df/0x6c0 btrfs_read_extent_buffer+0x13e/0x5f0 read_tree_block+0x81/0xe0 read_block_for_search+0x4bd/0x7a0 ? __pfx_read_block_for_search+0x10/0x10 btrfs_search_slot+0x78d/0x2720 ? __pfx_btrfs_search_slot+0x10/0x10 ? lock_is_held_type+0x8f/0x100 ? kasan_save_track+0x14/0x30 ? __kasan_slab_alloc+0x6e/0x70 ? kmem_cache_alloc_noprof+0x1f2/0x300 btrfs_get_raid_extent_offset+0x181/0x820 ? __pfx_lock_acquire+0x10/0x10 ? __pfx_btrfs_get_raid_extent_offset+0x10/0x10 ? down_read+0x194/0x440 ? __pfx_down_read+0x10/0x10 ? do_raw_read_unlock+0x44/0x70 ? _raw_read_unlock+0x23/0x40 btrfs_map_block+0x5b5/0x2250 ? __pfx_btrfs_map_block+0x10/0x10 scrub_submit_initial_read+0x8fe/0x11b0 ? __pfx_scrub_submit_initial_read+0x10/0x10 submit_initial_group_read+0x161/0x3a0 ? lock_release+0x20e/0x710 ? __pfx_submit_initial_group_read+0x10/0x10 ? __pfx_lock_release+0x10/0x10 scrub_simple_mirror.isra.0+0x3eb/0x580 scrub_stripe+0xe4d/0x1440 ? lock_release+0x20e/0x710 ? __pfx_scrub_stripe+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? do_raw_read_unlock+0x44/0x70 ? _raw_read_unlock+0x23/0x40 scrub_chunk+0x257/0x4a0 scrub_enumerate_chunks+0x64c/0xf70 ? __mutex_unlock_slowpath+0x147/0x5f0 ? __pfx_scrub_enumerate_chunks+0x10/0x10 ? bit_wait_timeout+0xb0/0x170 ? __up_read+0x189/0x700 ? scrub_workers_get+0x231/0x300 ? up_write+0x490/0x4f0 btrfs_scrub_dev+0x52e/0xcd0 ? create_pending_snapshots+0x230/0x250 ? __pfx_btrfs_scrub_dev+0x10/0x10 btrfs_dev_replace_by_ioctl+0xd69/0x1d00 ? lock_acquire+0x19d/0x4a0 ? __pfx_btrfs_dev_replace_by_ioctl+0x10/0x10 ? lock_release+0x20e/0x710 ? btrfs_ioctl+0xa09/0x74f0 ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_lock+0x11e/0x240 ? __pfx_do_raw_spin_lock+0x10/0x10 btrfs_ioctl+0xa14/0x74f0 ? lock_acquire+0x19d/0x4a0 ? find_held_lock+0x2d/0x110 ? __pfx_btrfs_ioctl+0x10/0x10 ? lock_release+0x20e/0x710 ? do_sigaction+0x3f0/0x860 ? __pfx_do_vfs_ioctl+0x10/0x10 ? do_raw_spin_lock+0x11e/0x240 ? lockdep_hardirqs_on_prepare+0x270/0x3e0 ? _raw_spin_unlock_irq+0x28/0x50 ? do_sigaction+0x3f0/0x860 ? __pfx_do_sigaction+0x10/0x10 ? __x64_sys_rt_sigaction+0x18e/0x1e0 ? __pfx___x64_sys_rt_sigaction+0x10/0x10 ? __x64_sys_close+0x7c/0xd0 __x64_sys_ioctl+0x137/0x190 do_syscall_64+0x71/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f0bd1114f9b Code: Unable to access opcode bytes at 0x7f0bd1114f71. RSP: 002b:00007ffc8a8c3130 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f0bd1114f9b RDX: 00007ffc8a8c35e0 RSI: 00000000ca289435 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007 R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffc8a8c6c85 R13: 00000000398e72a0 R14: 0000000000004361 R15: 0000000000000004 </TASK> This happens because on RAID stripe-tree filesystems we recurse back into btrfs_map_block() on scrub to perform the logical to device physical mapping. But as the device replace task is already holding the dev_replace::rwsem we deadlock. So don't take the dev_replace::rwsem in case our task is the task performing the device replace. Suggested-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Sterba <dsterba@suse.com> Date: Wed Oct 9 16:32:17 2024 +0200 btrfs: drop unused parameter data from btrfs_fill_super() [ Upstream commit 01c5db782e3ad1aea1d06a1765c710328c145f10 ] The only caller passes NULL, we can drop the parameter. This is since the new mount option parser done in 3bb17a25bcb09a ("btrfs: add get_tree callback for new mount API"). Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 951a3f59d268 ("btrfs: fix mount failure due to remount races") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Sterba <dsterba@suse.com> Date: Wed Oct 9 16:32:07 2024 +0200 btrfs: drop unused parameter options from open_ctree() [ Upstream commit 87cbab86366e75dec52f787e0e0b17b2aea769ca ] Since the new mount option parser in commit ad21f15b0f79 ("btrfs: switch to the new mount API") we don't pass the options like that anymore. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 951a3f59d268 ("btrfs: fix mount failure due to remount races") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Filipe Manana <fdmanana@suse.com> Date: Fri Nov 29 13:33:03 2024 +0000 btrfs: fix missing snapshot drew unlock when root is dead during swap activation [ Upstream commit 9c803c474c6c002d8ade68ebe99026cc39c37f85 ] When activating a swap file we acquire the root's snapshot drew lock and then check if the root is dead, failing and returning with -EPERM if it's dead but without unlocking the root's snapshot lock. Fix this by adding the missing unlock. Fixes: 60021bd754c6 ("btrfs: prevent subvol with swapfile from being deleted") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qu Wenruo <wqu@suse.com> Date: Wed Oct 30 11:25:48 2024 +1030 btrfs: fix mount failure due to remount races [ Upstream commit 951a3f59d268fe1397aaeb9a96fcb1944890c4cb ] [BUG] The following reproducer can cause btrfs mount to fail: dev="/dev/test/scratch1" mnt1="/mnt/test" mnt2="/mnt/scratch" mkfs.btrfs -f $dev mount $dev $mnt1 btrfs subvolume create $mnt1/subvol1 btrfs subvolume create $mnt1/subvol2 umount $mnt1 mount $dev $mnt1 -o subvol=subvol1 while mount -o remount,ro $mnt1; do mount -o remount,rw $mnt1; done & bg=$! while mount $dev $mnt2 -o subvol=subvol2; do umount $mnt2; done kill $bg wait umount -R $mnt1 umount -R $mnt2 The script will fail with the following error: mount: /mnt/scratch: /dev/mapper/test-scratch1 already mounted on /mnt/test. dmesg(1) may have more information after failed mount system call. umount: /mnt/test: target is busy. umount: /mnt/scratch/: not mounted And there is no kernel error message. [CAUSE] During the btrfs mount, to support mounting different subvolumes with different RO/RW flags, we need to detect that and retry if needed: Retry with matching RO flags if the initial mount fail with -EBUSY. The problem is, during that retry we do not hold any super block lock (s_umount), this means there can be a remount process changing the RO flags of the original fs super block. If so, we can have an EBUSY error during retry. And this time we treat any failure as an error, without any retry and cause the above EBUSY mount failure. [FIX] The current retry behavior is racy because we do not have a super block thus no way to hold s_umount to prevent the race with remount. Solve the root problem by allowing fc->sb_flags to mismatch from the sb->s_flags at btrfs_get_tree_super(). Then at the re-entry point btrfs_get_tree_subvol(), manually check the fc->s_flags against sb->s_flags, if it's a RO->RW mismatch, then reconfigure with s_umount lock hold. Reported-by: Enno Gotthold <egotthold@suse.com> Reported-by: Fabian Vogt <fvogt@suse.com> [ Special thanks for the reproducer and early analysis pointing to btrfs. ] Fixes: f044b318675f ("btrfs: handle the ro->rw transition for mounting different subvolumes") Link: https://bugzilla.suse.com/show_bug.cgi?id=1231836 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Date: Wed Nov 27 16:22:46 2024 -0800 cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU commit b3fce429a1e030b50c1c91351d69b8667eef627b upstream. Commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU") adds functionality that architectures can use to optionally allocate and build cacheinfo early during boot. Commit 6539cffa9495 ("cacheinfo: Add arch specific early level initializer") lets secondary CPUs correct (and reallocate memory) cacheinfo data if needed. If the early build functionality is not used and cacheinfo does not need correction, memory for cacheinfo is never allocated. x86 does not use the early build functionality. Consequently, during the cacheinfo CPU hotplug callback, last_level_cache_is_valid() attempts to dereference a NULL pointer: BUG: kernel NULL pointer dereference, address: 0000000000000100 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not present page PGD 0 P4D 0 Oops: 0000 [#1] PREEPMT SMP NOPTI CPU: 0 PID 19 Comm: cpuhp/0 Not tainted 6.4.0-rc2 #1 RIP: 0010: last_level_cache_is_valid+0x95/0xe0a Allocate memory for cacheinfo during the cacheinfo CPU hotplug callback if not done earlier. Moreover, before determining the validity of the last-level cache info, ensure that it has been allocated. Simply checking for non-zero cache_leaves() is not sufficient, as some architectures (e.g., Intel processors) have non-zero cache_leaves() before allocation. Dereferencing NULL cacheinfo can occur in update_per_cpu_data_slice_size(). This function iterates over all online CPUs. However, a CPU may have come online recently, but its cacheinfo may not have been allocated yet. While here, remove an unnecessary indentation in allocate_cache_info(). [ bp: Massage. ] Fixes: 6539cffa9495 ("cacheinfo: Add arch specific early level initializer") Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Radu Rendec <rrendec@redhat.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: stable@vger.kernel.org # 6.3+ Link: https://lore.kernel.org/r/20241128002247.26726-2-ricardo.neri-calderon@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:42 2024 +0100 can: c_can: c_can_handle_bus_err(): update statistics if skb allocation fails [ Upstream commit 9e66242504f49e17481d8e197730faba7d99c934 ] Ensure that the statistics are always updated, even if the skb allocation fails. Fixes: 4d6d26537940 ("can: c_can: fix {rx,tx}_errors statistics") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-2-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marc Kleine-Budde <mkl@pengutronix.de> Date: Thu Nov 21 11:08:25 2024 +0100 can: dev: can_set_termination(): allow sleeping GPIOs commit ee1dfbdd8b4b6de85e96ae2059dc9c1bdb6b49b5 upstream. In commit 6e86a1543c37 ("can: dev: provide optional GPIO based termination support") GPIO based termination support was added. For no particular reason that patch uses gpiod_set_value() to set the GPIO. This leads to the following warning, if the systems uses a sleeping GPIO, i.e. behind an I2C port expander: | WARNING: CPU: 0 PID: 379 at /drivers/gpio/gpiolib.c:3496 gpiod_set_value+0x50/0x6c | CPU: 0 UID: 0 PID: 379 Comm: ip Not tainted 6.11.0-20241016-1 #1 823affae360cc91126e4d316d7a614a8bf86236c Replace gpiod_set_value() by gpiod_set_value_cansleep() to allow the use of sleeping GPIOs. Cc: Nicolai Buchwitz <nb@tipi-net.de> Cc: Lino Sanfilippo <l.sanfilippo@kunbus.com> Cc: stable@vger.kernel.org Reported-by: Leonard Göhrs <l.goehrs@pengutronix.de> Tested-by: Leonard Göhrs <l.goehrs@pengutronix.de> Fixes: 6e86a1543c37 ("can: dev: provide optional GPIO based termination support") Link: https://patch.msgid.link/20241121-dev-fix-can_set_termination-v1-1-41fa6e29216d@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:52 2024 +0100 can: ems_usb: ems_usb_rx_err(): fix {rx,tx}_errors statistics [ Upstream commit 72a7e2e74b3075959f05e622bae09b115957dffe ] The ems_usb_rx_err() function only incremented the receive error counter and never the transmit error counter, even if the ECC_DIR flag reported that an error had occurred during transmission. Increment the receive/transmit error counter based on the value of the ECC_DIR flag. Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-12-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:53 2024 +0100 can: f81604: f81604_handle_can_bus_errors(): fix {rx,tx}_errors statistics [ Upstream commit d7b916540c2ba3d2a88c27b2a6287b39d8eac052 ] The f81604_handle_can_bus_errors() function only incremented the receive error counter and never the transmit error counter, even if the ECC_DIR flag reported that an error had occurred during transmission. Increment the receive/transmit error counter based on the value of the ECC_DIR flag. Fixes: 88da17436973 ("can: usb: f81604: add Fintek F81604 support") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-13-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Kozhinov <ak.alexander.kozhinov@gmail.com> Date: Fri Oct 18 23:24:26 2024 +0200 can: gs_usb: add usb endpoint address detection at driver probe step [ Upstream commit 889b2ae9139a87b3390f7003cb1bb3d65bf90a26 ] There is an approach made to implement gs_usb firmware/driver based on Zephyr RTOS. It was found that USB stack of Zephyr RTOS overwrites USB EP addresses, if they have different last 4 bytes in absence of other endpoints. For example in case of gs_usb candlelight firmware EP-IN is 0x81 and EP-OUT 0x02. If there are no additional USB endpoints, Zephyr RTOS will overwrite EP-OUT to 0x01. More information can be found in the discussion with Zephyr RTOS USB stack maintainer here: https://github.com/zephyrproject-rtos/zephyr/issues/67812 There are already two different gs_usb FW driver implementations based on Zephyr RTOS: 1. https://github.com/CANnectivity/cannectivity (by: https://github.com/henrikbrixandersen) 2. https://github.com/zephyrproject-rtos/zephyr/compare/main...KozhinovAlexander:zephyr:gs_usb (by: https://github.com/KozhinovAlexander) At the moment both Zephyr RTOS implementations use dummy USB endpoint, to overcome described USB stack behavior from Zephyr itself. Since Zephyr RTOS is intended to be used on microcontrollers with very constrained amount of resources (ROM, RAM) and additional endpoint requires memory, it is more convenient to update the gs_usb driver in the Linux kernel. To fix this problem, update the gs_usb driver from using hard coded endpoint numbers to evaluate the endpoint descriptors and use the endpoints provided there. Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Alexander Kozhinov <ak.alexander.kozhinov@gmail.com> Link: https://patch.msgid.link/20241018212450.31746-1-ak.alexander.kozhinov@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:45 2024 +0100 can: hi311x: hi3110_can_ist(): fix potential use-after-free [ Upstream commit 9ad86d377ef4a19c75a9c639964879a5b25a433b ] The commit a22bd630cfff ("can: hi311x: do not report txerr and rxerr during bus-off") removed the reporting of rxerr and txerr even in case of correct operation (i. e. not bus-off). The error count information added to the CAN frame after netif_rx() is a potential use after free, since there is no guarantee that the skb is in the same state. It might be freed or reused. Fix the issue by postponing the netif_rx() call in case of txerr and rxerr reporting. Fixes: a22bd630cfff ("can: hi311x: do not report txerr and rxerr during bus-off") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-5-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:49 2024 +0100 can: hi311x: hi3110_can_ist(): fix {rx,tx}_errors statistics [ Upstream commit 3e4645931655776e757f9fb5ae29371cd7cb21a2 ] The hi3110_can_ist() function was incorrectly incrementing only the receive error counter, even in cases of bit or acknowledgment errors that occur during transmission. The fix the issue by incrementing the appropriate counter based on the type of error. Fixes: 57e83fb9b746 ("can: hi311x: Add Holt HI-311x CAN driver") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-9-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:48 2024 +0100 can: ifi_canfd: ifi_canfd_handle_lec_err(): fix {rx,tx}_errors statistics [ Upstream commit bb03d568bb21b4afe7935d1943bcf68ddea3ea45 ] The ifi_canfd_handle_lec_err() function was incorrectly incrementing only the receive error counter, even in cases of bit or acknowledgment errors that occur during transmission. Fix the issue by incrementing the appropriate counter based on the type of error. Fixes: 5bbd655a8bd0 ("can: ifi: Add more detailed error reporting") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Reviewed-by: Marek Vasut <marex@denx.de> Link: https://patch.msgid.link/20241122221650.633981-8-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Antipov <dmantipov@yandex.ru> Date: Tue Nov 5 12:48:23 2024 +0300 can: j1939: j1939_session_new(): fix skb reference counting [ Upstream commit a8c695005bfe6569acd73d777ca298ddddd66105 ] Since j1939_session_skb_queue() does an extra skb_get() for each new skb, do the same for the initial one in j1939_session_new() to avoid refcount underflow. Reported-by: syzbot+d4e8dc385d9258220c31@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d4e8dc385d9258220c31 Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241105094823.2403806-1-dmantipov@yandex.ru [mkl: clean up commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:47 2024 +0100 can: m_can: m_can_handle_lec_err(): fix {rx,tx}_errors statistics [ Upstream commit 988d4222bf9039a875a3d48f2fe35c317831ff68 ] The m_can_handle_lec_err() function was incorrectly incrementing only the receive error counter, even in cases of bit or acknowledgment errors that occur during transmission. Fix the issue by incrementing the appropriate counter based on the type of error. Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-7-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marc Kleine-Budde <mkl@pengutronix.de> Date: Sun Nov 24 18:42:56 2024 +0100 can: mcp251xfd: mcp251xfd_get_tef_len(): work around erratum DS80000789E 6. commit 30447a1bc0e066e492552b3e5ffeb63c1605dfe2 upstream. Commit b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum") introduced mcp251xfd_get_tef_len() to get the number of unhandled transmit events from the Transmit Event FIFO (TEF). As the TEF has no head index, the driver uses the TX-FIFO's tail index instead, assuming that send frames are completed. When calculating the number of unhandled TEF events, that commit didn't take mcp2518fd erratum DS80000789E 6. into account. According to that erratum, the FIFOCI bits of a FIFOSTA register, here the TX-FIFO tail index might be corrupted. However here it seems the bit indicating that the TX-FIFO is empty (MCP251XFD_REG_FIFOSTA_TFERFFIF) is not correct while the TX-FIFO tail index is. Assume that the TX-FIFO is indeed empty if: - Chip's head and tail index are equal (len == 0). - The TX-FIFO is less than half full. (The TX-FIFO empty case has already been checked at the beginning of this function.) - No free buffers in the TX ring. If the TX-FIFO is assumed to be empty, assume that the TEF is full and return the number of elements in the TX-FIFO (which equals the number of TEF elements). If these assumptions are false, the driver might read to many objects from the TEF. mcp251xfd_handle_tefif_one() checks the sequence numbers and will refuse to process old events. Reported-by: Renjaya Raga Zenta <renjaya.zenta@formulatrix.com> Closes: https://patch.msgid.link/CAJ7t6HgaeQ3a_OtfszezU=zB-FqiZXqrnATJ3UujNoQJJf7GgA@mail.gmail.com Fixes: b8e0ddd36ce9 ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum") Tested-by: Renjaya Raga Zenta <renjaya.zenta@formulatrix.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241126-mcp251xfd-fix-length-calculation-v2-1-c2ed516ed6ba@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:50 2024 +0100 can: sja1000: sja1000_err(): fix {rx,tx}_errors statistics [ Upstream commit 2c4ef3af4b028a0eaaf378df511d3b425b1df61f ] The sja1000_err() function only incremented the receive error counter and never the transmit error counter, even if the ECC_DIR flag reported that an error had occurred during transmission. Increment the receive/transmit error counter based on the value of the ECC_DIR flag. Fixes: 429da1cc841b ("can: Driver for the SJA1000 CAN controller") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-10-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:43 2024 +0100 can: sun4i_can: sun4i_can_err(): call can_change_state() even if cf is NULL [ Upstream commit ee6bf3677ae03569d833795064e17f605c2163c7 ] Call the function can_change_state() if the allocation of the skb fails, as it handles the cf parameter when it is null. Additionally, this ensures that the statistics related to state error counters (i. e. warning, passive, and bus-off) are updated. Fixes: 0738eff14d81 ("can: Allwinner A10/A20 CAN Controller support - Kernel module") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-3-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dario Binacchi <dario.binacchi@amarulasolutions.com> Date: Fri Nov 22 23:15:51 2024 +0100 can: sun4i_can: sun4i_can_err(): fix {rx,tx}_errors statistics [ Upstream commit 595a81988a6fe06eb5849e972c8b9cb21c4e0d54 ] The sun4i_can_err() function only incremented the receive error counter and never the transmit error counter, even if the STA_ERR_DIR flag reported that an error had occurred during transmission. Increment the receive/transmit error counter based on the value of the STA_ERR_DIR flag. Fixes: 0738eff14d81 ("can: Allwinner A10/A20 CAN Controller support - Kernel module") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-11-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Przemek Kitszel <przemyslaw.kitszel@intel.com> Date: Fri Oct 18 13:38:14 2024 +0200 cleanup: Adjust scoped_guard() macros to avoid potential warning [ Upstream commit fcc22ac5baf06dd17193de44b60dbceea6461983 ] Change scoped_guard() and scoped_cond_guard() macros to make reasoning about them easier for static analysis tools (smatch, compiler diagnostics), especially to enable them to tell if the given usage of scoped_guard() is with a conditional lock class (interruptible-locks, try-locks) or not (like simple mutex_lock()). Add compile-time error if scoped_cond_guard() is used for non-conditional lock class. Beyond easier tooling and a little shrink reported by bloat-o-meter this patch enables developer to write code like: int foo(struct my_drv *adapter) { scoped_guard(spinlock, &adapter->some_spinlock) return adapter->spinlock_protected_var; } Current scoped_guard() implementation does not support that, due to compiler complaining: error: control reaches end of non-void function [-Werror=return-type] Technical stuff about the change: scoped_guard() macro uses common idiom of using "for" statement to declare a scoped variable. Unfortunately, current logic is too hard for compiler diagnostics to be sure that there is exactly one loop step; fix that. To make any loop so trivial that there is no above warning, it must not depend on any non-const variable to tell if there are more steps. There is no obvious solution for that in C, but one could use the compound statement expression with "goto" jumping past the "loop", effectively leaving only the subscope part of the loop semantics. More impl details: one more level of macro indirection is now needed to avoid duplicating label names; I didn't spot any other place that is using the "for (...; goto label) if (0) label: break;" idiom, so it's not packed for reuse beyond scoped_guard() family, what makes actual macros code cleaner. There was also a need to introduce const true/false variable per lock class, it is used to aid compiler diagnostics reasoning about "exactly 1 step" loops (note that converting that to function would undo the whole benefit). Big thanks to Andy Shevchenko for help on this patch, both internal and public, ranging from whitespace/formatting, through commit message clarifications, general improvements, ending with presenting alternative approaches - all despite not even liking the idea. Big thanks to Dmitry Torokhov for the idea of compile-time check for scoped_cond_guard() (to use it only with conditional locsk), and general improvements for the patch. Big thanks to David Lechner for idea to cover also scoped_cond_guard(). Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lkml.kernel.org/r/20241018113823.171256-1-przemyslaw.kitszel@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Haoyu Li <lihaoyu499@gmail.com> Date: Tue Dec 3 22:29:15 2024 +0800 clk: en7523: Initialize num before accessing hws in en7523_register_clocks() [ Upstream commit 52fd1709e41d3a85b48bcfe2404a024ebaf30c3b ] With the new __counted_by annotation in clk_hw_onecell_data, the "num" struct member must be set before accessing the "hws" array. Failing to do so will trigger a runtime warning when enabling CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Fixes: f316cdff8d67 ("clk: Annotate struct clk_hw_onecell_data with __counted_by") Signed-off-by: Haoyu Li <lihaoyu499@gmail.com> Link: https://lore.kernel.org/r/20241203142915.345523-1-lihaoyu499@gmail.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Devi Priya <quic_devipriy@quicinc.com> Date: Mon Oct 28 11:35:01 2024 +0530 clk: qcom: clk-alpha-pll: Add NSS HUAYRA ALPHA PLL support for ipq9574 [ Upstream commit 79dfed29aa3f714e0a94a39b2bfe9ac14ce19a6a ] Add support for NSS Huayra alpha pll found on ipq9574 SoCs. Programming sequence is the same as that of Huayra type Alpha PLL, so we can re-use the same. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Devi Priya <quic_devipriy@quicinc.com> Link: https://lore.kernel.org/r/20241028060506.246606-2-quic_srichara@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Sun Oct 27 03:24:49 2024 +0200 clk: qcom: dispcc-sm8550: enable support for SAR2130P [ Upstream commit 1335c7eb7012f23dc073b8ae4ffcfc1f6e69cfb3 ] The display clock controller on SAR2130P is very close to the clock controller on SM8550 (and SM8650). Reuse existing driver to add support for the controller on SAR2130P. Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-clocks-v5-10-ecad2a1432ba@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Sun Oct 27 03:24:45 2024 +0200 clk: qcom: rcg2: add clk_rcg2_shared_floor_ops [ Upstream commit aec8c0e28ce4a1f89fd82fcc06a5cc73147e9817 ] Generally SDCC clocks use clk_rcg2_floor_ops, however on SAR2130P platform it's recommended to use rcg2_shared_ops for all Root Clock Generators to park them instead of disabling. Implement a mix of those, clk_rcg2_shared_floor_ops. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-clocks-v5-6-ecad2a1432ba@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Sun Oct 27 03:24:46 2024 +0200 clk: qcom: rpmh: add support for SAR2130P [ Upstream commit 2cc88de6261f01ebd4e2a3b4e29681fe87d0c089 ] Define clocks as supported by the RPMh on the SAR2130P platform. The msm-5.10 kernel declares just the CXO clock, the RF_CLK1 clock was added following recommendation from Taniya Das. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Taniya Das <quic_tdas@quicinc.com> Link: https://lore.kernel.org/r/20241027-sar2130p-clocks-v5-7-ecad2a1432ba@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Sun Oct 27 03:24:48 2024 +0200 clk: qcom: tcsrcc-sm8550: add SAR2130P support [ Upstream commit d2e0a043530b9d6f37a8de8f05e0725667aba0a6 ] The SAR2130P platform has the same TCSR Clock Controller as the SM8550, except for the lack of the UFS clocks. Extend the SM8550 TCSRCC driver to support SAR2130P. Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-clocks-v5-9-ecad2a1432ba@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Gleixner <tglx@linutronix.de> Date: Tue Dec 3 11:16:30 2024 +0100 clocksource: Make negative motion detection more robust commit 76031d9536a076bf023bedbdb1b4317fc801dd67 upstream. Guenter reported boot stalls on a emulated ARM 32-bit platform, which has a 24-bit wide clocksource. It turns out that the calculated maximal idle time, which limits idle sleeps to prevent clocksource wrap arounds, is close to the point where the negative motion detection triggers. max_idle_ns: 597268854 ns negative motion tripping point: 671088640 ns If the idle wakeup is delayed beyond that point, the clocksource advances far enough to trigger the negative motion detection. This prevents the clock to advance and in the worst case the system stalls completely if the consecutive sleeps based on the stale clock are delayed as well. Cure this by calculating a more robust cut-off value for negative motion, which covers 87.5% of the actual clocksource counter width. Compare the delta against this value to catch negative motion. This is specifically for clock sources with a small counter width as their wrap around time is close to the half counter width. For clock sources with wide counters this is not a problem because the maximum idle time is far from the half counter width due to the math overflow protection constraints. For the case at hand this results in a tripping point of 1174405120ns. Note, that this cannot prevent issues when the delay exceeds the 87.5% margin, but that's not different from the previous unchecked version which allowed arbitrary time jumps. Systems with small counter width are prone to invalid results, but this problem is unlikely to be seen on real hardware. If such a system completely stalls for more than half a second, then there are other more urgent problems than the counter wrapping around. Fixes: c163e40af9b2 ("timekeeping: Always check for negative motion") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/8734j5ul4x.ffs@tglx Closes: https://lore.kernel.org/all/387b120b-d68a-45e8-b6ab-768cd95d11c2@roeck-us.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Lukas Wunner <lukas@wunner.de> Date: Tue Sep 10 16:30:24 2024 +0200 crypto: ecdsa - Avoid signed integer overflow on signature decoding [ Upstream commit 3b0565c703503f832d6cd7ba805aafa3b330cb9d ] When extracting a signature component r or s from an ASN.1-encoded integer, ecdsa_get_signature_rs() subtracts the expected length "bufsize" from the ASN.1 length "vlen" (both of unsigned type size_t) and stores the result in "diff" (of signed type ssize_t). This results in a signed integer overflow if vlen > SSIZE_MAX + bufsize. The kernel is compiled with -fno-strict-overflow, which implies -fwrapv, meaning signed integer overflow is not undefined behavior. And the function does check for overflow: if (-diff >= bufsize) return -EINVAL; So the code is fine in principle but not very obvious. In the future it might trigger a false-positive with CONFIG_UBSAN_SIGNED_WRAP=y. Avoid by comparing the two unsigned variables directly and erroring out if "vlen" is too large. Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ivan Solodovnikov <solodovnikov.ia@phystech.edu> Date: Tue Nov 26 17:39:02 2024 +0300 dccp: Fix memory leak in dccp_feat_change_recv [ Upstream commit 22be4727a8f898442066bcac34f8a1ad0bc72e14 ] If dccp_feat_push_confirm() fails after new value for SP feature was accepted without reconciliation ('entry == NULL' branch), memory allocated for that value with dccp_feat_clone_sp_val() is never freed. Here is the kmemleak stack for this: unreferenced object 0xffff88801d4ab488 (size 8): comm "syz-executor310", pid 1127, jiffies 4295085598 (age 41.666s) hex dump (first 8 bytes): 01 b4 4a 1d 80 88 ff ff ..J..... backtrace: [<00000000db7cabfe>] kmemdup+0x23/0x50 mm/util.c:128 [<0000000019b38405>] kmemdup include/linux/string.h:465 [inline] [<0000000019b38405>] dccp_feat_clone_sp_val net/dccp/feat.c:371 [inline] [<0000000019b38405>] dccp_feat_clone_sp_val net/dccp/feat.c:367 [inline] [<0000000019b38405>] dccp_feat_change_recv net/dccp/feat.c:1145 [inline] [<0000000019b38405>] dccp_feat_parse_options+0x1196/0x2180 net/dccp/feat.c:1416 [<00000000b1f6d94a>] dccp_parse_options+0xa2a/0x1260 net/dccp/options.c:125 [<0000000030d7b621>] dccp_rcv_state_process+0x197/0x13d0 net/dccp/input.c:650 [<000000001f74c72e>] dccp_v4_do_rcv+0xf9/0x1a0 net/dccp/ipv4.c:688 [<00000000a6c24128>] sk_backlog_rcv include/net/sock.h:1041 [inline] [<00000000a6c24128>] __release_sock+0x139/0x3b0 net/core/sock.c:2570 [<00000000cf1f3a53>] release_sock+0x54/0x1b0 net/core/sock.c:3111 [<000000008422fa23>] inet_wait_for_connect net/ipv4/af_inet.c:603 [inline] [<000000008422fa23>] __inet_stream_connect+0x5d0/0xf70 net/ipv4/af_inet.c:696 [<0000000015b6f64d>] inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:735 [<0000000010122488>] __sys_connect_file+0x15c/0x1a0 net/socket.c:1865 [<00000000b4b70023>] __sys_connect+0x165/0x1a0 net/socket.c:1882 [<00000000f4cb3815>] __do_sys_connect net/socket.c:1892 [inline] [<00000000f4cb3815>] __se_sys_connect net/socket.c:1889 [inline] [<00000000f4cb3815>] __x64_sys_connect+0x6e/0xb0 net/socket.c:1889 [<00000000e7b1e839>] do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 [<0000000055e91434>] entry_SYSCALL_64_after_hwframe+0x67/0xd1 Clean up the allocated memory in case of dccp_feat_push_confirm() failure and bail out with an error reset code. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: e77b8363b2ea ("dccp: Process incoming Change feature-negotiation options") Signed-off-by: Ivan Solodovnikov <solodovnikov.ia@phystech.edu> Link: https://patch.msgid.link/20241126143902.190853-1-solodovnikov.ia@phystech.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Aring <aahringo@redhat.com> Date: Fri Oct 4 11:13:38 2024 -0400 dlm: fix possible lkb_resource null dereference [ Upstream commit b98333c67daf887c724cd692e88e2db9418c0861 ] This patch fixes a possible null pointer dereference when this function is called from request_lock() as lkb->lkb_resource is not assigned yet, only after validate_lock_args() by calling attach_lkb(). Another issue is that a resource name could be a non printable bytearray and we cannot assume to be ASCII coded. The log functionality is probably never being hit when DLM is used in normal way and no debug logging is enabled. The null pointer dereference can only occur on a new created lkb that does not have the resource assigned yet, it probably never hits the null pointer dereference but we should be sure that other changes might not change this behaviour and we actually can hit the mentioned null pointer dereference. In this patch we just drop the printout of the resource name, the lkb id is enough to make a possible connection to a resource name if this exists. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian König <christian.koenig@amd.com> Date: Fri Nov 8 09:29:48 2024 +0100 dma-buf: fix dma_fence_array_signaled v4 commit 78ac1c3558810486d90aa533b0039aa70487a3da upstream. The function silently assumed that signaling was already enabled for the dma_fence_array. This meant that without enabling signaling first we would never see forward progress. Fix that by falling back to testing each individual fence when signaling isn't enabled yet. v2: add the comment suggested by Boris why this is done this way v3: fix the underflow pointed out by Tvrtko v4: atomic_read_acquire() as suggested by Tvrtko Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12094 Cc: <stable@vger.kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241112121925.18464-1-christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Levi Yun <yeoreum.yun@arm.com> Date: Fri Oct 25 11:06:00 2024 +0100 dma-debug: fix a possible deadlock on radix_lock [ Upstream commit 7543c3e3b9b88212fcd0aaf5cab5588797bdc7de ] radix_lock() shouldn't be held while holding dma_hash_entry[idx].lock otherwise, there's a possible deadlock scenario when dma debug API is called holding rq_lock(): CPU0 CPU1 CPU2 dma_free_attrs() check_unmap() add_dma_entry() __schedule() //out (A) rq_lock() get_hash_bucket() (A) dma_entry_hash check_sync() (A) radix_lock() (W) dma_entry_hash dma_entry_free() (W) radix_lock() // CPU2's one (W) rq_lock() CPU1 situation can happen when it extending radix tree and it tries to wake up kswapd via wake_all_kswapd(). CPU2 situation can happen while perf_event_task_sched_out() (i.e. dma sync operation is called while deleting perf_event using etm and etr tmc which are Arm Coresight hwtracing driver backends). To remove this possible situation, call dma_entry_free() after put_hash_bucket() in check_unmap(). Reported-by: Denis Nikitin <denik@chromium.org> Closes: https://lists.linaro.org/archives/list/coresight@lists.linaro.org/thread/2WMS7BBSF5OZYB63VT44U5YWLFP5HL6U/#RWM6MLQX5ANBTEQ2PRM7OXCBGCE6NPWU Signed-off-by: Levi Yun <yeoreum.yun@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Date: Fri Nov 15 10:21:49 2024 +0000 dma-fence: Fix reference leak on fence merge failure path commit 949291c5314009b4f6e252391edbb40fdd5d5414 upstream. Release all fence references if the output dma-fence-array could not be allocated. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: 245a4a7b531c ("dma-buf: generalize dma_fence unwrap & merging v3") Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Friedrich Vock <friedrich.vock@gmx.de> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: <stable@vger.kernel.org> # v6.0+ Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241115102153.1980-2-tursulin@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Date: Fri Nov 15 10:21:50 2024 +0000 dma-fence: Use kernel's sort for merging fences commit fe52c649438b8489c9456681d93a9b3de3d38263 upstream. One alternative to the fix Christian proposed in https://lore.kernel.org/dri-devel/20241024124159.4519-3-christian.koenig@amd.com/ is to replace the rather complex open coded sorting loops with the kernel standard sort followed by a context squashing pass. Proposed advantage of this would be readability but one concern Christian raised was that there could be many fences, that they are typically mostly sorted, and so the kernel's heap sort would be much worse by the proposed algorithm. I had a look running some games and vkcube to see what are the typical number of input fences. Tested scenarios: 1) Hogwarts Legacy under Gamescope 450 calls per second to __dma_fence_unwrap_merge. Percentages per number of fences buckets, before and after checking for signalled status, sorting and flattening: N Before After 0 0.91% 1 69.40% 2-3 28.72% 9.4% (90.6% resolved to one fence) 4-5 0.93% 6-9 0.03% 10+ 2) Cyberpunk 2077 under Gamescope 1050 calls per second, amounting to 0.01% CPU time according to perf top. N Before After 0 1.13% 1 52.30% 2-3 40.34% 55.57% 4-5 1.46% 0.50% 6-9 2.44% 10+ 2.34% 3) vkcube under Plasma 90 calls per second. N Before After 0 1 2-3 100% 0% (Ie. all resolved to a single fence) 4-5 6-9 10+ In the case of vkcube all invocations in the 2-3 bucket were actually just two input fences. From these numbers it looks like the heap sort should not be a disadvantage, given how the dominant case is <= 2 input fences which heap sort solves with just one compare and swap. (And for the case of one input fence we have a fast path in the previous patch.) A complementary possibility is to implement a different sorting algorithm under the same API as the kernel's sort() and so keep the simplicity, potentially moving the new sort under lib/ if it would be found more widely useful. v2: * Hold on to fence references and reduce commentary. (Christian) * Record and use latest signaled timestamp in the 2nd loop too. * Consolidate zero or one fences fast paths. v3: * Reverse the seqno sort order for a simpler squashing pass. (Christian) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: 245a4a7b531c ("dma-buf: generalize dma_fence unwrap & merging v3") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3617 Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Friedrich Vock <friedrich.vock@gmx.de> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: <stable@vger.kernel.org> # v6.0+ Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241115102153.1980-3-tursulin@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Will Deacon <will@kernel.org> Date: Mon Dec 2 14:57:29 2024 +0000 drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD fails [ Upstream commit d44679fb954ffea961036ed1aeb7d65035f78489 ] Calling the MMIO_GUARD hypercall from guests which have not been enrolled (e.g. because they are running without pvmfw) results in -EINVAL being returned. In this case, MMIO_GUARD is not active and so we can simply proceed with the normal ioremap() routine. Don't fail ioremap() if MMIO_GUARD fails; instead WARN_ON_ONCE() to highlight that the pvm environment is slightly wonky. Fixes: 0f1269495800 ("drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercall") Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241202145731.6422-2-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peterson Guo <peterson.guo@amd.com> Date: Thu Nov 7 19:20:02 2024 -0500 drm/amd/display: Add a left edge pixel if in YCbCr422 or YCbCr420 and odm commit 63e7ee677c74e981257cedfdd8543510d09096ba upstream. [WHY] On some cards when odm is used, the monitor will have 2 separate pipes split vertically. When compression is used on the YCbCr colour space on the second pipe to have correct colours, we need to read a pixel from the end of first pipe to accurately display colours. Hardware was programmed properly to account for this extra pixel but it was not calculated properly in software causing a split screen on some monitors. [HOW] The fix adjusts the second pipe's viewport and timings if the pixel encoding is YCbCr422 or YCbCr420. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: George Shen <george.shen@amd.com> Signed-off-by: Peterson Guo <peterson.guo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sung Lee <Sung.Lee@amd.com> Date: Thu Nov 14 14:18:13 2024 -0500 drm/amd/display: Add option to retrieve detile buffer size [ Upstream commit 6a7fd76b949efe40fb6d6677f480e624e0cb6e40 ] [WHY] For better power profiling knowing the detile buffer size at a given point in time would be useful. [HOW] Add interface to retrieve detile buffer from dc state. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Sung Lee <Sung.Lee@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Leo Chen <leo.chen@amd.com> Date: Mon Oct 7 15:50:35 2024 -0400 drm/amd/display: Adding array index check to prevent memory corruption [ Upstream commit 2c437d9a0b496168e1a1defd17b531f0a526dbe9 ] [Why & How] Array indices out of bound caused memory corruption. Adding checks to ensure that array index stays in bound. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yihan Zhu <Yihan.Zhu@amd.com> Date: Thu Sep 26 09:49:25 2024 -0400 drm/amd/display: calculate final viewport before TAP optimization [ Upstream commit e982310c9ce074e428abc260dc3cba1b1ea62b78 ] Viewport size excess surface size observed sometime with some timings or resizing the MPO video window to cause MPO unsupported. Calculate final viewport size first with a 100x100 dummy viewport to get the max TAP support and then re-run final viewport calculation if TAP value changed. Removed obsolete preliminary viewport calculation for TAP validation. Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: c33a93201ca0 ("drm/amd/display: Ignore scalar validation failure if pipe is phantom") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lo-an Chen <lo-an.chen@amd.com> Date: Thu Nov 14 17:53:41 2024 +0800 drm/amd/display: Correct prefetch calculation commit 24d3749c11d949972d8c22e75567dc90ff5482e7 upstream. [WHY] The minimum value of the dst_y_prefetch_equ was not correct in prefetch calculation whice causes OPTC underflow. [HOW] Add the min operation of dst_y_prefetch_equ in prefetch calculation. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Lo-an Chen <lo-an.chen@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Oct 4 09:34:24 2024 -0400 drm/amd/display: disable SG displays on cyan skillfish [ Upstream commit 66369db7fdd7d58d78673bf83d2b87ea623efb63 ] These parts were mainly for compute workloads, but they have a display that was available for the console. These chips should support SG display, but I don't know that the support was ever validated on Linux so disable it by default. It can still be enabled by setting sg_display=1 for those that want to play with it. These systems also generally had large carve outs so SG display was less of a factor. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3356 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhongwei <Zhongwei.Zhang@amd.com> Date: Wed Sep 18 14:43:49 2024 +0800 drm/amd/display: Fix garbage or black screen when resetting otg [ Upstream commit ffa1e31f70d2e97c121709b44a8960f5d7becb10 ] [Why] For some EDP to MIPI panel, disabling OTG when link is alive like boot case, the converter might output garbage or show no display because our GPU is not sending required pixel data. Alos Dig fifo underflow was found which might cause garbage, when resetting otg for other types of EDP panels. [How] Skipping resetting OTG if the dig fifo is on. Make sure that the otg for the pipe is the one that the dig fifo is selecting via the FE mask. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Date: Wed Sep 25 20:04:15 2024 +0530 drm/amd/display: Fix out-of-bounds access in 'dcn21_link_encoder_create' [ Upstream commit 63de35a8fcfca59ae8750d469a7eb220c7557baf ] An issue was identified in the dcn21_link_encoder_create function where an out-of-bounds access could occur when the hpd_source index was used to reference the link_enc_hpd_regs array. This array has a fixed size and the index was not being checked against the array's bounds before accessing it. This fix adds a conditional check to ensure that the hpd_source index is within the valid range of the link_enc_hpd_regs array. If the index is out of bounds, the function now returns NULL to prevent undefined behavior. References: [ 65.920507] ------------[ cut here ]------------ [ 65.920510] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn21/dcn21_resource.c:1312:29 [ 65.920519] index 7 is out of range for type 'dcn10_link_enc_hpd_registers [5]' [ 65.920523] CPU: 3 PID: 1178 Comm: modprobe Tainted: G OE 6.8.0-cleanershaderfeatureresetasdntipmi200nv2132 #13 [ 65.920525] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS WMJ0429N_Weekly_20_04_2 04/29/2020 [ 65.920527] Call Trace: [ 65.920529] <TASK> [ 65.920532] dump_stack_lvl+0x48/0x70 [ 65.920541] dump_stack+0x10/0x20 [ 65.920543] __ubsan_handle_out_of_bounds+0xa2/0xe0 [ 65.920549] dcn21_link_encoder_create+0xd9/0x140 [amdgpu] [ 65.921009] link_create+0x6d3/0xed0 [amdgpu] [ 65.921355] create_links+0x18a/0x4e0 [amdgpu] [ 65.921679] dc_create+0x360/0x720 [amdgpu] [ 65.921999] ? dmi_matches+0xa0/0x220 [ 65.922004] amdgpu_dm_init+0x2b6/0x2c90 [amdgpu] [ 65.922342] ? console_unlock+0x77/0x120 [ 65.922348] ? dev_printk_emit+0x86/0xb0 [ 65.922354] dm_hw_init+0x15/0x40 [amdgpu] [ 65.922686] amdgpu_device_init+0x26a8/0x33a0 [amdgpu] [ 65.922921] amdgpu_driver_load_kms+0x1b/0xa0 [amdgpu] [ 65.923087] amdgpu_pci_probe+0x1b7/0x630 [amdgpu] [ 65.923087] local_pci_probe+0x4b/0xb0 [ 65.923087] pci_device_probe+0xc8/0x280 [ 65.923087] really_probe+0x187/0x300 [ 65.923087] __driver_probe_device+0x85/0x130 [ 65.923087] driver_probe_device+0x24/0x110 [ 65.923087] __driver_attach+0xac/0x1d0 [ 65.923087] ? __pfx___driver_attach+0x10/0x10 [ 65.923087] bus_for_each_dev+0x7d/0xd0 [ 65.923087] driver_attach+0x1e/0x30 [ 65.923087] bus_add_driver+0xf2/0x200 [ 65.923087] driver_register+0x64/0x130 [ 65.923087] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu] [ 65.923087] __pci_register_driver+0x61/0x70 [ 65.923087] amdgpu_init+0x7d/0xff0 [amdgpu] [ 65.923087] do_one_initcall+0x49/0x310 [ 65.923087] ? kmalloc_trace+0x136/0x360 [ 65.923087] do_init_module+0x6a/0x270 [ 65.923087] load_module+0x1fce/0x23a0 [ 65.923087] init_module_from_file+0x9c/0xe0 [ 65.923087] ? init_module_from_file+0x9c/0xe0 [ 65.923087] idempotent_init_module+0x179/0x230 [ 65.923087] __x64_sys_finit_module+0x5d/0xa0 [ 65.923087] do_syscall_64+0x76/0x120 [ 65.923087] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 65.923087] RIP: 0033:0x7f2d80f1e88d [ 65.923087] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48 [ 65.923087] RSP: 002b:00007ffc7bc1aa78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 65.923087] RAX: ffffffffffffffda RBX: 0000564c9c1db130 RCX: 00007f2d80f1e88d [ 65.923087] RDX: 0000000000000000 RSI: 0000564c9c1e5480 RDI: 000000000000000f [ 65.923087] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000002 [ 65.923087] R10: 000000000000000f R11: 0000000000000246 R12: 0000564c9c1e5480 [ 65.923087] R13: 0000564c9c1db260 R14: 0000000000000000 R15: 0000564c9c1e54b0 [ 65.923087] </TASK> [ 65.923927] ---[ end trace ]--- Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leo Ma <hanghong.ma@amd.com> Date: Fri Oct 11 14:08:34 2024 -0400 drm/amd/display: Fix underflow when playing 8K video in full screen mode [ Upstream commit 4007f07a47de4a277f4760cac3aed1b31d973eea ] [Why&How] Flickering observed while playing 8k HEVC-10 bit video in full screen mode with black border. We didn't support this case for subvp. Make change to the existing check to disable subvp for this corner case. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Leo Ma <hanghong.ma@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leo Chen <leo.chen@amd.com> Date: Thu Oct 3 12:20:23 2024 -0400 drm/amd/display: Full exit out of IPS2 when all allow signals have been cleared [ Upstream commit 0fe33e115fec305c35c66b78ad26e3755ab54b9c ] [Why] A race condition occurs between cursor movement and vertical interrupt control thread from OS, with both threads trying to exit IPS2. Vertical interrupt control thread clears the prev driver allow signal while not fully finishing the IPS2 exit process. [How] We want to detect all the allow signals have been cleared before we perform the full exit. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Leo Chen <leo.chen@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chris Park <chris.park@amd.com> Date: Mon Nov 4 13:18:39 2024 -0500 drm/amd/display: Ignore scalar validation failure if pipe is phantom [ Upstream commit c33a93201ca07119de90e8c952fbdf65920ab55d ] [Why] There are some pipe scaler validation failure when the pipe is phantom and causes crash in DML validation. Since, scalar parameters are not as important in phantom pipe and we require this plane to do successful MCLK switches, the failure condition can be ignored. [How] Ignore scalar validation failure if the pipe validation is marked as phantom pipe. Cc: stable@vger.kernel.org # 6.11+ Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dillon Varone <dillon.varone@amd.com> Date: Wed Nov 13 16:44:15 2024 -0500 drm/amd/display: Limit VTotal range to max hw cap minus fp commit a29997b7ac1f5c816b543e0c56aa2b5b56baac24 upstream. [WHY & HOW] Hardware does not support the VTotal to be between fp2 lines of the maximum possible VTotal, so add a capability flag to track it and apply where necessary. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jun Lei <jun.lei@amd.com> Reviewed-by: Anthony Koo <anthony.koo@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Aurabindo Pillai <aurabindo.pillai@amd.com> Date: Fri Oct 18 10:52:16 2024 -0400 drm/amd/display: parse umc_info or vram_info based on ASIC [ Upstream commit 2551b4a321a68134360b860113dd460133e856e5 ] An upstream bug report suggests that there are production dGPUs that are older than DCN401 but still have a umc_info in VBIOS tables with the same version as expected for a DCN401 product. Hence, reading this tables should be guarded with a version check. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fangzhi Zuo <Jerry.Zuo@amd.com> Date: Thu Oct 17 18:15:10 2024 -0400 drm/amd/display: Prune Invalid Modes For HDMI Output [ Upstream commit abdd2768d7630bc8ec3403aea24f4197bada3c1f ] [Why] 1. HDMI does not have 6 bpc support. Having 6 bpc pass validation does not comply with spec. 2. Validate 420 only for native HDMI, but not apply to pcon use case. 3. Current mode validation log is not readable. [how] 1. Cap 8 bpc for dp-hdmi converter. 2. Validate yuv420 for pcon use case as well, if rgb/yuv444 8bpc cannot fit into pcon bw limitation of the link from the converter to HDMI sink. 3. Add readable pixel_format and color_depth into debug log. Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ausef Yousof <Ausef.Yousof@amd.com> Date: Thu Oct 24 14:06:39 2024 -0400 drm/amd/display: Remove hw w/a toggle if on DP2/HPO [ Upstream commit b4c804628485af2b46f0d24a87190735cac37d61 ] [why&how] Applying a hw w/a only relevant to DIG FIFO causing corruption using HPO, do not apply the w/a if on DP2/HPO Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fudongwang <Fudong.Wang@amd.com> Date: Sat Sep 14 09:33:44 2024 +0800 drm/amd/display: skip disable CRTC in seemless bootup case [ Upstream commit 0e37e4b9afbd08df1f00a70bbb4d1ec273d18c9e ] Resync FIFO is a workaround to write the same value to DENTIST_DISPCLK_CNTL register after programming OTG_PIXEL_RATE_DIV register, in case seemless boot, there is no OTG_PIXEL_RATE_DIV register update, so skip CRTC disable when resync FIFO to avoid random FIFO error and garbage. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Fudongwang <Fudong.Wang@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Sat Nov 16 08:20:59 2024 -0500 drm/amd/pm: fix and simplify workload handling commit 1443dd3c67f6d1a8bd1f810e598e2f0c6f19205c upstream. smu->workload_mask is IP specific and should not be messed with in the common code. The mask bits vary across SMU versions. Move all handling of smu->workload_mask in to the backends and simplify the code. Store the user's preference in smu->power_profile_mode which will be reflected in sysfs. For internal driver profile switches for KFD or VCN, just update the workload mask so that the user's preference is retained. Remove all of the extra now unused workload related elements in the smu structure. v2: use refcounts for workload profiles v3: rework based on feedback from Lijo v4: fix the refcount on failure, drop backend mask v5: rework custom handling v6: handle failure cleanup with custom profile v7: Update documentation Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Date: Fri Oct 18 07:26:19 2024 +0530 drm/amdgpu/gfx9: Add cleaner shader for GFX9.4.2 [ Upstream commit 9343b904e7198e4804685133327dece7fe709bc1 ] This commit adds the cleaner shader microcode for GFX9.4.2 GPUs. The cleaner shader is a piece of GPU code that is used to clear or initialize certain GPU resources, such as Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). Clearing these resources is important for ensuring data isolation between different workloads running on the GPU. Without the cleaner shader, residual data from a previous workload could potentially be accessed by a subsequent workload, leading to data leaks and incorrect computation results. The cleaner shader microcode is represented as an array of 32-bit words (`gfx_9_4_2_cleaner_shader_hex`). This array is the binary representation of the cleaner shader code, which is written in a low-level GPU instruction set. Also, this patch updates the `gfx_v9_0_sw_init` function to initialize the cleaner shader if the MEC firmware version is 88 or higher. It sets the `cleaner_shader_ptr` and `cleaner_shader_size` to the appropriate values and attempts to initialize the cleaner shader. When the cleaner shader feature is enabled, the AMDGPU driver loads this array into a specific location in the GPU memory. The GPU then reads this memory location to fetch and execute the cleaner shader instructions. The cleaner shader is executed automatically by the GPU at the end of each workload, before the next workload starts. This ensures that all GPU resources are in a clean state before the start of each workload. This change ensures that the GPU memory is properly cleared between different processes, preventing data leakage and enhancing security. It also aligns with the serialization mechanism between KGD and KFD, ensuring that the GPU state is consistent across different workloads. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Nov 22 11:22:51 2024 -0500 drm/amdgpu/hdp4.0: do a posting read when flushing HDP commit c9b8dcabb52afe88413ff135a0953e3cc4128483 upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Nov 22 11:23:56 2024 -0500 drm/amdgpu/hdp5.0: do a posting read when flushing HDP commit cf424020e040be35df05b682b546b255e74a420f upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Nov 22 11:24:13 2024 -0500 drm/amdgpu/hdp5.2: do a posting read when flushing HDP commit f756dbac1ce1d5f9a2b35e3b55fa429cf6336437 upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Nov 22 11:24:38 2024 -0500 drm/amdgpu/hdp6.0: do a posting read when flushing HDP commit abe1cbaec6cfe9fde609a15cd6a12c812282ce77 upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Thu Nov 28 16:05:24 2024 +0800 drm/amdgpu/hdp7.0: do a posting read when flushing HDP commit 689275140cb8e9f8ae59e545086fce51fb0b994a upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xiang Liu <xiang.liu@amd.com> Date: Fri Nov 15 16:59:30 2024 +0800 drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3 [ Upstream commit 928cd772e18ffbd7723cb2361db4a8ccf2222235 ] It is not necessarily corrupted. When there is RAS fatal error, device memory access is blocked. Hence vcpu bo cannot be saved to system memory as in a regular suspend sequence before going for reset. In other full device reset cases, that gets saved and restored during resume. v2: Remove redundant code like vcn_v4_0 did v2: Refine commit message v3: Drop the volatile v3: Refine commit message Signed-off-by: Xiang Liu <xiang.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Victor Lu <victorchengchi.lu@amd.com> Date: Thu Jul 18 18:01:23 2024 -0400 drm/amdgpu: clear RB_OVERFLOW bit when enabling interrupts for vega20_ih [ Upstream commit 8b22f048331dfd45fdfbf0efdfb1d43deff7518d ] Port this change to vega20_ih.c: commit afbf7955ff01 ("drm/amdgpu: clear RB_OVERFLOW bit when enabling interrupts") Original commit message: "Why: Setting IH_RB_WPTR register to 0 will not clear the RB_OVERFLOW bit if RB_ENABLE is not set. How to fix: Set WPTR_OVERFLOW_CLEAR bit after RB_ENABLE bit is set. The RB_ENABLE bit is required to be set, together with WPTR_OVERFLOW_ENABLE bit so that setting WPTR_OVERFLOW_CLEAR bit would clear the RB_OVERFLOW." Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Prike Liang <Prike.Liang@amd.com> Date: Thu Oct 17 14:54:31 2024 +0800 drm/amdgpu: Dereference the ATCS ACPI buffer [ Upstream commit 32e7ee293ff476c67b51be006e986021967bc525 ] Need to dereference the atcs acpi buffer after the method is executed, otherwise it will result in a memory leak. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lang Yu <lang.yu@amd.com> Date: Fri Oct 18 17:21:09 2024 +0800 drm/amdgpu: refine error handling in amdgpu_ttm_tt_pin_userptr [ Upstream commit 46186667f98fb7158c98f4ff5da62c427761ffcd ] Free sg table when dma_map_sgtable() failed to avoid memory leak. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Mon Nov 25 13:59:09 2024 -0500 drm/amdgpu: rework resume handling for display (v2) commit 73dae652dcac776296890da215ee7dec357a1032 upstream. Split resume into a 3rd step to handle displays when DCC is enabled on DCN 4.0.1. Move display after the buffer funcs have been re-enabled so that the GPU will do the move and properly set the DCC metadata for DCN. v2: fix fence irq resume ordering Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Prike Liang <Prike.Liang@amd.com> Date: Thu Oct 31 10:59:17 2024 +0800 drm/amdgpu: set the right AMDGPU sg segment limitation [ Upstream commit e2e97435783979124ba92d6870415c57ecfef6a5 ] The driver needs to set the correct max_segment_size; otherwise debug_dma_map_sg() will complain about the over-mapping of the AMDGPU sg length as following: WARNING: CPU: 6 PID: 1964 at kernel/dma/debug.c:1178 debug_dma_map_sg+0x2dc/0x370 [ 364.049444] Modules linked in: veth amdgpu(OE) amdxcp drm_exec gpu_sched drm_buddy drm_ttm_helper ttm(OE) drm_suballoc_helper drm_display_helper drm_kms_helper i2c_algo_bit rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace netfs xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter br_netfilter nvme_fabrics overlay nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge stp llc amd_atl intel_rapl_msr intel_rapl_common sunrpc sch_fq_codel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg edac_mce_amd binfmt_misc snd_hda_codec snd_pci_acp6x snd_hda_core snd_acp_config snd_hwdep snd_soc_acpi kvm_amd snd_pcm kvm snd_seq_midi snd_seq_midi_event crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_rawmidi sha256_ssse3 sha1_ssse3 aesni_intel snd_seq nls_iso8859_1 crypto_simd snd_seq_device cryptd snd_timer rapl input_leds snd [ 364.049532] ipmi_devintf wmi_bmof ccp serio_raw k10temp sp5100_tco soundcore ipmi_msghandler cm32181 industrialio mac_hid msr parport_pc ppdev lp parport drm efi_pstore ip_tables x_tables pci_stub crc32_pclmul nvme ahci libahci i2c_piix4 r8169 nvme_core i2c_designware_pci realtek i2c_ccgx_ucsi video wmi hid_generic cdc_ether usbnet usbhid hid r8152 mii [ 364.049576] CPU: 6 PID: 1964 Comm: rocminfo Tainted: G OE 6.10.0-custom #492 [ 364.049579] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 364.049582] RIP: 0010:debug_dma_map_sg+0x2dc/0x370 [ 364.049585] Code: 89 4d b8 e8 36 b1 86 00 8b 4d b8 48 8b 55 b0 44 8b 45 a8 4c 8b 4d a0 48 89 c6 48 c7 c7 00 4b 74 bc 4c 89 4d b8 e8 b4 73 f3 ff <0f> 0b 4c 8b 4d b8 8b 15 c8 2c b8 01 85 d2 0f 85 ee fd ff ff 8b 05 [ 364.049588] RSP: 0018:ffff9ca600b57ac0 EFLAGS: 00010286 [ 364.049590] RAX: 0000000000000000 RBX: ffff88b7c132b0c8 RCX: 0000000000000027 [ 364.049592] RDX: ffff88bb0f521688 RSI: 0000000000000001 RDI: ffff88bb0f521680 [ 364.049594] RBP: ffff9ca600b57b20 R08: 000000000000006f R09: ffff9ca600b57930 [ 364.049596] R10: ffff9ca600b57928 R11: ffffffffbcb46328 R12: 0000000000000000 [ 364.049597] R13: 0000000000000001 R14: ffff88b7c19c0700 R15: ffff88b7c9059800 [ 364.049599] FS: 00007fb2d3516e80(0000) GS:ffff88bb0f500000(0000) knlGS:0000000000000000 [ 364.049601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 364.049603] CR2: 000055610bd03598 CR3: 00000001049f6000 CR4: 0000000000350ef0 [ 364.049605] Call Trace: [ 364.049607] <TASK> [ 364.049609] ? show_regs+0x6d/0x80 [ 364.049614] ? __warn+0x8c/0x140 [ 364.049618] ? debug_dma_map_sg+0x2dc/0x370 [ 364.049621] ? report_bug+0x193/0x1a0 [ 364.049627] ? handle_bug+0x46/0x80 [ 364.049631] ? exc_invalid_op+0x1d/0x80 [ 364.049635] ? asm_exc_invalid_op+0x1f/0x30 [ 364.049642] ? debug_dma_map_sg+0x2dc/0x370 [ 364.049647] __dma_map_sg_attrs+0x90/0xe0 [ 364.049651] dma_map_sgtable+0x25/0x40 [ 364.049654] amdgpu_bo_move+0x59a/0x850 [amdgpu] [ 364.049935] ? srso_return_thunk+0x5/0x5f [ 364.049939] ? amdgpu_ttm_tt_populate+0x5d/0xc0 [amdgpu] [ 364.050095] ttm_bo_handle_move_mem+0xc3/0x180 [ttm] [ 364.050103] ttm_bo_validate+0xc1/0x160 [ttm] [ 364.050108] ? amdgpu_ttm_tt_get_user_pages+0xe5/0x1b0 [amdgpu] [ 364.050263] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0xa12/0xc90 [amdgpu] [ 364.050473] kfd_ioctl_alloc_memory_of_gpu+0x16b/0x3b0 [amdgpu] [ 364.050680] kfd_ioctl+0x3c2/0x530 [amdgpu] [ 364.050866] ? __pfx_kfd_ioctl_alloc_memory_of_gpu+0x10/0x10 [amdgpu] [ 364.051054] ? srso_return_thunk+0x5/0x5f [ 364.051057] ? tomoyo_file_ioctl+0x20/0x30 [ 364.051063] __x64_sys_ioctl+0x9c/0xd0 [ 364.051068] x64_sys_call+0x1219/0x20d0 [ 364.051073] do_syscall_64+0x51/0x120 [ 364.051077] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 364.051081] RIP: 0033:0x7fb2d2f1a94f Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Victor Zhao <Victor.Zhao@amd.com> Date: Thu Oct 24 13:40:39 2024 +0800 drm/amdgpu: skip amdgpu_device_cache_pci_state under sriov [ Upstream commit afe260df55ac280cd56306248cb6d8a6b0db095c ] Under sriov, host driver will save and restore vf pci cfg space during reset. And during device init, under sriov, pci_restore_state happens after fullaccess released, and it can have race condition with mmio protection enable from host side leading to missing interrupts. So skip amdgpu_device_cache_pci_state for sriov. Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Date: Thu Nov 28 12:05:56 2024 -0500 drm/amdkfd: add MEC version that supports no PCIe atomics for GFX12 commit 33114f1057ea5cf40e604021711a9711a060fcb6 upstream. Add MEC version from which alternate support for no PCIe atomics is provided so that device is not skipped during KFD device init in GFX1200/GFX1201. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Yat Sin <David.YatSin@amd.com> Date: Tue Nov 26 15:18:47 2024 -0500 drm/amdkfd: hard-code cacheline for gc943,gc944 commit 55ed120dcfdde2478c3ebfa1c0ac4ed1e430053b upstream. Cacheline size is not available in IP discovery for gc943,gc944. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Liao Chen <liaochen4@huawei.com> Date: Mon Sep 2 11:33:18 2024 +0000 drm/bridge: it6505: Enable module autoloading [ Upstream commit 1e2ab24cd708b1c864ff983ee1504c0a409d2f8e ] Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Liao Chen <liaochen4@huawei.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240902113320.903147-2-liaochen4@huawei.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Brahmajit Das <brahmajit.xyz@gmail.com> Date: Wed Oct 2 14:53:11 2024 +0530 drm/display: Fix building with GCC 15 [ Upstream commit a500f3751d3c861be7e4463c933cf467240cca5d ] GCC 15 enables -Werror=unterminated-string-initialization by default. This results in the following build error drivers/gpu/drm/display/drm_dp_dual_mode_helper.c: In function ‘is_hdmi_adaptor’: drivers/gpu/drm/display/drm_dp_dual_mode_helper.c:164:17: error: initializer-string for array of ‘char’ is too long [-Werror=unterminated-string-initialization] 164 | "DP-HDMI ADAPTOR\x04"; | ^~~~~~~~~~~~~~~~~~~~~ After discussion with Ville, the fix was to increase the size of dp_dual_mode_hdmi_id array by one, so that it can accommodate the NULL line character. This should let us build the kernel with GCC 15. Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241002092311.942822-1-brahmajit.xyz@gmail.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Imre Deak <imre.deak@intel.com> Date: Mon Nov 25 22:53:14 2024 +0200 drm/dp_mst: Fix MST sideband message body length check commit bd2fccac61b40eaf08d9546acc9fef958bfe4763 upstream. Fix the MST sideband message body length check, which must be at least 1 byte accounting for the message body CRC (aka message data CRC) at the end of the message. This fixes a case where an MST branch device returns a header with a correct header CRC (indicating a correctly received body length), with the body length being incorrectly set to 0. This will later lead to a memory corruption in drm_dp_sideband_append_payload() and the following errors in dmesg: UBSAN: array-index-out-of-bounds in drivers/gpu/drm/display/drm_dp_mst_topology.c:786:25 index -1 is out of range for type 'u8 [48]' Call Trace: drm_dp_sideband_append_payload+0x33d/0x350 [drm_display_helper] drm_dp_get_one_sb_msg+0x3ce/0x5f0 [drm_display_helper] drm_dp_mst_hpd_irq_handle_event+0xc8/0x1580 [drm_display_helper] memcpy: detected field-spanning write (size 18446744073709551615) of single field "&msg->msg[msg->curlen]" at drivers/gpu/drm/display/drm_dp_mst_topology.c:791 (size 256) Call Trace: drm_dp_sideband_append_payload+0x324/0x350 [drm_display_helper] drm_dp_get_one_sb_msg+0x3ce/0x5f0 [drm_display_helper] drm_dp_mst_hpd_irq_handle_event+0xc8/0x1580 [drm_display_helper] Cc: <stable@vger.kernel.org> Cc: Lyude Paul <lyude@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241125205314.1725887-1-imre.deak@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Imre Deak <imre.deak@intel.com> Date: Tue Dec 3 18:02:17 2024 +0200 drm/dp_mst: Fix resetting msg rx state after topology removal commit a6fa67d26de385c3c7a23c1e109a0e23bfda4ec7 upstream. If the MST topology is removed during the reception of an MST down reply or MST up request sideband message, the drm_dp_mst_topology_mgr::up_req_recv/down_rep_recv states could be reset from one thread via drm_dp_mst_topology_mgr_set_mst(false), racing with the reading/parsing of the message from another thread via drm_dp_mst_handle_down_rep() or drm_dp_mst_handle_up_req(). The race is possible since the reader/parser doesn't hold any lock while accessing the reception state. This in turn can lead to a memory corruption in the reader/parser as described by commit bd2fccac61b4 ("drm/dp_mst: Fix MST sideband message body length check"). Fix the above by resetting the message reception state if needed before reading/parsing a message. Another solution would be to hold the drm_dp_mst_topology_mgr::lock for the whole duration of the message reception/parsing in drm_dp_mst_handle_down_rep() and drm_dp_mst_handle_up_req(), however this would require a bigger change. Since the fix is also needed for stable, opting for the simpler solution in this patch. Cc: Lyude Paul <lyude@redhat.com> Cc: <stable@vger.kernel.org> Fixes: 1d082618bbf3 ("drm/display/dp_mst: Fix down/up message handling after sink disconnect") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13056 Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-2-imre.deak@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Imre Deak <imre.deak@intel.com> Date: Tue Dec 3 18:02:18 2024 +0200 drm/dp_mst: Verify request type in the corresponding down message reply commit 4d49e77a973d3b5d1881663c3f122906a0702940 upstream. After receiving the response for an MST down request message, the response should be accepted/parsed only if the response type matches that of the request. Ensure this by checking if the request type code stored both in the request and the reply match, dropping the reply in case of a mismatch. This fixes the topology detection for an MST hub, as described in the Closes link below, where the hub sends an incorrect reply message after a CLEAR_PAYLOAD_TABLE -> LINK_ADDRESS down request message sequence. Cc: Lyude Paul <lyude@redhat.com> Cc: <stable@vger.kernel.org> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12804 Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241203160223.2926014-3-imre.deak@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Liao Chen <liaochen4@huawei.com> Date: Mon Sep 2 11:33:20 2024 +0000 drm/mcde: Enable module autoloading [ Upstream commit 8a16b5cdae26207ff4c22834559384ad3d7bc970 ] Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Liao Chen <liaochen4@huawei.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240902113320.903147-4-liaochen4@huawei.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikandan Muralidharan <manikandan.m@microchip.com> Date: Thu Sep 19 14:45:48 2024 +0530 drm/panel: simple: Add Microchip AC69T88A LVDS Display panel [ Upstream commit 40da1463cd6879f542238b36c1148f517927c595 ] Add support for Microchip AC69T88A 5 inch TFT LCD 800x480 Display module with LVDS interface.The panel uses the Sitronix ST7262 800x480 Display driver Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com> Signed-off-by: Dharma Balasubiramani <dharma.b@microchip.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240919091548.430285-2-manikandan.m@microchip.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jocelyn Falempe <jfalempe@redhat.com> Date: Tue Oct 22 20:39:47 2024 +0200 drm/panic: Add ABGR2101010 support [ Upstream commit 04596969eea9e73b64d63be52aabfddb382e9ce6 ] Add support for ABGR2101010, used by the nouveau driver. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022185553.1103384-2-jfalempe@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Igor Artemiev <Igor.A.Artemiev@mcst.ru> Date: Fri Sep 27 18:07:19 2024 +0300 drm/radeon/r600_cs: Fix possible int overflow in r600_packet3_check() [ Upstream commit a1e2da6a5072f8abe5b0feaa91a5bcd9dc544a04 ] It is possible, although unlikely, that an integer overflow will occur when the result of radeon_get_ib_value() is shifted to the left. Avoid it by casting one of the operands to larger data type (u64). Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Signed-off-by: Igor Artemiev <Igor.A.Artemiev@mcst.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Philipp Stanner <pstanner@redhat.com> Date: Mon Oct 21 12:50:28 2024 +0200 drm/sched: memset() 'job' in drm_sched_job_init() [ Upstream commit 2320c9e6a768d135c7b0039995182bb1a4e4fd22 ] drm_sched_job_init() has no control over how users allocate struct drm_sched_job. Unfortunately, the function can also not set some struct members such as job->sched. This could theoretically lead to UB by users dereferencing the struct's pointer members too early. It is easier to debug such issues if these pointers are initialized to NULL, so dereferencing them causes a NULL pointer exception. Accordingly, drm_sched_entity_init() does precisely that and initializes its struct with memset(). Initialize parameter "job" to 0 in drm_sched_job_init(). Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021105028.19794-2-pstanner@redhat.com Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pei Xiao <xiaopei01@kylinos.cn> Date: Wed Nov 20 15:21:36 2024 +0800 drm/sti: Add __iomem for mixer_dbg_mxn's parameter [ Upstream commit 86e8f94789dd6f3e705bfa821e1e416f97a2f863 ] Sparse complains about incorrect type in argument 1. expected void const volatile __iomem *ptr but got void *. so modify mixer_dbg_mxn's addr parameter. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202411191809.6V3c826r-lkp@intel.com/ Fixes: a5f81078a56c ("drm/sti: add debugfs entries for MIXER crtc") Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Acked-by: Raphael Gallais-Pou <rgallaispou@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/c28f0dcb6a4526721d83ba1f659bba30564d3d54.1732087094.git.xiaopei01@kylinos.cn Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maíra Canal <mcanal@igalia.com> Date: Wed Dec 4 09:28:31 2024 -0300 drm/v3d: Enable Performance Counters before clearing them [ Upstream commit c98b10496b2f3c4f576af3482c71aadcfcbf765e ] On the Raspberry Pi 5, performance counters are not being cleared when `v3d_perfmon_start()` is called, even though we write to the CLR register. As a result, their values accumulate until they overflow. The expected behavior is for performance counters to reset to zero at the start of a job. When the job finishes and the perfmon is stopped, the counters should accurately reflect the values for that specific job. To ensure this behavior, the performance counters are now enabled before being cleared. This allows the CLR register to function as intended, zeroing the counter values when the job begins. Fixes: 26a4dc29b74a ("drm/v3d: Expose performance counters to userspace") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204122831.17015-1-mcanal@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dom Cobley <popcornmix@gmail.com> Date: Fri Jun 21 16:20:28 2024 +0100 drm/vc4: hdmi: Avoid log spam for audio start failure [ Upstream commit b4e5646178e86665f5caef2894578600f597098a ] We regularly get dmesg error reports of: [ 18.184066] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19 [ 18.184098] MAI: soc_pcm_open() failed (-19) These are generated for any disconnected hdmi interface when pulseaudio attempts to open the associated ALSA device (numerous times). Each open generates a kernel error message, generating general log spam. The error messages all come from _soc_pcm_ret in sound/soc/soc-pcm.c#L39 which suggests returning ENOTSUPP, rather that ENODEV will be quiet. And indeed it is. Signed-off-by: Dom Cobley <popcornmix@gmail.com> Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240621152055.4180873-5-dave.stevenson@raspberrypi.com Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dave Stevenson <dave.stevenson@raspberrypi.com> Date: Fri Jun 21 16:20:30 2024 +0100 drm/vc4: hvs: Set AXI panic modes for the HVS [ Upstream commit 014eccc9da7bfc76a3107fceea37dd60f1d63630 ] The HVS can change AXI request mode based on how full the COB FIFOs are. Until now the vc4 driver has been relying on the firmware to have set these to sensible values. With HVS channel 2 now being used for live video, change the panic mode for all channels to be explicitly set by the driver, and the same for all channels. Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240621152055.4180873-7-dave.stevenson@raspberrypi.com Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Harrison <John.C.Harrison@Intel.com> Date: Wed Oct 2 17:46:04 2024 -0700 drm/xe/devcoredump: Add ASCII85 dump helper function [ Upstream commit ec1455ce7e35a31289d2dbc1070b980538698921 ] There is a need to include the GuC log and other large binary objects in core dumps and via dmesg. So add a helper for dumping to a printer function via conversion to ASCII85 encoding. Another issue with dumping such a large buffer is that it can be slow, especially if dumping to dmesg over a serial port. So add a yield to prevent the 'task has been stuck for 120s' kernel hang check feature from firing. v2: Add a prefix to the output string. Fix memory allocation bug. v3: Correct a string size calculation and clean up a define (review feedback from Julia F). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-5-John.C.Harrison@Intel.com Stable-dep-of: 5dce85fecb87 ("drm/xe: Move the coredump registration to the worker thread") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Harrison <John.C.Harrison@Intel.com> Date: Wed Oct 2 17:46:03 2024 -0700 drm/xe/devcoredump: Improve section headings and add tile info [ Upstream commit c28fd6c358db44c87a1408f27ba412c94e25e6c2 ] The xe_guc_exec_queue_snapshot is not really a GuC internal thing and is definitely not a GuC CT thing. So give it its own section heading. The snapshot itself is really a capture of the submission backend's internal state. Although all it currently prints out is the submission contexts. So label it as 'Contexts'. If more general state is added later then it could be change to 'Submission backend' or some such. Further, everything from the GuC CT section onwards is GT specific but there was no indication of which GT it was related to (and that is impossible to work out from the other fields that are given). So add a GT section heading. Also include the tile id of the GT, because again significant information. Lastly, drop a couple of unnecessary line feeds within sections. v2: Add GT section heading, add tile id to device section. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-4-John.C.Harrison@Intel.com Stable-dep-of: 5dce85fecb87 ("drm/xe: Move the coredump registration to the worker thread") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Date: Mon Oct 14 13:25:46 2024 +0530 drm/xe/devcoredump: Update handling of xe_force_wake_get return [ Upstream commit 9ffd6ec2de08ef4ac5f17f6131d1db57613493f9 ] xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Use helper xe_force_wake_ref_has_domain to verify all domains are initialized or not. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() v5 - return unsigned int for xe_force_wake_get() v6 - use helper xe_force_wake_ref_has_domain() v7 - Fix commit message Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-12-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Stable-dep-of: 5dce85fecb87 ("drm/xe: Move the coredump registration to the worker thread") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Harrison <John.C.Harrison@Intel.com> Date: Wed Oct 2 17:46:02 2024 -0700 drm/xe/devcoredump: Use drm_puts and already cached local variables [ Upstream commit 9d86d080cfb3ab935c842ac5525a90430a14c998 ] There are a bunch of calls to drm_printf with static strings. Switch them to drm_puts instead. There are also a bunch of 'coredump->snapshot.XXX' references when 'coredump->snapshot' has alread been cached locally as 'ss'. So use 'ss->XXX' instead. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-3-John.C.Harrison@Intel.com Stable-dep-of: 5dce85fecb87 ("drm/xe: Move the coredump registration to the worker thread") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Date: Mon Oct 14 13:25:38 2024 +0530 drm/xe/forcewake: Add a helper xe_force_wake_ref_has_domain() [ Upstream commit 9d62b07027f0710b7af03d78780d0a6c2425bc1e ] The helper xe_force_wake_ref_has_domain() checks if the input domain has been successfully reference-counted and awakened in the reference. v2 - Fix commit message and kernel-doc (Michal) - Remove unnecessary paranthesis (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-4-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Stable-dep-of: 5dce85fecb87 ("drm/xe: Move the coredump registration to the worker thread") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Badal Nilawar <badal.nilawar@intel.com> Date: Thu Oct 17 16:44:10 2024 +0530 drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout [ Upstream commit e5152723380404acb8175e0777b1cea57f319a01 ] In case if g2h worker doesn't get opportunity to within specified timeout delay then flush the g2h worker explicitly. v2: - Describe change in the comment and add TODO (Matt B/John H) - Add xe_gt_warn on fence done after G2H flush (John H) v3: - Updated the comment with root cause - Clean up xe_gt_warn message (John H) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1620 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/2902 Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017111410.2553784-2-badal.nilawar@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Harrison <John.C.Harrison@Intel.com> Date: Wed Oct 2 17:46:05 2024 -0700 drm/xe/guc: Copy GuC log prior to dumping [ Upstream commit a59a403419aa03d5e44c8cf014e415490395b17f ] Add an extra stage to the GuC log print to copy the log buffer into regular host memory first, rather than printing the live GPU buffer object directly. Doing so helps prevent inconsistencies due to the log being updated as it is being dumped. It also allows the use of the ASCII85 helper function for printing the log in a more compact form than a straight hex dump. v2: Use %zx instead of %lx for size_t prints. v3: Replace hexdump code with ascii85 call (review feedback from Matthew B). Move chunking code into next patch as that reduces the deltas of both. v4: Add a prefix to the ASCII85 output to aid tool parsing. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-6-John.C.Harrison@Intel.com Stable-dep-of: 5dce85fecb87 ("drm/xe: Move the coredump registration to the worker thread") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Date: Thu Sep 12 17:29:06 2024 +0530 drm/xe/pciid: Add new PCI id for ARL [ Upstream commit 35667a0330612bb25a689e4d3a687d47cede1d7a ] Add new PCI id for ARL platform. v2: Fix typo in PCI id (SaiTeja) Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240912115906.2730577-1-dnyaneshwar.bhadane@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rodrigo Vivi <rodrigo.vivi@intel.com> Date: Fri Sep 6 15:06:03 2024 +0300 drm/xe/pciids: Add PVC's PCI device ID macros [ Upstream commit 5b40191152282e1f25d7b9826bcda41be927b39f ] Add PVC PCI IDs to the xe_pciids.h header. They're not yet used in the driver. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6ac1829493a53a3fec889c746648d627a0296892.1725624296.git.jani.nikula@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jani Nikula <jani.nikula@intel.com> Date: Wed Sep 4 12:46:49 2024 +0300 drm/xe/pciids: separate ARL and MTL PCI IDs [ Upstream commit cdb56a63f7eef34e89b045fc8bcae8d326bbdb19 ] Avoid including PCI IDs for one platform to the PCI IDs of another. It's more clear to deal with them completely separately at the PCI ID macro level. Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/a30cb0da7694a8eccceba66d676ac59aa0e96176.1725443121.git.jani.nikula@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jani Nikula <jani.nikula@intel.com> Date: Wed Sep 4 12:46:48 2024 +0300 drm/xe/pciids: separate RPL-U and RPL-P PCI IDs [ Upstream commit d454902a690db47f1880f963514bbf0fc7a129a8 ] Avoid including PCI IDs for one platform to the PCI IDs of another. It's more clear to deal with them completely separately at the PCI ID macro level. Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4868d36fbfa8c38ea2d490bca82cf6370b8d65dd.1725443121.git.jani.nikula@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shekhar Chauhan <shekhar.chauhan@intel.com> Date: Mon Oct 7 08:41:44 2024 -0700 drm/xe/ptl: L3bank mask is not available on the media GT [ Upstream commit 9ab440a9d0426cf7842240891cc457155db1a97e ] On PTL platforms with media version 30.00, the fuse registers for reporting L3 bank availability to the GT just read out as ~0 and do not provide proper values. Xe does not use the L3 bank mask for anything internally; it only passes the mask through to userspace via the GT topology query. Since we don't have any way to get the real L3 bank mask, we don't want to pass garbage to userspace. Passing a zeroed mask or a copy of the primary GT's L3 bank mask would also be inaccurate and likely to cause confusion for userspace. The best approach is to simply not include L3 in the list of masks returned by the topology query in cases where we aren't able to provide a meaningful value. This won't change the behavior for any existing platforms (where we can always obtain L3 masks successfully for all GTs), it will only prevent us from mis-reporting bad information on upcoming platform(s). There's a good chance this will become a formal workaround in the future, but for now we don't have a lineage number so "no_media_l3" is used in place of a lineage as the OOB workaround descriptor. v2: - Re-calculate query size to properly match data returned. (Gustavo) - Update kerneldoc to clarify that the L3bank mask may not be included in the query results if the hardware doesn't make it available. (Gustavo) Cc: Matt Atwood <matthew.s.atwood@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Co-developed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Acked-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007154143.2021124-2-matthew.d.roper@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gustavo Sousa <gustavo.sousa@intel.com> Date: Tue Oct 8 13:46:26 2024 -0700 drm/xe/xe3: Add initial set of workarounds [ Upstream commit 081cb8948cfe322076cd23f22f85ba68f73e2c4b ] Implement the initial set of workarounds for Xe3 IPs. Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008204626.55802-2-matthew.s.atwood@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joaquín Ignacio Aramendía <samsagax@gmail.com> Date: Mon Sep 16 15:18:51 2024 +0200 drm: panel-orientation-quirks: Add quirk for AYA NEO 2 model [ Upstream commit 361ebf5ef843b0aa1704c72eb26b91cf76c3c5b7 ] Add quirk orientation for AYA NEO 2. The name appears without spaces in DMI strings. That made it difficult to reuse the 2021 match. Also the display is larger in resolution. Tested by the JELOS team that has been patching their own kernel for a while now and confirmed by users in the AYA NEO and ChimeraOS discord servers. Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/2b35545b77a9fd8c9699b751ca282226dcecb1dd.1726492131.git.tjakobi@math.uni-bielefeld.de Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joaquín Ignacio Aramendía <samsagax@gmail.com> Date: Mon Sep 16 15:18:53 2024 +0200 drm: panel-orientation-quirks: Add quirk for AYA NEO Founder edition [ Upstream commit d7972d735ca80a40a571bf753c138263981a5698 ] Add quirk orientation for AYA NEO Founder. The name appears with spaces in DMI strings as other devices of the brand. The panel is the same as the NEXT and 2021 models. Those could not be reused as the former has VENDOR name as "AYANEO" without spaces and the latter has "AYADEVICE". Tested by the JELOS team that has been patching their own kernel for a while now and confirmed by users in the AYA NEO and ChimeraOS discord servers. Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/f71889a0b39f13f4b78481bd030377ca15035680.1726492131.git.tjakobi@math.uni-bielefeld.de Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joaquín Ignacio Aramendía <samsagax@gmail.com> Date: Mon Sep 16 15:18:55 2024 +0200 drm: panel-orientation-quirks: Add quirk for AYA NEO GEEK [ Upstream commit 428656feb972ca99200fc127b5aecb574efd9d3d ] Add quirk orientation for AYA NEO GEEK. The name appears without spaces in DMI strings. The board name is completely different to the previous models making it difficult to reuse their quirks despite being the same resolution and using the same orientation. Tested by the JELOS team that has been patching their own kernel for a while now and confirmed by users in the AYA NEO and ChimeraOS discord servers. Signed-off-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/40350b0d63fe2b54e7cba1e14be50917203f0079.1726492131.git.tjakobi@math.uni-bielefeld.de Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andrew Lunn <andrew@lunn.ch> Date: Sun Nov 10 18:59:55 2024 +0100 dsa: qca8k: Use nested lock to avoid splat [ Upstream commit 078e0d596f7b5952dad8662ace8f20ed2165e2ce ] qca8k_phy_eth_command() is used to probe the child MDIO bus while the parent MDIO is locked. This causes lockdep splat, reporting a possible deadlock. It is not an actually deadlock, because different locks are used. By making use of mutex_lock_nested() we can avoid this false positive. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241110175955.3053664-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian Brauner <brauner@kernel.org> Date: Wed Sep 25 11:05:16 2024 +0200 epoll: annotate racy check [ Upstream commit 6474353a5e3d0b2cf610153cea0c61f576a36d0a ] Epoll relies on a racy fastpath check during __fput() in eventpoll_release() to avoid the hit of pointlessly acquiring a semaphore. Annotate that race by using WRITE_ONCE() and READ_ONCE(). Link: https://lore.kernel.org/r/66edfb3c.050a0220.3195df.001a.GAE@google.com Link: https://lore.kernel.org/r/20240925-fungieren-anbauen-79b334b00542@brauner Reviewed-by: Jan Kara <jack@suse.cz> Reported-by: syzbot+3b6b32dc50537a49bb4a@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kory Maincent <kory.maincent@bootlin.com> Date: Mon Dec 2 16:33:57 2024 +0100 ethtool: Fix wrong mod state in case of verbose and no_mask bitset [ Upstream commit 910c4788d6155b2202ec88273376cd7ecdc24f0a ] A bitset without mask in a _SET request means we want exactly the bits in the bitset to be set. This works correctly for compact format but when verbose format is parsed, ethnl_update_bitset32_verbose() only sets the bits present in the request bitset but does not clear the rest. The commit 6699170376ab ("ethtool: fix application of verbose no_mask bitset") fixes this issue by clearing the whole target bitmap before we start iterating. The solution proposed brought an issue with the behavior of the mod variable. As the bitset is always cleared the old value will always differ to the new value. Fix it by adding a new function to compare bitmaps and a temporary variable which save the state of the old bitmap. Fixes: 6699170376ab ("ethtool: fix application of verbose no_mask bitset") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20241202153358.1142095-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Brian Foster <bfoster@redhat.com> Date: Thu Sep 19 12:07:40 2024 -0400 ext4: partial zero eof block on unaligned inode size extension [ Upstream commit c7fc0366c65628fd69bfc310affec4918199aae2 ] Using mapped writes, it's technically possible to expose stale post-eof data on a truncate up operation. Consider the following example: $ xfs_io -fc "pwrite 0 2k" -c "mmap 0 4k" -c "mwrite 2k 2k" \ -c "truncate 8k" -c "pread -v 2k 16" <file> ... 00000800: 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 XXXXXXXXXXXXXXXX ... This shows that the post-eof data written via mwrite lands within EOF after a truncate up. While this is deliberate of the test case, behavior is somewhat unpredictable because writeback does post-eof zeroing, and writeback can occur at any time in the background. For example, an fsync inserted between the mwrite and truncate causes the subsequent read to instead return zeroes. This basically means that there is a race window in this situation between any subsequent extending operation and writeback that dictates whether post-eof data is exposed to the file or zeroed. To prevent this problem, perform partial block zeroing as part of the various inode size extending operations that are susceptible to it. For truncate extension, zero around the original eof similar to how truncate down does partial zeroing of the new eof. For extension via writes and fallocate related operations, zero the newly exposed range of the file to cover any partial zeroing that must occur at the original and new eof blocks. Signed-off-by: Brian Foster <bfoster@redhat.com> Link: https://patch.msgid.link/20240919160741.208162-2-bfoster@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org> Date: Fri Nov 22 14:50:05 2024 +0800 f2fs: add a sysfs node to limit max read extent count per-inode [ Upstream commit 009a8241a8e5a14ea2dd0b8db42dbf283527dd44 ] Quoted: "at this time, there are still 1086911 extent nodes in this zombie extent tree that need to be cleaned up. crash_arm64_sprd_v8.0.3++> extent_tree.node_cnt ffffff80896cc500 node_cnt = { counter = 1086911 }, " As reported by Xiuhong, there will be a huge number of extent nodes in extent tree, it may potentially cause: - slab memory fragments - extreme long time shrink on extent tree - low mapping efficiency Let's add a sysfs node to limit max read extent count for each inode, by default, value of this threshold is 10240, it can be updated according to user's requirement. Reported-by: Xiuhong Wang <xiuhong.wang@unisoc.com> Closes: https://lore.kernel.org/linux-f2fs-devel/20241112110627.1314632-1-xiuhong.wang@unisoc.com/ Signed-off-by: Xiuhong Wang <xiuhong.wang@unisoc.com> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org> Date: Fri Nov 8 09:25:55 2024 +0800 f2fs: clean up w/ F2FS_{BLK_TO_BYTES,BTYES_TO_BLK} [ Upstream commit 7461f37094180200cb2f98e60ef99a0cea97beec ] f2fs doesn't support different blksize in one instance, so bytes_to_blks() and blks_to_bytes() are equal to F2FS_BYTES_TO_BLK and F2FS_BLK_TO_BYTES, let's use F2FS_BYTES_TO_BLK/F2FS_BLK_TO_BYTES instead for cleanup. Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 6787a8224585 ("f2fs: fix to requery extent which cross boundary of inquiry") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qi Han <hanqi@vivo.com> Date: Wed Sep 18 02:44:00 2024 -0600 f2fs: fix f2fs_bug_on when uninstalling filesystem call f2fs_evict_inode. [ Upstream commit d5c367ef8287fb4d235c46a2f8c8d68715f3a0ca ] creating a large files during checkpoint disable until it runs out of space and then delete it, then remount to enable checkpoint again, and then unmount the filesystem triggers the f2fs_bug_on as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/inode.c:896! CPU: 2 UID: 0 PID: 1286 Comm: umount Not tainted 6.11.0-rc7-dirty #360 Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:f2fs_evict_inode+0x58c/0x610 Call Trace: __die_body+0x15/0x60 die+0x33/0x50 do_trap+0x10a/0x120 f2fs_evict_inode+0x58c/0x610 do_error_trap+0x60/0x80 f2fs_evict_inode+0x58c/0x610 exc_invalid_op+0x53/0x60 f2fs_evict_inode+0x58c/0x610 asm_exc_invalid_op+0x16/0x20 f2fs_evict_inode+0x58c/0x610 evict+0x101/0x260 dispose_list+0x30/0x50 evict_inodes+0x140/0x190 generic_shutdown_super+0x2f/0x150 kill_block_super+0x11/0x40 kill_f2fs_super+0x7d/0x140 deactivate_locked_super+0x2a/0x70 cleanup_mnt+0xb3/0x140 task_work_run+0x61/0x90 The root cause is: creating large files during disable checkpoint period results in not enough free segments, so when writing back root inode will failed in f2fs_enable_checkpoint. When umount the file system after enabling checkpoint, the root inode is dirty in f2fs_evict_inode function, which triggers BUG_ON. The steps to reproduce are as follows: dd if=/dev/zero of=f2fs.img bs=1M count=55 mount f2fs.img f2fs_dir -o checkpoint=disable:10% dd if=/dev/zero of=big bs=1M count=50 sync rm big mount -o remount,checkpoint=enable f2fs_dir umount f2fs_dir Let's redirty inode when there is not free segments during checkpoint is disable. Signed-off-by: Qi Han <hanqi@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhiguo Niu <zhiguo.niu@unisoc.com> Date: Fri Nov 8 09:25:56 2024 +0800 f2fs: fix to adjust appropriate length for fiemap [ Upstream commit 77569f785c8624fa4189795fb52e635a973672e5 ] If user give a file size as "length" parameter for fiemap operations, but if this size is non-block size aligned, it will show 2 segments fiemap results even this whole file is contiguous on disk, such as the following results: ./f2fs_io fiemap 0 19034 ylog/analyzer.py Fiemap: offset = 0 len = 19034 logical addr. physical addr. length flags 0 0000000000000000 0000000020baa000 0000000000004000 00001000 1 0000000000004000 0000000020bae000 0000000000001000 00001001 after this patch: ./f2fs_io fiemap 0 19034 ylog/analyzer.py Fiemap: offset = 0 len = 19034 logical addr. physical addr. length flags 0 0000000000000000 00000000315f3000 0000000000005000 00001001 Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 6787a8224585 ("f2fs: fix to requery extent which cross boundary of inquiry") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org> Date: Fri Nov 8 09:25:57 2024 +0800 f2fs: fix to requery extent which cross boundary of inquiry [ Upstream commit 6787a82245857271133b63ae7f72f1dc9f29e985 ] dd if=/dev/zero of=file bs=4k count=5 xfs_io file -c "fiemap -v 2 16384" file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..31]: 139272..139303 32 0x1000 1: [32..39]: 139304..139311 8 0x1001 xfs_io file -c "fiemap -v 0 16384" file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..31]: 139272..139303 32 0x1000 xfs_io file -c "fiemap -v 0 16385" file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..39]: 139272..139311 40 0x1001 There are two problems: - continuous extent is split to two - FIEMAP_EXTENT_LAST is missing in last extent The root cause is: if upper boundary of inquiry crosses extent, f2fs_map_blocks() will truncate length of returned extent to F2FS_BYTES_TO_BLK(len), and also, it will stop to query latter extent or hole to make sure current extent is last or not. In order to fix this issue, once we found an extent locates in the end of inquiry range by f2fs_map_blocks(), we need to expand inquiry range to requiry. Cc: stable@vger.kernel.org Fixes: 7f63eb77af7b ("f2fs: report unwritten area in f2fs_fiemap") Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org> Date: Thu Nov 21 09:57:50 2024 +0800 f2fs: fix to shrink read extent node in batches [ Upstream commit 3fc5d5a182f6a1f8bd4dc775feb54c369dd2c343 ] We use rwlock to protect core structure data of extent tree during its shrink, however, if there is a huge number of extent nodes in extent tree, during shrink of extent tree, it may hold rwlock for a very long time, which may trigger kernel hang issue. This patch fixes to shrink read extent node in batches, so that, critical region of the rwlock can be shrunk to avoid its extreme long time hold. Reported-by: Xiuhong Wang <xiuhong.wang@unisoc.com> Closes: https://lore.kernel.org/linux-f2fs-devel/20241112110627.1314632-1-xiuhong.wang@unisoc.com/ Signed-off-by: Xiuhong Wang <xiuhong.wang@unisoc.com> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org> Date: Wed Nov 20 14:58:50 2024 +0800 f2fs: print message if fscorrupted was found in f2fs_new_node_page() [ Upstream commit 81520c684ca67aea6a589461a3caebb9b11dcc90 ] If fs corruption occurs in f2fs_new_node_page(), let's print more information about corrupted metadata into kernel log. Meanwhile, it updates to record ERROR_INCONSISTENT_NAT instead of ERROR_INVALID_BLKADDR if blkaddr in nat entry is not NULL_ADDR which means nat bitmap and nat entry is inconsistent. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amir Goldstein <amir73il@gmail.com> Date: Thu Oct 3 16:29:22 2024 +0200 fanotify: allow reporting errors on failure to open fd [ Upstream commit 522249f05c5551aec9ec0ba9b6438f1ec19c138d ] When working in "fd mode", fanotify_read() needs to open an fd from a dentry to report event->fd to userspace. Opening an fd from dentry can fail for several reasons. For example, when tasks are gone and we try to open their /proc files or we try to open a WRONLY file like in sysfs or when trying to open a file that was deleted on the remote network server. Add a new flag FAN_REPORT_FD_ERROR for fanotify_init(). For a group with FAN_REPORT_FD_ERROR, we will send the event with the error instead of the open fd, otherwise userspace may not get the error at all. For an overflow event, we report -EBADF to avoid confusing FAN_NOFD with -EPERM. Similarly for pidfd open errors we report either -ESRCH or the open error instead of FAN_NOPIDFD and FAN_EPIDFD. In any case, userspace will not know which file failed to open, so add a debug print for further investigation. Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com> Link: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241003142922.111539-1-amir73il@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aleksandrs Vinarskis <alex.vinarskis@gmail.com> Date: Thu Oct 3 23:10:08 2024 +0200 firmware: qcom: scm: Allow QSEECOM on Dell XPS 13 9345 [ Upstream commit 304c250ba121f5c505be3fc13dec984016f3c032 ] Allow particular machine accessing eg. efivars. Signed-off-by: Aleksandrs Vinarskis <alex.vinarskis@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Stefan Schmidt <stefan.schmidt@linaro.org> Link: https://lore.kernel.org/r/20241003211139.9296-3-alex.vinarskis@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maya Matuszczyk <maccraft123mc@gmail.com> Date: Thu Sep 19 15:44:21 2024 +0200 firmware: qcom: scm: Allow QSEECOM on Lenovo Yoga Slim 7x [ Upstream commit c6fa2834afc6a6fe210415ec253a61e6eafdf651 ] Allow QSEECOM on Lenovo Yoga Slim 7x, to enable accessing EFI variables. Signed-off-by: Maya Matuszczyk <maccraft123mc@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240919134421.112643-2-maccraft123mc@gmail.com [bjorn: Rewrote commit message] Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Date: Thu Oct 10 20:09:24 2024 +0300 fs/ntfs3: Fix case when unmarked clusters intersect with zone [ Upstream commit 5fc982fe7eca9d0cf7b25832450ebd4f7c8e1c36 ] Reported-by: syzbot+7f3761b790fa41d0f3d5@syzkaller.appspotmail.com Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Date: Tue Oct 8 10:48:15 2024 +0300 fs/ntfs3: Fix warning in ni_fiemap [ Upstream commit e2705dd3d16d1000f1fd8193d82447065de8c899 ] Use local runs_tree instead of cached. This way excludes rw_semaphore lock. Reported-by: syzbot+1c25748a40fe79b8a119@syzkaller.appspotmail.com Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ralph Boehme <slow@samba.org> Date: Fri Nov 15 13:15:50 2024 +0100 fs/smb/client: avoid querying SMB2_OP_QUERY_WSL_EA for SMB3 POSIX commit ca4b2c4607433033e9c4f4659f809af4261d8992 upstream. Avoid extra roundtrip Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ralph Boehme <slow@samba.org> Date: Mon Nov 25 16:19:56 2024 +0100 fs/smb/client: cifs_prime_dcache() for SMB3 POSIX reparse points commit 8cb0bc5436351de8a11eef13b7367d64cc0d6c68 upstream. Spares an extra revalidation request Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ralph Boehme <slow@samba.org> Date: Fri Nov 15 19:21:04 2024 +0100 fs/smb/client: Implement new SMB3 POSIX type commit 6a832bc8bbb22350f7ffe6ecb2d36f261bb96023 upstream. Fixes special files against current Samba. On the Samba server: insgesamt 20 131958 brw-r--r-- 1 root root 0, 0 15. Nov 12:04 blockdev 131965 crw-r--r-- 1 root root 1, 1 15. Nov 12:04 chardev 131966 prw-r--r-- 1 samba samba 0 15. Nov 12:05 fifo 131953 -rw-rwxrw-+ 2 samba samba 4 18. Nov 11:37 file 131953 -rw-rwxrw-+ 2 samba samba 4 18. Nov 11:37 hardlink 131957 lrwxrwxrwx 1 samba samba 4 15. Nov 12:03 symlink -> file 131954 -rwxrwxr-x+ 1 samba samba 0 18. Nov 15:28 symlinkoversmb Before: ls: cannot access '/mnt/smb3unix/posix/blockdev': No data available ls: cannot access '/mnt/smb3unix/posix/chardev': No data available ls: cannot access '/mnt/smb3unix/posix/symlinkoversmb': No data available ls: cannot access '/mnt/smb3unix/posix/fifo': No data available ls: cannot access '/mnt/smb3unix/posix/symlink': No data available total 16 ? -????????? ? ? ? ? ? blockdev ? -????????? ? ? ? ? ? chardev ? -????????? ? ? ? ? ? fifo 131953 -rw-rwxrw- 2 root samba 4 Nov 18 11:37 file 131953 -rw-rwxrw- 2 root samba 4 Nov 18 11:37 hardlink ? -????????? ? ? ? ? ? symlink ? -????????? ? ? ? ? ? symlinkoversmb After: insgesamt 21 131958 brw-r--r-- 1 root root 0, 0 15. Nov 12:04 blockdev 131965 crw-r--r-- 1 root root 1, 1 15. Nov 12:04 chardev 131966 prw-r--r-- 1 root samba 0 15. Nov 12:05 fifo 131953 -rw-rwxrw- 2 root samba 4 18. Nov 11:37 file 131953 -rw-rwxrw- 2 root samba 4 18. Nov 11:37 hardlink 131957 lrwxrwxrwx 1 root samba 4 15. Nov 12:03 symlink -> file 131954 lrwxrwxr-x 1 root samba 23 18. Nov 15:28 symlinkoversmb -> mnt/smb3unix/posix/file Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Aleksandr Mishin <amishin@t-argos.ru> Date: Mon Oct 28 09:58:24 2024 +0300 fsl/fman: Validate cell-index value obtained from Device Tree [ Upstream commit bd50c4125c98bd1a86f8e514872159700a9c678c ] Cell-index value is obtained from Device Tree and then used to calculate the index for accessing arrays port_mfl[], mac_mfl[] and intr_mng[]. In case of broken DT due to any error cell-index can contain any value and it is possible to go beyond the array boundaries which can lead at least to memory corruption. Validate cell-index value obtained from Device Tree. Found by Linux Verification Center (linuxtesting.org) with SVACE. Reviewed-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Link: https://patch.msgid.link/20241028065824.15452-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com> Date: Tue Dec 3 18:21:21 2024 +0000 geneve: do not assume mac header is set in geneve_xmit_skb() [ Upstream commit 8588c99c7d47448fcae39e3227d6e2bb97aad86d ] We should not assume mac header is set in output path. Use skb_eth_hdr() instead of eth_hdr() to fix the issue. sysbot reported the following : WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 skb_mac_header include/linux/skbuff.h:3052 [inline] WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 eth_hdr include/linux/if_ether.h:24 [inline] WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 geneve_xmit_skb drivers/net/geneve.c:898 [inline] WARNING: CPU: 0 PID: 11635 at include/linux/skbuff.h:3052 geneve_xmit+0x4c38/0x5730 drivers/net/geneve.c:1039 Modules linked in: CPU: 0 UID: 0 PID: 11635 Comm: syz.4.1423 Not tainted 6.12.0-syzkaller-10296-gaaf20f870da0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_mac_header include/linux/skbuff.h:3052 [inline] RIP: 0010:eth_hdr include/linux/if_ether.h:24 [inline] RIP: 0010:geneve_xmit_skb drivers/net/geneve.c:898 [inline] RIP: 0010:geneve_xmit+0x4c38/0x5730 drivers/net/geneve.c:1039 Code: 21 c6 02 e9 35 d4 ff ff e8 a5 48 4c fb 90 0f 0b 90 e9 fd f5 ff ff e8 97 48 4c fb 90 0f 0b 90 e9 d8 f5 ff ff e8 89 48 4c fb 90 <0f> 0b 90 e9 41 e4 ff ff e8 7b 48 4c fb 90 0f 0b 90 e9 cd e7 ff ff RSP: 0018:ffffc90003b2f870 EFLAGS: 00010283 RAX: 000000000000037a RBX: 000000000000ffff RCX: ffffc9000dc3d000 RDX: 0000000000080000 RSI: ffffffff86428417 RDI: 0000000000000003 RBP: ffffc90003b2f9f0 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000002 R12: ffff88806603c000 R13: 0000000000000000 R14: ffff8880685b2780 R15: 0000000000000e23 FS: 00007fdc2deed6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b30a1dff8 CR3: 0000000056b8c000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] __dev_direct_xmit+0x58a/0x720 net/core/dev.c:4490 dev_direct_xmit include/linux/netdevice.h:3181 [inline] packet_xmit+0x1e4/0x360 net/packet/af_packet.c:285 packet_snd net/packet/af_packet.c:3146 [inline] packet_sendmsg+0x2700/0x5660 net/packet/af_packet.c:3178 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x488/0x4f0 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __x64_sys_sendto+0xe0/0x1c0 net/socket.c:2200 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: a025fb5f49ad ("geneve: Allow configuration of DF behaviour") Reported-by: syzbot+3ec5271486d7cb2d242a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/674f4b72.050a0220.17bd51.004a.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Link: https://patch.msgid.link/20241203182122.2725517-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Date: Thu Sep 19 15:51:04 2024 +0200 gpio: free irqs that are still requested when the chip is being removed [ Upstream commit ec8b6f55b98146c41dcf15e8189eb43291e35e89 ] If we remove a GPIO chip that is also an interrupt controller with users not having freed some interrupts, we'll end up leaking resources as indicated by the following warning: remove_proc_entry: removing non-empty directory 'irq/30', leaking at least 'gpio' As there's no way of notifying interrupt users about the irqchip going away and the interrupt subsystem is not plugged into the driver model and so not all cases can be handled by devlinks, we need to make sure to free all interrupts before the complete the removal of the provider. Reviewed-by: Herve Codina <herve.codina@bootlin.com> Tested-by: Herve Codina <herve.codina@bootlin.com> Link: https://lore.kernel.org/r/20240919135104.3583-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Charles Han <hanchunchao@inspur.com> Date: Thu Nov 14 17:18:22 2024 +0800 gpio: grgpio: Add NULL check in grgpio_probe [ Upstream commit 050b23d081da0f29474de043e9538c1f7a351b3b ] devm_kasprintf() can return a NULL pointer on failure,but this returned value in grgpio_probe is not checked. Add NULL check in grgpio_probe, to handle kernel NULL pointer dereference error. Cc: stable@vger.kernel.org Fixes: 7eb6ce2f2723 ("gpio: Convert to using %pOF instead of full_name") Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://lore.kernel.org/r/20241114091822.78199-1-hanchunchao@inspur.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Date: Tue Oct 15 15:18:31 2024 +0200 gpio: grgpio: use a helper variable to store the address of ofdev->dev [ Upstream commit d036ae41cebdfae92666024163c109b8fef516fa ] Instead of dereferencing the platform device pointer repeatedly, just store its address in a helper variable. Link: https://lore.kernel.org/r/20241015131832.44678-3-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Stable-dep-of: 050b23d081da ("gpio: grgpio: Add NULL check in grgpio_probe") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benjamin Tissoires <bentiss@kernel.org> Date: Tue Oct 1 16:30:12 2024 +0200 HID: add per device quirk to force bind to hid-generic [ Upstream commit 645c224ac5f6e0013931c342ea707b398d24d410 ] We already have the possibility to force not binding to hid-generic and rely on a dedicated driver, but we couldn't do the other way around. This is useful for BPF programs where we are fixing the report descriptor and the events, but want to avoid a specialized driver to come after BPF which would unwind everything that is done there. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Link: https://patch.msgid.link/20241001-hid-bpf-hid-generic-v3-8-2ef1019468df@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kenny Levinsen <kl@kl.wtf> Date: Wed Nov 20 00:53:17 2024 +0100 HID: i2c-hid: Revert to using power commands to wake on resume commit 34851431ceca1bf457d85895bd38a4e7967e2055 upstream. commit 7d6f065de37c ("HID: i2c-hid: Use address probe to wake on resume") replaced the retry of power commands with the dummy read "bus probe" we use on boot which accounts for a necessary delay before retry. This made at least one Weida device (2575:0910 in an ASUS Vivobook S14) very unhappy, as the bus probe despite being successful somehow lead to the following power command failing so hard that the device never lets go of the bus. This means that even retries of the power command would fail on a timeout as the bus remains busy. Remove the bus probe on resume and instead reintroduce retry of the power command for wake-up purposes while respecting the newly established wake-up retry timings. Fixes: 7d6f065de37c ("HID: i2c-hid: Use address probe to wake on resume") Cc: stable@vger.kernel.org Reported-by: Michael <auslands-kv@gmx.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219440 Link: https://lore.kernel.org/r/d5acb485-7377-4139-826d-4df04d21b5ed@leemhuis.info/ Signed-off-by: Kenny Levinsen <kl@kl.wtf> Link: https://patch.msgid.link/20241119235615.23902-1-kl@kl.wtf Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Callahan Kovacs <callahankovacs@gmail.com> Date: Mon Nov 11 22:49:28 2024 +0100 HID: magicmouse: Apple Magic Trackpad 2 USB-C driver support [ Upstream commit 87a2f10395c82c2b4687bb8611a6c5663a12f9e7 ] Adds driver support for the USB-C model of Apple's Magic Trackpad 2. The 2024 USB-C model is compatible with the existing Magic Trackpad 2 driver but has a different hardware ID. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219470 Signed-off-by: Callahan Kovacs <callahankovacs@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: WangYuli <wangyuli@uniontech.com> Date: Mon Nov 25 13:26:16 2024 +0800 HID: wacom: fix when get product name maybe null pointer commit 59548215b76be98cf3422eea9a67d6ea578aca3d upstream. Due to incorrect dev->product reporting by certain devices, null pointer dereferences occur when dev->product is empty, leading to potential system crashes. This issue was found on EXCELSIOR DL37-D05 device with Loongson-LS3A6000-7A2000-DL37 motherboard. Kernel logs: [ 56.470885] usb 4-3: new full-speed USB device number 4 using ohci-pci [ 56.671638] usb 4-3: string descriptor 0 read error: -22 [ 56.671644] usb 4-3: New USB device found, idVendor=056a, idProduct=0374, bcdDevice= 1.07 [ 56.671647] usb 4-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 56.678839] hid-generic 0003:056A:0374.0004: hiddev0,hidraw3: USB HID v1.10 Device [HID 056a:0374] on usb-0000:00:05.0-3/input0 [ 56.697719] CPU 2 Unable to handle kernel paging request at virtual address 0000000000000000, era == 90000000066e35c8, ra == ffff800004f98a80 [ 56.697732] Oops[#1]: [ 56.697734] CPU: 2 PID: 2742 Comm: (udev-worker) Tainted: G OE 6.6.0-loong64-desktop #25.00.2000.015 [ 56.697737] Hardware name: Inspur CE520L2/C09901N000000000, BIOS 2.09.00 10/11/2024 [ 56.697739] pc 90000000066e35c8 ra ffff800004f98a80 tp 9000000125478000 sp 900000012547b8a0 [ 56.697741] a0 0000000000000000 a1 ffff800004818b28 a2 0000000000000000 a3 0000000000000000 [ 56.697743] a4 900000012547b8f0 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000 [ 56.697745] t0 ffff800004818b2d t1 0000000000000000 t2 0000000000000003 t3 0000000000000005 [ 56.697747] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000 [ 56.697748] t8 0000000000000000 u0 0000000000000000 s9 0000000000000000 s0 900000011aa48028 [ 56.697750] s1 0000000000000000 s2 0000000000000000 s3 ffff800004818e80 s4 ffff800004810000 [ 56.697751] s5 90000001000b98d0 s6 ffff800004811f88 s7 ffff800005470440 s8 0000000000000000 [ 56.697753] ra: ffff800004f98a80 wacom_update_name+0xe0/0x300 [wacom] [ 56.697802] ERA: 90000000066e35c8 strstr+0x28/0x120 [ 56.697806] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 56.697816] PRMD: 0000000c (PPLV0 +PIE +PWE) [ 56.697821] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 56.697827] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) [ 56.697831] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 56.697835] BADV: 0000000000000000 [ 56.697836] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000) [ 56.697838] Modules linked in: wacom(+) bnep bluetooth rfkill qrtr nls_iso8859_1 nls_cp437 snd_hda_codec_conexant snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore input_leds mousedev led_class joydev deepin_netmonitor(OE) fuse nfnetlink dmi_sysfs ip_tables x_tables overlay amdgpu amdxcp drm_exec gpu_sched drm_buddy radeon drm_suballoc_helper i2c_algo_bit drm_ttm_helper r8169 ttm drm_display_helper spi_loongson_pci xhci_pci cec xhci_pci_renesas spi_loongson_core hid_generic realtek gpio_loongson_64bit [ 56.697887] Process (udev-worker) (pid: 2742, threadinfo=00000000aee0d8b4, task=00000000a9eff1f3) [ 56.697890] Stack : 0000000000000000 ffff800004817e00 0000000000000000 0000251c00000000 [ 56.697896] 0000000000000000 00000011fffffffd 0000000000000000 0000000000000000 [ 56.697901] 0000000000000000 1b67a968695184b9 0000000000000000 90000001000b98d0 [ 56.697906] 90000001000bb8d0 900000011aa48028 0000000000000000 ffff800004f9d74c [ 56.697911] 90000001000ba000 ffff800004f9ce58 0000000000000000 ffff800005470440 [ 56.697916] ffff800004811f88 90000001000b98d0 9000000100da2aa8 90000001000bb8d0 [ 56.697921] 0000000000000000 90000001000ba000 900000011aa48028 ffff800004f9d74c [ 56.697926] ffff8000054704e8 90000001000bb8b8 90000001000ba000 0000000000000000 [ 56.697931] 90000001000bb8d0 9000000006307564 9000000005e666e0 90000001752359b8 [ 56.697936] 9000000008cbe400 900000000804d000 9000000005e666e0 0000000000000000 [ 56.697941] ... [ 56.697944] Call Trace: [ 56.697945] [<90000000066e35c8>] strstr+0x28/0x120 [ 56.697950] [<ffff800004f98a80>] wacom_update_name+0xe0/0x300 [wacom] [ 56.698000] [<ffff800004f9ce58>] wacom_parse_and_register+0x338/0x900 [wacom] [ 56.698050] [<ffff800004f9d74c>] wacom_probe+0x32c/0x420 [wacom] [ 56.698099] [<9000000006307564>] hid_device_probe+0x144/0x260 [ 56.698103] [<9000000005e65d68>] really_probe+0x208/0x540 [ 56.698109] [<9000000005e661dc>] __driver_probe_device+0x13c/0x1e0 [ 56.698112] [<9000000005e66620>] driver_probe_device+0x40/0x100 [ 56.698116] [<9000000005e6680c>] __device_attach_driver+0x12c/0x180 [ 56.698119] [<9000000005e62bc8>] bus_for_each_drv+0x88/0x160 [ 56.698123] [<9000000005e66468>] __device_attach+0x108/0x260 [ 56.698126] [<9000000005e63918>] device_reprobe+0x78/0x100 [ 56.698129] [<9000000005e62a68>] bus_for_each_dev+0x88/0x160 [ 56.698132] [<9000000006304e54>] __hid_bus_driver_added+0x34/0x80 [ 56.698134] [<9000000005e62bc8>] bus_for_each_drv+0x88/0x160 [ 56.698137] [<9000000006304df0>] __hid_register_driver+0x70/0xa0 [ 56.698142] [<9000000004e10fe4>] do_one_initcall+0x104/0x320 [ 56.698146] [<9000000004f38150>] do_init_module+0x90/0x2c0 [ 56.698151] [<9000000004f3a3d8>] init_module_from_file+0xb8/0x120 [ 56.698155] [<9000000004f3a590>] idempotent_init_module+0x150/0x3a0 [ 56.698159] [<9000000004f3a890>] sys_finit_module+0xb0/0x140 [ 56.698163] [<900000000671e4e8>] do_syscall+0x88/0xc0 [ 56.698166] [<9000000004e12404>] handle_syscall+0xc4/0x160 [ 56.698171] Code: 0011958f 00150224 5800cd85 <2a00022c> 00150004 4000c180 0015022c 03400000 03400000 [ 56.698192] ---[ end trace 0000000000000000 ]--- Fixes: 09dc28acaec7 ("HID: wacom: Improve generic name generation") Reported-by: Zhenxing Chen <chenzhenxing@uniontech.com> Co-developed-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Link: https://patch.msgid.link/B31757FE8E1544CF+20241125052616.18261-1-wangyuli@uniontech.com Cc: stable@vger.kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sarah Maedel <sarah.maedel@hetzner-cloud.de> Date: Fri Oct 18 09:46:10 2024 +0200 hwmon: (nct6775) Add 665-ACE/600M-CL to ASUS WMI monitoring list [ Upstream commit ccae49e5cf6ebda1a7fa5d2ca99500987c7420c4 ] Boards such as * Pro WS 665-ACE * Pro WS 600M-CL have got a nct6775 chip, but by default there's no use of it because of resource conflict with WMI method. Add affected boards to the WMI monitoring list. Link: https://bugzilla.kernel.org/show_bug.cgi?id=204807 Co-developed-by: Tommy Giesler <tommy.giesler@hetzner.com> Signed-off-by: Tommy Giesler <tommy.giesler@hetzner.com> Signed-off-by: Sarah Maedel <sarah.maedel@hetzner-cloud.de> Message-ID: <20241018074611.358619-1-sarah.maedel@hetzner-cloud.de> [groeck: Change commit message to imperative mood] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jarkko Nikula <jarkko.nikula@linux.intel.com> Date: Mon Sep 23 16:27:19 2024 +0300 i2c: i801: Add support for Intel Panther Lake [ Upstream commit bd492b58371295d3ae26162b9666be584abad68a ] Add SMBus PCI IDs on Intel Panther Lake-P and -U. Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Frank Li <Frank.Li@nxp.com> Date: Mon Oct 21 11:45:07 2024 -0400 i3c: master: Extend address status bit to 4 and add I3C_ADDR_SLOT_EXT_DESIRED [ Upstream commit 2f552fa280590e61bd3dbe66a7b54b99caa642a4 ] Extend the address status bit to 4 and introduce the I3C_ADDR_SLOT_EXT_DESIRED macro to indicate that a device prefers a specific address. This is generally set by the 'assigned-address' in the device tree source (dts) file. ┌────┬─────────────┬───┬─────────┬───┐ │S/Sr│ 7'h7E RnW=0 │ACK│ ENTDAA │ T ├────┐ └────┴─────────────┴───┴─────────┴───┘ │ ┌─────────────────────────────────────────┘ │ ┌──┬─────────────┬───┬─────────────────┬────────────────┬───┬─────────┐ └─►│Sr│7'h7E RnW=1 │ACK│48bit UID BCR DCR│Assign 7bit Addr│PAR│ ACK/NACK│ └──┴─────────────┴───┴─────────────────┴────────────────┴───┴─────────┘ Some master controllers (such as HCI) need to prepare the entire above transaction before sending it out to the I3C bus. This means that a 7-bit dynamic address needs to be allocated before knowing the target device's UID information. However, some I3C targets may request specific addresses (called as "init_dyn_addr"), which is typically specified by the DT-'s assigned-address property. Lower addresses having higher IBI priority. If it is available, i3c_bus_get_free_addr() preferably return a free address that is not in the list of desired addresses (called as "init_dyn_addr"). This allows the device with the "init_dyn_addr" to switch to its "init_dyn_addr" when it hot-joins the I3C bus. Otherwise, if the "init_dyn_addr" is already in use by another I3C device, the target device will not be able to switch to its desired address. If the previous step fails, fallback returning one of the remaining unassigned address, regardless of its state in the desired list. Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20241021-i3c_dts_assign-v8-2-4098b8bde01e@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Stable-dep-of: 851bd21cdb55 ("i3c: master: Fix dynamic address leak when 'assigned-address' is present") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Frank Li <Frank.Li@nxp.com> Date: Mon Oct 21 11:45:08 2024 -0400 i3c: master: Fix dynamic address leak when 'assigned-address' is present [ Upstream commit 851bd21cdb55e727ab29280bc9f6b678164f802a ] If the DTS contains 'assigned-address', a dynamic address leak occurs during hotjoin events. Assume a device have assigned-address 0xb. - Device issue Hotjoin - Call i3c_master_do_daa() - Call driver xxx_do_daa() - Call i3c_master_get_free_addr() to get dynamic address 0x9 - i3c_master_add_i3c_dev_locked(0x9) - expected_dyn_addr = newdev->boardinfo->init_dyn_addr (0xb); - i3c_master_reattach_i3c_dev(newdev(0xb), old_dyn_addr(0x9)); - if (dev->info.dyn_addr != old_dyn_addr && ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 0xb != 0x9 -> TRUE (!dev->boardinfo || ^^^^^^^^^^^^^^^ -> FALSE dev->info.dyn_addr != dev->boardinfo->init_dyn_addr)) { ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 0xb != 0xb -> FALSE ... i3c_bus_set_addr_slot_status(&master->bus, old_dyn_addr, I3C_ADDR_SLOT_FREE); ^^^ This will be skipped. So old_dyn_addr never free } - i3c_master_get_free_addr() will return increased sequence number. Remove dev->info.dyn_addr != dev->boardinfo->init_dyn_addr condition check. dev->info.dyn_addr should be checked before calling this function because i3c_master_setnewda_locked() has already been called and the target device has already accepted dyn_addr. It is too late to check if dyn_addr is free in i3c_master_reattach_i3c_dev(). Add check to ensure expected_dyn_addr is free before i3c_master_setnewda_locked(). Fixes: cc3a392d69b6 ("i3c: master: fix for SETDASA and DAA process") Cc: stable@kernel.org Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20241021-i3c_dts_assign-v8-3-4098b8bde01e@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Frank Li <Frank.Li@nxp.com> Date: Mon Oct 21 11:45:06 2024 -0400 i3c: master: Replace hard code 2 with macro I3C_ADDR_SLOT_STATUS_BITS [ Upstream commit 16aed0a6520ba01b7d22c32e193fc1ec674f92d4 ] Replace the hardcoded value 2, which indicates 2 bits for I3C address status, with the predefined macro I3C_ADDR_SLOT_STATUS_BITS. Improve maintainability and extensibility of the code. Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20241021-i3c_dts_assign-v8-1-4098b8bde01e@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Stable-dep-of: 851bd21cdb55 ("i3c: master: Fix dynamic address leak when 'assigned-address' is present") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jarkko Nikula <jarkko.nikula@linux.intel.com> Date: Fri Sep 20 17:44:31 2024 +0300 i3c: mipi-i3c-hci: Mask ring interrupts before ring stop request [ Upstream commit 6ca2738174e4ee44edb2ab2d86ce74f015a0cc32 ] Bus cleanup path in DMA mode may trigger a RING_OP_STAT interrupt when the ring is being stopped. Depending on timing between ring stop request completion, interrupt handler removal and code execution this may lead to a NULL pointer dereference in hci_dma_irq_handler() if it gets to run after the io_data pointer is set to NULL in hci_dma_cleanup(). Prevent this my masking the ring interrupts before ring stop request. Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Link: https://lore.kernel.org/r/20240920144432.62370-2-jarkko.nikula@linux.intel.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Defa Li <defa.li@mediatek.com> Date: Thu Nov 7 21:25:39 2024 +0800 i3c: Use i3cdev->desc->info instead of calling i3c_device_get_info() to avoid deadlock [ Upstream commit 6cf7b65f7029914dc0cd7db86fac9ee5159008c6 ] A deadlock may happen since the i3c_master_register() acquires &i3cbus->lock twice. See the log below. Use i3cdev->desc->info instead of calling i3c_device_info() to avoid acquiring the lock twice. v2: - Modified the title and commit message ============================================ WARNING: possible recursive locking detected 6.11.0-mainline -------------------------------------------- init/1 is trying to acquire lock: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_bus_normaluse_lock but task is already holding lock: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&i3cbus->lock); lock(&i3cbus->lock); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by init/1: #0: fcffff809b6798f8 (&dev->mutex){....}-{3:3}, at: __driver_attach #1: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register stack backtrace: CPU: 6 UID: 0 PID: 1 Comm: init Call trace: dump_backtrace+0xfc/0x17c show_stack+0x18/0x28 dump_stack_lvl+0x40/0xc0 dump_stack+0x18/0x24 print_deadlock_bug+0x388/0x390 __lock_acquire+0x18bc/0x32ec lock_acquire+0x134/0x2b0 down_read+0x50/0x19c i3c_bus_normaluse_lock+0x14/0x24 i3c_device_get_info+0x24/0x58 i3c_device_uevent+0x34/0xa4 dev_uevent+0x310/0x384 kobject_uevent_env+0x244/0x414 kobject_uevent+0x14/0x20 device_add+0x278/0x460 device_register+0x20/0x34 i3c_master_register_new_i3c_devs+0x78/0x154 i3c_master_register+0x6a0/0x6d4 mtk_i3c_master_probe+0x3b8/0x4d8 platform_probe+0xa0/0xe0 really_probe+0x114/0x454 __driver_probe_device+0xa0/0x15c driver_probe_device+0x3c/0x1ac __driver_attach+0xc4/0x1f0 bus_for_each_dev+0x104/0x160 driver_attach+0x24/0x34 bus_add_driver+0x14c/0x294 driver_register+0x68/0x104 __platform_driver_register+0x20/0x30 init_module+0x20/0xfe4 do_one_initcall+0x184/0x464 do_init_module+0x58/0x1ec load_module+0xefc/0x10c8 __arm64_sys_finit_module+0x238/0x33c invoke_syscall+0x58/0x10c el0_svc_common+0xa8/0xdc do_el0_svc+0x1c/0x28 el0_svc+0x50/0xac el0t_64_sync_handler+0x70/0xbc el0t_64_sync+0x1a8/0x1ac Signed-off-by: Defa Li <defa.li@mediatek.com> Link: https://lore.kernel.org/r/20241107132549.25439-1-defa.li@mediatek.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Date: Mon Sep 30 20:36:22 2024 +0200 ice: fix PHY Clock Recovery availability check [ Upstream commit 01fd68e54794fb1e1fe95be38facf9bbafee9ca3 ] To check if PHY Clock Recovery mechanic is available for a device, there is a need to verify if given PHY is available within the netlist, but the netlist node type used for the search is wrong, also the search context shall be specified. Modify the search function to allow specifying the context in the search. Use the PHY node type instead of CLOCK CONTROLLER type, also use proper search context which for PHY search is PORT, as defined in E810 Datasheet [1] ('3.3.8.2.4 Node Part Number and Node Options (0x0003)' and 'Table 3-105. Program Topology Device NVM Admin Command'). [1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true Fixes: 91e43ca0090b ("ice: fix linking when CONFIG_PTP_1588_CLOCK=n") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Przemyslaw Korba <przemyslaw.korba@intel.com> Date: Fri Nov 15 13:25:37 2024 +0100 ice: fix PHY timestamp extraction for ETH56G [ Upstream commit 3214fae85e8336fe13e20cf78fc9b6a668bdedff ] Fix incorrect PHY timestamp extraction for ETH56G. It's better to use FIELD_PREP() than manual shift. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marcin Szycik <marcin.szycik@linux.intel.com> Date: Mon Nov 4 19:49:09 2024 +0100 ice: Fix VLAN pruning in switchdev mode [ Upstream commit 761e0be2888a931465e0d7bbeecce797f9c311a3 ] In switchdev mode the uplink VSI should receive all unmatched packets, including VLANs. Therefore, VLAN pruning should be disabled if uplink is in switchdev mode. It is already being done in ice_eswitch_setup_env(), however the addition of ice_up() in commit 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") caused VLAN pruning to be re-enabled after disabling it. Add a check to ice_set_vlan_filtering_features() to ensure VLAN filtering will not be enabled if uplink is in switchdev mode. Note that ice_is_eswitch_mode_switchdev() is being used instead of ice_is_switchdev_running(), as the latter would only return true after the whole switchdev setup completes. Fixes: 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joshua Hay <joshua.a.hay@intel.com> Date: Mon Oct 7 13:24:35 2024 -0700 idpf: set completion tag for "empty" bufs associated with a packet [ Upstream commit 4c69c77aafe74cf755af55070584b643e5c4e4d8 ] Commit d9028db618a6 ("idpf: convert to libeth Tx buffer completion") inadvertently removed code that was necessary for the tx buffer cleaning routine to iterate over all buffers associated with a packet. When a frag is too large for a single data descriptor, it will be split across multiple data descriptors. This means the frag will span multiple buffers in the buffer ring in order to keep the descriptor and buffer ring indexes aligned. The buffer entries in the ring are technically empty and no cleaning actions need to be performed. These empty buffers can precede other frags associated with the same packet. I.e. a single packet on the buffer ring can look like: buf[0]=skb0.frag0 buf[1]=skb0.frag1 buf[2]=empty buf[3]=skb0.frag2 The cleaning routine iterates through these buffers based on a matching completion tag. If the completion tag is not set for buf2, the loop will end prematurely. Frag2 will be left uncleaned and next_to_clean will be left pointing to the end of packet, which will break the cleaning logic for subsequent cleans. This consequently leads to tx timeouts. Assign the empty bufs the same completion tag for the packet to ensure the cleaning routine iterates over all of the buffers associated with the packet. Fixes: d9028db618a6 ("idpf: convert to libeth Tx buffer completion") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Madhu chittim <madhu.chittim@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuan Can <yuancan@huawei.com> Date: Wed Oct 23 20:10:48 2024 +0800 igb: Fix potential invalid memory access in igb_init_module() [ Upstream commit 0566f83d206c7a864abcd741fe39d6e0ae5eef29 ] The pci_register_driver() can fail and when this happened, the dca_notifier needs to be unregistered, otherwise the dca_notifier can be called when igb fails to install, resulting to invalid memory access. Fixes: bbd98fe48a43 ("igb: Fix DCA errors and do not use context index for 82576") Signed-off-by: Yuan Can <yuancan@huawei.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nuno Sa <nuno.sa@analog.com> Date: Mon Oct 14 17:01:21 2024 +0200 iio: adc: ad7192: properly check spi_get_device_match_data() [ Upstream commit b7f99fa1b64af2f696b13cec581cb4cd7d3982b8 ] spi_get_device_match_data() can return a NULL pointer. Hence, let's check for it. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20241014-fix-error-check-v1-1-089e1003d12f@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Date: Thu Oct 24 22:05:12 2024 +0300 iio: light: ltr501: Add LTER0303 to the supported devices [ Upstream commit c26acb09ccbef47d1fddaf0783c1392d0462122c ] It has been found that the (non-vendor issued) ACPI ID for Lite-On LTR303 is present in Microsoft catalog. Add it to the list of the supported devices. Link: https://www.catalog.update.microsoft.com/Search.aspx?q=lter0303 Closes: https://lore.kernel.org/r/9cdda3e0-d56e-466f-911f-96ffd6f602c8@redhat.com Reported-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20241024191200.229894-24-andriy.shevchenko@linux.intel.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Rothwell <sfr@canb.auug.org.au> Date: Fri Nov 8 16:41:27 2024 +0100 iio: magnetometer: fix if () scoped_guard() formatting commit 9a884bdb6e9560c6da44052d5248e89d78c983a6 upstream. Add mising braces after an if condition that contains scoped_guard(). This style is both preferred and necessary here, to fix warning after scoped_guard() change in commit fcc22ac5baf0 ("cleanup: Adjust scoped_guard() macros to avoid potential warning") to have if-else inside of the macro. Current (no braces) use in af8133j_set_scale() yields the following warnings: af8133j.c:315:12: warning: suggest explicit braces to avoid ambiguous 'else' [-Wdangling-else] af8133j.c:316:3: warning: add explicit braces to avoid dangling else [-Wdangling-else] Fixes: fcc22ac5baf0 ("cleanup: Adjust scoped_guard() macros to avoid potential warning") Closes: https://lore.kernel.org/oe-kbuild-all/202409270848.tTpyEAR7-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241108154258.21411-1-przemyslaw.kitszel@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jakob Hauser <jahau@rocketmail.com> Date: Fri Nov 29 22:25:07 2024 +0100 iio: magnetometer: yas530: use signed integer type for clamp limits [ Upstream commit f1ee5483e40881d8ad5a63aa148b753b5c6a839b ] In the function yas537_measure() there is a clamp_val() with limits of -BIT(13) and BIT(13) - 1. The input clamp value h[] is of type s32. The BIT() is of type unsigned long integer due to its define in include/vdso/bits.h. The lower limit -BIT(13) is recognized as -8192 but expressed as an unsigned long integer. The size of an unsigned long integer differs between 32-bit and 64-bit architectures. Converting this to type s32 may lead to undesired behavior. Additionally, in the calculation lines h[0], h[1] and h[2] the unsigned long integer divisor BIT(13) causes an unsigned division, shifting the left-hand side of the equation back and forth, possibly ending up in large positive values instead of negative values on 32-bit architectures. To solve those two issues, declare a signed integer with a value of BIT(13). There is another omission in the clamp line: clamp_val() returns a value and it's going nowhere here. Self-assign it to h[i] to make use of the clamp macro. Finally, replace clamp_val() macro by clamp() because after changing the limits from type unsigned long integer to signed integer it's fine that way. Link: https://lkml.kernel.org/r/11609b2243c295d65ab4d47e78c239d61ad6be75.1732914810.git.jahau@rocketmail.com Fixes: 65f79b501030 ("iio: magnetometer: yas530: Add YAS537 variant") Signed-off-by: Jakob Hauser <jahau@rocketmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202411230458.dhZwh3TT-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202411282222.oF0B4110-lkp@intel.com/ Reviewed-by: David Laight <david.laight@aculab.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jens Axboe <axboe@kernel.dk> Date: Fri Nov 29 07:20:28 2024 -0700 io_uring/tctx: work around xa_store() allocation error issue [ Upstream commit 7eb75ce7527129d7f1fee6951566af409a37a1c4 ] syzbot triggered the following WARN_ON: WARNING: CPU: 0 PID: 16 at io_uring/tctx.c:51 __io_uring_free+0xfa/0x140 io_uring/tctx.c:51 which is the WARN_ON_ONCE(!xa_empty(&tctx->xa)); sanity check in __io_uring_free() when a io_uring_task is going through its final put. The syzbot test case includes injecting memory allocation failures, and it very much looks like xa_store() can fail one of its memory allocations and end up with ->head being non-NULL even though no entries exist in the xarray. Until this issue gets sorted out, work around it by attempting to iterate entries in our xarray, and WARN_ON_ONCE() if one is found. Reported-by: syzbot+cc36d44ec9f368e443d3@syzkaller.appspotmail.com Link: https://lore.kernel.org/io-uring/673c1643.050a0220.87769.0066.GAE@google.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bernd Schubert <bschubert@ddn.com> Date: Tue Dec 3 11:31:05 2024 +0100 io_uring: Change res2 parameter type in io_uring_cmd_done commit a07d2d7930c75e6bf88683b376d09ab1f3fed2aa upstream. Change the type of the res2 parameter in io_uring_cmd_done from ssize_t to u64. This aligns the parameter type with io_req_set_cqe32_extra, which expects u64 arguments. The change eliminates potential issues on 32-bit architectures where ssize_t might be 32-bit. Only user of passing res2 is drivers/nvme/host/ioctl.c and it actually passes u64. Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Cc: stable@vger.kernel.org Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Tested-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Signed-off-by: Bernd Schubert <bschubert@ddn.com> Link: https://lore.kernel.org/r/20241203-io_uring_cmd_done-res2-as-u64-v2-1-5e59ae617151@ddn.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jason Gunthorpe <jgg@ziepe.ca> Date: Fri Oct 18 14:12:19 2024 -0300 iommu/amd: Fix corruption when mapping large pages from 0 [ Upstream commit e3a682eaf2af51a83f5313145ef592ce50fa787f ] If a page is mapped starting at 0 that is equal to or larger than can fit in the current mode (number of table levels) it results in corrupting the mapping as the following logic assumes the mode is correct for the page size being requested. There are two issues here, the check if the address fits within the table uses the start address, it should use the last address to ensure that last byte of the mapping fits within the current table mode. The second is if the mapping is exactly the size of the full page table it has to add another level to instead hold a single IOPTE for the large size. Since both corner cases require a 0 IOVA to be hit and doesn't start until a page size of 2^48 it is unlikely to ever hit in a real system. Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-27ab08d646a1+29-amd_0map_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicolin Chen <nicolinc@nvidia.com> Date: Tue Dec 3 00:02:54 2024 -0800 iommufd: Fix out_fput in iommufd_fault_alloc() commit af7f4780514f850322b2959032ecaa96e4b26472 upstream. As fput() calls the file->f_op->release op, where fault obj and ictx are getting released, there is no need to release these two after fput() one more time, which would result in imbalanced refcounts: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 48 PID: 2369 at lib/refcount.c:31 refcount_warn_saturate+0x60/0x230 Call trace: refcount_warn_saturate+0x60/0x230 (P) refcount_warn_saturate+0x60/0x230 (L) iommufd_fault_fops_release+0x9c/0xe0 [iommufd] ... VFS: Close: file count is 0 (f_op=iommufd_fops [iommufd]) WARNING: CPU: 48 PID: 2369 at fs/open.c:1507 filp_flush+0x3c/0xf0 Call trace: filp_flush+0x3c/0xf0 (P) filp_flush+0x3c/0xf0 (L) __arm64_sys_close+0x34/0x98 ... imbalanced put on file reference count WARNING: CPU: 48 PID: 2369 at fs/file.c:74 __file_ref_put+0x100/0x138 Call trace: __file_ref_put+0x100/0x138 (P) __file_ref_put+0x100/0x138 (L) __fput_sync+0x4c/0xd0 Drop those two lines to fix the warnings above. Cc: stable@vger.kernel.org Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object") Link: https://patch.msgid.link/r/b5651beb3a6b1adeef26fffac24607353bf67ba1.1733212723.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Eric Dumazet <edumazet@google.com> Date: Tue Nov 26 19:28:27 2024 +0000 ipv6: avoid possible NULL deref in modify_prefix_route() [ Upstream commit a747e02430dfb3657141f99aa6b09331283fa493 ] syzbot found a NULL deref [1] in modify_prefix_route(), caused by one fib6_info without a fib6_table pointer set. This can happen for net->ipv6.fib6_null_entry [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] CPU: 1 UID: 0 PID: 5837 Comm: syz-executor888 Not tainted 6.12.0-syzkaller-09567-g7eef7e306d3c #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:__lock_acquire+0xe4/0x3c40 kernel/locking/lockdep.c:5089 Code: 08 84 d2 0f 85 15 14 00 00 44 8b 0d ca 98 f5 0e 45 85 c9 0f 84 b4 0e 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 96 2c 00 00 49 8b 04 24 48 3d a0 07 7f 93 0f 84 RSP: 0018:ffffc900035d7268 EFLAGS: 00010006 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000006 RSI: 1ffff920006bae5f RDI: 0000000000000030 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001 R10: ffffffff90608e17 R11: 0000000000000001 R12: 0000000000000030 R13: ffff888036334880 R14: 0000000000000000 R15: 0000000000000000 FS: 0000555579e90380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffc59cc4278 CR3: 0000000072b54000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5849 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline] _raw_spin_lock_bh+0x33/0x40 kernel/locking/spinlock.c:178 spin_lock_bh include/linux/spinlock.h:356 [inline] modify_prefix_route+0x30b/0x8b0 net/ipv6/addrconf.c:4831 inet6_addr_modify net/ipv6/addrconf.c:4923 [inline] inet6_rtm_newaddr+0x12c7/0x1ab0 net/ipv6/addrconf.c:5055 rtnetlink_rcv_msg+0x3c7/0xea0 net/core/rtnetlink.c:6920 netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2541 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] ____sys_sendmsg+0xaaf/0xc90 net/socket.c:2583 ___sys_sendmsg+0x135/0x1e0 net/socket.c:2637 __sys_sendmsg+0x16e/0x220 net/socket.c:2669 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fd1dcef8b79 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc59cc4378 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd1dcef8b79 RDX: 0000000000040040 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 00000000000113fd R08: 0000000000000006 R09: 0000000000000006 R10: 0000000000000006 R11: 0000000000000246 R12: 00007ffc59cc438c R13: 431bde82d7b634db R14: 0000000000000001 R15: 0000000000000001 </TASK> Fixes: 5eb902b8e719 ("net/ipv6: Remove expired routes with a separated list of routes.") Reported-by: syzbot+1de74b0794c40c8eb300@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/67461f7f.050a0220.1286eb.0021.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> CC: Kui-Feng Lee <thinker.li@gmail.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jinghao Jia <jinghao7@illinois.edu> Date: Sat Nov 23 03:42:56 2024 -0600 ipvs: fix UB due to uninitialized stack access in ip_vs_protocol_init() [ Upstream commit 146b6f1112eb30a19776d6c323c994e9d67790db ] Under certain kernel configurations when building with Clang/LLVM, the compiler does not generate a return or jump as the terminator instruction for ip_vs_protocol_init(), triggering the following objtool warning during build time: vmlinux.o: warning: objtool: ip_vs_protocol_init() falls through to next function __initstub__kmod_ip_vs_rr__935_123_ip_vs_rr_init6() At runtime, this either causes an oops when trying to load the ipvs module or a boot-time panic if ipvs is built-in. This same issue has been reported by the Intel kernel test robot previously. Digging deeper into both LLVM and the kernel code reveals this to be a undefined behavior problem. ip_vs_protocol_init() uses a on-stack buffer of 64 chars to store the registered protocol names and leaves it uninitialized after definition. The function calls strnlen() when concatenating protocol names into the buffer. With CONFIG_FORTIFY_SOURCE strnlen() performs an extra step to check whether the last byte of the input char buffer is a null character (commit 3009f891bb9f ("fortify: Allow strlen() and strnlen() to pass compile-time known lengths")). This, together with possibly other configurations, cause the following IR to be generated: define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #5 section ".init.text" align 16 !kcfi_type !29 { %1 = alloca [64 x i8], align 16 ... 14: ; preds = %11 %15 = getelementptr inbounds i8, ptr %1, i64 63 %16 = load i8, ptr %15, align 1 %17 = tail call i1 @llvm.is.constant.i8(i8 %16) %18 = icmp eq i8 %16, 0 %19 = select i1 %17, i1 %18, i1 false br i1 %19, label %20, label %23 20: ; preds = %14 %21 = call i64 @strlen(ptr noundef nonnull dereferenceable(1) %1) #23 ... 23: ; preds = %14, %11, %20 %24 = call i64 @strnlen(ptr noundef nonnull dereferenceable(1) %1, i64 noundef 64) #24 ... } The above code calculates the address of the last char in the buffer (value %15) and then loads from it (value %16). Because the buffer is never initialized, the LLVM GVN pass marks value %16 as undefined: %13 = getelementptr inbounds i8, ptr %1, i64 63 br i1 undef, label %14, label %17 This gives later passes (SCCP, in particular) more DCE opportunities by propagating the undef value further, and eventually removes everything after the load on the uninitialized stack location: define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #0 section ".init.text" align 16 !kcfi_type !11 { %1 = alloca [64 x i8], align 16 ... 12: ; preds = %11 %13 = getelementptr inbounds i8, ptr %1, i64 63 unreachable } In this way, the generated native code will just fall through to the next function, as LLVM does not generate any code for the unreachable IR instruction and leaves the function without a terminator. Zero the on-stack buffer to avoid this possible UB. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402100205.PWXIz1ZK-lkp@intel.com/ Co-developed-by: Ruowen Qin <ruqin@redhat.com> Signed-off-by: Ruowen Qin <ruqin@redhat.com> Signed-off-by: Jinghao Jia <jinghao7@illinois.edu> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhou Wang <wangzhou1@hisilicon.com> Date: Sat Nov 16 16:01:26 2024 +0800 irqchip/gicv3-its: Add workaround for hip09 ITS erratum 162100801 [ Upstream commit f82e62d470cc990ebd9d691f931dd418e4e9cea9 ] When enabling GICv4.1 in hip09, VMAPP fails to clear some caches during the unmap operation, which can causes vSGIs to be lost. To fix the issue, invalidate the related vPE cache through GICR_INVALLR after VMOVP. Suggested-by: Marc Zyngier <maz@kernel.org> Co-developed-by: Nianyao Tang <tangnianyao@huawei.com> Signed-off-by: Nianyao Tang <tangnianyao@huawei.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be> Date: Tue Dec 3 17:27:40 2024 +0100 irqchip/stm32mp-exti: CONFIG_STM32MP_EXTI should not default to y when compile-testing [ Upstream commit 9151299ee5101e03eeed544c1280b0e14b89a8a4 ] Merely enabling compile-testing should not enable additional functionality. Fixes: 0be58e0553812fcb ("irqchip/stm32mp-exti: Allow building as module") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/ef5ec063b23522058f92087e072419ea233acfe9.1733243115.git.geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oleksandr Ocheretnyi <oocheret@cisco.com> Date: Fri Sep 13 12:14:03 2024 -0700 iTCO_wdt: mask NMI_NOW bit for update_no_reboot_bit() call [ Upstream commit daa814d784ac034c62ab3fb0ef83daeafef527e2 ] Commit da23b6faa8bf ("watchdog: iTCO: Add support for Cannon Lake PCH iTCO") does not mask NMI_NOW bit during TCO1_CNT register's value comparison for update_no_reboot_bit() call causing following failure: ... iTCO_vendor_support: vendor-support=0 iTCO_wdt iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS ... and this can lead to unexpected NMIs later during regular crashkernel's workflow because of watchdog probe call failures. This change masks NMI_NOW bit for TCO1_CNT register values to avoid unexpected NMI_NOW bit inversions. Fixes: da23b6faa8bf ("watchdog: iTCO: Add support for Cannon Lake PCH iTCO") Signed-off-by: Oleksandr Ocheretnyi <oocheret@cisco.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20240913191403.2560805-1-oocheret@cisco.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tore Amundsen <tore@amundsen.org> Date: Fri Nov 15 14:17:36 2024 +0000 ixgbe: Correct BASE-BX10 compliance code [ Upstream commit f72ce14b231f7bf06088e4e50f1875f1e35f79d7 ] SFF-8472 (section 5.4 Transceiver Compliance Codes) defines bit 6 as BASE-BX10. Bit 6 means a value of 0x40 (decimal 64). The current value in the source code is 0x64, which appears to be a mix-up of hex and decimal values. A value of 0x64 (binary 01100100) incorrectly sets bit 2 (1000BASE-CX) and bit 5 (100BASE-FX) as well. Fixes: 1b43e0d20f2d ("ixgbe: Add 1000BASE-BX support") Signed-off-by: Tore Amundsen <tore@amundsen.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Ernesto Castellotti <ernesto@castellotti.net> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jacob Keller <jacob.e.keller@intel.com> Date: Fri Nov 1 16:05:43 2024 -0700 ixgbe: downgrade logging of unsupported VF API version to debug [ Upstream commit 15915b43a7fb938934bb7fc4290127218859d795 ] The ixgbe PF driver logs an info message when a VF attempts to negotiate an API version which it does not support: VF 0 requested invalid api version 6 The ixgbevf driver attempts to load with mailbox API v1.5, which is required for best compatibility with other hosts such as the ESX VMWare PF. The Linux PF only supports API v1.4, and does not currently have support for the v1.5 API. The logged message can confuse users, as the v1.5 API is valid, but just happens to not currently be supported by the Linux PF. Downgrade the info message to a debug message, and fix the language to use 'unsupported' instead of 'invalid' to improve message clarity. Long term, we should investigate whether the improvements in the v1.5 API make sense for the Linux PF, and if so implement them properly. This may require yet another API version to resolve issues with negotiating IPSEC offload support. Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reported-by: Yifei Liu <yifei.l.liu@oracle.com> Link: https://lore.kernel.org/intel-wired-lan/20240301235837.3741422-1-yifei.l.liu@oracle.com/ Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jacob Keller <jacob.e.keller@intel.com> Date: Fri Nov 1 16:05:42 2024 -0700 ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5 [ Upstream commit d0725312adf5a803de8f621bd1b12ba7a6464a29 ] Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") added support for v1.5 of the PF to VF mailbox communication API. This commit mistakenly enabled IPSEC offload for API v1.5. No implementation of the v1.5 API has support for IPSEC offload. This offload is only supported by the Linux PF as mailbox API v1.4. In fact, the v1.5 API is not implemented in any Linux PF. Attempting to enable IPSEC offload on a PF which supports v1.5 API will not work. Only the Linux upstream ixgbe and ixgbevf support IPSEC offload, and only as part of the v1.4 API. Fix the ixgbevf Linux driver to stop attempting IPSEC offload when the mailbox API does not support it. The existing API design choice makes it difficult to support future API versions, as other non-Linux hosts do not implement IPSEC offload. If we add support for v1.5 to the Linux PF, then we lose support for IPSEC offload. A full solution likely requires a new mailbox API with a proper negotiation to check that IPSEC is actually supported by the host. Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Weinberger <richard@nod.at> Date: Tue Dec 3 12:27:15 2024 +0100 jffs2: Fix rtime decompressor commit b29bf7119d6bbfd04aabb8d82b060fe2a33ef890 upstream. The fix for a memory corruption contained a off-by-one error and caused the compressor to fail in legit cases. Cc: Kinsey Moore <kinsey.moore@oarcorp.com> Cc: stable@vger.kernel.org Fixes: fe051552f5078 ("jffs2: Prevent rtime decompress memory corruption") Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kinsey Moore <kinsey.moore@oarcorp.com> Date: Tue Jul 23 15:58:05 2024 -0500 jffs2: Prevent rtime decompress memory corruption commit fe051552f5078fa02d593847529a3884305a6ffe upstream. The rtime decompression routine does not fully check bounds during the entirety of the decompression pass and can corrupt memory outside the decompression buffer if the compressed data is corrupted. This adds the required check to prevent this failure mode. Cc: stable@vger.kernel.org Signed-off-by: Kinsey Moore <kinsey.moore@oarcorp.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nihar Chaithanya <niharchaithanya@gmail.com> Date: Wed Oct 9 01:51:38 2024 +0530 jfs: add a check to prevent array-index-out-of-bounds in dbAdjTree [ Upstream commit a174706ba4dad895c40b1d2277bade16dfacdcd9 ] When the value of lp is 0 at the beginning of the for loop, it will become negative in the next assignment and we should bail out. Reported-by: syzbot+412dea214d8baa3f7483@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=412dea214d8baa3f7483 Tested-by: syzbot+412dea214d8baa3f7483@syzkaller.appspotmail.com Signed-off-by: Nihar Chaithanya <niharchaithanya@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ghanshyam Agrawal <ghanshyam1898@gmail.com> Date: Sat Sep 28 14:07:22 2024 +0530 jfs: array-index-out-of-bounds fix in dtReadFirst [ Upstream commit ca84a2c9be482836b86d780244f0357e5a778c46 ] The value of stbl can be sometimes out of bounds due to a bad filesystem. Added a check with appopriate return of error code in that case. Reported-by: syzbot+65fa06e29859e41a83f3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=65fa06e29859e41a83f3 Signed-off-by: Ghanshyam Agrawal <ghanshyam1898@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ghanshyam Agrawal <ghanshyam1898@gmail.com> Date: Tue Oct 1 11:35:47 2024 +0530 jfs: fix array-index-out-of-bounds in jfs_readdir [ Upstream commit 839f102efb168f02dfdd46717b7c6dddb26b015e ] The stbl might contain some invalid values. Added a check to return error code in that case. Reported-by: syzbot+0315f8fe99120601ba88@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0315f8fe99120601ba88 Signed-off-by: Ghanshyam Agrawal <ghanshyam1898@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ghanshyam Agrawal <ghanshyam1898@gmail.com> Date: Mon Sep 30 13:42:18 2024 +0530 jfs: fix shift-out-of-bounds in dbSplit [ Upstream commit a5f5e4698f8abbb25fe4959814093fb5bfa1aa9d ] When dmt_budmin is less than zero, it causes errors in the later stages. Added a check to return an error beforehand in dbAllocCtl itself. Reported-by: syzbot+b5ca8a249162c4b9a7d0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b5ca8a249162c4b9a7d0 Signed-off-by: Ghanshyam Agrawal <ghanshyam1898@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jared Kangas <jkangas@redhat.com> Date: Tue Nov 19 13:02:34 2024 -0800 kasan: make report_lock a raw spinlock commit e30a0361b8515d424c73c67de1a43e45a13b8ba2 upstream. If PREEMPT_RT is enabled, report_lock is a sleeping spinlock and must not be locked when IRQs are disabled. However, KASAN reports may be triggered in such contexts. For example: char *s = kzalloc(1, GFP_KERNEL); kfree(s); local_irq_disable(); char c = *s; /* KASAN report here leads to spin_lock() */ local_irq_enable(); Make report_spinlock a raw spinlock to prevent rescheduling when PREEMPT_RT is enabled. Link: https://lkml.kernel.org/r/20241119210234.1602529-1-jkangas@redhat.com Fixes: 342a93247e08 ("locking/spinlock: Provide RT variant header: <linux/spinlock_rt.h>") Signed-off-by: Jared Kangas <jkangas@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marco Elver <elver@google.com> Date: Tue Oct 1 16:00:45 2024 +0200 kcsan: Turn report_filterlist_lock into a raw_spinlock [ Upstream commit 59458fa4ddb47e7891c61b4a928d13d5f5b00aa0 ] Ran Xiaokai reports that with a KCSAN-enabled PREEMPT_RT kernel, we can see splats like: | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 | preempt_count: 10002, expected: 0 | RCU nest depth: 0, expected: 0 | no locks held by swapper/1/0. | irq event stamp: 156674 | hardirqs last enabled at (156673): [<ffffffff81130bd9>] do_idle+0x1f9/0x240 | hardirqs last disabled at (156674): [<ffffffff82254f84>] sysvec_apic_timer_interrupt+0x14/0xc0 | softirqs last enabled at (0): [<ffffffff81099f47>] copy_process+0xfc7/0x4b60 | softirqs last disabled at (0): [<0000000000000000>] 0x0 | Preemption disabled at: | [<ffffffff814a3e2a>] paint_ptr+0x2a/0x90 | CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.11.0+ #3 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 | Call Trace: | <IRQ> | dump_stack_lvl+0x7e/0xc0 | dump_stack+0x1d/0x30 | __might_resched+0x1a2/0x270 | rt_spin_lock+0x68/0x170 | kcsan_skip_report_debugfs+0x43/0xe0 | print_report+0xb5/0x590 | kcsan_report_known_origin+0x1b1/0x1d0 | kcsan_setup_watchpoint+0x348/0x650 | __tsan_unaligned_write1+0x16d/0x1d0 | hrtimer_interrupt+0x3d6/0x430 | __sysvec_apic_timer_interrupt+0xe8/0x3a0 | sysvec_apic_timer_interrupt+0x97/0xc0 | </IRQ> On a detected data race, KCSAN's reporting logic checks if it should filter the report. That list is protected by the report_filterlist_lock *non-raw* spinlock which may sleep on RT kernels. Since KCSAN may report data races in any context, convert it to a raw_spinlock. This requires being careful about when to allocate memory for the filter list itself which can be done via KCSAN's debugfs interface. Concurrent modification of the filter list via debugfs should be rare: the chosen strategy is to optimistically pre-allocate memory before the critical section and discard if unused. Link: https://lore.kernel.org/all/20240925143154.2322926-1-ranxiaokai627@163.com/ Reported-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Tested-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qianqiang Liu <qianqiang.liu@163.com> Date: Mon Oct 21 22:58:01 2024 +0200 KMSAN: uninit-value in inode_go_dump (5) [ Upstream commit f9417fcfca3c5e30a0b961e7250fab92cfa5d123 ] When mounting of a corrupted disk image fails, the error message printed can reference uninitialized inode fields. To prevent that from happening, always initialize those fields. Reported-by: syzbot+aa0730b0a42646eb1359@syzkaller.appspotmail.com Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Brown <broonie@kernel.org> Date: Mon Nov 11 16:18:55 2024 +0000 kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() [ Upstream commit 27141b690547da5650a420f26ec369ba142a9ebb ] The PAC exec_sign_all() test spawns some child processes, creating pipes to be stdin and stdout for the child. It cleans up most of the file descriptors that are created as part of this but neglects to clean up the parent end of the child stdin and stdout. Add the missing close() calls. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37e44@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Brown <broonie@kernel.org> Date: Wed Oct 23 00:20:45 2024 +0100 kselftest/arm64: Log fp-stress child startup errors to stdout [ Upstream commit dca93d29845dfed60910ba13dbfb6ae6a0e19f6d ] Currently if we encounter an error between fork() and exec() of a child process we log the error to stderr. This means that the errors don't get annotated with the child information which makes diagnostics harder and means that if we miss the exit signal from the child we can deadlock waiting for output from the child. Improve robustness and output quality by logging to stdout instead. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241023-arm64-fp-stress-exec-fail-v1-1-ee3c62932c15@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jordy Zomer <jordyzomer@google.com> Date: Thu Nov 28 09:32:45 2024 +0900 ksmbd: fix Out-of-Bounds Read in ksmbd_vfs_stream_read commit fc342cf86e2dc4d2edb0fc2ff5e28b6c7845adb9 upstream. An offset from client could be a negative value, It could lead to an out-of-bounds read from the stream_buf. Note that this issue is coming when setting 'vfs objects = streams_xattr parameter' in ksmbd.conf. Cc: stable@vger.kernel.org # v5.15+ Reported-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jordy Zomer <jordyzomer@google.com> Date: Thu Nov 28 09:33:25 2024 +0900 ksmbd: fix Out-of-Bounds Write in ksmbd_vfs_stream_write commit 313dab082289e460391c82d855430ec8a28ddf81 upstream. An offset from client could be a negative value, It could allows to write data outside the bounds of the allocated buffer. Note that this issue is coming when setting 'vfs objects = streams_xattr parameter' in ksmbd.conf. Cc: stable@vger.kernel.org # v5.15+ Reported-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nikolay Kuratov <kniv@yandex-team.ru> Date: Sun Dec 8 11:38:30 2024 +0300 KVM: x86/mmu: Ensure that kvm_release_pfn_clean() takes exact pfn from kvm_faultin_pfn() Since 5.16 and prior to 6.13 KVM can't be used with FSDAX guest memory (PMD pages). To reproduce the issue you need to reserve guest memory with `memmap=` cmdline, create and mount FS in DAX mode (tested both XFS and ext4), see doc link below. ndctl command for test: ndctl create-namespace -v -e namespace1.0 --map=dev --mode=fsdax -a 2M Then pass memory object to qemu like: -m 8G -object memory-backend-file,id=ram0,size=8G,\ mem-path=/mnt/pmem/guestmem,share=on,prealloc=on,dump=off,align=2097152 \ -numa node,memdev=ram0,cpus=0-1 QEMU fails to run guest with error: kvm run failed Bad address and there are two warnings in dmesg: WARN_ON_ONCE(!page_count(page)) in kvm_is_zone_device_page() and WARN_ON_ONCE(folio_ref_count(folio) <= 0) in try_grab_folio() (v6.6.63) It looks like in the past assumption was made that pfn won't change from faultin_pfn() to release_pfn_clean(), e.g. see commit 4cd071d13c5c ("KVM: x86/mmu: Move calls to thp_adjust() down a level") But kvm_page_fault structure made pfn part of mutable state, so now release_pfn_clean() can take hugepage-adjusted pfn. And it works for all cases (/dev/shm, hugetlb, devdax) except fsdax. Apparently in fsdax mode faultin-pfn and adjusted-pfn may refer to different folios, so we're getting get_page/put_page imbalance. To solve this preserve faultin pfn in separate local variable and pass it in kvm_release_pfn_clean(). Patch tested for all mentioned guest memory backends with tdp_mmu={0,1}. No bug in upstream as it was solved fundamentally by commit 8dd861cc07e2 ("KVM: x86/mmu: Put refcounted pages instead of blindly releasing pfns") and related patch series. Link: https://nvdimm.docs.kernel.org/2mib_fs_dax.html Fixes: 2f6305dd5676 ("KVM: MMU: change kvm_tdp_mmu_map() arguments to kvm_page_fault") Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mukesh Ojha <quic_mojha@quicinc.com> Date: Sun Nov 3 21:35:27 2024 +0530 leds: class: Protect brightness_show() with led_cdev->led_access mutex [ Upstream commit 4ca7cd938725a4050dcd62ae9472e931d603118d ] There is NULL pointer issue observed if from Process A where hid device being added which results in adding a led_cdev addition and later a another call to access of led_cdev attribute from Process B can result in NULL pointer issue. Use mutex led_cdev->led_access to protect access to led->cdev and its attribute inside brightness_show() and max_brightness_show() and also update the comment for mutex that it should be used to protect the led class device fields. Process A Process B kthread+0x114 worker_thread+0x244 process_scheduled_works+0x248 uhid_device_add_worker+0x24 hid_add_device+0x120 device_add+0x268 bus_probe_device+0x94 device_initial_probe+0x14 __device_attach+0xfc bus_for_each_drv+0x10c __device_attach_driver+0x14c driver_probe_device+0x3c __driver_probe_device+0xa0 really_probe+0x190 hid_device_probe+0x130 ps_probe+0x990 ps_led_register+0x94 devm_led_classdev_register_ext+0x58 led_classdev_register_ext+0x1f8 device_create_with_groups+0x48 device_create_groups_vargs+0xc8 device_add+0x244 kobject_uevent+0x14 kobject_uevent_env[jt]+0x224 mutex_unlock[jt]+0xc4 __mutex_unlock_slowpath+0xd4 wake_up_q+0x70 try_to_wake_up[jt]+0x48c preempt_schedule_common+0x28 __schedule+0x628 __switch_to+0x174 el0t_64_sync+0x1a8/0x1ac el0t_64_sync_handler+0x68/0xbc el0_svc+0x38/0x68 do_el0_svc+0x1c/0x28 el0_svc_common+0x80/0xe0 invoke_syscall+0x58/0x114 __arm64_sys_read+0x1c/0x2c ksys_read+0x78/0xe8 vfs_read+0x1e0/0x2c8 kernfs_fop_read_iter+0x68/0x1b4 seq_read_iter+0x158/0x4ec kernfs_seq_show+0x44/0x54 sysfs_kf_seq_show+0xb4/0x130 dev_attr_show+0x38/0x74 brightness_show+0x20/0x4c dualshock4_led_get_brightness+0xc/0x74 [ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 [ 3313.874301][ T4013] Mem abort info: [ 3313.874303][ T4013] ESR = 0x0000000096000006 [ 3313.874305][ T4013] EC = 0x25: DABT (current EL), IL = 32 bits [ 3313.874307][ T4013] SET = 0, FnV = 0 [ 3313.874309][ T4013] EA = 0, S1PTW = 0 [ 3313.874311][ T4013] FSC = 0x06: level 2 translation fault [ 3313.874313][ T4013] Data abort info: [ 3313.874314][ T4013] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 [ 3313.874316][ T4013] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 3313.874318][ T4013] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000 .. [ 3313.874332][ T4013] Dumping ftrace buffer: [ 3313.874334][ T4013] (ftrace buffer empty) .. .. [ dd3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader [ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74 [ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60 [ 3313.874656][ T4013] sp : ffffffc0b910bbd0 .. .. [ 3313.874685][ T4013] Call trace: [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 [ 3313.874690][ T4013] brightness_show+0x20/0x4c [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 [ 3313.874711][ T4013] ksys_read+0x78/0xe8 [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 [ 3313.874727][ T4013] el0_svc+0x38/0x68 [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> Reviewed-by: Anish Kumar <yesanishhere@gmail.com> Link: https://lore.kernel.org/r/20241103160527.82487-1-quic_mojha@quicinc.com Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kees Cook <kees@kernel.org> Date: Sun Nov 17 03:38:13 2024 -0800 lib: stackinit: hide never-taken branch from compiler commit 5c3793604f91123bf49bc792ce697a0bef4c173c upstream. The never-taken branch leads to an invalid bounds condition, which is by design. To avoid the unwanted warning from the compiler, hide the variable from the optimizer. ../lib/stackinit_kunit.c: In function 'do_nothing_u16_zero': ../lib/stackinit_kunit.c:51:49: error: array subscript 1 is outside array bounds of 'u16[0]' {aka 'short unsigned int[]'} [-Werror=array-bounds=] 51 | #define DO_NOTHING_RETURN_SCALAR(ptr) *(ptr) | ^~~~~~ ../lib/stackinit_kunit.c:219:24: note: in expansion of macro 'DO_NOTHING_RETURN_SCALAR' 219 | return DO_NOTHING_RETURN_ ## which(ptr + 1); \ | ^~~~~~~~~~~~~~~~~~ Link: https://lkml.kernel.org/r/20241117113813.work.735-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Date: Sat Dec 14 20:04:17 2024 +0100 Linux 6.12.5 Link: https://lore.kernel.org/r/20241212144306.641051666@linuxfoundation.org Tested-by: Luna Jernberg <droidbittin@gmail.com> Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Christian Heusel <christian@heusel.eu> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: SeongJae Park <sj@kernel.org> Link: https://lore.kernel.org/r/20241213145925.077514874@linuxfoundation.org Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Luna Jernberg <droidbittin@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Willy Tarreau <w@1wt.eu> Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com> Tested-by: kernelci.org bot <bot@kernelci.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Bibo Mao <maobibo@loongson.cn> Date: Mon Dec 2 16:42:08 2024 +0800 LoongArch: Add architecture specific huge_pte_clear() commit 7cd1f5f77925ae905a57296932f0f9ef0dc364f8 upstream. When executing mm selftests run_vmtests.sh, there is such an error: BUG: Bad page state in process uffd-unit-tests pfn:00000 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x0 flags: 0xffff0000002000(reserved|node=0|zone=0|lastcpupid=0xffff) raw: 00ffff0000002000 ffffbf0000000008 ffffbf0000000008 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set Modules linked in: snd_seq_dummy snd_seq snd_seq_device rfkill vfat fat virtio_balloon efi_pstore virtio_net pstore net_failover failover fuse nfnetlink virtio_scsi virtio_gpu virtio_dma_buf dm_multipath efivarfs CPU: 2 UID: 0 PID: 1913 Comm: uffd-unit-tests Not tainted 6.12.0 #184 Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 Stack : 900000047c8ac000 0000000000000000 9000000000223a7c 900000047c8ac000 900000047c8af690 900000047c8af698 0000000000000000 900000047c8af7d8 900000047c8af7d0 900000047c8af7d0 900000047c8af5b0 0000000000000001 0000000000000001 900000047c8af698 10b3c7d53da40d26 0000010000000000 0000000000000022 0000000fffffffff fffffffffe000000 ffff800000000000 000000000000002f 0000800000000000 000000017a6d4000 90000000028f8940 0000000000000000 0000000000000000 90000000025aa5e0 9000000002905000 0000000000000000 90000000028f8940 ffff800000000000 0000000000000000 0000000000000000 0000000000000000 9000000000223a94 000000012001839c 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d ... Call Trace: [<9000000000223a94>] show_stack+0x5c/0x180 [<9000000001c3fd64>] dump_stack_lvl+0x6c/0xa0 [<900000000056aa08>] bad_page+0x1a0/0x1f0 [<9000000000574978>] free_unref_folios+0xbf0/0xd20 [<90000000004e65cc>] folios_put_refs+0x1a4/0x2b8 [<9000000000599a0c>] free_pages_and_swap_cache+0x164/0x260 [<9000000000547698>] tlb_batch_pages_flush+0xa8/0x1c0 [<9000000000547f30>] tlb_finish_mmu+0xa8/0x218 [<9000000000543cb8>] exit_mmap+0x1a0/0x360 [<9000000000247658>] __mmput+0x78/0x200 [<900000000025583c>] do_exit+0x43c/0xde8 [<9000000000256490>] do_group_exit+0x68/0x110 [<9000000000256554>] sys_exit_group+0x1c/0x20 [<9000000001c413b4>] do_syscall+0x94/0x130 [<90000000002216d8>] handle_syscall+0xb8/0x158 Disabling lock debugging due to kernel taint BUG: non-zero pgtables_bytes on freeing mm: -16384 On LoongArch system, invalid huge pte entry should be invalid_pte_table or a single _PAGE_HUGE bit rather than a zero value. And it should be the same with invalid pmd entry, since pmd_none() is called by function free_pgd_range() and pmd_none() return 0 by huge_pte_clear(). So single _PAGE_HUGE bit is also treated as a valid pte table and free_pte_range() will be called in free_pmd_range(). free_pmd_range() pmd = pmd_offset(pud, addr); do { next = pmd_addr_end(addr, end); if (pmd_none_or_clear_bad(pmd)) continue; free_pte_range(tlb, pmd, addr); } while (pmd++, addr = next, addr != end); Here invalid_pte_table is used for both invalid huge pte entry and pmd entry. Cc: stable@vger.kernel.org Fixes: 09cfefb7fa70 ("LoongArch: Add memory management") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Huacai Chen <chenhuacai@kernel.org> Date: Fri Nov 22 15:47:48 2024 +0800 LoongArch: Fix sleeping in atomic context for PREEMPT_RT [ Upstream commit 88fd2b70120d52c1010257d36776876941375490 ] Commit bab1c299f3945ffe79 ("LoongArch: Fix sleeping in atomic context in setup_tlb_handler()") changes the gfp flag from GFP_KERNEL to GFP_ATOMIC for alloc_pages_node(). However, for PREEMPT_RT kernels we can still get a "sleeping in atomic context" error: [ 0.372259] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 [ 0.372266] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 [ 0.372268] preempt_count: 1, expected: 0 [ 0.372270] RCU nest depth: 1, expected: 1 [ 0.372272] 3 locks held by swapper/1/0: [ 0.372274] #0: 900000000c9f5e60 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x524/0x1c60 [ 0.372294] #1: 90000000087013b8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x50/0x140 [ 0.372305] #2: 900000047fffd388 (&zone->lock){+.+.}-{3:3}, at: __rmqueue_pcplist+0x30c/0xea0 [ 0.372314] irq event stamp: 0 [ 0.372316] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 0.372322] hardirqs last disabled at (0): [<9000000005947320>] copy_process+0x9c0/0x26e0 [ 0.372329] softirqs last enabled at (0): [<9000000005947320>] copy_process+0x9c0/0x26e0 [ 0.372335] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 0.372341] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7+ #1891 [ 0.372346] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 [ 0.372349] Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 9000000100388000 [ 0.372486] 900000010038b890 0000000000000000 900000010038b898 9000000007e53788 [ 0.372492] 900000000815bcc8 900000000815bcc0 900000010038b700 0000000000000001 [ 0.372498] 0000000000000001 4b031894b9d6b725 00000000055ec000 9000000100338fc0 [ 0.372503] 00000000000000c4 0000000000000001 000000000000002d 0000000000000003 [ 0.372509] 0000000000000030 0000000000000003 00000000055ec000 0000000000000003 [ 0.372515] 900000000806d000 9000000007e53788 00000000000000b0 0000000000000004 [ 0.372521] 0000000000000000 0000000000000000 900000000c9f5f10 0000000000000000 [ 0.372526] 90000000076f12d8 9000000007e53788 9000000005924778 0000000000000000 [ 0.372532] 00000000000000b0 0000000000000004 0000000000000000 0000000000070000 [ 0.372537] ... [ 0.372540] Call Trace: [ 0.372542] [<9000000005924778>] show_stack+0x38/0x180 [ 0.372548] [<90000000071519c4>] dump_stack_lvl+0x94/0xe4 [ 0.372555] [<900000000599b880>] __might_resched+0x1a0/0x260 [ 0.372561] [<90000000071675cc>] rt_spin_lock+0x4c/0x140 [ 0.372565] [<9000000005cbb768>] __rmqueue_pcplist+0x308/0xea0 [ 0.372570] [<9000000005cbed84>] get_page_from_freelist+0x564/0x1c60 [ 0.372575] [<9000000005cc0d98>] __alloc_pages_noprof+0x218/0x1820 [ 0.372580] [<900000000593b36c>] tlb_init+0x1ac/0x298 [ 0.372585] [<9000000005924b74>] per_cpu_trap_init+0x114/0x140 [ 0.372589] [<9000000005921964>] cpu_probe+0x4e4/0xa60 [ 0.372592] [<9000000005934874>] start_secondary+0x34/0xc0 [ 0.372599] [<900000000715615c>] smpboot_entry+0x64/0x6c This is because in PREEMPT_RT kernels normal spinlocks are replaced by rt spinlocks and rt_spin_lock() will cause sleeping. Fix it by disabling NUMA optimization completely for PREEMPT_RT kernels. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Huacai Chen <chenhuacai@kernel.org> Date: Mon Dec 2 16:42:10 2024 +0800 LoongArch: KVM: Protect kvm_check_requests() with SRCU commit 589e6cc7597655bed7b8543b8286925f631f597c upstream. When we enable lockdep we get such a warning: ============================= WARNING: suspicious RCU usage 6.12.0-rc7+ #1891 Tainted: G W ----------------------------- include/linux/kvm_host.h:1043 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by qemu-system-loo/948: #0: 90000001184a00a8 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0xf4/0xe20 [kvm] stack backtrace: CPU: 0 UID: 0 PID: 948 Comm: qemu-system-loo Tainted: G W 6.12.0-rc7+ #1891 Tainted: [W]=WARN Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.0-prebeta9 10/21/2022 Stack : 0000000000000089 9000000005a0db9c 90000000071519c8 900000012c578000 900000012c57b920 0000000000000000 900000012c57b928 9000000007e53788 900000000815bcc8 900000000815bcc0 900000012c57b790 0000000000000001 0000000000000001 4b031894b9d6b725 0000000004dec000 90000001003299c0 0000000000000414 0000000000000001 000000000000002d 0000000000000003 0000000000000030 00000000000003b4 0000000004dec000 90000001184a0000 900000000806d000 9000000007e53788 00000000000000b4 0000000000000004 0000000000000004 0000000000000000 0000000000000000 9000000107baf600 9000000008916000 9000000007e53788 9000000005924778 0000000010000044 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d ... Call Trace: [<9000000005924778>] show_stack+0x38/0x180 [<90000000071519c4>] dump_stack_lvl+0x94/0xe4 [<90000000059eb754>] lockdep_rcu_suspicious+0x194/0x240 [<ffff8000022143bc>] kvm_gfn_to_hva_cache_init+0xfc/0x120 [kvm] [<ffff80000222ade4>] kvm_pre_enter_guest+0x3a4/0x520 [kvm] [<ffff80000222b3dc>] kvm_handle_exit+0x23c/0x480 [kvm] Fix it by protecting kvm_check_requests() with SRCU. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Adam Young <admiyo@os.amperecomputing.com> Date: Wed Nov 20 14:02:14 2024 -0500 mailbox: pcc: Check before sending MCTP PCC response ACK [ Upstream commit 7f9e19f207be0c534d517d65e01417ba968cdd34 ] Type 4 PCC channels have an option to send back a response to the platform when they are done processing the request. The flag to indicate whether or not to respond is inside the message body, and thus is not available to the pcc mailbox. If the flag is not set, still set command completion bit after processing message. In order to read the flag, this patch maps the shared buffer to virtual memory. To avoid duplication of mapping the shared buffer is then made available to be used by the driver that uses the mailbox. Signed-off-by: Adam Young <admiyo@os.amperecomputing.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rohan Barar <rohan.barar@gmail.com> Date: Thu Oct 3 19:40:40 2024 +1000 media: cx231xx: Add support for Dexatek USB Video Grabber 1d19:6108 [ Upstream commit 61a830bc0ea69a05d8a4534f825c6aa618263649 ] Add Dexatek Technology Ltd USB Video Grabber 1d19:6108 to the cx231xx driver. This device is sold under the name "BAUHN DVD Maker (DK8723)" by ALDI in Australia. This device is similar to 1d19:6109, which is already included in cx231xx. Both video and audio capture function correctly after installing the patched cx231xx driver. Patch Changelog v1: - Initial submission. v2: - Fix SoB + Improve subject. v3: - Rephrase message to not exceed 75 characters per line. - Removed reference to external GitHub URL. Signed-off-by: Rohan Barar <rohan.barar@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bingbu Cao <bingbu.cao@intel.com> Date: Wed Oct 16 15:53:02 2024 +0800 media: ipu6: use the IPU6 DMA mapping APIs to do mapping commit 1d4a000289979cc7f2887c8407b1bfe2a0918354 upstream. dma_ops is removed from the IPU6 auxiliary device, ISYS driver should use the IPU6 DMA mapping APIs directly instead of depending on the device callbacks. ISYS driver switch from the videobuf2 DMA contig memory allocator to scatter/gather memory allocator. Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> [Sakari Ailus: Rebased on recent videobuf2 wait changes.] Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Given <dg@cowlark.com> Date: Wed Sep 18 20:05:40 2024 +0200 media: uvcvideo: Add a quirk for the Kaiweets KTI-W02 infrared camera [ Upstream commit b2ec92bb5605452d539a7aa1e42345b95acd8583 ] Adds a quirk to make the NXP Semiconductors 1fc9:009b chipset work. lsusb for the device reports: Bus 003 Device 011: ID 1fc9:009b NXP Semiconductors IR VIDEO Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 239 Miscellaneous Device bDeviceSubClass 2 [unknown] bDeviceProtocol 1 Interface Association bMaxPacketSize0 64 idVendor 0x1fc9 NXP Semiconductors idProduct 0x009b IR VIDEO bcdDevice 1.01 iManufacturer 1 Guide sensmart iProduct 2 IR VIDEO iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x00c2 bNumInterfaces 2 bConfigurationValue 1 iConfiguration 0 bmAttributes 0xc0 Self Powered MaxPower 100mA Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 0 bInterfaceCount 2 bFunctionClass 14 Video bFunctionSubClass 3 Video Interface Collection bFunctionProtocol 0 iFunction 3 IR Camera Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 14 Video bInterfaceSubClass 1 Video Control bInterfaceProtocol 0 iInterface 0 VideoControl Interface Descriptor: bLength 13 bDescriptorType 36 bDescriptorSubtype 1 (HEADER) bcdUVC 1.00 wTotalLength 0x0033 dwClockFrequency 6.000000MHz bInCollection 1 baInterfaceNr( 0) 1 VideoControl Interface Descriptor: bLength 18 bDescriptorType 36 bDescriptorSubtype 2 (INPUT_TERMINAL) bTerminalID 1 wTerminalType 0x0201 Camera Sensor bAssocTerminal 0 iTerminal 0 wObjectiveFocalLengthMin 0 wObjectiveFocalLengthMax 0 wOcularFocalLength 0 bControlSize 3 bmControls 0x00000000 VideoControl Interface Descriptor: bLength 9 bDescriptorType 36 bDescriptorSubtype 3 (OUTPUT_TERMINAL) bTerminalID 2 wTerminalType 0x0101 USB Streaming bAssocTerminal 0 bSourceID 1 iTerminal 0 VideoControl Interface Descriptor: bLength 11 bDescriptorType 36 bDescriptorSubtype 5 (PROCESSING_UNIT) Warning: Descriptor too short bUnitID 3 bSourceID 1 wMaxMultiplier 0 bControlSize 2 bmControls 0x00000000 iProcessing 0 bmVideoStandards 0x62 NTSC - 525/60 PAL - 525/60 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 1 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 0 bInterfaceClass 14 Video bInterfaceSubClass 2 Video Streaming bInterfaceProtocol 0 iInterface 0 VideoStreaming Interface Descriptor: bLength 14 bDescriptorType 36 bDescriptorSubtype 1 (INPUT_HEADER) bNumFormats 1 wTotalLength 0x0055 bEndpointAddress 0x82 EP 2 IN bmInfo 0 bTerminalLink 2 bStillCaptureMethod 2 bTriggerSupport 0 bTriggerUsage 0 bControlSize 1 bmaControls( 0) 0 VideoStreaming Interface Descriptor: bLength 27 bDescriptorType 36 bDescriptorSubtype 4 (FORMAT_UNCOMPRESSED) bFormatIndex 1 bNumFrameDescriptors 1 guidFormat {e436eb7b-524f-11ce-9f53-0020af0ba770} bBitsPerPixel 16 bDefaultFrameIndex 1 bAspectRatioX 0 bAspectRatioY 0 bmInterlaceFlags 0x00 Interlaced stream or variable: No Fields per frame: 2 fields Field 1 first: No Field pattern: Field 1 only bCopyProtect 0 VideoStreaming Interface Descriptor: bLength 34 bDescriptorType 36 bDescriptorSubtype 5 (FRAME_UNCOMPRESSED) bFrameIndex 1 bmCapabilities 0x00 Still image unsupported wWidth 240 wHeight 322 dwMinBitRate 12364800 dwMaxBitRate 30912000 dwMaxVideoFrameBufferSize 154560 dwDefaultFrameInterval 400000 bFrameIntervalType 2 dwFrameInterval( 0) 400000 dwFrameInterval( 1) 1000000 VideoStreaming Interface Descriptor: bLength 10 bDescriptorType 36 bDescriptorSubtype 3 (STILL_IMAGE_FRAME) bEndpointAddress 0x00 EP 0 OUT bNumImageSizePatterns 1 wWidth( 0) 240 wHeight( 0) 322 bNumCompressionPatterns 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 1 bNumEndpoints 1 bInterfaceClass 14 Video bInterfaceSubClass 2 Video Streaming bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 5 Transfer Type Isochronous Synch Type Asynchronous Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 1 Device Status: 0x0001 Self Powered Signed-off-by: David Given <dg@cowlark.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Link: https://lore.kernel.org/r/20240918180540.10830-2-dg@cowlark.com Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ricardo Ribalda <ribalda@chromium.org> Date: Tue Sep 24 13:33:29 2024 +0000 media: uvcvideo: Force UVC version to 1.0a for 0408:4033 [ Upstream commit c9df99302fff53b6007666136b9f43fbac7ee3d8 ] The Quanta ACER HD User Facing camera reports a UVC 1.50 version, but implements UVC 1.0a as shown by the UVC probe control being 26 bytes long. Force the UVC version for that device. Reported-by: Giuliano Lotta <giuliano.lotta@gmail.com> Closes: https://lore.kernel.org/linux-media/fce4f906-d69b-417d-9f13-bf69fe5c81e3@koyu.space/ Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://lore.kernel.org/r/20240924-uvc-quanta-v1-1-2de023863767@chromium.org Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Perchanov <dmitry.perchanov@intel.com> Date: Mon Aug 26 17:27:50 2024 +0300 media: uvcvideo: RealSense D421 Depth module metadata [ Upstream commit c6104297c965a5ee9d4b9d0d5d9cdd224d8fd59e ] RealSense(R) D421 Depth module is low cost solution for 3D-stereo vision. The module supports extended sensor metadata format D4XX. Signed-off-by: Dmitry Perchanov <dmitry.perchanov@intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://lore.kernel.org/r/d1fbfbbff5c8247a3130499985a53218c5b55c61.camel@intel.com Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mike Rapoport (Microsoft) <rppt@kernel.org> Date: Fri Nov 29 11:13:47 2024 +0200 memblock: allow zero threshold in validate_numa_converage() commit 9cdc6423acb49055efb444ecd895d853a70ef931 upstream. Currently memblock validate_numa_converage() returns false negative when threshold set to zero. Make the check if the memory size with invalid node ID is greater than the threshold exclusive to fix that. Link: https://lore.kernel.org/all/Z0mIDBD4KLyxyOCm@kernel.org/ Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Xi Ruoyao <xry111@xry111.site> Date: Sat Nov 23 11:57:37 2024 +0800 MIPS: Loongson64: DTS: Really fix PCIe port nodes for ls7a [ Upstream commit 4fbd66d8254cedfd1218393f39d83b6c07a01917 ] Fix the dtc warnings: arch/mips/boot/dts/loongson/ls7a-pch.dtsi:68.16-416.5: Warning (interrupt_provider): /bus@10000000/pci@1a000000: '#interrupt-cells' found, but node is not an interrupt provider arch/mips/boot/dts/loongson/ls7a-pch.dtsi:68.16-416.5: Warning (interrupt_provider): /bus@10000000/pci@1a000000: '#interrupt-cells' found, but node is not an interrupt provider arch/mips/boot/dts/loongson/loongson64g_4core_ls7a.dtb: Warning (interrupt_map): Failed prerequisite 'interrupt_provider' And a runtime warning introduced in commit 045b14ca5c36 ("of: WARN on deprecated #address-cells/#size-cells handling"): WARNING: CPU: 0 PID: 1 at drivers/of/base.c:106 of_bus_n_addr_cells+0x9c/0xe0 Missing '#address-cells' in /bus@10000000/pci@1a000000/pci_bridge@9,0 The fix is similar to commit d89a415ff8d5 ("MIPS: Loongson64: DTS: Fix PCIe port nodes for ls7a"), which has fixed the issue for ls2k (despite its subject mentions ls7a). Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Parker Newman <pnewman@connecttech.com> Date: Wed Oct 2 11:12:33 2024 -0400 misc: eeprom: eeprom_93cx6: Add quirk for extra read clock cycle [ Upstream commit 7738a7ab9d12c5371ed97114ee2132d4512e9fd5 ] Add a quirk similar to eeprom_93xx46 to add an extra clock cycle before reading data from the EEPROM. The 93Cx6 family of EEPROMs output a "dummy 0 bit" between the writing of the op-code/address from the host to the EEPROM and the reading of the actual data from the EEPROM. More info can be found on page 6 of the AT93C46 datasheet (linked below). Similar notes are found in other 93xx6 datasheets. In summary the read operation for a 93Cx6 EEPROM is: Write to EEPROM: 110[A5-A0] (9 bits) Read from EEPROM: 0[D15-D0] (17 bits) Where: 110 is the start bit and READ OpCode [A5-A0] is the address to read from 0 is a "dummy bit" preceding the actual data [D15-D0] is the actual data. Looking at the READ timing diagrams in the 93Cx6 datasheets the dummy bit should be clocked out on the last address bit clock cycle meaning it should be discarded naturally. However, depending on the hardware configuration sometimes this dummy bit is not discarded. This is the case with Exar PCI UARTs which require an extra clock cycle between sending the address and reading the data. Datasheet: https://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-5193-SEEPROM-AT93C46D-Datasheet.pdf Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Parker Newman <pnewman@connecttech.com> Link: https://lore.kernel.org/r/0f23973efefccd2544705a0480b4ad4c2353e407.1727880931.git.pnewman@connecttech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Date: Fri Oct 4 07:26:05 2024 +0200 mlxsw: spectrum_acl_flex_keys: Constify struct mlxsw_afk_element_inst [ Upstream commit bec2a32145d5cc066df29182fa0e5b0d4329b1a1 ] 'struct mlxsw_afk_element_inst' are not modified in these drivers. Constifying these structures moves some data to a read-only section, so increases overall security. Update a few functions and struct mlxsw_afk_block accordingly. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 4278 4032 0 8310 2076 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.o After: ===== text data bss dec hex filename 7934 352 0 8286 205e drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/8ccfc7bfb2365dcee5b03c81ebe061a927d6da2e.1727541677.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 217bbf156f93 ("mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ido Schimmel <idosch@nvidia.com> Date: Tue Dec 3 16:16:05 2024 +0100 mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4 [ Upstream commit 217bbf156f93ada86b91617489e7ba8a0904233c ] The driver is currently using an ACL key block that is not supported by Spectrum-4. This works because the driver is only using a single field from this key block which is located in the same offset in the equivalent Spectrum-4 key block. The issue was discovered when the firmware started rejecting the use of the unsupported key block. The change has been reverted to avoid breaking users that only update their firmware. Nonetheless, fix the issue by using the correct key block. Fixes: 07ff135958dd ("mlxsw: Introduce flex key elements for Spectrum-4") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/35e72c97bdd3bc414fb8e4d747e5fb5d26c29658.1733237440.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akinobu Mita <akinobu.mita@gmail.com> Date: Fri Nov 15 10:20:23 2024 -0800 mm/damon: fix order of arguments in damos_before_apply tracepoint commit 6535b8669c1a74078098517174e53fc907ce9d56 upstream. Since the order of the scheme_idx and target_idx arguments in TP_ARGS is reversed, they are stored in the trace record in reverse. Link: https://lkml.kernel.org/r/20241115182023.43118-1-sj@kernel.org Link: https://patch.msgid.link/20241112154828.40307-1-akinobu.mita@gmail.com Fixes: c603c630b509 ("mm/damon/core: add a tracepoint for damos apply target regions") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: John Hubbard <jhubbard@nvidia.com> Date: Wed Nov 20 19:49:33 2024 -0800 mm/gup: handle NULL pages in unpin_user_pages() commit a1268be280d8e484ab3606d7476edd0f14bb9961 upstream. The recent addition of "pofs" (pages or folios) handling to gup has a flaw: it assumes that unpin_user_pages() handles NULL pages in the pages** array. That's not the case, as I discovered when I ran on a new configuration on my test machine. Fix this by skipping NULL pages in unpin_user_pages(), just like unpin_folios() already does. Details: when booting on x86 with "numa=fake=2 movablecore=4G" on Linux 6.12, and running this: tools/testing/selftests/mm/gup_longterm ...I get the following crash: BUG: kernel NULL pointer dereference, address: 0000000000000008 RIP: 0010:sanity_check_pinned_pages+0x3a/0x2d0 ... Call Trace: <TASK> ? __die_body+0x66/0xb0 ? page_fault_oops+0x30c/0x3b0 ? do_user_addr_fault+0x6c3/0x720 ? irqentry_enter+0x34/0x60 ? exc_page_fault+0x68/0x100 ? asm_exc_page_fault+0x22/0x30 ? sanity_check_pinned_pages+0x3a/0x2d0 unpin_user_pages+0x24/0xe0 check_and_migrate_movable_pages_or_folios+0x455/0x4b0 __gup_longterm_locked+0x3bf/0x820 ? mmap_read_lock_killable+0x12/0x50 ? __pfx_mmap_read_lock_killable+0x10/0x10 pin_user_pages+0x66/0xa0 gup_test_ioctl+0x358/0xb20 __se_sys_ioctl+0x6b/0xc0 do_syscall_64+0x7b/0x150 entry_SYSCALL_64_after_hwframe+0x76/0x7e Link: https://lkml.kernel.org/r/20241121034933.77502-1-jhubbard@nvidia.com Fixes: 94efde1d1539 ("mm/gup: avoid an unnecessary allocation call for FOLL_LONGTERM cases") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Junxiao Chang <junxiao.chang@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Hildenbrand <david@redhat.com> Date: Wed Nov 20 21:11:51 2024 +0100 mm/mempolicy: fix migrate_to_node() assuming there is at least one VMA in a MM commit 091c1dd2d4df6edd1beebe0e5863d4034ade9572 upstream. We currently assume that there is at least one VMA in a MM, which isn't true. So we might end up having find_vma() return NULL, to then de-reference NULL. So properly handle find_vma() returning NULL. This fixes the report: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline] RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194 Code: ... RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000 RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044 RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1 R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003 R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8 FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709 __do_sys_migrate_pages mm/mempolicy.c:1727 [inline] __se_sys_migrate_pages mm/mempolicy.c:1723 [inline] __x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f [akpm@linux-foundation.org: add unlikely()] Link: https://lkml.kernel.org/r/20241120201151.9518-1-david@redhat.com Fixes: 39743889aaf7 ("[PATCH] Swap Migration V5: sys_migrate_pages interface") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: syzbot+3511625422f7aa637f0d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/673d2696.050a0220.3c9d61.012f.GAE@google.com/T/ Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Andrii Nakryiko <andrii@kernel.org> Date: Mon Nov 25 16:52:06 2024 -0800 mm: fix vrealloc()'s KASAN poisoning logic commit d699440f58ce9bd71103cc7b692e3ab76a20bfcd upstream. When vrealloc() reuses already allocated vmap_area, we need to re-annotate poisoned and unpoisoned portions of underlying memory according to the new size. This results in a KASAN splat recorded at [1]. A KASAN mis-reporting issue where there is none. Note, hard-coding KASAN_VMALLOC_PROT_NORMAL might not be exactly correct, but KASAN flag logic is pretty involved and spread out throughout __vmalloc_node_range_noprof(), so I'm using the bare minimum flag here and leaving the rest to mm people to refactor this logic and reuse it here. Link: https://lkml.kernel.org/r/20241126005206.3457974-1-andrii@kernel.org Link: https://lore.kernel.org/bpf/67450f9b.050a0220.21d33d.0004.GAE@google.com/ [1] Fixes: 3ddc2fefe6f3 ("mm: vmalloc: implement vrealloc()") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: John Sperbeck <jsperbeck@google.com> Date: Thu Nov 28 12:39:59 2024 -0800 mm: memcg: declare do_memsw_account inline commit 89dd878282881306c38f7e354e7614fca98cb9a6 upstream. In commit 66d60c428b23 ("mm: memcg: move legacy memcg event code into memcontrol-v1.c"), the static do_memsw_account() function was moved from a .c file to a .h file. Unfortunately, the traditional inline keyword wasn't added. If a file (e.g., a unit test) includes the .h file, but doesn't refer to do_memsw_account(), it will get a warning like: mm/memcontrol-v1.h:41:13: warning: unused function 'do_memsw_account' [-Wunused-function] 41 | static bool do_memsw_account(void) | ^~~~~~~~~~~~~~~~ Link: https://lkml.kernel.org/r/20241128203959.726527-1-jsperbeck@google.com Fixes: 66d60c428b23 ("mm: memcg: move legacy memcg event code into memcontrol-v1.c") Signed-off-by: John Sperbeck <jsperbeck@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthew Wilcox (Oracle) <willy@infradead.org> Date: Mon Nov 25 20:17:19 2024 +0000 mm: open-code page_folio() in dump_page() commit 6a7de1bf218d75f27f68d6a3f5ae1eb7332b941e upstream. page_folio() calls page_fixed_fake_head() which will misidentify this page as being a fake head and load off the end of 'precise'. We may have a pointer to a fake head, but that's OK because it contains the right information for dump_page(). gcc-15 is smart enough to catch this with -Warray-bounds: In function 'page_fixed_fake_head', inlined from '_compound_head' at ../include/linux/page-flags.h:251:24, inlined from '__dump_page' at ../mm/debug.c:123:11: ../include/asm-generic/rwonce.h:44:26: warning: array subscript 9 is outside +array bounds of 'struct page[1]' [-Warray-bounds=] Link: https://lkml.kernel.org/r/20241125201721.2963278-2-willy@infradead.org Fixes: fae7d834c43c ("mm: add __dump_folio()") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Kees Cook <kees@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthew Wilcox (Oracle) <willy@infradead.org> Date: Mon Nov 25 20:17:18 2024 +0000 mm: open-code PageTail in folio_flags() and const_folio_flags() commit 4de22b2a6a7477d84d9a01eb6b62a9117309d722 upstream. It is unsafe to call PageTail() in dump_page() as page_is_fake_head() will almost certainly return true when called on a head page that is copied to the stack. That will cause the VM_BUG_ON_PGFLAGS() in const_folio_flags() to trigger when it shouldn't. Fortunately, we don't need to call PageTail() here; it's fine to have a pointer to a virtual alias of the page's flag word rather than the real page's flag word. Link: https://lkml.kernel.org/r/20241125201721.2963278-1-willy@infradead.org Fixes: fae7d834c43c ("mm: add __dump_folio()") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Kees Cook <kees@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kalesh Singh <kaleshsingh@google.com> Date: Mon Nov 18 13:46:48 2024 -0800 mm: respect mmap hint address when aligning for THP commit 249608ee47132cab3b1adacd9e463548f57bd316 upstream. Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries") updated __get_unmapped_area() to align the start address for the VMA to a PMD boundary if CONFIG_TRANSPARENT_HUGEPAGE=y. It does this by effectively looking up a region that is of size, request_size + PMD_SIZE, and aligning up the start to a PMD boundary. Commit 4ef9ad19e176 ("mm: huge_memory: don't force huge page alignment on 32 bit") opted out of this for 32bit due to regressions in mmap base randomization. Commit d4148aeab412 ("mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes") restricted this to only mmap sizes that are multiples of the PMD_SIZE due to reported regressions in some performance benchmarks -- which seemed mostly due to the reduced spatial locality of related mappings due to the forced PMD-alignment. Another unintended side effect has emerged: When a user specifies an mmap hint address, the THP alignment logic modifies the behavior, potentially ignoring the hint even if a sufficiently large gap exists at the requested hint location. Example Scenario: Consider the following simplified virtual address (VA) space: ... 0x200000-0x400000 --- VMA A 0x400000-0x600000 --- Hole 0x600000-0x800000 --- VMA B ... A call to mmap() with hint=0x400000 and len=0x200000 behaves differently: - Before THP alignment: The requested region (size 0x200000) fits into the gap at 0x400000, so the hint is respected. - After alignment: The logic searches for a region of size 0x400000 (len + PMD_SIZE) starting at 0x400000. This search fails due to the mapping at 0x600000 (VMA B), and the hint is ignored, falling back to arch_get_unmapped_area[_topdown](). In general the hint is effectively ignored, if there is any existing mapping in the below range: [mmap_hint + mmap_size, mmap_hint + mmap_size + PMD_SIZE) This changes the semantics of mmap hint; from ""Respect the hint if a sufficiently large gap exists at the requested location" to "Respect the hint only if an additional PMD-sized gap exists beyond the requested size". This has performance implications for allocators that allocate their heap using mmap but try to keep it "as contiguous as possible" by using the end of the exisiting heap as the address hint. With the new behavior it's more likely to get a much less contiguous heap, adding extra fragmentation and performance overhead. To restore the expected behavior; don't use thp_get_unmapped_area_vmflags() when the user provided a hint address, for anonymous mappings. Note: As Yang Shi pointed out: the issue still remains for filesystems which are using thp_get_unmapped_area() for their get_unmapped_area() op. It is unclear what worklaods will regress for if we ignore THP alignment when the hint address is provided for such file backed mappings -- so this fix will be handled separately. Link: https://lkml.kernel.org/r/20241118214650.3667577-1-kaleshsingh@google.com Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <yang@os.amperecomputing.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Hans Boehm <hboehm@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Keita Aihara <keita.aihara@sony.com> Date: Fri Sep 13 18:44:17 2024 +0900 mmc: core: Add SD card quirk for broken poweroff notification [ Upstream commit cd068d51594d9635bf6688fc78717572b78bce6a ] GIGASTONE Gaming Plus microSD cards manufactured on 02/2022 report that they support poweroff notification and cache, but they are not working correctly. Flush Cache bit never gets cleared in sd_flush_cache() and Poweroff Notification Ready bit also never gets set to 1 within 1 second from the end of busy of CMD49 in sd_poweroff_notify(). This leads to I/O error and runtime PM error state. I observed that the same card manufactured on 01/2024 works as expected. This problem seems similar to the Kingston cards fixed with commit c467c8f08185 ("mmc: Add MMC_QUIRK_BROKEN_SD_CACHE for Kingston Canvas Go Plus from 11/2019") and should be handled using quirks. CID for the problematic card is here. 12345641535443002000000145016200 Manufacturer ID is 0x12 and defined as CID_MANFID_GIGASTONE as of now, but would like comments on what naming is appropriate because MID list is not public and not sure it's right. Signed-off-by: Keita Aihara <keita.aihara@sony.com> Link: https://lore.kernel.org/r/20240913094417.GA4191647@sony.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Avri Altman <avri.altman@wdc.com> Date: Sun Oct 6 08:11:45 2024 +0300 mmc: core: Adjust ACMD22 to SDUC [ Upstream commit 449f34a34088d02457fa0f3216747e8a35bc03ae ] ACMD22 is used to verify the previously write operation. Normally, it returns the number of written sectors as u32. SDUC, however, returns it as u64. This is not a superfluous requirement, because SDUC writes may exceeds 2TB. For Linux mmc however, the previously write operation could not be more than the block layer limits, thus we make room for a u64 and cast the returning value to u32. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241006051148.160278-8-avri.altman@wdc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [Stephen Rothwell: Fix build error when moving to new rc from Linus's tree] Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Stable-dep-of: 869d37475788 ("mmc: core: Use GFP_NOIO in ACMD22") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ulf Hansson <ulf.hansson@linaro.org> Date: Mon Nov 25 13:24:46 2024 +0100 mmc: core: Further prevent card detect during shutdown commit 87a0d90fcd31c0f36da0332428c9e1a1e0f97432 upstream. Disabling card detect from the host's ->shutdown_pre() callback turned out to not be the complete solution. More precisely, beyond the point when the mmc_bus->shutdown() has been called, to gracefully power off the card, we need to prevent card detect. Otherwise the mmc_rescan work may poll for the card with a CMD13, to see if it's still alive, which then will fail and hang as the card has already been powered off. To fix this problem, let's disable mmc_rescan prior to power off the card during shutdown. Reported-by: Anthony Pighin <anthony.pighin@nokia.com> Fixes: 66c915d09b94 ("mmc: core: Disable card detect during shutdown") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Closes: https://lore.kernel.org/all/BN0PR08MB695133000AF116F04C3A9FFE83212@BN0PR08MB6951.namprd08.prod.outlook.com/ Tested-by: Anthony Pighin <anthony.pighin@nokia.com> Message-ID: <20241125122446.18684-1-ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Avri Altman <avri.altman@wdc.com> Date: Mon Oct 21 18:32:27 2024 +0300 mmc: core: Use GFP_NOIO in ACMD22 [ Upstream commit 869d37475788b0044bec1a33335e24abaf5e8884 ] While reviewing the SDUC series, Adrian made a comment concerning the memory allocation code in mmc_sd_num_wr_blocks() - see [1]. Prevent memory allocations from triggering I/O operations while ACMD22 is in progress. [1] https://lore.kernel.org/linux-mmc/3016fd71-885b-4ef9-97ed-46b4b0cb0e35@intel.com/ Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 051913dada04 ("mmc_block: do not DMA to stack") Signed-off-by: Avri Altman <avri.altman@wdc.com> Cc: stable@vger.kernel.org Message-ID: <20241021153227.493970-1-avri.altman@wdc.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rosen Penev <rosenp@gmail.com> Date: Mon Sep 30 15:49:19 2024 -0700 mmc: mtk-sd: fix devm_clk_get_optional usage [ Upstream commit ed299eda8fbb37cb0e05c7001ab6a6b2627ec087 ] This already returns NULL when not found. However, it can return EPROBE_DEFER and should thus return here. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://lore.kernel.org/r/20240930224919.355359-4-rosenp@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: 2508925fb346 ("mmc: mtk-sd: Fix MMC_CAP2_CRYPTO flag setting") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy-ld Lu <andy-ld.lu@mediatek.com> Date: Thu Nov 7 20:11:21 2024 +0800 mmc: mtk-sd: Fix error handle of probe function [ Upstream commit 291220451c775a054cedc4fab4578a1419eb6256 ] In the probe function, it goes to 'release_mem' label and returns after some procedure failure. But if the clocks (partial or all) have been enabled previously, they would not be disabled in msdc_runtime_suspend, since runtime PM is not yet enabled for this case. That cause mmc related clocks always on during system suspend and block suspend flow. Below log is from a SDCard issue of MT8196 chromebook, it returns -ETIMEOUT while polling clock stable in the msdc_ungate_clock() and probe failed, but the enabled clocks could not be disabled anyway. [ 129.059253] clk_chk_dev_pm_suspend() [ 129.350119] suspend warning: msdcpll is on [ 129.354494] [ck_msdc30_1_sel : enabled, 1, 1, 191999939, ck_msdcpll_d2] [ 129.362787] [ck_msdcpll_d2 : enabled, 1, 1, 191999939, msdcpll] [ 129.371041] [ck_msdc30_1_ck : enabled, 1, 1, 191999939, ck_msdc30_1_sel] [ 129.379295] [msdcpll : enabled, 1, 1, 383999878, clk26m] Add a new 'release_clk' label and reorder the error handle functions to make sure the clocks be disabled after probe failure. Fixes: ffaea6ebfe9c ("mmc: mtk-sd: Use readl_poll_timeout instead of open-coded polling") Fixes: 7a2fa8eed936 ("mmc: mtk-sd: use devm_mmc_alloc_host") Signed-off-by: Andy-ld Lu <andy-ld.lu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: stable@vger.kernel.org Message-ID: <20241107121215.5201-1-andy-ld.lu@mediatek.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy-ld Lu <andy-ld.lu@mediatek.com> Date: Mon Nov 11 16:49:31 2024 +0800 mmc: mtk-sd: Fix MMC_CAP2_CRYPTO flag setting [ Upstream commit 2508925fb346661bad9f50b497d7ac7d0b6085d0 ] Currently, the MMC_CAP2_CRYPTO flag is set by default for eMMC hosts. However, this flag should not be set for hosts that do not support inline encryption. The 'crypto' clock, as described in the documentation, is used for data encryption and decryption. Therefore, only hosts that are configured with this 'crypto' clock should have the MMC_CAP2_CRYPTO flag set. Fixes: 7b438d0377fb ("mmc: mtk-sd: add Inline Crypto Engine clock control") Fixes: ed299eda8fbb ("mmc: mtk-sd: fix devm_clk_get_optional usage") Signed-off-by: Andy-ld Lu <andy-ld.lu@mediatek.com> Cc: stable@vger.kernel.org Message-ID: <20241111085039.26527-1-andy-ld.lu@mediatek.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rosen Penev <rosenp@gmail.com> Date: Mon Sep 30 15:49:17 2024 -0700 mmc: mtk-sd: use devm_mmc_alloc_host [ Upstream commit 7a2fa8eed936b33b22e49b1d2349cd7d02f22710 ] Allows removing several gotos. Also fixed some wrong ones. Added dev_err_probe where EPROBE_DEFER is possible. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://lore.kernel.org/r/20240930224919.355359-2-rosenp@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: 291220451c77 ("mmc: mtk-sd: Fix error handle of probe function") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Avri Altman <avri.altman@wdc.com> Date: Sun Oct 6 08:11:39 2024 +0300 mmc: sd: SDUC Support Recognition [ Upstream commit fce2ce78af1e14dc1316aaddb5b3308be05cf452 ] Ultra Capacity SD cards (SDUC) was already introduced in SD7.0. Those cards support capacity larger than 2TB and up to including 128TB. ACMD41 was extended to support the host-card handshake during initialization. The card expects that the HCS & HO2T bits to be set in the command argument, and sets the applicable bits in the R3 returned response. On the contrary, if a SDUC card is inserted to a non-supporting host, it will never respond to this ACMD41 until eventually, the host will timed out and give up. Also, add SD CSD version 3.0 - designated for SDUC, and properly parse the csd register as the c_size field got expanded to 28 bits. Do not enable SDUC for now - leave it to the last patch in the series. Tested-by: Ricky WU <ricky_wu@realtek.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241006051148.160278-2-avri.altman@wdc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: 869d37475788 ("mmc: core: Use GFP_NOIO in ACMD22") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com> Date: Mon Sep 23 14:20:16 2024 +0800 mmc: sdhci-esdhc-imx: enable quirks SDHCI_QUIRK_NO_LED [ Upstream commit 4dede2b76f4a760e948e1a49b1520881cb459bd3 ] Enable SDHCI_QUIRK_NO_LED for i.MX7ULP, i.MX8MM, i.MX8QXP and i.MXRT1050. Even there is LCTL register bit, there is no IOMUX PAD for it. So there is no sense to enable LED for SDHCI for these SoCs. Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Haibo Chen <haibo.chen@nxp.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240923062016.1165868-1-peng.fan@oss.nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <hdegoede@redhat.com> Date: Mon Nov 18 22:00:49 2024 +0100 mmc: sdhci-pci: Add DMI quirk for missing CD GPIO on Vexia Edu Atla 10 tablet commit 7f0fa47ceebcff0e3591bb7e32a71a2cd7846149 upstream. The Vexia Edu Atla 10 tablet distributed to schools in the Spanish Andalucía region has no ACPI fwnode associated with the SDHCI controller for its microsd-slot and thus has no ACPI GPIO resource info. This causes the following error to be logged and the slot to not work: [ 10.572113] sdhci-pci 0000:00:12.0: failed to setup card detect gpio Add a DMI quirk table for providing gpiod_lookup_tables with manually provided CD GPIO info and use this DMI table to provide the CD GPIO info on this tablet. This fixes the microsd-slot not working. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Message-ID: <20241118210049.311079-1-hdegoede@redhat.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Thomas Gleixner <tglx@linutronix.de> Date: Sun Dec 1 12:17:30 2024 +0100 modpost: Add .irqentry.text to OTHER_SECTIONS commit 7912405643a14b527cd4a4f33c1d4392da900888 upstream. The compiler can fully inline the actual handler function of an interrupt entry into the .irqentry.text entry point. If such a function contains an access which has an exception table entry, modpost complains about a section mismatch: WARNING: vmlinux.o(__ex_table+0x447c): Section mismatch in reference ... The relocation at __ex_table+0x447c references section ".irqentry.text" which is not in the list of authorized sections. Add .irqentry.text to OTHER_SECTIONS to cure the issue. Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # needed for linux-5.4-y Link: https://lore.kernel.org/all/20241128111844.GE10431@google.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Gang Yan <yangang@kylinos.cn> Date: Mon Oct 21 17:14:04 2024 +0200 mptcp: annotate data-races around subflow->fully_established [ Upstream commit 581c8cbfa934aaa555daa4e843242fcecc160f05 ] We introduce the same handling for potential data races with the 'fully_established' flag in subflow as previously done for msk->fully_established. Additionally, we make a crucial change: convert the subflow's 'fully_established' from 'bit_field' to 'bool' type. This is necessary because methods for avoiding data races don't work well with 'bit_field'. Specifically, the 'READ_ONCE' needs to know the size of the variable being accessed, which is not supported in 'bit_field'. Also, 'test_bit' expect the address of 'bit_field'. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/516 Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-2-1ef02746504a@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Kandybka <d.kandybka@gmail.com> Date: Thu Nov 7 13:36:57 2024 +0300 mptcp: fix possible integer overflow in mptcp_reset_tout_timer [ Upstream commit b169e76ebad22cbd055101ee5aa1a7bed0e66606 ] In 'mptcp_reset_tout_timer', promote 'probe_timestamp' to unsigned long to avoid possible integer overflow. Compile tested only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com> Link: https://patch.msgid.link/20241107103657.1560536-1-d.kandybka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shradha Gupta <shradhagupta@linux.microsoft.com> Date: Tue Dec 3 21:48:20 2024 -0800 net :mana :Request a V2 response version for MANA_QUERY_GF_STAT commit 31f1b55d5d7e531cd827419e5d71c19f24de161c upstream. The current requested response version(V1) for MANA_QUERY_GF_STAT query results in STATISTICS_FLAGS_TX_ERRORS_GDMA_ERROR value being set to 0 always. In order to get the correct value for this counter we request the response version to be V2. Cc: stable@vger.kernel.org Fixes: e1df5202e879 ("net :mana :Add remaining GDMA stats for MANA to ethtool") Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1733291300-12593-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jiri Wiesner <jwiesner@suse.de> Date: Thu Nov 28 09:59:50 2024 +0100 net/ipv6: release expired exception dst cached in socket [ Upstream commit 3301ab7d5aeb0fe270f73a3d4810c9d1b6a9f045 ] Dst objects get leaked in ip6_negative_advice() when this function is executed for an expired IPv6 route located in the exception table. There are several conditions that must be fulfilled for the leak to occur: * an ICMPv6 packet indicating a change of the MTU for the path is received, resulting in an exception dst being created * a TCP connection that uses the exception dst for routing packets must start timing out so that TCP begins retransmissions * after the exception dst expires, the FIB6 garbage collector must not run before TCP executes ip6_negative_advice() for the expired exception dst When TCP executes ip6_negative_advice() for an exception dst that has expired and if no other socket holds a reference to the exception dst, the refcount of the exception dst is 2, which corresponds to the increment made by dst_init() and the increment made by the TCP socket for which the connection is timing out. The refcount made by the socket is never released. The refcount of the dst is decremented in sk_dst_reset() but that decrement is counteracted by a dst_hold() intentionally placed just before the sk_dst_reset() in ip6_negative_advice(). After ip6_negative_advice() has finished, there is no other object tied to the dst. The socket lost its reference stored in sk_dst_cache and the dst is no longer in the exception table. The exception dst becomes a leaked object. As a result of this dst leak, an unbalanced refcount is reported for the loopback device of a net namespace being destroyed under kernels that do not contain e5f80fcf869a ("ipv6: give an IPv6 dev to blackhole_netdev"): unregister_netdevice: waiting for lo to become free. Usage count = 2 Fix the dst leak by removing the dst_hold() in ip6_negative_advice(). The patch that introduced the dst_hold() in ip6_negative_advice() was 92f1655aa2b22 ("net: fix __dst_negative_advice() race"). But 92f1655aa2b22 merely refactored the code with regards to the dst refcount so the issue was present even before 92f1655aa2b22. The bug was introduced in 54c1a859efd9f ("ipv6: Don't drop cache route entry unless timer actually expired.") where the expired cached route is deleted and the sk_dst_cache member of the socket is set to NULL by calling dst_negative_advice() but the refcount belonging to the socket is left unbalanced. The IPv4 version - ipv4_negative_advice() - is not affected by this bug. When the TCP connection times out ipv4_negative_advice() merely resets the sk_dst_cache of the socket while decrementing the refcount of the exception dst. Fixes: 92f1655aa2b22 ("net: fix __dst_negative_advice() race") Fixes: 54c1a859efd9f ("ipv6: Don't drop cache route entry unless timer actually expired.") Link: https://lore.kernel.org/netdev/20241113105611.GA6723@incl/T/#u Signed-off-by: Jiri Wiesner <jwiesner@suse.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241128085950.GA4505@incl Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cosmin Ratiu <cratiu@nvidia.com> Date: Tue Dec 3 22:49:15 2024 +0200 net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout [ Upstream commit 530b69a26952c95ffc9e6dcd1cff6f249032780a ] It allocates a match template, which creates a compressed definer fc struct, but that is not deallocated. This commit fixes that. Fixes: 74a778b4a63f ("net/mlx5: HWS, added definers handling") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cosmin Ratiu <cratiu@nvidia.com> Date: Tue Dec 3 22:49:16 2024 +0200 net/mlx5: HWS: Properly set bwc queue locks lock classes [ Upstream commit 10e0f0c018d5ce5ba4f349875c67f99c3253b5c3 ] The mentioned "Fixes" patch forgot to do that. Fixes: 9addffa34359 ("net/mlx5: HWS, use lock classes for bwc locks") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Ott <sebott@redhat.com> Date: Wed Oct 23 15:41:46 2024 +0200 net/mlx5: unique names for per device caches commit 25872a079bbbe952eb660249cc9f40fa75623e68 upstream. Add the device name to the per device kmem_cache names to ensure their uniqueness. This fixes warnings like this: "kmem_cache of name 'mlx5_fs_fgs' already exists". Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241023134146.28448-1-sebott@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Matt Fleming <mfleming@cloudflare.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jianbo Liu <jianbol@nvidia.com> Date: Tue Dec 3 22:49:20 2024 +0200 net/mlx5e: Remove workaround to avoid syndrome for internal port [ Upstream commit 5085f861b414e4a51ce28a891dfa32a10a54b64e ] Previously a workaround was added to avoid syndrome 0xcdb051. It is triggered when offload a rule with tunnel encapsulation, and forwarding to another table, but not matching on the internal port in firmware steering mode. The original workaround skips internal tunnel port logic, which is not correct as not all cases are considered. As an example, if vlan is configured on the uplink port, traffic can't pass because vlan header is not added with this workaround. Besides, there is no such issue for software steering. So, this patch removes that, and returns error directly if trying to offload such rule for firmware steering. Fixes: 06b4eac9c4be ("net/mlx5e: Don't offload internal port if filter device is out device") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Tested-by: Frode Nordahl <frode.nordahl@canonical.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tariq Toukan <tariqt@nvidia.com> Date: Tue Dec 3 22:49:19 2024 +0200 net/mlx5e: SD, Use correct mdev to build channel param [ Upstream commit 31f114c3d158dacbeda33f2afb0ca41f4ec6c9f9 ] In a multi-PF netdev, each traffic channel creates its own resources against a specific PF. In the cited commit, where this support was added, the channel_param logic was mistakenly kept unchanged, so it always used the primary PF which is found at priv->mdev. In this patch we fix this by moving the logic to be per-channel, and passing the correct mdev instance. This bug happened to be usually harmless, as the resulting cparam structures would be the same for all channels, due to identical FW logic and decisions. However, in some use cases, like fwreset, this gets broken. This could lead to different symptoms. Example: Error cqe on cqn 0x428, ci 0x0, qn 0x10a9, opcode 0xe, syndrome 0x4, vendor syndrome 0x32 Fixes: e4f9686bdee7 ("net/mlx5e: Let channels be SD-aware") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org> Date: Thu Nov 14 16:32:21 2024 -0800 net/neighbor: clear error in case strict check is not set [ Upstream commit 0de6a472c3b38432b2f184bd64eb70d9ea36d107 ] Commit 51183d233b5a ("net/neighbor: Update neigh_dump_info for strict data checking") added strict checking. The err variable is not cleared, so if we find no table to dump we will return the validation error even if user did not want strict checking. I think the only way to hit this is to send an buggy request, and ask for a table which doesn't exist, so there's no point treating this as a real fix. I only noticed it because a syzbot repro depended on it to trigger another bug. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241115003221.733593-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Louis Leseur <louis.leseur@gmail.com> Date: Thu Nov 28 09:33:58 2024 +0100 net/qed: allow old cards not supporting "num_images" to work [ Upstream commit 7a0ea70da56ee8c2716d0b79e9959d3c47efab62 ] Commit 43645ce03e00 ("qed: Populate nvm image attribute shadow.") added support for populating flash image attributes, notably "num_images". However, some cards were not able to return this information. In such cases, the driver would return EINVAL, causing the driver to exit. Add check to return EOPNOTSUPP instead of EINVAL when the card is not able to return these information. The caller function already handles EOPNOTSUPP without error. Fixes: 43645ce03e00 ("qed: Populate nvm image attribute shadow.") Co-developed-by: Florian Forestier <florian@forestier.re> Signed-off-by: Florian Forestier <florian@forestier.re> Signed-off-by: Louis Leseur <louis.leseur@gmail.com> Link: https://patch.msgid.link/20241128083633.26431-1-louis.leseur@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Elena Salomatkina <esalomatkina@ispras.ru> Date: Sun Oct 13 15:45:29 2024 +0300 net/sched: cbs: Fix integer overflow in cbs_set_port_rate() [ Upstream commit 397006ba5d918f9b74e734867e8fddbc36dc2282 ] The subsequent calculation of port_rate = speed * 1000 * BYTES_PER_KBIT, where the BYTES_PER_KBIT is of type LL, may cause an overflow. At least when speed = SPEED_20000, the expression to the left of port_rate will be greater than INT_MAX. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Elena Salomatkina <esalomatkina@ispras.ru> Link: https://patch.msgid.link/20241013124529.1043-1-esalomatkina@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Martin Ottens <martin.ottens@fau.de> Date: Mon Nov 25 18:46:07 2024 +0100 net/sched: tbf: correct backlog statistic for GSO packets [ Upstream commit 1596a135e3180c92e42dd1fbcad321f4fb3e3b17 ] When the length of a GSO packet in the tbf qdisc is larger than the burst size configured the packet will be segmented by the tbf_segment function. Whenever this function is used to enqueue SKBs, the backlog statistic of the tbf is not increased correctly. This can lead to underflows of the 'backlog' byte-statistic value when these packets are dequeued from tbf. Reproduce the bug: Ensure that the sender machine has GSO enabled. Configured the tbf on the outgoing interface of the machine as follows (burstsize = 1 MTU): $ tc qdisc add dev <oif> root handle 1: tbf rate 50Mbit burst 1514 latency 50ms Send bulk TCP traffic out via this interface, e.g., by running an iPerf3 client on this machine. Check the qdisc statistics: $ tc -s qdisc show dev <oif> The 'backlog' byte-statistic has incorrect values while traffic is transferred, e.g., high values due to u32 underflows. When the transfer is stopped, the value is != 0, which should never happen. This patch fixes this bug by updating the statistics correctly, even if single SKBs of a GSO SKB cannot be enqueued. Fixes: e43ac79a4bc6 ("sch_tbf: segment too big GSO packets") Signed-off-by: Martin Ottens <martin.ottens@fau.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241125174608.1484356-1-martin.ottens@fau.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wen Gu <guwen@linux.alibaba.com> Date: Wed Nov 27 21:30:14 2024 +0800 net/smc: fix LGR and link use-after-free issue [ Upstream commit 2c7f14ed9c19ec0f149479d1c2842ec1f9bf76d7 ] We encountered a LGR/link use-after-free issue, which manifested as the LGR/link refcnt reaching 0 early and entering the clear process, making resource access unsafe. refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 107447 at lib/refcount.c:25 refcount_warn_saturate+0x9c/0x140 Workqueue: events smc_lgr_terminate_work [smc] Call trace: refcount_warn_saturate+0x9c/0x140 __smc_lgr_terminate.part.45+0x2a8/0x370 [smc] smc_lgr_terminate_work+0x28/0x30 [smc] process_one_work+0x1b8/0x420 worker_thread+0x158/0x510 kthread+0x114/0x118 or refcount_t: underflow; use-after-free. WARNING: CPU: 6 PID: 93140 at lib/refcount.c:28 refcount_warn_saturate+0xf0/0x140 Workqueue: smc_hs_wq smc_listen_work [smc] Call trace: refcount_warn_saturate+0xf0/0x140 smcr_link_put+0x1cc/0x1d8 [smc] smc_conn_free+0x110/0x1b0 [smc] smc_conn_abort+0x50/0x60 [smc] smc_listen_find_device+0x75c/0x790 [smc] smc_listen_work+0x368/0x8a0 [smc] process_one_work+0x1b8/0x420 worker_thread+0x158/0x510 kthread+0x114/0x118 It is caused by repeated release of LGR/link refcnt. One suspect is that smc_conn_free() is called repeatedly because some smc_conn_free() from server listening path are not protected by sock lock. e.g. Calls under socklock | smc_listen_work ------------------------------------------------------- lock_sock(sk) | smc_conn_abort smc_conn_free | \- smc_conn_free \- smcr_link_put | \- smcr_link_put (duplicated) release_sock(sk) So here add sock lock protection in smc_listen_work() path, making it exclusive with other connection operations. Fixes: 3b2dec2603d5 ("net/smc: restructure client and server code in af_smc") Co-developed-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Co-developed-by: Kai <KaiShen@linux.alibaba.com> Signed-off-by: Kai <KaiShen@linux.alibaba.com> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wen Gu <guwen@linux.alibaba.com> Date: Wed Nov 27 21:30:13 2024 +0800 net/smc: initialize close_work early to avoid warning [ Upstream commit 0541db8ee32c09463a72d0987382b3a3336b0043 ] We encountered a warning that close_work was canceled before initialization. WARNING: CPU: 7 PID: 111103 at kernel/workqueue.c:3047 __flush_work+0x19e/0x1b0 Workqueue: events smc_lgr_terminate_work [smc] RIP: 0010:__flush_work+0x19e/0x1b0 Call Trace: ? __wake_up_common+0x7a/0x190 ? work_busy+0x80/0x80 __cancel_work_timer+0xe3/0x160 smc_close_cancel_work+0x1a/0x70 [smc] smc_close_active_abort+0x207/0x360 [smc] __smc_lgr_terminate.part.38+0xc8/0x180 [smc] process_one_work+0x19e/0x340 worker_thread+0x30/0x370 ? process_one_work+0x340/0x340 kthread+0x117/0x130 ? __kthread_cancel_work+0x50/0x50 ret_from_fork+0x22/0x30 This is because when smc_close_cancel_work is triggered, e.g. the RDMA driver is rmmod and the LGR is terminated, the conn->close_work is flushed before initialization, resulting in WARN_ON(!work->func). __smc_lgr_terminate | smc_connect_{rdma|ism} ------------------------------------------------------------- | smc_conn_create | \- smc_lgr_register_conn for conn in lgr->conns_all | \- smc_conn_kill | \- smc_close_active_abort | \- smc_close_cancel_work | \- cancel_work_sync | \- __flush_work | (close_work) | | smc_close_init | \- INIT_WORK(&close_work) So fix this by initializing close_work before establishing the connection. Fixes: 46c28dbd4c23 ("net/smc: no socket state changes in tasklet context") Fixes: 413498440e30 ("net/smc: add SMC-D support in af_smc") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Safonov <0x7f454c46@gmail.com> Date: Wed Oct 30 04:22:33 2024 +0000 net/tcp: Add missing lockdep annotations for TCP-AO hlist traversals [ Upstream commit 6b2d11e2d8fc130df4708be0b6b53fd3e6b54cf6 ] Under CONFIG_PROVE_RCU_LIST + CONFIG_RCU_EXPERT hlist_for_each_entry_rcu() provides very helpful splats, which help to find possible issues. I missed CONFIG_RCU_EXPERT=y in my testing config the same as described in a3e4bf7f9675 ("configs/debug: make sure PROVE_RCU_LIST=y takes effect"). The fix itself is trivial: add the very same lockdep annotations as were used to dereference ao_info from the socket. Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20241028152645.35a8be66@kernel.org/ Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20241030-tcp-ao-hlist-lockdep-annotate-v1-1-bf641a64d7c6@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:03 2024 +0100 net: af_can: do not leave a dangling sk pointer in can_create() [ Upstream commit 811a7ca7320c062e15d0f5b171fe6ad8592d1434 ] On error can_create() frees the allocated sk object, but sock_init_data() has already attached it to the provided sock object. This will leave a dangling sk pointer in the sock object and may cause use-after-free later. Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://patch.msgid.link/20241014153808.51894-5-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com> Date: Tue Dec 3 17:09:33 2024 +0000 net: avoid potential UAF in default_operstate() [ Upstream commit 750e51603395e755537da08f745864c93e3ce741 ] syzbot reported an UAF in default_operstate() [1] Issue is a race between device and netns dismantles. After calling __rtnl_unlock() from netdev_run_todo(), we can not assume the netns of each device is still alive. Make sure the device is not in NETREG_UNREGISTERED state, and add an ASSERT_RTNL() before the call to __dev_get_by_index(). We might move this ASSERT_RTNL() in __dev_get_by_index() in the future. [1] BUG: KASAN: slab-use-after-free in __dev_get_by_index+0x5d/0x110 net/core/dev.c:852 Read of size 8 at addr ffff888043eba1b0 by task syz.0.0/5339 CPU: 0 UID: 0 PID: 5339 Comm: syz.0.0 Not tainted 6.12.0-syzkaller-10296-gaaf20f870da0 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x169/0x550 mm/kasan/report.c:489 kasan_report+0x143/0x180 mm/kasan/report.c:602 __dev_get_by_index+0x5d/0x110 net/core/dev.c:852 default_operstate net/core/link_watch.c:51 [inline] rfc2863_policy+0x224/0x300 net/core/link_watch.c:67 linkwatch_do_dev+0x3e/0x170 net/core/link_watch.c:170 netdev_run_todo+0x461/0x1000 net/core/dev.c:10894 rtnl_unlock net/core/rtnetlink.c:152 [inline] rtnl_net_unlock include/linux/rtnetlink.h:133 [inline] rtnl_dellink+0x760/0x8d0 net/core/rtnetlink.c:3520 rtnetlink_rcv_msg+0x791/0xcf0 net/core/rtnetlink.c:6911 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2541 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:726 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2583 ___sys_sendmsg net/socket.c:2637 [inline] __sys_sendmsg+0x269/0x350 net/socket.c:2669 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f2a3cb80809 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f2a3d9cd058 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f2a3cd45fa0 RCX: 00007f2a3cb80809 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000008 RBP: 00007f2a3cbf393e R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f2a3cd45fa0 R15: 00007ffd03bc65c8 </TASK> Allocated by task 5339: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __kmalloc_cache_noprof+0x243/0x390 mm/slub.c:4314 kmalloc_noprof include/linux/slab.h:901 [inline] kmalloc_array_noprof include/linux/slab.h:945 [inline] netdev_create_hash net/core/dev.c:11870 [inline] netdev_init+0x10c/0x250 net/core/dev.c:11890 ops_init+0x31e/0x590 net/core/net_namespace.c:138 setup_net+0x287/0x9e0 net/core/net_namespace.c:362 copy_net_ns+0x33f/0x570 net/core/net_namespace.c:500 create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228 ksys_unshare+0x57d/0xa70 kernel/fork.c:3314 __do_sys_unshare kernel/fork.c:3385 [inline] __se_sys_unshare kernel/fork.c:3383 [inline] __x64_sys_unshare+0x38/0x40 kernel/fork.c:3383 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 12: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kfree+0x196/0x420 mm/slub.c:4746 netdev_exit+0x65/0xd0 net/core/dev.c:11992 ops_exit_list net/core/net_namespace.c:172 [inline] cleanup_net+0x802/0xcc0 net/core/net_namespace.c:632 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 The buggy address belongs to the object at ffff888043eba000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 432 bytes inside of freed 2048-byte region [ffff888043eba000, ffff888043eba800) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x43eb8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff) page_type: f5(slab) raw: 04fff00000000040 ffff88801ac42000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000080008 00000001f5000000 0000000000000000 head: 04fff00000000040 ffff88801ac42000 dead000000000122 0000000000000000 head: 0000000000000000 0000000000080008 00000001f5000000 0000000000000000 head: 04fff00000000003 ffffea00010fae01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5339, tgid 5338 (syz.0.0), ts 69674195892, free_ts 69663220888 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1556 prep_new_page mm/page_alloc.c:1564 [inline] get_page_from_freelist+0x3649/0x3790 mm/page_alloc.c:3474 __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4751 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_slab_page+0x6a/0x140 mm/slub.c:2408 allocate_slab+0x5a/0x2f0 mm/slub.c:2574 new_slab mm/slub.c:2627 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3815 __slab_alloc+0x58/0xa0 mm/slub.c:3905 __slab_alloc_node mm/slub.c:3980 [inline] slab_alloc_node mm/slub.c:4141 [inline] __do_kmalloc_node mm/slub.c:4282 [inline] __kmalloc_noprof+0x2e6/0x4c0 mm/slub.c:4295 kmalloc_noprof include/linux/slab.h:905 [inline] sk_prot_alloc+0xe0/0x210 net/core/sock.c:2165 sk_alloc+0x38/0x370 net/core/sock.c:2218 __netlink_create+0x65/0x260 net/netlink/af_netlink.c:629 __netlink_kernel_create+0x174/0x6f0 net/netlink/af_netlink.c:2015 netlink_kernel_create include/linux/netlink.h:62 [inline] uevent_net_init+0xed/0x2d0 lib/kobject_uevent.c:783 ops_init+0x31e/0x590 net/core/net_namespace.c:138 setup_net+0x287/0x9e0 net/core/net_namespace.c:362 page last free pid 1032 tgid 1032 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1127 [inline] free_unref_page+0xdf9/0x1140 mm/page_alloc.c:2657 __slab_free+0x31b/0x3d0 mm/slub.c:4509 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_node_noprof+0x1d9/0x380 mm/slub.c:4205 __alloc_skb+0x1c3/0x440 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1323 [inline] alloc_skb_with_frags+0xc3/0x820 net/core/skbuff.c:6612 sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2881 sock_alloc_send_skb include/net/sock.h:1797 [inline] mld_newpack+0x1c3/0xaf0 net/ipv6/mcast.c:1747 add_grhead net/ipv6/mcast.c:1850 [inline] add_grec+0x1492/0x19a0 net/ipv6/mcast.c:1988 mld_send_initial_cr+0x228/0x4b0 net/ipv6/mcast.c:2234 ipv6_mc_dad_complete+0x88/0x490 net/ipv6/mcast.c:2245 addrconf_dad_completed+0x712/0xcd0 net/ipv6/addrconf.c:4342 addrconf_dad_work+0xdc2/0x16f0 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 Memory state around the buggy address: ffff888043eba080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888043eba100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888043eba180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888043eba200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888043eba280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 8c55facecd7a ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down") Reported-by: syzbot+1939f24bdb783e9e43d9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/674f3a18.050a0220.48a03.0041.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241203170933.2449307-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wei Fang <wei.fang@nxp.com> Date: Mon Nov 25 17:07:19 2024 +0800 net: enetc: Do not configure preemptible TCs if SIs do not support [ Upstream commit b2420b8c81ec674552d00c55d46245e5c184b260 ] Both ENETC PF and VF drivers share enetc_setup_tc_mqprio() to configure MQPRIO. And enetc_setup_tc_mqprio() calls enetc_change_preemptible_tcs() to configure preemptible TCs. However, only PF is able to configure preemptible TCs. Because only PF has related registers, while VF does not have these registers. So for VF, its hw->port pointer is NULL. Therefore, VF will access an invalid pointer when accessing a non-existent register, which will cause a crash issue. The simplified log is as follows. root@ls1028ardb:~# tc qdisc add dev eno0vf0 parent root handle 100: \ mqprio num_tc 4 map 0 0 1 1 2 2 3 3 queues 1@0 1@1 1@2 1@3 hw 1 [ 187.290775] Unable to handle kernel paging request at virtual address 0000000000001f00 [ 187.424831] pc : enetc_mm_commit_preemptible_tcs+0x1c4/0x400 [ 187.430518] lr : enetc_mm_commit_preemptible_tcs+0x30c/0x400 [ 187.511140] Call trace: [ 187.513588] enetc_mm_commit_preemptible_tcs+0x1c4/0x400 [ 187.518918] enetc_setup_tc_mqprio+0x180/0x214 [ 187.523374] enetc_vf_setup_tc+0x1c/0x30 [ 187.527306] mqprio_enable_offload+0x144/0x178 [ 187.531766] mqprio_init+0x3ec/0x668 [ 187.535351] qdisc_create+0x15c/0x488 [ 187.539023] tc_modify_qdisc+0x398/0x73c [ 187.542958] rtnetlink_rcv_msg+0x128/0x378 [ 187.547064] netlink_rcv_skb+0x60/0x130 [ 187.550910] rtnetlink_rcv+0x18/0x24 [ 187.554492] netlink_unicast+0x300/0x36c [ 187.558425] netlink_sendmsg+0x1a8/0x420 [ 187.606759] ---[ end trace 0000000000000000 ]--- In addition, some PFs also do not support configuring preemptible TCs, such as eno1 and eno3 on LS1028A. It won't crash like it does for VFs, but we should prevent these PFs from accessing these unimplemented registers. Fixes: 827145392a4a ("net: enetc: only commit preemptible TCs to hardware when MM TX is active") Signed-off-by: Wei Fang <wei.fang@nxp.com> Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Simon Horman <horms@kernel.org> Date: Mon Oct 14 11:48:08 2024 +0100 net: ethernet: fs_enet: Use %pa to format resource_size_t [ Upstream commit 45fe45fada261e1e83fce2a07fa22835aec1cf0a ] The correct format string for resource_size_t is %pa which acts on the address of the variable to be formatted [1]. [1] https://elixir.bootlin.com/linux/v6.11.3/source/Documentation/core-api/printk-formats.rst#L229 Introduced by commit 9d9326d3bc0e ("phy: Change mii_bus id field to a string") Flagged by gcc-14 as: drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c: In function 'fs_mii_bitbang_init': drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c:126:46: warning: format '%x' expects argument of type 'unsigned int', but argument 4 has type 'resource_size_t' {aka 'long long unsigned int'} [-Wformat=] 126 | snprintf(bus->id, MII_BUS_ID_SIZE, "%x", res.start); | ~^ ~~~~~~~~~ | | | | | resource_size_t {aka long long unsigned int} | unsigned int | %llx No functional change intended. Compile tested only. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/netdev/711d7f6d-b785-7560-f4dc-c6aad2cce99@linux-m68k.org/ Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241014-net-pa-fmt-v1-2-dcc9afb8858b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Simon Horman <horms@kernel.org> Date: Mon Oct 14 11:48:07 2024 +0100 net: fec_mpc52xx_phy: Use %pa to format resource_size_t [ Upstream commit 020bfdc4ed94be472138c891bde4d14241cf00fd ] The correct format string for resource_size_t is %pa which acts on the address of the variable to be formatted [1]. [1] https://elixir.bootlin.com/linux/v6.11.3/source/Documentation/core-api/printk-formats.rst#L229 Introduced by commit 9d9326d3bc0e ("phy: Change mii_bus id field to a string") Flagged by gcc-14 as: drivers/net/ethernet/freescale/fec_mpc52xx_phy.c: In function 'mpc52xx_fec_mdio_probe': drivers/net/ethernet/freescale/fec_mpc52xx_phy.c:97:46: warning: format '%x' expects argument of type 'unsigned int', but argument 4 has type 'resource_size_t' {aka 'long long unsigned int'} [-Wformat=] 97 | snprintf(bus->id, MII_BUS_ID_SIZE, "%x", res.start); | ~^ ~~~~~~~~~ | | | | | resource_size_t {aka long long unsigned int} | unsigned int | %llx No functional change intended. Compile tested only. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/netdev/711d7f6d-b785-7560-f4dc-c6aad2cce99@linux-m68k.org/ Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241014-net-pa-fmt-v1-1-dcc9afb8858b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dong Chenchen <dongchenchen2@huawei.com> Date: Wed Nov 27 12:08:50 2024 +0800 net: Fix icmp host relookup triggering ip_rt_bug [ Upstream commit c44daa7e3c73229f7ac74985acb8c7fb909c4e0a ] arp link failure may trigger ip_rt_bug while xfrm enabled, call trace is: WARNING: CPU: 0 PID: 0 at net/ipv4/route.c:1241 ip_rt_bug+0x14/0x20 Modules linked in: CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-rc6-00077-g2e1b3cc9d7f7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:ip_rt_bug+0x14/0x20 Call Trace: <IRQ> ip_send_skb+0x14/0x40 __icmp_send+0x42d/0x6a0 ipv4_link_failure+0xe2/0x1d0 arp_error_report+0x3c/0x50 neigh_invalidate+0x8d/0x100 neigh_timer_handler+0x2e1/0x330 call_timer_fn+0x21/0x120 __run_timer_base.part.0+0x1c9/0x270 run_timer_softirq+0x4c/0x80 handle_softirqs+0xac/0x280 irq_exit_rcu+0x62/0x80 sysvec_apic_timer_interrupt+0x77/0x90 The script below reproduces this scenario: ip xfrm policy add src 0.0.0.0/0 dst 0.0.0.0/0 \ dir out priority 0 ptype main flag localok icmp ip l a veth1 type veth ip a a 192.168.141.111/24 dev veth0 ip l s veth0 up ping 192.168.141.155 -c 1 icmp_route_lookup() create input routes for locally generated packets while xfrm relookup ICMP traffic.Then it will set input route (dst->out = ip_rt_bug) to skb for DESTUNREACH. For ICMP err triggered by locally generated packets, dst->dev of output route is loopback. Generally, xfrm relookup verification is not required on loopback interfaces (net.ipv4.conf.lo.disable_xfrm = 1). Skip icmp relookup for locally generated packets to fix it. Fixes: 8b7817f3a959 ("[IPSEC]: Add ICMP host relookup support") Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241127040850.1513135-1-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com> Date: Tue Nov 26 14:43:44 2024 +0000 net: hsr: avoid potential out-of-bound access in fill_frame_info() [ Upstream commit b9653d19e556c6afd035602927a93d100a0d7644 ] syzbot is able to feed a packet with 14 bytes, pretending it is a vlan one. Since fill_frame_info() is relying on skb->mac_len already, extend the check to cover this case. BUG: KMSAN: uninit-value in fill_frame_info net/hsr/hsr_forward.c:709 [inline] BUG: KMSAN: uninit-value in hsr_forward_skb+0x9ee/0x3b10 net/hsr/hsr_forward.c:724 fill_frame_info net/hsr/hsr_forward.c:709 [inline] hsr_forward_skb+0x9ee/0x3b10 net/hsr/hsr_forward.c:724 hsr_dev_xmit+0x2f0/0x350 net/hsr/hsr_device.c:235 __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3606 __dev_queue_xmit+0x366a/0x57d0 net/core/dev.c:4434 dev_queue_xmit include/linux/netdevice.h:3168 [inline] packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3146 [inline] packet_sendmsg+0x91ae/0xa6f0 net/packet/af_packet.c:3178 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:726 __sys_sendto+0x594/0x750 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __x64_sys_sendto+0x125/0x1d0 net/socket.c:2200 x64_sys_call+0x346a/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:45 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was created at: slab_post_alloc_hook mm/slub.c:4091 [inline] slab_alloc_node mm/slub.c:4134 [inline] kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4186 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587 __alloc_skb+0x363/0x7b0 net/core/skbuff.c:678 alloc_skb include/linux/skbuff.h:1323 [inline] alloc_skb_with_frags+0xc8/0xd00 net/core/skbuff.c:6612 sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2881 packet_alloc_skb net/packet/af_packet.c:2995 [inline] packet_snd net/packet/af_packet.c:3089 [inline] packet_sendmsg+0x74c6/0xa6f0 net/packet/af_packet.c:3178 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:726 __sys_sendto+0x594/0x750 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __x64_sys_sendto+0x125/0x1d0 net/socket.c:2200 x64_sys_call+0x346a/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:45 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 48b491a5cc74 ("net: hsr: fix mac_len checks") Reported-by: syzbot+671e2853f9851d039551@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6745dc7f.050a0220.21d33d.0018.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: WingMan Kwok <w-kwok2@ti.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: MD Danish Anwar <danishanwar@ti.com> Cc: Jiri Pirko <jiri@nvidia.com> Cc: George McCollister <george.mccollister@gmail.com> Link: https://patch.msgid.link/20241126144344.4177332-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com> Date: Mon Dec 2 10:05:58 2024 +0000 net: hsr: must allocate more bytes for RedBox support [ Upstream commit af8edaeddbc52e53207d859c912b017fd9a77629 ] Blamed commit forgot to change hsr_init_skb() to allocate larger skb for RedBox case. Indeed, send_hsr_supervision_frame() will add two additional components (struct hsr_sup_tlv and struct hsr_sup_payload) syzbot reported the following crash: skbuff: skb_over_panic: text:ffffffff8afd4b0a len:34 put:6 head:ffff88802ad29e00 data:ffff88802ad29f22 tail:0x144 end:0x140 dev:gretap0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 2 UID: 0 PID: 7611 Comm: syz-executor Not tainted 6.12.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:skb_panic+0x157/0x1d0 net/core/skbuff.c:206 Code: b6 04 01 84 c0 74 04 3c 03 7e 21 8b 4b 70 41 56 45 89 e8 48 c7 c7 a0 7d 9b 8c 41 57 56 48 89 ee 52 4c 89 e2 e8 9a 76 79 f8 90 <0f> 0b 4c 89 4c 24 10 48 89 54 24 08 48 89 34 24 e8 94 76 fb f8 4c RSP: 0018:ffffc90000858ab8 EFLAGS: 00010282 RAX: 0000000000000087 RBX: ffff8880598c08c0 RCX: ffffffff816d3e69 RDX: 0000000000000000 RSI: ffffffff816de786 RDI: 0000000000000005 RBP: ffffffff8c9b91c0 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000302 R11: ffffffff961cc1d0 R12: ffffffff8afd4b0a R13: 0000000000000006 R14: ffff88804b938130 R15: 0000000000000140 FS: 000055558a3d6500(0000) GS:ffff88806a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1295974ff8 CR3: 000000002ab6e000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> skb_over_panic net/core/skbuff.c:211 [inline] skb_put+0x174/0x1b0 net/core/skbuff.c:2617 send_hsr_supervision_frame+0x6fa/0x9e0 net/hsr/hsr_device.c:342 hsr_proxy_announce+0x1a3/0x4a0 net/hsr/hsr_device.c:436 call_timer_fn+0x1a0/0x610 kernel/time/timer.c:1794 expire_timers kernel/time/timer.c:1845 [inline] __run_timers+0x6e8/0x930 kernel/time/timer.c:2419 __run_timer_base kernel/time/timer.c:2430 [inline] __run_timer_base kernel/time/timer.c:2423 [inline] run_timer_base+0x111/0x190 kernel/time/timer.c:2439 run_timer_softirq+0x1a/0x40 kernel/time/timer.c:2449 handle_softirqs+0x213/0x8f0 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu kernel/softirq.c:637 [inline] irq_exit_rcu+0xbb/0x120 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049 </IRQ> Fixes: 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)") Reported-by: syzbot+7f4643b267cc680bfa1c@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Lukasz Majewski <lukma@denx.de> Link: https://patch.msgid.link/20241202100558.507765-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:04 2024 +0100 net: ieee802154: do not leave a dangling sk pointer in ieee802154_create() [ Upstream commit b4fcd63f6ef79c73cafae8cf4a114def5fc3d80d ] sock_init_data() attaches the allocated sk object to the provided sock object. If ieee802154_create() fails later, the allocated sk object is freed, but the dangling pointer remains in the provided sock object, which may allow use-after-free. Clear the sk pointer in the sock object on error. Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014153808.51894-6-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:06 2024 +0100 net: inet6: do not leave a dangling sk pointer in inet6_create() [ Upstream commit 9df99c395d0f55fb444ef39f4d6f194ca437d884 ] sock_init_data() attaches the allocated sk pointer to the provided sock object. If inet6_create() fails later, the sk object is released, but the sock object retains the dangling sk pointer, which may cause use-after-free later. Clear the sock sk pointer on error. Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014153808.51894-8-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ignat Korchagin <ignat@cloudflare.com> Date: Mon Oct 14 16:38:05 2024 +0100 net: inet: do not leave a dangling sk pointer in inet_create() [ Upstream commit 9365fa510c6f82e3aa550a09d0c5c6b44dbc78ff ] sock_init_data() attaches the allocated sk object to the provided sock object. If inet_create() fails later, the sk object is freed, but the sock object retains the dangling pointer, which may create use-after-free later. Clear the sk pointer in the sock object on error. Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014153808.51894-7-ignat@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oleksij Rempel <o.rempel@pengutronix.de> Date: Mon Nov 25 09:40:50 2024 +0100 net: phy: microchip: Reset LAN88xx PHY to ensure clean link state on LAN7800/7850 [ Upstream commit ccb989e4d1efe0dd81b28c437443532d80d9ecee ] Fix outdated MII_LPA data in the LAN88xx PHY, which is used in LAN7800 and LAN7850 USB Ethernet controllers. Due to a hardware limitation, the PHY cannot reliably update link status after parallel detection when the link partner does not support auto-negotiation. To mitigate this, add a PHY reset in `lan88xx_link_change_notify()` when `phydev->state` is `PHY_NOLINK`, ensuring the PHY starts in a clean state and reports accurate fixed link parallel detection results. Fixes: 792aec47d59d9 ("add microchip LAN88xx phy driver") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241125084050.414352-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xin Long <lucien.xin@gmail.com> Date: Mon Dec 2 10:21:38 2024 -0500 net: sched: fix erspan_opt settings in cls_flower [ Upstream commit 292207809486d99c78068d3f459cbbbffde88415 ] When matching erspan_opt in cls_flower, only the (version, dir, hwid) fields are relevant. However, in fl_set_erspan_opt() it initializes all bits of erspan_opt and its mask to 1. This inadvertently requires packets to match not only the (version, dir, hwid) fields but also the other fields that are unexpectedly set to 1. This patch resolves the issue by ensuring that only the (version, dir, hwid) fields are configured in fl_set_erspan_opt(), leaving the other fields to 0 in erspan_opt. Fixes: 79b1011cb33d ("net: sched: allow flower to match erspan options") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengyu Qu <wiagn233@outlook.com> Date: Sat Oct 12 01:39:17 2024 +0800 net: sfp: change quirks for Alcatel Lucent G-010S-P [ Upstream commit 90cb5f1776ba371478e2b08fbf7018c7bd781a8d ] Seems Alcatel Lucent G-010S-P also have the same problem that it uses TX_FAULT pin for SOC uart. So apply sfp_fixup_ignore_tx_fault to it. Signed-off-by: Shengyu Qu <wiagn233@outlook.com> Link: https://patch.msgid.link/TYCPR01MB84373677E45A7BFA5A28232C98792@TYCPR01MB8437.jpnprd01.prod.outlook.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abhishek Chauhan <quic_abchauha@quicinc.com> Date: Wed Oct 16 16:43:13 2024 -0700 net: stmmac: Programming sequence for VLAN packets with split header [ Upstream commit d10f1a4e44c3bf874701f86f8cc43490e1956acf ] Currently reset state configuration of split header works fine for non-tagged packets and we see no corruption in payload of any size We need additional programming sequence with reset configuration to handle VLAN tagged packets to avoid corruption in payload for packets of size greater than 256 bytes. Without this change ping application complains about corruption in payload when the size of the VLAN packet exceeds 256 bytes. With this change tagged and non-tagged packets of any size works fine and there is no corruption seen. Current configuration which has the issue for VLAN packet ---------------------------------------------------------- Split happens at the position at Layer 3 header |MAC-DA|MAC-SA|Vlan Tag|Ether type|IP header|IP data|Rest of the payload| 2 bytes ^ | With the fix we are making sure that the split happens now at Layer 2 which is end of ethernet header and start of IP payload Ip traffic split ----------------- Bits which take care of this are SPLM and SPLOFST SPLM = Split mode is set to Layer 2 SPLOFST = These bits indicate the value of offset from the beginning of Length/Type field at which header split should take place when the appropriate SPLM is selected. Reset value is 2bytes. Un-tagged data (without VLAN) |MAC-DA|MAC-SA|Ether type|IP header|IP data|Rest of the payload| 2bytes ^ | Tagged data (with VLAN) |MAC-DA|MAC-SA|VLAN Tag|Ether type|IP header|IP data|Rest of the payload| 2bytes ^ | Non-IP traffic split such AV packet ------------------------------------ Bits which take care of this are SAVE = Split AV Enable SAVO = Split AV Offset, similar to SPLOFST but this is for AVTP packets. |Preamble|MAC-DA|MAC-SA|VLAN tag|Ether type|IEEE 1722 payload|CRC| 2bytes ^ | Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241016234313.3992214-1-quic_abchauha@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Phil Sutter <phil@nwl.cc> Date: Fri Nov 29 16:30:38 2024 +0100 netfilter: ipset: Hold module reference while requesting a module [ Upstream commit 456f010bfaefde84d3390c755eedb1b0a5857c3c ] User space may unload ip_set.ko while it is itself requesting a set type backend module, leading to a kernel crash. The race condition may be provoked by inserting an mdelay() right after the nfnl_unlock() call. Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org> Date: Wed Nov 27 12:46:54 2024 +0100 netfilter: nft_inner: incorrect percpu area handling under softirq [ Upstream commit 7b1d83da254be3bf054965c8f3b1ad976f460ae5 ] Softirq can interrupt ongoing packet from process context that is walking over the percpu area that contains inner header offsets. Disable bh and perform three checks before restoring the percpu inner header offsets to validate that the percpu area is valid for this skbuff: 1) If the NFT_PKTINFO_INNER_FULL flag is set on, then this skbuff has already been parsed before for inner header fetching to register. 2) Validate that the percpu area refers to this skbuff using the skbuff pointer as a cookie. If there is a cookie mismatch, then this skbuff needs to be parsed again. 3) Finally, validate if the percpu area refers to this tunnel type. Only after these three checks the percpu area is restored to a on-stack copy and bh is enabled again. After inner header fetching, the on-stack copy is stored back to the percpu area. Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Reported-by: syzbot+84d0441b9860f0d63285@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org> Date: Mon Dec 2 00:04:49 2024 +0100 netfilter: nft_set_hash: skip duplicated elements pending gc run [ Upstream commit 7ffc7481153bbabf3332c6a19b289730c7e1edf5 ] rhashtable does not provide stable walk, duplicated elements are possible in case of resizing. I considered that checking for errors when calling rhashtable_walk_next() was sufficient to detect the resizing. However, rhashtable_walk_next() returns -EAGAIN only at the end of the iteration, which is too late, because a gc work containing duplicated elements could have been already scheduled for removal to the worker. Add a u32 gc worker sequence number per set, bump it on every workqueue run. Annotate gc worker sequence number on the expired element. Use it to skip those already seen in this gc workqueue run. Note that this new field is never reset in case gc transaction fails, so next gc worker run on the expired element overrides it. Wraparound of gc worker sequence number should not be an issue with stale gc worker sequence number in the element, that would just postpone the element removal in one gc run. Note that it is not possible to use flags to annotate that element is pending gc run to detect duplicates, given that gc transaction can be invalidated in case of update from the control plane, therefore, not allowing to clear such flag. On x86_64, pahole reports no changes in the size of nft_rhash_elem. Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch> Tested-by: Laurent Fasnacht <laurent.fasnacht@proton.ch> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org> Date: Tue Nov 26 11:59:06 2024 +0100 netfilter: nft_socket: remove WARN_ON_ONCE on maximum cgroup level [ Upstream commit b7529880cb961d515642ce63f9d7570869bbbdc3 ] cgroup maximum depth is INT_MAX by default, there is a cgroup toggle to restrict this maximum depth to a more reasonable value not to harm performance. Remove unnecessary WARN_ON_ONCE which is reachable from userspace. Fixes: 7f3287db6543 ("netfilter: nft_socket: make cgroupsv2 matching work with namespaces") Reported-by: syzbot+57bac0866ddd99fe47c0@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Antipov <dmantipov@yandex.ru> Date: Thu Nov 21 09:55:42 2024 +0300 netfilter: x_tables: fix LED ID check in led_tg_check() [ Upstream commit 04317f4eb2aad312ad85c1a17ad81fe75f1f9bc7 ] Syzbot has reported the following BUG detected by KASAN: BUG: KASAN: slab-out-of-bounds in strlen+0x58/0x70 Read of size 1 at addr ffff8881022da0c8 by task repro/5879 ... Call Trace: <TASK> dump_stack_lvl+0x241/0x360 ? __pfx_dump_stack_lvl+0x10/0x10 ? __pfx__printk+0x10/0x10 ? _printk+0xd5/0x120 ? __virt_addr_valid+0x183/0x530 ? __virt_addr_valid+0x183/0x530 print_report+0x169/0x550 ? __virt_addr_valid+0x183/0x530 ? __virt_addr_valid+0x183/0x530 ? __virt_addr_valid+0x45f/0x530 ? __phys_addr+0xba/0x170 ? strlen+0x58/0x70 kasan_report+0x143/0x180 ? strlen+0x58/0x70 strlen+0x58/0x70 kstrdup+0x20/0x80 led_tg_check+0x18b/0x3c0 xt_check_target+0x3bb/0xa40 ? __pfx_xt_check_target+0x10/0x10 ? stack_depot_save_flags+0x6e4/0x830 ? nft_target_init+0x174/0xc30 nft_target_init+0x82d/0xc30 ? __pfx_nft_target_init+0x10/0x10 ? nf_tables_newrule+0x1609/0x2980 ? nf_tables_newrule+0x1609/0x2980 ? rcu_is_watching+0x15/0xb0 ? nf_tables_newrule+0x1609/0x2980 ? nf_tables_newrule+0x1609/0x2980 ? __kmalloc_noprof+0x21a/0x400 nf_tables_newrule+0x1860/0x2980 ? __pfx_nf_tables_newrule+0x10/0x10 ? __nla_parse+0x40/0x60 nfnetlink_rcv+0x14e5/0x2ab0 ? __pfx_validate_chain+0x10/0x10 ? __pfx_nfnetlink_rcv+0x10/0x10 ? __lock_acquire+0x1384/0x2050 ? netlink_deliver_tap+0x2e/0x1b0 ? __pfx_lock_release+0x10/0x10 ? netlink_deliver_tap+0x2e/0x1b0 netlink_unicast+0x7f8/0x990 ? __pfx_netlink_unicast+0x10/0x10 ? __virt_addr_valid+0x183/0x530 ? __check_object_size+0x48e/0x900 netlink_sendmsg+0x8e4/0xcb0 ? __pfx_netlink_sendmsg+0x10/0x10 ? aa_sock_msg_perm+0x91/0x160 ? __pfx_netlink_sendmsg+0x10/0x10 __sock_sendmsg+0x223/0x270 ____sys_sendmsg+0x52a/0x7e0 ? __pfx_____sys_sendmsg+0x10/0x10 __sys_sendmsg+0x292/0x380 ? __pfx___sys_sendmsg+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x43d/0x780 ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10 ? exc_page_fault+0x590/0x8c0 ? do_syscall_64+0xb6/0x230 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... </TASK> Since an invalid (without '\0' byte at all) byte sequence may be passed from userspace, add an extra check to ensure that such a sequence is rejected as possible ID and so never passed to 'kstrdup()' and further. Reported-by: syzbot+6c8215822f35fdb35667@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6c8215822f35fdb35667 Fixes: 268cb38e1802 ("netfilter: x_tables: add LED trigger target") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Donald Hunter <donald.hunter@gmail.com> Date: Fri Oct 18 10:06:30 2024 +0100 netlink: specs: Add missing bitset attrs to ethtool spec [ Upstream commit b0b3683419b45e2971b6d413c506cb818b268d35 ] There are a couple of attributes missing from the 'bitset' attribute-set in the ethtool netlink spec. Add them to the spec. Reported-by: Kory Maincent <kory.maincent@bootlin.com> Closes: https://lore.kernel.org/netdev/20241017180551.1259bf5c@kmaincent-XPS-13-7390/ Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Tested-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20241018090630.22212-1-donald.hunter@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org> Date: Mon Nov 18 03:15:17 2024 -0800 netpoll: Use rcu_access_pointer() in __netpoll_setup [ Upstream commit c69c5e10adb903ae2438d4f9c16eccf43d1fcbc1 ] The ndev->npinfo pointer in __netpoll_setup() is RCU-protected but is being accessed directly for a NULL check. While no RCU read lock is held in this context, we should still use proper RCU primitives for consistency and correctness. Replace the direct NULL check with rcu_access_pointer(), which is the appropriate primitive when only checking for NULL without dereferencing the pointer. This function provides the necessary ordering guarantees without requiring RCU read-side protection. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20241118-netpoll_rcu-v1-1-a1888dcb4a02@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ryusuke Konishi <konishi.ryusuke@gmail.com> Date: Wed Nov 20 02:23:37 2024 +0900 nilfs2: fix potential out-of-bounds memory access in nilfs_find_entry() commit 985ebec4ab0a28bb5910c3b1481a40fbf7f9e61d upstream. Syzbot reported that when searching for records in a directory where the inode's i_size is corrupted and has a large value, memory access outside the folio/page range may occur, or a use-after-free bug may be detected if KASAN is enabled. This is because nilfs_last_byte(), which is called by nilfs_find_entry() and others to calculate the number of valid bytes of directory data in a page from i_size and the page index, loses the upper 32 bits of the 64-bit size information due to an inappropriate type of local variable to which the i_size value is assigned. This caused a large byte offset value due to underflow in the end address calculation in the calling nilfs_find_entry(), resulting in memory access that exceeds the folio/page size. Fix this issue by changing the type of the local variable causing the bit loss from "unsigned int" to "u64". The return value of nilfs_last_byte() is also of type "unsigned int", but it is truncated so as not to exceed PAGE_SIZE and no bit loss occurs, so no change is required. Link: https://lkml.kernel.org/r/20241119172403.9292-1-konishi.ryusuke@gmail.com Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+96d5d14c47d97015c624@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=96d5d14c47d97015c624 Tested-by: syzbot+96d5d14c47d97015c624@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Marcelo Dalmas <marcelo.dalmas@ge.com> Date: Mon Nov 25 12:16:09 2024 +0000 ntp: Remove invalid cast in time offset math commit f5807b0606da7ac7c1b74a386b22134ec7702d05 upstream. Due to an unsigned cast, adjtimex() returns the wrong offest when using ADJ_MICRO and the offset is negative. In this case a small negative offset returns approximately 4.29 seconds (~ 2^32/1000 milliseconds) due to the unsigned cast of the negative offset. This cast was added when the kernel internal struct timex was changed to use type long long for the time offset value to address the problem of a 64bit/32bit division on 32bit systems. The correct cast would have been (s32), which is correct as time_offset can only be in the range of [INT_MIN..INT_MAX] because the shift constant used for calculating it is 32. But that's non-obvious. Remove the cast and use div_s64() to cure the issue. [ tglx: Fix white space damage, use div_s64() and amend the change log ] Fixes: ead25417f82e ("timex: use __kernel_timex internally") Signed-off-by: Marcelo Dalmas <marcelo.dalmas@ge.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/SJ0P101MB03687BF7D5A10FD3C49C51E5F42E2@SJ0P101MB0368.NAMP101.PROD.OUTLOOK.COM Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yi Yang <yiyang13@huawei.com> Date: Fri Nov 8 08:55:26 2024 +0000 nvdimm: rectify the illogical code within nd_dax_probe() [ Upstream commit b61352101470f8b68c98af674e187cfaa7c43504 ] When nd_dax is NULL, nd_pfn is consequently NULL as well. Nevertheless, it is inadvisable to perform pointer arithmetic or address-taking on a NULL pointer. Introduce the nd_dax_devinit() function to enhance the code's logic and improve its readability. Signed-off-by: Yi Yang <yiyang13@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20241108085526.527957-1-yiyang13@huawei.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maurizio Lombardi <mlombard@redhat.com> Date: Fri Nov 29 15:17:06 2024 +0100 nvme-fabrics: handle zero MAXCMD without closing the connection [ Upstream commit 88c23a32b851e36adc4ab36f796d9b711f47e2bb ] The NVMe specification states that MAXCMD is mandatory for NVMe-over-Fabrics implementations. However, some NVMe/TCP and NVMe/FC arrays from major vendors have buggy firmware that reports MAXCMD as zero in the Identify Controller data structure. Currently, the implementation closes the connection in such cases, completely preventing the host from connecting to the target. Fix the issue by printing a clear error message about the firmware bug and allowing the connection to proceed. It assumes that the target supports a MAXCMD value of SQSIZE + 1. If any issues arise, the user can manually adjust SQSIZE to mitigate them. Fixes: 4999568184e5 ("nvme-fabrics: check max outstanding commands") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chunguang.xu <chunguang.xu@shopee.com> Date: Tue Dec 3 11:34:41 2024 +0800 nvme-rdma: unquiesce admin_q before destroy it [ Upstream commit 5858b687559809f05393af745cbadf06dee61295 ] Kernel will hang on destroy admin_q while we create ctrl failed, such as following calltrace: PID: 23644 TASK: ff2d52b40f439fc0 CPU: 2 COMMAND: "nvme" #0 [ff61d23de260fb78] __schedule at ffffffff8323bc15 #1 [ff61d23de260fc08] schedule at ffffffff8323c014 #2 [ff61d23de260fc28] blk_mq_freeze_queue_wait at ffffffff82a3dba1 #3 [ff61d23de260fc78] blk_freeze_queue at ffffffff82a4113a #4 [ff61d23de260fc90] blk_cleanup_queue at ffffffff82a33006 #5 [ff61d23de260fcb0] nvme_rdma_destroy_admin_queue at ffffffffc12686ce #6 [ff61d23de260fcc8] nvme_rdma_setup_ctrl at ffffffffc1268ced #7 [ff61d23de260fd28] nvme_rdma_create_ctrl at ffffffffc126919b #8 [ff61d23de260fd68] nvmf_dev_write at ffffffffc024f362 #9 [ff61d23de260fe38] vfs_write at ffffffff827d5f25 RIP: 00007fda7891d574 RSP: 00007ffe2ef06958 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 000055e8122a4d90 RCX: 00007fda7891d574 RDX: 000000000000012b RSI: 000055e8122a4d90 RDI: 0000000000000004 RBP: 00007ffe2ef079c0 R8: 000000000000012b R9: 000055e8122a4d90 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004 R13: 000055e8122923c0 R14: 000000000000012b R15: 00007fda78a54500 ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b This due to we have quiesced admi_q before cancel requests, but forgot to unquiesce before destroy it, as a result we fail to drain the pending requests, and hang on blk_mq_freeze_queue_wait() forever. Here try to reuse nvme_rdma_teardown_admin_queue() to fix this issue and simplify the code. Fixes: 958dc1d32c80 ("nvme-rdma: add clean action for failed reconnection") Reported-by: Yingfu.zhou <yingfu.zhou@shopee.com> Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com> Signed-off-by: Yue.zhao <yue.zhao@shopee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chunguang.xu <chunguang.xu@shopee.com> Date: Tue Dec 3 11:34:40 2024 +0800 nvme-tcp: fix the memleak while create new ctrl failed [ Upstream commit fec55c29e54d3ca6fe9d7d7d9266098b4514fd34 ] Now while we create new ctrl failed, we have not free the tagset occupied by admin_q, here try to fix it. Fixes: fd1418de10b9 ("nvme-tcp: avoid open-coding nvme_tcp_teardown_admin_queue()") Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christoph Hellwig <hch@lst.de> Date: Wed Nov 27 07:42:18 2024 +0100 nvme: don't apply NVME_QUIRK_DEALLOCATE_ZEROES when DSM is not supported [ Upstream commit 58a0c875ce028678c9594c7bdf3fe33462392808 ] Commit 63dfa1004322 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard") started applying the NVME_QUIRK_DEALLOCATE_ZEROES quirk even then the Dataset Management is not supported. It turns out that there versions of these old Intel SSDs that have DSM support disabled in the firmware, which will now lead to errors everytime a Write Zeroes command is issued. Fix this by checking for DSM support before applying the quirk. Reported-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com> Fixes: 63dfa1004322 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard") Tested-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Sat Nov 23 22:28:34 2024 +0900 ocfs2: free inode when ocfs2_get_init_inode() fails [ Upstream commit 965b5dd1894f4525f38c1b5f99b0106a07dbb5db ] syzbot is reporting busy inodes after unmount, for commit 9c89fe0af826 ("ocfs2: Handle error from dquot_initialize()") forgot to call iput() when new_inode() succeeded and dquot_initialize() failed. Link: https://lkml.kernel.org/r/e68c0224-b7c6-4784-b4fa-a9fc8c675525@I-love.SAKURA.ne.jp Fixes: 9c89fe0af826 ("ocfs2: Handle error from dquot_initialize()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot+0af00f6a2cba2058b5db@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0af00f6a2cba2058b5db Tested-by: syzbot+0af00f6a2cba2058b5db@syzkaller.appspotmail.com Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Heming Zhao <heming.zhao@suse.com> Date: Thu Dec 12 19:31:05 2024 +0800 ocfs2: Revert "ocfs2: fix the la space leak when unmounting an ocfs2 volume" This reverts commit dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume"). In commit dfe6c5692fb5, the commit log "This bug has existed since the initial OCFS2 code." is wrong. The correct introduction commit is 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()"). The influence of commit dfe6c5692fb5 is that it provides a correct fix for the latest kernel. however, it shouldn't be pushed to stable branches. Let's use this commit to revert all branches that include dfe6c5692fb5 and use a new fix method to fix commit 30dd3478c3cd. Fixes: dfe6c5692fb5 ("ocfs2: fix the la space leak when unmounting an ocfs2 volume") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Wengang Wang <wen.gang.wang@oracle.com> Date: Tue Nov 19 09:45:00 2024 -0800 ocfs2: update seq_file index in ocfs2_dlm_seq_next commit 914eec5e980171bc128e7e24f7a22aa1d803570e upstream. The following INFO level message was seen: seq_file: buggy .next function ocfs2_dlm_seq_next [ocfs2] did not update position index Fix: Update *pos (so m->index) to make seq_read_iter happy though the index its self makes no sense to ocfs2_dlm_seq_next. Link: https://lkml.kernel.org/r/20241119174500.9198-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Keith Busch <kbusch@kernel.org> Date: Fri Oct 25 15:27:54 2024 -0700 PCI: Add 'reset_subordinate' to reset hierarchy below bridge [ Upstream commit 2fa046449a82a7d0f6d9721dd83e348816038444 ] The "bus" and "cxl_bus" reset methods reset a device by asserting Secondary Bus Reset on the bridge leading to the device. These only work if the device is the only device below the bridge. Add a sysfs 'reset_subordinate' attribute on bridges that can assert Secondary Bus Reset regardless of how many devices are below the bridge. This resets all the devices below a bridge in a single command, including the locking and config space save/restore that reset methods normally do. This may be the only way to reset devices that don't support other reset methods (ACPI, FLR, PM reset, etc). Link: https://lore.kernel.org/r/20241025222755.3756162-1-kbusch@meta.com Signed-off-by: Keith Busch <kbusch@kernel.org> [bhelgaas: commit log, add capable(CAP_SYS_ADMIN) check] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mengyuan Lou <mengyuanlou@net-swift.com> Date: Fri Nov 15 10:46:04 2024 +0800 PCI: Add ACS quirk for Wangxun FF5xxx NICs [ Upstream commit aa46a3736afcb7b0793766d22479b8b99fc1b322 ] Wangxun FF5xxx NICs are similar to SFxxx, RP1000 and RP2000 NICs. They may be multi-function devices, but they do not advertise an ACS capability. But the hardware does isolate FF5xxx functions as though it had an ACS capability and PCI_ACS_RR and PCI_ACS_CR were set in the ACS Control register, i.e., all peer-to-peer traffic is directed upstream instead of being routed internally. Add ACS quirk for FF5xxx NICs in pci_quirk_wangxun_nic_acs() so the functions can be in independent IOMMU groups. Link: https://lore.kernel.org/r/E16053DB2B80E9A5+20241115024604.30493-1-mengyuanlou@net-swift.com Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Esther Shimanovich <eshimanovich@chromium.org> Date: Tue Sep 10 17:57:45 2024 +0000 PCI: Detect and trust built-in Thunderbolt chips [ Upstream commit 3b96b895127b7c0aed63d82c974b46340e8466c1 ] Some computers with CPUs that lack Thunderbolt features use discrete Thunderbolt chips to add Thunderbolt functionality. These Thunderbolt chips are located within the chassis; between the Root Port labeled ExternalFacingPort and the USB-C port. These Thunderbolt PCIe devices should be labeled as fixed and trusted, as they are built into the computer. Otherwise, security policies that rely on those flags may have unintended results, such as preventing USB-C ports from enumerating. Detect the above scenario through the process of elimination. 1) Integrated Thunderbolt host controllers already have Thunderbolt implemented, so anything outside their external facing Root Port is removable and untrusted. Detect them using the following properties: - Most integrated host controllers have the "usb4-host-interface" ACPI property, as described here: https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#mapping-native-protocols-pcie-displayport-tunneled-through-usb4-to-usb4-host-routers - Integrated Thunderbolt PCIe Root Ports before Alder Lake do not have the "usb4-host-interface" ACPI property. Identify those by their PCI IDs instead. 2) If a Root Port does not have integrated Thunderbolt capabilities, but has the "ExternalFacingPort" ACPI property, that means the manufacturer has opted to use a discrete Thunderbolt host controller that is built into the computer. This host controller can be identified by virtue of being located directly below an external-facing Root Port that lacks integrated Thunderbolt. Label it as trusted and fixed. Everything downstream from it is untrusted and removable. The "ExternalFacingPort" ACPI property is described here: https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports Link: https://lore.kernel.org/r/20240910-trust-tbt-fix-v5-1-7a7a42a5f496@chromium.org Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Esther Shimanovich <eshimanovich@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: devi priya <quic_devipriy@quicinc.com> Date: Thu Aug 1 11:18:03 2024 +0530 PCI: qcom: Add support for IPQ9574 [ Upstream commit a63b74f2e35be3829f256922037ae5cee6bb844a ] Add the new IPQ9574 platform which is based on the Qcom IP rev. 1.27.0 and Synopsys IP rev. 5.80a. The platform itself has four PCIe Gen3 controllers: two single-lane and two dual-lane, all are based on Synopsys IP rev. 5.70a. As such, reuse all the members of 'ops_2_9_0'. Link: https://lore.kernel.org/r/20240801054803.3015572-5-quic_srichara@quicinc.com Co-developed-by: Anusha Rao <quic_anusha@quicinc.com> Signed-off-by: Anusha Rao <quic_anusha@quicinc.com> Signed-off-by: devi priya <quic_devipriy@quicinc.com> Signed-off-by: Sricharan Ramabadhran <quic_srichara@quicinc.com> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mayank Rana <quic_mrana@quicinc.com> Date: Mon Nov 11 14:11:52 2024 +0530 PCI: starfive: Enable controller runtime PM before probing host bridge [ Upstream commit 6168efbebace0db443185d4c6701ca8170a8788d ] A PCI controller device, e.g., StarFive, is parent to PCI host bridge device. We must enable runtime PM of the controller before enabling runtime PM of the host bridge, which will happen in pci_host_probe(), to avoid this warning: pcie-starfive 940000000.pcie: Enabling runtime PM for inactive device with active children Fix this issue by enabling StarFive controller device's runtime PM before calling pci_host_probe() in plda_pcie_host_init(). Link: https://lore.kernel.org/r/20241111-runtime_pm-v7-1-9c164eefcd87@quicinc.com Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Mayank Rana <quic_mrana@quicinc.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nirmal Patel <nirmal.patel@linux.ntel.com> Date: Fri Oct 11 10:56:57 2024 -0700 PCI: vmd: Add DID 8086:B06F and 8086:B60B for Intel client SKUs [ Upstream commit b727484cace4be22be9321cc0bc9487648ba447b ] Add support for this VMD device which supports the bus restriction mode. The feature that turns off vector 0 for MSI-X remapping is also enabled. Link: https://lore.kernel.org/r/20241011175657.249948-1-nirmal.patel@linux.intel.com Signed-off-by: Nirmal Patel <nirmal.patel@linux.ntel.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jian-Hong Pan <jhp@endlessos.org> Date: Tue Oct 1 16:34:38 2024 +0800 PCI: vmd: Set devices to D0 before enabling PM L1 Substates [ Upstream commit d66041063192497a4a97d21dbf86b79a03a7f4fb ] The remapped PCIe Root Port and the child device have PM L1 Substates capability, but they are disabled originally. Here is a failed example on ASUS B1400CEAE: Capabilities: [900 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+ PortCommonModeRestoreTime=32us PortTPowerOnTime=10us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=101376ns L1SubCtl2: T_PwrOn=50us Enable PCI-PM L1 PM Substates for devices below VMD while they are in D0 (see PCIe r6.0, sec 5.5.4). Link: https://lore.kernel.org/r/20241001083438.10070-4-jhp@endlessos.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394 Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org> Date: Tue Oct 1 07:10:19 2024 -0700 perf/x86/amd: Warn only on new bits set [ Upstream commit de20037e1b3c2f2ca97b8c12b8c7bca8abd509a7 ] Warning at every leaking bits can cause a flood of message, triggering various stall-warning mechanisms to fire, including CSD locks, which makes the machine to be unusable. Track the bits that are being leaked, and only warn when a new bit is set. That said, this patch will help with the following issues: 1) It will tell us which bits are being set, so, it is easy to communicate it back to vendor, and to do a root-cause analyzes. 2) It avoid the machine to be unusable, because, worst case scenario, the user gets less than 60 WARNs (one per unhandled bit). Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sandipan Das <sandipan.das@amd.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lkml.kernel.org/r/20241001141020.2620361-1-leitao@debian.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Esben Haabendal <esben@geanix.com> Date: Thu Oct 3 11:23:09 2024 +0200 pinctrl: freescale: fix COMPILE_TEST error with PINCTRL_IMX_SCU [ Upstream commit 58414a31c5713afb5449fd74a26a843d34cc62e8 ] When PINCTRL_IMX_SCU was selected by PINCTRL_IMX8DXL or PINCTRL_IMX8QM combined with COMPILE_TEST on a non-arm platforms, the IMX_SCU dependency could not be enabled. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410031439.GyTSa0kX-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202410030852.q0Hukplf-lkp@intel.com/ Signed-off-by: Esben Haabendal <esben@geanix.com> Link: https://lore.kernel.org/20241003-imx-pinctrl-compile-test-fix-v1-1-145ca1948cc3@geanix.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Barnabás Czémán <barnabas.czeman@mainlining.org> Date: Thu Oct 31 02:19:43 2024 +0100 pinctrl: qcom-pmic-gpio: add support for PM8937 [ Upstream commit 89265a58ff24e3885c2c9ca722bc3aaa47018be9 ] PM8937 has 8 GPIO-s with holes on GPIO3, GPIO4 and GPIO6. Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/20241031-msm8917-v2-2-8a075faa89b1@mainlining.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Barnabás Czémán <barnabas.czeman@mainlining.org> Date: Thu Oct 31 02:19:45 2024 +0100 pinctrl: qcom: spmi-mpp: Add PM8937 compatible [ Upstream commit f755261190e88f5d19fe0a3b762f0bbaff6bd438 ] The PM8937 provides 4 MPPs. Add a compatible to support them. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Link: https://lore.kernel.org/20241031-msm8917-v2-4-8a075faa89b1@mainlining.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mukesh Ojha <quic_mojha@quicinc.com> Date: Tue Oct 15 00:59:30 2024 +0530 pinmux: Use sequential access to access desc->pinmux data [ Upstream commit 5a3e85c3c397c781393ea5fb2f45b1f60f8a4e6e ] When two client of the same gpio call pinctrl_select_state() for the same functionality, we are seeing NULL pointer issue while accessing desc->mux_owner. Let's say two processes A, B executing in pin_request() for the same pin and process A updates the desc->mux_usecount but not yet updated the desc->mux_owner while process B see the desc->mux_usecount which got updated by A path and further executes strcmp and while accessing desc->mux_owner it crashes with NULL pointer. Serialize the access to mux related setting with a mutex lock. cpu0 (process A) cpu1(process B) pinctrl_select_state() { pinctrl_select_state() { pin_request() { pin_request() { ... .... } else { desc->mux_usecount++; desc->mux_usecount && strcmp(desc->mux_owner, owner)) { if (desc->mux_usecount > 1) return 0; desc->mux_owner = owner; } } Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> Link: https://lore.kernel.org/20241014192930.1539673-1-quic_mojha@quicinc.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Armin Wolf <W_Armin@gmx.de> Date: Sun Nov 24 18:19:41 2024 +0100 platform/x86: asus-wmi: Ignore return value when writing thermal policy [ Upstream commit 25fb5f47f34d90aceda2c47a4230315536e97fa8 ] On some machines like the ASUS Vivobook S14 writing the thermal policy returns the currently writen thermal policy instead of an error code. Ignore the return code to avoid falsely returning an error when the thermal policy was written successfully. Reported-by: auslands-kv@gmx.de Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219517 Fixes: 2daa86e78c49 ("platform/x86: asus_wmi: Support throttle thermal policy") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241124171941.29789-1-W_Armin@gmx.de Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ulf Hansson <ulf.hansson@linaro.org> Date: Fri Nov 22 14:42:02 2024 +0100 pmdomain: core: Add missing put_device() [ Upstream commit b8f7bbd1f4ecff6d6277b8c454f62bb0a1c6dbe4 ] When removing a genpd we don't clean up the genpd->dev correctly. Let's add the missing put_device() in genpd_free_data() to fix this. Fixes: 401ea1572de9 ("PM / Domain: Add struct device to genpd") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Message-ID: <20241122134207.157283-2-ulf.hansson@linaro.org> Stable-dep-of: 3e3b71d35a02 ("pmdomain: core: Fix error path in pm_genpd_init() when ida alloc fails") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ulf Hansson <ulf.hansson@linaro.org> Date: Fri Nov 22 14:42:03 2024 +0100 pmdomain: core: Fix error path in pm_genpd_init() when ida alloc fails [ Upstream commit 3e3b71d35a02cee4b2cc3d4255668a6609165518 ] When the ida allocation fails we need to free up the previously allocated memory before returning the error code. Let's fix this and while at it, let's also move the ida allocation to genpd_alloc_data() and the freeing to genpd_free_data(), as it better belongs there. Fixes: 899f44531fe6 ("pmdomain: core: Add GENPD_FLAG_DEV_NAME_FW flag") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Message-ID: <20241122134207.157283-3-ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com> Date: Thu Nov 21 15:52:31 2024 +0800 pmdomain: imx: gpcv2: Adjust delay after power up handshake commit 2379fb937de5333991c567eefd7d11b98977d059 upstream. The udelay(5) is not enough, sometimes below kernel panic still be triggered: [ 4.012973] Kernel panic - not syncing: Asynchronous SError Interrupt [ 4.012976] CPU: 2 UID: 0 PID: 186 Comm: (udev-worker) Not tainted 6.12.0-rc2-0.0.0-devel-00004-g8b1b79e88956 #1 [ 4.012982] Hardware name: Toradex Verdin iMX8M Plus WB on Dahlia Board (DT) [ 4.012985] Call trace: [...] [ 4.013029] arm64_serror_panic+0x64/0x70 [ 4.013034] do_serror+0x3c/0x70 [ 4.013039] el1h_64_error_handler+0x30/0x54 [ 4.013046] el1h_64_error+0x64/0x68 [ 4.013050] clk_imx8mp_audiomix_runtime_resume+0x38/0x48 [ 4.013059] __genpd_runtime_resume+0x30/0x80 [ 4.013066] genpd_runtime_resume+0x114/0x29c [ 4.013073] __rpm_callback+0x48/0x1e0 [ 4.013079] rpm_callback+0x68/0x80 [ 4.013084] rpm_resume+0x3bc/0x6a0 [ 4.013089] __pm_runtime_resume+0x50/0x9c [ 4.013095] pm_runtime_get_suppliers+0x60/0x8c [ 4.013101] __driver_probe_device+0x4c/0x14c [ 4.013108] driver_probe_device+0x3c/0x120 [ 4.013114] __driver_attach+0xc4/0x200 [ 4.013119] bus_for_each_dev+0x7c/0xe0 [ 4.013125] driver_attach+0x24/0x30 [ 4.013130] bus_add_driver+0x110/0x240 [ 4.013135] driver_register+0x68/0x124 [ 4.013142] __platform_driver_register+0x24/0x30 [ 4.013149] sdma_driver_init+0x20/0x1000 [imx_sdma] [ 4.013163] do_one_initcall+0x60/0x1e0 [ 4.013168] do_init_module+0x5c/0x21c [ 4.013175] load_module+0x1a98/0x205c [ 4.013181] init_module_from_file+0x88/0xd4 [ 4.013187] __arm64_sys_finit_module+0x258/0x350 [ 4.013194] invoke_syscall.constprop.0+0x50/0xe0 [ 4.013202] do_el0_svc+0xa8/0xe0 [ 4.013208] el0_svc+0x3c/0x140 [ 4.013215] el0t_64_sync_handler+0x120/0x12c [ 4.013222] el0t_64_sync+0x190/0x194 [ 4.013228] SMP: stopping secondary CPUs The correct way is to wait handshake, but it needs BUS clock of BLK-CTL be enabled, which is in separate driver. So delay is the only option here. The udelay(10) is a data got by experiment. Fixes: e8dc41afca16 ("pmdomain: imx: gpcv2: Add delay after power up handshake") Reported-by: Francesco Dolcini <francesco@dolcini.it> Closes: https://lore.kernel.org/lkml/20241007132555.GA53279@francesco-nb/ Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Cc: stable@vger.kernel.org Message-ID: <20241121075231.3910922-1-shengjiu.wang@nxp.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Michael Ellerman <mpe@ellerman.id.au> Date: Tue Nov 26 13:57:10 2024 +1100 powerpc/prom_init: Fixup missing powermac #size-cells [ Upstream commit cf89c9434af122f28a3552e6f9cc5158c33ce50a ] On some powermacs `escc` nodes are missing `#size-cells` properties, which is deprecated and now triggers a warning at boot since commit 045b14ca5c36 ("of: WARN on deprecated #address-cells/#size-cells handling"). For example: Missing '#size-cells' in /pci@f2000000/mac-io@c/escc@13000 WARNING: CPU: 0 PID: 0 at drivers/of/base.c:133 of_bus_n_size_cells+0x98/0x108 Hardware name: PowerMac3,1 7400 0xc0209 PowerMac ... Call Trace: of_bus_n_size_cells+0x98/0x108 (unreliable) of_bus_default_count_cells+0x40/0x60 __of_get_address+0xc8/0x21c __of_address_to_resource+0x5c/0x228 pmz_init_port+0x5c/0x2ec pmz_probe.isra.0+0x144/0x1e4 pmz_console_init+0x10/0x48 console_init+0xcc/0x138 start_kernel+0x5c4/0x694 As powermacs boot via prom_init it's possible to add the missing properties to the device tree during boot, avoiding the warning. Note that `escc-legacy` nodes are also missing `#size-cells` properties, but they are skipped by the macio driver, so leave them alone. Depends-on: 045b14ca5c36 ("of: WARN on deprecated #address-cells/#size-cells handling") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241126025710.591683-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ajay Kaher <ajay.kaher@broadcom.com> Date: Mon Nov 25 10:59:54 2024 +0000 ptp: Add error handling for adjfine callback in ptp_clock_adjtime [ Upstream commit 98337d7c87577ded71114f6976edb70a163e27bc ] ptp_clock_adjtime sets ptp->dialed_frequency even when adjfine callback returns an error. This causes subsequent reads to return an incorrect value. Fix this by adding error check before ptp->dialed_frequency is set. Fixes: 39a8cbd9ca05 ("ptp: remember the adjusted frequency") Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://patch.msgid.link/20241125105954.1509971-1-ajay.kaher@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Heiner Kallweit <hkallweit1@gmail.com> Date: Wed Oct 9 07:44:23 2024 +0200 r8169: don't apply UDP padding quirk on RTL8126A [ Upstream commit 87e26448dbda4523b73a894d96f0f788506d3795 ] Vendor drivers r8125/r8126 indicate that this quirk isn't needed any longer for RTL8126A. Mimic this in r8169. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/d1317187-aa81-4a69-b831-678436e4de62@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cosmin Tanislav <demonsingur@gmail.com> Date: Thu Nov 28 15:16:23 2024 +0200 regmap: detach regmap from dev on regmap_exit commit 3061e170381af96d1e66799d34264e6414d428a7 upstream. At the end of __regmap_init(), if dev is not NULL, regmap_attach_dev() is called, which adds a devres reference to the regmap, to be able to retrieve a dev's regmap by name using dev_get_regmap(). When calling regmap_exit, the opposite does not happen, and the reference is kept until the dev is detached. Add a regmap_detach_dev() function and call it in regmap_exit() to make sure that the devres reference is not kept. Cc: stable@vger.kernel.org Fixes: 72b39f6f2b5a ("regmap: Implement dev_get_regmap()") Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com> Rule: add Link: https://lore.kernel.org/stable/20241128130554.362486-1-demonsingur%40gmail.com Link: https://patch.msgid.link/20241128131625.363835-1-demonsingur@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Date: Thu Oct 31 18:37:04 2024 +0200 regmap: maple: Provide lockdep (sub)class for maple tree's internal lock [ Upstream commit 1ed9b927e7dd8b8cff13052efe212a8ff72ec51d ] In some cases when using the maple tree register cache, the lockdep validator might complain about invalid deadlocks: [7.131886] Possible interrupt unsafe locking scenario: [7.131890] CPU0 CPU1 [7.131893] ---- ---- [7.131896] lock(&mt->ma_lock); [7.131904] local_irq_disable(); [7.131907] lock(rockchip_drm_vop2:3114:(&vop2_regmap_config)->lock); [7.131916] lock(&mt->ma_lock); [7.131925] <Interrupt> [7.131928] lock(rockchip_drm_vop2:3114:(&vop2_regmap_config)->lock); [7.131936] *** DEADLOCK *** [7.131939] no locks held by swapper/0/0. [7.131944] the shortest dependencies between 2nd lock and 1st lock: [7.131950] -> (&mt->ma_lock){+.+.}-{2:2} { [7.131966] HARDIRQ-ON-W at: [7.131973] lock_acquire+0x200/0x330 [7.131986] _raw_spin_lock+0x50/0x70 [7.131998] regcache_maple_write+0x68/0xe0 [7.132010] regcache_write+0x6c/0x90 [7.132019] _regmap_read+0x19c/0x1d0 [7.132029] _regmap_update_bits+0xc0/0x148 [7.132038] regmap_update_bits_base+0x6c/0xa8 [7.132048] rk8xx_probe+0x22c/0x3d8 [7.132057] rk8xx_spi_probe+0x74/0x88 [7.132065] spi_probe+0xa8/0xe0 [...] [7.132675] } [7.132678] ... key at: [<ffff800082943c20>] __key.0+0x0/0x10 [7.132691] ... acquired at: [7.132695] _raw_spin_lock+0x50/0x70 [7.132704] regcache_maple_write+0x68/0xe0 [7.132714] regcache_write+0x6c/0x90 [7.132724] _regmap_read+0x19c/0x1d0 [7.132732] _regmap_update_bits+0xc0/0x148 [7.132741] regmap_field_update_bits_base+0x74/0xb8 [7.132751] vop2_plane_atomic_update+0x480/0x14d8 [rockchipdrm] [7.132820] drm_atomic_helper_commit_planes+0x1a0/0x320 [drm_kms_helper] [...] [7.135112] -> (rockchip_drm_vop2:3114:(&vop2_regmap_config)->lock){-...}-{2:2} { [7.135130] IN-HARDIRQ-W at: [7.135136] lock_acquire+0x200/0x330 [7.135147] _raw_spin_lock_irqsave+0x6c/0x98 [7.135157] regmap_lock_spinlock+0x20/0x40 [7.135166] regmap_read+0x44/0x90 [7.135175] vop2_isr+0x90/0x290 [rockchipdrm] [7.135225] __handle_irq_event_percpu+0x124/0x2d0 In the example above, the validator seems to get the scope of dependencies wrong, since the regmap instance used in rk8xx-spi driver has nothing to do with the instance from vop2. Improve validation by sharing the regmap's lockdep class with the maple tree's internal lock, while also providing a subclass for the latter. Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://patch.msgid.link/20241031-regmap-maple-lockdep-fix-v2-1-06a3710f3623@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Melody Olvera <quic_molvera@quicinc.com> Date: Mon Nov 11 16:26:45 2024 -0800 regulator: qcom-rpmh: Update ranges for FTSMPS525 [ Upstream commit eeecf953d697cb7f0d916f9908a2b9f451bb2667 ] All FTSMPS525 regulators support LV and MV ranges; however, the boot loader firmware will determine which range to use as the device boots. Nonetheless, the driver cannot determine which range was selected, so hardcoding the ranges as either LV or MV will not cover all cases as it's possible for the firmware to select a range not supported by the driver's current hardcoded values. To this end, combine the ranges for the FTSMPS525s into one struct and point all regulators to the updated combined struct. This should work on all boards regardless of which range is selected by the firmware and more accurately caputres the capability of this regulator on a hardware level. Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patch.msgid.link/20241112002645.2803506-1-quic_molvera@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Sun Oct 27 01:09:45 2024 +0300 remoteproc: qcom: pas: enable SAR2130P audio DSP support [ Upstream commit 009e288c989b3fe548a45c82da407d7bd00418a9 ] Enable support for the Audio DSP on the Qualcomm SAR2130P platform, reusing the SM8350 resources. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-adsp-v1-3-bd204e39d24e@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com> Date: Fri Nov 8 09:34:46 2024 -0500 Revert "drm/amd/display: parse umc_info or vram_info based on ASIC" commit 3c2296b1eec55b50c64509ba15406142d4a958dc upstream. This reverts commit 2551b4a321a68134360b860113dd460133e856e5. This was not the root cause. Revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Nilay Shroff <nilay@linux.ibm.com> Date: Tue Nov 5 11:42:08 2024 +0530 Revert "nvme: make keep-alive synchronous operation" [ Upstream commit 84488282166de6b6760ada8030e87aaa08bce3aa ] This reverts commit d06923670b5a5f609603d4a9fee4dec02d38de9c. It was realized that the fix implemented to contain the race condition among the keep alive task and the fabric shutdown code path in the commit d06923670b5ia ("nvme: make keep-alive synchronous operation") is not optimal. The reason being keep-alive runs under the workqueue and making it synchronous would waste a workqueue context. Furthermore, we later found that the above race condition is a regression caused due to the changes implemented in commit a54a93d0e359 ("nvme: move stopping keep-alive into nvme_uninit_ctrl()"). So we decided to revert the commit d06923670b5a ("nvme: make keep-alive synchronous operation") and then fix the regression. Link: https://lore.kernel.org/all/196f4013-3bbf-43ff-98b4-9cb2a96c20c2@grimberg.me/ Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jan Kara <jack@suse.cz> Date: Tue Nov 26 15:52:08 2024 +0100 Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()" commit a220d6b95b1ae12c7626283d7609f0a1438e6437 upstream. This reverts commit 7c877586da3178974a8a94577b6045a48377ff25. Anders and Philippe have reported that recent kernels occasionally hang when used with NFS in readahead code. The problem has been bisected to 7c877586da3 ("readahead: properly shorten readahead when falling back to do_page_cache_ra()"). The cause of the problem is that ra->size can be shrunk by read_pages() call and subsequently we end up calling do_page_cache_ra() with negative (read huge positive) number of pages. Let's revert 7c877586da3 for now until we can find a proper way how the logic in read_pages() and page_cache_ra_order() can coexist. This can lead to reduced readahead throughput due to readahead window confusion but that's better than outright hangs. Link: https://lkml.kernel.org/r/20241126145208.985-1-jack@suse.cz Fixes: 7c877586da31 ("readahead: properly shorten readahead when falling back to do_page_cache_ra()") Reported-by: Anders Blomdell <anders.blomdell@gmail.com> Reported-by: Philippe Troin <phil@fifi.org> Signed-off-by: Jan Kara <jack@suse.cz> Tested-by: Philippe Troin <phil@fifi.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Fernando Fernandez Mancera <ffmancera@riseup.net> Date: Mon Dec 2 15:56:08 2024 +0000 Revert "udp: avoid calling sock_def_readable() if possible" [ Upstream commit 3d501f562f63b290351169e3e9931ffe3d57b2ae ] This reverts commit 612b1c0dec5bc7367f90fc508448b8d0d7c05414. On a scenario with multiple threads blocking on a recvfrom(), we need to call sock_def_readable() on every __udp_enqueue_schedule_skb() otherwise the threads won't be woken up as __skb_wait_for_more_packets() is using prepare_to_wait_exclusive(). Link: https://bugzilla.redhat.com/2308477 Fixes: 612b1c0dec5b ("udp: avoid calling sock_def_readable() if possible") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241202155620.1719-1-ffmancera@riseup.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed Dec 11 14:11:23 2024 -0800 Revert "unicode: Don't special case ignorable code points" [ Upstream commit 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b ] This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91. It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior. So now you can't look up those names, because they hash to something different. Of course, it's also entirely possible that in the meantime people have created *new* files with the new ("more correct") case folding logic, and reverting will just make other things break. The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug. Reported-by: Qi Han <hanqi@vivo.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Cc: Gabriel Krisman Bertazi <krisman@suse.de> Requested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Pavlu <petr.pavlu@suse.com> Date: Tue Oct 15 13:27:46 2024 +0200 ring-buffer: Limit time with disabled interrupts in rb_check_pages() [ Upstream commit b237e1f7d2273fdcffac20100b72c002bdd770dd ] The function rb_check_pages() validates the integrity of a specified per-CPU tracing ring buffer. It does so by traversing the underlying linked list and checking its next and prev links. To guarantee that the list isn't modified during the check, a caller typically needs to take cpu_buffer->reader_lock. This prevents the check from running concurrently, for example, with a potential reader which can make the list temporarily inconsistent when swapping its old reader page into the buffer. A problem with this approach is that the time when interrupts are disabled is non-deterministic, dependent on the ring buffer size. This particularly affects PREEMPT_RT because the reader_lock is a raw spinlock which doesn't become sleepable on PREEMPT_RT kernels. Modify the check so it still attempts to traverse the entire list, but gives up the reader_lock between checking individual pages. Introduce for this purpose a new variable ring_buffer_per_cpu.cnt which is bumped any time the list is modified. The value is used by rb_check_pages() to detect such a change and restart the check. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241015112810.27203-1-petr.pavlu@suse.com Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Antipov <dmantipov@yandex.ru> Date: Thu Nov 14 18:19:46 2024 +0300 rocker: fix link status detection in rocker_carrier_init() [ Upstream commit e64285ff41bb7a934bd815bd38f31119be62ac37 ] Since '1 << rocker_port->pport' may be undefined for port >= 32, cast the left operand to 'unsigned long long' like it's done in 'rocker_port_set_enable()' above. Compile tested only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://patch.msgid.link/20241114151946.519047-1-dmantipov@yandex.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com> Date: Fri Oct 25 13:14:57 2024 -0700 rtc: cmos: avoid taking rtc_lock for extended period of time [ Upstream commit 0a6efab33eab4e973db26d9f90c3e97a7a82e399 ] On my device reading entirety of /sys/devices/pnp0/00:03/cmos_nvram0/nvmem takes about 9 msec during which time interrupts are off on the CPU that does the read and the thread that performs the read can not be migrated or preempted by another higher priority thread (RT or not). Allow readers and writers be preempted by taking and releasing rtc_lock spinlock for each individual byte read or written rather than once per read/write request. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Link: https://lore.kernel.org/r/Zxv8QWR21AV4ztC5@google.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tomas Glozar <tglozar@redhat.com> Date: Fri Oct 11 14:10:15 2024 +0200 rtla/timerlat: Make timerlat_hist_cpu->*_count unsigned long long [ Upstream commit 76b3102148135945b013797fac9b206273f0f777 ] Do the same fix as in previous commit also for timerlat-hist. Link: https://lore.kernel.org/20241011121015.2868751-2-tglozar@redhat.com Reported-by: Attila Fazekas <afazekas@redhat.com> Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tomas Glozar <tglozar@redhat.com> Date: Fri Oct 11 14:10:14 2024 +0200 rtla/timerlat: Make timerlat_top_cpu->*_count unsigned long long [ Upstream commit 4eba4723c5254ba8251ecb7094a5078d5c300646 ] Most fields of struct timerlat_top_cpu are unsigned long long, but the fields {irq,thread,user}_count are int (32-bit signed). This leads to overflow when tracing on a large number of CPUs for a long enough time: $ rtla timerlat top -a20 -c 1-127 -d 12h ... 0 12:00:00 | IRQ Timer Latency (us) | Thread Timer Latency (us) CPU COUNT | cur min avg max | cur min avg max 1 #43200096 | 0 0 1 2 | 3 2 6 12 ... 127 #43200096 | 0 0 1 2 | 3 2 5 11 ALL #119144 e4 | 0 5 4 | 2 28 16 The average latency should be 0-1 for IRQ and 5-6 for thread, but is reported as 5 and 28, about 4 to 5 times more, due to the count overflowing when summed over all CPUs: 43200096 * 127 = 5486412192, however, 1191444898 (= 5486412192 mod MAX_INT) is reported instead, as seen on the last line of the output, and the averages are thus ~4.6 times higher than they should be (5486412192 / 1191444898 = ~4.6). Fix the issue by changing {irq,thread,user}_count fields to unsigned long long, similarly to other fields in struct timerlat_top_cpu and to the count variable in timerlat_top_print_sum. Link: https://lore.kernel.org/20241011121015.2868751-1-tglozar@redhat.com Reported-by: Attila Fazekas <afazekas@redhat.com> Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gabriele Monaco <gmonaco@redhat.com> Date: Thu Sep 26 16:34:17 2024 +0200 rtla: Fix consistency in getopt_long for timerlat_hist [ Upstream commit cfb1ea216c1656a4112becbc4bf757891933b902 ] Commit e9a4062e1527 ("rtla: Add --trace-buffer-size option") adds a new long option to rtla utilities, but among all affected files, timerlat_hist misses a trailing `:` in the corresponding short option inside the getopt string (e.g. `\3:`). This patch propagates the `:`. Although this change is not functionally required, it improves consistency and slightly reduces the likelihood a future change would introduce a problem. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/20240926143417.54039-1-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Miguel Ojeda <ojeda@kernel.org> Date: Sat Nov 16 19:15:37 2024 +0100 rust: allow `clippy::needless_lifetimes` commit 60fc1e6750133620e404d40b93df5afe32e3e6c6 upstream. In beta Clippy (i.e. Rust 1.83.0), the `needless_lifetimes` lint has been extended [1] to suggest eliding `impl` lifetimes, e.g. error: the following explicit lifetimes could be elided: 'a --> rust/kernel/list.rs:647:6 | 647 | impl<'a, T: ?Sized + ListItem<ID>, const ID: u64> FusedIterator for Iter<'a, T, ID> {} | ^^ ^^ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes = note: `-D clippy::needless-lifetimes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_lifetimes)]` help: elide the lifetimes | 647 - impl<'a, T: ?Sized + ListItem<ID>, const ID: u64> FusedIterator for Iter<'a, T, ID> {} 647 + impl<T: ?Sized + ListItem<ID>, const ID: u64> FusedIterator for Iter<'_, T, ID> {} A possibility would have been to clean them -- the RFC patch [2] did this, while asking if we wanted these cleanups. There is an open issue [3] in Clippy about being able to differentiate some of the new cases, e.g. those that do not involve introducing `'_`. Thus it seems others feel similarly. Thus, for the time being, we decided to `allow` the lint. Link: https://github.com/rust-lang/rust-clippy/pull/13286 [1] Link: https://lore.kernel.org/rust-for-linux/20241012231300.397010-1-ojeda@kernel.org/ [2] Link: https://github.com/rust-lang/rust-clippy/issues/13514 [3] Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://lore.kernel.org/r/20241116181538.369355-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Gary Guo <gary@garyguo.net> Date: Sun Sep 15 14:26:31 2024 +0100 rust: enable arbitrary_self_types and remove `Receiver` commit c95bbb59a9b22f9b838b15d28319185c1c884329 upstream. The term "receiver" means that a type can be used as the type of `self`, and thus enables method call syntax `foo.bar()` instead of `Foo::bar(foo)`. Stable Rust as of today (1.81) enables a limited selection of types (primitives and types in std, e.g. `Box` and `Arc`) to be used as receivers, while custom types cannot. We want the kernel `Arc` type to have the same functionality as the Rust std `Arc`, so we use the `Receiver` trait (gated behind `receiver_trait` unstable feature) to gain the functionality. The `arbitrary_self_types` RFC [1] (tracking issue [2]) is accepted and it will allow all types that implement a new `Receiver` trait (different from today's unstable trait) to be used as receivers. This trait will be automatically implemented for all `Deref` types, which include our `Arc` type, so we no longer have to opt-in to be used as receiver. To prepare us for the change, remove the `Receiver` implementation and the associated feature. To still allow `Arc` and others to be used as method receivers, turn on `arbitrary_self_types` feature instead. This feature gate is introduced in 1.23.0. It used to enable both `Deref` types and raw pointer types to be used as receivers, but the latter is now split into a different feature gate in Rust 1.83 nightly. We do not need receivers on raw pointers so this change would not affect us and usage of `arbitrary_self_types` feature would work for all Rust versions that we support (>=1.78). Cc: Adrian Taylor <ade@hohum.me.uk> Link: https://github.com/rust-lang/rfcs/pull/3519 [1] Link: https://github.com/rust-lang/rust/issues/44874 [2] Signed-off-by: Gary Guo <gary@garyguo.net> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240915132734.1653004-1-gary@garyguo.net Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Thomas Richter <tmricht@linux.ibm.com> Date: Fri Oct 25 12:27:53 2024 +0200 s390/cpum_sf: Handle CPU hotplug remove during sampling [ Upstream commit a0bd7dacbd51c632b8e2c0500b479af564afadf3 ] CPU hotplug remove handling triggers the following function call sequence: CPUHP_AP_PERF_S390_SF_ONLINE --> s390_pmu_sf_offline_cpu() ... CPUHP_AP_PERF_ONLINE --> perf_event_exit_cpu() The s390 CPUMF sampling CPU hotplug handler invokes: s390_pmu_sf_offline_cpu() +--> cpusf_pmu_setup() +--> setup_pmc_cpu() +--> deallocate_buffers() This function de-allocates all sampling data buffers (SDBs) allocated for that CPU at event initialization. It also clears the PMU_F_RESERVED bit. The CPU is gone and can not be sampled. With the event still being active on the removed CPU, the CPU event hotplug support in kernel performance subsystem triggers the following function calls on the removed CPU: perf_event_exit_cpu() +--> perf_event_exit_cpu_context() +--> __perf_event_exit_context() +--> __perf_remove_from_context() +--> event_sched_out() +--> cpumsf_pmu_del() +--> cpumsf_pmu_stop() +--> hw_perf_event_update() to stop and remove the event. During removal of the event, the sampling device driver tries to read out the remaining samples from the sample data buffers (SDBs). But they have already been freed (and may have been re-assigned). This may lead to a use after free situation in which case the samples are most likely invalid. In the best case the memory has not been reassigned and still contains valid data. Remedy this situation and check if the CPU is still in reserved state (bit PMU_F_RESERVED set). In this case the SDBs have not been released an contain valid data. This is always the case when the event is removed (and no CPU hotplug off occured). If the PMU_F_RESERVED bit is not set, the SDB buffers are gone. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niklas Schnelle <schnelle@linux.ibm.com> Date: Mon Nov 25 10:35:10 2024 +0100 s390/pci: Fix leak of struct zpci_dev when zpci_add_device() fails commit 48796104c864cf4dafa80bd8c2ce88f9c92a65ea upstream. Prior to commit 0467cdde8c43 ("s390/pci: Sort PCI functions prior to creating virtual busses") the IOMMU was initialized and the device was registered as part of zpci_create_device() with the struct zpci_dev freed if either resulted in an error. With that commit this was moved into a separate function called zpci_add_device(). While this new function logs when adding failed, it expects the caller not to use and to free the struct zpci_dev on error. This difference between it and zpci_create_device() was missed while changing the callers and the incompletely initialized struct zpci_dev may get used in zpci_scan_configured_device in the error path. This then leads to a crash due to the device not being registered with the zbus. It was also not freed in this case. Fix this by handling the error return of zpci_add_device(). Since in this case the zdev was not added to the zpci_list it can simply be discarded and freed. Also make this more explicit by moving the kref_init() into zpci_add_device() and document that zpci_zdev_get()/zpci_zdev_put() must be used after adding. Cc: stable@vger.kernel.org Fixes: 0467cdde8c43 ("s390/pci: Sort PCI functions prior to creating virtual busses") Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Niklas Schnelle <schnelle@linux.ibm.com> Date: Thu Sep 26 16:08:31 2024 +0200 s390/pci: Ignore RID for isolated VFs [ Upstream commit 25f39d3dcb48bbc824a77d16b3d977f0f3713cfe ] Ensure that VFs used in isolation, that is with their parent PF not visible to the configuration but with their RID exposed, are treated compatibly with existing isolated VF use cases without exposed RID including RoCE Express VFs. This allows creating configurations where one LPAR manages PFs while their child VFs are used by other LPARs. This gives the LPAR managing the PFs a role analogous to that of the hypervisor in a typical use case of passing child VFs to guests. Instead of creating a multifunction struct zpci_bus whenever a PCI function with RID exposed is discovered only create such a bus for configured physical functions and only consider multifunction busses when searching for an existing bus. Additionally only set zdev->devfn to the devfn part of the RID once the function is added to a multifunction bus. This also fixes probing of more than 7 such isolated VFs from the same physical bus. This is because common PCI code in pci_scan_slot() only looks for more functions when pdev->multifunction is set which somewhat counter intutively is not the case for VFs. Note that PFs are looked at before their child VFs is guaranteed because we sort the zpci_list by RID ascending. Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niklas Schnelle <schnelle@linux.ibm.com> Date: Thu Sep 26 16:08:29 2024 +0200 s390/pci: Sort PCI functions prior to creating virtual busses [ Upstream commit 0467cdde8c4320bbfdb31a8cff1277b202f677fc ] Instead of relying on the observed but not architected firmware behavior that PCI functions from the same card are listed in ascending RID order in clp_list_pci() ensure this by sorting. To allow for sorting separate the initial clp_list_pci() and creation of the virtual PCI busses. Note that fundamentally in our per-PCI function hotplug design non RID order of discovery is still possible. For example when the two PFs of a two port NIC are hotplugged after initial boot and in descending RID order. In this case the virtual PCI bus would be created by the second PF using that PF's UID as domain number instead of that of the first PF. Thus the domain number would then change from the UID of the second PF to that of the first PF on reboot but there is really nothing we can do about that since changing domain numbers at runtime seems even worse. This only impacts the domain number as the RIDs are consistent and thus even with just the second PF visible it will show up in the correct position on the virtual bus. Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niklas Schnelle <schnelle@linux.ibm.com> Date: Thu Sep 26 16:08:30 2024 +0200 s390/pci: Use topology ID for multi-function devices [ Upstream commit 126034faaac5f356822c4a9bebfa75664da11056 ] The newly introduced topology ID (TID) field in the CLP Query PCI Function explicitly identifies groups of PCI functions whose RIDs belong to the same (sub-)topology. When available use the TID instead of the PCHID to match zPCI busses/domains for multi-function devices. Note that currently only a single PCI bus per TID is supported. This change is required because in future machines the PCHID will not identify a PCI card but a specific port in the case of some multi-port NICs while from a PCI point of view the entire card is a subtopology. Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhu Jun <zhujun2@cmss.chinamobile.com> Date: Wed Oct 9 18:41:26 2024 -0700 samples/bpf: Fix a resource leak [ Upstream commit f3ef53174b23246fe9bc2bbc2542f3a3856fa1e2 ] The opened file should be closed in show_sockopts(), otherwise resource leak will occur that this problem was discovered by reading code Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241010014126.2573-1-zhujun2@cmss.chinamobile.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Randy Dunlap <rdunlap@infradead.org> Date: Fri Nov 29 18:24:06 2024 -0800 scatterlist: fix incorrect func name in kernel-doc [ Upstream commit d89c8ec0546184267cb211b579514ebaf8916100 ] Fix a kernel-doc warning by making the kernel-doc function description match the function name: include/linux/scatterlist.h:323: warning: expecting prototype for sg_unmark_bus_address(). Prototype was for sg_dma_unmark_bus_address() instead Link: https://lkml.kernel.org/r/20241130022406.537973-1-rdunlap@infradead.org Fixes: 42399301203e ("lib/scatterlist: add flag for indicating P2PDMA segments in an SGL") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: K Prateek Nayak <kprateek.nayak@amd.com> Date: Tue Nov 19 05:44:32 2024 +0000 sched/core: Prevent wakeup of ksoftirqd during idle load balance [ Upstream commit e932c4ab38f072ce5894b2851fea8bc5754bb8e5 ] Scheduler raises a SCHED_SOFTIRQ to trigger a load balancing event on from the IPI handler on the idle CPU. If the SMP function is invoked from an idle CPU via flush_smp_call_function_queue() then the HARD-IRQ flag is not set and raise_softirq_irqoff() needlessly wakes ksoftirqd because soft interrupts are handled before ksoftirqd get on the CPU. Adding a trace_printk() in nohz_csd_func() at the spot of raising SCHED_SOFTIRQ and enabling trace events for sched_switch, sched_wakeup, and softirq_entry (for SCHED_SOFTIRQ vector alone) helps observing the current behavior: <idle>-0 [000] dN.1.: nohz_csd_func: Raising SCHED_SOFTIRQ from nohz_csd_func <idle>-0 [000] dN.4.: sched_wakeup: comm=ksoftirqd/0 pid=16 prio=120 target_cpu=000 <idle>-0 [000] .Ns1.: softirq_entry: vec=7 [action=SCHED] <idle>-0 [000] .Ns1.: softirq_exit: vec=7 [action=SCHED] <idle>-0 [000] d..2.: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=16 next_prio=120 ksoftirqd/0-16 [000] d..2.: sched_switch: prev_comm=ksoftirqd/0 prev_pid=16 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 ... Use __raise_softirq_irqoff() to raise the softirq. The SMP function call is always invoked on the requested CPU in an interrupt handler. It is guaranteed that soft interrupts are handled at the end. Following are the observations with the changes when enabling the same set of events: <idle>-0 [000] dN.1.: nohz_csd_func: Raising SCHED_SOFTIRQ for nohz_idle_balance <idle>-0 [000] dN.1.: softirq_raise: vec=7 [action=SCHED] <idle>-0 [000] .Ns1.: softirq_entry: vec=7 [action=SCHED] No unnecessary ksoftirqd wakeups are seen from idle task's context to service the softirq. Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") Closes: https://lore.kernel.org/lkml/fcf823f-195e-6c9a-eac3-25f870cb35ac@inria.fr/ [1] Reported-by: Julia Lawall <julia.lawall@inria.fr> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20241119054432.6405-5-kprateek.nayak@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: K Prateek Nayak <kprateek.nayak@amd.com> Date: Tue Nov 19 05:44:30 2024 +0000 sched/core: Remove the unnecessary need_resched() check in nohz_csd_func() [ Upstream commit ea9cffc0a154124821531991d5afdd7e8b20d7aa ] The need_resched() check currently in nohz_csd_func() can be tracked to have been added in scheduler_ipi() back in 2011 via commit ca38062e57e9 ("sched: Use resched IPI to kick off the nohz idle balance") Since then, it has travelled quite a bit but it seems like an idle_cpu() check currently is sufficient to detect the need to bail out from an idle load balancing. To justify this removal, consider all the following case where an idle load balancing could race with a task wakeup: o Since commit f3dd3f674555b ("sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle") a target perceived to be idle (target_rq->nr_running == 0) will return true for ttwu_queue_cond(target) which will offload the task wakeup to the idle target via an IPI. In all such cases target_rq->ttwu_pending will be set to 1 before queuing the wake function. If an idle load balance races here, following scenarios are possible: - The CPU is not in TIF_POLLING_NRFLAG mode in which case an actual IPI is sent to the CPU to wake it out of idle. If the nohz_csd_func() queues before sched_ttwu_pending(), the idle load balance will bail out since idle_cpu(target) returns 0 since target_rq->ttwu_pending is 1. If the nohz_csd_func() is queued after sched_ttwu_pending() it should see rq->nr_running to be non-zero and bail out of idle load balancing. - The CPU is in TIF_POLLING_NRFLAG mode and instead of an actual IPI, the sender will simply set TIF_NEED_RESCHED for the target to put it out of idle and flush_smp_call_function_queue() in do_idle() will execute the call function. Depending on the ordering of the queuing of nohz_csd_func() and sched_ttwu_pending(), the idle_cpu() check in nohz_csd_func() should either see target_rq->ttwu_pending = 1 or target_rq->nr_running to be non-zero if there is a genuine task wakeup racing with the idle load balance kick. o The waker CPU perceives the target CPU to be busy (targer_rq->nr_running != 0) but the CPU is in fact going idle and due to a series of unfortunate events, the system reaches a case where the waker CPU decides to perform the wakeup by itself in ttwu_queue() on the target CPU but target is concurrently selected for idle load balance (XXX: Can this happen? I'm not sure, but we'll consider the mother of all coincidences to estimate the worst case scenario). ttwu_do_activate() calls enqueue_task() which would increment "rq->nr_running" post which it calls wakeup_preempt() which is responsible for setting TIF_NEED_RESCHED (via a resched IPI or by setting TIF_NEED_RESCHED on a TIF_POLLING_NRFLAG idle CPU) The key thing to note in this case is that rq->nr_running is already non-zero in case of a wakeup before TIF_NEED_RESCHED is set which would lead to idle_cpu() check returning false. In all cases, it seems that need_resched() check is unnecessary when checking for idle_cpu() first since an impending wakeup racing with idle load balancer will either set the "rq->ttwu_pending" or indicate a newly woken task via "rq->nr_running". Chasing the reason why this check might have existed in the first place, I came across Peter's suggestion on the fist iteration of Suresh's patch from 2011 [1] where the condition to raise the SCHED_SOFTIRQ was: sched_ttwu_do_pending(list); if (unlikely((rq->idle == current) && rq->nohz_balance_kick && !need_resched())) raise_softirq_irqoff(SCHED_SOFTIRQ); Since the condition to raise the SCHED_SOFIRQ was preceded by sched_ttwu_do_pending() (which is equivalent of sched_ttwu_pending()) in the current upstream kernel, the need_resched() check was necessary to catch a newly queued task. Peter suggested modifying it to: if (idle_cpu() && rq->nohz_balance_kick && !need_resched()) raise_softirq_irqoff(SCHED_SOFTIRQ); where idle_cpu() seems to have replaced "rq->idle == current" check. Even back then, the idle_cpu() check would have been sufficient to catch a new task being enqueued. Since commit b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") overloads the interpretation of TIF_NEED_RESCHED for TIF_POLLING_NRFLAG idling, remove the need_resched() check in nohz_csd_func() to raise SCHED_SOFTIRQ based on Peter's suggestion. Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119054432.6405-3-kprateek.nayak@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wander Lairson Costa <wander@redhat.com> Date: Wed Jul 24 11:22:47 2024 -0300 sched/deadline: Fix warning in migrate_enable for boosted tasks [ Upstream commit 0664e2c311b9fa43b33e3e81429cd0c2d7f9c638 ] When running the following command: while true; do stress-ng --cyclic 30 --timeout 30s --minimize --quiet done a warning is eventually triggered: WARNING: CPU: 43 PID: 2848 at kernel/sched/deadline.c:794 setup_new_dl_entity+0x13e/0x180 ... Call Trace: <TASK> ? show_trace_log_lvl+0x1c4/0x2df ? enqueue_dl_entity+0x631/0x6e0 ? setup_new_dl_entity+0x13e/0x180 ? __warn+0x7e/0xd0 ? report_bug+0x11a/0x1a0 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 enqueue_dl_entity+0x631/0x6e0 enqueue_task_dl+0x7d/0x120 __do_set_cpus_allowed+0xe3/0x280 __set_cpus_allowed_ptr_locked+0x140/0x1d0 __set_cpus_allowed_ptr+0x54/0xa0 migrate_enable+0x7e/0x150 rt_spin_unlock+0x1c/0x90 group_send_sig_info+0xf7/0x1a0 ? kill_pid_info+0x1f/0x1d0 kill_pid_info+0x78/0x1d0 kill_proc_info+0x5b/0x110 __x64_sys_kill+0x93/0xc0 do_syscall_64+0x5c/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 RIP: 0033:0x7f0dab31f92b This warning occurs because set_cpus_allowed dequeues and enqueues tasks with the ENQUEUE_RESTORE flag set. If the task is boosted, the warning is triggered. A boosted task already had its parameters set by rt_mutex_setprio, and a new call to setup_new_dl_entity is unnecessary, hence the WARN_ON call. Check if we are requeueing a boosted task and avoid calling setup_new_dl_entity if that's the case. Fixes: 295d6d5e3736 ("sched/deadline: Fix switching to -deadline") Signed-off-by: Wander Lairson Costa <wander@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20240724142253.27145-2-wander@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: K Prateek Nayak <kprateek.nayak@amd.com> Date: Tue Nov 19 05:44:31 2024 +0000 sched/fair: Check idle_cpu() before need_resched() to detect ilb CPU turning busy [ Upstream commit ff47a0acfcce309cf9e175149c75614491953c8f ] Commit b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") optimizes IPIs to idle CPUs in TIF_POLLING_NRFLAG mode by setting the TIF_NEED_RESCHED flag in idle task's thread info and relying on flush_smp_call_function_queue() in idle exit path to run the call-function. A softirq raised by the call-function is handled shortly after in do_softirq_post_smp_call_flush() but the TIF_NEED_RESCHED flag remains set and is only cleared later when schedule_idle() calls __schedule(). need_resched() check in _nohz_idle_balance() exists to bail out of load balancing if another task has woken up on the CPU currently in-charge of idle load balancing which is being processed in SCHED_SOFTIRQ context. Since the optimization mentioned above overloads the interpretation of TIF_NEED_RESCHED, check for idle_cpu() before going with the existing need_resched() check which can catch a genuine task wakeup on an idle CPU processing SCHED_SOFTIRQ from do_softirq_post_smp_call_flush(), as well as the case where ksoftirqd needs to be preempted as a result of new task wakeup or slice expiry. In case of PREEMPT_RT or threadirqs, although the idle load balancing may be inhibited in some cases on the ilb CPU, the fact that ksoftirqd is the only fair task going back to sleep will trigger a newidle balance on the CPU which will alleviate some imbalance if it exists if idle balance fails to do so. Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119054432.6405-4-kprateek.nayak@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Adrian Huang <ahuang12@lenovo.com> Date: Wed Nov 13 18:21:46 2024 +0800 sched/numa: fix memory leak due to the overwritten vma->numab_state commit 5f1b64e9a9b7ee9cfd32c6b2fab796e29bfed075 upstream. [Problem Description] When running the hackbench program of LTP, the following memory leak is reported by kmemleak. # /opt/ltp/testcases/bin/hackbench 20 thread 1000 Running with 20*40 (== 800) tasks. # dmesg | grep kmemleak ... kmemleak: 480 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 665 new suspected memory leaks (see /sys/kernel/debug/kmemleak) # cat /sys/kernel/debug/kmemleak unreferenced object 0xffff888cd8ca2c40 (size 64): comm "hackbench", pid 17142, jiffies 4299780315 hex dump (first 32 bytes): ac 74 49 00 01 00 00 00 4c 84 49 00 01 00 00 00 .tI.....L.I..... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc bff18fd4): [<ffffffff81419a89>] __kmalloc_cache_noprof+0x2f9/0x3f0 [<ffffffff8113f715>] task_numa_work+0x725/0xa00 [<ffffffff8110f878>] task_work_run+0x58/0x90 [<ffffffff81ddd9f8>] syscall_exit_to_user_mode+0x1c8/0x1e0 [<ffffffff81dd78d5>] do_syscall_64+0x85/0x150 [<ffffffff81e0012b>] entry_SYSCALL_64_after_hwframe+0x76/0x7e ... This issue can be consistently reproduced on three different servers: * a 448-core server * a 256-core server * a 192-core server [Root Cause] Since multiple threads are created by the hackbench program (along with the command argument 'thread'), a shared vma might be accessed by two or more cores simultaneously. When two or more cores observe that vma->numab_state is NULL at the same time, vma->numab_state will be overwritten. Although current code ensures that only one thread scans the VMAs in a single 'numa_scan_period', there might be a chance for another thread to enter in the next 'numa_scan_period' while we have not gotten till numab_state allocation [1]. Note that the command `/opt/ltp/testcases/bin/hackbench 50 process 1000` cannot the reproduce the issue. It is verified with 200+ test runs. [Solution] Use the cmpxchg atomic operation to ensure that only one thread executes the vma->numab_state assignment. [1] https://lore.kernel.org/lkml/1794be3c-358c-4cdc-a43d-a1f841d91ef7@amd.com/ Link: https://lkml.kernel.org/r/20241113102146.2384-1-ahuang12@lenovo.com Fixes: ef6a22b70f6d ("sched/numa: apply the scan delay to every new vma") Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Reported-by: Jiwei Sun <sunjw10@lenovo.com> Reviewed-by: Raghavendra K T <raghavendra.kt@amd.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Ben Segall <bsegall@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Josh Don <joshdon@google.com> Date: Mon Nov 11 10:27:38 2024 -0800 sched: fix warning in sched_setaffinity [ Upstream commit 70ee7947a29029736a1a06c73a48ff37674a851b ] Commit 8f9ea86fdf99b added some logic to sched_setaffinity that included a WARN when a per-task affinity assignment races with a cpuset update. Specifically, we can have a race where a cpuset update results in the task affinity no longer being a subset of the cpuset. That's fine; we have a fallback to instead use the cpuset mask. However, we have a WARN set up that will trigger if the cpuset mask has no overlap at all with the requested task affinity. This shouldn't be a warning condition; its trivial to create this condition. Reproduced the warning by the following setup: - $PID inside a cpuset cgroup - another thread repeatedly switching the cpuset cpus from 1-2 to just 1 - another thread repeatedly setting the $PID affinity (via taskset) to 2 Fixes: 8f9ea86fdf99b ("sched: Always preserve the user requested cpumask") Signed-off-by: Josh Don <joshdon@google.com> Acked-and-tested-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Link: https://lkml.kernel.org/r/20241111182738.1832953-1-joshdon@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Changwoo Min <multics69@gmail.com> Date: Sat Nov 9 15:29:05 2024 +0900 sched_ext: add a missing rcu_read_lock/unlock pair at scx_select_cpu_dfl() [ Upstream commit f39489fea677ad78ca4ce1ab2d204a6639b868dc ] When getting an LLC CPU mask in the default CPU selection policy, scx_select_cpu_dfl(), a pointer to the sched_domain is dereferenced using rcu_read_lock() without holding rcu_read_lock(). Such an unprotected dereference often causes the following warning and can cause an invalid memory access in the worst case. Therefore, protect dereference of a sched_domain pointer using a pair of rcu_read_lock() and unlock(). [ 20.996135] ============================= [ 20.996345] WARNING: suspicious RCU usage [ 20.996563] 6.11.0-virtme #17 Tainted: G W [ 20.996576] ----------------------------- [ 20.996576] kernel/sched/ext.c:3323 suspicious rcu_dereference_check() usage! [ 20.996576] [ 20.996576] other info that might help us debug this: [ 20.996576] [ 20.996576] [ 20.996576] rcu_scheduler_active = 2, debug_locks = 1 [ 20.996576] 4 locks held by kworker/8:1/140: [ 20.996576] #0: ffff8b18c00dd348 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x4a0/0x590 [ 20.996576] #1: ffffb3da01f67e58 ((work_completion)(&dev->power.work)){+.+.}-{0:0}, at: process_one_work+0x1ba/0x590 [ 20.996576] #2: ffffffffa316f9f0 (&rcu_state.gp_wq){..-.}-{2:2}, at: swake_up_one+0x15/0x60 [ 20.996576] #3: ffff8b1880398a60 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0x59/0x7d0 [ 20.996576] [ 20.996576] stack backtrace: [ 20.996576] CPU: 8 UID: 0 PID: 140 Comm: kworker/8:1 Tainted: G W 6.11.0-virtme #17 [ 20.996576] Tainted: [W]=WARN [ 20.996576] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 20.996576] Workqueue: pm pm_runtime_work [ 20.996576] Sched_ext: simple (disabling+all), task: runnable_at=-6ms [ 20.996576] Call Trace: [ 20.996576] <IRQ> [ 20.996576] dump_stack_lvl+0x6f/0xb0 [ 20.996576] lockdep_rcu_suspicious.cold+0x4e/0x96 [ 20.996576] scx_select_cpu_dfl+0x234/0x260 [ 20.996576] select_task_rq_scx+0xfb/0x190 [ 20.996576] select_task_rq+0x47/0x110 [ 20.996576] try_to_wake_up+0x110/0x7d0 [ 20.996576] swake_up_one+0x39/0x60 [ 20.996576] rcu_core+0xb08/0xe50 [ 20.996576] ? srso_alias_return_thunk+0x5/0xfbef5 [ 20.996576] ? mark_held_locks+0x40/0x70 [ 20.996576] handle_softirqs+0xd3/0x410 [ 20.996576] irq_exit_rcu+0x78/0xa0 [ 20.996576] sysvec_apic_timer_interrupt+0x73/0x80 [ 20.996576] </IRQ> [ 20.996576] <TASK> [ 20.996576] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 20.996576] RIP: 0010:_raw_spin_unlock_irqrestore+0x36/0x70 [ 20.996576] Code: f5 53 48 8b 74 24 10 48 89 fb 48 83 c7 18 e8 11 b4 36 ff 48 89 df e8 99 0d 37 ff f7 c5 00 02 00 00 75 17 9c 58 f6 c4 02 75 2b <65> ff 0d 5b 55 3c 5e 74 16 5b 5d e9 95 8e 28 00 e8 a5 ee 44 ff 9c [ 20.996576] RSP: 0018:ffffb3da01f67d20 EFLAGS: 00000246 [ 20.996576] RAX: 0000000000000002 RBX: ffffffffa4640220 RCX: 0000000000000040 [ 20.996576] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa1c7b27b [ 20.996576] RBP: 0000000000000246 R08: 0000000000000001 R09: 0000000000000000 [ 20.996576] R10: 0000000000000001 R11: 000000000000021c R12: 0000000000000246 [ 20.996576] R13: ffff8b1881363958 R14: 0000000000000000 R15: ffff8b1881363800 [ 20.996576] ? _raw_spin_unlock_irqrestore+0x4b/0x70 [ 20.996576] serial_port_runtime_resume+0xd4/0x1a0 [ 20.996576] ? __pfx_serial_port_runtime_resume+0x10/0x10 [ 20.996576] __rpm_callback+0x44/0x170 [ 20.996576] ? __pfx_serial_port_runtime_resume+0x10/0x10 [ 20.996576] rpm_callback+0x55/0x60 [ 20.996576] ? __pfx_serial_port_runtime_resume+0x10/0x10 [ 20.996576] rpm_resume+0x582/0x7b0 [ 20.996576] pm_runtime_work+0x7c/0xb0 [ 20.996576] process_one_work+0x1fb/0x590 [ 20.996576] worker_thread+0x18e/0x350 [ 20.996576] ? __pfx_worker_thread+0x10/0x10 [ 20.996576] kthread+0xe2/0x110 [ 20.996576] ? __pfx_kthread+0x10/0x10 [ 20.996576] ret_from_fork+0x34/0x50 [ 20.996576] ? __pfx_kthread+0x10/0x10 [ 20.996576] ret_from_fork_asm+0x1a/0x30 [ 20.996576] </TASK> [ 21.056592] sched_ext: BPF scheduler "simple" disabled (unregistered from user space) Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yihang Li <liyihang9@huawei.com> Date: Tue Oct 8 10:18:16 2024 +0800 scsi: hisi_sas: Add cond_resched() for no forced preemption model [ Upstream commit 2233c4a0b948211743659b24c13d6bd059fa75fc ] For no forced preemption model kernel, in the scenario where the expander is connected to 12 high performance SAS SSDs, the following call trace may occur: [ 214.409199][ C240] watchdog: BUG: soft lockup - CPU#240 stuck for 22s! [irq/149-hisi_sa:3211] [ 214.568533][ C240] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 214.575224][ C240] pc : fput_many+0x8c/0xdc [ 214.579480][ C240] lr : fput+0x1c/0xf0 [ 214.583302][ C240] sp : ffff80002de2b900 [ 214.587298][ C240] x29: ffff80002de2b900 x28: ffff1082aa412000 [ 214.593291][ C240] x27: ffff3062a0348c08 x26: ffff80003a9f6000 [ 214.599284][ C240] x25: ffff1062bbac5c40 x24: 0000000000001000 [ 214.605277][ C240] x23: 000000000000000a x22: 0000000000000001 [ 214.611270][ C240] x21: 0000000000001000 x20: 0000000000000000 [ 214.617262][ C240] x19: ffff3062a41ae580 x18: 0000000000010000 [ 214.623255][ C240] x17: 0000000000000001 x16: ffffdb3a6efe5fc0 [ 214.629248][ C240] x15: ffffffffffffffff x14: 0000000003ffffff [ 214.635241][ C240] x13: 000000000000ffff x12: 000000000000029c [ 214.641234][ C240] x11: 0000000000000006 x10: ffff80003a9f7fd0 [ 214.647226][ C240] x9 : ffffdb3a6f0482fc x8 : 0000000000000001 [ 214.653219][ C240] x7 : 0000000000000002 x6 : 0000000000000080 [ 214.659212][ C240] x5 : ffff55480ee9b000 x4 : fffffde7f94c6554 [ 214.665205][ C240] x3 : 0000000000000002 x2 : 0000000000000020 [ 214.671198][ C240] x1 : 0000000000000021 x0 : ffff3062a41ae5b8 [ 214.677191][ C240] Call trace: [ 214.680320][ C240] fput_many+0x8c/0xdc [ 214.684230][ C240] fput+0x1c/0xf0 [ 214.687707][ C240] aio_complete_rw+0xd8/0x1fc [ 214.692225][ C240] blkdev_bio_end_io+0x98/0x140 [ 214.696917][ C240] bio_endio+0x160/0x1bc [ 214.701001][ C240] blk_update_request+0x1c8/0x3bc [ 214.705867][ C240] scsi_end_request+0x3c/0x1f0 [ 214.710471][ C240] scsi_io_completion+0x7c/0x1a0 [ 214.715249][ C240] scsi_finish_command+0x104/0x140 [ 214.720200][ C240] scsi_softirq_done+0x90/0x180 [ 214.724892][ C240] blk_mq_complete_request+0x5c/0x70 [ 214.730016][ C240] scsi_mq_done+0x48/0xac [ 214.734194][ C240] sas_scsi_task_done+0xbc/0x16c [libsas] [ 214.739758][ C240] slot_complete_v3_hw+0x260/0x760 [hisi_sas_v3_hw] [ 214.746185][ C240] cq_thread_v3_hw+0xbc/0x190 [hisi_sas_v3_hw] [ 214.752179][ C240] irq_thread_fn+0x34/0xa4 [ 214.756435][ C240] irq_thread+0xc4/0x130 [ 214.760520][ C240] kthread+0x108/0x13c [ 214.764430][ C240] ret_from_fork+0x10/0x18 This is because in the hisi_sas driver, both the hardware interrupt handler and the interrupt thread are executed on the same CPU. In the performance test scenario, function irq_wait_for_interrupt() will always return 0 if lots of interrupts occurs and the CPU will be continuously consumed. As a result, the CPU cannot run the watchdog thread. When the watchdog time exceeds the specified time, call trace occurs. To fix it, add cond_resched() to execute the watchdog thread. Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://lore.kernel.org/r/20241008021822.2617339-8-liyihang9@huawei.com Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yihang Li <liyihang9@huawei.com> Date: Tue Oct 8 10:18:21 2024 +0800 scsi: hisi_sas: Create all dump files during debugfs initialization [ Upstream commit 9f564f15f88490b484e02442dc4c4b11640ea172 ] For the current debugfs of hisi_sas, after user triggers dump, the driver allocate memory space to save the register information and create debugfs files to display the saved information. In this process, the debugfs files created after each dump. Therefore, when the dump is triggered while the driver is unbind, the following hang occurs: [67840.853907] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [67840.862947] Mem abort info: [67840.865855] ESR = 0x0000000096000004 [67840.869713] EC = 0x25: DABT (current EL), IL = 32 bits [67840.875125] SET = 0, FnV = 0 [67840.878291] EA = 0, S1PTW = 0 [67840.881545] FSC = 0x04: level 0 translation fault [67840.886528] Data abort info: [67840.889524] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [67840.895117] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [67840.900284] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [67840.905709] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002803a1f000 [67840.912263] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000 [67840.919177] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [67840.996435] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [67841.003628] pc : down_write+0x30/0x98 [67841.007546] lr : start_creating.part.0+0x60/0x198 [67841.012495] sp : ffff8000b979ba20 [67841.016046] x29: ffff8000b979ba20 x28: 0000000000000010 x27: 0000000000024b40 [67841.023412] x26: 0000000000000012 x25: ffff20202b355ae8 x24: ffff20202b35a8c8 [67841.030779] x23: ffffa36877928208 x22: ffffa368b4972240 x21: ffff8000b979bb18 [67841.038147] x20: ffff00281dc1e3c0 x19: fffffffffffffffe x18: 0000000000000020 [67841.045515] x17: 0000000000000000 x16: ffffa368b128a530 x15: ffffffffffffffff [67841.052888] x14: ffff8000b979bc18 x13: ffffffffffffffff x12: ffff8000b979bb18 [67841.060263] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa368b1289b18 [67841.067640] x8 : 0000000000000012 x7 : 0000000000000000 x6 : 00000000000003a9 [67841.075014] x5 : 0000000000000000 x4 : ffff002818c5cb00 x3 : 0000000000000001 [67841.082388] x2 : 0000000000000000 x1 : ffff002818c5cb00 x0 : 00000000000000a0 [67841.089759] Call trace: [67841.092456] down_write+0x30/0x98 [67841.096017] start_creating.part.0+0x60/0x198 [67841.100613] debugfs_create_dir+0x48/0x1f8 [67841.104950] debugfs_create_files_v3_hw+0x88/0x348 [hisi_sas_v3_hw] [67841.111447] debugfs_snapshot_regs_v3_hw+0x708/0x798 [hisi_sas_v3_hw] [67841.118111] debugfs_trigger_dump_v3_hw_write+0x9c/0x120 [hisi_sas_v3_hw] [67841.125115] full_proxy_write+0x68/0xc8 [67841.129175] vfs_write+0xd8/0x3f0 [67841.132708] ksys_write+0x70/0x108 [67841.136317] __arm64_sys_write+0x24/0x38 [67841.140440] invoke_syscall+0x50/0x128 [67841.144385] el0_svc_common.constprop.0+0xc8/0xf0 [67841.149273] do_el0_svc+0x24/0x38 [67841.152773] el0_svc+0x38/0xd8 [67841.156009] el0t_64_sync_handler+0xc0/0xc8 [67841.160361] el0t_64_sync+0x1a4/0x1a8 [67841.164189] Code: b9000882 d2800002 d2800023 f9800011 (c85ffc05) [67841.170443] ---[ end trace 0000000000000000 ]--- To fix this issue, create all directories and files during debugfs initialization. In this way, the driver only needs to allocate memory space to save information each time the user triggers dumping. Signed-off-by: Yihang Li <liyihang9@huawei.com> Link: https://lore.kernel.org/r/20241008021822.2617339-13-liyihang9@huawei.com Reviewed-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Tee <justin.tee@broadcom.com> Date: Thu Oct 31 15:32:11 2024 -0700 scsi: lpfc: Call lpfc_sli4_queue_unset() in restart and rmmod paths [ Upstream commit d35f7672715d1ff3e3ad9bb4ae6ac6cb484200fe ] During initialization, the driver allocates wq->pring in lpfc_wq_create and lpfc_sli4_queue_unset() is the only place where kfree(wq->pring) is called. There is a possible memory leak in lpfc_sli_brdrestart_s4() (restart) and lpfc_pci_remove_one_s4() (rmmod) paths because there are no calls to lpfc_sli4_queue_unset() to kfree() the wq->pring. Fix by inserting a call to lpfc_sli4_queue_unset() in lpfc_sli_brdrestart_s4() and lpfc_sli4_hba_unset() routines. Also, add a check for the SLI_ACTIVE flag before issuing the Q_DESTROY mailbox command. If not set, then the mailbox command will obviously fail. In such cases, skip issuing the mailbox command and only execute the driver resource clean up portions of the lpfc_*q_destroy routines. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Tee <justin.tee@broadcom.com> Date: Thu Oct 31 15:32:13 2024 -0700 scsi: lpfc: Check SLI_ACTIVE flag in FDMI cmpl before submitting follow up FDMI [ Upstream commit 98f8d3588097e321be70f83b844fa67d4828fe5c ] The lpfc_cmpl_ct_disc_fdmi() routine has incorrect logic that treats an FDMI completion with error LOCAL_REJECT/SLI_ABORTED as a success status. Under the erroneous assumption of successful completion, the routine proceeds to issue follow up FDMI commands, which may never complete if the HBA is in an errata state as indicated by the errored completion status. Fix by freeing FDMI cmd resources and early return when the LPFC_SLI_ACTIVE flag is not set and a LOCAL_REJECT/SLI_ABORTED or SLI_DOWN status is received. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Tee <justin.tee@broadcom.com> Date: Thu Oct 31 15:32:15 2024 -0700 scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callback [ Upstream commit 4281f44ea8bfedd25938a0031bebba1473ece9ad ] Current dev_loss_tmo handling checks whether there has been a previous call to unregister with SCSI transport. If so, the NDLP kref count is decremented a second time in dev_loss_tmo as the final kref release. However, this can sometimes result in a reference count underflow if there is also a race to unregister with NVMe transport as well. Add a check for NVMe transport registration before decrementing the final kref. If NVMe transport is still registered, then the NVMe transport unregistration is designated as the final kref decrement. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Quinn Tran <qutran@marvell.com> Date: Fri Nov 15 18:33:07 2024 +0530 scsi: qla2xxx: Fix abort in bsg timeout commit c423263082ee8ccfad59ab33e3d5da5dc004c21e upstream. Current abort of bsg on timeout prematurely clears the outstanding_cmds[]. Abort does not allow FW to return the IOCB/SRB. In addition, bsg_job_done() is not called to return the BSG (i.e. leak). Abort the outstanding bsg/SRB and wait for the completion. The completion IOCB will wake up the bsg_timeout thread. If abort is not successful, then driver will forcibly call bsg_job_done() and free the srb. Err Inject: - qaucli -z - assign CT Passthru IOCB's NportHandle with another initiator nport handle to trigger timeout. Remote port will drop CT request. - bsg_job_done is properly called as part of cleanup kernel: qla2xxx [0000:21:00.1]-7012:7: qla2x00_process_ct : 286 : Error Inject. kernel: qla2xxx [0000:21:00.1]-7016:7: bsg rqst type: FC_BSG_HST_CT else type: 101 - loop-id=1 portid=fffffa. kernel: qla2xxx [0000:21:00.1]-70bb:7: qla24xx_bsg_timeout CMD timeout. bsg ptr ffff9971a42f0838 msgcode 80000004 vendor cmd fa010000 kernel: qla2xxx [0000:21:00.1]-507c:7: Abort command issued - hdl=4b, type=5 kernel: qla2xxx [0000:21:00.1]-5040:7: ELS-CT pass-through-ct pass-through error hdl=4b comp_status-status=0x5 error subcode 1=0x0 error subcode 2=0xaf882e80. kernel: qla2xxx [0000:21:00.1]-7009:7: qla2x00_bsg_job_done: sp hdl 4b, result=70000 bsg ptr ffff9971a42f0838 kernel: qla2xxx [0000:21:00.1]-802c:7: Aborting bsg ffff9971a42f0838 sp=ffff99760b87ba80 handle=4b rval=0 kernel: qla2xxx [0000:21:00.1]-708a:7: bsg abort success. bsg ffff9971a42f0838 sp=ffff99760b87ba80 handle=0x4b kernel: qla2xxx [0000:21:00.1]-7012:7: qla2x00_process_ct : 286 : Error Inject. kernel: qla2xxx [0000:21:00.1]-7016:7: bsg rqst type: FC_BSG_HST_CT else type: 101 - loop-id=1 portid=fffffa. kernel: qla2xxx [0000:21:00.1]-70bb:7: qla24xx_bsg_timeout CMD timeout. bsg ptr ffff9971a42f43b8 msgcode 80000004 vendor cmd fa010000 kernel: qla2xxx [0000:21:00.1]-7012:7: qla_bsg_found : 2206 : Error Inject 2. kernel: qla2xxx [0000:21:00.1]-802c:7: Aborting bsg ffff9971a42f43b8 sp=ffff99762c304440 handle=5e rval=5 kernel: qla2xxx [0000:21:00.1]-704f:7: bsg abort fail. bsg=ffff9971a42f43b8 sp=ffff99762c304440 rval=5. kernel: qla2xxx [0000:21:00.1]-7051:7: qla_bsg_found bsg_job_done : bsg ffff9971a42f43b8 result 0xfffffffa sp ffff99762c304440. Cc: stable@vger.kernel.org Fixes: c449b4198701 ("scsi: qla2xxx: Use QP lock to search for bsg") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Quinn Tran <qutran@marvell.com> Date: Fri Nov 15 18:33:11 2024 +0530 scsi: qla2xxx: Fix NVMe and NPIV connect issue commit 4812b7796c144f63a1094f79a5eb8fbdad8d7ebc upstream. NVMe controller fails to send connect command due to failure to locate hw context buffer for NVMe queue 0 (blk_mq_hw_ctx, hctx_idx=0). The cause of the issue is NPIV host did not initialize the vha->irq_offset field. This field is given to blk-mq (blk_mq_pci_map_queues) to help locate the beginning of IO Queues which in turn help locate NVMe queue 0. Initialize this field to allow NVMe to work properly with NPIV host. kernel: nvme nvme5: Connect command failed, errno: -18 kernel: nvme nvme5: qid 0: secure concatenation is not supported kernel: nvme nvme5: NVME-FC{5}: create_assoc failed, assoc_id 2e9100 ret 401 kernel: nvme nvme5: NVME-FC{5}: reset: Reconnect attempt failed (401) kernel: nvme nvme5: NVME-FC{5}: Reconnect attempt in 2 seconds Cc: stable@vger.kernel.org Fixes: f0783d43dde4 ("scsi: qla2xxx: Use correct number of vectors for online CPUs") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Quinn Tran <qutran@marvell.com> Date: Fri Nov 15 18:33:08 2024 +0530 scsi: qla2xxx: Fix use after free on unload commit 07c903db0a2ff84b68efa1a74a4de353ea591eb0 upstream. System crash is observed with stack trace warning of use after free. There are 2 signals to tell dpc_thread to terminate (UNLOADING flag and kthread_stop). On setting the UNLOADING flag when dpc_thread happens to run at the time and sees the flag, this causes dpc_thread to exit and clean up itself. When kthread_stop is called for final cleanup, this causes use after free. Remove UNLOADING signal to terminate dpc_thread. Use the kthread_stop as the main signal to exit dpc_thread. [596663.812935] kernel BUG at mm/slub.c:294! [596663.812950] invalid opcode: 0000 [#1] SMP PTI [596663.812957] CPU: 13 PID: 1475935 Comm: rmmod Kdump: loaded Tainted: G IOE --------- - - 4.18.0-240.el8.x86_64 #1 [596663.812960] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012 [596663.812974] RIP: 0010:__slab_free+0x17d/0x360 ... [596663.813008] Call Trace: [596663.813022] ? __dentry_kill+0x121/0x170 [596663.813030] ? _cond_resched+0x15/0x30 [596663.813034] ? _cond_resched+0x15/0x30 [596663.813039] ? wait_for_completion+0x35/0x190 [596663.813048] ? try_to_wake_up+0x63/0x540 [596663.813055] free_task+0x5a/0x60 [596663.813061] kthread_stop+0xf3/0x100 [596663.813103] qla2x00_remove_one+0x284/0x440 [qla2xxx] Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Saurav Kashyap <skashyap@marvell.com> Date: Fri Nov 15 18:33:10 2024 +0530 scsi: qla2xxx: Remove check req_sg_cnt should be equal to rsp_sg_cnt commit 833c70e212fc40d3e98da941796f4c7bcaecdf58 upstream. Firmware supports multiple sg_cnt for request and response for CT commands, so remove the redundant check. A check is there where sg_cnt for request and response should be same. This is not required as driver and FW have code to handle multiple and different sg_cnt on request and response. Cc: stable@vger.kernel.org Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Anil Gurumurthy <agurumurthy@marvell.com> Date: Fri Nov 15 18:33:12 2024 +0530 scsi: qla2xxx: Supported speed displayed incorrectly for VPorts commit e4e268f898c8a08f0a1188677e15eadbc06e98f6 upstream. The fc_function_template for vports was missing the .show_host_supported_speeds. The base port had the same. Add .show_host_supported_speeds to the vport template as well. Cc: stable@vger.kernel.org Fixes: 2c3dfe3f6ad8 ("[SCSI] qla2xxx: add support for NPIV") Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: John Garry <john.g.garry@oracle.com> Date: Mon Dec 2 13:00:45 2024 +0000 scsi: scsi_debug: Fix hrtimer support for ndelay [ Upstream commit 6918141d815acef056a0d10e966a027d869a922d ] Since commit 771f712ba5b0 ("scsi: scsi_debug: Fix cmd duration calculation"), ns_from_boot value is only evaluated in schedule_resp() for polled requests. However, ns_from_boot is also required for hrtimer support for when ndelay is less than INCLUSIVE_TIMING_MAX_NS, so fix up the logic to decide when to evaluate ns_from_boot. Fixes: 771f712ba5b0 ("scsi: scsi_debug: Fix cmd duration calculation") Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241202130045.2335194-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Suraj Sonawane <surajsonawane0215@gmail.com> Date: Wed Nov 20 18:29:44 2024 +0530 scsi: sg: Fix slab-use-after-free read in sg_release() [ Upstream commit f10593ad9bc36921f623361c9e3dd96bd52d85ee ] Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN: BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30 kernel/locking/lockdep.c:5838 __mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912 sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407 In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is called before releasing the open_rel_lock mutex. The kref_put() call may decrement the reference count of sfp to zero, triggering its cleanup through sg_remove_sfp(). This cleanup includes scheduling deferred work via sg_remove_sfp_usercontext(), which ultimately frees sfp. After kref_put(), sg_release() continues to unlock open_rel_lock and may reference sfp or sdp. If sfp has already been freed, this results in a slab-use-after-free error. Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the open_rel_lock mutex. This ensures: - No references to sfp or sdp occur after the reference count is decremented. - Cleanup functions such as sg_remove_sfp() and sg_remove_sfp_usercontext() can safely execute without impacting the mutex handling in sg_release(). The fix has been tested and validated by syzbot. This patch closes the bug reported at the following syzkaller link and ensures proper sequencing of resource cleanup and mutex operations, eliminating the risk of use-after-free errors in sg_release(). Reported-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b Tested-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling") Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Date: Wed Nov 6 11:57:22 2024 +0200 scsi: st: Add MTIOCGET and MTLOAD to ioctls allowed after device reset [ Upstream commit 0b120edb37dc9dd8ca82893d386922eb6b16f860 ] Most drives rewind the tape when the device is reset. Reading and writing are not allowed until something is done to make the tape position match the user's expectation (e.g., rewind the tape). Add MTIOCGET and MTLOAD to operations allowed after reset. MTIOCGET is modified to not touch the tape if pos_unknown is non-zero. The tape location is known after MTLOAD. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219419#c14 Link: https://lore.kernel.org/r/20241106095723.63254-3-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Date: Wed Nov 6 11:57:21 2024 +0200 scsi: st: Don't modify unknown block number in MTIOCGET [ Upstream commit 5bb2d6179d1a8039236237e1e94cfbda3be1ed9e ] Struct mtget field mt_blkno -1 means it is unknown. Don't add anything to it. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219419#c14 Link: https://lore.kernel.org/r/20241106095723.63254-2-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Wang <peter.wang@mediatek.com> Date: Fri Nov 22 10:49:43 2024 +0800 scsi: ufs: core: Add missing post notify for power mode change commit 7f45ed5f0cd5ccbbec79adc6c48a67d6a85fba56 upstream. When the power mode change is successful but the power mode hasn't actually changed, the post notification was missed. Similar to the approach with hibernate/clock scale/hce enable, having pre/post notifications in the same function will make it easier to maintain. Additionally, supplement the description of power parameters for the pwr_change_notify callback. Fixes: 7eb584db73be ("ufs: refactor configuring power mode") Cc: stable@vger.kernel.org #6.11.x Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241122024943.30589-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ziqi Chen <quic_ziqichen@quicinc.com> Date: Tue Nov 19 17:56:04 2024 +0800 scsi: ufs: core: Add ufshcd_send_bsg_uic_cmd() for UFS BSG [ Upstream commit 60b4dd1460f6d65739acb0f28d12bd9abaeb34b4 ] User layer applications can send UIC GET/SET commands via the BSG framework, and if the user layer application sends a UIC SET command to the PA_PWRMODE attribute, a power mode change shall be initiated in UniPro and two interrupts shall be triggered if the power mode is successfully changed, i.e., UIC Command Completion interrupt and UIC Power Mode interrupt. The current UFS BSG code calls ufshcd_send_uic_cmd() directly, with which the second interrupt, i.e., UIC Power Mode interrupt, shall be treated as unhandled interrupt. In addition, after the UIC command is completed, user layer application has to poll UniPro and/or M-PHY state machine to confirm the power mode change is finished. Add a new wrapper function ufshcd_send_bsg_uic_cmd() and call it from ufs_bsg_request() so that if a UIC SET command is targeting the PA_PWRMODE attribute it can be redirected to ufshcd_uic_pwr_ctrl(). Fixes: e77044c5a842 ("scsi: ufs-bsg: Add support for uic commands in ufs_bsg_request()") Co-developed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Ziqi Chen <quic_ziqichen@quicinc.com> Link: https://lore.kernel.org/r/20241119095613.121385-1-quic_ziqichen@quicinc.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bart Van Assche <bvanassche@acm.org> Date: Thu Sep 12 15:30:05 2024 -0700 scsi: ufs: core: Always initialize the UIC done completion [ Upstream commit b1e8c53749adb795bfb0bf4e2f7836e26684bb90 ] Simplify __ufshcd_send_uic_cmd() by always initializing the uic_cmd::done completion. This is fine since the time required to initialize a completion is small compared to the time required to process an UIC command. Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240912223019.3510966-5-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 60b4dd1460f6 ("scsi: ufs: core: Add ufshcd_send_bsg_uic_cmd() for UFS BSG") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Date: Mon Nov 11 23:18:30 2024 +0530 scsi: ufs: core: Cancel RTC work during ufshcd_remove() commit 1695c4361d35b7bdadd7b34f99c9c07741e181e5 upstream. Currently, RTC work is only cancelled during __ufshcd_wl_suspend(). When ufshcd is removed in ufshcd_remove(), RTC work is not cancelled. Due to this, any further trigger of the RTC work after ufshcd_remove() would result in a NULL pointer dereference as below: Unable to handle kernel NULL pointer dereference at virtual address 00000000000002a4 Workqueue: events ufshcd_rtc_work Call trace: _raw_spin_lock_irqsave+0x34/0x8c pm_runtime_get_if_active+0x24/0xb4 ufshcd_rtc_work+0x124/0x19c process_scheduled_works+0x18c/0x2d8 worker_thread+0x144/0x280 kthread+0x11c/0x128 ret_from_fork+0x10/0x20 Since RTC work accesses the ufshcd internal structures, it should be cancelled when ufshcd is removed. So do that in ufshcd_remove(), as per the order in ufshcd_init(). Cc: stable@vger.kernel.org # 6.8 Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241111-ufs_bug_fix-v1-1-45ad8b62f02e@linaro.org Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Bart Van Assche <bvanassche@acm.org> Date: Fri Oct 18 12:47:39 2024 -0700 scsi: ufs: core: Make DMA mask configuration more flexible [ Upstream commit 78bc671bd1501e2f6c571e063301a4fdc5db53b2 ] Replace UFSHCD_QUIRK_BROKEN_64BIT_ADDRESS with ufs_hba_variant_ops::set_dma_mask. Update the Renesas driver accordingly. This patch enables supporting other configurations than 32-bit or 64-bit DMA addresses, e.g. 36-bit DMA addresses. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241018194753.775074-1-bvanassche@acm.org Reviewed-by: Avri Altman <Avri.Altman@wdc.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gwendal Grignou <gwendal@chromium.org> Date: Tue Nov 19 22:25:22 2024 -0800 scsi: ufs: core: sysfs: Prevent div by zero commit eb48e9fc0028bed94a40a9352d065909f19e333c upstream. Prevent a division by 0 when monitoring is not enabled. Fixes: 1d8613a23f3c ("scsi: ufs: core: Introduce HBA performance monitor sysfs nodes") Cc: stable@vger.kernel.org Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Link: https://lore.kernel.org/r/20241120062522.917157-1-gwendal@chromium.org Reviewed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Date: Mon Nov 11 23:18:34 2024 +0530 scsi: ufs: pltfrm: Dellocate HBA during ufshcd_pltfrm_remove() [ Upstream commit 897df60c16d54ad515a3d0887edab5c63da06d1f ] This will ensure that the scsi host is cleaned up properly using scsi_host_dev_release(). Otherwise, it may lead to memory leaks. Cc: stable@vger.kernel.org # 4.4 Fixes: 03b1781aa978 ("[SCSI] ufs: Add Platform glue driver for ufshcd") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241111-ufs_bug_fix-v1-5-45ad8b62f02e@linaro.org Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Date: Mon Nov 11 23:18:32 2024 +0530 scsi: ufs: pltfrm: Disable runtime PM during removal of glue drivers commit d3326e6a3f9bf1e075be2201fb704c2fdf19e2b7 upstream. When the UFSHCD platform glue drivers are removed, runtime PM should be disabled using pm_runtime_disable() to balance the enablement done in ufshcd_pltfrm_init(). This is also reported by PM core when the glue driver is removed and inserted again: ufshcd-qcom 1d84000.ufshc: Unbalanced pm_runtime_enable! So disable runtime PM using a new helper API ufshcd_pltfrm_remove(), that also takes care of removing ufshcd. This helper should be called during the remove() stage of glue drivers. Cc: stable@vger.kernel.org # 3.12 Fixes: 62694735ca95 ("[SCSI] ufs: Add runtime PM support for UFS host controller driver") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241111-ufs_bug_fix-v1-3-45ad8b62f02e@linaro.org Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Date: Mon Nov 11 23:18:33 2024 +0530 scsi: ufs: pltfrm: Drop PM runtime reference count after ufshcd_remove() commit 1745dcdb7227102e16248a324c600b9121c8f6df upstream. During the remove stage of glue drivers, some of them are incrementing the reference count using pm_runtime_get_sync(), before removing the ufshcd using ufshcd_remove(). But they are not dropping that reference count after ufshcd_remove() to balance the refcount. So drop the reference count by calling pm_runtime_put_noidle() after ufshcd_remove(). Since the behavior is applicable to all glue drivers, move the PM handling to ufshcd_pltfrm_remove(). Cc: stable@vger.kernel.org # 3.12 Fixes: 62694735ca95 ("[SCSI] ufs: Add runtime PM support for UFS host controller driver") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241111-ufs_bug_fix-v1-4-45ad8b62f02e@linaro.org Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Date: Mon Nov 11 23:18:31 2024 +0530 scsi: ufs: qcom: Only free platform MSIs when ESI is enabled commit 64506b3d23a337e98a74b18dcb10c8619365f2bd upstream. Otherwise, it will result in a NULL pointer dereference as below: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 Call trace: mutex_lock+0xc/0x54 platform_device_msi_free_irqs_all+0x14/0x20 ufs_qcom_remove+0x34/0x48 [ufs_qcom] platform_remove+0x28/0x44 device_remove+0x4c/0x80 device_release_driver_internal+0xd8/0x178 driver_detach+0x50/0x9c bus_remove_driver+0x6c/0xbc driver_unregister+0x30/0x60 platform_driver_unregister+0x14/0x20 ufs_qcom_pltform_exit+0x18/0xb94 [ufs_qcom] __arm64_sys_delete_module+0x180/0x260 invoke_syscall+0x44/0x100 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xdc el0t_64_sync_handler+0xc0/0xc4 el0t_64_sync+0x190/0x194 Cc: stable@vger.kernel.org # 6.3 Fixes: 519b6274a777 ("scsi: ufs: qcom: Add MCQ ESI config vendor specific ops") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241111-ufs_bug_fix-v1-2-45ad8b62f02e@linaro.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mark Brown <broonie@kernel.org> Date: Wed Nov 27 16:14:22 2024 +0000 selftest: hugetlb_dio: fix test naming commit 4ae132c693896b0713db572676c90ffd855a4246 upstream. The string logged when a test passes or fails is used by the selftest framework to identify which test is being reported. The hugetlb_dio test not only uses the same strings for every test that is run but it also uses different strings for test passes and failures which means that test automation is unable to follow what the test is doing at all. Pull the existing duplicated logging of the number of free huge pages before and after the test out of the conditional and replace that and the logging of the result with a single ksft_print_result() which incorporates the parameters passed into the test into the output. Link: https://lkml.kernel.org/r/20241127-kselftest-mm-hugetlb-dio-names-v1-1-22aab01bf550@kernel.org Fixes: fae1980347bf ("selftests: hugetlb_dio: fixup check for initial conditions to skip in the start") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Maximilian Heyne <mheyne@amazon.de> Date: Wed Nov 27 12:08:53 2024 +0000 selftests/damon: add _damon_sysfs.py to TEST_FILES commit 4a475c0a7eeb3368eca40fe7cb02d157eeddc77a upstream. When running selftests I encountered the following error message with some damon tests: # Traceback (most recent call last): # File "[...]/damon/./damos_quota.py", line 7, in <module> # import _damon_sysfs # ModuleNotFoundError: No module named '_damon_sysfs' Fix this by adding the _damon_sysfs.py file to TEST_FILES so that it will be available when running the respective damon selftests. Link: https://lkml.kernel.org/r/20241127-picks-visitor-7416685b-mheyne@amazon.de Fixes: 306abb63a8ca ("selftests/damon: implement a python module for test-purpose DAMON sysfs controls") Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hari Bathini <hbathini@linux.ibm.com> Date: Sat Nov 30 01:56:21 2024 +0530 selftests/ftrace: adjust offset for kprobe syntax error test [ Upstream commit 777f290ab328de333b85558bb6807a69a59b36ba ] In 'NOFENTRY_ARGS' test case for syntax check, any offset X of `vfs_read+X` except function entry offset (0) fits the criterion, even if that offset is not at instruction boundary, as the parser comes before probing. But with "ENDBR64" instruction on x86, offset 4 is treated as function entry. So, X can't be 4 as well. Thus, 8 was used as offset for the test case. On 64-bit powerpc though, any offset <= 16 can be considered function entry depending on build configuration (see arch_kprobe_on_func_entry() for implementation details). So, use `vfs_read+20` to accommodate that scenario too. Link: https://lore.kernel.org/r/20241129202621.721159-1-hbathini@linux.ibm.com Fixes: 4231f30fcc34a ("selftests/ftrace: Add BTF arguments test cases") Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Reinette Chatre <reinette.chatre@intel.com> Date: Thu Oct 24 14:18:42 2024 -0700 selftests/resctrl: Protect against array overflow when reading strings [ Upstream commit 46058430fc5d39c114f7e1b9c6ff14c9f41bd531 ] resctrl selftests discover system properties via a variety of sysfs files. The MBM and MBA tests need to discover the event and umask with which to configure the performance event used to measure read memory bandwidth. This is done by parsing the contents of /sys/bus/event_source/devices/uncore_imc_<imc instance>/events/cas_count_read Similarly, the resctrl selftests discover the cache size via /sys/bus/cpu/devices/cpu<id>/cache/index<index>/size. Take care to do bounds checking when using fscanf() to read the contents of files into a string buffer because by default fscanf() assumes arbitrarily long strings. If the file contains more bytes than the array can accommodate then an overflow will occur. Provide a maximum field width to the conversion specifier to protect against array overflow. The maximum is one less than the array size because string input stores a terminating null byte that is not covered by the maximum field width. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maximilian Heyne <mheyne@amazon.de> Date: Tue Nov 26 13:58:50 2024 +0000 selftests: hid: fix typo and exit code [ Upstream commit e8f34747bddedaf3895e5d5066e0f71713fff811 ] The correct exit code to mark a test as skipped is 4. Fixes: ffb85d5c9e80 ("selftests: hid: import hid-tools hid-core tests") Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Link: https://patch.msgid.link/20241126135850.76493-1-mheyne@amazon.de Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Inochi Amaoto <inochiama@gmail.com> Date: Thu Oct 24 14:21:03 2024 +0800 serial: 8250_dw: Add Sophgo SG2044 quirk [ Upstream commit cad4dda82c7eedcfc22597267e710ccbcf39d572 ] SG2044 relys on an internal divisor when calculating bitrate, which means a wrong clock for the most common bitrates. So add a quirk for this uart device to skip the set rate call and only relys on the internal UART divisor. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Link: https://lore.kernel.org/r/20241024062105.782330-4-inochiama@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rasmus Villemoes <linux@rasmusvillemoes.dk> Date: Mon Nov 18 12:01:54 2024 +0100 setlocalversion: work around "git describe" performance [ Upstream commit 523f3dbc187a9618d4fd80c2b438e4d490705dcd ] Contrary to expectations, passing a single candidate tag to "git describe" is slower than not passing any --match options. $ time git describe --debug ... traversed 10619 commits ... v6.12-rc5-63-g0fc810ae3ae1 real 0m0.169s $ time git describe --match=v6.12-rc5 --debug ... traversed 1310024 commits v6.12-rc5-63-g0fc810ae3ae1 real 0m1.281s In fact, the --debug output shows that git traverses all or most of history. For some repositories and/or git versions, those 1.3s are actually 10-15 seconds. This has been acknowledged as a performance bug in git [1], and a fix is on its way [2]. However, no solution is yet in git.git, and even when one lands, it will take quite a while before it finds its way to a release and for $random_kernel_developer to pick that up. So rewrite the logic to use plumbing commands. For each of the candidate values of $tag, we ask: (1) is $tag even an annotated tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of HEAD? (3) If so, how many commits are in $tag..HEAD? I have tested that this produces the same output as the current script for ~700 random commits between v6.9..v6.10. For those 700 commits, and in my git repo, the 'make -s kernelrelease' command is on average ~4 times faster with this patch applied (geometric mean of ratios). For the commit mentioned in Josh's original report [3], the time-consuming part of setlocalversion goes from $ time git describe --match=v6.12-rc5 c1e939a21eb1 v6.12-rc5-44-gc1e939a21eb1 real 0m1.210s to $ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1 0 44 real 0m0.037s [1] https://lore.kernel.org/git/20241101113910.GA2301440@coredump.intra.peff.net/ [2] https://lore.kernel.org/git/20241106192236.GC880133@coredump.intra.peff.net/ [3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/ Reported-by: Sean Christopherson <seanjc@google.com> Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/ Reported-by: Josh Poimboeuf <jpoimboe@kernel.org> Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/ Tested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Steve French <stfrench@microsoft.com> Date: Wed Dec 4 17:46:00 2024 -0600 smb3.1.1: fix posix mounts to older servers commit ddca5023091588eb303e3c0097d95c325992d05f upstream. Some servers which implement the SMB3.1.1 POSIX extensions did not set the file type in the mode in the infolevel 100 response. With the recent changes for checking the file type via the mode field, this can cause the root directory to be reported incorrectly and mounts (e.g. to ksmbd) to fail. Fixes: 6a832bc8bbb2 ("fs/smb/client: Implement new SMB3 POSIX type") Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Cc: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Paulo Alcantara <pc@manguebit.com> Date: Tue Nov 26 15:55:53 2024 -0300 smb: client: don't try following DFS links in cifs_tree_connect() [ Upstream commit 36008fe6e3dc588e5e9ceae6e82c7f69399eb5d8 ] We can't properly support chasing DFS links in cifs_tree_connect() because (1) We don't support creating new sessions while we're reconnecting, which would be required for DFS interlinks. (2) ->is_path_accessible() can't be called from cifs_tree_connect() as it would deadlock with smb2_reconnect(). This is required for checking if new DFS target is a nested DFS link. By unconditionally trying to get an DFS referral from new DFS target isn't correct because if the new DFS target (interlink) is an DFS standalone namespace, then we would end up getting -ELOOP and then potentially leaving tcon disconnected. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paulo Alcantara <pc@manguebit.com> Date: Fri Dec 6 11:49:07 2024 -0300 smb: client: fix potential race in cifs_put_tcon() [ Upstream commit c32b624fa4f7ca5a2ff217a0b1b2f1352bb4ec11 ] dfs_cache_refresh() delayed worker could race with cifs_put_tcon(), so make sure to call list_replace_init() on @tcon->dfs_ses_list after kworker is cancelled or finished. Fixes: 4f42a8b54b5c ("smb: client: fix DFS interlink failover") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kees Cook <kees@kernel.org> Date: Sun Nov 17 03:32:09 2024 -0800 smb: client: memcpy() with surrounding object base address [ Upstream commit f69b0187f8745a7a9584f6b13f5e792594b88b2e ] Like commit f1f047bd7ce0 ("smb: client: Fix -Wstringop-overflow issues"), adjust the memcpy() destination address to be based off the surrounding object rather than based off the 4-byte "Protocol" member. This avoids a build-time warning when compiling under CONFIG_FORTIFY_SOURCE with GCC 15: In function 'fortify_memcpy_chk', inlined from 'CIFSSMBSetPathInfo' at ../fs/smb/client/cifssmb.c:5358:2: ../include/linux/fortify-string.h:571:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 571 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marek Vasut <marex@denx.de> Date: Sun Sep 29 20:49:16 2024 +0200 soc: imx8m: Probe the SoC driver as platform driver [ Upstream commit 9cc832d37799dbea950c4c8a34721b02b8b5a8ff ] With driver_async_probe=* on kernel command line, the following trace is produced because on i.MX8M Plus hardware because the soc-imx8m.c driver calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock driver is not yet probed. This was not detected during regular testing without driver_async_probe. Convert the SoC code to platform driver and instantiate a platform device in its current device_initcall() to probe the platform driver. Rework .soc_revision callback to always return valid error code and return SoC revision via parameter. This way, if anything in the .soc_revision callback return -EPROBE_DEFER, it gets propagated to .probe and the .probe will get retried later. " ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1 at drivers/soc/imx/soc-imx8m.c:115 imx8mm_soc_revision+0xdc/0x180 CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-next-20240924-00002-g2062bb554dea #603 Hardware name: DH electronics i.MX8M Plus DHCOM Premium Developer Kit (3) (DT) pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : imx8mm_soc_revision+0xdc/0x180 lr : imx8mm_soc_revision+0xd0/0x180 sp : ffff8000821fbcc0 x29: ffff8000821fbce0 x28: 0000000000000000 x27: ffff800081810120 x26: ffff8000818a9970 x25: 0000000000000006 x24: 0000000000824311 x23: ffff8000817f42c8 x22: ffff0000df8be210 x21: fffffffffffffdfb x20: ffff800082780000 x19: 0000000000000001 x18: ffffffffffffffff x17: ffff800081fff418 x16: ffff8000823e1000 x15: ffff0000c03b65e8 x14: ffff0000c00051b0 x13: ffff800082790000 x12: 0000000000000801 x11: ffff80008278ffff x10: ffff80008209d3a6 x9 : ffff80008062e95c x8 : ffff8000821fb9a0 x7 : 0000000000000000 x6 : 00000000000080e3 x5 : ffff0000df8c03d8 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : fffffffffffffdfb x0 : fffffffffffffdfb Call trace: imx8mm_soc_revision+0xdc/0x180 imx8_soc_init+0xb0/0x1e0 do_one_initcall+0x94/0x1a8 kernel_init_freeable+0x240/0x2a8 kernel_init+0x28/0x140 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- SoC: i.MX8MP revision 1.1 " Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <quic_kdybcio@quicinc.com> Date: Tue Sep 10 17:01:39 2024 +0200 soc: qcom: llcc: Use designated initializers for LLC settings [ Upstream commit 20a0a05f40faf82f64f1c2ad3e9f5006b80ca0cb ] The current way of storing the configuration is very much unmaintainable. Convert the data to use designated initializers to make it easier both to understand and add/update the slice configuration data. Signed-off-by: Konrad Dybcio <quic_kdybcio@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240910-topic-llcc_unwrap-v2-1-f0487c983373@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com> Date: Fri Oct 4 20:47:29 2024 -0700 soc: qcom: pd-mapper: Add QCM6490 PD maps [ Upstream commit 31a95fe0851afbbc697b6be96c8a81a01d65aa5f ] The QCM6490 is a variant of SC7280, with the usual set of protection domains, and hence the need for a PD-mapper. In particular USB Type-C port management and battery management is pmic_glink based. Add an entry to the kernel, to avoid the need for userspace to provide this service. Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241004-qcm6490-pd-mapper-v1-1-d6f4bc3bffa3@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: K Prateek Nayak <kprateek.nayak@amd.com> Date: Tue Nov 19 05:44:29 2024 +0000 softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel commit 6675ce20046d149e1e1ffe7e9577947dee17aad5 upstream. do_softirq_post_smp_call_flush() on PREEMPT_RT kernels carries a WARN_ON_ONCE() for any SOFTIRQ being raised from an SMP-call-function. Since do_softirq_post_smp_call_flush() is called with preempt disabled, raising a SOFTIRQ during flush_smp_call_function_queue() can lead to longer preempt disabled sections. Since commit b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()") IPIs to an idle CPU in TIF_POLLING_NRFLAG mode can be optimized out by instead setting TIF_NEED_RESCHED bit in idle task's thread_info and relying on the flush_smp_call_function_queue() in the idle-exit path to run the SMP-call-function. To trigger an idle load balancing, the scheduler queues nohz_csd_function() responsible for triggering an idle load balancing on a target nohz idle CPU and sends an IPI. Only now, this IPI is optimized out and the SMP-call-function is executed from flush_smp_call_function_queue() in do_idle() which can raise a SCHED_SOFTIRQ to trigger the balancing. So far, this went undetected since, the need_resched() check in nohz_csd_function() would make it bail out of idle load balancing early as the idle thread does not clear TIF_POLLING_NRFLAG before calling flush_smp_call_function_queue(). The need_resched() check was added with the intent to catch a new task wakeup, however, it has recently discovered to be unnecessary and will be removed in the subsequent commit after which nohz_csd_function() can raise a SCHED_SOFTIRQ from flush_smp_call_function_queue() to trigger an idle load balance on an idle target in TIF_POLLING_NRFLAG mode. nohz_csd_function() bails out early if "idle_cpu()" check for the target CPU, and does not lock the target CPU's rq until the very end, once it has found tasks to run on the CPU and will not inhibit the wakeup of, or running of a newly woken up higher priority task. Account for this and prevent a WARN_ON_ONCE() when SCHED_SOFTIRQ is raised from flush_smp_call_function_queue(). Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119054432.6405-2-kprateek.nayak@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pei Xiao <xiaopei01@kylinos.cn> Date: Thu Nov 28 16:38:17 2024 +0800 spi: mpc52xx: Add cancel_work_sync before module remove [ Upstream commit 984836621aad98802d92c4a3047114cf518074c8 ] If we remove the module which will call mpc52xx_spi_remove it will free 'ms' through spi_unregister_controller. while the work ms->work will be used. The sequence of operations that may lead to a UAF bug. Fix it by ensuring that the work is canceled before proceeding with the cleanup in mpc52xx_spi_remove. Fixes: ca632f556697 ("spi: reorganize drivers") Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Link: https://patch.msgid.link/1f16f8ae0e50ca9adb1dc849bf2ac65a40c9ceb9.1732783000.git.xiaopei01@kylinos.cn Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Wahren <wahrenst@gmx.net> Date: Mon Sep 30 11:30:54 2024 +0200 spi: spi-fsl-lpspi: Adjust type of scldiv [ Upstream commit fa8ecda9876ac1e7b29257aa82af1fd0695496e2 ] The target value of scldiv is just a byte, but its calculation in fsl_lpspi_set_bitrate could be negative. So use an adequate type to store the result and avoid overflows. After that this needs range check adjustments, but this should make the code less opaque. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20240930093056.93418-2-wahrenst@gmx.net Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marco Elver <elver@google.com> Date: Fri Nov 22 16:39:47 2024 +0100 stackdepot: fix stack_depot_save_flags() in NMI context commit 031e04bdc834cda3b054ef6b698503b2b97e8186 upstream. Per documentation, stack_depot_save_flags() was meant to be usable from NMI context if STACK_DEPOT_FLAG_CAN_ALLOC is unset. However, it still would try to take the pool_lock in an attempt to save a stack trace in the current pool (if space is available). This could result in deadlock if an NMI is handled while pool_lock is already held. To avoid deadlock, only try to take the lock in NMI context and give up if unsuccessful. The documentation is fixed to clearly convey this. Link: https://lkml.kernel.org/r/Z0CcyfbPqmxJ9uJH@elver.google.com Link: https://lkml.kernel.org/r/20241122154051.3914732-1-elver@google.com Fixes: 4434a56ec209 ("stackdepot: make fast paths lock-less again") Signed-off-by: Marco Elver <elver@google.com> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zijian Zhang <zijianzhang@bytedance.com> Date: Wed Oct 16 23:48:38 2024 +0000 tcp_bpf: Fix the sk_mem_uncharge logic in tcp_bpf_sendmsg [ Upstream commit ca70b8baf2bd125b2a4d96e76db79375c07d7ff2 ] The current sk memory accounting logic in __SK_REDIRECT is pre-uncharging tosend bytes, which is either msg->sg.size or a smaller value apply_bytes. Potential problems with this strategy are as follows: - If the actual sent bytes are smaller than tosend, we need to charge some bytes back, as in line 487, which is okay but seems not clean. - When tosend is set to apply_bytes, as in line 417, and (ret < 0), we may miss uncharging (msg->sg.size - apply_bytes) bytes. [...] 415 tosend = msg->sg.size; 416 if (psock->apply_bytes && psock->apply_bytes < tosend) 417 tosend = psock->apply_bytes; [...] 443 sk_msg_return(sk, msg, tosend); 444 release_sock(sk); 446 origsize = msg->sg.size; 447 ret = tcp_bpf_sendmsg_redir(sk_redir, redir_ingress, 448 msg, tosend, flags); 449 sent = origsize - msg->sg.size; [...] 454 lock_sock(sk); 455 if (unlikely(ret < 0)) { 456 int free = sk_msg_free_nocharge(sk, msg); 458 if (!cork) 459 *copied -= free; 460 } [...] 487 if (eval == __SK_REDIRECT) 488 sk_mem_charge(sk, tosend - sent); [...] When running the selftest test_txmsg_redir_wait_sndmem with txmsg_apply, the following warning will be reported: ------------[ cut here ]------------ WARNING: CPU: 6 PID: 57 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x190/0x1a0 Modules linked in: CPU: 6 UID: 0 PID: 57 Comm: kworker/6:0 Not tainted 6.12.0-rc1.bm.1-amd64+ #43 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Workqueue: events sk_psock_destroy RIP: 0010:inet_sock_destruct+0x190/0x1a0 RSP: 0018:ffffad0a8021fe08 EFLAGS: 00010206 RAX: 0000000000000011 RBX: ffff9aab4475b900 RCX: ffff9aab481a0800 RDX: 0000000000000303 RSI: 0000000000000011 RDI: ffff9aab4475b900 RBP: ffff9aab4475b990 R08: 0000000000000000 R09: ffff9aab40050ec0 R10: 0000000000000000 R11: ffff9aae6fdb1d01 R12: ffff9aab49c60400 R13: ffff9aab49c60598 R14: ffff9aab49c60598 R15: dead000000000100 FS: 0000000000000000(0000) GS:ffff9aae6fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffec7e47bd8 CR3: 00000001a1a1c004 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x89/0x130 ? inet_sock_destruct+0x190/0x1a0 ? report_bug+0xfc/0x1e0 ? handle_bug+0x5c/0xa0 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? inet_sock_destruct+0x190/0x1a0 __sk_destruct+0x25/0x220 sk_psock_destroy+0x2b2/0x310 process_scheduled_works+0xa3/0x3e0 worker_thread+0x117/0x240 ? __pfx_worker_thread+0x10/0x10 kthread+0xcf/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> ---[ end trace 0000000000000000 ]--- In __SK_REDIRECT, a more concise way is delaying the uncharging after sent bytes are finalized, and uncharge this value. When (ret < 0), we shall invoke sk_msg_free. Same thing happens in case __SK_DROP, when tosend is set to apply_bytes, we may miss uncharging (msg->sg.size - apply_bytes) bytes. The same warning will be reported in selftest. [...] 468 case __SK_DROP: 469 default: 470 sk_msg_free_partial(sk, msg, tosend); 471 sk_msg_apply_bytes(psock, tosend); 472 *copied -= (tosend + delta); 473 return -EACCES; [...] So instead of sk_msg_free_partial we can do sk_msg_free here. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Fixes: 8ec95b94716a ("bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241016234838.3167769-3-zijianzhang@bytedance.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Barnabás Czémán <barnabas.czeman@mainlining.org> Date: Wed Nov 13 16:11:46 2024 +0100 thermal/drivers/qcom/tsens-v1: Add support for MSM8937 tsens [ Upstream commit e2ffb6c3a40ee714160e35e61f0a984028b5d550 ] Add support for tsens v1.4 block what can be found in MSM8937 and MSM8917. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org> Link: https://lore.kernel.org/r/20241113-msm8917-v6-5-c348fb599fef@mainlining.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Gleixner <tglx@linutronix.de> Date: Thu Oct 31 13:04:08 2024 +0100 timekeeping: Always check for negative motion [ Upstream commit c163e40af9b2331b2c629fd4ec8b703ed4d4ae39 ] clocksource_delta() has two variants. One with a check for negative motion, which is only selected by x86. This is a historic leftover as this function was previously used in the time getter hot paths. Since 135225a363ae timekeeping_cycles_to_ns() has unconditional protection against this as a by-product of the protection against 64bit math overflow. clocksource_delta() is only used in the clocksource watchdog and in timekeeping_advance(). The extra conditional there is not hurting anyone. Remove the config option and unconditionally prevent negative motion of the readout. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/all/20241031120328.599430157@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Gleixner <tglx@linutronix.de> Date: Thu Oct 31 13:04:07 2024 +0100 timekeeping: Remove CONFIG_DEBUG_TIMEKEEPING commit d44d26987bb3df6d76556827097fc9ce17565cb8 upstream. Since 135225a363ae timekeeping_cycles_to_ns() handles large offsets which would lead to 64bit multiplication overflows correctly. It's also protected against negative motion of the clocksource unconditionally, which was exclusive to x86 before. timekeeping_advance() handles large offsets already correctly. That means the value of CONFIG_DEBUG_TIMEKEEPING which analyzed these cases is very close to zero. Remove all of it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/all/20241031120328.536010148@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kuniyuki Iwashima <kuniyu@amazon.com> Date: Wed Nov 27 14:05:12 2024 +0900 tipc: Fix use-after-free of kernel socket in cleanup_bearer(). [ Upstream commit 6a2fa13312e51a621f652d522d7e2df7066330b6 ] syzkaller reported a use-after-free of UDP kernel socket in cleanup_bearer() without repro. [0][1] When bearer_disable() calls tipc_udp_disable(), cleanup of the UDP kernel socket is deferred by work calling cleanup_bearer(). tipc_net_stop() waits for such works to finish by checking tipc_net(net)->wq_count. However, the work decrements the count too early before releasing the kernel socket, unblocking cleanup_net() and resulting in use-after-free. Let's move the decrement after releasing the socket in cleanup_bearer(). [0]: ref_tracker: net notrefcnt@000000009b3d1faf has 1/1 users at sk_alloc+0x438/0x608 inet_create+0x4c8/0xcb0 __sock_create+0x350/0x6b8 sock_create_kern+0x58/0x78 udp_sock_create4+0x68/0x398 udp_sock_create+0x88/0xc8 tipc_udp_enable+0x5e8/0x848 __tipc_nl_bearer_enable+0x84c/0xed8 tipc_nl_bearer_enable+0x38/0x60 genl_family_rcv_msg_doit+0x170/0x248 genl_rcv_msg+0x400/0x5b0 netlink_rcv_skb+0x1dc/0x398 genl_rcv+0x44/0x68 netlink_unicast+0x678/0x8b0 netlink_sendmsg+0x5e4/0x898 ____sys_sendmsg+0x500/0x830 [1]: BUG: KMSAN: use-after-free in udp_hashslot include/net/udp.h:85 [inline] BUG: KMSAN: use-after-free in udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979 udp_hashslot include/net/udp.h:85 [inline] udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979 sk_common_release+0xaf/0x3f0 net/core/sock.c:3820 inet_release+0x1e0/0x260 net/ipv4/af_inet.c:437 inet6_release+0x6f/0xd0 net/ipv6/af_inet6.c:489 __sock_release net/socket.c:658 [inline] sock_release+0xa0/0x210 net/socket.c:686 cleanup_bearer+0x42d/0x4c0 net/tipc/udp_media.c:819 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391 kthread+0x531/0x6b0 kernel/kthread.c:389 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244 Uninit was created at: slab_free_hook mm/slub.c:2269 [inline] slab_free mm/slub.c:4580 [inline] kmem_cache_free+0x207/0xc40 mm/slub.c:4682 net_free net/core/net_namespace.c:454 [inline] cleanup_net+0x16f2/0x19d0 net/core/net_namespace.c:647 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391 kthread+0x531/0x6b0 kernel/kthread.c:389 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244 CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.12.0-rc1-00131-gf66ebf37d69c #7 91723d6f74857f70725e1583cba3cf4adc716cfa Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 Workqueue: events cleanup_bearer Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241127050512.28438-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: furkanonder <furkanonder@protonmail.com> Date: Mon Oct 21 15:12:30 2024 +0000 tools/rtla: Enhance argument parsing in timerlat_load.py [ Upstream commit bd26818343dc02936a4f2f7b63368d5e1e1773c8 ] The enhancements made to timerlat_load.py are aimed at improving the clarity of argument parsing. Summary of Changes: - The cpu argument is now specified as an integer type in the argument parser to enforce input validation, and the construction of affinity_mask has been simplified to directly use the integer value of args.cpu. - The prio argument is similarly updated to be of integer type for consistency and validation, eliminating the need for the conversion of args.prio to an integer, as this is now handled by the argument parser. Cc: "jkacur@redhat.com" <jkacur@redhat.com> Cc: "lgoncalv@redhat.com" <lgoncalv@redhat.com> Link: https://lore.kernel.org/QfgO7ayKD9dsLk8_ZDebkAV0OF7wla7UmasbP9CBmui_sChOeizy512t3RqCHTjvQoUBUDP8dwEOVCdHQ5KvVNEiP69CynMY94SFDERWl94=@protonmail.com Signed-off-by: Furkan Onder <furkanonder@protonmail.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jan Stancek <jstancek@redhat.com> Date: Thu Oct 10 17:09:48 2024 +0200 tools/rtla: fix collision with glibc sched_attr/sched_set_attr [ Upstream commit 0eecee340672c4b512f6f4a8c6add26df05d130c ] glibc commit 21571ca0d703 ("Linux: Add the sched_setattr and sched_getattr functions") now also provides 'struct sched_attr' and sched_setattr() which collide with the ones from rtla. In file included from src/trace.c:11: src/utils.h:49:8: error: redefinition of ‘struct sched_attr’ 49 | struct sched_attr { | ^~~~~~~~~~ In file included from /usr/include/bits/sched.h:60, from /usr/include/sched.h:43, from /usr/include/tracefs/tracefs.h:10, from src/trace.c:4: /usr/include/linux/sched/types.h:98:8: note: originally defined here 98 | struct sched_attr { | ^~~~~~~~~~ Define 'struct sched_attr' conditionally, similar to what strace did: https://lore.kernel.org/all/20240930222913.3981407-1-raj.khem@gmail.com/ and rename rtla's version of sched_setattr() to avoid collision. Link: https://lore.kernel.org/8088f66a7a57c1b209cd8ae0ae7c336a7f8c930d.1728572865.git.jstancek@redhat.com Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Björn Töpel <bjorn@rivosinc.com> Date: Wed Nov 27 11:17:46 2024 +0100 tools: Override makefile ARCH variable if defined, but empty [ Upstream commit 537a2525eaf76ea9b0dca62b994500d8670b39d5 ] There are a number of tools (bpftool, selftests), that require a "bootstrap" build. Here, a bootstrap build is a build host variant of a target. E.g., assume that you're performing a bpftool cross-build on x86 to riscv, a bootstrap build would then be an x86 variant of bpftool. The typical way to perform the host build variant, is to pass "ARCH=" in a sub-make. However, if a variable has been set with a command argument, then ordinary assignments in the makefile are ignored. This side-effect results in that ARCH, and variables depending on ARCH are not set. Workaround by overriding ARCH to the host arch, if ARCH is empty. Fixes: 8859b0da5aac ("tools/bpftool: Fix cross-build") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Quentin Monnet <qmo@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/bpf/20241127101748.165693-1-bjorn@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Masami Hiramatsu (Google) <mhiramat@kernel.org> Date: Sat Nov 30 01:47:47 2024 +0900 tracing/eprobe: Fix to release eprobe when failed to add dyn_event [ Upstream commit 494b332064c0ce2f7392fa92632bc50191c1b517 ] Fix eprobe event to unregister event call and release eprobe when it fails to add dynamic event correctly. Link: https://lore.kernel.org/all/173289886698.73724.1959899350183686006.stgit@devnote2/ Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Date: Tue Oct 8 21:07:12 2024 -0400 tracing/ftrace: disable preemption in syscall probe [ Upstream commit 13d750c2c03e9861e15268574ed2c239cca9c9d5 ] In preparation for allowing system call enter/exit instrumentation to handle page faults, make sure that ftrace can handle this change by explicitly disabling preemption within the ftrace system call tracepoint probes to respect the current expectations within ftrace ring buffer code. This change does not yet allow ftrace to take page faults per se within its probe, but allows its existing probes to adapt to the upcoming change. Cc: Michael Jeanson <mjeanson@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: bpf@vger.kernel.org Cc: Joel Fernandes <joel@joelfernandes.org> Link: https://lore.kernel.org/20241009010718.2050182-3-mathieu.desnoyers@efficios.com Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuan-Wei Chiu <visitorckw@gmail.com> Date: Wed Dec 4 04:22:28 2024 +0800 tracing: Fix cmp_entries_dup() to respect sort() comparison rules commit e63fbd5f6810ed756bbb8a1549c7d4132968baa9 upstream. The cmp_entries_dup() function used as the comparator for sort() violated the symmetry and transitivity properties required by the sorting algorithm. Specifically, it returned 1 whenever memcmp() was non-zero, which broke the following expectations: * Symmetry: If x < y, then y > x. * Transitivity: If x < y and y < z, then x < z. These violations could lead to incorrect sorting and failure to correctly identify duplicate elements. Fix the issue by directly returning the result of memcmp(), which adheres to the required comparison properties. Cc: stable@vger.kernel.org Fixes: 08d43a5fa063 ("tracing: Add lock-free tracing_map") Link: https://lore.kernel.org/20241203202228.1274403-1-visitorckw@gmail.com Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tatsuya S <tatsuya.s2862@gmail.com> Date: Mon Oct 21 07:14:53 2024 +0000 tracing: Fix function name for trampoline [ Upstream commit 6ce5a6f0a07d37cc377df08a8d8a9c283420f323 ] The issue that unrelated function name is shown on stack trace like following even though it should be trampoline code address is caused by the creation of trampoline code in the area where .init.text section of module was freed after module is loaded. bash-1344 [002] ..... 43.644608: <stack trace> => (MODULE INIT FUNCTION) => vfs_write => ksys_write => do_syscall_64 => entry_SYSCALL_64_after_hwframe To resolve this, when function address of stack trace entry is in trampoline, output without looking up symbol name. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241021071454.34610-2-tatsuya.s2862@gmail.com Signed-off-by: Tatsuya S <tatsuya.s2862@gmail.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uros Bizjak <ubizjak@gmail.com> Date: Mon Oct 7 10:56:28 2024 +0200 tracing: Use atomic64_inc_return() in trace_clock_counter() [ Upstream commit eb887c4567d1b0e7684c026fe7df44afa96589e6 ] Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref) to use optimized implementation and ease register pressure around the primitive for targets that implement optimized variant. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241007085651.48544-1-ubizjak@gmail.com Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xu Yang <xu.yang_2@nxp.com> Date: Mon Sep 23 16:12:01 2024 +0800 usb: chipidea: add CI_HDRC_HAS_SHORT_PKT_LIMIT flag [ Upstream commit ec841b8d73cff37f8960e209017efe1eb2fb21f2 ] Currently, the imx deivice controller has below limitations: 1. can't generate short packet interrupt if IOC not set in dTD. So if one request span more than one dTDs and only the last dTD set IOC, the usb request will pending there if no more data comes. 2. the controller can't accurately deliver data to differtent usb requests in some cases due to short packet. For example: one usb request span 3 dTDs, then if the controller received a short packet the next packet will go to 2nd dTD of current request rather than the first dTD of next request. 3. can't build a bus packet use multiple dTDs. For example: controller needs to send one packet of 512 bytes use dTD1 (200 bytes) + dTD2 (312 bytes), actually the host side will see 200 bytes short packet. Based on these limits, add CI_HDRC_HAS_SHORT_PKT_LIMIT flag and use it on imx platforms. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20240923081203.2851768-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xu Yang <xu.yang_2@nxp.com> Date: Mon Sep 23 16:12:03 2024 +0800 usb: chipidea: udc: create bounce buffer for problem sglist entries if possible [ Upstream commit edfcc455c85ccc5855f0c329ca5a2d85cc9fc6c6 ] The chipidea controller doesn't fully support sglist, such as it can not transfer data spanned more dTDs to form a bus packet, so it can only work on very limited cases. The limitations as below: 1. the end address of the first sg buffer must be 4KB aligned. 2. the start and end address of the middle sg buffer must be 4KB aligned. 3. the start address of the first sg buffer must be 4KB aligned. However, not all the use cases violate these limitations. To make the controller compatible with most of the cases, this will try to bounce the problem sglist entries which can be found by sglist_get_invalid_entry(). Then a bounced line buffer (the size will roundup to page size) will be allocated to replace the remaining problem sg entries. The data will be copied between problem sg entries and bounce buffer according to the transfer direction. The bounce buffer will be freed when the request completed. Acked-by: Peter Chen <peter.chen@kernel.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240923081203.2851768-3-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xu Yang <xu.yang_2@nxp.com> Date: Thu Sep 26 10:29:04 2024 +0800 usb: chipidea: udc: handle USB Error Interrupt if IOC not set [ Upstream commit 548f48b66c0c5d4b9795a55f304b7298cde2a025 ] As per USBSTS register description about UEI: When completion of a USB transaction results in an error condition, this bit is set by the Host/Device Controller. This bit is set along with the USBINT bit, if the TD on which the error interrupt occurred also had its interrupt on complete (IOC) bit set. UI is set only when IOC set. Add checking UEI to fix miss call isr_tr_complete_handler() when IOC have not set and transfer error happen. Acked-by: Peter Chen <peter.chen@kernel.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240926022906.473319-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xu Yang <xu.yang_2@nxp.com> Date: Mon Sep 23 16:12:02 2024 +0800 usb: chipidea: udc: limit usb request length to max 16KB [ Upstream commit ca8d18aa7b0f22d66a3ca9a90d8f73431b8eca89 ] To let the device controller work properly on short packet limitations, one usb request should only correspond to one dTD. Then every dTD will set IOC. In theory, each dTD support up to 20KB data transfer if the offset is 0. Due to we cannot predetermine the offset, this will limit the usb request length to max 16KB. This should be fine since most of the user transfer data based on this size policy. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20240923081203.2851768-2-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Saranya Gopal <saranya.gopal@intel.com> Date: Fri Aug 30 14:13:42 2024 +0530 usb: typec: ucsi: Do not call ACPI _DSM method for UCSI read operations [ Upstream commit fa48d7e81624efdf398b990a9049e9cd75a5aead ] ACPI _DSM methods are needed only for UCSI write operations and for reading CCI during RESET_PPM operation. So, remove _DSM calls from other places. While there, remove the Zenbook quirk also since the default behavior now aligns with the Zenbook quirk. With this change, GET_CONNECTOR_STATUS returns at least 6 seconds faster than before in Arrowlake-S platforms. Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Saranya Gopal <saranya.gopal@intel.com> Reviewed-by: Christian A. Ehrhardt <lk@c--e.de> Link: https://lore.kernel.org/r/20240830084342.460109-1-saranya.gopal@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Sat Nov 9 02:04:15 2024 +0200 usb: typec: ucsi: glink: be more precise on orientation-aware ports [ Upstream commit de9df030ccb5d3e31ee0c715d74cd77c619748f8 ] Instead of checking if any of the USB-C ports have orientation GPIO and thus is orientation-aware, check for the GPIO for the port being registered. There are no boards that are affected by this change at this moment, so the patch is not marked as a fix, but it might affect other boards in future. Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241109-ucsi-glue-fixes-v2-2-8b21ff4f9fbe@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gabriele Monaco <gmonaco@redhat.com> Date: Thu Oct 17 08:42:39 2024 +0200 verification/dot2: Improve dot parser robustness [ Upstream commit 571f8b3f866a6d990a50fe5c89fe0ea78784d70b ] This patch makes the dot parser used by dot2c and dot2k slightly more robust, namely: * allows parsing files with the gv extension (GraphViz) * correctly parses edges with any indentation * used to work only with a single character (e.g. '\t') Additionally it fixes a couple of warnings reported by pylint such as wrong indentation and comparison to False instead of `not ...` Link: https://lore.kernel.org/20241017064238.41394-2-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yishai Hadas <yishaih@nvidia.com> Date: Thu Dec 5 14:26:54 2024 +0200 vfio/mlx5: Align the page tracking max message size with the device capability [ Upstream commit 9c7c5430bca36e9636eabbba0b3b53251479c7ab ] Align the page tracking maximum message size with the device's capability instead of relying on PAGE_SIZE. This adjustment resolves a mismatch on systems where PAGE_SIZE is 64K, but the firmware only supports a maximum message size of 4K. Now that we rely on the device's capability for max_message_size, we must account for potential future increases in its value. Key considerations include: - Supporting message sizes that exceed a single system page (e.g., an 8K message on a 4K system). - Ensuring the RQ size is adjusted to accommodate at least 4 WQEs/messages, in line with the device specification. The above has been addressed as part of the patch. Fixes: 79c3cf279926 ("vfio/mlx5: Init QP based resources for dirty tracking") Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: Yingshun Cui <yicui@redhat.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20241205122654.235619-1-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Date: Tue Oct 29 16:46:12 2024 +0800 virtio-net: fix overflow inside virtnet_rq_alloc [ Upstream commit 6aacd1484468361d1d04badfe75f264fa5314864 ] When the frag just got a page, then may lead to regression on VM. Specially if the sysctl net.core.high_order_alloc_disable value is 1, then the frag always get a page when do refill. Which could see reliable crashes or scp failure (scp a file 100M in size to VM). The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning of a new frag. When the frag size is larger than PAGE_SIZE, everything is fine. However, if the frag is only one page and the total size of the buffer and virtnet_rq_dma is larger than one page, an overflow may occur. The commit f9dac92ba908 ("virtio_ring: enable premapped mode whatever use_dma_api") introduced this problem. And we reverted some commits to fix this in last linux version. Now we try to enable it and fix this bug directly. Here, when the frag size is not enough, we reduce the buffer len to fix this problem. Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com> Tested-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Shkolnyy <kshk@linux.ibm.com> Date: Tue Dec 3 09:06:54 2024 -0600 vsock/test: fix failures due to wrong SO_RCVLOWAT parameter [ Upstream commit 7ce1c0921a806ee7d4bb24f74a3b30c89fc5fb39 ] This happens on 64-bit big-endian machines. SO_RCVLOWAT requires an int parameter. However, instead of int, the test uses unsigned long in one place and size_t in another. Both are 8 bytes long on 64-bit machines. The kernel, having received the 8 bytes, doesn't test for the exact size of the parameter, it only cares that it's >= sizeof(int), and casts the 4 lower-addressed bytes to an int, which, on a big-endian machine, contains 0. 0 doesn't trigger an error, SO_RCVLOWAT returns with success and the socket stays with the default SO_RCVLOWAT = 1, which results in vsock_test failures, while vsock_perf doesn't even notice that it's failed to change it. Fixes: b1346338fbae ("vsock_test: POLLIN + SO_RCVLOWAT test") Fixes: 542e893fbadc ("vsock/test: two tests to check credit update logic") Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility") Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konstantin Shkolnyy <kshk@linux.ibm.com> Date: Tue Dec 3 09:06:55 2024 -0600 vsock/test: fix parameter types in SO_VM_SOCKETS_* calls [ Upstream commit 3f36ee29e732b68044d531f47d31f22d265954c6 ] Change parameters of SO_VM_SOCKETS_* to unsigned long long as documented in the vm_sockets.h, because the corresponding kernel code requires them to be at least 64-bit, no matter what architecture. Otherwise they are too small on 32-bit machines. Fixes: 5c338112e48a ("test/vsock: rework message bounds test") Fixes: 685a21c314a8 ("test/vsock: add big message test") Fixes: 542e893fbadc ("vsock/test: two tests to check credit update logic") Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility") Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nick Chan <towinchenmi@gmail.com> Date: Wed Oct 2 00:59:51 2024 +0800 watchdog: apple: Actually flush writes after requesting watchdog restart [ Upstream commit 51dfe714c03c066aabc815a2bb2adcc998dfcb30 ] Although there is an existing code comment about flushing the writes, writes were not actually being flushed. Actually flush the writes by changing readl_relaxed() to readl(). Fixes: 4ed224aeaf661 ("watchdog: Add Apple SoC watchdog driver") Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Nick Chan <towinchenmi@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20241001170018.20139-2-towinchenmi@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yassine Oudjana <y.oudjana@protonmail.com> Date: Wed Nov 6 10:47:51 2024 +0000 watchdog: mediatek: Make sure system reset gets asserted in mtk_wdt_restart() [ Upstream commit a1495a21e0b8aad92132dfcf9c6fffc1bde9d5b2 ] Clear the IRQ enable bit of WDT_MODE before asserting software reset in order to make TOPRGU issue a system reset signal instead of an IRQ. Fixes: a44a45536f7b ("watchdog: Add driver for Mediatek watchdog") Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20241106104738.195968-2-y.oudjana@protonmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Sverdlin <alexander.sverdlin@siemens.com> Date: Thu Nov 7 21:38:28 2024 +0100 watchdog: rti: of: honor timeout-sec property commit 4962ee045d8f06638714d801ab0fb72f89c16690 upstream. Currently "timeout-sec" Device Tree property is being silently ignored: even though watchdog_init_timeout() is being used, the driver always passes "heartbeat" == DEFAULT_HEARTBEAT == 60 as argument. Fix this by setting struct watchdog_device::timeout to DEFAULT_HEARTBEAT and passing real module parameter value to watchdog_init_timeout() (which may now be 0 if not specified). Cc: stable@vger.kernel.org Fixes: 2d63908bdbfb ("watchdog: Add K3 RTI watchdog support") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20241107203830.1068456-1-alexander.sverdlin@siemens.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Harini T <harini.t@amd.com> Date: Fri Sep 13 17:02:30 2024 +0530 watchdog: xilinx_wwdt: Calculate max_hw_heartbeat_ms using clock frequency [ Upstream commit 006778844c2c132c28cfa90e3570560351e01b9a ] In the current implementation, the value of max_hw_heartbeat_ms is set to the timeout period expressed in milliseconds and fails to verify if the close window percentage exceeds the maximum value that the hardware supports. 1. Calculate max_hw_heartbeat_ms based on input clock frequency. 2. Update frequency check to require a minimum frequency of 1Mhz. 3. Limit the close and open window percent to hardware supported value to avoid truncation. 4. If the user input timeout exceeds the maximum timeout supported, use only open window and the framework supports the higher timeouts. Fixes: 12984cea1b8c ("watchdog: xilinx_wwdt: Add Versal window watchdog support") Signed-off-by: Harini T <harini.t@amd.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240913113230.1939373-1-harini.t@amd.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kang Yang <quic_kangyang@quicinc.com> Date: Tue Oct 8 10:22:46 2024 +0800 wifi: ath10k: avoid NULL pointer error during sdio remove [ Upstream commit 95c38953cb1ecf40399a676a1f85dfe2b5780a9a ] When running 'rmmod ath10k', ath10k_sdio_remove() will free sdio workqueue by destroy_workqueue(). But if CONFIG_INIT_ON_FREE_DEFAULT_ON is set to yes, kernel panic will happen: Call trace: destroy_workqueue+0x1c/0x258 ath10k_sdio_remove+0x84/0x94 sdio_bus_remove+0x50/0x16c device_release_driver_internal+0x188/0x25c device_driver_detach+0x20/0x2c This is because during 'rmmod ath10k', ath10k_sdio_remove() will call ath10k_core_destroy() before destroy_workqueue(). wiphy_dev_release() will finally be called in ath10k_core_destroy(). This function will free struct cfg80211_registered_device *rdev and all its members, including wiphy, dev and the pointer of sdio workqueue. Then the pointer of sdio workqueue will be set to NULL due to CONFIG_INIT_ON_FREE_DEFAULT_ON. After device release, destroy_workqueue() will use NULL pointer then the kernel panic happen. Call trace: ath10k_sdio_remove ->ath10k_core_unregister …… ->ath10k_core_stop ->ath10k_hif_stop ->ath10k_sdio_irq_disable ->ath10k_hif_power_down ->del_timer_sync(&ar_sdio->sleep_timer) ->ath10k_core_destroy ->ath10k_mac_destroy ->ieee80211_free_hw ->wiphy_free …… ->wiphy_dev_release ->destroy_workqueue Need to call destroy_workqueue() before ath10k_core_destroy(), free the work queue buffer first and then free pointer of work queue by ath10k_core_destroy(). This order matches the error path order in ath10k_sdio_probe(). No work will be queued on sdio workqueue between it is destroyed and ath10k_core_destroy() is called. Based on the call_stack above, the reason is: Only ath10k_sdio_sleep_timer_handler(), ath10k_sdio_hif_tx_sg() and ath10k_sdio_irq_disable() will queue work on sdio workqueue. Sleep timer will be deleted before ath10k_core_destroy() in ath10k_hif_power_down(). ath10k_sdio_irq_disable() only be called in ath10k_hif_stop(). ath10k_core_unregister() will call ath10k_hif_power_down() to stop hif bus, so ath10k_sdio_hif_tx_sg() won't be called anymore. Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00189 Signed-off-by: Kang Yang <quic_kangyang@quicinc.com> Tested-by: David Ruth <druth@chromium.org> Reviewed-by: David Ruth <druth@chromium.org> Link: https://patch.msgid.link/20241008022246.1010-1-quic_kangyang@quicinc.com Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kalle Valo <quic_kvalo@quicinc.com> Date: Mon Oct 7 19:59:27 2024 +0300 wifi: ath12k: fix atomic calls in ath12k_mac_op_set_bitrate_mask() [ Upstream commit 8fac3266c68a8e647240b8ac8d0b82f1821edf85 ] When I try to manually set bitrates: iw wlan0 set bitrates legacy-2.4 1 I get sleeping from invalid context error, see below. Fix that by switching to use recently introduced ieee80211_iterate_stations_mtx(). Do note that WCN6855 firmware is still crashing, I'm not sure if that firmware even supports bitrate WMI commands and should we consider disabling ath12k_mac_op_set_bitrate_mask() for WCN6855? But that's for another patch. BUG: sleeping function called from invalid context at drivers/net/wireless/ath/ath12k/wmi.c:420 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 2236, name: iw preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 3 locks held by iw/2236: #0: ffffffffabc6f1d8 (cb_lock){++++}-{3:3}, at: genl_rcv+0x14/0x40 #1: ffff888138410810 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_pre_doit+0x54d/0x800 [cfg80211] #2: ffffffffab2cfaa0 (rcu_read_lock){....}-{1:2}, at: ieee80211_iterate_stations_atomic+0x2f/0x200 [mac80211] CPU: 3 UID: 0 PID: 2236 Comm: iw Not tainted 6.11.0-rc7-wt-ath+ #1772 Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021 Call Trace: <TASK> dump_stack_lvl+0xa4/0xe0 dump_stack+0x10/0x20 __might_resched+0x363/0x5a0 ? __alloc_skb+0x165/0x340 __might_sleep+0xad/0x160 ath12k_wmi_cmd_send+0xb1/0x3d0 [ath12k] ? ath12k_wmi_init_wcn7850+0xa40/0xa40 [ath12k] ? __netdev_alloc_skb+0x45/0x7b0 ? __asan_memset+0x39/0x40 ? ath12k_wmi_alloc_skb+0xf0/0x150 [ath12k] ? reacquire_held_locks+0x4d0/0x4d0 ath12k_wmi_set_peer_param+0x340/0x5b0 [ath12k] ath12k_mac_disable_peer_fixed_rate+0xa3/0x110 [ath12k] ? ath12k_mac_vdev_stop+0x4f0/0x4f0 [ath12k] ieee80211_iterate_stations_atomic+0xd4/0x200 [mac80211] ath12k_mac_op_set_bitrate_mask+0x5d2/0x1080 [ath12k] ? ath12k_mac_vif_chan+0x320/0x320 [ath12k] drv_set_bitrate_mask+0x267/0x470 [mac80211] ieee80211_set_bitrate_mask+0x4cc/0x8a0 [mac80211] ? __this_cpu_preempt_check+0x13/0x20 nl80211_set_tx_bitrate_mask+0x2bc/0x530 [cfg80211] ? nl80211_parse_tx_bitrate_mask+0x2320/0x2320 [cfg80211] ? trace_contention_end+0xef/0x140 ? rtnl_unlock+0x9/0x10 ? nl80211_pre_doit+0x557/0x800 [cfg80211] genl_family_rcv_msg_doit+0x1f0/0x2e0 ? genl_family_rcv_msg_attrs_parse.isra.0+0x250/0x250 ? ns_capable+0x57/0xd0 genl_family_rcv_msg+0x34c/0x600 ? genl_family_rcv_msg_dumpit+0x310/0x310 ? __lock_acquire+0xc62/0x1de0 ? he_set_mcs_mask.isra.0+0x8d0/0x8d0 [cfg80211] ? nl80211_parse_tx_bitrate_mask+0x2320/0x2320 [cfg80211] ? cfg80211_external_auth_request+0x690/0x690 [cfg80211] genl_rcv_msg+0xa0/0x130 netlink_rcv_skb+0x14c/0x400 ? genl_family_rcv_msg+0x600/0x600 ? netlink_ack+0xd70/0xd70 ? rwsem_optimistic_spin+0x4f0/0x4f0 ? genl_rcv+0x14/0x40 ? down_read_killable+0x580/0x580 ? netlink_deliver_tap+0x13e/0x350 ? __this_cpu_preempt_check+0x13/0x20 genl_rcv+0x23/0x40 netlink_unicast+0x45e/0x790 ? netlink_attachskb+0x7f0/0x7f0 netlink_sendmsg+0x7eb/0xdb0 ? netlink_unicast+0x790/0x790 ? __this_cpu_preempt_check+0x13/0x20 ? selinux_socket_sendmsg+0x31/0x40 ? netlink_unicast+0x790/0x790 __sock_sendmsg+0xc9/0x160 ____sys_sendmsg+0x620/0x990 ? kernel_sendmsg+0x30/0x30 ? __copy_msghdr+0x410/0x410 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xe6/0x1470 ___sys_sendmsg+0xe9/0x170 ? copy_msghdr_from_user+0x120/0x120 ? __lock_acquire+0xc62/0x1de0 ? do_fault_around+0x2c6/0x4e0 ? do_user_addr_fault+0x8c1/0xde0 ? reacquire_held_locks+0x220/0x4d0 ? do_user_addr_fault+0x8c1/0xde0 ? __kasan_check_read+0x11/0x20 ? __fdget+0x4e/0x1d0 ? sockfd_lookup_light+0x1a/0x170 __sys_sendmsg+0xd2/0x180 ? __sys_sendmsg_sock+0x20/0x20 ? reacquire_held_locks+0x4d0/0x4d0 ? debug_smp_processor_id+0x17/0x20 __x64_sys_sendmsg+0x72/0xb0 ? lockdep_hardirqs_on+0x7d/0x100 x64_sys_call+0x894/0x9f0 do_syscall_64+0x64/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f230fe04807 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffe996a7ea8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000556f9f9c3390 RCX: 00007f230fe04807 RDX: 0000000000000000 RSI: 00007ffe996a7ee0 RDI: 0000000000000003 RBP: 0000556f9f9c88c0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000556f965ca190 R11: 0000000000000246 R12: 0000556f9f9c8780 R13: 00007ffe996a7ee0 R14: 0000556f9f9c87d0 R15: 0000556f9f9c88c0 </TASK> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20241007165932.78081-2-kvalo@kernel.org Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rosen Penev <rosenp@gmail.com> Date: Mon Sep 30 11:07:16 2024 -0700 wifi: ath5k: add PCI ID for Arcadyan devices [ Upstream commit f3ced9bb90b0a287a1fa6184d16b0f104a78fa90 ] Arcadyan made routers with this PCI ID containing an AR2417. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240930180716.139894-3-rosenp@gmail.com Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rosen Penev <rosenp@gmail.com> Date: Mon Sep 30 11:07:15 2024 -0700 wifi: ath5k: add PCI ID for SX76X [ Upstream commit da0474012402d4729b98799d71a54c35dc5c5de3 ] This is in two devices made by Gigaset, SX762 and SX763. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240930180716.139894-2-rosenp@gmail.com Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Norbert van Bolhuis <nvbolhuis@gmail.com> Date: Thu Nov 7 14:28:13 2024 +0100 wifi: brcmfmac: Fix oops due to NULL pointer dereference in brcmf_sdiod_sglist_rw() [ Upstream commit 857282b819cbaa0675aaab1e7542e2c0579f52d7 ] This patch fixes a NULL pointer dereference bug in brcmfmac that occurs when a high 'sd_sgentry_align' value applies (e.g. 512) and a lot of queued SKBs are sent from the pkt queue. The problem is the number of entries in the pre-allocated sgtable, it is nents = max(rxglom_size, txglom_size) + max(rxglom_size, txglom_size) >> 4 + 1. Given the default [rt]xglom_size=32 it's actually 35 which is too small. Worst case, the pkt queue can end up with 64 SKBs. This occurs when a new SKB is added for each original SKB if tailroom isn't enough to hold tail_pad. At least one sg entry is needed for each SKB. So, eventually the "skb_queue_walk loop" in brcmf_sdiod_sglist_rw may run out of sg entries. This makes sg_next return NULL and this causes the oops. The patch sets nents to max(rxglom_size, txglom_size) * 2 to be able handle the worst-case. Btw. this requires only 64-35=29 * 16 (or 20 if CONFIG_NEED_SG_DMA_LENGTH) = 464 additional bytes of memory. Signed-off-by: Norbert van Bolhuis <nvbolhuis@gmail.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241107132903.13513-1-nvbolhuis@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Date: Fri Nov 1 14:07:25 2024 +0800 wifi: ipw2x00: libipw_rx_any(): fix bad alignment [ Upstream commit 4fa4f049dc0d9741b16c96bcbf0108c85368a2b9 ] This patch fixes incorrect code alignment. ./drivers/net/wireless/intel/ipw2x00/libipw_rx.c:871:2-3: code aligned with following code on line 882. ./drivers/net/wireless/intel/ipw2x00/libipw_rx.c:886:2-3: code aligned with following code on line 900. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=11381 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241101060725.54640-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ping-Ke Shih <pkshih@realtek.com> Date: Thu Aug 22 09:42:55 2024 +0800 wifi: rtw88: use ieee80211_purge_tx_queue() to purge TX skb [ Upstream commit 3e5e4a801aaf4283390cc34959c6c48f910ca5ea ] When removing kernel modules by: rmmod rtw88_8723cs rtw88_8703b rtw88_8723x rtw88_sdio rtw88_core Driver uses skb_queue_purge() to purge TX skb, but not report tx status causing "Have pending ack frames!" warning. Use ieee80211_purge_tx_queue() to correct this. Since ieee80211_purge_tx_queue() doesn't take locks, to prevent racing between TX work and purge TX queue, flush and destroy TX work in advance. wlan0: deauthenticating from aa:f5:fd:60:4c:a8 by local choice (Reason: 3=DEAUTH_LEAVING) ------------[ cut here ]------------ Have pending ack frames! WARNING: CPU: 3 PID: 9232 at net/mac80211/main.c:1691 ieee80211_free_ack_frame+0x5c/0x90 [mac80211] CPU: 3 PID: 9232 Comm: rmmod Tainted: G C 6.10.1-200.fc40.aarch64 #1 Hardware name: pine64 Pine64 PinePhone Braveheart (1.1)/Pine64 PinePhone Braveheart (1.1), BIOS 2024.01 01/01/2024 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ieee80211_free_ack_frame+0x5c/0x90 [mac80211] lr : ieee80211_free_ack_frame+0x5c/0x90 [mac80211] sp : ffff80008c1b37b0 x29: ffff80008c1b37b0 x28: ffff000003be8000 x27: 0000000000000000 x26: 0000000000000000 x25: ffff000003dc14b8 x24: ffff80008c1b37d0 x23: ffff000000ff9f80 x22: 0000000000000000 x21: 000000007fffffff x20: ffff80007c7e93d8 x19: ffff00006e66f400 x18: 0000000000000000 x17: ffff7ffffd2b3000 x16: ffff800083fc0000 x15: 0000000000000000 x14: 0000000000000000 x13: 2173656d61726620 x12: 6b636120676e6964 x11: 0000000000000000 x10: 000000000000005d x9 : ffff8000802af2b0 x8 : ffff80008c1b3430 x7 : 0000000000000001 x6 : 0000000000000001 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000003be8000 Call trace: ieee80211_free_ack_frame+0x5c/0x90 [mac80211] idr_for_each+0x74/0x110 ieee80211_free_hw+0x44/0xe8 [mac80211] rtw_sdio_remove+0x9c/0xc0 [rtw88_sdio] sdio_bus_remove+0x44/0x180 device_remove+0x54/0x90 device_release_driver_internal+0x1d4/0x238 driver_detach+0x54/0xc0 bus_remove_driver+0x78/0x108 driver_unregister+0x38/0x78 sdio_unregister_driver+0x2c/0x40 rtw_8723cs_driver_exit+0x18/0x1000 [rtw88_8723cs] __do_sys_delete_module.isra.0+0x190/0x338 __arm64_sys_delete_module+0x1c/0x30 invoke_syscall+0x74/0x100 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x158 el0t_64_sync_handler+0x120/0x138 el0t_64_sync+0x194/0x198 ---[ end trace 0000000000000000 ]--- Reported-by: Peter Robinson <pbrobinson@gmail.com> Closes: https://lore.kernel.org/linux-wireless/CALeDE9OAa56KMzgknaCD3quOgYuEHFx9_hcT=OFgmMAb+8MPyA@mail.gmail.com/ Tested-by: Ping-Ke Shih <pkshih@realtek.com> # 8723DU Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20240822014255.10211-2-pkshih@realtek.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ping-Ke Shih <pkshih@realtek.com> Date: Thu Sep 19 16:12:14 2024 +0800 wifi: rtw89: check return value of ieee80211_probereq_get() for RNR [ Upstream commit 630d5d8f2bf6b340202b6bc2c05d794bbd8e4c1c ] The return value of ieee80211_probereq_get() might be NULL, so check it before using to avoid NULL pointer access. Addresses-Coverity-ID: 1529805 ("Dereference null return value") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20240919081216.28505-2-pkshih@realtek.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Date: Wed Nov 27 16:22:47 2024 -0800 x86/cacheinfo: Delete global num_cache_leaves commit 9677be09e5e4fbe48aeccb06ae3063c5eba331c3 upstream. Linux remembers cpu_cachinfo::num_leaves per CPU, but x86 initializes all CPUs from the same global "num_cache_leaves". This is erroneous on systems such as Meteor Lake, where each CPU has a distinct num_leaves value. Delete the global "num_cache_leaves" and initialize num_leaves on each CPU. init_cache_level() no longer needs to set num_leaves. Also, it never had to set num_levels as it is unnecessary in x86. Keep checking for zero cache leaves. Such condition indicates a bug. [ bp: Cleanup. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org # 6.3+ Link: https://lore.kernel.org/r/20241128002247.26726-3-ricardo.neri-calderon@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sean Christopherson <seanjc@google.com> Date: Fri Dec 6 08:20:06 2024 -0800 x86/CPU/AMD: WARN when setting EFER.AUTOIBRS if and only if the WRMSR fails [ Upstream commit 492077668fb453b8b16c842fcf3fafc2ebc190e9 ] When ensuring EFER.AUTOIBRS is set, WARN only on a negative return code from msr_set_bit(), as '1' is used to indicate the WRMSR was successful ('0' indicates the MSR bit was already set). Fixes: 8cc68c9c9e92 ("x86/CPU/AMD: Make sure EFER[AIBRSE] is set") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/Z1MkNofJjt7Oq0G6@google.com Closes: https://lore.kernel.org/all/20241205220604.GA2054199@thelio-3990X Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fernando Fernandez Mancera <ffmancera@riseup.net> Date: Mon Dec 2 14:58:45 2024 +0000 x86/cpu/topology: Remove limit of CPUs due to disabled IO/APIC commit 73da582a476ea6e3512f89f8ed57dfed945829a2 upstream. The rework of possible CPUs management erroneously disabled SMP when the IO/APIC is disabled either by the 'noapic' command line parameter or during IO/APIC setup. SMP is possible without IO/APIC. Remove the ioapic_is_disabled conditions from the relevant possible CPU management code paths to restore the orgininal behaviour. Fixes: 7c0edad3643f ("x86/cpu/topology: Rework possible CPU management") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241202145905.1482-1-ffmancera@riseup.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Len Brown <len.brown@intel.com> Date: Tue Nov 12 21:07:00 2024 -0500 x86/cpu: Add Lunar Lake to list of CPUs with a broken MONITOR implementation commit c9a4b55431e5220347881e148725bed69c84e037 upstream. Under some conditions, MONITOR wakeups on Lunar Lake processors can be lost, resulting in significant user-visible delays. Add Lunar Lake to X86_BUG_MONITOR so that wake_up_idle_cpu() always sends an IPI, avoiding this potential delay. Reported originally here: https://bugzilla.kernel.org/show_bug.cgi?id=219364 [ dhansen: tweak subject ] Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/a4aa8842a3c3bfdb7fe9807710eef159cbf0e705.1731463305.git.len.brown%40intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Woodhouse <dwmw@amazon.co.uk> Date: Thu Dec 5 15:05:07 2024 +0000 x86/kexec: Restore GDT on return from ::preserve_context kexec commit 07fa619f2a40c221ea27747a3323cabc59ab25eb upstream. The restore_processor_state() function explicitly states that "the asm code that gets us here will have restored a usable GDT". That wasn't true in the case of returning from a ::preserve_context kexec. Make it so. Without this, the kernel was depending on the called function to reload a GDT which is appropriate for the kernel before returning. Test program: #include <unistd.h> #include <errno.h> #include <stdio.h> #include <stdlib.h> #include <linux/kexec.h> #include <linux/reboot.h> #include <sys/reboot.h> #include <sys/syscall.h> int main (void) { struct kexec_segment segment = {}; unsigned char purgatory[] = { 0x66, 0xba, 0xf8, 0x03, // mov $0x3f8, %dx 0xb0, 0x42, // mov $0x42, %al 0xee, // outb %al, (%dx) 0xc3, // ret }; int ret; segment.buf = &purgatory; segment.bufsz = sizeof(purgatory); segment.mem = (void *)0x400000; segment.memsz = 0x1000; ret = syscall(__NR_kexec_load, 0x400000, 1, &segment, KEXEC_PRESERVE_CONTEXT); if (ret) { perror("kexec_load"); exit(1); } ret = syscall(__NR_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_KEXEC); if (ret) { perror("kexec reboot"); exit(1); } printf("Success\n"); return 0; } Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241205153343.3275139-2-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Woodhouse <dwmw@amazon.co.uk> Date: Wed Dec 4 11:27:14 2024 +0000 x86/mm: Add _PAGE_NOPTISHADOW bit to avoid updating userspace page tables commit d0ceea662d459726487030237689835fcc0483e5 upstream. The set_p4d() and set_pgd() functions (in 4-level or 5-level page table setups respectively) assume that the root page table is actually a 8KiB allocation, with the userspace root immediately after the kernel root page table (so that the former can enforce NX on on all the subordinate page tables, which are actually shared). However, users of the kernel_ident_mapping_init() code do not give it an 8KiB allocation for its PGD. Both swsusp_arch_resume() and acpi_mp_setup_reset() allocate only a single 4KiB page. The kexec code on x86_64 currently gets away with it purely by chance, because it allocates 8KiB for its "control code page" and then actually uses the first half for the PGD, then copies the actual trampoline code into the second half only after the identmap code has finished scribbling over it. Fix this by defining a _PAGE_NOPTISHADOW bit (which can use the same bit as _PAGE_SAVED_DIRTY since one is only for the PGD/P4D root and the other is exclusively for leaf PTEs.). This instructs __pti_set_user_pgtbl() not to write to the userspace 'shadow' PGD. Strictly, the _PAGE_NOPTISHADOW bit doesn't need to be written out to the actual page tables; since __pti_set_user_pgtbl() returns the value to be written to the kernel page table, it could be filtered out. But there seems to be no benefit to actually doing so. Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/412c90a4df7aef077141d9f68d19cbe5602d6c6d.camel@infradead.org Cc: stable@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Date: Tue Nov 19 17:45:19 2024 +0000 x86/pkeys: Change caller of update_pkru_in_sigframe() [ Upstream commit 6a1853bdf17874392476b552398df261f75503e0 ] update_pkru_in_sigframe() will shortly need some information which is only available inside xsave_to_user_sigframe(). Move update_pkru_in_sigframe() inside the other function to make it easier to provide it that information. No functional changes. Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241119174520.3987538-2-aruna.ramakrishna%40oracle.com Stable-dep-of: ae6012d72fa6 ("x86/pkeys: Ensure updated PKRU value is XRSTOR'd") Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Date: Tue Nov 19 17:45:20 2024 +0000 x86/pkeys: Ensure updated PKRU value is XRSTOR'd [ Upstream commit ae6012d72fa60c9ff92de5bac7a8021a47458e5b ] When XSTATE_BV[i] is 0, and XRSTOR attempts to restore state component 'i' it ignores any value in the XSAVE buffer and instead restores the state component's init value. This means that if XSAVE writes XSTATE_BV[PKRU]=0 then XRSTOR will ignore the value that update_pkru_in_sigframe() writes to the XSAVE buffer. XSTATE_BV[PKRU] only gets written as 0 if PKRU is in its init state. On Intel CPUs, basically never happens because the kernel usually overwrites the init value (aside: this is why we didn't notice this bug until now). But on AMD, the init tracker is more aggressive and will track PKRU as being in its init state upon any wrpkru(0x0). Unfortunately, sig_prepare_pkru() does just that: wrpkru(0x0). This writes XSTATE_BV[PKRU]=0 which makes XRSTOR ignore the PKRU value in the sigframe. To fix this, always overwrite the sigframe XSTATE_BV with a value that has XSTATE_BV[PKRU]==1. This ensures that XRSTOR will not ignore what update_pkru_in_sigframe() wrote. The problematic sequence of events is something like this: Userspace does: * wrpkru(0xffff0000) (or whatever) * Hardware sets: XINUSE[PKRU]=1 Signal happens, kernel is entered: * sig_prepare_pkru() => wrpkru(0x00000000) * Hardware sets: XINUSE[PKRU]=0 (aggressive AMD init tracker) * XSAVE writes most of XSAVE buffer, including XSTATE_BV[PKRU]=XINUSE[PKRU]=0 * update_pkru_in_sigframe() overwrites PKRU in XSAVE buffer ... signal handling * XRSTOR sees XSTATE_BV[PKRU]==0, ignores just-written value from update_pkru_in_sigframe() Fixes: 70044df250d0 ("x86/pkeys: Update PKRU to enable all pkeys before XSAVE") Suggested-by: Rudi Horn <rudi.horn@oracle.com> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241119174520.3987538-3-aruna.ramakrishna%40oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Damien Le Moal <dlemoal@kernel.org> Date: Mon Dec 9 08:53:32 2024 +0900 x86: Fix build regression with CONFIG_KEXEC_JUMP enabled [ Upstream commit aeb68937614f4aeceaaa762bd7f0212ce842b797 ] Build 6.13-rc12 for x86_64 with gcc 14.2.1 fails with the error: ld: vmlinux.o: in function `virtual_mapped': linux/arch/x86/kernel/relocate_kernel_64.S:249:(.text+0x5915b): undefined reference to `saved_context_gdt_desc' when CONFIG_KEXEC_JUMP is enabled. This was introduced by commit 07fa619f2a40 ("x86/kexec: Restore GDT on return from ::preserve_context kexec") which introduced a use of saved_context_gdt_desc without a declaration for it. Fix that by including asm/asm-offsets.h where saved_context_gdt_desc is defined (indirectly in include/generated/asm-offsets.h which asm/asm-offsets.h includes). Fixes: 07fa619f2a40 ("x86/kexec: Restore GDT on return from ::preserve_context kexec") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Closes: https://lore.kernel.org/oe-kbuild-all/202411270006.ZyyzpYf8-lkp@intel.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Larysa Zaremba <larysa.zaremba@intel.com> Date: Fri Nov 22 12:29:09 2024 +0100 xsk: always clear DMA mapping information when unmapping the pool [ Upstream commit ac9a48a6f1610b094072b815e884e1668aea4401 ] When the umem is shared, the DMA mapping is also shared between the xsk pools, therefore it should stay valid as long as at least 1 user remains. However, the pool also keeps the copies of DMA-related information that are initialized in the same way in xp_init_dma_info(), but cleared by xp_dma_unmap() only for the last remaining pool, this causes the problems below. The first one is that the commit adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap") relies on pool->dev to determine the presence of a ZC pool on a given queue, avoiding internal bookkeeping. This works perfectly fine if the UMEM is not shared, but reliably fails otherwise as stated in the linked report. The second one is pool->dma_pages which is dynamically allocated and only freed in xp_dma_unmap(), this leads to a small memory leak. kmemleak does not catch it, but by printing the allocation results after terminating the userspace program it is possible to see that all addresses except the one belonging to the last detached pool are still accessible through the kmemleak dump functionality. Always clear the DMA mapping information from the pool and free pool->dma_pages when unmapping the pool, so that the only difference between results of the last remaining user's call and the ones before would be the destruction of the DMA mapping. Fixes: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap") Fixes: 921b68692abb ("xsk: Enable sharing of dma mappings") Reported-by: Alasdair McWilliam <alasdair.mcwilliam@outlook.com> Closes: https://lore.kernel.org/PA4P194MB10056F208AF221D043F57A3D86512@PA4P194MB1005.EURP194.PROD.OUTLOOK.COM Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20241122112912.89881-1-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Date: Fri Nov 22 13:10:29 2024 +0100 xsk: fix OOB map writes when deleting elements commit 32cd3db7de97c0c7a018756ce66244342fd583f0 upstream. Jordy says: " In the xsk_map_delete_elem function an unsigned integer (map->max_entries) is compared with a user-controlled signed integer (k). Due to implicit type conversion, a large unsigned value for map->max_entries can bypass the intended bounds check: if (k >= map->max_entries) return -EINVAL; This allows k to hold a negative value (between -2147483648 and -2), which is then used as an array index in m->xsk_map[k], which results in an out-of-bounds access. spin_lock_bh(&m->lock); map_entry = &m->xsk_map[k]; // Out-of-bounds map_entry old_xs = unrcu_pointer(xchg(map_entry, NULL)); // Oob write if (old_xs) xsk_map_sock_delete(old_xs, map_entry); spin_unlock_bh(&m->lock); The xchg operation can then be used to cause an out-of-bounds write. Moreover, the invalid map_entry passed to xsk_map_sock_delete can lead to further memory corruption. " It indeed results in following splat: [76612.897343] BUG: unable to handle page fault for address: ffffc8fc2e461108 [76612.904330] #PF: supervisor write access in kernel mode [76612.909639] #PF: error_code(0x0002) - not-present page [76612.914855] PGD 0 P4D 0 [76612.917431] Oops: Oops: 0002 [#1] PREEMPT SMP [76612.921859] CPU: 11 UID: 0 PID: 10318 Comm: a.out Not tainted 6.12.0-rc1+ #470 [76612.929189] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [76612.939781] RIP: 0010:xsk_map_delete_elem+0x2d/0x60 [76612.944738] Code: 00 00 41 54 55 53 48 63 2e 3b 6f 24 73 38 4c 8d a7 f8 00 00 00 48 89 fb 4c 89 e7 e8 2d bf 05 00 48 8d b4 eb 00 01 00 00 31 ff <48> 87 3e 48 85 ff 74 05 e8 16 ff ff ff 4c 89 e7 e8 3e bc 05 00 31 [76612.963774] RSP: 0018:ffffc9002e407df8 EFLAGS: 00010246 [76612.969079] RAX: 0000000000000000 RBX: ffffc9002e461000 RCX: 0000000000000000 [76612.976323] RDX: 0000000000000001 RSI: ffffc8fc2e461108 RDI: 0000000000000000 [76612.983569] RBP: ffffffff80000001 R08: 0000000000000000 R09: 0000000000000007 [76612.990812] R10: ffffc9002e407e18 R11: ffff888108a38858 R12: ffffc9002e4610f8 [76612.998060] R13: ffff888108a38858 R14: 00007ffd1ae0ac78 R15: ffffc9002e4610c0 [76613.005303] FS: 00007f80b6f59740(0000) GS:ffff8897e0ec0000(0000) knlGS:0000000000000000 [76613.013517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [76613.019349] CR2: ffffc8fc2e461108 CR3: 000000011e3ef001 CR4: 00000000007726f0 [76613.026595] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [76613.033841] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [76613.041086] PKRU: 55555554 [76613.043842] Call Trace: [76613.046331] <TASK> [76613.048468] ? __die+0x20/0x60 [76613.051581] ? page_fault_oops+0x15a/0x450 [76613.055747] ? search_extable+0x22/0x30 [76613.059649] ? search_bpf_extables+0x5f/0x80 [76613.063988] ? exc_page_fault+0xa9/0x140 [76613.067975] ? asm_exc_page_fault+0x22/0x30 [76613.072229] ? xsk_map_delete_elem+0x2d/0x60 [76613.076573] ? xsk_map_delete_elem+0x23/0x60 [76613.080914] __sys_bpf+0x19b7/0x23c0 [76613.084555] __x64_sys_bpf+0x1a/0x20 [76613.088194] do_syscall_64+0x37/0xb0 [76613.091832] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [76613.096962] RIP: 0033:0x7f80b6d1e88d [76613.100592] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48 [76613.119631] RSP: 002b:00007ffd1ae0ac68 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [76613.131330] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f80b6d1e88d [76613.142632] RDX: 0000000000000098 RSI: 00007ffd1ae0ad20 RDI: 0000000000000003 [76613.153967] RBP: 00007ffd1ae0adc0 R08: 0000000000000000 R09: 0000000000000000 [76613.166030] R10: 00007f80b6f77040 R11: 0000000000000206 R12: 00007ffd1ae0aed8 [76613.177130] R13: 000055ddf42ce1e9 R14: 000055ddf42d0d98 R15: 00007f80b6fab040 [76613.188129] </TASK> Fix this by simply changing key type from int to u32. Fixes: fbfc504a24f5 ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP") CC: stable@vger.kernel.org Reported-by: Jordy Zomer <jordyzomer@google.com> Suggested-by: Jordy Zomer <jordyzomer@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20241122121030.716788-2-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sergey Senozhatsky <senozhatsky@chromium.org> Date: Tue Oct 29 00:36:15 2024 +0900 zram: clear IDLE flag in mark_idle() [ Upstream commit d37da422edb0664a2037e6d7d42fe6d339aae78a ] If entry does not fulfill current mark_idle() parameters, e.g. cutoff time, then we should clear its ZRAM_IDLE from previous mark_idle() invocations. Consider the following case: - mark_idle() cutoff time 8h - mark_idle() cutoff time 4h - writeback() idle - will writeback entries with cutoff time 8h, while it should only pick entries with cutoff time 4h The bug was reported by Shin Kawamura. Link: https://lkml.kernel.org/r/20241028153629.1479791-3-senozhatsky@chromium.org Fixes: 755804d16965 ("zram: introduce an aged idle interface") Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reported-by: Shin Kawamura <kawasin@google.com> Acked-by: Brian Geffon <bgeffon@google.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sergey Senozhatsky <senozhatsky@chromium.org> Date: Tue Sep 17 11:09:10 2024 +0900 zram: do not mark idle slots that cannot be idle [ Upstream commit b967fa1ba72b5da2b6d9bf95f0b13420a59e0701 ] ZRAM_SAME slots cannot be post-processed (writeback or recompress) so do not mark them ZRAM_IDLE. Same with ZRAM_WB slots, they cannot be ZRAM_IDLE because they are not in zsmalloc pool anymore. Link: https://lkml.kernel.org/r/20240917021020.883356-6-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: d37da422edb0 ("zram: clear IDLE flag in mark_idle()") Signed-off-by: Sasha Levin <sashal@kernel.org>