Changelog in Linux kernel 6.6.63

ALSA: hda/realtek - Fixed Clevo platform headset Mic issue [+ + +]

Author: Kailang Yang <kailang@realtek.com>
Date:   Fri Oct 25 16:37:57 2024 +0800

    ALSA: hda/realtek - Fixed Clevo platform headset Mic issue
    
    commit 42ee87df8530150d637aa48363b72b22a9bbd78f upstream.
    
    The headset mic on Clevo platforms with ALC255 was disabled by default.
    Assigning a verb table for the mic pin enables it.
    
    Signed-off-by: Kailang Yang <kailang@realtek.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/b2dcac3e09ef4f82b36d6712194e1ea4@realtek.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: fix mute/micmute LEDs for a HP EliteBook 645 G10 [+ + +]

Author: Maksym Glubokiy <maxgl.kernel@gmail.com>
Date:   Tue Nov 12 17:48:15 2024 +0200

    ALSA: hda/realtek: fix mute/micmute LEDs for a HP EliteBook 645 G10
    
    commit 96409eeab8cdd394e03ec494ea9547edc27f7ab4 upstream.
    
    HP EliteBook 645 G10 uses the ALC236 codec and needs the
    ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk to make the mute LED and
    micmute LED work.
    
    Signed-off-by: Maksym Glubokiy <maxgl.kernel@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://patch.msgid.link/20241112154815.10888-1-maxgl.kernel@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: 9419/1: mm: Fix kernel memory mapping for xip kernels [+ + +]

Author: Harith G <harith.g@alifsemi.com>
Date:   Wed Sep 18 06:57:11 2024 +0100

    ARM: 9419/1: mm: Fix kernel memory mapping for xip kernels
    
    [ Upstream commit ed6cbe6e5563452f305e89c15846820f2874e431 ]
    
    The patchset introducing kernel_sec_start/end variables to separate the
    kernel/lowmem memory mappings broke the mapping of the kernel memory
    for XIP kernels.
    
    On XIP kernels, the kernel_sec_start/end variables are in a read-only
    area before the MMU is switched on, so they cannot be set early in boot
    in head.S. Fix this by setting them after the MMU is switched on.
    XIP kernels need two different mappings for kernel text (starting at
    CONFIG_XIP_PHYS_ADDR) and data (starting at CONFIG_PHYS_OFFSET).
    Also, move the kernel code mapping from devicemaps_init() to map_kernel().
    
    Fixes: a91da5457085 ("ARM: 9089/1: Define kernel physical section start and end")
    Signed-off-by: Harith George <harith.g@alifsemi.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: btintel: Direct exception event to bluetooth stack [+ + +]

Author: Kiran K <kiran.k@intel.com>
Date:   Tue Oct 22 14:41:34 2024 +0530

    Bluetooth: btintel: Direct exception event to bluetooth stack
    
    [ Upstream commit d5359a7f583ab9b7706915213b54deac065bcb81 ]
    
    Make exception events part of the HCI traces, which helps debugging.
    
    snoop traces:
    > HCI Event: Vendor (0xff) plen 79
            Vendor Prefix (0x8780)
          Intel Extended Telemetry (0x03)
            Unknown extended telemetry event type (0xde)
            01 01 de
            Unknown extended subevent 0x07
            01 01 de 07 01 de 06 1c ef be ad de ef be ad de
            ef be ad de ef be ad de ef be ad de ef be ad de
            ef be ad de 05 14 ef be ad de ef be ad de ef be
            ad de ef be ad de ef be ad de 43 10 ef be ad de
            ef be ad de ef be ad de ef be ad de
    
    Fixes: af395330abed ("Bluetooth: btintel: Add Intel devcoredump support")
    Signed-off-by: Kiran K <kiran.k@intel.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_core: Fix calling mgmt_device_connected [+ + +]

Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Fri Nov 8 11:19:54 2024 -0500

    Bluetooth: hci_core: Fix calling mgmt_device_connected
    
    [ Upstream commit 7967dc8f797f454d4f4acec15c7df0cdf4801617 ]
    
    Since 61a939c68ee0 ("Bluetooth: Queue incoming ACL data until
    BT_CONNECTED state is reached") there is no longer a need to call
    mgmt_device_connected as ACL data will be queued until BT_CONNECTED
    state.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219458
    Link: https://github.com/bluez/bluez/issues/1014
    Fixes: 333b4fd11e89 ("Bluetooth: L2CAP: Fix uaf in l2cap_connect")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bonding: add ns target multicast address to slave device [+ + +]

Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Mon Nov 11 10:16:49 2024 +0000

    bonding: add ns target multicast address to slave device
    
    [ Upstream commit 8eb36164d1a6769a20ed43033510067ff3dab9ee ]
    
    Commit 4598380f9c54 ("bonding: fix ns validation on backup slaves")
    tried to resolve the issue where backup slaves couldn't be brought up when
    receiving IPv6 Neighbor Solicitation (NS) messages. However, this fix only
    worked for drivers that receive all multicast messages, such as the veth
    interface.
    
    For standard drivers, the NS multicast message is silently dropped because
    the slave device is not a member of the NS target multicast group.
    
    To address this, we need to make the slave device join the NS target
    multicast group, ensuring it can receive these IPv6 NS messages to validate
    the slave’s status properly.
    
    There are three policies before joining the multicast group:
    1. All settings must be under active-backup mode (alb and tlb do not support
       arp_validate), with backup slaves and slaves supporting multicast.
    2. We can add or remove multicast groups when arp_validate changes.
    3. Other operations, such as enslaving, releasing, or setting NS targets,
       need to be guarded by arp_validate.
    
    Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets")
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Adjust VSDB parser for replay feature [+ + +]

Author: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Date:   Tue Nov 5 08:40:23 2024 -0700

    drm/amd/display: Adjust VSDB parser for replay feature
    
    commit 16dd2825c23530f2259fc671960a3a65d2af69bd upstream.
    
    At some point, the IEEE ID identification for the replay check in the
    AMD EDID was added. However, this check causes the following
    out-of-bounds issues when using KASAN:
    
    [   27.804016] BUG: KASAN: slab-out-of-bounds in amdgpu_dm_update_freesync_caps+0xefa/0x17a0 [amdgpu]
    [   27.804788] Read of size 1 at addr ffff8881647fdb00 by task systemd-udevd/383
    
    ...
    
    [   27.821207] Memory state around the buggy address:
    [   27.821215]  ffff8881647fda00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   27.821224]  ffff8881647fda80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   27.821234] >ffff8881647fdb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   27.821243]                    ^
    [   27.821250]  ffff8881647fdb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   27.821259]  ffff8881647fdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   27.821268] ==================================================================
    
    This happens because the ID extraction reads outside the range of
    the EDID length. This commit addresses this issue by considering the
    amd_vsdb_block size.
    
    Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com>
    Reviewed-by: Leo Li <sunpeng.li@amd.com>
    Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
    Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit b7e381b1ccd5e778e3d9c44c669ad38439a861d8)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
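
The bounds problem described in the VSDB parser entry above can be illustrated with a minimal C sketch. It is not the amdgpu code: the function names, the buffer layout, and the assumption that the ID is read as three little-endian bytes are all hypothetical. It only shows the general idea of checking a block's size against the buffer length before reading from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true only if a block of block_size bytes starting at offset lies
 * entirely inside a buffer of buf_len bytes. Written so that the check
 * itself cannot overflow (no offset + block_size addition).
 */
bool block_in_bounds(size_t offset, size_t block_size, size_t buf_len)
{
    return offset <= buf_len && block_size <= buf_len - offset;
}

/*
 * Hypothetical helper: read a 3-byte, least-significant-byte-first ID at
 * offset, but only after confirming that all three bytes are in range.
 */
bool read_ieee_id(const uint8_t *buf, size_t buf_len, size_t offset,
                  uint32_t *id)
{
    assert(buf != NULL && id != NULL);

    if (!block_in_bounds(offset, 3, buf_len))
        return false;

    *id = (uint32_t)buf[offset] |
          ((uint32_t)buf[offset + 1] << 8) |
          ((uint32_t)buf[offset + 2] << 16);
    return true;
}
```

A caller would treat a false return as "no such block here" and move on, rather than reading past the end of the buffer.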

drm/amd/pm: Vangogh: Fix kernel memory out of bounds write [+ + +]

Author: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Date:   Fri Oct 25 15:56:39 2024 +0100

    drm/amd/pm: Vangogh: Fix kernel memory out of bounds write
    
    commit 4aa923a6e6406b43566ef6ac35a3d9a3197fa3e8 upstream.
    
    KASAN reports that the GPU metrics table allocated in
    vangogh_tables_init() is not large enough for the memset done in
    smu_cmn_init_soft_gpu_metrics(). Condensed report follows:
    
    [   33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
    [   33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
    ...
    [   33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G        W          6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
    [   33.861816] Tainted: [W]=WARN
    [   33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
    [   33.861822] Call Trace:
    [   33.861826]  <TASK>
    [   33.861829]  dump_stack_lvl+0x66/0x90
    [   33.861838]  print_report+0xce/0x620
    [   33.861853]  kasan_report+0xda/0x110
    [   33.862794]  kasan_check_range+0xfd/0x1a0
    [   33.862799]  __asan_memset+0x23/0x40
    [   33.862803]  smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [   33.863306]  vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [   33.864257]  vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [   33.865682]  amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [   33.866160]  amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [   33.867135]  dev_attr_show+0x43/0xc0
    [   33.867147]  sysfs_kf_seq_show+0x1f1/0x3b0
    [   33.867155]  seq_read_iter+0x3f8/0x1140
    [   33.867173]  vfs_read+0x76c/0xc50
    [   33.867198]  ksys_read+0xfb/0x1d0
    [   33.867214]  do_syscall_64+0x90/0x160
    ...
    [   33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
    [   33.867358]  kasan_save_stack+0x33/0x50
    [   33.867364]  kasan_save_track+0x17/0x60
    [   33.867367]  __kasan_kmalloc+0x87/0x90
    [   33.867371]  vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
    [   33.867835]  smu_sw_init+0xa32/0x1850 [amdgpu]
    [   33.868299]  amdgpu_device_init+0x467b/0x8d90 [amdgpu]
    [   33.868733]  amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
    [   33.869167]  amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
    [   33.869608]  local_pci_probe+0xda/0x180
    [   33.869614]  pci_device_probe+0x43f/0x6b0
    
    Empirically we can confirm that the former allocates 152 bytes for the
    table, while the latter memsets a 168-byte block.
    
    The root cause appears to be that when GPU metrics tables for v2_4 parts
    were added, the table was not enlarged to fit.
    
    The fix in this patch is rather "brute force" and perhaps later should be
    done in a smarter way, by extracting and consolidating the part version to
    size logic to a common helper, instead of brute forcing the largest
    possible allocation. Nevertheless, for now this works and fixes the out of
    bounds write.
    
    v2:
     * Drop impossible v3_0 case. (Mario)
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    Fixes: 41cec40bc9ba ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Evan Quan <evan.quan@amd.com>
    Cc: Wenyou Yang <WenYou.Yang@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 0880f58f9609f0200483a49429af0f050d281703)
    Cc: stable@vger.kernel.org # v6.6+
    Signed-off-by: Bin Lan <bin.lan.cn@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
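
The "allocate the largest possible table" approach described in the Vangogh entry above can be sketched in plain C. The struct names below are hypothetical stand-ins sized to match the 152 and 168 bytes quoted in the report; they are not the real gpu_metrics structures, and calloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins with the sizes quoted in the KASAN report. */
struct metrics_old_sketch  { unsigned char raw[152]; };
struct metrics_v2_4_sketch { unsigned char raw[168]; };

#define MAX_SIZE(a, b) ((a) > (b) ? (a) : (b))

/* Size the table for the largest layout that may ever be written to it. */
size_t metrics_table_size(void)
{
    return MAX_SIZE(sizeof(struct metrics_old_sketch),
                    sizeof(struct metrics_v2_4_sketch));
}

/* Allocate a zeroed table big enough for any of the layouts above. */
void *alloc_metrics_table(void)
{
    return calloc(1, metrics_table_size());
}

/* Clear the table as a v2_4 layout; the size check guards the memset. */
void init_metrics_v2_4(void *table, size_t table_size)
{
    assert(table != NULL);
    assert(table_size >= sizeof(struct metrics_v2_4_sketch));
    memset(table, 0, sizeof(struct metrics_v2_4_sketch));
}
```

Sizing for the maximum wastes a few bytes on parts that use the smaller layout, which matches the commit's own description of the fix as "brute force".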

drm/amd: Fix initialization mistake for NBIO 7.7.0 [+ + +]

Author: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Date:   Tue Nov 12 10:11:42 2024 -0600

    drm/amd: Fix initialization mistake for NBIO 7.7.0
    
    commit 7013a8268d311fded6c7a6528fc1de82668e75f6 upstream.
    
    There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME
    events while in the D0 state.
    
    Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Acked-by: Alex Deucher <alexander.deucher@amd.com>
    Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    (cherry picked from commit 447a54a0f79c9a409ceaa17804bdd2e0206397b9)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/bridge: tc358768: Fix DSI command tx [+ + +]

Author: Francesco Dolcini <francesco.dolcini@toradex.com>
Date:   Thu Sep 26 16:12:46 2024 +0200

    drm/bridge: tc358768: Fix DSI command tx
    
    commit 32c4514455b2b8fde506f8c0962f15c7e4c26f1d upstream.
    
    Wait for the command transmission to be completed in the DSI transfer
    function polling for the dc_start bit to go back to idle state after the
    transmission is started.
    
    This is documented in the datasheet, and failure to do so leads to
    command corruption.
    
    Fixes: ff1ca6397b1d ("drm/bridge: Add tc358768 driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
    Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
    Link: https://lore.kernel.org/r/20240926141246.48282-1-francesco@dolcini.it
    Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240926141246.48282-1-francesco@dolcini.it
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
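
The polling pattern described in the tc358768 entry above can be sketched as a bounded loop. Everything here is an assumption for illustration: the bit position, the callback-based register access, and the simulated register are hypothetical, and the real driver may use a different kernel helper with a time-based timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DC_START_BIT (1u << 0)  /* hypothetical bit position */

/* Register read callback, so the sketch can run without hardware. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until the start bit returns to idle (0). Returns 0 on success,
 * -1 if the bit is still set after max_polls reads.
 */
int wait_dc_start_idle(read_reg_fn read_reg, void *ctx, unsigned int max_polls)
{
    assert(read_reg != NULL);

    for (unsigned int i = 0; i < max_polls; i++) {
        if (!(read_reg(ctx) & DC_START_BIT))
            return 0;
    }
    return -1;
}

/*
 * Simulated register for demonstration: ctx points to a counter of how
 * many more reads should still report "busy".
 */
uint32_t fake_reg_read(void *ctx)
{
    unsigned int *busy_reads = ctx;

    if (*busy_reads > 0) {
        (*busy_reads)--;
        return DC_START_BIT;
    }
    return 0;
}
```

A transfer function following this pattern would start the transmission, call the wait helper, and report an error instead of issuing the next command if the wait fails.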

drm/rockchip: vop: Fix a dereferenced before check warning [+ + +]

Author: Andy Yan <andy.yan@rock-chips.com>
Date:   Mon Oct 21 15:28:06 2024 +0800

    drm/rockchip: vop: Fix a dereferenced before check warning
    
    [ Upstream commit ab1c793f457f740ab7108cc0b1340a402dbf484d ]
    
    'state' can't be NULL; we should check crtc_state instead.
    
    Fix warning:
    drivers/gpu/drm/rockchip/rockchip_drm_vop.c:1096
    vop_plane_atomic_async_check() warn: variable dereferenced before check
    'state' (see line 1077)
    
    Fixes: 5ddb0bd4ddc3 ("drm/atomic: Pass the full state to planes async atomic check and update")
    Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
    Signed-off-by: Heiko Stuebner <heiko@sntech.de>
    Link: https://patchwork.freedesktop.org/patch/msgid/20241021072818.61621-1-andyshrk@163.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fs/9p: fix uninitialized values during inode evict [+ + +]

Author: Eric Van Hensbergen <ericvh@kernel.org>
Date:   Tue Nov 19 11:43:17 2024 +0800

    fs/9p: fix uninitialized values during inode evict
    
    [ Upstream commit 6630036b7c228f57c7893ee0403e92c2db2cd21d ]
    
    If an iget fails due to not being able to retrieve information
    from the server then the inode structure is only partially
    initialized.  When the inode gets evicted, references to
    uninitialized structures (like fscache cookies) were being
    made.
    
    This patch checks for a bad_inode before doing anything other
    than clearing the inode from the cache.  Since the inode is
    bad, it shouldn't have any state associated with it that needs
    to be written back (and there really isn't a way to complete
    those anyways).
    
    Reported-by: syzbot+eb83fe1cce5833cd66a0@syzkaller.appspotmail.com
    Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    [Xiangyu: CVE-2024-36923 Minor conflict resolution ]
    Signed-off-by: Xiangyu Chen <xiangyu.chen@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ima: fix buffer overrun in ima_eventdigest_init_common [+ + +]

Author: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Date:   Wed Aug 7 10:27:13 2024 -0700

    ima: fix buffer overrun in ima_eventdigest_init_common
    
    commit 923168a0631bc42fffd55087b337b1b6c54dcff5 upstream.
    
    Function ima_eventdigest_init() calls ima_eventdigest_init_common()
    with HASH_ALGO__LAST which is then used to access the array
    hash_digest_size[] leading to buffer overrun. Have a conditional
    statement to handle this.
    
    Fixes: 9fab303a2cb3 ("ima: fix violation measurement list record")
    Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
    Tested-by: Enrico Bravi (PhD at polito.it) <enrico.bravi@huawei.com>
    Cc: stable@vger.kernel.org # 5.19+
    Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
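
The guard described in the ima entry above can be sketched as follows. The enum is a shortened, hypothetical stand-in for the kernel's hash algorithm list, and the fallback size is an assumption of this sketch, not necessarily the value the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

/* Shortened, hypothetical stand-in for the kernel's hash algorithm enum. */
enum hash_algo {
    HASH_ALGO_MD5,
    HASH_ALGO_SHA1,
    HASH_ALGO_SHA256,
    HASH_ALGO__LAST     /* sentinel: one past the last valid index */
};

const size_t hash_digest_size[HASH_ALGO__LAST] = {
    [HASH_ALGO_MD5]    = 16,
    [HASH_ALGO_SHA1]   = 20,
    [HASH_ALGO_SHA256] = 32,
};

/* Fallback for the sentinel case; the value is an assumption. */
#define FALLBACK_DIGEST_SIZE 20

/* Return a digest size without ever indexing past the end of the table. */
size_t digest_size_for(enum hash_algo algo)
{
    assert(algo <= HASH_ALGO__LAST);

    if (algo < HASH_ALGO__LAST)
        return hash_digest_size[algo];
    return FALLBACK_DIGEST_SIZE;
}
```

The key point is that HASH_ALGO__LAST equals the array length, so using it as an index reads one element past the end unless it is handled separately.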

KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Thu Oct 31 13:20:11 2024 -0700

    KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled
    
    commit 2657b82a78f18528bef56dc1b017158490970873 upstream.
    
    When getting the current VPID, e.g. to emulate a guest TLB flush, return
    vpid01 if L2 is running but with VPID disabled, i.e. if VPID is disabled
    in vmcs12.  Architecturally, if VPID is disabled, then the guest and host
    effectively share VPID=0.  KVM emulates this behavior by using vpid01 when
    running an L2 with VPID disabled (see prepare_vmcs02_early_rare()), and so
    KVM must also treat vpid01 as the current VPID while L2 is active.
    
    Unconditionally treating vpid02 as the current VPID when L2 is active
    causes KVM to flush TLB entries for vpid02 instead of vpid01, which
    results in TLB entries from L1 being incorrectly preserved across nested
    VM-Enter to L2 (L2=>L1 isn't problematic, because the TLB flush after
    nested VM-Exit flushes vpid01).
    
    The bug manifests as failures in the vmx_apicv_test KVM-Unit-Test, as KVM
    incorrectly retains TLB entries for the APIC-access page across a nested
    VM-Enter.
    
    Opportunistically add comments at various touchpoints to explain the
    architectural requirements, and also why KVM uses vpid01 instead of vpid02.
    
    All credit goes to Chao, who root caused the issue and identified the fix.
    
    Link: https://lore.kernel.org/all/ZwzczkIlYGX+QXJz@intel.com
    Fixes: 2b4a5a5d5688 ("KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST")
    Cc: stable@vger.kernel.org
    Cc: Like Xu <like.xu.linux@gmail.com>
    Debugged-by: Chao Gao <chao.gao@intel.com>
    Reviewed-by: Chao Gao <chao.gao@intel.com>
    Tested-by: Chao Gao <chao.gao@intel.com>
    Link: https://lore.kernel.org/r/20241031202011.1580522-1-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
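
The selection rule described in the nVMX entry above can be sketched with a simplified, hypothetical vCPU structure. The field names are illustrative and do not match KVM's data structures, and the check for a non-zero vpid02 is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified vCPU state; not KVM's real structures. */
struct vcpu_sketch {
    bool l2_active;        /* L2 is currently running */
    bool vmcs12_has_vpid;  /* L1 enabled VPID for L2 in vmcs12 */
    uint16_t vpid01;       /* VPID used while running L1 */
    uint16_t vpid02;       /* VPID set aside for L2; 0 if none (assumed) */
};

/*
 * Pick the VPID whose TLB entries are "current". vpid02 is only current
 * when L2 is active AND L1 enabled VPID for it; otherwise L2 runs on
 * vpid01, so vpid01 is the one that must be flushed.
 */
uint16_t current_vpid(const struct vcpu_sketch *v)
{
    assert(v != NULL);

    if (v->l2_active && v->vmcs12_has_vpid && v->vpid02 != 0)
        return v->vpid02;
    return v->vpid01;
}
```

The buggy behavior corresponds to dropping the vmcs12_has_vpid condition, which selects vpid02 even though L2 is actually running with vpid01.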

KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Nov 1 11:50:30 2024 -0700

    KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN
    
    commit aa0d42cacf093a6fcca872edc954f6f812926a17 upstream.
    
    Hide KVM's pt_mode module param behind CONFIG_BROKEN, i.e. disable support
    for virtualizing Intel PT via guest/host mode unless BROKEN=y.  There are
    myriad bugs in the implementation, some of which are fatal to the guest,
    and others which put the stability and health of the host at risk.
    
    For guest fatalities, the most glaring issue is that KVM fails to ensure
    tracing is disabled, and *stays* disabled prior to VM-Enter, which is
    necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing
    is enabled (enforced via a VMX consistency check).  Per the SDM:
    
      If the logical processor is operating with Intel PT enabled (if
      IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load
      IA32_RTIT_CTL" VM-entry control must be 0.
    
    On the host side, KVM doesn't validate the guest CPUID configuration
    provided by userspace, and even worse, uses the guest configuration to
    decide what MSRs to save/load at VM-Enter and VM-Exit.  E.g. configuring
    guest CPUID to enumerate more address ranges than are supported in hardware
    will result in KVM trying to passthrough, save, and load non-existent MSRs,
    which generates a variety of WARNs, ToPA ERRORs in the host, a potential
    deadlock, etc.
    
    Fixes: f99e3daf94ff ("KVM: x86: Add Intel PT virtualization work mode")
    Cc: stable@vger.kernel.org
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
    Tested-by: Adrian Hunter <adrian.hunter@intel.com>
    Message-ID: <20241101185031.1799556-2-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: x86: Unconditionally set irr_pending when updating APICv state [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Tue Nov 5 17:51:35 2024 -0800

    KVM: x86: Unconditionally set irr_pending when updating APICv state
    
    commit d3ddef46f22e8c3124e0df1f325bc6a18dadff39 upstream.
    
    Always set irr_pending (to true) when updating APICv status to fix a bug
    where KVM fails to set irr_pending when userspace sets APIC state and
    APICv is disabled, which ultimately results in KVM failing to inject the
    pending interrupt(s) that userspace stuffed into the vIRR, until another
    interrupt happens to be emulated by KVM.
    
    Only the APICv-disabled case is flawed, as KVM forces apic->irr_pending to
    be true if APICv is enabled, because not all vIRR updates will be visible
    to KVM.
    
    Hit the bug with a big hammer, even though strictly speaking KVM can scan
    the vIRR and set/clear irr_pending as appropriate for this specific case.
    The bug was introduced by commit 755c2bf87860 ("KVM: x86: lapic: don't
    touch irr_pending in kvm_apic_update_apicv when inhibiting it"), which as
    the shortlog suggests, deleted code that updated irr_pending.
    
    Before that commit, kvm_apic_update_apicv() did indeed scan the vIRR,
    with the crucial difference that kvm_apic_update_apicv() did the scan even
    when APICv was being *disabled*, e.g. due to an AVIC inhibition.
    
            struct kvm_lapic *apic = vcpu->arch.apic;
    
            if (vcpu->arch.apicv_active) {
                    /* irr_pending is always true when apicv is activated. */
                    apic->irr_pending = true;
                    apic->isr_count = 1;
            } else {
                    apic->irr_pending = (apic_search_irr(apic) != -1);
                    apic->isr_count = count_vectors(apic->regs + APIC_ISR);
            }
    
    And _that_ bug (clearing irr_pending) was introduced by commit b26a695a1d78
    ("kvm: lapic: Introduce APICv update helper function"), prior to which KVM
    unconditionally set irr_pending to true in kvm_apic_set_state(), i.e.
    assumed that the new virtual APIC state could have a pending IRQ.
    
    Furthermore, in addition to introducing this issue, commit 755c2bf87860
    also papered over the underlying bug: KVM doesn't ensure CPUs and devices
    see APICv as disabled prior to searching the IRR.  Waiting until KVM
    emulates an EOI to update irr_pending "works", but only because KVM won't
    emulate EOI until after refresh_apicv_exec_ctrl(), and there are plenty of
    memory barriers in between.  I.e. leaving irr_pending set is basically
    hacking around bad ordering.
    
    So, effectively revert to the pre-b26a695a1d78 behavior for state restore,
    even though it's sub-optimal if no IRQs are pending, in order to provide a
    minimal fix, but leave behind a FIXME to document the ugliness.  With luck,
    the ordering issue will be fixed and the mess will be cleaned up in the
    not-too-distant future.
    
    Fixes: 755c2bf87860 ("KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it")
    Cc: stable@vger.kernel.org
    Cc: Maxim Levitsky <mlevitsk@redhat.com>
    Reported-by: Yong He <zhuangel570@gmail.com>
    Closes: https://lkml.kernel.org/r/20241023124527.1092810-1-alexyonghe%40tencent.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-ID: <20241106015135.2462147-1-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

leds: mlxreg: Use devm_mutex_init() for mutex initialization [+ + +]

Author: George Stark <gnstark@salutedevices.com>
Date:   Thu Apr 11 19:10:31 2024 +0300

    leds: mlxreg: Use devm_mutex_init() for mutex initialization
    
    commit efc347b9efee1c2b081f5281d33be4559fa50a16 upstream.
    
    In this driver LEDs are registered using devm_led_classdev_register()
    so they are automatically unregistered after module's remove() is done.
    led_classdev_unregister() calls module's led_set_brightness() to turn off
    the LEDs and that callback uses mutex which was destroyed already
    in module's remove() so use devm API instead.
    
    Signed-off-by: George Stark <gnstark@salutedevices.com>
    Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Link: https://lore.kernel.org/r/20240411161032.609544-8-gnstark@salutedevices.com
    Signed-off-by: Lee Jones <lee@kernel.org>
    [ Resolve minor conflicts to fix CVE-2024-42129 ]
    Signed-off-by: Bin Lan <bin.lan.cn@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

lib/buildid: Fix build ID parsing logic [+ + +]

Author: Jiri Olsa <jolsa@kernel.org>
Date:   Mon Nov 4 18:52:55 2024 +0100

    lib/buildid: Fix build ID parsing logic
    
    parse_build_id_buf() does not account for the Elf32_Nhdr header size
    when getting the build ID data pointer and returns wrong build
    ID data as a result.
    
    This is a problem only for stable trees that merged the c83a80d8b84f
    fix; the upstream build ID code was refactored and returns the proper
    build ID.
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Fixes: c83a80d8b84f ("lib/buildid: harden build ID parsing logic")
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
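
The pointer arithmetic at issue in the lib/buildid entry above can be sketched in userspace C. This is not the kernel's parse_build_id_buf(): the function name, the locally defined header struct (same layout as Elf32_Nhdr), and the bounds handling are assumptions for illustration. The point is that the ID bytes start after the note header plus the 4-byte-aligned name, not after the name alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf32_Nhdr, redefined locally to keep the sketch portable. */
struct note_hdr {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

#define NOTE_GNU_BUILD_ID 3u    /* value of NT_GNU_BUILD_ID */

static size_t align4(size_t x)
{
    return (x + 3) & ~(size_t)3;
}

/*
 * Scan a note buffer for a GNU build ID. On success, store the ID length
 * in *id_len and return a pointer to the ID bytes inside buf; otherwise
 * return NULL.
 */
const unsigned char *find_build_id(const unsigned char *buf, size_t len,
                                   size_t *id_len)
{
    size_t off = 0;

    assert(buf != NULL && id_len != NULL);

    while (off <= len && len - off >= sizeof(struct note_hdr)) {
        struct note_hdr nh;
        size_t remaining, name_sz, desc_sz;

        memcpy(&nh, buf + off, sizeof(nh));
        remaining = len - off - sizeof(nh);

        if (nh.n_namesz > remaining)
            return NULL;
        name_sz = align4(nh.n_namesz);
        if (name_sz > remaining)
            return NULL;
        remaining -= name_sz;
        if (nh.n_descsz > remaining)
            return NULL;

        if (nh.n_type == NOTE_GNU_BUILD_ID && nh.n_namesz == 4 &&
            memcmp(buf + off + sizeof(nh), "GNU", 4) == 0 &&
            nh.n_descsz > 0) {
            *id_len = nh.n_descsz;
            /* Skip the header AND the padded name to reach the ID. */
            return buf + off + sizeof(nh) + name_sz;
        }

        desc_sz = align4(nh.n_descsz);
        if (desc_sz > remaining)
            return NULL;
        off += sizeof(nh) + name_sz + desc_sz;
    }
    return NULL;
}
```

Omitting sizeof(nh) from the returned pointer would hand back bytes that begin 12 bytes too early, which is the kind of wrong build ID data the entry describes.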

Linux: Linux 6.6.63 [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Fri Nov 22 15:38:37 2024 +0100

    Linux 6.6.63
    
    Link: https://lore.kernel.org/r/20241120125629.623666563@linuxfoundation.org
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Hardik Garg <hargar@linux.microsoft.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

LoongArch: Disable KASAN if PGDIR_SIZE is too large for cpu_vabits [+ + +]

Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Tue Nov 12 16:35:39 2024 +0800

    LoongArch: Disable KASAN if PGDIR_SIZE is too large for cpu_vabits
    
    commit 227ca9f6f6aeb8aa8f0c10430b955f1fe2aeab91 upstream.
    
    If PGDIR_SIZE is too large for cpu_vabits, KASAN_SHADOW_END will
    overflow UINTPTR_MAX because KASAN_SHADOW_START/KASAN_SHADOW_END are
    aligned up by PGDIR_SIZE. And then the overflowed KASAN_SHADOW_END looks
    like a user space address.
    
    For example, PGDIR_SIZE of CONFIG_4KB_4LEVEL is 2^39, which is too large
    for Loongson-2K series whose cpu_vabits = 39.
    
    Since CONFIG_4KB_4LEVEL is completely legal for CPUs with cpu_vabits <=
    39, we just disable KASAN via early return in kasan_init(). Otherwise we
    get a boot failure.
    
    Moreover, we change KASAN_SHADOW_END from the first address after the KASAN
    shadow area to the last address in the KASAN shadow area, in order to avoid
    the end address overflowing to exactly 0 (which is a legal case). We don't
    need to worry about alignment because pgd_addr_end() can handle it.
    
    Cc: stable@vger.kernel.org
    Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
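
The overflow described in the LoongArch KASAN entry above is ordinary unsigned wraparound, which a small C sketch can show. The addresses used in the usage note are hypothetical and are not the real LoongArch shadow layout; only the 2^39 alignment comes from the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Round addr up to a multiple of size (a power of two). The arithmetic is
 * modulo 2^64, so a result past UINT64_MAX silently wraps around.
 */
uint64_t round_up_pow2(uint64_t addr, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr + size - 1) & ~(size - 1);
}

/* True when rounding up wrapped, i.e. the result landed below the input. */
bool round_up_wrapped(uint64_t addr, uint64_t size)
{
    return round_up_pow2(addr, size) < addr;
}

/*
 * Inclusive end of a range: the last address inside it. Unlike the
 * exclusive end, this stays representable for a range that reaches the
 * very top of the address space.
 */
uint64_t last_address(uint64_t start, uint64_t len)
{
    assert(len != 0);
    return start + len - 1;
}
```

For example, with a 2^39 alignment, rounding up 0xffffff8000000001 wraps to 0, which then looks like a low (user space) address. A range starting at 0xffffff8000000000 with length 2^39 has an exclusive end of 0 after wraparound, while its inclusive end is the highest 64-bit address, which is why an inclusive end avoids the ambiguity.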

LoongArch: Fix early_numa_add_cpu() usage for FDT systems [+ + +]

Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Tue Nov 12 16:35:36 2024 +0800

    LoongArch: Fix early_numa_add_cpu() usage for FDT systems
    
    commit 30cec747d6bf2c3e915c075d76d9712e54cde0a6 upstream.
    
    early_numa_add_cpu() operates on the physical CPU ID rather than the
    logical CPU ID, so use cpuid instead of cpu.
    
    Cc: stable@vger.kernel.org
    Fixes: 3de9c42d02a79a5 ("LoongArch: Add all CPUs enabled by fdt to NUMA node 0")
    Reported-by: Bibo Mao <maobibo@loongson.cn>
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

LoongArch: Make KASAN work with 5-level page-tables [+ + +]

Author: Huacai Chen <chenhuacai@kernel.org>
Date:   Tue Nov 12 16:35:39 2024 +0800

    LoongArch: Make KASAN work with 5-level page-tables
    
    commit a410656643ce4844ba9875aa4e87a7779308259b upstream.
    
    Make KASAN work with 5-level page-tables, including:
    1. Implement and use __pgd_none() and kasan_p4d_offset().
    2. As done in kasan_pmd_populate() and kasan_pte_populate(), restrict
       the loop conditions of kasan_p4d_populate() and kasan_pud_populate()
       to avoid unnecessary population.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: dvbdev: fix the logic when DVB_DYNAMIC_MINORS is not set [+ + +]

Author: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Date:   Wed Nov 6 21:50:55 2024 +0100

    media: dvbdev: fix the logic when DVB_DYNAMIC_MINORS is not set
    
    commit a4aebaf6e6efff548b01a3dc49b4b9074751c15b upstream.
    
    When CONFIG_DVB_DYNAMIC_MINORS is not set, ret is not initialized, and a
    semaphore is left in the wrong state in case of errors.
    
    Make the code simpler and avoid mistakes by having just one error
    check logic, used whether DVB_DYNAMIC_MINORS is set or not.
    
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Closes: https://lore.kernel.org/r/202410201717.ULWWdJv8-lkp@intel.com/
    Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
    Link: https://lore.kernel.org/r/9e067488d8935b8cf00959764a1fa5de85d65725.1730926254.git.mchehab+huawei@kernel.org
    Cc: Nathan Chancellor <nathan@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/damon/core: check apply interval in damon_do_apply_schemes() [+ + +]

Author: SeongJae Park <sj@kernel.org>
Date:   Mon Feb 5 12:13:06 2024 -0800

    mm/damon/core: check apply interval in damon_do_apply_schemes()
    
    commit e9e3db69966d5e9e6f7e7d017b407c0025180fe5 upstream.
    
    kdamond_apply_schemes() checks the apply intervals of schemes and avoids
    further applying any schemes if no scheme passed its apply interval.
    However, the scheme-applying function that follows, damon_do_apply_schemes(),
    iterates all schemes without the apply interval check.  As a result, the
    shortest apply interval is applied to all schemes.  Fix the problem by
    checking the apply interval in damon_do_apply_schemes().
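
A minimal sketch of the per-scheme check described above. It is a userspace
model, not the kernel code: struct scheme, its fields and
apply_due_schemes() are hypothetical names, and updating next_apply_sis
inside the same loop is a simplification made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DAMOS scheme. */
struct scheme {
    unsigned long next_apply_sis;   /* sample count at which to apply next */
    unsigned long interval_sis;     /* apply interval in sampling intervals (> 0 here) */
    unsigned long nr_applied;       /* how many times the scheme has been applied */
};

/*
 * Apply only the schemes whose own apply interval has elapsed.
 * Returns the number of schemes applied in this pass.
 */
size_t apply_due_schemes(struct scheme *schemes, size_t n,
                         unsigned long passed_sample_intervals)
{
    size_t applied = 0;

    for (size_t i = 0; i < n; i++) {
        struct scheme *s = &schemes[i];

        /* Per-scheme check: skip schemes that are not due yet. */
        if (passed_sample_intervals < s->next_apply_sis)
            continue;

        s->nr_applied++;
        s->next_apply_sis = passed_sample_intervals + s->interval_sis;
        applied++;
    }
    return applied;
}
```

Without the check inside the loop, every scheme would be applied whenever any
one of them is due, which is the "shortest apply interval is applied to all
schemes" behavior the message describes.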
    
    Link: https://lkml.kernel.org/r/20240205201306.88562-1-sj@kernel.org
    Fixes: 42f994b71404 ("mm/damon/core: implement scheme-specific apply interval")
    Signed-off-by: SeongJae Park <sj@kernel.org>
    Cc: <stable@vger.kernel.org>    [6.7.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/damon/core: copy nr_accesses when splitting region [+ + +]

Author: SeongJae Park <sj@kernel.org>
Date:   Sun Nov 19 17:15:28 2023 +0000

    mm/damon/core: copy nr_accesses when splitting region
    
    commit 1f3730fd9e8d4d77fb99c60d0e6ad4b1104e7e04 upstream.
    
    Regions split function ('damon_split_region_at()') is called at the
    beginning of an aggregation interval, and when DAMOS is applying the actions
    and charging quota.  Because 'nr_accesses' fields of all regions are reset
    at the beginning of each aggregation interval, and DAMOS was applying the
    action at the end of each aggregation interval, there was no need to copy
    the 'nr_accesses' field to the split-out region.
    
    However, commit 42f994b71404 ("mm/damon/core: implement scheme-specific
    apply interval") made DAMOS apply actions on its own timing interval.
    Hence, 'nr_accesses' should also be copied to split-out regions, but the
    commit didn't do so.  Fix it by copying it.
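
A minimal sketch of a region split that carries the access counter over to
the split-out part. struct region and split_region_at() are hypothetical,
reduced names for illustration; the real function works on DAMON's own
region and target structures.

```c
#include <assert.h>

/* Hypothetical, reduced model of a monitored address range. */
struct region {
    unsigned long start;
    unsigned long end;
    unsigned int nr_accesses;   /* access count for the current aggregation window */
};

/*
 * Split *r at offset sz_r from its start. The left part stays in *r, the
 * right part is written to *new_r. Returns 0 on success, -1 if sz_r does
 * not fall strictly inside the region.
 */
int split_region_at(struct region *r, unsigned long sz_r, struct region *new_r)
{
    if (sz_r == 0 || sz_r >= r->end - r->start)
        return -1;

    new_r->start = r->start + sz_r;
    new_r->end = r->end;
    /* The split-out part inherits the counter instead of starting at 0. */
    new_r->nr_accesses = r->nr_accesses;

    r->end = new_r->start;
    return 0;
}
```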
    
    Link: https://lkml.kernel.org/r/20231119171529.66863-1-sj@kernel.org
    Fixes: 42f994b71404 ("mm/damon/core: implement scheme-specific apply interval")
    Signed-off-by: SeongJae Park <sj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/damon/core: handle zero schemes apply interval [+ + +]

Author: SeongJae Park <sj@kernel.org>
Date:   Thu Oct 31 11:37:57 2024 -0700

    mm/damon/core: handle zero schemes apply interval
    
    commit 8e7bde615f634a82a44b1f3d293c049fd3ef9ca9 upstream.
    
    DAMON's logic to determine if this is the time to apply DAMOS schemes
    assumes next_apply_sis is always set larger than the current
    passed_sample_intervals.  It therefore assumes that continuously
    incrementing passed_sample_intervals will make it reach next_apply_sis in
    the future.  The logic hence applies the scheme and updates next_apply_sis
    only if passed_sample_intervals is the same as next_apply_sis.
    
    If the schemes apply interval is set to zero, however, next_apply_sis is
    set to the same value as the current passed_sample_intervals.  And
    passed_sample_intervals is incremented before doing the next_apply_sis
    check.  Hence, passed_sample_intervals becomes larger than next_apply_sis,
    and the logic says it is not the time to apply schemes and update
    next_apply_sis.  In other words, DAMON stops applying schemes until
    passed_sample_intervals overflows.
    
    Based on the documentation and common sense, a reasonable behavior for
    such inputs would be applying the schemes for every sampling interval.
    Handle the case by removing the assumption.
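
The effect of the assumption can be reproduced with a small counter model.
This is a sketch only: count_applies() and its parameters are hypothetical,
and replacing the equality test with a greater-or-equal comparison is one
way to drop the assumption, not necessarily the exact form of the fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Count how often a scheme is applied over nr_samples sampling ticks. */
unsigned long count_applies(unsigned long interval_sis,
                            unsigned long nr_samples,
                            bool strict_equality)
{
    unsigned long passed = 0;
    /* Equals passed when the interval is zero. */
    unsigned long next_apply = passed + interval_sis;
    unsigned long applies = 0;

    for (unsigned long i = 0; i < nr_samples; i++) {
        passed++;   /* incremented before the check, as described above */

        bool due = strict_equality ? (passed == next_apply)
                                   : (passed >= next_apply);
        if (!due)
            continue;

        applies++;
        next_apply = passed + interval_sis;
    }
    return applies;
}
```

With a zero interval, the equality variant never fires over 100 ticks, while
the relaxed comparison fires on every tick. For a non-zero interval such as
5, both variants fire 20 times, so the relaxed check does not change the
normal case.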
    
    Link: https://lkml.kernel.org/r/20241031183757.49610-3-sj@kernel.org
    Fixes: 42f994b71404 ("mm/damon/core: implement scheme-specific apply interval")
    Signed-off-by: SeongJae Park <sj@kernel.org>
    Cc: <stable@vger.kernel.org>    [6.7.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/damon/core: handle zero {aggregation,ops_update} intervals [+ + +]

Author: SeongJae Park <sj@kernel.org>
Date:   Thu Oct 31 11:37:56 2024 -0700

    mm/damon/core: handle zero {aggregation,ops_update} intervals
    
    [ Upstream commit 3488af0970445ff5532c7e8dc5e6456b877aee5e ]
    
    Patch series "mm/damon/core: fix handling of zero non-sampling intervals".
    
    DAMON's internal intervals accounting logic is not correctly handling
    non-sampling intervals of zero values because of a wrong assumption.  This
    could cause unexpected monitoring behavior, and even result in an infinite
    hang of DAMON sysfs interface user threads in case of a zero aggregation
    interval.  Fix those by updating the intervals accounting logic.  For
    details of the root cause and solutions, please refer to the commit
    messages of the fixes.
    
    This patch (of 2):
    
    DAMON's logic to determine if this is the time to do aggregation and ops
    update assumes next_{aggregation,ops_update}_sis are always set larger
    than the current passed_sample_intervals.  It therefore further assumes
    that continuously incrementing passed_sample_intervals every sampling
    interval will make it reach next_{aggregation,ops_update}_sis in the
    future.  The logic therefore takes the action and updates
    next_{aggregation,ops_update}_sis only if passed_sample_intervals is the
    same as the counts, respectively.
    
    If Aggregation interval or Ops update interval are zero, however,
    next_aggregation_sis or next_ops_update_sis are set same to current
    passed_sample_intervals, respectively.  And passed_sample_intervals is
    incremented before doing the next_{aggregation,ops_update}_sis check.
    Hence, passed_sample_intervals becomes larger than
    next_{aggregation,ops_update}_sis, and the logic says it is not the time
    to do the action and update next_{aggregation,ops_update}_sis forever,
    until an overflow happens.  In other words, DAMON stops doing aggregations
    or ops updates effectively forever, and users cannot get monitoring
    results.
    
    Based on the documents and the common sense, a reasonable behavior for
    such inputs is doing an aggregation and an ops update for every sampling
    interval.  Handle the case by removing the assumption.
    
    Note that this could cause a real issue for DAMON sysfs interface users
    in particular, in case of a zero Aggregation interval.  When a user starts
    DAMON with a zero Aggregation interval and asks for online DAMON parameter
    tuning via the DAMON sysfs interface, the request is handled by the
    aggregation callback.  Until the callback finishes the work, the user who
    requested the online tuning just waits.  Hence, the user will be stuck
    until passed_sample_intervals overflows.
    
    Link: https://lkml.kernel.org/r/20241031183757.49610-1-sj@kernel.org
    Link: https://lkml.kernel.org/r/20241031183757.49610-2-sj@kernel.org
    Fixes: 4472edf63d66 ("mm/damon/core: use number of passed access sampling as a timer")
    Signed-off-by: SeongJae Park <sj@kernel.org>
    Cc: <stable@vger.kernel.org>    [6.7.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/damon/core: implement scheme-specific apply interval [+ + +]

Author: SeongJae Park <sj@kernel.org>
Date:   Sat Sep 16 02:09:40 2023 +0000

    mm/damon/core: implement scheme-specific apply interval
    
    [ Upstream commit 42f994b71404b17abcd6b170de7a6aa95ffe5d4a ]
    
    DAMON-based operation schemes are applied for every aggregation interval.
    That was mainly because schemes were using nr_accesses, which is complete
    and ready to be used only at every aggregation interval.  However, the
    schemes are now using nr_accesses_bp, which is updated for each sampling
    interval in a way that is reasonable to be used.  Therefore, there is no
    reason to apply schemes for each aggregation interval.
    
    The unnecessary alignment with aggregation interval was also making some
    use cases of DAMOS tricky.  Quotas setting under long aggregation interval
    is one such example.  Suppose the aggregation interval is ten seconds, and
    there is a scheme having CPU quota 100ms per 1s.  The scheme will actually
    use 100ms per ten seconds, since it cannot be applied before the next
    aggregation interval.  The feature is working as intended, but the results
    might not be that intuitive for some users.  This could be fixed by updating
    the quota to 1s per 10s.  But, in that case, the CPU usage of DAMOS could
    look like spikes, and would actually have a bad effect on other
    CPU-sensitive workloads.
    
    Implement a dedicated timing interval for each DAMON-based operation
    scheme, namely apply_interval.  The interval will be sampling interval
    aligned, and each scheme will be applied for its apply_interval.  The
    interval is set to 0 by default, and it means the scheme should use the
    aggregation interval instead.  This avoids old users getting any
    behavioral difference.
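
A minimal sketch of the fallback rule above, converting the effective
interval into a count of sampling intervals. The function and parameter
names are hypothetical, and rounding down to a whole number of sampling
intervals is an assumption of this sketch.

```c
#include <assert.h>

/*
 * Number of sampling intervals between two applications of a scheme.
 * An apply_interval_us of 0 means "fall back to the aggregation interval".
 * sample_interval_us must be non-zero.
 */
unsigned long apply_interval_in_samples(unsigned long apply_interval_us,
                                        unsigned long aggr_interval_us,
                                        unsigned long sample_interval_us)
{
    unsigned long interval_us = apply_interval_us ? apply_interval_us
                                                  : aggr_interval_us;

    assert(sample_interval_us != 0);
    /* Aligned down to a whole number of sampling intervals. */
    return interval_us / sample_interval_us;
}
```

For example, with a 5 ms sampling interval and a 100 ms aggregation
interval, an unset (zero) apply interval maps to 20 sampling intervals,
which is the same cadence as before the feature existed.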
    
    Link: https://lkml.kernel.org/r/20230916020945.47296-5-sj@kernel.org
    Signed-off-by: SeongJae Park <sj@kernel.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 3488af097044 ("mm/damon/core: handle zero {aggregation,ops_update} intervals")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm: avoid unsafe VMA hook invocation when error arises on mmap hook [+ + +]

Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Fri Nov 15 12:41:54 2024 +0000

    mm: avoid unsafe VMA hook invocation when error arises on mmap hook
    
    [ Upstream commit 3dd6ed34ce1f2356a77fb88edafb5ec96784e3cf ]
    
    Patch series "fix error handling in mmap_region() and refactor
    (hotfixes)", v4.
    
    mmap_region() is somewhat terrifying, with spaghetti-like control flow and
    numerous means by which issues can arise and incomplete state, memory
    leaks and other unpleasantness can occur.
    
    A large amount of the complexity arises from trying to handle errors late
    in the process of mapping a VMA, which forms the basis of recently
    observed issues with resource leaks and observable inconsistent state.
    
    This series goes to great lengths to simplify how mmap_region() works and
    to avoid unwinding errors late on in the process of setting up the VMA for
    the new mapping, and equally avoids such operations occurring while the
    VMA is in an inconsistent state.
    
    The patches in this series comprise the minimal changes required to
    resolve existing issues in mmap_region() error handling, in order that
    they can be hotfixed and backported.  There is additionally a follow up
    series which goes further, separated out from the v1 series and sent and
    updated separately.
    
    This patch (of 5):
    
    After an attempted mmap() fails, we are no longer in a situation where we
    can safely interact with VMA hooks.  This is currently not enforced,
    meaning that we need complicated handling to ensure we do not incorrectly
    call these hooks.
    
    We can avoid the whole issue by treating the VMA as suspect the moment
    that the file->f_ops->mmap() function reports an error by replacing
    whatever VMA operations were installed with a dummy empty set of VMA
    operations.
    
    We do so through a new helper function internal to mm - mmap_file() -
    which is both more logically named than the existing call_mmap() function
    and correctly isolates handling of the vm_op reassignment to mm.
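
The idea can be sketched in userspace as follows. All types and names here
(struct vma, struct vm_ops, mmap_file_sketch(), the two example hooks) are
hypothetical miniatures for illustration and do not mirror the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>

struct vma;

/* Hypothetical miniature of the VMA operations table. */
struct vm_ops {
    void (*close)(struct vma *vma);
};

struct vma {
    const struct vm_ops *vm_ops;
};

/* Empty operations set: every hook is NULL, so nothing can be invoked. */
static const struct vm_ops dummy_vm_ops = { .close = NULL };

/* Signature of a driver's mmap hook in this model. */
typedef int (*mmap_hook_t)(struct vma *vma);

/*
 * Call the driver hook; if it fails, discard whatever operations it
 * installed so later code cannot call into a half-initialised driver.
 */
int mmap_file_sketch(mmap_hook_t hook, struct vma *vma)
{
    int err = hook(vma);

    if (err == 0)
        return 0;

    vma->vm_ops = &dummy_vm_ops;
    return err;
}

/* Example driver pieces used to exercise the helper. */
void driver_close(struct vma *vma) { (void)vma; }
static const struct vm_ops driver_ops = { .close = driver_close };

int ok_hook(struct vma *vma)      { vma->vm_ops = &driver_ops; return 0; }
int failing_hook(struct vma *vma) { vma->vm_ops = &driver_ops; return -12; }
```

After failing_hook() returns an error, the VMA no longer points at
driver_ops, so no later error path can reach driver_close() by accident.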
    
    All the existing invocations of call_mmap() outside of mm are ultimately
    nested within the call_mmap() from mm, which we now replace.
    
    It is therefore safe to leave call_mmap() in place as a convenience
    function (and to avoid churn).  The invokers are:
    
         ovl_file_operations -> mmap -> ovl_mmap() -> backing_file_mmap()
        coda_file_operations -> mmap -> coda_file_mmap()
         shm_file_operations -> shm_mmap()
    shm_file_operations_huge -> shm_mmap()
                dma_buf_fops -> dma_buf_mmap_internal -> i915_dmabuf_ops
                                -> i915_gem_dmabuf_mmap()
    
    None of these callers interact with vm_ops or mappings in a problematic
    way on error, quickly exiting out.
    
    Link: https://lkml.kernel.org/r/cover.1730224667.git.lorenzo.stoakes@oracle.com
    Link: https://lkml.kernel.org/r/d41fd763496fd0048a962f3fd9407dc72dd4fd86.1730224667.git.lorenzo.stoakes@oracle.com
    Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reported-by: Jann Horn <jannh@google.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Jann Horn <jannh@google.com>
    Cc: Andreas Larsson <andreas@gaisler.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Helge Deller <deller@gmx.de>
    Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: fix NULL pointer dereference in alloc_pages_bulk_noprof [+ + +]

Author: Jinjiang Tu <tujinjiang@huawei.com>
Date:   Wed Nov 13 16:32:35 2024 +0800

    mm: fix NULL pointer dereference in alloc_pages_bulk_noprof
    
    commit 8ce41b0f9d77cca074df25afd39b86e2ee3aa68e upstream.
    
    We triggered a NULL pointer dereference for ac.preferred_zoneref->zone in
    alloc_pages_bulk_noprof() when the task is migrated between cpusets.
    
    When cpuset is enabled, in prepare_alloc_pages(), ac->nodemask may be
    &current->mems_allowed.  When first_zones_zonelist() is called to find
    preferred_zoneref, the ac->nodemask may be modified concurrently if the
    task is migrated between different cpusets.  Assuming we have 2 NUMA nodes,
    when traversing Node1 in ac->zonelist, the nodemask is 2, and when
    traversing Node2 in ac->zonelist, the nodemask is 1.  As a result, the
    ac->preferred_zoneref points to NULL zone.
    
    In alloc_pages_bulk_noprof(), for_each_zone_zonelist_nodemask() finds an
    allowable zone and calls zonelist_node_idx(ac.preferred_zoneref), leading
    to NULL pointer dereference.
    
    __alloc_pages_noprof() fixes this issue by checking NULL pointer in commit
    ea57485af8f4 ("mm, page_alloc: fix check for NULL preferred_zone") and
    commit df76cee6bbeb ("mm, page_alloc: remove redundant checks from alloc
    fastpath").
    
    To fix it, check NULL pointer for preferred_zoneref->zone.
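
A minimal sketch of such a guard, using hypothetical reduced structures.
The -1 return value standing for "let the caller take its fallback path" is
an assumption of this sketch and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct zone { int node; };

/* Hypothetical reduced zoneref: zone may be NULL if no zone matched. */
struct zoneref {
    struct zone *zone;
};

struct alloc_context {
    struct zoneref *preferred_zoneref;
};

/*
 * Returns the node of the preferred zone, or -1 when no preferred zone
 * was found, so the caller never dereferences a NULL zone.
 */
int preferred_node_or_fallback(const struct alloc_context *ac)
{
    /* The zone pointer can be NULL after a racing nodemask update. */
    if (ac->preferred_zoneref == NULL || ac->preferred_zoneref->zone == NULL)
        return -1;

    return ac->preferred_zoneref->zone->node;
}
```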
    
    Link: https://lkml.kernel.org/r/20241113083235.166798-1-tujinjiang@huawei.com
    Fixes: 387ba26fb1cb ("mm/page_alloc: add a bulk page allocator")
    Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Alexander Lobakin <alobakin@pm.me>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Nanyong Sun <sunnanyong@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling [+ + +]

Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Fri Nov 15 12:41:57 2024 +0000

    mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
    
    [ Upstream commit 5baf8b037debf4ec60108ccfeccb8636d1dbad81 ]
    
    Currently MTE is permitted in two circumstances (desiring to use MTE
    having been specified by the VM_MTE flag) - where MAP_ANONYMOUS is
    specified, as checked by arch_calc_vm_flag_bits() and actualised by
    setting the VM_MTE_ALLOWED flag, or if the file backing the mapping is
    shmem, in which case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap
    hook is activated in mmap_region().
    
    The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also
    set is the arm64 implementation of arch_validate_flags().
    
    Unfortunately, we intend to refactor mmap_region() to perform this check
    earlier, meaning that in the case of a shmem backing we will not have
    invoked shmem_mmap() yet, causing the mapping to fail spuriously.
    
    It is inappropriate to set this architecture-specific flag in general mm
    code anyway, so a sensible resolution of this issue is to instead move the
    check somewhere else.
    
    We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
    the arch_calc_vm_flag_bits() call.
    
    This is an appropriate place to do this as we already check for the
    MAP_ANONYMOUS case here, and the shmem file case is simply a variant of
    the same idea - we permit RAM-backed memory.
    
    This requires a modification to the arch_calc_vm_flag_bits() signature to
    pass in a pointer to the struct file associated with the mapping, however
    this is not too egregious as this is only used by two architectures anyway
    - arm64 and parisc.
    
    So this patch performs this adjustment and removes the unnecessary
    assignment of VM_MTE_ALLOWED in shmem_mmap().
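
The resulting decision can be sketched as a pure function of the file and
the mapping flags. The flag values, struct file_sketch and
calc_vm_flag_bits_sketch() are illustrative stand-ins; the real function
also depends on whether the system supports MTE, which this sketch leaves
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values; the real constants live in kernel headers. */
#define SK_MAP_ANONYMOUS   0x20UL
#define SK_VM_MTE_ALLOWED  0x1UL

/* Hypothetical stand-in for struct file: only records if it is shmem-backed. */
struct file_sketch {
    bool is_shmem;
};

/*
 * Decide up front whether memory tagging may be used for a mapping:
 * allowed for anonymous mappings and for shmem-backed files, that is,
 * for RAM-backed memory. file may be NULL for anonymous mappings.
 */
unsigned long calc_vm_flag_bits_sketch(const struct file_sketch *file,
                                       unsigned long map_flags)
{
    if (map_flags & SK_MAP_ANONYMOUS)
        return SK_VM_MTE_ALLOWED;
    if (file != NULL && file->is_shmem)
        return SK_VM_MTE_ALLOWED;
    return 0;
}
```

Because the answer depends only on the file and the flags, it is available
before any mmap hook runs, which is what allows the validation to move
earlier.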
    
    [akpm@linux-foundation.org: fix whitespace, per Catalin]
    Link: https://lkml.kernel.org/r/ec251b20ba1964fb64cf1607d2ad80c47f3873df.1730224667.git.lorenzo.stoakes@oracle.com
    Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
    Reported-by: Jann Horn <jannh@google.com>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andreas Larsson <andreas@gaisler.com>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Helge Deller <deller@gmx.de>
    Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: refactor map_deny_write_exec() [+ + +]

Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Fri Nov 15 12:41:56 2024 +0000

    mm: refactor map_deny_write_exec()
    
    [ Upstream commit 0fb4a7ad270b3b209e510eb9dc5b07bf02b7edaf ]
    
    Refactor the map_deny_write_exec() to not unnecessarily require a VMA
    parameter but rather to accept VMA flags parameters, which allows us to
    use this function early in mmap_region() in a subsequent commit.
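
A flags-only check of this kind could look like the sketch below. The flag
values and the mdwe_enabled parameter are assumptions made for
illustration, and the policy shown (refuse writable-and-executable
mappings, refuse turning a non-executable mapping executable) is a reading
of the memory-deny-write-execute feature rather than text from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values, not the kernel's definitions. */
#define SK_VM_WRITE 0x2UL
#define SK_VM_EXEC  0x4UL

/*
 * old_flags are the flags of the existing mapping, new_flags the requested
 * ones. For a brand-new mapping, pass the same value for both.
 */
bool deny_write_exec_sketch(bool mdwe_enabled,
                            unsigned long old_flags,
                            unsigned long new_flags)
{
    if (!mdwe_enabled)
        return false;               /* policy off: nothing to deny */
    if (!(new_flags & SK_VM_EXEC))
        return false;               /* result is not executable */
    if (new_flags & SK_VM_WRITE)
        return true;                /* writable and executable at once */
    if (!(old_flags & SK_VM_EXEC))
        return true;                /* non-executable becoming executable */
    return false;
}
```

Since no VMA is needed, the check can run before a VMA has been allocated.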
    
    While we're here, we refactor the function to be more readable and add
    some additional documentation.
    
    Link: https://lkml.kernel.org/r/6be8bb59cd7c68006ebb006eb9d8dc27104b1f70.1730224667.git.lorenzo.stoakes@oracle.com
    Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reported-by: Jann Horn <jannh@google.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Jann Horn <jannh@google.com>
    Cc: Andreas Larsson <andreas@gaisler.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Helge Deller <deller@gmx.de>
    Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: resolve faulty mmap_region() error path behaviour [+ + +]

Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Fri Nov 15 12:41:58 2024 +0000

    mm: resolve faulty mmap_region() error path behaviour
    
    [ Upstream commit 5de195060b2e251a835f622759550e6202167641 ]
    
    The mmap_region() function is somewhat terrifying, with spaghetti-like
    control flow and numerous means by which issues can arise and incomplete
    state, memory leaks and other unpleasantness can occur.
    
    A large amount of the complexity arises from trying to handle errors late
    in the process of mapping a VMA, which forms the basis of recently
    observed issues with resource leaks and observable inconsistent state.
    
    Taking advantage of previous patches in this series we move a number of
    checks earlier in the code, simplifying things by moving the core of the
    logic into a static internal function __mmap_region().
    
    Doing this allows us to perform a number of checks up front before we do
    any real work, and allows us to unwind the writable unmap check
    unconditionally as required and to perform a CONFIG_DEBUG_VM_MAPLE_TREE
    validation unconditionally also.
    
    We move a number of things here:
    
    1. We preallocate memory for the iterator before we call the file-backed
       memory hook, allowing us to exit early and avoid having to perform
       complicated and error-prone close/free logic. We carefully free
       iterator state on both success and error paths.
    
    2. The enclosing mmap_region() function handles the mapping_map_writable()
       logic early. Previously the logic had the mapping_map_writable() at the
       point of mapping a newly allocated file-backed VMA, and a matching
       mapping_unmap_writable() on success and error paths.
    
       We now do this unconditionally if this is a file-backed, shared writable
       mapping. A driver may change the flags to eliminate VM_MAYWRITE; however,
       doing so does not invalidate the seal check we just performed, and we in
       any case always decrement the counter in the wrapper.
    
       We perform a debug assert to ensure a driver does not attempt to do the
       opposite.
    
    3. We also move arch_validate_flags() up into the mmap_region()
       function. This is only relevant on arm64 and sparc64, and the check is
       only meaningful for SPARC with ADI enabled. We explicitly add a warning
       for this arch if a driver invalidates this check, though the code ought
       eventually to be fixed to eliminate the need for this.
    
    With all of these measures in place, we no longer need to explicitly close
    the VMA on error paths, as we place all checks which might fail prior to a
    call to any driver mmap hook.
    
    This eliminates an entire class of errors, makes the code easier to reason
    about and more robust.
    
    Link: https://lkml.kernel.org/r/6e0becb36d2f5472053ac5d544c0edfe9b899e25.1730224667.git.lorenzo.stoakes@oracle.com
    Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reported-by: Jann Horn <jannh@google.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Tested-by: Mark Brown <broonie@kernel.org>
    Cc: Andreas Larsson <andreas@gaisler.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Helge Deller <deller@gmx.de>
    Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: revert "mm: shmem: fix data-race in shmem_getattr()" [+ + +]

Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Fri Nov 15 16:57:24 2024 -0800

    mm: revert "mm: shmem: fix data-race in shmem_getattr()"
    
    commit d1aa0c04294e29883d65eac6c2f72fe95cc7c049 upstream.
    
    Revert d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()") as
    suggested by Chuck [1].  It is causing deadlocks when accessing tmpfs over
    NFS.
    
    As Hugh commented, "added just to silence a syzbot sanitizer splat: added
    where there has never been any practical problem".
    
    Link: https://lkml.kernel.org/r/ZzdxKF39VEmXSSyN@tissot.1015granger.net [1]
    Fixes: d949d1d14fa2 ("mm: shmem: fix data-race in shmem_getattr()")
    Acked-by: Hugh Dickins <hughd@google.com>
    Cc: Chuck Lever <chuck.lever@oracle.com>
    Cc: Jeongjun Park <aha310510@gmail.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: unconditionally close VMAs on error [+ + +]

Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Fri Nov 15 12:41:55 2024 +0000

    mm: unconditionally close VMAs on error
    
    [ Upstream commit 4080ef1579b2413435413988d14ac8c68e4d42c8 ]
    
    Incorrect invocation of VMA callbacks when the VMA is no longer in a
    consistent state is bug prone and risky to perform.
    
    With regard to the important vm_ops->close() callback, we have gone to
    great lengths to try to track whether or not we ought to close VMAs.
    
    Rather than doing so and risking making a mistake somewhere, instead
    unconditionally close and reset vma->vm_ops to an empty dummy operations
    set with a NULL .close operator.
    
    We introduce a new function to do so - vma_close() - and simplify existing
    vms logic which tracked whether we needed to close or not.
    
    This simplifies the logic, avoids incorrect double-calling of the .close()
    callback and allows us to update error paths to simply call vma_close()
    unconditionally - making VMA closure idempotent.
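
The idempotent close can be modelled in a few lines. The structures, the
close_calls counter and the names ending in _sketch are hypothetical and
exist only to make the behavior observable in userspace.

```c
#include <assert.h>
#include <stddef.h>

struct vma;

struct vm_ops {
    void (*close)(struct vma *vma);
};

struct vma {
    const struct vm_ops *vm_ops;
    int close_calls;    /* test-only counter, not part of any real structure */
};

/* Empty operations set with a NULL .close operator. */
static const struct vm_ops dummy_vm_ops = { .close = NULL };

/* Close the VMA at most once: after the first call the hook is gone. */
void vma_close_sketch(struct vma *vma)
{
    if (vma->vm_ops != NULL && vma->vm_ops->close != NULL) {
        vma->vm_ops->close(vma);
        vma->vm_ops = &dummy_vm_ops;
    }
}

/* Example driver hook that records each invocation. */
void counting_close(struct vma *vma)
{
    vma->close_calls++;
}

const struct vm_ops counting_ops = { .close = counting_close };
```

Calling vma_close_sketch() twice on the same VMA invokes the driver hook
only once, so error paths can call it without tracking prior state.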
    
    Link: https://lkml.kernel.org/r/28e89dda96f68c505cb6f8e9fc9b57c3e9f74b42.1730224667.git.lorenzo.stoakes@oracle.com
    Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reported-by: Jann Horn <jannh@google.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reviewed-by: Jann Horn <jannh@google.com>
    Cc: Andreas Larsson <andreas@gaisler.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Helge Deller <deller@gmx.de>
    Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mark Brown <broonie@kernel.org>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sunxi-mmc: Fix A100 compatible description [+ + +]

Author: Andre Przywara <andre.przywara@arm.com>
Date:   Thu Nov 7 01:42:40 2024 +0000

    mmc: sunxi-mmc: Fix A100 compatible description
    
    commit 85b580afc2c215394e08974bf033de9face94955 upstream.
    
    It turns out that the Allwinner A100/A133 SoC only supports 8K DMA
    blocks (13 bits wide), for both the SD/SDIO and eMMC instances.
    And while this alone would make a trivial fix, the H616 falls back to
    the A100 compatible string, so we have to now match the H616 compatible
    string explicitly against the description advertising 64K DMA blocks.
    
    As the A100 is now compatible with the D1 description, let the A100
    compatible string point to that block instead, and introduce an explicit
    match against the H616 string, pointing to the old description.
    Also remove the redundant setting of clk_delays to NULL on the way.
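
The remapping can be pictured as a small match table. This is a sketch: the
structures are reduced, the field name and the compatible strings are
written from memory and should be treated as assumptions to be checked
against the driver and its devicetree bindings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced per-SoC description. */
struct mmc_cfg {
    unsigned int idma_des_size_bits;    /* width of the DMA descriptor size field */
};

static const struct mmc_cfg cfg_13bit = { .idma_des_size_bits = 13 };   /* 8K blocks */
static const struct mmc_cfg cfg_16bit = { .idma_des_size_bits = 16 };   /* 64K blocks */

struct match_entry {
    const char *compatible;
    const struct mmc_cfg *cfg;
};

static const struct match_entry match_table[] = {
    { "allwinner,sun20i-d1-mmc",   &cfg_13bit },
    { "allwinner,sun50i-a100-mmc", &cfg_13bit },    /* now shares the D1 description */
    { "allwinner,sun50i-h616-mmc", &cfg_16bit },    /* explicit entry, old description */
};

/* Largest DMA block, in bytes, for a compatible string; 0 if unknown. */
unsigned long max_dma_block(const char *compatible)
{
    for (size_t i = 0; i < sizeof(match_table) / sizeof(match_table[0]); i++) {
        if (strcmp(match_table[i].compatible, compatible) == 0)
            return 1UL << match_table[i].cfg->idma_des_size_bits;
    }
    return 0;
}
```

A 13-bit size field gives 1 << 13 = 8192 bytes per block, and a 16-bit
field gives 65536 bytes, matching the 8K and 64K figures in the message.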
    
    Fixes: 3536b82e5853 ("mmc: sunxi: add support for A100 mmc controller")
    Cc: stable@vger.kernel.org
    Signed-off-by: Andre Przywara <andre.przywara@arm.com>
    Tested-by: Parthiban Nallathambi <parthiban@linumiz.com>
    Reviewed-by: Chen-Yu Tsai <wens@csie.org>
    Message-ID: <20241107014240.24669-1-andre.przywara@arm.com>
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: add userspace_pm_lookup_addr_by_id helper [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Nov 18 19:27:20 2024 +0100

    mptcp: add userspace_pm_lookup_addr_by_id helper
    
    commit 06afe09091ee69dc7ab058b4be9917ae59cc81e5 upstream.
    
    Corresponding to the __lookup_addr_by_id() helper in the in-kernel netlink
    PM, this patch adds a new helper, mptcp_userspace_pm_lookup_addr_by_id(),
    to look up the address entry with the given id on the userspace PM local
    address list.
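
A lookup of this shape can be sketched over a plain singly linked list.
struct addr_entry and lookup_addr_by_id() are hypothetical; the kernel
helper walks its own list type and relies on the PM lock, which this
sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced address entry on a local address list. */
struct addr_entry {
    unsigned int id;
    struct addr_entry *next;
};

/*
 * Return the first entry whose id matches, or NULL if none does.
 * The caller is assumed to hold whatever lock protects the list.
 */
struct addr_entry *lookup_addr_by_id(struct addr_entry *head, unsigned int id)
{
    for (struct addr_entry *e = head; e != NULL; e = e->next) {
        if (e->id == id)
            return e;
    }
    return NULL;
}
```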
    
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: f642c5c4d528 ("mptcp: hold pm lock when deleting entry")
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: cope racing subflow creation in mptcp_rcv_space_adjust [+ + +]

Author: Paolo Abeni <pabeni@redhat.com>
Date:   Fri Nov 8 11:58:17 2024 +0100

    mptcp: cope racing subflow creation in mptcp_rcv_space_adjust
    
    [ Upstream commit ce7356ae35943cc6494cc692e62d51a734062b7d ]
    
    Additional active subflows - i.e. created by the in kernel path
    manager - are included into the subflow list before starting the
    3whs.
    
    A racing recvmsg() spooling data received on an already established
    subflow would unconditionally call tcp_cleanup_rbuf() on all the
    current subflows, potentially hitting a divide by zero error on
    the newly created ones.
    
    Explicitly check that the subflow is in a suitable state before
    invoking tcp_cleanup_rbuf().
    
    Fixes: c76c6956566f ("mptcp: call tcp_cleanup_rbuf on subflows")
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/02374660836e1b52afc91966b7535c8c5f7bafb0.1731060874.git.pabeni@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: define more local variables sk [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Nov 18 19:27:19 2024 +0100

    mptcp: define more local variables sk
    
    commit 14cb0e0bf39bd10429ba14e9e2f905f1144226fc upstream.
    
    '(struct sock *)msk' is used several times in mptcp_nl_cmd_announce(),
    mptcp_nl_cmd_remove() or mptcp_userspace_pm_set_flags() in pm_userspace.c,
    so it's worth adding a local variable sk to point to it.
    
    Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-8-db8f25f798eb@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 06afe09091ee ("mptcp: add userspace_pm_lookup_addr_by_id helper")
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: drop lookup_by_id in lookup_addr [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Nov 18 19:27:23 2024 +0100

    mptcp: drop lookup_by_id in lookup_addr
    
    commit af250c27ea1c404e210fc3a308b20f772df584d6 upstream.
    
    When the lookup_by_id parameter of __lookup_addr() is true, it behaves
    the same as __lookup_addr_by_id(), so such calls can be replaced by
    __lookup_addr_by_id() directly. Drop this parameter and let
    __lookup_addr() only look up an address on the local address list by
    comparing addresses, not address ids.
    
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-4-c436ba5e569b@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: db3eab8110bc ("mptcp: pm: use _rcu variant under rcu_read_lock")
    [ Conflicts in pm_netlink.c, because commit 6a42477fe449 ("mptcp: update
      set_flags interfaces") is not in this version, and causes too many
      conflicts when backporting it. The conflict is easy to resolve: addr
      is a pointer here in mptcp_pm_nl_set_flags(), the rest of the
      code is the same. ]
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: error out earlier on disconnect [+ + +]

Author: Paolo Abeni <pabeni@redhat.com>
Date:   Fri Nov 8 11:58:16 2024 +0100

    mptcp: error out earlier on disconnect
    
    [ Upstream commit 581302298524e9d77c4c44ff5156a6cd112227ae ]
    
    Eric reported a division by zero splat in the MPTCP protocol:
    
    Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 1 UID: 0 PID: 6094 Comm: syz-executor317 Not tainted
    6.12.0-rc5-syzkaller-00291-g05b92660cdfe #0
    Hardware name: Google Google Compute Engine/Google Compute Engine,
    BIOS Google 09/13/2024
    RIP: 0010:__tcp_select_window+0x5b4/0x1310 net/ipv4/tcp_output.c:3163
    Code: f6 44 01 e3 89 df e8 9b 75 09 f8 44 39 f3 0f 8d 11 ff ff ff e8
    0d 74 09 f8 45 89 f4 e9 04 ff ff ff e8 00 74 09 f8 44 89 f0 99 <f7> 7c
    24 14 41 29 d6 45 89 f4 e9 ec fe ff ff e8 e8 73 09 f8 48 89
    RSP: 0018:ffffc900041f7930 EFLAGS: 00010293
    RAX: 0000000000017e67 RBX: 0000000000017e67 RCX: ffffffff8983314b
    RDX: 0000000000000000 RSI: ffffffff898331b0 RDI: 0000000000000004
    RBP: 00000000005d6000 R08: 0000000000000004 R09: 0000000000017e67
    R10: 0000000000003e80 R11: 0000000000000000 R12: 0000000000003e80
    R13: ffff888031d9b440 R14: 0000000000017e67 R15: 00000000002eb000
    FS: 00007feb5d7f16c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007feb5d8adbb8 CR3: 0000000074e4c000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    <TASK>
    __tcp_cleanup_rbuf+0x3e7/0x4b0 net/ipv4/tcp.c:1493
    mptcp_rcv_space_adjust net/mptcp/protocol.c:2085 [inline]
    mptcp_recvmsg+0x2156/0x2600 net/mptcp/protocol.c:2289
    inet_recvmsg+0x469/0x6a0 net/ipv4/af_inet.c:885
    sock_recvmsg_nosec net/socket.c:1051 [inline]
    sock_recvmsg+0x1b2/0x250 net/socket.c:1073
    __sys_recvfrom+0x1a5/0x2e0 net/socket.c:2265
    __do_sys_recvfrom net/socket.c:2283 [inline]
    __se_sys_recvfrom net/socket.c:2279 [inline]
    __x64_sys_recvfrom+0xe0/0x1c0 net/socket.c:2279
    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
    entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7feb5d857559
    Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48
    89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
    01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007feb5d7f1208 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
    RAX: ffffffffffffffda RBX: 00007feb5d8e1318 RCX: 00007feb5d857559
    RDX: 000000800000000e RSI: 0000000000000000 RDI: 0000000000000003
    RBP: 00007feb5d8e1310 R08: 0000000000000000 R09: ffffffff81000000
    R10: 0000000000000100 R11: 0000000000000246 R12: 00007feb5d8e131c
    R13: 00007feb5d8ae074 R14: 000000800000000e R15: 00000000fffffdef
    
    and provided a nice reproducer.
    
    The root cause is the current bad handling of racing disconnect.
    After the blamed commit below, sk_wait_data() can return (with
    error) with the underlying socket disconnected and a zero rcv_mss.
    
    Catch the error and return without performing any additional
    operations on the current socket.
    
    Reported-by: Eric Dumazet <edumazet@google.com>
    Fixes: 419ce133ab92 ("tcp: allow again tcp_disconnect() when threads are waiting")
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/8c82ecf71662ecbc47bf390f9905de70884c9f2d.1731060874.git.pabeni@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: hold pm lock when deleting entry [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Nov 18 19:27:22 2024 +0100

    mptcp: hold pm lock when deleting entry
    
    commit f642c5c4d528d11bd78b6c6f84f541cd3c0bea86 upstream.
    
    When traversing userspace_pm_local_addr_list and deleting an entry from
    it in mptcp_pm_nl_remove_doit(), msk->pm.lock should be held.
    
    This patch holds this lock before mptcp_userspace_pm_lookup_addr_by_id()
    and releases it after list_move() in mptcp_pm_nl_remove_doit().
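
    The locking rule can be sketched in userspace as follows. This is an
    illustration under assumptions: the lock is modelled as a plain flag
    with lockdep-style assertions, and all type and function names are
    hypothetical rather than the MPTCP ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: a flag plus assertions, standing in for a spinlock. */
struct pm_lock { bool held; };

static void pm_lock_acquire(struct pm_lock *l) { assert(!l->held); l->held = true; }
static void pm_lock_release(struct pm_lock *l) { assert(l->held); l->held = false; }

struct entry {
	unsigned char id;
	struct entry *next;
};

struct pm {
	struct pm_lock lock;
	struct entry *local_addrs;	/* list protected by lock */
};

/* Must be called with pm->lock held, like a lockdep-annotated helper. */
static struct entry **lookup_by_id(struct pm *pm, unsigned char id)
{
	assert(pm->lock.held);
	for (struct entry **pp = &pm->local_addrs; *pp; pp = &(*pp)->next)
		if ((*pp)->id == id)
			return pp;
	return NULL;
}

/* Lookup and unlink happen inside one critical section. */
static struct entry *remove_by_id(struct pm *pm, unsigned char id)
{
	struct entry *found = NULL;
	struct entry **pp;

	pm_lock_acquire(&pm->lock);
	pp = lookup_by_id(pm, id);
	if (pp) {
		found = *pp;
		*pp = found->next;	/* unlink while still locked */
		found->next = NULL;
	}
	pm_lock_release(&pm->lock);
	return found;			/* caller disposes of it afterwards */
}
```

    Keeping the lookup and the unlink in the same critical section means
    no other context can modify the list between the two steps.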
    
    Fixes: d9a4594edabf ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE")
    Cc: stable@vger.kernel.org
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-2-b835580cefa8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: pm: use _rcu variant under rcu_read_lock [+ + +]

Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Mon Nov 18 19:27:24 2024 +0100

    mptcp: pm: use _rcu variant under rcu_read_lock
    
    commit db3eab8110bc0520416101b6a5b52f44a43fb4cf upstream.
    
    In mptcp_pm_create_subflow_or_signal_addr(), rcu_read_(un)lock() are
    used as expected to iterate over the list of local addresses, but
    list_for_each_entry() was used instead of list_for_each_entry_rcu() in
    __lookup_addr(). It is important to use this variant which adds the
    required READ_ONCE() (and diagnostic checks if enabled).
    
    Because __lookup_addr() is also used in mptcp_pm_nl_set_flags() where it
    is called under the pernet->lock and not rcu_read_lock(), an extra
    condition is then passed to help the diagnostic checks making sure
    either the associated spin lock or the RCU lock is held.
    
    Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk")
    Cc: stable@vger.kernel.org
    Reviewed-by: Geliang Tang <geliang@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-3-b835580cefa8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: update local address flags when setting it [+ + +]

Author: Geliang Tang <geliang@kernel.org>
Date:   Mon Nov 18 19:27:21 2024 +0100

    mptcp: update local address flags when setting it
    
    commit e0266319413d5d687ba7b6df7ca99e4b9724a4f2 upstream.
    
    Just like in-kernel pm, when userspace pm does set_flags, it needs to send
    out MP_PRIO signal, and also modify the flags of the corresponding address
    entry in the local address list. This patch implements the missing logic.
    
    Traverse all address entries on userspace_pm_local_addr_list to find the
    local address entry. If bkup is true, set FLAG_BACKUP in the flags of
    this entry; otherwise, clear FLAG_BACKUP.
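
    A small sketch of that traversal, assuming an array in place of the
    kernel list and a 32-bit value in place of a full address. The
    struct, the function name and the bit value of FLAG_BACKUP are
    illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_BACKUP 0x4u	/* illustrative bit value, an assumption */

struct local_addr {
	uint32_t addr;		/* simplified address key */
	uint8_t flags;
};

/* Returns true if a matching entry was found and its flags updated. */
static bool set_backup(struct local_addr *list, size_t n,
		       uint32_t addr, bool bkup)
{
	for (size_t i = 0; i < n; i++) {
		if (list[i].addr != addr)
			continue;
		if (bkup)
			list[i].flags |= FLAG_BACKUP;
		else
			list[i].flags &= (uint8_t)~FLAG_BACKUP;
		return true;
	}
	return false;
}
```

    Other flag bits on the entry are left untouched in both directions.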
    
    Fixes: 892f396c8e68 ("mptcp: netlink: issue MP_PRIO signals from userspace PMs")
    Cc: stable@vger.kernel.org
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-1-b835580cefa8@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    [ Conflicts in pm_userspace.c, because commit 6a42477fe449 ("mptcp:
      update set_flags interfaces"), is not in this version, and causes too
      many conflicts when backporting it. The same code can still be added
      at the same place, before sending the ACK. ]
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx5: fs, lock FTE when checking if active [+ + +]

Author: Mark Bloch <mbloch@nvidia.com>
Date:   Thu Nov 7 20:35:23 2024 +0200

    net/mlx5: fs, lock FTE when checking if active
    
    [ Upstream commit 9ca314419930f9135727e39d77e66262d5f7bef6 ]
    
    The referenced commits introduced a two-step process for deleting FTEs:
    
    - Lock the FTE, delete it from hardware, set the hardware deletion function
      to NULL and unlock the FTE.
    - Lock the parent flow group, delete the software copy of the FTE, and
      remove it from the xarray.
    
    However, this approach encounters a race condition if a rule with the same
    match value is added simultaneously. In this scenario, fs_core may set the
    hardware deletion function to NULL prematurely, causing a panic during
    subsequent rule deletions.
    
    To prevent this, ensure the active flag of the FTE is checked under a lock,
    which will prevent the fs_core layer from attaching a new steering rule to
    an FTE that is in the process of deletion.
    
    [  438.967589] MOSHE: 2496 mlx5_del_flow_rules del_hw_func
    [  438.968205] ------------[ cut here ]------------
    [  438.968654] refcount_t: decrement hit 0; leaking memory.
    [  438.969249] WARNING: CPU: 0 PID: 8957 at lib/refcount.c:31 refcount_warn_saturate+0xfb/0x110
    [  438.970054] Modules linked in: act_mirred cls_flower act_gact sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core zram zsmalloc fuse [last unloaded: cls_flower]
    [  438.973288] CPU: 0 UID: 0 PID: 8957 Comm: tc Not tainted 6.12.0-rc1+ #8
    [  438.973888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    [  438.974874] RIP: 0010:refcount_warn_saturate+0xfb/0x110
    [  438.975363] Code: 40 66 3b 82 c6 05 16 e9 4d 01 01 e8 1f 7c a0 ff 0f 0b c3 cc cc cc cc 48 c7 c7 10 66 3b 82 c6 05 fd e8 4d 01 01 e8 05 7c a0 ff <0f> 0b c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90
    [  438.976947] RSP: 0018:ffff888124a53610 EFLAGS: 00010286
    [  438.977446] RAX: 0000000000000000 RBX: ffff888119d56de0 RCX: 0000000000000000
    [  438.978090] RDX: ffff88852c828700 RSI: ffff88852c81b3c0 RDI: ffff88852c81b3c0
    [  438.978721] RBP: ffff888120fa0e88 R08: 0000000000000000 R09: ffff888124a534b0
    [  438.979353] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888119d56de0
    [  438.979979] R13: ffff888120fa0ec0 R14: ffff888120fa0ee8 R15: ffff888119d56de0
    [  438.980607] FS:  00007fe6dcc0f800(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
    [  438.983984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  438.984544] CR2: 00000000004275e0 CR3: 0000000186982001 CR4: 0000000000372eb0
    [  438.985205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  438.985842] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  438.986507] Call Trace:
    [  438.986799]  <TASK>
    [  438.987070]  ? __warn+0x7d/0x110
    [  438.987426]  ? refcount_warn_saturate+0xfb/0x110
    [  438.987877]  ? report_bug+0x17d/0x190
    [  438.988261]  ? prb_read_valid+0x17/0x20
    [  438.988659]  ? handle_bug+0x53/0x90
    [  438.989054]  ? exc_invalid_op+0x14/0x70
    [  438.989458]  ? asm_exc_invalid_op+0x16/0x20
    [  438.989883]  ? refcount_warn_saturate+0xfb/0x110
    [  438.990348]  mlx5_del_flow_rules+0x2f7/0x340 [mlx5_core]
    [  438.990932]  __mlx5_eswitch_del_rule+0x49/0x170 [mlx5_core]
    [  438.991519]  ? mlx5_lag_is_sriov+0x3c/0x50 [mlx5_core]
    [  438.992054]  ? xas_load+0x9/0xb0
    [  438.992407]  mlx5e_tc_rule_unoffload+0x45/0xe0 [mlx5_core]
    [  438.993037]  mlx5e_tc_del_fdb_flow+0x2a6/0x2e0 [mlx5_core]
    [  438.993623]  mlx5e_flow_put+0x29/0x60 [mlx5_core]
    [  438.994161]  mlx5e_delete_flower+0x261/0x390 [mlx5_core]
    [  438.994728]  tc_setup_cb_destroy+0xb9/0x190
    [  438.995150]  fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
    [  438.995650]  fl_change+0x11a4/0x13c0 [cls_flower]
    [  438.996105]  tc_new_tfilter+0x347/0xbc0
    [  438.996503]  ? ___slab_alloc+0x70/0x8c0
    [  438.996929]  rtnetlink_rcv_msg+0xf9/0x3e0
    [  438.997339]  ? __netlink_sendskb+0x4c/0x70
    [  438.997751]  ? netlink_unicast+0x286/0x2d0
    [  438.998171]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
    [  438.998625]  netlink_rcv_skb+0x54/0x100
    [  438.999020]  netlink_unicast+0x203/0x2d0
    [  438.999421]  netlink_sendmsg+0x1e4/0x420
    [  438.999820]  __sock_sendmsg+0xa1/0xb0
    [  439.000203]  ____sys_sendmsg+0x207/0x2a0
    [  439.000600]  ? copy_msghdr_from_user+0x6d/0xa0
    [  439.001072]  ___sys_sendmsg+0x80/0xc0
    [  439.001459]  ? ___sys_recvmsg+0x8b/0xc0
    [  439.001848]  ? generic_update_time+0x4d/0x60
    [  439.002282]  __sys_sendmsg+0x51/0x90
    [  439.002658]  do_syscall_64+0x50/0x110
    [  439.003040]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
    Fixes: cefc23554fc2 ("net/mlx5: Fix FTE cleanup")
    Signed-off-by: Mark Bloch <mbloch@nvidia.com>
    Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://patch.msgid.link/20241107183527.676877-4-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: clear xdp features on non-uplink representors [+ + +]

Author: William Tu <witu@nvidia.com>
Date:   Thu Nov 7 20:35:25 2024 +0200

    net/mlx5e: clear xdp features on non-uplink representors
    
    [ Upstream commit c079389878debf767dc4e52fe877b9117258dfe2 ]
    
    Non-uplink representor ports do not support XDP. The patch clears
    the XDP features when net_device_ops.ndo_bpf is not set.
    
    Verify using the netlink tool:
    $ tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump dev-get
    
    Representor netdev before the patch:
    {'ifindex': 8,
      'xdp-features': {'basic',
                       'ndo-xmit',
                       'ndo-xmit-sg',
                       'redirect',
                       'rx-sg',
                       'xsk-zerocopy'},
      'xdp-rx-metadata-features': set(),
      'xdp-zc-max-segs': 1,
      'xsk-features': set()},
    With the patch:
     {'ifindex': 8,
      'xdp-features': set(),
      'xdp-rx-metadata-features': set(),
      'xsk-features': set()},
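
    The check can be pictured with this userspace sketch. The structs,
    the flag bits and the function names are hypothetical simplifications
    of the netdev structures, shown only to illustrate the ndo_bpf test.

```c
#include <stddef.h>
#include <stdint.h>

#define XDP_F_BASIC    (1u << 0)	/* illustrative bit values */
#define XDP_F_REDIRECT (1u << 1)

struct dev_ops {
	int (*ndo_bpf)(void *dev, void *cmd);	/* NULL when XDP is unsupported */
};

struct netdev {
	const struct dev_ops *ops;
	uint32_t xdp_features;
};

static void set_xdp_features(struct netdev *dev, uint32_t wanted)
{
	/* Without an ndo_bpf callback no XDP program can be attached,
	 * so no XDP feature may be advertised. */
	if (!dev->ops || !dev->ops->ndo_bpf) {
		dev->xdp_features = 0;
		return;
	}
	dev->xdp_features = wanted;
}

/* Placeholder callback used to model a device that supports XDP. */
static int dummy_bpf(void *dev, void *cmd)
{
	(void)dev;
	(void)cmd;
	return 0;
}
```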
    
    Fixes: 4d5ab0ad964d ("net/mlx5e: take into account device reconfiguration for xdp_features flag")
    Signed-off-by: William Tu <witu@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://patch.msgid.link/20241107183527.676877-6-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: CT: Fix null-ptr-deref in add rule err flow [+ + +]

Author: Moshe Shemesh <moshe@nvidia.com>
Date:   Thu Nov 7 20:35:26 2024 +0200

    net/mlx5e: CT: Fix null-ptr-deref in add rule err flow
    
    [ Upstream commit e99c6873229fe0482e7ceb7d5600e32d623ed9d9 ]
    
    In the error flow of mlx5_tc_ct_entry_add_rule(), in case the ct_rule_add()
    callback returns an error, zone_rule->attr is used uninitialized. Fix it to
    use attr, which has the needed pointer value.
    
    Kernel log:
     BUG: kernel NULL pointer dereference, address: 0000000000000110
     RIP: 0010:mlx5_tc_ct_entry_add_rule+0x2b1/0x2f0 [mlx5_core]
    …
     Call Trace:
      <TASK>
      ? __die+0x20/0x70
      ? page_fault_oops+0x150/0x3e0
      ? exc_page_fault+0x74/0x140
      ? asm_exc_page_fault+0x22/0x30
      ? mlx5_tc_ct_entry_add_rule+0x2b1/0x2f0 [mlx5_core]
      ? mlx5_tc_ct_entry_add_rule+0x1d5/0x2f0 [mlx5_core]
      mlx5_tc_ct_block_flow_offload+0xc6a/0xf90 [mlx5_core]
      ? nf_flow_offload_tuple+0xd8/0x190 [nf_flow_table]
      nf_flow_offload_tuple+0xd8/0x190 [nf_flow_table]
      flow_offload_work_handler+0x142/0x320 [nf_flow_table]
      ? finish_task_switch.isra.0+0x15b/0x2b0
      process_one_work+0x16c/0x320
      worker_thread+0x28c/0x3a0
      ? __pfx_worker_thread+0x10/0x10
      kthread+0xb8/0xf0
      ? __pfx_kthread+0x10/0x10
      ret_from_fork+0x2d/0x50
      ? __pfx_kthread+0x10/0x10
      ret_from_fork_asm+0x1a/0x30
      </TASK>
    
    Fixes: 7fac5c2eced3 ("net/mlx5: CT: Avoid reusing modify header context for natted entries")
    Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
    Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
    Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://patch.msgid.link/20241107183527.676877-7-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlx5e: kTLS, Fix incorrect page refcounting [+ + +]

Author: Dragos Tatulea <dtatulea@nvidia.com>
Date:   Thu Nov 7 20:35:24 2024 +0200

    net/mlx5e: kTLS, Fix incorrect page refcounting
    
    [ Upstream commit dd6e972cc5890d91d6749bb48e3912721c4e4b25 ]
    
    The kTLS tx handling code is using a mix of get_page() and
    page_ref_inc() APIs to increment the page reference. But on the release
    path (mlx5e_ktls_tx_handle_resync_dump_comp()), only put_page() is used.
    
    This is an issue when using pages from large folios: the get_page()
    references are stored on the folio page while the page_ref_inc()
    references are stored directly in the given page. On release the folio
    page will have its reference count dropped too many times.
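
    A toy model of the imbalance, assuming a two-field page struct. The
    model_* helpers are hypothetical and only mimic where each kernel API
    accounts its reference: get_page() and put_page() on the head (folio)
    page, page_ref_inc() on the page it is given.

```c
#include <stddef.h>

struct page {
	int refcount;
	struct page *head;	/* compound head; the head points to itself */
};

/* Reference is accounted on the folio's head page. */
static void model_get_page(struct page *p)
{
	p->head->refcount++;
}

/* Reference is accounted on the given page itself. */
static void model_page_ref_inc(struct page *p)
{
	p->refcount++;
}

/* Release always drops a reference from the head page. */
static void model_put_page(struct page *p)
{
	p->head->refcount--;
}
```

    Taking one reference each way on a tail page and then releasing both
    with the put helper leaves the head one reference short while the
    tail page keeps a stray one.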
    
    This was found while doing kTLS testing with sendfile() + ZC when the
    served file was read from NFS on a kernel with NFS large folios support
    (commit 49b29a573da8 ("nfs: add support for large folios")).
    
    Fixes: 84d1bb2b139e ("net/mlx5e: kTLS, Limit DUMP wqe size")
    Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Link: https://patch.msgid.link/20241107183527.676877-5-tariqt@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/sched: cls_u32: replace int refcounts with proper refcounts [+ + +]

Author: Pedro Tammela <pctammela@mojatatu.com>
Date:   Tue Nov 14 11:18:55 2023 -0300

    net/sched: cls_u32: replace int refcounts with proper refcounts
    
    [ Upstream commit 6b78debe1c07e6aa3c91ca0b1384bf3cb8217c50 ]
    
    Proper refcounts always emit a warning splat when something goes wrong,
    be it underflow, saturation or object resurrection. As these are always
    a source of bugs, use them in cls_u32 as a safeguard to prevent/catch issues.
    Another benefit is that the refcount API self documents the code, making
    clear when transitions to dead are expected.
    
    For such an update we had to make minor adaptations on u32 to fit the refcount
    API. First we set explicitly to '1' when objects are created, then the
    objects are alive until a 1 -> 0 happens, which is then released appropriately.
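
    The lifetime rule can be sketched with C11 atomics. This is not the
    kernel's refcount_t (which additionally saturates and warns); the
    struct and helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct hnode {
	atomic_int refcnt;
	bool released;
};

/* Objects start life with exactly one reference. */
static void hnode_init(struct hnode *h)
{
	atomic_init(&h->refcnt, 1);
	h->released = false;
}

static void hnode_get(struct hnode *h)
{
	atomic_fetch_add(&h->refcnt, 1);
}

/* Returns true only for the caller that performs the 1 -> 0 transition,
 * which is the single point where the object is released. */
static bool hnode_put(struct hnode *h)
{
	if (atomic_fetch_sub(&h->refcnt, 1) == 1) {
		h->released = true;
		return true;
	}
	return false;
}
```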
    
    The above made clear some redundant operations in the u32 code
    around the root_ht handling that were removed. The root_ht is created
    with a refcnt set to 1. Then when it's associated with tcf_proto it
    increments the refcnt to 2. Throughout the entire code the root_ht is an
    exceptional case and can never be referenced, therefore the refcnt is
    never incremented/decremented.
    Its lifetime is always bound to tcf_proto, meaning if you delete tcf_proto
    the root_ht is deleted as well. The code made up for the fact that root_ht refcnt is 2 and did
    a double decrement to free it, which is not a fit for the refcount API.
    
    Even though refcount_t is implemented using atomics, we should observe
    a negligible control plane impact.
    
    Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Link: https://lore.kernel.org/r/20231114141856.974326-2-pctammela@mojatatu.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 73af53d82076 ("net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Make copy_safe_from_sockptr() match documentation [+ + +]

Author: Michal Luczaj <mhal@rbox.co>
Date:   Mon Nov 11 00:17:34 2024 +0100

    net: Make copy_safe_from_sockptr() match documentation
    
    [ Upstream commit eb94b7bb10109a14a5431a67e5d8e31cfa06b395 ]
    
    copy_safe_from_sockptr()
      return copy_from_sockptr()
        return copy_from_sockptr_offset()
          return copy_from_user()
    
    copy_from_user() does not return an error on fault. Instead, it returns a
    number of bytes that were not copied. Have it handled.
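
    A userspace sketch of the return-value handling. The model_* names
    and the 'readable' parameter (which simulates a fault part-way
    through the source buffer) are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied.
 * 'readable' simulates how much of the source can be read before a fault. */
static size_t model_copy_from_user(void *dst, const void *src,
				   size_t n, size_t readable)
{
	size_t ok = n < readable ? n : readable;

	memcpy(dst, src, ok);
	return n - ok;
}

/* Safe wrapper: reject short input, map any shortfall to -EFAULT. */
static int model_copy_safe(void *dst, size_t ksize, const void *src,
			   size_t optlen, size_t readable)
{
	if (optlen < ksize)
		return -EINVAL;
	if (model_copy_from_user(dst, src, ksize, readable))
		return -EFAULT;
	return 0;
}
```

    Returning the raw "bytes not copied" count would hand callers a
    positive number that they could mistake for success or a length.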
    
    Patch has a side effect: it un-breaks garbage input handling of
    nfc_llcp_setsockopt() and mISDN's data_sock_setsockopt().
    
    Fixes: 6309863b31dd ("net: add copy_safe_from_sockptr() helper")
    Signed-off-by: Michal Luczaj <mhal@rbox.co>
    Link: https://patch.msgid.link/20241111-sockptr-copy-ret-fix-v1-1-a520083a93fb@rbox.co
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes. [+ + +]

Author: Alexandre Ferrieux <alexandre.ferrieux@gmail.com>
Date:   Sun Nov 10 18:28:36 2024 +0100

    net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
    
    [ Upstream commit 73af53d82076bbe184d9ece9e14b0dc8599e6055 ]
    
    To generate hnode handles (in gen_new_htid()), u32 uses IDR and
    encodes the returned small integer into a structured 32-bit
    word. Unfortunately, at disposal time, the needed decoding
    is not done. As a result, idr_remove() fails, and the IDR
    fills up. Since its size is 2048, the following script ends up
    with "Filter already exists":
    
      tc filter add dev myve $FILTER1
      tc filter add dev myve $FILTER2
      for i in {1..2048}
      do
        echo $i
        tc filter del dev myve $FILTER2
        tc filter add dev myve $FILTER2
      done
    
    This patch adds the missing decoding logic for handles that
    deserve it.
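
    The encode/decode pair might look like the sketch below. The bit
    layout (an 11-bit id placed at bit 20, with a marker bit on top) is
    an assumption inferred from the description and from the IDR size;
    treat the constants and helper names as illustrative.

```c
#include <stdint.h>

/* Encode a small IDR id (1..0x7FF) into a structured 32-bit handle. */
static uint32_t id2handle(uint32_t id)
{
	return (id | 0x800u) << 20;
}

/* Decode: only handles with the top bit set carry an encoded id;
 * anything else is passed through unchanged. */
static uint32_t handle2id(uint32_t h)
{
	return (h & 0x80000000u) ? ((h >> 20) & 0x7FFu) : h;
}
```

    Removing an entry with the decoded id rather than the raw handle is
    what lets the IDR slot be reused.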
    
    Fixes: e7614370d6f0 ("net_sched: use idr to allocate u32 filter handles")
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
    Tested-by: Victor Nogueira <victor@mojatatu.com>
    Link: https://patch.msgid.link/20241110172836.331319-1-alexandre.ferrieux@orange.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: dwmac-intel-plat: use devm_stmmac_probe_config_dt() [+ + +]

Author: Jisheng Zhang <jszhang@kernel.org>
Date:   Sat Sep 16 15:58:13 2023 +0800

    net: stmmac: dwmac-intel-plat: use devm_stmmac_probe_config_dt()
    
    [ Upstream commit abea8fd5e801a679312479b2bf00d7b4285eca78 ]
    
    Simplify the driver's probe() function by using the devres
    variant of stmmac_probe_config_dt().
    
    The calling of stmmac_pltfr_remove() now needs to be switched to
    stmmac_pltfr_remove_no_dt().
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol [+ + +]

Author: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Date:   Sat Nov 9 10:16:32 2024 -0500

    net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol
    
    [ Upstream commit a03b18a71c128846360cc81ac6fdb0e7d41597b4 ]
    
    The mediatek,mac-wol property is being handled backwards to what is
    described in the binding: it currently enables PHY WOL when the property
    is present and vice versa. Invert the driver logic so it matches the
    binding description.
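
    The corrected polarity, as a sketch. The flag name and the helper are
    hypothetical; the boolean stands for "the mediatek,mac-wol property
    is present in the device tree node".

```c
#include <stdbool.h>

#define FLAG_USE_PHY_WOL (1u << 0)	/* hypothetical flag bit */

static unsigned int apply_mac_wol(unsigned int flags, bool mac_wol_present)
{
	if (mac_wol_present)
		flags &= ~FLAG_USE_PHY_WOL;	/* property present: MAC WOL */
	else
		flags |= FLAG_USE_PHY_WOL;	/* property absent: PHY WOL */
	return flags;
}
```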
    
    Fixes: fd1d62d80ebc ("net: stmmac: replace the use_phy_wol field with a flag")
    Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
    Link: https://patch.msgid.link/20241109-mediatek-mac-wol-noninverted-v2-1-0e264e213878@collabora.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: dwmac-visconti: use devm_stmmac_probe_config_dt() [+ + +]

Author: Jisheng Zhang <jszhang@kernel.org>
Date:   Sat Sep 16 15:58:27 2023 +0800

    net: stmmac: dwmac-visconti: use devm_stmmac_probe_config_dt()
    
    [ Upstream commit d336a117b593e96559c309bb250f06b4fc22998f ]
    
    Simplify the driver's probe() function by using the devres
    variant of stmmac_probe_config_dt().
    
    The calling of stmmac_pltfr_remove() now needs to be switched to
    stmmac_pltfr_remove_no_dt().
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: rename stmmac_pltfr_remove_no_dt to stmmac_pltfr_remove [+ + +]

Author: Jisheng Zhang <jszhang@kernel.org>
Date:   Sat Sep 16 15:58:28 2023 +0800

    net: stmmac: rename stmmac_pltfr_remove_no_dt to stmmac_pltfr_remove
    
    [ Upstream commit 2c9fc838067b02cb3e6057fef5cd7cf1c04a95aa ]
    
    Now that all users of the old stmmac_pltfr_remove() are converted to the
    devres helper, it's time to rename stmmac_pltfr_remove_no_dt() back to
    stmmac_pltfr_remove() and remove the old stmmac_pltfr_remove().
    
    Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ti: icssg-prueth: Fix 1 PPS sync [+ + +]

Author: Meghana Malladi <m-malladi@ti.com>
Date:   Mon Nov 11 15:28:42 2024 +0530

    net: ti: icssg-prueth: Fix 1 PPS sync
    
    [ Upstream commit dc065076ee7768377d7c16af7d1b0767782d8c98 ]
    
    The first PPS latch time needs to be calculated by the driver
    (in rounded off seconds) and configured as the start time
    offset for the cycle. After synchronizing two PTP clocks
    running as master/slave, missing this would cause master
    and slave to start immediately with some milliseconds
    drift which causes the PPS signal to never synchronize with
    the PTP master.
    
    Fixes: 186734c15886 ("net: ti: icssg-prueth: add packet timestamping and ptp support")
    Signed-off-by: Meghana Malladi <m-malladi@ti.com>
    Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
    Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
    Link: https://patch.msgid.link/20241111095842.478833-1-m-malladi@ti.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: vertexcom: mse102x: Fix tx_bytes calculation [+ + +]

Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Fri Nov 8 12:43:43 2024 +0100

    net: vertexcom: mse102x: Fix tx_bytes calculation
    
    [ Upstream commit e68da664d379f352d41d7955712c44e0a738e4ab ]
    
    The tx_bytes should consider the actual size of the Ethernet frames
    without the SPI encapsulation. But we still need to take care of
    Ethernet padding.
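
    One way to express that accounting, as a sketch: count the Ethernet
    frame length only, but never less than the minimum frame size. The
    helper name is hypothetical and ETH_ZLEN is redefined here so the
    example stands alone.

```c
#include <stddef.h>

#define ETH_ZLEN 60	/* minimum Ethernet frame length, without FCS */

/* Bytes to add to tx_bytes for one frame: the frame itself, padded up
 * to the Ethernet minimum; SPI framing bytes are deliberately excluded. */
static size_t tx_bytes_for_frame(size_t frame_len)
{
	return frame_len < ETH_ZLEN ? ETH_ZLEN : frame_len;
}
```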
    
    Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://patch.msgid.link/20241108114343.6174-3-wahrenst@gmx.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netlink: terminate outstanding dump on socket close [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue Nov 5 17:52:34 2024 -0800

    netlink: terminate outstanding dump on socket close
    
    [ Upstream commit 1904fb9ebf911441f90a68e96b22aa73e4410505 ]
    
    Netlink supports iterative dumping of data. It provides the families
    the following ops:
     - start - (optional) kicks off the dumping process
     - dump  - actual dump helper, keeps getting called until it returns 0
     - done  - (optional) pairs with .start, can be used for cleanup
    The whole process is asynchronous and the repeated calls to .dump
    don't actually happen in a tight loop, but rather are triggered
    in response to recvmsg() on the socket.
    
    This gives the user full control over the dump, but also means that
    the user can close the socket without getting to the end of the dump.
    To make sure .start is always paired with .done we check if there
    is an ongoing dump before freeing the socket, and if so call .done.
    
    The complication is that sockets can get freed from BH and .done
    is allowed to sleep. So we use a workqueue to defer the call, when
    needed.
    
    Unfortunately this does not work correctly. What we defer is not
    the cleanup but rather releasing a reference on the socket.
    We have no guarantee that we own the last reference; if someone
    else holds the socket they may release it in BH and we're back
    to square one.
    
    The whole dance, however, appears to be unnecessary. Only the user
    can interact with dumps, so we can clean up when socket is closed.
    And close always happens in process context. Some async code may
    still access the socket after close, queue notification skbs to it etc.
    but no dumps can start, end or otherwise make progress.
    
    Delete the workqueue and flush the dump state directly from the release
    handler. Note that further cleanup is possible in -next, for instance
    we now always call .done before releasing the main module reference,
    so dump doesn't have to take a reference of its own.
    
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Fixes: ed5d7788a934 ("netlink: Do not schedule work from sk_destruct")
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20241106015235.2458807-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
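
The dump lifecycle described in the netlink commit above can be modeled in user space. The following is a minimal sketch under stated assumptions, not the kernel code: every name in it (dump_ops, model_sock, model_recvmsg, model_release, and the demo_* callbacks) is hypothetical. It only illustrates the pairing rule from the commit text: .done runs exactly once for a started dump, either when .dump returns 0 or when the socket is closed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct dump_ops {
    int  (*start)(void *ctx);  /* optional: kicks off the dump */
    int  (*dump)(void *ctx);   /* called repeatedly until it returns 0 */
    void (*done)(void *ctx);   /* optional: pairs with start, cleanup */
};

struct model_sock {
    const struct dump_ops *ops;
    void *ctx;
    bool dump_running;
};

/* Begin a dump: call the optional start op and mark the dump as active. */
static int model_dump_start(struct model_sock *sk,
                            const struct dump_ops *ops, void *ctx)
{
    assert(!sk->dump_running);
    if (ops->start) {
        int err = ops->start(ctx);

        if (err)
            return err;
    }
    sk->ops = ops;
    sk->ctx = ctx;
    sk->dump_running = true;
    return 0;
}

/* Call done exactly once for a dump that was started. */
static void model_dump_finish(struct model_sock *sk)
{
    if (!sk->dump_running)
        return;
    if (sk->ops->done)
        sk->ops->done(sk->ctx);
    sk->dump_running = false;
}

/* One recvmsg()-triggered step; finishes the dump when dump returns 0. */
static int model_recvmsg(struct model_sock *sk)
{
    int ret;

    if (!sk->dump_running)
        return 0;
    ret = sk->ops->dump(sk->ctx);
    if (ret == 0)
        model_dump_finish(sk);
    return ret;
}

/* Close path: modeled as process context, so cleanup happens directly. */
static void model_release(struct model_sock *sk)
{
    model_dump_finish(sk);
}

/* Demo callbacks that count how often start and done run. */
struct demo_ctx { int remaining, starts, dones; };

static int demo_start(void *p) { ((struct demo_ctx *)p)->starts++; return 0; }

static int demo_dump(void *p)
{
    struct demo_ctx *c = p;

    if (c->remaining > 0)
        c->remaining--;
    return c->remaining;  /* 0 means the dump is complete */
}

static void demo_done(void *p) { ((struct demo_ctx *)p)->dones++; }

static const struct dump_ops demo_ops = { demo_start, demo_dump, demo_done };
```

Closing the model socket halfway through a dump still invokes demo_done once, and a second model_release() is a no-op because the running flag has already been cleared.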

NFSD: Async COPY result needs to return a write verifier [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 18 16:14:10 2024 -0500

    NFSD: Async COPY result needs to return a write verifier
    
    [ Upstream commit 9ed666eba4e0a2bb8ffaa3739d830b64d4f2aaad ]
    
    Currently, when NFSD handles an asynchronous COPY, it returns a
    zero write verifier, relying on the subsequent CB_OFFLOAD callback
    to pass the write verifier and a stable_how4 value to the client.
    
    However, if the CB_OFFLOAD never arrives at the client (for example,
    if a network partition occurs just as the server sends the
    CB_OFFLOAD operation), the client will never receive this verifier.
    Thus, if the client sends a follow-up COMMIT, there is no way for
    the client to assess the COMMIT result.
    
    The usual recovery for a missing CB_OFFLOAD is for the client to
    send an OFFLOAD_STATUS operation, but that operation does not carry
    a write verifier in its result. Neither does it carry a stable_how4
    value, so the client /must/ send a COMMIT in this case -- which will
    always fail because currently there's still no write verifier in the
    COPY result.
    
    Thus the server needs to return a normal write verifier in its COPY
    result even if the COPY operation is to be performed asynchronously.
    
    If the server recognizes the callback stateid in subsequent
    OFFLOAD_STATUS operations, then obviously it has not restarted, and
    the write verifier the client received in the COPY result is still
    valid and can be used to assess a COMMIT of the copied data, if one
    is needed.
    
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    [ cel: adjusted to apply to origin/linux-6.6.y ]
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSD: initialize copy->cp_clp early in nfsd4_copy for use by trace point [+ + +]

Author: Dai Ngo <dai.ngo@oracle.com>
Date:   Mon Nov 18 16:14:09 2024 -0500

    NFSD: initialize copy->cp_clp early in nfsd4_copy for use by trace point
    
    [ Upstream commit 15d1975b7279693d6f09398e0e2e31aca2310275 ]
    
    Prepare for adding server copy trace points.
    
    Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
    Tested-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
    Stable-dep-of: 9ed666eba4e0 ("NFSD: Async COPY result needs to return a write verifier")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSD: Initialize struct nfsd4_copy earlier [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 18 16:14:12 2024 -0500

    NFSD: Initialize struct nfsd4_copy earlier
    
    [ Upstream commit 63fab04cbd0f96191b6e5beedc3b643b01c15889 ]
    
    Ensure the refcount and async_copies fields are initialized early.
    cleanup_async_copy() will reference these fields if an error occurs
    in nfsd4_copy(). If they are not correctly initialized, at the very
    least, a refcount underflow occurs.
    
    Reported-by: Olga Kornievskaia <okorniev@redhat.com>
    Fixes: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Tested-by: Olga Kornievskaia <okorniev@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSD: Limit the number of concurrent async COPY operations [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 18 16:14:11 2024 -0500

    NFSD: Limit the number of concurrent async COPY operations
    
    [ Upstream commit aadc3bbea163b6caaaebfdd2b6c4667fbc726752 ]
    
    Nothing appears to limit the number of concurrent async COPY
    operations that clients can start. In addition, AFAICT each async
    COPY can copy an unlimited number of 4MB chunks, so can run for a
    long time. Thus IMO async COPY can become a DoS vector.
    
    Add a restriction mechanism that bounds the number of concurrent
    background COPY operations. Start simple and try to be fair -- this
    patch implements a per-namespace limit.
    
    An async COPY request that occurs while this limit is exceeded gets
    NFS4ERR_DELAY. The requesting client can choose to send the request
    again after a delay or fall back to a traditional read/write style
    copy.
    
    If there is a need to make the mechanism more sophisticated, we can
    revisit that in future patches.
    
    Cc: stable@vger.kernel.org
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Link: https://nvd.nist.gov/vuln/detail/CVE-2024-49974
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
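
The restriction mechanism described above amounts to a bounded counter. The sketch below is a user-space illustration under stated assumptions, not the NFSD implementation: model_ns, model_copy_begin, model_copy_cleanup, the MODEL_DELAY status (standing in for NFS4ERR_DELAY) and the way the limit is chosen are all hypothetical. It shows a request being refused once the limit is reached, and a single decrement on cleanup.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; MODEL_DELAY plays the role of NFS4ERR_DELAY. */
enum model_status { MODEL_OK, MODEL_DELAY };

struct model_ns {
    atomic_int pending_async_copies;  /* copies currently in flight */
    int max_async_copies;             /* assumed per-namespace limit */
};

static void model_ns_init(struct model_ns *nn, int max)
{
    atomic_init(&nn->pending_async_copies, 0);
    nn->max_async_copies = max;
}

/* Take a slot optimistically, then back out if the limit was reached. */
static enum model_status model_copy_begin(struct model_ns *nn)
{
    if (atomic_fetch_add(&nn->pending_async_copies, 1) >=
        nn->max_async_copies) {
        atomic_fetch_sub(&nn->pending_async_copies, 1);
        return MODEL_DELAY;
    }
    return MODEL_OK;
}

/* Release a slot. Must run exactly once per successful begin. */
static void model_copy_cleanup(struct model_ns *nn)
{
    int prev = atomic_fetch_sub(&nn->pending_async_copies, 1);

    /* A second decrement for the same copy would trip this check. */
    assert(prev > 0);
    (void)prev;
}
```

Keeping the decrement in one cleanup function means an error path that already calls the cleanup must not decrement the counter a second time.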

NFSD: Never decrement pending_async_copies on error [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Mon Nov 18 16:14:13 2024 -0500

    NFSD: Never decrement pending_async_copies on error
    
    [ Upstream commit 8286f8b622990194207df9ab852e0f87c60d35e9 ]
    
    The error flow in nfsd4_copy() calls cleanup_async_copy(), which
    already decrements nn->pending_async_copies.
    
    Reported-by: Olga Kornievskaia <okorniev@redhat.com>
    Fixes: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint [+ + +]

Author: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Date:   Thu Nov 7 01:07:33 2024 +0900

    nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint
    
    commit 2026559a6c4ce34db117d2db8f710fe2a9420d5a upstream.
    
    When using the "block:block_dirty_buffer" tracepoint, mark_buffer_dirty()
    may cause a NULL pointer dereference, or a general protection fault when
    KASAN is enabled.
    
    This happens because, since the tracepoint was added in
    mark_buffer_dirty(), it references the dev_t member bh->b_bdev->bd_dev
    regardless of whether the buffer head has a pointer to a block_device
    structure.
    
    In the current implementation, nilfs_grab_buffer(), which grabs a buffer
    to read (or create) a block of metadata, including b-tree node blocks,
    does not set the block device, but instead does so only if the buffer is
    not in the "uptodate" state for each of its caller block reading
    functions.  However, if the uptodate flag is set on a folio/page, and the
    buffer heads are detached from it by try_to_free_buffers(), and new buffer
    heads are then attached by create_empty_buffers(), the uptodate flag may
    be restored to each buffer without the block device being set to
    bh->b_bdev, and mark_buffer_dirty() may be called later in that state,
    resulting in the bug mentioned above.
    
    Fix this issue by making nilfs_grab_buffer() always set the block device
    of the super block structure to the buffer head, regardless of the state
    of the buffer's uptodate flag.
    
    Link: https://lkml.kernel.org/r/20241106160811.3316-3-konishi.ryusuke@gmail.com
    Fixes: 5305cb830834 ("block: add block_{touch|dirty}_buffer tracepoint")
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Ubisectech Sirius <bugreport@valiantsec.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint [+ + +]

Author: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Date:   Thu Nov 7 01:07:32 2024 +0900

    nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint
    
    commit cd45e963e44b0f10d90b9e6c0e8b4f47f3c92471 upstream.
    
    Patch series "nilfs2: fix null-ptr-deref bugs on block tracepoints".
    
    This series fixes null pointer dereference bugs that occur when using
    nilfs2 and two block-related tracepoints.
    
    
    This patch (of 2):
    
    It has been reported that when using "block:block_touch_buffer"
    tracepoint, touch_buffer() called from __nilfs_get_folio_block() causes a
    NULL pointer dereference, or a general protection fault when KASAN is
    enabled.
    
    This happens because since the tracepoint was added in touch_buffer(), it
    references the dev_t member bh->b_bdev->bd_dev regardless of whether the
    buffer head has a pointer to a block_device structure.  In the current
    implementation, the block_device structure is set after the function
    returns to the caller.
    
    Here, touch_buffer() is used to mark the folio/page that owns the buffer
    head as accessed, but the common search helper for folio/page used by the
    caller function was optimized to mark the folio/page as accessed when it
    was reimplemented a long time ago, eliminating the need to call
    touch_buffer() here in the first place.
    
    So this solves the issue by eliminating the touch_buffer() call itself.
    
    Link: https://lkml.kernel.org/r/20241106160811.3316-1-konishi.ryusuke@gmail.com
    Link: https://lkml.kernel.org/r/20241106160811.3316-2-konishi.ryusuke@gmail.com
    Fixes: 5305cb830834 ("block: add block_{touch|dirty}_buffer tracepoint")
    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
    Reported-by: Ubisectech Sirius <bugreport@valiantsec.com>
    Closes: https://lkml.kernel.org/r/86bd3013-887e-4e38-960f-ca45c657f032.bugreport@valiantsec.com
    Reported-by: syzbot+9982fb8d18eba905abe2@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=9982fb8d18eba905abe2
    Tested-by: syzbot+9982fb8d18eba905abe2@syzkaller.appspotmail.com
    Cc: Tejun Heo <tj@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nommu: pass NULL argument to vma_iter_prealloc() [+ + +]

Author: Hajime Tazaki <thehajime@gmail.com>
Date:   Sat Nov 9 07:28:34 2024 +0900

    nommu: pass NULL argument to vma_iter_prealloc()
    
    commit 247d720b2c5d22f7281437fd6054a138256986ba upstream.
    
    When deleting a vma entry from a maple tree, the caller has to pass NULL
    to vma_iter_prealloc() in order to calculate the internal state of the
    tree, but the nommu code passed the wrong argument.  As a result, nommu
    kernels crashed upon accessing a vma iterator, such as acct_collect()
    reading the size of vma entries after do_munmap().
    
    This commit fixes the issue by passing the correct argument to the
    preallocation call.
    
    Link: https://lkml.kernel.org/r/20241108222834.3625217-1-thehajime@gmail.com
    Fixes: b5df09226450 ("mm: set up vma iterator for vma_iter_prealloc() calls")
    Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nouveau: fw: sync dma after setup is called. [+ + +]

Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Nov 13 05:57:03 2024 +1000

    nouveau: fw: sync dma after setup is called.
    
    commit 21ec425eaf2cb7c0371f7683f81ad7d9679b6eb5 upstream.
    
    When this code moved to the non-coherent allocator, the sync was placed
    too early for some firmwares that call the setup function. Move the
    sync down after the setup function.
    
    Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
    Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
    Reviewed-by: Lyude Paul <lyude@redhat.com>
    Fixes: 9b340aeb26d5 ("nouveau/firmware: use dma non-coherent allocator")
    Cc: stable@vger.kernel.org
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20241114004603.3095485-1-airlied@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs2: fix UBSAN warning in ocfs2_verify_volume() [+ + +]

Author: Dmitry Antipov <dmantipov@yandex.ru>
Date:   Wed Nov 6 12:21:00 2024 +0300

    ocfs2: fix UBSAN warning in ocfs2_verify_volume()
    
    commit 23aab037106d46e6168ce1214a958ce9bf317f2e upstream.
    
    Syzbot has reported the following splat triggered by UBSAN:
    
    UBSAN: shift-out-of-bounds in fs/ocfs2/super.c:2336:10
    shift exponent 32768 is too large for 32-bit type 'int'
    CPU: 2 UID: 0 PID: 5255 Comm: repro Not tainted 6.12.0-rc4-syzkaller-00047-gc2ee9f594da8 #0
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x241/0x360
     ? __pfx_dump_stack_lvl+0x10/0x10
     ? __pfx__printk+0x10/0x10
     ? __asan_memset+0x23/0x50
     ? lockdep_init_map_type+0xa1/0x910
     __ubsan_handle_shift_out_of_bounds+0x3c8/0x420
     ocfs2_fill_super+0xf9c/0x5750
     ? __pfx_ocfs2_fill_super+0x10/0x10
     ? __pfx_validate_chain+0x10/0x10
     ? __pfx_validate_chain+0x10/0x10
     ? validate_chain+0x11e/0x5920
     ? __lock_acquire+0x1384/0x2050
     ? __pfx_validate_chain+0x10/0x10
     ? string+0x26a/0x2b0
     ? widen_string+0x3a/0x310
     ? string+0x26a/0x2b0
     ? bdev_name+0x2b1/0x3c0
     ? pointer+0x703/0x1210
     ? __pfx_pointer+0x10/0x10
     ? __pfx_format_decode+0x10/0x10
     ? __lock_acquire+0x1384/0x2050
     ? vsnprintf+0x1ccd/0x1da0
     ? snprintf+0xda/0x120
     ? __pfx_lock_release+0x10/0x10
     ? do_raw_spin_lock+0x14f/0x370
     ? __pfx_snprintf+0x10/0x10
     ? set_blocksize+0x1f9/0x360
     ? sb_set_blocksize+0x98/0xf0
     ? setup_bdev_super+0x4e6/0x5d0
     mount_bdev+0x20c/0x2d0
     ? __pfx_ocfs2_fill_super+0x10/0x10
     ? __pfx_mount_bdev+0x10/0x10
     ? vfs_parse_fs_string+0x190/0x230
     ? __pfx_vfs_parse_fs_string+0x10/0x10
     legacy_get_tree+0xf0/0x190
     ? __pfx_ocfs2_mount+0x10/0x10
     vfs_get_tree+0x92/0x2b0
     do_new_mount+0x2be/0xb40
     ? __pfx_do_new_mount+0x10/0x10
     __se_sys_mount+0x2d6/0x3c0
     ? __pfx___se_sys_mount+0x10/0x10
     ? do_syscall_64+0x100/0x230
     ? __x64_sys_mount+0x20/0xc0
     do_syscall_64+0xf3/0x230
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f37cae96fda
    Code: 48 8b 0d 51 ce 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1e ce 0c 00 f7 d8 64 89 01 48
    RSP: 002b:00007fff6c1aa228 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
    RAX: ffffffffffffffda RBX: 00007fff6c1aa240 RCX: 00007f37cae96fda
    RDX: 00000000200002c0 RSI: 0000000020000040 RDI: 00007fff6c1aa240
    RBP: 0000000000000004 R08: 00007fff6c1aa280 R09: 0000000000000000
    R10: 00000000000008c0 R11: 0000000000000206 R12: 00000000000008c0
    R13: 00007fff6c1aa280 R14: 0000000000000003 R15: 0000000001000000
     </TASK>
    
    For a badly damaged superblock, the value of 'i_super.s_blocksize_bits'
    may exceed the maximum possible shift for an underlying 'int'.  So add an
    extra check that the aforementioned field represents a valid block
    size, which is 512 bytes, 1K, 2K, or 4K.
    
    Link: https://lkml.kernel.org/r/20241106092100.2661330-1-dmantipov@yandex.ru
    Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
    Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
    Reported-by: syzbot+56f7cd1abe4b8e475180@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=56f7cd1abe4b8e475180
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
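
The check described in the ocfs2_verify_volume() commit above can be illustrated with a small user-space helper. This is a sketch under assumptions rather than the ocfs2 code: the function name and the MODEL_* constants are hypothetical, and the accepted range of 9 to 12 bits is derived from the block sizes the commit lists (512 bytes to 4K). The point is that an on-disk shift count is validated before it is used in a shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 512 bytes = 1 << 9, 4K = 1 << 12; the names are illustrative only. */
#define MODEL_MIN_BLOCKSIZE_BITS 9u
#define MODEL_MAX_BLOCKSIZE_BITS 12u

/*
 * Validate an untrusted shift count read from disk before shifting.
 * Returns false and leaves *blocksize untouched when the value does not
 * describe a 512-byte, 1K, 2K, or 4K block.
 */
static bool model_blocksize_from_bits(uint32_t bits, uint32_t *blocksize)
{
    assert(blocksize != NULL);
    if (bits < MODEL_MIN_BLOCKSIZE_BITS || bits > MODEL_MAX_BLOCKSIZE_BITS)
        return false;
    *blocksize = UINT32_C(1) << bits;
    return true;
}
```

With this ordering, the value 32768 from the UBSAN report is rejected before any shift is evaluated.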

ocfs2: uncache inode which has failed entering the group [+ + +]

Author: Dmitry Antipov <dmantipov@yandex.ru>
Date:   Thu Nov 14 07:38:44 2024 +0300

    ocfs2: uncache inode which has failed entering the group
    
    commit 737f34137844d6572ab7d473c998c7f977ff30eb upstream.
    
    Syzbot has reported the following BUG:
    
    kernel BUG at fs/ocfs2/uptodate.c:509!
    ...
    Call Trace:
     <TASK>
     ? __die_body+0x5f/0xb0
     ? die+0x9e/0xc0
     ? do_trap+0x15a/0x3a0
     ? ocfs2_set_new_buffer_uptodate+0x145/0x160
     ? do_error_trap+0x1dc/0x2c0
     ? ocfs2_set_new_buffer_uptodate+0x145/0x160
     ? __pfx_do_error_trap+0x10/0x10
     ? handle_invalid_op+0x34/0x40
     ? ocfs2_set_new_buffer_uptodate+0x145/0x160
     ? exc_invalid_op+0x38/0x50
     ? asm_exc_invalid_op+0x1a/0x20
     ? ocfs2_set_new_buffer_uptodate+0x2e/0x160
     ? ocfs2_set_new_buffer_uptodate+0x144/0x160
     ? ocfs2_set_new_buffer_uptodate+0x145/0x160
     ocfs2_group_add+0x39f/0x15a0
     ? __pfx_ocfs2_group_add+0x10/0x10
     ? __pfx_lock_acquire+0x10/0x10
     ? mnt_get_write_access+0x68/0x2b0
     ? __pfx_lock_release+0x10/0x10
     ? rcu_read_lock_any_held+0xb7/0x160
     ? __pfx_rcu_read_lock_any_held+0x10/0x10
     ? smack_log+0x123/0x540
     ? mnt_get_write_access+0x68/0x2b0
     ? mnt_get_write_access+0x68/0x2b0
     ? mnt_get_write_access+0x226/0x2b0
     ocfs2_ioctl+0x65e/0x7d0
     ? __pfx_ocfs2_ioctl+0x10/0x10
     ? smack_file_ioctl+0x29e/0x3a0
     ? __pfx_smack_file_ioctl+0x10/0x10
     ? lockdep_hardirqs_on_prepare+0x43d/0x780
     ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
     ? __pfx_ocfs2_ioctl+0x10/0x10
     __se_sys_ioctl+0xfb/0x170
     do_syscall_64+0xf3/0x230
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    ...
     </TASK>
    
    When 'ioctl(OCFS2_IOC_GROUP_ADD, ...)' has failed for a particular
    inode in 'ocfs2_verify_group_and_input()', the corresponding buffer head
    remains cached, and a subsequent call to the same 'ioctl()' for the same
    inode triggers the BUG() in 'ocfs2_set_new_buffer_uptodate()' (trying
    to cache the same buffer head of that inode). Fix this by uncaching
    the buffer head with 'ocfs2_remove_from_cache()' on the error path in
    'ocfs2_group_add()'.
    
    Link: https://lkml.kernel.org/r/20241114043844.111847-1-dmantipov@yandex.ru
    Fixes: 7909f2bf8353 ("[PATCH 2/2] ocfs2: Implement group add for online resize")
    Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
    Reported-by: syzbot+453873f1588c2d75b447@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=453873f1588c2d75b447
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Cc: Dmitry Antipov <dmantipov@yandex.ru>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Mark Fasheh <mark@fasheh.com>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: Changwei Ge <gechangwei@live.cn>
    Cc: Jun Piao <piaojun@huawei.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pmdomain: imx93-blk-ctrl: correct remove path [+ + +]

Author: Peng Fan <peng.fan@nxp.com>
Date:   Fri Nov 1 18:12:51 2024 +0800

    pmdomain: imx93-blk-ctrl: correct remove path
    
    commit f7c7c5aa556378a2c8da72c1f7f238b6648f95fb upstream.
    
    The check condition should be 'i < bc->onecell_data.num_domains', not
    'bc->onecell_data.num_domains', which makes the loop never finish
    and causes a kernel panic.
    
    Also disable runtime PM to address
    "imx93-blk-ctrl 4ac10000.system-controller: Unbalanced pm_runtime_enable!"
    
    Fixes: e9aa77d413c9 ("soc: imx: add i.MX93 media blk ctrl driver")
    Signed-off-by: Peng Fan <peng.fan@nxp.com>
    Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
    Cc: stable@vger.kernel.org
    Message-ID: <20241101101252.1448466-1-peng.fan@oss.nxp.com>
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
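
The loop-condition bug described in the imx93-blk-ctrl commit above is easy to reproduce in isolation. The sketch below is a user-space illustration, not the driver code: model_onecell and model_remove_domains are hypothetical names, and the "release" step is reduced to clearing a slot. It shows the corrected form of the loop, where the condition compares the index against the count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a table of power domains. */
struct model_onecell {
    int **domains;
    unsigned int num_domains;
};

/*
 * Release every domain slot and return how many were released.
 *
 * The loop condition must compare the index against the count. A
 * condition that is only `data->num_domains` stays true for any
 * non-zero count, so the index would run past the end of the array.
 */
static unsigned int model_remove_domains(struct model_onecell *data)
{
    unsigned int i, released = 0;

    assert(data->num_domains == 0 || data->domains != NULL);
    for (i = 0; i < data->num_domains; i++) {
        if (data->domains[i]) {
            data->domains[i] = NULL;  /* stands in for the real teardown */
            released++;
        }
    }
    return released;
}
```

An empty table (num_domains equal to 0) terminates immediately, and a table of three entries releases exactly three slots.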

Revert "mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K" [+ + +]

Author: Aurelien Jarno <aurelien@aurel32.net>
Date:   Sun Nov 10 12:46:36 2024 +0100

    Revert "mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K"
    
    commit 1635e407a4a64d08a8517ac59ca14ad4fc785e75 upstream.
    
    The commit 8396c793ffdf ("mmc: dw_mmc: Fix IDMAC operation with pages
    bigger than 4K") increased the max_req_size, even for 4K pages, causing
    various issues:
    - Panic booting the kernel/rootfs from an SD card on Rockchip RK3566
    - Panic booting the kernel/rootfs from an SD card on StarFive JH7100
    - "swiotlb buffer is full" and data corruption on StarFive JH7110
    
    At this stage no fix has been found, so it's probably better to just
    revert the change.
    
    This reverts commit 8396c793ffdf28bb8aee7cfe0891080f8cab7890.
    
    Cc: stable@vger.kernel.org
    Cc: Sam Protsenko <semen.protsenko@linaro.org>
    Fixes: 8396c793ffdf ("mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K")
    Closes: https://lore.kernel.org/linux-mmc/614692b4-1dbe-31b8-a34d-cb6db1909bb7@w6rz.net/
    Closes: https://lore.kernel.org/linux-mmc/CAC8uq=Ppnmv98mpa1CrWLawWoPnu5abtU69v-=G-P7ysATQ2Pw@mail.gmail.com/
    Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
    Message-ID: <20241110114700.622372-1-aurelien@aurel32.net>
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "RDMA/core: Fix ENODEV error for iWARP test over vlan" [+ + +]

Author: Leon Romanovsky <leon@kernel.org>
Date:   Tue Nov 12 10:56:26 2024 +0200

    Revert "RDMA/core: Fix ENODEV error for iWARP test over vlan"
    
    [ Upstream commit 6abe2a90808192a5a8b2825293e5f10e80fdea56 ]
    
    The commit cited in the Fixes line caused a regression for the udaddy [1]
    application. It doesn't work over VLANs anymore.
    
    Client:
      ifconfig eth2 1.1.1.1
      ip link add link eth2 name p0.3597 type vlan protocol 802.1Q id 3597
      ip link set dev p0.3597 up
      ip addr add 2.2.2.2/16 dev p0.3597
      udaddy -S 847 -C 220 -c 2 -t 0 -s 2.2.2.3 -b 2.2.2.2
    
    Server:
      ifconfig eth2 1.1.1.3
      ip link add link eth2 name p0.3597 type vlan protocol 802.1Q id 3597
      ip link set dev p0.3597 up
      ip addr add 2.2.2.3/16 dev p0.3597
      udaddy -S 847 -C 220 -c 2 -t 0 -b 2.2.2.3
    
    [1] https://github.com/linux-rdma/rdma-core/blob/master/librdmacm/examples/udaddy.c
    
    Fixes: 5069d7e202f6 ("RDMA/core: Fix ENODEV error for iWARP test over vlan")
    Reported-by: Leon Romanovsky <leonro@nvidia.com>
    Closes: https://lore.kernel.org/all/20241110130746.GA48891@unreal
    Link: https://patch.msgid.link/bb9d403419b2b9566da5b8bf0761fa8377927e49.1731401658.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

samples: pktgen: correct dev to DEV [+ + +]

Author: Wei Fang <wei.fang@nxp.com>
Date:   Tue Nov 12 11:03:47 2024 +0800

    samples: pktgen: correct dev to DEV
    
    [ Upstream commit 3342dc8b4623d835e7dd76a15cec2e5a94fe2f93 ]
    
    In the pktgen_sample01_simple.sh script, the device variable is uppercase
    'DEV' instead of lowercase 'dev'. Because of this typo, the script cannot
    enable UDP tx checksum.
    
    Fixes: 460a9aa23de6 ("samples: pktgen: add UDP tx checksum support")
    Signed-off-by: Wei Fang <wei.fang@nxp.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
    Link: https://patch.msgid.link/20241112030347.1849335-1-wei.fang@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

sctp: fix possible UAF in sctp_v6_available() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Nov 7 19:20:21 2024 +0000

    sctp: fix possible UAF in sctp_v6_available()
    
    [ Upstream commit eb72e7fcc83987d5d5595b43222f23b295d5de7f ]
    
    A lockdep report [1] with CONFIG_PROVE_RCU_LIST=y hints
    that sctp_v6_available() is calling dev_get_by_index_rcu()
    and ipv6_chk_addr() without holding the RCU read lock.
    
    [1]
     =============================
     WARNING: suspicious RCU usage
     6.12.0-rc5-virtme #1216 Tainted: G        W
     -----------------------------
     net/core/dev.c:876 RCU-list traversed in non-reader section!!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 2, debug_locks = 1
     1 lock held by sctp_hello/31495:
     #0: ffff9f1ebbdb7418 (sk_lock-AF_INET6){+.+.}-{0:0}, at: sctp_bind (./arch/x86/include/asm/jump_label.h:27 net/sctp/socket.c:315) sctp
    
    stack backtrace:
     CPU: 7 UID: 0 PID: 31495 Comm: sctp_hello Tainted: G        W          6.12.0-rc5-virtme #1216
     Tainted: [W]=WARN
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
     Call Trace:
      <TASK>
     dump_stack_lvl (lib/dump_stack.c:123)
     lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
     dev_get_by_index_rcu (net/core/dev.c:876 (discriminator 7))
     sctp_v6_available (net/sctp/ipv6.c:701) sctp
     sctp_do_bind (net/sctp/socket.c:400 (discriminator 1)) sctp
     sctp_bind (net/sctp/socket.c:320) sctp
     inet6_bind_sk (net/ipv6/af_inet6.c:465)
     ? security_socket_bind (security/security.c:4581 (discriminator 1))
     __sys_bind (net/socket.c:1848 net/socket.c:1869)
     ? do_user_addr_fault (./include/linux/rcupdate.h:347 ./include/linux/rcupdate.h:880 ./include/linux/mm.h:729 arch/x86/mm/fault.c:1340)
     ? do_user_addr_fault (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:98 (discriminator 13) ./include/linux/rcupdate.h:882 (discriminator 13) ./include/linux/mm.h:729 (discriminator 13) arch/x86/mm/fault.c:1340 (discriminator 13))
     __x64_sys_bind (net/socket.c:1877 (discriminator 1) net/socket.c:1875 (discriminator 1) net/socket.c:1875 (discriminator 1))
     do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
     entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
     RIP: 0033:0x7f59b934a1e7
     Code: 44 00 00 48 8b 15 39 8c 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 09 8c 0c 00 f7 d8 64 89 01 48
    All code
    ========
       0:   44 00 00                add    %r8b,(%rax)
       3:   48 8b 15 39 8c 0c 00    mov    0xc8c39(%rip),%rdx        # 0xc8c43
       a:   f7 d8                   neg    %eax
       c:   64 89 02                mov    %eax,%fs:(%rdx)
       f:   b8 ff ff ff ff          mov    $0xffffffff,%eax
      14:   eb bd                   jmp    0xffffffffffffffd3
      16:   66 2e 0f 1f 84 00 00    cs nopw 0x0(%rax,%rax,1)
      1d:   00 00 00
      20:   0f 1f 00                nopl   (%rax)
      23:   b8 31 00 00 00          mov    $0x31,%eax
      28:   0f 05                   syscall
      2a:*  48 3d 01 f0 ff ff       cmp    $0xfffffffffffff001,%rax         <-- trapping instruction
      30:   73 01                   jae    0x33
      32:   c3                      ret
      33:   48 8b 0d 09 8c 0c 00    mov    0xc8c09(%rip),%rcx        # 0xc8c43
      3a:   f7 d8                   neg    %eax
      3c:   64 89 01                mov    %eax,%fs:(%rcx)
      3f:   48                      rex.W
    
    Code starting with the faulting instruction
    ===========================================
       0:   48 3d 01 f0 ff ff       cmp    $0xfffffffffffff001,%rax
       6:   73 01                   jae    0x9
       8:   c3                      ret
       9:   48 8b 0d 09 8c 0c 00    mov    0xc8c09(%rip),%rcx        # 0xc8c19
      10:   f7 d8                   neg    %eax
      12:   64 89 01                mov    %eax,%fs:(%rcx)
      15:   48                      rex.W
     RSP: 002b:00007ffe2d0ad398 EFLAGS: 00000202 ORIG_RAX: 0000000000000031
     RAX: ffffffffffffffda RBX: 00007ffe2d0ad3d0 RCX: 00007f59b934a1e7
     RDX: 000000000000001c RSI: 00007ffe2d0ad3d0 RDI: 0000000000000005
     RBP: 0000000000000005 R08: 1999999999999999 R09: 0000000000000000
     R10: 00007f59b9253298 R11: 0000000000000202 R12: 00007ffe2d0ada61
     R13: 0000000000000000 R14: 0000562926516dd8 R15: 00007f59b9479000
      </TASK>
    
    Fixes: 6fe1e52490a9 ("sctp: check ipv6 addr with sk_bound_dev if set")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Acked-by: Xin Long <lucien.xin@gmail.com>
    Link: https://patch.msgid.link/20241107192021.2579789-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

staging: vchiq_arm: Get the rid off struct vchiq_2835_state [+ + +]

Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Fri Jun 21 15:19:53 2024 +0200

    staging: vchiq_arm: Get the rid off struct vchiq_2835_state
    
    [ Upstream commit 4e2766102da632f26341d5539519b0abf73df887 ]
    
    The whole benefit of this encapsulating struct is questionable.
    It just stores a flag to signal the init state of vchiq_arm_state.
    Besides the fact that this flag is set too soon, access to uninitialized
    members should be avoided. So initialize vchiq_arm_state properly before
    assigning it directly to vchiq_state.
    
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://lore.kernel.org/r/20240621131958.98208-6-wahrenst@gmx.net
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Stable-dep-of: 404b739e8955 ("staging: vchiq_arm: Use devm_kzalloc() for vchiq_arm_state allocation")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

staging: vchiq_arm: Use devm_kzalloc() for vchiq_arm_state allocation [+ + +]

Author: Umang Jain <umang.jain@ideasonboard.com>
Date:   Wed Oct 16 18:32:24 2024 +0530

    staging: vchiq_arm: Use devm_kzalloc() for vchiq_arm_state allocation
    
    [ Upstream commit 404b739e895522838f1abdc340c554654d671dde ]
    
    The struct vchiq_arm_state 'platform_state' is currently allocated
    dynamically using kzalloc(). Unfortunately, it is never freed and is
    leaked in the error handling paths of the probe() function.
    
    To address the issue, use the device resource management helper
    devm_kzalloc() to ensure cleanup after its allocation.
    
    Fixes: 71bad7f08641 ("staging: add bcm2708 vchiq driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
    Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://lore.kernel.org/r/20241016130225.61024-2-umang.jain@ideasonboard.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines [+ + +]

Author: Vitalii Mordan <mordan@ispras.ru>
Date:   Fri Nov 8 20:33:34 2024 +0300

    stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines
    
    [ Upstream commit 5b366eae71937ae7412365340b431064625f9617 ]
    
    If the clock dwmac->tx_clk was not enabled in intel_eth_plat_probe,
    it should not be disabled in any path.
    
    Conversely, if it was enabled in intel_eth_plat_probe, it must be disabled
    in all error paths to ensure proper cleanup.
    
    Found by Linux Verification Center (linuxtesting.org) with Klever.
    
    Fixes: 9efc9b2b04c7 ("net: stmmac: Add dwmac-intel-plat for GBE driver")
    Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
    Link: https://patch.msgid.link/20241108173334.2973603-1-mordan@ispras.ru
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
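
The balance rule described above can be shown with a mock clock that counts enables. This is a sketch under stated assumptions rather than the driver's actual code: `mock_clk`, `mock_clk_enable()`, `mock_clk_disable()` and `mock_probe()` are hypothetical, and the flags merely simulate whether the platform needs the clock and whether a later probe step fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical clock that only tracks its enable count. */
struct mock_clk {
    int enable_count;
};

int mock_clk_enable(struct mock_clk *clk)
{
    clk->enable_count++;
    return 0;
}

void mock_clk_disable(struct mock_clk *clk)
{
    /* Disabling a clock that was never enabled is the bug to avoid. */
    assert(clk->enable_count > 0);
    clk->enable_count--;
}

/*
 * Probe sketch: the clock is disabled on the error path only if this
 * call enabled it, so enable and disable calls always pair up.
 */
int mock_probe(struct mock_clk *tx_clk, bool needs_tx_clk,
               bool later_step_fails)
{
    bool enabled = false;
    int ret;

    if (needs_tx_clk) {
        ret = mock_clk_enable(tx_clk);
        if (ret)
            return ret;
        enabled = true;
    }

    if (later_step_fails) {
        ret = -1;
        goto err_disable;
    }
    return 0;

err_disable:
    if (enabled)
        mock_clk_disable(tx_clk);
    return ret;
}
```

After a failed `mock_probe()` the enable count is back to zero in both configurations; after a successful one with `needs_tx_clk` set it stays at one until a matching remove path disables it.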

tools/mm: fix compile error [+ + +]

Author: Motiejus Jakštys <motiejus@jakstys.lt>
Date:   Tue Nov 12 19:16:55 2024 +0200

    tools/mm: fix compile error
    
    [ Upstream commit a39326767c55c00c7c313333404cbcb502cce8fe ]
    
    Add a missing semicolon.
    
    Link: https://lkml.kernel.org/r/20241112171655.1662670-1-motiejus@jakstys.lt
    Fixes: ece5897e5a10 ("tools/mm: -Werror fixes in page-types/slabinfo")
    Signed-off-by: Motiejus Jakštys <motiejus@jakstys.lt>
    Closes: https://github.com/NixOS/nixpkgs/issues/355369
    Reviewed-by: SeongJae Park <sj@kernel.org>
    Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
    Cc: Wladislav Wiebe <wladislav.kw@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

vdpa/mlx5: Fix PA offset with unaligned starting iotlb map [+ + +]

Author: Si-Wei Liu <si-wei.liu@oracle.com>
Date:   Mon Oct 21 16:40:39 2024 +0300

    vdpa/mlx5: Fix PA offset with unaligned starting iotlb map
    
    commit 29ce8b8a4fa74e841342c8b8f8941848a3c6f29f upstream.
    
    When calculating the physical address range based on the iotlb and mr
    [start,end) ranges, the offset of mr->start relative to map->start
    is not taken into account. This leads to some incorrect and duplicate
    mappings.
    
    For the case when mr->start < map->start the code is already correct:
    the range in [mr->start, map->start) was handled by a different
    iteration.
    
    Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
    Cc: stable@vger.kernel.org
    Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
    Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
    Message-Id: <20241021134040.975221-2-dtatulea@nvidia.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
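
The offset arithmetic can be illustrated with a small user-space sketch. It is not the mlx5 code: `sketch_map`, `sketch_range` and `sketch_overlap_pa()` are hypothetical, and both ranges are modelled as half-open `[start, end)` intervals for simplicity. The physical address of the overlap is the map's base address plus the distance from `map->start` to where the overlap begins.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IOVA-to-PA mapping covering [start, end). */
struct sketch_map {
    uint64_t start;
    uint64_t end;
    uint64_t pa; /* physical address that corresponds to start */
};

/* Hypothetical memory region covering [start, end) in IOVA space. */
struct sketch_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Compute the physical address and length of the part of the map that
 * falls inside the region. Returns 1 on overlap, 0 otherwise.
 */
int sketch_overlap_pa(const struct sketch_map *map,
                      const struct sketch_range *mr,
                      uint64_t *pa, uint64_t *len)
{
    uint64_t lo, hi;

    assert(map->start < map->end);
    lo = mr->start > map->start ? mr->start : map->start;
    hi = mr->end < map->end ? mr->end : map->end;
    if (lo >= hi)
        return 0;

    /* When mr->start > map->start, skip the part before the region. */
    *pa = map->pa + (lo - map->start);
    *len = hi - lo;
    return 1;
}
```

If the region starts at or before the map, the offset is zero and the map's base address is used as is, which matches the case the commit message describes as already correct.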

vdpa: solidrun: Fix UB bug with devres [+ + +]

Author: Philipp Stanner <pstanner@redhat.com>
Date:   Mon Oct 28 08:43:59 2024 +0100

    vdpa: solidrun: Fix UB bug with devres
    
    commit 0b364cf53b20204e92bac7c6ebd1ee7d3ec62931 upstream.
    
    In psnet_open_pf_bar() and snet_open_vf_bar() a string later passed to
    pcim_iomap_regions() is placed on the stack. Neither
    pcim_iomap_regions() nor the functions it calls copy that string.
    
    Should the string later ever be used, this, consequently, causes
    undefined behavior since the stack frame will by then have disappeared.
    
    Fix the bug by allocating the strings on the heap through
    devm_kasprintf().
    
    Cc: stable@vger.kernel.org      # v6.3
    Fixes: 51a8f9d7f587 ("virtio: vdpa: new SolidNET DPU driver.")
    Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Closes: https://lore.kernel.org/all/74e9109a-ac59-49e2-9b1d-d825c9c9f891@wanadoo.fr/
    Suggested-by: Andy Shevchenko <andy@kernel.org>
    Signed-off-by: Philipp Stanner <pstanner@redhat.com>
    Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
    Message-Id: <20241028074357.9104-3-pstanner@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
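
The lifetime problem can be sketched in user space. The names `sketch_region`, `sketch_request_region()`, `sketch_make_name()` and `sketch_open_bar()` are hypothetical, as is the name format; `malloc()` stands in for a heap allocation and is not the kernel API. The registry keeps the caller's pointer without copying it, so the name has to outlive the function that created it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical registry entry that stores the pointer, not a copy. */
struct sketch_region {
    const char *name;
};

void sketch_request_region(struct sketch_region *r, const char *name)
{
    r->name = name; /* the string must stay valid after we return */
}

/* Build the name on the heap; the caller owns and later frees it. */
char *sketch_make_name(const char *prefix, int bar)
{
    char *buf;
    int n;

    assert(prefix != NULL);
    n = snprintf(NULL, 0, "%s-bar-%d", prefix, bar);
    if (n < 0)
        return NULL;
    buf = malloc((size_t)n + 1);
    if (!buf)
        return NULL;
    snprintf(buf, (size_t)n + 1, "%s-bar-%d", prefix, bar);
    return buf;
}

/*
 * A local char array here would leave r->name dangling once this
 * function returns; the heap string remains valid.
 */
int sketch_open_bar(struct sketch_region *r, const char *prefix, int bar)
{
    char *name = sketch_make_name(prefix, bar);

    if (!name)
        return -1;
    sketch_request_region(r, name);
    return 0;
}
```

In this sketch the owner of the region frees `r->name` when it is done; a device-managed allocation ties that step to the device's lifetime instead.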

virtio/vsock: Fix accept_queue memory leak [+ + +]

Author: Michal Luczaj <mhal@rbox.co>
Date:   Thu Nov 7 21:46:12 2024 +0100

    virtio/vsock: Fix accept_queue memory leak
    
    [ Upstream commit d7b0ff5a866724c3ad21f2628c22a63336deec3f ]
    
    As the final stages of socket destruction may be delayed, it is possible
    that virtio_transport_recv_listen() will be called after the accept_queue
    has been flushed, but before the SOCK_DONE flag has been set. As a result,
    sockets enqueued after the flush would remain unremoved, leading to a
    memory leak.
    
    vsock_release
      __vsock_release
        lock
        virtio_transport_release
          virtio_transport_close
            schedule_delayed_work(close_work)
        sk_shutdown = SHUTDOWN_MASK
    (!) flush accept_queue
        release
                                            virtio_transport_recv_pkt
                                              vsock_find_bound_socket
                                              lock
                                              if flag(SOCK_DONE) return
                                              virtio_transport_recv_listen
                                                child = vsock_create_connected
                                          (!)   vsock_enqueue_accept(child)
                                              release
    close_work
      lock
      virtio_transport_do_close
        set_flag(SOCK_DONE)
        virtio_transport_remove_sock
          vsock_remove_sock
            vsock_remove_bound
      release
    
    Introduce a sk_shutdown check to disallow vsock_enqueue_accept() during
    socket destruction.
    
    unreferenced object 0xffff888109e3f800 (size 2040):
      comm "kworker/5:2", pid 371, jiffies 4294940105
      hex dump (first 32 bytes):
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        28 00 0b 40 00 00 00 00 00 00 00 00 00 00 00 00  (..@............
      backtrace (crc 9e5f4e84):
        [<ffffffff81418ff1>] kmem_cache_alloc_noprof+0x2c1/0x360
        [<ffffffff81d27aa0>] sk_prot_alloc+0x30/0x120
        [<ffffffff81d2b54c>] sk_alloc+0x2c/0x4b0
        [<ffffffff81fe049a>] __vsock_create.constprop.0+0x2a/0x310
        [<ffffffff81fe6d6c>] virtio_transport_recv_pkt+0x4dc/0x9a0
        [<ffffffff81fe745d>] vsock_loopback_work+0xfd/0x140
        [<ffffffff810fc6ac>] process_one_work+0x20c/0x570
        [<ffffffff810fce3f>] worker_thread+0x1bf/0x3a0
        [<ffffffff811070dd>] kthread+0xdd/0x110
        [<ffffffff81044fdd>] ret_from_fork+0x2d/0x50
        [<ffffffff8100785a>] ret_from_fork_asm+0x1a/0x30
    
    Fixes: 3fe356d58efa ("vsock/virtio: discard packets only when socket is really closed")
    Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
    Signed-off-by: Michal Luczaj <mhal@rbox.co>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
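
The ordering problem and the added check can be modelled with a small single-threaded sketch. It is an assumption-laden illustration, not the vsock code: `sketch_listener`, `sketch_release()`, `sketch_close_work()`, `sketch_recv_listen()` and `SKETCH_SHUTDOWN_MASK` are hypothetical, and locking is omitted because the calls run one after another.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch-local constant meaning "both directions shut down". */
#define SKETCH_SHUTDOWN_MASK 3

struct sketch_listener {
    int shutdown; /* set at the start of release */
    bool done;    /* set later by the delayed close work */
    int queued;   /* children currently on the accept queue */
};

/* First stage of destruction: mark shutdown and flush the queue. */
void sketch_release(struct sketch_listener *l)
{
    assert(l != NULL);
    l->shutdown = SKETCH_SHUTDOWN_MASK;
    l->queued = 0;
}

/* Delayed final stage: only now is the socket marked as done. */
void sketch_close_work(struct sketch_listener *l)
{
    assert(l != NULL);
    l->done = true;
}

/* Incoming connection request; returns 0 if a child was enqueued. */
int sketch_recv_listen(struct sketch_listener *l)
{
    assert(l != NULL);
    if (l->done)
        return -1;
    /* The extra check: refuse new children once release has begun. */
    if (l->shutdown == SKETCH_SHUTDOWN_MASK)
        return -1;
    l->queued++;
    return 0;
}
```

Without the shutdown check, a request arriving between `sketch_release()` and `sketch_close_work()` would leave one child on the queue that nothing removes afterwards.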

vp_vdpa: fix id_table array not null terminated error [+ + +]

Author: Xiaoguang Wang <lege.wang@jaguarmicro.com>
Date:   Tue Nov 5 21:35:18 2024 +0800

    vp_vdpa: fix id_table array not null terminated error
    
    commit 4e39ecadf1d2a08187139619f1f314b64ba7d947 upstream.
    
    Allocate one extra virtio_device_id as null terminator, otherwise
    vdpa_mgmtdev_get_classes() may iterate multiple times and visit
    undefined memory.
    
    Fixes: ffbda8e9df10 ("vdpa/vp_vdpa : add vdpa tool support in vp_vdpa")
    Cc: stable@vger.kernel.org
    Suggested-by: Parav Pandit <parav@nvidia.com>
    Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com>
    Signed-off-by: Xiaoguang Wang <lege.wang@jaguarmicro.com>
    Message-Id: <20241105133518.1494-1-lege.wang@jaguarmicro.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Reviewed-by: Parav Pandit <parav@nvidia.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
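
A sentinel-terminated table can be sketched as follows. The struct and functions are hypothetical user-space stand-ins, and the consumer loop assumes that iteration stops at the first entry whose `device` field is zero. Allocating `n + 1` zeroed entries guarantees that such an entry exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device ID table entry. */
struct sketch_device_id {
    uint32_t device;
    uint32_t vendor;
};

/* Allocate n usable entries plus one zeroed terminator entry. */
struct sketch_device_id *sketch_alloc_id_table(size_t n)
{
    return calloc(n + 1, sizeof(struct sketch_device_id));
}

/* Consumer: walks the table until it reaches the zeroed terminator. */
size_t sketch_count_ids(const struct sketch_device_id *ids)
{
    size_t count = 0;

    assert(ids != NULL);
    while (ids[count].device != 0)
        count++;
    return count;
}
```

With only `n` entries allocated, the loop in `sketch_count_ids()` would read past the end of the array in search of a terminator.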

x86/mm: Fix a kdump kernel failure on SME system when CONFIG_IMA_KEXEC=y [+ + +]

Author: Baoquan He <bhe@redhat.com>
Date:   Wed Sep 11 16:16:15 2024 +0800

    x86/mm: Fix a kdump kernel failure on SME system when CONFIG_IMA_KEXEC=y
    
    commit 8d9ffb2fe65a6c4ef114e8d4f947958a12751bbe upstream.
    
    The kdump kernel is broken on SME systems with CONFIG_IMA_KEXEC=y enabled.
    Debugging traced the issue back to
    
      b69a2afd5afc ("x86/kexec: Carry forward IMA measurement log on kexec").
    
    Testing was previously not conducted on SME systems with CONFIG_IMA_KEXEC
    enabled, which led to the oversight, with the following incarnation:
    
    ...
      ima: No TPM chip found, activating TPM-bypass!
      Loading compiled-in module X.509 certificates
      Loaded X.509 cert 'Build time autogenerated kernel key: 18ae0bc7e79b64700122bb1d6a904b070fef2656'
      ima: Allocated hash algorithm: sha256
      Oops: general protection fault, probably for non-canonical address 0xcfacfdfe6660003e: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc2+ #14
      Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.20.0 05/03/2023
      RIP: 0010:ima_restore_measurement_list
      Call Trace:
       <TASK>
       ? show_trace_log_lvl
       ? show_trace_log_lvl
       ? ima_load_kexec_buffer
       ? __die_body.cold
       ? die_addr
       ? exc_general_protection
       ? asm_exc_general_protection
       ? ima_restore_measurement_list
       ? vprintk_emit
       ? ima_load_kexec_buffer
       ima_load_kexec_buffer
       ima_init
       ? __pfx_init_ima
       init_ima
       ? __pfx_init_ima
       do_one_initcall
       do_initcalls
       ? __pfx_kernel_init
       kernel_init_freeable
       kernel_init
       ret_from_fork
       ? __pfx_kernel_init
       ret_from_fork_asm
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      ...
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      Rebooting in 10 seconds..
    
    Adding debug printks showed that the stored addr and size of the
    ima_kexec buffer are not decrypted correctly, for example:
    
      ima: ima_load_kexec_buffer, buffer:0xcfacfdfe6660003e, size:0xe48066052d5df359
    
    Three types of setup_data info
    
      - SETUP_EFI,
      - SETUP_IMA, and
      - SETUP_RNG_SEED
    
    are passed to the kexec/kdump kernel. Only the ima_kexec buffer
    experienced incorrect decryption. Debugging identified a bug in
    early_memremap_is_setup_data(), where the range was calculated
    incorrectly because the len variable in struct setup_data only
    represents the length of the data field, excluding the size of the
    struct itself.
    
    Address a similar issue in memremap_is_setup_data() while at it.
    
      [ bp: Heavily massage. ]
    
    Fixes: b3c72fc9a78e ("x86/boot: Introduce setup_indirect")
    Signed-off-by: Baoquan He <bhe@redhat.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
    Cc: <stable@kernel.org>
    Link: https://lore.kernel.org/r/20240911081615.262202-3-bhe@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
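
The range calculation can be illustrated in user space. This sketch mirrors the layout of struct setup_data (a 64-bit next pointer, a 32-bit type, a 32-bit len and a trailing data array), but `sketch_setup_data` and `sketch_in_setup_data()` are hypothetical names and the function is not the kernel's helper. The point is that the end of an entry is its start plus the header size plus `len`, not its start plus `len` alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space mirror of the setup_data layout, for illustration only. */
struct sketch_setup_data {
    uint64_t next;
    uint32_t type;
    uint32_t len;   /* length of data[] only, header not included */
    uint8_t data[];
};

/*
 * Return true if addr lies inside the setup_data entry that starts at
 * entry_paddr and carries len bytes of payload.
 */
bool sketch_in_setup_data(uint64_t entry_paddr, uint32_t len,
                          uint64_t addr)
{
    uint64_t total = sizeof(struct sketch_setup_data) + (uint64_t)len;

    /* Guard against wrap-around in this simplified model. */
    assert(entry_paddr + total >= entry_paddr);
    return addr >= entry_paddr && addr < entry_paddr + total;
}
```

For an entry at 0x1000 with 0x20 payload bytes, the last 16 bytes of the payload (0x1020 to 0x102f) fall outside a range computed from `len` alone, but inside the range that also counts the header.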