Список изменений в ядре 6.6.26

9p: Fix read/write debug statements to report server reply [+ + +]

Author: Dominique Martinet <asmadeus@codewreck.org>
Date:   Tue Jan 9 12:39:03 2024 +0900

    9p: Fix read/write debug statements to report server reply
    
    [ Upstream commit be3193e58ec210b2a72fb1134c2a0695088a911d ]
    
    Previous conversion to iov missed these debug statements which would now
    always print the requested size instead of the actual server reply.
    
    Write also added a loop in a much older commit but we didn't report
    these, while reads do report each iteration -- it's more coherent to
    keep reporting all requests to server so move that at the same time.
    
    Fixes: 7f02464739da ("9p: convert to advancing variant of iov_iter_get_pages_alloc()")
    Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
    Message-ID: <20240109-9p-rw-trace-v1-1-327178114257@codewreck.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ACPICA: debugger: check status of acpi_evaluate_object() in acpi_db_walk_for_fields() [+ + +]

Author: Nikita Kiryushin <kiryushin@ancud.ru>
Date:   Fri Mar 22 21:07:53 2024 +0300

    ACPICA: debugger: check status of acpi_evaluate_object() in acpi_db_walk_for_fields()
    
    [ Upstream commit 40e2710860e57411ab57a1529c5a2748abbe8a19 ]
    
    ACPICA commit 9061cd9aa131205657c811a52a9f8325a040c6c9
    
    Errors in acpi_evaluate_object() can lead to incorrect state of buffer.
    
    This can lead to access to data in previously ACPI_FREEd buffer and
    secondary ACPI_FREE to the same buffer later.
    
    Handle errors in acpi_evaluate_object the same way it is done earlier
    with acpi_ns_handle_to_pathname.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Link: https://github.com/acpica/acpica/commit/9061cd9a
    Fixes: 5fd033288a86 ("ACPICA: debugger: add command to dump all fields of particular subtype")
    Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/realtek - Fix inactive headset mic jack [+ + +]

Author: Christoffer Sandberg <cs@tuxedo.de>
Date:   Thu Mar 28 11:27:57 2024 +0100

    ALSA: hda/realtek - Fix inactive headset mic jack
    
    commit daf6c4681a74034d5723e2fb761e0d7f3a1ca18f upstream.
    
    This patch adds the existing fixup to certain TF platforms implementing
    the ALC274 codec with a headset jack. It fixes/activates the inactive
    microphone of the headset.
    
    Signed-off-by: Christoffer Sandberg <cs@tuxedo.de>
    Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
    Cc: <stable@vger.kernel.org>
    Message-ID: <20240328102757.50310-1-wse@tuxedocomputers.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone [+ + +]

Author: I Gede Agastya Darma Laksana <gedeagas22@gmail.com>
Date:   Tue Apr 2 00:46:02 2024 +0700

    ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone
    
    commit 1576f263ee2147dc395531476881058609ad3d38 upstream.
    
    This patch addresses an issue with the Panasonic CF-SZ6's existing quirk,
    specifically its headset microphone functionality. Previously, the quirk
    used ALC269_FIXUP_HEADSET_MODE, which does not support the CF-SZ6's design
    of a single 3.5mm jack for both mic and audio output effectively. The
    device uses pin 0x19 for the headset mic without jack detection.
    
    Following verification on the CF-SZ6 and discussions with the original
    patch author, i determined that the update to
    ALC269_FIXUP_ASPIRE_HEADSET_MIC is the appropriate solution. This change
    is custom-designed for the CF-SZ6's unique hardware setup, which includes
    a single 3.5mm jack for both mic and audio output, connecting the headset
    microphone to pin 0x19 without the use of jack detection.
    
    Fixes: 0fca97a29b83 ("ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk")
    Signed-off-by: I Gede Agastya Darma Laksana <gedeagas22@gmail.com>
    Cc: <stable@vger.kernel.org>
    Message-ID: <20240401174602.14133-1-gedeagas22@gmail.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda: cs35l56: Add ACPI device match tables [+ + +]

Author: Simon Trimmer <simont@opensource.cirrus.com>
Date:   Thu Mar 28 12:13:55 2024 +0000

    ALSA: hda: cs35l56: Add ACPI device match tables
    
    [ Upstream commit 2d0401ee38d43ab0e4cdd02dfc9d402befb2b5c8 ]
    
    Adding the ACPI HIDs to the match table triggers the cs35l56-hda modules
    to be loaded on boot so that Serial Multi Instantiate can add the
    devices to the bus and begin the driver init sequence.
    
    Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
    Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier")
    Message-ID: <20240328121355.18972-1-simont@opensource.cirrus.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda: cs35l56: Set the init_done flag before component_add() [+ + +]

Author: Simon Trimmer <simont@opensource.cirrus.com>
Date:   Mon Mar 25 14:55:10 2024 +0000

    ALSA: hda: cs35l56: Set the init_done flag before component_add()
    
    [ Upstream commit cafe9c6a72cf1ffe96d2561d988a141cb5c093db ]
    
    Initialization is completed before adding the component as that can
    start the process of the device binding and trigger actions that check
    init_done.
    
    Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier")
    Message-ID: <20240325145510.328378-1-rf@opensource.cirrus.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64/ptrace: Use saved floating point state type to determine SVE layout [+ + +]

Author: Mark Brown <broonie@kernel.org>
Date:   Mon Mar 25 16:35:21 2024 +0000

    arm64/ptrace: Use saved floating point state type to determine SVE layout
    
    commit b017a0cea627fcbe158fc2c214fe893e18c4d0c4 upstream.
    
    The SVE register sets have two different formats, one of which is a wrapped
    version of the standard FPSIMD register set and another with actual SVE
    register data. At present we check TIF_SVE to see if full SVE register
    state should be provided when reading the SVE regset but if we were in a
    syscall we may have saved only floating point registers even though that is
    set.
    
    Fix this and simplify the logic by checking and using the format which we
    recorded when deciding if we should use FPSIMD or SVE format.
    
    Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
    Cc: <stable@vger.kernel.org> # 6.2.x
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20240325-arm64-ptrace-fp-type-v1-1-8dc846caf11f@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: bpf: fix 32bit unconditional bswap [+ + +]

Author: Artem Savkov <asavkov@redhat.com>
Date:   Thu Mar 21 09:18:09 2024 +0100

    arm64: bpf: fix 32bit unconditional bswap
    
    [ Upstream commit a51cd6bf8e10793103c5870ff9e4db295a843604 ]
    
    In case when is64 == 1 in emit(A64_REV32(is64, dst, dst), ctx) the
    generated insn reverses byte order for both high and low 32-bit words,
    resuling in an incorrect swap as indicated by the jit test:
    
    [ 9757.262607] test_bpf: #312 BSWAP 16: 0x0123456789abcdef -> 0xefcd jited:1 8 PASS
    [ 9757.264435] test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 ret 1460850314 != -271733879 (0x5712ce8a != 0xefcdab89)FAIL (1 times)
    [ 9757.266260] test_bpf: #314 BSWAP 64: 0x0123456789abcdef -> 0x67452301 jited:1 8 PASS
    [ 9757.268000] test_bpf: #315 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 jited:1 8 PASS
    [ 9757.269686] test_bpf: #316 BSWAP 16: 0xfedcba9876543210 -> 0x1032 jited:1 8 PASS
    [ 9757.271380] test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 ret -1460850316 != 271733878 (0xa8ed3174 != 0x10325476)FAIL (1 times)
    [ 9757.273022] test_bpf: #318 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe jited:1 7 PASS
    [ 9757.274721] test_bpf: #319 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 jited:1 9 PASS
    
    Fix this by forcing 32bit variant of rev32.
    
    Fixes: 1104247f3f979 ("bpf, arm64: Support unconditional bswap")
    Signed-off-by: Artem Savkov <asavkov@redhat.com>
    Tested-by: Puranjay Mohan <puranjay12@gmail.com>
    Acked-by: Puranjay Mohan <puranjay12@gmail.com>
    Acked-by: Xu Kuohai <xukuohai@huawei.com>
    Message-ID: <20240321081809.158803-1-asavkov@redhat.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Mar 20 08:55:52 2024 +0100

    arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken
    
    commit e12e28009e584c8f8363439f6a928ec86278a106 upstream.
    
    Several Qualcomm Bluetooth controllers lack persistent storage for the
    device address and instead one can be provided by the boot firmware
    using the 'local-bd-address' devicetree property.
    
    The Bluetooth bindings clearly states that the address should be
    specified in little-endian order, but due to a long-standing bug in the
    Qualcomm driver which reversed the address some boot firmware has been
    providing the address in big-endian order instead.
    
    The boot firmware in SC7180 Trogdor Chromebooks is known to be affected
    so mark the 'local-bd-address' property as broken to maintain backwards
    compatibility with older firmware when fixing the underlying driver bug.
    
    Note that ChromeOS always updates the kernel and devicetree in lockstep
    so that there is no need to handle backwards compatibility with older
    devicetrees.
    
    Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt")
    Cc: stable@vger.kernel.org      # 5.10
    Cc: Rob Clark <robdclark@chromium.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Acked-by: Bjorn Andersson <andersson@kernel.org>
    Reviewed-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: amd: acp: fix for acp_init function error handling [+ + +]

Author: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Date:   Fri Mar 29 11:08:12 2024 +0530

    ASoC: amd: acp: fix for acp_init function error handling
    
    [ Upstream commit 2c603a4947a1247102ccb008d5eb6f37a4043c98 ]
    
    If acp_init() fails, acp pci driver probe should return error.
    Add acp_init() function return value check logic.
    
    Fixes: e61b415515d3 ("ASoC: amd: acp: refactor the acp init and de-init sequence")
    Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
    Link: https://lore.kernel.org/r/20240329053815.2373979-1-Vijendar.Mukunda@amd.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw [+ + +]

Author: Stephen Lee <slee08177@gmail.com>
Date:   Mon Mar 25 18:01:31 2024 -0700

    ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw
    
    [ Upstream commit fc563aa900659a850e2ada4af26b9d7a3de6c591 ]
    
    In snd_soc_info_volsw(), mask is generated by figuring out the index of
    the most significant bit set in max and converting the index to a
    bitmask through bit shift 1. Unintended wraparound occurs when max is an
    integer value with msb bit set. Since the bit shift value 1 is treated
    as an integer type, the left shift operation will wraparound and set
    mask to 0 instead of all 1's. In order to fix this, we type cast 1 as
    `1ULL` to prevent the wraparound.
    
    Fixes: 7077148fb50a ("ASoC: core: Split ops out of soc-core.c")
    Signed-off-by: Stephen Lee <slee08177@gmail.com>
    Link: https://msgid.link/r/20240326010131.6211-1-slee08177@gmail.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: rt5682-sdw: fix locking sequence [+ + +]

Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Mon Mar 25 17:18:12 2024 -0500

    ASoC: rt5682-sdw: fix locking sequence
    
    [ Upstream commit 310a5caa4e861616a27a83c3e8bda17d65026fa8 ]
    
    The disable_irq_lock protects the 'disable_irq' value, we need to lock
    before testing it.
    
    Fixes: 02fb23d72720 ("ASoC: rt5682-sdw: fix for JD event handling in ClockStop Mode0")
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Chao Song <chao.song@linux.intel.com>
    Link: https://msgid.link/r/20240325221817.206465-2-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: rt711-sdca: fix locking sequence [+ + +]

Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Mon Mar 25 17:18:13 2024 -0500

    ASoC: rt711-sdca: fix locking sequence
    
    [ Upstream commit ee287771644394d071e6a331951ee8079b64f9a7 ]
    
    The disable_irq_lock protects the 'disable_irq' value, we need to lock
    before testing it.
    
    Fixes: 23adeb7056ac ("ASoC: rt711-sdca: fix for JD event handling in ClockStop Mode0")
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Chao Song <chao.song@linux.intel.com>
    Link: https://msgid.link/r/20240325221817.206465-3-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: rt711-sdw: fix locking sequence [+ + +]

Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Mon Mar 25 17:18:14 2024 -0500

    ASoC: rt711-sdw: fix locking sequence
    
    [ Upstream commit aae86cfd8790bcc7693a5a0894df58de5cb5128c ]
    
    The disable_irq_lock protects the 'disable_irq' value, we need to lock
    before testing it.
    
    Fixes: b69de265bd0e ("ASoC: rt711: fix for JD event handling in ClockStop Mode0")
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Chao Song <chao.song@linux.intel.com>
    Link: https://msgid.link/r/20240325221817.206465-4-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: rt712-sdca-sdw: fix locking sequence [+ + +]

Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Mon Mar 25 17:18:15 2024 -0500

    ASoC: rt712-sdca-sdw: fix locking sequence
    
    [ Upstream commit c8b2e5c1b959d100990e4f0cbad38e7d047bb97c ]
    
    The disable_irq_lock protects the 'disable_irq' value, we need to lock
    before testing it.
    
    Fixes: 7a8735c1551e ("ASoC: rt712-sdca: fix for JD event handling in ClockStop Mode0")
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Chao Song <chao.song@linux.intel.com>
    Link: https://msgid.link/r/20240325221817.206465-5-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: rt722-sdca-sdw: fix locking sequence [+ + +]

Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Mon Mar 25 17:18:16 2024 -0500

    ASoC: rt722-sdca-sdw: fix locking sequence
    
    [ Upstream commit adb354bbc231b23d3a05163ce35c1d598512ff64 ]
    
    The disable_irq_lock protects the 'disable_irq' value, we need to lock
    before testing it.
    
    Fixes: a0b7c59ac1a9 ("ASoC: rt722-sdca: fix for JD event handling in ClockStop Mode0")
    Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
    Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
    Reviewed-by: Chao Song <chao.song@linux.intel.com>
    Link: https://msgid.link/r/20240325221817.206465-6-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: SOF: amd: fix for false dsp interrupts [+ + +]

Author: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Date:   Thu Apr 4 09:47:13 2024 +0530

    ASoC: SOF: amd: fix for false dsp interrupts
    
    [ Upstream commit b9846a386734e73a1414950ebfd50f04919f5e24 ]
    
    Before ACP firmware loading, DSP interrupts are not expected.
    Sometimes after reboot, it's observed that before ACP firmware is loaded
    false DSP interrupt is reported.
    Registering the interrupt handler before acp initialization causing false
    interrupts sometimes on reboot as ACP reset is not applied.
    Correct the sequence by invoking acp initialization sequence prior to
    registering interrupt handler.
    
    Fixes: 738a2b5e2cc9 ("ASoC: SOF: amd: Add IPC support for ACP IP block")
    Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
    Link: https://msgid.link/r/20240404041717.430545-1-Vijendar.Mukunda@amd.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: wm_adsp: Fix missing mutex_lock in wm_adsp_write_ctl() [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Thu Mar 7 11:02:27 2024 +0000

    ASoC: wm_adsp: Fix missing mutex_lock in wm_adsp_write_ctl()
    
    [ Upstream commit f193957b0fbbba397c8bddedf158b3bf7e4850fc ]
    
    wm_adsp_write_ctl() must hold the pwr_lock mutex when calling
    cs_dsp_get_ctl().
    
    This was previously partially fixed by commit 781118bc2fc1
    ("ASoC: wm_adsp: Fix missing locking in wm_adsp_[read|write]_ctl()")
    but this only put locking around the call to cs_dsp_coeff_write_ctrl(),
    missing the call to cs_dsp_get_ctl().
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 781118bc2fc1 ("ASoC: wm_adsp: Fix missing locking in wm_adsp_[read|write]_ctl()")
    Link: https://msgid.link/r/20240307110227.41421-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ata: sata_mv: Fix PCI device ID table declaration compilation warning [+ + +]

Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Apr 3 10:06:48 2024 +0200

    ata: sata_mv: Fix PCI device ID table declaration compilation warning
    
    [ Upstream commit 3137b83a90646917c90951d66489db466b4ae106 ]
    
    Building with W=1 shows a warning for an unused variable when CONFIG_PCI
    is diabled:
    
    drivers/ata/sata_mv.c:790:35: error: unused variable 'mv_pci_tbl' [-Werror,-Wunused-const-variable]
    static const struct pci_device_id mv_pci_tbl[] = {
    
    Move the table into the same block that containsn the pci_driver
    definition.
    
    Fixes: 7bb3c5290ca0 ("sata_mv: Remove PCI dependency")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit [+ + +]

Author: Arnd Bergmann <arnd@arndb.de>
Date:   Tue Mar 26 15:53:37 2024 +0100

    ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit
    
    [ Upstream commit 52f80bb181a9a1530ade30bc18991900bbb9697f ]
    
    gcc warns about a memcpy() with overlapping pointers because of an
    incorrect size calculation:
    
    In file included from include/linux/string.h:369,
                     from drivers/ata/sata_sx4.c:66:
    In function 'memcpy_fromio',
        inlined from 'pdc20621_get_from_dimm.constprop' at drivers/ata/sata_sx4.c:962:2:
    include/linux/fortify-string.h:97:33: error: '__builtin_memcpy' accessing 4294934464 bytes at offsets 0 and [16, 16400] overlaps 6442385281 bytes at offset -2147450817 [-Werror=restrict]
       97 | #define __underlying_memcpy     __builtin_memcpy
          |                                 ^
    include/linux/fortify-string.h:620:9: note: in expansion of macro '__underlying_memcpy'
      620 |         __underlying_##op(p, q, __fortify_size);                        \
          |         ^~~~~~~~~~~~~
    include/linux/fortify-string.h:665:26: note: in expansion of macro '__fortify_memcpy_chk'
      665 | #define memcpy(p, q, s)  __fortify_memcpy_chk(p, q, s,                  \
          |                          ^~~~~~~~~~~~~~~~~~~~
    include/asm-generic/io.h:1184:9: note: in expansion of macro 'memcpy'
     1184 |         memcpy(buffer, __io_virt(addr), size);
          |         ^~~~~~
    
    The problem here is the overflow of an unsigned 32-bit number to a
    negative that gets converted into a signed 'long', keeping a large
    positive number.
    
    Replace the complex calculation with a more readable min() variant
    that avoids the warning.
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ax25: fix use-after-free bugs caused by ax25_ds_del_timer [+ + +]

Author: Duoming Zhou <duoming@zju.edu.cn>
Date:   Fri Mar 29 09:50:23 2024 +0800

    ax25: fix use-after-free bugs caused by ax25_ds_del_timer
    
    commit fd819ad3ecf6f3c232a06b27423ce9ed8c20da89 upstream.
    
    When the ax25 device is detaching, the ax25_dev_device_down()
    calls ax25_ds_del_timer() to cleanup the slave_timer. When
    the timer handler is running, the ax25_ds_del_timer() that
    calls del_timer() in it will return directly. As a result,
    the use-after-free bugs could happen, one of the scenarios
    is shown below:
    
          (Thread 1)          |      (Thread 2)
                              | ax25_ds_timeout()
    ax25_dev_device_down()    |
      ax25_ds_del_timer()     |
        del_timer()           |
      ax25_dev_put() //FREE   |
                              |  ax25_dev-> //USE
    
    In order to mitigate bugs, when the device is detaching, use
    timer_shutdown_sync() to stop the timer.
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240329015023.9223-1-duoming@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: add quirk for broken address properties [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Mar 20 08:55:53 2024 +0100

    Bluetooth: add quirk for broken address properties
    
    commit 39646f29b100566451d37abc4cc8cdd583756dfe upstream.
    
    Some Bluetooth controllers lack persistent storage for the device
    address and instead one can be provided by the boot firmware using the
    'local-bd-address' devicetree property.
    
    The Bluetooth devicetree bindings clearly states that the address should
    be specified in little-endian order, but due to a long-standing bug in
    the Qualcomm driver which reversed the address some boot firmware has
    been providing the address in big-endian order instead.
    
    Add a new quirk that can be set on platforms with broken firmware and
    use it to reverse the address when parsing the property so that the
    underlying driver bug can be fixed.
    
    Fixes: 5c0a1001c8be ("Bluetooth: hci_qca: Add helper to set device address")
    Cc: stable@vger.kernel.org      # 5.1
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: Fix TOCTOU in HCI debugfs implementation [+ + +]

Author: Bastien Nocera <hadess@hadess.net>
Date:   Wed Mar 27 15:24:56 2024 +0100

    Bluetooth: Fix TOCTOU in HCI debugfs implementation
    
    commit 7835fcfd132eb88b87e8eb901f88436f63ab60f7 upstream.
    
    struct hci_dev members conn_info_max_age, conn_info_min_age,
    le_conn_max_interval, le_conn_min_interval, le_adv_max_interval,
    and le_adv_min_interval can be modified from the HCI core code, as well
    through debugfs.
    
    The debugfs implementation, that's only available to privileged users,
    will check for boundaries, making sure that the minimum value being set
    is strictly above the maximum value that already exists, and vice-versa.
    
    However, as both minimum and maximum values can be changed concurrently
    to us modifying them, we need to make sure that the value we check is
    the value we end up using.
    
    For example, with ->conn_info_max_age set to 10, conn_info_min_age_set()
    gets called from vfs handlers to set conn_info_min_age to 8.
    
    In conn_info_min_age_set(), this goes through:
            if (val == 0 || val > hdev->conn_info_max_age)
                    return -EINVAL;
    
    Concurrently, conn_info_max_age_set() gets called to set to set the
    conn_info_max_age to 7:
            if (val == 0 || val > hdev->conn_info_max_age)
                    return -EINVAL;
    That check will also pass because we used the old value (10) for
    conn_info_max_age.
    
    After those checks that both passed, the struct hci_dev access
    is mutex-locked, disabling concurrent access, but that does not matter
    because the invalid value checks both passed, and we'll end up with
    conn_info_min_age = 8 and conn_info_max_age = 7
    
    To fix this problem, we need to lock the structure access before so the
    check and assignment are not interrupted.
    
    This fix was originally devised by the BassCheck[1] team, and
    considered the problem to be an atomicity one. This isn't the case as
    there aren't any concerns about the variable changing while we check it,
    but rather after we check it parallel to another change.
    
    This patch fixes CVE-2024-24858 and CVE-2024-24857.
    
    [1] https://sites.google.com/view/basscheck/
    
    Co-developed-by: Gui-Dong Han <2045gemini@gmail.com>
    Signed-off-by: Gui-Dong Han <2045gemini@gmail.com>
    Link: https://lore.kernel.org/linux-bluetooth/20231222161317.6255-1-2045gemini@gmail.com/
    Link: https://nvd.nist.gov/vuln/detail/CVE-2024-24858
    Link: https://lore.kernel.org/linux-bluetooth/20231222162931.6553-1-2045gemini@gmail.com/
    Link: https://lore.kernel.org/linux-bluetooth/20231222162310.6461-1-2045gemini@gmail.com/
    Link: https://nvd.nist.gov/vuln/detail/CVE-2024-24857
    Fixes: 31ad169148df ("Bluetooth: Add conn info lifetime parameters to debugfs")
    Fixes: 729a1051da6f ("Bluetooth: Expose default LE advertising interval via debugfs")
    Fixes: 71c3b60ec6d2 ("Bluetooth: Move BR/EDR debugfs file creation into hci_debugfs.c")
    Signed-off-by: Bastien Nocera <hadess@hadess.net>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: hci_event: set the conn encrypted before conn establishes [+ + +]

Author: Hui Wang <hui.wang@canonical.com>
Date:   Wed Mar 27 12:30:30 2024 +0800

    Bluetooth: hci_event: set the conn encrypted before conn establishes
    
    commit c569242cd49287d53b73a94233db40097d838535 upstream.
    
    We have a BT headset (Lenovo Thinkplus XT99), the pairing and
    connecting has no problem, once this headset is paired, bluez will
    remember this device and will auto re-connect it whenever the device
    is powered on. The auto re-connecting works well with Windows and
    Android, but with Linux, it always fails. Through debugging, we found
    at the rfcomm connection stage, the bluetooth stack reports
    "Connection refused - security block (0x0003)".
    
    For this device, the re-connecting negotiation process is different
    from other BT headsets, it sends the Link_KEY_REQUEST command before
    the CONNECT_REQUEST completes, and it doesn't send ENCRYPT_CHANGE
    command during the negotiation. When the device sends the "connect
    complete" to hci, the ev->encr_mode is 1.
    
    So here in the conn_complete_evt(), if ev->encr_mode is 1, link type
    is ACL and HCI_CONN_ENCRYPT is not set, we set HCI_CONN_ENCRYPT to
    this conn, and update conn->enc_key_size accordingly.
    
    After this change, this BT headset could re-connect with Linux
    successfully. This is the btmon log after applying the patch, after
    receiving the "Connect Complete" with "Encryption: Enabled", will send
    the command to read encryption key size:
    > HCI Event: Connect Request (0x04) plen 10
            Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA)
            Class: 0x240404
              Major class: Audio/Video (headset, speaker, stereo, video, vcr)
              Minor class: Wearable Headset Device
              Rendering (Printing, Speaker)
              Audio (Speaker, Microphone, Headset)
            Link type: ACL (0x01)
    ...
    > HCI Event: Link Key Request (0x17) plen 6
            Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA)
    < HCI Command: Link Key Request Reply (0x01|0x000b) plen 22
            Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA)
            Link key: ${32-hex-digits-key}
    ...
    > HCI Event: Connect Complete (0x03) plen 11
            Status: Success (0x00)
            Handle: 256
            Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA)
            Link type: ACL (0x01)
            Encryption: Enabled (0x01)
    < HCI Command: Read Encryption Key... (0x05|0x0008) plen 2
            Handle: 256
    < ACL Data TX: Handle 256 flags 0x00 dlen 10
          L2CAP: Information Request (0x0a) ident 1 len 2
            Type: Extended features supported (0x0002)
    > HCI Event: Command Complete (0x0e) plen 7
          Read Encryption Key Size (0x05|0x0008) ncmd 1
            Status: Success (0x00)
            Handle: 256
            Key size: 16
    
    Cc: stable@vger.kernel.org
    Link: https://github.com/bluez/bluez/issues/704
    Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Reviewed-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
    Signed-off-by: Hui Wang <hui.wang@canonical.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: qca: fix device-address endianness [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Wed Mar 20 08:55:54 2024 +0100

    Bluetooth: qca: fix device-address endianness
    
    commit 77f45cca8bc55d00520a192f5a7715133591c83e upstream.
    
    The WCN6855 firmware on the Lenovo ThinkPad X13s expects the Bluetooth
    device address in big-endian order when setting it using the
    EDL_WRITE_BD_ADDR_OPCODE command.
    
    Presumably, this is the case for all non-ROME devices which all use the
    EDL_WRITE_BD_ADDR_OPCODE command for this (unlike the ROME devices which
    use a different command and expect the address in little-endian order).
    
    Reverse the little-endian address before setting it to make sure that
    the address can be configured using tools like btmgmt or using the
    'local-bd-address' devicetree property.
    
    Note that this can potentially break systems with boot firmware which
    has started relying on the broken behaviour and is incorrectly passing
    the address via devicetree in big-endian order.
    
    The only device affected by this should be the WCN3991 used in some
    Chromebooks. As ChromeOS updates the kernel and devicetree in lockstep,
    the new 'qcom,local-bd-address-broken' property can be used to determine
    if the firmware is buggy so that the underlying driver bug can be fixed
    without breaking backwards compatibility.
    
    Set the HCI_QUIRK_BDADDR_PROPERTY_BROKEN quirk for such platforms so
    that the address is reversed when parsing the address property.
    
    Fixes: 5c0a1001c8be ("Bluetooth: hci_qca: Add helper to set device address")
    Cc: stable@vger.kernel.org      # 5.1
    Cc: Balakrishna Godavarthi <quic_bgodavar@quicinc.com>
    Cc: Matthias Kaehlcke <mka@chromium.org>
    Tested-by: Nikita Travkin <nikita@trvn.ru> # sc7180
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf, arm64: fix bug in BPF_LDX_MEMSX [+ + +]

Author: Puranjay Mohan <puranjay12@gmail.com>
Date:   Tue Mar 12 23:59:17 2024 +0000

    bpf, arm64: fix bug in BPF_LDX_MEMSX
    
    [ Upstream commit 114b5b3b4bde7358624437be2f12cde1b265224e ]
    
    A64_LDRSW() takes three registers: Xt, Xn, Xm as arguments and it loads
    and sign extends the value at address Xn + Xm into register Xt.
    
    Currently, the offset is being directly used in place of the tmp
    register which has the offset already loaded by the last emitted
    instruction.
    
    This will cause JIT failures. The easiest way to reproduce this is to
    test the following code through test_bpf module:
    
    {
            "BPF_LDX_MEMSX | BPF_W",
            .u.insns_int = {
                    BPF_LD_IMM64(R1, 0x00000000deadbeefULL),
                    BPF_LD_IMM64(R2, 0xffffffffdeadbeefULL),
                    BPF_STX_MEM(BPF_DW, R10, R1, -7),
                    BPF_LDX_MEMSX(BPF_W, R0, R10, -7),
                    BPF_JMP_REG(BPF_JNE, R0, R2, 1),
                    BPF_ALU64_IMM(BPF_MOV, R0, 0),
                    BPF_EXIT_INSN(),
            },
            INTERNAL,
            { },
            { { 0, 0 } },
            .stack_depth = 7,
    },
    
    We need to use the offset as -7 to trigger this code path, there could
    be other valid ways to trigger this from proper BPF programs as well.
    
    This code is rejected by the JIT because -7 is passed to A64_LDRSW() but
    it expects a valid register (0 - 31).
    
     roott@pjy:~# modprobe test_bpf test_name="BPF_LDX_MEMSX | BPF_W"
     [11300.490371] test_bpf: test_bpf: set 'test_bpf' as the default test_suite.
     [11300.491750] test_bpf: #345 BPF_LDX_MEMSX | BPF_W
     [11300.493179] aarch64_insn_encode_register: unknown register encoding -7
     [11300.494133] aarch64_insn_encode_register: unknown register encoding -7
     [11300.495292] FAIL to select_runtime err=-524
     [11300.496804] test_bpf: Summary: 0 PASSED, 1 FAILED, [0/0 JIT'ed]
     modprobe: ERROR: could not insert 'test_bpf': Invalid argument
    
    Applying this patch fixes the issue.
    
     root@pjy:~# modprobe test_bpf test_name="BPF_LDX_MEMSX | BPF_W"
     [  292.837436] test_bpf: test_bpf: set 'test_bpf' as the default test_suite.
     [  292.839416] test_bpf: #345 BPF_LDX_MEMSX | BPF_W jited:1 156 PASS
     [  292.844794] test_bpf: Summary: 1 PASSED, 0 FAILED, [1/1 JIT'ed]
    
    Fixes: cc88f540da52 ("bpf, arm64: Support sign-extension load instructions")
    Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
    Message-ID: <20240312235917.103626-1-puranjay12@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf, sockmap: Prevent lock inversion deadlock in map delete elem [+ + +]

Author: Jakub Sitnicki <jakub@cloudflare.com>
Date:   Tue Apr 2 12:46:21 2024 +0200

    bpf, sockmap: Prevent lock inversion deadlock in map delete elem
    
    commit ff91059932401894e6c86341915615c5eb0eca48 upstream.
    
    syzkaller started using corpuses where a BPF tracing program deletes
    elements from a sockmap/sockhash map. Because BPF tracing programs can be
    invoked from any interrupt context, locks taken during a map_delete_elem
    operation must be hardirq-safe. Otherwise a deadlock due to lock inversion
    is possible, as reported by lockdep:
    
           CPU0                    CPU1
           ----                    ----
      lock(&htab->buckets[i].lock);
                                   local_irq_disable();
                                   lock(&host->lock);
                                   lock(&htab->buckets[i].lock);
      <Interrupt>
        lock(&host->lock);
    
    Locks in sockmap are hardirq-unsafe by design. We expects elements to be
    deleted from sockmap/sockhash only in task (normal) context with interrupts
    enabled, or in softirq context.
    
    Detect when map_delete_elem operation is invoked from a context which is
    _not_ hardirq-unsafe, that is interrupts are disabled, and bail out with an
    error.
    
    Note that map updates are not affected by this issue. BPF verifier does not
    allow updating sockmap/sockhash from a BPF tracing program today.
    
    Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
    Reported-by: xingwei lee <xrivendell7@gmail.com>
    Reported-by: yue sun <samsun1006219@gmail.com>
    Reported-by: syzbot+bc922f476bd65abbd466@syzkaller.appspotmail.com
    Reported-by: syzbot+d4066896495db380182e@syzkaller.appspotmail.com
    Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: syzbot+d4066896495db380182e@syzkaller.appspotmail.com
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Closes: https://syzkaller.appspot.com/bug?extid=d4066896495db380182e
    Closes: https://syzkaller.appspot.com/bug?extid=bc922f476bd65abbd466
    Link: https://lore.kernel.org/bpf/20240402104621.1050319-1-jakub@cloudflare.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: Protect against int overflow for stack access size [+ + +]

Author: Andrei Matei <andreimatei1@gmail.com>
Date:   Tue Mar 26 22:42:45 2024 -0400

    bpf: Protect against int overflow for stack access size
    
    [ Upstream commit ecc6a2101840177e57c925c102d2d29f260d37c8 ]
    
    This patch re-introduces protection against the size of access to stack
    memory being negative; the access size can appear negative as a result
    of overflowing its signed int representation. This should not actually
    happen, as there are other protections along the way, but we should
    protect against it anyway. One code path was missing such protections
    (fixed in the previous patch in the series), causing out-of-bounds array
    accesses in check_stack_range_initialized(). This patch causes the
    verification of a program with such a non-sensical access size to fail.
    
    This check used to exist in a more indirect way, but was inadvertendly
    removed in a833a17aeac7.
    
    Fixes: a833a17aeac7 ("bpf: Fix verification of indirect var-off stack access")
    Reported-by: syzbot+33f4297b5f927648741a@syzkaller.appspotmail.com
    Reported-by: syzbot+aafd0513053a1cbf52ef@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/bpf/CAADnVQLORV5PT0iTAhRER+iLBTkByCYNBYyvBSgjN1T31K+gOw@mail.gmail.com/
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
    Link: https://lore.kernel.org/r/20240327024245.318299-3-andreimatei1@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: put uprobe link's path and task in release callback [+ + +]

Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Mar 27 22:24:25 2024 -0700

    bpf: put uprobe link's path and task in release callback
    
    commit e9c856cabefb71d47b2eeb197f72c9c88e9b45b0 upstream.
    
    There is no need to delay putting either path or task to deallocation
    step. It can be done right after bpf_uprobe_unregister. Between release
    and dealloc, there could be still some running BPF programs, but they
    don't access either task or path, only data in link->uprobes, so it is
    safe to do.
    
    On the other hand, doing path_put() in dealloc callback makes this
    dealloc sleepable because path_put() itself might sleep. Which is
    problematic due to the need to call uprobe's dealloc through call_rcu(),
    which is what is done in the next bug fix patch. So solve the problem by
    releasing these resources early.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240328052426.3042617-1-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bpf: support deferring bpf_link dealloc to after RCU grace period [+ + +]

Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Mar 27 22:24:26 2024 -0700

    bpf: support deferring bpf_link dealloc to after RCU grace period
    
    commit 1a80dbcb2dbaf6e4c216e62e30fa7d3daa8001ce upstream.
    
    BPF link for some program types is passed as a "context" which can be
    used by those BPF programs to look up additional information. E.g., for
    multi-kprobes and multi-uprobes, link is used to fetch BPF cookie values.
    
    Because of this runtime dependency, when bpf_link refcnt drops to zero
    there could still be active BPF programs running accessing link data.
    
    This patch adds generic support to defer bpf_link dealloc callback to
    after RCU GP, if requested. This is done by exposing two different
    deallocation callbacks, one synchronous and one deferred. If deferred
    one is provided, bpf_link_free() will schedule dealloc_deferred()
    callback to happen after RCU GP.
    
    BPF is using two flavors of RCU: "classic" non-sleepable one and RCU
    tasks trace one. The latter is used when sleepable BPF programs are
    used. bpf_link_free() accommodates that by checking underlying BPF
    program's sleepable flag, and goes either through normal RCU GP only for
    non-sleepable, or through RCU tasks trace GP *and* then normal RCU GP
    (taking into account rcu_trace_implies_rcu_gp() optimization), if BPF
    program is sleepable.
    
    We use this for multi-kprobe and multi-uprobe links, which dereference
    link during program run. We also preventively switch raw_tp link to use
    deferred dealloc callback, as upcoming changes in bpf-next tree expose
    raw_tp link data (specifically, cookie value) to BPF program at runtime
    as well.
    
    Fixes: 0dcac2725406 ("bpf: Add multi kprobe link")
    Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link")
    Reported-by: syzbot+981935d9485a560bfbcb@syzkaller.appspotmail.com
    Reported-by: syzbot+2cb5a6c573e98db598cc@syzkaller.appspotmail.com
    Reported-by: syzbot+62d8b26793e8a2bd0516@syzkaller.appspotmail.com
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20240328052426.3042617-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: ensure fiemap doesn't race with writes when FIEMAP_FLAG_SYNC is given [+ + +]

Author: Filipe Manana <fdmanana@suse.com>
Date:   Thu Feb 22 12:29:34 2024 +0000

    btrfs: ensure fiemap doesn't race with writes when FIEMAP_FLAG_SYNC is given
    
    [ Upstream commit 418b09027743d9a9fb39116bed46a192f868a3c3 ]
    
    When FIEMAP_FLAG_SYNC is given to fiemap the expectation is that that
    are no concurrent writes and we get a stable view of the inode's extent
    layout.
    
    When the flag is given we flush all IO (and wait for ordered extents to
    complete) and then lock the inode in shared mode, however that leaves open
    the possibility that a write might happen right after the flushing and
    before locking the inode. So fix this by flushing again after locking the
    inode - we leave the initial flushing before locking the inode to avoid
    holding the lock and blocking other RO operations while waiting for IO
    and ordered extents to complete. The second flushing while holding the
    inode's lock will most of the time do nothing or very little since the
    time window for new writes to have happened is small.
    
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Stable-dep-of: 978b63f7464a ("btrfs: fix race when detecting delalloc ranges during fiemap")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: fix race when detecting delalloc ranges during fiemap [+ + +]

Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Feb 28 11:37:56 2024 +0000

    btrfs: fix race when detecting delalloc ranges during fiemap
    
    [ Upstream commit 978b63f7464abcfd364a6c95f734282c50f3decf ]
    
    For fiemap we recently stopped locking the target extent range for the
    whole duration of the fiemap call, in order to avoid a deadlock in a
    scenario where the fiemap buffer happens to be a memory mapped range of
    the same file. This use case is very unlikely to be useful in practice but
    it may be triggered by fuzz testing (syzbot, etc).
    
    This however introduced a race that makes us miss delalloc ranges for
    file regions that are currently holes, so the caller of fiemap will not
    be aware that there's data for some file regions. This can be quite
    serious for some use cases - for example in coreutils versions before 9.0,
    the cp program used fiemap to detect holes and data in the source file,
    copying only regions with data (extents or delalloc) from the source file
    to the destination file in order to preserve holes (see the documentation
    for its --sparse command line option). This means that if cp was used
    with a source file that had delalloc in a hole, the destination file could
    end up without that data, which is effectively a data loss issue, if it
    happened to hit the race described below.
    
    The race happens like this:
    
    1) Fiemap is called, without the FIEMAP_FLAG_SYNC flag, for a file that
       has delalloc in the file range [64M, 65M[, which is currently a hole;
    
    2) Fiemap locks the inode in shared mode, then starts iterating the
       inode's subvolume tree searching for file extent items, without having
       the whole fiemap target range locked in the inode's io tree - the
       change introduced recently by commit b0ad381fa769 ("btrfs: fix
       deadlock with fiemap and extent locking"). It only locks ranges in
       the io tree when it finds a hole or prealloc extent since that
       commit;
    
    3) Note that fiemap clones each leaf before using it, and this is to
       avoid deadlocks when locking a file range in the inode's io tree and
       the fiemap buffer is memory mapped to some file, because writing
       to the page with btrfs_page_mkwrite() will wait on any ordered extent
       for the page's range and the ordered extent needs to lock the range
       and may need to modify the same leaf, therefore leading to a deadlock
       on the leaf;
    
    4) While iterating the file extent items in the cloned leaf before
       finding the hole in the range [64M, 65M[, the delalloc in that range
       is flushed and its ordered extent completes - meaning the corresponding
       file extent item is in the inode's subvolume tree, but not present in
       the cloned leaf that fiemap is iterating over;
    
    5) When fiemap finds the hole in the [64M, 65M[ range by seeing the gap in
       the cloned leaf (or a file extent item with disk_bytenr == 0 in case
       the NO_HOLES feature is not enabled), it will lock that file range in
       the inode's io tree and then search for delalloc by checking for the
       EXTENT_DELALLOC bit in the io tree for that range and ordered extents
       (with btrfs_find_delalloc_in_range()). But it finds nothing since the
       delalloc in that range was already flushed and the ordered extent
       completed and is gone - as a result fiemap will not report that there's
       delalloc or an extent for the range [64M, 65M[, so user space will be
       mislead into thinking that there's a hole in that range.
    
    This could actually be sporadically triggered with test case generic/094
    from fstests, which reports a missing extent/delalloc range like this:
    
    #  generic/094 2s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/094.out.bad)
    #      --- tests/generic/094.out        2020-06-10 19:29:03.830519425 +0100
    #      +++ /home/fdmanana/git/hub/xfstests/results//generic/094.out.bad 2024-02-28 11:00:00.381071525 +0000
    #      @@ -1,3 +1,9 @@
    #       QA output created by 094
    #       fiemap run with sync
    #       fiemap run without sync
    #      +ERROR: couldn't find extent at 7
    #      +map is 'HHDDHPPDPHPH'
    #      +logical: [       5..       6] phys:   301517..  301518 flags: 0x800 tot: 2
    #      +logical: [       8..       8] phys:   301520..  301520 flags: 0x800 tot: 1
    #      ...
    #      (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/094.out /home/fdmanana/git/hub/xfstests/results//generic/094.out.bad'  to see the entire diff)
    
    So in order to fix this, while still avoiding deadlocks in the case where
    the fiemap buffer is memory mapped to the same file, change fiemap to work
    like the following:
    
    1) Always lock the whole range in the inode's io tree before starting to
       iterate the inode's subvolume tree searching for file extent items,
       just like we did before commit b0ad381fa769 ("btrfs: fix deadlock with
       fiemap and extent locking");
    
    2) Now instead of writing to the fiemap buffer every time we have an extent
       to report, write instead to a temporary buffer (1 page), and when that
       buffer becomes full, stop iterating the file extent items, unlock the
       range in the io tree, release the search path, submit all the entries
       kept in that buffer to the fiemap buffer, and then resume the search
       for file extent items after locking again the remainder of the range in
       the io tree.
    
       The buffer having a size of a page, allows for 146 entries in a system
       with 4K pages. This is a large enough value to have a good performance
       by avoiding too many restarts of the search for file extent items.
       In other words this preserves the huge performance gains made in the
       last two years to fiemap, while avoiding the deadlocks in case the
       fiemap buffer is memory mapped to the same file (useless in practice,
       but possible and exercised by fuzz testing and syzbot).
    
    Fixes: b0ad381fa769 ("btrfs: fix deadlock with fiemap and extent locking")
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: Fix caching to try to do open O_WRONLY as rdwr on server [+ + +]

Author: David Howells <dhowells@redhat.com>
Date:   Tue Apr 2 10:11:35 2024 +0100

    cifs: Fix caching to try to do open O_WRONLY as rdwr on server
    
    [ Upstream commit e9e62243a3e2322cf639f653a0b0a88a76446ce7 ]
    
    When we're engaged in local caching of a cifs filesystem, we cannot perform
    caching of a partially written cache granule unless we can read the rest of
    the granule.  This can result in unexpected access errors being reported to
    the user.
    
    Fix this by the following: if a file is opened O_WRONLY locally, but the
    mount was given the "-o fsc" flag, try first opening the remote file with
    GENERIC_READ|GENERIC_WRITE and if that returns -EACCES, try dropping the
    GENERIC_READ and doing the open again.  If that last succeeds, invalidate
    the cache for that file as for O_DIRECT.
    
    Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite")
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Steve French <sfrench@samba.org>
    cc: Shyam Prasad N <nspmangalore@gmail.com>
    cc: Rohith Surabattula <rohiths.msft@gmail.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: linux-cifs@vger.kernel.org
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cifs: Fix duplicate fscache cookie warnings [+ + +]

Author: David Howells <dhowells@redhat.com>
Date:   Wed Mar 27 14:13:24 2024 +0000

    cifs: Fix duplicate fscache cookie warnings
    
    [ Upstream commit 8876a37277cb832e1861c35f8c661825179f73f5 ]
    
    fscache emits a lot of duplicate cookie warnings with cifs because the
    index key for the fscache cookies does not include everything that the
    cifs_find_inode() function does.  The latter is used with iget5_locked() to
    distinguish between inodes in the local inode cache.
    
    Fix this by adding the creation time and file type to the fscache cookie
    key.
    
    Additionally, add a couple of comments to note that if one is changed the
    other must be also.
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite")
    cc: Shyam Prasad N <nspmangalore@gmail.com>
    cc: Rohith Surabattula <rohiths.msft@gmail.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: linux-cifs@vger.kernel.org
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dm integrity: fix out-of-range warning [+ + +]

Author: Arnd Bergmann <arnd@arndb.de>
Date:   Thu Mar 28 15:30:39 2024 +0100

    dm integrity: fix out-of-range warning
    
    [ Upstream commit 8e91c2342351e0f5ef6c0a704384a7f6fc70c3b2 ]
    
    Depending on the value of CONFIG_HZ, clang complains about a pointless
    comparison:
    
    drivers/md/dm-integrity.c:4085:12: error: result of comparison of
                            constant 42949672950 with expression of type
                            'unsigned int' is always false
                            [-Werror,-Wtautological-constant-out-of-range-compare]
                            if (val >= (uint64_t)UINT_MAX * 1000 / HZ) {
    
    As the check remains useful for other configurations, shut up the
    warning by adding a second type cast to uint64_t.
    
    Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
    Reviewed-by: Justin Stitt <justinstitt@google.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

dma-buf: Fix NULL pointer dereference in sanitycheck() [+ + +]

Author: Pavel Sakharov <p.sakharov@ispras.ru>
Date:   Wed Mar 20 04:15:23 2024 +0500

    dma-buf: Fix NULL pointer dereference in sanitycheck()
    
    [ Upstream commit 2295bd846765c766701e666ed2e4b35396be25e6 ]
    
    If due to a memory allocation failure mock_chain() returns NULL, it is
    passed to dma_fence_enable_sw_signaling() resulting in NULL pointer
    dereference there.
    
    Call dma_fence_enable_sw_signaling() only if mock_chain() succeeds.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: d62c43a953ce ("dma-buf: Enable signaling on fence for selftests")
    Signed-off-by: Pavel Sakharov <p.sakharov@ispras.ru>
    Reviewed-by: Christian Kц╤nig <christian.koenig@amd.com>
    Signed-off-by: Christian Kц╤nig <christian.koenig@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240319231527.1821372-1-p.sakharov@ispras.ru
    Signed-off-by: Sasha Levin <sashal@kernel.org>

driver core: Introduce device_link_wait_removal() [+ + +]

Author: Herve Codina <herve.codina@bootlin.com>
Date:   Mon Mar 25 16:21:25 2024 +0100

    driver core: Introduce device_link_wait_removal()
    
    commit 0462c56c290a99a7f03e817ae5b843116dfb575c upstream.
    
    The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
    introduces a workqueue to release the consumer and supplier devices used
    in the devlink.
    In the job queued, devices are release and in turn, when all the
    references to these devices are dropped, the release function of the
    device itself is called.
    
    Nothing is present to provide some synchronisation with this workqueue
    in order to ensure that all ongoing releasing operations are done and
    so, some other operations can be started safely.
    
    For instance, in the following sequence:
      1) of_platform_depopulate()
      2) of_overlay_remove()
    
    During the step 1, devices are released and related devlinks are removed
    (jobs pushed in the workqueue).
    During the step 2, OF nodes are destroyed but, without any
    synchronisation with devlink removal jobs, of_overlay_remove() can raise
    warnings related to missing of_node_put():
      ERROR: memory leak, expected refcount 1 instead of 2
    
    Indeed, the missing of_node_put() call is going to be done, too late,
    from the workqueue job execution.
    
    Introduce device_link_wait_removal() to offer a way to synchronize
    operations waiting for the end of devlink removals (i.e. end of
    workqueue jobs).
    Also, as a flushing operation is done on the workqueue, the workqueue
    used is moved from a system-wide workqueue to a local one.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Herve Codina <herve.codina@bootlin.com>
    Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
    Reviewed-by: Nuno Sa <nuno.sa@analog.com>
    Reviewed-by: Saravana Kannan <saravanak@google.com>
    Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com
    Signed-off-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported [+ + +]

Author: Pu Lehui <pulehui@huawei.com>
Date:   Tue Mar 12 01:20:53 2024 +0000

    drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported
    
    [ Upstream commit ea6873118493019474abbf57d5a800da365734df ]
    
    RISC-V perf driver does not yet support branch sampling. Although the
    specification is in the works [0], it is best to disable such events
    until support is available, otherwise we will get unexpected results.
    Due to this reason, two riscv bpf testcases get_branch_snapshot and
    perf_branches/perf_branches_hw fail.
    
    Link: https://github.com/riscv/riscv-control-transfer-records [0]
    Fixes: f5bfa23f576f ("RISC-V: Add a perf core library for pmu drivers")
    Signed-off-by: Pu Lehui <pulehui@huawei.com>
    Reviewed-by: Atish Patra <atishp@rivosinc.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20240312012053.1178140-1-pulehui@huaweicloud.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Fix DPSTREAM CLK on and off sequence [+ + +]

Author: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Date:   Wed Jan 17 16:46:02 2024 -0500

    drm/amd/display: Fix DPSTREAM CLK on and off sequence
    
    [ Upstream commit e8d131285c98927554cd007f47cedc4694bfedde ]
    
    [Why]
    Secondary DP2 display fails to light up in some instances
    
    [How]
    Clock needs to be on when DPSTREAMCLK*_EN =1. This change
    moves dtbclk_p enable/disable point to make sure this is
    the case
    
    Reviewed-by: Charlene Liu <charlene.liu@amd.com>
    Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
    Acked-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Daniel Miess <daniel.miess@amd.com>
    Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Stable-dep-of: 72d72e8fddbc ("drm/amd/display: Prevent crash when disable stream")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd/display: Prevent crash when disable stream [+ + +]

Author: Chris Park <chris.park@amd.com>
Date:   Tue Mar 5 17:41:15 2024 -0500

    drm/amd/display: Prevent crash when disable stream
    
    [ Upstream commit 72d72e8fddbcd6c98e1b02d32cf6f2b04e10bd1c ]
    
    [Why]
    Disabling stream encoder invokes a function that no longer exists.
    
    [How]
    Check if the function declaration is NULL in disable stream encoder.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Charlene Liu <charlene.liu@amd.com>
    Acked-by: Wayne Lin <wayne.lin@amd.com>
    Signed-off-by: Chris Park <chris.park@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd: Add concept of running prepare_suspend() sequence for IP blocks [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Fri Oct 6 13:50:21 2023 -0500

    drm/amd: Add concept of running prepare_suspend() sequence for IP blocks
    
    [ Upstream commit cb11ca3233aa3303dc11dca25977d2e7f24be00f ]
    
    If any IP blocks allocate memory during their hw_fini() sequence
    this can cause the suspend to fail under memory pressure.  Introduce
    a new phase that IP blocks can use to allocate memory before suspend
    starts so that it can potentially be evicted into swap instead.
    
    Reviewed-by: Christian Kц╤nig <christian.koenig@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Stable-dep-of: ca299b4512d4 ("drm/amd: Flush GFXOFF requests in prepare stage")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd: Evict resources during PM ops prepare() callback [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Fri Oct 6 13:50:20 2023 -0500

    drm/amd: Evict resources during PM ops prepare() callback
    
    [ Upstream commit 5095d5418193eb2748c7d8553c7150b8f1c44696 ]
    
    Linux PM core has a prepare() callback run before suspend.
    
    If the system is under high memory pressure, the resources may need
    to be evicted into swap instead.  If the storage backing for swap
    is offlined during the suspend() step then such a call may fail.
    
    So move this step into prepare() to move evict majority of
    resources and update all non-pmops callers to call the same callback.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
    Reviewed-by: Christian Kц╤nig <christian.koenig@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Stable-dep-of: ca299b4512d4 ("drm/amd: Flush GFXOFF requests in prepare stage")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/amd: Flush GFXOFF requests in prepare stage [+ + +]

Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Wed Mar 20 13:32:21 2024 -0500

    drm/amd: Flush GFXOFF requests in prepare stage
    
    [ Upstream commit ca299b4512d4b4f516732a48ce9aa19d91f4473e ]
    
    If the system hasn't entered GFXOFF when suspend starts it can cause
    hangs accessing GC and RLC during the suspend stage.
    
    Cc: <stable@vger.kernel.org> # 6.1.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback")
    Cc: <stable@vger.kernel.org> # 6.1.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks")
    Cc: <stable@vger.kernel.org> # 6.1.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()")
    Cc: <stable@vger.kernel.org> # 6.1.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend")
    Cc: <stable@vger.kernel.org> # 6.6.y: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback")
    Cc: <stable@vger.kernel.org> # 6.6.y: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks")
    Cc: <stable@vger.kernel.org> # 6.6.y: 2ceec37b0e3d ("drm/amd: Add missing kernel doc for prepare_suspend()")
    Cc: <stable@vger.kernel.org> # 6.6.y: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend")
    Cc: <stable@vger.kernel.org> # 6.1+
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132
    Fixes: ab4750332dbe ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/dg2: Drop pre-production GT workarounds [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Wed Aug 16 14:42:05 2023 -0700

    drm/i915/dg2: Drop pre-production GT workarounds
    
    [ Upstream commit eaeb4b3614529bfa8a7edfdd7ecf6977b27f18b2 ]
    
    DG2 first production steppings were C0 (for DG2-G10), B1 (for DG2-G11),
    and A1 (for DG2-G12).  Several workarounds that apply onto to
    pre-production hardware can be dropped.  Furthermore, several
    workarounds that apply to all production steppings can have their
    conditions simplified to no longer check the GT stepping.
    
    v2:
     - Keep Wa_16011777198 in place for now; it will be removed separately
       in a follow-up patch to keep review easier.
    
    Bspec: 44477
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Acked-by: Jani Nikula <jani.nikula@intel.com>
    Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230816214201.534095-10-matthew.d.roper@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/display: Use i915_gem_object_get_dma_address to get dma address [+ + +]

Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Wed Oct 25 12:11:31 2023 +0200

    drm/i915/display: Use i915_gem_object_get_dma_address to get dma address
    
    [ Upstream commit 7054b551de18e9875fbdf8d4f3baade428353545 ]
    
    Works better for xe like that. obj is no longer const.
    
    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231204134946.16219-1-maarten.lankhorst@linux.intel.com
    Reviewed-by: Jouni Hц╤gander <jouni.hogander@intel.com>
    Stable-dep-of: 582dc04b0658 ("drm/i915: Pre-populate the cursor physical dma address")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/gt: Disable HW load balancing for CCS [+ + +]

Author: Andi Shyti <andi.shyti@linux.intel.com>
Date:   Thu Mar 28 08:34:03 2024 +0100

    drm/i915/gt: Disable HW load balancing for CCS
    
    commit bc9a1ec01289e6e7259dc5030b413a9c6654a99a upstream.
    
    The hardware should not dynamically balance the load between CCS
    engines. Wa_14019159160 recommends disabling it across all
    platforms.
    
    Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: <stable@vger.kernel.org> # v6.2+
    Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
    Acked-by: Michal Mrozek <michal.mrozek@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-2-andi.shyti@linux.intel.com
    (cherry picked from commit f5d2904cf814f20b79e3e4c1b24a4ccc2411b7e0)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/gt: Do not generate the command streamer for all the CCS [+ + +]

Author: Andi Shyti <andi.shyti@linux.intel.com>
Date:   Thu Mar 28 08:34:04 2024 +0100

    drm/i915/gt: Do not generate the command streamer for all the CCS
    
    commit ea315f98e5d6d3191b74beb0c3e5fc16081d517c upstream.
    
    We want a fixed load CCS balancing consisting in all slices
    sharing one single user engine. For this reason do not create the
    intel_engine_cs structure with its dedicated command streamer for
    CCS slices beyond the first.
    
    Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: <stable@vger.kernel.org> # v6.2+
    Acked-by: Michal Mrozek <michal.mrozek@intel.com>
    Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com
    (cherry picked from commit c7a5aa4e57f88470313a8277eb299b221b86e3b1)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/gt: Enable only one CCS for compute workload [+ + +]

Author: Andi Shyti <andi.shyti@linux.intel.com>
Date:   Thu Mar 28 08:34:05 2024 +0100

    drm/i915/gt: Enable only one CCS for compute workload
    
    commit 6db31251bb265813994bfb104eb4b4d0f44d64fb upstream.
    
    Enable only one CCS engine by default with all the compute sices
    allocated to it.
    
    While generating the list of UABI engines to be exposed to the
    user, exclude any additional CCS engines beyond the first
    instance.
    
    This change can be tested with igt i915_query.
    
    Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Matt Roper <matthew.d.roper@intel.com>
    Cc: <stable@vger.kernel.org> # v6.2+
    Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
    Acked-by: Michal Mrozek <michal.mrozek@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-4-andi.shyti@linux.intel.com
    (cherry picked from commit 2bebae0112b117de7e8a7289277a4bd2403b9e17)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/mtl: Update workaround 14016712196 [+ + +]

Author: Tejas Upadhyay <tejas.upadhyay@intel.com>
Date:   Mon Aug 28 12:04:50 2023 +0530

    drm/i915/mtl: Update workaround 14016712196
    
    [ Upstream commit 7467e1da906468bcbd311023b30708193103ecf9 ]
    
    Now this workaround is permanent workaround on MTL and DG2,
    earlier we used to apply on MTL A0 step only.
    VLK-45480
    
    Fixes: d922b80b1010 ("drm/i915/gt: Add workaround 14016712196")
    Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
    Acked-by: Nirmoy Das <nirmoy.das@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230828063450.2642748-1-tejas.upadhyay@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/mtl: Update workaround 14018575942 [+ + +]

Author: Tejas Upadhyay <tejas.upadhyay@intel.com>
Date:   Wed Feb 28 16:07:38 2024 +0530

    drm/i915/mtl: Update workaround 14018575942
    
    [ Upstream commit 186bce682772e7346bf7ced5325b5f4ff050ccfb ]
    
    Applying WA 14018575942 only on Compute engine has impact on
    some apps like chrome. Updating this WA to apply on Render
    engine as well as it is helping with performance on Chrome.
    
    Note: There is no concern from media team thus not applying
    WA on media engines. We will revisit if any issues reported
    from media team.
    
    V2(Matt):
     - Use correct WA number
    
    Fixes: 668f37e1ee11 ("drm/i915/mtl: Update workaround 14018778641")
    Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
    Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240228103738.2018458-1-tejas.upadhyay@intel.com
    (cherry picked from commit 71271280175aa0ed6673e40cce7c01296bcd05f6)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/xelpg: Call Xe_LPG workaround functions based on IP version [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Mon Aug 21 11:06:23 2023 -0700

    drm/i915/xelpg: Call Xe_LPG workaround functions based on IP version
    
    [ Upstream commit f7696ded7c9e358670dae1801660f442f059c7db ]
    
    Although some of our Xe_LPG workarounds were already being applied based
    on IP version correctly, others were matching on MTL as a base platform,
    which is incorrect.  Although MTL is the only platform right now that
    uses Xe_LPG IP, this may not always be the case.  If a future platform
    re-uses this graphics IP, the same workarounds should be applied, even
    if it isn't a "MTL" platform.
    
    We were also incorrectly applying Xe_LPG workarounds/tuning to the
    Xe_LPM+ media IP in one or two places; we should make sure that we don't
    try to apply graphics workarounds to the media GT and vice versa where
    they don't belong.  A new helper macro IS_GT_IP_RANGE() is added to help
    ensure this is handled properly -- it checks that the GT matches the IP
    type being tested as well as the IP version falling in the proper range.
    
    Note that many of the stepping-based workarounds are still incorrectly
    checking for a MTL base platform; that will be remedied in a later
    patch.
    
    v2:
     - Rework macro into a slightly more generic IS_GT_IP_RANGE() that can
       be used for either GFX or MEDIA checks.
    
    v3:
     - Switch back to separate macros for gfx and media.  (Jani)
     - Move macro to intel_gt.h.  (Andi)
    
    Cc: Gustavo Sousa <gustavo.sousa@intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Cc: Andi Shyti <andi.shyti@linux.intel.com>
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-14-matthew.d.roper@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/xelpg: Extend some workarounds/tuning to gfx version 12.74 [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Mon Jan 8 17:57:38 2024 +0530

    drm/i915/xelpg: Extend some workarounds/tuning to gfx version 12.74
    
    [ Upstream commit c44d4ef47fdad0a33966de89f9064e19736bb52f ]
    
    Some of our existing Xe_LPG workarounds and tuning are also applicable
    to the version 12.74 variant.  Extend the condition bounds accordingly.
    Also fix the comment on Wa_14018575942 while we're at it.
    
    v2: Extend some more workarounds (Harish)
    
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
    Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com>
    Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240108122738.14399-4-haridhar.kalvala@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Consolidate condition for Wa_22011802037 [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Mon Aug 21 11:06:21 2023 -0700

    drm/i915: Consolidate condition for Wa_22011802037
    
    [ Upstream commit 28c46feec7f8760683ef08f12746630a3598173e ]
    
    The workaround bounds for Wa_22011802037 are somewhat complex and are
    replicated in several places throughout the code.  Pull the condition
    out to a helper function to prevent mistakes if this condition needs to
    change again in the future.
    
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-12-matthew.d.roper@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Eliminate IS_MTL_GRAPHICS_STEP [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Mon Aug 21 11:06:24 2023 -0700

    drm/i915: Eliminate IS_MTL_GRAPHICS_STEP
    
    [ Upstream commit 5a213086a025349361b5cf75c8fd4591d96a7a99 ]
    
    Several workarounds are guarded by IS_MTL_GRAPHICS_STEP.  However none
    of these workarounds are actually tied to MTL as a platform; they only
    relate to the Xe_LPG graphics IP, regardless of what platform it appears
    in.  At the moment MTL is the only platform that uses Xe_LPG with IP
    versions 12.70 and 12.71, but we can't count on this being true in the
    future.  Switch these to use a new IS_GFX_GT_IP_STEP() macro instead
    that is purely based on IP version.  IS_GFX_GT_IP_STEP() is also
    GT-based rather than device-based, which will help prevent mistakes
    where we accidentally try to apply Xe_LPG graphics workarounds to the
    Xe_LPM+ media GT and vice-versa.
    
    v2:
     - Switch to a more generic and shorter IS_GT_IP_STEP macro that can be
       used for both graphics and media IP (and any other kind of GTs that
       show up in the future).
    v3:
     - Switch back to long-form IS_GFX_GT_IP_STEP macro.  (Jani)
     - Move macro to intel_gt.h.  (Andi)
    v4:
     - Build IS_GFX_GT_IP_STEP on top of IS_GFX_GT_IP_RANGE and
       IS_GRAPHICS_STEP building blocks and name the parameters from/until
       rather than begin/fixed.  (Jani)
     - Fix usage examples in comment.
    v5:
     - Tweak comment on macro.  (Gustavo)
    
    Cc: Gustavo Sousa <gustavo.sousa@intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
    Cc: Andi Shyti <andi.shyti@linux.intel.com>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-15-matthew.d.roper@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Pre-populate the cursor physical dma address [+ + +]

Author: Ville Syrjц╓lц╓ <ville.syrjala@linux.intel.com>
Date:   Mon Mar 25 19:57:38 2024 +0200

    drm/i915: Pre-populate the cursor physical dma address
    
    [ Upstream commit 582dc04b0658ef3b90aeb49cbdd9747c2f1eccc3 ]
    
    Calling i915_gem_object_get_dma_address() from the vblank
    evade critical section triggers might_sleep().
    
    While we know that we've already pinned the framebuffer
    and thus i915_gem_object_get_dma_address() will in fact
    not sleep in this case, it seems reasonable to keep the
    unconditional might_sleep() for maximum coverage.
    
    So let's instead pre-populate the dma address during
    fb pinning, which all happens before we enter the
    vblank evade critical section.
    
    We can use u32 for the dma address as this class of
    hardware doesn't support >32bit addresses.
    
    Cc: stable@vger.kernel.org
    Fixes: 0225a90981c8 ("drm/i915: Make cursor plane registers unlocked")
    Reported-by: Borislav Petkov <bp@alien8.de>
    Closes: https://lore.kernel.org/intel-gfx/20240227100342.GAZd2zfmYcPS_SndtO@fat_crate.local/
    Signed-off-by: Ville Syrjц╓lц╓ <ville.syrjala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240325175738.3440-1-ville.syrjala@linux.intel.com
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
    (cherry picked from commit c1289a5c3594cf04caa94ebf0edeb50c62009f1f)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Replace several IS_METEORLAKE with proper IP version checks [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Mon Aug 21 11:06:29 2023 -0700

    drm/i915: Replace several IS_METEORLAKE with proper IP version checks
    
    [ Upstream commit 14128d64090fa88445376cb8ccf91c50c08bd410 ]
    
    Many of the IS_METEORLAKE conditions throughout the driver are supposed
    to be checks for Xe_LPG and/or Xe_LPM+ IP, not for the MTL platform
    specifically.  Update those checks to ensure that the code will still
    operate properly if/when these IP versions show up on future platforms.
    
    v2:
     - Update two more conditions (one for pg_enable, one for MTL HuC
       compatibility).
    v3:
     - Don't change GuC/HuC compatibility check, which sounds like it truly
       is specific to the MTL platform.  (Gustavo)
     - Drop a non-lineage workaround number for the OA timestamp frequency
       workaround.  (Gustavo)
    
    Cc: Gustavo Sousa <gustavo.sousa@intel.com>
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-20-matthew.d.roper@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Tidy workaround definitions [+ + +]

Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Wed Aug 16 14:42:06 2023 -0700

    drm/i915: Tidy workaround definitions
    
    [ Upstream commit f1c805716516f9e648e13f0108cea8096e0c7023 ]
    
    Removal of the DG2 pre-production workarounds has left duplicate
    condition blocks in a couple places, as well as some inconsistent
    platform ordering.  Reshuffle and consolidate some of the workarounds to
    reduce the number of condition blocks and to more consistently follow
    the "newest platform first" convention.  Code movement only; no
    functional change.
    
    Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
    Acked-by: Jani Nikula <jani.nikula@intel.com>
    Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230816214201.534095-11-matthew.d.roper@intel.com
    Stable-dep-of: 186bce682772 ("drm/i915/mtl: Update workaround 14018575942")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/panfrost: fix power transition timeout warnings [+ + +]

Author: Christian Hewitt <christianshewitt@gmail.com>
Date:   Fri Mar 22 16:45:25 2024 +0000

    drm/panfrost: fix power transition timeout warnings
    
    [ Upstream commit 2bd02f5a0bac4bb13e0da18652dc75ba0e4958ec ]
    
    Increase the timeout value to prevent system logs on Amlogic boards flooding
    with power transition warnings:
    
    [   13.047638] panfrost ffe40000.gpu: shader power transition timeout
    [   13.048674] panfrost ffe40000.gpu: l2 power transition timeout
    [   13.937324] panfrost ffe40000.gpu: shader power transition timeout
    [   13.938351] panfrost ffe40000.gpu: l2 power transition timeout
    ...
    [39829.506904] panfrost ffe40000.gpu: shader power transition timeout
    [39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
    [39949.508369] panfrost ffe40000.gpu: shader power transition timeout
    [39949.509405] panfrost ffe40000.gpu: l2 power transition timeout
    
    The 2000 value has been found through trial and error testing with devices
    using G52 and G31 GPUs.
    
    Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
    Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
    Reviewed-by: Steven Price <steven.price@arm.com>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Signed-off-by: Steven Price <steven.price@arm.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240322164525.2617508-1-christianshewitt@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/prime: Unbreak virtgpu dma-buf export [+ + +]

Author: Rob Clark <robdclark@chromium.org>
Date:   Fri Mar 22 14:48:01 2024 -0700

    drm/prime: Unbreak virtgpu dma-buf export
    
    [ Upstream commit a4ec240f6b7c21cf846d10017c3ce423a0eae92c ]
    
    virtgpu "vram" GEM objects do not implement obj->get_sg_table().  But
    they also don't use drm_gem_map_dma_buf().  In fact they may not even
    have guest visible pages.  But it is perfectly fine to export and share
    with other virtual devices.
    
    Reported-by: Dominik Behr <dbehr@chromium.org>
    Fixes: 207395da5a97 ("drm/prime: reject DMA-BUF attach when get_sg_table is missing")
    Signed-off-by: Rob Clark <robdclark@chromium.org>
    Reviewed-by: Simon Ser <contact@emersion.fr>
    Signed-off-by: Simon Ser <contact@emersion.fr>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240322214801.319975-1-robdclark@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

e1000e: Minor flow correction in e1000_shutdown function [+ + +]

Author: Vitaly Lifshits <vitaly.lifshits@intel.com>
Date:   Fri Mar 1 10:48:05 2024 -0800

    e1000e: Minor flow correction in e1000_shutdown function
    
    [ Upstream commit 662200e324daebe6859c1f0f3ea1538b0561425a ]
    
    Add curly braces to avoid entering to an if statement where it is not
    always required in e1000_shutdown function.
    This improves code readability and might prevent non-deterministic
    behaviour in the future.
    
    Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Link: https://lore.kernel.org/r/20240301184806.2634508-5-anthony.l.nguyen@intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue [+ + +]

Author: Vitaly Lifshits <vitaly.lifshits@intel.com>
Date:   Sun Mar 3 12:51:32 2024 +0200

    e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue
    
    [ Upstream commit 861e8086029e003305750b4126ecd6617465f5c7 ]
    
    Forcing SMBUS inside the ULP enabling flow leads to sporadic PHY loss on
    some systems. It is suspected to be caused by initiating PHY transactions
    before the interface settles.
    
    Separating this configuration from the ULP enabling flow and moving it to
    the shutdown function allows enough time for the interface to settle and
    avoids adding a delay.
    
    Fixes: 6607c99e7034 ("e1000e: i219 - fix to enable both ULP and EEE in Sx state")
    Co-developed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
    Signed-off-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
    Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

e1000e: Workaround for sporadic MDI error on Meteor Lake systems [+ + +]

Author: Vitaly Lifshits <vitaly.lifshits@intel.com>
Date:   Thu Jan 4 16:16:52 2024 +0200

    e1000e: Workaround for sporadic MDI error on Meteor Lake systems
    
    [ Upstream commit 6dbdd4de0362c37e54e8b049781402e5a409e7d0 ]
    
    On some Meteor Lake systems accessing the PHY via the MDIO interface may
    result in an MDI error. This issue happens sporadically and in most cases
    a second access to the PHY via the MDIO interface results in success.
    
    As a workaround, introduce a retry counter which is set to 3 on Meteor
    Lake systems. The driver will only return an error if 3 consecutive PHY
    access attempts fail. The retry mechanism is disabled in specific flows,
    where MDI errors are expected.
    
    Fixes: cc23f4f0b6b9 ("e1000e: Add support for Meteor Lake")
    Suggested-by: Nikolay Mushayev <nikolay.mushayev@intel.com>
    Co-developed-by: Nir Efrati <nir.efrati@intel.com>
    Signed-off-by: Nir Efrati <nir.efrati@intel.com>
    Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

efi/libstub: Add generic support for parsing mem_encrypt= [+ + +]

Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Tue Feb 27 16:19:13 2024 +0100

    efi/libstub: Add generic support for parsing mem_encrypt=
    
    commit 7205f06e847422b66c1506eee01b9998ffc75d76 upstream.
    
    Parse the mem_encrypt= command line parameter from the EFI stub if
    CONFIG_ARCH_HAS_MEM_ENCRYPT=y, so that it can be passed to the early
    boot code by the arch code in the stub.
    
    This avoids the need for the core kernel to do any string parsing very
    early in the boot.
    
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20240227151907.387873-16-ardb+git@google.com
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

erspan: make sure erspan_base_hdr is present in skb->head [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Mar 28 11:22:48 2024 +0000

    erspan: make sure erspan_base_hdr is present in skb->head
    
    commit 17af420545a750f763025149fa7b833a4fc8b8f0 upstream.
    
    syzbot reported a problem in ip6erspan_rcv() [1]
    
    Issue is that ip6erspan_rcv() (and erspan_rcv()) no longer make
    sure erspan_base_hdr is present in skb linear part (skb->head)
    before getting @ver field from it.
    
    Add the missing pskb_may_pull() calls.
    
    v2: Reload iph pointer in erspan_rcv() after pskb_may_pull()
        because skb->head might have changed.
    
    [1]
    
     BUG: KMSAN: uninit-value in pskb_may_pull_reason include/linux/skbuff.h:2742 [inline]
     BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2756 [inline]
     BUG: KMSAN: uninit-value in ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline]
     BUG: KMSAN: uninit-value in gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610
      pskb_may_pull_reason include/linux/skbuff.h:2742 [inline]
      pskb_may_pull include/linux/skbuff.h:2756 [inline]
      ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline]
      gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610
      ip6_protocol_deliver_rcu+0x1d4c/0x2ca0 net/ipv6/ip6_input.c:438
      ip6_input_finish net/ipv6/ip6_input.c:483 [inline]
      NF_HOOK include/linux/netfilter.h:314 [inline]
      ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492
      ip6_mc_input+0xa7e/0xc80 net/ipv6/ip6_input.c:586
      dst_input include/net/dst.h:460 [inline]
      ip6_rcv_finish+0x955/0x970 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:314 [inline]
      ipv6_rcv+0xde/0x390 net/ipv6/ip6_input.c:310
      __netif_receive_skb_one_core net/core/dev.c:5538 [inline]
      __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5652
      netif_receive_skb_internal net/core/dev.c:5738 [inline]
      netif_receive_skb+0x58/0x660 net/core/dev.c:5798
      tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1549
      tun_get_user+0x5566/0x69e0 drivers/net/tun.c:2002
      tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
      call_write_iter include/linux/fs.h:2108 [inline]
      new_sync_write fs/read_write.c:497 [inline]
      vfs_write+0xb63/0x1520 fs/read_write.c:590
      ksys_write+0x20f/0x4c0 fs/read_write.c:643
      __do_sys_write fs/read_write.c:655 [inline]
      __se_sys_write fs/read_write.c:652 [inline]
      __x64_sys_write+0x93/0xe0 fs/read_write.c:652
     do_syscall_64+0xd5/0x1f0
     entry_SYSCALL_64_after_hwframe+0x6d/0x75
    
    Uninit was created at:
      slab_post_alloc_hook mm/slub.c:3804 [inline]
      slab_alloc_node mm/slub.c:3845 [inline]
      kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
      kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
      __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668
      alloc_skb include/linux/skbuff.h:1318 [inline]
      alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504
      sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795
      tun_alloc_skb drivers/net/tun.c:1525 [inline]
      tun_get_user+0x209a/0x69e0 drivers/net/tun.c:1846
      tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
      call_write_iter include/linux/fs.h:2108 [inline]
      new_sync_write fs/read_write.c:497 [inline]
      vfs_write+0xb63/0x1520 fs/read_write.c:590
      ksys_write+0x20f/0x4c0 fs/read_write.c:643
      __do_sys_write fs/read_write.c:655 [inline]
      __se_sys_write fs/read_write.c:652 [inline]
      __x64_sys_write+0x93/0xe0 fs/read_write.c:652
     do_syscall_64+0xd5/0x1f0
     entry_SYSCALL_64_after_hwframe+0x6d/0x75
    
    CPU: 1 PID: 5045 Comm: syz-executor114 Not tainted 6.9.0-rc1-syzkaller-00021-g962490525cff #0
    
    Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
    Reported-by: syzbot+1c1cf138518bf0c53d68@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/netdev/000000000000772f2c0614b66ef7@google.com/
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Lorenzo Bianconi <lorenzo@kernel.org>
    Link: https://lore.kernel.org/r/20240328112248.1101491-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() [+ + +]

Author: Jann Horn <jannh@google.com>
Date:   Fri Nov 24 16:08:22 2023 +0100

    fs/pipe: Fix lockdep false-positive in watchqueue pipe_write()
    
    [ Upstream commit 055ca83559912f2cfd91c9441427bac4caf3c74e ]
    
    When you try to splice between a normal pipe and a notification pipe,
    get_pipe_info(..., true) fails, so splice() falls back to treating the
    notification pipe like a normal pipe - so we end up in
    iter_file_splice_write(), which first locks the input pipe, then calls
    vfs_iter_write(), which locks the output pipe.
    
    Lockdep complains about that, because we're taking a pipe lock while
    already holding another pipe lock.
    
    I think this probably (?) can't actually lead to deadlocks, since you'd
    need another way to nest locking a normal pipe into locking a
    watch_queue pipe, but the lockdep annotations don't make that clear.
    
    Bail out earlier in pipe_write() for notification pipes, before taking
    the pipe lock.
    
    Reported-and-tested-by: <syzbot+011e4ea1da6692cf881c@syzkaller.appspotmail.com>
    Closes: https://syzkaller.appspot.com/bug?extid=011e4ea1da6692cf881c
    Fixes: c73be61cede5 ("pipe: Add general notification queue support")
    Signed-off-by: Jann Horn <jannh@google.com>
    Link: https://lore.kernel.org/r/20231124150822.2121798-1-jannh@google.com
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gpio: cdev: check for NULL labels when sanitizing them for irqs [+ + +]

Author: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Date:   Thu Apr 4 11:33:27 2024 +0200

    gpio: cdev: check for NULL labels when sanitizing them for irqs
    
    commit b3b95964590a3d756d69ea8604c856de805479ad upstream.
    
    We need to take into account that a line's consumer label may be NULL
    and not try to kstrdup() it in that case but rather pass the NULL
    pointer up the stack to the interrupt request function.
    
    To that end: let make_irq_label() return NULL as a valid return value
    and use ERR_PTR() instead to signal an allocation failure to callers.
    
    Cc: stable@vger.kernel.org
    Fixes: b34490879baa ("gpio: cdev: sanitize the label before requesting the interrupt")
    Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Closes: https://lore.kernel.org/lkml/20240402093534.212283-1-naresh.kamboju@linaro.org/
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Tested-by: Anders Roxell <anders.roxell@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gpio: cdev: fix missed label sanitizing in debounce_setup() [+ + +]

Author: Kent Gibson <warthog618@gmail.com>
Date:   Thu Apr 4 11:33:28 2024 +0200

    gpio: cdev: fix missed label sanitizing in debounce_setup()
    
    commit 83092341e15d0dfee1caa8dc502f66c815ccd78a upstream.
    
    When adding sanitization of the label, the path through
    edge_detector_setup() that leads to debounce_setup() was overlooked.
    A request taking this path does not allocate a new label and the
    request label is freed twice when the request is released, resulting
    in memory corruption.
    
    Add label sanitization to debounce_setup().
    
    Cc: stable@vger.kernel.org
    Fixes: b34490879baa ("gpio: cdev: sanitize the label before requesting the interrupt")
    Signed-off-by: Kent Gibson <warthog618@gmail.com>
    [Bartosz: rebased on top of the fix for empty GPIO labels]
    Co-developed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gpio: cdev: sanitize the label before requesting the interrupt [+ + +]

Author: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Date:   Mon Mar 25 10:02:42 2024 +0100

    gpio: cdev: sanitize the label before requesting the interrupt
    
    commit b34490879baa847d16fc529c8ea6e6d34f004b38 upstream.
    
    When an interrupt is requested, a procfs directory is created under
    "/proc/irq/<irqnum>/<label>" where <label> is the string passed to one of
    the request_irq() variants.
    
    What follows is that the string must not contain the "/" character or
    the procfs mkdir operation will fail. We don't have such constraints for
    GPIO consumer labels which are used verbatim as interrupt labels for
    GPIO irqs. We must therefore sanitize the consumer string before
    requesting the interrupt.
    
    Let's replace all "/" with ":".
    
    Cc: stable@vger.kernel.org
    Reported-by: Stefan Wahren <wahrenst@gmx.net>
    Closes: https://lore.kernel.org/linux-gpio/39fe95cb-aa83-4b8b-8cab-63947a726754@gmx.net/
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Reviewed-by: Kent Gibson <warthog618@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gro: fix ownership transfer [+ + +]

Author: Antoine Tenart <atenart@kernel.org>
Date:   Tue Mar 26 12:33:59 2024 +0100

    gro: fix ownership transfer
    
    commit ed4cccef64c1d0d5b91e69f7a8a6697c3a865486 upstream.
    
    If packets are GROed with fraglist they might be segmented later on and
    continue their journey in the stack. In skb_segment_list those skbs can
    be reused as-is. This is an issue as their destructor was removed in
    skb_gro_receive_list but not the reference to their socket, and then
    they can't be orphaned. Fix this by also removing the reference to the
    socket.
    
    For example this could be observed,
    
      kernel BUG at include/linux/skbuff.h:3131!  (skb_orphan)
      RIP: 0010:ip6_rcv_core+0x11bc/0x19a0
      Call Trace:
       ipv6_list_rcv+0x250/0x3f0
       __netif_receive_skb_list_core+0x49d/0x8f0
       netif_receive_skb_list_internal+0x634/0xd40
       napi_complete_done+0x1d2/0x7d0
       gro_cell_poll+0x118/0x1f0
    
    A similar construction is found in skb_gro_receive, apply the same
    change there.
    
    Fixes: 5e10da5385d2 ("skbuff: allow 'slow_gro' for skb carring sock reference")
    Signed-off-by: Antoine Tenart <atenart@kernel.org>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i40e: Enforce software interrupt during busy-poll exit [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Sat Mar 16 12:38:29 2024 +0100

    i40e: Enforce software interrupt during busy-poll exit
    
    [ Upstream commit ea558de7238bb12c3435c47f0631e9d17bf4a09f ]
    
    As for ice bug fixed by commit b7306b42beaf ("ice: manage interrupts
    during poll exit") followed by commit 23be7075b318 ("ice: fix software
    generating extra interrupts") I'm seeing the similar issue also with
    i40e driver.
    
    In certain situation when busy-loop is enabled together with adaptive
    coalescing, the driver occasionally misses that there are outstanding
    descriptors to clean when exiting busy poll.
    
    Try to catch the remaining work by triggering a software interrupt
    when exiting busy poll. No extra interrupts will be generated when
    busy polling is not used.
    
    The issue was found when running sockperf ping-pong tcp test with
    adaptive coalescing and busy poll enabled (50 as value busy_pool
    and busy_read sysctl knobs) and results in huge latency spikes
    with more than 100000us.
    
    The fix is inspired from the ice driver and do the following:
    1) During napi poll exit in case of busy-poll (napo_complete_done()
       returns false) this is recorded to q_vector that we were in busy
       loop.
    2) Extends i40e_buildreg_itr() to be able to add an enforced software
       interrupt into built value
    2) In i40e_update_enable_itr() enforces a software interrupt trigger
       if we are exiting busy poll to catch any pending clean-ups
    3) Reuses unused 3rd ITR (interrupt throttle) index and set it to
       20K interrupts per second to limit the number of these sw interrupts.
    
    Test results
    ============
    Prior:
    [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120
    sockperf: == version #3.10-no.git ==
    sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
    
    [ 0] IP = 10.9.9.1        PORT = 11111 # TCP
    sockperf: Warmup stage (sending a few dummy messages)...
    sockperf: Starting test...
    sockperf: Test end (interrupted by timer)
    sockperf: Test ended
    sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2438563; ReceivedMessages=2438562
    sockperf: ========= Printing statistics for Server No: 0
    sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2429473; ReceivedMessages=2429473
    sockperf: ====> avg-latency=24.571 (std-dev=93.297, mean-ad=4.904, median-ad=1.510, siqr=1.063, cv=3.797, std-error=0.060, 99.0% ci=[24.417, 24.725])
    sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
    sockperf: Summary: Latency is 24.571 usec
    sockperf: Total 2429473 observations; each percentile contains 24294.73 observations
    sockperf: ---> <MAX> observation = 103294.331
    sockperf: ---> percentile 99.999 =   45.633
    sockperf: ---> percentile 99.990 =   37.013
    sockperf: ---> percentile 99.900 =   35.910
    sockperf: ---> percentile 99.000 =   33.390
    sockperf: ---> percentile 90.000 =   28.626
    sockperf: ---> percentile 75.000 =   27.741
    sockperf: ---> percentile 50.000 =   26.743
    sockperf: ---> percentile 25.000 =   25.614
    sockperf: ---> <MIN> observation =   12.220
    
    After:
    [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120
    sockperf: == version #3.10-no.git ==
    sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
    
    [ 0] IP = 10.9.9.1        PORT = 11111 # TCP
    sockperf: Warmup stage (sending a few dummy messages)...
    sockperf: Starting test...
    sockperf: Test end (interrupted by timer)
    sockperf: Test ended
    sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2400055; ReceivedMessages=2400054
    sockperf: ========= Printing statistics for Server No: 0
    sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2391186; ReceivedMessages=2391186
    sockperf: ====> avg-latency=24.965 (std-dev=5.934, mean-ad=4.642, median-ad=1.485, siqr=1.067, cv=0.238, std-error=0.004, 99.0% ci=[24.955, 24.975])
    sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
    sockperf: Summary: Latency is 24.965 usec
    sockperf: Total 2391186 observations; each percentile contains 23911.86 observations
    sockperf: ---> <MAX> observation =  195.841
    sockperf: ---> percentile 99.999 =   45.026
    sockperf: ---> percentile 99.990 =   39.009
    sockperf: ---> percentile 99.900 =   35.922
    sockperf: ---> percentile 99.000 =   33.482
    sockperf: ---> percentile 90.000 =   28.902
    sockperf: ---> percentile 75.000 =   27.821
    sockperf: ---> percentile 50.000 =   26.860
    sockperf: ---> percentile 25.000 =   25.685
    sockperf: ---> <MIN> observation =   12.277
    
    Fixes: 0bcd952feec7 ("ethernet/intel: consolidate NAPI and NAPI exit")
    Reported-by: Hugo Ferreira <hferreir@redhat.com>
    Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: fix i40e_count_filters() to count only active/new filters [+ + +]

Author: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Date:   Wed Mar 13 10:44:00 2024 +0100

    i40e: fix i40e_count_filters() to count only active/new filters
    
    commit eb58c598ce45b7e787568fe27016260417c3d807 upstream.
    
    The bug usually affects untrusted VFs, because they are limited to 18 MACs,
    it affects them badly, not letting to create MAC all filters.
    Not stable to reproduce, it happens when VF user creates MAC filters
    when other MACVLAN operations are happened in parallel.
    But consequence is that VF can't receive desired traffic.
    
    Fix counter to be bumped only for new or active filters.
    
    Fixes: 621650cabee5 ("i40e: Refactoring VF MAC filters counting to make more reliable")
    Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i40e: Fix VF MAC filter removal [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Fri Mar 29 11:06:37 2024 -0700

    i40e: Fix VF MAC filter removal
    
    commit ea2a1cfc3b2019bdea6324acd3c03606b60d71ad upstream.
    
    Commit 73d9629e1c8c ("i40e: Do not allow untrusted VF to remove
    administratively set MAC") fixed an issue where untrusted VF was
    allowed to remove its own MAC address although this was assigned
    administratively from PF. Unfortunately the introduced check
    is wrong because it causes that MAC filters for other MAC addresses
    including multi-cast ones are not removed.
    
    <snip>
            if (ether_addr_equal(addr, vf->default_lan_addr.addr) &&
                i40e_can_vf_change_mac(vf))
                    was_unimac_deleted = true;
            else
                    continue;
    
            if (i40e_del_mac_filter(vsi, al->list[i].addr)) {
            ...
    </snip>
    
    The else path with `continue` effectively skips any MAC filter
    removal except one for primary MAC addr when VF is allowed to do so.
    Fix the check condition so the `continue` is only done for primary
    MAC address.
    
    Fixes: 73d9629e1c8c ("i40e: Do not allow untrusted VF to remove administratively set MAC")
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
    Reviewed-by: Brett Creeley <brett.creeley@amd.com>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Link: https://lore.kernel.org/r/20240329180638.211412-1-anthony.l.nguyen@intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i40e: fix vf may be used uninitialized in this function warning [+ + +]

Author: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Date:   Wed Mar 13 10:56:39 2024 +0100

    i40e: fix vf may be used uninitialized in this function warning
    
    commit f37c4eac99c258111d414d31b740437e1925b8e8 upstream.
    
    To fix the regression introduced by commit 52424f974bc5, which causes
    servers hang in very hard to reproduce conditions with resets races.
    Using two sources for the information is the root cause.
    In this function before the fix bumping v didn't mean bumping vf
    pointer. But the code used this variables interchangeably, so stale vf
    could point to different/not intended vf.
    
    Remove redundant "v" variable and iterate via single VF pointer across
    whole function instead to guarantee VF pointer validity.
    
    Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF")
    Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i40e: Move memory allocation structures to i40e_alloc.h [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:32 2023 +0200

    i40e: Move memory allocation structures to i40e_alloc.h
    
    [ Upstream commit ef5d54078d451973f90e123fafa23fc95c2a08ae ]
    
    Structures i40e_dma_mem & i40e_virt_mem are defined i40e_osdep.h while
    memory allocation functions that use them are declared in i40e_alloc.h
    Move them there.
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Refactor I40E_MDIO_CLAUSE* macros [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:29 2023 +0200

    i40e: Refactor I40E_MDIO_CLAUSE* macros
    
    [ Upstream commit 8196b5fd6c7312d31775f77c7fff0253eb0ecdaa ]
    
    The macros I40E_MDIO_CLAUSE22* and I40E_MDIO_CLAUSE45* are using I40E_MASK
    together with the same values I40E_GLGEN_MSCA_STCODE_SHIFT and
    I40E_GLGEN_MSCA_OPCODE_SHIFT to define masks.
    Introduce I40E_GLGEN_MSCA_OPCODE_MASK and I40E_GLGEN_MSCA_STCODE_MASK
    for both shifts in i40e_register.h and use them to refactor the macros
    mentioned above.
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Remove _t suffix from enum type names [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Mon Nov 13 15:10:24 2023 -0800

    i40e: Remove _t suffix from enum type names
    
    [ Upstream commit addca9175e5f74cf29e8ad918c38c09b8663b5b8 ]
    
    Enum type names should not be suffixed by '_t'. Either to use
    'typedef enum name name_t' to so plain 'name_t var' instead of
    'enum name_t var'.
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Link: https://lore.kernel.org/r/20231113231047.548659-6-anthony.l.nguyen@intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: ea558de7238b ("i40e: Enforce software interrupt during busy-poll exit")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Remove back pointer from i40e_hw structure [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:27 2023 +0200

    i40e: Remove back pointer from i40e_hw structure
    
    [ Upstream commit 39ec612acf6d075809c38a7262d7ad09314762f3 ]
    
    The .back field placed in i40e_hw is used to get pointer to i40e_pf
    instance but it is not necessary as the i40e_hw is a part of i40e_pf
    and containerof macro can be used to obtain the pointer to i40e_pf.
    Remove .back field from i40e_hw structure, introduce i40e_hw_to_pf()
    and i40e_hw_to_dev() helpers and use them.
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Remove circular header dependencies and fix headers [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:34 2023 +0200

    i40e: Remove circular header dependencies and fix headers
    
    [ Upstream commit 56df345917c09ffc00b7834f88990a7a7c338b5c ]
    
    Similarly as for ice driver [1] there are also circular header
    dependencies in i40e driver:
    i40e.h -> i40e_virtchnl_pf.h -> i40e.h
    
    Another issue is that i40e header files does not contain their own
    dependencies on other header files (both private and standard) so their
    inclusion in .c file require to add these deps in certain order to
    that .c file to make it compilable.
    
    Fix both issues by removal the mentioned circular dependency, by filling
    i40e headers with their dependencies so they can be placed anywhere in
    a source code. Additionally remove bunch of includes from i40e.h super
    header file that are not necessary and include i40e.h only in .c files
    that really require it.
    
    [1] 649c87c6ff52 ("ice: remove circular header dependencies on ice.h")
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Simplify memory allocation functions [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:31 2023 +0200

    i40e: Simplify memory allocation functions
    
    [ Upstream commit d3276f928a1d2dfebc41a82e967cd0dffeb540f8 ]
    
    Enum i40e_memory_type enum is unused in i40e_allocate_dma_mem() thus
    can be safely removed. Useless macros in i40e_alloc.h can be removed
    as well.
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Split i40e_osdep.h [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:33 2023 +0200

    i40e: Split i40e_osdep.h
    
    [ Upstream commit 5dfd37c37a44ba47c35ff8e6eaff14c226141111 ]
    
    Header i40e_osdep.h contains only IO primitives and couple of debug
    printing macros. Split this header file to i40e_io.h and i40e_debug.h
    and move i40e_debug_mask enum to i40e_debug.h
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: fix enabling RX VLAN filtering [+ + +]

Author: Petr Oros <poros@redhat.com>
Date:   Mon Mar 25 21:19:01 2024 +0100

    ice: fix enabling RX VLAN filtering
    
    commit 8edfc7a40e3300fc6c5fa7a3228a24d5bcd86ba5 upstream.
    
    ice_port_vlan_on/off() was introduced in commit 2946204b3fa8 ("ice:
    implement bridge port vlan"). But ice_port_vlan_on() incorrectly assigns
    ena_rx_filtering to inner_vlan_ops in DVM mode.
    This causes an error when rx_filtering cannot be enabled in legacy mode.
    
    Reproducer:
     echo 1 > /sys/class/net/$PF/device/sriov_numvfs
     ip link set $PF vf 0 spoofchk off trust on vlan 3
    dmesg:
     ice 0000:41:00.0: failed to enable Rx VLAN filtering for VF 0 VSI 9 during VF rebuild, error -95
    
    Fixes: 2946204b3fa8 ("ice: implement bridge port vlan")
    Signed-off-by: Petr Oros <poros@redhat.com>
    Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ice: fix memory corruption bug with suspend and rebuild [+ + +]

Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date:   Tue Mar 5 15:02:03 2024 -0800

    ice: fix memory corruption bug with suspend and rebuild
    
    [ Upstream commit 1cb7fdb1dfde1aab66780b4ba44dba6402172111 ]
    
    The ice driver would previously panic after suspend. This is caused
    from the driver *only* calling the ice_vsi_free_q_vectors() function by
    itself, when it is suspending. Since commit b3e7b3a6ee92 ("ice: prevent
    NULL pointer deref during reload") the driver has zeroed out
    num_q_vectors, and only restored it in ice_vsi_cfg_def().
    
    This further causes the ice_rebuild() function to allocate a zero length
    buffer, after which num_q_vectors is updated, and then the new value of
    num_q_vectors is used to index into the zero length buffer, which
    corrupts memory.
    
    The fix entails making sure all the code referencing num_q_vectors only
    does so after it has been reset via ice_vsi_cfg_def().
    
    I didn't perform a full bisect, but I was able to test against 6.1.77
    kernel and that ice driver works fine for suspend/resume with no panic,
    so sometime since then, this problem was introduced.
    
    Also clean up an un-needed init of a local variable in the function
    being modified.
    
    PANIC from 6.8.0-rc1:
    
    [1026674.915596] PM: suspend exit
    [1026675.664697] ice 0000:17:00.1: PTP reset successful
    [1026675.664707] ice 0000:17:00.1: 2755 msecs passed between update to cached PHC time
    [1026675.667660] ice 0000:b1:00.0: PTP reset successful
    [1026675.675944] ice 0000:b1:00.0: 2832 msecs passed between update to cached PHC time
    [1026677.137733] ixgbe 0000:31:00.0 ens787: NIC Link is Up 1 Gbps, Flow Control: None
    [1026677.190201] BUG: kernel NULL pointer dereference, address: 0000000000000010
    [1026677.192753] ice 0000:17:00.0: PTP reset successful
    [1026677.192764] ice 0000:17:00.0: 4548 msecs passed between update to cached PHC time
    [1026677.197928] #PF: supervisor read access in kernel mode
    [1026677.197933] #PF: error_code(0x0000) - not-present page
    [1026677.197937] PGD 1557a7067 P4D 0
    [1026677.212133] ice 0000:b1:00.1: PTP reset successful
    [1026677.212143] ice 0000:b1:00.1: 4344 msecs passed between update to cached PHC time
    [1026677.212575]
    [1026677.243142] Oops: 0000 [#1] PREEMPT SMP NOPTI
    [1026677.247918] CPU: 23 PID: 42790 Comm: kworker/23:0 Kdump: loaded Tainted: G        W          6.8.0-rc1+ #1
    [1026677.257989] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
    [1026677.269367] Workqueue: ice ice_service_task [ice]
    [1026677.274592] RIP: 0010:ice_vsi_rebuild_set_coalesce+0x130/0x1e0 [ice]
    [1026677.281421] Code: 0f 84 3a ff ff ff 41 0f b7 74 ec 02 66 89 b0 22 02 00 00 81 e6 ff 1f 00 00 e8 ec fd ff ff e9 35 ff ff ff 48 8b 43 30 49 63 ed <41> 0f b7 34 24 41 83 c5 01 48 8b 3c e8 66 89 b7 aa 02 00 00 81 e6
    [1026677.300877] RSP: 0018:ff3be62a6399bcc0 EFLAGS: 00010202
    [1026677.306556] RAX: ff28691e28980828 RBX: ff28691e41099828 RCX: 0000000000188000
    [1026677.314148] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff28691e41099828
    [1026677.321730] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    [1026677.329311] R10: 0000000000000007 R11: ffffffffffffffc0 R12: 0000000000000010
    [1026677.336896] R13: 0000000000000000 R14: 0000000000000000 R15: ff28691e0eaa81a0
    [1026677.344472] FS:  0000000000000000(0000) GS:ff28693cbffc0000(0000) knlGS:0000000000000000
    [1026677.353000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [1026677.359195] CR2: 0000000000000010 CR3: 0000000128df4001 CR4: 0000000000771ef0
    [1026677.366779] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [1026677.374369] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [1026677.381952] PKRU: 55555554
    [1026677.385116] Call Trace:
    [1026677.388023]  <TASK>
    [1026677.390589]  ? __die+0x20/0x70
    [1026677.394105]  ? page_fault_oops+0x82/0x160
    [1026677.398576]  ? do_user_addr_fault+0x65/0x6a0
    [1026677.403307]  ? exc_page_fault+0x6a/0x150
    [1026677.407694]  ? asm_exc_page_fault+0x22/0x30
    [1026677.412349]  ? ice_vsi_rebuild_set_coalesce+0x130/0x1e0 [ice]
    [1026677.418614]  ice_vsi_rebuild+0x34b/0x3c0 [ice]
    [1026677.423583]  ice_vsi_rebuild_by_type+0x76/0x180 [ice]
    [1026677.429147]  ice_rebuild+0x18b/0x520 [ice]
    [1026677.433746]  ? delay_tsc+0x8f/0xc0
    [1026677.437630]  ice_do_reset+0xa3/0x190 [ice]
    [1026677.442231]  ice_service_task+0x26/0x440 [ice]
    [1026677.447180]  process_one_work+0x174/0x340
    [1026677.451669]  worker_thread+0x27e/0x390
    [1026677.455890]  ? __pfx_worker_thread+0x10/0x10
    [1026677.460627]  kthread+0xee/0x120
    [1026677.464235]  ? __pfx_kthread+0x10/0x10
    [1026677.468445]  ret_from_fork+0x2d/0x50
    [1026677.472476]  ? __pfx_kthread+0x10/0x10
    [1026677.476671]  ret_from_fork_asm+0x1b/0x30
    [1026677.481050]  </TASK>
    
    Fixes: b3e7b3a6ee92 ("ice: prevent NULL pointer deref during reload")
    Reported-by: Robert Elliott <elliott@hpe.com>
    Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: fix typo in assignment [+ + +]

Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date:   Mon Mar 4 16:37:07 2024 -0800

    ice: fix typo in assignment
    
    commit 6c5b6ca7642f2992502a22dbd8b80927de174b67 upstream.
    
    Fix an obviously incorrect assignment, created with a typo or cut-n-paste
    error.
    
    Fixes: 5995ef88e3a8 ("ice: realloc VSI stats arrays")
    Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ice: realloc VSI stats arrays [+ + +]

Author: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Date:   Tue Oct 24 13:09:26 2023 +0200

    ice: realloc VSI stats arrays
    
    [ Upstream commit 5995ef88e3a8c2b014f51256a88be8e336532ce7 ]
    
    Previously only case when queues amount is lower was covered. Implement
    realloc for case when queues amount is higher than previous one. Use
    krealloc() function and zero new allocated elements.
    
    It has to be done before ice_vsi_def_cfg(), because stats element for
    ring is set there.
    
    Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
    Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
    Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 1cb7fdb1dfde ("ice: fix memory corruption bug with suspend and rebuild")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: Refactor FW data type and fix bitmap casting issue [+ + +]

Author: Steven Zou <steven.zou@intel.com>
Date:   Wed Feb 7 09:49:59 2024 +0800

    ice: Refactor FW data type and fix bitmap casting issue
    
    [ Upstream commit 817b18965b58a6e5fb6ce97abf01b03a205a6aea ]
    
    According to the datasheet, the recipe association data is an 8-byte
    little-endian value. It is described as 'Bitmap of the recipe indexes
    associated with this profile', it is from 24 to 31 byte area in FW.
    Therefore, it is defined to '__le64 recipe_assoc' in struct
    ice_aqc_recipe_to_profile. And then fix the bitmap casting issue, as we
    must never ever use castings for bitmap type.
    
    Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>
    Reviewed-by: Jan Sokolowski <jan.sokolowski@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Steven Zou <steven.zou@intel.com>
    Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igc: Remove stale comment about Tx timestamping [+ + +]

Author: Kurt Kanzenbach <kurt@linutronix.de>
Date:   Wed Mar 13 14:03:10 2024 +0100

    igc: Remove stale comment about Tx timestamping
    
    [ Upstream commit 47ce2956c7a61ff354723e28235205fa2012265b ]
    
    The initial igc Tx timestamping implementation used only one register for
    retrieving Tx timestamps. Commit 3ed247e78911 ("igc: Add support for
    multiple in-flight TX timestamps") added support for utilizing all four of
    them e.g., for multiple domain support. Remove the stale comment/FIXME.
    
    Fixes: 3ed247e78911 ("igc: Add support for multiple in-flight TX timestamps")
    Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
    Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

inet: inet_defrag: prevent sk release while still in use [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Tue Mar 26 11:18:41 2024 +0100

    inet: inet_defrag: prevent sk release while still in use
    
    [ Upstream commit 18685451fc4e546fc0e718580d32df3c0e5c8272 ]
    
    ip_local_out() and other functions can pass skb->sk as function argument.
    
    If the skb is a fragment and reassembly happens before such function call
    returns, the sk must not be released.
    
    This affects skb fragments reassembled via netfilter or similar
    modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.
    
    Eric Dumazet made an initial analysis of this bug.  Quoting Eric:
      Calling ip_defrag() in output path is also implying skb_orphan(),
      which is buggy because output path relies on sk not disappearing.
    
      A relevant old patch about the issue was :
      8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
    
      [..]
    
      net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
      inet socket, not an arbitrary one.
    
      If we orphan the packet in ipvlan, then downstream things like FQ
      packet scheduler will not work properly.
    
      We need to change ip_defrag() to only use skb_orphan() when really
      needed, ie whenever frag_list is going to be used.
    
    Eric suggested to stash sk in fragment queue and made an initial patch.
    However there is a problem with this:
    
    If skb is refragmented again right after, ip_do_fragment() will copy
    head->sk to the new fragments, and sets up destructor to sock_wfree.
    IOW, we have no choice but to fix up sk_wmem accouting to reflect the
    fully reassembled skb, else wmem will underflow.
    
    This change moves the orphan down into the core, to last possible moment.
    As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
    offset into the FRAG_CB, else skb->sk gets clobbered.
    
    This allows to delay the orphaning long enough to learn if the skb has
    to be queued or if the skb is completing the reasm queue.
    
    In the former case, things work as before, skb is orphaned.  This is
    safe because skb gets queued/stolen and won't continue past reasm engine.
    
    In the latter case, we will steal the skb->sk reference, reattach it to
    the head skb, and fix up wmem accouting when inet_frag inflates truesize.
    
    Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().")
    Diagnosed-by: Eric Dumazet <edumazet@google.com>
    Reported-by: xingwei lee <xrivendell7@gmail.com>
    Reported-by: yue sun <samsun1006219@gmail.com>
    Reported-by: syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20240326101845.30836-1-fw@strlen.de
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

intel: add bit macro includes where needed [+ + +]

Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date:   Tue Dec 5 17:01:01 2023 -0800

    intel: add bit macro includes where needed
    
    [ Upstream commit 3314f2097dee43defc20554f961a8b17f4787e2d ]
    
    This series is introducing the use of FIELD_GET and FIELD_PREP which
    requires bitfield.h to be included. Fix all the includes in this one
    change, and rearrange includes into alphabetical order to ease
    readability and future maintenance.
    
    virtchnl.h and it's usage was modified to have it's own includes as it
    should. This required including bits.h for virtchnl.h.
    
    Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
    Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

intel: legacy: field get conversion [+ + +]

Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date:   Tue Dec 5 17:01:08 2023 -0800

    intel: legacy: field get conversion
    
    [ Upstream commit b9a4525450758dd75edbdaee97425ba7546c2b5c ]
    
    Refactor several older Intel drivers to use FIELD_GET(), which reduces
    lines of code and adds clarity of intent.
    
    This code was generated by the following coccinelle/spatch script and
    then manually repaired.
    
    @get@
    constant shift,mask;
    type T;
    expression a;
    @@
    (
    -((T)((a) & mask) >> shift)
    +FIELD_GET(mask, a)
    
    and applied via:
    spatch --sp-file field_prep.cocci --in-place --dir \
     drivers/net/ethernet/intel/
    
    Cc: Julia Lawall <Julia.Lawall@inria.fr>
    CC: Alexander Lobakin <aleksander.lobakin@intel.com>
    Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

io_uring/kbuf: get rid of bl->is_ready [+ + +]

Author: Jens Axboe <axboe@kernel.dk>
Date:   Thu Mar 14 10:46:40 2024 -0600

    io_uring/kbuf: get rid of bl->is_ready
    
    commit 3b80cff5a4d117c53d38ce805823084eaeffbde6 upstream.
    
    Now that xarray is being exclusively used for the buffer_list lookup,
    this check is no longer needed. Get rid of it and the is_ready member.
    
    Cc: stable@vger.kernel.org # v6.4+
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring/kbuf: get rid of lower BGID lists [+ + +]

Author: Jens Axboe <axboe@kernel.dk>
Date:   Thu Mar 14 10:45:07 2024 -0600

    io_uring/kbuf: get rid of lower BGID lists
    
    commit 09ab7eff38202159271534d2f5ad45526168f2a5 upstream.
    
    Just rely on the xarray for any kind of bgid. This simplifies things, and
    it really doesn't bring us much, if anything.
    
    Cc: stable@vger.kernel.org # v6.4+
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring/kbuf: hold io_buffer_list reference over mmap [+ + +]

Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue Apr 2 16:16:03 2024 -0600

    io_uring/kbuf: hold io_buffer_list reference over mmap
    
    commit 561e4f9451d65fc2f7eef564e0064373e3019793 upstream.
    
    If we look up the kbuf, ensure that it doesn't get unregistered until
    after we're done with it. Since we're inside mmap, we cannot safely use
    the io_uring lock. Rely on the fact that we can lookup the buffer list
    under RCU now and grab a reference to it, preventing it from being
    unregistered until we're done with it. The lookup returns the
    io_buffer_list directly with it referenced.
    
    Cc: stable@vger.kernel.org # v6.4+
    Fixes: 5cf4f52e6d8a ("io_uring: free io_buffer_list entries via RCU")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring/kbuf: protect io_buffer_list teardown with a reference [+ + +]

Author: Jens Axboe <axboe@kernel.dk>
Date:   Fri Mar 15 16:12:51 2024 -0600

    io_uring/kbuf: protect io_buffer_list teardown with a reference
    
    commit 6b69c4ab4f685327d9e10caf0d84217ba23a8c4b upstream.
    
    No functional changes in this patch, just in preparation for being able
    to keep the buffer list alive outside of the ctx->uring_lock.
    
    Cc: stable@vger.kernel.org # v6.4+
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring: use private workqueue for exit work [+ + +]

Author: Jens Axboe <axboe@kernel.dk>
Date:   Mon Apr 1 15:16:19 2024 -0600

    io_uring: use private workqueue for exit work
    
    commit 73eaa2b583493b680c6f426531d6736c39643bfb upstream.
    
    Rather than use the system unbound event workqueue, use an io_uring
    specific one. This avoids dependencies with the tty, which also uses
    the system_unbound_wq, and issues flushes of said workqueue from inside
    its poll handling.
    
    Cc: stable@vger.kernel.org
    Reported-by: Rasmus Karlsson <rasmus.karlsson@pajlada.com>
    Tested-by: Rasmus Karlsson <rasmus.karlsson@pajlada.com>
    Tested-by: Iskren Chernev <me@iskren.info>
    Link: https://github.com/axboe/liburing/issues/1113
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: Fix infinite recursion in fib6_dump_done(). [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Apr 1 14:10:04 2024 -0700

    ipv6: Fix infinite recursion in fib6_dump_done().
    
    commit d21d40605bca7bd5fc23ef03d4c1ca1f48bc2cae upstream.
    
    syzkaller reported infinite recursive calls of fib6_dump_done() during
    netlink socket destruction.  [1]
    
    From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
    the response was generated.  The following recvmmsg() resumed the dump
    for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
    to the fault injection.  [0]
    
      12:01:34 executing program 3:
      r0 = socket$nl_route(0x10, 0x3, 0x0)
      sendmsg$nl_route(r0, ... snip ...)
      recvmmsg(r0, ... snip ...) (fail_nth: 8)
    
    Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
    of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
    receiving the response halfway through, and finally netlink_sock_destruct()
    called nlk_sk(sk)->cb.done().
    
    fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it
    is still not NULL.  fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by
    nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling
    itself recursively and hitting the stack guard page.
    
    To avoid the issue, let's set the destructor after kzalloc().
    
    [0]:
    FAULT_INJECTION: forcing a failure.
    name failslab, interval 1, probability 0, space 0, times 0
    CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl (lib/dump_stack.c:117)
     should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153)
     should_failslab (mm/slub.c:3733)
     kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992)
     inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662)
     rtnl_dump_all (net/core/rtnetlink.c:4029)
     netlink_dump (net/netlink/af_netlink.c:2269)
     netlink_recvmsg (net/netlink/af_netlink.c:1988)
     ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801)
     ___sys_recvmsg (net/socket.c:2846)
     do_recvmmsg (net/socket.c:2943)
     __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034)
    
    [1]:
    BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb)
    stack guard page: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    Workqueue: events netlink_sock_destruct_work
    RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570)
    Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff
    RSP: 0018:ffffc9000d980000 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3
    RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358
    RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000
    R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68
    FS:  0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0
    PKRU: 55555554
    Call Trace:
     <#DF>
     </#DF>
     <TASK>
     fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
     fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
     ...
     fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
     fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
     netlink_sock_destruct (net/netlink/af_netlink.c:401)
     __sk_destruct (net/core/sock.c:2177 (discriminator 2))
     sk_destruct (net/core/sock.c:2224)
     __sk_free (net/core/sock.c:2235)
     sk_free (net/core/sock.c:2246)
     process_one_work (kernel/workqueue.c:3259)
     worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416)
     kthread (kernel/kthread.c:388)
     ret_from_fork (arch/x86/kernel/process.c:153)
     ret_from_fork_asm (arch/x86/entry/entry_64.S:256)
    Modules linked in:
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20240401211003.25274-1-kuniyu@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa() [+ + +]

Author: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Date:   Tue Mar 5 17:02:02 2024 +0100

    ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa()
    
    [ Upstream commit aec806fb4afba5fe80b09e29351379a4292baa43 ]
    
    Change kzalloc() flags used in ixgbe_ipsec_vf_add_sa() to GFP_ATOMIC, to
    avoid sleeping in IRQ context.
    
    Dan Carpenter, with the help of Smatch, has found following issue:
    The patch eda0333ac293: "ixgbe: add VF IPsec management" from Aug 13,
    2018 (linux-next), leads to the following Smatch static checker
    warning: drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c:917 ixgbe_ipsec_vf_add_sa()
            warn: sleeping in IRQ context
    
    The call tree that Smatch is worried about is:
    ixgbe_msix_other() <- IRQ handler
    -> ixgbe_msg_task()
       -> ixgbe_rcv_msg_from_vf()
          -> ixgbe_ipsec_vf_add_sa()
    
    Fixes: eda0333ac293 ("ixgbe: add VF IPsec management")
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Link: https://lore.kernel.org/intel-wired-lan/db31a0b0-4d9f-4e6b-aed8-88266eb5665c@moroto.mountain
    Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
    Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1 [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Tue Apr 2 09:31:22 2024 +0900

    ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1
    
    commit 5ed11af19e56f0434ce0959376d136005745a936 upstream.
    
    SMB2_GLOBAL_CAP_ENCRYPTION flag should be used only for 3.0 and
    3.0.2 dialects. This flags set cause compatibility problems with
    other SMB clients.
    
    Reported-by: James Christopher Adduono <jc@adduono.com>
    Tested-by: James Christopher Adduono <jc@adduono.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: don't send oplock break if rename fails [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Sun Mar 31 21:58:26 2024 +0900

    ksmbd: don't send oplock break if rename fails
    
    commit c1832f67035dc04fb89e6b591b64e4d515843cda upstream.
    
    Don't send oplock break if rename fails. This patch fix
    smb2.oplock.batch20 test.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: validate payload size in ipc response [+ + +]

Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Sun Mar 31 21:59:10 2024 +0900

    ksmbd: validate payload size in ipc response
    
    commit a677ebd8ca2f2632ccdecbad7b87641274e15aac upstream.
    
    If installing malicious ksmbd-tools, ksmbd.mountd can return invalid ipc
    response to ksmbd kernel server. ksmbd should validate payload size of
    ipc response from ksmbd.mountd to avoid memory overrun or
    slab-out-of-bounds. This patch validate 3 ipc response that has payload.
    
    Cc: stable@vger.kernel.org
    Reported-by: Chao Ma <machao2019@gmail.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: arm64: Ensure target address is granule-aligned for range TLBI [+ + +]

Author: Will Deacon <will@kernel.org>
Date:   Wed Mar 27 12:48:53 2024 +0000

    KVM: arm64: Ensure target address is granule-aligned for range TLBI
    
    commit 4c36a156738887c1edd78589fe192d757989bcde upstream.
    
    When zapping a table entry in stage2_try_break_pte(), we issue range
    TLB invalidation for the region that was mapped by the table. However,
    we neglect to align the base address down to the granule size and so
    if we ended up reaching the table entry via a misaligned address then
    we will accidentally skip invalidation for some prefix of the affected
    address range.
    
    Align 'ctx->addr' down to the granule size when performing TLB
    invalidation for an unmapped table in stage2_try_break_pte().
    
    Cc: Raghavendra Rao Ananta <rananta@google.com>
    Cc: Gavin Shan <gshan@redhat.com>
    Cc: Shaoqin Huang <shahuang@redhat.com>
    Cc: Quentin Perret <qperret@google.com>
    Fixes: defc8cc7abf0 ("KVM: arm64: Invalidate the table entries upon a range")
    Signed-off-by: Will Deacon <will@kernel.org>
    Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20240327124853.11206-5-will@kernel.org
    Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: arm64: Fix host-programmed guest events in nVHE [+ + +]

Author: Oliver Upton <oliver.upton@linux.dev>
Date:   Tue Mar 5 18:48:39 2024 +0000

    KVM: arm64: Fix host-programmed guest events in nVHE
    
    commit e89c928bedd77d181edc2df01cb6672184775140 upstream.
    
    Programming PMU events in the host that count during guest execution is
    a feature supported by perf, e.g.
    
      perf stat -e cpu_cycles:G ./lkvm run
    
    While this works for VHE, the guest/host event bitmaps are not carried
    through to the hypervisor in the nVHE configuration. Make
    kvm_pmu_update_vcpu_events() conditional on whether or not _hardware_
    supports PMUv3 rather than if the vCPU as vPMU enabled.
    
    Cc: stable@vger.kernel.org
    Fixes: 84d751a019a9 ("KVM: arm64: Pass pmu events to hyp via vcpu")
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20240305184840.636212-3-oliver.upton@linux.dev
    Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: SVM: Add support for allowing zero SEV ASIDs [+ + +]

Author: Ashish Kalra <ashish.kalra@amd.com>
Date:   Wed Jan 31 15:56:08 2024 -0800

    KVM: SVM: Add support for allowing zero SEV ASIDs
    
    [ Upstream commit 0aa6b90ef9d75b4bd7b6d106d85f2a3437697f91 ]
    
    Some BIOSes allow the end user to set the minimum SEV ASID value
    (CPUID 0x8000001F_EDX) to be greater than the maximum number of
    encrypted guests, or maximum SEV ASID value (CPUID 0x8000001F_ECX)
    in order to dedicate all the SEV ASIDs to SEV-ES or SEV-SNP.
    
    The SEV support, as coded, does not handle the case where the minimum
    SEV ASID value can be greater than the maximum SEV ASID value.
    As a result, the following confusing message is issued:
    
    [   30.715724] kvm_amd: SEV enabled (ASIDs 1007 - 1006)
    
    Fix the support to properly handle this case.
    
    Fixes: 916391a2d1dc ("KVM: SVM: Add support for SEV-ES capability in KVM")
    Suggested-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
    Cc: stable@vger.kernel.org
    Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20240104190520.62510-1-Ashish.Kalra@amd.com
    Link: https://lore.kernel.org/r/20240131235609.4161407-4-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

KVM: SVM: Use unsigned integers when dealing with ASIDs [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Wed Jan 31 15:56:07 2024 -0800

    KVM: SVM: Use unsigned integers when dealing with ASIDs
    
    [ Upstream commit 466eec4a22a76c462781bf6d45cb02cbedf21a61 ]
    
    Convert all local ASID variables and parameters throughout the SEV code
    from signed integers to unsigned integers.  As ASIDs are fundamentally
    unsigned values, and the global min/max variables are appropriately
    unsigned integers, too.
    
    Functionally, this is a glorified nop as KVM guarantees min_sev_asid is
    non-zero, and no CPU supports -1u as the _only_ asid, i.e. the signed vs.
    unsigned goof won't cause problems in practice.
    
    Opportunistically use sev_get_asid() in sev_flush_encrypted_page() instead
    of open coding an equivalent.
    
    Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20240131235609.4161407-3-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Stable-dep-of: 0aa6b90ef9d7 ("KVM: SVM: Add support for allowing zero SEV ASIDs")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

KVM: x86: Add BHI_NO [+ + +]

Author: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Date:   Wed Mar 13 09:49:17 2024 -0700

    KVM: x86: Add BHI_NO
    
    commit ed2e8d49b54d677f3123668a21a57822d679651f upstream.
    
    Intel processors that aren't vulnerable to BHI will set
    MSR_IA32_ARCH_CAPABILITIES[BHI_NO] = 1;. Guests may use this BHI_NO bit to
    determine if they need to implement BHI mitigations or not.  Allow this bit
    to be passed to the guests.
    
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux: Linux 6.6.26 [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Apr 10 16:36:08 2024 +0200

    Linux 6.6.26
    
    Link: https://lore.kernel.org/r/20240408125306.643546457@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Link: https://lore.kernel.org/r/20240409172821.820897593@linuxfoundation.org
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Link: https://lore.kernel.org/r/20240409173540.185904475@linuxfoundation.org
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mlxbf_gige: call request_irq() after NAPI initialized [+ + +]

Author: David Thompson <davthompson@nvidia.com>
Date:   Mon Mar 25 14:36:27 2024 -0400

    mlxbf_gige: call request_irq() after NAPI initialized
    
    [ Upstream commit f7442a634ac06b953fc1f7418f307b25acd4cfbc ]
    
    The mlxbf_gige driver encounters a NULL pointer exception in
    mlxbf_gige_open() when kdump is enabled.  The sequence to reproduce
    the exception is as follows:
    a) enable kdump
    b) trigger kdump via "echo c > /proc/sysrq-trigger"
    c) kdump kernel executes
    d) kdump kernel loads mlxbf_gige module
    e) the mlxbf_gige module runs its open() as the
       the "oob_net0" interface is brought up
    f) mlxbf_gige module will experience an exception
       during its open(), something like:
    
         Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
         Mem abort info:
           ESR = 0x0000000086000004
           EC = 0x21: IABT (current EL), IL = 32 bits
           SET = 0, FnV = 0
           EA = 0, S1PTW = 0
           FSC = 0x04: level 0 translation fault
         user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000
         [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
         Internal error: Oops: 0000000086000004 [#1] SMP
         CPU: 0 PID: 812 Comm: NetworkManager Tainted: G           OE     5.15.0-1035-bluefield #37-Ubuntu
         Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024
         pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
         pc : 0x0
         lr : __napi_poll+0x40/0x230
         sp : ffff800008003e00
         x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff
         x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8
         x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000
         x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000
         x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0
         x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c
         x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398
         x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2
         x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100
         x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238
         Call trace:
          0x0
          net_rx_action+0x178/0x360
          __do_softirq+0x15c/0x428
          __irq_exit_rcu+0xac/0xec
          irq_exit+0x18/0x2c
          handle_domain_irq+0x6c/0xa0
          gic_handle_irq+0xec/0x1b0
          call_on_irq_stack+0x20/0x2c
          do_interrupt_handler+0x5c/0x70
          el1_interrupt+0x30/0x50
          el1h_64_irq_handler+0x18/0x2c
          el1h_64_irq+0x7c/0x80
          __setup_irq+0x4c0/0x950
          request_threaded_irq+0xf4/0x1bc
          mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige]
          mlxbf_gige_open+0x5c/0x170 [mlxbf_gige]
          __dev_open+0x100/0x220
          __dev_change_flags+0x16c/0x1f0
          dev_change_flags+0x2c/0x70
          do_setlink+0x220/0xa40
          __rtnl_newlink+0x56c/0x8a0
          rtnl_newlink+0x58/0x84
          rtnetlink_rcv_msg+0x138/0x3c4
          netlink_rcv_skb+0x64/0x130
          rtnetlink_rcv+0x20/0x30
          netlink_unicast+0x2ec/0x360
          netlink_sendmsg+0x278/0x490
          __sock_sendmsg+0x5c/0x6c
          ____sys_sendmsg+0x290/0x2d4
          ___sys_sendmsg+0x84/0xd0
          __sys_sendmsg+0x70/0xd0
          __arm64_sys_sendmsg+0x2c/0x40
          invoke_syscall+0x78/0x100
          el0_svc_common.constprop.0+0x54/0x184
          do_el0_svc+0x30/0xac
          el0_svc+0x48/0x160
          el0t_64_sync_handler+0xa4/0x12c
          el0t_64_sync+0x1a4/0x1a8
         Code: bad PC value
         ---[ end trace 7d1c3f3bf9d81885 ]---
         Kernel panic - not syncing: Oops: Fatal exception in interrupt
         Kernel Offset: 0x2870a7a00000 from 0xffff800008000000
         PHYS_OFFSET: 0x80000000
         CPU features: 0x0,000005c1,a3332a5a
         Memory Limit: none
         ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
    
    The exception happens because there is a pending RX interrupt before the
    call to request_irq(RX IRQ) executes.  Then, the RX IRQ handler fires
    immediately after this request_irq() completes. The RX IRQ handler runs
    "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()"
    and "napi_enable()", both which happen later in the open() logic.
    
    The logic in mlxbf_gige_open() must fully initialize NAPI before any calls
    to request_irq() execute.
    
    Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
    Signed-off-by: David Thompson <davthompson@nvidia.com>
    Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
    Link: https://lore.kernel.org/r/20240325183627.7641-1-davthompson@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mlxbf_gige: stop interface during shutdown [+ + +]

Author: David Thompson <davthompson@nvidia.com>
Date:   Mon Mar 25 17:09:29 2024 -0400

    mlxbf_gige: stop interface during shutdown
    
    commit 09ba28e1cd3cf715daab1fca6e1623e22fd754a6 upstream.
    
    The mlxbf_gige driver intermittantly encounters a NULL pointer
    exception while the system is shutting down via "reboot" command.
    The mlxbf_driver will experience an exception right after executing
    its shutdown() method.  One example of this exception is:
    
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
    Mem abort info:
      ESR = 0x0000000096000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      FSC = 0x04: level 0 translation fault
    Data abort info:
      ISV = 0, ISS = 0x00000004
      CM = 0, WnR = 0
    user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000
    [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
    Internal error: Oops: 96000004 [#1] SMP
    CPU: 0 PID: 13 Comm: ksoftirqd/0 Tainted: G S         OE     5.15.0-bf.6.gef6992a #1
    Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.0.2.12669 Apr 21 2023
    pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
    lr : mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
    sp : ffff8000080d3c10
    x29: ffff8000080d3c10 x28: ffffcce72cbb7000 x27: ffff8000080d3d58
    x26: ffff0000814e7340 x25: ffff331cd1a05000 x24: ffffcce72c4ea008
    x23: ffff0000814e4b40 x22: ffff0000814e4d10 x21: ffff0000814e4128
    x20: 0000000000000000 x19: ffff0000814e4a80 x18: ffffffffffffffff
    x17: 000000000000001c x16: ffffcce72b4553f4 x15: ffff80008805b8a7
    x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101
    x11: 7f7f7f7f7f7f7f7f x10: c2ac898b17576267 x9 : ffffcce720fa5404
    x8 : ffff000080812138 x7 : 0000000000002e9a x6 : 0000000000000080
    x5 : ffff00008de3b000 x4 : 0000000000000000 x3 : 0000000000000001
    x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
    Call trace:
     mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
     mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
     __napi_poll+0x40/0x1c8
     net_rx_action+0x314/0x3a0
     __do_softirq+0x128/0x334
     run_ksoftirqd+0x54/0x6c
     smpboot_thread_fn+0x14c/0x190
     kthread+0x10c/0x110
     ret_from_fork+0x10/0x20
    Code: 8b070000 f9000ea0 f95056c0 f86178a1 (b9407002)
    ---[ end trace 7cc3941aa0d8e6a4 ]---
    Kernel panic - not syncing: Oops: Fatal exception in interrupt
    Kernel Offset: 0x4ce722520000 from 0xffff800008000000
    PHYS_OFFSET: 0x80000000
    CPU features: 0x000005c1,a3330e5a
    Memory Limit: none
    ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
    
    During system shutdown, the mlxbf_gige driver's shutdown() is always executed.
    However, the driver's stop() method will only execute if networking interface
    configuration logic within the Linux distribution has been setup to do so.
    
    If shutdown() executes but stop() does not execute, NAPI remains enabled
    and this can lead to an exception if NAPI is scheduled while the hardware
    interface has only been partially deinitialized.
    
    The networking interface managed by the mlxbf_gige driver must be properly
    stopped during system shutdown so that IFF_UP is cleared, the hardware
    interface is put into a clean state, and NAPI is fully deinitialized.
    
    Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
    Signed-off-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/20240325210929.25362-1-davthompson@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mlxbf_gige: stop PHY during open() error paths [+ + +]

Author: David Thompson <davthompson@nvidia.com>
Date:   Wed Mar 20 15:31:17 2024 -0400

    mlxbf_gige: stop PHY during open() error paths
    
    [ Upstream commit d6c30c5a168f8586b8bcc0d8e42e2456eb05209b ]
    
    The mlxbf_gige_open() routine starts the PHY as part of normal
    initialization.  The mlxbf_gige_open() routine must stop the
    PHY during its error paths.
    
    Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
    Signed-off-by: David Thompson <davthompson@nvidia.com>
    Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/secretmem: fix GUP-fast succeeding on secretmem folios [+ + +]

Author: David Hildenbrand <david@redhat.com>
Date:   Tue Mar 26 15:32:08 2024 +0100

    mm/secretmem: fix GUP-fast succeeding on secretmem folios
    
    commit 65291dcfcf8936e1b23cfd7718fdfde7cfaf7706 upstream.
    
    folio_is_secretmem() currently relies on secretmem folios being LRU
    folios, to save some cycles.
    
    However, folios might reside in a folio batch without the LRU flag set, or
    temporarily have their LRU flag cleared.  Consequently, the LRU flag is
    unreliable for this purpose.
    
    In particular, this is the case when secretmem_fault() allocates a fresh
    page and calls filemap_add_folio()->folio_add_lru().  The folio might be
    added to the per-cpu folio batch and won't get the LRU flag set until the
    batch was drained using e.g., lru_add_drain().
    
    Consequently, folio_is_secretmem() might not detect secretmem folios and
    GUP-fast can succeed in grabbing a secretmem folio, crashing the kernel
    when we would later try reading/writing to the folio, because the folio
    has been unmapped from the directmap.
    
    Fix it by removing that unreliable check.
    
    Link: https://lkml.kernel.org/r/20240326143210.291116-2-david@redhat.com
    Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reported-by: xingwei lee <xrivendell7@gmail.com>
    Reported-by: yue sun <samsun1006219@gmail.com>
    Closes: https://lore.kernel.org/lkml/CABOYnLyevJeravW=QrH0JUPYEcDN160aZFb7kwndm-J2rmz0HQ@mail.gmail.com/
    Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
    Tested-by: Miklos Szeredi <mszeredi@redhat.com>
    Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/treewide: replace pud_large() with pud_leaf() [+ + +]

Author: Peter Xu <peterx@redhat.com>
Date:   Tue Mar 5 12:37:48 2024 +0800

    mm/treewide: replace pud_large() with pud_leaf()
    
    [ Upstream commit 0a845e0f6348ccfa2dcc8c450ffd1c9ffe8c4add ]
    
    pud_large() is always defined as pud_leaf().  Merge their usages.  Chose
    pud_leaf() because pud_leaf() is a global API, while pud_large() is not.
    
    Link: https://lkml.kernel.org/r/20240305043750.93762-9-peterx@redhat.com
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Kirill A. Shutemov <kirill@shutemov.name>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: c567f2948f57 ("Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

modpost: do not make find_tosym() return NULL [+ + +]

Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Sat Mar 23 20:45:11 2024 +0900

    modpost: do not make find_tosym() return NULL
    
    [ Upstream commit 1102f9f85bf66b1a7bd6a40afb40efbbe05dfc05 ]
    
    As mentioned in commit 397586506c3d ("modpost: Add '.ltext' and
    '.ltext.*' to TEXT_SECTIONS"), modpost can result in a segmentation
    fault due to a NULL pointer dereference in default_mismatch_handler().
    
    find_tosym() can return the original symbol pointer instead of NULL
    if a better one is not found.
    
    This fixes the reported segmentation fault.
    
    Fixes: a23e7584ecf3 ("modpost: unify 'sym' and 'to' in default_mismatch_handler()")
    Reported-by: Nathan Chancellor <nathan@kernel.org>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

modpost: Optimize symbol search from linear to binary search [+ + +]

Author: Jack Brennen <jbrennen@google.com>
Date:   Tue Sep 26 08:40:44 2023 -0400

    modpost: Optimize symbol search from linear to binary search
    
    [ Upstream commit 4074532758c5c367d3fcb8d124150824a254659d ]
    
    Modify modpost to use binary search for converting addresses back
    into symbol references.  Previously it used linear search.
    
    This change saves a few seconds of wall time for defconfig builds,
    but can save several minutes on allyesconfigs.
    
    Before:
    $ make LLVM=1 -j128 allyesconfig vmlinux -s KCFLAGS="-Wno-error"
    $ time scripts/mod/modpost -M -m -a -N -o vmlinux.symvers vmlinux.o
    198.38user 1.27system 3:19.71elapsed
    
    After:
    $ make LLVM=1 -j128 allyesconfig vmlinux -s KCFLAGS="-Wno-error"
    $ time scripts/mod/modpost -M -m -a -N -o vmlinux.symvers vmlinux.o
    11.91user 0.85system 0:12.78elapsed
    
    Signed-off-by: Jack Brennen <jbrennen@google.com>
    Tested-by: Nick Desaulniers <ndesaulniers@google.com>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Stable-dep-of: 1102f9f85bf6 ("modpost: do not make find_tosym() return NULL")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: don't account accept() of non-MPC client as fallback to TCP [+ + +]

Author: Davide Caratti <dcaratti@redhat.com>
Date:   Fri Mar 29 13:08:52 2024 +0100

    mptcp: don't account accept() of non-MPC client as fallback to TCP
    
    commit 7a1b3490f47e88ec4cbde65f1a77a0f4bc972282 upstream.
    
    Current MPTCP servers increment MPTcpExtMPCapableFallbackACK when they
    accept non-MPC connections. As reported by Christoph, this is "surprising"
    because the counter might become greater than MPTcpExtMPCapableSYNRX.
    
    MPTcpExtMPCapableFallbackACK counter's name suggests it should only be
    incremented when a connection was seen using MPTCP options, then a
    fallback to TCP has been done. Let's do that by incrementing it when
    the subflow context of an inbound MPC connection attempt is dropped.
    Also, update mptcp_connect.sh kselftest, to ensure that the
    above MIB does not increment in case a pure TCP client connects to a
    MPTCP server.
    
    Fixes: fc518953bc9c ("mptcp: add and use MIB counter infrastructure")
    Cc: stable@vger.kernel.org
    Reported-by: Christoph Paasch <cpaasch@apple.com>
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/449
    Signed-off-by: Davide Caratti <dcaratti@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240329-upstream-net-20240329-fallback-mib-v1-1-324a8981da48@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: don't overwrite sock_ops in mptcp_is_tcpsk() [+ + +]

Author: Davide Caratti <dcaratti@redhat.com>
Date:   Tue Dec 19 22:31:04 2023 +0100

    mptcp: don't overwrite sock_ops in mptcp_is_tcpsk()
    
    commit 8e2b8a9fa512709e6fee744dcd4e2a20ee7f5c56 upstream.
    
    Eric Dumazet suggests:
    
     > The fact that mptcp_is_tcpsk() was able to write over sock->ops was a
     > bit strange to me.
     > mptcp_is_tcpsk() should answer a question, with a read-only argument.
    
    re-factor code to avoid overwriting sock_ops inside that function. Also,
    change the helper name to reflect the semantics and to disambiguate from
    its dual, sk_is_mptcp(). While at it, collapse mptcp_stream_accept() and
    mptcp_accept() into a single function, where fallback / non-fallback are
    separated into a single sk_is_mptcp() conditional.
    
    Link: https://github.com/multipath-tcp/mptcp_net-next/issues/432
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Davide Caratti <dcaratti@redhat.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/rds: fix possible cp null dereference [+ + +]

Author: Mahmoud Adam <mngyadam@amazon.com>
Date:   Tue Mar 26 16:31:33 2024 +0100

    net/rds: fix possible cp null dereference
    
    commit 62fc3357e079a07a22465b9b6ef71bb6ea75ee4b upstream.
    
    cp might be null, calling cp->cp_conn would produce null dereference
    
    [Simon Horman adds:]
    
    Analysis:
    
    * cp is a parameter of __rds_rdma_map and is not reassigned.
    
    * The following call-sites pass a NULL cp argument to __rds_rdma_map()
    
      - rds_get_mr()
      - rds_get_mr_for_dest
    
    * Prior to the code above, the following assumes that cp may be NULL
      (which is indicative, but could itself be unnecessary)
    
            trans_private = rs->rs_transport->get_mr(
                    sg, nents, rs, &mr->r_key, cp ? cp->cp_conn : NULL,
                    args->vec.addr, args->vec.bytes,
                    need_odp ? ODP_ZEROBASED : ODP_NOT_NEEDED);
    
    * The code modified by this patch is guarded by IS_ERR(trans_private),
      where trans_private is assigned as per the previous point in this analysis.
    
      The only implementation of get_mr that I could locate is rds_ib_get_mr()
      which can return an ERR_PTR if the conn (4th) argument is NULL.
    
    * ret is set to PTR_ERR(trans_private).
      rds_ib_get_mr can return ERR_PTR(-ENODEV) if the conn (4th) argument is NULL.
      Thus ret may be -ENODEV in which case the code in question will execute.
    
    Conclusion:
    * cp may be NULL at the point where this patch adds a check;
      this patch does seem to address a possible bug
    
    Fixes: c055fc00c07b ("net/rds: fix WARNING in rds_conn_connect_if_down")
    Cc: stable@vger.kernel.org # v4.19+
    Signed-off-by: Mahmoud Adam <mngyadam@amazon.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240326153132.55580-1-mngyadam@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/sched: act_skbmod: prevent kernel-infoleak [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Apr 3 13:09:08 2024 +0000

    net/sched: act_skbmod: prevent kernel-infoleak
    
    commit d313eb8b77557a6d5855f42d2234bd592c7b50dd upstream.
    
    syzbot found that tcf_skbmod_dump() was copying four bytes
    from kernel stack to user space [1].
    
    The issue here is that 'struct tc_skbmod' has a four bytes hole.
    
    We need to clear the structure before filling fields.
    
    [1]
    BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
     BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
     BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
     BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
     BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
     BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
      instrument_copy_to_user include/linux/instrumented.h:114 [inline]
      copy_to_user_iter lib/iov_iter.c:24 [inline]
      iterate_ubuf include/linux/iov_iter.h:29 [inline]
      iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
      iterate_and_advance include/linux/iov_iter.h:271 [inline]
      _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
      copy_to_iter include/linux/uio.h:196 [inline]
      simple_copy_to_iter net/core/datagram.c:532 [inline]
      __skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420
      skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
      skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline]
      netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962
      sock_recvmsg_nosec net/socket.c:1046 [inline]
      sock_recvmsg+0x2c4/0x340 net/socket.c:1068
      __sys_recvfrom+0x35a/0x5f0 net/socket.c:2242
      __do_sys_recvfrom net/socket.c:2260 [inline]
      __se_sys_recvfrom net/socket.c:2256 [inline]
      __x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256
     do_syscall_64+0xd5/0x1f0
     entry_SYSCALL_64_after_hwframe+0x6d/0x75
    
    Uninit was stored to memory at:
      pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253
      netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317
      netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351
      nlmsg_unicast include/net/netlink.h:1144 [inline]
      nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610
      rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741
      rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline]
      tcf_add_notify net/sched/act_api.c:2048 [inline]
      tcf_action_add net/sched/act_api.c:2071 [inline]
      tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119
      rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
      netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
      rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
      netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
      netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
      netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0x30f/0x380 net/socket.c:745
      ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
      __sys_sendmsg net/socket.c:2667 [inline]
      __do_sys_sendmsg net/socket.c:2676 [inline]
      __se_sys_sendmsg net/socket.c:2674 [inline]
      __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
     do_syscall_64+0xd5/0x1f0
     entry_SYSCALL_64_after_hwframe+0x6d/0x75
    
    Uninit was stored to memory at:
      __nla_put lib/nlattr.c:1041 [inline]
      nla_put+0x1c6/0x230 lib/nlattr.c:1099
      tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256
      tcf_action_dump_old net/sched/act_api.c:1191 [inline]
      tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
      tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251
      tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628
      tcf_add_notify_msg net/sched/act_api.c:2023 [inline]
      tcf_add_notify net/sched/act_api.c:2042 [inline]
      tcf_action_add net/sched/act_api.c:2071 [inline]
      tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119
      rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
      netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
      rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
      netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
      netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
      netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0x30f/0x380 net/socket.c:745
      ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
      __sys_sendmsg net/socket.c:2667 [inline]
      __do_sys_sendmsg net/socket.c:2676 [inline]
      __se_sys_sendmsg net/socket.c:2674 [inline]
      __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
     do_syscall_64+0xd5/0x1f0
     entry_SYSCALL_64_after_hwframe+0x6d/0x75
    
    Local variable opt created at:
      tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244
      tcf_action_dump_old net/sched/act_api.c:1191 [inline]
      tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
    
    Bytes 188-191 of 248 are uninitialized
    Memory access of size 248 starts at ffff888117697680
    Data copied to user address 00007ffe56d855f0
    
    Fixes: 86da71b57383 ("net_sched: Introduce skbmod action")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/sched: fix lockdep splat in qdisc_tree_reduce_backlog() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Apr 2 13:41:33 2024 +0000

    net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
    
    commit 7eb322360b0266481e560d1807ee79e0cef5742b upstream.
    
    qdisc_tree_reduce_backlog() is called with the qdisc lock held,
    not RTNL.
    
    We must use qdisc_lookup_rcu() instead of qdisc_lookup()
    
    syzbot reported:
    
    WARNING: suspicious RCU usage
    6.1.74-syzkaller #0 Not tainted
    -----------------------------
    net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 2, debug_locks = 1
    3 locks held by udevd/1142:
      #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
      #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
      #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282
      #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline]
      #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297
      #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
      #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
      #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792
    
    stack backtrace:
    CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
    Call Trace:
     <TASK>
      [<ffffffff85b85f14>] __dump_stack lib/dump_stack.c:88 [inline]
      [<ffffffff85b85f14>] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106
      [<ffffffff85b86007>] dump_stack+0x15/0x1e lib/dump_stack.c:113
      [<ffffffff81802299>] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592
      [<ffffffff84f0054c>] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305
      [<ffffffff84f037c3>] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811
      [<ffffffff84f5b78c>] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51
      [<ffffffff84fbcf63>] qdisc_enqueue include/net/sch_generic.h:833 [inline]
      [<ffffffff84fbcf63>] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723
      [<ffffffff84eecab9>] dequeue_skb net/sched/sch_generic.c:292 [inline]
      [<ffffffff84eecab9>] qdisc_restart net/sched/sch_generic.c:397 [inline]
      [<ffffffff84eecab9>] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415
      [<ffffffff84d7aa96>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
      [<ffffffff84d85d29>] net_tx_action+0x7c9/0x970 net/core/dev.c:5313
      [<ffffffff85e002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616
      [<ffffffff81568bca>] invoke_softirq kernel/softirq.c:447 [inline]
      [<ffffffff81568bca>] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700
      [<ffffffff81568ae9>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712
      [<ffffffff85b89f52>] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107
      [<ffffffff85c00ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656
    
    Fixes: d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Link: https://lore.kernel.org/r/20240402134133.2352776-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: bcmasp: Bring up unimac after PHY link up [+ + +]

Author: Justin Chen <justin.chen@broadcom.com>
Date:   Mon Mar 25 12:30:24 2024 -0700

    net: bcmasp: Bring up unimac after PHY link up
    
    [ Upstream commit dfd222e2aef68818320a57b13a1c52a44c22bc80 ]
    
    The unimac requires the PHY RX clk during reset or it may be put
    into a bad state. Bring up the unimac after link up to ensure the
    PHY RX clk exists.
    
    Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
    Signed-off-by: Justin Chen <justin.chen@broadcom.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: mv88e6xxx: fix usable ports on 88e6020 [+ + +]

Author: Michael Krummsdorf <michael.krummsdorf@tq-group.com>
Date:   Tue Mar 26 13:36:54 2024 +0100

    net: dsa: mv88e6xxx: fix usable ports on 88e6020
    
    commit 625aefac340f45a4fc60908da763f437599a0d6f upstream.
    
    The switch has 4 ports with 2 internal PHYs, but ports are numbered up
    to 6, with ports 0, 1, 5 and 6 being usable.
    
    Fixes: 71d94a432a15 ("net: dsa: mv88e6xxx: add support for MV88E6020 switch")
    Signed-off-by: Michael Krummsdorf <michael.krummsdorf@tq-group.com>
    Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240326123655.40666-1-matthias.schiffer@ew.tq-group.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45() [+ + +]

Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Tue Apr 2 20:33:56 2024 +0200

    net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45()
    
    commit c120209bce34c49dcaba32f15679574327d09f63 upstream.
    
    The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
    parameters in the same order.
    
    Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
    in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have:
    
       int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
                                      int reg, u16 val);
    
    it is likely that the definition is the one to change.
    
    Found with cppcheck, funcArgOrderDifferent.
    
    Fixes: ae271547bba6 ("net: dsa: sja1105: C45 only transactions for PCS")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: Michael Walle <mwalle@kernel.org>
    Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: fec: Set mac_managed_pm during probe [+ + +]

Author: Wei Fang <wei.fang@nxp.com>
Date:   Thu Mar 28 15:59:29 2024 +0000

    net: fec: Set mac_managed_pm during probe
    
    commit cbc17e7802f5de37c7c262204baadfad3f7f99e5 upstream.
    
    Setting mac_managed_pm during interface up is too late.
    
    In situations where the link is not brought up yet and the system suspends
    the regular PHY power management will run. Since the FEC ETHEREN control
    bit is cleared (automatically) on suspend the controller is off in resume.
    When the regular PHY power management resume path runs in this context it
    will write to the MII_DATA register but nothing will be transmitted on the
    MDIO bus.
    
    This can be observed by the following log:
    
        fec 5b040000.ethernet eth0: MDIO read timeout
        Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0xc8 returns -110
        Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: failed to resume: error -110
    
    The data written will however remain in the MII_DATA register.
    
    When the link later is set to administrative up it will trigger a call to
    fec_restart() which will restore the MII_SPEED register. This triggers the
    quirk explained in f166f890c8f0 ("net: ethernet: fec: Replace interrupt
    driven MDIO with polled IO") causing an extra MII_EVENT.
    
    This extra event desynchronizes all the MDIO register reads, causing them
    to complete too early. Leading all reads to read as 0 because
    fec_enet_mdio_wait() returns too early.
    
    When a Microchip LAN8700R PHY is connected to the FEC, the 0 reads causes
    the PHY to be initialized incorrectly and the PHY will not transmit any
    ethernet signal in this state. It cannot be brought out of this state
    without a power cycle of the PHY.
    
    Fixes: 557d5dc83f68 ("net: fec: use mac-managed PHY PM")
    Closes: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/
    Signed-off-by: Wei Fang <wei.fang@nxp.com>
    [jernberg: commit message]
    Signed-off-by: John Ernberg <john.ernberg@actia.se>
    Link: https://lore.kernel.org/r/20240328155909.59613-2-john.ernberg@actia.se
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: hns3: fix index limit to support all queue stats [+ + +]

Author: Jie Wang <wangjie125@huawei.com>
Date:   Mon Mar 25 20:43:09 2024 +0800

    net: hns3: fix index limit to support all queue stats
    
    [ Upstream commit 47e39d213e09c6cae0d6b4d95e454ea404013312 ]
    
    Currently, hns hardware supports more than 512 queues and the index limit
    in hclge_comm_tqps_update_stats is wrong. So this patch removes it.
    
    Fixes: 287db5c40d15 ("net: hns3: create new set of common tqp stats APIs for PF and VF reuse")
    Signed-off-by: Jie Wang <wangjie125@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
    Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fix kernel crash when devlink reload during pf initialization [+ + +]

Author: Yonglong Liu <liuyonglong@huawei.com>
Date:   Mon Mar 25 20:43:10 2024 +0800

    net: hns3: fix kernel crash when devlink reload during pf initialization
    
    [ Upstream commit 93305b77ffcb042f1538ecc383505e87d95aa05a ]
    
    The devlink reload process will access the hardware resources,
    but the register operation is done before the hardware is initialized.
    So, processing the devlink reload during initialization may lead to kernel
    crash. This patch fixes this by taking devl_lock during initialization.
    
    Fixes: b741269b2759 ("net: hns3: add support for registering devlink for PF")
    Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: mark unexcuted loopback test result as UNEXECUTED [+ + +]

Author: Jian Shen <shenjian15@huawei.com>
Date:   Mon Mar 25 20:43:11 2024 +0800

    net: hns3: mark unexcuted loopback test result as UNEXECUTED
    
    [ Upstream commit 5bd088d6c21a45ee70e6116879310e54174d75eb ]
    
    Currently, loopback test may be skipped when resetting, but the test
    result will still show as 'PASS', because the driver doesn't set
    ETH_TEST_FL_FAILED flag. Fix it by setting the flag and
    initializating the value to UNEXECUTED.
    
    Fixes: 4c8dab1c709c ("net: hns3: reconstruct function hns3_self_test")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hsr: hsr_slave: Fix the promiscuous mode in offload mode [+ + +]

Author: Ravi Gunasekaran <r-gunasekaran@ti.com>
Date:   Fri Mar 22 15:34:47 2024 +0530

    net: hsr: hsr_slave: Fix the promiscuous mode in offload mode
    
    [ Upstream commit b11c81731c810efe592e510bb0110e0db6877419 ]
    
    commit e748d0fd66ab ("net: hsr: Disable promiscuous mode in
    offload mode") disables promiscuous mode of slave devices
    while creating an HSR interface. But while deleting the
    HSR interface, it does not take care of it. It decreases the
    promiscuous mode count, which eventually enables promiscuous
    mode on the slave devices when creating HSR interface again.
    
    Fix this by not decrementing the promiscuous mode count while
    deleting the HSR interface when offload is enabled.
    
    Fixes: e748d0fd66ab ("net: hsr: Disable promiscuous mode in offload mode")
    Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Link: https://lore.kernel.org/r/20240322100447.27615-1-r-gunasekaran@ti.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips [+ + +]

Author: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Date:   Tue Mar 26 12:28:05 2024 +0530

    net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips
    
    [ Upstream commit e4a58989f5c839316ac63675e8800b9eed7dbe96 ]
    
    PCI11x1x Rev B0 devices might drop packets when receiving back to back frames
    at 2.5G link speed. Change the B0 Rev device's Receive filtering Engine FIFO
    threshold parameter from its hardware default of 4 to 3 dwords to prevent the
    problem. Rev C0 and later hardware already defaults to 3 dwords.
    
    Fixes: bb4f6bffe33c ("net: lan743x: Add PCI11010 / PCI11414 device IDs")
    Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240326065805.686128-1-Raju.Lakkaraju@microchip.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mana: Fix Rx DMA datasize and skb_over_panic [+ + +]

Author: Haiyang Zhang <haiyangz@microsoft.com>
Date:   Tue Apr 2 12:48:36 2024 -0700

    net: mana: Fix Rx DMA datasize and skb_over_panic
    
    commit c0de6ab920aafb56feab56058e46b688e694a246 upstream.
    
    mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be
    multiple of 64. So a packet slightly bigger than mtu+14, say 1536,
    can be received and cause skb_over_panic.
    
    Sample dmesg:
    [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL>
    [ 5325.243689] ------------[ cut here ]------------
    [ 5325.245748] kernel BUG at net/core/skbuff.c:192!
    [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
    [ 5325.302941] Call Trace:
    [ 5325.304389]  <IRQ>
    [ 5325.315794]  ? skb_panic+0x4f/0x60
    [ 5325.317457]  ? asm_exc_invalid_op+0x1f/0x30
    [ 5325.319490]  ? skb_panic+0x4f/0x60
    [ 5325.321161]  skb_put+0x4e/0x50
    [ 5325.322670]  mana_poll+0x6fa/0xb50 [mana]
    [ 5325.324578]  __napi_poll+0x33/0x1e0
    [ 5325.326328]  net_rx_action+0x12e/0x280
    
    As discussed internally, this alignment is not necessary. To fix
    this bug, remove it from the code. So oversized packets will be
    marked as CQE_RX_TRUNCATED by NIC, and dropped.
    
    Cc: stable@vger.kernel.org
    Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes")
    Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
    Reviewed-by: Dexuan Cui <decui@microsoft.com>
    Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: phy: micrel: Fix potential null pointer dereference [+ + +]

Author: Aleksandr Mishin <amishin@t-argos.ru>
Date:   Fri Mar 29 09:16:31 2024 +0300

    net: phy: micrel: Fix potential null pointer dereference
    
    commit 96c155943a703f0655c0c4cab540f67055960e91 upstream.
    
    In lan8814_get_sig_rx() and lan8814_get_sig_tx() ptp_parse_header() may
    return NULL as ptp_header due to abnormal packet type or corrupted packet.
    Fix this bug by adding ptp_header check.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy")
    Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Link: https://lore.kernel.org/r/20240329061631.33199-1-amishin@t-argos.ru
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping [+ + +]

Author: Horatiu Vultur <horatiu.vultur@microchip.com>
Date:   Tue Apr 2 09:16:34 2024 +0200

    net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping
    
    commit de99e1ea3a35f23ff83a31d6b08f43d27b2c6345 upstream.
    
    There are 2 issues with the blamed commit.
    1. When the phy is initialized, it would enable the disabled of UDPv4
       checksums. The UDPv6 checksum is already enabled by default. So when
       1-step is configured then it would clear these flags.
    2. After the 1-step is configured, then if 2-step is configured then the
       1-step would be still configured because it is not clearing the flag.
       So the sync frames will still have origin timestamps set.
    
    Fix this by reading first the value of the register and then
    just change bit 12 as this one determines if the timestamp needs to
    be inserted in the frame, without changing any other bits.
    
    Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy")
    Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
    Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
    Link: https://lore.kernel.org/r/20240402071634.2483524-1-horatiu.vultur@microchip.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: ravb: Always process TX descriptor ring [+ + +]

Author: Paul Barker <paul.barker.ct@bp.renesas.com>
Date:   Tue Apr 2 15:53:04 2024 +0100

    net: ravb: Always process TX descriptor ring
    
    [ Upstream commit 596a4254915f94c927217fe09c33a6828f33fb25 ]
    
    The TX queue should be serviced each time the poll function is called,
    even if the full RX work budget has been consumed. This prevents
    starvation of the TX queue when RX bandwidth usage is high.
    
    Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
    Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
    Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
    Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ravb: Always update error counters [+ + +]

Author: Paul Barker <paul.barker.ct@bp.renesas.com>
Date:   Tue Apr 2 15:53:05 2024 +0100

    net: ravb: Always update error counters
    
    [ Upstream commit 101b76418d7163240bc74a7e06867dca0e51183e ]
    
    The error statistics should be updated each time the poll function is
    called, even if the full RX work budget has been consumed. This prevents
    the counts from becoming stuck when RX bandwidth usage is high.
    
    This also ensures that error counters are not updated after we've
    re-enabled interrupts as that could result in a race condition.
    
    Also drop an unnecessary space.
    
    Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
    Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
    Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
    Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ravb: Let IP-specific receive function to interrogate descriptors [+ + +]

Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Date:   Fri Feb 2 10:41:22 2024 +0200

    net: ravb: Let IP-specific receive function to interrogate descriptors
    
    [ Upstream commit 2b993bfdb47b3aaafd8fe9cd5038b5e297b18ee1 ]
    
    ravb_poll() initial code used to interrogate the first descriptor of the
    RX queue in case gPTP is false to determine if ravb_rx() should be called.
    This is done for non-gPTP IPs. For gPTP IPs the driver PTP-specific
    information was used to determine if receive function should be called. As
    every IP has its own receive function that interrogates the RX descriptors
    list in the same way the ravb_poll() was doing there is no need to double
    check this in ravb_poll(). Removing the code from ravb_poll() leads to a
    cleaner code.
    
    Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
    Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: 596a4254915f ("net: ravb: Always process TX descriptor ring")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: fix rx queue priority assignment [+ + +]

Author: Piotr Wejman <piotrwejman90@gmail.com>
Date:   Mon Apr 1 21:22:39 2024 +0200

    net: stmmac: fix rx queue priority assignment
    
    commit b3da86d432b7cd65b025a11f68613e333d2483db upstream.
    
    The driver should ensure that same priority is not mapped to multiple
    rx queues. From DesignWare Cores Ethernet Quality-of-Service
    Databook, section 17.1.29 MAC_RxQ_Ctrl2:
    "[...]The software must ensure that the content of this field is
    mutually exclusive to the PSRQ fields for other queues, that is,
    the same priority is not mapped to multiple Rx queues[...]"
    
    Previously rx_queue_priority() function was:
    - clearing all priorities from a queue
    - adding new priorities to that queue
    After this patch it will:
    - first assign new priorities to a queue
    - then remove those priorities from all other queues
    - keep other priorities previously assigned to that queue
    
    Fixes: a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration")
    Fixes: 2142754f8b9c ("net: stmmac: Add MAC related callbacks for XGMAC2")
    Signed-off-by: Piotr Wejman <piotrwejman90@gmail.com>
    Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: txgbe: fix i2c dev name cannot match clkdev [+ + +]

Author: Duanqiang Wen <duanqiangwen@net-swift.com>
Date:   Tue Apr 2 10:18:43 2024 +0800

    net: txgbe: fix i2c dev name cannot match clkdev
    
    commit c644920ce9220d83e070f575a4df711741c07f07 upstream.
    
    txgbe clkdev shortened clk_name, so i2c_dev info_name
    also need to shorten. Otherwise, i2c_dev cannot initialize
    clock.
    
    Fixes: e30cef001da2 ("net: txgbe: fix clk_name exceed MAX_DEV_ID limits")
    Signed-off-by: Duanqiang Wen <duanqiangwen@net-swift.com>
    Link: https://lore.kernel.org/r/20240402021843.126192-1-duanqiangwen@net-swift.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: usb: ax88179_178a: avoid the interface always configured as random address [+ + +]

Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Date:   Wed Apr 3 15:21:58 2024 +0200

    net: usb: ax88179_178a: avoid the interface always configured as random address
    
    commit 2e91bb99b9d4f756e92e83c4453f894dda220f09 upstream.
    
    After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
    consecutive device resets"), reset is not executed from bind operation and
    mac address is not read from the device registers or the devicetree at that
    moment. Since the check to configure if the assigned mac address is random
    or not for the interface, happens after the bind operation from
    usbnet_probe, the interface keeps configured as random address, although the
    address is correctly read and set during open operation (the only reset
    now).
    
    In order to keep only one reset for the device and to avoid the interface
    always configured as random address, after reset, configure correctly the
    suitable field from the driver, if the mac address is read successfully from
    the device registers or the devicetree. Take into account if a locally
    administered address (random) was previously stored.
    
    cc: stable@vger.kernel.org # 6.6+
    Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
    Reported-by: Dave Stevenson  <dave.stevenson@raspberrypi.com>
    Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: wwan: t7xx: Split 64bit accesses to fix alignment issues [+ + +]

Author: Bjц╦rn Mork <bjorn@mork.no>
Date:   Fri Mar 22 15:40:00 2024 +0100

    net: wwan: t7xx: Split 64bit accesses to fix alignment issues
    
    [ Upstream commit 7d5a7dd5a35876f0ecc286f3602a88887a788217 ]
    
    Some of the registers are aligned on a 32bit boundary, causing
    alignment faults on 64bit platforms.
    
     Unable to handle kernel paging request at virtual address ffffffc084a1d004
     Mem abort info:
     ESR = 0x0000000096000061
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
     FSC = 0x21: alignment fault
     Data abort info:
     ISV = 0, ISS = 0x00000061, ISS2 = 0x00000000
     CM = 0, WnR = 1, TnD = 0, TagAccess = 0
     GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
     swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000046ad6000
     [ffffffc084a1d004] pgd=100000013ffff003, p4d=100000013ffff003, pud=100000013ffff003, pmd=0068000020a00711
     Internal error: Oops: 0000000096000061 [#1] SMP
     Modules linked in: mtk_t7xx(+) qcserial pppoe ppp_async option nft_fib_inet nf_flow_table_inet mt7921u(O) mt7921s(O) mt7921e(O) mt7921_common(O) iwlmvm(O) iwldvm(O) usb_wwan rndis_host qmi_wwan pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7996e(O) mt792x_usb(O) mt792x_lib(O) mt7915e(O) mt76_usb(O) mt76_sdio(O) mt76_connac_lib(O) mt76(O) mac80211(O) iwlwifi(O) huawei_cdc_ncm cfg80211(O) cdc_ncm cdc_ether wwan usbserial usbnet slhc sfp rtc_pcf8563 nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mt6577_auxadc mdio_i2c libcrc32c compat(O) cdc_wdm cdc_acm at24 crypto_safexcel pwm_fan i2c_gpio i2c_smbus industrialio i2c_algo_bit i2c_mux_reg i2c_mux_pca954x i2c_mux_pca9541 i2c_mux_gpio i2c_mux dummy oid_registry tun sha512_arm64 sha1_ce sha1_generic seqiv
     md5 geniv des_generic libdes cbc authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd nvme nvme_core gpio_button_hotplug(O) dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax usbcore usb_common ptp aquantia pps_core mii tpm encrypted_keys trusted
     CPU: 3 PID: 5266 Comm: kworker/u9:1 Tainted: G O 6.6.22 #0
     Hardware name: Bananapi BPI-R4 (DT)
     Workqueue: md_hk_wq t7xx_fsm_uninit [mtk_t7xx]
     pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : t7xx_cldma_hw_set_start_addr+0x1c/0x3c [mtk_t7xx]
     lr : t7xx_cldma_start+0xac/0x13c [mtk_t7xx]
     sp : ffffffc085d63d30
     x29: ffffffc085d63d30 x28: 0000000000000000 x27: 0000000000000000
     x26: 0000000000000000 x25: ffffff80c804f2c0 x24: ffffff80ca196c05
     x23: 0000000000000000 x22: ffffff80c814b9b8 x21: ffffff80c814b128
     x20: 0000000000000001 x19: ffffff80c814b080 x18: 0000000000000014
     x17: 0000000055c9806b x16: 000000007c5296d0 x15: 000000000f6bca68
     x14: 00000000dbdbdce4 x13: 000000001aeaf72a x12: 0000000000000001
     x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
     x8 : ffffff80ca1ef6b4 x7 : ffffff80c814b818 x6 : 0000000000000018
     x5 : 0000000000000870 x4 : 0000000000000000 x3 : 0000000000000000
     x2 : 000000010a947000 x1 : ffffffc084a1d004 x0 : ffffffc084a1d004
     Call trace:
     t7xx_cldma_hw_set_start_addr+0x1c/0x3c [mtk_t7xx]
     t7xx_fsm_uninit+0x578/0x5ec [mtk_t7xx]
     process_one_work+0x154/0x2a0
     worker_thread+0x2ac/0x488
     kthread+0xe0/0xec
     ret_from_fork+0x10/0x20
     Code: f9400800 91001000 8b214001 d50332bf (f9000022)
     ---[ end trace 0000000000000000 ]---
    
    The inclusion of io-64-nonatomic-lo-hi.h indicates that all 64bit
    accesses can be replaced by pairs of nonatomic 32bit access.  Fix
    alignment by forcing all accesses to be 32bit on 64bit platforms.
    
    Link: https://forum.openwrt.org/t/fibocom-fm350-gl-support/142682/72
    Fixes: 39d439047f1d ("net: wwan: t7xx: Add control DMA interface")
    Signed-off-by: Bjц╦rn Mork <bjorn@mork.no>
    Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
    Tested-by: Liviu Dudau <liviu@dudau.co.uk>
    Link: https://lore.kernel.org/r/20240322144000.1683822-1-bjorn@mork.no
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: discard table flag update with pending basechain deletion [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Wed Apr 3 19:35:30 2024 +0200

    netfilter: nf_tables: discard table flag update with pending basechain deletion
    
    commit 1bc83a019bbe268be3526406245ec28c2458a518 upstream.
    
    Hook unregistration is deferred to the commit phase, same occurs with
    hook updates triggered by the table dormant flag. When both commands are
    combined, this results in deleting a basechain while leaving its hook
    still registered in the core.
    
    Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get() [+ + +]

Author: Ziyang Xuan <william.xuanziyang@huawei.com>
Date:   Wed Apr 3 15:22:04 2024 +0800

    netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
    
    commit 24225011d81b471acc0e1e315b7d9905459a6304 upstream.
    
    nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
    concurrent with __nft_flowtable_type_get() within nf_tables_newflowtable().
    And thhere is not any protection when iterate over nf_tables_flowtables
    list in __nft_flowtable_type_get(). Therefore, there is pertential
    data-race of nf_tables_flowtables list entry.
    
    Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
    in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
    nft_flowtable_type_get() to protect the entire type query process.
    
    Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_tables: flush pending destroy work before exit_net release [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Tue Apr 2 18:04:36 2024 +0200

    netfilter: nf_tables: flush pending destroy work before exit_net release
    
    commit 24cea9677025e0de419989ecb692acd4bb34cac2 upstream.
    
    Similar to 2c9f0293280e ("netfilter: nf_tables: flush pending destroy
    work before netlink notifier") to address a race between exit_net and
    the destroy workqueue.
    
    The trace below shows an element to be released via destroy workqueue
    while exit_net path (triggered via module removal) has already released
    the set that is used in such transaction.
    
    [ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
    [ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465
    [ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359
    [ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
    [ 1360.547984] Call Trace:
    [ 1360.547991]  <TASK>
    [ 1360.547998]  dump_stack_lvl+0x53/0x70
    [ 1360.548014]  print_report+0xc4/0x610
    [ 1360.548026]  ? __virt_addr_valid+0xba/0x160
    [ 1360.548040]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
    [ 1360.548054]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
    [ 1360.548176]  kasan_report+0xae/0xe0
    [ 1360.548189]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
    [ 1360.548312]  nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
    [ 1360.548447]  ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables]
    [ 1360.548577]  ? _raw_spin_unlock_irq+0x18/0x30
    [ 1360.548591]  process_one_work+0x2f1/0x670
    [ 1360.548610]  worker_thread+0x4d3/0x760
    [ 1360.548627]  ? __pfx_worker_thread+0x10/0x10
    [ 1360.548640]  kthread+0x16b/0x1b0
    [ 1360.548653]  ? __pfx_kthread+0x10/0x10
    [ 1360.548665]  ret_from_fork+0x2f/0x50
    [ 1360.548679]  ? __pfx_kthread+0x10/0x10
    [ 1360.548690]  ret_from_fork_asm+0x1a/0x30
    [ 1360.548707]  </TASK>
    
    [ 1360.548719] Allocated by task 192061:
    [ 1360.548726]  kasan_save_stack+0x20/0x40
    [ 1360.548739]  kasan_save_track+0x14/0x30
    [ 1360.548750]  __kasan_kmalloc+0x8f/0xa0
    [ 1360.548760]  __kmalloc_node+0x1f1/0x450
    [ 1360.548771]  nf_tables_newset+0x10c7/0x1b50 [nf_tables]
    [ 1360.548883]  nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink]
    [ 1360.548909]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
    [ 1360.548927]  netlink_unicast+0x367/0x4f0
    [ 1360.548935]  netlink_sendmsg+0x34b/0x610
    [ 1360.548944]  ____sys_sendmsg+0x4d4/0x510
    [ 1360.548953]  ___sys_sendmsg+0xc9/0x120
    [ 1360.548961]  __sys_sendmsg+0xbe/0x140
    [ 1360.548971]  do_syscall_64+0x55/0x120
    [ 1360.548982]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
    
    [ 1360.548994] Freed by task 192222:
    [ 1360.548999]  kasan_save_stack+0x20/0x40
    [ 1360.549009]  kasan_save_track+0x14/0x30
    [ 1360.549019]  kasan_save_free_info+0x3b/0x60
    [ 1360.549028]  poison_slab_object+0x100/0x180
    [ 1360.549036]  __kasan_slab_free+0x14/0x30
    [ 1360.549042]  kfree+0xb6/0x260
    [ 1360.549049]  __nft_release_table+0x473/0x6a0 [nf_tables]
    [ 1360.549131]  nf_tables_exit_net+0x170/0x240 [nf_tables]
    [ 1360.549221]  ops_exit_list+0x50/0xa0
    [ 1360.549229]  free_exit_list+0x101/0x140
    [ 1360.549236]  unregister_pernet_operations+0x107/0x160
    [ 1360.549245]  unregister_pernet_subsys+0x1c/0x30
    [ 1360.549254]  nf_tables_module_exit+0x43/0x80 [nf_tables]
    [ 1360.549345]  __do_sys_delete_module+0x253/0x370
    [ 1360.549352]  do_syscall_64+0x55/0x120
    [ 1360.549360]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
    
    (gdb) list *__nft_release_table+0x473
    0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354).
    11349           list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) {
    11350                   list_del(&flowtable->list);
    11351                   nft_use_dec(&table->use);
    11352                   nf_tables_flowtable_destroy(flowtable);
    11353           }
    11354           list_for_each_entry_safe(set, ns, &table->sets, list) {
    11355                   list_del(&set->list);
    11356                   nft_use_dec(&table->use);
    11357                   if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
    11358                           nft_map_deactivate(&ctx, set);
    (gdb)
    
    [ 1360.549372] Last potentially related work creation:
    [ 1360.549376]  kasan_save_stack+0x20/0x40
    [ 1360.549384]  __kasan_record_aux_stack+0x9b/0xb0
    [ 1360.549392]  __queue_work+0x3fb/0x780
    [ 1360.549399]  queue_work_on+0x4f/0x60
    [ 1360.549407]  nft_rhash_remove+0x33b/0x340 [nf_tables]
    [ 1360.549516]  nf_tables_commit+0x1c6a/0x2620 [nf_tables]
    [ 1360.549625]  nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink]
    [ 1360.549647]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
    [ 1360.549671]  netlink_unicast+0x367/0x4f0
    [ 1360.549680]  netlink_sendmsg+0x34b/0x610
    [ 1360.549690]  ____sys_sendmsg+0x4d4/0x510
    [ 1360.549697]  ___sys_sendmsg+0xc9/0x120
    [ 1360.549706]  __sys_sendmsg+0xbe/0x140
    [ 1360.549715]  do_syscall_64+0x55/0x120
    [ 1360.549725]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
    
    Fixes: 0935d5588400 ("netfilter: nf_tables: asynchronous release")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_tables: reject destroy command to remove basechain hooks [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Mar 21 01:27:50 2024 +0100

    netfilter: nf_tables: reject destroy command to remove basechain hooks
    
    [ Upstream commit b32ca27fa238ff83427d23bef2a5b741e2a88a1e ]
    
    Report EOPNOTSUPP if NFT_MSG_DESTROYCHAIN is used to delete hooks in an
    existing netdev basechain, thus, only NFT_MSG_DELCHAIN is allowed.
    
    Fixes: 7d937b107108f ("netfilter: nf_tables: support for deleting devices in an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: reject new basechain after table flag update [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Mon Apr 1 00:33:02 2024 +0200

    netfilter: nf_tables: reject new basechain after table flag update
    
    commit 994209ddf4f430946f6247616b2e33d179243769 upstream.
    
    When dormant flag is toggled, hooks are disabled in the commit phase by
    iterating over current chains in table (existing and new).
    
    The following configuration allows for an inconsistent state:
    
      add table x
      add chain x y { type filter hook input priority 0; }
      add table x { flags dormant; }
      add chain x w { type filter hook input priority 1; }
    
    which triggers the following warning when trying to unregister chain w
    which is already unregistered.
    
    [  127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:50                                                                     1 __nf_unregister_net_hook+0x21a/0x260
    [...]
    [  127.322519] Call Trace:
    [  127.322521]  <TASK>
    [  127.322524]  ? __warn+0x9f/0x1a0
    [  127.322531]  ? __nf_unregister_net_hook+0x21a/0x260
    [  127.322537]  ? report_bug+0x1b1/0x1e0
    [  127.322545]  ? handle_bug+0x3c/0x70
    [  127.322552]  ? exc_invalid_op+0x17/0x40
    [  127.322556]  ? asm_exc_invalid_op+0x1a/0x20
    [  127.322563]  ? kasan_save_free_info+0x3b/0x60
    [  127.322570]  ? __nf_unregister_net_hook+0x6a/0x260
    [  127.322577]  ? __nf_unregister_net_hook+0x21a/0x260
    [  127.322583]  ? __nf_unregister_net_hook+0x6a/0x260
    [  127.322590]  ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
    [  127.322655]  nft_table_disable+0x75/0xf0 [nf_tables]
    [  127.322717]  nf_tables_commit+0x2571/0x2620 [nf_tables]
    
    Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_tables: reject table flag and netdev basechain updates [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Mar 21 01:27:59 2024 +0100

    netfilter: nf_tables: reject table flag and netdev basechain updates
    
    [ Upstream commit 1e1fb6f00f52812277963365d9bd835b9b0ea4e0 ]
    
    netdev basechain updates are stored in the transaction object hook list.
    When setting on the table dormant flag, it iterates over the existing
    hooks in the basechain. Thus, skipping the hooks that are being
    added/deleted in this transaction, which leaves hook registration in
    inconsistent state.
    
    Reject table flag updates in combination with netdev basechain updates
    in the same batch:
    
    - Update table flags and add/delete basechain: Check from basechain update
      path if there are pending flag updates for this table.
    - add/delete basechain and update table flags: Iterate over the transaction
      list to search for basechain updates from the table update path.
    
    In both cases, the batch is rejected. Based on suggestion from Florian Westphal.
    
    Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
    Fixes: 7d937b107108f ("netfilter: nf_tables: support for deleting devices in an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: release batch on table validation from abort path [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Mar 28 13:27:36 2024 +0100

    netfilter: nf_tables: release batch on table validation from abort path
    
    commit a45e6889575c2067d3c0212b6bc1022891e65b91 upstream.
    
    Unlike early commit path stage which triggers a call to abort, an
    explicit release of the batch is required on abort, otherwise mutex is
    released and commit_list remains in place.
    
    Add WARN_ON_ONCE to ensure commit_list is empty from the abort path
    before releasing the mutex.
    
    After this patch, commit_list is always assumed to be empty before
    grabbing the mutex, therefore
    
      03c1f1ef1584 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")
    
    only needs to release the pending modules for registration.
    
    Cc: stable@vger.kernel.org
    Fixes: c0391b6ab810 ("netfilter: nf_tables: missing validation from the abort path")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Mar 28 14:23:55 2024 +0100

    netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
    
    commit 0d459e2ffb541841714839e8228b845458ed3b27 upstream.
    
    The commit mutex should not be released during the critical section
    between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
    worker could collect expired objects and get the released commit lock
    within the same GC sequence.
    
    nf_tables_module_autoload() temporarily releases the mutex to load
    module dependencies, then it goes back to replay the transaction again.
    Move it at the end of the abort phase after nft_gc_seq_end() is called.
    
    Cc: stable@vger.kernel.org
    Fixes: 720344340fb9 ("netfilter: nf_tables: GC transaction race with abort path")
    Reported-by: Kuan-Ting Chen <hexrabbit@devco.re>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: nf_tables: skip netdev hook unregistration if table is dormant [+ + +]

Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Mar 21 01:28:07 2024 +0100

    netfilter: nf_tables: skip netdev hook unregistration if table is dormant
    
    [ Upstream commit 216e7bf7402caf73f4939a8e0248392e96d7c0da ]
    
    Skip hook unregistration when adding or deleting devices from an
    existing netdev basechain. Otherwise, commit/abort path try to
    unregister hooks which not enabled.
    
    Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
    Fixes: 7d937b107108 ("netfilter: nf_tables: support for deleting devices in an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: validate user input for expected length [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Apr 4 12:20:51 2024 +0000

    netfilter: validate user input for expected length
    
    commit 0c83842df40f86e529db6842231154772c20edcc upstream.
    
    I got multiple syzbot reports showing old bugs exposed
    by BPF after commit 20f2505fb436 ("bpf: Try to avoid kzalloc
    in cgroup/{s,g}etsockopt")
    
    setsockopt() @optlen argument should be taken into account
    before copying data.
    
     BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
     BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline]
     BUG: KASAN: slab-out-of-bounds in do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline]
     BUG: KASAN: slab-out-of-bounds in do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627
    Read of size 96 at addr ffff88802cd73da0 by task syz-executor.4/7238
    
    CPU: 1 PID: 7238 Comm: syz-executor.4 Not tainted 6.9.0-rc2-next-20240403-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
    Call Trace:
     <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
      print_address_description mm/kasan/report.c:377 [inline]
      print_report+0x169/0x550 mm/kasan/report.c:488
      kasan_report+0x143/0x180 mm/kasan/report.c:601
      kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
      __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105
      copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
      copy_from_sockptr include/linux/sockptr.h:55 [inline]
      do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline]
      do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627
      nf_setsockopt+0x295/0x2c0 net/netfilter/nf_sockopt.c:101
      do_sock_setsockopt+0x3af/0x720 net/socket.c:2311
      __sys_setsockopt+0x1ae/0x250 net/socket.c:2334
      __do_sys_setsockopt net/socket.c:2343 [inline]
      __se_sys_setsockopt net/socket.c:2340 [inline]
      __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
     do_syscall_64+0xfb/0x240
     entry_SYSCALL_64_after_hwframe+0x72/0x7a
    RIP: 0033:0x7fd22067dde9
    Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007fd21f9ff0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
    RAX: ffffffffffffffda RBX: 00007fd2207abf80 RCX: 00007fd22067dde9
    RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000003
    RBP: 00007fd2206ca47a R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000020000880 R11: 0000000000000246 R12: 0000000000000000
    R13: 000000000000000b R14: 00007fd2207abf80 R15: 00007ffd2d0170d8
     </TASK>
    
    Allocated by task 7238:
      kasan_save_stack mm/kasan/common.c:47 [inline]
      kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
      poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
      __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
      kasan_kmalloc include/linux/kasan.h:211 [inline]
      __do_kmalloc_node mm/slub.c:4069 [inline]
      __kmalloc_noprof+0x200/0x410 mm/slub.c:4082
      kmalloc_noprof include/linux/slab.h:664 [inline]
      __cgroup_bpf_run_filter_setsockopt+0xd47/0x1050 kernel/bpf/cgroup.c:1869
      do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293
      __sys_setsockopt+0x1ae/0x250 net/socket.c:2334
      __do_sys_setsockopt net/socket.c:2343 [inline]
      __se_sys_setsockopt net/socket.c:2340 [inline]
      __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
     do_syscall_64+0xfb/0x240
     entry_SYSCALL_64_after_hwframe+0x72/0x7a
    
    The buggy address belongs to the object at ffff88802cd73da0
     which belongs to the cache kmalloc-8 of size 8
    The buggy address is located 0 bytes inside of
     allocated 1-byte region [ffff88802cd73da0, ffff88802cd73da1)
    
    The buggy address belongs to the physical page:
    page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802cd73020 pfn:0x2cd73
    flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff)
    page_type: 0xffffefff(slab)
    raw: 00fff80000000000 ffff888015041280 dead000000000100 dead000000000122
    raw: ffff88802cd73020 000000008080007f 00000001ffffefff 0000000000000000
    page dumped because: kasan: bad access detected
    page_owner tracks the page as allocated
    page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5103, tgid 2119833701 (syz-executor.4), ts 5103, free_ts 70804600828
      set_page_owner include/linux/page_owner.h:32 [inline]
      post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490
      prep_new_page mm/page_alloc.c:1498 [inline]
      get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454
      __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712
      __alloc_pages_node_noprof include/linux/gfp.h:244 [inline]
      alloc_pages_node_noprof include/linux/gfp.h:271 [inline]
      alloc_slab_page+0x5f/0x120 mm/slub.c:2249
      allocate_slab+0x5a/0x2e0 mm/slub.c:2412
      new_slab mm/slub.c:2465 [inline]
      ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3615
      __slab_alloc+0x58/0xa0 mm/slub.c:3705
      __slab_alloc_node mm/slub.c:3758 [inline]
      slab_alloc_node mm/slub.c:3936 [inline]
      __do_kmalloc_node mm/slub.c:4068 [inline]
      kmalloc_node_track_caller_noprof+0x286/0x450 mm/slub.c:4089
      kstrdup+0x3a/0x80 mm/util.c:62
      device_rename+0xb5/0x1b0 drivers/base/core.c:4558
      dev_change_name+0x275/0x860 net/core/dev.c:1232
      do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2864
      __rtnl_newlink net/core/rtnetlink.c:3680 [inline]
      rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
      rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6594
      netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
      netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
      netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
    page last free pid 5146 tgid 5146 stack trace:
      reset_page_owner include/linux/page_owner.h:25 [inline]
      free_pages_prepare mm/page_alloc.c:1110 [inline]
      free_unref_page+0xd3c/0xec0 mm/page_alloc.c:2617
      discard_slab mm/slub.c:2511 [inline]
      __put_partials+0xeb/0x130 mm/slub.c:2980
      put_cpu_partial+0x17c/0x250 mm/slub.c:3055
      __slab_free+0x2ea/0x3d0 mm/slub.c:4254
      qlink_free mm/kasan/quarantine.c:163 [inline]
      qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
      kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
      __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
      kasan_slab_alloc include/linux/kasan.h:201 [inline]
      slab_post_alloc_hook mm/slub.c:3888 [inline]
      slab_alloc_node mm/slub.c:3948 [inline]
      __do_kmalloc_node mm/slub.c:4068 [inline]
      __kmalloc_node_noprof+0x1d7/0x450 mm/slub.c:4076
      kmalloc_node_noprof include/linux/slab.h:681 [inline]
      kvmalloc_node_noprof+0x72/0x190 mm/util.c:634
      bucket_table_alloc lib/rhashtable.c:186 [inline]
      rhashtable_rehash_alloc+0x9e/0x290 lib/rhashtable.c:367
      rht_deferred_worker+0x4e1/0x2440 lib/rhashtable.c:427
      process_one_work kernel/workqueue.c:3218 [inline]
      process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
      worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
      kthread+0x2f0/0x390 kernel/kthread.c:388
      ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
      ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
    
    Memory state around the buggy address:
     ffff88802cd73c80: 07 fc fc fc 05 fc fc fc 05 fc fc fc fa fc fc fc
     ffff88802cd73d00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
    >ffff88802cd73d80: fa fc fc fc 01 fc fc fc fa fc fc fc fa fc fc fc
                                   ^
     ffff88802cd73e00: fa fc fc fc fa fc fc fc 05 fc fc fc 07 fc fc fc
     ffff88802cd73e80: 07 fc fc fc 07 fc fc fc 07 fc fc fc 07 fc fc fc
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Link: https://lore.kernel.org/r/20240404122051.2303764-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet [+ + +]

Author: Ryosuke Yasuoka <ryasuoka@redhat.com>
Date:   Wed Mar 20 09:54:10 2024 +0900

    nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet
    
    [ Upstream commit d24b03535e5eb82e025219c2f632b485409c898f ]
    
    syzbot reported the following uninit-value access issue [1][2]:
    
    nci_rx_work() parses and processes received packet. When the payload
    length is zero, each message type handler reads uninitialized payload
    and KMSAN detects this issue. The receipt of a packet with a zero-size
    payload is considered unexpected, and therefore, such packets should be
    silently discarded.
    
    This patch resolved this issue by checking payload size before calling
    each message type handler codes.
    
    Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
    Reported-and-tested-by: syzbot+7ea9413ea6749baf5574@syzkaller.appspotmail.com
    Reported-and-tested-by: syzbot+29b5ca705d2e0f4a44d2@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=7ea9413ea6749baf5574 [1]
    Closes: https://syzkaller.appspot.com/bug?extid=29b5ca705d2e0f4a44d2 [2]
    Signed-off-by: Ryosuke Yasuoka <ryasuoka@redhat.com>
    Reviewed-by: Jeremy Cline <jeremy@jcline.org>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nfsd: hold a lighter-weight client reference over CB_RECALL_ANY [+ + +]

Author: Jeff Layton <jlayton@kernel.org>
Date:   Fri Apr 5 13:56:18 2024 -0400

    nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
    
    [ Upstream commit 10396f4df8b75ff6ab0aa2cd74296565466f2c8d ]
    
    Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
    client. While a callback job is technically an RPC that counter is
    really more for client-driven RPCs, and this has the effect of
    preventing the client from being unhashed until the callback completes.
    
    If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
    can end up in a situation where the callback can't complete on the (now
    dead) callback channel, but the new client can't connect because the old
    client can't be unhashed. This usually manifests as a NFS4ERR_DELAY
    return on the CREATE_SESSION operation.
    
    The job is only holding a reference to the client so it can clear a flag
    after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a
    reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of
    reference when dealing with the nfsdfs info files, but it should work
    appropriately here to ensure that the nfs4_client doesn't disappear.
    
    Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
    Reported-by: Vladimir Benes <vbenes@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

nouveau/uvmm: fix addr/range calcs for remap operations [+ + +]

Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Mar 28 12:43:16 2024 +1000

    nouveau/uvmm: fix addr/range calcs for remap operations
    
    [ Upstream commit be141849ec00ef39935bf169c0f194ac70bf85ce ]
    
    dEQP-VK.sparse_resources.image_rebind.2d_array.r64i.128_128_8
    was causing a remap operation like the below.
    
    op_remap: prev: 0000003fffed0000 00000000000f0000 00000000a5abd18a 0000000000000000
    op_remap: next:
    op_remap: unmap: 0000003fffed0000 0000000000100000 0
    op_map: map: 0000003ffffc0000 0000000000010000 000000005b1ba33c 00000000000e0000
    
    This was resulting in an unmap operation from 0x3fffed0000+0xf0000, 0x100000
    which was corrupting the pagetables and oopsing the kernel.
    
    Fixes the prev + unmap range calcs to use start/end and map back to addr/range.
    
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
    Cc: Danilo Krummrich <dakr@redhat.com>
    Signed-off-by: Danilo Krummrich <dakr@redhat.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240328024317.2041851-1-airlied@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

octeontx2-af: Add array index check [+ + +]

Author: Aleksandr Mishin <amishin@t-argos.ru>
Date:   Thu Mar 28 19:55:05 2024 +0300

    octeontx2-af: Add array index check
    
    commit ef15ddeeb6bee87c044bf7754fac524545bf71e8 upstream.
    
    In rvu_map_cgx_lmac_pf() the 'iter', which is used as an array index, can reach
    value (up to 14) that exceed the size (MAX_LMAC_COUNT = 8) of the array.
    Fix this bug by adding 'iter' value check.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 91c6945ea1f9 ("octeontx2-af: cn10k: Add RPM MAC support")
    Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

octeontx2-af: Fix issue with loading coalesced KPU profiles [+ + +]

Author: Hariprasad Kelam <hkelam@marvell.com>
Date:   Tue Mar 26 17:51:49 2024 +0530

    octeontx2-af: Fix issue with loading coalesced KPU profiles
    
    commit 0ba80d96585662299d4ea4624043759ce9015421 upstream.
    
    The current implementation for loading coalesced KPU profiles has
    a limitation.  The "offset" field, which is used to locate profiles
    within the profile is restricted to a u16.
    
    This restricts the number of profiles that can be loaded. This patch
    addresses this limitation by increasing the size of the "offset" field.
    
    Fixes: 11c730bfbf5b ("octeontx2-af: support for coalescing KPU profiles")
    Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
    Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Octeontx2-af: fix pause frame configuration in GMP mode [+ + +]

Author: Hariprasad Kelam <hkelam@marvell.com>
Date:   Tue Mar 26 10:57:20 2024 +0530

    Octeontx2-af: fix pause frame configuration in GMP mode
    
    [ Upstream commit 40d4b4807cadd83fb3f46cc8cd67a945b5b25461 ]
    
    The Octeontx2 MAC block (CGX) has separate data paths (SMU and GMP) for
    different speeds, allowing for efficient data transfer.
    
    The previous patch which added pause frame configuration has a bug due
    to which pause frame feature is not working in GMP mode.
    
    This patch fixes the issue by configurating appropriate registers.
    
    Fixes: f7e086e754fe ("octeontx2-af: Pause frame configuration at cgx")
    Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20240326052720.4441-1-hkelam@marvell.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

octeontx2-pf: check negative error code in otx2_open() [+ + +]

Author: Su Hui <suhui@nfschina.com>
Date:   Thu Mar 28 10:06:21 2024 +0800

    octeontx2-pf: check negative error code in otx2_open()
    
    commit e709acbd84fb6ef32736331b0147f027a3ef4c20 upstream.
    
    otx2_rxtx_enable() return negative error code such as -EIO,
    check -EIO rather than EIO to fix this problem.
    
    Fixes: c926252205c4 ("octeontx2-pf: Disable packet I/O for graceful exit")
    Signed-off-by: Su Hui <suhui@nfschina.com>
    Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
    Link: https://lore.kernel.org/r/20240328020620.4054692-1-suhui@nfschina.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

of: dynamic: Synchronize of_changeset_destroy() with the devlink removals [+ + +]

Author: Herve Codina <herve.codina@bootlin.com>
Date:   Mon Mar 25 16:21:26 2024 +0100

    of: dynamic: Synchronize of_changeset_destroy() with the devlink removals
    
    commit 8917e7385346bd6584890ed362985c219fe6ae84 upstream.
    
    In the following sequence:
      1) of_platform_depopulate()
      2) of_overlay_remove()
    
    During the step 1, devices are destroyed and devlinks are removed.
    During the step 2, OF nodes are destroyed but
    __of_changeset_entry_destroy() can raise warnings related to missing
    of_node_put():
      ERROR: memory leak, expected refcount 1 instead of 2 ...
    
    Indeed, during the devlink removals performed at step 1, the removal
    itself releasing the device (and the attached of_node) is done by a job
    queued in a workqueue and so, it is done asynchronously with respect to
    function calls.
    When the warning is present, of_node_put() will be called but wrongly
    too late from the workqueue job.
    
    In order to be sure that any ongoing devlink removals are done before
    the of_node destruction, synchronize the of_changeset_destroy() with the
    devlink removals.
    
    Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
    Cc: stable@vger.kernel.org
    Signed-off-by: Herve Codina <herve.codina@bootlin.com>
    Reviewed-by: Saravana Kannan <saravanak@google.com>
    Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
    Reviewed-by: Nuno Sa <nuno.sa@analog.com>
    Link: https://lore.kernel.org/r/20240325152140.198219-3-herve.codina@bootlin.com
    Signed-off-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

of: module: prevent NULL pointer dereference in vsnprintf() [+ + +]

Author: Sergey Shtylyov <s.shtylyov@omp.ru>
Date:   Wed Mar 27 19:52:49 2024 +0300

    of: module: prevent NULL pointer dereference in vsnprintf()
    
    commit a1aa5390cc912934fee76ce80af5f940452fa987 upstream.
    
    In of_modalias(), we can get passed the str and len parameters which would
    cause a kernel oops in vsnprintf() since it only allows passing a NULL ptr
    when the length is also 0. Also, we need to filter out the negative values
    of the len parameter as these will result in a really huge buffer since
    snprintf() takes size_t parameter while ours is ssize_t...
    
    Found by Linux Verification Center (linuxtesting.org) with the Svace static
    analysis tool.
    
    Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/1d211023-3923-685b-20f0-f3f90ea56e1f@omp.ru
    Signed-off-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later [+ + +]

Author: Sandipan Das <sandipan.das@amd.com>
Date:   Mon Mar 25 13:17:53 2024 +0530

    perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later
    
    [ Upstream commit c7b2edd8377be983442c1344cb940cd2ac21b601 ]
    
    AMD processors based on Zen 2 and later microarchitectures do not
    support PMCx087 (instruction pipe stalls) which is used as the backing
    event for "stalled-cycles-frontend" and "stalled-cycles-backend".
    
    Use PMCx0A9 (cycles where micro-op queue is empty) instead to count
    frontend stalls and remove the entry for backend stalls since there
    is no direct replacement.
    
    Signed-off-by: Sandipan Das <sandipan.das@amd.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Reviewed-by: Ian Rogers <irogers@google.com>
    Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h")
    Link: https://lore.kernel.org/r/03d7fc8fa2a28f9be732116009025bdec1b3ec97.1711352180.git.sandipan.das@amd.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/x86/amd/lbr: Use freeze based on availability [+ + +]

Author: Sandipan Das <sandipan.das@amd.com>
Date:   Mon Mar 25 13:01:45 2024 +0530

    perf/x86/amd/lbr: Use freeze based on availability
    
    [ Upstream commit 598c2fafc06fe5c56a1a415fb7b544b31453d637 ]
    
    Currently, the LBR code assumes that LBR Freeze is supported on all processors
    when X86_FEATURE_AMD_LBR_V2 is available i.e. CPUID leaf 0x80000022[EAX]
    bit 1 is set. This is incorrect as the availability of the feature is
    additionally dependent on CPUID leaf 0x80000022[EAX] bit 2 being set,
    which may not be set for all Zen 4 processors.
    
    Define a new feature bit for LBR and PMC freeze and set the freeze enable bit
    (FLBRI) in DebugCtl (MSR 0x1d9) conditionally.
    
    It should still be possible to use LBR without freeze for profile-guided
    optimization of user programs by using an user-only branch filter during
    profiling. When the user-only filter is enabled, branches are no longer
    recorded after the transition to CPL 0 upon PMI arrival. When branch
    entries are read in the PMI handler, the branch stack does not change.
    
    E.g.
    
      $ perf record -j any,u -e ex_ret_brn_tkn ./workload
    
    Since the feature bit is visible under flags in /proc/cpuinfo, it can be
    used to determine the feasibility of use-cases which require LBR Freeze
    to be supported by the hardware such as profile-guided optimization of
    kernels.
    
    Fixes: ca5b7c0d9621 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support")
    Signed-off-by: Sandipan Das <sandipan.das@amd.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/69a453c97cfd11c6f2584b19f937fe6df741510f.1711091584.git.sandipan.das@amd.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf/x86/intel/ds: Don't clear ->pebs_data_cfg for the last PEBS event [+ + +]

Author: Kan Liang <kan.liang@linux.intel.com>
Date:   Mon Apr 1 06:33:20 2024 -0700

    perf/x86/intel/ds: Don't clear ->pebs_data_cfg for the last PEBS event
    
    commit 312be9fc2234c8acfb8148a9f4c358b70d358dee upstream.
    
    The MSR_PEBS_DATA_CFG MSR register is used to configure which data groups
    should be generated into a PEBS record, and it's shared among all counters.
    
    If there are different configurations among counters, perf combines all the
    configurations.
    
    The first perf command as below requires a complete PEBS record
    (including memory info, GPRs, XMMs, and LBRs). The second perf command
    only requires a basic group. However, after the second perf command is
    running, the MSR_PEBS_DATA_CFG register is cleared. Only a basic group is
    generated in a PEBS record, which is wrong. The required information
    for the first perf command is missed.
    
     $ perf record --intr-regs=AX,SP,XMM0 -a -C 8 -b -W -d -c 100000003 -o /dev/null -e cpu/event=0xd0,umask=0x81/upp &
     $ sleep 5
     $ perf record  --per-thread  -c 1  -e cycles:pp --no-timestamp --no-tid taskset -c 8 ./noploop 1000
    
    The first PEBS event is a system-wide PEBS event. The second PEBS event
    is a per-thread event. When the thread is scheduled out, the
    intel_pmu_pebs_del() function is invoked to update the PEBS state.
    Since the system-wide event is still available, the cpuc->n_pebs is 1.
    The cpuc->pebs_data_cfg is cleared. The data configuration for the
    system-wide PEBS event is lost.
    
    The (cpuc->n_pebs == 1) check was introduced in commit:
    
      b6a32f023fcc ("perf/x86: Fix PEBS threshold initialization")
    
    At that time, it indeed didn't hurt whether the state was updated
    during the removal, because only the threshold is updated.
    
    The calculation of the threshold takes the last PEBS event into
    account.
    
    However, since commit:
    
      b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG")
    
    we delay the threshold update, and clear the PEBS data config, which triggers
    the bug.
    
    The PEBS data config update scope should not be shrunk during removal.
    
    [ mingo: Improved the changelog & comments. ]
    
    Fixes: b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG")
    Reported-by: Stephane Eranian <eranian@google.com>
    Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240401133320.703971-1-kan.liang@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

r8169: fix issue caused by buggy BIOS on certain boards with RTL8168d [+ + +]

Author: Heiner Kallweit <hkallweit1@gmail.com>
Date:   Sat Mar 30 12:49:02 2024 +0100

    r8169: fix issue caused by buggy BIOS on certain boards with RTL8168d
    
    commit 5d872c9f46bd2ea3524af3c2420a364a13667135 upstream.
    
    On some boards with this chip version the BIOS is buggy and misses
    to reset the PHY page selector. This results in the PHY ID read
    accessing registers on a different page, returning a more or
    less random value. Fix this by resetting the page selector first.
    
    Fixes: f1e911d5d0df ("r8169: add basic phylib support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/64f2055e-98b8-45ec-8568-665e3d54d4e6@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

r8169: skip DASH fw status checks when DASH is disabled [+ + +]

Author: Atlas Yu <atlas.yu@canonical.com>
Date:   Thu Mar 28 13:51:52 2024 +0800

    r8169: skip DASH fw status checks when DASH is disabled
    
    commit 5e864d90b20803edf6bd44a99fb9afa7171785f2 upstream.
    
    On devices that support DASH, the current code in the "rtl_loop_wait" function
    raises false alarms when DASH is disabled. This occurs because the function
    attempts to wait for the DASH firmware to be ready, even though it's not
    relevant in this case.
    
    r8169 0000:0c:00.0 eth0: RTL8168ep/8111ep, 38:7c:76:49:08:d9, XID 502, IRQ 86
    r8169 0000:0c:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
    r8169 0000:0c:00.0 eth0: DASH disabled
    ...
    r8169 0000:0c:00.0 eth0: rtl_ep_ocp_read_cond == 0 (loop: 30, delay: 10000).
    
    This patch modifies the driver start/stop functions to skip checking the DASH
    firmware status when DASH is explicitly disabled. This prevents unnecessary
    delays and false alarms.
    
    The patch has been tested on several ThinkStation P8/PX workstations.
    
    Fixes: 0ab0c45d8aae ("r8169: add handling DASH when DASH is disabled")
    Signed-off-by: Atlas Yu <atlas.yu@canonical.com>
    Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
    Link: https://lore.kernel.org/r/20240328055152.18443-1-atlas.yu@canonical.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

regmap: maple: Fix cache corruption in regcache_maple_drop() [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Wed Mar 27 11:44:06 2024 +0000

    regmap: maple: Fix cache corruption in regcache_maple_drop()
    
    [ Upstream commit 00bb549d7d63a21532e76e4a334d7807a54d9f31 ]
    
    When keeping the upper end of a cache block entry, the entry[] array
    must be indexed by the offset from the base register of the block,
    i.e. max - mas.index.
    
    The code was indexing entry[] by only the register address, leading
    to an out-of-bounds access that copied some part of the kernel
    memory over the cache contents.
    
    This bug was not detected by the regmap KUnit test because it only
    tests with a block of registers starting at 0, so mas.index == 0.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache")
    Link: https://msgid.link/r/20240327114406.976986-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

regmap: maple: Fix uninitialized symbol 'ret' warnings [+ + +]

Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Fri Mar 29 14:46:30 2024 +0000

    regmap: maple: Fix uninitialized symbol 'ret' warnings
    
    [ Upstream commit eaa03486d932572dfd1c5f64f9dfebe572ad88c0 ]
    
    Fix warnings reported by smatch by initializing local 'ret' variable
    to 0.
    
    drivers/base/regmap/regcache-maple.c:186 regcache_maple_drop()
    error: uninitialized symbol 'ret'.
    drivers/base/regmap/regcache-maple.c:290 regcache_maple_sync()
    error: uninitialized symbol 'ret'.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache")
    Link: https://lore.kernel.org/r/20240329144630.1965159-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "ALSA: emu10k1: fix synthesizer sample playback position and caching" [+ + +]

Author: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Date:   Mon Apr 1 16:58:05 2024 +0200

    Revert "ALSA: emu10k1: fix synthesizer sample playback position and caching"
    
    [ Upstream commit 03f56ed4ead162551ac596c9e3076ff01f1c5836 ]
    
    As already anticipated in the original commit, playback was broken for
    very short samples. I just didn't expect it to be an actual problem,
    because we're talking about less than 1.5 milliseconds here. But clearly
    such wavetable samples do actually exist.
    
    The problem was that for such short samples we'd set the current
    position beyond the end of the loop, so we'd run off the end of the
    sample and play garbage.
    This is a bigger (more audible) problem than the original one, which was
    that we'd start playback with garbage (whatever was still in the cache),
    which would be mostly masked by the note's attack phase.
    
    So revert to the old behavior for now. We'll subsequently fix it
    properly with a bigger patch series.
    Note that this isn't a full revert - the dead code is not re-introduced,
    because that would be silly.
    
    Fixes: df335e9a8bcb ("ALSA: emu10k1: fix synthesizer sample playback position and caching")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=218625
    Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
    Message-ID: <20240401145805.528794-1-oswald.buddenhagen@gmx.de>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT" [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Thu Mar 14 09:44:12 2024 +0100

    Revert "Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT"
    
    commit 4790a73ace86f3d165bbedba898e0758e6e1b82d upstream.
    
    This reverts commit 7dcd3e014aa7faeeaf4047190b22d8a19a0db696.
    
    Qualcomm Bluetooth controllers like WCN6855 do not have persistent
    storage for the Bluetooth address and must therefore start as
    unconfigured to allow the user to set a valid address unless one has
    been provided by the boot firmware in the devicetree.
    
    A recent change snuck into v6.8-rc7 and incorrectly started marking the
    default (non-unique) address as valid. This specifically also breaks the
    Bluetooth setup for some user of the Lenovo ThinkPad X13s.
    
    Note that this is the second time Qualcomm breaks the driver this way
    and that this was fixed last year by commit 6945795bc81a ("Bluetooth:
    fix use-bdaddr-property quirk"), which also has some further details.
    
    Fixes: 7dcd3e014aa7 ("Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT")
    Cc: stable@vger.kernel.org      # 6.8
    Cc: Janaki Ramaiah Thota <quic_janathot@quicinc.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Reported-by: Clayton Craft <clayton@craftyguy.net>
    Tested-by: Clayton Craft <clayton@craftyguy.net>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped." [+ + +]

Author: Ingo Molnar <mingo@kernel.org>
Date:   Mon Mar 25 11:47:51 2024 +0100

    Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."
    
    [ Upstream commit c567f2948f57bdc03ed03403ae0234085f376b7d ]
    
    This reverts commit d794734c9bbfe22f86686dc2909c25f5ffe1a572.
    
    While the original change tries to fix a bug, it also unintentionally broke
    existing systems, see the regressions reported at:
    
      https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph.com/
    
    Since d794734c9bbf was also marked for -stable, let's back it out before
    causing more damage.
    
    Note that due to another upstream change the revert was not 100% automatic:
    
      0a845e0f6348 mm/treewide: replace pud_large() with pud_leaf()
    
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: <stable@vger.kernel.org>
    Cc: Russ Anderson <rja@hpe.com>
    Cc: Steve Wahl <steve.wahl@hpe.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Link: https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph.com/
    Fixes: d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "x86/mpparse: Register APIC address only once" [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Mon Apr 8 12:42:06 2024 +0200

    Revert "x86/mpparse: Register APIC address only once"
    
    This reverts commit bebb5af001dc6cb4f505bb21c4d5e2efbdc112e2 which is
    commit f2208aa12c27bfada3c15c550c03ca81d42dcac2 upstream.
    
    It is reported to cause problems in the stable branches, so revert it.
    
    Link: https://lore.kernel.org/r/899b7c1419a064a2b721b78eade06659@stwm.de
    Reported-by: Wolfgang Walter <linux@stwm.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Guenter Roeck <linux@roeck-us.net>
    Cc: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RISC-V: KVM: Fix APLIC in_clrip[x] read emulation [+ + +]

Author: Anup Patel <apatel@ventanamicro.com>
Date:   Thu Mar 21 14:20:41 2024 +0530

    RISC-V: KVM: Fix APLIC in_clrip[x] read emulation
    
    commit 8e936e98718f005c986be0bfa1ee6b355acf96be upstream.
    
    The reads to APLIC in_clrip[x] registers returns rectified input values
    of the interrupt sources.
    
    A rectified input value of an interrupt source is defined by the section
    "4.5.2 Source configurations (sourcecfg[1]Б─⌠sourcecfg[1023])" of the
    RISC-V AIA specification as:
    
        rectified input value = (incoming wire value) XOR (source is inverted)
    
    Update the riscv_aplic_input() implementation to match the above.
    
    Cc: stable@vger.kernel.org
    Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC")
    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Signed-off-by: Anup Patel <anup@brainfault.org>
    Link: https://lore.kernel.org/r/20240321085041.1955293-3-apatel@ventanamicro.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RISC-V: KVM: Fix APLIC setipnum_le/be write emulation [+ + +]

Author: Anup Patel <apatel@ventanamicro.com>
Date:   Thu Mar 21 14:20:40 2024 +0530

    RISC-V: KVM: Fix APLIC setipnum_le/be write emulation
    
    commit d8dd9f113e16bef3b29c9dcceb584a6f144f55e4 upstream.
    
    The writes to setipnum_le/be register for APLIC in MSI-mode have special
    consideration for level-triggered interrupts as-per the section "4.9.2
    Special consideration for level-sensitive interrupt sources" of the RISC-V
    AIA specification.
    
    Particularly, the below text from the RISC-V AIA specification defines
    the behaviour of writes to setipnum_le/be register for level-triggered
    interrupts:
    
    "A second option is for the interrupt service routine to write the
    APLICБ─≥s source identity number for the interrupt to the domainБ─≥s
    setipnum register just before exiting. This will cause the interruptБ─≥s
    pending bit to be set to one again if the source is still asserting
    an interrupt, but not if the source is not asserting an interrupt."
    
    Fix setipnum_le/be write emulation for in-kernel APLIC by implementing
    the above behaviour in aplic_write_pending() function.
    
    Cc: stable@vger.kernel.org
    Fixes: 74967aa208e2 ("RISC-V: KVM: Add in-kernel emulation of AIA APLIC")
    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Signed-off-by: Anup Patel <anup@brainfault.org>
    Link: https://lore.kernel.org/r/20240321085041.1955293-2-apatel@ventanamicro.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZ [+ + +]

Author: Victor Isaev <victor@torrio.net>
Date:   Fri Dec 15 23:27:20 2023 -0500

    RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZ
    
    [ Upstream commit 13dddf9319808badd2c1f5d7007b4e82838a648e ]
    
    "riscv: signal: Report signal frame size to userspace via auxv" (e92f469)
    has added new constant AT_MINSIGSTKSZ but failed to increment the size of
    auxv, keeping AT_VECTOR_SIZE_ARCH at 9.
    This fix correctly increments AT_VECTOR_SIZE_ARCH to 10, following the
    approach in the commit 94b07c1 ("arm64: signal: Report signal frame size
    to userspace via auxv").
    
    Link: https://lore.kernel.org/r/73883406.20231215232720@torrio.net
    Link: https://lore.kernel.org/all/20240102133617.3649-1-victor@torrio.net/
    Reported-by: Ivan Komarov <ivan.komarov@dfyz.info>
    Closes: https://lore.kernel.org/linux-riscv/CY3Z02NYV1C4.11BLB9PLVW9G1@fedora/
    Fixes: e92f469b0771 ("riscv: signal: Report signal frame size to userspace via auxv")
    Signed-off-by: Victor Isaev <isv@google.com>
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Disable preemption when using patch_map() [+ + +]

Author: Alexandre Ghiti <alexghiti@rivosinc.com>
Date:   Tue Mar 26 21:30:17 2024 +0100

    riscv: Disable preemption when using patch_map()
    
    [ Upstream commit a370c2419e4680a27382d9231edcf739d5d74efc ]
    
    patch_map() uses fixmap mappings to circumvent the non-writability of
    the kernel text mapping.
    
    The __set_fixmap() function only flushes the current cpu tlb, it does
    not emit an IPI so we must make sure that while we use a fixmap mapping,
    the current task is not migrated on another cpu which could miss the
    newly introduced fixmap mapping.
    
    So in order to avoid any task migration, disable the preemption.
    
    Reported-by: Andrea Parri <andrea@rivosinc.com>
    Closes: https://lore.kernel.org/all/ZcS+GAaM25LXsBOl@andrea/
    Reported-by: Andy Chiu <andy.chiu@sifive.com>
    Closes: https://lore.kernel.org/linux-riscv/CABgGipUMz3Sffu-CkmeUB1dKVwVQ73+7=sgC45-m0AE9RCjOZg@mail.gmail.com/
    Fixes: cad539baa48f ("riscv: implement a memset like function for text")
    Fixes: 0ff7c3b33127 ("riscv: Use text_mutex instead of patch_lock")
    Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
    Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
    Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Acked-by: Puranjay Mohan <puranjay12@gmail.com>
    Link: https://lore.kernel.org/r/20240326203017.310422-3-alexghiti@rivosinc.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

riscv: Fix spurious errors from __get/put_kernel_nofault [+ + +]

Author: Samuel Holland <samuel.holland@sifive.com>
Date:   Mon Mar 11 19:19:13 2024 -0700

    riscv: Fix spurious errors from __get/put_kernel_nofault
    
    commit d080a08b06b6266cc3e0e86c5acfd80db937cb6b upstream.
    
    These macros did not initialize __kr_err, so they could fail even if
    the access did not fault.
    
    Cc: stable@vger.kernel.org
    Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault")
    Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
    Link: https://lore.kernel.org/r/20240312022030.320789-1-samuel.holland@sifive.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

riscv: process: Fix kernel gp leakage [+ + +]

Author: Stefan O'Rear <sorear@fastmail.com>
Date:   Wed Mar 27 02:12:58 2024 -0400

    riscv: process: Fix kernel gp leakage
    
    commit d14fa1fcf69db9d070e75f1c4425211fa619dfc8 upstream.
    
    childregs represents the registers which are active for the new thread
    in user context. For a kernel thread, childregs->gp is never used since
    the kernel gp is not touched by switch_to. For a user mode helper, the
    gp value can be observed in user space after execve or possibly by other
    means.
    
    [From the email thread]
    
    The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
    for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
    when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
    PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.
    
    childregs is the *user* context during syscall execution and it is observable
    from userspace in at least five ways:
    
    1. kernel_execve does not currently clear integer registers, so the starting
       register state for PID 1 and other user processes started by the kernel has
       sp = user stack, gp = kernel __global_pointer$, all other integer registers
       zeroed by the memset in the patch comment.
    
       This is a bug in its own right, but I'm unwilling to bet that it is the only
       way to exploit the issue addressed by this patch.
    
    2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
       before it execs, but ptrace requires SIGSTOP to be delivered which can only
       happen at user/kernel boundaries.
    
    3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
       user_mode_helpers before the exec completes, but gp is not one of the
       registers it returns.
    
    4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
       addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
       are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
       LOCKDOWN_PERF. I have not attempted to write exploit code.
    
    5. Much of the tracing infrastructure allows access to user registers. I have
       not attempted to determine which forms of tracing allow access to user
       registers without already allowing access to kernel registers.
    
    Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
    Cc: stable@vger.kernel.org
    Signed-off-by: Stefan O'Rear <sorear@fastmail.com>
    Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/r/20240327061258.2370291-1-sorear@fastmail.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/bpf: Fix bpf_plt pointer arithmetic [+ + +]

Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date:   Wed Mar 20 02:54:12 2024 +0100

    s390/bpf: Fix bpf_plt pointer arithmetic
    
    [ Upstream commit 7ded842b356d151ece8ac4985940438e6d3998bb ]
    
    Kui-Feng Lee reported a crash on s390x triggered by the
    dummy_st_ops/dummy_init_ptr_arg test [1]:
    
      [<0000000000000002>] 0x2
      [<00000000009d5cde>] bpf_struct_ops_test_run+0x156/0x250
      [<000000000033145a>] __sys_bpf+0xa1a/0xd00
      [<00000000003319dc>] __s390x_sys_bpf+0x44/0x50
      [<0000000000c4382c>] __do_syscall+0x244/0x300
      [<0000000000c59a40>] system_call+0x70/0x98
    
    This is caused by GCC moving memcpy() after assignments in
    bpf_jit_plt(), resulting in NULL pointers being written instead of
    the return and the target addresses.
    
    Looking at the GCC internals, the reordering is allowed because the
    alias analysis thinks that the memcpy() destination and the assignments'
    left-hand-sides are based on different objects: new_plt and
    bpf_plt_ret/bpf_plt_target respectively, and therefore they cannot
    alias.
    
    This is in turn due to a violation of the C standard:
    
      When two pointers are subtracted, both shall point to elements of the
      same array object, or one past the last element of the array object
      ...
    
    From the C's perspective, bpf_plt_ret and bpf_plt are distinct objects
    and cannot be subtracted. In the practical terms, doing so confuses the
    GCC's alias analysis.
    
    The code was written this way in order to let the C side know a few
    offsets defined in the assembly. While nice, this is by no means
    necessary. Fix the noncompliance by hardcoding these offsets.
    
    [1] https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d34c@gmail.com/
    
    Fixes: f1d5df84cd8c ("s390/bpf: Implement bpf_arch_text_poke()")
    Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Message-ID: <20240320015515.11883-1-iii@linux.ibm.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

s390/entry: align system call table on 8 bytes [+ + +]

Author: Sumanth Korikkar <sumanthk@linux.ibm.com>
Date:   Tue Mar 26 18:12:13 2024 +0100

    s390/entry: align system call table on 8 bytes
    
    commit 378ca2d2ad410a1cd5690d06b46c5e2297f4c8c0 upstream.
    
    Align system call table on 8 bytes. With sys_call_table entry size
    of 8 bytes that eliminates the possibility of a system call pointer
    crossing cache line boundary.
    
    Cc: stable@kernel.org
    Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
    Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
    Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
    Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

s390/qeth: handle deferred cc1 [+ + +]

Author: Alexandra Winter <wintera@linux.ibm.com>
Date:   Thu Mar 21 12:53:37 2024 +0100

    s390/qeth: handle deferred cc1
    
    [ Upstream commit afb373ff3f54c9d909efc7f810dc80a9742807b2 ]
    
    The IO subsystem expects a driver to retry a ccw_device_start, when the
    subsequent interrupt response block (irb) contains a deferred
    condition code 1.
    
    Symptoms before this commit:
    On the read channel we always trigger the next read anyhow, so no
    different behaviour here.
    On the write channel we may experience timeout errors, because the
    expected reply will never be received without the retry.
    Other callers of qeth_send_control_data() may wrongly assume that the ccw
    was successful, which may cause problems later.
    
    Note that since
    commit 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")
    and
    commit 5ef1dc40ffa6 ("s390/cio: fix invalid -EBUSY on ccw_device_start")
    deferred CC1s are much more likely to occur. See the commit message of the
    latter for more background information.
    
    Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")
    Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
    Co-developed-by: Thorsten Winkler <twinkler@linux.ibm.com>
    Signed-off-by: Thorsten Winkler <twinkler@linux.ibm.com>
    Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
    Link: https://lore.kernel.org/r/20240321115337.3564694-1-wintera@linux.ibm.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scripts/bpf_doc: Use silent mode when exec make cmd [+ + +]

Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Fri Mar 15 10:34:43 2024 +0800

    scripts/bpf_doc: Use silent mode when exec make cmd
    
    [ Upstream commit 5384cc0d1a88c27448a6a4e65b8abe6486de8012 ]
    
    When getting kernel version via make, the result may be polluted by other
    output, like directory change info. e.g.
    
      $ export MAKEFLAGS="-w"
      $ make kernelversion
      make: Entering directory '/home/net'
      6.8.0
      make: Leaving directory '/home/net'
    
    This will distort the reStructuredText output and make latter rst2man
    failed like:
    
      [...]
      bpf-helpers.rst:20: (WARNING/2) Field list ends without a blank line; unexpected unindent.
      [...]
    
    Using silent mode would help. e.g.
    
      $ make -s --no-print-directory kernelversion
      6.8.0
    
    Fixes: fd0a38f9c37d ("scripts/bpf: Set version attribute for bpf-helpers(7) man page")
    Signed-off-by: Michael Hofmann <mhofmann@redhat.com>
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Quentin Monnet <qmo@kernel.org>
    Acked-by: Alejandro Colomar <alx@kernel.org>
    Link: https://lore.kernel.org/bpf/20240315023443.2364442-1-liuhangbin@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: mylex: Fix sysfs buffer lengths [+ + +]

Author: Arnd Bergmann <arnd@arndb.de>
Date:   Tue Mar 26 23:38:06 2024 +0100

    scsi: mylex: Fix sysfs buffer lengths
    
    [ Upstream commit 1197c5b2099f716b3de327437fb50900a0b936c9 ]
    
    The myrb and myrs drivers use an odd way of implementing their sysfs files,
    calling snprintf() with a fixed length of 32 bytes to print into a page
    sized buffer. One of the strings is actually longer than 32 bytes, which
    clang can warn about:
    
    drivers/scsi/myrb.c:1906:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation]
    drivers/scsi/myrs.c:1089:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation]
    
    These could all be plain sprintf() without a length as the buffer is always
    long enough. On the other hand, sysfs files should not be overly long
    either, so just double the length to make sure the longest strings don't
    get truncated here.
    
    Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller (SCSI interface)")
    Fixes: 081ff398c56c ("scsi: myrb: Add Mylex RAID controller (block interface)")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Link: https://lore.kernel.org/r/20240326223825.4084412-8-arnd@kernel.org
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: sd: Unregister device if device_add_disk() failed in sd_probe() [+ + +]

Author: Li Nan <linan122@huawei.com>
Date:   Fri Dec 8 16:23:35 2023 +0800

    scsi: sd: Unregister device if device_add_disk() failed in sd_probe()
    
    [ Upstream commit 0296bea01cfa6526be6bd2d16dc83b4e7f1af91f ]
    
    "if device_add() succeeds, you should call device_del() when you want to
    get rid of it."
    
    In sd_probe(), device_add_disk() fails when device_add() has already
    succeeded, so change put_device() to device_unregister() to ensure device
    resources are released.
    
    Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
    Signed-off-by: Li Nan <linan122@huawei.com>
    Link: https://lore.kernel.org/r/20231208082335.1754205-1-linan666@huaweicloud.com
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Reviewed-by: Yu Kuai <yukuai3@huawei.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests/mm: include strings.h for ffsl [+ + +]

Author: Edward Liaw <edliaw@google.com>
Date:   Fri Mar 29 18:58:10 2024 +0000

    selftests/mm: include strings.h for ffsl
    
    commit 176517c9310281d00dd3210ab4cc4d3cdc26b17e upstream.
    
    Got a compilation error on Android for ffsl after 91b80cc5b39f
    ("selftests: mm: fix map_hugetlb failure on 64K page size systems")
    included vm_util.h.
    
    Link: https://lkml.kernel.org/r/20240329185814.16304-1-edliaw@google.com
    Fixes: af605d26a8f2 ("selftests/mm: merge util.h into vm_util.h")
    Signed-off-by: Edward Liaw <edliaw@google.com>
    Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
    Cc: Axel Rasmussen <axelrasmussen@google.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: connect: fix shellcheck warnings [+ + +]

Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date:   Wed Mar 6 10:42:57 2024 +0100

    selftests: mptcp: connect: fix shellcheck warnings
    
    commit e3aae1098f109f0bd33c971deff1926f4e4441d0 upstream.
    
    shellcheck recently helped to prevent issues. It is then good to fix the
    other harmless issues in order to spot "real" ones later.
    
    Here, two categories of warnings are now ignored:
    
    - SC2317: Command appears to be unreachable. The cleanup() function is
      invoked indirectly via the EXIT trap.
    
    - SC2086: Double quote to prevent globbing and word splitting. This is
      recommended, but the current usage is correct and there is no need to
      do all these modifications to be compliant with this rule.
    
    For the modifications:
    
      - SC2034: ksft_skip appears unused.
      - SC2181: Check exit code directly with e.g. 'if mycmd;', not
                indirectly with $?.
      - SC2004: $/${} is unnecessary on arithmetic variables.
      - SC2155: Declare and assign separately to avoid masking return
                values.
      - SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
      - SC2059: Don't use variables in the printf format string. Use printf
                '..%s..' "$foo".
    
    Now this script is shellcheck (0.9.0) compliant. We can easily spot new
    issues.
    
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240306-upstream-net-next-20240304-selftests-mptcp-shared-code-shellcheck-v2-8-bc79e6e5e6a0@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: mptcp: join: fix dev in check_endpoint [+ + +]

Author: Geliang Tang <tanggeliang@kylinos.cn>
Date:   Fri Mar 29 13:08:53 2024 +0100

    selftests: mptcp: join: fix dev in check_endpoint
    
    commit 40061817d95bce6dd5634a61a65cd5922e6ccc92 upstream.
    
    There's a bug in pm_nl_check_endpoint(), 'dev' didn't be parsed correctly.
    If calling it in the 2nd test of endpoint_tests() too, it fails with an
    error like this:
    
     creation  [FAIL] expected '10.0.2.2 id 2 subflow dev dev' \
                         found '10.0.2.2 id 2 subflow dev ns2eth2'
    
    The reason is '$2' should be set to 'dev', not '$1'. This patch fixes it.
    
    Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case")
    Cc: stable@vger.kernel.org
    Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20240329-upstream-net-20240329-fallback-mib-v1-2-324a8981da48@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: net: gro fwd: update vxlan GRO test expectations [+ + +]

Author: Antoine Tenart <atenart@kernel.org>
Date:   Tue Mar 26 12:34:02 2024 +0100

    selftests: net: gro fwd: update vxlan GRO test expectations
    
    commit 0fb101be97ca27850c5ecdbd1269423ce4d1f607 upstream.
    
    UDP tunnel packets can't be GRO in-between their endpoints as this
    causes different issues. The UDP GRO fwd vxlan tests were relying on
    this and their expectations have to be fixed.
    
    We keep both vxlan tests and expected no GRO from happening. The vxlan
    UDP GRO bench test was removed as it's not providing any valuable
    information now.
    
    Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests")
    Signed-off-by: Antoine Tenart <atenart@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: reuseaddr_conflict: add missing new line at the end of the output [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri Mar 29 09:05:59 2024 -0700

    selftests: reuseaddr_conflict: add missing new line at the end of the output
    
    commit 31974122cfdeaf56abc18d8ab740d580d9833e90 upstream.
    
    The netdev CI runs in a VM and captures serial, so stdout and
    stderr get combined. Because there's a missing new line in
    stderr the test ends up corrupting KTAP:
    
      # Successok 1 selftests: net: reuseaddr_conflict
    
    which should have been:
    
      # Success
      ok 1 selftests: net: reuseaddr_conflict
    
    Fixes: 422d8dc6fd3a ("selftest: add a reuseaddr test")
    Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
    Link: https://lore.kernel.org/r/20240329160559.249476-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

selftests: vxlan_mdb: Fix failures with old libnet [+ + +]

Author: Ido Schimmel <idosch@nvidia.com>
Date:   Mon Mar 25 09:50:30 2024 +0200

    selftests: vxlan_mdb: Fix failures with old libnet
    
    [ Upstream commit f1425529c33def8b46faae4400dd9e2bbaf16a05 ]
    
    Locally generated IP multicast packets (such as the ones used in the
    test) do not perform routing and simply egress the bound device.
    
    However, as explained in commit 8bcfb4ae4d97 ("selftests: forwarding:
    Fix failing tests with old libnet"), old versions of libnet (used by
    mausezahn) do not use the "SO_BINDTODEVICE" socket option. Specifically,
    the library started using the option for IPv6 sockets in version 1.1.6
    and for IPv4 sockets in version 1.2. This explains why on Ubuntu - which
    uses version 1.1.6 - the IPv4 overlay tests are failing whereas the IPv6
    ones are passing.
    
    Fix by specifying the source and destination MAC of the packets which
    will cause mausezahn to use a packet socket instead of an IP socket.
    
    Fixes: 62199e3f1658 ("selftests: net: Add VXLAN MDB test")
    Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
    Closes: https://lore.kernel.org/netdev/5bb50349-196d-4892-8ed2-f37543aa863f@alu.unizg.hr/
    Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Link: https://lore.kernel.org/r/20240325075030.2379513-1-idosch@nvidia.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selinux: avoid dereference of garbage after mount failure [+ + +]

Author: Christian Gц╤ttsche <cgzones@googlemail.com>
Date:   Thu Mar 28 20:16:58 2024 +0100

    selinux: avoid dereference of garbage after mount failure
    
    commit 37801a36b4d68892ce807264f784d818f8d0d39b upstream.
    
    In case kern_mount() fails and returns an error pointer return in the
    error branch instead of continuing and dereferencing the error pointer.
    
    While on it drop the never read static variable selinuxfs_mount.
    
    Cc: stable@vger.kernel.org
    Fixes: 0619f0f5e36f ("selinux: wrap selinuxfs state")
    Signed-off-by: Christian Gц╤ttsche <cgzones@googlemail.com>
    Signed-off-by: Paul Moore <paul@paul-moore.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb3: retrying on failed server close [+ + +]

Author: Ritvik Budhiraja <rbudhiraja@microsoft.com>
Date:   Tue Apr 2 14:01:28 2024 -0500

    smb3: retrying on failed server close
    
    commit 173217bd73365867378b5e75a86f0049e1069ee8 upstream.
    
    In the current implementation, CIFS close sends a close to the
    server and does not check for the success of the server close.
    This patch adds functionality to check for server close return
    status and retries in case of an EBUSY or EAGAIN error.
    
    This can help avoid handle leaks
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Ritvik Budhiraja <rbudhiraja@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in cifs_debug_files_proc_show() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:33:53 2024 -0300

    smb: client: fix potential UAF in cifs_debug_files_proc_show()
    
    commit ca545b7f0823f19db0f1148d59bc5e1a56634502 upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in cifs_dump_full_key() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:33:54 2024 -0300

    smb: client: fix potential UAF in cifs_dump_full_key()
    
    commit 58acd1f497162e7d282077f816faa519487be045 upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in cifs_signal_cifsd_for_reconnect() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:34:04 2024 -0300

    smb: client: fix potential UAF in cifs_signal_cifsd_for_reconnect()
    
    commit e0e50401cc3921c9eaf1b0e667db174519ea939f upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in cifs_stats_proc_show() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:33:56 2024 -0300

    smb: client: fix potential UAF in cifs_stats_proc_show()
    
    commit 0865ffefea197b437ba78b5dd8d8e256253efd65 upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in cifs_stats_proc_write() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:33:55 2024 -0300

    smb: client: fix potential UAF in cifs_stats_proc_write()
    
    commit d3da25c5ac84430f89875ca7485a3828150a7e0a upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in is_valid_oplock_break() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:34:00 2024 -0300

    smb: client: fix potential UAF in is_valid_oplock_break()
    
    commit 69ccf040acddf33a3a85ec0f6b45ef84b0f7ec29 upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in smb2_is_network_name_deleted() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:34:02 2024 -0300

    smb: client: fix potential UAF in smb2_is_network_name_deleted()
    
    commit 63981561ffd2d4987807df4126f96a11e18b0c1d upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in smb2_is_valid_lease_break() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:33:58 2024 -0300

    smb: client: fix potential UAF in smb2_is_valid_lease_break()
    
    commit 705c76fbf726c7a2f6ff9143d4013b18daaaebf1 upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: fix potential UAF in smb2_is_valid_oplock_break() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Tue Apr 2 16:33:59 2024 -0300

    smb: client: fix potential UAF in smb2_is_valid_oplock_break()
    
    commit 22863485a4626ec6ecf297f4cc0aef709bc862e4 upstream.
    
    Skip sessions that are being teared down (status == SES_EXITING) to
    avoid UAF.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: handle DFS tcons in cifs_construct_tcon() [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Mon Apr 1 22:44:08 2024 -0300

    smb: client: handle DFS tcons in cifs_construct_tcon()
    
    commit 4a5ba0e0bfe552ac7451f57e304f6343c3d87f89 upstream.
    
    The tcons created by cifs_construct_tcon() on multiuser mounts must
    also be able to failover and refresh DFS referrals, so set the
    appropriate fields in order to get a full DFS tcon.  They could be
    shared among different superblocks later, too.
    
    Cc: stable@vger.kernel.org # 6.4+
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202404021518.3Xu2VU4s-lkp@intel.com/
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

smb: client: serialise cifs_construct_tcon() with cifs_mount_mutex [+ + +]

Author: Paulo Alcantara <pc@manguebit.com>
Date:   Mon Apr 1 22:44:09 2024 -0300

    smb: client: serialise cifs_construct_tcon() with cifs_mount_mutex
    
    commit 93cee45ccfebc62a3bb4cd622b89e00c8c7d8493 upstream.
    
    Serialise cifs_construct_tcon() with cifs_mount_mutex to handle
    parallel mounts that may end up reusing the session and tcon created
    by it.
    
    Cc: stable@vger.kernel.org # 6.4+
    Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

spi: mchp-pci1xxx: Fix a possible null pointer dereference in pci1xxx_spi_probe [+ + +]

Author: Huai-Yuan Liu <qq810974084@gmail.com>
Date:   Wed Apr 3 09:42:21 2024 +0800

    spi: mchp-pci1xxx: Fix a possible null pointer dereference in pci1xxx_spi_probe
    
    [ Upstream commit 1f886a7bfb3faf4c1021e73f045538008ce7634e ]
    
    In function pci1xxxx_spi_probe, there is a potential null pointer that
    may be caused by a failed memory allocation by the function devm_kzalloc.
    Hence, a null pointer check needs to be added to prevent null pointer
    dereferencing later in the code.
    
    To fix this issue, spi_bus->spi_int[iter] should be checked. The memory
    allocated by devm_kzalloc will be automatically released, so just directly
    return -ENOMEM without worrying about memory leaks.
    
    Fixes: 1cc0cbea7167 ("spi: microchip: pci1xxxx: Add driver for SPI controller of PCI1XXXX PCIe switch")
    Signed-off-by: Huai-Yuan Liu <qq810974084@gmail.com>
    Link: https://msgid.link/r/20240403014221.969801-1-qq810974084@gmail.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: allow full FIFO masks [+ + +]

Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Fri Feb 16 07:05:46 2024 +0000

    spi: s3c64xx: allow full FIFO masks
    
    [ Upstream commit d6911cf27e5c8491cbfedd4ae2d1ee74a3e685b4 ]
    
    The driver is wrong because is using partial register field masks for the
    SPI_STATUS.{RX, TX}_FIFO_LVL register fields.
    
    We see s3c64xx_spi_port_config.fifo_lvl_mask with different values for
    different instances of the same IP. Take s5pv210_spi_port_config for
    example, it defines:
            .fifo_lvl_mask  = { 0x1ff, 0x7F },
    
    fifo_lvl_mask is used to determine the FIFO depth of the instance of the
    IP. In this case, the integrator uses a 256 bytes FIFO for the first SPI
    instance of the IP, and a 64 bytes FIFO for the second instance. While
    the first mask reflects the SPI_STATUS.{RX, TX}_FIFO_LVL register
    fields, the second one is two bits short. Using partial field masks is
    misleading and can hide problems of the driver's logic.
    
    Allow platforms to specify the full FIFO mask, regardless of the FIFO
    depth.
    
    Introduce {rx, tx}_fifomask to represent the SPI_STATUS.{RX, TX}_FIFO_LVL
    register fields. It's a shifted mask defining the field's length and
    position. We'll be able to deprecate the use of @rx_lvl_offset, as the
    shift value can be determined from the mask. The existing compatibles
    shall start using {rx, tx}_fifomask so that they use the full field mask
    and to avoid shifting the mask to position, and then shifting it back to
    zero in the {TX, RX}_FIFO_LVL macros.
    
    @rx_lvl_offset will be deprecated in a further patch, after we have the
    infrastructure to deprecate @fifo_lvl_mask as well.
    
    No functional change intended.
    
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://msgid.link/r/20240216070555.2483977-4-tudor.ambarus@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: define a magic value [+ + +]

Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Fri Feb 16 07:05:45 2024 +0000

    spi: s3c64xx: define a magic value
    
    [ Upstream commit ff8faa8a5c0f4c2da797cd22a163ee3cc8823b13 ]
    
    Define a magic value, it will be used in the next patch as well.
    
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://msgid.link/r/20240216070555.2483977-3-tudor.ambarus@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: determine the fifo depth only once [+ + +]

Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Fri Feb 16 07:05:47 2024 +0000

    spi: s3c64xx: determine the fifo depth only once
    
    [ Upstream commit c6e776ab6abdfce5a1edcde7a22c639e76499939 ]
    
    Determine the FIFO depth only once, at probe time.
    ``sdd->fifo_depth`` can be set later on with the FIFO depth
    specified in the device tree.
    
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://msgid.link/r/20240216070555.2483977-5-tudor.ambarus@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: explicitly include [+ + +]

Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Wed Feb 7 12:04:17 2024 +0000

    spi: s3c64xx: explicitly include <linux/bits.h>
    
    [ Upstream commit 4568fa574fcef3811a8140702979f076ef0f5bc0 ]
    
    The driver uses GENMASK() but does not include <linux/bits.h>.
    
    It is good practice to directly include all headers used, it avoids
    implicit dependencies and spurious breakage if someone rearranges
    headers and causes the implicit include to vanish.
    
    Include the missing header.
    
    Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://lore.kernel.org/r/20240207120431.2766269-4-tudor.ambarus@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: Extract FIFO depth calculation to a dedicated macro [+ + +]

Author: Sam Protsenko <semen.protsenko@linaro.org>
Date:   Sat Jan 20 11:00:01 2024 -0600

    spi: s3c64xx: Extract FIFO depth calculation to a dedicated macro
    
    [ Upstream commit 460efee706c2b6a4daba62ec143fea29c2e7b358 ]
    
    Simplify the code by extracting all cases of FIFO depth calculation into
    a dedicated macro. No functional change.
    
    Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
    Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
    Link: https://msgid.link/r/20240120170001.3356-1-semen.protsenko@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: remove else after return [+ + +]

Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Wed Feb 7 12:04:22 2024 +0000

    spi: s3c64xx: remove else after return
    
    [ Upstream commit 9d47e411f4d636519a8d26587928d34cf52c0c1f ]
    
    Else case is not needed after a return, remove it.
    
    Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
    Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://lore.kernel.org/r/20240207120431.2766269-9-tudor.ambarus@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: sort headers alphabetically [+ + +]

Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Wed Feb 7 12:04:15 2024 +0000

    spi: s3c64xx: sort headers alphabetically
    
    [ Upstream commit a77ce80f63f06d7ae933c332ed77c79136fa69b0 ]
    
    Sorting headers alphabetically helps locating duplicates,
    and makes it easier to figure out where to insert new headers.
    
    Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
    Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://lore.kernel.org/r/20240207120431.2766269-2-tudor.ambarus@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: a3d3eab627bb ("spi: s3c64xx: Use DMA mode from fifo size")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: s3c64xx: Use DMA mode from fifo size [+ + +]

Author: Jaewon Kim <jaewon02.kim@samsung.com>
Date:   Fri Mar 29 17:58:40 2024 +0900

    spi: s3c64xx: Use DMA mode from fifo size
    
    [ Upstream commit a3d3eab627bbbb0cb175910cf8d0f7022628a642 ]
    
    If the SPI data size is smaller than FIFO, it operates in PIO mode,
    and if it is larger than FIFO size, it oerates in DMA mode.
    
    If the SPI data size is equal to fifo, it operates in PIO mode and it is
    separated to 2 transfers. To prevent it, it must operate in DMA mode
    from the case where the data size and the fifo size are the same.
    
    Fixes: 1ee806718d5e ("spi: s3c64xx: support interrupt based pio mode")
    Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com>
    Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
    Link: https://lore.kernel.org/r/20240329085840.65856-1-jaewon02.kim@samsung.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

SUNRPC: Fix a slow server-side memory leak with RPC-over-TCP [+ + +]

Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Wed Apr 3 10:36:25 2024 -0400

    SUNRPC: Fix a slow server-side memory leak with RPC-over-TCP
    
    [ Upstream commit 05258a0a69b3c5d2c003f818702c0a52b6fea861 ]
    
    Jan Schunk reports that his small NFS servers suffer from memory
    exhaustion after just a few days. A bisect shows that commit
    e18e157bb5c8 ("SUNRPC: Send RPC message on TCP with a single
    sock_sendmsg() call") is the first bad commit.
    
    That commit assumed that sock_sendmsg() releases all the pages in
    the underlying bio_vec array, but the reality is that it doesn't.
    svc_xprt_release() releases the rqst's response pages, but the
    record marker page fragment isn't one of those, so it is never
    released.
    
    This is a narrow fix that can be applied to stable kernels. A
    more extensive fix is in the works.
    
    Reported-by: Jan Schunk <scpcom@gmx.de>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218671
    Fixes: e18e157bb5c8 ("SUNRPC: Send RPC message on TCP with a single sock_sendmsg() call")
    Cc: Alexander Duyck <alexander.duyck@gmail.com>
    Cc: Jakub Kacinski <kuba@kernel.org>
    Cc: David Howells <dhowells@redhat.com>
    Reviewed-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tcp: Fix bind() regression for v6-only wildcard and v4(-mapped-v6) non-wildcard addresses. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Tue Mar 26 13:42:45 2024 -0700

    tcp: Fix bind() regression for v6-only wildcard and v4(-mapped-v6) non-wildcard addresses.
    
    commit d91ef1e1b55f730bee8ce286b02b7bdccbc42973 upstream.
    
    Jianguo Wu reported another bind() regression introduced by bhash2.
    
    Calling bind() for the following 3 addresses on the same port, the
    3rd one should fail but now succeeds.
    
      1. 0.0.0.0 or ::ffff:0.0.0.0
      2. [::] w/ IPV6_V6ONLY
      3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address
    
    The first two bind() create tb2 like this:
    
      bhash2 -> tb2(:: w/ IPV6_V6ONLY) -> tb2(0.0.0.0)
    
    The 3rd bind() will match with the IPv6 only wildcard address bucket
    in inet_bind2_bucket_match_addr_any(), however, no conflicting socket
    exists in the bucket.  So, inet_bhash2_conflict() will returns false,
    and thus, inet_bhash2_addr_any_conflict() returns false consequently.
    
    As a result, the 3rd bind() bypasses conflict check, which should be
    done against the IPv4 wildcard address bucket.
    
    So, in inet_bhash2_addr_any_conflict(), we must iterate over all buckets.
    
    Note that we cannot add ipv6_only flag for inet_bind2_bucket as it
    would confuse the following patetrn.
    
      1. [::] w/ SO_REUSE{ADDR,PORT} and IPV6_V6ONLY
      2. [::] w/ SO_REUSE{ADDR,PORT}
      3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address
    
    The first bind() would create a bucket with ipv6_only flag true,
    the second bind() would add the [::] socket into the same bucket,
    and the third bind() could succeed based on the wrong assumption
    that ipv6_only bucket would not conflict with v4(-mapped-v6) address.
    
    Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
    Diagnosed-by: Jianguo Wu <wujianguo106@163.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/r/20240326204251.51301-3-kuniyu@amazon.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tcp: properly terminate timers for kernel sockets [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Mar 22 13:57:32 2024 +0000

    tcp: properly terminate timers for kernel sockets
    
    [ Upstream commit 151c9c724d05d5b0dd8acd3e11cb69ef1f2dbada ]
    
    We had various syzbot reports about tcp timers firing after
    the corresponding netns has been dismantled.
    
    Fortunately Josef Bacik could trigger the issue more often,
    and could test a patch I wrote two years ago.
    
    When TCP sockets are closed, we call inet_csk_clear_xmit_timers()
    to 'stop' the timers.
    
    inet_csk_clear_xmit_timers() can be called from any context,
    including when socket lock is held.
    This is the reason it uses sk_stop_timer(), aka del_timer().
    This means that ongoing timers might finish much later.
    
    For user sockets, this is fine because each running timer
    holds a reference on the socket, and the user socket holds
    a reference on the netns.
    
    For kernel sockets, we risk that the netns is freed before
    timer can complete, because kernel sockets do not hold
    reference on the netns.
    
    This patch adds inet_csk_clear_xmit_timers_sync() function
    that using sk_stop_timer_sync() to make sure all timers
    are terminated before the kernel socket is released.
    Modules using kernel sockets close them in their netns exit()
    handler.
    
    Also add sock_not_owned_by_me() helper to get LOCKDEP
    support : inet_csk_clear_xmit_timers_sync() must not be called
    while socket lock is held.
    
    It is very possible we can revert in the future commit
    3a58f13a881e ("net: rds: acquire refcount on TCP sockets")
    which attempted to solve the issue in rds only.
    (net/smc/af_smc.c and net/mptcp/subflow.c have similar code)
    
    We probably can remove the check_net() tests from
    tcp_out_of_resources() and __tcp_close() in the future.
    
    Reported-by: Josef Bacik <josef@toxicpanda.com>
    Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/
    Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
    Fixes: 8a68173691f0 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket")
    Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Tested-by: Josef Bacik <josef@toxicpanda.com>
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Link: https://lore.kernel.org/r/20240322135732.1535772-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: adjust recv return with async crypto and failed copy to userspace [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Mar 25 16:56:46 2024 +0100

    tls: adjust recv return with async crypto and failed copy to userspace
    
    [ Upstream commit 85eef9a41d019b59be7bc91793f26251909c0710 ]
    
    process_rx_list may not copy as many bytes as we want to the userspace
    buffer, for example in case we hit an EFAULT during the copy. If this
    happens, we should only count the bytes that were actually copied,
    which may be 0.
    
    Subtracting async_copy_bytes is correct in both peek and !peek cases,
    because decrypted == async_copy_bytes + peeked for the peek case: peek
    is always !ZC, and we can go through either the sync or async path. In
    the async case, we add chunk to both decrypted and
    async_copy_bytes. In the sync case, we add chunk to both decrypted and
    peeked. I missed that in commit 6caaf104423d ("tls: fix peeking with
    sync+async decryption").
    
    Fixes: 4d42cd6bc2ac ("tls: rx: fix return value for async crypto")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/1b5a1eaab3c088a9dd5d9f1059ceecd7afe888d1.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: get psock ref after taking rxlock to avoid leak [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Mar 25 16:56:48 2024 +0100

    tls: get psock ref after taking rxlock to avoid leak
    
    [ Upstream commit 417e91e856099e9b8a42a2520e2255e6afe024be ]
    
    At the start of tls_sw_recvmsg, we take a reference on the psock, and
    then call tls_rx_reader_lock. If that fails, we return directly
    without releasing the reference.
    
    Instead of adding a new label, just take the reference after locking
    has succeeded, since we don't need it before.
    
    Fixes: 4cbc325ed6b4 ("tls: rx: allow only one reader at a time")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/fe2ade22d030051ce4c3638704ed58b67d0df643.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: recv: process_rx_list shouldn't use an offset with kvec [+ + +]

Author: Sabrina Dubroca <sd@queasysnail.net>
Date:   Mon Mar 25 16:56:45 2024 +0100

    tls: recv: process_rx_list shouldn't use an offset with kvec
    
    [ Upstream commit 7608a971fdeb4c3eefa522d1bfe8d4bc6b2481cc ]
    
    Only MSG_PEEK needs to copy from an offset during the final
    process_rx_list call, because the bytes we copied at the beginning of
    tls_sw_recvmsg were left on the rx_list. In the KVEC case, we removed
    data from the rx_list as we were copying it, so there's no need to use
    an offset, just like in the normal case.
    
    Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records")
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/e5487514f828e0347d2b92ca40002c62b58af73d.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tools: ynl: fix setting presence bits in simple nests [+ + +]

Author: Jakub Kicinski <kuba@kernel.org>
Date:   Wed Mar 20 19:02:14 2024 -0700

    tools: ynl: fix setting presence bits in simple nests
    
    [ Upstream commit f6c8f5e8694c7a78c94e408b628afa6255cc428a ]
    
    When we set members of simple nested structures in requests
    we need to set "presence" bits for all the nesting layers
    below. This has nothing to do with the presence type of
    the last layer.
    
    Fixes: be5bea1cc0bf ("net: add basic C code generators for Netlink")
    Reviewed-by: Breno Leitao <leitao@debian.org>
    Link: https://lore.kernel.org/r/20240321020214.1250202-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: do not accept non-tunnel GSO skbs landing in a tunnel [+ + +]

Author: Antoine Tenart <atenart@kernel.org>
Date:   Tue Mar 26 12:33:58 2024 +0100

    udp: do not accept non-tunnel GSO skbs landing in a tunnel
    
    commit 3d010c8031e39f5fa1e8b13ada77e0321091011f upstream.
    
    When rx-udp-gro-forwarding is enabled UDP packets might be GROed when
    being forwarded. If such packets might land in a tunnel this can cause
    various issues and udp_gro_receive makes sure this isn't the case by
    looking for a matching socket. This is performed in
    udp4/6_gro_lookup_skb but only in the current netns. This is an issue
    with tunneled packets when the endpoint is in another netns. In such
    cases the packets will be GROed at the UDP level, which leads to various
    issues later on. The same thing can happen with rx-gro-list.
    
    We saw this with geneve packets being GROed at the UDP level. In such
    case gso_size is set; later the packet goes through the geneve rx path,
    the geneve header is pulled, the offset are adjusted and frag_list skbs
    are not adjusted with regard to geneve. When those skbs hit
    skb_fragment, it will misbehave. Different outcomes are possible
    depending on what the GROed skbs look like; from corrupted packets to
    kernel crashes.
    
    One example is a BUG_ON[1] triggered in skb_segment while processing the
    frag_list. Because gso_size is wrong (geneve header was pulled)
    skb_segment thinks there is "geneve header size" of data in frag_list,
    although it's in fact the next packet. The BUG_ON itself has nothing to
    do with the issue. This is only one of the potential issues.
    
    Looking up for a matching socket in udp_gro_receive is fragile: the
    lookup could be extended to all netns (not speaking about performances)
    but nothing prevents those packets from being modified in between and we
    could still not find a matching socket. It's OK to keep the current
    logic there as it should cover most cases but we also need to make sure
    we handle tunnel packets being GROed too early.
    
    This is done by extending the checks in udp_unexpected_gso: GSO packets
    lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must
    be segmented.
    
    [1] kernel BUG at net/core/skbuff.c:4408!
        RIP: 0010:skb_segment+0xd2a/0xf70
        __udp_gso_segment+0xaa/0x560
    
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
    Signed-off-by: Antoine Tenart <atenart@kernel.org>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

udp: do not transition UDP GRO fraglist partial checksums to unnecessary [+ + +]

Author: Antoine Tenart <atenart@kernel.org>
Date:   Tue Mar 26 12:34:00 2024 +0100

    udp: do not transition UDP GRO fraglist partial checksums to unnecessary
    
    commit f0b8c30345565344df2e33a8417a27503589247d upstream.
    
    UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
    are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
    this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
    an egress path and then their partial checksums are not fixed.
    
    Different issues can be observed, from invalid checksum on packets to
    traces like:
    
      gen01: hw csum failure
      skb len=3008 headroom=160 headlen=1376 tailroom=0
      mac=(106,14) net=(120,40) trans=160
      shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
      csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
      hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
      ...
    
    Fix this by only converting CHECKSUM_NONE packets to
    CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary. All
    other checksum types are kept as-is, including CHECKSUM_COMPLETE as
    fraglist packets being segmented back would have their skb->csum valid.
    
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Antoine Tenart <atenart@kernel.org>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

udp: prevent local UDP tunnel packets from being GROed [+ + +]

Author: Antoine Tenart <atenart@kernel.org>
Date:   Tue Mar 26 12:34:01 2024 +0100

    udp: prevent local UDP tunnel packets from being GROed
    
    commit 64235eabc4b5b18c507c08a1f16cdac6c5661220 upstream.
    
    GRO has a fundamental issue with UDP tunnel packets as it can't detect
    those in a foolproof way and GRO could happen before they reach the
    tunnel endpoint. Previous commits have fixed issues when UDP tunnel
    packets come from a remote host, but if those packets are issued locally
    they could run into checksum issues.
    
    If the inner packet has a partial checksum the information will be lost
    in the GRO logic, either in udp4/6_gro_complete or in
    udp_gro_complete_segment and packets will have an invalid checksum when
    leaving the host.
    
    Prevent local UDP tunnel packets from ever being GROed at the outer UDP
    level.
    
    Due to skb->encapsulation being wrongly used in some drivers this is
    actually only preventing UDP tunnel packets with a partial checksum to
    be GROed (see iptunnel_handle_offloads) but those were also the packets
    triggering issues so in practice this should be sufficient.
    
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
    Suggested-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Antoine Tenart <atenart@kernel.org>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: ucsi: Fix race between typec_switch and role_switch [+ + +]

Author: Krishna Kurapati <quic_kriskura@quicinc.com>
Date:   Fri Mar 1 09:39:14 2024 +0530

    usb: typec: ucsi: Fix race between typec_switch and role_switch
    
    [ Upstream commit f5e9bda03aa50ffad36eccafe893d004ef213c43 ]
    
    When orientation switch is enabled in ucsi glink, there is a xhci
    probe failure seen when booting up in host mode in reverse
    orientation.
    
    During bootup the following things happen in multiple drivers:
    
    a) DWC3 controller driver initializes the core in device mode when the
    dr_mode is set to DRD. It relies on role_switch call to change role to
    host.
    
    b) QMP driver initializes the lanes to TYPEC_ORIENTATION_NORMAL as a
    normal routine. It relies on the typec_switch_set call to get notified
    of orientation changes.
    
    c) UCSI core reads the UCSI_GET_CONNECTOR_STATUS via the glink and
    provides initial role switch to dwc3 controller.
    
    When booting up in host mode with orientation TYPEC_ORIENTATION_REVERSE,
    then we see the following things happening in order:
    
    a) UCSI gives initial role as host to dwc3 controller ucsi_register_port.
    Upon receiving this notification, the dwc3 core needs to program GCTL from
    PRTCAP_DEVICE to PRTCAP_HOST and as part of this change, it asserts GCTL
    Core soft reset and waits for it to be  completed before shifting it to
    host. Only after the reset is done will the dwc3_host_init be invoked and
    xhci is probed. DWC3 controller expects that the usb phy's are stable
    during this process i.e., the phy init is already done.
    
    b) During the 100ms wait for GCTL core soft reset, the actual notification
    from PPM is received by ucsi_glink via pmic glink for changing role to
    host. The pmic_glink_ucsi_notify routine first sends the orientation
    change to QMP and then sends role to dwc3 via ucsi framework. This is
    happening exactly at the time GCTL core soft reset is being processed.
    
    c) When QMP driver receives typec switch to TYPEC_ORIENTATION_REVERSE, it
    then re-programs the phy at the instant GCTL core soft reset has been
    asserted by dwc3 controller due to which the QMP PLL lock fails in
    qmp_combo_usb_power_on.
    
    d) After the 100ms of GCTL core soft reset is completed, the dwc3 core
    goes for initializing the host mode and invokes xhci probe. But at this
    point the QMP is non-responsive and as a result, the xhci plat probe fails
    during xhci_reset.
    
    Fix this by passing orientation switch to available ucsi instances if
    their gpio configuration is available before ucsi_register is invoked so
    that by the time, the pmic_glink_ucsi_notify provides typec_switch to QMP,
    the lane is already configured and the call would be a NOP thus not racing
    with role switch.
    
    Cc: stable@vger.kernel.org
    Fixes: c6165ed2f425 ("usb: ucsi: glink: use the connector orientation GPIO to provide switch events")
    Suggested-by: Wesley Cheng <quic_wcheng@quicinc.com>
    Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
    Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20240301040914.458492-1-quic_kriskura@quicinc.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

vboxsf: Avoid an spurious warning if load_nls_xxx() fails [+ + +]

Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Wed Nov 1 11:49:48 2023 +0100

    vboxsf: Avoid an spurious warning if load_nls_xxx() fails
    
    commit de3f64b738af57e2732b91a0774facc675b75b54 upstream.
    
    If an load_nls_xxx() function fails a few lines above, the 'sbi->bdi_id' is
    still 0.
    So, in the error handling path, we will call ida_simple_remove(..., 0)
    which is not allocated yet.
    
    In order to prevent a spurious "ida_free called for id=0 which is not
    allocated." message, tweak the error handling path and add a new label.
    
    Fixes: 0fd169576648 ("fs: Add VirtualBox guest shared folder (vboxsf) support")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://lore.kernel.org/r/d09eaaa4e2e08206c58a1a27ca9b3e81dc168773.1698835730.git.christophe.jaillet@wanadoo.fr
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

virtchnl: Add header dependencies [+ + +]

Author: Ivan Vecera <ivecera@redhat.com>
Date:   Wed Sep 27 10:31:30 2023 +0200

    virtchnl: Add header dependencies
    
    [ Upstream commit 7151d87a175c6618fe81705755eb3dc4199cad4e ]
    
    The <linux/avf/virtchnl.h> uses BIT, struct_size and ETH_ALEN macros
    but does not include appropriate header files that defines them.
    Add these dependencies so this header file can be included anywhere.
    
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Stable-dep-of: 6dbdd4de0362 ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

vsock/virtio: fix packet delivery to tap device [+ + +]

Author: Marco Pinna <marco.pinn95@gmail.com>
Date:   Fri Mar 29 17:12:59 2024 +0100

    vsock/virtio: fix packet delivery to tap device
    
    commit b32a09ea7c38849ff925489a6bf5bd8914bc45df upstream.
    
    Commit 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks") added
    virtio_transport_deliver_tap_pkt() for handing packets to the
    vsockmon device. However, in virtio_transport_send_pkt_work(),
    the function is called before actually sending the packet (i.e.
    before placing it in the virtqueue with virtqueue_add_sgs() and checking
    whether it returned successfully).
    Queuing the packet in the virtqueue can fail even multiple times.
    However, in virtio_transport_deliver_tap_pkt() we deliver the packet
    to the monitoring tap interface only the first time we call it.
    This certainly avoids seeing the same packet replicated multiple times
    in the monitoring interface, but it can show the packet sent with the
    wrong timestamp or even before we succeed to queue it in the virtqueue.
    
    Move virtio_transport_deliver_tap_pkt() after calling virtqueue_add_sgs()
    and making sure it returned successfully.
    
    Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
    Cc: stable@vge.kernel.org
    Signed-off-by: Marco Pinna <marco.pinn95@gmail.com>
    Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
    Link: https://lore.kernel.org/r/20240329161259.411751-1-marco.pinn95@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: iwlwifi: disable multi rx queue for 9000 [+ + +]

Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date:   Tue Oct 17 12:16:46 2023 +0300

    wifi: iwlwifi: disable multi rx queue for 9000
    
    [ Upstream commit 29fa9a984b6d1075020f12071a89897fd62ed27f ]
    
    Multi rx queue allows to spread the load of the Rx streams on different
    CPUs. 9000 series required complex synchronization mechanisms from the
    driver side since the hardware / firmware is not able to provide
    information about duplicate packets and timeouts inside the reordering
    buffer.
    
    Users have complained that for newer devices, all those synchronization
    mechanisms have caused spurious packet drops. Those packet drops
    disappeared if we simplify the code, but unfortunately, we can't have
    RSS enabled on 9000 series without this complex code.
    
    Remove support for RSS on 9000 so that we can make the code much simpler
    for newer devices and fix the bugs for them.
    
    The down side of this patch is a that all the Rx path will be routed to
    a single CPU, but this has never been an issue, the modern CPUs are just
    fast enough to cope with all the traffic.
    
    Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
    Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
    Link: https://lore.kernel.org/r/20231017115047.2917eb8b7af9.Iddd7dcf335387ba46fcbbb6067ef4ff9cd3755a7@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Stable-dep-of: e78d78773089 ("wifi: iwlwifi: mvm: include link ID when releasing frames")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: include link ID when releasing frames [+ + +]

Author: Benjamin Berg <benjamin.berg@intel.com>
Date:   Wed Mar 20 23:26:22 2024 +0200

    wifi: iwlwifi: mvm: include link ID when releasing frames
    
    [ Upstream commit e78d7877308989ef91b64a3c746ae31324c07caa ]
    
    When releasing frames from the reorder buffer, the link ID was not
    included in the RX status information. This subsequently led mac80211 to
    drop the frame. Change it so that the link information is set
    immediately when possible so that it doesn't not need to be filled in
    anymore when submitting the frame to mac80211.
    
    Fixes: b8a85a1d42d7 ("wifi: iwlwifi: mvm: rxmq: report link ID to mac80211")
    Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
    Tested-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
    Reviewed-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://msgid.link/20240320232419.bbbd5e9bfe80.Iec1bf5c884e371f7bc5ea2534ed9ea8d3f2c0bf6@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: mvm: rfi: fix potential response leaks [+ + +]

Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Mar 19 10:10:17 2024 +0200

    wifi: iwlwifi: mvm: rfi: fix potential response leaks
    
    [ Upstream commit 06a093807eb7b5c5b29b6cff49f8174a4e702341 ]
    
    If the rx payload length check fails, or if kmemdup() fails,
    we still need to free the command response. Fix that.
    
    Fixes: 21254908cbe9 ("iwlwifi: mvm: add RFI-M support")
    Co-authored-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Link: https://msgid.link/20240319100755.db2fa0196aa7.I116293b132502ac68a65527330fa37799694b79c@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/bhi: Add BHI mitigation knob [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Mon Mar 11 08:57:05 2024 -0700

    x86/bhi: Add BHI mitigation knob
    
    commit ec9404e40e8f36421a2b66ecb76dc2209fe7f3ef upstream.
    
    Branch history clearing software sequences and hardware control
    BHI_DIS_S were defined to mitigate Branch History Injection (BHI).
    
    Add cmdline spectre_bhi={on|off|auto} to control BHI mitigation:
    
     auto - Deploy the hardware mitigation BHI_DIS_S, if available.
     on   - Deploy the hardware mitigation BHI_DIS_S, if available,
            otherwise deploy the software sequence at syscall entry and
            VMexit.
     off  - Turn off BHI mitigation.
    
    The default is auto mode which does not deploy the software sequence
    mitigation.  This is because of the hardening done in the syscall
    dispatch path, which is the likely target of BHI.
    
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bhi: Add support for clearing branch history at syscall entry [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Mon Mar 11 08:56:58 2024 -0700

    x86/bhi: Add support for clearing branch history at syscall entry
    
    commit 7390db8aea0d64e9deb28b8e1ce716f5020c7ee5 upstream.
    
    Branch History Injection (BHI) attacks may allow a malicious application to
    influence indirect branch prediction in kernel by poisoning the branch
    history. eIBRS isolates indirect branch targets in ring0.  The BHB can
    still influence the choice of indirect branch predictor entry, and although
    branch predictor entries are isolated between modes when eIBRS is enabled,
    the BHB itself is not isolated between modes.
    
    Alder Lake and new processors supports a hardware control BHI_DIS_S to
    mitigate BHI.  For older processors Intel has released a software sequence
    to clear the branch history on parts that don't support BHI_DIS_S. Add
    support to execute the software sequence at syscall entry and VMexit to
    overwrite the branch history.
    
    For now, branch history is not cleared at interrupt entry, as malicious
    applications are not believed to have sufficient control over the
    registers, since previous register state is cleared at interrupt
    entry. Researchers continue to poke at this area and it may become
    necessary to clear at interrupt entry as well in the future.
    
    This mitigation is only defined here. It is enabled later.
    
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Co-developed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bhi: Define SPEC_CTRL_BHI_DIS_S [+ + +]

Author: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Date:   Wed Mar 13 09:47:57 2024 -0700

    x86/bhi: Define SPEC_CTRL_BHI_DIS_S
    
    commit 0f4a837615ff925ba62648d280a861adf1582df7 upstream.
    
    Newer processors supports a hardware control BHI_DIS_S to mitigate
    Branch History Injection (BHI). Setting BHI_DIS_S protects the kernel
    from userspace BHI attacks without having to manually overwrite the
    branch history.
    
    Define MSR_SPEC_CTRL bit BHI_DIS_S and its enumeration CPUID.BHI_CTRL.
    Mitigation is enabled later.
    
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bhi: Enumerate Branch History Injection (BHI) bug [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Mon Mar 11 08:57:03 2024 -0700

    x86/bhi: Enumerate Branch History Injection (BHI) bug
    
    commit be482ff9500999f56093738f9219bbabc729d163 upstream.
    
    Mitigation for BHI is selected based on the bug enumeration. Add bits
    needed to enumerate BHI bug.
    
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bhi: Mitigate KVM by default [+ + +]

Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Date:   Mon Mar 11 08:57:09 2024 -0700

    x86/bhi: Mitigate KVM by default
    
    commit 95a6ccbdc7199a14b71ad8901cb788ba7fb5167b upstream.
    
    BHI mitigation mode spectre_bhi=auto does not deploy the software
    mitigation by default. In a cloud environment, it is a likely scenario
    where userspace is trusted but the guests are not trusted. Deploying
    system wide mitigation in such cases is not desirable.
    
    Update the auto mode to unconditionally mitigate against malicious
    guests. Deploy the software sequence at VMexit in auto mode also, when
    hardware mitigation is not available. Unlike the force =on mode,
    software sequence is not deployed at syscalls in auto mode.
    
    Suggested-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/boot: Move mem_encrypt= parsing to the decompressor [+ + +]

Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Tue Feb 27 16:19:14 2024 +0100

    x86/boot: Move mem_encrypt= parsing to the decompressor
    
    commit cd0d9d92c8bb46e77de62efd7df13069ddd61e7d upstream.
    
    The early SME/SEV code parses the command line very early, in order to
    decide whether or not memory encryption should be enabled, which needs
    to occur even before the initial page tables are created.
    
    This is problematic for a number of reasons:
    - this early code runs from the 1:1 mapping provided by the decompressor
      or firmware, which uses a different translation than the one assumed by
      the linker, and so the code needs to be built in a special way;
    - parsing external input while the entire kernel image is still mapped
      writable is a bad idea in general, and really does not belong in
      security minded code;
    - the current code ignores the built-in command line entirely (although
      this appears to be the case for the entire decompressor)
    
    Given that the decompressor/EFI stub is an intrinsic part of the x86
    bootable kernel image, move the command line parsing there and out of
    the core kernel. This removes the need to build lib/cmdline.o in a
    special way, or to use RIP-relative LEA instructions in inline asm
    blocks.
    
    This involves a new xloadflag in the setup header to indicate
    that mem_encrypt=on appeared on the kernel command line.
    
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20240227151907.387873-17-ardb+git@google.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bpf: Fix IP after emitting call depth accounting [+ + +]

Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Mon Apr 1 20:55:29 2024 +0200

    x86/bpf: Fix IP after emitting call depth accounting
    
    commit 9d98aa088386aee3db1b7b60b800c0fde0654a4a upstream.
    
    Adjust the IP passed to `emit_patch` so it calculates the correct offset
    for the CALL instruction if `x86_call_depth_emit_accounting` emits code.
    Otherwise we will skip some instructions and most likely crash.
    
    Fixes: b2e9dfe54be4 ("x86/bpf: Emit call depth accounting if required")
    Link: https://lore.kernel.org/lkml/20230105214922.250473-1-joanbrugueram@gmail.com/
    Co-developed-by: Joan Bruguera MicцЁ <joanbrugueram@gmail.com>
    Signed-off-by: Joan Bruguera MicцЁ <joanbrugueram@gmail.com>
    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/r/20240401185821.224068-2-ubizjak@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file [+ + +]

Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date:   Fri Apr 5 11:14:13 2024 -0700

    x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file
    
    commit 0cd01ac5dcb1e18eb18df0f0d05b5de76522a437 upstream.
    
    Change the format of the 'spectre_v2' vulnerabilities sysfs file
    slightly by converting the commas to semicolons, so that mitigations for
    future variants can be grouped together and separated by commas.
    
    Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/bugs: Fix the SRSO mitigation on Zen3/4 [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Thu Mar 28 13:59:05 2024 +0100

    x86/bugs: Fix the SRSO mitigation on Zen3/4
    
    Commit 4535e1a4174c4111d92c5a9a21e542d232e0fcaa upstream.
    
    The original version of the mitigation would patch in the calls to the
    untraining routines directly.  That is, the alternative() in UNTRAIN_RET
    will patch in the CALL to srso_alias_untrain_ret() directly.
    
    However, even if commit e7c25c441e9e ("x86/cpu: Cleanup the untrain
    mess") meant well in trying to clean up the situation, due to micro-
    architectural reasons, the untraining routine srso_alias_untrain_ret()
    must be the target of a CALL instruction and not of a JMP instruction as
    it is done now.
    
    Reshuffle the alternative macros to accomplish that.
    
    Fixes: e7c25c441e9e ("x86/cpu: Cleanup the untrain mess")
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/coco: Require seeding RNG with RDRAND on CoCo systems [+ + +]

Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date:   Tue Mar 26 17:07:35 2024 +0100

    x86/coco: Require seeding RNG with RDRAND on CoCo systems
    
    commit 99485c4c026f024e7cb82da84c7951dbe3deb584 upstream.
    
    There are few uses of CoCo that don't rely on working cryptography and
    hence a working RNG. Unfortunately, the CoCo threat model means that the
    VM host cannot be trusted and may actively work against guests to
    extract secrets or manipulate computation. Since a malicious host can
    modify or observe nearly all inputs to guests, the only remaining source
    of entropy for CoCo guests is RDRAND.
    
    If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole
    is meant to gracefully continue on gathering entropy from other sources,
    but since there aren't other sources on CoCo, this is catastrophic.
    This is mostly a concern at boot time when initially seeding the RNG, as
    after that the consequences of a broken RDRAND are much more
    theoretical.
    
    So, try at boot to seed the RNG using 256 bits of RDRAND output. If this
    fails, panic(). This will also trigger if the system is booted without
    RDRAND, as RDRAND is essential for a safe CoCo boot.
    
    Add this deliberately to be "just a CoCo x86 driver feature" and not
    part of the RNG itself. Many device drivers and platforms have some
    desire to contribute something to the RNG, and add_device_randomness()
    is specifically meant for this purpose.
    
    Any driver can call it with seed data of any quality, or even garbage
    quality, and it can only possibly make the quality of the RNG better or
    have no effect, but can never make it worse.
    
    Rather than trying to build something into the core of the RNG, consider
    the particular CoCo issue just a CoCo issue, and therefore separate it
    all out into driver (well, arch/platform) code.
    
      [ bp: Massage commit message. ]
    
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Elena Reshetova <elena.reshetova@intel.com>
    Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reviewed-by: Theodore Ts'o <tytso@mit.edu>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20240326160735.73531-1-Jason@zx2c4.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/CPU/AMD: Add X86_FEATURE_ZEN1 [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Sat Dec 2 12:50:23 2023 +0100

    x86/CPU/AMD: Add X86_FEATURE_ZEN1
    
    [ Upstream commit 232afb557835d6f6859c73bf610bad308c96b131 ]
    
    Add a synthetic feature flag specifically for first generation Zen
    machines. There's need to have a generic flag for all Zen generations so
    make X86_FEATURE_ZEN be that flag.
    
    Fixes: 30fa92832f40 ("x86/CPU/AMD: Add ZenX generations flags")
    Suggested-by: Brian Gerst <brgerst@gmail.com>
    Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/dc3835e3-0731-4230-bbb9-336bbe3d042b@amd.com
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/CPU/AMD: Add ZenX generations flags [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Tue Oct 31 23:30:59 2023 +0100

    x86/CPU/AMD: Add ZenX generations flags
    
    [ Upstream commit 30fa92832f405d5ac9f263e99f62445fa3084008 ]
    
    Add X86_FEATURE flags for each Zen generation. They should be used from
    now on instead of checking f/m/s.
    
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lore.kernel.org/r/20231120104152.13740-2-bp@alien8.de
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/CPU/AMD: Carve out the erratum 1386 fix [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Wed Nov 1 11:14:59 2023 +0100

    x86/CPU/AMD: Carve out the erratum 1386 fix
    
    [ Upstream commit a7c32a1ae9ee43abfe884f5af376877c4301d166 ]
    
    Call it on the affected CPU generations.
    
    No functional changes.
    
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Link: http://lore.kernel.org/r/20231120104152.13740-3-bp@alien8.de
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/CPU/AMD: Get rid of amd_erratum_1054[] [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Fri Nov 3 19:53:49 2023 +0100

    x86/CPU/AMD: Get rid of amd_erratum_1054[]
    
    [ Upstream commit 54c33e23f75d5c9925495231c57d3319336722ef ]
    
    No functional changes.
    
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Link: http://lore.kernel.org/r/20231120104152.13740-10-bp@alien8.de
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/CPU/AMD: Move erratum 1076 fix into the Zen1 init function [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Wed Nov 1 12:31:44 2023 +0100

    x86/CPU/AMD: Move erratum 1076 fix into the Zen1 init function
    
    [ Upstream commit 0da91912fc150d6d321b15e648bead202ced1a27 ]
    
    No functional changes.
    
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Link: http://lore.kernel.org/r/20231120104152.13740-5-bp@alien8.de
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/CPU/AMD: Move the DIV0 bug detection to the Zen1 init function [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Wed Nov 1 12:52:01 2023 +0100

    x86/CPU/AMD: Move the DIV0 bug detection to the Zen1 init function
    
    [ Upstream commit bfff3c6692ce64fa9d86eb829d18229c307a0855 ]
    
    No functional changes.
    
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Link: http://lore.kernel.org/r/20231120104152.13740-9-bp@alien8.de
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/CPU/AMD: Move Zenbleed check to the Zen2 init function [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Wed Nov 1 12:38:35 2023 +0100

    x86/CPU/AMD: Move Zenbleed check to the Zen2 init function
    
    [ Upstream commit f69759be251dce722942594fbc62e53a40822a82 ]
    
    Prefix it properly so that it is clear which generation it is dealing
    with.
    
    No functional changes.
    
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: http://lore.kernel.org/r/20231120104152.13740-8-bp@alien8.de
    Stable-dep-of: c7b2edd8377b ("perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Thu Apr 4 17:16:14 2024 -0700

    x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word
    
    commit 8cb4a9a82b21623dbb4b3051dd30d98356cf95bc upstream.
    
    Add CPUID_LNX_5 to track cpufeatures' word 21, and add the appropriate
    compile-time assert in KVM to prevent direct lookups on the features in
    CPUID_LNX_5.  KVM uses X86_FEATURE_* flags to manage guest CPUID, and so
    must translate features that are scattered by Linux from the Linux-defined
    bit to the hardware-defined bit, i.e. should never try to directly access
    scattered features in guest CPUID.
    
    Opportunistically add NR_CPUID_WORDS to enum cpuid_leafs, along with a
    compile-time assert in KVM's CPUID infrastructure to ensure that future
    additions update cpuid_leafs along with NCAPINTS.
    
    No functional change intended.
    
    Fixes: 7f274e609f3d ("x86/cpufeatures: Add new word for scattered features")
    Cc: Sandipan Das <sandipan.das@amd.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/cpufeatures: Add new word for scattered features [+ + +]

Author: Sandipan Das <sandipan.das@amd.com>
Date:   Mon Mar 25 13:01:44 2024 +0530

    x86/cpufeatures: Add new word for scattered features
    
    [ Upstream commit 7f274e609f3d5f45c22b1dd59053f6764458b492 ]
    
    Add a new word for scattered features because all free bits among the
    existing Linux-defined auxiliary flags have been exhausted.
    
    Signed-off-by: Sandipan Das <sandipan.das@amd.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/8380d2a0da469a1f0ad75b8954a79fb689599ff6.1711091584.git.sandipan.das@amd.com
    Stable-dep-of: 598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/efistub: Remap kernel text read-only before dropping NX attribute [+ + +]

Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Thu Jan 25 14:32:07 2024 +0100

    x86/efistub: Remap kernel text read-only before dropping NX attribute
    
    commit 9c55461040a9264b7e44444c53d26480b438eda6 upstream.
    
    Currently, the EFI stub invokes the EFI memory attributes protocol to
    strip any NX restrictions from the entire loaded kernel, resulting in
    all code and data being mapped read-write-execute.
    
    The point of the EFI memory attributes protocol is to remove the need
    for all memory allocations to be mapped with both write and execute
    permissions by default, and make it the OS loader's responsibility to
    transition data mappings to code mappings where appropriate.
    
    Even though the UEFI specification does not appear to leave room for
    denying memory attribute changes based on security policy, let's be
    cautious and avoid relying on the ability to create read-write-execute
    mappings. This is trivially achievable, given that the amount of kernel
    code executing via the firmware's 1:1 mapping is rather small and
    limited to the .head.text region. So let's drop the NX restrictions only
    on that subregion, but not before remapping it as read-only first.
    
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/head/64: Move the __head definition to [+ + +]

Author: Hou Wenlong <houwenlong.hwl@antgroup.com>
Date:   Tue Oct 17 15:08:06 2023 +0800

    x86/head/64: Move the __head definition to <asm/init.h>
    
    commit d2a285d65bfde3218fd0c3b88794d0135ced680b upstream.
    
    Move the __head section definition to a header to widen its use.
    
    An upcoming patch will mark the code as __head in mem_encrypt_identity.c too.
    
    Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/0583f57977be184689c373fe540cbd7d85ca2047.1697525407.git.houwenlong.hwl@antgroup.com
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/mce: Make sure to grab mce_sysfs_mutex in set_bank() [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Wed Mar 13 14:48:27 2024 +0100

    x86/mce: Make sure to grab mce_sysfs_mutex in set_bank()
    
    commit 3ddf944b32f88741c303f0b21459dbb3872b8bc5 upstream.
    
    Modifying a MCA bank's MCA_CTL bits which control which error types to
    be reported is done over
    
      /sys/devices/system/machinecheck/
      Б■°Б■─Б■─ machinecheck0
      Б■┌б═б═ Б■°Б■─Б■─ bank0
      Б■┌б═б═ Б■°Б■─Б■─ bank1
      Б■┌б═б═ Б■°Б■─Б■─ bank10
      Б■┌б═б═ Б■°Б■─Б■─ bank11
      ...
    
    sysfs nodes by writing the new bit mask of events to enable.
    
    When the write is accepted, the kernel deletes all current timers and
    reinits all banks.
    
    Doing that in parallel can lead to initializing a timer which is already
    armed and in the timer wheel, i.e., in use already:
    
      ODEBUG: init active (active state 0) object: ffff888063a28000 object
      type: timer_list hint: mce_timer_fn+0x0/0x240 arch/x86/kernel/cpu/mce/core.c:2642
      WARNING: CPU: 0 PID: 8120 at lib/debugobjects.c:514
      debug_print_object+0x1a0/0x2a0 lib/debugobjects.c:514
    
    Fix that by grabbing the sysfs mutex as the rest of the MCA sysfs code
    does.
    
    Reported by: Yue Sun <samsun1006219@gmail.com>
    Reported by: xingwei lee <xrivendell7@gmail.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: <stable@kernel.org>
    Link: https://lore.kernel.org/r/CAEkJfYNiENwQY8yV1LYJ9LjJs%2Bx_-PqMv98gKig55=2vbzffRw@mail.gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/mm/pat: fix VM_PAT handling in COW mappings [+ + +]

Author: David Hildenbrand <david@redhat.com>
Date:   Wed Apr 3 23:21:30 2024 +0200

    x86/mm/pat: fix VM_PAT handling in COW mappings
    
    commit 04c35ab3bdae7fefbd7c7a7355f29fa03a035221 upstream.
    
    PAT handling won't do the right thing in COW mappings: the first PTE (or,
    in fact, all PTEs) can be replaced during write faults to point at anon
    folios.  Reliably recovering the correct PFN and cachemode using
    follow_phys() from PTEs will not work in COW mappings.
    
    Using follow_phys(), we might just get the address+protection of the anon
    folio (which is very wrong), or fail on swap/nonswap entries, failing
    follow_phys() and triggering a WARN_ON_ONCE() in untrack_pfn() and
    track_pfn_copy(), not properly calling free_pfn_range().
    
    In free_pfn_range(), we either wouldn't call memtype_free() or would call
    it with the wrong range, possibly leaking memory.
    
    To fix that, let's update follow_phys() to refuse returning anon folios,
    and fallback to using the stored PFN inside vma->vm_pgoff for COW mappings
    if we run into that.
    
    We will now properly handle untrack_pfn() with COW mappings, where we
    don't need the cachemode.  We'll have to fail fork()->track_pfn_copy() if
    the first page was replaced by an anon folio, though: we'd have to store
    the cachemode in the VMA to make this work, likely growing the VMA size.
    
    For now, lets keep it simple and let track_pfn_copy() just fail in that
    case: it would have failed in the past with swap/nonswap entries already,
    and it would have done the wrong thing with anon folios.
    
    Simple reproducer to trigger the WARN_ON_ONCE() in untrack_pfn():
    
    <--- C reproducer --->
     #include <stdio.h>
     #include <sys/mman.h>
     #include <unistd.h>
     #include <liburing.h>
    
     int main(void)
     {
             struct io_uring_params p = {};
             int ring_fd;
             size_t size;
             char *map;
    
             ring_fd = io_uring_setup(1, &p);
             if (ring_fd < 0) {
                     perror("io_uring_setup");
                     return 1;
             }
             size = p.sq_off.array + p.sq_entries * sizeof(unsigned);
    
             /* Map the submission queue ring MAP_PRIVATE */
             map = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE,
                        ring_fd, IORING_OFF_SQ_RING);
             if (map == MAP_FAILED) {
                     perror("mmap");
                     return 1;
             }
    
             /* We have at least one page. Let's COW it. */
             *map = 0;
             pause();
             return 0;
     }
    <--- C reproducer --->
    
    On a system with 16 GiB RAM and swap configured:
     # ./iouring &
     # memhog 16G
     # killall iouring
    [  301.552930] ------------[ cut here ]------------
    [  301.553285] WARNING: CPU: 7 PID: 1402 at arch/x86/mm/pat/memtype.c:1060 untrack_pfn+0xf4/0x100
    [  301.553989] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_g
    [  301.558232] CPU: 7 PID: 1402 Comm: iouring Not tainted 6.7.5-100.fc38.x86_64 #1
    [  301.558772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebu4
    [  301.559569] RIP: 0010:untrack_pfn+0xf4/0x100
    [  301.559893] Code: 75 c4 eb cf 48 8b 43 10 8b a8 e8 00 00 00 3b 6b 28 74 b8 48 8b 7b 30 e8 ea 1a f7 000
    [  301.561189] RSP: 0018:ffffba2c0377fab8 EFLAGS: 00010282
    [  301.561590] RAX: 00000000ffffffea RBX: ffff9208c8ce9cc0 RCX: 000000010455e047
    [  301.562105] RDX: 07fffffff0eb1e0a RSI: 0000000000000000 RDI: ffff9208c391d200
    [  301.562628] RBP: 0000000000000000 R08: ffffba2c0377fab8 R09: 0000000000000000
    [  301.563145] R10: ffff9208d2292d50 R11: 0000000000000002 R12: 00007fea890e0000
    [  301.563669] R13: 0000000000000000 R14: ffffba2c0377fc08 R15: 0000000000000000
    [  301.564186] FS:  0000000000000000(0000) GS:ffff920c2fbc0000(0000) knlGS:0000000000000000
    [  301.564773] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  301.565197] CR2: 00007fea88ee8a20 CR3: 00000001033a8000 CR4: 0000000000750ef0
    [  301.565725] PKRU: 55555554
    [  301.565944] Call Trace:
    [  301.566148]  <TASK>
    [  301.566325]  ? untrack_pfn+0xf4/0x100
    [  301.566618]  ? __warn+0x81/0x130
    [  301.566876]  ? untrack_pfn+0xf4/0x100
    [  301.567163]  ? report_bug+0x171/0x1a0
    [  301.567466]  ? handle_bug+0x3c/0x80
    [  301.567743]  ? exc_invalid_op+0x17/0x70
    [  301.568038]  ? asm_exc_invalid_op+0x1a/0x20
    [  301.568363]  ? untrack_pfn+0xf4/0x100
    [  301.568660]  ? untrack_pfn+0x65/0x100
    [  301.568947]  unmap_single_vma+0xa6/0xe0
    [  301.569247]  unmap_vmas+0xb5/0x190
    [  301.569532]  exit_mmap+0xec/0x340
    [  301.569801]  __mmput+0x3e/0x130
    [  301.570051]  do_exit+0x305/0xaf0
    ...
    
    Link: https://lkml.kernel.org/r/20240403212131.929421-3-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reported-by: Wupeng Ma <mawupeng1@huawei.com>
    Closes: https://lkml.kernel.org/r/20240227122814.3781907-1-mawupeng1@huawei.com
    Fixes: b1a86e15dc03 ("x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines")
    Fixes: 5899329b1910 ("x86: PAT: implement track/untrack of pfnmap regions for x86 - v3")
    Acked-by: Ingo Molnar <mingo@kernel.org>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/nospec: Refactor UNTRAIN_RET[_*] [+ + +]

Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date:   Mon Sep 4 22:05:03 2023 -0700

    x86/nospec: Refactor UNTRAIN_RET[_*]
    
    Commit e8efc0800b8b5045ba8c0d1256bfbb47e92e192a upstream.
    
    Factor out the UNTRAIN_RET[_*] common bits into a helper macro.
    
    Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/f06d45489778bd49623297af2a983eea09067a74.1693889988.git.jpoimboe@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Fri Apr 5 16:46:37 2024 +0200

    x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk
    
    commit b377c66ae3509ccea596512d6afb4777711c4870 upstream.
    
    srso_alias_untrain_ret() is special code, even if it is a dummy
    which is called in the !SRSO case, so annotate it like its real
    counterpart, to address the following objtool splat:
    
      vmlinux.o: warning: objtool: .export_symbol+0x2b290: data relocation to !ENDBR: srso_alias_untrain_ret+0x0
    
    Fixes: 4535e1a4174c ("x86/bugs: Fix the SRSO mitigation on Zen3/4")
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: https://lore.kernel.org/r/20240405144637.17908-1-bp@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO [+ + +]

Author: Borislav Petkov (AMD) <bp@alien8.de>
Date:   Tue Apr 2 16:05:49 2024 +0200

    x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO
    
    commit 0e110732473e14d6520e49d75d2c88ef7d46fe67 upstream.
    
    The srso_alias_untrain_ret() dummy thunk in the !CONFIG_MITIGATION_SRSO
    case is there only for the altenative in CALL_UNTRAIN_RET to have
    a symbol to resolve.
    
    However, testing with kernels which don't have CONFIG_MITIGATION_SRSO
    enabled, leads to the warning in patch_return() to fire:
    
      missing return thunk: srso_alias_untrain_ret+0x0/0x10-0x0: eb 0e 66 66 2e
      WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:826 apply_returns (arch/x86/kernel/alternative.c:826
    
    Put in a plain "ret" there so that gcc doesn't put a return thunk in
    in its place which special and gets checked.
    
    In addition:
    
      ERROR: modpost: "srso_alias_untrain_ret" [arch/x86/kvm/kvm-amd.ko] undefined!
      make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Chyba 1
      make[1]: *** [/usr/src/linux-6.8.3/Makefile:1873: modpost] Chyba 2
      make: *** [Makefile:240: __sub-make] Chyba 2
    
    since !SRSO builds would use the dummy return thunk as reported by
    petr.pisar@atlas.cz, https://bugzilla.kernel.org/show_bug.cgi?id=218679.
    
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Closes: https://lore.kernel.org/oe-lkp/202404020901.da75a60f-oliver.sang@intel.com
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/all/202404020901.da75a60f-oliver.sang@intel.com/
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/sev: Move early startup code into .head.text section [+ + +]

Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Tue Feb 27 16:19:16 2024 +0100

    x86/sev: Move early startup code into .head.text section
    
    commit 428080c9b19bfda37c478cd626dbd3851db1aff9 upstream.
    
    In preparation for implementing rigorous build time checks to enforce
    that only code that can support it will be called from the early 1:1
    mapping of memory, move SEV init code that is called in this manner to
    the .head.text section.
    
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20240227151907.387873-19-ardb+git@google.com
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/sme: Move early SME kernel encryption handling into .head.text [+ + +]

Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Tue Feb 27 16:19:15 2024 +0100

    x86/sme: Move early SME kernel encryption handling into .head.text
    
    commit 48204aba801f1b512b3abed10b8e1a63e03f3dd1 upstream.
    
    The .head.text section is the initial primary entrypoint of the core
    kernel, and is entered with the CPU executing from a 1:1 mapping of
    memory. Such code must never access global variables using absolute
    references, as these are based on the kernel virtual mapping which is
    not active yet at this point.
    
    Given that the SME startup code is also called from this early execution
    context, move it into .head.text as well. This will allow more thorough
    build time checks in the future to ensure that early startup code only
    uses RIP-relative references to global variables.
    
    Also replace some occurrences of __pa_symbol() [which relies on the
    compiler generating an absolute reference, which is not guaranteed] and
    an open coded RIP-relative access with RIP_REL_REF().
    
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20240227151907.387873-18-ardb+git@google.com
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/srso: Disentangle rethunk-dependent options [+ + +]

Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date:   Mon Sep 4 22:05:00 2023 -0700

    x86/srso: Disentangle rethunk-dependent options
    
    Commit 34a3cae7474c6e6f4a85aad4a7b8191b8b35cdcd upstream.
    
    CONFIG_RETHUNK, CONFIG_CPU_UNRET_ENTRY and CONFIG_CPU_SRSO are all
    tangled up.  De-spaghettify the code a bit.
    
    Some of the rethunk-related code has been shuffled around within the
    '.text..__x86.return_thunk' section, but otherwise there are no
    functional changes.  srso_alias_untrain_ret() and srso_alias_safe_ret()
    ((which are very address-sensitive) haven't moved.
    
    Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/2845084ed303d8384905db3b87b77693945302b4.1693889988.git.jpoimboe@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/srso: Improve i-cache locality for alias mitigation [+ + +]

Author: Josh Poimboeuf <jpoimboe@kernel.org>
Date:   Mon Sep 4 22:04:55 2023 -0700

    x86/srso: Improve i-cache locality for alias mitigation
    
    Commit aa730cff0c26244e88066b5b461a9f5fbac13823 upstream.
    
    Move srso_alias_return_thunk() to the same section as
    srso_alias_safe_ret() so they can share a cache line.
    
    Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
    Link: https://lore.kernel.org/r/eadaf5530b46a7ae8b936522da45ae555d2b3393.1693889988.git.jpoimboe@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/syscall: Don't force use of indirect calls for system calls [+ + +]

Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Apr 3 16:36:44 2024 -0700

    x86/syscall: Don't force use of indirect calls for system calls
    
    commit 1e3ad78334a69b36e107232e337f9d693dcc9df2 upstream.
    
    Make <asm/syscall.h> build a switch statement instead, and the compiler can
    either decide to generate an indirect jump, or - more likely these days due
    to mitigations - just a series of conditional branches.
    
    Yes, the conditional branches also have branch prediction, but the branch
    prediction is much more controlled, in that it just causes speculatively
    running the wrong system call (harmless), rather than speculatively running
    possibly wrong random less controlled code gadgets.
    
    This doesn't mitigate other indirect calls, but the system call indirection
    is the first and most easily triggered case.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86: set SPECTRE_BHI_ON as default [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Apr 9 19:32:41 2024 +0200

    x86: set SPECTRE_BHI_ON as default
    
    commit 2bb69f5fc72183e1c62547d900f560d0e9334925 upstream.
    
    Part of a merge commit from Linus that adjusted the default setting of
    SPECTRE_BHI_ON.
    
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen-netfront: Add missing skb_mark_for_recycle [+ + +]

Author: Jesper Dangaard Brouer <hawk@kernel.org>
Date:   Wed Mar 27 13:14:56 2024 +0100

    xen-netfront: Add missing skb_mark_for_recycle
    
    commit 037965402a010898d34f4e35327d22c0a95cd51f upstream.
    
    Notice that skb_mark_for_recycle() is introduced later than fixes tag in
    commit 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling").
    
    It is believed that fixes tag were missing a call to page_pool_release_page()
    between v5.9 to v5.14, after which is should have used skb_mark_for_recycle().
    Since v6.6 the call page_pool_release_page() were removed (in
    commit 535b9c61bdef ("net: page_pool: hide page_pool_release_page()")
    and remaining callers converted (in commit 6bfef2ec0172 ("Merge branch
    'net-page_pool-remove-page_pool_release_page'")).
    
    This leak became visible in v6.8 via commit dba1b8a7ab68 ("mm/page_pool: catch
    page_pool memory leaks").
    
    Cc: stable@vger.kernel.org
    Fixes: 6c5aa6fc4def ("xen networking: add basic XDP support for xen-netfront")
    Reported-by: Leonidas Spyropoulos <artafinde@archlinux.com>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=218654
    Reported-by: Arthur Borsboom <arthurborsboom@gmail.com>
    Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
    Link: https://lore.kernel.org/r/171154167446.2671062.9127105384591237363.stgit@firesoul
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Список изменений в Linux 6.6.26