summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/apic
AgeCommit message (Collapse)Author
7 daysMerge tag 'soc-drivers-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "This is the first half of the driver changes: - A treewide interface change to the "syscore" operations for power management, as a preparation for future Tegra specific changes - Reset controller updates with added drivers for LAN969x, eic770 and RZ/G3S SoCs - Protection of system controller registers on Renesas and Google SoCs, to prevent trivially triggering a system crash from e.g. debugfs access - soc_device identification updates on Nvidia, Exynos and Mediatek - debugfs support in the ST STM32 firewall driver - Minor updates for SoC drivers on AMD/Xilinx, Renesas, Allwinner, TI - Cleanups for memory controller support on Nvidia and Renesas" * tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (114 commits) memory: tegra186-emc: Fix missing put_bpmp Documentation: reset: Remove reset_controller_add_lookup() reset: fix BIT macro reference reset: rzg2l-usbphy-ctrl: Fix a NULL vs IS_ERR() bug in probe reset: th1520: Support reset controllers in more subsystems reset: th1520: Prepare for supporting multiple controllers dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys dt-bindings: reset: thead,th1520-reset: Remove non-VO-subsystem resets reset: remove legacy reset lookup code clk: davinci: psc: drop unused reset lookup reset: rzg2l-usbphy-ctrl: Add support for RZ/G3S SoC reset: rzg2l-usbphy-ctrl: Add support for USB PWRRDY dt-bindings: reset: renesas,rzg2l-usbphy-ctrl: Document RZ/G3S support reset: eswin: Add eic7700 reset driver dt-bindings: reset: eswin: Documentation for eic7700 SoC reset: sparx5: add LAN969x support dt-bindings: reset: microchip: Add LAN969x support soc: rockchip: grf: Add select correct PWM implementation on RK3368 soc/tegra: pmc: Add USB wake events for Tegra234 amba: tegra-ahb: Fix device leak on SMMU enable ...
10 daysMerge tag 'x86_misc_for_6.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Dave Hansen: "The most significant are some changes to ensure that symbols exported for KVM are used only by KVM modules themselves, along with some related cleanups. In true x86/misc fashion, the other patch is completely unrelated and just enhances an existing pr_warn() to make it clear to users how they have tainted their kernel when something is mucking with MSRs. Summary: - Make MSR-induced taint easier for users to track down - Restrict KVM-specific exports to KVM itself" * tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible x86/mm: Drop unnecessary export of "ptdump_walk_pgd_level_debugfs" x86/mtrr: Drop unnecessary export of "mtrr_state" x86/bugs: Drop unnecessary export of "x86_spec_ctrl_base" x86/msr: Add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
2025-11-14syscore: Pass context data to callbacksThierry Reding
Several drivers can benefit from registering per-instance data along with the syscore operations. To achieve this, move the modifiable fields out of the syscore_ops structure and into a separate struct syscore that can be registered with the framework. Add a void * driver data field for drivers to store contextual data that will be passed to the syscore ops. Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-11-12x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possibleSean Christopherson
Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(), and use the helper macro to export symbols for KVM throughout x86 if and only if KVM will build one or more modules, and only for those modules. To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be built (because no vendor modules are selected), let arch code #define EXPORT_SYMBOL_FOR_KVM to suppress/override the exports. Note, the set of symbols to restrict to KVM was generated by manual search and audit; any "misses" are due to human error, not some grand plan. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Kai Huang <kai.huang@intel.com> Tested-by: Kai Huang <kai.huang@intel.com> Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com
2025-11-07x86/apic: Fix frequency in apic=verbose log outputJulian Stecklina
When apic=verbose is specified, the LAPIC timer calibration prints its results to the console. At least while debugging virtualization code, the CPU and bus frequencies are printed incorrectly. Specifically, for a 1.7 GHz CPU with 1 GHz bus frequency and HZ=1000, the log includes a superfluous 0 after the period: ..... calibration result: 999978 ..... CPU clock speed is 1696.0783 MHz. ..... host bus clock speed is 999.0978 MHz. Looking at the code, this only worked as intended for HZ=100. After the fix, the correct frequency is printed: ..... calibration result: 999828 ..... CPU clock speed is 1696.507 MHz. ..... host bus clock speed is 999.828 MHz. There is no functional change to the LAPIC calibration here, beyond the printing format changes. [ bp: - Massage commit message - Figures it should apply this patch about ~4 years later - Massage it into the current code ] Suggested-by: Markus Napierkowski <markus.napierkowski@cyberus-technology.de> Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20211030142148.143261-1-js@alien8.de
2025-10-21x86/ioapic: Simplify mp_irqdomain_alloc() slightlyChristophe JAILLET
The IRQ return value of irq_find_mapping() is only tested for existence, not used for anything else. So, this call can be replaced by a slightly simpler irq_resolve_mapping() call, which reduces generated code size a bit (x86-64 allmodconfig): text data bss dec hex filename 82142 38633 18048 138823 21e47 arch/x86/kernel/apic/io_apic.o.before 81932 38633 18048 138613 21d75 arch/x86/kernel/apic/io_apic.o.after [ mingo: Fixed & simplified the changelog ] Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: https://patch.msgid.link/cb3a4968538637aac3a5ae4f5ecc4f5eb43376ea.1760861877.git.christophe.jaillet@wanadoo.fr
2025-09-04x86/apic/savic: Do not use snp_abort()Borislav Petkov (AMD)
This function is going away so replace the callsites with the equivalent functionality. Add a new SAVIC-specific termination reason. If more granularity is needed there, it will be revisited in the future. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-01x86/apic: Enable Secure AVIC in the control MSRNeeraj Upadhyay
With all the pieces in place now, enable Secure AVIC in the Secure AVIC Control MSR. Any access to x2APIC MSRs are emulated by the hypervisor before Secure AVIC is enabled in the control MSR. Post Secure AVIC enablement, all x2APIC MSR accesses (whether accelerated by AVIC hardware or trapped as a #VC exception) operate on the vCPU's APIC backing page. Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828112126.209028-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add kexec support for Secure AVICNeeraj Upadhyay
Add a apic->teardown() callback to disable Secure AVIC before rebooting into the new kernel. This ensures that the new kernel does not access the old APIC backing page which was allocated by the previous kernel. Such accesses can happen if there are any APIC accesses done during the guest boot before Secure AVIC driver probe is done by the new kernel (as Secure AVIC would have remained enabled in the Secure AVIC control MSR). Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828112008.209013-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Handle EOI writes for Secure AVIC guestsNeeraj Upadhyay
Secure AVIC accelerates the guest's EOI MSR writes for edge-triggered interrupts. For level-triggered interrupts, EOI MSR writes trigger a #VC exception with an SVM_EXIT_AVIC_UNACCELERATED_ACCESS error code. To complete EOI handling, the #VC exception handler would need to trigger a GHCB protocol MSR write event to notify the hypervisor about completion of the level-triggered interrupt. Hypervisor notification is required for cases like emulated IO-APIC, to complete and clear interrupt in the IO-APIC's interrupt state. However, #VC exception handling adds extra performance overhead for APIC register writes. In addition, for Secure AVIC, some unaccelerated APIC register MSR writes are trapped, whereas others are faulted. This results in additional complexity in #VC exception handling for unaccelerated APIC MSR accesses. So, directly do a GHCB protocol based APIC EOI MSR write from apic->eoi() callback for level-triggered interrupts. Use WRMSR for edge-triggered interrupts, so that hardware re-evaluates any pending interrupt which can be delivered to the guest vCPU. For level-triggered interrupts, re-evaluation happens on return from VMGEXIT corresponding to the GHCB event for APIC EOI MSR write. Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828111654.208987-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Read and write LVT* APIC registers from HV for SAVIC guestsNeeraj Upadhyay
The Hypervisor needs information about the current state of the LVT registers for device emulation and NMIs. So, forward reads and write of these registers to the hypervisor for Secure AVIC enabled guests. Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828111356.208972-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Allow NMI to be injected from hypervisor for Secure AVICNeeraj Upadhyay
Secure AVIC requires the "AllowedNmi" bit in the Secure AVIC Control MSR to be set for an NMI to be injected from the hypervisor. So set it. Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828111243.208946-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add support to send NMI IPI for Secure AVICNeeraj Upadhyay
Secure AVIC introduces a new field in the APIC backing page "NmiReq" that has to be set by the guest to request a NMI IPI through APIC_ICR write. Add support to set NmiReq appropriately to send NMI IPI. Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828111213.208933-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Support LAPIC timer for Secure AVICNeeraj Upadhyay
Secure AVIC requires the LAPIC timer to be emulated by the hypervisor. KVM already supports emulating the LAPIC timer using hrtimers. In order to emulate it, APIC_LVTT, APIC_TMICT and APIC_TDCR register values need to be propagated to the hypervisor for arming the timer. APIC_TMCCT register value has to be read from the hypervisor, which is required for calibrating the APIC timer. So, read/write all APIC timer registers from/to the hypervisor. Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110926.208866-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add support to send IPI for Secure AVICNeeraj Upadhyay
Secure AVIC hardware accelerates only Self-IPI, i.e. on WRMSR to APIC_SELF_IPI and APIC_ICR (with destination shorthand equal to Self) registers, hardware takes care of updating the APIC_IRR in the APIC backing page of the vCPU. For other IPI types (cross-vCPU, broadcast IPIs), software needs to take care of updating the APIC_IRR state of the target vCPUs and to ensure that the target vCPUs notice the new pending interrupt. Add new callbacks in the Secure AVIC driver for sending IPI requests. These callbacks update the IRR in the target guest vCPU's APIC backing page. To ensure that the remote vCPU notices the new pending interrupt, reuse the GHCB MSR handling code in vc_handle_msr() to issue APIC_ICR MSR-write GHCB protocol event to the hypervisor. For Secure AVIC guests, on APIC_ICR write MSR exits, the hypervisor notifies the target vCPU by either sending an AVIC doorbell (if target vCPU is running) or by waking up the non-running target vCPU. Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110824.208851-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add an update_vector() callback for Secure AVICNeeraj Upadhyay
Add an update_vector() callback to set/clear the ALLOWED_IRR field in a vCPU's APIC backing page for vectors which are emulated by the hypervisor. The ALLOWED_IRR field indicates the interrupt vectors which the guest allows the hypervisor to inject (typically for emulated devices). Interrupt vectors used exclusively by the guest itself and the vectors which are not emulated by the hypervisor, such as IPI vectors, should not be set by the guest in the ALLOWED_IRR fields. As clearing/setting state of a vector will also be used in subsequent commits for other APIC registers (such as APIC_IRR update for sending IPI), add a common update_vector() in the Secure AVIC driver. [ bp: Massage commit message. ] Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110255.208779-4-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add update_vector() callback for APIC driversNeeraj Upadhyay
Add an update_vector() callback to allow APIC drivers to perform driver specific operations on external vector allocation/teardown on a CPU. This callback will be used by the Secure AVIC APIC driver to configure the vectors which a guest vCPU allows the hypervisor to send to it. As system vectors have fixed vector assignments and are not dynamically allocated, add an apic_update_vector() public API to facilitate update_vector() callback invocation for them. This will be used for Secure AVIC enabled guests to allow the hypervisor to inject system vectors which are emulated by the hypervisor such as APIC timer vector and HYPERVISOR_CALLBACK_VECTOR. While at it, cleanup line break in apic_update_irq_cfg(). Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828110255.208779-3-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Initialize APIC ID for Secure AVICNeeraj Upadhyay
Initialize the APIC ID in the Secure AVIC APIC backing page with the APIC_ID MSR value read from the hypervisor. CPU topology evaluation later during boot would catch and report any duplicate APIC ID for two CPUs. Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110255.208779-2-Neeraj.Upadhyay@amd.com
2025-08-31x86/apic: Populate .read()/.write() callbacks of Secure AVIC driverNeeraj Upadhyay
Add read() and write() APIC callback functions to read and write the x2APIC registers directly from the guest APIC backing page of a vCPU. The x2APIC registers are mapped at an offset within the guest APIC backing page which is the same as their x2APIC MMIO offset. Secure AVIC adds new registers such as ALLOWED_IRRs (which are at 4-byte offset within the IRR register offset range) and NMI_REQ to the APIC register space. When Secure AVIC is enabled, accessing the guest's APIC registers through RD/WRMSR results in a #VC exception (for non-accelerated register accesses) with error code VMEXIT_AVIC_NOACCEL. The #VC exception handler can read/write the x2APIC register in the guest APIC backing page to complete the RDMSR/WRMSR. Since doing this would increase the latency of accessing the x2APIC registers, instead of doing RDMSR/WRMSR based register accesses and handling reads/writes in the #VC exception, directly read/write the APIC registers from/to the guest APIC backing page of the vCPU in read() and write() callbacks of the Secure AVIC APIC driver. [ bp: Massage commit message. ] Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110255.208779-1-Neeraj.Upadhyay@amd.com
2025-08-31x86/apic: Initialize Secure AVIC APIC backing pageNeeraj Upadhyay
With Secure AVIC, the APIC backing page is owned and managed by the guest. Allocate and initialize APIC backing page for all guest CPUs. The NPT entry for a vCPU's APIC backing page must always be present when the vCPU is running in order for Secure AVIC to function. A VMEXIT_BUSY is returned on VMRUN and the vCPU cannot be resumed otherwise. To handle this, notify GPA of the vCPU's APIC backing page to the hypervisor by using the SVM_VMGEXIT_SECURE_AVIC GHCB protocol event. Before executing VMRUN, the hypervisor makes use of this information to make sure the APIC backing page is mapped in the NPT. [ bp: Massage commit message. ] Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828070334.208401-3-Neeraj.Upadhyay@amd.com
2025-08-28x86/apic: Add new driver for Secure AVICNeeraj Upadhyay
The Secure AVIC feature provides SEV-SNP guests hardware acceleration for performance sensitive APIC accesses while securely managing the guest-owned APIC state through the use of a private APIC backing page. This helps prevent the hypervisor from generating unexpected interrupts for a vCPU or otherwise violate architectural assumptions around the APIC behavior. Add a new x2APIC driver that will serve as the base of the Secure AVIC support. It is initially the same as the x2APIC physical driver (without IPI callbacks), but will be modified as features are implemented. As the new driver does not implement Secure AVIC features yet, if the hypervisor sets the Secure AVIC bit in SEV_STATUS, maintain the existing behavior to enforce the guest termination. [ bp: Massage commit message. ] Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828070334.208401-2-Neeraj.Upadhyay@amd.com
2025-08-25x86/apic: Make the ISR clearing saneThomas Gleixner
apic_pending_intr_clear() is fundamentally voodoo programming. It's primary purpose is to clear stale ISR bits in the local APIC, which would otherwise lock the corresponding interrupt priority level. The comments and the implementation claim falsely that after clearing the stale ISR bits, eventually stale IRR bits would be turned into ISR bits and can be cleared as well. That's just wishful thinking because: 1) If interrupts are disabled, the APIC does not propagate an IRR bit to the ISR. 2) If interrupts are enabled, then the APIC propagates the IRR bit to the ISR and raises the interrupt in the CPU, which means that code _cannot_ observe the ISR bit for any of those IRR bits. Rename the function to reflect the purpose and make exactly _one_ attempt to EOI the pending ISR bits and add comments why traversing the pending bit map in low to high priority order is correct. Instead of trying to "clear" IRR bits, simply print a warning message when the IRR is not empty. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/871ppwih4s.ffs@tglx
2025-07-15x86/apic: Move apic_update_irq_cfg() call to apic_update_vector()Neeraj Upadhyay
All callers of apic_update_vector() also call apic_update_irq_cfg() after it. So, move the apic_update_irq_cfg() call to apic_update_vector(). No functional change intended. Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250709033242.267892-18-Neeraj.Upadhyay@amd.com
2025-06-03Merge tag 'hyperv-next-signed-20250602' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel) - Fixes for Hyper-V UIO driver (Long Li) - Fixes for Hyper-V PCI driver (Michael Kelley) - Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley) - Documentation updates for Hyper-V VMBus (Michael Kelley) - Enhance logging for hv_kvp_daemon (Shradha Gupta) * tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits) Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir Documentation: hyperv: Update VMBus doc with new features and info PCI: hv: Remove unnecessary flex array in struct pci_packet Drivers: hv: Remove hv_alloc/free_* helpers Drivers: hv: Use kzalloc for panic page allocation uio_hv_generic: Align ring size to system page uio_hv_generic: Use correct size for interrupt and monitor pages Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary arch/x86: Provide the CPU number in the wakeup AP callback x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap() PCI: hv: Get vPCI MSI IRQ domain from DeviceTree ACPI: irq: Introduce acpi_get_gsi_dispatcher() Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device() Drivers: hv: vmbus: Get the IRQ number from DeviceTree dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties arm64, x86: hyperv: Report the VTL the system boots in arm64: hyperv: Initialize the Virtual Trust Level field Drivers: hv: Provide arch-neutral implementation of get_vtl() Drivers: hv: Enable VTL mode for arm64 ...
2025-05-27Merge tag 'timers-cleanups-2025-05-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "Another set of timer API cleanups: - Convert init_timer*(), try_to_del_timer_sync() and destroy_timer_on_stack() over to the canonical timer_*() namespace convention. There is another large conversion pending, which has not been included because it would have caused a gazillion of merge conflicts in next. The conversion scripts will be run towards the end of the merge window and a pull request sent once all conflict dependencies have been merged" * tag 'timers-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide, timers: Rename destroy_timer_on_stack() as timer_destroy_on_stack() treewide, timers: Rename try_to_del_timer_sync() as timer_delete_sync_try() timers: Rename init_timers() as timers_init() timers: Rename NEXT_TIMER_MAX_DELTA as TIMER_NEXT_MAX_DELTA timers: Rename __init_timer_on_stack() as __timer_init_on_stack() timers: Rename __init_timer() as __timer_init() timers: Rename init_timer_on_stack_key() as timer_init_key_on_stack() timers: Rename init_timer_key() as timer_init_key()
2025-05-27Merge tag 'irq-cleanups-2025-05-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq cleanups from Thomas Gleixner: "A set of cleanups for the generic interrupt subsystem: - Consolidate on one set of functions for the interrupt domain code to get rid of pointlessly duplicated code with only marginal different semantics. - Update the documentation accordingly and consolidate the coding style of the irqdomain header" * tag 'irq-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) irqdomain: Consolidate coding style irqdomain: Fix kernel-doc and add it to Documentation Documentation: irqdomain: Update it Documentation: irq-domain.rst: Simple improvements Documentation: irq/concepts: Minor improvements Documentation: irq/concepts: Add commas and reflow irqdomain: Improve kernel-docs of functions irqdomain: Make struct irq_domain_info variables const irqdomain: Use irq_domain_instantiate()'s return value as initializers irqdomain: Drop irq_linear_revmap() pinctrl: keembay: Switch to irq_find_mapping() irqchip/armada-370-xp: Switch to irq_find_mapping() gpu: ipu-v3: Switch to irq_find_mapping() gpio: idt3243x: Switch to irq_find_mapping() sh: Switch to irq_find_mapping() powerpc: Switch to irq_find_mapping() irqdomain: Drop irq_domain_add_*() functions powerpc: Switch irq_domain_add_nomap() to use fwnode thermal: Switch to irq_domain_create_linear() soc: Switch to irq_domain_create_*() ...
2025-05-23arch/x86: Provide the CPU number in the wakeup AP callbackRoman Kisel
When starting APs, confidential guests and paravisor guests need to know the CPU number, and the pattern of using the linear search has emerged in several places. With N processors that leads to the O(N^2) time complexity. Provide the CPU number in the AP wake up callback so that one can get the CPU number in constant time. Suggested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20250507182227.7421-3-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250507182227.7421-3-romank@linux.microsoft.com>
2025-05-16x86/io_apic: Switch to of_fwnode_handle()Jiri Slaby (SUSE)
of_node_to_fwnode() is irqdomain's reimplementation of the "officially" defined of_fwnode_handle(). The former is in the process of being removed, so use the latter instead. [ tglx: Fix up subject prefix ] Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250319092951.37667-11-jirislaby@kernel.org
2025-05-13Merge branch 'x86/msr' into x86/core, to resolve conflictsIngo Molnar
Conflicts: arch/x86/boot/startup/sme.c arch/x86/coco/sev/core.c arch/x86/kernel/fpu/core.c arch/x86/kernel/fpu/xstate.c Semantic conflict: arch/x86/include/asm/sev-internal.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-08treewide, timers: Rename try_to_del_timer_sync() as timer_delete_sync_try()Ingo Molnar
Move this API to the canonical timer_*() namespace. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250507175338.672442-9-mingo@kernel.org
2025-05-02x86/msr: Add explicit includes of <asm/msr.h>Xin Li (Intel)
For historic reasons there are some TSC-related functions in the <asm/msr.h> header, even though there's an <asm/tsc.h> header. To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h> to <asm/tsc.h> and to eventually eliminate the inclusion of <asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency to the source files that reference definitions from <asm/msr.h>. [ mingo: Clarified the changelog. ] Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
2025-04-18x86/asm: Rename rep_nop() to native_pause()Uros Bizjak
Rename rep_nop() function to what it really does. No functional change intended. Suggested-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lore.kernel.org/r/20250418080805.83679-1-ubizjak@gmail.com
2025-04-10x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'Ingo Molnar
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-10x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'Ingo Molnar
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-04irqdomain: Rename irq_set_default_host() to irq_set_default_domain()Jiri Slaby (SUSE)
Naming interrupt domains host is confusing at best and the irqdomain code uses both domain and host inconsistently. Therefore rename irq_set_default_host() to irq_set_default_domain(). Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250319092951.37667-3-jirislaby@kernel.org
2025-03-25Merge tag 'irq-drivers-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq driver updates from Thomas Gleixner: - Support for hard indices on RISC-V. The hart index identifies a hart (core) within a specific interrupt domain in RISC-V's Priviledged Architecture. - Rework of the RISC-V MSI driver This moves the driver over to the generic MSI library and solves the affinity problem of unmaskable PCI/MSI controllers. Unmaskable PCI/MSI controllers are prone to lose interrupts when the MSI message is updated to change the affinity because the message write consists of three 32-bit subsequent writes, which update address and data. As these writes are non-atomic versus the device raising an interrupt, the device can observe a half written update and issue an interrupt on the wrong vector. This is mitiated by a carefully orchestrated step by step update and the observation of an eventually pending interrupt on the CPU which issues the update. The algorithm follows the well established method of the X86 MSI driver. - A new driver for the RISC-V Sophgo SG2042 MSI controller - Overhaul of the Renesas RZQ2L driver Simplification of the probe function by using devm_*() mechanisms, which avoid the endless list of error prone gotos in the failure paths. - Expand the Renesas RZV2H driver to support RZ/G3E SoCs - A workaround for Rockchip 3568002 erratum in the GIC-V3 driver to ensure that the addressing is limited to the lower 32-bit of the physical address space. - Add support for the Allwinner AS23 NMI controller - Expand the IMX irqsteer driver to handle up to 960 input interrupts - The usual small updates, cleanups and device tree changes * tag 'irq-drivers-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) irqchip/imx-irqsteer: Support up to 960 input interrupts irqchip/sunxi-nmi: Support Allwinner A523 NMI controller dt-bindings: irq: sun7i-nmi: Document the Allwinner A523 NMI controller irqchip/davinci-cp-intc: Remove public header irqchip/renesas-rzv2h: Add RZ/G3E support irqchip/renesas-rzv2h: Update macros ICU_TSSR_TSSEL_{MASK,PREP} irqchip/renesas-rzv2h: Update TSSR_TIEN macro irqchip/renesas-rzv2h: Add field_width to struct rzv2h_hw_info irqchip/renesas-rzv2h: Add max_tssel to struct rzv2h_hw_info irqchip/renesas-rzv2h: Add struct rzv2h_hw_info with t_offs variable irqchip/renesas-rzv2h: Use devm_pm_runtime_enable() irqchip/renesas-rzv2h: Use devm_reset_control_get_exclusive_deasserted() irqchip/renesas-rzv2h: Simplify rzv2h_icu_init() irqchip/renesas-rzv2h: Drop irqchip from struct rzv2h_icu_priv irqchip/renesas-rzv2h: Fix wrong variable usage in rzv2h_tint_set_type() dt-bindings: interrupt-controller: renesas,rzv2h-icu: Document RZ/G3E SoC riscv: sophgo: dts: Add msi controller for SG2042 irqchip: Add the Sophgo SG2042 MSI interrupt controller dt-bindings: interrupt-controller: Add Sophgo SG2042 MSI arm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI ...
2025-03-24Merge tag 'x86-cleanups-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Miscellaneous x86 cleanups by Arnd Bergmann, Charles Han, Mirsad Todorovac, Randy Dunlap, Thorsten Blum and Zhang Kunbo" * tag 'x86-cleanups-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/coco: Replace 'static const cc_mask' with the newly introduced cc_get_mask() function x86/delay: Fix inconsistent whitespace selftests/x86/syscall: Fix coccinelle WARNING recommending the use of ARRAY_SIZE() x86/platform: Fix missing declaration of 'x86_apple_machine' x86/irq: Fix missing declaration of 'io_apic_irqs' x86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s description x86/apic: Use str_disabled_enabled() helper in print_ipi_mode()
2025-03-19x86/apic: Fix 32-bit APIC initialization for extended Intel FamiliesSohil Mehta
APIC detection is currently limited to a few specific Families and will not match the upcoming Families >=18. Extend the check to include all Families 6 or greater. Also convert it to a VFM check to make it simpler. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-2-sohil.mehta@intel.com
2025-03-04Merge branch 'x86/asm' into x86/core, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-03x86/smp/32: Remove safe_smp_processor_id()Brian Gerst
The safe_smp_processor_id() function was originally implemented in: dc2bc768a009 ("stack overflow safe kdump: safe_smp_processor_id()") to mitigate the CPU number corruption on a stack overflow. At the time, x86-32 stored the CPU number in thread_struct, which was located at the bottom of the task stack and thus vulnerable to an overflow. The CPU number is now located in percpu memory, so this workaround is no longer needed. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250303170115.2176553-1-brgerst@gmail.com
2025-02-27x86/smp: Drop 32-bit "bigsmp" machine supportArnd Bergmann
The x86-32 kernel used to support multiple platforms with more than eight logical CPUs, from the 1999-2003 timeframe: Sequent NUMA-Q, IBM Summit, Unisys ES7000 and HP F8. Support for all except the latter was dropped back in 2014, leaving only the F8 based DL740 and DL760 G2 machines in this catery, with up to eight single-core Socket-603 Xeon-MP processors with hyperthreading. Like the already removed machines, the HP F8 servers at the time cost upwards of $100k in typical configurations, but were quickly obsoleted by their 64-bit Socket-604 cousins and the AMD Opteron. Earlier servers with up to 8 Pentium Pro or Xeon processors remain fully supported as they had no hyperthreading. Similarly, the more common 4-socket Xeon-MP machines with hyperthreading using Intel or ServerWorks chipsets continue to work without this, and all the multi-core Xeon processors also run 64-bit kernels. While the "bigsmp" support can also be used to run on later 64-bit machines (including VM guests), it seems best to discourage that and get any remaining users to update their kernels to 64-bit builds on these. As a side-effect of this, there is also no more need to support NUMA configurations on 32-bit x86, as all true 32-bit NUMA platforms are already gone. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-3-arnd@kernel.org
2025-02-21x86/apic: Use str_disabled_enabled() helper in print_ipi_mode()Thorsten Blum
Remove hard-coded strings by using the str_disabled_enabled() helper. No change in functionality intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250209210333.5666-2-thorsten.blum@linux.dev
2025-02-20genirq: Introduce common irq_force_complete_move() implementationThomas Gleixner
CONFIG_GENERIC_PENDING_IRQ requires an architecture specific implementation of irq_force_complete_move() for CPU hotplug. At the moment, only x86 implements this unconditionally, but for RISC-V irq_force_complete_move() is only needed when the RISC-V IMSIC driver is in use and not needed otherwise. To allow runtime configuration of this mechanism, introduce a common irq_force_complete_move() implementation in the interrupt core code, which only invokes the completion function, when a interrupt chip in the hierarchy implements it. Switch X86 over to the new mechanism. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250217085657.789309-5-apatel@ventanamicro.com
2025-01-26Merge tag 'mm-stable-2025-01-26-14-59' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "The various patchsets are summarized below. Plus of course many indivudual patches which are described in their changelogs. - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the page allocator so we end up with the ability to allocate and free zero-refcount pages. So that callers (ie, slab) can avoid a refcount inc & dec - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use large folios other than PMD-sized ones - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and fixes for this small built-in kernel selftest - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of the mapletree code - "mm: fix format issues and param types" from Keren Sun implements a few minor code cleanups - "simplify split calculation" from Wei Yang provides a few fixes and a test for the mapletree code - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes continues the work of moving vma-related code into the (relatively) new mm/vma.c - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David Hildenbrand cleans up and rationalizes handling of gfp flags in the page allocator - "readahead: Reintroduce fix for improper RA window sizing" from Jan Kara is a second attempt at fixing a readahead window sizing issue. It should reduce the amount of unnecessary reading - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng addresses an issue where "huge" amounts of pte pagetables are accumulated: https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/ Qi's series addresses this windup by synchronously freeing PTE memory within the context of madvise(MADV_DONTNEED) - "selftest/mm: Remove warnings found by adding compiler flags" from Muhammad Usama Anjum fixes some build warnings in the selftests code when optional compiler warnings are enabled - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David Hildenbrand tightens the allocator's observance of __GFP_HARDWALL - "pkeys kselftests improvements" from Kevin Brodsky implements various fixes and cleanups in the MM selftests code, mainly pertaining to the pkeys tests - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to estimate application working set size - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn provides some cleanups to memcg's hugetlb charging logic - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based kernel build was demonstrated - "zram: split page type read/write handling" from Sergey Senozhatsky has several fixes and cleaups for zram in the area of zram_write_page(). A watchdog softlockup warning was eliminated - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky cleans up the pagetable destructor implementations. A rare use-after-free race is fixed - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes simplifies and cleans up the debugging code in the VMA merging logic - "Account page tables at all levels" from Kevin Brodsky cleans up and regularizes the pagetable ctor/dtor handling. This results in improvements in accounting accuracy - "mm/damon: replace most damon_callback usages in sysfs with new core functions" from SeongJae Park cleans up and generalizes DAMON's sysfs file interface logic - "mm/damon: enable page level properties based monitoring" from SeongJae Park increases the amount of information which is presented in response to DAMOS actions - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs is completed - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter Xu cleans up and generalizes the hugetlb reservation accounting - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino removes a never-used feature of the alloc_pages_bulk() interface - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park extends DAMOS filters to support not only exclusion (rejecting), but also inclusion (allowing) behavior - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi introduces a new memory descriptor for zswap.zpool that currently overlaps with struct page for now. This is part of the effort to reduce the size of struct page and to enable dynamic allocation of memory descriptors - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and simplifies the swap allocator locking. A speedup of 400% was demonstrated for one workload. As was a 35% reduction for kernel build time with swap-on-zram - "mm: update mips to use do_mmap(), make mmap_region() internal" from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that mmap_region() can be made MM-internal - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU regressions and otherwise improves MGLRU performance - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park updates DAMON documentation - "Cleanup for memfd_create()" from Isaac Manjarres does that thing - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand provides various cleanups in the areas of hugetlb folios, THP folios and migration - "Uncached buffered IO" from Jens Axboe implements the new RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache reading and writing. To permite userspace to address issues with massive buildup of useless pagecache when reading/writing fast devices - "selftests/mm: virtual_address_range: Reduce memory" from Thomas Weißschuh fixes and optimizes some of the MM selftests" * tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm/compaction: fix UBSAN shift-out-of-bounds warning s390/mm: add missing ctor/dtor on page table upgrade kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags() tools: add VM_WARN_ON_VMG definition mm/damon/core: use str_high_low() helper in damos_wmark_wait_us() seqlock: add missing parameter documentation for raw_seqcount_try_begin() mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh mm/page_alloc: remove the incorrect and misleading comment zram: remove zcomp_stream_put() from write_incompressible_page() mm: separate move/undo parts from migrate_pages_batch() mm/kfence: use str_write_read() helper in get_access_type() selftests/mm/mkdirty: fix memory leak in test_uffdio_copy() kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags() selftests/mm: virtual_address_range: avoid reading from VM_IO mappings selftests/mm: vm_util: split up /proc/self/smaps parsing selftests/mm: virtual_address_range: unmap chunks after validation selftests/mm: virtual_address_range: mmap() without PROT_WRITE selftests/memfd/memfd_test: fix possible NULL pointer dereference mm: add FGP_DONTCACHE folio creation flag mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue ...
2025-01-25mm/memblock: add memblock_alloc_or_panic interfaceGuo Weikang
Before SLUB initialization, various subsystems used memblock_alloc to allocate memory. In most cases, when memory allocation fails, an immediate panic is required. To simplify this behavior and reduce repetitive checks, introduce `memblock_alloc_or_panic`. This function ensures that memory allocation failures result in a panic automatically, improving code readability and consistency across subsystems that require this behavior. [guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic] Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/ Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390] Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-21Merge tag 'irq-core-2025-01-21' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt subsystem updates from Thomas Gleixner: - Consolidate the machine_kexec_mask_interrupts() by providing a generic implementation and replacing the copy & pasta orgy in the relevant architectures. - Prevent unconditional operations on interrupt chips during kexec shutdown, which can trigger warnings in certain cases when the underlying interrupt has been shut down before. - Make the enforcement of interrupt handling in interrupt context unconditionally available, so that it actually works for non x86 related interrupt chips. The earlier enablement for ARM GIC chips set the required chip flag, but did not notice that the check was hidden behind a config switch which is not selected by ARM[64]. - Decrapify the handling of deferred interrupt affinity setting. Some interrupt chips require that affinity changes are made from the context of handling an interrupt to avoid certain race conditions. For x86 this was the default, but with interrupt remapping this requirement was lifted and a flag was introduced which tells the core code that affinity changes can be done in any context. Unrestricted affinity changes are the default for the majority of interrupt chips. RISCV has the requirement to add the deferred mode to one of it's interrupt controllers, but with the original implementation this would require to add the any context flag to all other RISC-V interrupt chips. That's backwards, so reverse the logic and require that chips, which need the deferred mode have to be marked accordingly. That avoids chasing the 'sane' chips and marking them. - Add multi-node support to the Loongarch AVEC interrupt controller driver. - The usual tiny cleanups, fixes and improvements all over the place. * tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/generic_chip: Export irq_gc_mask_disable_and_ack_set() genirq/timings: Add kernel-doc for a function parameter genirq: Remove IRQ_MOVE_PCNTXT and related code x86/apic: Convert to IRQCHIP_MOVE_DEFERRED genirq: Provide IRQCHIP_MOVE_DEFERRED hexagon: Remove GENERIC_PENDING_IRQ leftover ARC: Remove GENERIC_PENDING_IRQ genirq: Remove handle_enforce_irqctx() wrapper genirq: Make handle_enforce_irqctx() unconditionally available irqchip/loongarch-avec: Add multi-nodes topology support irqchip/ts4800: Replace seq_printf() by seq_puts() irqchip/ti-sci-inta : Add module build support irqchip/ti-sci-intr: Add module build support irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown kexec: Consolidate machine_kexec_mask_interrupts() implementation genirq: Reuse irq_thread_fn() for forced thread case genirq: Move irq_thread_fn() further up in the code
2025-01-21Merge tag 'x86-cleanups-2025-01-21' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Miscellaneous x86 cleanups and typo fixes, and also the removal of the 'disablelapic' boot parameter" * tag 'x86-cleanups-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Remove a stray tab in the IO-APIC type string x86/cpufeatures: Remove "AMD" from the comments to the AMD-specific leaf Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text x86/cpu: Fix typo in x86_match_cpu()'s doc x86/apic: Remove "disablelapic" cmdline option Documentation: Merge x86-specific boot options doc into kernel-parameters.txt x86/ioremap: Remove unused size parameter in remapping functions x86/ioremap: Simplify setup_data mapping variants x86/boot/compressed: Remove unused header includes from kaslr.c
2025-01-15x86/apic: Convert to IRQCHIP_MOVE_DEFERREDThomas Gleixner
Instead of marking individual interrupts as safe to be migrated in arbitrary contexts, mark the interrupt chips, which require the interrupt to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED flag. This makes more sense because this is a per interrupt chip property and not restricted to individual interrupts. That flips the logic from the historical opt-out to a opt-in model. This is simpler to handle for other architectures, which default to unrestricted affinity setting. It also allows to cleanup the redundant core logic significantly. All interrupt chips, which belong to a top-level domain sitting directly on top of the x86 vector domain are marked accordingly, unless the related setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
2025-01-03x86/ioapic: Remove a stray tab in the IO-APIC type stringAlan Song
The type "physic al" should be "physical". [ bp: Massage commit message. ] Fixes: 54cd3795b471 ("x86/ioapic: Cleanup guarded debug printk()s") Signed-off-by: Alan Song <syfmark114@163.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20241230065706.16789-1-syfmark114@163.com
2024-12-17x86/cpu: Expose only stepping min/max interfaceDave Hansen
The x86_match_cpu() infrastructure can match CPU steppings. Since there are only 16 possible steppings, the matching infrastructure goes all out and stores the stepping match as a bitmap. That means it can match any possible steppings in a single list entry. Fun. But it exposes this bitmap to each of the X86_MATCH_*() helpers when none of them really need a bitmap. It makes up for this by exporting a helper (X86_STEPPINGS()) which converts a contiguous stepping range into the bitmap which every single user leverages. Instead of a bitmap, have the main helper for this sort of thing (X86_MATCH_VFM_STEPS()) just take a stepping range. This ends up actually being even more compact than before. Leave the helper in place (renamed to __X86_STEPPINGS()) to make it more clear what is going on instead of just having a random GENMASK() in the middle of an already complicated macro. One oddity that I hit was this macro: X86_MATCH_VFM_STEPS(vfm, X86_STEPPING_MIN, max_stepping, issues) It *could* have been converted over to take a min/max stepping value for each entry. But that would have been a bit too verbose and would prevent the one oddball in the list (INTEL_COMETLAKE_L stepping 0) from sticking out. Instead, just have it take a *maximum* stepping and imply that the match is from 0=>max_stepping. This is functional for all the cases now and also retains the nice property of having INTEL_COMETLAKE_L stepping 0 stick out like a sore thumb. skx_cpuids[] is goofy. It uses the stepping match but encodes all possible steppings. Just use a normal, non-stepping match helper. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185129.65527B2A%40davehans-spike.ostc.intel.com