summaryrefslogtreecommitdiff
path: root/arch/riscv/include
AgeCommit message (Collapse)Author
2025-09-18riscv: Move vendor errata definitions to new headerGuo Ren (Alibaba DAMO Academy)
Move vendor errata definitions into errata_list_vendors.h. Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Han Gao <rabenda.cn@gmail.com> Link: https://lore.kernel.org/r/20250713155321.2064856-2-guoren@kernel.org [pjw@kernel.org: updated to apply and to make the whitespace consistent] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-18riscv: introduce asm/swab.hIgnacio Encinas
Implement endianness swap macros for RISC-V. Use the rev8 instruction when Zbb is available. Otherwise, rely on the default mask-and-shift implementation. Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com> Link: https://lore.kernel.org/r/20250723-riscv-swab-v6-1-fc11e9a2efc9@iencinas.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-17riscv: Use generic TIF bitsThomas Gleixner
No point in defining generic items and the upcoming RSEQ optimizations are only available with this _and_ the generic entry infrastructure, which is already used by RISCV. So no further action required here. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2025-09-16riscv: kprobes: Move branch_funct3 to insn.hNam Cao
Similar to other instruction-processing macros/functions, branch_funct3 should be in insn.h. Move it into insn.h as RV_EXTRACT_FUNCT3. This new name matches the style in insn.h. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/linux-riscv/200c29a26338f19d09963fa02562787e8cfa06f2.1747215274.git.namcao@linutronix.de/ [pjw@kernel.org: updated to use RV_X_MASK and to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: kprobes: Move branch_rs2_idx to insn.hNam Cao
Similar to other instruction-processing macros/functions, branch_rs2_idx should be in insn.h. Move it into insn.h as RV_EXTRACT_RS2_REG. This new name matches the style in insn.h. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/linux-riscv/107d4a6c1818bf169be2407b273a0483e6d55bbb.1747215274.git.namcao@linutronix.de/ [pjw@kernel.org: updated to use RV_X_MASK and to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: Move all duplicate insn parsing macros into asm/insn.hAlexandre Ghiti
kernel/traps_misaligned.c and kvm/vcpu_insn.c define the same macros to extract information from the instructions. Let's move the definitions into asm/insn.h to avoid this duplication. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/20250620-dev-alex-insn_duplicate_v5_manual-v5-3-d865dc9ad180@rivosinc.com [pjw@kernel.org: updated to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: Strengthen duplicate and inconsistent definition of RV_X()Alexandre Ghiti
RV_X() macro is defined in two different ways which is error prone. So harmonize its first definition and add another macro RV_X_MASK() for the second one. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250620-dev-alex-insn_duplicate_v5_manual-v5-2-d865dc9ad180@rivosinc.com [pjw@kernel.org: upcase the macro name to conform with previous practice] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: Fix typo EXRACT -> EXTRACTAlexandre Ghiti
Simply fix a typo. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/20250620-dev-alex-insn_duplicate_v5_manual-v5-1-d865dc9ad180@rivosinc.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headersThomas Huth
While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembly code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This can be very confusing when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's standardize on the __ASSEMBLER__ macro that is provided by the compilers now. This originally was a completely mechanical patch (done with a simple "sed -i" statement), with some manual fixups during rebasing of the patch later. Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: linux-riscv@lists.infradead.org Signed-off-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/r/20250606070952.498274-3-thuth@redhat.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headersThomas Huth
__ASSEMBLY__ is only defined by the Makefile of the kernel, so this is not really useful for uapi headers (unless the userspace Makefile defines it, too). Let's switch to __ASSEMBLER__ which gets set automatically by the compiler when compiling assembly code. This is a completely mechanical patch (done with a simple "sed -i" statement). Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: linux-riscv@lists.infradead.org Signed-off-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/r/20250606070952.498274-2-thuth@redhat.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16riscv: introduce ioremap_wc()Yunhui Cui
Compared with IO attributes, NC attributes can improve performance, specifically in these aspects: Relaxed Order, Gathering, Supports Read Speculation, Supports Unaligned Access. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Signed-off-by: Qingfang Deng <qingfang.deng@siflower.com.cn> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250722091504.45974-2-cuiyunhui@bytedance.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-16RISC-V: KVM: Upgrade the supported SBI version to 3.0Atish Patra
Upgrade the SBI version to v3.0 so that corresponding features can be enabled in the guest. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Acked-by: Paul Walmsley <pjw@kernel.org> Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-8-d8f80cacb884@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Implement get event info functionAtish Patra
The new get_event_info funciton allows the guest to query the presence of multiple events with single SBI call. Currently, the perf driver in linux guest invokes it for all the standard SBI PMU events. Support the SBI function implementation in KVM as well. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Acked-by: Paul Walmsley <pjw@kernel.org> Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-7-d8f80cacb884@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16drivers/perf: riscv: Implement PMU event info functionAtish Patra
With the new SBI PMU event info function, we can query the availability of the all standard SBI PMU events at boot time with a single ecall. This improves the bootime by avoiding making an SBI call for each standard PMU event. Since this function is defined only in SBI v3.0, invoke this only if the underlying SBI implementation is v3.0 or higher. Signed-off-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Paul Walmsley <pjw@kernel.org> Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-4-d8f80cacb884@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16drivers/perf: riscv: Add raw event v2 supportAtish Patra
SBI v3.0 introduced a new raw event type that allows wider mhpmeventX width to be programmed via CFG_MATCH. Use the raw event v2 if SBI v3.0 is available. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Acked-by: Paul Walmsley <pjw@kernel.org> Link: https://lore.kernel.org/r/20250909-pmu_event_info-v6-2-d8f80cacb884@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Implement ONE_REG interface for SBI FWFT stateAnup Patel
The KVM user-space needs a way to save/restore the state of SBI FWFT features so implement SBI extension ONE_REG callbacks. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250823155947.1354229-6-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Move copy_sbi_ext_reg_indices() to SBI implementationAnup Patel
The ONE_REG handling of SBI extension enable/disable registers and SBI extension state registers is already under SBI implementation. On similar lines, let's move copy_sbi_ext_reg_indices() under SBI implementation. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250823155947.1354229-5-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Introduce optional ONE_REG callbacks for SBI extensionsAnup Patel
SBI extensions can have per-VCPU state which needs to be saved/restored through ONE_REG interface for Guest/VM migration. Introduce optional ONE_REG callbacks for SBI extensions so that ONE_REG implementation for an SBI extenion is part of the extension sources. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250823155947.1354229-4-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Allow bfloat16 extension for Guest/VMQuan Zhou
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zfbfmin/Zvfbfmin/Zvfbfwma extension for Guest/VM. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/f846cecd330ab9fc88211c55bc73126f903f8713.1754646071.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Allow Zicbop extension for Guest/VMQuan Zhou
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zicbop extension for Guest/VM. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/db4a9b679cc653bb6f5f5574e4196de7a980e458.1754646071.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Provide UAPI for Zicbop block sizeQuan Zhou
We're about to allow guests to use the Zicbop extension. KVM userspace needs to know the cache block size in order to properly advertise it to the guest. Provide a virtual config register for userspace to get it with the GET_ONE_REG API, but setting it cannot be supported, so disallow SET_ONE_REG. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/befd8403cd76d7adb97231ac993eaeb86bf2582c.1754646071.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-16RISC-V: KVM: Add support for SBI_FWFT_POINTER_MASKING_PMLENSamuel Holland
Pointer masking is controlled through a WARL field in henvcfg. Expose the feature only if at least one PMLEN value is supported for VS-mode. Allow the VMM to block access to the feature by disabling the Smnpm ISA extension in the guest. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250111004702.2813013-3-samuel.holland@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-13mm: introduce memdesc_flags_tMatthew Wilcox (Oracle)
Patch series "Add and use memdesc_flags_t". At some point struct page will be separated from struct slab and struct folio. This is a step towards that by introducing a type for the 'flags' word of all three structures. This gives us a certain amount of type safety by establishing that some of these unsigned longs are different from other unsigned longs in that they contain things like node ID, section number and zone number in the upper bits. That lets us have functions that can be easily called by anyone who has a slab, folio or page (but not easily by anyone else) to get the node or zone. There's going to be some unusual merge problems with this as some odd bits of the kernel decide they want to print out the flags value or something similar by writing page->flags and now they'll need to write page->flags.f instead. That's most of the churn here. Maybe we should be removing these things from the debug output? This patch (of 11): Wrap the unsigned long flags in a typedef. In upcoming patches, this will provide a strong hint that you can't just pass a random unsigned long to functions which take this as an argument. [willy@infradead.org: s/flags/flags.f/ in several architectures] Link: https://lkml.kernel.org/r/aKMgPRLD-WnkPxYm@casper.infradead.org [nicola.vetrini@gmail.com: mips: fix compilation error] Link: https://lore.kernel.org/lkml/CA+G9fYvkpmqGr6wjBNHY=dRp71PLCoi2341JxOudi60yqaeUdg@mail.gmail.com/ Link: https://lkml.kernel.org/r/20250825214245.1838158-1-nicola.vetrini@gmail.com Link: https://lkml.kernel.org/r/20250805172307.1302730-1-willy@infradead.org Link: https://lkml.kernel.org/r/20250805172307.1302730-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13riscv: use an atomic xchg in pudp_huge_get_and_clear()Alexandre Ghiti
Make sure we return the right pud value and not a value that could have been overwritten in between by a different core. Link: https://lkml.kernel.org/r/20250814-dev-alex-thp_pud_xchg-v1-1-b4704dfae206@rivosinc.com Fixes: c3cc2a4a3a23 ("riscv: Add support for PUD THP") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andrew Donnellan <ajd@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5Alexei Starovoitov
Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-08riscv: Add __attribute_const__ to ffs()-family implementationsKees Cook
While tracking down a problem where constant expressions used by BUILD_BUG_ON() suddenly stopped working[1], we found that an added static initializer was convincing the compiler that it couldn't track the state of the prior statically initialized value. Tracing this down found that ffs() was used in the initializer macro, but since it wasn't marked with __attribute__const__, the compiler had to assume the function might change variable states as a side-effect (which is not true for ffs(), which provides deterministic math results). Add missing __attribute_const__ annotations to RISC-V's implementations of variable__ffs(), variable__fls(), and variable_ffs() functions. These are pure mathematical functions that always return the same result for the same input with no side effects, making them eligible for compiler optimization. Build tested ARCH=riscv defconfig with GCC riscv64-linux-gnu 14.2.0. Link: https://github.com/KSPP/linux/issues/364 [1] Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250804164417.1612371-9-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-08RISC-V: KVM: add support for FWFT SBI extensionClément Léger
Add basic infrastructure to support the FWFT extension in KVM. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250523101932.1594077-14-cleger@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-09-05riscv: Fix sparse warning about different address spacesAlexandre Ghiti
We did not propagate the __user attribute of the pointers in __get_kernel_nofault() and __put_kernel_nofault(), which results in sparse complaining: >> mm/maccess.c:41:17: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void const [noderef] __user *from @@ got unsigned long long [usertype] * @@ mm/maccess.c:41:17: sparse: expected void const [noderef] __user *from mm/maccess.c:41:17: sparse: got unsigned long long [usertype] * So fix this by correctly casting those pointers. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508161713.RWu30Lv1-lkp@intel.com/ Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com> Link: https://lore.kernel.org/r/20250903-dev-alex-sparse_warnings_v1-v1-2-7e6350beb700@rivosinc.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-05riscv: Fix sparse warning in __get_user_error()Alexandre Ghiti
We used to assign 0 to x without an appropriate cast which results in sparse complaining when x is a pointer: >> block/ioctl.c:72:39: sparse: sparse: Using plain integer as NULL pointer So fix this by casting 0 to the correct type of x. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508062321.gHv4kvuY-lkp@intel.com/ Fixes: f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()") Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com> Link: https://lore.kernel.org/r/20250903-dev-alex-sparse_warnings_v1-v1-1-7e6350beb700@rivosinc.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-05riscv: use lw when reading int cpu in asm_per_cpuRadim Krčmář
REG_L is wrong, because thread_info.cpu is 32-bit, not xlen-bit wide. The struct currently has a hole after cpu, so little endian accesses seemed fine. Fixes: be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe") Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Radim Krčmář <rkrcmar@ventanamicro.com> Link: https://lore.kernel.org/r/20250725165410.2896641-5-rkrcmar@ventanamicro.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-05riscv: uaccess: fix __put_user_nocheck for unaligned accessesAurelien Jarno
The type of the value to write should be determined by the size of the destination, not by the value itself, which may be a constant. This aligns the behavior with x86_64, where __typeof__(*(__gu_ptr)) is used to infer the correct type. This fixes an issue in put_cmsg, which was only writing 4 out of 8 bytes to the cmsg_len field, causing the glibc tst-socket-timestamp test to fail. Fixes: ca1a66cdd685 ("riscv: uaccess: do not do misaligned accesses in get/put_user()") Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250724220853.1969954-1-aurelien@aurel32.net Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-08-15riscv: Separate toolchain support dependency from RISCV_ISA_ZACASPu Lehui
RV64 bpf is going to support ZACAS instructions. Let's separate toolchain support dependency from RISCV_ISA_ZACAS. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20250719091730.2660197-5-pulehui@huaweicloud.com
2025-08-05Merge commit 'linus' into core/bugs, to resolve conflictsIngo Molnar
Resolve conflicts with this commit that was developed in parallel during the merge window: 8c8efa93db68 ("x86/bug: Add ARCH_WARN_ASM macro for BUG/WARN asm code sharing with Rust") Conflicts: arch/riscv/include/asm/bug.h arch/x86/include/asm/bug.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-08-03Merge tag 'rust-6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Enable a set of Clippy lints: 'ptr_as_ptr', 'ptr_cast_constness', 'as_ptr_cast_mut', 'as_underscore', 'cast_lossless' and 'ref_as_ptr' These are intended to avoid type casts with the 'as' operator, which are quite powerful, into restricted variants that are less powerful and thus should help to avoid mistakes - Remove the 'author' key now that most instances were moved to the plural one in the previous cycle 'kernel' crate: - New 'bug' module: add 'warn_on!' macro which reuses the existing 'BUG'/'WARN' infrastructure, i.e. it respects the usual sysctls and kernel parameters: warn_on!(value == 42); To avoid duplicating the assembly code, the same strategy is followed as for the static branch code in order to share the assembly between both C and Rust This required a few rearrangements on C arch headers -- the existing C macros should still generate the same outputs, thus no functional change expected there - 'workqueue' module: add delayed work items, including a 'DelayedWork' struct, a 'impl_has_delayed_work!' macro and an 'enqueue_delayed' method, e.g.: /// Enqueue the struct for execution on the system workqueue, /// where its value will be printed 42 jiffies later. fn print_later(value: Arc<MyStruct>) { let _ = workqueue::system().enqueue_delayed(value, 42); } - New 'bits' module: add support for 'bit' and 'genmask' functions, with runtime- and compile-time variants, e.g.: static_assert!(0b00010000 == bit_u8(4)); static_assert!(0b00011110 == genmask_u8(1..=4)); assert!(checked_bit_u32(u32::BITS).is_none()); - 'uaccess' module: add 'UserSliceReader::strcpy_into_buf', which reads NUL-terminated strings from userspace into a '&CStr' Introduce 'UserPtr' newtype, similar in purpose to '__user' in C, to minimize mistakes handling userspace pointers, including mixing them up with integers and leaking them via the 'Debug' trait. Add it to the prelude, too - Start preparations for the replacement of our custom 'CStr' type with the analogous type in the 'core' standard library. This will take place across several cycles to make it easier. For this one, it includes a new 'fmt' module, using upstream method names and some other cleanups Replace 'fmt!' with a re-export, which helps Clippy lint properly, and clean up the found 'uninlined-format-args' instances - 'dma' module: - Clarify wording and be consistent in 'coherent' nomenclature - Convert the 'read!()' and 'write!()' macros to return a 'Result' - Add 'as_slice()', 'write()' methods in 'CoherentAllocation' - Expose 'count()' and 'size()' in 'CoherentAllocation' and add the corresponding type invariants - Implement 'CoherentAllocation::dma_handle_with_offset()' - 'time' module: - Make 'Instant' generic over clock source. This allows the compiler to assert that arithmetic expressions involving the 'Instant' use 'Instants' based on the same clock source - Make 'HrTimer' generic over the timer mode. 'HrTimer' timers take a 'Duration' or an 'Instant' when setting the expiry time, depending on the timer mode. With this change, the compiler can check the type matches the timer mode - Add an abstraction for 'fsleep'. 'fsleep' is a flexible sleep function that will select an appropriate sleep method depending on the requested sleep time - Avoid 64-bit divisions on 32-bit hardware when calculating timestamps - Seal the 'HrTimerMode' trait. This prevents users of the 'HrTimerMode' from implementing the trait on their own types - Pass the correct timer mode ID to 'hrtimer_start_range_ns()' - 'list' module: remove 'OFFSET' constants, allowing to remove pointer arithmetic; now 'impl_list_item!' invokes 'impl_has_list_links!' or 'impl_has_list_links_self_ptr!'. Other simplifications too - 'types' module: remove 'ForeignOwnable::PointedTo' in favor of a constant, which avoids exposing the type of the opaque pointer, and require 'into_foreign' to return non-null Remove the 'Either<L, R>' type as well. It is unused, and we want to encourage the use of custom enums for concrete use cases - 'sync' module: implement 'Borrow' and 'BorrowMut' for 'Arc' types to allow them to be used in generic APIs - 'alloc' module: implement 'Borrow' and 'BorrowMut' for 'Box<T, A>'; and 'Borrow', 'BorrowMut' and 'Default' for 'Vec<T, A>' - 'Opaque' type: add 'cast_from' method to perform a restricted cast that cannot change the inner type and use it in callers of 'container_of!'. Rename 'raw_get' to 'cast_into' to match it - 'rbtree' module: add 'is_empty' method - 'sync' module: new 'aref' submodule to hold 'AlwaysRefCounted' and 'ARef', which are moved from the too general 'types' module which we want to reduce or eventually remove. Also fix a safety comment in 'static_lock_class' 'pin-init' crate: - Add 'impl<T, E> [Pin]Init<T, E> for Result<T, E>', so results are now (pin-)initializers - Add 'Zeroable::init_zeroed()' that delegates to 'init_zeroed()' - New 'zeroed()', a safe version of 'mem::zeroed()' and also provide it via 'Zeroable::zeroed()' - Implement 'Zeroable' for 'Option<&T>', 'Option<&mut T>' and for 'Option<[unsafe] [extern "abi"] fn(...args...) -> ret>' for '"Rust"' and '"C"' ABIs and up to 20 arguments - Changed blanket impls of 'Init' and 'PinInit' from 'impl<T, E> [Pin]Init<T, E> for T' to 'impl<T> [Pin]Init<T> for T' - Renamed 'zeroed()' to 'init_zeroed()' - Upstream dev news: improve CI more to deny warnings, use '--all-targets'. Check the synchronization status of the two '-next' branches in upstream and the kernel MAINTAINERS: - Add Vlastimil Babka, Liam R. Howlett, Uladzislau Rezki and Lorenzo Stoakes as reviewers (thanks everyone) And a few other cleanups and improvements" * tag 'rust-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (76 commits) rust: Add warn_on macro arm64/bug: Add ARCH_WARN_ASM macro for BUG/WARN asm code sharing with Rust riscv/bug: Add ARCH_WARN_ASM macro for BUG/WARN asm code sharing with Rust x86/bug: Add ARCH_WARN_ASM macro for BUG/WARN asm code sharing with Rust rust: kernel: move ARef and AlwaysRefCounted to sync::aref rust: sync: fix safety comment for `static_lock_class` rust: types: remove `Either<L, R>` rust: kernel: use `core::ffi::CStr` method names rust: str: add `CStr` methods matching `core::ffi::CStr` rust: str: remove unnecessary qualification rust: use `kernel::{fmt,prelude::fmt!}` rust: kernel: add `fmt` module rust: kernel: remove `fmt!`, fix clippy::uninlined-format-args scripts: rust: emit path candidates in panic message scripts: rust: replace length checks with match rust: list: remove nonexistent generic parameter in link rust: bits: add support for bits/genmask macros rust: list: remove OFFSET constants rust: list: add `impl_list_item!` examples rust: list: use fully qualified path ...
2025-08-01Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: - Fix kCFI failures in JITed BPF code on arm64 (Sami Tolvanen, Puranjay Mohan, Mark Rutland, Maxwell Bland) - Disallow tail calls between BPF programs that use different cgroup local storage maps to prevent out-of-bounds access (Daniel Borkmann) - Fix unaligned access in flow_dissector and netfilter BPF programs (Paul Chaignon) - Avoid possible use of uninitialized mod_len in libbpf (Achill Gilgenast) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Test for unaligned flow_dissector ctx access bpf: Improve ctx access verifier error message bpf: Check netfilter ctx accesses are aligned bpf: Check flow_dissector ctx accesses are aligned arm64/cfi,bpf: Support kCFI + BPF on arm64 cfi: Move BPF CFI types and helpers to generic code cfi: add C CFI type macro libbpf: Avoid possible use of uninitialized mod_len bpf: Fix oob access in cgroup local storage bpf: Move cgroup iterator helpers to bpf.h bpf: Move bpf map owner out of common struct bpf: Add cookie object to bpf maps
2025-07-31cfi: Move BPF CFI types and helpers to generic codeSami Tolvanen
Instead of duplicating the same code for each architecture, move the CFI type hash variables for BPF function types and related helper functions to generic CFI code, and allow architectures to override the function definitions if needed. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20250801001004.1859976-7-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-31Merge tag 'mm-stable-2025-07-30-15-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "As usual, many cleanups. The below blurbiage describes 42 patchsets. 21 of those are partially or fully cleanup work. "cleans up", "cleanup", "maintainability", "rationalizes", etc. I never knew the MM code was so dirty. "mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes) addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly mapped VMAs were not eligible for merging with existing adjacent VMAs. "mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park) adds a new kernel module which simplifies the setup and usage of DAMON in production environments. "stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig) is a cleanup to the writeback code which removes a couple of pointers from struct writeback_control. "drivers/base/node.c: optimization and cleanups" (Donet Tom) contains largely uncorrelated cleanups to the NUMA node setup and management code. "mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman) does some maintenance work on the userfaultfd code. "Readahead tweaks for larger folios" (Ryan Roberts) implements some tuneups for pagecache readahead when it is reading into order>0 folios. "selftests/mm: Tweaks to the cow test" (Mark Brown) provides some cleanups and consistency improvements to the selftests code. "Optimize mremap() for large folios" (Dev Jain) does that. A 37% reduction in execution time was measured in a memset+mremap+munmap microbenchmark. "Remove zero_user()" (Matthew Wilcox) expunges zero_user() in favor of the more modern memzero_page(). "mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand) addresses some warts which David noticed in the huge page code. These were not known to be causing any issues at this time. "mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park) provides some cleanup and consolidation work in DAMON. "use vm_flags_t consistently" (Lorenzo Stoakes) uses vm_flags_t in places where we were inappropriately using other types. "mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy) increases the reliability of large page allocation in the memfd code. "mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple) removes several now-unneeded PFN_* flags. "mm/damon: decouple sysfs from core" (SeongJae Park) implememnts some cleanup and maintainability work in the DAMON sysfs layer. "madvise cleanup" (Lorenzo Stoakes) does quite a lot of cleanup/maintenance work in the madvise() code. "madvise anon_name cleanups" (Vlastimil Babka) provides additional cleanups on top or Lorenzo's effort. "Implement numa node notifier" (Oscar Salvador) creates a standalone notifier for NUMA node memory state changes. Previously these were lumped under the more general memory on/offline notifier. "Make MIGRATE_ISOLATE a standalone bit" (Zi Yan) cleans up the pageblock isolation code and fixes a potential issue which doesn't seem to cause any problems in practice. "selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park) adds additional drgn- and python-based DAMON selftests which are more comprehensive than the existing selftest suite. "Misc rework on hugetlb faulting path" (Oscar Salvador) fixes a rather obscure deadlock in the hugetlb fault code and follows that fix with a series of cleanups. "cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport) rationalizes and cleans up the highmem-specific code in the CMA allocator. "mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand) provides cleanups and future-preparedness to the migration code. "mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park) adds some tracepoints to some DAMON auto-tuning code. "mm/damon: fix misc bugs in DAMON modules" (SeongJae Park) does that. "mm/damon: misc cleanups" (SeongJae Park) also does what it claims. "mm: folio_pte_batch() improvements" (David Hildenbrand) cleans up the large folio PTE batching code. "mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park) facilitates dynamic alteration of DAMON's inter-node allocation policy. "Remove unmap_and_put_page()" (Vishal Moola) provides a couple of page->folio conversions. "mm: per-node proactive reclaim" (Davidlohr Bueso) implements a per-node control of proactive reclaim - beyond the current memcg-based implementation. "mm/damon: remove damon_callback" (SeongJae Park) replaces the damon_callback interface with a more general and powerful damon_call()+damos_walk() interface. "mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes) implements a number of mremap cleanups (of course) in preparation for adding new mremap() functionality: newly permit the remapping of multiple VMAs when the user is specifying MREMAP_FIXED. It still excludes some specialized situations where this cannot be performed reliably. "drop hugetlb_free_pgd_range()" (Anthony Yznaga) switches some sparc hugetlb code over to the generic version and removes the thus-unneeded hugetlb_free_pgd_range(). "mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park) augments the present userspace-requested update of DAMON sysfs monitoring files. Automatic update is now provided, along with a tunable to control the update interval. "Some randome fixes and cleanups to swapfile" (Kemeng Shi) does what is claims. "mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand) provides (and uses) a means by which debug-style functions can grab a copy of a pageframe and inspect it locklessly without tripping over the races inherent in operating on the live pageframe directly. "use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan) addresses the large contention issues which can be triggered by reads from that procfs file. Latencies are reduced by more than half in some situations. The series also introduces several new selftests for the /proc/pid/maps interface. "__folio_split() clean up" (Zi Yan) cleans up __folio_split()! "Optimize mprotect() for large folios" (Dev Jain) provides some quite large (>3x) speedups to mprotect() when dealing with large folios. "selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian) does some cleanup work in the selftests code. "tools/testing: expand mremap testing" (Lorenzo Stoakes) extends the mremap() selftest in several ways, including adding more checking of Lorenzo's recently added "permit mremap() move of multiple VMAs" feature. "selftests/damon/sysfs.py: test all parameters" (SeongJae Park) extends the DAMON sysfs interface selftest so that it tests all possible user-requested parameters. Rather than the present minimal subset" * tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits) MAINTAINERS: add missing headers to mempory policy & migration section MAINTAINERS: add missing file to cgroup section MAINTAINERS: add MM MISC section, add missing files to MISC and CORE MAINTAINERS: add missing zsmalloc file MAINTAINERS: add missing files to page alloc section MAINTAINERS: add missing shrinker files MAINTAINERS: move memremap.[ch] to hotplug section MAINTAINERS: add missing mm_slot.h file THP section MAINTAINERS: add missing interval_tree.c to memory mapping section MAINTAINERS: add missing percpu-internal.h file to per-cpu section mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info() selftests/damon: introduce _common.sh to host shared function selftests/damon/sysfs.py: test runtime reduction of DAMON parameters selftests/damon/sysfs.py: test non-default parameters runtime commit selftests/damon/sysfs.py: generalize DAMON context commit assertion selftests/damon/sysfs.py: generalize monitoring attributes commit assertion selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion selftests/damon/sysfs.py: test DAMOS filters commitment selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion selftests/damon/sysfs.py: test DAMOS destinations commitment ...
2025-07-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - Host driver for GICv5, the next generation interrupt controller for arm64, including support for interrupt routing, MSIs, interrupt translation and wired interrupts - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on GICv5 hardware, leveraging the legacy VGIC interface - Userspace control of the 'nASSGIcap' GICv3 feature, allowing userspace to disable support for SGIs w/o an active state on hardware that previously advertised it unconditionally - Map supporting endpoints with cacheable memory attributes on systems with FEAT_S2FWB and DIC where KVM no longer needs to perform cache maintenance on the address range - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest hypervisor to inject external aborts into an L2 VM and take traps of masked external aborts to the hypervisor - Convert more system register sanitization to the config-driven implementation - Fixes to the visibility of EL2 registers, namely making VGICv3 system registers accessible through the VGIC device instead of the ONE_REG vCPU ioctls - Various cleanups and minor fixes LoongArch: - Add stat information for in-kernel irqchip - Add tracepoints for CPUCFG and CSR emulation exits - Enhance in-kernel irqchip emulation - Various cleanups RISC-V: - Enable ring-based dirty memory tracking - Improve perf kvm stat to report interrupt events - Delegate illegal instruction trap to VS-mode - MMU improvements related to upcoming nested virtualization s390x - Fixes x86: - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC, PIC, and PIT emulation at compile time - Share device posted IRQ code between SVM and VMX and harden it against bugs and runtime errors - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1) instead of O(n) - For MMIO stale data mitigation, track whether or not a vCPU has access to (host) MMIO based on whether the page tables have MMIO pfns mapped; using VFIO is prone to false negatives - Rework the MSR interception code so that the SVM and VMX APIs are more or less identical - Recalculate all MSR intercepts from scratch on MSR filter changes, instead of maintaining shadow bitmaps - Advertise support for LKGS (Load Kernel GS base), a new instruction that's loosely related to FRED, but is supported and enumerated independently - Fix a user-triggerable WARN that syzkaller found by setting the vCPU in INIT_RECEIVED state (aka wait-for-SIPI), and then putting the vCPU into VMX Root Mode (post-VMXON). Trying to detect every possible path leading to architecturally forbidden states is hard and even risks breaking userspace (if it goes from valid to valid state but passes through invalid states), so just wait until KVM_RUN to detect that the vCPU state isn't allowed - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of APERF/MPERF reads, so that a "properly" configured VM can access APERF/MPERF. This has many caveats (APERF/MPERF cannot be zeroed on vCPU creation or saved/restored on suspend and resume, or preserved over thread migration let alone VM migration) but can be useful whenever you're interested in letting Linux guests see the effective physical CPU frequency in /proc/cpuinfo - Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been created, as there's no known use case for changing the default frequency for other VM types and it goes counter to the very reason why the ioctl was added to the vm file descriptor. And also, there would be no way to make it work for confidential VMs with a "secure" TSC, so kill two birds with one stone - Dynamically allocation the shadow MMU's hashed page list, and defer allocating the hashed list until it's actually needed (the TDP MMU doesn't use the list) - Extract many of KVM's helpers for accessing architectural local APIC state to common x86 so that they can be shared by guest-side code for Secure AVIC - Various cleanups and fixes x86 (Intel): - Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest. Failure to honor FREEZE_IN_SMM can leak host state into guests - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF x86 (AMD): - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the nested SVM MSRPM offsets tracker can't handle an MSR (which is pretty much a static condition and therefore should never happen, but still) - Fix a variety of flaws and bugs in the AVIC device posted IRQ code - Inhibit AVIC if a vCPU's ID is too big (relative to what hardware supports) instead of rejecting vCPU creation - Extend enable_ipiv module param support to SVM, by simply leaving IsRunning clear in the vCPU's physical ID table entry - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by erratum #1235, to allow (safely) enabling AVIC on such CPUs - Request GA Log interrupts if and only if the target vCPU is blocking, i.e. only if KVM needs a notification in order to wake the vCPU - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the vCPU's CPUID model - Accept any SNP policy that is accepted by the firmware with respect to SMT and single-socket restrictions. An incompatible policy doesn't put the kernel at risk in any way, so there's no reason for KVM to care - Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and use WBNOINVD instead of WBINVD when possible for SEV cache maintenance - When reclaiming memory from an SEV guest, only do cache flushes on CPUs that have ever run a vCPU for the guest, i.e. don't flush the caches for CPUs that can't possibly have cache lines with dirty, encrypted data Generic: - Rework irqbypass to track/match producers and consumers via an xarray instead of a linked list. Using a linked list leads to O(n^2) insertion times, which is hugely problematic for use cases that create large numbers of VMs. Such use cases typically don't actually use irqbypass, but eliminating the pointless registration is a future problem to solve as it likely requires new uAPI - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *", to avoid making a simple concept unnecessarily difficult to understand - Decouple device posted IRQs from VFIO device assignment, as binding a VM to a VFIO group is not a requirement for enabling device posted IRQs - Clean up and document/comment the irqfd assignment code - Disallow binding multiple irqfds to an eventfd with a priority waiter, i.e. ensure an eventfd is bound to at most one irqfd through the entire host, and add a selftest to verify eventfd:irqfd bindings are globally unique - Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues related to private <=> shared memory conversions - Drop guest_memfd's .getattr() implementation as the VFS layer will call generic_fillattr() if inode_operations.getattr is NULL - Fix issues with dirty ring harvesting where KVM doesn't bound the processing of entries in any way, which allows userspace to keep KVM in a tight loop indefinitely - Kill off kvm_arch_{start,end}_assignment() and x86's associated tracking, now that KVM no longer uses assigned_device_count as a heuristic for either irqbypass usage or MDS mitigation Selftests: - Fix a comment typo - Verify KVM is loaded when getting any KVM module param so that attempting to run a selftest without kvm.ko loaded results in a SKIP message about KVM not being loaded/enabled (versus some random parameter not existing) - Skip tests that hit EACCES when attempting to access a file, and print a "Root required?" help message. In most cases, the test just needs to be run with elevated permissions" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (340 commits) Documentation: KVM: Use unordered list for pre-init VGIC registers RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map() RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs RISC-V: perf/kvm: Add reporting of interrupt events RISC-V: KVM: Enable ring-based dirty memory tracking RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap RISC-V: KVM: Delegate illegal instruction fault to VS mode RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs RISC-V: KVM: Factor-out g-stage page table management RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence RISC-V: KVM: Introduce struct kvm_gstage_mapping RISC-V: KVM: Factor-out MMU related declarations into separate headers RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect() RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range() RISC-V: KVM: Don't flush TLB when PTE is unchanged RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize() RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init() RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list ...
2025-07-29Merge tag 'irq-drivers-2025-07-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt chip driver updates from Thomas Gleixner: - Add support of forced affinity setting to yet offline CPUs for the MIPS-GIC to ensure that the affinity of per CPU interrupts can be set during the early bringup phase of a secondary CPU in the hotplug code before the CPU is set online and interrupts are enabled - Add support for the MIPS (RISC-V !?!?) P8700 SoC in the ACLINT_SSWI interrupt chip - Make the interrupt routing to RISV-V harts specification compliant so it supports arbitrary hart indices - Add a command line parameter and related handling to disable the generic RISCV IMSIC mechanism on platforms which use a trap-emulated IMSIC. Unfortunatly this is required because there is no mechanism available to discover this programatically. - Enable wakeup sources on the Renesas RZV2H driver - Convert interrupt chip drivers, which use a open coded variant of msi_create_parent_irq_domain() to use the new functionality - Convert interrupt chip drivers, which use the old style two level implementation of MSI support over to the MSI parent mechanism to prepare for removing at least one of the three PCI/MSI backend variants. - The usual cleanups and improvements all over the place * tag 'irq-drivers-2025-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) irqchip/renesas-irqc: Convert to DEFINE_SIMPLE_DEV_PM_OPS() irqchip/renesas-intc-irqpin: Convert to DEFINE_SIMPLE_DEV_PM_OPS() irqchip/riscv-imsic: Add kernel parameter to disable IPIs irqchip/gic-v3: Fix GICD_CTLR register naming irqchip/ls-scfg-msi: Fix NULL dereference in error handling irqchip/ls-scfg-msi: Switch to use msi_create_parent_irq_domain() irqchip/armada-370-xp: Switch to msi_create_parent_irq_domain() irqchip/alpine-msi: Switch to msi_create_parent_irq_domain() irqchip/alpine-msi: Convert to __free irqchip/alpine-msi: Convert to lock guards irqchip/alpine-msi: Clean up whitespace style irqchip/sg2042-msi: Switch to msi_create_parent_irq_domain() irqchip/loongson-pch-msi.c: Switch to msi_create_parent_irq_domain() irqchip/imx-mu-msi: Convert to msi_create_parent_irq_domain() helper irqchip/riscv-imsic: Convert to msi_create_parent_irq_domain() helper irqchip/bcm2712-mip: Switch to msi_create_parent_irq_domain() irqdomain: Add device pointer to irq_domain_info and msi_domain_info irqchip/renesas-rzv2h: Remove unneeded includes irqchip/renesas-rzv2h: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND irqchip/aslint-sswi: Resolve hart index ...
2025-07-29Merge tag 'kvm-riscv-6.17-2' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 6.17 - Enabled ring-based dirty memory tracking - Improved perf kvm stat to report interrupt events - Delegate illegal instruction trap to VS-mode - MMU related improvements for KVM RISC-V for upcoming nested virtualization
2025-07-28RISC-V: KVM: Enable ring-based dirty memory trackingQuan Zhou
Enable ring-based dirty memory tracking on riscv: - Enable CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL as riscv is weakly ordered. - Set KVM_DIRTY_LOG_PAGE_OFFSET for the ring buffer's physical page offset. - Add a check to kvm_vcpu_kvm_riscv_check_vcpu_requests for checking whether the dirty ring is soft full. To handle vCPU requests that cause exits to userspace, modified the `kvm_riscv_check_vcpu_requests` to return a value (currently only returns 0 or 1). Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20e116efb1f7aff211dd8e3cf8990c5521ed5f34.1749810735.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Delegate illegal instruction fault to VS modeXu Lu
Delegate illegal instruction fault to VS mode by default to avoid such exceptions being trapped to HS and redirected back to VS. The delegation of illegal instruction fault is particularly important to guest applications that use vector instructions frequently. In such cases, an illegal instruction fault will be raised when guest user thread uses vector instruction the first time and then guest kernel will enable user thread to execute following vector instructions. The fw pmu event counter remains undeleted so that guest can still query illegal instruction events via sbi call. Guest will only see zero count on illegal instruction faults and know 'firmware' has delegated it. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Link: https://lore.kernel.org/r/20250714094554.89151-1-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIsAnup Patel
Currently, all kvm_riscv_hfence_xyz() APIs assume VMID to be the host VMID of the Guest/VM which resticts use of these APIs only for host TLB maintenance. Let's allow passing VMID as a parameter to all kvm_riscv_hfence_xyz() APIs so that they can be re-used for nested virtualization related TLB maintenance. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-13-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Factor-out g-stage page table managementAnup Patel
The upcoming nested virtualization can share g-stage page table management with the current host g-stage implementation hence factor-out g-stage page table management as separate sources and also use "kvm_riscv_mmu_" prefix for host g-stage functions. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-12-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Add vmid field to struct kvm_riscv_hfenceAnup Patel
Currently, the struct kvm_riscv_hfence does not have vmid field and various hfence processing functions always pick vmid assigned to the guest/VM. This prevents us from doing hfence operation on arbitrary vmid hence add vmid field to struct kvm_riscv_hfence and use it wherever applicable. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-11-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Introduce struct kvm_gstage_mappingAnup Patel
Introduce struct kvm_gstage_mapping which represents a g-stage mapping at a particular g-stage page table level. Also, update the kvm_riscv_gstage_map() to return the g-stage mapping upon success. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-10-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Factor-out MMU related declarations into separate headersAnup Patel
The MMU, TLB, and VMID management for KVM RISC-V already exists as seprate sources so create separate headers along these lines. This further simplifies asm/kvm_host.h header. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-9-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range()Anup Patel
The kvm_arch_flush_remote_tlbs_range() expected by KVM core can be easily implemented for RISC-V using kvm_riscv_hfence_gvma_vmid_gpa() hence provide it. Also with kvm_arch_flush_remote_tlbs_range() available for RISC-V, the mmu_wp_memory_region() can happily use kvm_flush_remote_tlbs_memslot() instead of kvm_flush_remote_tlbs(). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-7-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSHAnup Patel
The KVM_REQ_HFENCE_GVMA_VMID_ALL is same as KVM_REQ_TLB_FLUSH so to avoid confusion let's replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH. Also, rename kvm_riscv_hfence_gvma_vmid_all_process() to kvm_riscv_tlb_flush_process(). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-5-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-07-28RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize()Anup Patel
The kvm_riscv_local_tlb_sanitize() deals with sanitizing current VMID related TLB mappings when a VCPU is moved from one host CPU to another. Let's move kvm_riscv_local_tlb_sanitize() to VMID management sources and rename it to kvm_riscv_gstage_vmid_sanitize(). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Link: https://lore.kernel.org/r/20250618113532.471448-4-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>