diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-12-05 13:52:43 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-12-05 13:52:43 -0800 |
| commit | 7203ca412fc8e8a0588e9adc0f777d3163f8dff3 (patch) | |
| tree | 7cbdcdb0bc0533f0133d472f95629099c123c3f9 /arch | |
| parent | ac20755937e037e586b1ca18a6717d31b1cbce93 (diff) | |
| parent | faf3c923523e5c8fc3baaa413d62e913774ae52f (diff) | |
Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
Rework the vmalloc() code to support non-blocking allocations
(GFP_ATOIC, GFP_NOWAIT)
"ksm: fix exec/fork inheritance" (xu xin)
Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
inherited across fork/exec
"mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
Some light maintenance work on the zswap code
"mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
Enhance the /sys/kernel/debug/page_owner debug feature by adding
unique identifiers to differentiate the various stack traces so
that userspace monitoring tools can better match stack traces over
time
"mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
Minor alterations to the page allocator's per-cpu-pages feature
"Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
Address a scalability issue in userfaultfd's UFFDIO_MOVE operation
"kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)
"drivers/base/node: fold node register and unregister functions" (Donet Tom)
Clean up the NUMA node handling code a little
"mm: some optimizations for prot numa" (Kefeng Wang)
Cleanups and small optimizations to the NUMA allocation hinting
code
"mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
Address long lock hold times at boot on large machines. These were
causing (harmless) softlockup warnings
"optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
Remove some now-unnecessary work from page reclaim
"mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
Enhance the DAMOS auto-tuning feature
"mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
configuration
"expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
Enhance the new(ish) file_operations.mmap_prepare() method and port
additional callsites from the old ->mmap() over to ->mmap_prepare()
"Fix stale IOTLB entries for kernel address space" (Lu Baolu)
Fix a bug (and possible security issue on non-x86) in the IOMMU
code. In some situations the IOMMU could be left hanging onto a
stale kernel pagetable entry
"mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
Clean up and optimize the folio splitting code
"mm, swap: misc cleanup and bugfix" (Kairui Song)
Some cleanups and a minor fix in the swap discard code
"mm/damon: misc documentation fixups" (SeongJae Park)
"mm/damon: support pin-point targets removal" (SeongJae Park)
Permit userspace to remove a specific monitoring target in the
middle of the current targets list
"mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
A couple of cleanups related to mm header file inclusion
"mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
improve the selection of swap devices for NUMA machines
"mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
Change the memory block labels from macros to enums so they will
appear in kernel debug info
"ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
Address an inefficiency when KSM unmerges an address range
"mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
Fix leaks and unhandled malloc() failures in DAMON userspace unit
tests
"some cleanups for pageout()" (Baolin Wang)
Clean up a couple of minor things in the page scanner's
writeback-for-eviction code
"mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
Move hugetlb's sysfs/sysctl handling code into a new file
"introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
Make the VMA guard regions available in /proc/pid/smaps and
improves the mergeability of guarded VMAs
"mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
Reduce mmap lock contention for callers performing VMA guard region
operations
"vma_start_write_killable" (Matthew Wilcox)
Start work on permitting applications to be killed when they are
waiting on a read_lock on the VMA lock
"mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
Add additional userspace testing of DAMON's "commit" feature
"mm/damon: misc cleanups" (SeongJae Park)
"make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
VMA is merged with another
"mm: support device-private THP" (Balbir Singh)
Introduce support for Transparent Huge Page (THP) migration in zone
device-private memory
"Optimize folio split in memory failure" (Zi Yan)
"mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
Some more cleanups in the folio splitting code
"mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
Clean up our handling of pagetable leaf entries by introducing the
concept of 'software leaf entries', of type softleaf_t
"reparent the THP split queue" (Muchun Song)
Reparent the THP split queue to its parent memcg. This is in
preparation for addressing the long-standing "dying memcg" problem,
wherein dead memcg's linger for too long, consuming memory
resources
"unify PMD scan results and remove redundant cleanup" (Wei Yang)
A little cleanup in the hugepage collapse code
"zram: introduce writeback bio batching" (Sergey Senozhatsky)
Improve zram writeback efficiency by introducing batched bio
writeback support
"memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
Clean up our handling of the interrupt safety of some memcg stats
"make vmalloc gfp flags usage more apparent" (Vishal Moola)
Clean up vmalloc's handling of incoming GFP flags
"mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
Teach soft dirty and userfaultfd write protect tracking to use
RISC-V's Svrsw60t59b extension
"mm: swap: small fixes and comment cleanups" (Youngjun Park)
Fix a small bug and clean up some of the swap code
"initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
Start work on converting the vma struct's flags to a bitmap, so we
stop running out of them, especially on 32-bit
"mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
Address a possible bug in the swap discard code and clean things
up a little
[ This merge also reverts commit ebb9aeb980e5 ("vfio/nvgrace-gpu:
register device memory for poison handling") because it looks
broken to me, I've asked for clarification - Linus ]
* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm: fix vma_start_write_killable() signal handling
mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
mm/swapfile: fix list iteration when next node is removed during discard
fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
mm/kfence: add reboot notifier to disable KFENCE on shutdown
memcg: remove inc/dec_lruvec_kmem_state helpers
selftests/mm/uffd: initialize char variable to Null
mm: fix DEBUG_RODATA_TEST indentation in Kconfig
mm: introduce VMA flags bitmap type
tools/testing/vma: eliminate dependency on vma->__vm_flags
mm: simplify and rename mm flags function for clarity
mm: declare VMA flags by bit
zram: fix a spelling mistake
mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
pagemap: update BUDDY flag documentation
mm: swap: remove scan_swap_map_slots() references from comments
mm: swap: change swap_alloc_slow() to void
mm, swap: remove redundant comment for read_swap_cache_async
mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
...
Diffstat (limited to 'arch')
| -rw-r--r-- | arch/arm64/mm/mmu.c | 2 | ||||
| -rw-r--r-- | arch/csky/include/asm/pgtable.h | 3 | ||||
| -rw-r--r-- | arch/mips/alchemy/common/setup.c | 9 | ||||
| -rw-r--r-- | arch/mips/include/asm/pgtable.h | 5 | ||||
| -rw-r--r-- | arch/powerpc/kvm/book3s_hv_uvmem.c | 7 | ||||
| -rw-r--r-- | arch/powerpc/platforms/pseries/pci_dlpar.c | 2 | ||||
| -rw-r--r-- | arch/riscv/Kconfig | 16 | ||||
| -rw-r--r-- | arch/riscv/include/asm/hwcap.h | 1 | ||||
| -rw-r--r-- | arch/riscv/include/asm/pgtable-bits.h | 37 | ||||
| -rw-r--r-- | arch/riscv/include/asm/pgtable.h | 143 | ||||
| -rw-r--r-- | arch/riscv/kernel/cpufeature.c | 1 | ||||
| -rw-r--r-- | arch/s390/boot/vmem.c | 1 | ||||
| -rw-r--r-- | arch/s390/mm/gmap.c | 5 | ||||
| -rw-r--r-- | arch/s390/mm/gmap_helpers.c | 20 | ||||
| -rw-r--r-- | arch/s390/mm/pgtable.c | 12 | ||||
| -rw-r--r-- | arch/sparc/include/asm/pgtable_32.h | 12 | ||||
| -rw-r--r-- | arch/sparc/include/asm/pgtable_64.h | 12 | ||||
| -rw-r--r-- | arch/sparc/kernel/sys_sparc_64.c | 6 | ||||
| -rw-r--r-- | arch/x86/Kconfig | 1 | ||||
| -rw-r--r-- | arch/x86/kernel/cpu/sgx/driver.c | 2 | ||||
| -rw-r--r-- | arch/x86/mm/init_64.c | 2 | ||||
| -rw-r--r-- | arch/x86/mm/numa.c | 4 | ||||
| -rw-r--r-- | arch/x86/mm/pat/set_memory.c | 2 | ||||
| -rw-r--r-- | arch/x86/mm/pgtable.c | 12 |
24 files changed, 250 insertions, 67 deletions
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index c8d24b7cdd62..9ae7ce00a7ef 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -49,8 +49,6 @@ #define NO_CONT_MAPPINGS BIT(1) #define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */ -#define INVALID_PHYS_ADDR (-1ULL) - DEFINE_STATIC_KEY_FALSE(arm64_ptdump_lock_key); u64 kimage_voffset __ro_after_init; diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h index 5a394be09c35..d606afbabce1 100644 --- a/arch/csky/include/asm/pgtable.h +++ b/arch/csky/include/asm/pgtable.h @@ -263,7 +263,4 @@ void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma, #define update_mmu_cache(vma, addr, ptep) \ update_mmu_cache_range(NULL, vma, addr, ptep, 1) -#define io_remap_pfn_range(vma, vaddr, pfn, size, prot) \ - remap_pfn_range(vma, vaddr, pfn, size, prot) - #endif /* __ASM_CSKY_PGTABLE_H */ diff --git a/arch/mips/alchemy/common/setup.c b/arch/mips/alchemy/common/setup.c index a7a6d31a7a41..c35b4f809d51 100644 --- a/arch/mips/alchemy/common/setup.c +++ b/arch/mips/alchemy/common/setup.c @@ -94,12 +94,13 @@ phys_addr_t fixup_bigphys_addr(phys_addr_t phys_addr, phys_addr_t size) return phys_addr; } -int io_remap_pfn_range(struct vm_area_struct *vma, unsigned long vaddr, - unsigned long pfn, unsigned long size, pgprot_t prot) +static inline unsigned long io_remap_pfn_range_pfn(unsigned long pfn, + unsigned long size) { phys_addr_t phys_addr = fixup_bigphys_addr(pfn << PAGE_SHIFT, size); - return remap_pfn_range(vma, vaddr, phys_addr >> PAGE_SHIFT, size, prot); + return phys_addr >> PAGE_SHIFT; } -EXPORT_SYMBOL(io_remap_pfn_range); +EXPORT_SYMBOL(io_remap_pfn_range_pfn); + #endif /* CONFIG_MIPS_FIXUP_BIGPHYS_ADDR */ diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h index ae73ecf4c41a..9c06a612d33a 100644 --- a/arch/mips/include/asm/pgtable.h +++ b/arch/mips/include/asm/pgtable.h @@ -604,9 +604,8 @@ static inline void update_mmu_cache_pmd(struct vm_area_struct *vma, */ #ifdef CONFIG_MIPS_FIXUP_BIGPHYS_ADDR phys_addr_t fixup_bigphys_addr(phys_addr_t addr, phys_addr_t size); -int io_remap_pfn_range(struct vm_area_struct *vma, unsigned long vaddr, - unsigned long pfn, unsigned long size, pgprot_t prot); -#define io_remap_pfn_range io_remap_pfn_range +unsigned long io_remap_pfn_range_pfn(unsigned long pfn, unsigned long size); +#define io_remap_pfn_range_pfn io_remap_pfn_range_pfn #else #define fixup_bigphys_addr(addr, size) (addr) #endif /* CONFIG_MIPS_FIXUP_BIGPHYS_ADDR */ diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c index 03f8c34fa0a2..e5000bef90f2 100644 --- a/arch/powerpc/kvm/book3s_hv_uvmem.c +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c @@ -723,7 +723,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm) dpage = pfn_to_page(uvmem_pfn); dpage->zone_device_data = pvt; - zone_device_page_init(dpage); + zone_device_page_init(dpage, 0); return dpage; out_clear: spin_lock(&kvmppc_uvmem_bitmap_lock); @@ -1014,8 +1014,9 @@ static vm_fault_t kvmppc_uvmem_migrate_to_ram(struct vm_fault *vmf) * to a normal PFN during H_SVM_PAGE_OUT. * Gets called with kvm->arch.uvmem_lock held. */ -static void kvmppc_uvmem_page_free(struct page *page) +static void kvmppc_uvmem_folio_free(struct folio *folio) { + struct page *page = &folio->page; unsigned long pfn = page_to_pfn(page) - (kvmppc_uvmem_pgmap.range.start >> PAGE_SHIFT); struct kvmppc_uvmem_page_pvt *pvt; @@ -1034,7 +1035,7 @@ static void kvmppc_uvmem_page_free(struct page *page) } static const struct dev_pagemap_ops kvmppc_uvmem_ops = { - .page_free = kvmppc_uvmem_page_free, + .folio_free = kvmppc_uvmem_folio_free, .migrate_to_ram = kvmppc_uvmem_migrate_to_ram, }; diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c index aeb8633a3d00..8c77ec7980de 100644 --- a/arch/powerpc/platforms/pseries/pci_dlpar.c +++ b/arch/powerpc/platforms/pseries/pci_dlpar.c @@ -29,7 +29,7 @@ struct pci_controller *init_phb_dynamic(struct device_node *dn) nid = of_node_to_nid(dn); if (likely((nid) >= 0)) { if (!node_online(nid)) { - if (register_one_node(nid)) { + if (register_node(nid)) { pr_err("PCI: Failed to register node %d\n", nid); } else { update_numa_distance(dn); diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index fadec20b87a8..c95833e80423 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -142,11 +142,13 @@ config RISCV select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET select HAVE_ARCH_SECCOMP_FILTER + select HAVE_ARCH_SOFT_DIRTY if 64BIT && MMU && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_THREAD_STRUCT_WHITELIST select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if 64BIT && MMU select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD + select HAVE_ARCH_USERFAULTFD_WP if 64BIT && MMU && USERFAULTFD && RISCV_ISA_SVRSW60T59B select HAVE_ARCH_VMAP_STACK if MMU && 64BIT select HAVE_ASM_MODVERSIONS select HAVE_CONTEXT_TRACKING_USER @@ -849,6 +851,20 @@ config RISCV_ISA_ZICBOP If you don't know what to do here, say Y. +config RISCV_ISA_SVRSW60T59B + bool "Svrsw60t59b extension support for using PTE bits 60 and 59" + depends on MMU && 64BIT + depends on RISCV_ALTERNATIVE + default y + help + Adds support to dynamically detect the presence of the Svrsw60t59b + extension and enable its usage. + + The Svrsw60t59b extension allows to free the PTE reserved bits 60 + and 59 for software to use. + + If you don't know what to do here, say Y. + config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI def_bool y # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index affd63e11b0a..f98fcb5c17d5 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -106,6 +106,7 @@ #define RISCV_ISA_EXT_ZAAMO 97 #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 +#define RISCV_ISA_EXT_SVRSW60T59B 100 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 179bd4afece4..b422d9691e60 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -19,6 +19,43 @@ #define _PAGE_SOFT (3 << 8) /* Reserved for software */ #define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */ + +#ifdef CONFIG_MEM_SOFT_DIRTY + +/* ext_svrsw60t59b: bit 59 for soft-dirty tracking */ +#define _PAGE_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + (1UL << 59) : 0) +/* + * Bit 3 is always zero for swap entry computation, so we + * can borrow it for swap page soft-dirty tracking. + */ +#define _PAGE_SWP_SOFT_DIRTY \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + _PAGE_EXEC : 0) +#else +#define _PAGE_SOFT_DIRTY 0 +#define _PAGE_SWP_SOFT_DIRTY 0 +#endif /* CONFIG_MEM_SOFT_DIRTY */ + +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP + +/* ext_svrsw60t59b: Bit(60) for uffd-wp tracking */ +#define _PAGE_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + (1UL << 60) : 0) +/* + * Bit 4 is not involved into swap entry computation, so we + * can borrow it for swap page uffd-wp tracking. + */ +#define _PAGE_SWP_UFFD_WP \ + ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \ + _PAGE_USER : 0) +#else +#define _PAGE_UFFD_WP 0 +#define _PAGE_SWP_UFFD_WP 0 +#endif + #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 5a08eb5fe99f..1c311193e7da 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -417,6 +417,41 @@ static inline pte_t pte_wrprotect(pte_t pte) return __pte(pte_val(pte) & ~(_PAGE_WRITE)); } +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +#define pgtable_supports_uffd_wp() \ + riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B) + +static inline bool pte_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_UFFD_WP); +} + +static inline pte_t pte_mkuffd_wp(pte_t pte) +{ + return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP)); +} + +static inline pte_t pte_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP)); +} + +static inline bool pte_swp_uffd_wp(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_mkuffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP); +} + +static inline pte_t pte_swp_clear_uffd_wp(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP)); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + /* static inline pte_t pte_mkread(pte_t pte) */ static inline pte_t pte_mkwrite_novma(pte_t pte) @@ -428,7 +463,7 @@ static inline pte_t pte_mkwrite_novma(pte_t pte) static inline pte_t pte_mkdirty(pte_t pte) { - return __pte(pte_val(pte) | _PAGE_DIRTY); + return __pte(pte_val(pte) | _PAGE_DIRTY | _PAGE_SOFT_DIRTY); } static inline pte_t pte_mkclean(pte_t pte) @@ -456,6 +491,42 @@ static inline pte_t pte_mkhuge(pte_t pte) return pte; } +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +#define pgtable_supports_soft_dirty() \ + (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && \ + riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) + +static inline bool pte_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SOFT_DIRTY); +} + +static inline pte_t pte_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SOFT_DIRTY)); +} + +static inline bool pte_swp_soft_dirty(pte_t pte) +{ + return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_mksoft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY); +} + +static inline pte_t pte_swp_clear_soft_dirty(pte_t pte) +{ + return __pte(pte_val(pte) & ~(_PAGE_SWP_SOFT_DIRTY)); +} +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + #ifdef CONFIG_RISCV_ISA_SVNAPOT #define pte_leaf_size(pte) (pte_napot(pte) ? \ napot_cont_size(napot_cont_order(pte)) :\ @@ -805,6 +876,72 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif +#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP +static inline bool pmd_uffd_wp(pmd_t pmd) +{ + return pte_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd))); +} + +static inline bool pmd_swp_uffd_wp(pmd_t pmd) +{ + return pte_swp_uffd_wp(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd))); +} + +static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd))); +} +#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */ + +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY +static inline bool pmd_soft_dirty(pmd_t pmd) +{ + return pte_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd))); +} + +#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION +static inline bool pmd_swp_soft_dirty(pmd_t pmd) +{ + return pte_swp_soft_dirty(pmd_pte(pmd)); +} + +static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd))); +} + +static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) +{ + return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd))); +} +#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ + static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { @@ -1003,7 +1140,9 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) * * Format of swap PTE: * bit 0: _PAGE_PRESENT (zero) - * bit 1 to 3: _PAGE_LEAF (zero) + * bit 1 to 2: (zero) + * bit 3: _PAGE_SWP_SOFT_DIRTY + * bit 4: _PAGE_SWP_UFFD_WP * bit 5: _PAGE_PROT_NONE (zero) * bit 6: exclusive marker * bits 7 to 11: swap type diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 72ca768f4e91..5441282656a7 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -539,6 +539,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL), __RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT), __RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT), + __RISCV_ISA_EXT_DATA(svrsw60t59b, RISCV_ISA_EXT_SVRSW60T59B), __RISCV_ISA_EXT_DATA(svvptc, RISCV_ISA_EXT_SVVPTC), }; diff --git a/arch/s390/boot/vmem.c b/arch/s390/boot/vmem.c index cea3de4dce8c..fbe64ffdfb96 100644 --- a/arch/s390/boot/vmem.c +++ b/arch/s390/boot/vmem.c @@ -16,7 +16,6 @@ #include "decompressor.h" #include "boot.h" -#define INVALID_PHYS_ADDR (~(phys_addr_t)0) struct ctlreg __bootdata_preserved(s390_invalid_asce); #ifdef CONFIG_PROC_FS diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 603d9e5febb5..dd85bcca817d 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -596,8 +596,9 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr) | _SEGMENT_ENTRY_GMAP_UC | _SEGMENT_ENTRY; } else - *table = pmd_val(*pmd) & - _SEGMENT_ENTRY_HARDWARE_BITS; + *table = (pmd_val(*pmd) & + _SEGMENT_ENTRY_HARDWARE_BITS) + | _SEGMENT_ENTRY; } } else if (*table & _SEGMENT_ENTRY_PROTECT && !(pmd_val(*pmd) & _SEGMENT_ENTRY_PROTECT)) { diff --git a/arch/s390/mm/gmap_helpers.c b/arch/s390/mm/gmap_helpers.c index d4c3c36855e2..549f14ad08af 100644 --- a/arch/s390/mm/gmap_helpers.c +++ b/arch/s390/mm/gmap_helpers.c @@ -11,27 +11,27 @@ #include <linux/mm.h> #include <linux/hugetlb.h> #include <linux/swap.h> -#include <linux/swapops.h> +#include <linux/leafops.h> #include <linux/pagewalk.h> #include <linux/ksm.h> #include <asm/gmap_helpers.h> #include <asm/pgtable.h> /** - * ptep_zap_swap_entry() - discard a swap entry. + * ptep_zap_softleaf_entry() - discard a software leaf entry. * @mm: the mm - * @entry: the swap entry that needs to be zapped + * @entry: the software leaf entry that needs to be zapped * - * Discards the given swap entry. If the swap entry was an actual swap - * entry (and not a migration entry, for example), the actual swapped + * Discards the given software leaf entry. If the leaf entry was an actual + * swap entry (and not a migration entry, for example), the actual swapped * page is also discarded from swap. */ -static void ptep_zap_swap_entry(struct mm_struct *mm, swp_entry_t entry) +static void ptep_zap_softleaf_entry(struct mm_struct *mm, softleaf_t entry) { - if (!non_swap_entry(entry)) + if (softleaf_is_swap(entry)) dec_mm_counter(mm, MM_SWAPENTS); - else if (is_migration_entry(entry)) - dec_mm_counter(mm, mm_counter(pfn_swap_entry_folio(entry))); + else if (softleaf_is_migration(entry)) + dec_mm_counter(mm, mm_counter(softleaf_to_folio(entry))); free_swap_and_cache(entry); } @@ -66,7 +66,7 @@ void gmap_helper_zap_one_page(struct mm_struct *mm, unsigned long vmaddr) preempt_disable(); pgste = pgste_get_lock(ptep); - ptep_zap_swap_entry(mm, pte_to_swp_entry(*ptep)); + ptep_zap_softleaf_entry(mm, softleaf_from_pte(*ptep)); pte_clear(mm, vmaddr, ptep); pgste_set_unlock(ptep, pgste); diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index 7ae77df276b5..666adcd681ab 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -16,7 +16,7 @@ #include <linux/spinlock.h> #include <linux/rcupdate.h> #include <linux/slab.h> -#include <linux/swapops.h> +#include <linux/leafops.h> #include <linux/sysctl.h> #include <linux/ksm.h> #include <linux/mman.h> @@ -673,12 +673,12 @@ void ptep_unshadow_pte(struct mm_struct *mm, unsigned long saddr, pte_t *ptep) pgste_set_unlock(ptep, pgste); } -static void ptep_zap_swap_entry(struct mm_struct *mm, swp_entry_t entry) +static void ptep_zap_softleaf_entry(struct mm_struct *mm, softleaf_t entry) { - if (!non_swap_entry(entry)) + if (softleaf_is_swap(entry)) dec_mm_counter(mm, MM_SWAPENTS); - else if (is_migration_entry(entry)) { - struct folio *folio = pfn_swap_entry_folio(entry); + else if (softleaf_is_migration(entry)) { + struct folio *folio = softleaf_to_folio(entry); dec_mm_counter(mm, mm_counter(folio)); } @@ -700,7 +700,7 @@ void ptep_zap_unused(struct mm_struct *mm, unsigned long addr, if (!reset && pte_swap(pte) && ((pgstev & _PGSTE_GPS_USAGE_MASK) == _PGSTE_GPS_USAGE_UNUSED || (pgstev & _PGSTE_GPS_ZERO))) { - ptep_zap_swap_entry(mm, pte_to_swp_entry(pte)); + ptep_zap_softleaf_entry(mm, softleaf_from_pte(pte)); pte_clear(mm, addr, ptep); } if (reset) diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h index f1538a48484a..a9f802d1dd64 100644 --- a/arch/sparc/include/asm/pgtable_32.h +++ b/arch/sparc/include/asm/pgtable_32.h @@ -395,12 +395,8 @@ __get_iospace (unsigned long addr) #define GET_IOSPACE(pfn) (pfn >> (BITS_PER_LONG - 4)) #define GET_PFN(pfn) (pfn & 0x0fffffffUL) -int remap_pfn_range(struct vm_area_struct *, unsigned long, unsigned long, - unsigned long, pgprot_t); - -static inline int io_remap_pfn_range(struct vm_area_struct *vma, - unsigned long from, unsigned long pfn, - unsigned long size, pgprot_t prot) +static inline unsigned long io_remap_pfn_range_pfn(unsigned long pfn, + unsigned long size) { unsigned long long offset, space, phys_base; @@ -408,9 +404,9 @@ static inline int io_remap_pfn_range(struct vm_area_struct *vma, space = GET_IOSPACE(pfn); phys_base = offset | (space << 32ULL); - return remap_pfn_range(vma, from, phys_base >> PAGE_SHIFT, size, prot); + return phys_base >> PAGE_SHIFT; } -#define io_remap_pfn_range io_remap_pfn_range +#define io_remap_pfn_range_pfn io_remap_pfn_range_pfn #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS #define ptep_set_access_flags(__vma, __address, __ptep, __entry, __dirty) \ diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index 64b85ff9c766..615f460c50af 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -1048,9 +1048,6 @@ int page_in_phys_avail(unsigned long paddr); #define GET_IOSPACE(pfn) (pfn >> (BITS_PER_LONG - 4)) #define GET_PFN(pfn) (pfn & 0x0fffffffffffffffUL) -int remap_pfn_range(struct vm_area_struct *, unsigned long, unsigned long, - unsigned long, pgprot_t); - void adi_restore_tags(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pte_t pte); @@ -1084,9 +1081,8 @@ static inline int arch_unmap_one(struct mm_struct *mm, return 0; } -static inline int io_remap_pfn_range(struct vm_area_struct *vma, - unsigned long from, unsigned long pfn, - unsigned long size, pgprot_t prot) +static inline unsigned long io_remap_pfn_range_pfn(unsigned long pfn, + unsigned long size) { unsigned long offset = GET_PFN(pfn) << PAGE_SHIFT; int space = GET_IOSPACE(pfn); @@ -1094,9 +1090,9 @@ static inline int io_remap_pfn_range(struct vm_area_struct *vma, phys_base = offset | (((unsigned long) space) << 32UL); - return remap_pfn_range(vma, from, phys_base >> PAGE_SHIFT, size, prot); + return phys_base >> PAGE_SHIFT; } -#define io_remap_pfn_range io_remap_pfn_range +#define io_remap_pfn_range_pfn io_remap_pfn_range_pfn static inline unsigned long __untagged_addr(unsigned long start) { diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c index 55faf2effa46..dbf118b40601 100644 --- a/arch/sparc/kernel/sys_sparc_64.c +++ b/arch/sparc/kernel/sys_sparc_64.c @@ -241,7 +241,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u if (flags & MAP_FIXED) { /* Ok, don't mess with it. */ - return mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags); + return mm_get_unmapped_area(NULL, orig_addr, len, pgoff, flags); } flags &= ~MAP_SHARED; @@ -254,7 +254,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u align_goal = (64UL * 1024); do { - addr = mm_get_unmapped_area(current->mm, NULL, orig_addr, + addr = mm_get_unmapped_area(NULL, orig_addr, len + (align_goal - PAGE_SIZE), pgoff, flags); if (!(addr & ~PAGE_MASK)) { addr = (addr + (align_goal - 1UL)) & ~(align_goal - 1UL); @@ -273,7 +273,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u * be obtained. */ if (addr & ~PAGE_MASK) - addr = mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags); + addr = mm_get_unmapped_area(NULL, orig_addr, len, pgoff, flags); return addr; } diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 17a107cc5244..80527299f859 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -281,6 +281,7 @@ config X86 select HAVE_PCI select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP + select ASYNC_KERNEL_PGTABLE_FREE if IOMMU_SVA select MMU_GATHER_RCU_TABLE_FREE select MMU_GATHER_MERGE_VMAS select HAVE_POSIX_CPU_TIMERS_TASK_WORK diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c index 79d6020dfe9c..a42c7180900b 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -130,7 +130,7 @@ static unsigned long sgx_get_unmapped_area(struct file *file, if (flags & MAP_FIXED) return addr; - return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags); + return mm_get_unmapped_area(file, addr, len, pgoff, flags); } #ifdef CONFIG_COMPAT diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 1044aafd5d94..9983017ecbe0 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1028,7 +1028,7 @@ static void __meminit free_pagetable(struct page *page, int order) free_reserved_pages(page, nr_pages); #endif } else { - __free_pages(page, order); + pagetable_free(page_ptdesc(page)); } } diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index c24890c40138..7a97327140df 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -262,7 +262,7 @@ void __init init_gi_nodes(void) * bringup_nonboot_cpus * cpu_up * __try_online_node - * register_one_node + * register_node * because node_subsys is not initialized yet. * TODO remove dependency on node_online */ @@ -303,7 +303,7 @@ void __init init_cpu_to_node(void) * bringup_nonboot_cpus * cpu_up * __try_online_node - * register_one_node + * register_node * because node_subsys is not initialized yet. * TODO remove dependency on node_online */ diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 970981893c9b..fffb6ef1997d 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -429,7 +429,7 @@ static void cpa_collapse_large_pages(struct cpa_data *cpa) list_for_each_entry_safe(ptdesc, tmp, &pgtables, pt_list) { list_del(&ptdesc->pt_list); - __free_page(ptdesc_page(ptdesc)); + pagetable_free(ptdesc); } } diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index ddf248c3ee7d..2e5ecfdce73c 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -729,7 +729,7 @@ int pmd_clear_huge(pmd_t *pmd) int pud_free_pmd_page(pud_t *pud, unsigned long addr) { pmd_t *pmd, *pmd_sv; - pte_t *pte; + struct ptdesc *pt; int i; pmd = pud_pgtable(*pud); @@ -750,8 +750,8 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr) for (i = 0; i < PTRS_PER_PMD; i++) { if (!pmd_none(pmd_sv[i])) { - pte = (pte_t *)pmd_page_vaddr(pmd_sv[i]); - pte_free_kernel(&init_mm, pte); + pt = page_ptdesc(pmd_page(pmd_sv[i])); + pagetable_dtor_free(pt); } } @@ -772,15 +772,15 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr) */ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr) { - pte_t *pte; + struct ptdesc *pt; - pte = (pte_t *)pmd_page_vaddr(*pmd); + pt = page_ptdesc(pmd_page(*pmd)); pmd_clear(pmd); /* INVLPG to clear all paging-structure caches */ flush_tlb_kernel_range(addr, addr + PAGE_SIZE-1); - pte_free_kernel(&init_mm, pte); + pagetable_dtor_free(pt); return 1; } |