summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
AgeCommit message (Collapse)Author
2025-09-18drivers/perf: hisi: Add support for HiSilicon NoC PMUYicong Yang
Adds the support for HiSilicon NoC (Network on Chip) PMU which will be used to monitor the events on the system bus. The PMU device will be named after the SCL ID (either Super CPU cluster or Super IO cluster) and the index ID, just similar to other HiSilicon Uncore PMUs. Below PMU formats are provided besides the event: - ch: the transaction channel (data, request, response, etc) which can be used to filter the counting. - tt_en: tracetag filtering enable. Just as other HiSilicon Uncore PMUs the NoC PMU supports only counting the transactions with tracetag. The NoC PMU doesn't have an interrupt to indicate the overflow. However we have a 64 bit counter which is large enough and it's nearly impossible to overflow. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18perf/dwc_pcie: Support counting multiple lane events in parallelIlkka Koskinen
While Designware PCIe PMU allows to count only one time based event at a time, it allows to count all the lane events simultaneously. After the patch one is able to count a group of lane events: $ perf stat -e '{dwc_rootport/tx_memory_write,lane=1/,dwc_rootport/rx_memory_read,lane=0/}' dd if=/dev/nvme0n1 of=/dev/null bs=1M count=1 Earlier the events wouldn't have been counted successfully. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-16arm64: mm: Rework the 'rodata=' optionsHuang Shijie
As per admin guide documentation, "rodata=on" should be the default on platforms. Documentation/admin-guide/kernel-parameters.txt describes these options as rodata= [KNL,EARLY] on Mark read-only kernel memory as read-only (default). off Leave read-only kernel memory writable for debugging. full Mark read-only kernel memory and aliases as read-only [arm64] But on arm64 platform, RODATA_FULL_DEFAULT_ENABLED is enabled by default, so "rodata=full" is the default instead. For parity with other architectures, namely x86, rework 'rodata=on' to match the current "full" behaviour and replace 'rodata=full' with a new 'rodata=noalias' option which retains writable aliases in the direct map for memory regions outside of the kernel image. Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-16xfs: extend removed sysctls tableBagas Sanjaya
Commit 21d59d00221e4e ("xfs: remove deprecated sysctl knobs") moves recently-removed sysctls to the removed sysctls table but fails to extend the table, hence triggering Sphinx warning: Documentation/admin-guide/xfs.rst:365: ERROR: Malformed table. Text in column margin in table line 8. ============================= ======= Name Removed ============================= ======= fs.xfs.xfsbufd_centisec v4.0 fs.xfs.age_buffer_centisecs v4.0 fs.xfs.irix_symlink_mode v6.18 fs.xfs.irix_sgid_inherit v6.18 fs.xfs.speculative_cow_prealloc_lifetime v6.18 ============================= ======= [docutils] Extend "Name" column of the table to fit the now-longest sysctl, which is fs.xfs.speculative_cow_prealloc_lifetime. Fixes: 21d59d00221e ("xfs: remove deprecated sysctl knobs") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250908180406.32124fb7@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-09-15KVM: SEV: Add SEV-SNP CipherTextHiding supportAshish Kalra
Ciphertext hiding prevents host accesses from reading the ciphertext of SNP guest private memory. Instead of reading ciphertext, the host reads will see constant default values (0xff). The SEV ASID space is split into SEV and SEV-ES/SEV-SNP ASID ranges. Enabling ciphertext hiding further splits the SEV-ES/SEV-SNP ASID space into separate ASID ranges for SEV-ES and SEV-SNP guests. Add a new off-by-default kvm-amd module parameter to enable ciphertext hiding and allow the admin to configure the SEV-ES and SEV-SNP ASID ranges. Simply cap the maximum SEV-SNP ASID as appropriate, i.e. don't reject loading KVM or disable ciphertest hiding for a too-big value, as KVM's general approach for module params is to sanitize inputs based on hardware/kernel support, not burn the world down. This also allows the admin to use -1u to assign all SEV-ES/SNP ASIDs to SNP without needing dedicated handling in KVM. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/95abc49edfde36d4fb791570ea2a4be6ad95fd0d.1755721927.git.ashish.kalra@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-15x86/resctrl: Add ABMC feature in the command line optionsBabu Moger
Add a kernel command-line parameter to enable or disable the exposure of the ABMC (Assignable Bandwidth Monitoring Counters) hardware feature to resctrl. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-13panic: refine the document for 'panic_print'Feng Tang
User reported current document about SYS_INFO_PANIC_CONSOLE_REPLAY is confusing, that people could expect all user space console messages to be replayed. Specify that only 'kernel' messages will be replayed to solve the confusion. Link: https://lkml.kernel.org/r/20250825025701.81921-3-feng.tang@linux.alibaba.com Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com> Reported-by: Askar Safin <safinaskar@zohomail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lance Yang <lance.yang@linux.dev> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13Docs/admin-guide/mm/damon/usage: document addr_unit fileSeongJae Park
Document addr_unit DAMON sysfs file on DAMON usage document. Link: https://lkml.kernel.org/r/20250828171242.59810-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ze zuo <zuoze1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13docs: transhuge: document process level THP controlsUsama Arif
This includes the PR_SET_THP_DISABLE/PR_GET_THP_DISABLE pair of prctl calls as well the newly introduced PR_THP_DISABLE_EXCEPT_ADVISED flag for the PR_SET_THP_DISABLE prctl call. Link: https://lkml.kernel.org/r/20250815135549.130506-5-usamaarif642@gmail.com Signed-off-by: Usama Arif <usamaarif642@gmail.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yafang <laoar.shao@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-12x86/bugs: Add attack vector controls for VMSCAPEDavid Kaplan
Use attack vector controls to select whether VMSCAPE requires mitigation, similar to other bugs. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc6). Conflicts: net/netfilter/nft_set_pipapo.c net/netfilter/nft_set_pipapo_avx2.c c4eaca2e1052 ("netfilter: nft_set_pipapo: don't check genbit from packetpath lookups") 84c1da7b38d9 ("netfilter: nft_set_pipapo: use avx2 algorithm for insertions too") Only trivial adjacent changes (in a doc and a Makefile). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-10Merge tag 'vmscape-for-linus-20250904' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vmescape mitigation fixes from Dave Hansen: "Mitigate vmscape issue with indirect branch predictor flushes. vmscape is a vulnerability that essentially takes Spectre-v2 and attacks host userspace from a guest. It particularly affects hypervisors like QEMU. Even if a hypervisor may not have any sensitive data like disk encryption keys, guest-userspace may be able to attack the guest-kernel using the hypervisor as a confused deputy. There are many ways to mitigate vmscape using the existing Spectre-v2 defenses like IBRS variants or the IBPB flushes. This series focuses solely on IBPB because it works universally across vendors and all vulnerable processors. Further work doing vendor and model-specific optimizations can build on top of this if needed / wanted. Do the normal issue mitigation dance: - Add the CPU bug boilerplate - Add a list of vulnerable CPUs - Use IBPB to flush the branch predictors after running guests" * tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vmscape: Add old Intel CPUs to affected list x86/vmscape: Warn when STIBP is disabled with SMT x86/bugs: Move cpu_bugs_smt_update() down x86/vmscape: Enable the mitigation x86/vmscape: Add conditional IBPB mitigation x86/vmscape: Enumerate VMSCAPE bug Documentation/hw-vuln: Add VMSCAPE documentation
2025-09-09media: i2c: ov6650: Drop unused driverLaurent Pinchart
The ov6650 driver was introduced in v2.6.37 to support the OMAP1-based Amstrad Delta video phone. The platform still has a board file in the kernel, but support for the camera was dropped in commit ce548396a433 ("media: mach-omap1: board-ams-delta.c: remove soc_camera dependencies") in v5.9. The driver has been unused since as it has received neither ACPI nor DT support. The ov6650 driver is one of the last sensor drivers calling clk_set_rate(). This is deprecated, and calls to the function are being removed to avoid cargo-cult. As the driver is unlikely to ever be used again, drop it instead of trying to avoid call clk_set_rate(). Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Mehdi Djait <mehdi.djait@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2025-09-09Documentation: update Hans Verkuil's email addressHans Verkuil
Replace hverkuil@xs4all.nl by hverkuil@kernel.org. Signed-off-by: Hans Verkuil <hverkuil@kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2025-09-06md/md-llbitmap: introduce new lockless bitmapYu Kuai
Redundant data is used to enhance data fault tolerance, and the storage method for redundant data vary depending on the RAID levels. And it's important to maintain the consistency of redundant data. Bitmap is used to record which data blocks have been synchronized and which ones need to be resynchronized or recovered. Each bit in the bitmap represents a segment of data in the array. When a bit is set, it indicates that the multiple redundant copies of that data segment may not be consistent. Data synchronization can be performed based on the bitmap after power failure or readding a disk. If there is no bitmap, a full disk synchronization is required. Due to known performance issues with md-bitmap and the unreasonable implementations: - self-managed IO submitting like filemap_write_page(); - global spin_lock I have decided not to continue optimizing based on the current bitmap implementation, this new bitmap is invented without locking from IO fast path and can be used with fast disks. For designs and details, see the comments in drivers/md-llbitmap.c. Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-12-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Li Nan <linan122@huawei.com>
2025-09-06md/md-bitmap: delay registration of bitmap_ops until creating bitmapYu Kuai
Currently bitmap_ops is registered while allocating mddev, this is fine when there is only one bitmap_ops. Delay setting bitmap_ops until creating bitmap, so that user can choose which bitmap to use before running the array. Link: https://lore.kernel.org/linux-raid/20250721171557.34587-7-yukuai@kernel.org Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Li Nan <linan122@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com>
2025-09-06md/md-bitmap: add a new sysfs api bitmap_typeYu Kuai
The api will be used by mdadm to set bitmap_type while creating new array or assembling array, prepare to add a new bitmap. Currently available options are: cat /sys/block/md0/md/bitmap_type none [bitmap] Link: https://lore.kernel.org/linux-raid/20250829080426.1441678-6-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Li Nan <linan122@huawei.com>
2025-09-05cgroup: Merge branch 'for-6.17-fixes' into for-6.18Tejun Heo
Pull for-6.17-fixes to receive 79f919a89c9d ("cgroup: split cgroup_destroy_wq into 3 workqueues") to resolve its conflict with 7fa33aa3b001 ("cgroup: WQ_PERCPU added to alloc_workqueue users"). The latter adds WQ_PERCPU when creating cgroup_destroy_wq and the former splits the workqueue into three. Resolve by applying WQ_PERCPU to the three split workqueues. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-09-05xfs: remove deprecated sysctl knobsDarrick J. Wong
These sysctl knobs were scheduled for removal in September 2025. That time has come, so remove them. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2025-09-05xfs: remove deprecated mount optionsDarrick J. Wong
These four mount options were scheduled for removal in September 2025, so remove them now. Cc: preichl@redhat.com Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2025-09-05xfs: disable deprecated features by default in KconfigDarrick J. Wong
We promised to turn off these old features by default in September 2025. Do so now. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2025-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc5). No conflicts. Adjacent changes: include/net/sock.h c51613fa276f ("net: add sk->sk_drop_counters") 5d6b58c932ec ("net: lockless sock_i_ino()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-04x86/cfi: Add "debug" option to "cfi=" bootparamKees Cook
Add "debug" option for "cfi=" bootparam to get details on early CFI initialization steps so future Kees can find breakage easier. Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250904034656.3670313-5-kees@kernel.org
2025-09-04x86/cfi: Document the "cfi=" bootparam optionsKees Cook
The kernel-parameters.txt didn't have a section for the cfi= options. Add it. Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250904034656.3670313-3-kees@kernel.org
2025-09-04x86/microcode: Add microcode loader debugging functionalityBorislav Petkov (AMD)
Instead of adding ad-hoc debugging glue to the microcode loader each time I need it, add debugging functionality which is not built by default. Simulate all patch handling the loader does except the actual loading of the microcode patch into the hardware. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250820135043.19048-3-bp@kernel.org
2025-09-04x86/microcode: Add microcode= cmdline parsingBorislav Petkov (AMD)
Add a "microcode=" command line argument after which all options can be passed in a comma-separated list. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Chang S. Bae <chang.seok.bae@intel.com> Link: https://lore.kernel.org/20250820135043.19048-2-bp@kernel.org
2025-09-03docs: admin-guide: Fix typo in nfsroot.rstZenghui Yu
There is an obvious mistake in nfsroot.rst where pxelinux was wrongly written as pxeliunx. Fix it. Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250827144738.43361-1-zenghui.yu@linux.dev
2025-09-03genirq: Add support for warning on long-running interrupt handlersWladislav Wiebe
Introduce a mechanism to detect and warn about prolonged interrupt handlers. With a new command-line parameter (irqhandler.duration_warn_us=), users can configure the duration threshold in microseconds when a warning in such format should be emitted: "[CPU14] long duration of IRQ[159:bad_irq_handler [long_irq]], took: 1330 us" The implementation uses local_clock() to measure the execution duration of the generic IRQ per-CPU event handler. Signed-off-by: Wladislav Wiebe <wladislav.wiebe@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/all/20250804093525.851-1-wladislav.wiebe@nokia.com
2025-08-29Fix typo in RAID arrays documentationVivek Alurkar
Changed "write-throuth" to "write-through". Signed-off-by: Vivek Alurkar <primalkenja@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250821051622.8341-2-primalkenja@gmail.com
2025-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/intel/idpf/idpf_txrx.c 02614eee26fb ("idpf: do not linearize big TSO packets") 6c4e68480238 ("idpf: remove obsolete stashing code") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27x86/bugs: Add attack vector controls for SSBDavid Kaplan
Attack vector controls for SSB were missed in the initial attack vector series. The default mitigation for SSB requires user-space opt-in so it is only relevant for user->user attacks. Check with attack vector controls when the command is auto - i.e., no explicit user selection has been done. Fixes: 2d31d2874663 ("x86/bugs: Define attack vectors relevant for each bug") Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250819192200.2003074-5-david.kaplan@amd.com
2025-08-25dm-pcache: add persistent cache target in device-mapperDongsheng Yang
This patch introduces dm-pcache, a new DM target that places a DAX- capable persistent-memory device in front of any slower block device and uses it as a high-throughput, low-latency cache. Design highlights ----------------- - DAX data path – data is copied directly between DRAM and the pmem mapping, bypassing the block layer’s overhead. - Segmented, crash-consistent layout - all layout metadata are dual-replicated CRC-protected. - atomic kset flushes; key replay on mount guarantees cache integrity even after power loss. - Striped multi-tree index - Multi‑tree indexing for high parallelism. - overlap-resolution logic ensures non-intersecting cached extents. - Background services - write-back worker flushes dirty keys in order, preserving backing-device crash consistency. This is important for checkpoint in cloud storage. - garbage collector reclaims clean segments when utilisation exceeds a tunable threshold. - Data integrity – optional CRC32 on cached payload; metadata always protected. Comparison with existing block-level caches --------------------------------------------------------------------------------------------------------------------------------- | Feature | pcache (this patch) | bcache | dm-writecache | |----------------------------------|---------------------------------|------------------------------|---------------------------| | pmem access method | DAX | bio (block I/O) | DAX | | Write latency (4 K rand-write) | ~5 µs | ~20 µs | ~5 µs | | Concurrency | multi subtree index | global index tree | single tree + wc_lock | | IOPS (4K randwrite, 32 numjobs) | 2.1 M | 352 K | 283 K | | Read-cache support | YES | YES | NO | | Deployment | no re-format of backend | backend devices must be | no re-format of backend | | | | reformatted | | | Write-back ordering | log-structured; | no ordering guarantee | no ordering guarantee | | | preserves app-IO-order | | | | Data integrity checks | metadata + data CRC(optional) | metadata CRC only | none | --------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-22cgroup: cgroup.stat.local time accountingTiffany Yang
There isn't yet a clear way to identify a set of "lost" time that everyone (or at least a wider group of users) cares about. However, users can perform some delay accounting by iterating over components of interest. This patch allows cgroup v2 freezing time to be one of those components. Track the cumulative time that each v2 cgroup spends freezing and expose it to userland via a new local stat file in cgroupfs. Thank you to Michal, who provided the ASCII art in the updated documentation. To access this value: $ mkdir /sys/fs/cgroup/test $ cat /sys/fs/cgroup/test/cgroup.stat.local freeze_time_total 0 Ensure consistent freeze time reads with freeze_seq, a per-cgroup sequence counter. Writes are serialized using the css_set_lock. Signed-off-by: Tiffany Yang <ynaffit@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-21Merge tag 'cgroup-for-6.17-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix NULL de-ref in css_rstat_exit() which could happen after allocation failure - Fix a cpuset partition handling bug and a couple other misc issues - Doc spelling fix * tag 'cgroup-for-6.17-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup: fixed spelling mistakes in documentation cgroup: avoid null de-ref in css_rstat_exit() cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write() cgroup/cpuset: Fix a partition error with CPU hotplug cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key
2025-08-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-21docs: sysctl: add a few more top-level /proc/sys entriesRandy Dunlap
Add a few missing directories under /proc/sys. Fix punctuation and doubled words. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250819075456.113623-1-rdunlap@infradead.org
2025-08-21docs: kernel-parameters: typo fix and add missing SPDX-License tagBartlomiej Kubik
Fix documentation issues by removing a duplicated word and adding the missing SPDX-License identifier. Signed-off-by: Bartlomiej Kubik <kubik.bartlomiej@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250819113551.34356-1-kubik.bartlomiej@gmail.com
2025-08-21fs: Add 'initramfs_options' to set initramfs mount optionsLichen Liu
When CONFIG_TMPFS is enabled, the initial root filesystem is a tmpfs. By default, a tmpfs mount is limited to using 50% of the available RAM for its content. This can be problematic in memory-constrained environments, particularly during a kdump capture. In a kdump scenario, the capture kernel boots with a limited amount of memory specified by the 'crashkernel' parameter. If the initramfs is large, it may fail to unpack into the tmpfs rootfs due to insufficient space. This is because to get X MB of usable space in tmpfs, 2*X MB of memory must be available for the mount. This leads to an OOM failure during the early boot process, preventing a successful crash dump. This patch introduces a new kernel command-line parameter, initramfs_options, which allows passing specific mount options directly to the rootfs when it is first mounted. This gives users control over the rootfs behavior. For example, a user can now specify initramfs_options=size=75% to allow the tmpfs to use up to 75% of the available memory. This can significantly reduce the memory pressure for kdump. Consider a practical example: To unpack a 48MB initramfs, the tmpfs needs 48MB of usable space. With the default 50% limit, this requires a memory pool of 96MB to be available for the tmpfs mount. The total memory requirement is therefore approximately: 16MB (vmlinuz) + 48MB (loaded initramfs) + 48MB (unpacked kernel) + 96MB (for tmpfs) + 12MB (runtime overhead) ≈ 220MB. By using initramfs_options=size=75%, the memory pool required for the 48MB tmpfs is reduced to 48MB / 0.75 = 64MB. This reduces the total memory requirement by 32MB (96MB - 64MB), allowing the kdump to succeed with a smaller crashkernel size, such as 192MB. An alternative approach of reusing the existing rootflags parameter was considered. However, a new, dedicated initramfs_options parameter was chosen to avoid altering the current behavior of rootflags (which applies to the final root filesystem) and to prevent any potential regressions. Also add documentation for the new kernel parameter "initramfs_options" This approach is inspired by prior discussions and patches on the topic. Ref: https://www.lightofdawn.org/blog/?viewDetailed=00128 Ref: https://landley.net/notes-2015.html#01-01-2015 Ref: https://lkml.org/lkml/2021/6/29/783 Ref: https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html#what-is-rootfs Signed-off-by: Lichen Liu <lichliu@redhat.com> Link: https://lore.kernel.org/20250815121459.3391223-1-lichliu@redhat.com Tested-by: Rob Landley <rob@landley.net> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-20net: set net.core.rmem_max and net.core.wmem_max to 4 MBEric Dumazet
SO_RCVBUF and SO_SNDBUF have limited range today, unless distros or system admins change rmem_max and wmem_max. Even iproute2 uses 1 MB SO_RCVBUF which is capped by the kernel. Decouple [rw]mem_max and [rw]mem_default and increase [rw]mem_max to 4 MB. Before: $ sysctl net.core.rmem_default net.core.rmem_max net.core.wmem_default net.core.wmem_max net.core.rmem_default = 212992 net.core.rmem_max = 212992 net.core.wmem_default = 212992 net.core.wmem_max = 212992 After: $ sysctl net.core.rmem_default net.core.rmem_max net.core.wmem_default net.core.wmem_max net.core.rmem_default = 212992 net.core.rmem_max = 4194304 net.core.wmem_default = 212992 net.core.wmem_max = 4194304 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20250819174030.1986278-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19dm-vdo: Promote dm-vdo title to title headingBagas Sanjaya
dm-vdo docs currently has no explicit title heading but instead there are multiple section headings as top-level heading. As such, these sections are rendered as titles and inflates number of entries in the toctree index. Promote the first section heading ("dm-vdo") to title heading. Fixes: 04bf7ac646ab ("dm: add documentation for dm-vdo target") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-19docs: device-mapper: fixed spelling mistakes in documentationSoham Metha
found/fixed the following typos - flushs -> flushes in `Documentation/admin-guide/device-mapper/delay.rst` Signed-off-by: Soham Metha <sohammetha01@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-19docs: device-mapper: fix typos in delay.rst and vdo-design.rstShubham Sharma
Fixed the following typos in device-mapper documentation: - explicitely -> explicitly - approriate -> appropriate Signed-off-by: Shubham Sharma <slopixelz@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-08-18docs: Remove remainders of reiserfsDavid Sterba
Reiserfs has been removed in 6.13, there are still some mentions in the documentation about it and the tools. Remove those that don't seem relevant anymore but keep references to reiserfs' r5 hash used by some code. There's one change in a script scripts/selinux/install_policy.sh but it does not seem to be relevant either. Signed-off-by: David Sterba <dsterba@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250813100053.1291961-1-dsterba@suse.com
2025-08-18Merge branch 'bjorn' into docs-mwJonathan Corbet
A big set of typo fixes from Bjorn Helgaas
2025-08-18Documentation: Fix admin-guide typosBjorn Helgaas
Fix typos. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250813200526.290420-4-helgaas@kernel.org
2025-08-18docs: remove a duplicated word from kernel-parameters.txtVivek Yadav
Fix kernel-doc warning in kernel-parameters.txt WARNING: Possible repeated word: 'is' Signed-off-by: Vivek Yadav <vivekyadav1207731111@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250816082452.219009-1-vivekyadav1207731111@gmail.com
2025-08-18docs: Replace dead links to spectre side channel white papersNikola Z. Ivanov
The papers are published by Intel, AMD and MIPS. Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250816190028.55573-1-zlatistiv@gmail.com
2025-08-17Merge tag 'x86_urgent_for_v6.17_rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Remove a transitional asm/cpuid.h header which was added only as a fallback during cpuid helpers reorg - Initialize reserved fields in the SVSM page validation calls structure to zero in order to allow for future structure extensions - Have the sev-guest driver's buffers used in encryption operations be in linear mapping space as the encryption operation can be offloaded to an accelerator - Have a read-only MSR write when in an AMD SNP guest trap to the hypervisor as it is usually done. This makes the guest user experience better by simply raising a #GP instead of terminating said guest - Do not output AVX512 elapsed time for kernel threads because the data is wrong and fix a NULL pointer dereferencing in the process - Adjust the SRSO mitigation selection to the new attack vectors * tag 'x86_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpuid: Remove transitional <asm/cpuid.h> header x86/sev: Ensure SVSM reserved fields in a page validation entry are initialized to zero virt: sev-guest: Satisfy linear mapping requirement in get_derived_key() x86/sev: Improve handling of writes to intercepted TSC MSRs x86/fpu: Fix NULL dereference in avx512_status() x86/bugs: Select best SRSO mitigation
2025-08-14x86/vmscape: Enable the mitigationPawan Gupta
Enable the previously added mitigation for VMscape. Add the cmdline vmscape={off|ibpb|force} and sysfs reporting. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
2025-08-14Documentation/hw-vuln: Add VMSCAPE documentationPawan Gupta
VMSCAPE is a vulnerability that may allow a guest to influence the branch prediction in host userspace, particularly affecting hypervisors like QEMU. Add the documentation. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>