| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-12-03 12:18:07 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-12-03 12:18:07 -0800 |
| commit | 98e7dcbb82fa57de8dfad357f9b851c3625797fa (patch) | |
| tree | 18b2b79e656f29dc6553cc316e276a379695e572 /Documentation | |
| parent | b687034b1a4d85333ced0fe07f67b17276cccdc8 (diff) | |
| parent | 9a08942f17017b708991c5089843d4a1bfac4420 (diff) | |
Merge tag 'rcu.release.v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Frederic Weisbecker:
"SRCU:
- Properly handle SRCU readers within IRQ disabled sections in tiny
SRCU
- Preparation to reimplement RCU Tasks Trace on top of SRCU fast:
- Introduce API to expedite a grace period and test it through
rcutorture
- Split srcu-fast into two flavours: SRCU-fast and SRCU-fast-updown.
Both are still targeted toward faster readers (without full
barriers on LOCK and UNLOCK) at the expense of heavier write
side (using full RCU grace period ordering instead of simply
full ordering) as compared to "traditional" non-fast SRCU. But
those srcu-fast flavours are going to be optimized in two
different ways:
- SRCU-fast will become the reimplementation basis for
RCU-TASK-TRACE for consolidation. Since RCU-TASK-TRACE must
be NMI safe, SRCU-fast must be as well.
- SRCU-fast-updown will be needed for uretprobes code in order
to get rid of the read-side memory barriers while still
allowing entering the reader at task level while exiting it
in a timer handler. It is considered semaphore-like in that
it can have different owners between LOCK and UNLOCK.
However it is not NMI-safe.
The actual optimizations are work in progress for the next
cycle. Only the new interfaces are added for now, along with
related torture and scalability test code.
- Create/document/debug/torture new proper initializers for SRCU-fast:
DEFINE_SRCU_FAST() and init_srcu_struct_fast()
This allows the proper write-side ordering (either full ordering or
full RCU grace period ordering) to be used right away, without
waiting for the read side to tell which to use.
This also optimizes the read side by moving flavour debug checks
under a debug config and by removing a costly RmW operation on the
first call.
- Make some diagnostic functions tracing safe
Refscale:
- Add performance testing for common context synchronizations
(Preemption, IRQ, Softirq) and per-cpu increments. Those are
relevant comparisons against SRCU-fast read side APIs, especially
as they are planned to synchronize further tracing fast-path code
Miscellaneous:
- In order to prepare the layout for nohz_full work deferral to user
exit, the context tracking state must shrink the counter of
transitions to/from RCU not watching. The only possible hazard is
that wrap-around triggers more easily, slightly delaying grace
periods when that happens. This should be a rare event, but
debugging and torture code is added to test that assumption
- Fix memory leak on locktorture module
- Annotate accesses in rculist_nulls.h to prevent KCSAN warnings. In
recent discussions, we also concluded that all those WRITE_ONCE()
and READ_ONCE() on list APIs deserve appropriate comments, which
are to be expected in the next cycle
- Provide a script to apply several configs to several commits with
torture
- Allow torture to reuse a build directory in order to save needless
rebuild time
- Various cleanups"
* tag 'rcu.release.v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (29 commits)
refscale: Add SRCU-fast-updown readers
refscale: Exercise DEFINE_STATIC_SRCU_FAST() and init_srcu_struct_fast()
rcutorture: Make srcu{,d}_torture_init() announce the SRCU type
srcu: Create an SRCU-fast-updown API
refscale: Do not disable interrupts for tests involving local_bh_enable()
refscale: Add non-atomic per-CPU increment readers
refscale: Add this_cpu_inc() readers
refscale: Add preempt_disable() readers
refscale: Add local_bh_disable() readers
refscale: Add local_irq_disable() and local_irq_save() readers
torture: Permit negative kvm.sh --kconfig numberic arguments
srcu: Add SRCU_READ_FLAVOR_FAST_UPDOWN CPP macro
rcu: Mark diagnostic functions as notrace
rcutorture: Make TREE04 use CONFIG_RCU_DYNTICKS_TORTURE
rcutorture: Remove redundant rcutorture_one_extend() from rcu_torture_one_read()
rcutorture: Permit kvm-again.sh to re-use the build directory
torture: Add kvm-series.sh to test commit/scenario combination
rcu: use WRITE_ONCE() for ->next and ->pprev of hlist_nulls
locktorture: Fix memory leak in param_set_cpumask()
doc: Update for SRCU-fast definitions and initialization
...
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/RCU/Design/Requirements/Requirements.rst | 33 | ||||
| -rw-r--r-- | Documentation/RCU/checklist.rst | 12 | ||||
| -rw-r--r-- | Documentation/RCU/whatisRCU.rst | 3 |
3 files changed, 27 insertions, 21 deletions
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index f24b3c0b9b0d..ba417a08b93d 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -2637,15 +2637,16 @@ synchronize_srcu() for some other domain ``ss1``, and if an
 that was held across as ``ss``-domain synchronize_srcu(), deadlock
 would again be possible. Such a deadlock cycle could extend across an
 arbitrarily large number of different SRCU domains. Again, with great
-power comes great responsibility.
+power comes great responsibility, though lockdep is now able to detect
+this sort of deadlock.

-Unlike the other RCU flavors, SRCU read-side critical sections can run
-on idle and even offline CPUs. This ability requires that
-srcu_read_lock() and srcu_read_unlock() contain memory barriers,
-which means that SRCU readers will run a bit slower than would RCU
-readers. It also motivates the smp_mb__after_srcu_read_unlock() API,
-which, in combination with srcu_read_unlock(), guarantees a full
-memory barrier.
+Unlike the other RCU flavors, SRCU read-side critical sections can run on
+idle and even offline CPUs, with the exception of srcu_read_lock_fast()
+and friends. This ability requires that srcu_read_lock() and
+srcu_read_unlock() contain memory barriers, which means that SRCU
+readers will run a bit slower than would RCU readers. It also motivates
+the smp_mb__after_srcu_read_unlock() API, which, in combination with
+srcu_read_unlock(), guarantees a full memory barrier.

 Also unlike other RCU flavors, synchronize_srcu() may **not** be
 invoked from CPU-hotplug notifiers, due to the fact that SRCU grace
@@ -2681,15 +2682,15 @@ run some tests first. SRCU just might need a few adjustment to deal with
 that sort of load. Of course, your mileage may vary based on the speed
 of your CPUs and the size of your memory.

-The `SRCU
-API <https://lwn.net/Articles/609973/#RCU%20Per-Flavor%20API%20Table>`__
+The `SRCU API
+<https://lwn.net/Articles/609973/#RCU%20Per-Flavor%20API%20Table>`__
 includes srcu_read_lock(), srcu_read_unlock(),
-srcu_dereference(), srcu_dereference_check(),
-synchronize_srcu(), synchronize_srcu_expedited(),
-call_srcu(), srcu_barrier(), and srcu_read_lock_held(). It
-also includes DEFINE_SRCU(), DEFINE_STATIC_SRCU(), and
-init_srcu_struct() APIs for defining and initializing
-``srcu_struct`` structures.
+srcu_dereference(), srcu_dereference_check(), synchronize_srcu(),
+synchronize_srcu_expedited(), call_srcu(), srcu_barrier(),
+and srcu_read_lock_held(). It also includes DEFINE_SRCU(),
+DEFINE_STATIC_SRCU(), DEFINE_SRCU_FAST(), DEFINE_STATIC_SRCU_FAST(),
+init_srcu_struct(), and init_srcu_struct_fast() APIs for defining and
+initializing ``srcu_struct`` structures.

 More recently, the SRCU API has added polling interfaces:

diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index c9bfb2b218e5..4b30f701225f 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -417,11 +417,13 @@ over a rather long period of time, but improvements are always welcome!
 	you should be using RCU rather than SRCU, because RCU is almost
 	always faster and easier to use than is SRCU.

-	Also unlike other forms of RCU, explicit initialization and
-	cleanup is required either at build time via DEFINE_SRCU()
-	or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
-	and cleanup_srcu_struct(). These last two are passed a
-	"struct srcu_struct" that defines the scope of a given
+	Also unlike other forms of RCU, explicit initialization
+	and cleanup is required either at build time via
+	DEFINE_SRCU(), DEFINE_STATIC_SRCU(), DEFINE_SRCU_FAST(),
+	or DEFINE_STATIC_SRCU_FAST() or at runtime via either
+	init_srcu_struct() or init_srcu_struct_fast() and
+	cleanup_srcu_struct(). These last three are passed a
+	`struct srcu_struct` that defines the scope of a given
 	SRCU domain. Once initialized, the srcu_struct is passed
 	to srcu_read_lock(), srcu_read_unlock() synchronize_srcu(),
 	synchronize_srcu_expedited(), and call_srcu(). A given
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index cf0b0ac9f463..a1582bd653d1 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -1227,7 +1227,10 @@ SRCU: Initialization/cleanup/ordering::

 	DEFINE_SRCU
 	DEFINE_STATIC_SRCU
+	DEFINE_SRCU_FAST // for srcu_read_lock_fast() and friends
+	DEFINE_STATIC_SRCU_FAST // for srcu_read_lock_fast() and friends
 	init_srcu_struct
+	init_srcu_struct_fast
 	cleanup_srcu_struct
 	smp_mb__after_srcu_read_unlock