| author | Roman Gushchin <roman.gushchin@linux.dev> | 2025-08-15 11:32:24 -0700 |
|---|---|---|
| committer | Andrew Morton <akpm@linux-foundation.org> | 2025-09-13 16:55:04 -0700 |
| commit | e338d83531540c6cda4bdc8de52aaae186d8a97c | |
| tree | 1ce2f96591676e0ee89d05f281270be31445b6f5 | |
| parent | 19de1e5d11d142d50c80cad1aa9916f25a36ab0d | |
mm: readahead: improve mmap_miss heuristic for concurrent faults
If two or more threads of an application fault on the same folio, the
mmap_miss counter can be decreased multiple times. This breaks the
mmap_miss heuristic and keeps readahead enabled even under extreme
levels of memory pressure.
This happens often when file folios backing a multi-threaded application
are evicted and re-faulted.
Fix it by skipping the mmap_miss decrement if the folio is locked.
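For context, the balance being protected here is with the increment path in
do_sync_mmap_readahead(): every major fault bumps mmap_miss, and once the
counter crosses a threshold, readahead is disabled for the file. The
following is a simplified sketch of that path, not the verbatim source;
exact constants and surrounding logic vary across kernel versions:

```c
/*
 * Sketch of the mmap_miss increment side in do_sync_mmap_readahead()
 * (mm/filemap.c). Simplified; details vary across kernel versions.
 */
static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
{
	struct file_ra_state *ra = &vmf->vma->vm_file->f_ra;
	struct file *fpin = NULL;
	unsigned int mmap_miss;

	/*
	 * Every major fault counts as a readahead miss; the counter is
	 * capped so it cannot grow without bound.
	 */
	mmap_miss = READ_ONCE(ra->mmap_miss);
	if (mmap_miss < MMAP_LOTSAMISS * 10)
		WRITE_ONCE(ra->mmap_miss, ++mmap_miss);

	/*
	 * If misses dominate hits for this file, readahead only hurts:
	 * bail out without submitting any.
	 */
	if (mmap_miss > MMAP_LOTSAMISS)
		return fpin;

	/* ... otherwise set up and submit synchronous readahead ... */
	return fpin;
}
```

Each successful async-readahead hit in do_async_mmap_readahead() decrements
the counter, so double decrements from concurrent faults skew the
miss/hit ratio toward "hit" and keep readahead alive when it should be off.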
This change was evaluated on several hundred thousand hosts in Google's
production fleet over a couple of weeks. The number of containers stuck
in a vicious reclaim cycle for a long time was reduced several fold
(~10-20x), and the overall fleet-wide CPU time spent in direct memory
reclaim dropped meaningfully. No regressions were observed.
Link: https://lkml.kernel.org/r/20250815183224.62007-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'mm/filemap.c')
| -rw-r--r-- | mm/filemap.c | 14 |
1 file changed, 11 insertions(+), 3 deletions(-)
```diff
diff --git a/mm/filemap.c b/mm/filemap.c
index d1fb0b12bff2..1a388b11cfa9 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3323,9 +3323,17 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
 	if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
 		return fpin;
 
-	mmap_miss = READ_ONCE(ra->mmap_miss);
-	if (mmap_miss)
-		WRITE_ONCE(ra->mmap_miss, --mmap_miss);
+	/*
+	 * If the folio is locked, we're likely racing against another fault.
+	 * Don't touch the mmap_miss counter to avoid decreasing it multiple
+	 * times for a single folio and break the balance with mmap_miss
+	 * increase in do_sync_mmap_readahead().
+	 */
+	if (likely(!folio_test_locked(folio))) {
+		mmap_miss = READ_ONCE(ra->mmap_miss);
+		if (mmap_miss)
+			WRITE_ONCE(ra->mmap_miss, --mmap_miss);
+	}
 
 	if (folio_test_readahead(folio)) {
 		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
```
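As the in-diff comment notes, the folio lock serves as a cheap, lockless
signal of a racing fault: the first thread to fault typically holds the
folio locked while it is brought uptodate, so concurrent faulters that
observe the lock skip the decrement. If the folio happens to be locked for
an unrelated reason, the cost is a single missed decrement, which the
heuristic easily tolerates.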