diff options
| author | Shivank Garg <shivankg@amd.com> | 2025-10-16 10:28:46 -0700 |
|---|---|---|
| committer | Sean Christopherson <seanjc@google.com> | 2025-10-20 06:30:41 -0700 |
| commit | ed1ffa810bd600ae40c794d87e4ca587dee5fa3a (patch) | |
| tree | be72abc063cb37900c6582f0c46f64b3cb550a7b /drivers/usb/cdns3/cdns3-debug.h | |
| parent | f609e89ae8936dbba8992ada83d27ece5cb8393a (diff) | |
KVM: guest_memfd: Enforce NUMA mempolicy using shared policy
Previously, guest-memfd allocations followed local NUMA node id in absence
of process mempolicy, resulting in arbitrary memory allocation.
Moreover, mbind() couldn't be used by the VMM as guest memory wasn't
mapped into userspace when allocation occurred.
Enable NUMA policy support by implementing vm_ops for guest-memfd mmap
operation. This allows the VMM to use mmap()+mbind() to set the desired
NUMA policy for a range of memory, and provides fine-grained control over
guest memory allocation across NUMA nodes.
Note, using mmap()+mbind() works even for PRIVATE memory, as mbind()
doesn't require the memory to be faulted in. However, get_mempolicy()
and other paths that require the userspace page tables to be populated
may return incorrect information for PRIVATE memory (though under the hood,
KVM+guest_memfd will still behave correctly).
Store the policy in the inode structure, gmem_inode, as a shared memory
policy, so that the policy is a property of the physical memory itself,
i.e. not bound to the VMA. In guest_memfd, KVM is the primary MMU and any
VMAs are secondary, i.e. using mbind() on a VMA to set policy is a means
to an end, e.g. to avoid having to add a file-based equivalent to mbind().
Similarly, retrieve the policy via mpol_shared_policy_lookup(), not
get_vma_policy(), even when allocating to fault in memory for userspace
mappings, so that the policy stored in gmem_inode is always the source of
true.
Apply policy changes only to future allocations, i.e. do not migrate
existing memory in the guest_memfd instance. This matches mbind(2)'s
default behavior, which affects only new allocations unless overridden
with MPOL_MF_MOVE/MPOL_MF_MOVE_ALL flags (which are not supported by
guest_memfd as guest_memfd memory is unmovable).
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Shivank Garg <shivankg@amd.com>
Tested-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/all/e9d43abc-bcdb-4f9f-9ad7-5644f714de19@amd.com
[sean: fold in fixup (see Link above), massage changelog]
Link: https://lore.kernel.org/r/20251016172853.52451-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Diffstat (limited to 'drivers/usb/cdns3/cdns3-debug.h')
0 files changed, 0 insertions, 0 deletions