mm: FLEXIBLE_THP for improved performance
Introduce the FLEXIBLE_THP feature, which allows anonymous memory to be
allocated in large folios of a specified order. All pages of the large
folio are pte-mapped during the same page fault, significantly reducing
the number of page faults. The number of per-page operations (e.g. ref
counting, rmap management, lru list management) is also significantly
reduced since those ops now become per-folio.
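
As a rough illustration of the fault-count reduction, the sketch below
computes how many faults are needed to populate a region when every
fault maps a full folio of the given order. The helper name is
hypothetical, and the model assumes no fallback to lower orders ever
happens, so it gives a best case rather than a measurement.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of this patch: number of page faults
 * needed to populate nr_pages when each fault maps 2^order pages.
 * Assumes every fault succeeds at the full order (no fallback).
 */
static unsigned long sketch_faults_needed(unsigned long nr_pages,
					  unsigned int order)
{
	unsigned long pages_per_fault;

	/* Guard against shifting by the full width of the type. */
	assert(order < sizeof(unsigned long) * 8);
	pages_per_fault = 1UL << order;

	/* Round up: a partially covered tail still costs one fault. */
	return (nr_pages + pages_per_fault - 1) / pages_per_fault;
}
```

Under this model, touching 512 pages takes 512 faults at order 0 and 32
faults at order 4.
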
The new behaviour is hidden behind the new FLEXIBLE_THP Kconfig, which
defaults to disabled for now; there is a long list of todos to make
FLEXIBLE_THP robust with existing features (e.g. compaction, mlock, some
madvise ops, etc). These items will be tackled in subsequent patches.
When enabled, the preferred folio order is as returned by
arch_wants_pte_order(), which may be overridden by the arch as it sees
fit. Some architectures (e.g. arm64) can coalesce TLB entries if a
contiguous set of ptes map physically contiguous, naturally aligned
memory, so this mechanism allows the architecture to optimize as
required.
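
The coalescing condition described above can be sketched in user-space
C as follows. This is only an illustration: the function name and the
representation of ptes as an array of page frame numbers are
assumptions, and a real architecture would likely impose further
conditions (for example matching attributes across the ptes), which are
omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check, not taken from any architecture: do nr
 * consecutive ptes (represented here only by the page frame numbers
 * they map) cover physically contiguous, naturally aligned memory?
 * nr must be a power of two.
 */
static bool sketch_can_coalesce(const unsigned long *pfns, size_t nr)
{
	assert(nr != 0 && (nr & (nr - 1)) == 0);

	/* Natural alignment: the first pfn is a multiple of nr. */
	if (pfns[0] & (nr - 1))
		return false;

	/* Physical contiguity: each pfn follows the previous one. */
	for (size_t i = 1; i < nr; i++) {
		if (pfns[i] != pfns[0] + i)
			return false;
	}

	return true;
}
```
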
If the preferred order can't be used (e.g. because the folio would
breach the bounds of the vma, or because ptes in the region are already
mapped) then we fall back to a suitable lower order.
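
The fallback can be pictured with the user-space sketch below. It is
not the code in this patch: the struct, the function names, the 4K page
size and the assumption that a candidate range is naturally aligned in
the virtual address space are all illustrative. Starting from the
preferred order, it steps down until the range around the faulting
address fits inside the vma and contains no already-mapped ptes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumes 4K pages */

/* Hypothetical stand-in for a vma covering [start, end). */
struct sketch_vma {
	unsigned long start;
	unsigned long end;
	/* populated[i] is true if page i of the vma is already mapped. */
	const bool *populated;
};

/* True if none of the nr ptes starting at addr is mapped yet. */
static bool sketch_range_is_none(const struct sketch_vma *vma,
				 unsigned long addr, unsigned long nr)
{
	size_t first = (addr - vma->start) >> SKETCH_PAGE_SHIFT;

	for (unsigned long i = 0; i < nr; i++) {
		if (vma->populated[first + i])
			return false;
	}
	return true;
}

/* Highest usable order <= preferred_order; 0 means a single page. */
static int sketch_pick_order(const struct sketch_vma *vma,
			     unsigned long fault_addr, int preferred_order)
{
	assert(fault_addr >= vma->start && fault_addr < vma->end);

	for (int order = preferred_order; order > 0; order--) {
		unsigned long nr = 1UL << order;
		unsigned long size = nr << SKETCH_PAGE_SHIFT;
		unsigned long base = fault_addr & ~(size - 1);

		/* The folio would breach the bounds of the vma. */
		if (base < vma->start || base + size > vma->end)
			continue;
		/* Some ptes in the region are already mapped. */
		if (!sketch_range_is_none(vma, base, nr))
			continue;
		return order;
	}
	return 0;
}
```

For example, with a 16-page vma and a preferred order of 4, a fault in
an otherwise empty vma yields order 4, while a single mapped pte in the
upper half of the vma drops a fault in the lower half to order 3.
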
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>