mm: thp: Introduce anon_orders and anon_always_mask sysfs files

In preparation for adding support for anonymous large folios that are
smaller than PMD-size, introduce two new sysfs files that will be used
to control the new behaviours via the transparent_hugepage interface.
For now, the kernel still only supports PMD-order anonymous THP, so when
reading back anon_orders, it will reflect that. Therefore there are no
behavioural changes intended here.
The bulk of the change is implemented by converting
transhuge_vma_suitable() and hugepage_vma_check() so that they take a
bitfield of orders for which the user wants to determine support, and
the functions filter out all the orders that can't be supported. If
there is only 1 order set in the input then the output can continue to
be treated like a boolean; this is the case for most call sites.
The remainder is copied from Documentation/admin-guide/mm/transhuge.rst,
as modified by this commit. See that file for further details.
By default, allocation of anonymous THPs that are smaller than PMD-size
is disabled. These smaller allocation orders can be enabled by writing
an encoded set of orders as follows::

    echo 0x208 >/sys/kernel/mm/transparent_hugepage/anon_orders

Here an order refers to the number of pages in the large folio as
2^order, and each set bit in the written value represents an enabled
order. For example, setting bit 2 enables order-2 folios, and order-2
means 2^2=4 pages (=16K if the page size is 4K). The example above
enables order-9 (PMD-order) and order-3.
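
As an illustration of the encoding described above, the following is a minimal shell sketch that builds the value to write from a list of orders. The helper name ``orders_to_mask`` is hypothetical and not part of the kernel or this patch; it only performs the bit arithmetic.

```shell
# Hypothetical helper (not part of the patch): OR together one bit per
# requested order and print the result as a hex value.
orders_to_mask() {
    mask=0
    for order in "$@"; do
        mask=$(( mask | (1 << order) ))
    done
    printf '0x%x\n' "$mask"
}

# Orders 9 and 3 set bits 9 and 3: 0x200 | 0x008
orders_to_mask 9 3    # prints 0x208
```

The printed value could then be written to anon_orders with echo, as in the example above.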
When multiple orders are enabled, allocation of each order is attempted,
highest to lowest, until an allocation succeeds. If the
PMD-order is unset, then no PMD-sized THPs will be allocated.
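
To make the highest-to-lowest ordering concrete, the sketch below lists the orders set in a mask in the sequence in which they would be tried. It models only the ordering, not the kernel's allocation code; the helper name ``mask_to_orders_desc`` and the upper bound of bit 31 are assumptions made for illustration.

```shell
# Hypothetical helper: print the orders set in a mask, highest first.
# The scan starts at bit 31, an arbitrary bound chosen for this sketch.
mask_to_orders_desc() {
    mask=$(( $1 ))
    order=31
    out=""
    while [ "$order" -ge 0 ]; do
        if [ $(( (mask >> order) & 1 )) -eq 1 ]; then
            # Append the order, space-separated after the first entry.
            out="$out${out:+ }$order"
        fi
        order=$(( order - 1 ))
    done
    printf '%s\n' "$out"
}

mask_to_orders_desc 0x208    # prints "9 3"
```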
The kernel will ignore any orders that it does not support, so read the
file back to determine which orders are enabled::

    cat /sys/kernel/mm/transparent_hugepage/anon_orders

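
One way to check which requested orders were ignored is to compare the written value with the value read back. The sketch below assumes both values are in a form that shell arithmetic can parse (for example hex with a ``0x`` prefix); the exact read-back format is not stated here, and the helper name ``dropped_orders`` is hypothetical.

```shell
# Hypothetical helper: given the requested mask and the mask read back
# from anon_orders, print the bits that were requested but not enabled.
dropped_orders() {
    requested=$(( $1 ))
    effective=$(( $2 ))
    printf '0x%x\n' $(( requested & ~effective ))
}

# If 0x208 was written but only order-9 is reported as enabled:
dropped_orders 0x208 0x200    # prints 0x8, i.e. order-3 was ignored
```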
For some workloads it may be desirable to limit some THP orders to be
used only for MADV_HUGEPAGE regions, while allowing others to be used
always. For example, a workload may only benefit from PMD-sized THP in
specific areas, but can benefit from 32K-sized THP more generally. In
this case, THP can be enabled in ``madvise`` mode as normal, but
specific orders can be configured to be allocated as if in ``always``
mode. The below example enables orders 9 and 3, with order-9 only
applied to MADV_HUGEPAGE regions, and order-3 applied always::

    echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
    echo 0x208 >/sys/kernel/mm/transparent_hugepage/anon_orders
    echo 0x008 >/sys/kernel/mm/transparent_hugepage/anon_always_mask

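
The effect of the two values in the example above can be sketched with simple bit arithmetic: orders present in both anon_orders and anon_always_mask behave as if in ``always`` mode, and the remaining enabled orders apply only to MADV_HUGEPAGE regions. This is an illustration of the example, not the kernel implementation; the helper name ``split_orders`` is hypothetical, and it assumes that mask bits for orders not enabled in anon_orders have no effect.

```shell
# Hypothetical helper: split the enabled orders into those treated as
# "always" and those limited to MADV_HUGEPAGE regions.
split_orders() {
    orders=$(( $1 ))
    always_mask=$(( $2 ))
    printf 'always=0x%x madvise_only=0x%x\n' \
        $(( orders & always_mask )) $(( orders & ~always_mask ))
}

split_orders 0x208 0x008    # prints "always=0x8 madvise_only=0x200"
```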
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>