arm64/mm: Update range-based tlb invalidation routines for FEAT_LPA2 (f2e448ec) · Commits · linux-arm / linux-rr

Commit f2e448ec authored Sep 14, 2023 by Ryan Roberts

arm64/mm: Update range-based tlb invalidation routines for FEAT_LPA2



The BADDR field of the range-based tlbi instructions is specified in
64KB units when LPA2 is in use (TCR.DS=1), whereas it is in page units
otherwise.
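
As a rough illustration of the unit change (not the kernel's code), the
sketch below expresses a virtual address in BADDR units. The helper name
baddr_units() and its page_shift parameter are hypothetical, introduced
here only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: express a virtual address
 * in the units used by the BADDR field. With LPA2 the unit is 64KB
 * (shift of 16); otherwise it is the page size (shift of page_shift).
 */
static uint64_t baddr_units(uint64_t va, unsigned int page_shift, bool lpa2)
{
	unsigned int shift = lpa2 ? 16 : page_shift;

	return va >> shift;
}
```

With 4KB pages, 0x30000 and 0x31000 map to different values (0x30 and
0x31) without LPA2, but both map to 0x3 with LPA2. The low bits are lost,
so a range op can only start on a 64KB boundary in the LPA2 case.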

When LPA2 is enabled, use the non-range tlbi instructions to forward
align to a 64KB boundary first, then use range-based tlbi from there on,
until we have either invalidated all pages or we have a single page
remaining. If the latter, that page is invalidated with non-range tlbi.
(Previously we invalidated a single odd page first, but we can no longer
do this because it could break our 64KB alignment.) When LPA2 is not in
use, we don't need the initial alignment step. However, the bigger
impact is that we can no longer use the previous method of iterating
from smallest to largest 'scale', since this would likely unalign the
boundary again for the LPA2 case. So instead we iterate from highest to
lowest scale, which guarantees that we remain 64KB aligned until the
last op (at scale=0).
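
The loop described above can be modelled in user space roughly as
follows. This is a simplified sketch under stated assumptions, not the
kernel macro: it assumes 4KB pages and a stride of one page, it only
counts operations instead of issuing tlbi, and the names range_pages(),
range_num(), struct flush_stats and simulate_flush() are made up for the
example. The page count per range op is taken to be
(num + 1) << (5 * scale + 1), with num in 0..31 and scale in 0..3.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT	12		/* assumption: 4KB pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define SZ_64K		(1UL << 16)

/* Pages covered by one range op: (num + 1) * 2^(5 * scale + 1). */
static unsigned long range_pages(int num, int scale)
{
	return (unsigned long)(num + 1) << (5 * scale + 1);
}

/* Largest usable num at this scale, or -1 if pages is too small. */
static int range_num(unsigned long pages, int scale)
{
	unsigned long max = range_pages(31, scale);
	unsigned long p = pages < max ? pages : max;

	return (int)(p >> (5 * scale + 1)) - 1;
}

struct flush_stats {
	unsigned long single_ops;	/* non-range invalidations */
	unsigned long range_ops;	/* range-based invalidations */
	unsigned long end;		/* address just past the last page */
};

static struct flush_stats simulate_flush(unsigned long start,
					 unsigned long pages, bool lpa2)
{
	struct flush_stats st = { 0, 0, 0 };
	int scale = 3;

	/* The model only handles what a single pass over the scales covers. */
	assert(pages <= range_pages(31, 3));

	while (pages > 0) {
		/* Single page left, or LPA2 and not yet 64KB aligned. */
		if (pages == 1 || (lpa2 && (start & (SZ_64K - 1)))) {
			st.single_ops++;
			start += PAGE_SIZE;
			pages--;
			continue;
		}

		/* Highest scale first; skip scales too large for 'pages'. */
		int num = range_num(pages, scale);
		if (num >= 0) {
			st.range_ops++;
			start += range_pages(num, scale) << PAGE_SHIFT;
			pages -= range_pages(num, scale);
		}
		scale--;
	}

	st.end = start;
	return st;
}
```

In this model, simulate_flush(0x1000, 100, true) performs 15 single-page
ops to reach the 64KB boundary, one range op at scale 1 (64 pages), one
at scale 0 (20 pages) and a final single-page op. With lpa2 false, the
same request needs only two range ops (64 and 36 pages).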

The original commit (d1d3aa98 "arm64: tlb: Use the TLBI RANGE feature in
arm64") stated this as the reason for incrementing scale:

  However, in most scenarios, the pages = 1 when flush_tlb_range() is
  called. Start from scale = 3 or other proper value (such as scale
  =ilog2(pages)), will incur extra overhead. So increase 'scale' from 0
  to maximum, the flush order is exactly opposite to the example.

But pages=1 is already special-cased by the non-range invalidation path,
which will take care of it the first time through the loop (both in the
original commit and in my change), so I don't think switching to
decrementing scale should have any extra performance impact after all.

Note: This patch uses LPA2 range-based tlbi based on the new lpa2 param
passed to __flush_tlb_range_op(). This allows both KVM and the kernel to
opt-in/out of LPA2 usage independently. But once both are converted over
(and keyed off the same static key), the parameter could be dropped and
replaced by the static key directly in the macro.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>

parent 78686efb
