sched/fair: Check runnable signal to skip util_est updates (b83e1dfb) · Commits · linux-arm / linux-pg

Commit b83e1dfb authored Mar 27, 2025 by Pierre Gondois

sched/fair: Check runnable signal to skip util_est updates

Commit 50181c0c ("sched/pelt: Avoid underestimation of task
utilization") made it possible to skip decaying util_est to handle the
case where the util_avg signal of a task is decreased due to the
presence of co-scheduled tasks. In such a case, a given task receives
less running time, which lowers its util_avg.

Checking whether the util_avg and runnable signals are within a certain
margin effectively tells whether a task received less CPU time than
desired. The margin represents 10 util (=1% * 1024). However, there can
be 2 different cases:
1. The task is always running.
   In that case, the util_avg value is capped by the relative load of
   the CPU. E.g.: three 100% duty_cycle tasks will only reach a peak
   util_avg of ~340.
2. The task is not always running.
   In that case, the util_avg value will grow slower and reach a lower
   value than if there was no co-scheduled task. However, the util_avg
   of the task is not capped.
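
Case 1 can be illustrated with a small floating-point model. This is
only a sketch: it assumes ~1ms PELT periods, a strict round-robin of one
period per task, and the usual PELT half-life of 32 periods. The kernel
uses fixed-point arithmetic and a real scheduler, so exact values
differ. The function name `simulate` is hypothetical.

```python
# Simplified per-period PELT model (floating point, ~1 ms periods).
SCALE = 1024            # full-scale value of util_avg / runnable
Y = 0.5 ** (1 / 32)     # decay factor: a contribution halves after 32 periods


def simulate(nr_tasks, periods=1000):
    """Return (util_avg, runnable) for one of `nr_tasks` always-runnable
    tasks sharing a CPU in a round-robin of one period each."""
    util = 0.0
    runnable = 0.0
    for t in range(periods):
        running = (t % nr_tasks == 0)  # our task gets 1 period out of nr_tasks
        # util_avg only accumulates while the task is actually running.
        util = util * Y + (SCALE * (1 - Y) if running else 0.0)
        # runnable accumulates whenever the task is running or waiting.
        runnable = runnable * Y + SCALE * (1 - Y)
    return util, runnable


if __name__ == "__main__":
    for n in (1, 3):
        util, runnable = simulate(n)
        print(f"{n} task(s): util_avg ~{util:.0f}, runnable ~{runnable:.0f}")
```

In this model, with three tasks the util_avg of each task oscillates
around 1024 / 3 while its runnable signal saturates near 1024, which is
the capped situation described in case 1.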

This patch aims to prevent util_est from decaying only in case 1.
Indeed, in the PELT computation, the last 4ms contribute to the signals
respectively:
1ms: 22, 2ms: 21, 3ms: 21, 4ms: 20
I.e. a co-scheduled task will create a delta of 84 (=22 + 21 + 21 + 20)
between its runnable and util_avg signals after not running for 4ms.
Thus, a delta of 10 (the margin) between the runnable and util_avg
signals:
- is easy to reach
- takes time to remove
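
The per-millisecond figures can be approximated with a geometric decay
model. The sketch below assumes ~1ms periods and a half-life of 32
periods, and uses floating point, so its rounding may differ by a unit
from the figures quoted above. The helper names are hypothetical.

```python
import math

SCALE = 1024            # full-scale value of the PELT signals
Y = 0.5 ** (1 / 32)     # decay factor: a contribution halves after 32 periods


def period_contribution(age):
    """Weight of the ~1 ms period that ended `age` periods ago (0 = newest)."""
    return SCALE * (1 - Y) * Y ** age


def delta_after_waiting(periods):
    """Gap opened between runnable and util_avg when a task waits
    (runnable but not running) for `periods` consecutive periods."""
    return sum(period_contribution(age) for age in range(periods))


def periods_to_shrink(delta, margin=10):
    """Periods without waiting needed for `delta` to decay to `margin` or less.

    Assumes both signals evolve identically while the task is not waiting,
    so the gap itself simply decays by Y each period."""
    if delta <= margin:
        return 0
    return math.ceil(math.log(margin / delta) / math.log(Y))


if __name__ == "__main__":
    print([round(period_contribution(a), 1) for a in range(4)])
    gap = delta_after_waiting(4)
    print(f"gap after 4 periods: {gap:.1f}")
    print(f"periods to get back under the margin: {periods_to_shrink(gap)}")
```

In this model a single period of waiting already opens a gap above the
margin of 10, while a gap built over 4 periods needs on the order of
100 periods to decay back under it.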

A task is considered always running when its runnable signal reaches
~80% * 1024. This threshold is debatable, but the current condition is
easily triggered and maintains an overestimation of the size of tasks
through util_est.
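
A rough sketch of how the two conditions compare is given below. It
assumes the new check simply adds the ~80% runnable threshold on top of
the existing margin test; the actual diff may be structured
differently. `ALWAYS_RUNNING_THRESHOLD` and both function names are
hypothetical, and `UTIL_EST_MARGIN` mirrors the 10 util margin
mentioned above.

```python
SCALE = 1024
UTIL_EST_MARGIN = SCALE // 100                # 10 util (= 1% * 1024)
ALWAYS_RUNNING_THRESHOLD = SCALE * 80 // 100  # hypothetical name, ~80% * 1024


def skip_decay_base(util_avg, runnable):
    """Base condition: skip the util_est decay whenever runnable exceeds
    util_avg by more than the margin."""
    return util_avg + UTIL_EST_MARGIN < runnable


def skip_decay_new(util_avg, runnable):
    """Assumed new condition: same margin test, but only for tasks whose
    runnable signal indicates they are (almost) always running."""
    return runnable >= ALWAYS_RUNNING_THRESHOLD and skip_decay_base(util_avg, runnable)


if __name__ == "__main__":
    # (util_avg, runnable): an always-running co-scheduled task, then a
    # periodic task that was only briefly delayed.
    for util_avg, runnable in ((340, 1024), (200, 300)):
        print(util_avg, runnable,
              skip_decay_base(util_avg, runnable),
              skip_decay_new(util_avg, runnable))
```

Under these assumptions, the briefly delayed periodic task (case 2)
skips the decay with the base condition but not with the new one, while
the always-running task (case 1) skips it in both.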

Running 5 iterations of speedometer 2.1 on a Pixel6, based on a 6.12
kernel:
Triggering the condition:
- Base condition: triggered ~47%
- New condition: triggered ~10%
Overutilized state:
- Base condition: OU state ~65% of the time
- New condition: OU state ~57% of the time
Energy (using energy counters):
- Base condition: 99884 +/- 936
- New condition: 98857 +/- 1325
Score:
- Base condition: 204 +/- 1.5
- New condition: 201.5 +/- 1.4

So the patch lowers the overutilized state residency and reduces the
score. However, over-estimating tasks can only improve the score.
This patch doesn't solve the initial issue reported by Lukasz Luba at
[1]; ideally, another way to detect that issue should be used.

[1] https://lore.kernel.org/lkml/f1b1b663-3a12-9e5d-932b-b3ffb5f02e14@arm.com/



Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>

parent 833c58ef
