sched/fair: Relax task_hot() for misfit tasks

Consider the following topology:

  DIE [          ]
  MC  [    ][    ]
       0  1  2  3

  capacity_orig_of(x \in {0-1}) < capacity_orig_of(x \in {2-3})

w/ CPUs 2-3 idle and CPUs 0-1 running CPU hogs (util_avg=1024).

When CPU2 goes through load_balance() (via periodic / NOHZ balance), it
should pull one CPU hog from either CPU0 or CPU1 (this is misfit task
upmigration). However, should a e.g. pcpu kworker awake on CPU0 just before
this load_balance() happens and preempt the CPU hog running there, we would
have, for the [0-1] group at CPU2's DIE level:

o sgs->sum_nr_running > sgs->group_weight
o sgs->group_capacity * 100 < sgs->group_util * imbalance_pct

IOW, this group is group_overloaded.

Considering CPU0 is picked by find_busiest_queue(), we would then visit the
preempted CPU hog in detach_tasks(). However, given it has just been
preempted by this pcpu kworker, task_hot() will prevent it from being
detached. We then leave load_balance() without having done anything.

Long story short, preempted misfit tasks are affected by task_hot(), while
currently running misfit tasks are intentionally preempted by the stopper
task to migrate them over to a higher-capacity CPU.

Align detach_tasks() with the active-balance logic and let it pick a
cache-hot misfit task when the destination CPU can provide a capacity
uplift.

Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
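
For illustration only, the decision described above can be modelled in
standalone C. This is a minimal sketch and not the actual patch: the toy_*
names and struct are hypothetical, and the fit check assumes a ~20% headroom
margin modelled on the kernel's fits_capacity() macro.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for per-task state; not a kernel struct. */
struct toy_task {
	unsigned long util;	/* estimated utilization, 0..1024 */
	bool cache_hot;		/* ran very recently on the source CPU */
};

/* Assumed fit criterion: the task must leave ~20% headroom on the CPU. */
static bool toy_task_fits(const struct toy_task *p, unsigned long capacity)
{
	return p->util * 1280 < capacity * 1024;
}

/*
 * A migration is a misfit uplift if the task does not fit on the source
 * CPU and the destination CPU has a strictly higher capacity.
 */
static bool toy_misfit_uplift(const struct toy_task *p,
			      unsigned long src_cap, unsigned long dst_cap)
{
	return !toy_task_fits(p, src_cap) && dst_cap > src_cap;
}

/*
 * Relaxed detach rule: a cache-cold task can always be detached; a
 * cache-hot task can only be detached for a misfit uplift.
 */
static bool toy_can_detach(const struct toy_task *p,
			   unsigned long src_cap, unsigned long dst_cap)
{
	if (!p->cache_hot)
		return true;

	return toy_misfit_uplift(p, src_cap, dst_cap);
}
```

With a just-preempted CPU hog (util=1024, cache-hot) on a CPU of capacity 446
(an arbitrary example value) and an idle destination CPU of capacity 1024,
toy_can_detach() returns true, whereas a small cache-hot task that already
fits on the source CPU is still left alone.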