I think we may have a race between try_to_wake_up() and
migrate_live_tasks() -> move_task_off_dead_cpu() when the later one
may end up looping endlessly.
Interrupts are enabled on other CPUs when migration_call(CPU_DEAD, ...) is
called so we may get a race between try_to_wake_up() and
migrate_live_tasks() -> move_task_off_dead_cpu(). The former one may push
a task out of a dead CPU causing the later one to loop endlessly.
Heiko Carstens observed:
| That's exactly what explains a dump I got yesterday. Thanks for fixing! :)
Signed-off-by: Dmitry Adamushko <[email protected]>
Cc: [email protected]
Cc: Lai Jiangshan <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
double_rq_lock(rq_src, rq_dest);
/* Already moved. */
- if (task_cpu(p) != src_cpu)
+ if (task_cpu(p) != src_cpu) {
+ ret = 1;
goto out;
+ }
/* Affinity changed (again). */
if (!cpu_isset(dest_cpu, p->cpus_allowed))
goto out;