oom: don't assume that a coredumping thread will exit soon
author Oleg Nesterov <[email protected]>
Sat, 13 Dec 2014 00:56:24 +0000 (16:56 -0800)
committer Linus Torvalds <[email protected]>
Sat, 13 Dec 2014 20:42:49 +0000 (12:42 -0800)
commit d003f371b27016354c392464819530d47a915765
tree 0615df29b862f71b0ef14d8079218ba9edb201fd
parent ba914f481507a0542a7c8a3fc15d89414bc2ebf3

oom_kill.c assumes that a PF_EXITING task will exit and free its memory
soon.  This is wrong in many ways, and one important case is the coredump:
a task can sleep in exit_mm() "forever" while the coredumping sub-thread
may still need more memory.

Change the PF_EXITING checks to take SIGNAL_GROUP_COREDUMP into account,
and add a new trivial helper for that.
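
A minimal user-space sketch of such a check follows. It is not the patch
itself: the struct layouts and flag values are simplified stand-ins, and the
helper name exiting_task_may_free_mem() is hypothetical, since this message
does not name the helper. It only shows the idea of treating PF_EXITING as
"will free memory soon" unless the thread group is dumping core.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel flags; the values are assumptions. */
#define PF_EXITING            0x00000004u  /* task is on its way out */
#define SIGNAL_GROUP_COREDUMP 0x00000008u  /* thread group is dumping core */

/* Simplified stand-ins for the kernel structures of the same names. */
struct signal_struct {
	unsigned int flags;
};

struct task_struct {
	unsigned int flags;
	struct signal_struct *signal;
};

/*
 * Hypothetical helper: an exiting task is expected to release memory
 * promptly only if its thread group is not in the middle of a coredump,
 * because such a task may sleep in exit_mm() for a long time.
 */
static inline bool exiting_task_may_free_mem(const struct task_struct *task)
{
	return (task->flags & PF_EXITING) &&
	       !(task->signal->flags & SIGNAL_GROUP_COREDUMP);
}

/* Small self-check covering the three interesting combinations. */
void sketch_self_check(void)
{
	struct signal_struct sig = { .flags = 0 };
	struct task_struct task = { .flags = 0, .signal = &sig };

	/* Not exiting: nothing to wait for. */
	assert(!exiting_task_may_free_mem(&task));

	/* Exiting, no coredump: expected to free memory soon. */
	task.flags = PF_EXITING;
	assert(exiting_task_may_free_mem(&task));

	/* Exiting, but the group is dumping core: do not count on it. */
	sig.flags = SIGNAL_GROUP_COREDUMP;
	assert(!exiting_task_may_free_mem(&task));
}
```

A caller that previously tested PF_EXITING directly would test the helper
instead, so a task stuck behind a coredump no longer holds off the oom-killer.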

Note: this is only the first step; this patch doesn't try to solve the other
problems.  The SIGNAL_GROUP_COREDUMP check is obviously racy: a task can
participate in a coredump after it was already observed in the PF_EXITING
state, so TIF_MEMDIE (which also blocks the oom-killer) can still be wrongly
set.  fatal_signal_pending() can be true because of SIGNAL_GROUP_COREDUMP, so
out_of_memory() and mem_cgroup_out_of_memory() shouldn't blindly trust it.
And even the name/usage of the new helper is confusing: an exiting thread
can only free its ->mm if it is the only/last task in its thread group.

[[email protected]: add comment]
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Cong Wang <[email protected]>
Acked-by: David Rientjes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
include/linux/oom.h
mm/memcontrol.c
mm/oom_kill.c