cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach...
author Tejun Heo <[email protected]>
Thu, 21 Apr 2016 23:06:48 +0000 (19:06 -0400)
committer Tejun Heo <[email protected]>
Mon, 25 Apr 2016 19:45:14 +0000 (15:45 -0400)
commit 5cf1cacb49aee39c3e02ae87068fc3c6430659b0
tree 569518d16093a55169e4ef1cd5399e9a3f620d06
parent bcc981e9ed84c678533299d7eff17d2c81e4d5de
cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback

Since e93ad19d0564 ("cpuset: make mm migration asynchronous"), cpuset
kicks off asynchronous NUMA node migration if necessary during task
migration and flushes it from cpuset_post_attach_flush() which is
called at the end of __cgroup_procs_write().  This is to avoid
performing migration with cgroup_threadgroup_rwsem write-locked which
can lead to deadlock through dependency on kworker creation.

memcg has a similar issue with charge moving, so let's convert it to
an official callback rather than the current one-off cpuset specific
function.  This patch adds cgroup_subsys->post_attach callback and
makes cpuset register cpuset_post_attach_flush() as its ->post_attach.
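
The callback shape described above can be sketched in userspace C.  This
is only an illustrative model, not the kernel code: struct subsys_model,
the cpuset_model_*() helpers and run_migration() are hypothetical names,
and the counters merely stand in for queued mm migration work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct cgroup_subsys. */
struct subsys_model {
	const char *name;
	void (*attach)(void);		/* may queue asynchronous work */
	void (*post_attach)(void);	/* optional: flush that work */
};

static int pending_migrations;	/* stands in for queued migration work */
static int flushed_migrations;	/* work completed by the flush */

static void cpuset_model_attach(void)
{
	pending_migrations++;		/* kick off work, do not wait */
}

static void cpuset_model_post_attach(void)
{
	flushed_migrations += pending_migrations;
	pending_migrations = 0;		/* everything queued is now done */
}

static struct subsys_model subsystems[] = {
	{ "cpuset_model", cpuset_model_attach, cpuset_model_post_attach },
	{ "other_model", NULL, NULL },	/* a subsystem without callbacks */
};

/* Run one migration; returns how many ->post_attach callbacks ran. */
static int run_migration(void)
{
	size_t i, n = sizeof(subsystems) / sizeof(subsystems[0]);
	int called = 0;

	for (i = 0; i < n; i++)
		if (subsystems[i].attach)
			subsystems[i].attach();

	/* Generic loop replaces a one-off, subsystem-specific flush call. */
	for (i = 0; i < n; i++) {
		if (subsystems[i].post_attach) {
			subsystems[i].post_attach();
			called++;
		}
	}

	assert(pending_migrations == 0);	/* nothing left unflushed */
	return called;
}
```

Calling run_migration() once from a test harness invokes only the
callbacks that are set, so another subsystem could opt in by filling in
its own ->post_attach without touching the core loop.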

The conversion is mostly one-to-one except that the new callback is
called under cgroup_mutex.  This is to guarantee that no other
migration operations are started before ->post_attach callbacks are
finished.  cgroup_mutex is one of the outermost mutexes in the system
and has never been and shouldn't be a problem.  We can add specialized
synchronization around __cgroup_procs_write() but I don't think
there's any noticeable benefit.
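
The locking order described above can be modelled in userspace with
pthreads.  This is a rough sketch under stated assumptions: outer_mutex
stands in for cgroup_mutex, threadgroup_lock stands in for the write
side of cgroup_threadgroup_rwsem, and model_procs_write() and
model_post_attach() are hypothetical names rather than kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Assumed stand-ins for the kernel locks named in the commit message. */
static pthread_mutex_t outer_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t threadgroup_lock = PTHREAD_MUTEX_INITIALIZER;

static int post_attach_saw_outer_held;
static int post_attach_saw_threadgroup_free;

static void model_post_attach(void)
{
	/* trylock on an already-locked mutex fails with EBUSY. */
	int rc = pthread_mutex_trylock(&outer_mutex);

	assert(rc == 0 || rc == EBUSY);
	post_attach_saw_outer_held = (rc == EBUSY);
	if (rc == 0)
		pthread_mutex_unlock(&outer_mutex);

	/* The inner lock must already be released when the flush runs. */
	if (pthread_mutex_trylock(&threadgroup_lock) == 0) {
		post_attach_saw_threadgroup_free = 1;
		pthread_mutex_unlock(&threadgroup_lock);
	}
}

static void model_procs_write(void)
{
	pthread_mutex_lock(&outer_mutex);

	pthread_mutex_lock(&threadgroup_lock);
	/* Task migration would happen here and may queue async work. */
	pthread_mutex_unlock(&threadgroup_lock);

	/* Flush while still holding the outer mutex, so no other
	 * migration can start before the callback finishes. */
	model_post_attach();

	pthread_mutex_unlock(&outer_mutex);
}
```

In this model the flush never runs with the inner lock held, which is
the property that avoids the deadlock, while the outer mutex serializes
it against any other caller of model_procs_write().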

Signed-off-by: Tejun Heo <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]> # 4.4+ prerequisite for the next patch
include/linux/cgroup-defs.h
include/linux/cpuset.h
kernel/cgroup.c
kernel/cpuset.c