ocfs2: dlm: fix race between purge and get lock resource
authorJunxiao Bi <[email protected]>
Tue, 5 May 2015 23:24:02 +0000 (16:24 -0700)
committerLinus Torvalds <[email protected]>
Wed, 6 May 2015 00:10:11 +0000 (17:10 -0700)
There is a race window in dlm_get_lock_resource(), which may return a
lock resource which has been purged.  This will cause the process to
hang forever in dlmlock() as the ast msg can't be handled due to its
lock resource not existing.

    dlm_get_lock_resource {
        ...
        spin_lock(&dlm->spinlock);
        tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
        if (tmpres) {
             spin_unlock(&dlm->spinlock);
             >>>>>>>> race window, dlm_run_purge_list() may run and purge
                              the lock resource
             spin_lock(&tmpres->spinlock);
             ...
             spin_unlock(&tmpres->spinlock);
        }
    }

Signed-off-by: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
fs/ocfs2/dlm/dlmmaster.c

index a6944b25fd5b5ddba427ac66c8f50fb582cf1451..fdf4b41d0609a00e591afeed69ea7cc2ab9d5254 100644 (file)
@@ -757,6 +757,19 @@ lookup:
        if (tmpres) {
                spin_unlock(&dlm->spinlock);
                spin_lock(&tmpres->spinlock);
+
+               /*
+                * Right after dlm spinlock was released, dlm_thread could have
+                * purged the lockres. Check if lockres got unhashed. If so
+                * start over.
+                */
+               if (hlist_unhashed(&tmpres->hash_node)) {
+                       spin_unlock(&tmpres->spinlock);
+                       dlm_lockres_put(tmpres);
+                       tmpres = NULL;
+                       goto lookup;
+               }
+
                /* Wait on the thread that is mastering the resource */
                if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
                        __dlm_wait_on_lockres(tmpres);