mm: pmd dirty emulation in page fault handler
authorMinchan Kim <[email protected]>
Wed, 11 Jan 2017 00:57:51 +0000 (16:57 -0800)
committerLinus Torvalds <[email protected]>
Wed, 11 Jan 2017 02:31:55 +0000 (18:31 -0800)
Andreas reported [1] made a test in jemalloc hang in THP mode in arm64:

  http://lkml.kernel.org/r/[email protected]

The problem is currently page fault handler doesn't supports dirty bit
emulation of pmd for non-HW dirty-bit architecture so that application
stucks until VM marked the pmd dirty.

How the emulation work depends on the architecture.  In case of arm64,
when it set up pte firstly, it sets pte PTE_RDONLY to get a chance to
mark the pte dirty via triggering page fault when store access happens.
Once the page fault occurs, VM marks the pmd dirty and arch code for
setting pmd will clear PTE_RDONLY for application to proceed.

IOW, if VM doesn't mark the pmd dirty, application hangs forever by
repeated fault(i.e., store op but the pmd is PTE_RDONLY).

This patch enables pmd dirty-bit emulation for those architectures.

[1] b8d3c4c3009d, mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called

Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Minchan Kim <[email protected]>
Reported-by: Andreas Schwab <[email protected]>
Tested-by: Andreas Schwab <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Jason Evans <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: <[email protected]> [4.5+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
mm/huge_memory.c

index 72339a646fb14620d0b11fcda4ca65ce03e7d270..9a6bd6c8d55a6691047e516a46c2cf6b931b912d 100644 (file)
@@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd)
 {
        pmd_t entry;
        unsigned long haddr;
+       bool write = vmf->flags & FAULT_FLAG_WRITE;
 
        vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
        if (unlikely(!pmd_same(*vmf->pmd, orig_pmd)))
                goto unlock;
 
        entry = pmd_mkyoung(orig_pmd);
+       if (write)
+               entry = pmd_mkdirty(entry);
        haddr = vmf->address & HPAGE_PMD_MASK;
-       if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry,
-                               vmf->flags & FAULT_FLAG_WRITE))
+       if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write))
                update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd);
 
 unlock: