2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
linux/mm
David Hildenbrand 772e5b4a5e mm/mremap: fix WARN with uffd that has remap events disabled
Registering userfaultd on a VMA that spans at least one PMD and then
mremap()'ing that VMA can trigger a WARN when recovering from a failed
page table move due to a page table allocation error.

The code ends up doing the right thing (recurse, avoiding moving actual
page tables), but triggering that WARN is unpleasant:

WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_normal_pmd mm/mremap.c:357 [inline]
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_pgt_entry mm/mremap.c:595 [inline]
WARNING: CPU: 2 PID: 6133 at mm/mremap.c:357 move_page_tables+0x3832/0x44a0 mm/mremap.c:852
Modules linked in:
CPU: 2 UID: 0 PID: 6133 Comm: syz.0.19 Not tainted 6.17.0-rc1-syzkaller-00004-g53e760d89498 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:move_normal_pmd mm/mremap.c:357 [inline]
RIP: 0010:move_pgt_entry mm/mremap.c:595 [inline]
RIP: 0010:move_page_tables+0x3832/0x44a0 mm/mremap.c:852
Code: ...
RSP: 0018:ffffc900037a76d8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000032930007 RCX: ffffffff820c6645
RDX: ffff88802e56a440 RSI: ffffffff820c7201 RDI: 0000000000000007
RBP: ffff888037728fc0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000032930007 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc900037a79a8 R14: 0000000000000001 R15: dffffc0000000000
FS:  000055556316a500(0000) GS:ffff8880d68bc000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30863fff CR3: 0000000050171000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 copy_vma_and_data+0x468/0x790 mm/mremap.c:1215
 move_vma+0x548/0x1780 mm/mremap.c:1282
 mremap_to+0x1b7/0x450 mm/mremap.c:1406
 do_mremap+0xfad/0x1f80 mm/mremap.c:1921
 __do_sys_mremap+0x119/0x170 mm/mremap.c:1977
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f00d0b8ebe9
Code: ...
RSP: 002b:00007ffe5ea5ee98 EFLAGS: 00000246 ORIG_RAX: 0000000000000019
RAX: ffffffffffffffda RBX: 00007f00d0db5fa0 RCX: 00007f00d0b8ebe9
RDX: 0000000000400000 RSI: 0000000000c00000 RDI: 0000200000000000
RBP: 00007ffe5ea5eef0 R08: 0000200000c00000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000002
R13: 00007f00d0db5fa0 R14: 00007f00d0db5fa0 R15: 0000000000000005
 </TASK>

The underlying issue is that we recurse during the original page table
move, but not during the recovery move.

Fix it by checking for both VMAs and performing the check before the
pmd_none() sanity check.

Add a new helper where we perform+document that check for the PMD and PUD
level.

Thanks to Harry for bisecting.

Link: https://lkml.kernel.org/r/20250818175358.1184757-1-david@redhat.com
Fixes: 0cef0bb836 ("mm: clear uffd-wp PTE/PMD state on mremap()")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: syzbot+4d9a13f0797c46a29e42@syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/689bb893.050a0220.7f033.013a.GAE@google.com
Tested-by: Harry Yoo <harry.yoo@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-19 16:35:57 -07:00
..
damon mm/damon/sysfs-schemes: put damos dests dir after removing its files 2025-08-19 16:35:57 -07:00
kasan kasan/test: fix protection against compiler elision 2025-08-05 13:28:46 -07:00
kfence kfence: Remove mention of PG_slab 2025-07-23 11:55:22 +02:00
kmsan kmsan: test: add module description 2025-06-05 22:02:25 -07:00
backing-dev.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
balloon_compaction.c mm/migrate: fix NULL movable_ops if CONFIG_ZSMALLOC=m 2025-08-19 16:35:57 -07:00
bootmem_info.c mm/sparse: allow for alternate vmemmap section init at boot 2025-03-16 22:06:27 -07:00
cma_debug.c mm: cma: simplify cma_maxchunk_get() 2025-07-24 19:12:36 -07:00
cma_sysfs.c mm/cma: export total and free number of pages for CMA areas 2025-03-16 22:06:24 -07:00
cma.c mm: cma: simplify cma_debug_show_areas() 2025-07-24 19:12:36 -07:00
cma.h mm: cma: set early_pfn and bitmap as a union in cma_memrange 2025-05-22 14:55:36 -07:00
compaction.c mm: rename PG_isolated to PG_movable_ops_isolated 2025-07-13 16:38:30 -07:00
debug_page_alloc.c mm/debug_page_alloc: improve error message for invalid guardpage minorder 2025-05-12 23:50:38 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: clear page table entries at destroy_args() 2025-08-19 16:35:54 -07:00
debug.c mm/util: introduce snapshot_page() 2025-07-24 19:12:35 -07:00
dmapool_test.c
dmapool.c docs: dma-api: replace consistent with coherent 2025-07-01 13:25:36 -06:00
early_ioremap.c mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference 2025-01-13 22:40:59 -08:00
execmem.c mm: correct type for vmalloc vm_flags fields 2025-08-02 12:06:13 -07:00
fadvise.c fdget(), trivial conversions 2024-11-03 01:28:06 -05:00
fail_page_alloc.c fault-inject: improve build for CONFIG_FAULT_INJECTION=n 2024-09-01 20:43:33 -07:00
failslab.c fault-inject: improve build for CONFIG_FAULT_INJECTION=n 2024-09-01 20:43:33 -07:00
filemap.c Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
folio-compat.c mm: Remove grab_cache_page_write_begin() 2025-03-04 17:02:25 +00:00
gup_test.c
gup_test.h
gup.c mm: rename PAGE_MAPPING_* to FOLIO_MAPPING_* 2025-07-13 16:38:31 -07:00
highmem.c
hmm.c mm/hmm: move pmd_to_hmm_pfn_flags() to the respective #ifdeffery 2025-07-19 18:59:53 -07:00
huge_memory.c mm/huge_memory: refactor after-split (page) cache code 2025-07-24 19:12:39 -07:00
hugetlb_cgroup.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
hugetlb_cma.c mm/hugetlb: use separate nodemask for bootmem allocations 2025-05-12 23:50:35 -07:00
hugetlb_cma.h mm/hugetlb: move hugetlb CMA code in to its own file 2025-03-16 22:06:31 -07:00
hugetlb_vmemmap.c mm/pagewalk: split walk_page_range_novma() into kernel/user parts 2025-07-09 22:42:05 -07:00
hugetlb_vmemmap.h mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
hugetlb.c mm/page_owner: convert set_page_owner_migrate_reason() to folios 2025-07-19 18:59:57 -07:00
hwpoison-inject.c
init-mm.c mm: replace vm_lock and detached flag with a reference count 2025-03-16 22:06:20 -07:00
internal.h Significant patch series in this pull request: 2025-08-05 16:02:07 +03:00
interval_tree.c
ioremap.c mm/ioremap: pass pgprot_t to ioremap_prot() instead of unsigned long 2025-03-16 22:06:23 -07:00
Kconfig mm: remove mm/io-mapping.c 2025-08-02 12:06:10 -07:00
Kconfig.debug mm: rename GENERIC_PTDUMP and PTDUMP_CORE 2025-03-17 00:05:32 -07:00
khugepaged.c mm: fix the race between collapse and PT_RECLAIM under per-vma lock 2025-08-05 13:28:47 -07:00
kmemleak.c mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup() 2025-08-05 13:28:47 -07:00
ksm.c Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
list_lru.c mm, list_lru: refactor the locking code 2025-07-09 22:41:56 -07:00
maccess.c mm: unexport globally copy_to_kernel_nofault 2025-07-09 22:42:22 -07:00
madvise.c mm/mseal: small cleanups 2025-08-02 12:06:09 -07:00
Makefile mm: remove mm/io-mapping.c 2025-08-02 12:06:10 -07:00
mapping_dirty_helpers.c mm: remove redundant pXd_devmap calls 2025-07-09 22:42:17 -07:00
memblock.c memblock: add KHO support for reserve_mem 2025-05-12 23:50:42 -07:00
memcontrol-v1.c memcg: make count_memcg_events re-entrant safe against irqs 2025-05-22 14:55:38 -07:00
memcontrol-v1.h memcg: move do_memsw_account() to CONFIG_MEMCG_V1 2025-03-21 22:03:11 -07:00
memcontrol.c cgroup: Changes for v6.17 2025-07-31 16:04:19 -07:00
memfd.c mm/memfd: replace deprecated strcpy() with memcpy() in alloc_name() 2025-07-19 18:59:57 -07:00
memory_hotplug.c mm: rename __PageMovable() to page_has_movable_ops() 2025-07-13 16:38:29 -07:00
memory-failure.c mm/memory-failure: fix infinite UCE for VM_PFNMAP pfn 2025-08-19 16:35:56 -07:00
memory-tiers.c mm,memory-tiers: use node-notifier instead of memory-notifier 2025-07-13 16:38:15 -07:00
memory.c Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
mempolicy.c mm: split folio_pte_batch() into folio_pte_batch() and folio_pte_batch_flags() 2025-07-19 18:59:45 -07:00
mempool.c mm: mempool: fix crash in mempool_free() for zero-minimum pools 2025-08-02 12:06:13 -07:00
memremap.c mm/page_alloc: add support for initializing pageblock as isolated 2025-07-13 16:38:17 -07:00
memtest.c
migrate_device.c mm: remove redundant pXd_devmap calls 2025-07-09 22:42:17 -07:00
migrate.c mm/migrate: fix NULL movable_ops if CONFIG_ZSMALLOC=m 2025-08-19 16:35:57 -07:00
mincore.c mm/mincore: hold PTL in mincore_hugetlb 2025-08-02 12:06:10 -07:00
mlock.c mm: split folio_pte_batch() into folio_pte_batch() and folio_pte_batch_flags() 2025-07-19 18:59:45 -07:00
mm_init.c mm/page_alloc: add support for initializing pageblock as isolated 2025-07-13 16:38:17 -07:00
mm_slot.h
mmap_lock.c mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped 2025-08-02 12:06:11 -07:00
mmap.c Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
mmu_gather.c mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables() 2025-05-31 22:46:12 -07:00
mmu_notifier.c Update Christoph's Email address and make it consistent 2025-05-12 23:50:31 -07:00
mmzone.c mm: improve code consistency with zonelist_* helper functions 2024-09-01 20:25:55 -07:00
mprotect.c mm: pass page directly instead of using folio_page 2025-08-11 23:00:59 -07:00
mremap.c mm/mremap: fix WARN with uffd that has remap events disabled 2025-08-19 16:35:57 -07:00
mseal.c mm/mseal: rework mseal apply logic 2025-08-02 12:06:09 -07:00
msync.c
nommu.c Significant patch series in this pull request: 2025-08-05 16:02:07 +03:00
numa_emulation.c mm/fake-numa: allow later numa node hotplug 2025-01-25 20:22:29 -08:00
numa_memblks.c mm: numa_memblks: introduce numa_add_reserved_memblk 2025-05-22 14:55:36 -07:00
numa.c mm/numa: remove unnecessary local variable in alloc_node_data() 2025-05-12 23:50:38 -07:00
oom_kill.c mm/oom_kill: fix trivial typo in comment 2025-03-16 22:05:55 -07:00
page_alloc.c mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info() 2025-07-26 15:08:22 -07:00
page_counter.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
page_ext.c mm,page_ext: derive the node from the pfn 2025-07-13 16:38:16 -07:00
page_frag_cache.c mm/page_alloc: export free_frozen_pages() instead of free_unref_page() 2025-01-13 22:40:31 -08:00
page_idle.c sysfs: treewide: switch back to attribute_group::bin_attrs 2025-06-17 10:44:15 +02:00
page_io.c mm: stop passing a writeback_control structure to swap_writeout 2025-07-09 22:41:58 -07:00
page_isolation.c mm/page_isolation: drop __folio_test_movable() check for large folios 2025-07-13 16:38:29 -07:00
page_owner.c mm/page_owner: convert set_page_owner_migrate_reason() to folios 2025-07-19 18:59:57 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c mm/page_table_check: Batch-check pmds/puds just like ptes 2025-05-09 13:43:07 +01:00
page_vma_mapped.c mm: remove redundant pXd_devmap calls 2025-07-09 22:42:17 -07:00
page-writeback.c mm, vmstat: remove the NR_WRITEBACK_TEMP node_stat_item counter 2025-07-19 18:59:47 -07:00
pagewalk.c mm: remove redundant pXd_devmap calls 2025-07-09 22:42:17 -07:00
percpu-internal.h
percpu-km.c
percpu-stats.c mm: remove outdated filename comment in percpu-stats.c 2025-07-13 16:38:23 -07:00
percpu-vm.c
percpu.c mm/percpu: prevent concurrency problem for pcpu_nr_populated read with spin lock 2025-07-13 16:38:21 -07:00
pgalloc-track.h
pgtable-generic.c mm: remove redundant pXd_devmap calls 2025-07-09 22:42:17 -07:00
process_vm_access.c mm: refactor mm_access() to not return NULL 2024-11-05 16:56:23 -08:00
pt_reclaim.c mm: pgtable: reclaim empty PTE page in madvise(MADV_DONTNEED) 2025-01-13 22:40:48 -08:00
ptdump.c mm/ptdump: take the memory hotplug lock inside ptdump_walk_pgd() 2025-07-09 22:42:20 -07:00
readahead.c readahead: use folio_nr_pages() instead of shift operation 2025-07-19 18:59:53 -07:00
rmap.c mm: add get_and_clear_ptes() and clear_ptes() 2025-08-02 12:06:10 -07:00
rodata_test.c mm/rodata_test: verify test data is unchanged, rather than non-zero 2025-01-13 22:40:38 -08:00
secretmem.c Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
shmem_quota.c
shmem.c Significant patch series in this pull request: 2025-08-05 16:02:07 +03:00
show_mem.c mm, vmstat: remove the NR_WRITEBACK_TEMP node_stat_item counter 2025-07-19 18:59:47 -07:00
shrinker_debug.c mm/shrinker: fix name consistency issue in shrinker_debugfs_rename() 2025-03-17 00:05:40 -07:00
shrinker.c mm: shrinker: avoid memleak in alloc_shrinker_info 2024-10-31 20:27:04 -07:00
shuffle.c
shuffle.h
slab_common.c Update Christoph's Email address and make it consistent 2025-05-12 23:50:31 -07:00
slab.h slab: Add SL_pfmemalloc flag 2025-06-18 13:06:26 +02:00
slub.c printk changes for 6.17 2025-08-04 10:54:36 -07:00
sparse-vmemmap.c mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
sparse.c drivers/base/memory: improve add_boot_memory_block() 2025-03-17 22:07:01 -07:00
swap_cgroup.c mm: swap_cgroup: remove double initialization of locals 2025-03-17 22:06:58 -07:00
swap_state.c - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox 2025-05-31 15:44:16 -07:00
swap.c mm: optimize lru_note_cost() by adding lru_note_cost_unlock_irq() 2025-07-24 19:12:28 -07:00
swap.h mm: stop passing a writeback_control structure to swap_writeout 2025-07-09 22:41:58 -07:00
swapfile.c mm: swap: remove stale comment stale comment in cluster_alloc_swap_entry() 2025-07-24 19:12:34 -07:00
truncate.c - The 2 patch series "zram: support algorithm-specific parameters" from 2025-06-02 16:00:26 -07:00
usercopy.c mm: security: Check early if HARDENED_USERCOPY is enabled 2025-02-28 11:51:31 -08:00
userfaultfd.c userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry 2025-08-11 23:00:59 -07:00
util.c mm/util: introduce snapshot_page() 2025-07-24 19:12:35 -07:00
vma_exec.c mm/vma: use vmg->target to specify target VMA for new VMA merge 2025-07-09 22:42:11 -07:00
vma_init.c mm: convert VM_PFNMAP tracking to pfnmap_track() + pfnmap_untrack() 2025-05-22 14:55:37 -07:00
vma_internal.h mm/vma: move brk() internals to mm/vma.c 2025-01-13 22:40:42 -08:00
vma.c Significant patch series in this pull request: 2025-08-05 16:02:07 +03:00
vma.h mm/mseal: small cleanups 2025-08-02 12:06:09 -07:00
vmalloc.c mm/vmalloc: leave lazy MMU mode on PTE mapping error 2025-07-09 21:07:52 -07:00
vmpressure.c memcg: convert memcg->socket_pressure to u64 2025-07-24 19:12:32 -07:00
vmscan.c Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
vmstat.c mm, vmstat: remove the NR_WRITEBACK_TEMP node_stat_item counter 2025-07-19 18:59:47 -07:00
workingset.c mm: workingset: simplify lockdep check in update_node 2025-05-12 23:50:44 -07:00
zpdesc.h mm: convert "movable" flag in page->mapping to a page flag 2025-07-13 16:38:30 -07:00
zpool.c zsmalloc: prefer the the original page's node for compressed data 2025-05-11 17:48:06 -07:00
zsmalloc.c mm/migrate: fix NULL movable_ops if CONFIG_ZSMALLOC=m 2025-08-19 16:35:57 -07:00
zswap.c mm: stop passing a writeback_control structure to __swap_writepage 2025-07-09 22:41:57 -07:00