2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

9975 Commits

Author SHA1 Message Date
Sabyrzhan Tasbolatov
ae193dd793 kasan: move checks to do_strncpy_from_user
Patch series "kasan: migrate the last module test to kunit", v4.

copy_user_test() is the last KUnit-incompatible test with
CONFIG_KASAN_MODULE_TEST requirement, which we are going to migrate to
KUnit framework and delete the former test and Kconfig as well.

In this patch series:

	- [1/3] move kasan_check_write() and check_object_size() to
		do_strncpy_from_user() to cover with KASAN checks with
		multiple conditions	in strncpy_from_user().

	- [2/3] migrated copy_user_test() to KUnit, where we can also test
		strncpy_from_user() due to [1/4].

		KUnits have been tested on:
		- x86_64 with CONFIG_KASAN_GENERIC. Passed
		- arm64 with CONFIG_KASAN_SW_TAGS. 1 fail. See [1]
		- arm64 with CONFIG_KASAN_HW_TAGS. 1 fail. See [1]
		[1] https://lore.kernel.org/linux-mm/CACzwLxj21h7nCcS2-KA_q7ybe+5pxH0uCDwu64q_9pPsydneWQ@mail.gmail.com/

	- [3/3] delete CONFIG_KASAN_MODULE_TEST and documentation occurrences.


This patch (of 3):

Since in the commit 2865baf54077("x86: support user address masking
instead of non-speculative conditional") do_strncpy_from_user() is called
from multiple places, we should sanitize the kernel *dst memory and size
which were done in strncpy_from_user() previously.

Link: https://lkml.kernel.org/r/20241016131802.3115788-1-snovitoll@gmail.com
Link: https://lkml.kernel.org/r/20241016131802.3115788-2-snovitoll@gmail.com
Fixes: 2865baf540 ("x86: support user address masking instead of non-speculative conditional")
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Hu Haowen <2023002089@link.tyut.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marco Elver <elver@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 00:26:43 -08:00
Andrew Morton
2ec0859039 Merge branch 'mm-hotfixes-stable' into mm-stable
Pick up e7ac4daeed ("mm: count zeromap read and set for swapout and
swapin") in order to move

mm: define obj_cgroup_get() if CONFIG_MEMCG is not defined
mm: zswap: modify zswap_compress() to accept a page instead of a folio
mm: zswap: rename zswap_pool_get() to zswap_pool_tryget()
mm: zswap: modify zswap_stored_pages to be atomic_long_t
mm: zswap: support large folios in zswap_store()
mm: swap: count successful large folio zswap stores in hugepage zswpout stats
mm: zswap: zswap_store_page() will initialize entry after adding to xarray.
mm: add per-order mTHP swpin counters

from mm-unstable into mm-stable.
2024-11-11 00:04:10 -08:00
Luis Chamberlain
2466b31201 tests/module/gen_test_kallsyms.sh: use 0 value for variables
Use 0 for the values as we use them for the return value on init
to keep the test modules simple. This fixes a splat reported

do_init_module: 'test_kallsyms_b'->init suspiciously returned 255, it should follow 0/-E convention
do_init_module: loading module anyway...
CPU: 5 UID: 0 PID: 1873 Comm: modprobe Not tainted 6.12.0-rc3+ #4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2024.08-1 09/18/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x56/0x80
 do_init_module.cold+0x21/0x26
 init_module_from_file+0x88/0xf0
 idempotent_init_module+0x108/0x300
 __x64_sys_finit_module+0x5a/0xb0
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f4f3a718839
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff>
RSP: 002b:00007fff97d1a9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000055b94001ab90 RCX: 00007f4f3a718839
RDX: 0000000000000000 RSI: 000055b910e68a10 RDI: 0000000000000004
RBP: 0000000000000000 R08: 00007f4f3a7f1b20 R09: 000055b94001c5b0
R10: 0000000000000040 R11: 0000000000000246 R12: 000055b910e68a10
R13: 0000000000040000 R14: 000055b94001ad60 R15: 0000000000000000
 </TASK>
do_init_module: 'test_kallsyms_b'->init suspiciously returned 255, it should follow 0/-E convention
do_init_module: loading module anyway...
CPU: 1 UID: 0 PID: 1884 Comm: modprobe Not tainted 6.12.0-rc3+ #4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2024.08-1 09/18/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x56/0x80
 do_init_module.cold+0x21/0x26
 init_module_from_file+0x88/0xf0
 idempotent_init_module+0x108/0x300
 __x64_sys_finit_module+0x5a/0xb0
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7ffaa5d18839

Reported-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-11-07 14:32:55 -08:00
Suren Baghdasaryan
b7fc16a16b mm/codetag: uninline and move pgalloc_tag_copy and pgalloc_tag_split
pgalloc_tag_copy() and pgalloc_tag_split() are sizable and outside of any
performance-critical paths, so it should be fine to uninline them.  Also
move their declarations into pgalloc_tag.h which seems like a more
appropriate place for them.  No functional changes other than uninlining.

Link: https://lkml.kernel.org/r/20241024162318.1640781-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Yu Zhao <yuzhao@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:17 -08:00
Suren Baghdasaryan
4835f747d3 alloc_tag: support for page allocation tag compression
Implement support for storing page allocation tag references directly in
the page flags instead of page extensions.  sysctl.vm.mem_profiling boot
parameter it extended to provide a way for a user to request this mode. 
Enabling compression eliminates memory overhead caused by page_ext and
results in better performance for page allocations.  However this mode
will not work if the number of available page flag bits is insufficient to
address all kernel allocations.  Such condition can happen during boot or
when loading a module.  If this condition is detected, memory allocation
profiling gets disabled with an appropriate warning.  By default
compression mode is disabled.

Link: https://lkml.kernel.org/r/20241023170759.999909-7-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Suren Baghdasaryan
0f9b685626 alloc_tag: populate memory for module tags as needed
The memory reserved for module tags does not need to be backed by physical
pages until there are tags to store there.  Change the way we reserve this
memory to allocate only virtual area for the tags and populate it with
physical pages as needed when we load a module.

[surenb@google.com: avoid execmem_vmap() when !MMU]
  Link: https://lkml.kernel.org/r/20241031233611.3833002-1-surenb@google.com
Link: https://lkml.kernel.org/r/20241023170759.999909-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Suren Baghdasaryan
0db6f8d782 alloc_tag: load module tags into separate contiguous memory
When a module gets unloaded there is a possibility that some of the
allocations it made are still used and therefore the allocation tags
corresponding to these allocations are still referenced.  As such, the
memory for these tags can't be freed.  This is currently handled as an
abnormal situation and module's data section is not being unloaded.  To
handle this situation without keeping module's data in memory, allow
codetags with longer lifespan than the module to be loaded into their own
separate memory.  The in-use memory areas and gaps after module unloading
in this separate memory are tracked using maple trees.  Allocation tags
arrange their separate memory so that it is virtually contiguous and that
will allow simple allocation tag indexing later on in this patchset.  The
size of this virtually contiguous memory is set to store up to 100000
allocation tags.

[surenb@google.com: fix empty codetag module section handling]
  Link: https://lkml.kernel.org/r/20241101000017.3856204-1-surenb@google.com
[akpm@linux-foundation.org: update comment, per Dan]
Link: https://lkml.kernel.org/r/20241023170759.999909-4-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Suren Baghdasaryan
3e09c500bb alloc_tag: introduce shutdown_mem_profiling helper function
Implement a helper function to disable memory allocation profiling and use
it when creation of /proc/allocinfo fails.  Ensure /proc/allocinfo does
not get created when memory allocation profiling is disabled.

Link: https://lkml.kernel.org/r/20241023170759.999909-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiongwei Song <xiongwei.song@windriver.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:16 -08:00
Masami Hiramatsu (Google)
cb6fcef8b4 objpool: fix to make percpu slot allocation more robust
Since gfp & GFP_ATOMIC == GFP_ATOMIC is true for GFP_KERNEL | GFP_HIGH, it
will use kmalloc if user specifies that combination.  Here the reason why
combining the __vmalloc_node() and kmalloc_node() is that the vmalloc does
not support all GFP flag, especially GFP_ATOMIC.  So we should check if
gfp & (GFP_ATOMIC | GFP_KERNEL) != GFP_ATOMIC for vmalloc first.  This
ensures caller can sleep.  And for the robustness, even if vmalloc fails,
it should retry with kmalloc to allocate it.

Link: https://lkml.kernel.org/r/173008598713.1262174.2959179484209897252.stgit@mhiramat.roam.corp.google.com
Fixes: aff1871bfc ("objpool: fix choosing allocation for percpu slots")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/all/CAHk-=whO+vSH+XVRio8byJU8idAWES0SPGVZ7KAVdc4qrV0VUA@mail.gmail.com/
Cc: Leo Yan <leo.yan@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Wu <wuqiang.matt@bytedance.com>
Cc: Mikel Rychliski <mikel@mikelr.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:14:58 -08:00
Jakub Kicinski
2696e451df Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc7).

Conflicts:

drivers/net/ethernet/freescale/enetc/enetc_pf.c
  e15c5506dd ("net: enetc: allocate vf_state during PF probes")
  3774409fd4 ("net: enetc: build enetc_pf_common.c as a separate module")
https://lore.kernel.org/20241105114100.118bd35e@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/ti/am65-cpsw-nuss.c
  de794169cf ("net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7")
  4a7b2ba94a ("net: ethernet: ti: am65-cpsw: Use tstats instead of open coded version")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07 13:44:16 -08:00
David Hildenbrand
2b37c814aa lib/Kconfig.debug: Default STRICT_DEVMEM to "y" on s390
virtio-mem currently depends on !DEVMEM | STRICT_DEVMEM. Let's default
STRICT_DEVMEM to "y" just like we do for arm64 and x86.

There could be ways in the future to filter access to virtio-mem device
memory even without STRICT_DEVMEM, but for now let's just keep it
simple.

Tested-by: Mario Casquero <mcasquer@redhat.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20241025141453.1210600-6-david@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07 10:26:24 +01:00
Wei Yang
38dc8f4952 maple_tree: remove sanity check from mas_wr_slot_store()
After commit 5d659bbb52 ("maple_tree: introduce mas_wr_store_type()"),
the check here is redundant.

Let's remove it.

Link: https://lkml.kernel.org/r/20241017015809.23392-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:16 -08:00
Wei Yang
61e9df7085 maple_tree: calculate new_end when needed
Patch series "Following cleanup after introduce mas_wr_store_type()", v2.

Patch 1 postpone new_end calculation when needed.
Patch 2 removes a unnecessary sanity check in mas_wr_slot_store().


This patch (of 2):

For wr_exact_fit/wr_new_root, we don't need to calculate new_end.

Let's postpone it until necessary.

Link: https://lkml.kernel.org/r/20241017015809.23392-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241017015809.23392-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:15 -08:00
Andy Shevchenko
4a7bba1df0 percpu: add a test case for the specific 64-bit value addition
It might be a corner case when we add UINT_MAX as 64-bit unsigned value to
the percpu variable as it's not the same as -1 (ULONG_LONG_MAX).  Add a
test case for that.

Link: https://lkml.kernel.org/r/20241016182635.1156168-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
908378a30b maple_tree: simplify mas_push_node()
When count is not 0, we know head is valid.  So we can put the assignment
in if (count) instead of checking the head pointer again.

Also count represents current total, we can assign the new total by
increasing the count by one.

Link: https://lkml.kernel.org/r/20241015120746.15850-4-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
4223dd93bf maple_tree: total is not changed for nomem_one case
If it jumps to nomem_one, the total allocated number is not changed.  So
we don't need to adjust it.

For the nomem_bulk case, we know there is a valid mas->alloc.  So we don't
need to do the check.

Link: https://lkml.kernel.org/r/20241015120746.15850-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
e852cb1d00 maple_tree: clear request_count for new allocated one
Patch series "maple_tree: simplify mas_push_node()", v2.

When count is not 0, we know head is valid.  So we can put the assignment
in if (count) instead of checking the head pointer again.

Also count represents current total, we can assign the new total by
increasing the count by one.


This patch (of 3):

If this is not a new allocated one, the request_count has already been
cleared in mas_set_alloc_req().

Link: https://lkml.kernel.org/r/20241015120746.15850-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241015120746.15850-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:14 -08:00
Wei Yang
0cc8d68abe maple_tree: root node could be handled by !p_slot too
For a root node, mte_parent_slot() return 0, this exactly fits the
following !p_slot check.

So we can remove the special handling for root node.

Link: https://lkml.kernel.org/r/20240913063128.27391-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:13 -08:00
Jiazi Li
5b2100f723 maple_tree: fix alloc node fail issue
In the following code, the second call to the mas_node_count will return
-ENOMEM:

	mas_node_count(mas, MAPLE_ALLOC_SLOTS + 1);
	mas_node_count(mas, MAPLE_ALLOC_SLOTS * 2 + 2);

This is because there may be some full maple_alloc node in current maple
state.  Use full maple_alloc node will make max_req equal to 0.  And it
leads to mt_alloc_bulk return 0.  As a result, mas_node_count set mas.node
to MA_ERROR(-ENOMEM).

Find a non-full maple_alloc node, and if necessary, use this non-full node
in the next while loop.

Link: https://lkml.kernel.org/r/20240626160631.3636515-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:13 -08:00
Sidhartha Kumar
f0c99037a0 maple_tree: refactor mas_wr_store_type()
In mas_wr_store_type(), we check if new_end < mt_slots[wr_mas->type].  If
this check fails, we know that ,after this, new_end is >= mt_min_slots. 
Checking this again when we detect a wr_node_store later in the function
is reduntant.  Because this check is part of an OR statement, the
statement will always evaluate to true, therefore we can just get rid of
it.

We also refactor mas_wr_store_type() to return the store type rather than
set it directly as it greatly cleans up the function.

Link: https://lkml.kernel.org/r/20241011214451.7286-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha <sidhartha.kumar@oracle.com>
Suggested-by: Liam Howlett <liam.howlett@oracle.com>
Suggested-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:12 -08:00
Lorenzo Stoakes
b314e21596 maple_tree: do not hash pointers on dump in debug mode
Many maple tree values output when an mt_validate() or equivalent hits an
issue utilise tagged pointers, most notably parent nodes. Also some
pivots/slots contain meaningful values, output as pointers, such as the
index of the last entry with data for example.

All pointer values such as this are destroyed by kernel pointer hashing
rendering the debug output obtained from CONFIG_DEBUG_VM_MAPLE_TREE
considerably less usable.

Update this code to output the raw pointers using %px rather than %p when
CONFIG_DEBUG_VM_MAPLE_TREE is defined. This is justified, as the use of
this configuration flag indicates that this is a test environment.

Userland does not understand %px, so use %p there.

In an abundance of caution, if CONFIG_DEBUG_VM_MAPLE_TREE is not set, also
use %p to avoid exposing raw kernel pointers except when we are positive a
testing mode is enabled.

This was inspired by the investigation performed in recent debugging
efforts around a maple tree regression [0] where kernel pointer tagging had
to be disabled in order to obtain truly meaningful and useful data.

[0]:https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/

Link: https://lkml.kernel.org/r/20241007115335.90104-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06 20:11:09 -08:00
Sui Jingfeng
e01caa2b63 lib/scatterlist: use sg_phys() helper
This shorten the length of code in horizential direction, therefore is
easier to read.

Link: https://lkml.kernel.org/r/20241028182920.1025819-1-sui.jingfeng@linux.dev
Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:40 -08:00
Kuan-Wei Chiu
d559bb2c6d lib/test_min_heap: update min_heap_callbacks to use default builtin swap
Replace the swp function pointer in the min_heap_callbacks of
test_min_heap with NULL, allowing direct usage of the default builtin swap
implementation.  This modification simplifies the code and improves
performance by removing unnecessary function indirection.

Link: https://lkml.kernel.org/r/20241020040200.939973-5-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:35 -08:00
Kuan-Wei Chiu
92a8b224b8 lib/min_heap: introduce non-inline versions of min heap API functions
Patch series "Enhance min heap API with non-inline functions and
optimizations", v2.

Add non-inline versions of the min heap API functions in lib/min_heap.c
and updates all users outside of kernel/events/core.c to use these
non-inline versions.  To mitigate the performance impact of indirect
function calls caused by the non-inline versions of the swap and compare
functions, a builtin swap has been introduced that swaps elements based on
their size.  Additionally, it micro-optimizes the efficiency of the min
heap by pre-scaling the counter, following the same approach as in
lib/sort.c.  Documentation for the min heap API has also been added to the
core-api section.


This patch (of 10):

All current min heap API functions are marked with '__always_inline'. 
However, as the number of users increases, inlining these functions
everywhere leads to a increase in kernel size.

In performance-critical paths, such as when perf events are enabled and
min heap functions are called on every context switch, it is important to
retain the inline versions for optimal performance.  To balance this, the
original inline functions are kept, and additional non-inline versions of
the functions have been added in lib/min_heap.c.

Link: https://lkml.kernel.org/r/20241020040200.939973-1-visitorckw@gmail.com
Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
Link: https://lkml.kernel.org/r/20241020040200.939973-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:34 -08:00
Kuan-Wei Chiu
908ef9bb4b lib/list_sort: remove unnecessary header includes
Patch series "Remove unnecessary header includes from
{tools/}lib/list_sort.c".

Remove outdated and unnecessary header includes from lib/list_sort.c and
tools/lib/list_sort.c.  Additionally, update the hunk exceptions checked
by check_headers.sh to reflect these changes.


This patch (of 3):

After commit 043b3f7b63 ("lib/list_sort: simplify and remove
MAX_LIST_LENGTH_BITS"), list_sort.c no longer uses ARRAY_SIZE() (which
required kernel.h and bug.h for BUILD_BUG_ON_ZERO via __must_be_array) or
memset() (which required string.h).  As these headers are no longer
needed, removes them.

There are no changes to the generated code, as confirmed by 'objdump -d'. 
Additionally, 'wc -l' shows that the size of lib/.list_sort.o.cmd is
reduced from 259 lines to 101 lines.

Link: https://lkml.kernel.org/r/20241012042828.471614-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20241012042828.471614-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:33 -08:00
Kuan-Wei Chiu
bf9850f6ea lib/Makefile: make union-find compilation conditional on CONFIG_CPUSETS
Currently, cpuset is the only user of the union-find implementation. 
Compiling union-find in all configurations unnecessarily increases the
code size when building the kernel without cgroup support.  Modify the
build system to compile union-find only when CONFIG_CPUSETS is enabled.

Link: https://lore.kernel.org/lkml/1ccd6411-5002-4574-bb8e-3e64bba6a757@redhat.com/
Link: https://lkml.kernel.org/r/20241011141214.87096-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Waiman Long <llong@redhat.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Xavier <xavier_qy@163.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:32 -08:00
Vinicius Peixoto
5d04270708 lib/crc16_kunit.c: add KUnit tests for crc16
Add Kunit tests for the kernel's implementation of the standard CRC-16
algorithm (<linux/crc16.h>).  The test data consists of 100
randomly-generated test cases, validated against a naive CRC-16
implementation.

This test follows roughly the same logic as lib/crc32test.c, but without
the performance measurements.

Link: https://lkml.kernel.org/r/20241012-crc16-kunit-v3-1-0ca75cb58ca9@lkcamp.dev
Signed-off-by: Vinicius Peixoto <vpeixoto@lkcamp.dev>
Co-developed-by: Enzo Bertoloti <ebertoloti@lkcamp.dev>
Signed-off-by: Enzo Bertoloti <ebertoloti@lkcamp.dev>
Co-developed-by: Fabricio Gasperin <fgasperin@lkcamp.dev>
Signed-off-by: Fabricio Gasperin <fgasperin@lkcamp.dev>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:31 -08:00
I Hsin Cheng
5a3c9366cb list: test: check the size of every lists for list_cut_position*()
Check the total number of elements in both resultant lists are correct
within list_cut_position*().  Previously, only the first list's size was
checked.  so additional elements in the second list would not have been
caught.

Link: https://lkml.kernel.org/r/20241008065253.26673-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:30 -08:00
Kuan-Wei Chiu
b42166427b lib/Kconfig.debug: move int_pow test option to runtime testing section
When executing 'make menuconfig' with KUNIT enabled, the int_pow test
option appears on the first page of the main menu instead of under the
runtime testing section.  Relocate the int_pow test configuration to the
appropriate runtime testing submenu, ensuring a more organized and logical
structure in the menu configuration.

Link: https://lkml.kernel.org/r/20241005222221.2154393-1-visitorckw@gmail.com
Fixes: 7fcc9b5321 ("lib/math: Add int_pow test suite")
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: David Gow <davidgow@google.com>
Cc: Luis Felipe Hernandez <luis.hernandez093@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:30 -08:00
Wei Yang
5059aa6334 maple_tree: memset maple_big_node as a whole
In mast_fill_bnode(), we first clear some fields of maple_big_node and set
the 'type' unconditionally before return.  This means we won't leverage
any information in maple_big_node and it is safe to clear the whole
structure.

In maple_big_node, we define slot and padding/gap in a union.  And based
on current definition of MAPLE_BIG_NODE_SLOTS/GAPS, padding is always less
than slot and part of the gap is overlapped by slot.

For example on 64bit system:

  MAPLE_BIG_NODE_SLOT is 34
  MAPLE_BIG_NODE_GAP  is 21

With this knowledge, current code may clear some space by twice. And
this could be avoid by clearing the structure as a whole.

Link: https://lkml.kernel.org/r/20240908140554.20378-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:24 -08:00
Wei Yang
f36ba81081 maple_tree: remove maple_big_node.parent
Patch series "Reduce the space to be cleared for maple_big_node", v2.

Found current code may clear maple_big_node redundantly.

First we define a field parent, which is never used.  After removing this,
we reduce the size of memory to be cleared by memset.

Then mast_fill_bnode() clears part of the structure twice, since slot and
gap share some space.  By clearing the whole structure, we can avoid this.


This patch (of 2):

The member parent of maple_big_node is never used.

Let's remove it which could reduce the number of space to be cleared on
memset.

Link: https://lkml.kernel.org/r/20240908140554.20378-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20240908140554.20378-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:24 -08:00
Wei Yang
1c148069b2 maple_tree: goto complete directly on a pivot of 0
When we break the loop after assigning a pivot, the index i/j is not
changed.  Then the following code assign pivot, which means we do the
assignment with same i/j by mas_safe_pivot.

Since the loop condition is (i < piv_end), from which we can get i is less
than mt_pivots[mt].  It implies mas_safe_pivot() return pivot[i] which is
the same value we get in loop.

Now we can conclude it does a redundant assignment on a pivot of 0.  Let's
just go to complete to avoid it.

Link: https://lkml.kernel.org/r/20240911142759.20989-3-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:24 -08:00
Wei Yang
8c7904a8cd maple_tree: i is always less than or equal to mas_end
Patch series "refine mas_mab_cp()".

By analysis of the code, one condition check can be removed and one case
would hit a redundant assignment.


This patch (of 2):

mas_mab_cp() copy range [mas_start, mas_end] inclusively from a
maple_node to maple_big_node. This implies mas_start <= mas_end.

Based on the relationship of mas_start and mas_end, we can have the
following four cases:

                 | mas_start == mas_end |  mas_start < mas_end
  ---------------+----------------------+----------------------
  mas_start == 0 |         1            |          2
  ---------------+----------------------+----------------------
  mas_start != 0 |         3            |          4

We can see in all these four cases, i is always less than or equal to
mas_end after finish the loop:

  Case 1: After assign pivot 0, i is set to 1, which is bigger than
          mas_end 0. So it jumps to complete and skip the check.
  Case 2: After assign pivot 0, i is set to 1.
          ∵ (mas_start < mas_end) && (mas_start == 0)
             ==>  (1 <= mas_end)
          ∵ (i == 1) && (1 <= mas_end)
             ==>  (i <= mas_end)
          ∴ Before loop, we have (i <= mas_end). And we still hold this
             if it skips the loop. For example, (i == mas_end).

          Now let's see what happens in the loop:
          ∵ piv_end = min(mas_end, mt_pivots[mt])
             ==>  (piv_end <= mas_end)
	  ∵ loop condition is (i < piv_end)
	     ==>  (i <= piv_end) on finish the loop both normally or break
          ∵ (i <= piv_end) && (piv_end <= mas_end)
             ==>  (i <= mas_end)
          ∴ After loop, we still get (i <= mas_end) in this case
  Case 3: This case would skip both if clause and loop. So when it comes
          to the check, i is still mas_start which equals to mas_end.
  Case 4: This case would skip the if clause.
          ∵ (mas_start < mas_end) && (i == mas_start)
             ==>  (i < mas_end)
          ∴ Before loop, we have (i < mas_end).
          The loop process is similar with Case 2, so we get the same
	  result.

Now we can conclude in all cases, we get (i <= mas_end) when doing
check. Then it is not necessary to do the check.

Link: https://lkml.kernel.org/r/20240911142759.20989-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20240911142759.20989-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 16:56:23 -08:00
Greg Kroah-Hartman
09fbb82f94 Merge 6.12-rc6 into driver-core-next
We need the driver-core fix/revert in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05 10:11:53 +01:00
Jakub Kicinski
cbf49bed6a bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZyP6TxUcZGFuaWVsQGlv
 Z2VhcmJveC5uZXQACgkQ2yufC7HISINz7QD/RTuJAzPJXPQmjdzMj7pepjnSQH4K
 DnOc1soDqjJPSFkBAMlklDCZqSsFoNtNxagbyILrYQBC/MsV9jngimK46DEN
 =pDzC
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-10-31

We've added 13 non-merge commits during the last 16 day(s) which contain
a total of 16 files changed, 710 insertions(+), 668 deletions(-).

The main changes are:

1) Optimize and homogenize bpf_csum_diff helper for all archs and also
   add a batch of new BPF selftests for it, from Puranjay Mohan.

2) Rewrite and migrate the test_tcp_check_syncookie.sh BPF selftest
   into test_progs so that it can be run in BPF CI, from Alexis Lothoré.

3) Two BPF sockmap selftest fixes, from Zijian Zhang.

4) Small XDP synproxy BPF selftest cleanup to remove IP_DF check,
   from Vincent Li.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Add a selftest for bpf_csum_diff()
  selftests/bpf: Don't mask result of bpf_csum_diff() in test_verifier
  bpf: bpf_csum_diff: Optimize and homogenize for all archs
  net: checksum: Move from32to16() to generic header
  selftests/bpf: remove xdp_synproxy IP_DF check
  selftests/bpf: remove test_tcp_check_syncookie
  selftests/bpf: test MSS value returned with bpf_tcp_gen_syncookie
  selftests/bpf: add ipv4 and dual ipv4/ipv6 support in btf_skc_cls_ingress
  selftests/bpf: get rid of global vars in btf_skc_cls_ingress
  selftests/bpf: add missing ns cleanups in btf_skc_cls_ingress
  selftests/bpf: factorize conn and syncookies tests in a single runner
  selftests/bpf: Fix txmsg_redir of test_txmsg_pull in test_sockmap
  selftests/bpf: Fix msg_verify_data in test_sockmap

====================

Link: https://patch.msgid.link/20241031221543.108853-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 14:44:51 -08:00
Caleb Sander Mateos
61bf0009a7 dim: pass dim_sample to net_dim() by reference
net_dim() is currently passed a struct dim_sample argument by value.
struct dim_sample is 24 bytes. Since this is greater 16 bytes, x86-64
passes it on the stack. All callers have already initialized dim_sample
on the stack, so passing it by value requires pushing a duplicated copy
to the stack. Either witing to the stack and immediately reading it, or
perhaps dereferencing addresses relative to the stack pointer in a chain
of push instructions, seems to perform quite poorly.

In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
94% of which is attributed to the first push instruction to copy
dim_sample on the stack for the call to net_dim():
// Call ktime_get()
  0.26 |4ead2:   call   4ead7 <mlx5e_handle_rx_dim+0x47>
// Pass the address of struct dim in %rdi
       |4ead7:   lea    0x3d0(%rbx),%rdi
// Set dim_sample.pkt_ctr
       |4eade:   mov    %r13d,0x8(%rsp)
// Set dim_sample.byte_ctr
       |4eae3:   mov    %r12d,0xc(%rsp)
// Set dim_sample.event_ctr
  0.15 |4eae8:   mov    %bp,0x10(%rsp)
// Duplicate dim_sample on the stack
 94.16 |4eaed:   push   0x10(%rsp)
  2.79 |4eaf1:   push   0x10(%rsp)
  0.07 |4eaf5:   push   %rax
// Call net_dim()
  0.21 |4eaf6:   call   4eafb <mlx5e_handle_rx_dim+0x6b>

To allow the caller to reuse the struct dim_sample already on the stack,
pass the struct dim_sample by reference to net_dim().

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://patch.msgid.link/20241031002326.3426181-2-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:36:54 -08:00
Caleb Sander Mateos
a865276872 dim: make dim_calc_stats() inputs const pointers
Make the start and end arguments to dim_calc_stats() const pointers
to clarify that the function does not modify their values.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Link: https://patch.msgid.link/20241031002326.3426181-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:35:57 -08:00
Bartosz Golaszewski
a508ef4b1d lib: string_helpers: silence snprintf() output truncation warning
The output of ".%03u" with the unsigned int in range [0, 4294966295] may
get truncated if the target buffer is not 12 bytes. This can't really
happen here as the 'remainder' variable cannot exceed 999 but the
compiler doesn't know it. To make it happy just increase the buffer to
where the warning goes away.

Fixes: 3c9f3681d0 ("[SCSI] lib: add generic helper to print sizes rounded to the correct SI range")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Kees Cook <kees@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20241101205453.9353-1-brgl@bgdev.pl
Signed-off-by: Kees Cook <kees@kernel.org>
2024-11-02 13:08:55 -07:00
Thomas Gleixner
d44d26987b timekeeping: Remove CONFIG_DEBUG_TIMEKEEPING
Since 135225a363 timekeeping_cycles_to_ns() handles large offsets which
would lead to 64bit multiplication overflows correctly. It's also protected
against negative motion of the clocksource unconditionally, which was
exclusive to x86 before.

timekeeping_advance() handles large offsets already correctly.

That means the value of CONFIG_DEBUG_TIMEKEEPING which analyzed these cases
is very close to zero. Remove all of it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20241031120328.536010148@linutronix.de
2024-11-02 10:14:31 +01:00
Ming Lei
341468e0ab lib/iov_iter: fix bvec iterator setup
.bi_size of bvec iterator should be initialized as real max size for
walking, and .bi_bvec_done just counts how many bytes need to be
skipped in the 1st bvec, so .bi_size isn't related with .bi_bvec_done.

This patch fixes bvec iterator initialization, and the inner `size`
check isn't needed any more, so revert Eric Dumazet's commit
7bc802acf193 ("iov-iter: do not return more bytes than requested in
iov_iter_extract_bvec_pages()").

Cc: Eric Dumazet <edumazet@google.com>
Fixes: e4e535bff2 ("iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages")
Reported-by: syzbot+71abe7ab2b70bca770fd@syzkaller.appspotmail.com
Tested-by: syzbot+71abe7ab2b70bca770fd@syzkaller.appspotmail.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-01 20:18:21 -06:00
Linus Torvalds
3dfffd506e arm64 fixes for -rc6
- Fix handling of POR_EL0 during signal delivery so that pushing the
   signal context doesn't fail based on the pkey configuration of the
   interrupted context and align our user-visible behaviour with that of
   x86.
 
 - Fix a bogus pointer being passed to the CPU hotplug code from the
   Arm SDEI driver.
 
 - Re-enable software tag-based KASAN with GCC by using an alternative
   implementation of '__no_sanitize_address'.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmcjr8wQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNL2DB/4tNl7feCA2V4fW/Eu3RzXrHTdJbZvTjLDl
 JjeXPZr4WdGQQMgQ0DPZtpnmeBzd5nswx9WHG9VSsUxc5g+rzWxwvMnUeplDvEXo
 Y/QMUq4JZN3eqDZWPs0mEN4fMI+QOihInErVHvFXaJLcbxYrU5BvfwExgfY53AjT
 ZJEPmF291OL6V4UCWVWggk44BQaTBeWmc4itJcYm6z6mIgAgh84MZGK5M0e582ip
 CRAImDiAPqLxRO9kzKcYthI3FDyyVi1HtiSL1CiNktOXMNz19qPelq1XAnDEyvBt
 TEUitTLTwbUJ0nqi4u7ve09aebneAq8nsGucteYTrBU4U/PRjvQO
 =LTB9
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The important one is a change to the way in which we handle protection
  keys around signal delivery so that we're more closely aligned with
  the x86 behaviour, however there is also a revert of the previous fix
  to disable software tag-based KASAN with GCC, since a workaround
  materialised shortly afterwards.

  I'd love to say we're done with 6.12, but we're aware of some
  longstanding fpsimd register corruption issues that we're almost at
  the bottom of resolving.

  Summary:

   - Fix handling of POR_EL0 during signal delivery so that pushing the
     signal context doesn't fail based on the pkey configuration of the
     interrupted context and align our user-visible behaviour with that
     of x86.

   - Fix a bogus pointer being passed to the CPU hotplug code from the
     Arm SDEI driver.

   - Re-enable software tag-based KASAN with GCC by using an alternative
     implementation of '__no_sanitize_address'"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
  firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
  Revert "kasan: Disable Software Tag-Based KASAN with GCC"
  kasan: Fix Software Tag-Based KASAN with GCC
2024-11-01 07:54:11 -10:00
Linus Torvalds
d56239a82e vfs-6.12-rc6.fixes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZyTGAQAKCRCRxhvAZXjc
 opd6AQCal4omyfS8FYe4VRRZ/0XHouagq99I0U0TAmKkvoKAsgD/XrdE+pSTEkPX
 Pv4T9phh1cZRxcyKVu77UoYkuHJEDAg=
 =Lu9R
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull filesystem fixes from Christian Brauner:
 "VFS:

   - Fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP=y is set

   - Add a get_tree_bdev_flags() helper that allows to modify e.g.,
     whether errors are logged into the filesystem context during
     superblock creation. This is used by erofs to fix a userspace
     regression where an error is currently logged when its used on a
     regular file which is an new allowed mode in erofs.

  netfs:

   - Fix the sysfs debug path in the documentation.

   - Fix iov_iter_get_pages*() for folio queues by skipping the page
     extracation if we're at the end of a folio.

  afs:

   - Fix moving subdirectories to different parent directory.

  autofs:

   - Fix handling of AUTOFS_DEV_IOCTL_TIMEOUT_CMD ioctl in
     validate_dev_ioctl(). The actual ioctl number, not the ioctl
     command needs to be checked for autofs"

* tag 'vfs-6.12-rc6.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
  iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP
  autofs: fix thinko in validate_dev_ioctl()
  iov_iter: Fix iov_iter_get_pages*() for folio_queue
  afs: Fix missing subdir edit when renamed between parent dirs
  doc: correcting the debug path for cachefiles
  erofs: use get_tree_bdev_flags() to avoid misleading messages
  fs/super.c: introduce get_tree_bdev_flags()
2024-11-01 07:37:10 -10:00
Eric Dumazet
a911bad094 dql: annotate data-races around dql->last_obj_cnt
dql->last_obj_cnt is read/written from different contexts,
without any lock synchronization.

Use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20241029191425.2519085-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 19:19:36 -07:00
Jakub Kicinski
5b1c965956 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc6).

Conflicts:

drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
  cbe84e9ad5 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd")
  188a1bf894 ("wifi: mac80211: re-order assigning channel in activate links")
https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/

net/mac80211/cfg.c
  c4382d5ca1 ("wifi: mac80211: update the right link for tx power")
  8dd0498983 ("wifi: mac80211: Fix setting txpower with emulate_chanctx")

drivers/net/ethernet/intel/ice/ice_ptp_hw.h
  6e58c33106 ("ice: fix crash on probe for DPLL enabled E810 LOM")
  e4291b64e1 ("ice: Align E810T GPIO to other products")
  ebb2693f8f ("ice: Read SDP section from NVM for pin definitions")
  ac532f4f42 ("ice: Cleanup unused declarations")
https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 18:10:07 -07:00
Ming Lei
496a51b371 lib/iov_iter.c: initialize bi.bi_idx before iterating over bvec
Initialize bi.bi_idx as 0 before iterating over bvec, otherwise
garbage data can be used as ->bi_idx.

Cc: Christoph Hellwig <hch@lst.de>
Reported-and-tested-by: Klara Modin <klarasmodin@gmail.com>
Fixes: e4e535bff2 ("iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-31 07:40:34 -06:00
Puranjay Mohan
db71aae70e net: checksum: Move from32to16() to generic header
from32to16() is used by lib/checksum.c and also by
arch/parisc/lib/checksum.c. The next patch will use it in the
bpf_csum_diff helper.

Move from32to16() to the include/net/checksum.h as csum_from32to16() and
remove other implementations.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20241026125339.26459-2-puranjay@kernel.org
2024-10-30 15:29:59 +01:00
Linus Torvalds
7fbaacafbc slab fixes for 6.12-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmcgrxcACgkQu+CwddJF
 iJrq9ggAiZ/2c7p23s52LdVhT9GTyV5omVOh2kDztVx4w6RM3RbkhkLWdqt0XUag
 uf1TJe6kOvnCeHEFEEo3sqPj820XebxKDf0GGCdI6a9f4n30ipKH+vWSQ0iutKO/
 dOBdArxr0FGOV5VZR9i3xQ6sUqZXXUbJdte0c0ovp6Q6HDHTeQeKNhOQ2fv33TG/
 7jBh5HVyhI6JE/+TOxrMaklH0IqYBb6z49wdbaN7XBvXVXlb5MtOZy109gfUHDwe
 tfktifyE45VtmF0WdHfxDbCnqyDSG1Jm3wsLDbMq+voJ1BQlUvIZ5Dv4kucYqffm
 VN5HkH6uQ09aoounBoU4g50UYeNpiQ==
 =xAw8
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - Fix for a slub_kunit test warning with MEM_ALLOC_PROFILING_DEBUG (Pei
   Xiao)

 - Fix for a MTE-based KASAN BUG in krealloc() (Qun-Wei Lin)

* tag 'slab-for-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm: krealloc: Fix MTE false alarm in __do_krealloc
  slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof
2024-10-29 16:24:02 -10:00
Ming Lei
e4e535bff2 iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages
The iov_iter_extract_pages interface allows to return physically
discontiguous pages, as long as all but the first and last page
in the array are page aligned and page size.  Rewrite
iov_iter_extract_bvec_pages to take advantage of that instead of only
returning ranges of physically contiguous pages.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[hch: minor cleanups, new commit log]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241024050021.627350-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-29 09:14:28 -06:00
Arnd Bergmann
5a8b4b4001 lib/iomem_copy: fix kerneldoc format style
The newly added file did not quite get the punctuation right:

lib/iomem_copy.c:14: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410290907.0mDZVYPK-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-29 07:14:29 +00:00
Arnd Bergmann
af08475370 selftests: kallsyms: add MODULE_DESCRIPTION
The newly added test script creates modules that are lacking
a description line in order to build cleanly:

WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_a.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_b.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_c.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/tests/module/test_kallsyms_d.o

Fixes: 84b4a51fce ("selftests: add new kallsyms selftests")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-10-28 15:27:08 -07:00
Julian Vetter
b660d0a2ac
New implementation for IO memcpy and IO memset
The IO memcpy and IO memset functions in asm-generic/io.h simply call
memcpy and memset. This can lead to alignment problems or faults on
architectures that do not define their own version and fall back to
these defaults.
This patch introduces new implementations for IO memcpy and IO memset,
that use read{l,q} accessor functions, align accesses to machine word
size, and resort to byte accesses when the target memory is not aligned.
For new architectures and existing ones that were using the old
fallbacks these functions are save to use, because IO memory constraints
are taken into account. Moreover, architectures with similar
implementations can now use these new versions, not needing to implement
their own.

Reviewed-by: Yann Sionneau <ysionneau@kalrayinc.com>
Signed-off-by: Julian Vetter <jvetter@kalrayinc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-28 21:44:29 +00:00
Nicolas Pitre
1dc82675cb
lib/math/test_div64: add some edge cases relevant to __div64_const32()
Be sure to test the extreme cases with and without bias.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-28 21:44:28 +00:00
Ira Weiny
4261974701 printf: Add print format (%pra) for struct range
The use of struct range in the CXL subsystem is growing.  In particular,
the addition of Dynamic Capacity devices uses struct range in a number
of places which are reported in debug and error messages.

To wit requiring the printing of the start/end fields in each print
became cumbersome.  Dan Williams mentions in [1] that it might be time
to have a print specifier for struct range similar to struct resource.

A few alternatives were considered including '%par', '%r', and '%pn'.
%pra follows that struct range is similar to struct resource (%p[rR])
but needs to be different.  Based on discussions with Petr and Andy
'%pra' was chosen.[2]

Andy also suggested to keep the range prints similar to struct resource
though combined code.  Add hex_range() to handle printing for both
pointer types.

Finally introduce DEFINE_RANGE() as a parallel to DEFINE_RES_*() and use
it in the tests.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: open list <linux-kernel@vger.kernel.org>
Link: https://lore.kernel.org/all/663922b475e50_d54d72945b@dwillia2-xfh.jf.intel.com.notmuch/ [1]
Link: https://lore.kernel.org/all/66cea3bf3332f_f937b29424@iweiny-mobl.notmuch/ [2]
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20241025-cxl-pra-v2-3-123a825daba2@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28 14:32:43 -07:00
Ira Weiny
8e7f07e608 test printf: Add very basic struct resource tests
The printf tests for struct resource were stubbed out.  struct range
printing will leverage the struct resource implementation.

To prevent regression add some basic sanity tests for struct resource.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20241007-dcd-type2-upstream-v4-1-c261ee6eeded@intel.com
Tested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20241025-cxl-pra-v2-1-123a825daba2@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28 14:30:52 -07:00
Hugh Dickins
c749d9b7eb
iov_iter: fix copy_page_from_iter_atomic() if KMAP_LOCAL_FORCE_MAP
generic/077 on x86_32 CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP=y with highmem,
on huge=always tmpfs, issues a warning and then hangs (interruptibly):

WARNING: CPU: 5 PID: 3517 at mm/highmem.c:622 kunmap_local_indexed+0x62/0xc9
CPU: 5 UID: 0 PID: 3517 Comm: cp Not tainted 6.12.0-rc4 #2
...
copy_page_from_iter_atomic+0xa6/0x5ec
generic_perform_write+0xf6/0x1b4
shmem_file_write_iter+0x54/0x67

Fix copy_page_from_iter_atomic() by limiting it in that case
(include/linux/skbuff.h skb_frag_must_loop() does similar).

But going forward, perhaps CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is too
surprising, has outlived its usefulness, and should just be removed?

Fixes: 908a1ad894 ("iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Link: https://lore.kernel.org/r/dd5f0c89-186e-18e1-4f43-19a60f5a9774@google.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-28 13:39:35 +01:00
Eric Biggers
4964a1d91c crypto: api - move crypto_simd_disabled_for_test to lib
Move crypto_simd_disabled_for_test to lib/ so that crypto_simd_usable()
can be used by library code.

This was discussed previously
(https://lore.kernel.org/linux-crypto/20220716062920.210381-4-ebiggers@kernel.org/)
but was not done because there was no use case yet.  However, this is
now needed for the arm64 CRC32 library code.

Tested with:
    export ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
    echo CONFIG_CRC32=y > .config
    echo CONFIG_MODULES=y >> .config
    echo CONFIG_CRYPTO=m >> .config
    echo CONFIG_DEBUG_KERNEL=y >> .config
    echo CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n >> .config
    echo CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y >> .config
    make olddefconfig
    make -j$(nproc)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:11 +08:00
Ard Biesheuvel
16739efac6 crypto: crc32c - Provide crc32c-arch driver for accelerated library code
crc32c-generic is currently backed by the architecture's CRC-32c library
code, which may offer a variety of implementations depending on the
capabilities of the platform. These are not covered by the crypto
subsystem's fuzz testing capabilities because crc32c-generic is the
reference driver that the fuzzing logic uses as a source of truth.

Fix this by providing a crc32c-arch implementation which is based on the
arch library code if available, and modify crc32c-generic so it is
always based on the generic C implementation. If the arch has no CRC-32c
library code, this change does nothing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Ard Biesheuvel
a37e55791f crypto: crc32 - Provide crc32-arch driver for accelerated library code
crc32-generic is currently backed by the architecture's CRC-32 library
code, which may offer a variety of implementations depending on the
capabilities of the platform. These are not covered by the crypto
subsystem's fuzz testing capabilities because crc32-generic is the
reference driver that the fuzzing logic uses as a source of truth.

Fix this by providing a crc32-arch implementation which is based on the
arch library code if available, and modify crc32-generic so it is
always based on the generic C implementation. If the arch has no CRC-32
library code, this change does nothing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-28 18:33:10 +08:00
Paolo Abeni
03fc07a247 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts and no adjacent changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-25 09:08:22 +02:00
Linus Torvalds
c2cd8e4592 Probes fixes for v6.12-rc4(2):
- objpool: Fix choosing allocation for percpu slots
   Fixes to allocate objpool's percpu slots correctly according to the
   GFP flag. It checks whether "any bit" in GFP_ATOMIC is set to choose
   the vmalloc source, but it should check "all bits" in GFP_ATOMIC flag
   is set, because GFP_ATOMIC is a combined flag.
 
 - tracing/probes: Fix MAX_TRACE_ARGS limit handling
   If more than MAX_TRACE_ARGS are passed for creating a probe event, the
   entries over MAX_TRACE_ARG in trace_arg array are not initialized.
   Thus if the kernel accesses those entries, it crashes. This rejects
   creating event if the number of arguments is over MAX_TRACE_ARGS.
 
 - tracing: Consider the NULL character when validating the event length
   A strlen() is used when parsing the event name, and the original code
   does not consider the terminal null byte. Thus it can pass the name
   1 byte longer than the buffer. This fixes to check it correctly.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmcZBJ0ACgkQ2/sHvwUr
 Pxu4qAgAm+mIiCaBGyolsT1oB5EF+9gztbwRtcAOY1811RJZ0XiQPuOwtZfijpBr
 1Pl+SjubRKhLg+lLHEuCQHxkqlTSp+zrjkF+A0hFlB38nJ5P3pIw+b5pM5FCvhY+
 w0tBTwkjiRBS9h1z88c74ciKYA/XR4apcMMUrPQZUCHq8P73Wu/Fo2lhnCVGBs6q
 nYESyrTcOCDR0c6HP9D2GWxQFtbbCyAfotUjX37EIooTcl7ufAr8IPm8jBx7EzCa
 WM841FwbuIgGbFCGYlG1/lOR+Qf7FszKAY5SBJMV/BiyFbxJqZfA5DWfJcrZ9YpW
 pl86oKWyEkidwx8OIiB3Y1enPzUUJQ==
 =8oUB
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - objpool: Fix choosing allocation for percpu slots

   Fixes to allocate objpool's percpu slots correctly according to the
   GFP flag. It checks whether "any bit" in GFP_ATOMIC is set to choose
   the vmalloc source, but it should check "all bits" in GFP_ATOMIC flag
   is set, because GFP_ATOMIC is a combined flag.

 - tracing/probes: Fix MAX_TRACE_ARGS limit handling

   If more than MAX_TRACE_ARGS are passed for creating a probe event,
   the entries over MAX_TRACE_ARG in trace_arg array are not
   initialized. Thus if the kernel accesses those entries, it crashes.
   This rejects creating event if the number of arguments is over
   MAX_TRACE_ARGS.

 - tracing: Consider the NUL character when validating the event length

   A strlen() is used when parsing the event name, and the original code
   does not consider the terminal null byte. Thus it can pass the name
   one byte longer than the buffer. This fixes to check it correctly.

* tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Consider the NULL character when validating the event length
  tracing/probes: Fix MAX_TRACE_ARGS limit handling
  objpool: fix choosing allocation for percpu slots
2024-10-24 13:51:58 -07:00
Luis Chamberlain
84b4a51fce selftests: add new kallsyms selftests
We lack find_symbol() selftests, so add one. This let's us stress test
improvements easily on find_symbol() or optimizations. It also inherently
allows us to test the limits of kallsyms on Linux today.

We test a pathalogical use case for kallsyms by introducing modules
which are automatically written for us with a larger number of symbols.
We have 4 kallsyms test modules:

A: has KALLSYSMS_NUMSYMS exported symbols
B: uses one of A's symbols
C: adds KALLSYMS_SCALE_FACTOR * KALLSYSMS_NUMSYMS exported
D: adds 2 * the symbols than C

By using anything much larger than KALLSYSMS_NUMSYMS as 10,000 and
KALLSYMS_SCALE_FACTOR of 8 we segfault today. So we're capped at
around 160000 symbols somehow today. We can inpsect that issue at
our leasure later, but for now the real value to this test is that
this will easily allow us to test improvements on find_symbol().

We want to enable this test on allyesmodconfig builds so we can't
use this combination, so instead just use a safe value for now and
be informative on the Kconfig symbol documentation about where our
thresholds are for testers. We default then to KALLSYSMS_NUMSYMS of
just 100 and KALLSYMS_SCALE_FACTOR of 8.

On x86_64 we can use perf, for other architectures we just use 'time'
and allow for customizations. For example a future enhancements could
be done for parisc to check for unaligned accesses which triggers a
special special exception handler assembler code inside the kernel.
The negative impact on performance is so large on parisc that it
keeps track of its accesses on /proc/cpuinfo as UAH:

IRQ:       CPU0       CPU1
3:       1332          0         SuperIO  ttyS0
7:    1270013          0         SuperIO  pata_ns87415
64:  320023012  320021431             CPU  timer
65:   17080507   20624423             CPU  IPI
UAH:   10948640      58104   Unaligned access handler traps

While at it, this tidies up lib/ test modules to allow us to have
a new directory for them. The amount of test modules under lib/
is insane.

This should also hopefully showcase how to start doing basic
self module writing code, which may be more useful for more complex
cases later in the future.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-10-24 10:14:12 -07:00
David Howells
e65a0dc1ca
iov_iter: Fix iov_iter_get_pages*() for folio_queue
p9_get_mapped_pages() uses iov_iter_get_pages_alloc2() to extract pages
from an iterator when performing a zero-copy request and under some
circumstances, this crashes with odd page errors[1], for example, I see:

    page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbcf0
    flags: 0x2000000000000000(zone=1)
    ...
    page dumped because: VM_BUG_ON_FOLIO(((unsigned int) folio_ref_count(folio) + 127u <= 127u))
    ------------[ cut here ]------------
    kernel BUG at include/linux/mm.h:1444!

This is because, unlike in iov_iter_extract_folioq_pages(), the
iter_folioq_get_pages() helper function doesn't skip the current folio
when iov_offset points to the end of it, but rather extracts the next
page beyond the end of the folio and adds it to the list.  Reading will
then clobber the contents of this page, leading to system corruption,
and if the page is not in use, put_page() may try to clean up the unused
page.

This can be worked around by copying the iterator before each
extraction[2] and using iov_iter_advance() on the original as the
advance function steps over the page we're at the end of.

Fix this by skipping the page extraction if we're at the end of the
folio.

This was reproduced in the ktest environment[3] by forcing 9p to use the
fscache caching mode and then reading a file through 9p.

Fixes: db0aa2e956 ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Reported-by: Antony Antony <antony@phenome.org>
Closes: https://lore.kernel.org/r/ZxFQw4OI9rrc7UYc@Antony2201.local/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: v9fs@lists.linux.dev
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/ZxFEi1Tod43pD6JC@moon.secunet.de/ [1]
Link: https://lore.kernel.org/r/2299159.1729543103@warthog.procyon.org.uk/ [2]
Link: https://github.com/koverstreet/ktest.git [3]
Tested-by: Antony Antony <antony.antony@secunet.com>
Link: https://lore.kernel.org/r/3327438.1729678025@warthog.procyon.org.uk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-24 13:50:27 +02:00
Marco Elver
237ab03e30 Revert "kasan: Disable Software Tag-Based KASAN with GCC"
This reverts commit 7aed6a2c51.

Now that __no_sanitize_address attribute is fixed for KASAN_SW_TAGS with
GCC, allow re-enabling KASAN_SW_TAGS with GCC.

Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20241021120013.3209481-2-elver@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-23 16:04:30 +01:00
Pei Xiao
2b059d0d1e slub/kunit: fix a WARNING due to unwrapped __kmalloc_cache_noprof
'modprobe slub_kunit' will have a warning as shown below. The root cause
is that __kmalloc_cache_noprof was directly used, which resulted in no
alloc_tag being allocated. This caused current->alloc_tag to be null,
leading to a warning in alloc_tag_add_check.

Let's add an alloc_hook layer to __kmalloc_cache_noprof specifically
within lib/slub_kunit.c, which is the only user of this internal slub
function outside kmalloc implementation itself.

[58162.947016] WARNING: CPU: 2 PID: 6210 at
./include/linux/alloc_tag.h:125 alloc_tagging_slab_alloc_hook+0x268/0x27c
[58162.957721] Call trace:
[58162.957919]  alloc_tagging_slab_alloc_hook+0x268/0x27c
[58162.958286]  __kmalloc_cache_noprof+0x14c/0x344
[58162.958615]  test_kmalloc_redzone_access+0x50/0x10c [slub_kunit]
[58162.959045]  kunit_try_run_case+0x74/0x184 [kunit]
[58162.959401]  kunit_generic_run_threadfn_adapter+0x2c/0x4c [kunit]
[58162.959841]  kthread+0x10c/0x118
[58162.960093]  ret_from_fork+0x10/0x20
[58162.960363] ---[ end trace 0000000000000000 ]---

Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Fixes: a0a44d9175 ("mm, slab: don't wrap internal functions with alloc_hooks()")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-10-23 09:50:58 +02:00
Viktor Malik
aff1871bfc objpool: fix choosing allocation for percpu slots
objpool intends to use vmalloc for default (non-atomic) allocations of
percpu slots and objects. However, the condition checking if GFP flags
set any bit of GFP_ATOMIC is wrong b/c GFP_ATOMIC is a combination of bits
(__GFP_HIGH|__GFP_KSWAPD_RECLAIM) and so `pool->gfp & GFP_ATOMIC` will
be true if either bit is set. Since GFP_ATOMIC and GFP_KERNEL share the
___GFP_KSWAPD_RECLAIM bit, kmalloc will be used in cases when GFP_KERNEL
is specified, i.e. in all current usages of objpool.

This may lead to unexpected OOM errors since kmalloc cannot allocate
large amounts of memory.

For instance, objpool is used by fprobe rethook which in turn is used by
BPF kretprobe.multi and kprobe.session probe types. Trying to attach
these to all kernel functions with libbpf using

    SEC("kprobe.session/*")
    int kprobe(struct pt_regs *ctx)
    {
        [...]
    }

fails on objpool slot allocation with ENOMEM.

Fix the condition to truly use vmalloc by default.

Link: https://lore.kernel.org/all/20240826060718.267261-1-vmalik@redhat.com/

Fixes: b4edb8d2d4 ("lib: objpool added: ring-array based lockless MPMC")
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Matt Wu <wuqiang.matt@bytedance.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-10-22 14:22:42 +09:00
Linus Torvalds
a777c32ca4 This push fixes a regression in mpi that broke RSA.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmcSJQYACgkQxycdCkmx
 i6ejoBAAhK/3bk9jmxMOnVvednjrjVMqg+17daXHKbHT6eMcOwXsgr4ZWrkc5syV
 tQBRipdSfLhwf4aTNOzgyg3GIVVQkLZuRKDanntVdyYs65YKKUP/BiUshMAJ4DbW
 nkPe+LBdl0EvIWexrSKy5cyB2Yt+5MknK+mUMHyAeRjgVHNCEBMbMo/4KHGDW6fL
 Cn8rBATD1LCBODkxFC83pHe5M/TsxM08hL8xQxPJZm9SvNiBa7+xaS/oSApyIs8x
 L0RmYdlXlRGQcok5/ZCFc66QEOw2lIOwIc6sTmbT+eKFtvztkZ+ErhAuubgk5UKa
 TaB0qrBIpsQs2O7gFq4OU7BkG4QAlFt37MqBuf21b5Zh605s/ORDWEQobcokXpBY
 SmxOBxBhhLcRgb1cjUQn44/M8vrRXL0+IZiuOWkb+vcNln32bCH+BeiW6traNdL3
 s3uVRF28Pd76xB4eAuT4eqiSOuCI/FyB7+hJmkOcpKC1eQUq2whrFLfru3iGItn8
 bJWJQjPaysI8QXoky6miMjaeBWWOHuBWgYb2BzzHRsAdxK2oXUN/Q3BOJq1wONtP
 YaRzqu5vBvPk+0F/SOIl1MBp1nt62T8WRcDyIAhDsgmnuWASAKzo9Smzzo0gJr8q
 bB9iHTHN6yR9J3+zPyOqPY99zkaABSrQU9StFqEjN8icndG5Tfo=
 =MHMX
 -----END PGP SIGNATURE-----

Merge tag 'v6.12-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "Fix a regression in mpi that broke RSA"

* tag 'v6.12-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue
2024-10-21 09:59:43 -07:00
Linus Torvalds
4e6bd4a33a Rust fixes for v6.12 (2nd)
Toolchain and infrastructure:
 
  - Fix several issues with the 'rustc-option' macro. It includes a
    refactor from Masahiro of three '{cc,rust}-*' macros, which is not
    a fix but avoids repeating the same commands (which would be several
    lines in the case of 'rustc-option').
 
  - Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It
    includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not a
    fix but is needed for the actual fix.
 
 And a trivial grammar fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmcS5LkACgkQGXyLc2ht
 IW07ghAAxP94zqWzf8bQ4IIgTYrV9WSqR9vMpd31VAPknRJjGUq5dehFxiQxDJ5X
 ibMcpyja8V1CGeOh4qthLJAD/OGw+ANafjLfHM/l9cQRx1uwLEac3h4/YR1x52Ep
 al3ISewhbs3cjko2aa6Gnym3hdYizqkKY9Bca6kvo7k4ZRRmWT3sKAsle6rV93Hw
 q9AjC40XC8iy2VYv/JPvP1zcr3T7ZzCrs3ELG8sLSeR0gZZEmI3e3FOWWHcRlVRa
 uig4SSPvhHVssG8k64CHmzUtVQCApuJuzQGG72Ozs4V5Xxk86ZRE0XzyMXaw15nu
 Mm8s+hDxsFXfESQg0GMCVQ7wnGFSuvRwK3sWALltXmqtGQxkYgcJ3mYtu0sP8p51
 VIzDIomdUfGLxk+sDn7Lnl5PrSLaetUd94nr5qCMmfb2/7/kSaB4aHmML+8ZHCn5
 I4TQONL/pVmmRm97HFaAFOzCaGRWfVoIzQ/cRaQhqK+qrTfRjyFcsMzN+Flp5A58
 c3AgnTVlm4pPqtlLQ1z9BiGYT50dI0fHBOQiisogGsZwwMUqzEMOnbZjbhS/HKSp
 FG8hu/OyzIsNnNqOfQZN4DSTyf4qfIuyTmFM1OAel8zllCwlxy5F2hVp/opwH3/y
 On6CW0lunUBzCXZZ+byWudo7Vg8YpMVHATLqp9FHZpJb8JK688w=
 =Y7fL
 -----END PGP SIGNATURE-----

Merge tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux

Pull rust fixes from Miguel Ojeda:
 "Toolchain and infrastructure:

   - Fix several issues with the 'rustc-option' macro. It includes a
     refactor from Masahiro of three '{cc,rust}-*' macros, which is not
     a fix but avoids repeating the same commands (which would be
     several lines in the case of 'rustc-option').

   - Fix conditions for 'CONFIG_HAVE_CFI_ICALL_NORMALIZE_INTEGERS'. It
     includes the addition of 'CONFIG_RUSTC_LLVM_VERSION', which is not
     a fix but is needed for the actual fix.

  And a trivial grammar fix"

* tag 'rust-fixes-6.12-2' of https://github.com/Rust-for-Linux/linux:
  cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS
  kbuild: rust: add `CONFIG_RUSTC_LLVM_VERSION`
  kbuild: fix issues with rustc-option
  kbuild: refactor cc-option-yn, cc-disable-warning, rust-option-yn macros
  lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW
2024-10-19 08:32:47 -07:00
Linus Torvalds
3d5ad2d4ec BPF fixes:
- Fix BPF verifier to not affect subreg_def marks in its range
   propagation, from Eduard Zingerman.
 
 - Fix a truncation bug in the BPF verifier's handling of
   coerce_reg_to_size_sx, from Dimitar Kanaliev.
 
 - Fix the BPF verifier's delta propagation between linked
   registers under 32-bit addition, from Daniel Borkmann.
 
 - Fix a NULL pointer dereference in BPF devmap due to missing
   rxq information, from Florian Kauer.
 
 - Fix a memory leak in bpf_core_apply, from Jiri Olsa.
 
 - Fix an UBSAN-reported array-index-out-of-bounds in BTF
   parsing for arrays of nested structs, from Hou Tao.
 
 - Fix build ID fetching where memory areas backing the file
   were created with memfd_secret, from Andrii Nakryiko.
 
 - Fix BPF task iterator tid filtering which was incorrectly
   using pid instead of tid, from Jordan Rome.
 
 - Several fixes for BPF sockmap and BPF sockhash redirection
   in combination with vsocks, from Michal Luczaj.
 
 - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered,
   from Andrea Parri.
 
 - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the
   possibility of an infinite BPF tailcall, from Pu Lehui.
 
 - Fix a build warning from resolve_btfids that bpf_lsm_key_free
   cannot be resolved, from Thomas Weißschuh.
 
 - Fix a bug in kfunc BTF caching for modules where the wrong
   BTF object was returned, from Toke Høiland-Jørgensen.
 
 - Fix a BPF selftest compilation error in cgroup-related tests
   with musl libc, from Tony Ambardar.
 
 - Several fixes to BPF link info dumps to fill missing fields,
   from Tyrone Wu.
 
 - Add BPF selftests for kfuncs from multiple modules, checking
   that the correct kfuncs are called, from Simon Sundberg.
 
 - Ensure that internal and user-facing bpf_redirect flags
   don't overlap, also from Toke Høiland-Jørgensen.
 
 - Switch to use kvzmalloc to allocate BPF verifier environment,
   from Rik van Riel.
 
 - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic
   splat under RT, from Wander Lairson Costa.
 
 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZxK4OhUcZGFuaWVsQGlv
 Z2VhcmJveC5uZXQACgkQ2yufC7HISIOCrwEAib2kC5EEQn5+wKVE/bnZryVX2leT
 YXdfItDCBU6zCYUA+wTU5hGGn9lcDUcZx72l/KZPDyPw7HdzNJ+6iR1zQqoM
 =f9kv
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Daniel Borkmann:

 - Fix BPF verifier to not affect subreg_def marks in its range
   propagation (Eduard Zingerman)

 - Fix a truncation bug in the BPF verifier's handling of
   coerce_reg_to_size_sx (Dimitar Kanaliev)

 - Fix the BPF verifier's delta propagation between linked registers
   under 32-bit addition (Daniel Borkmann)

 - Fix a NULL pointer dereference in BPF devmap due to missing rxq
   information (Florian Kauer)

 - Fix a memory leak in bpf_core_apply (Jiri Olsa)

 - Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for
   arrays of nested structs (Hou Tao)

 - Fix build ID fetching where memory areas backing the file were
   created with memfd_secret (Andrii Nakryiko)

 - Fix BPF task iterator tid filtering which was incorrectly using pid
   instead of tid (Jordan Rome)

 - Several fixes for BPF sockmap and BPF sockhash redirection in
   combination with vsocks (Michal Luczaj)

 - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri)

 - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility
   of an infinite BPF tailcall (Pu Lehui)

 - Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot
   be resolved (Thomas Weißschuh)

 - Fix a bug in kfunc BTF caching for modules where the wrong BTF object
   was returned (Toke Høiland-Jørgensen)

 - Fix a BPF selftest compilation error in cgroup-related tests with
   musl libc (Tony Ambardar)

 - Several fixes to BPF link info dumps to fill missing fields (Tyrone
   Wu)

 - Add BPF selftests for kfuncs from multiple modules, checking that the
   correct kfuncs are called (Simon Sundberg)

 - Ensure that internal and user-facing bpf_redirect flags don't overlap
   (Toke Høiland-Jørgensen)

 - Switch to use kvzmalloc to allocate BPF verifier environment (Rik van
   Riel)

 - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat
   under RT (Wander Lairson Costa)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits)
  lib/buildid: Handle memfd_secret() files in build_id_parse()
  selftests/bpf: Add test case for delta propagation
  bpf: Fix print_reg_state's constant scalar dump
  bpf: Fix incorrect delta propagation between linked registers
  bpf: Properly test iter/task tid filtering
  bpf: Fix iter/task tid filtering
  riscv, bpf: Make BPF_CMPXCHG fully ordered
  bpf, vsock: Drop static vsock_bpf_prot initialization
  vsock: Update msg_count on read_skb()
  vsock: Update rx_bytes on read_skb()
  bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock
  selftests/bpf: Add asserts for netfilter link info
  bpf: Fix link info netfilter flags to populate defrag flag
  selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()
  selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx()
  bpf: Fix truncation bug in coerce_reg_to_size_sx()
  selftests/bpf: Assert link info uprobe_multi count & path_size if unset
  bpf: Fix unpopulated path_size when uprobe_multi fields unset
  selftests/bpf: Fix cross-compiling urandom_read
  selftests/bpf: Add test for kfunc module order
  ...
2024-10-18 16:27:14 -07:00
Sebastian Andrzej Siewior
560af5dc83 lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.
With the printk issues solved, the last known splat created by
PROVE_RAW_LOCK_NESTING is gone.

Enable PROVE_RAW_LOCK_NESTING by default as part of PROVE_LOCKING. Keep
the defines around in case something serious pops up and it needs to be
disabled.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20241009161041.1018375-2-bigeasy@linutronix.de
2024-10-17 21:21:16 -07:00
Ahmed Ehab
5eadeb7b3b locking/lockdep: Add a test for lockdep_set_subclass()
Add a test case to ensure that no new name string literal will be
created in lockdep_set_subclass(), otherwise a warning will be triggered
in look_up_lock_class(). Add this to catch the problem in the future.

[boqun: Reword the title, replace #if with #ifdef and rename functions
and variables]

Signed-off-by: Ahmed Ehab <bottaawesome633@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/lkml/20240905011220.356973-1-bottaawesome633@gmail.com/
2024-10-17 21:21:16 -07:00
Linus Torvalds
4d939780b7 28 hotfixes. 13 are cc:stable. 23 are MM.
It is the usual shower of unrelated singletons - please see the individual
 changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZxGY5wAKCRDdBJ7gKXxA
 js6RAQC16zQ7WRV091i79cEi1C5648NbZjMCU626hZjuyfbzKgEA2v8PYtjj9w2e
 UGLxMY+PYZki2XNEh75Sikdkiyl9Vgg=
 =xcWT
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "28 hotfixes. 13 are cc:stable. 23 are MM.

  It is the usual shower of unrelated singletons - please see the
  individual changelogs for details"

* tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits)
  maple_tree: add regression test for spanning store bug
  maple_tree: correct tree corruption on spanning store
  mm/mglru: only clear kswapd_failures if reclaimable
  mm/swapfile: skip HugeTLB pages for unuse_vma
  selftests: mm: fix the incorrect usage() info of khugepaged
  MAINTAINERS: add Jann as memory mapping/VMA reviewer
  mm: swap: prevent possible data-race in __try_to_reclaim_swap
  mm: khugepaged: fix the incorrect statistics when collapsing large file folios
  MAINTAINERS: kasan, kcov: add bugzilla links
  mm: don't install PMD mappings when THPs are disabled by the hw/process/vma
  mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()
  Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs
  Docs/damon/maintainer-profile: add missing '_' suffixes for external web links
  maple_tree: check for MA_STATE_BULK on setting wr_rebalance
  mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point
  mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets()
  mm: remove unused stub for can_swapin_thp()
  mailmap: add an entry for Andy Chiu
  MAINTAINERS: add memory mapping/VMA co-maintainers
  fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization
  ...
2024-10-17 16:33:06 -07:00
Andrii Nakryiko
5ac9b4e935 lib/buildid: Handle memfd_secret() files in build_id_parse()
>From memfd_secret(2) manpage:

  The memory areas backing the file created with memfd_secret(2) are
  visible only to the processes that have access to the file descriptor.
  The memory region is removed from the kernel page tables and only the
  page tables of the processes holding the file descriptor map the
  corresponding physical memory. (Thus, the pages in the region can't be
  accessed by the kernel itself, so that, for example, pointers to the
  region can't be passed to system calls.)

We need to handle this special case gracefully in build ID fetching
code. Return -EFAULT whenever secretmem file is passed to build_id_parse()
family of APIs. Original report and repro can be found in [0].

  [0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/

Fixes: de3ec364c3 ("lib/buildid: add single folio-based file reader abstraction")
Reported-by: Yi Lai <yi1.lai@intel.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Link: https://lore.kernel.org/bpf/20241017175431.6183-A-hca@linux.ibm.com
Link: https://lore.kernel.org/bpf/20241017174713.2157873-1-andrii@kernel.org
2024-10-17 21:30:32 +02:00
Linus Torvalds
6efbea77b3 arm64 fixes for -rc4
- Disable software tag-based KASAN when compiling with GCC, as functions
   are incorrectly instrumented leading to a crash early during boot.
 
 - Fix pkey configuration for kernel threads when POE is enabled.
 
 - Fix invalid memory accesses in uprobes when targetting load-literal
   instructions.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmcPrzQQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNIr6B/wN+o1xI7Fv/QdlaTuKYLvOOg/XTl6sbUDj
 YssxtjhpKuaFVG4zJHNsWvgUqO+YCM7m3F1L8LVPMF7l2xoKtRTIB1Ye315hTjYm
 dW5Te6xBMVKF8SVxE8sBbZobdokIW1JNPBrvGvHO3d5ujmofzwHU8RNMXuTUItRw
 z85Qy75FkEDTEbsWhS3VL5HOgEr+k0TYDRa8SXwKWVj7/rYna3tO39kIdS5dt9VX
 wDJbnxtWJMhiHmDnevFFhBkSZrips12P1Rb6HUSmhpUJh0Rk4TAZntSl2f/lr+jA
 PuboBbSG68UOCwAHoNmTcLdFhkiNaiyw4w2F7hk2A6aNRtme+bT0
 =M/ug
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Disable software tag-based KASAN when compiling with GCC, as
   functions are incorrectly instrumented leading to a crash early
   during boot

 - Fix pkey configuration for kernel threads when POE is enabled

 - Fix invalid memory accesses in uprobes when targetting load-literal
   instructions

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  kasan: Disable Software Tag-Based KASAN with GCC
  Documentation/protection-keys: add AArch64 to documentation
  arm64: set POR_EL0 for kernel threads
  arm64: probes: Fix uprobes for big-endian kernels
  arm64: probes: Fix simulate_ldr*_literal()
  arm64: probes: Remove broken LDR (literal) uprobe support
2024-10-17 09:51:03 -07:00
Lorenzo Stoakes
bea07fd631 maple_tree: correct tree corruption on spanning store
Patch series "maple_tree: correct tree corruption on spanning store", v3.

There has been a nasty yet subtle maple tree corruption bug that appears
to have been in existence since the inception of the algorithm.

This bug seems far more likely to happen since commit f8d112a4e6
("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
at which reports started to be submitted concerning this bug.

We were made definitely aware of the bug thanks to the kind efforts of
Bert Karwatzki who helped enormously in my being able to track this down
and identify the cause of it.

The bug arises when an attempt is made to perform a spanning store across
two leaf nodes, where the right leaf node is the rightmost child of the
shared parent, AND the store completely consumes the right-mode node.

This results in mas_wr_spanning_store() mitakenly duplicating the new and
existing entries at the maximum pivot within the range, and thus maple
tree corruption.

The fix patch corrects this by detecting this scenario and disallowing the
mistaken duplicate copy.

The fix patch commit message goes into great detail as to how this occurs.

This series also includes a test which reliably reproduces the issue, and
asserts that the fix works correctly.

Bert has kindly tested the fix and confirmed it resolved his issues.  Also
Mikhail Gavrilov kindly reported what appears to be precisely the same
bug, which this fix should also resolve.


This patch (of 2):

There has been a subtle bug present in the maple tree implementation from
its inception.

This arises from how stores are performed - when a store occurs, it will
overwrite overlapping ranges and adjust the tree as necessary to
accommodate this.

A range may always ultimately span two leaf nodes.  In this instance we
walk the two leaf nodes, determine which elements are not overwritten to
the left and to the right of the start and end of the ranges respectively
and then rebalance the tree to contain these entries and the newly
inserted one.

This kind of store is dubbed a 'spanning store' and is implemented by
mas_wr_spanning_store().

In order to reach this stage, mas_store_gfp() invokes
mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
walk the tree and update the object (mas) to traverse to the location
where the write should be performed, determining its store type.

When a spanning store is required, this function returns false stopping at
the parent node which contains the target range, and mas_wr_store_type()
marks the mas->store_type as wr_spanning_store to denote this fact.

When we go to perform the store in mas_wr_spanning_store(), we first
determine the elements AFTER the END of the range we wish to store (that
is, to the right of the entry to be inserted) - we do this by walking to
the NEXT pivot in the tree (i.e.  r_mas.last + 1), starting at the node we
have just determined contains the range over which we intend to write.

We then turn our attention to the entries to the left of the entry we are
inserting, whose state is represented by l_mas, and copy these into a 'big
node', which is a special node which contains enough slots to contain two
leaf node's worth of data.

We then copy the entry we wish to store immediately after this - the copy
and the insertion of the new entry is performed by mas_store_b_node().

After this we copy the elements to the right of the end of the range which
we are inserting, if we have not exceeded the length of the node (i.e. 
r_mas.offset <= r_mas.end).

Herein lies the bug - under very specific circumstances, this logic can
break and corrupt the maple tree.

Consider the following tree:

Height
  0                             Root Node
                                 /      \
                 pivot = 0xffff /        \ pivot = ULONG_MAX
                               /          \
  1                       A [-----]       ...
                             /   \
             pivot = 0x4fff /     \ pivot = 0xffff
                           /       \
  2 (LEAVES)          B [-----]  [-----] C
                                      ^--- Last pivot 0xffff.

Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
that all ranges expressed in maple tree code are inclusive):

1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
   determines that this is a spanning store across nodes B and C. The mas
   state is set such that the current node from which we traverse further
   is node A.

2. In mas_wr_spanning_store() we try to find elements to the right of pivot
   0xffff by searching for an index of 0x10000:

    - mas_wr_walk_index() invokes mas_wr_walk_descend() and
      mas_wr_node_walk() in turn.

        - mas_wr_node_walk() loops over entries in node A until EITHER it
          finds an entry whose pivot equals or exceeds 0x10000 OR it
          reaches the final entry.

        - Since no entry has a pivot equal to or exceeding 0x10000, pivot
          0xffff is selected, leading to node C.

    - mas_wr_walk_traverse() resets the mas state to traverse node C. We
      loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
      in turn once again.

         - Again, we reach the last entry in node C, which has a pivot of
           0xffff.

3. We then copy the elements to the left of 0x4000 in node B to the big
   node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
   too.

4. We determine whether we have any entries to copy from the right of the
   end of the range via - and with r_mas set up at the entry at pivot
   0xffff, r_mas.offset <= r_mas.end, and then we DUPLICATE the entry at
   pivot 0xffff.

5. BUG! The maple tree is corrupted with a duplicate entry.

This requires a very specific set of circumstances - we must be spanning
the last element in a leaf node, which is the last element in the parent
node.

spanning store across two leaf nodes with a range that ends at that shared
pivot.

A potential solution to this problem would simply be to reset the walk
each time we traverse r_mas, however given the rarity of this situation it
seems that would be rather inefficient.

Instead, this patch detects if the right hand node is populated, i.e.  has
anything we need to copy.

We do so by only copying elements from the right of the entry being
inserted when the maximum value present exceeds the last, rather than
basing this on offset position.

The patch also updates some comments and eliminates the unused bool return
value in mas_wr_walk_index().

The work performed in commit f8d112a4e6 ("mm/mmap: avoid zeroing vma
tree in mmap_region()") seems to have made the probability of this event
much more likely, which is the point at which reports started to be
submitted concerning this bug.

The motivation for this change arose from Bert Karwatzki's report of
encountering mm instability after the release of kernel v6.12-rc1 which,
after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
options, was identified as maple tree corruption.

After Bert very generously provided his time and ability to reproduce this
event consistently, I was able to finally identify that the issue
discussed in this commit message was occurring for him.

Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.1728314403.git.lorenzo.stoakes@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf@web.de>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7-+Dg@mail.gmail.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17 08:35:10 -07:00
Sidhartha Kumar
a6e0ceb7bf maple_tree: check for MA_STATE_BULK on setting wr_rebalance
It is possible for a bulk operation (MA_STATE_BULK is set) to enter the
new_end < mt_min_slots[type] case and set wr_rebalance as a store type. 
This is incorrect as bulk stores do not rebalance per write, but rather
after the all of the writes are done through the mas_bulk_rebalance()
path.  Therefore, add a check to make sure MA_STATE_BULK is not set before
we return wr_rebalance as the store type.

Also add a test to make sure wr_rebalance is never the store type when
doing bulk operations via mas_expected_entries()

This is a hotfix for this rc however it has no userspace effects as there
are no users of the bulk insertion mode.

Link: https://lkml.kernel.org/r/20241011214451.7286-1-sidhartha.kumar@oracle.com
Fixes: 5d659bbb52 ("maple_tree: introduce mas_wr_store_type()")
Suggested-by: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Sidhartha <sidhartha.kumar@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17 00:28:09 -07:00
Florian Westphal
dc783ba4b9 lib: alloc_tag_module_unload must wait for pending kfree_rcu calls
Ben Greear reports following splat:
 ------------[ cut here ]------------
 net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
 WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
 Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
...
 Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
 RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
  codetag_unload_module+0x19b/0x2a0
  ? codetag_load_module+0x80/0x80

nf_nat module exit calls kfree_rcu on those addresses, but the free
operation is likely still pending by the time alloc_tag checks for leaks.

Wait for outstanding kfree_rcu operations to complete before checking
resolves this warning.

Reproducer:
unshare -n iptables-nft -t nat -A PREROUTING -p tcp
grep nf_nat /proc/allocinfo # will list 4 allocations
rmmod nft_chain_nat
rmmod nf_nat                # will WARN.

[akpm@linux-foundation.org: add comment]
Link: https://lkml.kernel.org/r/20241007205236.11847-1-fw@strlen.de
Fixes: a473573964 ("lib: code tagging module support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reported-by: Ben Greear <greearb@candelatech.com>
Closes: https://lore.kernel.org/netdev/bdaaef9d-4364-4171-b82b-bcfc12e207eb@candelatech.com/
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-17 00:28:07 -07:00
Qianqiang Liu
cd843399d7 crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue
The "err" variable may be returned without an initialized value.

Fixes: 8e3a67f2de ("crypto: lib/mpi - Add error checks to extension")
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-16 13:38:16 +08:00
Thomas Gleixner
ff8d523cc4 debugobjects: Track object usage to avoid premature freeing of objects
The freelist is freed at a constant rate independent of the actual usage
requirements. That's bad in scenarios where usage comes in bursts. The end
of a burst puts the objects on the free list and freeing proceeds even when
the next burst which requires objects started again.

Keep track of the usage with a exponentially wheighted moving average and
take that into account in the worker function which frees objects from the
free list.

This further reduces the kmem_cache allocation/free rate for a full kernel
compile:

   	    kmem_cache_alloc()	kmem_cache_free()
Baseline:   225k		173k
Usage:	    170k		117k

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/87bjznhme2.ffs@tglx
2024-10-15 17:30:33 +02:00
Thomas Gleixner
13f9ca7239 debugobjects: Refill per CPU pool more agressively
Right now the per CPU pools are only refilled when they become
empty. That's suboptimal especially when there are still non-freed objects
in the to free list.

Check whether an allocation from the per CPU pool emptied a batch and try
to allocate from the free pool if that still has objects available.

   	    kmem_cache_alloc()	kmem_cache_free()
Baseline:   295k		245k
Refill:	    225k		173k

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.439053085@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
a201a96b96 debugobjects: Double the per CPU slots
In situations where objects are rapidly allocated from the pool and handed
back, the size of the per CPU pool turns out to be too small.

Double the size of the per CPU pool.

This reduces the kmem cache allocation and free operations during a kernel compile:

     	     alloc    	    free
Baseline:    380k           330k
Double size: 295k	    245k

Especially the reduction of allocations is important because that happens
in the hot path when objects are initialized.

The maximum increase in per CPU pool memory consumption is about 2.5K per
online CPU, which is acceptable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.378676302@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
2638345d22 debugobjects: Move pool statistics into global_pool struct
Keep it along with the pool as that's a hot cache line anyway and it makes
the code more comprehensible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.318776207@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
f57ebb92ba debugobjects: Implement batch processing
Adding and removing single objects in a loop is bad in terms of lock
contention and cache line accesses.

To implement batching, record the last object in a batch in the object
itself. This is trivialy possible as hlists are strictly stacks. At a batch
boundary, when the first object is added to the list the object stores a
pointer to itself in debug_obj::batch_last. When the next object is added
to the list then the batch_last pointer is retrieved from the first object
in the list and stored in the to be added one.

That means for batch processing the first object always has a pointer to
the last object in a batch, which allows to move batches in a cache line
efficient way and reduces the lock held time.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.258995000@linutronix.de
2024-10-15 17:30:33 +02:00
Thomas Gleixner
aebbfe0779 debugobjects: Prepare kmem_cache allocations for batching
Allocate a batch and then push it into the pool. Utilize the
debug_obj::last_node pointer for keeping track of the batch boundary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164914.198647184@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
74fe1ad413 debugobjects: Prepare for batching
Move the debug_obj::object pointer into a union and add a pointer to the
last node in a batch. That allows to implement batch processing efficiently
by utilizing the stack property of hlist:

When the first object of a batch is added to the list, then the batch
pointer is set to the hlist node of the object itself. Any subsequent add
retrieves the pointer to the last node from the first object in the list
and uses that for storing the last node pointer in the newly added object.

Add the pointer to the data structure and ensure that all relevant pool
sizes are strictly batch sized. The actual batching implementation follows
in subsequent changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.139204961@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
14077b9e58 debugobjects: Use static key for boot pool selection
Get rid of the conditional in the hot path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.077247071@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
9ce99c6d7b debugobjects: Rework free_object_work()
Convert it to batch processing with intermediate helper functions. This
reduces the final changes for batch processing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.015906394@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
a3b9e191f5 debugobjects: Rework object freeing
__free_object() is uncomprehensibly complex. The same can be achieved by:

   1) Adding the object to the per CPU pool

   2) If that pool is full, move a batch of objects into the global pool
      or if the global pool is full into the to free pool

This also prepares for batch processing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.955542307@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
fb60c004f3 debugobjects: Rework object allocation
The current allocation scheme tries to allocate from the per CPU pool
first. If that fails it allocates one object from the global pool and then
refills the per CPU pool from the global pool.

That is in the way of switching the pool management to batch mode as the
global pool needs to be a strict stack of batches, which does not allow
to allocate single objects.

Rework the code to refill the per CPU pool first and then allocate the
object from the refilled batch. Also try to allocate from the to free pool
first to avoid freeing and reallocating objects.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.893554162@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
96a9a0421c debugobjects: Move min/max count into pool struct
Having the accounting in the datastructure is better in terms of cache
lines and allows more optimizations later on.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.831908427@linutronix.de
2024-10-15 17:30:32 +02:00
Thomas Gleixner
18b8afcb37 debugobjects: Rename and tidy up per CPU pools
No point in having a separate data structure. Reuse struct obj_pool and
tidy up the code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.770595795@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
cb58d19084 debugobjects: Use separate list head for boot pool
There is no point to handle the statically allocated objects during early
boot in the actual pool list. This phase does not require accounting, so
all of the related complexity can be avoided.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.708939081@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
e18328ff70 debugobjects: Move pools into a datastructure
The contention on the global pool lock can be reduced by strict batch
processing where batches of objects are moved from one list head to another
instead of moving them object by object. This also reduces the cache
footprint because it avoids the list walk and dirties at maximum three
cache lines instead of potentially up to eighteen.

To prepare for that, move the hlist head and related counters into a
struct.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.646171170@linutronix.de
2024-10-15 17:30:31 +02:00
Zhen Lei
d8c6cd3a5c debugobjects: Reduce parallel pool fill attempts
The contention on the global pool_lock can be massive when the global pool
needs to be refilled and many CPUs try to handle this.

Address this by:

  - splitting the refill from free list and allocation.

    Refill from free list has no constraints vs. the context on RT, so
    it can be tried outside of the RT specific preemptible() guard

  - Let only one CPU handle the free list

  - Let only one CPU do allocations unless the pool level is below
    half of the minimum fill level.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-4-thunder.leizhen@huawei.com-
Link: https://lore.kernel.org/all/20241007164913.582118421@linutronix.de

--
 lib/debugobjects.c |   84 +++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 59 insertions(+), 25 deletions(-)
2024-10-15 17:30:31 +02:00
Thomas Gleixner
661cc28b52 debugobjects: Make debug_objects_enabled bool
Make it what it is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.518175013@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
49a5cb827d debugobjects: Provide and use free_object_list()
Move the loop to free a list of objects into a helper function so it can be
reused later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164913.453912357@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
241463f4fd debugobjects: Remove pointless debug printk
It has zero value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.390511021@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
49968cf181 debugobjects: Reuse put_objects() on OOM
Reuse the helper function instead of having a open coded copy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.326834268@linutronix.de
2024-10-15 17:30:31 +02:00
Thomas Gleixner
a2a702383e debugobjects: Dont free objects directly on CPU hotplug
Freeing the per CPU pool of the unplugged CPU directly is suboptimal as the
objects can be reused in the real pool if there is room. Aside of that this
gets the accounting wrong.

Use the regular free path, which allows reuse and has the accounting correct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.263960570@linutronix.de
2024-10-15 17:30:30 +02:00
Thomas Gleixner
3f397bf955 debugobjects: Remove pointless hlist initialization
It's BSS zero initialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.200379308@linutronix.de
2024-10-15 17:30:30 +02:00
Thomas Gleixner
55fb412ef7 debugobjects: Dont destroy kmem cache in init()
debug_objects_mem_init() is invoked from mm_core_init() before work queues
are available. If debug_objects_mem_init() destroys the kmem cache in the
error path it causes an Oops in __queue_work():

 Oops: Oops: 0000 [#1] PREEMPT SMP PTI
 RIP: 0010:__queue_work+0x35/0x6a0
  queue_work_on+0x66/0x70
  flush_all_cpus_locked+0xdf/0x1a0
  __kmem_cache_shutdown+0x2f/0x340
  kmem_cache_destroy+0x4e/0x150
  mm_core_init+0x9e/0x120
  start_kernel+0x298/0x800
  x86_64_start_reservations+0x18/0x30
  x86_64_start_kernel+0xc5/0xe0
  common_startup_64+0x12c/0x138

Further the object cache pointer is used in various places to check for
early boot operation. It is exposed before the replacments for the static
boot time objects are allocated and the self test operates on it.

This can be avoided by:

     1) Running the self test with the static boot objects

     2) Exposing it only after the replacement objects have been added to
     	the pool.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164913.137021337@linutronix.de
2024-10-15 17:30:30 +02:00
Zhen Lei
813fd07858 debugobjects: Collect newly allocated objects in a list to reduce lock contention
Collect the newly allocated debug objects in a list outside the lock, so
that the lock held time and the potential lock contention is reduced.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-3-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.073653668@linutronix.de
2024-10-15 17:30:30 +02:00
Zhen Lei
a0ae950408 debugobjects: Delete a piece of redundant code
The statically allocated objects are all located in obj_static_pool[],
the whole memory of obj_static_pool[] will be reclaimed later. Therefore,
there is no need to split the remaining statically nodes in list obj_pool
into isolated ones, no one will use them anymore. Just write
INIT_HLIST_HEAD(&obj_pool) is enough. Since hlist_move_list() directly
discards the old list, even this can be omitted.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-2-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.009849239@linutronix.de
2024-10-15 17:30:30 +02:00
Will Deacon
7aed6a2c51 kasan: Disable Software Tag-Based KASAN with GCC
Syzbot reports a KASAN failure early during boot on arm64 when building
with GCC 12.2.0 and using the Software Tag-Based KASAN mode:

  | BUG: KASAN: invalid-access in smp_build_mpidr_hash arch/arm64/kernel/setup.c:133 [inline]
  | BUG: KASAN: invalid-access in setup_arch+0x984/0xd60 arch/arm64/kernel/setup.c:356
  | Write of size 4 at addr 03ff800086867e00 by task swapper/0
  | Pointer tag: [03], memory tag: [fe]

Initial triage indicates that the report is a false positive and a
thorough investigation of the crash by Mark Rutland revealed the root
cause to be a bug in GCC:

  > When GCC is passed `-fsanitize=hwaddress` or
  > `-fsanitize=kernel-hwaddress` it ignores
  > `__attribute__((no_sanitize_address))`, and instruments functions
  > we require are not instrumented.
  >
  > [...]
  >
  > All versions [of GCC] I tried were broken, from 11.3.0 to 14.2.0
  > inclusive.
  >
  > I think we have to disable KASAN_SW_TAGS with GCC until this is
  > fixed

Disable Software Tag-Based KASAN when building with GCC by making
CC_HAS_KASAN_SW_TAGS depend on !CC_IS_GCC.

Cc: Andrey Konovalov <andreyknvl@gmail.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: syzbot+908886656a02769af987@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000f362e80620e27859@google.com
Link: https://lore.kernel.org/r/ZvFGwKfoC4yVjN_X@J2N7QTR9R3
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218854
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20241014161100.18034-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-15 11:38:10 +01:00
Rob Herring (Arm)
6ba55951e7 logic_pio: Constify fwnode_handle
The fwnode_handle passed into find_io_range_by_fwnode() and
logic_pio_trans_hwaddr() are not modified, so make them const.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20241010-dt-const-v1-2-87a51f558425@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-14 16:33:24 -05:00
Zijun Hu
9bd133f05b lib: devres: Simplify API devm_ioport_unmap() implementation
Simplify devm_ioport_unmap() implementation by dedicated API
devres_release(), compared with current solution, namely
ioport_unmap() + devres_destroy(), devres_release() has below advantages:

- it is simpler if devm_ioport_unmap()'s parameter @addr was ever
  returned by devm_ioport_map().

- it can avoid unnecessary ioport_unmap(@addr) if @addr was not
  ever returned by devm_ioport_map().

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20240918-fix_lib_devres-v1-2-e696ab5486e6@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-14 08:21:09 +02:00
Zijun Hu
0ee4dcafda lib: devres: Simplify API devm_iounmap() implementation
Simplify devm_iounmap() implementation by dedicated API devres_release()
compared with current solution, namely, devres_destroy() + iounmap()
devres_release() has the following advantages:

- it is simpler if devm_iounmap()'s parameter @addr is valid, namely
  @addr was ever returned by one of devm_ioremap() variants.

- it can avoid unnecessary iounmap(@addr) if @addr is not valid.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20240918-fix_lib_devres-v1-1-e696ab5486e6@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-14 08:21:09 +02:00
Jakub Kicinski
9c0fc36ec4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc3).

No conflicts and no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 13:13:33 -07:00
Vladimir Oltean
1405981bbb lib: packing: catch kunit_kzalloc() failure in the pack() test
kunit_kzalloc() may fail. Other call sites verify that this is the case,
either using a direct comparison with the NULL pointer, or the
KUNIT_ASSERT_NOT_NULL() or KUNIT_ASSERT_NOT_ERR_OR_NULL().

Pick KUNIT_ASSERT_NOT_NULL() as the error handling method that made most
sense to me. It's an unlikely thing to happen, but at least we call
__kunit_abort() instead of dereferencing this NULL pointer.

Fixes: e9502ea6db ("lib: packing: add KUnit tests adapted from selftests")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241004110012.1323427-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:36:25 -07:00
Timo Grautstueck
ab8851431b lib/Kconfig.debug: fix grammar in RUST_BUILD_ASSERT_ALLOW
Just a grammar fix in lib/Kconfig.debug, under the config option
RUST_BUILD_ASSERT_ALLOW.

Reported-by: Miguel Ojeda <ojeda@kernel.org>
Closes: https://github.com/Rust-for-Linux/linux/issues/1006
Fixes: ecaa6ddff2 ("rust: add `build_error` crate")
Signed-off-by: Timo Grautstueck <timo.grautstueck@web.de>
Link: https://lore.kernel.org/r/20241006140244.5509-1-timo.grautstueck@web.de
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-10-07 19:13:03 +02:00
Qianqiang Liu
6100da511b crypto: lib/mpi - Fix an "Uninitialized scalar variable" issue
The "err" variable may be returned without an initialized value.

Fixes: 8e3a67f2de ("crypto: lib/mpi - Add error checks to extension")
Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05 13:22:05 +08:00
Linus Torvalds
f6785e0ccf slab fixes for 6.12-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmb/8bcACgkQu+CwddJF
 iJoApwf5AWWhKFbbYwFUCXDi7+/Xr7T7c9H9q+GAEOQiDLsDxihEAo1KYQ+DLl+h
 Vp1ddRYIKMIUfllW3bcD4O6C8L46OX3XPHhTHnksEfvtn3fQGjcU3jKH8n0eL01J
 s9eUdvduNSJorAWqjFPPRrGuLJTXmervrDYYPJLaXGITHHMOxMjKfLAxtXehvARv
 mVQV1F0NTvvNqieuibUCM5XqJs37lrmqB39pLun7bQDU48z4OR1L3nkJxTFF1bGm
 EcvAPayTiNybMt08QSVHIwqfSs+e0HmyKqjvSLpJPImDrfSrWOJvBCJxI4DU+1aw
 UiHyWYLaxWZ7DoJgtZuHV2//8wOWww==
 =EXEA
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:
 "Fixes for issues introduced in this merge window: kobject memory leak,
  unsupressed warning and possible lockup in new slub_kunit tests,
  misleading code in kvfree_rcu_queue_batch()"

* tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
  mm, slab: suppress warnings in test_leak_destroy kunit test
  rcu/kvfree: Refactor kvfree_rcu_queue_batch()
  mm, slab: fix use of SLAB_SUPPORTS_SYSFS in kmem_cache_release()
2024-10-04 12:05:39 -07:00
Vladimir Oltean
46e784e94b lib: packing: use GENMASK() for box_mask
This is an u8, so using GENMASK_ULL() for unsigned long long is
unnecessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-10-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
fb02c7c8a5 lib: packing: use BITS_PER_BYTE instead of 8
This helps clarify what the 8 is for.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-9-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Jacob Keller
e7fdf5dddc lib: packing: fix QUIRK_MSB_ON_THE_RIGHT behavior
The QUIRK_MSB_ON_THE_RIGHT quirk is intended to modify pack() and unpack()
so that the most significant bit of each byte in the packed layout is on
the right.

The way the quirk is currently implemented is broken whenever the packing
code packs or unpacks any value that is not exactly a full byte.

The broken behavior can occur when packing any values smaller than one
byte, when packing any value that is not exactly a whole number of bytes,
or when the packing is not aligned to a byte boundary.

This quirk is documented in the following way:

  1. Normally (no quirks), we would do it like this:

  ::

    63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
    7                       6                       5                        4
    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
    3                       2                       1                        0

  <snip>

  2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:

  ::

    56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
    7                       6                        5                       4
    24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
    3                       2                        1                       0

  That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
  inverts bit offsets inside a byte.

Essentially, the mapping for physical bit offsets should be reserved for a
given byte within the payload. This reversal should be fixed to the bytes
in the packing layout.

The logic to implement this quirk is handled within the
adjust_for_msb_right_quirk() function. This function does not work properly
when dealing with the bytes that contain only a partial amount of data.

In particular, consider trying to pack or unpack the range 53-44. We should
always be mapping the bits from the logical ordering to their physical
ordering in the same way, regardless of what sequence of bits we are
unpacking.

This, we should grab the following logical bits:

  Logical: 55 54 53 52 51 50 49 48 47 45 44 43 42 41 40 39
                  ^  ^  ^  ^  ^  ^  ^  ^  ^

And pack them into the physical bits:

   Physical: 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47
    Logical: 48 49 50 51 52 53                   44 45 46 47
              ^  ^  ^  ^  ^  ^                    ^  ^  ^  ^

The current logic in adjust_for_msb_right_quirk is broken. I believe it is
intending to map according to the following:

  Physical: 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47
   Logical:       48 49 50 51 52 53 44 45 46 47
                   ^  ^  ^  ^  ^  ^  ^  ^  ^  ^

That is, it tries to keep the bits at the start and end of a packing
together. This is wrong, as it makes the packing change what bit is being
mapped to what based on which bits you're currently packing or unpacking.

Worse, the actual calculations within adjust_for_msb_right_quirk don't make
sense.

Consider the case when packing the last byte of an unaligned packing. It
might have a start bit of 7 and an end bit of 5. This would have a width of
3 bits. The new_start_bit will be calculated as the width - the box_end_bit
- 1. This will underflow and produce a negative value, which will
ultimate result in generating a new box_mask of all 0s.

For any other values, the result of the calculations of the
new_box_end_bit, new_box_start_bit, and the new box_mask will result in the
exact same values for the box_end_bit, box_start_bit, and box_mask. This
makes the calculations completely irrelevant.

If box_end_bit is 0, and box_start_bit is 7, then the entire function of
adjust_for_msb_right_quirk will boil down to just:

    *to_write = bitrev8(*to_write)

The other adjustments are attempting (incorrectly) to keep the bits in the
same place but just reversed. This is not the right behavior even if
implemented correctly, as it leaves the mapping dependent on the bit values
being packed or unpacked.

Remove adjust_for_msb_right_quirk() and just use bitrev8 to reverse the
byte order when interacting with the packed data.

In particular, for packing, we need to reverse both the box_mask and the
physical value being packed. This is done after shifting the value by
box_end_bit so that the reversed mapping is always aligned to the physical
buffer byte boundary. The box_mask is reversed as we're about to use it to
clear any stale bits in the physical buffer at this block.

For unpacking, we need to reverse the contents of the physical buffer
*before* masking with the box_mask. This is critical, as the box_mask is a
logical mask of the bit layout before handling the QUIRK_MSB_ON_THE_RIGHT.

Add several new tests which cover this behavior. These tests will fail
without the fix and pass afterwards. Note that no current drivers make use
of QUIRK_MSB_ON_THE_RIGHT. I suspect this is why there have been no reports
of this inconsistency before.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-8-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Jacob Keller
fcd6dd91d0 lib: packing: add additional KUnit tests
While reviewing the initial KUnit tests for lib/packing, Przemek pointed
out that the test values have duplicate bytes in the input sequence.

In addition, I noticed that the unit tests pack and unpack on a byte
boundary, instead of crossing bytes. Thus, we lack good coverage of the
corner cases of the API.

Add additional unit tests to cover packing and unpacking byte buffers which
do not have duplicate bytes in the unpacked value, and which pack and
unpack to an unaligned offset.

A careful reviewer may note the lack tests for QUIRK_MSB_ON_THE_RIGHT. This
is because I found issues with that quirk during test implementation. This
quirk will be fixed and the tests will be included in a future change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-7-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Jacob Keller
e9502ea6db lib: packing: add KUnit tests adapted from selftests
Add 24 simple KUnit tests for the lib/packing.c pack() and unpack() APIs.

The first 16 tests exercise all combinations of quirks with a simple magic
number value on a 16-byte buffer. The remaining 8 tests cover
non-multiple-of-4 buffer sizes.

These tests were originally written by Vladimir as simple selftest
functions. I adapted them to KUnit, refactoring them into a table driven
approach. This will aid in adding additional tests in the future.

Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-6-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
28aec9ca29 lib: packing: duplicate pack() and unpack() implementations
packing() is now used in some hot paths, and it would be good to get rid
of some ifs and buts that depend on "op", to speed things up a little bit.

With the main implementations now taking size_t endbit, we no longer
have to check for negative values. Update the local integer variables to
also be size_t to match.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-5-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
7263f64e16 lib: packing: add pack() and unpack() wrappers over packing()
Geert Uytterhoeven described packing() as "really bad API" because of
not being able to enforce const correctness. The same function is used
both when "pbuf" is input and "uval" is output, as in the other way
around.

Create 2 wrapper functions where const correctness can be ensured.
Do ugly type casts inside, to be able to reuse packing() as currently
implemented - which will _not_ modify the input argument.

Also, take the opportunity to change the type of startbit and endbit to
size_t - an unsigned type - in these new function prototypes. When int,
an extra check for negative values is necessary. Hopefully, when
packing() goes away completely, that check can be dropped.

My concern is that code which does rely on the conditional directionality
of packing() is harder to refactor without blowing up in size. So it may
take a while to completely eliminate packing(). But let's make alternatives
available for those who do not need that.

Link: https://lore.kernel.org/netdev/20210223112003.2223332-1-geert+renesas@glider.be/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-4-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:04 -07:00
Vladimir Oltean
a636ba5e86 lib: packing: adjust definitions and implementation for arbitrary buffer lengths
Jacob Keller has a use case for packing() in the intel/ice networking
driver, but it cannot be used as-is.

Simply put, the API quirks for LSW32_IS_FIRST and LITTLE_ENDIAN are
naively implemented with the undocumented assumption that the buffer
length must be a multiple of 4. All calculations of group offsets and
offsets of bytes within groups assume that this is the case. But in the
ice case, this does not hold true. For example, packing into a buffer
of 22 bytes would yield wrong results, but pretending it was a 24 byte
buffer would work.

Rather than requiring such hacks, and leaving a big question mark when
it comes to discontinuities in the accessible bit fields of such buffer,
we should extend the packing API to support this use case.

It turns out that we can keep the design in terms of groups of 4 bytes,
but also make it work if the total length is not a multiple of 4.
Just like before, imagine the buffer as a big number, and its most
significant bytes (the ones that would make up to a multiple of 4) are
missing. Thus, with a big endian (no quirks) interpretation of the
buffer, those most significant bytes would be absent from the beginning
of the buffer, and with a LSW32_IS_FIRST interpretation, they would be
absent from the end of the buffer. The LITTLE_ENDIAN quirk, in the
packing() API world, only affects byte ordering within groups of 4.
Thus, it does not change which bytes are missing. Only the significance
of the remaining bytes within the (smaller) group.

No change intended for buffer sizes which are multiples of 4. Tested
with the sja1105 driver and with downstream unit tests.

Link: https://lore.kernel.org/netdev/a0338310-e66c-497c-bc1f-a597e50aa3ff@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-2-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:03 -07:00
Vladimir Oltean
8b3e26677b lib: packing: refuse operating on bit indices which exceed size of buffer
While reworking the implementation, it became apparent that this check
does not exist.

There is no functional issue yet, because at call sites, "startbit" and
"endbit" are always hardcoded to correct values, and never come from the
user.

Even with the upcoming support of arbitrary buffer lengths, the
"startbit >= 8 * pbuflen" check will remain correct. This is because
we intend to always interpret the packed buffer in a way that avoids
discontinuities in the available bit indices.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-1-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 15:32:03 -07:00
Linus Torvalds
20c2474fa5 vfs-6.12-rc2.fixes.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZv5Y3gAKCRCRxhvAZXjc
 ojFPAP45kz5JgVKFn8iZmwfjPa7qbCa11gEzmx0SbUt3zZ3mJAD/fL9k9KaNU+qA
 LIcZW5BJn/p5fumUAw8/fKoz4ajCWQk=
 =LIz1
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12-rc2.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "vfs:

   - Ensure that iter_folioq_get_pages() advances to the next slot
     otherwise it will end up using the same folio with an out-of-bound
     offset.

  iomap:

   - Dont unshare delalloc extents which can't be reflinked, and thus
     can't be shared.

   - Constrain the file range passed to iomap_file_unshare() directly in
     iomap instead of requiring the callers to do it.

  netfs:

   - Use folioq_count instead of folioq_nr_slot to prevent an
     unitialized value warning in netfs_clear_buffer().

   - Fix missing wakeup after issuing writes by scheduling the write
     collector only if all the subrequest queues are empty and thus no
     writes are pending.

   - Fix two minor documentation bugs"

* tag 'vfs-6.12-rc2.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  iomap: constrain the file range passed to iomap_file_unshare
  iomap: don't bother unsharing delalloc extents
  netfs: Fix missing wakeup after issuing writes
  Documentation: add missing folio_queue entry
  folio_queue: fix documentation
  netfs: Fix a KMSAN uninit-value error in netfs_clear_buffer
  iov_iter: fix advancing slot in iter_folioq_get_pages()
2024-10-03 09:22:50 -07:00
Uros Bizjak
0402779aae lib/test_scanf: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:29 +02:00
Uros Bizjak
1da74f9050 lib/test_parman: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:27 +02:00
Uros Bizjak
2e2fe47182 bpf/tests: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:23 +02:00
Uros Bizjak
a7e74510e0 lib/rbtree-test: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:19 +02:00
Uros Bizjak
baacb8b413 random32: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:17 +02:00
Uros Bizjak
9127ad4242 kunit: string-stream-test: Include <linux/prandom.h>
Include <linux/random.h> header to allow the removal of legacy
inclusion of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:14 +02:00
Uros Bizjak
d46150d6fd lib/interval_tree_test.c: Include <linux/prandom.h> instead of <linux/random.h>
Substitute the inclusion of <linux/random.h> header with
<linux/prandom.h> to allow the removal of legacy inclusion
of <linux/prandom.h> from <linux/random.h>.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-10-03 18:20:12 +02:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Vlastimil Babka
cac39b0706 slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
Guenter Roeck reports that the new slub kunit tests added by commit
4e1c44b3db ("kunit, slub: add test_kfree_rcu() and
test_leak_destroy()") cause a lockup on boot on several architectures
when the kunit tests are configured to be built-in and not modules.

The test_kfree_rcu test invokes kfree_rcu() and boot sequence inspection
showed the runner for built-in kunit tests kunit_run_all_tests() is
called before setting system_state to SYSTEM_RUNNING and calling
rcu_end_inkernel_boot(), so this seems like a likely cause. So while I
was unable to reproduce the problem myself, skipping the test when the
slub_kunit module is built-in should avoid the issue.

An alternative fix that was moving the call to kunit_run_all_tests() a
bit later in the boot was tried, but has broken tests with functions
marked as __init due to free_initmem() already being done.

Fixes: 4e1c44b3db ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.net/
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: rcu@vger.kernel.org
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-dev@googlegroups.com
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-10-02 16:28:46 +02:00
Vlastimil Babka
3f1dd33f99 mm, slab: suppress warnings in test_leak_destroy kunit test
The test_leak_destroy kunit test intends to test the detection of stray
objects in kmem_cache_destroy(), which normally produces a warning. The
other slab kunit tests suppress the warnings in the kunit test context,
so suppress warnings and related printk output in this test as well.
Automated test running environments then don't need to learn to filter
the warnings.

Also rename the test's kmem_cache, the name was wrongly copy-pasted from
test_kfree_rcu.

Fixes: 4e1c44b3db ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202408251723.42f3d902-oliver.sang@intel.com
Reported-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Closes: https://lore.kernel.org/all/CAB=+i9RHHbfSkmUuLshXGY_ifEZg9vCZi3fqr99+kmmnpDus7Q@mail.gmail.com/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.net/
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-10-02 16:28:46 +02:00
Omar Sandoval
0d24852bd7
iov_iter: fix advancing slot in iter_folioq_get_pages()
iter_folioq_get_pages() decides to advance to the next folioq slot when
it has reached the end of the current folio. However, it is checking
offset, which is the beginning of the current part, instead of
iov_offset, which is adjusted to the end of the current part, so it
doesn't advance the slot when it's supposed to. As a result, on the next
iteration, we'll use the same folio with an out-of-bounds offset and
return an unrelated page.

This manifested as various crashes and other failures in 9pfs in drgn's
VM testing setup and BPF CI.

Fixes: db0aa2e956 ("mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios")
Link: https://lore.kernel.org/linux-fsdevel/20240923183432.1876750-1-chantr4@gmail.com/
Tested-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Link: https://lore.kernel.org/r/cbaf141ba6c0e2e209717d02746584072844841a.1727722269.git.osandov@fb.com
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Tested-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Joey Gouly <joey.gouly@arm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-01 11:49:57 +02:00
Linus Torvalds
9c44575c78 bitmap-for-6.12
- switch all bitmamp APIs from inline to __always_inline from Brian Norris;
  - introduce GENMASK_U128() macro from Anshuman Khandual;
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmb22isACgkQsUSA/Tof
 vsie2gwAl3l5vye90xnD6N8wFmKBKAWXMn8Iby7JyM9gAn6j1QuE5AppS+3JtIpZ
 rPRSgFZIVPOgBtiKjb6zAWj7KbtCmaSW+L5ZVaLQ+vtwBVNpWIWHsHKu0uIpuugT
 3wp/IeaE92bc/mioqb27pj2Gnv+lzYBmbK7Mu08a3q1Adwv0I7BJ4GvqxN1lLAEW
 xrFB86xztqdV7QC45J7Q5nIyUw7UBYK078elQ8iKSj5BR8MeaEJiavETwx9DHgAO
 Z8cG94ek3IpvLpiexNcgG+FTezZj9PnTVHxry9o7CIctafiqjYqXAJ9gks1Q4QUu
 q1IjPAdueLTAMPkpK67sI3fwC6zPyX5d8DVDUTuA6qhCsMyHW687gTRy4LPR14LL
 gd1Tzg+J9DQ5KBoG4TYN/g5VoP1hkKQqpetaJhdPqmYocfmqZuzyItb+gBjhyvSp
 3YOgLg/4lULy3sZ6Qd/q8CWglWlaNYXXzf13H8f2qUpVx4NLTDOwjj/CVjZR/D0C
 wje/8XU3
 =8jNc
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.12' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - switch all bitmamp APIs from inline to __always_inline (Brian Norris)

   The __always_inline series improves on code generation, and now with
   the latest compiler versions is required to avoid compilation
   warnings. It spent enough in my backlog, and I'm thankful to Brian
   Norris for taking over and moving it forward.

 - introduce GENMASK_U128() macro (Anshuman Khandual)

   GENMASK_U128() is a prerequisite needed for arm64 development

* tag 'bitmap-for-6.12' of https://github.com/norov/linux:
  lib/test_bits.c: Add tests for GENMASK_U128()
  uapi: Define GENMASK_U128
  nodemask: Switch from inline to __always_inline
  cpumask: Switch from inline to __always_inline
  bitmap: Switch from inline to __always_inline
  find: Switch from inline to __always_inline
2024-09-27 12:10:45 -07:00
Linus Torvalds
eee280841e 19 hotfixes. 13 are cc:stable.
There's a focus on fixes for the memfd_pin_folios() work which was added
 into 6.11.  Apart from that, the usual shower of singleton fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZvbhSAAKCRDdBJ7gKXxA
 jp8CAP47txk2c+tBLggog2MkQamADY5l5MT6E3fYq3ghSiKtVQEAnqX3LiQJ02tB
 o9LcPcVrM90QntpKrLP1CpWCVdR+zA8=
 =e0QC
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-09-27-09-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull  misc fixes from Andrew Morton:
 "19 hotfixes.  13 are cc:stable.

  There's a focus on fixes for the memfd_pin_folios() work which was
  added into 6.11. Apart from that, the usual shower of singleton fixes"

* tag 'mm-hotfixes-stable-2024-09-27-09-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  ocfs2: fix uninit-value in ocfs2_get_block()
  zram: don't free statically defined names
  memory tiers: use default_dram_perf_ref_source in log message
  Revert "list: test: fix tests for list_cut_position()"
  kselftests: mm: fix wrong __NR_userfaultfd value
  compiler.h: specify correct attribute for .rodata..c_jump_table
  mm/damon/Kconfig: update DAMON doc URL
  mm: kfence: fix elapsed time for allocated/freed track
  ocfs2: fix deadlock in ocfs2_get_system_file_inode
  ocfs2: reserve space for inline xattr before attaching reflink tree
  mm: migrate: annotate data-race in migrate_folio_unmap()
  mm/hugetlb: simplify refs in memfd_alloc_folio
  mm/gup: fix memfd_pin_folios alloc race panic
  mm/gup: fix memfd_pin_folios hugetlb page allocation
  mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
  mm/hugetlb: fix memfd_pin_folios free_huge_pages leak
  mm/filemap: fix filemap_get_folios_contig THP panic
  mm: make SPLIT_PTE_PTLOCKS depend on SMP
  tools: fix shared radix-tree build
2024-09-27 10:27:22 -07:00
Guenter Roeck
c509f67df3 Revert "list: test: fix tests for list_cut_position()"
This reverts commit e620799c41.

The commit introduces unit test failures.

     Expected cur == &entries[i], but
         cur == 0000037fffadfd80
         &entries[i] == 0000037fffadfd60
     # list_test_list_cut_position: pass:0 fail:1 skip:0 total:1
     not ok 21 list_test_list_cut_position
     # list_test_list_cut_before: EXPECTATION FAILED at lib/list-test.c:444
     Expected cur == &entries[i], but
         cur == 0000037fffa9fd70
         &entries[i] == 0000037fffa9fd60
     # list_test_list_cut_before: EXPECTATION FAILED at lib/list-test.c:444
     Expected cur == &entries[i], but
         cur == 0000037fffa9fd80
         &entries[i] == 0000037fffa9fd70

Revert it.

Link: https://lkml.kernel.org/r/20240922150507.553814-1-linux@roeck-us.net
Fixes: e620799c41 ("list: test: fix tests for list_cut_position()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-26 14:01:44 -07:00
Linus Torvalds
11a299a793 for-6.12/block-20240925
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmb0T5AQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnfHEADCXqmqZC+xr3sHZH9T1lz9KaFp1FjuBhCw
 bGpUgXQ9aLcqQUWJxmYVer8N2x2+Ds+xq4fm/rP1BfvNgRupqheHBwuLxSrz14EX
 lYmKZ+krMIPTDaLFewmEWflDwmZX0WFgV6nKTMLiO5BMeI4zXCkFGtwYFys2+Cdd
 9zYCFPgGDZUR77Ws5PpyqPVz2MoiNtsjrGmHpEmNZ+rIDzlpVOYgYk27X9ZbvNxC
 /l0KTc9+ayAeG0Kx5jO+m6Hrj3I6ehvM9JZMgpS/tF/jtccD2oVkJFJDlU+Jciv6
 BwVzgyDPGV7sXFT1fnSqDBYYwr/73nzNH0Gk8wn4Jg2LhjmVANVo9eQSOXDTYZI+
 O4HfIHGTIrk75TQd4bhq3dqaylS78pKBI/eQJUli2UNoyLWMrMyE88yh2YJam2Fs
 vJ/MHGxvFRurYbAlqLr33nb3ajvpg+D7XuAYfqHPMc2ZUe28Kza50Dj+luNjfVCu
 3qfR6qBlsdWuABtUS3vneB9jZp5jDnOpVfuBgtcAqIboUjehTXsI7If09Ex/mxLq
 O0KqNwBMfunPOKd5kGXlAgY8LRMfOhNaAAFBlXYUZB2eAadQnqVselTFvHMZkXo7
 wH/l6trd+/Tf+7Rav0YduNIlpVr7IctC+A7ph4zPdIjQxFEySCrC7cvAjel29LyV
 zgWW0Mw/sA==
 =yiWu
 -----END PGP SIGNATURE-----

Merge tag 'for-6.12/block-20240925' of git://git.kernel.dk/linux

Pull more block updates from Jens Axboe:

 - Improve blk-integrity segment counting and merging (Keith)

 - NVMe pull request via Keith:
      - Multipath fixes (Hannes)
      - Sysfs attribute list NULL terminate fix (Shin'ichiro)
      - Remove problematic read-back (Keith)

 - Fix for a regression with the IO scheduler switching freezing from
   6.11 (Damien)

 - Use a raw spinlock for sbitmap, as it may get called from preempt
   disabled context (Ming)

 - Cleanup for bd_claiming waiting, using var_waitqueue() rather than
   the bit waitqueues, as that more accurately describes that it does
   (Neil)

 - Various cleanups (Kanchan, Qiu-ji, David)

* tag 'for-6.12/block-20240925' of git://git.kernel.dk/linux:
  nvme: remove CC register read-back during enabling
  nvme: null terminate nvme_tls_attrs
  nvme-multipath: avoid hang on inaccessible namespaces
  nvme-multipath: system fails to create generic nvme device
  lib/sbitmap: define swap_lock as raw_spinlock_t
  block: Remove unused blk_limits_io_{min,opt}
  drbd: Fix atomicity violation in drbd_uuid_set_bm()
  block: Fix elv_iosched_local_module handling of "none" scheduler
  block: remove bogus union
  block: change wait on bd_claiming to use a var_waitqueue
  blk-integrity: improved sg segment mapping
  block: unexport blk_rq_count_integrity_sg
  nvme-rdma: use request to get integrity segments
  scsi: use request to get integrity segments
  block: provide a request helper for user integrity segments
  blk-integrity: consider entire bio list for merging
  blk-integrity: properly account for segments
  blk-mq: set the nr_integrity_segments from bio
  blk-mq: unconditional nr_integrity_segments
2024-09-25 14:56:40 -07:00
Linus Torvalds
68e5c7d4ce Kbuild updates for v6.12
- Support cross-compiling linux-headers Debian package and kernel-devel
    RPM package
 
  - Add support for the linux-debug Pacman package
 
  - Improve module rebuilding speed by factoring out the common code to
    scripts/module-common.c
 
  - Separate device tree build rules into scripts/Makefile.dtbs
 
  - Add a new script to generate modules.builtin.ranges, which is useful
    for tracing tools to find symbols in built-in modules
 
  - Refactor Kconfig and misc tools
 
  - Update Kbuild and Kconfig documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmby2+QVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGpQ0QALWMgox3OdceNiBT8QieqRFfwKFv
 5jxtsZt+MbTdWNMEfgc4Cq2i5ZAqpYGZh32RwTiZJogBvYEIoO7M4Md9VwoEe/BC
 q8VZ6FhUy7358IX/FCukfB0dYvkziRalBRDrE4iFmMMdhBvZ9nrvMxllqFCMllLj
 DTrBTTiMus3qiiczr4tb5QwaIR6C+yqiEBF++ftLmWvo9dn8YNNUnI65fGjyQM/w
 0wMPwsB3Y2HdnRpLUS6T18gZbjoXsAk4+WX0TpdBfTs3d7AdbzlSMtc0BslEm6Tb
 JjIK6SbJCM3kNC7O0/gsUenOaSBxSbKjjg33gQxn/eNoi0nRt+qnBMMreYiTd95G
 Hq86QcNfKQtWAagKRTppMkYEDqMU2RKH7BmJOsfQyeG9cGpAAu+0HsQv3f/h5QP1
 MlA8o+NP5oQn6RbrhZz1Pqm24+OMxiXaBhmo8XbZ+MXzi/CBR54Eo4ip/FSHzXII
 EGEAQL7t7YU7xu8qMIE6ZQMH7BJsjJNee0vrNiYZa4xHLYyHi6mJl8K6LlHQ3nEx
 WOsPX9MLITtSJwcvIio/0sEnuR7pjcShGfqhbHO5tiOYznsbcSvu3+18HPGCpFRt
 vYFkNIRc298k7++A+Zp2wwdD2TS+SSilrAImmJXMhf0M+Nyg2vnlfAo8t0QSkFlh
 1g9dJuy+8jYRjHXP
 =g4t/
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Support cross-compiling linux-headers Debian package and kernel-devel
   RPM package

 - Add support for the linux-debug Pacman package

 - Improve module rebuilding speed by factoring out the common code to
   scripts/module-common.c

 - Separate device tree build rules into scripts/Makefile.dtbs

 - Add a new script to generate modules.builtin.ranges, which is useful
   for tracing tools to find symbols in built-in modules

 - Refactor Kconfig and misc tools

 - Update Kbuild and Kconfig documentation

* tag 'kbuild-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (51 commits)
  kbuild: doc: replace "gcc" in external module description
  kbuild: doc: describe the -C option precisely for external module builds
  kbuild: doc: remove the description about shipped files
  kbuild: doc: drop section numbering, use references in modules.rst
  kbuild: doc: throw out the local table of contents in modules.rst
  kbuild: doc: remove outdated description of the limitation on -I usage
  kbuild: doc: remove description about grepping CONFIG options
  kbuild: doc: update the description about Kbuild/Makefile split
  kbuild: remove unnecessary export of RUST_LIB_SRC
  kbuild: remove append operation on cmd_ld_ko_o
  kconfig: cache expression values
  kconfig: use hash table to reuse expressions
  kconfig: refactor expr_eliminate_dups()
  kconfig: add comments to expression transformations
  kconfig: change some expr_*() functions to bool
  scripts: move hash function from scripts/kconfig/ to scripts/include/
  kallsyms: change overflow variable to bool type
  kallsyms: squash output_address()
  kbuild: add install target for modules.builtin.ranges
  scripts: add verifier script for builtin module range data
  ...
2024-09-24 13:02:06 -07:00
Linus Torvalds
9ab27b0186 The core clk framework is left largely untouched this time around except for
support for the newly ratified DT property 'assigned-clock-rates-u64'. I'm much
 more excited about the support for loading DT overlays from KUnit tests so that
 we can test how the clk framework parses DT nodes during clk registration. The
 clk framework has some places that are highly DeviceTree dependent so this
 charts the path to extend the KUnit tests to cover even more framework code in
 the future. I've got some more tests on the list that use the DT overlay
 support, but they uncovered issues with clk unregistration that I'm still
 working on fixing.
 
 Outside the core, the clk driver update pile is dominated by Qualcomm and
 Renesas SoCs, making it fairly usual. Looking closer, there are fixes for
 things all over the place, like adding missing clk frequencies or moving
 defines for the number of clks out of DT binding headers into the drivers.
 There are even conversions of DT bindings to YAML and migration away from
 strings to describe clk topology. Overall it doesn't look unusual so I expect
 the new drivers to be where we'll have fixes in the coming weeks.
 
 Core:
  - KUnit tests for clk registration and fixed rate basic clk type
  - A couple more devm helpers, one consumer and one provider
  - Support for assigned-clock-rates-u64
 
 New Drivers:
  - Camera, display and GPU clocks on Qualcomm SM4450
  - Camera clocks on Qualcomm SM8150
  - Rockchip rk3576 clks
  - Microchip SAM9X7 clks
  - Renesas RZ/V2H(P) (R9A09G057) clks
 
 Updates:
  - Mark a bunch of struct freq_tbl const to reduce .data usage
  - Add Qualcomm MSM8226 A7PLL and Regera PLL support
  - Fix the Qualcomm Lucid 5LPE PLL configuration sequence to not reuse
    Trion, as they do differ
  - A number of fixes to the Qualcomm SM8550 display clock driver
  - Fold Qualcomm SM8650 display clock driver into SM8550 one
  - Add missing clocks and GDSCs needed for audio on Qualcomm MSM8998
  - Add missing USB MP resets, GPLL9, and QUPv3 DFS to Qualcomm SC8180X
  - Fix sdcc clk frequency tables on Qualcomm SC8180X
  - Drop the Qualcomm SM8150 gcc_cpuss_ahb_clk_src
  - Mark Qualcomm PCIe GDSCs as RET_ON on sm8250 and sm8540 to avoid them
    turning off during suspend
  - Use the HW_CTRL mechanism on Qualcomm SM8550 video clock controller
    GDSCs
  - Get rid of CLK_NR_CLKS defines in Rockchip DT binding headers
  - Some fixes for Rockchip rk3228 and rk3588
  - Exynos850: Add clock for Thermal Management Unit
  - Exynos7885: Fix duplicated ID in the header, add missing TOP PLLs and
    add clocks for USB block in the FSYS clock controller
  - ExynosAutov9: Add DPUM clock controller
  - ExynosAutov920: Add new (first) clock controllers: TOP and PERIC0
    (and a bit more complete bindings)
  - Use clk_hw pointer instead of fw_name for acm_aud_clk[0-1]_sel clocks
    on i.MX8Q as parents in ACM provider
  - Add i.MX95 NETCMIX support to the block control provider
  - Fix parents for ENETx_REF_SEL clocks on i.MX6UL
  - Add USB clocks, resets and power domains on Renesas RZ/G3S
  - Add Generic Timer (GTM), I2C Bus Interface (RIIC), SD/MMC Host
    Interface (SDHI) and Watchdog Timer (WDT) clocks and resets on
    Renesas RZ/V2H
  - Add PCIe, PWM, and CAN-FD clocks on Renesas R-Car V4M
  - Add LCD controller clocks and resets on Renesas RZ/G2UL
  - Add DMA clocks and resets on Renesas RZ/G3S
  - Add fractional multiplication PLL support on Renesas R-Car Gen4
  - Document support for the Renesas RZ/G2M v3.0 (r8a774a3) SoC
  - Support for the Microchip SAM9X7 SoC as follows:
  - Updates for the Microchip PLL drivers
  - DT binding documentation updates (for the new clock driver and for
    the slow clock controller that SAM9X7 is using)
  - A fix for the Microchip SAMA7G5 clock driver to avoid allocating more
    memory than necessary
  - Constify some Amlogic structs
  - Add SM1 eARC clocks for Amlogic
  - Introduce a symbol namespace for Amlogic clock specific symbols
  - Add reset controller support to audiomix block control on i.MX
  - Add CLK_SET_RATE_PARENT flag to all audiomix clocks and to
    i.MX7D lcdif_pixel_src clock
  - Fix parent clocks for earc_phy and audpll on i.MX8MP
  - Fix default parents for enet[12]_ref_sel on i.MX6UL
  - Add ops in composite 8M and 93 that allow no-op on disable
  - Add check for PCC present bit on composite 7ULP register
  - Fix fractional part for fracn-gppll on prepare in i.MX
  - Fix clock tree update for TF-A managed clocks on i.MX8M
  - Drop CLK_SET_PARENT_GATE for DRAM mux on i.MX7D
  - Add the SAI7 IPG clock for i.MX8MN
  - Mark the 'nand_usdhc_bus' clock as non-critical on i.MX8MM
  - Add LVDS bypass clocks on i.MX8QXP
  - Add muxes for MIPI and PHY ref clocks on i.MX
  - Reorder dc0_bypass0_clk, lcd_pxl and dc1_disp clocks on i.MX8QXP
  - Add 1039.5MHz and 800MHz rates to fracn-gppll table on i.MX
  - Add CLK_SET_RATE_PARENT for media_disp pixel clocks on i.MX8QXP
  - Add some module descriptions to the i.MX generic and the
    i.MXRT1050 driver
  - Fix return value for bypass for composite i.MX7ULP
  - Move Mediatek clk bindings to clock/
  - Convert some more clk bindings to dt schema
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmbxswcRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSXjoQ/9GRwTJsRBHhFKZscwklDGHJiFOowsLnzC
 q+fk0J2in+7rLezNv/5nkANOtm7eicYv5kkiY/OQArHB704neHkdVfXvSuaGMMM5
 SXPLq7YtH/4haOWhs/HYfx551+cWGHv9orTVDJpF8GHQ5t37C1BX4KphLlUcgxFe
 X0ZvbLdecp/VS4BiU+HM2zPM/SLU8V4xNmARUMZhur9QQ1P2n4YY8zGU87bWLaTB
 u1wrwm9LMtq+A+LR6ViMRwLZKYXaR9o+rndbhCVURvYZEmrIB+x5iYS8RPJa2kvy
 utsPOghOP0VRqZLT2VvLmKud7lk2Th1Uzng4xwcPxdDtpo6D5y+18VoA8tSHD2Zr
 uwirN8pGbJm+7Ak9K9I4KcA9/9JgGRMsPBgCqdnvJxFgD1c7kT2/aJ5AEWmG8GBD
 zUtqLzmSSnNfYBxXeWAqdrGNFzYZju53tl0ACI01W3lwUffPoJwnvHAdI4aiWMv1
 WdzABSnieX7YcGJrnGzV7ZaIdGwUUyR9OQ5JEi+ajD+qCbnI+oXJgEa+tHI5/XLY
 3As5WJlktmRkWzyacAPiGKsyYJYLNTy0TGwBw1CKQIrtIwjR/HF5THEr2qcy6cze
 YiT7xAzhHcjUlMjjcDEe6Qg5R9ykvYSrFixRscWXbdehP1GpWJkqdgzc1+aBJWGW
 QLLHSYHPkXo=
 =XmiQ
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
 "The core clk framework is left largely untouched this time around
  except for support for the newly ratified DT property
  'assigned-clock-rates-u64'.

  I'm much more excited about the support for loading DT overlays from
  KUnit tests so that we can test how the clk framework parses DT nodes
  during clk registration. The clk framework has some places that are
  highly DeviceTree dependent so this charts the path to extend the
  KUnit tests to cover even more framework code in the future. I've got
  some more tests on the list that use the DT overlay support, but they
  uncovered issues with clk unregistration that I'm still working on
  fixing.

  Outside the core, the clk driver update pile is dominated by Qualcomm
  and Renesas SoCs, making it fairly usual. Looking closer, there are
  fixes for things all over the place, like adding missing clk
  frequencies or moving defines for the number of clks out of DT binding
  headers into the drivers. There are even conversions of DT bindings to
  YAML and migration away from strings to describe clk topology. Overall
  it doesn't look unusual so I expect the new drivers to be where we'll
  have fixes in the coming weeks.

  Core:
   - KUnit tests for clk registration and fixed rate basic clk type
   - A couple more devm helpers, one consumer and one provider
   - Support for assigned-clock-rates-u64

  New Drivers:
   - Camera, display and GPU clocks on Qualcomm SM4450
   - Camera clocks on Qualcomm SM8150
   - Rockchip rk3576 clks
   - Microchip SAM9X7 clks
   - Renesas RZ/V2H(P) (R9A09G057) clks

  Updates:
   - Mark a bunch of struct freq_tbl const to reduce .data usage
   - Add Qualcomm MSM8226 A7PLL and Regera PLL support
   - Fix the Qualcomm Lucid 5LPE PLL configuration sequence to not reuse
     Trion, as they do differ
   - A number of fixes to the Qualcomm SM8550 display clock driver
   - Fold Qualcomm SM8650 display clock driver into SM8550 one
   - Add missing clocks and GDSCs needed for audio on Qualcomm MSM8998
   - Add missing USB MP resets, GPLL9, and QUPv3 DFS to Qualcomm SC8180X
   - Fix sdcc clk frequency tables on Qualcomm SC8180X
   - Drop the Qualcomm SM8150 gcc_cpuss_ahb_clk_src
   - Mark Qualcomm PCIe GDSCs as RET_ON on sm8250 and sm8540 to avoid
     them turning off during suspend
   - Use the HW_CTRL mechanism on Qualcomm SM8550 video clock controller
     GDSCs
   - Get rid of CLK_NR_CLKS defines in Rockchip DT binding headers
   - Some fixes for Rockchip rk3228 and rk3588
   - Exynos850: Add clock for Thermal Management Unit
   - Exynos7885: Fix duplicated ID in the header, add missing TOP PLLs
     and add clocks for USB block in the FSYS clock controller
   - ExynosAutov9: Add DPUM clock controller
   - ExynosAutov920: Add new (first) clock controllers: TOP and PERIC0
     (and a bit more complete bindings)
   - Use clk_hw pointer instead of fw_name for acm_aud_clk[0-1]_sel
     clocks on i.MX8Q as parents in ACM provider
   - Add i.MX95 NETCMIX support to the block control provider
   - Fix parents for ENETx_REF_SEL clocks on i.MX6UL
   - Add USB clocks, resets and power domains on Renesas RZ/G3S
   - Add Generic Timer (GTM), I2C Bus Interface (RIIC), SD/MMC Host
     Interface (SDHI) and Watchdog Timer (WDT) clocks and resets on
     Renesas RZ/V2H
   - Add PCIe, PWM, and CAN-FD clocks on Renesas R-Car V4M
   - Add LCD controller clocks and resets on Renesas RZ/G2UL
   - Add DMA clocks and resets on Renesas RZ/G3S
   - Add fractional multiplication PLL support on Renesas R-Car Gen4
   - Document support for the Renesas RZ/G2M v3.0 (r8a774a3) SoC
   - Support for the Microchip SAM9X7 SoC as follows:
   - Updates for the Microchip PLL drivers
   - DT binding documentation updates (for the new clock driver and for
     the slow clock controller that SAM9X7 is using)
   - A fix for the Microchip SAMA7G5 clock driver to avoid allocating
     more memory than necessary
   - Constify some Amlogic structs
   - Add SM1 eARC clocks for Amlogic
   - Introduce a symbol namespace for Amlogic clock specific symbols
   - Add reset controller support to audiomix block control on i.MX
   - Add CLK_SET_RATE_PARENT flag to all audiomix clocks and to i.MX7D
     lcdif_pixel_src clock
   - Fix parent clocks for earc_phy and audpll on i.MX8MP
   - Fix default parents for enet[12]_ref_sel on i.MX6UL
   - Add ops in composite 8M and 93 that allow no-op on disable
   - Add check for PCC present bit on composite 7ULP register
   - Fix fractional part for fracn-gppll on prepare in i.MX
   - Fix clock tree update for TF-A managed clocks on i.MX8M
   - Drop CLK_SET_PARENT_GATE for DRAM mux on i.MX7D
   - Add the SAI7 IPG clock for i.MX8MN
   - Mark the 'nand_usdhc_bus' clock as non-critical on i.MX8MM
   - Add LVDS bypass clocks on i.MX8QXP
   - Add muxes for MIPI and PHY ref clocks on i.MX
   - Reorder dc0_bypass0_clk, lcd_pxl and dc1_disp clocks on i.MX8QXP
   - Add 1039.5MHz and 800MHz rates to fracn-gppll table on i.MX
   - Add CLK_SET_RATE_PARENT for media_disp pixel clocks on i.MX8QXP
   - Add some module descriptions to the i.MX generic and the i.MXRT1050
     driver
   - Fix return value for bypass for composite i.MX7ULP
   - Move Mediatek clk bindings to clock/
   - Convert some more clk bindings to dt schema"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (180 commits)
  clk: Switch back to struct platform_driver::remove()
  dt-bindings: clock, reset: fix top-comment indentation rk3576 headers
  clk: rockchip: remove unused mclk_pdm0_p/pdm0_p definitions
  clk: provide devm_clk_get_optional_enabled_with_rate()
  clk: fixed-rate: add devm_clk_hw_register_fixed_rate_parent_data()
  clk: imx6ul: fix clock parent for IMX6UL_CLK_ENETx_REF_SEL
  clk: renesas: r9a09g057: Add clock and reset entries for GTM/RIIC/SDHI/WDT
  clk: renesas: rzv2h: Add support for dynamic switching divider clocks
  clk: renesas: r9a08g045: Add clocks, resets and power domains for USB
  clk: rockchip: fix error for unknown clocks
  clk: rockchip: rk3588: drop unused code
  clk: rockchip: Add clock controller for the RK3576
  clk: rockchip: Add new pll type pll_rk3588_ddr
  dt-bindings: clock, reset: Add support for rk3576
  dt-bindings: clock: rockchip,rk3588-cru: drop unneeded assigned-clocks
  clk: rockchip: rk3588: Fix 32k clock name for pmu_24m_32k_100m_src_p
  clk: imx95: enable the clock of NETCMIX block control
  dt-bindings: clock: add RMII clock selection
  dt-bindings: clock: add i.MX95 NETCMIX block control
  clk: imx: imx8: Use clk_hw pointer for self registered clock in clk_parent_data
  ...
2024-09-23 15:01:48 -07:00
Linus Torvalds
b3f391fddf bcachefs changes for 6.12-rc1
rcu_pending, btree key cache rework: this solves lock contenting in the
 key cache, eliminating the biggest source of the srcu lock hold time
 warnings, and drastically improving performance on some metadata heavy
 workloads - on multithreaded creates we're now 3-4x faster than xfs.
 
 We're now using an rhashtable instead of the system inode hash table;
 this is another significant performance improvement on multithreaded
 metadata workloads, eliminating more lock contention.
 
 for_each_btree_key_in_subvolume_upto(): new helper for iterating over
 keys within a specific subvolume, eliminating a lot of open coded
 "subvolume_get_snapshot()" and also fixing another source of srcu lock
 time warnings, by running each loop iteration in its own transaction (as
 the existing for_each_btree_key() does).
 
 More work on btree_trans locking asserts; we now assert that we don't
 hold btree node locks when trans->locked is false, which is important
 because we don't use lockdep for tracking individual btree node locks.
 
 Some cleanups and improvements in the bset.c btree node lookup code,
 from Alan.
 
 Rework of btree node pinning, which we use in backpointers fsck. The old
 hacky implementation, where the shrinker just skipped over nodes in the
 pinned range, was causing OOMs; instead we now use another shrinker with
 a much higher seeks number for pinned nodes.
 
 Rebalance now uses BCH_WRITE_ONLY_SPECIFIED_DEVS; this fixes an issue
 where rebalance would sometimes fall back to allocating from the full
 filesystem, which is not what we want when it's trying to move data to a
 specific target.
 
 Use __GFP_ACCOUNT, GFP_RECLAIMABLE for btree node, key cache
 allocations.
 
 Idmap mounts are now supported - Hongbo.
 
 Rename whiteouts are now supported - Hongbo.
 
 Erasure coding can now handle devices being marked as failed, or
 forcibly removed. We still need the evacuate path for erasure coding,
 but it's getting very close to ready for people to start using.
 
 Status, and when will we be taking off experimental:
 ----------------------------------------------------
 
 Going by critical, user facing bugs getting found and fixed, we're
 nearly there. There are a couple key items that need to be finished
 before we can take off the experimental label:
 
 - The end-user experience is still pretty painful when the root
   filesystem needs a fsck; we need some form of limited self healing so
   that necessary repair gets run automatically. Errors (by type) are
   recorded in the superblock, so what we need to do next is convert
   remaining inconsistent() errors to fsck() errors (so that all runtime
   inconsistencies are logged in the superblock), and we need to go
   through the list of fsck errors and classify them by which fsck passes
   are needed to repair them.
 
 - We need comprehensive torture testing for all our repair paths, to
   shake out remaining bugs there. Thomas has been working on the tooling
   for this, so this is coming soonish.
 
 Slightly less critical items:
 
 - We need to improve the end-user experience for degraded mounts: right
   now, a degraded root filesystem means dropping to an initramfs shell
   or somehow inputting mount options manually (we don't want to allow
   degraded mounts without some form of user input, except on unattended
   servers) - we need the mount helper to prompt the user to allow
   mounting degraded, and make sure this works with systemd.
 
 - Scalabiity: we have users running 100TB+ filesystems, and that's
   effectively the limit right now due to fsck times. We have some
   reworks in the pipeline to address this, we're aiming to make petabyte
   sized filesystems practical.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmbvHQoACgkQE6szbY3K
 bnYfAw/+IXQ43/O+Jzs0MLD7pKZnrlbHiX9FqYLazD40vWvkyRTQOwgTn8pVNhq3
 4YWmtuZyqh036YC+bGqYFOhz20YetS5UdgbClpwmc99JJ6xsY+Z1mdpYfz5oq1Dw
 /pBX5iYb3rAt8UbQoZ8lcWM+GpT3GKJVgJuiLB2gRp9gATFesuh+0qU42oIVVVU5
 4y3VhDBUmRk4XqEnk8hr7EIDMW0wWP3aptxYMZzeUPW0x1cEQ+FWrJo5D6lXv2KK
 dKv3MogvA0FFNi/eNexclPiu2pXtI7vrxT7umsxAICHLt41rWpV5ttE6io3bC4ZN
 qvwF9w2CpmKPKchFru9PO+QrWHVR7e6bphwf3TzyoKZ7tTn42f1RQlub7gBzI3bz
 ai5ZwGRIvpUoPVBj+CO+Ipog81uUb23Ma+gXg1akEFBOAb+o7I3KOOSBh5l+0cHj
 3Ov1n0TLcsoO2cqoqfsV2QubW9YcWEZ76g5mKwQnUn8Cs6Fp0wWaIyK9aNkIAxcr
 tNDPGtH1gKitxUvju5i/LyI7y1UoeFvqJFee0VsU6QnixHn1ySzhePsJt6UEnIJT
 Ia3C96Igqu2mV9FxhfGHj/qi7TGjqqkZHa8+B610cDpgf15cx7Ps2DYjkuQMFCqZ
 Q3Q1o5De9roRq5xF2hLiYJCbzJKqd5ichFsBtLQuX572ICxbICg=
 =oVCy
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - rcu_pending, btree key cache rework: this solves lock contenting in
   the key cache, eliminating the biggest source of the srcu lock hold
   time warnings, and drastically improving performance on some metadata
   heavy workloads - on multithreaded creates we're now 3-4x faster than
   xfs.

 - We're now using an rhashtable instead of the system inode hash table;
   this is another significant performance improvement on multithreaded
   metadata workloads, eliminating more lock contention.

 - for_each_btree_key_in_subvolume_upto(): new helper for iterating over
   keys within a specific subvolume, eliminating a lot of open coded
   "subvolume_get_snapshot()" and also fixing another source of srcu
   lock time warnings, by running each loop iteration in its own
   transaction (as the existing for_each_btree_key() does).

 - More work on btree_trans locking asserts; we now assert that we don't
   hold btree node locks when trans->locked is false, which is important
   because we don't use lockdep for tracking individual btree node
   locks.

 - Some cleanups and improvements in the bset.c btree node lookup code,
   from Alan.

 - Rework of btree node pinning, which we use in backpointers fsck. The
   old hacky implementation, where the shrinker just skipped over nodes
   in the pinned range, was causing OOMs; instead we now use another
   shrinker with a much higher seeks number for pinned nodes.

 - Rebalance now uses BCH_WRITE_ONLY_SPECIFIED_DEVS; this fixes an issue
   where rebalance would sometimes fall back to allocating from the full
   filesystem, which is not what we want when it's trying to move data
   to a specific target.

 - Use __GFP_ACCOUNT, GFP_RECLAIMABLE for btree node, key cache
   allocations.

 - Idmap mounts are now supported (Hongbo Li)

 - Rename whiteouts are now supported (Hongbo Li)

 - Erasure coding can now handle devices being marked as failed, or
   forcibly removed. We still need the evacuate path for erasure coding,
   but it's getting very close to ready for people to start using.

* tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs: (99 commits)
  bcachefs: return err ptr instead of null in read sb clean
  bcachefs: Remove duplicated include in backpointers.c
  bcachefs: Don't drop devices with stripe pointers
  bcachefs: bch2_ec_stripe_head_get() now checks for change in rw devices
  bcachefs: bch_fs.rw_devs_change_count
  bcachefs: bch2_dev_remove_stripes()
  bcachefs: bch2_trigger_ptr() calculates sectors even when no device
  bcachefs: improve error messages in bch2_ec_read_extent()
  bcachefs: improve error message on too few devices for ec
  bcachefs: improve bch2_new_stripe_to_text()
  bcachefs: ec_stripe_head.nr_created
  bcachefs: bch_stripe.disk_label
  bcachefs: stripe_to_mem()
  bcachefs: EIO errcode cleanup
  bcachefs: Rework btree node pinning
  bcachefs: split up btree cache counters for live, freeable
  bcachefs: btree cache counters should be size_t
  bcachefs: Don't count "skipped access bit" as touched in btree cache scan
  bcachefs: Failed devices no longer require mounting in degraded mode
  bcachefs: bch2_dev_rcu_noerror()
  ...
2024-09-23 10:05:41 -07:00
Linus Torvalds
de5cb0dcb7 Merge branch 'address-masking'
Merge user access fast validation using address masking.

This allows architectures to optionally use a data dependent address
masking model instead of a conditional branch for validating user
accesses.  That avoids the Spectre-v1 speculation barriers.

Right now only x86-64 takes advantage of this, and not all architectures
will be able to do it.  It requires a guard region between the user and
kernel address spaces (so that you can't overflow from one to the
other), and an easy way to generate a guaranteed-to-fault address for
invalid user pointers.

Also note that this currently assumes that there is no difference
between user read and write accesses.  If extended to architectures like
powerpc, we'll also need to separate out the user read-vs-write cases.

* address-masking:
  x86: make the masked_user_access_begin() macro use its argument only once
  x86: do the user address masking outside the user access area
  x86: support user address masking instead of non-speculative conditional
2024-09-22 11:19:35 -07:00
Linus Torvalds
88264981f2 sched_ext: Initial pull request for v6.12
This is the initial pull request of sched_ext. The v7 patchset
 (https://lkml.kernel.org/r/20240618212056.2833381-1-tj@kernel.org) is
 applied on top of tip/sched/core + bpf/master as of Jun 18th.
 
   tip/sched/core 793a62823d1c ("sched/core: Drop spinlocks on contention iff kernel is preempti
 ble")
   bpf/master f6afdaf72a ("Merge branch 'bpf-support-resilient-split-btf'")
 
 Since then, the following pulls were made:
 
 - v6.11-rc1 is pulled to keep up with the mainline.
 
 - tip/sched/core was pulled several times:
 
   - 7b9f6c864a, 0df340ceae, 5ac998574f, 0b1777f0fa: To resolve
     conflicts. See each commit for details on conflicts and their
     resolutions.
 
   - d7b01aef9d: To receive fd03c5b858 ("sched: Rework pick_next_task()")
     and related commits. @prev in added to sched_class->put_prev_task() and
     put_prev_task() is reordered after ->pick_task(), which makes
     sched_class->switch_class() unnecessary. The follow-up commits update
     sched_ext accordingly and drop sched_class->switch_class().
 
 - bpf/master was pulled to receive baebe9aaba ("bpf: allow passing struct
   bpf_iter_<type> as kfunc arguments") and related changes in preparation
   for the DSQ iterator patchset
 
 To obtain the net sched_ext changes, diff against:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git for-6.12-base
 
 which is the merge of:
 
   tip/sched/core bc9057da1a ("sched/cpufreq: Use NSEC_PER_MSEC for deadline task")
   bpf/master 2ad6d23f46 ("selftests/bpf: Do not update vmlinux.h unnecessarily")
 
 Since the v7 patchset, the following changes were made:
 
 - cpuperf support which was a part of the v6 patchset was posted separately
   and then applied after reviews.
 
 - cgroup support which was a part of the v6 patchset was posted seprately,
   iterated and then applied.
 
 - Improve integration with sched core.
 
 - Double locking usage in migration paths dropped. Depend on
   TASK_ON_RQ_MIGRATING synchronization instead.
 
 - The BPF scheduler couldn't directly dispatch to the local DSQ of another
   CPU using a SCX_DSQ_LOCAL_ON verdict. This caused difficulties around
   handling non-wakeup enqueues. Updated so that SCX_DSQ_LOCAL_ON can be used
   in the enqueue path too.
 
 - DSQ iterator which was a part of the v6 patchset was posted separately.
   The iterator itself was applied after a couple revisions. The associated
   selective consumption kfunc can use further improvements and is still
   being worked on.
 
 - scx_bpf_dispatch[_vtime]_from_dsq() added to increase flexibility. A task
   can now be transferred between two DSQs from almost any context. This
   involved significant refactoring of migration code.
 
 - Various fixes and improvements.
 
 As the branch is based on top of tip/sched/core + bpf/master, please merge
 after both are applied.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZuOSuA4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGVZyAQDBU3WPkYKB8gl6a6YQ+/PzBXorOK7mioS9A2iJ
 vBR3FgEAg1vtcss1S+2juWmVq7ItiFNWCqtXzUr/bVmL9CqqDwA=
 =bOOC
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext support from Tejun Heo:
 "This implements a new scheduler class called ‘ext_sched_class’, or
  sched_ext, which allows scheduling policies to be implemented as BPF
  programs.

  The goals of this are:

   - Ease of experimentation and exploration: Enabling rapid iteration
     of new scheduling policies.

   - Customization: Building application-specific schedulers which
     implement policies that are not applicable to general-purpose
     schedulers.

   - Rapid scheduler deployments: Non-disruptive swap outs of scheduling
     policies in production environments"

See individual commits for more documentation, but also the cover letter
for the latest series:

Link: https://lore.kernel.org/all/20240618212056.2833381-1-tj@kernel.org/

* tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (110 commits)
  sched: Move update_other_load_avgs() to kernel/sched/pelt.c
  sched_ext: Don't trigger ops.quiescent/runnable() on migrations
  sched_ext: Synchronize bypass state changes with rq lock
  scx_qmap: Implement highpri boosting
  sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()
  sched_ext: Compact struct bpf_iter_scx_dsq_kern
  sched_ext: Replace consume_local_task() with move_local_task_to_local_dsq()
  sched_ext: Move consume_local_task() upward
  sched_ext: Move sanity check and dsq_mod_nr() into task_unlink_from_dsq()
  sched_ext: Reorder args for consume_local/remote_task()
  sched_ext: Restructure dispatch_to_local_dsq()
  sched_ext: Fix processs_ddsp_deferred_locals() by unifying DTL_INVALID handling
  sched_ext: Make find_dsq_for_dispatch() handle SCX_DSQ_LOCAL_ON
  sched_ext: Refactor consume_remote_task()
  sched_ext: Rename scx_kfunc_set_sleepable to unlocked and relocate
  sched_ext: Add missing static to scx_dump_data
  sched_ext: Add missing static to scx_has_op[]
  sched_ext: Temporarily work around pick_task_scx() being called without balance_scx()
  sched_ext: Add a cgroup scheduler which uses flattened hierarchy
  sched_ext: Add cgroup support
  ...
2024-09-21 09:44:57 -07:00
Linus Torvalds
440b652328 bpf-next-6.12
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmbk/nIACgkQ6rmadz2v
 bTqxuBAAnqW81Rr0nORIxeJMbyo4EiFuYHGk6u5BYP9NPzqHroUPCLVmSP7Hp/Ta
 CJjsiZeivZsGa6Qlc3BCa4hHNpqP5WE1C/73svSDn7/99EfxdSBtirpMVFUPsUtn
 DDb5chNpvnxKNS8Mw5Ty8wBrdbXHMlSx+IfaFHpv0Yn6EAcuF4UdoEUq2l3PqhfD
 Il9Zm127eViPGAP+o+TBZFfW+rRw8d0ngqeRq2GvJ8ibNEDWss+GmBI1Dod7d+fC
 dUDg96Ipdm1a5Xz7dnH80eXz9JHdpu6qhQrQMKKArnlpJElrKiOf9b17ZcJoPQOR
 ZnstEnUyVnrWROZxUuKY72+2tx3TuSf+L9uZqFHNx3Ix5FIoS+tFbHf4b8SxtsOb
 hb2X7SigdGqhQDxUT+IPeO5hsJlIvG1/VYxMXxgc++rh9DjL06hDLUSH1WBSU0fC
 kFQ7HrcpAlVHtWmGbwwUyVjD+KC/qmZBTAnkcYT4C62WZVytSCnihIuSFAvV1tpZ
 SSIhVPyQ599UoZIiQYihp0S4qP74FotCtErWSrThneh2Cl8kDsRq//lV1nj/PTV8
 CpTvz4VCFDFTgthCfd62fP95EwW5K+aE3NjGTPW/9Hx/0+J/1tT+yqWsrToGaruf
 TbrqtzQhpclz9UEqA+696cVAXNj9uRU4AoD3YIg72kVnRlkgYd0=
 =MDwh
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Introduce '__attribute__((bpf_fastcall))' for helpers and kfuncs with
   corresponding support in LLVM.

   It is similar to existing 'no_caller_saved_registers' attribute in
   GCC/LLVM with a provision for backward compatibility. It allows
   compilers generate more efficient BPF code assuming the verifier or
   JITs will inline or partially inline a helper/kfunc with such
   attribute. bpf_cast_to_kern_ctx, bpf_rdonly_cast,
   bpf_get_smp_processor_id are the first set of such helpers.

 - Harden and extend ELF build ID parsing logic.

   When called from sleepable context the relevants parts of ELF file
   will be read to find and fetch .note.gnu.build-id information. Also
   harden the logic to avoid TOCTOU, overflow, out-of-bounds problems.

 - Improvements and fixes for sched-ext:
    - Allow passing BPF iterators as kfunc arguments
    - Make the pointer returned from iter_next method trusted
    - Fix x86 JIT convergence issue due to growing/shrinking conditional
      jumps in variable length encoding

 - BPF_LSM related:
    - Introduce few VFS kfuncs and consolidate them in
      fs/bpf_fs_kfuncs.c
    - Enforce correct range of return values from certain LSM hooks
    - Disallow attaching to other LSM hooks

 - Prerequisite work for upcoming Qdisc in BPF:
    - Allow kptrs in program provided structs
    - Support for gen_epilogue in verifier_ops

 - Important fixes:
    - Fix uprobe multi pid filter check
    - Fix bpf_strtol and bpf_strtoul helpers
    - Track equal scalars history on per-instruction level
    - Fix tailcall hierarchy on x86 and arm64
    - Fix signed division overflow to prevent INT_MIN/-1 trap on x86
    - Fix get kernel stack in BPF progs attached to tracepoint:syscall

 - Selftests:
    - Add uprobe bench/stress tool
    - Generate file dependencies to drastically improve re-build time
    - Match JIT-ed and BPF asm with __xlated/__jited keywords
    - Convert older tests to test_progs framework
    - Add support for RISC-V
    - Few fixes when BPF programs are compiled with GCC-BPF backend
      (support for GCC-BPF in BPF CI is ongoing in parallel)
    - Add traffic monitor
    - Enable cross compile and musl libc

* tag 'bpf-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (260 commits)
  btf: require pahole 1.21+ for DEBUG_INFO_BTF with default DWARF version
  btf: move pahole check in scripts/link-vmlinux.sh to lib/Kconfig.debug
  btf: remove redundant CONFIG_BPF test in scripts/link-vmlinux.sh
  bpf: Call the missed kfree() when there is no special field in btf
  bpf: Call the missed btf_record_free() when map creation fails
  selftests/bpf: Add a test case to write mtu result into .rodata
  selftests/bpf: Add a test case to write strtol result into .rodata
  selftests/bpf: Rename ARG_PTR_TO_LONG test description
  selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test
  bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error
  bpf: Improve check_raw_mode_ok test for MEM_UNINIT-tagged types
  bpf: Fix helper writes to read-only maps
  bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers
  bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit
  selftests/bpf: Add tests for sdiv/smod overflow cases
  bpf: Fix a sdiv overflow issue
  libbpf: Add bpf_object__token_fd accessor
  docs/bpf: Add missing BPF program types to docs
  docs/bpf: Add constant values for linkages
  bpf: Use fake pt_regs when doing bpf syscall tracepoint tracing
  ...
2024-09-21 09:27:50 -07:00
Linus Torvalds
7856a56541 Many singleton patches - please see the various changelogs for details.
Quite a lot of nilfs2 work this time around.
 
 Notable patch series in this pull request are:
 
 "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with
 assistance from Uwe Kleine-König.  Reimplement mul_u64_u64_div_u64() to
 provide (much) more accurate results.  The current implementation was
 causing Uwe some issues in the PWM drivers.
 
 "xz: Updates to license, filters, and compression options" from Lasse
 Collin.  Miscellaneous maintenance and kinor feature work to the xz
 decompressor.
 
 "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee.
 Fixes and enhancements to the gdb scripts.
 
 "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson.
 Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this.
 
 "nilfs2: add support for some common ioctls" from Ryusuke Konishi.  Adds
 various commonly-available ioctls to nilfs2.
 
 "This series fixes a number of formatting issues in kernel doc comments"
 from Ryusuke Konishi does that.
 
 "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi.  Fix
 issues where -ENOENT was being unintentionally and inappropriately
 returned to userspace.
 
 "nilfs2: assorted cleanups" from Huang Xiaojia.
 
 "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke
 Konishi fixes some issues which can occur on corrupted nilfs2 filesystems.
 
 "scripts/decode_stacktrace.sh: improve error reporting and usability" from
 Luca Ceresoli does those things.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu7dpAAKCRDdBJ7gKXxA
 jsPqAPwMDEZyKlfSw7QioEHNHDkmkbP7VYCYR0CbUnppbztwpAD8D37aVbWQ+UzM
 3nnOq3W2Pc2o/20zqi8Upf1mnvUrygQ=
 =/NWE
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Many singleton patches - please see the various changelogs for
  details.

  Quite a lot of nilfs2 work this time around.

  Notable patch series in this pull request are:

   - "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with
     assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64()
     to provide (much) more accurate results. The current implementation
     was causing Uwe some issues in the PWM drivers.

   - "xz: Updates to license, filters, and compression options" from
     Lasse Collin. Miscellaneous maintenance and kinor feature work to
     the xz decompressor.

   - "Fix some GDB command error and add some GDB commands" from
     Kuan-Ying Lee. Fixes and enhancements to the gdb scripts.

   - "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff
     Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of
     warnings about this.

   - "nilfs2: add support for some common ioctls" from Ryusuke Konishi.
     Adds various commonly-available ioctls to nilfs2.

   - "This series fixes a number of formatting issues in kernel doc
     comments" from Ryusuke Konishi does that.

   - "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke
     Konishi. Fix issues where -ENOENT was being unintentionally and
     inappropriately returned to userspace.

   - "nilfs2: assorted cleanups" from Huang Xiaojia.

   - "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke
     Konishi fixes some issues which can occur on corrupted nilfs2
     filesystems.

   - "scripts/decode_stacktrace.sh: improve error reporting and
     usability" from Luca Ceresoli does those things"

* tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (103 commits)
  list: test: increase coverage of list_test_list_replace*()
  list: test: fix tests for list_cut_position()
  proc: use __auto_type more
  treewide: correct the typo 'retun'
  ocfs2: cleanup return value and mlog in ocfs2_global_read_info()
  nilfs2: remove duplicate 'unlikely()' usage
  nilfs2: fix potential oob read in nilfs_btree_check_delete()
  nilfs2: determine empty node blocks as corrupted
  nilfs2: fix potential null-ptr-deref in nilfs_btree_insert()
  user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation
  tools/mm: rm thp_swap_allocator_test when make clean
  squashfs: fix percpu address space issues in decompressor_multi_percpu.c
  lib: glob.c: added null check for character class
  nilfs2: refactor nilfs_segctor_thread()
  nilfs2: use kthread_create and kthread_stop for the log writer thread
  nilfs2: remove sc_timer_task
  nilfs2: do not repair reserved inode bitmap in nilfs_new_inode()
  nilfs2: eliminate the shared counter and spinlock for i_generation
  nilfs2: separate inode type information from i_state field
  nilfs2: use the BITS_PER_LONG macro
  ...
2024-09-21 08:20:50 -07:00
Linus Torvalds
617a814f14 ALong with the usual shower of singleton patches, notable patch series in
this pull request are:
 
 "Align kvrealloc() with krealloc()" from Danilo Krummrich.  Adds
 consistency to the APIs and behaviour of these two core allocation
 functions.  This also simplifies/enables Rustification.
 
 "Some cleanups for shmem" from Baolin Wang.  No functional changes - mode
 code reuse, better function naming, logic simplifications.
 
 "mm: some small page fault cleanups" from Josef Bacik.  No functional
 changes - code cleanups only.
 
 "Various memory tiering fixes" from Zi Yan.  A small fix and a little
 cleanup.
 
 "mm/swap: remove boilerplate" from Yu Zhao.  Code cleanups and
 simplifications and .text shrinkage.
 
 "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt.  This
 is a feature, it adds new feilds to /proc/vmstat such as
 
     $ grep kstack /proc/vmstat
     kstack_1k 3
     kstack_2k 188
     kstack_4k 11391
     kstack_8k 243
     kstack_16k 0
 
 which tells us that 11391 processes used 4k of stack while none at all
 used 16k.  Useful for some system tuning things, but partivularly useful
 for "the dynamic kernel stack project".
 
 "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov.
 Teaches kmemleak to detect leaksage of percpu memory.
 
 "mm: memcg: page counters optimizations" from Roman Gushchin.  "3
 independent small optimizations of page counters".
 
 "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David
 Hildenbrand.  Improves PTE/PMD splitlock detection, makes powerpc/8xx work
 correctly by design rather than by accident.
 
 "mm: remove arch_make_page_accessible()" from David Hildenbrand.  Some
 folio conversions which make arch_make_page_accessible() unneeded.
 
 "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel.
 Cleans up and fixes our handling of the resetting of the cgroup/process
 peak-memory-use detector.
 
 "Make core VMA operations internal and testable" from Lorenzo Stoakes.
 Rationalizaion and encapsulation of the VMA manipulation APIs.  With a
 view to better enable testing of the VMA functions, even from a
 userspace-only harness.
 
 "mm: zswap: fixes for global shrinker" from Takero Funaki.  Fix issues in
 the zswap global shrinker, resulting in improved performance.
 
 "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao.  Fill in
 some missing info in /proc/zoneinfo.
 
 "mm: replace follow_page() by folio_walk" from David Hildenbrand.  Code
 cleanups and rationalizations (conversion to folio_walk()) resulting in
 the removal of follow_page().
 
 "improving dynamic zswap shrinker protection scheme" from Nhat Pham.  Some
 tuning to improve zswap's dynamic shrinker.  Significant reductions in
 swapin and improvements in performance are shown.
 
 "mm: Fix several issues with unaccepted memory" from Kirill Shutemov.
 Improvements to the new unaccepted memory feature,
 
 "mm/mprotect: Fix dax puds" from Peter Xu.  Implements mprotect on DAX
 PUDs.  This was missing, although nobody seems to have notied yet.
 
 "Introduce a store type enum for the Maple tree" from Sidhartha Kumar.
 Cleanups and modest performance improvements for the maple tree library
 code.
 
 "memcg: further decouple v1 code from v2" from Shakeel Butt.  Move more
 cgroup v1 remnants away from the v2 memcg code.
 
 "memcg: initiate deprecation of v1 features" from Shakeel Butt.  Adds
 various warnings telling users that memcg v1 features are deprecated.
 
 "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li.
 Greatly improves the success rate of the mTHP swap allocation.
 
 "mm: introduce numa_memblks" from Mike Rapoport.  Moves various disparate
 per-arch implementations of numa_memblk code into generic code.
 
 "mm: batch free swaps for zap_pte_range()" from Barry Song.  Greatly
 improves the performance of munmap() of swap-filled ptes.
 
 "support large folio swap-out and swap-in for shmem" from Baolin Wang.
 With this series we no longer split shmem large folios into simgle-page
 folios when swapping out shmem.
 
 "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao.  Nice performance
 improvements and code reductions for gigantic folios.
 
 "support shmem mTHP collapse" from Baolin Wang.  Adds support for
 khugepaged's collapsing of shmem mTHP folios.
 
 "mm: Optimize mseal checks" from Pedro Falcato.  Fixes an mprotect()
 performance regression due to the addition of mseal().
 
 "Increase the number of bits available in page_type" from Matthew Wilcox.
 Increases the number of bits available in page_type!
 
 "Simplify the page flags a little" from Matthew Wilcox.  Many legacy page
 flags are now folio flags, so the page-based flags and their
 accessors/mutators can be removed.
 
 "mm: store zero pages to be swapped out in a bitmap" from Usama Arif.  An
 optimization which permits us to avoid writing/reading zero-filled zswap
 pages to backing store.
 
 "Avoid MAP_FIXED gap exposure" from Liam Howlett.  Fixes a race window
 which occurs when a MAP_FIXED operqtion is occurring during an unrelated
 vma tree walk.
 
 "mm: remove vma_merge()" from Lorenzo Stoakes.  Major rotorooting of the
 vma_merge() functionality, making ot cleaner, more testable and better
 tested.
 
 "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.  Minor
 fixups of DAMON selftests and kunit tests.
 
 "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.  Code
 cleanups and folio conversions.
 
 "Shmem mTHP controls and stats improvements" from Ryan Roberts.  Cleanups
 for shmem controls and stats.
 
 "mm: count the number of anonymous THPs per size" from Barry Song.  Expose
 additional anon THP stats to userspace for improved tuning.
 
 "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio
 conversions and removal of now-unused page-based APIs.
 
 "replace per-quota region priorities histogram buffer with per-context
 one" from SeongJae Park.  DAMON histogram rationalization.
 
 "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae
 Park.  DAMON documentation updates.
 
 "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve
 related doc and warn" from Jason Wang: fixes usage of page allocator
 __GFP_NOFAIL and GFP_ATOMIC flags.
 
 "mm: split underused THPs" from Yu Zhao.  Improve THP=always policy - this
 was overprovisioning THPs in sparsely accessed memory areas.
 
 "zram: introduce custom comp backends API" frm Sergey Senozhatsky.  Add
 support for zram run-time compression algorithm tuning.
 
 "mm: Care about shadow stack guard gap when getting an unmapped area" from
 Mark Brown.  Fix up the various arch_get_unmapped_area() implementations
 to better respect guard areas.
 
 "Improve mem_cgroup_iter()" from Kinsey Ho.  Improve the reliability of
 mem_cgroup_iter() and various code cleanups.
 
 "mm: Support huge pfnmaps" from Peter Xu.  Extends the usage of huge
 pfnmap support.
 
 "resource: Fix region_intersects() vs add_memory_driver_managed()" from
 Huang Ying.  Fix a bug in region_intersects() for systems with CXL memory.
 
 "mm: hwpoison: two more poison recovery" from Kefeng Wang.  Teaches a
 couple more code paths to correctly recover from the encountering of
 poisoned memry.
 
 "mm: enable large folios swap-in support" from Barry Song.  Support the
 swapin of mTHP memory into appropriately-sized folios, rather than into
 single-page folios.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu1BBwAKCRDdBJ7gKXxA
 jlWNAQDYlqQLun7bgsAN4sSvi27VUuWv1q70jlMXTfmjJAvQqwD/fBFVR6IOOiw7
 AkDbKWP2k0hWPiNJBGwoqxdHHx09Xgo=
 =s0T+
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Along with the usual shower of singleton patches, notable patch series
  in this pull request are:

   - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds
     consistency to the APIs and behaviour of these two core allocation
     functions. This also simplifies/enables Rustification.

   - "Some cleanups for shmem" from Baolin Wang. No functional changes -
     mode code reuse, better function naming, logic simplifications.

   - "mm: some small page fault cleanups" from Josef Bacik. No
     functional changes - code cleanups only.

   - "Various memory tiering fixes" from Zi Yan. A small fix and a
     little cleanup.

   - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and
     simplifications and .text shrinkage.

   - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel
     Butt. This is a feature, it adds new feilds to /proc/vmstat such as

       $ grep kstack /proc/vmstat
       kstack_1k 3
       kstack_2k 188
       kstack_4k 11391
       kstack_8k 243
       kstack_16k 0

     which tells us that 11391 processes used 4k of stack while none at
     all used 16k. Useful for some system tuning things, but
     partivularly useful for "the dynamic kernel stack project".

   - "kmemleak: support for percpu memory leak detect" from Pavel
     Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory.

   - "mm: memcg: page counters optimizations" from Roman Gushchin. "3
     independent small optimizations of page counters".

   - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from
     David Hildenbrand. Improves PTE/PMD splitlock detection, makes
     powerpc/8xx work correctly by design rather than by accident.

   - "mm: remove arch_make_page_accessible()" from David Hildenbrand.
     Some folio conversions which make arch_make_page_accessible()
     unneeded.

   - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David
     Finkel. Cleans up and fixes our handling of the resetting of the
     cgroup/process peak-memory-use detector.

   - "Make core VMA operations internal and testable" from Lorenzo
     Stoakes. Rationalizaion and encapsulation of the VMA manipulation
     APIs. With a view to better enable testing of the VMA functions,
     even from a userspace-only harness.

   - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix
     issues in the zswap global shrinker, resulting in improved
     performance.

   - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill
     in some missing info in /proc/zoneinfo.

   - "mm: replace follow_page() by folio_walk" from David Hildenbrand.
     Code cleanups and rationalizations (conversion to folio_walk())
     resulting in the removal of follow_page().

   - "improving dynamic zswap shrinker protection scheme" from Nhat
     Pham. Some tuning to improve zswap's dynamic shrinker. Significant
     reductions in swapin and improvements in performance are shown.

   - "mm: Fix several issues with unaccepted memory" from Kirill
     Shutemov. Improvements to the new unaccepted memory feature,

   - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on
     DAX PUDs. This was missing, although nobody seems to have notied
     yet.

   - "Introduce a store type enum for the Maple tree" from Sidhartha
     Kumar. Cleanups and modest performance improvements for the maple
     tree library code.

   - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move
     more cgroup v1 remnants away from the v2 memcg code.

   - "memcg: initiate deprecation of v1 features" from Shakeel Butt.
     Adds various warnings telling users that memcg v1 features are
     deprecated.

   - "mm: swap: mTHP swap allocator base on swap cluster order" from
     Chris Li. Greatly improves the success rate of the mTHP swap
     allocation.

   - "mm: introduce numa_memblks" from Mike Rapoport. Moves various
     disparate per-arch implementations of numa_memblk code into generic
     code.

   - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly
     improves the performance of munmap() of swap-filled ptes.

   - "support large folio swap-out and swap-in for shmem" from Baolin
     Wang. With this series we no longer split shmem large folios into
     simgle-page folios when swapping out shmem.

   - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice
     performance improvements and code reductions for gigantic folios.

   - "support shmem mTHP collapse" from Baolin Wang. Adds support for
     khugepaged's collapsing of shmem mTHP folios.

   - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect()
     performance regression due to the addition of mseal().

   - "Increase the number of bits available in page_type" from Matthew
     Wilcox. Increases the number of bits available in page_type!

   - "Simplify the page flags a little" from Matthew Wilcox. Many legacy
     page flags are now folio flags, so the page-based flags and their
     accessors/mutators can be removed.

   - "mm: store zero pages to be swapped out in a bitmap" from Usama
     Arif. An optimization which permits us to avoid writing/reading
     zero-filled zswap pages to backing store.

   - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race
     window which occurs when a MAP_FIXED operqtion is occurring during
     an unrelated vma tree walk.

   - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of
     the vma_merge() functionality, making ot cleaner, more testable and
     better tested.

   - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.
     Minor fixups of DAMON selftests and kunit tests.

   - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.
     Code cleanups and folio conversions.

   - "Shmem mTHP controls and stats improvements" from Ryan Roberts.
     Cleanups for shmem controls and stats.

   - "mm: count the number of anonymous THPs per size" from Barry Song.
     Expose additional anon THP stats to userspace for improved tuning.

   - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more
     folio conversions and removal of now-unused page-based APIs.

   - "replace per-quota region priorities histogram buffer with
     per-context one" from SeongJae Park. DAMON histogram
     rationalization.

   - "Docs/damon: update GitHub repo URLs and maintainer-profile" from
     SeongJae Park. DAMON documentation updates.

   - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and
     improve related doc and warn" from Jason Wang: fixes usage of page
     allocator __GFP_NOFAIL and GFP_ATOMIC flags.

   - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy.
     This was overprovisioning THPs in sparsely accessed memory areas.

   - "zram: introduce custom comp backends API" frm Sergey Senozhatsky.
     Add support for zram run-time compression algorithm tuning.

   - "mm: Care about shadow stack guard gap when getting an unmapped
     area" from Mark Brown. Fix up the various arch_get_unmapped_area()
     implementations to better respect guard areas.

   - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability
     of mem_cgroup_iter() and various code cleanups.

   - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge
     pfnmap support.

   - "resource: Fix region_intersects() vs add_memory_driver_managed()"
     from Huang Ying. Fix a bug in region_intersects() for systems with
     CXL memory.

   - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches
     a couple more code paths to correctly recover from the encountering
     of poisoned memry.

   - "mm: enable large folios swap-in support" from Barry Song. Support
     the swapin of mTHP memory into appropriately-sized folios, rather
     than into single-page folios"

* tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits)
  zram: free secondary algorithms names
  uprobes: turn xol_area->pages[2] into xol_area->page
  uprobes: introduce the global struct vm_special_mapping xol_mapping
  Revert "uprobes: use vm_special_mapping close() functionality"
  mm: support large folios swap-in for sync io devices
  mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios
  mm: fix swap_read_folio_zeromap() for large folios with partial zeromap
  mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries
  set_memory: add __must_check to generic stubs
  mm/vma: return the exact errno in vms_gather_munmap_vmas()
  memcg: cleanup with !CONFIG_MEMCG_V1
  mm/show_mem.c: report alloc tags in human readable units
  mm: support poison recovery from copy_present_page()
  mm: support poison recovery from do_cow_fault()
  resource, kunit: add test case for region_intersects()
  resource: make alloc_free_mem_region() works for iomem_resource
  mm: z3fold: deprecate CONFIG_Z3FOLD
  vfio/pci: implement huge_fault support
  mm/arm64: support large pfn mappings
  mm/x86: support large pfn mappings
  ...
2024-09-21 07:29:05 -07:00
Ming Lei
65f666c620 lib/sbitmap: define swap_lock as raw_spinlock_t
When called from sbitmap_queue_get(), sbitmap_deferred_clear() may be run
with preempt disabled. In RT kernel, spin_lock() can sleep, then warning
of "BUG: sleeping function called from invalid context" can be triggered.

Fix it by replacing it with raw_spin_lock.

Cc: Yang Yang <yang.yang@vivo.com>
Fixes: 72d04bdcf3 ("sbitmap: fix io hung due to race on sbitmap_word::cleared")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Yang Yang <yang.yang@vivo.com>
Link: https://lore.kernel.org/r/20240919021709.511329-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-09-20 00:20:06 -06:00
Kris Van Hees
5f5e734432 kbuild: generate offset range data for builtin modules
Create file module.builtin.ranges that can be used to find where
built-in modules are located by their addresses. This will be useful for
tracing tools to find what functions are for various built-in modules.

The offset range data for builtin modules is generated using:
 - modules.builtin: associates object files with module names
 - vmlinux.map: provides load order of sections and offset of first member
    per section
 - vmlinux.o.map: provides offset of object file content per section
 - .*.cmd: build cmd file with KBUILD_MODFILE

The generated data will look like:

.text 00000000-00000000 = _text
.text 0000baf0-0000cb10 amd_uncore
.text 0009bd10-0009c8e0 iosf_mbi
...
.text 00b9f080-00ba011a intel_skl_int3472_discrete
.text 00ba0120-00ba03c0 intel_skl_int3472_discrete intel_skl_int3472_tps68470
.text 00ba03c0-00ba08d6 intel_skl_int3472_tps68470
...
.data 00000000-00000000 = _sdata
.data 0000f020-0000f680 amd_uncore

For each ELF section, it lists the offset of the first symbol.  This can
be used to determine the base address of the section at runtime.

Next, it lists (in strict ascending order) offset ranges in that section
that cover the symbols of one or more builtin modules.  Multiple ranges
can apply to a single module, and ranges can be shared between modules.

The CONFIG_BUILTIN_MODULE_RANGES option controls whether offset range data
is generated for kernel modules that are built into the kernel image.

How it works:

 1. The modules.builtin file is parsed to obtain a list of built-in
    module names and their associated object names (the .ko file that
    the module would be in if it were a loadable module, hereafter
    referred to as <kmodfile>).  This object name can be used to
    identify objects in the kernel compile because any C or assembler
    code that ends up into a built-in module will have the option
    -DKBUILD_MODFILE=<kmodfile> present in its build command, and those
    can be found in the .<obj>.cmd file in the kernel build tree.

    If an object is part of multiple modules, they will all be listed
    in the KBUILD_MODFILE option argument.

    This allows us to conclusively determine whether an object in the
    kernel build belong to any modules, and which.

 2. The vmlinux.map is parsed next to determine the base address of each
    top level section so that all addresses into the section can be
    turned into offsets.  This makes it possible to handle sections
    getting loaded at different addresses at system boot.

    We also determine an 'anchor' symbol at the beginning of each
    section to make it possible to calculate the true base address of
    a section at runtime (i.e. symbol address - symbol offset).

    We collect start addresses of sections that are included in the top
    level section.  This is used when vmlinux is linked using vmlinux.o,
    because in that case, we need to look at the vmlinux.o linker map to
    know what object a symbol is found in.

    And finally, we process each symbol that is listed in vmlinux.map
    (or vmlinux.o.map) based on the following structure:

    vmlinux linked from vmlinux.a:

      vmlinux.map:
        <top level section>
          <included section>  -- might be same as top level section)
            <object>          -- built-in association known
              <symbol>        -- belongs to module(s) object belongs to
              ...

    vmlinux linked from vmlinux.o:

      vmlinux.map:
        <top level section>
          <included section>  -- might be same as top level section)
            vmlinux.o         -- need to use vmlinux.o.map
              <symbol>        -- ignored
              ...

      vmlinux.o.map:
        <section>
            <object>          -- built-in association known
              <symbol>        -- belongs to module(s) object belongs to
              ...

 3. As sections, objects, and symbols are processed, offset ranges are
    constructed in a straight-forward way:

      - If the symbol belongs to one or more built-in modules:
          - If we were working on the same module(s), extend the range
            to include this object
          - If we were working on another module(s), close that range,
            and start the new one
      - If the symbol does not belong to any built-in modules:
          - If we were working on a module(s) range, close that range

Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
Reviewed-by: Nick Alcock <nick.alcock@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Sam James <sam@gentoo.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-09-20 09:21:43 +09:00
Linus Torvalds
4a39ac5b7d Random number generator updates for Linux 6.12-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmboHyUACgkQSfxwEqXe
 A66wGQ/8DRIjBllwf1YuTWi4T6OcfoYxK6C9bXO6QPP5gzdTyFE9pvDuuPyad6+F
 FR086ydTHeodemz1dFiQCL9etcUaxo4+6FRKyXKF9/1ezGbTA5nJd0/fKJGlqbI2
 EoA4LNYHOsvCZk1BTpxRNWKeKphU9zQgQdSigy6Rx8p269UkGmIZjD1PtUc+vqfR
 Ox0dK/Cswyo236fRi5HzaoMntWI4vXgLfxty0e1R7tfbstkCxSKWAON1lo3uHgkA
 0HpJXWgWXAPt9gp++Fs/jGNpOqbt6IaKeV5f7CjYfvWhlFjNMhQxF+PbxknaZn/k
 K0gQsItOIoFTfbQdLDIdfnj9awMdLW8FB2A1WXHpNr9pVC4ickPb1bMTF/XRd0tm
 wBNu4BL0gklx6017KZg5uINMIduzMLGkBLRFiBW0en/sZMLTJTMg58BJn0CL1Pmh
 1ll/Q3ToSMHalvxU2OnJagTwh4fzzCEpK/hW9WiDO4jSCsMXyX0clinrCjNo1JfA
 tqgTWEy3uGtg+dg0Du9VD5JASbNQSJ0ZRnas5+qz10IRWWfTolrsk61dliXLQ4Sv
 tSryDtsE2znwJF1Krh4aHNSSVhD5/l/8QaXkf9aZc/kkaHxwsx83FuWnqw6nMz8c
 l4B2MbH0jUgsEqEyx+0iwk+FXE9kZKWumTVLjFZ6bRnq3q+uq0U=
 =mWCw
 -----END PGP SIGNATURE-----

Merge tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "Originally I'd planned on sending each of the vDSO getrandom()
  architecture ports to their respective arch trees. But as we started
  to work on this, we found lots of interesting issues in the shared
  code and infrastructure, the fixes for which the various archs needed
  to base their work.

  So in the end, this turned into a nice collaborative effort fixing up
  issues and porting to 5 new architectures -- arm64, powerpc64,
  powerpc32, s390x, and loongarch64 -- with everybody pitching in and
  commenting on each other's code. It was a fun development cycle.

  This contains:

   - Numerous fixups to the vDSO selftest infrastructure, getting it
     running successfully on more platforms, and fixing bugs in it.

   - Additions to the vDSO getrandom & chacha selftests. Basically every
     time manual review unearthed a bug in a revision of an arch patch,
     or an ambiguity, the tests were augmented.

     By the time the last arch was submitted for review, s390x, v1 of
     the series was essentially fine right out of the gate.

   - Fixes to the the generic C implementation of vDSO getrandom, to
     build and run successfully on all archs, decoupling it from
     assumptions we had (unintentionally) made on x86_64 that didn't
     carry through to the other architectures.

   - Port of vDSO getrandom to LoongArch64, from Xi Ruoyao and acked by
     Huacai Chen.

   - Port of vDSO getrandom to ARM64, from Adhemerval Zanella and acked
     by Will Deacon.

   - Port of vDSO getrandom to PowerPC, in both 32-bit and 64-bit
     varieties, from Christophe Leroy and acked by Michael Ellerman.

   - Port of vDSO getrandom to S390X from Heiko Carstens, the arch
     maintainer.

  While it'd be natural for there to be things to fix up over the course
  of the development cycle, these patches got a decent amount of review
  from a fairly diverse crew of folks on the mailing lists, and, for the
  most part, they've been cooking in linux-next, which has been helpful
  for ironing out build issues.

  In terms of architectures, I think that mostly takes care of the
  important 64-bit archs with hardware still being produced and running
  production loads in settings where vDSO getrandom is likely to help.

  Arguably there's still RISC-V left, and we'll see for 6.13 whether
  they find it useful and submit a port"

* tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (47 commits)
  selftests: vDSO: check cpu caps before running chacha test
  s390/vdso: Wire up getrandom() vdso implementation
  s390/vdso: Move vdso symbol handling to separate header file
  s390/vdso: Allow alternatives in vdso code
  s390/module: Provide find_section() helper
  s390/facility: Let test_facility() generate static branch if possible
  s390/alternatives: Remove ALT_FACILITY_EARLY
  s390/facility: Disable compile time optimization for decompressor code
  selftests: vDSO: fix vdso_config for s390
  selftests: vDSO: fix ELF hash table entry size for s390x
  powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64
  powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
  powerpc/vdso: Refactor CFLAGS for CVDSO build
  powerpc/vdso32: Add crtsavres
  mm: Define VM_DROPPABLE for powerpc/32
  powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
  selftests: vDSO: don't include generated headers for chacha test
  arm64: vDSO: Wire up getrandom() vDSO implementation
  arm64: alternative: make alternative_has_cap_likely() VDSO compatible
  selftests: vDSO: also test counter in vdso_test_chacha
  ...
2024-09-18 15:26:31 +02:00
Linus Torvalds
39b3f4e0db hardening updates for v6.12-rc1
- lib/string_choices: Add str_up_down() helper (Michal Wajdeczko)
 
 - lib/string_choices: Add str_true_false()/str_false_true() helper
   (Hongbo Li)
 
 - lib/string_choices: Introduce several opposite string choice helpers
   (Hongbo Li)
 
 - lib/string_helpers: rework overflow-dependent code (Justin Stitt)
 
 - fortify: refactor test_fortify Makefile to fix some build problems
   (Masahiro Yamada)
 
 - string: Check for "nonstring" attribute on strscpy() arguments
 
 - virt: vbox: Replace 1-element arrays with flexible arrays
 
 - media: venus: hfi_cmds: Replace 1-element arrays with flexible arrays
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZufwawAKCRA2KwveOeQk
 u3n9AQCI8G1FSMFSa8MKSSwTo600dHbZGavJd33fl2VrV7KCvQD8CMPRC/itOIVI
 PXcGo9tekW+zAOOw+v47QorpxHGd1w4=
 =jSSr
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - lib/string_choices:
    - Add str_up_down() helper (Michal Wajdeczko)
    - Add str_true_false()/str_false_true() helper  (Hongbo Li)
    - Introduce several opposite string choice helpers  (Hongbo Li)

 - lib/string_helpers:
    - rework overflow-dependent code (Justin Stitt)

 - fortify: refactor test_fortify Makefile to fix some build problems
   (Masahiro Yamada)

 - string: Check for "nonstring" attribute on strscpy() arguments

 - virt: vbox: Replace 1-element arrays with flexible arrays

 - media: venus: hfi_cmds: Replace 1-element arrays with flexible arrays

* tag 'hardening-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lib/string_choices: Add some comments to make more clear for string choices helpers.
  lib/string_choices: Introduce several opposite string choice helpers
  lib/string_choices: Add str_true_false()/str_false_true() helper
  string: Check for "nonstring" attribute on strscpy() arguments
  media: venus: hfi_cmds: struct hfi_session_release_buffer_pkt: Add __counted_by annotation
  media: venus: hfi_cmds: struct hfi_session_release_buffer_pkt: Replace 1-element array with flexible array
  virt: vbox: struct vmmdev_hgcm_pagelist: Replace 1-element array with flexible array
  lib/string_helpers: rework overflow-dependent code
  coccinelle: Add rules to find str_down_up() replacements
  string_choices: Add wrapper for str_down_up()
  coccinelle: Add rules to find str_up_down() replacements
  lib/string_choices: Add str_up_down() helper
  fortify: use if_changed_dep to record header dependency in *.cmd files
  fortify: move test_fortify.sh to lib/test_fortify/
  fortify: refactor test_fortify Makefile to fix some build problems
2024-09-18 12:12:41 +02:00
Linus Torvalds
bdf56c7580 slab updates for 6.12
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmbn5g0ACgkQu+CwddJF
 iJq+Uwf/aqnLNEpjUBzwUUhSojCpPnTtiyjv+AILTxoSTHmbu8OvN0W79+Rpbdmk
 O4QapAK+BCs+VL2VATwCCufcJ75Z78txO+buQE0DgwluFTIYZ+IwpUMPsK04ln6A
 FD1/uvP1QFx60heqcp2c4zWFBUpg4DE6ufx2A5kieO268lFcWLxyVlcdgRU79ZCt
 uAcV2yDLk3GvPGfxZwPKEmZUo/FmuSoBv0XgT+eWxmTu/R7hcpFse49OyjBH8Tvb
 8d/RCIFgXOr8dTIjtds7eenwB/is4TkRlctezEQ0jO9/JwL/BVOgXZjD1qCtNWqz
 is4TWK7VV+vdq1RD+0xC2hV/+uGEwQ==
 =+WAm
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
 "This time it's mostly refactoring and improving APIs for slab users in
  the kernel, along with some debugging improvements.

   - kmem_cache_create() refactoring (Christian Brauner)

     Over the years have been growing new parameters to
     kmem_cache_create() where most of them are needed only for a small
     number of caches - most recently the rcu_freeptr_offset parameter.

     To avoid adding new parameters to kmem_cache_create() and adjusting
     all its callers, or creating new wrappers such as
     kmem_cache_create_rcu(), we can now pass extra parameters using the
     new struct kmem_cache_args. Not explicitly initialized fields
     default to values interpreted as unused.

     kmem_cache_create() is for now a wrapper that works both with the
     new form: kmem_cache_create(name, object_size, args, flags) and the
     legacy form: kmem_cache_create(name, object_size, align, flags,
     ctor)

   - kmem_cache_destroy() waits for kfree_rcu()'s in flight (Vlastimil
     Babka, Uladislau Rezki)

     Since SLOB removal, kfree() is allowed for freeing objects
     allocated by kmem_cache_create(). By extension kfree_rcu() as
     allowed as well, which can allow converting simple call_rcu()
     callbacks that only do kmem_cache_free(), as there was never a
     kmem_cache_free_rcu() variant. However, for caches that can be
     destroyed e.g. on module removal, the cache owners knew to issue
     rcu_barrier() first to wait for the pending call_rcu()'s, and this
     is not sufficient for pending kfree_rcu()'s due to its internal
     batching optimizations. Ulad has provided a new
     kvfree_rcu_barrier() and to make the usage less error-prone,
     kmem_cache_destroy() calls it. Additionally, destroying
     SLAB_TYPESAFE_BY_RCU caches now again issues rcu_barrier()
     synchronously instead of using an async work, because the past
     motivation for async work no longer applies. Users of custom
     call_rcu() callbacks should however keep calling rcu_barrier()
     before cache destruction.

   - Debugging use-after-free in SLAB_TYPESAFE_BY_RCU caches (Jann Horn)

     Currently, KASAN cannot catch UAFs in such caches as it is legal to
     access them within a grace period, and we only track the grace
     period when trying to free the underlying slab page. The new
     CONFIG_SLUB_RCU_DEBUG option changes the freeing of individual
     object to be RCU-delayed, after which KASAN can poison them.

   - Delayed memcg charging (Shakeel Butt)

     In some cases, the memcg is uknown at allocation time, such as
     receiving network packets in softirq context. With
     kmem_cache_charge() these may be now charged later when the user
     and its memcg is known.

   - Misc fixes and improvements (Pedro Falcato, Axel Rasmussen,
     Christoph Lameter, Yan Zhen, Peng Fan, Xavier)"

* tag 'slab-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
  mm, slab: restore kerneldoc for kmem_cache_create()
  io_uring: port to struct kmem_cache_args
  slab: make __kmem_cache_create() static inline
  slab: make kmem_cache_create_usercopy() static inline
  slab: remove kmem_cache_create_rcu()
  file: port to struct kmem_cache_args
  slab: create kmem_cache_create() compatibility layer
  slab: port KMEM_CACHE_USERCOPY() to struct kmem_cache_args
  slab: port KMEM_CACHE() to struct kmem_cache_args
  slab: remove rcu_freeptr_offset from struct kmem_cache
  slab: pass struct kmem_cache_args to do_kmem_cache_create()
  slab: pull kmem_cache_open() into do_kmem_cache_create()
  slab: pass struct kmem_cache_args to create_cache()
  slab: port kmem_cache_create_usercopy() to struct kmem_cache_args
  slab: port kmem_cache_create_rcu() to struct kmem_cache_args
  slab: port kmem_cache_create() to struct kmem_cache_args
  slab: add struct kmem_cache_args
  slab: s/__kmem_cache_create/do_kmem_cache_create/g
  memcg: add charging of already allocated slab objects
  mm/slab: Optimize the code logic in find_mergeable()
  ...
2024-09-18 08:53:53 +02:00
Linus Torvalds
067610ebaa RCU pull request for v6.12
This pull request contains the following branches:
 
 context_tracking.15.08.24a: Rename context tracking state related
         symbols and remove references to "dynticks" in various context
         tracking state variables and related helpers; force
         context_tracking_enabled_this_cpu() to be inlined to avoid
         leaving a noinstr section.
 
 csd.lock.15.08.24a: Enhance CSD-lock diagnostic reports; add an API
         to provide an indication of ongoing CSD-lock stall.
 
 nocb.09.09.24a: Update and simplify RCU nocb code to handle
         (de-)offloading of callbacks only for offline CPUs; fix RT
         throttling hrtimer being armed from offline CPU.
 
 rcutorture.14.08.24a: Remove redundant rcu_torture_ops get_gp_completed
         fields; add SRCU ->same_gp_state and ->get_comp_state
         functions; add generic test for NUM_ACTIVE_*RCU_POLL* for
         testing RCU and SRCU polled grace periods; add CFcommon.arch
         for arch-specific Kconfig options; print number of update types
         in rcu_torture_write_types();
         add rcutree.nohz_full_patience_delay testing to the TREE07
         scenario; add a stall_cpu_repeat module parameter to test
         repeated CPU stalls; add argument to limit number of CPUs a
         guest OS can use in torture.sh;
 
 rcustall.09.09.24a: Abbreviate RCU CPU stall warnings during CSD-lock
         stalls; Allow dump_cpu_task() to be called without disabling
         preemption; defer printing stall-warning backtrace when holding
         rcu_node lock.
 
 srcu.12.08.24a: Make SRCU gp seq wrap-around faster; add KCSAN checks
         for concurrent updates to ->srcu_n_exp_nodelay and
         ->reschedule_count which are used in heuristics governing
         auto-expediting of normal SRCU grace periods and
         grace-period-state-machine delays; mark idle SRCU-barrier
         callbacks to help identify stuck SRCU-barrier callback.
 
 rcu.tasks.14.08.24a: Remove RCU Tasks Rude asynchronous APIs as they
         are no longer used; stop testing RCU Tasks Rude asynchronous
         APIs; fix access to non-existent percpu regions; check
         processor-ID assumptions during chosen CPU calculation for
         callback enqueuing; update description of rtp->tasks_gp_seq
         grace-period sequence number; add rcu_barrier_cb_is_done()
         to identify whether a given rcu_barrier callback is stuck;
         mark idle Tasks-RCU-barrier callbacks; add
         *torture_stats_print() functions to print detailed
         diagnostics for Tasks-RCU variants; capture start time of
         rcu_barrier_tasks*() operation to help distinguish a hung
         barrier operation from a long series of barrier operations.
 
 rcu_scaling_tests.15.08.24a:
         refscale: Add a TINY scenario to support tests of Tiny RCU
         and Tiny SRCU; Optimize process_durations() operation;
 
         rcuscale: Dump stacks of stalled rcu_scale_writer() instances;
         dump grace-period statistics when rcu_scale_writer() stalls;
         mark idle RCU-barrier callbacks to identify stuck RCU-barrier
         callbacks; print detailed grace-period and barrier diagnostics
         on rcu_scale_writer() hangs for Tasks-RCU variants; warn if
         async module parameter is specified for RCU implementations
         that do not have async primitives such as RCU Tasks Rude;
         make all writer tasks report upon hang; tolerate repeated
         GFP_KERNEL failure in rcu_scale_writer(); use special allocator
         for rcu_scale_writer(); NULL out top-level pointers to heap
         memory to avoid double-free bugs on modprobe failures; maintain
         per-task instead of per-CPU callbacks count to avoid any issues
         with migration of either tasks or callbacks; constify struct
         ref_scale_ops.
 
 fixes.12.08.24a: Use system_unbound_wq for kfree_rcu work to avoid
         disturbing isolated CPUs.
 
 misc.11.08.24a: Warn on unexpected rcu_state.srs_done_tail state;
         Better define "atomic" for list_replace_rcu() and
         hlist_replace_rcu() routines; annotate struct
         kvfree_rcu_bulk_data with __counted_by().
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSi2tPIQIc2VEtjarIAHS7/6Z0wpQUCZt8+8wAKCRAAHS7/6Z0w
 pTqoAPwPN//tlEoJx2PRs6t0q+nD1YNvnZawPaRmdzgdM8zJogD+PiSN+XhqRr80
 jzyvMDU4Aa0wjUNP3XsCoaCxo7L/lQk=
 =bZ9z
 -----END PGP SIGNATURE-----

Merge tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Neeraj Upadhyay:
 "Context tracking:
   - rename context tracking state related symbols and remove references
     to "dynticks" in various context tracking state variables and
     related helpers
   - force context_tracking_enabled_this_cpu() to be inlined to avoid
     leaving a noinstr section

  CSD lock:
   - enhance CSD-lock diagnostic reports
   - add an API to provide an indication of ongoing CSD-lock stall

  nocb:
   - update and simplify RCU nocb code to handle (de-)offloading of
     callbacks only for offline CPUs
   - fix RT throttling hrtimer being armed from offline CPU

  rcutorture:
   - remove redundant rcu_torture_ops get_gp_completed fields
   - add SRCU ->same_gp_state and ->get_comp_state functions
   - add generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU
     polled grace periods
   - add CFcommon.arch for arch-specific Kconfig options
   - print number of update types in rcu_torture_write_types()
   - add rcutree.nohz_full_patience_delay testing to the TREE07 scenario
   - add a stall_cpu_repeat module parameter to test repeated CPU stalls
   - add argument to limit number of CPUs a guest OS can use in
     torture.sh

  rcustall:
   - abbreviate RCU CPU stall warnings during CSD-lock stalls
   - Allow dump_cpu_task() to be called without disabling preemption
   - defer printing stall-warning backtrace when holding rcu_node lock

  srcu:
   - make SRCU gp seq wrap-around faster
   - add KCSAN checks for concurrent updates to ->srcu_n_exp_nodelay and
     ->reschedule_count which are used in heuristics governing
     auto-expediting of normal SRCU grace periods and
     grace-period-state-machine delays
   - mark idle SRCU-barrier callbacks to help identify stuck
     SRCU-barrier callback

  rcu tasks:
   - remove RCU Tasks Rude asynchronous APIs as they are no longer used
   - stop testing RCU Tasks Rude asynchronous APIs
   - fix access to non-existent percpu regions
   - check processor-ID assumptions during chosen CPU calculation for
     callback enqueuing
   - update description of rtp->tasks_gp_seq grace-period sequence
     number
   - add rcu_barrier_cb_is_done() to identify whether a given
     rcu_barrier callback is stuck
   - mark idle Tasks-RCU-barrier callbacks
   - add *torture_stats_print() functions to print detailed diagnostics
     for Tasks-RCU variants
   - capture start time of rcu_barrier_tasks*() operation to help
     distinguish a hung barrier operation from a long series of barrier
     operations

  refscale:
   - add a TINY scenario to support tests of Tiny RCU and Tiny
     SRCU
   - optimize process_durations() operation

  rcuscale:
   - dump stacks of stalled rcu_scale_writer() instances and
     grace-period statistics when rcu_scale_writer() stalls
   - mark idle RCU-barrier callbacks to identify stuck RCU-barrier
     callbacks
   - print detailed grace-period and barrier diagnostics on
     rcu_scale_writer() hangs for Tasks-RCU variants
   - warn if async module parameter is specified for RCU implementations
     that do not have async primitives such as RCU Tasks Rude
   - make all writer tasks report upon hang
   - tolerate repeated GFP_KERNEL failure in rcu_scale_writer()
   - use special allocator for rcu_scale_writer()
   - NULL out top-level pointers to heap memory to avoid double-free
     bugs on modprobe failures
   - maintain per-task instead of per-CPU callbacks count to avoid any
     issues with migration of either tasks or callbacks
   - constify struct ref_scale_ops

  Fixes:
   - use system_unbound_wq for kfree_rcu work to avoid disturbing
     isolated CPUs

  Misc:
   - warn on unexpected rcu_state.srs_done_tail state
   - better define "atomic" for list_replace_rcu() and
     hlist_replace_rcu() routines
   - annotate struct kvfree_rcu_bulk_data with __counted_by()"

* tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (90 commits)
  rcu: Defer printing stall-warning backtrace when holding rcu_node lock
  rcu/nocb: Remove superfluous memory barrier after bypass enqueue
  rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
  rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
  rcu/nocb: Simplify (de-)offloading state machine
  context_tracking: Tag context_tracking_enabled_this_cpu() __always_inline
  context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching
  rcu: Update stray documentation references to rcu_dynticks_eqs_{enter, exit}()
  rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()
  rcu: Rename rcu_implicit_dynticks_qs() into rcu_watching_snap_recheck()
  rcu: Rename dyntick_save_progress_counter() into rcu_watching_snap_save()
  rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap
  rcu: Rename struct rcu_data .dynticks_snap into .watching_snap
  rcu: Rename rcu_dynticks_zero_in_eqs() into rcu_watching_zero_in_eqs()
  rcu: Rename rcu_dynticks_in_eqs_since() into rcu_watching_snap_stopped_since()
  rcu: Rename rcu_dynticks_in_eqs() into rcu_watching_snap_in_eqs()
  rcu: Rename rcu_dynticks_eqs_online() into rcu_watching_online()
  context_tracking, rcu: Rename rcu_dynticks_curr_cpu_in_eqs() into rcu_is_watching_curr_cpu()
  context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*()
  refscale: Constify struct ref_scale_ops
  ...
2024-09-18 07:52:24 +02:00
Linus Torvalds
78567e2bc7 cgroup: Changes for v6.12
- cpuset isolation improvements.
 
 - cpuset cgroup1 support is split into its own file behind the new config
   option CONFIG_CPUSET_V1. This makes it the second controller which makes
   cgroup1 support optional after memcg.
 
 - Handling of unavailable v1 controller handling improved during cgroup1
   mount operations.
 
 - union_find applied to cpuset. It makes code simpler and more efficient.
 
 - Reduce spurious events in pids.events.
 
 - Cleanups and other misc changes.
 
 - Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes that
   further changes build upon.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZuNU3Q4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGdMsAP9yqPxu//LiJ3lPWhKcVVKtdwrA3AYDLE81VSJO
 5VZJhAD+Ic+Ly/jZjDtjjQpZ1U3JsBpBRcVBqzeH0gD7eXaJgwk=
 =h/+c
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - cpuset isolation improvements

 - cpuset cgroup1 support is split into its own file behind the new
   config option CONFIG_CPUSET_V1. This makes it the second controller
   which makes cgroup1 support optional after memcg

 - Handling of unavailable v1 controller handling improved during
   cgroup1 mount operations

 - union_find applied to cpuset. It makes code simpler and more
   efficient

 - Reduce spurious events in pids.events

 - Cleanups and other misc changes

 - Contains a merge of cgroup/for-6.11-fixes to receive cpuset fixes
   that further changes build upon

* tag 'cgroup-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (34 commits)
  cgroup: Do not report unavailable v1 controllers in /proc/cgroups
  cgroup: Disallow mounting v1 hierarchies without controller implementation
  cgroup/cpuset: Expose cpuset filesystem with cpuset v1 only
  cgroup/cpuset: Move cpu.h include to cpuset-internal.h
  cgroup/cpuset: add sefltest for cpuset v1
  cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1
  cgroup/cpuset: rename functions shared between v1 and v2
  cgroup/cpuset: move v1 interfaces to cpuset-v1.c
  cgroup/cpuset: move validate_change_legacy to cpuset-v1.c
  cgroup/cpuset: move legacy hotplug update to cpuset-v1.c
  cgroup/cpuset: add callback_lock helper
  cgroup/cpuset: move memory_spread to cpuset-v1.c
  cgroup/cpuset: move relax_domain_level to cpuset-v1.c
  cgroup/cpuset: move memory_pressure to cpuset-v1.c
  cgroup/cpuset: move common code to cpuset-internal.h
  cgroup/cpuset: introduce cpuset-v1.c
  selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs
  cgroup/cpuset: Account for boot time isolated CPUs
  cgroup/cpuset: remove use_parent_ecpus of cpuset
  cgroup/cpuset: remove fetch_xcpus
  ...
2024-09-18 06:39:03 +02:00
Linus Torvalds
194fcd20eb linux_kselftest-kunit-6.12-rc1
This kunit update for Linux 6.12-rc1 consists of:
 
 -- a new int_pow test suite
 -- documentation update to clarify filename best practices
 -- kernel-doc fix for EXPORT_SYMBOL_IF_KUNIT
 -- change to build compile_commands.json automatically instead
    of requiring a manual build.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmbo3WEACgkQCwJExA0N
 Qxz1WxAAj+772NHxsJ4JnPqr/74doKnzKc1jM2V4g/F9Y+BT0tSKs1Cu5CyN9VsT
 wvxVPWqYltyhumVm/H6SaUGb0yZ7CzJi/5FuT3p3QFUDidMSu1h9KnlLi79q3cDI
 VuFKE8K4DDP0GfyFMpbSPZOGfYQp24FybhxRxreY+7q6uRVAnPh33Q1/Bonv6K6q
 5329a0z9wWySgisa93ABmQNpF4UJSYunR2bsdUzZqHgyrTXSyK66fcmVKwbBUaIT
 o16P1LBjDcIbfwswFb+xUmWD1IPGk7ulirEq8n69tErI6zKbkv1rojXHsoXuvOEN
 a4i+sNyR+a7NVI1h/T8F25pSbegkL0XQs7cmehATqpInmEZNDeGR8PkaGZNXXrFy
 kG/z7LlWh8zQUBrTsqOLU/iz4sRVrsPCuLIUzo8MiKpAskmj/7fqw5Cab9jmL5V3
 6OLAfCQDrfcH7fM9V5U6Ury2dkcovFuw+ZhFcBuLnspB5z0Cj7Yqz6aDZdJ97qyR
 PfZuyBU2ouykhpJ4P/sRJC3Gq1t0b+PoDq3qNdCqz4ETld1jaiDz0e75ypquJWyB
 QdVMNJF6W7Nwnmpzp4GY9QZ6dtwOKGZyuvW5J0eleWKiD4gjHZaoupIzqT24fgYi
 vdscbcOxMMU3/b9F4qDlgsLSPCLVF4HIXTAK2UdiznLdaxYVHQ0=
 =rmqh
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - a new int_pow test suite

 - documentation update to clarify filename best practices

 - kernel-doc fix for EXPORT_SYMBOL_IF_KUNIT

 - change to build compile_commands.json automatically instead of
   requiring a manual build

* tag 'linux_kselftest-kunit-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  lib/math: Add int_pow test suite
  kunit: tool: Build compile_commands.json
  kunit: Fix kernel-doc for EXPORT_SYMBOL_IF_KUNIT
  Documentation: KUnit: Update filename best practices
2024-09-17 16:52:24 +02:00
Linus Torvalds
dea435d397 Enable UBSAN traps for x86, which provides better reporting through
metadata encodeded into UD1.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbpM6ITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoU/kEACWS7Z9mQrWB3r22ufTTPoN+hNudth+
 CP8wluXZGvLPh1Pq9dpB9ZniBUN8levYoGyj3NTdr6VtoMJ6NYcZVuH98lCCEMXO
 1UmDpydSGZ3BqVgmf4h0eYAJgEiA5qTflXMsh6SfsaPQR7jniJTE451hgJdRIogG
 DvgWeVTYn5vt0+oRHJp6ogRLR9oOUgdp94fIwaW34OpesbVJeWUW9zAvBcqdNrDT
 KJIM7ta6eivEakFRxriQZTKRc+3ElvZ2fdWNdo9qrRd64MTIOTXAj3G0lXt3YtpZ
 06pfJ1CfQ+nwHKfxmmy4gz4eJG7KcpMM+KFZTR3NoSAz4oMTzAvVTxAuEt+pahx6
 bmLzaY/I/gRB/Rt+e5oEZSEIq+Sh/Lm3IZoQUhK0+HeJBjwPghBZw3BjkFJvEsMw
 S0arvklH2x37gP9rnzOODf2QG7aIAqLTrvRJS610fctwadR4k+2UIE8ZGHOTt55J
 UdiK/QhU4gMVaRTebTcPquu3IMmnJjla/bEWdIrBtOSiGtVd1BnAp/kvmkdQH3eI
 ZUqJbnfofN4rzSufFqSVY88ORVIcQMnNDLM0qyJofIC79u7OiU40icoDxWS6mDHQ
 wQSEszInhwNzyAxoHnNkXDunjDVKhATQPOde0F4TxLcrYD9KRpvJag/1j5fCQi+0
 ftODZflfGS2UjQ==
 =Z5Hg
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core update from Thomas Gleixner:
 "Enable UBSAN traps for x86, which provides better reporting through
  metadata encodeded into UD1"

* tag 'x86-core-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/traps: Enable UBSAN traps on x86
2024-09-17 13:17:27 +02:00
Linus Torvalds
5ba202a7c9 Updates for KCOV instrumentation on x86:
- Prevent spurious KCOV coverage in common_interrupt()
 
   - Fixup the KCOV Makefile directive which got stale due to a source file
     rename
 
   - Exclude stack unwinding from KCOV as it creates large amounts of
     uninteresting coverage
 
   - Provide a self test to validate that KCOV coverage of the interrupt
     handling code starts not before preempt count got updated.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbpMeITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaOeD/4oO3g0soK0LIcDIwzaG0ap0hx0nucw
 aVSAESuY+ZaSbRbV0fNoYdHORvLdErs67SeyeJRSxTzSNqGH2dGoFrfbkRSXq951
 RdCSPP60T7xgqAme1YLDiChfXt/gkbWk/8V5Q7sG3oq3GaVcPUyZgPo4M4HQMdfg
 Mla3VPikW5Np3fvs0IZYWQ5VdY0fFOHY5JGMhKJznJxf+Ud+VAtxsbJUcO4MEYWW
 A9CVJNHGEXssGA6vm5kgtLu6n2QFuoSj6En/WqLEaJb8f/V332e04Xj2ZHUaOOjV
 2abVeDovv+dwUYb4SgrGVg9gfEwwcLPDnmOuuQJmQBB5kU4mJsCqI5TTS6c1fgU4
 x8tQsGSOKHFQAI14ZWtitrL4rS2uFcBkAFXo0dF8J5o4989RA8cpfeWVSVUb/UXd
 u38BWpc9iHiihHKMmMQgsa1bUMwdSUTvN5XFHkeP4oqUdMiEiWn8iM5+zXd/lfTs
 9mrTv+kcLA7mjFOmn4JyE2b+NuiPdgS2FCBGLycHvGwvJoJlO2UmSpF89AJ5vdKs
 F8vWLkV+gno/HtwS5o949cAwjYiCodfc7u1W0xj2VDAbx0RbaBw1SDhXMQcLxLgn
 BTt4yHKKIeLX++WH3fpeyL91+UJWubUzNzY4rAmLkz5DedWAkpES+45fatp1buIz
 Lp/hGiIsG9p5xw==
 =tiXT
 -----END PGP SIGNATURE-----

Merge tag 'x86-build-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 build updates from Thomas Gleixner:
 "Updates for KCOV instrumentation on x86:

   - Prevent spurious KCOV coverage in common_interrupt()

   - Fixup the KCOV Makefile directive which got stale due to a source
     file rename

   - Exclude stack unwinding from KCOV as it creates large amounts of
     uninteresting coverage

   - Provide a self test to validate that KCOV coverage of the interrupt
     handling code starts not before preempt count got updated"

* tag 'x86-build-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Ignore stack unwinding in KCOV
  module: Fix KCOV-ignored file name
  kcov: Add interrupt handling self test
  x86/entry: Remove unwanted instrumentation in common_interrupt()
2024-09-17 12:40:34 +02:00
I Hsin Cheng
5e06e08939 list: test: increase coverage of list_test_list_replace*()
Increase the test coverage of list_test_list_replace*() by adding the
checks to compare the pointer of "a_new.next" and "a_new.prev" to make
sure a perfect circular doubly linked list is formed after the
replacement.

Link: https://lkml.kernel.org/r/20240910040818.65723-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17 01:11:20 -07:00
I Hsin Cheng
e620799c41 list: test: fix tests for list_cut_position()
Fix test for list_cut_position*() for the missing check of integer "i"
after the second loop.  The variable should be checked for second time to
make sure both lists after the cut operation are formed as expected.

Link: https://lkml.kernel.org/r/20240910043531.71343-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Cc: David Gow <davidgow@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17 01:11:20 -07:00
Huang Ying
99185c10d5 resource, kunit: add test case for region_intersects()
Patch series "resource: Fix region_intersects() vs
add_memory_driver_managed()", v3.

The patchset fixes a bug of region_intersects() for systems with CXL
memory.  The details of the bug can be found in [1/3].  To avoid similar
bugs in the future.  A kunit test case for region_intersects() is added in
[3/3].  [2/3] is a preparation patch for [3/3].


This patch (of 3):

region_intersects() is important because it's used for /dev/mem permission
checking.  To avoid possible bug of region_intersects() in the future, a
kunit test case for region_intersects() is added.

Link: https://lkml.kernel.org/r/20240906030713.204292-1-ying.huang@intel.com
Link: https://lkml.kernel.org/r/20240906030713.204292-4-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17 01:07:00 -07:00
Linus Torvalds
daa394f0f9 A set of updates for debugobjects:
- Use the threshold to check for the pool refill condition and not the
     run time recorded all time low fill value, which is lower than the
     threshold and therefore causes refills to be delayed.
 
   - KCSAN annotation updates and simplification of the fill_pool() code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn480THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVB1D/0UE1n86SLFrR7plXudttXJnbyJ/OjK
 uOjLSHx66TyMkN1z6xF6K4bZTyQRpIUifPLz4evyd9CdDvITvnrvkboby/15rsGW
 8sEBqAFVMkENkPzDA1Qmn3fxJs9XvHoER7WcMjaEl9yQbSi4gjO5Y+B0BNp4XKHZ
 P1YSmRJqUBX5F0BvmeeDlHCCpyUxeRGiyzxZ/WSl70e6RSGis10R+B/aqsMxf3Zz
 6WboQJqMxnDT3ICtDxTicH9VJ6Lh9iJxppeLVxAtZ+acfhcRmpwKFmsfJJOVy1eg
 zkJuDh3ieb8hH7vr6bqzMEoP8qclUY7JgcJCK0dIwcASIvr7ZFVLCDLDx6Ta9UrG
 D+L7sjGs+h/wz7NOoKTaGJS0XHwijVtLhc5/O64p1POUiQVTfjCVW6E3RAs3IGBI
 uXTxuVzpK7XXvbg7+iEwYVcE5fp5vctnlLyepkbXvei9r/ccgIndj3rVGZz1qyOc
 41LVhTx1Uu9MSqnsWTGbr+kzIze/g1rj8OlSH+692nbLL0mxWsOuojljvDgILC1Q
 rcvZLJrf8e4FDFyGZiX8kG3eHbyYQPdf3fqUCI7B05n0o7utXLf4Mgw+/LdIvpKY
 JTx4/lhwZ4TXFMvf+LiW/rhRlP72QsVkljjsyJTHI6a5LukdNL9dNXKTqSCypcjm
 hAsMzee52FiZoQ==
 =B0II
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects updates from Thomas Gleixner:

 - Use the threshold to check for the pool refill condition and not the
   run time recorded all time low fill value, which is lower than the
   threshold and therefore causes refills to be delayed.

 - KCSAN annotation updates and simplification of the fill_pool() code.

* tag 'core-debugobjects-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Remove redundant checks in fill_pool()
  debugobjects: Fix conditions in fill_pool()
  debugobjects: Fix the compilation attributes of some global variables
2024-09-17 08:14:00 +02:00
Linus Torvalds
9ea925c806 Updates for timers and timekeeping:
- Core:
 
 	- Overhaul of posix-timers in preparation of removing the
 	  workaround for periodic timers which have signal delivery
 	  ignored.
 
         - Remove the historical extra jiffie in msleep()
 
 	  msleep() adds an extra jiffie to the timeout value to ensure
 	  minimal sleep time. The timer wheel ensures minimal sleep
 	  time since the large rewrite to a non-cascading wheel, but the
 	  extra jiffie in msleep() remained unnoticed. Remove it.
 
         - Make the timer slack handling correct for realtime tasks.
 
 	  The procfs interface is inconsistent and does neither reflect
 	  reality nor conforms to the man page. Show the correct 0 slack
 	  for real time tasks and enforce it at the core level instead of
 	  having inconsistent individual checks in various timer setup
 	  functions.
 
         - The usual set of updates and enhancements all over the place.
 
   - Drivers:
 
         - Allow the ACPI PM timer to be turned off during suspend
 
 	- No new drivers
 
 	- The usual updates and enhancements in various drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn7jQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobqnD/9COlU0nwsulABI/aNIrsh6iYvnCC9v
 14CcNta7Qn+157Wfw9BWOyHdNhR1/fPCXE8jJ71zTyIOeW27HV2JyTtxTwe9ZcdK
 ViHAaj7YcIjcVUEC3StCoRCPnvLslEw4qJA5AOQuDyMivdQn+YVa2c0baJxKaXZt
 xk4HZdMj4NAS0jRKnoZSwtKW/+Oz6rR4GAWrZo+Zs1/8ur3HfqnQfi8lJ1hJtLLW
 V7XDCVRvamVi6Ah3ocYPPp/1P6yeQDA1ge9aMddqaza5STWISXRtSnFMUmYP3rbS
 FaL8TyL+ilfny8pkGB2WlG6nLuSbtvogtdEh1gG1k1RmZt44kAtk8ba/KiWFPBSb
 zK9cjojRMBS71f9G4kmb5F4rnXoLsg1YbD1Nzhz3wq2Cs1Z90dc2QwMren0zoQ1x
 Fn56ueRyAiagBlnrSaKyso/2RvqJTNoSdi3RkpjYeAph0UoDCqvTvKjGAf1mWiw1
 T/1lUWSVqWHnzZbM7XXzzajIN9bl6A7bbqlcAJ2O9vZIDt7273DG+bQym9Vh6Why
 0LTGGERHxzKBsG7WRg+2Gmvv6S18UPKRo8tLtlA758rHlFuPTZCShWrIriwSNl1K
 Hxon+d4BparSnm1h9W/NHPKJA574UbWRCBjdk58IkAj8DxZZY4ORD9SMP+ggkV7G
 F6p9cgoDNP9KFg==
 =jE0N
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Core:

   - Overhaul of posix-timers in preparation of removing the workaround
     for periodic timers which have signal delivery ignored.

   - Remove the historical extra jiffie in msleep()

     msleep() adds an extra jiffie to the timeout value to ensure
     minimal sleep time. The timer wheel ensures minimal sleep time
     since the large rewrite to a non-cascading wheel, but the extra
     jiffie in msleep() remained unnoticed. Remove it.

   - Make the timer slack handling correct for realtime tasks.

     The procfs interface is inconsistent and does neither reflect
     reality nor conforms to the man page. Show the correct 0 slack for
     real time tasks and enforce it at the core level instead of having
     inconsistent individual checks in various timer setup functions.

   - The usual set of updates and enhancements all over the place.

  Drivers:

   - Allow the ACPI PM timer to be turned off during suspend

   - No new drivers

   - The usual updates and enhancements in various drivers"

* tag 'timers-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  ntp: Make sure RTC is synchronized when time goes backwards
  treewide: Fix wrong singular form of jiffies in comments
  cpu: Use already existing usleep_range()
  timers: Rename next_expiry_recalc() to be unique
  platform/x86:intel/pmc: Fix comment for the pmc_core_acpi_pm_timer_suspend_resume function
  clocksource/drivers/jcore: Use request_percpu_irq()
  clocksource/drivers/cadence-ttc: Add missing clk_disable_unprepare in ttc_setup_clockevent
  clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init
  clocksource/drivers/qcom: Add missing iounmap() on errors in msm_dt_timer_init()
  clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers
  platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended
  clocksource: acpi_pm: Add external callback for suspend/resume
  clocksource/drivers/arm_arch_timer: Using for_each_available_child_of_node_scoped()
  dt-bindings: timer: rockchip: Add rk3576 compatible
  timers: Annotate possible non critical data race of next_expiry
  timers: Remove historical extra jiffie for timeout in msleep()
  hrtimer: Use and report correct timerslack values for realtime tasks
  hrtimer: Annotate hrtimer_cpu_base_.*_expiry() for sparse.
  timers: Add sparse annotation for timer_sync_wait_running().
  signal: Replace BUG_ON()s
  ...
2024-09-17 07:25:37 +02:00
Linus Torvalds
cb69d86550 Updates for the interrupt subsystem:
- Core:
 	- Remove a global lock in the affinity setting code
 
 	  The lock protects a cpumask for intermediate results and the lock
 	  causes a bottleneck on simultaneous start of multiple virtual
 	  machines. Replace the lock and the static cpumask with a per CPU
 	  cpumask which is nicely serialized by raw spinlock held when
 	  executing this code.
 
 	- Provide support for giving a suffix to interrupt domain names.
 
 	  That's required to support devices with subfunctions so that the
 	  domain names are distinct even if they originate from the same
 	  device node.
 
 	- The usual set of cleanups and enhancements all over the place
 
   - Drivers:
 
 	- Support for longarch AVEC interrupt chip
 
 	- Refurbishment of the Armada driver so it can be extended for new
           variants.
 
 	- The usual set of cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn5p8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRFtD/43eB3h5usY2OPW0JmDqrE6qnzsvjPZ
 1H52BcmMcOuI6yCfTnbi/fBB52mwSEGq9Dmt1GXradyq9/CJDIqZ1ajI1rA2jzW2
 YdbeTDpKm1rS2ddzfp2LT2BryrNt+7etrRO7qHn4EKSuOcNuV2f58WPbIIqasvaK
 uPbUDVDPrvXxLNcjoab6SqaKrEoAaHSyKpd0MvDd80wHrtcSC/QouW7JDSUXv699
 RwvLebN1OF6mQ2J8Z3DLeCQpcbAs+UT8UvID7kYUJi1g71J/ZY+xpMLoX/gHiDNr
 isBtsuEAiZeNaFpksc7A6Jgu5ljZf2/aLCqbPLlHaduHFNmo94x9KUbIF2cpEMN+
 rsf5Ff7AVh1otz3cUwLLsm+cFLWRRoZdLuncn7rrgB4Yg0gll7qzyLO6YGvQHr8U
 Ocj1RXtvvWsMk4XzhgCt1AH/42cO6go+bhA4HspeYykNpsIldIUl1MeFbO8sWiDJ
 kybuwiwHp3oaMLjEK4Lpq65u7Ll8Lju2zRde65YUJN2nbNmJFORrOLmeC1qsr6ri
 dpend6n2qD9UD1oAt32ej/uXnG160nm7UKescyxiZNeTm1+ez8GW31hY128ifTY3
 4R3urGS38p3gazXBsfw6eqkeKx0kEoDNoQqrO5gBvb8kowYTvoZtkwMGAN9OADwj
 w6vvU0i+NIyVMA==
 =JlJ2
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Core:

   - Remove a global lock in the affinity setting code

     The lock protects a cpumask for intermediate results and the lock
     causes a bottleneck on simultaneous start of multiple virtual
     machines. Replace the lock and the static cpumask with a per CPU
     cpumask which is nicely serialized by raw spinlock held when
     executing this code.

   - Provide support for giving a suffix to interrupt domain names.

     That's required to support devices with subfunctions so that the
     domain names are distinct even if they originate from the same
     device node.

   - The usual set of cleanups and enhancements all over the place

  Drivers:

   - Support for longarch AVEC interrupt chip

   - Refurbishment of the Armada driver so it can be extended for new
     variants.

   - The usual set of cleanups and enhancements all over the place"

* tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
  genirq: Use cpumask_intersects()
  genirq/cpuhotplug: Use cpumask_intersects()
  irqchip/apple-aic: Only access system registers on SoCs which provide them
  irqchip/apple-aic: Add a new "Global fast IPIs only" feature level
  irqchip/apple-aic: Skip unnecessary enabling of use_fast_ipi
  dt-bindings: apple,aic: Document A7-A11 compatibles
  irqdomain: Use IS_ERR_OR_NULL() in irq_domain_trim_hierarchy()
  genirq/msi: Use kmemdup_array() instead of kmemdup()
  genirq/proc: Change the return value for set affinity permission error
  genirq/proc: Use irq_move_pending() in show_irq_affinity()
  genirq/proc: Correctly set file permissions for affinity control files
  genirq: Get rid of global lock in irq_do_set_affinity()
  genirq: Fix typo in struct comment
  irqchip/loongarch-avec: Add AVEC irqchip support
  irqchip/loongson-pch-msi: Prepare get_pch_msi_handle() for AVECINTC
  irqchip/loongson-eiointc: Rename CPUHP_AP_IRQ_LOONGARCH_STARTING
  LoongArch: Architectural preparation for AVEC irqchip
  LoongArch: Move irqchip function prototypes to irq-loongson.h
  irqchip/loongson-pch-msi: Switch to MSI parent domains
  softirq: Remove unused 'action' parameter from action callback
  ...
2024-09-17 07:09:17 +02:00
Linus Torvalds
35219bc5c7 vfs-6.12.netfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZuQEvgAKCRCRxhvAZXjc
 onQWAQD6IxAKPU0zom2FoWNilvSzPs7WglTtvddX9pu/lT1RNAD/YC/wOLW8mvAv
 9oTAmigQDQQhEWdJA9RgLZBiw7k+DAw=
 =zWFb
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull netfs updates from Christian Brauner:
 "This contains the work to improve read/write performance for the new
  netfs library.

  The main performance enhancing changes are:

   - Define a structure, struct folio_queue, and a new iterator type,
     ITER_FOLIOQ, to hold a buffer as a replacement for ITER_XARRAY. See
     that patch for questions about naming and form.

     ITER_FOLIOQ is provided as a replacement for ITER_XARRAY. The
     problem with an xarray is that accessing it requires the use of a
     lock (typically the RCU read lock) - and this means that we can't
     supply iterate_and_advance() with a step function that might sleep
     (crypto for example) without having to drop the lock between pages.
     ITER_FOLIOQ is the iterator for a chain of folio_queue structs,
     where each folio_queue holds a small list of folios. A folio_queue
     struct is a simpler structure than xarray and is not subject to
     concurrent manipulation by the VM. folio_queue is used rather than
     a bvec[] as it can form lists of indefinite size, adding to one end
     and removing from the other on the fly.

   - Provide a copy_folio_from_iter() wrapper.

   - Make cifs RDMA support ITER_FOLIOQ.

   - Use folio queues in the write-side helpers instead of xarrays.

   - Add a function to reset the iterator in a subrequest.

   - Simplify the write-side helpers to use sheaves to skip gaps rather
     than trying to work out where gaps are.

   - In afs, make the read subrequests asynchronous, putting them into
     work items to allow the next patch to do progressive
     unlocking/reading.

   - Overhaul the read-side helpers to improve performance.

   - Fix the caching of a partial block at the end of a file.

   - Allow a store to be cancelled.

  Then some changes for cifs to make it use folio queues instead of
  xarrays for crypto bufferage:

   - Use raw iteration functions rather than manually coding iteration
     when hashing data.

   - Switch to using folio_queue for crypto buffers.

   - Remove the xarray bits.

  Make some adjustments to the /proc/fs/netfs/stats file such that:

   - All the netfs stats lines begin 'Netfs:' but change this to
     something a bit more useful.

   - Add a couple of stats counters to track the numbers of skips and
     waits on the per-inode writeback serialisation lock to make it
     easier to check for this as a source of performance loss.

  Miscellaneous work:

   - Ensure that the sb_writers lock is taken around
     vfs_{set,remove}xattr() in the cachefiles code.

   - Reduce the number of conditional branches in netfs_perform_write().

   - Move the CIFS_INO_MODIFIED_ATTR flag to the netfs_inode struct and
     remove cifs_post_modify().

   - Move the max_len/max_nr_segs members from netfs_io_subrequest to
     netfs_io_request as they're only needed for one subreq at a time.

   - Add an 'unknown' source value for tracing purposes.

   - Remove NETFS_COPY_TO_CACHE as it's no longer used.

   - Set the request work function up front at allocation time.

   - Use bh-disabling spinlocks for rreq->lock as cachefiles completion
     may be run from block-filesystem DIO completion in softirq context.

   - Remove fs/netfs/io.c"

* tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
  docs: filesystems: corrected grammar of netfs page
  cifs: Don't support ITER_XARRAY
  cifs: Switch crypto buffer to use a folio_queue rather than an xarray
  cifs: Use iterate_and_advance*() routines directly for hashing
  netfs: Cancel dirty folios that have no storage destination
  cachefiles, netfs: Fix write to partial block at EOF
  netfs: Remove fs/netfs/io.c
  netfs: Speed up buffered reading
  afs: Make read subreqs async
  netfs: Simplify the writeback code
  netfs: Provide an iterator-reset function
  netfs: Use new folio_queue data type and iterator instead of xarray iter
  cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs
  iov_iter: Provide copy_folio_from_iter()
  mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
  netfs: Use bh-disabling spinlocks for rreq->lock
  netfs: Set the request work function upon allocation
  netfs: Remove NETFS_COPY_TO_CACHE
  netfs: Reserve netfs_sreq_source 0 as unset/unknown
  netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_stream
  ...
2024-09-16 12:13:31 +02:00
Linus Torvalds
85ffc6e4ed This update includes the following changes:
API:
 
 - Make self-test asynchronous.
 
 Algorithms:
 
 - Remove MPI functions added for SM3.
 - Add allocation error checks to remaining MPI functions (introduced for SM3).
 - Set default Jitter RNG OSR to 3.
 
 Drivers:
 
 - Add hwrng driver for Rockchip RK3568 SoC.
 - Allow disabling SR-IOV VFs through sysfs in qat.
 - Fix device reset bugs in hisilicon.
 - Fix authenc key parsing by using generic helper in octeontx*.
 
 Others:
 
 - Fix xor benchmarking on parisc.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmbnq/wACgkQxycdCkmx
 i6cyXw//cBgngKOuCv7tLqMPeSLC39jDJEKkP9tS9ZilYyxzg1b9cbnDLlKNk4Yq
 4A6rRqqh8PD3/yJT58pGXaU5Is5sVMQRqqgwFutXrkD+hsMLk2nlgzsWYhg6aUsY
 /THTfmKTwEgfc3qDLZq6xGOShmMdO6NiOGsH3MeEWhbgfrDuJlOkHXd7QncNa7q8
 NEP7kI3vBc0xFcOxzbjy8tSGYEmPft1LECXAKsgOycWj9Q0SkzPocmo59iSbM21b
 HfV0p3hgAEa5VgKv0Rc5/6PevAqJqOSjGNfRBSPZ97o7dop8sl/z/cOWiy8dM7wO
 xhd9g7XXtmML6UO2MpJPMJzsLgMsjmUTWO2UyEpIyst6RVfJlniOL/jGzWmZ/P2+
 vw/F/mX8k60Zv1du46PC3p6eBeH4Yx/2fEPvPTJus+DQHS9GchXtAKtMToyyUHc2
 6TAy0nOihVQK2Q3QuQ1B/ghQS4tkdOenOIYHSCf9a9nJamub+PqP8jWDw0Y2RcY6
 jSs+tk6hwHJaKnj/T/Mr0gVPX9L8KHCYBtZD7Qbr0NhoXOT6w47m6bbew/dzTN+0
 pmFsgz32fNm8vb8R8D0kZDF63s6uz6CN+P9Dx6Tab4X+87HxNdeaBPS/Le9tYgOC
 0MmE5oIquooqedpM5tW55yuyOHhLPGLQS2SDiA+Ke+WYbAC8SQc=
 =rG1X
 -----END PGP SIGNATURE-----

Merge tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu"
 "API:
   - Make self-test asynchronous

  Algorithms:
   - Remove MPI functions added for SM3
   - Add allocation error checks to remaining MPI functions (introduced
     for SM3)
   - Set default Jitter RNG OSR to 3

  Drivers:
   - Add hwrng driver for Rockchip RK3568 SoC
   - Allow disabling SR-IOV VFs through sysfs in qat
   - Fix device reset bugs in hisilicon
   - Fix authenc key parsing by using generic helper in octeontx*

  Others:
   - Fix xor benchmarking on parisc"

* tag 'v6.12-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (96 commits)
  crypto: n2 - Set err to EINVAL if snprintf fails for hmac
  crypto: camm/qi - Use ERR_CAST() to return error-valued pointer
  crypto: mips/crc32 - Clean up useless assignment operations
  crypto: qcom-rng - rename *_of_data to *_match_data
  crypto: qcom-rng - fix support for ACPI-based systems
  dt-bindings: crypto: qcom,prng: document support for SA8255p
  crypto: aegis128 - Fix indentation issue in crypto_aegis128_process_crypt()
  crypto: octeontx* - Select CRYPTO_AUTHENC
  crypto: testmgr - Hide ENOENT errors
  crypto: qat - Remove trailing space after \n newline
  crypto: hisilicon/sec - Remove trailing space after \n newline
  crypto: algboss - Pass instance creation error up
  crypto: api - Fix generic algorithm self-test races
  crypto: hisilicon/qm - inject error before stopping queue
  crypto: hisilicon/hpre - mask cluster timeout error
  crypto: hisilicon/qm - reset device before enabling it
  crypto: hisilicon/trng - modifying the order of header files
  crypto: hisilicon - add a lock for the qp send operation
  crypto: hisilicon - fix missed error branch
  crypto: ccp - do not request interrupt on cmd completion when irqs disabled
  ...
2024-09-16 06:28:28 +02:00
Masahiro Yamada
5277d13094 btf: require pahole 1.21+ for DEBUG_INFO_BTF with default DWARF version
As described in commit 42d9b379e3 ("lib/Kconfig.debug: Allow BTF +
DWARF5 with pahole 1.21+"), the combination of CONFIG_DEBUG_INFO_BTF
and CONFIG_DEBUG_INFO_DWARF5 requires pahole 1.21+.

GCC 11+ and Clang 14+ default to DWARF 5 when the -g flag is passed.
For the same reason, the combination of CONFIG_DEBUG_INFO_BTF and
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is also likely to require
pahole 1.21+ these days. (At least, it is uncertain whether the actual
requirement is pahole 1.16+ or 1.21+.)

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240913173759.1316390-3-masahiroy@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-13 20:03:29 -07:00
Masahiro Yamada
42450f7a90 btf: move pahole check in scripts/link-vmlinux.sh to lib/Kconfig.debug
When DEBUG_INFO_DWARF5 is selected, pahole 1.21+ is required to enable
DEBUG_INFO_BTF.

When DEBUG_INFO_DWARF4 or DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is selected,
DEBUG_INFO_BTF can be enabled without pahole installed, but a build error
will occur in scripts/link-vmlinux.sh:

    LD      .tmp_vmlinux1
  BTF: .tmp_vmlinux1: pahole (pahole) is not available
  Failed to generate BTF for vmlinux
  Try to disable CONFIG_DEBUG_INFO_BTF

We did not guard DEBUG_INFO_BTF by PAHOLE_VERSION when previously
discussed [1].

However, commit 613fe16923 ("kbuild: Add CONFIG_PAHOLE_VERSION")
added CONFIG_PAHOLE_VERSION after all. Now several CONFIG options, as
well as the combination of DEBUG_INFO_BTF and DEBUG_INFO_DWARF5, are
guarded by PAHOLE_VERSION.

The remaining compile-time check in scripts/link-vmlinux.sh now appears
to be an awkward inconsistency.

This commit adopts Nathan's original work.

[1]: https://lore.kernel.org/lkml/20210111180609.713998-1-natechancellor@gmail.com/

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240913173759.1316390-2-masahiroy@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-13 20:03:29 -07:00
Christophe Leroy
7f053812da random: vDSO: minimize and simplify header includes
Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel
is problematic when some system headers are included.

Minimise the amount of headers by moving needed items, such as
__{get,put}_unaligned_t, into dedicated common headers and in general
use more specific headers, similar to what was done in commit
8165b57bca ("linux/const.h: Extract common header for vDSO") and
commit 8c59ab839f ("lib/vdso: Enable common headers").

On some architectures this results in missing PAGE_SIZE, as was
described by commit 8b3843ae36 ("vdso/datapage: Quick fix - use
asm/page-def.h for ARM64"), so define this if necessary, in the same way
as done prior by commit cffaefd15a ("vdso: Use CONFIG_PAGE_SHIFT in
vdso/datapage.h").

Removing linux/time64.h leads to missing 'struct timespec64' in
x86's asm/pvclock.h. Add a forward declaration of that struct in
that file.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Christophe Leroy
b7bad082e1 random: vDSO: avoid call to out of line memset()
With the current implementation, __cvdso_getrandom_data() calls
memset() on certain architectures, which is unexpected in the VDSO.

Rather than providing a memset(), simply rewrite opaque data
initialization to avoid memset().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Christophe Leroy
81723e3ac3 random: vDSO: add missing c-getrandom-y in Makefile
Same as for the gettimeofday CVDSO implementation, add c-getrandom-y to
ease the inclusion of lib/vdso/getrandom.c in architectures' VDSO
builds.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Christophe Leroy
81c6896049 random: vDSO: don't use 64-bit atomics on 32-bit architectures
Performing SMP atomic operations on u64 fails on powerpc32:

    CC      drivers/char/random.o
  In file included from <command-line>:
  drivers/char/random.c: In function 'crng_reseed':
  ././include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_391' declared with attribute error: Need native word sized stores/loads for atomicity.
    510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |                                             ^
  ././include/linux/compiler_types.h:491:25: note: in definition of macro '__compiletime_assert'
    491 |                         prefix ## suffix();                             \
        |                         ^~~~~~
  ././include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert'
    510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |         ^~~~~~~~~~~~~~~~~~~
  ././include/linux/compiler_types.h:513:9: note: in expansion of macro 'compiletime_assert'
    513 |         compiletime_assert(__native_word(t),                            \
        |         ^~~~~~~~~~~~~~~~~~
  ./arch/powerpc/include/asm/barrier.h:74:9: note: in expansion of macro 'compiletime_assert_atomic_type'
     74 |         compiletime_assert_atomic_type(*p);                             \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ./include/asm-generic/barrier.h:172:55: note: in expansion of macro '__smp_store_release'
    172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
        |                                                       ^~~~~~~~~~~~~~~~~~~
  drivers/char/random.c:286:9: note: in expansion of macro 'smp_store_release'
    286 |         smp_store_release(&__arch_get_k_vdso_rng_data()->generation, next_gen + 1);
        |         ^~~~~~~~~~~~~~~~~

The kernel-side generation counter in the random driver is handled as an
unsigned long, not as a u64, in base_crng and struct crng.

But on the vDSO side, it needs to be an u64, not just an unsigned long,
in order to support a 32-bit vDSO atop a 64-bit kernel.

On kernel side, however, it is an unsigned long, hence a 32-bit value on
32-bit architectures, so just cast it to unsigned long for the
smp_store_release(). A side effect is that on big endian architectures
the store will be performed in the upper 32 bits. It is not an issue on
its own because the vDSO site doesn't mind the value, as it only checks
differences. Just make sure that the vDSO side checks the full 64 bits.
For that, the local current_generation has to be u64 as well.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13 17:28:35 +02:00
Luis Felipe Hernandez
7fcc9b5321 lib/math: Add int_pow test suite
Adds test suite for integer based power function which performs integer
exponentiation.

The test suite is designed to verify that the implementation of int_pow
correctly computes the power of a given base raised to a given exponent.

The tests check various scenarios and edge cases to ensure the accuracy
and reliability of the exponentiation function.

Updated commit with test information at commit time: Shuah Khan

Signed-off-by: Luis Felipe Hernandez <luis.hernandez093@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-09-12 10:03:00 -06:00
David Howells
db0aa2e956
mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
Define a data structure, struct folio_queue, to represent a sequence of
folios and a kernel-internal I/O iterator type, ITER_FOLIOQ, to allow a
list of folio_queue structures to be used to provide a buffer to
iov_iter-taking functions, such as sendmsg and recvmsg.

The folio_queue structure looks like:

	struct folio_queue {
		struct folio_batch	vec;
		u8			orders[PAGEVEC_SIZE];
		struct folio_queue	*next;
		struct folio_queue	*prev;
		unsigned long		marks;
		unsigned long		marks2;
	};

It does not use a list_head so that next and/or prev can be set to NULL at
the ends of the list, allowing iov_iter-handling routines to determine that
they *are* the ends without needing to store a head pointer in the iov_iter
struct.

A folio_batch struct is used to hold the folio pointers which allows the
batch to be passed to batch handling functions.  Two mark bits are
available per slot.  The intention is to use at least one of them to mark
folios that need putting, but that might not be ultimately necessary.
Accessor functions are used to access the slots to do the masking and an
additional accessor function is used to indicate the size of the array.

The order of each folio is also stored in the structure to avoid the need
for iov_iter_advance() and iov_iter_revert() to have to query each folio to
find its size.

With careful barriering, this can be used as an extending buffer with new
folios inserted and new folio_queue structs added without the need for a
lock.  Further, provided we always keep at least one struct in the buffer,
we can also remove consumed folios and consumed structs from the head end
as we without the need for locks.

[Questions/thoughts]

 (1) To manage this, I need a head pointer, a tail pointer, a tail slot
     number (assuming insertion happens at the tail end and the next
     pointers point from head to tail).  Should I put these into a struct
     of their own, say "folio_queue_head" or "rolling_buffer"?

     I will end up with two of these in netfs_io_request eventually, one
     keeping track of the pagecache I'm dealing with for buffered I/O and
     the other to hold a bounce buffer when we need one.

 (2) Should I make the slots {folio,off,len} or bio_vec?

 (3) This is intended to replace ITER_XARRAY eventually.  Using an xarray
     in I/O iteration requires the taking of the RCU read lock, doing
     copying under the RCU read lock, walking the xarray (which may change
     under us), handling retries and dealing with special values.

     The advantage of ITER_XARRAY is that when we're dealing with the
     pagecache directly, we don't need any allocation - but if we're doing
     encrypted comms, there's a good chance we'd be using a bounce buffer
     anyway.

     This will require afs, erofs, cifs, orangefs and fscache to be
     converted to not use this.  afs still uses it for dirs and symlinks;
     some of erofs usages should be easy to change, but there's one which
     won't be so easy; ceph's use via fscache can be fixed by porting ceph
     to netfslib; cifs is using xarray as a bounce buffer - that can be
     moved to use sheaves instead; and orangefs has a similar problem to
     erofs - maybe orangefs could use netfslib?

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Gao Xiang <xiang@kernel.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
cc: linux-afs@lists.infradead.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-erofs@lists.ozlabs.org
cc: devel@lists.orangefs.org
Link: https://lore.kernel.org/r/20240814203850.2240469-13-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-12 12:20:21 +02:00
Andrii Nakryiko
cdbb44f9a7 lib/buildid: don't limit .note.gnu.build-id to the first page in ELF
With freader we don't need to restrict ourselves to a single page, so
let's allow ELF notes to be at any valid position with the file.

We also merge parse_build_id() and parse_build_id_buf() as now the only
difference between them is note offset overflow, which makes sense to
check in all situations.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:31 -07:00
Andrii Nakryiko
ad41251c29 lib/buildid: implement sleepable build_id_parse() API
Extend freader with a flag specifying whether it's OK to cause page
fault to fetch file data that is not already physically present in
memory. With this, it's now easy to wait for data if the caller is
running in sleepable (faultable) context.

We utilize read_cache_folio() to bring the desired folio into page
cache, after which the rest of the logic works just the same at folio level.

Suggested-by: Omar Sandoval <osandov@fb.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:31 -07:00
Andrii Nakryiko
45b8fc3096 lib/buildid: rename build_id_parse() into build_id_parse_nofault()
Make it clear that build_id_parse() assumes that it can take no page
fault by renaming it and current few users to build_id_parse_nofault().

Also add build_id_parse() stub which for now falls back to non-sleepable
implementation, but will be changed in subsequent patches to take
advantage of sleepable context. PROCMAP_QUERY ioctl() on
/proc/<pid>/maps file is using build_id_parse() and will automatically
take advantage of more reliable sleepable context implementation.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
4e9d360c4c lib/buildid: remove single-page limit for PHDR search
Now that freader allows to access multiple pages transparently, there is
no need to limit program headers to the very first ELF file page. Remove
this limitation, but still put some sane limit on amount of program
headers that we are willing to iterate over (set arbitrarily to 256).

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
d4deb82423 lib/buildid: take into account e_phoff when fetching program headers
Current code assumption is that program (segment) headers are following
ELF header immediately. This is a common case, but is not guaranteed. So
take into account e_phoff field of the ELF header when accessing program
headers.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
de3ec364c3 lib/buildid: add single folio-based file reader abstraction
Add freader abstraction that transparently manages fetching and local
mapping of the underlying file page(s) and provides a simple direct data
access interface.

freader_fetch() is the only and single interface necessary. It accepts
file offset and desired number of bytes that should be accessed, and
will return a kernel mapped pointer that caller can use to dereference
data up to requested size. Requested size can't be bigger than the size
of the extra buffer provided during initialization (because, worst case,
all requested data has to be copied into it, so it's better to flag
wrongly sized buffer unconditionally, regardless if requested data range
is crossing page boundaries or not).

If folio is not paged in, or some of the conditions are not satisfied,
NULL is returned and more detailed error code can be accessed through
freader->err field. This approach makes the usage of freader_fetch()
cleaner.

To accommodate accessing file data that crosses folio boundaries, user
has to provide an extra buffer that will be used to make a local copy,
if necessary. This is done to maintain a simple linear pointer data
access interface.

We switch existing build ID parsing logic to it, without changing or
lifting any of the existing constraints, yet. This will be done
separately.

Given existing code was written with the assumption that it's always
working with a single (first) page of the underlying ELF file, logic
passes direct pointers around, which doesn't really work well with
freader approach and would be limiting when removing the single page (folio)
limitation. So we adjust all the logic to work in terms of file offsets.

There is also a memory buffer-based version (freader_init_from_mem())
for cases when desired data is already available in kernel memory. This
is used for parsing vmlinux's own build ID note. In this mode assumption
is that provided data starts at "file offset" zero, which works great
when parsing ELF notes sections, as all the parsing logic is relative to
note section's start.

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Andrii Nakryiko
905415ff3f lib/buildid: harden build ID parsing logic
Harden build ID parsing logic, adding explicit READ_ONCE() where it's
important to have a consistent value read and validated just once.

Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
note is within a page bounds, so move the overflow check up and add an
extra note_size boundaries validation.

Fixes tag below points to the code that moved this code into
lib/buildid.c, and then subsequently was used in perf subsystem, making
this code exposed to perf_event_open() users in v5.12+.

Cc: stable@vger.kernel.org
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Jann Horn <jannh@google.com>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Fixes: bd7525dacd ("bpf: Move stack_map_get_build_id into lib")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240829174232.3133883-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-09-11 09:58:30 -07:00
Thomas Gleixner
2f7eedca6c Merge branch 'linus' into timers/core
To update with the latest fixes.
2024-09-10 13:49:53 +02:00
Alok Swaminathan
7b0a5b6669 lib: glob.c: added null check for character class
Add null check for character class.  Previously, an inverted character
class could result in a nul byte being matched and lead to the function
reading past the end of the inputted string.

Link: https://lkml.kernel.org/r/20240826155709.12383-1-swaminathanalok@gmail.com
Signed-off-by: Alok Swaminathan <swaminathanalok@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:47:41 -07:00
Liam R. Howlett
1930c6ad93 maple_tree: mark three functions as __maybe_unused
People keep trying to remove three functions that are going to be used in
a feature that is being developed.  Dropping the functions entirely may
end up with people trying to use the bit for other uses, as people have
tried in the past.

Adding __maybe_unused stops compilers complaining about the unused
functions so they can be silently optimised out of the compiled code and
people won't try to claim the bit for another use.

Link: https://lore.kernel.org/all/20230726080916.17454-2-zhangpeng.00@bytedance.com/
Link: https://lore.kernel.org/all/202408310728.S7EE59BN-lkp@intel.com/
Link: https://lkml.kernel.org/r/20240907021506.4018676-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:17 -07:00
Sergey Senozhatsky
f3c11cf5ca lib: zstd: fix null-deref in ZSTD_createCDict_advanced2()
ZSTD_createCDict_advanced2() must ensure that
ZSTD_createCDict_advanced_internal() has successfully allocated cdict. 
customMalloc() may be called under low memory condition and may be unable
to allocate workspace for cdict.

Link: https://lkml.kernel.org/r/20240902105656.1383858-4-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:06 -07:00
Sergey Senozhatsky
7518847430 lib: lz4hc: export LZ4_resetStreamHC symbol
This symbol is needed to enable lz4hc dictionary support.

Link: https://lkml.kernel.org/r/20240902105656.1383858-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:06 -07:00
Sergey Senozhatsky
4fc4187984 lib: zstd: export API needed for dictionary support
Patch series "zram: introduce custom comp backends API", v7.

This series introduces support for run-time compression algorithms tuning,
so users, for instance, can adjust compression/acceleration levels and
provide pre-trained compression/decompression dictionaries which certain
algorithms support.

At this point we stop supporting (old/deprecated) comp API.  We may add
new acomp API support in the future, but before that zram needs to undergo
some major rework (we are not ready for async compression).

Some benchmarks for reference (look at column #2)

*** init zstd
/sys/block/zram0/mm_stat
1750659072 504622188 514355200        0 514355200        1        0    34204    34204

*** init zstd dict=/home/ss/zstd-dict-amd64
/sys/block/zram0/mm_stat
1750650880 465908890 475398144        0 475398144        1        0    34185    34185

*** init zstd level=8 dict=/home/ss/zstd-dict-amd64
/sys/block/zram0/mm_stat
1750654976 430803319 439873536        0 439873536        1        0    34185    34185

*** init lz4
/sys/block/zram0/mm_stat
1750646784 664266564 677060608        0 677060608        1        0    34288    34288

*** init lz4 dict=/home/ss/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750650880 619990300 632102912        0 632102912        1        0    34278    34278

*** init lz4hc
/sys/block/zram0/mm_stat
1750630400 609023822 621232128        0 621232128        1        0    34288    34288

*** init lz4hc dict=/home/ss/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750659072 505133172 515231744        0 515231744        1        0    34278    34278


Recompress
init zram zstd (prio=0), zstd level=5 (prio 1), zstd with dict (prio 2)

*** zstd
/sys/block/zram0/mm_stat
1750982656 504630584 514269184        0 514269184        1        0    34204    34204

*** idle recompress priority=1 (zstd level=5)
/sys/block/zram0/mm_stat
1750982656 488645601 525438976        0 514269184        1        0    34204    34204

*** idle recompress priority=2 (zstd dict)
/sys/block/zram0/mm_stat
1750982656 460869640 517914624        0 514269184        1        0    34185    34204


This patch (of 24):

We need to export a number of API functions that enable advanced zstd
usage - C/D dictionaries, dictionaries sharing between contexts, etc.

Link: https://lkml.kernel.org/r/20240902105656.1383858-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20240902105656.1383858-2-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:06 -07:00
Wei Yang
96ae4c9019 maple_tree: cleanup function descriptions
This patch tries to cleanup some function description:

  * function name mismatch
  * parameter name mismatch
  * parameter all end up with ':'
  * not prefix '*' if parameter is a pointer

There is still some missing description of parameters, I didn't add them
since I am not sure the exact meaning.

Link: https://lkml.kernel.org/r/20240830220400.2007-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:05 -07:00
Wei Yang
21a449bedf maple_tree: dump error message based on format
Just do what mt_dump_range64() does.

Dump the error message based on format.

Link: https://lkml.kernel.org/r/20240826012422.29935-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:01 -07:00
Wei Yang
8152831069 maple_tree: arange64 node is not a leaf node
mt_dump_arange64() only applies to an entry whose type is maple_arange_64,
in which mte_is_leaf() must return false.

Since mte_is_leaf() here is always false, we can remove this condition
check.

Link: https://lkml.kernel.org/r/20240826012422.29935-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-09 16:39:01 -07:00
Zhen Lei
63a4a9b52c debugobjects: Remove redundant checks in fill_pool()
fill_pool() checks locklessly at the beginning whether the pool has to be
refilled. After that it checks locklessly in a loop whether the free list
contains objects and repeats the refill check.

If both conditions are true, it acquires the pool lock and tries to move
objects from the free list to the pool repeating the same checks again.

There are two redundant issues with that:

      1) The repeated check for the fill condition
      2) The loop processing

The repeated check is pointless as it was just established that fill is
required. The condition has to be re-evaluated under the lock anyway.

The loop processing is not required either because there is practically
zero chance that a repeated attempt will succeed if the checks under the
lock terminate the moving of objects.

Remove the redundant check and replace the loop with a simple if condition.

[ tglx: Massaged change log ]

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-4-thunder.leizhen@huawei.com
2024-09-09 16:40:26 +02:00
Zhen Lei
684d28feb8 debugobjects: Fix conditions in fill_pool()
fill_pool() uses 'obj_pool_min_free' to decide whether objects should be
handed back to the kmem cache. But 'obj_pool_min_free' records the lowest
historical value of the number of objects in the object pool and not the
minimum number of objects which should be kept in the pool.

Use 'debug_objects_pool_min_level' instead, which holds the minimum number
which was scaled to the number of CPUs at boot time.

[ tglx: Massage change log ]

Fixes: d26bf5056f ("debugobjects: Reduce number of pool_lock acquisitions in fill_pool()")
Fixes: 36c4ead6f6 ("debugobjects: Add global free list and the counter")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240904133944.2124-3-thunder.leizhen@huawei.com
2024-09-09 16:40:25 +02:00
Zhen Lei
e4757c710b debugobjects: Fix the compilation attributes of some global variables
1. Both debug_objects_pool_min_level and debug_objects_pool_size are
   read-only after initialization, change attribute '__read_mostly' to
   '__ro_after_init', and remove '__data_racy'.

2. Many global variables are read in the debug_stats_show() function, but
   didn't mask KCSAN's detection. Add '__data_racy' for them.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-2-thunder.leizhen@huawei.com
2024-09-09 16:40:25 +02:00
Kent Overstreet
b3f9da79e7 lib/generic-radix-tree.c: add preallocation
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:47 -04:00
Kent Overstreet
f659463381 lib/generic-radix-tree.c: genradix_ptr_inlined()
Provide an inlined fast path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:47 -04:00
Anna-Maria Behnsen
bd7c8ff9fe treewide: Fix wrong singular form of jiffies in comments
There are several comments all over the place, which uses a wrong singular
form of jiffies.

Replace 'jiffie' by 'jiffy'. No functional change.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
2024-09-08 20:47:40 +02:00
Jakub Kicinski
502cc061de Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/phy/phy_device.c
  2560db6ede ("net: phy: Fix missing of_node_put() for leds")
  1dce520abd ("net: phy: Use for_each_available_child_of_node_scoped()")
https://lore.kernel.org/20240904115823.74333648@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/xilinx/xilinx_axienet.h
drivers/net/ethernet/xilinx/xilinx_axienet_main.c
  858430db28 ("net: xilinx: axienet: Fix race in axienet_stop")
  76abb5d675 ("net: xilinx: axienet: Add statistics support")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05 20:37:20 -07:00
Linus Torvalds
120434e5b3 linux_kselftest-kunit-fixes-6.11-rc7
This kunit update for Linux 6.11-rc7 consist of one single fix to
 a use-after-free bug resulting from kunit_driver_create() failing
 to copy the driver name leaving it on the stack or freeing it.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmbY0WMACgkQCwJExA0N
 QxzCgBAA7Cb6tyvGcXsQTXC50S90CR+3bGmHzTL8jl/ElHvTz521UzPTn01QB51t
 JcGNhKz3RByvRBuukhg7abpCnCYWZoa9pmxojVD5D1TO2AXvypWEv0ao/UwSAYyi
 2b7BTkcc7ciRske51/yFfipjwI/NLLIlu4HVcZ0OisOt+tvHzoz50KiyYV+Qan8r
 e8NkqVI587KLfDAZRC+cLXyJCIRwlCK+jNMrjoiOanv1Ybe65eAGNQmAIyuGX1Fo
 Ku8ZgoCgpc+Vjc1bMWgwgHWCdFOvINdd7ibfCp59JBBAkqYFpHYS5Lk9kHWH6lYF
 X9THLaCSh5cq+u0qksW8p4ml1fYnWZbm92qkdPj0wG36v9la769HSXijtVhL2lxD
 b1ca/NpfNfbbr5mxoVRq4ulO1JvyC6jmRKSJWt1p1SFfHf+Oaowh2Sr2ZjFfOozj
 +/Joh3n2dxlnH/in8BvXGwQIo7xbyTatm/4IVCccJAolR+hPv7izBeWfYn3xgtu5
 5WZVcxPMxNwgNHWnxm2nbxTtBTvTsOSC8/nbxm8g3jM9cHCP7Mz3/zSV6p2vcRxm
 HPx/Qj2LmNcPKGXs4jh7WLErgkunxlvsqCJChwGjZoYR0fgRmzCgrwbkDE6/26UW
 Teo51bWwD/CxTy7OtXi8D2pPzVqt8u5cFPaNgHaRzxLDuVTouhU=
 =JRC5
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fix fromShuah Khan:
 "One single fix to a use-after-free bug resulting from
  kunit_driver_create() failing to copy the driver name leaving it on
  the stack or freeing it"

* tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Device wrappers should also manage driver name
2024-09-05 09:43:38 -07:00
Tejun Heo
649e980dad Merge branch 'bpf/master' into for-6.12
Pull bpf/master to receive baebe9aaba ("bpf: allow passing struct
bpf_iter_<type> as kfunc arguments") and related changes in preparation for
the DSQ iterator patchset.

Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-04 11:41:32 -10:00
Matthew Wilcox (Oracle)
e27ad6560e printf: remove %pGt support
Patch series "Increase the number of bits available in page_type".

Kent wants more than 16 bits in page_type, so I resurrected this old patch
and expanded it a bit.  It's a bit more efficient than our current scheme
(1 4-byte insn vs 3 insns of 13 bytes total) to test a single page type.


This patch (of 4):

An upcoming patch will convert page type from being a bitfield to a
single byte, so we will not be able to use %pG to print the page type
any more.  The printing of the symbolic name will be restored in that
patch.

Link: https://lkml.kernel.org/r/20240821173914.2270383-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20240821173914.2270383-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-03 21:15:42 -07:00
Alexander Lobakin
00d066a4d4 netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature,
rather an attribute, very similar to IFF_NO_QUEUE (and hot).
Free one netdev_features_t bit and make it a "hot" private flag.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03 11:36:43 +02:00
Yang Ruibin
38676d9e33 lib: fix the NULL vs IS_ERR() bug for debugfs_create_dir()
debugfs_create_dir() returns error pointers.  It never returns NULL.  So
use IS_ERR() to check it.

Link: https://lkml.kernel.org/r/20240821073441.9701-1-11162571@vivo.com
Signed-off-by: Yang Ruibin <11162571@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:40 -07:00
Andy Shevchenko
fb54ea1ee8 dimlib: use *-y instead of *-objs in Makefile
*-objs suffix is reserved rather for (user-space) host programs while
usually *-y suffix is used for kernel drivers (although *-objs works for
that purpose for now).

Let's correct the old usages of *-objs in Makefiles.

Link: https://lkml.kernel.org/r/20240821155140.611514-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tal Gilboa <talgi@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:40 -07:00
Uros Bizjak
16d9691ad4 lib/percpu_counter: add missing __percpu qualifier to a cast
Add missing __percpu qualifier to a (void *) cast to fix

percpu_counter.c:212:36: warning: cast removes address space '__percpu' of expression
percpu_counter.c:212:33: warning: incorrect type in assignment (different address spaces)
percpu_counter.c:212:33:    expected signed int [noderef] [usertype] __percpu *counters
percpu_counter.c:212:33:    got void *

sparse warnings.

Found by GCC's named address space checks.

There were no changes in the resulting object file.

Link: https://lkml.kernel.org/r/20240814064437.940162-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:34 -07:00
Kuan-Wei Chiu
cbf164cd44 lib/bcd: optimize _bin2bcd() for improved performance
The original _bin2bcd() function used / 10 and % 10 operations for
conversion.  Although GCC optimizes these operations and does not generate
division or modulus instructions, the new implementation reduces the
number of mov instructions in the generated code for both x86-64 and ARM
architectures.

This optimization calculates the tens digit using (val * 103) >> 10, which
is accurate for values of 'val' in the range [0, 178].  Given that the
valid input range is [0, 99], this method ensures correctness while
simplifying the generated code.

Link: https://lkml.kernel.org/r/20240812170229.229380-1-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:33 -07:00
Jani Nikula
6ce2082fd3 fault-inject: improve build for CONFIG_FAULT_INJECTION=n
The fault-inject.h users across the kernel need to add a lot of #ifdef
CONFIG_FAULT_INJECTION to cater for shortcomings in the header.  Make
fault-inject.h self-contained for CONFIG_FAULT_INJECTION=n, and add stubs
for DECLARE_FAULT_ATTR(), setup_fault_attr(), should_fail_ex(), and
should_fail() to allow removal of conditional compilation.

[akpm@linux-foundation.org: repair fallout from no longer including debugfs.h into fault-inject.h]
[akpm@linux-foundation.org: fix drivers/misc/xilinx_tmr_inject.c]
[akpm@linux-foundation.org: Add debugfs.h inclusion to more files, per Stephen]
Link: https://lkml.kernel.org/r/20240813121237.2382534-1-jani.nikula@intel.com
Fixes: 6ff1cb355e ("[PATCH] fault-injection capabilities infrastructure")
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:33 -07:00
Davidlohr Bueso
a15bec6a8f lib/rhashtable: cleanup fallback check in bucket_table_alloc()
Upon allocation failure, the current check with the nofail bits is
unnecessary, and further stands in the way of discouraging direct use of
__GFP_NOFAIL.  Remove this and replace with the proper way of determining
if doing a non-blocking allocation for the nested table case.

Link: https://lkml.kernel.org/r/20240806153927.184515-1-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Suggested-by: Michal Hocko <mhocko@suse.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:32 -07:00
J. R. Okajima
e0ba72e3a4 lockdep: upper limit LOCKDEP_CHAINS_BITS
CONFIG_LOCKDEP_CHAINS_BITS value decides the size of chain_hlocks[] in
kernel/locking/lockdep.c, and it is checked by add_chain_cache() with
    BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
This patch is just to silence BUILD_BUG_ON().

See also https://lore.kernel.org/all/30795.1620913191@jrobl/

[cmllamas@google.com: fix minor checkpatch issues in commit log]
Link: https://lkml.kernel.org/r/20240723164018.2489615-1-cmllamas@google.com
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:32 -07:00
Thorsten Blum
b6e21b7120 lib: checksum: use ARRAY_SIZE() to improve assert_setup_correct()
Use ARRAY_SIZE() to simplify the assert_setup_correct() function and
improve its readability.

Link: https://lkml.kernel.org/r/20240726154946.230928-1-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:30 -07:00
Deshan Zhang
9a42bfd255 lib/lru_cache: fix spelling mistake "colision"->"collision"
There is a spelling mistake in a literal string and in cariable names. 
Fix these.

Link: https://lkml.kernel.org/r/20240725093044.1742842-1-deshan@nfschina.com
Signed-off-by: Deshan Zhang <deshan@nfschina.com>
Cc: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:29 -07:00
Markus Elfring
fbe617af69 closures: use seq_putc() in debug_show()
A single line break should be put into a sequence.  Thus use the
corresponding function "seq_putc".

This issue was transformed by using the Coccinelle software.

Link: https://lkml.kernel.org/r/e7faa2c4-9590-44b4-8669-69ef810277b1@web.de
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:29 -07:00
Markus Elfring
7b76689a02 dyndbg: use seq_putc() in ddebug_proc_show()
Single characters should be put into a sequence.  Thus use the
corresponding function "seq_putc".

This issue was transformed by using the Coccinelle software.

Link: https://lkml.kernel.org/r/375b5b4b-6295-419e-bae9-da724a7a682d@web.de
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:29 -07:00
Lasse Collin
c6f371bab2 xz: remove XZ_EXTERN and extern from functions
XZ_EXTERN was used to make internal functions static in the preboot code. 
However, in other decompressors this hasn't been done.  On x86-64, this
makes no difference to the kernel image size.

Omit XZ_EXTERN and let some of the internal functions be extern in the
preboot code.  Omitting XZ_EXTERN from include/linux/xz.h fixes warnings
in "make htmldocs" and makes the intradocument links to xz_dec functions
work in Documentation/staging/xz.rst.  The alternative would have been to
add "XZ_EXTERN" to c_id_attributes in Documentation/conf.py but omitting
XZ_EXTERN seemed cleaner.

Link: https://lore.kernel.org/lkml/20240723205437.3c0664b0@kaneli/
Link: https://lkml.kernel.org/r/20240724110544.16430-1-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:27 -07:00
Lasse Collin
7472ff8ada xz: adjust arch-specific options for better kernel compression
Use LZMA2 options that match the arch-specific alignment of instructions. 
This change reduces compressed kernel size 0-2 % depending on the arch. 
On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs
it helps the most.

Use the ARM-Thumb filter for ARM-Thumb2 kernels.  This reduces compressed
kernel size about 5 %.[1] Previously such kernels were compressed using
the ARM filter which didn't do anything useful with ARM-Thumb2 code.

Add BCJ filter support for ARM64 and RISC-V.  Compared to unfiltered XZ or
plain LZMA, the compressed kernel size is reduced about 5 % on ARM64 and 7
% on RISC-V.  A new enough version of the xz tool is required: 5.4.0 for
ARM64 and 5.6.0 for RISC-V.  With an old xz version, a message is printed
to standard error and the kernel is compressed without the filter.

Update lib/decompress_unxz.c to match the changes to xz_wrap.sh.

Update the CONFIG_KERNEL_XZ help text in init/Kconfig:
  - Add the RISC-V and ARM64 filters.
  - Clarify that the PowerPC filter is for big endian only.
  - Omit IA-64.

Link: https://lore.kernel.org/lkml/1637379771-39449-1-git-send-email-zhongjubin@huawei.com/ [1]
Link: https://lkml.kernel.org/r/20240721133633.47721-15-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:27 -07:00
Lasse Collin
93d09773d1 xz: add RISC-V BCJ filter
A later commit updates lib/decompress_unxz.c to enable this filter for
kernel decompression.  lib/decompress_unxz.c is already used if
CONFIG_EFI_ZBOOT=y && CONFIG_KERNEL_XZ=y.

This filter can be used by Squashfs without modifications to the Squashfs
kernel code (only needs support in userspace Squashfs-tools).

Link: https://lkml.kernel.org/r/20240721133633.47721-13-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:26 -07:00
Lasse Collin
4b62813f5e xz: Add ARM64 BCJ filter
Also omit a duplicated check for XZ_DEC_ARM in xz_private.h.

A later commit updates lib/decompress_unxz.c to enable this filter for
kernel decompression.  lib/decompress_unxz.c is already used if
CONFIG_EFI_ZBOOT=y && CONFIG_KERNEL_XZ=y.

This filter can be used by Squashfs without modifications to the Squashfs
kernel code (only needs support in userspace Squashfs-tools).

Link: https://lkml.kernel.org/r/20240721133633.47721-12-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:26 -07:00
Lasse Collin
bdfc041171 xz: optimize for-loop conditions in the BCJ decoders
Compilers cannot optimize the addition "i + 4" away since theoretically it
could overflow.

Link: https://lkml.kernel.org/r/20240721133633.47721-11-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:26 -07:00
Lasse Collin
2ee96abef2 xz: cleanup CRC32 edits from 2018
In 2018, a dependency on <linux/crc32poly.h> was added to avoid
duplicating the same constant in multiple files.  Two months later it was
found to be a bad idea and the definition of CRC32_POLY_LE macro was moved
into xz_private.h to avoid including <linux/crc32poly.h>.

xz_private.h is a wrong place for it too.  Revert back to the upstream
version which has the poly in xz_crc32_init() in xz_crc32.c.

Link: https://lkml.kernel.org/r/20240721133633.47721-10-lasse.collin@tukaani.org
Fixes: faa16bc404 ("lib: Use existing define with polynomial")
Fixes: 242cdad873 ("lib/xz: Put CRC32_POLY_LE in xz_private.h")
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:25 -07:00
Lasse Collin
ff221153aa xz: fix comments and coding style
- Fix comments that were no longer in sync with the code below them.
- Fix language errors.
- Fix coding style.

Link: https://lkml.kernel.org/r/20240721133633.47721-5-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:24 -07:00
Lasse Collin
836d13a6ef xz: switch from public domain to BSD Zero Clause License (0BSD)
Remove the public domain notices and add SPDX license identifiers.

Change MODULE_LICENSE from "GPL" to "Dual BSD/GPL" because 0BSD should
count as a BSD license variant here.

The switch to 0BSD was done in the upstream XZ Embedded project because
public domain has (real or perceived) legal issues in some jurisdictions.

Link: https://lkml.kernel.org/r/20240721133633.47721-4-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Sam James <sam@gentoo.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:24 -07:00
Andrey Konovalov
e24f4de8a7 kcov: don't instrument lib/find_bit.c
This file produces large amounts of flaky coverage not useful for the
KCOV's intended use case (guiding the fuzzing process).

Link: https://lkml.kernel.org/r/20240722223726.194658-1-andrey.konovalov@linux.dev
Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:23 -07:00
Jeff Johnson
053a5e4cbb lib: test_objpool: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_objpool.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240715-md-lib-test_objpool-v2-1-5a2b9369c37e@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Matt Wu <wuqiang.matt@bytedance.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:23 -07:00
Nicolas Pitre
1635e62e75 mul_u64_u64_div_u64: basic sanity test
Verify that edge cases produce proper results, and some more.

[npitre@baylibre.com: avoid undefined shift value]
  Link: https://lkml.kernel.org/r/7rrs9pn1-n266-3013-9q6n-1osp8r8s0rrn@syhkavp.arg
Link: https://lkml.kernel.org/r/20240707190648.1982714-3-nico@fluxnic.net
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:22 -07:00
Nicolas Pitre
b29a62d87c mul_u64_u64_div_u64: make it precise always
Patch series "mul_u64_u64_div_u64: new implementation", v3.

This provides an implementation for mul_u64_u64_div_u64() that always
produces exact results.


This patch (of 2):

Library facilities must always return exact results.  If the caller may be
contented with approximations then it should do the approximation on its
own.

In this particular case the comment in the code says "the algorithm
... below might lose some precision". Well, if you try it with e.g.:

	a = 18446462598732840960
	b = 18446462598732840960
	c = 18446462598732840961

then the produced answer is 0 whereas the exact answer should be
18446462598732840959.  This is _some_ precision lost indeed!

Let's reimplement this function so it always produces the exact result
regardless of its inputs while preserving existing fast paths when
possible.

Uwe said:

: My personal interest is to get the calculations in pwm drivers right. 
: This function is used in several drivers below drivers/pwm/ .  With the
: errors in mul_u64_u64_div_u64(), pwm consumers might not get the
: settings they request.  Although I have to admit that I'm not aware it
: breaks real use cases (because typically the periods used are too short
: to make the involved multiplications overflow), but I pretty sure am
: not aware of all usages and it breaks testing.
: 
: Another justification is commits like
: https://git.kernel.org/tip/77baa5bafcbe1b2a15ef9c37232c21279c95481c,
: where people start to work around the precision shortcomings of
: mul_u64_u64_div_u64().

Link: https://lkml.kernel.org/r/20240707190648.1982714-1-nico@fluxnic.net
Link: https://lkml.kernel.org/r/20240707190648.1982714-2-nico@fluxnic.net
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Tested-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:22 -07:00
Sidhartha Kumar
ed4dfd9aa1 maple_tree: make write helper functions void
The return value of various write helper functions are not checked. We
can safely change the return type of these functions to be void.

Link: https://lkml.kernel.org/r/20240814161944.55347-18-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:18 -07:00
Sidhartha Kumar
c27e6183c6 maple_tree: remove unneeded mas_wr_walk() in mas_store_prealloc()
Users of mas_store_prealloc() enter this function with nodes already
preallocated. This means the store type must be already set. We can then
remove the call to mas_wr_store_type() and initialize the write state to
continue the partial walk that was done when determining the store type.

Link: https://lkml.kernel.org/r/20240814161944.55347-17-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:18 -07:00
Sidhartha Kumar
add60ea5f6 maple_tree: remove repeated sanity checks from write helper functions
These sanity checks are now redundant as they are already checked in
mas_wr_store_type(). We can remove them from mas_wr_append() and
mas_wr_node_store().

Link: https://lkml.kernel.org/r/20240814161944.55347-16-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
9155e84334 maple_tree: remove node allocations from various write helper functions
These write helper functions are all called from store paths which
preallocate enough nodes that will be needed for the write. There is no
more need to allocate within the functions themselves.

Link: https://lkml.kernel.org/r/20240814161944.55347-15-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
4037d44f54 maple_tree: have mas_store() allocate nodes if needed
Not all users of mas_store() enter with nodes already preallocated.
Check for the MA_STATE_PREALLOC flag to decide whether to preallocate nodes
within mas_store() rather than relying on future write helper functions
to perform the allocations. This allows the write helper functions to be
simplified as they do not have to do checks to make sure there are
enough allocated nodes to perform the write.

Link: https://lkml.kernel.org/r/20240814161944.55347-14-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
7987d02779 maple_tree: remove mas_wr_modify()
There are no more users of the function, safely remove it.

Link: https://lkml.kernel.org/r/20240814161944.55347-13-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:17 -07:00
Sidhartha Kumar
62c7b2b984 maple_tree: simplify mas_commit_b_node()
The only callers of mas_commit_b_node() are those with store type of
wr_rebalance and wr_split_store. Use mas->store_type to dispatch to the
correct helper function. This allows the removal of mas_reuse_node() as
it is no longer used.

Link: https://lkml.kernel.org/r/20240814161944.55347-12-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
1fd7c4f322 maple_tree: convert mas_insert() to preallocate nodes
By setting the store type in mas_insert(), we no longer need to use
mas_wr_modify() to determine the correct store function to use. Instead,
set the store type and call mas_wr_store_entry(). Also, pass in the
requested gfp flags to mas_insert() so they can be passed to the call to
mas_wr_preallocate().

Link: https://lkml.kernel.org/r/20240814161944.55347-11-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
580fcbd67c maple_tree: use store type in mas_wr_store_entry()
When storing an entry, we can read the store type that was set from a
previous partial walk of the tree. Now that the type of store is known,
select the correct write helper function to use to complete the store.

Also noinline mas_wr_spanning_store() to limit stack frame usage in
mas_wr_store_entry() as it allocates a maple_big_node on the stack.

Link: https://lkml.kernel.org/r/20240814161944.55347-10-sidhartha.kumar@oracle.com
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
23e217a848 maple_tree: print store type in mas_dump()
Knowing the store type of the maple state could be helpful for debugging.
Have mas_dump() print mas->store_type.

Link: https://lkml.kernel.org/r/20240814161944.55347-9-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:16 -07:00
Sidhartha Kumar
85db8f2417 maple_tree: use mas_store_gfp() in mtree_store_range()
Refactor mtree_store_range() to use mas_store_gfp() which will abstract
the store, memory allocation, and error handling.

Link: https://lkml.kernel.org/r/20240814161944.55347-8-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
7e093834ed maple_tree: preallocate nodes in mas_erase()
Use mas_wr_preallocate() in mas_erase() to preallocate enough nodes to
complete the erase.  Add error handling by skipping the store if the
preallocation lead to some error besides no memory.

Link: https://lkml.kernel.org/r/20240814161944.55347-7-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
3cd9e92e00 maple_tree: remove mas_destroy() from mas_nomem()
Separate call to mas_destroy() from mas_nomem() so we can check for no
memory errors without destroying the current maple state in
mas_store_gfp().  We then add calls to mas_destroy() to callers of
mas_nomem().

Link: https://lkml.kernel.org/r/20240814161944.55347-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
5d659bbb52 maple_tree: introduce mas_wr_store_type()
Introduce mas_wr_store_type() which will set the correct store type based
on a walk of the tree.  In mas_wr_node_store() the <= min_slots condition
is changed to < as if new_end is = to mt_min_slots then there is not
enough room.

mas_prealloc_calc() is also introduced to abstract the calculation used to
determine the number of nodes needed for a store operation.

In this change a call to mas_reset() is removed in the error case of
mas_prealloc().  This is only needed in the MA_STATE_REBALANCE case of
mas_destroy().  We can move the call to mas_reset() directly to
mas_destroy().

Also, add a test case to validate the order that we check the store type
in is correct.  This test models a vma expanding and then shrinking which
is part of the boot process.

Link: https://lkml.kernel.org/r/20240814161944.55347-5-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:15 -07:00
Sidhartha Kumar
3cc6f42a53 maple_tree: move up mas_wr_store_setup() and mas_wr_prealloc_setup()
Subsequent patches require these definitions to be higher, no functional
changes intended.

Link: https://lkml.kernel.org/r/20240814161944.55347-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:14 -07:00
Sidhartha Kumar
19138a2cc1 maple_tree: introduce mas_wr_prealloc_setup()
Introduce a helper function, mas_wr_prealoc_setup(), that will set up a
maple write state in order to start a walk of a maple tree.

Link: https://lkml.kernel.org/r/20240814161944.55347-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:14 -07:00
Wei Yang
c64d66153b maple_tree: fix comment typo with corresponding maple_status
In comment of function mas_start(), we list the return value of different
cases.  According to the comment context, tell the maple_status here is
more consistent with others.

Let's correct it with ma_active in the case it's a tree.

Link: https://lkml.kernel.org/r/20240812150925.31551-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Wei Yang
7a0529d0c2 maple_tree: fix comment typo of ma_root
In comment of mas_start(), we lists the return value for different cases. 
In case of a single entry, we set mas->status to ma_root, while the
comment uses mas_root, which is not a maple_status.

Fix the typo according to the code.

Link: https://lkml.kernel.org/r/20240812150925.31551-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Sidhartha Kumar
617f8e4d76 maple_tree: add test to replicate low memory race conditions
Add new callback fields to the userspace implementation of struct
kmem_cache.  This allows for executing callback functions in order to
further test low memory scenarios where node allocation is retried.

This callback can help test race conditions by calling a function when a
low memory event is tested.

Link: https://lkml.kernel.org/r/20240812190543.71967-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Sidhartha Kumar
e1b8b883bb maple_tree: reset mas->index and mas->last on write retries
The following scenario can result in a race condition:

Consider a node with the following indices and values

	a<------->b<----------->c<--------->d
	    0xA        NULL          0xB

	CPU 1			  CPU 2
      ---------        		---------
	mas_set_range(a,b)
	mas_erase()
		-> range is expanded (a,c) because of null expansion

	mas_nomem()
	mas_unlock()
				mas_store_range(b,c,0xC)

The node now looks like:

	a<------->b<----------->c<--------->d
	    0xA        0xC          0xB

	mas_lock()
	mas_erase() <------ range of erase is still (a,c)

The node is now NULL from (a,c) but the write from CPU 2 should have been
retained and range (b,c) should still have 0xC as its value.  We can fix
this by re-intializing to the original index and last.  This does not need
a cc: Stable as there are no users of the maple tree which use internal
locking and this condition is only possible with internal locking.

Link: https://lkml.kernel.org/r/20240812190543.71967-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:26:11 -07:00
Thorsten Blum
592c9330e3 lib: test_hmm: use min() to improve dmirror_exclusive()
Use min() to simplify the dmirror_exclusive() function and improve its
readability.

Link: https://lkml.kernel.org/r/20240726131245.161695-1-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:25:52 -07:00
Danilo Krummrich
590b9d576c mm: kvmalloc: align kvrealloc() with krealloc()
Besides the obvious (and desired) difference between krealloc() and
kvrealloc(), there is some inconsistency in their function signatures and
behavior:

 - krealloc() frees the memory when the requested size is zero, whereas
   kvrealloc() simply returns a pointer to the existing allocation.

 - krealloc() behaves like kmalloc() if a NULL pointer is passed, whereas
   kvrealloc() does not accept a NULL pointer at all and, if passed,
   would fault instead.

 - krealloc() is self-contained, whereas kvrealloc() relies on the caller
   to provide the size of the previous allocation.

Inconsistent behavior throughout allocation APIs is error prone, hence
make kvrealloc() behave like krealloc(), which seems superior in all
mentioned aspects.

Besides that, implementing kvrealloc() by making use of krealloc() and
vrealloc() provides oppertunities to grow (and shrink) allocations more
efficiently.  For instance, vrealloc() can be optimized to allocate and
map additional pages to grow the allocation or unmap and free unused pages
to shrink the allocation.

[dakr@kernel.org: document concurrency restrictions]
  Link: https://lkml.kernel.org/r/20240725125442.4957-1-dakr@kernel.org
[dakr@kernel.org: disable KASAN when switching to vmalloc]
  Link: https://lkml.kernel.org/r/20240730185049.6244-2-dakr@kernel.org
[dakr@kernel.org: properly document __GFP_ZERO behavior]
  Link: https://lkml.kernel.org/r/20240730185049.6244-5-dakr@kernel.org
Link: https://lkml.kernel.org/r/20240722163111.4766-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Chandan Babu R <chandan.babu@oracle.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:25:44 -07:00
Suren Baghdasaryan
052a45c1cb alloc_tag: fix allocation tag reporting when CONFIG_MODULES=n
codetag_module_init() is used to initialize sections containing allocation
tags.  This function is used to initialize module sections as well as core
kernel sections, in which case the module parameter is set to NULL.  This
function has to be called even when CONFIG_MODULES=n to initialize core
kernel allocation tag sections.  When CONFIG_MODULES=n, this function is a
NOP, which is wrong.  This leads to /proc/allocinfo reported as empty. 
Fix this by making it independent of CONFIG_MODULES.

Link: https://lkml.kernel.org/r/20240828231536.1770519-1-surenb@google.com
Fixes: 916cc5167c ("lib: code tagging framework")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[6.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 17:59:03 -07:00
Liam R. Howlett
f806de88d8 maple_tree: remove rcu_read_lock() from mt_validate()
The write lock should be held when validating the tree to avoid updates
racing with checks.  Holding the rcu read lock during a large tree
validation may also cause a prolonged rcu read window and "rcu_preempt
detected stalls" warnings.

Link: https://lore.kernel.org/all/0000000000001d12d4062005aea1@google.com/
Link: https://lkml.kernel.org/r/20240820175417.2782532-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reported-by: syzbot+036af2f0c7338a33b0cd@syzkaller.appspotmail.com
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 17:59:01 -07:00
Linus Torvalds
d5d547aa7b Random number generator fixes for Linux 6.11-rc6.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmbPwucACgkQSfxwEqXe
 A653nRAA0pk0iDH9iz/DLXVy5e4WWE1WQyCdT4jB5H2SItG3fz4kcKz0x1qcPEtA
 RUhO4bZLTeFE/QkAQROA41x0ysAbg2dnIefO6CzFhndKGDyOEfUKYAsb65HiYj8Z
 HI9XGRYWc8kD35BGDtqGrgbgDgSVS3JPASC8mPJKv608h9f1M1ABqtyuft8bxz57
 2OxuXoxVVN4ZI0VyQqqhT1roEiCIuuDaSZlPUws2PjnLxcqIQXXXPMLgN2vi9QzG
 cCslhtJMxBAhQ/skAVbxQlI6S2OB0zGROE78k2PK7eqGZuBAex9G0kuWH9Rl3RQL
 NmYjITWPZts7LRxCcvUQzxcKYsGb08mvCMCu+AAS9QfI1rOQu/TS7+4IfRHnHyg0
 J7OBN0aPqKfciAch5NCfxN5EMUAlwXdro2/salONdGNF7do9mdjt/LqUzhbSKBPi
 kpVWBkLHzl0obPR1F/BBfC2oRW7Us5ShjaLod9J1DcJps/GTr7MXir8lEnPxwypJ
 5t4F8Y4M34MpxmVZ/k2oNsEGhugpicaTAqa5KO4vqtWDPk1TNHi2POxU1Fjnth5K
 ds/NxoRvXV/2K5V+XiJQnngt5pgRjqU5DgCh19Bq1W7PqqbGkVWmzIa+zfYm9sCH
 +RuZiyjM16RyN/tDAxhfKowBqsagW6/DM7LJe3fWJO7yCem/S5g=
 =a3c1
 -----END PGP SIGNATURE-----

Merge tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator fix from Jason Donenfeld:
 "Reject invalid flags passed to vgetrandom() in the same way that
  getrandom() does, so that the behavior is the same, from Yann.

  The flags argument to getrandom() only has a behavioral effect on the
  function if the RNG isn't initialized yet, so vgetrandom() falls back
  to the syscall in that case. But if the RNG is initialized, all of the
  flags behave the same way, so vgetrandom() didn't bother checking
  them, and just ignored them entirely.

  But that doesn't account for invalid flags passed in, which need to be
  rejected so we can use them later"

* tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: vDSO: reject unknown getrandom() flags
2024-08-29 13:59:18 +12:00
Anshuman Khandual
d7bcc37436 lib/test_bits.c: Add tests for GENMASK_U128()
This adds GENMASK_U128() tests although currently only 64 bit wide masks
are being tested.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-08-28 06:54:39 -07:00
Vlastimil Babka
4e1c44b3db kunit, slub: add test_kfree_rcu() and test_leak_destroy()
Add a test that will create cache, allocate one object, kfree_rcu() it
and attempt to destroy it. As long as the usage of kvfree_rcu_barrier()
in kmem_cache_destroy() works correctly, there should be no warnings in
dmesg and the test should pass.

Additionally add a test_leak_destroy() test that leaks an object on
purpose and verifies that kmem_cache_destroy() catches it.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-08-27 14:12:51 +02:00
David Gow
f2c6dbd220 kunit: Device wrappers should also manage driver name
kunit_driver_create() accepts a name for the driver, but does not copy
it, so if that name is either on the stack, or otherwise freed, we end
up with a use-after-free when the driver is cleaned up.

Instead, strdup() the name, and manage it as another KUnit allocation.
As there was no existing kunit_kstrdup(), we add one. Further, add a
kunit_ variant of strdup_const() and kfree_const(), so we don't need to
allocate and manage the string in the majority of cases where it's a
constant.

However, these are inline functions, and is_kernel_rodata() only works
for built-in code. This causes problems in two cases:
- If kunit is built as a module, __{start,end}_rodata is not defined.
- If a kunit test using these functions is built as a module, it will
  suffer the same fate.

This fixes a KASAN splat with overflow.overflow_allocation_test, when
built as a module.

Restrict the is_kernel_rodata() case to when KUnit is built as a module,
which fixes the first case, at the cost of losing the optimisation.

Also, make kunit_{kstrdup,kfree}_const non-inline, so that other modules
using them will not accidentally depend on is_kernel_rodata(). If KUnit
is built-in, they'll benefit from the optimisation, if KUnit is not,
they won't, but the string will be properly duplicated.

Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reported-by: Nico Pache <npache@redhat.com>
Closes: https://groups.google.com/g/kunit-dev/c/81V9b9QYON0
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-08-26 07:03:46 -06:00
Yann Droneaud
28f5df210d random: vDSO: reject unknown getrandom() flags
Like the getrandom() syscall, vDSO getrandom() must also reject unknown
flags. [1]

It would be possible to return -EINVAL from vDSO itself, but in the
possible case that a new flag is added to getrandom() syscall in the
future, it would be easier to get the behavior from the syscall, instead
of erroring until the vDSO is extended to support the new flag or
explicitly falling back.

[1] Designing the API: Planning for Extension
    https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension

Signed-off-by: Yann Droneaud <yann@droneaud.fr>
[Jason: reworded commit message]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-08-26 09:58:52 +02:00
Caleb Sander Mateos
e68ac2b488 softirq: Remove unused 'action' parameter from action callback
When soft interrupt actions are called, they are passed a pointer to the
struct softirq action which contains the action's function pointer.

This pointer isn't useful, as the action callback already knows what
function it is. And since each callback handles a specific soft interrupt,
the callback also knows which soft interrupt number is running.

No soft interrupt action callback actually uses this parameter, so remove
it from the function pointer signature. This clarifies that soft interrupt
actions are global routines and makes it slightly cheaper to call them.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/20240815171549.3260003-1-csander@purestorage.com
2024-08-20 17:13:40 +02:00
Linus Torvalds
2865baf540 x86: support user address masking instead of non-speculative conditional
The Spectre-v1 mitigations made "access_ok()" much more expensive, since
it has to serialize execution with the test for a valid user address.

All the normal user copy routines avoid this by just masking the user
address with a data-dependent mask instead, but the fast
"unsafe_user_read()" kind of patterms that were supposed to be a fast
case got slowed down.

This introduces a notion of using

	src = masked_user_access_begin(src);

to do the user address sanity using a data-dependent mask instead of the
more traditional conditional

	if (user_read_access_begin(src, len)) {

model.

This model only works for dense accesses that start at 'src' and on
architectures that have a guard region that is guaranteed to fault in
between the user space and the kernel space area.

With this, the user access doesn't need to be manually checked, because
a bad address is guaranteed to fault (by some architecture masking
trick: on x86-64 this involves just turning an invalid user address into
all ones, since we don't map the top of address space).

This only converts a couple of examples for now.  Example x86-64 code
generation for loading two words from user space:

        stac
        mov    %rax,%rcx
        sar    $0x3f,%rcx
        or     %rax,%rcx
        mov    (%rcx),%r13
        mov    0x8(%rcx),%r14
        clac

where all the error handling and -EFAULT is now purely handled out of
line by the exception path.

Of course, if the micro-architecture does badly at 'clac' and 'stac',
the above is still pitifully slow.  But at least we did as well as we
could.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-19 11:31:18 -07:00
Linus Torvalds
b718175853 bcachefs fixes for 6.11-rc4
- New on disk format version, bcachefs_metadata_version_disk_accounting_inum
 
 This adds one more disk accounting counter, which counts disk usage and
 number of extents per inode number. This lets us track fragmentation,
 for implementing defragmentation later, and it also counts disk usage
 per inode in all snapshots, which will be a useful thing to expose to
 users.
 
 - One performance issue we've observed is threads spinning when they
 should be waiting for dirty keys in the key cache to be flushed by
 journal reclaim, so we now have hysteresis for the waiting thread, as
 well as improving the tracepoint and a new time_stat, for tracking time
 blocked waiting on key cache flushing.
 
 And, various assorted smaller fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAma/9QkACgkQE6szbY3K
 bnYcBw/+LBSZ415gWSjPktdecf5rc6K4KxETxAxV0f0KesYzxqAtQzN0SCDvKt65
 3aALU03wM8vWITiLS38/ckT+j6S2BpXcOxdu/OC0nRYQEUg9ZLvqEG5lQ3a/LliV
 Q64N33qsSr6QaKszFllLYcN4tGduKg8HoMlHn6+vJ7HNPjdfv0HHERSUsc7K84/w
 jkRtDE2NxsRJZKMEvIFp8hd5KXUR5zyBz/kc4P0WliLXpSyJLITzhKw1JV7ikKVD
 0mO2bJ/0i7wPIabAD2HJahvbC7fl+2fkYFxUJ2XnvMTgU/+QyeGHEufbcbVrVSp0
 BpzBTmSMFbGXBkbQBruFX5rJetzXeBqdYf0Yfavd4KDhGvYlSfDZQUapXT1QKC2q
 aHSB/s+2r7Crr/MBJyjbeFgXFTNGvI5yerlbdp2yj1kxjYJHHaKrp6h7n6XXk21W
 /mGF5tkIMkFTv98rQnIaky4neJzOPsLTTgxeR8zEudCgMaVUqEcaMdIFvARDjY/3
 n52VR0zl3olV3vu7LgHaHfgH6lfaMV0sHPaGNYGL0YL+bCJD+lYM8a6l9aaks8vk
 md7+mFcOS4FUdDdS8MEKIN/k/gkEOC/EpmI864i9rIl0SiNXNy7FPTDKON8b+Ury
 5omBMUQMEe9Q/pgKGXfpJWFynhSPEVf4y1DIOsrXk/jeBqenFyo=
 =BPGT
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent OverstreetL

 - New on disk format version, bcachefs_metadata_version_disk_accounting_inum

   This adds one more disk accounting counter, which counts disk usage
   and number of extents per inode number. This lets us track
   fragmentation, for implementing defragmentation later, and it also
   counts disk usage per inode in all snapshots, which will be a useful
   thing to expose to users.

 - One performance issue we've observed is threads spinning when they
   should be waiting for dirty keys in the key cache to be flushed by
   journal reclaim, so we now have hysteresis for the waiting thread, as
   well as improving the tracepoint and a new time_stat, for tracking
   time blocked waiting on key cache flushing.

... and various assorted smaller fixes.

* tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs:
  bcachefs: Fix locking in __bch2_trans_mark_dev_sb()
  bcachefs: fix incorrect i_state usage
  bcachefs: avoid overflowing LRU_TIME_BITS for cached data lru
  bcachefs: Fix forgetting to pass trans to fsck_err()
  bcachefs: Increase size of cuckoo hash table on too many rehashes
  bcachefs: bcachefs_metadata_version_disk_accounting_inum
  bcachefs: Kill __bch2_accounting_mem_mod()
  bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
  bcachefs: Fix warning in __bch2_fsck_err() for trans not passed in
  bcachefs: Add a time_stat for blocked on key cache flush
  bcachefs: Improve trans_blocked_journal_reclaim tracepoint
  bcachefs: Add hysteresis to waiting on btree key cache flush
  lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
  bcachefs: Convert for_each_btree_node() to lockrestart_do()
  bcachefs: Add missing downgrade table entry
  bcachefs: disk accounting: ignore unknown types
  bcachefs: bch2_accounting_invalid() fixup
  bcachefs: Fix bch2_trigger_alloc when upgrading from old versions
  bcachefs: delete faulty fastpath in bch2_btree_path_traverse_cached()
2024-08-17 09:46:10 -07:00
Herbert Xu
8e3a67f2de crypto: lib/mpi - Add error checks to extension
The remaining functions added by commit
a8ea8bdd9d did not check for memory
allocation errors.  Add the checks and change the API to allow errors
to be returned.

Fixes: a8ea8bdd9d ("lib/mpi: Extend the MPI library")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-17 13:55:50 +08:00
Herbert Xu
fca5cb4dd2 Revert "lib/mpi: Extend the MPI library"
This partially reverts commit a8ea8bdd9d.

Most of it is no longer needed since sm2 has been removed.  However,
the following functions have been kept as they have developed other
uses:

mpi_copy

mpi_mod

mpi_test_bit
mpi_set_bit
mpi_rshift

mpi_add
mpi_sub
mpi_addm
mpi_subm

mpi_mul
mpi_mulm

mpi_tdiv_r
mpi_fdiv_r

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-17 13:55:50 +08:00
Justin Stitt
bbf3c7ff9d lib/string_helpers: rework overflow-dependent code
When @size is 0, the desired behavior is to allow unlimited bytes to be
parsed. Currently, this relies on some intentional arithmetic overflow
where --size gives us SIZE_MAX when size is 0.

Explicitly spell out the desired behavior without relying on intentional
overflow/underflow.

Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20240808-b4-string_helpers_caa133-v1-1-686a455167c4@google.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Masahiro Yamada
9c6b7fbbd7 fortify: use if_changed_dep to record header dependency in *.cmd files
After building with CONFIG_FORTIFY_SOURCE=y, many .*.d files are left
in lib/test_fortify/ because the compiler outputs header dependencies
into *.d without fixdep being invoked.

When compiling C files, if_changed_dep should be used so that the
auto-generated header dependencies are recorded in .*.cmd files.

Currently, if_changed is incorrectly used, and only two headers are
hard-coded in lib/Makefile.

In the previous patch version, the kbuild test robot detected new errors
on GCC 7.

GCC 7 or older does not produce test.d with the following test code:

 $ echo 'void b(void) __attribute__((__error__(""))); void a(void) { b(); }' |
   gcc -Wp,-MMD,test.d -c -o /dev/null -x c -

Perhaps, this was a bug that existed in older GCC versions.

Skip the tests for GCC<=7 for now, as this will be eventually solved
when we bump the minimal supported GCC version.

Link: https://lore.kernel.org/oe-kbuild-all/CAK7LNARmJcyyzL-jVJfBPi3W684LTDmuhMf1koF0TXoCpKTmcw@mail.gmail.com/T/#m13771bf78ae21adff22efc4d310c973fb4bcaf67
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240727150302.1823750-4-masahiroy@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Masahiro Yamada
5a8d0c46c9 fortify: move test_fortify.sh to lib/test_fortify/
This script is only used in lib/test_fortify/.

There is no reason to keep it in scripts/.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240727150302.1823750-3-masahiroy@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Masahiro Yamada
4e9903b086 fortify: refactor test_fortify Makefile to fix some build problems
There are some issues in the test_fortify Makefile code.

Problem 1: cc-disable-warning invokes compiler dozens of times

To see how many times the cc-disable-warning is evaluated, change
this code:

  $(call cc-disable-warning,fortify-source)

to:

  $(call cc-disable-warning,$(shell touch /tmp/fortify-$$$$)fortify-source)

Then, build the kernel with CONFIG_FORTIFY_SOURCE=y. You will see a
large number of '/tmp/fortify-<PID>' files created:

  $ ls -1 /tmp/fortify-* | wc
       80      80    1600

This means the compiler was invoked 80 times just for checking the
-Wno-fortify-source flag support.

$(call cc-disable-warning,fortify-source) should be added to a simple
variable instead of a recursive variable.

Problem 2: do not recompile string.o when the test code is updated

The test cases are independent of the kernel. However, when the test
code is updated, $(obj)/string.o is rebuilt and vmlinux is relinked
due to this dependency:

  $(obj)/string.o: $(obj)/$(TEST_FORTIFY_LOG)

always-y is suitable for building the log files.

Problem 3: redundant code

  clean-files += $(addsuffix .o, $(TEST_FORTIFY_LOGS))

... is unneeded because the top Makefile globally cleans *.o files.

This commit fixes these issues and makes the code readable.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240727150302.1823750-2-masahiroy@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:26:02 -07:00
Ivan Orlov
92e9bac181 kunit/overflow: Fix UB in overflow_allocation_test
The 'device_name' array doesn't exist out of the
'overflow_allocation_test' function scope. However, it is being used as
a driver name when calling 'kunit_driver_create' from
'kunit_device_register'. It produces the kernel panic with KASAN
enabled.

Since this variable is used in one place only, remove it and pass the
device name into kunit_device_register directly as an ascii string.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/r/20240815000431.401869-1-ivan.orlov0322@gmail.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15 09:24:55 -07:00
Paul E. McKenney
ac9d45544c locking/csd_lock: Provide an indication of ongoing CSD-lock stall
If a CSD-lock stall goes on long enough, it will cause an RCU CPU
stall warning.  This additional warning provides much additional
console-log traffic and little additional information.  Therefore,
provide a new csd_lock_is_stuck() function that returns true if there
is an ongoing CSD-lock stall.  This function will be used by the RCU
CPU stall warnings to provide a one-line indication of the stall when
this function returns true.

[ neeraj.upadhyay: Apply Rik van Riel feedback. ]
[ neeraj.upadhyay: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leonardo Bras <leobras@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-15 00:05:39 +05:30
Kent Overstreet
b2f11c6f3e lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
If we need to increase the tree depth, allocate a new node, and then
race with another thread that increased the tree depth before us, we'll
still have a preallocated node that might be used later.

If we then use that node for a new non-root node, it'll still have a
pointer to the old root instead of being zeroed - fix this by zeroing it
in the cmpxchg failure path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13 22:56:50 -04:00
Herbert Xu
da4fe6815a Revert "lib/mpi: Introduce ec implementation to MPI library"
This reverts commit d58bb7e55a.

It's no longer needed since sm2 has been removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-10 12:25:34 +08:00
Dmitry Vyukov
6cd0dd934b kcov: Add interrupt handling self test
Add a boot self test that can catch sprious coverage from interrupts.
The coverage callback filters out interrupt code, but only after the
handler updates preempt count. Some code periodically leaks out
of that section and leads to spurious coverage.
Add a best-effort (but simple) test that is likely to catch such bugs.
If the test is enabled on CI systems that use KCOV, they should catch
any issues fast.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/all/7662127c97e29da1a748ad1c1539dd7b65b737b2.1718092070.git.dvyukov@google.com
2024-08-08 17:36:35 +02:00
Gatlin Newhouse
7424fc6b86 x86/traps: Enable UBSAN traps on x86
Currently ARM64 extracts which specific sanitizer has caused a trap via
encoded data in the trap instruction. Clang on x86 currently encodes the
same data in the UD1 instruction but x86 handle_bug() and
is_valid_bugaddr() currently only look at UD2.

Bring x86 to parity with ARM64, similar to commit 25b84002af ("arm64:
Support Clang UBSAN trap codes for better reporting"). See the llvm
links for information about the code generation.

Enable the reporting of UBSAN sanitizer details on x86 compiled with clang
when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate
which is encoded by the compiler after the UD1.

[ tglx: Simplified it by moving the printk() into handle_bug() ]

Signed-off-by: Gatlin Newhouse <gatlin.newhouse@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/all/20240724000206.451425-1-gatlin.newhouse@gmail.com
Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f
Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27
2024-08-06 13:42:40 +02:00
Xavier
93c8332c83 Union-Find: add a new module in kernel library
This patch implements a union-find data structure in the kernel library,
which includes operations for allocating nodes, freeing nodes,
finding the root of a node, and merging two nodes.

Signed-off-by: Xavier <xavier_qy@163.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 13:04:36 -10:00
Tejun Heo
c8faf11cd1 Linux 6.11-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmamtfseHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC20H/j6G3+7gYGDtSsl9
 5eH7UFzk18JeIG4c9Z5q9p2YVqdTggHOyWUA0qYBJWLyjpQa0q5SO+Qf2VwH8bH7
 NpHZQYIdRB6dy/MySZII/6KdOJobz779P8EOPVdPs6PaAmiwOwzdK4aHxhi3iQJv
 8QHmswjnT6t44p7WX1gZCUL2R3TL5hyA505BfPBz5OPBLkuuTArCBO8mZfTvk3R6
 fskKrVBC3oEb9Vgx/bycah9wTJn4ptPUGggaTnbu44RkhZcHfMiciqOrtMtYtqKx
 fmGQllbVQ8CHp4IBZ5nYfUB4E04Zg+XqNeYHa0T9R97e7crZ5iMKutujydmnhqA0
 r3Ca53w=
 =R3sl
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-rc1' into for-6.12

Linux 6.11-rc1
2024-07-30 09:30:11 -10:00
Stephen Boyd
5ac7973032 platform: Add test managed platform_device/driver APIs
Introduce KUnit resource wrappers around platform_driver_register(),
platform_device_alloc(), and platform_device_add() so that test authors
can register platform drivers/devices from their tests and have the
drivers/devices automatically be unregistered when the test is done.

This makes test setup code simpler when a platform driver or platform
device is needed. Add a few test cases at the same time to make sure the
APIs work as intended.

Cc: Brendan Higgins <brendan.higgins@linux.dev>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20240718210513.3801024-6-sboyd@kernel.org
2024-07-29 15:33:12 -07:00
Linus Torvalds
cb04e8b1d2 minmax: don't use max() in situations that want a C constant expression
We only had a couple of array[] declarations, and changing them to just
use 'MAX()' instead of 'max()' fixes the issue.

This will allow us to simplify our min/max macros enormously, since they
can now unconditionally use temporary variables to avoid using the
argument values multiple times.

Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 20:23:27 -07:00
Linus Torvalds
1a251f52cf minmax: make generic MIN() and MAX() macros available everywhere
This just standardizes the use of MIN() and MAX() macros, with the very
traditional semantics.  The goal is to use these for C constant
expressions and for top-level / static initializers, and so be able to
simplify the min()/max() macros.

These macro names were used by various kernel code - they are very
traditional, after all - and all such users have been fixed up, with a
few different approaches:

 - trivial duplicated macro definitions have been removed

   Note that 'trivial' here means that it's obviously kernel code that
   already included all the major kernel headers, and thus gets the new
   generic MIN/MAX macros automatically.

 - non-trivial duplicated macro definitions are guarded with #ifndef

   This is the "yes, they define their own versions, but no, the include
   situation is not entirely obvious, and maybe they don't get the
   generic version automatically" case.

 - strange use case #1

   A couple of drivers decided that the way they want to describe their
   versioning is with

	#define MAJ 1
	#define MIN 2
	#define DRV_VERSION __stringify(MAJ) "." __stringify(MIN)

   which adds zero value and I just did my Alexander the Great
   impersonation, and rewrote that pointless Gordian knot as

	#define DRV_VERSION "1.2"

   instead.

 - strange use case #2

   A couple of drivers thought that it's a good idea to have a random
   'MIN' or 'MAX' define for a value or index into a table, rather than
   the traditional macro that takes arguments.

   These values were re-written as C enum's instead. The new
   function-line macros only expand when followed by an open
   parenthesis, and thus don't clash with enum use.

Happily, there weren't really all that many of these cases, and a lot of
users already had the pattern of using '#ifndef' guarding (or in one
case just using '#undef MIN') before defining their own private version
that does the same thing. I left such cases alone.

Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 15:49:18 -07:00
Linus Torvalds
910bfc26d1 Rust changes for v6.11
The highlight is the establishment of a minimum version for the Rust
 toolchain, including 'rustc' (and bundled tools) and 'bindgen'.
 
 The initial minimum will be the pinned version we currently have, i.e.
 we are just widening the allowed versions. That covers 3 stable Rust
 releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow), plus beta,
 plus nightly.
 
 This should already be enough for kernel developers in distributions
 that provide recent Rust compiler versions routinely, such as Arch
 Linux, Debian Unstable (outside the freeze period), Fedora Linux,
 Gentoo Linux (especially the testing channel), Nix (unstable) and
 openSUSE Slowroll and Tumbleweed.
 
 In addition, the kernel is now being built-tested by Rust's pre-merge
 CI. That is, every change that is attempting to land into the Rust
 compiler is tested against the kernel, and it is merged only if it
 passes. Similarly, the bindgen tool has agreed to build the kernel in
 their CI too.
 
 Thus, with the pre-merge CI in place, both projects hope to avoid
 unintentional changes to Rust that break the kernel. This means that,
 in general, apart from intentional changes on their side (that we
 will need to workaround conditionally on our side), the upcoming Rust
 compiler versions should generally work.
 
 In addition, the Rust project has proposed getting the kernel into
 stable Rust (at least solving the main blockers) as one of its three
 flagship goals for 2024H2 [1].
 
 I would like to thank Niko, Sid, Emilio et al. for their help promoting
 the collaboration between Rust and the kernel.
 
 [1] https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals
 
 Toolchain and infrastructure:
 
  - Support several Rust toolchain versions.
 
  - Support several bindgen versions.
 
  - Remove 'cargo' requirement and simplify 'rusttest', thanks to 'alloc'
    having been dropped last cycle.
 
  - Provide proper error reporting for the 'rust-analyzer' target.
 
 'kernel' crate:
 
  - Add 'uaccess' module with a safe userspace pointers abstraction.
 
  - Add 'page' module with a 'struct page' abstraction.
 
  - Support more complex generics in workqueue's 'impl_has_work!' macro.
 
 'macros' crate:
 
  - Add 'firmware' field support to the 'module!' macro.
 
  - Improve 'module!' macro documentation.
 
 Documentation:
 
  - Provide instructions on what packages should be installed to build
    the kernel in some popular Linux distributions.
 
  - Introduce the new kernel.org LLVM+Rust toolchains.
 
  - Explain '#[no_std]'.
 
 And a few other small bits.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmahqRUACgkQGXyLc2ht
 IW0xbA/6A26b14LjvmFBJU6LZb0ey1BCbK9cOWtd6K6f/uWp108WAIdA/+gHgOGU
 I6rW8nXk3af078lHRqv0ihMDUks/1mz5wyxEXoZ/mVvRJbzH9TsHN7cSP2fr4H14
 8rES4esr2XBlu9OdgDFb/o7jequ7PE0+WQDapV6eAhWQlBC6AI+ShyX26pWcB5gv
 8O4mE59Up51d21L8apVh+pnEgBsCsu7c68pUMbrk2k4sHVvnRti4iLoVlemf4X80
 Di9hyi8iN/MvWMdfq+hCIufUIbcWde07HcCbLjQlkJv0sc20V+UIGUx4EOUasOTY
 ugUyzhlFNGPxJYayAZAb8KJtQZhSbGZ+R244Z/CoV2RMlEw9LxSCpyzHr1nalOLT
 01gqZh6+gIFyPm6F0ORsetcV6yzdvUcGTjx1vuEJ9qqeKG/gc/VqFOcmCPaT7y8K
 nTOMg6zY3mzaqTn1iBebid7INzXJN7ha9dk1TkDv47BNZAic51d3L0hQFXuDrEuu
 MxVIPTAPKJSaQTCh0jrLxLJ649v/98OP0urYqlVeKuTeovupETxCsBTVtjjjsv+w
 ZomqEO+JWuf7hjG0RLuCwi/IvWpUFpEdOal4qfHbKLOAOn7zxV/WrG675HcRKbw5
 Zkr/0Q44fwbZWd2b/svTO1qOKaYV7oL0utVOdUb2KX05K71NNVo=
 =8PYF
 -----END PGP SIGNATURE-----

Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux

Pull Rust updates from Miguel Ojeda:
 "The highlight is the establishment of a minimum version for the Rust
  toolchain, including 'rustc' (and bundled tools) and 'bindgen'.

  The initial minimum will be the pinned version we currently have, i.e.
  we are just widening the allowed versions. That covers three stable
  Rust releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow),
  plus beta, plus nightly.

  This should already be enough for kernel developers in distributions
  that provide recent Rust compiler versions routinely, such as Arch
  Linux, Debian Unstable (outside the freeze period), Fedora Linux,
  Gentoo Linux (especially the testing channel), Nix (unstable) and
  openSUSE Slowroll and Tumbleweed.

  In addition, the kernel is now being built-tested by Rust's pre-merge
  CI. That is, every change that is attempting to land into the Rust
  compiler is tested against the kernel, and it is merged only if it
  passes. Similarly, the bindgen tool has agreed to build the kernel in
  their CI too.

  Thus, with the pre-merge CI in place, both projects hope to avoid
  unintentional changes to Rust that break the kernel. This means that,
  in general, apart from intentional changes on their side (that we will
  need to workaround conditionally on our side), the upcoming Rust
  compiler versions should generally work.

  In addition, the Rust project has proposed getting the kernel into
  stable Rust (at least solving the main blockers) as one of its three
  flagship goals for 2024H2 [1].

  I would like to thank Niko, Sid, Emilio et al. for their help
  promoting the collaboration between Rust and the kernel.

  Toolchain and infrastructure:

   - Support several Rust toolchain versions.

   - Support several bindgen versions.

   - Remove 'cargo' requirement and simplify 'rusttest', thanks to
     'alloc' having been dropped last cycle.

   - Provide proper error reporting for the 'rust-analyzer' target.

  'kernel' crate:

   - Add 'uaccess' module with a safe userspace pointers abstraction.

   - Add 'page' module with a 'struct page' abstraction.

   - Support more complex generics in workqueue's 'impl_has_work!'
     macro.

  'macros' crate:

   - Add 'firmware' field support to the 'module!' macro.

   - Improve 'module!' macro documentation.

  Documentation:

   - Provide instructions on what packages should be installed to build
     the kernel in some popular Linux distributions.

   - Introduce the new kernel.org LLVM+Rust toolchains.

   - Explain '#[no_std]'.

  And a few other small bits"

Link: https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals [1]

* tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux: (26 commits)
  docs: rust: quick-start: add section on Linux distributions
  rust: warn about `bindgen` versions 0.66.0 and 0.66.1
  rust: start supporting several `bindgen` versions
  rust: work around `bindgen` 0.69.0 issue
  rust: avoid assuming a particular `bindgen` build
  rust: start supporting several compiler versions
  rust: simplify Clippy warning flags set
  rust: relax most deny-level lints to warnings
  rust: allow `dead_code` for never constructed bindings
  rust: init: simplify from `map_err` to `inspect_err`
  rust: macros: indent list item in `paste!`'s docs
  rust: add abstraction for `struct page`
  rust: uaccess: add typed accessors for userspace pointers
  uaccess: always export _copy_[from|to]_user with CONFIG_RUST
  rust: uaccess: add userspace pointers
  kbuild: rust-analyzer: improve comment documentation
  kbuild: rust-analyzer: better error handling
  docs: rust: no_std is used
  rust: alloc: add __GFP_HIGHMEM flag
  rust: alloc: fix typo in docs for GFP_NOWAIT
  ...
2024-07-27 13:44:54 -07:00
Linus Torvalds
7b0acd911c 11 hotfixes, 7 of which are cc:stable. 7 are MM, 4 are other.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZqQWWQAKCRDdBJ7gKXxA
 jqJVAP9vU9HNzIyKDOOqoNHKMI+VzGn39w1FihWjG6AU5a+9NQD+MZJwr7bBwkpH
 ii43HLUGvNRQtsldBZSRypsaitCSwAI=
 =HGce
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "11 hotfixes, 7 of which are cc:stable.  7 are MM, 4 are other"

* tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nilfs2: handle inconsistent state in nilfs_btnode_create_block()
  selftests/mm: skip test for non-LPA2 and non-LVA systems
  mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()
  mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node
  alloc_tag: outline and export free_reserved_page()
  decompress_bunzip2: fix rare decompression failure
  mm/huge_memory: avoid PMD-size page cache if needed
  mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines
  mm: fix old/young bit handling in the faulting path
  dt-bindings: arm: update James Clark's email address
  MAINTAINERS: mailmap: update James Clark's email address
2024-07-27 10:26:41 -07:00
Ross Lagerwall
bf6acd5d16 decompress_bunzip2: fix rare decompression failure
The decompression code parses a huffman tree and counts the number of
symbols for a given bit length.  In rare cases, there may be >= 256
symbols with a given bit length, causing the unsigned char to overflow. 
This causes a decompression failure later when the code tries and fails to
find the bit length for a given symbol.

Since the maximum number of symbols is 258, use unsigned short instead.

Link: https://lkml.kernel.org/r/20240717162016.1514077-1-ross.lagerwall@citrix.com
Fixes: bc22c17e12 ("bzip2/lzma: library support for gzip, bzip2 and lzma decompression")
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: Alain Knaff <alain@knaff.lu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-26 14:33:09 -07:00
Linus Torvalds
51c4767503 bitmap-6.11-rc1
Random fixes for v6.11.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmahKbIACgkQsUSA/Tof
 vsh8zQwAvguyeNubDFqdMe3E/Vp1J3WqXsBFzbE1rGLCyI2S0cgJFL5BlW51zY47
 70wLt9EmroEobwj1qHSQlzejNp31kSBQ1Sqq25oivfJqEF1elDT5PQxYqBbU1C9Y
 kVWnxtb+oKaoFd5jiBK8+iTl8dXjT6H2RoV0zpPab/JPcqsjwFfkUvtENt/Kpo5c
 aRrGTFwshdp5eT4sEZQv57VKroBcwZOvv2//qrklFHrJHl4pjMT8eaX3twcQysoy
 umTVt+TK6NErLnht+VRQJ2/L02FKi7b+bHePVgNzaT+1FSDMT4FltmZd96Xwbzah
 hSkwWtqy0N2gaTcqie9nwdEiCJGjF39M7k2wangUS91CeDsbIUSsJgDCESUCm+zK
 hRqleGOnoeg4+jZBci7M53lKa/pADlmLhnU8iAc3BSKozsaioltkT+hHn8vAkstk
 h/kHlbfkzasufUWAhduBpIn384gWWEY6RACffgCsOuvbT+ZyDKUJpKYaEwVx+Pri
 l72j0hs9
 =RbET
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux

Pull bitmap updates from Yury Norov:
 "Random fixes"

* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
  riscv: Remove unnecessary int cast in variable_fls()
  radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
  bitops: Add a comment explaining the double underscore macros
  lib: bitmap: add missing MODULE_DESCRIPTION() macros
  cpumask: introduce assign_cpu() macro
2024-07-26 09:50:36 -07:00
Linus Torvalds
8bf100092d trivial printk changes for 6.11
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmaiUQcACgkQUqAMR0iA
 lPITNQ/+KDdmQljKwpuXlqe01F1mG/LFn5i1Y/a8fZVep/OSmihsgnEqnYzBomTq
 CF22tlrH7r6EZ2D5by1fjno/AG6/BAXvwH8jGDr9jhVNNBsneeVYrtMB1UUslR/e
 OEFoFKyzpq6VJNmHl5aAM95CEFEkE5uBba4DkJ/hCh3oErc2zP5DRdD9COCkdlIp
 +LzQa6XsMjwzrWAMAm2vWdBgePCHVKsAVVFUdfmN28FQw3BcFZEgIvN7vPT7Ee3I
 ESKx/Asb3myb1J7bFvDKnpT9O+7EkU/cpQn+HxjiIFVPqFLX3mfSXzgvfNocuPB0
 hkIUzA9Sbu2wa+SE7qU1IwHVZj2N612OPso8lG8cbcic/KaYLpd6Kt7bnJe8kBe7
 gFGBVmDjvapilQwWteWJcMs2hBxXuq0Xd+CMcXTMKYcLS8Z+4TVAYA8onssrUq+0
 Jye8hW/CvST/P7wazVtuQu1fsKKz3SG+dAaXw9/7fYTGQ2LdRoLUkDhLkwqIURjD
 j6+pMMuYpxrpeA7yaOD1xLLOKC33OINClgfodjGVHvDlwxsQBhZFchCKsbufYEa1
 CxFi8lDcfNVSGuw5x3a6iMqwnqxoeVKpi6eKKgpU81fXnGd4G2NA5jGRMycWmqzX
 +uUq6Ot1NO4QVK+pT70GhXMt2rQ6UsC1SyWggeFYBhaaqfIGKRk=
 =89Nl
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - trivial printk changes

The bigger "real" printk work is still being discussed.

* tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  vsprintf: add missing MODULE_DESCRIPTION() macro
  printk: Rename console_replay_all() and update context
2024-07-25 13:18:41 -07:00
Linus Torvalds
c2a96b7f18 Driver core changes for 6.11-rc1
Here is the big set of driver core changes for 6.11-rc1.
 
 Lots of stuff in here, with not a huge diffstat, but apis are evolving
 which required lots of files to be touched.  Highlights of the changes
 in here are:
   - platform remove callback api final fixups (Uwe took many releases to
     get here, finally!)
   - Rust bindings for basic firmware apis and initial driver-core
     interactions.  It's not all that useful for a "write a whole driver
     in rust" type of thing, but the firmware bindings do help out the
     phy rust drivers, and the driver core bindings give a solid base on
     which others can start their work.  There is still a long way to go
     here before we have a multitude of rust drivers being added, but
     it's a great first step.
   - driver core const api changes.  This reached across all bus types,
     and there are some fix-ups for some not-common bus types that
     linux-next and 0-day testing shook out.  This work is being done to
     help make the rust bindings more safe, as well as the C code, moving
     toward the end-goal of allowing us to put driver structures into
     read-only memory.  We aren't there yet, but are getting closer.
   - minor devres cleanups and fixes found by code inspection
   - arch_topology minor changes
   - other minor driver core cleanups
 
 All of these have been in linux-next for a very long time with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZqH+aQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymoOQCfVBdLcBjEDAGh3L8qHRGMPy4rV2EAoL/r+zKm
 cJEYtJpGtWX6aAtugm9E
 =ZyJV
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big set of driver core changes for 6.11-rc1.

  Lots of stuff in here, with not a huge diffstat, but apis are evolving
  which required lots of files to be touched. Highlights of the changes
  in here are:

   - platform remove callback api final fixups (Uwe took many releases
     to get here, finally!)

   - Rust bindings for basic firmware apis and initial driver-core
     interactions.

     It's not all that useful for a "write a whole driver in rust" type
     of thing, but the firmware bindings do help out the phy rust
     drivers, and the driver core bindings give a solid base on which
     others can start their work.

     There is still a long way to go here before we have a multitude of
     rust drivers being added, but it's a great first step.

   - driver core const api changes.

     This reached across all bus types, and there are some fix-ups for
     some not-common bus types that linux-next and 0-day testing shook
     out.

     This work is being done to help make the rust bindings more safe,
     as well as the C code, moving toward the end-goal of allowing us to
     put driver structures into read-only memory. We aren't there yet,
     but are getting closer.

   - minor devres cleanups and fixes found by code inspection

   - arch_topology minor changes

   - other minor driver core cleanups

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  ARM: sa1100: make match function take a const pointer
  sysfs/cpu: Make crash_hotplug attribute world-readable
  dio: Have dio_bus_match() callback take a const *
  zorro: make match function take a const pointer
  driver core: module: make module_[add|remove]_driver take a const *
  driver core: make driver_find_device() take a const *
  driver core: make driver_[create|remove]_file take a const *
  firmware_loader: fix soundness issue in `request_internal`
  firmware_loader: annotate doctests as `no_run`
  devres: Correct code style for functions that return a pointer type
  devres: Initialize an uninitialized struct member
  devres: Fix memory leakage caused by driver API devm_free_percpu()
  devres: Fix devm_krealloc() wasting memory
  driver core: platform: Switch to use kmemdup_array()
  driver core: have match() callback in struct bus_type take a const *
  MAINTAINERS: add Rust device abstractions to DRIVER CORE
  device: rust: improve safety comments
  MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
  MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
  firmware: rust: improve safety comments
  ...
2024-07-25 10:42:22 -07:00
Linus Torvalds
7a3fad30fd Random number generator updates for Linux 6.11-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmaarzgACgkQSfxwEqXe
 A66ZWBAAlhXx8bve0uKlDRK8fffWHgruho/fOY4lZJ137AKwA9JCtmOyqdfL4Dmk
 VxFe7pEQJlQhcA/6kH54uO7SBXwfKlKZJth6SYnaCRMUIbFifHjjIQ0QqldjEKi0
 rP90Hu4FVsbwQC7u9i9lQj9n2P36zb6pn83BzpZQ/2PtoVCSCrdSJUe0Rxa3H3GN
 0+nNkDSXQt5otCByLaeE3x7KJgXLWL9+G2eFSFLTZ8rSVfMx1CdOIAG37WlLGdWm
 BaFYPDKMyBTVvVJBNgAe9YSqtrsZ5nlmLz+Z9wAe/hTL7RlL03kWUu34/Udcpull
 zzMDH0WMntiGK3eFQ2gOYSWqypvAjwHgn3BzqNmjUb69+89mZsdU1slcvnxWsUwU
 D3vphrscaqarF629tfsXti3jc5PoXwUTjROZVcCyeFPBhyAZgzK8xUvPpJO+RT+K
 EuUABob9cpA6FCpW/QeolDmMDhXlNT8QgsZu1juokZac2xP3Ly3REyEvT7HLbU2W
 ZJjbEqm1ppp3RmGELUOJbyhwsLrnbt+OMDO7iEWoG8aSFK4diBK/ZM6WvLMkr8Oi
 7ioXGIsYkCy3c47wpZKTrAapOPJp5keqNAiHSEbXw8mozp6429QAEZxNOcczgHKC
 Ea2JzRkctqutcIT+Slw/uUe//i1iSsIHXbE81fp5udcQTJcUByo=
 =P8aI
 -----END PGP SIGNATURE-----

Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "This adds getrandom() support to the vDSO.

  First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
  lets the kernel zero out pages anytime under memory pressure, which
  enables allocating memory that never gets swapped to disk but also
  doesn't count as being mlocked.

  Then, the vDSO implementation of getrandom() is introduced in a
  generic manner and hooked into random.c.

  Next, this is implemented on x86. (Also, though it's not ready for
  this pull, somebody has begun an arm64 implementation already)

  Finally, two vDSO selftests are added.

  There are also two housekeeping cleanup commits"

* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  MAINTAINERS: add random.h headers to RNG subsection
  random: note that RNDGETPOOL was removed in 2.6.9-rc2
  selftests/vDSO: add tests for vgetrandom
  x86: vdso: Wire up getrandom() vDSO implementation
  random: introduce generic vDSO getrandom() implementation
  mm: add MAP_DROPPABLE for designating always lazily freeable mappings
2024-07-24 10:29:50 -07:00
Linus Torvalds
7d080fa867 for-6.11/block-20240722
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmaeZBIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqI7D/9XPinZuuwiZ/670P8yjk1SHFzqzdwtuFuP
 +Dq2lcoYRkuwm5PvCvhs3QH2mnjS1vo1SIoAijGEy3V1bs41mw87T2knKMIn4g5v
 I5A4gC6i0IqxIkFm17Zx9yG+MivoOmPtqM4RMxze2xS/uJwWcvg4tjBHZfylY3d9
 oaIXyZj+0dTRf955K2x/5dpfE6qjtDG0bqrrJXnzaIKHBJk2HKezYFbTstAA4OY+
 MvMqRL7uJmJBd7384/WColIO0b8/UEchPl7qG+zy9pg+wzQGLFyF/Z/KdjrWdDMD
 IFs92uNDFQmiGoyujJmXdDV9xpKi94nqDAtUR+Qct0Mui5zz0w2RNcGvyTDjBMpv
 CAzTkTW48moYkwLPhPmy8Ge69elT82AC/9ZQAHbA7g3TYgJML5IT/7TtiaVe6Rc1
 podnTR3/e9XmZnc25aUZeAr6CG7b+0NBvB+XPO9lNyMEE38sfwShoPdAGdKX25oA
 mjnLHBc9grVOQzRGEx22E11k+1ChXf/o9H546PB2Pr9yvf/DQ3868a+QhHssxufL
 Xul1K5a+pUmOnaTLD3ESftYlFmcDOHQ6gDK697so7mU7lrD3ctN4HYZ2vwNk35YY
 2b4xrABrOEbAXlUo3Ht8F/ecg6qw4xTr9vAW5q4+L2H5+28RaZKYclHhLmR23yfP
 xJ/d5FfVFQ==
 =fqoV
 -----END PGP SIGNATURE-----

Merge tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux

Pull more block updates from Jens Axboe:

 - MD fixes via Song:
     - md-cluster fixes (Heming Zhao)
     - raid1 fix (Mateusz Jończyk)

 - s390/dasd module description (Jeff)

 - Series cleaning up and hardening the blk-mq debugfs flag handling
   (John, Christoph)

 - blk-cgroup cleanup (Xiu)

 - Error polled IO attempts if backend doesn't support it (hexue)

 - Fix for an sbitmap hang (Yang)

* tag 'for-6.11/block-20240722' of git://git.kernel.dk/linux: (23 commits)
  blk-cgroup: move congestion_count to struct blkcg
  sbitmap: fix io hung due to race on sbitmap_word::cleared
  block: avoid polling configuration errors
  block: Catch possible entries missing from rqf_name[]
  block: Simplify definition of RQF_NAME()
  block: Use enum to define RQF_x bit indexes
  block: Catch possible entries missing from cmd_flag_name[]
  block: Catch possible entries missing from alloc_policy_name[]
  block: Catch possible entries missing from hctx_flag_name[]
  block: Catch possible entries missing from hctx_state_name[]
  block: Catch possible entries missing from blk_queue_flag_name[]
  block: Make QUEUE_FLAG_x as an enum
  block: Relocate BLK_MQ_MAX_DEPTH
  block: Relocate BLK_MQ_CPU_WORK_BATCH
  block: remove QUEUE_FLAG_STOPPED
  block: Add missing entry to hctx_flag_name[]
  block: Add zone write plugging entry to rqf_name[]
  block: Add missing entries from cmd_flag_name[]
  s390/dasd: fix error checks in dasd_copy_pair_store()
  s390/dasd: add missing MODULE_DESCRIPTION() macros
  ...
2024-07-22 11:32:05 -07:00
Linus Torvalds
527eff227d - In the series "treewide: Refactor heap related implementation",
Kuan-Wei Chiu has significantly reworked the min_heap library code and
   has taught bcachefs to use the new more generic implementation.
 
 - Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
   reworks the cpumask and nodemask headers to make things generally more
   rational.
 
 - Kuan-Wei Chiu has sent along some maintenance work against our sorting
   library code in the series "lib/sort: Optimizations and cleanups".
 
 - More library maintainance work from Christophe Jaillet in the series
   "Remove usage of the deprecated ida_simple_xx() API".
 
 - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
   series "nilfs2: eliminate the call to inode_attach_wb()".
 
 - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB
   command error".
 
 - Plus the usual shower of singleton patches all over the place.  Please
   see the relevant changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2GvwAKCRDdBJ7gKXxA
 jlf/AP48xP5ilIHbtpAKm2z+MvGuTxJQ5VSC0UXFacuCbc93lAEA+Yo+vOVRmh6j
 fQF2nVKyKLYfSz7yqmCyAaHWohIYLgg=
 =Stxz
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - In the series "treewide: Refactor heap related implementation",
   Kuan-Wei Chiu has significantly reworked the min_heap library code
   and has taught bcachefs to use the new more generic implementation.

 - Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
   reworks the cpumask and nodemask headers to make things generally
   more rational.

 - Kuan-Wei Chiu has sent along some maintenance work against our
   sorting library code in the series "lib/sort: Optimizations and
   cleanups".

 - More library maintainance work from Christophe Jaillet in the series
   "Remove usage of the deprecated ida_simple_xx() API".

 - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
   series "nilfs2: eliminate the call to inode_attach_wb()".

 - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix
   GDB command error".

 - Plus the usual shower of singleton patches all over the place. Please
   see the relevant changelogs for details.

* tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits)
  ia64: scrub ia64 from poison.h
  watchdog/perf: properly initialize the turbo mode timestamp and rearm counter
  tsacct: replace strncpy() with strscpy()
  lib/bch.c: use swap() to improve code
  test_bpf: convert comma to semicolon
  init/modpost: conditionally check section mismatch to __meminit*
  init: remove unused __MEMINIT* macros
  nilfs2: Constify struct kobj_type
  nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
  math: rational: add missing MODULE_DESCRIPTION() macro
  lib/zlib: add missing MODULE_DESCRIPTION() macro
  fs: ufs: add MODULE_DESCRIPTION()
  lib/rbtree.c: fix the example typo
  ocfs2: add bounds checking to ocfs2_check_dir_entry()
  fs: add kernel-doc comments to ocfs2_prepare_orphan_dir()
  coredump: simplify zap_process()
  selftests/fpu: add missing MODULE_DESCRIPTION() macro
  compiler.h: simplify data_race() macro
  build-id: require program headers to be right after ELF header
  resource: add missing MODULE_DESCRIPTION()
  ...
2024-07-21 17:56:22 -07:00
Linus Torvalds
fbc90c042c - 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN
walkers") is known to cause a performance regression
   (https://lore.kernel.org/all/3acefad9-96e5-4681-8014-827d6be71c7a@linux.ibm.com/T/#mfa809800a7862fb5bdf834c6f71a3a5113eb83ff).
   Yu has a fix which I'll send along later via the hotfixes branch.
 
 - In the series "mm: Avoid possible overflows in dirty throttling" Jan
   Kara addresses a couple of issues in the writeback throttling code.
   These fixes are also targetted at -stable kernels.
 
 - Ryusuke Konishi's series "nilfs2: fix potential issues related to
   reserved inodes" does that.  This should actually be in the
   mm-nonmm-stable tree, along with the many other nilfs2 patches.  My bad.
 
 - More folio conversions from Kefeng Wang in the series "mm: convert to
   folio_alloc_mpol()"
 
 - Kemeng Shi has sent some cleanups to the writeback code in the series
   "Add helper functions to remove repeated code and improve readability of
   cgroup writeback"
 
 - Kairui Song has made the swap code a little smaller and a little
   faster in the series "mm/swap: clean up and optimize swap cache index".
 
 - In the series "mm/memory: cleanly support zeropage in
   vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
   Hildenbrand has reworked the rather sketchy handling of the use of the
   zeropage in MAP_SHARED mappings.  I don't see any runtime effects here -
   more a cleanup/understandability/maintainablity thing.
 
 - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of
   higher addresses, for aarch64.  The (poorly named) series is
   "Restructure va_high_addr_switch".
 
 - The core TLB handling code gets some cleanups and possible slight
   optimizations in Bang Li's series "Add update_mmu_tlb_range() to
   simplify code".
 
 - Jane Chu has improved the handling of our
   fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the
   series "Enhance soft hwpoison handling and injection".
 
 - Jeff Johnson has sent a billion patches everywhere to add
   MODULE_DESCRIPTION() to everything.  Some landed in this pull.
 
 - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has
   simplified migration's use of hardware-offload memory copying.
 
 - Yosry Ahmed performs more folio API conversions in his series "mm:
   zswap: trivial folio conversions".
 
 - In the series "large folios swap-in: handle refault cases first",
   Chuanhua Han inches us forward in the handling of large pages in the
   swap code.  This is a cleanup and optimization, working toward the end
   objective of full support of large folio swapin/out.
 
 - In the series "mm,swap: cleanup VMA based swap readahead window
   calculation", Huang Ying has contributed some cleanups and a possible
   fixlet to his VMA based swap readahead code.
 
 - In the series "add mTHP support for anonymous shmem" Baolin Wang has
   taught anonymous shmem mappings to use multisize THP.  By default this
   is a no-op - users must opt in vis sysfs controls.  Dramatic
   improvements in pagefault latency are realized.
 
 - David Hildenbrand has some cleanups to our remaining use of
   page_mapcount() in the series "fs/proc: move page_mapcount() to
   fs/proc/internal.h".
 
 - David also has some highmem accounting cleanups in the series
   "mm/highmem: don't track highmem pages manually".
 
 - Build-time fixes and cleanups from John Hubbard in the series
   "cleanups, fixes, and progress towards avoiding "make headers"".
 
 - Cleanups and consolidation of the core pagemap handling from Barry
   Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
   and utilize them".
 
 - Lance Yang's series "Reclaim lazyfree THP without splitting" has
   reduced the latency of the reclaim of pmd-mapped THPs under fairly
   common circumstances.  A 10x speedup is seen in a microbenchmark.
 
   It does this by punting to aother CPU but I guess that's a win unless
   all CPUs are pegged.
 
 - hugetlb_cgroup cleanups from Xiu Jianfeng in the series
   "mm/hugetlb_cgroup: rework on cftypes".
 
 - Miaohe Lin's series "Some cleanups for memory-failure" does just that
   thing.
 
 - Is anyone reading this stuff?  If so, email me!
 
 - Someone other than SeongJae has developed a DAMON feature in Honggyu
   Kim's series "DAMON based tiered memory management for CXL memory".
   This adds DAMON features which may be used to help determine the
   efficiency of our placement of CXL/PCIe attached DRAM.
 
 - DAMON user API centralization and simplificatio work in SeongJae
   Park's series "mm/damon: introduce DAMON parameters online commit
   function".
 
 - In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
   David Hildenbrand does some maintenance work on zsmalloc - partially
   modernizing its use of pageframe fields.
 
 - Kefeng Wang provides more folio conversions in the series "mm: remove
   page_maybe_dma_pinned() and page_mkclean()".
 
 - More cleanup from David Hildenbrand, this time in the series
   "mm/memory_hotplug: use PageOffline() instead of PageReserved() for
   !ZONE_DEVICE".  It "enlightens memory hotplug more about PageOffline()
   pages" and permits the removal of some virtio-mem hacks.
 
 - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
   __folio_add_anon_rmap()" is a cleanup to the anon folio handling in
   preparation for mTHP (multisize THP) swapin.
 
 - Kefeng Wang's series "mm: improve clear and copy user folio"
   implements more folio conversions, this time in the area of large folio
   userspace copying.
 
 - The series "Docs/mm/damon/maintaier-profile: document a mailing tool
   and community meetup series" tells people how to get better involved
   with other DAMON developers.  From SeongJae Park.
 
 - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
   that.
 
 - David Hildenbrand sends along more cleanups, this time against the
   migration code.  The series is "mm/migrate: move NUMA hinting fault
   folio isolation + checks under PTL".
 
 - Jan Kara has found quite a lot of strangenesses and minor errors in
   the readahead code.  He addresses this in the series "mm: Fix various
   readahead quirks".
 
 - SeongJae Park's series "selftests/damon: test DAMOS tried regions and
   {min,max}_nr_regions" adds features and addresses errors in DAMON's self
   testing code.
 
 - Gavin Shan has found a userspace-triggerable WARN in the pagecache
   code.  The series "mm/filemap: Limit page cache size to that supported
   by xarray" addresses this.  The series is marked cc:stable.
 
 - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
   and cleanup" cleans up and slightly optimizes KSM.
 
 - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
   code motion.  The series (which also makes the memcg-v1 code
   Kconfigurable) are
 
   "mm: memcg: separate legacy cgroup v1 code and put under config
   option" and
   "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1"
 
 - Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
   adds an additional feature to this cgroup-v2 control file.
 
 - The series "Userspace controls soft-offline pages" from Jiaqi Yan
   permits userspace to stop the kernel's automatic treatment of excessive
   correctable memory errors.  In order to permit userspace to monitor and
   handle this situation.
 
 - Kefeng Wang's series "mm: migrate: support poison recover from migrate
   folio" teaches the kernel to appropriately handle migration from
   poisoned source folios rather than simply panicing.
 
 - SeongJae Park's series "Docs/damon: minor fixups and improvements"
   does those things.
 
 - In the series "mm/zsmalloc: change back to per-size_class lock"
   Chengming Zhou improves zsmalloc's scalability and memory utilization.
 
 - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
   pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare
   refcount increments.  So these paes can first be moved aside if they
   reside in the movable zone or a CMA block.
 
 - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps
   for much faster reading of vma information.  The series is "query VMAs
   from /proc/<pid>/maps".
 
 - In the series "mm: introduce per-order mTHP split counters" Lance Yang
   improves the kernel's presentation of developer information related to
   multisize THP splitting.
 
 - Michael Ellerman has developed the series "Reimplement huge pages
   without hugepd on powerpc (8xx, e500, book3s/64)".  This permits
   userspace to use all available huge page sizes.
 
 - In the series "revert unconditional slab and page allocator fault
   injection calls" Vlastimil Babka removes a performance-affecting and not
   very useful feature from slab fault injection.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2C+QAKCRDdBJ7gKXxA
 joTkAQDvjqOoFStqk4GU3OXMYB7WCU/ZQMFG0iuu1EEwTVDZ4QEA8CnG7seek1R3
 xEoo+vw0sWWeLV3qzsxnCA1BJ8cTJA8=
 =z0Lf
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - In the series "mm: Avoid possible overflows in dirty throttling" Jan
   Kara addresses a couple of issues in the writeback throttling code.
   These fixes are also targetted at -stable kernels.

 - Ryusuke Konishi's series "nilfs2: fix potential issues related to
   reserved inodes" does that. This should actually be in the
   mm-nonmm-stable tree, along with the many other nilfs2 patches. My
   bad.

 - More folio conversions from Kefeng Wang in the series "mm: convert to
   folio_alloc_mpol()"

 - Kemeng Shi has sent some cleanups to the writeback code in the series
   "Add helper functions to remove repeated code and improve readability
   of cgroup writeback"

 - Kairui Song has made the swap code a little smaller and a little
   faster in the series "mm/swap: clean up and optimize swap cache
   index".

 - In the series "mm/memory: cleanly support zeropage in
   vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
   Hildenbrand has reworked the rather sketchy handling of the use of
   the zeropage in MAP_SHARED mappings. I don't see any runtime effects
   here - more a cleanup/understandability/maintainablity thing.

 - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling
   of higher addresses, for aarch64. The (poorly named) series is
   "Restructure va_high_addr_switch".

 - The core TLB handling code gets some cleanups and possible slight
   optimizations in Bang Li's series "Add update_mmu_tlb_range() to
   simplify code".

 - Jane Chu has improved the handling of our
   fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in
   the series "Enhance soft hwpoison handling and injection".

 - Jeff Johnson has sent a billion patches everywhere to add
   MODULE_DESCRIPTION() to everything. Some landed in this pull.

 - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang
   has simplified migration's use of hardware-offload memory copying.

 - Yosry Ahmed performs more folio API conversions in his series "mm:
   zswap: trivial folio conversions".

 - In the series "large folios swap-in: handle refault cases first",
   Chuanhua Han inches us forward in the handling of large pages in the
   swap code. This is a cleanup and optimization, working toward the end
   objective of full support of large folio swapin/out.

 - In the series "mm,swap: cleanup VMA based swap readahead window
   calculation", Huang Ying has contributed some cleanups and a possible
   fixlet to his VMA based swap readahead code.

 - In the series "add mTHP support for anonymous shmem" Baolin Wang has
   taught anonymous shmem mappings to use multisize THP. By default this
   is a no-op - users must opt in vis sysfs controls. Dramatic
   improvements in pagefault latency are realized.

 - David Hildenbrand has some cleanups to our remaining use of
   page_mapcount() in the series "fs/proc: move page_mapcount() to
   fs/proc/internal.h".

 - David also has some highmem accounting cleanups in the series
   "mm/highmem: don't track highmem pages manually".

 - Build-time fixes and cleanups from John Hubbard in the series
   "cleanups, fixes, and progress towards avoiding "make headers"".

 - Cleanups and consolidation of the core pagemap handling from Barry
   Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
   and utilize them".

 - Lance Yang's series "Reclaim lazyfree THP without splitting" has
   reduced the latency of the reclaim of pmd-mapped THPs under fairly
   common circumstances. A 10x speedup is seen in a microbenchmark.

   It does this by punting to aother CPU but I guess that's a win unless
   all CPUs are pegged.

 - hugetlb_cgroup cleanups from Xiu Jianfeng in the series
   "mm/hugetlb_cgroup: rework on cftypes".

 - Miaohe Lin's series "Some cleanups for memory-failure" does just that
   thing.

 - Someone other than SeongJae has developed a DAMON feature in Honggyu
   Kim's series "DAMON based tiered memory management for CXL memory".
   This adds DAMON features which may be used to help determine the
   efficiency of our placement of CXL/PCIe attached DRAM.

 - DAMON user API centralization and simplificatio work in SeongJae
   Park's series "mm/damon: introduce DAMON parameters online commit
   function".

 - In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
   David Hildenbrand does some maintenance work on zsmalloc - partially
   modernizing its use of pageframe fields.

 - Kefeng Wang provides more folio conversions in the series "mm: remove
   page_maybe_dma_pinned() and page_mkclean()".

 - More cleanup from David Hildenbrand, this time in the series
   "mm/memory_hotplug: use PageOffline() instead of PageReserved() for
   !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
   pages" and permits the removal of some virtio-mem hacks.

 - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
   __folio_add_anon_rmap()" is a cleanup to the anon folio handling in
   preparation for mTHP (multisize THP) swapin.

 - Kefeng Wang's series "mm: improve clear and copy user folio"
   implements more folio conversions, this time in the area of large
   folio userspace copying.

 - The series "Docs/mm/damon/maintaier-profile: document a mailing tool
   and community meetup series" tells people how to get better involved
   with other DAMON developers. From SeongJae Park.

 - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
   that.

 - David Hildenbrand sends along more cleanups, this time against the
   migration code. The series is "mm/migrate: move NUMA hinting fault
   folio isolation + checks under PTL".

 - Jan Kara has found quite a lot of strangenesses and minor errors in
   the readahead code. He addresses this in the series "mm: Fix various
   readahead quirks".

 - SeongJae Park's series "selftests/damon: test DAMOS tried regions and
   {min,max}_nr_regions" adds features and addresses errors in DAMON's
   self testing code.

 - Gavin Shan has found a userspace-triggerable WARN in the pagecache
   code. The series "mm/filemap: Limit page cache size to that supported
   by xarray" addresses this. The series is marked cc:stable.

 - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
   and cleanup" cleans up and slightly optimizes KSM.

 - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
   code motion. The series (which also makes the memcg-v1 code
   Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put
   under config option" and "mm: memcg: put cgroup v1-specific memcg
   data under CONFIG_MEMCG_V1"

 - Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
   adds an additional feature to this cgroup-v2 control file.

 - The series "Userspace controls soft-offline pages" from Jiaqi Yan
   permits userspace to stop the kernel's automatic treatment of
   excessive correctable memory errors. In order to permit userspace to
   monitor and handle this situation.

 - Kefeng Wang's series "mm: migrate: support poison recover from
   migrate folio" teaches the kernel to appropriately handle migration
   from poisoned source folios rather than simply panicing.

 - SeongJae Park's series "Docs/damon: minor fixups and improvements"
   does those things.

 - In the series "mm/zsmalloc: change back to per-size_class lock"
   Chengming Zhou improves zsmalloc's scalability and memory
   utilization.

 - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
   pinning memfd folios" makes the GUP code use FOLL_PIN rather than
   bare refcount increments. So these paes can first be moved aside if
   they reside in the movable zone or a CMA block.

 - Andrii Nakryiko has added a binary ioctl()-based API to
   /proc/pid/maps for much faster reading of vma information. The series
   is "query VMAs from /proc/<pid>/maps".

 - In the series "mm: introduce per-order mTHP split counters" Lance
   Yang improves the kernel's presentation of developer information
   related to multisize THP splitting.

 - Michael Ellerman has developed the series "Reimplement huge pages
   without hugepd on powerpc (8xx, e500, book3s/64)". This permits
   userspace to use all available huge page sizes.

 - In the series "revert unconditional slab and page allocator fault
   injection calls" Vlastimil Babka removes a performance-affecting and
   not very useful feature from slab fault injection.

* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
  mm/mglru: fix ineffective protection calculation
  mm/zswap: fix a white space issue
  mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
  mm/hugetlb: fix possible recursive locking detected warning
  mm/gup: clear the LRU flag of a page before adding to LRU batch
  mm/numa_balancing: teach mpol_to_str about the balancing mode
  mm: memcg1: convert charge move flags to unsigned long long
  alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
  lib: reuse page_ext_data() to obtain codetag_ref
  lib: add missing newline character in the warning message
  mm/mglru: fix overshooting shrinker memory
  mm/mglru: fix div-by-zero in vmpressure_calc_level()
  mm/kmemleak: replace strncpy() with strscpy()
  mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
  mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
  mm: ignore data-race in __swap_writepage
  hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
  mm: shmem: rename mTHP shmem counters
  mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
  mm/migrate: putback split folios when numa hint migration fails
  ...
2024-07-21 17:15:46 -07:00
Linus Torvalds
acc5965b9f Char/Misc and other driver changes for 6.11-rc1
Here is the "big" set of char/misc and other driver subsystem changes
 for 6.11-rc1.  Nothing major in here, just loads of new drivers and
 updates.  Included in here are:
   - IIO api updates and new drivers added
   - wait_interruptable_timeout() api cleanups for some drivers
   - MODULE_DESCRIPTION() additions for loads of drivers
   - parport out-of-bounds fix
   - interconnect driver updates and additions
   - mhi driver updates and additions
   - w1 driver fixes
   - binder speedups and fixes
   - eeprom driver updates
   - coresight driver updates
   - counter driver update
   - new misc driver additions
   - other minor api updates
 
 All of these, EXCEPT for the final Kconfig build fix for 32bit systems,
 have been in linux-next for a while with no reported issues.  The
 Kconfig fixup went in 29 hours ago, so might have missed the latest
 linux-next, but was acked by everyone involved.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZppR4w8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykwoQCeIaW3nbOiNTmOupvEnZwrN3yVNs8An3Q5L+Br
 1LpTASaU6A8pN81Z1m5g
 =6U1z
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc and other driver updates from Greg KH:
 "Here is the "big" set of char/misc and other driver subsystem changes
  for 6.11-rc1. Nothing major in here, just loads of new drivers and
  updates. Included in here are:

   - IIO api updates and new drivers added

   - wait_interruptable_timeout() api cleanups for some drivers

   - MODULE_DESCRIPTION() additions for loads of drivers

   - parport out-of-bounds fix

   - interconnect driver updates and additions

   - mhi driver updates and additions

   - w1 driver fixes

   - binder speedups and fixes

   - eeprom driver updates

   - coresight driver updates

   - counter driver update

   - new misc driver additions

   - other minor api updates

  All of these, EXCEPT for the final Kconfig build fix for 32bit
  systems, have been in linux-next for a while with no reported issues.
  The Kconfig fixup went in 29 hours ago, so might have missed the
  latest linux-next, but was acked by everyone involved"

* tag 'char-misc-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (330 commits)
  misc: Kconfig: exclude mrvl-cn10k-dpi compilation for 32-bit systems
  misc: delete Makefile.rej
  binder: fix hang of unregistered readers
  misc: Kconfig: add a new dependency for MARVELL_CN10K_DPI
  virtio: add missing MODULE_DESCRIPTION() macro
  agp: uninorth: add missing MODULE_DESCRIPTION() macro
  spmi: add missing MODULE_DESCRIPTION() macros
  dev/parport: fix the array out-of-bounds risk
  samples: configfs: add missing MODULE_DESCRIPTION() macro
  misc: mrvl-cn10k-dpi: add Octeon CN10K DPI administrative driver
  misc: keba: Fix missing AUXILIARY_BUS dependency
  slimbus: Fix struct and documentation alignment in stream.c
  MAINTAINERS: CC dri-devel list on Qualcomm FastRPC patches
  misc: fastrpc: use coherent pool for untranslated Compute Banks
  misc: fastrpc: support complete DMA pool access to the DSP
  misc: fastrpc: add missing MODULE_DESCRIPTION() macro
  misc: fastrpc: Add missing dev_err newlines
  misc: fastrpc: Use memdup_user()
  nvmem: core: Implement force_ro sysfs attribute
  nvmem: Use sysfs_emit() for type attribute
  ...
2024-07-19 15:55:08 -07:00
Jason A. Donenfeld
4ad10a5f5f random: introduce generic vDSO getrandom() implementation
Provide a generic C vDSO getrandom() implementation, which operates on
an opaque state returned by vgetrandom_alloc() and produces random bytes
the same way as getrandom(). This has the following API signature:

  ssize_t vgetrandom(void *buffer, size_t len, unsigned int flags,
                     void *opaque_state, size_t opaque_len);

The return value and the first three arguments are the same as ordinary
getrandom(), while the last two arguments are a pointer to the opaque
allocated state and its size. Were all five arguments passed to the
getrandom() syscall, nothing different would happen, and the functions
would have the exact same behavior.

The actual vDSO RNG algorithm implemented is the same one implemented by
drivers/char/random.c, using the same fast-erasure techniques as that.
Should the in-kernel implementation change, so too will the vDSO one.

It requires an implementation of ChaCha20 that does not use any stack,
in order to maintain forward secrecy if a multi-threaded program forks
(though this does not account for a similar issue with SA_SIGINFO
copying registers to the stack), so this is left as an
architecture-specific fill-in. Stack-less ChaCha20 is an easy algorithm
to implement on a variety of architectures, so this shouldn't be too
onerous.

Initially, the state is keyless, and so the first call makes a
getrandom() syscall to generate that key, and then uses it for
subsequent calls. By keeping track of a generation counter, it knows
when its key is invalidated and it should fetch a new one using the
syscall. Later, more than just a generation counter might be used.

Since MADV_WIPEONFORK is set on the opaque state, the key and related
state is wiped during a fork(), so secrets don't roll over into new
processes, and the same state doesn't accidentally generate the same
random stream. The generation counter, as well, is always >0, so that
the 0 counter is a useful indication of a fork() or otherwise
uninitialized state.

If the kernel RNG is not yet initialized, then the vDSO always calls the
syscall, because that behavior cannot be emulated in userspace, but
fortunately that state is short lived and only during early boot. If it
has been initialized, then there is no need to inspect the `flags`
argument, because the behavior does not change post-initialization
regardless of the `flags` value.

Since the opaque state passed to it is mutated, vDSO getrandom() is not
reentrant, when used with the same opaque state, which libc should be
mindful of.

The function works over an opaque per-thread state of a particular size,
which must be marked VM_WIPEONFORK, VM_DONTDUMP, VM_NORESERVE, and
VM_DROPPABLE for proper operation. Over time, the nuances of these
allocations may change or grow or even differ based on architectural
features.

The opaque state passed to vDSO getrandom() must be allocated using the
mmap_flags and mmap_prot parameters provided by the vgetrandom_opaque_params
struct, which also contains the size of each state. That struct can be
obtained with a call to vgetrandom(NULL, 0, 0, &params, ~0UL). Then,
libc can call mmap(2) and slice up the returned array into a state per
each thread, while ensuring that no single state straddles a page
boundary. Libc is expected to allocate a chunk of these on first use,
and then dole them out to threads as they're created, allocating more
when needed.

vDSO getrandom() provides the ability for userspace to generate random
bytes quickly and safely, and is intended to be integrated into libc's
thread management. As an illustrative example, the introduced code in
the vdso_test_getrandom self test later in this series might be used to
do the same outside of libc. In a libc the various pthread-isms are
expected to be elided into libc internals.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-07-19 20:22:12 +02:00
Linus Torvalds
c434e25b62 This update includes the following changes:
API:
 
 - Test setkey in no-SIMD context.
 - Add skcipher speed test for user-specified algorithm.
 
 Algorithms:
 
 - Add x25519 support on ppc64le.
 - Add VAES and AVX512 / AVX10 optimized AES-GCM on x86.
 - Remove sm2 algorithm.
 
 Drivers:
 
 - Add Allwinner H616 support to sun8i-ce.
 - Use DMA in stm32.
 - Add Exynos850 hwrng support to exynos.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmaZFsgACgkQxycdCkmx
 i6f76Q//ej7akY9fo6/qsn8UFK16O0SCEMkx7TrkxqHV8R6uwy4ret3+b5dbckY6
 hBjDabiL/BAdNzo8hvta+BOtN6ToEqquSVwNCpX0U3YMLf9dIzcMA4Uri3LbxUHi
 x9Qa8klI5x62Kg+RW+ovaJC4C11oKTpjVeDn4S57MudlBnhEa3DYcEADKiUowkEz
 aigtLx8HrZYjwkQxwgWeS0xzeojhW1P20yaghOd6hTCD7vKw18JaKdD8r4YFGOBu
 39eDaM/0vR+wWokk3NNl6NmXieBT8qLFt+OIbQs6b3gX9K37daahRs1VoShcL+ix
 l8GaqLpo1n1llVrV1OWzyVLVLtYK849QEo6OmlusnbK7e5pQKEOXoACQ0VB8ElNE
 1u7KNW6CBWGzr33dWPgl9yYBrT3BmMXABIK4dNmTicJsK2zk2FPKbLDZNi8fWah/
 D46mv7Rb8EtTdhN56EzceUJpd1ZfmP9S4vY1Hu8YdmI1pxex11US/XppKLoyymqp
 vNOzf85VuZ/GkUPfHdyWAFBnTaCjXtSBrlXD6+0nxavU9KGli0PLLX5tKNNWGw0l
 51Z0tbNsDbo3Z+sMmtfvBXR2V8NwiAT5f775W0lLvpq/44mbDpdN3jGvfy9y9C7u
 1DUC6F0XtUhZjR7e6/EhvHh3lB/a3w/m3+XC+XzDeox/VYTrC3Q=
 =x80X
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu:
 "API:
   - Test setkey in no-SIMD context
   - Add skcipher speed test for user-specified algorithm

  Algorithms:
   - Add x25519 support on ppc64le
   - Add VAES and AVX512 / AVX10 optimized AES-GCM on x86
   - Remove sm2 algorithm

  Drivers:
   - Add Allwinner H616 support to sun8i-ce
   - Use DMA in stm32
   - Add Exynos850 hwrng support to exynos"

* tag 'v6.11-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (81 commits)
  hwrng: core - remove (un)register_miscdev()
  crypto: lib/mpi - delete unnecessary condition
  crypto: testmgr - generate power-of-2 lengths more often
  crypto: mxs-dcp - Ensure payload is zero when using key slot
  hwrng: Kconfig - Do not enable by default CN10K driver
  crypto: starfive - Fix nent assignment in rsa dec
  crypto: starfive - Align rsa input data to 32-bit
  crypto: qat - fix unintentional re-enabling of error interrupts
  crypto: qat - extend scope of lock in adf_cfg_add_key_value_param()
  Documentation: qat: fix auto_reset attribute details
  crypto: sun8i-ce - add Allwinner H616 support
  crypto: sun8i-ce - wrap accesses to descriptor address fields
  dt-bindings: crypto: sun8i-ce: Add compatible for H616
  hwrng: core - Fix wrong quality calculation at hw rng registration
  hwrng: exynos - Enable Exynos850 support
  hwrng: exynos - Add SMC based TRNG operation
  hwrng: exynos - Implement bus clock control
  hwrng: exynos - Use devm_clk_get_enabled() to get the clock
  hwrng: exynos - Improve coding style
  dt-bindings: rng: Add Exynos850 support to exynos-trng
  ...
2024-07-19 08:52:58 -07:00
Yang Yang
72d04bdcf3 sbitmap: fix io hung due to race on sbitmap_word::cleared
Configuration for sbq:
  depth=64, wake_batch=6, shift=6, map_nr=1

1. There are 64 requests in progress:
  map->word = 0xFFFFFFFFFFFFFFFF
2. After all the 64 requests complete, and no more requests come:
  map->word = 0xFFFFFFFFFFFFFFFF, map->cleared = 0xFFFFFFFFFFFFFFFF
3. Now two tasks try to allocate requests:
  T1:                                       T2:
  __blk_mq_get_tag                          .
  __sbitmap_queue_get                       .
  sbitmap_get                               .
  sbitmap_find_bit                          .
  sbitmap_find_bit_in_word                  .
  __sbitmap_get_word  -> nr=-1              __blk_mq_get_tag
  sbitmap_deferred_clear                    __sbitmap_queue_get
  /* map->cleared=0xFFFFFFFFFFFFFFFF */     sbitmap_find_bit
    if (!READ_ONCE(map->cleared))           sbitmap_find_bit_in_word
      return false;                         __sbitmap_get_word -> nr=-1
    mask = xchg(&map->cleared, 0)           sbitmap_deferred_clear
    atomic_long_andnot()                    /* map->cleared=0 */
                                              if (!(map->cleared))
                                                return false;
                                     /*
                                      * map->cleared is cleared by T1
                                      * T2 fail to acquire the tag
                                      */

4. T2 is the sole tag waiter. When T1 puts the tag, T2 cannot be woken
up due to the wake_batch being set at 6. If no more requests come, T1
will wait here indefinitely.

This patch achieves two purposes:
1. Check on ->cleared and update on both ->cleared and ->word need to
be done atomically, and using spinlock could be the simplest solution.
2. Add extra check in sbitmap_deferred_clear(), to identify whether
->word has free bits.

Fixes: ea86ea2cdc ("sbitmap: ammortize cost of clearing bits")
Signed-off-by: Yang Yang <yang.yang@vivo.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240716082644.659566-1-yang.yang@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-19 09:39:32 -06:00
Linus Torvalds
76d9b92e68 slab updates for 6.11
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmaXl0kACgkQu+CwddJF
 iJrOlgf+N/G7BmgoW2CBF7mKsvCYs+pX3xeBuxPtsuq4FD386nsPFMN8gWAYLG3q
 ZU1z1S+0M8LhTg6/G9jMYLHt2Y7WhYbhFTjTHmULJkuhMDTUP9CRYy4XZ+hdPtHF
 30ezSdJQF9x/XxCSaaRVK1s+SMVHFg5xAOHKpfkNSamcMz9g+ZkYyPBr10/VoKd0
 JqwhW7r6hrlvWAiqY3QKCOvohIWglgvBUnNjUGMh1cUkOE2aYLYHklhRwICKgA6z
 p/2BUXiAEWUtgBkUrizwm/pdhJXLs0pOeYarVZP1v83tQMxyrc6XLNnqhvxP3DPW
 31thF5Rf9I8WaWTczXhxsAwFjqO3KQ==
 =4uf9
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
 "The most prominent change this time is the kmem_buckets based
  hardening of kmalloc() allocations from Kees Cook.

  We have also extended the kmalloc() alignment guarantees for
  non-power-of-two sizes in a way that benefits rust.

  The rest are various cleanups and non-critical fixups.

   - Dedicated bucket allocator (Kees Cook)

     This series [1] enhances the probabilistic defense against heap
     spraying/grooming of CONFIG_RANDOM_KMALLOC_CACHES from last year.

     kmalloc() users that are known to be useful for exploits can get
     completely separate set of kmalloc caches that can't be shared with
     other users. The first converted users are alloc_msg() and
     memdup_user().

     The hardening is enabled by CONFIG_SLAB_BUCKETS.

   - Extended kmalloc() alignment guarantees (Vlastimil Babka)

     For years now we have guaranteed natural alignment for power-of-two
     allocations, but nothing was defined for other sizes (in practice,
     we have two such buckets, kmalloc-96 and kmalloc-192).

     To avoid unnecessary padding in the rust layer due to its alignment
     rules, extend the guarantee so that the alignment is at least the
     largest power-of-two divisor of the requested size.

     This fits what rust needs, is a superset of the existing
     power-of-two guarantee, and does not in practice change the layout
     (and thus does not add overhead due to padding) of the kmalloc-96
     and kmalloc-192 caches, unless slab debugging is enabled for them.

   - Cleanups and non-critical fixups (Chengming Zhou, Suren
     Baghdasaryan, Matthew Willcox, Alex Shi, and Vlastimil Babka)

     Various tweaks related to the new alloc profiling code, folio
     conversion, debugging and more leftovers after SLAB"

Link: https://lore.kernel.org/all/20240701190152.it.631-kees@kernel.org/ [1]

* tag 'slab-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/memcg: alignment memcg_data define condition
  mm, slab: move prepare_slab_obj_exts_hook under CONFIG_MEM_ALLOC_PROFILING
  mm, slab: move allocation tagging code in the alloc path into a hook
  mm/util: Use dedicated slab buckets for memdup_user()
  ipc, msg: Use dedicated slab buckets for alloc_msg()
  mm/slab: Introduce kmem_buckets_create() and family
  mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument
  mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
  mm/slab: Introduce kmem_buckets typedef
  slab, rust: extend kmalloc() alignment guarantees to remove Rust padding
  slab: delete useless RED_INACTIVE and RED_ACTIVE
  slab: don't put freepointer outside of object if only orig_size
  slab: make check_object() more consistent
  mm: Reduce the number of slab->folio casts
  mm, slab: don't wrap internal functions with alloc_hooks()
2024-07-18 15:08:12 -07:00
Linus Torvalds
db2451e78d Bootconfig updates for v6.11:
- Remove duplicate included header file linux/bootconfig.h from
   lib/bootconfig.c. This is a cleanup, no behavior change.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmaWhj4bHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bdd0H/iraZ7ZOFWxCapOZI4dL
 7f870j0PQG/KU7lB4jAo+3u7YyQWQTTLdhDPEOci4axsDG+56C/SVpHV0Z26SGHX
 ZqcKlA/H0HT4BA3zG1leRzXC/qPYiAEdIw38NngYPYBUWhqM3qmYlrRIBeg89VrM
 B4yaIJA/Uae7KAlB2dcmhmrIg86QK1iPKU6G+U5mIFecxDQmowE7z5f5pI/K/M5j
 2HT2Kg1XPTtxOb15mKtA19TXbbA1IqYUvwW5jOffppKMwtiggEaOj4mLQ1MhlrP0
 pEb1OJMx21MvEJYtjOXi8qsSGOhdWH8sBpxdUv21GzwRvOuG/AoaN1YKMIZCQp1K
 Jjo=
 =Bjzb
 -----END PGP SIGNATURE-----

Merge tag 'bootconfig-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig update from Masami Hiramatsu:

 - Remove duplicate included header file linux/bootconfig.h from
   lib/bootconfig.c. This is a cleanup, no behavior change.

* tag 'bootconfig-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  bootconfig: Remove duplicate included header file linux/bootconfig.h
2024-07-18 12:39:40 -07:00
Linus Torvalds
b3ce7a3084 drm next for 6.11-rc1:
core:
 - deprecate DRM data and return 0 date
 - connector: Create a set of helpers to help with HDMI support
 - Remove driver owner assignments
 - Allow more drivers to compile with COMPILE_TEST
 - Conversions to drm_edid
 - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
 - Remove drm_mm_replace_node
 - print: Add a drm prefix to warn level messages too, remove
          ___drm_dbg, consolidate prefix handling
 - New monochrome TV mode variant
 
 ttm:
 - improve number of page faults on some platforms
 - fix test builds under PREEMPT_RT
 - more test coverage
 
 ci:
 - Require a more recent version of mesa,
 - improve farm setup and test generation
 
 dma-buf:
 - warn if reserving 0 fence slots
 - internal API heap enhancements
 
 fbdev:
 - Create memory manager optimized fbdev emulation
 
 panic:
 - Allow to select fonts,
 - improve drm_fb_dma_get_scanout_buffer
 - Allow to dump kmsg to the screen
 
 bridge:
 - Remove redundant checks on bridge->encoder
 - Remove drm_bridge_chain_mode_fixup
 - bridge-connector: Plumb in the new HDMI helper
 - analogix_dp: Various improvements, handle AUX transfers timeout
 - samsung-dsim: Fix timings calculation
 - tc358767: Plenty of small fixes, fix no connector attach, fix clocks
 - sii902x: state validation improvements
 
 panels:
 - Switch panels from register table initialization to proper code
 - Now that the panel code tracks the panel state, remove every
   ad-hoc implementation in the panel drivers
 - More cleanup of prepare / enable state tracking in drivers
 - edp: Drop legacy panel compatibles
 - simple-bridge: Switch to devm_drm_bridge_add
 - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
   13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
   nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView PM070WL4,
   Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,
   AUO G104STN01, K&d kd101ne3-40ti
 
 amdgpu:
 - DCN 4.0.x support
 - GC 12.0 support
 - GMC 12.0 support
 - SDMA 7.0 support
 - MES12 support
 - MMHUB 4.1 support
 - GFX12 modifier and DCC support
 - lots of IP fixes/updates
 
 amdkfd:
 - Contiguous VRAM allocations
 - GC 12.0 support
 - SDMA 7.0 support
 - SR-IOV fixes
 - KFD GFX ALU exceptions
 
 i915:
 - Battlemage Xe2 HPD display enablement
 - Panel Replay enabling
 - DP AUX-less ALPM/LOBF
 - Enable link training failure fallback for DP MST links
 - CMRR (Content Match Refresh Rate) enabling
 - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
 - Enable eDP AUX based HDR backlight
 - Support replaying GPU hangs with captured context image
 - Automate CCS Mode setting during engine resets
 - lots of refactoring
 - Support replaying GPU hangs with captured context image
 - Increase FLR timeout from 3s to 9s
 - Enable w/a 16021333562 for DG2, MTL and ARL [guc]
 
 xe:
 - update MAINATINERS
 - New uapi adding OA functionality to Xe
 - expose l3 bank mask
 - fix display detect on ADL-N
 - runtime PM Fixes
 - Fix silent backmerge issues
 - More prep for SR-IOV
 - HWmon additions
 - per client usage info
 - Rework GPU page fault handling
 - Drop EXEC_QUEUE_FLAG_BANNED
 - Add BMG PCI IDs
 - Scheduler fixes and improvements
 - Rename xe_exec_queue::compute to xe_exec_queue::lr
 - Use ttm_uncached for BO with NEEDS_UC flag
 - Rename xe perf layer as xe observation layer
 - lots of refactoring
 
 radeon:
 - Backlight workaround for iMac
 - Silence UBSAN flex array warnings
 
 msm:
 - Validate registers XML description against schema in CI
 - core/dpu: SM7150 support
 - mdp5: Add support for MSM8937
 - gpu: Add param for userspace to know if raytracing is supported
 - gpu: X185 support (aka gpu in X1 laptop chips)
 - gpu: a505 support
 
 ivpu:
 - hardware scheduler support
 - profiling support
 - improvements to the platform support layer
 - firmware handling improvements
 - clocks/power mgmt improvements
 - scheduler/logging improvements
 
 habanalabs:
 - Gradual sleep in polling memory macro.
 - Reduce Gaudi2 MSI-X interrupt count to 128.
 - Add Gaudi2-D revision support.
 - Add timestamp to CPLD info.
 - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error.
 - Align Gaudi2 interrupt names.
 - Check for errors after preboot is ready.
 - Change habanalabs maintainer and git repo path.
 
 mgag200:
 - refactoring and improvements
 - Add BMC output
 - enable polling
 
 nouveau:
 - add registry command line
 
 v3d:
 - perf counters improvements
 
 zynqmp:
 - irq and debugfs improvements
 
 atmel-hlcdc:
 - Support XLCDC in sam9x7
 
 mipi-dbi:
 - Remove mipi_dbi_machine_little_endian
 - make SPI bits per word configurable
 - support RGB888
 - allow pixel formats to be specified in the DT
 
 sun4i:
 - Rework the blender setup for DE2
 
 panfrost:
 - Enable MT8188 support
 
 vc4:
 - Monochrome TV support
 
 exynos:
 - fix fallback mode regression
 - fix memory leak
 - Use drm_edid_duplicate() instead of kmemdup()
 
 etnaviv:
 - fix i.MX8MP NPU clock gating
 - workaround FE register cdc issues on some cores
 - fix DMA sync handling for cached buffers
 - fix job timeout handling
 - keep TS enabled on MMUv2 cores for improved performance
 
 mediatek:
 - Convert to platform remove callback returning void-
 - Drop chain_mode_fixup call in mode_valid()
 - Fixes the errors of MediaTek display driver found by IGT.
 - Add display support for the MT8365-EVK board
 - Fix bit depth overwritten for mtk_ovl_set bit_depth()
 - Fix possible_crtcs calculation
 - Fix spurious kfree()
 
 ast:
 - refactor mode setting code
 
 stm:
 - Add LVDS support
 - DSI PHY updates
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmaYqVEACgkQDHTzWXnE
 hr5p3Q/+OOxTHKJ/8WMwfV1Tuep5otkCZdBgNdcuu9zqzpEMEDUDwmV1iboIvT9x
 qJsDwSAJomwbZAnVjDKsbZuycSHUBV6HQdf+5+rtq6be1EfFRwJVzOq0u5+D3KGt
 7f2vy6sM9tw4tR6EikiuP7vCvnSz4iGrWERvEJDEtXECbALhju8sulht8ZMnr6GW
 /MfUetULLSDjq0L1x3TWAq2MPGnJ5UxIkIeOBUP6n4etAUX1BPTNA6N76eN/xMvn
 a40JhtM+pCjjkHxvloIZ+KTYN3S+hskIRksczPHh9HtNX7y/A437wyhOHJZ1NvZb
 yc5ke9GjXxGcxyZH+PY5aCS7O/XElzSSkR1jFZ2s3/MX7PVKgCahGK7+yWjPsiK2
 R5oXebdObshUa8LHDE/3WgBUmTchkvKRTXV9cvGqzxEPhC2zrxArvwP5v6B4mhCn
 Vqo3Pv0Cyr+n65Z5Dzqz/9+m999LJjFTsTrug0p5b/qBJQKu2rQONe4lpZ0NFwwY
 ExyjdxILj7mqrQpKcA6V5Bel5ZCnlVsGfTshFL6Iux54VFlJyRMzKWZ+Gdv4av5k
 dbjz+re+CojKabn3ML/7pAQujK6Rqe58vPuHV78zkvAGJnQgJOOTrmYNYtn3oBqe
 ogdCN+/PREb/9U7i6mQv5hhdHs4tT9ROXaT9jyb8XSHXW+t9lBM=
 =g+Ad
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "There's a lot of stuff in here, amd, i915 and xe have new platform
  work, lots of core rework around EDID handling, some new COMPILE_TEST
  options, maintainer changes and a lots of other stuff. Summary:

  core:
   - deprecate DRM data and return 0 date
   - connector: Create a set of helpers to help with HDMI support
   - Remove driver owner assignments
   - Allow more drivers to compile with COMPILE_TEST
   - Conversions to drm_edid
   - Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
   - Remove drm_mm_replace_node
   - print: Add a drm prefix to warn level messages too, remove
            ___drm_dbg, consolidate prefix handling
   - New monochrome TV mode variant

  ttm:
   - improve number of page faults on some platforms
   - fix test builds under PREEMPT_RT
   - more test coverage

  ci:
   - Require a more recent version of mesa
   - improve farm setup and test generation

  dma-buf:
   - warn if reserving 0 fence slots
   - internal API heap enhancements

  fbdev:
   - Create memory manager optimized fbdev emulation

  panic:
   - Allow to select fonts
   - improve drm_fb_dma_get_scanout_buffer
   - Allow to dump kmsg to the screen

  bridge:
   - Remove redundant checks on bridge->encoder
   - Remove drm_bridge_chain_mode_fixup
   - bridge-connector: Plumb in the new HDMI helper
   - analogix_dp: Various improvements, handle AUX transfers timeout
   - samsung-dsim: Fix timings calculation
   - tc358767: Plenty of small fixes, fix no connector attach, fix
               clocks
   - sii902x: state validation improvements

  panels:
   - Switch panels from register table initialization to proper code
   - Now that the panel code tracks the panel state, remove every ad-hoc
     implementation in the panel drivers
   - More cleanup of prepare / enable state tracking in drivers
   - edp: Drop legacy panel compatibles
   - simple-bridge: Switch to devm_drm_bridge_add
   - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
                 13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
                 BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
                 PM070WL4, Lincoln Technologies LCD197, Ortustech
                 COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti

  amdgpu:
   - DCN 4.0.x support
   - GC 12.0 support
   - GMC 12.0 support
   - SDMA 7.0 support
   - MES12 support
   - MMHUB 4.1 support
   - GFX12 modifier and DCC support
   - lots of IP fixes/updates

  amdkfd:
   - Contiguous VRAM allocations
   - GC 12.0 support
   - SDMA 7.0 support
   - SR-IOV fixes
   - KFD GFX ALU exceptions

  i915:
   - Battlemage Xe2 HPD display enablement
   - Panel Replay enabling
   - DP AUX-less ALPM/LOBF
   - Enable link training failure fallback for DP MST links
   - CMRR (Content Match Refresh Rate) enabling
   - Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
   - Enable eDP AUX based HDR backlight
   - Support replaying GPU hangs with captured context image
   - Automate CCS Mode setting during engine resets
   - lots of refactoring
   - Support replaying GPU hangs with captured context image
   - Increase FLR timeout from 3s to 9s
   - Enable w/a 16021333562 for DG2, MTL and ARL [guc]

  xe:
   - update MAINATINERS
   - New uapi adding OA functionality to Xe
   - expose l3 bank mask
   - fix display detect on ADL-N
   - runtime PM Fixes
   - Fix silent backmerge issues
   - More prep for SR-IOV
   - HWmon additions
   - per client usage info
   - Rework GPU page fault handling
   - Drop EXEC_QUEUE_FLAG_BANNED
   - Add BMG PCI IDs
   - Scheduler fixes and improvements
   - Rename xe_exec_queue::compute to xe_exec_queue::lr
   - Use ttm_uncached for BO with NEEDS_UC flag
   - Rename xe perf layer as xe observation layer
   - lots of refactoring

  radeon:
   - Backlight workaround for iMac
   - Silence UBSAN flex array warnings

  msm:
   - Validate registers XML description against schema in CI
   - core/dpu: SM7150 support
   - mdp5: Add support for MSM8937
   - gpu: Add param for userspace to know if raytracing is supported
   - gpu: X185 support (aka gpu in X1 laptop chips)
   - gpu: a505 support

  ivpu:
   - hardware scheduler support
   - profiling support
   - improvements to the platform support layer
   - firmware handling improvements
   - clocks/power mgmt improvements
   - scheduler/logging improvements

  habanalabs:
   - Gradual sleep in polling memory macro
   - Reduce Gaudi2 MSI-X interrupt count to 128
   - Add Gaudi2-D revision support
   - Add timestamp to CPLD info
   - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
   - Align Gaudi2 interrupt names
   - Check for errors after preboot is ready
   - Change habanalabs maintainer and git repo path

  mgag200:
   - refactoring and improvements
   - Add BMC output
   - enable polling

  nouveau:
   - add registry command line

  v3d:
   - perf counters improvements

  zynqmp:
   - irq and debugfs improvements

  atmel-hlcdc:
   - Support XLCDC in sam9x7

  mipi-dbi:
   - Remove mipi_dbi_machine_little_endian
   - make SPI bits per word configurable
   - support RGB888
   - allow pixel formats to be specified in the DT

  sun4i:
   - Rework the blender setup for DE2

  panfrost:
   - Enable MT8188 support

  vc4:
   - Monochrome TV support

  exynos:
   - fix fallback mode regression
   - fix memory leak
   - Use drm_edid_duplicate() instead of kmemdup()

  etnaviv:
   - fix i.MX8MP NPU clock gating
   - workaround FE register cdc issues on some cores
   - fix DMA sync handling for cached buffers
   - fix job timeout handling
   - keep TS enabled on MMUv2 cores for improved performance

  mediatek:
   - Convert to platform remove callback returning void-
   - Drop chain_mode_fixup call in mode_valid()
   - Fixes the errors of MediaTek display driver found by IGT
   - Add display support for the MT8365-EVK board
   - Fix bit depth overwritten for mtk_ovl_set bit_depth()
   - Fix possible_crtcs calculation
   - Fix spurious kfree()

  ast:
   - refactor mode setting code

  stm:
   - Add LVDS support
   - DSI PHY updates"

* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
  drm/amdgpu/mes12: add missing opcode string
  drm/amdgpu/mes11: update opcode strings
  Revert "drm/amd/display: Reset freesync config before update new state"
  drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
  drm/xe: Drop trace_xe_hw_fence_free
  drm/xe/uapi: Rename xe perf layer as xe observation layer
  drm/amdgpu: remove exp hw support check for gfx12
  drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
  drm/amdgpu: flush all cached ras bad pages to eeprom
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/display: Allow display DCC for DCN401
  drm/amdgpu: select compute ME engines dynamically
  drm/amdgpu/job: Replace DRM_INFO/ERROR logging
  drm/amdgpu: select compute ME engines dynamically
  drm/amd/pm: Ignore initial value in smu response register
  drm/amdgpu: Initialize VF partition mode
  drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
  MAINTAINERS: fix Xinhui's name
  MAINTAINERS: update powerplay and swsmu
  drm/qxl: Pin buffer objects for internal mappings
  ...
2024-07-18 09:34:02 -07:00
Linus Torvalds
51835949dd Networking changes for 6.11. Not much excitement - a handful of large
patchsets (devmem among them) did not make it in time.
 
 Core & protocols
 ----------------
 
  - Use local_lock in addition to local_bh_disable() to protect per-CPU
    resources in networking, a step closer for local_bh_disable() not
    to act as a big lock on PREEMPT_RT.
 
  - Use flex array for netdevice priv area, ensure its cache alignment.
 
  - Add a sysctl knob to allow user to specify a default rto_min at socket
    init time. Bit of a big hammer but multiple companies were
    independently carrying such patch downstream so clearly it's useful.
 
  - Support scheduling transmission of packets based on CLOCK_TAI.
 
  - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off
    using cpusets.
 
  - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address.
 
  - Allow configuration of multipath hash seed, to both allow synchronizing
    hashing of two routers, and preventing partial accidental sync.
 
  - Improve TCP compliance with RFC 9293 for simultaneous connect().
 
  - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace
    IKE daemon had to do this before, but the kernel can better keep
    track of it.
 
  - Support sending supervision HSR frames with MAC addresses stored in
    ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled.
 
  - Introduce IPPROTO_SMC for selecting SMC when socket is created.
 
  - Allow UDP GSO transmit from devices with no checksum offload.
 
  - openvswitch: add packet sampling via psample, separating the sampled
    traffic from "upcall" packets sent to user space for forwarding.
 
  - nf_tables: shrink memory consumption for transaction objects.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Power Sequencing subsystem (used by Qualcomm Bluetooth driver
    for QCA6390).
 
  - Add IRQ information in sysfs for auxiliary bus.
 
  - Introduce guard definition for local_lock.
 
  - Add aligned flavor of __cacheline_group_{begin, end}() markings for
    grouping fields in structures.
 
 BPF
 ---
 
  - Notify user space (via epoll) when a struct_ops object is getting
    detached/unregistered.
 
  - Add new kfuncs for a generic, open-coded bits iterator.
 
  - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
    bpf_list_head.
 
  - Support resilient split BTF which cuts down on duplication and makes
    BTF as compact as possible WRT BTF from modules.
 
  - Add support for dumping kfunc prototypes from BTF which enables both
    detecting as well as dumping compilable prototypes for kfuncs.
 
  - riscv64 BPF JIT improvements in particular to add 12-argument support
    for BPF trampolines and to utilize bpf_prog_pack for the latter.
 
  - Add the capability to offload the netfilter flowtable in XDP layer
    through kfuncs.
 
 Driver API
 ----------
 
  - Allow users to configure IRQ tresholds between which automatic IRQ
    moderation can choose.
 
  - Expand Power Sourcing (PoE) status with power, class and failure
    reason. Support setting power limits.
 
  - Track additional RSS contexts in the core, make sure configuration
    changes don't break them.
 
  - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP
    data paths.
 
  - Support updating firmware on SFP modules.
 
 Tests and tooling
 -----------------
 
  - mptcp: use net/lib.sh to manage netns.
 
  - TCP-AO and TCP-MD5: replace debug prints used by tests with
    tracepoints.
 
  - openvswitch: make test self-contained (don't depend on OvS CLI tools).
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - increase the max total outstanding PTP TX packets to 4
      - add timestamping statistics support
      - implement netdev_queue_mgmt_ops
      - support new RSS context API
    - Intel (100G, ice, idpf):
      - implement FEC statistics and dumping signal quality indicators
      - support E825C products (with 56Gbps PHYs)
    - nVidia/Mellanox:
      - support HW-GRO
      - mlx4/mlx5: support per-queue statistics via netlink
      - obey the max number of EQs setting in sub-functions
    - AMD/Solarflare:
      - support new RSS context API
    - AMD/Pensando:
      - ionic: rework fix for doorbell miss to lower overhead
        and skip it on new HW
    - Wangxun:
      - txgbe: support Flow Director perfect filters
 
  - Ethernet NICs consumer, embedded and virtual:
    - Add driver for Tehuti Networks TN40xx chips
    - Add driver for Meta's internal NIC chips
    - Add driver for Ethernet MAC on Airoha EN7581 SoCs
    - Add driver for Renesas Ethernet-TSN devices
    - Google cloud vNIC:
      - flow steering support
    - Microsoft vNIC:
      - support page sizes other than 4KB on ARM64
    - vmware vNIC:
      - support latency measurement (update to version 9)
    - VirtIO net:
      - support for Byte Queue Limits
      - support configuring thresholds for automatic IRQ moderation
      - support for AF_XDP Rx zero-copy
    - Synopsys (stmmac):
      - support for STM32MP13 SoC
      - let platforms select the right PCS implementation
    - TI:
      - icssg-prueth: add multicast filtering support
      - icssg-prueth: enable PTP timestamping and PPS
    - Renesas:
      - ravb: improve Rx performance 30-400% by using page pool,
        theaded NAPI and timer-based IRQ coalescing
      - ravb: add MII support for R-Car V4M
    - Cadence (macb):
      - macb: add ARP support to Wake-On-LAN
    - Cortina:
      - use phylib for RX and TX pause configuration
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support configuration of multipath hash seed
      - report more accurate max MTU
      - use page_pool to improve Rx performance
    - MediaTek:
      - mt7530: add support for bridge port isolation
    - Qualcomm:
      - qca8k: add support for bridge port isolation
    - Microchip:
      - lan9371/2: add 100BaseTX PHY support
    - NXP:
      - vsc73xx: implement VLAN operations
 
  - Ethernet PHYs:
    - aquantia: enable support for aqr115c
    - aquantia: add support for PHY LEDs
    - realtek: add support for rtl8224 2.5Gbps PHY
    - xpcs: add memory-mapped device support
    - add BroadR-Reach link mode and support in Broadcom's PHY driver
 
  - CAN:
    - add document for ISO 15765-2 protocol support
    - mcp251xfd: workaround for erratum DS80000789E, use timestamps
      to catch when device returns incorrect FIFO status
 
  - WiFi:
    - mac80211/cfg80211:
      - parse Transmit Power Envelope (TPE) data in mac80211 instead of
        in drivers
      - improvements for 6 GHz regulatory flexibility
      - multi-link improvements
      - support multiple radios per wiphy
      - remove DEAUTH_NEED_MGD_TX_PREP flag
    - Intel (iwlwifi):
      - bump FW API to 91 for BZ/SC devices
      - report 64-bit radiotap timestamp
      - enable P2P low latency by default
      - handle Transmit Power Envelope (TPE) advertised by AP
      - remove support for older FW for new devices
      - fast resume (keeping the device configured)
      - mvm: re-enable Multi-Link Operation (MLO)
      - aggregation (A-MSDU) optimizations
    - MediaTek (mt76):
      - mt7925 Multi-Link Operation (MLO) support
    - Qualcomm (ath10k):
      - LED support for various chipsets
    - Qualcomm (ath12k):
      - remove unsupported Tx monitor handling
      - support channel 2 in 6 GHz band
      - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
      - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
        Advertisements (EMA)
      - support dynamic VLAN
      - add panic handler for resetting the firmware state
      - DebugFS support for datapath statistics
      - WCN7850: support for Wake on WLAN
    - Microchip (wilc1000):
      - read MAC address during probe to make it visible to user space
      - suspend/resume improvements
    - TI (wl18xx):
      - support newer firmware versions
    - RealTek (rtw89):
      - preparation for RTL8852BE-VT support
      - Wake on WLAN support for WiFi 6 chips
      - 36-bit PCI DMA support
    - RealTek (rtlwifi):
      - RTL8192DU support
    - Broadcom (brcmfmac):
      - Management Frame Protection support (to enable WPA3)
 
  - Bluetooth:
    - qualcomm: use the power sequencer for QCA6390
    - btusb: mediatek: add ISO data transmission functions
    - hci_bcm4377: add BCM4388 support
    - btintel: add support for BlazarU core
    - btintel: add support for Whale Peak2
    - btnxpuart: add support for AW693 A1 chipset
    - btnxpuart: add support for IW615 chipset
    - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S
 IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6
 uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE
 4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2
 j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp
 NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr
 I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD
 TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP
 QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6
 ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP
 6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0
 Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0=
 =opz8
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Not much excitement - a handful of large patchsets (devmem among them)
  did not make it in time.

  Core & protocols:

   - Use local_lock in addition to local_bh_disable() to protect per-CPU
     resources in networking, a step closer for local_bh_disable() not
     to act as a big lock on PREEMPT_RT

   - Use flex array for netdevice priv area, ensure its cache alignment

   - Add a sysctl knob to allow user to specify a default rto_min at
     socket init time. Bit of a big hammer but multiple companies were
     independently carrying such patch downstream so clearly it's useful

   - Support scheduling transmission of packets based on CLOCK_TAI

   - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
     off using cpusets

   - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address

   - Allow configuration of multipath hash seed, to both allow
     synchronizing hashing of two routers, and preventing partial
     accidental sync

   - Improve TCP compliance with RFC 9293 for simultaneous connect()

   - Support sending NAT keepalives in IPsec ESP in UDP states.
     Userspace IKE daemon had to do this before, but the kernel can
     better keep track of it

   - Support sending supervision HSR frames with MAC addresses stored in
     ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled

   - Introduce IPPROTO_SMC for selecting SMC when socket is created

   - Allow UDP GSO transmit from devices with no checksum offload

   - openvswitch: add packet sampling via psample, separating the
     sampled traffic from "upcall" packets sent to user space for
     forwarding

   - nf_tables: shrink memory consumption for transaction objects

  Things we sprinkled into general kernel code:

   - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
     QCA6390)           [ Already merged separately - Linus ]

   - Add IRQ information in sysfs for auxiliary bus

   - Introduce guard definition for local_lock

   - Add aligned flavor of __cacheline_group_{begin, end}() markings for
     grouping fields in structures

  BPF:

   - Notify user space (via epoll) when a struct_ops object is getting
     detached/unregistered

   - Add new kfuncs for a generic, open-coded bits iterator

   - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
     bpf_list_head

   - Support resilient split BTF which cuts down on duplication and
     makes BTF as compact as possible WRT BTF from modules

   - Add support for dumping kfunc prototypes from BTF which enables
     both detecting as well as dumping compilable prototypes for kfuncs

   - riscv64 BPF JIT improvements in particular to add 12-argument
     support for BPF trampolines and to utilize bpf_prog_pack for the
     latter

   - Add the capability to offload the netfilter flowtable in XDP layer
     through kfuncs

  Driver API:

   - Allow users to configure IRQ tresholds between which automatic IRQ
     moderation can choose

   - Expand Power Sourcing (PoE) status with power, class and failure
     reason. Support setting power limits

   - Track additional RSS contexts in the core, make sure configuration
     changes don't break them

   - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
     ESP data paths

   - Support updating firmware on SFP modules

  Tests and tooling:

   - mptcp: use net/lib.sh to manage netns

   - TCP-AO and TCP-MD5: replace debug prints used by tests with
     tracepoints

   - openvswitch: make test self-contained (don't depend on OvS CLI
     tools)

  Drivers:

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - increase the max total outstanding PTP TX packets to 4
         - add timestamping statistics support
         - implement netdev_queue_mgmt_ops
         - support new RSS context API
      - Intel (100G, ice, idpf):
         - implement FEC statistics and dumping signal quality indicators
         - support E825C products (with 56Gbps PHYs)
      - nVidia/Mellanox:
         - support HW-GRO
         - mlx4/mlx5: support per-queue statistics via netlink
         - obey the max number of EQs setting in sub-functions
      - AMD/Solarflare:
         - support new RSS context API
      - AMD/Pensando:
         - ionic: rework fix for doorbell miss to lower overhead and
           skip it on new HW
      - Wangxun:
         - txgbe: support Flow Director perfect filters

   - Ethernet NICs consumer, embedded and virtual:
      - Add driver for Tehuti Networks TN40xx chips
      - Add driver for Meta's internal NIC chips
      - Add driver for Ethernet MAC on Airoha EN7581 SoCs
      - Add driver for Renesas Ethernet-TSN devices
      - Google cloud vNIC:
         - flow steering support
      - Microsoft vNIC:
         - support page sizes other than 4KB on ARM64
      - vmware vNIC:
         - support latency measurement (update to version 9)
      - VirtIO net:
         - support for Byte Queue Limits
         - support configuring thresholds for automatic IRQ moderation
         - support for AF_XDP Rx zero-copy
      - Synopsys (stmmac):
         - support for STM32MP13 SoC
         - let platforms select the right PCS implementation
      - TI:
         - icssg-prueth: add multicast filtering support
         - icssg-prueth: enable PTP timestamping and PPS
      - Renesas:
         - ravb: improve Rx performance 30-400% by using page pool,
           theaded NAPI and timer-based IRQ coalescing
         - ravb: add MII support for R-Car V4M
      - Cadence (macb):
         - macb: add ARP support to Wake-On-LAN
      - Cortina:
         - use phylib for RX and TX pause configuration

   - Ethernet switches:
      - nVidia/Mellanox:
         - support configuration of multipath hash seed
         - report more accurate max MTU
         - use page_pool to improve Rx performance
      - MediaTek:
         - mt7530: add support for bridge port isolation
      - Qualcomm:
         - qca8k: add support for bridge port isolation
      - Microchip:
         - lan9371/2: add 100BaseTX PHY support
      - NXP:
         - vsc73xx: implement VLAN operations

   - Ethernet PHYs:
      - aquantia: enable support for aqr115c
      - aquantia: add support for PHY LEDs
      - realtek: add support for rtl8224 2.5Gbps PHY
      - xpcs: add memory-mapped device support
      - add BroadR-Reach link mode and support in Broadcom's PHY driver

   - CAN:
      - add document for ISO 15765-2 protocol support
      - mcp251xfd: workaround for erratum DS80000789E, use timestamps to
        catch when device returns incorrect FIFO status

   - WiFi:
      - mac80211/cfg80211:
         - parse Transmit Power Envelope (TPE) data in mac80211 instead
           of in drivers
         - improvements for 6 GHz regulatory flexibility
         - multi-link improvements
         - support multiple radios per wiphy
         - remove DEAUTH_NEED_MGD_TX_PREP flag
      - Intel (iwlwifi):
         - bump FW API to 91 for BZ/SC devices
         - report 64-bit radiotap timestamp
         - enable P2P low latency by default
         - handle Transmit Power Envelope (TPE) advertised by AP
         - remove support for older FW for new devices
         - fast resume (keeping the device configured)
         - mvm: re-enable Multi-Link Operation (MLO)
         - aggregation (A-MSDU) optimizations
      - MediaTek (mt76):
         - mt7925 Multi-Link Operation (MLO) support
      - Qualcomm (ath10k):
         - LED support for various chipsets
      - Qualcomm (ath12k):
         - remove unsupported Tx monitor handling
         - support channel 2 in 6 GHz band
         - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
         - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
           Advertisements (EMA)
         - support dynamic VLAN
         - add panic handler for resetting the firmware state
         - DebugFS support for datapath statistics
         - WCN7850: support for Wake on WLAN
      - Microchip (wilc1000):
         - read MAC address during probe to make it visible to user space
         - suspend/resume improvements
      - TI (wl18xx):
         - support newer firmware versions
      - RealTek (rtw89):
         - preparation for RTL8852BE-VT support
         - Wake on WLAN support for WiFi 6 chips
         - 36-bit PCI DMA support
      - RealTek (rtlwifi):
         - RTL8192DU support
      - Broadcom (brcmfmac):
         - Management Frame Protection support (to enable WPA3)

   - Bluetooth:
      - qualcomm: use the power sequencer for QCA6390
      - btusb: mediatek: add ISO data transmission functions
      - hci_bcm4377: add BCM4388 support
      - btintel: add support for BlazarU core
      - btintel: add support for Whale Peak2
      - btnxpuart: add support for AW693 A1 chipset
      - btnxpuart: add support for IW615 chipset
      - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"

* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
  eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
  tcp: Replace strncpy() with strscpy()
  wifi: ath12k: fix build vs old compiler
  tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
  eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
  eth: fbnic: Add L2 address programming
  eth: fbnic: Add basic Rx handling
  eth: fbnic: Add basic Tx handling
  eth: fbnic: Add link detection
  eth: fbnic: Add initial messaging to notify FW of our presence
  eth: fbnic: Implement Rx queue alloc/start/stop/free
  eth: fbnic: Implement Tx queue alloc/start/stop/free
  eth: fbnic: Allocate a netdevice and napi vectors with queues
  eth: fbnic: Add FW communication mechanism
  eth: fbnic: Add message parsing for FW messages
  eth: fbnic: Add register init to set PCIe/Ethernet device config
  eth: fbnic: Allocate core device specific structures and devlink interface
  eth: fbnic: Add scaffolding for Meta's NIC driver
  PCI: Add Meta Platforms vendor ID
  net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
  ...
2024-07-16 19:28:34 -07:00
Linus Torvalds
f8d22a3195 linux_kselftest-kunit-6.11-rc1
This KUnit next update for Linux 6.11-rc1 consists of:
 
 -- adds vm_mmap() allocation resource manager
 -- converts usercopy kselftest to KUnit
 -- disables usercopy testing on !CONFIG_MMU
 -- adds MODULE_DESCRIPTION() to core, list, and usercopy tests
 -- adds tests for assertion formatting functions - assert.c
 -- introduces KUNIT_ASSERT_MEMEQ and KUNIT_ASSERT_MEMNEQ macros
 -- fixes KUNIT_ASSERT_STRNEQ comments to make it clear that it is
    an assertion
 -- renames KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaWpCYACgkQCwJExA0N
 QxwdPQ/9G26Q+xhbieosvXHu/04ZWTcuUP/cFRv56jLH9bKm25YbW8WZzKM/imE5
 So35IT6SIYlwxn9fYyriPz372h3ZC522cu8tIVrUh5Uo3O5LbzQqdrxos9a+RuCg
 u6lenSksAjJRZ3S3IKDJ1ErxLnPYKyjjZFwDmV1+0Xxy30SwzFEbQqj9lY2Q4iGs
 KWBm0lrFPipbHdBqZcPB/mxIDyF6rhe+oeuOPU8uag6ncNN31xMpDanU8O6XEAz9
 QoAiDICANbVKTRKG5xXgmsJtyLF8GON4e49kEYtCLdnESPc39hQtf3cTHeYI22HC
 7OWhhOySifNIukFj1hVtxnN3ZfjtBGmbCwe5rXZFvMovE3YwAplKK61GoOaI9UV0
 qPk5GGrAb/xEh2HZ9tgf8+CsqmnPQLGnVt2h3u3c28u4YzbkinqVj20KYsye39zz
 KzJsO2yDJH4LlIJjc8XWof1cyyo0TIJQVOwJqAieOPePnfs4zabmVOus8y1Cj07V
 iAvQTPPoZ165zA1cl0iSMolKkXeAgf2FjlEGbODrktKKX6Ag/PKVp3e6PW28zJbp
 0p1V1IDQQAlEhbcRAZb+5y1voh+hcy++KyPwpj7lAVkmHd7RoK/mDL3W+oLdOTrB
 aXWs4JOlkmtUaz3EpAQZuvhYWVW7DexR9rU1SF44UAVzSdZSndw=
 =nnFR
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - add vm_mmap() allocation resource manager

 - convert usercopy kselftest to KUnit

 - disable usercopy testing on !CONFIG_MMU

 - add MODULE_DESCRIPTION() to core, list, and usercopy tests

 - add tests for assertion formatting functions - assert.c

 - introduce KUNIT_ASSERT_MEMEQ and KUNIT_ASSERT_MEMNEQ macros

 - fix KUNIT_ASSERT_STRNEQ comments to make it clear that it is an
   assertion

 - rename KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT

* tag 'linux_kselftest-kunit-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Introduce KUNIT_ASSERT_MEMEQ and KUNIT_ASSERT_MEMNEQ macros
  kunit: Rename KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT for readability
  kunit: Fix the comment of KUNIT_ASSERT_STRNEQ as assertion
  kunit: executor: Simplify string allocation handling
  kunit/usercopy: Add missing MODULE_DESCRIPTION()
  kunit/usercopy: Disable testing on !CONFIG_MMU
  usercopy: Convert test_user_copy to KUnit test
  kunit: test: Add vm_mmap() allocation resource manager
  list: test: add the missing MODULE_DESCRIPTION() macro
  kunit: add missing MODULE_DESCRIPTION() macros to core modules
  list: test: remove unused struct 'klist_test_struct'
  kunit: Cover 'assert.c' with tests
2024-07-16 17:42:14 -07:00
Linus Torvalds
f8a8b94d06 sysctl changes for 6.11-rc1
Summary
 
 * Remove "->procname == NULL" check when iterating through sysctl table arrays
 
     Removing sentinels in ctl_table arrays reduces the build time size and
     runtime memory consumed by ~64 bytes per array. With all ctl_table
     sentinels gone, the additional check for ->procname == NULL that worked in
     tandem with the ARRAY_SIZE to calculate the size of the ctl_table arrays is
     no longer needed and has been removed. The sysctl register functions now
     returns an error if a sentinel is used.
 
 * Preparation patches for sysctl constification
 
     Constifying ctl_table structs prevents the modification of proc_handler
     function pointers as they would reside in .rodata. The ctl_table arguments
     in sysctl utility functions are const qualified in preparation for a future
     treewide proc_handler argument constification commit.
 
 * Misc fixes
 
     Increase robustness of set_ownership by providing sane default ownership
     values in case the callee doesn't set them. Bound check proc_dou8vec_minmax
     to avoid loading buggy modules and give sysctl testing module a name to
     avoid compiler complaints.
 
 Testing
 
   * This got push to linux-next in v6.10-rc2, so it has had more than a month
     of testing
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmaWdz4ACgkQupfNUreW
 QU/WKQwAkSuUz42yCQye77BK+Z8ANcTF1f3aI/wfv2nahq1GaSrNBpqUiXvEe9Tt
 KD2lM1PWiQfizVLIDPh96yxa5q69GQrPPOA/V1jwIXmk/HRpjjoONCFNNXVRCTls
 VCqDz/RatuXvzO35Yn87MnWnxv6PiX7X/zq/3WikVsUI381kvTgC6OwZxdFM52w4
 ESwOa3LeOovtRnqV5dpHr6DCQKyd0N52nPxgXvaerjlsJsv7PlezN7z9YyLOOfmW
 xUD7X6LQcJq7HcEukaB6I9o2GQOi4yYXL2YOzed7qu9Thu+lasEoN3Bd7P+ilXkc
 JY6EXJ5o+d69PewKRuJ1QvD7wrHIkhNMNbMtvehNay124wAHDy3KtonFzyvlX4wE
 qCHBYc6rySJNhSqwVp9MoksOZfDM99pVIOs9YVIjc90Zzu5J7tORgYWRVOHTcAtj
 fd8nMdkK3+ZANapygFCyew6GueIzaqlQwveVgLGw4vc5L3ClknmURit3y487Pzdg
 B+BEVlsp
 =bs2G
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:

 - Remove "->procname == NULL" check when iterating through sysctl table
   arrays

   Removing sentinels in ctl_table arrays reduces the build time size
   and runtime memory consumed by ~64 bytes per array. With all
   ctl_table sentinels gone, the additional check for ->procname == NULL
   that worked in tandem with the ARRAY_SIZE to calculate the size of
   the ctl_table arrays is no longer needed and has been removed. The
   sysctl register functions now returns an error if a sentinel is used.

 - Preparation patches for sysctl constification

   Constifying ctl_table structs prevents the modification of
   proc_handler function pointers as they would reside in .rodata. The
   ctl_table arguments in sysctl utility functions are const qualified
   in preparation for a future treewide proc_handler argument
   constification commit.

 - Misc fixes

   Increase robustness of set_ownership by providing sane default
   ownership values in case the callee doesn't set them. Bound check
   proc_dou8vec_minmax to avoid loading buggy modules and give sysctl
   testing module a name to avoid compiler complaints.

* tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: Warn on an empty procname element
  sysctl: Remove ctl_table sentinel code comments
  sysctl: Remove "child" sysctl code comments
  sysctl: Remove superfluous empty allocations from sysctl internals
  sysctl: Replace nr_entries with ctl_table_size in new_links
  sysctl: Remove check for sentinel element in ctl_table arrays
  mm profiling: Remove superfluous sentinel element from ctl_table
  locking: Remove superfluous sentinel element from kern_lockdep_table
  sysctl: Add module description to sysctl-testing
  sysctl: constify ctl_table arguments of utility function
  utsname: constify ctl_table arguments of utility function
  sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_array
  sysctl: always initialize i_uid/i_gid
2024-07-16 14:24:29 -07:00
Linus Torvalds
ce5a51bfac hardening updates for v6.11-rc1
- lkdtm/bugs: add test for hung smp_call_function_single() (Mark Rutland)
 
 - gcc-plugins: Remove duplicate included header file stringpool.h
   (Thorsten Blum)
 
 - ARM: Remove address checking for MMUless devices (Yanjun Yang)
 
 - randomize_kstack: Clean up per-arch entropy and codegen
 
 - KCFI: Make FineIBT mode Kconfig selectable
 
 - fortify: Do not special-case 0-sized destinations
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmaVT2IACgkQiXL039xt
 wCbq8A//RhxTdr+l/h2gyMy/Lcy/NMR9KEWklnxdftuM1V1Kzr53yeH/g6Ehw69g
 e8Ag3Sp7Fn4rNBVa+tY6RqzKwfrUHIbeewGI4LkRe19NDWFWc/Od+4tamfRSPf9c
 GL9ZnJZviRm3zByetwr4CbS69HocXFFSSgcpIv/7xOd+haSWWdvEc3KcSnavY/aq
 8wQPkZxzy8ESkOajZj2k0E2l9JP42Ex20qy0KcjweSSYVafKmbTxhKZgriwAKMCD
 Yj2m55fbD6D08vd0Y6S7H4TPilYtRbulXR9FNMtw59UpKeoUceEmyn4B43psDvau
 9XuJF/oFKrXBEJG+OUZogNu5L6uYUaNdYdtb43upu9lCsjrAjmMYfmXDHO2E40V8
 76MikxHtyFAPEzUwg/BH2CGUu9hil+FADd28s8zLuUBpRDitgYudQD+Cqrc34b6s
 QlAX19bX7KFgXqlsdwy6zJNSd3dpoMBVsP58/EhQQfiqv/ZU2TOryZenz0URlH+k
 ZCAbpXYRAzTyGz23qkutRO+6MiKXoheE7gmd9jESiaqyXe2Q6mIMPyoFU50458TH
 xXhXbZc7War8vbJLyWF7fvK/GlooTHu4xOxfNTsxKWiYShI01iiwG1hH+j4ZDVOG
 NBBK2AfX9GM8AOHJolp5EaGmon0AoVsxbRANSs1K4qZ93WTNGLk=
 =LoG2
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:

 - lkdtm/bugs: add test for hung smp_call_function_single() (Mark
   Rutland)

 - gcc-plugins: Remove duplicate included header file stringpool.h
   (Thorsten Blum)

 - ARM: Remove address checking for MMUless devices (Yanjun Yang)

 - randomize_kstack: Clean up per-arch entropy and codegen

 - KCFI: Make FineIBT mode Kconfig selectable

 - fortify: Do not special-case 0-sized destinations

* tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randomize_kstack: Improve stack alignment codegen
  ARM: Remove address checking for MMUless devices
  gcc-plugins: Remove duplicate included header file stringpool.h
  randomize_kstack: Remove non-functional per-arch entropy filtering
  fortify: Do not special-case 0-sized destinations
  x86/alternatives: Make FineIBT mode Kconfig selectable
  lkdtm/bugs: add test for hung smp_call_function_single()
2024-07-16 13:45:43 -07:00
Linus Torvalds
4fd9435641 Updates for timers, timekeeping and related functionality:
- Core:
 
     - Make the takeover of a hrtimer based broadcast timer reliable during
       CPU hot-unplug. The current implementation suffers from a race which
       can lead to broadcast timer starvation in the worst case.
 
     - VDSO related cleanups and simplifications
 
     - Small cleanups and enhancements all over the place
 
   - PTP:
 
     - Replace the architecture specific base clock to clocksource, e.g. ART
       to TSC, conversion function with generic functionality to avoid
       exposing such internals to drivers and convert all existing drivers
       over. This also allows to provide functionality which converts the
       other way round in the core code based on the same parameter set.
 
     - Provide a function to convert CLOCK_REALTIME to the base clock to
       support the upcoming PPS output driver on Intel platforms.
 
   - Drivers:
 
     - A set of Device Tree bindings for new hardware
 
     - Cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmaUOM0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYofolD/9kK+aYdDj1gCFuZXZ2wTgMMxFmf/91
 0UcsGRuBJiIXs3H3iizQ0Mb0cdTW6qZJoBp0jPlvUSm0BEKdEgE1uRX2RuAPZ/Gq
 4/54ZJVopKSgAqeJFmqQubRVSv2XdMRAAJT0o1oUG3jZ0c6u8vqArIh5ZCnu13l/
 tsNOeYLYzQFyA30eHSJ/KjQ2zHwAhJnl5a/b7pdAvxmlN37bGgKEpglv+9zwFiDB
 K/kWbpb/oED9WOmoQy5QYi8iSvLQHEhFGrqzXV3fegu/B/mBBf/bpsisVx7Z1m2R
 nzxNqg86RdMjNR6giwBETZjm7YxM+gKb9nCBNILjbjWZFC4tyrBkLGJ+KniTRNyZ
 M5R4X1oP/14h00qXmCgIEFWysXaJRewYI+TIm8R2rLXrR6Tf3c4oL6fHQJxy3X52
 7A+4Z/vOk/KX6PxYmLC+xQDukhFh2nirVYsP1oNM9yC9zR/wkBBXTTmUSAI+8m8l
 KphniSPS2HMSBI6TtgOT8SKY7lRUZTnafBZq7wRXCv0Zz8AXoofgQDmBkXC99BkB
 MjLvRotJVJvY9a8LtA7htjDg/jiEMa0wHRNAGNSbflKoAKrJzoE5WbFxFZKbq3vZ
 o8cEYRMAIP+X+qn+oymT45XXXQlifZiccJdAi9FqDTvplEib2jmTmH6Ae5Khkr4l
 Lbzh/nSKVN7lOg==
 =8GjP
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for timers, timekeeping and related functionality:

  Core:

   - Make the takeover of a hrtimer based broadcast timer reliable
     during CPU hot-unplug. The current implementation suffers from a
     race which can lead to broadcast timer starvation in the worst
     case.

   - VDSO related cleanups and simplifications

   - Small cleanups and enhancements all over the place

  PTP:

   - Replace the architecture specific base clock to clocksource, e.g.
     ART to TSC, conversion function with generic functionality to avoid
     exposing such internals to drivers and convert all existing drivers
     over. This also allows to provide functionality which converts the
     other way round in the core code based on the same parameter set.

   - Provide a function to convert CLOCK_REALTIME to the base clock to
     support the upcoming PPS output driver on Intel platforms.

  Drivers:

   - A set of Device Tree bindings for new hardware

   - Cleanups and enhancements all over the place"

* tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  clocksource/drivers/realtek: Add timer driver for rtl-otto platforms
  dt-bindings: timer: Add schema for realtek,otto-timer
  dt-bindings: timer: Add SOPHGO SG2002 clint
  dt-bindings: timer: renesas,tmu: Add R-Car Gen2 support
  dt-bindings: timer: renesas,tmu: Add RZ/G1 support
  dt-bindings: timer: renesas,tmu: Add R-Mobile APE6 support
  clocksource/drivers/mips-gic-timer: Correct sched_clock width
  clocksource/drivers/mips-gic-timer: Refine rating computation
  clocksource/drivers/sh_cmt: Address race condition for clock events
  clocksource/driver/arm_global_timer: Remove unnecessary ‘0’ values from err
  clocksource/drivers/arm_arch_timer: Remove unnecessary ‘0’ values from irq
  tick/broadcast: Make takeover of broadcast hrtimer reliable
  tick/sched: Combine WARN_ON_ONCE and print_once
  x86/vdso: Remove unused include
  x86/vgtod: Remove unused typedef gtod_long_t
  x86/vdso: Fix function reference in comment
  vdso: Add comment about reason for vdso struct ordering
  vdso/gettimeofday: Clarify comment about open coded function
  timekeeping: Add missing kernel-doc function comments
  tick: Remove unnused tick_nohz_get_idle_calls()
  ...
2024-07-15 15:03:09 -07:00
Linus Torvalds
0e4b77d4ea A single update for debugobjects to annotate all intentionally racy global
debug variables so that KCSAN ignores them.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmaULP4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoT6pEACXmc34OzO3jbOGEmgt5ch0cYSNvlY0
 AL0iAV5JakC8AGDWeDNAUhR5r7tuNqjMmiy/XH+uR/4+xCZLZvQp7flyhrm/W7vd
 rB3slu4xqqHizoQe81ZdH3ffg7Cj/Q/zqcJTv44UYkWLlAKA92S79bsn903UHpnL
 ENH0IMulpP0b3GedV3GySz476kyAJX4ZJHXfsG71oyWz8gJahXfaDzSMqnMW0bLG
 z0u51D9Q2R60zYpEsSPfBCKERKZ+Dzbn/YOYF85kytpXkVQd183JY05IkZmDgxyB
 O973GgxvPGXZMXrUfhd+h7Kr17TiG+OKFpxhxgGCQoJNebFUt4A+QFWwQ7/FE/TN
 FmjvwTBHllrLpucskivvI6zEETnJB/13XBB/T3k0BMB3cFfUiXdQS0N+xOBVoAhD
 CLo21kG+xNPbzuKwzKx1+Vb/FH8/aoKp6py5kQlKAtQ6ddfqyvyGN3TZKYQGl3Hk
 9o1ZuwlfkpG0a/0GKvyPcUeLUP0IagGe1wrOard+uL2VRlPRTnr4GH7ItTEedmAY
 JRlCD0A1GQzwVtOy+D54W0G0ueW/tX76QzxuIJj5wwmZQpcV37eTOfIbZXnk4RzS
 TZJ6gjxSLGbjYMbTiIcTFBU6UXhKjkE30bb5gPdzpXh8QtI1SSqpftZszqTAXWA3
 qbMwI0/csYVXsg==
 =PuR2
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects update from Thomas Gleixner:
 "A single update for debugobjects to annotate all intentionally racy
  global debug variables so that KCSAN ignores them"

* tag 'core-debugobjects-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Annotate racy debug variables
2024-07-15 14:52:21 -07:00
Vlastimil Babka
436381eaf2 Merge branch 'slab/for-6.11/buckets' into slab/for-next
Merge all the slab patches previously collected on top of v6.10-rc1,
over cleanups/fixes that had to be based on rc6.
2024-07-15 10:44:16 +02:00
Linus Torvalds
882ddcd1bf Kbuild fixes for v6.10 (fourth)
- Make scripts/ld-version.sh robust against the latest LLD
 
  - Fix warnings in rpm-pkg with device tree support
 
  - Fix warnings in fortify tests with KASAN
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmaUM9kVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGHEYQAKvrP1IzwlkANEEj/2qW1iGHlUod
 im3cFCKyxrlBar71n15fYtclhK0N4GTVAUMHAk0d1GZo8UjuCeUzurHM4o53Hu1P
 D0pXbkmiA0YgndpJYQpSV0CrLxCOCVAEFRf7AgdolVjpNLuba4z0bXSTQfEwfHKC
 W1igTL2vGG8citbfHhEGZfB7AIEQBB0LtNkarpsDVD39rG+blZAABBLEtCueSVtw
 rVX/Yuny9nDET6tlaCNgr2esNfkrHPIxOSsufeLWdhVVIZprPSGERflhkL3yE299
 v6R45ANn72iVNKnnmmjxTNeezIpr74w1NSzBJ0jRM1KRqzbsEuFAf0ZNamtoUJ4r
 m4tSu5l7lDj86APvehoO2o07A3omd8vcgLPt+lZlFsBIjorVIKovsjix6pVUgHlS
 BTvxbSojbSMUa/NrkbosJkLo/6TzZxYxHKr17nxk+HsXu0i9A9IiHPBK5dTcbtua
 olp1MKolQG78FYMwl7v4yQithawRG0mNDLJ2J8oTEIATXQtXV0WAaje73qQFIs6I
 cMBEfeaDAMH4z0/VvKZsdksXFPDrrjoW0/x1tPqcAgOSyacPGbki4asn52rwDHT5
 mfAzlnUc8ts56sBasArmMpk0z+PKC4MZeFUXNGJf7bZ3NZqoDRDHpb69Aqs11vSw
 AJa9Kj07o5YD8N+E
 =MMw8
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Make scripts/ld-version.sh robust against the latest LLD

 - Fix warnings in rpm-pkg with device tree support

 - Fix warnings in fortify tests with KASAN

* tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  fortify: fix warnings in fortify tests with KASAN
  kbuild: rpm-pkg: avoid the warnings with dtb's listed twice
  kbuild: Make ld-version.sh more robust against version string changes
2024-07-14 15:29:35 -07:00
Masahiro Yamada
84679f04ce fortify: fix warnings in fortify tests with KASAN
When a software KASAN mode is enabled, the fortify tests emit warnings
on some architectures.

For example, for ARCH=arm, the combination of CONFIG_FORTIFY_SOURCE=y
and CONFIG_KASAN=y produces the following warnings:

    TEST    lib/test_fortify/read_overflow-memchr.log
  warning: unsafe memchr() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memchr.c
    TEST    lib/test_fortify/read_overflow-memchr_inv.log
  warning: unsafe memchr_inv() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memchr_inv.c
    TEST    lib/test_fortify/read_overflow-memcmp.log
  warning: unsafe memcmp() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memcmp.c
    TEST    lib/test_fortify/read_overflow-memscan.log
  warning: unsafe memscan() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memscan.c
    TEST    lib/test_fortify/read_overflow2-memcmp.log
  warning: unsafe memcmp() usage lacked '__read_overflow2' warning in lib/test_fortify/read_overflow2-memcmp.c
     [ more and more similar warnings... ]

Commit 9c2d1328f8 ("kbuild: provide reasonable defaults for tool
coverage") removed KASAN flags from non-kernel objects by default.
It was an intended behavior because lib/test_fortify/*.c are unit
tests that are not linked to the kernel.

As it turns out, some architectures require -fsanitize=kernel-(hw)address
to define __SANITIZE_ADDRESS__ for the fortify tests.

Without __SANITIZE_ADDRESS__ defined, arch/arm/include/asm/string.h
defines __NO_FORTIFY, thus excluding <linux/fortify-string.h>.

This issue does not occur on x86 thanks to commit 4ec4190be4
("kasan, x86: don't rename memintrinsics in uninstrumented files"),
but there are still some architectures that define __NO_FORTIFY
in such a situation.

Set KASAN_SANITIZE=y explicitly to the fortify tests.

Fixes: 9c2d1328f8 ("kbuild: provide reasonable defaults for tool coverage")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/all/0e8dee26-41cc-41ae-9493-10cd1a8e3268@app.fastmail.com/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-15 04:53:49 +09:00
Thomas Gleixner
b7625d67eb - Remove unnecessary local variables initialization as they will be
initialized in the code path anyway right after on the ARM arch
   timer and the ARM global timer (Li kunyu)
 
 - Fix a race condition in the interrupt leading to a deadlock on the
   SH CMT driver. Note that this fix was not tested on the platform
   using this timer but the fix seems reasonable enough to be picked
   confidently (Niklas Söderlund)
 
 - Increase the rating of the gic-timer and use the configured width
   clocksource register on the MIPS architecture (Jiaxun Yang)
 
 - Add the DT bindings for the TMU on the Renesas platforms (Geert
   Uytterhoeven)
 
 - Add the DT bindings for the SOPHGO SG2002 clint on RiscV (Thomas
   Bonnefille)
 
 - Add the rtl-otto timer driver along with the DT bindings for the
   Realtek platform (Chris Packham)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmaRQh0ACgkQqDIjiipP
 6E+rfQgAqkAWZ9BjswxV8Fg+Hj+a1cSohKjDczqitQF5rJm25X5VvMwlXVa3XQGm
 yemh4tKPpll02LOiYCTyqOWzNrkVS9VsoBd5rrYjRX5aSv7UD35EXklLj4P/INwX
 O9CRGD6aK4Xbw66xxheYHSSh+2iRs2x2mq61+/VdcIBlAwpQo+vx7McRoJZZI+2t
 NFIXw8RF5dDlmmAaqiB0WnPAtcOK3SDo9fu1LEAX1ZAzvbZriLo7XLnL7ibySWVe
 BW1n7Ore6PN5Dvz7jMfTsOQsgAlVv6MPfp/s4EDqMfBLVqXNirzXrdhiee/ahnYP
 vyzQyU5HPCMiIYS45mhJF0OyDd3wyw==
 =wuYA
 -----END PGP SIGNATURE-----

Merge tag 'timers-v6.11-rc1' of https://git.linaro.org/people/daniel.lezcano/linux into timers/core

Pull clocksource/event driver updates from Daniel Lezcano:

  - Remove unnecessary local variables initialization as they will be
    initialized in the code path anyway right after on the ARM arch
    timer and the ARM global timer (Li kunyu)

  - Fix a race condition in the interrupt leading to a deadlock on the
    SH CMT driver. Note that this fix was not tested on the platform
    using this timer but the fix seems reasonable enough to be picked
    confidently (Niklas Söderlund)

  - Increase the rating of the gic-timer and use the configured width
    clocksource register on the MIPS architecture (Jiaxun Yang)

  - Add the DT bindings for the TMU on the Renesas platforms (Geert
    Uytterhoeven)

  - Add the DT bindings for the SOPHGO SG2002 clint on RiscV (Thomas
    Bonnefille)

  - Add the rtl-otto timer driver along with the DT bindings for the
    Realtek platform (Chris Packham)

Link: https://lore.kernel.org/all/91cd05de-4c5d-4242-a381-3b8a4fe6a2a2@linaro.org
2024-07-13 12:07:10 +02:00
Dan Carpenter
fe69b772e3 crypto: lib/mpi - delete unnecessary condition
We checked that "nlimbs" is non-zero in the outside if statement so delete
the duplicate check here.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-07-13 11:50:28 +12:00
Thorsten Blum
e1fb7430fc lib/bch.c: use swap() to improve code
Use the swap() macro to simplify the functions solve_linear_system() and
gf_poly_gcd() and improve their readability.  Remove the local variable
tmp.

Fixes the following three Coccinelle/coccicheck warnings reported by
swap.cocci:

  WARNING opportunity for swap()
  WARNING opportunity for swap()
  WARNING opportunity for swap()

Link: https://lkml.kernel.org/r/20240708224023.9312-2-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12 16:39:53 -07:00
Chen Ni
4f5d4a1ba7 test_bpf: convert comma to semicolon
Replace commas between expression statements with semicolons.

Link: https://lkml.kernel.org/r/20240709034323.586185-1-nichen@iscas.ac.cn
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12 16:39:52 -07:00
Kees Cook
7554a7b96d kunit: executor: Simplify string allocation handling
The alloc/copy code pattern is better consolidated to single kstrdup (and
kstrndup) calls instead. This gets rid of deprecated[1] strncpy() uses as
well. Replace one other strncpy() use with the more idiomatic strscpy().

Link: https://github.com/KSPP/linux/issues/90 [1]
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-12 10:11:48 -06:00
Thorsten Blum
0d9c0a67b1 bootconfig: Remove duplicate included header file linux/bootconfig.h
The header file linux/bootconfig.h is included whether __KERNEL__ is
defined or not.

Include it only once before the #ifdef/#else/#endif preprocessor
directives and remove the following make includecheck warning:

  linux/bootconfig.h is included more than once

Move the comment to the top and delete the now empty #else block.

Link: https://lore.kernel.org/all/20240711084315.1507-1-thorsten.blum@toblux.com/

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-07-12 08:55:02 +09:00
Jakub Kicinski
7c8267275d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/sched/act_ct.c
  26488172b0 ("net/sched: Fix UAF when resolving a clash")
  3abbd7ed8b ("act_ct: prepare for stolen verdict coming from conntrack and nat engine")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11 12:58:13 -07:00
Linus Torvalds
9d9a2f29ae 21 hotfixes, 15 of which are cc:stable.
No identifiable theme here - all are singleton patches, 19 are for MM.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZo7tTQAKCRDdBJ7gKXxA
 jvhZAP977PnAwQH5khIS3xJxZrqx/+Tho7UPZzQPvHJPRpHorAD/TZfDazGtlPMD
 uLPEVslh18rks/w+kddLrnlBnkpUMwY=
 =vhts
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "21 hotfixes, 15 of which are cc:stable.

  No identifiable theme here - all are singleton patches, 19 are for MM"

* tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
  mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
  mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio()
  filemap: replace pte_offset_map() with pte_offset_map_nolock()
  arch/xtensa: always_inline get_current() and current_thread_info()
  sched.h: always_inline alloc_tag_{save|restore} to fix modpost warnings
  MAINTAINERS: mailmap: update Lorenzo Stoakes's email address
  mm: fix crashes from deferred split racing folio migration
  lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat
  mm: gup: stop abusing try_grab_folio
  nilfs2: fix kernel bug on rename operation of broken directory
  mm/hugetlb_vmemmap: fix race with speculative PFN walkers
  cachestat: do not flush stats in recency check
  mm/shmem: disable PMD-sized page cache if needed
  mm/filemap: skip to create PMD-sized page cache if needed
  mm/readahead: limit page cache size in page_cache_ra_order()
  mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray
  mm/damon/core: merge regions aggressively when max_nr_regions is unmet
  Fix userfaultfd_api to return EINVAL as expected
  mm: vmalloc: check if a hash-index is in cpu_possible_mask
  mm: prevent derefencing NULL ptr in pfn_section_valid()
  ...
2024-07-10 14:59:41 -07:00
Linus Torvalds
f6963ab4b0 bcachefs fixes for 6.10-rc8
- Switch some asserts to WARN()
 - Fix a few "transaction not locked" asserts in the data read retry
   paths and backpointers gc
 - Fix a race that would cause the journal to get stuck on a flush commit
 - Add missing fsck checks for the fragmentation LRU
 - The usual assorted ssorted syzbot fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmaOuRwACgkQE6szbY3K
 bnaCHhAAi9VRqws+zx3fSpe2OMwWqAEWA84QgIFJccy+I86d7dXkqG389gFqJwMG
 9S3BUHP1WooJmpsTRhK5cNtxZuKKOajXlxUYz3onsF7O/U3dHFY5GU7yIIjXS/0o
 q7+iryWAJ4MmlOrAJhgPMH/WlhbSVsjANUN0n/NhlOWHccFGHmpdMTb6aYzb+lfL
 iZOONKmEOR65gLzZYlO323OB2Tv00iEbOZAtxk68BLZYX+WON/j1T1A8gK4G0XSX
 8wcYpXNxGGkCufjBfAbXf4mcp/WygQq0Wj3bdVMFkZ+AwSJDcfGeK1H7f6tJ9e4n
 lqfWL4tgWIckS+41sA96B5cYry9TMDdhu3IeFaAm0ZrF55JT1JySGE1GNA+mo6xA
 mkMAqhG7rwYh6nSJfWX0Ie+zJ9TFbmi05ZbI7jaTuQjnJ5uvPpTuRfBDi+qSWmoi
 +IBDAi9hZgCUNEsLRGDm7RDQo0dpbFo6jpArn1RHK4MO/HkTrqcKpTqiGnfwFAU4
 PFxwq5G9+d38+M6YMX0tXdfQ+fdxroA6aIBJSsIpF18tPRBOBlQsM2GFP34uHbyk
 L6HOzed2QpM5ExBmViX79F+obuDQ/gzXQszYvDKL4QTFNbx43gPWRDrGm8EQen6y
 12EScamXbUWBSWnOqxscmeUsTdTKxLfw/F43JbE2fE7jSxc5tss=
 =VGT8
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:

 - Switch some asserts to WARN()

 - Fix a few "transaction not locked" asserts in the data read retry
   paths and backpointers gc

 - Fix a race that would cause the journal to get stuck on a flush
   commit

 - Add missing fsck checks for the fragmentation LRU

 - The usual assorted ssorted syzbot fixes

* tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs: (22 commits)
  bcachefs: Add missing bch2_trans_begin()
  bcachefs: Fix missing error check in journal_entry_btree_keys_validate()
  bcachefs: Warn on attempting a move with no replicas
  bcachefs: bch2_data_update_to_text()
  bcachefs: Log mount failure error code
  bcachefs: Fix undefined behaviour in eytzinger1_first()
  bcachefs: Mark bch_inode_info as SLAB_ACCOUNT
  bcachefs: Fix bch2_inode_insert() race path for tmpfiles
  closures: fix closure_sync + closure debugging
  bcachefs: Fix journal getting stuck on a flush commit
  bcachefs: io clock: run timer fns under clock lock
  bcachefs: Repair fragmentation_lru in alloc_write_key()
  bcachefs: add check for missing fragmentation in check_alloc_to_lru_ref()
  bcachefs: bch2_btree_write_buffer_maybe_flush()
  bcachefs: Add missing printbuf_tabstops_reset() calls
  bcachefs: Fix loop restart in bch2_btree_transactions_read()
  bcachefs: Fix bch2_read_retry_nodecode()
  bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs
  bcachefs: Fix shift greater than integer size
  bcachefs: Change bch2_fs_journal_stop() BUG_ON() to warning
  ...
2024-07-10 11:50:16 -07:00
Kent Overstreet
29f1c1ae6d closures: fix closure_sync + closure debugging
originally, stack closures were only used synchronously, and with the
original implementation of closure_sync() the ref never hit 0; thus,
closure_put_after_sub() assumes that if the ref hits 0 it's on the debug
list, in debug mode.

that's no longer true with the current implementation of closure_sync,
so we need a new magic so closure_debug_destroy() doesn't pop an assert.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-10 09:53:39 -04:00
Paolo Abeni
7b769adc26 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoxN0AAKCRDbK58LschI
 g0c5AQDa3ZV9gfbN42y1zSDoM1uOgO60fb+ydxyOYh8l3+OiQQD/fLfpTY3gBFSY
 9yi/pZhw/QdNzQskHNIBrHFGtJbMxgs=
 =p1Zz
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-07-08

The following pull-request contains BPF updates for your *net-next* tree.

We've added 102 non-merge commits during the last 28 day(s) which contain
a total of 127 files changed, 4606 insertions(+), 980 deletions(-).

The main changes are:

1) Support resilient split BTF which cuts down on duplication and makes BTF
   as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.

2) Add support for dumping kfunc prototypes from BTF which enables both detecting
   as well as dumping compilable prototypes for kfuncs, from Daniel Xu.

3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement
   support for BPF exceptions, from Ilya Leoshkevich.

4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support
   for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui.

5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option
   for skbs and add coverage along with it, from Vadim Fedorenko.

6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives
   a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan.

7) Extend the BPF verifier to track the delta between linked registers in order
   to better deal with recent LLVM code optimizations, from Alexei Starovoitov.

8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should
   have been a pointer to the map value, from Benjamin Tissoires.

9) Extend BPF selftests to add regular expression support for test output matching
   and adjust some of the selftest when compiled under gcc, from Cupertino Miranda.

10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always
    iterates exactly once anyway, from Dan Carpenter.

11) Add the capability to offload the netfilter flowtable in XDP layer through
    kfuncs, from Florian Westphal & Lorenzo Bianconi.

12) Various cleanups in networking helpers in BPF selftests to shave off a few
    lines of open-coded functions on client/server handling, from Geliang Tang.

13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so
    that x86 JIT does not need to implement detection, from Leon Hwang.

14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an
    out-of-bounds memory access for dynpointers, from Matt Bobrowski.

15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as
    it might lead to problems on 32-bit archs, from Jiri Olsa.

16) Enhance traffic validation and dynamic batch size support in xsk selftests,
    from Tushar Vyavahare.

bpf-next-for-netdev

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits)
  selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep
  selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
  bpf: helpers: fix bpf_wq_set_callback_impl signature
  libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
  selftests/bpf: Remove exceptions tests from DENYLIST.s390x
  s390/bpf: Implement exceptions
  s390/bpf: Change seen_reg to a mask
  bpf: Remove unnecessary loop in task_file_seq_get_next()
  riscv, bpf: Optimize stack usage of trampoline
  bpf, devmap: Add .map_alloc_check
  selftests/bpf: Remove arena tests from DENYLIST.s390x
  selftests/bpf: Add UAF tests for arena atomics
  selftests/bpf: Introduce __arena_global
  s390/bpf: Support arena atomics
  s390/bpf: Enable arena
  s390/bpf: Support address space cast instruction
  s390/bpf: Support BPF_PROBE_MEM32
  s390/bpf: Land on the next JITed instruction after exception
  s390/bpf: Introduce pre- and post- probe functions
  s390/bpf: Get rid of get_probe_mem_regno()
  ...
====================

Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09 17:01:46 +02:00
Arnd Bergmann
1f9a8286bc uaccess: always export _copy_[from|to]_user with CONFIG_RUST
Rust code needs to be able to access _copy_from_user and _copy_to_user
so that it can skip the check_copy_size check in cases where the length
is known at compile-time, mirroring the logic for when C code will skip
check_copy_size. To do this, we ensure that exported versions of these
methods are available when CONFIG_RUST is enabled.

Alice has verified that this patch passes the CONFIG_TEST_USER_COPY test
on x86 using the Android cuttlefish emulator.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240528-alice-mm-v7-2-78222c31b8f4@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-07-08 23:44:01 +02:00
Andrew Morton
8ef6fd0e9e Merge branch 'mm-hotfixes-stable' into mm-stable to pick up "mm: fix
crashes from deferred split racing folio migration", needed by "mm:
migrate: split folio_migrate_mapping()".
2024-07-06 11:44:41 -07:00
Paul Menzel
2fe29fe945 lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat
On a system with Perl 5.12.1, commit 5ef6dc08cf
("lib/build_OID_registry: don't mention the full path of the script in
output") causes the build to fail with the error below.

     Bareword found where operator expected at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
     syntax error at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
     Execution of ./lib/build_OID_registry aborted due to compilation errors.
     make[3]: *** [lib/Makefile:352: lib/oid_registry_data.c] Error 255

Ahmad Fatoum analyzed that non-destructive substitution is only supported since
Perl 5.13.2. Instead of dropping `r` and having the side effect of modifying
`$0`, introduce a dedicated variable to support older Perl versions.

Link: https://lkml.kernel.org/r/20240702223512.8329-2-pmenzel@molgen.mpg.de
Link: https://lkml.kernel.org/r/20240701155802.75152-1-pmenzel@molgen.mpg.de
Fixes: 5ef6dc08cf ("lib/build_OID_registry: don't mention the full path of the script in output")
Link: https://lore.kernel.org/all/259f7a87-2692-480e-9073-1c1c35b52f67@molgen.mpg.de/
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Suggested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-06 11:39:51 -07:00
Daniel Vetter
86634fa4e6 Linux 6.10-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmaB0NweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkvwH/36UJRk/o6wvXnyH
 E6QjCSWo2226APyWks22NjtC3I/8Iqdvkneuh6wG0qL2sXAB078EMjUq5R81bF8H
 wWFBJwetjYTp8GEyLioMEb2wCH/J3R29dLFC4UYTplafXRGP6//xcpJaKmTxcgdR
 31IzvTPXbApZ7L3k1U6rA2bK9PNKcFCOvZlrNMUCuwMrabymHsDfOUt1DqXyg2xp
 zjqiWYBwlklozmgawSWt/mdEgkWuTcAbg+KyqDVQF59s9aj/OOwZ0j+HACq5V8CM
 quTPIAYL6CC9p7uxa69lGr/sgC0Is/BZLPX7RTZAwCgarGvnX+1HUsjDcaFCtrVg
 O6fPUV8=
 =pgUx
 -----END PGP SIGNATURE-----

Merge v6.10-rc6 into drm-next

The exynos-next pull is based on a newer -rc than drm-next. hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.

Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-07-05 10:47:28 +02:00
Jeff Johnson
8547d1150f math: rational: add missing MODULE_DESCRIPTION() macro
With ARCH=sh, make allmodconfig && make W=1 C=1 reports: WARNING: modpost:
missing MODULE_DESCRIPTION() in lib/math/rational.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240702-md-sh-lib-math-v1-1-93f4ac4fa8fd@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Jeff Johnson
bee6c683de lib/zlib: add missing MODULE_DESCRIPTION() macro
With ARCH=csky, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/zlib_deflate/zlib_deflate.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240613-md-csky-lib-zlib_deflate-v1-1-83504d9a27d6@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Guo Ren <guoren@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Hsin Chang Yu
cedb08caac lib/rbtree.c: fix the example typo
Replace the "Sr" with "sr", the example is wrong if sl and N don't have
child nodes, so sr should be red node.

Link: https://lkml.kernel.org/r/20240628142229.69419-1-zxcvb600870024@gmail.com
Signed-off-by: Hsin Chang Yu <zxcvb600870024@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:10 -07:00
Jakub Kicinski
76ed626479 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/phy/aquantia/aquantia.h
  219343755e ("net: phy: aquantia: add missing include guards")
  61578f6793 ("net: phy: aquantia: add support for PHY LEDs")

drivers/net/ethernet/wangxun/libwx/wx_hw.c
  bd07a98178 ("net: txgbe: remove separate irq request for MSI and INTx")
  b501d261a5 ("net: txgbe: add FDIR ATR support")
https://lore.kernel.org/all/20240703112936.483c1975@canb.auug.org.au/

include/linux/mlx5/mlx5_ifc.h
  048a403648 ("net/mlx5: IFC updates for changing max EQs")
  99be56171f ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
https://lore.kernel.org/all/20240701133951.6926b2e3@canb.auug.org.au/

Adjacent changes:

drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
  4130c67cd1 ("wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference")
  3f3126515f ("wifi: iwlwifi: mvm: add mvm-specific guard")

include/net/mac80211.h
  816c6bec09 ("wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP")
  5a009b42e0 ("wifi: mac80211: track changes in AP's TPE")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04 14:16:11 -07:00
Ilya Leoshkevich
89f42df66c lib/zlib: unpoison DFLTCC output buffers
The constraints of the DFLTCC inline assembly are not precise: they do not
communicate the size of the output buffers to the compiler, so it cannot
automatically instrument it.

Add the manual kmsan_unpoison_memory() calls for the output buffers.  The
logic is the same as in [1].

[1] 1f5ddcc009

Link: https://lkml.kernel.org/r/20240621113706.315500-21-iii@linux.ibm.com
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <kasan-dev@googlegroups.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:23 -07:00
JaeJoon Jung
739820a617 maple_tree: modified return type of mas_wr_store_entry()
Since the return value of mas_wr_store_entry() is not used,
the return type can be changed to void.

Link: https://lkml.kernel.org/r/20240614092428.29491-1-rgbi3307@gmail.com
Signed-off-by: JaeJoon Jung <rgbi3307@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:19 -07:00
Jeff Johnson
3c666d0a32 lib: test_hmm: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_hmm.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-lib-md-test_hmm-v1-1-e4aa17daa57b@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Jeff Johnson
a619dd3948 test_maple_tree: add the missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing
MODULE_DESCRIPTION() in lib/test_maple_tree.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_maple_tree-v1-1-7b1b485aeec3@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Jeff Johnson
2ec83987a5 ubsan: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_ubsan.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_ubsan-v1-1-c2a80d258842@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Jeff Johnson
757234f1ad test_xarray: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_xarray.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_xarray-v1-1-42fd6833bdd4@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:05 -07:00
Anna-Maria Behnsen
f48955e038 vdso/gettimeofday: Clarify comment about open coded function
The two comments state, that the following code open codes something but
they lack to specify what exactly is open coded.

Expand comments by mentioning the reference to the open coded function.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20240701-vdso-cleanup-v1-1-36eb64e7ece2@linutronix.de
2024-07-03 21:27:03 +02:00
Jeff Johnson
67c9971cd6 kunit/usercopy: Add missing MODULE_DESCRIPTION()
Fix warning seen with:

$ make allmodconfig && make W=1 C=1 lib/usercopy_kunit.ko
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/usercopy_kunit.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-02 10:11:45 -06:00
Kees Cook
4d6cf24832 kunit/usercopy: Disable testing on !CONFIG_MMU
Since arch_pick_mmap_layout() is an inline for non-MMU systems, disable
this test there.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202406160505.uBge6TMY-lkp@intel.com/
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-02 10:11:40 -06:00
Greg Kroah-Hartman
19ed3bb558 Merge 6.10-rc6 into char-misc-next
We need the char/misc/iio fixes in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-01 13:55:39 +02:00
Jeff Johnson
2a49c8b6b6 selftests/fpu: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 now reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_fpu.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240622-md-i386-lib-test_fpu_glue-v1-1-a4e40b7b1264@quicinc.com
Fixes: 9613736d85 ("selftests/fpu: move FP code to a separate translation unit")
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-28 19:36:30 -07:00
Alexey Dobriyan
961a285132 build-id: require program headers to be right after ELF header
Neither ELF spec not ELF loader require program header to be placed right
after ELF header, but build-id code very much assumes such placement:

See

	find_get_page(vma->vm_file->f_mapping, 0);

line and checks against PAGE_SIZE.

Returns errors for now until someone rewrites build-id parser
to be more inline with load_elf_binary().

Link: https://lkml.kernel.org/r/d58bc281-6ca7-467a-9a64-40fa214bd63e@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-28 19:36:30 -07:00
Linus Torvalds
b75f947270 hardening fixes for v6.10-rc6
- Remove invalid tty __counted_by annotation (Nathan Chancellor)
 
 - Add missing MODULE_DESCRIPTION()s for KUnit string tests (Jeff Johnson)
 
 - Remove non-functional per-arch kstack entropy filtering
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmZ+4Z4ACgkQiXL039xt
 wCYUPQ/9Ghbg4CfOIyjl5G7fAYuG+/zLDCkY+kh7XcO2kAn3213KiyRKm0GUAhXY
 p3N7rDH9NsXedfO2bnQ0YTDR3TU8AWIegKgEyGBsyqvdtjSe0ParwWOoGGpavJZ2
 6Op39e6LL2fKGyL4N72lkhRpGPJgGQOqckTljaDl5yQfIHryMpQl0fXzMMjh1HUt
 TKc39kSRbQxguDdIqU1zHgs+Lu9Kph6A3q9PjVap9qzCcPZ4RjIRms4gDrghP7GK
 M0POyZbuXUWxaJ8VwRHbqAtEyEGjXdfBW9DgKQM1fg9XWGZbCkucu3PZbPHv+c6e
 eBGG6O5l6UylmXpmkqLMfIudUekfo8cAEXqcLCBYis8uIuasUWiLMhoTDjdfcvhn
 HHr6iu25IKR698PZzTHQ5yUiuBP38qjXfXr9DDzXrI2+SUbxjurTfbHxFBWK/FYX
 YSdrZR4DbeaU/HI1I+I5YghgeRfR6TQ5NGrmj61wW1QnwvEF6Gdlh+MZgUS59SP5
 S+T50ggGKEYARZcZj1N6Nz39Co9syn/xlhyPKFPkgsRTXw1QE0z6e841V1jxhr49
 cStKFcKAovDeG2UN4bAju49/MWUFlcpkIxn9Y0ZHiu6R6SC9zasXhKi7+xDFolmP
 B6PmON2ZSSoFNwMr7Fr1SC0gWg7V3TYLmpHITDWz5KL00ReEdJY=
 =dItV
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Remove invalid tty __counted_by annotation (Nathan Chancellor)

 - Add missing MODULE_DESCRIPTION()s for KUnit string tests (Jeff
   Johnson)

 - Remove non-functional per-arch kstack entropy filtering

* tag 'hardening-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  tty: mxser: Remove __counted_by from mxser_board.ports[]
  randomize_kstack: Remove non-functional per-arch entropy filtering
  string: kunit: add missing MODULE_DESCRIPTION() macros
2024-06-28 16:11:02 -07:00
Linus Torvalds
cd63a278ac bcachefs fixes for 6.10-rc6
simple stuff:
 - null ptr/err ptr deref fixes
 - fix for getting wedged on shutdown after journal error
 
 - fix missing recalc_capacity() call, capacity now changes correctly
   after a device goes read only
 
   however: our capacity calculation still doesn't take into account when
   we have mixed ro/rw devices and the ro devices have data on them,
   that's going to be a more involved fix to separate accounting for
   "capacity used on ro devices" and "capacity used on rw devices"
 
 - boring syzbot stuff
 
 slightly more involved:
 - discard, invalidate workers are now per device
   this has the effect of simplifying how we take device refs in these
   paths, and the device ref cleanup fixes a longstanding race between
   the device removal path and the discard path
 
 - fixes for how the debugfs code takes refs on btree_trans objects
   we have debugfs code that prints in use btree_trans objects. It uses
   closure_get() on trans->ref, which is mainly for the cycle detector,
   but the debugfs code was using it on a closure that may have hit 0,
   which is not allowed; for performance reasons we cannot avoid having
   not-in-use transactions on the global list.
 
   introduce some new primitives to fix this and make the synchronization
   here a whole lot saner
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZ+ye4ACgkQE6szbY3K
 bnb4shAAqkKdgB2abIaD1t8+KjUiXwt7seRY4EmzwrEaWniW5bDUYMBvV+tew93j
 uvGmSKMs4ML/r24hcg0zGPJ9GoWrFb3MWhPYizzRS8QspsUjsECJuehNPCe3RPaf
 QBgQtKahTge1e41y1frzkiGKqaOGOTtUVLOfPIebe+oJAhRCYRnrGY2dkZTms7Ue
 aXNtBmnlX3Fkmlm0GiKYrTHpAZz3d0kzdX11Pc2vTXvqo/znuJTTVGnjJkdrHzyv
 6cz6YnMKFdxLVbYO1KlB/3Hu9y9qt815g1rjvaqym8pDk9ltsGHNM3LcCCCyp7Of
 btnbLQ6TdfggK5Kf2hNYuJRY2pnjNyfcNxupQF3RNaw/D/4G5EU16zfFElORC6Mw
 eGwXLvDIGqOSSIvevoRZrgJKAvVptXNg9EtCI5Z5ujQ4ExW8ti1lPHp/r5SVOhyz
 x0Am14H2ERuz7Vt5jUas3k74+tAck6JWc5OemMQawA5waeH1inMT7QZuBt+Bmrhx
 Av0zbhaq4aTsHXmm+Xi6ofj3UBaOQ2rNzT7Au0kxdvJgDPe/USjw4tejV5DmjmHA
 SyRsTG7Zn5xJBi7jc47fcwUgUzlxlffVQGFCVjRUU1vF6u/Ldn7K0zfYbkwSCiKp
 iWSEyg3j5z5N69Vrgdadma4xTDjL/C5+XsMWh8G8ohf+crhUeSo=
 =svIi
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-06-28' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Simple stuff:

   - NULL ptr/err ptr deref fixes

   - fix for getting wedged on shutdown after journal error

   - fix missing recalc_capacity() call, capacity now changes correctly
     after a device goes read only

     however: our capacity calculation still doesn't take into account
     when we have mixed ro/rw devices and the ro devices have data on
     them, that's going to be a more involved fix to separate accounting
     for "capacity used on ro devices" and "capacity used on rw devices"

   - boring syzbot stuff

  Slightly more involved:

   - discard, invalidate workers are now per device

     this has the effect of simplifying how we take device refs in these
     paths, and the device ref cleanup fixes a longstanding race between
     the device removal path and the discard path

   - fixes for how the debugfs code takes refs on btree_trans objects we
     have debugfs code that prints in use btree_trans objects.

     It uses closure_get() on trans->ref, which is mainly for the cycle
     detector, but the debugfs code was using it on a closure that may
     have hit 0, which is not allowed; for performance reasons we cannot
     avoid having not-in-use transactions on the global list.

     Introduce some new primitives to fix this and make the
     synchronization here a whole lot saner"

* tag 'bcachefs-2024-06-28' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: Fix kmalloc bug in __snapshot_t_mut
  bcachefs: Discard, invalidate workers are now per device
  bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gc
  bcachefs: slab-use-after-free Read in bch2_sb_errors_from_cpu
  bcachefs: Add missing bch2_journal_do_writes() call
  bcachefs: Fix null ptr deref in journal_pins_to_text()
  bcachefs: Add missing recalc_capacity() call
  bcachefs: Fix btree_trans list ordering
  bcachefs: Fix race between trans_put() and btree_transactions_read()
  closures: closure_get_not_zero(), closure_return_sync()
  bcachefs: Make btree_deadlock_to_text() clearer
  bcachefs: fix seqmutex_relock()
  bcachefs: Fix freeing of error pointers
2024-06-28 09:25:21 -07:00
Jeff Johnson
6a4805b2f5 string: kunit: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/string_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/string_helpers_kunit.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-lib-string-v1-1-2738cf057d94@quicinc.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-28 08:54:55 -07:00
Dave Airlie
91fdc5e765 drm-misc-next for $kernel-version:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - panic: Monochrome logo support, Various fixes
   - ttm: Improve the number of page faults on some platforms, Fix test
     build breakage with PREEMPT_RT, more test coverage and various test
     improvements
 
 Driver Changes:
   - Add missing MODULE_DESCRIPTION where needed
   - ipu-v3: Various fixes
   - vc4: Monochrome TV support
   - bridge:
     - analogix_dp: Various improvements and reworks, handle AUX
       transfers timeout
     - tc358767: Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR, Fix clock
       calculations
   - panels:
     - More transitions to mipi_dsi wrapped functions
     - New panels: Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZn1DmQAKCRDj7w1vZxhR
 xYj3AP9ThM8q3HoCqXKerpEfnb5LYDB4NocLjn/Bamtm134oNQD+M4Gu2zLSVymV
 74PwtPYuQGKWrmXdw0tD70/MtTAihQc=
 =fSI4
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-06-27' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.11:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
  - panic: Monochrome logo support, Various fixes
  - ttm: Improve the number of page faults on some platforms, Fix test
    build breakage with PREEMPT_RT, more test coverage and various test
    improvements

Driver Changes:
  - Add missing MODULE_DESCRIPTION where needed
  - ipu-v3: Various fixes
  - vc4: Monochrome TV support
  - bridge:
    - analogix_dp: Various improvements and reworks, handle AUX
      transfers timeout
    - tc358767: Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR, Fix clock
      calculations
  - panels:
    - More transitions to mipi_dsi wrapped functions
    - New panels: Lincoln Technologies LCD197, Ortustech COM35H3P70ULC,

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240627-congenial-pistachio-nyala-848cf4@houat
2024-06-28 08:20:37 +10:00
Jakub Kicinski
193b9b2002 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:
  e3f02f32a0 ("ionic: fix kernel panic due to multi-buffer handling")
  d9c0420999 ("ionic: Mark error paths in the data path as unlikely")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-27 12:14:11 -07:00
Leon Hwang
d65f3767de bpf: Fix tailcall cases in test_bpf
Since f663a03c8e ("bpf, x64: Remove tail call detection"),
tail_call_reachable won't be detected in x86 JIT. And, tail_call_reachable
is provided by verifier.

Therefore, in test_bpf, the tail_call_reachable must be provided in test
cases before running.

Fix and test:

[  174.828662] test_bpf: #0 Tail call leaf jited:1 170 PASS
[  174.829574] test_bpf: #1 Tail call 2 jited:1 244 PASS
[  174.830363] test_bpf: #2 Tail call 3 jited:1 296 PASS
[  174.830924] test_bpf: #3 Tail call 4 jited:1 719 PASS
[  174.831863] test_bpf: #4 Tail call load/store leaf jited:1 197 PASS
[  174.832240] test_bpf: #5 Tail call load/store jited:1 326 PASS
[  174.832240] test_bpf: #6 Tail call error path, max count reached jited:1 2214 PASS
[  174.835713] test_bpf: #7 Tail call count preserved across function calls jited:1 609751 PASS
[  175.446098] test_bpf: #8 Tail call error path, NULL target jited:1 472 PASS
[  175.447597] test_bpf: #9 Tail call error path, index out of range jited:1 206 PASS
[  175.448833] test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202406251415.c51865bc-oliver.sang@intel.com
Fixes: f663a03c8e ("bpf, x64: Remove tail call detection")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Link: https://lore.kernel.org/r/20240625145351.40072-1-hffilwlqm@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-25 18:12:44 -07:00
Heng Qi
13ba28c5cd dim: add new interfaces for initialization and getting results
DIM-related mode and work have been collected in one same place,
so new interfaces are added to provide convenience.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-5-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Heng Qi
f750dfe825 ethtool: provide customized dim profile management
The NetDIM library, currently leveraged by an array of NICs, delivers
excellent acceleration benefits. Nevertheless, NICs vary significantly
in their dim profile list prerequisites.

Specifically, virtio-net backends may present diverse sw or hw device
implementation, making a one-size-fits-all parameter list impractical.
On Alibaba Cloud, the virtio DPU's performance under the default DIM
profile falls short of expectations, partly due to a mismatch in
parameter configuration.

I also noticed that ice/idpf/ena and other NICs have customized
profilelist or placed some restrictions on dim capabilities.

Motivated by this, I tried adding new params for "ethtool -C" that provides
a per-device control to modify and access a device's interrupt parameters.

Usage
========
The target NIC is named ethx.

Assume that ethx only declares support for rx profile setting
(with DIM_PROFILE_RX flag set in profile_flags) and supports modification
of usec and pkt fields.

1. Query the currently customized list of the device

$ ethtool -c ethx
...
rx-profile:
{.usec =   1, .pkts = 256, .comps = n/a,},
{.usec =   8, .pkts = 256, .comps = n/a,},
{.usec =  64, .pkts = 256, .comps = n/a,},
{.usec = 128, .pkts = 256, .comps = n/a,},
{.usec = 256, .pkts = 256, .comps = n/a,}
tx-profile:   n/a

2. Tune
$ ethtool -C ethx rx-profile 1,1,n_2,n,n_3,3,n_4,4,n_n,5,n
"n" means do not modify this field.
$ ethtool -c ethx
...
rx-profile:
{.usec =   1, .pkts =   1, .comps = n/a,},
{.usec =   2, .pkts = 256, .comps = n/a,},
{.usec =   3, .pkts =   3, .comps = n/a,},
{.usec =   4, .pkts =   4, .comps = n/a,},
{.usec = 256, .pkts =   5, .comps = n/a,}
tx-profile:   n/a

3. Hint
If the device does not support some type of customized dim profiles,
the corresponding "n/a" will display.

If the "n/a" field is being modified, -EOPNOTSUPP will be reported.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-4-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Heng Qi
b65e697a7c dim: make DIMLIB dependent on NET
DIMLIB's capabilities are supplied by the dim, net_dim, and
rdma_dim objects, and dim's interfaces solely act as a base for
net_dim and rdma_dim and are not explicitly used anywhere else.
rdma_dim is utilized by the infiniband driver, while net_dim
is for network devices, excluding the soc/fsl driver.

In this patch, net_dim relies on some NET's interfaces, thus
DIMLIB needs to explicitly depend on the NET Kconfig.

The soc/fsl driver uses the functions provided by net_dim, so
it also needs to depend on NET.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-3-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Heng Qi
0e942053e4 linux/dim: move useful macros to .h file
Useful macros will be used effectively elsewhere.
These will be utilized in subsequent patches.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240621101353.107425-2-hengqi@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25 17:15:06 -07:00
Jeff Johnson
3034749132 KUnit: add missing MODULE_DESCRIPTION() macros for lib/test_*.ko
make allmodconfig && make W=1 C=1 reports for lib/test_*.ko:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_hexdump.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_dhry.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_firmware.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_sysctl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_hash.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_ida.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_list_sort.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_min_heap.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_module.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_sort.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_static_keys.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_static_key_base.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_memcat_p.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_blackhole_dev.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_meminit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_free_pages.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_kprobes.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_ref_tracker.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bits.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240619-md-lib-test-v2-1-301e30eeba1e@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:11 -07:00
Suren Baghdasaryan
d2917ff199 lib/dump_stack: report process UID in dump_stack_print_info()
To make it easier to identify the crashing process, report effective UID
when dumping the stack.

Link: https://lkml.kernel.org/r/20240615041358.103791-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:11 -07:00
I Hsin Cheng
c8dab79f9e lib/plist.c: avoid worst case scenario in plist_add
Worst case scenario of plist_add() happens when the priority of the
inserted plist_node is going to be the largest after the insertion is
done.  The cost is going to be more significant when the original plist is
longer, because the iterator is going to traverse the whole plist to find
the correct position to insert the new node.

The situation can be avoided by using a reverse iterator at the same time,
doing so the maximum possible number of iteration is going to shrink from
N to N/2.

The proposed change of plist_add pasts the test in lib/plist.c to validate
its correctness, also add the worst case scenario test for plist_add() in
plist_test().

The worst case test are tested with the size of test_data and test_node
growing from 200 to 1000.  The result are showned in the following table,
in which we can observed that the proposed change of plist_add performs
better than the original version, and the difference between these two
implementations are more significant with the size of N growing.

The random case test [1], and best case test [2] are also provided, with
result showing the proposed change performs slightly better in random case
test while the original implementation performs slightly better in best
case test, while the difference in both test are minor, we can see them as
even in those two situations.

 -----------------------------------------------------------
| Test size      |   200 |   400 |    600 |    800 |   1000 |
 -----------------------------------------------------------
| new_plist_add  | 140911| 548681| 1220512| 2048493| 3763755|
 -----------------------------------------------------------
| old_plist_add  | 188198| 774222| 1643547| 3008929| 4947435|
 -----------------------------------------------------------

Link: https://lkml.kernel.org/r/20240614154603.65203-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Signed-off-by: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:10 -07:00
Brian Masney
d0bff05405 lib/Kconfig.debug: document panic= command line option and procfs entry for PANIC_TIMEOUT
PANIC_TIMEOUT can also be controlled with the panic= kernel command line
option and the file /proc/sys/kernel/panic.  Let's document both of these
in the Kconfig help text.

Link: https://lkml.kernel.org/r/20240607152443.925168-1-bmasney@redhat.com
Signed-off-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
09aaf15a78 lib/test_linear_ranges: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_linear_ranges.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_linear_ranges-v1-1-053a1aad37c6@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
7ef148daa5 lib/test_kmod: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_kmod.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_kmod-v1-1-fdf11bc6095e@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
d46a555d3c siphash: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/siphash_kunit.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-siphash_kunit-v1-1-38688065b796@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:07 -07:00
Jeff Johnson
683da20738 uuid: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_uuid.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-test_uuid-v1-1-67fa498104c0@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
1c5a13b39d kunit: add missing MODULE_DESCRIPTION() macros to lib/*.c
make allmodconfig && make W=1 C=1 reports for lib/*kunit:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/bitfield_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/checksum_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/cmdline_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/is_signed_type_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/overflow_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/stackinit_kunit.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240601-md-lib-kunit-tests-v1-1-4493fe0032b9@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
2e29fcb774 lib/asn1_encoder: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/asn1_encoder.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240601-md-lib-asn1_encoder-v1-1-8c634ed2d2e8@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
f069e33daf KUnit: add missing MODULE_DESCRIPTION() macros for lib/*_test.ko
make allmodconfig && make W=1 C=1 reports for lib/*_test.ko:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/atomic64_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/hashtable_test.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240601-md-lib-test2-v1-1-be764b785f17@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:06 -07:00
Jeff Johnson
e471831be2 kunit/fortify: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/memcpy_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/fortify_kunit.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-md-lib-fortify_source-v1-1-2c37f7fbaafc@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:05 -07:00
Jani Nikula
2f183c6834 kernel/panic: add verbose logging of kernel taints in backtraces
With nearly 20 taint flags and respective characters, it's getting a bit
difficult to remember what each taint flag character means.  Add verbose
logging of the set taints in the format:

Tainted: [P]=PROPRIETARY_MODULE, [W]=WARN

in dump_stack_print_info() when there are taints.

Note that the "negative flag" G is not included.

Link: https://lkml.kernel.org/r/7321e306166cb2ca2807ab8639e665baa2462e9c.1717146197.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:05 -07:00
Jeff Johnson
21516c56ff lib/ts: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_kmp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_bm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_fsm.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240531-lib-ts-v1-1-03d7f3546c49@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:04 -07:00
I Hsin Cheng
7abcb84f95 lib/plist.c: enforce memory ordering in plist_check_list
There exists an iteration over a plist in plist_check_list(), and memory
dependency exists between variables "prev", "next" and "prev->next".  As
plist is used in the scheduling subsystem, we should guarantee the memory
ordering between multiple processors.

Using macro "WRITE_ONCE()" can help us to ensure the memory ordering as
it was stated in "Documentation/memory-barriers.txt".

Link: https://lkml.kernel.org/r/20240526140139.17220-1-richard120310@gmail.com
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:04 -07:00
Mateusz Guzik
51d821654b percpu_counter: add a cmpxchg-based _add_batch variant
Interrupt disable/enable trips are quite expensive on x86-64 compared to a
mere cmpxchg (note: no lock prefix!) and percpu counters are used quite
often.

With this change I get a bump of 1% ops/s for negative path lookups,
plugged into will-it-scale:

void testcase(unsigned long long *iterations, unsigned long nr)
{
        while (1) {
                int fd = open("/tmp/nonexistent", O_RDONLY);
                assert(fd == -1);

                (*iterations)++;
        }
}

The win would be higher if it was not for other slowdowns, but one has
to start somewhere.

Link: https://lkml.kernel.org/r/20240528204257.434817-1-mjguzik@gmail.com
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:03 -07:00
Kuan-Wei Chiu
54ce43da25 lib/test_sort: add a testcase to ensure code coverage
The addition of an if statement in lib/sort to handle the final unsorted 2
or 3 elements is not covered by existing test cases, leading to incomplete
test coverage.  To ensure comprehensive testing and maintain 100% code
coverage, add a new testcase for scenarios where the if statement is
triggered.

Since the if statement is only triggered when the array length is odd and
the first element is greater than the second element, a testcase is
created using an array length of TEST_LEN - 1 and a suitable random seed
to maintain full code coverage.

Link: https://lkml.kernel.org/r/20240527203011.1644280-5-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:03 -07:00
Kuan-Wei Chiu
41ed780435 lib/sort: optimize heapsort for handling final 2 or 3 elements
After building the heap, the code continuously pops two elements from the
heap until only 2 or 3 elements remain, at which point it switches back to
a regular heapsort with one element popped at a time.  However, to handle
the final 2 or 3 elements, an additional else-if statement in the while
loop was introduced, potentially increasing branch misses.  Moreover, when
there are only 2 or 3 elements left, continuing with regular heapify
operations is unnecessary as these cases are simple enough to be handled
with a single comparison and 1 or 2 swaps outside the while loop.

Eliminating the additional else-if statement and directly managing cases
involving 2 or 3 elements outside the loop reduces unnecessary conditional
branches resulting from the numerous loops and conditionals in heapify.

This optimization maintains consistent numbers of comparisons and swaps
for arrays with even lengths while reducing swaps and comparisons for
arrays with odd lengths from 2.5 swaps and 1 comparison to 1.5 swaps and 1
comparison.

Link: https://lkml.kernel.org/r/20240527203011.1644280-4-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:03 -07:00
Kuan-Wei Chiu
f49ac9571b lib/sort: fix outdated comment regarding glibc qsort()
The existing comment in lib/sort refers to glibc qsort() using quicksort. 
However, glibc qsort() no longer uses quicksort; it now uses mergesort and
falls back to heapsort if memory allocation for mergesort fails.  This
makes the comment outdated and incorrect.

Update the comment to refer to quicksort in general rather than glibc's
implementation to provide accurate information about the comparisons and
trade-offs without implying an outdated implementation.

Link: https://lkml.kernel.org/r/20240527203011.1644280-3-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:02 -07:00
Kuan-Wei Chiu
85fb11a879 lib/sort: remove unused pr_fmt macro
Patch series "lib/sort: Optimizations and cleanups".

This patch series optimizes the handling of the last 2 or 3 elements in
lib/sort and adds a testcase in lib/test_sort to maintain 100% code
coverage reflecting this change.  Additionally, it corrects outdated
descriptions regarding glibc qsort() and removes the unused pr_fmt macro.


This patch (of 4):

The pr_fmt macro is defined but not used in lib/sort.c.  Since there are
no pr_* functions printing any messages, the pr_fmt macro is redundant and
can be safely removed.

Link: https://lkml.kernel.org/r/20240527203011.1644280-1-visitorckw@gmail.com
Link: https://lkml.kernel.org/r/20240527203011.1644280-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:25:02 -07:00
Kuan-Wei Chiu
7099f74dc3 lib/test_min_heap: add test for heap_del()
Add test cases for the min_heap_del() to ensure its functionality is
thoroughly tested.

Link: https://lkml.kernel.org/r/20240524152958.919343-15-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:24:59 -07:00
Kuan-Wei Chiu
267607e875 lib min_heap: add args for min_heap_callbacks
Add a third parameter 'args' for the 'less' and 'swp' functions in the
'struct min_heap_callbacks'.  This additional parameter allows these
comparison and swap functions to handle extra arguments when necessary.

Link: https://lkml.kernel.org/r/20240524152958.919343-9-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:24:58 -07:00
Kuan-Wei Chiu
873ce25766 lib min_heap: add type safe interface
Implement a type-safe interface for min_heap using strong type pointers
instead of void * in the data field.  This change includes adding small
macro wrappers around functions, enabling the use of __minheap_cast and
__minheap_obj_size macros for type casting and obtaining element size. 
This implementation removes the necessity of passing element size in
min_heap_callbacks.  Additionally, introduce the MIN_HEAP_PREALLOCATED
macro for preallocating some elements.

Link: https://lkml.kernel.org/ioyfizrzq7w7mjrqcadtzsfgpuntowtjdw5pgn4qhvsdp4mqqg@nrlek5vmisbu
Link: https://lkml.kernel.org/r/20240524152958.919343-5-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24 22:24:57 -07:00
Breno Leitao
5b5baba622 debugobjects: Annotate racy debug variables
KCSAN has identified a potential data race in debugobjects, where the
global variable debug_objects_maxchain is accessed for both reading and
writing simultaneously in separate and parallel data paths. This results in
the following splat printed by KCSAN:

  BUG: KCSAN: data-race in debug_check_no_obj_freed / debug_object_activate

  write to 0xffffffff847ccfc8 of 4 bytes by task 734 on cpu 41:
  debug_object_activate (lib/debugobjects.c:199 lib/debugobjects.c:564 lib/debugobjects.c:710)
  call_rcu (kernel/rcu/rcu.h:227 kernel/rcu/tree.c:2719 kernel/rcu/tree.c:2838)
  security_inode_free (security/security.c:1626)
  __destroy_inode (./include/linux/fsnotify.h:222 fs/inode.c:287)
  ...
  read to 0xffffffff847ccfc8 of 4 bytes by task 384 on cpu 31:
  debug_check_no_obj_freed (lib/debugobjects.c:1000 lib/debugobjects.c:1019)
  kfree (mm/slub.c:2081 mm/slub.c:4280 mm/slub.c:4390)
  percpu_ref_exit (lib/percpu-refcount.c:147)
  css_free_rwork_fn (kernel/cgroup/cgroup.c:5357)
  ...
  value changed: 0x00000070 -> 0x00000071

The data race is actually harmless as this is just used for debugfs
statistics, as all other debug variables.

Annotate all debug variables as racy explicitly, since these variables
are known to be racy and harmless.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240611091813.1189860-1-leitao@debian.org
2024-06-24 16:46:43 +02:00
Geert Uytterhoeven
a03a84bee3 lib/fonts: Fix visiblity of SUN12x22 and TER16x32 if DRM_PANIC
When CONFIG_FONTS ("Select compiled-in fonts") is not enabled, the user
should not be asked about any fonts.  However, when CONFIG_DRM_PANIC is
enabled, the user is still asked about the Sparc console 12x22 and
Terminus 16x32 fonts.

Fix this by moving the "|| DRM_PANIC" to where it belongs.
Split the dependency in two rules to improve readability.

Fixes: b94605a388 ("lib/fonts: Allow to select fonts for drm_panic")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ac474c6755800e61e18bd5af407c6acb449c5149.1718305355.git.geert+renesas@glider.be
2024-06-24 13:18:02 +02:00
Kent Overstreet
06efa5f30c closures: closure_get_not_zero(), closure_return_sync()
Provide new primitives for solving a lifetime issue with bcachefs
btree_trans objects.

closure_sync_return(): like closure_sync(), wait synchronously for any
outstanding gets. like closure_return, the closure is considered
"finished" and the ref left at 0.

closure_get_not_zero(): get a ref on a closure if it's alive, i.e. the
ref is not zero.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-23 00:57:21 -04:00
Linus Torvalds
c3de9b572f bcachefs fixes for 6.10-rc5
Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.
 
 The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
 for the LRU position (we would prefer 64), so wraparound is possible for
 the cached data LRUs on a filesystem that has done sufficient
 (petabytes) reads; this is now handled.
 
 One notable user reported bugfix, where we were forgetting to correctly
 set the bucket data type, which should have been BCH_DATA_need_gc_gens
 instead of BCH_DATA_free; this was causing us to go emergency read-only
 on a filesystem that had seen heavy enough use to see bucket gen
 wraparoud.
 
 We're now starting to fix simple (safe) errors without requiring user
 intervention - i.e. a small incremental step towards full self healing.
 This is currently limited to just certain allocation information
 counters, and the error is still logged in the superblock; see that
 patch for more information. ("bcachefs: Fix safe errors by default").
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZ22GAACgkQE6szbY3K
 bnYJaQ/+Pzep1M9JU9bQRCjmbR1pDkHswqeiVR0DSPTqDSCR0KmoypA+iwAbAmzC
 X0Z3bHgh9X36QdnF5s+JSLSzeAdzD74btLeyCI58iH//QaIg7da6tE2FgJCstMt8
 11i9a172fhiYLH7YjigZczV10nrWApClS/9qHgY+paVEgMeJgx/3zJwysC1UhuT9
 6bsSZKeGMhkGDca3k5hd7mZZKUvFXpE//xe6axK05aTHvd2wDQbDOaMdn07XC+hF
 KIWloxYVu9utqprjIq2XHWLJaxRhguHwlI4xq+n8eljLw8Kt6S9lZp7CA85Hq4RA
 hLmv1qoqJvh8+YZ7twwYAhflm9mcz58GGKrIqPCG/YaftIktJx3DCOkZzn2b7TmD
 iVXBVYkcmlZqLpZzPisKO8omqVkH4YIN/WPIGa1JU/+jkw5Qzpw62K+9AjYowUCp
 q47TVWRNtAuL5sct2KVUdTkC5Dkhx7lu3NDvx4jVXfbPsv0ssNYiTKMNnAUqefz/
 eM37MCVzmy7OwAymdb5d83CzMIHm0JKetc7CgLBAjOcMLMoLDjRdGEcFGxq/iBMB
 2Ty4rUWGFbXlwV1umcYd2cODqIt+iLwmHWCAIXjtTlOw1h5YuwX67wb9zw/tzB1W
 JUEetJQWzQ7P/Q1huntNUbiIHw2GbWzeB2u0wBPaVVEgyHWftyk=
 =ktka
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs.

  The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits
  for the LRU position (we would prefer 64), so wraparound is possible
  for the cached data LRUs on a filesystem that has done sufficient
  (petabytes) reads; this is now handled.

  One notable user reported bugfix, where we were forgetting to
  correctly set the bucket data type, which should have been
  BCH_DATA_need_gc_gens instead of BCH_DATA_free; this was causing us to
  go emergency read-only on a filesystem that had seen heavy enough use
  to see bucket gen wraparoud.

  We're now starting to fix simple (safe) errors without requiring user
  intervention - i.e. a small incremental step towards full self
  healing.

  This is currently limited to just certain allocation information
  counters, and the error is still logged in the superblock; see that
  patch for more information. ("bcachefs: Fix safe errors by default")"

* tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs: (22 commits)
  bcachefs: Move the ei_flags setting to after initialization
  bcachefs: Fix a UAF after write_super()
  bcachefs: Use bch2_print_string_as_lines for long err
  bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()
  bcachefs: Replace bare EEXIST with private error codes
  bcachefs: Fix missing alloc_data_type_set()
  closures: Change BUG_ON() to WARN_ON()
  bcachefs: fix alignment of VMA for memory mapped files on THP
  bcachefs: Fix safe errors by default
  bcachefs: Fix bch2_trans_put()
  bcachefs: set_worker_desc() for delete_dead_snapshots
  bcachefs: Fix bch2_sb_downgrade_update()
  bcachefs: Handle cached data LRU wraparound
  bcachefs: Guard against overflowing LRU_TIME_BITS
  bcachefs: delete_dead_snapshots() doesn't need to go RW
  bcachefs: Fix early init error path in journal code
  bcachefs: Check for invalid btree IDs
  bcachefs: Fix btree ID bitmasks
  bcachefs: Fix shift overflow in read_one_super()
  bcachefs: Fix a locking bug in the do_discard_fast() path
  ...
2024-06-22 09:02:39 -07:00
Kent Overstreet
339b84ab6b closures: Change BUG_ON() to WARN_ON()
If a BUG_ON() can be hit in the wild, it shouldn't be a BUG_ON()

For reference, this has popped up once in the CI, and we'll need more
info to debug it:

03240 ------------[ cut here ]------------
03240 kernel BUG at lib/closure.c:21!
03240 kernel BUG at lib/closure.c:21!
03240 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
03240 Modules linked in:
03240 CPU: 15 PID: 40534 Comm: kworker/u80:1 Not tainted 6.10.0-rc4-ktest-ga56da69799bd #25570
03240 Hardware name: linux,dummy-virt (DT)
03240 Workqueue: btree_update btree_interior_update_work
03240 pstate: 00001005 (nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
03240 pc : closure_put+0x224/0x2a0
03240 lr : closure_put+0x24/0x2a0
03240 sp : ffff0000d12071c0
03240 x29: ffff0000d12071c0 x28: dfff800000000000 x27: ffff0000d1207360
03240 x26: 0000000000000040 x25: 0000000000000040 x24: 0000000000000040
03240 x23: ffff0000c1f20180 x22: 0000000000000000 x21: ffff0000c1f20168
03240 x20: 0000000040000000 x19: ffff0000c1f20140 x18: 0000000000000001
03240 x17: 0000000000003aa0 x16: 0000000000003ad0 x15: 1fffe0001c326974
03240 x14: 0000000000000a1e x13: 0000000000000000 x12: 1fffe000183e402d
03240 x11: ffff6000183e402d x10: dfff800000000000 x9 : ffff6000183e402e
03240 x8 : 0000000000000001 x7 : 00009fffe7c1bfd3 x6 : ffff0000c1f2016b
03240 x5 : ffff0000c1f20168 x4 : ffff6000183e402e x3 : ffff800081391954
03240 x2 : 0000000000000001 x1 : 0000000000000000 x0 : 00000000a8000000
03240 Call trace:
03240  closure_put+0x224/0x2a0
03240  bch2_check_for_deadlock+0x910/0x1028
03240  bch2_six_check_for_deadlock+0x1c/0x30
03240  six_lock_slowpath.isra.0+0x29c/0xed0
03240  six_lock_ip_waiter+0xa8/0xf8
03240  __bch2_btree_node_lock_write+0x14c/0x298
03240  bch2_trans_lock_write+0x6d4/0xb10
03240  __bch2_trans_commit+0x135c/0x5520
03240  btree_interior_update_work+0x1248/0x1c10
03240  process_scheduled_works+0x53c/0xd90
03240  worker_thread+0x370/0x8c8
03240  kthread+0x258/0x2e8
03240  ret_from_fork+0x10/0x20
03240 Code: aa1303e0 d63f0020 a94363f7 17ffff8c (d4210000)
03240 ---[ end trace 0000000000000000 ]---
03240 Kernel panic - not syncing: Oops - BUG: Fatal exception
03240 SMP: stopping secondary CPUs
03241 SMP: failed to stop secondary CPUs 13,15
03241 Kernel Offset: disabled
03241 CPU features: 0x00,00000003,80000008,4240500b
03241 Memory Limit: none
03241 ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---
03246 ========= FAILED TIMEOUT copygc_torture_no_checksum in 7200s

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21 10:17:04 -04:00
Jeff Johnson
3cbe18b0bc crypto: lib - add missing MODULE_DESCRIPTION() macros
With ARCH=arm, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libsha256.o

Add the missing invocation of the MODULE_DESCRIPTION() macro to all
files which have a MODULE_LICENSE().

This includes sha1.c and utils.c which, although they did not produce
a warning with the arm allmodconfig configuration, may cause this
warning with other configurations.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:19 +10:00
Jiapeng Chong
b44327ebc1 crypto: lib/mpi - Use swap() in mpi_powm()
Use existing swap() function rather than duplicating its implementation.

./lib/crypto/mpi/mpi-pow.c:211:11-12: WARNING opportunity for swap().
./lib/crypto/mpi/mpi-pow.c:239:12-13: WARNING opportunity for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9327
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:18 +10:00
Jiapeng Chong
f0da7a231c crypto: lib/mpi - Use swap() in mpi_ec_mul_point()
Use existing swap() function rather than duplicating its implementation.

./lib/crypto/mpi/ec.c:1291:20-21: WARNING opportunity for swap().
./lib/crypto/mpi/ec.c:1292:20-21: WARNING opportunity for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9328
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-21 22:04:17 +10:00
Dave Airlie
f680df51ca drm-misc-next for 6.11:
UAPI Changes:
   - Deprecate DRM date and return a 0 date in DRM_IOCTL_VERSION
 
 Core Changes:
   - connector: Create a set of helpers to help with HDMI support
   - fbdev: Create memory manager optimized fbdev emulation
   - panic: Allow to select fonts, improve drm_fb_dma_get_scanout_buffer
 
 Driver Changes:
   - Remove driver owner assignments
   - Allow more drivers to compile with COMPILE_TEST
   - Conversions to drm_edid
   - ivpu: hardware scheduler support, profiling support, improvements
     to the platform support layer
   - mgag200: general reworks and improvements
   - nouveau: Add NVreg_RegistryDwords command line option
   - rockchip: Conversion to the hdmi helpers
   - sun4i: Conversion to the hdmi helpers
   - vc4: Conversion to the hdmi helpers
   - v3d: Perf counters improvements
   - zynqmp: IRQ and debugfs improvements
   - bridge:
     - Remove redundant checks on bridge->encoder
   - panels:
     - Switch panels from register table initialization to proper code
     - Now that the panel code tracks the panel state, remove every
       ad-hoc implementation in the panel drivers
     - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
       13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
       nv110wum-l60, IVO t109nw41
 -----BEGIN PGP SIGNATURE-----
 
 iJUEABMJAB0WIQTkHFbLp4ejekA/qfgnX84Zoj2+dgUCZlhUKAAKCRAnX84Zoj2+
 dgHoAYDTpShgXFXnlnMtqZr+ZuShcjcwiqzwM4qNWdtyji9MONtJJU3ZQnGlnXbI
 ZU+oZP0Bf0PyT0/8bf+rmZBJ1UdAxt2IQaLkP1tTHOad4E+KlcL5n1opzMi160mB
 EZSm9f7aNw==
 =bZPt
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-05-30' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for 6.11:

UAPI Changes:
  - Deprecate DRM date and return a 0 date in DRM_IOCTL_VERSION

Core Changes:
  - connector: Create a set of helpers to help with HDMI support
  - fbdev: Create memory manager optimized fbdev emulation
  - panic: Allow to select fonts, improve drm_fb_dma_get_scanout_buffer

Driver Changes:
  - Remove driver owner assignments
  - Allow more drivers to compile with COMPILE_TEST
  - Conversions to drm_edid
  - ivpu: hardware scheduler support, profiling support, improvements
    to the platform support layer
  - mgag200: general reworks and improvements
  - nouveau: Add NVreg_RegistryDwords command line option
  - rockchip: Conversion to the hdmi helpers
  - sun4i: Conversion to the hdmi helpers
  - vc4: Conversion to the hdmi helpers
  - v3d: Perf counters improvements
  - zynqmp: IRQ and debugfs improvements
  - bridge:
    - Remove redundant checks on bridge->encoder
  - panels:
    - Switch panels from register table initialization to proper code
    - Now that the panel code tracks the panel state, remove every
      ad-hoc implementation in the panel drivers
    - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
      13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
      nv110wum-l60, IVO t109nw41

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240530-hilarious-flat-magpie-5fa186@houat
2024-06-21 10:30:31 +10:00
Jakub Kicinski
a6ec08beec Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  1e7962114c ("bnxt_en: Restore PTP tx_avail count in case of skb_pad() error")
  165f87691a ("bnxt_en: add timestamping statistics support")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-20 13:49:59 -07:00
Kees Cook
2003e483a8 fortify: Do not special-case 0-sized destinations
All fake flexible arrays should have been removed now, so remove the
special casing that was avoiding checking them. If a destination claims
to be 0 sized, believe it. This is especially important for cases where
__counted_by is in use and may have a 0 element count.

Link: https://lore.kernel.org/r/20240619203105.work.747-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-19 13:32:04 -07:00
David Vernet
1538e33995 sched_ext: Print sched_ext info when dumping stack
It would be useful to see what the sched_ext scheduler state is, and what
scheduler is running, when we're dumping a task's stack. This patch
therefore adds a new print_scx_info() function that's called in the same
context as print_worker_info() and print_stop_info(). An example dump
follows.

  BUG: kernel NULL pointer dereference, address: 0000000000000999
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 0 P4D 0
  Oops: 0002 [#1] PREEMPT SMP
  CPU: 13 PID: 2047 Comm: insmod Tainted: G           O       6.6.0-work-10323-gb58d4cae8e99-dirty #34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
  Sched_ext: qmap (enabled+all), task: runnable_at=-17ms
  RIP: 0010:init_module+0x9/0x1000 [test_module]
  ...

v3: - scx_ops_enable_state_str[] definition moved to an earlier patch as
      it's now used by core implementation.

    - Convert jiffy delta to msecs using jiffies_to_msecs() instead of
      multiplying by (HZ / MSEC_PER_SEC). The conversion is implemented in
      jiffies_delta_msecs().

v2: - We are now using scx_ops_enable_state_str[] outside
      CONFIG_SCHED_DEBUG. Move it outside of CONFIG_SCHED_DEBUG and to the
      top. This was reported by Changwoo and Andrea.

Signed-off-by: David Vernet <void@manifault.com>
Reported-by: Changwoo Min <changwoo@igalia.com>
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-06-18 10:09:18 -10:00
Jeff Johnson
e334771d83 lib: bitmap: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/find_bit_benchmark.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/cpumask_kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bitmap.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-06-18 10:40:52 -07:00
Linus Torvalds
5d272dd1b3 cpumask: limit FORCE_NR_CPUS to just the UP case
Hardcoding the number of CPUs at compile time does improve code
generation, but if you get it wrong the result will be confusion.

We already limited this earlier to only "experts" (see commit
fe5759d5bf "cpumask: limit visibility of FORCE_NR_CPUS"), but with
distro kernel configs often having EXPERT enabled, that turns out to not
be much of a limit.

To quote the philosophers at Disney: "Everyone can be an expert. And
when everyone's an expert, no one will be".

There's a runtime warning if you then set nr_cpus to anything but the
forced number, but apparently that can be ignored too [1] and by then
it's pretty much too late anyway.

If we had some real way to limit this to "embedded only", maybe it would
be worth it, but let's see if anybody even notices that the option is
gone.  We need to simplify kernel configuration anyway.

Link: https://lore.kernel.org/all/20240618105036.208a8860@rorschach.local.home/ [1]
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-06-18 09:00:04 -07:00
Linus Torvalds
e6b324fbf2 19 hotfixes, 8 of which are cc:stable.
Mainly MM singleton fixes.  And a couple of ocfs2 regression fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZnCEQAAKCRDdBJ7gKXxA
 jmgSAQDk3BYs1n67cnwx/Zi04yMYDyfYTCYg2udPfT2a+GpmbwD+N5dJd/vCztXH
 5eLpP11xd/yr2+I9FefyZeUuA80KtgQ=
 =2agY
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Mainly MM singleton fixes. And a couple of ocfs2 regression fixes"

* tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  kcov: don't lose track of remote references during softirqs
  mm: shmem: fix getting incorrect lruvec when replacing a shmem folio
  mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick
  mm: fix possible OOB in numa_rebuild_large_mapping()
  mm/migrate: fix kernel BUG at mm/compaction.c:2761!
  selftests: mm: make map_fixed_noreplace test names stable
  mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC
  mm: mmap: allow for the maximum number of bits for randomizing mmap_base by default
  gcov: add support for GCC 14
  zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
  mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
  lib/alloc_tag: fix RCU imbalance in pgalloc_tag_get()
  lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n
  MAINTAINERS: remove Lorenzo as vmalloc reviewer
  Revert "mm: init_mlocked_on_free_v3"
  mm/page_table_check: fix crash on ZONE_DEVICE
  gcc: disable '-Warray-bounds' for gcc-9
  ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger()
  ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty()
2024-06-17 12:30:07 -07:00
Linus Torvalds
5cf81d7b0d hardening fixes for v6.10-rc5
- yama: document function parameter (Christian Göttsche_
 
 - mm/util: Swap kmemdup_array() arguments (Jean-Philippe Brucker)
 
 - kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
 
 - MAINTAINERS: Update entries for Kees Cook
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmZwfVsACgkQiXL039xt
 wCYfuQ/+KidYsVlf9xhc9eU6XQQZmPXhQT7QCWZEX2xj6xdob5Pv+YBHrL2dGCvn
 4b7xqWFqrkjDGVEQW5zF7mmn9T7a3c6+czKUR6rSueB6aO+NFns961rCBViYWxLN
 /xgee/1iCRg5iwg6SfP5CR9NIr9h6jU9d4Mv7cT2rwy913bCeQa89gkqCD2LJXmr
 m9HZgT0vsgfUO3+XsA42LKpP+dP+8UHtTumNOZrqnzZr9k69io9ncRjzmS/LjQPL
 ILo3QQ6QIV8bkSlOogMLZNHRc84Sc8x91KUM42ZUhV2tNxpNG6lt6UZXPATbvq/g
 TLHxvayjYOTWwF2DmlXncF/rtDLugsg/lyGS4tPjRX00Iq+jaTm1HOVJQ0rDUeLI
 lmMlGyDzAPK7UXU3hmx+i3sOuyt6HbfJYwF/7ErR0plDaWIbUrqy7uVxarag3qnc
 i4Lrr/5OdThUKl1jTBIBmfrOELI+m5opMvF2zUpS1BgHUw1U33rHWxQRoW1iTUnH
 Df11bl0NycmxyY0Vv4M1dnm8uP7XpjfFbdi87xj4+lGGKTM+wM9iQhrHVLBeIdPa
 dntZfsFB2ZF8LYlNXVnOcWLJjQP8SC99VCMsp/Un6AVmu/HMBP/+cZ6LHGWcUoWz
 qVrxqu9OjnK7jqsaDbDm3TLroCzL/8/oLRbqXuGJNamLOxz9oW0=
 =RFT7
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - yama: document function parameter (Christian Göttsche)

 - mm/util: Swap kmemdup_array() arguments (Jean-Philippe Brucker)

 - kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()

 - MAINTAINERS: Update entries for Kees Cook

* tag 'hardening-v6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  MAINTAINERS: Update entries for Kees Cook
  kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
  yama: document function parameter
  mm/util: Swap kmemdup_array() arguments
2024-06-17 12:00:22 -07:00
Greg Kroah-Hartman
b5dd424181 Linux 6.10-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmZvTbAeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGVksIAJEn4a9IVM8FNCJy
 Dxo0BItD1/qJ5mLDptqUFRKlxInjbojofz5CyoeIeXb0DwRfB16ALXqNXAkd3APi
 saoOpfjFsg2H2OqL9CHdkzWcJEAq2lDnL0zaOjumeDVu/EyeT+tC4e4hq1e6Bm0E
 fPC5ms2b+07DF9Rg6/DW8yPbdM5n6Mz1bRd3fQOIgvpM3yGOyGztEBgTRub/ZUgH
 5pNJauknFAZgdiWhgNpc+lPWYZbgHKULQPhUBPdVhDIXPtQNUlKgNTQc6+L0Nmbb
 K1sG1q7FLeMJOTFGQfD4r26X5DNQUi894q/9SX8X7rcrECdJKcw2WjVyB4myADpf
 ae2gP+A=
 =XjWP
 -----END PGP SIGNATURE-----

Merge tag 'v6.10-rc4' into driver-core-next

We need the driver core and sysfs fixes in here to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-17 08:33:41 +02:00
Greg Kroah-Hartman
2046047295 Linux 6.10-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmZvTbAeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGVksIAJEn4a9IVM8FNCJy
 Dxo0BItD1/qJ5mLDptqUFRKlxInjbojofz5CyoeIeXb0DwRfB16ALXqNXAkd3APi
 saoOpfjFsg2H2OqL9CHdkzWcJEAq2lDnL0zaOjumeDVu/EyeT+tC4e4hq1e6Bm0E
 fPC5ms2b+07DF9Rg6/DW8yPbdM5n6Mz1bRd3fQOIgvpM3yGOyGztEBgTRub/ZUgH
 5pNJauknFAZgdiWhgNpc+lPWYZbgHKULQPhUBPdVhDIXPtQNUlKgNTQc6+L0Nmbb
 K1sG1q7FLeMJOTFGQfD4r26X5DNQUi894q/9SX8X7rcrECdJKcw2WjVyB4myADpf
 ae2gP+A=
 =XjWP
 -----END PGP SIGNATURE-----

Merge tag 'v6.10-rc4' into char-misc-next

We need the char-misc and iio fixes in here as well to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-17 08:31:12 +02:00
Suren Baghdasaryan
c944bf60c1 lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n
Memory allocation profiling is trying to register sysctl interface even
when CONFIG_SYSCTL=n, resulting in proc_do_static_key() being undefined. 
Prevent that by skipping sysctl registration for such configurations.

Link: https://lkml.kernel.org/r/20240601233831.617124-1-surenb@google.com
Fixes: 22d407b164 ("lib: add allocation tagging support for memory allocation profiling")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405280616.wcOGWJEj-lkp@intel.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-15 10:43:05 -07:00
Kees Cook
cf6219ee88 usercopy: Convert test_user_copy to KUnit test
Convert the runtime tests of hardened usercopy to standard KUnit tests.

Additionally disable usercopy_test_invalid() for systems with separate
address spaces (or no MMU) since it's not sensible to test for address
confusion there (e.g. m68k).

Co-developed-by: Vitor Massaru Iha <vitor@massaru.org>
Signed-off-by: Vitor Massaru Iha <vitor@massaru.org>
Link: https://lore.kernel.org/r/20200721174654.72132-1-vitor@massaru.org
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-14 19:31:39 -06:00
Kees Cook
51104c19d8 kunit: test: Add vm_mmap() allocation resource manager
For tests that need to allocate using vm_mmap() (e.g. usercopy and
execve), provide the interface to have the allocation tracked by KUnit
itself. This requires bringing up a placeholder userspace mm.

This combines my earlier attempt at this with Mark Rutland's version[1].

Normally alloc_mm() and arch_pick_mmap_layout() aren't exported for
modules, so export these only for KUnit testing.

Link: https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutland@arm.com/ [1]
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-14 19:31:33 -06:00
Joel Granados
e2a6c472de mm profiling: Remove superfluous sentinel element from ctl_table
This commit is part of a greater effort to remove all empty elements at
the end of the ctl_table arrays (sentinels) which will reduce the
overall build time size of the kernel and run time memory bloat by ~64
bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Removed sentinel from memory_allocation_profiling_sysctls

Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-13 10:50:52 +02:00
Jeff Johnson
a930fde94a vsprintf: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_printf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_scanf.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-vsprintf-v1-1-d8bc7e21539a@quicinc.com
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-06-12 13:26:28 +02:00
Zijun Hu
dd6e9894b4 kobject_uevent: Fix OOB access within zap_modalias_env()
zap_modalias_env() wrongly calculates size of memory block to move, so
will cause OOB memory access issue if variable MODALIAS is not the last
one within its @env parameter, fixed by correcting size to memmove.

Fixes: 9b3fa47d4a ("kobject: fix suppressing modalias in uevents delivered over netlink")
Cc: stable@vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Lk Sii <lk_sii@163.com>
Link: https://lore.kernel.org/r/1717074877-11352-1-git-send-email-quic_zijuhu@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12 13:24:05 +02:00
Jakub Kicinski
b1156532bc bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZmIsRAAKCRDbK58LschI
 g4SSAP0bkl6rPMn7zp1h+/l7hlvpp2aVOmasBTe8hIhAGUbluwD/TGq4sNsGgXFI
 i4tUtFRhw8pOjy2guy6526qyJvBs8wY=
 =WMhY
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-06-06

We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).

The main changes are:

1) Add a user space notification mechanism via epoll when a struct_ops
   object is getting detached/unregistered, from Kui-Feng Lee.

2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
   tests, from Geliang Tang.

3) Add BTF field (type and string fields, right now) iterator support
   to libbpf instead of using existing callback-based approaches,
   from Andrii Nakryiko.

4) Extend BPF selftests for the latter with a new btf_field_iter
   selftest, from Alan Maguire.

5) Add new kfuncs for a generic, open-coded bits iterator,
   from Yafang Shao.

6) Fix BPF selftests' kallsyms_find() helper under kernels configured
   with CONFIG_LTO_CLANG_THIN, from Yonghong Song.

7) Remove a bunch of unused structs in BPF selftests,
   from David Alan Gilbert.

8) Convert test_sockmap section names into names understood by libbpf
   so it can deduce program type and attach type, from Jakub Sitnicki.

9) Extend libbpf with the ability to configure log verbosity
   via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.

10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
    in nested VMs, from Song Liu.

11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
    optimization, from Xiao Wang.

12) Enable BPF programs to declare arrays and struct fields with kptr,
    bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
  selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
  selftests/bpf: Add start_test helper in bpf_tcp_ca
  selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
  libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
  selftests/bpf: Add btf_field_iter selftests
  selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
  libbpf: Remove callback-based type/string BTF field visitor helpers
  bpftool: Use BTF field iterator in btfgen
  libbpf: Make use of BTF field iterator in BTF handling code
  libbpf: Make use of BTF field iterator in BPF linker code
  libbpf: Add BTF field iterator
  selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
  selftests/bpf: Fix bpf_cookie and find_vma in nested VM
  selftests/bpf: Test global bpf_list_head arrays.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  bpf: limit the number of levels of a nested struct type.
  bpf: look into the types of the fields of a struct type recursively.
  ...
====================

Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10 18:02:14 -07:00
Kees Cook
9dd5134c61 kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
When a flexible array structure has a __counted_by annotation, its use
with DEFINE_RAW_FLEX() will result in the count being zero-initialized.
This is expected since one doesn't want to use RAW with a counted_by
struct. Adjust the tests to check for the condition and for compiler
support.

Reported-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Closes: https://lore.kernel.org/all/0bfc6b38-8bc5-4971-b6fb-dc642a73fbfe@gmail.com/
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240610182301.work.272-kees@kernel.org
Tested-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Reviewed-by: Christian Schrefl <chrisi.schrefl@gmail.com>
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-10 12:00:04 -07:00
Ido Schimmel
97d833ceb2 mlxsw: spectrum_acl_erp: Fix object nesting warning
ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
(A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
contain more ACLs (i.e., tc filters), but the number of masks in each
region (i.e., tc chain) is limited.

In order to mitigate the effects of the above limitation, the device
allows filters to share a single mask if their masks only differ in up
to 8 consecutive bits. For example, dst_ip/25 can be represented using
dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
number of masks being used (and therefore does not support mask
aggregation), but can contain a limited number of filters.

The driver uses the "objagg" library to perform the mask aggregation by
passing it objects that consist of the filter's mask and whether the
filter is to be inserted into the A-TCAM or the C-TCAM since filters in
different TCAMs cannot share a mask.

The set of created objects is dependent on the insertion order of the
filters and is not necessarily optimal. Therefore, the driver will
periodically ask the library to compute a more optimal set ("hints") by
looking at all the existing objects.

When the library asks the driver whether two objects can be aggregated
the driver only compares the provided masks and ignores the A-TCAM /
C-TCAM indication. This is the right thing to do since the goal is to
move as many filters as possible to the A-TCAM. The driver also forbids
two identical masks from being aggregated since this can only happen if
one was intentionally put in the C-TCAM to avoid a conflict in the
A-TCAM.

The above can result in the following set of hints:

H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta

After getting the hints from the library the driver will start migrating
filters from one region to another while consulting the computed hints
and instructing the device to perform a lookup in both regions during
the transition.

Assuming a filter with mask X is being migrated into the A-TCAM in the
new region, the hints lookup will return H1. Since H2 is the parent of
H1, the library will try to find the object associated with it and
create it if necessary in which case another hints lookup (recursive)
will be performed. This hints lookup for {mask Y, A-TCAM} will either
return H2 or H3 since the driver passes the library an object comparison
function that ignores the A-TCAM / C-TCAM indication.

This can eventually lead to nested objects which are not supported by
the library [1].

Fix by removing the object comparison function from both the driver and
the library as the driver was the only user. That way the lookup will
only return exact matches.

I do not have a reliable reproducer that can reproduce the issue in a
timely manner, but before the fix the issue would reproduce in several
minutes and with the fix it does not reproduce in over an hour.

Note that the current usefulness of the hints is limited because they
include the C-TCAM indication and represent aggregation that cannot
actually happen. This will be addressed in net-next.

[1]
WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
Modules linked in:
CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
[...]
Call Trace:
 <TASK>
 __objagg_obj_get+0x2bb/0x580
 objagg_obj_get+0xe/0x80
 mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
 mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370

Fixes: 9069a3817d ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Ido Schimmel
b4a3a89fff lib: objagg: Fix general protection fault
The library supports aggregation of objects into other objects only if
the parent object does not have a parent itself. That is, nesting is not
supported.

Aggregation happens in two cases: Without and with hints, where hints
are a pre-computed recommendation on how to aggregate the provided
objects.

Nesting is not possible in the first case due to a check that prevents
it, but in the second case there is no check because the assumption is
that nesting cannot happen when creating objects based on hints. The
violation of this assumption leads to various warnings and eventually to
a general protection fault [1].

Before fixing the root cause, error out when nesting happens and warn.

[1]
general protection fault, probably for non-canonical address 0xdead000000000d90: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1083 Comm: kworker/1:9 Tainted: G        W          6.9.0-rc6-custom-gd9b4f1cca7fb #7
Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:mlxsw_sp_acl_erp_bf_insert+0x25/0x80
[...]
Call Trace:
 <TASK>
 mlxsw_sp_acl_atcam_entry_add+0x256/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370
 worker_thread+0x2cb/0x3e0
 kthread+0xd0/0x100
 ret_from_fork+0x34/0x50
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Fixes: 9069a3817d ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Ido Schimmel
2aad28ec45 lib: test_objagg: Fix spelling
Fixes: 0a020d416d ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Ido Schimmel
c1e156ae50 lib: objagg: Fix spelling
Fixes: 0a020d416d ("lib: introduce initial implementation of object aggregation manager")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:14:52 +01:00
Jeff Johnson
425ae3ab5a list: test: add the missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/list-test.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-07 15:59:15 -06:00
Jeff Johnson
a521746821 kunit: add missing MODULE_DESCRIPTION() macros to core modules
make allmodconfig && make W=1 C=1 reports in lib/kunit:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/kunit/kunit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/kunit/kunit-test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/kunit/kunit-example-test.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-07 15:59:05 -06:00
Jeff Johnson
645211db13 crypto: lib - add missing MODULE_DESCRIPTION() macros
Fix the allmodconfig 'make W=1' warnings:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libchacha.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libarc4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libdes.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libpoly1305.o

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-07 19:46:39 +08:00
Linus Torvalds
d30d0e49da Including fixes from BPF and big collection of fixes for WiFi core
and drivers.
 
 Current release - regressions:
 
  - vxlan: fix regression when dropping packets due to invalid src addresses
 
  - bpf: fix a potential use-after-free in bpf_link_free()
 
  - xdp: revert support for redirect to any xsk socket bound to the same
    UMEM as it can result in a corruption
 
  - virtio_net:
    - add missing lock protection when reading return code from control_buf
    - fix false-positive lockdep splat in DIM
    - Revert "wifi: wilc1000: convert list management to RCU"
 
  - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config
 
 Previous releases - regressions:
 
  - rtnetlink: make the "split" NLM_DONE handling generic, restore the old
    behavior for two cases where we started coalescing those messages with
    normal messages, breaking sloppily-coded userspace
 
  - wifi:
    - cfg80211: validate HE operation element parsing
    - cfg80211: fix 6 GHz scan request building
    - mt76: mt7615: add missing chanctx ops
    - ath11k: move power type check to ASSOC stage, fix connecting
      to 6 GHz AP
    - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
    - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
    - iwlwifi: mvm: fix a crash on 7265
 
 Previous releases - always broken:
 
  - ncsi: prevent multi-threaded channel probing, a spec violation
 
  - vmxnet3: disable rx data ring on dma allocation failure
 
  - ethtool: init tsinfo stats if requested, prevent unintentionally
    reporting all-zero stats on devices which don't implement any
 
  - dst_cache: fix possible races in less common IPv6 features
 
  - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED
 
  - ax25: fix two refcounting bugs
 
  - eth: ionic: fix kernel panic in XDP_TX action
 
 Misc:
 
  - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZh3mUACgkQMUZtbf5S
 IrvPwRAApv8X0ZIbPD5PuVEkiYuSkSE6QVou5GaVO7DzF4gj07zPNtCe6B/ZZdBu
 RLdlppxjAmVwdCRmUo0plxSydYZcqFpQqV6lRH/rbWmktWIp0pGIOAcOG7ISRPCC
 FAYJ4udSt4+wrq0hXTsE1KO1JZ0p7zE2bXxNC8uR8wgM9yonUjqhYdAUZhrl3yCY
 zOCD/+kvWFLYtehDcmyNK0ANS3yNveTNkRhXDc1UrpOGMtza60lf5u3bWK+sU5VS
 NGPe9cU60WKMQi6QnWFBZKIcp4Vgy2MukOLdNn9e8BRjFLh2dbY86LAmE4HWPA7I
 ONZagOfEjeOcRSCMdFHxui/PUDZLBZNhrnqQ6x8uC2yKwwIMr+CgEt5sCmVFwH6n
 3HTlWSjL38yuiVuYuhxGchmVnZfC4bLi2qAFF1oxhlDGViBDhAwi36MSCnjDpN8k
 Jo0x6crQLS/uvwVXPKWAUcQhy7OE69A3FwwA1PtkxRX5EQPn1if2Z7yq7YfYb9aD
 bChvCarlfuVDm+CBItphXg0ajVZc+im7+JK62Zn50A1cTbEK0lnYCOcmqzqiqrXI
 Vr3XXt6gVVnvwY374JDO1vmB5ft2IYBn7sWnLcIvR2UlggqEfqMdKSSwm7pOprG9
 YJ/LDAXVmG0kLN7rZUYUBLItnpuHAhYDrBOsV5HaFeksWauc1oY=
 =mwEJ
 -----END PGP SIGNATURE-----

Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from BPF and big collection of fixes for WiFi core and
  drivers.

  Current release - regressions:

   - vxlan: fix regression when dropping packets due to invalid src
     addresses

   - bpf: fix a potential use-after-free in bpf_link_free()

   - xdp: revert support for redirect to any xsk socket bound to the
     same UMEM as it can result in a corruption

   - virtio_net:
      - add missing lock protection when reading return code from
        control_buf
      - fix false-positive lockdep splat in DIM
      - Revert "wifi: wilc1000: convert list management to RCU"

   - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config

  Previous releases - regressions:

   - rtnetlink: make the "split" NLM_DONE handling generic, restore the
     old behavior for two cases where we started coalescing those
     messages with normal messages, breaking sloppily-coded userspace

   - wifi:
      - cfg80211: validate HE operation element parsing
      - cfg80211: fix 6 GHz scan request building
      - mt76: mt7615: add missing chanctx ops
      - ath11k: move power type check to ASSOC stage, fix connecting to
        6 GHz AP
      - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
      - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
      - iwlwifi: mvm: fix a crash on 7265

  Previous releases - always broken:

   - ncsi: prevent multi-threaded channel probing, a spec violation

   - vmxnet3: disable rx data ring on dma allocation failure

   - ethtool: init tsinfo stats if requested, prevent unintentionally
     reporting all-zero stats on devices which don't implement any

   - dst_cache: fix possible races in less common IPv6 features

   - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED

   - ax25: fix two refcounting bugs

   - eth: ionic: fix kernel panic in XDP_TX action

  Misc:

   - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB"

* tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
  selftests: net: lib: set 'i' as local
  selftests: net: lib: avoid error removing empty netns name
  selftests: net: lib: support errexit with busywait
  net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
  ipv6: fix possible race in __fib6_drop_pcpu_from()
  af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
  af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
  af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
  af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
  af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
  af_unix: Annotate data-races around sk->sk_sndbuf.
  af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
  af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
  af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
  af_unix: Annotate data-race of sk->sk_state in unix_accept().
  af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
  af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
  af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
  af_unix: Annodate data-races around sk->sk_state for writers.
  af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
  ...
2024-06-06 09:55:27 -07:00
Jean-Philippe Brucker
0ee1472547 mm/util: Swap kmemdup_array() arguments
GCC 14.1 complains about the argument usage of kmemdup_array():

  drivers/soc/tegra/fuse/fuse-tegra.c:130:65: error: 'kmemdup_array' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
    130 |         fuse->lookups = kmemdup_array(fuse->soc->lookups, sizeof(*fuse->lookups),
        |                                                                 ^
  drivers/soc/tegra/fuse/fuse-tegra.c:130:65: note: earlier argument should specify number of elements, later size of each element

The annotation introduced by commit 7d78a77733 ("string: Add
additional __realloc_size() annotations for "dup" helpers") lets the
compiler think that kmemdup_array() follows the same format as calloc(),
with the number of elements preceding the size of one element. So we
could simply swap the arguments to __realloc_size() to get rid of that
warning, but it seems cleaner to instead have kmemdup_array() follow the
same format as krealloc_array(), memdup_array_user(), calloc() etc.

Fixes: 7d78a77733 ("string: Add additional __realloc_size() annotations for "dup" helpers")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20240606144608.97817-2-jean-philippe@linaro.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-06 08:55:20 -07:00
Jeff Johnson
0d618e3976 lib/math: add missing MODULE_DESCRIPTION() macros
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/math/prime_numbers.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/math/rational-test.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-lib-math-v1-1-11a3bec51ebb@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04 17:40:07 +02:00
Jeff Johnson
5a71c0d118 dyndbg: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_dynamic_debug.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-test_dynamic_debug-v1-1-2194b477f55e@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04 17:40:02 +02:00
Jeff Johnson
c6cab01d7e lib/test_rhashtable: add missing MODULE_DESCRIPTION() macro
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_rhashtable.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240531-md-lib-test_rhashtable-v1-1-cd6d4138f1b6@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-03 18:51:18 -07:00
Dr. David Alan Gilbert
8031042cc5 list: test: remove unused struct 'klist_test_struct'
'klist_test_struct' has been unused since the original
commit 57b4f760f9 ("list: test: Test the klist structure").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-03 11:16:03 -06:00
Jeff Johnson
ec1249d327 test_bpf: Add missing MODULE_DESCRIPTION()
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bpf.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240531-md-lib-test_bpf-v1-1-868e4bd2f9ed@quicinc.com
2024-06-03 17:03:41 +02:00
Kees Cook
99a6087dfd kunit/fortify: Remove __kmalloc_node() test
__kmalloc_node() is considered an "internal" function to the Slab, so
drop it from explicit testing.

Link: https://lore.kernel.org/r/20240531185703.work.588-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2024-05-31 13:47:41 -07:00
Ivan Orlov
2c7afc2a88 kunit: Cover 'assert.c' with tests
There are multiple assertion formatting functions in the `assert.c`
file, which are not covered with tests yet. Implement the KUnit test
for these functions.

The test consists of 11 test cases for the following functions:

1) 'is_literal'
2) 'is_str_literal'
3) 'kunit_assert_prologue', test case for multiple assert types
4) 'kunit_assert_print_msg'
5) 'kunit_unary_assert_format'
6) 'kunit_ptr_not_err_assert_format'
7) 'kunit_binary_assert_format'
8) 'kunit_binary_ptr_assert_format'
9) 'kunit_binary_str_assert_format'
10) 'kunit_assert_hexdump'
11) 'kunit_mem_assert_format'

The test aims at maximizing the branch coverage for the assertion
formatting functions.

As you can see, it covers some of the static helper functions as
well, so mark the static functions in `assert.c` as 'VISIBLE_IF_KUNIT'
and conditionally export them with EXPORT_SYMBOL_IF_KUNIT. Add the
corresponding definitions to `assert.h`.

Build the assert test when CONFIG_KUNIT_TEST is enabled, similar to
how it is done for the string stream test.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-30 12:53:47 -06:00
Vlastimil Babka
a0a44d9175 mm, slab: don't wrap internal functions with alloc_hooks()
The functions __kmalloc_noprof(), kmalloc_large_noprof(),
kmalloc_trace_noprof() and their _node variants are all internal to the
implementations of kmalloc_noprof() and kmalloc_node_noprof() and are
only declared in the "public" slab.h and exported so that those
implementations can be static inline and distinguish the build-time
constant size variants. The only other users for some of the internal
functions are slub_kunit and fortify_kunit tests which make very
short-lived allocations.

Therefore we can stop wrapping them with the alloc_hooks() macro.
Instead add a __ prefix to all of them and a comment documenting these
as internal. Also rename __kmalloc_trace() to __kmalloc_cache() which is
more descriptive - it is a variant of __kmalloc() where the exact
kmalloc cache has been already determined.

The usage in fortify_kunit can be removed completely, as the internal
functions should be tested already through kmalloc() tests in the
test variant that passes non-constant allocation size.

Reported-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-05-28 09:27:50 +02:00
Maxime Ripard
375c4d1583
Merge drm/drm-next into drm-misc-next
Let's start the new release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-27 11:08:31 +02:00
Linus Torvalds
9b62e02e63 16 hotfixes, 11 of which are cc:stable.
A few nilfs2 fixes, the remainder are for MM: a couple of selftests fixes,
 various singletons fixing various issues in various parts.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZlIOUgAKCRDdBJ7gKXxA
 jrYnAP9UeOw8YchTIsjEllmAbTMAqWGI+54CU/qD78jdIHoVWAEAmp0QqgFW3r2p
 jze4jBkh3lGQjykTjkUskaR71h9AZww=
 =AHeV
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "16 hotfixes, 11 of which are cc:stable.

  A few nilfs2 fixes, the remainder are for MM: a couple of selftests
  fixes, various singletons fixing various issues in various parts"

* tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/ksm: fix possible UAF of stable_node
  mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
  mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
  nilfs2: fix potential hang in nilfs_detach_log_writer()
  nilfs2: fix unexpected freezing of nilfs_segctor_sync()
  nilfs2: fix use-after-free of timer for log writer thread
  selftests/mm: fix build warnings on ppc64
  arm64: patching: fix handling of execmem addresses
  selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
  selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
  selftests/mm: compaction_test: fix bogus test success on Aarch64
  mailmap: update email address for Satya Priya
  mm/huge_memory: don't unpoison huge_zero_folio
  kasan, fortify: properly rename memintrinsics
  lib: add version into /proc/allocinfo output
  mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
2024-05-25 15:10:33 -07:00
Suren Baghdasaryan
a38568a0b4 lib: add version into /proc/allocinfo output
Add version string and a header at the beginning of /proc/allocinfo to
allow later format changes.  Example output:

> head /proc/allocinfo
allocinfo - version: 1.0
#     <size>  <calls> <tag info>
           0        0 init/main.c:1314 func:do_initcalls
           0        0 init/do_mounts.c:353 func:mount_nodev_root
           0        0 init/do_mounts.c:187 func:mount_root_generic
           0        0 init/do_mounts.c:158 func:do_mount_root
           0        0 init/initramfs.c:493 func:unpack_to_rootfs
           0        0 init/initramfs.c:492 func:unpack_to_rootfs
           0        0 init/initramfs.c:491 func:unpack_to_rootfs
         512        1 arch/x86/events/rapl.c:681 func:init_rapl_pmus
         128        1 arch/x86/events/rapl.c:571 func:rapl_cpu_online

[akpm@linux-foundation.org: remove stray newline from struct allocinfo_private]
Link: https://lkml.kernel.org/r/20240514163128.3662251-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24 11:55:05 -07:00
Linus Torvalds
b0a9ba13ff hardening fixes for v6.10-rc1
- loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module decompression
   (Stephen Boyd)
 
 - ubsan: Restore dependency on ARCH_HAS_UBSAN
 
 - kunit/fortify: Fix memcmp() test to be amplitude agnostic
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmZP0w0WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJqYDEACWaY0Xjig6Izo+B+85IozTLf2R
 Wv3zlOjUhjbRn7enzhVBRRfU216nl/wp8s7pKhNYCEZ7gJ+04hYtZoLY6YV7jtZ0
 RAvpwc1dmUm7RZIBxjnzqiNTdttNBniPDE47goV0Yi9JVSDFY1Y/P5GwiAr0PO6W
 kt1+WBr2zADNpTZziH8MZou7jfK+y1bOZw8rUUFMODrMc0buuLGO2h+lZqASJXNs
 5NHPUOoJsZHvQxN/YSyE555VycpoyWiwMvA1XOz1NVKdr1eFP1heu88AnIRKOD7o
 cMz6W/yUZ+4dYr2yydDGNX+QvFmZuvPz0oXAlI7BAblpT0UU7xv0jaioAhIam87U
 WxVQSOgkLQBw6Ym79W66HplizCVfEl9aUAYDSK5UJlwdpNE/j16XLYDLKxDi0wUZ
 pjUy5CF0X7FFNyY7Kp5flqzKrQG31vfqZf/yWhtWu258x604LR6CTkO06IJDINx0
 UUrbehie3bGnbu5FS0oVKGH37Mq0aRn4Xk2aUZaFf1Vz/YtU4Wo3FbtyOyFZsdpl
 aCNyYzmNmfVijDQlLshy6HBACeLPV2DjIJ8pcC74abUV1FX6VOvIDsTy4ELkm9BF
 WZ8LNryo79lFsFMThhwfCDHubhXoaLjkl4rpOB5x+Ld0q+GgfIb5jMfF507YxrRj
 3KxJJKXzUKNf+JFnjg==
 =VTTF
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module
   decompression (Stephen Boyd)

 - ubsan: Restore dependency on ARCH_HAS_UBSAN

 - kunit/fortify: Fix memcmp() test to be amplitude agnostic

* tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kunit/fortify: Fix memcmp() test to be amplitude agnostic
  ubsan: Restore dependency on ARCH_HAS_UBSAN
  loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module decompression
2024-05-24 08:33:44 -07:00
Linus Torvalds
c760b3725e - A series ("kbuild: enable more warnings by default") from Arnd
Bergmann which enables a number of additional build-time warnings.  We
   fixed all the fallout which we could find, there may still be a few
   stragglers.
 
 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API".  This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer AMD
   GPUs on RISC-V.
 
 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".
 
 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6OSAAKCRDdBJ7gKXxA
 jpTGAP9hQaZ+g7CO38hKQAtEI8rwcZJtvUAP84pZEGMjYMGLxQD/S8z1o7UHx61j
 DUbnunbOkU/UcPx3Fs/gp4KcJARMEgs=
 =EPi9
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
2024-05-22 18:59:29 -07:00
Linus Torvalds
5c6f4d68e2 A series from Dave Chinner which cleans up and fixes the handling of
nested allocations within stackdepot and page-owner.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6MRwAKCRDdBJ7gKXxA
 jnzeAP9WHW425N7pWmE7rK7n8oXZK9f356dKJMtz2A35Bx6XJgEAuK86kDRA4Kv3
 kg8mtwzOIQYKZWzn5VlcvBbtlhjKGwM=
 =9/Ou
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more mm updates from Andrew Morton:
 "A series from Dave Chinner which cleans up and fixes the handling of
  nested allocations within stackdepot and page-owner"

* tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/page-owner: use gfp_nested_mask() instead of open coded masking
  stackdepot: use gfp_nested_mask() instead of open coded masking
  mm: lift gfp_kmemleak_mask() to gfp.h
2024-05-22 17:32:04 -07:00
Linus Torvalds
f6b8e86b7a TTY/Serial changes for 6.10-rc1
Here is the big set of tty/serial driver changes for 6.10-rc1.  Included
 in here are:
   - Usual good set of api cleanups and evolution by Jiri Slaby to make
     the serial interfaces move out of the 1990's by using kfifos instead
     of hand-rolling their own logic.
   - 8250_exar driver updates
   - max3100 driver updates
   - sc16is7xx driver updates
   - exar driver updates
   - sh-sci driver updates
   - tty ldisc api addition to help refuse bindings
   - other smaller serial driver updates
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZk4Cvg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymqpwCgnHU1NeBBUsvoSDOLk5oApIQ4jVgAn102jWlw
 3dNDhA4i3Ay/mZdv8/Kj
 =TI+P
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial updates from Greg KH:
 "Here is the big set of tty/serial driver changes for 6.10-rc1.
  Included in here are:

   - Usual good set of api cleanups and evolution by Jiri Slaby to make
     the serial interfaces move out of the 1990's by using kfifos
     instead of hand-rolling their own logic.

   - 8250_exar driver updates

   - max3100 driver updates

   - sc16is7xx driver updates

   - exar driver updates

   - sh-sci driver updates

   - tty ldisc api addition to help refuse bindings

   - other smaller serial driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (113 commits)
  serial: Clear UPF_DEAD before calling tty_port_register_device_attr_serdev()
  serial: imx: Raise TX trigger level to 8
  serial: 8250_pnp: Simplify "line" related code
  serial: sh-sci: simplify locking when re-issuing RXDMA fails
  serial: sh-sci: let timeout timer only run when DMA is scheduled
  serial: sh-sci: describe locking requirements for invalidating RXDMA
  serial: sh-sci: protect invalidating RXDMA on shutdown
  tty: add the option to have a tty reject a new ldisc
  serial: core: Call device_set_awake_path() for console port
  dt-bindings: serial: brcm,bcm2835-aux-uart: convert to dtschema
  tty: serial: uartps: Add support for uartps controller reset
  arm64: zynqmp: Add resets property for UART nodes
  dt-bindings: serial: cdns,uart: Add optional reset property
  serial: 8250_pnp: Switch to DEFINE_SIMPLE_DEV_PM_OPS()
  serial: 8250_exar: Keep the includes sorted
  serial: 8250_exar: Make type of bit the same in exar_ee_*_bit()
  serial: 8250_exar: Use BIT() in exar_ee_read()
  serial: 8250_exar: Switch to use dev_err_probe()
  serial: 8250_exar: Return directly from switch-cases
  serial: 8250_exar: Decrease indentation level
  ...
2024-05-22 11:53:02 -07:00
Linus Torvalds
4865a27c66 bitmap patches for 6.10
Hi Linus,
 
 Please pull patches for 6.10. This includes:
  - topology_span_sane() optimization from Kyle Meyer;
  - fns() rework from Kuan-Wei Chiu (used in
    cpumask_local_spread() and other places); and
  - headers cleanup from Andy.
 
 This also adds a MAINTAINERS record for bitops API as it's unattended,
 and I'd like to follow it closer.
 
 Thanks,
 Yury
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmZKh/kACgkQsUSA/Tof
 vshtSQv/eT5+KyXg5qCY3fLaIjWYD0uch5jxkdqtib5BncfIrUMsFpZBon+E2x9C
 fWu7K/nfxUjKZF0Sfgl9gVns6K0rC4F24WzHjzWRVVV7+g4idXwMC1kxSX733KQC
 o+D2065Dx9EmhnzypBbmNsGQsQ09WXP1GsJLf8qSGCw0lT1zNtgqsAD5sSogFGGn
 ca9ZsndThuzTst5lXPXipt1W/c26frchh6SgjVTPjzALCDAf5r9Ls5np3AL1AW8X
 yR8cuV9UphT1ysBplzPbBET/Fy/AGbZl1g4u72M6NvGy/nVkQ5Ic4HZj0zIem0Ic
 C60PokY8lg6hQ7tWN8da12/g6WZINgZcfUfuodKiQAzryBGUJlW0aDzDUZPcCqB/
 gmV/Op4RPJeQr9sibQ6nIFx73ydKVQEmZRliahzXR0p33HJCOLTATOeYqLTXQMdi
 ZwhYCqG5fNEUK0VMBy8S4+tEsUAoykU21hFD04b/Ur8A49bxxJ9RDlAUC0IEc1Pj
 fiU0VPFx
 =H6BQ
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.10v2' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - topology_span_sane() optimization from Kyle Meyer

 - fns() rework from Kuan-Wei Chiu (used in cpumask_local_spread() and
   other places)

 - headers cleanup from Andy

 - add a MAINTAINERS record for bitops API

* tag 'bitmap-for-6.10v2' of https://github.com/norov/linux:
  usercopy: Don't use "proxy" headers
  bitops: Move aligned_byte_mask() to wordpart.h
  MAINTAINERS: add BITOPS API record
  bitmap: relax find_nth_bit() limitation on return value
  lib: make test_bitops compilable into the kernel image
  bitops: Optimize fns() for improved performance
  lib/test_bitops: Add benchmark test for fns()
  Compiler Attributes: Add __always_used macro
  sched/topology: Optimize topology_span_sane()
  cpumask: Add for_each_cpu_from()
2024-05-21 15:29:01 -07:00
Linus Torvalds
3413efa888 Compactifying bdev flags
We can easily have up to 24 flags with sane
 atomicity, _without_ pushing anything out
 of the first cacheline of struct block_device.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZkznRwAKCRBZ7Krx/gZQ
 69XpAQDOZCyvYOZ/dlMOKKLf2vAojC/h++E/NjvGt3erbvVN2wEArXMi13ECsoCw
 JYJA3MsmvjuY6VNcm24icf2/p4TMIgo=
 =JyYi
 -----END PGP SIGNATURE-----

Merge tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull bdev flags update from Al Viro:
 "Compactifying bdev flags.

  We can easily have up to 24 flags with sane atomicity, _without_
  pushing anything out of the first cacheline of struct block_device"

* tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  bdev: move ->bd_make_it_fail to ->__bd_flags
  bdev: move ->bd_ro_warned to ->__bd_flags
  bdev: move ->bd_has_subit_bio to ->__bd_flags
  bdev: move ->bd_write_holder into ->__bd_flags
  bdev: move ->bd_read_only to ->__bd_flags
  bdev: infrastructure for flags
  wrapper for access to ->bd_partno
  Use bdev_is_paritition() instead of open-coding it
2024-05-21 13:02:56 -07:00
Andy Shevchenko
5671dca241 usercopy: Don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use)
principle.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19 16:12:38 -07:00
Andy Shevchenko
9f2c2d6ba1 bitops: Move aligned_byte_mask() to wordpart.h
The bitops.h is for bit related operations. The aligned_byte_mask()
is about byte (or part of the machine word) operations, for which
we have a separate header, move the mentioned macro to wordpart.h
to consolidate similar operations.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19 16:12:38 -07:00
Dave Chinner
70c435ca8d stackdepot: use gfp_nested_mask() instead of open coded masking
The stackdepot code is used by KASAN and lockdep for recoding stack
traces.  Both of these track allocation context information, and so their
internal allocations must obey the caller allocation contexts to avoid
generating their own false positive warnings that have nothing to do with
the code they are instrumenting/tracking.

We also don't want recording stack traces to deplete emergency memory
reserves - debug code is useless if it creates new issues that can't be
replicated when the debug code is disabled.

Switch the stackdepot allocation masking to use gfp_nested_mask() to
address these issues.  gfp_nested_mask() also strips GFP_ZONEMASK
naturally, so that greatly simplifies this code.

Link: https://lkml.kernel.org/r/20240430054604.4169568-3-david@fromorbit.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:40:44 -07:00
Samuel Holland
790a4a3dd1 selftests/fpu: allow building on other architectures
Now that ARCH_HAS_KERNEL_FPU_SUPPORT provides a common way to compile and
run floating-point code, this test is no longer x86-specific.

Link: https://lkml.kernel.org/r/20240329072441.591471-16-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:20 -07:00
Samuel Holland
9613736d85 selftests/fpu: move FP code to a separate translation unit
This ensures no compiler-generated floating-point code can appear outside
kernel_fpu_{begin,end}() sections, and some architectures enforce this
separation.

Link: https://lkml.kernel.org/r/20240329072441.591471-15-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:20 -07:00
Samuel Holland
4be073931c lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
Now that CC_FLAGS_FPU is exported and can be used anywhere in the source
tree, use it instead of duplicating the flags here.

Link: https://lkml.kernel.org/r/20240329072441.591471-7-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:18 -07:00
Linus Torvalds
eb6a9339ef Mainly singleton patches, documented in their respective changelogs.
Notable series include:
 
 - Some maintenance and performance work for ocfs2 in Heming Zhao's
   series "improve write IO performance when fragmentation is high".
 
 - Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
   exposed by fstests".
 
 - kfifo header rework from Andy Shevchenko in the series "kfifo: Clean
   up kfifo.h".
 
 - GDB script fixes from Florian Rommel in the series "scripts/gdb: Fixes
   for $lx_current and $lx_per_cpu".
 
 - After much discussion, a coding-style update from Barry Song
   explaining one reason why inline functions are preferred over macros.
   The series is "codingstyle: avoid unused parameters for a function-like
   macro".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkpLYQAKCRDdBJ7gKXxA
 jo9NAQDctSD3TMXqxqCHLaEpCaYTYzi6TGAVHjgkqGzOt7tYjAD/ZIzgcmRwthjP
 R7SSiSgZ7UnP9JRn16DQILmFeaoG1gs=
 =lYhr
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-mm updates from Andrew Morton:
 "Mainly singleton patches, documented in their respective changelogs.
  Notable series include:

   - Some maintenance and performance work for ocfs2 in Heming Zhao's
     series "improve write IO performance when fragmentation is high".

   - Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
     exposed by fstests".

   - kfifo header rework from Andy Shevchenko in the series "kfifo:
     Clean up kfifo.h".

   - GDB script fixes from Florian Rommel in the series "scripts/gdb:
     Fixes for $lx_current and $lx_per_cpu".

   - After much discussion, a coding-style update from Barry Song
     explaining one reason why inline functions are preferred over
     macros. The series is "codingstyle: avoid unused parameters for a
     function-like macro""

* tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (62 commits)
  fs/proc: fix softlockup in __read_vmcore
  nilfs2: convert BUG_ON() in nilfs_finish_roll_forward() to WARN_ON()
  scripts: checkpatch: check unused parameters for function-like macro
  Documentation: coding-style: ask function-like macros to evaluate parameters
  nilfs2: use __field_struct() for a bitwise field
  selftests/kcmp: remove unused open mode
  nilfs2: remove calls to folio_set_error() and folio_clear_error()
  kernel/watchdog_perf.c: tidy up kerneldoc
  watchdog: allow nmi watchdog to use raw perf event
  watchdog: handle comma separated nmi_watchdog command line
  nilfs2: make superblock data array index computation sparse friendly
  squashfs: remove calls to set the folio error flag
  squashfs: convert squashfs_symlink_read_folio to use folio APIs
  scripts/gdb: fix detection of current CPU in KGDB
  scripts/gdb: make get_thread_info accept pointers
  scripts/gdb: fix parameter handling in $lx_per_cpu
  scripts/gdb: fix failing KGDB detection during probe
  kfifo: don't use "proxy" headers
  media: stih-cec: add missing io.h
  media: rc: add missing io.h
  ...
2024-05-19 14:02:03 -07:00
Linus Torvalds
16dbfae867 bcachefs changes for 6.10-rc1
- More safety fixes, primarily found by syzbot
 
 - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is
   primarily for testing fsck/recovery in dry run mode, so it shouldn't
   change anything besides disabling writes and holding dirty metadata in
   memory.
 
   The idea here was to reduce the amount of activity if we can't write
   anything out, so that bringing up a filesystem in "super ro" mode
   would be more lilkely to work for data recovery - but norecovery is
   the correct option for this.
 
 - btree_trans->locked; we now track whether a btree_trans has any btree
   nodes locked, and this is used for improved assertions related to
   trans_unlock() and trans_relock(). We'll also be using it for
   improving how we work with lockdep in the future: we don't want
   lockdep to be tracking individual btree node locks because we take too
   many for lockdep to track, and it's not necessary since we have a
   cycle detector.
 
 - Trigger improvements that are prep work for online fsck
 
 - BTREE_TRIGGER_check_repair; this regularizes how we do some repair
   work for extents that goes with running triggers in fsck, and fixes
   some subtle issues with transaction restarts there.
 
 - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
   equivalence classes are for when snapshot deletion leaves behind
   redundant snapshot nodes, but snapshot deletion now cleans this up
   right away, so the abstraction doesn't need to leak.
 
 - Improvements to how we resume writing to the journal in recovery. The
   code for picking the new place to write when reading the journal is
   greatly simplified and we also store the position in the superblock
   for when we don't read the journal; this means that we preserve more
   of the journal for list_journal debugging.
 
 - Improvements to sysfs btree_cache and btree_node_cache, for debugging
   memory reclaim.
 
 - We now detect when we've blocked for 10 seconds on the allocator in
   the write path and dump some useful info.
 
 - Safety fixes for devices references: this is a big series that changes
   almost all device lookups to properly check if the device exists and
   take a reference to it.
 
   Previously we assumed that if a bkey exists that references a device
   then the device must exist, and this was enforced in .invalid methods,
   but this was incorrect because it meant device removal relied on
   accounting being correct to not leave keys pointing to invalid
   devices, and that's not something we can assume.
 
   Getting the "pointer to invalid device" checks out of our .invalid()
   methods fixes some long standing device removal bugs; the only
   outstanding bug with device removal now is a race between the discard
   path and deleting alloc info, which should be easily fixed.
 
 - The allocator now prefers not to expand the new
   member_info.btree_allocated bitmap, meaning if repair ever requires
   scanning for btree nodes (because of a corrupt interior nodes) we
   won't have to scan the whole device(s).
 
 - New coding style document, which among other things talks about the
   correct usage of assertions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZKJQgACgkQE6szbY3K
 bnZETg//SU9H0OHnBSMB/cteF6PKo9QR+dhT+3n+gWTxl0o/egbGTqwbzVqGtd2f
 J6II1BsDk8VoTOb/gFfLRShlmJfnj2jpRThU265faR/7LQYeSaqndDPkjOpTayAD
 Nj/DJyiSUTL753rZh3yUhOpOIHf7iapH6wuaZCPfhdfk+yvZNW8iz07JHjHLKRp8
 I2cFH0r6kN916NdRkt9oDCz68WouT8eWTqwcKra04XsLEZjNJHxLpKMq4M8UdPc7
 YynJPVt+aP8+VduGIq6pV8Co3afCP2oUywo11JpRmvLsw4tex/59wxOYtpMfgn6k
 4H+9WqiBwkbmnLDrfFHWRameS6F/7+GRAOVuz9nkmfk61UPU15gLjSRffqZ6u2YC
 7vbrXgebId/sZXtBpQd83RMMX52BnEJah0upNJ54IsSqfDYkU9lwl6CEyYpcX1hf
 YNBGBTbspZztc3AB13b3ow421FMhaySUg0FDmntMR9O8Z6/BXk7Ykc7b8DPEfrFs
 W6JY7q+ARBxr+EgFcV74fvMCf7NJTAhyv80AKryo7NFU2JZOyyaTxcTGSnolX4Mi
 lyHiOgicmOX+vy3vbC1dZoDcmIDJ4Uc0vixYcpKiZqxlR8XJ+wpevC50TEhxrcW+
 ZO4SloQvgyjI34xu/gZgjRYb3BhXK3x+ougVFpRG8V8zQ/+ccWg=
 =MKrF
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - More safety fixes, primarily found by syzbot

 - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is
   primarily for testing fsck/recovery in dry run mode, so it shouldn't
   change anything besides disabling writes and holding dirty metadata
   in memory.

   The idea here was to reduce the amount of activity if we can't write
   anything out, so that bringing up a filesystem in "super ro" mode
   would be more lilkely to work for data recovery - but norecovery is
   the correct option for this.

 - btree_trans->locked; we now track whether a btree_trans has any btree
   nodes locked, and this is used for improved assertions related to
   trans_unlock() and trans_relock(). We'll also be using it for
   improving how we work with lockdep in the future: we don't want
   lockdep to be tracking individual btree node locks because we take
   too many for lockdep to track, and it's not necessary since we have a
   cycle detector.

 - Trigger improvements that are prep work for online fsck

 - BTREE_TRIGGER_check_repair; this regularizes how we do some repair
   work for extents that goes with running triggers in fsck, and fixes
   some subtle issues with transaction restarts there.

 - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
   equivalence classes are for when snapshot deletion leaves behind
   redundant snapshot nodes, but snapshot deletion now cleans this up
   right away, so the abstraction doesn't need to leak.

 - Improvements to how we resume writing to the journal in recovery. The
   code for picking the new place to write when reading the journal is
   greatly simplified and we also store the position in the superblock
   for when we don't read the journal; this means that we preserve more
   of the journal for list_journal debugging.

 - Improvements to sysfs btree_cache and btree_node_cache, for debugging
   memory reclaim.

 - We now detect when we've blocked for 10 seconds on the allocator in
   the write path and dump some useful info.

 - Safety fixes for devices references: this is a big series that
   changes almost all device lookups to properly check if the device
   exists and take a reference to it.

   Previously we assumed that if a bkey exists that references a device
   then the device must exist, and this was enforced in .invalid
   methods, but this was incorrect because it meant device removal
   relied on accounting being correct to not leave keys pointing to
   invalid devices, and that's not something we can assume.

   Getting the "pointer to invalid device" checks out of our .invalid()
   methods fixes some long standing device removal bugs; the only
   outstanding bug with device removal now is a race between the discard
   path and deleting alloc info, which should be easily fixed.

 - The allocator now prefers not to expand the new
   member_info.btree_allocated bitmap, meaning if repair ever requires
   scanning for btree nodes (because of a corrupt interior nodes) we
   won't have to scan the whole device(s).

 - New coding style document, which among other things talks about the
   correct usage of assertions

* tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs: (155 commits)
  bcachefs: add no_invalid_checks flag
  bcachefs: add counters for failed shrinker reclaim
  bcachefs: Fix sb_field_downgrade validation
  bcachefs: Plumb bch_validate_flags to sb_field_ops.validate()
  bcachefs: s/bkey_invalid_flags/bch_validate_flags
  bcachefs: fsync() should not return -EROFS
  bcachefs: Invalid devices are now checked for by fsck, not .invalid methods
  bcachefs: kill bch2_dev_bkey_exists() in bch2_check_fix_ptrs()
  bcachefs: kill bch2_dev_bkey_exists() in bch2_read_endio()
  bcachefs: bch2_dev_get_ioref() checks for device not present
  bcachefs: bch2_dev_get_ioref2(); io_read.c
  bcachefs: bch2_dev_get_ioref2(); debug.c
  bcachefs: bch2_dev_get_ioref2(); journal_io.c
  bcachefs: bch2_dev_get_ioref2(); io_write.c
  bcachefs: bch2_dev_get_ioref2(); btree_io.c
  bcachefs: bch2_dev_get_ioref2(); backpointers.c
  bcachefs: bch2_dev_get_ioref2(); alloc_background.c
  bcachefs: for_each_bset() declares loop iter
  bcachefs: Move BCACHEFS_STATFS_MAGIC value to UAPI magic.h
  bcachefs: Improve sysfs internal/btree_cache
  ...
2024-05-19 13:45:48 -07:00
Linus Torvalds
61307b7be4 The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.  Notable
 series include:
 
 - Lucas Stach has provided some page-mapping
   cleanup/consolidation/maintainability work in the series "mm/treewide:
   Remove pXd_huge() API".
 
 - In the series "Allow migrate on protnone reference with
   MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
   MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
   test.
 
 - In their series "Memory allocation profiling" Kent Overstreet and
   Suren Baghdasaryan have contributed a means of determining (via
   /proc/allocinfo) whereabouts in the kernel memory is being allocated:
   number of calls and amount of memory.
 
 - Matthew Wilcox has provided the series "Various significant MM
   patches" which does a number of rather unrelated things, but in largely
   similar code sites.
 
 - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
   Weiner has fixed the page allocator's handling of migratetype requests,
   with resulting improvements in compaction efficiency.
 
 - In the series "make the hugetlb migration strategy consistent" Baolin
   Wang has fixed a hugetlb migration issue, which should improve hugetlb
   allocation reliability.
 
 - Liu Shixin has hit an I/O meltdown caused by readahead in a
   memory-tight memcg.  Addressed in the series "Fix I/O high when memory
   almost met memcg limit".
 
 - In the series "mm/filemap: optimize folio adding and splitting" Kairui
   Song has optimized pagecache insertion, yielding ~10% performance
   improvement in one test.
 
 - Baoquan He has cleaned up and consolidated the early zone
   initialization code in the series "mm/mm_init.c: refactor
   free_area_init_core()".
 
 - Baoquan has also redone some MM initializatio code in the series
   "mm/init: minor clean up and improvement".
 
 - MM helper cleanups from Christoph Hellwig in his series "remove
   follow_pfn".
 
 - More cleanups from Matthew Wilcox in the series "Various page->flags
   cleanups".
 
 - Vlastimil Babka has contributed maintainability improvements in the
   series "memcg_kmem hooks refactoring".
 
 - More folio conversions and cleanups in Matthew Wilcox's series
 
 	"Convert huge_zero_page to huge_zero_folio"
 	"khugepaged folio conversions"
 	"Remove page_idle and page_young wrappers"
 	"Use folio APIs in procfs"
 	"Clean up __folio_put()"
 	"Some cleanups for memory-failure"
 	"Remove page_mapping()"
 	"More folio compat code removal"
 
 - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
   functions to work on folis".
 
 - Code consolidation and cleanup work related to GUP's handling of
   hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
 
 - Rick Edgecombe has developed some fixes to stack guard gaps in the
   series "Cover a guard gap corner case".
 
 - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
   "mm/ksm: fix ksm exec support for prctl".
 
 - Baolin Wang has implemented NUMA balancing for multi-size THPs.  This
   is a simple first-cut implementation for now.  The series is "support
   multi-size THP numa balancing".
 
 - Cleanups to vma handling helper functions from Matthew Wilcox in the
   series "Unify vma_address and vma_pgoff_address".
 
 - Some selftests maintenance work from Dev Jain in the series
   "selftests/mm: mremap_test: Optimizations and style fixes".
 
 - Improvements to the swapping of multi-size THPs from Ryan Roberts in
   the series "Swap-out mTHP without splitting".
 
 - Kefeng Wang has significantly optimized the handling of arm64's
   permission page faults in the series
 
 	"arch/mm/fault: accelerate pagefault when badaccess"
 	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
 
 - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
   GUP-fast".
 
 - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
   use struct vm_fault".
 
 - selftests build fixes from John Hubbard in the series "Fix
   selftests/mm build without requiring "make headers"".
 
 - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
   series "Improved Memory Tier Creation for CPUless NUMA Nodes".  Fixes
   the initialization code so that migration between different memory types
   works as intended.
 
 - David Hildenbrand has improved follow_pte() and fixed an errant driver
   in the series "mm: follow_pte() improvements and acrn follow_pte()
   fixes".
 
 - David also did some cleanup work on large folio mapcounts in his
   series "mm: mapcount for large folios + page_mapcount() cleanups".
 
 - Folio conversions in KSM in Alex Shi's series "transfer page to folio
   in KSM".
 
 - Barry Song has added some sysfs stats for monitoring multi-size THP's
   in the series "mm: add per-order mTHP alloc and swpout counters".
 
 - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
   and limit checking cleanups".
 
 - Matthew Wilcox has been looking at buffer_head code and found the
   documentation to be lacking.  The series is "Improve buffer head
   documentation".
 
 - Multi-size THPs get more work, this time from Lance Yang.  His series
   "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
   the freeing of these things.
 
 - Kemeng Shi has added more userspace-visible writeback instrumentation
   in the series "Improve visibility of writeback".
 
 - Kemeng Shi then sent some maintenance work on top in the series "Fix
   and cleanups to page-writeback".
 
 - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
   series "Improve anon_vma scalability for anon VMAs".  Intel's test bot
   reported an improbable 3x improvement in one test.
 
 - SeongJae Park adds some DAMON feature work in the series
 
 	"mm/damon: add a DAMOS filter type for page granularity access recheck"
 	"selftests/damon: add DAMOS quota goal test"
 
 - Also some maintenance work in the series
 
 	"mm/damon/paddr: simplify page level access re-check for pageout"
 	"mm/damon: misc fixes and improvements"
 
 - David Hildenbrand has disabled some known-to-fail selftests ni the
   series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
 
 - memcg metadata storage optimizations from Shakeel Butt in "memcg:
   reduce memory consumption by memcg stats".
 
 - DAX fixes and maintenance work from Vishal Verma in the series
   "dax/bus.c: Fixups for dax-bus locking".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
 jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
 nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
 =V3R/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initializatio code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folis".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests ni the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00
Kees Cook
ae1a863bcd kunit/fortify: Fix memcmp() test to be amplitude agnostic
When memcmp() returns a non-zero value, only the signed bit has any
meaning. The actual value may differ between implementations.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2025
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240518184020.work.604-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-18 13:46:10 -07:00
Kees Cook
890a64810d ubsan: Restore dependency on ARCH_HAS_UBSAN
While removing CONFIG_UBSAN_SANITIZE_ALL, ARCH_HAS_UBSAN wasn't correctly
depended on. Restore this, as we do not want to attempt UBSAN builds
unless it's actually been tested on a given architecture.

Reported-by: Masahiro Yamada <masahiroy@kernel.org>
Closes: https://lore.kernel.org/all/20240514095427.541201-1-masahiroy@kernel.org
Fixes: 918327e9b7 ("ubsan: Remove CONFIG_UBSAN_SANITIZE_ALL")
Link: https://lore.kernel.org/r/20240514233747.work.441-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-18 13:46:10 -07:00
Linus Torvalds
25f4874662 RDMA v6.10 merge window
Normal set of driver updates and small fixes:
 
 - Small improvements and fixes for erdma, efa, hfi1, bnxt_re
 
 - Fix a UAF crash after module unload on leaking restrack entry
 
 - Continue adding full RDMA support in mana with support for EQs, GID's
   and CQs
 
 - Improvements to the mkey cache in mlx5
 
 - DSCP traffic class support in hns and several bug fixes
 
 - Cap the maximum number of MADs in the receive queue to avoid OOM
 
 - Another batch of rxe bug fixes from large scale testing
 
 - __iowrite64_copy() optimizations for write combining MMIO memory
 
 - Remove NULL checks before dev_put/hold()
 
 - EFA support for receive with immediate
 
 - Fix a recent memleaking regression in a cma error path
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZkeo2gAKCRCFwuHvBreF
 YbuNAQChzGmS4F0JAn5Wj0CDvkZghELqtvzEb92SzqcgdyQafAD/fC7f23LJ4OsO
 1ZIaQEZu7j9DVg5PKFZ7WfdXjGTKqwA=
 =QRXg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Aside from the usual things this has an arch update for
  __iowrite64_copy() used by the RDMA drivers.

  This API was intended to generate large 64 byte MemWr TLPs on PCI.
  These days most processors had done this by just repeating writel() in
  a loop. S390 and some new ARM64 designs require a special helper to
  get this to generate.

   - Small improvements and fixes for erdma, efa, hfi1, bnxt_re

   - Fix a UAF crash after module unload on leaking restrack entry

   - Continue adding full RDMA support in mana with support for EQs,
     GID's and CQs

   - Improvements to the mkey cache in mlx5

   - DSCP traffic class support in hns and several bug fixes

   - Cap the maximum number of MADs in the receive queue to avoid OOM

   - Another batch of rxe bug fixes from large scale testing

   - __iowrite64_copy() optimizations for write combining MMIO memory

   - Remove NULL checks before dev_put/hold()

   - EFA support for receive with immediate

   - Fix a recent memleaking regression in a cma error path"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (70 commits)
  RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siw
  RDMA/IPoIB: Fix format truncation compilation errors
  bnxt_re: avoid shift undefined behavior in bnxt_qplib_alloc_init_hwq
  RDMA/efa: Support QP with unsolicited write w/ imm. receive
  IB/hfi1: Remove generic .ndo_get_stats64
  IB/hfi1: Do not use custom stat allocator
  RDMA/hfi1: Use RMW accessors for changing LNKCTL2
  RDMA/mana_ib: implement uapi for creation of rnic cq
  RDMA/mana_ib: boundary check before installing cq callbacks
  RDMA/mana_ib: introduce a helper to remove cq callbacks
  RDMA/mana_ib: create and destroy RNIC cqs
  RDMA/mana_ib: create EQs for RNIC CQs
  RDMA/core: Remove NULL check before dev_{put, hold}
  RDMA/ipoib: Remove NULL check before dev_{put, hold}
  RDMA/mlx5: Remove NULL check before dev_{put, hold}
  RDMA/mlx5: Track DCT, DCI and REG_UMR QPs as diver_detail resources.
  RDMA/core: Add an option to display driver-specific QPs in the rdmatool
  RDMA/efa: Add shutdown notifier
  RDMA/mana_ib: Fix missing ret value
  IB/mlx5: Use __iowrite64_copy() for write combining stores
  ...
2024-05-18 13:04:15 -07:00
Linus Torvalds
ff9a79307f Kbuild updates for v6.10
- Avoid 'constexpr', which is a keyword in C23
 
  - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
    'dt_binding_check'
 
  - Fix weak references to avoid GOT entries in position-independent
    code generation
 
  - Convert the last use of 'optional' property in arch/sh/Kconfig
 
  - Remove support for the 'optional' property in Kconfig
 
  - Remove support for Clang's ThinLTO caching, which does not work with
    the .incbin directive
 
  - Change the semantics of $(src) so it always points to the source
    directory, which fixes Makefile inconsistencies between upstream and
    downstream
 
  - Fix 'make tar-pkg' for RISC-V to produce a consistent package
 
  - Provide reasonable default coverage for objtool, sanitizers, and
    profilers
 
  - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.
 
  - Remove the last use of tristate choice in drivers/rapidio/Kconfig
 
  - Various cleanups and fixes in Kconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmZFlGcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG8voQALC8NtFpduWVfLRj2Qg6Ll/xf1vX
 2igcTJEOFHkeqXLGoT8dTDKLEipUBUvKyguPq66CGwVTe2g6zy/nUSXeVtFrUsIa
 msLTi8FqhqUo5lodNvGMRf8qqmuqcvnXoiQwIocF92jtsFy14bhiFY+n4HfcFNjj
 GOKwqBZYQUwY/VVb090efc7RfS9c7uwABJSBelSoxg3AGZriwjGy7Pw5aSKGgVYi
 inqL1eR6qwPP6z7CgQWM99soP+zwybFZmnQrsD9SniRBI4rtAat8Ih5jQFaSUFUQ
 lk2w0NQBRFN88/uR2IJ2GWuIlQ74WeJ+QnCqVuQ59tV5zw90wqSmLzngfPD057Dv
 JjNuhk0UyXVtpIg3lRtd4810ppNSTe33b9OM4O2H846W/crju5oDRNDHcflUXcwm
 Rmn5ho1rb5QVzDVejJbgwidnUInSgJ9PZcvXQ/RJVZPhpgsBzAY9pQexG1G3hviw
 y9UDrt6KP6bF9tHjmolmtdIes9Pj0c4dN6/Rdj4HS4hIQ/GDar0tnwvOvtfUctNL
 orJlBsA6GeMmDVXKkR0ytOCWRYqWWbyt8g70RVKQJfuHX7/hGyAQPaQ2/u4mQhC2
 aevYfbNJMj0VDfGz81HDBKFtkc5n+Ite8l157dHEl2LEabkOkRdNVcn7SNbOvZmd
 ZCSnZ31h7woGfNho
 =D5B/
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Avoid 'constexpr', which is a keyword in C23

 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
   'dt_binding_check'

 - Fix weak references to avoid GOT entries in position-independent code
   generation

 - Convert the last use of 'optional' property in arch/sh/Kconfig

 - Remove support for the 'optional' property in Kconfig

 - Remove support for Clang's ThinLTO caching, which does not work with
   the .incbin directive

 - Change the semantics of $(src) so it always points to the source
   directory, which fixes Makefile inconsistencies between upstream and
   downstream

 - Fix 'make tar-pkg' for RISC-V to produce a consistent package

 - Provide reasonable default coverage for objtool, sanitizers, and
   profilers

 - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.

 - Remove the last use of tristate choice in drivers/rapidio/Kconfig

 - Various cleanups and fixes in Kconfig

* tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits)
  kconfig: use sym_get_choice_menu() in sym_check_prop()
  rapidio: remove choice for enumeration
  kconfig: lxdialog: remove initialization with A_NORMAL
  kconfig: m/nconf: merge two item_add_str() calls
  kconfig: m/nconf: remove dead code to display value of bool choice
  kconfig: m/nconf: remove dead code to display children of choice members
  kconfig: gconf: show checkbox for choice correctly
  kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal
  Makefile: remove redundant tool coverage variables
  kbuild: provide reasonable defaults for tool coverage
  modules: Drop the .export_symbol section from the final modules
  kconfig: use menu_list_for_each_sym() in sym_check_choice_deps()
  kconfig: use sym_get_choice_menu() in conf_write_defconfig()
  kconfig: add sym_get_choice_menu() helper
  kconfig: turn defaults and additional prompt for choice members into error
  kconfig: turn missing prompt for choice members into error
  kconfig: turn conf_choice() into void function
  kconfig: use linked list in sym_set_changed()
  kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED
  kconfig: gconf: remove debug code
  ...
2024-05-18 12:39:20 -07:00
Linus Torvalds
70a663205d Probes updates for v6.10:
- tracing/probes: Adding new pseudo-types %pd and %pD support for dumping
   dentry name from 'struct dentry *' and file name from 'struct file *'.
 
 - uprobes: Some performance optimizations have been done.
  . Speed up the BPF uprobe event by delaying the fetching of the uprobe
    event arguments that are not used in BPF.
  . Avoid locking by speculatively checking whether uprobe event is valid.
  . Reduce lock contention by using read/write_lock instead of spinlock for
    uprobe list operation. This improved BPF uprobe benchmark result 43% on
    average.
 
 - rethook: Removes non-fatal warning messages when tracing stack from BPF
   and skip rcu_is_watching() validation in rethook if possible.
 
 - objpool: Optimizing objpool (which is used by kretprobes and fprobe as
   rethook backend storage) by inlining functions and avoid caching nr_cpu_ids
   because it is a const value.
 
 - fprobe: Add entry/exit callbacks types (code cleanup)
 - kprobes: Check ftrace was killed in kprobes if it uses ftrace.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmZFUxsbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b+fIH/A96/SeC5WRLhXmHfTCM
 IvKUea2n0b0oV/2pVfHqfkCBTICuUZ97Opd9VH9jLtjBOTh0fUOGZ2DNVGdSYfWm
 IIkS5dhuZxHXrSHEVYykwLHI3AOL7Q6Ny9EmOg1CNMidUkPMNtBvppsBYPlFU/B/
 qQJAvOdkVOnNITCaas0+MNgepoVVKdJzdNQ1I4WrGyG8isCZBaCYKo2QcGyheCNN
 y8NXvnVHgmgHQ8nTaeE5AawclFzFnhwHfPQPe1kiyGrx15b8K+VYmaZxPKv33A1a
 KT3TKJ1Ep7s7iWFh2iPVJzIwOXCmSnvNTKfNx/MDuKtO7UVfFwytoMEaekbmv3bG
 VqM=
 =n/mW
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:

 - tracing/probes: Add new pseudo-types %pd and %pD support for dumping
   dentry name from 'struct dentry *' and file name from 'struct file *'

 - uprobes performance optimizations:
    - Speed up the BPF uprobe event by delaying the fetching of the
      uprobe event arguments that are not used in BPF
    - Avoid locking by speculatively checking whether uprobe event is
      valid
    - Reduce lock contention by using read/write_lock instead of
      spinlock for uprobe list operation. This improved BPF uprobe
      benchmark result 43% on average

 - rethook: Remove non-fatal warning messages when tracing stack from
   BPF and skip rcu_is_watching() validation in rethook if possible

 - objpool: Optimize objpool (which is used by kretprobes and fprobe as
   rethook backend storage) by inlining functions and avoid caching
   nr_cpu_ids because it is a const value

 - fprobe: Add entry/exit callbacks types (code cleanup)

 - kprobes: Check ftrace was killed in kprobes if it uses ftrace

* tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobe/ftrace: bail out if ftrace was killed
  selftests/ftrace: Fix required features for VFS type test case
  objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
  objpool: enable inlining objpool_push() and objpool_pop() operations
  rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get()
  ftrace: make extra rcu_is_watching() validation check optional
  uprobes: reduce contention on uprobes_tree access
  rethook: Remove warning messages printed for finding return address of a frame.
  fprobe: Add entry/exit callbacks types
  selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"
  selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD"
  Documentation: tracing: add new type '%pd' and '%pD' for kprobe
  tracing/probes: support '%pD' type for print struct file's name
  tracing/probes: support '%pd' type for print struct dentry's name
  uprobes: add speculative lockless system-wide uprobe filter check
  uprobes: prepare uprobe args buffer lazily
  uprobes: encapsulate preparation of uprobe args buffer
2024-05-17 18:29:30 -07:00
Linus Torvalds
1b294a1f35 Networking changes for 6.10.
Core & protocols
 ----------------
 
  - Complete rework of garbage collection of AF_UNIX sockets.
    AF_UNIX is prone to forming reference count cycles due to fd passing
    functionality. New method based on Tarjan's Strongly Connected Components
    algorithm should be both faster and remove a lot of workarounds
    we accumulated over the years.
 
  - Add TCP fraglist GRO support, allowing chaining multiple TCP packets
    and forwarding them together. Useful for small switches / routers which
    lack basic checksum offload in some scenarios (e.g. PPPoE).
 
  - Support using SMP threads for handling packet backlog i.e. packet
    processing from software interfaces and old drivers which don't
    use NAPI. This helps move the processing out of the softirq jumble.
 
  - Continue work of converting from rtnl lock to RCU protection.
    Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address
    labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files,
    MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs,
    neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link
    information available via rtnetlink.
 
  - Small optimizations from Eric to UDP wake up handling, memory accounting,
    RPS/RFS implementation, TCP packet sizing etc.
 
  - Allow direct page recycling in the bulk API used by XDP, for +2% PPS.
 
  - Support peek with an offset on TCP sockets.
 
  - Add MPTCP APIs for querying last time packets were received/sent/acked,
    and whether MPTCP "upgrade" succeeded on a TCP socket.
 
  - Add intra-node communication shortcut to improve SMC performance.
 
  - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver.
 
  - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.
 
  - Add reset reasons for tracing what caused a TCP reset to be sent.
 
  - Introduce direction attribute for xfrm (IPSec) states.
    State can be used either for input or output packet processing.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().
    This required touch-ups and renaming of a few existing users.
 
  - Add Endian-dependent __counted_by_{le,be} annotations.
 
  - Make building selftests "quieter" by printing summaries like
    "CC object.o" rather than full commands with all the arguments.
 
 Netfilter
 ---------
 
  - Use GFP_KERNEL to clone elements, to deal better with OOM situations
    and avoid failures in the .commit step.
 
 BPF
 ---
 
  - Add eBPF JIT for ARCv2 CPUs.
 
  - Support attaching kprobe BPF programs through kprobe_multi link in
    a session mode, meaning, a BPF program is attached to both function entry
    and return, the entry program can decide if the return program gets
    executed and the entry program can share u64 cookie value with return
    program. "Session mode" is a common use-case for tetragon and bpftrace.
 
  - Add the ability to specify and retrieve BPF cookie for raw tracepoint
    programs in order to ease migration from classic to raw tracepoints.
 
  - Add an internal-only BPF per-CPU instruction for resolving per-CPU
    memory addresses and implement support in x86, ARM64 and RISC-V JITs.
    This allows inlining functions which need to access per-CPU state.
 
  - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
    atomics in bpf_arena which can be JITed as a single x86 instruction.
    Support BPF arena on ARM64.
 
  - Add a new bpf_wq API for deferring events and refactor process-context
    bpf_timer code to keep common code where possible.
 
  - Harden the BPF verifier's and/or/xor value tracking.
 
  - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs.
 
  - Support bpf_tail_call_static() helper for BPF programs with GCC 13.
 
  - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
    program to have code sections where preemption is disabled.
 
 Driver API
 ----------
 
  - Skip software TC processing completely if all installed rules are
    marked as HW-only, instead of checking the HW-only flag rule by rule.
 
  - Add support for configuring PoE (Power over Ethernet), similar to
    the already existing support for PoDL (Power over Data Line) config.
 
  - Initial bits of a queue control API, for now allowing a single queue
    to be reset without disturbing packet flow to other queues.
 
  - Common (ethtool) statistics for hardware timestamping.
 
 Tests and tooling
 -----------------
 
  - Remove the need to create a config file to run the net forwarding tests
    so that a naive "make run_tests" can exercise them.
 
  - Define a method of writing tests which require an external endpoint
    to communicate with (to send/receive data towards the test machine).
    Add a few such tests.
 
  - Create a shared code library for writing Python tests. Expose the YAML
    Netlink library from tools/ to the tests for easy Netlink access.
 
  - Move netfilter tests under net/, extend them, separate performance tests
    from correctness tests, and iron out issues found by running them
    "on every commit".
 
  - Refactor BPF selftests to use common network helpers.
 
  - Further work filling in YAML definitions of Netlink messages for:
    nftables, team driver, bonding interfaces, vlan interfaces, VF info,
    TC u32 mark, TC police action.
 
  - Teach Python YAML Netlink to decode attribute policies.
 
  - Extend the definition of the "indexed array" construct in the specs
    to cover arrays of scalars rather than just nests.
 
  - Add hyperlinks between definitions in generated Netlink docs.
 
 Drivers
 -------
 
  - Make sure unsupported flower control flags are rejected by drivers,
    and make more drivers report errors directly to the application rather
    than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen).
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - support multiple RSS contexts and steering traffic to them
      - support XDP metadata
      - make page pool allocations more NUMA aware
    - Intel (100G, ice, idpf):
      - extract datapath code common among Intel drivers into a library
      - use fewer resources in switchdev by sharing queues with the PF
      - add PFCP filter support
      - add Ethernet filter support
      - use a spinlock instead of HW lock in PTP clock ops
      - support 5 layer Tx scheduler topology
    - nVidia/Mellanox:
      - 800G link modes and 100G SerDes speeds
      - per-queue IRQ coalescing configuration
    - Marvell Octeon:
      - support offloading TC packet mark action
 
  - Ethernet NICs consumer, embedded and virtual:
    - stop lying about skb->truesize in USB Ethernet drivers, it messes up
      TCP memory calculations
    - Google cloud vNIC:
      - support changing ring size via ethtool
      - support ring reset using the queue control API
    - VirtIO net:
      - expose flow hash from RSS to XDP
      - per-queue statistics
      - add selftests
    - Synopsys (stmmac):
      - support controllers which require an RX clock signal from the MII
        bus to perform their hardware initialization
    - TI:
      - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
      - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
      - cpsw: minimal XDP support
    - Renesas (ravb):
      - support describing the MDIO bus
    - Realtek (r8169):
      - add support for RTL8168M
    - Microchip Sparx5:
      - matchall and flower actions mirred and redirect
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - improve events processing performance
    - Marvell:
      - add support for MV88E6250 family internal PHYs
    - Microchip:
      - add DCB and DSCP mapping support for KSZ switches
      - vsc73xx: convert to PHYLINK
    - Realtek:
      - rtl8226b/rtl8221b: add C45 instances and SerDes switching
 
  - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup.
 
  - Ethernet PHYs:
    - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
    - micrel: lan8814: add support for PPS out and external timestamp trigger
 
  - WiFi:
    - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers.
      Modern devices can only be configured using nl80211.
    - mac80211/cfg80211
      - handle color change per link for WiFi 7 Multi-Link Operation
    - Intel (iwlwifi):
      - don't support puncturing in 5 GHz
      - support monitor mode on passive channels
      - BZ-W device support
      - P2P with HE/EHT support
      - re-add support for firmware API 90
      - provide channel survey information for Automatic Channel Selection
    - MediaTek (mt76):
      - mt7921 LED control
      - mt7925 EHT radiotap support
      - mt7920e PCI support
    - Qualcomm (ath11k):
      - P2P support for QCA6390, WCN6855 and QCA2066
      - support hibernation
      - ieee80211-freq-limit Device Tree property support
    - Qualcomm (ath12k):
      - refactoring in preparation of multi-link support
      - suspend and hibernation support
      - ACPI support
      - debugfs support, including dfs_simulate_radar support
    - RealTek:
      - rtw88: RTL8723CS SDIO device support
      - rtw89: RTL8922AE Wi-Fi 7 PCI device support
      - rtw89: complete features of new WiFi 7 chip 8922AE including
        BT-coexistence and Wake-on-WLAN
      - rtw89: use BIOS ACPI settings to set TX power and channels
      - rtl8xxxu: enable Management Frame Protection (MFP) support
 
  - Bluetooth:
    - support for Intel BlazarI and Filmore Peak2 (BE201)
    - support for MediaTek MT7921S SDIO
    - initial support for Intel PCIe BT driver
    - remove HCI_AMP support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZD6sQACgkQMUZtbf5S
 IrtLYw/+I73ePGIye37o2jpbodcLAUZVfF3r6uYUzK8hokEcKD0QVJa9w7PizLZ3
 UO45ClOXFLJCkfP4reFenLfxGCel2AJI+F7VFl2xaO2XgrcH/lnVrHqKZEAEXjls
 KoYMnShIolv7h2MKP6hHtyTi2j1wvQUKsZC71o9/fuW+4fUT8gECx1YtYcL73wrw
 gEMdlUgBYC3jiiCUHJIFX6iPJ2t/TC+q1eIIF2K/Osrk2kIqQhzoozcL4vpuAZQT
 99ljx/qRelXa8oppDb7nM5eulg7WY8ZqxEfFZphTMC5nLEGzClxuOTTl2kDYI/D/
 UZmTWZDY+F5F0xvNk2gH84qVJXBOVDoobpT7hVA/tDuybobc/kvGDzRayEVqVzKj
 Q0tPlJs+xBZpkK5TVnxaFLJVOM+p1Xosxy3kNVXmuYNBvT/R89UbJiCrUKqKZF+L
 z/1mOYUv8UklHqYAeuJSptHvqJjTGa/fsEYP7dAUBbc1N2eVB8mzZ4mgU5rYXbtC
 E6UXXiWnoSRm8bmco9QmcWWoXt5UGEizHSJLz6t1R5Df/YmXhWlytll5aCwY1ksf
 FNoL7S4u7AZThL1Nwi7yUs4CAjhk/N4aOsk+41S0sALCx30BJuI6UdesAxJ0lu+Z
 fwCQYbs27y4p7mBLbkYwcQNxAxGm7PSK4yeyRIy2njiyV4qnLf8=
 =EsC2
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Complete rework of garbage collection of AF_UNIX sockets.

     AF_UNIX is prone to forming reference count cycles due to fd
     passing functionality. New method based on Tarjan's Strongly
     Connected Components algorithm should be both faster and remove a
     lot of workarounds we accumulated over the years.

   - Add TCP fraglist GRO support, allowing chaining multiple TCP
     packets and forwarding them together. Useful for small switches /
     routers which lack basic checksum offload in some scenarios (e.g.
     PPPoE).

   - Support using SMP threads for handling packet backlog i.e. packet
     processing from software interfaces and old drivers which don't use
     NAPI. This helps move the processing out of the softirq jumble.

   - Continue work of converting from rtnl lock to RCU protection.

     Don't require rtnl lock when reading: IPv6 routing FIB, IPv6
     address labels, netdev threaded NAPI sysfs files, bonding driver's
     sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics,
     TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot
     of the link information available via rtnetlink.

   - Small optimizations from Eric to UDP wake up handling, memory
     accounting, RPS/RFS implementation, TCP packet sizing etc.

   - Allow direct page recycling in the bulk API used by XDP, for +2%
     PPS.

   - Support peek with an offset on TCP sockets.

   - Add MPTCP APIs for querying last time packets were received/sent/acked
     and whether MPTCP "upgrade" succeeded on a TCP socket.

   - Add intra-node communication shortcut to improve SMC performance.

   - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol
     driver.

   - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.

   - Add reset reasons for tracing what caused a TCP reset to be sent.

   - Introduce direction attribute for xfrm (IPSec) states. State can be
     used either for input or output packet processing.

  Things we sprinkled into general kernel code:

   - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().

     This required touch-ups and renaming of a few existing users.

   - Add Endian-dependent __counted_by_{le,be} annotations.

   - Make building selftests "quieter" by printing summaries like
     "CC object.o" rather than full commands with all the arguments.

  Netfilter:

   - Use GFP_KERNEL to clone elements, to deal better with OOM
     situations and avoid failures in the .commit step.

  BPF:

   - Add eBPF JIT for ARCv2 CPUs.

   - Support attaching kprobe BPF programs through kprobe_multi link in
     a session mode, meaning, a BPF program is attached to both function
     entry and return, the entry program can decide if the return
     program gets executed and the entry program can share u64 cookie
     value with return program. "Session mode" is a common use-case for
     tetragon and bpftrace.

   - Add the ability to specify and retrieve BPF cookie for raw
     tracepoint programs in order to ease migration from classic to raw
     tracepoints.

   - Add an internal-only BPF per-CPU instruction for resolving per-CPU
     memory addresses and implement support in x86, ARM64 and RISC-V
     JITs. This allows inlining functions which need to access per-CPU
     state.

   - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
     atomics in bpf_arena which can be JITed as a single x86
     instruction. Support BPF arena on ARM64.

   - Add a new bpf_wq API for deferring events and refactor
     process-context bpf_timer code to keep common code where possible.

   - Harden the BPF verifier's and/or/xor value tracking.

   - Introduce crypto kfuncs to let BPF programs call kernel crypto
     APIs.

   - Support bpf_tail_call_static() helper for BPF programs with GCC 13.

   - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
     program to have code sections where preemption is disabled.

  Driver API:

   - Skip software TC processing completely if all installed rules are
     marked as HW-only, instead of checking the HW-only flag rule by
     rule.

   - Add support for configuring PoE (Power over Ethernet), similar to
     the already existing support for PoDL (Power over Data Line)
     config.

   - Initial bits of a queue control API, for now allowing a single
     queue to be reset without disturbing packet flow to other queues.

   - Common (ethtool) statistics for hardware timestamping.

  Tests and tooling:

   - Remove the need to create a config file to run the net forwarding
     tests so that a naive "make run_tests" can exercise them.

   - Define a method of writing tests which require an external endpoint
     to communicate with (to send/receive data towards the test
     machine). Add a few such tests.

   - Create a shared code library for writing Python tests. Expose the
     YAML Netlink library from tools/ to the tests for easy Netlink
     access.

   - Move netfilter tests under net/, extend them, separate performance
     tests from correctness tests, and iron out issues found by running
     them "on every commit".

   - Refactor BPF selftests to use common network helpers.

   - Further work filling in YAML definitions of Netlink messages for:
     nftables, team driver, bonding interfaces, vlan interfaces, VF
     info, TC u32 mark, TC police action.

   - Teach Python YAML Netlink to decode attribute policies.

   - Extend the definition of the "indexed array" construct in the specs
     to cover arrays of scalars rather than just nests.

   - Add hyperlinks between definitions in generated Netlink docs.

  Drivers:

   - Make sure unsupported flower control flags are rejected by drivers,
     and make more drivers report errors directly to the application
     rather than dmesg (large number of driver changes from Asbjørn
     Sloth Tønnesen).

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - support multiple RSS contexts and steering traffic to them
         - support XDP metadata
         - make page pool allocations more NUMA aware
      - Intel (100G, ice, idpf):
         - extract datapath code common among Intel drivers into a library
         - use fewer resources in switchdev by sharing queues with the PF
         - add PFCP filter support
         - add Ethernet filter support
         - use a spinlock instead of HW lock in PTP clock ops
         - support 5 layer Tx scheduler topology
      - nVidia/Mellanox:
         - 800G link modes and 100G SerDes speeds
         - per-queue IRQ coalescing configuration
      - Marvell Octeon:
         - support offloading TC packet mark action

   - Ethernet NICs consumer, embedded and virtual:
      - stop lying about skb->truesize in USB Ethernet drivers, it
        messes up TCP memory calculations
      - Google cloud vNIC:
         - support changing ring size via ethtool
         - support ring reset using the queue control API
      - VirtIO net:
         - expose flow hash from RSS to XDP
         - per-queue statistics
         - add selftests
      - Synopsys (stmmac):
         - support controllers which require an RX clock signal from the
           MII bus to perform their hardware initialization
      - TI:
         - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
         - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
         - cpsw: minimal XDP support
      - Renesas (ravb):
         - support describing the MDIO bus
      - Realtek (r8169):
         - add support for RTL8168M
      - Microchip Sparx5:
         - matchall and flower actions mirred and redirect

   - Ethernet switches:
      - nVidia/Mellanox:
         - improve events processing performance
      - Marvell:
         - add support for MV88E6250 family internal PHYs
      - Microchip:
         - add DCB and DSCP mapping support for KSZ switches
         - vsc73xx: convert to PHYLINK
      - Realtek:
         - rtl8226b/rtl8221b: add C45 instances and SerDes switching

   - Many driver changes related to PHYLIB and PHYLINK deprecated API
     cleanup

   - Ethernet PHYs:
      - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
      - micrel: lan8814: add support for PPS out and external timestamp trigger

   - WiFi:
      - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices
        drivers. Modern devices can only be configured using nl80211.
      - mac80211/cfg80211
         - handle color change per link for WiFi 7 Multi-Link Operation
      - Intel (iwlwifi):
         - don't support puncturing in 5 GHz
         - support monitor mode on passive channels
         - BZ-W device support
         - P2P with HE/EHT support
         - re-add support for firmware API 90
         - provide channel survey information for Automatic Channel Selection
      - MediaTek (mt76):
         - mt7921 LED control
         - mt7925 EHT radiotap support
         - mt7920e PCI support
      - Qualcomm (ath11k):
         - P2P support for QCA6390, WCN6855 and QCA2066
         - support hibernation
         - ieee80211-freq-limit Device Tree property support
      - Qualcomm (ath12k):
         - refactoring in preparation of multi-link support
         - suspend and hibernation support
         - ACPI support
         - debugfs support, including dfs_simulate_radar support
      - RealTek:
         - rtw88: RTL8723CS SDIO device support
         - rtw89: RTL8922AE Wi-Fi 7 PCI device support
         - rtw89: complete features of new WiFi 7 chip 8922AE including
           BT-coexistence and Wake-on-WLAN
         - rtw89: use BIOS ACPI settings to set TX power and channels
         - rtl8xxxu: enable Management Frame Protection (MFP) support

   - Bluetooth:
      - support for Intel BlazarI and Filmore Peak2 (BE201)
      - support for MediaTek MT7921S SDIO
      - initial support for Intel PCIe BT driver
      - remove HCI_AMP support"

* tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits)
  selftests: netfilter: fix packetdrill conntrack testcase
  net: gro: fix napi_gro_cb zeroed alignment
  Bluetooth: btintel_pcie: Refactor and code cleanup
  Bluetooth: btintel_pcie: Fix warning reported by sparse
  Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1
  Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config
  Bluetooth: btintel_pcie: Fix compiler warnings
  Bluetooth: btintel_pcie: Add *setup* function to download firmware
  Bluetooth: btintel_pcie: Add support for PCIe transport
  Bluetooth: btintel: Export few static functions
  Bluetooth: HCI: Remove HCI_AMP support
  Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
  Bluetooth: qca: Fix error code in qca_read_fw_build_info()
  Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning
  Bluetooth: btintel: Add support for Filmore Peak2 (BE201)
  Bluetooth: btintel: Add support for BlazarI
  LE Create Connection command timeout increased to 20 secs
  dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth
  Bluetooth: compute LE flow credits based on recvbuf space
  Bluetooth: hci_sync: Use cmd->num_cis instead of magic number
  ...
2024-05-14 19:42:24 -07:00
Linus Torvalds
896d3fce84 linux_kselftest-kunit-6.10-rc1
This kunit update for Linux 6.10-rc1 consists of:
 
 - fix to race condition in try-catch completion
 - change to __kunit_test_suites_init() to exit early if there is
   nothing to test
 - change to string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER
 - moving fault tests behind KUNIT_FAULT_TEST Kconfig option
 - kthread test fixes and improvements
 - iov_iter test fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmZCNsAACgkQCwJExA0N
 Qxx03w/9EmjF3T16LPaeuerdoypWDcroDT6gpoFXGrvf3lDrna8uDNija5Pb1yMn
 l97wla3IJ1EZRMTy1jgWGQiiGIdkV8hcze65HZMi19qx/49TUbhA/pTmpYC56cp9
 sk2fBjOHz8iI4kdL4eCMr9MpSiwOIDcfWOr1Lh/AP2LHOU1pRdFZbwO6iZ3wyGlJ
 JH4D1CwmfgMGEau4qUo0jvuRbFAf33S+yEI9gr8CskPItljFVO4jVz4lprnTbU9i
 qAOivHzwcHyYc0upb6q2vIlp8vhmDygG/m07lnwfF7ZHsYo+3zV4FkxHspN2+jGA
 frH7Y0X9zt6YjRRMb9NcNnI67VTiSNzdCvB7urUhKlbXoZ2gjtgB7zHeQtAhlXRo
 XVa4QgWBI5ExKBuLI+0yKo4wEO8M0quXxhbX+2Q+tsRnoYmhwb0G8AUyl/26bt2g
 RelGrArDS5eMrlxl97rjMGFrB5Uan2MR751tl+aZPgyNRW3tRKJnQLZmM1z8aFQp
 vGReT6POzCnQ1wLUkcj6mnObbv9XuuYY1BQgKCtmJflvRToEuwpLOKK8Uca7ou3p
 TbVarGIn0jdHv4zGkXrAkt/mhcxanBXhVKLfh/MqQ7fCZBULkSrjJFLhCpvvHwIV
 nckaP2sZWls6FTDuawFOUxrr/+LjJchMmHhFy9MiDaVoieiTg6U=
 =3QIa
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - fix race condition in try-catch completion

 - change __kunit_test_suites_init() to exit early if there is
   nothing to test

 - change string-stream-test to use KUNIT_DEFINE_ACTION_WRAPPER

 - move fault tests behind KUNIT_FAULT_TEST Kconfig option

 - kthread test fixes and improvements

 - iov_iter test fixes

* tag 'linux_kselftest-kunit-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: bail out early in __kunit_test_suites_init() if there are no suites to test
  kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER
  kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option
  kunit: unregister the device on error
  kunit: Fix race condition in try-catch completion
  kunit: Add tests for fault
  kunit: Print last test location on fault
  kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests
  kunit: Handle test faults
  kunit: Fix timeout message
  kunit: Fix kthread reference
  kunit: Handle thread creation error
2024-05-14 11:32:52 -07:00
Linus Torvalds
6bfd2d442a Updates for the interrupt subsystem:
- Core code:
 
    - Interrupt storm detection for the lockup watchdog:
 
      Lockups which are caused by interrupt storms are not easy to debug
      because there is no information about the events which make the lockup
      detector trigger.
 
      To make this more user friendly, provide an extenstion to interrupt
      statistics which allows to take snapshots and an interface to retrieve
      the delta to the snapshot. Use this new mechanism in the watchdog code
      to do a two stage lockup analysis by taking the snapshot and printing
      the deltas for the topmost active interrupts on the second trigger.
 
      Note: This contains both the interrupt and the watchdog changes as
      the latter depend on the former obviously.
 
   - Avoid summation loops in the /proc/interrupts output and use the global
     counter when possible
 
   - Skip suspended interrupts on CPU hotplug operations to ensure that they
     are not delivered before the system resumes the device drivers when
     coming out of suspend.
 
   - On CPU hot-unplug interrupts which are affine to the outgoing CPU are
     migrated to a different CPU in the affinity mask. This can fail when
     the CPUs have no vectors left. Instead of giving up try to migrate it
     to any online CPU and thereby breaking the affinity setting in order to
     prevent a stale device interrupt which targets an offline CPU
 
   - The usual small cleanups
 
  - Driver code:
 
   - Support for the RISCV AIA MSI controller
 
   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion
 
   - The usual set of cleanups and fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmZBCM0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeZHEACqMLN3K+1HyWflYtcTHJeYCjZLHS77
 2tQeKaaskOA4W6dcGXPxMw5CHqAobHVQQMqgcJxhUdqQiOJnFFnrtCD7JtqM0hWK
 UORNbyeovuhAo+iJ0fTuS8p63H7vm2GIWwBLWJnOuChYv/6Yyx5Cald1skvyvbzL
 zePhiiAf5mkdmJMeT5wJSCqEWSRYOXsVAJ/0YAwFG3bKkJH3bmDo6SDJY02sXT5P
 pjbtD/0hum9wIVT4fNdYleHHQMdBdj9dLlcxXBikHq50mDMw7GxvjKiLcXmoerw3
 rEBfVVJp3qpSofpNJZ3HH0ywcF3yUzq04/LPE9Tk2MoQ8NF0GzP8r9Ahke4B7cUj
 FysWNiAlC2IisEi6th313FZkTLx0zgewdsdEBTLt8eAE9TU0wamRbo99LZ8i/Qr3
 hk7jV8DzL+EDQJLgl4p1iPJgA708eW17tbCxLEa15VKVV6P58miohmhx/IfPO2Gx
 FV1PPehtItsmiK/UoRtUCoFdFsqNQtOE+h8DWLyy8RDmhBqGbn9Ut4euXiQIF+rX
 WJKPFfslCTR39BrBcZnZeNsgOCN7tEfFRstzjzkey1DaeTGWtxmA5UGhpC2vT74y
 YyXluvZlgKr4S64ABmcqQj++hQLho0OQAih3uW5YVxt4VxEUcXYMJOsV1AQGpMjF
 UnewWH5opBQdfw==
 =jFLf
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:
 "Core code:

   - Interrupt storm detection for the lockup watchdog:

     Lockups which are caused by interrupt storms are not easy to debug
     because there is no information about the events which make the
     lockup detector trigger.

     To make this more user friendly, provide an extenstion to interrupt
     statistics which allows to take snapshots and an interface to
     retrieve the delta to the snapshot. Use this new mechanism in the
     watchdog code to do a two stage lockup analysis by taking the
     snapshot and printing the deltas for the topmost active interrupts
     on the second trigger.

     Note: This contains both the interrupt and the watchdog changes as
     the latter depend on the former obviously.

   - Avoid summation loops in the /proc/interrupts output and use the
     global counter when possible

   - Skip suspended interrupts on CPU hotplug operations to ensure that
     they are not delivered before the system resumes the device drivers
     when coming out of suspend.

   - On CPU hot-unplug interrupts which are affine to the outgoing CPU
     are migrated to a different CPU in the affinity mask. This can fail
     when the CPUs have no vectors left. Instead of giving up try to
     migrate it to any online CPU and thereby breaking the affinity
     setting in order to prevent a stale device interrupt which targets
     an offline CPU

   - The usual small cleanups

  Driver code:

   - Support for the RISCV AIA MSI controller

   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion

   - The usual set of cleanups and fixes all over the place"

* tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_alloc
  cpuidle: Avoid explicit cpumask allocation on stack
  irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  cpumask: Introduce cpumask_first_and_and()
  irqchip/irq-brcmstb-l2: Avoid saving mask on shutdown
  genirq: Reuse irq_is_nmi()
  genirq/cpuhotplug: Retry with cpu_online_mask when migration fails
  genirq/cpuhotplug: Skip suspended interrupts when restoring affinity
  arm64: dts: st: Add interrupt parent to pinctrl on stm32mp251
  arm64: dts: st: Add exti1 and exti2 nodes on stm32mp251
  ARM: dts: stm32: List exti parent interrupts on stm32mp131
  ARM: dts: stm32: List exti parent interrupts on stm32mp151
  arm64: Kconfig.platforms: Enable STM32_EXTI for ARCH_STM32
  irqchip/stm32-exti: Mark events reserved with RIF configuration check
  irqchip/stm32-exti: Skip secure events
  irqchip/stm32-exti: Convert driver to standard PM
  ...
2024-05-14 09:47:14 -07:00
Linus Torvalds
2d9db778dd Timers and timekeeping updates:
- Core code:
 
    - Make timekeeping and VDSO time readouts resilent against math overflow:
 
      In guest context the kernel is prone to math overflow when the host
      defers the timer interrupt due to overload, malfunction or malice.
 
      This can be mitigated by checking the clocksource delta for the
      maximum deferrement which is readily available. If that value is
      exceeded then the code uses a slowpath function which can handle the
      multiplication overflow.
 
      This functionality is enabled unconditionally in the kernel, but made
      conditional in the VDSO code. The latter is conditional because it
      allows architectures to optimize the check so it is not causing
      performance regressions.
 
      On X86 this is achieved by reworking the existing check for negative
      TSC deltas as a negative delta obviously exceeds the maximum
      deferrement when it is evaluated as an unsigned value. That avoids two
      conditionals in the hotpath and allows to hide both the negative delta
      and the large delta handling in the same slow path.
 
    - Add an initial minimal ktime_t abstraction for Rust
 
    - The usual boring cleanups and enhancements
 
  - Drivers:
 
    - Boring updates to device trees and trivial enhancements in various
      drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmZBErUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZVhD/9iUPzcGNgqGqcO1bXy6dH4xLpeec6o
 2En1vg45DOaygN7DFxkoei20KJtfdFeaaEDH8UqmOfPcpLIuVAd0yqhgDQtx6ZcO
 XNd09SFDInzUt1Ot/WcoXp5N6Wt3vyEgUAlIN1fQdbaZ3fh6OhGhXXCRfiRCGXU1
 ea2pSunLuRf1pKU0AYhGIexnZMOHC4NmVXw/m+WNw5DJrmWB+OaNFKfMoQjtQ1HD
 Vgyr2RALHnIeXm60y2j3dD7TWGXICE/edzOd7pEyg5LFXsmcp388eu/DEdOq3OTV
 tsHLgIi05GJym3dykPBVwZk09M5oVNNfkg9zDxHWhSLkEJmc4QUaH3dgM8uBoaRW
 pS3LaO3ePxWmtAOdSNKFY6xnl6df+PYJoZcIF/GuXgty7im+VLK9C4M05mSjey00
 omcEywvmGdFezY6D9MmjjhFa+q2v9zpRjFpCWaIv3DQdAaDPrOzBk4SSqHZOV4lq
 +hp7ar1mTn1FPrXBouwyOgSOUANISV5cy/QuwOtrVIuVR4rWFVgfWo/7J32/q5Ik
 XBR0lTdQy1Biogf6xy0HCY+4wItOLTqEXXqeknHSMJpDzj5uZglZemgKbix1wVJ9
 8YlD85Q7sktlPmiLMKV9ra0MKVyXDoIrgt4hX98A8M12q9bNdw23x0p0jkJHwGha
 ZYUyX+XxKgOJug==
 =pL+S
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timers and timekeeping updates from Thomas Gleixner:
 "Core code:

   - Make timekeeping and VDSO time readouts resilent against math
     overflow:

     In guest context the kernel is prone to math overflow when the host
     defers the timer interrupt due to overload, malfunction or malice.

     This can be mitigated by checking the clocksource delta for the
     maximum deferrement which is readily available. If that value is
     exceeded then the code uses a slowpath function which can handle
     the multiplication overflow.

     This functionality is enabled unconditionally in the kernel, but
     made conditional in the VDSO code. The latter is conditional
     because it allows architectures to optimize the check so it is not
     causing performance regressions.

     On X86 this is achieved by reworking the existing check for
     negative TSC deltas as a negative delta obviously exceeds the
     maximum deferrement when it is evaluated as an unsigned value. That
     avoids two conditionals in the hotpath and allows to hide both the
     negative delta and the large delta handling in the same slow path.

   - Add an initial minimal ktime_t abstraction for Rust

   - The usual boring cleanups and enhancements

  Drivers:

   - Boring updates to device trees and trivial enhancements in various
     drivers"

* tag 'timers-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  clocksource/drivers/arm_arch_timer: Mark hisi_161010101_oem_info const
  clocksource/drivers/timer-ti-dm: Remove an unused field in struct dmtimer
  clocksource/drivers/renesas-ostm: Avoid reprobe after successful early probe
  clocksource/drivers/renesas-ostm: Allow OSTM driver to reprobe for RZ/V2H(P) SoC
  dt-bindings: timer: renesas: ostm: Document Renesas RZ/V2H(P) SoC
  rust: time: doc: Add missing C header links
  clocksource: Make the int help prompt unit readable in ncurses
  hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active()
  timerqueue: Remove never used function timerqueue_node_expires()
  rust: time: Add Ktime
  vdso: Fix powerpc build U64_MAX undeclared error
  clockevents: Convert s[n]printf() to sysfs_emit()
  clocksource: Convert s[n]printf() to sysfs_emit()
  clocksource: Make watchdog and suspend-timing multiplication overflow safe
  timekeeping: Let timekeeping_cycles_to_ns() handle both under and overflow
  timekeeping: Make delta calculation overflow safe
  timekeeping: Prepare timekeeping_cycles_to_ns() for overflow safety
  timekeeping: Fold in timekeeping_delta_to_ns()
  timekeeping: Consolidate timekeeping helpers
  timekeeping: Refactor timekeeping helpers
  ...
2024-05-14 09:27:40 -07:00
Linus Torvalds
6e5a0c30b6 Scheduler changes for v6.10:
- Add cpufreq pressure feedback for the scheduler
 
  - Rework misfit load-balancing wrt. affinity restrictions
 
  - Clean up and simplify the code around ::overutilized and
    ::overload access.
 
  - Simplify sched_balance_newidle()
 
  - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
    handling that changed the output.
 
  - Rework & clean up <asm/vtime.h> interactions wrt. arch_vtime_task_switch()
 
  - Reorganize, clean up and unify most of the higher level
    scheduler balancing function names around the sched_balance_*()
    prefix.
 
  - Simplify the balancing flag code (sched_balance_running)
 
  - Miscellaneous cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmZBtA0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gQEw//WiCiV7zTlWShSiG/g8GTfoAvl53QTWXF
 0jQ8TUcoIhxB5VeGgxVG1srYt8f505UXjH7L0MJLrbC3nOgRCg4NK57WiQEachKK
 HORIJHT0tMMsKIwX9D5Ovo4xYJn+j7mv7j/caB+hIlzZAbWk+zZPNWcS84p0ZS/4
 appY6RIcp7+cI7bisNMGUuNZS14+WMdWoX3TgoI6ekgDZ7Ky+kQvkwGEMBXsNElO
 qZOj6yS/QUE4Htwz0tVfd6h5svoPM/VJMIvl0yfddPGurfNw6jEh/fjcXnLdAzZ6
 9mgcosETncQbm0vfSac116lrrZIR9ygXW/yXP5S7I5dt+r+5pCrBZR2E5g7U4Ezp
 GjX1+6J9U6r6y12AMLRjadFOcDvxdwtszhZq4/wAcmS3B9dvupnH/w7zqY9ho3wr
 hTdtDHoAIzxJh7RNEHgeUC0/yQX3wJ9THzfYltDRIIjHTuvl4d5lHgsug+4Y9ClE
 pUIQm/XKouweQN9TZz2ULle4ZhRrR9sM9QfZYfirJ/RppmuKool4riWyQFQNHLCy
 mBRMjFFsTpFIOoZXU6pD4EabOpWdNrRRuND/0yg3WbDat2gBWq6jvSFv2UN1/v7i
 Un5jijTuN7t8yP5lY5Tyf47kQfLlA9bUx1v56KnF9mrpI87FyiDD3MiQVhDsvpGX
 rP96BIOrkSo=
 =obph
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Add cpufreq pressure feedback for the scheduler

 - Rework misfit load-balancing wrt affinity restrictions

 - Clean up and simplify the code around ::overutilized and
   ::overload access.

 - Simplify sched_balance_newidle()

 - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
   handling that changed the output.

 - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()

 - Reorganize, clean up and unify most of the higher level
   scheduler balancing function names around the sched_balance_*()
   prefix

 - Simplify the balancing flag code (sched_balance_running)

 - Miscellaneous cleanups & fixes

* tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/pelt: Remove shift of thermal clock
  sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
  thermal/cpufreq: Remove arch_update_thermal_pressure()
  sched/cpufreq: Take cpufreq feedback into account
  cpufreq: Add a cpufreq pressure feedback for the scheduler
  sched/fair: Fix update of rd->sg_overutilized
  sched/vtime: Do not include <asm/vtime.h> header
  s390/irq,nmi: Include <asm/vtime.h> header directly
  s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
  sched/vtime: Get rid of generic vtime_task_switch() implementation
  sched/vtime: Remove confusing arch_vtime_task_switch() declaration
  sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
  sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
  sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
  sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
  sched/fair: Rename root_domain::overload to ::overloaded
  sched/fair: Use helper functions to access root_domain::overload
  sched/fair: Check root_domain::overload value before update
  sched/fair: Combine EAS check with root_domain::overutilized access
  sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
  ...
2024-05-13 17:18:51 -07:00
Linus Torvalds
87caef4220 hardening updates for 6.10-rc1
- selftests: Add str*cmp tests (Ivan Orlov)
 
 - __counted_by: provide UAPI for _le/_be variants (Erick Archer)
 
 - Various strncpy deprecation refactors (Justin Stitt)
 
 - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas Weißschuh)
 
 - UBSAN: Work around i386 -regparm=3 bug with Clang prior to version 19
 
 - Provide helper to deal with non-NUL-terminated string copying
 
 - SCSI: Fix older string copying bugs (with new helper)
 
 - selftests: Consolidate string helper behavioral tests
 
 - selftests: add memcpy() fortify tests
 
 - string: Add additional __realloc_size() annotations for "dup" helpers
 
 - LKDTM: Fix KCFI+rodata+objtool confusion
 
 - hardening.config: Enable KCFI
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmY/yCUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJuf2D/9xlQA7UxUDlm1Z6DPYzTZfNm4M
 D+RJ1QoLNbZEYSzULWvfRSWI+c82qINoSgvtv2DdhWqSKivcMoeNDN846gewfwMY
 0q3iChbhPaNBAHaXat1pf0iA6q2n/wpg1jv1C1PmPVSaEpl0CeQ2MLXSOMz9Gb7G
 FkkaN/v+YlShUzkw61KwKPg959/bh5vCBbeLjSd1XAhLGKU7nWw4yj0J3usTnRbV
 icCnW4mk9SD+pIli/+n7t/QIvPMf6TrJZoSgH9P7YNm+wNme4UEAm1PJz8F+KVAH
 D3CJhlH36l8TrndsHMsHgDjKtUUchh+ExOlWGw3ObUnbU7ST2JP6crAdjtnyT2eN
 uF+ELBT97SskFBAlzOzBSIs8lEwBZzTdJCmWqEBr3ZxxR7lcClmqbJY+X/FhvXko
 o7PvtCbHCatpDPJPZ0e25nVsfEJS29RUED5Gen6vWcUtuvdFEgws70s5BDAbSZTo
 RoJsuDqlRAFLdNDYmEN3UTGcm+PBjPgKsBrXiiNr4Y0BilU67Bzdmd8jiZC9ARe6
 +3cfQRs0uWdemANzvrN5FnrIUhjRHWTvfVTXcC9Jt53HntIuMhhRajJuMcTAX5uQ
 iWACUR14RL8lfInS8phWB5T4AvNexTFc6kVRqNzsGB0ZutsnAsqELttCk57tYQVr
 Hlv/MbePyyLSKF/nYA==
 =CgsW
 -----END PGP SIGNATURE-----

Merge tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "The bulk of the changes here are related to refactoring and expanding
  the KUnit tests for string helper and fortify behavior.

  Some trivial strncpy replacements in fs/ were carried in my tree. Also
  some fixes to SCSI string handling were carried in my tree since the
  helper for those was introduce here. Beyond that, just little fixes
  all around: objtool getting confused about LKDTM+KCFI, preparing for
  future refactors (constification of sysctl tables, additional
  __counted_by annotations), a Clang UBSAN+i386 crash fix, and adding
  more options in the hardening.config Kconfig fragment.

  Summary:

   - selftests: Add str*cmp tests (Ivan Orlov)

   - __counted_by: provide UAPI for _le/_be variants (Erick Archer)

   - Various strncpy deprecation refactors (Justin Stitt)

   - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas
     Weißschuh)

   - UBSAN: Work around i386 -regparm=3 bug with Clang prior to
     version 19

   - Provide helper to deal with non-NUL-terminated string copying

   - SCSI: Fix older string copying bugs (with new helper)

   - selftests: Consolidate string helper behavioral tests

   - selftests: add memcpy() fortify tests

   - string: Add additional __realloc_size() annotations for "dup"
     helpers

   - LKDTM: Fix KCFI+rodata+objtool confusion

   - hardening.config: Enable KCFI"

* tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (29 commits)
  uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be}
  stackleak: Use a copy of the ctl_table argument
  string: Add additional __realloc_size() annotations for "dup" helpers
  kunit/fortify: Fix replaced failure path to unbreak __alloc_size
  hardening: Enable KCFI and some other options
  lkdtm: Disable CFI checking for perms functions
  kunit/fortify: Add memcpy() tests
  kunit/fortify: Do not spam logs with fortify WARNs
  kunit/fortify: Rename tests to use recommended conventions
  init: replace deprecated strncpy with strscpy_pad
  kunit/fortify: Fix mismatched kvalloc()/vfree() usage
  scsi: qla2xxx: Avoid possible run-time warning with long model_num
  scsi: mpi3mr: Avoid possible run-time warning with long manufacturer strings
  scsi: mptfusion: Avoid possible run-time warning with long manufacturer strings
  fs: ecryptfs: replace deprecated strncpy with strscpy
  hfsplus: refactor copy_name to not use strncpy
  reiserfs: replace deprecated strncpy with scnprintf
  virt: acrn: replace deprecated strncpy with strscpy
  ubsan: Avoid i386 UBSAN handler crashes with Clang
  ubsan: Remove 1-element array usage in debug reporting
  ...
2024-05-13 14:14:05 -07:00
Linus Torvalds
0c9f4ac808 for-6.10/block-20240511
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmY/YgsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpvi0EACwnFRtYioizBH0x7QUHTBcIr0IhACd5gfz
 bm+uwlDUtf6G6lupHdJT9gOVB2z2z1m2Pz//8RuUVWw3Eqw2+rfgG8iJd+yo7IaV
 DpX3WaM4NnBvB7FKOKHlMPvGuf7KgbZ3uPm3x8cbrn/axMmkZ6ljxTixJ3p5t4+s
 xRsef/lVdG71DkXIFgTKATB86yNRJNlRQTbL+sZW22vdXdtfyBbOgR1sBuFfp7Hd
 g/uocZM/z0ahM6JH/5R2IX2ttKXMIBZLA8HRkJdvYqg022cj4js2YyRCPU3N6jQN
 MtN4TpJV5I++8l6SPQOOhaDNrK/6zFtDQpwG0YBiKKj3nQDgVbWWb8ejYTIUv4MP
 SrEto4MVBEqg5N65VwYYhIf45rmueFyJp6z0Vqv6Owur5nuww/YIFknmoMa/WDMd
 V8dIU3zL72FZDbPjIBjxHeqAGz9OgzEVafled7pi0Xbw6wqiB4kZihlMGXlD+WBy
 Yd6xo8PX4i5+d2LLKKPxpW1X0eJlKYJ/4dnYCoFN8LmXSiPJnMx2pYrV+NqMxy4X
 Thr8lxswLQC7j9YBBuIeDl8NB9N5FZZLvaC6I25QKq045M2ckJ+VrounsQb3vGwJ
 72nlxxBZL8wz3sasgX9Pc1Cez9AqYbM+UZahq8ezPY5y3Jh0QfRw/MOk1ZaDNC8V
 CNOHBH0E+Q==
 =HnjE
 -----END PGP SIGNATURE-----

Merge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - Add a partscan attribute in sysfs, fixing an issue with systemd
   relying on an internal interface that went away.

 - Attempt #2 at making long running discards interruptible. The
   previous attempt went into 6.9, but we ended up mostly reverting it
   as it had issues.

 - Remove old ida_simple API in bcache

 - Support for zoned write plugging, greatly improving the performance
   on zoned devices.

 - Remove the old throttle low interface, which has been experimental
   since 2017 and never made it beyond that and isn't being used.

 - Remove page->index debugging checks in brd, as it hasn't caught
   anything and prepares us for removing in struct page.

 - MD pull request from Song

 - Don't schedule block workers on isolated CPUs

* tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits)
  blk-throttle: delay initialization until configuration
  blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW
  block: fix that util can be greater than 100%
  block: support to account io_ticks precisely
  block: add plug while submitting IO
  bcache: fix variable length array abuse in btree_iter
  bcache: Remove usage of the deprecated ida_simple_xx() API
  md: Revert "md: Fix overflow in is_mddev_idle"
  blk-lib: check for kill signal in ioctl BLKDISCARD
  block: add a bio_await_chain helper
  block: add a blk_alloc_discard_bio helper
  block: add a bio_chain_and_submit helper
  block: move discard checks into the ioctl handler
  block: remove the discard_granularity check in __blkdev_issue_discard
  block/ioctl: prefer different overflow check
  null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION()
  block: fix and simplify blkdevparts= cmdline parsing
  block: refine the EOF check in blkdev_iomap_begin
  block: add a partscan sysfs attribute for disks
  block: add a disk_has_partscan helper
  ...
2024-05-13 13:03:54 -07:00
Linus Torvalds
b19239143e Hi,
These are the changes for the TPM driver with a single major new
 feature: TPM bus encryption and integrity protection. The key pair
 on TPM side is generated from so called null random seed per power
 on of the machine [1]. This supports the TPM encryption of the hard
 drive by adding layer of protection against bus interposer attacks.
 
 Other than the pull request a few minor fixes and documentation for
 tpm_tis to clarify basics of TPM localities for future patch review
 discussions (will be extended and refined over times, just a seed).
 
 [1] https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iJYEABYKAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZj0l2iAcamFya2tvLnNh
 a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0m8yAP4hBjMtpgAJZ4eZ
 5o9tEQJrh/1JFZJ+8HU5IKPc4RU8BAEAyyYOCtxtS/C5B95iP+LvNla0KWi0pprU
 HsCLULnV2Aw=
 =RTXJ
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull TPM updates from Jarkko Sakkinen:
 "These are the changes for the TPM driver with a single major new
  feature: TPM bus encryption and integrity protection. The key pair on
  TPM side is generated from so called null random seed per power on of
  the machine [1]. This supports the TPM encryption of the hard drive by
  adding layer of protection against bus interposer attacks.

  Other than that, a few minor fixes and documentation for tpm_tis to
  clarify basics of TPM localities for future patch review discussions
  (will be extended and refined over times, just a seed)"

Link: https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ [1]

* tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (28 commits)
  Documentation: tpm: Add TPM security docs toctree entry
  tpm: disable the TPM if NULL name changes
  Documentation: add tpm-security.rst
  tpm: add the null key name as a sysfs export
  KEYS: trusted: Add session encryption protection to the seal/unseal path
  tpm: add session encryption protection to tpm2_get_random()
  tpm: add hmac checks to tpm2_pcr_extend()
  tpm: Add the rest of the session HMAC API
  tpm: Add HMAC session name/handle append
  tpm: Add HMAC session start and end functions
  tpm: Add TCG mandated Key Derivation Functions (KDFs)
  tpm: Add NULL primary creation
  tpm: export the context save and load commands
  tpm: add buffer function to point to returned parameters
  crypto: lib - implement library version of AES in CFB mode
  KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers
  tpm: Add tpm_buf_read_{u8,u16,u32}
  tpm: TPM2B formatted buffers
  tpm: Store the length of the tpm_buf data separately.
  tpm: Update struct tpm_buf documentation comments
  ...
2024-05-13 10:40:15 -07:00
Linus Torvalds
cd97950cbc slab updates for 6.10
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmY8mxAACgkQu+CwddJF
 iJru7AgAmBfolYwYjm9fCkH+px40smQQF08W+ygJaKF4+6e+b5ijfI8H3AG7QtuE
 5FmdCjSvu56lr15sjeUy7giYWRfeEwxC/ztJ0FJ+RCzSEQVKCo2wWGYxDneelwdH
 /v0Of5ENbIiH/svK4TArY9AemZw+nowNrwa4TI1QAEcp47T7x52r0GFOs1pnduep
 eV6uSwHSx00myiF3fuMGQ7P4aUDLNTGn5LSHNI4sykObesGPx4Kvr0zZvhQT41me
 c6Sc0GwV5M9sqBFwjujIeD7CB98wVPju4SDqNiEL+R1u+pnIA0kkefO4D4VyKvpr
 7R/WXmqZI4Ae/HEtcRd8+5Z4FvapPw==
 =7ez3
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
 "This time it's mostly random cleanups and fixes, with two performance
  fixes that might have significant impact, but limited to systems
  experiencing particular bad corner case scenarios rather than general
  performance improvements.

  The memcg hook changes are going through the mm tree due to
  dependencies.

   - Prevent stalls when reading /proc/slabinfo (Jianfeng Wang)

     This fixes the long-standing problem that can happen with workloads
     that have alloc/free patterns resulting in many partially used
     slabs (in e.g. dentry cache). Reading /proc/slabinfo will traverse
     the long partial slab list under spinlock with disabled irqs and
     thus can stall other processes or even trigger the lockup
     detection. The traversal is only done to count free objects so that
     <active_objs> column can be reported along with <num_objs>.

     To avoid affecting fast paths with another shared counter
     (attempted in the past) or complex partial list traversal schemes
     that allow rescheduling, the chosen solution resorts to
     approximation - when the partial list is over 10000 slabs long, we
     will only traverse first 5000 slabs from head and tail each and use
     the average of those to estimate the whole list. Both head and tail
     are used as the slabs near head to tend to have more free objects
     than the slabs towards the tail.

     It is expected the approximation should not break existing
     /proc/slabinfo consumers. The <num_objs> field is still accurate
     and reflects the overall kmem_cache footprint. The <active_objs>
     was already imprecise due to cpu and percpu-partial slabs, so can't
     be relied upon to determine exact cache usage. The difference
     between <active_objs> and <num_objs> is mainly useful to determine
     the slab fragmentation, and that will be possible even with the
     approximation in place.

   - Prevent allocating many slabs when a NUMA node is full (Chen Jun)

     Currently, on NUMA systems with a node under significantly bigger
     pressure than other nodes, the fallback strategy may result in each
     kmalloc_node() that can't be safisfied from the preferred node, to
     allocate a new slab on a fallback node, and not reuse the slabs
     already on that node's partial list.

     This is now fixed and partial lists of fallback nodes are checked
     even for kmalloc_node() allocations. It's still preferred to
     allocate a new slab on the requested node before a fallback, but
     only with a GFP_NOWAIT attempt, which will fail quickly when the
     node is under a significant memory pressure.

   - More SLAB removal related cleanups (Xiu Jianfeng, Hyunmin Lee)

   - Fix slub_kunit self-test with hardened freelists (Guenter Roeck)

   - Mark racy accesses for KCSAN (linke li)

   - Misc cleanups (Xiongwei Song, Haifeng Xu, Sangyun Kim)"

* tag 'slab-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: remove the check for NULL kmalloc_caches
  mm/slub: create kmalloc 96 and 192 caches regardless cache size order
  mm/slub: mark racy access on slab->freelist
  slub: use count_partial_free_approx() in slab_out_of_memory()
  slub: introduce count_partial_free_approx()
  slub: Set __GFP_COMP in kmem_cache by default
  mm/slub: remove duplicate initialization for early_kmem_cache_node_alloc()
  mm/slub: correct comment in do_slab_free()
  mm/slub, kunit: Use inverted data to corrupt kmem cache
  mm/slub: simplify get_partial_node()
  mm/slub: add slub_get_cpu_partial() helper
  mm/slub: remove the check of !kmem_cache_has_cpu_partial()
  mm/slub: Reduce memory consumption in extreme scenarios
  mm/slub: mark racy accesses on slab->slabs
  mm/slub: remove dummy slabinfo functions
2024-05-13 10:28:34 -07:00
Linus Torvalds
2e57d1d606 sparc32,parisc,csky: Provide one-byte and two-byte cmpxchg() support
This series provides native one-byte and two-byte cmpxchg() support
 for sparc32 and parisc, courtesy of Al Viro.  This support is provided
 by the same hashed-array-of-locks technique used for the other atomic
 operations provided for these two platforms.
 
 This series also provides emulated one-byte cmpxchg() support for csky
 using a new cmpxchg_emu_u8() function that uses a four-byte cmpxchg()
 to emulate the one-byte variant.
 
 Similar patches for emulation of one-byte cmpxchg() for arc, sh, and
 xtensa have not yet received maintainer acks, so they are slated for
 the v6.11 merge window.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmY/gZ8THHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jFIjD/0Uu4VZZN96jYbSaDbC5aAkEHg/swBK
 6OVn+yspLOvkebVZlSfus+7rc5VUrxT3GA/gvAWEQsUlPqpYg6Qja/efFpPPRjIq
 lwkFE5HFgE0J4lBo9p78ggm6Hx60WUPlNg9uS23qURZbFTx5TYQyAdzXw9HlYzr8
 jg5IuTtO5L5AZzR2ocDRh4A5sqfcBJCVdVsKO+XzdFLLtgum+kJY7StYLPdY8VtL
 pIV3+ZQENoiwzE+wccnCb2R/4kt6jsEDShlpV4VEfv76HwbjBdvSq4jEg4jS2N3/
 AIyThclD97AEdbbM1oJ3oZdjD3GLGVPhVFfiMSGD5HGA+JVJPjJe2it4o+xY7CIR
 sSdI/E3Rs67qgaga6t2vHygDZABOwgNLAsc4VwM7X6I20fRixkYVc7aVOTnAPzmr
 15iaFd/T7fLKJcC3m/IXb9iNdlfe0Op4+YVD0lOTWmzIk80Xgf45a39u1VFlqQvh
 CLIZG3IdmuxXSWjOmk70iokzJgoSmBriGLbAT3K++pzGYUN/BNQs6XRR77BczFsX
 CbZTZKnEWZMR1U0UWa/TbvUKcsVBZTYebJSvJOG2/+oVqayzvwYfBsE/vWZcI72K
 XEEpKY9ZPDf/gCs/G4OFWt2QPJ0PL+Nt4UZDr5Khrqgo1PwN0uIXstA4mnJ0WjqQ
 sGiACjdTXk4h0w==
 =AEPy
 -----END PGP SIGNATURE-----

Merge tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull cmpxchg updates from Paul McKenney:
 "Provide one-byte and two-byte cmpxchg() support on sparc32, parisc,
  and csky

  This provides native one-byte and two-byte cmpxchg() support for
  sparc32 and parisc, courtesy of Al Viro. This support is provided by
  the same hashed-array-of-locks technique used for the other atomic
  operations provided for these two platforms.

  There is also emulated one-byte cmpxchg() support for csky using a new
  cmpxchg_emu_u8() function that uses a four-byte cmpxchg() to emulate
  the one-byte variant.

  Similar patches for emulation of one-byte cmpxchg() for arc, sh, and
  xtensa have not yet received maintainer acks, so they are slated for
  the v6.11 merge window"

* tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  csky: Emulate one-byte cmpxchg
  lib: Add one-byte emulation function
  parisc: add u16 support to cmpxchg()
  parisc: add missing export of __cmpxchg_u8()
  parisc: unify implementations of __cmpxchg_u{8,32,64}
  parisc: __cmpxchg_u32(): lift conversion into the callers
  sparc32: add __cmpxchg_u{8,16}() and teach __cmpxchg() to handle those sizes
  sparc32: unify __cmpxchg_u{32,64}
  sparc32: make the first argument of __cmpxchg_u64() volatile u64 *
  sparc32: make __cmpxchg_u32() return u32
2024-05-13 10:05:39 -07:00
Linus Torvalds
c22c3e0753 18 hotfixes, 7 of which are cc:stable.
More fixups for this cycle's page_owner updates.  And a few userfaultfd
 fixes.  Otherwise, random singletons - see the individual changelogs for
 details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZj6AhAAKCRDdBJ7gKXxA
 jsvHAQCoSRI4qM0a6j5Fs2Q+B1in+kGWTe50q5Rd755VgolEsgD8CUASDgZ2Qv7g
 yDAlluXMv4uvA4RqkZvDiezsENzYQw0=
 =MApd
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM fixes from Andrew Morton:
 "18 hotfixes, 7 of which are cc:stable.

  More fixups for this cycle's page_owner updates. And a few userfaultfd
  fixes. Otherwise, random singletons - see the individual changelogs
  for details"

* tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Barry Song
  selftests/mm: fix powerpc ARCH check
  mailmap: add entry for John Garry
  XArray: set the marks correctly when splitting an entry
  selftests/vDSO: fix runtime errors on LoongArch
  selftests/vDSO: fix building errors on LoongArch
  mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list
  fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry()
  fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan
  mm/vmalloc: fix return value of vb_alloc if size is 0
  mm: use memalloc_nofs_save() in page_cache_ra_order()
  kmsan: compiler_types: declare __no_sanitize_or_inline
  lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()
  tools: fix userspace compilation with new test_xarray changes
  MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER
  mm: page_owner: fix wrong information in dump_page_owner
  maple_tree: fix mas_empty_area_rev() null pointer dereference
  mm/userfaultfd: reset ptes when close() for wr-protected ones
2024-05-10 14:16:03 -07:00
Masahiro Yamada
b1992c3772 kbuild: use $(src) instead of $(srctree)/$(src) for source directory
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for
checked-in source files. It is merely a convention without any functional
difference. In fact, $(obj) and $(src) are exactly the same, as defined
in scripts/Makefile.build:

    src := $(obj)

When the kernel is built in a separate output directory, $(src) does
not accurately reflect the source directory location. While Kbuild
resolves this discrepancy by specifying VPATH=$(srctree) to search for
source files, it does not cover all cases. For example, when adding a
header search path for local headers, -I$(srctree)/$(src) is typically
passed to the compiler.

This introduces inconsistency between upstream and downstream Makefiles
because $(src) is used instead of $(srctree)/$(src) for the latter.

To address this inconsistency, this commit changes the semantics of
$(src) so that it always points to the directory in the source tree.

Going forward, the variables used in Makefiles will have the following
meanings:

  $(obj)     - directory in the object tree
  $(src)     - directory in the source tree  (changed by this commit)
  $(objtree) - the top of the kernel object tree
  $(srctree) - the top of the kernel source tree

Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced
with $(src).

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-10 04:34:52 +09:00
Ard Biesheuvel
f135440447 crypto: lib - implement library version of AES in CFB mode
Implement AES in CFB mode using the existing, mostly constant-time
generic AES library implementation. This will be used by the TPM code
to encrypt communications with TPM hardware, which is often a discrete
component connected using sniffable wires or traces.

While a CFB template does exist, using a skcipher is a major pain for
non-performance critical synchronous crypto where the algorithm is known
at compile time and the data is in contiguous buffers with valid kernel
virtual addresses.

Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/all/20230216201410.15010-1-James.Bottomley@HansenPartnership.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09 22:30:51 +03:00
Jakub Kicinski
e7073830cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
  35d92abfba ("net: hns3: fix kernel crash when devlink reload during initialization")
  2a1a1a7b5f ("net: hns3: add command queue trace for hns3")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-09 10:01:01 -07:00
Yury Norov
0b2811ba11 bitmap: relax find_nth_bit() limitation on return value
The function claims to return the bitmap size, if Nth bit doesn't exist.
This rule is violated in inline case because the fns() that is used
there doesn't know anything about size of the bitmap.

So, relax this requirement to '>= size', and make the outline
implementation a bit cheaper.

All in-tree kernel users of find_nth_bit() are safe against that.

Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Closes: https://lore.kernel.org/all/Zi50cAgR8nZvgLa3@yury-ThinkPad/T/#m6da806a0525e74dcc91f35e5f20766ed4e853e8a
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09 09:25:08 -07:00
Yury Norov
77db1920a8 lib: make test_bitops compilable into the kernel image
The test now is limited to be compiled as a module. There's no technical
reason for it. Now that the test bears some performance benchmarks, it
would be reasonable to run it at kernel load time, before userspace
starts, to reduce possible jitter.

Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09 09:25:08 -07:00
Kuan-Wei Chiu
0a2c6664e5 lib/test_bitops: Add benchmark test for fns()
Introduce a benchmark test for the fns(). It measures the total time
taken by fns() to process 10,000 test data generated using
get_random_bytes() for each n in the range [0, BITS_PER_LONG).

example:
test_bitops: fns:             7637268 ns

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: David Laight <David.Laight@aculab.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-09 09:25:08 -07:00
Kent Overstreet
4c5b7294de closures: closure_sync_timeout()
Add a new variant of closure_sync_timeout() that takes a timeout.

Note that when this returns -ETIME the closure will still be waiting on
something, i.e. it's not safe to return if you've got a stack allocated
closure.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:22 -04:00
Andy Shevchenko
22bcc915ae kfifo: don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use) principle.

Link: https://lkml.kernel.org/r/20240423192529.3249134-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alain Volmat <alain.volmat@foss.st.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Patrice Chotard <patrice.chotard@foss.st.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Samuel Holland <samuel@sholland.org>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Sean Young <sean@mess.org>
Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-08 08:41:27 -07:00
Florian Fainelli
0d5044b4e7 lib: Allow for the DIM library to be modular
Allow the Dynamic Interrupt Moderation (DIM) library to be built as a
module. This is particularly useful in an Android GKI (Google Kernel
Image) configuration where everything is built as a module, including
Ethernet controller drivers. Having to build DIMLIB into the kernel
image with potentially no user is wasteful.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240506175040.410446-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07 16:42:45 -07:00
Scott Mayhew
5496b9b77d kunit: bail out early in __kunit_test_suites_init() if there are no suites to test
Commit c72a870926 added a mutex to prevent kunit tests from running
concurrently.  Unfortunately that mutex gets locked during module load
regardless of whether the module actually has any kunit tests.  This
causes a problem for kunit tests that might need to load other kernel
modules (e.g. gss_krb5_test loading the camellia module).

So check to see if there are actually any tests to run before locking
the kunit_run_lock mutex.

Fixes: c72a870926 ("kunit: add ability to run tests after boot using debugfs")
Reported-by: Nico Pache <npache@redhat.com>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Ivan Orlov
a96a394577 kunit: string-stream-test: use KUNIT_DEFINE_ACTION_WRAPPER
Use KUNIT_DEFINE_ACTION_WRAPPER macro to define the 'kfree' and
'string_stream_destroy' wrappers for kunit_add_action.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Acked-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
David Gow
4b513a02fd kunit: test: Move fault tests behind KUNIT_FAULT_TEST Kconfig option
The NULL dereference tests in kunit_fault deliberately trigger a kernel
BUG(), and therefore print the associated stack trace, even when the
test passes. This is both annoying (as it bloats the test output), and
can confuse some test harnesses, which assume any BUG() is a failure.

Allow these tests to be specifically disabled (without disabling all
of KUnit's other tests), by placing them behind the
CONFIG_KUNIT_FAULT_TEST Kconfig option. This is enabled by default, but
can be set to 'n' to disable the test. An empty 'kunit_fault' suite is
left behind, which will automatically be marked 'skipped'.

As the fault tests already were disabled under UML (as they weren't
compatible with its fault handling), we can simply adapt those
conditions, and add a dependency on !UML for our new option.

Suggested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/928249cc-e027-4f7f-b43f-502f99a1ea63@roeck-us.net/
Fixes: 82b0beff3497 ("kunit: Add tests for fault")
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Wander Lairson Costa
fabd480b72 kunit: unregister the device on error
kunit_init_device() should unregister the device on bus register error,
but mistakenly it tries to unregister the bus.

Unregister the device instead of the bus.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
David Gow
1eb69ded80 kunit: Fix race condition in try-catch completion
KUnit's try-catch infrastructure now uses vfork_done, which is always
set to a valid completion when a kthread is created, but which is set to
NULL once the thread terminates. This creates a race condition, where
the kthread exits before we can wait on it.

Keep a copy of vfork_done, which is taken before we wake_up_process()
and so valid, and wait on that instead.

Fixes: 93533996100c ("kunit: Handle test faults")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/lkml/20240410102710.35911-1-naresh.kamboju@linaro.org/
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Acked-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
170c31737c kunit: Add tests for fault
Add a test case to check NULL pointer dereference and make sure it would
result as a failed test.

The full kunit_fault test suite is marked as skipped when run on UML
because it would result to a kernel panic.

Tested with:
./tools/testing/kunit/kunit.py run --arch x86_64 kunit_fault
./tools/testing/kunit/kunit.py run --arch arm64 \
  --cross_compile=aarch64-linux-gnu- kunit_fault

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-8-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
8bd5d74bab kunit: Print last test location on fault
This helps identify the location of test faults with opportunistic calls
to _KUNIT_SAVE_LOC().  This can be useful while writing tests or
debugging them.  It is possible to call KUNIT_SUCCESS() to explicit save
last location.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-7-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
70585f05fb kunit: Fix KUNIT_SUCCESS() calls in iov_iter tests
Fix KUNIT_SUCCESS() calls to pass a test argument.

This is a no-op for now because this macro does nothing, but it will be
required for the next commit.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-6-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
3a35c13007 kunit: Handle test faults
Previously, when a kernel test thread crashed (e.g. NULL pointer
dereference, general protection fault), the KUnit test hanged for 30
seconds and exited with a timeout error.

Fix this issue by waiting on task_struct->vfork_done instead of the
custom kunit_try_catch.try_completion, and track the execution state by
initially setting try_result with -EINTR and only setting it to 0 if
the test passed.

Fix kunit_generic_run_threadfn_adapter() signature by returning 0
instead of calling kthread_complete_and_exit().  Because thread's exit
code is never checked, always set it to 0 to make it clear.  To make
this explicit, export kthread_exit() for KUnit tests built as module.

Fix the -EINTR error message, which couldn't be reached until now.

This is tested with a following patch.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: Rae Moar <rmoar@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-5-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
53026ff63b kunit: Fix timeout message
The exit code is always checked, so let's properly handle the -ETIMEDOUT
error code.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-4-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
f8aa1b98ce kunit: Fix kthread reference
There is a race condition when a kthread finishes after the deadline and
before the call to kthread_stop(), which may lead to use after free.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Fixes: adf5054570 ("kunit: fix UAF when run kfence test case test_gfpzero")
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-3-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Mickaël Salaün
cde5e1b4a9 kunit: Handle thread creation error
Previously, if a thread creation failed (e.g. -ENOMEM), the function was
called (kunit_catch_run_case or kunit_catch_run_case_cleanup) without
marking the test as failed.  Instead, fill try_result with the error
code returned by kthread_run(), which will mark the test as failed and
print "internal error occurred...".

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20240408074625.65017-2-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-06 14:22:02 -06:00
Long Li
ba591801a3 xarray: inline xas_descend to improve performance
The commit 63b1898fff ("XArray: Disallow sibling entries of nodes")
modified the xas_descend function in such a way that it was no longer
being compiled as an inline function, because it increased the size of
xas_descend(), and the compiler no longer optimizes it as inline.  This
had a negative impact on performance, xas_descend is called frequently to
traverse downwards in the xarray tree, making it a hot function.

Inlining xas_descend has been shown to significantly improve performance
by approximately 4.95% in the iozone write test.

  Machine: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
  #iozone i 0 -i 1 -s 64g -r 16m -f /test/tmptest

Before this patch:

       kB    reclen    write   rewrite     read    reread
 67108864     16384  2230080   3637689  6315197   5496027

After this patch:

       kB    reclen    write   rewrite     read    reread
 67108864     16384  2340360   3666175  6272401   5460782

Percentage change:
                       4.95%     0.78%   -0.68%    -0.64%

This patch introduces inlining to the xas_descend function. While this
change increases the size of lib/xarray.o, the performance gains in
critical workloads make this an acceptable trade-off.

Size comparison before and after patch:
.text		.data		.bss		file
0x3502		    0		   0		lib/xarray.o.before
0x3602		    0		   0		lib/xarray.o.after

Link: https://lkml.kernel.org/r/20240416061628.3768901-1-leo.lilong@huawei.com
Signed-off-by: Long Li <leo.lilong@huawei.com>
Cc: Hou Tao <houtao1@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: yangerkun <yangerkun@huawei.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:38 -07:00
Matthew Wilcox (Oracle)
2a0774c288 XArray: set the marks correctly when splitting an entry
If we created a new node to replace an entry which had search marks set,
we were setting the search mark on every entry in that node.  That works
fine when we're splitting to order 0, but when splitting to a larger
order, we must not set the search marks on the sibling entries.

Link: https://lkml.kernel.org/r/20240501153120.4094530-1-willy@infradead.org
Fixes: c010d47f10 ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/ZjFGCOYk3FK_zVy3@bombadil.infradead.org
Tested-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:28:08 -07:00
Luis Chamberlain
2aaba39e78 lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()
While testing lib/test_xarray in userspace I've noticed we can fail with:

make -C tools/testing/radix-tree
./tools/testing/radix-tree/xarray

BUG at check_xa_multi_store_adv_add:749
xarray: 0x55905fb21a00x head 0x55905fa1d8e0x flags 0 marks 0 0 0
0: 0x55905fa1d8e0x
xarray: ../../../lib/test_xarray.c:749: check_xa_multi_store_adv_add: Assertion `0' failed.
Aborted

We get a failure with a BUG_ON(), and that is because we actually can
fail due to -ENOMEM, the check in xas_nomem() will fix this for us so
it makes no sense to expect no failure inside the loop. So modify the
check and since this is also useful for instructional purposes clarify
the situation.

The check for XA_BUG_ON(xa, xa_load(xa, index) != p) is already done
at the end of the loop so just remove the bogus on inside the loop.

With this we now pass the test in both kernel and userspace:

In userspace:

./tools/testing/radix-tree/xarray
XArray: 149092856 of 149092856 tests passed

In kernel space:

XArray: 148257077 of 148257077 tests passed

Link: https://lkml.kernel.org/r/20240423192221.301095-3-mcgrof@kernel.org
Fixes: a60cc288a1 ("test_xarray: add tests for advanced multi-index use")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:28:05 -07:00
Liam R. Howlett
955a923d28 maple_tree: fix mas_empty_area_rev() null pointer dereference
Currently the code calls mas_start() followed by mas_data_end() if the
maple state is MA_START, but mas_start() may return with the maple state
node == NULL.  This will lead to a null pointer dereference when checking
information in the NULL node, which is done in mas_data_end().

Avoid setting the offset if there is no node by waiting until after the
maple state is checked for an empty or single entry state.

A user could trigger the events to cause a kernel oops by unmapping all
vmas to produce an empty maple tree, then mapping a vma that would cause
the scenario described above.

Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Marius Fleischer <fleischermarius@gmail.com>
Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/
Tested-by: Marius Fleischer <fleischermarius@gmail.com>
Tested-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:28:05 -07:00
Linus Torvalds
b9158815de Char/Misc driver fixes for 6.9-rc7
Here are some small char/misc/other driver fixes and new device ids for
 6.9-rc7 that resolve some reported problems.
 
 Included in here are:
   - iio driver fixes
   - mei driver fix and new device ids
   - dyndbg bugfix
   - pvpanic-pci driver bugfix
   - slimbus driver bugfix
   - fpga new device id
 
 All have been in linux-next with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZjdD2Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk38wCeJeUXW4/yQ4BTj7cHir0aOowVs+UAnAxCUwzt
 NpooaVg3v9tzLtvAOp1O
 =YfmA
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc/other driver fixes and new device ids
  for 6.9-rc7 that resolve some reported problems.

  Included in here are:

   - iio driver fixes

   - mei driver fix and new device ids

   - dyndbg bugfix

   - pvpanic-pci driver bugfix

   - slimbus driver bugfix

   - fpga new device id

  All have been in linux-next with no reported problems"

* tag 'char-misc-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  slimbus: qcom-ngd-ctrl: Add timeout for wait operation
  dyndbg: fix old BUG_ON in >control parser
  misc/pvpanic-pci: register attributes via pci_driver
  fpga: dfl-pci: add PCI subdevice ID for Intel D5005 card
  mei: me: add lunar lake point M DID
  mei: pxp: match against PCI_CLASS_DISPLAY_OTHER
  iio:imu: adis16475: Fix sync mode setting
  iio: accel: mxc4005: Reset chip on probe() and resume()
  iio: accel: mxc4005: Interrupt handling fixes
  dt-bindings: iio: health: maxim,max30102: fix compatible check
  iio: pressure: Fixes SPI support for BMP3xx devices
  iio: pressure: Fixes BME280 SPI driver data
2024-05-05 10:08:52 -07:00
Al Viro
b8c873edbf wrapper for access to ->bd_partno
On the next step it's going to get folded into a field where flags will go.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-05-02 17:48:09 -04:00
Al Viro
3f9b8fb46e Use bdev_is_paritition() instead of open-coding it
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-05-02 17:48:09 -04:00
Jakub Kicinski
e958da0ddb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

include/linux/filter.h
kernel/bpf/core.c
  66e13b615a ("bpf: verifier: prevent userspace memory access")
  d503a04f8b ("bpf: Add support for certain atomics in bpf_arena to x86 JIT")
https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02 12:06:25 -07:00
Linus Torvalds
545c494465 Including fixes from bpf.
Relatively calm week, likely due to public holiday in most places.
 No known outstanding regressions.
 
 Current release - regressions:
 
   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()
 
   - eth: e1000e: change usleep_range to udelay in PHY mdic access
 
 Previous releases - regressions:
 
   - gro: fix udp bad offset in socket lookup
 
   - bpf: fix incorrect runtime stat for arm64
 
   - tipc: fix UAF in error path
 
   - netfs: fix a potential infinite loop in extract_user_to_sg()
 
   - eth: ice: ensure the copied buf is NUL terminated
 
   - eth: qeth: fix kernel panic after setting hsuid
 
 Previous releases - always broken:
 
   - bpf:
     - verifier: prevent userspace memory access
     - xdp: use flags field to disambiguate broadcast redirect
 
   - bridge: fix multicast-to-unicast with fraglist GSO
 
   - mptcp: ensure snd_nxt is properly initialized on connect
 
   - nsh: fix outer header access in nsh_gso_segment().
 
   - eth: bcmgenet: fix racing registers access
 
   - eth: vxlan: fix stats counters.
 
 Misc:
 
   - a bunch of MAINTAINERS file updates
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmYzaRsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkh70P/jzsTsvzHspu3RUwcsyvWpSoJPcxP2tF
 5SKR66o8sbSjB5I26zUi/LtRZgbPO32GmLN2Y8GvP74h9lwKdDo4AY4volZKCT6f
 lRG6GohvMa0lSPSn1fti7CKVzDOsaTHvLz3uBBr+Xb9ITCKh+I+zGEEDGj/47SQN
 tmDWHPF8OMs2ezmYS5NqRIQ3CeRz6uyLmEoZhVm4SolypZ18oEg7GCtL3u6U48n+
 e3XB3WwKl0ZxK8ipvPgUDwGIDuM5hEyAaeNon3zpYGoqitRsRITUjULpb9dT4DtJ
 Jma3OkarFJNXgm4N/p/nAtQ9AdiAloF9ivZXs2t0XCdrrUZJUh05yuikoX+mLfpw
 GedG2AbaVl6mdqNkrHeyf5SXKuiPgeCLVfF2xMjS0l1kFbY+Bt8BqnRSdOrcoUG0
 zlSzBeBtajttMdnalWv2ZshjP8uo/NjXydUjoVNwuq8xGO5wP+zhNnwhOvecNyUg
 t7q2PLokahlz4oyDqyY/7SQ0hSEndqxOlt43I6CthoWH0XkS83nTPdQXcTKQParD
 ntJUk5QYwefUT1gimbn/N8GoP7a1+ysWiqcf/7+SNm932gJGiDt36+HOEmyhIfIG
 IDWTWJJW64SnPBIUw59MrG7hMtbfaiZiFQqeUJQpFVrRr+tg5z5NUZ5thA+EJVd8
 qiVDvmngZFiv
 =f6KY
 -----END PGP SIGNATURE-----

Merge tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf.

  Relatively calm week, likely due to public holiday in most places. No
  known outstanding regressions.

  Current release - regressions:

   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()

   - eth: e1000e: change usleep_range to udelay in PHY mdic access

  Previous releases - regressions:

   - gro: fix udp bad offset in socket lookup

   - bpf: fix incorrect runtime stat for arm64

   - tipc: fix UAF in error path

   - netfs: fix a potential infinite loop in extract_user_to_sg()

   - eth: ice: ensure the copied buf is NUL terminated

   - eth: qeth: fix kernel panic after setting hsuid

  Previous releases - always broken:

   - bpf:
       - verifier: prevent userspace memory access
       - xdp: use flags field to disambiguate broadcast redirect

   - bridge: fix multicast-to-unicast with fraglist GSO

   - mptcp: ensure snd_nxt is properly initialized on connect

   - nsh: fix outer header access in nsh_gso_segment().

   - eth: bcmgenet: fix racing registers access

   - eth: vxlan: fix stats counters.

  Misc:

   - a bunch of MAINTAINERS file updates"

* tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
  MAINTAINERS: remove Ariel Elior
  net: gro: add flush check in udp_gro_receive_segment
  net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  ipv4: Fix uninit-value access in __ip_make_skb()
  s390/qeth: Fix kernel panic after setting hsuid
  vxlan: Pull inner IP header in vxlan_rcv().
  tipc: fix a possible memleak in tipc_buf_append
  tipc: fix UAF in error path
  rxrpc: Clients must accept conn from any address
  net: core: reject skb_copy(_expand) for fraglist GSO skbs
  net: bridge: fix multicast-to-unicast with fraglist GSO
  mptcp: ensure snd_nxt is properly initialized on connect
  e1000e: change usleep_range to udelay in PHY mdic access
  net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
  cxgb4: Properly lock TX queue for the selftest.
  rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
  vxlan: Add missing VNI filter counter update in arp_reduce().
  vxlan: Fix racy device stats updates.
  net: qede: use return from qede_parse_actions()
  ...
2024-05-02 08:51:47 -07:00
Kees Cook
7d78a77733 string: Add additional __realloc_size() annotations for "dup" helpers
Several other "dup"-style interfaces could use the __realloc_size()
attribute. (As a reminder to myself and others: "realloc" is used here
instead of "alloc" because the "alloc_size" attribute implies that the
memory contents are uninitialized. Since we're copying contents into the
resulting allocation, it must use "realloc_size" to avoid confusing the
compiler's optimization passes.)

Add KUnit test coverage where possible. (KUnit still does not have the
ability to manipulate userspace memory.)

Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240502145218.it.729-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-02 07:52:41 -07:00
Ard Biesheuvel
377d909511 vmlinux: Avoid weak reference to notes section
Weak references are references that are permitted to remain unsatisfied
in the final link. This means they cannot be implemented using place
relative relocations, resulting in GOT entries when using position
independent code generation.

The notes section should always exist, so the weak annotations can be
omitted.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-05-02 19:48:26 +09:00
Thomas Zimmermann
b0a835db17 Merge drm/drm-next into drm-misc-next
Backmerging to get DRM fixes from v6.9-rc6.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-05-02 11:25:27 +02:00
Jocelyn Falempe
b94605a388 lib/fonts: Allow to select fonts for drm_panic
drm_panic has been introduced recently, and uses the same fonts as
FRAMEBUFFER_CONSOLE.

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419132243.154466-1-jfalempe@redhat.com
2024-05-02 09:40:17 +02:00
Kees Cook
74df22453c kunit/fortify: Fix replaced failure path to unbreak __alloc_size
The __alloc_size annotation for kmemdup() was getting disabled under
KUnit testing because the replaced fortify_panic macro implementation
was using "return NULL" as a way to survive the sanity checking. But
having the chance to return NULL invalidated __alloc_size, so kmemdup
was not passing the __builtin_dynamic_object_size() tests any more:

[23:26:18] [PASSED] fortify_test_alloc_size_kmalloc_const
[23:26:19]     # fortify_test_alloc_size_kmalloc_dynamic: EXPECTATION FAILED at lib/fortify_kunit.c:265
[23:26:19]     Expected __builtin_dynamic_object_size(p, 1) == expected, but
[23:26:19]         __builtin_dynamic_object_size(p, 1) == -1 (0xffffffffffffffff)
[23:26:19]         expected == 11 (0xb)
[23:26:19] __alloc_size() not working with __bdos on kmemdup("hello there", len, gfp)
[23:26:19] [FAILED] fortify_test_alloc_size_kmalloc_dynamic

Normal builds were not affected: __alloc_size continued to work there.

Use a zero-sized allocation instead, which allows __alloc_size to
behave.

Fixes: 4ce615e798 ("fortify: Provide KUnit counters for failure testing")
Fixes: fa4a3f86d4 ("fortify: Add KUnit tests for runtime overflows")
Link: https://lore.kernel.org/r/20240501232937.work.532-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-01 16:35:06 -07:00
Andrii Nakryiko
78d0b16127 objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
Profiling shows that calling nr_possible_cpus() in objpool_pop() takes
a noticeable amount of CPU (when profiled on 80-core machine), as we
need to recalculate number of set bits in a CPU bit mask. This number
can't change, so there is no point in paying the price for recalculating
it. As such, cache this value in struct objpool_head and use it in
objpool_pop().

On the other hand, cached pool->nr_cpus isn't necessary, as it's not
used in hot path and is also a pretty trivial value to retrieve. So drop
pool->nr_cpus in favor of using nr_cpu_ids everywhere. This way the size
of struct objpool_head remains the same, which is a nice bonus.

Same BPF selftests benchmarks were used to evaluate the effect. Using
changes in previous patch (inlining of objpool_pop/objpool_push) as
baseline, here are the differences:

BASELINE
========
kretprobe      :    9.937 ± 0.174M/s
kretprobe-multi:   10.440 ± 0.108M/s

AFTER
=====
kretprobe      :   10.106 ± 0.120M/s (+1.7%)
kretprobe-multi:   10.515 ± 0.180M/s (+0.7%)

Link: https://lore.kernel.org/all/20240424215214.3956041-3-andrii@kernel.org/

Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-01 23:18:48 +09:00
Andrii Nakryiko
a3b00f10da objpool: enable inlining objpool_push() and objpool_pop() operations
objpool_push() and objpool_pop() are very performance-critical functions
and can be called very frequently in kretprobe triggering path.

As such, it makes sense to allow compiler to inline them completely to
eliminate function calls overhead. Luckily, their logic is quite well
isolated and doesn't have any sprawling dependencies.

This patch moves both objpool_push() and objpool_pop() into
include/linux/objpool.h and marks them as static inline functions,
enabling inlining. To avoid anyone using internal helpers
(objpool_try_get_slot, objpool_try_add_slot), rename them to use leading
underscores.

We used kretprobe microbenchmark from BPF selftests (bench trig-kprobe
and trig-kprobe-multi benchmarks) running no-op BPF kretprobe/kretprobe.multi
programs in a tight loop to evaluate the effect. BPF own overhead in
this case is minimal and it mostly stresses the rest of in-kernel
kretprobe infrastructure overhead. Results are in millions of calls per
second. This is not super scientific, but shows the trend nevertheless.

BEFORE
======
kretprobe      :    9.794 ± 0.086M/s
kretprobe-multi:   10.219 ± 0.032M/s

AFTER
=====
kretprobe      :    9.937 ± 0.174M/s (+1.5%)
kretprobe-multi:   10.440 ± 0.108M/s (+2.2%)

Link: https://lore.kernel.org/all/20240424215214.3956041-2-andrii@kernel.org/

Cc: Matt (Qiang) Wu <wuqiang.matt@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-01 23:18:48 +09:00
Kees Cook
26f812ba75 kunit/fortify: Add memcpy() tests
Add fortify tests for memcpy() and memmove(). This can use a similar
method to the fortify_panic() replacement, only we can do it for what
was the WARN_ONCE(), which can be redefined.

Since this is primarily testing the fortify behaviors of the memcpy()
and memmove() defenses, the tests for memcpy() and memmove() are
identical.

Link: https://lore.kernel.org/r/20240429194342.2421639-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30 10:34:30 -07:00
Kees Cook
091f79e8de kunit/fortify: Do not spam logs with fortify WARNs
When running KUnit fortify tests, we're already doing precise tracking
of which warnings are getting hit. Don't fill the logs with WARNs unless
we've been explicitly built with DEBUG enabled.

Link: https://lore.kernel.org/r/20240429194342.2421639-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30 10:34:29 -07:00
Kees Cook
a0d6677ec3 kunit/fortify: Rename tests to use recommended conventions
The recommended conventions for KUnit tests is ${module}_test_${what}.
Adjust the fortify tests to match.

Link: https://lore.kernel.org/r/20240429194342.2421639-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-30 10:34:29 -07:00
Jim Cromie
00e7d3bea2 dyndbg: fix old BUG_ON in >control parser
Fix a BUG_ON from 2009.  Even if it looks "unreachable" (I didn't
really look), lets make sure by removing it, doing pr_err and return
-EINVAL instead.

Cc: stable <stable@kernel.org>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20240429193145.66543-2-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-30 09:20:48 +02:00
Jakub Kicinski
89de2db193 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZi9+AAAKCRDbK58LschI
 g0nEAP487m7L0nLVriC2oIOWsi29tklW3etm6DO7gmGRGIHgrgEAnMyV1xBj3bGj
 v6jJwDcybCym1hLx+1x1JCZ4eoAFswE=
 =xbna
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-04-29

We've added 147 non-merge commits during the last 32 day(s) which contain
a total of 158 files changed, 9400 insertions(+), 2213 deletions(-).

The main changes are:

1) Add an internal-only BPF per-CPU instruction for resolving per-CPU
   memory addresses and implement support in x86 BPF JIT. This allows
   inlining per-CPU array and hashmap lookups
   and the bpf_get_smp_processor_id() helper, from Andrii Nakryiko.

2) Add BPF link support for sk_msg and sk_skb programs, from Yonghong Song.

3) Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
   atomics in bpf_arena which can be JITed as a single x86 instruction,
   from Alexei Starovoitov.

4) Add support for passing mark with bpf_fib_lookup helper,
   from Anton Protopopov.

5) Add a new bpf_wq API for deferring events and refactor sleepable
   bpf_timer code to keep common code where possible,
   from Benjamin Tissoires.

6) Fix BPF_PROG_TEST_RUN infra with regards to bpf_dummy_struct_ops programs
   to check when NULL is passed for non-NULLable parameters,
   from Eduard Zingerman.

7) Harden the BPF verifier's and/or/xor value tracking,
   from Harishankar Vishwanathan.

8) Introduce crypto kfuncs to make BPF programs able to utilize the kernel
   crypto subsystem, from Vadim Fedorenko.

9) Various improvements to the BPF instruction set standardization doc,
   from Dave Thaler.

10) Extend libbpf APIs to partially consume items from the BPF ringbuffer,
    from Andrea Righi.

11) Bigger batch of BPF selftests refactoring to use common network helpers
    and to drop duplicate code, from Geliang Tang.

12) Support bpf_tail_call_static() helper for BPF programs with GCC 13,
    from Jose E. Marchesi.

13) Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
    program to have code sections where preemption is disabled,
    from Kumar Kartikeya Dwivedi.

14) Allow invoking BPF kfuncs from BPF_PROG_TYPE_SYSCALL programs,
    from David Vernet.

15) Extend the BPF verifier to allow different input maps for a given
    bpf_for_each_map_elem() helper call in a BPF program, from Philo Lu.

16) Add support for PROBE_MEM32 and bpf_addr_space_cast instructions
    for riscv64 and arm64 JITs to enable BPF Arena, from Puranjay Mohan.

17) Shut up a false-positive KMSAN splat in interpreter mode by unpoison
    the stack memory, from Martin KaFai Lau.

18) Improve xsk selftest coverage with new tests on maximum and minimum
    hardware ring size configurations, from Tushar Vyavahare.

19) Various ReST man pages fixes as well as documentation and bash completion
    improvements for bpftool, from Rameez Rehman & Quentin Monnet.

20) Fix libbpf with regards to dumping subsequent char arrays,
    from Quentin Deslandes.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (147 commits)
  bpf, docs: Clarify PC use in instruction-set.rst
  bpf_helpers.h: Define bpf_tail_call_static when building with GCC
  bpf, docs: Add introduction for use in the ISA Internet Draft
  selftests/bpf: extend BPF_SOCK_OPS_RTT_CB test for srtt and mrtt_us
  bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args
  selftests/bpf: dummy_st_ops should reject 0 for non-nullable params
  bpf: check bpf_dummy_struct_ops program params for test runs
  selftests/bpf: do not pass NULL for non-nullable params in dummy_st_ops
  selftests/bpf: adjust dummy_st_ops_success to detect additional error
  bpf: mark bpf_dummy_struct_ops.test_1 parameter as nullable
  selftests/bpf: Add ring_buffer__consume_n test.
  bpf: Add bpf_guard_preempt() convenience macro
  selftests: bpf: crypto: add benchmark for crypto functions
  selftests: bpf: crypto skcipher algo selftests
  bpf: crypto: add skcipher to bpf crypto
  bpf: make common crypto API for TC/XDP programs
  bpf: update the comment for BTF_FIELDS_MAX
  selftests/bpf: Fix wq test.
  selftests/bpf: Use make_sockaddr in test_sock_addr
  selftests/bpf: Use connect_to_addr in test_sock_addr
  ...
====================

Link: https://lore.kernel.org/r/20240429131657.19423-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-29 13:12:19 -07:00
Jakub Kicinski
b2ff42c6d3 bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZiwdfQAKCRDbK58LschI
 g1oqAP9mjayeIHCfYMQZa2eevy1PmVlgdNdFdMDWZFS/pHv9cgD/ZdmGzbUDKCAQ
 Y/KiTajitZw3kxtHX45v8/Ugtlsh9Qg=
 =Ewiw
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2024-04-26

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 14 files changed, 168 insertions(+), 72 deletions(-).

The main changes are:

1) Fix BPF_PROBE_MEM in verifier and JIT to skip loads from vsyscall page,
   from Puranjay Mohan.

2) Fix a crash in XDP with devmap broadcast redirect when the latter map
   is in process of being torn down, from Toke Høiland-Jørgensen.

3) Fix arm64 and riscv64 BPF JITs to properly clear start time for BPF
   program runtime stats, from Xu Kuohai.

4) Fix a sockmap KCSAN-reported data race in sk_psock_skb_ingress_enqueue,
    from Jason Xing.

5) Fix BPF verifier error message in resolve_pseudo_ldimm64,
   from Anton Protopopov.

6) Fix missing DEBUG_INFO_BTF_MODULES Kconfig menu item,
   from Andrii Nakryiko.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Test PROBE_MEM of VSYSCALL_ADDR on x86-64
  bpf, x86: Fix PROBE_MEM runtime load check
  bpf: verifier: prevent userspace memory access
  xdp: use flags field to disambiguate broadcast redirect
  arm32, bpf: Reimplement sign-extension mov instruction
  riscv, bpf: Fix incorrect runtime stats
  bpf, arm64: Fix incorrect runtime stats
  bpf: Fix a verifier verbose message
  bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue
  MAINTAINERS: bpf: Add Lehui and Puranjay as riscv64 reviewers
  MAINTAINERS: Update email address for Puranjay Mohan
  bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition
====================

Link: https://lore.kernel.org/r/20240426224248.26197-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26 17:36:53 -07:00
Kees Cook
998b18072c kunit/fortify: Fix mismatched kvalloc()/vfree() usage
The kv*() family of tests were accidentally freeing with vfree() instead
of kvfree(). Use kvfree() instead.

Fixes: 9124a26401 ("kunit/fortify: Validate __alloc_size attribute results")
Link: https://lore.kernel.org/r/20240425230619.work.299-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-26 15:31:39 -07:00
Linus Torvalds
e6ebf01172 11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address
post-6.8 issues or aren't considered suitable for backporting.
 
 All except one of these are for MM.  I see no particular theme - it's
 singletons all over.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZiwPZwAKCRDdBJ7gKXxA
 jmcQAPkB6UT/rBUMvFZb1dom9R6SDYl5ZBr20Vj1HvfakCLxmQEAqEd0N7QoWvKS
 hKNCMDujiEKqDUWeUaJen4cqXFFE2Qg=
 =1wP7
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address
  post-6.8 issues or aren't considered suitable for backporting.

  All except one of these are for MM. I see no particular theme - it's
  singletons all over"

* tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  selftests: mm: protection_keys: save/restore nr_hugepages value from launch script
  stackdepot: respect __GFP_NOLOCKDEP allocation flag
  hugetlb: check for anon_vma prior to folio allocation
  mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
  mm: turn folio_test_hugetlb into a PageType
  mm: support page_mapcount() on page_has_type() pages
  mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros
  mm/hugetlb: fix missing hugetlb_lock for resv uncharge
  selftests: mm: fix unused and uninitialized variable warning
  selftests/harness: remove use of LINE_MAX
2024-04-26 13:48:03 -07:00
David Howells
6a30653b60 Fix a potential infinite loop in extract_user_to_sg()
Fix extract_user_to_sg() so that it will break out of the loop if
iov_iter_extract_pages() returns 0 rather than looping around forever.

[Note that I've included two fixes lines as the function got moved to a
different file and renamed]

Fixes: 85dd2c8ff3 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator")
Fixes: f5f82cd187 ("Move netfs_extract_iter_to_sg() to lib/scatterlist.c")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: netfs@lists.linux.dev
Link: https://lore.kernel.org/r/1967121.1714034372@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-26 12:35:57 -07:00
linke li
6ad0d7e0f4 sbitmap: use READ_ONCE to access map->word
In __sbitmap_queue_get_batch(), map->word is read several times, and
update atomically using atomic_long_try_cmpxchg(). But the first two read
of map->word is not protected.

This patch moves the statement val = READ_ONCE(map->word) forward,
eliminating unprotected accesses to map->word within the function.
It is aimed at reducing the number of benign races reported by KCSAN in
order to focus future debugging effort on harmful races.

Signed-off-by: linke li <lilinke99@qq.com>
Link: https://lore.kernel.org/r/tencent_0B517C25E519D3D002194E8445E86C04AD0A@qq.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-26 07:40:28 -06:00
Arnd Bergmann
3ef3a05ba6 test_hexdump: avoid string truncation warning
gcc can warn when a string is too long to fit into the strncpy()
destination buffer, as it is here depending on the function arguments:

    inlined from 'test_hexdump_prepare_test.constprop' at /home/arnd/arm-soc/lib/test_hexdump.c:116:3:
include/linux/fortify-string.h:108:33: error: '__builtin_strncpy' output truncated copying between 0 and 32 bytes from a string of length 32 [-Werror=stringop-truncation]
  108 | #define __underlying_strncpy    __builtin_strncpy
      |                                 ^
include/linux/fortify-string.h:187:16: note: in expansion of macro '__underlying_strncpy'
  187 |         return __underlying_strncpy(p, q, size);
      |                ^~~~~~~~~~~~~~~~~~~~

The intention here is to copy exactly 'l' bytes without any padding or
NUL-termination, so the most logical change is to use memcpy(), just as
a previous change adapted the other output from strncpy() to memcpy().

Link: https://lkml.kernel.org/r/20240409140059.3806717-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Justin Stitt <justinstitt@google.com>
Cc: Alexey Starikovskiy <astarikovskiy@suse.de>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Len Brown <lenb@kernel.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:07 -07:00
Andy Shevchenko
3cc98aa11e devres: don't use "proxy" headers
Update header inclusions to follow IWYU (Include What You Use) principle.

Link: https://lkml.kernel.org/r/20240403104820.557487-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:06 -07:00
Andy Shevchenko
f36c54f3ce devres: switch to use dev_err_probe() for unification
Patch series "devres: A couple of cleanups".

A couple of ad-hoc cleanups. No functional changes intended. 


This patch (of 2):

The devm_*() APIs are supposed to be called during the ->probe() stage. 
Many drivers (especially new ones) have switched to use dev_err_probe()
for error messaging for the sake of unification.  Let's do the same in the
devres APIs.

Link: https://lkml.kernel.org/r/20240403104820.557487-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20240403104820.557487-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:06 -07:00
Niklas Schnelle
b157f0e97e kgdb: add HAS_IOPORT dependency
In a future patch HAS_IOPORT=n will disable inb()/outb() and friends at
compile time.  We thus need to add HAS_IOPORT as dependency for those
drivers using them.

Link: https://lkml.kernel.org/r/20240403132547.762429-2-schnelle@linux.ibm.com
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:05 -07:00
Uwe Kleine-König
5ef6dc08cf lib/build_OID_registry: don't mention the full path of the script in output
This change strips the full path of the script generating
lib/oid_registry_data.c to just lib/build_OID_registry.  The motivation
for this change is Yocto emitting a build warning

	File /usr/src/debug/linux-lxatac/6.7-r0/lib/oid_registry_data.c in package linux-lxatac-src contains reference to TMPDIR [buildpaths]

So this change brings us one step closer to make the build result
reproducible independent of the build path.

Link: https://lkml.kernel.org/r/20240313211957.884561-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:01 -07:00
Kairui Song
6758c1128c mm/filemap: optimize filemap folio adding
Instead of doing multiple tree walks, do one optimism range check with
lock hold, and exit if raced with another insertion.  If a shadow exists,
check it with a new xas_get_order helper before releasing the lock to
avoid redundant tree walks for getting its order.

Drop the lock and do the allocation only if a split is needed.

In the best case, it only need to walk the tree once.  If it needs to
alloc and split, 3 walks are issued (One for first ranged conflict check
and order retrieving, one for the second check after allocation, one for
the insert after split).

Testing with 4K pages, in an 8G cgroup, with 16G brd as block device:

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
    --buffered=1 --ioengine=mmap --rw=randread --time_based \
    --ramp_time=30s --runtime=5m --group_reporting

Before:
bw (  MiB/s): min= 1027, max= 3520, per=100.00%, avg=2445.02, stdev=18.90, samples=8691
iops        : min=263001, max=901288, avg=625924.36, stdev=4837.28, samples=8691

After (+7.3%):
bw (  MiB/s): min=  493, max= 3947, per=100.00%, avg=2625.56, stdev=25.74, samples=8651
iops        : min=126454, max=1010681, avg=672142.61, stdev=6590.48, samples=8651

Test result with THP (do a THP randread then switch to 4K page in hope it
issues a lot of splitting):

  echo 3 > /proc/sys/vm/drop_caches

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
      --buffered=1 --ioengine=mmap -thp=1 --readonly \
      --rw=randread --time_based --ramp_time=30s --runtime=10m \
      --group_reporting

  fio -name=cached --numjobs=16 --filename=/mnt/test.img \
      --buffered=1 --ioengine=mmap \
      --rw=randread --time_based --runtime=5s --group_reporting

Before:
bw (  KiB/s): min= 4141, max=14202, per=100.00%, avg=7935.51, stdev=96.85, samples=18976
iops        : min= 1029, max= 3548, avg=1979.52, stdev=24.23, samples=18976·

READ: bw=4545B/s (4545B/s), 4545B/s-4545B/s (4545B/s-4545B/s), io=64.0KiB (65.5kB), run=14419-14419msec

After (+12.5%):
bw (  KiB/s): min= 4611, max=15370, per=100.00%, avg=8928.74, stdev=105.17, samples=19146
iops        : min= 1151, max= 3842, avg=2231.27, stdev=26.29, samples=19146

READ: bw=4635B/s (4635B/s), 4635B/s-4635B/s (4635B/s-4635B/s), io=64.0KiB (65.5kB), run=14137-14137msec

The performance is better for both 4K (+7.5%) and THP (+12.5%) cached read.

Link: https://lkml.kernel.org/r/20240415171857.19244-5-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:56:09 -07:00
Kairui Song
a4864671ca lib/xarray: introduce a new helper xas_get_order
It can be used after xas_load to check the order of loaded entries. 
Compared to xa_get_order, it saves an XA_STATE and avoid a rewalk.

Added new test for xas_get_order, to make the test work, we have to export
xas_get_order with EXPORT_SYMBOL_GPL.

Also fix a sparse warning by checking the slot value with xa_entry instead
of accessing it directly, as suggested by Matthew Wilcox.

[kasong@tencent.com: simplify comment, sparse warning fix, per Matthew Wilcox]
  Link: https://lkml.kernel.org/r/20240416071722.45997-4-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20240415171857.19244-4-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:56:09 -07:00
Kees Cook
e13106952f alloc_tag: Tighten file permissions on /proc/allocinfo
The /proc/allocinfo file exposes a tremendous about of information about
kernel build details, memory allocations (obviously), and potentially even
image layout (due to ordering).  As this is intended to be consumed by
system owners (like /proc/slabinfo), use the same file permissions as
there: 0400.

Link: https://lkml.kernel.org/r/20240425200844.work.184-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:59 -07:00
Suren Baghdasaryan
1438d349d1 lib: add memory allocations report in show_mem()
Include allocations in show_mem reports.

Link: https://lkml.kernel.org/r/20240321163705.3067592-33-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:57 -07:00
Kent Overstreet
9e54dd8b64 rhashtable: plumb through alloc tag
This gives better memory allocation profiling results; rhashtable
allocations will be accounted to the code that initialized the rhashtable.

[surenb@google.com: undo _noprof additions in the documentation]
  Link: https://lkml.kernel.org/r/20240326231453.1206227-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-32-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:57 -07:00
Suren Baghdasaryan
c789b5fe38 lib: add codetag reference into slabobj_ext
To store code tag for every slab object, a codetag reference is embedded
into slabobj_ext when CONFIG_MEM_ALLOC_PROFILING=y.

Link: https://lkml.kernel.org/r/20240321163705.3067592-23-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:55 -07:00
Suren Baghdasaryan
8d469d0bee lib: introduce early boot parameter to avoid page_ext memory overhead
The highest memory overhead from memory allocation profiling comes from
page_ext objects.  This overhead exists even if the feature is disabled
but compiled-in.  To avoid it, introduce an early boot parameter that
prevents page_ext object creation.  The new boot parameter is a tri-state
with possible values of 0|1|never.  When it is set to "never" the memory
allocation profiling support is disabled, and overhead is minimized
(currently no page_ext objects are allocated, in the future more overhead
might be eliminated).  As a result we also lose ability to enable memory
allocation profiling at runtime (because there is no space to store
alloctag references).  Runtime sysctrl becomes read-only if the early boot
parameter was set to "never".  Note that the default value of this boot
parameter depends on the CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
configuration.  When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n the
boot parameter is set to "never", therefore eliminating any overhead. 
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y results in boot parameter
being set to 1 (enabled).  This allows distributions to avoid any overhead
by setting CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n config and with
no changes to the kernel command line.

We reuse sysctl.vm.mem_profiling boot parameter name in order to avoid
introducing yet another control.  This change turns it into a tri-state
early boot parameter.

Link: https://lkml.kernel.org/r/20240321163705.3067592-16-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:53 -07:00
Suren Baghdasaryan
dcfe378c81 lib: introduce support for page allocation tagging
Introduce helper functions to easily instrument page allocators by storing
a pointer to the allocation tag associated with the code that allocated
the page in a page_ext field.

Link: https://lkml.kernel.org/r/20240321163705.3067592-15-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:53 -07:00
Suren Baghdasaryan
22d407b164 lib: add allocation tagging support for memory allocation profiling
Introduce CONFIG_MEM_ALLOC_PROFILING which provides definitions to easily
instrument memory allocators.  It registers an "alloc_tags" codetag type
with /proc/allocinfo interface to output allocation tag information when
the feature is enabled.

CONFIG_MEM_ALLOC_PROFILING_DEBUG is provided for debugging the memory
allocation profiling instrumentation.

Memory allocation profiling can be enabled or disabled at runtime using
/proc/sys/vm/mem_profiling sysctl when CONFIG_MEM_ALLOC_PROFILING_DEBUG=n.
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT enables memory allocation
profiling by default.

[surenb@google.com: Documentation/filesystems/proc.rst: fix allocinfo title]
  Link: https://lkml.kernel.org/r/20240326073813.727090-1-surenb@google.com
[surenb@google.com: do limited memory accounting for modules with ARCH_NEEDS_WEAK_PER_CPU]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-2-surenb@google.com
[klarasmodin@gmail.com: explicitly include irqflags.h in alloc_tag.h]
  Link: https://lkml.kernel.org/r/20240407133252.173636-1-klarasmodin@gmail.com
[surenb@google.com: fix alloc_tag_init() to prevent passing NULL to PTR_ERR()]
  Link: https://lkml.kernel.org/r/20240417003349.2520094-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-14-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Suren Baghdasaryan
47a92dfbe0 lib: prevent module unloading if memory is not freed
Skip freeing module's data section if there are non-zero allocation tags
because otherwise, once these allocations are freed, the access to their
code tag would cause UAF.

Link: https://lkml.kernel.org/r/20240321163705.3067592-13-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Suren Baghdasaryan
a473573964 lib: code tagging module support
Add support for code tagging from dynamically loaded modules.

Link: https://lkml.kernel.org/r/20240321163705.3067592-12-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Suren Baghdasaryan
916cc5167c lib: code tagging framework
Add basic infrastructure to support code tagging which stores tag common
information consisting of the module name, function, file name and line
number.  Provide functions to register a new code tag type and navigate
between code tags.

Link: https://lkml.kernel.org/r/20240321163705.3067592-11-surenb@google.com
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:52 -07:00
Duoming Zhou
c2af060d1c lib/test_hmm.c: handle src_pfns and dst_pfns allocation failure
The kcalloc() in dmirror_device_evict_chunk() will return null if the
physical memory has run out.  As a result, if src_pfns or dst_pfns is
dereferenced, the null pointer dereference bug will happen.

Moreover, the device is going away.  If the kcalloc() fails, the pages
mapping a chunk could not be evicted.  So add a __GFP_NOFAIL flag in
kcalloc().

Finally, as there is no need to have physically contiguous memory, Switch
kcalloc() to kvcalloc() in order to avoid failing allocations.

Link: https://lkml.kernel.org/r/20240312005905.9939-1-duoming@zju.edu.cn
Fixes: b2ef9f5a5c ("mm/hmm/test: add selftest driver for HMM")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Cc: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:55:48 -07:00
Jakub Kicinski
2bd87951de Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/ti/icssg/icssg_prueth.c

net/mac80211/chan.c
  89884459a0 ("wifi: mac80211: fix idle calculation with multi-link")
  87f5500285 ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()")
https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/

net/unix/garbage.c
  1971d13ffa ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().")
  4090fa373f ("af_unix: Replace garbage collection algorithm.")

drivers/net/ethernet/ti/icssg/icssg_prueth.c
drivers/net/ethernet/ti/icssg/icssg_common.c
  4dcd0e83ea ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()")
  e2dc7bfd67 ("net: ti: icssg-prueth: Move common functions into a separate file")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25 12:41:37 -07:00
Andrey Ryabinin
6fe60465e1 stackdepot: respect __GFP_NOLOCKDEP allocation flag
If stack_depot_save_flags() allocates memory it always drops
__GFP_NOLOCKDEP flag.  So when KASAN tries to track __GFP_NOLOCKDEP
allocation we may end up with lockdep splat like bellow:

======================================================
 WARNING: possible circular locking dependency detected
 6.9.0-rc3+ #49 Not tainted
 ------------------------------------------------------
 kswapd0/149 is trying to acquire lock:
 ffff88811346a920
(&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590
[xfs]

 but task is already holding lock:
 ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at:
balance_pgdat+0x5d9/0xad0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:
 -> #1 (fs_reclaim){+.+.}-{0:0}:
        __lock_acquire+0x7da/0x1030
        lock_acquire+0x15d/0x400
        fs_reclaim_acquire+0xb5/0x100
 prepare_alloc_pages.constprop.0+0xc5/0x230
        __alloc_pages+0x12a/0x3f0
        alloc_pages_mpol+0x175/0x340
        stack_depot_save_flags+0x4c5/0x510
        kasan_save_stack+0x30/0x40
        kasan_save_track+0x10/0x30
        __kasan_slab_alloc+0x83/0x90
        kmem_cache_alloc+0x15e/0x4a0
        __alloc_object+0x35/0x370
        __create_object+0x22/0x90
 __kmalloc_node_track_caller+0x477/0x5b0
        krealloc+0x5f/0x110
        xfs_iext_insert_raw+0x4b2/0x6e0 [xfs]
        xfs_iext_insert+0x2e/0x130 [xfs]
        xfs_iread_bmbt_block+0x1a9/0x4d0 [xfs]
        xfs_btree_visit_block+0xfb/0x290 [xfs]
        xfs_btree_visit_blocks+0x215/0x2c0 [xfs]
        xfs_iread_extents+0x1a2/0x2e0 [xfs]
 xfs_buffered_write_iomap_begin+0x376/0x10a0 [xfs]
        iomap_iter+0x1d1/0x2d0
 iomap_file_buffered_write+0x120/0x1a0
        xfs_file_buffered_write+0x128/0x4b0 [xfs]
        vfs_write+0x675/0x890
        ksys_write+0xc3/0x160
        do_syscall_64+0x94/0x170
 entry_SYSCALL_64_after_hwframe+0x71/0x79

Always preserve __GFP_NOLOCKDEP to fix this.

Link: https://lkml.kernel.org/r/20240418141133.22950-1-ryabinin.a.a@gmail.com
Fixes: cd11016e5f ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Xiubo Li <xiubli@redhat.com>
Closes: https://lore.kernel.org/all/a0caa289-ca02-48eb-9bf2-d86fd47b71f4@redhat.com/
Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Closes: https://lore.kernel.org/all/f9ff999a-e170-b66b-7caf-293f2b147ac2@opensource.wdc.com/
Suggested-by: Dave Chinner <david@fromorbit.com>
Tested-by: Xiubo Li <xiubli@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-24 19:34:26 -07:00
Kees Cook
2e431b23a1 ubsan: Avoid i386 UBSAN handler crashes with Clang
When generating Runtime Calls, Clang doesn't respect the -mregparm=3
option used on i386. Hopefully this will be fixed correctly in Clang 19:
https://github.com/llvm/llvm-project/pull/89707
but we need to fix this for earlier Clang versions today. Force the
calling convention to use non-register arguments.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://github.com/KSPP/linux/issues/350
Link: https://lore.kernel.org/r/20240424224026.it.216-kees@kernel.org
Acked-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 15:45:38 -07:00
Dawei Li
cdc66553c4 cpumask: Introduce cpumask_first_and_and()
Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
works in-place with all inputs and produces desired output directly.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20240416085454.3547175-2-dawei.li@shingroup.cn
2024-04-24 21:23:49 +02:00
Kees Cook
c209826737 ubsan: Remove 1-element array usage in debug reporting
The "type_name" character array was still marked as a 1-element array.
While we don't validate strings used in format arguments yet, let's fix
this before it causes trouble some future day.

Link: https://lore.kernel.org/r/20240424162739.work.492-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 11:59:19 -07:00
Kees Cook
c01c41e500 string_kunit: Move strtomem KUnit test to string_kunit.c
It is more logical to have the strtomem() test in string_kunit.c instead
of the memcpy() suite. Move it to live with memtostr().

Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 09:02:34 -07:00
Kees Cook
0efc5990bc string.h: Introduce memtostr() and memtostr_pad()
Another ambiguous use of strncpy() is to copy from strings that may not
be NUL-terminated. These cases depend on having the destination buffer
be explicitly larger than the source buffer's maximum size, having
the size of the copy exactly match the source buffer's maximum size,
and for the destination buffer to get explicitly NUL terminated.

This usually happens when parsing protocols or hardware character arrays
that are not guaranteed to be NUL-terminated. The code pattern is
effectively this:

	char dest[sizeof(src) + 1];

	strncpy(dest, src, sizeof(src));
	dest[sizeof(dest) - 1] = '\0';

In practice it usually looks like:

struct from_hardware {
	...
	char name[HW_NAME_SIZE] __nonstring;
	...
};

	struct from_hardware *p = ...;
	char name[HW_NAME_SIZE + 1];

	strncpy(name, p->name, HW_NAME_SIZE);
	name[NW_NAME_SIZE] = '\0';

This cannot be replaced with:

	strscpy(name, p->name, sizeof(name));

because p->name is smaller and not NUL-terminated, so FORTIFY will
trigger when strnlen(p->name, sizeof(name)) is used. And it cannot be
replaced with:

	strscpy(name, p->name, sizeof(p->name));

because then "name" may contain a 1 character early truncation of
p->name.

Provide an unambiguous interface for converting a maybe not-NUL-terminated
string to a NUL-terminated string, with compile-time buffer size checking
so that it can never fail at runtime: memtostr() and memtostr_pad(). Also
add KUnit tests for both.

Link: https://lore.kernel.org/r/20240410023155.2100422-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-24 08:57:09 -07:00
Greg Kroah-Hartman
660a708098 Merge 6.9-rc5 into tty-next
We want the tty fixes in here as well, and it resolves a merge conflict
in:
	drivers/tty/serial/serial_core.c
as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23 13:24:45 +02:00
Jason Gunthorpe
e7bc47b166 s390: Stop using weak symbols for __iowrite64_copy()
Complete switching the __iowriteXX_copy() routines over to use #define and
arch provided inline/macro functions instead of weak symbols.

S390 has an implementation that simply calls another memcpy
function. Inline this so the callers don't have to do two jumps.

Link: https://lore.kernel.org/r/3-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22 17:11:20 -03:00
Jason Gunthorpe
20516d6e51 x86: Stop using weak symbols for __iowrite32_copy()
Start switching iomap_copy routines over to use #define and arch provided
inline/macro functions instead of weak symbols.

Inline functions allow more compiler optimization and this is often a
driver hot path.

x86 has the only weak implementation for __iowrite32_copy(), so replace it
with a static inline containing the same single instruction inline
assembly. The compiler will generate the "mov edx,ecx" in a more optimal
way.

Remove iomap_copy_64.S

Link: https://lore.kernel.org/r/1-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22 17:11:19 -03:00
Linus Torvalds
2d412262cc hardening fixes for v6.9-rc5
- Correctly disable UBSAN configs in configs/hardening (Nathan Chancellor)
 
 - Add missing signed integer overflow trap types to arm64 handler
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmYi0M8WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjPrEACrLlPZUPLiJPlPdYC5bW4lLSgZ
 v6z5XjeEVWVvIlzW3DKPKvzMmIl6D3CTN6KbgdjHR+s5VYGYVlkoQJw09SSBu1OX
 yFC2i0lyqUKuAmh6jK1T46kXvbrgK/3ClO3nQk7KotTfvuRcorAcGEmayTYnaWqd
 JX3qyry2oEiQG9pWDHl9bRQ1ZgbNdNkxR2YYIhy88lrMWORdVNG7PFkgNnVsEbnb
 UAbXl817//TSTuXUwklTllz0UNInmDmQTjrmMRUhiwTKEs8aRS5VX6biSyc1Fucz
 KYXNeK9ciV80mQYnj7jDxgXC5jNThtrjEokzht8vvZGHcBp3WMr6CJLmwj9aaSXE
 edib7mJf/YveJTCPN17xAvIMHZAFZyoyeiVIOE1Ys2lWSj8rXH5TvWnn/E4QPxHK
 77lOKGZNwNMYmIa+L6gb3OOWpiZpOMTLCGMuJh6VSDf5BcA0i45yTxAlAe5JYpgw
 txxDscFu5MtrabR4Z28J+VY/wnWqQAC89D6qYsJOPH8kL0o3XhELCDKPNUZoY094
 LV7XuhAB+xDqNdvZi7SHTmTtZSLPqRBlNrOUqQSmXrwjp11naya26l7fn1Y0cpQM
 K8o3ioUkSg0PJNox/kGxryouHXXMqtN/k52JPotkfa6XEQDpN82uo0xJD9r21Viu
 qhA7A8vcQ3KIb0cUbw==
 =Q6Hg
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Correctly disable UBSAN configs in configs/hardening (Nathan
   Chancellor)

 - Add missing signed integer overflow trap types to arm64 handler

* tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  ubsan: Add awareness of signed integer overflow traps
  configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP
  configs/hardening: Fix disabling UBSAN configurations
2024-04-19 14:10:11 -07:00
Kees Cook
dde915c5cb string: Convert KUnit test names to standard convention
The KUnit convention for test names is AREA_test_WHAT. Adjust the string
test names to follow this pattern.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-5-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:12:02 -07:00
Kees Cook
bd678f7d9b string: Merge strcat KUnit tests into string_kunit.c
Move the strcat() tests into string_kunit.c. Remove the separate
Kconfig and Makefile rule.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-4-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:12:01 -07:00
Kees Cook
6e4ef1429f string: Prepare to merge strcat KUnit tests into string_kunit.c
The test naming convention differs between string_kunit.c and
strcat_kunit.c. Move "test" to the beginning of the function name.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:11:58 -07:00
Kees Cook
bb8d9b742a string: Merge strscpy KUnit tests into string_kunit.c
Move the strscpy() tests into string_kunit.c. Remove the separate
Kconfig and Makefile rule.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:11:57 -07:00
Kees Cook
b03442f761 string: Prepare to merge strscpy_kunit.c into string_kunit.c
In preparation for moving the strscpy_kunit.c tests into string_kunit.c,
rename "tc" to "strscpy_check" for better readability.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Link: https://lore.kernel.org/r/20240419140155.3028912-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-19 13:11:45 -07:00
Linus Torvalds
9c6e84e4ba Bootconfig fixes for v6.9-rc4:
- Fix potential static_command_line buffer overrun. Currently we allocate
   the memory for static_command_line based on "boot_command_line", but it
   will copy "command_line" into it. So we use the length of "command_line"
   instead of "boot_command_line" (as previously we did).
 - Use memblock_free_late() in xbc_exit() instead of memblock_free() after
   the buddy system is initialized.
 - Fix a kerneldoc warning.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmYgN1kbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b/yEH/1FFgb7UJDtQLbtHl5/b
 bcxLbSzfb/N37Bc+sE/AKZYrt5QAMjaOmdtzQz9kdLtycxWcQinne4jqGxd6zfTU
 UIisfDjEZr46/Rs5sJg+5i8wWrud1TJOmlMsqiSVcorl0f/wE4S7PqgYXRNWZ0p+
 KipjuCCV43ITmVjsiq2NxfZGDaWzow/EJXwZzpQkJE1zaU13w2nzgzg64JW3f/lf
 Dx/o9jlYEoLkCjiQJ6XaRuTpHbPP1grozSMbvE3z1WnxCaiFHlzXGi6WUhto+pTu
 vt/pUrIFYE7k0IFHAVEgBjOkfCm5y9FwOdPLqwy3harQ5ek9D6h6bFnDhbZw7I27
 6V8=
 =e2c5
 -----END PGP SIGNATURE-----

Merge tag 'bootconfig-fixes-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig fixes from Masami Hiramatsu:

 - Fix potential static_command_line buffer overrun.

   Currently we allocate the memory for static_command_line based on
   "boot_command_line", but it will copy "command_line" into it. So we
   use the length of "command_line" instead of "boot_command_line" (as
   we previously did)

 - Use memblock_free_late() in xbc_exit() instead of memblock_free()
   after the buddy system is initialized

 - Fix a kerneldoc warning

* tag 'bootconfig-fixes-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  bootconfig: Fix the kerneldoc of _xbc_exit()
  bootconfig: use memblock_free_late to free xbc memory to buddy
  init/main.c: Fix potential static_command_line memory overflow
2024-04-19 09:52:09 -07:00
Ivan Orlov
9259a47216 string_kunit: Add test cases for str*cmp functions
Currently, str*cmp functions (strcmp, strncmp, strcasecmp and
strncasecmp) are not covered with tests. Extend the `string_kunit.c`
test by adding the test cases for them.

This patch adds 8 more test cases:

1) strcmp test
2) strcmp test on long strings (2048 chars)
3) strncmp test
4) strncmp test on long strings (2048 chars)
5) strcasecmp test
6) strcasecmp test on long strings
7) strncasecmp test
8) strncasecmp test on long strings

These test cases aim at covering as many edge cases as possible,
including the tests on empty strings, situations when the different
symbol is placed at the end of one of the strings, etc.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240417233033.717596-1-ivan.orlov0322@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-18 10:14:19 -07:00
Masami Hiramatsu (Google)
298b871cd5 bootconfig: Fix the kerneldoc of _xbc_exit()
Fix the kerneldoc of _xbc_exit() which is updated to have an @early
argument and the function name is changed.

Link: https://lore.kernel.org/all/171321744474.599864.13532445969528690358.stgit@devnote2/

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404150036.kPJ3HEFA-lkp@intel.com/
Fixes: 89f9a1e876 ("bootconfig: use memblock_free_late to free xbc memory to buddy")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-18 05:01:32 +09:00
Chen Pei
dac045fc9f bpf, tests: Fix typos in comments
Currently, there are two comments with same name "64-bit ATOMIC magnitudes",
the second one should be "32-bit ATOMIC magnitudes" based on the context.

Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240415081928.17440-1-cp0613@linux.alibaba.com
2024-04-16 17:22:18 +02:00
Kees Cook
f4626c12e4 ubsan: Add awareness of signed integer overflow traps
On arm64, UBSAN traps can be decoded from the trap instruction. Add the
add, sub, and mul overflow trap codes now that CONFIG_UBSAN_SIGNED_WRAP
exists. Seen under clang 19:

  Internal error: UBSAN: unrecognized failure code: 00000000f2005515 [#1] PREEMPT SMP

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/lkml/20240411-fix-ubsan-in-hardening-config-v1-0-e0177c80ffaa@kernel.org
Fixes: 557f8c582a ("ubsan: Reintroduce signed overflow sanitizer")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240415182832.work.932-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-15 17:42:43 -07:00
Breno Leitao
4ba67ef3a1 net: dqs: make struct dql more cache efficient
With the previous change, struct dqs->stall_thrs will be in the hot path
(at queue side), even if DQS is disabled.

The other fields accessed in this function (last_obj_cnt and num_queued)
are in the first cache line, let's move this field  (stall_thrs) to the
very first cache line, since there is a hole there.

This does not change the structure size, since it moves an short (2
bytes) to 4-bytes whole in the first cache line.

This is the new structure format now:

struct dql {
	unsigned int    num_queued;
	unsigned int    last_obj_cnt;
...
	short unsigned int    stall_thrs;
	/* XXX 2 bytes hole, try to pack */
...
	/* --- cacheline 1 boundary (64 bytes) --- */
...
 	/* Longest stall detected, reported to user */
	short unsigned int         stall_max;
	/* XXX 2 bytes hole, try to pack */
};

Also, read the stall_thrs (now in the very first cache line) earlier,
together with dql->num_queued (also in the first cache line).

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240411192241.2498631-5-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-15 11:19:53 -07:00
Qiang Zhang
89f9a1e876 bootconfig: use memblock_free_late to free xbc memory to buddy
On the time to free xbc memory in xbc_exit(), memblock may has handed
over memory to buddy allocator. So it doesn't make sense to free memory
back to memblock. memblock_free() called by xbc_exit() even causes UAF bugs
on architectures with CONFIG_ARCH_KEEP_MEMBLOCK disabled like x86.
Following KASAN logs shows this case.

This patch fixes the xbc memory free problem by calling memblock_free()
in early xbc init error rewind path and calling memblock_free_late() in
xbc exit path to free memory to buddy allocator.

[    9.410890] ==================================================================
[    9.418962] BUG: KASAN: use-after-free in memblock_isolate_range+0x12d/0x260
[    9.426850] Read of size 8 at addr ffff88845dd30000 by task swapper/0/1

[    9.435901] CPU: 9 PID: 1 Comm: swapper/0 Tainted: G     U             6.9.0-rc3-00208-g586b5dfb51b9 #5
[    9.446403] Hardware name: Intel Corporation RPLP LP5 (CPU:RaptorLake)/RPLP LP5 (ID:13), BIOS IRPPN02.01.01.00.00.19.015.D-00000000 Dec 28 2023
[    9.460789] Call Trace:
[    9.463518]  <TASK>
[    9.465859]  dump_stack_lvl+0x53/0x70
[    9.469949]  print_report+0xce/0x610
[    9.473944]  ? __virt_addr_valid+0xf5/0x1b0
[    9.478619]  ? memblock_isolate_range+0x12d/0x260
[    9.483877]  kasan_report+0xc6/0x100
[    9.487870]  ? memblock_isolate_range+0x12d/0x260
[    9.493125]  memblock_isolate_range+0x12d/0x260
[    9.498187]  memblock_phys_free+0xb4/0x160
[    9.502762]  ? __pfx_memblock_phys_free+0x10/0x10
[    9.508021]  ? mutex_unlock+0x7e/0xd0
[    9.512111]  ? __pfx_mutex_unlock+0x10/0x10
[    9.516786]  ? kernel_init_freeable+0x2d4/0x430
[    9.521850]  ? __pfx_kernel_init+0x10/0x10
[    9.526426]  xbc_exit+0x17/0x70
[    9.529935]  kernel_init+0x38/0x1e0
[    9.533829]  ? _raw_spin_unlock_irq+0xd/0x30
[    9.538601]  ret_from_fork+0x2c/0x50
[    9.542596]  ? __pfx_kernel_init+0x10/0x10
[    9.547170]  ret_from_fork_asm+0x1a/0x30
[    9.551552]  </TASK>

[    9.555649] The buggy address belongs to the physical page:
[    9.561875] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x45dd30
[    9.570821] flags: 0x200000000000000(node=0|zone=2)
[    9.576271] page_type: 0xffffffff()
[    9.580167] raw: 0200000000000000 ffffea0011774c48 ffffea0012ba1848 0000000000000000
[    9.588823] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[    9.597476] page dumped because: kasan: bad access detected

[    9.605362] Memory state around the buggy address:
[    9.610714]  ffff88845dd2ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    9.618786]  ffff88845dd2ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    9.626857] >ffff88845dd30000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[    9.634930]                    ^
[    9.638534]  ffff88845dd30080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[    9.646605]  ffff88845dd30100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[    9.654675] ==================================================================

Link: https://lore.kernel.org/all/20240414114944.1012359-1-qiang4.zhang@linux.intel.com/

Fixes: 40caa127f3 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Stable@vger.kernel.org
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-04-14 22:00:43 +09:00
Bitao Hu
d7037381d0 watchdog/softlockup: Low-overhead detection of interrupt storm
The following softlockup is caused by interrupt storm, but it cannot be
identified from the call tree. Because the call tree is just a snapshot
and doesn't fully capture the behavior of the CPU during the soft lockup.
  watchdog: BUG: soft lockup - CPU#28 stuck for 23s! [fio:83921]
  ...
  Call trace:
    __do_softirq+0xa0/0x37c
    __irq_exit_rcu+0x108/0x140
    irq_exit+0x14/0x20
    __handle_domain_irq+0x84/0xe0
    gic_handle_irq+0x80/0x108
    el0_irq_naked+0x50/0x58

Therefore, it is necessary to report CPU utilization during the
softlockup_threshold period (report once every sample_period, for a total
of 5 reportings), like this:
  watchdog: BUG: soft lockup - CPU#28 stuck for 23s! [fio:83921]
  CPU#28 Utilization every 4s during lockup:
    #1: 0% system, 0% softirq, 100% hardirq, 0% idle
    #2: 0% system, 0% softirq, 100% hardirq, 0% idle
    #3: 0% system, 0% softirq, 100% hardirq, 0% idle
    #4: 0% system, 0% softirq, 100% hardirq, 0% idle
    #5: 0% system, 0% softirq, 100% hardirq, 0% idle
  ...

This is helpful in determining whether an interrupt storm has occurred or
in identifying the cause of the softlockup. The criteria for determination
are as follows:

  a. If the hardirq utilization is high, then interrupt storm should be
     considered and the root cause cannot be determined from the call tree.
  b. If the softirq utilization is high, then the call might not necessarily
     point at the root cause.
  c. If the system utilization is high, then analyzing the root
     cause from the call tree is possible in most cases.

The mechanism requires a considerable amount of global storage space
when configured for the maximum number of CPUs. Therefore, adding a
SOFTLOCKUP_DETECTOR_INTR_STORM Kconfig knob that defaults to "yes"
if the max number of CPUs is <= 128.

Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Liu Song <liusong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240411074134.30922-5-yaoma@linux.alibaba.com
2024-04-12 17:08:05 +02:00
Jakub Kicinski
94426ed213 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/unix/garbage.c
  47d8ac011f ("af_unix: Fix garbage collector racing against connect()")
  4090fa373f ("af_unix: Replace garbage collection algorithm.")

Adjacent changes:

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  faa12ca245 ("bnxt_en: Reset PTP tx_avail after possible firmware reset")
  b3d0083caf ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")

drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
  7ac10c7d72 ("bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init()")
  194fad5b27 ("bnxt_en: Refactor bnxt_rdma_aux_device_init/uninit functions")

drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
  958f56e483 ("net/mlx5e: Un-expose functions in en.h")
  49e6c93870 ("net/mlx5e: RSS, Block XOR hash with over 128 channels")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-11 14:23:47 -07:00
Linus Torvalds
2ae9a8972c Including fixes from bluetooth.
Current release - new code bugs:
 
   - netfilter: complete validation of user input
 
   - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev
 
 Previous releases - regressions:
 
   - core: fix u64_stats_init() for lockdep when used repeatedly in one file
 
   - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
 
   - bluetooth: fix memory leak in hci_req_sync_complete()
 
   - batman-adv: avoid infinite loop trying to resize local TT
 
   - drv: geneve: fix header validation in geneve[6]_xmit_skb
 
   - drv: bnxt_en: fix possible memory leak in bnxt_rdma_aux_device_init()
 
   - drv: mlx5: offset comp irq index in name by one
 
   - drv: ena: avoid double-free clearing stale tx_info->xdpf value
 
   - drv: pds_core: fix pdsc_check_pci_health deadlock
 
 Previous releases - always broken:
 
   - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
 
   - bluetooth: fix setsockopt not validating user input
 
   - af_unix: clear stale u->oob_skb.
 
   - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies
 
   - drv: virtio_net: fix guest hangup on invalid RSS update
 
   - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow
 
   - dsa: mt7530: trap link-local frames regardless of ST Port State
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmYXyoQSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk72wQAJJ9DQra9b/8S3Zla1dutBcznSCxruas
 vWrpgIZiT3Aw5zUmZUZn+rNP8xeWLBK78Yv4m236B8/D3Kji2uMbVrjUAApBBcHr
 /lLmctZIhDHCoJCYvRTC/VOVPuqbbbmxOmx6rNvry93iNAiHAnBdCUlOYzMNPzJz
 6XIGtztFP0ICtt9owFtQRnsPeKhZJ5DoxqgE9KS2Pmb9PU99i1bEShpwLwB5I83S
 yTHKUY5W0rknkQZTW1gbv+o3dR0iFy7LZ+1FItJ/UzH0bG6JmcqzSlH5mZZJCc/L
 5LdUwtwMmKG2Kez/vKr1DAwTeAyhwVU+d+Hb28QXiO0kAYbjbOgNXse1st3RwDt5
 YKMKlsmR+kgPYLcvs9df2aubNSRvi2utwIA2kuH33HxBYF5PfQR5PTGeR21A+cKo
 wvSit8aMaGFTPJ7rRIzkNaPdIHSvPMKYcXV/T8EPvlOHzi5GBX0qHWj99JO9Eri+
 VFci+FG3HCPHK8v683g/WWiiVNx/IHMfNbcukes1oDFsCeNo7KZcnPY+zVhtdvvt
 QBnvbAZGKeDXMbnHZLB3DCR3ENHWTrJzC3alLDp3/uFC79VKtfIRO2wEX3gkrN8S
 JHsdYU13Yp1ERaNjUeq7Sqk2OGLfsBt4HSOhcK8OPPgE5rDRON5UPjkuNvbaEiZY
 Morzaqzerg1B
 =a9bB
 -----END PGP SIGNATURE-----

Merge tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth.

  Current release - new code bugs:

   - netfilter: complete validation of user input

   - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev

  Previous releases - regressions:

   - core: fix u64_stats_init() for lockdep when used repeatedly in one
     file

   - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr

   - bluetooth: fix memory leak in hci_req_sync_complete()

   - batman-adv: avoid infinite loop trying to resize local TT

   - drv: geneve: fix header validation in geneve[6]_xmit_skb

   - drv: bnxt_en: fix possible memory leak in
     bnxt_rdma_aux_device_init()

   - drv: mlx5: offset comp irq index in name by one

   - drv: ena: avoid double-free clearing stale tx_info->xdpf value

   - drv: pds_core: fix pdsc_check_pci_health deadlock

  Previous releases - always broken:

   - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING

   - bluetooth: fix setsockopt not validating user input

   - af_unix: clear stale u->oob_skb.

   - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies

   - drv: virtio_net: fix guest hangup on invalid RSS update

   - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow

   - dsa: mt7530: trap link-local frames regardless of ST Port State"

* tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
  net: ena: Set tx_info->xdpf value to NULL
  net: ena: Fix incorrect descriptor free behavior
  net: ena: Wrong missing IO completions check order
  net: ena: Fix potential sign extension issue
  af_unix: Fix garbage collector racing against connect()
  net: dsa: mt7530: trap link-local frames regardless of ST Port State
  Revert "s390/ism: fix receive message buffer allocation"
  net: sparx5: fix wrong config being used when reconfiguring PCS
  net/mlx5: fix possible stack overflows
  net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev
  net/mlx5e: RSS, Block XOR hash with over 128 channels
  net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
  net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
  net/mlx5e: Fix mlx5e_priv_init() cleanup flow
  net/mlx5e: RSS, Block changing channels number when RXFH is configured
  net/mlx5: Correctly compare pkt reformat ids
  net/mlx5: Properly link new fs rules into the tree
  net/mlx5: offset comp irq index in name by one
  net/mlx5: Register devlink first under devlink lock
  net/mlx5: E-switch, store eswitch pointer before registering devlink_param
  ...
2024-04-11 11:46:31 -07:00
Linus Torvalds
fe5b5ef836 hardening fixes for v6.9-rc4
- gcc-plugins/stackleak: Avoid .head.text section (Ard Biesheuvel)
 
 - ubsan: fix unused variable warning in test module (Arnd Bergmann)
 
 - Improve entropy diffusion in randomize_kstack
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmYWv7sWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJv0TD/9z0EFo2K6brr6rZ9LUK8JgdcFd
 MXfee95uw3DvqTphx6H+559AGl0nCZd0S8huw8TAPsRg2hNeuFsQTcQ4TRnyCE+w
 qMCjflZI0LHqyTfTOuFeSJERQ8kiHntaci33xXR7crehHNGvJN6jRB5ouUOxdFlB
 L4mx7P4OFjS/+6qRM957yqWrCPM26Nl7qe0TcnOZ5eh+HvIRNF9hSg8DAMMIb0j7
 lZZojpOdiYYccfBUhM2PCpDfSNtG06/QpKGCeVK/3gpRzrVWl99iABTpJStf3rbH
 PWd3kwz/FINYq8sOjm9Y960Xgkz+IZkhr5YXVs6ezMoV5J9vGBG7zjzlYbPwbfoy
 UTaHUEGmYqCnnshrC5PU05G/CGlYP5iz6pzfxofXLca5+WQY0HxERLAkVdVhn39q
 oRLIOleAguocK3eBQ03tmJFF28yFJyOWK3C0gp8g8vExBAg2EQGrHOAfR28crB5p
 yKvLQHL04urSQA9gs4PiflBGyHiLSuWBGyO1K3NvW5cNrRUIGXlwfy8i7cHSzBFo
 /rQ49E7xxtitAlk+Jol55OewEpw/nq1wqOv0eFW4pje+WSOVRpbQBtDD03gik74v
 bW9ownxrypqUm5OR6Uuf6OwVWi/zX03inVF0d55dcLv+kjDy0Xu8ViVOhXhs2vT8
 PayuOAhAvHgcfGJwFw==
 =MHgA
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - gcc-plugins/stackleak: Avoid .head.text section (Ard Biesheuvel)

 - ubsan: fix unused variable warning in test module (Arnd Bergmann)

 - Improve entropy diffusion in randomize_kstack

* tag 'hardening-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randomize_kstack: Improve entropy diffusion
  ubsan: fix unused variable warning in test module
  gcc-plugins/stackleak: Avoid .head.text section
2024-04-10 13:31:34 -07:00
Paul E. McKenney
a88d970c8b lib: Add one-byte emulation function
Architectures are required to provide four-byte cmpxchg() and 64-bit
architectures are additionally required to provide eight-byte cmpxchg().
However, there are cases where one-byte cmpxchg() would be extremely
useful.  Therefore, provide cmpxchg_emu_u8() that emulates one-byte
cmpxchg() in terms of four-byte cmpxchg().

Note that this emulations is fully ordered, and can (for example) cause
one-byte cmpxchg_relaxed() to incur the overhead of full ordering.
If this causes problems for a given architecture, that architecture is
free to provide its own lighter-weight primitives.

[ paulmck: Apply Marco Elver feedback. ]
[ paulmck: Apply kernel test robot feedback. ]
[ paulmck: Drop two-byte support per Arnd Bergmann feedback. ]

Link: https://lore.kernel.org/all/0733eb10-5e7a-4450-9b8a-527b97c842ff@paulmck-laptop/

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: <linux-arch@vger.kernel.org>
2024-04-09 22:06:00 -07:00
Jiri Slaby (SUSE)
d52b761e4b kfifo: add kfifo_dma_out_prepare_mapped()
When the kfifo buffer is already dma-mapped, one cannot use the kfifo
API to fill in an SG list.

Add kfifo_dma_in_prepare_mapped() which allows exactly this. A mapped
dma_addr_t is passed and it is filled into provided sgl too. Including
the dma_len.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-8-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
fea0dde081 kfifo: pass offset to setup_sgl_buf() instead of a pointer
As a preparatory for dma addresses filling, we need the data offset
instead of virtual pointer in setup_sgl_buf(). So pass the former
instead the latter.

And pointer to fifo is needed in setup_sgl_buf() now too.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-7-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
ed6d22f5d8 kfifo: rename l to len_to_end in setup_sgl()
So that one can make any sense of the name.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-6-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
e9d9576de0 kfifo: remove support for physically non-contiguous memory
First, there is no such user. The only user of this interface is
caam_rng_fill_async() and that uses kfifo_alloc() -> kmalloc().

Second, the implementation does not allow anything else than direct
mapping and kmalloc() (due to virt_to_phys()), anyway.

Therefore, there is no point in having this dead (and complex) code in
the kernel.

Note the setup_sgl_buf() function now boils down to simple sg_set_buf().
That is called twice from setup_sgl() to take care of kfifo buffer
wrap-around.

setup_sgl_buf() will be extended shortly, so keeping it in place.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Stefani Seibold <stefani@seibold.net>
Link: https://lore.kernel.org/r/20240405060826.2521-5-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
4edd7e96a1 kfifo: add kfifo_out_linear{,_ptr}()
These are helpers which are going to be used in the serial layer. We
need a wrapper around kfifo which provides us with a tail (sometimes
"tail" offset, sometimes a pointer) to the kfifo data. And which returns
count of available data -- but not larger than to the end of the buffer
(hence _linear in the names). I.e. something like CIRC_CNT_TO_END() in
the legacy circ_buf.

This patch adds such two helpers.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-4-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:03 +02:00
Jiri Slaby (SUSE)
779087fe2f kfifo: drop __kfifo_dma_out_finish_r()
It is the same as __kfifo_skip_r(), so:
* drop __kfifo_dma_out_finish_r() completely, and
* replace its (only) use by __kfifo_skip_r().

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20240405060826.2521-2-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-09 15:28:02 +02:00
Adrian Hunter
8ff1e6c5ac vdso: Fix powerpc build U64_MAX undeclared error
U64_MAX is not in include/vdso/limits.h, although that isn't noticed on x86
because x86 includes include/linux/limits.h indirectly. However powerpc is
more selective, resulting in the following build error:

  In file included from <command-line>:
  lib/vdso/gettimeofday.c: In function 'vdso_calc_ns':
  lib/vdso/gettimeofday.c:11:33: error: 'U64_MAX' undeclared
     11 | # define VDSO_DELTA_MASK(vd)    U64_MAX
        |                                 ^~~~~~~

Use ULLONG_MAX instead which will work just as well and is in
include/vdso/limits.h.

Fixes: c8e3a8b6f2 ("vdso: Consolidate vdso_calc_delta()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240409062639.3393-1-adrian.hunter@intel.com
Closes: https://lore.kernel.org/all/20240409124905.6816db37@canb.auug.org.au/
2024-04-09 12:35:19 +02:00
Adrian Hunter
456e3788bc vdso: Make delta calculation overflow safe
Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.

Add protection against that, enabled by config option
CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT. Check against max_cycles, falling
back to a slower higher precision calculation.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-8-adrian.hunter@intel.com
2024-04-08 15:03:07 +02:00
Adrian Hunter
0c68458b0a vdso: Add CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT
Add CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT in preparation to add
multiplication overflow protection to the VDSO time getter functions.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-4-adrian.hunter@intel.com
2024-04-08 15:03:06 +02:00
Adrian Hunter
5b26ef660a vdso: Consolidate nanoseconds calculation
Consolidate nanoseconds calculation to simplify and reduce code
duplication.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-3-adrian.hunter@intel.com
2024-04-08 15:03:06 +02:00
Adrian Hunter
c8e3a8b6f2 vdso: Consolidate vdso_calc_delta()
Consolidate vdso_calc_delta(), in preparation for further simplification.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-2-adrian.hunter@intel.com
2024-04-08 15:03:06 +02:00
Arnd Bergmann
e9d47b7b31 lib: checksum: hide unused expected_csum_ipv6_magic[]
When CONFIG_NET is disabled, an extra warning shows up for this
unused variable:

lib/checksum_kunit.c:218:18: error: 'expected_csum_ipv6_magic' defined but not used [-Werror=unused-const-variable=]

Replace the #ifdef with an IS_ENABLED() check that makes the compiler's
dead-code-elimination take care of the link failure.

Fixes: f24a70106d ("lib: checksum: Fix build with CONFIG_NET=n")
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-08 11:03:05 +01:00
Peter Collingbourne
a6c1d9cb9a stackdepot: rename pool_index to pool_index_plus_1
Commit 3ee34eabac ("lib/stackdepot: fix first entry having a 0-handle")
changed the meaning of the pool_index field to mean "the pool index plus
1".  This made the code accessing this field less self-documenting, as
well as causing debuggers such as drgn to not be able to easily remain
compatible with both old and new kernels, because they typically do that
by testing for presence of the new field.  Because stackdepot is a
debugging tool, we should make sure that it is debugger friendly. 
Therefore, give the field a different name to improve readability as well
as enabling debugger backwards compatibility.

This is needed in 6.9, which would otherwise become an odd release with
the new semantics and old name so debuggers wouldn't recognize the new
semantics there.

Fixes: 3ee34eabac ("lib/stackdepot: fix first entry having a 0-handle")
Link: https://lkml.kernel.org/r/20240402001500.53533-1-pcc@google.com
Link: https://linux-review.googlesource.com/id/Ib3e70c36c1d230dd0a118dc22649b33e768b9f88
Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-05 11:21:31 -07:00
Andrii Nakryiko
229087f6f1 bpf, kconfig: Fix DEBUG_INFO_BTF_MODULES Kconfig definition
Turns out that due to CONFIG_DEBUG_INFO_BTF_MODULES not having an
explicitly specified "menu item name" in Kconfig, it's basically
impossible to turn it off (see [0]).

This patch fixes the issue by defining menu name for
CONFIG_DEBUG_INFO_BTF_MODULES, which makes it actually adjustable
and independent of CONFIG_DEBUG_INFO_BTF, in the sense that one can
have DEBUG_INFO_BTF=y and DEBUG_INFO_BTF_MODULES=n.

We still keep it as defaulting to Y, of course.

Fixes: 5f9ae91f7c ("kbuild: Build kernel module BTFs if BTF is enabled and pahole supports it")
Reported-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CAK3+h2xiFfzQ9UXf56nrRRP=p1+iUxGoEP5B+aq9MDT5jLXDSg@mail.gmail.com [0]
Link: https://lore.kernel.org/bpf/20240404220344.3879270-1-andrii@kernel.org
2024-04-05 15:12:47 +02:00
Guenter Roeck
b1080c667b mm/slub, kunit: Use inverted data to corrupt kmem cache
Two failure patterns are seen randomly when running slub_kunit tests with
CONFIG_SLAB_FREELIST_RANDOM and CONFIG_SLAB_FREELIST_HARDENED enabled.

Pattern 1:
     # test_clobber_zone: pass:1 fail:0 skip:0 total:1
     ok 1 test_clobber_zone
     # test_next_pointer: EXPECTATION FAILED at lib/slub_kunit.c:72
     Expected 3 == slab_errors, but
         slab_errors == 0 (0x0)
     # test_next_pointer: EXPECTATION FAILED at lib/slub_kunit.c:84
     Expected 2 == slab_errors, but
         slab_errors == 0 (0x0)
     # test_next_pointer: pass:0 fail:1 skip:0 total:1
     not ok 2 test_next_pointer

In this case, test_next_pointer() overwrites p[s->offset], but the data
at p[s->offset] is already 0x12.

Pattern 2:
     ok 1 test_clobber_zone
     # test_next_pointer: EXPECTATION FAILED at lib/slub_kunit.c:72
     Expected 3 == slab_errors, but
         slab_errors == 2 (0x2)
     # test_next_pointer: pass:0 fail:1 skip:0 total:1
     not ok 2 test_next_pointer

In this case, p[s->offset] has a value other than 0x12, but one of the
expected failures is nevertheless missing.

Invert data instead of writing a fixed value to corrupt the cache data
structures to fix the problem.

Fixes: 1f9f78b1b3 ("mm/slub, kunit: add a KUnit test for SLUB debugging functionality")
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
CC: Daniel Latypov <dlatypov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-04 16:08:10 +02:00
Arnd Bergmann
bbda3ba626 ubsan: fix unused variable warning in test module
This is one of the drivers with an unused variable that is marked 'const'.
Adding a __used annotation here avoids the warning and lets us enable
the option by default:

lib/test_ubsan.c:137:28: error: unused variable 'skip_ubsan_array' [-Werror,-Wunused-const-variable]

Fixes: 4a26f49b7b ("ubsan: expand tests and reporting")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20240403080702.3509288-3-arnd@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2024-04-03 14:35:57 -07:00
Alexander Lobakin
7adaf37f7f lib/bitmap: add compile-time test for __assign_bit() optimization
Commit dc34d50366 ("lib: test_bitmap: add compile-time
optimization/evaluations assertions") initially missed __assign_bit(),
which led to that quite a time passed before I realized it doesn't get
optimized at compilation time. Now that it does, add test for that just
to make sure nothing will break one day.
To make things more interesting, use bitmap_complement() and
bitmap_full(), thus checking their compile-time evaluation as well. And
remove the misleading comment mentioning the workaround removed recently
in favor of adding the whole file to GCov exceptions.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:28 +01:00
Alexander Lobakin
a37fbe666c bitmap: introduce generic optimized bitmap_size()
The number of times yet another open coded
`BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
Some generic helper is long overdue.

Add one, bitmap_size(), but with one detail.
BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
divident and divisor are compile-time constants or when the divisor
is not a pow-of-2. When it is however, the compilers sometimes tend
to generate suboptimal code (GCC 13):

48 83 c0 3f          	add    $0x3f,%rax
48 c1 e8 06          	shr    $0x6,%rax
48 8d 14 c5 00 00 00 00	lea    0x0(,%rax,8),%rdx

%BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
full division of `nbits + 63` by it and then multiplication by 8.
Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:

8d 50 3f             	lea    0x3f(%rax),%edx
c1 ea 03             	shr    $0x3,%edx
81 e2 f8 ff ff 1f    	and    $0x1ffffff8,%edx

Now it shifts `nbits + 63` by 3 positions (IOW performs fast division
by 8) and then masks bits[2:0]. bloat-o-meter:

add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)

Clang does it better and generates the same code before/after starting
from -O1, except that with the ALIGN() approach it uses %edx and thus
still saves some bytes:

add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)

Note that we can't expand DIV_ROUND_UP() by adding a check and using
this approach there, as it's used in array declarations where
expressions are not allowed.
Add this helper to tools/ as well.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:28 +01:00
Alexander Potapenko
f3e28876b6 lib/test_bitmap: use pr_info() for non-error messages
pr_err() messages may be treated as errors by some log readers, so let
us only use them for test failures. For non-error messages, replace them
with pr_info().

Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:27 +01:00
Alexander Potapenko
991e558364 lib/test_bitmap: add tests for bitmap_{read,write}()
Add basic tests ensuring that values can be added at arbitrary positions
of the bitmap, including those spanning into the adjacent unsigned
longs.

Two new performance tests, test_bitmap_read_perf() and
test_bitmap_write_perf(), can be used to assess future performance
improvements of bitmap_read() and bitmap_write():

[    0.431119][    T1] test_bitmap: Time spent in test_bitmap_read_perf:	615253
[    0.433197][    T1] test_bitmap: Time spent in test_bitmap_write_perf:	916313

(numbers from a Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz machine running
QEMU).

Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:27 +01:00
Ingo Molnar
f4566a1e73 Linux 6.9-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmYAlq0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYqwH/0fb4pRbVtULpiIK
 Cs7/e/IWzRRWLBq+Jj2KVVTxwjyiKFNOq6K/CHHnljIWo1yN2CIWeOgbHfTI0WfN
 xmBdJP7OtK8MCN9PwwoWhZxMLcyv4pFCERrrkGa7AD+cdN4j/ytQ3mH5V8f/21fd
 rnpQSdpgGXB2SSMHd520Y+e56+gxrrTmsDXjZWM08Wt0bbqAWJrjNe58BMz5hI1t
 yQtcgYRTdUuZBn5TMkT99lK9EFQslV38YCo7RUP5D0DWXS1jSfWlgnCD1Nc1ziF4
 ps/xPdUMDJAc5Tslg/hgJOciSuLqgMzIUsVgZrKysuu3NhwDY1LDWGORmH1t8E8W
 RC25950=
 =F+01
 -----END PGP SIGNATURE-----

Merge tag 'v6.9-rc1' into sched/core, to pick up fixes and to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-03-25 11:32:29 +01:00