2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

443 Commits

Author SHA1 Message Date
Heiko Carstens
1f266fd704 s390/lowcore: Remove unused machine_flags
The machine_flags member in struct lowcore is not used anymore.
Remove it.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-31 12:20:39 +02:00
Heiko Carstens
841f35a08d s390/bear: Convert cpu_has_bear() to cpu feature function
Get rid of the cpu_has_bear jump label and convert cpu_has_bear() to a cpu
feature function using test_facility() and with that use a static branch.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:07 +01:00
Heiko Carstens
52109a067a s390: Convert MACHINE_IS_[LPAR|VM|KVM], etc, machine_is_[lpar|vm|kvm]()
Move machine type detection to the decompressor and use static branches
to implement and use machine_is_[lpar|vm|kvm]() instead of a runtime check
via MACHINE_IS_[LPAR|VM|KVM].

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:07 +01:00
Heiko Carstens
e4da8249cf s390/lowcore: Convert relocated lowcore alternative to machine feature
Convert the explicit relocated lowcore alternative type to a more
generic machine feature. This only reduces the number of alternative
types, but has no impact on code generation.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:05 +01:00
Heiko Carstens
3f5eede6df s390/cpufeature: Convert MACHINE_HAS_EDAT2 to cpu_has_edat2()
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead
of testing the machine_flags lowcore member if the feature is present.

test_facility() generates better code since it results in a static branch
without accessing memory. The branch is patched via alternatives by the
decompressor depending on the availability of the required facility.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:05 +01:00
Linus Torvalds
b731bc5f49 more s390 updates for 6.14 merge window
- The rework that uncoupled physical and virtual address spaces
   inadvertently prevented KASAN shadow mappings from using large
   pages. Restore large page mappings for KASAN shadows
 
 - Add decompressor routine physmem_alloc() that may fail,
   unlike physmem_alloc_or_die(). This allows callers to
   implement fallback paths
 
 - Allow falling back from large pages to smaller pages (1MB or
   4KB) if the allocation of 2GB pages  in the decompressor can
   not be fulfilled
 
 - Add to the decompressor boot print support of "%%" format
   string, width and padding hadnling, length modifiers and
   decimal conversion specifiers
 
 - Add to the decompressor message severity levels similar to
   kernel ones. Support command-line options that control
   console output verbosity
 
 - Replaces boot_printk() calls with appropriate loglevel-
   specific helpers such as boot_emerg(), boot_warn(), and
   boot_debug().
 
 - Collect all boot messages into a ring buffer independent
   of the current log level. This is particularly useful for
   early crash analysis
 
 - If 'earlyprintk' command line parameter is not specified, store
   decompressor boot messages in a ring buffer to be printed later
   by the kernel, once the console driver is registered
 
 - Add 'bootdebug' command line parameter to enable printing of
   decompressor debug messages when needed. That parameters allows
   message supressing and filtering
 
 - Dump boot messages on a decompressor crash, but only if
   'bootdebug' command line parameter is enabled
 
 - When CONFIG_PRINTK_TIME is enabled, add timestamps to boot
   messages in the same format as regular printk()
 
 - Dump physical memory tracking information on boot:
   online ranges, reserved areas and vmem allocations
 
 - Dump virtual memory layout and randomization details
 
 - Improve decompression error reporting and dump the message
   ring buffer in case the boot failed and system halted
 
 - Add an exception handler which handles exceptions when FPU
   control register is attempted to be set to an invalid value.
   Remove '.fixup' section as result of this change
 
 - Use 'A', 'O', and 'R' inline assembly format flags, which
   allows recent Clang compilers to generate better FPU code
 
 - Rework uaccess code so it reads better and generates more
   efficient code
 
 - Cleanup futex inline assembly code
 
 - Disable KMSAN instrumention for futex inline assemblies, which
   contain dereferenced user pointers. Otherwise, shadows for the
   user pointers would be accessed
 
 - PFs which are not initially configured but in standby create
   only a single-function PCI domain. If they are configured later
   on, sibling PFs and their child VFs will not be added to their
   PCI domain breaking SR-IOV expectations. Fix that by allowing
   initially configured but in standby PFs create multi-function
   PCI domains
 
 - Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
   compile errors caused by kernel's own definitions of 'bool',
   'false', and 'true' conflicting with the C23 reserved keywords
 
 - Fix sclp subsystem failure when a sclp console is not present
 
 - Fix misuse of non-NULL terminated strings in vmlogrdr driver
 
 - Various other small improvements, cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZ5t0jhccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8A4wAP91BcHtV4OTmoMsG4JTwiy9wTzI
 zro9IrBSCLPLED7kfQEAzjEspd5gWMFDnIM/KxXmCNHA/nBTXVh/1F+b1F0z+Q8=
 =JfEG
 -----END PGP SIGNATURE-----

Merge tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Alexander Gordeev:

 - The rework that uncoupled physical and virtual address spaces
   inadvertently prevented KASAN shadow mappings from using large pages.
   Restore large page mappings for KASAN shadows

 - Add decompressor routine physmem_alloc() that may fail, unlike
   physmem_alloc_or_die(). This allows callers to implement fallback
   paths

 - Allow falling back from large pages to smaller pages (1MB or 4KB) if
   the allocation of 2GB pages in the decompressor can not be fulfilled

 - Add to the decompressor boot print support of "%%" format string,
   width and padding hadnling, length modifiers and decimal conversion
   specifiers

 - Add to the decompressor message severity levels similar to kernel
   ones. Support command-line options that control console output
   verbosity

 - Replaces boot_printk() calls with appropriate loglevel- specific
   helpers such as boot_emerg(), boot_warn(), and boot_debug().

 - Collect all boot messages into a ring buffer independent of the
   current log level. This is particularly useful for early crash
   analysis

 - If 'earlyprintk' command line parameter is not specified, store
   decompressor boot messages in a ring buffer to be printed later by
   the kernel, once the console driver is registered

 - Add 'bootdebug' command line parameter to enable printing of
   decompressor debug messages when needed. That parameters allows
   message suppressing and filtering

 - Dump boot messages on a decompressor crash, but only if 'bootdebug'
   command line parameter is enabled

 - When CONFIG_PRINTK_TIME is enabled, add timestamps to boot messages
   in the same format as regular printk()

 - Dump physical memory tracking information on boot: online ranges,
   reserved areas and vmem allocations

 - Dump virtual memory layout and randomization details

 - Improve decompression error reporting and dump the message ring
   buffer in case the boot failed and system halted

 - Add an exception handler which handles exceptions when FPU control
   register is attempted to be set to an invalid value. Remove '.fixup'
   section as result of this change

 - Use 'A', 'O', and 'R' inline assembly format flags, which allows
   recent Clang compilers to generate better FPU code

 - Rework uaccess code so it reads better and generates more efficient
   code

 - Cleanup futex inline assembly code

 - Disable KMSAN instrumention for futex inline assemblies, which
   contain dereferenced user pointers. Otherwise, shadows for the user
   pointers would be accessed

 - PFs which are not initially configured but in standby create only a
   single-function PCI domain. If they are configured later on, sibling
   PFs and their child VFs will not be added to their PCI domain
   breaking SR-IOV expectations.

   Fix that by allowing initially configured but in standby PFs create
   multi-function PCI domains

 - Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
   compile errors caused by kernel's own definitions of 'bool', 'false',
   and 'true' conflicting with the C23 reserved keywords

 - Fix sclp subsystem failure when a sclp console is not present

 - Fix misuse of non-NULL terminated strings in vmlogrdr driver

 - Various other small improvements, cleanups and fixes

* tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (53 commits)
  s390/vmlogrdr: Use array instead of string initializer
  s390/vmlogrdr: Use internal_name for error messages
  s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()
  s390/tools: Use array instead of string initializer
  s390/vmem: Fix null-pointer-arithmetic warning in vmem_map_init()
  s390: Add '-std=gnu11' to decompressor and purgatory CFLAGS
  s390/bitops: Use correct constraint for arch_test_bit() inline assembly
  s390/pci: Fix SR-IOV for PFs initially in standby
  s390/futex: Avoid KMSAN instrumention for user pointers
  s390/uaccess: Rename get_put_user_noinstr_attributes to uaccess_kmsan_or_inline
  s390/futex: Cleanup futex_atomic_cmpxchg_inatomic()
  s390/futex: Generate futex atomic op functions
  s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER
  s390/uaccess: Use asm goto for put_user()/get_user()
  s390/uaccess: Remove usage of the oac specifier
  s390/uaccess: Replace EX_TABLE_UA_LOAD_MEM exception handling
  s390/uaccess: Cleanup noinstr __put_user()/__get_user() inline assembly constraints
  s390/uaccess: Remove __put_user_fn()/__get_user_fn() wrappers
  s390/uaccess: Move put_user() / __put_user() close to put_user() asm code
  s390/uaccess: Use asm goto for __mvc_kernel_nofault()
  ...
2025-01-30 10:48:17 -08:00
Heiko Carstens
3bcc8a1af5 s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()
With the switch to GENERIC_CPU_DEVICES an early call to the sclp subsystem
was added to smp_prepare_cpus(). This will usually succeed since the sclp
subsystem is implicitly initialized early enough if an sclp based console
is present.

If no such console is present the initialization happens with an
arch_initcall(); in such cases calls to the sclp subsystem will fail.
For CPU detection this means that the fallback sigp loop will be used
permanently to detect CPUs instead of the preferred READ_CPU_INFO sclp
request.

Fix this by adding an explicit early sclp_init() call via
arch_cpu_finalize_init().

Reported-by: Sheshu Ramanandan <sheshu.ramanandan@ibm.com>
Fixes: 4a39f12e75 ("s390/smp: Switch to GENERIC_CPU_DEVICES")
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-28 17:38:46 +01:00
Linus Torvalds
9c5968db9e The various patchsets are summarized below. Plus of course many
indivudual patches which are described in their changelogs.
 
 - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the
   page allocator so we end up with the ability to allocate and free
   zero-refcount pages.  So that callers (ie, slab) can avoid a refcount
   inc & dec.
 
 - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use
   large folios other than PMD-sized ones.
 
 - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and
   fixes for this small built-in kernel selftest.
 
 - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of
   the mapletree code.
 
 - "mm: fix format issues and param types" from Keren Sun implements a
   few minor code cleanups.
 
 - "simplify split calculation" from Wei Yang provides a few fixes and a
   test for the mapletree code.
 
 - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes
   continues the work of moving vma-related code into the (relatively) new
   mm/vma.c.
 
 - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
   Hildenbrand cleans up and rationalizes handling of gfp flags in the page
   allocator.
 
 - "readahead: Reintroduce fix for improper RA window sizing" from Jan
   Kara is a second attempt at fixing a readahead window sizing issue.  It
   should reduce the amount of unnecessary reading.
 
 - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
   addresses an issue where "huge" amounts of pte pagetables are
   accumulated
   (https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/).
   Qi's series addresses this windup by synchronously freeing PTE memory
   within the context of madvise(MADV_DONTNEED).
 
 - "selftest/mm: Remove warnings found by adding compiler flags" from
   Muhammad Usama Anjum fixes some build warnings in the selftests code
   when optional compiler warnings are enabled.
 
 - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David
   Hildenbrand tightens the allocator's observance of __GFP_HARDWALL.
 
 - "pkeys kselftests improvements" from Kevin Brodsky implements various
   fixes and cleanups in the MM selftests code, mainly pertaining to the
   pkeys tests.
 
 - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
   estimate application working set size.
 
 - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
   provides some cleanups to memcg's hugetlb charging logic.
 
 - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
   removes the global swap cgroup lock.  A speedup of 10% for a tmpfs-based
   kernel build was demonstrated.
 
 - "zram: split page type read/write handling" from Sergey Senozhatsky
   has several fixes and cleaups for zram in the area of zram_write_page().
   A watchdog softlockup warning was eliminated.
 
 - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky
   cleans up the pagetable destructor implementations.  A rare
   use-after-free race is fixed.
 
 - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
   simplifies and cleans up the debugging code in the VMA merging logic.
 
 - "Account page tables at all levels" from Kevin Brodsky cleans up and
   regularizes the pagetable ctor/dtor handling.  This results in
   improvements in accounting accuracy.
 
 - "mm/damon: replace most damon_callback usages in sysfs with new core
   functions" from SeongJae Park cleans up and generalizes DAMON's sysfs
   file interface logic.
 
 - "mm/damon: enable page level properties based monitoring" from
   SeongJae Park increases the amount of information which is presented in
   response to DAMOS actions.
 
 - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes
   DAMON's long-deprecated debugfs interfaces.  Thus the migration to sysfs
   is completed.
 
 - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter
   Xu cleans up and generalizes the hugetlb reservation accounting.
 
 - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
   removes a never-used feature of the alloc_pages_bulk() interface.
 
 - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
   extends DAMOS filters to support not only exclusion (rejecting), but
   also inclusion (allowing) behavior.
 
 - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
   "introduces a new memory descriptor for zswap.zpool that currently
   overlaps with struct page for now.  This is part of the effort to reduce
   the size of struct page and to enable dynamic allocation of memory
   descriptors."
 
 - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and
   simplifies the swap allocator locking.  A speedup of 400% was
   demonstrated for one workload.  As was a 35% reduction for kernel build
   time with swap-on-zram.
 
 - "mm: update mips to use do_mmap(), make mmap_region() internal" from
   Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
   mmap_region() can be made MM-internal.
 
 - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU
   regressions and otherwise improves MGLRU performance.
 
 - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park
   updates DAMON documentation.
 
 - "Cleanup for memfd_create()" from Isaac Manjarres does that thing.
 
 - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand
   provides various cleanups in the areas of hugetlb folios, THP folios and
   migration.
 
 - "Uncached buffered IO" from Jens Axboe implements the new
   RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache
   reading and writing.  To permite userspace to address issues with
   massive buildup of useless pagecache when reading/writing fast devices.
 
 - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
   Weißschuh fixes and optimizes some of the MM selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5a+cwAKCRDdBJ7gKXxA
 jtoyAP9R58oaOKPJuTizEKKXvh/RpMyD6sYcz/uPpnf+cKTZxQEAqfVznfWlw/Lz
 uC3KRZYhmd5YrxU4o+qjbzp9XWX/xAE=
 =Ib2s
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "The various patchsets are summarized below. Plus of course many
  indivudual patches which are described in their changelogs.

   - "Allocate and free frozen pages" from Matthew Wilcox reorganizes
     the page allocator so we end up with the ability to allocate and
     free zero-refcount pages. So that callers (ie, slab) can avoid a
     refcount inc & dec

   - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
     use large folios other than PMD-sized ones

   - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
     and fixes for this small built-in kernel selftest

   - "mas_anode_descend() related cleanup" from Wei Yang tidies up part
     of the mapletree code

   - "mm: fix format issues and param types" from Keren Sun implements a
     few minor code cleanups

   - "simplify split calculation" from Wei Yang provides a few fixes and
     a test for the mapletree code

   - "mm/vma: make more mmap logic userland testable" from Lorenzo
     Stoakes continues the work of moving vma-related code into the
     (relatively) new mm/vma.c

   - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
     Hildenbrand cleans up and rationalizes handling of gfp flags in the
     page allocator

   - "readahead: Reintroduce fix for improper RA window sizing" from Jan
     Kara is a second attempt at fixing a readahead window sizing issue.
     It should reduce the amount of unnecessary reading

   - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
     addresses an issue where "huge" amounts of pte pagetables are
     accumulated:

       https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/

     Qi's series addresses this windup by synchronously freeing PTE
     memory within the context of madvise(MADV_DONTNEED)

   - "selftest/mm: Remove warnings found by adding compiler flags" from
     Muhammad Usama Anjum fixes some build warnings in the selftests
     code when optional compiler warnings are enabled

   - "mm: don't use __GFP_HARDWALL when migrating remote pages" from
     David Hildenbrand tightens the allocator's observance of
     __GFP_HARDWALL

   - "pkeys kselftests improvements" from Kevin Brodsky implements
     various fixes and cleanups in the MM selftests code, mainly
     pertaining to the pkeys tests

   - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
     estimate application working set size

   - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
     provides some cleanups to memcg's hugetlb charging logic

   - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
     removes the global swap cgroup lock. A speedup of 10% for a
     tmpfs-based kernel build was demonstrated

   - "zram: split page type read/write handling" from Sergey Senozhatsky
     has several fixes and cleaups for zram in the area of
     zram_write_page(). A watchdog softlockup warning was eliminated

   - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
     Brodsky cleans up the pagetable destructor implementations. A rare
     use-after-free race is fixed

   - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
     simplifies and cleans up the debugging code in the VMA merging
     logic

   - "Account page tables at all levels" from Kevin Brodsky cleans up
     and regularizes the pagetable ctor/dtor handling. This results in
     improvements in accounting accuracy

   - "mm/damon: replace most damon_callback usages in sysfs with new
     core functions" from SeongJae Park cleans up and generalizes
     DAMON's sysfs file interface logic

   - "mm/damon: enable page level properties based monitoring" from
     SeongJae Park increases the amount of information which is
     presented in response to DAMOS actions

   - "mm/damon: remove DAMON debugfs interface" from SeongJae Park
     removes DAMON's long-deprecated debugfs interfaces. Thus the
     migration to sysfs is completed

   - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
     Peter Xu cleans up and generalizes the hugetlb reservation
     accounting

   - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
     removes a never-used feature of the alloc_pages_bulk() interface

   - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
     extends DAMOS filters to support not only exclusion (rejecting),
     but also inclusion (allowing) behavior

   - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
     introduces a new memory descriptor for zswap.zpool that currently
     overlaps with struct page for now. This is part of the effort to
     reduce the size of struct page and to enable dynamic allocation of
     memory descriptors

   - "mm, swap: rework of swap allocator locks" from Kairui Song redoes
     and simplifies the swap allocator locking. A speedup of 400% was
     demonstrated for one workload. As was a 35% reduction for kernel
     build time with swap-on-zram

   - "mm: update mips to use do_mmap(), make mmap_region() internal"
     from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
     mmap_region() can be made MM-internal

   - "mm/mglru: performance optimizations" from Yu Zhao fixes a few
     MGLRU regressions and otherwise improves MGLRU performance

   - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
     Park updates DAMON documentation

   - "Cleanup for memfd_create()" from Isaac Manjarres does that thing

   - "mm: hugetlb+THP folio and migration cleanups" from David
     Hildenbrand provides various cleanups in the areas of hugetlb
     folios, THP folios and migration

   - "Uncached buffered IO" from Jens Axboe implements the new
     RWF_DONTCACHE flag which provides synchronous dropbehind for
     pagecache reading and writing. To permite userspace to address
     issues with massive buildup of useless pagecache when
     reading/writing fast devices

   - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
     Weißschuh fixes and optimizes some of the MM selftests"

* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm/compaction: fix UBSAN shift-out-of-bounds warning
  s390/mm: add missing ctor/dtor on page table upgrade
  kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
  tools: add VM_WARN_ON_VMG definition
  mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
  seqlock: add missing parameter documentation for raw_seqcount_try_begin()
  mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
  mm/page_alloc: remove the incorrect and misleading comment
  zram: remove zcomp_stream_put() from write_incompressible_page()
  mm: separate move/undo parts from migrate_pages_batch()
  mm/kfence: use str_write_read() helper in get_access_type()
  selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
  kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
  selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
  selftests/mm: vm_util: split up /proc/self/smaps parsing
  selftests/mm: virtual_address_range: unmap chunks after validation
  selftests/mm: virtual_address_range: mmap() without PROT_WRITE
  selftests/memfd/memfd_test: fix possible NULL pointer dereference
  mm: add FGP_DONTCACHE folio creation flag
  mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
  ...
2025-01-26 18:36:23 -08:00
Vasily Gorbik
d7bebcb4a8 s390: Optimize __pa/__va when RANDOMIZE_IDENTITY_BASE is off
Use a zero identity base when CONFIG_RANDOMIZE_IDENTITY_BASE is off,
slightly optimizing __pa/__va calculations.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:03 +01:00
Vasily Gorbik
9688b17b4a s390/boot: Add physmem tracking debug support
Introduce boot_debug() calls to track memory detection, online ranges,
reserved areas, and allocations (except for VMEM allocations, which are
too frequent). Instead introduce dump_physmem_reserved() function which
prints out full memory tracking information. This helps in debugging
early boot memory handling.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:02 +01:00
Vasily Gorbik
70309dc776 s390/boot: Add timestamps to early boot messages
When CONFIG_PRINTK_TIME is enabled, add timestamps to boot messages in
the same format as regular printk. Timestamps appear only with earlyprintk
and are stored in the boot messages ring buffer, but are not propagated
to main kernel messages (if earlyprintk is not enabled). This prevents
double timestamps in the output.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:02 +01:00
Vasily Gorbik
b79015ae63 s390/boot: Add prefix filtering to bootdebug messages
Enhance boot debugging by allowing the "bootdebug" kernel parameter to
accept an optional comma-separated list of prefixes. Only debug messages
starting with these prefixes will be printed during boot. For example:

    bootdebug=startup,vmem

Not specifying a filter for the "bootdebug" parameter prints all debug
messages. The `boot_fmt` macro can be defined to set a common prefix:

    #define boot_fmt(fmt) "startup: " fmt

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:01 +01:00
Vasily Gorbik
d20d8e5133 s390/boot: Add bootdebug option to control debug messages
Suppress decompressor debug messages by default, similar to regular
kernel debug messages that require 'DEBUG' or 'dyndbg' to be enabled
(depending on CONFIG_DYNAMIC_DEBUG). Introduce a 'bootdebug' option to
enable printing these messages when needed.

All messages are still stored in the boot ring buffer regardless.

To enable boot debug messages:

  bootdebug debug

Or combine with 'earlyprintk' to print them without delay:

  bootdebug debug earlyprintk

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:01 +01:00
Vasily Gorbik
c09f8d0ad6 s390/boot: Defer boot messages when earlyprintk is not enabled
When earlyprintk is not specified, boot messages are only stored in a
ring buffer to be printed later by printk when console driver is
registered.

Critical messages from boot_emerg() are always printed immediately,
even without earlyprintk.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:01 +01:00
Guo Weikang
c6f239796b mm/memblock: add memblock_alloc_or_panic interface
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory.  In most cases, when memory allocation fails, an
immediate panic is required.  To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`.  This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.

[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
  Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
  Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>	[s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:38 -08:00
Heiko Carstens
f8107a8be0 s390/mm: Simplify noexec page protection handling
By default page protection definitions like PAGE_RX have the _PAGE_NOEXEC
bit set. For older machines without the instruction execution protection
facility this bit is not allowed to be used in page table entries, and
therefore must be removed.

This is done at a couple of page table walkers, but also at some but not
all page table modification functions like ptep_modify_prot_commit(). Avoid
all of this and change the page, segment and region3 protection definitions
so that the noexec bit is masked out automatically if the instruction
execution-protection facility is not available. This is similar to what
also various other architectures do which had to solve the same problem.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-17 12:46:13 +01:00
Vasily Gorbik
912a0d3523 s390: Remove __bootdata annotations from declarations
For consistency, remove the `__bootdata` and `__bootdata_preserved`
section annotations from variable declarations in header files. Section
annotations should be applied to definitions, not declarations. This
change moves the annotations to the variable definitions in the
corresponding source files.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-15 16:19:04 +01:00
Heiko Carstens
7ad0075005 s390/setup: Cleanup stack_alloc() and stack_free()
Some small cleanups to stack_alloc() and stack_free():

- Rename ret to stack to reflect what the variable is used for
- Whitespace removal

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-10 15:41:58 +01:00
Heiko Carstens
27939d6cde s390/Kconfig: Select VMAP_STACK unconditionally
There is no point in supporting !VMAP_STACK kernel builds. VMAP_STACK has
proven to work since many years. Also, since KASAN_VMALLOC is supported,
kernels built with !VMAP_STACK are completely untested.

Therefore select VMAP_STACK unconditionally and remove all config options
and code required for !VMAP_STACK builds.

Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-10 15:41:58 +01:00
Alexander Gordeev
a3ca27c405 s390/mm: Prevent lowcore vs identity mapping overlap
The identity mapping position in virtual memory is randomized
together with the kernel mapping. That position can never
overlap with the lowcore even when the lowcore is relocated.

Prevent overlapping with the lowcore to allow independent
positioning of the identity mapping. With the current value
of the alternative lowcore address of 0x70000 the overlap
could happen in case the identity mapping is placed at zero.

This is a prerequisite for uncoupling of randomization base
of kernel image and identity mapping in virtual memory.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-21 16:14:45 +02:00
Sven Schnelle
5ade5be4ed s390: Add infrastructure to patch lowcore accesses
The s390 architecture defines two special per-CPU data pages
called the "prefix area". In s390-linux terminology this is usually
called "lowcore". This memory area contains system configuration
data like old/new PSW's for system call/interrupt/machine check
handlers and lots of other data. It is normally mapped to logical
address 0. This area can only be accessed when in supervisor mode.

This means that kernel code can dereference NULL pointers, because
accesses to address 0 are allowed. Parts of lowcore can be write
protected, but read accesses and write accesses outside of the write
protected areas are not caught.

To remove this limitation for debugging and testing, remap lowcore to
another address and define a function get_lowcore() which simply
returns the address where lowcore is mapped at. This would normally
introduce a pointer dereference (=memory read). As lowcore is used
for several very often used variables, add code to patch this function
during runtime, so we avoid the memory reads.

For C code get_lowcore() has to be used, for assembly code it is
the GET_LC macro. When using this macro/function a reference is added
to alternative patching. All these locations will be patched to the
actual lowcore location when the kernel is booted or a module is loaded.

To make debugging/bisecting problems easier, this patch adds all the
infrastructure but the lowcore address is still hardwired to 0. This
way the code can be converted on a per function basis, and the
functionality is enabled in a patch after all the functions have
been converted.

Note that this requires at least z16 because the old lpsw instruction
only allowed a 12 bit displacement. z16 introduced lpswey which allows
20 bits (signed), so the lowcore can effectively be mapped from
address 0 - 0x7e000. To use 0x7e000 as address, a 6 byte lgfi
instruction would have to be used in the alternative. To save two
bytes, llilh can be used, but this only allows to set bits 16-31 of
the address. In order to use the llilh instruction, use 0x70000 as
alternative lowcore address. This is still large enough to catch
NULL pointer dereferences into large arrays.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23 16:02:32 +02:00
Heiko Carstens
beb8cee06f s390/alternatives: Remove alternative facility list
The alternative and the normal facility list are always identical. Remove
the alternative facility list, which allows to simplify the alternatives
code.

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23 16:02:31 +02:00
Sven Schnelle
d3604ffba1 s390: Move CIF flags to struct pcpu
To allow testing flags for offline CPUs, move the CIF flags
to struct pcpu. To avoid having to calculate the array index
for each access, add a pointer to the pcpu member for the current
cpu to lowcore.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23 16:02:31 +02:00
Vasily Gorbik
e188e5d5ff s390/setup: Fix __pa/__va for modules under non-GPL licenses
The struct vm_layout contains fields used in __pa/__va calculations. Such
fundamental things have to be exported with EXPORT_SYMBOL to avoid
breakages of out-of-tree modules under non-GPL licenses.

Fixes: 7de0446f0b ("s390/boot: Make identity mapping base address explicit")
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23 15:54:58 +02:00
Sven Schnelle
208da1d5fc s390: Replace S390_lowcore by get_lowcore()
Replace all S390_lowcore usages in arch/s390/ by get_lowcore().

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-06-18 17:01:33 +02:00
Sven Schnelle
e7dec0b792 s390/boot: Remove alt_stfle_fac_list from decompressor
It is nowhere used in the decompressor, therefore remove it.

Fixes: 17e89e1340 ("s390/facilities: move stfl information from lowcore to global data")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-05-16 10:17:12 +02:00
Alexander Gordeev
5fb50fa66a s390/boot: Make .amode31 section address range explicit
This is a preparatory rework to allow uncoupling virtual
and physical addresses spaces.

Introduce .amode31 section address range AMODE31_START
and AMODE31_END macros for later use.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17 13:38:00 +02:00
Alexander Gordeev
7de0446f0b s390/boot: Make identity mapping base address explicit
This is a preparatory rework to allow uncoupling virtual
and physical addresses spaces.

Currently the identity mapping base address is implicit
and is always set to zero. Make it explicit by putting
into __identity_base persistent boot variable and use it
in proper context - which is the value of PAGE_OFFSET.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17 13:38:00 +02:00
Alexander Gordeev
236f324b74 s390/mm: Create virtual memory layout structure
This is a preparatory rework to allow uncoupling virtual
and physical addresses spaces.

Put virtual memory layout information into a structure
to improve code generation when accessing the structure
members, which are currently only ident_map_size and
__kaslr_offset.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17 13:38:00 +02:00
Gerald Schaefer
e8054eaeb5 s390/setup: fix virtual vs physical address confusion
Fix virtual vs physical address confusion. This does not fix a bug
since virtual and physical address spaces are currently the same.

/proc/iomem should report the physical address ranges, so use __pa_symbol()
for resource registration, similar to other architectures.

Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:12 +01:00
Heiko Carstens
d7f679ec86 s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support
s390 selects ARCH_WANTS_DYNAMIC_TASK_STRUCT in order to make the size of
the task structure dependent on the availability of the vector
facility. This doesn't make sense anymore because since many years all
machines provide the vector facility.

Therefore simplify the code a bit and remove s390 support for
ARCH_WANTS_DYNAMIC_TASK_STRUCT.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-11 14:33:06 +01:00
Heiko Carstens
84e599e3ad s390/nmi: consistently enable machine checks in trap_init()
The kernel starts with machine checks disabled (machine check mask bit in
the PSW is zero), and machine checks are enabled when trap_init() is
called. The rationale is that this allows to assume that the system is
initialized up to a certain point before the machine check handler may be
invoked.

However the implementation is incomplete: all new PSW masks in lowcore have
the machine check mask bit. This means that e.g. for any early program
check machine checks are enabled within the program check handler. This
contradicts the whole point of enabling machine checks at a single place.

Change this and initialize all new PSWs in lowcore so they have the machine
check mask bit not set. Set the bit in all masks in trap_init(). This way
machine check enabling is consistent.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-11 14:33:05 +01:00
Linus Torvalds
e392ea4d4d s390 updates for the 6.7 merge window
- Get rid of private VM_FAULT flags
 
 - Add word-at-a-time implementation
 
 - Add DCACHE_WORD_ACCESS support
 
 - Cleanup control register handling
 
 - Disallow CPU hotplug of CPU 0 to simplify its handling complexity,
   following a similar restriction in x86
 
 - Optimize pai crypto map allocation
 
 - Update the list of crypto express EP11 coprocessor operation modes
 
 - Fixes and improvements for secure guests AP pass-through
 
 - Several fixes to address incorrect page marking for address translation
   with the "cmma no-dat" feature, preventing potential incorrect guest
   TLB flushes
 
 - Fix early IPI handling
 
 - Several virtual vs physical address confusion fixes
 
 - Various small fixes and improvements all over the code
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmVFLkYACgkQjYWKoQLX
 FBgxRwf9FSNFwLcbYbG1x94rUUHnbaiyJWCezp3/ypr+m+qDvQatLYc75SxwrH0y
 ocSygqvtVryVkWAKKvOHF1Kg5R2Fedmzf5wuVTXglfPqE1ZgMGdwS/LtknIoz556
 twZJIpFzUFt5xaljpTCZJanLMvy/npl0bilezhNGl6v7N5rsWLbfK6vsPMDm+TTZ
 yscapOsk8Z16NjXq0FETS5JHG65jjj9rkRfb0qD8SOFhti0fR9MSP2xeRXrDMDZE
 IWXog5usx2DS6VX2HnxA8O7z1hhuTccJ1K1+rYqbb0Fwccqi7QaGZXEvocYEvlvy
 lVe3/jbyn27hUoypHcfVCAVxdoOrnw==
 =SMOp
 -----END PGP SIGNATURE-----

Merge tag 's390-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Get rid of private VM_FAULT flags

 - Add word-at-a-time implementation

 - Add DCACHE_WORD_ACCESS support

 - Cleanup control register handling

 - Disallow CPU hotplug of CPU 0 to simplify its handling complexity,
   following a similar restriction in x86

 - Optimize pai crypto map allocation

 - Update the list of crypto express EP11 coprocessor operation modes

 - Fixes and improvements for secure guests AP pass-through

 - Several fixes to address incorrect page marking for address
   translation with the "cmma no-dat" feature, preventing potential
   incorrect guest TLB flushes

 - Fix early IPI handling

 - Several virtual vs physical address confusion fixes

 - Various small fixes and improvements all over the code

* tag 's390-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (74 commits)
  s390/cio: replace deprecated strncpy with strscpy
  s390/sclp: replace deprecated strncpy with strtomem
  s390/cio: fix virtual vs physical address confusion
  s390/cio: export CMG value as decimal
  s390: delete the unused store_prefix() function
  s390/cmma: fix handling of swapper_pg_dir and invalid_pg_dir
  s390/cmma: fix detection of DAT pages
  s390/sclp: handle default case in sclp memory notifier
  s390/pai_crypto: remove per-cpu variable assignement in event initialization
  s390/pai: initialize event count once at initialization
  s390/pai_crypto: use PERF_ATTACH_TASK define for per task detection
  s390/mm: add missing arch_set_page_dat() call to gmap allocations
  s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()
  s390/cmma: fix initial kernel address space page table walk
  s390/diag: add missing virt_to_phys() translation to diag224()
  s390/mm,fault: move VM_FAULT_ERROR handling to do_exception()
  s390/mm,fault: remove VM_FAULT_BADMAP and VM_FAULT_BADACCESS
  s390/mm,fault: remove VM_FAULT_SIGNAL
  s390/mm,fault: remove VM_FAULT_BADCONTEXT
  s390/mm,fault: simplify kfence fault handling
  ...
2023-11-03 10:17:22 -10:00
Baoquan He
a9e1a3d84e crash_core: change the prototype of function parse_crashkernel()
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added.  Make adjustments in
all call sites of parse_crashkernel() in arch.

Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:41:58 -07:00
Heiko Carstens
0b6529e3dc s390/setup: make use of system_ctl_load()
Use system_ctl_load() instead of local_ctl_load() to reflect that
control register changes are supposed to be global.

Even though setup_cr() was ok, note that the usage of local_ctl_load()
would have been wrong, if it would have happened after the control
register save area was initialized: only local control register contents
would have been changed, but wouldn't be used for new CPUs.

With using system_ctl_load() the caller doesn't need to take care.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:57 +02:00
Heiko Carstens
d11d5c8c84 s390/ctltreg: make initialization of control register save area explicit
Commit e1b9c2749a ("s390/smp: ensure global control register contents
are in sync") made the control register save area contained within the
lowcore at absolute address zero a resource which is used when
initializing CPUs. However this is anything but obvious from the code.

Add an ctlreg_init_save_area() function in order to make this explicit.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
527618abb9 s390/ctlreg: add struct ctlreg
Add struct ctlreg to enforce strict type checking / usage for control
register functions.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
8072597826 s390/ctlreg: change parameters of __local_ctl_load() and __local_ctl_store()
Change __local_ctl_load() and __local_ctl_store(), so that control
register parameters come first.

This way all control handling functions consistently have control
register(s) parameter first.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
2372d39142 s390/ctlreg: use local_ctl_load() and local_ctl_store() where possible
Convert all single control register usages of __local_ctl_load() and
__local_ctl_store() to local_ctl_load() and local_ctl_store().

This also requires to change the type of some struct lowcore members
from __u64 to unsigned long.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
8d5e98f8d6 s390/ctlreg: add local and system prefix to some functions
Add local and system prefix to some functions to clarify they change
control register contents on either the local CPU or the on all CPUs.

This results in the following API:

Two defines which load and save multiple control registers.
The defines correlate with the following C prototypes:

void __local_ctl_load(unsigned long *, unsigned int cr_low, unsigned int cr_high);
void __local_ctl_store(unsigned long *, unsigned int cr_low, unsigned int cr_high);

Two functions which locally set or clear one bit for a specified
control register:

void local_ctl_set_bit(unsigned int cr, unsigned int bit);
void local_ctl_clear_bit(unsigned int cr, unsigned int bit);

Two functions which set or clear one bit for a specified control
register on all CPUs:

void system_ctl_set_bit(unsigned int cr, unsigned int bit);
void system_ctl_clear_bit(unsigend int cr, unsigned int bit);

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
aa36d433b7 s390/setup: use strlcat() instead of strcat()
Use strlcat() instead of strcat() in order to get rid of this W=1
warning:

In function ‘strlcat’,
    inlined from ‘strcat’ at ./include/linux/fortify-string.h:432:6,
    inlined from ‘setup_zfcpdump’ at arch/s390/kernel/setup.c:308:2,
    inlined from ‘setup_arch’ at arch/s390/kernel/setup.c:1002:2:
./include/linux/fortify-string.h:406:19: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  406 |         p[actual] = '\0';
      |         ~~~~~~~~~~^~~~~~

As stated in fortify-string.h strcat() should not be used, since
FORTIFY_SOURCE cannot figure out the size of the destination buffer.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Ilya Leoshkevich
3570ee046c s390/smp: keep the original lowcore for CPU 0
Now that CPU 0 is not hotpluggable, it is not necessary to support
freeing its stacks. Delete all the code that migrates it to new stacks
and a new lowcore.

Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:55 +02:00
Heiko Carstens
3eeb07788f s390/amode31: change type of __samode31, __eamode31, etc
For consistencs reasons change the type of __samode31, __eamode31,
__stext_amode31, and __etext_amode31 to a char pointer so they
(nearly) match the type of all other sections.

This allows for code simplifications with follow-on patches.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-30 11:03:27 +02:00
Heiko Carstens
b6f10e2f66 s390: remove "noexec" option
Do the same like x86 with commit 76ea0025a2 ("x86/cpu: Remove "noexec"")
and remove the "noexec" kernel command line option.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-30 11:03:27 +02:00
Alexander Gordeev
979fe44af8 s390/ipl: fix virtual vs physical address confusion
The value of ipl_cert_list_addr boot variable contains
a physical address, which is used directly. That works
because virtual and physical address spaces are currently
the same, but otherwise it is wrong.

While at it, fix also a comment for the platform keyring.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/r/20230816132942.2540411-1-agordeev@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:08:12 +02:00
Alexander Gordeev
94fd522069 s390/mm: rework arch_get_mappable_range() callback
As per description in mm/memory_hotplug.c platforms should define
arch_get_mappable_range() that provides maximum possible addressable
physical memory range for which the linear mapping could be created.

The current implementation uses VMEM_MAX_PHYS macro as the maximum
mappable physical address and it is simply a cast to vmemmap. Since
the address is in physical address space the natural upper limit of
MAX_PHYSMEM_BITS is honoured:

	vmemmap_start = min(vmemmap_start, 1UL << MAX_PHYSMEM_BITS);

Further, to make sure the identity mapping would not overlay with
vmemmap, the size of identity mapping could be stripped like this:

	ident_map_size = min(ident_map_size, vmemmap_start);

Similarily, any other memory that could be added (e.g DCSS segment)
should not overlay with vmemmap as well and that is prevented by
using vmemmap (VMEM_MAX_PHYS macro) as the upper limit.

However, while the use of VMEM_MAX_PHYS brings the desired result
it actually poses two issues:

1. As described, vmemmap is handled as a physical address, although
   it is actually a pointer to struct page in virtual address space.

2. As vmemmap is a virtual address it could have been located
   anywhere in the virtual address space. However, the desired
   necessity to honour MAX_PHYSMEM_BITS limit prevents that.

Rework arch_get_mappable_range() callback in a way it does not
use VMEM_MAX_PHYS macro and does not confuse the notion of virtual
vs physical address spacees as result. That paves the way for moving
vmemmap elsewhere and optimizing the virtual address space layout.

Introduce max_mappable preserved boot variable and let function
setup_kernel_memory_layout() set it up. As result, the rest of the
code is does not need to know the virtual memory layout specifics.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-24 12:12:23 +02:00
Heiko Carstens
cada938a01 s390: fix various typos
Fix various typos found with codespell.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:42 +02:00
Linus Torvalds
10de638d8e s390 updates for the 6.4 merge window
- Add support for stackleak feature. Also allow specifying
   architecture-specific stackleak poison function to enable faster
   implementation. On s390, the mvc-based implementation helps decrease
   typical overhead from a factor of 3 to just 25%
 
 - Convert all assembler files to use SYM* style macros, deprecating the
   ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS
 
 - Improve KASLR to also randomize module and special amode31 code
   base load addresses
 
 - Rework decompressor memory tracking to support memory holes and improve
   error handling
 
 - Add support for protected virtualization AP binding
 
 - Add support for set_direct_map() calls
 
 - Implement set_memory_rox() and noexec module_alloc()
 
 - Remove obsolete overriding of mem*() functions for KASAN
 
 - Rework kexec/kdump to avoid using nodat_stack to call purgatory
 
 - Convert the rest of the s390 code to use flexible-array member instead
   of a zero-length array
 
 - Clean up uaccess inline asm
 
 - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE
 
 - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable
   DEBUG_FORCE_FUNCTION_ALIGN_64B
 
 - Resolve last_break in userspace fault reports
 
 - Simplify one-level sysctl registration
 
 - Clean up branch prediction handling
 
 - Rework CPU counter facility to retrieve available counter sets just
   once
 
 - Other various small fixes and improvements all over the code
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmRM8pwACgkQjYWKoQLX
 FBjV1AgAlvAhu1XkwOdwqdT4GqE8pcN4XXzydog1MYihrSO2PdgWAxpEW7o2QURN
 W+3xa6RIqt7nX2YBiwTanMZ12TYaFY7noGl3eUpD/NhueprweVirVl7VZUEuRoW/
 j0mbx77xsVzLfuDFxkpVwE6/j+tTO78kLyjUHwcN9rFVUaL7/orJneDJf+V8fZG0
 sHLOv0aljF7Jr2IIkw82lCmW/vdk7k0dACWMXK2kj1H3dIK34B9X4AdKDDf/WKXk
 /OSElBeZ93tSGEfNDRIda6iR52xocROaRnQAaDtargKFl9VO0/dN9ADxO+SLNHjN
 pFE/9VD6xT/xo4IuZZh/Z3TcYfiLvA==
 =Geqx
 -----END PGP SIGNATURE-----

Merge tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for stackleak feature. Also allow specifying
   architecture-specific stackleak poison function to enable faster
   implementation. On s390, the mvc-based implementation helps decrease
   typical overhead from a factor of 3 to just 25%

 - Convert all assembler files to use SYM* style macros, deprecating the
   ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS

 - Improve KASLR to also randomize module and special amode31 code base
   load addresses

 - Rework decompressor memory tracking to support memory holes and
   improve error handling

 - Add support for protected virtualization AP binding

 - Add support for set_direct_map() calls

 - Implement set_memory_rox() and noexec module_alloc()

 - Remove obsolete overriding of mem*() functions for KASAN

 - Rework kexec/kdump to avoid using nodat_stack to call purgatory

 - Convert the rest of the s390 code to use flexible-array member
   instead of a zero-length array

 - Clean up uaccess inline asm

 - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE

 - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable
   DEBUG_FORCE_FUNCTION_ALIGN_64B

 - Resolve last_break in userspace fault reports

 - Simplify one-level sysctl registration

 - Clean up branch prediction handling

 - Rework CPU counter facility to retrieve available counter sets just
   once

 - Other various small fixes and improvements all over the code

* tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (118 commits)
  s390/stackleak: provide fast __stackleak_poison() implementation
  stackleak: allow to specify arch specific stackleak poison function
  s390: select ARCH_USE_SYM_ANNOTATIONS
  s390/mm: use VM_FLUSH_RESET_PERMS in module_alloc()
  s390: wire up memfd_secret system call
  s390/mm: enable ARCH_HAS_SET_DIRECT_MAP
  s390/mm: use BIT macro to generate SET_MEMORY bit masks
  s390/relocate_kernel: adjust indentation
  s390/relocate_kernel: use SYM* macros instead of ENTRY(), etc.
  s390/entry: use SYM* macros instead of ENTRY(), etc.
  s390/purgatory: use SYM* macros instead of ENTRY(), etc.
  s390/kprobes: use SYM* macros instead of ENTRY(), etc.
  s390/reipl: use SYM* macros instead of ENTRY(), etc.
  s390/head64: use SYM* macros instead of ENTRY(), etc.
  s390/earlypgm: use SYM* macros instead of ENTRY(), etc.
  s390/mcount: use SYM* macros instead of ENTRY(), etc.
  s390/crc32le: use SYM* macros instead of ENTRY(), etc.
  s390/crc32be: use SYM* macros instead of ENTRY(), etc.
  s390/crypto,chacha: use SYM* macros instead of ENTRY(), etc.
  s390/amode31: use SYM* macros instead of ENTRY(), etc.
  ...
2023-04-30 11:43:31 -07:00
Josh Poimboeuf
9ea7e6b62c init: Mark [arch_call_]rest_init() __noreturn
In preparation for improving objtool's handling of weak noreturn
functions, mark start_kernel(), arch_call_rest_init(), and rest_init()
__noreturn.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/7194ed8a989a85b98d92e62df660f4a90435a723.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:23 +02:00
Heiko Carstens
bb87190c9d s390/kaslr: provide kaslr_enabled() function
Just like other architectures provide a kaslr_enabled() function, instead
of directly accessing a global variable.

Also pass the renamed __kaslr_enabled variable from the decompressor to the
kernel, so that kalsr_enabled() is available there too. This will be used
by a subsequent patch which randomizes the module base load address.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-13 17:36:25 +02:00