2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

203 Commits

Author SHA1 Message Date
Mark Rutland
b5df5b8307 arm64: idle: don't instrument idle code with KCOV
The low-level idle code in arch_cpu_idle() and its callees runs at a
time where where portions of the kernel environment aren't available.
For example, RCU may not be watching, and lockdep state may be
out-of-sync with the hardware. Due to this, it is not sound to
instrument this code.

We generally avoid instrumentation by marking the entry functions as
`noinstr`, but currently this doesn't inhibit KCOV instrumentation.
Prevent this by factoring these functions into a new idle.c so that we
can disable KCOV for the entire compilation unit, as is done for the
core idle code in kernel/sched/idle.c.

We'd like to keep instrumentation of the rest of process.c, and for the
existing code in cpuidle.c, so a new compilation unit is preferable. The
arch_cpu_idle_dead() function in process.c is a cpu hotplug function
that is safe to instrument, so it is left as-is in process.c.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-21-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:56 +01:00
Mark Rutland
33a3581a76 arm64: entry: move arm64_preempt_schedule_irq to entry-common.c
Subsequent patches will pull more of the IRQ entry handling into C. To
keep this in one place, let's move arm64_preempt_schedule_irq() into
entry-common.c along with the other entry management functions.

We no longer need to include <linux/lockdep.h> in process.c, so the
include directive is removed.

There should be no functional change as a result of this patch.

Reviewed-by Joey Gouly <joey.gouly@arm.com>

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Madhavan T. Venkataraman
7d7b720a4b arm64: Implement stack trace termination record
Reliable stacktracing requires that we identify when a stacktrace is
terminated early. We can do this by ensuring all tasks have a final
frame record at a known location on their task stack, and checking
that this is the final frame record in the chain.

We'd like to use task_pt_regs(task)->stackframe as the final frame
record, as this is already setup upon exception entry from EL0. For
kernel tasks we need to consistently reserve the pt_regs and point x29
at this, which we can do with small changes to __primary_switched,
__secondary_switched, and copy_process().

Since the final frame record must be at a specific location, we must
create the final frame record in __primary_switched and
__secondary_switched rather than leaving this to start_kernel and
secondary_start_kernel. Thus, __primary_switched and
__secondary_switched will now show up in stacktraces for the idle tasks.

Since the final frame record is now identified by its location rather
than by its contents, we identify it at the start of unwind_frame(),
before we read any values from it.

External debuggers may terminate the stack trace when FP == 0. In the
pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the
debugger may print an extra record 0x0 at the end. While this is not
pretty, this does not do any harm. This is a small price to pay for
having reliable stack trace termination in the kernel. That said, gdb
does not show the extra record probably because it uses DWARF and not
frame pointers for stack traces.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
[Mark: rebase, use ASM_BUG(), update comments, update commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:53:29 +01:00
Linus Torvalds
51595e3b49 Assorted arm64 fixes and clean-ups, the most important:
- Restore terminal stack frame records. Their previous removal caused
   traces which cross secondary_start_kernel to terminate one entry too
   late, with a spurious "0" entry.
 
 - Fix boot warning with pseudo-NMI due to the way we manipulate the PMR
   register.
 
 - ACPI fixes: avoid corruption of interrupt mappings on watchdog probe
   failure (GTDT), prevent unregistering of GIC SGIs.
 
 - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with having
   to test all the other combinations.
 
 - Documentation fixes and updates: tagged address ABI exceptions on
   brk/mmap/mremap(), event stream frequency, update booting requirements
   on the configuration of traps.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmCVba0ACgkQa9axLQDI
 XvEClxAAsqigp+Mnotdr8YUOuXLjHWU41EMShV6WbFcmlViEyZxxtZ5qavw19T3L
 rPxb8hq9QqI8kCd+j4MAU7cdc0ry+047njJmQ3Va0WeiDsbgEfPvLWPguDbeDFXW
 EjKKib+F/u58IffDkn6rVA7ZVPgYHRH+8yw6EdApp0BN4JuxEFzGBzG4EWKXnNHH
 IOu4IIXlbLX+U1kTtUFR4u6i4uBs2pZdEYzo1NF/Joacg14F01CBRuh8U04eeWFD
 HF4pWd4eCl/bLYPurF1rOi1dIUyrPuaPgNInGEdSaocD0hIvQH0r0wyIt+aMmqvK
 9Jm+dDEGeLxQn2nDrXfyldYG5EbFa3OplkUt2MVDDMWwN2Gpsjlnf/ucff/SBT/N
 7D6AL2OH6KDDCsNgU1JH9H6rAlh4nWJcsMBrWmP7aQtBMRyccQLywrt4HXB8cy7E
 +MyhTit05P3lpsrK2uZSFujK35Ts8hxywA7lAlU7YP4ADKu3Noc6qXSaxZRe+1Gb
 O5k3Qdcih0VLE843PjJj8f8fW1ysJW5J60cK9BaZxpB77gNufKkh/hS6YAiA8qkt
 PT3J0jk/cgGvwKK54rW52dG7qvDImgUMGkXGKQnEimgb62DatCZ4ZOPC+UoiDiqO
 SEd1DSW0Lt1VxVIulAjatVgzIJGM0jGCm9L7/vBguR0+Lahakbg=
 =vYok
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "A mix of fixes and clean-ups that turned up too late for the first
  pull request:

   - Restore terminal stack frame records. Their previous removal caused
     traces which cross secondary_start_kernel to terminate one entry
     too late, with a spurious "0" entry.

   - Fix boot warning with pseudo-NMI due to the way we manipulate the
     PMR register.

   - ACPI fixes: avoid corruption of interrupt mappings on watchdog
     probe failure (GTDT), prevent unregistering of GIC SGIs.

   - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with
     having to test all the other combinations.

   - Documentation fixes and updates: tagged address ABI exceptions on
     brk/mmap/mremap(), event stream frequency, update booting
     requirements on the configuration of traps"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Update the stale comment
  arm64: Fix the documented event stream frequency
  arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
  arm64: Explicitly document boot requirements for SVE
  arm64: Explicitly require that FPSIMD instructions do not trap
  arm64: Relax booting requirements for configuration of traps
  arm64: cpufeatures: use min and max
  arm64: stacktrace: restore terminal records
  arm64/vdso: Discard .note.gnu.property sections in vDSO
  arm64: doc: Add brk/mmap/mremap() to the Tagged Address ABI Exceptions
  psci: Remove unneeded semicolon
  ACPI: irq: Prevent unregistering of GIC SGIs
  ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure
  arm64: Show three registers per line
  arm64: remove HAVE_DEBUG_BUGVERBOSE
  arm64: alternative: simplify passing alt_region
  arm64: Force SPARSEMEM_VMEMMAP as the only memory management model
  arm64: vdso32: drop -no-integrated-as flag
2021-05-07 12:11:05 -07:00
Linus Torvalds
31a24ae89c arm64 updates for 5.13:
- MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not allow
   precise identification of the illegal access.
 
 - Run kernel mode SIMD with softirqs disabled. This allows using NEON in
   softirq context for crypto performance improvements. The conditional
   yield support is modified to take softirqs into account and reduce the
   latency.
 
 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.
 
 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers, new
   functions for the HiSilicon HHA and L3C PMU, cleanups.
 
 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.
 
 - Disable fine-grained traps at boot and improve the documented boot
   requirements.
 
 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).
 
 - Add hierarchical eXecute Never permissions for all page tables.
 
 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.
 
 - arm64 kselftests for BTI and some improvements to the MTE tests.
 
 - Minor improvements to the compat vdso and sigpage.
 
 - Miscellaneous cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmB5xkkACgkQa9axLQDI
 XvEBgRAAsr6r8gsBQJP3FDHmbtbVf2ej5QJTCOAQAGHbTt0JH7Pk03pWSBr7h5nF
 vsddRDxxeDgB6xd7jWP7EvDaPxHeB0CdSj5gG8EP/ZdOm8sFAwB1ZIHWikgUgSwW
 nu6R28yXTMSj+EkyFtahMhTMJ1EMF4sCPuIgAo59ST5w/UMMqLCJByOu4ej6RPKZ
 aeSJJWaDLBmbgnTKWxRvCc/MgIx4J/LAHWGkdpGjuMK6SLp38Kdf86XcrklXtzwf
 K30ZYeoKq8zZ+nFOsK9gBVlOlocZcbS1jEbN842jD6imb6vKLQtBWrKk9A6o4v5E
 XulORWcSBhkZb3ItIU9+6SmelUExf0VeVlSp657QXYPgquoIIGvFl6rCwhrdGMGO
 bi6NZKCfJvcFZJoIN1oyhuHejgZSBnzGEcvhvzNdg7ItvOCed7q3uXcGHz/OI6tL
 2TZKddzHSEMVfTo0D+RUsYfasZHI1qAiQ0mWVC31c+YHuRuW/K/jlc3a5TXlSBUa
 Dwu0/zzMLiqx65ISx9i7XNMrngk55uzrS6MnwSByPoz4M4xsElZxt3cbUxQ8YAQz
 jhxTHs1Pwes8i7f4n61ay/nHCFbmVvN/LlsPRpZdwd8JumThLrDolF3tc6aaY0xO
 hOssKtnGY4Xvh/WitfJ5uvDb1vMObJKTXQEoZEJh4hlNQDxdeUE=
 =6NGI
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
2021-04-26 10:25:03 -07:00
Matthew Wilcox (Oracle)
0bca3ec846 arm64: Show three registers per line
Displaying two registers per line takes 15 lines.  That improves to just
10 lines if we display three registers per line, which reduces the amount
of information lost when oopses are cut off.  It stays within 80 columns
and matches x86-64.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210420172245.3679077-1-willy@infradead.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-23 18:00:51 +01:00
Catalin Marinas
a27a881656 Merge branch 'for-next/pac-set-get-enabled-keys' into for-next/core
* for-next/pac-set-get-enabled-keys:
  : Introduce arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS).
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
2021-04-15 14:00:48 +01:00
Peter Collingbourne
b90e483938 arm64: pac: Optimize kernel entry/exit key installation code paths
The kernel does not use any keys besides IA so we don't need to
install IB/DA/DB/GA on kernel exit if we arrange to install them
on task switch instead, which we can expect to happen an order of
magnitude less often.

Furthermore we can avoid installing the user IA in the case where the
user task has IA disabled and just leave the kernel IA installed. This
also lets us avoid needing to install IA on kernel entry.

On an Apple M1 under a hypervisor, the overhead of kernel entry/exit
has been measured to be reduced by 15.6ns in the case where IA is
enabled, and 31.9ns in the case where IA is disabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ieddf6b580d23c9e0bed45a822dabe72d2ffc9a8e
Link: https://lore.kernel.org/r/2d653d055f38f779937f2b92f8ddd5cf9e4af4f4.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-13 17:31:44 +01:00
Peter Collingbourne
201698626f arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
This change introduces a prctl that allows the user program to control
which PAC keys are enabled in a particular task. The main reason
why this is useful is to enable a userspace ABI that uses PAC to
sign and authenticate function pointers and other pointers exposed
outside of the function, while still allowing binaries conforming
to the ABI to interoperate with legacy binaries that do not sign or
authenticate pointers.

The idea is that a dynamic loader or early startup code would issue
this prctl very early after establishing that a process may load legacy
binaries, but before executing any PAC instructions.

This change adds a small amount of overhead to kernel entry and exit
due to additional required instruction sequences.

On a DragonBoard 845c (Cortex-A75) with the powersave governor, the
overhead of similar instruction sequences was measured as 4.9ns when
simulating the common case where IA is left enabled, or 43.7ns when
simulating the uncommon case where IA is disabled. These numbers can
be seen as the worst case scenario, since in more realistic scenarios
a better performing governor would be used and a newer chip would be
used that would support PAC unlike Cortex-A75 and would be expected
to be faster than Cortex-A75.

On an Apple M1 under a hypervisor, the overhead of the entry/exit
instruction sequences introduced by this patch was measured as 0.3ns
in the case where IA is left enabled, and 33.0ns in the case where
IA is disabled.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://linux-review.googlesource.com/id/Ibc41a5e6a76b275efbaa126b31119dc197b927a5
Link: https://lore.kernel.org/r/d6609065f8f40397a4124654eb68c9f490b4d477.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-13 17:31:44 +01:00
Peter Collingbourne
2f79d2fc39 arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
In an upcoming change we are going to introduce per-task SCTLR_EL1
bits for PAC. Move the existing per-task SCTLR_EL1 field out of the
MTE-specific code so that we will be able to use it from both the
PAC and MTE code paths and make the task switching code more efficient.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ic65fac78a7926168fa68f9e8da591c9e04ff7278
Link: https://lore.kernel.org/r/13d725cb8e741950fb9d6e64b2cd9bd54ff7c3f9.1616123271.git.pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-13 17:31:44 +01:00
Maninder Singh
baa96377bc arm64/process.c: fix Wmissing-prototypes build warnings
Fix GCC warnings reported when building with "-Wmissing-prototypes":

  arch/arm64/kernel/process.c:261:6: warning: no previous prototype for '__show_regs' [-Wmissing-prototypes]
      261 | void __show_regs(struct pt_regs *regs)
          |      ^~~~~~~~~~~
  arch/arm64/kernel/process.c:307:6: warning: no previous prototype for '__show_regs_alloc_free' [-Wmissing-prototypes]
      307 | void __show_regs_alloc_free(struct pt_regs *regs)
          |      ^~~~~~~~~~~~~~~~~~~~~~
  arch/arm64/kernel/process.c:365:5: warning: no previous prototype for 'arch_dup_task_struct' [-Wmissing-prototypes]
      365 | int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
          |     ^~~~~~~~~~~~~~~~~~~~
  arch/arm64/kernel/process.c:546:41: warning: no previous prototype for '__switch_to' [-Wmissing-prototypes]
      546 | __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
          |                                         ^~~~~~~~~~~
  arch/arm64/kernel/process.c:710:25: warning: no previous prototype for 'arm64_preempt_schedule_irq' [-Wmissing-prototypes]
      710 | asmlinkage void __sched arm64_preempt_schedule_irq(void)
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~

Link: https://lore.kernel.org/lkml/202103192250.AennsfXM-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Link: https://lore.kernel.org/r/1616568899-986-1-git-send-email-maninder1.s@samsung.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 09:50:16 +00:00
Hector Martin
f0098155d3 arm64: Always keep DAIF.[IF] in sync
Apple SoCs (A11 and newer) have some interrupt sources hardwired to the
FIQ line. We implement support for this by simply treating IRQs and FIQs
the same way in the interrupt vectors.

To support these systems, the FIQ mask bit needs to be kept in sync with
the IRQ mask bit, so both kinds of exceptions are masked together. No
other platforms should be delivering FIQ exceptions right now, and we
already unmask FIQ in normal process context, so this should not have an
effect on other systems - if spurious FIQs were arriving, they would
already panic the kernel.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Hector Martin <marcan@marcan.st>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210315115629.57191-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-03-24 20:19:30 +00:00
Jens Axboe
4727dc20e0 arch: setup PF_IO_WORKER threads like PF_KTHREAD
PF_IO_WORKER are kernel threads too, but they aren't PF_KTHREAD in the
sense that we don't assign ->set_child_tid with our own structure. Just
ensure that every arch sets up the PF_IO_WORKER threads like kthreads
in the arch implementation of copy_thread().

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-21 17:25:22 -07:00
Zhiyuan Dai
d9f1b52afa arm64: improve whitespace
In a few places we don't have whitespace between macro parameters,
which makes them hard to read. This patch adds whitespace to clearly
separate the parameters.

In a few places we have unnecessary whitespace around unary operators,
which is confusing, This patch removes the unnecessary whitespace.

Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Link: https://lore.kernel.org/r/1612403029-5011-1-git-send-email-daizhiyuan@phytium.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-04 13:59:49 +00:00
Linus Torvalds
0ca2ce81eb arm64 updates for 5.11:
- Expose tag address bits in siginfo. The original arm64 ABI did not
   expose any of the bits 63:56 of a tagged address in siginfo. In the
   presence of user ASAN or MTE, this information may be useful. The
   implementation is generic to other architectures supporting tags (like
   SPARC ADI, subject to wiring up the arch code). The user will have to
   opt in via sigaction(SA_EXPOSE_TAGBITS) so that the extra bits, if
   available, become visible in si_addr.
 
 - Default to 32-bit wide ZONE_DMA. Previously, ZONE_DMA was set to the
   lowest 1GB to cope with the Raspberry Pi 4 limitations, to the
   detriment of other platforms. With these changes, the kernel scans the
   Device Tree dma-ranges and the ACPI IORT information before deciding
   on a smaller ZONE_DMA.
 
 - Strengthen READ_ONCE() to acquire when CONFIG_LTO=y. When building
   with LTO, there is an increased risk of the compiler converting an
   address dependency headed by a READ_ONCE() invocation into a control
   dependency and consequently allowing for harmful reordering by the
   CPU.
 
 - Add CPPC FFH support using arm64 AMU counters.
 
 - set_fs() removal on arm64. This renders the User Access Override (UAO)
   ARMv8 feature unnecessary.
 
 - Perf updates: PMU driver for the ARM DMC-620 memory controller, sysfs
   identifier file for SMMUv3, stop event counters support for i.MX8MP,
   enable the perf events-based hard lockup detector.
 
 - Reorganise the kernel VA space slightly so that 52-bit VA
   configurations can use more virtual address space.
 
 - Improve the robustness of the arm64 memory offline event notifier.
 
 - Pad the Image header to 64K following the EFI header definition
   updated recently to increase the section alignment to 64K.
 
 - Support CONFIG_CMDLINE_EXTEND on arm64.
 
 - Do not use tagged PC in the kernel (TCR_EL1.TBID1==1), freeing up 8
   bits for PtrAuth.
 
 - Switch to vmapped shadow call stacks.
 
 - Miscellaneous clean-ups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl/XcSgACgkQa9axLQDI
 XvGkwg//SLknimELD/cphf2UzZm5RFuCU0x1UnIXs9XYo5BrOpgVLLA//+XkCrKN
 0GLAdtBDfw1axWJudzgMBiHrv6wSGh4p3YWjLIW06u/PJu3m3U8oiiolvvF8d7Yq
 UKDseKGQnQkrl97J0SyA+Da/u8D11GEzp52SWL5iRxzt6vInEC27iTOp9n1yoaoP
 f3y7qdp9kv831ryUM3rXFYpc8YuMWXk+JpBSNaxqmjlvjMzipA5PhzBLmNzfc657
 XcrRX5qsgjEeJW8UUnWUVNB42j7tVzN77yraoUpoVVCzZZeWOQxqq5EscKPfIhRt
 AjtSIQNOs95ZVE0SFCTjXnUUb823coUs4dMCdftqlE62JNRwdR+3bkfa+QjPTg1F
 O9ohW1AzX0/JB19QBxMaOgbheB8GFXh3DVJ6pizTgxJgyPvQQtFuEhT1kq8Cst0U
 Pe+pEWsg9t41bUXNz+/l9tUWKWpeCfFNMTrBXLmXrNlTLeOvDh/0UiF0+2lYJYgf
 YAboibQ5eOv2wGCcSDEbNMJ6B2/6GtubDJxH4du680F6Emb6pCSw0ntPwB7mSGLG
 5dXz+9FJxDLjmxw7BXxQgc5MoYIrt5JQtaOQ6UxU8dPy53/+py4Ck6tXNkz0+Ap7
 gPPaGGy1GqobQFu3qlHtOK1VleQi/sWcrpmPHrpiiFUf6N7EmcY=
 =zXFk
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Expose tag address bits in siginfo. The original arm64 ABI did not
   expose any of the bits 63:56 of a tagged address in siginfo. In the
   presence of user ASAN or MTE, this information may be useful. The
   implementation is generic to other architectures supporting tags
   (like SPARC ADI, subject to wiring up the arch code). The user will
   have to opt in via sigaction(SA_EXPOSE_TAGBITS) so that the extra
   bits, if available, become visible in si_addr.

 - Default to 32-bit wide ZONE_DMA. Previously, ZONE_DMA was set to the
   lowest 1GB to cope with the Raspberry Pi 4 limitations, to the
   detriment of other platforms. With these changes, the kernel scans
   the Device Tree dma-ranges and the ACPI IORT information before
   deciding on a smaller ZONE_DMA.

 - Strengthen READ_ONCE() to acquire when CONFIG_LTO=y. When building
   with LTO, there is an increased risk of the compiler converting an
   address dependency headed by a READ_ONCE() invocation into a control
   dependency and consequently allowing for harmful reordering by the
   CPU.

 - Add CPPC FFH support using arm64 AMU counters.

 - set_fs() removal on arm64. This renders the User Access Override
   (UAO) ARMv8 feature unnecessary.

 - Perf updates: PMU driver for the ARM DMC-620 memory controller, sysfs
   identifier file for SMMUv3, stop event counters support for i.MX8MP,
   enable the perf events-based hard lockup detector.

 - Reorganise the kernel VA space slightly so that 52-bit VA
   configurations can use more virtual address space.

 - Improve the robustness of the arm64 memory offline event notifier.

 - Pad the Image header to 64K following the EFI header definition
   updated recently to increase the section alignment to 64K.

 - Support CONFIG_CMDLINE_EXTEND on arm64.

 - Do not use tagged PC in the kernel (TCR_EL1.TBID1==1), freeing up 8
   bits for PtrAuth.

 - Switch to vmapped shadow call stacks.

 - Miscellaneous clean-ups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
  perf/imx_ddr: Add system PMU identifier for userspace
  bindings: perf: imx-ddr: add compatible string
  arm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled
  arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE
  arm64: mark __system_matches_cap as __maybe_unused
  arm64: uaccess: remove vestigal UAO support
  arm64: uaccess: remove redundant PAN toggling
  arm64: uaccess: remove addr_limit_user_check()
  arm64: uaccess: remove set_fs()
  arm64: uaccess cleanup macro naming
  arm64: uaccess: split user/kernel routines
  arm64: uaccess: refactor __{get,put}_user
  arm64: uaccess: simplify __copy_user_flushcache()
  arm64: uaccess: rename privileged uaccess routines
  arm64: sdei: explicitly simulate PAN/UAO entry
  arm64: sdei: move uaccess logic to arch/arm64/
  arm64: head.S: always initialize PSTATE
  arm64: head.S: cleanup SCTLR_ELx initialization
  arm64: head.S: rename el2_setup -> init_kernel_el
  arm64: add C wrappers for SET_PSTATE_*()
  ...
2020-12-14 16:24:30 -08:00
Catalin Marinas
d889797530 Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/core
* arm64/for-next/fixes: (26 commits)
  arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE
  arm64: mte: Fix typo in macro definition
  arm64: entry: fix EL1 debug transitions
  arm64: entry: fix NMI {user, kernel}->kernel transitions
  arm64: entry: fix non-NMI kernel<->kernel transitions
  arm64: ptrace: prepare for EL1 irq/rcu tracking
  arm64: entry: fix non-NMI user<->kernel transitions
  arm64: entry: move el1 irq/nmi logic to C
  arm64: entry: prepare ret_to_user for function call
  arm64: entry: move enter_from_user_mode to entry-common.c
  arm64: entry: mark entry code as noinstr
  arm64: mark idle code as noinstr
  arm64: syscall: exit userspace before unmasking exceptions
  arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()
  arm64: pgtable: Fix pte_accessible()
  ACPI/IORT: Fix doc warnings in iort.c
  arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build
  arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver
  arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list
  arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist
  ...

# Conflicts:
#	arch/arm64/include/asm/exception.h
#	arch/arm64/kernel/sdei.c
2020-12-09 18:04:55 +00:00
Linus Torvalds
3bb61aa618 arm64 fixes for -rc7
- Fix numerous issues with instrumentation and exception entry
 
 - Fix hideous typo in unused register field definition
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/HvbwQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFTMB/oD0ucfP6CH65w+7Sbsv2L8FABfYzSrA9gP
 f1cmeh1+MyRN4Nbx2ves5wcRGoX1CgZ8KFAmLXG6yyn7UDA/q27CTELknwobhOft
 tQIPB2hFDW9qq3VBXFReL3aoXLnWUiRL3nBxQFt7LG1Xor/ivEb1ZFht351UklDh
 u1P6NVptpjXFuGPvdqxkHo2WzT0QHI57MRuc1l7I1FRo4dV1nKSlwohu0Ydii4q9
 8oLhx77Ga1SWK80IztNmpo7CSMP/FLGDwbUE3vAaftUJx5CBt+lYR1CeWNACSEvy
 22y7CkJWKGQccG62oHI7zQaZm1+fum70ndP5dDlfQW/BcCaz8vRH
 =KSDc
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "I'm sad to say that we've got an unusually large arm64 fixes pull for
  rc7 which addresses numerous significant instrumentation issues with
  our entry code.

  Without these patches, lockdep is hopelessly unreliable in some
  configurations [1,2] and syzkaller is therefore not a lot of use
  because it's so noisy.

  Although much of this has always been broken, it appears to have been
  exposed more readily by other changes such as 044d0d6de9 ("lockdep:
  Only trace IRQ edges") and general lockdep improvements around IRQ
  tracing and NMIs.

  Fixing this properly required moving much of the instrumentation hooks
  from our entry assembly into C, which Mark has been working on for the
  last few weeks. We're not quite ready to move to the recently added
  generic functions yet, but the code here has been deliberately written
  to mimic that closely so we can look at cleaning things up once we
  have a bit more breathing room.

  Having said all that, the second version of these patches was posted
  last week and I pushed it into our CI (kernelci and cki) along with a
  commit which forced on PROVE_LOCKING, NOHZ_FULL and
  CONTEXT_TRACKING_FORCE. The result? We found a real bug in the
  md/raid10 code [3].

  Oh, and there's also a really silly typo patch that's unrelated.

  Summary:

   - Fix numerous issues with instrumentation and exception entry

   - Fix hideous typo in unused register field definition"

[1] https://lore.kernel.org/r/CACT4Y+aAzoJ48Mh1wNYD17pJqyEcDnrxGfApir=-j171TnQXhw@mail.gmail.com
[2] https://lore.kernel.org/r/20201119193819.GA2601289@elver.google.com
[3] https://lore.kernel.org/r/94c76d5e-466a-bc5f-e6c2-a11b65c39f83@redhat.com

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mte: Fix typo in macro definition
  arm64: entry: fix EL1 debug transitions
  arm64: entry: fix NMI {user, kernel}->kernel transitions
  arm64: entry: fix non-NMI kernel<->kernel transitions
  arm64: ptrace: prepare for EL1 irq/rcu tracking
  arm64: entry: fix non-NMI user<->kernel transitions
  arm64: entry: move el1 irq/nmi logic to C
  arm64: entry: prepare ret_to_user for function call
  arm64: entry: move enter_from_user_mode to entry-common.c
  arm64: entry: mark entry code as noinstr
  arm64: mark idle code as noinstr
  arm64: syscall: exit userspace before unmasking exceptions
2020-12-02 12:27:37 -08:00
Mark Rutland
3d2403fd10 arm64: uaccess: remove set_fs()
Now that the uaccess primitives dont take addr_limit into account, we
have no need to manipulate this via set_fs() and get_fs(). Remove
support for these, along with some infrastructure this renders
redundant.

We no longer need to flip UAO to access kernel memory under KERNEL_DS,
and head.S unconditionally clears UAO for all kernel configurations via
an ERET in init_kernel_el. Thus, we don't need to dynamically flip UAO,
nor do we need to context-switch it. However, we still need to adjust
PAN during SDEI entry.

Masking of __user pointers no longer needs to use the dynamic value of
addr_limit, and can use a constant derived from the maximum possible
userspace task size. A new TASK_SIZE_MAX constant is introduced for
this, which is also used by core code. In configurations supporting
52-bit VAs, this may include a region of unusable VA space above a
48-bit TTBR0 limit, but never includes any portion of TTBR1.

Note that TASK_SIZE_MAX is an exclusive limit, while USER_DS and
KERNEL_DS were inclusive limits, and is converted to a mask by
subtracting one.

As the SDEI entry code repurposes the otherwise unnecessary
pt_regs::orig_addr_limit field to store the TTBR1 of the interrupted
context, for now we rename that to pt_regs::sdei_ttbr1. In future we can
consider factoring that out.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-02 19:49:11 +00:00
Mark Rutland
f80d034086 arm64: ensure ERET from kthread is illegal
For consistency, all tasks have a pt_regs reserved at the highest
portion of their task stack. Among other things, this ensures that a
task's SP is always pointing within its stack rather than pointing
immediately past the end.

While it is never legitimate to ERET from a kthread, we take pains to
initialize pt_regs for kthreads as if this were legitimate. As this is
never legitimate, the effects of an erroneous return are rarely tested.

Let's simplify things by initializing a kthread's pt_regs such that an
ERET is caught as an illegal exception return, and removing the explicit
initialization of other exception context. Note that as
spectre_v4_enable_task_mitigation() only manipulates the PSTATE within
the unused regs this is safe to remove.

As user tasks will have their exception context initialized via
start_thread() or start_compat_thread(), this should only impact cases
where something has gone very wrong and we'd like that to be clearly
indicated.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201113124937.20574-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-02 19:44:02 +00:00
Mark Rutland
114e0a6847 arm64: mark idle code as noinstr
Core code disables RCU when calling arch_cpu_idle(), so it's not safe
for arch_cpu_idle() or its calees to be instrumented, as the
instrumentation callbacks may attempt to use RCU or other features which
are unsafe to use in this context.

Mark them noinstr to prevent issues.

The use of local_irq_enable() in arch_cpu_idle() is similarly
problematic, and the "sched/idle: Fix arch_cpu_idle() vs tracing" patch
queued in the tip tree addresses that case.

Reported-by: Marco Elver <elver@google.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30 12:11:37 +00:00
Linus Torvalds
f91a3aa6bc Yet two more places which invoke tracing from RCU disabled regions in the
idle path. Similar to the entry path the low level idle functions have to
 be non-instrumentable.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/DpAUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoXSLD/9klc0YimnEnROW6Q5Svb2IcyIutmXF
 bOIY1bYYoKILOBj3wyvDUhmdMuq5zh7H9yG11hO8MaVVWVQcLcOMLdHTYm9dcdmF
 xQk33+xqjuhRShB+nEmC9ayYtWogtH6W6uZ6WDtF9ZltMKU85n5ddGJ/Fvo+HoCb
 NbOdHGJdJ3/3ZCeHnxOnxM+5/GwjkBuccTV/tXmb3yXrfU9DBySyQ4/UchcpF43w
 LcEb0kiQbpZsBTByKJOQV8+RR654S0sILlvRwVXpmj94vrgGwhlVk1/9rz7tkOhF
 ksoo1mTVu75LMt22G/hXxE63787yRvFdHjapf0+kCOAuhl992NK+xlGDH8o9DXcu
 9y73D4bI0HnDFs20w6vs20iLvxECJiYHJqlgR5ZwFUToceaNgtiYr8kzuD7Zbae1
 KG2E7BuNSwHWMtf97fGn44GZknPEOaKdDn4Wv6/bvKHxLm77qe11RKF70Stcz2AI
 am13KmQzzsHGF5qNWwpElRUxSdxfJMR66RnOdTQULGrRedaZTFol/y2pnVzTSe3k
 SZnlpL5kE7y92UYDogPb5wWA7b+YkJN0OdSkRFy1FH26ZG8E4M7ZJ2tql5Sw7pGM
 lsTjXpAUphnK5rz7QcYE8KAZWj//fIAcElIrvdklVcBnS3IqjfksYW27B64133vx
 cT1B/lA1PHXj6Q==
 =raED
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "Two more places which invoke tracing from RCU disabled regions in the
  idle path.

  Similar to the entry path the low level idle functions have to be
  non-instrumentable"

* tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  intel_idle: Fix intel_idle() vs tracing
  sched/idle: Fix arch_cpu_idle() vs tracing
2020-11-29 11:19:26 -08:00
Peter Zijlstra
58c644ba51 sched/idle: Fix arch_cpu_idle() vs tracing
We call arch_cpu_idle() with RCU disabled, but then use
local_irq_{en,dis}able(), which invokes tracing, which relies on RCU.

Switch all arch_cpu_idle() implementations to use
raw_local_irq_{en,dis}able() and carefully manage the
lockdep,rcu,tracing state like we do in entry.

(XXX: we really should change arch_cpu_idle() to not return with
interrupts enabled)

Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/20201120114925.594122626@infradead.org
2020-11-24 16:47:35 +01:00
Will Deacon
f969f03888 arm64: errata: Fix handling of 1418040 with late CPU onlining
In a surprising turn of events, it transpires that CPU capabilities
configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
result of late-onlining. Therefore our handling of erratum 1418040 does
not get activated if it is not required by any of the boot CPUs, even
though we allow late-onlining of an affected CPU.

In order to get things working again, replace the cpus_have_const_cap()
invocation with an explicit check for the current CPU using
this_cpu_has_cap().

Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201106114952.10032-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-10 13:09:39 +00:00
Will Deacon
baab853229 Merge branch 'for-next/mte' into for-next/core
Add userspace support for the Memory Tagging Extension introduced by
Armv8.5.

(Catalin Marinas and others)
* for-next/mte: (30 commits)
  arm64: mte: Fix typo in memory tagging ABI documentation
  arm64: mte: Add Memory Tagging Extension documentation
  arm64: mte: Kconfig entry
  arm64: mte: Save tags when hibernating
  arm64: mte: Enable swap of tagged pages
  mm: Add arch hooks for saving/restoring tags
  fs: Handle intra-page faults in copy_mount_options()
  arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
  arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
  arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
  arm64: mte: Restore the GCR_EL1 register after a suspend
  arm64: mte: Allow user control of the generated random tags via prctl()
  arm64: mte: Allow user control of the tag check mode via prctl()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Introduce arch_validate_flags()
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_calc_vm_flag_bits()
  arm64: mte: Tags-aware aware memcmp_pages() implementation
  arm64: Avoid unnecessary clear_user_page() indirection
  ...
2020-10-02 12:16:11 +01:00
Will Deacon
780c083a8f arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
The PR_SPEC_DISABLE_NOEXEC option to the PR_SPEC_STORE_BYPASS prctl()
allows the SSB mitigation to be enabled only until the next execve(),
at which point the state will revert back to PR_SPEC_ENABLE and the
mitigation will be disabled.

Add support for PR_SPEC_DISABLE_NOEXEC on arm64.

Reported-by: Anthony Steinhauser <asteinhauser@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Will Deacon
c28762070c arm64: Rewrite Spectre-v4 mitigation code
Rewrite the Spectre-v4 mitigation handling code to follow the same
approach as that taken by Spectre-v2.

For now, report to KVM that the system is vulnerable (by forcing
'ssbd_state' to ARM64_SSBD_UNKNOWN), as this will be cleared up in
subsequent steps.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Catalin Marinas
93f067f6ca arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
In preparation for ptrace() access to the prctl() value, allow calling
these functions on non-current tasks.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:07 +01:00
Catalin Marinas
af5ce95282 arm64: mte: Allow user control of the generated random tags via prctl()
The IRG, ADDG and SUBG instructions insert a random tag in the resulting
address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap
when, for example, the user wants a certain colour for freed buffers.
Since the GCR_EL1 register is not accessible at EL0, extend the
prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in
the first argument for controlling which tags can be generated by the
above instruction (an include rather than exclude mask). Note that by
default all non-zero tags are excluded. This setting is per-thread.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:07 +01:00
Catalin Marinas
1c101da8b9 arm64: mte: Allow user control of the tag check mode via prctl()
By default, even if PROT_MTE is set on a memory range, there is no tag
check fault reporting (SIGSEGV). Introduce a set of option to the
exiting prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
check fault mode:

  PR_MTE_TCF_NONE  - no reporting (default)
  PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
  PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting

These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
context-switched by the kernel. Note that the kernel accesses to the
user address space (e.g. read() system call) are not checked if the user
thread tag checking mode is PR_MTE_TCF_NONE or PR_MTE_TCF_ASYNC. If the
tag checking mode is PR_MTE_TCF_SYNC, the kernel makes a best effort to
check its user address accesses, however it cannot always guarantee it.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:07 +01:00
Vincenzo Frascino
637ec831ea arm64: mte: Handle synchronous and asynchronous tag check faults
The Memory Tagging Extension has two modes of notifying a tag check
fault at EL0, configurable through the SCTLR_EL1.TCF0 field:

1. Synchronous raising of a Data Abort exception with DFSC 17.
2. Asynchronous setting of a cumulative bit in TFSRE0_EL1.

Add the exception handler for the synchronous exception and handling of
the asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag in
do_notify_resume().

On a tag check failure in user-space, whether synchronous or
asynchronous, a SIGSEGV will be raised on the faulting thread.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:06 +01:00
Peter Zijlstra
9864f5b594 cpuidle: Move trace_cpu_idle() into generic code
Remove trace_cpu_idle() from the arch_cpu_idle() implementations and
put it in the generic code, right before disabling RCU. Gets rid of
more trace_*_rcuidle() users.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.428433395@infradead.org
2020-08-26 12:41:54 +02:00
Marc Zyngier
d49f7d7376 arm64: Move handling of erratum 1418040 into C code
Instead of dealing with erratum 1418040 on each entry and exit,
let's move the handling to __switch_to() instead, which has
several advantages:

- It can be applied when it matters (switching between 32 and 64
  bit tasks).
- It is written in C (yay!)
- It can rely on static keys rather than alternatives

Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200731173824.107480-2-maz@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-08-21 11:39:56 +01:00
Christian Brauner
714acdbd1c
arch: rename copy_thread_tls() back to copy_thread()
Now that HAVE_COPY_THREAD_TLS has been removed, rename copy_thread_tls()
back simply copy_thread(). It's a simpler name, and doesn't imply that only
tls is copied here. This finishes an outstanding chunk of internal process
creation work since we've added clone3().

Cc: linux-arch@vger.kernel.org
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>A
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>A
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-04 23:41:37 +02:00
Dmitry Safonov
c76898373f arm64: add loglvl to dump_backtrace()
Currently, the log-level of show_stack() depends on a platform
realization.  It creates situations where the headers are printed with
lower log level or higher than the stacktrace (depending on a platform or
user).

Furthermore, it forces the logic decision from user to an architecture
side.  In result, some users as sysrq/kdb/etc are doing tricks with
temporary rising console_loglevel while printing their messages.  And in
result it not only may print unwanted messages from other CPUs, but also
omit printing at all in the unlucky case where the printk() was deferred.

Introducing log-level parameter and KERN_UNSUPPRESSED [1] seems an easier
approach than introducing more printk buffers.  Also, it will consolidate
printings with headers.

Add log level argument to dump_backtrace() as a preparation for
introducing show_stack_loglvl().

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/T/#u

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200418201944.482088-10-dima@arista.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:11 -07:00
Will Deacon
80e4e56132 Merge branch 'for-next/bti-user' into for-next/bti
Merge in user support for Branch Target Identification, which narrowly
missed the cut for 5.7 after a late ABI concern.

* for-next/bti-user:
  arm64: bti: Document behaviour for dynamically linked binaries
  arm64: elf: Fix allnoconfig kernel build with !ARCH_USE_GNU_PROPERTY
  arm64: BTI: Add Kconfig entry for userspace BTI
  mm: smaps: Report arm64 guarded pages in smaps
  arm64: mm: Display guarded pages in ptdump
  KVM: arm64: BTI: Reset BTYPE when skipping emulated instructions
  arm64: BTI: Reset BTYPE when skipping emulated instructions
  arm64: traps: Shuffle code to eliminate forward declarations
  arm64: unify native/compat instruction skipping
  arm64: BTI: Decode BYTPE bits when printing PSTATE
  arm64: elf: Enable BTI at exec based on ELF program properties
  elf: Allow arch to tweak initial mmap prot flags
  arm64: Basic Branch Target Identification support
  ELF: Add ELF program property parsing support
  ELF: UAPI and Kconfig additions for ELF program properties
2020-05-05 15:15:58 +01:00
Linus Torvalds
3cd86a58f7 arm64 updates for 5.7:
- In-kernel Pointer Authentication support (previously only offered to
   user space).
 
 - ARM Activity Monitors (AMU) extension support allowing better CPU
   utilisation numbers for the scheduler (frequency invariance).
 
 - Memory hot-remove support for arm64.
 
 - Lots of asm annotations (SYM_*) in preparation for the in-kernel
   Branch Target Identification (BTI) support.
 
 - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the PMU
   init callbacks, support for new DT compatibles.
 
 - IPv6 header checksum optimisation.
 
 - Fixes: SDEI (software delegated exception interface) double-lock on
   hibernate with shared events.
 
 - Minor clean-ups and refactoring: cpu_ops accessor, cpu_do_switch_mm()
   converted to C, cpufeature finalisation helper.
 
 - sys_mremap() comment explaining the asymmetric address untagging
   behaviour.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl6DVyIACgkQa9axLQDI
 XvHkqRAAiZA2EYKiQL4M1DJ1cNTADjT7xKX9+UtYBXj7GMVhgVWdunpHVE6qtfgk
 cT6avmKrS/6PDqizJgr+Z1yX8x3Kvs57G4BvmIUKIw97mkdewvFQ9JKv6VA1vb86
 7Qrl1WzqsGg5Kj9uUfI4h+ZoT1H4C/9PQeFxJwgZRtF9DxRh8O7VeZI+JCu8Aub2
 lIkjI8rh+EpTsGT9h/PMGWUcawnKQloZ1/F+GfMAuYBvIv2RNN2xVreJtTmm4NyJ
 VcpL0KCNyAI2lGdaJg5nBLRDyGuXDm5i+PLsCSXMquI4fie00txXeD8sjbeuO0ks
 YTJ0EhmUUhbSE17go+SxYiEFE0v09i+lD5ud+B4Vmojp0KTczTta9VSgURlbb2/9
 n9biq5G3PPDNIrZqiTT2Tf4AMz1350nkbzL2gzKecM5aIzR/u3y5yII5CgfZtFnj
 7bGbyFpFpcqI7UaISPsNCxmknbTt/7ff0WM3+7SbecxI3AD2mnxsOdN9JTLyhDp+
 owjyiaWxl5zMWF9DhplLG/9BKpNWSxh3skazdOdELd8GTq2MbJlXrVG2XgXTAOh3
 y1s6RQrfw8zXh8TSqdmmzauComXIRWTum/sbVB3U8Z3AUsIeq/NTSbN5X9JyIbOP
 HOabhlVhhkI6omN1grqPX4jwUiZLZoNfn7Ez4q71549KVK/uBtA=
 =LJVX
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "The bulk is in-kernel pointer authentication, activity monitors and
  lots of asm symbol annotations. I also queued the sys_mremap() patch
  commenting the asymmetry in the address untagging.

  Summary:

   - In-kernel Pointer Authentication support (previously only offered
     to user space).

   - ARM Activity Monitors (AMU) extension support allowing better CPU
     utilisation numbers for the scheduler (frequency invariance).

   - Memory hot-remove support for arm64.

   - Lots of asm annotations (SYM_*) in preparation for the in-kernel
     Branch Target Identification (BTI) support.

   - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the
     PMU init callbacks, support for new DT compatibles.

   - IPv6 header checksum optimisation.

   - Fixes: SDEI (software delegated exception interface) double-lock on
     hibernate with shared events.

   - Minor clean-ups and refactoring: cpu_ops accessor,
     cpu_do_switch_mm() converted to C, cpufeature finalisation helper.

   - sys_mremap() comment explaining the asymmetric address untagging
     behaviour"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
  mm/mremap: Add comment explaining the untagging behaviour of mremap()
  arm64: head: Convert install_el2_stub to SYM_INNER_LABEL
  arm64: Introduce get_cpu_ops() helper function
  arm64: Rename cpu_read_ops() to init_cpu_ops()
  arm64: Declare ACPI parking protocol CPU operation if needed
  arm64: move kimage_vaddr to .rodata
  arm64: use mov_q instead of literal ldr
  arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH
  lkdtm: arm64: test kernel pointer authentication
  arm64: compile the kernel with ptrauth return address signing
  kconfig: Add support for 'as-option'
  arm64: suspend: restore the kernel ptrauth keys
  arm64: __show_regs: strip PAC from lr in printk
  arm64: unwind: strip PAC from kernel addresses
  arm64: mask PAC bits of __builtin_return_address
  arm64: initialize ptrauth keys for kernel booting task
  arm64: initialize and switch ptrauth kernel keys
  arm64: enable ptrauth earlier
  arm64: cpufeature: handle conflicts based on capability
  arm64: cpufeature: Move cpu capability helpers inside C file
  ...
2020-03-31 10:05:01 -07:00
Qais Yousef
5efbe6a6e1 arm64: Use reboot_cpu instead of hardconding it to 0
Use `reboot_cpu` variable instead of hardcoding 0 as the reboot cpu in
machine_shutdown().

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200323135110.30522-8-qais.yousef@arm.com
2020-03-25 12:59:33 +01:00
Qais Yousef
d66b16f5df arm64: Don't use disable_nonboot_cpus()
disable_nonboot_cpus() is not safe to use when doing machine_down(),
because it relies on freeze_secondary_cpus() which in turn is
a suspend/resume related freeze and could abort if the logic detects any
pending activities that can prevent finishing the offlining process.

Beside disable_nonboot_cpus() is dependent on CONFIG_PM_SLEEP_SMP which
is an othogonal config to rely on to ensure this function works
correctly.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200323135110.30522-7-qais.yousef@arm.com
2020-03-25 12:59:33 +01:00
Mark Brown
5d1b631c77 arm64: bti: Document behaviour for dynamically linked binaries
For dynamically linked binaries the interpreter is responsible for setting
PROT_BTI on everything except itself. The dynamic linker needs to be aware
of PROT_BTI, for example in order to avoid dropping that when marking
executable pages read only after doing relocations, and doing everything
in userspace ensures that we don't get any issues due to divergences in
behaviour between the kernel and dynamic linker within a single executable.
Add a comment indicating that this is intentional to the code to help
people trying to understand what's going on.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-25 09:56:10 +00:00
Amit Daniel Kachhap
cdcb61ae4c arm64: __show_regs: strip PAC from lr in printk
lr is printed with %pS which will try to find an entry in kallsyms.
After enabling pointer authentication, this match will fail due to
PAC present in the lr.

Strip PAC from the lr to display the correct symbol name.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:20 +00:00
Kristina Martsenko
33e4523498 arm64: initialize and switch ptrauth kernel keys
Set up keys to use pointer authentication within the kernel. The kernel
will be compiled with APIAKey instructions, the other keys are currently
unused. Each task is given its own APIAKey, which is initialized during
fork. The key is changed during context switch and on kernel entry from
EL0.

The keys for idle threads need to be set before calling any C functions,
because it is not possible to enter and exit a function with different
keys.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[Amit: Modified secondary cores key structure, comments]
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:20 +00:00
Kristina Martsenko
be12984256 arm64: install user ptrauth keys at kernel exit time
As we're going to enable pointer auth within the kernel and use a
different APIAKey for the kernel itself, so move the user APIAKey
switch to EL0 exception return.

The other 4 keys could remain switched during task switch, but are also
moved to keep things consistent.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[Amit: commit msg, re-positioned the patch, comments]
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:19 +00:00
Dave Martin
ec94a46ee7 arm64: BTI: Decode BYTPE bits when printing PSTATE
The current code to print PSTATE symbolically when generating
backtraces etc., does not include the BYTPE field used by Branch
Target Identification.

So, decode BYTPE and print it too.

In the interests of human-readability, print the classes of BTI
matched.  The symbolic notation, BYTPE (PSTATE[11:10]) and
permitted classes of subsequent instruction are:

    -- (BTYPE=0b00): any insn
    jc (BTYPE=0b01): BTI jc, BTI j, BTI c, PACIxSP
    -c (BYTPE=0b10): BTI jc, BTI c, PACIxSP
    j- (BTYPE=0b11): BTI jc, BTI j

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-16 17:19:49 +00:00
Dave Martin
ab7876a98a arm64: elf: Enable BTI at exec based on ELF program properties
For BTI protection to be as comprehensive as possible, it is
desirable to have BTI enabled from process startup.  If this is not
done, the process must use mprotect() to enable BTI for each of its
executable mappings, but this is painful to do in the libc startup
code.  It's simpler and more sound to have the kernel do it
instead.

To this end, detect BTI support in the executable (or ELF
interpreter, as appropriate), via the
NT_GNU_PROGRAM_PROPERTY_TYPE_0 note, and tweak the initial prot
flags for the process' executable pages to include PROT_BTI as
appropriate.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-16 17:19:48 +00:00
Will Deacon
fca3d33d8a arm64: ssbs: Fix context-switch when SSBS is present on all CPUs
When all CPUs in the system implement the SSBS extension, the SSBS field
in PSTATE is the definitive indication of the mitigation state. Further,
when the CPUs implement the SSBS manipulation instructions (advertised
to userspace via an HWCAP), EL0 can toggle the SSBS field directly and
so we cannot rely on any shadow state such as TIF_SSBD at all.

Avoid forcing the SSBS field in context-switch on such a system, and
simply rely on the PSTATE register instead.

Cc: <stable@vger.kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Srinivas Ramana <sramana@codeaurora.org>
Fixes: cbdf8a189a ("arm64: Force SSBS on context switch")
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-02-10 11:29:02 +00:00
Matteo Croce
2c614c1194 arm64: use shared sysctl constants
Use shared sysctl variables for zero and one constants, as in
commit eec4844fae ("proc/sysctl: add shared variables for range check")

Fixes: 63f0c60379 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-02-10 11:24:57 +00:00
Linus Torvalds
0238d3c753 arm64 updates for 5.6
- New architecture features
 	* Support for Armv8.5 E0PD, which benefits KASLR in the same way as
 	  KPTI but without the overhead. This allows KPTI to be disabled on
 	  CPUs that are not affected by Meltdown, even is KASLR is enabled.
 
 	* Initial support for the Armv8.5 RNG instructions, which claim to
 	  provide access to a high bandwidth, cryptographically secure hardware
 	  random number generator. As well as exposing these to userspace, we
 	  also use them as part of the KASLR seed and to seed the crng once
 	  all CPUs have come online.
 
 	* Advertise a bunch of new instructions to userspace, including support
 	  for Data Gathering Hint, Matrix Multiply and 16-bit floating point.
 
 - Kexec
 	* Cleanups in preparation for relocating with the MMU enabled
 	* Support for loading crash dump kernels with kexec_file_load()
 
 - Perf and PMU drivers
 	* Cleanups and non-critical fixes for a couple of system PMU drivers
 
 - FPU-less (aka broken) CPU support
 	* Considerable fixes to support CPUs without the FP/SIMD extensions,
 	  including their presence in heterogeneous systems. Good luck finding
 	  a 64-bit userspace that handles this.
 
 - Modern assembly function annotations
 	* Start migrating our use of ENTRY() and ENDPROC() over to the
 	  new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to
 	  aid debuggers
 
 - Kbuild
 	* Cleanup detection of LSE support in the assembler by introducing
 	  'as-instr'
 
 	* Remove compressed Image files when building clean targets
 
 - IP checksumming
 	* Implement optimised IPv4 checksumming routine when hardware offload
 	  is not in use. An IPv6 version is in the works, pending testing.
 
 - Hardware errata
 	* Work around Cortex-A55 erratum #1530923
 
 - Shadow call stack
 	* Work around some issues with Clang's integrated assembler not liking
 	  our perfectly reasonable assembly code
 
 	* Avoid allocating the X18 register, so that it can be used to hold the
 	  shadow call stack pointer in future
 
 - ACPI
 	* Fix ID count checking in IORT code. This may regress broken firmware
 	  that happened to work with the old implementation, in which case we'll
 	  have to revert it and try something else
 
 	* Fix DAIF corruption on return from GHES handler with pseudo-NMIs
 
 - Miscellaneous
 	* Whitelist some CPUs that are unaffected by Spectre-v2
 
 	* Reduce frequency of ASID rollover when KPTI is compiled in but
 	  inactive
 
 	* Reserve a couple of arch-specific PROT flags that are already used by
 	  Sparc and PowerPC and are planned for later use with BTI on arm64
 
 	* Preparatory cleanup of our entry assembly code in preparation for
 	  moving more of it into C later on
 
 	* Refactoring and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl4oY+IQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNNfRB/4p3vax0hqaOnLRvmJPRXF31B8oPlivnr2u
 6HCA9LkdU5IlrgaTNOJ/sQEqJAPOPCU7v49Ol0iYw0iKL1suUE7Ikui5VB6Uybqt
 YbfF5UNzfXAMs2A86TF/hzqhxw+W+lpnZX8NVTuQeAODfHEGUB1HhTLfRi9INsER
 wKEAuoZyuSUibxTFvji+DAq7nVRniXX7CM7tE385pxDisCMuu/7E5wOl+3EZYXWz
 DTGzTbHXuVFL+UFCANFEUlAtmr3dQvPFIqAwVl/CxjRJjJ7a+/G3cYLsHFPrQCjj
 qYX4kfhAeeBtqmHL7YFNWFwFs5WaT5UcQquFO665/+uCTWSJpORY
 =AIh/
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The changes are a real mixed bag this time around.

  The only scary looking one from the diffstat is the uapi change to
  asm-generic/mman-common.h, but this has been acked by Arnd and is
  actually just adding a pair of comments in an attempt to prevent
  allocation of some PROT values which tend to get used for
  arch-specific purposes. We'll be using them for Branch Target
  Identification (a CFI-like hardening feature), which is currently
  under review on the mailing list.

  New architecture features:

   - Support for Armv8.5 E0PD, which benefits KASLR in the same way as
     KPTI but without the overhead. This allows KPTI to be disabled on
     CPUs that are not affected by Meltdown, even is KASLR is enabled.

   - Initial support for the Armv8.5 RNG instructions, which claim to
     provide access to a high bandwidth, cryptographically secure
     hardware random number generator. As well as exposing these to
     userspace, we also use them as part of the KASLR seed and to seed
     the crng once all CPUs have come online.

   - Advertise a bunch of new instructions to userspace, including
     support for Data Gathering Hint, Matrix Multiply and 16-bit
     floating point.

  Kexec:

   - Cleanups in preparation for relocating with the MMU enabled

   - Support for loading crash dump kernels with kexec_file_load()

  Perf and PMU drivers:

   - Cleanups and non-critical fixes for a couple of system PMU drivers

  FPU-less (aka broken) CPU support:

   - Considerable fixes to support CPUs without the FP/SIMD extensions,
     including their presence in heterogeneous systems. Good luck
     finding a 64-bit userspace that handles this.

  Modern assembly function annotations:

   - Start migrating our use of ENTRY() and ENDPROC() over to the
     new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended
     to aid debuggers

  Kbuild:

   - Cleanup detection of LSE support in the assembler by introducing
     'as-instr'

   - Remove compressed Image files when building clean targets

  IP checksumming:

   - Implement optimised IPv4 checksumming routine when hardware offload
     is not in use. An IPv6 version is in the works, pending testing.

  Hardware errata:

   - Work around Cortex-A55 erratum #1530923

  Shadow call stack:

   - Work around some issues with Clang's integrated assembler not
     liking our perfectly reasonable assembly code

   - Avoid allocating the X18 register, so that it can be used to hold
     the shadow call stack pointer in future

  ACPI:

   - Fix ID count checking in IORT code. This may regress broken
     firmware that happened to work with the old implementation, in
     which case we'll have to revert it and try something else

   - Fix DAIF corruption on return from GHES handler with pseudo-NMIs

  Miscellaneous:

   - Whitelist some CPUs that are unaffected by Spectre-v2

   - Reduce frequency of ASID rollover when KPTI is compiled in but
     inactive

   - Reserve a couple of arch-specific PROT flags that are already used
     by Sparc and PowerPC and are planned for later use with BTI on
     arm64

   - Preparatory cleanup of our entry assembly code in preparation for
     moving more of it into C later on

   - Refactoring and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (73 commits)
  arm64: acpi: fix DAIF manipulation with pNMI
  arm64: kconfig: Fix alignment of E0PD help text
  arm64: Use v8.5-RNG entropy for KASLR seed
  arm64: Implement archrandom.h for ARMv8.5-RNG
  arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'
  arm64: entry: Avoid empty alternatives entries
  arm64: Kconfig: select HAVE_FUTEX_CMPXCHG
  arm64: csum: Fix pathological zero-length calls
  arm64: entry: cleanup sp_el0 manipulation
  arm64: entry: cleanup el0 svc handler naming
  arm64: entry: mark all entry code as notrace
  arm64: assembler: remove smp_dmb macro
  arm64: assembler: remove inherit_daif macro
  ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()
  mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use
  arm64: Use macros instead of hard-coded constants for MAIR_EL1
  arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list
  arm64: kernel: avoid x18 in __cpu_soft_restart
  arm64: kvm: stop treating register x18 as caller save
  arm64/lib: copy_page: avoid x18 register in assembler code
  ...
2020-01-27 08:58:19 -08:00
Suzuki K Poulose
b51c6ac220 arm64: Introduce system_capabilities_finalized() marker
We finalize the system wide capabilities after the SMP CPUs
are booted by the kernel. This is used as a marker for deciding
various checks in the kernel. e.g, sanity check the hotplugged
CPUs for missing mandatory features.

However there is no explicit helper available for this in the
kernel. There is sys_caps_initialised, which is not exposed.
The other closest one we have is the jump_label arm64_const_caps_ready
which denotes that the capabilities are set and the capability checks
could use the individual jump_labels for fast path. This is
performed before setting the ELF Hwcaps, which must be checked
against the new CPUs. We also perform some of the other initialization
e.g, SVE setup, which is important for the use of FP/SIMD
where SVE is supported. Normally userspace doesn't get to run
before we finish this. However the in-kernel users may
potentially start using the neon mode. So, we need to
reject uses of neon mode before we are set. Instead of defining
a new marker for the completion of SVE setup, we could simply
reuse the arm64_const_caps_ready and enable it once we have
finished all the setup. Also we could expose this to the
various users as "system_capabilities_finalized()" to make
it more meaningful than "const_caps_ready".

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-14 17:10:21 +00:00
Amanieu d'Antras
a4376f2fbc
arm64: Implement copy_thread_tls
This is required for clone3 which passes the TLS value through a
struct rather than a register.

Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: <stable@vger.kernel.org> # 5.3.x
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200102172413.654385-3-amanieu@gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-01-07 13:31:01 +01:00
Julien Thierry
19c95f261c arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled
Preempting from IRQ-return means that the task has its PSTATE saved
on the stack, which will get restored when the task is resumed and does
the actual IRQ return.

However, enabling some CPU features requires modifying the PSTATE. This
means that, if a task was scheduled out during an IRQ-return before all
CPU features are enabled, the task might restore a PSTATE that does not
include the feature enablement changes once scheduled back in.

* Task 1:

PAN == 0 ---|                          |---------------
            |                          |<- return from IRQ, PSTATE.PAN = 0
            | <- IRQ                   |
            +--------+ <- preempt()  +--
                                     ^
                                     |
                                     reschedule Task 1, PSTATE.PAN == 1
* Init:
        --------------------+------------------------
                            ^
                            |
                            enable_cpu_features
                            set PSTATE.PAN on all CPUs

Worse than this, since PSTATE is untouched when task switching is done,
a task missing the new bits in PSTATE might affect another task, if both
do direct calls to schedule() (outside of IRQ/exception contexts).

Fix this by preventing preemption on IRQ-return until features are
enabled on all CPUs.

This way the only PSTATE values that are saved on the stack are from
synchronous exceptions. These are expected to be fatal this early, the
exception is BRK for WARN_ON(), but as this uses do_debug_exception()
which keeps IRQs masked, it shouldn't call schedule().

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
[james: Replaced a really cool hack, with an even simpler static key in C.
 expanded commit message with Julien's cover-letter ascii art]
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-16 09:51:43 -07:00
Masayoshi Mizuma
4585fc59c0 arm64/sve: Fix wrong free for task->thread.sve_state
The system which has SVE feature crashed because of
the memory pointed by task->thread.sve_state was destroyed
by someone.

That is because sve_state is freed while the forking the
child process. The child process has the pointer of sve_state
which is same as the parent's because the child's task_struct
is copied from the parent's one. If the copy_process()
fails as an error on somewhere, for example, copy_creds(),
then the sve_state is freed even if the parent is alive.
The flow is as follows.

copy_process
        p = dup_task_struct
            => arch_dup_task_struct
                *dst = *src;  // copy the entire region.
:
        retval = copy_creds
        if (retval < 0)
                goto bad_fork_free;
:
bad_fork_free:
...
        delayed_free_task(p);
          => free_task
             => arch_release_task_struct
                => fpsimd_release_task
                   => __sve_free
                      => kfree(task->thread.sve_state);
                         // free the parent's sve_state

Move child's sve_state = NULL and clearing TIF_SVE flag
to arch_dup_task_struct() so that the child doesn't free the
parent's one.
There is no need to wait until copy_process() to clear TIF_SVE for
dst, because the thread flags for dst are initialized already by
copying the src task_struct.
This change simplifies the code, so get rid of comments that are no
longer needed.

As a note, arm64 used to have thread_info on the stack. So it
would not be possible to clear TIF_SVE until the stack is initialized.
From commit c02433dd6d ("arm64: split thread_info from task stack"),
the thread_info is part of the task, so it should be valid to modify
the flag from arch_dup_task_struct().

Cc: stable@vger.kernel.org # 4.15.x-
Fixes: bc0ee47603 ("arm64/sve: Core task context handling")
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reported-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Suggested-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-01 13:30:52 +01:00
Alexandre Ghiti
e7142bf5d2 arm64, mm: make randomization selected by generic topdown mmap layout
This commits selects ARCH_HAS_ELF_RANDOMIZE when an arch uses the generic
topdown mmap layout functions so that this security feature is on by
default.

Note that this commit also removes the possibility for arm64 to have elf
randomization and no MMU: without MMU, the security added by randomization
is worth nothing.

Link: http://lkml.kernel.org/r/20190730055113.23635-6-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Catalin Marinas
413235fced arm64: Change the tagged_addr sysctl control semantics to only prevent the opt-in
First rename the sysctl control to abi.tagged_addr_disabled and make it
default off (zero). When abi.tagged_addr_disabled == 1, only block the
enabling of the TBI ABI via prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE).
Getting the status of the ABI or disabling it is still allowed.

Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-20 18:18:10 +01:00
Catalin Marinas
63f0c60379 arm64: Introduce prctl() options to control the tagged user addresses ABI
It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.

The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-06 18:08:45 +01:00
Marc Zyngier
cbdf8a189a arm64: Force SSBS on context switch
On a CPU that doesn't support SSBS, PSTATE[12] is RES0.  In a system
where only some of the CPUs implement SSBS, we end-up losing track of
the SSBS bit across task migration.

To address this issue, let's force the SSBS bit on context switch.

Fixes: 8f04e8e6e2 ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[will: inverted logic and added comments]
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 15:24:16 +01:00
Dave Martin
f3dcbe67ed arm64: stacktrace: Factor out backtrace initialisation
Some common code is required by each stacktrace user to initialise
struct stackframe before the first call to unwind_frame().

In preparation for adding to the common code, this patch factors it
out into a separate function start_backtrace(), and modifies the
stacktrace callers appropriately.

No functional change.

Signed-off-by: Dave Martin <dave.martin@arm.com>
[Mark: drop tsk argument, update more callsites]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 11:44:08 +01:00
Linus Torvalds
dfd437a257 arm64 updates for 5.3:
- arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}
 
 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly
 
 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)
 
 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG
   and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)
 
 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)
 
 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic
 
 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms
 
 - perf: DDR performance monitor support for iMX8QXP
 
 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers
 
 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent
 
 - arm64 do_page_fault() and hugetlb cleanups
 
 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)
 
 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags'
   introduced in 5.1)
 
 - CONFIG_RANDOMIZE_BASE now enabled in defconfig
 
 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area
 
 - Make ZONE_DMA32 configurable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl0eHqcACgkQa9axLQDI
 XvFyNA/+L+bnkz8m3ncydlqqfXomQn4eJJVQ8Uksb0knJz+1+3CUxxbO4ry4jXZN
 fMkbggYrDPRKpDbsUl0lsRipj7jW9bqan+N37c3SWqCkgb6HqDaHViwxdx6Ec/Uk
 gHudozDSPh/8c7hxGcSyt/CFyuW6b+8eYIQU5rtIgz8aVY2BypBvS/7YtYCbIkx0
 w4CFleRTK1zXD5mJQhrc6jyDx659sVkrAvdhf6YIymOY8nBTv40vwdNo3beJMYp8
 Po/+0Ixu+VkHUNtmYYZQgP/AGH96xiTcRnUqd172JdtRPpCLqnLqwFokXeVIlUKT
 KZFMDPzK+756Ayn4z4huEePPAOGlHbJje8JVNnFyreKhVVcCotW7YPY/oJR10bnc
 eo7yD+DxABTn+93G2yP436bNVa8qO1UqjOBfInWBtnNFJfANIkZweij/MQ6MjaTA
 o7KtviHnZFClefMPoiI7HDzwL8XSmsBDbeQ04s2Wxku1Y2xUHLx4iLmadwLQ1ZPb
 lZMTZP3N/T1554MoURVA1afCjAwiqU3bt1xDUGjbBVjLfSPBAn/25IacsG9Li9AF
 7Rp1M9VhrfLftjFFkB2HwpbhRASOxaOSx+EI3kzEfCtM2O9I1WHgP3rvCdc3l0HU
 tbK0/IggQicNgz7GSZ8xDlWPwwSadXYGLys+xlMZEYd3pDIOiFc=
 =0TDT
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}

 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly

 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)

 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new
   XAFLAG and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)

 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)

 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic

 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms

 - perf: DDR performance monitor support for iMX8QXP

 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers

 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent

 - arm64 do_page_fault() and hugetlb cleanups

 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)

 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the
   'arm_boot_flags' introduced in 5.1)

 - CONFIG_RANDOMIZE_BASE now enabled in defconfig

 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area

 - Make ZONE_DMA32 configurable

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
  perf: arm_spe: Enable ACPI/Platform automatic module loading
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  x86/entry: Simplify _TIF_SYSCALL_EMU handling
  arm64: rename dump_instr as dump_kernel_instr
  arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
  arm64: Implement panic_smp_self_stop()
  arm64: Improve parking of stopped CPUs
  arm64: Expose FRINT capabilities to userspace
  arm64: Expose ARMv8.5 CondM capability to userspace
  arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
  arm64: ARM64_MODULES_PLTS must depend on MODULES
  arm64: bpf: do not allocate executable memory
  arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
  arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
  arm64: module: create module allocations without exec permissions
  arm64: Allow user selection of ARM64_MODULE_PLTS
  acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
  arm64: Allow selecting Pseudo-NMI again
  ...
2019-07-08 09:54:55 -07:00
Julien Thierry
bd82d4bd21 arm64: Fix incorrect irqflag restore for priority masking
When using IRQ priority masking to disable interrupts, in order to deal
with the PSR.I state, local_irq_save() would convert the I bit into a
PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore()
potentially modifying the value of PMR in undesired location due to the
state of PSR.I upon flag saving [1].

In an attempt to solve this issue in a less hackish manner, introduce
a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent
whether PSR.I is being used to disable interrupts, in which case it
takes precedence of the status of interrupt masking via PMR.

GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> |
GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as
some sections (e.g. arch_cpu_idle(), interrupt acknowledge path)
requires PMR not to mask interrupts that could be signaled to the
CPU when using only PSR.I.

[1] https://www.spinics.net/lists/arm-kernel/msg716956.html

Fixes: 4a503217ce ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
Cc: <stable@vger.kernel.org> # 5.1.x-
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-21 15:50:10 +01:00
Thomas Gleixner
caab277b1d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:07 +02:00
Julien Thierry
a9806aa259 arm64: Unmask PMR before going idle
CPU does not received signals for interrupts with a priority masked by
ICC_PMR_EL1. This means the CPU might not come back from a WFI
instruction.

Make sure ICC_PMR_EL1 does not mask interrupts when doing a WFI.

Since the logic of cpu_do_idle is becoming a bit more complex than just
two instructions, lets turn it from ASM to C.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-06 10:05:18 +00:00
Julien Thierry
133d051863 arm64: Make PMR part of task context
In order to replace PSR.I interrupt disabling/enabling with ICC_PMR_EL1
interrupt masking, ICC_PMR_EL1 needs to be saved/restored when
taking/returning from an exception. This mimics the way hardware saves
and restores PSR.I bit in spsr_el1 for exceptions and ERET.

Add PMR to the registers to save in the pt_regs struct upon kernel entry,
and restore it before ERET. Also, initialize it to a sane value when
creating new tasks.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-06 10:05:18 +00:00
Linus Torvalds
495d714ad1 Tracing changes for v4.21:
- Rework of the kprobe/uprobe and synthetic events to consolidate all
    the dynamic event code. This will make changes in the future easier.
 
  - Partial rewrite of the function graph tracing infrastructure.
    This will allow for multiple users of hooking onto functions
    to get the callback (return) of the function. This is the ground
    work for having kprobes and function graph tracer using one code base.
 
  - Clean up of the histogram code that will facilitate adding more
    features to the histograms in the future.
 
  - Addition of str_has_prefix() and a few use cases. There currently
    is a similar function strstart() that is used in a few places, but
    only returns a bool and not a length. These instances will be
    removed in the future to use str_has_prefix() instead.
 
  - A few other various clean ups as well.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXCawlBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qhbcAQCFeT0fWWTUxofBQz5jqsHaRnVg21+9
 X4sTldYRYEn4YgEAmWOyiwq7zvrsAu4ZwkNBMeqxn3tVymYHiGOGe3Y4BAw=
 =u96o
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - Rework of the kprobe/uprobe and synthetic events to consolidate all
   the dynamic event code. This will make changes in the future easier.

 - Partial rewrite of the function graph tracing infrastructure. This
   will allow for multiple users of hooking onto functions to get the
   callback (return) of the function. This is the ground work for having
   kprobes and function graph tracer using one code base.

 - Clean up of the histogram code that will facilitate adding more
   features to the histograms in the future.

 - Addition of str_has_prefix() and a few use cases. There currently is
   a similar function strstart() that is used in a few places, but only
   returns a bool and not a length. These instances will be removed in
   the future to use str_has_prefix() instead.

 - A few other various clean ups as well.

* tag 'trace-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits)
  tracing: Use the return of str_has_prefix() to remove open coded numbers
  tracing: Have the historgram use the result of str_has_prefix() for len of prefix
  tracing: Use str_has_prefix() instead of using fixed sizes
  tracing: Use str_has_prefix() helper for histogram code
  string.h: Add str_has_prefix() helper function
  tracing: Make function ‘ftrace_exports’ static
  tracing: Simplify printf'ing in seq_print_sym
  tracing: Avoid -Wformat-nonliteral warning
  tracing: Merge seq_print_sym_short() and seq_print_sym_offset()
  tracing: Add hist trigger comments for variable-related fields
  tracing: Remove hist trigger synth_var_refs
  tracing: Use hist trigger's var_ref array to destroy var_refs
  tracing: Remove open-coding of hist trigger var_ref management
  tracing: Use var_refs[] for hist trigger reference checking
  tracing: Change strlen to sizeof for hist trigger static strings
  tracing: Remove unnecessary hist trigger struct field
  tracing: Fix ftrace_graph_get_ret_stack() to use task and not current
  seq_buf: Use size_t for len in seq_buf_puts()
  seq_buf: Make seq_buf_puts() null-terminate the buffer
  arm64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack
  ...
2018-12-31 11:46:59 -08:00
Steven Rostedt (VMware)
a448276ce5 arm64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack
The structure of the ret_stack array on the task struct is going to
change, and accessing it directly via the curr_ret_stack index will no
longer give the ret_stack entry that holds the return address. To access
that, architectures must now use ftrace_graph_get_ret_stack() to get the
associated ret_stack that matches the saved return address.

Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-22 08:21:02 -05:00
Mark Rutland
7503197562 arm64: add basic pointer authentication support
This patch adds basic support for pointer authentication, allowing
userspace to make use of APIAKey, APIBKey, APDAKey, APDBKey, and
APGAKey. The kernel maintains key values for each process (shared by all
threads within), which are initialised to random values at exec() time.

The ID_AA64ISAR1_EL1.{APA,API,GPA,GPI} fields are exposed to userspace,
to describe that pointer authentication instructions are available and
that the kernel is managing the keys. Two new hwcaps are added for the
same reason: PACA (for address authentication) and PACG (for generic
authentication).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Tested-by: Adam Wallis <awallis@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[will: Fix sizeof() usage and unroll address key initialisation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Ard Biesheuvel
0a1213fa74 arm64: enable per-task stack canaries
This enables the use of per-task stack canary values if GCC has
support for emitting the stack canary reference relative to the
value of sp_el0, which holds the task struct pointer in the arm64
kernel.

The $(eval) extends KBUILD_CFLAGS at the moment the make rule is
applied, which means asm-offsets.o (which we rely on for the offset
value) is built without the arguments, and everything built afterwards
has the options set.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 18:45:31 +00:00
Linus Torvalds
2d6bb6adb7 New gcc plugin: stackleak
- Introduces the stackleak gcc plugin ported from grsecurity by Alexander
   Popov, with x86 and arm64 support.
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlvQvn4WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJpSfD/sErFreuPT1beSw994Lr9Zx4k9v
 ERsuXxWBENaJOJXbOOHMfVEcEeG/1uhPSp7hlw/dpHfh0anATTrcYqm8RNKbfK+k
 o06+JK14OJfpm5Ghq/7OizhdNLCMT8wMU3XZtWfy65VSJGjEFx8Y48vMeQtpWtUK
 ylSzi9JV6j2iUBF9oibtiT53+yqsqAtX80X1G7HRCgv9kxuKMhZr+Q5oGV6+ViyQ
 Azj8mNn06iRnhHKd17WxDJr0GjSibzz4weS/9XgP3t3EcNWJo1EgBlD2KV3tOfP5
 nzmqfqTqrcjxs/tyjdh6vVCSlYucNtyCQGn63qyShQYSg6mZwclR2fY8YSTw6PWw
 GfYWFOWru9z+qyQmwFkQ9bSQS2R+JIT0oBCj9VmtF9XmPCy7K2neJsQclzSPBiCW
 wPgXVQS4IA4684O5CmDOVMwmDpGvhdBNUR6cqSzGLxQOHY1csyXubMNUsqU3g9xk
 Ob4pEy/xrrIw4WpwHcLHSEW5gV1/OLhsT0fGRJJiC947L3cN5s9EZp7FLbIS0zlk
 qzaXUcLmn6AgcfkYwg5cI3RMLaN2V0eDCMVTWZJ1wbrmUV9chAaOnTPTjNqLOTht
 v3b1TTxXG4iCpMmOFf59F8pqgAwbBDlfyNSbySZ/Pq5QH69udz3Z9pIUlYQnSJHk
 u6q++2ReDpJXF81rBw==
 =Ks6B
 -----END PGP SIGNATURE-----

Merge tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull stackleak gcc plugin from Kees Cook:
 "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin
  was ported from grsecurity by Alexander Popov. It provides efficient
  stack content poisoning at syscall exit. This creates a defense
  against at least two classes of flaws:

   - Uninitialized stack usage. (We continue to work on improving the
     compiler to do this in other ways: e.g. unconditional zero init was
     proposed to GCC and Clang, and more plugin work has started too).

   - Stack content exposure. By greatly reducing the lifetime of valid
     stack contents, exposures via either direct read bugs or unknown
     cache side-channels become much more difficult to exploit. This
     complements the existing buddy and heap poisoning options, but
     provides the coverage for stacks.

  The x86 hooks are included in this series (which have been reviewed by
  Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already
  been merged through the arm64 tree (written by Laura Abbott and
  reviewed by Mark Rutland and Will Deacon).

  With VLAs having been removed this release, there is no need for
  alloca() protection, so it has been removed from the plugin"

* tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  arm64: Drop unneeded stackleak_check_alloca()
  stackleak: Allow runtime disabling of kernel stack erasing
  doc: self-protection: Add information about STACKLEAK feature
  fs/proc: Show STACKLEAK metrics in the /proc file system
  lkdtm: Add a test for STACKLEAK
  gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
2018-11-01 11:46:27 -07:00
Will Deacon
8f04e8e6e2 arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3
On CPUs with support for PSTATE.SSBS, the kernel can toggle the SSBD
state without needing to call into firmware.

This patch hooks into the existing SSBD infrastructure so that SSBS is
used on CPUs that support it, but it's all made horribly complicated by
the very real possibility of big/little systems that don't uniformly
provide the new capability.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-14 17:46:19 +01:00
Alexander Popov
6fcde90466 arm64: Drop unneeded stackleak_check_alloca()
Drop stackleak_check_alloca() for arm64 since the STACKLEAK gcc plugin now
doesn't track stack depth overflow caused by alloca().

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-09-04 10:35:48 -07:00
Laura Abbott
0b3e336601 arm64: Add support for STACKLEAK gcc plugin
This adds support for the STACKLEAK gcc plugin to arm64 by implementing
stackleak_check_alloca(), based heavily on the x86 version, and adding the
two helpers used by the stackleak common code: current_top_of_stack() and
on_thread_stack(). The stack erasure calls are made at syscall returns.
Additionally, this disables the plugin in hypervisor and EFI stub code,
which are out of scope for the protection.

Acked-by: Alexander Popov <alex.popov@linux.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-26 11:36:34 +01:00
Mark Rutland
d64567f678 arm64: use PSR_AA32 definitions
Some code cares about the SPSR_ELx format for exceptions taken from
AArch32 to inspect or manipulate the SPSR_ELx value, which is already in
the SPSR_ELx format, and not in the AArch32 PSR format.

To separate these from cases where we care about the AArch32 PSR format,
migrate these cases to use the PSR_AA32_* definitions rather than
COMPAT_PSR_*.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-05 17:24:14 +01:00
Linus Torvalds
050e9baa9d Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables
The changes to automatically test for working stack protector compiler
support in the Kconfig files removed the special STACKPROTECTOR_AUTO
option that picked the strongest stack protector that the compiler
supported.

That was all a nice cleanup - it makes no sense to have the AUTO case
now that the Kconfig phase can just determine the compiler support
directly.

HOWEVER.

It also meant that doing "make oldconfig" would now _disable_ the strong
stackprotector if you had AUTO enabled, because in a legacy config file,
the sane stack protector configuration would look like

  CONFIG_HAVE_CC_STACKPROTECTOR=y
  # CONFIG_CC_STACKPROTECTOR_NONE is not set
  # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
  # CONFIG_CC_STACKPROTECTOR_STRONG is not set
  CONFIG_CC_STACKPROTECTOR_AUTO=y

and when you ran this through "make oldconfig" with the Kbuild changes,
it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
used to be disabled (because it was really enabled by AUTO), and would
disable it in the new config, resulting in:

  CONFIG_HAVE_CC_STACKPROTECTOR=y
  CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
  CONFIG_CC_STACKPROTECTOR=y
  # CONFIG_CC_STACKPROTECTOR_STRONG is not set
  CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

That's dangerously subtle - people could suddenly find themselves with
the weaker stack protector setup without even realizing.

The solution here is to just rename not just the old RECULAR stack
protector option, but also the strong one.  This does that by just
removing the CC_ prefix entirely for the user choices, because it really
is not about the compiler support (the compiler support now instead
automatially impacts _visibility_ of the options to users).

This results in "make oldconfig" actually asking the user for their
choice, so that we don't have any silent subtle security model changes.
The end result would generally look like this:

  CONFIG_HAVE_CC_STACKPROTECTOR=y
  CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
  CONFIG_STACKPROTECTOR=y
  CONFIG_STACKPROTECTOR_STRONG=y
  CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

where the "CC_" versions really are about internal compiler
infrastructure, not the user selections.

Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-14 12:21:18 +09:00
Dave Martin
65896545b6 arm64: uaccess: Fix omissions from usercopy whitelist
When the hardend usercopy support was added for arm64, it was
concluded that all cases of usercopy into and out of thread_struct
were statically sized and so didn't require explicit whitelisting
of the appropriate fields in thread_struct.

Testing with usercopy hardening enabled has revealed that this is
not the case for certain ptrace regset manipulation calls on arm64.
This occurs because the sizes of usercopies associated with the
regset API are dynamic by construction, and because arm64 does not
always stage such copies via the stack: indeed the regset API is
designed to avoid the need for that by adding some bounds checking.

This is currently believed to affect only the fpsimd and TLS
registers.

Because the whitelisted fields in thread_struct must be contiguous,
this patch groups them together in a nested struct.  It is also
necessary to be able to determine the location and size of that
struct, so rather than making the struct anonymous (which would
save on edits elsewhere) or adding an anonymous union containing
named and unnamed instances of the same struct (gross), this patch
gives the struct a name and makes the necessary edits to code that
references it (noisy but simple).

Care is needed to ensure that the new struct does not contain
padding (which the usercopy hardening would fail to protect).

For this reason, the presence of tp2_value is made unconditional,
since a padding field would be needed there in any case.  This pads
up to the 16-byte alignment required by struct user_fpsimd_state.

Acked-by: Kees Cook <keescook@chromium.org>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9e8084d3f7 ("arm64: Implement thread_struct whitelist for hardened usercopy")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-28 15:25:44 +01:00
Will Deacon
a06f818a70 arm64: __show_regs: Only resolve kernel symbols when running at EL1
__show_regs pretty prints PC and LR by attempting to map them to kernel
function names to improve the utility of crash reports. Unfortunately,
this mapping is applied even when the pt_regs corresponds to user mode,
resulting in a KASLR oracle.

Avoid this issue by only looking up the function symbols when the register
state indicates that we're actually running at EL1.

Cc: <stable@vger.kernel.org>
Reported-by: NCSC Security <security@ncsc.gov.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-19 17:07:12 +00:00
Linus Torvalds
ab486bc9a5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:

 - Add a console_msg_format command line option:

     The value "default" keeps the old "[time stamp] text\n" format. The
     value "syslog" allows to see the syslog-like "<log
     level>[timestamp] text" format.

     This feature was requested by people doing regression tests, for
     example, 0day robot. They want to have both filtered and full logs
     at hands.

 - Reduce the risk of softlockup:

     Pass the console owner in a busy loop.

     This is a new approach to the old problem. It was first proposed by
     Steven Rostedt on Kernel Summit 2017. It marks a context in which
     the console_lock owner calls console drivers and could not sleep.
     On the other side, printk() callers could detect this state and use
     a busy wait instead of a simple console_trylock(). Finally, the
     console_lock owner checks if there is a busy waiter at the end of
     the special context and eventually passes the console_lock to the
     waiter.

     The hand-off works surprisingly well and helps in many situations.
     Well, there is still a possibility of the softlockup, for example,
     when the flood of messages stops and the last owner still has too
     much to flush.

     There is increasing number of people having problems with
     printk-related softlockups. We might eventually need to get better
     solution. Anyway, this looks like a good start and promising
     direction.

 - Do not allow to schedule in console_unlock() called from printk():

     This reverts an older controversial commit. The reschedule helped
     to avoid softlockups. But it also slowed down the console output.
     This patch is obsoleted by the new console waiter logic described
     above. In fact, the reschedule made the hand-off less effective.

 - Deprecate "%pf" and "%pF" format specifier:

     It was needed on ia64, ppc64 and parisc64 to dereference function
     descriptors and show the real function address. It is done
     transparently by "%ps" and "pS" format specifier now.

     Sergey Senozhatsky found that all the function descriptors were in
     a special elf section and could be easily detected.

 - Remove printk_symbol() API:

     It has been obsoleted by "%pS" format specifier, and this change
     helped to remove few continuous lines and a less intuitive old API.

 - Remove redundant memsets:

     Sergey removed unnecessary memset when processing printk.devkmsg
     command line option.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
  printk: drop redundant devkmsg_log_str memsets
  printk: Never set console_may_schedule in console_trylock()
  printk: Hide console waiter logic into helpers
  printk: Add console owner and waiter logic to load balance console writes
  kallsyms: remove print_symbol() function
  checkpatch: add pF/pf deprecation warning
  symbol lookup: introduce dereference_symbol_descriptor()
  parisc64: Add .opd based function descriptor dereference
  powerpc64: Add .opd based function descriptor dereference
  ia64: Add .opd based function descriptor dereference
  sections: split dereference_function_descriptor()
  openrisc: Fix conflicting types for _exext and _stext
  lib: do not use print_symbol()
  irq debug: do not use print_symbol()
  sysfs: do not use print_symbol()
  drivers: do not use print_symbol()
  x86: do not use print_symbol()
  unicore32: do not use print_symbol()
  sh: do not use print_symbol()
  mn10300: do not use print_symbol()
  ...
2018-02-01 13:36:15 -08:00
Sergey Senozhatsky
4ef7963843 arm64: do not use print_symbol()
print_symbol() is a very old API that has been obsoleted by %pS format
specifier in a normal printk() call.

Replace print_symbol() with a direct printk("%pS") call.

Link: http://lkml.kernel.org/r/20171211125025.2270-3-sergey.senozhatsky@gmail.com
To: Andrew Morton <akpm@linux-foundation.org>
To: Russell King <linux@armlinux.org.uk>
To: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Salter <msalter@redhat.com>
To: Tony Luck <tony.luck@intel.com>
To: David Howells <dhowells@redhat.com>
To: Yoshinori Sato <ysato@users.sourceforge.jp>
To: Guan Xuetao <gxt@mprc.pku.edu.cn>
To: Borislav Petkov <bp@alien8.de>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Thomas Gleixner <tglx@linutronix.de>
To: Peter Zijlstra <peterz@infradead.org>
To: Vineet Gupta <vgupta@synopsys.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-sh@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: updated commit message]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-05 15:20:19 +01:00
Will Deacon
18011eac28 arm64: tls: Avoid unconditional zeroing of tpidrro_el0 for native tasks
When unmapping the kernel at EL0, we use tpidrro_el0 as a scratch register
during exception entry from native tasks and subsequently zero it in
the kernel_ventry macro. We can therefore avoid zeroing tpidrro_el0
in the context-switch path for native tasks using the entry trampoline.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11 13:41:03 +00:00
Dave Martin
071b6d4a5d arm64: fpsimd: Prevent registers leaking from dead tasks
Currently, loading of a task's fpsimd state into the CPU registers
is skipped if that task's state is already present in the registers
of that CPU.

However, the code relies on the struct fpsimd_state * (and by
extension struct task_struct *) to unambiguously identify a task.

There is a particular case in which this doesn't work reliably:
when a task exits, its task_struct may be recycled to describe a
new task.

Consider the following scenario:

 1) Task P loads its fpsimd state onto cpu C.
        per_cpu(fpsimd_last_state, C) := P;
        P->thread.fpsimd_state.cpu := C;

 2) Task X is scheduled onto C and loads its fpsimd state on C.
        per_cpu(fpsimd_last_state, C) := X;
        X->thread.fpsimd_state.cpu := C;

 3) X exits, causing X's task_struct to be freed.

 4) P forks a new child T, which obtains X's recycled task_struct.
	T == X.
	T->thread.fpsimd_state.cpu == C (inherited from P).

 5) T is scheduled on C.
	T's fpsimd state is not loaded, because
	per_cpu(fpsimd_last_state, C) == T (== X) &&
	T->thread.fpsimd_state.cpu == C.

        (This is the check performed by fpsimd_thread_switch().)

So, T gets X's registers because the last registers loaded onto C
were those of X, in (2).

This patch fixes the problem by ensuring that the sched-in check
fails in (5): fpsimd_flush_task_state(T) is called when T is
forked, so that T->thread.fpsimd_state.cpu == C cannot be true.
This relies on the fact that T is not schedulable until after
copy_thread() completes.

Once T's fpsimd state has been loaded on some CPU C there may still
be other cpus D for which per_cpu(fpsimd_last_state, D) ==
&X->thread.fpsimd_state.  But D is necessarily != C in this case,
and the check in (5) must fail.

An alternative fix would be to do refcounting on task_struct.  This
would result in each CPU holding a reference to the last task whose
fpsimd state was loaded there.  It's not clear whether this is
preferable, and it involves higher overhead than the fix proposed
in this patch.  It would also move all the task_struct freeing
work into the context switch critical section, or otherwise some
deferred cleanup mechanism would need to be introduced, neither of
which seems obviously justified.

Cc: <stable@vger.kernel.org>
Fixes: 005f78cd88 ("arm64: defer reloading a task's FPSIMD state to userland resume")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: word-smithed the comment so it makes more sense]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-06 18:02:21 +00:00
Dave Martin
bc0ee47603 arm64/sve: Core task context handling
This patch adds the core support for switching and managing the SVE
architectural state of user tasks.

Calls to the existing FPSIMD low-level save/restore functions are
factored out as new functions task_fpsimd_{save,load}(), since SVE
now dynamically may or may not need to be handled at these points
depending on the kernel configuration, hardware features discovered
at boot, and the runtime state of the task.  To make these
decisions as fast as possible, const cpucaps are used where
feasible, via the system_supports_sve() helper.

The SVE registers are only tracked for threads that have explicitly
used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
FPSIMD view of the architectural state is stored in
thread.fpsimd_state as usual.

When in use, the SVE registers are not stored directly in
thread_struct due to their potentially large and variable size.
Because the task_struct slab allocator must be configured very
early during kernel boot, it is also tricky to configure it
correctly to match the maximum vector length provided by the
hardware, since this depends on examining secondary CPUs as well as
the primary.  Instead, a pointer sve_state in thread_struct points
to a dynamically allocated buffer containing the SVE register data,
and code is added to allocate and free this buffer at appropriate
times.

TIF_SVE is set when taking an SVE access trap from userspace, if
suitable hardware support has been detected.  This enables SVE for
the thread: a subsequent return to userspace will disable the trap
accordingly.  If such a trap is taken without sufficient system-
wide hardware support, SIGILL is sent to the thread instead as if
an undefined instruction had been executed: this may happen if
userspace tries to use SVE in a system where not all CPUs support
it for example.

The kernel will clear TIF_SVE and disable SVE for the thread
whenever an explicit syscall is made by userspace.  For backwards
compatibility reasons and conformance with the spirit of the base
AArch64 procedure call standard, the subset of the SVE register
state that aliases the FPSIMD registers is still preserved across a
syscall even if this happens.  The remainder of the SVE register
state logically becomes zero at syscall entry, though the actual
zeroing work is currently deferred until the thread next tries to
use SVE, causing another trap to the kernel.  This implementation
is suboptimal: in the future, the fastpath case may be optimised
to zero the registers in-place and leave SVE enabled for the task,
where beneficial.

TIF_SVE is also cleared in the following slowpath cases, which are
taken as reasonable hints that the task may no longer use SVE:
 * exec
 * fork and clone

Code is added to sync data between thread.fpsimd_state and
thread.sve_state whenever enabling/disabling SVE, in a manner
consistent with the SVE architectural programmer's model.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
[will: added #include to fix allnoconfig build]
[will: use enable_daif in do_sve_acc]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:15 +00:00
Will Deacon
b7300d4c03 arm64: traps: Pretty-print pstate in register dumps
We can decode the PSTATE easily enough, so pretty-print it in register
dumps.

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-27 16:26:58 +01:00
Will Deacon
a25ffd3a63 arm64: traps: Don't print stack or raw PC/LR values in backtraces
Printing raw pointer values in backtraces has potential security
implications and are of questionable value anyway.

This patch follows x86's lead and removes the "Exception stack:" dump
from kernel backtraces, as well as converting PC/LR values to symbols
such as "sysrq_handle_crash+0x20/0x30".

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-27 16:26:53 +01:00
Linus Torvalds
04759194dc arm64 updates for 4.14:
- VMAP_STACK support, allowing the kernel stacks to be allocated in
   the vmalloc space with a guard page for trapping stack overflows. One
   of the patches introduces THREAD_ALIGN and changes the generic
   alloc_thread_stack_node() to use this instead of THREAD_SIZE (no
   functional change for other architectures)
 
 - Contiguous PTE hugetlb support re-enabled (after being reverted a
   couple of times). We now have the semantics agreed in the generic mm
   layer together with API improvements so that the architecture code can
   detect between contiguous and non-contiguous huge PTEs
 
 - Initial support for persistent memory on ARM: DC CVAP instruction
   exposed to user space (HWCAP) and the in-kernel pmem API implemented
 
 - raid6 improvements for arm64: faster algorithm for the delta syndrome
   and implementation of the recovery routines using Neon
 
 - FP/SIMD refactoring and removal of support for Neon in interrupt
   context. This is in preparation for full SVE support
 
 - PTE accessors converted from inline asm to cmpxchg so that we can
   use LSE atomics if available (ARMv8.1)
 
 - Perf support for Cortex-A35 and A73
 
 - Non-urgent fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlmuunYACgkQa9axLQDI
 XvEH9BAAo8V94GOMkX6HkT+2hjkl7DQ9krjumzmfzLV5AdgHMMzBNozmWKOCzgh0
 yaxRcTUju3EyNeKhADr7yLiKDH8fnRPmYEJiVrwfgo7MaPApaCorr7LLIXfPGuxe
 DTBHw+oxRMjlmaHeATX4PBWfQxAx+vjjhHqv3Qpmvdm4nYqR+0hZomH2BNsu64fk
 AkSeUCxfCEyzSFIKuQM04M4zhSSZHz1tDxWI0b0RcK73qqEOuYZNkn6qxSKP5J4X
 b2Y2U8nmxJ5C2fXpDYZaK9shiJ4Vu7X3Ocf/M7hsJzGY5z4dhnmUmxpHROaNiSvo
 hCx7POYKyAPovps7zMSqcdsujkqOIQO8RHp4zGXx/pIr1RumjIiCY+RGpUYGibvU
 N4Px5hZNneuHaPZZ+sWjOOdNB28xyzeUp2UK9Bb6uHB+/3xssMAD8Fd/b2ZLnS6a
 YW3wrZmqA+ckfETsSRibabTs/ayqYHs2SDVwnlDJGtn+4Pw8oQpwGrwokxLQuuw3
 uF2sNEPhJz+dcy21q3udYAQE1qOJBlLqTptgP96CHoVqh8X6nYSi5obT7y30ln3n
 dhpZGOdi6R8YOouxgXS3Wg07pxn444L/VzDw5ku/5DkdryPOZCSRbk/2t8If6oDM
 2VD6PCbTx3hsGc7SZ7FdSwIysD2j446u40OMGdH2iLB5jWBwyOM=
 =vd0/
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - VMAP_STACK support, allowing the kernel stacks to be allocated in the
   vmalloc space with a guard page for trapping stack overflows. One of
   the patches introduces THREAD_ALIGN and changes the generic
   alloc_thread_stack_node() to use this instead of THREAD_SIZE (no
   functional change for other architectures)

 - Contiguous PTE hugetlb support re-enabled (after being reverted a
   couple of times). We now have the semantics agreed in the generic mm
   layer together with API improvements so that the architecture code
   can detect between contiguous and non-contiguous huge PTEs

 - Initial support for persistent memory on ARM: DC CVAP instruction
   exposed to user space (HWCAP) and the in-kernel pmem API implemented

 - raid6 improvements for arm64: faster algorithm for the delta syndrome
   and implementation of the recovery routines using Neon

 - FP/SIMD refactoring and removal of support for Neon in interrupt
   context. This is in preparation for full SVE support

 - PTE accessors converted from inline asm to cmpxchg so that we can use
   LSE atomics if available (ARMv8.1)

 - Perf support for Cortex-A35 and A73

 - Non-urgent fixes and cleanups

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits)
  arm64: cleanup {COMPAT_,}SET_PERSONALITY() macro
  arm64: introduce separated bits for mm_context_t flags
  arm64: hugetlb: Cleanup setup_hugepagesz
  arm64: Re-enable support for contiguous hugepages
  arm64: hugetlb: Override set_huge_swap_pte_at() to support contiguous hugepages
  arm64: hugetlb: Override huge_pte_clear() to support contiguous hugepages
  arm64: hugetlb: Handle swap entries in huge_pte_offset() for contiguous hugepages
  arm64: hugetlb: Add break-before-make logic for contiguous entries
  arm64: hugetlb: Spring clean huge pte accessors
  arm64: hugetlb: Introduce pte_pgprot helper
  arm64: hugetlb: set_huge_pte_at Add WARN_ON on !pte_present
  arm64: kexec: have own crash_smp_send_stop() for crash dump for nonpanic cores
  arm64: dma-mapping: Mark atomic_pool as __ro_after_init
  arm64: dma-mapping: Do not pass data to gen_pool_set_algo()
  arm64: Remove the !CONFIG_ARM64_HW_AFDBM alternative code paths
  arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()
  arm64: Move PTE_RDONLY bit handling out of set_pte_at()
  kvm: arm64: Convert kvm_set_s2pte_readonly() from inline asm to cmpxchg()
  arm64: Convert pte handling from inline asm to using (cmp)xchg
  arm64: neon/efi: Make EFI fpsimd save/restore variables static
  ...
2017-09-05 09:53:37 -07:00
Yury Norov
d1be5c99a0 arm64: cleanup {COMPAT_,}SET_PERSONALITY() macro
There is some work that should be done after setting the personality.
Currently it's done in the macro, which is not the best idea.

In this patch new arch_setup_new_exec() routine is introduced, and all
setup code is moved there, as suggested by Catalin:
https://lkml.org/lkml/2017/8/4/494

Cc: Pratyush Anand <panand@redhat.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
[catalin.marinas@arm.com: comments changed or removed]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-22 18:41:47 +01:00
Mathieu Desnoyers
22e4ebb975 membarrier: Provide expedited private command
Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED with IPIs using cpumask built
from all runqueues for which current thread's mm is the same as the
thread calling sys_membarrier. It executes faster than the non-expedited
variant (no blocking). It also works on NOHZ_FULL configurations.

Scheduler-wise, it requires a memory barrier before and after context
switching between processes (which have different mm). The memory
barrier before context switch is already present. For the barrier after
context switch:

* Our TSO archs can do RELEASE without being a full barrier. Look at
  x86 spin_unlock() being a regular STORE for example.  But for those
  archs, all atomics imply smp_mb and all of them have atomic ops in
  switch_mm() for mm_cpumask(), and on x86 the CR3 load acts as a full
  barrier.

* From all weakly ordered machines, only ARM64 and PPC can do RELEASE,
  the rest does indeed do smp_mb(), so there the spin_unlock() is a full
  barrier and we're good.

* ARM64 has a very heavy barrier in switch_to(), which suffices.

* PPC just removed its barrier from switch_to(), but appears to be
  talking about adding something to switch_mm(). So add a
  smp_mb__after_unlock_lock() for now, until this is settled on the PPC
  side.

Changes since v3:
- Properly document the memory barriers provided by each architecture.

Changes since v2:
- Address comments from Peter Zijlstra,
- Add smp_mb__after_unlock_lock() after finish_lock_switch() in
  finish_task_switch() to add the memory barrier we need after storing
  to rq->curr. This is much simpler than the previous approach relying
  on atomic_dec_and_test() in mmdrop(), which actually added a memory
  barrier in the common case of switching between userspace processes.
- Return -EINVAL when MEMBARRIER_CMD_SHARED is used on a nohz_full
  kernel, rather than having the whole membarrier system call returning
  -ENOSYS. Indeed, CMD_PRIVATE_EXPEDITED is compatible with nohz_full.
  Adapt the CMD_QUERY mask accordingly.

Changes since v1:
- move membarrier code under kernel/sched/ because it uses the
  scheduler runqueue,
- only add the barrier when we switch from a kernel thread. The case
  where we switch from a user-space thread is already handled by
  the atomic_dec_and_test() in mmdrop().
- add a comment to mmdrop() documenting the requirement on the implicit
  memory barrier.

CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Maged Michael <maged.michael@gmail.com>
CC: gromer@google.com
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Dave Watson <davejwatson@fb.com>
2017-08-17 07:28:05 -07:00
Ard Biesheuvel
31e43ad3b7 arm64: unwind: remove sp from struct stackframe
The unwind code sets the sp member of struct stackframe to
'frame pointer + 0x10' unconditionally, without regard for whether
doing so produces a legal value. So let's simply remove it now that
we have stopped using it anyway.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-08-09 14:10:29 +01:00
Dave Martin
936eb65ca2 arm64: ptrace: Flush user-RW TLS reg to thread_struct before reading
When reading current's user-writable TLS register (which occurs
when dumping core for native tasks), it is possible that userspace
has modified it since the time the task was last scheduled out.
The new TLS register value is not guaranteed to have been written
immediately back to thread_struct in this case.

As a result, a coredump can capture stale data for this register.
Reading the register for a stopped task via ptrace is unaffected.

For native tasks, this patch explicitly flushes the TPIDR_EL0
register back to thread_struct before dumping when operating on
current, thus ensuring that coredump contents are up to date.  For
compat tasks, the TLS register is not user-writable and so cannot
be out of sync, so no flush is required in compat_tls_get().

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-22 15:58:20 +01:00
Kefeng Wang
1149aad10b arm64: Add dump_backtrace() in show_regs
Generic code expects show_regs() to dump the stack, but arm64's
show_regs() does not. This makes it hard to debug softlockups and
other issues that result in show_regs() being called.

This patch updates arm64's show_regs() to dump the stack, as common
code expects.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
[will: folded in bug_handler fix from mrutland]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-05-30 11:07:42 +01:00
Kefeng Wang
29d981217a arm64: drop unnecessary newlines in show_regs()
There are two unnecessary newlines, one is in show_regs, another
is in __show_regs(), drop them.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-03-23 14:20:41 +00:00
Ingo Molnar
68db0cf106 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h>
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:36 +01:00
Ingo Molnar
299300258d sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>
We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:35 +01:00
Ingo Molnar
b17b01533b sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>
We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:34 +01:00
Miles Chen
ffe3d1e43c arm64: use linux/sizes.h for constants
Use linux/size.h to improve code readability.

Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-09 11:39:20 +00:00
Joel Fernandes
8f4b326d66 arm64: Don't trace __switch_to if function graph tracer is enabled
Function graph tracer shows negative time (wrap around) when tracing
__switch_to if the nosleep-time trace option is enabled.

Time compensation for nosleep-time is done by an ftrace probe on
sched_switch. This doesn't work well for the following events (with
letters representing timestamps):
A - sched switch probe called for task T switch out
B - __switch_to calltime is recorded
C - sched_switch probe called for task T switch in
D - __switch_to rettime is recorded

If C - A > D - B, then we end up over compensating for the time spent in
__switch_to giving rise to negative times in the trace output.

On x86, __switch_to is not traced if function graph tracer is enabled.
Do the same for arm64 as well.

Cc: Todd Kjos <tkjos@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-01-10 11:05:08 +00:00
Suzuki K Poulose
a4023f6827 arm64: Add hypervisor safe helper for checking constant capabilities
The hypervisor may not have full access to the kernel data structures
and hence cannot safely use cpus_have_cap() helper for checking the
system capability. Add a safe helper for hypervisors to check a constant
system capability, which *doesn't* fall back to checking the bitmap
maintained by the kernel. With this, make the cpus_have_cap() only
check the bitmask and force constant cap checks to use the new API
for quicker checks.

Cc: Robert Ritcher <rritcher@cavium.com>
Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-16 17:50:51 +00:00
Mark Rutland
c02433dd6d arm64: split thread_info from task stack
This patch moves arm64's struct thread_info from the task stack into
task_struct. This protects thread_info from corruption in the case of
stack overflows, and makes its address harder to determine if stack
addresses are leaked, making a number of attacks more difficult. Precise
detection and handling of overflow is left for subsequent patches.

Largely, this involves changing code to store the task_struct in sp_el0,
and acquire the thread_info from the task struct. Core code now
implements current_thread_info(), and as noted in <linux/sched.h> this
relies on offsetof(task_struct, thread_info) == 0, enforced by core
code.

This change means that the 'tsk' register used in entry.S now points to
a task_struct, rather than a thread_info as it used to. To make this
clear, the TI_* field offsets are renamed to TSK_TI_*, with asm-offsets
appropriately updated to account for the structural change.

Userspace clobbers sp_el0, and we can no longer restore this from the
stack. Instead, the current task is cached in a per-cpu variable that we
can safely access from early assembly as interrupts are disabled (and we
are thus not preemptible).

Both secondary entry and idle are updated to stash the sp and task
pointer separately.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11 18:25:46 +00:00
Mark Rutland
9bbd4c56b0 arm64: prep stack walkers for THREAD_INFO_IN_TASK
When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
before a task is destroyed. To account for this, the stacks are
refcounted, and when manipulating the stack of another task, it is
necessary to get/put the stack to ensure it isn't freed and/or re-used
while we do so.

This patch reworks the arm64 stack walking code to account for this.
When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no
refcounting, and this should only be a structural change that does not
affect behaviour.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11 18:25:44 +00:00
Mark Rutland
db4b0710fa arm64: fix show_regs fallout from KERN_CONT changes
Recently in commit 4bcc595ccd ("printk: reinstate KERN_CONT for
printing continuation lines"), the behaviour of printk changed w.r.t.
KERN_CONT. Now, KERN_CONT is mandatory to continue existing lines.
Without this, prefixes are inserted, making output illegible, e.g.

[ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 1007.076329] sp : ffff000008d53ec0
[ 1007.079606] x29: ffff000008d53ec0 [ 1007.082797] x28: 0000000080c50018
[ 1007.086160]
[ 1007.087630] x27: ffff000008e0c7f8 [ 1007.090820] x26: ffff80097631ca00
[ 1007.094183]
[ 1007.095653] x25: 0000000000000001 [ 1007.098843] x24: 000000ea68b61cac
[ 1007.102206]

... or when dumped with the userpace dmesg tool, which has slightly
different implicit newline behaviour. e.g.

[ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 1007.076329] sp : ffff000008d53ec0
[ 1007.079606] x29: ffff000008d53ec0
[ 1007.082797] x28: 0000000080c50018
[ 1007.086160]
[ 1007.087630] x27: ffff000008e0c7f8
[ 1007.090820] x26: ffff80097631ca00
[ 1007.094183]
[ 1007.095653] x25: 0000000000000001
[ 1007.098843] x24: 000000ea68b61cac
[ 1007.102206]

We can't simply always use KERN_CONT for lines which may or may not be
continuations. That causes line prefixes (e.g. timestamps) to be
supressed, and the alignment of all but the first line will be broken.

For even more fun, we can't simply insert some dummy empty-string printk
calls, as GCC warns for an empty printk string, and even if we pass
KERN_DEFAULT explcitly to silence the warning, the prefix gets swallowed
unless there is an additional part to the string.

Instead, we must manually iterate over pairs of registers, which gives
us the legible output we want in either case, e.g.

[  169.771790] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[  169.779109] sp : ffff000008d53ec0
[  169.782386] x29: ffff000008d53ec0 x28: 0000000080c50018
[  169.787650] x27: ffff000008e0c7f8 x26: ffff80097631de00
[  169.792913] x25: 0000000000000001 x24: 00000027827b2cf4

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-20 15:27:56 +01:00
James Morse
d08544127d arm64: suspend: Reconfigure PSTATE after resume from idle
The suspend/resume path in kernel/sleep.S, as used by cpu-idle, does not
save/restore PSTATE. As a result of this cpufeatures that were detected
and have bits in PSTATE get lost when we resume from idle.

UAO gets set appropriately on the next context switch. PAN will be
re-enabled next time we return from user-space, but on a preemptible
kernel we may run work accessing user space before this point.

Add code to re-enable theses two features in __cpu_suspend_exit().
We re-use uao_thread_switch() passing current.

Signed-off-by: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-20 09:50:54 +01:00
Jason Cooper
fa5114c78c arm64: use simpler API for random address requests
Currently, all callers to randomize_range() set the length to 0 and
calculate end by adding a constant to the start address.  We can simplify
the API to remove a bunch of needless checks and variables.

Use the new randomize_addr(start, range) call to set the requested
address.

Link: http://lkml.kernel.org/r/20160803233913.32511-5-jason@lakedaemon.net
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Russell King - ARM Linux" <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:32 -07:00
Mark Rutland
adf7589997 arm64: simplify sysreg manipulation
A while back we added {read,write}_sysreg accessors to handle accesses
to system registers, without the usual boilerplate asm volatile,
temporary variable, etc.

This patch makes use of these across arm64 to make code shorter and
clearer. For sequences with a trailing ISB, the existing isb() macro is
also used so that asm blocks can be removed entirely.

A few uses of inline assembly for msr/mrs are left as-is. Those
manipulating sp_el0 for the current thread_info value have special
clobber requiremends.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:43:50 +01:00
Jiri Slaby
5f56a5dfdb exit_thread: remove empty bodies
Define HAVE_EXIT_THREAD for archs which want to do something in
exit_thread. For others, let's define exit_thread as an empty inline.

This is a cleanup before we change the prototype of exit_thread to
accept a task parameter.

[akpm@linux-foundation.org: fix mips]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00