2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

54 Commits

Author SHA1 Message Date
Yong-Xuan Wang
9752fed8f6 RISCV: KVM: Introduce vcpu->reset_cntx_lock
Originally, the use of kvm->lock in SBI_EXT_HSM_HART_START also avoids
the simultaneous updates to the reset context of target VCPU. Since this
lock has been replace with vcpu->mp_state_lock, and this new lock also
protects the vcpu->mp_state. We have to add a separate lock for
vcpu->reset_cntx.

Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240417074528.16506-3-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-22 10:39:03 +05:30
Yong-Xuan Wang
2121cadec4 RISCV: KVM: Introduce mp_state_lock to avoid lock inversion
Documentation/virt/kvm/locking.rst advises that kvm->lock should be
acquired outside vcpu->mutex and kvm->srcu. However, when KVM/RISC-V
handling SBI_EXT_HSM_HART_START, the lock ordering is vcpu->mutex,
kvm->srcu then kvm->lock.

Although the lockdep checking no longer complains about this after commit
f0f44752f5 ("rcu: Annotate SRCU's update-side lockdep dependencies"),
it's necessary to replace kvm->lock with a new dedicated lock to ensure
only one hart can execute the SBI_EXT_HSM_HART_START call for the target
hart simultaneously.

Additionally, this patch also rename "power_off" to "mp_state" with two
possible values. The vcpu->mp_state_lock also protects the access of
vcpu->mp_state.

Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240417074528.16506-2-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-22 10:37:11 +05:30
Chao Du
edcbe90f12 RISC-V: KVM: Implement kvm_arch_vcpu_ioctl_set_guest_debug()
kvm_vm_ioctl_check_extension(): Return 1 if KVM_CAP_SET_GUEST_DEBUG is
been checked.

kvm_arch_vcpu_ioctl_set_guest_debug(): Update the guest_debug flags
from userspace accordingly. Route the breakpoint exceptions to HS mode
if the VCPU is being debugged by userspace, by clearing the
corresponding bit in hedeleg.

Initialize the hedeleg configuration in kvm_riscv_vcpu_setup_config().
Write the actual CSR in kvm_arch_vcpu_load().

Signed-off-by: Chao Du <duchao@eswincomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240402062628.5425-2-duchao@eswincomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-04-08 14:06:27 +05:30
Paolo Bonzini
9cc52627c7 KVM/riscv changes for 6.8 part #1
- KVM_GET_REG_LIST improvement for vector registers
 - Generate ISA extension reg_list using macros in get-reg-list selftest
 - Steal time account support along with selftest
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmWQ+cgACgkQrUjsVaLH
 LAckBA//R4X9L5ugfPdDunp3ntjZXmNtBS5pM2jD+UvaoFn2kOA1o5kOD5mXluuh
 0imNjVuzlrX7XoAATQ4BoeoXg0whDbnv/8TE13KqSl1PfNziH2p5YD2DuHXPST3B
 V2VHrGACZ4wN074Whztl0oYI72obGnmpcsiglifkeCvRPerghHuDu40eUaWvCmgD
 DPwT+bjBgxkDZ4IheyytUrPql6reALh1Qo1vfj0FsJAgj+MAqQeD8n6rixSPnOdE
 9XXa4jdu7ycDp675VX/DVEWsNBQGPrsRK/uCiMksO36td+wLCKpAkvX95mE0w/L8
 qFJ+dN1c+1ZkjooHdVLfq2MjxaIRwmIowk7SeJbpvGIf/zG3r7eany7eXJT0+NjO
 22j5FY2z1NqcSG6Fazx76Qp2vVBVbxHShP9h7d6VTZYS7XENjmV6IWHpTSuSF8+n
 puj8Nf5C7WuqbySirSgQndDuKawn9myqfXXEoAuSiZ+kVyYEl8QnXm2gAIcxRDHX
 x+NDPMv0DpMBRO9qa/tXeqgNue/XOTJwgbmXzAlCNff3U7hPIHJ/5aZiJ/Re5TeE
 DxiU9AmIsNN2Bh0csS/wQbdScIqkOdOiDYEwT1DXOJWpmhiyCW7vR8ltaIuMJ4vP
 DtlfuUlSe4aml957nAiqqyjQAY/7gqmpoaGwu+lmrOX1K7fdtF0=
 =FeiG
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.8-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.8 part #1

- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list selftest
- Steal time account support along with selftest
2024-01-02 13:19:40 -05:00
Andrew Jones
38b3390ee4 RISC-V: KVM: Add SBI STA info to vcpu_arch
KVM's implementation of SBI STA needs to track the address of each
VCPU's steal-time shared memory region as well as the amount of
stolen time. Add a structure to vcpu_arch to contain this state
and make sure that the address is always set to INVALID_GPA on
vcpu reset. And, of course, ensure KVM won't try to update steal-
time when the shared memory address is invalid.

Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-12-30 11:26:26 +05:30
Andrew Jones
2a1f6bf079 RISC-V: KVM: Add steal-update vcpu request
Add a new vcpu request to inform a vcpu that it should record its
steal-time information. The request is made each time it has been
detected that the vcpu task was not assigned a cpu for some time,
which is easy to do by making the request from vcpu-load. The record
function is just a stub for now and will be filled in with the rest
of the steal-time support functions in following patches.

Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-12-30 11:26:22 +05:30
Sean Christopherson
f128cf8cfb KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER
Convert KVM_ARCH_WANT_MMU_NOTIFIER into a Kconfig and select it where
appropriate to effectively maintain existing behavior.  Using a proper
Kconfig will simplify building more functionality on top of KVM's
mmu_notifier infrastructure.

Add a forward declaration of kvm_gfn_range to kvm_types.h so that
including arch/powerpc/include/asm/kvm_ppc.h's with CONFIG_KVM=n doesn't
generate warnings due to kvm_gfn_range being undeclared.  PPC defines
hooks for PR vs. HV without guarding them via #ifdeffery, e.g.

  bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
  bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
  bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
  bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);

Alternatively, PPC could forward declare kvm_gfn_range, but there's no
good reason not to define it in common KVM.

Acked-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Message-Id: <20231027182217.3615211-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-11-13 05:29:09 -05:00
Mayuresh Chitale
81f0f314fe RISCV: KVM: Add sstateen0 context save/restore
Define sstateen0 and add sstateen0 save/restore for guest VCPUs.

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-10-12 18:44:11 +05:30
Mayuresh Chitale
db3c01c7a3 RISCV: KVM: Add senvcfg context save/restore
Add senvcfg context save/restore for guest VCPUs and also add it to the
ONE_REG interface to allow its access from user space.

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-10-12 18:44:09 +05:30
Mayuresh Chitale
d21b5d342f RISC-V: KVM: Enable Smstateen accesses
Configure hstateen0 register so that the AIA state and envcfg are
accessible to the vcpus. This includes registers such as siselect,
sireg, siph, sieh and all the IMISC registers.

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-10-12 18:44:07 +05:30
Mayuresh Chitale
fe0bab701e RISC-V: KVM: Add kvm_vcpu_config
Add a placeholder for all registers such as henvcfg, hstateen etc
which have 'static' configurations depending on extensions supported by
the guest. The values are derived once and are then subsequently written
to the corresponding CSRs while switching to the vcpu.

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-10-12 18:44:02 +05:30
Haibo Xu
031f9efafc KVM: riscv: Add KVM_GET_REG_LIST API support
KVM_GET_REG_LIST API will return all registers that are available to
KVM_GET/SET_ONE_REG APIs. It's very useful to identify some platform
regression issue during VM migration.

Since this API was already supported on arm64, it is straightforward
to enable it on riscv with similar code structure.

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-08-09 12:15:25 +05:30
Anup Patel
e98b1085be RISC-V: KVM: Factor-out ONE_REG related code to its own source file
The VCPU ONE_REG interface has grown over time and it will continue
to grow with new ISA extensions and other features. Let us move all
ONE_REG related code to its own source file so that vcpu.c only
focuses only on high-level VCPU functions.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-08-08 17:25:29 +05:30
Linus Torvalds
e8069f5a8e ARM64:
* Eager page splitting optimization for dirty logging, optionally
   allowing for a VM to avoid the cost of hugepage splitting in the stage-2
   fault path.
 
 * Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
   services that live in the Secure world. pKVM intervenes on FF-A calls
   to guarantee the host doesn't misuse memory donated to the hyp or a
   pKVM guest.
 
 * Support for running the split hypervisor with VHE enabled, known as
   'hVHE' mode. This is extremely useful for testing the split
   hypervisor on VHE-only systems, and paves the way for new use cases
   that depend on having two TTBRs available at EL2.
 
 * Generalized framework for configurable ID registers from userspace.
   KVM/arm64 currently prevents arbitrary CPU feature set configuration
   from userspace, but the intent is to relax this limitation and allow
   userspace to select a feature set consistent with the CPU.
 
 * Enable the use of Branch Target Identification (FEAT_BTI) in the
   hypervisor.
 
 * Use a separate set of pointer authentication keys for the hypervisor
   when running in protected mode, as the host is untrusted at runtime.
 
 * Ensure timer IRQs are consistently released in the init failure
   paths.
 
 * Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
   (FEAT_EVT), as it is a register commonly read from userspace.
 
 * Erratum workaround for the upcoming AmpereOne part, which has broken
   hardware A/D state management.
 
 RISC-V:
 
 * Redirect AMO load/store misaligned traps to KVM guest
 
 * Trap-n-emulate AIA in-kernel irqchip for KVM guest
 
 * Svnapot support for KVM Guest
 
 s390:
 
 * New uvdevice secret API
 
 * CMM selftest and fixes
 
 * fix racy access to target CPU for diag 9c
 
 x86:
 
 * Fix missing/incorrect #GP checks on ENCLS
 
 * Use standard mmu_notifier hooks for handling APIC access page
 
 * Drop now unnecessary TR/TSS load after VM-Exit on AMD
 
 * Print more descriptive information about the status of SEV and SEV-ES during
   module load
 
 * Add a test for splitting and reconstituting hugepages during and after
   dirty logging
 
 * Add support for CPU pinning in demand paging test
 
 * Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes
   included along the way
 
 * Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage
   recovery threads (because nx_huge_pages=off can be toggled at runtime)
 
 * Move handling of PAT out of MTRR code and dedup SVM+VMX code
 
 * Fix output of PIC poll command emulation when there's an interrupt
 
 * Add a maintainer's handbook to document KVM x86 processes, preferred coding
   style, testing expectations, etc.
 
 * Misc cleanups, fixes and comments
 
 Generic:
 
 * Miscellaneous bugfixes and cleanups
 
 Selftests:
 
 * Generate dependency files so that partial rebuilds work as expected
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae
 6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk
 F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3
 FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ
 LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P
 m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q==
 =AMXp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Eager page splitting optimization for dirty logging, optionally
     allowing for a VM to avoid the cost of hugepage splitting in the
     stage-2 fault path.

   - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact
     with services that live in the Secure world. pKVM intervenes on
     FF-A calls to guarantee the host doesn't misuse memory donated to
     the hyp or a pKVM guest.

   - Support for running the split hypervisor with VHE enabled, known as
     'hVHE' mode. This is extremely useful for testing the split
     hypervisor on VHE-only systems, and paves the way for new use cases
     that depend on having two TTBRs available at EL2.

   - Generalized framework for configurable ID registers from userspace.
     KVM/arm64 currently prevents arbitrary CPU feature set
     configuration from userspace, but the intent is to relax this
     limitation and allow userspace to select a feature set consistent
     with the CPU.

   - Enable the use of Branch Target Identification (FEAT_BTI) in the
     hypervisor.

   - Use a separate set of pointer authentication keys for the
     hypervisor when running in protected mode, as the host is untrusted
     at runtime.

   - Ensure timer IRQs are consistently released in the init failure
     paths.

   - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization
     Traps (FEAT_EVT), as it is a register commonly read from userspace.

   - Erratum workaround for the upcoming AmpereOne part, which has
     broken hardware A/D state management.

  RISC-V:

   - Redirect AMO load/store misaligned traps to KVM guest

   - Trap-n-emulate AIA in-kernel irqchip for KVM guest

   - Svnapot support for KVM Guest

  s390:

   - New uvdevice secret API

   - CMM selftest and fixes

   - fix racy access to target CPU for diag 9c

  x86:

   - Fix missing/incorrect #GP checks on ENCLS

   - Use standard mmu_notifier hooks for handling APIC access page

   - Drop now unnecessary TR/TSS load after VM-Exit on AMD

   - Print more descriptive information about the status of SEV and
     SEV-ES during module load

   - Add a test for splitting and reconstituting hugepages during and
     after dirty logging

   - Add support for CPU pinning in demand paging test

   - Add support for AMD PerfMonV2, with a variety of cleanups and minor
     fixes included along the way

   - Add a "nx_huge_pages=never" option to effectively avoid creating NX
     hugepage recovery threads (because nx_huge_pages=off can be toggled
     at runtime)

   - Move handling of PAT out of MTRR code and dedup SVM+VMX code

   - Fix output of PIC poll command emulation when there's an interrupt

   - Add a maintainer's handbook to document KVM x86 processes,
     preferred coding style, testing expectations, etc.

   - Misc cleanups, fixes and comments

  Generic:

   - Miscellaneous bugfixes and cleanups

  Selftests:

   - Generate dependency files so that partial rebuilds work as
     expected"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits)
  Documentation/process: Add a maintainer handbook for KVM x86
  Documentation/process: Add a label for the tip tree handbook's coding style
  KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index
  RISC-V: KVM: Remove unneeded semicolon
  RISC-V: KVM: Allow Svnapot extension for Guest/VM
  riscv: kvm: define vcpu_sbi_ext_pmu in header
  RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC
  RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel emulation of AIA APLIC
  RISC-V: KVM: Implement device interface for AIA irqchip
  RISC-V: KVM: Skeletal in-kernel AIA irqchip support
  RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero
  RISC-V: KVM: Add APLIC related defines
  RISC-V: KVM: Add IMSIC related defines
  RISC-V: KVM: Implement guest external interrupt line management
  KVM: x86: Remove PRIx* definitions as they are solely for user space
  s390/uv: Update query for secret-UVCs
  s390/uv: replace scnprintf with sysfs_emit
  s390/uvdevice: Add 'Lock Secret Store' UVC
  ...
2023-07-03 15:32:22 -07:00
Anup Patel
00f918f61c RISC-V: KVM: Skeletal in-kernel AIA irqchip support
To incrementally implement in-kernel AIA irqchip support, we first
add minimal skeletal support which only compiles but does not provide
any functionality.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-06-18 21:24:40 +05:30
Vincent Chen
0f4b825797
riscv: KVM: Add vector lazy save/restore support
This patch adds vector context save/restore for guest VCPUs. To reduce the
impact on KVM performance, the implementation imitates the FP context
switch mechanism to lazily store and restore the vector context only when
the kernel enters/exits the in-kernel run loop and not during the KVM
world switch.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20230605110724.21391-20-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-08 07:16:51 -07:00
Anup Patel
6b1e8ba4ba RISC-V: KVM: Use bitmap for irqs_pending and irqs_pending_mask
To support 64 VCPU local interrupts on RV32 host, we should use
bitmap for irqs_pending and irqs_pending_mask in struct kvm_vcpu_arch.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-04-21 17:45:58 +05:30
Anup Patel
54e43320c2 RISC-V: KVM: Initial skeletal support for AIA
To incrementally implement AIA support, we first add minimal skeletal
support which only compiles and detects AIA hardware support at the
boot-time but does not provide any functionality.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-04-21 17:45:48 +05:30
Paolo Bonzini
33436335e9 KVM/riscv changes for 6.3
- Fix wrong usage of PGDIR_SIZE to check page sizes
 - Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect()
 - Redirect illegal instruction traps to guest
 - SBI PMU support for guest
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmPifFIACgkQrUjsVaLH
 LAcEyxAAinMBaBhiPmwWZQvcCzh/UFmJo8BQCwAPuwoc/a4ZGAR7ylzd0oJilP8M
 wSgX6Ad8XF+CEW2VpxW9nwyi41N25ep1Lrf8vOaWy9L9QNUo0t15WrCIbXT2p399
 HrK9fz7HHKKIMsJy+rYb9EepdmMf55xtr1Y/EjyvhoDQbrEMlKsAODYz/SUoriQG
 Tn3cCYBzLdvzDzu0xXM9v+nsetWXdajK/v4je+mE3NQceXhePAO4oVWP4IpnoROd
 ZQm3evvVdf0WtKG9curxwMB7jjBqDBFrcLYl0qHGa7pi2o5PzVM7esgaV47KwetH
 IgA/Mrf1IfzpgM7VYDDax5wUHlKj63KisqU0J8rU3PUloQXaWqv7+ho51t9GzZ/i
 9x4uyO/evVntgyTw6HCbqmQJDgEtJiG1ydrR/ydBMYHLnh7LPY2UpKgcqmirtbkK
 1/DYDp84vikQ5VW1hc8IACdoBShh9Moh4xsEStzkTrIeHcZCjtORXUh8UIPZ0Mu2
 7Mnkktu9I55SLwA3rwH/EYT1ISrOV1G+q3wfqgeLpn8YUWwCIiqWQ5Ur0/WSMJse
 uJ3HedZDzj9T4n4khX+mKEYh6joAafQZag+4TID2lRSwd0S/mpeC22hYrViMdDmq
 yhE+JNin/sz4AVaHNzGwfqk2NC2RFl9aRn2X0xTwyBubif9pKMQ=
 =spUL
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.3-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.3

- Fix wrong usage of PGDIR_SIZE to check page sizes
- Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect()
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
2023-02-15 12:33:28 -05:00
Atish Patra
8f0153ecd3 RISC-V: KVM: Add skeleton support for perf
This patch only adds barebone structure of perf implementation. Most
of the function returns zero at this point and will be implemented
fully in the future.

Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-02-07 20:35:51 +05:30
Sean Christopherson
45b66dc139 KVM: RISC-V: Tag init functions and data with __init, __ro_after_init
Now that KVM setup is handled directly in riscv_kvm_init(), tag functions
and data that are used/set only during init with __init/__ro_after_init.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-26-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:18 -05:00
Sean Christopherson
63a1bd8ad1 KVM: Drop arch hardware (un)setup hooks
Drop kvm_arch_hardware_setup() and kvm_arch_hardware_unsetup() now that
all implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:40:54 -05:00
Anup Patel
52ec4b695d RISC-V: KVM: Save mvendorid, marchid, and mimpid when creating VCPU
We should save VCPU mvendorid, marchid, and mimpid at the time
of creating VCPU so that we don't have to do host SBI call every
time Guest/VM ask for these details.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-12-07 09:17:43 +05:30
Anup Patel
23fe562e45 RISC-V: KVM: Move sbi related struct and functions to kvm_vcpu_sbi.h
Just like asm/kvm_vcpu_timer.h, we should have all sbi related struct
and functions in asm/kvm_vcpu_sbi.h.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-12-07 09:17:27 +05:30
Anup Patel
1343c61a70 RISC-V: KVM: Remove redundant includes of asm/csr.h
We should include asm/csr.h only where required so let us remove
redundant includes of this header.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-12-07 09:17:12 +05:30
Jisheng Zhang
54ce3f7ff3 RISC-V: KVM: Record number of signal exits as a vCPU stat
Record a statistic indicating the number of times a vCPU has exited
due to a pending signal.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org
2022-10-02 10:19:16 +05:30
Anup Patel
c9d57373fc RISC-V: KVM: Add G-stage ioremap() and iounmap() functions
The in-kernel AIA IMSIC support requires on-demand mapping / unmapping
of Guest IMSIC address to Host IMSIC guest files. To help achieve this,
we add kvm_riscv_stage2_ioremap() and kvm_riscv_stage2_iounmap() functions.
These new functions for updating G-stage page table mappings will be called
in atomic context so we have special "in_atomic" parameter for this purpose.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-07-29 17:15:06 +05:30
Anup Patel
8a061562e2 RISC-V: KVM: Add extensible CSR emulation framework
We add an extensible CSR emulation framework which is based upon the
existing system instruction emulation. This will be useful to upcoming
AIA, PMU, Nested and other virtualization features.

The CSR emulation framework also has provision to emulate CSR in user
space but this will be used only in very specific cases such as AIA
IMSIC CSR emulation in user space or vendor specific CSR emulation
in user space.

By default, all CSRs not handled by KVM RISC-V will be redirected back
to Guest VCPU as illegal instruction trap.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-07-29 17:14:53 +05:30
Anup Patel
b91f0e4cb8 RISC-V: KVM: Factor-out instruction emulation into separate sources
The instruction and CSR emulation for VCPU is going to grow over time
due to upcoming AIA, PMU, Nested and other virtualization features.

Let us factor-out VCPU instruction emulation from vcpu_exit.c to a
separate source dedicated for this purpose.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-07-29 17:14:40 +05:30
Atish Patra
9bfd900bee RISC-V: KVM: Improve ISA extension by using a bitmap
Currently, the every vcpu only stores the ISA extensions in a unsigned long
which is not scalable as number of extensions will continue to grow.
Using a bitmap allows the ISA extension to support any number of
extensions. The CONFIG one reg interface implementation is modified to
support the bitmap as well. But it is meant only for base extensions.
Thus, the first element of the bitmap array is sufficient for that
interface.

In the future, all the new multi-letter extensions must use the
ISA_EXT one reg interface that allows enabling/disabling any extension
now.

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-07-29 17:14:11 +05:30
Anup Patel
92e450507d RISC-V: KVM: Cleanup stale TLB entries when host CPU changes
On RISC-V platforms with hardware VMID support, we share same
VMID for all VCPUs of a particular Guest/VM. This means we might
have stale G-stage TLB entries on the current Host CPU due to
some other VCPU of the same Guest which ran previously on the
current Host CPU.

To cleanup stale TLB entries, we simply flush all G-stage TLB
entries by VMID whenever underlying Host CPU changes for a VCPU.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:09:18 +05:30
Anup Patel
13acfec2db RISC-V: KVM: Add remote HFENCE functions based on VCPU requests
The generic KVM has support for VCPU requests which can be used
to do arch-specific work in the run-loop. We introduce remote
HFENCE functions which will internally use VCPU requests instead
of host SBI calls.

Advantages of doing remote HFENCEs as VCPU requests are:
1) Multiple VCPUs of a Guest may be running on different Host CPUs
   so it is not always possible to determine the Host CPU mask for
   doing Host SBI call. For example, when VCPU X wants to do HFENCE
   on VCPU Y, it is possible that VCPU Y is blocked or in user-space
   (i.e. vcpu->cpu < 0).
2) To support nested virtualization, we will be having a separate
   shadow G-stage for each VCPU and a common host G-stage for the
   entire Guest/VM. The VCPU requests based remote HFENCEs helps
   us easily synchronize the common host G-stage and shadow G-stage
   of each VCPU without any additional IPI calls.

This is also a preparatory patch for upcoming nested virtualization
support where we will be having a shadow G-stage page table for
each Guest VCPU.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:09:15 +05:30
Anup Patel
486a384294 RISC-V: KVM: Reduce KVM_MAX_VCPUS value
Currently, the KVM_MAX_VCPUS value is 16384 for RV64 and 128
for RV32.

The KVM_MAX_VCPUS value is too high for RV64 and too low for
RV32 compared to other architectures (e.g. x86 sets it to 1024
and ARM64 sets it to 512). The too high value of KVM_MAX_VCPUS
on RV64 also leads to VCPU mask on stack consuming 2KB.

We set KVM_MAX_VCPUS to 1024 for both RV64 and RV32 to be
aligned other architectures.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:09:12 +05:30
Anup Patel
2415e46e3a RISC-V: KVM: Introduce range based local HFENCE functions
Various  __kvm_riscv_hfence_xyz() functions implemented in the
kvm/tlb.S are equivalent to corresponding HFENCE.GVMA instructions
and we don't have range based local HFENCE functions.

This patch provides complete set of local HFENCE functions which
supports range based TLB invalidation and supports HFENCE.VVMA
based functions. This is also a preparatory patch for upcoming
Svinval support in KVM RISC-V.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:09:09 +05:30
Anup Patel
26708234eb RISC-V: KVM: Use G-stage name for hypervisor page table
The two-stage address translation defined by the RISC-V privileged
specification defines: VS-stage (guest virtual address to guest
physical address) programmed by the Guest OS  and G-stage (guest
physical addree to host physical address) programmed by the
hypervisor.

To align with above terminology, we replace "stage2" with "gstage"
and "Stage2" with "G-stage" name everywhere in KVM RISC-V sources.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:09:01 +05:30
Sean Christopherson
fdd6f6ac2e KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
Use the generic kvm_vcpu's srcu_idx instead of using an indentical field
in RISC-V's version of kvm_vcpu_arch.  Generic KVM very intentionally
does not touch vcpu->srcu_idx, i.e. there's zero chance of running afoul
of common code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220415004343.2203171-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 13:16:10 -04:00
Anup Patel
c9d3b5bd26 RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function
The wait for interrupt (WFI) instruction emulation can share the VCPU
halt logic with SBI HSM suspend emulation so this patch adds a common
kvm_riscv_vcpu_wfi() function for this purpose.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-03-11 19:02:37 +05:30
Anup Patel
a457fd5660 RISC-V: KVM: Add VM capability to allow userspace get GPA bits
The number of GPA bits supported for a RISC-V Guest/VM is based on the
MMU mode used by the G-stage translation. The KVM RISC-V will detect and
use the best possible MMU mode for the G-stage in kvm_arch_init().

We add a generic VM capability KVM_CAP_VM_GPA_BITS which can be used by
the KVM userspace to get the number of GPA (guest physical address) bits
supported for a Guest/VM.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-and-tested-by: Atish Patra <atishp@rivosinc.com>
2022-01-06 15:16:58 +05:30
Sean Christopherson
cc4f602bc4 KVM: RISC-V: Use common KVM implementation of MMU memory caches
Use common KVM's implementation of the MMU memory caches, which for all
intents and purposes is semantically identical to RISC-V's version, the
only difference being that the common implementation will fall back to an
atomic allocation if there's a KVM bug that triggers a cache underflow.

RISC-V appears to have based its MMU code on arm64 before the conversion
to the common caches in commit c1a33aebe9 ("KVM: arm64: Use common KVM
implementation of MMU memory caches"), despite having also copy-pasted
the definition of KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE in kvm_types.h.

Opportunistically drop the superfluous wrapper
kvm_riscv_stage2_flush_cache(), whose name is very, very confusing as
"cache flush" in the context of MMU code almost always refers to flushing
hardware caches, not freeing unused software objects.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2022-01-06 14:38:50 +05:30
Sean Christopherson
005467e06b KVM: Drop obsolete kvm_arch_vcpu_block_finish()
Drop kvm_arch_vcpu_block_finish() now that all arch implementations are
nops.

No functional change intended.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009021236.4122790-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:50 -05:00
Anup Patel
74c2e97b01 RISC-V: KVM: Fix incorrect KVM_MAX_VCPUS value
The KVM_MAX_VCPUS value is supposed to be aligned with number of
VMID bits in the hgatp CSR but the current KVM_MAX_VCPUS value
is aligned with number of ASID bits in the satp CSR.

Fixes: 99cdc6c18c ("RISC-V: Add initial skeletal KVM support")
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
2021-11-22 10:36:19 +05:30
Anup Patel
7c8de080d4 RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions
The parameter passed to HFENCE.GVMA instruction in rs1 register
is guest physical address right shifted by 2 (i.e. divided by 4).

Unfortunately, we overlooked the semantics of rs1 registers for
HFENCE.GVMA instruction and never right shifted guest physical
address by 2. This issue did not manifest for hypervisors till
now because:
  1) Currently, only __kvm_riscv_hfence_gvma_all() and SBI
     HFENCE calls are used to invalidate TLB.
  2) All H-extension implementations (such as QEMU, Spike,
     Rocket Core FPGA, etc) that we tried till now were
     conservatively flushing everything upon any HFENCE.GVMA
     instruction.

This patch fixes GPA passed to __kvm_riscv_hfence_gvma_vmid_gpa()
and __kvm_riscv_hfence_gvma_gpa() functions.

Fixes: fd7bb4a251 ("RISC-V: KVM: Implement VMID allocator")
Reported-by: Ian Huang <ihuang@ventanamicro.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Message-Id: <20211026170136.2147619-4-anup.patel@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-31 02:45:43 -04:00
Anup Patel
0a86512dc1 RISC-V: KVM: Factor-out FP virtualization into separate sources
The timer and SBI virtualization is already in separate sources.
In future, we will have vector and AIA virtualization also added
as separate sources.

To align with above described modularity, we factor-out FP
virtualization into separate sources.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Message-Id: <20211026170136.2147619-3-anup.patel@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-31 02:45:43 -04:00
Atish Patra
dea8ee31a0 RISC-V: KVM: Add SBI v0.1 support
The KVM host kernel is running in HS-mode needs so we need to handle
the SBI calls coming from guest kernel running in VS-mode.

This patch adds SBI v0.1 support in KVM RISC-V. Almost all SBI v0.1
calls are implemented in KVM kernel module except GETCHAR and PUTCHART
calls which are forwarded to user space because these calls cannot be
implemented in kernel space. In future, when we implement SBI v0.2 for
Guest, we will forward SBI v0.2 experimental and vendor extension calls
to user space.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 16:11:30 +05:30
Atish Patra
5de52d4a23 RISC-V: KVM: FP lazy save/restore
This patch adds floating point (F and D extension) context save/restore
for guest VCPUs. The FP context is saved and restored lazily only when
kernel enter/exits the in-kernel run loop and not during the KVM world
switch. This way FP save/restore has minimal impact on KVM performance.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 16:08:23 +05:30
Atish Patra
3a9f66cb25 RISC-V: KVM: Add timer functionality
The RISC-V hypervisor specification doesn't have any virtual timer
feature.

Due to this, the guest VCPU timer will be programmed via SBI calls.
The host will use a separate hrtimer event for each guest VCPU to
provide timer functionality. We inject a virtual timer interrupt to
the guest VCPU whenever the guest VCPU hrtimer event expires.

This patch adds guest VCPU timer implementation along with ONE_REG
interface to access VCPU timer state from user space.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 16:07:16 +05:30
Anup Patel
9955371cc0 RISC-V: KVM: Implement MMU notifiers
This patch implements MMU notifiers for KVM RISC-V so that Guest
physical address space is in-sync with Host physical address space.

This will allow swapping, page migration, etc to work transparently
with KVM RISC-V.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 16:03:39 +05:30
Anup Patel
9d05c1fee8 RISC-V: KVM: Implement stage2 page table programming
This patch implements all required functions for programming
the stage2 page table for each Guest/VM.

At high-level, the flow of stage2 related functions is similar
from KVM ARM/ARM64 implementation but the stage2 page table
format is quite different for KVM RISC-V.

[jiangyifei: stage2 dirty log support]
Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 16:02:19 +05:30
Anup Patel
fd7bb4a251 RISC-V: KVM: Implement VMID allocator
We implement a simple VMID allocator for Guests/VMs which:
1. Detects number of VMID bits at boot-time
2. Uses atomic number to track VMID version and increments
   VMID version whenever we run-out of VMIDs
3. Flushes Guest TLBs on all host CPUs whenever we run-out
   of VMIDs
4. Force updates HW Stage2 VMID for each Guest VCPU whenever
   VMID changes using VCPU request KVM_REQ_UPDATE_HGATP

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 16:01:04 +05:30
Anup Patel
9f70132651 RISC-V: KVM: Handle MMIO exits for VCPU
We will get stage2 page faults whenever Guest/VM access SW emulated
MMIO device or unmapped Guest RAM.

This patch implements MMIO read/write emulation by extracting MMIO
details from the trapped load/store instruction and forwarding the
MMIO read/write to user-space. The actual MMIO emulation will happen
in user-space and KVM kernel module will only take care of register
updates before resuming the trapped VCPU.

The handling for stage2 page faults for unmapped Guest RAM will be
implemeted by a separate patch later.

[jiangyifei: ioeventfd and in-kernel mmio device support]
Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 15:51:47 +05:30