mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2026-03-21 23:16:50 +08:00
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Quite a large pull request, partly due to skipping last week and
therefore having material from ~all submaintainers in this one. About
a fourth of it is a new selftest, and a couple more changes are large
in number of files touched (fixing a -Wflex-array-member-not-at-end
compiler warning) or lines changed (reformatting of a table in the API
documentation, thanks rST).
But who am I kidding---it's a lot of commits and there are a lot of
bugs being fixed here, some of them on the nastier side like the
RISC-V ones.
ARM:
- Correctly handle deactivation of interrupts that were activated
from LRs. Since EOIcount only denotes deactivation of interrupts
that are not present in an LR, start EOIcount deactivation walk
*after* the last irq that made it into an LR
- Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM
is already enabled -- not only is this not possible (pKVM will
reject the call), but it is also useless: this can only happen for
a CPU that has already booted once, and the capability will not
change
- Fix a couple of low-severity bugs in our S2 fault handling path,
affecting the recently introduced LS64 handling and the even more
esoteric handling of hwpoison in a nested context
- Address yet another syzkaller finding in the vgic initialisation,
where we would end-up destroying an uninitialised vgic with nasty
consequences
- Address an annoying case of pKVM failing to boot when some of the
memblock regions that the host is faulting in are not page-aligned
- Inject some sanity in the NV stage-2 walker by checking the limits
against the advertised PA size, and correctly report the resulting
faults
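The first ARM fix above is easiest to picture as a resume-from-saved-position walk: EOIcount only counts deactivations of interrupts that never made it into a List Register, so the replay must start after the last AP-list entry that did. A toy sketch of that idea, with a plain array standing in for the kernel's ap_list (all names here are hypothetical, not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

/* Toy "AP list": active interrupts sorted by priority. The first
 * `in_lr` entries made it into List Registers; EOIcount only covers
 * interrupts that did NOT fit in an LR, so the replay walk starts
 * past them instead of at the head. */
static int replay_eoicount(const int *ap_list, size_t len,
                           size_t in_lr, unsigned eoicount,
                           int *deactivated, size_t *ndeact)
{
    *ndeact = 0;
    for (size_t i = in_lr; i < len && eoicount; i++, eoicount--)
        deactivated[(*ndeact)++] = ap_list[i];
    return eoicount ? -1 : 0;  /* leftover EOIcount would be a bug */
}
```

Starting the walk at the list head (the pre-fix behaviour) would wrongly "deactivate" interrupts whose deactivation the hardware already reported through the LRs.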
PPC:
- Fix a PPC e500 build error due to a long-standing wart that was
exposed by the recent conversion to kmalloc_obj(); rip out all the
ugliness that led to the wart
RISC-V:
- Prevent speculative out-of-bounds access using array_index_nospec()
in APLIC interrupt handling, ONE_REG register access, AIA CSR
access, float register access, and PMU counter access
- Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()
- Fix potential null pointer dereference in
kvm_riscv_vcpu_aia_rmw_topei()
- Fix off-by-one array access in SBI PMU
- Skip THP support check during dirty logging
- Fix error code returned for Smstateen and Ssaia ONE_REG interface
- Check host Ssaia extension when creating AIA irqchip
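The array_index_nospec() fixes all follow one pattern: clamp an attacker-influenced index before using it, so that even a mispredicted bounds check cannot speculatively read out of bounds. A simplified userspace sketch of the masking trick the generic kernel helper uses (the real one lives in include/linux/nospec.h, often with an asm implementation; this relies on arithmetic right shift of negative values, as the kernel does):

```c
#include <assert.h>
#include <limits.h>

/* Simplified C version of the kernel's array_index_mask_nospec():
 * yields ~0UL when index < size, 0UL otherwise, with no branch the
 * CPU could mispredict. */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >> (sizeof(long) * CHAR_BIT - 1);
}

/* Clamp index to [0, size): an out-of-range index becomes 0, so a
 * subsequent array access stays in bounds even under speculation. */
static unsigned long index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}
```

The fixed KVM paths sanitize the guest-controlled index this way before dereferencing the APLIC, ONE_REG, CSR, FP-register, or PMU-counter arrays.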
x86:
- Fix cases where CPUID mitigation features were incorrectly marked
as available whenever the kernel used scattered feature words for
them
- Validate _all_ GVAs, rather than just the first GVA, when
processing a range of GVAs for Hyper-V's TLB flush hypercalls
- Fix a brown paper bug in add_atomic_switch_msr()
- Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu
- Ensure AVIC VMCB fields are initialized if the VM has an in-kernel
local APIC (and AVIC is enabled at the module level)
- Update CR8 write interception when AVIC is (de)activated, to fix a
bug where the guest can run in perpetuity with the CR8 intercept
enabled
- Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to
allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by
default) an unintentional tightening of userspace ABI in 6.17, and
provides some amount of backwards compatibility with hypervisors
who want to freeze PMCs on VM-Entry
- Validate the VMCS/VMCB on return to a nested guest from SMM,
because either userspace or the guest could stash invalid values in
memory and trigger the processor's consistency checks
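The Hyper-V TLB flush fix is a validate-everything-before-acting pattern: every guest virtual address in the hypercall's range must be checked up front, not just the first one. A hedged sketch of the shape of such a check, using a canonical-address test for a 48-bit virtual address space (function names are illustrative, not KVM's):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* A 48-bit-VA address is canonical when bits 63:47 are a sign
 * extension of bit 47. */
static bool is_canonical_48(uint64_t gva)
{
    return (int64_t)(gva << 16) >> 16 == (int64_t)gva;
}

/* Reject the whole batch if ANY entry is bad -- validating only
 * gvas[0] (the pre-fix behaviour) would let later bogus entries
 * through to the flush logic. */
static int validate_gva_list(const uint64_t *gvas, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!is_canonical_48(gvas[i]))
            return -1;
    return 0;
}
```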
Generic:
- Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from
being unnecessary and confusing, triggered compiler warnings due to
-Wflex-array-member-not-at-end
- Document that vcpu->mutex is taken outside of kvm->slots_lock and
kvm->slots_arch_lock, which is intentional and desirable despite
being rather unintuitive
Selftests:
- Increase the maximum number of NUMA nodes in the guest_memfd
selftest to 64 (from 8)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
Documentation: kvm: fix formatting of the quirks table
KVM: x86: clarify leave_smm() return value
selftests: kvm: add a test that VMX validates controls on RSM
selftests: kvm: extract common functionality out of smm_test.c
KVM: SVM: check validity of VMCB controls when returning from SMM
KVM: VMX: check validity of VMCS controls when returning from SMM
KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated
KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC
KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM
KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers()
KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr()
KVM: x86: hyper-v: Validate all GVAs during PV TLB flush
KVM: x86: synthesize CPUID bits only if CPU capability is set
KVM: PPC: e500: Rip out "struct tlbe_ref"
KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type
KVM: selftests: Increase 'maxnode' for guest_memfd tests
KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug
KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail
KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2()
...
@@ -8435,115 +8435,123 @@ KVM_CHECK_EXTENSION.

The valid bits in cap.args[0] are:

======================================== ================================================
KVM_X86_QUIRK_LINT0_REENABLED            By default, the reset value for the LVT
                                         LINT0 register is 0x700 (APIC_MODE_EXTINT).
                                         When this quirk is disabled, the reset value
                                         is 0x10000 (APIC_LVT_MASKED).

KVM_X86_QUIRK_CD_NW_CLEARED              By default, KVM clears CR0.CD and CR0.NW on
                                         AMD CPUs to workaround buggy guest firmware
                                         that runs in perpetuity with CR0.CD, i.e.
                                         with caches in "no fill" mode.

                                         When this quirk is disabled, KVM does not
                                         change the value of CR0.CD and CR0.NW.

KVM_X86_QUIRK_LAPIC_MMIO_HOLE            By default, the MMIO LAPIC interface is
                                         available even when configured for x2APIC
                                         mode. When this quirk is disabled, KVM
                                         disables the MMIO LAPIC interface if the
                                         LAPIC is in x2APIC mode.

KVM_X86_QUIRK_OUT_7E_INC_RIP             By default, KVM pre-increments %rip before
                                         exiting to userspace for an OUT instruction
                                         to port 0x7e. When this quirk is disabled,
                                         KVM does not pre-increment %rip before
                                         exiting to userspace.

KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT       When this quirk is disabled, KVM sets
                                         CPUID.01H:ECX[bit 3] (MONITOR/MWAIT) if
                                         IA32_MISC_ENABLE[bit 18] (MWAIT) is set.
                                         Additionally, when this quirk is disabled,
                                         KVM clears CPUID.01H:ECX[bit 3] if
                                         IA32_MISC_ENABLE[bit 18] is cleared.

KVM_X86_QUIRK_FIX_HYPERCALL_INSN         By default, KVM rewrites guest
                                         VMMCALL/VMCALL instructions to match the
                                         vendor's hypercall instruction for the
                                         system. When this quirk is disabled, KVM
                                         will no longer rewrite invalid guest
                                         hypercall instructions. Executing the
                                         incorrect hypercall instruction will
                                         generate a #UD within the guest.

KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS      By default, KVM emulates MONITOR/MWAIT (if
                                         they are intercepted) as NOPs regardless of
                                         whether or not MONITOR/MWAIT are supported
                                         according to guest CPUID. When this quirk
                                         is disabled and KVM_X86_DISABLE_EXITS_MWAIT
                                         is not set (MONITOR/MWAIT are intercepted),
                                         KVM will inject a #UD on MONITOR/MWAIT if
                                         they're unsupported per guest CPUID. Note,
                                         KVM will modify MONITOR/MWAIT support in
                                         guest CPUID on writes to MISC_ENABLE if
                                         KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is
                                         disabled.

KVM_X86_QUIRK_SLOT_ZAP_ALL               By default, for KVM_X86_DEFAULT_VM VMs, KVM
                                         invalidates all SPTEs in all memslots and
                                         address spaces when a memslot is deleted or
                                         moved. When this quirk is disabled (or the
                                         VM type isn't KVM_X86_DEFAULT_VM), KVM only
                                         ensures the backing memory of the deleted
                                         or moved memslot isn't reachable, i.e. KVM
                                         _may_ invalidate only SPTEs related to the
                                         memslot.

KVM_X86_QUIRK_STUFF_FEATURE_MSRS         By default, at vCPU creation, KVM sets the
                                         vCPU's MSR_IA32_PERF_CAPABILITIES (0x345),
                                         MSR_IA32_ARCH_CAPABILITIES (0x10a),
                                         MSR_PLATFORM_INFO (0xce), and all VMX MSRs
                                         (0x480..0x492) to the maximal capabilities
                                         supported by KVM. KVM also sets
                                         MSR_IA32_UCODE_REV (0x8b) to an arbitrary
                                         value (which is different for Intel vs.
                                         AMD). Lastly, when guest CPUID is set (by
                                         userspace), KVM modifies select VMX MSR
                                         fields to force consistency between guest
                                         CPUID and L2's effective ISA. When this
                                         quirk is disabled, KVM zeroes the vCPU's MSR
                                         values (with two exceptions, see below),
                                         i.e. treats the feature MSRs like CPUID
                                         leaves and gives userspace full control of
                                         the vCPU model definition. This quirk does
                                         not affect VMX MSRs CR0/CR4_FIXED1 (0x487
                                         and 0x489), as KVM does not allow them to
                                         be set by userspace (KVM sets them based on
                                         guest CPUID, for safety purposes).

KVM_X86_QUIRK_IGNORE_GUEST_PAT           By default, on Intel platforms, KVM ignores
                                         guest PAT and forces the effective memory
                                         type to WB in EPT. The quirk is not available
                                         on Intel platforms which are incapable of
                                         safely honoring guest PAT (i.e., without CPU
                                         self-snoop, KVM always ignores guest PAT and
                                         forces effective memory type to WB). It is
                                         also ignored on AMD platforms or, on Intel,
                                         when a VM has non-coherent DMA devices
                                         assigned; KVM always honors guest PAT in
                                         such case. The quirk is needed to avoid
                                         slowdowns on certain Intel Xeon platforms
                                         (e.g. ICX, SPR) where self-snoop feature is
                                         supported but UC is slow enough to cause
                                         issues with some older guests that use
                                         UC instead of WC to map the video RAM.
                                         Userspace can disable the quirk to honor
                                         guest PAT if it knows that there is no such
                                         guest software, for example if it does not
                                         expose a bochs graphics device (which is
                                         known to have had a buggy driver).

KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM By default, KVM relaxes the consistency
                                         check for GUEST_IA32_DEBUGCTL in vmcs12
                                         to allow FREEZE_IN_SMM to be set. When
                                         this quirk is disabled, KVM requires this
                                         bit to be cleared. Note that the vmcs02
                                         bit is still completely controlled by the
                                         host, regardless of the quirk setting.
======================================== ================================================

7.32 KVM_CAP_MAX_VCPU_ID
------------------------
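Quirks from the table above are disabled per-VM by passing a bitmask in cap.args[0] to KVM_ENABLE_CAP with KVM_CAP_DISABLE_QUIRKS2. A minimal sketch of composing that argument; the numeric values below are copied from my understanding of the uapi header only to keep the sketch self-contained (take them from <linux/kvm.h> in real code), and the ioctl itself is shown in a comment rather than exercised:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values believed to match the uapi <linux/kvm.h>; use the real
 * header in production code. */
#define KVM_CAP_DISABLE_QUIRKS2        213
#define KVM_X86_QUIRK_LINT0_REENABLED  (1 << 0)
#define KVM_X86_QUIRK_CD_NW_CLEARED    (1 << 1)

struct kvm_enable_cap_sketch {  /* layout mirrors struct kvm_enable_cap */
    uint32_t cap;
    uint32_t flags;
    uint64_t args[4];
    uint8_t  pad[64];
};

/* Build the argument for ioctl(vm_fd, KVM_ENABLE_CAP, &cap): each set
 * bit in args[0] disables the corresponding quirk for this VM. */
static struct kvm_enable_cap_sketch disable_quirks(uint64_t mask)
{
    struct kvm_enable_cap_sketch cap;
    memset(&cap, 0, sizeof(cap));
    cap.cap = KVM_CAP_DISABLE_QUIRKS2;
    cap.args[0] = mask;
    return cap;
}
```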
@@ -17,6 +17,8 @@ The acquisition orders for mutexes are as follows:

 - kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock

+- vcpu->mutex is taken outside kvm->slots_lock and kvm->slots_arch_lock
+
 - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
   them together is quite rare.
@@ -784,6 +784,9 @@ struct kvm_host_data {
 	/* Number of debug breakpoints/watchpoints for this CPU (minus 1) */
 	unsigned int debug_brps;
 	unsigned int debug_wrps;
+
+	/* Last vgic_irq part of the AP list recorded in an LR */
+	struct vgic_irq *last_lr_irq;
 };

 struct kvm_host_psci_config {
@@ -2345,6 +2345,15 @@ static bool can_trap_icv_dir_el1(const struct arm64_cpu_capabilities *entry,
 	    !is_midr_in_range_list(has_vgic_v3))
 		return false;

+	/*
+	 * pKVM prevents late onlining of CPUs. This means that whatever
+	 * state the capability is in after deprivilege cannot be affected
+	 * by a new CPU booting -- this is guaranteed to be a CPU we have
+	 * already seen, and the cap is therefore unchanged.
+	 */
+	if (system_capabilities_finalized() && is_protected_kvm_enabled())
+		return cpus_have_final_cap(ARM64_HAS_ICH_HCR_EL2_TDIR);
+
 	if (is_kernel_in_hyp_mode())
 		res.a1 = read_sysreg_s(SYS_ICH_VTR_EL2);
 	else
@@ -1504,8 +1504,6 @@ int __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
 		fail = true;
 	}

-	isb();
-
 	if (!fail)
 		par = read_sysreg_par();
@@ -29,7 +29,7 @@

 #include "trace.h"

-const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+const struct kvm_stats_desc kvm_vm_stats_desc[] = {
 	KVM_GENERIC_VM_STATS()
 };

@@ -42,7 +42,7 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 	sizeof(kvm_vm_stats_desc),
 };

-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, hvc_exit_stat),
 	STATS_DESC_COUNTER(VCPU, wfe_exit_stat),
@@ -518,7 +518,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range)
 		granule = kvm_granule_size(level);
 		cur.start = ALIGN_DOWN(addr, granule);
 		cur.end = cur.start + granule;
-		if (!range_included(&cur, range))
+		if (!range_included(&cur, range) && level < KVM_PGTABLE_LAST_LEVEL)
 			continue;
 		*range = cur;
 		return 0;
@@ -1751,6 +1751,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,

 		force_pte = (max_map_size == PAGE_SIZE);
 		vma_pagesize = min_t(long, vma_pagesize, max_map_size);
+		vma_shift = __ffs(vma_pagesize);
 	}

 	/*
@@ -1837,10 +1838,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (exec_fault && s2_force_noncacheable)
 		ret = -ENOEXEC;

-	if (ret) {
-		kvm_release_page_unused(page);
-		return ret;
-	}
+	if (ret)
+		goto out_put_page;

 	/*
 	 * Guest performs atomic/exclusive operations on memory with unsupported
@@ -1850,7 +1849,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 */
 	if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(vcpu))) {
 		kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
-		return 1;
+		ret = 1;
+		goto out_put_page;
 	}

 	if (nested)
@@ -1936,6 +1936,10 @@ out_unlock:
 		mark_page_dirty_in_slot(kvm, memslot, gfn);

 	return ret != -EAGAIN ? ret : 0;
+
+out_put_page:
+	kvm_release_page_unused(page);
+	return ret;
 }

 /* Resolve the access fault by making the page young again. */
@@ -152,31 +152,31 @@ static int get_ia_size(struct s2_walk_info *wi)
 	return 64 - wi->t0sz;
 }

-static int check_base_s2_limits(struct s2_walk_info *wi,
+static int check_base_s2_limits(struct kvm_vcpu *vcpu, struct s2_walk_info *wi,
 				int level, int input_size, int stride)
 {
-	int start_size, ia_size;
+	int start_size, pa_max;

-	ia_size = get_ia_size(wi);
+	pa_max = kvm_get_pa_bits(vcpu->kvm);

 	/* Check translation limits */
 	switch (BIT(wi->pgshift)) {
 	case SZ_64K:
-		if (level == 0 || (level == 1 && ia_size <= 42))
+		if (level == 0 || (level == 1 && pa_max <= 42))
 			return -EFAULT;
 		break;
 	case SZ_16K:
-		if (level == 0 || (level == 1 && ia_size <= 40))
+		if (level == 0 || (level == 1 && pa_max <= 40))
 			return -EFAULT;
 		break;
 	case SZ_4K:
-		if (level < 0 || (level == 0 && ia_size <= 42))
+		if (level < 0 || (level == 0 && pa_max <= 42))
 			return -EFAULT;
 		break;
 	}

 	/* Check input size limits */
-	if (input_size > ia_size)
+	if (input_size > pa_max)
 		return -EFAULT;

 	/* Check number of entries in starting level table */
@@ -269,16 +269,19 @@ static int walk_nested_s2_pgd(struct kvm_vcpu *vcpu, phys_addr_t ipa,
 	if (input_size > 48 || input_size < 25)
 		return -EFAULT;

-	ret = check_base_s2_limits(wi, level, input_size, stride);
-	if (WARN_ON(ret))
+	ret = check_base_s2_limits(vcpu, wi, level, input_size, stride);
+	if (WARN_ON(ret)) {
+		out->esr = compute_fsc(0, ESR_ELx_FSC_FAULT);
 		return ret;
+	}

 	base_lower_bound = 3 + input_size - ((3 - level) * stride +
 			   wi->pgshift);
 	base_addr = wi->baddr & GENMASK_ULL(47, base_lower_bound);

 	if (check_output_size(wi, base_addr)) {
-		out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ);
+		/* R_BFHQH */
+		out->esr = compute_fsc(0, ESR_ELx_FSC_ADDRSZ);
 		return 1;
 	}

@@ -293,8 +296,10 @@ static int walk_nested_s2_pgd(struct kvm_vcpu *vcpu, phys_addr_t ipa,

 		paddr = base_addr | index;
 		ret = read_guest_s2_desc(vcpu, paddr, &desc, wi);
-		if (ret < 0)
+		if (ret < 0) {
+			out->esr = ESR_ELx_FSC_SEA_TTW(level);
 			return ret;
+		}

 		new_desc = desc;
@@ -143,23 +143,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
 	kvm->arch.vgic.implementation_rev = KVM_VGIC_IMP_REV_LATEST;

-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		ret = vgic_allocate_private_irqs_locked(vcpu, type);
-		if (ret)
-			break;
-	}
-
-	if (ret) {
-		kvm_for_each_vcpu(i, vcpu, kvm) {
-			struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
-
-			kfree(vgic_cpu->private_irqs);
-			vgic_cpu->private_irqs = NULL;
-		}
-
-		goto out_unlock;
-	}
-
 	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;

 	aa64pfr0 = kvm_read_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1) & ~ID_AA64PFR0_EL1_GIC;
@@ -176,6 +159,23 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm_set_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1, aa64pfr0);
 	kvm_set_vm_id_reg(kvm, SYS_ID_PFR1_EL1, pfr1);

+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		ret = vgic_allocate_private_irqs_locked(vcpu, type);
+		if (ret)
+			break;
+	}
+
+	if (ret) {
+		kvm_for_each_vcpu(i, vcpu, kvm) {
+			struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+
+			kfree(vgic_cpu->private_irqs);
+			vgic_cpu->private_irqs = NULL;
+		}
+
+		kvm->arch.vgic.vgic_model = 0;
+		goto out_unlock;
+	}
+
 	if (type == KVM_DEV_TYPE_ARM_VGIC_V3)
 		kvm->arch.vgic.nassgicap = system_supports_direct_sgis();
@@ -115,7 +115,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_v2_cpu_if *cpuif = &vgic_cpu->vgic_v2;
 	u32 eoicount = FIELD_GET(GICH_HCR_EOICOUNT, cpuif->vgic_hcr);
-	struct vgic_irq *irq;
+	struct vgic_irq *irq = *host_data_ptr(last_lr_irq);

 	DEBUG_SPINLOCK_BUG_ON(!irqs_disabled());

@@ -123,7 +123,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 		vgic_v2_fold_lr(vcpu, cpuif->vgic_lr[lr]);

 	/* See the GICv3 equivalent for the EOIcount handling rationale */
-	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
+	list_for_each_entry_continue(irq, &vgic_cpu->ap_list_head, ap_list) {
 		u32 lr;

 		if (!eoicount) {
@@ -148,7 +148,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_v3_cpu_if *cpuif = &vgic_cpu->vgic_v3;
 	u32 eoicount = FIELD_GET(ICH_HCR_EL2_EOIcount, cpuif->vgic_hcr);
-	struct vgic_irq *irq;
+	struct vgic_irq *irq = *host_data_ptr(last_lr_irq);

 	DEBUG_SPINLOCK_BUG_ON(!irqs_disabled());

@@ -158,12 +158,12 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 	/*
 	 * EOIMode=0: use EOIcount to emulate deactivation. We are
 	 * guaranteed to deactivate in reverse order of the activation, so
-	 * just pick one active interrupt after the other in the ap_list,
-	 * and replay the deactivation as if the CPU was doing it. We also
-	 * rely on priority drop to have taken place, and the list to be
-	 * sorted by priority.
+	 * just pick one active interrupt after the other in the tail part
+	 * of the ap_list, past the LRs, and replay the deactivation as if
+	 * the CPU was doing it. We also rely on priority drop to have taken
+	 * place, and the list to be sorted by priority.
 	 */
-	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
+	list_for_each_entry_continue(irq, &vgic_cpu->ap_list_head, ap_list) {
 		u64 lr;

 		/*
@@ -814,6 +814,9 @@ retry:

 static inline void vgic_fold_lr_state(struct kvm_vcpu *vcpu)
 {
+	if (!*host_data_ptr(last_lr_irq))
+		return;
+
 	if (kvm_vgic_global_state.type == VGIC_V2)
 		vgic_v2_fold_lr_state(vcpu);
 	else
@@ -960,10 +963,13 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 	if (irqs_outside_lrs(&als))
 		vgic_sort_ap_list(vcpu);

+	*host_data_ptr(last_lr_irq) = NULL;
+
 	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
 		scoped_guard(raw_spinlock, &irq->irq_lock) {
 			if (likely(vgic_target_oracle(irq) == vcpu)) {
 				vgic_populate_lr(vcpu, irq, count++);
+				*host_data_ptr(last_lr_irq) = irq;
 			}
 		}
@@ -14,7 +14,7 @@
 #define CREATE_TRACE_POINTS
 #include "trace.h"

-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, int_exits),
 	STATS_DESC_COUNTER(VCPU, idle_exits),

@@ -10,7 +10,7 @@
 #include <asm/kvm_eiointc.h>
 #include <asm/kvm_pch_pic.h>

-const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+const struct kvm_stats_desc kvm_vm_stats_desc[] = {
 	KVM_GENERIC_VM_STATS(),
 	STATS_DESC_ICOUNTER(VM, pages),
 	STATS_DESC_ICOUNTER(VM, hugepages),
@@ -38,7 +38,7 @@
 #define VECTORSPACING 0x100	/* for EI/VI mode */
 #endif

-const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+const struct kvm_stats_desc kvm_vm_stats_desc[] = {
 	KVM_GENERIC_VM_STATS()
 };

@@ -51,7 +51,7 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 	sizeof(kvm_vm_stats_desc),
 };

-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, wait_exits),
 	STATS_DESC_COUNTER(VCPU, cache_exits),
@@ -38,7 +38,7 @@

 /* #define EXIT_DEBUG */

-const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+const struct kvm_stats_desc kvm_vm_stats_desc[] = {
 	KVM_GENERIC_VM_STATS(),
 	STATS_DESC_ICOUNTER(VM, num_2M_pages),
 	STATS_DESC_ICOUNTER(VM, num_1G_pages)
@@ -53,7 +53,7 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 	sizeof(kvm_vm_stats_desc),
 };

-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, sum_exits),
 	STATS_DESC_COUNTER(VCPU, mmio_exits),

@@ -36,7 +36,7 @@

 unsigned long kvmppc_booke_handlers;

-const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+const struct kvm_stats_desc kvm_vm_stats_desc[] = {
 	KVM_GENERIC_VM_STATS(),
 	STATS_DESC_ICOUNTER(VM, num_2M_pages),
 	STATS_DESC_ICOUNTER(VM, num_1G_pages)
@@ -51,7 +51,7 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 	sizeof(kvm_vm_stats_desc),
 };

-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, sum_exits),
 	STATS_DESC_COUNTER(VCPU, mmio_exits),
@@ -39,15 +39,11 @@ enum vcpu_ftr {
 /* bits [6-5] MAS2_X1 and MAS2_X0 and [4-0] bits for WIMGE */
 #define E500_TLB_MAS2_ATTR	(0x7f)

-struct tlbe_ref {
+struct tlbe_priv {
 	kvm_pfn_t pfn;		/* valid only for TLB0, except briefly */
 	unsigned int flags;	/* E500_TLB_* */
 };

-struct tlbe_priv {
-	struct tlbe_ref ref;
-};
-
 #ifdef CONFIG_KVM_E500V2
 struct vcpu_id_table;
 #endif
@@ -920,12 +920,12 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
 	vcpu_e500->gtlb_offset[0] = 0;
 	vcpu_e500->gtlb_offset[1] = KVM_E500_TLB0_SIZE;

-	vcpu_e500->gtlb_priv[0] = kzalloc_objs(struct tlbe_ref,
+	vcpu_e500->gtlb_priv[0] = kzalloc_objs(struct tlbe_priv,
 					       vcpu_e500->gtlb_params[0].entries);
 	if (!vcpu_e500->gtlb_priv[0])
 		goto free_vcpu;

-	vcpu_e500->gtlb_priv[1] = kzalloc_objs(struct tlbe_ref,
+	vcpu_e500->gtlb_priv[1] = kzalloc_objs(struct tlbe_priv,
 					       vcpu_e500->gtlb_params[1].entries);
 	if (!vcpu_e500->gtlb_priv[1])
 		goto free_vcpu;
@@ -189,16 +189,16 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
 {
 	struct kvm_book3e_206_tlb_entry *gtlbe =
 		get_entry(vcpu_e500, tlbsel, esel);
-	struct tlbe_ref *ref = &vcpu_e500->gtlb_priv[tlbsel][esel].ref;
+	struct tlbe_priv *tlbe = &vcpu_e500->gtlb_priv[tlbsel][esel];
 
 	/* Don't bother with unmapped entries */
-	if (!(ref->flags & E500_TLB_VALID)) {
-		WARN(ref->flags & (E500_TLB_BITMAP | E500_TLB_TLB0),
-		     "%s: flags %x\n", __func__, ref->flags);
+	if (!(tlbe->flags & E500_TLB_VALID)) {
+		WARN(tlbe->flags & (E500_TLB_BITMAP | E500_TLB_TLB0),
+		     "%s: flags %x\n", __func__, tlbe->flags);
 		WARN_ON(tlbsel == 1 && vcpu_e500->g2h_tlb1_map[esel]);
 	}
 
-	if (tlbsel == 1 && ref->flags & E500_TLB_BITMAP) {
+	if (tlbsel == 1 && tlbe->flags & E500_TLB_BITMAP) {
 		u64 tmp = vcpu_e500->g2h_tlb1_map[esel];
 		int hw_tlb_indx;
 		unsigned long flags;
@@ -216,28 +216,28 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
 		}
 		mb();
 		vcpu_e500->g2h_tlb1_map[esel] = 0;
-		ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
+		tlbe->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
 		local_irq_restore(flags);
 	}
 
-	if (tlbsel == 1 && ref->flags & E500_TLB_TLB0) {
+	if (tlbsel == 1 && tlbe->flags & E500_TLB_TLB0) {
 		/*
 		 * TLB1 entry is backed by 4k pages. This should happen
 		 * rarely and is not worth optimizing. Invalidate everything.
 		 */
 		kvmppc_e500_tlbil_all(vcpu_e500);
-		ref->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID);
+		tlbe->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID);
 	}
 
 	/*
	 * If TLB entry is still valid then it's a TLB0 entry, and thus
	 * backed by at most one host tlbe per shadow pid
	 */
-	if (ref->flags & E500_TLB_VALID)
+	if (tlbe->flags & E500_TLB_VALID)
 		kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
 
 	/* Mark the TLB as not backed by the host anymore */
-	ref->flags = 0;
+	tlbe->flags = 0;
 }
 static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
@@ -245,26 +245,26 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
 	return tlbe->mas7_3 & (MAS3_SW|MAS3_UW);
 }
 
-static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
-					 struct kvm_book3e_206_tlb_entry *gtlbe,
-					 kvm_pfn_t pfn, unsigned int wimg,
-					 bool writable)
+static inline void kvmppc_e500_tlbe_setup(struct tlbe_priv *tlbe,
+					  struct kvm_book3e_206_tlb_entry *gtlbe,
+					  kvm_pfn_t pfn, unsigned int wimg,
+					  bool writable)
 {
-	ref->pfn = pfn;
-	ref->flags = E500_TLB_VALID;
+	tlbe->pfn = pfn;
+	tlbe->flags = E500_TLB_VALID;
 	if (writable)
-		ref->flags |= E500_TLB_WRITABLE;
+		tlbe->flags |= E500_TLB_WRITABLE;
 
 	/* Use guest supplied MAS2_G and MAS2_E */
-	ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;
+	tlbe->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;
 }
 
-static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)
+static inline void kvmppc_e500_tlbe_release(struct tlbe_priv *tlbe)
 {
-	if (ref->flags & E500_TLB_VALID) {
+	if (tlbe->flags & E500_TLB_VALID) {
 		/* FIXME: don't log bogus pfn for TLB1 */
-		trace_kvm_booke206_ref_release(ref->pfn, ref->flags);
-		ref->flags = 0;
+		trace_kvm_booke206_ref_release(tlbe->pfn, tlbe->flags);
+		tlbe->flags = 0;
 	}
 }
@@ -284,11 +284,8 @@ static void clear_tlb_privs(struct kvmppc_vcpu_e500 *vcpu_e500)
 	int i;
 
 	for (tlbsel = 0; tlbsel <= 1; tlbsel++) {
-		for (i = 0; i < vcpu_e500->gtlb_params[tlbsel].entries; i++) {
-			struct tlbe_ref *ref =
-				&vcpu_e500->gtlb_priv[tlbsel][i].ref;
-			kvmppc_e500_ref_release(ref);
-		}
+		for (i = 0; i < vcpu_e500->gtlb_params[tlbsel].entries; i++)
+			kvmppc_e500_tlbe_release(&vcpu_e500->gtlb_priv[tlbsel][i]);
 	}
 }
@@ -304,18 +301,18 @@ void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu)
 static void kvmppc_e500_setup_stlbe(
 	struct kvm_vcpu *vcpu,
 	struct kvm_book3e_206_tlb_entry *gtlbe,
-	int tsize, struct tlbe_ref *ref, u64 gvaddr,
+	int tsize, struct tlbe_priv *tlbe, u64 gvaddr,
 	struct kvm_book3e_206_tlb_entry *stlbe)
 {
-	kvm_pfn_t pfn = ref->pfn;
+	kvm_pfn_t pfn = tlbe->pfn;
 	u32 pr = vcpu->arch.shared->msr & MSR_PR;
-	bool writable = !!(ref->flags & E500_TLB_WRITABLE);
+	bool writable = !!(tlbe->flags & E500_TLB_WRITABLE);
 
-	BUG_ON(!(ref->flags & E500_TLB_VALID));
+	BUG_ON(!(tlbe->flags & E500_TLB_VALID));
 
 	/* Force IPROT=0 for all guest mappings. */
 	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
-	stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_MAS2_ATTR);
+	stlbe->mas2 = (gvaddr & MAS2_EPN) | (tlbe->flags & E500_TLB_MAS2_ATTR);
 	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
 			e500_shadow_mas3_attrib(gtlbe->mas7_3, writable, pr);
 }
@@ -323,7 +320,7 @@ static void kvmppc_e500_setup_stlbe(
 static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe,
 	int tlbsel, struct kvm_book3e_206_tlb_entry *stlbe,
-	struct tlbe_ref *ref)
+	struct tlbe_priv *tlbe)
 {
 	struct kvm_memory_slot *slot;
 	unsigned int psize;
@@ -455,9 +452,9 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 		}
 	}
 
-	kvmppc_e500_ref_setup(ref, gtlbe, pfn, wimg, writable);
+	kvmppc_e500_tlbe_setup(tlbe, gtlbe, pfn, wimg, writable);
 	kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,
-				ref, gvaddr, stlbe);
+				tlbe, gvaddr, stlbe);
 	writable = tlbe_is_writable(stlbe);
 
 	/* Clear i-cache for new pages */
@@ -474,17 +471,17 @@ static int kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500, int esel,
 				struct kvm_book3e_206_tlb_entry *stlbe)
 {
 	struct kvm_book3e_206_tlb_entry *gtlbe;
-	struct tlbe_ref *ref;
+	struct tlbe_priv *tlbe;
 	int stlbsel = 0;
 	int sesel = 0;
 	int r;
 
 	gtlbe = get_entry(vcpu_e500, 0, esel);
-	ref = &vcpu_e500->gtlb_priv[0][esel].ref;
+	tlbe = &vcpu_e500->gtlb_priv[0][esel];
 
 	r = kvmppc_e500_shadow_map(vcpu_e500, get_tlb_eaddr(gtlbe),
 				   get_tlb_raddr(gtlbe) >> PAGE_SHIFT,
-				   gtlbe, 0, stlbe, ref);
+				   gtlbe, 0, stlbe, tlbe);
 	if (r)
 		return r;
@@ -494,7 +491,7 @@ static int kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500, int esel,
 }
 
 static int kvmppc_e500_tlb1_map_tlb1(struct kvmppc_vcpu_e500 *vcpu_e500,
-				     struct tlbe_ref *ref,
+				     struct tlbe_priv *tlbe,
 				     int esel)
 {
 	unsigned int sesel = vcpu_e500->host_tlb1_nv++;
@@ -507,10 +504,10 @@ static int kvmppc_e500_tlb1_map_tlb1(struct kvmppc_vcpu_e500 *vcpu_e500,
 		vcpu_e500->g2h_tlb1_map[idx] &= ~(1ULL << sesel);
 	}
 
-	vcpu_e500->gtlb_priv[1][esel].ref.flags |= E500_TLB_BITMAP;
+	vcpu_e500->gtlb_priv[1][esel].flags |= E500_TLB_BITMAP;
 	vcpu_e500->g2h_tlb1_map[esel] |= (u64)1 << sesel;
 	vcpu_e500->h2g_tlb1_rmap[sesel] = esel + 1;
-	WARN_ON(!(ref->flags & E500_TLB_VALID));
+	WARN_ON(!(tlbe->flags & E500_TLB_VALID));
 
 	return sesel;
 }
@@ -522,24 +519,24 @@ static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 		u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe,
 		struct kvm_book3e_206_tlb_entry *stlbe, int esel)
 {
-	struct tlbe_ref *ref = &vcpu_e500->gtlb_priv[1][esel].ref;
+	struct tlbe_priv *tlbe = &vcpu_e500->gtlb_priv[1][esel];
 	int sesel;
 	int r;
 
 	r = kvmppc_e500_shadow_map(vcpu_e500, gvaddr, gfn, gtlbe, 1, stlbe,
-				   ref);
+				   tlbe);
 	if (r)
 		return r;
 
 	/* Use TLB0 when we can only map a page with 4k */
 	if (get_tlb_tsize(stlbe) == BOOK3E_PAGESZ_4K) {
-		vcpu_e500->gtlb_priv[1][esel].ref.flags |= E500_TLB_TLB0;
+		vcpu_e500->gtlb_priv[1][esel].flags |= E500_TLB_TLB0;
 		write_stlbe(vcpu_e500, gtlbe, stlbe, 0, 0);
 		return 0;
 	}
 
 	/* Otherwise map into TLB1 */
-	sesel = kvmppc_e500_tlb1_map_tlb1(vcpu_e500, ref, esel);
+	sesel = kvmppc_e500_tlb1_map_tlb1(vcpu_e500, tlbe, esel);
 	write_stlbe(vcpu_e500, gtlbe, stlbe, 1, sesel);
 
 	return 0;
@@ -561,11 +558,11 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr,
 		priv = &vcpu_e500->gtlb_priv[tlbsel][esel];
 
 		/* Triggers after clear_tlb_privs or on initial mapping */
-		if (!(priv->ref.flags & E500_TLB_VALID)) {
+		if (!(priv->flags & E500_TLB_VALID)) {
 			kvmppc_e500_tlb0_map(vcpu_e500, esel, &stlbe);
 		} else {
 			kvmppc_e500_setup_stlbe(vcpu, gtlbe, BOOK3E_PAGESZ_4K,
-						&priv->ref, eaddr, &stlbe);
+						priv, eaddr, &stlbe);
 			write_stlbe(vcpu_e500, gtlbe, &stlbe, 0, 0);
 		}
 		break;
@@ -13,6 +13,7 @@
 #include <linux/irqchip/riscv-imsic.h>
 #include <linux/irqdomain.h>
 #include <linux/kvm_host.h>
+#include <linux/nospec.h>
 #include <linux/percpu.h>
 #include <linux/spinlock.h>
 #include <asm/cpufeature.h>
@@ -182,9 +183,14 @@ int kvm_riscv_vcpu_aia_get_csr(struct kvm_vcpu *vcpu,
|
||||
unsigned long *out_val)
|
||||
{
|
||||
struct kvm_vcpu_aia_csr *csr = &vcpu->arch.aia_context.guest_csr;
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_aia_csr) / sizeof(unsigned long);
|
||||
|
||||
if (reg_num >= sizeof(struct kvm_riscv_aia_csr) / sizeof(unsigned long))
|
||||
if (!riscv_isa_extension_available(vcpu->arch.isa, SSAIA))
|
||||
return -ENOENT;
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
*out_val = 0;
|
||||
if (kvm_riscv_aia_available())
|
||||
@@ -198,9 +204,14 @@ int kvm_riscv_vcpu_aia_set_csr(struct kvm_vcpu *vcpu,
|
||||
unsigned long val)
|
||||
{
|
||||
struct kvm_vcpu_aia_csr *csr = &vcpu->arch.aia_context.guest_csr;
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_aia_csr) / sizeof(unsigned long);
|
||||
|
||||
if (reg_num >= sizeof(struct kvm_riscv_aia_csr) / sizeof(unsigned long))
|
||||
if (!riscv_isa_extension_available(vcpu->arch.isa, SSAIA))
|
||||
return -ENOENT;
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
if (kvm_riscv_aia_available()) {
|
||||
((unsigned long *)csr)[reg_num] = val;
|
||||
|
||||
@@ -10,6 +10,7 @@
 #include <linux/irqchip/riscv-aplic.h>
 #include <linux/kvm_host.h>
 #include <linux/math.h>
+#include <linux/nospec.h>
 #include <linux/spinlock.h>
 #include <linux/swab.h>
 #include <kvm/iodev.h>
@@ -45,7 +46,7 @@ static u32 aplic_read_sourcecfg(struct aplic *aplic, u32 irq)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return 0;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 	ret = irqd->sourcecfg;
@@ -61,7 +62,7 @@ static void aplic_write_sourcecfg(struct aplic *aplic, u32 irq, u32 val)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	if (val & APLIC_SOURCECFG_D)
 		val = 0;
@@ -81,7 +82,7 @@ static u32 aplic_read_target(struct aplic *aplic, u32 irq)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return 0;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 	ret = irqd->target;
@@ -97,7 +98,7 @@ static void aplic_write_target(struct aplic *aplic, u32 irq, u32 val)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	val &= APLIC_TARGET_EIID_MASK |
 	       (APLIC_TARGET_HART_IDX_MASK << APLIC_TARGET_HART_IDX_SHIFT) |
@@ -116,7 +117,7 @@ static bool aplic_read_pending(struct aplic *aplic, u32 irq)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return false;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 	ret = (irqd->state & APLIC_IRQ_STATE_PENDING) ? true : false;
@@ -132,7 +133,7 @@ static void aplic_write_pending(struct aplic *aplic, u32 irq, bool pending)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 
@@ -170,7 +171,7 @@ static bool aplic_read_enabled(struct aplic *aplic, u32 irq)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return false;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 	ret = (irqd->state & APLIC_IRQ_STATE_ENABLED) ? true : false;
@@ -186,7 +187,7 @@ static void aplic_write_enabled(struct aplic *aplic, u32 irq, bool enabled)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 	if (enabled)
@@ -205,7 +206,7 @@ static bool aplic_read_input(struct aplic *aplic, u32 irq)
 
 	if (!irq || aplic->nr_irqs <= irq)
 		return false;
-	irqd = &aplic->irqs[irq];
+	irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
 
@@ -254,7 +255,7 @@ static void aplic_update_irq_range(struct kvm *kvm, u32 first, u32 last)
 	for (irq = first; irq <= last; irq++) {
 		if (!irq || aplic->nr_irqs <= irq)
 			continue;
-		irqd = &aplic->irqs[irq];
+		irqd = &aplic->irqs[array_index_nospec(irq, aplic->nr_irqs)];
 
 		raw_spin_lock_irqsave(&irqd->lock, flags);
 
@@ -283,7 +284,7 @@ int kvm_riscv_aia_aplic_inject(struct kvm *kvm, u32 source, bool level)
 
 	if (!aplic || !source || (aplic->nr_irqs <= source))
 		return -ENODEV;
-	irqd = &aplic->irqs[source];
+	irqd = &aplic->irqs[array_index_nospec(source, aplic->nr_irqs)];
 	ie = (aplic->domaincfg & APLIC_DOMAINCFG_IE) ? true : false;
 
 	raw_spin_lock_irqsave(&irqd->lock, flags);
@@ -11,6 +11,7 @@
|
||||
#include <linux/irqchip/riscv-imsic.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <linux/cpufeature.h>
|
||||
|
||||
static int aia_create(struct kvm_device *dev, u32 type)
|
||||
{
|
||||
@@ -22,6 +23,9 @@ static int aia_create(struct kvm_device *dev, u32 type)
|
||||
if (irqchip_in_kernel(kvm))
|
||||
return -EEXIST;
|
||||
|
||||
if (!riscv_isa_extension_available(NULL, SSAIA))
|
||||
return -ENODEV;
|
||||
|
||||
ret = -EBUSY;
|
||||
if (kvm_trylock_all_vcpus(kvm))
|
||||
return ret;
|
||||
@@ -437,7 +441,7 @@ static int aia_get_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 
 static int aia_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 {
-	int nr_vcpus;
+	int nr_vcpus, r = -ENXIO;
 
 	switch (attr->group) {
 	case KVM_DEV_RISCV_AIA_GRP_CONFIG:
@@ -466,12 +470,18 @@ static int aia_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 		}
 		break;
 	case KVM_DEV_RISCV_AIA_GRP_APLIC:
-		return kvm_riscv_aia_aplic_has_attr(dev->kvm, attr->attr);
+		mutex_lock(&dev->kvm->lock);
+		r = kvm_riscv_aia_aplic_has_attr(dev->kvm, attr->attr);
+		mutex_unlock(&dev->kvm->lock);
+		break;
 	case KVM_DEV_RISCV_AIA_GRP_IMSIC:
-		return kvm_riscv_aia_imsic_has_attr(dev->kvm, attr->attr);
+		mutex_lock(&dev->kvm->lock);
+		r = kvm_riscv_aia_imsic_has_attr(dev->kvm, attr->attr);
+		mutex_unlock(&dev->kvm->lock);
+		break;
 	}
 
-	return -ENXIO;
+	return r;
 }
 
 struct kvm_device_ops kvm_riscv_aia_device_ops = {
@@ -908,6 +908,10 @@ int kvm_riscv_vcpu_aia_imsic_rmw(struct kvm_vcpu *vcpu, unsigned long isel,
 	int r, rc = KVM_INSN_CONTINUE_NEXT_SEPC;
 	struct imsic *imsic = vcpu->arch.aia_context.imsic_state;
 
+	/* If IMSIC vCPU state not initialized then forward to user space */
+	if (!imsic)
+		return KVM_INSN_EXIT_TO_USER_SPACE;
+
 	if (isel == KVM_RISCV_AIA_IMSIC_TOPEI) {
 		/* Read pending and enabled interrupt with highest priority */
 		topei = imsic_mrif_topei(imsic->swfile, imsic->nr_eix,
@@ -245,6 +245,7 @@ out:
 bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	struct kvm_gstage gstage;
+	bool mmu_locked;
 
 	if (!kvm->arch.pgd)
 		return false;
@@ -253,9 +254,12 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 	gstage.flags = 0;
 	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
 	gstage.pgd = kvm->arch.pgd;
+	mmu_locked = spin_trylock(&kvm->mmu_lock);
 	kvm_riscv_gstage_unmap_range(&gstage, range->start << PAGE_SHIFT,
 				     (range->end - range->start) << PAGE_SHIFT,
 				     range->may_block);
+	if (mmu_locked)
+		spin_unlock(&kvm->mmu_lock);
 	return false;
 }
 
@@ -535,7 +539,7 @@ int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 		goto out_unlock;
 
 	/* Check if we are backed by a THP and thus use block mapping if possible */
-	if (vma_pagesize == PAGE_SIZE)
+	if (!logging && (vma_pagesize == PAGE_SIZE))
 		vma_pagesize = transparent_hugepage_adjust(kvm, memslot, hva, &hfn, &gpa);
 
 	if (writable) {
@@ -24,7 +24,7 @@
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, ecall_exit_stat),
 	STATS_DESC_COUNTER(VCPU, wfi_exit_stat),
@@ -10,6 +10,7 @@
|
||||
#include <linux/errno.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/nospec.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <asm/cpufeature.h>
|
||||
|
||||
@@ -93,9 +94,11 @@ int kvm_riscv_vcpu_get_reg_fp(struct kvm_vcpu *vcpu,
|
||||
if (reg_num == KVM_REG_RISCV_FP_F_REG(fcsr))
|
||||
reg_val = &cntx->fp.f.fcsr;
|
||||
else if ((KVM_REG_RISCV_FP_F_REG(f[0]) <= reg_num) &&
|
||||
reg_num <= KVM_REG_RISCV_FP_F_REG(f[31]))
|
||||
reg_num <= KVM_REG_RISCV_FP_F_REG(f[31])) {
|
||||
reg_num = array_index_nospec(reg_num,
|
||||
ARRAY_SIZE(cntx->fp.f.f));
|
||||
reg_val = &cntx->fp.f.f[reg_num];
|
||||
else
|
||||
} else
|
||||
return -ENOENT;
|
||||
} else if ((rtype == KVM_REG_RISCV_FP_D) &&
|
||||
riscv_isa_extension_available(vcpu->arch.isa, d)) {
|
||||
@@ -107,6 +110,8 @@ int kvm_riscv_vcpu_get_reg_fp(struct kvm_vcpu *vcpu,
|
||||
reg_num <= KVM_REG_RISCV_FP_D_REG(f[31])) {
|
||||
if (KVM_REG_SIZE(reg->id) != sizeof(u64))
|
||||
return -EINVAL;
|
||||
reg_num = array_index_nospec(reg_num,
|
||||
ARRAY_SIZE(cntx->fp.d.f));
|
||||
reg_val = &cntx->fp.d.f[reg_num];
|
||||
} else
|
||||
return -ENOENT;
|
||||
@@ -138,9 +143,11 @@ int kvm_riscv_vcpu_set_reg_fp(struct kvm_vcpu *vcpu,
|
||||
if (reg_num == KVM_REG_RISCV_FP_F_REG(fcsr))
|
||||
reg_val = &cntx->fp.f.fcsr;
|
||||
else if ((KVM_REG_RISCV_FP_F_REG(f[0]) <= reg_num) &&
|
||||
reg_num <= KVM_REG_RISCV_FP_F_REG(f[31]))
|
||||
reg_num <= KVM_REG_RISCV_FP_F_REG(f[31])) {
|
||||
reg_num = array_index_nospec(reg_num,
|
||||
ARRAY_SIZE(cntx->fp.f.f));
|
||||
reg_val = &cntx->fp.f.f[reg_num];
|
||||
else
|
||||
} else
|
||||
return -ENOENT;
|
||||
} else if ((rtype == KVM_REG_RISCV_FP_D) &&
|
||||
riscv_isa_extension_available(vcpu->arch.isa, d)) {
|
||||
@@ -152,6 +159,8 @@ int kvm_riscv_vcpu_set_reg_fp(struct kvm_vcpu *vcpu,
|
||||
reg_num <= KVM_REG_RISCV_FP_D_REG(f[31])) {
|
||||
if (KVM_REG_SIZE(reg->id) != sizeof(u64))
|
||||
return -EINVAL;
|
||||
reg_num = array_index_nospec(reg_num,
|
||||
ARRAY_SIZE(cntx->fp.d.f));
|
||||
reg_val = &cntx->fp.d.f[reg_num];
|
||||
} else
|
||||
return -ENOENT;
|
||||
|
||||
@@ -10,6 +10,7 @@
|
||||
#include <linux/bitops.h>
|
||||
#include <linux/errno.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/nospec.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <asm/cacheflush.h>
|
||||
@@ -127,6 +128,7 @@ static int kvm_riscv_vcpu_isa_check_host(unsigned long kvm_ext, unsigned long *g
|
||||
kvm_ext >= ARRAY_SIZE(kvm_isa_ext_arr))
|
||||
return -ENOENT;
|
||||
|
||||
kvm_ext = array_index_nospec(kvm_ext, ARRAY_SIZE(kvm_isa_ext_arr));
|
||||
*guest_ext = kvm_isa_ext_arr[kvm_ext];
|
||||
switch (*guest_ext) {
|
||||
case RISCV_ISA_EXT_SMNPM:
|
||||
@@ -443,13 +445,16 @@ static int kvm_riscv_vcpu_get_reg_core(struct kvm_vcpu *vcpu,
|
||||
unsigned long reg_num = reg->id & ~(KVM_REG_ARCH_MASK |
|
||||
KVM_REG_SIZE_MASK |
|
||||
KVM_REG_RISCV_CORE);
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_core) / sizeof(unsigned long);
|
||||
unsigned long reg_val;
|
||||
|
||||
if (KVM_REG_SIZE(reg->id) != sizeof(unsigned long))
|
||||
return -EINVAL;
|
||||
if (reg_num >= sizeof(struct kvm_riscv_core) / sizeof(unsigned long))
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
if (reg_num == KVM_REG_RISCV_CORE_REG(regs.pc))
|
||||
reg_val = cntx->sepc;
|
||||
else if (KVM_REG_RISCV_CORE_REG(regs.pc) < reg_num &&
|
||||
@@ -476,13 +481,16 @@ static int kvm_riscv_vcpu_set_reg_core(struct kvm_vcpu *vcpu,
|
||||
unsigned long reg_num = reg->id & ~(KVM_REG_ARCH_MASK |
|
||||
KVM_REG_SIZE_MASK |
|
||||
KVM_REG_RISCV_CORE);
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_core) / sizeof(unsigned long);
|
||||
unsigned long reg_val;
|
||||
|
||||
if (KVM_REG_SIZE(reg->id) != sizeof(unsigned long))
|
||||
return -EINVAL;
|
||||
if (reg_num >= sizeof(struct kvm_riscv_core) / sizeof(unsigned long))
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
if (copy_from_user(®_val, uaddr, KVM_REG_SIZE(reg->id)))
|
||||
return -EFAULT;
|
||||
|
||||
@@ -507,10 +515,13 @@ static int kvm_riscv_vcpu_general_get_csr(struct kvm_vcpu *vcpu,
|
||||
unsigned long *out_val)
|
||||
{
|
||||
struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_csr) / sizeof(unsigned long);
|
||||
|
||||
if (reg_num >= sizeof(struct kvm_riscv_csr) / sizeof(unsigned long))
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
if (reg_num == KVM_REG_RISCV_CSR_REG(sip)) {
|
||||
kvm_riscv_vcpu_flush_interrupts(vcpu);
|
||||
*out_val = (csr->hvip >> VSIP_TO_HVIP_SHIFT) & VSIP_VALID_MASK;
|
||||
@@ -526,10 +537,13 @@ static int kvm_riscv_vcpu_general_set_csr(struct kvm_vcpu *vcpu,
|
||||
unsigned long reg_val)
|
||||
{
|
||||
struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_csr) / sizeof(unsigned long);
|
||||
|
||||
if (reg_num >= sizeof(struct kvm_riscv_csr) / sizeof(unsigned long))
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
if (reg_num == KVM_REG_RISCV_CSR_REG(sip)) {
|
||||
reg_val &= VSIP_VALID_MASK;
|
||||
reg_val <<= VSIP_TO_HVIP_SHIFT;
|
||||
@@ -548,10 +562,15 @@ static inline int kvm_riscv_vcpu_smstateen_set_csr(struct kvm_vcpu *vcpu,
|
||||
unsigned long reg_val)
|
||||
{
|
||||
struct kvm_vcpu_smstateen_csr *csr = &vcpu->arch.smstateen_csr;
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_smstateen_csr) /
|
||||
sizeof(unsigned long);
|
||||
|
||||
if (reg_num >= sizeof(struct kvm_riscv_smstateen_csr) /
|
||||
sizeof(unsigned long))
|
||||
return -EINVAL;
|
||||
if (!riscv_isa_extension_available(vcpu->arch.isa, SMSTATEEN))
|
||||
return -ENOENT;
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
((unsigned long *)csr)[reg_num] = reg_val;
|
||||
return 0;
|
||||
@@ -562,10 +581,15 @@ static int kvm_riscv_vcpu_smstateen_get_csr(struct kvm_vcpu *vcpu,
|
||||
unsigned long *out_val)
|
||||
{
|
||||
struct kvm_vcpu_smstateen_csr *csr = &vcpu->arch.smstateen_csr;
|
||||
unsigned long regs_max = sizeof(struct kvm_riscv_smstateen_csr) /
|
||||
sizeof(unsigned long);
|
||||
|
||||
if (reg_num >= sizeof(struct kvm_riscv_smstateen_csr) /
|
||||
sizeof(unsigned long))
|
||||
return -EINVAL;
|
||||
if (!riscv_isa_extension_available(vcpu->arch.isa, SMSTATEEN))
|
||||
return -ENOENT;
|
||||
if (reg_num >= regs_max)
|
||||
return -ENOENT;
|
||||
|
||||
reg_num = array_index_nospec(reg_num, regs_max);
|
||||
|
||||
*out_val = ((unsigned long *)csr)[reg_num];
|
||||
return 0;
|
||||
@@ -595,10 +619,7 @@ static int kvm_riscv_vcpu_get_reg_csr(struct kvm_vcpu *vcpu,
|
||||
rc = kvm_riscv_vcpu_aia_get_csr(vcpu, reg_num, ®_val);
|
||||
break;
|
||||
case KVM_REG_RISCV_CSR_SMSTATEEN:
|
||||
rc = -EINVAL;
|
||||
if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SMSTATEEN))
|
||||
rc = kvm_riscv_vcpu_smstateen_get_csr(vcpu, reg_num,
|
||||
®_val);
|
||||
rc = kvm_riscv_vcpu_smstateen_get_csr(vcpu, reg_num, ®_val);
|
||||
break;
|
||||
default:
|
||||
rc = -ENOENT;
|
||||
@@ -640,10 +661,7 @@ static int kvm_riscv_vcpu_set_reg_csr(struct kvm_vcpu *vcpu,
|
||||
rc = kvm_riscv_vcpu_aia_set_csr(vcpu, reg_num, reg_val);
|
||||
break;
|
||||
case KVM_REG_RISCV_CSR_SMSTATEEN:
|
||||
rc = -EINVAL;
|
||||
if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SMSTATEEN))
|
||||
rc = kvm_riscv_vcpu_smstateen_set_csr(vcpu, reg_num,
|
||||
reg_val);
|
||||
rc = kvm_riscv_vcpu_smstateen_set_csr(vcpu, reg_num, reg_val);
|
||||
break;
|
||||
default:
|
||||
rc = -ENOENT;
|
||||
|
||||
@@ -10,6 +10,7 @@
|
||||
#include <linux/errno.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/nospec.h>
|
||||
#include <linux/perf/riscv_pmu.h>
|
||||
#include <asm/csr.h>
|
||||
#include <asm/kvm_vcpu_sbi.h>
|
||||
@@ -87,7 +88,8 @@ static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc)
|
||||
|
||||
static u64 kvm_pmu_get_perf_event_hw_config(u32 sbi_event_code)
|
||||
{
|
||||
return hw_event_perf_map[sbi_event_code];
|
||||
return hw_event_perf_map[array_index_nospec(sbi_event_code,
|
||||
SBI_PMU_HW_GENERAL_MAX)];
|
||||
}
|
||||
|
||||
static u64 kvm_pmu_get_perf_event_cache_config(u32 sbi_event_code)
|
||||
@@ -218,6 +220,7 @@ static int pmu_fw_ctr_read_hi(struct kvm_vcpu *vcpu, unsigned long cidx,
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
cidx = array_index_nospec(cidx, RISCV_KVM_MAX_COUNTERS);
|
||||
pmc = &kvpmu->pmc[cidx];
|
||||
|
||||
if (pmc->cinfo.type != SBI_PMU_CTR_TYPE_FW)
|
||||
@@ -244,6 +247,7 @@ static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx,
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
cidx = array_index_nospec(cidx, RISCV_KVM_MAX_COUNTERS);
|
||||
pmc = &kvpmu->pmc[cidx];
|
||||
|
||||
if (pmc->cinfo.type == SBI_PMU_CTR_TYPE_FW) {
|
||||
@@ -520,11 +524,12 @@ int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx,
|
||||
{
|
||||
struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu);
|
||||
|
||||
if (cidx > RISCV_KVM_MAX_COUNTERS || cidx == 1) {
|
||||
if (cidx >= RISCV_KVM_MAX_COUNTERS || cidx == 1) {
|
||||
retdata->err_val = SBI_ERR_INVALID_PARAM;
|
||||
return 0;
|
||||
}
|
||||
|
||||
cidx = array_index_nospec(cidx, RISCV_KVM_MAX_COUNTERS);
|
||||
retdata->out_val = kvpmu->pmc[cidx].cinfo.value;
|
||||
|
||||
return 0;
|
||||
@@ -559,7 +564,8 @@ int kvm_riscv_vcpu_pmu_ctr_start(struct kvm_vcpu *vcpu, unsigned long ctr_base,
|
||||
}
|
||||
/* Start the counters that have been configured and requested by the guest */
|
||||
for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
|
||||
pmc_index = i + ctr_base;
|
||||
pmc_index = array_index_nospec(i + ctr_base,
|
||||
RISCV_KVM_MAX_COUNTERS);
|
||||
if (!test_bit(pmc_index, kvpmu->pmc_in_use))
|
||||
continue;
|
||||
/* The guest started the counter again. Reset the overflow status */
|
||||
@@ -630,7 +636,8 @@ int kvm_riscv_vcpu_pmu_ctr_stop(struct kvm_vcpu *vcpu, unsigned long ctr_base,
|
||||
|
||||
/* Stop the counters that have been configured and requested by the guest */
|
||||
for_each_set_bit(i, &ctr_mask, RISCV_MAX_COUNTERS) {
|
||||
pmc_index = i + ctr_base;
|
||||
pmc_index = array_index_nospec(i + ctr_base,
|
||||
RISCV_KVM_MAX_COUNTERS);
|
||||
if (!test_bit(pmc_index, kvpmu->pmc_in_use))
|
||||
continue;
|
||||
pmc = &kvpmu->pmc[pmc_index];
|
||||
@@ -761,6 +768,7 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba
|
||||
}
|
||||
}
|
||||
|
||||
ctr_idx = array_index_nospec(ctr_idx, RISCV_KVM_MAX_COUNTERS);
|
||||
pmc = &kvpmu->pmc[ctr_idx];
|
||||
pmc->idx = ctr_idx;
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@
|
||||
#include <linux/kvm_host.h>
|
||||
#include <asm/kvm_mmu.h>
|
||||
|
||||
const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
|
||||
const struct kvm_stats_desc kvm_vm_stats_desc[] = {
|
||||
KVM_GENERIC_VM_STATS()
|
||||
};
|
||||
static_assert(ARRAY_SIZE(kvm_vm_stats_desc) ==
|
||||
|
||||
@@ -65,7 +65,7 @@
|
||||
#define VCPU_IRQS_MAX_BUF (sizeof(struct kvm_s390_irq) * \
|
||||
(KVM_MAX_VCPUS + LOCAL_IRQS))
|
||||
|
||||
const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
|
||||
const struct kvm_stats_desc kvm_vm_stats_desc[] = {
|
||||
KVM_GENERIC_VM_STATS(),
|
||||
STATS_DESC_COUNTER(VM, inject_io),
|
||||
STATS_DESC_COUNTER(VM, inject_float_mchk),
|
||||
@@ -91,7 +91,7 @@ const struct kvm_stats_header kvm_vm_stats_header = {
|
||||
sizeof(kvm_vm_stats_desc),
|
||||
};
|
||||
|
||||
const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
|
||||
const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
|
||||
KVM_GENERIC_VCPU_STATS(),
|
||||
STATS_DESC_COUNTER(VCPU, exit_userspace),
|
||||
STATS_DESC_COUNTER(VCPU, exit_null),
|
||||
|
||||
@@ -2485,7 +2485,8 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
|
||||
KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS | \
|
||||
KVM_X86_QUIRK_SLOT_ZAP_ALL | \
|
||||
KVM_X86_QUIRK_STUFF_FEATURE_MSRS | \
|
||||
KVM_X86_QUIRK_IGNORE_GUEST_PAT)
|
||||
KVM_X86_QUIRK_IGNORE_GUEST_PAT | \
|
||||
KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM)
|
||||
|
||||
#define KVM_X86_CONDITIONAL_QUIRKS \
|
||||
(KVM_X86_QUIRK_CD_NW_CLEARED | \
|
||||
|
||||
@@ -476,6 +476,7 @@ struct kvm_sync_regs {
|
||||
#define KVM_X86_QUIRK_SLOT_ZAP_ALL (1 << 7)
|
||||
#define KVM_X86_QUIRK_STUFF_FEATURE_MSRS (1 << 8)
|
||||
#define KVM_X86_QUIRK_IGNORE_GUEST_PAT (1 << 9)
|
||||
#define KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM (1 << 10)
|
||||
|
||||
#define KVM_STATE_NESTED_FORMAT_VMX 0
|
||||
#define KVM_STATE_NESTED_FORMAT_SVM 1
|
||||
|
||||
@@ -776,7 +776,10 @@ do { \
|
||||
#define SYNTHESIZED_F(name) \
|
||||
({ \
|
||||
kvm_cpu_cap_synthesized |= feature_bit(name); \
|
||||
F(name); \
|
||||
\
|
||||
BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES); \
|
||||
if (boot_cpu_has(X86_FEATURE_##name)) \
|
||||
F(name); \
|
||||
})
|
||||
|
||||
/*
|
||||
|
||||
@@ -1981,16 +1981,17 @@ int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
|
||||
if (entries[i] == KVM_HV_TLB_FLUSHALL_ENTRY)
|
||||
goto out_flush_all;
|
||||
|
||||
if (is_noncanonical_invlpg_address(entries[i], vcpu))
|
||||
continue;
|
||||
|
||||
/*
|
||||
* Lower 12 bits of 'address' encode the number of additional
|
||||
* pages to flush.
|
||||
*/
|
||||
gva = entries[i] & PAGE_MASK;
|
||||
for (j = 0; j < (entries[i] & ~PAGE_MASK) + 1; j++)
|
||||
for (j = 0; j < (entries[i] & ~PAGE_MASK) + 1; j++) {
|
||||
if (is_noncanonical_invlpg_address(gva + j * PAGE_SIZE, vcpu))
|
||||
continue;
|
||||
|
||||
kvm_x86_call(flush_tlb_gva)(vcpu, gva + j * PAGE_SIZE);
|
||||
}
|
||||
|
||||
++vcpu->stat.tlb_flush;
|
||||
}
|
||||
|
||||
@@ -321,7 +321,8 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	idx = srcu_read_lock(&kvm->irq_srcu);
 	gsi = kvm_irq_map_chip_pin(kvm, irqchip, pin);
 	if (gsi != -1)
-		hlist_for_each_entry_rcu(kimn, &ioapic->mask_notifier_list, link)
+		hlist_for_each_entry_srcu(kimn, &ioapic->mask_notifier_list, link,
+					  srcu_read_lock_held(&kvm->irq_srcu))
 			if (kimn->irq == gsi)
 				kimn->func(kimn, mask);
 	srcu_read_unlock(&kvm->irq_srcu, idx);
@@ -189,12 +189,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 
 	vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
 
 	vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
 	vmcb->control.avic_physical_id |= avic_get_max_physical_id(vcpu);
 
 	vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
 
 	svm_clr_intercept(svm, INTERCEPT_CR8_WRITE);
 
 	/*
 	 * Note: KVM supports hybrid-AVIC mode, where KVM emulates x2APIC MSR
 	 * accesses, while interrupt injection to a running vCPU can be
@@ -226,6 +226,9 @@ static void avic_deactivate_vmcb(struct vcpu_svm *svm)
 	vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
 	vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
 
+	if (!sev_es_guest(svm->vcpu.kvm))
+		svm_set_intercept(svm, INTERCEPT_CR8_WRITE);
+
 	/*
 	 * If running nested and the guest uses its own MSR bitmap, there
 	 * is no need to update L0's msr bitmap
@@ -368,7 +371,7 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
 	vmcb->control.avic_physical_id = __sme_set(__pa(kvm_svm->avic_physical_id_table));
 	vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE;
 
-	if (kvm_apicv_activated(svm->vcpu.kvm))
+	if (kvm_vcpu_apicv_active(&svm->vcpu))
 		avic_activate_vmcb(svm);
 	else
 		avic_deactivate_vmcb(svm);
@@ -418,6 +418,15 @@ static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
 	return __nested_vmcb_check_controls(vcpu, ctl);
 }
 
+int nested_svm_check_cached_vmcb12(struct kvm_vcpu *vcpu)
+{
+	if (!nested_vmcb_check_save(vcpu) ||
+	    !nested_vmcb_check_controls(vcpu))
+		return -EINVAL;
+
+	return 0;
+}
+
 /*
  * If a feature is not advertised to L1, clear the corresponding vmcb12
  * intercept.
@@ -1028,8 +1037,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
 
-	if (!nested_vmcb_check_save(vcpu) ||
-	    !nested_vmcb_check_controls(vcpu)) {
+	if (nested_svm_check_cached_vmcb12(vcpu) < 0) {
 		vmcb12->control.exit_code    = SVM_EXIT_ERR;
 		vmcb12->control.exit_info_1  = 0;
 		vmcb12->control.exit_info_2  = 0;
@@ -1077,8 +1077,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu, bool init_event)
 	svm_set_intercept(svm, INTERCEPT_CR0_WRITE);
 	svm_set_intercept(svm, INTERCEPT_CR3_WRITE);
 	svm_set_intercept(svm, INTERCEPT_CR4_WRITE);
-	if (!kvm_vcpu_apicv_active(vcpu))
-		svm_set_intercept(svm, INTERCEPT_CR8_WRITE);
+	svm_set_intercept(svm, INTERCEPT_CR8_WRITE);
 
 	set_dr_intercepts(svm);
 
@@ -1189,7 +1188,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu, bool init_event)
 	if (guest_cpu_cap_has(vcpu, X86_FEATURE_ERAPS))
 		svm->vmcb->control.erap_ctl |= ERAP_CONTROL_ALLOW_LARGER_RAP;
 
-	if (kvm_vcpu_apicv_active(vcpu))
+	if (enable_apicv && irqchip_in_kernel(vcpu->kvm))
 		avic_init_vmcb(svm, vmcb);
 
 	if (vnmi)
@@ -2674,9 +2673,11 @@ static int dr_interception(struct kvm_vcpu *vcpu)
 
 static int cr8_write_interception(struct kvm_vcpu *vcpu)
 {
-	u8 cr8_prev = kvm_get_cr8(vcpu);
 	int r;
 
+	u8 cr8_prev = kvm_get_cr8(vcpu);
+	WARN_ON_ONCE(kvm_vcpu_apicv_active(vcpu));
+
 	/* instruction emulation calls kvm_set_cr8() */
 	r = cr_interception(vcpu);
 	if (lapic_in_kernel(vcpu))
@@ -4879,11 +4880,15 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const union kvm_smram *smram)
 	vmcb12 = map.hva;
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
-	ret = enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa, vmcb12, false);
 
-	if (ret)
+	if (nested_svm_check_cached_vmcb12(vcpu) < 0)
 		goto unmap_save;
 
+	if (enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa,
+				 vmcb12, false) != 0)
+		goto unmap_save;
+
+	ret = 0;
 	svm->nested.nested_run_pending = 1;
 
 unmap_save:
@@ -797,6 +797,7 @@ static inline int nested_svm_simple_vmexit(struct vcpu_svm *svm, u32 exit_code)
 
 int nested_svm_exit_handled(struct vcpu_svm *svm);
 int nested_svm_check_permissions(struct kvm_vcpu *vcpu);
+int nested_svm_check_cached_vmcb12(struct kvm_vcpu *vcpu);
 int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
 			       bool has_error_code, u32 error_code);
 int nested_svm_exit_special(struct vcpu_svm *svm);
@@ -3300,10 +3300,24 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
 	if (CC(vmcs12->guest_cr4 & X86_CR4_CET && !(vmcs12->guest_cr0 & X86_CR0_WP)))
 		return -EINVAL;
 
-	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
-	    (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
-	     CC(!vmx_is_valid_debugctl(vcpu, vmcs12->guest_ia32_debugctl, false))))
-		return -EINVAL;
+	if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) {
+		u64 debugctl = vmcs12->guest_ia32_debugctl;
+
+		/*
+		 * FREEZE_IN_SMM is not virtualized, but allow L1 to set it in
+		 * vmcs12's DEBUGCTL under a quirk for backwards compatibility.
+		 * Note that the quirk only relaxes the consistency check. The
+		 * vmcs02 bit is still under the control of the host. In
+		 * particular, if a host administrator decides to clear the bit,
+		 * then L1 has no say in the matter.
+		 */
+		if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM))
+			debugctl &= ~DEBUGCTLMSR_FREEZE_IN_SMM;
+
+		if (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
+		    CC(!vmx_is_valid_debugctl(vcpu, debugctl, false)))
+			return -EINVAL;
+	}
 
 	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT) &&
 	    CC(!kvm_pat_valid(vmcs12->guest_ia32_pat)))
@@ -6842,13 +6856,34 @@ void vmx_leave_nested(struct kvm_vcpu *vcpu)
 	free_nested(vcpu);
 }
 
+int nested_vmx_check_restored_vmcs12(struct kvm_vcpu *vcpu)
+{
+	enum vm_entry_failure_code ignored;
+	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+
+	if (nested_cpu_has_shadow_vmcs(vmcs12) &&
+	    vmcs12->vmcs_link_pointer != INVALID_GPA) {
+		struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
+
+		if (shadow_vmcs12->hdr.revision_id != VMCS12_REVISION ||
+		    !shadow_vmcs12->hdr.shadow_vmcs)
+			return -EINVAL;
+	}
+
+	if (nested_vmx_check_controls(vcpu, vmcs12) ||
+	    nested_vmx_check_host_state(vcpu, vmcs12) ||
+	    nested_vmx_check_guest_state(vcpu, vmcs12, &ignored))
+		return -EINVAL;
+
+	return 0;
+}
+
 static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 				struct kvm_nested_state __user *user_kvm_nested_state,
 				struct kvm_nested_state *kvm_state)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmcs12 *vmcs12;
 	enum vm_entry_failure_code ignored;
 	struct kvm_vmx_nested_state_data __user *user_vmx_nested_state =
 		&user_kvm_nested_state->data.vmx[0];
 	int ret;
@@ -6979,25 +7014,20 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 	vmx->nested.mtf_pending =
 		!!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
 
-	ret = -EINVAL;
 	if (nested_cpu_has_shadow_vmcs(vmcs12) &&
 	    vmcs12->vmcs_link_pointer != INVALID_GPA) {
 		struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
 
+		ret = -EINVAL;
 		if (kvm_state->size <
 		    sizeof(*kvm_state) +
 		    sizeof(user_vmx_nested_state->vmcs12) + sizeof(*shadow_vmcs12))
 			goto error_guest_mode;
 
+		ret = -EFAULT;
 		if (copy_from_user(shadow_vmcs12,
 				   user_vmx_nested_state->shadow_vmcs12,
-				   sizeof(*shadow_vmcs12))) {
-			ret = -EFAULT;
+				   sizeof(*shadow_vmcs12)))
 			goto error_guest_mode;
-		}
-
-		if (shadow_vmcs12->hdr.revision_id != VMCS12_REVISION ||
-		    !shadow_vmcs12->hdr.shadow_vmcs)
-			goto error_guest_mode;
 	}
 
@@ -7008,9 +7038,8 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 			kvm_state->hdr.vmx.preemption_timer_deadline;
 	}
 
-	if (nested_vmx_check_controls(vcpu, vmcs12) ||
-	    nested_vmx_check_host_state(vcpu, vmcs12) ||
-	    nested_vmx_check_guest_state(vcpu, vmcs12, &ignored))
+	ret = nested_vmx_check_restored_vmcs12(vcpu);
+	if (ret < 0)
 		goto error_guest_mode;
 
 	vmx->nested.dirty_vmcs12 = true;
@@ -22,6 +22,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
 void nested_vmx_hardware_unsetup(void);
 __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
 void nested_vmx_set_vmcs_shadowing_bitmap(void);
+int nested_vmx_check_restored_vmcs12(struct kvm_vcpu *vcpu);
 void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu);
 enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 							bool from_vmentry);
@@ -1149,7 +1149,7 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 	}
 
 	vmx_add_auto_msr(&m->guest, msr, guest_val, VM_ENTRY_MSR_LOAD_COUNT, kvm);
-	vmx_add_auto_msr(&m->guest, msr, host_val, VM_EXIT_MSR_LOAD_COUNT, kvm);
+	vmx_add_auto_msr(&m->host, msr, host_val, VM_EXIT_MSR_LOAD_COUNT, kvm);
 }
 
 static bool update_transition_efer(struct vcpu_vmx *vmx)
@@ -8528,9 +8528,13 @@ int vmx_leave_smm(struct kvm_vcpu *vcpu, const union kvm_smram *smram)
 	}
 
 	if (vmx->nested.smm.guest_mode) {
+		/* Triple fault if the state is invalid. */
+		if (nested_vmx_check_restored_vmcs12(vcpu) < 0)
+			return 1;
+
 		ret = nested_vmx_enter_non_root_mode(vcpu, false);
-		if (ret)
-			return ret;
+		if (ret != NVMX_VMENTRY_SUCCESS)
+			return 1;
 
 		vmx->nested.nested_run_pending = 1;
 		vmx->nested.smm.guest_mode = false;
@@ -243,7 +243,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(enable_ipiv);
 bool __read_mostly enable_device_posted_irqs = true;
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(enable_device_posted_irqs);
 
-const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
+const struct kvm_stats_desc kvm_vm_stats_desc[] = {
 	KVM_GENERIC_VM_STATS(),
 	STATS_DESC_COUNTER(VM, mmu_shadow_zapped),
 	STATS_DESC_COUNTER(VM, mmu_pte_write),
@@ -269,7 +269,7 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 		sizeof(kvm_vm_stats_desc),
 };
 
-const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	KVM_GENERIC_VCPU_STATS(),
 	STATS_DESC_COUNTER(VCPU, pf_taken),
 	STATS_DESC_COUNTER(VCPU, pf_fixed),
@@ -1940,56 +1940,43 @@ enum kvm_stat_kind {
 
 struct kvm_stat_data {
 	struct kvm *kvm;
-	const struct _kvm_stats_desc *desc;
+	const struct kvm_stats_desc *desc;
 	enum kvm_stat_kind kind;
 };
 
-struct _kvm_stats_desc {
-	struct kvm_stats_desc desc;
-	char name[KVM_STATS_NAME_SIZE];
-};
-
-#define STATS_DESC_COMMON(type, unit, base, exp, sz, bsz)	       \
-	.flags = type | unit | base |				       \
-		 BUILD_BUG_ON_ZERO(type & ~KVM_STATS_TYPE_MASK) |      \
-		 BUILD_BUG_ON_ZERO(unit & ~KVM_STATS_UNIT_MASK) |      \
-		 BUILD_BUG_ON_ZERO(base & ~KVM_STATS_BASE_MASK),       \
-	.exponent = exp,					       \
-	.size = sz,						       \
+#define STATS_DESC_COMMON(type, unit, base, exp, sz, bsz)	\
+	.flags = type | unit | base |				\
+		 BUILD_BUG_ON_ZERO(type & ~KVM_STATS_TYPE_MASK) | \
+		 BUILD_BUG_ON_ZERO(unit & ~KVM_STATS_UNIT_MASK) | \
+		 BUILD_BUG_ON_ZERO(base & ~KVM_STATS_BASE_MASK), \
+	.exponent = exp,					\
+	.size = sz,						\
 	.bucket_size = bsz
 
-#define VM_GENERIC_STATS_DESC(stat, type, unit, base, exp, sz, bsz)	       \
-	{								       \
-		{							       \
-			STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),     \
-			.offset = offsetof(struct kvm_vm_stat, generic.stat)   \
-		},							       \
-		.name = #stat,						       \
-	}
-#define VCPU_GENERIC_STATS_DESC(stat, type, unit, base, exp, sz, bsz)	       \
-	{								       \
-		{							       \
-			STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),     \
-			.offset = offsetof(struct kvm_vcpu_stat, generic.stat) \
-		},							       \
-		.name = #stat,						       \
-	}
-#define VM_STATS_DESC(stat, type, unit, base, exp, sz, bsz)		       \
-	{								       \
-		{							       \
-			STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),     \
-			.offset = offsetof(struct kvm_vm_stat, stat)	       \
-		},							       \
-		.name = #stat,						       \
-	}
-#define VCPU_STATS_DESC(stat, type, unit, base, exp, sz, bsz)		       \
-	{								       \
-		{							       \
-			STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),     \
-			.offset = offsetof(struct kvm_vcpu_stat, stat)	       \
-		},							       \
-		.name = #stat,						       \
-	}
+#define VM_GENERIC_STATS_DESC(stat, type, unit, base, exp, sz, bsz)	\
+	{								\
+		STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),	\
+		.offset = offsetof(struct kvm_vm_stat, generic.stat),	\
+		.name = #stat,						\
+	}
+#define VCPU_GENERIC_STATS_DESC(stat, type, unit, base, exp, sz, bsz)	\
+	{								\
+		STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),	\
+		.offset = offsetof(struct kvm_vcpu_stat, generic.stat),	\
+		.name = #stat,						\
+	}
+#define VM_STATS_DESC(stat, type, unit, base, exp, sz, bsz)		\
+	{								\
+		STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),	\
+		.offset = offsetof(struct kvm_vm_stat, stat),		\
+		.name = #stat,						\
+	}
+#define VCPU_STATS_DESC(stat, type, unit, base, exp, sz, bsz)		\
+	{								\
+		STATS_DESC_COMMON(type, unit, base, exp, sz, bsz),	\
+		.offset = offsetof(struct kvm_vcpu_stat, stat),		\
+		.name = #stat,						\
+	}
 /* SCOPE: VM, VM_GENERIC, VCPU, VCPU_GENERIC */
 #define STATS_DESC(SCOPE, stat, type, unit, base, exp, sz, bsz)		\
 	SCOPE##_STATS_DESC(stat, type, unit, base, exp, sz, bsz)
@@ -2066,7 +2053,7 @@ struct _kvm_stats_desc {
 	STATS_DESC_IBOOLEAN(VCPU_GENERIC, blocking)
 
 ssize_t kvm_stats_read(char *id, const struct kvm_stats_header *header,
-		       const struct _kvm_stats_desc *desc,
+		       const struct kvm_stats_desc *desc,
 		       void *stats, size_t size_stats,
 		       char __user *user_buffer, size_t size, loff_t *offset);
 
@@ -2111,9 +2098,9 @@ static inline void kvm_stats_log_hist_update(u64 *data, size_t size, u64 value)
 
 
 extern const struct kvm_stats_header kvm_vm_stats_header;
-extern const struct _kvm_stats_desc kvm_vm_stats_desc[];
+extern const struct kvm_stats_desc kvm_vm_stats_desc[];
 extern const struct kvm_stats_header kvm_vcpu_stats_header;
-extern const struct _kvm_stats_desc kvm_vcpu_stats_desc[];
+extern const struct kvm_stats_desc kvm_vcpu_stats_desc[];
 
 static inline int mmu_invalidate_retry(struct kvm *kvm, unsigned long mmu_seq)
 {
@@ -14,6 +14,10 @@
 #include <linux/ioctl.h>
 #include <asm/kvm.h>
 
+#ifdef __KERNEL__
+#include <linux/kvm_types.h>
+#endif
+
 #define KVM_API_VERSION 12
 
 /*
/*
|
||||
@@ -1601,7 +1605,11 @@ struct kvm_stats_desc {
|
||||
__u16 size;
|
||||
__u32 offset;
|
||||
__u32 bucket_size;
|
||||
#ifdef __KERNEL__
|
||||
char name[KVM_STATS_NAME_SIZE];
|
||||
#else
|
||||
char name[];
|
||||
#endif
|
||||
};
|
||||
|
||||
#define KVM_GET_STATS_FD _IO(KVMIO, 0xce)
|
||||
|
||||
@@ -71,6 +71,7 @@ TEST_GEN_PROGS_x86 += x86/cpuid_test
 TEST_GEN_PROGS_x86 += x86/cr4_cpuid_sync_test
 TEST_GEN_PROGS_x86 += x86/dirty_log_page_splitting_test
 TEST_GEN_PROGS_x86 += x86/feature_msrs_test
+TEST_GEN_PROGS_x86 += x86/evmcs_smm_controls_test
 TEST_GEN_PROGS_x86 += x86/exit_on_emulation_failure_test
 TEST_GEN_PROGS_x86 += x86/fastops_test
 TEST_GEN_PROGS_x86 += x86/fix_hypercall_test
@@ -80,7 +80,7 @@ static void test_mbind(int fd, size_t total_size)
 {
 	const unsigned long nodemask_0 = 1; /* nid: 0 */
 	unsigned long nodemask = 0;
-	unsigned long maxnode = 8;
+	unsigned long maxnode = BITS_PER_TYPE(nodemask);
 	int policy;
 	char *mem;
 	int ret;
@@ -557,6 +557,11 @@ static inline uint64_t get_cr0(void)
 	return cr0;
 }
 
+static inline void set_cr0(uint64_t val)
+{
+	__asm__ __volatile__("mov %0, %%cr0" : : "r" (val) : "memory");
+}
+
 static inline uint64_t get_cr3(void)
 {
 	uint64_t cr3;
@@ -566,6 +571,11 @@ static inline uint64_t get_cr3(void)
 	return cr3;
 }
 
+static inline void set_cr3(uint64_t val)
+{
+	__asm__ __volatile__("mov %0, %%cr3" : : "r" (val) : "memory");
+}
+
 static inline uint64_t get_cr4(void)
 {
 	uint64_t cr4;
@@ -580,6 +590,19 @@ static inline void set_cr4(uint64_t val)
 	__asm__ __volatile__("mov %0, %%cr4" : : "r" (val) : "memory");
 }
 
+static inline uint64_t get_cr8(void)
+{
+	uint64_t cr8;
+
+	__asm__ __volatile__("mov %%cr8, %[cr8]" : [cr8]"=r"(cr8));
+	return cr8;
+}
+
+static inline void set_cr8(uint64_t val)
+{
+	__asm__ __volatile__("mov %0, %%cr8" : : "r" (val) : "memory");
+}
+
 static inline void set_idt(const struct desc_ptr *idt_desc)
 {
 	__asm__ __volatile__("lidt %0"::"m"(*idt_desc));
 tools/testing/selftests/kvm/include/x86/smm.h | 17 +++++++++++++++++ (new file)
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#ifndef SELFTEST_KVM_SMM_H
+#define SELFTEST_KVM_SMM_H
+
+#include "kvm_util.h"
+
+#define SMRAM_SIZE 65536
+#define SMRAM_MEMSLOT ((1 << 16) | 1)
+#define SMRAM_PAGES (SMRAM_SIZE / PAGE_SIZE)
+
+void setup_smram(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
+		 uint64_t smram_gpa,
+		 const void *smi_handler, size_t handler_size);
+
+void inject_smi(struct kvm_vcpu *vcpu);
+
+#endif /* SELFTEST_KVM_SMM_H */
@@ -8,6 +8,7 @@
 #include "kvm_util.h"
 #include "pmu.h"
 #include "processor.h"
+#include "smm.h"
 #include "svm_util.h"
 #include "sev.h"
 #include "vmx.h"
@@ -1444,3 +1445,28 @@ bool kvm_arch_has_default_irqchip(void)
 {
 	return true;
 }
+
+void setup_smram(struct kvm_vm *vm, struct kvm_vcpu *vcpu,
+		 uint64_t smram_gpa,
+		 const void *smi_handler, size_t handler_size)
+{
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, smram_gpa,
+				    SMRAM_MEMSLOT, SMRAM_PAGES, 0);
+	TEST_ASSERT(vm_phy_pages_alloc(vm, SMRAM_PAGES, smram_gpa,
+				       SMRAM_MEMSLOT) == smram_gpa,
+		    "Could not allocate guest physical addresses for SMRAM");
+
+	memset(addr_gpa2hva(vm, smram_gpa), 0x0, SMRAM_SIZE);
+	memcpy(addr_gpa2hva(vm, smram_gpa) + 0x8000, smi_handler, handler_size);
+	vcpu_set_msr(vcpu, MSR_IA32_SMBASE, smram_gpa);
+}
+
+void inject_smi(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_events events;
+
+	vcpu_events_get(vcpu, &events);
+	events.smi.pending = 1;
+	events.flags |= KVM_VCPUEVENT_VALID_SMM;
+	vcpu_events_set(vcpu, &events);
+}
 tools/testing/selftests/kvm/x86/evmcs_smm_controls_test.c | 150 ++++++++++++++++ (new file)
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026, Red Hat, Inc.
+ *
+ * Test that vmx_leave_smm() validates vmcs12 controls before re-entering
+ * nested guest mode on RSM.
+ */
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "smm.h"
+#include "hyperv.h"
+#include "vmx.h"
+
+#define SMRAM_GPA 0x1000000
+#define SMRAM_STAGE 0xfe
+
+#define SYNC_PORT 0xe
+
+#define STR(x) #x
+#define XSTR(s) STR(s)
+
+/*
+ * SMI handler: runs in real-address mode.
+ * Reports SMRAM_STAGE via port IO, then does RSM.
+ */
+static uint8_t smi_handler[] = {
+	0xb0, SMRAM_STAGE,	/* mov $SMRAM_STAGE, %al */
+	0xe4, SYNC_PORT,	/* in $SYNC_PORT, %al */
+	0x0f, 0xaa,		/* rsm */
+};
+
+static inline void sync_with_host(uint64_t phase)
+{
+	asm volatile("in $" XSTR(SYNC_PORT) ", %%al \n"
+		     : "+a" (phase));
+}
+
+static void l2_guest_code(void)
+{
+	sync_with_host(1);
+
+	/* After SMI+RSM with invalid controls, we should not reach here. */
+	vmcall();
+}
+
+static void guest_code(struct vmx_pages *vmx_pages,
+		       struct hyperv_test_pages *hv_pages)
+{
+#define L2_GUEST_STACK_SIZE 64
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+
+	/* Set up Hyper-V enlightenments and eVMCS */
+	wrmsr(HV_X64_MSR_GUEST_OS_ID, HYPERV_LINUX_OS_ID);
+	enable_vp_assist(hv_pages->vp_assist_gpa, hv_pages->vp_assist);
+	evmcs_enable();
+
+	GUEST_ASSERT(prepare_for_vmx_operation(vmx_pages));
+	GUEST_ASSERT(load_evmcs(hv_pages));
+	prepare_vmcs(vmx_pages, l2_guest_code,
+		     &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	GUEST_ASSERT(!vmlaunch());
+
+	/* L2 exits via vmcall if test fails */
+	sync_with_host(2);
+}
+
+int main(int argc, char *argv[])
+{
+	vm_vaddr_t vmx_pages_gva = 0, hv_pages_gva = 0;
+	struct hyperv_test_pages *hv;
+	struct hv_enlightened_vmcs *evmcs;
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	struct kvm_regs regs;
+	int stage_reported;
+
+	TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_X86_SMM));
+
+	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+
+	setup_smram(vm, vcpu, SMRAM_GPA, smi_handler, sizeof(smi_handler));
+
+	vcpu_set_hv_cpuid(vcpu);
+	vcpu_enable_evmcs(vcpu);
+	vcpu_alloc_vmx(vm, &vmx_pages_gva);
+	hv = vcpu_alloc_hyperv_test_pages(vm, &hv_pages_gva);
+	vcpu_args_set(vcpu, 2, vmx_pages_gva, hv_pages_gva);
+
+	vcpu_run(vcpu);
+
+	/* L2 is running and syncs with host. */
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+	vcpu_regs_get(vcpu, &regs);
+	stage_reported = regs.rax & 0xff;
+	TEST_ASSERT(stage_reported == 1,
+		    "Expected stage 1, got %d", stage_reported);
+
+	/* Inject SMI while L2 is running. */
+	inject_smi(vcpu);
+	vcpu_run(vcpu);
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+	vcpu_regs_get(vcpu, &regs);
+	stage_reported = regs.rax & 0xff;
+	TEST_ASSERT(stage_reported == SMRAM_STAGE,
+		    "Expected SMM handler stage %#x, got %#x",
+		    SMRAM_STAGE, stage_reported);
+
+	/*
+	 * Guest is now paused in the SMI handler, about to execute RSM.
+	 * Hack the eVMCS page to set-up invalid pin-based execution
+	 * control (PIN_BASED_VIRTUAL_NMIS without PIN_BASED_NMI_EXITING).
+	 */
+	evmcs = hv->enlightened_vmcs_hva;
+	evmcs->pin_based_vm_exec_control |= PIN_BASED_VIRTUAL_NMIS;
+	evmcs->hv_clean_fields = 0;
+
+	/*
+	 * Trigger copy_enlightened_to_vmcs12() via KVM_GET_NESTED_STATE,
+	 * copying the invalid pin_based_vm_exec_control into cached_vmcs12.
+	 */
+	union {
+		struct kvm_nested_state state;
+		char state_[16384];
+	} nested_state_buf;
+
+	memset(&nested_state_buf, 0, sizeof(nested_state_buf));
+	nested_state_buf.state.size = sizeof(nested_state_buf);
+	vcpu_nested_state_get(vcpu, &nested_state_buf.state);
+
+	/*
+	 * Resume the guest. The SMI handler executes RSM, which calls
+	 * vmx_leave_smm(). nested_vmx_check_controls() should detect
+	 * VIRTUAL_NMIS without NMI_EXITING and cause a triple fault.
+	 */
+	vcpu_run(vcpu);
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_SHUTDOWN);
+
+	kvm_vm_free(vm);
+	return 0;
+}
@@ -13,6 +13,30 @@
 #include "linux/psp-sev.h"
 #include "sev.h"
 
+static void guest_sev_test_msr(uint32_t msr)
+{
+	uint64_t val = rdmsr(msr);
+
+	wrmsr(msr, val);
+	GUEST_ASSERT(val == rdmsr(msr));
+}
+
+#define guest_sev_test_reg(reg)				\
+do {							\
+	uint64_t val = get_##reg();			\
+							\
+	set_##reg(val);					\
+	GUEST_ASSERT(val == get_##reg());		\
+} while (0)
+
+static void guest_sev_test_regs(void)
+{
+	guest_sev_test_msr(MSR_EFER);
+	guest_sev_test_reg(cr0);
+	guest_sev_test_reg(cr3);
+	guest_sev_test_reg(cr4);
+	guest_sev_test_reg(cr8);
+}
+
 #define XFEATURE_MASK_X87_AVX (XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM)
 
@@ -24,6 +48,8 @@ static void guest_snp_code(void)
 	GUEST_ASSERT(sev_msr & MSR_AMD64_SEV_ES_ENABLED);
 	GUEST_ASSERT(sev_msr & MSR_AMD64_SEV_SNP_ENABLED);
 
+	guest_sev_test_regs();
+
 	wrmsr(MSR_AMD64_SEV_ES_GHCB, GHCB_MSR_TERM_REQ);
 	vmgexit();
 }
@@ -34,6 +60,8 @@ static void guest_sev_es_code(void)
 	GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ENABLED);
 	GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ES_ENABLED);
 
+	guest_sev_test_regs();
+
 	/*
 	 * TODO: Add GHCB and ucall support for SEV-ES guests. For now, simply
 	 * force "termination" to signal "done" via the GHCB MSR protocol.
@@ -47,6 +75,8 @@ static void guest_sev_code(void)
 	GUEST_ASSERT(this_cpu_has(X86_FEATURE_SEV));
 	GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ENABLED);
 
+	guest_sev_test_regs();
+
 	GUEST_DONE();
 }
 
@@ -14,13 +14,11 @@
 #include "test_util.h"
 
 #include "kvm_util.h"
+#include "smm.h"
 
 #include "vmx.h"
 #include "svm_util.h"
 
-#define SMRAM_SIZE 65536
-#define SMRAM_MEMSLOT ((1 << 16) | 1)
-#define SMRAM_PAGES (SMRAM_SIZE / PAGE_SIZE)
 #define SMRAM_GPA 0x1000000
 #define SMRAM_STAGE 0xfe
 
@@ -113,18 +111,6 @@ static void guest_code(void *arg)
 	sync_with_host(DONE);
 }
 
-void inject_smi(struct kvm_vcpu *vcpu)
-{
-	struct kvm_vcpu_events events;
-
-	vcpu_events_get(vcpu, &events);
-
-	events.smi.pending = 1;
-	events.flags |= KVM_VCPUEVENT_VALID_SMM;
-
-	vcpu_events_set(vcpu, &events);
-}
-
 int main(int argc, char *argv[])
 {
 	vm_vaddr_t nested_gva = 0;
@@ -140,16 +126,7 @@ int main(int argc, char *argv[])
 	/* Create VM */
 	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
 
-	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, SMRAM_GPA,
-				    SMRAM_MEMSLOT, SMRAM_PAGES, 0);
-	TEST_ASSERT(vm_phy_pages_alloc(vm, SMRAM_PAGES, SMRAM_GPA, SMRAM_MEMSLOT)
-		    == SMRAM_GPA, "could not allocate guest physical addresses?");
-
-	memset(addr_gpa2hva(vm, SMRAM_GPA), 0x0, SMRAM_SIZE);
-	memcpy(addr_gpa2hva(vm, SMRAM_GPA) + 0x8000, smi_handler,
-	       sizeof(smi_handler));
-
-	vcpu_set_msr(vcpu, MSR_IA32_SMBASE, SMRAM_GPA);
+	setup_smram(vm, vcpu, SMRAM_GPA, smi_handler, sizeof(smi_handler));
 
 	if (kvm_has_cap(KVM_CAP_NESTED_STATE)) {
 		if (kvm_cpu_has(X86_FEATURE_SVM))
@@ -50,7 +50,7 @@
  * Return: the number of bytes that has been successfully read
  */
 ssize_t kvm_stats_read(char *id, const struct kvm_stats_header *header,
-		       const struct _kvm_stats_desc *desc,
+		       const struct kvm_stats_desc *desc,
 		       void *stats, size_t size_stats,
 		       char __user *user_buffer, size_t size, loff_t *offset)
 {
@@ -973,9 +973,9 @@ static void kvm_free_memslots(struct kvm *kvm, struct kvm_memslots *slots)
 		kvm_free_memslot(kvm, memslot);
 }
 
-static umode_t kvm_stats_debugfs_mode(const struct _kvm_stats_desc *pdesc)
+static umode_t kvm_stats_debugfs_mode(const struct kvm_stats_desc *desc)
 {
-	switch (pdesc->desc.flags & KVM_STATS_TYPE_MASK) {
+	switch (desc->flags & KVM_STATS_TYPE_MASK) {
 	case KVM_STATS_TYPE_INSTANT:
 		return 0444;
 	case KVM_STATS_TYPE_CUMULATIVE:
@@ -1010,7 +1010,7 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, const char *fdname)
 	struct dentry *dent;
 	char dir_name[ITOA_MAX_LEN * 2];
 	struct kvm_stat_data *stat_data;
-	const struct _kvm_stats_desc *pdesc;
+	const struct kvm_stats_desc *pdesc;
 	int i, ret = -ENOMEM;
 	int kvm_debugfs_num_entries = kvm_vm_stats_header.num_desc +
 				      kvm_vcpu_stats_header.num_desc;
@@ -6171,11 +6171,11 @@ static int kvm_stat_data_get(void *data, u64 *val)
 	switch (stat_data->kind) {
 	case KVM_STAT_VM:
 		r = kvm_get_stat_per_vm(stat_data->kvm,
-					stat_data->desc->desc.offset, val);
+					stat_data->desc->offset, val);
 		break;
 	case KVM_STAT_VCPU:
 		r = kvm_get_stat_per_vcpu(stat_data->kvm,
-					  stat_data->desc->desc.offset, val);
+					  stat_data->desc->offset, val);
 		break;
 	}
 
@@ -6193,11 +6193,11 @@ static int kvm_stat_data_clear(void *data, u64 val)
 	switch (stat_data->kind) {
 	case KVM_STAT_VM:
 		r = kvm_clear_stat_per_vm(stat_data->kvm,
-					  stat_data->desc->desc.offset);
+					  stat_data->desc->offset);
 		break;
 	case KVM_STAT_VCPU:
 		r = kvm_clear_stat_per_vcpu(stat_data->kvm,
-					    stat_data->desc->desc.offset);
+					    stat_data->desc->offset);
 		break;
 	}
 
@@ -6345,7 +6345,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
 static void kvm_init_debug(void)
 {
 	const struct file_operations *fops;
-	const struct _kvm_stats_desc *pdesc;
+	const struct kvm_stats_desc *pdesc;
 	int i;
 
 	kvm_debugfs_dir = debugfs_create_dir("kvm", NULL);
@@ -6358,7 +6358,7 @@ static void kvm_init_debug(void)
 			fops = &vm_stat_readonly_fops;
 		debugfs_create_file(pdesc->name, kvm_stats_debugfs_mode(pdesc),
 				    kvm_debugfs_dir,
-				    (void *)(long)pdesc->desc.offset, fops);
+				    (void *)(long)pdesc->offset, fops);
 	}
 
 	for (i = 0; i < kvm_vcpu_stats_header.num_desc; ++i) {
@@ -6369,7 +6369,7 @@ static void kvm_init_debug(void)
 			fops = &vcpu_stat_readonly_fops;
 		debugfs_create_file(pdesc->name, kvm_stats_debugfs_mode(pdesc),
 				    kvm_debugfs_dir,
-				    (void *)(long)pdesc->desc.offset, fops);
+				    (void *)(long)pdesc->offset, fops);
 	}
 }