mirror of
https://github.com/Fishwaldo/Star64_linux.git
synced 2025-07-23 23:32:14 +00:00
One of the largest releases for KVM... Hardly any generic improvement,
but lots of architecture-specific changes. * ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. * PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). * s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. * x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory---currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJW5r3BAAoJEL/70l94x66D2pMH/jTSWWwdTUJMctrDjPVzKzG0 yOzHW5vSLFoFlwEOY2VpslnXzn5TUVmCAfrdmFNmQcSw6hGb3K/xA/ZX/KLwWhyb oZpr123ycahga+3q/ht/dFUBCCyWeIVMdsLSFwpobEBzPL0pMgc9joLgdUC6UpWX tmN0LoCAeS7spC4TTiTTpw3gZ/L+aB0B6CXhOMjldb9q/2CsgaGyoVvKA199nk9o Ngu7ImDt7l/x1VJX4/6E/17VHuwqAdUrrnbqerB/2oJ5ixsZsHMGzxQ3sHCmvyJx WG5L00ubB1oAJAs9fBg58Y/MdiWX99XqFhdEfxq4foZEiQuCyxygVvq3JwZTxII= =OUZZ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...
This commit is contained in:
commit
10dc374766
142 changed files with 6754 additions and 2936 deletions
|
@ -2507,8 +2507,9 @@ struct kvm_create_device {
|
|||
|
||||
4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
|
||||
|
||||
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
|
||||
Type: device ioctl, vm ioctl
|
||||
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
|
||||
KVM_CAP_VCPU_ATTRIBUTES for vcpu device
|
||||
Type: device ioctl, vm ioctl, vcpu ioctl
|
||||
Parameters: struct kvm_device_attr
|
||||
Returns: 0 on success, -1 on error
|
||||
Errors:
|
||||
|
@ -2533,8 +2534,9 @@ struct kvm_device_attr {
|
|||
|
||||
4.81 KVM_HAS_DEVICE_ATTR
|
||||
|
||||
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
|
||||
Type: device ioctl, vm ioctl
|
||||
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
|
||||
KVM_CAP_VCPU_ATTRIBUTES for vcpu device
|
||||
Type: device ioctl, vm ioctl, vcpu ioctl
|
||||
Parameters: struct kvm_device_attr
|
||||
Returns: 0 on success, -1 on error
|
||||
Errors:
|
||||
|
@ -2577,6 +2579,8 @@ Possible features:
|
|||
Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
|
||||
- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
|
||||
Depends on KVM_CAP_ARM_PSCI_0_2.
|
||||
- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
|
||||
Depends on KVM_CAP_ARM_PMU_V3.
|
||||
|
||||
|
||||
4.83 KVM_ARM_PREFERRED_TARGET
|
||||
|
@ -3035,6 +3039,87 @@ Returns: 0 on success, -1 on error
|
|||
|
||||
Queues an SMI on the thread's vcpu.
|
||||
|
||||
4.97 KVM_CAP_PPC_MULTITCE
|
||||
|
||||
Capability: KVM_CAP_PPC_MULTITCE
|
||||
Architectures: ppc
|
||||
Type: vm
|
||||
|
||||
This capability means the kernel is capable of handling hypercalls
|
||||
H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
|
||||
space. This significantly accelerates DMA operations for PPC KVM guests.
|
||||
User space should expect that its handlers for these hypercalls
|
||||
are not going to be called if user space previously registered LIOBN
|
||||
in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
|
||||
|
||||
In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
|
||||
user space might have to advertise it for the guest. For example,
|
||||
IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
|
||||
present in the "ibm,hypertas-functions" device-tree property.
|
||||
|
||||
The hypercalls mentioned above may or may not be processed successfully
|
||||
in the kernel based fast path. If they can not be handled by the kernel,
|
||||
they will get passed on to user space. So user space still has to have
|
||||
an implementation for these despite the in kernel acceleration.
|
||||
|
||||
This capability is always enabled.
|
||||
|
||||
4.98 KVM_CREATE_SPAPR_TCE_64
|
||||
|
||||
Capability: KVM_CAP_SPAPR_TCE_64
|
||||
Architectures: powerpc
|
||||
Type: vm ioctl
|
||||
Parameters: struct kvm_create_spapr_tce_64 (in)
|
||||
Returns: file descriptor for manipulating the created TCE table
|
||||
|
||||
This is an extension for KVM_CAP_SPAPR_TCE which only supports 32bit
|
||||
windows, described in 4.62 KVM_CREATE_SPAPR_TCE
|
||||
|
||||
This capability uses extended struct in ioctl interface:
|
||||
|
||||
/* for KVM_CAP_SPAPR_TCE_64 */
|
||||
struct kvm_create_spapr_tce_64 {
|
||||
__u64 liobn;
|
||||
__u32 page_shift;
|
||||
__u32 flags;
|
||||
__u64 offset; /* in pages */
|
||||
__u64 size; /* in pages */
|
||||
};
|
||||
|
||||
The aim of extension is to support an additional bigger DMA window with
|
||||
a variable page size.
|
||||
KVM_CREATE_SPAPR_TCE_64 receives a 64bit window size, an IOMMU page shift and
|
||||
a bus offset of the corresponding DMA window, @size and @offset are numbers
|
||||
of IOMMU pages.
|
||||
|
||||
@flags are not used at the moment.
|
||||
|
||||
The rest of functionality is identical to KVM_CREATE_SPAPR_TCE.
|
||||
|
||||
4.98 KVM_REINJECT_CONTROL
|
||||
|
||||
Capability: KVM_CAP_REINJECT_CONTROL
|
||||
Architectures: x86
|
||||
Type: vm ioctl
|
||||
Parameters: struct kvm_reinject_control (in)
|
||||
Returns: 0 on success,
|
||||
-EFAULT if struct kvm_reinject_control cannot be read,
|
||||
-ENXIO if KVM_CREATE_PIT or KVM_CREATE_PIT2 didn't succeed earlier.
|
||||
|
||||
i8254 (PIT) has two modes, reinject and !reinject. The default is reinject,
|
||||
where KVM queues elapsed i8254 ticks and monitors completion of interrupt from
|
||||
vector(s) that i8254 injects. Reinject mode dequeues a tick and injects its
|
||||
interrupt whenever there isn't a pending interrupt from i8254.
|
||||
!reinject mode injects an interrupt as soon as a tick arrives.
|
||||
|
||||
struct kvm_reinject_control {
|
||||
__u8 pit_reinject;
|
||||
__u8 reserved[31];
|
||||
};
|
||||
|
||||
pit_reinject = 0 (!reinject mode) is recommended, unless running an old
|
||||
operating system that uses the PIT for timing (e.g. Linux 2.4.x).
|
||||
|
||||
5. The kvm_run structure
|
||||
------------------------
|
||||
|
||||
|
@ -3339,6 +3424,7 @@ EOI was received.
|
|||
|
||||
struct kvm_hyperv_exit {
|
||||
#define KVM_EXIT_HYPERV_SYNIC 1
|
||||
#define KVM_EXIT_HYPERV_HCALL 2
|
||||
__u32 type;
|
||||
union {
|
||||
struct {
|
||||
|
@ -3347,6 +3433,11 @@ EOI was received.
|
|||
__u64 evt_page;
|
||||
__u64 msg_page;
|
||||
} synic;
|
||||
struct {
|
||||
__u64 input;
|
||||
__u64 result;
|
||||
__u64 params[2];
|
||||
} hcall;
|
||||
} u;
|
||||
};
|
||||
/* KVM_EXIT_HYPERV */
|
||||
|
|
|
@ -88,6 +88,8 @@ struct kvm_s390_io_adapter_req {
|
|||
perform a gmap translation for the guest address provided in addr,
|
||||
pin a userspace page for the translated address and add it to the
|
||||
list of mappings
|
||||
Note: A new mapping will be created unconditionally; therefore,
|
||||
the calling code should avoid making duplicate mappings.
|
||||
|
||||
KVM_S390_IO_ADAPTER_UNMAP
|
||||
release a userspace page for the translated address specified in addr
|
||||
|
|
33
Documentation/virtual/kvm/devices/vcpu.txt
Normal file
33
Documentation/virtual/kvm/devices/vcpu.txt
Normal file
|
@ -0,0 +1,33 @@
|
|||
Generic vcpu interface
|
||||
====================================
|
||||
|
||||
The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
|
||||
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
|
||||
kvm_device_attr as other devices, but targets VCPU-wide settings and controls.
|
||||
|
||||
The groups and attributes per virtual cpu, if any, are architecture specific.
|
||||
|
||||
1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
|
||||
Architectures: ARM64
|
||||
|
||||
1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
|
||||
Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
|
||||
pointer to an int
|
||||
Returns: -EBUSY: The PMU overflow interrupt is already set
|
||||
-ENXIO: The overflow interrupt not set when attempting to get it
|
||||
-ENODEV: PMUv3 not supported
|
||||
-EINVAL: Invalid PMU overflow interrupt number supplied
|
||||
|
||||
A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
|
||||
number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
|
||||
type must be same for each vcpu. As a PPI, the interrupt number is the same for
|
||||
all vcpus, while as an SPI it must be a separate number per vcpu.
|
||||
|
||||
1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
|
||||
Parameters: no additional parameter in kvm_device_attr.addr
|
||||
Returns: -ENODEV: PMUv3 not supported
|
||||
-ENXIO: PMUv3 not properly configured as required prior to calling this
|
||||
attribute
|
||||
-EBUSY: PMUv3 already initialized
|
||||
|
||||
Request the initialization of the PMUv3.
|
|
@ -84,3 +84,55 @@ Returns: -EBUSY in case 1 or more vcpus are already activated (only in write
|
|||
-EFAULT if the given address is not accessible from kernel space
|
||||
-ENOMEM if not enough memory is available to process the ioctl
|
||||
0 in case of success
|
||||
|
||||
3. GROUP: KVM_S390_VM_TOD
|
||||
Architectures: s390
|
||||
|
||||
3.1. ATTRIBUTE: KVM_S390_VM_TOD_HIGH
|
||||
|
||||
Allows user space to set/get the TOD clock extension (u8).
|
||||
|
||||
Parameters: address of a buffer in user space to store the data (u8) to
|
||||
Returns: -EFAULT if the given address is not accessible from kernel space
|
||||
-EINVAL if setting the TOD clock extension to != 0 is not supported
|
||||
|
||||
3.2. ATTRIBUTE: KVM_S390_VM_TOD_LOW
|
||||
|
||||
Allows user space to set/get bits 0-63 of the TOD clock register as defined in
|
||||
the POP (u64).
|
||||
|
||||
Parameters: address of a buffer in user space to store the data (u64) to
|
||||
Returns: -EFAULT if the given address is not accessible from kernel space
|
||||
|
||||
4. GROUP: KVM_S390_VM_CRYPTO
|
||||
Architectures: s390
|
||||
|
||||
4.1. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_AES_KW (w/o)
|
||||
|
||||
Allows user space to enable aes key wrapping, including generating a new
|
||||
wrapping key.
|
||||
|
||||
Parameters: none
|
||||
Returns: 0
|
||||
|
||||
4.2. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_DEA_KW (w/o)
|
||||
|
||||
Allows user space to enable dea key wrapping, including generating a new
|
||||
wrapping key.
|
||||
|
||||
Parameters: none
|
||||
Returns: 0
|
||||
|
||||
4.3. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_AES_KW (w/o)
|
||||
|
||||
Allows user space to disable aes key wrapping, clearing the wrapping key.
|
||||
|
||||
Parameters: none
|
||||
Returns: 0
|
||||
|
||||
4.4. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_DEA_KW (w/o)
|
||||
|
||||
Allows user space to disable dea key wrapping, clearing the wrapping key.
|
||||
|
||||
Parameters: none
|
||||
Returns: 0
|
||||
|
|
|
@ -392,11 +392,11 @@ To instantiate a large spte, four constraints must be satisfied:
|
|||
write-protected pages
|
||||
- the guest page must be wholly contained by a single memory slot
|
||||
|
||||
To check the last two conditions, the mmu maintains a ->write_count set of
|
||||
To check the last two conditions, the mmu maintains a ->disallow_lpage set of
|
||||
arrays for each memory slot and large page size. Every write protected page
|
||||
causes its write_count to be incremented, thus preventing instantiation of
|
||||
causes its disallow_lpage to be incremented, thus preventing instantiation of
|
||||
a large spte. The frames at the end of an unaligned memory slot have
|
||||
artificially inflated ->write_counts so they can never be instantiated.
|
||||
artificially inflated ->disallow_lpages so they can never be instantiated.
|
||||
|
||||
Zapping all pages (page generation count)
|
||||
=========================================
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue