summaryrefslogtreecommitdiff
path: root/hw/misc/vfio.c
AgeCommit message (Collapse)AuthorFilesLines
2014-02-26vfio: blacklist loading of unstable romsBandan Das1-0/+73
Certain cards such as the Broadcom BCM57810 have rom quirks that exhibit unstable system behavior duing device assignment. In the particular case of 57810, rom execution hangs and if a FLR follows, the device becomes inoperable until a power cycle. This change blacklists loading of rom for such cards unless the user specifies a romfile or rombar=1 on the cmd line Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-02-26vfio: Fix overrun after readlink() fills buffer completelyMarkus Armbruster1-3/+3
readlink() returns the number of bytes written to the buffer, and it doesn't write a terminating null byte. vfio_init() writes it itself. Overruns the buffer when readlink() filled it completely. Fix by treating readlink() filling the buffer completely as error, like we do in pci-assign.c's assign_failed_examine(). Spotted by Coverity. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-28vfio: correct debug macro typoBandan Das1-1/+1
Change to DEBUG_VFIO in vfio_msi_interrupt() for debug messages to get printed Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-17vfio: fix mapping of MSIX barAlexey Kardashevskiy1-3/+3
VFIO virtualizes MSIX table for the guest but not mapping the part of a BAR which contains an MSIX table. Since vfio_mmap_bar() mmaps chunks before and after the MSIX table, they have to be aligned to the host page size which may be TARGET_PAGE_MASK (4K) or 64K in case of PPC64. This fixes boundaries calculations to use the real host page size. Without the patch, the chunk before MSIX table may overlap with the MSIX table and mmap will fail in the host kernel. The result will be serious slowdown as the whole BAR will be emulated by QEMU. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-16vfio-pci: Fail initfn on DMA mapping errorsAlex Williamson1-6/+38
The vfio-pci initfn will currently succeed even if DMA mappings fail. A typical reason for failure is if the user does not have sufficient privilege to lock all the memory for the guest. In this case, the device gets attached, but can only access a portion of guest memory and is extremely unlikely to work. DMA mappings are done via a MemoryListener, which provides no direct error return path. We therefore stuff the errno into our container structure and check for error after registration completes. We can also test for mapping errors during runtime, but our only option for resolution at that point is to kill the guest with a hw_error. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-16vfio: Filter out bogus mappingsAlex Williamson1-1/+8
Since 57271d63 we now see spurious mappings with the upper bits set if 64bit PCI BARs are sized while enabled. The guest writes a mask of 0xffffffff to the lower BAR to size it, then restores it, then writes the same mask to the upper BAR resulting in a spurious BAR mapping into the last 4G of the 64bit address space. Most architectures do not support or make use of the full 64bits address space for PCI BARs, so we filter out mappings with the high bit set. Long term, we probably need to think about vfio telling us the address width limitations of the IOMMU. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2014-01-15vfio: Do not reattempt a failed rom readBandan Das1-0/+6
During lazy rom loading, if rom read fails, and the guest attempts a read again, vfio will again attempt it. Add a boolean to prevent this. There could be a case where a failed rom read might succeed the next time because of a device reset or such, but it's best to exclude unpredictable behavior Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-15vfio: warn if host device rom can't be readBandan Das1-0/+7
If the device rom can't be read, report an error to the user. This alerts the user that the device has a bad state that is causing rom read failure or option rom loading has been disabled from the device boot menu (among other reasons). Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-01-15vfio: Destroy memory regionsAlex Williamson1-0/+4
Somehow this has been lurking for a while; we remove our subregions from the base BAR and VGA region mappings, but we don't destroy them, creating a leak and more serious problems when we try to migrate after removing these devices. Add the trivial bit of final cleanup to remove these entirely. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-12-06vfio-pci: Release all MSI-X vectors when disabledAlex Williamson1-0/+12
We were relying on msix_unset_vector_notifiers() to release all the vectors when we disable MSI-X, but this only happens when MSI-X is still enabled on the device. Perform further cleanup by releasing any remaining vectors listed as in-use after this call. This caused a leak of IRQ routes on hotplug depending on how the guest OS prepared the device for removal. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: qemu-stable@nongnu.org
2013-12-06vfio-pci: Add debug config options to disable MSI/X KVM supportAlex Williamson1-4/+20
It's sometimes useful to be able to verify interrupts are passing through correctly. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-12-06vfio-pci: Fix Nvidia MSI ACK through 0x88000 quirkAlex Williamson1-1/+29
When MSI is enabled on Nvidia GeForce cards the driver seems to acknowledge the interrupt by writing a 0xff byte to the MSI capability ID register using the PCI config space mirror at offset 0x88000 from BAR0. Without this, the device will only fire a single interrupt. VFIO handles the PCI capability ID/next registers as virtual w/o write support, so any write through config space is currently dropped. Add a check for this and allow the write through the BAR window. The registers are read-only anyway. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-12-06vfio-pci: Make use of new KVM-VFIO deviceAlex Williamson1-0/+67
Add and remove groups from the KVM virtual VFIO device as we make use of them. This allows KVM to optimize for performance and correctness based on properties of the group. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-11-21vfio-pci: Fix multifunction=onAlex Williamson1-0/+7
When an assigned device is initialized it copies the device config space into the emulated config space. Unfortunately multifunction is setup prior to the device initfn and gets clobbered. We need to restore it just like pci-assign does. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Bandan Das <bsd@redhat.com> Message-id: 20131112185059.7262.33780.stgit@bling.home Cc: qemu-stable@nongnu.org Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-31Merge remote-tracking branch 'mst/tags/for_anthony' into stagingAnthony Liguori1-5/+6
pci, pc, acpi fixes, enhancements This includes some pretty big changes: - pci master abort support by Marcel - pci IRQ API rework by Marcel - acpi generation support by myself Everything has gone through several revisions, latest versions have been on list for a while without any more comments, tested by several people. Please pull for 1.7. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 15 Oct 2013 07:33:48 AM CEST using RSA key ID D28D5469 # gpg: Can't check signature: public key not found * mst/tags/for_anthony: (39 commits) ssdt-proc: update generated file ssdt: fix PBLK length i386: ACPI table generation code from seabios pc: use new api to add builtin tables acpi: add interface to access user-installed tables hpet: add API to find it pvpanic: add API to access io port ich9: APIs for pc guest info piix: APIs for pc guest info acpi/piix: add macros for acpi property names i386: define pc guest info loader: allow adding ROMs in done callbacks i386: add bios linker/loader loader: use file path size from fw_cfg.h acpi: ssdt pcihp: updat generated file acpi: pre-compiled ASL files acpi: add rules to compile ASL source i386: add ACPI table files from seabios q35: expose mmcfg size as a property q35: use macro for MCFG property name ... Message-id: 1381818560-18367-1-git-send-email-mst@redhat.com Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-14hw/vfio: set interrupts using pci irq wrappersMarcel Apfelbaum1-5/+6
pci_set_irq and the other pci irq wrappers use PCI_INTERRUPT_PIN config register to compute device INTx pin to assert/deassert. save INTX pin into the config register before calling pci_set_irq Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-04vfio-pci: Fix endian issues in vfio_pci_size_rom()Alex Williamson1-2/+2
VFIO is always little endian so do byte swapping of our mask on the way in and byte swapping of the size on the way out. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2013-10-04vfio-pci: Add dummy PCI ROM write accessorAlex Williamson1-0/+6
Just to be sure we don't jump off any NULL pointer cliffs. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reported-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-03vfio: Fix debug output for int128 valuesAlexey Kardashevskiy1-2/+4
Memory regions can easily be 2^64 byte long and therefore overflow for just a bit but that is enough for int128_get64() to assert. This takes care of debug printing of huge section sizes. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02vfio-pci: Implement PCI hot resetAlex Williamson1-38/+300
Now that VFIO has a PCI hot reset interface, take advantage of it. There are two modes that we need to consider. The first is when only one device within the set of devices affected is actually assigned to the guest. In this case the other devices are are just held by VFIO for isolation and we can pretend they're not there, doing an entire bus reset whenever the device reset callback is triggered. Supporting this case separately allows us to do the best reset we can do of the device even if the device is hotplugged. The second mode is when multiple affected devices are all exposed to the guest. In this case we can only do a hot reset when the entire system is being reset. However, this also allows us to track which individual devices are affected by a reset and only do them once. We split our reset function into pre- and post-reset helper functions prioritize the types of device resets available to us, and create separate _one vs _multi reset interfaces to handle the distinct cases above. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02vfio-pci: Cleanup error_reportsAlex Williamson1-12/+12
Remove carriage returns and tweak formatting for error_reports. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02vfio-pci: Lazy PCI option ROM loadingAlex Williamson1-62/+122
During vfio-pci initfn, the device is not always in a state where the option ROM can be read. In the case of graphics cards, there's often no per function reset, which means we have host driver state affecting whether the option ROM is usable. Ideally we want to move reading the option ROM past any co-assigned device resets to the point where the guest first tries to read the ROM itself. To accomplish this, we switch the memory region for the option rom to an I/O region rather than a memory mapped region. This has the side benefit that we don't waste KVM memory slots for a BAR where we don't care about performance. This also allows us to delay loading the ROM from the device until the first read by the guest. We then use the PCI config space size of the ROM BAR when setting up the BAR through QEMU PCI. Another benefit of this approach is that previously when a user set the ROM to a file using the romfile= option, we still probed VFIO for the parameters of the ROM, which can result in dmesg errors about an invalid ROM. We now only probe VFIO to get the ROM contents if the guest actually tries to read the ROM. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02vfio-pci: Test device reset capabilitiesAlex Williamson1-0/+46
Not all resets are created equal. PM reset is not very reliable, especially for GPUs, so we might want to opt for a bus reset if a standard reset will only do a D3hot->D0 transition. We can also use this to tell if the standard reset will do a bus reset (if neither has_pm_reset or has_flr is probed, but the device still supports reset). Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02vfio-pci: Add support for MSI affinityAlex Williamson1-7/+40
When MSI is accelerated through KVM the vectors are only programmed when the guest first enables MSI support.  Subsequent writes to the vector address or data fields are ignored.  Unfortunately that means we're ignore updates done to adjust SMP affinity of the vectors. MSI SMP affinity already works in non-KVM mode because the address and data fields are read from their backing store on each interrupt. This patch stores the MSIMessage programmed into KVM so that we can determine when changes are made and update the routes. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-08-29Merge remote-tracking branch 'qemu-kvm/uq/master' into stable-1.5Anthony Liguori1-2/+2
* qemu-kvm/uq/master: kvm-stub: fix compilation kvm: shorten the parameter list for get_real_device() kvm: i386: fix LAPIC TSC deadline timer save/restore kvm-all.c: max_cpus should not exceed KVM vcpu limit kvm: Simplify kvm_handle_io kvm: x86: fix setting IA32_FEATURE_CONTROL with nested VMX disabled kvm: add KVM_IRQFD_FLAG_RESAMPLE support kvm: migrate vPMU state target-i386: remove tabs from target-i386/cpu.h Initialize IA32_FEATURE_CONTROL MSR in reset and migration Conflicts: target-i386/cpu.h target-i386/kvm.c aliguori: fixup trivial conflicts due to whitespace and added cpu argument Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-22aio / timers: Switch entire codebase to the new timer APIAlex Bligh1-7/+7
This is an autogenerated patch using scripts/switch-timer-api. Switch the entire code base to using the new timer API. Note this patch may introduce some line length issues. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-09kvm: add KVM_IRQFD_FLAG_RESAMPLE supportVincenzo Maffione1-2/+2
Added an EventNotifier* parameter to kvm-all.c:kvm_irqchip_add_irqfd_notifier(), in order to give KVM another eventfd to be used as "resamplefd". See the documentation in the linux kernel sources in Documentation/virtual/kvm/api.txt (section 4.75) for more details. When the added parameter is passed NULL, the behaviour of the function is unchanged with respect to the previous versions. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-29devices: Associate devices to their logical categoryMarcel Apfelbaum1-0/+1
The category will be used to sort the devices displayed in the command line help. Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com> Message-id: 1375107465-25767-4-git-send-email-marcel.a@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-15vfio: QEMU-AER: Qemu changes to support AER for VFIO-PCI devicesVijay Mohan Pandarathil1-0/+125
Add support for error containment when a VFIO device assigned to a KVM guest encounters an error. This is for PCIe devices/drivers that support AER functionality. When the host OS is notified of an error in a device either through the firmware first approach or through an interrupt handled by the AER root port driver, the error handler registered by the vfio-pci driver gets invoked. The qemu process is signaled through an eventfd registered per VFIO device by the qemu process. In the eventfd handler, qemu decides on what action to take. In this implementation, guest is brought down to contain the error. The kernel patches for the above functionality has been already accepted. This is a refresh of the QEMU patch which was reviewed earlier. http://marc.info/?l=linux-kernel&m=136281557608087&w=2 This patch has the same contents and has been built after refreshing to latest upstream and after the linux headers have been updated in qemu. - Create eventfd per vfio device assigned to a guest and register an event handler - This fd is passed to the vfio_pci driver through the SET_IRQ ioctl - When the device encounters an error, the eventfd is signalled and the qemu eventfd handler gets invoked. - In the handler decide what action to take. Current action taken is to stop the guest. Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil@hp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-07-15vfio-pci: VGA quirk updateAlex Williamson1-336/+321
Turns out all the suspicions for AMD devices were correct, everywhere we read a BAR address that the address matches the config space offset, there's full access to PCI config space. Attempt to generalize some helpers to allow quirks to easily be added for mirrors and windows. Also fill in complete config space for AMD. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-07-04hw/m*: pass owner to memory_region_init* functionsPaolo Bonzini1-15/+15
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04vfio: pass device to vfio_mmap_bar and use it to set ownerPaolo Bonzini1-5/+6
Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04memory: add ref/unref callsPaolo Bonzini1-0/+2
Add ref/unref calls at the following places: - places where memory regions are stashed by a listener and used outside the BQL (including in Xen or KVM). - memory_region_find callsites - creation of aliases and containers (only the aliased/contained region gets a reference to avoid loops) - around calls to del_subregion/add_subregion, where the region could disappear after the first call Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04memory: add owner argument to initialization functionsPaolo Bonzini1-14/+14
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20vfio: abort if an emulated iommu is usedAvi Kivity1-0/+2
vfio doesn't support guest iommus yet, indicate it to the user by gently depositing a core on their disk. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20memory: make section size a 128-bit integerPaolo Bonzini1-2/+2
So far, the size of all regions passed to listeners could fit in 64 bits, because artificial regions (containers and aliases) are eliminated by the memory core, leaving only device regions which have reasonable sizes An IOMMU however cannot be eliminated by the memory core, and may have an artificial size, hence we may need 65 bits to represent its size. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-08hw: move VFIO and ivshmem to hw/misc/Paolo Bonzini1-0/+3206
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>