peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2017-11-20	Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171120' ↵	Peter Maydell	2	-29/+42
	into staging ppc patch queue 2017-11-20 Here's the current queue of ppc patches. These 2 patches are both more complex than I'd ideally like this late in the 2.11 cycle. However, they do fix important bugs, so I think it's worth it on balance. # gpg: Signature made Mon 20 Nov 2017 03:27:19 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.11-20171120: spapr: reset DRCs after devices target/ppc: Update setting of cpu features to account for compat modes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-20	pc-bios/s390-ccw.img: update image	Cornelia Huck	1	-0/+0
	Contains the following commit: - pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-11-20	Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵	Peter Maydell	13	-127/+201
	staging # gpg: Signature made Mon 20 Nov 2017 03:28:54 GMT # gpg: using RSA key 0xEF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: hw/net/vmxnet3: Fix code to work on big endian hosts, too net: Transmit zero UDP checksum as 0xFFFF MAINTAINERS: Add missing entry for eepro100 emulation hw/net/eepro100: Fix endianness problem on big endian hosts Revert "Add new PCI ID for i82559a" colo-compare: fix the dangerous assignment Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-20	linux-user: Fix calculation of auxv length	Peter Maydell	1	-2/+9
	In commit 7c4ee5bcc82e643 we changed the order in which we construct the AUXV, but forgot to adjust the calculation of the length. The result is that we set info->auxv_len to a bogus and negative value, and then later on the code in open_self_auxv() gets confused and ends up presenting the guest with an empty file. Since we now have to calculate the auxv length up-front as part of figuring out how much we're going to put on the stack, set info->auxv_len then; this allows us to assert that we put the same number of entries into auxv as we pre-calculated, rather than merely having a comment saying we need to do that. Fixes: https://bugs.launchpad.net/qemu/+bug/1728116 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2017-11-20	hw/arm: Silence xlnx-ep108 deprecation warning during tests	Thomas Huth	1	-2/+5
	The new deprecation warning for the xlnx-ep108 machine also pops up during "make check" which is kind of confusing. Silence it if testing mode is enabled. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> Acked-by: Wei Huang <wei@redhat.com> Message-id: 1510846183-756-1-git-send-email-thuth@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-20	hw/arm/aspeed: Unlock SCU when running kernel	Joel Stanley	4	-2/+17
	The ASPEED hardware contains a lock register for the SCU that disables any writes to the SCU when it is locked. The machine comes up with the lock enabled, but on all known hardware u-boot will unlock it and leave it unlocked when loading the kernel. This means the kernel expects the SCU to be unlocked. When booting from an emulated ROM the normal u-boot unlock path is executed. Things don't go well when booting using the -kernel command line, as u-boot does not run first. Change behaviour so that when a kernel is passed to the machine, set the reset value of the SCU to be unlocked. Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-id: 20171114122018.12204-1-joel@jms.id.au Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-20	arm: check regime, not current state, for ATS write PAR format	Peter Maydell	1	-1/+1
	In do_ats_write(), rather than using extended_addresses_enabled() to decide whether the value we get back from get_phys_addr() is a 64-bit format PAR or a 32-bit one, use arm_s1_regime_using_lpae_format(). This is not really the correct answer, because the PAR format depends on the AT instruction being used, not just on the translation regime. However getting this correct requires a significant refactoring, so that get_phys_addr() returns raw information about the fault which the caller can then assemble into a suitable FSR/PAR/syndrome for its purposes, rather than get_phys_addr() returning a pre-formatted FSR. However this change at least improves the situation by making the PAR work correctly for address translation operations done at AArch64 EL2 on the EL2 translation regime. In particular, this is necessary for Xen to be able to run in our emulation, so this seems like a safer interim fix given that we are in freeze. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Stefano Stabellini <sstabellini@kernel.org> Message-id: 1509719814-6191-1-git-send-email-peter.maydell@linaro.org
2017-11-20	nvic: Fix ARMv7M MPU_RBAR reads	Peter Maydell	1	-1/+1
	Fix an incorrect mask expression in the handling of v7M MPU_RBAR reads that meant that we would always report the ADDR field as zero. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 1509732813-22957-1-git-send-email-peter.maydell@linaro.org
2017-11-20	target/arm: Report GICv3 sysregs present in ID registers if needed	Peter Maydell	1	-4/+40
	The CPU ID registers ID_AA64PFR0_EL1, ID_PFR1_EL1 and ID_PFR1 have a field for reporting presence of GICv3 system registers. We need to report this field correctly in order for Xen to work as a guest inside QEMU emulation. We mustn't incorrectly claim the sysregs exist when they don't, though, or Linux will crash. Unfortunately the way we've designed the GICv3 emulation in QEMU puts the system registers as part of the GICv3 device, which may be created after the CPU proper has been realized. This means that we don't know at the point when we define the ID registers what the correct value is. Handle this by switching them to calling a function at runtime to read the value, where we can fill in the GIC field appropriately. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Stefano Stabellini <sstabellini@kernel.org> Message-id: 1510066898-3725-1-git-send-email-peter.maydell@linaro.org
2017-11-20	Revert "cpu-exec: don't overwrite exception_index"	Peter Maydell	1	-3/+1
	This reverts commit e01cecabf3e04d22340d7e8b3616ef051c42c891, which breaks booting of aarch64 Linux images. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-20	pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting	Thomas Huth	1	-1/+2
	When rebooting a guest that has a virtio-scsi disk, the s390-ccw bios sometimes bails out with an error message like this: ! SCSI cannot report LUNs: STATUS=02 RSPN=70 KEY=05 CODE=25 QLFR=00, sure ! Enabling the scsi_req* tracing in QEMU shows that the ccw bios is trying to execute the REPORT LUNS SCSI command with a LUN != 0, and this causes the SCSI command to fail. Looks like we neither clear the BSS of the s390-ccw bios during reboot, nor do we explicitly set the default_scsi_device.lun value to 0, so this variable can contain random values from the OS after the reboot. By setting this variable explicitly to 0, the problem is fixed and the reboots always succeed. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1514352 Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1510942228-22822-1-git-send-email-thuth@redhat.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-11-20	s390x/tcg: fix DIAG 308 with > 1 VCPU (MTTCG)	David Hildenbrand	1	-0/+2
	Currently, multi threaded TCG with > 1 VCPU gets stuck during IPL, when the bios tries to switch to the loaded kernel via DIAG 308. As run_on_cpu() is used, we run into a deadlock after handling the reset. We need the iolock (just like KVM). Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171116170526.12643-4-david@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-11-20	s390x: fix storing CPU status (again)	David Hildenbrand	1	-1/+1
	Looks like the last fix + cleanup introduced another bug. (for now Linux guests don't seem to care) - we store the crs into ars. Fixes: 947a38bd6f13 ("s390x/kvm: fix and cleanup storing CPU status") Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171116170526.12643-2-david@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-11-20	hw/net/vmxnet3: Fix code to work on big endian hosts, too	Thomas Huth	3	-101/+181
	Since commit ab06ec43577177a442e8 we test the vmxnet3 device in the pxe-tester, too (when running "make check SPEED=slow"). This now revealed that the code is not working there if the host is a big endian machine (for example ppc64 or s390x) - "make check SPEED=slow" is now failing on such hosts. The vmxnet3 code lacks endianness conversions in a couple of places. Interestingly, the bitfields in the structs in vmxnet3.h already tried to take care of the bit endianness of the C compilers - but the code missed to change the byte endianness when reading or writing the corresponding structs. So the bitfields are now wrapped into unions which allow to change the byte endianness during runtime with the non-bitfield member of the union. With these changes, "make check SPEED=slow" now properly works on big endian hosts, too. Reported-by: David Gibson <dgibson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: David Gibson <dgibson@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20	net: Transmit zero UDP checksum as 0xFFFF	Ed Swierk	5	-4/+11
	The checksum algorithm used by IPv4, TCP and UDP allows a zero value to be represented by either 0x0000 and 0xFFFF. But per RFC 768, a zero UDP checksum must be transmitted as 0xFFFF because 0x0000 is a special value meaning no checksum. Substitute 0xFFFF whenever a checksum is computed as zero when modifying a UDP datagram header. Doing this on IPv4 and TCP checksums is unnecessary but legal. Add a wrapper for net_checksum_finish() that makes the substitution. (We can't just change net_checksum_finish(), as that function is also used by receivers to verify checksums, and in that case the expected value is always 0x0000.) Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20	MAINTAINERS: Add missing entry for eepro100 emulation	Stefan Weil	1	-0/+5
	Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20	hw/net/eepro100: Fix endianness problem on big endian hosts	Thomas Huth	1	-2/+2
	Since commit 1865e288a823c764cd4344d ("Fix eepro100 simple transmission mode"), the test/pxe-test is broken for the eepro100 device on big endian hosts. However, it seems like that commit did not introduce the problem, but just uncovered it: The EEPRO100State->tx.tbd_array_addr and EEPRO100State->tx.tcb_bytes fields are already in host byte order, since they have already been byte-swapped in the read_cb() function. Thus byte-swapping them in tx_command() again results in the wrong endianness. Removing the byte-swapping here fixes the pxe-test. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20	Revert "Add new PCI ID for i82559a"	Jason Wang	4	-19/+1
	This reverts commit 5e89dc01133f8f5e621f6b66b356c6f37d31dafb since: - we should use ID in the spec instead the one used by OEM - in the future, we should allow changing id through either property or EEPROM file. Cc: Stefan Weil <sw@weilnetz.de> Cc: Michael Nawrocki <michael.nawrocki@gtri.gatech.edu> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20	colo-compare: fix the dangerous assignment	Mao Zhongyi	1	-1/+1
	Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Jason Wang <jasowang@redhat.com> Cc: Zhang Chen <zhangckid@gmail.com> Cc: Li Zhijian <lizhijian@cn.fujitsu.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Fixes: 8ec14402029d783720f4312ed8a925548e1dad61 Reported-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-11-20	spapr: reset DRCs after devices	Greg Kurz	2	-7/+21
	A DRC with a pending unplug request releases its associated device at machine reset time. In the case of LMB, when all DRCs for a DIMM device have been reset, the DIMM gets unplugged, causing guest memory to disappear. This may be very confusing for anything still using this memory. This is exactly what happens with vhost backends, and QEMU aborts with: qemu-system-ppc64: used ring relocated for ring 2 qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion `r >= 0' failed. The issue is that each DRC registers a QEMU reset handler, and we don't control the order in which these handlers are called (ie, a LMB DRC will unplug a DIMM before the virtio device using the memory on this DIMM could stop its vhost backend). To avoid such situations, let's reset DRCs after all devices have been reset. Reported-by: Mallesh N. Koti <mallesh@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-20	target/ppc: Update setting of cpu features to account for compat modes	Suraj Jitindar Singh	1	-22/+21
	The device tree nodes ibm,arch-vec-5-platform-support and ibm,pa-features are used to communicate features of the cpu to the guest operating system. The properties of each of these are determined based on the selected cpu model and the availability of hypervisor features. Currently the compatibility mode of the cpu is not taken into account. The ibm,arch-vec-5-platform-support node is used to communicate the level of support for various ISAv3 processor features to the guest before CAS to inform the guests' request. The available mmu mode should only be hash unless the cpu is a POWER9 which is not in a prePOWER9 compat mode, in which case the available modes depend on the accelerator and the hypervisor capabilities. The ibm,pa-featues node is used to communicate the level of cpu support for various features to the guest os. This should only contain features relevant to the operating mode of the processor, that is the selected cpu model taking into account any compat mode. This means that the compat mode should be taken into account when choosing the properties of ibm,pa-features and they should match the compat mode selected, or the cpu model selected if no compat mode. Update the setting of these cpu features in the device tree as described above to properly take into account any compat mode. We use the ppc_check_compat function which takes into account the current processor model and the cpu compat mode. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-11-17	Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging	Peter Maydell	57	-93/+1751
	Block layer patches for 2.11.0-rc2 # gpg: Signature made Fri 17 Nov 2017 17:58:36 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (25 commits) iotests: Make 087 pass without AIO enabled block: Make bdrv_next() keep strong references qcow2: Fix overly broad madvise() qcow2: Refuse to get unaligned offsets from cache qcow2: Add bounds check to get_refblock_offset() block: Guard against NULL bs->drv qcow2: Unaligned zero cluster in handle_alloc() qcow2: check_errors are fatal qcow2: reject unaligned offsets in write compressed iotests: Add test for failing qemu-img commit tests: Add check-qobject for equality tests iotests: Add test for non-string option reopening block: qobject_is_equal() in bdrv_reopen_prepare() qapi: Add qobject_is_equal() qapi/qlist: Add qlist_append_null() macro qapi/qnull: Add own header qcow2: fix image corruption on commit with persistent bitmap iotests: test clearing unknown autoclear_features by qcow2 block: Fix permissions in image activation qcow2: fix image corruption after committing qcow2 image into base ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-17	Merge remote-tracking branch 'mreitz/tags/pull-block-2017-11-17' into ↵	Kevin Wolf	43	-42/+1085
	queue-block Block patches for 2.11.0-rc2 # gpg: Signature made Fri Nov 17 18:22:07 2017 CET # gpg: using RSA key F407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * mreitz/tags/pull-block-2017-11-17: iotests: Make 087 pass without AIO enabled block: Make bdrv_next() keep strong references qcow2: Fix overly broad madvise() qcow2: Refuse to get unaligned offsets from cache qcow2: Add bounds check to get_refblock_offset() block: Guard against NULL bs->drv qcow2: Unaligned zero cluster in handle_alloc() qcow2: check_errors are fatal qcow2: reject unaligned offsets in write compressed iotests: Add test for failing qemu-img commit tests: Add check-qobject for equality tests iotests: Add test for non-string option reopening block: qobject_is_equal() in bdrv_reopen_prepare() qapi: Add qobject_is_equal() qapi/qlist: Add qlist_append_null() macro qapi/qnull: Add own header Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-11-17	iotests: Make 087 pass without AIO enabled	Max Reitz	1	-1/+8
	If AIO has not been enabled in the qemu build that is to be tested, we should skip the "aio=native without O_DIRECT" test instead of failing. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171115180732.31753-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	block: Make bdrv_next() keep strong references	Max Reitz	5	-2/+57
	On one hand, it is a good idea for bdrv_next() to return a strong reference because ideally nearly every pointer should be refcounted. This fixes intermittent failure of iotest 194. On the other, it is absolutely necessary for bdrv_next() itself to keep a strong reference to both the BB (in its first phase) and the BDS (at least in the second phase) because when called the next time, it will dereference those objects to get a link to the next one. Therefore, it needs these objects to stay around until then. Just storing the pointer to the next in the iterator is not really viable because that pointer might become invalid as well. Both arguments taken together means we should probably just invoke bdrv_ref() and blk_ref() in bdrv_next(). This means we have to assert that bdrv_next() is always called from the main loop, but that was probably necessary already before this patch and judging from the callers, it also looks to actually be the case. Keeping these strong references means however that callers need to give them up if they decide to abort the iteration early. They can do so through the new bdrv_next_cleanup() function. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110172545.32609-1-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: Fix overly broad madvise()	Max Reitz	1	-1/+1
	@mem_size and @offset are both size_t, thus subtracting them from one another will just return a big size_t if mem_size < offset -- even more obvious here because the result is stored in another size_t. Checking that result to be positive is therefore not sufficient to exclude the case that offset > mem_size. Thus, we currently sometimes issue an madvise() over a very large address range. This is triggered by iotest 163, but with -m64, this does not result in tangible problems. But with -m32, this test produces three segfaults, all of which are fixed by this patch. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171114184127.24238-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: Refuse to get unaligned offsets from cache	Max Reitz	3	-0/+71
	Instead of using an assertion, it is better to emit a corruption event here. Checking all offsets for correct alignment can be tedious and it is easily possible to forget to do so. qcow2_cache_do_get() is a function every L2 and refblock access has to go through, so this is a good central point to add such a check. And for good measure, let us also add an assertion that the offset is non-zero. Making this a corruption event is not feasible, because a zero offset usually means something special (such as the cluster is unused), so all callers should be checking this anyway. If they do not, it is their fault, hence the assertion here. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-6-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: Add bounds check to get_refblock_offset()	Max Reitz	4	-7/+93
	Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728661 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-5-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	block: Guard against NULL bs->drv	Max Reitz	7	-3/+130
	We currently do not guard everywhere against a NULL bs->drv where we should be doing so. Most of the places fixed here just do not care about that case at all. Some care implicitly, e.g. through a prior function call to bdrv_getlength() which would always fail for an ejected BDS. Add an assert there to make it more obvious. Other places seem to care, but do so insufficiently: Freeing clusters in a qcow2 image is an error-free operation, but it may leave the image in an unusable state anyway. Giving qcow2_free_clusters() an error code is not really viable, it is much easier to note that bs->drv may be NULL even after a successful driver call. This concerns bdrv_co_flush(), and the way the check is added to bdrv_co_pdiscard() (in every iteration instead of only once). Finally, some places employ at least an assert(bs->drv); somewhere, that may be reasonable (such as in the reopen code), but in bdrv_has_zero_init(), it is definitely not. Returning 0 there in case of an ejected BDS saves us much headache instead. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728660 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-4-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: Unaligned zero cluster in handle_alloc()	Max Reitz	3	-1/+38
	We should check whether the cluster offset we are about to use is actually valid; that is, whether it is aligned to cluster boundaries. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728643 Buglink: https://bugs.launchpad.net/qemu/+bug/1728657 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: check_errors are fatal	Max Reitz	3	-1/+47
	When trying to repair a dirty image, qcow2_check() may apparently succeed (no really fatal error occurred that would prevent the check from continuing), but if check_errors in the result object is non-zero, we cannot trust the image to be usable. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728639 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: reject unaligned offsets in write compressed	Anton Nefedov	1	-0/+4
	Misaligned compressed write is not supported. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Message-id: 1510654613-47868-2-git-send-email-anton.nefedov@virtuozzo.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	iotests: Add test for failing qemu-img commit	Max Reitz	2	-0/+44
	Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170616135847.17726-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	tests: Add check-qobject for equality tests	Max Reitz	3	-1/+332
	Add a new test file (check-qobject.c) for unit tests that concern QObjects as a whole. Its only purpose for now is to test the qobject_is_equal() function. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171114180128.17076-7-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	iotests: Add test for non-string option reopening	Max Reitz	2	-0/+14
	Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171114180128.17076-6-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	block: qobject_is_equal() in bdrv_reopen_prepare()	Max Reitz	1	-11/+18
	Currently, bdrv_reopen_prepare() assumes that all BDS options are strings. However, this is not the case if the BDS has been created through the json: pseudo-protocol or blockdev-add. Note that the user-invokable reopen command is an HMP command, so you can only specify strings there. Therefore, specifying a non-string option with the "same" value as it was when originally created will now return an error because the values are supposedly similar (and there is no way for the user to circumvent this but to just not specify the option again -- however, this is still strictly better than just crashing). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171114180128.17076-5-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qapi: Add qobject_is_equal()	Max Reitz	14	-0/+186
	This generic function (along with its implementations for different types) determines whether two QObjects are equal. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 20171114180128.17076-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qapi/qlist: Add qlist_append_null() macro	Max Reitz	2	-0/+6
	Besides the macro itself, this patch also adds a corresponding Coccinelle rule. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20171114180128.17076-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qapi/qnull: Add own header	Max Reitz	8	-14/+36
	Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 20171114180128.17076-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-11-17	qcow2: fix image corruption on commit with persistent bitmap	Eric Blake	3	-18/+270
	If an image contains persistent bitmaps, we cannot use the fast path of bdrv_make_empty() to clear the image during qemu-img commit, because that will lose the clusters related to the bitmaps. Also leave a comment in qcow2_read_extensions to remind future feature additions to think about fast-path removal, since we just barely fixed the same bug for LUKS encryption. It's a pain that qemu-img has not yet been taught to manipulate, or even at a very minimum display, information about persistent bitmaps; instead, we have to use QMP commands. It's also a pain that only qeury-block and x-debug-block-dirty-bitmap-sha256 will allow bitmap introspection; but the former requires the node to be hooked to a block device, and the latter is experimental. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-11-17	iotests: test clearing unknown autoclear_features by qcow2	Vladimir Sementsov-Ogievskiy	3	-0/+72
	Test clearing unknown autoclear_features by qcow2 on incoming migration. [ kwolf: Fixed wait for destination VM startup ] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-11-17	block: Fix permissions in image activation	Kevin Wolf	1	-10/+22
	Inactive images generally request less permissions for their image files than they would if they were active (in particular, write permissions). Activating the image involves extending the permissions, therefore. drv->bdrv_invalidate_cache() can already require write access to the image file, so we have to update the permissions earlier than that. The current code does it only later, so we have to move up this part. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-11-17	Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-11-17' into ↵	Peter Maydell	3	-56/+40
	staging nbd patches for 2017-11-17 Eric Blake - nbd: Don't crash when server reports NBD_CMD_READ failure Eric Blake - nbd/client: Use error_prepend() correctly Eric Blake - nbd/client: Don't hard-disconnect on ESHUTDOWN from server Eric Blake - nbd/server: Fix error reporting for bad requests # gpg: Signature made Fri 17 Nov 2017 14:53:30 GMT # gpg: using RSA key 0xA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2017-11-17: nbd/server: Fix error reporting for bad requests nbd/client: Don't hard-disconnect on ESHUTDOWN from server nbd/client: Use error_prepend() correctly nbd: Don't crash when server reports NBD_CMD_READ failure Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-11-17	nbd/server: Fix error reporting for bad requests	Eric Blake	1	-24/+12
	The NBD spec says an attempt to NBD_CMD_TRIM on a read-only export should fail with EPERM, as a trim has the potential to change disk contents, but we were relying on the block layer to catch that for us, which might not always give the right error (and even if it does, it does not let us pass back a sane message for structured replies). The NBD spec says an attempt to NBD_CMD_WRITE_ZEROES out of bounds should fail with ENOSPC, not EINVAL. Our check for u64 offset + u32 length wraparound up front is pointless; nothing uses offset until after the second round of sanity checks, and we can just as easily ensure there is no wraparound by checking whether offset is in bounds (since a disk size cannot exceed off_t which is 63 bits, adding a 32-bit number for a valid offset can't overflow). Bonus: dropping the up-front check lets us keep the connection alive after NBD_CMD_WRITE, whereas before we would drop the connection (of course, any client sending a packet that would trigger the failure is already buggy, so it's also okay to drop the connection, but better quality-of-implementation never hurts). Solve all of these issues by some code motion and improved request validation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171115213557.3548-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-17	nbd/client: Don't hard-disconnect on ESHUTDOWN from server	Eric Blake	1	-6/+0
	The NBD spec says that a server may fail any transmission request with ESHUTDOWN when it is apparent that no further request from the client can be successfully honored. The client is supposed to then initiate a soft shutdown (wait for all remaining in-flight requests to be answered, then send NBD_CMD_DISC). However, since qemu's server never uses ESHUTDOWN errors, this code was mostly untested since its introduction in commit b6f5d3b5. More recently, I learned that nbdkit as the NBD server is able to send ESHUTDOWN errors, so I finally tested this code, and noticed that our client was special-casing ESHUTDOWN to cause a hard shutdown (immediate disconnect, with no NBD_CMD_DISC), but only if the server sends this error as a simple reply. Further investigation found that commit d2febedb introduced a regression where structured replies behave differently than simple replies - but that the structured reply behavior is more in line with the spec (even if we still lack code in nbd-client.c to properly quit sending further requests). So this patch reverts the portion of b6f5d3b5 that introduced an improper hard-disconnect special-case at the lower level, and leaves the future enhancement of a nicer soft-disconnect at the higher level for another day. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171113194857.13933-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-17	nbd/client: Use error_prepend() correctly	Eric Blake	1	-24/+26
	When using error prepend(), it is necessary to end with a space in the format string; otherwise, messages come out incorrectly, such as when connecting to a socket that hangs up immediately: can't open device nbd://localhost:10809/: Failed to read dataUnexpected end-of-file before all bytes were read Originally botched in commit e44ed99d, then several more instances added in the meantime. Pre-existing and not fixed here: we are inconsistent on capitalization; some of our messages start with lower case, and others start with upper, although the use of error_prepend() is much nicer to read when all fragments consistently start with lower. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171113152424.25381-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-11-17	nbd: Don't crash when server reports NBD_CMD_READ failure	Eric Blake	1	-2/+2
	If a server fails a read, for example with EIO, but the connection is still live, then we would crash trying to print a non-existent error message in nbd_client_co_preadv(). For consistency, also change the error printout in nbd_read_reply_entry(), although that instance does not crash. Bug introduced in commit f140e300. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171112013936.5942-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-17	qcow2: fix image corruption after committing qcow2 image into base	Daniel P. Berrange	5	-3/+238
	After committing the qcow2 image contents into the base image, qemu-img will call bdrv_make_empty to drop the payload in the layered image. When this is done for qcow2 images, it blows away the LUKS encryption header, making the resulting image unusable. There are two codepaths for emptying a qcow2 image, and the second (slower) codepath leaves the LUKS header intact, so force use of that codepath. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-11-17	block: Deprecate bdrv_set_read_only() and users	Kevin Wolf	7	-16/+54
	bdrv_set_read_only() is used by some block drivers to override the read-only option given by the user. This is not how read-only images generally work in QEMU: Instead of second guessing what the user really meant (which currently includes making an image read-only even if the user didn't only use the default, but explicitly said read-only=off), we should error out if we can't provide what the user requested. This adds deprecation warnings to all callers of bdrv_set_read_only() so that the behaviour can be corrected after the usual deprecation period. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-11-17	qcow2: don't permit changing encryption parameters	Daniel P. Berrange	1	-0/+3
	Currently if trying to change encryption parameters on a qcow2 image, qemu-img will abort. We already explicitly check for attempt to change encrypt.format but missed other parameters like encrypt.key-secret. Rather than list each parameter, just blacklist changing of all parameters with a 'encrypt.' prefix. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>