peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2010-09-21	qcow2: Avoid bounce buffers for AIO write requests	Kevin Wolf	1	-23/+18
	qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO write path. Encrypted images continue to use a bounce buffer, however with constant size. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	qcow2: Avoid bounce buffers for AIO read requests	Kevin Wolf	3	-30/+68
	qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO read path, and constrains them to a constant size for encrypted images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	qcow2: Get rid of additional sync on COW	Kevin Wolf	1	-2/+8
	We always have a sync for the refcount update when a new cluster is allocated. If we move this past the COW, we can save an additional sync. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	qcow2: Move sync out of qcow2_alloc_clusters	Kevin Wolf	3	-2/+7
	Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	qcow2: Move sync out of update_refcount	Kevin Wolf	1	-2/+11
	Note that the flush is omitted intentionally in qcow2_free_clusters. If anything, we can leak clusters here if we lose the writes. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	qcow2: Move sync out of write_refcount_block_entries	Kevin Wolf	1	-1/+3
	Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	nbd: correctly manage default port	Laurent Vivier	1	-2/+0
	block/nbd.c: use default port number when none is specified qemu-nbd.c: use IANA-assigned port number: 10809 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	raw-posix: handle > 512 byte alignment correctly	Christoph Hellwig	1	-33/+46
	Replace the hardcoded handling of 512 byte alignment with bs->buffer_alignment to handle larger sector size devices correctly. Note that we can not rely on it to be initialize in bdrv_open, so deal with the worst case there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21	vvfat: Use cache=unsafe	Kevin Wolf	1	-4/+10
	The qcow file used for write support in vvfat is a temporary file, so we can use cache=unsafe there. Without this, write support is just too slow to be of any use. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
2010-09-21	vvfat: Fix double free for opening the image rw	Kevin Wolf	1	-3/+4
	Allocation and deallocation of bs->opaque is not in the control of a block driver. Therefore it should not set bs->opaque to a data structure used by another bs, or closing the image will lead to a double free. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
2010-09-21	vvfat: Fix segfault on write to read-only disk	Kevin Wolf	1	-0/+5
	vvfat tries to set the readonly flag in its open function, but nowadays this is overwritted with the readonly=... command line option. Check in bdrv_write if the vvfat was opened read-only and return an error in this case. Without this check, vvfat tries to access the qcow bs, which is NULL without enabled write support. Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
2010-09-18	blkdebug: fix enum comparison	Blue Swirl	1	-3/+1
	The signedness of enum types depend on the compiler implementation. Therefore the check for negative values may or may not be meaningful. Fix by explicitly casting to a signed integer. Since the values are also checked earlier against event_names table, this is an internal error. Change the 'if' to 'assert'. This also avoids a warning with GCC flag -Wtype-limits. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-09-08	Revert "Make default invocation of block drivers safer (v3)"	Anthony Liguori	1	-130/+0
	This reverts commit 79368c81bf8cf93864d7afc88b81b05d8f0a2c90. Conflicts: block.c I haven't been able to come up with a solution yet for the corruption caused by unaligned requests from the IDE disk so revert until a solution can be written. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-09-08	qcow2: Remove unnecessary flush after L2 write	Kevin Wolf	1	-4/+12
	When a new cluster was allocated, we only need a flush after the write to the L2 table if it was a COW and we need to decrease the refcounts of the old clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-08	raw-posix: improve detection of scsi-generic devices	Bernhard Kohl	1	-2/+8
	Allow symbolic links which point to /dev/sgX devices. Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-08	raw-posix: Don't use file name for host_cdrom detection on Linux	Kevin Wolf	1	-3/+0
	On Linux, we have code to detect CD-ROMs using an ioctl. We shouldn't lose anything but false positives by removing the check for a /dev/cd* path. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30	nbd: Introduce NBD named exports.	Laurent Vivier	1	-19/+49
	This patch allows to connect Qemu using NBD protocol to an nbd-server using named exports. For instance, if on the host "isoserver", in /etc/nbd-server/config, you have: [generic] [debian-500-ppc-netinst] exportname = /ISO/debian-500-powerpc-netinst.iso [Fedora-10-ppc-netinst] exportname = /ISO/Fedora-10-ppc-netinst.iso You can connect to it, using: qemu -cdrom nbd:isoserver:exportname=debian-500-ppc-netinst qemu -cdrom nbd:isoserver:exportname=Fedora-10-ppc-netinst NOTE: you need at least nbd-server 2.9.18 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30	vvfat: fat_chksum(): fix access above array bounds	Loïc Minier	1	-1/+1
	Signed-off-by: Loïc Minier <loic.minier@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-30	sheepdog: remove unnecessary includes	Izumi Tsutsui	1	-10/+0
	"qemu_socket.h" includes all necessary files and including <netinet/tcp.h> without <netinet/in.h> could cause errors on some systems. Signed-off-by: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-03	block: Fix bdrv_has_zero_init	Kevin Wolf	3	-4/+21
	Assuming that any image on a block device is not properly zero-initialized is actually wrong: Only raw images have this problem. Any other image format shouldn't care about it, they initialize everything properly themselves. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-25	block: Replace u_int8_t, u_int16_t, u_int32_t, u_int64_t by standard int types	Stefan Weil	1	-1/+1
	There is no need to have a second set of integral types. Replace them by the standard types from stdint.h. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-15	Make default invocation of block drivers safer (v3)	Anthony Liguori	1	-0/+130
	CVE-2008-2004 described a vulnerability in QEMU whereas a malicious user could trick the block probing code into accessing arbitrary files in a guest. To mitigate this, we added an explicit format parameter to -drive which disabling block probing. Fast forward to today, and the vast majority of users do not use this parameter. libvirt does not use this by default nor does virt-manager. Most users want block probing so we should try to make it safer. This patch adds some logic to the raw device which attempts to detect a write operation to the beginning of a raw device. If the first 4 bytes happen to match an image file that has a backing file that we support, it scrubs the signature to all zeros. If a user specifies an explicit format parameter, this behavior is disabled. I contend that while a legitimate guest could write such a signature to the header, we would behave incorrectly anyway upon the next invocation of QEMU. This simply changes the incorrect behavior to not involve a security vulnerability. I've tested this pretty extensively both in the positive and negative case. I'm not 100% confident in the block layer's ability to deal with zero sized writes particularly with respect to the aio functions so some additional eyes would be appreciated. Even in the case of a single sector write, we have to make sure to invoked the completion from a bottom half so just removing the zero sized write is not an option. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-07-07	sheepdog: fix compile error on systems without TCP_CORK	MORITA Kazutaka	1	-1/+1
	WIN32 is not only the system which doesn't have TCP_CORK (e.g. OS X). Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-07-06	block: add sheepdog driver for distributed storage support	MORITA Kazutaka	1	-0/+2036
	Sheepdog is a distributed storage system for QEMU. It provides highly available block level storage volumes to VMs like Amazon EBS. This patch adds a qemu block driver for Sheepdog. Sheepdog features are: - No node in the cluster is special (no metadata node, no control node, etc) - Linear scalability in performance and capacity - No single point of failure - Autonomous management (zero configuration) - Useful volume management support such as snapshot and cloning - Thin provisioning - Autonomous load balancing The more details are available at the project site: http://www.osrg.net/sheepdog/ Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06	raw-posix: Fix test for host CD-ROM	Markus Armbruster	1	-11/+6
	raw_pread_aligned() retries up to two times if the block device backs a virtual CD-ROM (a drive with media=cdrom and if=ide, scsi, xen or none). This makes no sense. Whether retrying reads can correct read errors can only depend on what we're reading, not on how the result gets used. We need to check what whether we're reading from a physical CD-ROM or floppy here. I doubt retrying is useful even then. Left for another day. Impact: * Virtual CD-ROM backed by host_cdrom behaves the same. * Virtual CD-ROM backed by file or host_device no longer retries. * A drive backed by host_cdrom now retries even if it's not a virtual CD-ROM. * Any drive backed by host_floppy now retries. While there, clean up gratuitous use of goto. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06	qcow2/vdi: Change check to distinguish error cases	Kevin Wolf	4	-63/+73
	This distinguishes between harmless leaks and real corruption. Hopefully users better understand what qemu-img check wants to tell them. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02	blkdebug: Initialize state as 1	Kevin Wolf	1	-0/+3
	state = 0 in rules means that the rule is valid for any state. Therefore it's impossible to have a rule that works only in the initial state. This changes the initial state from 0 to 1 to make this possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02	blkdebug: Free QemuOpts after having read the config	Kevin Wolf	1	-0/+2
	Forgetting to free them means that the next instance inherits all rules and gets its own rules only additionally. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02	blkdebug: Fix set_state_opts definition	Kevin Wolf	1	-1/+1
	The list head was initialized to point to the wrong list, so all actions ended up being handled as inject-error even if they were set-state in fact. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02	qcow2: Fix error handling during metadata preallocation	Kevin Wolf	1	-6/+9
	People were wondering why qemu-img check failed after they tried to preallocate a large qcow2 file and ran out of disk space. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	qcow2: Don't try to check tables that couldn't be loaded	Kevin Wolf	1	-0/+1
	Trying to check them leads to a second error message which is more confusing than helpful: Can't get refcount for cluster 0: Invalid argument ERROR cluster 0 refcount=-22 reference=1 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	qcow2: Fix qemu-img check segfault on corrupted images	Kevin Wolf	1	-3/+11
	With corrupted images, we can easily get an cluster index that exceeds the array size of the temporary refcount table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	vpc: Use bdrv_(p)write_sync for metadata writes	Kevin Wolf	1	-4/+5
	Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	vmdk: Use bdrv_(p)write_sync for metadata writes	Kevin Wolf	1	-5/+5
	Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	qcow2: Use bdrv_(p)write_sync for metadata writes	Kevin Wolf	4	-41/+40
	Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	qcow: Use bdrv_(p)write_sync for metadata writes	Kevin Wolf	1	-8/+10
	Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22	cow: Use bdrv_(p)write_sync for metadata writes	Kevin Wolf	1	-9/+11
	Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. While at it, correct the wrong usage of errno. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	cow: use qemu block API	Christoph Hellwig	1	-26/+13
	Use bdrv_pwrite to access the backing device instead of pread, and convert the driver to implementing the bdrv_open method which gives it an already opened BlockDriverState for the underlying device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	cow: stop using mmap	Christoph Hellwig	1	-37/+61
	We don't have an equivalent to mmap in the qemu block API, so read and write the bitmap directly. At least in the dumb implementation added in this patch this is a lot less efficient, but it means cow can also work on windows, and over nbd or curl. And it fixes qemu-iotests testcase 012 which did not work properly due to issues with read-only mmap access. In addition we can also get rid of the now unused get_mmap_addr function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	cow: use pread/pwrite	Christoph Hellwig	1	-5/+5
	Use pread/pwrite instead of lseek + read/write in preparation of using the qemu block API. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	qcow2: Restore L1 entry on l2_allocate failure	Kevin Wolf	1	-0/+1
	If writing the L1 table to disk failed, we need to restore its old content in memory to avoid inconsistencies. Reported-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	qcow2: Return real error code in load_refcount_block	Kevin Wolf	1	-3/+8
	This fixes load_refcount_block which completely ignored the return value of write_refcount_block and always returned -EIO for bdrv_pwrite failure. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	qcow2: Allow alloc_clusters_noref to return errors	Kevin Wolf	1	-3/+15
	Currently it would consider blocks for which get_refcount fails used. However, it's unlikely that get_refcount would succeed for the next cluster, so it's not really helpful. Return an error instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	qcow2: Allow get_refcount to return errors	Kevin Wolf	1	-4/+37
	get_refcount might need to load a refcount block from disk, so errors may happen. Return the error code instead of assuming a refcount of 1 and change the callers to respect error return values. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15	vpc: Read/write multiple sectors at once	Kevin Wolf	1	-11/+30
	This changes the vpc block driver (for VHD) to read/write multiple sectors at once instead of doing a request for each single sector. Before this, running qemu-iotests for VPC took ages, now it's actually quite reasonable to run it always (down from ~1 hour to 40 seconds for me). Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-14	Merge remote branch 'kwolf/for-anthony' into staging	Anthony Liguori	1	-9/+11
	Conflicts: hw/pc.c
2010-06-13	Move stdbool.h	Paul Brook	1	-2/+0
	Move inclusion of stdbool.h to common header files, instead of including in an ad-hoc manner. Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-06-04	Cleanup: raw-posix.c: Be more consistent using BDRV_SECTOR_SIZE instead of 512	Jes Sorensen	1	-9/+11
	Clean up raw-posix.c to be more consistent using BDRV_SECTOR_SIZE instead of hard coded 512 values. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28	qcow2: Fix corruption after error in update_refcount	Kevin Wolf	1	-0/+4
	After it is done with updating refcounts in the cache, update_refcount writes all changed entries to disk. If a refcount block allocation fails, however, there was no change yet and therefore first_index = last_index = -1. Don't treat -1 as a normal sector index (resulting in a 512 byte write!) but return without updating anything in this case. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28	qcow2: Fix corruption after refblock allocation	Kevin Wolf	1	-2/+9
	Refblock allocation code needs to take into consideration that update_refcount will load a different refcount block into the cache, so it must initialize the cache for a new refcount block only afterwards. Not doing this means that not only the refcount in the wrong block is updated, but also that the caller will work on the wrong block. Signed-off-by: Kevin Wolf <kwolf@redhat.com>