peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2013-07-26	qcow2: Use dashes instead of underscores in options	Kevin Wolf	1	-1/+1
	This is what QMP wants to use. The options haven't been enabled in any release yet, so we're still free to change them. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-06-28	block: change default of .has_zero_init to 0	Peter Lieven	1	-0/+1
	.has_zero_init defaults to 1 for all formats and protocols. this is a dangerous default since this means that all new added drivers need to manually overwrite it to 0 if they do not ensure that a device is zero initialized after bdrv_create(). if a driver needs to explicitly set this value to 1 its easier to verify the correctness in the review process. during review of the existing drivers it turned out that ssh and gluster had a wrong default of 1. both protocols support host_devices as backend which are not by default zero initialized. this wrong assumption will lead to possible corruption if qemu-img convert is used to write to such a backend. vpc and vmdk also defaulted to 1 altough they support fixed respectively flat extends. this has to be addresses in separate patches. both formats as well as the mentioned ssh and gluster are turned to the default of 0 with this patch for safety. a similar problem with the wrong default existed for iscsi most likely because the driver developer did oversee the default value of 1. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-24	qcow2: Batch discards	Kevin Wolf	1	-0/+1
	This optimises the discard operation for freed clusters by batching discard requests (both snapshot deletion and bdrv_discard end up updating the refcounts cluster by cluster). Note that we don't discard asynchronously, but keep s->lock held. This is to avoid that a freed cluster is reallocated and written to while the discard is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24	qcow2: Options to enable discard for freed clusters	Kevin Wolf	1	-0/+26
	Deleted snapshots are discarded in the image file by default, discard requests take their default from the -drive discard=... option and other places that free clusters must always be enabled explicitly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24	qcow2: Add refcount update reason to all callers	Kevin Wolf	1	-1/+2
	This adds a refcount update reason to all callers of update_refcounts(), so that a follow-up patch can use this information to decide whether clusters that reach a refcount of 0 should be discarded in the image file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-14	qcow2: Catch some L1 table index overflows	Kevin Wolf	1	-2/+11
	This catches the situation that is described in the bug report at https://bugs.launchpad.net/qemu/+bug/865518 and goes like this: $ qemu-img create -f qcow2 huge.qcow2 $((10241024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off $ qemu-io /tmp/huge.qcow2 -c "write $((102410241024102410241024 - 1024)) 512" Segmentation fault With this patch applied the segfault will be avoided, however the case will still fail, though gracefully: $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((10241024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off qemu-img: The image size is too large for file format 'qcow2' Note that even long before these overflow checks kick in, you get insanely high memory usage (up to INT_MAX sizeof(uint64_t) = 16 GB for the L1 table), so with somewhat smaller image sizes you'll probably see qemu aborting for a failed g_malloc(). If you need huge image sizes, you should increase the cluster size to the maximum of 2 MB in order to get higher limits. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-22	qcow2: allow sub-cluster compressed write to last cluster	Stefan Hajnoczi	1	-2/+15
	Compression in qcow2 requires image length to be a multiple of the cluster size. Lift this requirement by zero-padding the final cluster when necessary. The virtual disk size is still not cluster-aligned, so the guest cannot access the zero sectors. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-15	block: Introduce bdrv_pwritev() for qcow2_save_vmstate	Kevin Wolf	1	-7/+1
	Directly pass the QEMUIOVector on instead of linearising it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15	block: Introduce bdrv_writev_vmstate	Kevin Wolf	1	-3/+9
	Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-13	aes: move aes.h from include/block to include/qemu	Aurelien Jarno	1	-1/+1
	Move aes.h from include/block to include/qemu to show it can be reused by other subsystems. Cc: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-03-28	qcow2: Allow requests with multiple l2metas	Kevin Wolf	1	-3/+11
	Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28	qcow2: Remove bogus unlock of s->lock	Kevin Wolf	1	-2/+0
	The unlock wakes up the next coroutine, but the currently running coroutine will lock it again before it yields, so this doesn't make a lot of sense. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-22	block: Add options QDict to bdrv_file_open() prototypes	Kevin Wolf	1	-1/+1
	The new parameter is unused yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-19	qcow2: Fix segfault in qcow2_invalidate_cache	Kevin Wolf	1	-2/+10
	Need to pass an options QDict to qcow2_open() now. This fixes a segfault on the migration target with qcow2. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15	qcow2: make is_allocated return true for zero clusters	Paolo Bonzini	1	-5/+1
	Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed. QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this directly in qcow2_get_cluster_offset instead of replicating the test everywhere. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15	qcow2: Allow lazy refcounts to be enabled on the command line	Kevin Wolf	1	-0/+37
	qcow2 images now accept a boolean lazy_refcounts options. Use it like this: -drive file=test.qcow2,lazy_refcounts=on If the option is specified on the command line, it overrides the default specified by the qcow2 header flags that were set when creating the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15	block: Add options QDict to bdrv_open() prototype	Kevin Wolf	1	-1/+1
	It doesn't do anything yet except storing the options QDict in the BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15	block: Add options QDict to .bdrv_open()	Kevin Wolf	1	-2/+2
	Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-25	block: Use error code EMEDIUMTYPE for wrong format in some block drivers	Stefan Weil	1	-1/+1
	This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk when a file with the wrong format is selected. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-15	qcow2: Fix segfault on zero-length write	Kevin Wolf	1	-1/+1
	One of the recent refactoring patches (commit f50f88b9) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-19	misc: move include files to include/qemu/	Paolo Bonzini	1	-2/+2
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19	block: move include files to include/block/	Paolo Bonzini	1	-2/+2
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19	qapi: move include files to include/qobject/	Paolo Bonzini	1	-1/+1
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-13	qcow2: Execute run_dependent_requests() without lock	Kevin Wolf	1	-20/+16
	There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held. Also, before this patch, run_dependent_requests() not only does what its name suggests, but also removes the l2meta from the list of in-flight requests. When changing this, it becomes an one-liner, so just inline it completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13	qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2	Kevin Wolf	1	-6/+1
	This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13	qcow2: Allocate l2meta only for cluster allocations	Kevin Wolf	1	-15/+17
	Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13	qcow2: Drop l2meta.cluster_offset	Kevin Wolf	1	-7/+7
	There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13	qcow2: Allocate l2meta dynamically	Kevin Wolf	1	-11/+15
	As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12	qcow2: Move BLKDBG_EVENT out of the lock	Kevin Wolf	1	-1/+1
	We want to use these events to suspend requests for testing concurrent AIO requests. Suspending requests while they are holding the CoMutex is rather boring for this purpose. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-05	qcow2: mark this file's sole strncpy use as justified	Jim Meyering	1	-0/+1
	Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-24	block: qcow2 image file reopen	Jeff Cody	1	-0/+10
	These are the stubs for the file reopen drivers for the qcow2 format. There is currently nothing that needs to be done by the qcow2 driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10	block: add BLOCK_O_CHECK for qemu-img check	Stefan Hajnoczi	1	-2/+2
	Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c. Fix this "double repair" since it leads to confusing output from qemu-img check. Tell the block driver that this image is being opened just for bdrv_check(). This skips automatic repair and qemu-img.c can invoke it manually with bdrv_check(). Update the golden output for qemu-iotests 039 to reflect the new qemu-img check output. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10	qcow2: mark image clean after repair succeeds	Stefan Hajnoczi	1	-13/+15
	The dirty bit is cleared after image repair succeeds in qcow2_open(). Move this into qcow2_check() so that all callers benefit from this behavior when fix mode is enabled. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06	qcow2: implement lazy refcounts	Stefan Hajnoczi	1	-4/+69
	Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06	qcow2: introduce dirty bit	Stefan Hajnoczi	1	-3/+47
	This patch adds an incompatible feature bit to mark images that have not been closed cleanly. When a dirty image file is opened a consistency check and repair is performed. Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09	Merge remote-tracking branch 'mjt/mjt-iov2' into staging	Anthony Liguori	1	-12/+9
	* mjt/mjt-iov2: rewrite iov_send_recv() and move it to iov.c cleanup qemu_co_sendv(), qemu_co_recvv() and friends export iov_send_recv() and use it in iov_send() and iov_recv() rename qemu_sendv to iov_send, change proto and move declarations to iov.h change qemu_iovec_to_buf() to match other to,from_buf functions consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent allow qemu_iovec_from_buffer() to specify offset from which to start copying consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset() rewrite iov_* functions change iov_* function prototypes to be more appropriate virtio-serial-bus: use correct lengths in control_out() message Conflicts: tests/Makefile Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-09	qcow2: fix #ifdef'd qcow2_check_refcounts() callers	Stefan Hajnoczi	1	-1/+1
	The DEBUG_ALLOC qcow2.h macro enables additional consistency checks throughout the code. This makes it easier to spot corruptions that are introduced during development. Since consistency check is an expensive operation the DEBUG_ALLOC macro is used to compile checks out in normal builds and qcow2_check_refcounts() calls missed the addition of a new function argument. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15	qcow2: fix autoclear image header update	Stefan Hajnoczi	1	-8/+9
	The autoclear feature bits can be used for qcow2 file format features that are safe to "drop" by old programs that do not understand the feature. Upon opening the image file unknown autoclear feature bits are cleared and the image file header is rewritten, but this was happening too early in the code when critical header fields were not yet loaded. Process autoclear feature bits after all necessary header information has been loaded. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15	qcow2: always operate caches in writeback mode	Paolo Bonzini	1	-5/+2
	Writethrough does not need special-casing anymore in the qcow2 caches. The block layer adds flushes after every guest-initiated data write, and these will also flush the qcow2 caches to the OS. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15	qcow2: Support for fixing refcount inconsistencies	Kevin Wolf	1	-5/+1
	Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15	qemu-img check -r for repairing images	Kevin Wolf	1	-1/+6
	The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open(). This adds a -r switch to qemu-img check that allows manual invocation of an image repair. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-11	change qemu_iovec_to_buf() to match other to,from_buf functions	Michael Tokarev	1	-1/+1
	It now allows specifying offset within qiov to start from and amount of bytes to copy. Actual implementation is just a call to iov_to_buf(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11	consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent	Michael Tokarev	1	-2/+2
	qemu_iovec_concat() is currently a wrapper for qemu_iovec_copy(), use the former (with extra "0" arg) in a few places where it is used. Change skip argument of qemu_iovec_copy() from uint64_t to size_t, since size of qiov itself is size_t, so there's no way to skip larger sizes. Rename it to soffset, to make it clear that the offset is applied to src. Also change the only usage of uint64_t in hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() - all callers of it actually uses size_t too, not uint64_t. One added restriction: as for all other iovec-related functions, soffset must point inside src. Order of argumens is already good: qemu_iovec_memset(QEMUIOVector qiov, size_t offset, int c, size_t bytes) vs: qemu_iovec_concat(QEMUIOVector dst, QEMUIOVector *src, size_t soffset, size_t sbytes) (note soffset is after _src_ not dst, since it applies to src; for memset it applies to qiov). Note that in many places where this function is used, the previous call is qemu_iovec_reset(), which means many callers actually want copy (replacing dst content), not concat. So we may want to add a wrapper like qemu_iovec_copy() with the same arguments but which calls qemu_iovec_reset() before _concat(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11	allow qemu_iovec_from_buffer() to specify offset from which to start copying	Michael Tokarev	1	-6/+3
	Similar to qemu_iovec_memset(QEMUIOVector qiov, size_t offset, int c, size_t bytes); the new prototype is: qemu_iovec_from_buf(QEMUIOVector qiov, size_t offset, const void *buf, size_t bytes); The processing starts at offset bytes within qiov. This way, we may copy a bounce buffer directly to a middle of qiov. This is exactly the same function as iov_from_buf() from iov.c, so use the existing implementation and rename it to qemu_iovec_from_buf() to be shorter and to match the utility function. As with utility implementation, we now assert that the offset is inside actual iovec. Nothing changed for current callers, because `offset' parameter is new. While at it, stop using "bounce-qiov" in block/qcow2.c and copy decrypted data directly from cluster_data instead of recreating a temp qiov for doing that. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11	consolidate qemu_iovec_memset{,_skip}() into single function and use ↵	Michael Tokarev	1	-3/+3
	existing iov_memset() This patch combines two functions into one, and replaces the implementation with already existing iov_memset() from iov.c. The new prototype of qemu_iovec_memset(): size_t qemu_iovec_memset(qiov, size_t offset, int fillc, size_t bytes) It is different from former qemu_iovec_memset_skip(), and I want to make other functions to be consistent with it too: first how much to skip, second what, and 3rd how many of it. It also returns actual number of bytes filled in, which may be less than the requested `bytes' if qiov is smaller than offset+bytes, in the same way iov_memset() does. While at it, use utility function iov_memset() from iov.h in posix-aio-compat.c, where qiov was used. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-05-25	qcow2: don't leak buffer for unexpected qcow_version in header	Jim Meyering	1	-1/+2
	Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-14	qcow2: Don't ignore failure to clear autoclear flags	Kevin Wolf	1	-1/+4
	Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10	block: push bdrv_change_backing_file error checking up from drivers	Paolo Bonzini	1	-5/+0
	This check applies to all drivers, but QED lacks it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07	qcow2: lock on prealloc	Zhi Yong Wu	1	-0/+3
	preallocate() will be locked. This is required because qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it can drop while COW is being performed. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02	block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()	Stefan Weil	1	-1/+2
	Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>