peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2011-12-05	cow: convert to .bdrv_co_is_allocated()	Stefan Hajnoczi	1	-4/+4
	The cow block driver does not keep internal state for cluster lookups. This means it is safe to perform cluster lookups in coroutine context without risk of race conditions that corrupt internal state. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	vdi: convert to .bdrv_co_is_allocated()	Stefan Hajnoczi	1	-3/+3
	It is trivial to switch from the synchronous .bdrv_is_allocated() interface to .bdrv_co_is_allocated() since vdi_is_allocated() does not block. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	vvfat: convert to .bdrv_co_is_allocated()	Stefan Hajnoczi	1	-2/+2
	It is trivial to switch from the synchronous .bdrv_is_allocated() interface to .bdrv_co_is_allocated() since vvfat_is_allocated() does not block. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	block: convert qcow2, qcow2, and vmdk to .bdrv_co_is_allocated()	Stefan Hajnoczi	3	-11/+18
	The qcow2, qcow, and vmdk block drivers are based on coroutines. They have a coroutine mutex which protects internal state. We can convert the .bdrv_is_allocated() function to .bdrv_co_is_allocated() by holding the mutex around the cluster lookup operation. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	qed: convert to .bdrv_co_is_allocated()	Stefan Hajnoczi	1	-4/+11
	The bdrv_qed_is_allocated() function is a synchronous wrapper around qed_find_cluster(), which performs the cluster lookup. In order to convert the synchronous function to a coroutine function we yield instead of using qemu_aio_wait(). Note that QED's cache is already safe for parallel requests so no locking is needed. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	block: add .bdrv_co_is_allocated()	Stefan Hajnoczi	2	-0/+39
	This patch adds the .bdrv_co_is_allocated() interface which is identical to .bdrv_is_allocated() but runs in coroutine context. Running in coroutine context implies that other coroutines might be performing I/O at the same time. Therefore it must be safe to run while the following BlockDriver functions are in-flight: .bdrv_co_readv() .bdrv_co_writev() .bdrv_co_flush() .bdrv_co_is_allocated() The new .bdrv_co_is_allocated() interface is useful because it can be used when a VM is running, whereas .bdrv_is_allocated() is a synchronous interface that does not cope with parallel requests. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	block: use public bdrv_is_allocated() interface	Stefan Hajnoczi	1	-1/+1
	There is no need for bdrv_commit() to use the BlockDriver .bdrv_is_allocated() interface directly. Converting to the public interface gives us the freedom to drop .bdrv_is_allocated() entirely in favor of a new .bdrv_co_is_allocated() in the future. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	qcow2: Fix error path in qcow2_snapshot_load_tmp	Kevin Wolf	1	-12/+22
	If the bdrv_read() of the snapshot's L1 table fails, return the right error code and make sure that the old L1 table is still loaded and we don't break the BlockDriverState completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Fix order in qcow2_snapshot_delete	Kevin Wolf	1	-15/+33
	First the snapshot must be deleted and only then the refcounts can be decreased. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Fix order of refcount updates in qcow2_snapshot_goto	Kevin Wolf	2	-18/+50
	The refcount updates must be moved so that in the worst case we can get cluster leaks, but refcounts may never be too low. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Return real error in qcow2_snapshot_goto	Kevin Wolf	1	-11/+40
	Besides fixing the return code, this adds some comments that make clear how the code works and that it potentially breaks images if we fail in the wrong place. Actually fixing this is left for the next patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Rework qcow2_snapshot_create error handling	Kevin Wolf	1	-14/+41
	Increase refcounts only after allocating a new L1 table has succeeded in order to make leaks less likely. If writing the snapshot table fails, revert in-memory state to be consistent with that on disk. While at it, make it return the real error codes instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Cleanups and memleak fix in qcow2_snapshot_create	Kevin Wolf	1	-15/+11
	sn->id_str could be leaked before this. The rest of this patch changes comments, fixes coding style or removes checks that are unnecessary with g_malloc. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Update snapshot table information at once	Kevin Wolf	1	-12/+10
	Failing in the middle wouldn't help with the integrity of the image, so doing everything in a single request seems better. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Return real error code in qcow2_write_snapshots	Kevin Wolf	1	-10/+38
	Doesn't immediately fix anything as the callers don't use the return value, but they will be fixed next. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	qcow2: Return real error code in qcow2_read_snapshots	Kevin Wolf	2	-7/+23
	Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05	block: Add coroutine_fn marker to coroutine functions	Dong Xu Wang	3	-8/+8
	Looks better when reviewing these source files. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	hmp/qmp: add block_set_io_throttle	Zhi Yong Wu	9	-2/+175
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	block: add I/O throttling algorithm	Zhi Yong Wu	3	-0/+236
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	CoQueue: introduce qemu_co_queue_wait_insert_head	Zhi Yong Wu	2	-0/+14
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	block: add the blockio limits command line support	Zhi Yong Wu	6	-0/+141
	Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	block: Use bdrv functions to replace file operation in cow.c	Li Zhi Hui	1	-18/+16
	Since common file operation functions lack of error detection, so change them to bdrv series functions. Signed-off-by: Li Zhi Hui <zhihuili@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	xen_disk: remove dead code	Paolo Bonzini	1	-84/+2
	Xen_disk.c has support for using synchronous I/O instead of asynchronous, but it is compiled out by default. Remove it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	qed: adjust the way to get nb_sectors	Zhi Yong Wu	1	-3/+3
	This patch is only to refactor some lines of codes to get better and more robust codes. As you have seen, in qed_read_table_cb() it's nice to use qiov->size because that function doesn't obviously use a single struct iovec. In other two functions, if qiov use more than one struct iovec, the existing way will get wrong nb_sectors. To make the code more robust, it will be nicer to refactor the existing way as below. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	qcow2: avoid reentrant bdrv_read() in copy_sectors()	Stefan Hajnoczi	1	-8/+19
	A BlockDriverState should not issue requests on itself through the public block layer interface. Nested, or reentrant, requests are problematic because they do I/O throttling and request tracking twice. Features like block layer copy-on-read use request tracking to avoid race conditions between concurrent requests. The reentrant request will have to "wait" for its parent request to complete. But the parent is waiting for the reentrant request to make progress so we have reached deadlock. The solution is for block drivers to avoid the public block layer interfaces for reentrant requests. Instead they should call their own internal functions if they wish to perform reentrant requests. This is also a good opportunity to make copy_sectors() a true coroutine_fn. That means calling bdrv_co_writev() instead of bdrv_write(). Behavior is unchanged but we're being explicit that this executes in coroutine context. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05	qcow2: Unlock during COW	Kevin Wolf	1	-69/+35
	Unlocking during COW allows for more parallelism. One change it requires is that buffers are dynamically allocated instead of just using a per-image buffer. While touching the code, drop the synchronous qcow2_read() function and replace it by a bdrv_read() call. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-01	Update version for 1.0 releasev1.0	Anthony Liguori	1	-1/+1
	Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-30	Makefile: use full path for qapi-generated directory	Michael Roth	1	-1/+1
	Generally $(BUILD_DIR) == $(CURDIR), but that isn't necessarilly the case, so use $(BUILD_DIR)/qapi-generated for generated files to avoid potentionally sticking generating files in odd places outside the build's include paths. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-30	qapi: fix guardname generation	Michael Roth	1	-3/+4
	Fix a bug in handling dotted paths, and exclude directory prefixes from generated guardnames to avoid odd/pseudo-random guardnames in generated headers. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	Update version for 1.0-rc4v1.0-rc4	Anthony Liguori	1	-1/+1
	Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	ccid: Fix buffer overrun in handling of VSC_ATR message	Markus Armbruster	1	-0/+1
	ATR size exceeding the limit is diagnosed, but then we merrily use it anyway, overrunning card->atr[]. The message is read from a character device. Obvious security implications unless the other end of the character device is trusted. Spotted by Coverity. CVE-2011-4111. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	Revert "fix out of tree build"	Anthony Liguori	1	-1/+1
	This reverts commit be85c90b74f56dca51782fa3080fcdf88593e045. This patch is incorrect and breaks the build with a freshly cloned git tree. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	configure: avoid screening of --{en, dis}able-usb-redir options	Max Filippov	1	-2/+8
	--dir) option pattern precede --{en,dis}able-usb-redir) patterns in the option analysis switch, making the latter options have no effect. There were some --dir that are supported by Autoconf and not by QEMU configure. The aim was to let QEMU packagers use the rpm (or similar) macro that overrides directories for their distribution. Replace --*dir with exact option names. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	cutils: Make strtosz & friends leave follow set to callers	Markus Armbruster	1	-44/+26
	strtosz() & friends require the size to be at the end of the string, or be followed by whitespace or ','. I find this surprising, because the name suggests it works like strtol(). The check simplifies callers that accept exactly that follow set slightly. No such callers exist. The check is redundant for callers that accept a smaller follow set, and thus need to check themselves anyway. Right now, this is the case for all but one caller. All of them neglected to check, or checked incorrectly, but the previous few commits fixed them up. Finally, the check is problematic for callers that accept a larger follow set. This is the case in monitor_parse_command(). Fortunately, the problems there are relatively harmless. monitor_parse_command() uses strtosz() for argument type 'o'. When the last argument is of type 'o', a trailing ',' is diagnosed differently than other trailing junk: (qemu) migrate_set_speed 1x invalid size (qemu) migrate_set_speed 1, migrate_set_speed: extraneous characters at the end of line A related inconsistency exists with non-last arguments. No such command exists, but let's use memsave to explore the inconsistency. The monitor permits, but does not require whitespace between arguments. For instance, "memsave (1-1)1024foo" is parsed as command memsave with three arguments 0, 1024 and "foo". Yes, this is daft, but at least it's consistently daft. If I change memsave's second argument from 'i' to 'o', then "memsave (1-1)1foo" is rejected, because the size is followed by an 'f'. But "memsave (1-1)1," is still accepted, and duly saves to file ",". We don't have any users of strtosz that profit from the check. In the users we have, it appears to encourage sloppy error checking, or gets in the way. Drop the bothersome check. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	qemu-img: Tighten parsing of size arguments	Markus Armbruster	1	-4/+6
	strtosz_suffix() fails unless the size is followed by 0, whitespace or ','. Useless here, because we need to fail for any junk following the size, even if it starts with whitespace or ','. Check manually. Things like "qemu-img create xxx 1024," and "qemu-img convert -S '1024 junk'" are now caught. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	x86/cpuid: Tighten parsing of tsc_freq=FREQ	Markus Armbruster	1	-1/+1
	cpu_x86_find_by_name() uses strtosz_suffix_unit(), but screws up the error checking. It detects some failures, but not all. Undetected failures result in a zero tsc_khz value (error value -1 divided by 1000), which means "no tsc_freq set". To reproduce, try "-cpu qemu64,tsc_freq=9999999T". strtosz_suffix_unit() fails, because the value overflows int64_t, Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	vl: Tighten parsing of -m argument	Markus Armbruster	1	-2/+3
	strtosz_suffix() fails unless the size is followed by 0, whitespace or ','. Useless here, because we need to fail for any junk following the size, even if it starts with whitespace or ','. Check manually. Things like "-m 1024," are now caught. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	vl: Tighten parsing of -numa's parameter mem	Markus Armbruster	1	-2/+2
	strtosz_suffix() fails unless the size is followed by 0, whitespace or ','. Useless here, because we need to fail for any junk following the size, even if it starts with whitespace or ','. Check manually. Things like -smp 4 -numa "node,mem=1024,cpus=0-1" -numa "node,mem=1024 cpus=2-3" are now caught. Before, the second -numa's argument was silently interpreted as just "node,mem=1024". Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	cutils: Drop broken support for zero strtosz default_suffix	Markus Armbruster	1	-13/+4
	Commit 9f9b17a4's strtosz() defaults a missing suffix to 'M', except it rejects fractions then (switch case 0). When commit d8427002 introduced strtosz_suffix(), that changed: fractions are no longer rejected, because we go to switch case 'M' on missing suffix now. Not mentioned in commit message, probably unintentional. Not worth changing back now. Because case 0 is still around, you can get the old behavior by passing a zero default_suffix to strtosz_suffix() or strtosz_suffix_unit(). Undocumented and not used. Drop. Commit d8427002 also neglected to update the function comment. Fix it up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	configure: tighten pie toolchain support test for tls variables	Avi Kivity	1	-1/+11
	Some toolchains don't support pie properly when tls variables are in use. Disallow pie when such toolchains are detected. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	usb-redir: Don't try to write to the chardev after a close event	Hans de Goede	1	-0/+4
	Since we handle close async in a bh, do_write and thus write can get called after receiving a close event. This patch adds a check to the usb-redir write callback to not call qemu_chr_fe_write on a closed backend. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	usb-redir: Device disconnect + re-connect robustness fixes	Hans de Goede	1	-1/+21
	These fixes mainly target the other side sending some (error status) packets after a disconnect packet. In some cases these would get queued up and then reported to the controller when a new device gets connected. * Fully reset device state on disconnect * Don't allow a connect message when already connected * Ignore iso and interrupt status messages when disconnected Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	usb-redir: Call qemu_chr_fe_open/close	Hans de Goede	1	-0/+3
	To let the chardev now we're ready start receiving data. This is necessary with the spicevmc chardev to get it registered with the spice-server. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	spice-qemu-char: Generate chardev open/close events	Hans de Goede	1	-1/+35
	Define a state callback and make that generate chardev open/close events when called by the spice-server. Notes: 1) For all but the newest spice-server versions (which have a fix for this) the code ignores these events for a spicevmc with a subtype of vdagent, this subtype specific knowledge is undesirable, but unavoidable for now, see: http://lists.freedesktop.org/archives/spice-devel/2011-July/004837.html 2) This code deliberately sends the events immediately rather then from a bh. This is done this way because: a) There is no need to do it from a bh; and b) Doing it from a bh actually causes issues because the spice-server may send data immediately after the open and when the open runs from a bh, then qemu_chr_be_can_write will return 0 for the first write which the spice-server does not expect, when this happens the spice-server will never retry the write causing communication to stall. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	qemu-char: rename qemu_chr_event to qemu_chr_be_event and make it public	Hans de Goede	2	-13/+23
	Rename qemu_chr_event to qemu_chr_be_event, since it is only to be called by backends and make it public so that it can be used by chardev code which lives outside of qemu-char.c Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	9pfs: improve portability to older systems	Aneesh Kumar K.V	1	-1/+6
	I guess we can also make sure we don't call local_ioc_getversion at all. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	tci: Make flush_icache_range() inline	Stefan Weil	2	-4/+4
	This is standard for other tcg targets and improves tci, too. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	eepro100: Fix alignment requirement for statistical counters	Stefan Weil	1	-1/+9
	According to Intel's Open Source Software Developer Manual, the dump counters address must be Dword aligned. The new code enforces this alignment, so s->statsaddr may now be used with stw_le_pci_dma() and stl_le_pci_dma(). Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	virtio: add and use virtio_set_features	Paolo Bonzini	5	-20/+23
	vdev->guest_features is not masking features that are not supported by the guest. Fix this by introducing a common wrapper to be used by all virtio bus implementations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-28	9pfs: improve portability to older systems	Paolo Bonzini	3	-7/+8
	Small requirements on "new" features have percolated to virtio-9p-local.c. In particular, the utimensat wrapper actually only supports dirfd = AT_FDCWD and flags = AT_SYMLINK_NOFOLLOW in the fallback code. Remove the arguments so that virtio-9p-local.c will not use AT_* constants. At the same time, fail local_ioc_getversion if the ioctl is not supported by the host. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>