summaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)AuthorFilesLines
2013-01-15block: Fix how mirror_run() frees its bufferMarkus Armbruster1-1/+1
It allocates with qemu_blockalign(), therefore it must free with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15win32-aio: Fix how win32_aio_process_completion() frees bufferMarkus Armbruster1-1/+1
win32_aio_submit() allocates it with qemu_blockalign(), therefore it must be freed with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15sheepdog: clean up sd_aio_setup()Liu Yuan1-6/+5
The last two parameters of sd_aio_setup() are never used, so remove them. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15sheepdog: multiplex the rw FD to flush cacheLiu Yuan1-45/+37
This will reduce sockfds connected to the sheep server to one, which simply the future hacks. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15block: make discard asynchronousPaolo Bonzini2-81/+88
This is easy with the thread pool, because we can use s->is_xfs and s->has_discard from the worker function. QEMU has a widespread assumption that each I/O operation writes less than 2^32 bytes. This patch doesn't fix it throughout of course, but it starts correcting struct RawPosixAIOData so that there is no regression with respect to the synchronous discard implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15raw: support discard on block devicesPaolo Bonzini1-0/+36
Block devices use a ioctl instead of fallocate, so add a separate implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15raw-posix: remember whether discard failedPaolo Bonzini1-3/+6
Avoid sending system calls repeatedly if they shall fail. This does not apply to XFS: if the filesystem-specific ioctl fails, something weird is happening. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15raw-posix: support discard on more filesystemsKusanagi Kouichi1-2/+24
Linux 2.6.38 introduced the filesystem independent interface to deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2, tmpfs and xfs support it. Even though the system calls here are in practice issued on Linux, the code is structured to allow plugging in alternatives for other Unix variants. EOPNOTSUPP is used unconditionally in this patch, but it is supported in both OpenBSD and Mac OS X since forever (see for example http://lists.debian.org/debian-glibc/2006/02/msg00337.html). Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15qcow2: Fix segfault on zero-length writeKevin Wolf1-1/+1
One of the recent refactoring patches (commit f50f88b9) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-14Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori2-56/+42
* kwolf/for-anthony: dataplane: handle misaligned virtio-blk requests dataplane: extract virtio-blk read/write processing into do_rdwr_cmd() block: make qiov_is_aligned() public raw-posix: fix bdrv_aio_ioctl sheepdog: implement direct write semantics block: do not probe zero-sized disks Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-01-14block: make qiov_is_aligned() publicStefan Hajnoczi1-17/+1
The qiov_is_aligned() function checks whether a QEMUIOVector meets a BlockDriverState's alignment requirements. This is needed by virtio-blk-data-plane so: 1. Move the function from block/raw-posix.c to block/block.c. 2. Make it public in block/block.h. 3. Rename to bdrv_qiov_is_aligned(). 4. Change return type from int to bool. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-14raw-posix: fix bdrv_aio_ioctlPaolo Bonzini1-9/+1
When the raw-posix aio=thread code was moved from posix-aio-compat.c to block/raw-posix.c, there was an unintended change to the ioctl code. The code used to return the ioctl command, which posix_aio_read() would later morph into a zero. This hack is not necessary anymore, and in fact breaks scsi-generic (which expects a zero return code). Remove it. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-14sheepdog: implement direct write semanticsLiu Yuan1-30/+40
Sheepdog supports both writeback/writethrough write but has not yet supported DIRECTIO semantics which bypass the cache completely even if Sheepdog daemon is set up with cache enabled. Suppose cache is enabled on Sheepdog daemon size, the new cache control is cache=writeback # enable the writeback semantics for write cache=writethrough # enable the emulated writethrough semantics for write cache=directsync # disable cache competely Guest WCE toggling on the run time to toggle writeback/writethrough is also supported. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-12qemu-option: move standard option definitions out of qemu-config.cPaolo Bonzini1-0/+27
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-01-11Replace remaining gmtime, localtime by gmtime_r, localtime_rStefan Weil1-4/+0
This allows removing of MinGW specific code and improves reentrancy for POSIX hosts. [Removed unused ret variable in qemu_get_timedate() to fix warning: vl.c: In function ‘qemu_get_timedate’: vl.c:451:16: error: variable ‘ret’ set but not used [-Werror=unused-but-set-variable] -- Stefan Hajnoczi] Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-02sheepdog: pass oid directly to send_pending_req()Liu Yuan1-1/+1
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-02sheepdog: don't update inode when create_and_write failsLiu Yuan1-4/+5
For the error case such as SD_RES_NO_SPACE, we shouldn't update the inode bitmap to avoid the scenario that the object is allocated but wasn't created at the server side. This will result in VM's IO error on the failed object. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-02block/raw-win32: Fix compiler warnings (wrong format specifiers)Stefan Weil1-2/+2
Commit fbcad04d6bfdff937536eb23088a01a280a1a3af added fprintf statements with wrong format specifiers. GetLastError() returns a DWORD which is unsigned long, so %lu must be used. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-02raw-posix: add raw_get_aio_fd() for virtio-blk-data-planeStefan Hajnoczi1-0/+34
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the file descriptor of a raw image file with Linux AIO enabled. This interface is really a layering violation that can be resolved once the block layer is able to run outside the global mutex - at that point virtio-blk-data-plane will switch from custom Linux AIO code to using the block layer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-19softmmu: move include files to include/sysemu/Paolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19misc: move include files to include/qemu/Paolo Bonzini25-42/+42
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19migration: move include files to include/migration/Paolo Bonzini6-6/+6
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19block: move include files to include/block/Paolo Bonzini33-43/+43
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19qapi: move include files to include/qobject/Paolo Bonzini2-2/+2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19janitor: do not include qemu-char everywherePaolo Bonzini1-1/+0
Touching char/char.h basically causes the whole of QEMU to be rebuilt. Avoid this, it is usually unnecessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19janitor: do not rely on indirect inclusions of or from qemu-char.hPaolo Bonzini2-0/+2
Various header files rely on qemu-char.h including qemu-config.h or main-loop.h, but they really do not need qemu-char.h at all (particularly interesting is the case of the block layer!). Clean this up, and also add missing inclusions of qemu-char.h itself. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19build: move rules from Makefile to */Makefile.objsPaolo Bonzini1-0/+2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-13qcow2: Factor out handle_dependencies()Kevin Wolf1-28/+42
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Execute run_dependent_requests() without lockKevin Wolf1-20/+16
There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held. Also, before this patch, run_dependent_requests() not only does what its name suggests, but also removes the l2meta from the list of in-flight requests. When changing this, it becomes an one-liner, so just inline it completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2Kevin Wolf3-7/+7
This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Allocate l2meta only for cluster allocationsKevin Wolf3-31/+31
Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Drop l2meta.cluster_offsetKevin Wolf3-15/+14
There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Allocate l2meta dynamicallyKevin Wolf1-11/+15
As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Introduce Qcow2COWRegionKevin Wolf2-36/+76
This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Round QCowL2Meta.offset down to cluster boundaryKevin Wolf2-2/+24
The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12qcow2: Move BLKDBG_EVENT out of the lockKevin Wolf1-1/+1
We want to use these events to suspend requests for testing concurrent AIO requests. Suspending requests while they are holding the CoMutex is rather boring for this purpose. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12blkdebug: Implement suspend/resume of AIO requestsKevin Wolf1-3/+105
This allows more systematic AIO testing. The patch adds three new operations to blkdebug: * Setting a "breakpoint" on a blkdebug event. The next request that triggers this breakpoint is suspended and is tagged with a name. The breakpoint is removed after a request has triggered it. * A suspended request (identified by it's tag) can be resumed * It's possible to check whether a suspended request with a given tag exists. This can be used for waiting for an event. Ideally, we would instead tag requests right when they are created and set breakpoints for individual requests. However, at this point the block layer doesn't allow this easily, and breakpoints that trigger for any request already allow a lot of useful testing. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12blkdebug: Factor out remove_rule()Kevin Wolf1-2/+13
The cleanup work to remove a rule depends on the type of the rule. It's easy for the existing rules as there is no data that must be cleaned up and is specific to a type yet, but the next patch will change this. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12blkdebug: Allow usage without config fileKevin Wolf1-0/+5
As soon as new rules can be set during runtime, as introduced by the next patch, blkdebug makes sense even without a config file. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-11Fix error code checking for SetFilePointer() callFabien Chouteau1-3/+14
An error has occurred if the return value is invalid_set_file_pointer and getlasterror doesn't return no_error. Signed-off-by: Fabien Chouteau <chouteau@adacore.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-11rbd: Fix race between aio completition and aio cancelStefan Priebe1-8/+12
This one fixes a race which qemu had also in iscsi block driver between cancellation and io completition. qemu_rbd_aio_cancel was not synchronously waiting for the end of the command. To archieve this it introduces a new status flag which uses -EINPROGRESS. Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-11raw-posix: inline paio_ioctl into hdev_aio_ioctlPaolo Bonzini1-17/+10
clang now warns about an unused function: CC block/raw-posix.o block/raw-posix.c:707:26: warning: unused function paio_ioctl [-Wunused-function] static BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd, ^ 1 warning generated. because the only use of paio_ioctl() is inside a #if defined(__linux__) guard and it is static now. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-11block: vpc support for ~2 TB disksCharles Arnold1-4/+13
The VHD specification allows for up to a 2 TB disk size. The current implementation in qemu emulates EIDE and ATA-2 hardware which only allows for up to 127 GB. This disk size limitation can be overridden by allowing up to 255 heads instead of the normal 4 bit limitation of 16. Doing so allows disk images to be created of up to nearly 2 TB. This change does not violate the VHD format specification nor does it change how smaller disks (ie, <=127GB) are defined. [Charles Arnold also writes: "In analyzing a 160 GB VHD fixed disk image created on Windows 2008 R2, it appears that MS is also ignoring the CHS values in the footer geometry field in whatever driver they use for accessing the image. The CHS values are set at 65535,16,255 which obviously doesn't represent an image size of 160 GB." -- Stefan] Signed-off-by: Charles Arnold <carnold@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-11block: vpc initialize the uuid footer fieldCharles Arnold1-1/+6
Initialize the uuid field in the footer with a generated uuid. Signed-off-by: Charles Arnold <carnold@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-11aio: Get rid of qemu_aio_flush()Kevin Wolf3-3/+3
There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-11-28iscsi: do not assume device is zero initializedPeter Lieven1-0/+6
Without any complex checks we can't assume that an iscsi target is initialized to zero. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-11-28iscsi: fix deadlock during loginPeter Lieven1-181/+70
If the connection is interrupted before the first login is successfully completed qemu-kvm is waiting forever in qemu_aio_wait(). This is fixed by performing an sync login to the target. If the connection breaks after the first successful login errors are handled internally by libiscsi. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-11-28iscsi: fix segfault in url parsingPeter Lieven1-2/+1
If an invalid URL is specified iscsi_get_error(iscsi) is called with iscsi == NULL. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-11-21use int64_t for return values from rbd instead of intStefan Priebe1-2/+2
rbd / rados tends to return pretty often length of writes or discarded blocks. These values might be bigger than int. The steps to reproduce are: mkfs.xfs -f a whole device bigger than int in bytes. mkfs.xfs sends a discard. Important is that you use scsi-hd and set discard_granularity=512. Otherwise rbd disabled discard support. Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-11-21vdi: don't override libuuid symbolsStefan Hajnoczi1-6/+3
It's poor symbol hygiene to provide a global symbols that collide with a common library like libuuid. If QEMU links against a shared library that depends on uuid_generate() it can end up calling our stub version of the function. This exact scenario happened with GlusterFS libgfapi.so, which depends on libglusterfs.so's uuid_generate(). Scope the uuid stubs for vdi.c only and avoid affecting other shared objects. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>