summaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)AuthorFilesLines
2012-05-30block: prevent snapshot mode $TMPDIR symlink attackJim Meyering1-1/+6
In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void), so that it can inform its two callers of those failures. Also avoid the risk of buffer overrun and do not ignore mkstemp or close failure. Update both callers (in block.c and vvfat.c) to propagate temp-file-creation failure to their callers. get_tmp_filename creates and closes an empty file, while its callers later open that presumed-existing file with O_CREAT. The problem was that a malicious user could provoke mkstemp failure and race to create a symlink with the selected temporary file name, thus causing the qemu process (usually root owned) to open through the symlink, overwriting an attacker-chosen file. This addresses CVE-2012-2652. http://bugzilla.redhat.com/CVE-2012-2652 Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-29Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori3-56/+83
* kwolf/for-anthony: fdc-test: introduced qtest no_media_on_start and cmos qtest for floppy fdc: fix media detection fdc: floppy drive should be visible after start without media qemu-iotests: mark 035 qcow2-only qcow2: Check qcow2_alloc_clusters_at() return value sheepdog: use heap instead of stack for BDRVSheepdogState sheepdog: return -errno on error sheepdog: mark image as snapshot when tag is specified qemu-img: Explain how rebase operation can be used to perform a 'diff' operation. qcow2: don't leak buffer for unexpected qcow_version in header
2012-05-28ISCSI: Switch to using READ16/WRITE16 for I/O to the LUNRonnie Sahlberg1-29/+83
This allows using LUNs bigger than 2TB. Keep using READ10 for other device types such as MMC. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2012-05-28ISCSI: Only call READCAPACITY16 for SBC devices, use READCAPACITY10 for MMCRonnie Sahlberg1-5/+59
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2012-05-28ISCSI: get device type at connection timeRonnie Sahlberg1-2/+43
This is needed to avoid READ CAPACITY(16) for MMC devices. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-05-28ISCSI: change num_blocks to 64-bitPaolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-05-28ISCSI: redo how we set up the eventsRonnie Sahlberg1-4/+21
Call qemu_notify_event() after updating events. Otherwise, If we add an event for -is-writeable but the socket is already writeable there may be a delay before the event callback is actually triggered. Those delays would in particular hurt performance during BIOS boot and when the GRUB bootloader reads the kernel and initrd. But first call out to the socket write functions directly, and only set up the write event if the socket is full. This will happen very rarely and this improves performance. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2012-05-25qcow2: Check qcow2_alloc_clusters_at() return valueKevin Wolf1-10/+13
When using qcow2_alloc_clusters_at(), the cluster allocation code checked the wrong variable for an error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25sheepdog: use heap instead of stack for BDRVSheepdogStateMORITA Kazutaka1-13/+22
bdrv_create() is called in coroutine context now, so we cannot use more stack than 1 MB in the function if we use ucontext coroutine. This patch allocates BDRVSheepdogState, whose size is 4 MB, on the heap in sd_create(). Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25sheepdog: return -errno on errorMORITA Kazutaka1-32/+46
On error, BlockDriver APIs should return -errno instead of -1. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25sheepdog: mark image as snapshot when tag is specifiedMORITA Kazutaka1-1/+1
When a snapshot tag is specified in the filename, the opened image is a snapshot. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25qcow2: don't leak buffer for unexpected qcow_version in headerJim Meyering1-1/+2
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-14qcow2: Don't ignore failure to clear autoclear flagsKevin Wolf1-1/+4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: fix warning introduced in efcc7a23Anthony Liguori1-1/+1
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-05-10stream: do not copy unallocated sectors from the basePaolo Bonzini1-14/+4
Unallocated sectors should really never be accessed by the guest, so there's no need to copy them during the streaming process. If they are read by the guest during streaming, guest-initiated copy-on-read will copy them (we're in the base == NULL case, which enables copy on read). If they are read after we disconnect the image from the base, they will read as zeroes anyway. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10stream: fix ratelimiting corner casePaolo Bonzini1-5/+5
This fixes inability to make progress in streaming if the quota is set to less than the amount of data that an I/O operation has to write. In this case, limit->dispatched + n will always be above the quota and, due to the "goto retry" to recheck cancellation and allocation, streaming will livelock. This can be reproduced with "block_job_set_speed ide0-hd0 1b". Of course, with this patch the requested limit will not be obeyed. That could be done with another patch that caps is_allocated's n argument by the slice quota. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10stream: pass new base image format to bdrv_change_backing_filePaolo Bonzini1-2/+5
When an image is modified to point to the new backing file, the backing file format is set to NULL, which means auto-probe. This is wrong, in fact it is a small security problem. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: wait for job callback in block_job_cancel_syncPaolo Bonzini1-4/+3
The limitation on not having I/O after cancellation cannot really be kept. Even streaming has a very small race window where you could cancel a job and have it report completion. If this window is hit, bdrv_change_backing_file() will yield and possibly cause accesses to dangling pointers etc. So, let's just assume that we cannot know exactly what will happen after the coroutine has set busy to false. We can set a very lax condition: - if we cancel the job, the coroutine won't set it to false again (and hence will not call co_sleep_ns again). - block_job_cancel_sync will wait for the coroutine to exit, which pretty much ensures no race. Instead, we track the coroutine that executes the job and put very strict conditions on what to do while it is quiescent (busy = false). First of all, the coroutine must never set busy = false while the job has been cancelled. Second, the coroutine can be reentered arbitrarily while it is quiescent, so you cannot really do anything but co_sleep_ns at that time. This condition is obeyed by the block_job_sleep_ns function. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: add block_job_sleep_nsPaolo Bonzini1-14/+9
This function abstracts the pretty complex semantics of the "busy" member of BlockJob. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: fix snapshot on QEDPaolo Bonzini2-0/+14
QED's opaque data includes a pointer back to the BlockDriverState. This breaks when bdrv_append shuffles data between bs_new and bs_top. To avoid this, add a "rebind" function that tells the driver about the new relationship between the BlockDriverState and its opaque. The patch also adds rebind to VVFAT for completeness, even though it is not used with live snapshots. Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: update in-memory backing file and formatPaolo Bonzini1-11/+0
These are needed to print "info block" output correctly. QCOW2 does this because it needs it to write the header, but QED does not, and common code is the right place to do it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: push bdrv_change_backing_file error checking up from driversPaolo Bonzini1-5/+0
This check applies to all drivers, but QED lacks it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-08Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori3-5/+20
* kwolf/for-anthony: fdc: simplify media change handling qcow2: lock on prealloc block: make bdrv_create adopt coroutine qcow2: Limit COW to where it's needed sheepdog: switch to writethrough mode if cluster doesn't support flush
2012-05-07qcow2: lock on preallocZhi Yong Wu1-0/+3
preallocate() will be locked. This is required because qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it can drop while COW is being performed. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07qcow2: Limit COW to where it's neededKevin Wolf1-5/+9
This fixes a regression introduced in commit 250196f1. The bug leads to data corruption, found during an Autotest run with a Fedora 8 guest. Consider a write request whose first part is covered by an already allocated cluster, but additional clusters need to be newly allocated. When counting the number of clusters to allocate, the qcow2 code would decide to do COW for all remaining clusters of the write request, even if some of them are already allocated. If during this COW operation another write request is issued that touches the same cluster, it will still refer to the old cluster. When the COW completes, the first request will update the L2 table and the second write request will be lost. Note that the requests need not overlap, it's enough for them to touch the same cluster. This patch ensures that only clusters that really require COW are considered for allocation. In this case any other request writing to the same cluster will be an allocating write and gets serialised. Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07sheepdog: switch to writethrough mode if cluster doesn't support flushMORITA Kazutaka1-0/+8
This is necessary for qemu to work with the older version of Sheepdog which doesn't support SD_OP_FLUSH_VDI. Signed-off-by: MORITA Kazutaka <morita.kazutaka@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-04ISCSI: Add support for thin-provisioning via discard/UNMAP and bigger LUNsRonnie Sahlberg1-13/+73
Update the configure test for libiscsi support to detect version 1.3 or later. Version 1.3 of libiscsi provides both READCAPACITY16 as well as UNMAP commands. Update the iscsi block layer to use READCAPACITY16 to detect the size of the LUN instead of READCAPACITY10. This allows support for LUNs larger than 2TB. Update to implement bdrv_aio_discard() using the UNMAP command. This allows us to use thin-provisioned LUNs from TGTD and other iSCSI targets that support thin-provisioning. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> [squashed in subsequent patch from Ronnie to fix off-by-one in LBA count] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-05-02rbd: add discard supportJosh Durgin1-16/+73
Change the write flag to an operation type in RBDAIOCB, and make the buffer optional since discard doesn't use it. Discard is first included in librbd 0.1.2 (which is in Ceph 0.46). If librbd is too old, leave out qemu_rbd_aio_discard entirely, so the old behavior is preserved. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02qcow2: fix the return value -ENOENT -> -EEXISTZhi Yong Wu1-1/+1
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02qcow2: Don't hold cache references across yieldKevin Wolf1-8/+13
If cache references are held while the coroutine has yielded, the cache may get used up and abort() when it can't find a free entry. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02qcow2: Remove unused parameter in do_alloc_cluster_offsetKevin Wolf1-2/+2
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()Stefan Weil1-1/+2
Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-01raw-posix: Do not use CONFIG_COCOA macroPavel Borzenkov1-4/+4
Use __APPLE__ and __MACH__ macros instead of CONFIG_COCOA to detect Mac OS X host. The patch is based on Ben Leslie's patch: http://patchwork.ozlabs.org/patch/97859/ Signed-off-by: Ben Leslie <benno@benno.id.au> Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2012-04-27Merge remote-tracking branch 'qmp/queue/qmp' into stagingAnthony Liguori1-11/+11
* qmp/queue/qmp: qapi: fix qmp_balloon() conversion qemu-iotests: add block-stream speed value test case block: add 'speed' optional parameter to block-stream block: change block-job-set-speed argument from 'value' to 'speed' block: use Error mechanism instead of -errno for block_job_set_speed() block: use Error mechanism instead of -errno for block_job_create()
2012-04-27block: add 'speed' optional parameter to block-streamStefan Hajnoczi1-2/+3
Allow streaming operations to be started with an initial speed limit. This eliminates the window of time between starting streaming and issuing block-job-set-speed. Users should use the new optional 'speed' parameter instead so that speed limits are in effect immediately when the job starts. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-04-27block: change block-job-set-speed argument from 'value' to 'speed'Stefan Hajnoczi1-4/+4
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-04-27block: use Error mechanism instead of -errno for block_job_set_speed()Stefan Hajnoczi1-3/+3
There are at least two different errors that can occur in block_job_set_speed(): the job might not support setting speeds or the value might be invalid. Use the Error mechanism to report the error where it occurs. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-04-27block: use Error mechanism instead of -errno for block_job_create()Stefan Hajnoczi1-6/+5
The block job API uses -errno return values internally and we convert these to Error in the QMP functions. This is ugly because the Error should be created at the point where we still have all the relevant information. More importantly, it is hard to add new error cases to this case since we quickly run out of -errno values without losing information. Go ahead and use Error directly and don't convert later. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-04-26nbd: Fix uninitialised use of s->sockKevin Wolf1-1/+1
s->sock is assigned only afterwards, so we're really registering an aio_fd_handler for file descriptor 0 here. Not exactly what we intended. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-04-23Merge remote-tracking branch 'kwolf/for-anthony' into stagingAnthony Liguori11-176/+581
* kwolf/for-anthony: (38 commits) qemu-iotests: Fix test 031 for qcow2 v3 support qemu-iotests: Add -o and make v3 the default for qcow2 qcow2: Zero write support qemu-iotests: Test backing file COW with zero clusters qemu-iotests: add a simple test for write_zeroes qcow2: Support for feature table header extension qcow2: Support reading zero clusters qcow2: Version 3 images qcow2: Ignore reserved bits in check_refcounts qcow2: Ignore reserved bits in refcount table entries qcow2: Simplify count_cow_clusters qcow2: Refactor qcow2_free_any_clusters qcow2: Ignore reserved bits in L1/L2 entries qcow2: Fail write_compressed when overwriting data qcow2: Ignore reserved bits in count_contiguous_clusters() qcow2: Ignore reserved bits in get_cluster_offset qcow2: Save disk size in snapshot header Specification for qcow2 version 3 qcow2: Fix refcount block allocation during qcow2_alloc_cluster_at() iotests: Resolve test failures caused by hostname ...
2012-04-20qcow2: Zero write supportKevin Wolf3-0/+94
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Support for feature table header extensionKevin Wolf2-9/+69
Instead of printing an ugly bitmask, qemu can now print a more helpful string even for yet unknown features. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Support reading zero clustersKevin Wolf4-4/+33
This adds support for reading zero clusters in version 3 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Version 3 imagesKevin Wolf2-15/+148
This adds the basic infrastructure to qcow2 to handle version 3 images. It includes code to create v3 images, allow header updates for v3 images and checks feature bits. It still misses support for zero clusters, so this is not a fully compliant implementation of v3 yet. The default for creating new images stays at v2 for now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in check_refcountsKevin Wolf1-44/+54
Also don't infer the cluster type directly from the L2 entries, but use qcow2_get_cluster_type() to keep everything in a single place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in refcount table entriesKevin Wolf2-1/+3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Simplify count_cow_clustersKevin Wolf1-18/+15
count_cow_clusters() tries to reuse existing functions, and all it achieves is to make things much more complicated than they really are: Everything needs COW, unless it's a normal cluster with refcount 1. This patch implements the obvious way of doing this, and by using qcow2_get_cluster_type() it gets rid of all flag magic. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Refactor qcow2_free_any_clustersKevin Wolf1-19/+22
Zero clusters will add another cluster type. Refactor the open-coded cluster type detection into a switch of QCOW2_CLUSTER_* options so that the detection is in a single place. This makes it easier to add new cluster types. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in L1/L2 entriesKevin Wolf2-19/+19
This changes the still existing places that assume that the only flags are QCOW_OFLAG_COPIED and QCOW_OFLAG_COMPRESSED to properly mask out reserved bits. It does not convert bdrv_check yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Fail write_compressed when overwriting dataKevin Wolf1-4/+3
qcow2_alloc_compressed_cluster_offset() already fails if the copied flag is set, because qcow2_write_compressed() doesn't perform COW as it would have to do to allow this. However, what we really want to check here is whether the cluster is allocated or not. With internal snapshots the copied flag may not be set on allocated clusters. Check the cluster offset instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>