summaryrefslogtreecommitdiff
path: root/block/qcow2-cluster.c
AgeCommit message (Collapse)AuthorFilesLines
2011-10-21qcow2: Fix bdrv_write_compressed error handlingKevin Wolf1-2/+4
If during allocation of compressed clusters the cluster was already allocated uncompressed, fail and properly release the l2_table (the latter avoids a failed assertion). While at it, make it return some real error numbers instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
2011-09-12qcow2: fix range checkFrediano Ziglio1-7/+7
QCowL2Meta::offset is not cluster aligned but only sector aligned however nb_clusters count cluster from cluster start. This fix range check. Note that old code have no corruption issues related to this check cause it only cause intersection to occur when shouldn't. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12qcow2: initialize metadata before inserting in cluster_allocsFrediano Ziglio1-5/+5
QCow2Meta structure was inserted into list before many fields are initialized. Currently is not a problem cause all occur in a lock but if qcow2_alloc_clusters would in a future unlock this lock some issues could arise. Initializing fields before inserting fix the problem. Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12qcow2: removed unused depends_on fieldFrediano Ziglio1-2/+1
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-25qcow2: use always stderr for debuggingFrediano Ziglio1-1/+1
let all DEBUG_ALLOC2 printf goes to stderr Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23qcow2: fix typo in documentation for qcow2_get_cluster_offset()Devin Nakamura1-2/+2
Documentation states the num is measured in clusters, but its actually measured in sectors Signed-off-by: Devin Nakamura <devin122@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-20Use glib memory allocation and free functionsAnthony Liguori1-6/+6
qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-02qcow2: Use coroutinesKevin Wolf1-14/+12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-15qcow2: Fix in-flight list after qcow2_cache_put failureKevin Wolf1-4/+8
If qcow2_cache_put returns an error during cluster allocation and the allocation fails, it must be removed from the list of in-flight allocations. Otherwise we'd get a loop in the list when the ACB is used for the next allocation. Luckily, this qcow2_cache_put shouldn't fail anyway because the L2 table is only read, so that qcow2_cache_put doesn't even involve I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-06-08qcow2: Fix memory leaks in error casesKevin Wolf1-1/+1
This fixes memory leaks that may be caused by I/O errors during L1 table growth (can happen during save_vm) and in qemu-img check. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10qcow2: Fix order in L2 table COWKevin Wolf1-3/+6
When copying L2 tables (this happens only with internal snapshots), the order wasn't completely safe, so that after a crash you could end up with a L2 table that has too low refcount, possibly leading to corruption in the long run. This patch puts the operations in the right order: First allocate the new L2 table and replace the reference, and only then decrease the refcount of the old table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10qcow2: Fix error handling for reading compressed clustersKevin Wolf1-2/+2
When reading a compressed cluster failed, qcow2 falsely returned success. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>
2011-01-31qcow2: Add bdrv_discard supportKevin Wolf1-0/+82
This adds a bdrv_discard function to qcow2 that frees the discarded clusters. It does not yet pass the discard on to the underlying file system driver, but the space can be reused by future writes to the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-01-24qcow2: Batch flushes for COWKevin Wolf1-1/+1
qcow2 calls bdrv_flush() after performing COW in order to ensure that the L2 table change is never written before the copy is safe on disk. Now that the L2 table is cached, we can wait with flushing until we write out the next L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24qcow2: Use QcowCacheKevin Wolf1-136/+72
Use the new functions of qcow2-cache.c for everything that works on refcount block and L2 tables. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24qcow2: fix unaligned accessAurelien Jarno1-1/+1
cpu_to_be64w() is called with an obviously non-aligned pointer. Use cpu_to_be64wu() instead. It fixes unaligned accesses errors on IA64 hosts. Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17block/qcow2.c: rename qcow_ functions to qcow2_Jes Sorensen1-3/+3
It doesn't really make sense for functions in qcow2.c to be named qcow_ so convert the names to match correctly. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-11-04qcow2: Invalidate cache after failed readKevin Wolf1-0/+1
The cache content may be destroyed after a failed read, better not use it any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-10-22qcow2: Support exact L1 table growthStefan Hajnoczi1-9/+16
The L1 table grow operation includes a size calculation that bumps up the new L1 table size in order to anticipate the size needs of vmstate data. This helps reduce the number of times that the L1 table has to be grown when vmstate data is appended. This size overhead is not necessary during image creation, bdrv_truncate(), or snapshot goto operations. In fact, existing qemu-iotests that exercise table growth are no longer able to trigger it because image creation preallocates an L1 table that is too large after changes to qcow_create2(). This patch keeps the size calculation but also adds exact growth for callers that do not want to inflate the L1 table size unnecessarily. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Avoid bounce buffers for AIO read requestsKevin Wolf1-1/+7
qcow2 used to use bounce buffers for any AIO requests. This does not only imply unnecessary copying, but also unbounded allocations which should be avoided. This patch removes bounce buffers from the normal AIO read path, and constrains them to a constant size for encrypted images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Get rid of additional sync on COWKevin Wolf1-2/+8
We always have a sync for the refcount update when a new cluster is allocated. If we move this past the COW, we can save an additional sync. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21qcow2: Move sync out of qcow2_alloc_clustersKevin Wolf1-0/+3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-08qcow2: Remove unnecessary flush after L2 writeKevin Wolf1-4/+12
When a new cluster was allocated, we only need a flush after the write to the L2 table if it was a COW and we need to decrease the refcounts of the old clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22qcow2: Use bdrv_(p)write_sync for metadata writesKevin Wolf1-12/+12
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-15qcow2: Restore L1 entry on l2_allocate failureKevin Wolf1-0/+1
If writing the L1 table to disk failed, we need to restore its old content in memory to avoid inconsistencies. Reported-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28qcow2: Change l2_load to return 0/-errnoKevin Wolf1-16/+22
Provide the error code to the caller instead of just indicating success/error. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28qcow2: Allow qcow2_get_cluster_offset to return errorsKevin Wolf1-14/+22
qcow2_get_cluster_offset() looks up a given virtual disk offset and returns the offset of the corresponding cluster in the image file. Errors (e.g. L2 table can't be read) are currenctly indicated by a return value of 0, which is unfortuately the same as for any unallocated cluster. So in effect we can't check for errors. This makes the old return value a by-reference parameter and returns the usual 0/-errno error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28qcow2: Fix error handling in l2_allocateKevin Wolf1-10/+13
l2_allocate has some intermediate states in which the image is inconsistent. Change the order to write to the L1 table only after the new L2 table has successfully been initialized. Also reset the L2 cache in failure case, it's very likely wrong. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28qcow2: Clear L2 table cache after write errorKevin Wolf1-0/+1
If the L2 table was already updated in cache, but writing it to disk has failed, we must not continue using the changed version in the cache to stay consistent with what's on the disk. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03block: Open the underlying image file in generic codeKevin Wolf1-31/+33
Format drivers shouldn't need to bother with things like file names, but rather just get an open BlockDriverState for the underlying protocol. This patch introduces this behaviour for bdrv_open implementation. For protocols which need to access the filename to open their file/device/connection/... a new callback bdrv_file_open is introduced which doesn't get an underlying file opened. For now, also some of the more obscure formats use bdrv_file_open because they open() the file themselves instead of using the block.c functions. They need to be fixed in later patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23qcow2: Return 0/-errno in l2_allocateKevin Wolf1-17/+23
Returning NULL on error doesn't allow distinguishing between different errors. Change the interface to return an integer for -errno. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23qcow2: Return 0/-errno in write_l1_entryKevin Wolf1-5/+5
Change write_l1_entry to return the real error code instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23qcow2: Fix error return code in qcow2_alloc_cluster_link_l2Kevin Wolf1-2/+2
Fix qcow2_alloc_cluster_link_l2 to return the real error code like it does in all other error cases. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23qcow2: Return 0/-errno in write_l2_entriesKevin Wolf1-4/+5
Change write_l2_entries to return the real error code instead of -1. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23qcow2: Trigger blkdebug eventsKevin Wolf1-0/+15
This adds blkdebug events to qcow2 to allow injecting I/O errors in specific places. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-10qcow2: Remove request from in-flight list after errorKevin Wolf1-0/+1
If we complete a request with a failure we need to remove it from the list of requests that are in flight. If we don't do it, the next time the same AIOCB is used for a cluster allocation it will create a loop in the list and qemu will hang in an endless loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-02-19qcow2: Fix access after end of arrayKevin Wolf1-2/+6
If a write requests crosses a L2 table boundary and all clusters until the end of the L2 table are usable for the request, we must not look at the next L2 entry because we already have arrived at the end of the array. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10qcow2: Fix signedness bugsKevin Wolf1-6/+6
Checking for return codes < 0 isn't really going to work with unsigned types. Use signed types instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26qcow2: Don't ignore qcow2_alloc_clusters return valueKevin Wolf1-2/+17
Now that qcow2_alloc_clusters can return error codes, we must handle them in the callers of qcow2_alloc_clusters. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26qcow2: Return 0/-errno in qcow2_alloc_cluster_offsetKevin Wolf1-10/+18
Returning 0/-errno allows it to distingush different errors classes. The cluster offset of newly allocated clusters is now returned in the QCowL2Meta struct. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26qcow2: Return 0/-errno in get_cluster_tableKevin Wolf1-12/+18
Switching to 0/-errno allows it to distinguish different error cases. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26qcow2: Fix error handling in qcow2_grow_l1_tableKevin Wolf1-4/+6
Return the appropriate error value instead of always using EIO. Don't free the L1 table on errors, we still need it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09qcow2: Allow qcow2 disk images with size zeroStefan Weil1-0/+3
Images with disk size 0 may be used for VM snapshots, but not to save normal block data. It is possible to create such images using qemu-img, but opening them later fails. So even "qemu-img info image.qcow2" is not possible for an image created with "qemu-img create -f qcow2 image.qcow2 0". This is fixed here. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27Revert "qcow2: Bring synchronous read/write back to life"Kevin Wolf1-3/+3
It was merely a workaround and the real fix is done now. This reverts commit ef845c3bf421290153154635dc18eaa677cecb43. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-15qcow2: Bring synchronous read/write back to lifeKevin Wolf1-3/+3
When the synchronous read and write functions were dropped, they were replaced by generic emulation functions. Unfortunately, these emulation functions don't provide the same semantics as the original functions did. The original bdrv_read would mean that we read some data synchronously and that we won't be interrupted during this read. The latter assumption is no longer true with the emulation function which needs to use qemu_aio_poll and therefore allows the callback of any other concurrent AIO request to be run during the read. Which in turn means that (meta)data read earlier could have changed and be invalid now. qcow2 is not prepared to work in this way and it's just scary how many places there are where other requests could run. I'm not sure yet where exactly it breaks, but you'll see breakage with virtio on qcow2 with a backing file. Providing synchronous functions again fixes the problem for me. Patchworks-ID: 35437 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05qcow2: Increase maximum cluster size to 2 MBKevin Wolf1-6/+8
This patch increases the maximum qcow2 cluster size to 2 MB. Starting with 128k clusters, L2 tables span 2 GB or more of virtual disk space, causing 32 bit truncation and wraparound of signed integers. Therefore some variables need to use a larger data type. While being at reviewing data types, change some integers that are used for array indices to unsigned. In some places they were checked against some upper limit but not for negative values. This could avoid potential segfaults with corrupted qcow2 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-12Fix sys-queue.h conflict for goodBlue Swirl1-2/+2
Problem: Our file sys-queue.h is a copy of the BSD file, but there are some additions and it's not entirely compatible. Because of that, there have been conflicts with system headers on BSD systems. Some hacks have been introduced in the commits 15cc9235840a22c289edbe064a9b3c19c5f49896, f40d753718c72693c5f520f0d9899f6e50395e94, 96555a96d724016e13190b28cffa3bc929ac60dc and 3990d09adf4463eca200ad964cc55643c33feb50 but the fixes were fragile. Solution: Avoid the conflict entirely by renaming the functions and the file. Revert the previous hacks. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-09qcow2: Order concurrent AIO requests on the same unallocated clusterKevin Wolf1-0/+39
When two AIO requests write to the same cluster, and this cluster is unallocated, currently both requests allocate a new cluster and the second one merges the first one when it is completed. This means an cluster allocation, a read and a cluster deallocation which cause some overhead. If we simply let the second request wait until the first one is done, we improve overall performance with AIO requests (specifially, qcow2/virtio combinations). This patch maintains a list of in-flight requests that have allocated new clusters. A second request touching the same cluster is limited so that it either doesn't touch the allocation of the first request (so it can have a non-overlapping allocation) or it waits for the first request to complete. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-10qcow2: Fix L1 table memory allocationKevin Wolf1-1/+1
Contrary to what one could expect, the size of L1 tables is not cluster aligned. So as we're writing whole sectors now instead of single entries, we need to ensure that the L1 table in memory is large enough; otherwise write would access memory after the end of the L1 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16alloc_cluster_link_l2: Write complete sectorsKevin Wolf1-3/+25
When updating the L2 tables in alloc_cluster_link_l2(), write complete sectors instead of updating single entries. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>