summaryrefslogtreecommitdiff
path: root/block.c
AgeCommit message (Collapse)AuthorFilesLines
2014-03-14blockdev: Refuse to open encrypted image unless pausedMarkus Armbruster1-1/+8
Opening an encrypted image takes an additional step: setting the key. Between open and the key set, the image must not be used. We have some protection against accidental use in place: you can't unpause a guest while we're missing keys. You can, however, hot-plug block devices lacking keys into a running guest just fine, or insert media lacking keys. In the latter case, notifying the guest of the insert is delayed until the key is set, which may suffice to protect at least some guests in common usage. This patch makes the protection apply in more cases, in a rather heavy-handed way: it doesn't let you open encrypted images unless we're in a paused state. It doesn't extend the protection to users other than the guest (block jobs?). Use of runstate_check() from block.c is disgusting. Best I can do right now. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block: Unlink temporary fileMax Reitz1-1/+1
If the image file cannot be opened and was created as a temporary file, it should be deleted; thus, in this case, we should jump to the "unlink_and_fail" label and not just to "fail". Reported-by: Benoît Canet <benoit@irqsave.net> Signed-off-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block: Rewrite the snapshot authorization mechanism for block filters.Benoît Canet1-26/+21
This patch keep the recursive way of doing things but simplify it by giving two responsabilities to all block filters implementors. They will need to do two things: -Set the is_filter field of their block driver to true. -Implement the bdrv_recurse_is_first_non_filter method of their block driver like it is done on the Quorum block driver. (block/quorum.c) [Paolo Bonzini <pbonzini@redhat.com> pointed out that this patch changes the semantics of blkverify, which now recurses down both bs->file and s->test_file. -- Stefan] Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block: bs->drv may be NULL in bdrv_debug_resume()Max Reitz1-1/+1
Currently, bdrv_debug_resume() requires every bs->drv in the BDS stack to be NULL until a bs->drv with an implementation of bdrv_debug_resume() is found. For a normal function, this would be fine, but this is a function for debugging purposes and should therefore allow intermediate BDS not to have a driver (i.e., be "ejected"). Otherwise, it is hard to debug such situations. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13block: Update image size in bdrv_invalidate_cache()Kevin Wolf1-1/+9
After migration has completed, we call bdrv_invalidate_cache() so that drivers which cache some data drop their stale copy of the data and reread it from the image file to get a new version of data that the source modified while the migration was running. Reloading metadata from the image file is useless, though, if the size of the image file stays stale (this is a value that is cached for all image formats in block.c). Reads from (meta)data after the old EOF return only zeroes, causing image corruption. We need to update bs->total_sectors in all layers that could potentially have changed their size (i.e. backing files are not a concern - if they are changed, we're in bigger trouble) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-06block: Fix error path segfault in bdrv_open()Kevin Wolf1-0/+1
Using an invalid option for a block device that is opened with BDRV_O_PROTOCOL led to drv = NULL, and when trying to include the driver name in the error message, qemu dereferenced it: $ x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/test.qcow2,file.foo=bar Segmentation fault (core dumped) With this patch applied, the expected error message is printed: $ x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/test.qcow2,file.foo=bar qemu-system-x86_64: -drive file=/tmp/test.qcow2,file.foo=bar: could not open disk image /tmp/test.qcow2: Block protocol 'file' doesn't support the option 'foo' Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-03-06block: Keep "filename" option after parsingMax Reitz1-1/+6
Currently, bdrv_file_open() always removes the "filename" option from the options QDict after bdrv_parse_filename() has been (successfully) called. However, for drivers with bdrv_needs_filename, it makes more sense for bdrv_parse_filename() to overwrite the "filename" option and for bdrv_file_open() to fetch the filename from there. Since there currently are no drivers that implement bdrv_parse_filename() and have bdrv_needs_filename set, this does not change current behavior. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-06block: make bdrv_swap rebuild the bs graph node list field.Benoît Canet1-5/+19
Moving only the node_name one field could lead to some inconsitencies where a node_name was defined on a bs which was not registered in the graph node list. bdrv_swap between a named node bs and a non named node bs would lead to this. bdrv_make_anon would then crash because it would try to remove the bs from the graph node list while it is not in it. This patch remove named node bses from the graph node list before doing the swap then insert them back. Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-05block: Fix bs->request_alignment assertion for bs->sg=1Kevin Wolf1-1/+1
For sg backends, bs->request_alignment is meaningless and may be 0. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-28block: use /var/tmp instead of /tmp for -snapshotAmit Shah1-2/+3
If TMPDIR is not specified, the default was to use /tmp for the working copy of the block devices. Update this to /var/tmp instead, so systems using tmp-on-tmpfs don't end up inadvertently using RAM for the block device. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-21block: Remove bdrv_open_image()'s force_raw optionMax Reitz1-23/+4
This option is now unnecessary since specifying BDRV_O_PROTOCOL as flag will do exactly the same. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Reuse success path from bdrv_open()Max Reitz1-34/+29
The fail and success paths of bdrv_file_open() may be further shortened by reusing code already existent in bdrv_open(). This includes bdrv_file_open() not taking the reference to options which allows the removal of QDECREF(options) in that function. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Handle bs->options in bdrv_open() onlyMax Reitz1-23/+15
The fail paths of bdrv_file_open() and bdrv_open() naturally exhibit similarities, thus it is possible to reuse the one from bdrv_open() and shorten the one in bdrv_file_open() accordingly. Also, setting bs->options in bdrv_file_open() is not necessary if it is already done in bdrv_open(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Remove bdrv_new() from bdrv_file_open()Max Reitz1-11/+13
Change bdrv_file_open() to take a simple pointer to an already existing BDS instead of an indirect one. The BDS will be created in bdrv_open() if necessary. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Reuse reference handling from bdrv_open()Max Reitz1-25/+7
Remove the reference parameter and the related handling code from bdrv_file_open(), since it exists in bdrv_open() now as well. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Make bdrv_file_open() staticMax Reitz1-5/+11
Add the bdrv_open() option BDRV_O_PROTOCOL which results in passing the call to bdrv_file_open(). Additionally, make bdrv_file_open() static and therefore bdrv_open() the only way to call it. Consequently, all existing calls to bdrv_file_open() have to be adjusted to use bdrv_open() with the BDRV_O_PROTOCOL flag instead. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Add reference parameter to bdrv_open()Max Reitz1-7/+37
Allow bdrv_open() to handle references to existing block devices just as bdrv_file_open() is already capable of. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Change BDS parameter of bdrv_open() to **Max Reitz1-24/+40
Make bdrv_open() take a pointer to a BDS pointer, similarly to bdrv_file_open(). If a pointer to a NULL pointer is given, bdrv_open() will create a new BDS with an empty name; if the BDS pointer is not NULL, that existing BDS will be reused (in the same way as bdrv_open() already did). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21block: Fix bdrv_is_first_non_filter()Kevin Wolf1-5/+1
Consider top level BlockDriverStates as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Tested-by: Benoit Canet <benoit@irqsave.net>
2014-02-20Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into stagingPeter Maydell1-8/+8
* remotes/qmp-unstable/queue/qmp: monitor: Add object_add class argument completion. monitor: Add object_del id argument completion. monitor: Add device_add device argument completion. monitor: Add device_del id argument completion. qmp: expose list of supported character device backends Use error_is_set() only when necessary QMP: allow JSON dict arguments in qmp-shell hmp: migrate command (without -d) now blocks correctly Conflicts: blockdev.c [PMM: resolved trivial conflict in blockdev.c] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-17Use error_is_set() only when necessaryMarkus Armbruster1-8/+8
error_is_set(&var) is the same as var != NULL, but it takes whole-program analysis to figure that out. Unnecessarily hard for optimizers, static checkers, and human readers. Dumb it down to obvious. Gets rid of several dozen Coverity false positives. Note that the obvious form is already used in many places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-02-14block: Open by reference will try device then node_name.Benoît Canet1-2/+8
Since we introduced node_name for named bs of the graph modify the opening by reference to use it as a fallback. This patch also enforce the separation of the device id and graph node namespaces. Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14block: Relax bdrv_lookup_bs constraints.Benoît Canet1-15/+11
The following patch will reuse bdrv_lookup_bs in order to open images by references so the rules of usage of bdrv_lookup_bs must be relaxed a bit. Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-09block: Fix 32 bit truncation in mark_request_serialising()Kevin Wolf1-3/+3
On 32 bit hosts, size_t is too small for align as the bitmask ~(align - 1) will zero out the higher 32 bits of the offset. While at it, change the local overlap_bytes variable to unsigned to match the field in BdrvTrackedRequest. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09block: Don't call ROUND_UP with negative valuesKevin Wolf1-2/+2
The behaviour of the ROUND_UP macro with negative numbers isn't obvious. It happens to do the right thing in this please, but better avoid it. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09block: bdrv_aligned_pwritev: Assert overlap rangeKevin Wolf1-0/+2
This adds assertions that the request that we actually end up passing to the block driver (which includes RMW data and has therefore potentially been rounded to alignment boundaries) is fully covered by the overlap_{offset,size} fields of the associated BdrvTrackedRequest. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09block: Fix memory leaks in bdrv_co_do_pwritev()Kevin Wolf1-2/+2
The error path for a failure in one of the two bdrv_aligned_preadv() calls leaked head_buf or tail_buf, respectively. This fixes the memory leak. Reported-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09block: Fail gracefully with missing filenameKevin Wolf1-5/+6
This fixes a regression introduced in commit 2a05cbe42 ('block: Allow block devices without files'): $ qemu-system-x86_64 -drive driver=file qemu-system-x86_64: block.c:892: bdrv_open_common: Assertion `!drv->bdrv_needs_filename || filename != ((void *)0)' failed. Now the respective check must be performed not only in bdrv_file_open(), but also in bdrv_open(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-24block: Switch bdrv_io_limits_intercept() to byte granularityKevin Wolf1-8/+5
Request sizes used to be rounded down to the next sector boundary, allowing to bypass the I/O limit. Now all requests are accounted for with their exact byte size. Reported-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24qemu-iotests: Test pwritev RMW logicKevin Wolf1-0/+7
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: Make bdrv_pwrite() a bdrv_prwv_co() wrapperKevin Wolf1-55/+9
Instead of implementing the alignment adjustment here, use the now existing functionality of bdrv_co_do_pwritev(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: Make bdrv_pread() a bdrv_prwv_co() wrapperKevin Wolf1-36/+13
Instead of implementing the alignment adjustment here, use the now existing functionality of bdrv_co_do_preadv(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: Change coroutine wrapper to byte granularityKevin Wolf1-22/+26
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: Assert serialisation assumptions in pwritevKevin Wolf1-4/+12
If a request calls wait_serialising_requests() and actually has to wait in this function (i.e. a coroutine yield), other requests can run and previously read data (like the head or tail buffer) could become outdated. In this case, we would have to restart from the beginning to read in the updated data. However, we're lucky and don't actually need to do that: A request can only wait in the first call of wait_serialising_requests() because we mark it as serialising before that call, so any later requests would wait. So as we don't wait in practice, we don't have to reload the data. This is an important assumption that may not be broken or data corruption will happen. Document it with some assertions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: Align requests in bdrv_co_do_pwritev()Kevin Wolf1-1/+85
This patch changes bdrv_co_do_pwritev() to actually be what its name promises. If requests aren't properly aligned, it performs a RMW. Requests touching the same block are serialised against the RMW request. Further optimisation of this is possible by differentiating types of requests (concurrent reads should actually be okay here). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Allow wait_serialising_requests() at any pointKevin Wolf1-3/+10
We can only have a single wait_serialising_requests() call per request because otherwise we can run into deadlocks where requests are waiting for each other. The same is true when wait_serialising_requests() is not at the very beginning of a request, so that other requests can be issued between the start of the tracking and wait_serialising_requests(). Fix this by changing wait_serialising_requests() to ignore requests that are already (directly or indirectly) waiting for the calling request. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Make overlap range for serialisation dynamicKevin Wolf1-26/+27
Copy on Read wants to serialise with all requests touching the same cluster, so wait_serialising_requests() rounded to cluster boundaries. Other users like alignment RMW will have different requirements, though (requests touching the same sector), so make it dynamic. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Generalise and optimise COR serialisationKevin Wolf1-19/+29
Change the API so that specific requests can be marked serialising. Only these requests are checked for overlaps then. This means that during a Copy on Read operation, not all requests overlapping other requests are serialised any more, but only those that actually overlap with the specific COR request. Also remove COR from function and variable names because this functionality can be useful in other contexts. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Make zero-after-EOF work with larger alignmentKevin Wolf1-3/+4
Odd file sizes could make bdrv_aligned_preadv() shorten the request in non-aligned ways. Fix it by rounding to the required alignment instead of 512 bytes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Allow waiting for overlapping requests between begin/endKevin Wolf1-18/+20
Previously, it was not possible to use wait_for_overlapping_requests() between tracked_request_begin()/end() because it would wait for itself. Ignore the current request in the overlap check and run more of the bdrv_co_do_preadv/pwritev code with a BdrvTrackedRequest present. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Switch BdrvTrackedRequest to byte granularityKevin Wolf1-18/+34
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Introduce bdrv_co_do_pwritev()Kevin Wolf1-6/+18
This is going to become the bdrv_co_do_preadv() equivalent for writes. In this patch, however, just a function taking byte offsets is created, it doesn't align anything yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: write: Handle COR dependency after I/O throttlingKevin Wolf1-4/+4
First waiting for all COR requests to complete and calling the throttling function afterwards means that the request could be delayed and we still need to wait for the COR request even if it was issued only after the throttled write request. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Introduce bdrv_aligned_pwritev()Kevin Wolf1-21/+41
This separates the part of bdrv_co_do_writev() that needs to happen before the request is modified to match the backend alignment, and a part that needs to be executed afterwards and passes the request to the BlockDriver. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Introduce bdrv_co_do_preadv()Kevin Wolf1-6/+58
Similar to bdrv_pread(), which aligns byte-aligned request to 512 byte sectors, bdrv_co_do_preadv() takes a byte-aligned request and aligns it to the alignment specified in bs->request_alignment. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Introduce bdrv_aligned_preadv()Kevin Wolf1-18/+43
This separates the part of bdrv_co_do_readv() that needs to happen before the request is modified to match the backend alignment, and a part that needs to be executed afterwards and passes the request to the BlockDriver. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24raw: Probe required direct I/O alignmentPaolo Bonzini1-0/+3
Add a bs->request_alignment field that contains the required offset/length alignment for I/O requests and fill it in the raw block drivers. Use ioctls if possible, else see what alignment it takes for O_DIRECT to succeed. While at it, also expose the memory alignment requirements, which may be (and in practice are) different from the disk alignment requirements. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: rename buffer_alignment to guest_block_sizePaolo Bonzini1-5/+5
The alignment field is now set to the value that is promised to the guest, rather than required by the host. The next patches will make QEMU aware of the host-provided values, so make this clear. The alignment is also not about memory buffers, but about the sectors on the disk, change the documentation of the field. At this point, the field is set by the device emulation, but completely ignored by the block layer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24block: Don't use guest sector size for qemu_blockalign()Kevin Wolf1-3/+20
bs->buffer_alignment is set by the device emulation and contains the logical block size of the guest device. This isn't something that the block layer should know, and even less something to use for determining the right alignment of buffers to be used for the host. The new BlockLimits field opt_mem_alignment tells the qemu block layer the optimal alignment to be used so that no bounce buffer must be used in the driver. This patch may change the buffer alignment from 4k to 512 for all callers that used qemu_blockalign() with the top-level image format BlockDriverState. The value was never propagated to other levels in the tree, so in particular raw-posix never required anything else than 512. While on disks with 4k sectors direct I/O requires a 4k alignment, memory may still be okay when aligned to 512 byte boundaries. This is what must have happened in practice, because otherwise this would already have failed earlier. Therefore I don't expect regressions even with this intermediate state. Later, raw-posix can implement the hook and expose a different memory alignment requirement. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24block: Detect unaligned length in bdrv_qiov_is_aligned()Kevin Wolf1-0/+3
For an O_DIRECT request to succeed, it's not only necessary that all base addresses in the qiov are aligned, but also that each length in it is aligned. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by: Max Reitz <mreitz@redhat.com>