Age | Commit message | Author | Files | Lines
2016-05-12 | tci: Make direct jump patching thread-safe | Sergey Fedorov | 3 | -2/+7
Ensure direct jump patching in TCI is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() to load/store the address. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
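The two ingredients named above (natural alignment plus atomic load/store) can be sketched in C11 as follows. This is a minimal illustration, not the TCI code: `jump_slot` and its helpers are hypothetical names, and C11 `<stdatomic.h>` with relaxed ordering stands in for QEMU's atomic_read()/atomic_set().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical slot in generated code that holds a direct-jump target. */
typedef struct {
    uint8_t opcode;
    _Atomic uintptr_t target;   /* the compiler aligns this member naturally */
} jump_slot;

static void jump_slot_init(jump_slot *slot, uintptr_t target)
{
    /* A misaligned address could straddle two words and tear on update. */
    assert((uintptr_t)&slot->target % sizeof(uintptr_t) == 0);
    atomic_init(&slot->target, target);
}

/* Patch the jump: a single atomic store, so readers see old or new, never a mix. */
static void jump_slot_patch(jump_slot *slot, uintptr_t new_target)
{
    atomic_store_explicit(&slot->target, new_target, memory_order_relaxed);
}

/* Executing thread: a single atomic load of the current target. */
static uintptr_t jump_slot_read(jump_slot *slot)
{
    return atomic_load_explicit(&slot->target, memory_order_relaxed);
}
```

Relaxed ordering is enough for this sketch because only the address itself has to be read and written without tearing.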
2016-05-12 | include/qemu/osdep.h: Add macros for pointer alignment | Sergey Fedorov | 1 | -0/+11
These macros provide a convenient way to n-byte align pointers up and down and check if a pointer is n-byte aligned. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-3-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
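As a rough sketch of what such helpers compute, the functions below align an address down or up to a multiple of n and test alignment. The names are hypothetical and these are plain functions on `uintptr_t`, not the macros added to osdep.h; overflow near the top of the address space is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr down to the nearest multiple of n (n must be non-zero). */
static uintptr_t addr_align_down(uintptr_t addr, uintptr_t n)
{
    assert(n != 0);
    return addr - addr % n;
}

/* Round addr up to the nearest multiple of n. */
static uintptr_t addr_align_up(uintptr_t addr, uintptr_t n)
{
    return addr_align_down(addr + n - 1, n);
}

/* True when addr is already a multiple of n. */
static bool addr_is_aligned(uintptr_t addr, uintptr_t n)
{
    assert(n != 0);
    return addr % n == 0;
}
```

For a pointer `p`, a caller would pass `(uintptr_t)p` and cast the result back.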
2016-05-12 | include/qemu/osdep.h: Add a macro to check for alignment | Sergey Fedorov | 1 | -0/+3
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-2-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 | tb: consistently use uint32_t for tb->flags | Emilio G. Cota | 24 | -31/+32
We are inconsistent with the type of tb->flags: usage varies loosely between int and uint64_t. Settle to uint32_t everywhere, which is superior to both: at least one target (aarch64) uses the most significant bit in the u32, and uint64_t is wasteful. Compile-tested for all targets. Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Suggested-by: Richard Henderson <rth@twiddle.net> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
2016-05-12 | Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging | Peter Maydell | 78 | -2059/+3323
Block layer patches # gpg: Signature made Thu 12 May 2016 14:37:05 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (69 commits) qemu-iotests: iotests: fail hard if not run via "check" block: enable testing of LUKS driver with block I/O tests block: add support for encryption secrets in block I/O tests block: add support for --image-opts in block I/O tests qemu-io: Add 'write -z -u' to test MAY_UNMAP flag qemu-io: Add 'write -f' to test FUA flag qemu-io: Allow unaligned access by default qemu-io: Use bool for command line flags qemu-io: Make 'open' subcommand more like command line qemu-io: Add missing option documentation qmp: add monitor command to add/remove a child quorum: implement bdrv_add_child() and bdrv_del_child() Add new block driver interface to add/delete a BDS's child qemu-img: check block status of backing file when converting. iotests: fix the redirection order in 083 block: Inactivate all children block: Drop superfluous invalidating bs->file from drivers block: Invalidate all children nbd: Simplify client FUA handling block: Honor BDRV_REQ_FUA during write_zeroes ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 | Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160512' into staging | Peter Maydell | 53 | -643/+2976
target-arm queue: * blizzard, omap_lcdc: code cleanup to remove DEPTH != 32 dead code * QOMify various ARM devices * bcm2835_property: use cached values when querying framebuffer * hw/arm/nseries: don't allocate large sized array on the stack * fix LPAE descriptor address masking (only visible for EL2) * fix stage 2 exec permission handling for AArch32 * first part of supporting syndrome info for data aborts to EL2 * virt: NUMA support * work towards i.MX6 support * avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes # gpg: Signature made Thu 12 May 2016 14:29:14 BST using RSA key ID 14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" * remotes/pmaydell/tags/pull-target-arm-20160512: (43 commits) hw/arm: QOM'ify versatilepb.c hw/arm: QOM'ify strongarm.c hw/arm: QOM'ify stellaris.c hw/arm: QOM'ify spitz.c hw/arm: QOM'ify pxa2xx_pic.c hw/arm: QOM'ify pxa2xx.c hw/arm: QOM'ify integratorcp.c hw/arm: QOM'ify highbank.c hw/arm: QOM'ify armv7m.c target-arm: Avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes hw/display/blizzard: Remove blizzard_template.h hw/display/blizzard: Expand out macros i.MX: Add sabrelite i.MX6 emulation. i.MX: Add i.MX6 SOC implementation. i.MX: Add the Freescale SPI Controller FIFO: Add a FIFO32 implementation i.MX: Add i.MX6 System Reset Controller device. ARM: Factor out ARM on/off PSCI control functions ACPI: Virt: Generate SRAT table ACPI: move acpi_build_srat_memory to common place ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 | Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-05-12' into staging | Peter Maydell | 39 | -428/+1194
QAPI patches for 2016-05-12 # gpg: Signature made Thu 12 May 2016 08:49:04 BST using RSA key ID EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" * remotes/armbru/tags/pull-qapi-2016-05-12: (23 commits) qapi: Change visit_type_FOO() to no longer return partial objects qapi: Simplify semantics of visit_next_list() qapi: Fix string input visitor handling of invalid list tests/string-input-visitor: Add negative integer tests qapi: Split visit_end_struct() into pieces qmp: Tighten output visitor rules qmp: Don't reuse qmp visitor after grabbing output spapr_drc: Expose 'null' in qom-get when there is no fdt qmp: Support explicit null during visits qapi: Add visit_type_null() visitor tests: Add check-qnull qapi: Document visitor interfaces, add assertions qmp-input: Refactor when list is advanced qmp-input: Require struct push to visit members of top dict qom: Wrap prop visit in visit_start_struct qapi-commands: Wrap argument visit in visit_start_struct qmp-input: Don't consume input when checking has_member qapi: Use strict QMP input visitor in more places qapi: Consolidate QMP input visitor creation qmp-input: Clean up stack handling ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 | Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-05-12' into queue-block | Kevin Wolf | 28 | -900/+2053
Block patches for 2.7 # gpg: Signature made Thu May 12 15:34:13 2016 CEST using RSA key ID E838ACAD # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" * mreitz/tags/pull-block-for-kevin-2016-05-12: qemu-iotests: iotests: fail hard if not run via "check" block: enable testing of LUKS driver with block I/O tests block: add support for encryption secrets in block I/O tests block: add support for --image-opts in block I/O tests qemu-io: Add 'write -z -u' to test MAY_UNMAP flag qemu-io: Add 'write -f' to test FUA flag qemu-io: Allow unaligned access by default qemu-io: Use bool for command line flags qemu-io: Make 'open' subcommand more like command line qemu-io: Add missing option documentation qmp: add monitor command to add/remove a child quorum: implement bdrv_add_child() and bdrv_del_child() Add new block driver interface to add/delete a BDS's child qemu-img: check block status of backing file when converting. iotests: fix the redirection order in 083 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160511-1' into staging | Peter Maydell | 3 | -4/+18
usb: misc fixes # gpg: Signature made Wed 11 May 2016 12:18:25 BST using RSA key ID D3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" * remotes/kraxel/tags/pull-usb-20160511-1: usb: Support compilation without poll.h usb-mtp: fix usb_mtp_get_device_info so that libmtp on the guest doesn't complain usb:xhci: no DMA on HC reset Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 | qemu-iotests: iotests: fail hard if not run via "check" | Sascha Silbe | 1 | -1/+9
Running an iotests-based Python test directly might appear to work, but may fail in subtle ways and is insecure: - It creates files with predictable file names in a world-writable location (/var/tmp). - Tests expect the environment to be set up by check. E.g. 041 and 055 may take the wrong code paths if QEMU_DEFAULT_MACHINE is not set. This can lead to false negatives. Instead fail hard and tell the user we want to be run via "check". The actual environment expected by the tests is currently only defined by the implementation of "check". We use two of the environment variables set by "check" as indication of whether we're being run via "check". Anyone writing their own test runner (replacing "check") will need to replicate the full environment (in a broader sense, not just environment variables) provided by "check" anyway, including setting the two environment variables we check. Whereas a regular developer just trying to invoke the tests usually won't have both of these defined in their environment so we can catch their mistake and give out useful advice. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Reviewed-by: Bo Tu <tubo@linux.vnet.ibm.com> Message-id: 1461094442-16014-1-git-send-email-silbe@linux.vnet.ibm.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | block: enable testing of LUKS driver with block I/O tests | Daniel P. Berrange | 10 | -11/+67
This adds support for testing the LUKS driver with the block I/O test framework. cd tests/qemu-io-tests ./check -luks A handful of test cases are modified to work with luks - 004 - whitelist luks format - 012 - use TEST_IMG_FILE instead of TEST_IMG for file ops - 048 - use TEST_IMG_FILE instead of TEST_IMG for file ops. don't assume extended image contents is all zeros, explicitly initialize with zeros Make file size smaller to avoid having to decrypt 1 GB of data. - 052 - don't assume initial image contents is all zeros, explicitly initialize with zeros - 100 - don't assume initial image contents is all zeros, explicitly initialize with zeros With this patch applied, the results are as follows: Passed: 001 002 003 004 005 008 009 010 011 012 021 032 043 047 048 049 052 087 100 134 143 Failed: 033 120 140 145 Skipped: 007 013 014 015 017 018 019 020 022 023 024 025 026 027 028 029 030 031 034 035 036 037 038 039 040 041 042 043 044 045 046 047 049 050 051 053 054 055 056 057 058 059 060 061 062 063 064 065 066 067 068 069 070 071 072 073 074 075 076 077 078 079 080 081 082 083 084 085 086 087 088 089 090 091 092 093 094 095 096 097 098 099 101 102 103 104 105 107 108 109 110 111 112 113 114 115 116 117 118 119 121 122 123 124 128 129 130 131 132 133 134 135 136 137 138 139 141 142 144 146 148 150 152 The reasons for the failed tests are: - 033 - needs adapting to use image opts syntax with blkdebug and test image in order to correctly set align property - 120 - needs adapting to use correct -drive syntax for luks - 140 - needs adapting to use correct -drive syntax for luks - 145 - needs adapting to use correct -drive syntax for luks The vast majority of skipped tests are exercising code that is qcow2 specific, though a couple could probably be usefully enabled for luks too. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1462896689-18450-4-git-send-email-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | block: add support for encryption secrets in block I/O tests | Daniel P. Berrange | 4 | -3/+16
The LUKS block driver tests will require the ability to specify encryption secrets with block devices. This requires using the --object argument to qemu-img/qemu-io to create a 'secret' object. When the IMGKEYSECRET env variable is set, it provides the password to be associated with a secret called 'keysec0' The _qemu_img_wrapper function isn't modified as that needs to cope with differing syntax for subcommands, so can't be made to use the image opts syntax unconditionally. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1462896689-18450-3-git-send-email-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | block: add support for --image-opts in block I/O tests | Daniel P. Berrange | 6 | -34/+77
Currently all block tests use the traditional syntax for images just specifying a filename. To support the LUKS driver without resorting to JSON, the tests need to be able to use the new --image-opts argument to qemu-img and qemu-io. This introduces a new env variable IMGOPTSSYNTAX. If this is set to 'true', then qemu-img/qemu-io should use --image-opts. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1462896689-18450-2-git-send-email-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qemu-io: Add 'write -z -u' to test MAY_UNMAP flag | Eric Blake | 1 | -3/+21
Make it easier to control whether the BDRV_REQ_MAY_UNMAP flag can be passed through a write_zeroes command, by adding the '-u' flag to qemu-io 'write -z' and 'aio_write -z'. To be useful, the device has to be opened with BDRV_O_UNMAP (done by default in qemu-io, but can be made explicit with '-d unmap'). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1462677405-4752-7-git-send-email-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qemu-io: Add 'write -f' to test FUA flag | Eric Blake | 1 | -16/+41
Make it easier to test block drivers with BDRV_REQ_FUA in .supported_write_flags, by adding the '-f' flag to qemu-io to conditionally pass the flag through to specific writes ('write', 'write -z', 'writev', 'aio_write', 'aio_write -z'). You'll want to use 'qemu-io -t none' to actually make -f useful (as otherwise, the default writethrough mode automatically sets the FUA bit on every write). Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-6-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qemu-io: Allow unaligned access by default | Eric Blake | 2 | -776/+1457
There's no reason to require the user to specify a flag just so they can pass in unaligned numbers. Keep 'read -p' and 'write -p' as no-ops so that I don't have to hunt down and update all users of qemu-io, but otherwise make their behavior default as 'read' and 'write'. Also fix 'write -z', 'readv', 'writev', 'aio_read', 'aio_write', and 'aio_write -z'. For now, 'read -b', 'write -b', and 'write -c' still require alignment (and 'multiwrite', but that's slated to die soon). qemu-iotest 23 is updated to match, as the only test that was previously explicitly expecting an error on an unaligned request. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-5-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qemu-io: Use bool for command line flags | Eric Blake | 1 | -47/+47
We require a C99 compiler; let's use it to express what we really mean. (Yes, we now have an instance of 'if (bool + bool + bool > 1)', which, although semantically valid C, looks ugly; it gets cleaned up later.) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1462677405-4752-4-git-send-email-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
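The 'bool + bool + bool > 1' idiom mentioned above works because each C99 bool promotes to the int 0 or 1, so the sum counts how many flags are set. A minimal sketch, with a hypothetical helper name and generic flag parameters:

```c
#include <stdbool.h>

/* Returns true when more than one of three mutually exclusive flags is set.
 * Each bool promotes to int 0 or 1, so the sum is the number of set flags. */
static bool flags_conflict(bool a, bool b, bool c)
{
    return a + b + c > 1;
}
```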
2016-05-12 | qemu-io: Make 'open' subcommand more like command line | Eric Blake | 1 | -4/+25
The command line defaults to BDRV_O_UNMAP, but can use -d to reset it. Meanwhile, the 'open' subcommand was defaulting to no discards, with no way to set it. The command line has both -n and -tMODE to set a variety of cache modes, but the 'open' subcommand had only -n. The 'open' subcommand had no way to set BDRV_O_NATIVE_AIO. Note that the 'reopen' subcommand uses '-c' where the command line and 'open' use -t. Making that consistent would be a separate patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-3-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qemu-io: Add missing option documentation | Eric Blake | 1 | -7/+8
The Usage: summary is missing several options, but rather than having to maintain it, it's simpler to just state [OPTIONS], since the options are spelled out below. Commit 499afa2 added --image-opts, but forgot to document it in --help. Likewise for commit 9e8f183 and -d/--discard. Commit e3aff4f6 put "-o/--offset" in the long opts, but it has never been honored. Add a note that '-n' is short for '-t none'. Commit 9a2d77ad killed the -C option, but forgot to undocument it for the 'open' subcommand. Finally, commit 10d9d75 removed -g/--growable, but forgot to cull it from the valid short options. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 1462677405-4752-2-git-send-email-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qmp: add monitor command to add/remove a child | Wen Congyang | 3 | -0/+140
The new QMP command name is x-blockdev-change. It's just for adding/removing quorum's child now, and doesn't support all kinds of children, all kinds of operations, nor all block drivers. So it is experimental now. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1462865799-19402-4-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | quorum: implement bdrv_add_child() and bdrv_del_child() | Wen Congyang | 3 | -6/+84
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Message-id: 1462865799-19402-3-git-send-email-xiecl.fnst@cn.fujitsu.com Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | Add new block driver interface to add/delete a BDS's child | Wen Congyang | 3 | -0/+58
In some cases, we want to take a quorum child offline, and take another child online. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1462865799-19402-2-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | qemu-img: check block status of backing file when converting. | Ren Kimura | 1 | -2/+13
When converting images, check the block status of its backing file chain to avoid needlessly reading zeros. Signed-off-by: Ren Kimura <rkx1209dev@gmail.com> Message-id: 1461773098-20356-1-git-send-email-rkx1209dev@gmail.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | iotests: fix the redirection order in 083 | Wei Jiangang | 1 | -2/+2
It should redirect stdout to /dev/null first, then redirect stderr to whatever stdout currently points at. Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com> Message-id: 1461665601-14908-1-git-send-email-weijg.fnst@cn.fujitsu.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 | block: Inactivate all children | Fam Zheng | 1 | -11/+36
Currently we only inactivate the top BDS. Actually bdrv_inactivate should be the opposite of bdrv_invalidate_cache. Recurse into the whole subtree instead. Because a node may have multiple parents, and because once BDRV_O_INACTIVE is set for a node, further writes are not allowed, we cannot interleave flag settings and .bdrv_inactivate calls (that may submit write to other nodes in a graph) within a single pass. Therefore two passes are used here. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
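The two-pass argument can be illustrated with a toy graph. This is a sketch under assumptions, not the block-layer code: `node`, `driver_inactivate()` and `inactivate_all()` are hypothetical names, the `inactive` field stands in for BDRV_O_INACTIVE, and the hook only counts calls instead of submitting writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

typedef struct node {
    struct node *children[MAX_CHILDREN];
    size_t n_children;
    bool inactive;   /* stands in for BDRV_O_INACTIVE */
    int hook_calls;  /* how often the driver hook ran on this node */
} node;

/* Hypothetical driver hook: it may still write to its children, so none of
 * them may have been marked inactive yet. */
static void driver_inactivate(node *n)
{
    for (size_t i = 0; i < n->n_children; i++) {
        assert(!n->children[i]->inactive);
    }
    n->hook_calls++;
}

static void inactivate_recurse(node *n, bool setting_flag)
{
    if (setting_flag) {
        n->inactive = true;      /* pass 2: only flip flags */
    } else {
        driver_inactivate(n);    /* pass 1: only run driver hooks */
    }
    for (size_t i = 0; i < n->n_children; i++) {
        inactivate_recurse(n->children[i], setting_flag);
    }
}

/* Two full passes over the subtree, so no flag is set before every hook ran. */
static void inactivate_all(node *root)
{
    inactivate_recurse(root, false);
    inactivate_recurse(root, true);
}
```

With a single interleaved pass, a node shared by two parents could be flagged inactive through the first parent before the second parent's hook had a chance to write to it; the assertion in the hook models that constraint.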
2016-05-12 | block: Drop superfluous invalidating bs->file from drivers | Fam Zheng | 3 | -29/+0
Now they are invalidated by the block layer, so it's not necessary to do this in block drivers' implementations of .bdrv_invalidate_cache. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | block: Invalidate all children | Fam Zheng | 1 | -6/+14
Currently we only recurse to bs->file, which will miss the children in quorum and VMDK. Recurse into the whole subtree to avoid that. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | nbd: Simplify client FUA handling | Eric Blake | 3 | -27/+8
Now that the block layer honors per-bds FUA support, we don't have to duplicate the fallback flush at the NBD layer. The static function nbd_co_writev_flags() is no longer needed, and the driver can just directly use nbd_client_co_writev(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | block: Honor BDRV_REQ_FUA during write_zeroes | Eric Blake | 5 | -5/+33
The block layer has a couple of cases where it can lose Force Unit Access semantics when writing a large block of zeroes, such that the request returns before the zeroes have been guaranteed to land on underlying media. SCSI does not support FUA during WRITESAME(10/16); FUA is only supported if it falls back to WRITE(10/16). But where the underlying device is new enough to not need a fallback, it means that any upper layer request with FUA semantics was silently ignoring BDRV_REQ_FUA. Conversely, NBD has situations where it can support FUA but not ZERO_WRITE; when that happens, the generic block layer fallback to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu 2.6) was losing the FUA flag. The problem of losing flags unrelated to ZERO_WRITE has been latent in bdrv_co_do_write_zeroes() since commit aa7bfbff, but back then, it did not matter because there was no FUA flag. It became observable when commit 93f5e6d8 paved the way for flags that can impact correctness, when we should have been using bdrv_co_writev_flags() with modified flags. Compare to commit 9eeb6dd, which got flag manipulation right in bdrv_co_do_zero_pwritev(). Symptoms: I tested with qemu-io with default writethrough cache (which is supposed to use FUA semantics on every write), and targeted an NBD client connected to a server that intentionally did not advertise NBD_FLAG_SEND_FUA. When doing 'write 0 512', the NBD client sent two operations (NBD_CMD_WRITE then NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing 'write -z 0 512', the NBD client sent only NBD_CMD_WRITE. The fix is to do a cleanup bdrv_co_flush() at the end of the operation if any step in the middle relied on a BDS that does not natively support FUA for that step (note that we don't need to flush after every operation, if the operation is broken into chunks based on bounce-buffer sizing). 
Each BDS gains a new flag .supported_zero_flags, which parallels the use of .supported_write_flags but only when accessing a zero write operation (the flags MUST be different, because of SCSI having different semantics based on WRITE vs. WRITESAME; and also because BDRV_REQ_MAY_UNMAP only makes sense on zero writes). Also fix some documentation to describe -ENOTSUP semantics, particularly since iscsi depends on those semantics. Down the road, we may want to add a driver where its .bdrv_co_pwritev() honors all three of BDRV_REQ_FUA, BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise this via bs->supported_write_flags for blocks opened by that driver; such a driver should NOT supply .bdrv_co_write_zeroes nor .supported_zero_flags. But none of the drivers touched in this patch want to do that (the act of writing zeroes is different enough from normal writes to deserve a second callback). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
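The "one cleanup flush at the end" idea can be sketched with a mock device. Everything here is illustrative: `mock_dev`, `REQ_FUA` and `write_zeroes()` are hypothetical stand-ins (for a BDS, BDRV_REQ_FUA and the block-layer zero-write path), and the counters replace real I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REQ_FUA 0x1u  /* hypothetical stand-in for BDRV_REQ_FUA */

typedef struct {
    unsigned supported_zero_flags; /* flags the device honors natively */
    int zero_writes;               /* mock counters instead of real I/O */
    int flushes;
} mock_dev;

/* Write `bytes` zeroes in chunks of at most `max_chunk` bytes. If FUA was
 * requested but a chunk could not carry it natively, remember that and issue
 * a single flush after the last chunk rather than one per chunk. */
static void write_zeroes(mock_dev *d, size_t bytes, size_t max_chunk,
                         unsigned flags)
{
    bool need_flush = false;

    assert(max_chunk > 0);
    while (bytes > 0) {
        size_t n = bytes < max_chunk ? bytes : max_chunk;

        if ((flags & REQ_FUA) && !(d->supported_zero_flags & REQ_FUA)) {
            need_flush = true;  /* this chunk goes out without FUA */
        }
        d->zero_writes++;
        bytes -= n;
    }
    if (need_flush) {
        d->flushes++;           /* one cleanup flush for the whole request */
    }
}
```

A device that advertises FUA for zero writes never triggers the fallback flush; one that does not gets exactly one flush regardless of how many chunks the request was split into.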
2016-05-12 | block: Make supported_write_flags a per-bds property | Eric Blake | 6 | -14/+17
Pre-patch, .supported_write_flags lives at the driver level, which means we are blindly declaring that all block devices using a given driver will either equally support FUA, or that we need a fallback at the block layer. But there are drivers where FUA support is a per-block decision: the NBD block driver is dependent on the remote server advertising NBD_FLAG_SEND_FUA (and has fallback code to duplicate the flush that the block layer would do if NBD had not set .supported_write_flags); and the iscsi block driver is dependent on the mode sense bits advertised by the underlying device (and is currently silently ignoring FUA requests if the underlying device does not support FUA). The fix is to make supported flags as a per-BDS option, set during .bdrv_open(). This patch moves the variable and fixes NBD and iscsi to set it only conditionally; later patches will then further simplify the NBD driver to quit duplicating work done at the block layer, as well as tackle the fact that SCSI does not support FUA semantics on WRITESAME(10/16) but only on WRITE(10/16). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
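A small sketch of the per-device idea: the capability is recorded on each opened instance rather than on the driver, so two devices using the same driver can differ. `mock_bds`, `mock_open()` and `REQ_FUA` are hypothetical names; in the commit the field is set during .bdrv_open().

```c
#include <stdbool.h>

#define REQ_FUA 0x1u  /* hypothetical stand-in for BDRV_REQ_FUA */

typedef struct {
    unsigned supported_write_flags; /* per opened device, not per driver */
} mock_bds;

/* Set at open time from what the backend reports, e.g. whether a remote
 * server advertised FUA support. */
static void mock_open(mock_bds *bs, bool backend_has_fua)
{
    bs->supported_write_flags = backend_has_fua ? REQ_FUA : 0;
}

/* The generic layer needs a flush fallback only when FUA was requested and
 * this particular device cannot honor it natively. */
static bool needs_flush_fallback(const mock_bds *bs, unsigned flags)
{
    return (flags & REQ_FUA) && !(bs->supported_write_flags & REQ_FUA);
}
```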
2016-05-12 | qcow2: improve qcow2_co_write_zeroes() | Denis V. Lunev | 1 | -6/+59
There is a possibility that qcow2_co_write_zeroes() will be called with a partial block. This could be synthetically triggered with qemu-io -c "write -z 32k 4k" and can happen in real life in qemu-nbd. The latter happens under the following conditions: (1) qemu-nbd is started with --detect-zeroes=on and is connected to the kernel NBD client (2) third party program opens kernel NBD device with O_DIRECT (3) third party program performs write operation with memory buffer not aligned to the page In this case qcow2_co_write_zeroes() is unable to perform the operation and mark the entire cluster as zeroed, so it returns ENOTSUP. Thus the caller switches to the non-optimized version and writes real zeroes to the disk. The patch creates a shortcut. If the block is read as zeroes, e.g. if it is unallocated, the request is extended to cover the full block. The user-visible situation with this block is not changed. Before the patch the block is filled in the image with real zeroes. After the patch the block is marked as zeroed in metadata. Thus any subsequent changes in the backing store chain are not affected. Kevin, thank you for a cool suggestion. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
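The shortcut can be sketched as a range computation: a partial request inside one cluster is widened to the whole cluster only when that cluster already reads as zeroes. This is an illustration with hypothetical names (`range`, `extend_to_cluster()`), and the "reads as zeroes" check is passed in as a boolean rather than queried from image metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t offset;
    uint64_t bytes;
} range;

/* Try to turn a zero-write request into a whole-cluster request.
 * Returns false when the caller must fall back to writing real zeroes. */
static bool extend_to_cluster(range *r, uint64_t cluster_size,
                              bool cluster_reads_zero)
{
    assert(cluster_size > 0);
    uint64_t start = r->offset - r->offset % cluster_size;
    uint64_t end = start + cluster_size;

    if (r->offset + r->bytes > end) {
        return false;            /* spans clusters: not covered by this sketch */
    }
    if (r->offset == start && r->bytes == cluster_size) {
        return true;             /* already a full cluster */
    }
    if (!cluster_reads_zero) {
        return false;            /* widening would destroy non-zero data */
    }
    r->offset = start;           /* safe: the rest of the cluster is zero anyway */
    r->bytes = cluster_size;
    return true;
}
```

With an assumed 64 KiB cluster size, the 'write -z 32k 4k' request from the commit message becomes a zero write of the whole first cluster when that cluster is unallocated.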
2016-05-12 | block: Kill unused sector-based blk_* functions | Eric Blake | 2 | -61/+0
Now that there are no remaining clients, we can drop the sector-based blk_read(), blk_write(), blk_aio_readv(), and blk_aio_writev(). Sadly, there are still remaining sector-based interfaces, such as blk_*discard(), or blk_write_compressed(); those will have to wait for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | qemu-io: Switch to byte-based block access | Eric Blake | 1 | -52/+10
qemu-io is the last user of several sector-based interfaces. This patch upgrades to the new interfaces under the hood, then deletes the resulting dead code. Note that for maximum back-compat, while the -p option is no longer required to get blk_pread(), it is still needed to allow for unaligned access; this is because qemu-iotest 23 relies on qemu-io rejecting unaligned accesses without -p. A later patch may clean up the interface to be more user-friendly, but it's better to separate what's done under the hood from what the user sees. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | qemu-img: Switch to byte-based block access | Eric Blake | 1 | -9/+19
Sector-based blk_write() should die; switch to byte-based blk_pwrite() instead. Likewise for blk_read(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | nbd: Switch to byte-based block access | Eric Blake | 1 | -4/+9
Sector-based blk_read() should die; switch to byte-based blk_pread() instead. Add a constant for our magic number 512, to make it obvious that this size will NOT change even if BDRV_SECTOR_SIZE does, even though the two happen to be the same for now. Split assignments from conditionals to keep checkpatch.pl happy. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | atapi: Switch to byte-based block access | Eric Blake | 1 | -8/+11
Sector-based blk_read() should die; switch to byte-based blk_pread() instead. Add new defines ATAPI_SECTOR_BITS and ATAPI_SECTOR_SIZE to use anywhere we were previously scaling BDRV_SECTOR_* by 4, for better legibility. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
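The effect of the two new defines can be sketched as follows. The macro names below are hypothetical stand-ins (`SECTOR_BITS` for BDRV_SECTOR_BITS, `CD_SECTOR_*` for ATAPI_SECTOR_*), and 512-byte block-layer sectors are assumed; the factor of 4 comes from the commit message.

```c
#include <stdint.h>

#define SECTOR_BITS     9                       /* assumed 512-byte sectors */
#define SECTOR_SIZE     (1 << SECTOR_BITS)
#define CD_SECTOR_BITS  (2 + SECTOR_BITS)       /* 4 times larger: 2 extra bits */
#define CD_SECTOR_SIZE  (1 << CD_SECTOR_BITS)   /* 2048 bytes */

/* Byte offset of a CD logical block, as needed by a byte-based read. */
static int64_t cd_lba_to_offset(int64_t lba)
{
    return lba * CD_SECTOR_SIZE;
}

/* The same position in 512-byte sectors, as the old code had to compute. */
static int64_t cd_lba_to_sector512(int64_t lba)
{
    return lba * (CD_SECTOR_SIZE / SECTOR_SIZE);
}
```

Naming the 2048-byte unit once avoids repeating "times 4" at every call site.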
2016-05-12 | m25p80: Switch to byte-based block access | Eric Blake | 1 | -16/+7
Sector-based blk_read() should die; switch to byte-based blk_pread() instead. Likewise for blk_aio_readv() and blk_aio_writev(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | sd: Switch to byte-based block access | Eric Blake | 1 | -47/+4
Sector-based blk_write() should die; switch to byte-based blk_pwrite() instead. Likewise for blk_read(). Greatly simplifies the code, now that we let the block layer take care of alignment and read-modify-write on our behalf :) In fact, we no longer need to include 'buf' in the migration stream (although we do have to ensure that the stream remains compatible). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | pflash: Switch to byte-based block access | Eric Blake | 2 | -12/+12
Sector-based blk_write() should die; switch to byte-based blk_pwrite() instead. Likewise for blk_read(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | onenand: Switch to byte-based block access | Eric Blake | 1 | -14/+27
Sector-based blk_write() should die; switch to byte-based blk_pwrite() instead. Likewise for blk_read(). This particular device picks its size during onenand_initfn(), and can be at most 0x80000000 bytes; therefore, shifting an 'int sec' request to get back to a byte offset should never overflow 32 bits. But adding assertions to document that point should not hurt. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
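The overflow reasoning can be made concrete with a small helper that converts an 'int sec' into a byte offset and asserts the bound stated above. The helper name and macros are hypothetical; the 0x80000000 limit is taken from the commit message, and 512-byte sectors are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS       9            /* assumed 512-byte sectors */
#define MAX_DEVICE_BYTES  0x80000000u  /* size limit quoted in the commit */

/* Convert a sector index into a byte offset. The shift is done in 64 bits so
 * the assertion can check the bound instead of silently overflowing. */
static uint32_t sec_to_offset(int sec)
{
    assert(sec >= 0);
    uint64_t offset = (uint64_t)sec << SECTOR_BITS;
    assert(offset < MAX_DEVICE_BYTES);  /* documents the "fits in 32 bits" claim */
    return (uint32_t)offset;
}
```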
2016-05-12 | nand: Switch to byte-based block access | Eric Blake | 1 | -13/+23
Sector-based blk_write() should die; switch to byte-based blk_pwrite() instead. Likewise for blk_read(). This file is doing some complex computations to map various flash page sizes (256, 512, and 2048) atop generic uses of 512-byte sector operations. Perhaps someone will want to tidy up the file for fewer gymnastics in managing addresses and offsets, and less wasteful visits of 256-byte pages, but it was out of scope for this series, where I just went with the mechanical conversion. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | fdc: Switch to byte-based block access | Eric Blake | 1 | -8/+17
Sector-based blk_write() should die; switch to byte-based blk_pwrite() instead. Likewise for blk_read(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | xen_disk: Switch to byte-based aio block access | Eric Blake | 1 | -6/+4
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch to byte-based blk_aio_preadv() and blk_aio_pwritev() instead. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | virtio: Switch to byte-based aio block access | Eric Blake | 2 | -11/+9
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch to byte-based blk_aio_preadv() and blk_aio_pwritev() instead. The trace is modified at the same time, and nb_sectors is now unused. Fix a comment typo while in the vicinity. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | scsi-disk: Switch to byte-based aio block access | Eric Blake | 1 | -21/+20
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch to byte-based blk_aio_preadv() and blk_aio_pwritev() instead. As part of the cleanup, scsi_init_iovec() no longer needs to return a value, and reword a comment. [ kwolf: Fix read accounting change ] Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | ide: Switch to byte-based aio block access | Eric Blake | 5 | -21/+18
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch to byte-based blk_aio_preadv() and blk_aio_pwritev() instead. The patch had to touch multiple files at once, because dma_blk_io() takes pointers to the functions, and ide_issue_trim() piggybacks on the same interface (while ignoring offset under the hood). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | block: Introduce byte-based aio read/write | Eric Blake | 2 | -2/+24
blk_aio_readv() and blk_aio_writev() are annoying in that they can't access sub-sector granularity, and cannot pass flags. Also, they require the caller to pass redundant information about the size of the I/O (qiov->size in bytes must match nb_sectors in sectors). Add new blk_aio_preadv() and blk_aio_pwritev() functions to fix the flaws. The next few patches will upgrade callers, then finally delete the old interfaces. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
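Two of the flaws listed above can be shown with a toy model: the sector-based call carries the length twice, so the two values can disagree, and it cannot describe a sub-sector range at all. `mock_qiov` and both helpers are hypothetical, and 512-byte sectors are assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* assumed sector size */

typedef struct {
    size_t size;  /* total bytes described by the I/O vector */
} mock_qiov;

/* Sector-based style: the caller passes nb_sectors in addition to the vector,
 * and the two must agree. */
static bool sector_args_consistent(const mock_qiov *q, int nb_sectors)
{
    return nb_sectors >= 0 && q->size == (size_t)nb_sectors * SECTOR_SIZE;
}

/* A byte range is expressible in sectors only if both ends are aligned;
 * a byte-based interface has no such restriction. */
static bool expressible_in_sectors(uint64_t offset, size_t bytes)
{
    return offset % SECTOR_SIZE == 0 && bytes % SECTOR_SIZE == 0;
}
```

A byte-based call takes just an offset and the vector, so the length has a single source of truth.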
2016-05-12 | block: Switch blk_*write_zeroes() to byte interface | Eric Blake | 6 | -36/+25
Sector-based blk_write() should die; convert the one-off variant blk_write_zeroes() to use an offset/count interface instead. Likewise for blk_co_write_zeroes() and blk_aio_write_zeroes(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | block: Switch blk_read_unthrottled() to byte interface | Eric Blake | 3 | -7/+7
Sector-based blk_read() should die; convert the one-off variant blk_read_unthrottled(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 | block: Allow BDRV_REQ_FUA through blk_pwrite() | Eric Blake | 15 | -33/+37
We have several block drivers that understand BDRV_REQ_FUA, and emulate it in the block layer for the rest by a full flush. But without a way to actually request BDRV_REQ_FUA during a pass-through blk_pwrite(), FUA-aware block drivers like NBD are forced to repeat the emulation logic of a full flush regardless of whether the backend they are writing to could do it more efficiently. This patch just wires up a flags argument; followup patches will actually make use of it in the NBD driver and in qemu-io. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>