summaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)AuthorFilesLines
2017-09-07Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell12-337/+966
Block layer patches # gpg: Signature made Wed 06 Sep 2017 14:44:41 BST # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing qemu-iotests: add 184 for throttle filter driver block: add throttle block filter driver block: convert ThrottleGroup to object with QOM block: tidy ThrottleGroupMember initializations block: add aio_context field in ThrottleGroupMember block: move ThrottleGroup membership to ThrottleGroupMember block: document semantics of bdrv_co_preadv|pwritev qcow: Check failure of bdrv_getlength() and bdrv_truncate() qcow: Change signature of get_cluster_offset() block: add default implementations for bdrv_co_get_block_status() block: remove bdrv_truncate callback in blkdebug block: remove unused bdrv_media_changed block: pass bdrv_* methods to bs->file by default in block filters Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-06qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushingPavel Butsykin1-8/+8
After calling qcow2_inactivate(), all qcow2 caches must be flushed, but this may not happen, because the last call qcow2_store_persistent_dirty_bitmaps() can lead to marking l2/refcont cache as dirty. Let's move qcow2_store_persistent_dirty_bitmaps() before the caсhe flushing to fix it. Cc: qemu-stable@nongnu.org Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-06block: add throttle block filter driverManos Pitsidianakis3-1/+252
block/throttle.c uses existing I/O throttle infrastructure inside a block filter driver. I/O operations are intercepted in the filter's read/write coroutines, and referred to block/throttle-groups.c The driver can be used with the syntax -drive driver=throttle,file.filename=foo.qcow2,throttle-group=bar which registers the throttle filter node with the ThrottleGroup 'bar'. The given group must be created beforehand with object-add or -object. Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-05block: convert ThrottleGroup to object with QOMManos Pitsidianakis1-42/+382
ThrottleGroup is converted to an object. This will allow the future throttle block filter drive easy creation and configuration of throttle groups in QMP and cli. A new QAPI struct, ThrottleLimits, is introduced to provide a shared struct for all throttle configuration needs in QMP. ThrottleGroups can be created via CLI as -object throttle-group,id=foo,x-iops-total=100,x-.. where x-* are individual limit properties. Since we can't add non-scalar properties in -object this interface must be used instead. However, setting these properties must be disabled after initialization because certain combinations of limits are forbidden and thus configuration changes should be done in one transaction. The individual properties will go away when support for non-scalar values in CLI is implemented and thus are marked as experimental. ThrottleGroup also has a `limits` property that uses the ThrottleLimits struct. It can be used to create ThrottleGroups or set the configuration in existing groups as follows: { "execute": "object-add", "arguments": { "qom-type": "throttle-group", "id": "foo", "props" : { "limits": { "iops-total": 100 } } } } { "execute" : "qom-set", "arguments" : { "path" : "foo", "property" : "limits", "value" : { "iops-total" : 99 } } } This also means a group's configuration can be fetched with qom-get. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-05block: tidy ThrottleGroupMember initializationsManos Pitsidianakis2-3/+3
Move the CoMutex and CoQueue inits inside throttle_group_register_tgm() which is called whenever a ThrottleGroupMember is initialized. There's no need for them to be separate. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-05block: add aio_context field in ThrottleGroupMemberManos Pitsidianakis2-23/+30
timer_cb() needs to know about the current Aio context of the throttle request that is woken up. In order to make ThrottleGroupMember backend agnostic, this information is stored in an aio_context field instead of accessing it from BlockBackend. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-05block: move ThrottleGroup membership to ThrottleGroupMemberManos Pitsidianakis3-176/+186
This commit eliminates the 1:1 relationship between BlockBackend and throttle group state. Users will be able to create multiple throttle nodes, each with its own throttle group state, in the future. The throttle group state cannot be per-BlockBackend anymore, it must be per-throttle node. This is done by gathering ThrottleGroup membership details from BlockBackendPublic into ThrottleGroupMember and refactoring existing code to use the structure. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-05Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-20170905-2' ↵Peter Maydell2-2/+2
into staging Merge QEMU I/O 2017/09/05 v2 # gpg: Signature made Tue 05 Sep 2017 13:22:36 BST # gpg: using RSA key 0xBE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * remotes/berrange/tags/pull-qio-20170905-2: io: fix check for handshake completion in TLS test io: add new qio_channel_{readv, writev, read, write}_all functions io: fix typo in docs comment for qio_channel_read util: remove the obsolete non-blocking connect io: fix temp directory used by test-io-channel-tls test Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-05util: remove the obsolete non-blocking connectCao jin2-2/+2
The non-blocking connect mechanism is obsolete, and it doesn't work well in inet connection, because it will call getaddrinfo first and getaddrinfo will blocks on DNS lookups. Since commit e65c67e4 & d984464e, the non-blocking connect of migration goes through QIOChannel in a different manner(using a thread), and nobody use this old non-blocking connect anymore. Any newly written code which needs a non-blocking connect should use the QIOChannel code, so we can drop NonBlockingConnectHandler as a concept entirely. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-09-04qcow: Check failure of bdrv_getlength() and bdrv_truncate()Eric Blake1-8/+22
Omitting the check for whether bdrv_getlength() and bdrv_truncate() failed meant that it was theoretically possible to return an incorrect offset to the caller. More likely, conditions for either of these functions to fail would also cause one of our other calls (such as bdrv_pread() or bdrv_pwrite_sync()) to also fail, but auditing that we are safe is difficult compared to just patching things to always forward on the error rather than ignoring it. Use osdep.h macros instead of open-coded rounding while in the area. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-04qcow: Change signature of get_cluster_offset()Eric Blake1-50/+73
The old signature has an ambiguous meaning for a return of 0: either no allocation was requested or necessary, or an error occurred (but any errno associated with the error is lost to the caller, which then has to assume EIO). Better is to follow the example of qcow2, by changing the signature to have a separate return value that cleanly distinguishes between failure and success, along with a parameter that cleanly holds a 64-bit value. Then update all callers. While auditing that all return paths return a negative errno (rather than -1), I also simplified places where we can pass NULL rather than a local Error that just gets thrown away. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-04block: add default implementations for bdrv_co_get_block_status()Manos Pitsidianakis4-33/+29
bdrv_co_get_block_status_from_file() and bdrv_co_get_block_status_from_backing() set *file to bs->file and bs->backing respectively, so that bdrv_co_get_block_status() can recurse to them. Future block drivers won't have to duplicate code to implement this. Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-04block: remove bdrv_truncate callback in blkdebugManos Pitsidianakis1-7/+1
Now that bdrv_truncate is passed to bs->file by default, remove the callback from block/blkdebug.c and set is_filter to true. is_filter also gives access to other callbacks that are forwarded automatically to bs->file for filters. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-04block: remove unused bdrv_media_changedManos Pitsidianakis1-6/+0
This function is not used anywhere, so remove it. Markus Armbruster adds: The i82078 floppy device model used to call bdrv_media_changed() to implement its media change bit when backed by a host floppy. This went away in 21fcf36 "fdc: simplify media change handling". Probably broke host floppy media change. Host floppy pass-through was dropped in commit f709623. bdrv_media_changed() has never been used for anything else. Remove it. (Source is Message-ID: <87y3ruaypm.fsf@dusky.pond.sub.org>) Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-04qapi: drop the sentinel in enum arrayMarc-André Lureau1-1/+0
Now that all usages have been converted to user lookup helpers. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-14-marcandre.lureau@redhat.com> [Rebased, superfluous local variable dropped, missing check-qom-proplist.c update added] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-17-git-send-email-armbru@redhat.com>
2017-09-04qapi: Change data type of the FOO_lookup generated for enum FOOMarc-André Lureau7-17/+21
Currently, a FOO_lookup is an array of strings terminated by a NULL sentinel. A future patch will generate enums with "holes". NULL-termination will cease to work then. To prepare for that, store the length in the FOO_lookup by wrapping it in a struct and adding a member for the length. The sentinel will be dropped next. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-13-marcandre.lureau@redhat.com> [Basically redone] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-16-git-send-email-armbru@redhat.com> [Rebased]
2017-09-04qapi: Mechanically convert FOO_lookup[...] to FOO_str(...)Markus Armbruster10-13/+13
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-14-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-09-04qapi: Generate FOO_str() macro for QAPI enum FOOMarkus Armbruster7-7/+0
The next commit will put it to use. May look pointless now, but we're going to change the FOO_lookup's type, and then it'll help. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-13-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-09-04quorum: Use qapi_enum_parse() in quorum_open()Marc-André Lureau1-19/+9
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-12-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, qemu_opt_get() factored out, commit message tweaked] Cc: Alberto Garcia <berto@igalia.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-9-git-send-email-armbru@redhat.com>
2017-09-04block: Use qemu_enum_parse() in blkdebug_debug_breakpoint()Marc-André Lureau1-20/+8
The error message on invalid blkdebug events changes from qemu-system-x86_64: LOCATION: Invalid event name "VALUE" to qemu-system-x86_64: LOCATION: invalid parameter value: VALUE Slight degradation, but the message is sub-par even before the patch. When complaining about a parameter value, both parameter name and value should be mentioned, as the value may well not be unique. Left for another day. Also left is the error message's unhelpful location: it points to the config=FILENAME rather than into that file. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-11-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, commit message rewritten] Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-8-git-send-email-armbru@redhat.com>
2017-09-04qapi: Drop superfluous qapi_enum_parse() parameter maxMarkus Armbruster5-14/+10
The lookup tables have a sentinel, no need to make callers pass their size. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-3-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Rebased, commit message corrected]
2017-08-31Merge remote-tracking branch 'remotes/elmarco/tags/tidy-pull-request' into ↵Peter Maydell5-7/+7
staging # gpg: Signature made Thu 31 Aug 2017 11:29:33 BST # gpg: using RSA key 0xDAE8E10975969CE5 # gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" # gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5 * remotes/elmarco/tags/tidy-pull-request: (29 commits) eepro100: replace g_malloc()+memcpy() with g_memdup() test-iov: replace g_malloc()+memcpy() with g_memdup() i386: replace g_malloc()+memcpy() with g_memdup() i386: introduce ELF_NOTE_SIZE macro decnumber: use DIV_ROUND_UP kvm: use DIV_ROUND_UP i386/dump: use DIV_ROUND_UP ppc: use DIV_ROUND_UP msix: use DIV_ROUND_UP usb-hub: use DIV_ROUND_UP q35: use DIV_ROUND_UP piix: use DIV_ROUND_UP virtio-serial: use DIV_ROUND_UP console: use DIV_ROUND_UP monitor: use DIV_ROUND_UP virtio-gpu: use DIV_ROUND_UP vga: use DIV_ROUND_UP ui: use DIV_ROUND_UP vnc: use DIV_ROUND_UP vvfat: use DIV_ROUND_UP ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-31Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell3-27/+22
staging # gpg: Signature made Thu 31 Aug 2017 09:21:49 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: qcow2: allocate cluster_cache/cluster_data on demand qemu-doc: Add UUID support in initiator name tests: migration/guestperf Python 2.6 argparse compatibility docker.py: Python 2.6 argparse compatibility scripts: add argparse module for Python 2.6 compatibility misc: Remove unused Error variables oslib-posix: Print errors before aborting on qemu_alloc_stack() throttle: Test the valid range of config values throttle: Make burst_length 64bit and add range checks throttle: Make LeakyBucket.avg and LeakyBucket.max integer types throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket() throttle: Make throttle_is_valid() a bit less verbose throttle: Update the throttle_fix_bucket() documentation throttle: Fix wrong variable name in the header documentation nvme: Fix get/set number of queues feature, again Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-31vvfat: use DIV_ROUND_UPMarc-André Lureau1-2/+2
I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-08-31vpc: use DIV_ROUND_UPMarc-André Lureau1-2/+2
I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-08-31qcow2: use DIV_ROUND_UPMarc-André Lureau1-1/+1
I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-08-31dmg: use DIV_ROUND_UPMarc-André Lureau1-1/+1
I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-08-31vhdx: use QEMU_ALIGN_DOWNMarc-André Lureau1-1/+1
I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-08-30block/nbd-client: refactor request send/receiveVladimir Sementsov-Ogievskiy1-47/+26
Add nbd_co_request, to remove code duplications in nbd_client_co_{pwrite,pread,...} functions. Also this is needed for further refactoring. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com> [eblake: make nbd_co_request a wrapper, rather than merging two existing functions] Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30block/nbd-client: rename nbd_recv_coroutines_enter_allVladimir Sementsov-Ogievskiy1-2/+2
Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all, as it most probably just adds all recv coroutines into co_queue_wakeup, rather than directly enter them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com> [eblake: tweak commit message] Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30block/nbd-client: get rid of ssize_tVladimir Sementsov-Ogievskiy1-5/+5
Use int variable for nbd_co_send_request return value (as nbd_co_send_request returns int). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30nbd-client: avoid read_reply_co entry if send failedStefan Hajnoczi1-16/+9
The following segfault is encountered if the NBD server closes the UNIX domain socket immediately after negotiation: Program terminated with signal SIGSEGV, Segmentation fault. #0 aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 441 QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines, (gdb) bt #0 0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 #1 0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207 #2 0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237 #3 0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836 #4 0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, f +lags=0) at block/io.c:1086 #5 0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182 #6 0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032 #7 0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079 #8 0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79 #9 0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6 The problem is that nbd_client_init() uses nbd_client_attach_aio_context() -> aio_co_schedule(new_context, client->read_reply_co). Execution of read_reply_co is deferred to a BH which doesn't run until later. In the mean time blk_co_preadv() can be called and nbd_coroutine_end() calls aio_wake() on read_reply_co. At this point in time read_reply_co's ctx isn't set because it has never been entered yet. This patch simplifies the nbd_co_send_request() -> nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just nbd_co_send_request() -> nbd_co_receive_reply(). The request is "ended" if an error occurs at any point. Callers no longer have to invoke nbd_coroutine_end(). This cleanup also eliminates the segfault because we don't call aio_co_schedule() to wake up s->read_reply_co if sending the request failed. It is only necessary to wake up s->read_reply_co if a reply was received. Note this only happens with UNIX domain sockets on Linux. It doesn't seem possible to reproduce this with TCP sockets. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170829122745.14309-2-stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-30qcow2: allocate cluster_cache/cluster_data on demandStefan Hajnoczi2-12/+17
Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1) * cluster_size + 512 bytes upfront. Allocate s->cluster_cache and s->cluster_data when the first read operation is performance on a compressed cluster. The buffers are freed in .bdrv_close(). .bdrv_open() no longer has any code paths that can allocate these buffers, so remove the free functions in the error code path. This patch can result in significant memory savings when many qcow2 disks are attached or backing file chains are long: Before 12.81% (1,023,193,088B) After 5.36% (393,893,888B) Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170821135530.32344-1-stefanha@redhat.com Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-30misc: Remove unused Error variablesAlberto Garcia2-15/+5
There's a few cases which we're passing an Error pointer to a function only to discard it immediately afterwards without checking it. In these cases we can simply remove the variable and pass NULL instead. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170829120836.16091-1-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-23nbd-client: avoid spurious qio_channel_yield() re-entryStefan Hajnoczi2-14/+28
The following scenario leads to an assertion failure in qio_channel_yield(): 1. Request coroutine calls qio_channel_yield() successfully when sending would block on the socket. It is now yielded. 2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because nbd_receive_reply() failed. 3. Request coroutine is entered and returns from qio_channel_yield(). Note that the socket fd handler has not fired yet so ioc->write_coroutine is still set. 4. Request coroutine attempts to send the request body with nbd_rwv() but the socket would still block. qio_channel_yield() is called again and assert(!ioc->write_coroutine) is hit. The problem is that nbd_read_reply_entry() does not distinguish between request coroutines that are waiting to receive a reply and those that are not. This patch adds a per-request bool receiving flag so nbd_read_reply_entry() can avoid spurious aio_wake() calls. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170822125113.5025-1-stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23mirror: Mark target BB as "force allow inactivate"Fam Zheng1-2/+12
Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-4-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23block-backend: Allow more "can inactivate" casesFam Zheng1-5/+17
These two conditions corresponds to mirror job's source and target, which need to be allowed as they are part of the non-shared storage migration workflow: failing to inactivate either will result in a failure during migration completion. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-3-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [eblake: improve comment grammar] Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23block-backend: Refactor inactivate checkFam Zheng1-5/+14
The logic will be fixed (extended), move it to a separate function. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-2-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-23fix build failure in nbd_read_reply_entry()Igor Mammedov1-1/+1
travis builds fail at HEAD at rc3 master with block/nbd-client.c: In function ‘nbd_read_reply_entry’: block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized] fix it by initializing 'ret' to 0 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-08-15nbd-client: Fix regression when server sends garbageEric Blake2-4/+14
When we switched NBD to use coroutines for qemu 2.9 (in particular, commit a12a712a), we introduced a regression: if a server sends us garbage (such as a corrupted magic number), we quit the read loop but do not stop sending further queued commands, resulting in the client hanging when it never reads the response to those additional commands. In qemu 2.8, we properly detected that the server is no longer reliable, and cancelled all existing pending commands with EIO, then tore down the socket so that all further command attempts get EPIPE. Restore the proper behavior of quitting (almost) all communication with a broken server: Once we know we are out of sync or otherwise can't trust the server, we must assume that any further incoming data is unreliable and therefore end all pending commands with EIO, and quit trying to send any further commands. As an exception, we still (try to) send NBD_CMD_DISC to let the server know we are going away (in part, because it is easier to do that than to further refactor nbd_teardown_connection, and in part because it is the only command where we do not have to wait for a reply). Based on a patch by Vladimir Sementsov-Ogievskiy. A malicious server can be created with the following hack, followed by setting NBD_SERVER_DEBUG to a non-zero value in the environment when running qemu-nbd: | --- a/nbd/server.c | +++ b/nbd/server.c | @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) | stl_be_p(buf + 4, reply->error); | stq_be_p(buf + 8, reply->handle); | | + static int debug; | + static int count; | + if (!count++) { | + const char *str = getenv("NBD_SERVER_DEBUG"); | + if (str) { | + debug = atoi(str); | + } | + } | + if (debug && !(count % debug)) { | + buf[0] = 0; | + } | return nbd_write(ioc, buf, sizeof(buf), errp); | } Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170814213426.24681-1-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-15block-backend: Defer shared_perm tightening migration completionFam Zheng1-0/+41
As in the case of nbd_export_new(), bdrv_invalidate_cache() can be called when migration is still in progress. In this case we are not ready to tighten the shared permissions fenced by blk->disable_perm. Defer to a VM state change handler. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170815130740.31229-4-famz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2017-08-11file-posix: Do runtime check for ofd lock APIFam Zheng1-11/+8
It is reported that on Windows Subsystem for Linux, ofd operations fail with -EINVAL. In other words, QEMU binary built with system headers that exports F_OFD_SETLK doesn't necessarily run in an environment that actually supports it: $ qemu-system-aarch64 ... -drive file=test.vhdx,if=none,id=hd0 \ -device virtio-blk-pci,drive=hd0 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to lock byte 100 As a matter of fact this is not WSL specific. It can happen when running a QEMU compiled against a newer glibc on an older kernel, such as in a containerized environment. Let's do a runtime check to cope with that. Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11qcow2: Check failure of bdrv_getlength()Eric Blake1-1/+4
qcow2_co_pwritev_compressed() should not call bdrv_truncate() if determining the size failed. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11qcow2: Drop debugging dump_refcounts()Eric Blake1-21/+0
It's been #if 0'd since its introduction in 2006, commit 585f8587. We can revive dead code if we need it, but in the meantime, it has bit-rotted (for example, not checking for failure in bdrv_getlength()). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-11vpc: Check failure of bdrv_getlength()Eric Blake1-1/+8
vpc_open() was checking for bdrv_getlength() failure in one, but not the other, location. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08block/nfs: fix mutex assertion in nfs_file_close()Jeff Cody1-3/+8
Commit c096358e747e88fc7364e40e3c354ee0bb683960 introduced assertion checks for when qemu_mutex() functions are called without the corresponding qemu_mutex_init() having initialized the mutex. This uncovered a latent bug in qemu's nfs driver - in nfs_client_close(), the NFSClient structure is overwritten with zeros, prior to the mutex being destroyed. Go ahead and destroy the mutex in nfs_client_close(), and change where we call qemu_mutex_init() so that it is correctly balanced. There are also a couple of memory leaks obscured by the memset, so this fixes those as well. Finally, we should be able to get rid of the memset(), as it isn't necessary. Cc: qemu-stable@nongnu.org Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08parallels: drop check that bdrv_truncate() is workingDenis V. Lunev1-3/+1
This would be actually strange and error prone. If truncate() nowadays will fail, there is something fatally wrong. Let's check for that during the actual work. The only fallback case is when the file is not zero initialized. In this case we should switch to preallocation via fallocate(). Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-08-08parallels: respect error code of bdrv_getlength() in allocate_clusters()Denis V. Lunev1-2/+6
If we can not get the file length, the state of BDS is broken completely. Return error to the caller. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-08-08block: respect error code from bdrv_getlength in handle_aiocb_write_zeroesDenis V. Lunev1-1/+7
Original idea beyond the code in question was the following: we have failed to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the only chance now: if the request comes beyond end of the file. Thus we should calculate file length and respect the error code from that op. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-08-08vmdk: Fix error handling/reporting of vmdk_checkFam Zheng1-8/+18
Errors from the callees must be captured and propagated to our caller, ensure this for both find_extent() and bdrv_getlength(). Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>