summaryrefslogtreecommitdiff
path: root/migration
AgeCommit message (Collapse)AuthorFilesLines
2017-07-31replication: Make --disable-replication compile againMarkus Armbruster1-0/+12
Broken in commit daa33c5. Cc: qemu-stable@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Message-id: 1493298053-17140-1-git-send-email-armbru@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 38bb54f323bf7c83496b6a044cfd28896e997a00) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31migration: setup bi-directional I/O channel for exec: protocolDaniel P. Berrange1-2/+2
Historically the migration data channel has only needed to be unidirectional. Thus the 'exec:' protocol was requesting an I/O channel with O_RDONLY on incoming side, and O_WRONLY on the outgoing side. This is fine for classic migration, but if you then try to run TLS over it, this fails because the TLS handshake requires a bi-directional channel. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> (cherry picked from commit 062d81f0e968fe1597474735f3ea038065027372) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-07block: Ignore guest dev permissions during incoming migrationKevin Wolf1-0/+8
Usually guest devices don't like other writers to the same image, so they use blk_set_perm() to prevent this from happening. In the migration phase before the VM is actually running, though, they don't have a problem with writes to the image. On the other hand, storage migration needs to be able to write to the image in this phase, so the restrictive blk_set_perm() call of qdev devices breaks it. This patch flags all BlockBackends with a qdev device as blk->disable_perm during incoming migration, which means that the requested permissions are stored in the BlockBackend, but not actually applied to its root node yet. Once migration has finished and the VM should be resumed, the permissions are applied. If they cannot be applied (e.g. because the NBD server used for block migration hasn't been shut down), resuming the VM fails. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
2017-03-16postcopy: Check for shared memoryDr. David Alan Gilbert1-0/+18
Postcopy doesn't support migration of RAM shared with another process yet (we've got a bunch of things to understand). Check for the case and don't allow postcopy to be enabled. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16vmstate: fix failed iotests case 68 and 91QingFeng Hao1-4/+4
This problem affects s390x only if we are running without KVM. Basically, S390CPU.irqstate is unused if we do not use KVM, and thus no buffer is allocated. This causes size=0, first_elem=NULL and n_elems=1 in vmstate_load_state and vmstate_save_state. And the assert fails. With this fix we can go back to the old behavior and support VMS_VBUFFER with size 0 and nullptr. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16migration/block: Avoid invoking blk_drain too frequentlyLidong Chen1-0/+3
Increase bmds->cur_dirty after submit io, so reduce the frequency involve into blk_drain, and improve the performance obviously when block migration. The performance test result of this patch: During the block dirty save phase, this patch improve guest os IOPS from 4.0K to 9.5K. and improve the migration speed from 505856 rsec/s to 855756 rsec/s. Signed-off-by: Lidong Chen <jemmy858585@gmail.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16migration: use "" as the default for tls-creds/hostnameDaniel P. Berrange2-1/+5
The tls-creds parameter has a default value of NULL indicating that TLS should not be used. Setting it to non-NULL enables use of TLS. Once tls-creds are set to a non-NULL value via the monitor, it isn't possible to set them back to NULL again, due to current implementation limitations. The empty string is not a valid QObject identifier, so this switches to use "" as the default, indicating that TLS will not be used The tls-hostname parameter has a default value of NULL indicating the the hostname from the migrate connection URI should be used. Again, once tls-hostname is set non-NULL, to override the default hostname for x509 cert validation, it isn't possible to reset it back to NULL via the monitor. The empty string is not a valid hostname, so this switches to use "" as the default, indicating that the migrate URI hostname should be used. Using "" as the default for both, also means that the monitor commands "info migrate_parameters" / "query-migrate-parameters" will report existance of tls-creds/tls-parameters even when set to their default values. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16Change the method to calculate dirty-pages-rateChao Fan1-7/+5
In function cpu_physical_memory_sync_dirty_bitmap, file include/exec/ram_addr.h: if (src[idx][offset]) { unsigned long bits = atomic_xchg(&src[idx][offset], 0); unsigned long new_dirty; new_dirty = ~dest[k]; dest[k] |= bits; new_dirty &= bits; num_dirty += ctpopl(new_dirty); } After these codes executed, only the pages not dirtied in bitmap(dest), but dirtied in dirty_memory[DIRTY_MEMORY_MIGRATION] will be calculated. For example: When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111, and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011, the new_dirty will be 0b00001100, and this function will return 2 but not 4 which is expected. the dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new, so these should be calculated also. Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-13migration: Document handling of bdrv_is_allocated() errorsEric Blake1-0/+2
Migration is the only code left in the tree that does not react to bdrv_is_allocated() failures. But as there is no useful way to react to the failure, and we are merely skipping unallocated sectors on success, just document that our choice of handling is intended. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-02Merge remote-tracking branch ↵Peter Maydell6-167/+257
'remotes/dgilbert/tags/pull-migration-20170228a' into staging Migration pull Note: The 'postcopy: Update userfaultfd.h header' is part of Paolo's header update and will disappear if applied after it. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> # gpg: Signature made Tue 28 Feb 2017 12:38:34 GMT # gpg: using RSA key 0x0516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert/tags/pull-migration-20170228a: (27 commits) postcopy: Add extra check for COPY function postcopy: Add doc about hugepages and postcopy postcopy: Check for userfault+hugepage feature postcopy: Update userfaultfd.h header postcopy: Allow hugepages postcopy: Send whole huge pages postcopy: Mask fault addresses to huge page boundary postcopy: Load huge pages in one go postcopy: Use temporary for placing zero huge pages postcopy: Plumb pagesize down into place helpers postcopy: Record largest page size postcopy: enhance ram_block_discard_range for hugepages exec: ram_block_discard_range postcopy: Chunk discards for hugepages postcopy: Transmit and compare individual page sizes postcopy: Transmit ram size summary word migration: fix use-after-free of to_dst_file migration: Update docs to discourage version bumps migration: fix id leak regression migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-01Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell1-4/+17
Block layer patches # gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (46 commits) block: Add Error parameter to bdrv_append() block: Add Error parameter to bdrv_set_backing_hd() block: Assertions for resize permission block: Assertions for write permissions block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read tests: Remove FIXME comments nbd/server: Use real permissions for NBD exports migration/block: Use real permissions hmp: Request permissions in qemu-io commit: Add filter-node-name to block-commit mirror: Add filter-node-name to blockdev-mirror stream: Use real permissions in streaming block job mirror: Use real permissions in mirror/active commit block job blockjob: Factor out block_job_remove_all_bdrv() block: Allow backing file links in change_parent_backing_link() block: BdrvChildRole.attach/detach() callbacks block: Fix pending requests check in bdrv_append() backup: Use real permissions in backup block job commit: Use real permissions for HMP 'commit' commit: Use real permissions in commit block job ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28migration/block: Use real permissionsKevin Wolf1-5/+17
Request BLK_PERM_CONSISTENT_READ for the source of block migration, and handle potential permission errors as good as we can in this place (which is not very good, but it matches the other failure cases). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28block: Add error parameter to blk_insert_bs()Kevin Wolf1-1/+1
Now that blk_insert_bs() requests the BlockBackend permissions for the node it attaches to, it can fail. Instead of aborting, pass the errors to the callers. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28block: Add permissions to blk_new()Kevin Wolf1-1/+2
We want every user to be specific about the permissions it needs, so we'll pass the initial permissions as parameters to blk_new(). A user only needs to call blk_set_perm() if it wants to change the permissions after the fact. The permissions are stored in the BlockBackend and applied whenever a BlockDriverState should be attached in blk_insert_bs(). This does not include actually choosing the right set of permissions everywhere yet. Instead, the usual FIXME comment is added to each place and will be addressed in individual patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28Add a new qmp command to do checkpoint, query xen replication statusZhang Chen1-0/+23
We can call this qmp command to do checkpoint outside of qemu. Xen colo will need this function. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Wen Congyang <wencongyang@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2017-02-28Add a new qmp command to start/stop replicationZhang Chen1-0/+26
We can call this qmp command to start/stop replication outside of qemu. Like Xen colo need this function. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Wen Congyang <wencongyang@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2017-02-28postcopy: Add extra check for COPY functionDr. David Alan Gilbert1-0/+4
As an extra sanity check, make sure the region we're registering can perform UFFDIO_COPY; the COPY will fail later but this gives a cleaner failure. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-17-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Check for userfault+hugepage featureDr. David Alan Gilbert1-1/+11
We need extra Linux kernel support (~4.11) to support userfaults on hugetlbfs; check for them. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-15-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Allow hugepagesDr. David Alan Gilbert1-24/+1
Allow huge pages in postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-13-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Send whole huge pagesDr. David Alan Gilbert1-1/+5
The RAM save code uses ram_save_host_page to send whole host pages at a time; change this to use the host page size associated with the RAM Block which may be a huge page. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-12-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Mask fault addresses to huge page boundaryDr. David Alan Gilbert1-4/+3
Currently the fault address received by userfault is rounded to the host page boundary and a host page is requested from the source. Use the current RAMBlock page size instead of the general host page size so that for RAMBlocks backed by huge pages we request the whole huge page. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-11-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Load huge pages in one goDr. David Alan Gilbert1-5/+8
The existing postcopy RAM load loop already ensures that it glues together whole host-pages from the target page size chunks sent over the wire. Modify the definition of host page that it uses to be the RAM block page size and thus be huge pages where appropriate. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-10-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Use temporary for placing zero huge pagesDr. David Alan Gilbert1-2/+21
The kernel can't do UFFDIO_ZEROPAGE for huge pages, so we have to allocate a temporary (always zero) page and use UFFDIO_COPYPAGE on it. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-9-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Plumb pagesize down into place helpersDr. David Alan Gilbert2-26/+36
Now we deal with normal size pages and huge pages we need to tell the place handlers the size we're dealing with and make sure the temporary page is large enough. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-8-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Record largest page sizeDr. David Alan Gilbert1-0/+1
Record the largest page size in use; we'll need it soon for allocating temporary buffers. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-7-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28exec: ram_block_discard_rangeDr. David Alan Gilbert3-51/+5
Create ram_block_discard_range in exec.c to replace postcopy_ram_discard_range and most of ram_discard_range. Those two routines are a bit of a weird combination, and ram_discard_range is about to get more complex for hugepages. It's OS dependent code (so shouldn't be in migration/ram.c) but it needs quite a bit of the innards of RAMBlock so doesn't belong in the os*.c. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-5-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Chunk discards for hugepagesDr. David Alan Gilbert1-8/+9
At the start of the postcopy phase, partially sent huge pages must be discarded. The code for dealing with host page sizes larger than the target page size can be reused for this case. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20170224182844.32452-4-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Transmit and compare individual page sizesDr. David Alan Gilbert1-0/+17
When using postcopy with hugepages, we require the source and destination page sizes for any RAMBlock to match; note that different RAMBlocks in the same VM can have different page sizes. Transmit them as part of the RAM information header and fail if there's a difference. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20170224182844.32452-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28postcopy: Transmit ram size summary wordDr. David Alan Gilbert2-11/+38
Replace the host page-size in the 'advise' command by a pagesize summary bitmap; if the VM is just using normal RAM then this will be exactly the same as before, however if they're using huge pages they'll be different, and thus: a) Migration from/to old qemu's that don't understand huge pages will fail early. b) Migrations with different size RAMBlocks will also fail early. This catches it very early; earlier than the detailed per-block check in the next patch. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20170224182844.32452-2-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28migration: fix use-after-free of to_dst_fileVladimir Sementsov-Ogievskiy1-0/+5
hmp_savevm calls qemu_savevm_state(f), which sets to_dst_file=f in global migration state. Then hmp_savevm closes f (g_free called). Next access to to_dst_file in migration state (for example, qmp_migrate_set_speed) will use it after it was freed. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170225193155.447462-5-vsementsov@virtuozzo.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28migration: fix id leak regressionMarc-André Lureau1-0/+1
This leak was introduced in commit 581f08bac22bdd5e081ae07f68071a0fc3c5c2c7. (it stands out quickly with ASAN once the rest of the leaks are also removed from make check with this series) Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170221141451.28305-31-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratableAshijeet Acharya1-0/+15
Commit a3a3d8c7 introduced a segfault bug while checking for 'dc->vmsd->unmigratable' which caused QEMU to crash when trying to add devices which do no set their 'dc->vmsd' yet while initialization. Place a 'dc->vmsd' check prior to it so that we do not segfault for such devices. NOTE: This doesn't compromise the functioning of --only-migratable option as all the unmigratable devices do set their 'dc->vmsd'. Introduce a new function check_migratable() and move the only_migratable check inside it, also use stubs to avoid user-mode qemu build failures. Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com> Message-Id: <1487009088-23891-1-git-send-email-ashijeetacharya@gmail.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28migration/vmstate: fix array of ptr with nullptrsHalil Pasic1-2/+38
Make VMS_ARRAY_OF_POINTER cope with null pointers. Previously the reward for trying to migrate an array with some null pointers in it was an illegal memory access, that is a swift and painless death of the process. Let's make vmstate cope with this scenario. The general approach is, when we encounter a null pointer (element), instead of following the pointer to save/load the data behind it, we save/load a placeholder. This way we can detect if we expected a null pointer at the load side but not null data was saved instead. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170222160119.52771-4-pasic@linux.vnet.ibm.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28migration/vmstate: split up vmstate_base_addrHalil Pasic1-22/+17
Currently vmstate_base_addr does several things: it pinpoints the field within the struct, possibly allocates memory and possibly does the first pointer dereference. Obviously allocation is needed only for load. Let us split up the functionality in vmstate_base_addr and move the address manipulations (that is everything but the allocation logic) to load and save so it becomes more obvious what is actually going on. Like this all the address calculations (and the handling of the flags controlling these) is in one place and the sequence is more obvious. The newly introduced function vmstate_handle_alloc also fixes the allocation for the unused VMS_VBUFFER|VMS_MULTIPLY|VMS_ALLOC scenario and is substantially simpler than the original vmstate_base_addr. In load and save some asserts are added so it's easier to debug situations where we would end up with a null pointer dereference. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170222160119.52771-3-pasic@linux.vnet.ibm.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28migration/vmstate: renames in (load|save)_stateHalil Pasic1-11/+11
The vmstate_(load|save)_state start out with an a void *opaque pointing to some struct, and manipulate one or more elements of one field within that struct. First the field within the struct is pinpointed as opaque + offset, then if this is a pointer the pointer is dereferenced to obtain a pointer to the first element of the vmstate field. Pointers to further elements if any are calculated as first_element + i * element_size (where i is the zero based index of the element in question). Currently base_addr and addr is used as a variable name for the pointer to the first element and the pointer to the current element being processed. This is suboptimal because base_addr is somewhat counter-intuitive (because obtained as base + offset) and both base_addr and addr not very descriptive (that we have a pointer should be clear from the fact that it is declared as a pointer). Let make things easier to understand by renaming base_addr to first_elem and addr to curr_elem. This has the additional benefit of harmonizing with other names within the scope (n_elems, vmstate_n_elems). Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170222160119.52771-2-pasic@linux.vnet.ibm.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28Changing error message of QMP 'migrate_set_downtime' to secondsDaniel Henrique Barboza1-4/+16
Using QMP, the error message of 'migrate_set_downtime' was displaying the values in milliseconds, being misleading with the command that accepts the value in seconds: { "execute": "migrate_set_downtime", "arguments": {"value": 3000}} {"error": {"class": "GenericError", "desc": "Parameter 'downtime_limit' expects an integer in the range of 0 to 2000000 milliseconds"}} This message is also seen in HMP when trying to set the same parameter: (qemu) migrate_set_parameter downtime-limit 3000000 Parameter 'downtime_limit' expects an integer in the range of 0 to 2000000 milliseconds To allow for a proper error message when using QMP, a validation of the user input was added in 'qmp_migrate_set_downtime'. Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Message-Id: <20170222151729.5812-1-danielhb@linux.vnet.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13migration: Add VMSTATE_WITH_TMPDr. David Alan Gilbert1-0/+40
VMSTATE_WITH_TMP is for handling structures where some calculation or rearrangement of the data needs to be performed before the data hits the wire. For example, where the value on the wire is an offset from a non-migrated base, but the data in the structure is the actual pointer. To use it, a temporary type is created and a vmsd used on that type. The first element of the type must be 'parent' a pointer back to the type of the main structure. VMSTATE_WITH_TMP takes care of allocating and freeing the temporary before running the child vmsd. The post_load/pre_save on the child vmsd can copy things from the parent to the temporary using the parent pointer and do any other calculations needed; it can then use normal VMSD entries to do the actual data storage without having to fiddle around with qemu_get_*/qemu_put_* Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20170203160651.19917-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13COLO: Don't process failover request while loading VM's statezhanghailiang1-0/+26
We should not do failover work while the main thread is loading VM's state. Otherwise the consistent of VM's memory and device state will be broken. We will restart the loading process after jump over the stage, The new failover status 'RELAUNCH' will help to record if we need to restart the process. Cc: Eric Blake <eblake@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1484657864-21708-4-git-send-email-zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Added a missing '(Since 2.9)'
2017-02-13COLO: Shutdown related socket fd while do failoverzhanghailiang1-0/+43
If the net connection between primary host and secondary host breaks while COLO/COLO incoming threads are doing read() or write(). It will block until connection is timeout, and the failover process will be blocked because of it. So it is necessary to shutdown all the socket fds used by COLO to avoid this situation. Besides, we should close the corresponding file descriptors after failvoer BH shutdown them, Or there will be an error. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1484657864-21708-3-git-send-email-zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13COLO: fix setting checkpoint-delay not working properlyzhanghailiang2-10/+26
If we set checkpoint-delay through command 'migrate-set-parameters', It will not take effect until we finish last sleep chekpoint-delay, That's will be offensive espeically when we want to change its value from an extreme big one to a proper value. Fix it by using timer to realize checkpoint-delay. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <1484657864-21708-2-git-send-email-zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13migration: consolidate VMStateField.startHalil Pasic2-3/+3
The member VMStateField.start is used for two things, partial data migration for VBUFFER data (basically provide migration for a sub-buffer) and for locating next in QTAILQ. The implementation of the VBUFFER feature is broken when VMSTATE_ALLOC is used. This however goes unnoticed because actually partial migration for VBUFFER is not used at all. Let's consolidate the usage of VMStateField.start by removing support for partial migration for VBUFFER. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Message-Id: <20170203175217.45562-1-pasic@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13migrate: Introduce zero RAM checks to skip RAM migrationAshijeet Acharya1-7/+15
Migration of a "none" machine with no RAM crashes abruptly as bitmap_new() fails and thus aborts. Instead place zero RAM checks at appropriate places to skip migration of RAM in this case and complete migration successfully for devices only. Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com> Message-Id: <1486564125-31366-1-git-send-email-ashijeetacharya@gmail.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13migration: discard non-dirty ram pages after the start of postcopyPavel Butsykin2-0/+23
After the start of postcopy migration there are some non-dirty pages which have already been migrated. These pages are no longer needed on the source vm so that we can free them and it doen't hurt to complete the migration. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Message-Id: <20170203152321.19739-4-pbutsykin@virtuozzo.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13add 'release-ram' migrate capabilityPavel Butsykin3-8/+82
This feature frees the migrated memory on the source during postcopy-ram migration. In the second step of postcopy-ram migration when the source vm is put on pause we can free unnecessary memory. It will allow, in particular, to start relaxing the memory stress on the source host in a load-balancing scenario. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Message-Id: <20170203152321.19739-3-pbutsykin@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Manually merged in Pavel's 'migration: madvise error_report fixup!'
2017-02-13migration: add MigrationState arg for ram_save_/compressed_/page()Pavel Butsykin1-7/+8
Cosmetic patch. The use of ms variable instead of migrate_get_current() looks nicer, especially when there reuse. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Message-Id: <20170203152321.19739-2-pbutsykin@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-06postcopy: Recover block devices on early failureDr. David Alan Gilbert1-0/+25
An early postcopy failure can be recovered from as long as we know we haven't sent the command to run the destination. We have to undo the bdrv_inactivate_all by calling bdrv_invalidate_cache_all Note that I'm not using ms->block_inactive because once we've sent the postcopy package we dont want anything else to try and recover the block storage on the source; the destination might have started writing to it. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170202155909.31784-3-dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-02-06Postcopy: Reset state to avoid cleanup assertDr. David Alan Gilbert1-0/+1
On a destination host with no userfault support an incoming postcopy would cause the state to enter ADVISE before it realised there was no support, and because it was in ADVISE state it would perform a cleanup at the end. Since there was no support the cleanup function should be unreachable, but ends up being called and asserting. Reset the state when we realise we have no support, thus the cleanup doesn't happen. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170202155909.31784-2-dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-02-06migration: Check for ID lengthDr. David Alan Gilbert1-5/+16
The qdev id of a device can be huge if it's on the end of a chain of bridges; in reality such chains shouldn't occur but they can be made to by chaining PCIe bridges together. The migration format has a number of 256 character long format limits; check we don't hit them (we already use pstrcat/cpy but that just protects us from buffer overruns, we fairly quickly hit an assert). Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20170202125956.21942-3-dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-02-06vmstate_register_with_alias_id: Take an Error **Dr. David Alan Gilbert1-1/+2
I'll be adding an error to it in a subsequent patch. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20170202125956.21942-2-dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-02-06migration: create Migration Incoming State at init timeJuan Quintela2-23/+19
Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1485207141-1941-3-git-send-email-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>