summaryrefslogtreecommitdiff
path: root/migration/colo.c
AgeCommit message (Collapse)AuthorFilesLines
2016-11-01migration: fix compiler warning on uninitialized variableJeff Cody1-1/+1
Some older GCC versions (e.g. 4.4.7) report a warning on an uninitialized variable for 'request', even though all possible code paths that reference 'request' will be initialized. To appease these versions, initialize the variable to 0. Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-id: 259818682e41b95ae60f1423b87954a3fe377639.1477950393.git.jcody@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-30configure: Support enable/disable COLO featurezhanghailiang1-1/+1
configure --enable-colo/--disable-colo to switch COLO support on/off. COLO feature doesn't depend on any other external libraries, So here it is reasonable to enable COLO by default, to avoid re-compile QEMU if users want to use this capability. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Implement failover work for secondary VMzhanghailiang1-0/+34
If users require SVM to takeover work, COLO incoming thread should exit from loop while failover BH helps backing to migration incoming coroutine. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Implement the process of failover for primary VMzhanghailiang1-0/+50
For primary side, if COLO gets failover request from users. To be exact, gets 'x_colo_lost_heartbeat' command. COLO thread will exit the loop while the failover BH does the cleanup work and resumes VM. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Introduce state to record failover processzhanghailiang1-0/+4
When handling failover, COLO processes differently according to the different stage of failover process, here we introduce a global atomic variable to record the status of failover. We add four failover status to indicate the different stage of failover process. You should use the helpers to get and set the value. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Add 'x-colo-lost-heartbeat' command to trigger failoverzhanghailiang1-0/+1
We leave users to choose whatever heartbeat solution they want, if the heartbeat is lost, or other errors they detect, they can use experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover, COLO will do operations accordingly. For example, if the command is sent to the Primary side, the Primary side will exit COLO mode, does cleanup work, and then, PVM will take over the service work. If sent to the Secondary side, the Secondary side will run failover work, then takes over PVM's service work. Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Synchronize PVM's state to SVM periodicallyzhanghailiang1-0/+12
Do checkpoint periodically, the default interval is 200ms. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Load VMState into QIOChannelBuffer before restore itzhanghailiang1-2/+65
We should not destroy the state of SVM (Secondary VM) until we receive the complete data of PVM's state, in case the primary fails in the process of sending the state, so we cache the VM's state in secondary side before load it into SVM. Besides, we should call qemu_system_reset() before load VM state, which can ensure the data is intact. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Send PVM state to secondary side when do checkpointzhanghailiang1-7/+78
VM checkpointing is to synchronize the state of PVM to SVM, just like migration does, we re-use save helpers to achieve migrating PVM's state to Secondary side. COLO need to cache the data of VM's state in the secondary side before synchronize it to SVM. COLO need the size of the data to determine how much data should be read in the secondary side. So here, we can get the size of the data by saving it into I/O channel before send it to the secondary side. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Introduce checkpointing protocolzhanghailiang1-2/+199
We need communications protocol of user-defined to control the checkpointing process. The new checkpointing request is started by Primary VM, and the interactive process like below: Checkpoint synchronizing points: Primary Secondary initial work 'checkpoint-ready' <-------------------- @ 'checkpoint-request' @ --------------------> Suspend (Only in hybrid mode) 'checkpoint-reply' <-------------------- @ Suspend&Save state 'vmstate-send' @ --------------------> Send state Receive state 'vmstate-received' <-------------------- @ Release packets Load state 'vmstate-load' <-------------------- @ Resume Resume (Only in hybrid mode) Start Comparing (Only in hybrid mode) NOTE: 1) '@' who sends the message 2) Every sync-point is synchronized by two sides with only one handshake(single direction) for low-latency. If more strict synchronization is required, a opposite direction sync-point should be added. 3) Since sync-points are single direction, the remote side may go forward a lot when this side just receives the sync-point. 4) For now, we only support 'periodic' checkpoint, for which the Secondary VM is not running, later we will support 'hybrid' mode. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30COLO: Establish a new communicating path for COLOzhanghailiang1-0/+28
This new communication path will be used for returning messages from Secondary side to Primary side. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30migration: Switch to COLO process after finishing loadvmzhanghailiang1-0/+21
Switch from normal migration loadvm process into COLO checkpoint process if COLO mode is enabled. We add three new members to struct MigrationIncomingState, 'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO related thread for secondary VM, 'migration_incoming_co' records the original migration incoming coroutine. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30migration: Enter into COLO mode after migration if COLO is enabledzhanghailiang1-0/+29
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side enters this state after the first live migration successfully finished if COLO is enabled by command 'migrate_set_capability x-colo on'. We reuse migration thread, so the process of checkpointing will be handled in migration thread. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30migration: Introduce capability 'x-colo' to migrationzhanghailiang1-0/+19
We add helper function colo_supported() to indicate whether colo is supported or not, with which we use to control whether or not showing 'x-colo' string to users, they can use qmp command 'query-migrate-capabilities' or hmp command 'info migrate_capabilities' to learn if colo is supported. The default value for COLO (COarse-Grain LOck Stepping) is disabled. Cc: Juan Quintela <quintela@redhat.com> Cc: Amit Shah <amit.shah@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit@amitshah.net>