From d759c951f3287fad04210a52f2dc93f94cf58c7f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alex=20Benn=C3=A9e?=
Date: Tue, 27 Feb 2018 12:52:48 +0300
Subject: replay: push replay_mutex_lock up the call tree
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Now instead of using the replay_lock to guard the output of the log we
now use it to protect the whole execution section. This replaces what
the BQL used to do when it was held during TCG execution.

We also introduce some rules for locking order - mainly that you
cannot take the replay_mutex while holding the BQL. This leads to some
slight sophistry during start-up and extending the
replay_mutex_destroy function to unlock the mutex without checking
for the BQL condition so it can be cleanly dropped in the non-replay
case.

Signed-off-by: Alex Bennée
Signed-off-by: Pavel Dovgalyuk
Tested-by: Pavel Dovgalyuk
Message-Id: <20180227095248.1060.40374.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini
Signed-off-by: Alex Bennée
---
 docs/replay.txt | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

(limited to 'docs')

diff --git a/docs/replay.txt b/docs/replay.txt
index c52407fe23..959633e7ea 100644
--- a/docs/replay.txt
+++ b/docs/replay.txt
@@ -49,6 +49,28 @@ Modifications of qemu include:
  * recording/replaying user input (mouse and keyboard)
  * adding internal checkpoints for cpu and io synchronization
 
+Locking and thread synchronisation
+----------------------------------
+
+Previously the synchronisation of the main thread and the vCPU thread
+was ensured by the holding of the BQL. However the trend has been to
+reduce the time the BQL was held across the system including under TCG
+system emulation. As it is important that batches of events are kept
+in sequence (e.g. expiring timers and checkpoints in the main thread
+while instruction checkpoints are written by the vCPU thread) we need
+another lock to keep things in lock-step. This role is now handled by
+the replay_mutex_lock. It used to be held only for each event being
+written but now it is held for a whole execution period. This results
+in a deterministic ping-pong between the two main threads.
+
+As the BQL is now a finer grained lock than the replay_lock it is almost
+certainly a bug, and a source of deadlocks, to take the
+replay_mutex_lock while the BQL is held. This is enforced by an assert.
+While the unlocks are usually in the reverse order, this is not
+necessary; you can drop the replay_lock while holding the BQL, without
+doing a more complicated unlock_iothread/replay_unlock/lock_iothread
+sequence.
+
 Non-deterministic events
 ------------------------
 
-- 
cgit v1.2.1
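
The locking rule added to docs/replay.txt above can be illustrated with a
small stand-alone sketch. This is not QEMU code: every sketch_* name is
hypothetical, plain pthread mutexes stand in for the BQL and the replay
mutex, and per-thread flags stand in for whatever ownership tracking the
real implementation uses. It only shows the two points the text makes:
taking the replay lock while the BQL is held trips an assert, and the
replay lock may be dropped while the BQL is still held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BQL and the replay mutex (not QEMU's). */
static pthread_mutex_t sketch_bql = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sketch_replay_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread ownership flags so the ordering rule can be checked. */
static _Thread_local bool sketch_bql_held;
static _Thread_local bool sketch_replay_held;

/* The replay lock may only be taken when this thread does not hold the BQL. */
static bool sketch_replay_lock_allowed(void)
{
    return !sketch_bql_held;
}

static void sketch_replay_lock(void)
{
    /* Ordering rule: replay lock first, BQL second. */
    assert(sketch_replay_lock_allowed());
    pthread_mutex_lock(&sketch_replay_mutex);
    sketch_replay_held = true;
}

static void sketch_replay_unlock(void)
{
    /* No BQL check here: the release order is not constrained. */
    assert(sketch_replay_held);
    sketch_replay_held = false;
    pthread_mutex_unlock(&sketch_replay_mutex);
}

static void sketch_bql_lock(void)
{
    pthread_mutex_lock(&sketch_bql);
    sketch_bql_held = true;
}

static void sketch_bql_unlock(void)
{
    assert(sketch_bql_held);
    sketch_bql_held = false;
    pthread_mutex_unlock(&sketch_bql);
}

/* One execution period as seen by a single thread; returns 0 on completion. */
static int sketch_execution_period(void)
{
    sketch_replay_lock();      /* outer lock, held for the whole period */
    sketch_bql_lock();         /* finer grained lock, taken inside it */
    /* ... process a batch of events here ... */
    sketch_replay_unlock();    /* dropped while the BQL is still held */
    sketch_bql_unlock();
    return 0;
}
```

Calling sketch_replay_lock() between sketch_bql_lock() and
sketch_bql_unlock() would abort on the assert, which is the behaviour the
documentation text describes for the wrong lock order.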