From 21a03d17f2edb1e63f7137d97ba355cc6f19d79f Mon Sep 17 00:00:00 2001
From: Paolo Bonzini
Date: Tue, 21 Jul 2015 16:07:52 +0200
Subject: AioContext: fix broken placement of event_notifier_test_and_clear

event_notifier_test_and_clear must be called before processing events.
Otherwise, an aio_poll could "eat" the notification before the main
I/O thread invokes ppoll().  The main I/O thread then never wakes up.
This is an example of what could happen:

   i/o thread       vcpu thread                   worker thread
   ---------------------------------------------------------------------
   lock_iothread
   notify_me = 1
   ...
   unlock_iothread
                                                   bh->scheduled = 1
                                                   event_notifier_set
                     lock_iothread
                     notify_me = 3
                     ppoll
                     notify_me = 1
                     aio_dispatch
                      aio_bh_poll
                       thread_pool_completion_bh
                                                   bh->scheduled = 1
                                                   event_notifier_set
                      node->io_read(node->opaque)
                       event_notifier_test_and_clear
   ppoll
   *** hang ***

"Tracing" with qemu_clock_get_ns shows pretty much the same behavior as
in the previous bug, so there are no new tricks here---just stare more
at the code until it is apparent.

One could also use a formal model, of course.  The included one shows
this with three processes: notifier corresponds to a QEMU thread pool
worker, temporary_waiter to a VCPU thread that invokes aio_poll(),
waiter to the main I/O thread.  I would be happy to say that the
formal model found the bug for me, but actually I wrote it after the
fact.

This patch is a bit of a big hammer.  The next one optimizes it,
with help (this time for real rather than a posteriori :)) from
another, similar formal model.

Reported-by: Richard W. M. Jones
Signed-off-by: Paolo Bonzini
Reviewed-by: Fam Zheng
Tested-by: Richard W.M. Jones
Message-id: 1437487673-23740-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi
---
 aio-win32.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

(limited to 'aio-win32.c')

diff --git a/aio-win32.c b/aio-win32.c
index ea655b0935..7afc9992d6 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -337,10 +337,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
             aio_context_acquire(ctx);
         }
 
-        if (first && aio_bh_poll(ctx)) {
-            progress = true;
+        if (first) {
+            event_notifier_test_and_clear(&ctx->notifier);
+            progress |= aio_bh_poll(ctx);
+            first = false;
         }
-        first = false;
 
         /* if we have any signaled events, dispatch event */
         event = NULL;
-- 
cgit v1.2.1
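
The ordering problem described in the commit message can be illustrated
with a small single-threaded model.  This is only a sketch and not QEMU
code: ModelCtx and every model_* function are hypothetical stand-ins
invented for the illustration, and the "second completion" is injected
by hand at the point where the trace above shows it arriving.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are NOT QEMU types. */
typedef struct {
    bool notifier_set;  /* stands in for the state of ctx->notifier */
    bool bh_scheduled;  /* stands in for bh->scheduled */
} ModelCtx;

/* Worker thread: schedule a bottom half, then signal the notifier. */
static void model_worker_complete(ModelCtx *ctx)
{
    ctx->bh_scheduled = true;
    ctx->notifier_set = true;
}

/* Stand-in for event_notifier_test_and_clear(). */
static bool model_test_and_clear(ModelCtx *ctx)
{
    bool was_set = ctx->notifier_set;
    ctx->notifier_set = false;
    return was_set;
}

/*
 * Stand-in for aio_bh_poll(): run the pending bottom half.  While it
 * runs, a second worker completion is assumed to arrive, as in the
 * trace in the commit message.
 */
static bool model_bh_poll(ModelCtx *ctx)
{
    if (!ctx->bh_scheduled) {
        return false;
    }
    ctx->bh_scheduled = false;
    model_worker_complete(ctx);  /* second completion races in */
    return true;
}

/*
 * One pass of the vcpu thread's poll.  Returns true if the main I/O
 * thread would then block forever: a bottom half is pending but the
 * notification that should wake it has already been cleared.
 */
bool model_main_thread_hangs(bool clear_before_processing)
{
    ModelCtx ctx = { false, false };

    model_worker_complete(&ctx);  /* first completion */
    assert(ctx.notifier_set);

    if (clear_before_processing) {
        /* Ordering after this patch: clear first, then process. */
        model_test_and_clear(&ctx);
        model_bh_poll(&ctx);
    } else {
        /* Ordering before this patch: process, then clear. */
        model_bh_poll(&ctx);
        model_test_and_clear(&ctx);
    }
    return ctx.bh_scheduled && !ctx.notifier_set;
}
```

In this model, model_main_thread_hangs(false) reports the lost wakeup
and model_main_thread_hangs(true) does not, because a notification set
during processing is never cleared afterwards.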