peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2017-04-11	async: Introduce aio_co_enter	Fam Zheng	1	-1/+6
	They start the coroutine on the specified context. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11	coroutine: Extract qemu_aio_coroutine_enter	Fam Zheng	2	-4/+9
	It's a variant of qemu_coroutine_enter with an explicit AioContext parameter. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-04	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell	1	-0/+11
	* MemoryRegionCache revert * glib optimization workaround * fix "info lapic" segfault on isapc * fix QIOChannel memory leak # gpg: Signature made Mon 03 Apr 2017 18:17:00 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: main-loop: Acquire main_context lock around os_host_main_loop_wait. exec: revert MemoryRegionCache nbd: fix memory leak on socket_connect failed ipmi: Fix macro issues target-i386: fix "info lapic" segfault on isapc iscsi: drop unused IscsiAIOCB.qiov field Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03	main-loop: Acquire main_context lock around os_host_main_loop_wait.	Richard W.M. Jones	1	-0/+11
	When running virt-rescue the serial console hangs from time to time. Virt-rescue runs an ordinary Linux kernel "appliance", but there is only a single idle process running inside, so the qemu main loop is largely idle. With virt-rescue >= 1.37 you may be able to observe the hang by doing: $ virt-rescue -e ^] --scratch ><rescue> while true; do ls -l /usr/bin; done The hang in virt-rescue can be resolved by pressing a key on the serial console. Possibly with the same root cause, we also observed hangs during very early boot of regular Linux VMs with a serial console. Those hangs are extremely rare, but you may be able to observe them by running this command on baremetal for a sufficiently long time: $ while libguestfs-test-tool -t 60 >& /tmp/log ; do echo -n . ; done (Check in /tmp/log that the failure was caused by a hang during early boot, and not some other reason) During investigation of this bug, Paolo Bonzini wrote: > glib is expecting QEMU to use g_main_context_acquire around accesses to > GMainContext. However QEMU is not doing that, instead it is taking its > own mutex. So we should add g_main_context_acquire and > g_main_context_release in the two implementations of > os_host_main_loop_wait; these should undo the effect of Frediano's > glib patch. This patch exactly implements Paolo's suggestion in that paragraph. This fixes the serial console hang in my testing, across 3 different physical machines (AMD, Intel Core i7 and Intel Xeon), over many hours of automated testing. I wasn't able to reproduce the early boot hangs (but as noted above, these are extremely rare in any case). Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1435432 Reported-by: Richard W.M. Jones <rjones@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Message-Id: <20170331205133.23906-1-rjones@redhat.com> [Paolo: this is actually a glib bug: recent glib versions are also expecting g_main_context_acquire around g_poll---but that is not documented and probably not even intended]. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03	sockets: New helper socket_address_crumple()	Markus Armbruster	1	-0/+32
	SocketAddress is a simple union, and simple unions are awkward: they have their variant members wrapped in a "data" object on the wire, and require additional indirections in C. I intend to limit its use to existing external interfaces. New ones should use SocketAddressFlat. I further intend to convert all internal interfaces to SocketAddressFlat. This helper should go away then. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 1490895797-29094-8-git-send-email-armbru@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03	io vnc sockets: Clean up SocketAddressKind switches	Markus Armbruster	1	-3/+1
	We have quite a few switches over SocketAddressKind. Some have case labels for all enumeration values, others rely on a default label. Some abort when the value isn't a valid SocketAddressKind, others report an error then. Unify as follows. Always provide case labels for all enumeration values, to clarify intent. Abort when the value isn't a valid SocketAddressKind, because the program state is messed up then. Improve a few error messages while there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1490895797-29094-4-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03	nbd sockets vnc: Mark problematic address family tests TODO	Markus Armbruster	1	-0/+4
	Certain features make sense only with certain address families. For instance, passing file descriptors requires AF_UNIX. Testing SocketAddress's saddr->type == SOCKET_ADDRESS_KIND_UNIX is obvious, but problematic: it can't recognize AF_UNIX when type == SOCKET_ADDRESS_KIND_FD. Mark such tests of saddr->type TODO. We may want to check the address family with getsockname() there. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1490895797-29094-2-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-29	event_notifier: prevent accidental use after close	Halil Pasic	2	-0/+3
	Let's set the handles to the underlying facilities to their extremal value so no accidental misuse can happen, and to make it obvious that the notifier is dysfunctional. E.g. if we just close an fd but do not touch the int holding the fd eventually a read/write could succeed again when the fd gets reused, and corrupt the file addressed by the fd. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-28	sockets: Fix socket_address_to_string() hostname truncation	Markus Armbruster	1	-7/+2
	We first snprintf() to a fixed buffer, then g_strdup() the result boggle. Worse, the size of the fixed buffer INET6_ADDRSTRLEN + 5 + 4 is bogus: the 4 correctly accounts for '[', ']', ':' and '\0', but INET6_ADDRSTRLEN is not a suitable limit for inet->host, and 5 is not one for inet->port! They are for host and port in numeric form (exploiting that INET6_ADDRSTRLEN > INET_ADDRSTRLEN), but inet->host can also be a hostname, and inet->port can be a service name, to be resolved with getaddrinfo(). Fortunately, the only user so far is the "socket" network backend's net_socket_connected(), which uses it to initialize a NetSocketState's info_str[]. info_str[] has considerable more space: 256 instead of 55. So the bug's impact appears to be limited to truncated "info networks" with the "socket" network backend. The fix is obvious: use g_strdup_printf(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1490268208-23368-1-git-send-email-armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27	win32: replace custom mutex and condition variable with native primitives	Andrey Shedel	1	-121/+15
	The multithreaded TCG implementation exposed deadlocks in the win32 condition variables: as implemented, qemu_cond_broadcast waited on receivers, whereas the pthreads API it was intended to emulate does not. This was causing a deadlock because broadcast was called while holding the IO lock, as well as all possible waiters blocked on the same lock. This patch replaces all the custom synchronisation code for mutexes and condition variables with native Windows primitives (SRWlocks and condition variables) with the same semantics as their POSIX equivalents. To enable that, it requires a Windows Vista or newer host OS. Signed-off-by: Andrey Shedel <ashedel@microsoft.com> [AB: edited commit message] Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Message-Id: <20170324220141.10104-1-Andrew.Baumann@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24	mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.	Jitendra Kolhe	1	-2/+14
	This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN) fails and returns -1. This results in memset_num_threads getting set to -1. Which we then pass to g_new0(). The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call get_memset_num_threads() to handle sysconf() failure gracefully. In case sysconf() fails, we fall back to single threaded. (Spotted by Coverity, CID 1372465.) Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1490079006-32495-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-21	keyval: Document issues with 'any' and alternate types	Markus Armbruster	1	-0/+10
	Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1490014548-15083-5-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21	keyval: Improve some comments	Markus Armbruster	1	-16/+31
	Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1490014548-15083-3-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-19	qemu-ga: obey LISTEN_PID when using systemd socket activation	Paolo Bonzini	2	-0/+78
	qemu-ga's socket activation support was not obeying the LISTEN_PID environment variable, which avoids that a process uses a socket-activation file descriptor meant for its parent. Mess can for example ensue if a process forks a children before consuming the socket-activation file descriptor and therefore setting O_CLOEXEC on it. Luckily, qemu-nbd also got socket activation code, and its copy does support LISTEN_PID. Some extra fixups are needed to ensure that the code can be used for both, but that's what this patch does. The main change is to replace get_listen_fds's "consume" argument with the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code. Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-17	oslib-posix: fix compilation on OpenBSD	Paolo Bonzini	1	-2/+0
	si_band is not found in OpenBSD. It is marked as obsolescent in POSIX, so we can delete it without any remorse. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20170317152214.6148-1-pbonzini@redhat.com Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-17	thread-pool: add missing qemu_bh_cancel in completion function	Peter Lieven	1	-0/+7
	commit 3c80ca15 fixed a deadlock scenarion with nested aio_poll invocations. However, the rescheduling of the completion BH introcuded unnecessary spinning in the main-loop. On very fast file backends this can even lead to the "WARNING: I/O thread spun for 1000 iterations" message popping up. Callgrind reports about 3-4% less instructions with this patch running qemu-img bench on a ramdisk based VMDK file. Fixes: 3c80ca158c96ff902a30883a8933e755988948b1 Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-15	os: don't corrupt pre-existing memory-backend data with prealloc	Daniel P. Berrange	1	-1/+13
	When using a memory-backend object with prealloc turned on, QEMU will memset() the first byte in every memory page to zero. While this might have been acceptable for memory backends associated with RAM, this corrupts application data for NVDIMMs. Instead of setting every page to zero, read the current byte value and then just write that same value back, so we are not corrupting the original data. Directly write the value instead of memset()ing it, since there's no benefit to memset for a single byte write. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Message-id: 20170303113255.28262-1-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-14	icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread	Paolo Bonzini	1	-1/+3
	icount has become much slower after tcg_cpu_exec has stopped using the BQL. There is also a latent bug that is masked by the slowness. The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL timer now has to wake up the I/O thread and wait for it. The rendez-vous is mediated by the BQL QemuMutex: - handle_icount_deadline wakes up the I/O thread with BQL taken - the I/O thread wakes up and waits on the BQL - the VCPU thread releases the BQL a little later - the I/O thread raises an interrupt, which calls qemu_cpu_kick - the VCPU thread notices the interrupt, takes the BQL to process it and waits on it All this back and forth is extremely expensive, causing a 6 to 8-fold slowdown when icount is turned on. One may think that the issue is that the VCPU thread is too dependent on the BQL, but then the latent bug comes in. I first tried removing the BQL completely from the x86 cpu_exec, only to see everything break. The only way to fix it (and make everything slow again) was to add a dummy BQL lock/unlock pair. This is because in -icount mode you really have to process the events before the CPU restarts executing the next instruction. Therefore, this series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in the vCPU thread when running in icount mode. The required changes include: - make the timer notification callback wake up TCG's single vCPU thread when run from another thread. By using async_run_on_cpu, the callback can override all_cpu_threads_idle() when the CPU is halted. - move handle_icount_deadline after qemu_tcg_wait_io_event, so that the timer notification callback is invoked after the dummy work item wakes up the vCPU thread - make handle_icount_deadline run the timers instead of just waking the I/O thread. - stop processing the timers in the main loop Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14	cpus: define QEMUTimerListNotifyCB for QEMU system emulation	Paolo Bonzini	3	-7/+7
	There is no change for now, because the callback just invokes qemu_notify_event. Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14	qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h	Paolo Bonzini	2	-0/+2
	This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h in the next patch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14	qemu-timer: fix off-by-one	Paolo Bonzini	1	-1/+1
	If the first timer is exactly at the current value of the clock, the deadline is met and the timer should fire. This fixes itself on the next iteration of the loop without icount; with icount, however, execution of instructions will stop exactly at the deadline and won't proceed. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14	util: Removed unneeded header from path.c	Suramya Shah	1	-1/+0
	Signed-off-by: Suramya Shah <shah.suramya@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20170310163948.7567-1-shah.suramya@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14	mem-prealloc: reduce large guest start-up and migration time.	Jitendra Kolhe	2	-22/+89
	Using "-mem-prealloc" option for a large guest leads to higher guest start-up and migration time. This is because with "-mem-prealloc" option qemu tries to map every guest page (create address translations), and make sure the pages are available during runtime. virsh/libvirt by default, seems to use "-mem-prealloc" option in case the guest is configured to use huge pages. The patch tries to map all guest pages simultaneously by spawning multiple threads. Currently limiting the change to QEMU library functions on POSIX compliant host only, as we are not sure if the problem exists on win32. Below are some stats with "-mem-prealloc" option for guest configured to use huge pages. ------------------------------------------------------------------------ Idle Guest \| Start-up time \| Migration time ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - single threaded (existing code) ------------------------------------------------------------------------ 64 Core - 4TB \| 54m11.796s \| 75m43.843s 64 Core - 1TB \| 8m56.576s \| 14m29.049s 64 Core - 256GB \| 2m11.245s \| 3m26.598s ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - map guest pages using 8 threads ------------------------------------------------------------------------ 64 Core - 4TB \| 5m1.027s \| 34m10.565s 64 Core - 1TB \| 1m10.366s \| 8m28.188s 64 Core - 256GB \| 0m19.040s \| 2m10.148s ----------------------------------------------------------------------- Guest stats with 2M HugePage usage - map guest pages using 16 threads ----------------------------------------------------------------------- 64 Core - 4TB \| 1m58.970s \| 31m43.400s 64 Core - 1TB \| 0m39.885s \| 7m55.289s 64 Core - 256GB \| 0m11.960s \| 2m0.135s ----------------------------------------------------------------------- Changed in v2: - modify number of memset threads spawned to min(smp_cpus, 16). - removed 64GB memory restriction for spawning memset threads. Changed in v3: - limit number of threads spawned based on min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus) - implement memset thread specific siglongjmp in SIGBUS signal_handler. Changed in v4 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread as main thread no longer touches any pages. - simplify code my returning memset_thread_failed status from touch_all_pages. Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-07	keyval: Support lists	Markus Armbruster	1	-12/+171
	Additionally permit non-negative integers as key components. A dictionary's keys must either be all integers or none. If all keys are integers, convert the dictionary to a list. The set of keys must be [0,N]. Examples: * list.1=goner,list.0=null,list.1=eins,list.2=zwei is equivalent to JSON [ "null", "eins", "zwei" ] * a.b.c=1,a.b.0=2 is inconsistent: a.b.c clashes with a.b.0 * list.0=null,list.2=eins,list.2=zwei has a hole: list.1 is missing Similar design flaw as for objects: there is no way to denote an empty list. While interpreting "key absent" as empty list seems natural (removing a list member from the input string works when there are multiple ones, so why not when there's just one), it doesn't work: "key absent" already means "optional list absent", which isn't the same as "empty list present". Update the keyval object visitor to use this a.0 syntax in error messages rather than the usual a[0]. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-25-git-send-email-armbru@redhat.com> [Off-by-one fix squashed in, as per Kevin's review] Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07	keyval: Restrict key components to valid QAPI names	Markus Armbruster	1	-4/+8
	Until now, key components are separated by '.'. This leaves little room for evolving the syntax, and is incompatible with the __RFQDN_ prefix convention for downstream extensions. Since key components will be commonly used as QAPI member names by the QObject input visitor, we can just as well borrow the QAPI naming rules here: letters, digits, hyphen and period starting with a letter, with an optional __RFQDN_ prefix for downstream extensions. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1488317230-26248-20-git-send-email-armbru@redhat.com>
2017-03-07	keyval: New keyval_parse()	Markus Armbruster	2	-0/+232
	keyval_parse() parses KEY=VALUE,... into a QDict. Works like qemu_opts_parse(), except: * Returns a QDict instead of a QemuOpts (d'oh). * Supports nesting, unlike QemuOpts: a KEY is split into key fragments at '.' (dotted key convention; the block layer does something similar on top of QemuOpts). The key fragments are QDict keys, and the last one's value is updated to VALUE. * Each key fragment may be up to 127 bytes long. qemu_opts_parse() limits the entire key to 127 bytes. * Overlong key fragments are rejected. qemu_opts_parse() silently truncates them. * Empty key fragments are rejected. qemu_opts_parse() happily accepts empty keys. * It does not store the returned value. qemu_opts_parse() stores it in the QemuOptsList. * It does not treat parameter "id" specially. qemu_opts_parse() ignores all but the first "id", and fails when its value isn't id_wellformed(), or duplicate (a QemuOpts with the same ID is already stored). It also screws up when a value contains ",id=". * Implied value is not supported. qemu_opts_parse() desugars "foo" to "foo=on", and "nofoo" to "foo=off". * An implied key's value can't be empty, and can't contain ','. I intend to grow this into a saner replacement for QemuOpts. It'll take time, though. Note: keyval_parse() provides no way to do lists, and its key syntax is incompatible with the __RFQDN_ prefix convention for downstream extensions, because it blindly splits at '.', even in __RFQDN_. Both issues will be addressed later in the series. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
2017-03-04	Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' ↵	Peter Maydell	1	-0/+25
	into staging ppc patch queuye for 2017-03-03 This will probably be my last pull request before the hard freeze. It has some new work, but that has all been posted in draft before the soft freeze, so I think it's reasonable to include in qemu-2.9. This batch has: * A substantial amount of POWER9 work * Implements the legacy (hash) MMU for POWER9 * Some more preliminaries for implementing the POWER9 radix MMU * POWER9 has_work * Basic POWER9 compatibility mode handling * Removal of some premature tests * Some cleanups and fixes to the existing MMU code to make the POWER9 work simpler * A bugfix for TCG multiply adds on power * Allow pseries guests to access PCIe extended config space This also includes a code-motion not strictly in ppc code - moving getrampagesize() from ppc code to exec.c. This will make some future VFIO improvements easier, Paolo said it was ok to merge via my tree. # gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170303: target/ppc: rewrite f[n]m[add,sub] using float64_muladd spapr: Small cleanup of PPC MMU enums spapr_pci: Advertise access to PCIe extended config space target/ppc: Rework hash mmu page fault code and add defines for clarity target/ppc: Move no-execute and guarded page checking into new function target/ppc: Add execute permission checking to access authority check target/ppc: Add Instruction Authority Mask Register Check hw/ppc/spapr: Add POWER9 to pseries cpu models target/ppc/POWER9: Add cpu_has_work function for POWER9 target/ppc/POWER9: Add POWER9 pa-features definition target/ppc/POWER9: Add POWER9 mmu fault handler target/ppc: Don't gen an SDR1 on POWER9 and rework register creation target/ppc: Add patb_entry to sPAPRMachineState target/ppc/POWER9: Add POWERPC_MMU_V3 bit powernv: Don't test POWER9 CPU yet exec, kvm, target-ppc: Move getrampagesize() to common code target/ppc: Add POWER9/ISAv3.00 to compat_table Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03	cpus: remove ugly cast on sigbus_handler	Paolo Bonzini	3	-5/+34
	The cast is there because sigbus_handler is invoked via sigfd_handler. But it feels just wrong to use struct qemu_signalfd_siginfo in the prototype of a function that is passed to sigaction. Instead, do a simple-minded conversion of qemu_signalfd_siginfo to siginfo_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03	exec, kvm, target-ppc: Move getrampagesize() to common code	Alexey Kardashevskiy	1	-0/+25
	getrampagesize() returns the largest supported page size and mainly used to know if huge pages are enabled. However is implemented in target-ppc/kvm.c and not available in TCG or other architectures. This renames and moves gethugepagesize() to mmap-alloc.c where fd-based analog of it is already implemented. This renames and moves getrampagesize() to exec.c as it seems to be the common place for helpers like this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-02	Merge remote-tracking branch 'remotes/elmarco/tags/leak-pull-request' into ↵	Peter Maydell	1	-5/+0
	staging # gpg: Signature made Wed 01 Mar 2017 09:02:53 GMT # gpg: using RSA key 0xDAE8E10975969CE5 # gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" # gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5 * remotes/elmarco/tags/leak-pull-request: (28 commits) tests: fix virtio-blk-test leaks tests: add specialized device_find function tests: fix usb-test leaks tests: allows to run single test in usb-hcd-ehci-test usb: release the created buses bus: do not unref hotplug handler tests: fix virtio-9p-test leaks tests: fix virtio-scsi-test leak tests: fix e1000e leaks tests: fix i440fx-test leaks tests: fix e1000-test leak tests: fix tco-test leaks tests: fix eepro100-test leak pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice tests: fix ipmi-bt-test leak tests: fix ipmi-kcs-test leak tests: fix bios-tables-test leak tests: fix hd-geo-test leaks tests: fix ide-test leaks tests: fix vhost-user-test leaks ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-01	timer: use an inline function for free	Marc-André Lureau	1	-5/+0
	Similarly to allocation, do it from an inline function. This allows tests to only use the headers for allocation/free of timer. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-28	option: Tweak invalid size error message and unbreak iotest 049	Markus Armbruster	1	-1/+1
	Commit 75cdcd1 neglected to update tests/qemu-iotests/049.out, and made the error message for negative size worse. Fix that. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-02-23	option: Fix checking of sizes for overflow and trailing crap	Markus Armbruster	1	-28/+13
	parse_option_size()'s checking for overflow and trailing crap is wrong. Has always been that way. qemu_strtosz() gets it right, so use that. This adds support for size suffixes 'P', 'E', and ignores case for all suffixes, not just 'k'. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-25-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Change qemu_strtosz*() from int64_t to uint64_t	Markus Armbruster	1	-5/+9
	This will permit its use in parse_option_size(). Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:Block layer core) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1487708048-2131-24-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Return qemu_strtosz*() error and value separately	Markus Armbruster	1	-10/+12
	This makes qemu_strtosz(), qemu_strtosz_mebi() and qemu_strtosz_metric() similar to qemu_strtoi64(), except negative values are rejected. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:Block layer core) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1487708048-2131-23-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Let qemu_strtosz*() optionally reject trailing crap	Markus Armbruster	1	-5/+9
	Change the qemu_strtosz() & friends to return -EINVAL when @endptr is null and the conversion doesn't consume the string completely. Matches how qemu_strtol() & friends work. Only test_qemu_strtosz_simple() passes a null @endptr. No functional change there, because its conversion consumes the string. Simplify callers that use @endptr only to fail when it doesn't point to '\0' to pass a null @endptr instead. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:Block layer core) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1487708048-2131-22-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Drop QEMU_STRTOSZ_DEFSUFFIX_* macros	Markus Armbruster	1	-18/+10
	Writing QEMU_STRTOSZ_DEFSUFFIX_* instead of '*' gains nothing. Get rid of these eyesores. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1487708048-2131-18-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: New qemu_strtosz()	Markus Armbruster	1	-4/+11
	Most callers of qemu_strtosz_suffix() pass QEMU_STRTOSZ_DEFSUFFIX_B. Capture the pattern in new qemu_strtosz(). Inline qemu_strtosz_suffix() into its only remaining caller. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-17-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Rename qemu_strtosz() to qemu_strtosz_MiB()	Markus Armbruster	1	-1/+1
	With qemu_strtosz(), no suffix means mebibytes. It's used rarely. I'm going to add a similar function where no suffix means bytes. Rename qemu_strtosz() to qemu_strtosz_MiB() to make the name qemu_strtosz() available for the new function. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-16-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: New qemu_strtosz_metric()	Markus Armbruster	1	-3/+8
	To parse numbers with metric suffixes, we use qemu_strtosz_suffix_unit(nptr, &eptr, QEMU_STRTOSZ_DEFSUFFIX_B, 1000) Capture this in a new function for legibility: qemu_strtosz_metric(nptr, &eptr) Replace test_qemu_strtosz_suffix_unit() by test_qemu_strtosz_metric(). Rename qemu_strtosz_suffix_unit() to do_strtosz() and give it internal linkage. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-15-git-send-email-armbru@redhat.com>
2017-02-23	option: Fix to reject invalid and overflowing numbers	Markus Armbruster	1	-3/+8
	parse_option_number() fails to check for these errors after strtoull(). Has always been broken. Fix that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-10-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Clean up control flow around qemu_strtol() a bit	Markus Armbruster	1	-42/+43
	Reorder check_strtox_error() to make it obvious that we always store through a non-null @endptr. Transform if (some error) { error case ... err = value for error case; } else { normal case ... err = value for normal case; } return err; to if (some error) { error case ... return value for error case; } normal case ... return value for normal case; Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-9-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Clean up variable names around qemu_strtol()	Markus Armbruster	1	-21/+21
	Name same things the same, different things differently. * qemu_strtol()'s parameter @nptr is called @p in check_strtox_error(). Rename the latter. * qemu_strtol()'s parameter @endptr is called @next in check_strtox_error(). Rename the latter. * qemu_strtol()'s variable @p is called @endptr in check_strtox_error(). Rename both to @ep. * qemu_strtol()'s variable @err is negative errno, check_strtox_error()'s parameter @err is positive. Rename the latter to @libc_errno. Same for qemu_strtoul(), qemu_strtoi64(), qemu_strtou64(), of course. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-8-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Rename qemu_strtoll(), qemu_strtoull()	Markus Armbruster	2	-4/+4
	The name qemu_strtoll() suggests conversion to long long, but it actually converts to int64_t. Rename to qemu_strtoi64(). The name qemu_strtoull() suggests conversion to unsigned long long, but it actually converts to uint64_t. Rename to qemu_strtou64(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1487708048-2131-7-git-send-email-armbru@redhat.com>
2017-02-23	util/cutils: Rewrite documentation of qemu_strtol() & friends	Markus Armbruster	1	-39/+49
	Fixes the following documentation bugs: * Fails to document that null @nptr is safe. * Fails to document that we return -EINVAL when no conversion could be performed (commit 47d4be1). * Confuses long long with int64_t, and unsigned long long with uint64_t. * Claims the unsigned conversions can underflow. They can't. While there, mark problematic assumptions that int64_t is long long, and uint64_t is unsigned long long with FIXME comments. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1487708048-2131-6-git-send-email-armbru@redhat.com>
2017-02-23	option: Assert value string isn't null	Markus Armbruster	1	-50/+39
	Plenty of code relies on QemuOpt member @str not being null, including qemu_opts_print(), qemu_opts_to_qdict(), and callbacks passed to qemu_opt_foreach(). Begs the question whether it can be null. Only opt_set() creates QemuOpt. It sets member @str to its argument @value. Passing null for @value would plant a time bomb. Callers: * opts_do_parse() can't pass null. * qemu_opt_set() passes its argument @value. Callers: - qemu_opts_from_qdict_1() can't pass null - qemu_opts_set() passes its argument @value, but none of its callers pass null. - Many more outside qemu-option.c, but they shouldn't pass null, either. Assert member @str isn't null, so that misuse is caught right away. Simplify parse_option_bool(), parse_option_number() and parse_option_size() accordingly. Best viewed with whitespace changes ignored. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-3-git-send-email-armbru@redhat.com>
2017-02-21	coroutine-lock: make CoRwlock thread-safe and fair	Paolo Bonzini	1	-11/+24
	This adds a CoMutex around the existing CoQueue. Because the write-side can just take CoMutex, the old "writer" field is not necessary anymore. Instead of removing it altogether, count the number of pending writers during a read-side critical section and forbid further readers from entering. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-7-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21	coroutine-lock: add mutex argument to CoQueue APIs	Paolo Bonzini	1	-3/+21
	All that CoQueue needs in order to become thread-safe is help from an external mutex. Add this to the API. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-6-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21	coroutine-lock: add limited spinning to CoMutex	Paolo Bonzini	2	-7/+46
	Running a very small critical section on pthread_mutex_t and CoMutex shows that pthread_mutex_t is much faster because it doesn't actually go to sleep. What happens is that the critical section is shorter than the latency of entering the kernel and thus FUTEX_WAIT always fails. With CoMutex there is no such latency but you still want to avoid wait and wakeup. So introduce it artificially. This only works with one waiters; because CoMutex is fair, it will always have more waits and wakeups than a pthread_mutex_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-3-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21	coroutine-lock: make CoMutex thread-safe	Paolo Bonzini	2	-10/+144
	This uses the lock-free mutex described in the paper '"Blocking without Locking", or LFTHREADS: A lock-free thread library' by Gidenstam and Papatriantafilou. The same technique is used in OSv, and in fact the code is essentially a conversion to C of OSv's code. [Added missing coroutine_fn in tests/test-aio-multithread.c. --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-2-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>