peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2018-05-09	translator: merge max_insns into DisasContextBase	Emilio G. Cota	1	-4/+4
	While at it, use int for both num_insns and max_insns to make sure we have same-type comparisons. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Michael Clark <mjc@sifive.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-03	linux-user: remove useless padding in flock64 structure	Laurent Vivier	1	-1/+1
	Since commit 8efb2ed5ec ("linux-user: Correct signedness of target_flock l_start and l_len fields"), flock64 structure uses abi_llong for l_start and l_len in place of "unsigned long long" this should force them to be aligned accordingly to the target rules. So we can remove the padding field and the QEMU_PACKED attribute. I have compared the result of the following program before and after the change: cat -> flock64_dump <<EOF p/d sizeof(struct target_flock64) p/d &((struct target_flock64 )0)->l_type p/d &((struct target_flock64 )0)->l_whence p/d &((struct target_flock64 )0)->l_start p/d &((struct target_flock64 )0)->l_len p/d &((struct target_flock64 )0)->l_pid quit EOF for file in build/all/-linux-user/qemu-* ; do echo $file gdb -batch -nx -x flock64_dump $file 2> /dev/null done The sizeof() changes because we remove the QEMU_PACKED. The new size is 32 (except for i386 and m68k) and this is the real size of "struct flock64" on the target architecture. The following architectures differ: aarch64_be, aarch64, alpha, armeb, arm, cris, hppa, nios2, or1k, riscv32, riscv64, s390x. For a subset of these architectures, I have checked with the following program the new structure is the correct one: #include <stdio.h> #define __USE_LARGEFILE64 #include <fcntl.h> int main(void) { printf("struct flock64 %d\n", sizeof(struct flock64)); printf("l_type %d\n", &((struct flock64 )0)->l_type); printf("l_whence %d\n", &((struct flock64 )0)->l_whence); printf("l_start %d\n", &((struct flock64 )0)->l_start); printf("l_len %d\n", &((struct flock64 )0)->l_len); printf("l_pid %d\n", &((struct flock64 *)0)->l_pid); } [I have checked aarch64, alpha, hppa, s390x] For ARM, the target_flock64 becomes the EABI definition, so we need to define the OABI one in place of the EABI one and use it when it is needed. I have also fixed the alignment value for sh4 (to align llong on 4 bytes) (see c2e3dee6e0 "linux-user: Define target alignment size") [We should check alignment properties for cris, nios2 and or1k] Signed-off-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20180502215730.28162-1-laurent@vivier.eu>
2018-04-11	icount: fix cpu_restore_state_from_tb for non-tb-exit cases	Pavel Dovgalyuk	1	-1/+4
	In icount mode, instructions that access io memory spaces in the middle of the translation block invoke TB recompilation. After recompilation, such instructions become last in the TB and are allowed to access io memory spaces. When the code includes instruction like i386 'xchg eax, 0xffffd080' which accesses APIC, QEMU goes into an infinite loop of the recompilation. This instruction includes two memory accesses - one read and one write. After the first access, APIC calls cpu_report_tpr_access, which restores the CPU state to get the current eip. But cpu_restore_state_from_tb resets the cpu->can_do_io flag which makes the second memory access invalid. Therefore the second memory access causes a recompilation of the block. Then these operations repeat again and again. This patch moves resetting cpu->can_do_io flag from cpu_restore_state_from_tb to cpu_loop_exit* functions. It also adds a parameter for cpu_restore_state which controls restoring icount. There is no need to restore icount when we only query CPU state without breaking the TB. Restoring it in such cases leads to the incorrect flow of the virtual time. In most cases new parameter is true (icount should be recalculated). But there are two cases in i386 and openrisc when the CPU state is only queried without the need to break the TB. This patch fixes both of these cases. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20180409091320.12504.35329.stgit@pasha-VirtualBox> [rth: Make can_do_io setting unconditional; move from cpu_exec; make cpu_loop_exit_{noexc,restore} call cpu_loop_exit.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-27	gdbstub: send a termination packet instead of crashing gdb	KONRAD Frederic	1	-0/+2
	Since the commit: commit 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 Author: Stefan Hajnoczi <stefanha@redhat.com> Date: Wed Mar 7 14:42:05 2018 +0000 vl: introduce vm_shutdown() GDB crashes when qemu exits (at least on sparc-softmmu): Remote communication error. Target disconnected.: Connection reset by peer. Quitting: putpkt: write failed: Broken pipe. So send a packet to exit GDB before we exit QEMU: [Inferior 1 (Thread 0) exited normally] Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com> Message-id: 1521538773-30802-1-git-send-email-frederic.konrad@adacore.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20	Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging	Peter Maydell	1	-0/+4
	virtio,vhost,pci,pc: features, cleanups SRAT tables for DIMM devices new virtio net flags for speed/duplex post-copy migration support in vhost cleanups in pci Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (51 commits) postcopy shared docs libvhost-user: Claim support for postcopy postcopy: Allow shared memory vhost: Huge page align and merge vhost+postcopy: Wire up POSTCOPY_END notify vhost-user: Add VHOST_USER_POSTCOPY_END message libvhost-user: mprotect & madvises for postcopy vhost+postcopy: Call wakeups vhost+postcopy: Add vhost waker postcopy: postcopy_notify_shared_wake postcopy: helper for waking shared vhost+postcopy: Resolve client address postcopy-ram: add a stub for postcopy_request_shared_page vhost+postcopy: Helper to send requests to source for shared pages vhost+postcopy: Stash RAMBlock and offset vhost+postcopy: Send address back to qemu libvhost-user+postcopy: Register new regions with the ufd migration/ram: ramblock_recv_bitmap_test_byte_offset postcopy+vhost-user: Split set_mem_table for postcopy vhost+postcopy: Transmit 'listen' to slave ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # scripts/update-linux-headers.sh
2018-03-20	postcopy: use UFFDIO_ZEROPAGE only when available	Dr. David Alan Gilbert	1	-0/+3
	Use a flag on the RAMBlock to state whether it has the UFFDIO_ZEROPAGE capability, use it when it's available. This allows the use of postcopy on tmpfs as well as hugepage backed files. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20	qemu_ram_block_host_offset	Dr. David Alan Gilbert	1	-0/+1
	Utility to give the offset of a host pointer within a RAMBlock (assuming we already know it's in that RAMBlock) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-09	linux-user: fix mmap/munmap/mprotect/mremap/shmat	Max Filippov	2	-10/+12
	In linux-user QEMU that runs for a target with TARGET_ABI_BITS bigger than L1_MAP_ADDR_SPACE_BITS an assertion in page_set_flags fires when mmap, munmap, mprotect, mremap or shmat is called for an address outside the guest address space. mmap and mprotect should return ENOMEM in such case. Change definition of GUEST_ADDR_MAX to always be the last valid guest address. Account for this change in open_self_maps. Add macro guest_addr_valid that verifies if the guest address is valid. Add function guest_range_valid that verifies if address range is within guest address space and does not wrap around. Use that macro in mmap/munmap/mprotect/mremap/shmat for error checking. Cc: qemu-stable@nongnu.org Cc: Riku Voipio <riku.voipio@iki.fi> Cc: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20180307215010.30706-1-jcmvbkbc@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-06	address_space_read: address_space_to_flatview needs RCU lock	Paolo Bonzini	1	-15/+10
	address_space_read is calling address_space_to_flatview but it can be called outside the RCU lock. To fix it, push the rcu_read_lock/unlock pair up from flatview_read_full to address_space_read's constant size fast path and address_space_read_full. Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-06	memory: inline some performance-sensitive accessors	Paolo Bonzini	2	-5/+30
	These accessors are called from inlined functions, and the call sequence is much more expensive than just inlining the access. Move the struct declaration to memory-internal.h so that exec.c and memory.c can both use an inline function. Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-01	include/exec/helper-head.h: support f16 in helper calls	Alex Bennée	1	-0/+3
	This allows us to explicitly pass float16 to helpers rather than assuming uint32_t and dealing with the result. Of course they will be passed in i32 sized registers by default. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180227143852.11175-2-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-19	mem: add share parameter to memory-backend-ram	Marcel Apfelbaum	2	-1/+25
	Currently only file backed memory backend can be created with a "share" flag in order to allow sharing guest RAM with other processes in the host. Add the "share" flag also to RAM Memory Backend in order to allow remapping parts of the guest RAM to different host virtual addresses. This is needed by the RDMA devices in order to remap non-contiguous QEMU virtual addresses to a contiguous virtual address range. Moved the "share" flag to the Host Memory base class, modified phys_mem_alloc to include the new parameter and a new interface memory_region_init_ram_shared_nomigrate. There are no functional changes if the new flag is not used. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-02-13	memory: hide memory_region_sync_dirty_bitmap behind DirtyBitmapSnapshot	Paolo Bonzini	1	-11/+0
	Simplify the users of memory_region_snapshot_and_clear_dirty, so that they do not have to call memory_region_sync_dirty_bitmap explicitly. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-13	memory: remove memory_region_test_and_clear_dirty	Paolo Bonzini	1	-21/+3
	It is unused after g364fb has been converted to use DirtyBitmapSnapshot. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-09	Clean up includes	Markus Armbruster	1	-2/+0
	Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes, with the change to target/s390x/gen-features.c manually reverted, and blank lines around deletions collapsed. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-3-armbru@redhat.com>
2018-02-07	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell	2	-15/+19
	* socket option parsing fix (Daniel) * SCSI fixes (Fam) * Readline double-free fix (Greg) * More HVF attribution fixes (Izik) * WHPX (Windows Hypervisor Platform Extensions) support (Justin) * POLLHUP handler (Klim) * ivshmem fixes (Ladi) * memfd memory backend (Marc-André) * improved error message (Marcelo) * Memory fixes (Peter Xu, Zhecheng) * Remove obsolete code and comments (Peter M.) * qdev API improvements (Philippe) * Add CONFIG_I2C switch (Thomas) # gpg: Signature made Wed 07 Feb 2018 15:24:08 GMT # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (47 commits) Add the WHPX acceleration enlightenments Introduce the WHPX impl Add the WHPX vcpu API Add the Windows Hypervisor Platform accelerator. tests/test-filter-redirector: move close() tests: use memfd in vhost-user-test vhost-user-test: make read-guest-mem setup its own qemu tests: keep compiling failing vhost-user tests Add memfd based hostmem memfd: add hugetlbsize argument memfd: add hugetlb support memfd: add error argument, instead of perror() cpus: join thread when removing a vCPU cpus: hvf: unregister thread with RCU cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug cpus: dummy: unregister thread with RCU, exit loop on unplug cpus: kvm: unregister thread with RCU cpus: hax: register/unregister thread with RCU, exit loop on unplug ivshmem: Disable irqfd on device reset ivshmem: Improve MSI irqfd error handling ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # cpus.c
2018-02-06	memory/iommu: Add get_attr()	Alexey Kardashevskiy	1	-0/+22
	This adds get_attr() to IOMMUMemoryRegionClass, like iommu_ops::domain_get_attr in the Linux kernel. This defines the first attribute - IOMMU_ATTR_SPAPR_TCE_FD - which will be used between the pSeries machine and VFIO-PCI. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-02-05	memory-internal.h: Remove obsolete claim that header is obsolete	Peter Maydell	1	-3/+4
	The memory-internal.h header claims that it is for "obsolete exec.c functions" which "will be removed soon". This statement was added in 2011, six years ago, but the header is still here. (Admittedly none of the prototypes added in commit 67d95c153bef55f6 are still in the header.) It's convenient to have a place to put prototypes for functions which are used internally to the various .c files of the memory system or by the accel/tcg code, which is inevitably fairly closely coupled. So keep the header but update the comments to reflect what we're actually using it for. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1511276888-17834-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-05	memory: update comments and fix some typos	Jay Zhou	1	-12/+15
	Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> Message-Id: <1515043788-38300-1-git-send-email-jianjay.zhou@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-25	accel/tcg: add size paremeter in tlb_fill()	Laurent Vivier	1	-3/+3
	The MC68040 MMU provides the size of the access that triggers the page fault. This size is set in the Special Status Word which is written in the stack frame of the access fault exception. So we need the size in m68k_cpu_unassigned_access() and m68k_cpu_handle_mmu_fault(). To be able to do that, this patch modifies the prototype of handle_mmu_fault handler, tlb_fill() and probe_write(). do_unassigned_access() already includes a size parameter. This patch also updates handle_mmu_fault handlers and tlb_fill() of all targets (only parameter, no code change). Signed-off-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20180118193846.24953-2-laurent@vivier.eu>
2018-01-19	hostmem-file: add "align" option	Haozhong Zhang	1	-0/+3
	When mmap(2) the backend files, QEMU uses the host page size (getpagesize(2)) by default as the alignment of mapping address. However, some backends may require alignments different than the page size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned, fails with a kernel message like [617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff) Because there is no common approach to get such alignment requirement, we add the 'align' option to 'memory-backend-file', so that users or management utils, which have enough knowledge about the backend, can specify a proper alignment via this option. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [ehabkost: fixed typo, fixed error_setg() format string] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-16	cpu_physical_memory_sync_dirty_bitmap: Another alignment fix	Dr. David Alan Gilbert	1	-2/+3
	This code has an optimised, word aligned version, and a boring unaligned version. My commit f70d345 fixed one alignment issue, but there's another. The optimised version operates on 'longs' dealing with (typically) 64 pages at a time, replacing the whole long by a 0 and counting the bits. If the Ramblock is less than 64bits in length that long can contain bits representing two different RAMBlocks, but the code will update the bmap belinging to the 1st RAMBlock only while having updated the total dirty page count for both. This probably didn't matter prior to 6b6712ef which split the dirty bitmap by RAMBlock, but now they're separate RAMBlocks we end up with a count that doesn't match the state in the bitmaps. Symptom: Migration showing a few dirty pages left to be sent constantly Seen on aarch64 and x86 with x86+ovmf Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reported-by: Wei Huang <wei@redhat.com> Fixes: 6b6712efccd383b48a909bee0b29e079a57601ec Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-29	tcg: Allow 6 arguments to TCG helpers	Richard Henderson	4	-0/+25
	We already handle this in the backends, and the lifetime datum for the TCGOp is already large enough. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-12-29	tcg: Dynamically allocate TCGOps	Richard Henderson	1	-6/+3
	With no fixed array allocation, we can't overflow a buffer. This will be important as optimizations related to host vectors may expand the number of ops used. Use QTAILQ to link the ops together. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-12-21	cpu: refactor cpu_address_space_init()	Peter Xu	1	-2/+4
	Normally we create an address space for that CPU and pass that address space into the function. Let's just do it inside to unify address space creations. It'll simplify my next patch to rename those address spaces. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20171123092333.16085-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-18	memory: remove unused memory_region_set_global_locking()	Marc-André Lureau	1	-12/+0
	This was never used since its introduction in commit 196ea13104f8 ("memory: Add global-locking property to memory regions"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-11-21	exec.c: Factor out before/after actions for notdirty memory writes	Peter Maydell	1	-0/+62
	The function notdirty_mem_write() has a sequence of actions it has to do before and after the actual business of writing data to host RAM to ensure that dirty flags are correctly updated and we flush any TCG translations for the region. We need to do this also in other places that write directly to host RAM, most notably the TCG atomic helper functions. Pull out the before and after pieces into their own functions. We use an API where the prepare function stashes the various bits of information about the write into a struct for the complete function to use, because in the calls for the atomic helpers the place where the complete function will be called doesn't have the information to hand. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1511201308-23580-2-git-send-email-peter.maydell@linaro.org
2017-11-15	tcg: Record code_gen_buffer address for user-only memory helpers	Richard Henderson	2	-2/+14
	When we handle a signal from a fault within a user-only memory helper, we cannot cpu_restore_state with the PC found within the signal frame. Use a TLS variable, helper_retaddr, to record the unwind start point to find the faulting guest insn. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-11-13	accel/tcg/translate-all: expand cpu_restore_state addr check	Alex Bennée	1	-0/+11
	We are still seeing signals during translation time when we walk over a page protection boundary. This expands the check to ensure the host PC is inside the code generation buffer. The original suggestion was to check versus tcg_ctx.code_gen_ptr but as we now segment the translation buffer we have to settle for just a general check for being inside. I've also fixed up the declaration to make it clear it can deal with invalid addresses. A later patch will fix up the call sites. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20171108153245.20740-2-alex.bennee@linaro.org Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-10-27	Merge remote-tracking branch 'remotes/rth/tags/pull-dis-20171026' into staging	Peter Maydell	1	-2/+2
	Capstone disassembler # gpg: Signature made Thu 26 Oct 2017 10:57:27 BST # gpg: using RSA key 0x64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-dis-20171026: disas: Add capstone as submodule disas: Remove monitor_disas_is_physical ppc: Support Capstone in disas_set_info arm: Support Capstone in disas_set_info i386: Support Capstone in disas_set_info disas: Support the Capstone disassembler library disas: Remove unused flags arguments target/arm: Don't set INSN_ARM_BE32 for CONFIG_USER_ONLY target/arm: Move BE32 disassembler fixup target/ppc: Convert to disas_set_info hook target/i386: Convert to disas_set_info hook Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # target/i386/cpu.c # target/ppc/translate_init.c
2017-10-25	Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20171025' into staging	Peter Maydell	8	-56/+76
	TCG patch queue # gpg: Signature made Wed 25 Oct 2017 10:30:18 BST # gpg: using RSA key 0x64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20171025: (51 commits) translate-all: exit from tb_phys_invalidate if qht_remove fails tcg: Initialize cpu_env generically tcg: enable multiple TCG contexts in softmmu tcg: introduce regions to split code_gen_buffer translate-all: use qemu_protect_rwx/none helpers osdep: introduce qemu_mprotect_rwx/none tcg: allocate optimizer temps with tcg_malloc tcg: distribute profiling counters across TCGContext's tcg: introduce **tcg_ctxs to keep track of all TCGContext's gen-icount: fold exitreq_label into TCGContext tcg: define tcg_init_ctx and make tcg_ctx a pointer tcg: take tb_ctx out of TCGContext translate-all: report correct avg host TB size exec-all: rename tb_free to tb_remove translate-all: use a binary search tree to track TBs in TBContext tcg: Remove CF_IGNORE_ICOUNT tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK cpu-exec: lookup/generate TB outside exclusive region during step_atomic tcg: check CF_PARALLEL instead of parallel_cpus target/sparc: check CF_PARALLEL instead of parallel_cpus ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-10-25	disas: Remove unused flags arguments	Richard Henderson	1	-2/+2
	Now that every target is using the disas_set_info hook, the flags argument is unused. Remove it. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Initialize cpu_env generically	Richard Henderson	1	-6/+4
	This is identical for each target. So, move the initialization to common code. Move the variable itself out of tcg_ctx and name it cpu_env to minimize changes within targets. This also means we can remove tcg_global_reg_new_{ptr,i32,i64}, since there are no longer global-register temps created by targets. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	gen-icount: fold exitreq_label into TCGContext	Emilio G. Cota	1	-4/+3
	Groundwork for supporting multiple TCG contexts. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: define tcg_init_ctx and make tcg_ctx a pointer	Emilio G. Cota	1	-5/+5
	Groundwork for supporting multiple TCG contexts. The core of this patch is this change to tcg/tcg.h: > -extern TCGContext tcg_ctx; > +extern TCGContext tcg_init_ctx; > +extern TCGContext tcg_ctx; Note that for now we set tcg_ctx to whatever TCGContext is passed to tcg_context_init -- in this case &tcg_init_ctx. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: take tb_ctx out of TCGContext	Emilio G. Cota	1	-0/+2
	Groundwork for supporting multiple TCG contexts. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	exec-all: rename tb_free to tb_remove	Emilio G. Cota	1	-1/+1
	We don't really free anything in this function anymore; we just remove the TB from the binary search tree. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	translate-all: use a binary search tree to track TBs in TBContext	Emilio G. Cota	2	-4/+6
	This is a prerequisite for supporting multiple TCG contexts, since we will have threads generating code in separate regions of code_gen_buffer. For this we need a new field (.size) in struct tb_tc to keep track of the size of the translated code. This field uses a size_t to avoid adding a hole to the struct, although really an unsigned int would have been enough. The comparison function we use is optimized for the common case: insertions. Profiling shows that upon booting debian-arm, 98% of comparisons are between existing tb's (i.e. a->size and b->size are both !0), which happens during insertions (and removals, but those are rare). The remaining cases are lookups. From reading the glib sources we see that the first key is always the lookup key. However, the code does not assume this to always be the case because this behaviour is not guaranteed in the glib docs. However, we embed this knowledge in the code as a branch hint for the compiler. Note that tb_free does not free space in the code_gen_buffer anymore, since we cannot easily know whether the tb is the last one inserted in code_gen_buffer. The next patch in this series renames tb_free to tb_remove to reflect this. Performance-wise, lookups in tb_find_pc are the same as before: O(log n). However, insertions are O(log n) instead of O(1), which results in a small slowdown when booting debian-arm: Performance counter stats for 'build/arm-softmmu/qemu-system-arm \ -machine type=virt -nographic -smp 1 -m 4096 \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-device,netdev=unet \ -drive file=img/arm/jessie-arm32.qcow2,id=myblock,index=0,if=none \ -device virtio-blk-device,drive=myblock \ -kernel img/arm/aarch32-current-linux-kernel-only.img \ -append console=ttyAMA0 root=/dev/vda1 \ -name arm,debug-threads=on -smp 1' (10 runs): - Before: 8048.598422 task-clock (msec) # 0.931 CPUs utilized ( +- 0.28% ) 16,974 context-switches # 0.002 M/sec ( +- 0.12% ) 0 cpu-migrations # 0.000 K/sec 10,125 page-faults # 0.001 M/sec ( +- 1.23% ) 35,144,901,879 cycles # 4.367 GHz ( +- 0.14% ) <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 65,758,252,643 instructions # 1.87 insns per cycle ( +- 0.33% ) 10,871,298,668 branches # 1350.707 M/sec ( +- 0.41% ) 192,322,212 branch-misses # 1.77% of all branches ( +- 0.32% ) 8.640869419 seconds time elapsed ( +- 0.57% ) - After: 8146.242027 task-clock (msec) # 0.923 CPUs utilized ( +- 1.23% ) 17,016 context-switches # 0.002 M/sec ( +- 0.40% ) 0 cpu-migrations # 0.000 K/sec 18,769 page-faults # 0.002 M/sec ( +- 0.45% ) 35,660,956,120 cycles # 4.378 GHz ( +- 1.22% ) <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 65,095,366,607 instructions # 1.83 insns per cycle ( +- 1.73% ) 10,803,480,261 branches # 1326.192 M/sec ( +- 1.95% ) 195,601,289 branch-misses # 1.81% of all branches ( +- 0.39% ) 8.828660235 seconds time elapsed ( +- 0.38% ) Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Remove CF_IGNORE_ICOUNT	Richard Henderson	1	-8/+9
	Now that we have curr_cflags, we can include CF_USE_ICOUNT early and then remove it as necessary. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK	Richard Henderson	1	-1/+2
	These flags are used by target/*/translate.c, and affect code generation. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: convert tb->cflags reads to tb_cflags(tb)	Emilio G. Cota	1	-4/+4
	Convert all existing readers of tb->cflags to tb_cflags, so that we use atomic_read and therefore avoid undefined behaviour in C11. Note that the remaining setters/getters of the field are protected by tb_lock, and therefore do not need conversion. Luckily all readers access the field via 'tb->cflags' (so no foo.cflags, bar->cflags in the code base), which makes the conversion easily scriptable: FILES=$(git grep 'tb->cflags' target include/exec/gen-icount.h \ accel/tcg/translator.c \| cut -f1 -d':' \| sort \| uniq) perl -pi -e 's/([^.>])tb->cflags/$1tb_cflags(tb)/g' $FILES perl -pi -e 's/([a-z->.]*)(->\|\.)tb->cflags/tb_cflags($1$2tb)/g' $FILES Then manually fixed the few errors that checkpatch reported. Compile-tested for all targets. Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Include CF_COUNT_MASK in CF_HASH_MASK	Richard Henderson	1	-1/+1
	Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK	Emilio G. Cota	4	-9/+30
	This will enable us to decouple code translation from the value of parallel_cpus at any given time. It will also help us minimize TB flushes when generating code via EXCP_ATOMIC. Note that the declaration of parallel_cpus is brought to exec-all.h to be able to define there the "curr_cflags" inline. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Remove GET_TCGV_* and MAKE_TCGV_*	Richard Henderson	1	-4/+0
	The GET and MAKE functions weren't really specific enough. We now have a full complement of functions that convert exactly between temporaries, arguments, tcgv pointers, and indices. The target/sparc change is also a bug fix, which would have affected a host that defines TCG_TARGET_HAS_extr[lh]_i64_i32, i.e. MIPS64. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Introduce tcgv_{i32,i64,ptr}_{arg,temp}	Richard Henderson	2	-11/+11
	Transform TCGv_* to an "argument" or a temporary. For now, an argument is simply the temporary index. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-24	tcg: Push tcg_ctx into tcg_gen_callN	Richard Henderson	1	-6/+6
	Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-23	migration: add bitmap for received page	Alexey Perevalov	1	-0/+10
	This patch adds ability to track down already received pages, it's necessary for calculation vCPU block time in postcopy migration feature, and for recovery after postcopy migration failure. Also it's necessary to solve shared memory issue in postcopy livemigration. Information about received pages will be transferred to the software virtual bridge (e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for already received pages. fallocate syscall is required for remmaped shared memory, due to remmaping itself blocks ioctl(UFFDIO_COPY, ioctl in this case will end with EEXIT error (struct page is exists after remmap). Bitmap is placed into RAMBlock as another postcopy/precopy related bitmaps. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-10-20	accel/tcg: allow to invalidate a write TLB entry immediately	David Hildenbrand	1	-0/+3
	Background: s390x implements Low-Address Protection (LAP). If LAP is enabled, writing to effective addresses (before any translation) 0-511 and 4096-4607 triggers a protection exception. So we have subpage protection on the first two pages of every address space (where the lowcore - the CPU private data resides). By immediately invalidating the write entry but allowing the caller to continue, we force every write access onto these first two pages into the slow path. we will get a tlb fault with the specific accessed addresses and can then evaluate if protection applies or not. We have to make sure to ignore the invalid bit if tlb_fill() succeeds. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20171016202358.3633-2-david@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-10-10	util: move qemu_real_host_page_size/mask to osdep.h	Emilio G. Cota	1	-2/+0
	These only depend on the host and therefore belong in the common osdep, not in a target-dependent object. While at it, query the host during an init constructor, which guarantees the page size will be well-defined throughout the execution of the program. Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-10	exec-all: extract tb->tc_* into a separate struct tc_tb	Emilio G. Cota	1	-2/+10
	In preparation for adding tc.size to be able to keep track of TB's using the binary search tree implementation from glib. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>