peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2015-12-02	exec: Stop using memory after free	Don Slutz	1	-1/+3
	memory_region_unref(mr) can free memory. For example I got: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f43280d4700 (LWP 4462)] 0x00007f43323283c0 in phys_section_destroy (mr=0x7f43259468b0) at /home/don/xen/tools/qemu-xen-dir/exec.c:1023 1023 if (mr->subpage) { (gdb) bt at /home/don/xen/tools/qemu-xen-dir/exec.c:1023 at /home/don/xen/tools/qemu-xen-dir/exec.c:1034 at /home/don/xen/tools/qemu-xen-dir/exec.c:2205 (gdb) p mr $1 = (MemoryRegion *) 0x7f43259468b0 And this change prevents this. Signed-off-by: Don Slutz <Don.Slutz@Gmail.com> Message-Id: <1448921464-21845-1-git-send-email-Don.Slutz@Gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-26	exec: remove warning about mempath and hugetlbfs	Daniel P. Berrange	1	-3/+0
	The gethugepagesize() method in exec.c printed a warning if the file path for "-mem-path" or "-object memory-backend-file" was not on a hugetlbfs filesystem. This warning is bogus, because QEMU functions perfectly well with the path on a regular tmpfs filesystem. Use of hugetlbfs vs tmpfs is a choice for the management application or end user to make as best fits their needs. As such it is inappropriate for QEMU to have an opinion on whether the user's choice is right or wrong in this case. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1448448749-1332-3-git-send-email-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-26	Revert "exec: silence hugetlbfs warning under qtest"	Daniel P. Berrange	1	-4/+1
	This reverts commit 1c7ba94a184df1eddd589d5400d879568d3e5d08. That commit changed QEMU initialization order from - object-initial, chardev, qtest, object-late to - chardev, qtest, object-initial, object-late This breaks chardev setups which need to rely on objects having been created. For example, when chardevs use TLS encryption in the future, they need to have tls credential objects created first. This revert, restores the ordering introduced in commit f08f9271bfe3f19a5eb3d7a2f48532065304d5c8 Author: Daniel P. Berrange <berrange@redhat.com> Date: Wed May 13 17:14:04 2015 +0100 vl: Create (most) objects before creating chardev backends Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1448448749-1332-2-git-send-email-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-19	exec: silence hugetlbfs warning under qtest	Marc-André Lureau	1	-1/+4
	vhost-user-test prints a warning. A test should not need to run on hugetlbfs, let's silence the warning under qtest. The condition can't check on qtest_enabled() since vhost-user-test actually doesn't use qtest accel. However, qtest_driver() can be used, if qtest_init() is called early enough. For that reason, move chardev and qtest initialization early. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-10	Round up RAMBlock sizes to host page sizes	Dr. David Alan Gilbert	1	-4/+4
	RAMBlocks that are not a multiple of host pages in length cause problems for postcopy (I've seen an ACPI table on aarch64 be 5k in length - i.e. 5x target-page), so round RAMBlock sizes up to a host-page. This potentially breaks migration compatibility due to changes in RAMBlock sizes; however: 1) x86 and s390 I think always have host=target page size 2) When I've tried on Power the block sizes already seem aligned. 3) I don't think there's anything else that maintains per-version machine-types for compatibility. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10	qemu_ram_block_by_name	Dr. David Alan Gilbert	1	-0/+20
	Add a function to find a RAMBlock by name; use it in two of the places that already open code that loop; we've got another use later in postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10	qemu_ram_block_from_host	Dr. David Alan Gilbert	1	-9/+45
	Postcopy sends RAMBlock names and offsets over the wire (since it can't rely on the order of ramaddr being the same), and it starts out with HVA fault addresses from the kernel. qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset in the RAMBlock and the global ram_addr_t value. Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host. Provide qemu_ram_get_idstr since its the actual name text sent on the wire. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10	Provide runtime Target page information	Dr. David Alan Gilbert	1	-0/+10
	The migration code generally is built target-independent, however there are a few places where knowing the target page size would avoid artificially moving stuff into migration/ram.c. Provide 'qemu_target_page_bits()' that returns TARGET_PAGE_BITS to other bits of code so that they can stay target-independent. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-06	exec: avoid unnecessary cacheline bounce on ram_list.mru_block	Paolo Bonzini	1	-1/+1
	Whenever the MRU cache hits for the list of RAM blocks, qemu_get_ram_block does an unnecessary write that causes a processor cache line to bounce from one core to another. This causes a performance hit. Reported-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-11-06	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-replay' into ↵	Peter Maydell	1	-0/+2
	staging So here it is, let's see what happens. # gpg: Signature made Fri 06 Nov 2015 09:30:34 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream-replay: replay: recording of the user input replay: command line options replay: replay blockers for devices replay: initialization and deinitialization replay: ptimer bottom halves: introduce bh call function replay: checkpoints icount: improve counting for record/replay replay: shutdown event replay: recording and replaying clock ticks replay: asynchronous events infrastructure replay: interrupts and exceptions cpu: replay instructions sequence cpu-exec: allow temporary disabling icount replay: introduce icount event replay: introduce mutex to protect the replay log replay: internal functions for replay log replay: global variables and function stubs Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-06	replay: initialization and deinitialization	Pavel Dovgalyuk	1	-0/+2
	This patch introduces the functions for enabling the record/replay and for freeing the resources when simulator closes. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20150917162507.8676.90232.stgit@PASHA-ISP.def.inno> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2015-11-04	backends/hostmem-file: Allow to specify full pathname for backing file	Pavel Fedin	1	-13/+21
	This allows to explicitly specify file name to use with the backend. This is important when using it together with ivshmem in order to make it backed by hugetlbfs. By default filename is autogenerated using mkstemp(), and the file is unlink()ed after creation, effectively making it anonymous. This is not very useful with ivshmem because it ends up in a memory which cannot be accessed by something else. Distinction between directory and file name is done by stat() check. If an existing directory is given, the code keeps old behavior. Otherwise it creates or opens a file with the given pathname. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Tested-by: Igor Skalkin <i.skalkin@samsung.com> Message-Id: <004301d11166$9672fe30$c358fa90$@samsung.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-04	memory: call begin, log_start and commit when registering a new listener	Paolo Bonzini	1	-1/+1
	This ensures that cpu_reload_memory_map() is called as soon as tcg_cpu_address_space_init() is called, and before cpu->memory_dispatch is used. qemu-system-s390x never changes the address spaces after tcg_cpu_address_space_init() is called, and thus tcg_commit() is never called. This causes a SIGSEGV. Because memory_map_init() will now call mem_commit(), we have to initialize io_mem_* before address_space_memory and friends. Reported-by: Philipp Kern <pkern@debian.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Fixes: 0a1c71cec63e95f9b8d0dc96d049d2daa00c5210 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-11-02	file_ram_alloc: propagate error to caller instead of terminating QEMU	Igor Mammedov	1	-4/+0
	QEMU shouldn't exits from file_ram_alloc() if -mem-prealloc option is specified and "object_add memory-backend-file,..." fails allocation during memory hotplug. Propagate error to a caller and let it decide what to do with allocation failure. That leaves QEMU alive if it can't create backend during hotplug time and kills QEMU at startup time if backends or initial memory were misconfigured/ too large. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1445274671-17704-1-git-send-email-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-22	Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging	Peter Maydell	1	-37/+10
	vhost, pc, virtio features, fixes, cleanups New features: VT-d support for devices behind a bridge vhost-user migration support Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 22 Oct 2015 12:39:19 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: (37 commits) hw/isa/lpc_ich9: inject the SMI on the VCPU that is writing to APM_CNT i386: keep cpu_model field in MachineState uptodate vhost: set the correct queue index in case of migration with multiqueue piix: fix resource leak reported by Coverity seccomp: add memfd_create to whitelist vhost-user-test: check ownership during migration vhost-user-test: add live-migration test vhost-user-test: learn to tweak various qemu arguments vhost-user-test: wrap server in TestServer struct vhost-user-test: remove useless static check vhost-user-test: move wait_for_fds() out vhost: add migration block if memfd failed vhost-user: use an enum helper for features mask vhost user: add rarp sending after live migration for legacy guest vhost user: add support of live migration net: add trace_vhost_user_event vhost-user: document migration log vhost: use a function for each call vhost-user: add a migration blocker vhost-user: send log shm fd along with log_base ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-21	exec: factor out duplicate mmap code	Michael S. Tsirkin	1	-37/+10
	Anonymous and file-backed RAM allocation are now almost exactly the same. Reduce code duplication by moving RAM mmap code out of oslib-posix.c and exec.c. Reported-by: Marc-André Lureau <mlureau@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-12	exec.c: Collect AddressSpace related fields into a CPUAddressSpace struct	Peter Maydell	1	-18/+38
	Gather up all the fields currently in CPUState which deal with the CPU's AddressSpace into a separate CPUAddressSpace struct. This paves the way for allowing the CPU to know about more than one AddressSpace. The rearrangement also allows us to make the MemoryListener a directly embedded object in the CPUAddressSpace (it could not be embedded in CPUState because 'struct MemoryListener' isn't defined for the user-only builds). This allows us to resolve the FIXME in tcg_commit() by going directly from the MemoryListener to the CPUAddressSpace. This patch extracts the actual update of the cached dispatch pointer from cpu_reload_memory_map() (which is renamed accordingly to cpu_reloading_memory_map() as it is only responsible for breaking cpu-exec.c's RCU critical section now). This lets us keep the definition of the CPUAddressSpace struct private to exec.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1443709790-25180-4-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-12	exec.c: Don't call cpu_reload_memory_map() from cpu_exec_init()	Peter Maydell	1	-1/+0
	Currently we call cpu_reload_memory_map() from cpu_exec_init(), but this is not necessary: * KVM doesn't use the data structures maintained by cpu_reload_memory_map() (the TLB and cpu->memory_dispatch) * for TCG, we will call this function via tcg_commit() either as soon as tcg_cpu_address_space_init() registers the listener, or when the first MemoryRegion is added to the AddressSpace if the AS is empty when we register the listener The unnecessary call is awkward for adding support for multiple address spaces per CPU, so drop it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com> Message-Id: <1443709790-25180-2-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01	exec: allocate PROT_NONE pages on top of RAM	Michael S. Tsirkin	1	-3/+39
	This inserts a read and write protected page between RAM and QEMU memory, for file-backend RAM. This makes it harder to exploit QEMU bugs resulting from buffer overflows in devices using variants of cpu_physical_memory_map, dma_memory_map etc. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16	include/exec: Move cputlb exec.c defs out	Peter Crosthwaite	1	-1/+0
	Move the architecture agnostic function prototypes for exec.c out of cputlb.h to exec-all.h. This allows hiding of the arch specific cputlb.h from exec.c which should be getting close to having no architecture specifics. Prepares support for multi-arch, which will have a minimal cpu.h that services exec.c but not cputlb.h. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <b4fe754c58c860315e35d44430c26b1c967ce2c9.1441614289.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16	cputlb: Change tlb_set_dirty() arg to cpu	Peter Crosthwaite	1	-2/+1
	Change tlb_set_dirty() to accept a CPU instead of an env pointer. This allows for removal of another CPUArchState usage from prototypes that need to be QOMified. Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <d2b1dcbe7945112989861d8ba7369449c11cc273.1441614289.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16	cputlb: move CPU_LOOP() for tlb_reset() to exec.c	Peter Crosthwaite	1	-1/+4
	To prepare for multi-arch, cputlb.c should only have awareness of one single architecture. This means it should not have access to the full CPU lists which may be heterogeneous. Instead, push the CPU_LOOP() up to the one and only caller in exec.c. Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <db06dc6c49f8970caaf116d0385f00ee10a56f2f.1441614289.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-16	cpu: Add crash_occurred flag into CPUState	Andrey Smetanin	1	-0/+19
	CPUState::crash_occurred field inside CPUState marks that guest crash occurred. This value is added into cpu common migration subsection. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Andreas Färber <afaerber@suse.de> Message-Id: <1435924905-8926-12-git-send-email-den@openvz.org> [Document the new field. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-14	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell	1	-1/+1
	* Support for jemalloc * qemu_mutex_lock_iothread "No such process" fix * cutils: qemu_strto* wrappers * iohandler.c simplification * Many other fixes and misc patches. And some MTTCG work (with Emilio's fixes squashed): * Signal-free TCG kick * Removing spinlock in favor of QemuMutex * User-mode emulation multi-threading fixes/docs # gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (44 commits) cutils: work around platform differences in strto{l,ul,ll,ull} cpu-exec: fix lock hierarchy for user-mode emulation exec: make mmap_lock/mmap_unlock globally available tcg: comment on which functions have to be called with mmap_lock held tcg: add memory barriers in page_find_alloc accesses remove unused spinlock. replace spinlock by QemuMutex. cpus: remove tcg_halt_cond and tcg_cpu_thread globals cpus: protect work list with work_mutex scripts/dump-guest-memory.py: fix after RAMBlock change configure: Add support for jemalloc add macro file for coccinelle configure: factor out adding disas configure vhost-scsi: fix wrong vhost-scsi firmware path checkpatch: remove tests that are not relevant outside the kernel checkpatch: adapt some tests to QEMU CODING_STYLE: update mixed declaration rules qmp: Add example usage of strto*l() qemu wrapper cutils: Add qemu_strtoull() wrapper cutils: Add qemu_strtoll() wrapper ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-09	remove qemu/tls.h	Paolo Bonzini	1	-1/+1
	TLS is now required on all platforms, so DECLARE_TLS/DEFINE_TLS is not needed anymore. Removing it does not break Windows because of the previous patch. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-07	exec.c: Use pow2floor() rather than hand-calculation	Peter Maydell	1	-3/+1
	Use pow2floor() to round down to the nearest power of 2, rather than an inline calculation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1437741192-20955-5-git-send-email-peter.maydell@linaro.org
2015-08-14	exec: use macro ROUND_UP for alignment	Chen Hanxiao	1	-1/+1
	Use ROUND_UP instead. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> Message-Id: <1437707523-4910-1-git-send-email-chenhanxiao@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-23	exec.c: Use atomic_rcu_read() to access dispatch in ↵	Peter Maydell	1	-1/+4
	memory_region_section_get_iotlb() When accessing the dispatch pointer in an AddressSpace within an RCU critical section we should always use atomic_rcu_read(). Fix an access within memory_region_section_get_iotlb() which was incorrectly doing a direct pointer access. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1437391637-31576-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-09	cpu: Change cpu_exec_init() arg to cpu, not env	Peter Crosthwaite	1	-3/+2
	The callers (most of them in target-foo/cpu.c) to this function all have the cpu pointer handy. Just pass it to avoid an ENV_GET_CPU() from core code (in exec.c). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Michael Walle <michael@walle.cc> Cc: Leon Alrae <leon.alrae@imgtec.com> Cc: Anthony Green <green@moxielogic.com> Cc: Jia Liu <proljc@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09	translate-all: Change tb_flush() env argument to cpu	Peter Crosthwaite	1	-2/+1
	All of the core-code usages of this API have the cpu pointer handy so pass it in. There are only 3 architecture specific usages (2 of which are commented out) which can just use ENV_GET_CPU() locally to get the cpu pointer. The reduces core code usage of the CPU env, which brings us closer to common-obj'ing these core files. Cc: Riku Voipio <riku.voipio@iki.fi> Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09	cpu: Convert cpu_index into a bitmap	Bharata B Rao	1	-5/+53
	Currently CPUState::cpu_index is monotonically increasing and a newly created CPU always gets the next higher index. The next available index is calculated by counting the existing number of CPUs. This is fine as long as we only add CPUs, but there are architectures which are starting to support CPU removal, too. For an architecture like PowerPC which derives its CPU identifier (device tree ID) from cpu_index, the existing logic of generating cpu_index values causes problems. With the currently proposed method of handling vCPU removal by parking the vCPU fd in QEMU (Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.html), generating cpu_index this way will not work for PowerPC. This patch changes the way cpu_index is handed out by maintaining a bit map of the CPUs that tracks both addition and removal of CPUs. The CPU bitmap allocation logic is part of cpu_exec_init(), which is called by instance_init routines of various CPU targets. Newly added cpu_exec_exit() API handles the deallocation part and this routine is called from generic CPU instance_finalize. Note: This new CPU enumeration is for !CONFIG_USER_ONLY only. CONFIG_USER_ONLY continues to have the old enumeration logic. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> [AF: max_cpus -> MAX_CPUMASK_BITS] Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09	cpu: Add Error argument to cpu_exec_init()	Bharata B Rao	1	-1/+1
	Add an Error argument to cpu_exec_init() to let users collect the error. This is in preparation to change the CPU enumeration logic in cpu_exec_init(). With the new enumeration logic, cpu_exec_init() can fail if cpu_index values corresponding to max_cpus have already been handed out. Since all current callers of cpu_exec_init() are from instance_init, use error_abort Error argument to abort in case of an error. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09	cpu: Reorder cpu->as, cpu->thread_id, cpu->memory_dispatch init	Eduardo Habkost	1	-5/+6
	Instead of initializing cpu->as, cpu->thread_id, and reloading memory map while holding cpu_list_lock(), do it earlier, before locking the CPU list and initializing cpu_index. This allows the code handling cpu_index and global CPU list to be isolated from the rest. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09	cpu: Initialize breakpoint/watchpoint lists in cpu_common_initfn()	Eduardo Habkost	1	-2/+0
	One small step in the simplification of cpu_exec_init(). Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-09	cpu: No need to zero-initialize CPUState::numa_node	Eduardo Habkost	1	-1/+0
	QOM objects are already zero-filled when instantiated, there's no need to explicitly set numa_node to 0. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-07-07	migration: extend migration_bitmap	Li Zhijian	1	-0/+5
	Prevously, if we hotplug a device(e.g. device_add e1000) during migration is processing in source side, qemu will add a new ram block but migration_bitmap is not extended. In this case, migration_bitmap will overflow and lead qemu abort unexpectedly. Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-07-06	exec: skip MMIO regions correctly in cpu_physical_memory_write_rom_internal	Paolo Bonzini	1	-1/+13
	Loading the BIOS in the mac99 machine is interesting, because there is a PROM in the middle of the BIOS region (from 16K to 32K). Before memory region accesses were clamped, when QEMU was asked to load a BIOS from 0xfff00000 to 0xffffffff it would put even those 16K from the BIOS file into the region. This is weird because those 16K were not actually visible between 0xfff04000 and 0xfff07fff. However, it worked. After clamping was added, this also worked. In this case, the cpu_physical_memory_write_rom_internal function split the write in three parts: the first 16K were copied, the PROM area (second 16K) were ignored, then the rest was copied. Problems then started with commit 965eb2f (exec: do not clamp accesses to MMIO regions, 2015-06-17). Clamping accesses is not done for MMIO regions because they can overlap wildly, and MMIO registers can be expected to perform full-width accesses based only on their address (with no respect for adjacent registers that could decode to completely different MemoryRegions). However, this lack of clamping also applied to the PROM area! cpu_physical_memory_write_rom_internal thus failed to copy the third range above, i.e. only copied the first 16K of the BIOS. In effect, address_space_translate is expecting _something else_ to do the clamping for MMIO regions if the incoming length is large. This "something else" is memory_access_size in the case of address_space_rw, so use the same logic in cpu_physical_memory_write_rom_internal. Reported-by: Alexander Graf <agraf@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Fixes: 965eb2f Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01	memory: let address_space_rw/ld/st run outside the BQL	Jan Kiszka	1	-9/+57
	The MMIO case is further broken up in two cases: if the caller does not hold the BQL on invocation, the unlocked one takes or avoids BQL depending on the locking strategy of the target memory region and its coalesced MMIO handling. In this case, the caller should not hold _any_ lock (a friendly suggestion which is disregarded by virtio-scsi-dataplane). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Frederic Konrad <fred.konrad@greensocs.com> Message-Id: <1434646046-27150-6-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-01	exec: pull qemu_flush_coalesced_mmio_buffer() into address_space_rw/ld/st	Paolo Bonzini	1	-0/+21
	As memory_region_read/write_accessor will now be run also without BQL held, we need to move coalesced MMIO flushing earlier in the dispatch process. Cc: Frederic Konrad <fred.konrad@greensocs.com> Message-Id: <1434646046-27150-5-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19	exec: clamp accesses against the MemoryRegionSection	Paolo Bonzini	1	-1/+1
	Because the clamping was done against the MemoryRegion, address_space_rw was effectively broken if a write spanned multiple sections that are not linear in underlying memory (with the memory not being under an IOMMU). This is visible with the MIPS rc4030 IOMMU, which is implemented as a series of alias memory regions that point to the actual RAM. Tested-by: Hervé Poussineau <hpoussin@reactos.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19	exec: do not clamp accesses to MMIO regions	Paolo Bonzini	1	-2/+6
	It is common for MMIO registers to overlap, for example a 4 byte register at 0xcf8 (totally random choice... :)) and a 1 byte register at 0xcf9. If these registers are implemented via separate MemoryRegions, it is wrong to clamp the accesses as the value written would be truncated. Hence for these regions the effects of commit 23820db (exec: Respect as_translate_internal length clamp, 2015-03-16, previously applied as commit c3c1bb99) must be skipped. Tested-by: Hervé Poussineau <hpoussin@reactos.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-12	qemu_ram_foreach_block: pass up error value, and down the ramblock name	Dr. David Alan Gilbert	1	-2/+8
	check the return value of the function it calls and error if it's non-0 Fixup qemu_rdma_init_one_block that is the only current caller, and rdma_add_block the only function it calls using it. Pass the name of the ramblock to the function; helps in debugging. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Michael R. Hines <mrhines@us.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-12	migration: Use normal VMStateDescriptions for Subsections	Juan Quintela	1	-7/+4
	We create optional sections with this patch. But we already have optional subsections. Instead of having two mechanism that do the same, we can just generalize it. For subsections we just change: - Add a needed function to VMStateDescription - Remove VMStateSubsection (after removal of the needed function it is just a VMStateDescription) - Adjust the whole tree, moving the needed function to the corresponding VMStateDescription Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-06-05	memory: replace cpu_physical_memory_reset_dirty() with test-and-clear	Stefan Hajnoczi	1	-6/+17
	The cpu_physical_memory_reset_dirty() function is sometimes used together with cpu_physical_memory_get_dirty(). This is not atomic since two separate accesses to the dirty memory bitmap are made. Turn cpu_physical_memory_reset_dirty() and cpu_physical_memory_clear_dirty_range_type() into the atomic cpu_physical_memory_test_and_clear_dirty(). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05	exec: only check relevant bitmaps for cleanliness	Paolo Bonzini	1	-9/+13
	Most of the time, not all bitmaps have to be marked as dirty; do not do anything if the interesting ones are already dirty. Previously, any clean bitmap would have cause all the bitmaps to be marked dirty. In fact, unless running TCG most of the time bitmap operations need not be done at all, because memory_region_is_logging returns zero. In this case, skip the call to cpu_physical_memory_range_includes_clean altogether as well. With this patch, cpu_physical_memory_set_dirty_range is called unconditionally, so there need not be anymore a separate call to xen_modified_memory. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05	exec: pass client mask to cpu_physical_memory_set_dirty_range	Paolo Bonzini	1	-9/+11
	This cuts in half the cost of bitmap operations (which will become more expensive when made atomic) during migration on non-VRAM regions. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05	translate-all: remove unnecessary argument to tb_invalidate_phys_range	Paolo Bonzini	1	-1/+1
	The is_cpu_write_access argument is always 0, remove it. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05	exec: use memory_region_get_dirty_log_mask to optimize dirty tracking	Paolo Bonzini	1	-40/+19
	The memory API can now return the exact set of bitmaps that have to be tracked. Use it instead of the in_migration variable. In the next patches, we will also use it to set only DIRTY_MEMORY_VGA or DIRTY_MEMORY_MIGRATION if necessary. This can make a difference for dataplane, especially after the dirty bitmap is changed to use more expensive atomic operations. Of some interest is the change to stl_phys_notdirty. When migration was introduced, stl_phys_notdirty was changed to effectively behave as stl_phys during migration. In fact, if one looks at the function as it was in the beginning (commit 8df1cd0, physical memory access functions, 2005-01-28), at the time the dirty bitmap was the equivalent of DIRTY_MEMORY_CODE nowadays; hence, the function simply should not touch the dirty code bits. This patch changes it to do the intended thing. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05	ram_addr: tweaks to xen_modified_memory	Paolo Bonzini	1	-1/+2
	Invoke xen_modified_memory from cpu_physical_memory_set_dirty_range_nocode; it is akin to DIRTY_MEMORY_MIGRATION, so set it together with that bitmap. The remaining call from invalidate_and_set_dirty's "else" branch will go away soon. Second, fix the second argument to the function in the cpu_physical_memory_set_dirty_lebitmap call site. That function is only used by KVM, but it is better to be clean anyway. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05	exec: optimize phys_page_set_level	Paolo Bonzini	1	-14/+10
	phys_page_set_level is writing zeroes to a struct that has just been filled in by phys_map_node_alloc. Instead, tell phys_map_node_alloc whether to fill in the page "as a leaf" or "as a non-leaf". memcpy is faster than struct assignment, which copies each bitfield individually. A compiler bug (https://gcc.gnu.org/PR66391), and small memcpys like this one are special-cased anyway, and optimized to a register move, so just use the memcpy. This cuts the cost of phys_page_set_level from 25% to 5% when booting qboot. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>