peter/qemu - QEMU hacking for Peter

Age	Commit message (Collapse)	Author	Files	Lines
2016-10-17	target-i386: Don't use cpu->migratable when filtering features	Eduardo Habkost	1	-1/+1
	When explicitly enabling unmigratable flags using "-cpu host" (e.g. "-cpu host,+invtsc"), the requested feature won't be enabled because cpu->migratable is true by default. This is inconsistent with all other CPU models, which don't have the "migratable" option, making "+invtsc" work without the need for extra options. This happens because x86_cpu_filter_features() uses cpu->migratable as an argument for x86_cpu_get_supported_feature_word(). This is not useful because: 2) on "-cpu host" it only makes QEMU disable features that were explicitly enabled in the command-line; 1) on all the other CPU models, cpu->migratable is already false. The fix is to just use 'false' as an argument to x86_cpu_get_supported_feature_word() in x86_cpu_filter_features(). Note that: * This won't change anything for people using using "-cpu host" or "-cpu host,migratable=<on\|off>" (with no extra features) because the x86_cpu_get_supported_feature_word() call on the cpu->host_features check uses cpu->migratable as argument. * This won't change anything for any CPU model except "host" because they all have cpu->migratable == false (and only "host" has the "migratable" property that allows it to be changed). * This will only change things for people using "-cpu host,+<feature>", where <feature> is a non-migratable feature. The only existing named non-migratable feature is "invtsc". In other words, this change will only affect people using "-cpu host,+invtsc" (that will now get what they asked for: the invtsc flag will be enabled). All other use cases are unaffected. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Return runnability information on query-cpu-definitions	Eduardo Habkost	1	-0/+76
	Fill the "unavailable-features" field on the x86 implementation of query-cpu-definitions. Cc: Jiri Denemark <jdenemar@redhat.com> Cc: libvir-list@redhat.com Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: x86_cpu_load_features() function	Eduardo Habkost	1	-24/+41
	When probing for CPU model information, we need to reuse the code that initializes CPUID fields, but not the remaining side-effects of x86_cpu_realizefn(). Move that code to a separate function that can be reused later. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Unset cannot_destroy_with_object_finalize_yet	Eduardo Habkost	1	-5/+0
	TYPE_X86_CPU now call cpu_exec_init() on realize, so we don't need to set cannot_destroy_with_object_finalize_yet anymore. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386/kvm: cache the return value of kvm_enable_x2apic()	Radim Krčmář	1	-2/+15
	Assume that KVM would have returned the same on subsequent runs. Abstract the memoizaiton pattern into macros and call it memorize as adding the r makes it less obscure. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	intel_iommu: reject broken EIM	Radim Krčmář	3	-0/+19
	Cluster x2APIC cannot work without KVM's x2apic API when the maximal APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we forbid other APICs and also the old KVM case with less than 9, to simplify the code. There is no point in enabling EIM in forbidden APICs, so we keep it enabled only for the KVM APIC; unconditionally, because making the option depend on KVM version would be a maintanance burden. Old QEMUs would enable eim whenever intremap was on, which would trick guests into thinking that they can enable cluster x2APIC even if any interrupt destination would get clamped to 8 bits. Depending on your configuration, QEMU could notice that the destination LAPIC is not present and report it with a very non-obvious: KVM: injection failed, MSI lost (Operation not permitted) Or the guest could say something about unexpected interrupts, because clamping leads to aliasing so interrupts were being delivered to incorrect VCPUs. KVM_X2APIC_API is the feature that allows us to enable EIM for KVM. QEMU 2.7 allowed EIM whenever interrupt remapping was enabled. In order to keep backward compatibility, we again allow guests to misbehave in non-obvious ways, and make it the default for old machine types. A user can enable the buggy mode it with "x-buggy-eim=on". Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	apic: add global apic_get_class()	Radim Krčmář	1	-3/+10
	Every configuration has only up to one APIC class and we'll be extending the class with a function that can be called without an instanced object, so a direct access to the class is convenient. This patch will break compilation if some code uses apic_get_class() with CONFIG_USER_ONLY. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Move warning code outside x86_cpu_filter_features()	Eduardo Habkost	1	-9/+19
	x86_cpu_filter_features() will be reused by code that shouldn't print any warning. Move the warning code to a new x86_cpu_report_filtered_features() function, and call it from x86_cpu_realizefn(). Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: xsave: Add FP and SSE bits to x86_ext_save_areas	Eduardo Habkost	1	-4/+18
	Instead of treating the FP and SSE bits as special cases, add them to the x86_ext_save_areas array. This will simplify the code that calculates the supported xsave components and the size of the xsave area. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Register properties for feature aliases manually	Eduardo Habkost	1	-22/+21
	Instead of keeping the aliases inside the feature name arrays and require parsing the strings, just register alias properties manually. This simplifies the code for property registration and lookup. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Remove underscores from feat_names arrays	Eduardo Habkost	1	-16/+20
	Instead of translating the feature name entries when adding property names, store the actual property names in the feature name array. For reference, here is the full list of functions that use FeatureWordInfo::feat_names: * x86_cpu_get_migratable_flags(): not affected, as it just check for non-NULL values. * report_unavailable_features(): informative only. It will start printing feature names with hyphens. * x86_cpu_list(): informative only. It will start printing feature names with hyphens * x86_cpu_register_feature_bit_props(): not affected, as it was already calling feat2prop(). Now we can remove the feat2prop() calls safely. So, the only user-visible effect of this patch are the new names being used in help and error messages for users. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Make plus_features/minus_features QOM-based	Eduardo Habkost	1	-86/+20
	Instead of using custom feature name lookup code for plus_features/minus_features, save the property names used in "[+-]feature" and use object_property_set_bool() to set them. We don't need a feat2prop() call because we now have alias properties for the old names containing underscores. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Register aliases for feature names with underscores	Eduardo Habkost	1	-0/+22
	Registering the actual names containing underscores as aliases will allow management software to be aware that the old compatibility names are suported, and will make feat2prop() calls unnecessary when using feature names. Also, this will help us avoid making the code support underscores on feature names that never had them in the first place. e.g. "+tsc_deadline" was never supported and doesn't need to be translated to "+tsc-deadline". In other word: this will require less magic translation of strings, and simple 1:1 match between the config options and actual QOM properties. Note that the underscores are still present in the FeatureWordInfo::feat_names arrays, because add_flagname_to_bitmaps() needs them to be kept. The next patches will remove add_flagname_to_bitmaps() and will allow us to finally remove the aliases from feat_names. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: Disable VME by default with TCG	Eduardo Habkost	1	-0/+10
	VME is already disabled automatically when using TCG. So, instead of pretending it is there when reporting CPU model data on query-cpu-* QMP commands (making every CPU model to be reported as not runnable), we can disable it by default on all CPU models when using TCG. Do that by adding a tcg_default_props array that will work like kvm_default_props. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17	target-i386: List CPU models using subclass list	Eduardo Habkost	2	-29/+78
	Instead of using the builtin_x86_defs array, use the QOM subclass list to list CPU models on "-cpu ?" and "query-cpu-definitions". Signed-off-by: Andreas Färber <afaerber@suse.de> [ehabkost: copied code from a patch by Andreas: "target-i386: QOM'ify CPU", from March 2012] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-10	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell	1	-0/+7
	* Thread Sanitizer fixes (Alex) * Coverity fixes (David) * test-qht fixes (Emilio) * QOM interface for info irq/info pic (Hervé) * -rtc clock=rt fix (Junlian) * mux chardev fixes (Marc-André) * nicer report on death by signal (Michal) * qemu-tech TLC (Paolo) * MSI support for edu device (Peter) * qemu-nbd --offset fix (Tomáš) # gpg: Signature made Fri 07 Oct 2016 17:25:10 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (39 commits) qemu-doc: merge qemu-tech and qemu-doc qemu-tech: rewrite some parts qemu-tech: reorganize content qemu-tech: move TCG test documentation to tests/tcg/README qemu-tech: move user mode emulation features from qemu-tech qemu-tech: document lazy condition code evaluation in cpu.h qemu-tech: move text from qemu-tech to tcg/README qemu-doc: drop installation and compilation notes qemu-doc: replace introduction with the one from the internals manual qemu-tech: drop index test-qht: perform lookups under rcu_read_lock qht: fix unlock-after-free segfault upon resizing qht: simplify qht_reset_size qemu-nbd: Shrink image size by specified offset qemu_kill_report: Report PID name too util: Introduce qemu_get_pid_name char: update read handler in all cases char: use a fixed idx for child muxed chr i8259: give ISA device when registering ISA ioports .travis.yml: add gcc sanitizer build ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-07	qemu-tech: document lazy condition code evaluation in cpu.h	Paolo Bonzini	1	-0/+7
	Unlike the other sections, they are pretty specific to a particular CPU. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-04	hmp: fix qemu crash due to ioapic state dump w/ split irqchip	Wanpeng Li	1	-1/+2
	The qemu will crash when info ioapic through hmp if irqchip is split. Below message is splat: KVM_GET_IRQCHIP failed: Unknown error -6 This patch fix it by dumping the ioapic state from the qemu emulated ioapic if irqchip is split. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Message-Id: <1474602456-3232-1-git-send-email-wanpeng.li@hotmail.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20160923090824.GF15411@pxdev.xzpeter.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2016-10-03	target-i386: Correct family/model/stepping for Opteron_G3	Evgeny Yakovlev	1	-3/+3
	Current CPU definition for AMD Opteron third generation includes features like SSE4a and LAHF_LM support in emulated CPUID. These features are present in K8 rev.E or K10 CPUs and later. However, current G3 family and model describe 2nd generation K8 cores instead. This is incorrect but was considered harmless until our tests found a problem with linux kernels >= 3.10 (and maybe earlier) which specifically check for Opteron K8 model when parsing CPUID leaf 0x80000001: http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/amd.c?v=3.16#L552 This code will disable LAHF_LM feature in /proc/cpuinfo if model number is inconsistent. This change sets Opteron_G3 family/model/stepping to 16/2/3 which is a proper Opteron 3rd generation 2350 CPU. Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Richard Henderson <rth@twiddle.net> CC: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-03	target-i386: Report known CPUID[EAX=0xD,ECX=0]:EAX bits as migratable	Eduardo Habkost	1	-8/+11
	A regression was introduced by commit 96193c22a "target-i386: Move xsave component mask to features array": all CPUID[EAX=0xD,ECX=0]:EAX bits were being reported as unmigratable because they don't have feature names defined. This broke "-cpu host" because it enables only migratable features by default. This adds a new field to FeatureWordInfo: migratable_flags, which will make those features be reported as migratable even if they don't have a property name defined. Reported-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Paolo Bonzini <bonzini@gnu.org> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-28	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell	2	-15/+10
	* thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-) * license clarification for compiler.h (Felipe) * glib cflags improvement (Marc-André) * checkpatch silencing (Paolo) * SMRAM migration fix (Paolo) * Replay improvements (Pavel) * IOMMU notifier improvements (Peter) * IOAPIC now defaults to version 0x20 (Peter) # gpg: Signature made Tue 27 Sep 2016 10:57:40 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (28 commits) replay: allow replay stopping and restarting replay: vmstate for replay module replay: move internal data to the structure cpus-common: lock-free fast path for cpu_exec_start/end tcg: Make tb_flush() thread safe cpus-common: Introduce async_safe_run_on_cpu() cpus-common: simplify locking for start_exclusive/end_exclusive cpus-common: remove redundant call to exclusive_idle() cpus-common: always defer async_run_on_cpu work items docs: include formal model for TCG exclusive sections cpus-common: move exclusive work infrastructure from linux-user cpus-common: fix uninitialized variable use in run_on_cpu cpus-common: move CPU work item management to common code cpus-common: move CPU list management to common code linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick() linux-user: Use QemuMutex and QemuCond cpus: Rename flush_queued_work() cpus: Move common code out of {async_, }run_on_cpu() cpus: pass CPUState to run_on_cpu helpers build-sys: put glib_cflags in QEMU_CFLAGS ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-27	target-i386: Remove has_msr_* global vars for KVM features	Eduardo Habkost	1	-15/+6
	The global variables are not necessary because we can check KVM feature flags in X86CPU directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Clear KVM CPUID features if KVM is disabled	Eduardo Habkost	1	-0/+4
	This will ensure all checks for features[FEAT_KVM] in the code will be correct in case the KVM CPUID leaf is completely disabled. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Remove has_msr_hv_tsc global variable	Eduardo Habkost	1	-6/+8
	The global variable is not necessary because we can check cpu->hyperv_time directly. We just need to ensure cpu->hyperv_time will be cleared if the feature is not really being exposed to the guest due to missing KVM_CAP_HYPERV_TIME capability. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Remove has_msr_hv_apic global variable	Eduardo Habkost	1	-5/+3
	The global variable is not necessary because we can check cpu->hyperv_vapic directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Remove has_msr_mtrr global variable	Eduardo Habkost	1	-6/+2
	The global variable is not necessary because we can check the CPU feature flags directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Move xsave component mask to features array	Eduardo Habkost	2	-15/+30
	This will reuse the existing check/enforce logic in x86_cpu_filter_features() to check the xsave component bits against GET_SUPPORTED_CPUID. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: xsave: Calculate set of xsave components on realize	Eduardo Habkost	2	-23/+33
	Instead of doing complex calculations and calling kvm_arch_get_supported_cpuid() inside cpu_x86_cpuid(), calculate the set of required XSAVE components earlier, at realize time. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: xsave: Helper function to calculate xsave area size	Eduardo Habkost	1	-7/+15
	Move the xsave area size calculation from cpu_x86_cpuid() inside its own function. While doing it, change it to use the XSAVE area struct sizes for the initial size, instead of the magic 0x240 number. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: xsave: Simplify CPUID[0xD,0].{EAX,EDX} calculation	Eduardo Habkost	1	-6/+2
	Instead of assigning individual bits in a loop, just copy the values from ena_mask. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: xsave: Calculate enabled components only once	Eduardo Habkost	1	-10/+16
	Instead of checking both env->features and ena_mask at two different places in the CPUID code, initialize ena_mask based on the features that are enabled for the CPU, and then clear unsupported bits based on kvm_arch_get_supported_cpuid(). The results should be exactly the same, but it will make it easier to move the mask calculation elsewhare, and reuse x86_cpu_filter_features() for the kvm_arch_get_supported_cpuid() check. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Don't try to enable PT State xsave component	Eduardo Habkost	1	-3/+3
	The code that calculates the set of supported XSAVE components on CPUID looks at ext_save_areas to find out which components should be enabled. However, if there are zeroed entries in the ext_save_areas array, the ((env->features[esa->feature] & esa->bits) == esa->bits) check will always succeed and QEMU will unconditionally try to enable the component. Luckily this never caused any problems because the only missing entry in ext_save_areas is the PT State component (bit 8), and KVM currently doesn't support it (so it was cleared on ena_mask). But the code was still incorrect and would break if KVM starts returning CPUID[EAX=0xD,ECX=0].EAX[bit 8] as supported on GET_SUPPORTED_CPUID. Fix the problem by changing the code to not enable a XSAVE component if ExtSaveArea::bits is zero. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Move feature name arrays inside FeatureWordInfo	Eduardo Habkost	1	-200/+170
	It makes it easier to guarantee the arrays are the right size, and to find information when looking at the code. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	linux-user: remove #define smp_{cores, threads}	Marc-André Lureau	1	-4/+4
	Those are unneeded now that CPUState nr_{cores,threads} is always initialized. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Enable CPUID[0x8000000A] if SVM is enabled	Eduardo Habkost	1	-0/+4
	SVM needs CPUID[0x8000000A] to be available. So if SVM is enabled in a CPU model or explicitly in the command-line, adjust CPUID xlevel to expose the CPUID[0x8000000A] leaf. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Automatically set level/xlevel/xlevel2 when needed	Eduardo Habkost	2	-13/+83
	Instead of requiring users and management software to be aware of required CPUID level/xlevel/xlevel2 values for each feature, automatically increase those values when features need them. This was already done for CPUID[7].EBX, and is now made generic for all CPUID feature flags. Unit test included, to make sure we don't break ABI on older machine-types and don't mess with the CPUID level values if they are explicitly set by the user. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Add a marker to end of the region zeroed on reset	Eduardo Habkost	2	-1/+2
	Instead of using cpuid_level, use an empty struct as a marker (like we already did with {start,end}_init_save). This will avoid accidentaly resetting the wrong fields if we change the field ordering on CPUX86State. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	target-i386: Remove unused X86CPUDefinition::xlevel2 field	Eduardo Habkost	1	-2/+0
	No CPU model in builtin_x86_defs has xlevel2 set, so it is always zero. Delete the field. Note that this is not an user-visible change. It doesn't remove the ability to set xlevel2 on the command-line, it just removes an unused field in builtin_x86_defs. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27	cpus: pass CPUState to run_on_cpu helpers	Alex Bennée	2	-15/+10
	CPUState is a fairly common pointer to pass to these helpers. This means if you need other arguments for the async_run_on_cpu case you end up having to do a g_malloc to stuff additional data into the routine. For the current users this isn't a massive deal but for MTTCG this gets cumbersome when the only other parameter is often an address. This adds the typedef run_on_cpu_func for helper functions which has an explicit CPUState * passed as the first parameter. All the users of run_on_cpu and async_run_on_cpu have had their helpers updated to use CPUState where available. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [Sergey Fedorov: - eliminate more CPUState in user data; - remove unnecessary user data passing; - fix target-s390x/kvm.c and target-s390x/misc_helper.c] Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> (s390 parts) Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1470158864-17651-3-git-send-email-alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22	kvm: fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0	Herongguang (Stephen)	1	-1/+1
	Fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0. Signed-off-by: He Rongguang <herongguang.he@huawei.com> Message-Id: <57E38EAC.3020108@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22	kvm: apic: set APIC base as part of kvm_apic_put	Dr. David Alan Gilbert	2	-0/+10
	The parsing of KVM_SET_LAPIC's input depends on the current value of the APIC base MSR---which indeed is stored in APICCommonState---but for historical reasons APIC base is set through KVM_SET_SREGS together with cr8 (which is really just the APIC TPR) and the actual "special CPU registers". APIC base must now be set before the actual LAPIC registers, so do that in kvm_apic_put. It will be set again to the same value with KVM_SET_SREGS, but that's not a big issue. This only happens since Linux 4.8, which checks for x2apic mode in KVM_SET_LAPIC. However it's really a QEMU bug; until the recent commit 78d6a05 ("x86/lapic: Load LAPIC state at post_load", 2016-09-13) QEMU was indeed setting APIC base (via KVM_SET_SREGS) before the other LAPIC registers. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-22	target-i386: introduce kvm_put_one_msr	Paolo Bonzini	1	-9/+11
	Avoid further code duplication in the next patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-19	target-i386: Use struct X86XSaveArea in fpu_helper.c	Richard Henderson	3	-58/+67
	This avoids a double hand-full of magic numbers in the xsave and xrstor helper functions. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-16	target-i386: Generate fences for x86	Pranith Kumar	1	-0/+8
	Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-15-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-09-14	target-i386: Fixed syscall posssible segfault	Stanislav Shmarov	1	-17/+19
	In user-mode emulation env->idt.base memory is allocated in linux-user/main.c with size 8512 = 4096 (for 64-bit). When fake interrupt EXCP_SYSCALL is thrown do_interrupt_user checks destination privilege level for this fake exception, and tries to read 4 bytes at address base + (256 2^4)=4096, that causes segfault. Privlege level was checked only for int's, so lets read dpl from memory only for this case. Signed-off-by: Stanislav Shmarov <snarpix@gmail.com> Message-Id: <1473773008-2588376-1-git-send-email-snarpix@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-14	target-i386: fix ordering of fields in CPUX86State	Paolo Bonzini	1	-6/+6
	Make sure reset zeroes TSC_AUX, XCR0, PKRU. Move XSTATE_BV from the "vmstate only" section to the "KVM only" section. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-13	x86/lapic: Load LAPIC state at post_load	Dr. David Alan Gilbert	1	-17/+0
	Load the LAPIC state during post_load (rather than when the CPU starts). This allows an interrupt to be delivered from the ioapic to the lapic prior to cpu loading, in particular the RTC that starts ticking as soon as we load it's state. Fixes a case where Windows hangs after migration due to RTC interrupts disappearing. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-09	target-i386: present virtual L3 cache info for vcpus	Longpeng(Mike)	2	-5/+50
	Some software algorithms are based on the hardware's cache info, for example, for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger a resched IPI and told cpu2 to do the wakeup if they don't share low level cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc. The relevant linux-kernel code as bellow: static void ttwu_queue(struct task_struct p, int cpu) { struct rq rq = cpu_rq(cpu); ...... if (... && !cpus_share_cache(smp_processor_id(), cpu)) { ...... ttwu_queue_remote(p, cpu); /* will trigger RES IPI / return; } ...... ttwu_do_activate(rq, p, 0); / access target's rq directly / ...... } In real hardware, the cpus on the same socket share L3 cache, so one won't trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs under some workloads even if the virtual cpus belongs to the same virtual socket. For KVM, there will be lots of vmexit due to guest send IPIs. The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates) and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during the period: No-L3 With-L3(applied this patch) cpu0: 363890 44582 cpu1: 373405 43109 cpu2: 340783 43797 cpu3: 333854 43409 cpu4: 327170 40038 cpu5: 325491 39922 cpu6: 319129 42391 cpu7: 306480 41035 cpu8: 161139 32188 cpu9: 164649 31024 cpu10: 149823 30398 cpu11: 149823 32455 cpu12: 164830 35143 cpu13: 172269 35805 cpu14: 179979 33898 cpu15: 194505 32754 avg: 268963.6 40129.8 The VM's topology is "1socket 8cores 2threads". After present virtual L3 cache info for VM, the amounts of RES IPIs in guest reduce 85%. For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause severe performance degradation. We had tested the overall system performance if vcpus actually run on sparate physical socket. With L3 cache, the performance improves 7.2%~33.1%(avg:15.7%). Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-05	target-i386: Add more Intel AVX-512 instructions support	Luwei Kang	2	-5/+14
	Add more AVX512 feature bits, include AVX512DQ, AVX512IFMA, AVX512BW, AVX512VL, AVX512VBMI. Its spec can be found at: https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdf Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-08-16	target-i386: kvm: Report kvm_pv_unhalt as unsupported w/o kernel_irqchip	Eduardo Habkost	1	-0/+7
	The kvm_pv_unhalt feature doesn't work if kernel_irqchip is disabled, so we need to report it as unsupported. Tested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>