summaryrefslogtreecommitdiff
path: root/target-i386
AgeCommit message (Collapse)AuthorFilesLines
2013-04-01target-i386: SSE4.2: use clz32/ctz32 instead of reinventing the wheelAurelien Jarno2-30/+3
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: enable SSE4.1 and SSE4.2 in TCG modeAurelien Jarno1-6/+7
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpXstrX instructions with "Masked(-)" polarityAurelien Jarno1-1/+1
valids can equals to -1 if the reg/mem string is empty. Change the expression to have an empty xor mask in that case. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal ordered" modeAurelien Jarno1-2/+3
The inner loop should only change the current bit of the result, instead of the whole result. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpXstrX instructions in "Equal each" modeAurelien Jarno1-1/+1
pcmpXstrX instructions in "Equal each" mode force both invalid element pair to true. It means (upper - MAX(valids, validd)) bits should be set to 1, not (upper - MAX(valids, validd) + 1). Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpXstrX instructions in "Ranges" modeAurelien Jarno1-2/+2
Fix the order of the of the comparisons to match the "Intel 64 and IA-32 Architectures Software Developer's Manual". Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpXstrm instructionsAurelien Jarno1-8/+8
pcmpXstrm instructions returns their result in the XMM0 register and not in the first operand. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpXstri instructionsAurelien Jarno1-2/+2
ffs1 returns the first bit set to one starting counting from the most significant bit. pcmpXstri returns the most significant bit set to one, starting counting from the least significant bit. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.2: fix pcmpgtq instructionAurelien Jarno1-2/+1
The "Intel 64 and IA-32 Architectures Software Developer's Manual" (at least recent versions) clearly says that the comparison is signed. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-01target-i386: SSE4.1: fix pinsrb instructionAurelien Jarno1-2/+2
gen_op_mov_TN_reg() loads the value in cpu_T[0], so this temporary should be used instead of cpu_tmp0. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-03-23target-i386: Don't modify env->eflags around cpu_dump_stateRichard Henderson1-1/+1
We can compute the value in cpu_dump_state anyway, and gratuitous modifications to eflags creates heisenbugs. Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-03-23target-i386: Fix flags computation for ADOXRichard Henderson1-1/+1
When starting from CC_OP_DYNAMIC, and issuing adox before adcx, a typo used the wrong value for the resulting CC_OP. Cc: Blue Swirl <blauwirbel@gmail.com> Reported-by: Torbjorn Granlund <tg@gmplib.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-03-22Fix typos and misspellingsPeter Maydell1-1/+1
Fix various typos and misspellings. The bulk of these were found with codespell. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-12cpu: Replace do_interrupt() by CPUClass::do_interrupt methodAndreas Färber4-3/+12
This removes a global per-target function and thus takes us one step closer to compiling multiple targets into one executable. It will also allow to override the interrupt handling for certain CPU families. Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12cpu: Pass CPUState to cpu_interrupt()Andreas Färber1-3/+3
Move it to qom/cpu.h to avoid issues with include order. Change pc_acpi_smi_interrupt() opaque to X86CPU. Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12cpu: Move halted and interrupt_request fields to CPUStateAndreas Färber7-50/+66
Both fields are used in VMState, thus need to be moved together. Explicitly zero them on reset since they were located before breakpoints. Pass PowerPCCPU to kvmppc_handle_halt(). Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12target-i386: Update VMStateDescription to X86CPUAndreas Färber4-110/+113
Expose vmstate_cpu as vmstate_x86_cpu and hook it up to CPUClass::vmsd. Adapt opaques and VMState fields to X86CPU. Drop cpu_{save,load}(). Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-04Merge remote-tracking branch 'mst/tags/for_anthony' into stagingAnthony Liguori1-1/+2
virtio,vhost,pci,e1000 Mostly bugfixes, but also some ICH work by Laszlo. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 28 Feb 2013 07:13:56 AM CST using RSA key ID D28D5469 # gpg: Can't check signature: public key not found # By Michael S. Tsirkin (2) and others # Via Michael S. Tsirkin * mst/tags/for_anthony: Set virtio-serial device to have a default of 2 MSI vectors. ICH9 LPC: Reset Control Register, basic implementation Fix guest OS hang when 64bit PCI bar present e1000: unbreak the guest network migration to 1.3 vhost: memory sync fixes
2013-03-03gen-icount.h: Rename gen_icount_start/end to gen_tb_start/endPeter Maydell1-2/+2
The gen_icount_start/end functions are now somewhat misnamed since they are useful for generic "start/end of TB" code, used for more than just icount. Rename them to gen_tb_start/end. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-03-03cpu: Introduce ENV_OFFSET macrosAndreas Färber1-0/+1
Introduce ENV_OFFSET macros which can be used in non-target-specific code that needs to generate TCG instructions which reference CPUState fields given the cpu_env register that TCG targets set up with a pointer to the CPUArchState struct. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-27target-i386: Use mulu2 and muls2Richard Henderson3-155/+56
These correspond very closely to the insns that we're emulating. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-27Fix guest OS hang when 64bit PCI bar presentAlexey Korolev1-1/+2
This patch addresses the issue fully described here: http://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg01804.html Linux kernels prior to 2.6.36 do not disable the PCI device during enumeration process. Since lower and higher parts of a 64bit BAR are programmed separately this leads to qemu receiving a request to occupy a completely wrong address region for a short period of time. We have found that the boot process screws up completely if kvm-apic range is overlapped even for a short period of time (it is fine for other regions though). This patch raises the priority of the kvm-apic memory region, so it is never pushed out by PCI devices. The patch is quite safe as it does not touch memory manager. Signed-off-by: Alexey Korolev <akorolex@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-02-23target-i386: Use add2 to implement the ADX extensionRichard Henderson1-11/+9
Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-19target-i386: Use movcond to implement shiftd.Richard Henderson1-141/+106
With this being all straight-line code, it can get deleted when the cc variables die. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Discard CC_OP computation in set_cc_op alsoRichard Henderson1-3/+11
The shift and rotate insns use movcond to set CC_OP, and thus achieve a conditional EFLAGS setting. By discarding CC_OP in a later flags setting insn, we can discard that movcond. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Use movcond to implement rotate flags.Richard Henderson1-116/+121
With this being all straight-line code, it can get deleted when the cc variables die. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Use movcond to implement shift flags.Richard Henderson1-52/+42
With this being all straight-line code, it can get deleted when the cc variables die. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Add CC_OP_CLRRichard Henderson4-3/+21
Special case xor with self. We need not even store the known zero into cc_src. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Implement tzcnt and fix lzcntRichard Henderson3-48/+54
We weren't computing flags for lzcnt at all. At the same time, adjust the implementation of bsf/bsr to avoid the local branch, using movcond instead. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Use clz/ctz for bsf/bsr helpersRichard Henderson2-37/+14
And mark the helpers as NO_RWG_SE. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-19target-i386: Implement ADX extensionRichard Henderson5-5/+146
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement RORXRichard Henderson1-0/+32
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement SHLX, SARX, SHRXRichard Henderson1-0/+31
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement PDEP, PEXTRichard Henderson3-0/+71
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement MULXRichard Henderson3-0/+47
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement BZHIRichard Henderson1-0/+27
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement BLSR, BLSMSK, BLSIRichard Henderson5-1/+95
Do all of group 17 at one time for ease. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement BEXTRRichard Henderson1-0/+40
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement ANDNRichard Henderson2-7/+22
As this is the first of the BMI insns to be implemented, this carries quite a bit more baggage than normal. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Implement MOVBERichard Henderson2-28/+110
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Decode the VEX prefixesRichard Henderson1-4/+64
No actual required uses of these encodings yet. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Tidy prefix parsingRichard Henderson1-82/+52
Avoid duplicating switch statement between 32 and 64-bit modes. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Use CC_SRC2 for ADC and SBBRichard Henderson5-85/+75
Add another slot in ENV and store two of the three inputs. This lets us do less work when carry-out is not needed, and avoids the unpredictable CC_OP after translating these insns. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Make helper_cc_compute_{all,c} constRichard Henderson3-14/+33
Pass the data in explicitly, rather than indirectly via env. This avoids all sorts of unnecessary register spillage. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Don't reference ENV through most of cc helpersRichard Henderson2-282/+180
In preparation for making this a const helper. By using the proper types in the parameters to the helper functions, we get to avoid quite a lot of subsequent casting. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: optimize flags checking after sub using CC_SRCTRichard Henderson1-15/+31
After a comparison or subtraction, the original value of the LHS will currently be reconstructed using an addition. However, in most cases it is already available: store it in a temp-local variable and save 1 or 2 TCG ops (2 if the result of the addition needs to be extended). The temp-local can be declared dead as soon as the cc_op changes again, or also before the translation block ends because gen_prepare_cc will always make a copy before returning it. All this magic, plus copy propagation and dead-code elimination, ensures that the temp local will (almost) never be spilled. Example (cmp $0x21,%rax + jbe): Before After ---------------------------------------------------------------------------- movi_i64 tmp1,$0x21 movi_i64 tmp1,$0x21 movi_i64 cc_src,$0x21 movi_i64 cc_src,$0x21 sub_i64 cc_dst,rax,tmp1 sub_i64 cc_dst,rax,tmp1 add_i64 tmp7,cc_dst,cc_src movi_i32 cc_op,$0x11 movi_i32 cc_op,$0x11 brcond_i64 tmp7,cc_src,leu,$0x0 discard loc11 brcond_i64 rax,cc_src,leu,$0x0 Before After ---------------------------------------------------------------------------- mov (%r14),%rbp mov (%r14),%rbp mov %rbp,%rbx mov %rbp,%rbx sub $0x21,%rbx sub $0x21,%rbx lea 0x21(%rbx),%r12 movl $0x11,0xa0(%r14) movl $0x11,0xa0(%r14) movq $0x21,0x90(%r14) movq $0x21,0x90(%r14) mov %rbx,0x98(%r14) mov %rbx,0x98(%r14) cmp $0x21,%r12 | cmp $0x21,%rbp jbe ... jbe ... Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: Update cc_op before TCG branchesRichard Henderson1-4/+4
Placing the CC_OP_DYNAMIC at the join is less effective than before the branch, as the branch will have forced global registers to their home locations. This way we have a chance to discard CC_SRC2 before it gets stored. Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: introduce gen_jcc1_noeobRichard Henderson1-5/+22
A jump that ends a basic block or otherwise falls back to CC_OP_DYNAMIC will always have to call gen_op_set_cc_op. However, not all jumps end a basic block, so introduce a variant that does not do this. This was partially undone earlier (i386: drop cc_op argument of gen_jcc1), redo it now also to prepare for the introduction of src2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: use gen_op for cmps/scasRichard Henderson1-14/+6
Replace low-level ops with a higher-level "cmp %al, (A0)" in the case of scas, and "cmp T0, (A0)" in the case of cmps. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-02-18target-i386: kill cpu_T3Paolo Bonzini1-11/+8
It is almost unused, and it is simpler to pass a TCG value directly to gen_shiftd_rm_T1_T3. This value is then written to t2 without going through a temporary register. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>