path: root/tcg
2014-03-14  tcg-aarch64: Introduce tcg_out_insn_3405  (Richard Henderson, 1 file, -21/+27)
Cleaning up the implementation of tcg_out_movi at the same time. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Support div, rem  (Richard Henderson, 2 files, -13/+45)
Clean up multiply at the same time. For remainder, generic code will produce mul+sub, whereas we can implement with msub. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
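As an aside, the remainder identity that msub exploits can be written out directly; this is an illustration only (the helper name is hypothetical), not the backend code itself:

    /* AArch64 msub computes d = a - n*m, so after sdiv the remainder
     * needs only one more instruction instead of mul + sub. */
    static long long rem_via_div_msub(long long x, long long y)
    {
        long long q = x / y;    /* sdiv t, x, y    */
        return x - q * y;       /* msub d, t, y, x */
    }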
2014-03-14  tcg-aarch64: Support muluh, mulsh  (Richard Henderson, 2 files, -2/+14)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Support add2, sub2  (Richard Henderson, 2 files, -4/+80)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Support deposit  (Richard Henderson, 2 files, -21/+49)
Also tidy the implementation of ubfm, sbfm, extr in order to share code. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Use tcg_out_insn for setcond  (Richard Henderson, 1 file, -9/+3)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Support movcond  (Richard Henderson, 2 files, -2/+36)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Support andc, orc, eqv, not, neg  (Richard Henderson, 2 files, -10/+67)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Handle constant operands to and, or, xor  (Richard Henderson, 1 file, -49/+107)
Handle a simplified set of logical immediates for the moment. The way gcc and binutils do it, with 52k worth of tables, and a binary search depth of log2(5334) = 13, seems slow for the most common cases. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
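For context, AArch64 logical immediates are replicated bit patterns rather than arbitrary constants, which is why full recognition needs tables or bit-pattern analysis. A hedged sketch of one easily detected subset (a single contiguous run of ones; the helper name and scope are illustrative, not the patch's actual code):

    #include <stdbool.h>
    #include <stdint.h>

    /* True for values like 0x0ff0: a single run of ones, excluding 0 and
     * ~0, which are not valid AArch64 logical immediates. */
    static bool is_contiguous_ones(uint64_t v)
    {
        if (v == 0 || v == ~(uint64_t)0) {
            return false;
        }
        v >>= __builtin_ctzll(v);       /* strip trailing zeros */
        return (v & (v + 1)) == 0;      /* what remains is 2^n - 1 */
    }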
2014-03-14  tcg-aarch64: Handle constant operands to add, sub, and compare  (Richard Henderson, 1 file, -22/+78)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Implement mov with tcg_out_insn  (Richard Henderson, 1 file, -15/+9)
Avoid the magic numbers in the current implementation. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Introduce tcg_out_insn_3401  (Richard Henderson, 1 file, -46/+26)
This merges the implementation of tcg_out_addi and tcg_out_subi. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Convert shift insns to tcg_out_insn  (Richard Henderson, 1 file, -31/+21)
Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-14  tcg-aarch64: Introduce tcg_out_insn  (Richard Henderson, 1 file, -36/+58)
Converting the add/sub (3.5.2) and logical shifted (3.5.10) instruction groups to the new scheme. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
2014-03-08  tcg-aarch64: Remove nop from qemu_st slow path  (Richard Henderson, 1 file, -7/+0)
Commit 023261ef851b22a04f6c5d76da870051031757a6 failed to remove a nop that's no longer required. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Simplify tcg_out_ldst_9 encoding  (Richard Henderson, 1 file, -12/+2)
At first glance the code appears to be using one's complement encoding, a la AArch32. Except that the constant is "off", creating a complicated split-field two's complement encoding. Much clearer to just use a normal mask and shift. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
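The "normal mask and shift" amounts to something like the following sketch (illustrative, not the verbatim change; placing the signed 9-bit offset at bits [20:12] follows the standard AArch64 load/store imm9 layout):

    #include <stdint.h>

    /* Place a signed 9-bit offset into the imm9 field of a load/store. */
    static uint32_t set_imm9(uint32_t insn, int32_t offset)
    {
        return insn | (((uint32_t)offset & 0x1ff) << 12);
    }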
2014-03-08  tcg-aarch64: Use intptr_t appropriately  (Richard Henderson, 1 file, -28/+21)
As opposed to tcg_target_long. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Remove the shift_imm parameter from tcg_out_cmp  (Richard Henderson, 1 file, -6/+5)
It was unused. Let's not overcomplicate things before we need them. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Hoist common argument loads in tcg_out_op  (Richard Henderson, 1 file, -45/+50)
This reduces the code size of the function significantly. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Don't handle mov/movi in tcg_out_op  (Richard Henderson, 1 file, -13/+7)
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Set ext based on TCG_OPF_64BIT  (Richard Henderson, 1 file, -21/+7)
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Change all ext variables to TCGType  (Richard Henderson, 1 file, -27/+37)
We assert that the values for _I32 and _I64 are 0 and 1 respectively. This will make a couple of functions declared by tcg.c cleaner. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-08  tcg-aarch64: Remove redundant CPU_TLB_ENTRY_BITS check  (Richard Henderson, 1 file, -6/+0)
Removed from other targets in 56bbc2f967ce185fa1c5c39e1aeb5b68b26242e9. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-03-02  tcg: Fix typo in comment (dependancies -> dependencies)  (Stefan Weil, 1 file, -1/+1)
Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-02-21  tcg/i386: Fix build for systems without working cpuid.h (MacOSX, Win32)  (Peter Maydell, 1 file, -1/+3)
Win32 doesn't have a cpuid.h, and MacOSX may have one but without the __cpuid() function we use, which means that commit 9d2eec20 broke the build for those platforms. Fix this by tightening up our configure cpuid.h check to test that the functions we need are present, and adding some missing #ifdef guards in tcg/i386/tcg-target.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/i386: Use SHLX/SHRX/SARX instructions  (Richard Henderson, 1 file, -11/+50)
These three-operand shift instructions do not require the shift count to be placed into ECX. This reduces the number of mov insns required, with the mere addition of a new register constraint. Don't attempt to get rid of the matching constraint, as that's impossible to manipulate with just a new constraint. In addition, constant shifts still need the matching constraint. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/i386: Use ANDN instruction  (Richard Henderson, 2 files, -13/+45)
Note that the optimizer cannot simplify ANDC X,Y,C to AND X,Y,~C so we must handle constants in the implementation of andc. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
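The reasoning is easiest to see from the op's definition: andc computes dest = src1 & ~src2, and since the optimizer leaves a constant second operand in place, the backend has to deal with it, for example by folding the inversion into the immediate itself. A minimal sketch of the semantics (illustration only, not backend code):

    #include <stdint.h>

    /* andc(x, y) == x & ~y; with y a known constant c the backend can
     * simply emit an AND with the precomputed ~c. */
    static uint64_t andc_semantics(uint64_t x, uint64_t y)
    {
        return x & ~y;
    }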
2014-02-17  tcg/i386: Add tcg_out_vex_modrm  (Richard Henderson, 1 file, -3/+38)
Prepare for emitting BMI insns which require VEX encoding. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/i386: Move TCG_CT_CONST_* to tcg-target.c  (Richard Henderson, 2 files, -3/+4)
These are not needed by users of tcg-target.h. No need to recompile when we adjust them. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/optimize: Add more identity simplifications  (Richard Henderson, 1 file, -15/+24)
Recognize 0 operand to andc, and -1 operands to and, orc, eqv. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
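Spelled out with the usual definitions of these ops, each of the new identities collapses to a plain mov of the other operand (a quick self-check, not QEMU code):

    #include <assert.h>
    #include <stdint.h>

    static void identity_checks(uint64_t x)
    {
        assert((x & ~(uint64_t)0) == x);     /* andc(x, 0) == x */
        assert((x & (uint64_t)-1) == x);     /* and(x, -1) == x */
        assert((x | ~(uint64_t)-1) == x);    /* orc(x, -1) == x */
        assert(~(x ^ (uint64_t)-1) == x);    /* eqv(x, -1) == x */
    }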
2014-02-17  tcg/optimize: Optimize ANDC X,Y,Y to MOV X,0  (Richard Henderson, 1 file, -0/+1)
Like we already do for SUB and XOR. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/optimize: Simplify some logical ops to NOT  (Richard Henderson, 1 file, -0/+57)
Given, of course, an appropriate constant. These could be generated from the "canonical" operation for inversion on the guest, or via other optimizations. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/optimize: Handle known-zeros masks for ANDC  (Richard Henderson, 1 file, -0/+11)
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/optimize: add known-zero bits compute for load ops  (Aurelien Jarno, 1 file, -1/+25)
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/optimize: improve known-zero bits for 32-bit ops  (Aurelien Jarno, 1 file, -0/+6)
The shl_i32 op might set some of the unused high 32 bits of the mask. Fix that by clearing the unused high 32 bits for all 32-bit ops except load/store, which operate on tl values. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
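In the optimizer the mask tracks which destination bits may still be nonzero, so the fix boils down to truncating it for 32-bit ops; a hedged sketch with hypothetical names (the real change is inline in tcg/optimize.c):

    #include <stdbool.h>
    #include <stdint.h>

    /* For 32-bit ops the high half of the result is not meaningful, so
     * drop it from the may-be-set mask; qemu_ld/st ops on tl values are
     * left untouched. */
    static uint64_t clamp_mask_32(uint64_t mask, bool is_qemu_ldst)
    {
        return is_qemu_ldst ? mask : (mask & 0xffffffffull);
    }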
2014-02-17  tcg/optimize: fix known-zero bits optimization  (Aurelien Jarno, 1 file, -1/+7)
The known-zero bits optimization is a great idea that helps to generate more optimized code. However, the current implementation only works in very few cases, as the computed mask is not saved. Fix this so that it actually works. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17  tcg/optimize: fix known-zero bits for right shift ops  (Aurelien Jarno, 1 file, -5/+14)
The 32-bit versions of the sar and shr ops should not propagate known-zero bits from the unused high 32 bits. For sar this could even lead to wrong code being generated. Cc: qemu-stable@nongnu.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
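For the constant-count case the propagation rule can be sketched as follows (illustrative, assuming the compiler's arithmetic right shift for signed values; the real change lives in tcg/optimize.c):

    #include <stdint.h>

    /* 'mask' holds bits that may be nonzero.  Truncate to 32 bits first
     * so stale high bits do not leak in; for sar the sign bit has to be
     * treated as replicating into the shifted-in positions. */
    static uint64_t shr_i32_mask(uint64_t mask, int c)
    {
        return (uint32_t)mask >> c;
    }

    static uint64_t sar_i32_mask(uint64_t mask, int c)
    {
        return (uint32_t)((int32_t)(uint32_t)mask >> c);
    }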
2014-02-17  tcg-arm: The shift count of op_rotl_i32 is in args[2] not args[1].  (Huw Davies, 1 file, -1/+1)
It's this that should be subtracted from 0x20 when converting to a right rotate. Cc: qemu-stable@nongnu.org Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-15  TCG: Fix 32-bit host allocation typo  (Richard Henderson, 1 file, -1/+1)
The second half register of a 64-bit temp on a 32-bit host was allocated with the wrong base_type. The base_type of the second half register is never checked, but for consistency it should be the same as the first half. Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-08  tcg: Add TCGV_UNUSED_PTR, TCGV_IS_UNUSED_PTR, TCGV_EQUAL_PTR  (Peter Maydell, 1 file, -0/+3)
We have macros for marking TCGv values as unused, checking if they are unused and comparing them to each other. However these only exist for TCGv_i32 and TCGv_i64; add them for TCGv_ptr as well. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-02-01  tcg/s390: Remove sigill_handler  (Richard Henderson, 1 file, -19/+0)
Commit c9baa30f42a87f61627391698f63fa4d1566d9d8 failed to delete all of the relevant code, leading to Werrors about unused symbols. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-01-30  Merge remote-tracking branch 'rth/tcg-movbe' into staging  (Peter Maydell, 1 file, -48/+97)
* rth/tcg-movbe:
  tcg/i386: cleanup useless #ifdef
  tcg/i386: use movbe instruction in qemu_ldst routines
  tcg/i386: add support for three-byte opcodes
  tcg/i386: remove hardcoded P_REXW value
  disas/i386.c: disassemble movbe instruction
Message-id: 1390692772-15282-1-git-send-email-rth@twiddle.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30  TCG: Fix I64-on-32bit-host temporaries  (Alexander Graf, 1 file, -1/+1)
We have cache pools of temporaries that we can reuse later when they've already been allocated before. These cache pools differentiate between the target TCG variable type they contain. So we have one pool for I32 and one pool for I64 variables. On a 32-bit system, we can't work with 64-bit registers though. So instead we spawn two I32 temporaries for every I64 temporary we create. All caching works the same way as on a real 64-bit system though: We create a cache entry in the 64-bit array for the first i32 index. However, when we free such a temporary we free it to the pool of its type (which is always i32 on 32-bit systems) rather than its base_type (which is i64 or i32 depending on the variable). This means we put a temporary that is of base_type == i64 into the i32 preallocated temporary pool. Eventually, this results in failures like this on 32-bit hosts: qemu-system-ppc64: tcg/tcg.c:515: tcg_temp_new_internal: Assertion `ts->base_type == type' failed. This patch makes the free routine use the base_type instead for the free case, so it's consistent with the temporary allocation. It fixes the above failure for me. Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1390146811-59936-1-git-send-email-agraf@suse.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
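The shape of the fix can be sketched with hypothetical types (this is not the QEMU code, just the idea): the pool a temp returns to must be selected by base_type, the same key used at allocation time.

    enum { POOL_I32, POOL_I64 };

    struct temp {
        int type;       /* per-register type: always POOL_I32 on a 32-bit host */
        int base_type;  /* what the caller allocated: POOL_I32 or POOL_I64     */
    };

    static int free_pool_index(const struct temp *ts)
    {
        return ts->base_type;   /* using ts->type here triggered the assertion */
    }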
2014-01-25  tcg/i386: cleanup useless #ifdef  (Aurelien Jarno, 1 file, -2/+0)
TCG_TARGET_HAS_movcond_i32 is always defined to 1 in tcg-target.h, so remove the corresponding #ifdef #endif sequence, left from a previous refactoring. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25  tcg/i386: use movbe instruction in qemu_ldst routines  (Aurelien Jarno, 1 file, -37/+80)
The movbe instruction has been added on some Intel Atom CPUs and on recent Intel Haswell CPUs. It allows a value to be loaded or stored and byte-swapped at the same time. This patch detects the availability of this instruction and, when available, uses it in the qemu load/store routines in place of load/store + bswap. Note that for 16-bit unsigned loads, movbe + movzw is basically the same as movzw + bswap, so the patch doesn't touch this case. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> [RTH: Reduced the number of conditionals using "movop".] Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25  tcg/i386: add support for three-byte opcodes  (Aurelien Jarno, 1 file, -8/+16)
Add support for three-byte opcodes, starting with the 0x0f 0x38 prefix. Use P_EXT38 as the new constant, and shift all other constants so that P_EXT and P_EXT38 have neighbouring values. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> [RTH: Changed the name from P_EXT2 to P_EXT38.] Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25  tcg/i386: remove hardcoded P_REXW value  (Aurelien Jarno, 1 file, -1/+1)
P_REXW is defined as a constant at the beginning of i386/tcg-target.c, but the corresponding bit is later used in a hardcoded way, which defeats the purpose of a constant. Fix that by using a conditional expression operator instead of a shift. On x86 this actually makes the code slightly smaller, as GCC in practice emits (opc >> 8) & 8 instead of (opc & 0x800) >> 8, so the constants are smaller to load. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
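The shape of the change, as a sketch (the 0x800 value follows from the commit message; the helper name is illustrative):

    #define P_REXW 0x800    /* opcode-word flag requesting a REX.W prefix */

    /* A conditional expression keeps the constant symbolic; the compiler
     * is free to lower it to the cheaper (opc >> 8) & 8 form itself. */
    static int rex_w_bit(int opc)
    {
        return (opc & P_REXW) ? 0x8 : 0;
    }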
2013-12-21  tcg/i386: fix a comment  (Aurelien Jarno, 1 file, -1/+1)
The comments apply to 8-bit stores, not 8-byte stores. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-12-10  tcg: Use bitmaps for free temporaries  (Richard Henderson, 2 files, -22/+21)
We previously allocated 32 bits per temp for the next_free_temp entry. We now allocate 4 bits per temp across the 4 bitmaps. Using a linked list meant that if a translator is tweaked, resulting in temps being freed in a different order, that would have follow-on effects throughout the TB. Always allocating the lowest free temp means that follow-on effects are minimized, which can make it easier to diff output when debugging the translators. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
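The "lowest free temp" behavior falls straight out of scanning the bitmap for the first set bit; a self-contained sketch with hypothetical names, not the QEMU implementation:

    /* Return the lowest-numbered free temp and mark it in use, or -1 if
     * none is free.  Deterministic ordering keeps regenerated code stable
     * when a translator frees temps in a different order. */
    static int alloc_lowest_free(unsigned long *bitmap, int nbits)
    {
        for (int i = 0; i < nbits; i++) {
            unsigned long *word = &bitmap[i / (8 * sizeof(unsigned long))];
            unsigned long bit = 1UL << (i % (8 * sizeof(unsigned long)));
            if (*word & bit) {
                *word &= ~bit;
                return i;
            }
        }
        return -1;
    }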
2013-11-30  tcg-s390: Use qemu_getauxval in query_facilities  (Richard Henderson, 1 file, -83/+12)
No need to set up a SIGILL signal handler for detection anymore. Remove a ton of sanity checks that must be true, given that we're requiring a 64-bit build (the note about 31-bit KVM is satisfied by configuring with TCI). Signed-off-by: Richard Henderson <rth@twiddle.net>