Message-ID: <19990601135627.07936@atrey.karlin.mff.cuni.cz>
Date: Tue, 1 Jun 1999 13:56:27 +0200
From: Jan Hubicka
To: pgcc AT delorie DOT com, pcg AT goof DOT com
Subject: Random i386 patches.
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.84
Reply-To: pgcc AT delorie DOT com

Hi

About two weeks ago we talked about including my i386 patches in pgcc. I've finally found some time, so here is the first part. It contains a collection of my patches from the egcs mailing list that have not been included in egcs (yet?). So you might take a look at them...

I am just downloading the latest pgcc cvs tree and am going to update my decoder-scheduling (Pentium/PPro/K6) patches.

Thanks
Honza

Hi,

This test in force_to_mode seems to bypass some optimizations (such as removing redundant ANDs) when mask is not 0xffffffff.

Thu May 27 12:26:43 MET DST 1999  Jan Hubicka

	* combine.c (force_to_mode): Try to simplify when mask is not full.

*** combine.old	Thu May 27 12:25:27 1999
--- combine.c	Thu May 27 12:25:35 1999
*************** force_to_mode (x, mode, mask, reg, just_
*** 6289,6297 ****
        && (GET_MODE_MASK (GET_MODE (x)) & ~ mask) == 0)
      return gen_lowpart_for_combine (mode, x);
  
!   /* If we aren't changing the mode, X is not a SUBREG, and all zero bits in
!      MASK are already known to be zero in X, we need not do anything.  */
!   if (GET_MODE (x) == mode && code != SUBREG && (~ mask & nonzero) == 0)
      return x;
  
    switch (code)
--- 6289,6297 ----
        && (GET_MODE_MASK (GET_MODE (x)) & ~ mask) == 0)
      return gen_lowpart_for_combine (mode, x);
  
!   /* If we aren't changing the mode, X is not a SUBREG, and mask is full,
!      we need not do anything.  */
!   if (GET_MODE (x) == mode && code != SUBREG && ~ mask == 0)
      return x;
  
    switch (code)

Hi

Here is the unified patch for the push usage and scheduler definitions. Because it is quite long, I don't expect it to be accepted without problems, and because I am leaving, please be so kind as to make the modifications yourselves if the problems are not too deep.
I've removed have_esp_bypass and replaced it with an attribute. I've also added the new types push, pop, call and ret, because these instructions are so special that we will need these types anyway. I now use them to set have_esp_bypass. The reason I still use the attribute here is that I believe it is the cleaner choice, and it also lets me disable the bypass for the ret instruction, which is an exception to this rule.

This patch also adds a new pattern, prologue_set_stack_ptr_4, that sets the correct attribute and uses push instead of sub. I've also introduced a new function, i386_probably_constant_reg, that returns the register most likely to be unused in the function, so this register is used for dummy pushes to reduce dependencies. Also I've added support to prologue_set_stack_ptr and the subsi pattern for outputting a shorter add when esp is decremented by 128.

Sun Apr 18 06:14:11 CEST 1999  Jan Hubicka

	* i386.c (agi_dependent): Use have_esp_bypass attribute.
	(reg_mentioned_in_mem): Use lea attribute to determine AGI stalls in
	arithmetic insns that output lea.
	(x86_adjust_cost): Handle REG_DEP_OUTPUT and REG_DEP_ANTI link notes,
	use have_esp_bypass attribute.
	(i386_probably_constant_reg): New function.
	* i386.h (i386_probably_constant_reg): Declare it.
	* i386.md (have_esp_bypass): New attribute.
	(type attribute): New types push, pop, ret and call.
	(memory attribute): Push and call write memory, pop and ret read.
	(scheduling definitions): Support new types.
	(push patterns): Use new push type, do not set memory attribute
	manually.
	(pop pattern): Likewise.
	(call and ret patterns): Likewise.
	(ashl and add patterns): Set lea attribute when lea instruction is
	used.
	(addsi): Use push to decrement esp by 4, new splitter.
	(subsi): Likewise.
	(prologue_set_stack_ptr_4): New pattern.
	(prologue_set_stack_ptr): Use add to decrement esp by 128.
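
A note on the 128 case, since it may look arbitrary. What follows is only a minimal sketch in C, assuming the usual i386 encoding in which an add/sub immediate is stored either as a sign-extended byte (range -128..127) or as a full 32-bit word. The function names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for the immediate of an i386 add/sub instruction:
   1 if the value fits the sign-extended 8-bit form, otherwise 4.  */
static int
imm_size (int32_t value)
{
  return (value >= -128 && value <= 127) ? 1 : 4;
}

/* Length of "op $imm,%esp" for op = add or sub: one opcode byte,
   one ModRM byte, then the immediate.  */
static int
esp_adjust_length (int32_t imm)
{
  return 2 + imm_size (imm);
}

/* Hypothetical helper (not in the patch): nonzero when decrementing
   esp by BYTES is shorter as "add $-BYTES" than as "sub $BYTES".  */
static int
add_negated_is_shorter (int32_t bytes)
{
  assert (bytes > 0);
  return esp_adjust_length (-bytes) < esp_adjust_length (bytes);
}
```

Under these assumptions, add_negated_is_shorter (128) is true (3 bytes against 6), while for every other positive amount both forms have the same length, which is why only 128 gets special treatment.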
*** i386.h.o Fri Apr 16 22:51:42 1999 --- i386.h Sat Apr 17 21:19:03 1999 *************** extern int ix86_can_use_return_insn_p () *** 2775,2780 **** --- 2776,2782 ---- extern int small_shift_operand (); extern char *output_ashl (); extern int memory_address_info (); + extern struct rtx_def *i386_probably_constant_reg (); #ifdef NOTYET extern struct rtx_def *copy_all_rtx (); *** i386.c.old3 Thu Apr 15 05:36:17 1999 --- i386.c Sun Apr 18 05:55:22 1999 *************** int *** 5097,5131 **** agi_dependent (insn, dep_insn) rtx insn, dep_insn; { ! int push = 0, push_dep = 0; if (GET_CODE (dep_insn) == INSN && GET_CODE (PATTERN (dep_insn)) == SET && GET_CODE (SET_DEST (PATTERN (dep_insn))) == REG && reg_mentioned_in_mem (SET_DEST (PATTERN (dep_insn)), insn)) return 1; ! if (GET_CODE (insn) == INSN && GET_CODE (PATTERN (insn)) == SET ! && GET_CODE (SET_DEST (PATTERN (insn))) == MEM ! && push_operand (SET_DEST (PATTERN (insn)), ! GET_MODE (SET_DEST (PATTERN (insn))))) ! push = 1; ! ! if (GET_CODE (dep_insn) == INSN && GET_CODE (PATTERN (dep_insn)) == SET ! && GET_CODE (SET_DEST (PATTERN (dep_insn))) == MEM ! && push_operand (SET_DEST (PATTERN (dep_insn)), ! GET_MODE (SET_DEST (PATTERN (dep_insn))))) ! push_dep = 1; ! /* CPUs contain special hardware to allow two pushes. */ ! if (push && push_dep) return 0; ! /* Push operation implicitly change stack pointer causing AGI stalls. */ ! if (push_dep && reg_mentioned_in_mem (stack_pointer_rtx, insn)) ! return 1; ! ! /* Push also implicitly read stack pointer. */ ! if (push && modified_in_p (stack_pointer_rtx, dep_insn)) return 1; return 0; --- 5124,5146 ---- agi_dependent (insn, dep_insn) rtx insn, dep_insn; { ! int bypassed_esp = 0, bypassed_esp_dep = 0; if (GET_CODE (dep_insn) == INSN && GET_CODE (PATTERN (dep_insn)) == SET && GET_CODE (SET_DEST (PATTERN (dep_insn))) == REG && reg_mentioned_in_mem (SET_DEST (PATTERN (dep_insn)), insn)) return 1; ! bypassed_esp = recog_memoized (insn) >= 0 && get_attr_have_esp_bypass (insn); ! 
bypassed_esp_dep = recog_memoized (dep_insn) >= 0 ! && get_attr_have_esp_bypass (dep_insn); ! /* CPUs contain special hardware to allow two bypassed instructions. */ ! if ((bypassed_esp && bypassed_esp_dep)) return 0; ! /* Push pop and call also implicitly read stack pointer. */ ! if (bypassed_esp && modified_in_p (stack_pointer_rtx, dep_insn)) return 1; return 0; *************** reg_mentioned_in_mem (reg, rtl) *** 5163,5168 **** --- 5178,5201 ---- break; } + /* The lea have register mentined in memory parameter even in cases + where it don't appears to be in RTL representation. We detect this + instruction by using insn attribute TYPE and expect that the pattern is + single set where all registers in source will be used as memory parameter. + It is true about all lea insn patterns in md file. But we will have + to take care and keep this in the sync. */ + + if (reload_completed && code == INSN + && recog_memoized (rtl) >= 0 + && get_attr_type (rtl) == TYPE_LEA) + { + /* Simple check to avoud desynchronization with MD file. */ + if (GET_CODE (PATTERN (rtl)) != SET) + abort(); + if (reg_mentioned_p (reg, SET_SRC (PATTERN (rtl)))) + return 1; + } + if (code == MEM && reg_mentioned_p (reg, rtl)) return 1; *************** x86_adjust_cost (insn, link, dep_insn, c *** 5463,5481 **** { rtx next_inst; - if (GET_CODE (dep_insn) == CALL_INSN || GET_CODE (insn) == JUMP_INSN) - return 0; - - if (GET_CODE (dep_insn) == INSN - && GET_CODE (PATTERN (dep_insn)) == SET - && GET_CODE (SET_DEST (PATTERN (dep_insn))) == REG - && GET_CODE (insn) == INSN - && GET_CODE (PATTERN (insn)) == SET - && !reg_overlap_mentioned_p (SET_DEST (PATTERN (dep_insn)), - SET_SRC (PATTERN (insn)))) - return 0; /* ??? */ - - switch (ix86_cpu) { case PROCESSOR_PENTIUM: --- 5496,5501 ---- *************** x86_adjust_cost (insn, link, dep_insn, c *** 5483,5488 **** --- 5503,5515 ---- && !is_fp_dest (dep_insn)) return 0; + /* Two instruction with special bypass are pairable. 
*/ + if (recog_memoized (insn) >= 0 + && recog_memoized (dep_insn) >= 0 + && get_attr_have_esp_bypass (insn) + && get_attr_have_esp_bypass (dep_insn)) + return -1; + if (agi_dependent (insn, dep_insn)) return cost ? cost + 1 : 2; *************** x86_adjust_cost (insn, link, dep_insn, c *** 5493,5506 **** && GET_CODE (next_inst) == JUMP_INSN) /* compare probably paired with jump */ return 0; ! /* Stores stalls one cycle longer than other insns. */ ! if (is_fp_insn (insn) && cost && is_fp_store (dep_insn)) cost++; - break; case PROCESSOR_K6: default: if (!is_fp_dest (dep_insn)) { if(!agi_dependent (insn, dep_insn)) --- 5520,5547 ---- && GET_CODE (next_inst) == JUMP_INSN) /* compare probably paired with jump */ return 0; ! /* Anti dependencies are pairable. */ ! if (REG_NOTE_KIND (link) == REG_DEP_ANTI) ! return -1; ! ! /* Integer Output dependency disables pairing, but don't ! affect anything else. */ ! if (REG_NOTE_KIND (link) == REG_DEP_OUTPUT ! && !is_fp_insn (insn) && !is_fp_insn (dep_insn)) ! return 1; ! ! /* FP stores stalls one cycle longer than other insns. */ ! if (is_fp_insn (insn) && is_fp_store (dep_insn) ! && !REG_NOTE_KIND (link)) cost++; + break; case PROCESSOR_K6: default: + if (GET_CODE (dep_insn) == CALL_INSN || GET_CODE (insn) == JUMP_INSN) + return 0; + if (!is_fp_dest (dep_insn)) { if(!agi_dependent (insn, dep_insn)) *************** x86_adjust_cost (insn, link, dep_insn, c *** 5508,5513 **** --- 5549,5558 ---- if (TARGET_486) return 2; } + /* Output and Anti dependencies have no cost. */ + else + if (REG_NOTE_KIND (link)) + return 0; else if (is_fp_store (insn) && is_fp_insn (dep_insn) && NEXT_INSN (insn) && NEXT_INSN (NEXT_INSN (insn)) *************** memory_address_info (addr, disp_length) *** 5798,5800 **** --- 5843,5857 ---- return len; } + + /* Return register that with highest probability to be constant in this + function. This function can be used by insn patterns that needs to read + dummy register to avoid dependencies. 
*/ + rtx + i386_probably_constant_reg () + { + if (flag_pic) + return pic_offset_table_rtx; + + return frame_pointer_rtx; + } *** i386.md.old Thu Apr 15 07:12:47 1999 --- i386.md Sat Apr 17 22:50:43 1999 *************** *** 72,80 **** ;; to i386.h at the same time. (define_attr "type" ! "integer,binary,memory,test,compare,fcompare,idiv,imul,lea,fld,fpop,fpdiv,fpmul" (const_string "integer")) (define_attr "memory" "none,load,store" (cond [(eq_attr "type" "idiv,lea") (const_string "none") --- 72,88 ---- ;; to i386.h at the same time. (define_attr "type" ! "integer,binary,memory,test,compare,fcompare,idiv,imul,lea,fld,fpop,fpdiv,fpmul,push,pop,call,ret" (const_string "integer")) + ;; True for instruction with esp bypass on Pentium (call, ret, push and pop) + (define_attr "have_esp_bypass" + "false,true" + (cond [(eq_attr "type" "push,pop,call,ret") + (const_string "true")] + (const_string "false"))) + + (define_attr "memory" "none,load,store" (cond [(eq_attr "type" "idiv,lea") (const_string "none") *************** *** 82,87 **** --- 90,101 ---- (eq_attr "type" "fld") (const_string "load") + (eq_attr "type" "push,call") + (const_string "store") + + (eq_attr "type" "pop,ret") + (const_string "load") + (eq_attr "type" "test") (if_then_else (match_operand 0 "memory_operand" "") (const_string "load") *************** *** 135,141 **** (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) ! 7 0) (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentiumpro")) --- 149,159 ---- (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) ! 3 0) ! ! (define_function_unit "fpmul" 1 0 ! (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) ! 2 2) (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentiumpro")) *************** *** 165,176 **** ;; i386 and i486 have one integer unit, which need not be modeled (define_function_unit "integer" 2 0 ! 
(and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium,pentiumpro")) 1 0) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") ! (and (eq_attr "type" "integer,binary,test,compare") (eq_attr "memory" "!load"))) 1 0) --- 183,194 ---- ;; i386 and i486 have one integer unit, which need not be modeled (define_function_unit "integer" 2 0 ! (and (eq_attr "type" "integer,memory,push,pop,call,ret,binary,test,compare,lea") (eq_attr "cpu" "pentium,pentiumpro")) 1 0) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") ! (and (eq_attr "type" "integer,memory,push,pop,call,ret,binary,test,compare") (eq_attr "memory" "!load"))) 1 0) *************** *** 178,191 **** ;; and a register operation (1 cycle). (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") ! (and (eq_attr "type" "integer,binary,test,compare") (eq_attr "memory" "load"))) 3 0) ;; Multiplies use one of the integer units (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul")) ! 11 11) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "imul")) --- 196,213 ---- ;; and a register operation (1 cycle). (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") ! (and (eq_attr "type" "integer,memory,push,pop,call,ret,binary,test,compare") (eq_attr "memory" "load"))) 3 0) ;; Multiplies use one of the integer units (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul")) ! 1 1) ! ! (define_function_unit "fpmul" 1 0 ! (and (eq_attr "type" "imul") (eq_attr "cpu" "pentium")) ! 1 1) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "imul")) *************** *** 193,199 **** (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv")) ! 
25 25) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "idiv")) --- 215,221 ---- (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv")) ! 1 1) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "idiv")) *************** *** 892,905 **** (match_operand:SI 1 "nonmemory_operand" "rn"))] "flag_pic" "* return AS1 (push%L0,%1);" ! [(set_attr "memory" "store")]) (define_insn "" [(set (match_operand:SI 0 "push_operand" "=<") (match_operand:SI 1 "nonmemory_operand" "ri"))] "!flag_pic" "* return AS1 (push%L0,%1);" ! [(set_attr "memory" "store")]) ;; On a 386, it is faster to push MEM directly. --- 914,927 ---- (match_operand:SI 1 "nonmemory_operand" "rn"))] "flag_pic" "* return AS1 (push%L0,%1);" ! [(set_attr "type" "push")]) (define_insn "" [(set (match_operand:SI 0 "push_operand" "=<") (match_operand:SI 1 "nonmemory_operand" "ri"))] "!flag_pic" "* return AS1 (push%L0,%1);" ! [(set_attr "type" "push")]) ;; On a 386, it is faster to push MEM directly. *************** *** 908,915 **** (match_operand:SI 1 "memory_operand" "m"))] "TARGET_PUSH_MEMORY" "* return AS1 (push%L0,%1);" ! [(set_attr "type" "memory") ! (set_attr "memory" "load")]) ;; General case of fullword move. --- 930,936 ---- (match_operand:SI 1 "memory_operand" "m"))] "TARGET_PUSH_MEMORY" "* return AS1 (push%L0,%1);" ! [(set_attr "type" "push")]) ;; General case of fullword move. *************** *** 1019,1034 **** (match_operand:HI 1 "nonmemory_operand" "ri"))] "" "* return AS1 (push%W0,%1);" ! [(set_attr "type" "memory") ! (set_attr "memory" "store")]) (define_insn "" [(set (match_operand:HI 0 "push_operand" "=<") (match_operand:HI 1 "memory_operand" "m"))] "TARGET_PUSH_MEMORY" "* return AS1 (push%W0,%1);" ! [(set_attr "type" "memory") ! (set_attr "memory" "load")]) ;; On i486, an incl and movl are both faster than incw and movw. 
--- 1040,1053 ---- (match_operand:HI 1 "nonmemory_operand" "ri"))] "" "* return AS1 (push%W0,%1);" ! [(set_attr "type" "push")]) (define_insn "" [(set (match_operand:HI 0 "push_operand" "=<") (match_operand:HI 1 "memory_operand" "m"))] "TARGET_PUSH_MEMORY" "* return AS1 (push%W0,%1);" ! [(set_attr "type" "push")]) ;; On i486, an incl and movl are both faster than incw and movw. *************** *** 1154,1160 **** [(set (match_operand:QI 0 "push_operand" "=<") (match_operand:QI 1 "const_int_operand" "n"))] "" ! "* return AS1(push%W0,%1);") (define_insn "" [(set (match_operand:QI 0 "push_operand" "=<") --- 1173,1180 ---- [(set (match_operand:QI 0 "push_operand" "=<") (match_operand:QI 1 "const_int_operand" "n"))] "" ! "* return AS1(push%W0,%1);" ! [(set_attr "type" "push")]) (define_insn "" [(set (match_operand:QI 0 "push_operand" "=<") *************** *** 1164,1170 **** { operands[1] = gen_rtx_REG (HImode, REGNO (operands[1])); return AS1 (push%W0,%1); ! }") ;; On i486, incb reg is faster than movb $1,reg. --- 1184,1191 ---- { operands[1] = gen_rtx_REG (HImode, REGNO (operands[1])); return AS1 (push%W0,%1); ! }" ! [(set_attr "type" "push")]) ;; On i486, incb reg is faster than movb $1,reg. *************** *** 1284,1291 **** }") (define_insn "movsf_push" ! [(set (match_operand:SF 0 "push_operand" "=<,<") ! (match_operand:SF 1 "general_operand" "*rfF,m"))] "TARGET_PUSH_MEMORY || GET_CODE (operands[1]) != MEM || reload_in_progress || reload_completed" "* --- 1305,1312 ---- }") (define_insn "movsf_push" ! [(set (match_operand:SF 0 "push_operand" "=<,<,<") ! (match_operand:SF 1 "general_operand" "*fF,*r,m"))] "TARGET_PUSH_MEMORY || GET_CODE (operands[1]) != MEM || reload_in_progress || reload_completed" "* *************** *** 1312,1318 **** } return AS1 (push%L0,%1); ! }") (define_split [(set (match_operand:SF 0 "push_operand" "") --- 1333,1340 ---- } return AS1 (push%L0,%1); ! }" ! 
[(set_attr "type" "fld,push,push")]) (define_split [(set (match_operand:SF 0 "push_operand" "") *************** *** 1411,1418 **** (define_insn "movdf_push" ! [(set (match_operand:DF 0 "push_operand" "=<,<") ! (match_operand:DF 1 "general_operand" "*rfF,o"))] "TARGET_PUSH_MEMORY || GET_CODE (operands[1]) != MEM || reload_in_progress || reload_completed" "* --- 1433,1440 ---- (define_insn "movdf_push" ! [(set (match_operand:DF 0 "push_operand" "=<,<,<") ! (match_operand:DF 1 "general_operand" "*fF,*r,o"))] "TARGET_PUSH_MEMORY || GET_CODE (operands[1]) != MEM || reload_in_progress || reload_completed" "* *************** *** 1439,1445 **** return output_move_pushmem (operands, insn, GET_MODE_SIZE (DFmode), 0, 0); return output_move_double (operands); ! }") (define_split [(set (match_operand:DF 0 "push_operand" "") --- 1461,1468 ---- return output_move_pushmem (operands, insn, GET_MODE_SIZE (DFmode), 0, 0); return output_move_double (operands); ! }" ! [(set_attr "type" "fld,push,push")]) (define_split [(set (match_operand:DF 0 "push_operand" "") *************** *** 1540,1547 **** }") (define_insn "movxf_push" ! [(set (match_operand:XF 0 "push_operand" "=<,<") ! (match_operand:XF 1 "general_operand" "*rfF,o"))] "TARGET_PUSH_MEMORY || GET_CODE (operands[1]) != MEM || reload_in_progress || reload_completed" "* --- 1563,1570 ---- }") (define_insn "movxf_push" ! [(set (match_operand:XF 0 "push_operand" "=<,<,<") ! (match_operand:XF 1 "general_operand" "*fF,*r,o"))] "TARGET_PUSH_MEMORY || GET_CODE (operands[1]) != MEM || reload_in_progress || reload_completed" "* *************** *** 1567,1573 **** return output_move_pushmem (operands, insn, GET_MODE_SIZE (XFmode), 0, 0); return output_move_double (operands); ! }") (define_split [(set (match_operand:XF 0 "push_operand" "") --- 1590,1597 ---- return output_move_pushmem (operands, insn, GET_MODE_SIZE (XFmode), 0, 0); return output_move_double (operands); ! }" ! 
[(set_attr "type" "fld,push,push")]) (define_split [(set (match_operand:XF 0 "push_operand" "") *************** *** 3323,3328 **** --- 3347,3365 ---- "" "IX86_EXPAND_BINARY_OPERATOR (PLUS, SImode, operands);") + ;; Decrementing of stack pointer by 4 is better to be done by push on most + ;; CPUs. It is shorter and have ESP bypass on Pentium resulting in much + ;; easier scheduling of prologues. + ;; Also it is safe, because GCC never put data outside allocated stack. + (define_split + [(set (reg:SI 7) + (plus:SI (reg:SI 7) + (const_int -4)))] + "" + [(set (mem:SI (pre_dec:SI (reg:SI 7))) + (match_dup 0))] + "operands[0] = i386_probably_constant_reg ();") + (define_insn "" [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r") (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r") *************** *** 3330,3335 **** --- 3367,3380 ---- "ix86_binary_operator_ok (PLUS, SImode, operands)" "* { + /* Do the same translating as the split above for non-scheduling + compilation. */ + if (operands[0] == stack_pointer_rtx && operands[1] == stack_pointer_rtx + && GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) == -4) + { + operands[0] = gen_rtx (REG, SImode, 0); + return AS1 (push%L0,%0); + } if (REG_P (operands[0]) && REG_P (operands[1]) && (REG_P (operands[2]) || CONSTANT_P (operands[2])) && REGNO (operands[0]) != REGNO (operands[1])) *************** *** 3377,3383 **** return AS2 (add%L0,%2,%0); }" ! [(set_attr "type" "binary")]) ;; addsi3 is faster, so put this after. --- 3422,3428 ---- return AS2 (add%L0,%2,%0); }" ! [(set_attr "type" "binary,binary,lea")]) ;; addsi3 is faster, so put this after. *************** *** 3496,3502 **** return AS2 (add%W0,%2,%0); }" ! [(set_attr "type" "binary")]) (define_expand "addqi3" [(set (match_operand:QI 0 "general_operand" "") --- 3541,3547 ---- return AS2 (add%W0,%2,%0); }" ! 
[(set_attr "type" "binary,binary,lea")]) (define_expand "addqi3" [(set (match_operand:QI 0 "general_operand" "") *************** *** 3541,3547 **** return AS2 (add%B0,%2,%0); }" ! [(set_attr "type" "binary")]) ;Lennart Augustsson ;says this pattern just makes slower code: --- 3586,3592 ---- return AS2 (add%B0,%2,%0); }" ! [(set_attr "type" "binary,binary,lea")]) ;Lennart Augustsson ;says this pattern just makes slower code: *************** *** 3724,3735 **** "" "IX86_EXPAND_BINARY_OPERATOR (MINUS, SImode, operands);") (define_insn "" [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r") (minus:SI (match_operand:SI 1 "nonimmediate_operand" "0,0") (match_operand:SI 2 "general_operand" "ri,rm")))] "ix86_binary_operator_ok (MINUS, SImode, operands)" ! "* return AS2 (sub%L0,%2,%0);" [(set_attr "type" "binary")]) (define_expand "subhi3" --- 3769,3811 ---- "" "IX86_EXPAND_BINARY_OPERATOR (MINUS, SImode, operands);") + ;; Decrementing of stack pointer by 4 is better to be done by push on most + ;; CPUs. It is shorter and have ESP bypass on Pentium. + ;; ??? For some purpose this non-canonical form is generated. I am unable + ;; to figure out where this happens, so for now just handle it correctly + ;; here. + (define_split + [(set (reg:SI 7) + (minus:SI (reg:SI 7) + (const_int 4)))] + "" + [(set (mem:SI (pre_dec:SI (reg:SI 7))) + (match_dup 0))] + "operands[0] = i386_probably_constant_reg ();") + (define_insn "" [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r") (minus:SI (match_operand:SI 1 "nonimmediate_operand" "0,0") (match_operand:SI 2 "general_operand" "ri,rm")))] "ix86_binary_operator_ok (MINUS, SImode, operands)" ! "* ! { ! if (operands[0] == stack_pointer_rtx && operands[1] == stack_pointer_rtx ! && GET_CODE (operands[2]) == CONST_INT) ! { ! if (INTVAL (operands[2]) == 4) ! { ! operands[0] = gen_rtx (REG, SImode, 0); ! return AS1 (push%L0,%0); ! } ! if (INTVAL (operands[2]) == 128) ! { ! operands[2] = GEN_INT (-128); ! return AS2 (add%L0,%2,%0); ! } ! 
} ! return AS2 (sub%L0,%2,%0); ! }" [(set_attr "type" "binary")]) (define_expand "subhi3" *************** byte_xor_operation: *** 5142,5148 **** (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,r") (match_operand:SI 2 "small_shift_operand" "M,M")))] "! optimize_size" ! "* return output_ashl (insn, operands);") ;; Generic left shift pattern to catch all cases not handled by the ;; shift pattern above. --- 5218,5225 ---- (ashift:SI (match_operand:SI 1 "nonimmediate_operand" "0,r") (match_operand:SI 2 "small_shift_operand" "M,M")))] "! optimize_size" ! "* return output_ashl (insn, operands);" ! [(set_attr "type" "*,lea")]) ;; Generic left shift pattern to catch all cases not handled by the ;; shift pattern above. *************** byte_xor_operation: *** 5158,5164 **** (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,r") (match_operand:HI 2 "small_shift_operand" "M,M")))] "! optimize_size" ! "* return output_ashl (insn, operands);") (define_insn "" [(set (match_operand:HI 0 "nonimmediate_operand" "=rm") --- 5235,5242 ---- (ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,r") (match_operand:HI 2 "small_shift_operand" "M,M")))] "! optimize_size" ! "* return output_ashl (insn, operands);" ! [(set_attr "type" "*,lea")]) (define_insn "" [(set (match_operand:HI 0 "nonimmediate_operand" "=rm") *************** byte_xor_operation: *** 5172,5178 **** (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,q") (match_operand:QI 2 "small_shift_operand" "M,M")))] "! optimize_size" ! "* return output_ashl (insn, operands);") ;; Generic left shift pattern to catch all cases not handled by the ;; shift pattern above. --- 5250,5257 ---- (ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,q") (match_operand:QI 2 "small_shift_operand" "M,M")))] "! optimize_size" ! "* return output_ashl (insn, operands);" ! [(set_attr "type" "*,lea")]) ;; Generic left shift pattern to catch all cases not handled by the ;; shift pattern above. 
*************** byte_xor_operation: *** 6569,6575 **** } else return AS1 (call,%P0); ! }") (define_insn "" [(call (mem:QI (match_operand:SI 0 "symbolic_operand" "")) --- 6648,6655 ---- } else return AS1 (call,%P0); ! }" ! [(set_attr "type" "call")]) (define_insn "" [(call (mem:QI (match_operand:SI 0 "symbolic_operand" "")) *************** byte_xor_operation: *** 6577,6583 **** (set (reg:SI 7) (plus:SI (reg:SI 7) (match_operand:SI 3 "immediate_operand" "i")))] "!HALF_PIC_P ()" ! "call %P0") (define_expand "call" [(call (match_operand:QI 0 "indirect_operand" "") --- 6657,6664 ---- (set (reg:SI 7) (plus:SI (reg:SI 7) (match_operand:SI 3 "immediate_operand" "i")))] "!HALF_PIC_P ()" ! "call %P0" ! [(set_attr "type" "call")]) (define_expand "call" [(call (match_operand:QI 0 "indirect_operand" "") *************** byte_xor_operation: *** 6617,6630 **** } else return AS1 (call,%P0); ! }") (define_insn "" [(call (mem:QI (match_operand:SI 0 "symbolic_operand" "")) (match_operand:SI 1 "general_operand" "g"))] ;; Operand 1 not used on the i386. "!HALF_PIC_P ()" ! "call %P0") ;; Call subroutine, returning value in operand 0 ;; (which must be a hard register). --- 6698,6713 ---- } else return AS1 (call,%P0); ! }" ! [(set_attr "type" "call")]) (define_insn "" [(call (mem:QI (match_operand:SI 0 "symbolic_operand" "")) (match_operand:SI 1 "general_operand" "g"))] ;; Operand 1 not used on the i386. "!HALF_PIC_P ()" ! "call %P0" ! [(set_attr "type" "call")]) ;; Call subroutine, returning value in operand 0 ;; (which must be a hard register). *************** byte_xor_operation: *** 6680,6686 **** output_asm_insn (AS1 (call,%P1), operands); RET; ! }") (define_insn "" [(set (match_operand 0 "" "=rf") --- 6763,6770 ---- output_asm_insn (AS1 (call,%P1), operands); RET; ! }" ! 
[(set_attr "type" "call")]) (define_insn "" [(set (match_operand 0 "" "=rf") *************** byte_xor_operation: *** 6689,6695 **** (set (reg:SI 7) (plus:SI (reg:SI 7) (match_operand:SI 4 "immediate_operand" "i")))] "!HALF_PIC_P ()" ! "call %P1") (define_expand "call_value" [(set (match_operand 0 "" "") --- 6773,6780 ---- (set (reg:SI 7) (plus:SI (reg:SI 7) (match_operand:SI 4 "immediate_operand" "i")))] "!HALF_PIC_P ()" ! "call %P1" ! [(set_attr "type" "call")]) (define_expand "call_value" [(set (match_operand 0 "" "") *************** byte_xor_operation: *** 6733,6739 **** output_asm_insn (AS1 (call,%P1), operands); RET; ! }") (define_insn "" [(set (match_operand 0 "" "=rf") --- 6818,6825 ---- output_asm_insn (AS1 (call,%P1), operands); RET; ! }" ! [(set_attr "type" "call")]) (define_insn "" [(set (match_operand 0 "" "=rf") *************** byte_xor_operation: *** 6741,6747 **** (match_operand:SI 2 "general_operand" "g")))] ;; Operand 2 not used on the i386. "!HALF_PIC_P ()" ! "call %P1") ;; Call subroutine returning any type. --- 6827,6834 ---- (match_operand:SI 2 "general_operand" "g")))] ;; Operand 2 not used on the i386. "!HALF_PIC_P ()" ! "call %P1" ! [(set_attr "type" "call")]) ;; Call subroutine returning any type. *************** byte_xor_operation: *** 6802,6815 **** [(return)] "reload_completed" "ret" ! [(set_attr "memory" "none")]) (define_insn "return_pop_internal" [(return) (use (match_operand:SI 0 "const_int_operand" ""))] "reload_completed" "ret %0" ! [(set_attr "memory" "none")]) (define_insn "nop" [(const_int 0)] --- 6889,6903 ---- [(return)] "reload_completed" "ret" ! [(set_attr "type" "ret")]) (define_insn "return_pop_internal" [(return) (use (match_operand:SI 0 "const_int_operand" ""))] "reload_completed" "ret %0" ! [(set_attr "type" "ret") ! 
(set_attr "have_esp_bypass" "false")]) (define_insn "nop" [(const_int 0)] *************** byte_xor_operation: *** 6829,6834 **** --- 6917,6943 ---- ;; The use of UNSPEC here is currently not necessary - a simple SET of ebp ;; to itself would be enough. But this way we are safe even if some optimizer ;; becomes too clever in the future. + (define_insn "prologue_set_stack_ptr_4" + [(set (reg:SI 7) + (minus:SI (reg:SI 7) (const_int 4))) + (set (reg:SI 6) (unspec:SI [(reg:SI 6)] 4))] + "" + "* + { + rtx xops [2]; + + /* We don't use i386_probably_constant_reg here, because it often return + frame pointer. The only place where it is changed is right before + this insn, it is better to choose some other register. Ebx seems to be + good choice. Other choice that looks resonable is esp. I've tested it + and it runs considerably slower on Pentium CPU. */ + + operands[0] = pic_offset_table_rtx; + return AS1 (push%L0,%0); + RET; + }" + [(set_attr "type" "push")]) + (define_insn "prologue_set_stack_ptr" [(set (reg:SI 7) (minus:SI (reg:SI 7) (match_operand:SI 0 "immediate_operand" "i"))) *************** byte_xor_operation: *** 6840,6845 **** --- 6949,6960 ---- xops[0] = operands[0]; xops[1] = stack_pointer_rtx; + if (INTVAL (operands[0]) == 128) + { + xops[0] = GEN_INT (-128); + output_asm_insn (AS2 (add%L1,%0,%1), xops); + RET; + } output_asm_insn (AS2 (sub%L1,%0,%1), xops); RET; }" *************** byte_xor_operation: *** 6897,6903 **** output_asm_insn (\"addl $_GLOBAL_OFFSET_TABLE_+[.-%X1],%0\", operands); RET; }" ! [(set_attr "memory" "none")]) (define_expand "epilogue" [(const_int 1)] --- 7012,7018 ---- output_asm_insn (\"addl $_GLOBAL_OFFSET_TABLE_+[.-%X1],%0\", operands); RET; }" ! [(set_attr "type" "call")]) (define_expand "epilogue" [(const_int 1)] *************** byte_xor_operation: *** 6941,6947 **** output_asm_insn (AS1 (pop%L0,%P0), operands); RET; }" ! 
[(set_attr "memory" "load")]) (define_expand "movstrsi" [(parallel [(set (match_operand:BLK 0 "memory_operand" "") --- 7056,7062 ---- output_asm_insn (AS1 (pop%L0,%P0), operands); RET; }" ! [(set_attr "type" "pop")]) (define_expand "movstrsi" [(parallel [(set (match_operand:BLK 0 "memory_operand" "")

Hi

This patch tells alias analysis that memory areas referenced through the stack pointer differ from those referenced through the arg pointer and the base pointer. It removes an extra dependency in prologues, so the code that reads function parameters can actually be scheduled in among the prologue's pushes. Needless to say, this brings a huge performance improvement for prologues/epilogues on i386, where we currently have two AGI stalls.

Note that I was not able to find an example where the assumption is wrong, but I really don't believe things are SO simple. Also, I will not be able to update this patch, because I am leaving tomorrow, but I hope that someone who actually understands the aliasing code will take this as a hint and implement it correctly.

Honza

Sun Apr 18 05:59:28 CEST 1999  Jan Hubicka

	* alias.c (memrefs_conflict_p): Memory areas referenced by stack
	pointer, frame pointer and arg pointer are distinct.

*** alias.old Sat Apr 17 22:39:52 1999
--- alias.c Sun Apr 18 05:40:24 1999
*************** memrefs_conflict_p (xsize, x, ysize, y,
*** 951,956 ****
--- 951,970 ----
return memrefs_conflict_p (xsize, x0, ysize, y0, c); if (rtx_equal_for_memref_p (x0, y0)) return memrefs_conflict_p (xsize, x1, ysize, y1, c); + + /* Gcc never reference arguments in other way, that using arg + pointer. */ + if (x0 == arg_pointer_rtx || y0 == arg_pointer_rtx) + return 0; + + /* Gcc never reference same memory by both frame pointer and stack + pointer references. 
*/ + if (frame_pointer_needed + && (((x0 == frame_pointer_rtx || x0 == hard_frame_pointer_rtx) + && y0 == stack_pointer_rtx) + || (x0 == stack_pointer_rtx && (y0 == hard_frame_pointer_rtx + || y0 == frame_pointer_rtx)))) + return 0; if (GET_CODE (x1) == CONST_INT) { if (GET_CODE (y1) == CONST_INT) *************** memrefs_conflict_p (xsize, x, ysize, y, *** 965,970 **** --- 979,998 ---- return 1; } + /* Gcc never reference arguments in other way, that using arg + pointer. */ + else if ((x0 == arg_pointer_rtx && y != arg_pointer_rtx) + || (y == arg_pointer_rtx && x0 != arg_pointer_rtx)) + return 0; + + /* Gcc never reference same memory by both frame pointer and stack + pointer references. */ + else if (frame_pointer_needed + && (((x0 == frame_pointer_rtx || x0 == hard_frame_pointer_rtx) + && y == stack_pointer_rtx) + || (x0 == stack_pointer_rtx && (y == hard_frame_pointer_rtx + || y == frame_pointer_rtx)))) + return 0; else if (GET_CODE (x1) == CONST_INT) return memrefs_conflict_p (xsize, x0, ysize, y, c - INTVAL (x1)); } *************** memrefs_conflict_p (xsize, x, ysize, y, *** 975,986 **** --- 1003,1042 ---- rtx y0 = XEXP (y, 0); rtx y1 = XEXP (y, 1); + /* Gcc never reference arguments in other way, that using arg + pointer. */ + if ((x == arg_pointer_rtx && y0 != arg_pointer_rtx) + || (y0 == arg_pointer_rtx && x != arg_pointer_rtx)) + return 0; + + /* Gcc never reference same memory by both frame pointer and stack + pointer references. */ + if (frame_pointer_needed + && (((x == frame_pointer_rtx || x == hard_frame_pointer_rtx) + && y0 == stack_pointer_rtx) + || (x == stack_pointer_rtx && (y0 == hard_frame_pointer_rtx + || y0 == frame_pointer_rtx)))) + return 0; if (GET_CODE (y1) == CONST_INT) return memrefs_conflict_p (xsize, x, ysize, y0, c + INTVAL (y1)); else return 1; } + /* Gcc never reference arguments in other way, that using arg + pointer. 
*/ + if ((x == arg_pointer_rtx && y != arg_pointer_rtx) + || (y == arg_pointer_rtx && x != arg_pointer_rtx)) + return 0; + + /* Gcc never reference same memory by both frame pointer and stack + pointer references. */ + if (frame_pointer_needed + && (((x == frame_pointer_rtx || x == hard_frame_pointer_rtx) + && y == stack_pointer_rtx) + || (x == stack_pointer_rtx && (y == hard_frame_pointer_rtx + || y == frame_pointer_rtx)))) + return 0; if (GET_CODE (x) == GET_CODE (y)) switch (GET_CODE (x)) {

Hi

This patch sets LOCAL_ALIGNMENT to 32 for HImode values. Together with my assign_stack_local patches, it makes i386_aligned return 1 for all HImode values on the stack, which makes our prefix elimination much more aggressive. I also believe that the caches will prefer to have HImode values aligned.

I would make the same change for static variables as well, but I don't think it will work, because i386_aligned is probably not smart enough to handle it. If you have any idea how to extend it, let me know. (I would need some way to get the alignment and size of the object a symbol_ref points to. I didn't find any function for this...)

It would also be nice to have prefix elimination working for stores as well. If you have any idea how to figure out whether it is safe to overwrite the 2 bytes following the value, let me know. The problem is that structures can be saved on the stack, and they are not aligned in this way.

Honza

Fri Apr 16 01:46:25 CEST 1999 Jan Hubicka

* i386.h (LOCAL_ALIGNMENT): Align HImode values to 32 bit boundary to improve prefix elimination.

*** i386.h.old Fri Apr 9 01:56:04 1999 --- i386.h Fri Apr 16 01:23:54 1999 *************** extern int ix86_arch; *** 545,550 **** --- 549,558 ---- : (TYPE_MODE (TYPE) == XFmode && (ALIGN) < 128) \ ? 128 \ : (ALIGN)) \ + : TREE_CODE (TYPE) == INTEGER_TYPE \ + ? ((TYPE_MODE (TYPE) == HImode && (ALIGN) < 32) \ + ? 
32 \ + : (ALIGN)) \ : (ALIGN)) /* Set this non-zero if move instructions will actually fail to work

Hi

This is a simple patch to improve the mov?f patterns. They now use a new operand predicate, i387_move_operand, that refuses all constants (even 0 and 1) until reload, unless we are optimizing for size. This lets combine be smarter: in operations like a+1 it can read the 1 from memory as the add operand. It is also triple-checked against the GNU coding standards, so I hope it is OK.

Honza

Wed Apr 14 21:56:54 CEST 1999 Jan Hubicka

* i386.c (i387_move_operand): New function.
* i386.h (i387_move_operand): Declare it.
* i386.md (mov?f patterns): Use it.

*** i386.c.old Tue Apr 13 22:10:23 1999 --- i386.c Wed Apr 14 21:42:32 1999 *************** output_ashlsi3 (operands) *** 5526,5528 **** --- 5526,5542 ---- /* Otherwise use a shift instruction. */ return AS2 (sal%L0,%2,%0); } + + /* Return 1 for operands acceptable for i387 moves (register, + memory and CONST_DOUBLEs (only after reload because otherwise + it combine to generate aritmetic operation with constant operands. + */ + int + i387_move_operand (x, mode) + enum machine_mode mode; + rtx x; + { + if (reload_in_progress || reload_completed || optimize_size) + return (general_operand (x, mode)); + return (nonimmediate_operand (x, mode)); + } *** i386.md.old Tue Apr 13 22:09:34 1999 --- i386.md Wed Apr 14 21:52:49 1999 *************** *** 1308,1314 **** (define_expand "movsf" [(set (match_operand:SF 0 "general_operand" "") ! (match_operand:SF 1 "general_operand" ""))] "" " { --- 1308,1314 ---- (define_expand "movsf" [(set (match_operand:SF 0 "general_operand" "") ! (match_operand:SF 1 "i387_move_operand" ""))] "" " { *************** *** 1326,1333 **** get better code out the back end. */ else if ((reload_in_progress | reload_completed) == 0 && GET_CODE (operands[0]) != MEM ! && GET_CODE (operands[1]) == CONST_DOUBLE ! && !standard_80387_constant_p (operands[1])) { operands[1] = validize_mem (force_const_mem (SFmode, operands[1])); } --- 1326,1332 ---- get better code out the back end. 
*/ else if ((reload_in_progress | reload_completed) == 0 && GET_CODE (operands[0]) != MEM ! && GET_CODE (operands[1]) == CONST_DOUBLE) { operands[1] = validize_mem (force_const_mem (SFmode, operands[1])); } *************** *** 1449,1456 **** memory, but better safe than sorry. */ else if ((reload_in_progress | reload_completed) == 0 && GET_CODE (operands[0]) != MEM ! && GET_CODE (operands[1]) == CONST_DOUBLE ! && !standard_80387_constant_p (operands[1])) { operands[1] = validize_mem (force_const_mem (DFmode, operands[1])); } --- 1448,1454 ---- memory, but better safe than sorry. */ else if ((reload_in_progress | reload_completed) == 0 && GET_CODE (operands[0]) != MEM ! && GET_CODE (operands[1]) == CONST_DOUBLE) { operands[1] = validize_mem (force_const_mem (DFmode, operands[1])); } *************** *** 1459,1465 **** ;; For the purposes of regclass, prefer FLOAT_REGS. (define_insn "" [(set (match_operand:DF 0 "nonimmediate_operand" "=f,m,!*r,!o") ! (match_operand:DF 1 "general_operand" "fmG,f,*roF,*rF"))] "(!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)" "* --- 1457,1463 ---- ;; For the purposes of regclass, prefer FLOAT_REGS. (define_insn "" [(set (match_operand:DF 0 "nonimmediate_operand" "=f,m,!*r,!o") ! (match_operand:DF 1 "i387_move_operand" "fmG,f,*roF,*rF"))] "(!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)" "* *************** *** 1553,1559 **** (define_expand "movxf" [(set (match_operand:XF 0 "general_operand" "") ! (match_operand:XF 1 "general_operand" ""))] "" " { --- 1551,1557 ---- (define_expand "movxf" [(set (match_operand:XF 0 "general_operand" "") ! (match_operand:XF 1 "i387_move_operand" ""))] "" " { *************** *** 1572,1579 **** to memory, but better safe than sorry. */ else if ((reload_in_progress | reload_completed) == 0 && GET_CODE (operands[0]) != MEM ! && GET_CODE (operands[1]) == CONST_DOUBLE ! 
&& !standard_80387_constant_p (operands[1])) { operands[1] = validize_mem (force_const_mem (XFmode, operands[1])); } --- 1570,1576 ---- to memory, but better safe than sorry. */ else if ((reload_in_progress | reload_completed) == 0 && GET_CODE (operands[0]) != MEM ! && GET_CODE (operands[1]) == CONST_DOUBLE) { operands[1] = validize_mem (force_const_mem (XFmode, operands[1])); } *** i386.h.old Tue Apr 13 22:10:27 1999 --- i386.h Wed Apr 14 21:42:59 1999 *************** extern char *output_int_conditional_move *** 2764,2769 **** --- 2764,2770 ---- extern char *output_fp_conditional_move (); extern int ix86_can_use_return_insn_p (); extern int small_shift_operand (); + extern int i387_move_operand (); extern char *output_ashlsi3 (); #ifdef NOTYET

Hi

This is a patch to squeeze out some bytes in -Os compilation by emitting arithmetic replacements for the long mov instruction. It saves approximately 1% of the overall XaoS binary. Note that I am not a very experienced assembly coder, so my solutions may be suboptimal. Let me know if you have an idea how to make them better.

It also removes unnecessary code duplication between the mov patterns, which showed up as a problem in my K6 change (I forgot to change the non -fpic version).

Note that it could do a better job if the WAS_0 notes were correct. In most cases they are missing, because they are calculated in cse for pseudos and then usually removed. Maybe it is possible to re-create them in the flow2 pass?

Honza

Wed Apr 14 00:26:51 CEST 1999 Jan Hubicka

* i386.c (output_movsi): New function.
* i386.h: Declare it.
* i386.md (movsi patterns): Use output_movsi function.

*** i386.c.old Wed Apr 14 11:59:01 1999 --- i386.c Wed Apr 14 13:06:29 1999 *************** singlemove_string (operands) *** 965,973 **** else if (GET_CODE (operands[1]) == CONST_DOUBLE) return output_move_const_single (operands); else if (GET_CODE (operands[0]) == REG || GET_CODE (operands[1]) == REG) ! return AS2 (mov%L0,%1,%0); else if (CONSTANT_P (operands[1])) ! 
return AS2 (mov%L0,%1,%0); else { output_asm_insn ("push%L1 %1", operands); --- 965,973 ---- else if (GET_CODE (operands[1]) == CONST_DOUBLE) return output_move_const_single (operands); else if (GET_CODE (operands[0]) == REG || GET_CODE (operands[1]) == REG) ! return output_movsi (operands); else if (CONSTANT_P (operands[1])) ! return output_movsi (operands); else { output_asm_insn ("push%L1 %1", operands); *************** memory_address_length (addr) *** 5662,5664 **** --- 5662,5793 ---- return len; } + + char * + output_movsi (insn, operands) + rtx insn, *operands; + { + /* Use of xor was disabled for AMD K6 as recommended by the Optimization + Manual. My test shows, that this generally hurts the performance, because + mov is longer and takes longer to decode and decoding is the main + bottleneck of K6 when executing GCC code. */ + + if (operands[1] == const0_rtx && REG_P (operands[0])) + { + CC_STATUS_INIT; + return AS2 (xor%L0,%0,%0); + } + + if (GET_CODE (operands[1]) == CONST_INT) + { + int is_zero = 0; + rtx link; + + if ((link = find_reg_note (insn, REG_WAS_0, 0)) + /* Make sure the insn that stored the 0 is still present. */ + && !INSN_DELETED_P (XEXP (link, 0)) + && GET_CODE (XEXP (link, 0)) != NOTE + /* Make sure cross jumping didn't happen here. */ + && no_labels_between_p (XEXP (link, 0), insn) + /* Make sure the reg hasn't been clobbered. */ + && !reg_set_between_p (operands[0], XEXP (link, 0), insn)) + is_zero = 1; + + /* Mov with immediate parameter takes 5 bytes. We can handle certain + cases by arithmetic operations as well. This strategy improves + performance on 386 and 486 CPUs. On the modern CPUs it is rather + problematic because of additional dependencies. Overall it seems to + be win for Pentium and lose on K6 and PPro. 
*/ + + if (((int) ix86_cpu < (int) PROCESSOR_PENTIUMPRO && is_zero + && REG_P (operands[0])) + || optimize_size) + { + switch (INTVAL (operands[1])) + { + case 1: + /* 2+1 bytes */ + if (is_zero || REG_P(operands[0])) + { + CC_STATUS_INIT; + if (!is_zero) + output_asm_insn (AS2 (xor%L0,%0,%0), operands); + return AS1 (inc%L0,%0); + } + break; + case -1: + /* 2+1 bytes */ + if (is_zero || REG_P(operands[0])) + { + CC_STATUS_INIT; + if (!is_zero) + output_asm_insn (AS2 (xor%L0,%0,%0), operands); + return AS1 (dec%L0,%0); + } + break; + case 0xffff: + /* 2+2 bytes */ + if (optimize_size && (is_zero || REG_P(operands[0]))) + { + CC_STATUS_INIT; + if (!is_zero) + output_asm_insn (AS2 (xor%L0,%0,%0), operands); + return AS1 (dec%W0,%w0); + } + break; + } + + /* 2+2 bytes. Values smaller than 128 can be handled by push/pop pair + in 3 bytes. This case still wins if the register is zero. */ + if (INTVAL (operands[1]) >= 0 && INTVAL (operands[1]) <= 255 + && (is_zero || (INTVAL (operands[1]) > 128 && REG_P (operands[1])))) + { + CC_STATUS_INIT; + if (!is_zero) + output_asm_insn (AS2 (xor%L0,%0,%0), operands); + return AS2 (mov%B0,%1,%b0); + } + + /* 2+2 bytes. */ + if (INTVAL (operands[1]) >= 0 && (!INTVAL (operands[1]) & ~0xff00) + && is_zero && REG_P (operands[1])) + { + CC_STATUS_INIT; + if (!is_zero) + output_asm_insn (AS2 (xor%L0,%0,%0), operands); + return AS2 (mov%B0,%1,%h0); + } + + if (is_zero) + { + /* 3 bytes. */ + if (INTVAL (operands[1]) >= -128 && INTVAL (operands[1]) < 128) + { + CC_STATUS_INIT; + return AS2 (add%L0,%1,%0); + } + + /* 4 bytes. */ + if (is_zero && optimize_size && !(INTVAL (operands[1]) & ~0xffff)) + { + CC_STATUS_INIT; + return AS2 (mov%W0,%1,%0); + } + } /* is_zero */ + + /* 3 bytes. 
*/ + if (optimize_size + && INTVAL (operands[1]) >= -128 && INTVAL (operands[1]) <= 127) + { + output_asm_insn (AS1 (push%L0,%1), operands); + return (AS1 (pop%L0,%0)); + } + } + } /* GET_CODE (operand[1]) == CONST_INT */ + + if (flag_pic && SYMBOLIC_CONST (operands[1])) + return AS2 (lea%L0,%a1,%0); + + return AS2 (mov%L0,%1,%0); + } + *** i386.md.old Wed Apr 14 12:43:13 1999 --- i386.md Wed Apr 14 12:46:19 1999 *************** *** 945,979 **** "((!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)) && flag_pic" ! "* ! { ! rtx link; ! ! /* K6: mov reg,0 is slightly faster than xor reg,reg but is 3 bytes ! longer. */ ! if ((ix86_cpu != PROCESSOR_K6 || optimize_size) ! && operands[1] == const0_rtx && REG_P (operands[0])) ! return AS2 (xor%L0,%0,%0); ! ! if (operands[1] == const1_rtx ! /* PPRO and K6 prefer mov to inc to reduce dependencies. */ ! && (optimize_size || (int)ix86_cpu < (int)PROCESSOR_PENTIUMPRO) ! && (link = find_reg_note (insn, REG_WAS_0, 0)) ! /* Make sure the insn that stored the 0 is still present. */ ! && ! INSN_DELETED_P (XEXP (link, 0)) ! && GET_CODE (XEXP (link, 0)) != NOTE ! /* Make sure cross jumping didn't happen here. */ ! && no_labels_between_p (XEXP (link, 0), insn) ! /* Make sure the reg hasn't been clobbered. */ ! && ! reg_set_between_p (operands[0], XEXP (link, 0), insn)) ! /* Fastest way to change a 0 to a 1. */ ! return AS1 (inc%L0,%0); ! ! if (SYMBOLIC_CONST (operands[1])) ! return AS2 (lea%L0,%a1,%0); ! ! return AS2 (mov%L0,%1,%0); ! }" [(set_attr "type" "integer,integer,memory") (set_attr "memory" "*,*,load")]) --- 945,951 ---- "((!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)) && flag_pic" ! "* return output_movsi (insn, operands);" [(set_attr "type" "integer,integer,memory") (set_attr "memory" "*,*,load")]) *************** *** 983,1016 **** "((!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)) && !flag_pic" ! "* ! { ! rtx link; ! ! 
/* Use of xor was disabled for AMD K6 as recommended by the Optimization ! Manual. My test shows, that this generally hurts the performance, because ! mov is longer and takes longer to decode and decoding is the main ! bottleneck of K6 when executing GCC code. */ ! ! if (operands[1] == const0_rtx && REG_P (operands[0])) ! return AS2 (xor%L0,%0,%0); ! ! if (operands[1] == const1_rtx ! /* PPRO and K6 prefer mov to inc to reduce dependencies. */ ! && (optimize_size || (int)ix86_cpu < (int)PROCESSOR_PENTIUMPRO) ! && (link = find_reg_note (insn, REG_WAS_0, 0)) ! /* Make sure the insn that stored the 0 is still present. */ ! && ! INSN_DELETED_P (XEXP (link, 0)) ! && GET_CODE (XEXP (link, 0)) != NOTE ! /* Make sure cross jumping didn't happen here. */ ! && no_labels_between_p (XEXP (link, 0), insn) ! /* Make sure the reg hasn't been clobbered. */ ! && ! reg_set_between_p (operands[0], XEXP (link, 0), insn)) ! /* Fastest way to change a 0 to a 1. */ ! return AS1 (inc%L0,%0); ! ! return AS2 (mov%L0,%1,%0); ! }" [(set_attr "type" "integer,memory") (set_attr "memory" "*,load")]) --- 955,961 ---- "((!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)) && !flag_pic" ! "* return output_movsi (operands);" [(set_attr "type" "integer,memory") (set_attr "memory" "*,load")]) *** i386.h.old Wed Apr 14 12:43:23 1999 --- i386.h Wed Apr 14 12:42:43 1999 *************** extern void split_di (); *** 2741,2746 **** --- 2741,2747 ---- extern int binary_387_op (); extern int shift_op (); extern int VOIDmode_compare_op (); + extern char *output_movsi (); extern char *output_387_binary_op (); extern char *output_fix_trunc (); extern char *output_float_compare (); Hi This patch models better the Pentium multiply unit. It sets correct parameters for fp multiply (that is according to Intel docs 3 cycles long). 
It also creates a new function unit "fpmul", used to show that two multiplies can't be issued in consecutive cycles and that integer multiply is done in the same unit. I've also changed the integer multiply and divide timings from 11 and 25 to 1, because we can't model them correctly and gcc generates worse code when the exact timings are used. See the comment inside the patch.

Honza

Mon Apr 12 17:21:47 MET DST 1999 Jan Hubicka

* i386.md: Better model for Pentium multiply and division units.

*** i386.md.old Mon Apr 12 15:09:18 1999 --- i386.md Mon Apr 12 17:19:40 1999 *************** *** 135,141 **** (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) ! 7 0) (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentiumpro")) --- 135,148 ---- (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) ! 3 0) ! ! ; Pentium fp multiplying unit is not 100% pipelined and can accept new ! ; multiply only every other cycle. ! ! (define_function_unit "fpmul" 1 0 ! (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) ! 2 2) (define_function_unit "fp" 1 0 (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentiumpro")) *************** *** 182,199 **** (eq_attr "memory" "load"))) 3 0) ! ;; Multiplies use one of the integer units (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul")) ! 11 11) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "imul")) 2 2) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv")) ! 25 25) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "idiv")) --- 189,220 ---- (eq_attr "memory" "load"))) 3 0) ! ;; Multiplies use both integer units ! ;; Integer multiply takes in fact 11 cycles on Pentium CPU. Both ! ;; units are blocked while multiplying is executed so CPU is not accepting ! ;; new instructions. 
It is better to model it for us like it was a 1 cycle ! ;; instruction, because we can't block the second pipe for gcc so it would ! ;; be scheduling code for it otherwise. ! ! (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul")) ! 1 1) ! ! ;; Integer multiply is executed in fp multiplying unit. ! (define_function_unit "fpmul" 1 0 ! (and (eq_attr "type" "imul") (eq_attr "cpu" "pentium")) ! 1 1) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "imul")) 2 2) + ;; The division takes approx. 25 cycles on Pentium CPU. We model it as + ;; 1 cycle instruction for the same reasons as imul instruction above. (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv")) ! 1 1) (define_function_unit "integer" 2 0 (and (eq_attr "cpu" "k6") (eq_attr "type" "idiv"))

Hi

This patch attempts to reduce the number of fp patterns in i386.md. It implements new predicates, i387_reg_operand and i387_memreg_operand, that are used for i387 opcode operands and also accept FLOAT_EXTENDed operands (FLOAT_EXTENDs between registers are real no-ops, so this allows the combiner to remove them). The old code implemented the patterns in normal form with various combinations of extends. This caused unnecessary code duplication, and many of the more obscure cases were missed. The patch also implements the i387_real_operand function, which strips the float_extend wrappers.

This approach seems to handle many more cases: according to my tests, about 20% of extend insns are now eliminated. This reduces register pressure and scheduler confusion, so it might result in better code. I am getting a 200 byte shorter XaoS binary, so this is probably not the world's most powerful optimization, but it cleans things up, and I will have less work modifying the FP patterns for the future correct RTL representation (I am still waiting for the decision about the format...).

There is also a quite complex problem in the patch with the complex type. 
If we allow FLOAT_EXTENDs of all register operands, we can get the following code:

(insn:QI 29 27 30 (parallel[ (set (cc0) (compare:CCFPEQ (float_extend:DF (subreg:SF (reg/v:DI 22) 0)) (reg:DF 27))) (clobber (scratch:HI)) ] ) 24 {*cmpsf_cc_1-1} (insn_list 20 (insn_list 27 (nil))) (expr_list:REG_UNUSED (scratch:HI) (expr_list:REG_DEAD (reg:DF 27) (nil))))

greg then allocates the DImode register in integer registers, causing reload to generate a move (or extend) from a normal register to an fp one:

(insn 70 27 29 (set (reg:DF 9 %st(1)) (float_extend:DF (reg:SF 2 %ecx))) 106 {extendsfdf2} (nil) (nil))

And we can't handle that. So I am refusing FLOAT_EXTENDs of SUBREGs. I am not too happy with this situation.

This patch is a remake of the same thing I made a few months ago, so the concept is quite well tested. It shows no regressions and compiles FP-intensive apps (XaoS and povray) well. Note that I can't test the conditional move changes...

This patch also depends on the constraints fix I sent a few minutes ago, and it removes constraints from the cmov splitters that I found while changing the i386.md file.

Honza

Fri Apr 9 02:38:53 CEST 1999 Jan Hubicka

* i386.c (i387_real_operand): New function. (i387_memreg_operand): New function. (i387_reg_operand): New function. (output_387_binary_op): Call i387_real_operand. (output_float_compare): Call i387_real_operand. (output_fp_conditional_move): Call i387_real_operand.
* i386.h: Declare new functions.
* i386.md (all fp patterns): Use i387_reg_operand instead of register_operand, i387_memreg_operand instead of nonimmediate_operand, call i387_real_operand where necessary, remove unnecessary patterns with float_extend operands, add missing patterns. (fp cmov splitters): Remove unnecessary constraints.

*** i386.c.old Thu Apr 8 19:45:13 1999 --- i386.c Fri Apr 9 03:20:06 1999 *************** output_387_binary_op (insn, operands) *** 4219,4224 **** --- 4219,4228 ---- strcpy (buf, base_op); + /* Unwind all FLOAT_EXTENDS to get the real operands. 
*/ + operands[1] = i387_real_operand (operands[1]); + operands[2] = i387_real_operand (operands[2]); + switch (GET_CODE (operands[3])) { case MULT: *************** output_float_compare (insn, operands) *** 4361,4366 **** --- 4365,4374 ---- int unordered_compare = GET_MODE (SET_SRC (body)) == CCFPEQmode; rtx tmp; + /* Unwind all FLOAT_EXTENDS to get the real operands. */ + operands[0] = i387_real_operand (operands[0]); + operands[1] = i387_real_operand (operands[1]); + if (0 && TARGET_CMOVE && STACK_REG_P (operands[1])) { cc_status.flags |= CC_FCOMI; *************** output_fp_conditional_move (which_altern *** 5473,5479 **** int which_alternative; rtx operands[]; { ! enum rtx_code code = GET_CODE (operands[1]); /* This should never happen. */ if (!(cc_prev_status.flags & CC_IN_80387) --- 5481,5491 ---- int which_alternative; rtx operands[]; { ! enum rtx_code code; ! ! operands[2] = i387_real_operand (operands[2]); ! operands[3] = i387_real_operand (operands[3]); ! code = GET_CODE (operands[1]); /* This should never happen. */ if (!(cc_prev_status.flags & CC_IN_80387) *************** output_ashlsi3 (operands) *** 5692,5694 **** --- 5704,5762 ---- /* Otherwise use a shift instruction. */ return AS2 (sal%L0,%2,%0); } + + /* Return real i387 operand after unwinding all FLOAT_EXTENDS. */ + rtx + i387_real_operand (x) + rtx x; + { + while (GET_CODE (x) == FLOAT_EXTEND) + x = XEXP (x, 0); + return x; + } + + /* Return 1 for operands acceptable for i387 opcodes to be passed in register + or memory (XFmode memory parameters are not supported by i387 except for fld + ans fst instructions). */ + int + i387_memreg_operand (x, mode) + enum machine_mode mode; + rtx x; + { + int extended = 0; + while (GET_CODE (x) == FLOAT_EXTEND + && GET_MODE (x) == mode) + x = XEXP (x, 0), mode = GET_MODE (x), extended = 1; + + /* i387 don't support XFmode memory parameters. + We need to refuse extended subregs. 
This code happends in complex + type causing greg to allocate complex in integer register resulting + in move from integer register to fp register that we can't hande. */ + + if (mode == XFmode || (extended && GET_CODE (x) == SUBREG)) + return 0; + + return (nonimmediate_operand (x, mode)); + } + + /* Return 1 for operands acceptable for i387 opcodes to be passed in + register. */ + int + i387_reg_operand (x, mode) + enum machine_mode mode; + rtx x; + { + int extended = 0; + while (GET_CODE (x) == FLOAT_EXTEND + && GET_MODE (x) == mode) + x = XEXP (x, 0), mode = GET_MODE (x), extended = 1; + + /* We need to refuse extended subregs. This code happends in complex + type causing greg to allocate complex in integer register resulting + in move from integer register to fp register that we can't hande. */ + + if (extended && GET_CODE (x) == SUBREG) + return 0; + + return (nonimmediate_operand (x, mode)); + } + *** i386.h.old Fri Apr 9 01:56:04 1999 --- i386.h Fri Apr 9 02:30:21 1999 *************** extern char *output_fp_conditional_move *** 2767,2772 **** --- 2767,2775 ---- extern int ix86_can_use_return_insn_p (); extern int small_shift_operand (); extern char *output_ashlsi3 (); + extern struct rtx_def *i387_real_operand (); + extern int i387_reg_operand (); + extern int i387_memreg_operand (); #ifdef NOTYET extern struct rtx_def *copy_all_rtx (); *** i386.md.old Thu Apr 8 19:45:09 1999 --- i386.md Fri Apr 9 02:27:44 1999 *************** *** 317,327 **** (define_insn "tstsf_cc" [(set (cc0) ! (match_operand:SF 0 "register_operand" "f")) (clobber (match_scratch:HI 1 "=a"))] "TARGET_80387 && ! TARGET_IEEE_FP" "* { if (! STACK_TOP_P (operands[0])) abort (); --- 317,328 ---- (define_insn "tstsf_cc" [(set (cc0) ! (match_operand:SF 0 "i387_reg_operand" "f")) (clobber (match_scratch:HI 1 "=a"))] "TARGET_80387 && ! TARGET_IEEE_FP" "* { + operands[0] = i387_real_operand (operands[0]); if (! 
STACK_TOP_P (operands[0])) abort (); *************** *** 352,362 **** (define_insn "tstdf_cc" [(set (cc0) ! (match_operand:DF 0 "register_operand" "f")) (clobber (match_scratch:HI 1 "=a"))] "TARGET_80387 && ! TARGET_IEEE_FP" "* { if (! STACK_TOP_P (operands[0])) abort (); --- 353,364 ---- (define_insn "tstdf_cc" [(set (cc0) ! (match_operand:DF 0 "i387_reg_operand" "f")) (clobber (match_scratch:HI 1 "=a"))] "TARGET_80387 && ! TARGET_IEEE_FP" "* { + operands[0] = i387_real_operand (operands[0]); if (! STACK_TOP_P (operands[0])) abort (); *************** *** 387,397 **** (define_insn "tstxf_cc" [(set (cc0) ! (match_operand:XF 0 "register_operand" "f")) (clobber (match_scratch:HI 1 "=a"))] "TARGET_80387 && ! TARGET_IEEE_FP" "* { if (! STACK_TOP_P (operands[0])) abort (); --- 389,400 ---- (define_insn "tstxf_cc" [(set (cc0) ! (match_operand:XF 0 "i387_reg_operand" "f")) (clobber (match_scratch:HI 1 "=a"))] "TARGET_80387 && ! TARGET_IEEE_FP" "* { + operands[0] = i387_real_operand (operands[0]); if (! STACK_TOP_P (operands[0])) abort (); *************** *** 502,521 **** (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "register_operand" "f") ! (match_operand:XF 1 "register_operand" "f")])) ! (clobber (match_scratch:HI 3 "=a"))] ! "TARGET_80387" ! "* return output_float_compare (insn, operands);" ! [(set_attr "type" "fcompare")]) ! ! (define_insn "" ! [(set (cc0) ! (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "register_operand" "f") ! (float_extend:XF ! (match_operand:DF 1 "nonimmediate_operand" "fm"))])) ! (clobber (match_scratch:HI 3 "=a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" [(set_attr "type" "fcompare")]) --- 505,513 ---- (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "i387_memreg_operand" "fm,f") ! (match_operand:XF 1 "i387_memreg_operand" "f,fm")])) ! 
(clobber (match_scratch:HI 3 "=a,a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" [(set_attr "type" "fcompare")]) *************** *** 523,531 **** (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(float_extend:XF ! (match_operand:DF 0 "nonimmediate_operand" "fm")) ! (match_operand:XF 1 "register_operand" "f")])) (clobber (match_scratch:HI 3 "=a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" --- 515,522 ---- (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "i387_memreg_operand" "fm") ! (match_operand:XF 1 "i387_reg_operand" "f")])) (clobber (match_scratch:HI 3 "=a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" *************** *** 534,542 **** (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "register_operand" "f") ! (float_extend:XF ! (match_operand:SF 1 "nonimmediate_operand" "fm"))])) (clobber (match_scratch:HI 3 "=a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" --- 525,532 ---- (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "i387_reg_operand" "f") ! (match_operand:XF 1 "i387_memreg_operand" "fm")])) (clobber (match_scratch:HI 3 "=a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" *************** *** 545,553 **** (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(float_extend:XF ! (match_operand:SF 0 "nonimmediate_operand" "fm")) ! (match_operand:XF 1 "register_operand" "f")])) (clobber (match_scratch:HI 3 "=a"))] "TARGET_80387" "* return output_float_compare (insn, operands);" --- 535,542 ---- (define_insn "" [(set (cc0) (match_operator 2 "VOIDmode_compare_op" ! [(match_operand:XF 0 "i387_reg_operand" "f") ! 
(match_operand:XF 1 "i387_reg_operand" "f")]))
     (clobber (match_scratch:HI 3 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
***************
*** 555,562 ****
  (define_insn ""
    [(set (cc0)
!         (compare:CCFPEQ (match_operand:XF 0 "register_operand" "f")
!                         (match_operand:XF 1 "register_operand" "f")))
     (clobber (match_scratch:HI 2 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
--- 544,551 ----
  (define_insn ""
    [(set (cc0)
!         (compare:CCFPEQ (match_operand:XF 0 "i387_reg_operand" "f")
!                         (match_operand:XF 1 "i387_reg_operand" "f")))
     (clobber (match_scratch:HI 2 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
***************
*** 565,572 ****
  (define_insn ""
    [(set (cc0)
          (match_operator 2 "VOIDmode_compare_op"
!                         [(match_operand:DF 0 "nonimmediate_operand" "f,fm")
!                          (match_operand:DF 1 "nonimmediate_operand" "fm,f")]))
     (clobber (match_scratch:HI 3 "=a,a"))]
    "TARGET_80387
     && (GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM)"
--- 554,561 ----
  (define_insn ""
    [(set (cc0)
          (match_operator 2 "VOIDmode_compare_op"
!                         [(match_operand:DF 0 "i387_memreg_operand" "f,fm")
!                          (match_operand:DF 1 "i387_memreg_operand" "fm,f")]))
     (clobber (match_scratch:HI 3 "=a,a"))]
    "TARGET_80387
     && (GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM)"
***************
*** 575,615 ****
  (define_insn ""
    [(set (cc0)
!         (match_operator 2 "VOIDmode_compare_op"
!                         [(match_operand:DF 0 "register_operand" "f")
!                          (float_extend:DF
!                           (match_operand:SF 1 "nonimmediate_operand" "fm"))]))
!    (clobber (match_scratch:HI 3 "=a"))]
!   "TARGET_80387"
!   "* return output_float_compare (insn, operands);"
!   [(set_attr "type" "fcompare")])
!
! (define_insn ""
!   [(set (cc0)
!         (match_operator 2 "VOIDmode_compare_op"
!                         [(float_extend:DF
!                           (match_operand:SF 0 "nonimmediate_operand" "fm"))
!                          (match_operand:DF 1 "register_operand" "f")]))
!    (clobber (match_scratch:HI 3 "=a"))]
!   "TARGET_80387"
!   "* return output_float_compare (insn, operands);"
!   [(set_attr "type" "fcompare")])
!
! (define_insn ""
!   [(set (cc0)
!         (match_operator 2 "VOIDmode_compare_op"
!                         [(float_extend:DF
!                           (match_operand:SF 0 "register_operand" "f"))
!                          (match_operand:DF 1 "nonimmediate_operand" "fm")]))
!    (clobber (match_scratch:HI 3 "=a"))]
!   "TARGET_80387"
!   "* return output_float_compare (insn, operands);"
!   [(set_attr "type" "fcompare")])
!
! (define_insn ""
!   [(set (cc0)
!         (compare:CCFPEQ (match_operand:DF 0 "register_operand" "f")
!                         (match_operand:DF 1 "register_operand" "f")))
     (clobber (match_scratch:HI 2 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
--- 564,571 ----
  (define_insn ""
    [(set (cc0)
!         (compare:CCFPEQ (match_operand:DF 0 "i387_reg_operand" "f")
!                         (match_operand:DF 1 "i387_reg_operand" "f")))
     (clobber (match_scratch:HI 2 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
***************
*** 638,645 ****
  (define_insn "*cmpsf_cc_1"
    [(set (cc0)
          (match_operator 2 "VOIDmode_compare_op"
!                         [(match_operand:SF 0 "nonimmediate_operand" "f,fm")
!                          (match_operand:SF 1 "nonimmediate_operand" "fm,f")]))
     (clobber (match_scratch:HI 3 "=a,a"))]
    "TARGET_80387
     && (GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM)"
--- 594,601 ----
  (define_insn "*cmpsf_cc_1"
    [(set (cc0)
          (match_operator 2 "VOIDmode_compare_op"
!                         [(match_operand:SF 0 "i387_memreg_operand" "f,fm")
!                          (match_operand:SF 1 "i387_memreg_operand" "fm,f")]))
     (clobber (match_scratch:HI 3 "=a,a"))]
    "TARGET_80387
     && (GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM)"
***************
*** 648,655 ****
  (define_insn ""
    [(set (cc0)
!         (compare:CCFPEQ (match_operand:SF 0 "register_operand" "f")
!                         (match_operand:SF 1 "register_operand" "f")))
     (clobber (match_scratch:HI 2 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
--- 604,611 ----
  (define_insn ""
    [(set (cc0)
!         (compare:CCFPEQ (match_operand:SF 0 "i387_reg_operand" "f")
!                         (match_operand:SF 1 "i387_reg_operand" "f")))
     (clobber (match_scratch:HI 2 "=a"))]
    "TARGET_80387"
    "* return output_float_compare (insn, operands);"
***************
*** 1378,1390 ****
  (define_insn "swapsf"
!   [(set (match_operand:SF 0 "register_operand" "f")
!         (match_operand:SF 1 "register_operand" "f"))
     (set (match_dup 1)
          (match_dup 0))]
    ""
    "*
  {
    if (STACK_TOP_P (operands[0]))
      return AS1 (fxch,%1);
    else
--- 1334,1348 ----
  (define_insn "swapsf"
!   [(set (match_operand:SF 0 "i387_reg_operand" "f")
!         (match_operand:SF 1 "i387_reg_operand" "f"))
     (set (match_dup 1)
          (match_dup 0))]
    ""
    "*
  {
+   operands[0] = i387_real_operand (operands[0]);
+   operands[1] = i387_real_operand (operands[1]);
    if (STACK_TOP_P (operands[0]))
      return AS1 (fxch,%1);
    else
***************
*** 1503,1515 ****
  (define_insn "swapdf"
!   [(set (match_operand:DF 0 "register_operand" "f")
!         (match_operand:DF 1 "register_operand" "f"))
     (set (match_dup 1)
          (match_dup 0))]
    ""
    "*
  {
    if (STACK_TOP_P (operands[0]))
      return AS1 (fxch,%1);
    else
--- 1461,1475 ----
  (define_insn "swapdf"
!   [(set (match_operand:DF 0 "i387_reg_operand" "f")
!         (match_operand:DF 1 "i387_reg_operand" "f"))
     (set (match_dup 1)
          (match_dup 0))]
    ""
    "*
  {
+   operands[0] = i387_real_operand (operands[0]);
+   operands[1] = i387_real_operand (operands[1]);
    if (STACK_TOP_P (operands[0]))
      return AS1 (fxch,%1);
    else
***************
*** 1624,1636 ****
  }")

  (define_insn "swapxf"
!   [(set (match_operand:XF 0 "register_operand" "f")
!         (match_operand:XF 1 "register_operand" "f"))
     (set (match_dup 1)
          (match_dup 0))]
    ""
    "*
  {
    if (STACK_TOP_P (operands[0]))
      return AS1 (fxch,%1);
    else
--- 1584,1598 ----
  }")

  (define_insn "swapxf"
!   [(set (match_operand:XF 0 "i387_reg_operand" "f")
!         (match_operand:XF 1 "i387_reg_operand" "f"))
     (set (match_dup 1)
          (match_dup 0))]
    ""
    "*
  {
+   operands[0] = i387_real_operand (operands[0]);
+   operands[1] = i387_real_operand (operands[1]);
    if (STACK_TOP_P (operands[0]))
      return AS1 (fxch,%1);
    else
*************** byte_xor_operation:
*** 4461,4619 ****
  (define_insn "negsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (neg:SF (match_operand:SF 1 "register_operand" "0")))]
    "TARGET_80387"
    "fchs")

  (define_insn "negdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (neg:DF (match_operand:DF 1 "register_operand" "0")))]
!   "TARGET_80387"
!   "fchs")
!
! (define_insn ""
!   [(set (match_operand:DF 0 "register_operand" "=f")
!         (neg:DF (float_extend:DF (match_operand:SF 1 "register_operand" "0"))))]
    "TARGET_80387"
    "fchs")

  (define_insn "negxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (neg:XF (match_operand:XF 1 "register_operand" "0")))]
    "TARGET_80387"
    "fchs")

- (define_insn ""
-   [(set (match_operand:XF 0 "register_operand" "=f")
-         (neg:XF (float_extend:XF (match_operand:DF 1 "register_operand" "0"))))]
-   "TARGET_80387"
-   "fchs")

  ;; Absolute value instructions

  (define_insn "abssf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (abs:SF (match_operand:SF 1 "register_operand" "0")))]
    "TARGET_80387"
    "fabs"
    [(set_attr "type" "fpop")])

  (define_insn "absdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (abs:DF (match_operand:DF 1 "register_operand" "0")))]
!   "TARGET_80387"
!   "fabs"
!   [(set_attr "type" "fpop")])
!
! (define_insn ""
!   [(set (match_operand:DF 0 "register_operand" "=f")
!         (abs:DF (float_extend:DF (match_operand:SF 1 "register_operand" "0"))))]
    "TARGET_80387"
    "fabs"
    [(set_attr "type" "fpop")])

  (define_insn "absxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (abs:XF (match_operand:XF 1 "register_operand" "0")))]
!   "TARGET_80387"
!   "fabs"
!   [(set_attr "type" "fpop")])
!
! (define_insn ""
!   [(set (match_operand:XF 0 "register_operand" "=f")
!         (abs:XF (float_extend:XF (match_operand:DF 1 "register_operand" "0"))))]
    "TARGET_80387"
    "fabs"
    [(set_attr "type" "fpop")])

  (define_insn "sqrtsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (sqrt:SF (match_operand:SF 1 "register_operand" "0")))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387"
    "fsqrt")

  (define_insn "sqrtdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (sqrt:DF (match_operand:DF 1 "register_operand" "0")))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
     && (TARGET_IEEE_FP || flag_fast_math) "
    "fsqrt")

- (define_insn ""
-   [(set (match_operand:DF 0 "register_operand" "=f")
-         (sqrt:DF (float_extend:DF
-                   (match_operand:SF 1 "register_operand" "0"))))]
-   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387"
-   "fsqrt")
-
  (define_insn "sqrtxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (sqrt:XF (match_operand:XF 1 "register_operand" "0")))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
     && (TARGET_IEEE_FP || flag_fast_math) "
    "fsqrt")

- (define_insn ""
-   [(set (match_operand:XF 0 "register_operand" "=f")
-         (sqrt:XF (float_extend:XF
-                   (match_operand:DF 1 "register_operand" "0"))))]
-   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387"
-   "fsqrt")
-
- (define_insn ""
-   [(set (match_operand:XF 0 "register_operand" "=f")
-         (sqrt:XF (float_extend:XF
-                   (match_operand:SF 1 "register_operand" "0"))))]
-   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387"
-   "fsqrt")
-
  (define_insn "sindf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (unspec:DF [(match_operand:DF 1 "register_operand" "0")] 1))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fsin")

  (define_insn "sinsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (unspec:SF [(match_operand:SF 1 "register_operand" "0")] 1))]
!   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
!   "fsin")
!
! (define_insn ""
!   [(set (match_operand:DF 0 "register_operand" "=f")
!         (unspec:DF [(float_extend:DF
!                      (match_operand:SF 1 "register_operand" "0"))] 1))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fsin")

  (define_insn "sinxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (unspec:XF [(match_operand:XF 1 "register_operand" "0")] 1))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fsin")

  (define_insn "cosdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (unspec:DF [(match_operand:DF 1 "register_operand" "0")] 2))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fcos")

  (define_insn "cossf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (unspec:SF [(match_operand:SF 1 "register_operand" "0")] 2))]
!   "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
!   "fcos")
!
! (define_insn ""
!   [(set (match_operand:DF 0 "register_operand" "=f")
!         (unspec:DF [(float_extend:DF
!                      (match_operand:SF 1 "register_operand" "0"))] 2))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fcos")

  (define_insn "cosxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (unspec:XF [(match_operand:XF 1 "register_operand" "0")] 2))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fcos")
--- 4423,4521 ----
  (define_insn "negsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (neg:SF (match_operand:SF 1 "i387_reg_operand" "0")))]
    "TARGET_80387"
    "fchs")

  (define_insn "negdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (neg:DF (match_operand:DF 1 "i387_reg_operand" "0")))]
    "TARGET_80387"
    "fchs")

  (define_insn "negxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (neg:XF (match_operand:XF 1 "i387_reg_operand" "0")))]
    "TARGET_80387"
    "fchs")

  ;; Absolute value instructions

  (define_insn "abssf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (abs:SF (match_operand:SF 1 "i387_reg_operand" "0")))]
    "TARGET_80387"
    "fabs"
    [(set_attr "type" "fpop")])

  (define_insn "absdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (abs:DF (match_operand:DF 1 "i387_reg_operand" "0")))]
    "TARGET_80387"
    "fabs"
    [(set_attr "type" "fpop")])

  (define_insn "absxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (abs:XF (match_operand:XF 1 "i387_reg_operand" "0")))]
    "TARGET_80387"
    "fabs"
    [(set_attr "type" "fpop")])

  (define_insn "sqrtsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (sqrt:SF (match_operand:SF 1 "i387_reg_operand" "0")))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387"
    "fsqrt")

  (define_insn "sqrtdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (sqrt:DF (match_operand:DF 1 "i387_reg_operand" "0")))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
     && (TARGET_IEEE_FP || flag_fast_math) "
    "fsqrt")

  (define_insn "sqrtxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (sqrt:XF (match_operand:XF 1 "i387_reg_operand" "0")))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387
     && (TARGET_IEEE_FP || flag_fast_math) "
    "fsqrt")

  (define_insn "sindf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (unspec:DF [(match_operand:DF 1 "i387_reg_operand" "0")] 1))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fsin")

  (define_insn "sinsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (unspec:SF [(match_operand:SF 1 "i387_reg_operand" "0")] 1))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fsin")

  (define_insn "sinxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (unspec:XF [(match_operand:XF 1 "i387_reg_operand" "0")] 1))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fsin")

  (define_insn "cosdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
!         (unspec:DF [(match_operand:DF 1 "i387_reg_operand" "0")] 2))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fcos")

  (define_insn "cossf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
!         (unspec:SF [(match_operand:SF 1 "i387_reg_operand" "0")] 2))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fcos")

  (define_insn "cosxf2"
    [(set (match_operand:XF 0 "register_operand" "=f")
!         (unspec:XF [(match_operand:XF 1 "i387_reg_operand" "0")] 2))]
    "! TARGET_NO_FANCY_MATH_387 && TARGET_80387 && flag_fast_math"
    "fcos")
*************** byte_xor_operation:
*** 6852,6859 ****
  (define_insn ""
    [(set (match_operand:DF 0 "register_operand" "=f,f")
          (match_operator:DF 3 "binary_387_op"
!                         [(match_operand:DF 1 "nonimmediate_operand" "0,fm")
!                          (match_operand:DF 2 "nonimmediate_operand" "fm,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
--- 6754,6761 ----
  (define_insn ""
    [(set (match_operand:DF 0 "register_operand" "=f,f")
          (match_operator:DF 3 "binary_387_op"
!                         [(match_operand:DF 1 "i387_memreg_operand" "0,fm")
!                          (match_operand:DF 2 "i387_memreg_operand" "fm,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
*************** byte_xor_operation:
*** 6869,6876 ****
  (define_insn ""
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (match_operator:XF 3 "binary_387_op"
!                         [(match_operand:XF 1 "register_operand" "0,f")
!                          (match_operand:XF 2 "register_operand" "f,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
--- 6771,6778 ----
  (define_insn ""
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (match_operator:XF 3 "binary_387_op"
!                         [(match_operand:XF 1 "i387_memreg_operand" "0,fm")
!                          (match_operand:XF 2 "i387_memreg_operand" "fm,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
*************** byte_xor_operation:
*** 6886,6893 ****
  (define_insn ""
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (match_operator:XF 3 "binary_387_op"
!            [(float_extend:XF (match_operand:SF 1 "nonimmediate_operand" "fm,0"))
!             (match_operand:XF 2 "register_operand" "0,f")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
--- 6788,6795 ----
  (define_insn ""
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (match_operator:XF 3 "binary_387_op"
!            [(match_operand:XF 1 "i387_reg_operand" "0,f")
!             (match_operand:XF 2 "i387_memreg_operand" "fm,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
*************** byte_xor_operation:
*** 6903,6911 ****
  (define_insn ""
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (match_operator:XF 3 "binary_387_op"
!           [(match_operand:XF 1 "register_operand" "0,f")
!            (float_extend:XF
!             (match_operand:SF 2 "nonimmediate_operand" "fm,0"))]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
--- 6805,6812 ----
  (define_insn ""
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (match_operator:XF 3 "binary_387_op"
!           [(match_operand:XF 1 "i387_memreg_operand" "0,fm")
!            (match_operand:XF 2 "i387_reg_operand" "f,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
*************** byte_xor_operation:
*** 6919,6928 ****
     )])

  (define_insn ""
!   [(set (match_operand:DF 0 "register_operand" "=f,f")
!         (match_operator:DF 3 "binary_387_op"
!            [(float_extend:DF (match_operand:SF 1 "nonimmediate_operand" "fm,0"))
!             (match_operand:DF 2 "register_operand" "0,f")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
--- 6820,6829 ----
     )])

  (define_insn ""
!   [(set (match_operand:XF 0 "register_operand" "=f,f")
!         (match_operator:XF 3 "binary_387_op"
!            [(match_operand:XF 1 "i387_reg_operand" "0,f")
!             (match_operand:XF 2 "i387_reg_operand" "f,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
*************** byte_xor_operation:
*** 6935,6963 ****
      )
     )])

- (define_insn ""
-   [(set (match_operand:DF 0 "register_operand" "=f,f")
-         (match_operator:DF 3 "binary_387_op"
-           [(match_operand:DF 1 "register_operand" "0,f")
-            (float_extend:DF
-             (match_operand:SF 2 "nonimmediate_operand" "fm,0"))]))]
-   "TARGET_80387"
-   "* return output_387_binary_op (insn, operands);"
-   [(set (attr "type")
-         (cond [(match_operand:DF 3 "is_mul" "")
-                  (const_string "fpmul")
-                (match_operand:DF 3 "is_div" "")
-                  (const_string "fpdiv")
-               ]
-               (const_string "fpop")
-         )
-    )])

  (define_insn ""
    [(set (match_operand:SF 0 "register_operand" "=f,f")
          (match_operator:SF 3 "binary_387_op"
!                         [(match_operand:SF 1 "nonimmediate_operand" "0,fm")
!                          (match_operand:SF 2 "nonimmediate_operand" "fm,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
--- 6836,6847 ----
      )
     )])

  (define_insn ""
    [(set (match_operand:SF 0 "register_operand" "=f,f")
          (match_operator:SF 3 "binary_387_op"
!                         [(match_operand:SF 1 "i387_memreg_operand" "0,fm")
!                          (match_operand:SF 2 "i387_memreg_operand" "fm,0")]))]
    "TARGET_80387"
    "* return output_387_binary_op (insn, operands);"
    [(set (attr "type")
*************** byte_xor_operation:
*** 7248,7255 ****
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand:QI 2 "nonimmediate_operand" "q,m,q,m")
                   (match_operand:QI 3 "general_operand" "qmn,qn,qmn,qn")])
!                     (match_operand:SF 4 "register_operand" "f,f,0,0")
!                     (match_operand:SF 5 "register_operand" "0,0,f,f")))]
    "TARGET_CMOVE
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
--- 7132,7139 ----
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand:QI 2 "nonimmediate_operand" "q,m,q,m")
                   (match_operand:QI 3 "general_operand" "qmn,qn,qmn,qn")])
!                     (match_operand:SF 4 "i387_reg_operand" "f,f,0,0")
!                     (match_operand:SF 5 "i387_reg_operand" "0,0,f,f")))]
    "TARGET_CMOVE
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
*************** byte_xor_operation:
*** 7260,7279 ****
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "r,m,r,m")
                   (match_operand 3 "general_operand" "rmi,ri,rmi,ri")])
!                     (match_operand:SF 4 "register_operand" "f,f,0,0")
!                     (match_operand:SF 5 "register_operand" "0,0,f,f")))]
    "TARGET_CMOVE && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
    "#")

  (define_split
!   [(set (match_operand:SF 0 "register_operand" "=f,f")
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (const_int 0)])
!                     (match_operand:SF 3 "register_operand" "f,0")
!                     (match_operand:SF 4 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0)
          (match_dup 2))
--- 7144,7163 ----
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "r,m,r,m")
                   (match_operand 3 "general_operand" "rmi,ri,rmi,ri")])
!                     (match_operand:SF 4 "i387_reg_operand" "f,f,0,0")
!                     (match_operand:SF 5 "i387_reg_operand" "0,0,f,f")))]
    "TARGET_CMOVE && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
    "#")

  (define_split
!   [(set (match_operand:SF 0 "register_operand" "")
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (const_int 0)])
!                     (match_operand:SF 3 "i387_reg_operand" "")
!                     (match_operand:SF 4 "i387_reg_operand" "")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0)
          (match_dup 2))
*************** byte_xor_operation:
*** 7283,7294 ****
    "")

  (define_split
!   [(set (match_operand:SF 0 "register_operand" "=f,f")
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (match_operand 3 "general_operand" "")])
!                     (match_operand:SF 4 "register_operand" "f,0")
!                     (match_operand:SF 5 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0) (compare (match_dup 2) (match_dup 3)))
     (set (match_dup 0)
--- 7167,7178 ----
    "")

  (define_split
!   [(set (match_operand:SF 0 "register_operand" "")
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (match_operand 3 "general_operand" "")])
!                     (match_operand:SF 4 "i387_reg_operand" "")
!                     (match_operand:SF 5 "i387_reg_operand" "")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0) (compare (match_dup 2) (match_dup 3)))
     (set (match_dup 0)
*************** byte_xor_operation:
*** 7300,7307 ****
    [(set (match_operand:SF 0 "register_operand" "=f,f")
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(cc0) (const_int 0)])
!                     (match_operand:SF 2 "register_operand" "f,0")
!                     (match_operand:SF 3 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    "* return output_fp_conditional_move (which_alternative, operands);")
--- 7184,7191 ----
    [(set (match_operand:SF 0 "register_operand" "=f,f")
          (if_then_else:SF (match_operator 1 "comparison_operator"
                  [(cc0) (const_int 0)])
!                     (match_operand:SF 2 "i387_reg_operand" "f,0")
!                     (match_operand:SF 3 "i387_reg_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    "* return output_fp_conditional_move (which_alternative, operands);")
*************** byte_xor_operation:
*** 7350,7357 ****
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand:QI 2 "nonimmediate_operand" "q,m,q,m")
                   (match_operand:QI 3 "general_operand" "qmn,qn,qmn,qn")])
!                     (match_operand:DF 4 "register_operand" "f,f,0,0")
!                     (match_operand:DF 5 "register_operand" "0,0,f,f")))]
    "TARGET_CMOVE
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
--- 7234,7241 ----
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand:QI 2 "nonimmediate_operand" "q,m,q,m")
                   (match_operand:QI 3 "general_operand" "qmn,qn,qmn,qn")])
!                     (match_operand:DF 4 "i387_reg_operand" "f,f,0,0")
!                     (match_operand:DF 5 "i387_reg_operand" "0,0,f,f")))]
    "TARGET_CMOVE
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
*************** byte_xor_operation:
*** 7362,7381 ****
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "r,m,r,m")
                   (match_operand 3 "general_operand" "rmi,ri,rmi,ri")])
!                     (match_operand:DF 4 "register_operand" "f,f,0,0")
!                     (match_operand:DF 5 "register_operand" "0,0,f,f")))]
    "TARGET_CMOVE && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
    "#")

  (define_split
!   [(set (match_operand:DF 0 "register_operand" "=f,f")
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (const_int 0)])
!                     (match_operand:DF 3 "register_operand" "f,0")
!                     (match_operand:DF 4 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0)
          (match_dup 2))
--- 7246,7265 ----
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "r,m,r,m")
                   (match_operand 3 "general_operand" "rmi,ri,rmi,ri")])
!                     (match_operand:DF 4 "i387_reg_operand" "f,f,0,0")
!                     (match_operand:DF 5 "i387_reg_operand" "0,0,f,f")))]
    "TARGET_CMOVE && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
    "#")

  (define_split
!   [(set (match_operand:DF 0 "register_operand" "")
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (const_int 0)])
!                     (match_operand:DF 3 "i387_reg_operand" "")
!                     (match_operand:DF 4 "i387_reg_operand" "")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0)
          (match_dup 2))
*************** byte_xor_operation:
*** 7385,7396 ****
    "")

  (define_split
!   [(set (match_operand:DF 0 "register_operand" "=f,f")
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (match_operand 3 "general_operand" "")])
!                     (match_operand:DF 4 "register_operand" "f,0")
!                     (match_operand:DF 5 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0) (compare (match_dup 2) (match_dup 3)))
     (set (match_dup 0)
--- 7269,7280 ----
    "")

  (define_split
!   [(set (match_operand:DF 0 "register_operand" "")
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (match_operand 3 "general_operand" "")])
!                     (match_operand:DF 4 "i387_reg_operand" "")
!                     (match_operand:DF 5 "i387_reg_operand" "")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0) (compare (match_dup 2) (match_dup 3)))
     (set (match_dup 0)
*************** byte_xor_operation:
*** 7402,7409 ****
    [(set (match_operand:DF 0 "register_operand" "=f,f")
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(cc0) (const_int 0)])
!                     (match_operand:DF 2 "register_operand" "f,0")
!                     (match_operand:DF 3 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    "* return output_fp_conditional_move (which_alternative, operands);")
--- 7286,7293 ----
    [(set (match_operand:DF 0 "register_operand" "=f,f")
          (if_then_else:DF (match_operator 1 "comparison_operator"
                  [(cc0) (const_int 0)])
!                     (match_operand:DF 2 "i387_reg_operand" "f,0")
!                     (match_operand:DF 3 "i387_reg_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    "* return output_fp_conditional_move (which_alternative, operands);")
*************** byte_xor_operation:
*** 7452,7459 ****
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand:QI 2 "nonimmediate_operand" "q,m,q,m")
                   (match_operand:QI 3 "general_operand" "qmn,qn,qmn,qn")])
!                     (match_operand:XF 4 "register_operand" "f,f,0,0")
!                     (match_operand:XF 5 "register_operand" "0,0,f,f")))]
    "TARGET_CMOVE
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
--- 7336,7343 ----
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand:QI 2 "nonimmediate_operand" "q,m,q,m")
                   (match_operand:QI 3 "general_operand" "qmn,qn,qmn,qn")])
!                     (match_operand:XF 4 "i387_reg_operand" "f,f,0,0")
!                     (match_operand:XF 5 "i387_reg_operand" "0,0,f,f")))]
    "TARGET_CMOVE
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
*************** byte_xor_operation:
*** 7464,7483 ****
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "r,m,r,m")
                   (match_operand 3 "general_operand" "rmi,ri,rmi,ri")])
!                     (match_operand:XF 4 "register_operand" "f,f,0,0")
!                     (match_operand:XF 5 "register_operand" "0,0,f,f")))]
    "TARGET_CMOVE && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
    "#")

  (define_split
!   [(set (match_operand:XF 0 "register_operand" "=f,f")
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (const_int 0)])
!                     (match_operand:XF 3 "register_operand" "f,0")
!                     (match_operand:XF 4 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0)
          (match_dup 2))
--- 7348,7367 ----
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "r,m,r,m")
                   (match_operand 3 "general_operand" "rmi,ri,rmi,ri")])
!                     (match_operand:XF 4 "i387_reg_operand" "f,f,0,0")
!                     (match_operand:XF 5 "i387_reg_operand" "0,0,f,f")))]
    "TARGET_CMOVE && GET_MODE_CLASS (GET_MODE (operands[2])) == MODE_INT
     && GET_CODE (operands[1]) != LT && GET_CODE (operands[1]) != LE
     && GET_CODE (operands[1]) != GE && GET_CODE (operands[1]) != GT"
    "#")

  (define_split
!   [(set (match_operand:XF 0 "register_operand" "")
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (const_int 0)])
!                     (match_operand:XF 3 "i387_reg_operand" "")
!                     (match_operand:XF 4 "i387_reg_operand" "")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0)
          (match_dup 2))
*************** byte_xor_operation:
*** 7487,7498 ****
    "")

  (define_split
!   [(set (match_operand:XF 0 "register_operand" "=f,f")
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (match_operand 3 "general_operand" "")])
!                     (match_operand:XF 4 "register_operand" "f,0")
!                     (match_operand:XF 5 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0) (compare (match_dup 2) (match_dup 3)))
     (set (match_dup 0)
--- 7371,7382 ----
    "")

  (define_split
!   [(set (match_operand:XF 0 "register_operand" "")
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(match_operand 2 "nonimmediate_operand" "")
                   (match_operand 3 "general_operand" "")])
!                     (match_operand:XF 4 "i387_reg_operand" "")
!                     (match_operand:XF 5 "i387_reg_operand" "")))]
    "TARGET_CMOVE && reload_completed"
    [(set (cc0) (compare (match_dup 2) (match_dup 3)))
     (set (match_dup 0)
*************** byte_xor_operation:
*** 7504,7511 ****
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(cc0) (const_int 0)])
!                     (match_operand:XF 2 "register_operand" "f,0")
!                     (match_operand:XF 3 "register_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    "* return output_fp_conditional_move (which_alternative, operands);")
--- 7388,7395 ----
    [(set (match_operand:XF 0 "register_operand" "=f,f")
          (if_then_else:XF (match_operator 1 "comparison_operator"
                  [(cc0) (const_int 0)])
!                     (match_operand:XF 2 "i387_reg_operand" "f,0")
!                     (match_operand:XF 3 "i387_reg_operand" "0,f")))]
    "TARGET_CMOVE && reload_completed"
    "* return output_fp_conditional_move (which_alternative, operands);")