Date: Tue, 9 Mar 1999 23:09:03 +0100
To: pgcc AT delorie DOT com
Subject: Re: loop unrolling
Message-ID: <19990309230903.F360@cerebro.laendle>
Mail-Followup-To: pgcc AT delorie DOT com
References: <199902241423 DOT JAA29290 AT envy DOT delorie DOT com>
 <19990225235232 DOT C20417 AT cerebro DOT laendle>
 <36DA64A8 DOT 5EEBA3B0 AT rug DOT ac DOT be>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <36DA64A8.5EEBA3B0@rug.ac.be>; from Marc Palmans on Mon, Mar 01, 1999 at 10:58:00AM +0100
X-Operating-System: Linux version 2.2.2 (marc AT cerebro) (gcc driver version pgcc-2.93.04 19990131 (gcc2 ss-980929 experimental) executing gcc version 2.7.2.3)
From: Marc Lehmann
Reply-To: pgcc AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

On Mon, Mar 01, 1999 at 10:58:00AM +0100, Marc Palmans wrote:
> Marc Lehmann wrote:
> > > From: Johnny Teveßen
> > > of the loop, the FPU is nearly totally left alone (well, I don't think
> > > the load-"d"-from-stack still occupies it here). And is the pentiumpro
> > > (i686) really capable of collecting 6 fp multiplications in its queue?
> >
> > Yes ;)
>
> Is this because of the out of order execution ? In one of the manuals it
> states that the execution unit cannot accept another fmul after the
> cycle it has accepted the first or am I missing something.
> Where in the source can I find this (okay, I'm new at this :), in the
> machine description I can only find a reference to the execution units.

This is true; however, the p-ii (I'm not so sure about the ppro) can
collect up to 6 instructions in its decoder queue. There are other
instructions and other sources of stalls, though, so that goal is unlikely
to be reached with normal code.
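
For readers without the earlier messages at hand, here is a rough sketch of the kind of loop under discussion. It is not the code from the thread (which is not shown here); the function names `scale_simple` and `scale_unrolled6` and the choice of an in-place array scale are assumptions made purely for illustration. The unroll factor of 6 only mirrors the "6 fp multiplications" mentioned above. The point is that the six multiplications in the unrolled body are independent of each other, so nothing in the source forces one to wait for the result of another.

```c
#include <stddef.h>

/* Straightforward version: one multiplication per iteration. */
void scale_simple(double *x, size_t n, double d)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= d;
}

/* Hand-unrolled by 6 (hypothetical example, not compiler output).
   Each statement in the main loop body touches a different element,
   so the six products do not depend on one another. */
void scale_unrolled6(double *x, size_t n, double d)
{
    size_t i = 0;

    /* Main loop: process six elements per iteration. */
    for (; i + 6 <= n; i += 6) {
        x[i]     *= d;
        x[i + 1] *= d;
        x[i + 2] *= d;
        x[i + 3] *= d;
        x[i + 4] *= d;
        x[i + 5] *= d;
    }

    /* Remainder loop: handle the last n % 6 elements. */
    for (; i < n; i++)
        x[i] *= d;
}
```

Both functions produce the same results; whether the unrolled form is actually faster on a given CPU depends on the other instructions in the loop and on the stalls mentioned above, so it has to be measured rather than assumed.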

-- 
      -----==-                                                |
      ----==-- _                                              |
      ---==---(_)__  __ ____  __       Marc Lehmann         +--
      --==---/ / _ \/ // /\ \/ /       pcg AT goof DOT com  |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE            --+
    The choice of a GNU generation                            |
                                                              |