Date: Wed, 2 Feb 2000 20:29:26 +0100 (CET)
From: Martin Ockajak
To: pgcc AT delorie DOT com
Subject: Re: pgcc and egcs alignment -- function, basic block and string
In-Reply-To: <20000130211158.D641@cerebro.laendle>

On Sun, 30 Jan 2000, Marc Lehmann wrote:

> > 10% is really a lot, inside a loop, which takes (about) 25 * 35 cycles.
>
> That's very much. I doubt it really is the three nops, but...

Well, AFAIK the K6 family (especially the K6-1) is pretty sensitive to
insns being split across a cache line boundary. Such cases slow down
instruction decoding. Considering how important decoder performance is
on the K6, how short the loop is (only 25-35 cycles, as was said), and
assuming some longer insns were split this way, a 10% difference is
IMHO possible.

BTW: On my K6-2, I get the best performance when loops and functions are
aligned to an 8-byte boundary. But this (as well as the cache-line-end
issues) deserves more testing, so I will do that over the weekend.

Have a nice day

------------------------------------------------------------------------------
Martin Ockajak a.k.a. Mandos
http://hq.alert.sk/~mandos

"The goal of Computer Science is to build something that will last
at least until we've finished building it."
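
To make the cache-line argument above a bit more concrete, here is a minimal C sketch of the arithmetic involved. It is only an illustration, not anything taken from pgcc or egcs: the helper names crosses_line and pad_to are made up for this example, and the 32-byte line size is an assumption about the K6 L1 cache that should be checked against the CPU documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cache line size in bytes (hypothetical constant for this sketch;
   32 is what I believe the K6 family uses, verify for your CPU). */
#define CACHE_LINE 32u

/* Returns true if an instruction of `len` bytes that starts at `addr`
   has its first and last byte in different cache lines. */
static bool crosses_line(uintptr_t addr, unsigned len)
{
    assert(len > 0);
    return (addr / CACHE_LINE) != ((addr + len - 1) / CACHE_LINE);
}

/* Returns the number of padding bytes needed to move `addr` up to the
   next multiple of `align`. `align` must be a power of two. */
static unsigned pad_to(uintptr_t addr, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (unsigned)((align - (addr & (align - 1))) & (align - 1));
}
```

For example, a 5-byte insn starting at offset 30 would straddle two lines under this assumption, while the same insn at offset 27 would not. Likewise, a loop head at address 0x1005 would need 3 bytes of padding to reach an 8-byte boundary.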