Sender: chris AT mindspring DOT com
Message-ID: <38927310.2033EED4@ix.netcom.com>
Date: Fri, 28 Jan 2000 20:56:48 -0800
From: Chris Sears
X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.13-7mdk i686)
X-Accept-Language: en
MIME-Version: 1.0
To: hubicka AT atrey DOT karlin DOT mff DOT cuni DOT cz
CC: pgcc AT delorie DOT com
Subject: Re: pgcc and egcs alignment -- function, basic block and string
References: <38921CD6 DOT 2A725779 AT ix DOT netcom DOT com> <20000129032101 DOT A25630 AT atrey DOT karlin DOT mff DOT cuni DOT cz>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: pgcc AT delorie DOT com

Jan, thanks for your reply.

> > In pgcc some basic blocks (loops?) are being aligned.
> > These 16 byte blocks are ifetch blocks.
> > Quoting Agner Fog, "While aligning data is always important,
> > aligning code is not necessary on the PPlain and PMMX."
>
> The alignment (4,,7) is consistent with Intel Optimizing Manual's
> recommendation. Changing this value might require quite extensive testing to
> prove your statement. For Pentium, the alignment 4,,7 seems to be win
> according to my (simple) tests.

Is there a switch to turn this alignment off so that I could test it?
-mcode-align? Or does this turn off alignment of entry points as well?

> > In pgcc strings are being aligned to cache lines.
> > But is alignment even necessary for strings?
> It is. Consider memset/memcpy/strlen expanders. These can work
> much better when they know that destination is word size aligned.

I didn't quite understand this. The string alignment now is to a cache line.

        .file   "ioport.c"
        .version        "01.01"
gcc2_compiled.:
.section        .rodata
.LC0:
        .string "eip: %p\n"
        .align 32
.LC1:
        .string "/home/chris/linux/include/asm/spinlock.h"

Admittedly, a cache line is word aligned as well, but wouldn't .align 4
suffice to align to a word boundary?

> > I will verify this tomorrow and in case you are correct, I will fix this bug.
> > (in both gas and gcc).
If possible, could you send me email telling me what happened?

> > So in summary, I think that functions should be aligned to cache lines
> > and that basic blocks and strings should not be aligned at all.
> Gcc don't align every basic block. It uses alignments for top of loops, where
> the alignment to ifetch block is necessary. Top of loop appearing at the very
> end of ifetch blocks may cause stalls in the decoding process IMO.
> Second alignment is done after barriers, where situation is in many points
> of view equivalent to function entry point.

The .p2align 4,,7 is misleading. It could probably be better read as
.align 8, as the 7 represents a limit of 7 nops, which gas usually replaces
with a do-nothing leal and a nop. So there are four cases within a
32 byte cache line:

    bytes 0-7   + 7 gets aligned to bytes 7-15  -- alignment not done
    bytes 8-15  + 7 gets aligned to 16          -- alignment to 16
    bytes 16-23 + 7 gets aligned to 23-31       -- alignment not done
    bytes 24-31 + 7 gets aligned to 32          -- alignment to 32

So half of the time it isn't being aligned anyway. In the second case, it
seems a waste since the icache line will be in the buffer. No point. In the
fourth case, I can see a point, especially if there is a jmp instruction and
no nops will be executed.

> Aligning to 16 byte boundary can be quite good tradeoff between code size
> and cache line fetching efficiency. While function starting near end of
> cache line is catastrophic, function starting in the middle of it is not
> so bad.
> Again Intel Optimizing Manual recommends this. I believe Intel did some experiments
> before saying so.

16 byte alignment for functions trades memory against cache footprint.
I would strongly prefer cache and I would urge someone to look at this.
In this case, I wouldn't take Intel's word.
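
As a rough check of the four .p2align 4,,7 cases above, here is a minimal
Python sketch of the rule as the gas manual describes it: pad to the next
16 byte boundary, but skip the alignment entirely if that would take more
than 7 bytes. The function name p2align_padding and the assumption that the
cache line starts at offset 0 are illustrative only; this models the
directive, it is not code from gas or gcc.

```python
def p2align_padding(offset, power=4, max_skip=7):
    """Bytes of padding '.p2align power,,max_skip' would emit at this offset.

    Hypothetical helper: models the documented rule that the alignment is
    not done at all when it would need more than max_skip bytes.
    """
    boundary = 1 << power                 # 4 -> 16 byte boundary
    needed = (-offset) % boundary         # bytes to reach the next boundary
    return needed if needed <= max_skip else 0


if __name__ == "__main__":
    # Walk every starting offset in one 32 byte cache line.
    for offset in range(32):
        pad = p2align_padding(offset)
        landed = offset + pad
        status = "aligned" if landed % 16 == 0 else "left as is"
        print(f"offset {offset:2d}: pad {pad}, lands at {landed:2d} ({status})")
```

Under this model, 16 of the 32 starting offsets in a cache line are left
unaligned, 14 are padded and 2 already sit on a boundary, which agrees with
the "half of the time" estimate.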
To summarize:

    word alignment for strings -- .align 4 not .align 32
    cache line alignment for functions -- .align 32 not .align 4 (egcs) or .align 16 (pgcc)
    change loop body alignment to only the fourth quarter of a cacheline
        .p2align probably can't do this -- not .p2align 4,,7

Chris Sears
cbsears AT ix DOT netcom DOT com