Date: Sat, 29 Jan 2000 03:21:01 +0100
From: Jan Hubicka
To: pgcc AT delorie DOT com
Subject: Re: pgcc and egcs alignment -- function, basic block and string
Message-ID: <20000129032101.A25630@atrey.karlin.mff.cuni.cz>
References: <38921CD6 DOT 2A725779 AT ix DOT netcom DOT com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <38921CD6.2A725779@ix.netcom.com>; from cbsears@ix.netcom.com on Fri, Jan 28, 2000 at 02:48:54PM -0800
Reply-To: pgcc AT delorie DOT com
Errors-To: dj-admin AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> In pgcc some basic blocks (loops?) are being aligned.
> These 16 byte blocks are ifetch blocks.
> Quoting Agner Fog, "While aligning data is always important,
> aligning code is not necessary on the PPlain and PMMX."

The alignment (4,,7) is consistent with the Intel Optimizing Manual's
recommendation. Changing this value would require quite extensive testing
to prove your statement. For the Pentium, the 4,,7 alignment seems to be
a win according to my (simple) tests.

> He means with respect to instruction fetch, not cache line.
> Is this alignment a good idea? It seems unnecessary from
> a processor point of view and it seems to increase
> the cache footprint. The p2align 4,,7 means align min(2^4,7)
> and it means that there may be some padded nop instructions.
> This is a COST for ifetch alignment in addition to the
> cache footprint.
>
>         cmpl $31,%ebx
>         jle .L1476
>         .p2align 4,,7
> .L1471:
>         movl %edx,(%ebp)
>         addl $4,%ebp
>
> In pgcc strings are being aligned to cache lines.
> But is alignment even necessary for strings?

It is. Consider the memset/memcpy/strlen expanders. These can work much
better when they know that the destination is aligned to the word size.

>
> egcs has the same string (32) and basic block alignment (.p2align 4,,7)
> But it uses .align 4 (!) for functions. I might point out that the gas
> documentation has a bug in the .align description saying that the
> operand is like the .p2align operand, the number of bits to shift.

I will verify this tomorrow and, in case you are correct, I will fix this
bug (in both gas and gcc).

>
> So in summary, I think that functions should be aligned to cache lines
> and that basic blocks and strings should not be aligned at all.

GCC doesn't align every basic block. It aligns the tops of loops, where
alignment to an ifetch block is necessary: a loop top that falls at the
very end of an ifetch block may cause stalls in the decoding process, IMO.
The second kind of alignment is done after barriers, where the situation
is in many respects equivalent to a function entry point.

Aligning to a 16-byte boundary can be quite a good tradeoff between code
size and cache line fetching efficiency. While a function starting near
the end of a cache line is catastrophic, a function starting in the middle
of one is not so bad. Again, the Intel Optimizing Manual recommends this.
I believe Intel did some experiments before saying so.

Honza

>
> Chris Sears
> cbsears AT ix DOT netcom DOT com
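
To make the padding cost of `.p2align 4,,7` concrete, here is a minimal sketch of the rule as the gas manual describes it: the first operand is a power of two, and the third is the maximum number of bytes the directive may skip. If reaching the boundary would take more than that, no alignment is done at all. The helper name `p2align_padding` and the sample addresses are hypothetical, chosen only for illustration. The sketch models padding byte counts, not the actual nop encodings the assembler emits.

```python
def p2align_padding(addr, power, max_skip=None):
    """Return how many padding bytes `.p2align power,,max_skip` adds at addr.

    Assumption: this models only the documented gas rule (align to
    2**power unless that would skip more than max_skip bytes).
    """
    boundary = 1 << power            # 2**4 = 16 bytes, one ifetch block
    pad = (-addr) % boundary         # bytes needed to reach the next boundary
    if max_skip is not None and pad > max_skip:
        return 0                     # too expensive: alignment is skipped
    return pad


if __name__ == "__main__":
    # Hypothetical addresses for a loop-top label such as .L1471.
    for addr in (0x1000, 0x1001, 0x1008, 0x1009, 0x100C):
        pad = p2align_padding(addr, 4, 7)
        print(f"label at {addr:#06x}: {pad} padding byte(s), "
              f"lands at {addr + pad:#06x}")
```

Under this rule a loop top is moved to a 16-byte boundary only when it sits in the upper part of an ifetch block (at most 7 bytes short of the next boundary), so the padding per aligned label is bounded at 7 bytes rather than 15.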