Mail Archives: djgpp-workers/1999/07/03/20:00:40

Sender: bill AT delorie DOT com
Message-ID: <377EA4A6.EB687A6D@taniwha.org>
Date: Sun, 04 Jul 1999 12:02:46 +1200
From: Bill Currie <bill AT taniwha DOT org>
X-Mailer: Mozilla 4.6 [en] (X11; I; Linux 2.2.9 i486)
X-Accept-Language: en
MIME-Version: 1.0
To: djgpp-workers AT delorie DOT com
Subject: Re: .align directives in libc.a
References: <Pine DOT SUN DOT 3 DOT 91 DOT 990701074005 DOT 22046A-100000 AT is> <377BB217 DOT 2FFBAEA8 AT inti DOT gov DOT ar> <377C5986 DOT 1B33420B AT taniwha DOT org> <377CB4A1 DOT BB72A3EA AT inti DOT gov DOT ar>
Reply-To: djgpp-workers AT delorie DOT com

salvador wrote:
> Yes, most processors use 32, but that's also a waste if your routine is around 32
> bytes and you pad it with 60 bytes ;-) (30+30).
> 32 bytes is too much and you start losing from other things. I don't know how MSVC
> determines when 32 bytes is a good idea or not; perhaps it is related to the size of the
> functions. BTW MSVC also exploits "proximity" by moving functions closer to the
> caller (mostly small static ones; that's usually better than inlining if the function
> has more than a couple of lines).

I believe the goal is to land on the beginning of a cache line if you
have to have a cache miss.  However, for small functions, I agree that
if you can pack more than one function into a cache line you will win
more often.  You will also have a smaller 
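
To put rough numbers on the packing trade-off above, here is a minimal
sketch of the arithmetic. It is an illustration only, not anything gcc,
gas or MSVC actually does: the 32-byte line size is taken from the
discussion, and the helper names (padding_for, lines_touched, layout)
are hypothetical.

```python
CACHE_LINE = 32  # assumed cache line size in bytes, as discussed above


def padding_for(addr, align):
    """Bytes of padding needed to move addr up to the next multiple of align."""
    return (-addr) % align


def lines_touched(start, size, line=CACHE_LINE):
    """Number of cache lines covered by `size` bytes starting at `start`."""
    if size == 0:
        return 0
    first = start // line
    last = (start + size - 1) // line
    return last - first + 1


def layout(sizes, align, base=0):
    """Place functions back to back, each start aligned to `align` bytes.

    Returns (list of start addresses, total bytes used including padding).
    """
    addr = base
    starts = []
    for size in sizes:
        addr += padding_for(addr, align)  # pad up to the alignment boundary
        starts.append(addr)
        addr += size
    return starts, addr - base
```

With three 10-byte functions, layout([10, 10, 10], 32) places them at
0, 32 and 64 and uses 74 bytes spread over three cache lines, while
layout([10, 10, 10], 1) packs them into 30 bytes that fit in a single
line.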

> > > But looks like the most sensitive stuff is the entry point of functions, not the
> > > align of loops or jumps.
> >
> > Nope, any destination: functions, loops and jumps are all equally
> > important.
> 
> Not for K6 and not for Pentium MMX, I tried it. In fact MSVC does *not* align jumps or
> loops. On K6 processors it could be even worse (if a loop is at a xxxxxC memory
> address it works slowly); on Pentium MMX the difference is very small.

Hmm, interesting.  I'll believe it, as you've obviously actually
measured it.  Hmmm, this implies an improvement in the handling of jump
instructions.

> I think aligning jumps and loops is only a good idea for big functions; small functions
> could need more cache lines if you add bytes inside.

Agreed, mostly.  I would say from your earlier comments about loop
alignment that even big functions could benefit from unaligned loops on
newer processors.  I think you would have to check this out (I can't
yet, I've only got a 486 and a 386, but not for too much longer :)

> Currently I think MSVC has better ideas than gcc because it generates faster code.

GCC is, unfortunately, held back by its portability and multiple
targets.  MS gets to concentrate on just x86, and probably just the
newer ones, and thus can do more clever tricks than gcc can ATM.

> Take a look at my compila.html page.

Is this the one you posted recently?  I had a look at that one and
noticed MSVC was a little faster, but (from memory) not always.

> I think Gas should have conditional alignment instructions, like: "align it if all the
> references are at 64 or more bytes of distance"

Probably too hard and maybe not worth the effort.  I think this would
require too many passes.
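
The concern about passes can be seen in a toy model. This is a
hypothetical sketch and not how gas works: code is a list of block
sizes, every block start is a potential label, a reference is a
(source block, target block) pair measured from the start of the source
block, and the function names are invented for illustration. Padding
inserted in front of one label moves everything after it, which can
change the answer for another label, so the decision has to be repeated
until nothing changes.

```python
def assign_addresses(sizes, aligned, align):
    """Lay blocks out in order, padding before each block index in `aligned`."""
    addrs = []
    addr = 0
    for i, size in enumerate(sizes):
        if i in aligned:
            addr += (-addr) % align  # pad up to the next alignment boundary
        addrs.append(addr)
        addr += size
    return addrs


def conditional_align(sizes, refs, align=16, min_dist=64, max_passes=10):
    """Align a target only if every reference to it is >= min_dist bytes away.

    Returns (set of aligned block indices, number of passes used).
    The pass cap is a safety net: this toy model is not guaranteed to settle.
    """
    aligned = set()
    for n in range(1, max_passes + 1):
        addrs = assign_addresses(sizes, aligned, align)
        wanted = set()
        for target in {t for _, t in refs}:
            dists = [abs(addrs[s] - addrs[target]) for s, t in refs if t == target]
            if all(d >= min_dist for d in dists):
                wanted.add(target)
        if wanted == aligned:  # layout is stable, stop iterating
            return aligned, n
        aligned = wanted
    return aligned, max_passes
```

For example, conditional_align([70, 5, 30, 25, 10], [(0, 2), (1, 4)])
needs three passes: the first aligns block 2, the padding this adds
pushes the reference to block 4 from 60 to 65 bytes so the second pass
aligns block 4 as well, and the third confirms that nothing changes.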

Bill
-- 
Leave others their otherness.
