Mail Archives: djgpp/2001/05/11/13:47:17

Date: Fri, 11 May 2001 19:53:17 +0300
From: "Eli Zaretskii" <eliz AT is DOT elta DOT co DOT il>
Sender: halo1 AT zahav DOT net DOT il
To: Michiel de Bondt <michielb AT sci DOT kun DOT nl>
Message-Id: <3277-Fri11May2001195316+0300-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.9
CC: djgpp AT delorie DOT com
In-reply-to: <3AFBF8AB.C42331EB@sci.kun.nl> (message from Michiel de Bondt on
Fri, 11 May 2001 16:35:23 +0200)
Subject: Re: how to use inline push and pop
References: <Pine DOT SUN DOT 3 DOT 91 DOT 1010510171303 DOT 7067H-100000 AT is> <3AFBF8AB DOT C42331EB AT sci DOT kun DOT nl>
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> From: Michiel de Bondt <michielb AT sci DOT kun DOT nl>
> Newsgroups: comp.os.msdos.djgpp
> Date: Fri, 11 May 2001 16:35:23 +0200
> >
> > There's a popular belief that recursive code is terribly slow, but
> > experience shows that this is mostly a myth.  Recursive code _might_ be
> > slow, but in many cases it isn't.  Because recursive code is usually
> > smaller, it fits better into the CPU caches.  It is also simpler, so you
> > have less probability for bugs, and it lends itself better to compiler
> > optimizations.
> >
> 
> I once saw the opposite: faster recursive code.

Yes, that's what I was saying as well.

> What do you mean by "profile"? Fine-tuning the code within the
> language itself?

No, I mean use the profiler.  Compile and link the program with the
"-pg" compiler switch, then run it, and when it exits, run gprof, the
profiler which is part of the Binutils distribution.  It will show you
where your program spends most of its time.  If that place is not
in the code you are trying to inline, you are wasting your time.
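
As a rough sketch of those steps (the file name and the functions
below are made up for illustration, they are not from your program),
a small test case could look like this:

```c
/* profile_demo.c: a hypothetical test program, not code from this thread.
 *
 * Build with profiling enabled, run it, then read the profile.  The exact
 * executable name depends on your setup (on DJGPP it gets an .exe suffix):
 *
 *   gcc -pg -O2 profile_demo.c -o profile_demo
 *   profile_demo
 *   gprof profile_demo gmon.out
 */
#include <assert.h>

/* Naive recursive Fibonacci: expensive on purpose, so that it
   stands out in the gprof report. */
unsigned long fib_recursive(unsigned n)
{
    return n < 2 ? n : fib_recursive(n - 1) + fib_recursive(n - 2);
}

/* Iterative version of the same computation, for comparison. */
unsigned long fib_iterative(unsigned n)
{
    unsigned long a = 0, b = 1;
    while (n-- > 0) {
        unsigned long next = a + b;
        a = b;
        b = next;
    }
    return a;
}

/* Runs both versions, checks that they agree, returns the result. */
unsigned long run_workload(unsigned n)
{
    unsigned long r = fib_recursive(n);
    unsigned long i = fib_iterative(n);
    assert(r == i);
    return r;
}
```

Call run_workload (say, with an argument of 30) from main, and the
flat profile should list the functions by the share of run time they
consume.  Note that with optimizations the compiler may inline small
functions, in which case they will not show up separately.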

> I discovered that the base pointer can be used as well, with
> -fomit-frame-pointer.  This makes an extra register available and my
> code can be sped up in another way.

Yes, this is another optimization switch that you should try.


> I started using a lot of Intel inline asm when I discovered that my C
> statements were not translated to the one-liners I had in
> mind. The code gcc generates looks terrible.
> See e.g. the following examples:
> 
> C-code:
> T.Byte += dd[2]
> (union {long Long; unsigned char Byte; } T;)
> 
> gcc-output:
> movb %cl, %al
> addb _dd+2, %al
> movb %al,  %cl
> 
> one-liner:
> addb _dd+2, %cl

You did compile this with optimizations, yes?  And you do have the
latest GCC version, right?  Also, did you use the -march=pentium
option?
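
To check this, it helps to have a small file you can compile on its
own.  Here is a minimal sketch; the type and contents of dd and the
function wrapped around your statement are my assumptions, since your
message shows only the one line and the union:

```c
/* addbyte.c: hypothetical reconstruction of the quoted fragment.
 *
 * To look at the generated code, ask GCC to stop after compilation
 * and leave the assembly in addbyte.s:
 *
 *   gcc -O2 -march=pentium -S addbyte.c
 */

/* Assumed declaration: the thread does not show how dd is defined. */
unsigned char dd[4] = { 1, 2, 3, 4 };

/* The union from the thread, given a tag so it can be reused. */
union reg {
    long Long;
    unsigned char Byte;
};

/* Adds dd[2] to the byte member and returns that member. */
unsigned char add_byte(unsigned char start)
{
    union reg T;

    T.Long = 0;
    T.Byte = start;
    T.Byte += dd[2];    /* the statement from the thread */
    return T.Byte;
}
```

Compare the output with and without the optimization switches before
deciding that inline assembly is needed.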

You should also measure how long these instructions actually take to
run.  Code that looks poor in the assembly listing sometimes runs
faster than the version you would have written by hand.
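
One simple way to measure is with the standard clock function.  This
is only a sketch: the array, the function name, and the loop are
invented for illustration, and clock has coarse resolution, so you
need a large iteration count to get meaningful numbers:

```c
/* Timing sketch using the standard C clock() function. */
#include <time.h>

/* Hypothetical data; volatile so that the compiler must reload it
   on every iteration instead of folding the loop away. */
volatile unsigned char sample[4] = { 1, 2, 3, 4 };

/* Repeats a byte addition `iterations` times, stores the final
   accumulator in *result, and returns the elapsed CPU time in
   seconds as reported by clock(). */
double time_add_loop(unsigned long iterations, unsigned char *result)
{
    unsigned char acc = 0;
    unsigned long i;
    clock_t start = clock();

    for (i = 0; i < iterations; i++)
        acc += sample[2];

    *result = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Time each variant of the code the same way, with the same iteration
count, and compare the results rather than the instruction counts.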
