Mail Archives: djgpp/1994/08/17/15:33:59

To: eliz AT is DOT elta DOT co DOT il
Cc: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: Re: Speed tuning programs
Date: Wed, 17 Aug 1994 17:25:55 +0100
From: Olly Betts <olly AT mantis DOT co DOT uk>

In message <9408170942 DOT AA02203 AT is DOT elta DOT co DOT il>, eliz AT is DOT elta DOT co DOT il writes:
>> I haven't tried profiling the code recently, so it might be worth doing
>> again.  However, this would probably just reduce the times for both
>> versions.
>
>Not necessarily true.  The libraries and the code generation of the two
>compilers (BC and GCC) are quite different, so what's a hot spot in one
>version, doesn't have to be such in another.  For example, imagine that
>some specific library function is much more efficient for one of the
>compilers, and this very function is used in the innermost loop of
>your program.

Good point.  I've had a go at profiling the code, but I think I'm failing
to do something.  Here's what I did:

Deleted *.o, the COFF file and the executable
Modified the makefile to add the -pg flag to all compiles and links
Rebuilt the program
Ran the program on a sample data set (20.43 secs internal timing)
Ran: gprof survex.out  [survex.out is the COFF file]

Here's the output (edited highlights anyway):

===========================================================================
Flat profile:

Each sample counts as 0.055556 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 99.99      0.06     0.06        1    55.55    55.55  main
  0.00      0.06     0.00    92183     0.00     0.00  fputc
  0.00      0.06     0.00    43295     0.00     0.00  strncmp
  0.00      0.06     0.00    23389     0.00     0.00  skipblanks
  0.00      0.06     0.00    16879     0.00     0.00  tochar
[...]
  0.00      0.06     0.00        1     0.00     0.00  write_image

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone.  This is the major sort for this
           listing.
[...]
===========================================================================

Now from my reading of this, the profiler thinks that the program took
0.06 seconds (a single 0.055556 second sample), all spent in main(),
which is just plain wrong given the 20.43 secs of internal timing.  I'm
sure I must just be failing to do something.

It does show that it makes a lot of use of fputc() and strncmp() though,
which may be pertinent if the library implementations of these are weak.
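
Until gprof produces usable timings, one rough way to check whether
fputc() is slow is to time a burst of calls by hand.  The following is
only a sketch: time_fputc_calls() is a hypothetical helper, not part of
survex, and it assumes clock() has fine enough resolution for the
interval measured, which may not hold on every DOS setup.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not from survex): write n bytes to fp with
 * fputc() and return the CPU time used, in seconds, as reported by
 * clock().  The result is only as fine-grained as clock() itself. */
static double time_fputc_calls(FILE *fp, long n)
{
    clock_t start = clock();
    long i;

    for (i = 0; i < n; i++)
        fputc('x', fp);

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling this with n around 92183 (the fputc call count in the profile
above) on a scratch file, under both compilers, would give a crude
comparison of the two library implementations.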

>On the other hand, library functions which move buffers, such as
>strcpy(), memcpy(), memset(), memmove() are inlined by BC under
>-O2, which GCC does not.  Also, in BC these work by moving 16-bit
>words, whereas memcpy() which comes with DJGPP moves bytes.  If you
>have such calls, you're better off using movedata() which moves
>32-bit double-words.

Looks like strncpy() might be a worthy candidate then.  n is 12 in
almost all the calls to it, so the function overhead is probably fairly
significant.  If only I could get some timings from gprof ...
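
For what it's worth, a short fixed-length copy like this could be
written as a small inline function so that the call overhead
disappears.  This is a sketch under assumptions: copy_n_inline() is a
hypothetical name, and it is meant to reproduce strncpy()'s behaviour
(including the NUL padding), not to be faster in every case.

```c
#include <stddef.h>

/* Hypothetical replacement for strncpy(dst, src, n), small enough for
 * the compiler to inline at each call site. */
static inline char *copy_n_inline(char *dst, const char *src, size_t n)
{
    size_t i = 0;

    /* Copy characters up to, but not including, the terminating NUL. */
    for (; i < n && src[i] != '\0'; i++)
        dst[i] = src[i];

    /* Like strncpy(), pad the rest of the buffer with NULs. */
    for (; i < n; i++)
        dst[i] = '\0';

    return dst;
}
```

With n fixed at 12 at the call site, an optimising compiler has the
chance to unroll both loops; whether that actually beats the library
routine would still need measuring.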

Olly
