Mail Archives: djgpp/1997/02/20/18:46:07

From: jesse AT lenny DOT dseg DOT ti DOT com (Jesse Bennett)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Netlib code [was Re: flops...]
Date: 20 Feb 1997 04:07:13 GMT
Organization: Texas Instruments
Lines: 127
Message-ID: <5egilh$k7g$1@superb.csc.ti.com>
References: <Pine DOT SUN DOT 3 DOT 91 DOT 970218122520 DOT 20000K-100000 AT is>
Reply-To: jbennett AT ti DOT com (Jesse Bennett)
NNTP-Posting-Host: lenny.dseg.ti.com
Mime-Version: 1.0
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

First, I would like to make a couple of comments.

1.  I am *not* interested in a FORTRAN vs. C war.  Both languages have
    their strengths and weaknesses.

2.  I did not say (or mean to imply) that C is a better language for
    numerical programming in general.  In fact, for a numerical
    program that is to be developed from scratch, FORTRAN is probably
    a better choice for many reasons.

The problem I was originally discussing is one where I want to
integrate some LAPACK/BLAS routines into a rather large (5000+ lines)
body of C code.  As I see it, I have three choices: a) rewrite the C
code so that it conforms to the FORTRAN conventions for matrix
storage, b) "roll my own" numerical code, or c) write C wrappers
around the FORTRAN code to implement any necessary data reordering.
I have implemented c) and am considering a) and b).  I doubt b) will
happen.
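
As a rough illustration of the data reordering that option c) involves,
here is a minimal sketch.  It assumes the C code stores each matrix as
an array of row pointers (as in the C code later in this post) and that
the FORTRAN routine wants one contiguous column-major buffer.  The
function names and the `ld` (leading dimension) parameter are
hypothetical, chosen only for this example; they are not part of LAPACK
or BLAS.

```c
#include <stddef.h>

/* Copy an m-by-n matrix held as an array of row pointers into a
   contiguous column-major buffer.  Element (i,j) is written to
   dst[i + j*ld], where ld >= m is the leading dimension of dst. */
void rows_to_colmajor(double **src, size_t m, size_t n,
                      double *dst, size_t ld)
{
    size_t i, j;

    for (j = 0; j < n; j++)
        for (i = 0; i < m; i++)
            dst[i + j * ld] = src[i][j];
}

/* Inverse copy: read a column-major buffer back into row storage,
   e.g. to pick up results after the FORTRAN routine has returned. */
void colmajor_to_rows(const double *src, size_t ld,
                      size_t m, size_t n, double **dst)
{
    size_t i, j;

    for (j = 0; j < n; j++)
        for (i = 0; i < m; i++)
            dst[i][j] = src[i + j * ld];
}
```

A wrapper would call rows_to_colmajor() on each input, call the FORTRAN
routine on the flat buffers, and call colmajor_to_rows() on each output.
The two extra copies are the price of leaving the existing C code
untouched.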

> On 17 Feb 1997, Dave Love wrote:
> 
>> >>>>> "Jesse" == Jesse Bennett <jesse AT lenny DOT dseg DOT ti DOT com> writes:
>> 
>>  Jesse> The sad thing (to me) is that well written C can perform at
>>  Jesse> least as well (and often better) than equivalent Fortran code
>>  Jesse> in numerical analysis applications.
>> 
>> You mean there's some feature of C that makes it possible to optimize
>> better than `equivalent' numerical Fortran with all the support F95
>> provides for performance (especially in parallel)?

First, AFAIK F95 is still vaporware.  Even F90 is not widely
available.  For my platform(s) the only real F90 choice is NAG F90
which is nothing more than an F90 version of F2C (it translates to C
which is then compiled).  If you change the above comment to refer to
F77 (the only remaining choice) the answer is yes.

>> Which one?

The ability to do low-level programming.  Using C, the programmer can
manipulate data at a very low level (approaching the machine code
level).  The advantage of this is that the programmer does not have to
rely on the compiler to perform the algorithmic optimizations.  For
example, consider the matrix-matrix multiplication problem C = AB + C.
This is the bread and butter of numerical analysis problems.  In
FORTRAN it would probably be written as:

      DO 90, J = 1, N
         DO 80, L = 1, K
            TEMP = B( L, J )
            DO 70, I = 1, M
               C( I, J ) = C( I, J ) + TEMP*A( I, L )
70          CONTINUE
80       CONTINUE
90    CONTINUE

A straightforward implementation in C would be:

   int i, j, l;
   double temp;

   for( i=0; i<m; i++ )
      for( l=0; l<k; l++ )
      {
         temp = a[i][l];
         for( j=0; j<n; j++ )
            c[i][j] += temp * b[l][j];
      }

When coded in this fashion, the FORTRAN version will likely compile to
more efficient code, largely because of the pointer aliasing issue,
which limits the optimizations a C compiler is allowed to make.  Now,
consider an alternate (although admittedly much more obscure)
implementation in C:

   double *aptr, **bptr2, **endrow, *aend, *crow;
   register double temp, *cptr, *bptr, *bend;  /* inner loop variables */

   endrow = a+m;
   while( a<endrow )
   {
      bptr2 = b;
      aptr = *a++;
      crow = *c++;
      aend = aptr+k;
      while( aptr<aend )
      {
         temp = *aptr++;
         cptr = crow;
         bptr = *bptr2++;
         bend = bptr+n;
         while( bptr<bend )
            *cptr++ += temp * (*bptr++);
      }
   }

Since all of the array indexing is explicitly coded using pointers
held in local variables, the compiler does not have to reload row
addresses after each store, and the issue of aliasing is largely
avoided.  Standard F77 does not provide memory pointers, so this level
of "human optimization" is not possible there.  I understand that F90
incorporates the concept of pointers, but I have no experience with
it.  Quality F90 compilers are not currently available on a wide
variety of common platforms at a reasonable cost (IMHO).
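
For anyone who wants to check that the two C fragments above compute
the same thing, here is a minimal sketch that wraps each in a function.
The function names and the argument order are hypothetical, made up for
this example.  Both assume a is m-by-k, b is k-by-n and c is m-by-n,
each stored as an array of row pointers.

```c
/* Array-indexed version: C += A*B. */
void matmul_index(double **a, double **b, double **c, int m, int k, int n)
{
    int i, j, l;
    double temp;

    for (i = 0; i < m; i++)
        for (l = 0; l < k; l++) {
            temp = a[i][l];
            for (j = 0; j < n; j++)
                c[i][j] += temp * b[l][j];
        }
}

/* Pointer-based version: C += A*B.  The parameters a and c are advanced
   as the rows are consumed; they are local copies, so the caller's
   pointers are not changed. */
void matmul_ptr(double **a, double **b, double **c, int m, int k, int n)
{
    double **endrow = a + m;

    while (a < endrow) {
        double **bptr2 = b;       /* restart at the first row of B   */
        double *aptr = *a++;      /* current row of A                */
        double *crow = *c++;      /* matching row of C               */
        double *aend = aptr + k;

        while (aptr < aend) {
            double temp = *aptr++;    /* a[i][l]                     */
            double *cptr = crow;
            double *bptr = *bptr2++;  /* row l of B                  */
            double *bend = bptr + n;

            while (bptr < bend)
                *cptr++ += temp * (*bptr++);
        }
    }
}
```

Both versions visit the elements in the same order, so given identical
inputs they should produce identical results in c, not merely results
that agree to within rounding.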

I have done some benchmarking of the above matrix multiply code on a
DEC Alpha using the native DEC compilers.  The "pointer based" C
implementation was fastest, followed by the F77 code and the "array
based" C code.  Hence my comments.
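
For anyone who wants to repeat this kind of comparison, here is a
minimal timing sketch.  It is not the harness I used; every name in it
(matmul_fn, baseline_matmul, time_matmul) is hypothetical, and it
assumes square matrices stored as arrays of row pointers.  It uses only
the standard clock() function, which measures processor time at whatever
resolution the platform provides.

```c
#include <stdlib.h>
#include <time.h>

/* Signature shared by the kernels under test: C += A*B. */
typedef void (*matmul_fn)(double **a, double **b, double **c,
                          int m, int k, int n);

/* Array-indexed kernel, included so the sketch stands on its own. */
void baseline_matmul(double **a, double **b, double **c, int m, int k, int n)
{
    int i, j, l;

    for (i = 0; i < m; i++)
        for (l = 0; l < k; l++)
            for (j = 0; j < n; j++)
                c[i][j] += a[i][l] * b[l][j];
}

/* Free the first `rows` rows of a matrix, then the row-pointer array. */
static void free_matrix(double **mat, int rows)
{
    int i;

    if (mat == NULL)
        return;
    for (i = 0; i < rows; i++)
        free(mat[i]);
    free(mat);
}

/* Allocate a rows-by-cols matrix with every element set to `fill`.
   Returns NULL if any allocation fails. */
static double **alloc_matrix(int rows, int cols, double fill)
{
    int i, j;
    double **mat = malloc((size_t)rows * sizeof *mat);

    if (mat == NULL)
        return NULL;
    for (i = 0; i < rows; i++) {
        mat[i] = malloc((size_t)cols * sizeof **mat);
        if (mat[i] == NULL) {
            free_matrix(mat, i);   /* only rows 0..i-1 were allocated */
            return NULL;
        }
        for (j = 0; j < cols; j++)
            mat[i][j] = fill;
    }
    return mat;
}

/* Written after the timed loop so the results count as used. */
static volatile double sink;

/* Run fn `reps` times on size-by-size matrices and return the elapsed
   processor time in seconds, or -1.0 if allocation failed. */
double time_matmul(matmul_fn fn, int size, int reps)
{
    double **a = alloc_matrix(size, size, 1.0);
    double **b = alloc_matrix(size, size, 2.0);
    double **c = alloc_matrix(size, size, 0.0);
    double elapsed = -1.0;
    clock_t start;
    int r;

    if (a != NULL && b != NULL && c != NULL) {
        start = clock();
        for (r = 0; r < reps; r++)
            fn(a, b, c, size, size, size);
        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        sink = c[0][0];
    }
    free_matrix(a, size);
    free_matrix(b, size);
    free_matrix(c, size);
    return elapsed;
}
```

Calling time_matmul() once per kernel with the same size and reps gives
numbers that can be compared directly.  Choose reps large enough that
the elapsed time is well above the clock() resolution, and expect the
ranking to depend on the compiler, the optimization flags and the
machine.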

>> [The next version of G77 will specifically take advantage of the
>> Fortran no-alias semantics to do optimizations which aren't possible
>> for standard C.]

This I look forward to.  Will it be in 0.5.20?  If G77 had been a bit
more mature 2-3 years ago I might have coded my application in FORTRAN
to begin with.

In article <Pine DOT SUN DOT 3 DOT 91 DOT 970218122520 DOT 20000K-100000 AT is>,
	Eli Zaretskii <eliz AT is DOT elta DOT co DOT il> added:
> 
> To the best of my knowledge, Fortran indeed can be optimized better than 
> C, mainly because of the pointer aliasing issue.

This is true as far as compiler optimizations go.  It is also true
that the pointer aliasing issue can be overcome by explicitly
dereferencing pointers.

Best Regards,
Jesse
