Mail Archives: djgpp/1997/02/28/14:45:00

From: Dave Love <d DOT love AT dl DOT ac DOT uk>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Netlib code [was Re: flops...]
Date: 28 Feb 1997 15:30:52 +0000
Organization: Daresbury Laboratory, Warrington WA4 4AD, UK
Message-ID: <rzqvi7d6j83.fsf@djlvig.dl.ac.uk>
References: <Pine DOT LNX DOT 3 DOT 91 DOT 970226105830 DOT 29585A-100000 AT lenny DOT dseg DOT ti DOT com>
NNTP-Posting-Host: djlvig.dl.ac.uk
Lines: 80
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

>>>>> "Jesse" == Jesse W Bennett <jesse AT lenny DOT dseg DOT ti DOT com> writes:

 Jesse> void gemm( int m, int n, int k, double **a, double **b, double **c )
 Jesse> {

 Jesse> /* C = AB + C */

 Jesse>    int i, j, l;
 Jesse>    double temp;

 Jesse>    for( i=0; i<m; i++ )
 Jesse>       for( l=0; l<k; l++ )
 Jesse>       {
 Jesse>          temp = a[i][l];
 Jesse>          for( j=0; j<n; j++ )
 Jesse>             c[i][j] += temp * b[l][j];
 Jesse>       }
 Jesse> }

 Jesse> compiled with gcc -O2 -S gemm.c

 Jesse> The generated assembly for the inner loop is:

 Jesse> L13:
 Jesse>         movl (%edi),%edx
 Jesse>         movl (%esi),%eax
 Jesse>         fld %st(0)
 Jesse>         fmull (%eax,%ecx,8)
 Jesse>         faddl (%edx,%ecx,8)
 Jesse>         fstpl (%edx,%ecx,8)
 Jesse>         incl %ecx
 Jesse>         cmpl %ecx,12(%ebp)
 Jesse>         jg L13

 Jesse> It is not clear to me why the edx and eax registers are being reloaded 
 Jesse> each iteration.  

I can't show DJGPP g77 output at present, but I assume the code it
generates would be the same as the Linux output below.  (On a 586, and
especially on a Pentium Pro, the speed will actually be determined by
how your doubles happen to get aligned, sigh.)
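
On the reloads Jesse asks about: a likely explanation (an assumption,
not something established in this thread) is that with double **
arguments the compiler cannot prove that the store to c[i][j] leaves
the row pointers c[i] and b[l] unchanged, so it fetches them from
memory again on every pass.  A minimal sketch of the usual workaround
copies the row pointers into locals before the inner loop.  The name
gemm_hoisted is hypothetical, and whether a given gcc then keeps the
pointers in registers has not been checked here.

```c
/* C = A*B + C, same arithmetic as the quoted gemm().
   gemm_hoisted is a hypothetical name used only for this sketch. */
void gemm_hoisted(int m, int n, int k, double **a, double **b, double **c)
{
    int i, j, l;

    for (i = 0; i < m; i++) {
        const double *ai = a[i];   /* row i of A, fetched once per i */
        double *ci = c[i];         /* row i of C, fetched once per i */

        for (l = 0; l < k; l++) {
            const double temp = ai[l];
            const double *bl = b[l];   /* row l of B, fetched once per l */

            /* ci and bl are locals whose addresses are never taken, so a
               store through ci[j] cannot change them. */
            for (j = 0; j < n; j++)
                ci[j] += temp * bl[j];
        }
    }
}
```

The result matches the original routine as long as the pointer arrays
themselves do not overlap the matrix data, which is the normal case.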

$ cat a.f
      subroutine gemm(m, n, k, a, b, c)
      integer i,m,n,k,l,j
      double precision a(k,m),  b(n,k),  c(n,m)
      do i=1,m     ! poor for illustration only
        do l=1,k
          do j=1,n
            c(j,i) = c(j,i) + a(l,i)*b(j,l)
          end do
        end do
      end do
      end
$ g77 -S -O2 -v a.f
g77 version 0.5.19.1
 gcc -S -O2 -v -xf77 a.f
Reading specs from /usr/lib/gcc-lib/i486-unknown-linux/2.7.2.1.f.1/specs
gcc version 2.7.2.1.f.1
 /usr/lib/gcc-lib/i486-unknown-linux/2.7.2.1.f.1/f771 a.f -fset-g77-defaults -quiet -dumpbase a.f -O2 -version -fversion -o a.s
GNU F77 version 2.7.2.1.f.1 (i386 Linux/ELF) compiled by GNU C version 2.7.2.1.f.1.
GNU Fortran Front End version 0.5.19.1 compiled: Feb  1 1997 19:51:03
$ more +/L13 a.s

...skipping
        addl 24(%ebp),%eax
        .align 4
.L13:
        movl -24(%ebp),%edi
        fldl (%edi)
        fmull (%eax)
        faddl (%edx)
        fstpl (%edx)
        addl $8,%eax
        addl $8,%edx
        decl %ecx
        jns .L13
.L8:
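
For readers more at home in C, the Fortran loop nest can be sketched
with flat, contiguous column-major arrays, which is the storage model
that lets the g77 inner loop above simply step its pointers by 8 bytes
per iteration.  This is an illustration only: the name gemm_colmajor
and the explicit leading-dimension parameters lda, ldb and ldc are
assumptions of the sketch, not part of the Fortran routine.

```c
#include <stddef.h>

/* Mirrors c(j,i) = c(j,i) + a(l,i)*b(j,l) with column-major storage:
   element (row, col) of an array x lives at x[col * ldx + row].
   gemm_colmajor, lda, ldb and ldc are hypothetical names. */
void gemm_colmajor(int m, int n, int k,
                   const double *a, int lda,
                   const double *b, int ldb,
                   double *c, int ldc)
{
    int i, j, l;

    for (i = 0; i < m; i++) {
        double *ci = c + (size_t)i * ldc;              /* column i of c */

        for (l = 0; l < k; l++) {
            const double ali = a[(size_t)i * lda + l]; /* a(l,i) */
            const double *bl = b + (size_t)l * ldb;    /* column l of b */

            /* Unit-stride walk down one column of b and one of c. */
            for (j = 0; j < n; j++)
                ci[j] += ali * bl[j];
        }
    }
}
```

Unlike the g77 output, the sketch reads a(l,i) into a local before the
inner loop instead of loading it on each pass.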
