Date: Wed, 19 Feb 1997 21:45:21 -0600 (CST)
From: "Jesse W. Bennett" <jesse AT lenny DOT dseg DOT ti DOT com>
Reply-To: Jesse Bennett <jbennett AT ti DOT com>
To: kagel AT dg1 DOT bloomberg DOT com
cc: jbennett AT ti DOT com, djgpp AT delorie DOT com
Subject: Re: Netlib code [was Re: flops...]
In-Reply-To: <9702182137.AA02157@quasar.bloomberg.com >
Message-ID: <Pine.LNX.3.91.970219212755.25328B-100000@lenny.dseg.ti.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Tue, 18 Feb 1997 kagel AT quasar DOT bloomberg DOT com wrote:

[snip]

>    The problem is not with the performance of the Fortran
>    code but with the memory bandwidth overhead associated with converting
>    the C row-major matrices to the Fortran column-major order prior to
> 
> What conversion?  The FORTRAN is not converting your arrays.  FORTRAN and C
> share a common calling convention (ignoring the facts that FORTRAN passes
> string lengths and always passes pointers).  They just disagree on which
> dimension to increment first.  You are not inverting the arrays are you?  Just
> declare the C arrays with the indices reversed and everything will be fine.
> This way FORTRAN can see its columns where C sees rows and both can work
> efficiently without copying.  Think about it!  We combine FORTRAN and C here at
> Bloomberg all the time (70% of our code is still in FORTRAN) with C calling
> FORTRAN and FORTRAN calling C and none of the problems that you report.
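
For reference, the index reversal suggested above can be sketched as
follows.  This is only an illustration under my own assumptions: the
names NROW, NCOL, colmajor_offset and reversed_c_offset are hypothetical
and come from neither LAPACK nor the BLAS.  It shows that element
(row, col) of a C array declared a[NCOL][NROW] sits at the same offset
Fortran uses for A(row+1, col+1) in an array dimensioned A(NROW, NCOL).

```c
#include <assert.h>
#include <stddef.h>

enum { NROW = 3, NCOL = 2 };   /* hypothetical dimensions */

/* Zero-based element offset of (row, col) in a column-major array with
   nrows rows, i.e. the layout Fortran uses for A(nrows, ncols). */
static size_t colmajor_offset(size_t row, size_t col, size_t nrows)
{
    return row + col * nrows;
}

/* Zero-based element offset of a[col][row] inside a C array declared
   with the indices reversed: double a[NCOL][NROW]. */
static size_t reversed_c_offset(size_t row, size_t col)
{
    static double a[NCOL][NROW];
    assert(row < NROW && col < NCOL);
    /* Measure the byte distance from the start of the whole array. */
    return (size_t)((const char *)&a[col][row] - (const char *)a)
           / sizeof(double);
}
```

The two functions agree for every valid (row, col), which is why no
copying is needed when the C code is written with reversed indices.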

Yes, I am transposing the arrays before calling some of the LAPACK/BLAS
library routines.  I understand that the arrays could be defined in C as

   array[col][row]

but my problem is that I have a large amount of code that uses the more
conventional array[row][col] notation.  A representative example of the
type of calculations I am doing is

1. Read an image array into matrix Y.
     
2. Extract a feature matrix B = f(Y).

3. Calculate the eigenvalues/vectors of B (using LAPACK).

4. Calculate a feature transformation A = f(B eigenvalues/vectors, ...).

5. Calculate X = inv(A)B (using LAPACK).

6. Perform image classification using X.

The code for steps 1, 2, 4, and 6 exists and is, collectively, a large and
complex piece of code written in C.  For some cases I have no problems
calling the F77 code directly.  For example the matrix multiply C = AB can
be passed directly to the BLAS as C' = B'A' where ' denotes matrix
transpose.  In the linear equation problem of step 5 I need the F77 code
to solve X' = [inv(A)B]' = B'inv(A').  I see no easy way to do this
without either transposing the C arrays or rewriting a large amount of
code (actually I could solve X = inv(A')B' and transpose the X array to
get X', which is really X in C! :).  If you have any ideas which might
help I would love to hear them.  Maybe I am overlooking something obvious. 
It wouldn't be the first time.  :)
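
To make the matrix multiply case concrete, here is a minimal sketch of
the C' = B'A' trick.  The routine colmajor_gemm is a hypothetical
stand-in, written in C, for a column-major multiply such as the one the
Fortran BLAS provides; it is not the real BLAS interface, and
rowmajor_matmul is likewise a name made up for this example.  The point
is only that swapping the operands and the dimensions lets a
column-major routine fill a row-major result without any transposition.

```c
#include <assert.h>

/* Hypothetical stand-in for a Fortran-style multiply: computes C = A*B
   with all three matrices stored column-major.
   A is m x k, B is k x n, C is m x n. */
static void colmajor_gemm(int m, int n, int k,
                          const double *a, const double *b, double *c)
{
    assert(m > 0 && n > 0 && k > 0);
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += a[i + p * m] * b[p + j * k];
            c[i + j * m] = sum;
        }
    }
}

/* Row-major C = A*B, where A is m x k, B is k x n and C is m x n.
   A column-major routine sees the row-major buffers as A' (k x m) and
   B' (n x k), so asking it for B'A' yields C' (n x m) in column-major
   order, which is exactly C (m x n) in row-major order. */
static void rowmajor_matmul(int m, int n, int k,
                            const double *a, const double *b, double *c)
{
    colmajor_gemm(n, m, k, b, a, c);
}
```

For example, with A = {{1,2,3},{4,5,6}} and B = {{7,8},{9,10},{11,12}}
declared as ordinary array[row][col] C arrays, rowmajor_matmul(2, 2, 3,
...) leaves C = {{58,64},{139,154}}.  The linear equation problem of
step 5 does not reduce this simply, since the inverse ends up on the
wrong side of B'.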
  
Best Regards,
Jesse