Date: Thu, 15 Jul 1993 15:56:58 -0600
From: jweiss AT silver DOT sdsmt DOT edu (John M. Weiss)
To: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: available memory


On 07/14/93 dave AT echologic DOT com (Dave Hayden) wrote:

As I understand it, John Weiss's problem with large 2D arrays comes
from all the paging.  He must access the array in both row and column
order.  When it's allocated by rows and accessed by columns, each
access hits a new page and the disk thrashes madly.


Right now, the elements are allocated to pages something like this:

			Index
	    0  1  2  3  4  5  6  7  8  9 10 11 ....
	 0+---------------------------------------
	 1| 0  0  0  0  0  0  0  0  1  1  1  1 ...
	 2| 3  3  3  3  3  3  3  3  4  4  4  4
	 3| 7  7  7  7  7  7  7  7  8  8  8  8
	 4|11 11 11 11 11 11 11 11 12 12 12 12
	 5|15 15 15 15 15 15 15 15 16 16 16 16
	 .|
	 .|                          Page number
	 .|


So why not allocate them like this instead:

	    0  1  2  3  4  5  6  7  8  9 10 11 ....
	 0+---------------------------------------
	 1| 0  0  0  0  0  1  1  1  1  2  2  2
	 2| 0  0  0  0  0  1  1  1  1  2  2  2
	 3| 7  7  7  7  7  8  8  8  8  9  9  9
	 4| 7  7  7  7  7  8  8  8  8  9  9  9
	 5|15 15 15 15 15 16 16 16 16 17 17 17
	 .|15 15 15 15 15 16 16 16 16 17 17 17
	 .|
	 .|

In other words, let each page contain a block of values intead of part
of one row or column.  Obviously this means you have to write your own
routine to get an element from it's subscripts, but in C++ you could just
overload the [] operator for this.

This will make the program run slower if you're not paging, but I
think it will help prevent thrashing if you do page.

My response:
-----------
My FFT is fairly optimal for an in-memory FFT. There are other algorithms
which are far better when the FFT will not fit in RAM, and are designed
to write intermediate results to disk in such a way as to minimize the
i/o overhead. I would like to switch over to such an algorithm only when
necessary, but I cannot find a way to query the available RAM in DJGPP.

Your approach, if I understand it correctly, attempts to preload the 2-D
arrays in such a way as to minimize paging. This is an ingenious idea,
and I may play around with it if necessary. One potential problem:
an efficient implemention *still* depends on how much RAM is available,
otherwise unnecessary paging will occur. You also need to consider the
malloc() algorithm; if this changes, the code may no longer be optimal.
I'm hoping the next release of DJGPP will provide a routine to query the
amount of available heap, and all this will prove unnecessary!

Thanks for your suggestion - JW