Mail Archives: djgpp/1998/07/14/01:30:47

From: Andrew Bainbridge <andrew DOT bainbridge AT virgin DOT net>
Newsgroups: comp.os.msdos.djgpp
Subject: DJGPP division optimisations
Date: Mon, 13 Jul 1998 23:26:28 +0100
Organization: Virgin News Service
Lines: 90
Message-ID: <35AA8994.3FC1@virgin.net>
NNTP-Posting-Host: 194.168.71.22
Mime-Version: 1.0
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

I wrote a fractal generator a while ago using DJGPP and Allegro and was
quite pleased with the speed it ran at. But there is always room for
improvement, so I started to look at it again this evening. I had
suspected that DJGPP wasn't optimising the way it should, so I did a
little test.

Basically the fractal routine relied heavily on two divide operations
on long integers. In the source code I used x /= 2; y /= 2; for clarity,
on the assumption that the compiler would replace these operations with
shifts. However, it seems this isn't a valid assumption. Replacing the
code with x >>= 1; y >>= 1; gave a large speedup.

To make things simpler I have made a shorter program that demonstrates
the same problem. On my machine the test program takes 1.78 seconds to
run using shifts and 5.08 seconds using divides. Can somebody tell me
why this is happening? BTW I have tried compiling with all kinds of
switches, but mainly I use:

gcc test.c -o test.exe -O3 -m486 -ffast-math -fomit-frame-pointer -lalleg


Here is the code:

#include <stdio.h>
#include <allegro.h>

#define WIDTH		640
#define HEIGHT 		480
#define MAX_POINTS	50000000

volatile int timer = 0;

void inc_timer() {
	timer++;
}

END_OF_FUNCTION(inc_timer);

int main() {
	long r, r2, r3, r4, r5;		// Unused here; left over from the fractal code
	long x1=WIDTH/2-1, x3=WIDTH-1, y2=HEIGHT-1, y3=HEIGHT-1;
	long x = x1, y = 0;		// The current cursor position
	long i;				// Just a for loop counter
	float seconds;

	allegro_init();
	install_keyboard();
	install_timer();

	// Install one of Allegro's timer routines
	LOCK_VARIABLE(timer);
	LOCK_FUNCTION(inc_timer);
	install_int(inc_timer, 10);	// INCREMENT TIMER EVERY 1/100 SECOND

	printf("Benchmarking\n");
	readkey();

	timer = 0;

	for(i = MAX_POINTS; i; i--) {
		x += x3;				
		y += y3;
//		x /= 2;		// Swap these two lines for the two below
//		y /= 2;
		x >>= 1;
		y >>= 1;
		x += x1;
		y += y2;
//		x /= 2;		// And these two
//		y /= 2;
		x >>= 1;
		y >>= 1;
	}
		
	seconds = timer;
	printf("%ld, %ld\n", x, y);	// Need to print x and y to stop the
					// compiler from omitting the loop
					// altogether. (%ld because x, y are long.)
	seconds /= 100;
	printf("Took %2.2f seconds\n", seconds);
	return 0;
}
