Mail Archives: djgpp/1997/02/09/22:35:36

From: jestandi AT cs DOT indiana DOT edu (Jeff Standish)
Newsgroups: comp.os.msdos.djgpp
Subject: timing instructions [code, LONG]
Date: 9 Feb 1997 16:58:00 -0500
Organization: Computer Science, Indiana University
Lines: 834
Message-ID: <5dlh98$qrl@gummy.cs.indiana.edu>
NNTP-Posting-Host: gummy.cs.indiana.edu
NNTP-Posting-User: jestandi
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

With the recent discussion of how long it takes for floats vs. doubles
to run, I decided to convert a program of my own over to DJGPP to get
the instruction rate for various arithmetic operations.  Included below
are the program and the results of running it on a 100 MHz Pentium
(using one million iterations).

I ran the program under both Win95 and plain DOS without Win95, and
got pretty much the same results (Win95 is slightly slower, which I
attribute to all of the stuff Win95 is doing in the background).

Now, my question: if you look at the two sets of results below, you'll
see that they are almost identical, _except_ for integer +/-, which
literally takes twice as long to run under Win95.  I'm baffled by the
result.  I don't think it has anything to do with cache misses, since
those will typically only hurt the first loop.  This is essentially the
same code I use for benchmarking different machines (except for the code
to get elapsed time), and I've never seen the likes of this before.
What about Win95 would cause only integer +/- to take twice as long to
execute?

                   MS-DOS       Win95  (instructions per second)
                  --------    --------
int +:   	  33217359    16326061
int -:   	  33221636    16386937
int *:   	   9966067     9062017
int /:   	   1993184     1934307
int %:   	   1993151     1934341
float +:	  14235940    14087006
float -:	  14235812    14092372
float *:	  14235112    14120285
float /:	   2317579     2295202
double +:	  14237554    14127662
double -:	  14237554    14106763
double *:	  14235791    14092372
double /:	   2317557     2296431
pow():   	    140768      139298
powf():   	    168457      166748
sqrt():   	    881877      872720
sqrtf():   	    996510      986519
sin():   	    996526      986750
sinf():   	   1444230     1429516

Regarding the discussion of whether floats or doubles are faster: from
these numbers I would say that on Pentiums the basic operations take the
same time, but for the trig and other math functions, the float version
takes less time than the double version (fewer bits of accuracy
required, presumably).

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

/*************************************************************************
*
* Arithmetic instruction timing benchmark program.
* Copyright 1997, Jeff Standish.  All rights reserved.
*
* This program uses loops of instructions to determine the rate at
* which instructions are executed, in terms of instructions per second.
* Two loops are executed, one with 8 copies of the same instruction,
* the other with 16 copies.  The loops execute ITERATIONS times,
* so the first loop takes ITERATIONS * (8 * instruction_time + loop_overhead),
* and the second loop takes ITERATIONS * (16 * instruction_time + loop_ovrhd).
*
* The difference in the two times is therefore ITERATIONS*8*instruction_time.
* Dividing this number by ITERATIONS * 8 gives the time required for one
* operation.  Inverting that value gives the number of instructions executed
* in one second.
*
* The larger the value for ITERATIONS, the more accurate the average value
* of an instruction's time will be.  For small/slow computers, use
* ITERATIONS < 100000.  For faster computers, ITERATIONS = 1000000 is
* sufficient.
*
* Compile with DJGPP, with the -O0 flag to make sure no optimizations are
* performed by the compiler.
*
*************************************************************************/

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>

#define ITERATIONS 100000

int main(void)
{
    int i1, i2, i3;
    int index, ips;
    double d1, d2, d3;
    float  f1, f2, f3;
    uclock_t t1, t2, t3;

    /* junk loop to take the cache-miss performance hit */
    i2 = 1;
    i3 = 2;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index)
	i1 = i2 + i3;

    i2 = 45;
    i3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
	i1 = i2 + i3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("int +:   \t%10d\n", ips);

    i2 = 45;
    i3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
	i1 = i2 - i3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("int -:   \t%10d\n", ips);

    i2 = 45;
    i3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
	i1 = i2 * i3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("int *:   \t%10d\n", ips);

    i2 = 3445236;
    i3 = 45;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
	i1 = i2 / i3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("int /:   \t%10d\n", ips);

    i2 = 3445236;
    i3 = 45;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
	i1 = i2 % i3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("int %%:   \t%10d\n", ips);

    f2 = 45;
    f3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
	f1 = f2 + f3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("float +:\t%10d\n", ips);

    f2 = 45;
    f3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
	f1 = f2 - f3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("float -:\t%10d\n", ips);

    f2 = 45;
    f3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
	f1 = f2 * f3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("float *:\t%10d\n", ips);

    f2 = 3445236;
    f3 = 45;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
	f1 = f2 / f3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("float /:\t%10d\n", ips);

    d2 = 45;
    d3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
	d1 = d2 + d3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("double +:\t%10d\n", ips);

    d2 = 45;
    d3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
	d1 = d2 - d3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("double -:\t%10d\n", ips);

    d2 = 45;
    d3 = 3445236;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
	d1 = d2 * d3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("double *:\t%10d\n", ips);

    d2 = 3445236;
    d3 = 45;
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
	d1 = d2 / d3;
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("double /:\t%10d\n", ips);

    d2 = 45;
    d3 = 3.45236;
    d1 = pow(d2,d3);
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
	d1 = pow(d2,d3);
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("pow():   \t%10d\n", ips);

    f2 = 45;
    f3 = 3.45236;
    f1 = powf(f2,f3);
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
	f1 = powf(f2,f3);
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("powf():   \t%10d\n", ips);

    d2 = 45;
    d3 = 3445236;
    d1 = sqrt(d2);
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
	d1 = sqrt(d2);
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("sqrt():   \t%10d\n", ips);

    f2 = 45;
    f3 = 3445236;
    f1 = sqrtf(f2);
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
	f1 = sqrtf(f2);
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("sqrtf():   \t%10d\n", ips);

    d2 = 1.5;
    d3 = 3445236;
    d1 = sin(d2);
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
	d1 = sin(d2);
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("sin():   \t%10d\n", ips);

    f2 = 1.5;
    f3 = 3445236;
    f1 = sinf(f2);
    t1 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
    }
    t2 = uclock();
    for (index = 0; index < ITERATIONS; ++index) {
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
	f1 = sinf(f2);
    }
    t3 = uclock();
    d1 = (double)((long)t2 - (long)t1) / UCLOCKS_PER_SEC;
    d2 = (double)((long)t3 - (long)t2) / UCLOCKS_PER_SEC;
    ips = 1.0 / ((d2 - d1) / (8.0 * ITERATIONS));
    printf("sinf():   \t%10d\n", ips);

    return 0;
}
-- 
-----------------------------------------------------------------------------
Jeff Standish                  http://www.cs.indiana.edu/hyplan/jestandi.html   
jestandi AT cs DOT indiana DOT edu              May your deeds with sword and axe
jlstandish AT acm DOT org                    Equal those with sheep and yaks
