Message-ID: <354DC10D.1F4E@pobox.oleane.com>
Date: Mon, 04 May 1998 15:22:21 +0200
From: Francois Charton <deef AT pobox DOT oleane DOT com>
Organization: CCMSA
MIME-Version: 1.0
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
CC: djgpp AT delorie DOT com
Subject: Re: Heapsort [was: Library Function qsort() Exhibits N^2 Behavior]
References: <Pine DOT SUN DOT 3 DOT 91 DOT 980503172303 DOT 19706A-100000 AT is>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk

Eli Zaretskii wrote:
> 
> > Incidentally, aren't there some adjustments you can make to quicksort to
> > eliminate the worst-case behavior?
> 
> AFAIK, the DJGPP version already does some of these.  Look it up in
> the sources.

Quicksort being an O(n^2) algorithm, any implementation of it will show 
an N^2 behaviour for some sets of data. The only thing that can be done 
about it is to avoid that these N^2 cases happen in series which are more 
likely to be seen in real life programs. 

This is what the DJGPP version of Quicksort does. In naive 
implementations of quicksort, the pathological N^2 cases happen when the 
data is already sorted or almost sorted. 

However, the tests conducted by Mr Williams seem to show that 
unsatisfactory (though maybe not worst case) behaviour can still be found 
in fairly standard (ie likely in real life) cases.


An improvement to the current situation might be to include two 
algorithms for qsort in the library: quicksort by default, and a slower 
nlog(n) algorithm, which could be used when one is worried about worst 
case performance. The decision of which algorithm is actually used could 
be a header file parameter.

For the second algorithm, I would, again, recommend Heapsort over 
Mergesort (both O(nlog(n))) for two reasons:

1- Heapsort is an in-place algorithm, while Mergesort needs some 
additional memory, which may become a problem in the case of large files
2- Heapsort is simpler to implement than Mergesort, so adding it as an 
"emergency solution" would not result in too big an increase in the 
library size.

Francois