Sender: crough45 AT amc DOT de
Date: Thu, 3 Apr 1997 09:03:15 +0100
From: Chris Croughton
Mime-Version: 1.0
To: jestandi AT cs DOT indiana DOT edu
Cc: djgpp AT delorie DOT com
Subject: Re: How is switch implemented?
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <97Apr3.090043gmt+0100.21909@internet01.amc.de>

Jeff Standish wrote:

> I'm assuming that most intelligent compilers would try to use an jump
> table, but I don't know for certain, and I'm not in the mood to compile,

Not necessarily.  Most compilers decide based on the number and
distribution of the cases in the switch.  If the number of cases is
small, especially if their range is large, a series of compare/jump
instructions is very often faster and/or smaller.  For instance:

    switch (x)
    {
        case 5:
            ...
            break;
        case 576:
            ...
            break;
    }

If implemented as a jump table, not only would you need a 572-word
lookup table, you would also need comparisons for values under 5 and
over 576, plus the indexing code, making it much bigger and slower than
simply comparing the two values.

With more values it might be more efficient to keep a sorted lookup
table of the case values and do a binary chop on them, or to use a
hashing algorithm.

The "break even" point depends on the target machine architecture and
instruction timings, of course, which is why some code compiled for 486
processors will run more slowly on a Pentium than on its intended
target (an extreme case, but it can happen).

Which method is used also depends on the optimisation options.  I know
Visual C and Borland C each have options to optimise for speed or size,
and they will give different results.  Some compilers even let you
specify which optimisations to perform at a more detailed level, so you
can select what's best for your program (and on a per-module basis).
Some older compilers will also give different results depending on the
order of the case values, generating more efficient code if they are
already sorted, but any modern optimiser should do the sorting
itself...

> disassemble, and try to read the machine code.  Besides, just because
> this might happen for a trivial test program doesn't mean it will also
> work for 40+ cases spread out over several hundred lines of code (which
> is the minimum of what I expect to write).

That last point is the real problem.  You don't need to disassemble,
though, because every compiler I know will generate assembler source
(-S for gcc/DJGPP; other compilers use other flags, so check your
compiler's documentation).

I think the best thing for you to do (and this goes for time/space
critical applications in general) is to write the code and then do
performance analysis (DJGPP and Borland, at least, come with a
'profiler' to help with this) to find the parts of your code that are
slowest.  Having identified those, you can rewrite them to be more
efficient, in assembler if necessary...

Hope that helps.

Chris