From: "George C. Moschovitis"
Subject: optimisation...
To: djgpp AT sun DOT soe DOT clarkson DOT edu (djgpp)
Date: Wed, 28 Jun 1995 11:36:51 +0300 (EET DST)

Hi there...

I was experimenting with the asm output of gcc and was quite disappointed
with the quality of the code. I compiled with the -O3 option... does this
include ABSOLUTELY ALL optimisations? Or should I use more switches? And
what are the switches for the absolutely best speed-optimized code?

btw here is an example:

the c++ code:

    inline __dpmi_memblock __dpmi_allocate_low_memory(int size)
    {
        __dpmi_memblock mb;
        mb.l.segment = __dpmi_allocate_dos_memory((size>>4)+1,&mb.l.selector);
        mb.l.size = size;
        return mb;
    }

    void VBE_InitBuffer()
    {
        // Allocate a global buffer for communicating with the VBE.
        TransferBuffer = __dpmi_allocate_low_memory(vbeBUFFERSIZE);
        // remember to free the buffer before exiting !
        atexit(VBE_KillBuffer);
    }

produced the asm output:

    .align 2                            ; why not align 4 ??
    .globl _VBE_InitBuffer__Fv
    _VBE_InitBuffer__Fv:
        pushl %ebp
        movl %esp,%ebp
        subl $32,%esp
        leal -24(%ebp),%eax
        pushl %eax
        pushl $65
        call ___dpmi_allocate_dos_memory
        movw %ax,-20(%ebp)              ; why is
        movl $1024,-28(%ebp)            ; all this crap
        movl -32(%ebp),%ecx             ; generated ?
        movl %ecx,-16(%ebp)             ; an optimizing
        movl $1024,-12(%ebp)            ; compiler should
        movl -24(%ebp),%edx             ; leave out this
        movl %edx,-8(%ebp)              ; code
        movl -20(%ebp),%eax             ; (like Watcom C/C++ does
        movl %eax,-4(%ebp)              ; for example)
        addl $8,%esp
        movl %ecx,_TransferBuffer
        movl $1024,_TransferBuffer+4
        movl %edx,_TransferBuffer+8
        movl %eax,_TransferBuffer+12
        pushl $_VBE_KillBuffer__Fv
        call _atexit
        leave
        ret

Is there any way I can rearrange the c++ code for the compiler to produce
better output? Or any switch I should use? If I understand correctly, gcc
fills the temporary object mb even if it is not needed. Maybe I am wrong,
and I haven't run many tests, but I preferred to mail this list since
there are MANY extremely helpful guys here...

And something else.
From the (admittedly) limited asm outputs I have seen, gcc doesn't seem to
use register passing that much :( On this topic, how can I tell the
compiler which registers to pass the parameters in?

Btw I don't think I have seen any size optimisation switches, aren't
there any?

tmL-

ps: btw sorry for my bad english :(
ps2: I already found that -fomit-frame-pointer gets rid of the ebp/leave
crap.. no need to tell me about this...
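
A minimal sketch of the "rearrange the c++ code" idea raised above, assuming the extra stores really do come from the temporary `mb` being built and then copied. `LowMemBlock`, `fake_allocate_dos_memory`, `allocate_low_by_value` and `allocate_low_in_place` are hypothetical stand-ins invented for illustration; they are not the DJGPP declarations. The second variant fills the caller's object directly instead of returning a block by value. Whether that actually yields tighter code depends on the compiler and would have to be checked against the -S output.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for __dpmi_memblock; the real DJGPP
// type is not shown in the post and may be laid out differently.
struct LowMemBlock {
    int size;
    int selector;
    unsigned short segment;
};

// Fake allocator standing in for __dpmi_allocate_dos_memory so the sketch
// builds anywhere. It stores the requested paragraph count in *selector
// and returns a fixed, made-up segment value.
inline unsigned short fake_allocate_dos_memory(int paragraphs, int *selector) {
    *selector = paragraphs;
    return 0x1234;
}

// Variant 1: the shape used in the post, returning the block by value.
inline LowMemBlock allocate_low_by_value(int size) {
    LowMemBlock mb;
    mb.segment = fake_allocate_dos_memory((size >> 4) + 1, &mb.selector);
    mb.size = size;
    return mb;
}

// Variant 2: write into the caller's object, so the function itself
// declares no local block that must be copied out afterwards.
inline void allocate_low_in_place(int size, LowMemBlock *out) {
    out->segment = fake_allocate_dos_memory((size >> 4) + 1, &out->selector);
    out->size = size;
}
```

With this shape the call in `VBE_InitBuffer` would pass the address of the global buffer (in the sketch's terms, `allocate_low_in_place(1024, &buffer)`), and both variants leave the same field values behind.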