Mail Archives: djgpp/1996/12/13/14:26:01
I just converted some heavily used mathematical functions to assembly, and
surprisingly the performance gain is only 10%. I fully optimized the
assembly code according to the floating-point rules to make sure each mul
and add takes only one cycle, yet for a function that involves 15 muls and
30 adds the gain is still only 10%. Is it because DJGPP is so efficient
that I shouldn't bother coding in assembly, or is 10% actually a reasonable
increase? I used inline assembly; are there any tricks for inline assembly
that I don't know about? (It's not that I know any, but what kind of tricks
could there be?) I also tried to write an external assembly routine, but
MASM reported trillions of errors when I changed the memory model to flat
(it was a working assembly routine under the small and large memory
models). I've never worked with flat-model assembly before, so sorry if
this is off topic, but is there anything else I should do other than just
switching the keyword?
By the way, I've heard about win32gcc in this newsgroup. What is it, and
where can I get it?
thanks,
P. Sun