X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-bounces using -f
Lines: 56
X-Admin: news AT aol DOT com
From: sterten AT aol DOT com (Sterten)
Newsgroups: comp.os.msdos.djgpp
Date: 03 Apr 2002 09:11:06 GMT
References: <3CAAB61B DOT FC481AF4 AT is DOT elta DOT co DOT il>
Organization: AOL http://www.aol.com
Subject: Re: help with inline AT&T assembly
Message-ID: <20020403041106.20871.00001341@mb-mm.aol.com>
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:

>Sterten wrote:
>>
>> and then I had several errors , I don't like the AT&T-syntax.
>
>Many people disagree with you (they don't like the Intel syntax).

Well, there are some objective measures. One of them is source size,
and with it mere typing time. Typing all the "%" and "(" characters,
which need the shift key, is awful.
I don't like the Intel syntax either. I don't understand why we are
supposed to write "mov eax,ebx", or even worse "movl %%ebx,%%eax",
instead of just A=B.
And who knows what "punpckldq" means? ;-)
Maybe that's the reason why most people don't like assembly?
I'd like to view C as an assembler with "macros", but then I'd have to be
able to predict the exact opcodes being generated. And all assembly
instructions should be part of the C language.
BTW, can I specify which register to use for a C variable, and
can I change this during the program?

>> Here is , what finally worked but gave only a speed improvement
>> of about 30% on my K6/2 :
>
>As a rule of thumb, you shouldn't expect any speedups more than 30-50% from
>going to assembly.

Usually I get more. This program originally took 250 sec; now it takes
49 sec thanks to algorithm changes and code optimization, but each single
improvement only gave 10%-30%.
Using 3 register variables already helped a lot. (I usually don't use
register variables; maybe I'll use more in future.)
I could still unroll the loop to avoid stalls, for another estimated 20%.
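To illustrate what unrolling to avoid stalls can look like in plain C, here is a minimal sketch. It is not the program discussed in this thread: the summation loop and the names sum_simple and sum_unrolled4 are hypothetical, chosen only to show the idea. Each iteration of the simple loop depends on the previous add; the unrolled version keeps four independent accumulators so consecutive adds need not wait on each other. Whether this actually helps depends on the CPU and on what the compiler already does by itself.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: every add depends on the result of the previous one. */
long sum_simple(const int *a, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4 with independent accumulators (hypothetical example). */
long sum_unrolled4(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    assert(a != NULL || n == 0);

    /* Main body: four elements per iteration, four separate dependency chains. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Tail: the 0 to 3 elements left over when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```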
This is all for the K6/2; maybe it's better on newer processors.

>If you need a larger speedup, you should rethink your
>algorithms.

Yes, of course. But I want to measure the algorithms by their speed,
so it does make sense to optimize them before comparing them.
And then the performance also depends on the instances: one algorithm
is better on one instance while another is better on another.

I recently had a program which was only half as fast with GCC/djgpp
as with other compilers :-(
Usually GCC/djgpp's code is about 30% slower than, e.g., code from
the Intel compiler.
Does this match what other people have experienced?

Guenter
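For comparing algorithms by speed on the same instance, a rough sketch of a timing helper follows. Everything in it is an assumption for illustration: time_algo, count_even_mod and count_even_mask are hypothetical names, and the two counting routines merely stand in for the real algorithms. It uses the standard clock() function, which has coarse resolution, so the repetition count has to be large enough for the elapsed time to be measurable.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Common signature for the algorithms being compared (hypothetical). */
typedef long (*algo_fn)(const int *, size_t);

/* Stand-in algorithm 1: count even values using the remainder operator. */
long count_even_mod(const int *a, size_t n)
{
    long c = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (a[i] % 2 == 0)
            c++;
    return c;
}

/* Stand-in algorithm 2: same count using a bit test
   (assumes two's complement integers). */
long count_even_mask(const int *a, size_t n)
{
    long c = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if ((a[i] & 1) == 0)
            c++;
    return c;
}

/* Run f on the same instance reps times; return CPU seconds used.
   The accumulated result is stored through *result so the compiler
   cannot discard the calls and so the outputs can be cross-checked. */
double time_algo(algo_fn f, const int *data, size_t n, int reps, long *result)
{
    clock_t t0, t1;
    long acc = 0;
    int r;

    assert(f != NULL && reps > 0);

    t0 = clock();
    for (r = 0; r < reps; r++)
        acc += f(data, n);
    t1 = clock();

    if (result != NULL)
        *result = acc;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_algo once per algorithm with the same data, n and reps gives two times that can be compared directly, and the two stored results should agree if both algorithms are correct.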