Message-ID: <3607A6F6.A323A602@cyberoptics.com>
From: Eric Rudd <rudd AT cyberoptics DOT com>
Reply-To: rudd AT cyberoptics DOT com
Organization: CyberOptics
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Optimizations
References: <Pine DOT SUN DOT 3 DOT 91 DOT 980918110932 DOT 17626E-100000 AT is>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 80
Date: Tue, 22 Sep 1998 08:32:38 -0500
NNTP-Posting-Host: 206.144.150.73
NNTP-Posting-Date: Tue, 22 Sep 1998 06:31:11 PDT
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Precedence: bulk

Eli Zaretskii wrote:

> On Thu, 17 Sep 1998, John S. Fine wrote:
>
> >   While debugging I noticed many places where gcc has
> > generated crude code that is both larger and slower
> > than I would have expected.
>
> Maybe it's a good idea to post code fragments, the code emitted by GCC,
> and what you'd expect it to emit.  Sometimes GCC is smarter than you
> might think.  In some rare cases it is indeed pretty dumb.  You've posted
> some details, but I think a specific example with specific machine code
> is better.

Here is a subroutine for which gcc generates 14 instructions (not counting
the single asm instruction), where 2 would have sufficed:

   long long prod(long x, long y) {
      union {
         long long ll;
         long d[2];
      } product;

      asm("
         imull %2
      "
      : "=a" (product.d[0]),
        "=d" (product.d[1])
      : "m"  (y),
        "0"  (x)
      );
      return product.ll;
   }

When invoked with the command line

   gcc -S ll.c -O3 -fomit-frame-pointer

gcc v2.8.1 produces the following assembly output:

      .file "ll.c"
   gcc2_compiled.:
   ___gnu_compiled_c:
   .text
      .p2align 2
   .globl _prod
   _prod:
      pushl %edi
      pushl %esi
      pushl %ebx
      movl 20(%esp),%eax
      movl 16(%esp),%eax
   /APP

      imull 20(%esp)

   /NO_APP
      movl %eax,%esi
      movl %edx,%ebx
      movl %ebx,%edi
      movl %esi,%eax
      movl %edi,%edx
      popl %ebx
      popl %esi
      popl %edi
      ret

Here is what I would have hand-coded:

   .globl _prod
   _prod:
      movl  4(%esp), %eax
      imull 8(%esp)
      ret
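
For comparison, the same 32x32->64-bit signed product can be written in
plain C with no inline asm and no union. This is only a minimal sketch:
the name prod_portable and the use of int32_t are assumptions made for
illustration (int32_t stands in for the 32-bit long used above). Widening
one operand before the multiply lets the compiler choose a single
widening imull if it is able to; whether a particular gcc version
actually does so would have to be checked with -S.

```c
#include <stdint.h>

/* Hypothetical portable counterpart of prod(): widen one operand to
   64 bits so the multiplication is carried out at 64-bit precision.
   int32_t is an assumption standing in for a 32-bit long. */
long long prod_portable(int32_t x, int32_t y)
{
    return (long long)x * y;
}
```

Compiling this with the same command line and comparing the two .s
files shows how much of the overhead comes from the union and the asm
constraints rather than from the multiply itself.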

This was not gcc's finest hour. ;-)

-Eric Rudd
rudd AT cyberoptics DOT com