Message-ID: <19990304152121.42144@insula.local>
Date: Thu, 4 Mar 1999 15:21:21 +0000
From: Philipp Rumpf
To: pgcc AT delorie DOT com
Subject: Re: Intel/Cygnus
References: <36DD6D94 DOT 79AFEC8F AT mitre DOT org> <000f01be6632$02e96240$3bd16482 AT ellemtel DOT se>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.89.1
In-Reply-To: <000f01be6632$02e96240$3bd16482@ellemtel.se>; from David Jonsson on Thu, Mar 04, 1999 at 12:27:39PM +0100
X-Accept-Language: en,de,se
Reply-To: pgcc AT delorie DOT com

> This is far from trivial. The C syntax needs to be abandoned if the optimization
> is to be transparent to the programmer, see SWAR http://shay.ecn.purdue.edu/~swar/

I cannot see what is so difficult about it[1] ... I think it is just a special
case of loop unrolling.

    char *p;
    int i;

    for(i=0; i<4; i++)
        p[i] |= 0x80;

should become a 32-bit OR ... once we can do that, the rest of SIMD should be
trivial[2]

> Another approach is to use a MACRO-like addition to ordinary compilers.
> This is what Apple has done with AltiVec, which is more promising than MMX
> or KNI/SSI, http://developer.apple.com/hardware/altivec/model.html

Intel is doing something very similar in their compilers; they even give the
compiler intrinsics (or whatever they call them) in the instruction set
reference ...

The macro approach has an additional advantage, though: I really would not like
to get only 11 bits of precision for a normal float, though I probably would
not mind it sometimes.

[1] - I know next to nothing about gcc/egcs/pgcc internals, so there may be
      something important I missed
[2] - Well, it could be a bit difficult to ensure a float * is 128-bit aligned ...
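
For illustration only, here is a minimal sketch in C of the transformation the message asks the compiler to perform: the byte-wise loop next to a hand-written single 32-bit OR. The function names set_high_bits_bytewise and set_high_bits_wordwise are hypothetical, and unsigned char is used instead of the plain char from the message to keep the arithmetic free of sign issues. This is not a description of what gcc/egcs/pgcc actually emit.

```c
#include <stdint.h>
#include <string.h>

/* The loop from the message: set the top bit of four consecutive bytes. */
void set_high_bits_bytewise(unsigned char *p)
{
    int i;

    for (i = 0; i < 4; i++)
        p[i] |= 0x80;
}

/*
 * Hand-written equivalent (hypothetical helper): one 32-bit OR.
 * memcpy is used for the load and store so the code makes no assumption
 * about the alignment of p and does not break C aliasing rules.
 * The mask 0x80808080 has the same value in every byte, so the result
 * does not depend on byte order.
 */
void set_high_bits_wordwise(unsigned char *p)
{
    uint32_t w;

    memcpy(&w, p, sizeof w);
    w |= UINT32_C(0x80808080);
    memcpy(p, &w, sizeof w);
}
```

Both functions leave the same four bytes in memory. The memcpy calls sidestep the alignment question for the 32-bit case; for 128-bit SIMD loads, the alignment concern in footnote [2] still applies.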