Date: Thu, 5 Aug 1999 16:29:36 +0300
From: Alexander Bokovoy
X-Mailer: The Bat! (v1.33) UNREG / CD5BF9353B3B7091
Organization: BSPU named after Maxim Tank
X-Priority: 3 (Normal)
Message-ID: <17687.990805@bspu.unibel.by>
To: Duncan Coutts
Subject: Re: Data Alignment for Optimal Access
In-reply-To: <7oak0j$11v$1@lure.pipex.net>
References: <7oak0j$11v$1 AT lure DOT pipex DOT net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

You may want to take a look at PGCC, a version of GCC optimized for
Pentium processors. It does a good job of using MMX instructions (and
therefore of MMX-friendly data alignment). Also, AFAIK the latest GCC
(2.95 - http://www.lanet.lv/~pavenis/djgpp.html) supports MMX
optimization.

On 05.08.1999 Duncan Coutts wrote:

> I know that gcc (on PC targets) aligns data to 4 byte
> boundaries because its normal word length is
> 32bit - 4byte. Performance of some types of operations
> can be significantly improved if different alignments are
> used.
> For example, the Intel MMX tutorials strongly encourage
> quad word (64bit) memory operations to be quad word
> aligned (1 cycle vs 3 (when cached)).
> Also, by aligning larger structures (such as matrices) on
> 32 byte boundaries, caching performance can be
> improved (each cache line is 32 bytes large).
> These optimisations obviously should only be considered
> when doing assembly optimisation of time critical loops
> (such as matrix and vector operations in a 3D graphics
> pipeline).
> Does anyone have any suggestions on how to do the
> memory alignment? Are there any compiler extensions
> similar to __attribute__ ((packed)) ? I know gcc aligns
> data on the stack, however I suspect that it would not be
> possible to force 8 byte alignment for local variables or
> parameters.
> Dynamic storage seems the only other possibility.
> The new operator aligns allocations to 4 byte boundaries.
> I could over allocate by 4 bytes and then do some bit
> twiddling to force a pointer to an 8 byte boundary.
> What's the nicest way of doing this? Perhaps I should
> overload the matrix class' new operator to do the special
> allocations.

Best regards,
Alexander Bokovoy,
= Linux ==============================================================
Though it is always possible to have a look at the world through the
Windows, people usually prefer not only to look but live in it too.
============================================================== Linux =
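
For illustration, here is a minimal sketch of the two approaches raised in
the quoted question. It is written in present-day C++ (C++11), not for the
1999 toolchain, and it is only one way of doing this. For static data it
uses GCC's aligned variable attribute, the counterpart of packed; the GCC
manual notes that the linker may limit how large an alignment is honoured.
For dynamic storage it over-allocates with malloc and rounds the pointer up
with a bit mask. The names aligned_alloc_sketch, aligned_free_sketch,
quad_table and Matrix4 are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// GCC extension: request 8-byte (quad word) alignment for a global object.
double quad_table[4] __attribute__((aligned(8)));

// Hypothetical helper: returns `size` bytes whose address is a multiple of
// `alignment` (which must be a power of two), or nullptr on failure.
void* aligned_alloc_sketch(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    // Room for the payload, worst-case padding, and the saved raw pointer.
    void* raw = std::malloc(size + alignment - 1 + sizeof(void*));
    if (raw == nullptr) return nullptr;

    // Skip the slot for the raw pointer, then round up to the boundary.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);

    // Remember the raw pointer just below the aligned block so that it can
    // be handed back to free() later.
    char* slot = reinterpret_cast<char*>(aligned) - sizeof(void*);
    std::memcpy(slot, &raw, sizeof raw);
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from aligned_alloc_sketch().
void aligned_free_sketch(void* p)
{
    if (p == nullptr) return;
    void* raw;
    std::memcpy(&raw, static_cast<char*>(p) - sizeof(void*), sizeof raw);
    std::free(raw);
}

// Hypothetical matrix class whose single-object allocations land on a
// 32-byte boundary (the cache line size mentioned in the question).
// Array forms would need operator new[] / delete[] as well.
struct Matrix4 {
    float m[16];

    void* operator new(std::size_t size)
    {
        void* p = aligned_alloc_sketch(size, 32);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }

    void operator delete(void* p) noexcept { aligned_free_sketch(p); }
};
```

With these definitions, "new Matrix4" yields an address divisible by 32 and
"delete" releases the underlying malloc block. Over-allocating by only 4
bytes, as suggested in the question, is enough for 8 byte alignment only if
the allocator is known to return 4 byte aligned blocks, and the original
pointer still has to be kept somewhere for freeing; the sketch pads by
alignment - 1 bytes and stores the original pointer in front of the block
for that reason.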