Mail Archives: djgpp/1999/08/05/07:52:48

From: "Duncan Coutts" <Duncan DOT Coutts AT dial DOT pipex DOT com>
Newsgroups: comp.os.msdos.djgpp
Subject: Data Alignment for Optimal Access
Date: Thu, 5 Aug 1999 00:53:41 +0100
Organization: UUNET WorldCom server (post doesn't reflect views of UUNET WorldCom)
Lines: 34
Message-ID: <7oak0j$11v$1@lure.pipex.net>
NNTP-Posting-Host: userk558.uk.uudial.com
X-Trace: lure.pipex.net 933811027 1087 193.149.70.134 (4 Aug 1999 23:57:07 GMT)
X-Complaints-To: abuse AT uk DOT uu DOT net
NNTP-Posting-Date: 4 Aug 1999 23:57:07 GMT
X-Newsreader: Microsoft Outlook Express 4.72.2106.4
X-Mimeole: Produced By Microsoft MimeOLE V4.72.2106.4
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

I know that gcc (on PC targets) aligns data to 4-byte
boundaries because its natural word length is 32 bits
(4 bytes). However, the performance of some types of
operations can be improved significantly if different
alignments are used.

For example, the Intel MMX tutorials strongly encourage
aligning quad word (64-bit) memory operations on quad word
boundaries (1 cycle versus 3 when the data is cached).
Also, aligning larger structures (such as matrices) on
32-byte boundaries can improve caching performance,
since each cache line is 32 bytes long.
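
To make the cache-line point concrete, here is a minimal sketch
that counts how many 32-byte lines an object touches for a given
start address. The names kCacheLine, lines_touched and
check_examples are made up for this illustration, and the 32-byte
line size is simply the figure quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Cache line size assumed from the text above (32 bytes).
const std::uintptr_t kCacheLine = 32;

// Number of cache lines covered by `size` bytes starting at `addr`.
// `size` must be at least 1.
std::size_t lines_touched(std::uintptr_t addr, std::size_t size) {
    std::uintptr_t first = addr / kCacheLine;
    std::uintptr_t last = (addr + size - 1) / kCacheLine;
    return static_cast<std::size_t>(last - first + 1);
}

// A few worked cases: a 64-byte matrix (4x4 floats) and a quad word.
void check_examples() {
    assert(lines_touched(32, 64) == 2);  // 32-byte aligned matrix
    assert(lines_touched(36, 64) == 3);  // only 4-byte aligned: one extra line
    assert(lines_touched(24, 8) == 1);   // 8-byte aligned quad word
    assert(lines_touched(28, 8) == 2);   // straddles a line boundary
}
```

So a 4x4 float matrix that is only 4-byte aligned can occupy three
lines instead of two, and a misaligned quad word can straddle two.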

These optimisations obviously should only be considered
when doing assembly optimisation of time critical loops
(such as matrix and vector operations in a 3D graphics
pipeline).

Does anyone have any suggestions on how to do the
memory alignment? Are there any compiler extensions
similar to   __attribute__ ((packed))  ? I know gcc aligns
data on the stack; however, I suspect that it would not be
possible to force 8-byte alignment for local variables or
parameters.
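
For what it's worth, gcc does document an aligned attribute that
sits alongside packed. The following is only a sketch of how it
might be applied to types and static objects; Quad, Matrix4,
quad_buffer, view_matrix, is_aligned and check_static_alignment
are hypothetical names. Whether alignments larger than the
default are actually honoured depends on the assembler, linker
and object format of the target, so the result is worth checking
at run time. Local variables and parameters are deliberately left
out of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// GCC extension: request 8-byte alignment for a quad word type.
struct Quad {
    unsigned char bytes[8];
} __attribute__((aligned(8)));

// Request 32-byte (cache line) alignment for a 4x4 float matrix.
struct Matrix4 {
    float m[16];
} __attribute__((aligned(32)));

// Static objects of these types inherit the requested alignment.
static Quad quad_buffer[4];
static Matrix4 view_matrix;

// True when p is a multiple of `alignment` (must be a power of two).
bool is_aligned(const void *p, std::uintptr_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Run-time check that the toolchain honoured the request.
void check_static_alignment() {
    assert(is_aligned(&quad_buffer[1], 8));
    assert(is_aligned(&view_matrix, 32));
}
```

The attribute can also be written on a single variable
declaration instead of on the type.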

Dynamic storage seems the only other possibility.
The new operator aligns allocations to 4-byte boundaries.
I could over-allocate by 4 bytes and then do some bit
twiddling to force the pointer to an 8-byte boundary.
What's the nicest way of doing this? Perhaps I should
overload the matrix class's operator new to do the special
allocations.
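
One way the over-allocation idea could look is sketched below.
It is not a definitive implementation: aligned_alloc_sketch,
aligned_free_sketch and Matrix are hypothetical names, malloc is
used as the underlying allocator, and the original pointer is
stored just below the aligned block so that it can be freed
later. The padding is computed for the general case rather than
relying on the allocator already returning 4-byte aligned memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Allocate `size` bytes aligned to `alignment` (a power of two).
void *aligned_alloc_sketch(std::size_t size, std::size_t alignment) {
    // Room for the payload, worst-case padding and the saved pointer.
    void *raw = std::malloc(size + alignment - 1 + sizeof(void *));
    if (raw == nullptr) return nullptr;
    std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    // Round up to the next multiple of `alignment`.
    std::uintptr_t aligned =
        (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
    // Remember the malloc pointer immediately below the aligned block.
    std::memcpy(reinterpret_cast<unsigned char *>(aligned) - sizeof(void *),
                &raw, sizeof raw);
    return reinterpret_cast<void *>(aligned);
}

// Release a block obtained from aligned_alloc_sketch.
void aligned_free_sketch(void *p) {
    if (p == nullptr) return;
    void *raw;
    std::memcpy(&raw, static_cast<unsigned char *>(p) - sizeof(void *),
                sizeof raw);
    std::free(raw);
}

// Class-specific operator new/delete placing each Matrix on a
// 32-byte boundary (single objects only, not arrays).
struct Matrix {
    float m[16];

    static void *operator new(std::size_t size) {
        void *p = aligned_alloc_sketch(size, 32);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void *p) { aligned_free_sketch(p); }
};
```

With this in place, a plain "new Matrix" returns a 32-byte
aligned object and "delete" releases the original allocation.
Arrays would need operator new[] and operator delete[] as well.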

