From: "Duncan Coutts"
Newsgroups: comp.os.msdos.djgpp
Subject: Data Alignment for Optimal Access
Date: Thu, 5 Aug 1999 00:53:41 +0100
Organization: UUNET WorldCom server (post doesn't reflect views of UUNET WorldCom)
Lines: 34
Message-ID: <7oak0j$11v$1@lure.pipex.net>
NNTP-Posting-Host: userk558.uk.uudial.com
X-Trace: lure.pipex.net 933811027 1087 193.149.70.134 (4 Aug 1999 23:57:07 GMT)
X-Complaints-To: abuse AT uk DOT uu DOT net
NNTP-Posting-Date: 4 Aug 1999 23:57:07 GMT
X-Newsreader: Microsoft Outlook Express 4.72.2106.4
X-Mimeole: Produced By Microsoft MimeOLE V4.72.2106.4
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

I know that gcc (on PC targets) aligns data to 4-byte boundaries because its normal word length is 32 bits (4 bytes). Performance of some types of operations can be significantly improved if different alignments are used. For example, the Intel MMX tutorials strongly encourage quad word (64-bit) memory operations to be quad word aligned (1 cycle vs 3 when cached). Also, by aligning larger structures (such as matrices) on 32-byte boundaries, caching performance can be improved (each cache line is 32 bytes long). These optimisations obviously should only be considered when doing assembly optimisation of time-critical loops (such as matrix and vector operations in a 3D graphics pipeline).

Does anyone have any suggestions on how to do the memory alignment? Are there any compiler extensions similar to __attribute__ ((packed))?

I know gcc aligns data on the stack; however, I suspect that it would not be possible to force 8-byte alignment for local variables or parameters. Dynamic storage seems the only other possibility. The new operator aligns allocations to 4-byte boundaries. I could over-allocate by 4 bytes and then do some bit twiddling to force a pointer to an 8-byte boundary. What's the nicest way of doing this?
Perhaps I should overload the matrix class' new operator to do the special allocations.
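
For illustration, here is a minimal sketch of the over-allocate-and-round-up idea combined with a class-level operator new. It is one possible approach under stated assumptions, not a tested DJGPP recipe. The names aligned_alloc_sketch, aligned_free_sketch, is_aligned and Matrix4 are hypothetical. The sketch over-allocates by the alignment plus one pointer rather than by only 4 bytes, because it does not assume malloc's own alignment and it has to remember the original pointer for free. It is written in present-day C++ (nullptr, noexcept, <cstdint>), which a 1999 compiler would not accept as-is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: true if p is a multiple of alignment (a power of two).
bool is_aligned(const void* p, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: returns size bytes starting at an address that is a
// multiple of alignment, or a null pointer if malloc fails.
void* aligned_alloc_sketch(std::size_t size, std::size_t alignment)
{
    // The mask trick below only works for powers of two.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    // Over-allocate: worst-case padding plus room to store the raw pointer.
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (raw == nullptr)
        return nullptr;

    // Skip past the slot reserved for the raw pointer, then round up.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;

    // Remember the pointer malloc returned, just below the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from aligned_alloc_sketch.
void aligned_free_sketch(void* p)
{
    if (p != nullptr)
        std::free(static_cast<void**>(p)[-1]);
}

// Hypothetical matrix type whose heap instances start on a 32-byte boundary.
struct Matrix4 {
    float m[4][4];

    static void* operator new(std::size_t size)
    {
        void* p = aligned_alloc_sketch(size, 32);
        if (p == nullptr)
            throw std::bad_alloc();
        return p;
    }

    static void operator delete(void* p) noexcept
    {
        aligned_free_sketch(p);
    }
};
```

With these definitions, "Matrix4* m = new Matrix4;" would go through the class's operator new and "delete m;" through its operator delete. Array forms (new[] and delete[]) are not covered by this sketch. For static and local data, gcc also documents an __attribute__ ((aligned (N))) extension in the same family as packed; whether a particular gcc/DJGPP version honours it for stack variables is worth checking rather than assuming.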