From: nxk3 AT b63526 DOT student DOT cwru DOT edu (Natarajan Krishnaswami)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Optimization
Date: 29 Nov 1996 18:41:43 GMT
Organization: Case Western Reserve University, Cleveland OH (USA)
Lines: 33
Message-ID:
References: <57hg9b$or5 AT kannews DOT ca DOT newbridge DOT com> <329C95AD DOT C3E AT silo DOT csci DOT unt DOT edu> <57k531$5bu AT kannews DOT ca DOT newbridge DOT com>
NNTP-Posting-Host: b63526.student.cwru.edu
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

>Well, my logic is this: You have to move 2x as much data around; this
>means your L1 cache fills up 2x as fast. This is not good.

You piqued my interest, so I looked it up:

"To gain efficiency in the implementation of the internal cache,
storage is allocated in chunks of 128 bits, called cache lines.
External caches are not likely to use cache lines smaller than those
of the internal cache."
[...]
"To simplify the hardware implementation, cache lines can only be
mapped to aligned 128-bit blocks of main memory."
(i486 Microprocessor Programmer's Reference Manual, 12-1, 12-2)

Also,
"Because the i486 microprocessor has a 32-bit data bus, communications
between the processor and memory take place as doubleword transfers
aligned to addresses evenly divisible by 4; the processor converts
doubleword transfers aligned to other addresses into multiple
transfers;..."
(ibid., 2-4, 2-6)

Well, have fun optimizing your program.

Cheers,
Natarajan

-==(UDIC)==-
"Time flies like an arrow; fruit flies like a banana." -Groucho Marx
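
For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic implied by the quoted passages. It assumes only what the manual excerpts state (16-byte aligned cache lines, doubleword transfers aligned to multiples of 4); the function names and constants are made up for illustration and are not from any library. The "2 transfers" figure for a misaligned doubleword is my reading of "multiple transfers": a 4-byte access that does not start on a multiple of 4 overlaps exactly two aligned doublewords.

```c
/* Assumed from the quoted i486 manual text: 128-bit (16-byte) cache
   lines and a 32-bit (4-byte) data bus. All names are illustrative. */
#define CACHE_LINE_BYTES 16UL
#define BUS_BYTES         4UL

/* Number of distinct 16-byte cache lines covered by `len` bytes
   starting at byte address `addr`. */
unsigned long lines_touched(unsigned long addr, unsigned long len)
{
    if (len == 0)
        return 0;
    return (addr + len - 1) / CACHE_LINE_BYTES
         - addr / CACHE_LINE_BYTES + 1;
}

/* Number of aligned doublewords that a 4-byte access at `addr`
   overlaps: 1 if addr is divisible by 4, otherwise 2. */
unsigned long dword_transfers(unsigned long addr)
{
    return (addr % BUS_BYTES == 0) ? 1 : 2;
}
```

With these definitions, lines_touched(0, 1000 * 2) is 125 and lines_touched(0, 1000 * 4) is 250, so an array of 1000 four-byte elements does cover twice as many cache lines as the same array with two-byte elements. Likewise dword_transfers(0x1000) is 1 and dword_transfers(0x1002) is 2. This counts lines and transfers only; it says nothing about actual timing.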