From: Shawn Hargreaves
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Allegro & sprite stretching optimization
Date: Mon, 2 Jun 1997 22:38:35 +0100
Organization: None
Distribution: world
Message-ID:
References: <01bc6d4e$720f1900$ec3e63c3 AT default>
NNTP-Posting-Host: talula.demon.co.uk
MIME-Version: 1.0
Lines: 55
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Precedence: bulk

Tom writes:
>Recent discussion made me think about Allegro's sprite-stretching. It
>seems to me that there is a major redundancy there in that multiple uses
>of the same stretch routine recompile the same thing over and over.

This is true. I did at one point think about trying to optimise this
case, but never got round to doing anything about it :-) But you are
right, it would be possible to get some dramatic speed improvements
when doing repeated stretches by identical amounts..

>Would it be possible to simply split do_stretch_blit to separate the
>stretch-compile functionality (make_stretcher and a lot of
>do_stretch_blit) and the functionality that uses it (_do_stretch and the
>rest of do_stretch_blit), so that a user desirous of speed can compile a
>stretcher into memory that they control and pass that to _do_stretch?

That would work, but I'm very wary of an API that exposes the internal
workings of the implementation like that. Designing an interface that
is dependent on this kind of implementation detail could cause no end
of problems in the long run, and would restrict the ways in which the
routine could be developed in the future. It makes me nervous :-)

IMHO a much better approach would be simply to make the stretch_blit()
code cache the last few (say 4) routines that it compiled, and reuse
them wherever it can. This could be added in do_stretch_blit() without
too much hassle (I think just after the clipping code but before the
first call to make_stretcher()), and would provide the speed
improvement without any API clutter.

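To make the caching idea a bit more concrete, here is a rough sketch of what a four-entry cache with least-recently-used replacement might look like. None of this is Allegro code: stretch_key, compile_stretcher(), get_stretcher() and stretch_compile_count are names made up for the illustration, the "compiler" is only a stub that allocates a dummy buffer, and the key assumes the stretch is fully described by a source width, a destination width and a masked flag. A real version would need to key on every parameter that affects the generated code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STRETCH_CACHE_SIZE 4      /* "the last few (say 4)" routines */

/* Hypothetical key: the parameters that determine the compiled code. */
typedef struct {
    int src_w, dest_w;            /* stretch amounts after clipping */
    int masked;                   /* masked sprite or plain blit */
} stretch_key;

typedef struct {
    stretch_key key;
    void *code;                   /* compiled routine, NULL if slot empty */
    unsigned long last_used;      /* 0 means the slot was never used */
} stretch_cache_entry;

static stretch_cache_entry stretch_cache[STRETCH_CACHE_SIZE];
static unsigned long stretch_clock = 0;

int stretch_compile_count = 0;    /* demonstration only: counts compiles */

/* Stub standing in for the real compiler; it just fills a buffer. */
static void *compile_stretcher(const stretch_key *key)
{
    size_t size;
    void *code;

    assert(key->dest_w > 0);
    size = (size_t)key->dest_w;
    code = malloc(size);
    if (code)
        memset(code, 0xC3, size); /* placeholder bytes, not real code */
    stretch_compile_count++;
    return code;
}

static int same_key(const stretch_key *a, const stretch_key *b)
{
    return (a->src_w == b->src_w) && (a->dest_w == b->dest_w) &&
           (a->masked == b->masked);
}

/* Return a cached routine if one matches, otherwise compile a new one
 * and store it in the least recently used slot. */
void *get_stretcher(int src_w, int dest_w, int masked)
{
    stretch_key key;
    void *code;
    int i, victim = 0;

    key.src_w = src_w;
    key.dest_w = dest_w;
    key.masked = masked;

    for (i = 0; i < STRETCH_CACHE_SIZE; i++) {
        if (stretch_cache[i].code && same_key(&stretch_cache[i].key, &key)) {
            stretch_cache[i].last_used = ++stretch_clock;   /* cache hit */
            return stretch_cache[i].code;
        }
        if (stretch_cache[i].last_used < stretch_cache[victim].last_used)
            victim = i;           /* remember the oldest slot so far */
    }

    code = compile_stretcher(&key);                          /* cache miss */
    if (!code)
        return NULL;

    free(stretch_cache[victim].code);  /* free(NULL) is fine for empty slots */
    stretch_cache[victim].key = key;
    stretch_cache[victim].code = code;
    stretch_cache[victim].last_used = ++stretch_clock;
    return code;
}
```

Calling get_stretcher(32, 64, 0) twice in a row should compile only once and hand back the same pointer both times. The lookup is a linear scan, which seems reasonable for a table this small.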
Of course there would still be a few obscure situations where such a
general implementation would fall down, but to my way of thinking that
is the price of writing generic library code. If I can handle 99% of
situations in an efficient way, I'm willing to sacrifice the remaining
1% in exchange for a cleaner interface (and of course the beauty of
having source code available is that people with really specialised
requirements are able to customise the routines to fit those needs...)

>But then I realized a much easier approach, that also doesn't require
>the user to guess, calculate, or overallocate the memory that's needed
>is to compile into _scratch_mem as now, and then copy that result into
>allocated memory.

It can be done even more simply than that, and there's no need for the
copy! At the start of the function, push the values of _scratch_mem
and _scratch_mem_size into some local variables, and reset _scratch_mem
to NULL and _scratch_mem_size to zero. Run the compiler function as
normal, and it will allocate some new space for the resulting routine.
When it is done, pop the stored _scratch_mem and _scratch_mem_size back
into the global variables, and return the new _scratch_mem buffer (that
was allocated by the compiler) to the caller. When they are done with
it they can just free() the memory, and all will be well...

--
Shawn Hargreaves - shawn AT talula DOT demon DOT co DOT uk - http://www.talula.demon.co.uk/
Beauty is a French phonetic corruption of a short cloth neck ornament.
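
For anyone wanting to try the _scratch_mem save-and-restore trick described above, it might look roughly like the following. This is only a sketch under some assumptions: the two globals are declared locally here as stand-ins for the library's own variables, and compile_into_scratch() and compile_private_stretcher() are hypothetical names, with the former being a stub that grows the scratch buffer with realloc() and fills it with placeholder bytes instead of generating real code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins for the library's internal scratch buffer globals. */
void *_scratch_mem = NULL;
int _scratch_mem_size = 0;

/* Hypothetical stub for the compiler: grows the scratch buffer when it
 * is too small and writes 'size' bytes into it. Returns 0 on success. */
int compile_into_scratch(int size)
{
    assert(size > 0);

    if (size > _scratch_mem_size) {
        void *p = realloc(_scratch_mem, (size_t)size);
        if (!p)
            return -1;
        _scratch_mem = p;
        _scratch_mem_size = size;
    }

    memset(_scratch_mem, 0xC3, (size_t)size);  /* placeholder bytes */
    return 0;
}

/* Compile into a buffer owned by the caller, with no copy. The caller
 * releases the result with free() when finished with it. */
void *compile_private_stretcher(int size)
{
    /* "push" the shared scratch buffer into locals */
    void *saved_mem = _scratch_mem;
    int saved_size = _scratch_mem_size;
    void *result;

    /* make the compiler think there is no scratch buffer yet */
    _scratch_mem = NULL;
    _scratch_mem_size = 0;

    if (compile_into_scratch(size) == 0) {
        result = _scratch_mem;    /* freshly allocated by the compiler */
    }
    else {
        free(_scratch_mem);       /* harmless if nothing was allocated */
        result = NULL;
    }

    /* "pop" the shared scratch buffer back into the globals */
    _scratch_mem = saved_mem;
    _scratch_mem_size = saved_size;

    return result;
}
```

The shared scratch buffer is left exactly as it was found, so other routines that rely on it should not notice that anything happened.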