2006-06-23 Steven G. Johnson

	* NEWS, configure.ac, doc/FAQ/fftw-faq.bfnn, doc/fftw3.texi, kernel/cycle.h, m4/acx_pthread.m4, m4/ax_cc_maxopt.m4, m4/ax_gcc_archflag.m4, m4/ax_gcc_x86_cpuid.m4, threads/threads.c: copy bug fixes from CVS HEAD
	* api/Makefile.am: install x77.h guru.h guru64.h in pkgincludedir
	* configure.ac: whitespace
	* kernel/cycle.h: support cycle counter with xlc on Linux/ppc

2006-06-20 Matteo Frigo

	* tools/fftw-wisdom.c: Stylistic change.

2006-06-20 Steven G. Johnson

	* m4/ax_cc_maxopt.m4: bump date
	* m4/ax_cc_maxopt.m4: correct bug reported by Andrew Salamon ... --enable-portable-binary was ignored (or rather, treated unpredictably) due to typo, grrr

2006-06-02 Steven G. Johnson

	* Makefile.am, api/Makefile.am, dft/Makefile.am, kernel/Makefile.am, rdft/Makefile.am, reodft/Makefile.am, threads/Makefile.am: install 'internal' header files into includedir/fftw3/, includedir/fftw3f/, etcetera....this will make it easier to write external libraries that plug into FFTW internals, e.g. to add new solvers

2006-05-30 Steven G. Johnson

	* threads/threads.c: bug fix, thanks to James Donald for the bug report (only affects experimental semaphore stuff)
	* NEWS: comment
	* m4/acx_pthread.m4: whoops

2006-05-27 Steven G. Johnson

	* m4/acx_pthread.m4: version bump
	* m4/acx_pthread.m4: only check for xlc_r/cc_r if we are not using gcc

2006-05-26 Steven G. Johnson

	* api/fftw3.h: use ptrdiff_t (it's C89 and standard C++, hooray)
	* configure.ac: version bump
	* NEWS: noted 64-bit guru API
	* api/fftw3.h: note that newer versions of VC++ support long long
	* api/fftw3.h: try harder to get a portable 64-bit type
	* api/Makefile.am, api/api.h, api/fftw3.h, api/guru.h, api/guru64.h, api/mktensor-iodims.c, api/mktensor-iodims.h, api/mktensor-iodims64.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-c2r.h, api/plan-guru-dft-r2c.c, api/plan-guru-dft-r2c.h, api/plan-guru-dft.c, api/plan-guru-dft.h, api/plan-guru-r2r.c, api/plan-guru-r2r.h, api/plan-guru-split-dft-c2r.c, api/plan-guru-split-dft-c2r.h, api/plan-guru-split-dft-r2c.c, api/plan-guru-split-dft-r2c.h, api/plan-guru-split-dft.c, api/plan-guru-split-dft.h, api/plan-guru64-dft-c2r.c, api/plan-guru64-dft-r2c.c, api/plan-guru64-dft.c, api/plan-guru64-r2r.c, api/plan-guru64-split-dft-c2r.c, api/plan-guru64-split-dft-r2c.c, api/plan-guru64-split-dft.c: added draft guru64 API

2006-05-22 Steven G. Johnson

	* m4/acx_pthread.m4: added FIXME note
	* m4/acx_pthread.m4: check for xlc_r in addition to cc_r; thanks to Guy Moebs for the bug report

2006-04-21 Steven G. Johnson

	* doc/FAQ/fftw-faq.bfnn: added note about gcc 4.0.1 on MacOS/Intel
	* m4/ax_gcc_archflag.m4: added code for Core Duo; thanks to Eric Branlund
	* m4/ax_gcc_x86_cpuid.m4: fixed failure for -fPIC or for gcc-4 on Apple Intel machines; thanks to Eric Branlund for the bug report

2006-04-12 Matteo Frigo

	* configure.ac: Use -maltivec when checking for altivec.h.

2006-04-03 Steven G. Johnson

	* doc/fftw3.texi: note planner overwriting input in planner-flags reference

2006-03-28 Matteo Frigo

	* doc/FAQ/fftw-faq.bfnn: FAQ entry about --enable-k7 in 64-bit mode.

2006-03-28 Steven G. Johnson

	* configure.ac, libbench2/report.c, tools/fftw-wisdom.c: sprintf -> snprintf, to avoid (harmless) complaints by users/compilers
	* kernel/align.c: silence compiler warning

2006-03-17 Matteo Frigo

	* doc/fftw3.texi: Remove dft/codelets/inplace, add simd/nonportable to list of directories to be compiled on non-unix systems.

2006-03-04 Steven G. Johnson

	* doc/fftw3.texi: whoops
	* doc/fftw3.texi: note that we align the stack ourselves if necessary, with gcc and icc
	* doc/fftw3.texi: clearer distinction between static and automatic storage in C

2006-02-26 Steven G. Johnson

	* libbench2/verify-lib.c: rm unused var

2006-02-25 Matteo Frigo

	* libbench2/my-getopt.c: Improved usage of goto (Dijkstra miserere nostri)

2006-02-25 Steven G. Johnson

	* libbench2/my-getopt.h: boilerplate
	* NEWS: update for upcoming 3.1.1
	* tools/fftw-wisdom.c, tools/fftw_wisdom.1.in: replace obsolete IMPATIENT with MEASURE
	* tools/fftw-wisdom.c: corrected comment

2006-02-25 Matteo Frigo

	* tools/fftw-wisdom.c: -v does not take an argument.
	* libbench2/my-getopt.c: Obey the unix convention that -ab = -a -b

2006-02-25 Steven G. Johnson

	* libbench2/bench-main.c, libbench2/my-getopt.c, tools/fftw-wisdom.c: minor fixes (return error on unrecognized option)
	* tools/fftw-wisdom.c: ugh

2006-02-25 Matteo Frigo

	* libbench2/my-getopt.c: require exact match for long options.
	* libbench2/my-getopt.c: better fix
	* libbench2/my-getopt.c: Fix
	* libbench2/Makefile.am, libbench2/bench-main.c, libbench2/bench.h, libbench2/getopt-utils.c, libbench2/getopt.c, libbench2/getopt.h, libbench2/getopt1.c, libbench2/my-getopt.c, libbench2/my-getopt.h: nothing

2006-02-20 Steven G. Johnson

	* dft/indirect-transpose.c: rm transpose-indirect-inplace solver, which was buggy

2006-02-15 Matteo Frigo

	* kernel/cycle.h: Comment fix.
	* kernel/cycle.h: Cycle counter for Visual C++ x86-64, courtesy of Dirk Michaelis

2006-02-15 Steven G. Johnson

	* doc/Makefile.am: rfftwnd.png is in builddir
	* doc/fftw3.texi: fixed typo: --enable-portable-binary, not --with

2006-02-13 Matteo Frigo

	* dft/dftw-direct.c, rdft/hc2hc-direct.c: estimator tweaks.
	* simd/simd-sse.h, simd/simd-sse2.h: sse/sse2 support for t3?v codelets
	* simd/simd-altivec.h: Use CEXP instead of SIN/COS.
	* genfft/oracle.ml: bug in randomized cse eliminator.

2006-02-12 Matteo Frigo

	* dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, dft/simd/t3b.h, dft/simd/t3f.h, genfft/algsimp.ml, genfft/annotate.ml, genfft/c.ml, genfft/c.mli, genfft/complex.ml, genfft/complex.mli, genfft/expr.ml, genfft/expr.mli, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_mdct.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_r2r.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/gen_twidsq_c.ml, genfft/magic.ml, genfft/oracle.ml, genfft/schedule.ml, genfft/simd.ml, genfft/to_alist.ml, genfft/trig.ml, genfft/twiddle.ml, genfft/twiddle.mli, simd/simd-altivec.h: Added support for t2-style simd codelets. This is altivec only for now; sse/sse2 don't even compile yet.
	* dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, dft/simd/t1s.c, dft/simd/t1s.h, dft/simd/ts.c, dft/simd/ts.h, genfft/twiddle.ml: Added support for t2-style simd split-complex codelets.

2006-02-10 Steven G. Johnson

	* m4/ax_openmp.m4: *** empty log message ***
	* m4/ax_openmp.m4: punctuation
	* api/f77api.c, api/f77funcs.h: windows DLL stuff for Fortran interface

2006-02-10 Matteo Frigo

	* configure.ac: Bumped version to 3.1.1
	* kernel/ifftw.h: Precompute array indices on x86-64. Speeds up Pentium IV and makes no appreciable difference on AMD.

2006-02-08 Matteo Frigo

	* simd/Makefile.am, simd/sse.c, simd/sse2.c, simd/x86-cpuid.h: Check whether the processor supports CPUID before issuing the instruction. (Grrr...) Code contributed by Eric J. Korpela.
	* kernel/cycle.h: icc supports x86_64 these days.

2006-02-05 Matteo Frigo

	* kernel/primes.c: Paranoia.

2006-01-30 Steven G. Johnson

	* kernel/primes.c: whoops, fixed assert (y <= x)
	* kernel/primes.c: note that safe_mulmod requires {x,y} < p (or at least < 2p), and added assert

2006-01-30 Matteo Frigo

	* libbench2/bench-user.h, libbench2/timer.c: fixed aix/xlc lossage
	* libbench2/verify-lib.c: In the impulse test, normalize the impulse so that the impulse and the random vectors have roughly the same L2 norm. This change reduces the number of bits that we lose because of floating-point cancellation, so that we can focus on the bits that we lose because of bugs.
	* rdft/dht-rader.c: Compute omega in trigreal precision, as opposed to R.

2006-01-28 Steven G. Johnson

	* Makefile.am, configure.ac, tests/Makefile.am, threads/Makefile.am, tools/Makefile.am: add --with-combined-threads option as workaround to Windows inability to build shared libs with dependencies

2006-01-27 Steven G. Johnson

	* threads/Makefile.am: libfftw3_threads should *not* use -no-undefined because, in fact, it is not true -- this library depends on -lfftw3, and is not self-contained
	* NEWS: updated

2006-01-27 Matteo Frigo

	* api/apiplan.c, dft/bluestein.c, dft/buffered.c, dft/ct.c, dft/ctsq.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/indirect-transpose.c, dft/indirect.c, dft/rader.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/timer.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, reodft/rodft00e-r2hc.c, tests/hook.c, threads/ct.c, threads/dft-vrank-geq1.c, threads/hc2hc.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Added paranoid stack alignment when awaking plans. While I was at it, removed obsolete, redundant AWAKE macro.
	* NEWS: Updated for 3.1.
	* TODO, libbench2/bench-main.c: ditched one alignment check and noted that we should eliminate the rest as well

2006-01-26 Matteo Frigo

	* libbench2/bench-main.c: alignment hack
	* m4/ax_gcc_archflag.m4: detect pentium M

2006-01-25 Steven G. Johnson

	* m4/ax_gcc_archflag.m4: don't trust host_cpu if it claims we are on i386/i486, and call cpuid anyway (if it fails we use no arch flag). This is needed on FreeBSD
	* kernel/kalloc.c: suggest --with-our-malloc16 in error message
	* configure.ac: ditto for -no-gcc
	* configure.ac: flags required for successful compilation should be added even if the user overrides CFLAGS

2006-01-24 Steven G. Johnson

	* m4/ax_openmp.m4: upcoming gcc OpenMP support uses -fopenmp
	* m4/ax_openmp.m4: note that PGI uses -mp as well

2006-01-23 Matteo Frigo

	* kernel/cycle.h, simd/sse.c, simd/sse2.c: my best guess at how to fix the microsoft crap du jour

2006-01-23 Steven G. Johnson

	* configure.ac, kernel/cycle.h: use -Masmkeyword for PGI cycle counter, grr

2006-01-22 Matteo Frigo

	* configure.ac: Bumped version number to 3.1.

2006-01-21 Matteo Frigo

	* configure.ac: Report that --enable-k7 is incompatible with --enable-shared.
	* Makefile.am: Do not use empty libraries in LIBADD, since otherwise the linker fails on Solaris.

2006-01-18 Steven G. Johnson

	* bootstrap.sh: warn end-users away from this file

2006-01-17 Matteo Frigo

	* simd/simd-sse.h: Gcc sucks.
	* tests/hook.c: Disabled checks that may turn out to be too paranoid.
	* tests/hook.c: Some paranoid checks.
	* libbench2/ovtpvt.c: Flush stdout after printing.
	* kernel/alloc.c, tests/bench.c: Run the leak detector in all cases, not just when verbose > 2.
	* api/mapflags.c: Eliminate calls to pow(), rint().

2006-01-17 Steven G. Johnson

	* kernel/ifftw.h: put # in first column, for stylistic consistency

2006-01-17 Matteo Frigo

	* api/mapflags.c, kernel/ifftw.h, kernel/planner.c: Made timeout part of impatience flags, in order to improve the usability of wisdom. Also, fixed bogus error recovery logic in planner.c:imprt().

2006-01-17 Steven G. Johnson

	* api/apiplan.c, api/fftw3.h, doc/fftw3.texi, kernel/planner.c: make timelimit < 0 .eq. FFTW_NO_TIMELIMIT

2006-01-17 Matteo Frigo

	* api/apiplan.c, api/fftw3.h, api/the-planner.c, doc/fftw3.texi, kernel/planner.c, tests/bench.c: Eliminated the FFTW_TIMELIMIT flag in favor of this simpler logic: fftw_set_timelimit(0) disables time limit. fftw_set_timelimit(X), X>0 sets the time limit to X.

2006-01-16 Matteo Frigo

	* api/apiplan.c: Force the use of the estimator when wisdom fails because of md5 collisions, otherwise the planner takes forever.
	* kernel/ifftw.h: Ranted about how broken gcc-4 is.

2006-01-16 Steven G. Johnson

	* api/apiplan.c, api/fftw3.h, doc/fftw3.texi, tests/bench.c: change fftw_timelimit global var to fftw_set_timelimit(double) function, for simpler usage with shared libraries and for consistency with e.g. set_numthreads

2006-01-16 Matteo Frigo

	* doc/fftw3.texi: Minor tweaks.

2006-01-15 Matteo Frigo

	* libbench2/timer.c: tweaks to make sure that time_n() is always called from the same stack position.
	* libbench2/bench.h, libbench2/speed.c, libbench2/timer.c, libbench2/timer2.c: Major simplification of the timer calibration logic. Also, use an FFT as a unit of work instead of the old pointer chasing, because God knows how pointer chasing interacts with the idiotic cache-hit speculation on the Pentium IV.
	* kernel/align.c: Fixed broken alignment checks when sizeof(R)==12.
	* libbench2/timer2.c: Manual unrolling of loop.
	* libbench2/Makefile.am, libbench2/bench.h, libbench2/timer.c, libbench2/timer2.c: Various improvements to timer calibration routines.
	* libbench2/timer.c: cygwin defines __CYGWIN__, not __WIN32__ etc.
	* libbench2/bench-user.h, libbench2/speed.c, libbench2/timer.c, tests/bench.c: fixed confusion between libbench and user timers

2006-01-14 Steven G. Johnson

	* NEWS: update

2006-01-14 Matteo Frigo

	* simd/simd-sse.h: Comment.
	* simd/simd-sse.h: Workaround gcc bug.
	* configure.ac: Switched to -beta2.

2006-01-13 Matteo Frigo

	* rdft/buffered.c, rdft/indirect.c, rdft/problem.c, rdft/rank0-rdft2.c, rdft/rdft.h, rdft/vrank3-transpose.c: Fixed technically correct but highly obfuscated use of the enum tag R2HC as a null pointer.

2006-01-13 Steven G. Johnson

	* configure.ac: --enable-unsafe-mulmod is obsolete

2006-01-13 Matteo Frigo

	* TODO: More thoughts.
	* rdft/buffered2.c: Removed loop unrolling because it slows things down on at least one powerpc and it generates clumsy x86 code.

2006-01-13 Steven G. Johnson

	* kernel/kalloc.c: tweaks

2006-01-12 Steven G. Johnson

	* kernel/ifftw.h: MacOSX x86 ABI specifies that the stack is kept 16-byte aligned

2006-01-12 Matteo Frigo

	* kernel/cycle.h: ``ret'' is a reserved word in the evil empire.
	* simd/sse2.c, simd/sse.c: Changed ret => result because ret ``is a reserved word'' in the evil empire.
	* simd/simd-sse2.h: Workaround Visual c++ lossage.
	* simd/simd-sse.h: Workaround visual c++ lossage.
	* libbench2/getopt-utils.c: isprint() is guaranteed to work for unsigned char + EOF only.

2006-01-11 Steven G. Johnson

	* rdft/vrank3-transpose.c: rm obsolete fixme
	* rdft/vrank3-transpose.c: *** empty log message ***
	* rdft/vrank3-transpose.c: fix comment

2006-01-11 Matteo Frigo

	* dft/bluestein.c, rdft/buffered2.c, rdft/dht-rader.c, rdft/rank0-rdft2.c, reodft/rodft00e-r2hc-pad.c: Paranoid use of K(x) for all constants x, to avoid runtime double->float conversions on sufficiently stupid compilers.
	* simd/simd-sse.h: Workaround to gcc nonsense.

2006-01-10 Steven G. Johnson

	* rdft/vrank3-transpose.c: bug fix: infinite loop in transpose-cut planning
	* api/fftw3.h: clarified comment
	* tests/bench.c: more Windows decorations
	* support/Makefile.codelets: added FIXME comment
	* support/Makefile.codelets: 'make clean' should not delete codlist.c since it is included in the dist tarball

2006-01-10 Matteo Frigo

	* dft/dftw-direct.c: Change threshold for ``large'' Cooley-Tukey to 256K from 64K, since it seems to benefit the Pentium IV with sse and the planning cost is not too horrible.

2006-01-10 Steven G. Johnson

	* kernel/ifftw.h: more missing Windows DLL decorations
	* rdft/dht-rader.c: remove unused var
	* threads/threads.c: allow compiler threads, if enabled, to take precedence over explicit threads
	* api/api.h, kernel/planner.c: *** empty log message ***

2006-01-10 Matteo Frigo

	* kernel/planner.c: Fixed comment typo.
	* kernel/planner.c: Rearranged timeout checks so as to eliminate one of them.
	* kernel/plan.c: Converted residual CK() -> A().
	* kernel/planner.c: Maintain the invariant TIMED_OUT ==> NEED_TIMEOUT_CHECK.
	* api/mapflags.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/buffered.c, kernel/md5.c, kernel/scan.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: silence some 64-bit warnings
	* tests/hook.c: Assertions.

2006-01-10 Steven G. Johnson

	* kernel/timer.c: some condensing
	* api/apiplan.c, kernel/ifftw.h, kernel/planner.c, kernel/timer.c: eliminate X(seconds) in favor of X(elapsed_since), in paranoia of clock wrap
	* kernel/timer.c: *** empty log message ***
	* kernel/timer.c: hmm, a bit more pessimistic about clock wrapping

2006-01-10 Matteo Frigo

	* configure.ac, kernel/ifftw.h: Revert to md5uint = unsigned int whenever possible, so as to avoid wasting space for unsigned long on 64-bit machines.

2006-01-10 Steven G. Johnson

	* kernel/timer.c: note why clock() wrap should not be a concern
	* kernel/planner.c: bugfix in recent timeout changes - check for case where last solver times out
	* NEWS: started changes list from beta

2006-01-10 Matteo Frigo

	* api/mapflags.c: Paranoia.
	* kernel/planner.c: Paranoid assertions.
	* tests/hook.c: Added FIXME comment stating the 64-bit uncleanliness of fftw_tensor_to_bench_tensor().
	* dft/simd/t.c: Another 64-bit bug.

2006-01-10 Steven G. Johnson

	* api/api.h, kernel/ifftw.h, tests/hook.c: more Windows DLL nonsense
	* api/api.h, kernel/ifftw.h: some additional dllexport tags required to build the test program, due to internal stuff called by hook.c
	* api/fftw3.h: *** empty log message ***
	* api/fftw3.h: comment
	* api/fftw3.h, api/api.h: *** empty log message ***
	* api/fftw3.h: clarification
	* api/api.h: define FFTW_DLL if DLL_EXPORT (defined by libtool) is supplied
	* api/fftw3.h: whoops
	* api/fftw3.h: another stab at Windows DLL mess

2006-01-10 Matteo Frigo

	* simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: 64-bit clean SIMD header file. I missed those because sparse does not know vector types. Grrr...

2006-01-10 Steven G. Johnson

	* configure.ac: this option is called AC_DISABLE_SHARED in the documentation

2006-01-09 Steven G. Johnson

	* m4/ax_gcc_archflag.m4: fixed --with-gcc-arch to work when cross-compiling

2006-01-09 Matteo Frigo

	* api/apiplan.c, kernel/ifftw.h, kernel/planner.c: Moved the timeout check back into the search loop, sicut erat in principio. This gives us a precise control over the timeout. To avoid the overhead of X(seconds)(), only call X(seconds)() if some time measurement was taken since the last call to X(seconds)().

2006-01-09 Steven G. Johnson

	* rdft/vrank3-transpose.c: comments
	* rdft/vrank3-transpose.c: generalized transpose-cut routine to be able to call transpose-gcd recursively; TOMS follow-the-cycles algorithm now seems to be completely superseded
	* threads/threads.c: *** empty log message ***
	* threads/threads.c: ignore errors from setscope -- POSIX standard does not require PTHREAD_SCOPE_SYSTEM to be supported, and PTHREAD_SCOPE_PROCESS is usually okay in that case

2006-01-08 Steven G. Johnson

	* rdft/vrank3-transpose.c: added TODO comment
	* rdft/vrank3-transpose.c: whoops

2006-01-08 Matteo Frigo

	* NEWS: Boasted ``much faster altivec performance''.
	* configure.ac, dft/simd/codelets/Makefile.am, genfft/annotate.ml, genfft/magic.ml, genfft/schedule.ml, support/Makefile.codelets, support/twovers.sh: Added a new pass to the generator to schedule for the pipeline latency. (This schedule modifies the ``optimal'' cache-oblivious schedule and hence it uses more registers.) This pass is currently:
	  * disabled for non-fma code, under the assumption that this will run on a register-starved fma.
	  * enabled for non-simd fma code, under the assumption that this will run on a processor with 32 or more FP registers. The latency of 4 is conservative and does not introduce too much register pressure.
	  * enabled for simd fma code, under the assumption that this will run on altivec. The latency of 8 seems to produce the best results.

2006-01-08 Steven G. Johnson

	* rdft/vrank3-transpose.c: fixed estimator for vrank3-transpose
	* NEWS: more detail on VC++ workaround
	* rdft/vrank3-transpose.c: typo
	* rdft/vrank3-transpose.c: screw it, just use planner for all sub-transposes in vrank3-transpose (still just use memcpy for contiguous copies, though)
	* kernel/tile2d.c: add an assert
	* kernel/ifftw.h, rdft/rank0.c, rdft/vrank3-transpose.c: vrank3-transpose now uses planner to decide whether to use cpy2d, cpy2d_tiled, etc.
	* kernel/primes.c: too annoying to have isqrt unexpectedly fail for n==0

2006-01-07 Steven G. Johnson

	* NEWS, doc/fftw3.texi: clarifications
	* rdft/vrank3-transpose.c: comment fix
	* doc/FAQ/fftw-faq.bfnn: more faq updates
	* configure.ac, doc/FAQ/fftw-faq.bfnn: enable fma on hppa, update FAQ entry

2006-01-07 Matteo Frigo

	* dft/simd/t.c: Accommodate different semantics of 'const' in C and C++
	* NEWS: Altivec is called VMX in IBM land.
	* NEWS: Noted faster altivec support.

2006-01-07 Steven G. Johnson

	* m4/ax_cc_maxopt.m4: updated icc flag detection

2006-01-06 Matteo Frigo

	* TODO: Note ``memoize triggen''.
	* mkdist.sh: Use --enable-threads to generate dependencies in the threads/ directory.
	* kernel/ifftw.h: Workaround to icc #defining __GNUC__.
	* configure.ac: Switched name to 3.1-beta1.
	* TODO: More thoughts.
	* TODO: Note wish that (block_size % 4) == 0.
	* dft/codelet-dft.h, dft/codelets/t.c, dft/ctsq.c, dft/dftw-direct.c, dft/k7/k7.c, dft/simd/q1b.c, dft/simd/q1f.c, dft/simd/t.c, dft/simd/t1s.c, threads/ct.c, threads/hc2hc.c: Check alignment of mstart, mcount in SIMD codelets.
	* bootstrap.sh: Enable threads at bootstrap time, so I get the compiler warnings that I would otherwise ignore.

2006-01-05 Matteo Frigo

	* threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: made compilable by c++
	* kernel/twiddle.c: FIXED: incorrect twiddle_shift()
	* reodft/redft00e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, threads/ct.c, threads/dft-vrank-geq1.c, threads/hc2hc.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Replaced remnants of awake flag with the new enum wakefulness type.
	* kernel/planner.c: Oops---there is no need to find a free slot.
	* kernel/planner.c: Assertions.
	* kernel/planner.c: Commented the hash table lookup algorithm.
	* kernel/planner.c: Fixed infinite loop in hashtable lookup/insert. Grrr...

2006-01-05 Steven G. Johnson

	* COPYRIGHT, api/api.h, api/apiplan.c, api/configure.c, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-r2r.c, api/execute-split-dft-c2r.c, api/execute-split-dft-r2c.c, api/execute-split-dft.c, api/execute.c, api/export-wisdom-to-file.c, api/export-wisdom-to-string.c, api/export-wisdom.c, api/extract-reim.c, api/f77api.c, api/f77funcs.h, api/fftw3.h, api/flops.c, api/forget-wisdom.c, api/import-system-wisdom.c, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, api/import-wisdom.c, api/malloc.c, api/map-r2r-kind.c, api/mapflags.c, api/mkprinter-file.c, api/mktensor-iodims.c, api/mktensor-rowmajor.c, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-guru-split-dft-c2r.c, api/plan-guru-split-dft-r2c.c, api/plan-guru-split-dft.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, api/plan-r2r.c, api/print-plan.c, api/rdft2-pad.c, api/the-planner.c, api/version.c, api/x77.h, dft/bluestein.c, dft/buffered.c, dft/codelet-dft.h, dft/codelets/n.c, dft/codelets/n.h, dft/codelets/t.c, dft/codelets/t.h, dft/conf.c, dft/ct.c, dft/ct.h, dft/ctsq.c, dft/dft.h, dft/dftw-direct.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/direct.c, dft/generic.c, dft/indirect-transpose.c, dft/indirect.c, dft/k7/k7.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, dft/kdft.c, dft/nop.c, dft/plan.c, dft/problem.c, dft/rader.c, dft/rank-geq2.c, dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/n2b.c, dft/simd/n2b.h, dft/simd/n2f.c, dft/simd/n2f.h, dft/simd/n2s.c, dft/simd/n2s.h, dft/simd/q1b.c, dft/simd/q1b.h, dft/simd/q1f.c, dft/simd/q1f.h, dft/simd/t.c, dft/simd/t1b.h, dft/simd/t1f.h, dft/simd/t1s.c, dft/simd/t1s.h, dft/simd/t2b.h, dft/simd/t2f.h, dft/solve.c, dft/vrank-geq1.c, dft/zero.c, doc/f77_wisdom.f, doc/fftw3.texi, genfft-k7/algsimp.ml, genfft-k7/algsimp.mli, genfft-k7/assoctable.ml, genfft-k7/assoctable.mli, genfft-k7/complex.ml, genfft-k7/complex.mli, genfft-k7/expr.ml, genfft-k7/expr.mli, genfft-k7/fft.ml, genfft-k7/gen_notw.ml, genfft-k7/littlesimp.ml, genfft-k7/littlesimp.mli, genfft-k7/monads.ml, genfft-k7/number.ml, genfft-k7/number.mli, genfft-k7/oracle.ml, genfft-k7/oracle.mli, genfft-k7/to_alist.ml, genfft-k7/to_alist.mli, genfft-k7/twiddle.ml, genfft-k7/twiddle.mli, genfft-k7/vScheduler.mli, genfft/algsimp.ml, genfft/algsimp.mli, genfft/annotate.ml, genfft/annotate.mli, genfft/assoctable.ml, genfft/assoctable.mli, genfft/c.ml, genfft/c.mli, genfft/complex.ml, genfft/complex.mli, genfft/conv.ml, genfft/conv.mli, genfft/dag.ml, genfft/dag.mli, genfft/expr.ml, genfft/expr.mli, genfft/fft.ml, genfft/fft.mli, genfft/gen_athnotw.ml, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_mdct.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_r2r.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/gen_twidsq_c.ml, genfft/genutil.ml, genfft/littlesimp.ml, genfft/littlesimp.mli, genfft/magic.ml, genfft/monads.ml, genfft/number.ml, genfft/number.mli, genfft/oracle.ml, genfft/oracle.mli, genfft/schedule.ml, genfft/schedule.mli, genfft/simd.ml, genfft/simd.mli, genfft/simdmagic.ml, genfft/to_alist.ml, genfft/to_alist.mli, genfft/trig.ml, genfft/trig.mli, genfft/twiddle.ml, genfft/twiddle.mli, genfft/unique.ml, genfft/unique.mli, genfft/util.ml, genfft/util.mli, genfft/variable.ml, genfft/variable.mli, kernel/align.c, kernel/alloc.c, kernel/assert.c, kernel/awake.c, kernel/buffered.c, kernel/cpy1d.c, kernel/cpy2d-pair.c, kernel/cpy2d.c, kernel/ct.c, kernel/cycle.h, kernel/debug.c, kernel/hash.c, kernel/iabs.c, kernel/ifftw.h, kernel/kalloc.c, kernel/md5-1.c, kernel/md5.c, kernel/minmax.c, kernel/ops.c, kernel/pickdim.c, kernel/plan.c, kernel/primes.c, kernel/print.c, kernel/problem.c, kernel/rader.c, kernel/scan.c, kernel/solver.c, kernel/solvtab.c, kernel/stride.c, kernel/tensor.c, kernel/tensor1.c, kernel/tensor2.c, kernel/tensor4.c, kernel/tensor5.c, kernel/tensor7.c, kernel/tensor8.c, kernel/tensor9.c, kernel/tile2d.c, kernel/timer.c, kernel/transpose.c, kernel/trig.c, kernel/twiddle.c, libbench/accopy-from.c, libbench/accopy-to.c, libbench/allocate.c, libbench/bench-main.c, libbench/bench-user.h, libbench/bench.h, libbench/can-do.c, libbench/ccopy-from.c, libbench/ccopy-to.c, libbench/deallocate.c, libbench/getopt-utils.c, libbench/info.c, libbench/main.c, libbench/prime.c, libbench/problem.c, libbench/report.c, libbench/speed.c, libbench/timer.c, libbench/verify.c, libbench/zero.c, libbench2/aligned-main.c, libbench2/allocate.c, libbench2/can-do.c, libbench2/dotens2.c, libbench2/getopt-utils.c, libbench2/info.c, libbench2/main.c, libbench2/report.c, libbench2/tensor.c, libbench2/useropt.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.c, libbench2/verify.h, libbench2/zero.c, m4/ax_gcc_archflag.m4, rdft/buffered.c, rdft/buffered2.c, rdft/codelet-rdft.h, rdft/codelets/hb.h, rdft/codelets/hc2r.c, rdft/codelets/hc2r.h, rdft/codelets/hc2rIII.h, rdft/codelets/hf.h, rdft/codelets/hfb.c, rdft/codelets/r2hc.c, rdft/codelets/r2hc.h, rdft/codelets/r2hcII.h, rdft/codelets/r2r.c, rdft/codelets/r2r.h, rdft/conf.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-common.c, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/indirect.c, rdft/khc2hc.c, rdft/khc2r.c, rdft/kr2hc.c, rdft/kr2r.c, rdft/nop.c, rdft/nop2.c, rdft/plan.c, rdft/plan2.c, rdft/problem.c, rdft/problem2.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-inplace-strides.c, rdft/rdft2-radix2.c, rdft/rdft2-strides.c, rdft/rdft2-tensor-max-index.c, rdft/solve.c, rdft/solve2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/conf.c, reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft.h, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, reodft/rodft00e-r2hc.c, simd/altivec.c, simd/nonportable/sse.c, simd/nonportable/sse2.c, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h, simd/sse.c, simd/sse2.c, simd/taint.c, threads/api.c, threads/conf.c, threads/ct.c, threads/dft-vrank-geq1.c, threads/f77api.c, threads/f77funcs.h, threads/hc2hc.c, threads/rdft-vrank-geq1.c, threads/threads.c, threads/threads.h, threads/vrank-geq1-rdft2.c, tools/fftw-wisdom-to-conf.1, tools/fftw-wisdom-to-conf.in, tools/fftw-wisdom.c, tools/fftw_wisdom.1.in: updated copyright years to 2006
	* m4/ax_gcc_archflag.m4: whoops
	* m4/ax_gcc_archflag.m4: more updates for recent pentia/amd

2006-01-05 Matteo Frigo

	* TODO: Pruned TODO.
	* libbench2/bench-user.h, libbench2/bench.h: Prototype of problem_destroy()

2006-01-05 Steven G. Johnson

	* TODO: rm obsoleted TODOs

2006-01-05 Matteo Frigo

	* m4/ax_gcc_archflag.m4: Fallback to 970 if neither -mcpu=power5 nor -mcpu=power4 are supported.

2006-01-05 Steven G. Johnson

	* NEWS: NEWS updates, clarifications, and reorganization
	* dft/dftw-genericbuf.c, kernel/planner.c, kernel/trig.c, m4/ax_gcc_x86_cpuid.m4, rdft/dft-r2hc.c: remove some compiler warnings, add an assert check, make estimator work properly for nop plans

2006-01-04 Matteo Frigo

	* api/apiplan.c, api/fftw3.h, api/mapflags.c, configure.ac, dft/bluestein.c, dft/buffered.c, dft/ct.c, dft/ctsq.c, dft/dftw-direct.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/direct.c, dft/generic.c, dft/indirect-transpose.c, dft/indirect.c, dft/rader.c, dft/rank-geq2.c, dft/vrank-geq1.c, genfft/twiddle.ml, kernel/awake.c, kernel/ifftw.h, kernel/plan.c, kernel/planner.c, kernel/timer.c, kernel/trig.c, kernel/twiddle.c, libbench2/bench-main.c, libbench2/bench.h, libbench2/problem.c, libbench2/speed.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, reodft/redft00e-r2hc-pad.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, tests/hook.c: Two big changes: 1) revised the twiddle generation machinery, to avoid generating twiddles when measuring, and to use a faster O(sqrt(N)) table when this entails no loss of precision. 2) implemented new ALLOW_PRUNING estimator hack.

2005-12-25 Matteo Frigo

	* dft/generic.c, rdft/generic.c: Estimator tweaks, mostly to favor generic over rader for small n.

2005-12-24 Matteo Frigo

	* tests/hook.c: Grrr... missing break statement in switch.
	* genfft/gen_hc2hc.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/gen_twidsq_c.ml, rdft/codelet-rdft.h, dft/codelet-dft.h, genfft-k7/gen_twiddle.ml: Swapped fields TW and OPS in struct ct_desc_s, to make k7 asm code insensitive to -malign-double. For consistency, changed struct hc2hc_desc_s in the same way.
	* kernel/planner.c: Wrong check for infeasible slvndx in imprt().
	* kernel/planner.c: Removed obsolete function invoke_solver_if_correct_kind().
	* kernel/primes.c: Faster implementation of safe_mulmod(), avoiding divisions altogether. Works for 0 <= p <= INT_MAX.
	* api/mapflags.c: FFTW_ALLOW_LARGE_GENERIC must belong to flags->l, it cannot be overridden by fftw.

2005-12-24 Steven G. Johnson

	* kernel/primes.c: no more need for limits.h, add some explanatory comments

2005-12-23 Matteo Frigo

	* dft/k7/k7.c: Paranoia.
	* kernel/ifftw.h, kernel/planner.c: Fixed subtle bug involving overflow of the slvndx field in flags_t.
	* NEWS: Note 64-bit clean.
	* threads/ct.c, threads/dft-vrank-geq1.c, threads/hc2hc.c, threads/rdft-vrank-geq1.c, threads/threads.h, threads/vrank-geq1-rdft2.c: Threads are now 64-bit clean
	* kernel/ifftw.h: Restored the old numbering TW_NEXT=3 etc, because the k7 code depends on it.
	* configure.ac, kernel/ifftw.h, kernel/primes.c: Portable implementation of MULMOD() and safe_mulmod(). Removed all unnecessary AC_CHECK_SIZEOF() from configure.ac.

2005-12-22 Matteo Frigo

	* genfft/gen_r2r.ml: Inline the loop body in r2r codelets like we do everywhere else.
	* dft/conf.c: Oops.
	* dft/bluestein.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/rader.c, kernel/ifftw.h, kernel/trig.c, kernel/twiddle.c, rdft/dht-rader.c: Renamed X(sin_and_cos)() to X(cexp)().
	* dft/bluestein.c, dft/conf.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/rader.c, kernel/Makefile.am, kernel/ifftw.h, kernel/trig.c, kernel/trig1.c, kernel/twiddle.c, rdft/dht-rader.c: Somewhat faster generation of twiddle factors.
2005-12-21 Matteo Frigo * kernel/md5.c: tweaks * dft/bluestein.c, dft/buffered.c, dft/ct.c, dft/ctsq.c, dft/dft.h, dft/direct.c, dft/generic.c, dft/indirect-transpose.c, dft/indirect.c, dft/nop.c, dft/problem.c, dft/rader.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/problem.c, rdft/problem2.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, reodft/rodft00e-r2hc.c, tests/hook.c: Sped up planner, esp. in estimate mode. The planner now classifies all solvers into DFT, RDFT, and RDFT2, and it only invokes solvers appropriate for the problem being planned. Because we have several hundred solvers, the overhead of calling irrelevant solvers is significant, and this modification mitigates the issue somewhat. 2005-12-20 Matteo Frigo * kernel/print.c: Eliminated all calls to sprintf() in favor of own routines, so as not to force users to link stdio and the associated locale/pthreads crap. * kernel/ifftw.h, kernel/print.c: Implemented routine to print INT, removing the need for c99's %td format. 2005-12-19 Matteo Frigo * kernel/alloc.c: info->n is size_t 2005-12-18 Matteo Frigo * configure.ac, dft/problem.c, rdft/problem.c, rdft/problem2.c: Explicit casts in front of pointer difference in printf() context, just in case INT != ptrdiff_t. 
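Illustration for the kernel/print.c entries above (printing INT without sprintf() or C99's %td): a pointer-difference-sized integer can be converted to decimal by hand, so that no stdio formatting routine is needed. This is only a sketch and not the routine in kernel/print.c; the name format_int is hypothetical, and it assumes size_t is at least as wide as ptrdiff_t and that buf has room for 3 * sizeof(ptrdiff_t) + 2 characters.

```c
#include <stddef.h> /* ptrdiff_t, size_t */

/* Hypothetical sketch: write the decimal form of n into buf, append a
   terminating NUL, and return the number of characters written (not
   counting the NUL). */
size_t format_int(ptrdiff_t n, char *buf)
{
     char tmp[3 * sizeof(ptrdiff_t) + 1]; /* 3 decimal digits per byte is a safe bound */
     size_t ndigits = 0, len = 0;
     /* take the magnitude in unsigned arithmetic, so that the most
        negative value does not overflow on negation */
     size_t u = (n < 0) ? (size_t)0 - (size_t)n : (size_t)n;

     do { /* produce digits least-significant first */
          tmp[ndigits++] = (char)('0' + (int)(u % 10));
          u /= 10;
     } while (u != 0);

     if (n < 0)
          buf[len++] = '-';
     while (ndigits > 0) /* reverse the digits into buf */
          buf[len++] = tmp[--ndigits];
     buf[len] = '\0';
     return len;
}
```

For example, format_int(-1024, buf) stores "-1024" in buf and returns 5.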
* kernel/print.c: Forgot to add %D to print.c * dft/bluestein.c, dft/buffered.c, dft/ct.c, dft/ctsq.c, dft/dftw-direct.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/direct.c, dft/generic.c, dft/problem.c, dft/rader.c, dft/vrank-geq1.c, kernel/print.c, kernel/tensor.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/problem.c, rdft/problem2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, reodft/rodft00e-r2hc.c: Use %D as format character for type INT. * kernel/ifftw.h, kernel/stride.c: Changed type of an_int_guaranteed_to_be_zero. Changed name as well. 
* kernel/ifftw.h, kernel/planner.c, kernel/print.c: converted %o -> INT * dft/bluestein.c, dft/buffered.c, dft/codelet-dft.h, dft/codelets/n.c, dft/codelets/t.c, dft/ct.c, dft/ct.h, dft/ctsq.c, dft/dftw-direct.c, dft/dftw-generic.c, dft/dftw-genericbuf.c, dft/direct.c, dft/generic.c, dft/indirect-transpose.c, dft/problem.c, dft/rader.c, dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/n2b.c, dft/simd/n2f.c, dft/simd/n2s.c, dft/simd/q1b.c, dft/simd/q1f.c, dft/simd/t.c, dft/simd/t1s.c, dft/vrank-geq1.c, dft/zero.c, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_r2r.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/gen_twidsq_c.ml, kernel/buffered.c, kernel/cpy1d.c, kernel/cpy2d-pair.c, kernel/cpy2d.c, kernel/ct.c, kernel/iabs.c, kernel/ifftw.h, kernel/md5-1.c, kernel/minmax.c, kernel/ops.c, kernel/planner.c, kernel/primes.c, kernel/rader.c, kernel/solvtab.c, kernel/stride.c, kernel/tensor.c, kernel/tensor1.c, kernel/tensor2.c, kernel/tensor4.c, kernel/tensor7.c, kernel/tile2d.c, kernel/transpose.c, kernel/trig.c, kernel/twiddle.c, rdft/buffered.c, rdft/buffered2.c, rdft/codelet-rdft.h, rdft/codelets/hc2r.c, rdft/codelets/hfb.c, rdft/codelets/r2hc.c, rdft/codelets/r2r.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-common.c, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/problem.c, rdft/problem2.c, rdft/rank0-rdft2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-inplace-strides.c, rdft/rdft2-radix2.c, rdft/rdft2-strides.c, rdft/rdft2-tensor-max-index.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, 
reodft/rodft00e-r2hc.c, simd/sse2.c, simd/taint.c: Major 64-bit cleanup. 2005-12-08 Steven G. Johnson * kernel/cycle.h: PGI x86-64 cycle counter, courtesy Cristiano Calonaci 2005-12-06 Matteo Frigo * kernel/planner.c: Must insert into hash table when wisdom_state == WISDOM_ONLY, otherwise wisdom does not work. 2005-10-08 Steven G. Johnson * m4/acx_pthread.m4: comment 2005-10-02 Matteo Frigo * api/apiplan.c, kernel/ifftw.h, kernel/planner.c: Paranoia: made planner robust against MD5 collisions. 2005-09-28 Matteo Frigo * doc/FAQ/fftw-faq.bfnn: Note that --enable-3dnow is unsupported. * NEWS: * Removed --enable-3dnow support. * SIMD support for split complex arrays. * api/version.c, configure.ac, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq_c.ml, kernel/align.c, kernel/ifftw.h, simd/3dnow.c, simd/Makefile.am, simd/simd-3dnow.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h: Removed --enable-3dnow, since it is becoming useless as the world moves to x86-64, and it is a pain to maintain. (We should probably remove the k7 stuff as well.) * genfft/gen_notw.ml, genfft/gen_twiddle.ml: Missing BEGIN_SIMD(), END_SIMD() statements. 2005-09-27 Matteo Frigo * simd/simd-sse.h: Tweaks * genfft/to_alist.ml, dft/dftw-direct.c: Fixed wrong opcount for simd codelets. 
* genfft/c.ml, simd/simd-altivec.h, simd/simd-sse2.h: fixed flop counts * simd/simd-sse2.h: Silence warnings * dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, dft/simd/n2s.c, dft/simd/n2s.h, dft/simd/t1s.c, dft/simd/t1s.h, genfft/annotate.ml, genfft/c.ml, genfft/gen_hc2hc.ml, genfft/gen_notw.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq_c.ml, genfft/genutil.ml, genfft/simd.ml, genfft/twiddle.ml, genfft/twiddle.mli, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h: Implemented split-complex SIMD codelets 2005-09-26 Matteo Frigo * dft/simd/codelets/Makefile.am, genfft/annotate.ml, genfft/annotate.mli, genfft/expr.ml, genfft/expr.mli, genfft/gen_notw_c.ml, genfft/simd.ml, genfft/simdmagic.ml, simd/simd-3dnow.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: Generalized the ``store pairs'' trick (now called ``store multiple''). 2005-09-25 Matteo Frigo * simd/simd-altivec.h: Silence some warnings. 2005-09-24 Matteo Frigo * simd/simd-altivec.h: Removed obsolete cruft 2005-09-20 Matteo Frigo * configure.ac, simd/simd-altivec.h: Re-enabled check for altivec.h because OSX requires it. 2005-09-11 Matteo Frigo * configure.ac: Check for sizeof(unsigned int) unconditionally, because the result is used by ifftw.h. * dft/simd/t.c: Higher size limit for t2 codelets. * dft/simd/Makefile.am, dft/simd/t.c, dft/simd/t1b.c, dft/simd/t1f.c, dft/simd/t2b.h, dft/simd/t2f.h: Heuristic: do not use t2 simd codelets for N>1024. 2005-09-06 Matteo Frigo * libbench2/timer.c: Larger tolerance in timer calibration routine. 2005-09-05 Matteo Frigo * Makefile.am, configure.ac, simd/Makefile.am, simd/nonportable/Makefile.am, simd/nonportable/sse.c, simd/nonportable/sse2.c, simd/simd-sse.h, simd/simd-sse2.h, simd/sse-aux.c, simd/sse.c, simd/sse2-aux.c, simd/sse2.c: Removed SSE and SSE2 asm because it was bitrotting. Use the Intel API instead, which seems to be supported by gcc >= 3.3. Moved files that require -msse, -msse2 to new directory. 
* m4/ax_gcc_archflag.m4: Parse cputypes of the form 7447A,altivecsupported * m4/ax_gcc_archflag.m4: Distinguish powerpc 7400 from the 7450, which has a different pipeline. * simd/simd-altivec.h: Paranoia: define RIGHT_CPU unconditionally. 2005-08-12 Matteo Frigo * tools/fftw-wisdom-to-conf.in: Removed obsolete name fftw-wisdom2c. * tools/fftw-wisdom-to-conf.in: Avoid creation of temporary files---use cpp magic instead. This fix solves a security bug and avoids nonportable tempfile creation hacks. 2005-08-05 Matteo Frigo * configure.ac, simd/altivec.c, simd/simd-altivec.h: Workaround for with gcc-3.3 altivec bug. 2005-06-16 Steven G. Johnson * m4/acx_pthread.m4: solaris fix: check -pthreads first since gcc does not like -pthread but chokes due to stubbed libc (grr) 2005-06-03 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: note that VC++ bug was fixed in 2005 2005-05-30 Steven G. Johnson * configure.ac, m4/ax_cc_maxopt.m4, m4/ax_cc_vendor.m4, m4/ax_compiler_vendor.m4: generalized ax_cc_vendor to ax_compiler_vendor * m4/ax_cc_maxopt.m4: updated message * m4/acx_pthread.m4, m4/ax_cc_maxopt.m4, m4/ax_cc_vendor.m4, m4/ax_check_compiler_flags.m4, m4/ax_gcc_aligns_stack.m4, m4/ax_gcc_archflag.m4, m4/ax_gcc_version.m4, m4/ax_gcc_x86_cpuid.m4, m4/ax_openmp.m4: update for new AC archive format 2005-05-24 Steven G. Johnson * api/fftw3.h: *** empty log message *** 2005-05-23 Steven G. 
Johnson * NEWS: *** empty log message *** * NEWS: more notes * m4/ax_cc_maxopt.m4: whoops * doc/FAQ/fftw-faq.bfnn: note icc 8.x annoyance * doc/FAQ/fftw-faq.bfnn: *** empty log message *** * doc/FAQ/fftw-faq.bfnn: note gcc 3.4.[0123] bug, which is fixed in gcc 3.4.4 * m4/ax_cc_maxopt.m4: added automatic detection of icc architecture flag * configure.ac: add -no-gcc to icc flags...even if it is Intel's fault, I'm sick of dealing with bug reports about this * doc/fftw3.texi: added @cindex portability * doc/fftw3.texi: note --without-gcc-arch * m4/ax_gcc_archflag.m4: bsd ppc detection; some odd 603 types 2005-05-22 Steven G. Johnson * m4/ax_gcc_archflag.m4: *** empty log message *** * m4/ax_gcc_archflag.m4: ensure no spaces in cputype * m4/ax_gcc_archflag.m4: nevermind * m4/ax_gcc_archflag.m4: more bsd stuff * m4/ax_gcc_archflag.m4: added BSD cpu detection for SPARC and better super/hypersparc detection * m4/ax_gcc_archflag.m4: comment 2005-05-20 Steven G. Johnson * doc/fftw3.texi: "alternate" == "alternative" is US-centric * doc/fftw3.texi: typo * doc/FAQ/fftw-faq.bfnn: clarification 2005-05-17 Steven G. Johnson * tests/bench.c: print out estimate-planner time from can_do in verbose>2 mode 2005-05-09 Steven G. Johnson * m4/ax_cc_vendor.m4: comment 2005-05-06 Steven G. Johnson * Makefile.am, api/api.h, api/fftw3.h, configure.ac, threads/Makefile.am: fixes for building Windows DLLs with Cygwin; thanks in part to Stephane Fillod 2005-04-22 Steven G. Johnson * m4/ax_cc_maxopt.m4: -ffast-math seems to produce code that is either about the same speed or slightly faster (gcc 3.3 and 4.0, x86) * m4/ax_gcc_archflag.m4: power5 fallback to power4 sched for older gcc's * m4/ax_gcc_archflag.m4: check for power5 2005-04-20 Matteo Frigo * api/fftw3.h: Removed clause #3 2005-04-20 Steven G. Johnson * api/fftw3.h: license clarification 2005-04-20 Matteo Frigo * api/fftw3.h: Changed license of fftw3.h to X11. 2005-04-11 Steven G. 
Johnson * genfft/gen_conv.ml: delete fixed-input code 2005-04-10 Matteo Frigo * api/apiplan.c, api/fftw3.h, api/mapflags.c, dft/bluestein.c, dft/buffered.c, dft/ct.c, dft/dftw-direct.c, dft/dftw-generic.c, dft/generic.c, dft/indirect-transpose.c, dft/indirect.c, dft/rader.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/generic.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, reodft/rodft00e-r2hc.c, tests/bench.c, tests/hook.c: joined L-U-planner branch 2005-04-08 Steven G. Johnson * reodft/reodft00e-splitradix.c: ref 2005-04-07 Steven G. Johnson * genfft/gen_r2r.ml: whoops * genfft/complex.ml, genfft/complex.mli, genfft/fft.ml, genfft/gen_athtw.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_r2r.ml, genfft/magic.ml, genfft/number.ml, genfft/number.mli: added (optional) new split-radix algorithm, enabled with -newsplit; also new -standalone option to omit desc; also -unitary, -normalization, and -normsqr options to generate r2r codelets with various normalization (to match lit. 
in DCT-II, use: -unitary -normsqr 2) 2005-04-02 Matteo Frigo * reodft/redft00e-r2hc-pad.c, reodft/redft00e-r2hc.c, reodft/reodft00e-splitradix.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc-pad.c, reodft/rodft00e-r2hc.c: reodft solvers are SLOW, not UGLY * api/fftw3.h, api/mapflags.c, dft/bluestein.c, dft/dftw-generic.c, dft/generic.c, dft/rader.c, kernel/ifftw.h, kernel/planner.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/generic.c, rdft/hc2hc-generic.c, rdft/rdft-dht.c, rdft/vrank3-transpose.c: Renamed SLOW_ALGORITHMS => SLOW; promoted vrank3-transpose from UGLY to SLOW. * api/mapflags.c, dft/bluestein.c, dft/dftw-generic.c, dft/generic.c, dft/indirect-transpose.c, dft/rader.c, kernel/ifftw.h, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/generic.c, rdft/hc2hc-generic.c, rdft/rdft-dht.c: Distinguish SLOW_ALGORITHM from UGLY; UGLY is never tried unless EXHAUSTIVE. 2005-04-01 Matteo Frigo * dft/indirect.c, kernel/planner.c, rdft/indirect.c: fixed NO_BUFFERING * api/apiplan.c, api/mapflags.c, dft/bluestein.c, dft/buffered.c, dft/ct.c, dft/indirect-transpose.c, dft/indirect.c, dft/rader.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, tests/bench.c, tests/hook.c: Eliminated problem_flags in favor of a combination of L and U. Renamed planner_flags -> flags. 
* api/apiplan.c, api/fftw3.h, api/mapflags.c, dft/bluestein.c, dft/ct.c, dft/dftw-direct.c, dft/dftw-generic.c, dft/indirect-transpose.c, dft/indirect.c, dft/rader.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-rader.c, rdft/hc2hc.c, rdft/indirect.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: Initial creation of cvs branch for new [L,U] planner 2005-03-25 Matteo Frigo * kernel/planner.c: Moved timeout check outside the search loop, because X(seconds) is expensive. 2005-03-20 Matteo Frigo * dft/ct.c: Enable vector recursion for in-place problems, otherwise dftw-genericbuf works only in PATIENT mode. * dft/dftw-genericbuf.c: oops * dft/dftw-genericbuf.c: make solver UGLY for small N * dft/dftw-genericbuf.c, dft/Makefile.am, dft/conf.c, dft/dft.h: new dftw-genericbuf solver 2005-03-18 Matteo Frigo * simd/sse2-aux.c: Hmm... what was I thinking? * simd/simd-sse2.h, simd/sse2-aux.c: Workaround for a MSVC bug. 2005-03-17 Matteo Frigo * simd/simd-sse.h, simd/sse-aux.c: Workaround for a MSVC bug that was reported by Eddie Yee. 2005-03-15 Matteo Frigo * rdft/rank0.c: try both contiguous input and contiguous output when in doubt * dft/codelets/standard/Makefile.am, genfft/genutil.ml, genfft/magic.ml, genfft/schedule.ml, genfft/schedule.mli, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am: Added genfft flag -precompute-twiddles which moves the computation of the twiddle factors before the main schedule. This flag produces smaller code everywhere, and slightly faster code on powerpc. I observe no speed difference on x86. 2005-03-15 Steven G. Johnson * kernel/kalloc.c: sp * kernel/alloc.c: whoops, spelling error (thanks to Steve Eddins for bug report) 2005-03-12 Matteo Frigo * dft/vrank-geq1.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: Do not approximate pcost = vl * child->pcost unless child is guaranteed not to be a simple codelet. 
2005-03-10 Matteo Frigo * dft/direct.c: Relaxed applicability conditions. 2005-03-09 Matteo Frigo * dft/dftw-generic.c: Minor optimization * libbench2/problem.c: Interpret K to mean *1024. Similarly for M. * kernel/primes.c: Hmm... somehow some previous commit got lost. * dft/ct.c: Paranoia 2005-03-07 Steven G. Johnson * configure.ac: whoops * configure.ac, m4/ax_cc_maxopt.m4: move fftw-specific HP/UX tweak into configure.ac * configure.ac, m4/ax_cc_family.m4, m4/ax_cc_maxopt.m4, m4/ax_cc_vendor.m4: ax_cc_family -> ax_cc_vendor (vendor names are easier to remember), add checks for many new compilers, use in ax_cc_maxopt 2005-03-07 Matteo Frigo * kernel/planner.c: Count FMA as one flop in estimator when HAVE_FMA * dft/dftw-generic.c: Do not try radix-2 generic. 2005-03-06 Matteo Frigo * m4/ax_cc_maxopt.m4: Use -O3 for xlc now that we use -O for CODELET_OPTIM * configure.ac, m4/ax_cc_family.m4: New AX_CC_FAMILY macro, that detects the compiler based on symbols that it defines (as opposed to the name of the compiler). We need to start using this strategy everywhere else. * dft/direct.c: Runtime checks to guarantee small strides. * dft/vrank-geq1.c, kernel/tensor7.c, rdft/rank0.c, rdft/vrank-geq1.c: Reduced the search space for rank-0 transforms 2005-03-04 Steven G. Johnson * kernel/primes.c: little assert 2005-03-01 Matteo Frigo * dft/dft.h, dft/dftw-direct.c, dft/direct.c, dft/kdft.c: Implemented directbuf, enabled for now. * dft/Makefile.am, dft/dftw-direct.c, dft/dftw-directbuf.c, dft/kdft-dif.c, dft/kdft-dit.c: Unified dftw-direct, dftw-directbuf in an attempt to tame code growth 2005-02-27 Steven G. Johnson * doc/fftw3.texi: fixed copyright 2005-02-27 Matteo Frigo * rdft/rank0.c: silence warnings * rdft/rank0.c: oops * rdft/rank0.c: Tweaking while thinking about a higher-rank transposer (bitreverser) * dft/dftw-directbuf.c: Transposed the buffer, and skewed it. 
This allows for contiguous copy operations, and the codelet should not incur associativity conflicts if the buffer is large. 2005-02-26 Steven G. Johnson * kernel/tensor4.c: make tensor_max_index more reasonable (take maximum of input and output max indices, computed separately) 2005-02-26 Matteo Frigo * rdft/vrank3-transpose.c: Use cpy2d instead of cpy2d_tiled, because vl may be too large. * genfft/annotate.ml: Fixed old bug that was introduced with yesterday's changes. * kernel/cpy1d.c: ``Interesting'' switch statement. 2005-02-25 Matteo Frigo * support/Makefile.codelets: Disabled -reorder-loads -reorder-stores, since they seem to do nothing. 2005-02-25 Steven G. Johnson * rdft/rank-geq2.c, dft/rank-geq2.c: Because of the recent changes to kernel/pickdim.c, splitrnk=0 is no longer equivalent to splitrnk=1 for rnk < 4, where the latter is the FFTW2 behavior. For small rnk, however, I observe the planner to pretty consistently choose the FFTW2 behavior (splitrnk=1), despite its not being asymptotically optimal in the cache oblivious sense. So, make splitrnk=1 instead of splitrnk=0 the default in FFTW_MEASURE and FFTW_ESTIMATE modes (rnk > 3 is pretty rare in practice anyway). * dft/indirect-transpose.c: tweak * dft/indirect-transpose.c: slight relaxation * dft/indirect-transpose.c: cruft * dft/Makefile.am, dft/conf.c, dft/dft.h, dft/indirect-transpose.c, dft/indirect.c, kernel/ifftw.h, kernel/tensor4.c: added experimental indirect-transpose solver: when transforming the columns of the matrix, allow us to do a transpose to make the DFTs contiguous * configure.ac: check for abort() * kernel/assert.c: call abort() on failed assertion 2005-02-25 Matteo Frigo * kernel/primes.c: Forgot to change X(isqrt) -> isqrt_maybe 2005-02-25 Steven G. 
Johnson * rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, dft/rank-geq2.c: require finite_rnk 2005-02-24 Matteo Frigo * support/Makefile.codelets, support/twovers.sh, genfft/annotate.ml, genfft/magic.ml: Implemented reordering of loads and stores so that the real and imaginary part are loaded/stored together. This should improve out-of-cache performance in the presence of associativity conflicts, and maybe worsen in-cache performance because of worse scheduling. Enabled for now, for experimental purposes. 2005-02-24 Steven G. Johnson * m4/ax_gcc_aligns_stack.m4: fix comment * m4/ax_gcc_aligns_stack.m4: better message * m4/ax_gcc_aligns_stack.m4: use gcc version > 3.0 as fallback in check for alignment bug * m4/ax_gcc_aligns_stack.m4: don't use -malign-double unconditionally (it is only available on x86) 2005-02-24 Matteo Frigo * kernel/transpose.c: Subtler selection of tilesz. * rdft/rank0.c: Call cpy2d_tiledbuf, not cpy2d_tiled. * kernel/cpy2d.c, kernel/transpose.c: buffer sizes were wrong :-( * kernel/cpy2d.c, kernel/ifftw.h, kernel/tile2d.c, kernel/transpose.c, rdft/rank0.c: Single function for computing tile size. Eliminate spurious assertions. * kernel/tile2d.c: Do tiling recursively. * kernel/Makefile.am, kernel/cpy2d.c, kernel/ifftw.h, kernel/tile2d.c, kernel/transpose.c, rdft/rank0.c, rdft/vrank3-transpose.c: Reworked tiled transposes; provide tiling with and without buffering. I can't believe that one has to waste his life with this @#$%. * kernel/pickdim.c: Clarified logic. I am not sure why the code was so confusing to begin with. The computation of *dp in the which_dim == 0 case was also wrong, returning e.g. *dp == -1 if sz->rnk == 1. * configure.ac: Enable aggressive inlining in codelets only, to avoid code bloat. * kernel/Makefile.am, kernel/cpy2d.c, kernel/ifftw.h, kernel/primes.c, kernel/transpose-rec.c, kernel/transpose.c, rdft/rank0.c, rdft/vrank3-transpose.c: Removed cache-oblivious copy/transpose algorithms in favor of explicitly blocked algorithms. 
The cache-oblivious algorithms fail if there are associativity conflicts, in which case buffering is necessary, as per Carter and Gatlin. Once you set the buffer size, there is no point whatsoever to do the algorithm recursively, and you may as well use blocking. 2005-02-23 Steven G. Johnson * configure.ac: --disable-fortran now differs from --enable-fortran that fails * api/f77api.c: comment tweak * api/f77api.c: If a Fortran compiler was not detected, just make our best guess at what wrappers to use...I'm sick of dealing with user complaints from cases where wrapper detection fails for whatever reason. * api/f77funcs.h: fflush(stdout) after print_plan, in case F77 doesn't 2005-02-23 Matteo Frigo * mkdist.sh: --enable-sse is necessary after all, to generate all dependencies correctly. * dft/dftw-directbuf.c, kernel/Makefile.am, kernel/cpy2d-pair.c, kernel/ifftw.h: Put cpy2d_pair into its own file, so that I can experiment with buffering of nontwiddle codelets. * doc/Makefile.am: Copy rfftwnd.png from ${srcdir}, not $PWD 2005-02-22 Matteo Frigo * rdft/rank0.c: Do not bother memcpy-ing complex numbers. * kernel/cpy2d.c, kernel/transpose-rec.c: Tighter layout of buffers. I am not sure it matters, but just in case... * rdft/rank0.c: Use cpy1d for rank-0 copies * kernel/Makefile.am, kernel/cpy1d.c, kernel/cpy2d.c, kernel/ifftw.h, kernel/transpose-rec.c, kernel/transpose.c, rdft/rank0.c, rdft/vrank3-transpose.c: Implemented in-place transposes with buffering. Moved copy/transposition routines into own files, so that we can reuse them from multiple places. TODO: merge vrank3-transpose.c with rank0.c, or rename vrank3-transpose.c to rank0-fancy.c or something like that; decide whether square in-place transposes should be in rank0.c or vrank3-transpose.c; apply FIXME's in vrank3-transpose.c. 
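Illustration for the 2005-02-24 entry above (explicitly blocked instead of cache-oblivious copy/transpose): the basic shape of a blocked, out-of-place transpose is a pair of outer loops over tiles and a pair of inner loops within each tile. This is a sketch only, not the code in kernel/tile2d.c or kernel/transpose.c; the name transpose_tiled and its parameters are hypothetical, the tile size is left to the caller, and the buffering that the entry says is needed to cope with associativity conflicts is deliberately omitted.

```c
#include <stddef.h> /* size_t */

/* Hypothetical sketch: transpose the n0 x n1 row-major matrix `in` into
   the n1 x n0 row-major matrix `out`, working on tilesz x tilesz blocks.
   Assumes tilesz > 0 and that `in` and `out` do not overlap. */
void transpose_tiled(const double *in, double *out,
                     size_t n0, size_t n1, size_t tilesz)
{
     size_t i0, i1, i, j;
     for (i0 = 0; i0 < n0; i0 += tilesz) {
          /* clip the tile at the matrix boundary */
          size_t imax = (i0 + tilesz < n0) ? i0 + tilesz : n0;
          for (i1 = 0; i1 < n1; i1 += tilesz) {
               size_t jmax = (i1 + tilesz < n1) ? i1 + tilesz : n1;
               /* copy one tile, swapping row and column indices */
               for (i = i0; i < imax; ++i)
                    for (j = i1; j < jmax; ++j)
                         out[j * n0 + i] = in[i * n1 + j];
          }
     }
}
```

With a tile small enough that a source block and a destination block both fit in cache, each cache line is reused across the inner loops; a fixed tile size is the design choice the entry favors once a buffer size has to be chosen anyway.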
* kernel/print.c: Indentation should be printed after newline, not at the beginning of print() 2005-02-21 Matteo Frigo * rdft/rank0.c: generalized in anticipation of more complicated solvers. * rdft/rank0.c: Implemented buffered recursive transpose 2005-02-20 Matteo Frigo * rdft/rank0.c: Fixed comment * rdft/Makefile.am, rdft/conf.c, rdft/rank0-vrank2.c, rdft/rank0.c, rdft/rdft.h: grand unification of rank0 solvers * rdft/vrank3-transpose.c: manual tail-recursion optimization 2005-02-19 Matteo Frigo * libbench2/verify-lib.c, libbench2/verify-r2r.c, tests/check.pl: implemented check for transpositions * libbench2/verify-lib.c: Previous fix was wrong for rdft2 problems. * rdft/dft-r2hc.c: vecsz->rnk must be finite for this solver to apply. * rdft/vrank3-transpose.c: unified the various ``simple'' transposers * libbench2/verify-lib.c, libbench2/verify-r2r.c, rdft/vrank3-transpose.c: Fixed stupid bug in rec_transpose_swap. Fixed stupid verifier that did not catch the bug. * rdft/vrank3-transpose.c: Minor cleanup of transposition routines. * dft/dftw-directbuf.c, rdft/hc2hc-directbuf.c: Make the batch size B=Theta(r) instead of B=Theta(1) in buffered twiddle solvers. Theory: for cache line size L, we want B = Omega(L) to utilize the cache line fully. We also want B*r = O(Z), where Z is the size of the cache. It is safe to assume that Z = Theta(L^2): cache designers will tend to make L as large as they can get away with, because they don't have to program the machines that they build, and Z < Theta(L^2) will screw up the little matrix transposition benchmarks that they use to design the cache. Hence, B=Theta(r) is the right number. 2005-02-19 Steven G. Johnson * m4/ax_gcc_archflag.m4: for --enable-portable-binary, only try -mcpu=$arch and -m$arch on x86, since these generate non-portable code on every other target (and some other targets, like Alpha, don't support -mtune=$arch). 
2005-02-18 Matteo Frigo * kernel/ifftw.h: gcc/aix defines _POWER, not __powerpc__ like the rest of the world does. 2005-02-17 Matteo Frigo * configure.ac: enable fma for ia64, since it seems to help with the hpux compiler. * TODO: *** empty log message *** 2005-02-16 Matteo Frigo * simd/simd-altivec.h: Fixes for darwin * api/apiplan.c: Made the correctness of the code more obvious. 2005-02-16 Steven G. Johnson * NEWS, m4/ax_cc_maxopt.m4: s/with-portable-binary/enable-portable-binary/ to be GNUlly correct; I'm sticking with --with-gcc-arch=arch, however, as --enable-gcc-arch=arch has the wrong connotations for me * api/apiplan.c: whoops * api/apiplan.c: bless wisdom with patience used to create it * api/apiplan.c: whoops * NEWS, TODO, api/apiplan.c, api/fftw3.h, doc/fftw3.texi, kernel/ifftw.h, kernel/planner.c, kernel/timer.c, tests/bench.c: added 'timed' planner option 2005-02-16 Matteo Frigo * dft/simd/Makefile.am, simd/Makefile.am: Do not use SIMD_CFLAGS. The theory is that if taint.c is unsafe with SIMD_CFLAGS, then all files in this directory are as well. Conversely, if these files require SIMD_CFLAGS because they include "simd.h", then taint.c requires SIMD_CFLAGS as well, and thus we need some other hack. * dft/codelets/standard/Makefile.am, dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am, rdft/codelets/r2r/Makefile.am, support/Makefile.codelets: Do not override CFLAGS in Makefile.am. 
2005-02-15 Matteo Frigo * configure.ac: Allow users to build long double version even if sizeof(long double) == sizeof(double) * commercialize.sh: Updated for 3.1 * api/version.c: Oops, version.h is no longer used 2005-02-14 Matteo Frigo * api/Makefile.am, api/version.c, configure.ac, dft/codelets/standard/Makefile.am, dft/simd/codelets/Makefile.am, m4/ocaml.m4, mkdist.sh, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am, rdft/codelets/r2r/Makefile.am, support/Makefile.am, support/Makefile.codelets, support/twovers.sh: unified fma and non-fma versions * configure.ac: forgot to remove inplace/Makefile from configure.ac * dft/codelet-dft.h, dft/codelets/Makefile.am, dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/conf.c, Makefile.am: Merged dft/codelets/inplace with the main dft/codelets/standard directory. This step makes dft codelets consistent with the rest of the naming conventions, and will simplify the eventual merge of fma and non-fma codelets. * simd/altivec.c, simd/simd-altivec.h: inline altivec constants, since gcc seems to generate better code this way. 2005-02-13 Matteo Frigo * simd/altivec.c, simd/simd-altivec.h: group altivec constants into a single array, for faster access * genfft/c.ml, genfft/c.mli, genfft/simd.ml: code cleanup * genfft/c.ml, genfft/c.mli: removed some unused stuff * simd/simd-3dnow.h, simd/simd-altivec.h: New twiddle scheme for altivec, 3dnow * simd/simd-sse2.h: Implemented new twiddle scheme for sse2 * dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, dft/simd/q1b.h, dft/simd/q1f.h, dft/simd/t1b.h, dft/simd/t1f.h, dft/simd/t2b.h, dft/simd/t2f.h, simd/simd-sse.h: Implemented experimental t2* codelets, which store twiddle factors in a more convenient format, at the expense of twice the storage. Currently only SSE works; I have to port SSE2, altivec, etc. to the new scheme. After this, we will decide whether these codelets are worth the price. 
2005-02-11 Matteo Frigo * simd/simd-altivec.h: Forgot to define SIMD_STRIDE_OKPAIR * simd/simd-3dnow.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: fixed sse2, 3dnow, and altivec, as promised * simd/simd-sse.h, dft/simd/n2b.c, dft/simd/n2f.c, genfft/annotate.ml, genfft/expr.ml, genfft/expr.mli, genfft/simd.ml: Generate n2?v_* codelets in such a way that we may or may not pair stores, depending on which mode happens to work best on a particular SIMD implementation. sse2, 3dnow, and altivec are currently broken---will fix soon. 2005-02-10 Matteo Frigo * simd/altivec.c, simd/simd-altivec.h: instantiate altivec constants only once * dft/simd/n2b.c, dft/simd/n2f.c: Fixed alignment checks for new SIMD scheme * dft/simd/codelets/Makefile.am, genfft/annotate.ml, genfft/annotate.mli, genfft/expr.ml, genfft/expr.mli, genfft/gen_notw_c.ml, genfft/genutil.ml, genfft/simd.ml, genfft/simdmagic.ml, simd/simd-3dnow.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: Change n2?v_* codelets to store pairs of vectors, with implicit 2x2 transposition. Works for 2-way SIMD as well. Tested with sse and sse2. I haven't tried altivec yet, but I observed a huge speedup when I transformed one codelet by hand. 2005-02-09 Matteo Frigo * dft/codelets/standard/Makefile.am: Resurrected old DIF codelets for experimental purposes. They are disabled for now, but I am keeping the setup around for future reference. 2005-02-09 Steven G. Johnson * doc/fftw3.texi: *** empty log message *** * doc/fftw3.texi: clarifications, document --with-portable-binary and --with-gcc-arch * NEWS: *** empty log message *** 2005-02-08 Steven G. Johnson * NEWS: more change comments * doc/FAQ/fftw-faq.bfnn: fma is definitely beneficial on Itanium with the HP/UX compiler 2005-02-08 Matteo Frigo * libbench2/bench-main.c: Silence warnings. 2005-02-08 Steven G. Johnson * libbench2/getopt.h: when we compile our own getopt, change symbol names to avoid conflicts (e.g. 
avoid build failure on MacOS X with --enable-shared) * reodft/reodft00e-splitradix.c: grr, more bugfixes for in-place case 2005-02-08 Matteo Frigo * dft/codelets/standard/Makefile.am: removed relics of FRANZ mode 2005-02-07 Matteo Frigo * simd/altivec.c: Somehow xlc does not like ``vector int dummy;'' * mkdist.sh: There is no need to enable sse to make the distribution. This might have been true in the past but not anymore. * api/Makefile.am: Oops---included fortran file in C sources * api/Makefile.am, api/version.c: Set version string at ``make dist'' time, not at ``configure'' time, so we know whether a user is using the fma version or not. 2005-02-06 Matteo Frigo * genfft/gen_hc2r_noinline.ml, genfft/gen_notw_noinline.ml, genfft/gen_notw_noinline_c.ml, genfft/gen_r2hc_noinline.ml: Removed useless files * configure.ac, dft/codelets/standard/Makefile.am, dft/simd/codelets/Makefile.am, genfft/Makefile.am, genfft/c.ml, genfft/c.mli, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/gen_twidsq_c.ml, genfft/genutil.ml, genfft/simd.ml, kernel/ifftw.h, kernel/stride.c, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am, support/Makefile.codelets: Different (simpler?) way to prevent the compiler from optimizing loop inductive variables. We now explicitly corrupt stride variables by xor-ing them with another variable that happens to be zero (but the compiler does not know it). In this way, the compiler does not attempt to extract a zillion loop indices from codelets, which would overflow the register set. Set the -fno-loop-optimize flag to further help the process. Consequences: removed m* codelets. Smaller library size. Slightly faster code with gcc/powerpc (including altivec). Much faster code with xlc/powerpc. No changes for gcc/pentium. Maybe slightly faster with icc/pentium. 2005-02-05 Steven G. 
Johnson * reodft/reodft00e-splitradix.c: paranoia about in-place rodft00 plans * kernel/planner.c: don't believe pcost when using the estimator...there is no point, and it screws up estimator hacks to prefer in-codelet loops to vecloops 2005-02-05 Matteo Frigo * m4/ax_cc_maxopt.m4: Reduced optimization level from -O3 to -O for xlc, since -O generates faster code. 2005-02-05 Steven G. Johnson * reodft/reodft00e-splitradix.c: whoops, only applicable to redft00/rodft00 plans * reodft/reodft00e-splitradix.c: fixed in-place operation, and don't create size-0 sub-plans 2005-02-04 Matteo Frigo * simd/altivec.c: Autodetect altivec on linux. This code works with gcc-3.4 and -maltivec, with or without -mabi=altivec. The code *should* work with gcc-3.3 without -mabi=altivec. However, disabling -mabi=altivec on gcc-3.4 produces much worse code (I don't know why). 2005-01-28 Steven G. Johnson * doc/fftw3.texi: update reference 2005-01-27 Steven G. Johnson * doc/fftw3.texi: note that DCT-II/III are often called ``the'' DCT/IDCT 2005-01-21 Steven G. Johnson * kernel/cycle.h: added MSVC++ for ia64 (based on information at http://www.intel.com/cd/ids/developer/asmo-na/eng/19949.htm?prn=Y) * kernel/cycle.h: vc++ defines _M_AMD64 on x86-64, apparently 2005-01-19 Steven G. Johnson * m4/acx_pthread.m4: avoid gratuitous breakage with -Werror, requested by Simon Perreault 2005-01-17 Steven G. Johnson * m4/ax_gcc_aligns_stack.m4: comment typo 2005-01-15 Steven G. 
Johnson * configure.ac: bumped shared-lib revision# * api/fftw3.h, api/flops.c, kernel/ifftw.h, kernel/planner.c, tests/bench.c: add X(estimate_cost) to get estimator cost, and print from bench, to aid in tweaking estimator * doc/fftw3.texi: *** empty log message *** * doc/fftw3.texi: formatting fix * doc/fftw3.texi, reodft/Makefile.am, reodft/conf.c: tweaks * reodft/reodft00e-splitradix.c: use less buffer space * doc/fftw3.texi, reodft/Makefile.am, reodft/conf.c, reodft/redft00e-r2hc.c, reodft/reodft.h, reodft/reodft00e-splitradix.c, reodft/rodft00e-r2hc.c: added split-radix-based dct/dst I for odd n * api/fftw3.h: *** empty log message *** * api/fftw3.h: warn silly users who confuse CVS id with FFTW version 2005-01-14 Steven G. Johnson * m4/ax_gcc_archflag.m4: get sparc cpu type on solaris as well as with linux * m4/ax_gcc_archflag.m4: detect prescott mobile (f37) 2005-01-13 Steven G. Johnson * bootstrap.sh, m4/ax_gcc_archflag.m4: use cpuid for x86_64 as well as i[56]86 * m4/ax_gcc_archflag.m4: update with x86info 1.7 and other sources (identify k8, nocona, etc), handle nonzero leading bytes in eax * m4/acx_pthread.m4: compactified check for JOINABLE; use AC_DEFINE_UNQUOTED instead of AC_DEFINE for PTHREAD_CREATE_JOINABLE (thanks to Oliver Niekrenz for the bug report) 2005-01-12 Matteo Frigo * genfft/annotate.ml: The scheduler hack was incorrect because it swapped instructions of the form A = *B and *B = C. Fixed. * m4/ax_cc_maxopt.m4, m4/ax_gcc_archflag.m4: Quote expressions such as ``if test $FOO = yes'' when $FOO may be empty. Also, $GCC is set to either ``yes'' or empty, never to ``no''. * TODO, configure.ac, simd/altivec.c: Hmm---somehow the previous commit did not work. 
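A note on the 2005-01-12 genfft/annotate.ml fix above: swapping ``A = *B'' with ``*B = C'' changes the value that ends up in A, because both instructions touch the same address. The following is only a toy sketch in Python, not genfft code (genfft is written in OCaml). The instruction tuples and the names run, footprint, and may_swap are hypothetical and exist only to illustrate the hazard and the kind of dependence check a scheduler needs before reordering.

```python
# Toy model (hypothetical, not genfft code).
# ("load",  dst_reg, addr) means  dst_reg = *addr
# ("store", addr, src_reg) means  *addr = src_reg


def run(program, memory, regs):
    """Execute the toy program on copies of memory and regs."""
    memory, regs = dict(memory), dict(regs)
    for op, x, y in program:
        if op == "load":
            regs[x] = memory[y]
        elif op == "store":
            memory[x] = regs[y]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs, memory


def footprint(instr):
    """Return (locations read, locations written) for one instruction."""
    op, x, y = instr
    if op == "load":
        return {("mem", y)}, {("reg", x)}
    if op == "store":
        return {("reg", y)}, {("mem", x)}
    raise ValueError(f"unknown op: {op}")


def may_swap(i1, i2):
    """Two adjacent instructions may be swapped only if neither one
    writes a location that the other reads or writes."""
    r1, w1 = footprint(i1)
    r2, w2 = footprint(i2)
    return not (w1 & r2 or w1 & w2 or r1 & w2)


# A = *B followed by *B = C, as in the log entry.
original = [("load", "A", "B"), ("store", "B", "C")]
swapped = list(reversed(original))

memory = {"B": 1}
regs = {"C": 2}

regs_original, _ = run(original, memory, regs)
regs_swapped, _ = run(swapped, memory, regs)
print(regs_original["A"], regs_swapped["A"])
print(may_swap(*original))
```

In the original order A receives the old contents of B. After the swap A receives the value just stored from C, so the check rejects the reordering. A load and a store to different addresses pass the same check.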
2005-01-11 Matteo Frigo * simd/simd-altivec.h: Fixed various gcc-related problems on powerpc: - gcc-3.4 becomes totally confused by expressions like vec_add(a, vec_add(b, vec_add(c, ...))) The compiler uses gigabytes of memory and then crashes, presumably because of the exponential-time search problem involved in typing the above expression (since vec_add can take either ints or floats). I changed VADD and similar macros to be inline functions, thus constraining the type system. - New flags --param inline-unit-growth=1000 --param large-function-growth=1000 to work around limitations of the gcc-3.4 inliner. * simd/simd-altivec.h: Check for HAVE_ALTIVEC_H * configure.ac, simd/altivec.c, simd/simd-altivec.h: Remove support for altivec using gcc builtins, since these keep changing across gcc versions. These changes work on gcc-3.4/linux; I haven't tried MacOS X yet. (The altivec ``spec'' differs between Motorola/Apple and gcc, grrr...) 2005-01-10 Matteo Frigo * rdft/rank0-vrank2.c: Stylistic changes * rdft/dft-r2hc.c: Changed incorrect ugliness condition. 2005-01-10 Steven G. Johnson * m4/ax_gcc_archflag.m4: note x86info version number that was used, to make it easier to update the cpuid for changes in later versions 2005-01-10 Matteo Frigo * rdft/dft-r2hc.c: Make dft-r2hc non-UGLY for rank-0 problems * m4/ax_gcc_archflag.m4: Do not use -mcpu=970 on power4 processors, because power4 does not have altivec. * TODO: Note gcc-3.4 problem with inlining. * genfft/gen_hc2r_noinline.ml, genfft/gen_notw_noinline_c.ml: Oops, forgot to remove ``static'' from the declaration of noninlinable functions. * m4/ax_gcc_archflag.m4: Recognize power4. Use ``head -n COUNT'' instead of obsolete ``head -COUNT'' (which fails on gentoo). * TODO: Remind to add FAQ entry concerning gcc-3.4.[1-3] crashes. 2005-01-10 Steven G. 
Johnson * m4/ax_gcc_version.m4: whoops * m4/ax_gcc_version.m4: support checking for major.minor.patchlevel 2005-01-10 Matteo Frigo * configure.ac: Revert CODELET_OPTIM to -O on IA32, which is faster than -O2. * configure.ac: /bin/sh allows no spaces in assignments. * genfft/gen_hc2r_noinline.ml, genfft/gen_notw_noinline.ml, genfft/gen_notw_noinline_c.ml: Make non-inlinable functions external, so that gcc becomes confused and does not try to inline them. 2005-01-09 Matteo Frigo * configure.ac: Add -fno-web to CFLAGS, because -fweb destroys FMAs. * m4/ax_gcc_archflag.m4: Allow -mcpu=970 besides -mcpu=G5 * configure.ac: configure was not using -fno-schedule-insns :-( * kernel/planner.c: In mkplan() and elsewhere, use solver index instead of solver *pointer*, which looks marginally clearer. * TODO, kernel/ifftw.h, kernel/planner.c: Split planner hash table into two tables, for blessed and unblessed solutions respectively. Now an unblessed solution never overwrites a blessed solution, thus avoiding wisdom leakage by construction. Further, forget() is now an O(1) operation, which speeds up the estimator when the wisdom table is large. * TODO: New TODO idea. 2005-01-06 Matteo Frigo * kernel/planner.c: Split search() into two routines to make the UGLY/NO_UGLY logic obvious. 2004-12-17 Steven G. Johnson * simd/3dnow.c, simd/sse.c, simd/sse2.c: push/pop 64-bit registers on ia64; thanks to Orion Poplawski for the fix 2004-12-10 Steven G. Johnson * kernel/kalloc.c: patch from FreeBSD ports - FreeBSD does not have memalign, but its malloc is 16-byte aligned 2004-11-23 Steven G. Johnson * simd/Makefile.am: don't compile taint.c with SIMD_CFLAGS (fixed Debian bug #259612) 2004-11-18 Steven G. 
Johnson * support/Makefile.codelets: revert incorrect change -- codlist.c should be rebuilt, but it is built in the build directory and not in the source directory * support/Makefile.codelets: $(CODLIST) should be rebuilt only if Makefile.am changes, or alternatively only in maintainer mode, to prevent stomping in the source directory during user builds. (Thanks to Grant Cook for the bug report.) 2004-11-13 Steven G. Johnson * kernel/cycle.h: corrected #ifdef for icc/ia64, thanks to Matt Boman * NEWS: spelling correction (Larsen, not Larson) 2004-11-09 Steven G. Johnson * m4/ax_gcc_archflag.m4: use standard withval * m4/ax_gcc_x86_cpuid.m4: match doc * m4/ax_openmp.m4: formatting * m4/ax_openmp.m4: make sure OPENMP_CFLAGS environment variable is used correctly * configure.ac, m4/ax_cc_maxopt.m4, m4/ax_check_cc_flags.m4, m4/ax_check_compiler_flags.m4, m4/ax_gcc_aligns_stack.m4, m4/ax_gcc_archflag.m4: replace ax_check_cc_flags with more generic ax_check_compiler_flags 2004-11-08 Steven G. Johnson * configure.ac, m4/ax_cc_maxopt.m4, m4/ax_openmp.m4: separate macro for OpenMP test 2004-11-05 Steven G. Johnson * doc/fftw3.texi: typo 2004-10-29 Steven G. Johnson * configure.ac: *** empty log message *** 2004-10-28 Steven G. Johnson * m4/ax_gcc_archflag.m4: better guessing of sparc type on Linux 2004-10-27 Steven G. 
Johnson * m4/ax_gcc_archflag.m4: note default * m4/ax_gcc_archflag.m4: tweak * m4/ax_gcc_x86_cpuid.m4: comment * Makefile.am: whoops, m4 is EXTRA_DIST, not SUBDIR, since it doesn't have a Makefile * m4/ocaml.m4: silence warnings * Makefile.am, acinclude.m4, acx_pthread.m4, bootstrap.sh, configure.ac, m4/acx_pthread.m4, m4/amx_prog_as.m4, m4/ax_cc_maxopt.m4, m4/ax_check_cc_flags.m4, m4/ax_gcc_aligns_stack.m4, m4/ax_gcc_archflag.m4, m4/ax_gcc_version.m4, m4/ax_gcc_x86_cpuid.m4, m4/ocaml.m4: clean up m4 macros; try to detect correct gcc -march flag on x86; new --with-portable-binary, --with-gcc-arch= flags; use -O2 for codelets with gcc 3.4 to work around bug 2004-10-26 Steven G. Johnson * libbench2/mp.c: rename cexp -> mcexp to avoid conflict with C99 builtin 2004-10-25 Steven G. Johnson * acinclude.m4: use basename , w/o args, for compiler-name comparisons; also detect Compaq ccc on alpha-linux * doc/FAQ/fftw-faq.bfnn: note recent icc problems 2004-10-24 Steven G. Johnson * threads/threads.c: whoops, disable semaphores again (for now) * threads/threads.c: POSIX semaphores are *not* the same as SYSV semaphores * dft/conf.c, dft/ct.c, dft/ct.h, dft/ctsq.c, dft/dft.h, dft/dftw-direct.c, dft/dftw-directbuf.c, dft/dftw-generic.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, kernel/ifftw.h, kernel/twiddle.c, rdft/Makefile.am, rdft/conf.c, rdft/ct.c, rdft/ct.h, rdft/hc2hc-common.c, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/hc2hc-generic.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/khc2hc.c, rdft/rdft.h, threads/Makefile.am, threads/ct-dit.c, threads/ct.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/hc2hc.c, threads/threads.c, threads/threads.h: re-implement threaded stuff; dftw now takes parameters to indicate a portion of m loop 2004-10-22 Steven G. Johnson * doc/fftw3.texi: more C++ notes 2004-10-14 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: note bug report for VC++ 6.0 from Dale Dickerhoof 2004-10-01 Steven G. 
Johnson * api/fftw3.h: fmt * rdft/vrank3-transpose.c: comment typo * rdft/dft-r2hc.c: bug fix -- ishift/oshift only apply to execution of child plan 2004-10-01 Matteo Frigo * api/fftw3.h, api/mapflags.c, kernel/ifftw.h, kernel/planner.c: New planner that tries never to lose wisdom. 2004-09-30 Matteo Frigo * api/fftw3.h: Nested comment was triggering a warning. 2004-09-10 Steven G. Johnson * api/import-system-wisdom.c: system "root" under djgpp is /dev/env/DJDIR, not /dev/env/DJGPP, according to djgpp's libc.info; patch confirmed with J. M. Guerrero 2004-09-08 Steven G. Johnson * api/import-system-wisdom.c, tests/Makefile.am, tools/fftw-wisdom-to-conf.in: some minor portability fixes for djgpp; thanks to Juan Manuel Guerrero for the patch 2004-08-19 Steven G. Johnson * README: pointer to tutorial for quick start * api/fftw3.h: point users to manual 2004-08-07 Steven G. Johnson * doc/fftw3.texi: minor typo 2004-07-18 Steven G. Johnson * kernel/cycle.h: use __DECCXX for Compaq cxx, not Linux-specific symbol 2004-07-16 Steven G. Johnson * kernel/cycle.h: patch by John Bowman to make cycle counter work with DEC cxx under Linux 2004-06-30 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn, doc/FAQ/html.refs: updated pruned FFT discussion, with link to further details on www.fftw.org/pruned.html 2004-06-15 Steven G. Johnson * acx_pthread.m4: darwin is based on freebsd 2004-06-03 Steven G. Johnson * api/f77api.c: in --with-windows-f77-mangling, add lowercase + single underscore for Intel compilers, etc. (thanks to David Gomez for the bug report) 2004-04-07 Steven G. Johnson * rdft/rank0-vrank2.c: whoops, extra alignment check * kernel/ifftw.h, rdft/rank0-vrank2.c, rdft/vrank3-transpose.c: disable most 2-float-as-double copying, add alignment check in one remaining place 2004-04-06 Steven G. Johnson * doc/fftw3.texi: make sure it is clear that real-even/odd refers to symmetry, not size * rdft/vrank3-transpose.c: optimization 2004-04-03 Steven G. 
Johnson * rdft/vrank3-transpose.c: separate cutoff for ugliness...these cutoffs are still not ideal * kernel/ifftw.h: transpose.c is gone * configure.ac, dft/Makefile.am, dft/conf.c, dft/rank0.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/Makefile.am, kernel/transpose.c, rdft/Makefile.am, rdft/conf.c, rdft/dft-r2hc.c, rdft/rank0-vrank2.c, rdft/rdft.h, rdft/vrank3-transpose.c: move all rank0 transforms to rdft * libbench2/mflops.c, libbench2/report.c: enable fp-moves/us comparison of rank-0 transforms 2004-04-01 Steven G. Johnson * kernel/transpose.c, kernel/tensor7.c: whoops 2004-03-31 Steven G. Johnson * kernel/tensor7.c: sort tensor dims by stride absolute values, not strides * kernel/transpose.c: *** empty log message *** * dft/dftw-generic.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/transpose.c, libbench2/problem.c: added improved transpose algorithm for N x M where |N-M| is small * configure.ac: check to make sure SIMD matches precision, and make sure user doesn't select both SSE and SSE2 2004-03-28 Matteo Frigo * rdft/hc2hc-generic.c: Implemented hc2hc-generic hc2r. 2004-03-25 Matteo Frigo * rdft/hc2hc-generic.c: Inverted loop for stride-1 access. * dft/dftw-generic.c: Swapped j <-> k for consistency 2004-03-23 Matteo Frigo * rdft/hc2hc-generic.c: Require that R be odd * rdft/Makefile.am, rdft/conf.c, rdft/dft-r2hc.c, rdft/hc2hc-generic.c, rdft/rdft.h: Implemented hc2hc-generic (DIT only for now). 2004-03-22 Matteo Frigo * kernel/twiddle.c: Relax equality of twiddle description, since the `i' field is not used by TW_FULL or TW_HALF. * dft/dftw-generic.c, kernel/twiddle.c: Do not allocate tw_instr's on the stack. Thus, the ``consistency check'' in twiddle.c becomes wrong. * libbench2/mp.c: Fixed incorrect malloc()/free() logic. 
* rdft/hc2hc-directbuf.c: Silence warnings * rdft/Makefile.am, rdft/ct.c, rdft/hc2hc-common.c: Separate file for hc2hc common routines * dft/dftw-directbuf.c, rdft/Makefile.am, rdft/ct.h, rdft/hc2hc-direct.c, rdft/hc2hc-directbuf.c, rdft/khc2hc.c: (re)Implemented buffered hc2hc. Slight simplification of twiddle-factors management. * configure.ac: Incremented libtool revision number before we forget. * rdft/hc2hc-direct.c: Fixed opcnt 2004-03-21 Matteo Frigo * dft/Makefile.am, dft/ct-directw.c, dft/ct-directwbuf.c, dft/ct-generic.c, dft/dftw-direct.c, dft/dftw-directbuf.c, dft/dftw-generic.c: Renamed files. These solvers are not really cooley-tukey. * dft/ct.h, genfft/gen_hc2hc.ml, rdft/Makefile.am, rdft/codelet-rdft.h, rdft/ct.c, rdft/ct.h, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-direct.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/khc2hc-dif.c, rdft/khc2hc-dit.c, rdft/khc2hc.c, rdft/rdft.h: Started moving rdft/ to the new cooley-tukey ontology * dft/ct-directw.c, dft/ct-directwbuf.c, dft/ct-generic.c: Plans in ct-*.c are subtypes of plan_dftw, not plan_dft * dft/ct-directw.c: Slight simplification * dft/ct.c: Minor simplification 2004-03-20 Matteo Frigo * simd/simd-sse.h, simd/simd-sse2.h: Workarounds for icc-8.0 nonsense. 2004-03-07 Matteo Frigo * doc/fftw3.texi: FFTW_FORWARD is not technically an ``option''. 2004-02-24 Steven G. Johnson * acx_pthread.m4: Alejandro requested that his name be removed from @author 2004-02-23 Steven G. Johnson * acx_pthread.m4: GNU Pth emulation library check 2004-02-21 Steven G. Johnson * tools/fftw-wisdom.c: calling can-do calls the estimating-planner, which creates wisdom that we don't want ...we should be able to do all of the documented problems, anyway * tests/bench.c: don't forget_wisdom because of side effects * tests/bench.c: forget wisdom from can_do 2004-02-19 Steven G. 
Johnson * api/malloc.c: parenthesization 2004-02-13 Matteo Frigo * api/Makefile.am, api/malloc.c, kernel/Makefile.am, kernel/alloc.c, kernel/ifftw.h, kernel/kalloc.c, tests/bench.c: Split malloc into kernel_malloc and API malloc 2004-02-12 Steven G. Johnson * kernel/alloc.c: X(malloc) must be extern "C" * dft/bluestein.c: satisfy C++ compiler 2004-02-06 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: with the new flags, fma is definitely beneficial on PA-RISC with HP/UX cc * acinclude.m4: grr, Ofaster etcetera are not supported under older versions of the compiler. Note that +Ofltacc *disables* fp-reordering optimizations (which are enabled by +Oall). +Optrs_ansi is the older version of the aliasing stuff * acinclude.m4: +Otype_safety=ansi on hpux * acinclude.m4: just use +Ofaster on hpux (+O3 +Onolimit +Olibcalls +Ofltacc=relaxed -Wl,+mergeseg) 2004-01-30 Steven G. Johnson * configure.ac: check for win32 threads for mingw32; thanks to Alessio Massaro 2004-01-29 Steven G. Johnson * threads/threads.c: added missing 'static', thanks to Alessio Massaro 2004-01-09 Steven G. Johnson * rdft/dht-rader.c: print more like bluestein * rdft/dht-rader.c: fixed op count for R2HC_ONLY_CONV * dft/buffered.c, rdft/buffered.c, rdft/buffered2.c: include DESTROY_INPUT in buffered flags for in-place...otherwise in-place hc2r uses rdft-dht * rdft/dht-rader.c: resurrected R2HC_ONLY_CONV option to share plans and save on planning time * rdft/dht-rader.c: precompute folding for cyclic convolution 2004-01-07 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: minor * doc/FAQ/fftw-faq.bfnn: note reports of successful compilation on Windows * reodft/reodft010e-r2hc.c: citation year 2004-01-06 Steven G. 
Johnson * rdft/dht-rader.c: comment * rdft/dht-rader.c: comment fix * rdft/dht-rader.c: fixed naming cruft * rdft/dht-rader.c: space * rdft/dht-rader.c: comment * rdft/dht-rader.c: moved assert * rdft/dht-rader.c: comment * rdft/dht-rader.c: delete old R2HC_ONLY_CONV hack, now defunct * rdft/dht-rader.c: added padded real rader * rdft/generic.c: removed unused var * rdft/generic.c: handle both FFT_SIGN values 2004-01-02 Matteo Frigo * rdft/codelets/r2hc.c: Oops: d->ros ==> d->ios * rdft/codelets/hc2r.c: Oops: d->ris should have been d->iis 2004-01-01 Matteo Frigo * dft/Makefile.am, dft/dft.h, dft/rader-omega.c, dft/rader.c, rdft/Makefile.am, rdft/conf.c, rdft/rader-hc2hc.c: Removed rdft rader cooley-tukey, to be superseded by a generic reduction of rdft twiddle problems to dft + pre/post processing * dft/ct.c, dft/generic.c, kernel/ifftw.h, kernel/primes.c, kernel/twiddle.c, rdft/generic.c: In anticipation of the upcoming revision of rdft, removed rdft generic dit/dif cooley-tukey, in favor of generic r2hc and hc2r solvers. Cleaned up stuff that became unused after this change, such as TW_GENERIC. * kernel/Makefile.am, kernel/ifftw.h, kernel/square.c: Removed useless file 2003-12-26 Steven G. Johnson * configure.ac: whoops, don't call AC_F77_DUMMY_MAIN if no Fortran compiler is found; thanks to Charles Radley for the bug report. 2003-12-19 Steven G. Johnson * acinclude.m4: guess good flags for Solaris/intel, suggested by J. Gregory Wright 2003-12-06 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn, doc/FAQ/html.refs: blah 2003-11-30 Matteo Frigo * rdft/generic.c: DIF generic solver was destroying the input. * NEWS, rdft/rader-hc2hc.c: Fixed bug that caused HC2R transforms to destroy the input in certain cases, even if the user specified FFTW_PRESERVE_INPUT. 2003-11-29 Matteo Frigo * libbench2/verify-r2r.c: Implemented swap_io hack for r2r verifier. 2003-11-21 Steven G. 
Johnson * reodft/reodft010e-r2hc.c: citation 2003-11-15 Matteo Frigo * kernel/ifftw.h, kernel/planner.c, tests/hook.c: Trying to get ``make paranoid-check'' to work. (Still broken.) 2003-11-15 Steven G. Johnson * libbench2/bench-user.h, libbench2/tensor.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.h: fixes for input-preservation tests 2003-11-15 Matteo Frigo * tests/bench.c: Assume FFTW_PRESERVE_INPUT unless either the `d' flag is given in the problem, or the problem is multidimensional c2r (which fftw3 cannot do without destroying the input). With this change, we can at least test that FFTW_PRESERVE_INPUT works in the c2r 1d case. 2003-11-15 Steven G. Johnson * libbench2/verify-dft.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c: apply should copy back input for input-preservation check 2003-11-15 Matteo Frigo * tests/bench.c, tests/check.pl, rdft/rank-geq2-rdft2.c: Undone previous bogus changes 2003-11-14 Matteo Frigo * tests/check.pl: Check dr[fb] in addition to r[fb] * rdft/rank-geq2-rdft2.c, tests/bench.c: Fixed conditions under which the rank-geq2-rdft2 solver is applicable. The old solver was not applicable for out-of-place problems unless DESTROY_INPUT. This is bogus. As long as the subsolvers honor !DESTROY_INPUT, the solver is always applicable. Changed semantics of test program, so that PRESERVE_INPUT is always true unless the problem specifies destroy_input explicitly. Without this change, there is no way to test the new solver. 2003-10-30 Steven G. Johnson * configure.ac: added AIX OpenMP (-qsmp=omp) support; thanks to Greg Bauer 2003-10-30 Matteo Frigo * acinclude.m4: G5 CFLAGS 2003-10-24 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: western FAQ 2003-10-23 Matteo Frigo * simd/altivec.c: Oops. * configure.ac, simd/altivec.c, simd/simd-altivec.h: Autodetect altivec 2003-10-22 Steven G. 
Johnson * tests/check.pl: MinGW gets confused by a single / 2003-10-17 Matteo Frigo * libbench2/mp.c: Paranoid portability fix 2003-10-16 Matteo Frigo * doc/fftw3.texi: size -> length, which should make clear that we are not talking about arbitrary precision. 2003-10-15 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: pruned transforms are a FAQ 2003-10-09 Steven G. Johnson * TODO: NO_SEARCH has already been mapped to FFTW_WISDOM_ONLY * TODO: newline 2003-09-28 Steven G. Johnson * doc/fftw3.texi: fix * doc/fftw3.texi: clarification 2003-09-27 Steven G. Johnson * doc/fftw3.texi: minor fix * doc/fftw3.texi: grammar * doc/fftw3.texi: html output fix * doc/fftw3.texi: mentioned sqrt(2) factors for DCT/DST * api/fftw3.h, api/mapflags.c: FFTW_WISDOM_ONLY flag (undocumented for now), suggested by Phil Dumont 2003-09-24 Steven G. Johnson * kernel/cycle.h: removed UpTime code * kernel/cycle.h: updated documentation for mach_absolute_time * configure.ac, kernel/cycle.h: use mach_absolute_time on MacOS/Darwin, as a fallback; don't bother checking for UpTime since it requires extra libs * configure.ac, kernel/cycle.h: support Apple UpTime function for asm-less xlc, grrr... 2003-09-23 Steven G. Johnson * api/api.h, api/fftw3.h: additional paranoia for xlc etc. 2003-09-22 Steven G. Johnson * api/api.h, api/fftw3.h: work around _Complex_I weirdness in xlc, reported by Greg Allen 2003-09-05 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: typo 2003-09-05 Matteo Frigo * commercialize.sh: New script that produces commercial version. * doc/FAQ/fftw-faq.bfnn: Noted that VC++ is buggy. Noted that we know nothing about Windows. Noted that the sky is blue as well. 2003-09-02 Matteo Frigo * doc/fftw3.texi: Noted that certain arrays are no longer used after the planner has completed. 2003-08-26 Matteo Frigo * doc/fftw3.texi: Typo * TODO: New item 2003-08-21 Steven G. 
Johnson * tools/fftw-wisdom.c: try creating output file before planning (thanks to Phil Dumont for the suggestion) 2003-08-19 Matteo Frigo * doc/fftw3.texi: Clarified fftw_cleanup() 2003-08-16 Steven G. Johnson * doc/fftw3.texi: typo 2003-07-28 Steven G. Johnson * tools/fftw-wisdom.c: use time() instead of clock() (FIXME: what to do for non-POSIX systems?) ...thanks to JP Sugarbroad and James A. Treacy for the bug report 2003-07-24 Matteo Frigo * kernel/cycle.h: Need __volatile__ in sparc cycle counter. This is why the debian port hangs. 2003-07-20 Steven G. Johnson * NEWS: merged 3.0.1 notes 2003-07-14 Steven G. Johnson * libbench2/bench-main.c: whoops 2003-07-10 Matteo Frigo * simd/simd-sse.h, simd/simd-sse2.h: Dealing with constants in a way that seems to confuse gcc less. 2003-07-09 Matteo Frigo * support/Makefile.codelets, genfft/annotate.ml, genfft/magic.ml: Enabled scheduler hack for FMA, where it seems to help. * genfft/annotate.ml: Hmm---the new scheduler seems to make things worse for gcc/x86, better for gcc/ppc, and about the same for icc/x86. Disabled for now. * genfft/annotate.ml: New scheduling pass that keeps ``x = a + b'' and ``y = a - b'' close together. This property was no longer automatic for the dags generated in SIMD mode. I cannot measure any speed difference due to this change. However, the change is justified by a minimal-screwup argument. Moreover, the sse2 fftw library is now 1% smaller than it was before. * genfft/c.ml: -(FNMS()) => FMS() 2003-07-06 Steven G. Johnson * doc/FAQ/Makefile.am: added more convenient target name 2003-07-05 Steven G. 
Johnson * kernel/ifftw.h: typo * kernel/ifftw.h: removed typo 2003-07-05 Matteo Frigo * dft/ct-generic.c: Consistent naming * dft/Makefile.am, dft/conf.c, dft/ct-directw.c, dft/ct-directwbuf.c, dft/ct-generic.c, dft/ct.c, dft/ct.h, dft/ctsq.c, dft/dft.h, dft/dftw-dft.c, dft/direct.c, dft/directw.c, dft/directwbuf.c, dft/generic.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, dft/plan.c, dft/problemw.c, dft/rader.c, dft/solve.c: Got rid of problemw. 2003-07-04 Matteo Frigo * kernel/cycle.h, kernel/timer.c: Increase TIME_MIN on intel only * genfft/schedule.ml: A little hack to get more consistent scheduling. 2003-07-04 Steven G. Johnson * acinclude.m4, genfft/oracle.ml, tools/fftw-wisdom-to-conf.in: merged changes from mainline * configure.ac: bumped version * NEWS: updated for 3.0.1 2003-07-03 Matteo Frigo * genfft/magic.ml, genfft/schedule.ml: New experimental scheduler (currently disabled). The old scheduler is ``optimal'' in the sense that it minimizes register pressure. The only way to reduce register pressure is to schedule dependent instructions as closely as possible, so as to minimize the lifetime of registers. This strategy maximizes the number of pipeline stalls, however. With enough registers and short enough pipelines, this tradeoff is fine. This is no longer the case for the devilish pipeline of the Pentium IV or (probably) the PowerPC 970. The new scheduler switches to a ``list scheduler'' for dags smaller than a specified size. The list scheduler executes a butterfly left to right one column at a time. This amounts to the best possible pipeline utilization, and the worst possible register pressure. The ``specified size'' defaults to 0, i.e., no change from fftw2 and fftw-3.0. It seems like a value of 7--10 produces the best results for Pentium IV (probably screwing the G3/G4 powerpcs and sparc, but I haven't tried.) As time goes by, we may want to increase this number to favor newer processors over older processors. 2003-06-25 Steven G. 
Johnson * tools/fftw-wisdom-to-conf.in: remove non-portable use of tempfile; thanks to Nicolas Decoster for the patch * acinclude.m4: increase stupid HP preprocessor limits 2003-06-19 Matteo Frigo * genfft/Makefile.am: Distribute gen_mdct.ml 2003-06-16 Steven G. Johnson * dft/simd/codelets/Makefile.am: remove -trivial-stores, as in mainline 2003-06-11 Matteo Frigo * rdft/buffered2.c, rdft/rdft2-radix2.c: Cleared int/ptrdiff_t confusions * dft/dftw-dft.c, dft/directwbuf.c, dft/rank0.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/planner.c: Cleared int/ptrdiff_t confusion 2003-06-08 Matteo Frigo * kernel/timer.c: Increased TIME_MIN. This seems to produce more reliable plans on Pentium IV. * dft/simd/codelets/Makefile.am: Removed relic -trivial-stores, which dates back to Franz's early experiments. Speed improved on SSE2, both with gcc and icc. 2003-06-06 Steven G. Johnson * doc/fftw3.texi: fix direntry 2003-06-05 Steven G. Johnson * genfft/gen_mdct.ml: added imdct 2003-06-04 Matteo Frigo * genfft/c.ml: Collect pattern (a * b) +- (c * d) in generic-arith, because this operation can usually be computed with one rounding in fixed-point (and it possibly exposes a FMA instruction) * genfft/c.ml, genfft/magic.ml: Generic-arithmetic unparser 2003-06-01 Matteo Frigo * genfft-k7/oracle.ml, genfft/oracle.ml: Oops---randomized CSE was using the same random numbers over and over * genfft/c.ml: Paranoia. * genfft/oracle.ml: Use relative error instead of absolute error, to avoid problems when normalization factors are used. 2003-06-01 Steven G. Johnson * reodft/reodft11e-radix2.c: slight opt * reodft/reodft11e-radix2.c: slight optimization * genfft/gen_mdct.ml: *W is const * genfft/gen_mdct.ml: comment 2003-05-30 Steven G. Johnson * genfft/Makefile.am, genfft/gen_mdct.ml: added experimental MDCT 2003-05-29 Steven G. 
Johnson * mkdist.sh: merge from main branch * mkdist.sh: altivec (fma) needs simd codlist.c too * mkdist.sh: make sure we include SIMD codlist.c for non-Unix folks * mkdist.sh: make sure we include simd codlist.c 2003-05-28 Steven G. Johnson * doc/fftw3.texi: noted howmany_rank == 0 is a single transform * doc/fftw3.texi: further stride clarification 2003-05-26 Matteo Frigo * dft/Makefile.am, dft/conf.c, dft/ct.c, dft/ctsq.c, dft/dft.h, dft/dftw-dft.c, dft/directw.c, dft/directwbuf.c, dft/directwsq.c, dft/kdft-difsq.c, dft/problemw.c: Removed transposed dftw problems. I now consider transposed dftw a Bad Idea, since it does not apply to the case that it was originally meant for (speed up four-step) and it complicates the implementation of the other thing I want to try (dftw m-slices). * dft/buffered.c: Obsolete comment 2003-05-24 Matteo Frigo * dft/ct.c: comment * dft/dftw-dft.c: Oops---wrong test NO_UGLYP instead of !NO_UGLYP * dft/ct.c: Implemented radix r, where n=r^2 * p 2003-05-21 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: xlc seems to properly use fma as well * configure.ac, doc/fftw3.texi: print warning if there is no cycle counter 2003-05-20 Steven G. Johnson * libbench2/verify-lib.c, libbench2/verify-r2r.c: updated Funda reference 2003-05-20 Matteo Frigo * dft/ct.c: const 2003-05-19 Matteo Frigo * dft/dftw-dft.c, dft/directwsq.c: Implemented generic dif square transposed (q-style) solver. * dft/dftw-dft.c: applicable() is now a property of the solver (in anticipation of transposed solvers) * dft/dftw-dft.c: Slight cleanup 2003-05-18 Matteo Frigo * dft/bluestein.c, kernel/ifftw.h, kernel/primes.c: Nothing, really * dft/dftw-dft.c: Moved vector loop inside bytwiddle(), in anticipation of a q-style dftw-dit transposed solver. * dft/dftw-dft.c: Fixed flops count * dft/dftw-dft.c: style * dft/dftw-dft.c: Faster inner loop. 
2003-05-17 Matteo Frigo * dft/dftw-dft.c: Print vector length * dft/dftw-dft.c: Oops * dft/dftw-dft.c: Allow vl > 1 * dft/ctsq.c: Radix can be derived from problem---no need to pre-specify it. 2003-05-17 Steven G. Johnson * kernel/transpose.c: fixed comment * kernel/transpose.c: whoops, gcd is static * kernel/transpose.c: whoops, gcd should be static * kernel/transpose.c: more unrolling 2003-05-17 Matteo Frigo * dft/bluestein.c: Hack to avoid infinite recursion. 2003-05-16 Steven G. Johnson * dft/codelet-dft.h: consistency 2003-05-16 Matteo Frigo * dft/bluestein.c: Wrong comment. * dft/bluestein.c: Style. 2003-05-16 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: punctuation * doc/FAQ/fftw-faq.bfnn: added allzero FAQ * dft/bluestein.c: simplification: instead of cldb, just use cldf with inputs/output values swapped 2003-05-16 Matteo Frigo * dft/bluestein.c: Allow more general transform sizes. 2003-05-16 Steven G. Johnson * kernel/ifftw.h: MS has __int64, not long long (grr) * kernel/ifftw.h: slight change * kernel/ifftw.h: MS has __int64 type, not long long (grr) 2003-05-16 Matteo Frigo * dft/ct.c: Fixed printout * dft/bluestein.c: Fixed flop count * dft/Makefile.am, dft/bluestein.c, dft/conf.c, dft/dft.h: New bluestein solver * dft/ct.c: Implemented generic radix. * dft/generic.c, kernel/ifftw.h, kernel/twiddle.c: Removed conditional branch from inner loop in generic.c * dft/generic.c: Simplified indexing * dft/generic.c: Better still. * dft/generic.c: Further improvement of generic solver * dft/rader.c, dft/dftw-dft.c: Cleanup * dft/generic.c: Generic now only works for odd sized. Added check. * kernel/ifftw.h: Increased GENERIC_MIN_BAD because of new algorithm. * dft/generic.c: Much, much better. * dft/generic.c: Still trying to understand why rdft-generic-dit is faster than dft-generic... * dft/generic.c: Nothing, really * dft/generic.c: Never be clever for the sake of being clever. * dft/generic.c: Simplified. generic-dit is gone. 
The solver is now out-of-place only---buffering is done by the buffered solver. 2003-05-15 Matteo Frigo * dft/rader.c: rader-dit is gone. * dft/plan.c: Cast * configure.ac, dft/Makefile.am, dft/buffered.c, dft/codelet-dft.h, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/ctsq.c, dft/dft.h, dft/dftw-dft.c, dft/directw.c, dft/directwbuf.c, dft/directwsq.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, dft/plan.c, dft/problemw.c, dft/rader.c, dft/solve.c: Introduced twiddle problem ``dftw''. Changed most other things to deal with this change. 2003-05-15 Steven G. Johnson * kernel/primes.c: whoops, X(safe_mulmod) not fftw_safe_mulmod * simd/sse.c, simd/sse2.c: add VC++ versions of asm * simd/simd-sse.h, simd/simd-sse2.h: VC++ reportedly supports the intel intrinsics, but requires __inline instead of __inline__ * kernel/ifftw.h: precompute array indices with VC++ * acx_pthread.m4: added doc note 2003-05-14 Steven G. Johnson * threads/threads.c: autodetect windows * libbench2/getopt.c: don't bother with #ifdef HAVE_CONFIG_H, since non-Unix users always forget to define it 2003-05-13 Steven G. Johnson * kernel/cycle.h: VC++ uses __inline * doc/FAQ/fftw-faq.bfnn: added leak question 2003-05-12 Steven G. Johnson * kernel/cycle.h: LARGE_INTEGER needs windows.h (supposedly, there is some problem converting __int64 to double...damn MS and their nonstandard types) * libbench2/timer.c: whoops * tools/fftw-wisdom.c: added 256x256 to canonical list 2003-05-12 Matteo Frigo * kernel/transpose.c: Oops... 2003-05-11 Matteo Frigo * kernel/transpose.c: Unrolled loops, changed cutoff * tests/bench.c: Do not multiply strides by 2 twice. 2003-05-08 Steven G. Johnson * tests/Makefile.am: added 'make smallcheck' * configure.ac, doc/fftw3.texi, kernel/timer.c: --without-cycle-counter becomes --with-slow-timer, updated docs 2003-05-07 Steven G. 
Johnson * configure.ac: remove duplicate -openmp check; Sun requires -xopenmp * dft/ct-ditbuf.c, rdft/hc2hc-buf.c: fixed compilation under Sun C++ 2003-05-07 Matteo Frigo * kernel/planner.c, kernel/timer.c: Use estimator if cycle counter is unavailable, regardless of the FFTW_MEASURE/ESTIMATE setting. 2003-05-07 Steven G. Johnson * kernel/cycle.h: _WIN32 (not __WIN32__) is always defined * kernel/cycle.h: minor cleanup * kernel/cycle.h: tentative VC++ stuff, some consolidation 2003-05-06 Steven G. Johnson * kernel/cycle.h, kernel/timer.c: made cycle.h more self-contained 2003-05-06 Matteo Frigo * simd/simd-3dnow.h, simd/simd-sse.h, simd/simd-sse2.h: Use ``%'' flag to denote commutative operations. 2003-05-06 Steven G. Johnson * kernel/cycle.h: MIT license, brief documentation * doc/Makefile.am: whoops, forgot f77_wisdom.f 2003-05-04 Matteo Frigo * dft/problem.c, libbench/mp.c, libbench2/bench.h, libbench2/mp.c, libbench2/verify-lib.c, rdft/problem2.c: Improved speed of accuracy test. 2003-04-29 Matteo Frigo * kernel/cycle.h: s390 cycle counter 2003-04-26 Steven G. Johnson * doc/fftw3.texi: forgot r2r directory * rdft/Makefile.am, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c: delete unused files, since they don't compile any more 2003-04-24 Matteo Frigo * simd/simd-sse2.h: Better gcc code generation 2003-04-23 Steven G. Johnson * acinclude.m4: ccc is the Compaq C compiler on Linux/alpha * doc/fftw3.texi: whoops 2003-04-19 Matteo Frigo * kernel/cycle.h: ia64 cycle counter with intel compiler. 2003-04-18 Matteo Frigo * doc/FAQ/fftw-faq.bfnn: More gcc bugs. Sigh. * bootstrap.sh: touch ChangeLog to observe GNU standards * ChangeLog: We now build ChangeLog automatically at distribution time * mkdist.sh: Automatic ChangeLog hackery 2003-04-18 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: plural * NEWS: updated 2003-04-18 Matteo Frigo * ChangeLog: Updated 2003-04-18 Steven G. 
Johnson * doc/FAQ/fftw-faq.bfnn: a -> an * doc/FAQ/fftw-faq.bfnn: hyphen * doc/FAQ/fftw-faq.bfnn: comma * doc/FAQ/fftw-faq.bfnn: minor 2003-04-18 Matteo Frigo * doc/FAQ/fftw-faq.bfnn: Updated * mkdist.sh: New script that builds the distributions * dft/simd/codelets/Makefile.am: Oops again * dft/simd/codelets/Makefile.am: Oops, forgot -sign 1 * configure.ac, dft/simd/codelets/Makefile.am, dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/n2b.c, dft/simd/n2b.h, dft/simd/n2f.c, dft/simd/n2f.h: Reorganization of simd codelets * genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml: k7 assembly was not updated after conversion of opcnt from int to double 2003-04-17 Matteo Frigo * dft/vrank2-transpose.c, dft/vrank3-transpose.c: Capital `X' looks bad in all-lowercase plans * dft/codelets/standard/Makefile.am, dft/simd/codelets/Makefile.am, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am: Removed redundant inline/noinline codelets * genfft/Makefile.am, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_hc2r_noinline.ml, genfft/gen_r2hc.ml, genfft/gen_r2hc_noinline.ml, genfft/gen_r2r.ml, kernel/ifftw.h, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am, support/Makefile.codelets: New noinline Noinline real codelets 2003-04-17 Steven G. Johnson * TODO: more ideas 2003-04-17 Matteo Frigo * dft/simd/codelets/Makefile.am: Removed duplicate rules. * Makefile.am: acx_pthread.m4 was not distributed * support/Makefile.codelets: Oops * dft/codelets/standard/Makefile.am, genfft/Makefile.am, genfft/gen_notw.ml, genfft/gen_notw_noinline.ml, support/Makefile.codelets: Both inlined and non-inlined notw codelets. * dft/simd/codelets/Makefile.am, genfft/Makefile.am, genfft/gen_notw_noinline_c.ml, support/Makefile.codelets: Initial experiment with both inlined and non-inlined simd codelets. Both are included for now. 
* configure.ac, support/Makefile.codelets: --enable-fma to build FMA distribution 2003-04-16 Matteo Frigo * genfft/gen_notw_c.ml: Inline SIMD nontwiddle codelets * simd/simd-sse.h, simd/simd-sse2.h: Pathetic attempt at saving a couple of registers... * genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_r2r.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml: for (i = 0; i < m; ++i) ==> for (i = m; i > 0; --i) No proof of evidence that this is any faster, but just in case... 2003-04-15 Steven G. Johnson * dft/vrank-geq1.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: added hack to make sure that codelet loops are preferred to vecloop solvers in the estimator * api/f77funcs.h, api/fftw3.h, api/flops.c, doc/fftw3.texi, kernel/ifftw.h, tests/bench.c: use double for flops * kernel/cycle.h: metrowerks reportedly supports gcc assembly extensions on ppc 2003-04-14 Matteo Frigo * simd/Makefile.am, dft/simd/Makefile.am: foo_CFLAGS generates some automake junk that breaks the build on Redhat 7.3. Screw it. * tests/check.pl: Carefully check return status 2003-04-13 Matteo Frigo * genfft/c.ml, genfft/simd.ml, kernel/ifftw.h, support/Makefile.codelets: Removed annoying -FMA() expressions. 2003-04-12 Matteo Frigo * kernel/ifftw.h: Major fma hackery * api/apiplan.c: Slight cleanup * configure.ac: Updated version number * acinclude.m4: Damn autoconf * acinclude.m4: Recognize all 74xx processors * acinclude.m4: Detect 7400 processor. * acinclude.m4: No need to check for gcc-2.95 2003-04-11 Steven G. Johnson * NEWS: removed duplicate 2003-04-11 Matteo Frigo * libbench2/report.c: mflops ==> ``mflops'' * libbench2/report.c: Print setup time as well 2003-04-10 Matteo Frigo * dft/problem.c, kernel/ifftw.h, rdft/problem.c, rdft/problem2.c, simd/taint.c: Enforce pointer equality for in-place problems. 2003-04-09 Steven G. 
Johnson * ChangeLog, NEWS: updated * tests/README: cross-ref fftw-wisdom man page 2003-04-09 Matteo Frigo * kernel/planner.c: Undone previous change, committed by mistake. * tests/Makefile.am, tests/README, kernel/planner.c: Quick and dirty README for bench * libbench2/bench-main.c, libbench2/timer.c: Consider additional command-line arguments as problems to be benchmarked. * libbench2/bench-main.c, libbench2/bench.h, libbench2/report.c: Default report format is now human-readable. Removed unnecessary complexity in benchmark reporting. * doc/fftw3.texi: Updated for new interleaved/split api. 2003-04-09 Steven G. Johnson * doc/fftw3.texi: updated citation 2003-04-08 Matteo Frigo * configure.ac: Time for beta3 2003-04-08 Steven G. Johnson * reodft/redft00e-r2hc-pad.c: whoops, added * doc/fftw3.texi: more comparison of different R*DFT types * reodft/redft00e-r2hc.c, reodft/rodft00e-r2hc.c: comments * reodft/Makefile.am, reodft/conf.c, reodft/reodft.h, reodft/rodft00e-r2hc-pad.c: more accurate DCT-I and DST-I, at the expense of up to a factor of 2 in speed and memory 2003-04-08 Matteo Frigo * kernel/planner.c: Workaround gcc/sparc bug 2003-04-08 Steven G. Johnson * doc/fftw3.texi: rumors 2003-04-07 Steven G. Johnson * tests/hook.c: added rdft2 paranoid mode * tests/hook.c: added paranoid mode for r2r * libbench2/verify-r2r.c: whoops, sincos is predefined on some systems 2003-04-05 Matteo Frigo * tests/hook.c: bp->destroy_input was not initialized * dft/problem.c, kernel/ifftw.h, rdft/problem.c, rdft/problem2.c: Asserted correctness conditions for tainted pointers. (For now, use CK() while we test. They should be changed into A() at some point.) * dft/problem.c, rdft/problem.c, rdft/problem2.c: Untaint pointers before zero'ing arrays and before hashing * libbench2/bench-main.c: Alignment check did not work with icc, which seems to be confused by the fact that the variable is not used. 
* tests/Makefile.am: More paranoid paranoid-check * kernel/ifftw.h: 0 == x & 7 parses as (0 == x) & 7, which is wrong 2003-04-05 Steven G. Johnson * dft/direct.c, kernel/ifftw.h, kernel/planner.c, libbench2/bench-main.c, rdft/direct.c, rdft/direct2.c: alignment checks * rdft/rdft-dht.c: prevent infinite loops in exhaustive planning * api/Makefile.am, api/api.h, api/apiplan.c, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-split-dft-c2r.c, api/execute-split-dft-r2c.c, api/execute-split-dft.c, api/f77funcs.h, api/fftw3.h, api/mktensor-iodims.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-guru-split-dft-c2r.c, api/plan-guru-split-dft-r2c.c, api/plan-guru-split-dft.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, tests/bench.c: split/unsplit guru interface 2003-04-05 Matteo Frigo * tests/hook.c: Need UNTAINT in verifier too. * simd/taint.c: Forgot #if HAVE_SIMD * api/fftw3.h, kernel/align.c, kernel/ifftw.h, simd/Makefile.am, simd/simd.h, simd/taint.c: Keep track of two separate taint bits 2003-04-05 Steven G. Johnson * api/api.h, api/fftw3.h, api/mapflags.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, dft/k7/k7.c, dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/n2b.c, dft/simd/n2f.c, dft/simd/q1b.c, dft/simd/q1f.c, dft/simd/t1b.c, dft/simd/t1f.c, kernel/ifftw.h, tests/bench.c: added NO_SIMD problem flag, made UNALIGNED an API issue (taints input pointers) 2003-04-04 Steven G. Johnson * dft/buffered.c, rdft/buffered.c, rdft/buffered2.c: bugfix in buffered: wrong pointers passed for cldrest; also use TAINT instead of UNALIGNED in buffered2 2003-04-04 Matteo Frigo * dft/vrank-geq1.c: Reverted previous change, committed accidentally * kernel/align.c: What was I thinking? 
* dft/vrank-geq1.c: *** empty log message *** 2003-04-04 Steven G. Johnson * configure.ac, libbench2/aligned-main.c: added --enable-debug-alignment * kernel/align.c, kernel/ifftw.h: X(taint) prototype, define corresponding function only if HAVE_SIMD 2003-04-04 Matteo Frigo * dft/buffered.c, dft/solve.c, dft/vrank-geq1.c, kernel/align.c, kernel/ifftw.h, rdft/buffered.c, rdft/buffered2.c, rdft/solve.c, rdft/solve2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: Initial checkin of tainted pointers * dft/buffered.c, dft/rader.c, dft/simd/n2b.c, dft/simd/n2f.c, dft/vrank-geq1.c, kernel/align.c, kernel/ifftw.h, rdft/buffered.c, rdft/buffered2.c, rdft/dht-rader.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: More conservative preservation of alignment 2003-04-04 Steven G. Johnson * api/apiplan.c, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-r2r.c, api/execute.c, api/f77funcs.h: plan/execute with aligned stack 2003-04-03 Steven G. Johnson * api/Makefile.am: whoops, missed FFTW_MEASURE in fftw3.f * threads/threads.c: use WITH_ALIGNED_STACK for experimental semaphore stuff, too 2003-04-03 Matteo Frigo * kernel/stack.c: Removed old file * kernel/Makefile.am, kernel/ifftw.h, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/threads.h, threads/vrank-geq1-rdft2.c: Improved stack-alignment hack 2003-04-03 Steven G. Johnson * threads/threads.c: use aligned stack for experimental semaphores, too * kernel/ifftw.h, kernel/stack.c, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/threads.c, threads/threads.h, threads/vrank-geq1-rdft2.c: whoops * kernel/ifftw.h, kernel/stack.c, threads/ct-dit.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/threads.c: fix(?) 
for SIMD thread problems * doc/fftw3.texi: noted n=1 REDFT01 case * doc/fftw3.texi: note about n=2 REDFT00 formula * doc/fftw3.texi: note about undefined REDFT00 * doc/fftw3.texi: noted n=1 RODFT01 case * doc/equation-redft11.png, doc/equation-rodft01.png, doc/equation-rodft11.png, doc/fftw3.texi: corrected definitions * rdft/codelet-rdft.h, rdft/problem.c, rdft/vrank-geq1.c: added REODFT_KINDP, fixed nontrivial test for R2HC11 and HC2R11 (not that we support these yet anyway) * rdft/codelets/hc2r/Makefile.am, rdft/problem.c: size 2 hc2r and dht are equivalent to r2hc 2003-04-02 Steven G. Johnson * doc/fftw3.texi: noted overwriting in upgrading section 2003-04-02 Matteo Frigo * kernel/Makefile.am, kernel/align.c, kernel/stack.c: Moved with_aligned_stack to its own file * kernel/align.c, libbench2/aligned-main.c: Fixed comments * kernel/align.c, kernel/ifftw.h, libbench2/aligned-main.c, libbench2/bench-main.c: Alignment hacks 2003-04-01 Steven G. Johnson * threads/threads.c: phew, no, previous version was okay * threads/threads.c: whoops, crap 2003-04-01 Matteo Frigo * simd/simd-sse2.h: support sse2 in forthcoming gcc-3.3 2003-04-01 Steven G. Johnson * kernel/cycle.h: comment * kernel/cycle.h: noted ac_check_headers * kernel/cycle.h: comment * kernel/cycle.h: documented autoconf tests, so that cycle.h can be distributed separately * NEWS: IRIX is all-caps * NEWS: noted Irix fix * threads/api.c, threads/threads.h: whoops * threads/threads.c: use ithreads_init so as not to confuse fftw 2 users * threads/threads.c: IRIX lossage * configure.ac: check for -openmp (icc) among the OpenMP flags (TODO: make this a separate macro, with a loop instead of repeated checks) 2003-03-31 Steven G. Johnson * doc/fftw3.texi: clarification 2003-03-31 Matteo Frigo * acinclude.m4: More liberal test for solaris CC * simd/simd-sse.h, simd/simd-sse2.h: Allow x86-64 simd * kernel/cycle.h: Added x86-64 timer code 2003-03-31 Steven G. 
Johnson * NEWS, ChangeLog: updated * doc/FAQ/fftw-faq.bfnn: colon 2003-03-31 Matteo Frigo * doc/FAQ/fftw-faq.bfnn: Reorganized compiler bugs section (which is growing out of control) * doc/FAQ/fftw-faq.bfnn: solaris gcc bug appears to be also in 2.95.2 * kernel/planner.c: Workaround works---there is another gcc/sparc bug elsewhere * kernel/planner.c: Grrr, workaround does not work. * kernel/planner.c: ADDMOD is now function, which seems to avoid gcc bugs. 2003-03-30 Matteo Frigo * kernel/planner.c: Workaround sparc gcc bug 2003-03-30 Steven G. Johnson * doc/fftw3.texi: note * dft/vrank2-transpose.c, dft/vrank3-transpose.c: make non-square UGLY, for now * tests/bench.c: added -o amnesia to forget_wisdom before each plan 2003-03-30 Matteo Frigo * libbench2/bench-user.h, libbench2/report.c, libbench2/speed.c: Report setup time in benchmark 2003-03-30 Steven G. Johnson * kernel/transpose.c: comment * doc/fftw3.texi: slight change 2003-03-29 Matteo Frigo * kernel/ct.c: More relaxed definition of UGLYness 2003-03-29 Steven G. Johnson * rdft/codelet-rdft.h, rdft/hc2hc.h, rdft/rdft.h, reodft/reodft.h, threads/threads.h: no more cvs id strings in header files...I'm tired of having to rebuild everything after a commit * rdft/Makefile.am, rdft/buffered2.c, rdft/direct2.c, rdft/rdft.h, rdft/rdft2-inplace-strides.c, rdft/rdft2-strides.c, rdft/rdft2-tensor-max-index.c, rdft/vrank-geq1-rdft2.c, threads/vrank-geq1-rdft2.c: rdft2 stride unification * rdft/vrank-geq1-rdft2.c: preserve in-place-ness * tests/Makefile.am, tests/bench.c, tests/check.pl: make nowisdom the default 2003-03-29 Matteo Frigo * tests/Makefile.am: --verbose in paranoid-check produces too much output. Make it quiet. 2003-03-29 Steven G. 
Johnson * dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/transpose.c: fixed transpose bugs...need to check ri-ii before deciding whether Ntuple fits 2003-03-29 Matteo Frigo * tests/check.pl: try more 2^k * kernel/ifftw.h: MIN_ALIGNMENT was defined after being used, causing crash in sse2. 2003-03-29 Steven G. Johnson * kernel/Makefile.am, kernel/ifftw.h, kernel/tensor10.c, kernel/transpose.c, rdft/Makefile.am, rdft/conf.c: real transposes are currently unused, and are not needed for MPI code either * dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/Makefile.am, kernel/ifftw.h, kernel/transpose.c: added general transpose * libbench2/problem.c: added transposition option * dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/Makefile.am, kernel/ifftw.h, kernel/tensor10.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c: yikes, fixed incorrect applicability of transpose plans * rdft/dft-r2hc.c: in the future, we might want to allow sz->rnk == 0, vecsz->rnk arbitrary to be converted to r2hc (the apply function already should work for this case)...disabled for now, though * kernel/align.c, kernel/ifftw.h, rdft/vrank-geq1-rdft2.c: use most_unaligned in rdft2 * tests/Makefile.am: slight change * tests/Makefile.am: output message when checks pass 2003-03-28 Steven G. Johnson * kernel/ifftw.h: added ifndef alloca around alloca stuff 2003-03-28 Matteo Frigo * rdft/dht-rader.c, dft/rader.c, dft/vrank-geq1.c, kernel/align.c, kernel/ifftw.h: Proper alignment in rader 2003-03-28 Steven G. 
Johnson * kernel/ifftw.h: whitespace * kernel/ifftw.h: whoops, alloca stuff inside HAVE_ALLOCA * tests/Makefile.am: make check can afford to be a little bigger * kernel/ifftw.h: use same alloca macrology as configure script * kernel/ifftw.h: fallback is no longer needed for mingw * kernel/ifftw.h: alloca fallback for gcc * configure.ac: _alloca was added for MinGW, but it causes problems there * kernel/align.c: fixed most_unaligned for split format * Makefile.am: whoops * Makefile.am, configure.ac, fftw.pc.in: added pkg-config 2003-03-27 Steven G. Johnson * dft/vrank-geq1.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: fixed asserts 2003-03-27 Matteo Frigo * kernel/align.c: Do not adjust r/i pointers separately. * dft/simd/n2b.h, dft/simd/n2f.h: Forgot to add files * dft/simd/codelets/Makefile.am, configure.ac, dft/simd/Makefile.am, dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/n2b.c, dft/simd/n2f.c: Specialized n simd codelets for unit vector stride. * configure.ac: Changed version number to beta2 * api/mapflags.c, dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/q1b.c, dft/simd/q1f.c, dft/simd/t1b.c, dft/simd/t1f.c, dft/vrank-geq1.c, kernel/align.c, kernel/ifftw.h, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: Changed alignment requirements for n1 simd codelets. Changed mechanism for detecting lack of alignment. * tests/bench.c: Oops, wrong place for hook 2003-03-27 Steven G. 
Johnson * dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/k7/codelets/Makefile.am, dft/simd/codelets/Makefile.am, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am, rdft/codelets/r2r/Makefile.am: added comments to codelet makefiles, to aid people wanting to generate their own code * doc/FAQ/fftw-faq.bfnn: Matteo is also a copyright holder * doc/FAQ/fftw-faq.bfnn: FORTRAN is officially Fortran, these days * doc/FAQ/fftw-faq.bfnn: punctuation * doc/FAQ/fftw-faq.bfnn: don't use "wrapper" * doc/FAQ/fftw-faq.bfnn: plural * doc/FAQ/fftw-faq.bfnn: grammar * doc/FAQ/fftw-faq.bfnn: better phrasing * kernel/align.c: stddef.h should not be needed anymore for this file * dft/codelets/standard/Makefile.am: added comments for Franz mode * dft/simd/codelets/Makefile.am: clarification * dft/simd/codelets/Makefile.am: commented on FRANZ codelets * NEWS: updated * dft/codelets/inplace/Makefile.am: disable DIF codelets, since they are never used (apparently) except for some non-power-of-two sizes...improve support for the latter by adding size 3, 5, and 6 q^2 codelets. * doc/fftw3.texi: DHT has no forward/backward 2003-03-27 fftw * tests/bench.c: added hacky way to use an arbitrary flag 2003-03-27 Matteo Frigo * tests/bench.c: Better place to install hook 2003-03-27 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: noted that the user should run make check if they think FFTW has a bug 2003-03-26 Matteo Frigo * kernel/planner.c: Oops, what am I thinking * kernel/planner.c: Grrr.... fixed bug in estimator * genfft/c.ml: Oops---the flop count was right. The estimator is broken elsewhere. * genfft/c.ml: Fixed SIMD estimator * dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, dft/simd/q1b.c, dft/simd/q1b.h, dft/simd/q1f.c, dft/simd/q1f.h, dft/simd/t1b.c, dft/simd/t1f.c, genfft/Makefile.am, genfft/gen_twiddle_c.ml, genfft/gen_twidsq_c.ml, support/Makefile.codelets: Added twidsq simd codelets 2003-03-26 Steven G. 
Johnson * doc/fftw3.texi: gensrc -> genfft * TODO: newline 2003-03-26 Matteo Frigo * TODO: Noted need to add dif simd codelets 2003-03-25 Steven G. Johnson * doc/fftw3.texi: noted shift * doc/fftw3.texi: clarification * doc/fftw3.texi: need make after bootstrap * doc/fftw3.texi: slight change * doc/fftw3.texi: libtool is also needed * doc/fftw3.texi: added code generator introduction * Makefile.am, configure.ac, genfft/Makefile.am, genfft/complex.ml, genfft/complex.mli, genfft/gen_r2r.ml, genfft/gen_trig.ml, genfft/trig.ml, rdft/Makefile.am, rdft/codelet-rdft.h, rdft/codelets/Makefile.am, rdft/codelets/r2r.c, rdft/codelets/r2r.h, rdft/codelets/r2r/Makefile.am, rdft/conf.c, rdft/direct.c, rdft/kr2r.c, rdft/rdft.h, support/Makefile.codelets: added support for REDFT/RODFT/DHT direct codelets * doc/FAQ/fftw-faq.bfnn: noted ARM bug; thanks to Jay Treacy 2003-03-25 Matteo Frigo * genfft-k7/vK7Optimization.ml: bugfix from Stefan 2003-03-24 Steven G. Johnson * doc/fftw3.texi: slight change * doc/fftw3.texi: caveat * doc/fftw3.texi: warning about DHT 2003-03-24 Matteo Frigo * dft/k7/codelets/Makefile.am: Oops * dft/k7/codelets/Makefile.am, tests/Makefile.am, tests/check.pl: Regression test for p4fftwgel 2003-03-24 Steven G. Johnson * tests/Makefile.am: make check is faster, old tests are in make bigcheck 2003-03-22 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: note * doc/FAQ/fftw-faq.bfnn: whoops, line wrapping 2003-03-21 Matteo Frigo * dft/codelets/standard/Makefile.am, genfft/gen_notw.ml, genfft/gen_twiddle.ml: Franz-mode codelets even without SIMD. (disabled) * doc/FAQ/fftw-faq.bfnn: Bug is in netbsd-1.6, not 1.5 * simd/simd-altivec.h: const cast, should placate c++ compilers. 2003-03-20 Steven G. 
Johnson * doc/FAQ/fftw-faq.bfnn: added FAQ on why plans are array-specific * reodft/reodft010e-r2hc.c: comment fix * reodft/reodft010e-r2hc.c: noted comparison to NR * api/fftw3.h: whoops, C99 complex didn't work if complex is a macro (as it is with glibc); thanks to Keh-Cheng Chu for the bug report 2003-03-19 Steven G. Johnson * configure.ac: noted in help that --enable-k7 enables 3dnow, and that --enable-3dnow is only a fallback 2003-03-19 Matteo Frigo * doc/FAQ/Makefile.am, doc/FAQ/fftw-faq.bfnn, doc/FAQ/html.refs: New gcc bug. html.refs was not in repository/distribution. * tests/bench.c: Don't write wisdom if you don't have it. 2003-03-18 Matteo Frigo * doc/fftw3.texi: Added index entries for DHT. Similarly for DCT, DST 2003-03-18 Steven G. Johnson * api/f77api.c, api/f77funcs.h: execute should not go through C api, for efficiency 2003-03-18 Matteo Frigo * api/fftw3.h: Renamed FFTW_IODIM, FFTW_R2R_KIND 2003-03-18 Steven G. Johnson * doc/Makefile.am: added rfftwnd.eps to dist, so that transfig is not required for people trying to build other formats (e.g. ps); thanks to Brian Gough for the bug report 2003-03-17 Steven G. Johnson * doc/fftw3.texi: pointer to upgrading section from tutorial * api/f77funcs.h, api/fftw3.h, api/print-plan.c, doc/fftw3.texi, tests/bench.c: make print_plan and fprint_plan, so that the former can be more easily called from other languages * doc/fftw3.texi: whoops, forgot to change equation image links to .png 2003-03-17 Matteo Frigo * api/fftw3.h, api/version.c, support/Makefile.codelets: fixed c++ linkage problems * api/fftw3.h, api/version.c: Removed ``const'', otherwise c++ link fails 2003-03-17 Steven G. 
Johnson * api/f77api.c, api/f77funcs.h, api/version.c, libbench2/allocate.c, libbench2/getopt-utils.c, libbench2/problem.c, libbench2/speed.c, libbench2/timer.c, libbench2/verify-r2r.c, libbench2/zero.c, support/Makefile.codelets, tests/bench.c, tests/hook.c, tools/fftw-wisdom.c: fixed C++ annoyances: void* casts, and global variables are static by default(?!?) 2003-03-16 Steven G. Johnson * doc/FAQ/fftw-faq.bfnn: ranlib bug is in binutils * doc/FAQ/fftw-faq.bfnn: ranlib Irix bug * tests/check.pl: start with random tests * api/Makefile.am, dft/direct.c, kernel/ifftw.h, libbench2/verify-r2r.c, rdft/direct.c, rdft/direct2.c, threads/Makefile.am: silenced some compiler warnings, eliminated unused variables, and fixed Makefile.am for f77funcs.h * doc/FAQ/fftw-faq.bfnn: whoops * doc/fftw3.texi: 3dnow is float * doc/fftw3.texi: fixed k7 docs * kernel/cycle.h: SGI compilers now support inline * kernel/cycle.h: cruft * doc/fftw3.texi: texinfo doesn't like commas in nodes * README, ChangeLog: updated * api/f77api.c, api/f77funcs.c, api/f77funcs.h, threads/f77api.c, threads/f77funcs.c, threads/f77funcs.h: f77funcs.c -> f77funcs.h so that people don't try to compile it * doc/FAQ/fftw-faq.bfnn: minor changes * doc/FAQ/fftw-faq.bfnn: updated compiler bug list * doc/fftw3.texi: noted how to set CC * TODO: TODONE * threads/vrank-geq1-rdft2.c: yikes, bugfix * kernel/ifftw.h: whoops 2003-03-16 Matteo Frigo * api/version.c: Report SIMD extensions in version string 2003-03-15 Steven G. 
Johnson * tests/bench.c: more verbose output * doc/fftw3.texi: a couple of additional non-Unix instructions * doc/FAQ/fftw-faq.bfnn: hyphen * doc/FAQ/fftw-faq.bfnn: softened * configure.ac, doc/FAQ/Makefile.am, doc/FAQ/bfnnconv.pl, doc/FAQ/fftw-faq.bfnn, doc/FAQ/m-ascii.pl, doc/FAQ/m-html.pl, doc/FAQ/m-info.pl, doc/FAQ/m-lout.pl, doc/FAQ/m-post.pl, doc/Makefile.am, doc/equation-dft.gif, doc/equation-dft.png, doc/equation-dht.gif, doc/equation-dht.png, doc/equation-idft.gif, doc/equation-idft.png, doc/equation-redft00.gif, doc/equation-redft00.png, doc/equation-redft01.gif, doc/equation-redft01.png, doc/equation-redft10.gif, doc/equation-redft10.png, doc/equation-redft11.gif, doc/equation-redft11.png, doc/equation-rodft00.gif, doc/equation-rodft00.png, doc/equation-rodft01.gif, doc/equation-rodft01.png, doc/equation-rodft10.gif, doc/equation-rodft10.png, doc/equation-rodft11.gif, doc/equation-rodft11.png: added FAQ, used PNGs * COPYRIGHT, TODO, api/api.h, api/apiplan.c, api/configure.c, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-r2r.c, api/execute.c, api/export-wisdom-to-file.c, api/export-wisdom-to-string.c, api/export-wisdom.c, api/extract-reim.c, api/f77api.c, api/f77funcs.c, api/fftw3.h, api/flops.c, api/forget-wisdom.c, api/import-system-wisdom.c, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, api/import-wisdom.c, api/map-r2r-kind.c, api/mapflags.c, api/mkprinter-file.c, api/mktensor-iodims.c, api/mktensor-rowmajor.c, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, 
api/plan-r2r.c, api/print-plan.c, api/rdft2-pad.c, api/the-planner.c, api/version.c, api/x77.h, dft/buffered.c, dft/codelet-dft.h, dft/codelets/n.c, dft/codelets/n.h, dft/codelets/t.c, dft/codelets/t.h, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/dft.h, dft/direct.c, dft/generic.c, dft/indirect.c, dft/k7/k7.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, dft/kdft.c, dft/nop.c, dft/plan.c, dft/problem.c, dft/rader-omega.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/t1b.c, dft/simd/t1b.h, dft/simd/t1f.c, dft/simd/t1f.h, dft/solve.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, dft/zero.c, doc/f77_wisdom.f, doc/fftw3.texi, genfft-k7/algsimp.ml, genfft-k7/algsimp.mli, genfft-k7/assoctable.ml, genfft-k7/assoctable.mli, genfft-k7/expr.ml, genfft-k7/expr.mli, genfft-k7/fft.ml, genfft-k7/littlesimp.ml, genfft-k7/littlesimp.mli, genfft-k7/monads.ml, genfft-k7/number.ml, genfft-k7/number.mli, genfft-k7/oracle.ml, genfft-k7/oracle.mli, genfft-k7/to_alist.ml, genfft-k7/to_alist.mli, genfft-k7/twiddle.ml, genfft-k7/twiddle.mli, genfft/algsimp.ml, genfft/algsimp.mli, genfft/annotate.ml, genfft/annotate.mli, genfft/assoctable.ml, genfft/assoctable.mli, genfft/c.ml, genfft/c.mli, genfft/complex.ml, genfft/complex.mli, genfft/conv.ml, genfft/conv.mli, genfft/dag.ml, genfft/dag.mli, genfft/expr.ml, genfft/expr.mli, genfft/fft.ml, genfft/fft.mli, genfft/gen_athnotw.ml, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/littlesimp.ml, genfft/littlesimp.mli, genfft/magic.ml, genfft/monads.ml, genfft/number.ml, genfft/number.mli, genfft/oracle.ml, genfft/oracle.mli, genfft/schedule.ml, genfft/schedule.mli, genfft/simd.ml, 
genfft/simd.mli, genfft/simdmagic.ml, genfft/to_alist.ml, genfft/to_alist.mli, genfft/trig.ml, genfft/trig.mli, genfft/twiddle.ml, genfft/twiddle.mli, genfft/unique.ml, genfft/unique.mli, genfft/util.ml, genfft/util.mli, genfft/variable.ml, genfft/variable.mli, kernel/align.c, kernel/alloc.c, kernel/assert.c, kernel/awake.c, kernel/buffered.c, kernel/ct.c, kernel/cycle.h, kernel/debug.c, kernel/hash.c, kernel/iabs.c, kernel/ifftw.h, kernel/md5-1.c, kernel/md5.c, kernel/minmax.c, kernel/ops.c, kernel/pickdim.c, kernel/plan.c, kernel/planner.c, kernel/primes.c, kernel/print.c, kernel/problem.c, kernel/rader.c, kernel/scan.c, kernel/solver.c, kernel/solvtab.c, kernel/square.c, kernel/stride.c, kernel/tensor.c, kernel/tensor1.c, kernel/tensor2.c, kernel/tensor4.c, kernel/tensor5.c, kernel/tensor7.c, kernel/tensor8.c, kernel/tensor9.c, kernel/timer.c, kernel/trig.c, kernel/trig1.c, kernel/twiddle.c, libbench/bench-main.c, libbench/bench-user.h, libbench/bench.h, libbench/can-do.c, libbench/getopt-utils.c, libbench/info.c, libbench/main.c, libbench/prime.c, libbench/problem.c, libbench/report.c, libbench/speed.c, libbench/timer.c, libbench/util.c, libbench/verify.c, libbench/zero.c, libbench2/aligned-main.c, libbench2/bench-main.c, libbench2/bench-user.h, libbench2/bench.h, libbench2/can-do.c, libbench2/dotens2.c, libbench2/getopt-utils.c, libbench2/info.c, libbench2/main.c, libbench2/problem.c, libbench2/report.c, libbench2/speed.c, libbench2/tensor.c, libbench2/timer.c, libbench2/useropt.c, libbench2/util.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.c, libbench2/verify.h, libbench2/zero.c, rdft/buffered.c, rdft/buffered2.c, rdft/codelet-rdft.h, rdft/codelets/hb.h, rdft/codelets/hc2r.c, rdft/codelets/hc2r.h, rdft/codelets/hc2rIII.h, rdft/codelets/hf.h, rdft/codelets/hfb.c, rdft/codelets/r2hc.c, rdft/codelets/r2hc.h, rdft/codelets/r2hcII.h, rdft/conf.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, 
rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/indirect.c, rdft/khc2hc-dif.c, rdft/khc2hc-dit.c, rdft/khc2r.c, rdft/kr2hc.c, rdft/nop.c, rdft/nop2.c, rdft/plan.c, rdft/plan2.c, rdft/problem.c, rdft/problem2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-inplace-strides.c, rdft/rdft2-radix2.c, rdft/rdft2-tensor-max-index.c, rdft/solve.c, rdft/solve2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/conf.c, reodft/redft00e-r2hc.c, reodft/reodft.h, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc.c, simd/3dnow.c, simd/altivec.c, simd/simd-3dnow.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h, simd/sse-aux.c, simd/sse.c, simd/sse2-aux.c, simd/sse2.c, threads/api.c, threads/conf.c, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/f77api.c, threads/f77funcs.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/threads.c, threads/threads.h, threads/vrank-geq1-rdft2.c, tools/fftw-wisdom-to-conf.1, tools/fftw-wisdom-to-conf.in, tools/fftw-wisdom.c, tools/fftw_wisdom.1.in: great copyright update * TODO, tests/Makefile.am, tests/check.pl: threads in make check * threads/ct-dit.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c: fixed const warnings * threads/ct-dit.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c: make sure spawn_loop size > 1 (it has to be at least > 0 lest we crash, but > 1 is an optimization) 2003-03-15 Matteo Frigo * kernel/cycle.h: hpux seems to want machine/sys/inline.h as opposed to machine/inline.h. 2003-03-15 Steven G. 
Johnson * doc/fftw3.texi: Sourceforge is really SourceForge.net, and is run by VA * doc/fftw3.texi: comma * doc/fftw3.texi: fixed AMD company name * doc/fftw3.texi: minor changes * api/f77api.c, api/f77funcs.c: more emitter->read_char renaming * doc/fftw3.texi: more wisdom docs, noted wisdom utilities * doc/fftw3.texi: compound adjectives are hyphenated * doc/fftw3.texi: fftw does support another type of packed array via r2r * api/export-wisdom.c, api/f77api.c, api/f77funcs.c, api/fftw3.h, api/import-wisdom.c, doc/f77_wisdom.f, doc/fftw3.texi: write_char/read_char for export/import functions * threads/threads.c: comments 2003-03-15 Matteo Frigo * support/Makefile.codelets: Enabled randomized-cse * configure.ac: Changed to 3.0-beta1 * doc/fftw3.texi: First complete draft * api/fftw3.h, api/import-wisdom.c: EMITTER is a misnomer * doc/fftw3.texi: Revision, wisdom tutorial, acks. 2003-03-15 Steven G. Johnson * NEWS: noted OpenMP * threads/threads.c: comment * threads/threads.c: comments * threads/threads.c: reformatting * threads/threads.c: whoops * tests/bench.c, threads/api.c, threads/threads.c, threads/threads.h: some threads fixes, and added experimental semaphore (pre-thread-spawning) and Linux spinlock support * threads/f77funcs.c: whoops 2003-03-14 Steven G. Johnson * doc/fftw3.texi: added note that FFTW_PATIENT will disable threads if they are not beneficial * doc/fftw3.texi: made fftw_cleanup* more restrictive, in that we don't want to guarantee that previously created plans will still work (they won't, in the case of threaded plans and fftw_cleanup_threads), and there is no reason to provide such a guarantee anyway. 
2003-03-14 Matteo Frigo * api/Makefile.am, api/version.c, kernel/Makefile.am, kernel/ifftw.h, kernel/version.c: Moved version.c from kernel/ into api/ * configure.ac: icc-7.0 requires -openmp * doc/Makefile.am: Ensure that one can do make dist given the distribution * doc/Makefile.am: Dist fftw3.pdf, not fftw.pdf * tests/bench.c: Support -onthreads=%d 2003-03-14 Steven G. Johnson * kernel/alloc.c: comment * threads/Makefile.am: whoops * doc/rfftwnd.fig: fftw_real is gone * doc/fftw3.texi: typos 2003-03-14 Matteo Frigo * api/fftw3.h, tests/bench.c: More BENCH_DOC strings * doc/fftw3.texi: Fixed xref's * doc/Makefile.am, doc/fftw3.texi, doc/rfftwnd.gif: Revised manual (esp. intro and tutorial), fixed texinfo hackery for figures. 2003-03-12 Steven G. Johnson * doc/fftw3.texi: redirect users from guru execute to advanced interface, if possible * doc/fftw3.texi: punctuation * doc/fftw3.texi: use correct heading level * doc/Makefile.am, doc/fftw3.texi: html generation * doc/equation-dft.gif, doc/equation-dht.gif, doc/equation-idft.gif, doc/equation-redft00.gif, doc/equation-redft01.gif, doc/equation-redft10.gif, doc/equation-redft11.gif, doc/equation-rodft00.gif, doc/equation-rodft01.gif, doc/equation-rodft10.gif, doc/equation-rodft11.gif: added equation GIFs * doc/fftw3.texi: punctuation * doc/fftw3.texi: added multi-dimensional transform definitions * doc/fftw3.texi: slight changes * doc/fftw3.texi: typo * doc/fftw3.texi: added 1d version of What FFTW Really Computes * doc/fftw3.texi: note in upgrading section about FFTW_PATIENT 2003-03-11 Steven G. Johnson * doc/fftw3.texi: added cycle-counter section * TODO: more ideas 2003-03-10 Steven G. Johnson * dft/indirect.c, rdft/indirect.c: noted that indirect should probably be merged with rank-geq2, to make a rank-split solver 2003-03-07 Steven G. 
Johnson * doc/fftw3.texi: added non-Unix installation instructions * doc/fftw3.texi: also talk about stack alignment with SSE/SSE2 * doc/fftw3.texi: made warning more dire * doc/fftw3.texi: fix * doc/fftw3.texi: number * doc/fftw3.texi: fix * doc/fftw3.texi: minor * doc/fftw3.texi: minor fix * doc/fftw3.texi: cross-ref * doc/fftw3.texi: minor * doc/fftw3.texi: more installation manual * doc/fftw3.texi: GNU-lly correct * doc/fftw3.texi: started installation section * configure.ac, kernel/timer.c: added --without-cycle-counter option as a last resort * kernel/cycle.h: macros with () arguments were only standardized in C99, and we don't need them anyway * doc/fftw3.texi: wording * doc/fftw3.texi: parallelism * doc/fftw3.texi: additions to upgrading chapter * doc/fftw3.texi: noted additional humility of FFTW 3 wisdom * doc/fftw3.texi: renaming * doc/fftw3.texi: added placeholder for wisdom reference * doc/fftw3.texi: wrote upgrading chapter 2003-03-06 Steven G. Johnson * doc/fftw3.texi: slight change * doc/fftw3.texi: placeholder for upgrade chapter * tools/fftw-wisdom.c: whoops * tools/fftw_wisdom.1.in: strengthened warning about time * tools/fftw_wisdom.1.in: noted -t in example * threads/f77api.c: pay attention to WINDOWS_F77_MANGLING * doc/fftw3.texi: punctuation * doc/fftw3.texi: index * doc/fftw3.texi: documented C++ usage * doc/fftw3.texi: got rid of overfull hbox TeX warnings * doc/fftw3.texi: whoops * doc/fftw3.texi: noted fftw_iodim split for Fortran guru interface * doc/fftw3.texi: added guru reference * doc/fftw3.texi: minor * doc/fftw3.texi: use @r{...} for comment text in code examples 2003-03-05 Steven G. Johnson * simd/sse.c: eliminate warning * configure.ac, dft/simd/Makefile.am, dft/simd/codelets/Makefile.am, kernel/align.c, simd/Makefile.am: SIMD_CFLAGS only for simd code 2003-03-05 Matteo Frigo * doc/fftw3.texi: Minor changes. 2003-03-05 Steven G. 
Johnson * api/f77api.c, configure.ac: cross-compiling with MinGW can't detect f77 mangling, so add an option to use what seems to be the most common styles * libbench2/util.c: comment * libbench2/util.c: we only use our-malloc-16 on machines where size_t == uintptr_t, so don't bother doing the right thing with the benchmark * libbench2/util.c: support WITH_OUR_MALLOC16 2003-03-04 fftw * configure.ac: automatically add -msse etcetera for --enable-sse etcetera * tools/fftw-wisdom.c: got rid of const warning * libbench2/problem.c: missing header 2003-03-04 Steven G. Johnson * doc/fftw3.texi: fixes * api/import-system-wisdom.c: whoops * doc/fftw3.texi: started guru reference * api/fftw3.h: use same FFTW_IODIM between precisions * doc/fftw3.texi: renamed section * doc/fftw3.texi: no need for "advanced" in subheadings * doc/fftw3.texi: typo * doc/fftw3.texi: finished advanced interface * doc/fftw3.texi: more advanced interface docs * api/import-system-wisdom.c: fail for win32 2003-03-03 fftw * configure.ac: shortened help string * doc/fftw3.texi: fixed cross-refs * api/fftw3.h, api/mapflags.c, doc/fftw3.texi, tests/bench.c: FFTW_POSSIBLY_UNALIGNED -> simpler FFTW_UNALIGNED in API, added bench option * kernel/alloc.c: whoops * kernel/alloc.c: noted assumption * configure.ac, kernel/alloc.c: provide our own malloc16 routine because of Windows lossage 2003-03-03 Steven G. Johnson * doc/fftw3.texi: capitalization * doc/fftw3.texi: whoops * doc/fftw3.texi: vertical skip looks better than indenting for setting off short paragraphs 2003-03-03 Matteo Frigo * configure.ac, dft/simd/codelets/Makefile.am: Removed franz-mode. Automake was distributing franz files whether franz mode was enabled or not. 2003-03-03 Steven G. 
Johnson * doc/fftw3.texi: made output boundary conditions more prominent; they are important, because they make the different transform types inequivalent in parity * doc/fftw3.texi: clarification * doc/fftw3.texi: typo * doc/fftw3.texi: started advanced reference * doc/fftw3.texi: r2r reference * doc/fftw3.texi: workaround for info formatting bug * doc/fftw3.texi: noted lack of fftw_malloc in Fortran * doc/fftw3.texi: parallelism * doc/fftw3.texi: whoops * doc/fftw3.texi: r2c/c2r reference * doc/fftw3.texi: table of contents was being included twice * doc/fftw3.texi: minor changes * doc/fftw3.texi: started reference section * doc/Makefile.am: whoops * doc/fftw3.texi: started ref. section 2003-03-02 Steven G. Johnson * api/fftw3.h, api/flops.c: fftw_flops takes const plan * doc/fftw3.texi: typo * doc/fftw3.texi: added "Wisdom of Fortran?" section * doc/f77_wisdom.f: typo * doc/f77_wisdom.f: wording * doc/f77_wisdom.f: added comments * doc/f77_wisdom.f: added example file * tests/bench.c: don't print out READ WISDOM unless we have * kernel/scan.c: EOF is not a space 2003-03-02 Matteo Frigo * kernel/ifftw.h: Turn on inline by default * genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, kernel/ifftw.h: Optionally inline loop in notw codelets 2003-03-02 Steven G. Johnson * doc/fftw3.texi: updated nodes * doc/fftw3.texi: wrote most of Fortran chapter * doc/fftw3.texi: citation * doc/fftw3.texi: added parallel FFTW chapter * doc/fftw3.texi: typo * TODO: added inlining to TODO * CONVENTIONS: added K * dft/zero.c, kernel/trig1.c, rdft/generic.c, rdft/problem.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc.c: use K for constants * doc/fftw3.texi: fixed cross-ref * doc/fftw3.texi: whoops 2003-03-01 Steven G. 
Johnson * doc/fftw3.texi: cleanup * doc/fftw3.texi: "words of wisdom" by itself is a little too obscure * doc/fftw3.texi: re-added multi-dimensional array stuff * doc/fftw3.texi: added alignment section * reodft/reodft11e-r2hc-odd.c: shrunk code * reodft/reodft11e-r2hc-odd.c: slight compression * doc/fftw3.texi, reodft/reodft11e-radix2.c: style 2003-02-28 Steven G. Johnson * CONVENTIONS: noted not in API * CONVENTIONS: more updates * CONVENTIONS: slight updates * api/f77funcs.c, api/fftw3.h, api/print-plan.c, dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/dft.h, dft/direct.c, dft/generic.c, dft/indirect.c, dft/nop.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/solve.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-radix2.c, rdft/solve.c, rdft/solve2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, reodft/rodft00e-r2hc.c, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: great const-ification of apply/solve and print * api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-r2r.c, api/execute.c, api/f77funcs.c, api/fftw3.h, doc/fftw3.texi: make fftw_execute take a const plan, to remind the user that it is re-entrant (or should be)... 
* doc/fftw3.texi: weakening * doc/fftw3.texi: note * doc/fftw3.texi: footnote about why DHT is provided * doc/fftw3.texi: index * doc/fftw3.texi: added DHT tutorial * doc/fftw3.texi: fixed O(n log n) * doc/fftw3.texi: whoops * doc/fftw3.texi: slight improvements * doc/fftw3.texi: addition * doc/fftw3.texi: clarification * doc/fftw3.texi: fix * doc/fftw3.texi: slight changes * doc/fftw3.texi: added R{E,O}DFTab tutorial 2003-02-27 Steven G. Johnson * doc/fftw3.texi: fixes * doc/fftw3.texi: slight change * doc/fftw3.texi: documented r2hc/hc2r * doc/fftw3.texi: minor changes * TODO: timed planner and unifying radix-2 butterfly loops are not critical for release * TODO: reodft/verify.c no longer exists * rdft/problem.c: optimization: REDFT00 of size 2 is same as R2HC * rdft/problem.c: R{E,O}DFT01 of size-1 is identity * reodft/reodft11e-r2hc-odd.c: minor simplification * reodft/reodft11e-r2hc-odd.c: fixed add count * reodft/reodft11e-r2hc-odd.c: whoops * reodft/reodft11e-r2hc-odd.c: another optimization * reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-radix2.c: added op counts * reodft/reodft11e-r2hc-odd.c: cleanup * reodft/reodft11e-r2hc-odd.c: typo in comment * reodft/reodft11e-r2hc-odd.c: fixed comment * reodft/reodft11e-r2hc-odd.c: use E instead of R * reodft/reodft11e-r2hc-odd.c: more unrolling to eliminate if statements in loops, for speedups of 25-40% * reodft/reodft11e-r2hc-odd.c: some loop splitting to touch each element of output buf only once and eliminate some conditionals...speeds up by 30-40% 2003-02-26 Steven G. Johnson * reodft/reodft11e-r2hc-odd.c: comma * reodft/reodft11e-radix2.c: pointer to odd case * reodft/reodft11e-r2hc.c: precision -> accuracy (cf. 
Kahan) * Makefile.am, libbench2/bench-user.h, libbench2/problem.c, tools/fftw-wisdom.c, tools/fftw_wisdom.1.in: added time limit for wisdom generation * reodft/reodft11e-r2hc-odd.c: caps * reodft/reodft11e-r2hc-odd.c: another note * reodft/reodft11e-r2hc-odd.c: note * configure.ac, kernel/alloc.c, kernel/ifftw.h, libbench2/bench-main.c, libbench2/bench.h, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.c, libbench2/verify.h, reodft/Makefile.am, reodft/conf.c, reodft/reodft.h, reodft/reodft11e-r2hc-odd.c, reodft/reodft11e-r2hc.c, reodft/reodft11e-radix2.c, tests/bench.c: added new, more accurate (hopefully) reodft11 algorithms; added --disable-debug-malloc; added --impulse-accuracy-rounds=rounds flags to libbench2 for impulse-response accuracy tests 2003-02-23 Matteo Frigo * tools/Makefile.am: fftw_wisdom.1 is in $builddir, not $srcdir 2003-02-17 Steven G. Johnson * doc/fftw3.texi: pde * doc/fftw3.texi: consistent number * doc/fftw3.texi: started r2r doc * doc/Makefile.am, doc/fftw3.texi, doc/rfftwnd.fig, doc/rfftwnd.gif: rfftwnd 2003-02-15 Steven G. Johnson * doc/fftw3.texi: continued * doc/fftw3.texi: started r2c/c2r docs * libbench2/verify-r2r.c: added r{e,o}dft11 accuracy test * libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.h: added more r2r accuracy checks 2003-02-15 Matteo Frigo * tools/Makefile.am: $< is a GNUism 2003-02-13 Steven G. Johnson * TODO: r2r test cases are in * TODO: added vector radix to TODO 2003-02-12 Steven G. 
Johnson * tools/fftw_wisdom.1.in: fixed cross-ref * tools/fftw_wisdom.1.in: shorter synopsis * tests/debug.h: obsolete * tests/dotens.c, tests/dotens2.c: removed old dotens * tests/verify-dft.c, tests/verify-lib.c, tests/verify-rdft.c, tests/verify-reodft.c, tests/verify.h: removed old verify files * tools/fftw-wisdom.c, tools/fftw_wisdom.1.in: disable threads support by default 2003-02-12 Matteo Frigo * tests/bench.c: Removed old test program 2003-02-12 Steven G. Johnson * tools/fftw-wisdom-to-conf.in: joke * tools/fftw-wisdom-to-conf.1, tools/fftw-wisdom-to-conf.in: add --help and --version, to be GNU-lly correct * tools/fftw_wisdom.1.in: whoops * tools/fftw-wisdom.c: better help * tools/fftw-wisdom-to-conf.1: comma * tools/fftw-wisdom-to-conf.1: formatting * configure.ac, tools/Makefile.am, tools/fftw-wisdom-to-conf.1, tools/fftw_wisdom.1.in: man pages for tools * tools/fftw-wisdom.c: added -V 2003-02-11 Steven G. Johnson * Makefile.am: added install-wisdom target * NEWS: another note * libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.c, libbench2/verify.h: started r2r accuracy tests (only three kinds covered so far) * kernel/ifftw.h: silence warning 2003-02-11 Matteo Frigo * TODO: gcc bug is now avoided. * libbench2/Makefile.am, libbench2/bench-user.h, libbench2/mp.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-r2r.c, libbench2/verify-rdft2.c, libbench2/verify.c, libbench2/verify.h: Accuracy test 2003-02-10 Matteo Frigo * kernel/ifftw.h: There is no point in precomputing strides for the long-double code, as multiplication by sizeof(long double) cannot be folded into the addressing mode. This change also fixes the gcc-2.95 bug that causes miscompilation of certain codelets. 2003-02-10 Steven G. 
Johnson * tests/check.pl: added random r2r tests * reodft/reodft010e-r2hc.c: whoops, bugfix: missing stride for ro10 * api/mapflags.c: formatting * reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c: flop counts for reodft * libbench2/bench.h: declare aligned_main * rdft/dht-rader.c, rdft/rader-hc2hc.c: corrected rader op counts * TODO: punctuation * TODO: noted need for better estimator * NEWS: noted F77 api fix for g77 mangling incompatibility * api/Makefile.am: build f77 header file of constants from fftw3.h * TODO: updates * api/Makefile.am, api/f77api.c, api/x77.h, threads/Makefile.am, threads/f77api.c, threads/f77funcs.c: threads f77 api 2003-02-09 Steven G. Johnson * api/f77api.c, api/f77funcs.c: finished f77 serial api * api/f77api.c, api/f77funcs.c: added flops, slight cleanups 2003-02-09 Matteo Frigo * libbench2/aligned-main.c: Oops, forgot #include * libbench2/Makefile.am, libbench2/aligned-main.c, libbench2/bench-main.c, libbench2/main.c, tools/fftw-wisdom.c: Removed duplication of stack-alignment code 2003-02-09 Steven G. Johnson * tools/fftw-wisdom.c: allow - to read problems from stdin * tools/Makefile.am, tools/fftw-wisdom.c: added fftw-wisdom tool * tests/bench.c: elim. warning * tests/bench.c: destroy_input should not contaminate flags of other problems * ChangeLog: updated * dft/rank-geq2.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c: removed overzealous inplace check, which caused problems for rdft2 2003-02-09 Matteo Frigo * kernel/tensor.c: Consistent syntax for RNK_MINFTY tensors * kernel/tensor.c: lisply-correct tensor print. We no longer need to parse tensors. 2003-02-09 Steven G. 
Johnson * TODO: removed completed items * libbench2/verify-r2r.c: slight renaming * libbench2/problem.c, libbench2/verify-r2r.c: multi-dimensional r2r verifier * libbench2/verify-r2r.c: comments * libbench2/verify-r2r.c: slight simplification * libbench2/Makefile.am, libbench2/allocate.c, libbench2/bench-user.h, libbench2/mflops.c, libbench2/problem.c, libbench2/verify-r2r.c, libbench2/verify.c, libbench2/zero.c, tests/bench.c: added 1d r2r verifier (triple ugh) * tests/check.pl: added vector transforms to random tests * rdft/direct2.c: whoops * libbench2/problem.c: fixed interaction between dwims for sz/vecsz with rdft2 transforms * libbench2/bench-user.h, libbench2/problem.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-rdft2.c, libbench2/verify.h, tests/bench.c: added destroy_input flag/check * api/Makefile.am, api/dfthelp.c, api/extract-reim.c, api/plan-guru-dft-c2r.c, api/plan-many-dft-c2r.c, libbench2/Makefile.am, libbench2/allocate.c, libbench2/aset.c, libbench2/bench-user.h, libbench2/bench.h, libbench2/problem.c, libbench2/tensor.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify-rdft2.c, libbench2/verify.c, libbench2/verify.h, libbench2/zero.c, tests/bench.c, tests/check.pl: added rdft2 verifier 2003-02-08 Steven G. Johnson * rdft/rdft2-radix2.c: an additional check for in-place case 2003-02-07 Steven G. Johnson * rdft/rank0-rdft2.c: slight fix: hc2r constraints are mostly determined by sub-plan * rdft/rdft2-radix2.c: make radix2-dft inapplicable to in-place/split case (r == rio, iio >= rio + n/2+1 != r + 1) 2003-02-04 Matteo Frigo * kernel/planner.c, tests/hook.c: Allow plnr->hook to be 0 2003-02-04 Steven G. Johnson * libbench2/bench-user.h, libbench2/verify-dft.c, libbench2/verify.c: moved dft stuff into verify-dft * tests/hook.c: cruft * libbench2/bench-user.h, libbench2/problem.c, libbench2/verify.c, tests/bench.c, tests/hook.c: further unify libbench2 and paranoid verifiers 2003-02-02 Steven G. 
Johnson * api/import-wisdom-from-file.c: typo in comment 2003-02-01 Matteo Frigo * kernel/primes.c: Fixed p==2 case * kernel/primes.c: Incorporated new find_generator by Greg Dionne. * libbench2/getopt.c: Removed nonportable call to gettext() 2003-01-30 Matteo Frigo * kernel/ifftw.h: uintptr_t is in in openbsd 2003-01-29 Matteo Frigo * api/export-wisdom-to-string.c, api/export-wisdom.c, api/import-wisdom-from-file.c, api/mkprinter-file.c, kernel/debug.c, kernel/ifftw.h, kernel/planner.c, kernel/print.c, kernel/scan.c, tests/bench.c: Huge speedups in wisdom I/O. * kernel/planner.c: Added appropriate warning against likely future bug. * kernel/planner.c: Don't attempt to remove bogus wisdom entries. 2003-01-28 Matteo Frigo * kernel/planner.c: Fixed a couple of very very very nasty bugs---pointers became invalid after the hash table was relocated. * tests/bench.c: Read wisdom at can_do() time, otherwise wisdom is destroyed. * kernel/planner.c: More conservative inheritance of blessings * dft/problem.c: Print the same info as it is hashed * tests/check.pl: Print name of executable when FAILURE 2003-01-27 Matteo Frigo * kernel/ifftw.h, kernel/planner.c: New NO_SEARCH planner flag, which avoids searching altogether. A wisdom entry must lead to a NO_SEARCH-grade plan, or else the wisdom entry is bogus. * libbench2/verify-lib.c: Use cosl()/sinl() when appropriate 2003-01-26 Matteo Frigo * kernel/planner.c, libbench2/problem.c, libbench2/speed.c, libbench2/verify.c: Use null pointers when estimating. The estimator should never time anything. 2003-01-26 Steven G. 
Johnson * api/f77api.c: note * api/Makefile.am, api/f77api.c, api/f77funcs.c, configure.ac: support multiple mangling schemes with g77 * tests/check.pl: fixed verbose, made random tests only use selected rank, use rank <= 4, fixed final flush_problems call * tests/check.pl: fixed typo (count instead of maxcount) * configure.ac: hypot is no longer used * configure.ac, kernel/ifftw.h: check for _alloca (MSVC) * kernel/alloc.c: slight fix in assert 2003-01-26 Matteo Frigo * libbench2/problem.c, libbench2/speed.c, libbench2/verify.c, tests/bench.c: Allocate problem in all cases--- can_do may need correct pointers. * tests/bench.c, tests/check.pl: Nastier checks * kernel/ifftw.h, kernel/plan.c, kernel/planner.c: X(use_plan) is a relic. * tests/Makefile.am: Print full pathname of the bench executable, so that I don't get confused when running multiple tests for different configurations. * libbench2/bench-main.c, libbench2/bench-user.h, tests/bench.c: Split done() into done() and cleanup(), in order to test multiple problems with the same planner from the command line. * kernel/alloc.c: Improved readability 2003-01-26 Steven G. Johnson * kernel/alloc.c: comment * kernel/alloc.c: added macos9 mpallocatealigned function 2003-01-25 Steven G. Johnson * kernel/alloc.c: sometimes __APPLE__ is defined instead of __MACOSX__ * kernel/alloc.c: macos x malloc is already 16-byte aligned 2003-01-25 Matteo Frigo * kernel/ifftw.h: Include because uintptr_t is defined there on solaris. * libbench2/Makefile.am, libbench2/getopt1.c: Oops---forgot getopt_long * configure.ac: Include default includes when checking for uintptr_t. (Otherwise solaris breaks.) * tests/Makefile.am: distribute check.pl * tests/check.pl: Check split format, too. 
* tests/Makefile.am, tests/check.pl: New tests, added make check 2003-01-23 Matteo Frigo * tests/check.pl: More tests 2003-01-22 Matteo Frigo * libbench2/problem.c, api/mktensor-iodims.c, api/mktensor-rowmajor.c: Deal with rnk(sz)=-infinity 2003-01-21 Matteo Frigo * TODO: Crazy idea * tests/check.pl: Test program, still barely worthy of the name. 2003-01-20 Matteo Frigo * libbench2/problem.c: Stylistic changes * api/Makefile.am, api/fftw3.h, api/flops.c, tests/bench.c: Implemented flops api 2003-01-19 Steven G. Johnson * libbench2/problem.c: cleanup * libbench2/problem.c: 'v' syntax now defaults to an 'internal' (stride 1) vector, which is a more interesting case and corresponds more closely to the intuitive notion of a 'vector' transform, while '*' does the old 'external' (stride n) vector * libbench2/problem.c: removed '/' overloading * libbench2/problem.c: get rid of '*' and ',' synonyms for 'x' in problem parser; there's no need to clutter the namespace with syntax we never use 2003-01-19 Matteo Frigo * kernel/planner.c: Signed/unsigned fixes. * libbench2/bench-user.h, libbench2/verify-dft.c, libbench2/verify.c, libbench2/verify.h, tests/bench.c, tests/hook.c: Test split arrays. 2003-01-19 Steven G. Johnson * doc/fftw3.texi: clarification * doc/fftw3.texi: caps * doc/fftw3.texi: brackets * doc/fftw3.texi: quote * doc/fftw3.texi: referencing * doc/fftw3.texi: fix * doc/fftw3.texi: slight change 2003-01-19 Matteo Frigo * libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify.h, tests/bench.c: Print errors when --verify. 2003-01-19 Steven G. Johnson * doc/fftw3.texi: improved description, noted that FFTW_ESTIMATE does not destroy arrays * api/fftw3.h: FFTW_DEFAULTS isn't really needed * api/fftw3.h, doc/fftw3.texi: added FFTW_MEASURE synonym for FFTW_DEFAULTS * kernel/alloc.c: slight change 2003-01-19 Matteo Frigo * tests/bench.c: Clearer name * api/fftw3.h, libbench2/tensor.c, tests/bench.c: Completed dft api test 2003-01-19 Steven G. 
Johnson * doc/fftw3.texi: index * doc/fftw3.texi: fix * doc/fftw3.texi: parallel structure * doc/fftw3.texi: fix * doc/fftw3.texi: joke * doc/fftw3.texi: recommendation to read tutorial in-order * doc/fftw3.texi: expanded outline * doc/fftw3.texi: clarification * doc/fftw3.texi: draft complex-dft tutorial 2003-01-18 Matteo Frigo * libbench2/allocate.c, libbench2/bench-main.c, libbench2/bench-user.h, libbench2/bench.h, libbench2/can-do.c, libbench2/dotens2.c, libbench2/info.c, libbench2/problem.c, libbench2/report.c, libbench2/speed.c, libbench2/timer.c, libbench2/util.c, libbench2/verify.c, libbench2/verify.h, libbench2/zero.c, tests/Makefile.am, tests/bench.c, tests/hook.c: Paranoid mode is back. Fixed dwim to do what I mean. 2003-01-18 Steven G. Johnson * doc/fftw3.texi: started tut. 2003-01-18 Matteo Frigo * libbench2/allocate.c, libbench2/bench-user.h, libbench2/bench.h, libbench2/can-do.c, libbench2/dotens2.c, libbench2/mflops.c, libbench2/problem.c, libbench2/report.c, libbench2/speed.c, libbench2/tensor.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify.c, libbench2/verify.h, libbench2/zero.c, tests/bench.c: Great renaming, so that we can include both bench-user.h and ifftw.h to implement the paranoid-mode hook. * libbench2/bench-user.h, libbench2/problem.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify.c, libbench2/verify.h: Trying to tweak the verifier so that I can use it in bench.c for paranoid mode * tests/bench.c: Added stride_factor for complex arrays. * tests/bench.c: can_do now calls the planner. * api/plan-guru-dft.c, tests/bench.c: Call guru api in bench.c * libbench2/bench.h, libbench2/zero.c: Fixed prototype. 
* api/api.h, api/apiplan.c, api/fftw3.h, api/mapflags.c, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, api/plan-r2r.c, kernel/ifftw.h: Attempt to make the signed/unsigned use of flags consistent. * libbench2/Makefile.am, libbench2/bench-main.c, libbench2/bench-user.h, libbench2/useropt.c, tests/bench.c: Implemented useropt. * api/mapflags.c: The first map_flags pass must be transitive, i.e., always use the latest flags value as opposed to the original value. (I think.) * libbench2/Makefile.am, libbench2/bench-user.h, libbench2/dotens2.c, libbench2/tensor.c, libbench2/verify-dft.c, libbench2/verify-lib.c, libbench2/verify.c, libbench2/verify.h, tests/Makefile.am: Started working on verifier 2003-01-17 Steven G. Johnson * api/fftw3.h, threads/api.c, threads/threads.c, threads/threads.h: added X(threads_cleanup) 2003-01-17 Matteo Frigo * libbench2/allocate.c, libbench2/tensor.c: Use C style for upper and lower array bounds. Free tensors properly. 
* libbench2/problem.c: Fixed ambiguous syntax * libbench2/problem.c: Parse minus sign, bugfixes * Makefile.am, configure.ac, libbench2/Makefile.am, libbench2/allocate.c, libbench2/bench-main.c, libbench2/bench-user.h, libbench2/bench.h, libbench2/can-do.c, libbench2/caset.c, libbench2/getopt-utils.c, libbench2/getopt.c, libbench2/getopt.h, libbench2/info.c, libbench2/main.c, libbench2/mflops.c, libbench2/ovtpvt.c, libbench2/pow2.c, libbench2/problem.c, libbench2/report.c, libbench2/speed.c, libbench2/tensor.c, libbench2/timer.c, libbench2/util.c, libbench2/verify.c, libbench2/zero.c, tests/Makefile.am, tests/bench.c: Skeleton libbench2 implemented (probably still buggy) * kernel/tensor4.c: Formatting 2003-01-17 fftw * doc/fftw3.texi: slight updates 2003-01-17 Steven G. Johnson * dft/vrank-geq1.c, kernel/buffered.c, kernel/ifftw.h, kernel/minmax.c, kernel/tensor4.c, rdft/buffered2.c, rdft/rdft2-inplace-strides.c, rdft/rdft2-tensor-max-index.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: eliminated obsolete uimin/uimax * Makefile.am, api/Makefile.am, api/configure.c, api/fftw3.h, api/plan-with-nthreads.c, tests/Makefile.am, tests/bench.c, threads/Makefile.am, threads/api.c, threads/threads.c: threads needs to have its own library, lest all programs linking to libfftw3.so need -lpthread * api/f77api.c: whoops * api/f77api.c: better name * api/f77api.c: added more functions 2003-01-16 Steven G. 
Johnson * kernel/ifftw.h: if 'long' is big enough, use it for mulmod in preference to 'long long' * configure.ac, kernel/align.c, kernel/ifftw.h: use uintptr_t for pointer alignment arithmetic 2003-01-16 Matteo Frigo * kernel/planner.c, kernel/print.c, kernel/tensor.c, kernel/twiddle.c, rdft/problem.c: More signed/unsigned cleanup * kernel/solvtab.c: null function pointers are technically nonportable * libbench/bench-main.c: Free short_options * kernel/alloc.c, kernel/ifftw.h, tests/bench.c: Oops, forgot STACK_FREE * kernel/alloc.c, kernel/ifftw.h: Do not require memalign() unless HAVE_SIMD 2003-01-16 Steven G. Johnson * kernel/alloc.c: MS VC++ _aligned_malloc * api/fftw3.h, kernel/alloc.c: added api fftw_malloc/free * api/map-r2r-kind.c: silence warning * tools/fftw-wisdom-to-conf.in: send error output to stderr 2003-01-15 Matteo Frigo * kernel/tensor7.c: Pure paranoia. * api/api.h, api/apiplan.c, api/configure.c, api/dfthelp.c, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-r2r.c, api/execute.c, api/export-wisdom-to-file.c, api/export-wisdom-to-string.c, api/export-wisdom.c, api/f77api.c, api/fftw3.h, api/forget-wisdom.c, api/import-system-wisdom.c, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, api/import-wisdom.c, api/map-r2r-kind.c, api/mapflags.c, api/mkprinter-file.c, api/mktensor-iodims.c, api/mktensor-rowmajor.c, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, api/plan-r2r.c, api/plan-with-nthreads.c, api/print-plan.c, api/rdft2-pad.c, api/the-planner.c, dft/buffered.c, 
dft/ct.c, dft/direct.c, dft/generic.c, dft/problem.c, dft/rader.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/Makefile.am, kernel/alloc.c, kernel/ifftw.h, kernel/planner.c, kernel/print.c, kernel/tensor.c, kernel/tensor9.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc.c, rdft/problem.c, rdft/problem2.c, rdft/rader-hc2hc.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Fixed formatting that was messed up by the conversion uint->int. Ensure that iodims etc are kosher. 2003-01-15 Steven G. Johnson * tools/fftw-wisdom-to-conf.in: added version stamp * tools/fftw-wisdom-to-conf.in: added warning * tools/Makefile.am: add fftw-wisdom-to-conf to BUILT_SOURCES * tools/fftw-wisdom-to-conf.in: added const * Makefile.am, configure.ac, tools/Makefile.am, tools/fftw-wisdom-to-conf.in: added wisdom-to-conf * kernel/planner.c: include type prefix in wisdom preamble * TODO: updates * tests/bench.c: check the_plan before printing 2003-01-15 Matteo Frigo * ChangeLog, api/api.h, api/apiplan.c, api/configure.c, api/dfthelp.c, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/execute-dft.c, api/execute-r2r.c, api/execute.c, api/export-wisdom-to-file.c, api/export-wisdom-to-string.c, api/export-wisdom.c, api/f77api.c, api/fftw3.h, api/forget-wisdom.c, api/import-system-wisdom.c, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, api/import-wisdom.c, api/map-r2r-kind.c, api/mapflags.c, api/mkprinter-file.c, api/mktensor-iodims.c, api/mktensor-rowmajor.c, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, 
api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c, api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c, api/plan-guru-r2r.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, api/plan-r2r.c, api/plan-with-nthreads.c, api/print-plan.c, api/rdft2-pad.c, api/the-planner.c, configure.ac, dft/buffered.c, dft/codelet-dft.h, dft/codelets/n.c, dft/codelets/t.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/dft.h, dft/direct.c, dft/generic.c, dft/indirect.c, dft/k7/k7.c, dft/problem.c, dft/rader-omega.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, dft/zero.c, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, kernel/align.c, kernel/buffered.c, kernel/ct.c, kernel/hash.c, kernel/iabs.c, kernel/ifftw.h, kernel/md5-1.c, kernel/md5.c, kernel/minmax.c, kernel/ops.c, kernel/pickdim.c, kernel/planner.c, kernel/primes.c, kernel/print.c, kernel/rader.c, kernel/scan.c, kernel/tensor.c, kernel/tensor1.c, kernel/tensor2.c, kernel/tensor4.c, kernel/tensor5.c, kernel/tensor7.c, kernel/trig.c, kernel/twiddle.c, libbench/acopy.c, libbench/allocate.c, libbench/ascale.c, libbench/aset.c, libbench/bench-user.h, libbench/bench.h, libbench/caadd.c, libbench/cacopy.c, libbench/cascale.c, libbench/caset.c, libbench/casub.c, libbench/copy-c2h-1d-fftpack.c, libbench/copy-c2h-1d-halfcomplex.c, libbench/copy-c2h-1d-packed.c, libbench/copy-c2h-1d-unpacked-ri.c, libbench/copy-c2h-unpacked.c, libbench/copy-c2r-packed.c, libbench/copy-c2r-unpacked.c, libbench/copy-c2ri.c, libbench/copy-h2c-1d-fftpack.c, 
libbench/copy-h2c-1d-halfcomplex.c, libbench/copy-h2c-1d-packed.c, libbench/copy-h2c-1d-unpacked-ri.c, libbench/copy-h2c-unpacked.c, libbench/copy-r2c-packed.c, libbench/copy-r2c-unpacked.c, libbench/copy-ri2c.c, libbench/getopt-utils.c, libbench/getopt.c, libbench/log2.c, libbench/mp.c, libbench/pow2.c, libbench/prime.c, libbench/problem.c, libbench/timer.c, libbench/verify.c, rdft/buffered.c, rdft/buffered2.c, rdft/codelet-rdft.h, rdft/codelets/hc2r.c, rdft/codelets/hfb.c, rdft/codelets/r2hc.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/indirect.c, rdft/problem.c, rdft/problem2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-inplace-strides.c, rdft/rdft2-radix2.c, rdft/rdft2-tensor-max-index.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, simd/3dnow.c, simd/sse.c, simd/sse2.c, tests/bench.c, tests/dotens.c, tests/dotens2.c, tests/trigtest.c, tests/verify-dft.c, tests/verify-lib.c, tests/verify-rdft.c, tests/verify-reodft.c, tests/verify.h, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/threads.c, threads/threads.h, threads/vrank-geq1-rdft2.c: Eliminated those unsigned values that would break LP64 machines. 2003-01-14 Steven G. Johnson * kernel/primes.c: comments 2003-01-14 Matteo Frigo * dft/generic.c, rdft/generic.c: Oops * dft/generic.c, rdft/generic.c: int/uint confusion 2003-01-14 Steven G. 
Johnson * doc/fftw3.texi: updated introduction and some organization * api/f77api.c: whoops * Makefile.am: newline * libbench/timer.c: added win32 timer * libbench/util.c: sync with kernel/alloc.c * api/f77api.c: handle missing F77_FUNC_ 2003-01-13 Steven G. Johnson * api/f77api.c: used fint instead of int to make Fortran integer type easier to change * api/f77api.c: slight abbreviation * api/Makefile.am, api/api.h, api/f77api.c, api/fftw3.h, api/mktensor-rowmajor.c, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-dft.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, api/plan-r2r.c, api/rdft2-pad.c, kernel/Makefile.am, kernel/ifftw.h, kernel/tensor3.c, tests/bench.c: the great lengthening, part I: int -> long in api; mv mktensor-rowmajor to api * configure.ac: long types 2003-01-13 Matteo Frigo * api/apiplan.c, api/export-wisdom-to-string.c, api/f77api.c, api/map-r2r-kind.c, api/plan-guru-r2r.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/plan-many-r2r.c, api/rdft2-pad.c, dft/buffered.c, dft/generic.c, dft/problem.c, dft/rader-omega.c, dft/rader.c, kernel/alloc.c, kernel/ifftw.h, kernel/plan.c, kernel/planner.c, kernel/print.c, kernel/problem.c, kernel/rader.c, kernel/scan.c, kernel/solver.c, kernel/stride.c, kernel/tensor.c, kernel/twiddle.c, rdft/buffered.c, rdft/buffered2.c, rdft/dht-rader.c, rdft/generic.c, rdft/problem.c, rdft/problem2.c, rdft/rader-hc2hc.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, tests/verify-dft.c, tests/verify-rdft.c, tests/verify-reodft.c, threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/threads.c, threads/vrank-geq1-rdft2.c: Renamed fftw_malloc -> MALLOC, X(free) -> 
X(ifree), X(free0) -> X(ifree0), non_fftw_malloc -> NATIVE_MALLOC 2003-01-13 Steven G. Johnson * api/Makefile.am, api/f77api.c: added beginning of Fortran interface * configure.ac: add fortran mangling check * api/Makefile.am, api/execute-r2r.c, api/fftw3.h, api/plan-guru-r2r.c: added guru r2r interface * api/fftw3.h, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c: whoops * api/Makefile.am, api/fftw3.h, api/map-r2r-kind.c, api/plan-many-r2r.c, api/plan-r2r-1d.c, api/plan-r2r-2d.c, api/plan-r2r-3d.c, api/plan-r2r.c: added r2r planner * configure.ac: more long-double checks * kernel/planner.c: slight regrouping * kernel/planner.c: added joke * api/Makefile.am, api/api.h, api/mktensor-rowmajor-pad.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c, api/rdft2-pad.c: simplified rdft2 padding * api/fftw3.h: added comment 2003-01-12 Steven G. Johnson * tests/bench.c: use latest api * api/fftw3.h, api/plan-dft-1d.c, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c, api/plan-dft.c: nembed should only be in advanced (many) interface, not basic interface...only a handful of people over the years have ever requested that functionality. 
* api/fftw3.h, api/mapflags.c: impatient is default; generalize mapping functions using xor trick * api/mktensor-rowmajor-pad.c, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c: use NULL nembed to signal padding * api/plan-many-dft.c: accept NULL nembed * api/Makefile.am, api/execute-dft-c2r.c, api/execute-dft-r2c.c, api/fftw3.h: added execute-dft-r2c/c2r * api/plan-dft.c: don't need dft.h * api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c: tensors are compressed in the problem, duh * kernel/alloc.c: noted that posix_memalign bug is now fixed, thanks to bug report by yours truly 2003-01-12 Matteo Frigo * api/plan-dft-3d.c, api/plan-dft-c2r-3d.c, api/plan-dft-r2c-3d.c: Bug: n[3] instead of n[2]. Bug was propagated by copy-and-paste. Grrr... * api/plan-dft.c: Express plan_dft() in terms of plan_many_dft() 2003-01-12 Steven G. Johnson * api/plan-guru-dft-c2r.c, api/plan-guru-dft-r2c.c, api/plan-guru-dft.c: whoops 2003-01-12 Matteo Frigo * Makefile.am, configure.ac, doc/Makefile.am, doc/fftw3.texi, genfft-k7/vK7Optimization.ml: Manual skeleton. 2003-01-12 Steven G. 
Johnson * api/Makefile.am, api/fftw3.h: added r2c/c2r guru api * api/plan-many-dft-c2r.c: FFTW_DESTROY_INPUT is default for c2r transforms * api/Makefile.am, api/fftw3.h, api/plan-dft-c2r-1d.c, api/plan-dft-c2r-2d.c, api/plan-dft-c2r-3d.c, api/plan-dft-c2r.c, api/plan-dft-r2c-1d.c, api/plan-dft-r2c-2d.c, api/plan-dft-r2c-3d.c, api/plan-dft-r2c.c: added more of r2c/c2r api * api/fftw3.h, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c: r2c doesn't have adjustable sign * TODO: note that copyright year is out of date * api/fftw3.h: updated api for r2c * api/mktensor-rowmajor-pad.c: removed annoying nophys == niphys case * api/Makefile.am, api/api.h, api/mktensor-rowmajor-pad.c, api/plan-many-dft-c2r.c, api/plan-many-dft-r2c.c: added basic r2c/c2r planner * api/plan-many-dft.c: dist should be in terms of complex values * api/fftw3.h: added plan-with-nthreads * api/Makefile.am, api/plan-with-nthreads.c: added function to set nthr 2003-01-11 Steven G. Johnson * api/fftw3.h: slight cleanup * api/mktensor-iodims.c: whoops * kernel/scan.c: maxlen is maximum string length, not including null termination * kernel/planner.c: imprt reverts hashtable on failure * api/fftw3.h: slight move * api/fftw3.h: stdio.h should be included outside of extern "C" * api/Makefile.am, api/api.h, api/fftw3.h: added guru planner API * api/fftw3.h: added FFTW_FORWARD/BACKWARD * api/Makefile.am, api/fftw3.h, api/plan-many-dft.c: added plan_many_dft * kernel/tensor3.c: indenting 2003-01-11 Matteo Frigo * tests/bench.c: Final \n * kernel/debug.c: Do not compile if not defined(FFTW_DEBUG), in order to avoid unused code in the shared library. * api/Makefile.am, api/api.h, api/export-wisdom-to-file.c, api/fftw3.h, api/mkprinter-file.c, api/print-plan.c, tests/bench.c: Implemented print_plan() 2003-01-11 Steven G.
Johnson * api/apiplan.c, api/fftw3.h, tests/bench.c: changed the OOP-like plan_destroy to the more-grammatical destroy_plan * api/execute-dft.c, api/Makefile.am, api/fftw3.h: added guru execute_dft * api/export-wisdom-to-string.c: allow for malloc errors in wisdom string, since non-fftw-malloc * api/the-planner.c: cleanup should reset plnr to zero so that fftw can be restarted * api/fftw3.h, api/mapflags.c: NO_UGLY is an internal planner flag 2003-01-11 Matteo Frigo * api/plan-dft-1d.c, tests/bench.c: Written 1d api in terms of generic n-d api. The code is less compact but easier to test * api/export-wisdom-to-file.c, api/export-wisdom-to-string.c, api/fftw3.h, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, kernel/alloc.c, kernel/assert.c, kernel/debug.c, kernel/ifftw.h, kernel/print.c, kernel/scan.c, tests/bench.c, tests/verify-lib.c, tests/verify-reodft.c: Added wisdom to header file, made scanners/printer static. stdio.h no longer needed in fftw.h, removed. Probably the printer_file should be reintroduced in a separate file if we ever want to print plans... 
* api/Makefile.am, api/apiplan.c, api/fftw3.h, api/plan-dft-2d.c, api/plan-dft-3d.c, api/plan-dft.c, tests/bench.c: Implemented more APIs * api/fftw3.h, api/the-planner.c, tests/bench.c: Added cleanup() to API * api/api.h, api/apiplan.c, api/fftw3.h, dft/buffered.c, dft/ct.c, dft/generic.c, dft/indirect.c, dft/rader.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/plan.c, kernel/planner.c, libbench/bench-user.h, libbench/bench.h, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0-rdft2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, tests/Makefile.am, tests/bench.c, threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Started new bench.c. I had to rename plan_destroy -> plan_destroy_internal to avoid conflicts with API 2003-01-11 Steven G. Johnson * api/Makefile.am, api/export-wisdom.c, api/import-system-wisdom.c, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, api/import-wisdom.c: fix types * api/export-wisdom-to-string.c: whoops * api/Makefile.am, api/export-wisdom-to-file.c, api/export-wisdom-to-string.c, api/export-wisdom.c, api/forget-wisdom.c, api/import-wisdom-from-file.c, api/import-wisdom-from-string.c, api/import-wisdom.c, kernel/Makefile.am, kernel/ifftw.h, kernel/printers.c, kernel/scanners.c: added wisdom api * api/mapflags.c: grammar * api/mapflags.c: slight change * api/fftw3.h, api/mapflags.c: implemented api/mapflags * kernel/ifftw.h: IMPATIENT is an api issue 2003-01-10 Steven G. Johnson * api/the-planner.c: removed un-needed headers * api/the-planner.c: mkplanner initializes nthr to 1 already 2003-01-09 Steven G. 
Johnson * api/fftw3.h: boilerplate * rdft/vrank-geq1.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c: fold vecloop into r{e,o}dft apply function to share buffer, etcetera * tests/verify-reodft.c: whoops, bugfix in impulse test for vecn > 1 * rdft/hc2hc-buf.c: bugfix, grr * rdft/codelet-rdft.h: fixed signed-ness enum problem 2003-01-09 Matteo Frigo * kernel/md5-1.c: Explicit cast * api/Makefile.am, api/api.h, api/apiplan.c, api/configure.c, api/fftw3.h, api/the-planner.c: Added configure_planner(). mkplan() behaves properly when plan is null. * api/Makefile.am, api/api.h, api/apiplan.c, api/execute.c, api/fftw3.h, api/mapflags.c, api/plan-dft-1d.c, tests/bench.c: More API work * Makefile.am, api/Makefile.am, api/api.h, api/dfthelp.c, api/fftw3.h, api/plan-dft-1d.c, api/the-planner.c, configure.ac, kernel/ifftw.h, kernel/trig.c, tests/Makefile.am: First skeleton of API infrastructure 2003-01-09 Steven G. Johnson * rdft/rdft2-tensor-max-index.c: unsigned strikes again * rdft/Makefile.am, rdft/problem2.c, rdft/rdft2-inplace-strides.c, rdft/rdft2-tensor-max-index.c, rdft/vrank-geq1-rdft2.c: put rdft2_inplace_strides and rdft2_tensor_max_index in their own files for tighter linking * rdft/rank-geq2-rdft2.c, rdft/rdft.h, rdft/vrank-geq1-rdft2.c: added rdft2_tensor_max_index...incorrect use of tensor_max_index was preventing proper loop ordering for rnk > 2 rdft2 * rdft/rank-geq2-rdft2.c: arbitrary spltrnk in rdft2 rank-geq2 * tests/bench.c: don't mention wisdom when non-verbose * dft/problem.c, rdft/problem.c, rdft/problem2.c: bug fix: printing %T should pass tensor *, not tensor ** * reodft/rodft00e-r2hc.c, tests/verify-reodft.c: correct(?) normalization for rodft00 ... 
all of the even/odd transforms should be normalized according to the "expanded" DFT of ~twice the length * tests/verify-reodft.c: fixed tests for n=1 * tests/bench.c: fixed bug in vector tests for rdft(2) * rdft/problem2.c: fixed handling when first rnk-1 dimensions compress to nothing (ugh) * rdft/Makefile.am, rdft/conf.c, rdft/nop2.c, rdft/rank0-rdft2.c, rdft/rdft.h: fixed incorrect/missing rdft2 rank-0 handling * rdft/problem2.c: bug fix: for rnk > 1, must compress rnk-1 dims separately (ugh) 2003-01-08 Steven G. Johnson * configure.ac: added trailing newline * ChangeLog: updated * rdft/problem.c: got rid of compiler warning * tests/bench.c: whoops, test r2hc and not rodft00 by default * rdft/buffered.c, rdft/indirect.c, rdft/problem.c, rdft/rank-geq2.c, rdft/rdft.h, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, tests/bench.c, tests/verify-reodft.c: got rid of real_n...use physical n everywhere in rdft; fixed rdft sz compression; fixed rodft00 verify bug 2003-01-08 Matteo Frigo * simd/Makefile.am, simd/sse-aux.c, simd/sse.c, simd/sse2-aux.c, simd/sse2.c: icc-6.0 bug workaround * kernel/ifftw.h, rdft/buffered2.c, rdft/rader-hc2hc.c, tests/bench.c: Reclaimed the fftw_real identifier, because I need it for the API * configure.ac: Use recommended AC_OUTPUT syntax * kernel/ifftw.h, tests/bench.c: Removed FFTW(foo) as a synonym for X(foo). This is an API issue. 2003-01-07 Steven G.
Johnson * simd/sse2.c: get rid of warning 2003-01-07 Matteo Frigo * dft/Makefile.am, dft/codelet-dft.h, dft/codelet.h, dft/codelets/inplace/Makefile.am, dft/codelets/n.c, dft/codelets/standard/Makefile.am, dft/codelets/t.c, dft/dft.h, dft/simd/codelets/Makefile.am, dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c, rdft/Makefile.am, rdft/codelet-rdft.h, rdft/codelet.h, rdft/codelets/hc2r.c, rdft/codelets/hc2r/Makefile.am, rdft/codelets/hfb.c, rdft/codelets/r2hc.c, rdft/codelets/r2hc/Makefile.am, rdft/rdft.h, support/Makefile.am, support/Makefile.codelets, support/codelet_prelude, support/codelet_prelude.dft, support/codelet_prelude.rdft: Renamed conflicting files */codelet.h into dft/codelet-dft.h and rdft/codelet-rdft.h 2003-01-07 Steven G. Johnson * ChangeLog: updated 2003-01-07 Matteo Frigo * simd/simd-3dnow.h, simd/simd-sse.h, simd/simd-sse2.h, simd/sse2.c: Silence warnings 2003-01-07 Steven G. Johnson * dft/rank-geq2.c, rdft/rank-geq2.c: fftw2 used spltrnk=1 2003-01-07 Matteo Frigo * dft/codelet.h, rdft/codelet.h, simd/simd-sse.h, simd/sse.c: Silence warning 2003-01-07 Steven G. Johnson * TODO: noted deficiency 2003-01-07 Matteo Frigo * rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: Strengthened conditions for a problem to be POSSIBLY_UNALIGNED * dft/vrank-geq1.c, kernel/align.c, kernel/ifftw.h: Strengthened conditions for a plan to be POSSIBLY_UNALIGNED 2003-01-05 Steven G. Johnson * TODO: added copyright todo * kernel/planner.c: modified comment * tests/verify-rdft.c: fixed comment * TODO, tests/verify-rdft.c: implemented rdft2 verify 2003-01-04 Steven G. Johnson * configure.ac: fix --enable-single 2002-10-23 Steven G. 
Johnson * threads/threads.c: slight fixes * threads/threads.c: typo 2002-10-01 Matteo Frigo * genfft/annotate.ml, genfft/annotate.mli, genfft/c.ml, genfft/genutil.ml, genfft/magic.ml: Experimental stuff 2002-09-28 Matteo Frigo * configure.ac, dft/simd/codelets/Makefile.am, genfft/gen_notw_c.ml, genfft/gen_twiddle_c.ml, genfft/genutil.ml: Experimental Franz mode 2002-09-26 Matteo Frigo * kernel/tensor.c: const-correct * dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/tensor7.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c: Reuse dimcmp routine for other purposes 2002-09-25 Matteo Frigo * dft/direct.c, kernel/ifftw.h, kernel/tensor.c, rdft/direct.c, rdft/direct2.c: Use tornk1 correctly. * rdft/rdft2-radix2.c: Hmm... I thought I had fixed this before... * dft/buffered.c, dft/rank0.c, kernel/tensor.c, rdft/buffered.c, rdft/buffered2.c, rdft/rank0.c: Collect more common idioms * dft/direct.c, rdft/direct.c, rdft/direct2.c: Still collecting common idioms... * dft/direct.c, rdft/direct.c, rdft/direct2.c: More garbage collection. * dft/buffered.c: More compact code * dft/buffered.c, dft/generic.c, dft/rader.c, kernel/alloc.c, kernel/ifftw.h, kernel/planner.c, kernel/stride.c, kernel/tensor.c, rdft/buffered.c, rdft/buffered2.c, rdft/dht-rader.c, rdft/generic.c, rdft/problem.c, rdft/rader-hc2hc.c: Collect common pattern if (foo) free(foo) ==> free0(foo) * dft/buffered.c, kernel/Makefile.am, kernel/buffered.c, kernel/ifftw.h, rdft/buffered.c, rdft/buffered2.c: Collect some common code in */buffered*.c 2002-09-24 Steven G. 
Johnson * rdft/problem.c, rdft/rdft.h: use STRUCT_HACK #define to determine rdft kind[] allocation * kernel/ifftw.h, kernel/planner.c: report total pcost of measured/estimated plans...epcost is especially useful to estimate the effects of various impatience flags on planning time for large transforms 2002-09-23 Matteo Frigo * kernel/Makefile.am, kernel/trig.c, kernel/trig1.c: Prevent unwanted inlining * kernel/ifftw.h, kernel/trig.c: Space compaction * kernel/Makefile.am, kernel/hash.c, kernel/ifftw.h, kernel/md5-1.c, kernel/planner.c, kernel/scan.c: Still reducing size 2002-09-22 Matteo Frigo * dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/direct.c, dft/generic.c, dft/indirect.c, dft/nop.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/ops.c, kernel/plan.c, rdft/buffered.c, rdft/buffered2.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Saved another 5KB by redesigning opcnt protocol. (gasp!) * dft/buffered.c, dft/direct.c, dft/indirect.c, dft/problem.c, dft/rank-geq2.c, kernel/Makefile.am, kernel/ifftw.h, kernel/tensor1.c, kernel/tensor4.c, kernel/tensor8.c, rdft/buffered.c, rdft/dft-r2hc.c, rdft/direct.c, rdft/hc2hc.c, rdft/indirect.c, rdft/problem.c, rdft/problem2.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c: More code compression * kernel/ifftw.h, kernel/solver.c: Smaller code size.
* dft/Makefile.am, dft/dft.h, dft/rader-omega.c, dft/rader.c, rdft/rader-hc2hc.c: Started unification of rader * rdft/rdft2-radix2.c: Typo * dft/buffered.c, dft/ct.c, dft/direct.c, dft/generic.c, dft/indirect.c, dft/nop.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/plan.c, kernel/problem.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Changed protocol for destroy_plan so as to save space. 
* dft/buffered.c, dft/ct.c, dft/generic.c, dft/indirect.c, dft/rader.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Introduced convenient function X(mkplan_d) * kernel/Makefile.am, kernel/md5-1.c, kernel/md5.c, kernel/tensor.c, kernel/tensor1.c, kernel/tensor2.c, kernel/tensor3.c, kernel/tensor4.c, kernel/tensor5.c, kernel/tensor7.c: Split tensor/md5 into separate files to allow independent linking and/or prevent undesired inlining * dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/dft.h, dft/direct.c, dft/generic.c, dft/indirect.c, dft/nop.c, dft/problem.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, dft/zero.c, kernel/ifftw.h, kernel/tensor.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/problem.c, rdft/problem2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, tests/debug.h, tests/dotens.c, tests/dotens2.c, tests/verify-dft.c, tests/verify-lib.c,
tests/verify-rdft.c, tests/verify-reodft.c, tests/verify.h: Treat all tensors as dynamically allocated objects. They were dynamically allocated in part anyway, so there is no point in complicating the object code with the clumsy calling conventions for by-value structs. 2002-09-21 Steven G. Johnson * kernel/ifftw.h: typo 2002-09-21 Matteo Frigo * tests/verify-lib.c: Avoid generating NaN when n = 0. * dft/dft.h, dft/problem.c, dft/rank-geq2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/hc2hc.c, rdft/problem.c, rdft/problem2.c, rdft/rank-geq2.c, rdft/rdft-dht.c, rdft/rdft.h, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, threads/dft-vrank-geq1.c: Saved more. * dft/buffered.c, dft/ct.c, dft/direct.c, dft/indirect.c, dft/nop.c, dft/problem.c, dft/rank-geq2.c, dft/vrank-geq1.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/pickdim.c, kernel/print.c, kernel/tensor.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/direct.c, rdft/hc2hc.c, rdft/indirect.c, rdft/nop.c, rdft/problem.c, rdft/problem2.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rdft-dht.c, rdft/rdft.h, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, tests/verify-dft.c, tests/verify-lib.c, tests/verify-rdft.c, tests/verify-reodft.c, threads/dft-vrank-geq1.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Save 1200 bytes of object code. Do not pass structs by value whenever practical, because the calling protocol generates clumsy code. * rdft/dht-rader.c: Do not allocate buffers for rader omegas. Let the planner do it if necessary. * tests/verify-rdft.c, tests/verify-reodft.c: Check rank *before* reading kind[0], which may be undefined if rnk < 1 * dft/rader.c, rdft/rader-hc2hc.c: Second step towards rader unification. 
* dft/rader.c, kernel/Makefile.am, kernel/ifftw.h, kernel/rader.c, rdft/dht-rader.c, rdft/rader-hc2hc.c: First step towards unification of Rader code * dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, kernel/Makefile.am, kernel/ct.c, kernel/ifftw.h, kernel/planner.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/rdft-dht.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, threads/ct-dit.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c: Fix ugliness condition for cooley-tukey. 2002-09-20 Matteo Frigo * dft/rader.c, kernel/ifftw.h, rdft/dht-rader.c, rdft/rader-hc2hc.c: Removed RADER_MIN_GOOD and associated machinery * rdft/dht-r2hc.c: Proper cast * kernel/planner.c: Typo * dft/generic.c, kernel/ifftw.h, rdft/dht-rader.c, rdft/generic.c, rdft/rdft-dht.c, tests/bench.c: Implemented NO_LARGE_GENERIC 2002-09-19 Matteo Frigo * kernel/ifftw.h, rdft/dht-r2hc.c: Consistent macroization of NO_DHT_R2HC * kernel/ifftw.h, kernel/planner.c, rdft/dht-r2hc.c, tests/bench.c: NO_DHT_R2HC is a planner flag, otherwise the EXHAUSTIVE planner loops. * kernel/ifftw.h, kernel/planner.c: Resurrected NO_EXHAUSTIVE 2002-09-18 Steven G. Johnson * threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: au revoir, score() * tests/bench.c, tests/verify-reodft.c: eliminated unused * kernel/planner.c: capitalize and parenthesize SUBSUMES * kernel/ifftw.h: comment 2002-09-18 Matteo Frigo * kernel/ifftw.h, kernel/planner.c: Use flags from wisdom if wisdom is applicable. 
* dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/direct.c, dft/generic.c, dft/indirect.c, dft/nop.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/rader-hc2hc.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rank0.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, tests/bench.c: Removed score() machinery * kernel/planner.c: Revised planner hack * simd/simd-altivec.h: Fix warning 2002-09-17 Matteo Frigo * dft/indirect.c, rdft/indirect.c: Type qualifiers. * kernel/planner.c: ESTIMATE is no longer subsumed by everything else. * dft/indirect.c, rdft/indirect.c: NO_BUFFERING is a planner flag, not a problem flag * kernel/ifftw.h, kernel/planner.c: Maintain flags in canonical form. * kernel/ifftw.h, kernel/planner.c: In dramatic break with tradition, SUBSUME is now a partial order. I swear. * kernel/planner.c: Added comment * kernel/ifftw.h, kernel/planner.c, tests/bench.c: Inverted ESTIMATE flag, renamed USE_SCORE for consistency with the convention that 0 subsumes 1. 2002-09-17 Steven G. Johnson * dft/indirect.c, kernel/ifftw.h, rdft/indirect.c, tests/bench.c: NO_INDIRECT -> NO_INDIRECT_OP (out-of-place only) * acx_pthread.m4: hpux needs -D_REENTRANT (thanks to Clinton Roy for the bug report) 2002-09-17 Matteo Frigo * kernel/planner.c: Oops. * kernel/ifftw.h, kernel/planner.c: Yet another attempt at getting the planner right. * kernel/planner.c: Better coding. 
* kernel/ifftw.h, kernel/planner.c: NO_UGLY is no longer a flag, but a separate planner field that does not interfere with wisdom. 2002-09-16 Matteo Frigo * tests/verify-reodft.c: Did not compile without FFTW_DEBUG * kernel/ifftw.h, kernel/plan.c, kernel/planner.c, tests/bench.c: Changed scoring mechanism. * kernel/planner.c: Count infeasible plans * kernel/planner.c: curse subsumed plans before export 2002-09-16 Steven G. Johnson * kernel/ifftw.h, kernel/planner.c: removed ESTIMATE_BIT vs. ESTIMATE... ESTIMATE | IMPATIENT is a UI issue * rdft/buffered2.c: cleanup * dft/buffered.c, rdft/buffered.c, rdft/buffered2.c: use CONSERVE_MEMORY flag to prevent buffered for large sizes * kernel/ifftw.h: moved NO_DHT_R2HC back into planner flags: there's no reason we would want this flag to block plan reuse * kernel/ifftw.h: whoops, commas * kernel/ifftw.h: problem_flags == checked in applicable, planner_flags == checked in score * kernel/ifftw.h, kernel/planner.c: ESTIMATE should not *include* all impatience flags, even if it subsumes them; some impatience flags, like NO_INDIRECT, might make a problem unsolvable * kernel/planner.c: quotation marks * kernel/planner.c: delete blank line * kernel/planner.c: substitution * kernel/planner.c: note that we are not GNUlly correct * kernel/planner.c: indenting * kernel/planner.c: more jokes * dft/ct-dit.c, dft/vrank-geq1.c, kernel/ifftw.h, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: NONTHREADED_ICKYP includes nthr > 1 check * kernel/md5.c: use md5sig * kernel/ifftw.h, kernel/planner.c: md5sig typedef * ChangeLog: updated * dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct.c, dft/indirect.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rdft-dht.c,
rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, tests/bench.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: partially-ordered impatience 2002-09-14 Matteo Frigo * kernel/Makefile.am, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, tests/bench.c: Removed all that planner inheritance crap. 2002-09-14 Steven G. Johnson * kernel/planner.c: string.h is used for more than strlen 2002-09-14 Matteo Frigo * kernel/ifftw.h, kernel/planner.c: Reduced hashtable size by 1/6 (on 32-bit machines) at the expense of messier planner. * tests/bench.c: Only print wisdom if verbose > 3 * genfft-k7/variable.ml, genfft/variable.ml: Changed syntax of temporaries to avoid shadowing library functions (which is harmless but I hate the warning) 2002-09-14 Steven G. Johnson * acinclude.m4, configure.ac, dft/rader.c, kernel/alloc.c, kernel/assert.c, kernel/ifftw.h, kernel/md5.c, kernel/planner-score.c, kernel/primes.c, kernel/scan.c, libbench/bench-user.h, libbench/bench.h, libbench/report.c, libbench/timer.c, libbench/util.c, libbench/verify.c, rdft/rader-hc2hc.c, tests/bench.c, tests/verify-lib.c: only add warnings in debug/maintainer mode, and add a few more warning flags; eliminate more warnings; add support for posix_memalign (broken in glibc, grrr) 2002-09-14 Matteo Frigo * kernel/twiddle.c: Explicit cast * kernel/ifftw.h, kernel/planner.c, kernel/primes.c: Use double-hashing. This allows a slightly higher load factor at the expense of a messier computation of the hashtable size. 2002-09-13 Steven G. Johnson * genfft/magic.ml: typo 2002-09-13 Matteo Frigo * kernel/planner.c: Slight change in hash table growth functions. * kernel/ifftw.h, kernel/planner.c: More statistics. * kernel/planner.c: Clearer logic. * kernel/planner.c: Oops. * kernel/planner.c: Cleaned up * kernel/planner.c: Deal properly with infeasible problems. 
* kernel/planner.c: Redundantly initialize hash table to prevent valgrind warnings. 2002-09-12 Matteo Frigo * kernel/md5.c: Removed relics from past. * kernel/ifftw.h, kernel/planner.c: md5hash a problem only once. * genfft-k7/genUtil.ml, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml: Renamed k7 codelets 2002-09-12 Steven G. Johnson * kernel/ifftw.h, rdft/dht-r2hc.c: FORBID_DHT_R2HC -> DHT_R2HC_VERBOTEN for consistency * kernel/ifftw.h: removed obsolete macro 2002-09-12 Matteo Frigo * dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c: Split flags in SIMD code. * threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/rdft-vrank-geq1.c, threads/vrank-geq1-rdft2.c: Forgot to fix threads * dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct.c, dft/indirect.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/dht-r2hc.c, rdft/dht-rader.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/rank-geq2.c, rdft/rdft-dht.c, rdft/rdft2-radix2.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, tests/bench.c: Split flags into planner_flags and problem_flags 2002-09-12 Steven G. Johnson * kernel/planner.c: tetrameter 2002-09-12 Matteo Frigo * kernel/planner.c: Overwrite less impatient solutions properly. * kernel/planner.c: Oops. * kernel/planner.c: Keep less impatient solution in case of conflict. Paranoid cast to uint in certain places. * kernel/ifftw.h, kernel/planner.c, tests/bench.c: Complete reimplementation of planner hash table. * kernel/planner.c: planner->cnt was not properly decremented. 2002-09-11 Steven G. Johnson * NEWS: typo 2002-09-09 Matteo Frigo * kernel/planner.c: Simplified * kernel/planner.c: Always overwrite old wisdom with new, in case the old is corrupt/conflicting. 2002-09-09 Steven G. 
Johnson * kernel/plan.c: added quote/joke 2002-09-09 Matteo Frigo * kernel/ifftw.h, kernel/md5.c, kernel/planner.c, kernel/print.c, kernel/scan.c, tests/bench.c: Completed wisdom import * dft/problem.c, kernel/ifftw.h, kernel/md5.c, rdft/problem.c, rdft/problem2.c: Slight cleanup of md5 interface. 2002-09-04 Matteo Frigo * kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c: More consistent protocol between planner and inferior. * kernel/planner.c: I can't think of any situation where saving infeasible problems would be desirable. Removed relevant code. * kernel/ifftw.h, kernel/planner.c, kernel/solvtab.c, tests/bench.c: Encode registrar's names in wisdom. Remove export_conf, since a separate program can now generate it. 2002-09-03 Matteo Frigo * kernel/planner.c: Fixed typo * kernel/planner.c: Fixed broken trochaic meter. * kernel/planner.c: Initialize planner->score. It is correct to leave it uninitialized, but I don't want people to send reports about purify complaining. * kernel/planner.c: More latin silliness 2002-09-02 Steven G. Johnson * ChangeLog: updated * kernel/timer.c: added clock() getseconds timer 2002-09-02 Matteo Frigo * rdft/indirect.c: Oops * dft/indirect.c, kernel/ifftw.h, rdft/indirect.c: Experimental INDIRECT_VERBOTEN flag (not used) * dft/buffered.c, dft/indirect.c, kernel/ifftw.h, rdft/buffered.c, rdft/buffered2.c, rdft/indirect.c: Do not allow buffering in children of indirect solvers. * kernel/planner.c: Oops * kernel/planner.c: Hash sizeof(R) as part of wisdom. 2002-09-02 Steven G. Johnson * configure.ac: added --enable-float synonym for --enable-single (since we have --enable-long-double) 2002-09-02 Matteo Frigo * dft/Makefile.am, dft/problem.c, dft/zero.c: zerotens is now in its own file, so it does not cause dft to be linked in if only rdft is used. * kernel/planner.c: Removed unused var. 
* kernel/planner.c: Split insert() in preparation for wisdom import * dft/Makefile.am, dft/dft.h, dft/verify.c, kernel/Makefile.am, kernel/dotens.c, kernel/dotens2.c, kernel/ifftw.h, kernel/verify-lib.c, kernel/verify.h, rdft/Makefile.am, rdft/rdft.h, rdft/verify.c, reodft/Makefile.am, reodft/reodft.h, reodft/verify.c, tests/Makefile.am, tests/bench.c, tests/debug.h, tests/dotens.c, tests/dotens2.c, tests/verify-dft.c, tests/verify-lib.c, tests/verify-rdft.c, tests/verify-reodft.c, tests/verify.h: Moved debugging infrastructure to test directory so that it is not linked into the shared library. * kernel/planner.c, kernel/print.c: Reactivated wisdom export * kernel/verify-lib.c: Dump errors to stderr, not stdout. * kernel/Makefile.am, kernel/ifftw.h, kernel/planner-score.c, kernel/traverse.c, tests/bench.c: Removed traverse.c. traverse.c is no longer needed for plan blessing. I figured out a way to avoid using it in planner-score.c, so the file is now redundant. 2002-09-01 Matteo Frigo * dft/conf.c, dft/dft.h, dft/problem.c, kernel/align.c, kernel/ifftw.h, kernel/planner.c, kernel/problem.c, kernel/scan.c, kernel/scanners.c, kernel/tensor.c, rdft/conf.c, rdft/problem.c, rdft/problem2.c, rdft/rdft.h: Removed code made obsolete by new MD5 scheme: problem equality tests, scanners, and associated list of problem kinds. * dft/problem.c, kernel/Makefile.am, kernel/ifftw.h, kernel/md5.c, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/tensor.c, rdft/problem.c, rdft/problem2.c: Started md5 implementation 2002-08-31 Matteo Frigo * kernel/ifftw.h, kernel/planner.c: Keep track of hit rate * kernel/planner.c: Only dump when verbose > 4 * dft/indirect.c, kernel/ifftw.h, kernel/plan.c, kernel/planner.c, tests/bench.c: Debugging infrastructure * kernel/planner.c, kernel/print.c: Use debug infrastructure to dump planner. 
* kernel/alloc.c, kernel/ifftw.h, kernel/plan.c, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/scan.c, kernel/scanners.c, tests/bench.c: Do not store plans in planner, plus general planner cleanup. 2002-08-30 Steven G. Johnson * kernel/ifftw.h, rdft/dht-r2hc.c: renamed IN_DHT_R2HC to the more general FORBID_DHT_R2HC * kernel/planner.c: eliminated unused var 2002-08-30 Matteo Frigo * kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c: Score planner was not working correctly when using wisdom. Fixed. * kernel/alloc.c, kernel/ifftw.h, kernel/stride.c: Use hash table in debug malloc 2002-08-30 Steven G. Johnson * NEWS: listed some good stuff * TODO: timed planner * TODO: fma? * TODO: update * rdft/Makefile.am, rdft/conf.c, rdft/dht-rader.c, rdft/rader-dht.c, rdft/rdft.h: rader-dht -> dht-rader * kernel/ifftw.h, rdft/Makefile.am, rdft/buffered2.c, rdft/conf.c, rdft/dht-r2hc.c, rdft/r2hc-hc2r.c, rdft/rader-dht.c, rdft/rank-geq2.c, rdft/rdft-dht.c, rdft/rdft.h: add DHT solver, and break up rader-dht and r2hc-hc2r * tests/bench.c: another option * dft/indirect.c, kernel/ifftw.h, kernel/tensor.c, rdft/indirect.c: generalized indirect solvers for fftw2-like buffering and more 2002-08-29 Steven G. Johnson * dft/vrank-geq1.c, kernel/ifftw.h, kernel/tensor.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: tensor_max_index and tensor_min_stride are now both unsigned * kernel/Makefile.am, kernel/iabs.c, kernel/ifftw.h, kernel/tensor.c, rdft/buffered2.c, rdft/problem2.c: added iabs.c, and tensor_min_stride returns min absolute value * rdft/buffered2.c: bug fix in cldrest hc2c/c2hc copy loops 2002-08-29 Matteo Frigo * TODO: Added things to do. 2002-08-29 Steven G. Johnson * configure.ac: added automake prereq 2002-08-29 Matteo Frigo * rdft/rdft2-radix2.c: Use indexed addressing * libbench/verify.c, rdft/rdft2-radix2.c: Ooops * kernel/ifftw.h: Oops 2002-08-29 Steven G. 
Johnson * threads/threads.c: updates to win32 threads code (ick) * Makefile.am, acx_pthread.m4, configure.ac, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/dft.h, dft/kdft-dif.c, dft/kdft-dit.c, dft/vrank-geq1.c, kernel/alloc.c, kernel/ifftw.h, kernel/planner.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/khc2hc-dif.c, rdft/khc2hc-dit.c, rdft/rdft.h, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, tests/Makefile.am, tests/bench.c, threads/Makefile.am, threads/conf.c, threads/ct-dit.c, threads/dft-vrank-geq1.c, threads/hc2hc-dif.c, threads/hc2hc-dit.c, threads/rdft-vrank-geq1.c, threads/threads.c, threads/threads.h, threads/vrank-geq1-rdft2.c: added threaded version 2002-08-28 Steven G. Johnson * kernel/Makefile.am: fix make dist * rdft/rank-geq2-rdft2.c: whoops, bugfix for inverse 2002-08-28 Matteo Frigo * Makefile.am, configure.ac, kernel/Makefile.am, kernel/dfftw3.h, kernel/fftw3.h, kernel/ifftw.h, kernel/lfftw3.h, kernel/sfftw3.h, tests/Makefile.am: Use C9x convention for naming (fftwf etc.). Removed installable header files since they will be part of the API. 2002-08-28 Steven G. Johnson * rdft/problem.c: allow _1 variants to accept rnk 0 (sz 1) problems 2002-08-27 Steven G. Johnson * ChangeLog: updated 2002-08-27 Matteo Frigo * dft/rank0.c: Loop unroll is useless * dft/ct-ditbuf.c: Use indexed addressing 2002-08-26 Matteo Frigo * dft/vrank2-transpose.c, dft/vrank3-transpose.c: Use indexed addressing in transpose routines. (Seems to be slightly better on athlon.) 2002-08-26 Steven G. Johnson * reodft/redft00e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c: added comment about stability 2002-08-26 Matteo Frigo * rdft/rdft2-radix2.c: Approximate opcount * dft/rank-geq2.c, rdft/rank-geq2.c, rdft/rdft2-radix2.c: Finished rdft2 via dft/rdft 2002-08-26 Steven G. 
Johnson * TODO: some updates * rdft/Makefile.am, rdft/buffered.c, rdft/buffered2.c, rdft/conf.c, rdft/dft-r2hc.c, rdft/direct.c, rdft/generic.c, rdft/hc2hc.c, rdft/indirect.c, rdft/problem.c, rdft/r2hc-hc2r.c, rdft/rader-dht.c, rdft/rader-hc2hc.c, rdft/rank-geq2.c, rdft/rdft.h, rdft/rdft2-radix2.c, rdft/verify.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c, reodft/verify.c, tests/bench.c: rdft kind is now per-dimension, added rdft/rank-geq2 * rdft/problem.c: added note * rdft/problem.c: must zero real sz * dft/rank-geq2.c, dft/vrank-geq1.c, kernel/Makefile.am, kernel/ifftw.h, kernel/pickdim.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, tests/bench.c: unified pickdim funcs 2002-08-25 fftw * libbench/mp.c, rdft/codelet.h, rdft/indirect.c, rdft/rank-geq2-rdft2.c, rdft/verify.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/verify.c: silence warnings 2002-08-25 Matteo Frigo * dft/codelet.h, dft/codelets/n.c, dft/codelets/t.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/direct.c, dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c, dft/vrank-geq1.c, kernel/ifftw.h, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c: I had to add another planner flag to record whether pointers could become unaligned because of vrank-geq1 solvers (these solvers only plan the first element of a vector problem, but the second element may have a different alignment). This addition is ugly, but I don't see any way around it. 
* TODO: Added thoughts * rdft/Makefile.am, rdft/conf.c, rdft/rdft.h, rdft/rdft2-dft.c, rdft/rdft2-radix2.c: Implemented rdft2 via vector rdft + radix2 step 2002-08-24 Matteo Frigo * rdft/rdft2-dft.c: Stylistic changes * dft/ct.c, dft/generic.c, kernel/ifftw.h, kernel/twiddle.c, rdft/generic.c, rdft/rdft2-dft.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c: Simplified mktwiddle interface * dft/ct-dif.c, dft/ct-dit.c, dft/ct.c, dft/ct.h, kernel/ifftw.h, kernel/tensor.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/rdft2-dft.c: Unification of certain vector computations. rdft2-dft is now a vector transform. * configure.ac, simd/sse.c, simd/sse2.c: Intel compiler seems to be still buggy 2002-08-23 Matteo Frigo * dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/generic.c, dft/indirect.c, kernel/ifftw.h, kernel/twiddle.c, rdft/generic.c, rdft/hc2hc.c, rdft/rdft2-dft.c, reodft/redft00e-r2hc.c, reodft/reodft010e-r2hc.c, reodft/reodft11e-r2hc.c, reodft/rodft00e-r2hc.c: Streamlined twiddle protocol * libbench/verify.c, rdft/Makefile.am, rdft/conf.c, rdft/rdft.h, rdft/rdft2-dft.c: Implemented rdft2 via dft (forward only for now) 2002-08-22 Matteo Frigo * kernel/verify-lib.c, libbench/verify.c: More cleanup of verify * kernel/verify-lib.c: Changed error criterion because old one was too strict * bootstrap.sh: Disable shared * TODO: Added thoughts * dft/generic.c: Oops * dft/generic.c, kernel/alloc.c, kernel/planner-score.c, kernel/tensor.c: Do not use inline. Minor changes. 2002-08-21 Steven G. Johnson * tests/bench.c: more commented flags 2002-08-20 Steven G. Johnson * reodft/Makefile.am, reodft/conf.c, reodft/reodft11e-r2hc.c, reodft/verify.c, tests/bench.c: added DCT-IV and DST-IV 2002-08-20 Matteo Frigo * genfft/twiddle.ml: Slight improvement in twiddle scheme 2002-08-20 Steven G. 
Johnson * reodft/conf.c, reodft/reodft.h, reodft/reodft010e-r2hc.c: name fix * reodft/reodft010e-r2hc.c: removed extraneous variable 2002-08-20 Matteo Frigo * libbench/mp.c, libbench/verify.c: Oops * genfft/twiddle.ml, kernel/trig.c: Still playing around 2002-08-19 Matteo Frigo * TODO, genfft/algsimp.ml, genfft/expr.ml, genfft/expr.mli, genfft/twiddle.ml, support/addchain.c: Playing around with addition chain 2002-08-19 Steven G. Johnson * reodft/redft00e-r2hc.c, reodft/rodft00e-r2hc.c: comments * reodft/reodft010e-r2hc.c: comment fixes * Makefile.am, configure.ac, dft/dft.h, rdft/rdft.h, reodft/Makefile.am, reodft/conf.c, reodft/redft00e-r2hc.c, reodft/reodft.h, reodft/reodft010e-r2hc.c, reodft/rodft00e-r2hc.c, reodft/verify.c, tests/Makefile.am, tests/bench.c: added reodft stuff 2002-08-18 Matteo Frigo * libbench/Makefile.am, libbench/verify.c: Sync with nbenchfft * genfft/complex.ml, genfft/complex.mli, genfft/twiddle.ml: Economy of thought 2002-08-17 Steven G. Johnson * support/Makefile.am: distribute addchain.c 2002-08-17 Matteo Frigo * support/addchain.c: Nothing serious * genfft/twiddle.ml, support/addchain.c: New twiddle policy (disabled for now) 2002-08-17 Steven G. Johnson * rdft/rank-geq2-rdft2.c: bug fix for hc2r (must use inverse dft) 2002-08-17 Matteo Frigo * dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, genfft/twiddle.ml, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am: New log3 twiddle policy 2002-08-16 Matteo Frigo * dft/verify.c, kernel/verify-lib.c, kernel/verify.h, rdft/verify.c: More verify cleanup * rdft/verify.c: Oops * dft/verify.c, kernel/Makefile.am, kernel/verify-lib.c, kernel/verify.h, rdft/verify.c: Economy of thought (and code) * TODO: Added comment * libbench/mp.c: Cleaner rounding algorithm * libbench/mp.c: Can get away with shorter length in bluestein (I think). 
* libbench/mp.c: Portability improvements * libbench/bench-main.c, libbench/bench.h, libbench/verify.c: Optionally average accuracy test over many rounds * dft/rader.c, rdft/rader-dht.c, rdft/rader-hc2hc.c: More accurate formula for trig tables * libbench/mp.c, libbench/verify.c: Implemented accuracy test for all integers 2002-08-15 Matteo Frigo * libbench/mp.c: inv, neg: make static * libbench/verify.c: Verify was not complete for real transforms * libbench/verify.c: Oops * genfft/gen_hc2hc.ml, libbench/verify.c: Fixed hb codelets * dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, rdft/codelets/r2hc/Makefile.am: Changed twiddle policy 2002-08-15 Steven G. Johnson * rdft/direct2.c: whoops 2002-08-15 Matteo Frigo * libbench/Makefile.am, tests/Makefile.am: No point in libbench being a shared library * libbench/Makefile.am, libbench/bench-main.c, libbench/bench.h, libbench/mp.c, libbench/util.c, libbench/verify.c, tests/Makefile.am, tests/accuracy.c, tests/mp.c: Moved accuracy test to libbench 2002-08-14 Matteo Frigo * tests/accuracy.c: Modified accuracy test * tests/accuracy.c, tests/mp.c: Fixes for long double * tests/accuracy.c: Normalize input * tests/accuracy.c: Oops * tests/accuracy.c: Also compute relative error * tests/accuracy.c: Loop over N * tests/Makefile.am, tests/accuracy.c, tests/mp.c: simple-minded accuracy test 2002-08-14 Steven G. Johnson * rdft/rank-geq2-rdft2.c: whoops 2002-08-13 Matteo Frigo * kernel/trig.c: fma() stuff is too nonportable, removed 2002-08-12 Steven G. Johnson * rdft/problem.c: slight fix * rdft/problem.c: use table for rdft_kind_str * rdft/problem2.c: slight fixes * kernel/ifftw.h, kernel/planner.c, kernel/tensor.c, rdft/Makefile.am, rdft/buffered2.c, rdft/conf.c, rdft/direct2.c, rdft/nop2.c, rdft/problem2.c, rdft/rdft.h, rdft/vrank-geq1-rdft2.c, tests/bench.c: multidimensional rdft2 2002-08-10 Steven G. 
Johnson * rdft/indirect.c: use tensor_copy_inplace * dft/rank-geq2.c: bugfix, use tensor_copy_inplace * dft/indirect.c: use tensor_copy_inplace * kernel/ifftw.h, kernel/tensor.c: added tensor_copy_inplace * kernel/twiddle.c: fixed trig-function table type 2002-08-10 Matteo Frigo * kernel/trig.c, tests/trigtest.c: Improved trig scheme * tests/trigtest.c: Allow for testing using long double instead of pari * kernel/trig.c, tests/trigtest.c: Yet another trig scheme. * kernel/trig.c, tests/trigtest.c: Yet another scheme * kernel/ifftw.h, kernel/trig.c, tests/trigtest.c: Careful with overflow * kernel/ifftw.h, kernel/trig.c, tests/trigtest.c: Avoid overflow 2002-08-09 Matteo Frigo * dft/rader.c, dft/verify.c, kernel/ifftw.h, kernel/trig.c, kernel/twiddle.c, rdft/rader-dht.c, rdft/rader-hc2hc.c, rdft/verify.c, tests/trigtest.c: New(er) trig routines * tests/bench.c: Oops * tests/trigtest.c: New file * TODO: Commented about likely gcc bug * dft/rader.c, dft/verify.c, kernel/Makefile.am, kernel/ifftw.h, kernel/trig.c, kernel/twiddle.c, rdft/rader-dht.c, rdft/rader-hc2hc.c, rdft/verify.c, tests/bench.c: Improved accuracy of twiddle factors 2002-08-08 Matteo Frigo * simd/simd-3dnow.h: Wrong comment 2002-08-07 Matteo Frigo * configure.ac, genfft/gen_notw_c.ml, genfft/gen_twiddle_c.ml, kernel/ifftw.h, simd/3dnow.c, simd/Makefile.am, simd/simd-3dnow.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h: Experimental 3dnow port using gcc, to compare it with Stefan's stuff. * genfft/c.ml, kernel/ifftw.h: End of AREF experiment * configure.ac: Oops * configure.ac: Pathetic attempt to reduce size of configure script * genfft/c.ml, kernel/ifftw.h: Changed array syntax for experiments. 2002-08-06 Matteo Frigo * simd/simd-sse2.h: Fix warning * dft/problem.c, kernel/align.c, kernel/ifftw.h, rdft/problem.c, rdft/problem2.c: Move nonportable stuff in one place. 
* kernel/planner.c: Economy of thought: I didn't like having two algorithms for removing solutions, both correct. At least now we have the same algorithm copied twice. * TODO: Added things to do 2002-08-05 Steven G. Johnson * kernel/ifftw.h, kernel/planner.c: improved interaction of planner with patience flags * rdft/buffered.c, rdft/codelet.h, rdft/indirect.c, rdft/problem.c, rdft/rader-hc2hc.c, rdft/rdft.h, tests/bench.c: set up for real-even/odd DFTs, where n is not the size of the data * dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, kernel/ifftw.h, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/hc2hc.c, rdft/hc2hc.h, rdft/r2hc-hc2r.c, tests/bench.c: DESTROY_INPUT flag * dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner.c, rdft/dft-r2hc.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, tests/bench.c: CLASSIC -> IMPATIENT 2002-08-04 Matteo Frigo * genfft-k7/Makefile.am, genfft/Makefile.am: Require make maintainer-clean to remove the generator, as opposed to make clean. In this way we can type make clean without regenerating all codelets. 2002-08-04 Steven G. 
Johnson * kernel/planner.c: ESTIMATE plans are not blessed * kernel/ifftw.h, kernel/planner.c: use flags in wisdom * dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/direct.c, dft/generic.c, dft/indirect.c, dft/nop.c, dft/rader.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/planner-score.c, kernel/tensor.c, rdft/buffered.c, rdft/buffered2.c, rdft/dft-r2hc.c, rdft/direct.c, rdft/direct2.c, rdft/generic.c, rdft/hc2hc-buf.c, rdft/hc2hc-dif.c, rdft/hc2hc-dit.c, rdft/indirect.c, rdft/nop.c, rdft/nop2.c, rdft/r2hc-hc2r.c, rdft/rader-dht.c, rdft/rader-hc2hc.c, rdft/rank0.c, rdft/vrank-geq1-rdft2.c, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c, tests/bench.c: score now takes plnr, not flags, as arg * acinclude.m4: align initial stack in alignment check, which should now pass for gcc 3.1.1 2002-08-04 Matteo Frigo * acinclude.m4: Detect ultrasparc (sort of) 2002-08-03 Steven G. Johnson * rdft/codelet.h: added solvtab_rdft_r2r placeholder 2002-08-03 Matteo Frigo * support/Makefile.codelets: Damn solaris 2002-08-03 Steven G. Johnson * rdft/problem.c: use E extended precision in solvers * rdft/codelet.h: an alternative notation for D{C,S}T: DXTio, where i/o are {0,1} according to whether the input/output are shifted, respectively. Alternatively, io is the binary representation of the usual DXT-{I,II,III,IV} nomenclature, minus 1. * dft/generic.c, dft/rader.c, rdft/generic.c, rdft/r2hc-hc2r.c, rdft/rader-dht.c: use E extended precision in solvers 2002-08-03 Matteo Frigo * configure.ac, kernel/cycle.h, kernel/planner.c, rdft/problem2.c: More portability fixes, compiler bugs workarounds, etc. 
* configure.ac, kernel/cycle.h, kernel/ifftw.h: More portability work * acinclude.m4, configure.ac, kernel/cycle.h, kernel/ifftw.h, support/Makefile.codelets: Improved portability, removed gnu make dependencies * TODO: Remember to thank XXX 2002-08-02 Matteo Frigo * simd/simd-altivec.h: Multiplication on altivec requires FMA with -0.0 to be IEEE754 compliant. * genfft/c.ml, kernel/ifftw.h: Allow for extended precision in codelets * dft/codelets/inplace/Makefile.am: Shortened names 2002-08-02 Steven G. Johnson * TODO, rdft/codelet.h, rdft/problem.c: added infrastructure for future r2r transforms 2002-08-02 Matteo Frigo * Makefile.am, configure.ac: Version info * dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/k7/codelets/Makefile.am, dft/simd/codelets/Makefile.am, kernel/align.c, rdft/codelets/hc2r/Makefile.am, rdft/codelets/r2hc/Makefile.am: Listened to one customer and added radix-12. Added radix-15 for consistency (whatever that is) 2002-08-01 Steven G. Johnson * kernel/cycle.h: whoops again, fixed the wrong line * kernel/cycle.h: whoops * configure.ac, kernel/planner.c: use new AC_INIT and add VERSION to wisdom * kernel/scan.c: mygetR -> getR * dft/problem.c, kernel/planner.c, kernel/scan.c, kernel/tensor.c, rdft/problem.c, rdft/problem2.c, tests/bench.c: scanner cleanups: just return 0/1, simplify integer reads 2002-08-01 Matteo Frigo * kernel/align.c: Reverted back to casting pointer to ulong * kernel/ifftw.h: Cast to unsigned long, not long 2002-08-01 Steven G. 
Johnson * kernel/scan.c: additional comment * kernel/scan.c: added comment * dft/conf.c, dft/dft.h, dft/problem.c, dft/verify.c, kernel/Makefile.am, kernel/alloc.c, kernel/assert.c, kernel/debug.c, kernel/ifftw.h, kernel/planner.c, kernel/print.c, kernel/printers.c, kernel/problem.c, kernel/scan.c, kernel/scanners.c, kernel/tensor.c, kernel/timer.c, rdft/conf.c, rdft/problem.c, rdft/problem2.c, rdft/rdft.h, rdft/verify.c, tests/bench.c: added wisdom import * kernel/align.c: whoops * dft/problem.c, rdft/problem.c, rdft/problem2.c: use %u for alignment_of * kernel/align.c: ptrdiff_t form 2002-08-01 Matteo Frigo * kernel/ifftw.h: Cast to avoid warning from C++ compiler 2002-07-31 Matteo Frigo * dft/problem.c, kernel/Makefile.am, kernel/align.c, kernel/ifftw.h, rdft/problem.c, rdft/problem2.c, simd/simd.h: Make problem equality depend on alignments. * dft/simd/codelets/Makefile.am: Shorter names * simd/simd-sse.h: Oops * simd/simd-sse.h: Fix warning * kernel/alloc.c, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c: Removed silly abstraction barrier. Also, cons() terminology was no longer appropriate. 2002-07-31 Steven G. Johnson * kernel/ifftw.h, kernel/planner.c, kernel/solvtab.c: removed register_registrar and solvtab_exec_reverse hacks 2002-07-30 Steven G. Johnson * kernel/planner.c: register_registrar doesn't search whole solver list (maybe we should change register_solver instead) * kernel/cycle.h: credit * kernel/cycle.h: added HP/UX ia64 support, courtesy of Teresa L. Johnson 2002-07-30 Matteo Frigo * dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c, kernel/alloc.c: Fixed alignment checks 2002-07-30 Steven G. 
Johnson * kernel/ifftw.h, kernel/planner.c, kernel/solvtab.c: ugh, wisdom id fixes in exprt_conf * kernel/ifftw.h, kernel/planner.c, tests/bench.c: exprt_registrars -> exprt_conf, added missing SOLVTAB_END * kernel/planner.c: exprt_registrars should output self-contained configuration * dft/conf.c, kernel/ifftw.h, kernel/planner.c, kernel/solvtab.c, rdft/conf.c, support/Makefile.codelets, tests/bench.c: added exprt_registrars * kernel/print.c: whoops 2002-07-30 Matteo Frigo * dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/t1b.c, dft/simd/t1b.h, dft/simd/t1f.c, dft/simd/t1f.h, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h: More stringent requirements on strides for SIMD codelets 2002-07-30 Steven G. Johnson * rdft/buffered2.c: remove warning * dft/problem.c, rdft/problem.c, rdft/problem2.c, kernel/print.c, kernel/traverse.c: use %td for ptrdiff_t and %T for tensors 2002-07-29 Matteo Frigo * dft/buffered.c: Fix for SIMD * kernel/ifftw.h: Missing lfftw_mkstride and lfftw_stride_destroy * simd/simd-altivec.h: Implement LDA/STA * dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h, simd/simd.h: More SIMD work * simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: Cleanup 2002-07-29 Steven G. Johnson * ChangeLog: update 2002-07-29 Matteo Frigo * dft/simd/n1b.c, dft/simd/n1f.c, dft/simd/t1b.c, dft/simd/t1f.c, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: Also check strides in SIMD codelets * simd/simd-altivec.h: Minor changes, mostly for consistency with the big-endian processor 2002-07-29 Steven G. Johnson * rdft/rader-dht.c: added comment * configure.ac, kernel/alloc.c: added code for icc's _mm_malloc (memalign replacement) 2002-07-28 Steven G. 
Johnson * rdft/problem2.c, rdft/verify.c, tests/bench.c: slight fixes * rdft/problem2.c: whoops 2002-07-28 Matteo Frigo * simd/altivec.c, simd/simd-altivec.h: Use vec_xor to change sign 2002-07-28 Steven G. Johnson * rdft/Makefile.am, rdft/buffered2.c, rdft/conf.c, rdft/direct2.c, rdft/khc2r.c, rdft/kr2hc.c, rdft/nop2.c, rdft/plan2.c, rdft/problem2.c, rdft/rdft.h, rdft/solve2.c, rdft/verify.c, rdft/vrank-geq1-rdft2.c, tests/bench.c: added rdft2 2002-07-28 Matteo Frigo * simd/simd-altivec.h: Optimized * simd/simd-altivec.h: Changed ALIGNMENT * simd/simd-sse.h: alignment := 8 * simd/simd-altivec.h: Avoid warning * simd/simd-sse2.h: Oops * genfft/annotate.ml, genfft/genutil.ml, genfft/simd.ml, genfft/variable.ml, genfft/variable.mli, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: New altivec experiment * simd/simd-altivec.h: Nothing * simd/simd-altivec.h: Oops * simd/simd-altivec.h: Nothing * simd/Makefile.am, simd/altivec.c, simd/simd-altivec.h: Constants are now in separate file. * simd/simd-altivec.h: More precise comment * simd/simd-altivec.h: gcc-3.1 bug workaround 2002-07-28 Steven G. Johnson * dft/buffered.c, dft/dft.h, dft/problem.c, rdft/buffered.c, rdft/problem.c, rdft/rdft.h: slight optimization, and exported zerotens functions * rdft/dft-r2hc.c: should be a plan_dft, not a plan_rdft 2002-07-28 Matteo Frigo * simd/simd-altivec.h: Optimizations. Make it work with vanilla non-Apple gcc. 2002-07-27 Steven G. Johnson * rdft/generic.c: whoops * rdft/generic.c: added hc2r (dif) * rdft/rader-hc2hc.c: add hc2r (dif) case 2002-07-27 Matteo Frigo * simd/simd-altivec.h, support/Makefile.codelets: Altivec port * kernel/twiddle.c: Fixed signed/unsigned bug. 2002-07-26 Matteo Frigo * dft/rank0.c, rdft/rank0.c: Make rank0 unapplicable to in-place problems. 2002-07-25 Steven G. 
Johnson * rdft/generic.c: only works for r odd 2002-07-25 Matteo Frigo * kernel/planner-score.c: Reinserted much better timing-avoidance heuristic * dft/buffered.c, kernel/ifftw.h, kernel/plan.c, kernel/planner-score.c, kernel/traverse.c, rdft/buffered.c, tests/bench.c: Score is now a property of the plan, not of the solver. Revised representation of closures. * genfft/gen_hc2r.ml, genfft/gen_r2hc.ml, rdft/codelets/hc2r/Makefile.am: Cosmetic changes. Added hc2r_128.c 2002-07-25 Steven G. Johnson * rdft/rader-dht.c: added hc2r * rdft/Makefile.am, rdft/hc2hc-buf.c, rdft/hc2hc-ditbuf.c, rdft/khc2hc-dif.c: added hc2hc-difbuf * rdft/Makefile.am, rdft/hc2hc-dif.c, rdft/hc2hc.c, rdft/khc2hc-dif.c, rdft/rdft.h: added rdft-dif * rdft/verify.c: whoops, hc2r must be conjugated to have right sign * dft/ct-dif.c: slight change * rdft/verify.c: whoops * rdft/Makefile.am, rdft/codelet.h, rdft/direct-r2hc.c, rdft/direct.c, rdft/khc2r.c, rdft/rdft.h: support hc2r codelets * rdft/dft-r2hc.c: use vector plan for r/i instead of two separate plans * dft/buffered.c, dft/rader.c, kernel/ifftw.h, rdft/buffered.c, rdft/rader-dht.c, rdft/rader-hc2hc.c: hack to allow rader/generic to work in-place for small prime sizes, instead of always using buffered 2002-07-24 Steven G. Johnson * rdft/Makefile.am, rdft/conf.c, rdft/generic.c: added rdft-generic * dft/generic.c: fixed add count * rdft/rader-hc2hc.c: again * rdft/rader-hc2hc.c: slight fix * rdft/rader-hc2hc.c: fixed comment * tests/bench.c: whoops * rdft/Makefile.am, rdft/conf.c, rdft/rader-hc2hc.c, rdft/rdft.h, tests/bench.c: added rader-hc2hc * dft/rader.c: whoops, initialize W * rdft/rader-dht.c: strides should not be unsigned * dft/rader.c: more stride sign fixes * dft/rader.c: strides should not be unsigned! 2002-07-23 Steven G. 
Johnson * rdft/dft-r2hc.c: added comment * rdft/r2hc-hc2r.c: another fix to op count * rdft/r2hc-hc2r.c: whoops * rdft/dft-r2hc.c, rdft/r2hc-hc2r.c: slight fix to op counts * rdft/Makefile.am, rdft/conf.c, rdft/dft-r2hc.c, rdft/rdft.h: added dft-r2hc * rdft/rader-dht.c: better comment and var. name * rdft/Makefile.am, rdft/conf.c, rdft/r2hc-hc2r.c, rdft/rdft.h, rdft/verify.c, tests/bench.c: fixed tests for hc2r, and added r2hc-hc2r * rdft/Makefile.am, rdft/conf.c, rdft/rader-dht.c, rdft/rdft.h: added rader-dht 2002-07-23 Matteo Frigo * rdft/codelets/r2hc/Makefile.am: Added r2hc_128, what the hell. * rdft/codelets/r2hc/Makefile.am: Added codelets that compute twiddle factors 2002-07-22 Steven G. Johnson * rdft/Makefile.am, rdft/buffered.c, rdft/conf.c: added rdft-buffered * rdft/Makefile.am, rdft/hc2hc-ditbuf.c, rdft/khc2hc-dit.c: added hc2hc-ditbuf * dft/generic.c: use STACK_MALLOC (alloca), since generic radix is always small * rdft/hc2hc-dit.c: small cleanup 2002-07-22 Matteo Frigo * rdft/problem.c: What the hell was I thinking? * rdft/problem.c: Reduced code size by using table instead of switch statement. * rdft/problem.c: Changed hash function to avoid collisions with DFT. 2002-07-22 Steven G. Johnson * rdft/hc2hc-dit.c: added missing file, whoops * rdft/hc2hc.c: whoops, generate enough twiddles for odd m * rdft/verify.c: don't try to verify R2HCII or HC2RIII plans * rdft/hc2hc.c: recursive case now works, I think * rdft/verify.c: add extra impulse test for debugging * rdft/direct-r2hc.c: whoops, multiply ios offset by stride (and rename to ioffset) * rdft/verify.c: whoops * genfft/gen_hc2hc.ml, rdft/Makefile.am, rdft/hc2hc.c, rdft/khc2hc-dit.c: added hc2hc-dit * kernel/twiddle.c: twiddles can be shared with smaller m's * rdft/Makefile.am, rdft/codelet.h, rdft/codelets/hfb.c, rdft/hc2hc.c, rdft/hc2hc.h: preparing for recursive rdft... 2002-07-21 Steven G. 
Johnson * rdft/verify.c: slight fix, to match libbench/verify.c * rdft/direct-r2hc.c: r2hcII has imag parts offset by n-1, not n. We can also allocate fewer strides. * rdft/rank0.c: delete unused var * rdft/Makefile.am, rdft/codelet.h, rdft/codelets/hc2r.c, rdft/codelets/r2hc.c, rdft/conf.c, rdft/direct-r2hc.c, rdft/indirect.c, rdft/khc2rIII.c, rdft/kr2hc.c, rdft/kr2hcII.c, rdft/nop.c, rdft/problem.c, rdft/rank0.c, rdft/rdft.h, rdft/vrank-geq1.c, rdft/vrank2-transpose.c, rdft/vrank3-transpose.c: added some rdft solvers * kernel/fftw3.h: pass identifier in FFTW() through another macro so that the mangled name can itself be a preprocessor symbol * dft/vrank-geq1.c: fix in comment * Makefile.am, rdft/rdft.h, tests/bench.c: bench tests rdft plans * rdft/codelet.h, tests/Makefile.am, tests/bench.c: make rdft.h and dft.h compatible * rdft/Makefile.am, rdft/problem.c, rdft/rdft.h, rdft/verify.c: first-draft rdft verify * rdft/khc2hc-dif.c, rdft/khc2hc-dit.c, rdft/khc2r.c, rdft/khc2rIII.c, rdft/kr2hc.c, rdft/kr2hcII.c: got rid of annoying warnings * rdft/Makefile.am, rdft/khc2hc-dif.c, rdft/khc2hc-dit.c, rdft/khc2r.c, rdft/khc2rIII.c, rdft/kr2hc.c, rdft/kr2hcII.c, rdft/rdft.h: added stub codelet registration for linking purposes * rdft/Makefile.am, rdft/conf.c, rdft/plan.c, rdft/problem.c, rdft/rdft.h, rdft/solve.c: basic rdft stuff * Makefile.am, configure.ac, dft/codelet.h, genfft/gen_hc2hc.ml, kernel/ifftw.h, rdft/Makefile.am, rdft/codelet.h, rdft/codelets/Makefile.am, rdft/codelets/hb.h, rdft/codelets/hc2r.c, rdft/codelets/hc2r.h, rdft/codelets/hc2r/Makefile.am, rdft/codelets/hc2rIII.h, rdft/codelets/hf.h, rdft/codelets/hfb.c, rdft/codelets/r2hc.c, rdft/codelets/r2hc.h, rdft/codelets/r2hc/Makefile.am, rdft/codelets/r2hcII.h: rdft codelets now compile 2002-07-20 Matteo Frigo * genfft/gen_hc2r.ml: Oops, was generating rdfts instead of hdfts * TODO, configure.ac, genfft-k7/twiddle.ml, genfft/twiddle.ml, kernel/twiddle.c, rdft/codelets/hc2r/Makefile.am: Added hc2r codelets 
* genfft/gen_hc2hc.ml: return W in hc2hc codelets * configure.ac, dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/simd/codelets/Makefile.am, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_r2hc.ml, genfft/trig.ml, rdft/codelets/r2hc/Makefile.am, support/Makefile.codelets: Some work on rdft codelets 2002-07-16 Matteo Frigo * kernel/fftw3.h: fix const * acinclude.m4, configure.ac, dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/indirect.c, dft/rank0.c, dft/simd/codelets/Makefile.am, kernel/version.c, tests/bench.c: Separate CFLAGS in codelets. Fix const in certain places. 2002-07-16 Steven G. Johnson * TODO: note buffering problem 2002-07-16 Matteo Frigo * dft/generic.c: Removed unpredictable branch from inner loop 2002-07-15 Steven G. Johnson * TODO: update * dft/generic.c: optimization * dft/Makefile.am, dft/conf.c, dft/dft.h, dft/generic.c, kernel/ifftw.h, kernel/twiddle.c: added generic dit * dft/rader.c: whoops, mksolver should be static 2002-07-15 Matteo Frigo * genfft/Makefile.am, genfft/algsimp.ml, genfft/c.ml, genfft/c.mli, genfft/gen_hc2hc.ml, genfft/gen_hc2r.ml, genfft/gen_r2hc.ml, genfft/genutil.ml: First implementation of gen_hc2hc, probably still buggy. 2002-07-15 Steven G. Johnson * dft/rader.c: don't count loading of twiddle factors in ops.other, since it isn't counted for the codelets * dft/ct.c, dft/rader.c, kernel/plan.c: plan_destroy puts plan to sleep before deallocating it, to eliminate duplicate free calls in solvers * dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct.c, dft/vrank-geq1.c, kernel/ifftw.h, tests/bench.c: fftw2-like vector recursion flag 2002-07-15 Matteo Frigo * kernel/planner.c: More jokes * tests/bench.c: Bless plan for testing purposes * kernel/planner.c: Canonical linked-list deletion (hope it is right) 2002-07-14 Steven G. 
Johnson * dft/rader.c: use estimating planner for cld_omega * dft/rader.c: better internal naming * dft/rader.c: printing should really be fixed now, grrr * dft/rader.c: print all distinct child plans * tests/bench.c: whoops * dft/rader.c: whoops, destroy should delete twiddle/omega from list * kernel/planner.c: whoops * kernel/ifftw.h, kernel/plan.c, kernel/planner.c: added plan_bless and FORGET_ACCURSED * kernel/traverse.c: further cleanup * kernel/traverse.c: slight cleanup * kernel/Makefile.am, kernel/ifftw.h, kernel/traverse.c, tests/bench.c: added traverse_plan via print (ugh) * dft/ct.c, kernel/ifftw.h, kernel/twiddle.c: added TW_FULL, and additional n parameter for twiddles * kernel/planner.c: whoops * kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c: save flags before invoking solver mkplan 2002-07-14 Matteo Frigo * TODO: *** empty log message *** 2002-07-14 Steven G. Johnson * configure.ac, kernel/cycle.h: added support for UNICOS _rtc() real-time-clock intrinsic function * kernel/timer.c: fixed typo: HAVE_TIME_H should include time.h, not sys/time.h * configure.ac, kernel/cycle.h: support AIX read_real_time timer 2002-07-13 Steven G. Johnson * configure.ac: use && instead of the (sigh) unportable -a * configure.ac: use AC_HELP_STRING * configure.ac, dft/codelet.h, dft/verify.c, kernel/Makefile.am, kernel/dfftw3.h, kernel/fftw3.h, kernel/ifftw.h, kernel/lfftw3.h, kernel/sfftw3.h, libbench/bench-main.c, libbench/bench-user.h, libbench/info.c, libbench/verify.c, simd/simd-sse2.h: support long-double precision * dft/rader.c: whoops whoops * dft/rader.c: whoops * TODO: buffered solver strides have been fixed * dft/rader.c: convention * TODO, dft/rader.c: share twiddle arrays in Rader * libbench/verify.c: call done() after verify 2002-07-12 Steven G. 
Johnson * tests/bench.c: output planner time with -v * kernel/print.c: support double outputs * dft/vrank-geq1.c: removed extraneous parens * dft/buffered.c: increase maxbufsz to 64k; makes a big difference for large 2d transforms 2002-07-12 Matteo Frigo * dft/vrank-geq1.c: Fix 2002-07-12 Steven G. Johnson * dft/rank-geq2.c: fix comment * kernel/tensor.c: fix in comment * ChangeLog: updated * TODO: buffered malloc's buffers * TODO, dft/rader.c: share more code between apply and apply_dit in Rader 2002-07-08 Matteo Frigo * simd/simd-sse.h, simd/simd-sse2.h, simd/sse.c, simd/sse2.c: Polished * support/Makefile.codelets: *** empty log message *** * dft/simd/codelets/Makefile.am, genfft/c.ml, genfft/gen_notw_c.ml, genfft/simd.ml, genfft/to_alist.ml, genfft/to_alist.mli, simd/simd-sse.h, simd/simd-sse2.h, support/Makefile.codelets: SIMD/FMA stuff * simd/simd-sse.h: Avoid code duplication * genfft/Makefile.am, genfft/to_alist.ml: Fixes for FMA+SIMD * dft/buffered.c, dft/codelets/standard/Makefile.am, dft/simd/Makefile.am, dft/simd/NAMING, dft/simd/codelets/Makefile.am, dft/simd/n1b.c, dft/simd/n1b.h, dft/simd/n1f.c, dft/simd/n1f.h, dft/simd/n2f.c, dft/simd/n2f.h, dft/simd/n3f.h, dft/simd/n4.c, dft/simd/n4.h, dft/simd/t1b.c, dft/simd/t1b.h, dft/simd/t1f.c, dft/simd/t1f.h, dft/simd/t2f.c, dft/simd/t2f.h, dft/simd/t3f.h, dft/simd/t4.c, dft/simd/t4.h, genfft/Makefile.am, genfft/algsimp.ml, genfft/annotate.ml, genfft/annotate.mli, genfft/c.ml, genfft/complex.ml, genfft/complex.mli, genfft/expr.ml, genfft/expr.mli, genfft/gen_athnotw.ml, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_notw_c.ml, genfft/gen_r2hc.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twiddle_c.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/oracle.ml, genfft/simd.ml, genfft/simd.mli, genfft/simdmagic.ml, genfft/to_alist.ml, genfft/trig.ml, genfft/trig.mli, genfft/twiddle.ml, kernel/ifftw.h, simd/simd-sse.h, simd/simd-sse2.h, simd/sse.c, 
simd/sse2.c, support/Makefile.codelets: Major changes in SIMD fftw 2002-07-05 Matteo Frigo * dft/buffered.c, simd/simd-altivec.h, simd/simd-sse.h: Use unpck instructions instead of shuffles * dft/codelets/n.c, dft/codelets/t.c, dft/ct-ditbuf.c, dft/verify.c, kernel/ifftw.h, kernel/planner.c, tests/bench.c: Minor tweaks * tests/bench.c: Use score planner * CONVENTIONS, dft/Makefile.am, dft/dft.h, dft/verify.c, kernel/Makefile.am, kernel/debug.c, kernel/dotens.c, kernel/dotens2.c, kernel/ifftw.h, tests/bench.c: Added verifier 2002-07-04 Matteo Frigo * dft/buffered.c, dft/codelet.h, dft/codelets/n.c, dft/codelets/t.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct.c, dft/ct.h, dft/simd/Makefile.am, dft/simd/NAMING, dft/simd/codelets/Makefile.am, dft/simd/n2f.c, dft/simd/n2f.h, dft/simd/n3f.h, dft/simd/n4.c, dft/simd/t2f.c, dft/simd/t2f.h, dft/simd/t3f.h, dft/simd/t4.c, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml, genfft/annotate.ml, genfft/gen_notw.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/simd.ml, genfft/simdmagic.ml, kernel/alloc.c, kernel/ifftw.h, simd/simd-sse.h, simd/simd-sse2.h: More simd codelets 2002-07-02 Matteo Frigo * dft/rank-geq2.c: Oops * dft/rank-geq2.c, dft/vrank-geq1.c: Fixed classic mode * genfft/simd.ml, simd/simd-altivec.h, simd/simd-sse.h, simd/simd-sse2.h: Use LDK for constants so that we can play games. 
* dft/codelet.h, dft/codelets/n.c, dft/simd/n4.c, dft/simd/t4.c, genfft-k7/gen_notw.ml, genfft/gen_notw.ml, genfft/genutil.ml, genfft/simd.ml, genfft/simd.mli, simd/simd-sse.h: Improved support for fixed strides * dft/codelet.h, dft/codelets/n.c, dft/codelets/n.h, dft/codelets/t.c, dft/codelets/t.h, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/direct.c, dft/k7/k7.c, dft/simd/n4.c, dft/simd/n4.h, dft/simd/t4.c, dft/simd/t4.h, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml, genfft/gen_notw.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml: Changed accounting of flops * genfft-k7/algsimp.ml, genfft-k7/to_alist.ml, genfft/algsimp.ml, simd/simd-sse2.h: Wrong code in non-fma mode * genfft/simdmagic.ml, kernel/alloc.c, simd/Makefile.am, simd/simd-sse2.h, simd/sse2.c: sse2 stuff 2002-07-01 Matteo Frigo * Makefile.am, dft/ct.c, dft/direct.c, dft/k7/k7.c, dft/simd/n4.c, dft/simd/t4.c, kernel/alloc.c, simd/Makefile.am, simd/simd-altivec.h, simd/simd-sse.h, simd/sse.c: Identify CPUs for special codelets * libbench/problem.c: Change split problem syntax * dft/simd/codelets/Makefile.am: Removed -fma flag * simd/simd-altivec.h: Work around gcc bug 2002-06-30 Matteo Frigo * genfft/algsimp.ml, genfft/magic.ml, genfft/oracle.ml, genfft/simd.ml, genfft/to_alist.ml: New simd stuff * dft/simd/codelets/Makefile.am, simd/Makefile.am, simd/simd-altivec.h, simd/simd-sse.h: Added altivec support * dft/simd/t4.c: Forgot file * Makefile.am, configure.ac, dft/Makefile.am, dft/codelet-k7.h, dft/codelet.h, dft/codelets/Makefile.am, dft/codelets/f.h, dft/codelets/inplace/Makefile.am, dft/codelets/n.c, dft/codelets/n.h, dft/codelets/q.h, dft/codelets/standard/Makefile.am, dft/codelets/t.c, dft/codelets/t.h, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/direct.c, dft/k7/Makefile.am, dft/k7/codelets/Makefile.am, dft/k7/ct-dif.c, dft/k7/ct-dit.c, dft/k7/ct-ditbuf.c, dft/k7/direct.c, dft/k7/k7.c, dft/k7/kdft-dif.c, dft/k7/kdft-dit.c, 
dft/k7/kdft.c, dft/simd/Makefile.am, dft/simd/NAMING, dft/simd/codelets/Makefile.am, dft/simd/n4.c, dft/simd/n4.h, dft/simd/t4.h, genfft-k7/genUtil.ml, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml, genfft/annotate.ml, genfft/c.ml, genfft/c.mli, genfft/gen_notw.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/magic.ml, genfft/simd.ml, genfft/simd.mli, genfft/simdmagic.ml, genfft/twiddle.ml, genfft/twiddle.mli, kernel/alloc.c, kernel/ifftw.h, libbench/bench-user.h, libbench/problem.c, libbench/util.c, simd/Makefile.am, simd/README, simd/simd-sse.h, simd/simd.h, support/Makefile.codelets, tests/Makefile.am, tests/bench.c: Progress towards simd implementation 2002-06-26 Matteo Frigo * dft/k7/codelets/Makefile.am: Add 128- codelet 2002-06-23 Matteo Frigo * configure.ac, genfft-k7/genUtil.ml, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml, genfft/c.ml, genfft/c.mli, genfft/expr.ml, genfft/expr.mli, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_r2hc.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/simd.ml, genfft/simd.mli, libbench/bench-main.c: More simd changes. Ensure proper stack alignment in k7 codelets. 
2002-06-22 Matteo Frigo * kernel/ifftw.h, kernel/solvtab.c: Fixed prototypes * kernel/cycle.h: Sparc cycle counter requires v9 * configure.ac, kernel/cycle.h, kernel/ifftw.h: Minor fixes * acinclude.m4: Fixed ev67 detection * tests/bench.c: Print flops * genfft/simd.ml: Nothing really * dft/codelet-k7.h, dft/codelet.h, genfft-k7/magic.ml, genfft-k7/to_alist.ml, genfft-k7/to_alist.mli, genfft/Makefile.am, genfft/c.ml, genfft/c.mli, genfft/gen_conv.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_r2hc.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/magic.ml, genfft/simd.ml, genfft/simd.mli, genfft/simdmagic.ml, genfft/to_alist.ml, kernel/ifftw.h: More simd work 2002-06-21 Matteo Frigo * genfft/Makefile.am, genfft/annotate.ml, genfft/annotate.mli, genfft/magic.ml, genfft/simd.ml, genfft/simd.mli, genfft/simdmagic.ml: More simd work 2002-06-20 Matteo Frigo * genfft/Makefile.am, genfft/annotate.ml, genfft/c.ml, genfft/c.mli, genfft/magic.ml, genfft/simd.ml, genfft/variable.ml, genfft/variable.mli: More simd work * genfft/annotate.ml, genfft/annotate.mli, genfft/gen_athnotw.ml, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_hc2r.ml, genfft/gen_notw.ml, genfft/gen_r2hc.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/magic.ml, genfft/twiddle.ml, genfft/twiddle.mli, genfft/variable.ml, genfft/variable.mli: Moving towards incorporation of simd stuff 2002-06-19 Matteo Frigo * Makefile.am, configure.ac, dft/Makefile.am, dft/codelets/Makefile.am, dft/ct-dif-k7.c, dft/ct-dit-k7.c, dft/ct-ditbuf-k7.c, dft/direct-k7.c, dft/k7/Makefile.am, dft/k7/codelets/Makefile.am, dft/k7/ct-dif.c, dft/k7/ct-dit.c, dft/k7/ct-ditbuf.c, dft/k7/direct.c, dft/k7/kdft-dif.c, dft/k7/kdft-dit.c, dft/k7/kdft.c, dft/kdft-dif-k7.c, dft/kdft-dit-k7.c, dft/kdft-k7.c: Reorganized k7 stuff into own directory * genfft-k7/expr.ml, genfft-k7/expr.mli, genfft/expr.ml, genfft/expr.mli, genfft/genutil.ml, 
genfft/magic.ml: Minor experimental stuff * genfft/expr.ml, genfft/expr.mli, genfft/genutil.ml: Cosmetic changes 2002-06-19 fftw * dft/buffered.c, dft/rader.c: allocate buffers on the fly 2002-06-18 Matteo Frigo * dft/Makefile.am, dft/codelet-k7.h, dft/ct-dif-k7.c, dft/ct-dif.c, dft/ct-dit-k7.c, dft/ct-dit.c, dft/ct-ditbuf-k7.c, dft/ct-ditbuf.c, dft/ct.c, dft/ct.h, dft/kdft-dit-k7.c, dft/rader.c, genfft-k7/Makefile.am, genfft-k7/assignmentsToVfpinstrs.ml, genfft-k7/gen_twiddle.ml, genfft-k7/k7Basics.ml, genfft-k7/k7Basics.mli, genfft-k7/k7RegisterAllocationBasics.ml, genfft-k7/k7RegisterAllocationBasics.mli, genfft-k7/k7RegisterAllocator.ml, genfft-k7/k7RegisterAllocatorInit.ml, genfft-k7/number.ml, genfft-k7/to_alist.ml, genfft/number.ml, genfft/to_alist.ml, kernel/ifftw.h, kernel/planner.c, kernel/primes.c: Added ct-ditbuf-k7.c. Major changes required in generator. * genfft-k7/gen_twiddle.ml, kernel/ifftw.h: Nothing, really * configure.ac: !SINGLE ==> !K7_MODE (for some reason the contrapositive sounds wrong) * dft/buffered.c: Buffer is now symmetric wrt forward/backward transform * dft/ct-dif.c, dft/indirect.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/Makefile.am, kernel/debug.c, kernel/ifftw.h, kernel/print.c: Fixed applicable() in indirect.c * dft/rader.c: Fixed attempt to free() uninitialized pointer. * CONVENTIONS, TODO, dft/rader.c, kernel/ifftw.h, kernel/plan.c, kernel/planner.c, tests/bench.c: Added reference counts for awake() 2002-06-18 Steven G.
Johnson * dft/rader.c: updated comment * TODO: slight update 2002-06-17 fftw * dft/rader.c, kernel/Makefile.am, kernel/ifftw.h, kernel/primes.c: moved prime-number stuff into primes.c, so it can be shared with generic codelet and with rfftw rader * dft/rader.c: added comment * dft/rader.c, kernel/ifftw.h, kernel/twiddle.c: added rader-dit * configure.ac, dft/Makefile.am, dft/conf.c, dft/dft.h, dft/rader.c: added initial Rader (no DIT yet) * acinclude.m4: don't warn about long long 2002-06-17 Matteo Frigo * dft/Makefile.am, dft/codelet-k7.h, dft/ct-dif-k7.c, dft/ct.h, dft/kdft-dif-k7.c, genfft-k7/gen_twiddle.ml, kernel/planner.c, tests/bench.c: Added k7 DIF codelets 2002-06-16 Matteo Frigo * TODO: Added stuff to do * dft/rank0.c: Handle dual case R = I + 1 * bootstrap.sh: Removed useless flag * mkdist.sh: Removed useless file * CLASSIC-MODE, Makefile.am, configure.ac, dft/Makefile.am, dft/buffered.c, dft/codelet-k7.h, dft/codelet.h, dft/codelets/inplace/Makefile.am, dft/ct-dif.c, dft/ct-dit-k7.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.h, dft/direct-k7.c, dft/direct.c, dft/indirect.c, dft/kdft-dit-k7.c, dft/kdft-dit.c, dft/nop.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, genfft-k7/Makefile.am, genfft-k7/genUtil.ml, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml, genfft-k7/twiddle.ml, genfft-k7/twiddle.mli, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/twiddle.ml, genfft/twiddle.mli, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/version.c, support/Makefile.codelets, tests/bench.c: More k7 work. Switched to runtime CLASSIC mode. 2002-06-16 Steven G. 
Johnson * kernel/tensor.c: spelling 2002-06-16 Matteo Frigo * dft/kdft-k7.c: Do not compile if not K7_MODE * dft/codelet-k7.h, dft/dft.h: Do not require K7 definitions to compile * dft/Makefile.am, dft/codelet-k7.h, dft/codelet.h, dft/direct-k7.c, genfft-k7/Makefile.am, genfft-k7/complex.ml, genfft-k7/complex.mli, genfft-k7/genUtil.ml, genfft-k7/gen_notw.ml, genfft-k7/gen_twiddle.ml: More k7 stuff 2002-06-15 Matteo Frigo * acinclude.m4: Try to be compatible with automake-1.6 * acinclude.m4, configure.ac, genfft-k7/Makefile.am, genfft-k7/algsimp.ml, genfft-k7/algsimp.mli, genfft-k7/assignmentsToVfpinstrs.ml, genfft-k7/assoctable.ml, genfft-k7/assoctable.mli, genfft-k7/complex.ml, genfft-k7/complex.mli, genfft-k7/expr.ml, genfft-k7/expr.mli, genfft-k7/exprdag.ml, genfft-k7/exprdag.mli, genfft-k7/genUtil.ml, genfft-k7/gen_notw.ml, genfft-k7/k7Unparsing.ml, genfft-k7/littlesimp.ml, genfft-k7/littlesimp.mli, genfft-k7/magic.ml, genfft-k7/monads.ml, genfft-k7/number.ml, genfft-k7/number.mli, genfft-k7/oracle.ml, genfft-k7/oracle.mli, genfft-k7/to_alist.ml, genfft-k7/to_alist.mli, genfft-k7/twiddle.ml, genfft-k7/twiddle.mli, genfft-k7/util.ml, genfft-k7/util.mli, genfft-k7/vFpUnparsing.ml, genfft-k7/vSimdBasics.ml, genfft-k7/vSimdUnparsing.ml, genfft-k7/variable.ml, genfft-k7/variable.mli, genfft/number.ml, support/Makefile.am, support/Makefile.codelets, support/codelet_asmprelude: More merging of Stefan's generator with main genfft branch * genfft-k7/Makefile.am, genfft-k7/complex.ml, genfft-k7/complex.mli, genfft-k7/expr.ml, genfft-k7/expr.mli, genfft-k7/exprdag.ml, genfft-k7/exprdag.mli, genfft-k7/fft.ml, genfft-k7/fft.mli, genfft-k7/genUtil.ml, genfft-k7/gen_hc2hc.ml, genfft-k7/gen_hc2real.ml, genfft-k7/gen_notw.ml, genfft-k7/gen_notwiddle.ml, genfft-k7/gen_notwiddle_fixedstride.ml, genfft-k7/gen_real2hc.ml, genfft-k7/gen_realeven.ml, genfft-k7/gen_realeven2.ml, genfft-k7/gen_realodd.ml, genfft-k7/gen_realodd2.ml, genfft-k7/gen_twiddle.ml, genfft-k7/magic.ml, 
genfft-k7/symmetry.ml, genfft-k7/twiddle.ml, genfft-k7/util.ml, genfft-k7/util.mli, genfft-k7/variable.ml, genfft-k7/variable.mli, genfft/expr.ml, genfft/expr.mli, genfft/genutil.ml, support/Makefile.codelets: Slowly merging genfft-k7 with main genfft branch * genfft-k7/Makefile.am, genfft-k7/genUtil.ml, genfft-k7/magic.ml, genfft-k7/magic.mli, genfft-k7/twiddle.ml, support/Makefile.codelets: Fixed, really * support/Makefile.codelets: Oops... * support/Makefile.codelets: Work properly when $(ALL_CODELETS) = "" * Makefile.am, configure.ac, dft/codelet.h, dft/codelets/Makefile.am, dft/conf.c, genfft-k7/gen_notwiddle.ml, kernel/ifftw.h, support/Makefile.codelets: Fixed k7 build machinery 2002-06-14 Matteo Frigo * Makefile.am, configure.ac, dft/codelet.h, dft/direct-k7.c, genfft-k7/Makefile.am, genfft-k7/codeletMisc.ml, genfft-k7/codeletMisc.mli, genfft-k7/genUtil.ml, genfft-k7/genUtil.mli, genfft-k7/gen_hc2hc.mli, genfft-k7/gen_hc2real.mli, genfft-k7/gen_notwiddle.ml, genfft-k7/gen_notwiddle.mli, genfft-k7/gen_real2hc.mli, genfft-k7/gen_realeven.mli, genfft-k7/gen_realeven2.mli, genfft-k7/gen_realodd.mli, genfft-k7/gen_realodd2.mli, genfft-k7/gen_twiddle.mli, genfft-k7/genfft.ml, genfft-k7/k7Basics.ml, genfft-k7/k7Basics.mli, genfft-k7/k7Unparsing.ml, genfft-k7/magic.ml, genfft-k7/magic.mli, kernel/ifftw.h, libbench/bench-user.h, support/Makefile.am, dft/Makefile.am, dft/conf.c, dft/dft.h, dft/kdft-k7.c, support/Makefile.codelets: More work on k7 stuff * dft/codelet.h, dft/direct.c, genfft/gen_notw.ml: Changed my mind again * genfft-k7/gen_notwiddle.ml: Removed some useless stuff. * genfft-k7/gen_notwiddle.ml: Hmm... 
* dft/codelet.h, dft/direct.c, genfft-k7/gen_notwiddle.ml, genfft/gen_notw.ml: More work in preparation for k7 stuff * TODO, dft/codelet.h, dft/direct.c, genfft/gen_notw.ml: Still preparing to include k7 stuff * bootstrap.sh: Create .depend * AUTHORS, Makefile.am, bootstrap.sh, configure.ac, genfft-k7/Makefile.am, genfft-k7/assignmentsToVfpinstrs.ml, genfft-k7/assignmentsToVfpinstrs.mli, genfft-k7/balanceVfpinstrs.ml, genfft-k7/balanceVfpinstrs.mli, genfft-k7/codeletMisc.ml, genfft-k7/codeletMisc.mli, genfft-k7/complex.ml, genfft-k7/complex.mli, genfft-k7/expr.ml, genfft-k7/expr.mli, genfft-k7/exprdag.ml, genfft-k7/exprdag.mli, genfft-k7/fft.ml, genfft-k7/fft.mli, genfft-k7/genUtil.ml, genfft-k7/genUtil.mli, genfft-k7/gen_hc2hc.ml, genfft-k7/gen_hc2hc.mli, genfft-k7/gen_hc2real.ml, genfft-k7/gen_hc2real.mli, genfft-k7/gen_notwiddle.ml, genfft-k7/gen_notwiddle.mli, genfft-k7/gen_notwiddle_fixedstride.ml, genfft-k7/gen_real2hc.ml, genfft-k7/gen_real2hc.mli, genfft-k7/gen_realeven.ml, genfft-k7/gen_realeven.mli, genfft-k7/gen_realeven2.ml, genfft-k7/gen_realeven2.mli, genfft-k7/gen_realodd.ml, genfft-k7/gen_realodd.mli, genfft-k7/gen_realodd2.ml, genfft-k7/gen_realodd2.mli, genfft-k7/gen_twiddle.ml, genfft-k7/gen_twiddle.mli, genfft-k7/genfft.ml, genfft-k7/id.ml, genfft-k7/id.mli, genfft-k7/k7Basics.ml, genfft-k7/k7Basics.mli, genfft-k7/k7ExecutionModel.ml, genfft-k7/k7ExecutionModel.mli, genfft-k7/k7FlatInstructionScheduling.ml, genfft-k7/k7FlatInstructionScheduling.mli, genfft-k7/k7InstructionSchedulingBasics.ml, genfft-k7/k7InstructionSchedulingBasics.mli, genfft-k7/k7RegisterAllocationBasics.ml, genfft-k7/k7RegisterAllocationBasics.mli, genfft-k7/k7RegisterAllocator.ml, genfft-k7/k7RegisterAllocator.mli, genfft-k7/k7RegisterAllocatorEATranslation.ml, genfft-k7/k7RegisterAllocatorEATranslation.mli, genfft-k7/k7RegisterAllocatorInit.ml, genfft-k7/k7RegisterAllocatorInit.mli, genfft-k7/k7RegisterReallocation.ml, genfft-k7/k7RegisterReallocation.mli, 
genfft-k7/k7Translate.ml, genfft-k7/k7Translate.mli, genfft-k7/k7Unparsing.ml, genfft-k7/k7Unparsing.mli, genfft-k7/k7Vectorization.ml, genfft-k7/k7Vectorization.mli, genfft-k7/magic.ml, genfft-k7/magic.mli, genfft-k7/memoMonad.ml, genfft-k7/memoMonad.mli, genfft-k7/nonDetMonad.ml, genfft-k7/nonDetMonad.mli, genfft-k7/nullVectorization.ml, genfft-k7/nullVectorization.mli, genfft-k7/number.ml, genfft-k7/number.mli, genfft-k7/stateMonad.ml, genfft-k7/stateMonad.mli, genfft-k7/symmetry.ml, genfft-k7/twiddle.ml, genfft-k7/util.ml, genfft-k7/util.mli, genfft-k7/vAnnotatedScheduler.ml, genfft-k7/vAnnotatedScheduler.mli, genfft-k7/vDag.ml, genfft-k7/vDag.mli, genfft-k7/vFpBasics.ml, genfft-k7/vFpBasics.mli, genfft-k7/vFpUnparsing.ml, genfft-k7/vFpUnparsing.mli, genfft-k7/vImproveSchedule.ml, genfft-k7/vImproveSchedule.mli, genfft-k7/vK7Optimization.ml, genfft-k7/vK7Optimization.mli, genfft-k7/vScheduler.ml, genfft-k7/vScheduler.mli, genfft-k7/vSimdBasics.ml, genfft-k7/vSimdBasics.mli, genfft-k7/vSimdIndexing.ml, genfft-k7/vSimdIndexing.mli, genfft-k7/vSimdUnparsing.ml, genfft-k7/vSimdUnparsing.mli, genfft-k7/variable.ml, genfft-k7/variable.mli: Imported Stefan's K7 generator 2002-06-13 Matteo Frigo * genfft/Makefile.am, genfft/c.ml, genfft/complex.ml, genfft/complex.mli, genfft/gen_hc2r.ml, genfft/gen_r2hc.ml, genfft/genutil.ml, genfft/trig.ml: Generator for real->halfcomplex and halfcomplex->real codelets * dft/problem.c, kernel/planner.c, kernel/tensor.c, tests/bench.c: Improved hash functions, printers * support/Makefile.codelets: Only regenerate codlist.c in maintainer mode * dft/problem.c, dft/rank-geq2.c, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/print.c, kernel/tensor.c, tests/bench.c: Planner can export solution list * dft/ct-ditbuf.c, dft/dft.h, dft/direct.c, kernel/cycle.h, libbench/bench-user.h: Fixed for intel compiler * dft/codelet.h, genfft/c.ml, genfft/c.mli, genfft/gen_notw.ml, genfft/gen_trig.ml, 
genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/magic.ml: Revised strategy for constants in codelets * tests/bench.c: Enable score planner in classic mode, naive planner in pro mode. 2002-06-12 Matteo Frigo * tests/bench.c: Report classic/pro * dft/buffered.c, tests/bench.c: Fixed behavior of buffered solver for large buffers. * dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/timer.c, libbench/timer.c, tests/bench.c: Make assumption COST(vector) = length * COST(scalar) in classic mode. * kernel/ifftw.h, kernel/plan.c, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, support/Makefile.codelets: Revised planner implementation in preparation for wisdom. * dft/ct-ditbuf.c: Manually hoist loop invariants. * dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c: Revised loop to compile better with gcc -O 2002-06-11 Matteo Frigo * kernel/tensor.c: Changed tensor syntax * TODO: Added stuff to do. * kernel/version.c: Report classic/pro in version number * CLASSIC-MODE, Makefile.am, RESEARCH-MODE, bootstrap.sh, configure.ac, dft/codelets/inplace/Makefile.am, dft/ct-dit.c, dft/ct-ditbuf.c, dft/kdft-dit.c, dft/rank-geq2.c, dft/vrank-geq1.c, kernel/ifftw.h, mkdist.sh, tests/bench.c: Renamed versions into classic/pro * kernel/Makefile.am, kernel/ifftw.h, kernel/planner-estimate.c, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, tests/bench.c: Revised planners, estimator * Makefile.am, dft/buffered.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, kernel/ifftw.h: I don't know what I am doing. 
* Makefile.am, dft/buffered.c, dft/codelet.h, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct-ditf.c, dft/ct.c, dft/dft.h, dft/direct.c, dft/indirect.c, dft/nop.c, dft/rank-geq2.c, dft/rank0.c, dft/vrank-geq1.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, genfft/c.ml, kernel/Makefile.am, kernel/cycle.h, kernel/flops.c, kernel/ifftw.h, kernel/ops.c, kernel/plan.c, kernel/planner-estimate.c, kernel/planner-naive.c, kernel/planner-score.c, tests/bench.c: Massive revision of estimator * dft/Makefile.am, dft/buffered.c, dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/conf.c, dft/ct-dit.c, dft/ct.c, dft/dft.h, dft/indirect.c, dft/problem.c, dft/rank-geq2.c, dft/vecloop.c, dft/vrank-geq1.c, kernel/ifftw.h, kernel/planner-estimate.c, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/print.c, kernel/tensor.c, kernel/timer.c, libbench/bench-main.c, tests/bench.c: Many changes * dft/ct-ditbuf.c: Keep it simple, stupid. 2002-06-10 Matteo Frigo * kernel/ifftw.h: Fixed when #undef PRECOMPUTE_ARRAY_INDICES * dft/vrank3-transpose.c, kernel/print.c: Minor changes * CONVENTIONS, configure.ac, dft/Makefile.am, dft/buffered.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditbuf.c, dft/ct.c, dft/ct.h, dft/dft.h, dft/direct.c, dft/indirect.c, dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, dft/kdft.c, dft/nop.c, dft/rank-geq2.c, dft/rank0.c, dft/vecloop.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/ifftw.h, kernel/plan.c, kernel/problem.c, kernel/timer.c, tests/bench.c: Added ct-ditbuf.c, many changes everywhere * kernel/ifftw.h, kernel/planner.c, tests/bench.c: More name mangling * Makefile.am, acinclude.m4, configure.ac, tests/Makefile.am: Fixed build system for single/double precision * CONVENTIONS, configure.ac, dft/buffered.c, dft/codelet.h, dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditf.c, dft/ct.c, dft/ct.h, dft/dft.h, dft/direct.c, dft/indirect.c, 
dft/kdft-dif.c, dft/kdft-difsq.c, dft/kdft-dit.c, dft/kdft.c, dft/nop.c, dft/plan.c, dft/problem.c, dft/rank-geq2.c, dft/rank0.c, dft/solve.c, dft/vecloop.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, genfft/gen_notw.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, kernel/Makefile.am, kernel/alloc.c, kernel/assert.c, kernel/awake.c, kernel/cycle.h, kernel/dfftw3.h, kernel/fftw.h, kernel/fftw3.h, kernel/flops.c, kernel/ifftw.h, kernel/minmax.c, kernel/plan.c, kernel/planner-estimate.c, kernel/planner-naive.c, kernel/planner-score.c, kernel/planner.c, kernel/print.c, kernel/problem.c, kernel/sfftw3.h, kernel/solver.c, kernel/solvtab.c, kernel/square.c, kernel/stride.c, kernel/tensor.c, kernel/timer.c, kernel/twiddle.c, kernel/version.c, support/Makefile.codelets, tests/bench.c: Massive renaming to support both single and double precision. (Must recompile everything twice). * libbench/allocate.c, libbench/bench-user.h, libbench/mflops.c, libbench/problem.c, tests/bench.c: Preliminary crude support for vector transforms in benchmark library. * kernel/tensor.c: Wrong cast 2002-06-09 Matteo Frigo * TODO: Added things to do. * kernel/twiddle.c: twlen0: make static * dft/buffered.c: Nothing * kernel/print.c: Forgot break in switch statement. 
* kernel/print.c: Fix for c++ compatibility * TODO, dft/buffered.c, dft/ct.c, dft/direct.c, dft/indirect.c, dft/nop.c, dft/rank-geq2.c, dft/rank0.c, dft/vecloop.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, kernel/Makefile.am, kernel/ifftw.h, kernel/planner.c, kernel/print.c, tests/bench.c: Added printer, changed everything * dft/buffered.c, dft/nop.c, dft/rank0.c, tests/bench.c: Removed redundant nop solver * TODO: More things to do * TODO, dft/Makefile.am, dft/buffered.c, dft/conf.c, dft/dft.h, dft/direct.c, dft/indirect.c, dft/nop.c, dft/problem.c, dft/rank0.c, dft/vecloop.c, kernel/ifftw.h, kernel/tensor.c: Introduced idea of rank -infinity and associated NOP plans * dft/buffered.c: Fixed comment * kernel/tensor.c: Removed useless assertions. * kernel/tensor.c: Don't malloc(0). * dft/buffered.c: Fixed signed/unsigned puns * dft/Makefile.am, dft/buffered.c, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditf.c, dft/dft.h, dft/direct.c, dft/indirect.c, dft/rank-geq2.c, dft/rank0.c, dft/vecloop.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, libbench/bench-main.c, libbench/bench-user.h, tests/bench.c: Added buffered.c 2002-06-08 Matteo Frigo * dft/ct.c: Fixed printout * dft/vrank3-transpose.c: Fixed comment * dft/Makefile.am, dft/conf.c, dft/dft.h, dft/vrank0-transpose.c, dft/vrank2-transpose.c, dft/vrank3-transpose.c, tests/bench.c: Added vrank3-transpose, renamed vrank0-transpose -> vrank2-transpose * bootstrap.sh, dft/Makefile.am, dft/conf.c, dft/direct.c, dft/rank-geq2.c, dft/rank0.c, dft/rank_geq2.c, dft/vrank0-transpose.c, tests/bench.c: Added vrank0-transpose * dft/Makefile.am, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditf.c, dft/direct.c, dft/indirect.c, dft/rank0.c, dft/rank_geq2.c, dft/vecloop.c, kernel/Makefile.am, kernel/ifftw.h, kernel/planner-score.c, kernel/planner.c, tests/bench.c: Added planner-score.c * dft/Makefile.am, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditf.c, dft/dft.h, dft/indirect.c, dft/rank_geq2.c, dft/vecloop.c: Added 
indirect.c * Makefile.am, dft/Makefile.am, dft/codelet.h, dft/codelets/Makefile.am, dft/codelets/inplace/Makefile.am, dft/conf.c, dft/ct-dif.c, dft/ct-dit.c, dft/ct-ditf.c, dft/dft.h, dft/direct.c, dft/kdft-dif.c, dft/kdft-difsq.c, mkdist.sh, tests/Makefile.am: dif, ditf solvers 2002-06-07 Matteo Frigo * Makefile.am, RESEARCH-MODE, bootstrap.sh, configure.ac, dft/Makefile.am, dft/conf.c, dft/dft.h, dft/rank_geq2.c, dft/vecloop.c, kernel/ifftw.h, kernel/minmax.c, kernel/planner.c, support/Makefile.codelets: Implemented rank_geq2. Revised build system * kernel/alloc.c: Fixed printout * Makefile.am, bootstrap.sh, configure.ac, dft/Makefile.am, dft/codelet.h, dft/codelets/Makefile.am, dft/codelets/inplace/Makefile.am, dft/codelets/standard/Makefile.am, dft/conf.c, dft/dft.h, dft/rank0.c, genfft/gen_notw.ml, tests/Makefile.am, tests/bench.c: Added rank0. Revised codelet organization. 2002-06-06 Matteo Frigo * dft/ct.c, dft/vecloop.c, genfft/trig.ml, kernel/ifftw.h, kernel/planner-estimate.c, kernel/planner-naive.c, kernel/planner.c, libbench/bench-user.h, tests/bench.c: Added memoization * dft/Makefile.am, dft/dft.h, dft/direct.c, dft/vecloop.c, kernel/alloc.c, kernel/ifftw.h, kernel/planner.c, tests/bench.c: Added vecloop 2002-06-05 Matteo Frigo * dft/Makefile.am, dft/ct-dit.c, dft/ct.c, dft/ct.h, dft/dft.h, dft/direct.c, dft/kdft-dit.c, kernel/alloc.c, kernel/twiddle.c: First DIT solver/plan * dft/Makefile.am, dft/ct.c, dft/ct.h, kernel/ifftw.h, kernel/stride.c, kernel/twiddle.c: More work on ct * kernel/ifftw.h, kernel/planner-naive.c, kernel/timer.c: Only use cycle counters * CONVENTIONS, bootstrap.sh, kernel/ifftw.h, kernel/twiddle.c: Signed/unsigned fixup * kernel/Makefile.am, kernel/ifftw.h, kernel/twiddle.c: New file twiddle.c 2002-06-04 Matteo Frigo * configure.ac, dft/Makefile.am, dft/direct.c, dft/kdft-dit.c, dft/problem.c, genfft/gen_notw.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, kernel/Makefile.am, kernel/codelet.h, kernel/flops.c, kernel/ifftw.h, 
	  kernel/tensor.c, kernel/timer.c, tests/bench.c: Made tensor ranks and vector lengths unsigned. Hopefully fixed all places where it matters.

	* Makefile.am, configure.ac, dft/Makefile.am, dft/dft.h, dft/direct.c, dft/kdft.c, kernel/Makefile.am, kernel/codelet.h, kernel/fftw.h, kernel/ifftw.h, kernel/planner-estimate.c, kernel/planner-naive.c, kernel/solvtab.c, libbench/Makefile.am, support/Makefile.codelets, tests/Makefile.am, tests/bench.c: System is in working state now (but very incomplete)

2002-06-03  Matteo Frigo

	* CONVENTIONS, kernel/Makefile.am, kernel/ifftw.h, kernel/planner-naive.c, kernel/planner.c: Started implementing planners

	* Makefile.am, configure.ac, libbench/Makefile.am, libbench/accopy-from.c, libbench/accopy-to.c, libbench/acopy.c, libbench/allocate.c, libbench/ascale.c, libbench/aset.c, libbench/bench-main.c, libbench/bench-user.h, libbench/bench.h, libbench/caadd.c, libbench/cacopy.c, libbench/can-do.c, libbench/cascale.c, libbench/caset.c, libbench/casub.c, libbench/ccopy-from.c, libbench/ccopy-to.c, libbench/copy-c2c-from.c, libbench/copy-c2c-to.c, libbench/copy-c2h-1d-fftpack.c, libbench/copy-c2h-1d-halfcomplex.c, libbench/copy-c2h-1d-packed.c, libbench/copy-c2h-1d-unpacked-ri.c, libbench/copy-c2h-unpacked.c, libbench/copy-c2h.c, libbench/copy-c2r-packed.c, libbench/copy-c2r-unpacked.c, libbench/copy-c2r.c, libbench/copy-c2ri.c, libbench/copy-h2c-1d-fftpack.c, libbench/copy-h2c-1d-halfcomplex.c, libbench/copy-h2c-1d-packed.c, libbench/copy-h2c-1d-unpacked-ri.c, libbench/copy-h2c-unpacked.c, libbench/copy-h2c.c, libbench/copy-r2c-packed.c, libbench/copy-r2c-unpacked.c, libbench/copy-r2c.c, libbench/copy-ri2c.c, libbench/deallocate.c, libbench/getopt-utils.c, libbench/getopt.c, libbench/getopt.h, libbench/getopt1.c, libbench/info.c, libbench/log2.c, libbench/main.c, libbench/mflops.c, libbench/ovtpvt.c, libbench/pow2.c, libbench/prime.c, libbench/problem.c, libbench/report.c, libbench/speed.c, libbench/timer.c, libbench/unnormalize.c,
	  libbench/util.c, libbench/verify.c, libbench/zero.c: Imported libbench from the new benchfft. We will use libbench for benchmarking and testing.

	* kernel/Makefile.am, kernel/ifftw.h, kernel/rand.c, kernel/timer.c: Removed useless rand.c

	* CONVENTIONS, dft/problem.c, kernel/Makefile.am, kernel/alloc.c, kernel/cycle.h, kernel/ifftw.h, kernel/plan.c, kernel/timer.c: Added timer

	* configure.ac: Split codelets into standard and inplace

2002-06-02  Matteo Frigo

	* CONVENTIONS, Makefile.am, dft/Makefile.am, dft/dft.h, dft/direct.c, dft/plan.c, dft/problem.c, dft/solve.c, kernel/Makefile.am, kernel/awake.c, kernel/ifftw.h, kernel/square.c: Many many changes

	* kernel/codelet.h: Fixed anachronism

	* genfft/littlesimp.mli, genfft/magic.ml, genfft/monads.ml, genfft/number.ml, genfft/number.mli, genfft/oracle.ml, genfft/oracle.mli, genfft/schedule.ml, genfft/schedule.mli, genfft/to_alist.ml, genfft/to_alist.mli, genfft/trig.ml, genfft/trig.mli, genfft/twiddle.ml, genfft/twiddle.mli, genfft/unique.ml, genfft/unique.mli, genfft/util.ml, genfft/util.mli, genfft/variable.ml, genfft/variable.mli, kernel/Makefile.am, kernel/alloc.c, kernel/assert.c, kernel/codelet.h, kernel/fftw.h, kernel/flops.c, kernel/ifftw.h, kernel/minmax.c, kernel/plan.c, kernel/problem.c, kernel/rand.c, kernel/solver.c, kernel/stride.c, kernel/tensor.c, kernel/version.c, support/Makefile.am, support/Makefile.codelets, support/codelet_prelude: Initial import

	* genfft/littlesimp.mli, genfft/magic.ml, genfft/monads.ml, genfft/number.ml, genfft/number.mli, genfft/oracle.ml, genfft/oracle.mli, genfft/schedule.ml, genfft/schedule.mli, genfft/to_alist.ml, genfft/to_alist.mli, genfft/trig.ml, genfft/trig.mli, genfft/twiddle.ml, genfft/twiddle.mli, genfft/unique.ml, genfft/unique.mli, genfft/util.ml, genfft/util.mli, genfft/variable.ml, genfft/variable.mli, kernel/Makefile.am, kernel/alloc.c, kernel/assert.c, kernel/codelet.h, kernel/fftw.h, kernel/flops.c, kernel/ifftw.h, kernel/minmax.c, kernel/plan.c, kernel/problem.c,
	  kernel/rand.c, kernel/solver.c, kernel/stride.c, kernel/tensor.c, kernel/version.c, support/Makefile.am, support/Makefile.codelets, support/codelet_prelude: New file.

	* AUTHORS, COPYRIGHT, ChangeLog, Makefile.am, NEWS, README, acinclude.m4, bootstrap.sh, configure.ac, dft/Makefile.am, dft/dft.h, dft/problem.c, genfft/Makefile.am, genfft/algsimp.ml, genfft/algsimp.mli, genfft/annotate.ml, genfft/annotate.mli, genfft/assoctable.ml, genfft/assoctable.mli, genfft/c.ml, genfft/c.mli, genfft/complex.ml, genfft/complex.mli, genfft/conv.ml, genfft/conv.mli, genfft/dag.ml, genfft/dag.mli, genfft/expr.ml, genfft/expr.mli, genfft/fft.ml, genfft/fft.mli, genfft/gen_athnotw.ml, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_notw.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/littlesimp.ml: Initial import

	* AUTHORS, COPYRIGHT, ChangeLog, Makefile.am, NEWS, README, acinclude.m4, bootstrap.sh, configure.ac, dft/Makefile.am, dft/dft.h, dft/problem.c, genfft/Makefile.am, genfft/algsimp.ml, genfft/algsimp.mli, genfft/annotate.ml, genfft/annotate.mli, genfft/assoctable.ml, genfft/assoctable.mli, genfft/c.ml, genfft/c.mli, genfft/complex.ml, genfft/complex.mli, genfft/conv.ml, genfft/conv.mli, genfft/dag.ml, genfft/dag.mli, genfft/expr.ml, genfft/expr.mli, genfft/fft.ml, genfft/fft.mli, genfft/gen_athnotw.ml, genfft/gen_athtw.ml, genfft/gen_conv.ml, genfft/gen_notw.ml, genfft/gen_trig.ml, genfft/gen_twiddle.ml, genfft/gen_twidsq.ml, genfft/genutil.ml, genfft/littlesimp.ml: New file.