rockbox

Author	SHA1	Message	Date
Jens Arnold	0030ae28b5	Get rid of .rept in inline asm() blocks where possible. Using .rept causes gcc to wrongly estimate the size of the asm(), leading to (potential) compilation problems. This is necessary for the upcoming restructuring, and should fix ARMv6+ sim builds as well. No functional change. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@25004 a1c6a512-1295-4272-9138-f99709370657	2010-03-03 20:52:02 +00:00
Jens Arnold	9f6586698a	APE codec: Speed up decoding of -c2000 and higher on ARMv4 and coldfire by fusing vector math for the filters. Speedup is roughly 3.5% for -c2000, 8% for -c3000 and 12% for -c4000. To be extended to other architectures. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24473 a1c6a512-1295-4272-9138-f99709370657	2010-02-02 22:50:21 +00:00
Jens Arnold	2a5053f58c	Several tweaks and cleanups: * Use .rept instead of repeated macros for repeating blocks. * Use MUL (variant) instead of MLA (variant) in the first step of the ARM scalarproduct() if there's no loop. * Unroll ARM assembler functions to 32 where not already done, plus the generic scalarproduct(). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19144 a1c6a512-1295-4272-9138-f99709370657	2008-11-19 21:31:33 +00:00
Jens Arnold	35f23267bf	Further optimised the filter vector math assembly for coldfire, and added assembly filter vector math for ARM. Both make use of the fact that the first argument of the vector functions is longword aligned. * The ARM version is tailored for ARM7TDMI, and would slow down arm9 or higher. Introduced a new CPU_ macro for ARM7TDMI. Speedup for coldfire: -c3000 104%->109%, -c4000 43%->46%, -c5000 1.7%->2.0%. Speedup for PP502x: -c2000 66%->75%, -c3000 37%->48%, -c4000 11%->18%, -c5000 2.5%->3.7% git-svn-id: svn://svn.rockbox.org/rockbox/trunk@15302 a1c6a512-1295-4272-9138-f99709370657	2007-10-25 18:58:16 +00:00
Jens Arnold	2e9c77cc2a	APE codec: Further optimised filtering yields 3..4% speedup for -c2000 (now 135% realtime), -c3000 (now 97% realtime) and higher modes. Single 32 bit stores are faster than movem/lea in IRAM. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@15200 a1c6a512-1295-4272-9138-f99709370657	2007-10-19 07:30:55 +00:00
Jens Arnold	2640bdb262	APE codec: Assembler optimised vector math routines for coldfire. -c2000 is now usable at 130% realtime (was 107%), -c3000 is near realtime (93%, was 64%). -c1000 doesn't change. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@15194 a1c6a512-1295-4272-9138-f99709370657	2007-10-18 22:37:33 +00:00

6 commits
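
Note on "fusing vector math" (commit 9f6586698a): the commit message does not show the code, so the following is only a minimal portable C sketch of the general idea, not the actual Rockbox routine. It assumes the filter needs a scalar product of a coefficient vector with one input vector and an update of the coefficient vector by a second vector, and it performs both in a single pass so that each element is loaded only once. The function name fused_sp_add and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the Rockbox sources.
 * Returns the scalar product of v1 and f2, computed on the values
 * v1 held before the update, and adds s2 to v1 in the same loop. */
static int32_t fused_sp_add(int16_t *v1, const int16_t *f2,
                            const int16_t *s2, size_t n)
{
    assert(v1 != NULL && f2 != NULL && s2 != NULL);

    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Use the old v1[i] for the product, then update it. */
        sum += (int32_t)v1[i] * f2[i];
        v1[i] = (int16_t)(v1[i] + s2[i]);
    }
    return sum;
}
```

Compared with calling a separate scalar product and a separate vector add, the fused loop walks v1 once instead of twice. Hand-written assembly versions would typically also unroll the loop, as the 2008-11-19 commit describes.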