6 commits

Author SHA1 Message Date
Jens Arnold
0030ae28b5 Get rid of .rept in inline asm() blocks where possible. Using .rept causes gcc to wrongly estimate the size of the asm(), leading to (potential) compilation problems. This is necessary for the upcoming restructuring, and should fix ARMv6+ sim builds as well. No functional change.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@25004 a1c6a512-1295-4272-9138-f99709370657
2010-03-03 20:52:02 +00:00
Jens Arnold
9f6586698a APE codec: Speed up decoding of -c2000 and higher on ARMv4 and coldfire by fusing vector math for the filters. Speedup is roughly 3.5% for -c2000, 8% for -c3000 and 12% for -c4000. To be extended to other architectures.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24473 a1c6a512-1295-4272-9138-f99709370657
2010-02-02 22:50:21 +00:00
Jens Arnold
2a5053f58c Several tweaks and cleanups:
* Use .rept instead of repeated macros for repeating blocks.
* Use MUL (variant) instead of MLA (variant) in the first step of the ARM scalarproduct() if there's no loop.
* Unroll ARM assembler functions to 32 where not already done, plus the generic scalarproduct().
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19144 a1c6a512-1295-4272-9138-f99709370657
2008-11-19 21:31:33 +00:00
Jens Arnold
35f23267bf Further optimised the filter vector math assembly for coldfire, and added assembly filter vector math for ARM. Both make use of the fact that the first argument of the vector functions is longword aligned.
* The ARM version is tailored for ARM7TDMI, and would slow down arm9 or higher. Introduced a new CPU_ macro for ARM7TDMI.
Speedup for coldfire: -c3000 104%->109%, -c4000 43%->46%, -c5000 1.7%->2.0%.
Speedup for PP502x: -c2000 66%->75%, -c3000 37%->48%, -c4000 11%->18%, -c5000 2.5%->3.7%
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@15302 a1c6a512-1295-4272-9138-f99709370657
2007-10-25 18:58:16 +00:00
Jens Arnold
2e9c77cc2a APE codec: Further optimised filtering yields 3..4% speedup for -c2000 (now 135% realtime), -c3000 (now 97% realtime) and higher modes. Single 32 bit stores are faster than movem/lea in IRAM.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@15200 a1c6a512-1295-4272-9138-f99709370657
2007-10-19 07:30:55 +00:00
Jens Arnold
2640bdb262 APE codec: Assembler optimised vector math routines for coldfire. -c2000 is now usable at 130% realtime (was 107%), -c3000 is near realtime (93%, was 64%). -c1000 doesn't change.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@15194 a1c6a512-1295-4272-9138-f99709370657
2007-10-18 22:37:33 +00:00