rockbox

Author	SHA1	Message	Date
Jens Arnold	0a291fff12	APE: Fused vector math for the filters on ARMv5te. Speedup on Cowon D2 is ~4% for -c2000..-c4000 (less for -c5000). Thanks to Frank Gevaerts for testing. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24590 a1c6a512-1295-4272-9138-f99709370657	2010-02-10 23:23:17 +00:00
Jens Arnold	1cc4bd8f86	APE: Fused vector math for the filters on ARMv6. Speedup is ~2.5% for -c2000, ~7% for -c3000 and higher. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24569 a1c6a512-1295-4272-9138-f99709370657	2010-02-08 21:59:24 +00:00
Jens Arnold	69fe1ad830	Put back the insane buffer where it belongs on non-ARM, and simplify the selection. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24512 a1c6a512-1295-4272-9138-f99709370657	2010-02-04 20:20:10 +00:00
Andrew Mahone	723d5c6da6	Fix yellow: add newline at EOF in udiv32_arm-pre.S git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24508 a1c6a512-1295-4272-9138-f99709370657	2010-02-04 08:55:36 +00:00
Andrew Mahone	b1caf4a07d	Use all available codec iram for reciprocal table in APE codec on ARMv4. Done by linking first with the table empty to determine free space, then sizing table to fill it. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24507 a1c6a512-1295-4272-9138-f99709370657	2010-02-04 08:45:38 +00:00
Andrew Mahone	8ed7bda64c	Move udiv32_arm.S into libdemac, as this divider is specialized for the APE codec and an optimized divider is already provided for general use in codeclib. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24506 a1c6a512-1295-4272-9138-f99709370657	2010-02-04 05:49:37 +00:00
Jens Arnold	9f6586698a	APE codec: Speed up decoding of -c2000 and higher on ARMv4 and coldfire by fusing vector math for the filters. Speedup is roughly 3.5% for -c2000, 8% for -c3000 and 12% for -c4000. To be extended to other architectures. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24473 a1c6a512-1295-4272-9138-f99709370657	2010-02-02 22:50:21 +00:00
Andrew Mahone	436f4d3a20	Improve libdemac SATURATE slightly on ARMv4/5, move filter buffers and code out of IRAM for sizes that aren't near realtime and extend udiv32_arm reciprocal table. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24376 a1c6a512-1295-4272-9138-f99709370657	2010-01-30 02:20:54 +00:00
Andrew Mahone	e76f30a57c	Improvements to specialized dividers for APE codec: * Use Newton-Raphson divider on ARMv5e and ARMv6, about 7% speedup on Gigabeat S. * On ARMv4 targets using IRAM, remove insane filter buffer from IRAM, fill available IRAM with LUT of reciprocals for small divisors - speedup varies according to target and available IRAM, APE normal sample is approx. 109% RT on e200. * Rename apps/codecs/lib/udiv32_armv4.S to apps/codecs/lib/udiv32_arm.S, which includes dividers for all ARM targets specialized for APE. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24354 a1c6a512-1295-4272-9138-f99709370657	2010-01-28 02:28:52 +00:00
Michael Sparmann	099df2fb71	Make the codecs use more IRAM on S5L870x, as we have plenty of it. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@23594 a1c6a512-1295-4272-9138-f99709370657	2009-11-09 20:01:07 +00:00
Jens Arnold	82dc91a102	Don't use ldrd/strd on ARMv5 since not all revisions support them and the gain from using them is minimal (basically code size only). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@21916 a1c6a512-1295-4272-9138-f99709370657	2009-07-17 09:17:54 +00:00
Jens Arnold	6421f94c0d	Silence warning from 'ar' if the archive had to be created. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@20151 a1c6a512-1295-4272-9138-f99709370657	2009-03-01 09:04:15 +00:00
Daniel Stenberg	2e6d604bb6	Stop hiding errors by redirecting stderr to /dev/null. If we really need to re-introduce it somewhere, we should rather make it dependent on the V variable, so that make V=1 would still show the error and only "normal" builds would hide it. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@20090 a1c6a512-1295-4272-9138-f99709370657	2009-02-23 08:45:16 +00:00
Björn Stenberg	6427d127aa	Calculate watermark from bitrate and harddisk spinup time. Use a smaller PCM buffer on targets with 2MB or less ram. (FS#9703) git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19743 a1c6a512-1295-4272-9138-f99709370657	2009-01-10 21:10:56 +00:00
Bertrik Sikken	32c2f455d1	static/const/#include/tab police on various files git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19643 a1c6a512-1295-4272-9138-f99709370657	2009-01-02 21:43:52 +00:00
Bertrik Sikken	8e22f7f5b0	Make local functions static in codecs, where possible. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19612 a1c6a512-1295-4272-9138-f99709370657	2008-12-29 19:49:48 +00:00
Jens Arnold	ed945e31c1	Slight speedup for the APE filters. Most noticeable on coldfire (+3.5% for -c2000), but also helps on the arm targets (+0.9% for -c2000 on PP5002). This transformation is overflow safe, as absres < 2^24 is guaranteed. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19556 a1c6a512-1295-4272-9138-f99709370657	2008-12-22 08:33:49 +00:00
Jens Arnold	dca9f42cdf	Fix decoding of stereo frames with silence in only one channel. * Make the standalone decoder contain debugging information. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19552 a1c6a512-1295-4272-9138-f99709370657	2008-12-21 23:49:02 +00:00
Jens Arnold	0bf6e36628	Fix handling of 8 bit mono and stereo APE files, and also optimise 16 and 24 bit output in the standalone decoder a bit. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19517 a1c6a512-1295-4272-9138-f99709370657	2008-12-21 01:29:36 +00:00
Jens Arnold	a29b659758	Assembler optimised mono predictor for ARM. Speedup for -c1000 mono is ~5% on PP, ~8% on Gigabeat S (less for higher compression levels). Also fix some overlooked comments in the stereo predictor. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19375 a1c6a512-1295-4272-9138-f99709370657	2008-12-09 23:20:59 +00:00
Jens Arnold	c1cd0469ca	Implement mono predictor in assembler for coldfire, yielding a ~6% speedup for mono -c1000. Apply ideas gained from it back to the stereo predictor, saving 4 instructions. No speed increase for stereo, probably due to cache aliasing effects. * 80-column police. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19296 a1c6a512-1295-4272-9138-f99709370657	2008-12-02 02:26:04 +00:00
Jens Arnold	75bd4adbc2	Shuffling around register allocation allows keeping decoded0 and decoded1 in registers, for a slight speedup. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19287 a1c6a512-1295-4272-9138-f99709370657	2008-12-01 13:21:06 +00:00
Jens Arnold	89a6fe7ae4	Remove extraneous semicolons, and fix a comment. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19268 a1c6a512-1295-4272-9138-f99709370657	2008-11-30 11:54:20 +00:00
Jens Arnold	797ef6585a	Fix APE 16-bit mono output: mono signals need to be scaled for rockbox. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19264 a1c6a512-1295-4272-9138-f99709370657	2008-11-30 01:01:04 +00:00
Jens Arnold	88270f7622	Resurrect the ARM7 16-bit packed vector addition/subtraction for ARMv5, giving a nice speedup for the higher compression levels (tested on Cowon D2). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19260 a1c6a512-1295-4272-9138-f99709370657	2008-11-28 23:50:22 +00:00
Jens Arnold	113c285045	On ARM9TDMI (e.g. Gigabeat F) it's faster to use a ldr/str pair than add+ldmia/stmia for 2 registers. On ARM7TDMI a str pair is equally fast, so go for the simpler macro and use it for all ARMv4. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19250 a1c6a512-1295-4272-9138-f99709370657	2008-11-27 22:07:46 +00:00
Jens Arnold	6d34e33b94	Speed up the predictor a little by using ldrd/strd on ARMv5+. This required shuffling around the register allocation somewhat. Performance on ARMv4 is unaffected. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19248 a1c6a512-1295-4272-9138-f99709370657	2008-11-27 20:52:23 +00:00
Jens Arnold	5b0d74a7d3	Get rid of unused return values, except the one from decode_chunk() which will be used in the dual core split. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19236 a1c6a512-1295-4272-9138-f99709370657	2008-11-26 18:01:18 +00:00
Björn Stenberg	a091d20ba0	Added 'keywords' and 'eol-style' properties. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19218 a1c6a512-1295-4272-9138-f99709370657	2008-11-25 19:54:23 +00:00
Jens Arnold	d7e4e54bcb	Reorder instructions to avoid pipeline stalls on ARMv6 wherever possible (sometimes using different registers to allow this). Speeds up the predictor by almost 20% on ARMv6 (overall speedup for -c1000 is 5%), and might also help a bit on ARMv5. ARMv4 speed is unaffected. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19210 a1c6a512-1295-4272-9138-f99709370657	2008-11-24 23:09:09 +00:00
Jens Arnold	3761c0108c	Branch optimisation in both C (giving hints to gcc - verified using -fprofile-arcs and gcov) and asm files. Biggest effect on coldfire (-c1000: +8%, -c2000: +5%), but ARM also profits a bit (less than 1% on ARM7TDMI, around 1% on ARM1136). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19199 a1c6a512-1295-4272-9138-f99709370657	2008-11-24 18:40:49 +00:00
Jens Arnold	66c0cf2eb1	Tweak the ARMv6 filter assembly a bit further. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19198 a1c6a512-1295-4272-9138-f99709370657	2008-11-24 18:40:43 +00:00
Björn Stenberg	303b455ceb	Remove .a files before running ar, to avoid problems with renamed files remaining in the library. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19160 a1c6a512-1295-4272-9138-f99709370657	2008-11-20 16:49:55 +00:00
Björn Stenberg	c6b3d38a15	New makefile solution: A single invocation of 'make' to build the entire tree. Fully controlled dependencies give faster and more correct recompiles. Many #include lines adjusted to conform to the new standards. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19146 a1c6a512-1295-4272-9138-f99709370657	2008-11-20 11:27:31 +00:00
Jens Arnold	2a5053f58c	Several tweaks and cleanups: * Use .rept instead of repeated macros for repeating blocks. * Use MUL (variant) instead of MLA (variant) in the first step of the ARM scalarproduct() if there's no loop. * Unroll ARM assembler functions to 32 where not already done, plus the generic scalarproduct(). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19144 a1c6a512-1295-4272-9138-f99709370657	2008-11-19 21:31:33 +00:00
Jens Arnold	77934cbc96	Compile-time choice between 16 bit and 32 bit integers for the filters. 32 bit filters are faster on ARMv4 (with assembler code), so use them there. Nice speedup on PP and Gigabeat F/X. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19140 a1c6a512-1295-4272-9138-f99709370657	2008-11-19 00:34:48 +00:00
Jens Arnold	1b14167861	Centralise compile-time configuration. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19121 a1c6a512-1295-4272-9138-f99709370657	2008-11-16 17:49:37 +00:00
Jens Arnold	66ff812c4a	Make it compile again on linux... git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19120 a1c6a512-1295-4272-9138-f99709370657	2008-11-16 17:44:59 +00:00
Jens Arnold	dfafd67948	Make the standalone decoder actually work on Windows (need to open the output file in binary mode). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19119 a1c6a512-1295-4272-9138-f99709370657	2008-11-16 17:30:02 +00:00
Jens Arnold	bd49ec97b2	Make the standalone decoder build on cygwin. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19117 a1c6a512-1295-4272-9138-f99709370657	2008-11-16 12:59:03 +00:00
Jens Arnold	b5c0afc442	Move the contents of rangecoding.h into entropy.c, and remove the former. It was only used there, and defined some variables in the .h git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19116 a1c6a512-1295-4272-9138-f99709370657	2008-11-16 12:58:15 +00:00
Jens Arnold	5ba11af855	Avoid unnecessary register copies on ARMv5. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19112 a1c6a512-1295-4272-9138-f99709370657	2008-11-16 10:12:38 +00:00
Dave Chapman	3e8a2bfa12	Make the standalone demac program compile again git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19107 a1c6a512-1295-4272-9138-f99709370657	2008-11-15 00:35:07 +00:00
Jens Arnold	9a0224fd28	Fix comments. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19102 a1c6a512-1295-4272-9138-f99709370657	2008-11-12 18:20:25 +00:00
Jens Arnold	60e16e8e7a	Tiny speedup by simplifying the filter wrap check. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19101 a1c6a512-1295-4272-9138-f99709370657	2008-11-12 18:16:27 +00:00
Jens Arnold	1600e4918e	Tiny performance improvement for the (not yet usable) compression levels >= -c2000 on ARM7TDMI, utilizing the multiplier's early termination. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19099 a1c6a512-1295-4272-9138-f99709370657	2008-11-12 09:18:36 +00:00
Jens Arnold	fe04e40be7	Further optimised (vs. libgcc) unsigned 32 bit division for ARMv4 (based on the ARMv5(+) version from libgcc), in IRAM on PP for better performance on PP5002, and put into the codeclib for possible reuse. APE -c1000 is now usable on both PP502x and PP5002 (~138% realtime, they're on par now). Gigabeat F/X should also see an APE speedup. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19009 a1c6a512-1295-4272-9138-f99709370657	2008-11-05 00:10:05 +00:00
Jens Arnold	7a835ee0c6	Some entropy decoder tweaks. Also removed unnecessary 'tmp' variables. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19008 a1c6a512-1295-4272-9138-f99709370657	2008-11-04 23:46:04 +00:00
Jens Arnold	dd7cacdc88	Another minor improvement: better pipelining and one less register used in vector addition/subtraction. git-svn-id: svn://svn.rockbox.org/rockbox/trunk@18739 a1c6a512-1295-4272-9138-f99709370657	2008-10-07 20:52:42 +00:00
Jens Arnold	6b84f60046	APE: Further ARMv6 filter optimisations: Save 4 'ror's per round by utilising the shift feature of the 'pack halfword' instructions in the unaligned vector addition/subtraction, better pipelining in the aligned scalarproduct(), and a new method to calculate the unaligned scalarproduct(). git-svn-id: svn://svn.rockbox.org/rockbox/trunk@18736 a1c6a512-1295-4272-9138-f99709370657	2008-10-07 19:40:17 +00:00

73 commits