Commit graph

16 commits

Author SHA1 Message Date
Andree Buschmann
c7840e745e opus: speed up mdct overlap add and copying
Unroll overlap add loop by four and use memcpy for copying
instead of loops.

Change-Id: I17114626a395d5972130251d892f851bc86e3a6a
Signed-off-by: Nils Wallménius <nils@rockbox.org>
2012-10-07 00:31:08 +02:00
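
The unroll-by-four plus memcpy idea from c7840e745e can be sketched as follows. This is an illustration only, not the actual opus mdct code (which also applies a window); the function name overlap_add_copy and its parameters are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the first `overlap` output samples are the sum of
 * the previous frame's tail and the current frame; the remaining
 * n - overlap samples are copied unchanged. */
static void overlap_add_copy(int32_t *out, const int32_t *prev,
                             const int32_t *cur, int overlap, int n)
{
    int i = 0;

    /* Main loop unrolled by four to cut loop overhead. */
    for (; i + 4 <= overlap; i += 4) {
        out[i]     = prev[i]     + cur[i];
        out[i + 1] = prev[i + 1] + cur[i + 1];
        out[i + 2] = prev[i + 2] + cur[i + 2];
        out[i + 3] = prev[i + 3] + cur[i + 3];
    }
    /* Remainder, in case overlap is not a multiple of four. */
    for (; i < overlap; i++)
        out[i] = prev[i] + cur[i];

    /* Plain copy done with memcpy instead of a sample-by-sample loop.
     * Assumes out and cur do not overlap. */
    if (n > overlap)
        memcpy(out + overlap, cur + overlap,
               (size_t)(n - overlap) * sizeof(*out));
}
```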
Nils Wallménius
3ac0fc7c90 opus: cf inline asm for MULT32_32_Q31
speeds up decoding of a 64kbps test file by 2MHz on h300

Change-Id: I437d05278fe1c495715cf0e3477f9960d1df9d3a
2012-10-06 23:43:05 +02:00
Andree Buschmann
2119f75af3 opus: full precision MULT32_32_Q31 (32*32=64>>31) multiplication
Replace complicated macro doing three 16*16 muls and add an inline
asm implementation for arm, speeds up decoding a 64kbps test file
by 0.5MHz on c200 (pp) and gives slightly better precision.

Change-Id: I6fc5b83c210f01bffdc38aec54cc5a8b646d8169
Signed-off-by: Nils Wallménius <nils@rockbox.org>
2012-10-06 23:43:05 +02:00
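
As a reference for what 2119f75af3 describes (32*32=64>>31), a portable C version might look like the sketch below. The lowercase name mult32_32_q31 is hypothetical; the commit itself uses inline asm on ARM. The sketch assumes the compiler performs an arithmetic right shift on negative signed values, which is implementation-defined in C.

```c
#include <stdint.h>

/* Full-precision Q31 multiply: form the 64-bit product, then drop the
 * low 31 bits. Assumes >> on a negative int64_t is an arithmetic shift. */
static inline int32_t mult32_32_q31(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 31);
}
```

For example, multiplying 0x40000000 (0.5 in Q31) by itself yields 0x20000000 (0.25 in Q31).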
Andree Buschmann
da67f66eed opus: slight speedup of deemphasis
Hoist load of coefficients out of the loop.

Speeds up decoding of a 64kbps test file by 0.6MHz on h300 (cf),
0.2MHz on c200 (pp) and 0.1MHz on fuzev1 (amsv1).

Signed-off-by: Nils Wallménius <nils@rockbox.org>

Change-Id: I4be0059fc2a77748575f5fc9378f7f348d64f1c4
2012-10-06 14:51:01 +02:00
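
Hoisting the coefficient load, as da67f66eed describes, can be illustrated with a simplified one-pole deemphasis loop. This is a sketch under assumptions: the function name deemphasis_sketch, its signature and the Q15 coefficient format are chosen for illustration and are not taken from the opus source.

```c
#include <stdint.h>

/* Simplified one-pole deemphasis: out[i] = in[i] + coef[0] * out[i-1].
 * coef[0] is in Q15; *mem carries the filter state between calls. */
static void deemphasis_sketch(const int32_t *in, int32_t *out, int n,
                              const int16_t *coef, int32_t *mem)
{
    /* Load the coefficient and state once, outside the loop, so the
     * compiler does not have to reload them on every iteration. */
    const int16_t coef0 = coef[0];
    int32_t m = *mem;
    int i;

    for (i = 0; i < n; i++) {
        int32_t tmp = in[i] + m;
        m = (int32_t)(((int64_t)coef0 * tmp) >> 15);
        out[i] = tmp;
    }
    *mem = m;
}
```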
Andree Buschmann
dceec09092 opus: speed up comb_filter
Skip expensive multiply-accumulate loop when gains are 0 and
just copy using memcpy if source and destination are not the same.

Speeds up decoding of a 64kbps test file by 6MHz on h300 (cf),
7MHz on c200 (pp) and 6MHz on fuzev1 (amsv1).

Change-Id: Ibbc9ddfd45a9ac661467b1327b8c67761924fb8b
Signed-off-by: Nils Wallménius <nils@rockbox.org>
2012-10-06 14:25:20 +02:00
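
The early exit described in dceec09092 can be sketched as below. The filter body is deliberately simplified to two taps and is not the real opus comb_filter; the name comb_filter_sketch and its parameter list are hypothetical. The sketch assumes x and y are either identical or non-overlapping, and that x has at least max(T0, T1) samples of history in front of it.

```c
#include <stdint.h>
#include <string.h>

/* Simplified two-tap comb filter with Q15 gains g0 and g1:
 * y[i] = x[i] + g0 * x[i - T0] + g1 * x[i - T1]. */
static void comb_filter_sketch(int32_t *y, const int32_t *x,
                               int T0, int T1, int N,
                               int16_t g0, int16_t g1)
{
    int i;

    /* Both gains zero: the filter is a pass-through, so skip the
     * multiply-accumulate loop and copy only if needed. */
    if (g0 == 0 && g1 == 0) {
        if (y != x)
            memcpy(y, x, (size_t)N * sizeof(*y));
        return;
    }

    for (i = 0; i < N; i++) {
        int64_t acc = (int64_t)g0 * x[i - T0] + (int64_t)g1 * x[i - T1];
        y[i] = x[i] + (int32_t)(acc >> 15);
    }
}
```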
Nils Wallménius
6d2ad505dc opus: put arrays frequently used by pulse decoding on the stack
speeds up decoding of a 64kbps test file by 14MHz on h300 (cf)
and 1MHz on c200 (pp)

Change-Id: I852cb66808676ea51109423f5b70cfc8782dd109
2012-10-04 19:20:55 +02:00
Nils Wallménius
8687b98993 opus: speed up arm asm MULT16_32_Q15
Reorder operands to take advantage of the early termination of
multiplications. Saves 2.5MHz decoding a 64kbps opus test file
on c200 (pp).

Change-Id: I470266dc870ab183ece3b23426d41e2a64342a71
2012-10-01 22:36:57 +02:00
Andree Buschmann
d7799aaf33 opus: allocate mdct f2 buffer in iram
Speeds up decoding of 64kbps test file by 6.3MHz on h300 (cf)
and 1.2MHz on c200 (pp).

Signed-off-by: Nils Wallménius <nils@rockbox.org>

Change-Id: I08c2c332153abcbef9447c81986777fd2fcc73fe
2012-10-01 22:07:44 +02:00
Andree Buschmann
b6bcb1338e opus: allocate buffers for X and freq in iram
speeds up decoding of 64kbps test file by 19MHz on h300 (cf)
and 2.5MHz on c200 (pp)

Change-Id: Idacd2f8962c20c518055d586daeec6b932b7ded2
Signed-off-by: Nils Wallménius <nils@rockbox.org>
2012-10-01 21:37:03 +02:00
Nils Wallménius
082cd01eb2 opus: speed up deemphasis
Remove downsampling code from deemphasis loop as we don't use
it and remove multiplications that are not relevant when
not using custom modes. Saves 1.4MHz on h300 (cf), 4.3MHz on
c200 (pp) and 4.6MHz on fuzev1 (amsv1).

Change-Id: Iab3f1d737a656a563aaa351d50db987a9cff2287
2012-09-28 00:09:54 +02:00
Nils Wallménius
f636aa07df opus: put frequently used mdct buffer on the real stack which is in iram
Saves about 30MHz on h300 (cf) and 1.5MHz on c200 (pp) decoding a
64kbps test file. Stack usage is still below 70%.

Change-Id: Ib13df9011adb4eef4bb91a52e5a32741c8bf8988
2012-09-26 11:54:03 +02:00
Nils Wallménius
425725edb0 opus: improve cf MULT16_32_Q15 by giving the compiler more freedom
saves about 3MHz when decoding a 64kbps test file

Change-Id: I10f47173ccb78e60e364662220d1db2f78dd5fdd
2012-09-26 11:21:25 +02:00
Nils Wallménius
5f60590e80 opus: put some const tables and structs in iram
Speeds up decoding of a 64kbps test file by 20MHz on h300 (cf)
and 1MHz on c200 (pp)

Change-Id: Ia2adc0a3ad86abce8f948062eb53a8ac14c2cdf2
2012-09-25 17:19:05 +02:00
Nils Wallménius
afc6b3f021 opus: asm MULT16_32_Q15 for arm and cf
Speeds up decoding of a 64kbps opus test file by 34MHz on h300 (cf),
24MHz on c200 (pp) and 13MHz on fuzev1 (amsv1)

Change-Id: I0dce6b3bfe6c81d0a722dfebb13891b9a428c6ba
2012-09-25 11:40:59 +02:00
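
For orientation, the operation that afc6b3f021 implements in asm is a 16-bit by 32-bit multiply with the result shifted right by 15. The sketch below gives two portable C formulations that should agree: one using a 64-bit product, and one that splits the 32-bit operand into halves so only 32-bit arithmetic is needed. Both function names are hypothetical, neither is claimed to be the exact upstream macro, and both assume arithmetic right shift of negative values.

```c
#include <stdint.h>

/* Reference form: widen to 64 bits, multiply, shift right by 15. */
static inline int32_t mult16_32_q15(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 15);
}

/* Split form for targets without a cheap 32x32->64 multiply:
 * b = hi * 65536 + lo, so (a * b) >> 15 = a * hi * 2 + ((a * lo) >> 15).
 * Can overflow only where the true result would not fit in 32 bits. */
static inline int32_t mult16_32_q15_split(int16_t a, int32_t b)
{
    int32_t hi = b >> 16;      /* signed upper half */
    int32_t lo = b & 0xffff;   /* unsigned lower half */
    return (int32_t)a * hi * 2 + (((int32_t)a * lo) >> 15);
}
```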
Nils Wallménius
f498142143 opus: #if 0 out some unused code
Change-Id: I16fa9b439f8da5b9b8a4f17040487b9535078ec5
2012-09-24 15:20:21 +02:00
Frederik M J Vestre
1b8e3801b2 Initial opus codec support
Synchronised with opus repo on github (https://github.com/freqmod/rockbox-opus)

Status:
* Seeking ported from speex, but fails in some cases (e.g. seek to granule 0)
* ReplayGain parsing needs to be reworked, we do vorbis-style replaygain now.
  http://wiki.xiph.org/OggOpus#Comment_Header explicitly forbids these in
  favour of the R128_TRACK_GAIN tag.
* No optimisation yet, source files still nearly identical to opus upstream
* Multi-stream opus files may not be parsed correctly

Change-Id: Ia66f1027dc1d288083e3c57b2816700078376f9a
Reviewed-on: http://gerrit.rockbox.org/300
Reviewed-by: Bertrik Sikken <bertrik@sikken.nl>
Tested-by: Bertrik Sikken <bertrik@sikken.nl>
2012-09-20 20:47:44 +02:00