b320bbaf61
Optimizes YUV to RGB conversion using ARMv5 multiply-accumulate instructions for operations and data tables for saturation.

This first patch set includes the three versions I have developed. Although the iPod Classic needs the latest version to reach 30 fps, the older versions may serve other targets.

All versions are based on the current SVN algorithm (round->scale->add) using the same coefficients, so output results are identical.

Version history:

ARMv4:
- use all available registers to calculate four pixels within each loop iteration.
- avoid LDR interlocks.

ARMv5TE:
- use ARMv5TE+ 1-cycle multiply-accumulate instructions.

ARMv5TE_WST:
- use data tables (256 bytes) for RGB565 saturation.

Benchmark results using iPod Classic (ARM926EJ, 216 MHz):

| Version | Size (bytes) | test_fps (1): YUV | test_fps (1): YUV1/4 | mpegplayer (2): average | mpegplayer (2): min/max |
|---|---|---|---|---|---|
| SVN-20141107 | 528 | 27.8 | 110.0 | 11035 | 10864/13397 |
| ARMv4 | 480 | 28.8 | 114.0 | 9767 | 9586/12126 |
| ARMv5TE | 468 | 29.7 | 117.5 | 8751 | 8584/11118 |
| ARMv5TE_WST | 544 | 33.6 | 133.0 | 6355 | 6316/6403 |

(1) boosted

(2) play the full elephants_dream_320x240.mpg file (15693 frames) using mpegplayer; a patched RB measures the YUV to RGB565 frame conversion time (microseconds)

Compared against the WST version, the ARMv5TE version without cached saturation tables is slower, but it is smaller and I have doubts about the power consumption.

Change-Id: I2b6a81804636658d85a1bb104ccb2055e77ac120
Reviewed-on: http://gerrit.rockbox.org/1034
Reviewed-by: Cástor Muñoz <cmvidal@gmail.com>
Tested: Cástor Muñoz <cmvidal@gmail.com>
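The idea of replacing compare-and-clamp code with a saturation table, as the ARMv5TE_WST version is described as doing, can be illustrated with a minimal C sketch. Everything here is an assumption made for illustration: the table sizes, the bias `SAT_BIAS`, and the names `sat5`, `sat6`, `init_sat_tables` and `pack_rgb565` are hypothetical and are not taken from the patch, whose actual table layout (256 bytes in total) is not described in the commit message.

```c
#include <stdint.h>

/* Hypothetical layout, not the patch's: each table is indexed by a
 * biased intermediate value and returns the value clamped to the
 * channel's bit depth. Valid inputs are assumed to lie in -64..127. */
#define SAT_BIAS 64
#define SAT_SIZE 192

static uint8_t sat5[SAT_SIZE]; /* clamps to 0..31 (red, blue) */
static uint8_t sat6[SAT_SIZE]; /* clamps to 0..63 (green)     */

/* Fill both tables once, before any conversion. */
static void init_sat_tables(void)
{
    for (int i = 0; i < SAT_SIZE; i++) {
        int v = i - SAT_BIAS;
        sat5[i] = (uint8_t)(v < 0 ? 0 : (v > 31 ? 31 : v));
        sat6[i] = (uint8_t)(v < 0 ? 0 : (v > 63 ? 63 : v));
    }
}

/* Pack already-scaled channel values into RGB565, saturating each
 * channel with a table lookup instead of compare-and-branch code. */
static uint16_t pack_rgb565(int r, int g, int b)
{
    return (uint16_t)((sat5[r + SAT_BIAS] << 11) |
                      (sat6[g + SAT_BIAS] << 5)  |
                       sat5[b + SAT_BIAS]);
}
```

After a single call to `init_sat_tables()`, `pack_rgb565(-5, 70, 40)` yields `0x07FF`: red clamps to 0, green to 63 and blue to 31. The trade-off matches the one in the benchmark table: the tables cost extra bytes and data-cache traffic in exchange for fewer instructions per pixel.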
| Name |
|---|
| .. |
| arm |
| coldfire |
| hosted |
| mips |
| sh |