For >8-point vertical IDCT, transpose the coefficients while decoding them, so that the vertical IDCT can read rows rather than columns. This gives a small speedup for this size, even with the C IDCT.
Remove inline ARM asm, replacing it with an external file containing pure asm IDCT functions.
Add jpeg_ prefix to JPEG IDCT functions since some of them will now be visible globally.
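
A minimal sketch of the transpose-while-decoding idea, assuming an 8x8 coefficient block and the standard JPEG zigzag order. The names transposed_index and store_block_transposed are hypothetical and are not taken from the Rockbox source. It only illustrates how the de-zigzag step can write each coefficient to its transposed slot, so that what was a column becomes 8 contiguous values for the vertical pass.

```c
#include <stdint.h>

/* Standard JPEG zigzag order: scan position -> natural (row-major) index. */
static const uint8_t zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Hypothetical helper: map scan position k to its slot in a transposed
 * 8x8 block, i.e. swap the row and column of the natural index. */
static int transposed_index(int k)
{
    int nat = zigzag[k];
    int row = nat >> 3;
    int col = nat & 7;
    return col * 8 + row;
}

/* Hypothetical helper: store coefficients given in scan order into a
 * block laid out transposed, so a vertical IDCT pass can walk each
 * original column as one contiguous run of 8 values. */
static void store_block_transposed(const int16_t scan[64], int16_t block[64])
{
    for (int k = 0; k < 64; k++)
        block[transposed_index(k)] = scan[k];
}
```

In a real decoder the index mapping would more likely be a precomputed table used directly by the entropy decoder, rather than computed per coefficient as above.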
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@21345 a1c6a512-1295-4272-9138-f99709370657