SPE Media Lib

iam · Post by **iam** » Wed Nov 28, 2007 7:22 am

Any updates on the medialib project?

No post from unsolo since October, and not much has been updated on the wiki either... I hope this project isn't dead... Really looking forward to H.264 support... eventually...

unsolo · Post by **unsolo** » Wed Nov 28, 2007 9:59 am

It is far from dead..

I can inform you that we are currently working on the following:

EXA driver for accelerated X support

Xv driver for Video acceleration in X

the mplayer-vo is under further development

And we have started looking into ways of accelerating MPEG-2, MPEG-4 and H.264 decoding using existing programs such as ffmpeg.

There is also always room for more people :)

Cheers

unsolo

popper · Post by **popper** » Sun Dec 02, 2007 3:30 am

It's good to see even more steady progress, unsolo. Perhaps Lu has mentioned it, or you have already read this thread:
http://www.powerdeveloper.org/forums/vi ... php?t=1410

It might be a good way to pull some new blood into the effort if you're thinking of combining parts of both codebases/projects, whatever Markos and the teams decide to work on first....

At least I assume there's something of interest to both projects, so they could co-operate or perhaps combine to make better progress and have fun at the same time... :)

d-range · Post by **d-range** » Sun Dec 02, 2007 9:35 pm

unsolo wrote:And we have started looking into ways of accelerating MPEG-2, MPEG-4 and H.264 decoding using existing programs such as ffmpeg.

There is also always room for more people :)

I'm still looking into the video decoding stuff, but I more or less scaled down my focus from h264 to mpeg1/2, as there is lots of overlap in all of the mpeg/h26x decoding processes, and I need to have more basic video decoder experience before I can seriously think about h264.

Anyway, I'm not sure accelerating the existing ffmpeg codecs is the way to go for PS3. The PS3 architecture is almost a perfect fit for very, very high performance video decoding, but the way the ffmpeg codecs are set up it is impossible to get there. These codecs are all optimized for either single-threaded x86 or symmetric dual-thread x86 execution. You cannot efficiently parallelize them for the Cell without ending up rewriting everything.

You can lift out stuff like IDCT/dequant, color conversion, motion compensation and deblocking and write spu-medialib code for it, and that will reduce their computational cost. But you will end up with a decoder that does some parts of the decoding process very, very fast yet is throttled by its data dependencies, i.e. getting data in and out of the SPUs and combining it for the next step.

I have a few papers about decoder setups for architectures like the Cell. In short: the ffmpeg codecs are not optimized for multicore (>2 core) processing, and use a functional partitioning for the decoding process (i.e. a pipeline-like setup). This is good for PC architectures, because there is no communication overhead: all decoder stages can access the same RAM. Also, typical multicore PC setups have symmetric cores, so it does not matter what task you put on what core. The PS3 however would benefit from a mixed data-partitioning/functional-partitioning scheme, where each SPU implements its own pipeline for a subset of the full frame data. This reduces communication overhead and maximizes parallelism. The PPU can handle entropy decoding and macroblock parsing better than the SPUs, and the SPUs can do all the other stuff.
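
To make the data-partitioning idea concrete, here is a minimal sketch of how a frame's macroblock rows could be divided over the SPUs so that each one runs its own pipeline on its own slice. The function name `partition_mb_rows` and the choice of 6 usable SPUs are assumptions for illustration, not something taken from ffmpeg or spu-medialib, and a real decoder would also have to deal with dependencies across slice borders.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luma samples in MPEG-2 and H.264


def partition_mb_rows(height_px, num_spus):
    """Split the macroblock rows of a frame into one contiguous range per SPU.

    Returns a list of (first_row, end_row) tuples, end_row exclusive.
    Hypothetical helper: only illustrates data partitioning.
    """
    mb_rows = (height_px + MB_SIZE - 1) // MB_SIZE  # round up to whole rows
    base, extra = divmod(mb_rows, num_spus)
    ranges = []
    start = 0
    for spu in range(num_spus):
        # The first `extra` SPUs take one additional row each.
        count = base + (1 if spu < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


# 1080 lines round up to 68 macroblock rows; 6 SPUs is an assumption.
slices = partition_mb_rows(1080, 6)
for spu, (first, end) in enumerate(slices):
    print(f"SPU {spu}: macroblock rows {first}..{end - 1}")
```

Each SPU would then do dequant/IDCT, prediction and deblocking for its own rows, while the PPU feeds it parsed macroblocks.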

For practical purposes hacking ffmpeg with some SPU code is a good first step, but I'm not convinced it can pull off full HD h.264 decoding at full framerates. But it might be bearable. My own 'goal' would be a decoder that is optimized for the Cell, and nothing else. I think that way it can do full HD H.264 decoding with ample room to spare.

unsolo · Post by **unsolo** » Mon Dec 03, 2007 12:28 am

Provided the Cell's SPEs do both inter and intra frame decoding, the PPC processor is left with more or less just the task of decoding the bitstream. Hopefully that will be enough.

d-range · Post by **d-range** » Mon Dec 03, 2007 1:28 am

unsolo wrote:Provided the Cell's SPEs do both inter and intra frame decoding, the PPC processor is left with more or less just the task of decoding the bitstream. Hopefully that will be enough.

If you build it efficiently, it will be. You will want to limit ppu<->ram<->spu traffic and data dependencies as much as possible. That requires careful data partitioning and scheduling, which means you will end up rewriting almost all of the ffmpeg codec. Which is not necessarily a bad thing btw, but it's too messy for me.

unsolo · Post by **unsolo** » Mon Dec 03, 2007 6:32 am

you have 24 GB/s to go on there..

in comparison a YUV420 frame is 3.1MB in 1080p

So even if you split it up and send it over DMA in such a way that you transfer 4 times as much data as needed, it's still fine..
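
For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope calculation. The 30 frames per second is an assumption (no frame rate is given above), and it only counts moving each decoded frame, not the reference-frame reads that motion compensation adds.

```python
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 1.5      # 8-bit YUV 4:2:0: 1 byte luma + 0.5 byte chroma
FPS = 30                   # assumed frame rate, not stated in the thread
OVERHEAD = 4               # "4 times as much data as needed"
BANDWIDTH = 24e9           # the 24 GB/s figure quoted above, in bytes/s

frame_bytes = int(WIDTH * HEIGHT * BYTES_PER_PIXEL)
traffic = frame_bytes * FPS * OVERHEAD   # bytes per second
fraction = traffic / BANDWIDTH

print(f"frame size: {frame_bytes / 1e6:.1f} MB")
print(f"traffic:    {traffic / 1e6:.0f} MB/s")
print(f"share of bandwidth: {fraction:.1%}")
```

So under these assumptions the frame traffic uses under 2% of the quoted bandwidth.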

d-range · Post by **d-range** » Mon Dec 03, 2007 8:04 pm

unsolo wrote:you have 24 GB/s to go on there..

in comparison a YUV420 frame is 3.1MB in 1080p

So even if you split it up and send it over DMA in such a way that you transfer 4 times as much data as needed, it's still fine..

24GB/s of bandwidth, that is, but bandwidth is not the problem. You still need to feed everything to the SPEs in time, otherwise you'll stall them. With the limited local memory of the SPEs and the different data dependencies for inter and intra prediction, you will need to arrange macroblocks in data partition order to satisfy all data dependencies, and implement adequate buffering from entropy decoding on the PPU to inter/intra prediction on the SPEs. There are extra complications in pel reconstruction from the IDCT output and the prediction from the reference images, because those also need to be available just in time. It's all possible, but you need more than a naive port of the ffmpeg decoder.
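
One common way to order macroblocks so that intra dependencies are met is a wavefront schedule. The sketch below is only an illustration of that general technique, not an ordering taken from ffmpeg or from the papers mentioned earlier. It assumes each macroblock depends on its left, top-left, top and top-right neighbours, as in H.264 intra prediction, and ignores inter prediction, which depends on reference frames instead. The function name `wavefront_schedule` is hypothetical.

```python
def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblocks (x, y) into waves that can be processed in parallel.

    Block (x, y) goes into wave x + 2*y. Its left neighbour is in wave - 1,
    top-right in wave - 1, top in wave - 2 and top-left in wave - 3, so every
    dependency is finished before the block's own wave starts.
    """
    waves = {}
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves.setdefault(x + 2 * y, []).append((x, y))
    return [waves[index] for index in sorted(waves)]


# Tiny 4x3 macroblock example so the output stays readable.
schedule = wavefront_schedule(4, 3)
for index, blocks in enumerate(schedule):
    print(index, blocks)
```

All blocks within one wave are independent of each other, so they could be handed to different SPEs, provided the needed neighbour data has been buffered where each SPE can reach it in time.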

unsolo · Post by **unsolo** » Sat Dec 15, 2007 4:37 am

I wouldn't worry too much ...

I'm saying it's doable

very very very doable..

and I'm always right :)

By the way, I'm working on a FIFO for the SPEs that should/could allow more than enough unique tasks to be transferred to the SPEs.
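
No details of that FIFO are given here, so the following is only a generic sketch of a fixed-size task queue of the kind that could sit between a producer (the PPU) and a consumer (an SPE). The class name `TaskFifo` and the task tuples are made up for illustration; a real version would live in SPE local store and be filled via DMA or mailboxes.

```python
class TaskFifo:
    """Fixed-capacity ring buffer: tasks come out in the order they went in."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0   # index of the next task to pop
        self._tail = 0   # index of the next free slot
        self._count = 0  # number of queued tasks

    def push(self, task):
        """Queue a task; return False instead of blocking when full."""
        if self._count == len(self._slots):
            return False
        self._slots[self._tail] = task
        self._tail = (self._tail + 1) % len(self._slots)
        self._count += 1
        return True

    def pop(self):
        """Return the oldest task, or None when the queue is empty."""
        if self._count == 0:
            return None
        task = self._slots[self._head]
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return task


fifo = TaskFifo(capacity=4)
fifo.push(("idct", 0))          # hypothetical task descriptors
fifo.push(("deblock", 0))
print(fifo.pop(), fifo.pop(), fifo.pop())
```

The fixed capacity matters because SPE local memory is small: the producer has to back off when the queue is full rather than grow it.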

Arwin · Post by **Arwin** » Sat Dec 15, 2007 6:46 am

Is this useful to you guys? Though I'm sure you probably already know it:

http://sourceforge.net/project/showfile ... _id=200163