SPE Media Lib
It is far from dead.
I can tell you that we are currently working on the following:
an EXA driver for accelerated X support
an Xv driver for video acceleration in X
the mplayer-vo, which is under further development
We have also started looking into ways of accelerating MPEG-2, MPEG-4 and H.264 decoding using existing programs such as ffmpeg.
There is also always room for more people :)
Cheers
unsolo
Don't do it alone.
It's good to see even more steady progress, Unsolo. Perhaps
Lu has mentioned it, or you have already read this thread:
http://www.powerdeveloper.org/forums/vi ... php?t=1410
It might be a good way to pull some new blood into the effort if you're thinking of combining parts of both codebases/projects, whatever Markos and the teams decide to work on first.
At least, I assume there's something of interest to both projects in co-operating, or perhaps combining, to make better progress and have fun at the same time :)
unsolo wrote:
And we have started looking into ways of accelerating the mpeg2 mpeg4 and h264 decoding using existing programs such as ffmpeg.
There is also always room for more people :)

I'm still looking into the video decoding stuff, but I have more or less scaled down my focus from H.264 to MPEG-1/2. There is a lot of overlap between all of the MPEG/H.26x decoding processes, and I need more basic video decoder experience before I can seriously think about H.264.
Anyway, I'm not sure accelerating the existing ffmpeg codecs is the way to go for the PS3. The PS3 architecture is almost a perfect fit for very, very high performance video decoding, but with the way the ffmpeg codecs are set up it is impossible to get there. These codecs are all optimized for either single-threaded x86 or symmetric dual-thread x86 execution. You cannot efficiently parallelize them for the Cell without ending up rewriting everything.
Stuff like IDCT/dequant, color conversion, motion compensation and deblocking can be lifted out and rewritten as spu-medialib code, and that will reduce their computational cost. But you will end up with a decoder that does some parts of the decoding process very, very fast and is still throttled by its data dependencies, i.e. getting data in and out of the SPUs and combining the results for the next step.
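To put a rough number on that throttling effect, here is a minimal sketch using Amdahl's law. The 60% figure for the share of decode time spent in the offloaded kernels is purely an assumption for illustration, not a measurement of ffmpeg, and the function name is made up for this example.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-decoder speedup when only part of it gets faster.

    accelerated_fraction: share of the original run time spent in the
        kernels moved to the SPUs (hypothetical value, must be in [0, 1]).
    kernel_speedup: how many times faster those kernels run on the SPUs.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction  # part that stays as slow as before
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # Assumed: 60% of decode time is IDCT, motion compensation, etc.
    for s in (2, 10, 100):
        print(s, round(overall_speedup(0.6, s), 2))
```

With those assumed numbers, a 10x faster kernel gives only about a 2.2x faster decoder, and no kernel speedup can push it past 2.5x. Transfer and synchronization costs are not modelled here and would lower the result further.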
I have a few papers about decoder setups for architectures like the Cell. In short: the ffmpeg codecs are not optimized for multicore (>2 core) processing, and they use functional partitioning for the decoding process (i.e. a pipeline-like setup). This is good for PC architectures, because there is no communication overhead: all decoder stages can access the same RAM. Also, typical multicore PC setups have symmetric cores, so it does not matter which task you put on which core. The PS3, however, would benefit from a mixed data-partitioning/functional-partitioning scheme, where each SPU implements its own pipeline for a subset of the full frame data. This reduces communication overhead and maximizes parallelism. The PPU can handle entropy decoding and macroblock parsing better than the SPUs, and the SPUs can do all the other stuff.
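As a rough illustration of the data-partitioning half of that scheme, the sketch below splits the macroblock rows of a frame into contiguous slices, one per SPU. The 16x16 macroblock size is standard for MPEG-2 and H.264; the count of six usable SPUs and the function names are assumptions for this example. It ignores dependencies across slice boundaries, which a real decoder would have to handle.

```python
MB_SIZE = 16  # macroblock edge in pixels (MPEG-2 / H.264)


def macroblock_rows(frame_height):
    """Number of macroblock rows, rounding a partial last row up."""
    return (frame_height + MB_SIZE - 1) // MB_SIZE


def partition_rows(total_rows, workers):
    """Split rows into `workers` contiguous (start, end) slices, end exclusive.

    Slice sizes differ by at most one row, so the load is roughly even.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(total_rows, workers)
    slices = []
    start = 0
    for w in range(workers):
        count = base + (1 if w < extra else 0)  # first `extra` slices get one more
        slices.append((start, start + count))
        start += count
    return slices


if __name__ == "__main__":
    rows = macroblock_rows(1080)   # 1080p frame
    print(partition_rows(rows, 6))  # assumed: six SPUs available
```

Each SPU would then run its own dequant/IDCT/prediction pipeline over its slice, while the PPU feeds it parsed macroblocks.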
For practical purposes hacking ffmpeg with some SPU code is a good first step, but I'm not convinced it can pull off full HD h.264 decoding at full framerates. But it might be bearable. My own 'goal' would be a decoder that is optimized for the Cell, and nothing else. I think that way it can do full HD H.264 decoding with ample room to spare.
unsolo wrote:
Provided the Cell (the SPEs) does both inter- and intra-frame decoding, the PPC processor is left with more or less only the task of decoding the bitstream. Hopefully that will be enough.

If you build it efficiently, it will be. You will want to limit PPU<->RAM<->SPU traffic and data dependencies as much as possible. That requires careful data partitioning and scheduling, which means you will end up rewriting almost all of the ffmpeg codec. That is not necessarily a bad thing, by the way, but it's too messy for me.
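One common way to schedule around such dependencies is a wavefront ordering. The sketch below assumes each macroblock depends on its left, top-left, top and top-right neighbours, as in H.264 intra prediction. It is a generic illustration with a made-up function name, not the scheme anyone in this thread has implemented.

```python
def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblocks (x, y) into steps that respect neighbour dependencies.

    Macroblock (x, y) is placed in step x + 2*y. Its left, top-left, top and
    top-right neighbours all land in earlier steps, so every macroblock
    within one step can be processed in parallel.
    """
    steps = {}
    for y in range(mb_rows):
        for x in range(mb_cols):
            steps.setdefault(x + 2 * y, []).append((x, y))
    return [steps[t] for t in sorted(steps)]


if __name__ == "__main__":
    for t, group in enumerate(wavefront_schedule(4, 3)):
        print(t, group)
```

The number of macroblocks per step is what bounds the usable parallelism, and it is small at the start and end of each frame.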
unsolo wrote:
You have 24 GB/s to go on there.
In comparison, a YUV420 frame is 3.1 MB in 1080p,
so even if you split it up and DMA it over in a way that transfers 4 times as much data as needed, it's still fine.

That is 24 GB/s of bandwidth, but bandwidth is not the problem. You still need to feed everything to the SPEs in time, otherwise you'll stall them. With the limited local memory of the SPEs and the different data dependencies for inter and intra prediction, you will need to arrange macroblocks in data-partition order to satisfy all data dependencies, and implement adequate buffering between entropy decoding on the PPU and inter/intra prediction on the SPEs. There are extra complications in pel reconstruction from the IDCT output and the prediction from the reference images, because those also need to be available just in time. It's all possible, but you need more than a naive port of the ffmpeg decoder.
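The figures in this exchange can be checked with a few lines of arithmetic. In the sketch below, the 24 GB/s and the 4x transfer overhead come from the posts above. The 30 fps frame rate is an assumption, and so is treating the whole 256 KB SPE local store as available for pixel data (in practice code and other buffers share it).

```python
LOCAL_STORE_BYTES = 256 * 1024  # SPE local store; shared by code and data


def yuv420_frame_bytes(width, height):
    """Bytes in a planar 8-bit YUV 4:2:0 frame (1.5 bytes per pixel)."""
    return width * height * 3 // 2


def bandwidth_fraction(width, height, fps, overhead, bandwidth_bytes_per_s):
    """Share of the available bandwidth used to move every frame `overhead` times."""
    needed = yuv420_frame_bytes(width, height) * fps * overhead
    return needed / bandwidth_bytes_per_s


def mb_rows_in_local_store(width, mb_size=16):
    """Upper bound on full-width macroblock rows that fit in one local store."""
    row_bytes = yuv420_frame_bytes(width, mb_size)
    return LOCAL_STORE_BYTES // row_bytes


if __name__ == "__main__":
    print(yuv420_frame_bytes(1920, 1080))
    print(bandwidth_fraction(1920, 1080, fps=30, overhead=4,
                             bandwidth_bytes_per_s=24e9))
    print(mb_rows_in_local_store(1920))
```

A 1080p frame comes to 3,110,400 bytes, and moving it four times over at an assumed 30 fps uses about 1.6% of 24 GB/s. At most five full-width macroblock rows fit in one local store, which is why the buffering and ordering matter far more than the raw bandwidth.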
Is this useful to you guys? Though I'm sure you probably already know it:
http://sourceforge.net/project/showfile ... _id=200163