Mpeg decoding
Moderator: unsolo
Mpeg decoding
OK, I have started peeking at MPEG decoding.
My first plan is as follows:
Dissect mpeg2dec to gain a better understanding of how MPEG works.
Reconstruct the I image.
Predict the P image.
Add the error correction (the coded prediction residual) to the P image.
Then try predicting the B images and add the error correction to the B images.
At first I plan to do this one image at a time, so that I can verify the results.
If anyone at all cares to give me a hand with this, it would be highly appreciated.
cheers
Kristian
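The "predict, then add error correction" step in the plan can be sketched for a single 8x8 block. This is only a minimal illustration and not code from mpeg2dec: the helper names `clamp_u8` and `reconstruct_block` are made up here, the prediction is assumed to be a full-pel copy from the reference picture, and the residual is assumed to be the output of the IDCT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate sum to the 8-bit sample range. */
static uint8_t clamp_u8(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* Hypothetical helper: rebuild one 8x8 block of a P (or B) picture.
 * ref      points at the predicted block inside the reference picture,
 *          i.e. already offset by the (full-pel) motion vector.
 * residual holds the 64 IDCT outputs for this block in raster order.
 * dst      receives prediction + residual, saturated to 0..255. */
static void reconstruct_block(const uint8_t *ref, size_t ref_stride,
                              const int16_t residual[64],
                              uint8_t *dst, size_t dst_stride)
{
    assert(ref != NULL && residual != NULL && dst != NULL);
    for (size_t y = 0; y < 8; y++) {
        for (size_t x = 0; x < 8; x++) {
            int sum = ref[y * ref_stride + x] + residual[y * 8 + x];
            dst[y * dst_stride + x] = clamp_u8(sum);
        }
    }
}
```

For an I image the same loop applies with the prediction taken as zero, which is one reason to get the I path verified first.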
			
			
									
Don't do it alone.
						Mpeg 2 file format 1
Further work
I have gotten my hands on the MPEG-2 ISO standard, so I can now divide this up more easily.
Some simple facts about the MPEG file format:
Binary 0000 0000 0000 0000 0000 0001 (hex 0x000001) is the 24-bit MPEG start code prefix.
This is followed by a 1-byte start code value.
The start code values, in hex, are:
00 is picture_start_code
01-AF is slice_start_code
B2 is user_data_start_code
B3 is sequence_header_code
B4 is sequence_error_code
B5 is extension_start_code
B7 is sequence_end_code
B8 is group_start_code
B9-FF are system start codes
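A small scanner makes the start code layout above concrete. This is a sketch under simple assumptions: the whole stream is already in memory, and the function names `find_start_code` and `start_code_name` are invented for this example. The values not listed above (B0, B1, B6) are reported as reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the next 0x000001 prefix at or after pos that is
 * followed by a start code value byte, or len if there is none. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    assert(buf != NULL);
    for (; pos + 3 < len; pos++) {
        if (buf[pos] == 0x00 && buf[pos + 1] == 0x00 && buf[pos + 2] == 0x01)
            return pos;
    }
    return len;
}

/* Map the byte that follows the prefix to its name in the list above. */
static const char *start_code_name(uint8_t value)
{
    if (value == 0x00) return "picture_start_code";
    if (value <= 0xAF) return "slice_start_code";
    switch (value) {
    case 0xB2: return "user_data_start_code";
    case 0xB3: return "sequence_header_code";
    case 0xB4: return "sequence_error_code";
    case 0xB5: return "extension_start_code";
    case 0xB7: return "sequence_end_code";
    case 0xB8: return "group_start_code";
    case 0xB0: case 0xB1: case 0xB6: return "reserved";
    default:   return "system start code";   /* 0xB9..0xFF */
    }
}
```

To walk a stream, call `find_start_code` in a loop, read the value byte at offset + 3, and continue searching from offset + 4.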
			
			
									
Don't do it alone.
						- 
				jockyw2001
- Posts: 339
- Joined: Thu Sep 29, 2005 4:19 pm
The target is to make it work with, for example, ffmpeg.
Example SPU inverse scan (aka zigzag):
#include <spu_intrinsics.h>

// Input:  64 coefficients in scan order, as 16-bit big-endian values (8 vectors).
// Output: 64 bytes in raster order (4 vectors, two 8-byte rows each).
void inverse_scan_progressive(const vector unsigned char Input[8],vector unsigned char Output[4])
{
// Packing data block: keep the low byte of every 16-bit value.
// All of this can be removed if ffmpeg supplies the data as uint8_t.
const vector unsigned char pack={1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31};
vector unsigned char A=spu_shuffle(Input[0],Input[1],pack); // coefficients 0-15
vector unsigned char B=spu_shuffle(Input[2],Input[3],pack); // coefficients 16-31
vector unsigned char C=spu_shuffle(Input[4],Input[5],pack); // coefficients 32-47
vector unsigned char D=spu_shuffle(Input[6],Input[7],pack); // coefficients 48-63
// Packing data block ends.
// Pipelined this is xx cycles.
// Pattern bytes 0-15 select from the first operand, 16-31 from the second.
// A 0x00 entry is a placeholder for a slot that is filled in by a later shuffle.
Output[0]=spu_shuffle(A,B,((vector unsigned char ){0,1,5,6,14,15,27,28,2,4,7,13,16,26,29,0x00}));
Output[1]=spu_shuffle(A,B,((vector unsigned char ){3,8,12,17,25,30,0x00,0x00,9,11,18,24,31,0x00,0x00,0x00}));
Output[2]=spu_shuffle(C,D,((vector unsigned char ){0x00,0x00,0x00,0,7,13,20,22,0x00,0x00,1,6,14,19,23,28}));
Output[3]=spu_shuffle(C,D,((vector unsigned char ){0x00,2,5,15,18,24,27,29,3,4,16,17,25,26,30,31}));
Output[0]=spu_shuffle(Output[0],C,((vector unsigned char ){0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,26})); // adding 42
Output[1]=spu_shuffle(Output[1],C,((vector unsigned char ){0,1,2,3,4,5,25,27,8,9,10,11,12,24,28,0x00})); // adding 41,43,40,44
// Coefficients 19,23,20,22 live in B (not D), so B has to be the second operand here.
Output[2]=spu_shuffle(Output[2],B,((vector unsigned char ){0x00,19,23,3,4,5,6,7,20,22,10,11,12,13,14,15})); // only missing 1 here
// Coefficient 21 also lives in B.
Output[3]=spu_shuffle(Output[3],B,((vector unsigned char ){21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15})); // adding 21, move this down
Output[1]=spu_shuffle(Output[1],D,((vector unsigned char ){0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,21})); // adding 53
// Coefficient 10 lives in A (not C).
Output[2]=spu_shuffle(Output[2],A,((vector unsigned char ){26,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15})); // adding 10
}
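A plain scalar version of the inverse scan is handy as a reference when checking shuffle patterns like the ones above. The following is a sketch in portable C, not SPU code, and the function names are invented here. It builds the standard (non-alternate) zigzag order by walking the diagonals and then scatters the coefficients back to raster order.

```c
#include <assert.h>
#include <stdint.h>

/* Fill scan[k] with the raster index (row * 8 + column) of the k-th
 * coefficient in zigzag order: 0, 1, 8, 16, 9, 2, 3, 10, ... */
static void build_zigzag(uint8_t scan[64])
{
    int x = 0, y = 0;
    assert(scan != 0);
    for (int k = 0; k < 64; k++) {
        scan[k] = (uint8_t)(y * 8 + x);
        if (((x + y) & 1) == 0) {        /* even diagonal: move up and right */
            if (x == 7)      y++;
            else if (y == 0) x++;
            else           { x++; y--; }
        } else {                         /* odd diagonal: move down and left */
            if (y == 7)      x++;
            else if (x == 0) y++;
            else           { x--; y++; }
        }
    }
}

/* Scalar inverse scan: in[] is in scan order, out[] is in raster order. */
static void inverse_scan_scalar(const int16_t in[64], int16_t out[64])
{
    uint8_t scan[64];
    build_zigzag(scan);
    for (int k = 0; k < 64; k++)
        out[scan[k]] = in[k];
}
```

Feeding it the input `in[k] = k` yields, at each raster position, the scan index that belongs there, which is exactly the set of numbers a shuffle pattern has to select.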
			
			
									
Don't do it alone.
						IDCT running
OK, I finally got the IDCT running.
It was a bit painful getting there from scratch, but I did.
Time for me to take a vacation.
http://svn.ps2dev.org/listing.php?repna ... rev=0&sc=0
			
			
									
Don't do it alone.
						- 
				clockpenalty
- Posts: 1
- Joined: Mon Aug 27, 2007 6:51 pm
This is an excellent project.
It needs more publicity, so you can get some help from the dev community.
There are hundreds of posts on the web complaining about poor performance with the mplayer and vlc software scalers, and at the moment the only solution is to transcode, either on the fly or prior to loading, which defeats the purpose of using Linux.
If you can achieve a scaling/decompression library that matches the speed of hardware acceleration, it will be a great showcase for the power of the Cell!
This is potentially the most important PS3 dev project at the moment.
You have a bilinear scaler working at the moment? How do I incorporate it into mplayer on my system to test the performance? Does it deliver comparable quality to the built-in bilinear scaler (-sws 0) in mplayer? Does it have tunable parameters? And how do you plan to optimize MPEG decoding? Will you write a decoder from scratch (please don't!) or modify libavcodec (please do!) with SPU optimizations?
Please don't let this project die....
			
			
									
									
Hi,
this is my first post here too ...
I am on my way to getting the patch for a vo for mplayer compiled and in place.
After this I can answer some of your questions and more. Please understand that the code is not easily portable to the Cell. The reason is the SPU local memory, which is only 256 KB. That means not even one frame fits into it, let alone a scaled one or anything like that, so the complete routine has to be altered. It should be pretty easy to translate the Altivec code to SPU code, but that is not enough, because the whole approach is different. So I fear that we cannot simply port the old code from libavcodec :-(
Once I have compiled everything I would like to join the development. Let's see what kind of help I can be.
CU
Protheus
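The size argument can be checked with a little arithmetic. The sketch below assumes planar YUV 4:2:0 with one byte per sample. The helper names and the 64 KB working budget used in the note after the code are illustrative assumptions only, since the 256 KB local store also has to hold code, stack and output buffers.

```c
#include <assert.h>
#include <stddef.h>

#define SPU_LOCAL_STORE_BYTES (256u * 1024u)

/* Bytes in one planar YUV 4:2:0 frame: a full-size luma plane plus two
 * chroma planes subsampled by two in both directions. */
static size_t yuv420_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height + 2 * ((width / 2) * (height / 2));
}

/* Number of luma rows (with their chroma rows) that fit in budget_bytes.
 * Rows are counted in pairs: two luma rows share one U row and one V row. */
static size_t rows_per_chunk(size_t width, size_t budget_bytes)
{
    size_t bytes_per_row_pair = 2 * width + 2 * (width / 2);
    return 2 * (budget_bytes / bytes_per_row_pair);
}
```

A 1280x720 frame comes to 1,382,400 bytes, more than five times the whole local store, while a hypothetical 64 KB input buffer holds 34 luma rows of that frame. So the work has to be streamed through the SPU in strips or blocks instead of frame by frame.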
			
			
									
									
Hi Protheus,
I have already made a bilinear yuv420/yv12 scaler and a yuv420/yv12 to ARGB converter for the Cell. You can find them on svn.
I'm currently running them from the console in a very bad VO module.
I also think someone else is looking into making a VO module using these, but I do not know the status.
Using those two you should be able to make a proper VO for X, and if you lack functionality I'm the guy to talk to.
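For readers who have not seen a yuv420 to ARGB conversion, here is a per-pixel sketch in plain C. It is not the code from svn. It assumes the common BT.601 studio-range coefficients in 8-bit fixed point, and the function names are made up for this example. In yuv420/yv12 each U and V sample is shared by a 2x2 group of Y samples.

```c
#include <stdint.h>

/* Scale a fixed-point value (8 fractional bits) down and clamp to 0..255.
 * Negative values are clamped before shifting to stay within defined C. */
static uint32_t scale_clamp(int v)
{
    if (v < 0) return 0u;
    v >>= 8;
    return v > 255 ? 255u : (uint32_t)v;
}

/* Convert one pixel; assumes BT.601 studio range (Y 16..235, U/V 16..240). */
static uint32_t yuv_to_argb(uint8_t y, uint8_t u, uint8_t v)
{
    int c = (int)y - 16;
    int d = (int)u - 128;
    int e = (int)v - 128;
    uint32_t r = scale_clamp(298 * c + 409 * e + 128);
    uint32_t g = scale_clamp(298 * c - 100 * d - 208 * e + 128);
    uint32_t b = scale_clamp(298 * c + 516 * d + 128);
    return 0xFF000000u | (r << 16) | (g << 8) | b;   /* alpha is opaque */
}
```

An SPU version would do the same arithmetic on 16 pixels at a time, but the scalar form is useful for verifying the output of a vectorized converter.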
			
			
									
Don't do it alone.
Hi unsolo,
can you point me to a short description of how to compile the vo from svn?
I have the description for compiling mplayer, but I have no information on how to get your sources from svn into mplayer.
I saw a patch on http://lists.mplayerhq.hu in the [MPlayer-dev-eng] "RFC on libspe2 detection" thread.
But for now, as I am new to compiling mplayer, I need a little startup help for compiling this into mplayer.
Thx.
Protheus
			
			
									
									
Hi unsolo,
I very much like the work you have done with the Cell.
I am a little concerned about the current state of the MPEG encoding.
Is the I-frame encoding still in progress?
I plan to contribute to this project and I would like to know where I can start the learning process.
Again, thank you for your great work.
Thx.
Zodd
			
			
									
									
I'm just back from 3 weeks of vacation.
Progress is being made on a general framework from my end, as well as on various bits and pieces of decoding from various other contributors.
MPEG isn't the highest priority, but I'm very confident we will be able to accommodate both MPEG and h264 decoding in the future.
			
			
									
Don't do it alone.
Thanks for all your hard work, guys. I am anxiously awaiting h264 decoding to be able to run the PS3 as a media center (with freevo as well). I have read posts from keepkool and anthraxx and it looks like it's getting there, but since I'm a Linux newb I am waiting a bit longer until a stable solution is created and a "For Dummies" walk-through tutorial is made. But again, now that 2.10 has broken access to the RSX, this project is our best bet for fluid media center presentation. Thank you again and hope you had a great vacation!
			
			
									
									You mean people actually do that?
In case anyone is interested, I'm also working on an MPEG decoder for the Cell, built from scratch with the Cell architecture in mind. In the future it could also be extended with MPEG-II, H.264 and then MPEG-4 codecs (in that order), but I probably won't be doing all that myself.
Status for this project: I-frame decoding works, the decoder framework is ~80% in place, and the memory interface to the SPEs is almost done. No SPE code has been written yet, but the way the code is set up, that's a matter of drop-in replacement of the decoder pipeline stages suited for SPE execution.
For now I have deliberately not put up any code, because the thing is still too much of a pet project I do in my spare time, so progress is slow and irregular. When I have P/B-frame decoding and the general architecture of the thing is fully fixed I will (most likely) release the sources, depending on how well it works. If it works too well I might consider keeping the code closed, by the way.
Someone might want to take a look at MPEG-4 (divx/xvid) decoding, because I predict that will be the hardest to implement (the spec is huge and very complex). H264 is easy compared to MPEG-4.
			
			
									
									
d-range wrote:In case anyone is interested, I'm also working on an MPEG decoder for the Cell, built from scratch with the Cell architecture in mind. In the future it could also be extended with MPEG-II, H.264 and then MPEG-4 codecs (in that order), but I probably won't be doing all that myself.
Status for this project: I-frame decoding works, the decoder framework is ~80% in place, and the memory interface to the SPEs is almost done. No SPE code has been written yet, but the way the code is set up, that's a matter of drop-in replacement of the decoder pipeline stages suited for SPE execution.
For now I have deliberately not put up any code, because the thing is still too much of a pet project I do in my spare time, so progress is slow and irregular. Also I don't want to accept any code or patches yet that would steer development away from my own ideas. When I have P/B-frame decoding and the general architecture of the thing is fully fixed I will (most likely) release the sources, depending on how well it works. If it works too well I might consider keeping the code closed, by the way.
Someone might want to take a look at MPEG-4 (divx/xvid) decoding, because I predict that will be the hardest to implement (the spec is huge and very complex). H264 is easy compared to MPEG-4.
unsolo wrote:d-range, I have made spexms work so that it should be possible to simply add hooks into existing software such as ffmpeg, mplayer, etc. I am now investigating various ways to do that, especially with h264 in mind.
Cool, this is for moving small parts of ffmpeg code to SPEs, right? Or does it also make it easy to completely rip out a codec and replace it?
I've not been doing anything lately, so no progress here, just some refactoring... If you need any help on the h264 specs: I studied them a lot before, when I was still planning to write an h264 codec instead of MPEG-I/II, so don't hesitate to ask if you're not sure about something, maybe I can help.
I believe making some form of hook into ffmpeg is the best overall solution, but I have to study ffmpeg more. spexms should be able to handle tasks at macroblock or even block level, though.
Some tests I have made reached 1.63 million 128-bit tasks issued to the SPEs per second, but please note that it is of course better to have fewer tasks with more data each.
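To put the task rate into perspective, it can be compared with the number of macroblocks a decoder has to process. The sketch below is plain arithmetic, not spexms code. The function names are made up, and both the 30 frames per second figure used in the note and the idea of one task per 16x16 macroblock are assumptions for illustration.

```c
#include <stdint.h>

/* Payload moved per second if every task carries payload_bytes of data. */
static uint64_t payload_bytes_per_second(uint64_t tasks_per_second,
                                         uint64_t payload_bytes)
{
    return tasks_per_second * payload_bytes;
}

/* 16x16 macroblocks per second; partial macroblocks at the picture edge
 * still count as whole ones, hence the rounding up. */
static uint64_t macroblocks_per_second(uint32_t width, uint32_t height,
                                       uint32_t fps)
{
    uint64_t mb_w = (width + 15u) / 16u;
    uint64_t mb_h = (height + 15u) / 16u;
    return mb_w * mb_h * fps;
}
```

At 1.63 million tasks per second with 16 bytes each, only about 26 MB of payload moves per second, which shows why fewer, larger tasks are preferable. One task per macroblock would need roughly 108,000 tasks per second for 720p and 244,800 for 1080p at the assumed 30 frames per second, well below the measured issue rate.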
			
			
									
Don't do it alone.
To get reasonably efficient decoding soon, I also think hooking into ffmpeg and moving parts of it to the SPEs is the best solution. Not optimal, but probably very useful. I didn't see your post about spexms in the other thread before, but it also looks very useful. I will take a look at it when I start actually moving stuff to the SPEs. I like the fire-and-forget idea, but for my decoder I think I also need some interface for double-buffering data to the SPEs and for signalling between SPE and SPE, and between SPE and PPU, that tasks have completed. Maybe there's a way to integrate that with spexms.
unsolo wrote:Perhaps I should make a decoder from scratch ;)
cheers
There is one here: http://code.google.com/p/cell-mpeg2-decoder/
It works fine for 720p but needs some optimization for 1080p.
It is a lib and can also be integrated into ffmpeg by replacing the ffmpeg decoder with a call to this lib. There is an mpeg12.c file which shows how to integrate the lib.
WTF does it work?
Anyone tried their stuff?
Kristian, are you working with these guys? I guess you are working on similar things otherwise. What do you think about their work? How far are you from achieving H.264 720p decode with your trunk?
//P
			
			
									
									