The hunt for HV's FIFO/Push buffer...

Technical discussion on the newly released and hard to find PS3.

Moderators: cheriff, emoon

cypherpunks
Posts: 4
Joined: Mon Oct 22, 2007 7:13 am

Post by cypherpunks »

I'm a native English speaker and am willing to edit the wiki. There is already a wiki at <http://wiki.ps2dev.org/> you can use. Just add whatever info you have, and I'll edit it for grammar and such, then integrate it into a cohesive document.

To everyone in this thread: THANK YOU! I'm a long-time lurker interested in writing my own OS for the PS3. The lack of acceleration (2D or 3D) was the only thing that kept me from working on it. However, your hard work has changed that, and I hope to start contributing soon!
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

OK, thanks a lot, cypherpunks. I've sketched something locally, but it still needs a lot of improvement. I'll post it on the wiki on Tuesday if I get a chance; I won't have time before then.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

RAMIN tweaking.

Post by IronPeter »

The topic of this thread is now "the hunt for the hypervisor function that creates an instance of CLASS_3D".

It looks like hunting for a black cat in a dark room.

I tried to create an instance of CLASS_3D via RAMIN writes. It took only three memory writes:

1.) offset in dwords 0x64cb8, value 0xfeed0003 ( RAMHT key )
2.) offset in dwords 0x64cb9, value 0x00105020 ( RAMHT value )
3.) offset in dwords 0x1d020 * 4, value 0x00004097 ( instance data, class 3D )
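
For readers following along, the three writes above can be tabulated and sanity-checked with a short sketch. The offsets and values are the ones quoted in the post. The field layout used by `decode_ramht_context` (instance address in 16-byte units in bits 0-19, engine in bits 20-22, channel ID from bit 23 up) is an assumption borrowed from how nouveau describes NV40-class hardware; it is not confirmed for the RSX in this thread, and the helper names are hypothetical.

```python
# The three RAMIN writes from the post, as (dword offset, value) pairs.
RAMIN_WRITES = [
    (0x64cb8, 0xfeed0003),      # RAMHT key (the object handle)
    (0x64cb9, 0x00105020),      # RAMHT value (context word)
    (0x1d020 * 4, 0x00004097),  # instance data: class 3D
]


def byte_offset(dword_offset):
    """Convert an offset counted in 32-bit dwords to an offset in bytes."""
    return dword_offset * 4


def decode_ramht_context(value):
    """Split a RAMHT context word into fields.

    ASSUMPTION: NV40-style layout as described by nouveau; the RSX
    may differ.
    """
    return {
        "instance": value & 0xFFFFF,    # instance address, 16-byte units
        "engine": (value >> 20) & 0x7,  # 1 would mean the graphics engine
        "channel": value >> 23,         # FIFO channel ID
    }


if __name__ == "__main__":
    for dwords, value in RAMIN_WRITES:
        print(f"dword 0x{dwords:05x} (byte 0x{byte_offset(dwords):06x})"
              f" <- 0x{value:08x}")
    print(decode_ramht_context(RAMIN_WRITES[1][1]))
```

Under that assumed layout, the context word 0x00105020 would decode to instance 0x5020, engine 1, channel 0.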

The newly created object was attached to subchannel 0. 2D blits work after that insertion.

The only problem is the hypervisor. It is very unhappy with my CLASS_3D and cannot make regular screen blits after ioctl( PS3FB_IOCTL_OFF ).

I have not tried any 3D stuff yet, but it looks like we have very tight GPU control.

Edit: grammar, constants
Last edited by IronPeter on Fri Oct 26, 2007 4:31 am, edited 1 time in total.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

IronPeter, can you try binding to subchannel 7 instead? I suspect subchannel 0 is already used to send the final 0x110 command in the HV blit, which might cause the problem you observe. (Note: I'm away from my console too ATM)
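
To make the subchannel suggestion concrete: in the command format that nouveau describes for NV4x-class FIFOs, each command header carries a method offset, a subchannel number (0 to 7) and a data-word count, and an object is bound to a subchannel by sending its handle to method 0 of that subchannel. The sketch below encodes such headers. The bit layout is an assumption taken from nouveau rather than from the RSX itself, and `fifo_header` and `bind_object` are hypothetical helper names.

```python
def fifo_header(subchannel, method, count=1):
    """Pack a FIFO command header.

    ASSUMPTION: nouveau's NV4x layout, with the data-word count at bit 18,
    the subchannel at bit 13, and the 4-byte aligned method offset in the
    low bits.
    """
    if not 0 <= subchannel <= 7:
        raise ValueError("subchannel must be in 0..7")
    if method % 4 or not 0 <= method < 0x2000:
        raise ValueError("method must be a 4-byte aligned offset below 0x2000")
    return (count << 18) | (subchannel << 13) | method


def bind_object(subchannel, handle):
    """Return the two dwords that would bind `handle` to `subchannel`."""
    return [fifo_header(subchannel, 0x0000), handle]


if __name__ == "__main__":
    # Bind the handle from the RAMHT key above to subchannel 7 instead of 0.
    print([hex(dword) for dword in bind_object(7, 0xfeed0003)])
```

With this encoding, a 0x110 command sent on subchannel 0 and the bind on subchannel 7 use different headers, so the two would no longer share a subchannel.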
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Glaurung, there are many workarounds: use a second context, suppress the 0x110 packets, replace the hypervisor blits with hand-made ones, or use a different subchannel. It is a minor problem.


( I'm away from my console too :)
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

3D is working.

Post by IronPeter »

The first operation was TCL_PRIMITIVE_3D_CLEAR_BUFFERS. It does work with my CLASS_3D instance.

Next I want to draw a triangle. Do not expect a result ( positive or negative ) in less than 3-4 days. There is a lot of stuff to set up ( vertex and fragment shaders, shader constants ).
ps2devman
Posts: 259
Joined: Mon Oct 09, 2006 3:56 pm

Post by ps2devman »

Grats!
Clear is enough to port pbKit demo 01, a hardware-accelerated Pong!
(if it's the command that takes x,y,w,h,color,Z parameters)
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

first documentation draft

Post by Glaurung »

Hi,

I've started a very first draft of the RSX documentation here:
http://wiki.ps2dev.org/ps3:rsx

It is probably still full of spelling and grammar errors, and may contain incorrect information. I tried to make a self-contained document and therefore it summarizes a lot of information that can be found elsewhere, especially in the nouveau wiki. Everything is on a single page for now. Anyway, native speakers are welcome to edit this page at will. Same goes for the technical side which may lack some information, be imprecise or unclear.

IronPeter, good work on the 3D object!
I'm also interested in the 3D object, fragment and vertex shaders, etc., to try the NV40 Xorg Composite code from nouveau (blending). BTW, a friend and I managed to get Xv working relatively well (for a 3-hour-long hacking session) using the blitter. The code is still very ugly, but we will provide an update of the experimental Xorg driver soon.
ps2devman
Posts: 259
Joined: Mon Oct 09, 2006 3:56 pm

Post by ps2devman »

Impressive Wiki page! Thanks a lot!
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Glaurung, I think there is a workaround for blending with 2D objects. You can probably do multiply blits with a mask to emulate the blend mode you want.

Nice page. Somebody, please, check grammar after my changes :)).
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

Here's my workaround using the SPEs :)

Bad joke, it's 5 am.

But in fact it kind of is.

I threw an experimental SPU Xv driver on SVN.
It does 1080p in X using one SPU, so it looks OK, but there are lots of TODOs, like fixing the interrupt handlers and timers.

Expect install guidelines within days, inside spu-medialib in the Xv thread.

It uses around 25% of a single SPU when upscaling 720p to 1080p. However, I think it's feasible to bring this down to 12.5% or lower.

The scaling method is currently bilinear, with floating-point precision.

But we are working on / looking into other scalers.
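
For anyone unfamiliar with the scaling method mentioned above, here is a minimal reference sketch of bilinear interpolation in floating point. It is plain Python on a 2D list of samples, not the SPU driver's code; `bilinear_sample` and `upscale` are hypothetical names, and the edge handling (clamping to the last row and column) is an assumption.

```python
def bilinear_sample(img, x, y):
    """Sample a 2D list at fractional (x, y).

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    h, w = len(img), len(img[0])
    x0, y0 = int(x), int(y)              # floor, since coords are >= 0
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)  # clamp at the edges
    fx, fy = x - x0, y - y0              # fractional weights
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def upscale(img, dst_w, dst_h):
    """Resize `img` to dst_w x dst_h, mapping corner samples to corners."""
    src_h, src_w = len(img), len(img[0])
    sx = (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
    sy = (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
    return [[bilinear_sample(img, col * sx, row * sy)
             for col in range(dst_w)]
            for row in range(dst_h)]
```

Each output sample reads four neighbours and does three linear blends, which is the per-pixel work a scaler of this kind has to vectorize.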
Don't do it alone.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

just fun.

Post by IronPeter »

It seems like some areas of RAMIN are persistent even after a cold reboot...

It looks like a good idea to analyze the RAMIN content after running GameOS :).
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

Hi,

Here is an update of the GPU based Xorg driver, with accelerated Xv support:
git clone http://mandos.homelinux.org/~glaurung/g ... eo-ps3.git

This patch to the PS3 Linux framebuffer driver is needed to test the Xorg driver:
http://mandos.homelinux.org/~glaurung/p ... ps3fb.diff

Xv support is based on the nouveau code, using the blitter. It supports YUYV and UYVY formats, and clipping and boxes work as expected (i.e. you can put a normal window over the video). Video is _not_ synchronized on vsync yet. The Xorg driver only works in the PS3 fullscreen mode for now (see the ps3videomode utility). Also, the EXA Copy operation still has a rendering bug and solid fills are unaccelerated. Performance is OK: when rendering a full HD video, Xorg takes ~10% CPU time (but since the YV12->YUYV conversion is done by mplayer, its cost should be added to that figure). Thanks to the nouveau guys for the initial code, and to my friend who did most of the Xv adaptation.

unsolo, I didn't have a chance to try your SPU-based implementation yet, but will soon. It is probably a better short-term alternative to the FB driver, and we should continue working on both drivers in parallel. For technical comparison, the GPU blitter does bilinear interpolation; scaling steps are in 12.20 fixed point and source coordinates are in 16ths of a pixel. There are limits to the width and height, but a single blit can handle full HD. The blit speed is about 16.5GB/s, but the source video has to be copied to the XDR framebuffer region first to be accessible for DMA (this is similar to how AGP works), and must be in YUYV or UYVY format.
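
As an illustration of the number formats mentioned above, the following sketch computes a 12.20 fixed-point scaling step and a source coordinate in 16ths of a pixel. It assumes the step means "source pixels advanced per destination pixel", which is how nouveau's scaled-image code computes it; the function names are hypothetical and this is not the driver's actual code.

```python
FRAC_BITS = 20  # 12.20 fixed point: 12 integer bits, 20 fractional bits


def scale_step_12_20(src_size, dst_size):
    """Source pixels advanced per destination pixel, in 12.20 fixed point.

    ASSUMPTION: step = src / dst, truncated toward zero.
    """
    return (src_size << FRAC_BITS) // dst_size


def to_sixteenths(pixels):
    """Express a source coordinate in 1/16-pixel units."""
    return round(pixels * 16)


def source_position(dst_index, step):
    """Source coordinate, in pixels, sampled for a destination pixel."""
    return (dst_index * step) / (1 << FRAC_BITS)


if __name__ == "__main__":
    # Horizontal step for upscaling a 1280-wide source to 1920 pixels.
    step = scale_step_12_20(1280, 1920)
    print(hex(step), source_position(1919, step))
```

A 1:1 blit corresponds to a step of exactly 1 << 20, and steps below that value upscale.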
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

IronPeter, you mentioned 'multiply blits' earlier, did you manage to get that working?
The problem I have with blending is that if I change the blit operation from SRCCOPY (3) to BLEND (2) it hangs the GPU. I think that's why the nouveau guys implemented the EXA composite operation using a textured quad with shaders doing the blending instead of using the blitter.
Oh, and nice find about the left-over RAMIN memory :-)
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Glaurung, as far as I know, nouveau uses 3D for blending only because EXA wants the ( 1, 1 - alpha ) blend mode, not ( alpha, 1 - alpha ).

Blending must work with 2D, but I have not tried it yet.
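
The difference between the two blend modes can be shown with a few lines of arithmetic. This is a sketch on single colour components in the range 0..1, with hypothetical function names; it only illustrates the two factor pairs named above and says nothing about how the blitter implements either.

```python
def blend_straight(src, dst, alpha):
    """( alpha, 1 - alpha ): the source colour is NOT premultiplied."""
    return alpha * src + (1.0 - alpha) * dst


def blend_premultiplied(src_premult, dst, alpha):
    """( 1, 1 - alpha ): the source colour was already multiplied by alpha."""
    return src_premult + (1.0 - alpha) * dst


if __name__ == "__main__":
    src, dst, alpha = 0.8, 0.2, 0.5
    # Same result when the source is premultiplied before the second call.
    print(blend_straight(src, dst, alpha))
    print(blend_premultiplied(src * alpha, dst, alpha))
    # Feeding a straight-alpha source into the ( 1, 1 - alpha ) mode
    # gives a brighter, incorrect result.
    print(blend_premultiplied(src, dst, alpha))
```

The two modes agree only if the source is multiplied by alpha first, so hardware that offers just one of them needs an extra multiply pass (or a shader) to emulate the other.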
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

Glaurung,

Perhaps I can make you a yuv420/yv12 to YUY2 SPE converter. It should be really, really fast.
Don't do it alone.
d-range
Posts: 60
Joined: Fri Oct 26, 2007 8:22 pm

Post by d-range »

I've been following this thread for a while, great stuff by everyone involved, keep it up!

I've just started a 'little' pet project for an H.264 decoder specifically written for the Cell CPU, from scratch and thus not based on the ffmpeg decoder, which is hard to parallelize optimally. It's still at the very, very, very early stages (only working on CABAC atm, which will probably be PPU based), but if it ever evolves into something usable, you can understand I'm very much interested in the accelerated graphics work, even if only 2D.

IronPeter or Glaurung: do you expect Sony can and will block the hypervisor regions that now allow (at least) 2D access in a future fw? And if they don't, do you expect your work to be merged into an official Xv/xorg/directfb driver with HW acceleration eventually?

I'm willing to test some stuff later if that might help in any way, and I'll keep following this thread!
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

Maybe I could make that into something like this:

CONV_YUV420_TO_YUV(Ypointer,Upointer,Vpointer,Istride0,Istride1,Opointer,Ostride0,TAG)

WAIT_CONV(TAG) //waits for that tag to be marked as complete.

where TAG is a 32-bit int or something.

In theory that should be capable of running at close to 24GB/s on a single SPE. (I do not think I will get there, though, for several reasons.)
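
As a reference for what such a converter has to produce, here is a small scalar sketch of planar YUV 4:2:0 to packed YUYV (YUY2) conversion. It is not the SPE implementation and has none of its DMA or tagging; the function name and parameters are hypothetical, loosely mirroring the pointer and stride arguments in the call above. Chroma is simply replicated across each pair of rows, which is an assumption about the desired quality.

```python
def yuv420_to_yuyv(y_plane, u_plane, v_plane, width, height,
                   y_stride=None, c_stride=None):
    """Convert planar YUV 4:2:0 to packed YUYV.

    Planes are bytes-like objects. For YV12 the V plane precedes the U
    plane in memory, so the caller must pass the planes in the right order.
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be even")
    y_stride = width if y_stride is None else y_stride
    c_stride = width // 2 if c_stride is None else c_stride

    out = bytearray(width * height * 2)  # 2 bytes per pixel in YUYV
    pos = 0
    for row in range(height):
        y_row = row * y_stride
        c_row = (row // 2) * c_stride    # one chroma row serves two luma rows
        for pair in range(width // 2):
            out[pos] = y_plane[y_row + 2 * pair]          # Y0
            out[pos + 1] = u_plane[c_row + pair]          # U
            out[pos + 2] = y_plane[y_row + 2 * pair + 1]  # Y1
            out[pos + 3] = v_plane[c_row + pair]          # V
            pos += 4
    return bytes(out)
```

The conversion is pure byte shuffling with no arithmetic, which is why the achievable speed is bounded by memory traffic rather than by computation.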
Don't do it alone.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

d-range, I have no insider info from Sony. Wait and watch :).

unsolo, you can achieve only 12 Gbps of bidirectional DMA ( 6 up and 6 down or 12 up/down ) on a single SPE, and only under ideal conditions ( large blocks with 128-byte memory alignment both in local store and XDR ).
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

Ahh, OK.

Well, anyway, it should be possible to reach those 12GB/s speeds doing only yuv420 to yuv2.
I have done yuv420 to ARGB at 2.5GB/s bidirectional, which is a lot more processing, at least when using floats. Maybe this weekend :)
Don't do it alone.
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

Managed to reach 6.1GB/s using 3 DMAs in and one out on a single SPE, with what I believe is a working yuv420 to yuv2 converter.
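
To put that figure in perspective, the memory traffic per frame can be worked out directly. The sketch below assumes that 6.1GB/s counts input and output bytes combined and that 1GB is 10^9 bytes; neither is stated in the post, and the function names are hypothetical.

```python
def frame_traffic_bytes(width, height):
    """Bytes read and written to convert one frame from YUV 4:2:0 to YUYV."""
    pixels = width * height
    bytes_in = pixels * 3 // 2   # planar 4:2:0 is 12 bits per pixel
    bytes_out = pixels * 2       # packed YUYV is 16 bits per pixel
    return bytes_in, bytes_out


def frames_per_second(bandwidth_bytes_per_s, width, height):
    """Frame rate if the converter is limited only by total memory traffic."""
    bytes_in, bytes_out = frame_traffic_bytes(width, height)
    return bandwidth_bytes_per_s / (bytes_in + bytes_out)


if __name__ == "__main__":
    print(frame_traffic_bytes(1920, 1080))
    print(frames_per_second(6.1e9, 1920, 1080))
```

Under those assumptions a 1080p frame costs about 7.3MB of traffic, so 6.1GB/s would correspond to roughly 840 frames per second, far more than video playback needs.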
Don't do it alone.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Hi, I tried to draw something in 3D. It does not work :). Probably some state was not set correctly.

Anyway, sources:

http://ps3rsx.googlecode.com/svn/trunk/

This demo does not draw anything ( although it tries to render a quad ); it just clears the blue channel of a subrectangle of the screen. The demo needs Glaurung's kernel patch to work.

Known potential issues:

1.) probably an incorrect mask for vp_in - vp_out.
2.) probably incorrect endianness for fp.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

3D is working.

Post by IronPeter »

Just small bugs.

It works now. Do an svn up.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

A black triangle is on the screen, in the new repo revision.

Enjoy.
Warren
Posts: 175
Joined: Sat Jan 24, 2004 8:26 am
Location: San Diego, CA

Post by Warren »

Wow, monumental work, guys! I can't wait until this gets a little more polished.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

Well done, IronPeter! I've just been playing a bit with your code, drawing quads, using integer instead of float vertices, etc. I'm wondering why color does not work, but I think I'll play with texturing first (based on the nv40_exa.c blending code). Congratulations!
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Glaurung, about color... Did you use the right vp_in / vp_out mask? ( just bitfields for vp input/output ).
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

svn hosting

Post by IronPeter »

Can the ps3rsx project ( or better, OpenRSX ) be hosted on the ps2dev site?

I do not want to use Google Code. I have no experience in svn administration and want to use a simple service with trivial user rights setup, etc.
emoon
Posts: 91
Joined: Sun Jan 18, 2004 10:03 pm
Location: Stockholm, Sweden

Post by emoon »

Very impressive work!

I will talk to neov (or Oobles) about setting up the project on ps2dev.org once I get hold of them.
chp
Posts: 313
Joined: Wed Jun 23, 2004 7:16 am

Post by chp »

Very nice! Excellent work, guys!
GE Dominator