The hunt for HV's FIFO/Push buffer...

Technical discussion on the newly released and hard to find PS3.

IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Also, blitting from vidmem can be useful for using DDR memory as a fast swap file (CPU reads from it are very slooooow).
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

FB_SETUP and XDR<->DDR DMA

Post by Glaurung »

Hi,

First of all, let's start with some news on the information left over by FB_SETUP. The dump from IronPeter shows that there are 6 objects bound to channels 1-6:

- channel 1: instance 0x31337303 of class NvMemFormat. It is used for uploading data from XDR (DMA object 0xfeed0001) to DDR (DMA object 0xfeed0000)
- channel 2: instance 0x3137c0de of class NvMemFormat. It is used for downloading data from DDR to XDR
- channel 3: instance 0x313371c3 of class NvContextSurfaces. This describes the screen surface parameters (ARGB, ...) and is referenced by other objects.
- channel 6: instance 0x3137af00 of class NvScaledImage. This is used for blitting, along with the NvContextSurfaces object above

Those are the ones I am sure of. Then we have:
- channel 5: instance 0x31337808 of unknown class, maybe NvImageFromCpu or NvImageBlit
- channel 4: instance 0x31337a73 of unknown class, no idea what it is.
and 0x66604200 is most probably an instance of a DMA Notify object.

Binding an existing object to another channel using tag 0 works. Using this, I was able to perform DMA from DDR to XDR and vice-versa. For example, put this in the FIFO for download:

Code:

	0x0020430c, // (size = 8, chan 2)
	0x00000000, // 0x30c: src
	0x0d000000, // 0x310: dst (GPU_IOF)
	0x00004000, // 0x314: src_pitch
	0x00004000, // 0x318: dst_pitch
	0x00004000, // 0x31c: src_line_len
	0x00000400, // 0x320: line_count
	0x00000101, // 0x324: ((dst_inc << 8) | src_inc)
	0x00000000, // 0x328: buf_notify (?)
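
The first word of the listing above is a FIFO command header. A minimal sketch of how it appears to be packed, assuming the usual nouveau-style layout (word count from bit 18 up, subchannel in bits 13-15, method offset in the low 13 bits). The function names are hypothetical and are not taken from the ps3fb code:

```c
#include <stdint.h>

/* Pack a FIFO command header. The layout is inferred from the value
 * 0x0020430c above (size = 8, chan 2, method 0x30c) and from nouveau;
 * it is an assumption, not a documented RSX format. */
static inline uint32_t fifo_header(uint32_t subchan, uint32_t method, uint32_t count)
{
    return (count << 18) | (subchan << 13) | method;
}

/* Unpack the three fields again (field widths are assumed). */
static inline uint32_t fifo_header_count(uint32_t h)   { return (h >> 18) & 0x7ffu; }
static inline uint32_t fifo_header_subchan(uint32_t h) { return (h >> 13) & 0x7u; }
static inline uint32_t fifo_header_method(uint32_t h)  { return h & 0x1fffu; }
```

With these, fifo_header(2, 0x30c, 8) gives the 0x0020430c used in the download example, followed by the 8 data words for methods 0x30c through 0x328.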
Setting src to anything above 252MB crashes the GPU.. no luck for the RAMIN area, sorry IronPeter.. I did a few other random tests though:
- Extending the ioremapping of the framebuffer, I was able to read 2 more MB of memory past 252MB. Does not look very interesting, 64k of ff ff ff xx then 64k of 00 00 00 xx, etc... No idea what it might be
- I tried the same trick on the FIFO registers. Reading returns zero (except for the three FIFO registers of course) until you reach 64k < addr < 128k, which crashes the PS3 with a nice beep and a blinking red LED (just by reading!!). I guess the HV is not happy... you need to power cycle to clear the condition
- Same thing on the reports buffer. It starts with:
0x0000: 13 37 c0 d3 13 37 ba be 13 37 be ef 13 37 f0 01, etc..
(looks like some guys at Sony or IBM have humor...)
0x1400: ff ff ff ff ff ff ff ff 00 00 00 00 ff ff ff ff, etc..
0x9400: 00 04 00 00 20 00 04 10 00 5a 02 22 00 81 14 01
40 00 20 04 00 88 00 24 44 00 01 44 02 02 04 01
c0 00 0a 08 00 20 20 00 00 02 08 81 02 10 00 00
01 41 08 53 21 04 04 00 00 24 00 00 00 05 10 20
There seems to be real data from 0x9400 up, but I don't know what that could be either...
- I played a bit with the values of lv1_gpu_memory_allocate(). The four values set to zero actually refer to resources, probably two memory resources and two other resources. Here are the maximum values I could set before the call returns invalid parameters (-17):
	status = lv1_gpu_memory_allocate(ps3fb.vram_size,
					 512*1024,
					 3075*1024,
					 15,
					 8,
					 &ps3fb.memory_handle,
					 &ps3fb.vram_lpar);

Anyway, we now have everything needed to write a decent Xorg driver, with Xv and Composite support. Also, the mtd driver for swapping RAM to DDR could be improved using DMA (I measured a bandwidth of ~3GB/s in both directions, which is far less than expected but much better than direct access to DDR; blitting is faster, at ~16GB/s).

Have fun.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Good work, Glaurung

The idea of getting at RAMIN must fail. Direct RAMIN access would mean a HUGE security hole in the hypervisor.

I suppose subchannel 5 has CLASS_3D :).
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

very strange blit result

Post by IronPeter »

Probably it is my mistake, but this DDR blit into itself does really work:

http://www.everfall.com/paste/id.php?wpt5ez8wbvpq

Ye, it looks like white-black regular areas with irregular color dots :).

Edited: XDR->DDR
Last edited by IronPeter on Fri Oct 12, 2007 2:28 pm, edited 1 time in total.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Dump of the top of videomemory

Post by IronPeter »

Yes, I have a dump of the top 4 MB of vidmem. The dump contains, for example, the funny "dma_report" area in the middle, and the handles of the context objects (stored byte-swapped): 313371c3 -> c3713331.
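
To look for known handles in such a dump, each one can be byte-swapped before searching. A small sketch, assuming a plain 32-bit byte reversal is all that separates the two forms (the post only shows the one example); swap32 is a hypothetical helper name:

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit word, e.g. to match an object
 * handle against the form in which it appears in the vidmem dump. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}
```

For the handle quoted above, swap32(0x313371c3) yields 0xc3713331.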
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

@IronPeter

I'm working on spu-medialib, and we have made a yv12/yuv420 scaler that also does colorspace conversion at more than sufficient frame rates (85 FPS @ 1920x1080) using a single SPE. We have also worked on an mplayer vo using spu-medialib and libps3fb, a small fb lib we made.

If we could figure out how to blit the video as an overlay onto the X screen being rendered, it would be great. Maybe by extending libps3fb somehow.
Don't do it alone.
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

I would also be more than happy to write other 2D accelerations for the SPU, like YUYV perhaps.
Don't do it alone.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

Hi,

IronPeter: Great! I didn't try the blit, just the XDR<->DDR DMA, so blitting DDR to DDR was a good idea, and you get basically the same as I was observing with remapping (white/black stuff) plus the last 2MB. The fact that the object handles lie at the end of the framebuffer is quite interesting (hash table?), I'll have a look at that tomorrow. However I doubt channel 5 is 3D, but let's hope for it :-)

unsolo: Blitting YUYV to RGB with scaling is clearly feasible, as I reported in my first post. Actually, I'm not a 3D guy at all and I'm more interested in getting a decent Xorg driver for the PS3. This means accelerated Xv support and Composite support too (as we now have all the tools to do that, I think). Basically, changing the stretch blit format from 3 to 5 does the job. Converting YUV420 to YUYV is relatively straightforward and should not take too much CPU power, leaving the SPUs for the applications. However, until we find out how to create new GPU objects, using the SPUs for 3D might be a good option. I did actually start a FB+SPU based Xorg driver, thinking there was no hope for direct GPU access, but the experiments of the last few days show otherwise and are quite encouraging. Anyway, a single SPU can clearly handle YUV->RGB conversion at full resolution, so this might be an alternative for people not interested in messing with their framebuffer driver.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

unsolo, I am a 3D guy. Glaurung is the 2D guy you need.
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

@Glaurung please take a look here

http://wiki.ps2dev.org/ps3:spu-medialib

As you can see, we not only got the colorspace conversion into a single SPE but also a scaler into the same one.

I am more than happy to extend these to whatever is needed to make an Xv driver, and with the stuff we have in the mplayer vo that shouldn't be too hard.

But my knowledge of Xv/Xorg is what falls short. If you could or want to assist with this, it would be very much appreciated.
Don't do it alone.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

> but also a scaler into the same one

Not a problem, stretched blits are possible.
unsolo
Posts: 155
Joined: Mon Apr 16, 2007 2:39 am
Location: OSLO Norway

Post by unsolo »

Anyhow, I think making an Xv driver based on SPU scaling + csc + blit is stable and safe, at least until we see the responses to these developments.
Don't do it alone.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

Hi,

For the SPU based approach, you might be interested in this:
git clone http://manwe.homelinux.org/~glaurung/xf86-video-ps3.git

This is basically just a standard FB Xorg driver, to which I have added a simple "Hello world" from an SPU. You can plug in your media library instead; just ignore the warning from libtool, that's expected. Note: it is not usable as is, since it exits on purpose at startup. I have not worked on this driver for a month or so (was away), but I'll probably use it as the base for the GPU accelerated version too.
Also, I am not an Xorg expert (I have mostly read the FB and nouveau code for now), but I have been told by someone who did it for an embedded platform that implementing Xv and Composite mostly comes down to a few functions.
I agree that an SPU-based driver is safer and also simpler to install (you just need to change the Xorg driver, not the kernel). So if you want to give it a try, no problem. I'll focus on the GPU accelerated one though (mostly did cleanup of the kernel code today).
Nismobeach
Posts: 9
Joined: Thu Aug 16, 2007 1:31 pm

Post by Nismobeach »

This is fantastic news guys! Keep up the great work and thanks for helping PS3 Linux grow! :)
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

RAMHT dump

Post by IronPeter »

The top of vidmem is definitely RAMIN memory.

RAMHT contains instances of context objects:

handle : dword object HEADER

3137af00 04983089 // NV10_SCALED_IMAGE_FROM_MEMORY
56616661 00003002
56616660 0000303d
66626660 4000303d
66616661 00003002
66606660 0000303d
31337a73 0000309e // NV20_SWIZZLED_SURFACE
31337303 02000039 // NV_MEMORY_TO_MEMORY_FORMAT
cafebabe cafebabe
feed0001 0002303d
feed0000 0000303d
31337808 0418308a // NV_IMAGE_FROM_CPU
31337000 00000030
3137c0de 02000039 // NV_MEMORY_TO_MEMORY_FORMAT
313371c3 00003062 // NV_CONTEXT_SURFACE
66604201 04003003
66604200 00003003
66604203 0c003003
66604202 08003003
66604205 14003003
66604204 10003003
66604207 1c003003
66604206 18003003
66604209 24003003
66604208 20003003
6660420b 2c003003
6660420a 28003003
6660420d 34003003
6660420c 30003003
6660420f 3c003003
6660420e 38003003
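
In the annotated entries above, the low byte of the header dword matches the class named in the comment (0x89, 0x9e, 0x39, 0x8a, 0x62). A hedged sketch of decoding an entry on that basis. The struct layout and the low-byte rule are assumptions drawn only from this dump, and the meaning of the upper header bits is not known here:

```c
#include <stdint.h>

/* One RAMHT line as printed above: object handle plus header dword. */
struct ramht_entry {
    uint32_t handle;
    uint32_t header;
};

/* Assumption: the object class sits in the low 8 bits of the header. */
static uint32_t ramht_class(const struct ramht_entry *e)
{
    return e->header & 0xffu;
}

/* Names only for the classes annotated in the dump. */
static const char *ramht_class_name(uint32_t cls)
{
    switch (cls) {
    case 0x39: return "NV_MEMORY_TO_MEMORY_FORMAT";
    case 0x62: return "NV_CONTEXT_SURFACE";
    case 0x89: return "NV10_SCALED_IMAGE_FROM_MEMORY";
    case 0x8a: return "NV_IMAGE_FROM_CPU";
    case 0x9e: return "NV20_SWIZZLED_SURFACE";
    default:   return "unknown";
    }
}
```

Entries such as feed0000 (0x3d) or the 666042xx series (0x03) fall through to "unknown" in this sketch, since the dump does not annotate them.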

Enjoi.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Getting RAMIN via the blitter is an incredible thing.

It is like reading OS kernel data with a memcpy from userland.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

My position is

1.) Sony must fix that security hole as soon as possible ( probably, this broken condition is already fixed in the new firmwares ).

2.) I do not want to modify RAMIN by inserting 3D objects. It is an exploit.

3.) It is better for Sony to provide legal and safe hypervisor-level access to the GPU.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

blit fails with v1.93

Post by Glaurung »

Hi,

IronPeter, I was unable to reproduce the RAMIN->DDR blit with firmware 1.93. The blit you mention will hang the GPU. I tried fixing the pitches (it seems you would access past the end of VRAM otherwise), and still no luck. If I blit 1MB from offset 251MB it works (I get the white/black stuff with dots); if I change the offset to 252MB it fails (I get pure white) and a second blit is not executed. It is strange, as it does not correspond to what I see with direct access to the VRAM at offset 252MB-254MB (white/black stuff again). Unfortunately I cannot access the last 2MB with the ioremapping method, and I guess that's where the interesting stuff resides. Can you provide the offsets of the RAMIN, RAMHT, reports area, etc. with respect to the beginning of VRAM?

Here is the full test:

Code:

	/* DDR->DDR blit from end of VRAM */
	BEGIN_RING(6, 0x184, 1);
	OUT_RING(0xfeed0000);
	BEGIN_RING(6, 0x198, 1);
	OUT_RING(0x313371c3);
	BEGIN_RING(3, 0x300, 1);
	OUT_RING(0x0000000a);
	BEGIN_RING(3, 0x30c, 1);
	OUT_RING(0);
	BEGIN_RING(3, 0x304, 1);
	OUT_RING(0x10001000);
	BEGIN_RING(6, 0x2fc, 9);
	OUT_RING(0x00000001);
	OUT_RING(0x00000003);
	OUT_RING(0x00000003);
	OUT_RING(0x00000000);
	OUT_RING(0x01000400);
	OUT_RING(0x00000000);
	OUT_RING(0x01000400);
	OUT_RING(0x00100000);
	OUT_RING(0x00100000);
	BEGIN_RING(6, 0x400, 4);
	OUT_RING(0x01000400);
	OUT_RING(0x00021000);
	OUT_RING(1024 * 1024 * 251);
	OUT_RING(0x00000000);

	/* XDR->DDR blit to check GPU is still alive */
	BEGIN_RING(6, 0x184, 1);
	OUT_RING(0xfeed0001);
	BEGIN_RING(6, 0x198, 1);
	OUT_RING(0x313371c3);
	BEGIN_RING(3, 0x300, 1);
	OUT_RING(0x0000000a);
	BEGIN_RING(3, 0x30c, 1);
	OUT_RING(0x00100000);
	BEGIN_RING(3, 0x304, 1);
	OUT_RING(0x10001000);
	BEGIN_RING(6, 0x2fc, 9);
	OUT_RING(0x00000001);
	OUT_RING(0x00000003);
	OUT_RING(0x00000003);
	OUT_RING(0x00000000);
	OUT_RING(0x01000400);
	OUT_RING(0x00000000);
	OUT_RING(0x01000400);
	OUT_RING(0x00100000);
	OUT_RING(0x00100000);
	BEGIN_RING(6, 0x400, 4);
	OUT_RING(0x01000400);
	OUT_RING(0x00021000);
	OUT_RING(0x0d000000);
	OUT_RING(0x00000000);
Changing 251 to 252 will trigger the GPU crash. Am I missing something or does that mean GPU initialization was fixed in recent firmwares?
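
The BEGIN_RING and OUT_RING macros used in the test are not shown in the post. A minimal sketch of what they could look like, modeled on the nouveau convention of (subchannel, method, count) and writing into a plain array. The buffer name, its size and the bounds check are assumptions; the real macros presumably write into the mapped FIFO and handle free space:

```c
#include <stddef.h>
#include <stdint.h>

#define RING_WORDS 256              /* arbitrary size for this sketch */

static uint32_t ring[RING_WORDS];   /* stand-in for the mapped FIFO */
static size_t ring_pos;             /* next free word */

/* Append one data word; silently drops words once the buffer is full. */
#define OUT_RING(data) \
    do { \
        if (ring_pos < RING_WORDS) \
            ring[ring_pos++] = (uint32_t)(data); \
    } while (0)

/* Emit a command header announcing `count` data words for `method`
 * on `subchan` (header layout assumed, see the 0x0020430c example). */
#define BEGIN_RING(subchan, method, count) \
    OUT_RING(((uint32_t)(count) << 18) | ((uint32_t)(subchan) << 13) | \
             (uint32_t)(method))
```

With these definitions, the first two lines of the test produce the words 0x0004c184 and 0xfeed0000.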

Anyway, I would not consider direct access to the RAMIN area a security issue; isn't that how things work on x86 hardware? It would be more concerning if we were able to DMA anywhere in physical memory (which would allow exploiting the HV), but since the lv1_gpu_context_iomap() call uses lpar addresses just like everything else, I doubt it is possible.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

My push buffer:

http://www.everfall.com/paste/id.php?ez03odrzbafn

Funny, the video memory wraps around at 256 MB: the lines at the bottom of the screen come from the framebuffer (offset zero).

The screen resolutions are different (is yours 1024 x 768?). I do not think it really matters. Also, the firmware versions are different.

All RAMIN stuff is inside the last 2 megs of vidmem.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Log from the #nouveau IRC channel about secure RAMIN design:

15:26 < marcheu> IronPeter: access to parts of memory is controled with objects on nvidia hw. and their video memory object probably doesn't let you touch the part of VRAM that store RAMIN
15:26 < marcheu> IronPeter: of course, objects are written in RAMIN
15:26 < IronPeter> I see.
15:26 < marcheu> yeah that makes a secure design
15:26 < marcheu> we use the same design btw
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

kind of OT

Post by IronPeter »

I wrote a post about my efforts (be warned: it is in Russian):

http://blog.gamedeff.com/?p=57

It is a collective blog of Russian-speaking professional game developers.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

Still no luck with this either. I guess some limit was set correctly in this firmware version (as I get 0xffffffff for anything above 252MB, plus a GPU crash). No wrapping either. My resolution is the same (1280x720); it's just that I had changed the pitch to get 1MB blits.
EDIT: fix resolution
Last edited by Glaurung on Mon Oct 15, 2007 1:26 am, edited 1 time in total.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

Glaurung, looks like a security fix from Sony.
ps2devman
Posts: 259
Joined: Mon Oct 09, 2006 3:56 pm

Post by ps2devman »

We can't expect legal RSX access from Sony. I think they don't even know yet if giving Linux out of the box was a good or bad idea.

So, to get things clear :

- With fw 1.93 or below, accelerated 2D tricks thru RSX are possible.

- With fw 1.93, accelerated 3D tricks thru RSX are not possible (hole fixed).

- With fw 1.8 or below, accelerated 3D tricks thru RSX are possible.

We still need small public 2D & 3D demo sources in order to get more compatibility reports from more people with different firmwares and motherboard models. Xantium's table:
v1: no flashcard ports, 4x usb, GS+EE+RDRAM PS2 HW (NTSC/J and NTSC/U 20gb)
v2: flashcard ports, 4x usb, GS+EE+RDRAM PS2 HW (NTSC/J and NTSC/U 60gb)
v3: flashcard ports, 4x usb, GS PS2 HW (PAL 60GB and NTSC/U 80gb)
v4: no flashcard ports, 2x usb, no PS2 HW (NTSC/J, NTSC/U and PAL 40GB)

From now, avoid upgrading beyond 1.8... unless you got infectus and dumped your nands for later reflash.

EDIT : FALSE ALARM!
Hole still there in 1.93... The above statements are FALSE!
Last edited by ps2devman on Tue Oct 16, 2007 3:53 pm, edited 1 time in total.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

ps2devman, only one huge and naive hole was fixed.

The code from Sony seems to contain many holes.

Possibility of the full 3D... I do not know the English translation of the Russian phrase "Восход солнца вручную" (literally, "sunrise by hand"). Inserting user handles into RAMIN is not very easy.
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

Post by Glaurung »

Hi,

Full source for custom 2D blit is here:
http://manwe.homelinux.org/~glaurung/ps3/

It consists of a patch for the current ps3fb Linux driver that adds support for mapping the FIFO, FIFO regs and video RAM through a char device. It also adds two ioctls, one for obtaining information on the size of these regions, one for enabling the FIFO trick.
The usermode application (ps3gpu.tar.gz) maps the regions, disables automatic blitting through the already existing ioctl (PS3FB_IOCTL_ON), and installs the FIFO trick. Then a blit program is generated, video RAM is cleared, and the FIFO is executed. The default program blits a 256x256 block of the virtual framebuffer to the video memory, stretched by 2.0 in both directions and bilinear filtered. The blit is resolution dependent; change the pitches according to the actual resolution you use.

Enjoy.
ps2devman
Posts: 259
Joined: Mon Oct 09, 2006 3:56 pm

Post by ps2devman »

Many thanks!
I'm more a cygwin Oopo's toolchain and Marcus' Other OS Demo user than a Linux user, so I will try to make a patch for Other OS Demo, compilable under Windows with that.

Once PRAMIN can be filled with our data, I don't see what can stop us from playing with RSX to its full extent... Don't forget that commands you can write in push buffer are not only graphic commands. You can also ask GPU to write data in any GPU register and trigger interrupts too.
Last edited by ps2devman on Mon Oct 15, 2007 5:40 pm, edited 2 times in total.
ralferoo
Posts: 122
Joined: Sat Mar 03, 2007 9:14 am

Post by ralferoo »

Glaurung wrote:Full source for custom 2D blit is here:
http://manwe.homelinux.org/~glaurung/ps3/
Just read the code rather than trying it out, looks sweet! Haven't applied the kernel patch as my PS3 is currently very busy doing other stuff at the moment...

In terms of the kernel patch, I'd suggest adding a timeout and possibly retry to ps3fb_setup_fifo when you try to grab control of the FIFO, just in case it goes wrong.

I also haven't done any playing, but I'm just wondering about the whole eptr/rptr thing and whether it's possible to write to those registers... because otherwise we're going to reach the end of the FIFO buffer fairly quickly. It seems you only ever write to wptr and just watch for the eptr to catch up. Presumably it's possible to reset eptr ourselves? The hypervisor must be doing that, at least...

Anyway, nice work. It might also be good to call lv1_get_version_info and expose the firmware<1.9.3 check as a simple boolean is3Dsupported ioctl.

Dying to experiment with this myself! I've also just bought a second PS3 for development purposes that's FW 1.70, so I've got a reason not to upgrade that now! Bit of a shame - I'd rather keep my NTSC one for development and use the PAL one for games, but I'd rather have 3D homebrew!
Glaurung
Posts: 49
Joined: Thu Oct 11, 2007 4:54 am

X driver with accelerated EXA Copy using FIFO hack

Post by Glaurung »

Hi,

I just got the first EXA accelerated operation working in my X driver. It still has some bugs and lots of debug output, but I got the FIFO management code handling wrap-around correctly, and the Copy operation (i.e. VRAM to VRAM copy) is implemented using a stretched blit. I tried to stay as close to the nouveau and FB code as possible, and the code is still a real mess. The X framebuffer resides in video RAM, so any unaccelerated operation will be slow as hell, but speed is already OK with just Copy accelerated. If ugliness and experimental code do not scare you, it is there:
git clone http://manwe.homelinux.org/~glaurung/xf86-video-ps3.git
(I moved the old SPU based draft to xf86-video-ps3-spu.git)
It uses the same kernel patch to get access to the GPU.

ralferoo: Adding timeouts would indeed be a good idea; it is just that the code is still very experimental. I don't know the difference between rptr and eptr, except that eptr lags a bit behind. But people from nouveau and nv use rptr for syncing. As for running out of FIFO space, there is a jump command (0x20000000 | address) to get back to the beginning of the FIFO. I had to take care not to reach the last 1kB of the FIFO for this to work without hanging, though.
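
A rough sketch of the wrap-around described above: before writing a command, check whether it still fits below a reserved tail, and if not, emit the jump word and restart at the beginning. The struct, the field names and the idea of a fixed reserved tail are assumptions for illustration, not the code from the driver. The sketch also does not wait for the GPU to finish reading the start of the FIFO before overwriting it, which real code would need to do:

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_JUMP 0x20000000u   /* jump command: FIFO_JUMP | address */

struct fifo {
    uint32_t *base;         /* CPU mapping of the FIFO */
    uint32_t  gpu_addr;     /* FIFO start as seen by the GPU (assumed) */
    size_t    size_words;   /* total FIFO size in 32-bit words */
    size_t    reserve_words;/* tail left unused, e.g. 256 words for 1kB */
    size_t    pos;          /* next free word */
};

/* Make sure `n` words plus one jump word fit below the reserved tail.
 * If they do not, write a jump back to the start and rewind.
 * Returns 1 if the FIFO wrapped, 0 otherwise. */
static int fifo_reserve(struct fifo *f, size_t n)
{
    size_t limit = f->size_words - f->reserve_words;

    if (f->pos + n + 1 <= limit)
        return 0;

    f->base[f->pos] = FIFO_JUMP | f->gpu_addr;
    f->pos = 0;
    return 1;
}
```

The caller advances pos itself as it writes words; keeping one word spare in the check guarantees there is always room for the jump.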

ps2devman: Any pointer to the commands for reading/writing registers?
rapso
Posts: 140
Joined: Mon Mar 28, 2005 6:35 am

Post by rapso »

ps2devman wrote:We can't expect legal RSX access from Sony. I think they don't even know yet if giving Linux out of the box was a good or bad idea.
Sony could give us access through OpenGL (ES); there should be no problem with that.
I think the bigger problem would be with the companies that paid for licenses to get full access to the PS3.