Who wants 252MB more RAM for PS3 homebrew?

Technical discussion on the newly released and hard to find PS3.

Moderators: cheriff, emoon

jimparis
Posts: 1145
Joined: Fri Jun 10, 2005 4:21 am
Location: Boston

Post by jimparis »

You can't just cherry-pick single files from the repository and expect them to work with your older kernel. Just switch everything over to the latest git; it'll be a lot easier to develop on and to follow geoff.
HD
Posts: 4
Joined: Tue Mar 11, 2008 8:26 pm

Post by HD »

> The driver is still experimental, feel free to test if interested:
> git clone http://mandos.homelinux.org/~glaurung/git/ps3vram.git/

Hello,

Thanks a lot for posting this very useful extension for the PS3.

I have compiled it against YDL 5 and in general it works quite well.
Two problems:
(1) Inserting the module for the first time usually fails with the error
"could not allocate XDR buffer". After 3-4 retries (with an rmmod in between)
it works.
(2) I do not get a speed of 100-150MB/s. hdparm shows ~30MB/s, and
if I switch all disk I/O to ps3vram in my test program I get a slight speed decrease.
I do have the DMA version, not the memcpy version, of ps3vram.

Regards

Helmut Dersch
jovi
Posts: 1
Joined: Fri Apr 11, 2008 3:57 am

Stupid question, but...

Post by jovi »

Excuse me if this is a really dumb question.

Could the kernel be reconfigured to allocate the framebuffer from the video RAM? Maybe even to use the bottom chunk that it is currently blitting to already? It seems like that might have two beneficial effects:

1. Free up more primary RAM for applications.

2. Speed up video operations.

Even if the framebuffer is allocated to the video RAM above the currently used segment, it seems like maybe the blit operation could be handed off to the GPU, which arguably might speed things up a bit.
jimparis
Posts: 1145
Joined: Fri Jun 10, 2005 4:21 am
Location: Boston

Re: Stupid question, but...

Post by jimparis »

jovi wrote:
> Excuse me if this is a really dumb question.
>
> Could the kernel be reconfigured to allocate the framebuffer from the video RAM? Maybe even to use the bottom chunk that it is currently blitting to already? It seems like that might have two beneficial effects:
>
> 1. Free up more primary RAM for applications.
>
> 2. Speed up video operations.
>
> Even if the framebuffer is allocated to the video RAM above the currently used segment, it seems like maybe the blit operation could be handed off to the GPU, which arguably might speed things up a bit.
Having the CPU write directly to VRAM would be incredibly slow.
Currently we store the framebuffer in RAM and use the GPU to copy it to VRAM, which is the fastest way. The only benefit would be to free up a little bit of primary RAM, but if your intended application really cares that much about 1-2% of your RAM then you probably need to rethink your general approach anyway.
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

>Having the CPU write directly to VRAM would be incredibly slow.

Writes are fast. DMA writes are at ~5 GB/s I think.

The XDR frame buffer is allocated at system startup. By default ps3fb allocates 9 MiB for the 1080i frame buffer. You can change that value to 4-5 MiB if you do not want to use HD resolutions.
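To put rough numbers on those buffer sizes, here is a small Python sketch of the arithmetic. It assumes 32-bit pixels and a single buffer per frame; the helper name frame_bytes and the list of resolutions are illustrative only, and nothing here is taken from the ps3fb source.

```python
MIB = 1024 * 1024

def frame_bytes(width, height, bytes_per_pixel=4):
    """Bytes needed for one frame (assumption: 32-bit pixels, no padding)."""
    return width * height * bytes_per_pixel

# Illustrative resolutions; which modes ps3fb actually sizes for is not shown here.
for name, (w, h) in {"1920x1080": (1920, 1080),
                     "1280x720": (1280, 720),
                     "720x576": (720, 576)}.items():
    print(f"{name}: {frame_bytes(w, h) / MIB:.2f} MiB")
```

Under these assumptions a 1920x1080 frame is about 7.9 MiB and a 1280x720 frame about 3.5 MiB, which is in the same range as the figures quoted above.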
jimparis
Posts: 1145
Joined: Fri Jun 10, 2005 4:21 am
Location: Boston

Post by jimparis »

> >Having the CPU write directly to VRAM would be incredibly slow.
> Writes are fast. DMA writes are at ~5 GB/s I think.
Sure, but that still requires that we prepare a buffer beforehand that we can pass to gpu_blit or whatever. If we want random access to the framebuffer (which AFAIK is something Linux expects), then direct CPU writes are all we can use, and those are something like 10.6MB/sec when going direct to VRAM.
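For a sense of scale, the two bandwidth figures in this exchange can be turned into time per full-screen update. This is only back-of-the-envelope arithmetic in Python: it assumes a 1920x1080 frame at 32 bits per pixel, treats MB and GB as powers of ten, and uses the quoted 10.6 MB/s and ~5 GB/s as given, not as fresh measurements.

```python
def seconds_per_frame(frame_bytes, bytes_per_second):
    """Time to move one whole frame at a sustained transfer rate."""
    return frame_bytes / bytes_per_second

FRAME = 1920 * 1080 * 4          # assumed frame size: 32-bit pixels, 1080 lines

slow = seconds_per_frame(FRAME, 10.6e6)   # direct CPU writes, figure from the post
fast = seconds_per_frame(FRAME, 5e9)      # DMA estimate, figure from the post

print(f"direct CPU writes: {slow:.3f} s per frame")
print(f"DMA:               {fast * 1000:.2f} ms per frame")
```

With those inputs it works out to roughly 0.78 s per full frame for direct writes versus about 1.7 ms for DMA.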
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

VRAM writes by the CPU are fast (GiB/s for DMA). Reads are slow (MiB/s).
jimparis
Posts: 1145
Joined: Fri Jun 10, 2005 4:21 am
Location: Boston

Post by jimparis »

IronPeter wrote:
> VRAM writes by CPU are fast ( GiBs/s for DMA ).
I still don't understand. If the CPU writes directly to video ram through a pointer, it's 10MB/sec. That's what I measured with my original ps3vram driver. If you want to DMA, you'll need to prepare a buffer beforehand that you can pass to the DMA hardware, so you'll still need a copy of the framebuffer in main RAM. Can you be more specific about how to get GiBs/s with random writes to VRAM from the CPU?
IronPeter
Posts: 207
Joined: Mon Aug 06, 2007 12:46 am

Post by IronPeter »

> I still don't understand. If the CPU writes directly to video ram through a pointer, it's 10MB/sec

It is better to try SPU-initiated DMA for VRAM writes. I'm not sure about the exact numbers; I need to retest. But I think 5 GiB/s is achievable.

A high VRAM write rate is very useful for uploading resources (textures, etc.). It is not very useful for a software 2D driver, because you need to blend with the framebuffer or perform masked writes.
DJohn
Posts: 4
Joined: Tue Apr 15, 2008 4:39 am

Post by DJohn »

jimparis wrote:
> I still don't understand. If the CPU writes directly to video ram through a pointer, it's 10MB/sec.
The hardware is probably implementing your write as:

1) Read 128 bytes
2) Change one word
3) Write 128 bytes

The read will take you down to MB/s.

Is video memory cached? The cache will behave like this, but if you're writing entire cache lines at a time there's an instruction you can use to clear the line before you start writing. That'll remove step 1.

If it isn't cached, there probably isn't any way around it from the PPU. DMA from an SPU should get full speed (doing aligned transfers in multiples of 128 bytes).
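The read-modify-write guess above can be modelled in a few lines. This is only a toy cost model in Python, not a description of the real hardware: the function name rmw_traffic and the idea of counting bus bytes are illustrative assumptions, and the 128-byte line size is the one from the post.

```python
LINE = 128  # line size in bytes, as in the post

def rmw_traffic(writes, line=LINE):
    """Count (bytes_read, bytes_written) for a list of (address, length) writes.

    Model: every touched line is written back whole; a line that is only
    partly covered by a write has to be read first (read-modify-write).
    """
    bytes_read = 0
    bytes_written = 0
    for addr, length in writes:
        first = addr // line
        last = (addr + length - 1) // line
        for ln in range(first, last + 1):
            start = max(addr, ln * line)
            end = min(addr + length, (ln + 1) * line)
            if end - start < line:
                bytes_read += line      # partial line: fetch it before modifying
            bytes_written += line
    return bytes_read, bytes_written

# 32 separate 4-byte stores that together fill one line...
word_writes = [(i * 4, 4) for i in range(32)]
# ...versus one aligned 128-byte transfer covering the same line.
line_write = [(0, 128)]

print(rmw_traffic(word_writes))
print(rmw_traffic(line_write))
```

In this model the word-sized stores cause 32 line reads and 32 line writes for a single line of payload, while the aligned full-line transfer needs no read at all.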
jimparis
Posts: 1145
Joined: Fri Jun 10, 2005 4:21 am
Location: Boston

Post by jimparis »

> Is video memory cached? The cache will behave like this, but if you're writing entire cache lines at a time there's an instruction you can use to clear the line before you start writing. That'll remove step 1.
Right. But the original question was whether we could move the kernel's framebuffer memory directly into VRAM. I still don't think we can. Applications can mmap that space and we don't know their access patterns. We could implement our own caching in cpu-local RAM using a smaller buffer and paging in/out as necessary via MMU tricks, but that gets complicated and I'm not sure it's worth it.
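The "smaller buffer with paging in and out" idea can be sketched as a user-space toy. This is Python, not kernel code: there is no MMU involved, a bytearray stands in for VRAM, and the class name PagedFramebuffer, the 4096-byte page size and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict

PAGE = 4096  # assumed page size

class PagedFramebuffer:
    """Small write-back cache of pages in front of a slow backing store."""

    def __init__(self, backing, max_pages):
        self.backing = backing        # stands in for VRAM (length a multiple of PAGE)
        self.max_pages = max_pages    # size of the fast local buffer, in pages
        self.cache = OrderedDict()    # page index -> [bytearray, dirty flag]

    def _page(self, index):
        if index in self.cache:
            self.cache.move_to_end(index)          # mark as most recently used
        else:
            if len(self.cache) >= self.max_pages:
                self._evict()
            start = index * PAGE                   # page in from the backing store
            self.cache[index] = [bytearray(self.backing[start:start + PAGE]), False]
        return self.cache[index]

    def _evict(self):
        index, (data, dirty) = self.cache.popitem(last=False)  # least recently used
        if dirty:                                  # write back only modified pages
            start = index * PAGE
            self.backing[start:start + PAGE] = data

    def write(self, addr, value):
        entry = self._page(addr // PAGE)
        entry[0][addr % PAGE] = value
        entry[1] = True

    def read(self, addr):
        return self._page(addr // PAGE)[0][addr % PAGE]

    def flush(self):
        while self.cache:
            self._evict()

vram = bytearray(4 * PAGE)
fb = PagedFramebuffer(vram, max_pages=2)
fb.write(0, 7)
fb.write(3 * PAGE, 9)
fb.flush()
print(vram[0], vram[3 * PAGE])
```

Even in this toy form the bookkeeping (dirty tracking, eviction, flushing) is visible, which gives some feel for why doing the same thing behind mmap with page faults gets complicated.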