PSP Slim can't output double-buffered 720x480

edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

PSP Slim can't output double-buffered 720x480

Post by edepot »

AgenaWorld 1.5 was released for the neoflash competition today.
You can download it at http://www.edepot.com/game.html

One interesting thing about this version of the game is that it
can output to the TV (when running on the PSP Slim).
The game is double-buffered, so it can't do 720x480; instead it
has been tweaked to offer maximum resolutions of 640x384 and
720x320. The full story of this version's development is at:

http://www.edepot.com/forums/viewforum.php?f=8
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

There's only 2MB of VRAM. 720x480x4x2 = 2,764,800 which is greater than 2MB. Switch to thousands mode for double-buffered 720x480. Or just keep a single buffer in VRAM and blit to the appropriate system memory frame buffer.
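
The arithmetic above can be checked with a short C sketch. The helper names `framebuffer_bytes` and `fits_in_vram` and the `VRAM_BYTES` constant are made up for illustration and are not part of the PSP SDK; the figures use the visible 720-pixel width rather than any padded line pitch.

```c
#include <assert.h>
#include <stddef.h>

/* 2 MB of VRAM, as stated in the post above. */
#define VRAM_BYTES (2u * 1024u * 1024u)

/* Hypothetical helper: bytes needed for `buffers` frames of
 * width x height pixels at `bytes_per_pixel` bytes each. */
static size_t framebuffer_bytes(size_t width, size_t height,
                                size_t bytes_per_pixel, size_t buffers)
{
    assert(bytes_per_pixel == 2 || bytes_per_pixel == 4);
    return width * height * bytes_per_pixel * buffers;
}

/* Returns 1 if the requested buffers fit in VRAM, 0 otherwise. */
static int fits_in_vram(size_t width, size_t height,
                        size_t bytes_per_pixel, size_t buffers)
{
    return framebuffer_bytes(width, height, bytes_per_pixel, buffers)
           <= VRAM_BYTES;
}
```

With these, `fits_in_vram(720, 480, 4, 2)` is 0 (2,764,800 bytes against 2,097,152), while the 16-bit case `fits_in_vram(720, 480, 2, 2)` is 1 (1,382,400 bytes), and a single 32-bit buffer also fits.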
edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

2MB VRAM limit

Post by edepot »

Actually I don't know how to switch to thousands mode. Do you have
sample code?

Also, if blitting from system memory, how does that
affect VRAM? Do you mean leave the LCD buffer location to point to beginning of 720x480x4, and have 720x480x4x2 outside of VRAM in
system ram, and then use PSP's hardware blit call to alternatively copy either one of the system ram 720x480x4 to the allocated VRAM?
If yes, then what is the command for the blit? (including the initial
setting of the 720x480x4 for vram?). And where would you poke
your bytes to manually modify the buffer in system ram?
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Re: 2MB VRAM limit

Post by J.F. »

edepot wrote:Actually I don't know how to switch to thousands mode. Do you have sample code?
In pspdisplay.h, you find

Code:

/** Framebuffer pixel formats. */
enum PspDisplayPixelFormats {
	/** 16-bit RGB 5:6:5. */
	PSP_DISPLAY_PIXEL_FORMAT_565 = 0,
	/** 16-bit RGBA 5:5:5:1. */
	PSP_DISPLAY_PIXEL_FORMAT_5551,
	/* 16-bit RGBA 4:4:4:4. */
	PSP_DISPLAY_PIXEL_FORMAT_4444,
	/* 32-bit RGBA 8:8:8:8. */
	PSP_DISPLAY_PIXEL_FORMAT_8888
};
So you could do

Code:

	sceDisplaySetFrameBuf(frameBuf, 512, PSP_DISPLAY_PIXEL_FORMAT_5551, PSP_DISPLAY_SETBUF_NEXTFRAME);
edepot wrote:Also, if blitting from system memory, how does that
affect VRAM? Do you mean leave the LCD buffer location to point to beginning of 720x480x4, and have 720x480x4x2 outside of VRAM in
system ram, and then use PSP's hardware blit call to alternatively copy either one of the system ram 720x480x4 to the allocated VRAM?
If yes, then what is the command for the blit? (including the initial
setting of the 720x480x4 for vram?). And where would you poke
your bytes to manually modify the buffer in system ram?
I mean do all the rendering to VRAM (since that's the only place that you can use the GU on), then copy to a frame buffer in system memory. The frame buffer can be anywhere and is set with the sceDisplaySetFrameBuf() command mentioned above.

For example, I modified graphics.c to blit from screen to memory like this:

Code:

void blitScreenToFrameBuf(int sx, int sy, int width, int height, int dx, int dy, void *dest, int laced)
{
	if (!initialized) return;
	unsigned int vram = (unsigned int)getVramDrawBuffer();
	//sceKernelDcacheWritebackInvalidateAll();
	guStart();
	if (laced)
	{
		// First field: start at source line 1 and double the source pitch to take every other line
		sceGuCopyImage(GU_PSM_8888, sx, sy, width, height>>1, PSP_LINE_SIZE<<1, (void *)(vram+PSP_LINE_SIZE*4), dx, dy>>1, PSP_LINE_SIZE, dest);
		sceGuTexSync();
		// Second field: start at source line 0; the destination begins 262 lines into the buffer
		sceGuCopyImage(GU_PSM_8888, sx, sy, width, height>>1, PSP_LINE_SIZE<<1, (void *)vram, dx, dy>>1, PSP_LINE_SIZE, (void *)((unsigned int)dest + PSP_LINE_SIZE*262*4));
	}
	else
		// Progressive: one straight copy of the whole region
		sceGuCopyImage(GU_PSM_8888, sx, sy, width, height, PSP_LINE_SIZE, (void *)vram, dx, dy, PSP_LINE_SIZE, dest);
	sceGuTexSync();
	sceGuFinish();
	sceGuSync(0,0);
}
edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

double buffering in 720x480x4

Post by edepot »

I kinda understand what you are saying but how does that solve the double buffering problem in 720x480x4? The thousands mode you mention I knew about already (is 16bit called thousands mode?). I had 16bits per pixel before and moved to 24bits per pixel to get rid of the washed out pictures. So going back to the old "thousands" mode won't work in this case.

As for the double buffering of 720x480x4, I don't quite grasp how you are solving the problem by blitting TO system ram rather than blitting FROM system ram. If you do GU stuff in video ram, you are occupying 720x480x4 (the first buffer). Then when you blit to the system memory, I presume that is the second buffer. But to display the second buffer you still need to blit the memory back from system ram to video ram, right? But because the video ram can hold only one full 720x480x4 display, you can't blit from system to video (this second buffer) without ruining the first screen in video ram.

Are you saying there is NO way to use double buffering in 720x480x4 resolution when you use GU stuff? But is possible with non-GU?

Here is how I think double buffering would work in 720x480x4 using
only one frame of 720x480x4 system ram...

non-GU:
work in system ram (buffer 1)
blit system ram to video ram (now holds buffer 1)
clear system ram
work in system ram (buffer 2)
blit system ram to video ram (now holds buffer 2)
clear system ram
work in system ram (buffer 1)
blit system ram to video ram (now holds buffer 1)
... etc

But how do you do it while using GU accelerated stuff too?
Each frame would need to use GU and then have the ability to
manually change the pixels (I presume in system ram in a
separate buffer because manual stuff is slow).

How about this...
GU -> video ram (buffer 1)
blit video ram -> system ram Loc1 (buffer 1)
//blit system ram Loc2 to video ram (buffer 2) <-no blit first time around
do manual pixel stuff in system ram Loc1 (buffer 1)
GU -> video ram (buffer 2)
blit video ram -> system ram Loc2 (buffer 2)
blit system ram Loc1 to video ram (buffer 1) *
do manual pixel stuff in system ram Loc2 (buffer 2)
GU -> video ram (buffer 1)
blit video ram -> system ram Loc1 (buffer 1)
blit system ram Loc2 to video ram (buffer 2) *
do manual pixel stuff in system ram Loc1 (buffer 1)
GU -> video ram (buffer 2)
...etc

The above scenario should allow both GU accelerated stuff and
manual pixel manipulation in a second work buffer. However, if you look
at the lines marked with an asterisk (*), there may be a flicker between
the time the GU finishes accelerated drawing in video memory for one buffer and the time the other finished buffer is blitted from system memory into video memory (because they are different buffers).

So if no flicker is desired (is it fast enough?), would it work? If not, then
GU accelerated stuff can't be used if double buffering is needed right?

And if it comes down to no GU accelerated stuff, is there a simple command to blit from system memory to video memory? I mean is it possible to allocate system ram, do stuff in that system ram and blit to video when finished, so you can work on the second buffer in system ram while the video memory is displaying the first? I haven't seen much code that does this, as I presume they all have both accelerated buffers in video memory (because they are using only the small PSP display).
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

You have room in the vram for one 720x480 buffer. If you use that as the frame buffer as well, you could see the drawing as it occurs. To prevent that, you copy the data once it's done drawing into a frame buffer in system memory (the vram buffer is the drawing buffer, and the system memory buffer is the display buffer). To be able to flip the display without tearing requires two display buffers. So you have the drawing buffer in vram, and two display buffers in system memory.

Set the frame buffer to display buffer 1.
Do all drawing to the drawing buffer.
Copy drawing buffer to display buffer 2.
Start drawing next frame.
At vertical blank, switch frame buffer from display buffer 1 to 2.
Wait for next frame to finish drawing.
Copy drawing buffer to display buffer 1.
Start drawing next frame.
At vertical blank, switch frame buffer from display buffer 2 to 1.
Repeat.

That's all hardware accelerated. The copying from vram to system memory is DMA, just like the other way around when doing things like loading textures into vram. If you looked at the code I gave above, it uses a GU command and then waits on the GU to finish. Making the GU DMA in interlaced mode was a real fun thing to work out. It took some experimentation to work that out so that it blitted every other line in one single blit rather than doing 240 blits like everyone else seems to do. 8)
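
The sequence of steps above can be modelled off-device with a small C sketch. This is only a simulation under stated assumptions: plain arrays stand in for vram and system memory, `memcpy` stands in for the GU blit, and the `Screen` type and every function name here are invented for illustration rather than taken from any PSP library.

```c
#include <assert.h>
#include <string.h>

#define FRAME_BYTES 16   /* tiny stand-in for a full 720x480 frame */

/* One drawing buffer (vram in the post) and two display buffers
 * (system memory in the post). All plain arrays in this sketch. */
typedef struct {
    unsigned char draw[FRAME_BYTES];
    unsigned char display[2][FRAME_BYTES];
    int shown;   /* index of the display buffer being scanned out */
} Screen;

static void screen_init(Screen *s)
{
    memset(s, 0, sizeof *s);
    s->shown = 0;   /* "Set the frame buffer to display buffer 1." */
}

/* Stand-in for GU rendering: fill the drawing buffer with a frame id. */
static void draw_frame(Screen *s, unsigned char frame_id)
{
    memset(s->draw, frame_id, FRAME_BYTES);
}

/* Copy the finished drawing into the display buffer that is NOT on
 * screen. On hardware this would be the GU blit; here it is a memcpy. */
static void copy_to_hidden(Screen *s)
{
    int hidden = 1 - s->shown;
    memcpy(s->display[hidden], s->draw, FRAME_BYTES);
}

/* Stand-in for switching the frame buffer at vertical blank. */
static void flip_at_vblank(Screen *s)
{
    s->shown = 1 - s->shown;
}

/* First byte of whatever is currently being displayed. */
static unsigned char visible_pixel(const Screen *s)
{
    assert(s->shown == 0 || s->shown == 1);
    return s->display[s->shown][0];
}
```

Calling `draw_frame`, `copy_to_hidden`, then `flip_at_vblank` in a loop follows the list above: the visible buffer only ever changes at the flip, so a half-drawn or half-copied frame is never the one being shown.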
edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

display buffer can be outside of vram

Post by edepot »

Hmm... I didn't know that the display buffer can be pointed to someplace outside of vram. I know that trying to do 720x480x4x2 leaves the second buffer with artifacts, and I presume these artifacts are actual data somewhere in system ram (video ram is mapped into the same address space as system ram, right?). In that case (assuming no accelerated GU calls), is it possible to do pixel stuff on the second buffer (in vram) up to the point where it is about to go outside of vram, and, to manipulate pixels beyond that limit, add or subtract some delta value that points to the actual system ram concatenated to the bottom of the second 720x480x4 buffer? This would save lots of memory. What I don't understand is whether it is possible to move vram to anywhere in the address space, so that the bottom of the second buffer maps to somewhere writable in system ram that does not contain OS data or other important data whose corruption would crash the psp.

But the possibility to map display buffer in system ram solves a lot of problems with gu doing large display screens on multiple buffers as well.
Is there a list of places in the psp address space that you are not supposed to touch? Can you manually put your program data somewhere in the address space so that you leave large chunks of space for your two buffers? In other words, where can your two large buffers exist in the address space? Do the stack and heap (I assume that is where the two buffers will end up) grow in opposite directions? Are they fixed at some address when you start a program?

Lastly, can you move your program into the vram? This would leave
a large chunk of contiguous system ram. I think the PSP doesn't have
virtual address space, so every program including the OS would have
to have their absolute address somewhere and they must not overwrite each other.

So how do you point the display buffer to a safe place in system ram?
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

The DISPLAY buffer can be in system memory. The DRAWING BUFFER cannot. All drawing must be in the vram. That's why I told you to make one drawing buffer in vram that never changes, and two display buffers in sysmem that are flipped each frame.

Note that if the drawing is by the CPU, you don't need to use vram at all. Doom for the PSP does that - Doom renders to memory, then I convert the memory into the actual display which is also in system memory (for TV modes). If you use the GU, only vram can be used as the target of drawing operations. So copying data can go either way, but sprites or 3D require you to target vram.

And no, you can't split a frame to be partly in vram, and partly in sysmem. It's ALL in or all out. However, having one single drawing buffer in vram does leave room for textures in vram.

I suppose you could put SC code in vram, but it's a waste of a valuable resource. You cannot put ME code in vram as the ME has no access to the vram.

You set the frame buffer to any valid address using sceDisplaySetFrameBuf() as I mentioned above. The variable "frameBuf" can point to vram or sysmem. There have been examples posted on framebuffers in system memory here. In fact, I did an example here a while back that showed what I've been talking about - it drew to the vram while copying to a framebuffer in system memory. That's where I pulled the code from I posted above. It was called CursorTest as I did it in response to someone who needed help making a cursor. I eventually used it to demonstrate a number of things: a cursor, loading and displaying an anigif, and using a vram buffer and two sysmem buffers for TV out.

Here's an arc of all the stuff in cursortest (source, pics, etc):
http://www.mediafire.com/?uz0uaw2jnb9
edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

Post by edepot »

Yes, I understand what you are saying. Your generic definition of drawing buffer is GU specific in vram, and when using CPU the drawing buffer is actually not limited to the vram.

There are a few other things that are bugging me though. I could "almost" swear that when I put both buffers in vram, the bottom half of the second buffer exists in system ram. This is when the display mode is set to 720x480x4x2 in vram. Is the vram fixed at a certain address on the PSP regular and slim? Is it possible to relocate it?
And if not, do you happen to know what the psp is displaying on the TV when the bottom of the second buffer is shown? It is showing data from somewhere, right? I would presume that data exists in system ram, and it would be interesting to know where exactly it is in the address map.

Also, is it possible to move your program in system ram dynamically?
Can a psp program recursively modify its own code in ram? You mentioned that the code can exist in vram; do you know how to go about doing this? Do you add or modify some parameters in the makefile when compiling? Or must you have a prx module plugin that changes the default location of the eboot when loading? Or can the eboot.pbp have a parameter telling the OS to load it somewhere specific? Lastly, can you point to info on whether the stack and heap grow towards the lower addresses or higher addresses, and on their default address locations? Do they start growing at the beginning and end of your program in memory, or somewhere else? And where do the OS and other stuff exist in the address space? Without this info, how would you know where you can put your buffers in system memory without corrupting something important?

I will take a closer look at your cursor code. The interlace stuff is pretty good. At a glance, I presume you are doubling the pixel line length so that it includes the second line, then blitting the left half for odd and the right half for even in one go. But I thought the PSP slim accumulates output in progressive only, even if you set it to interlace?
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

The EDRAM for the GU is at 0x04000000 in the SC memory map and is 2 MB in size. The system memory is at 0x08000000 and is either 32 or 64 MB (Phat and Slim respectively). You'll find a memory map here:

http://hitmen.c02.at/files/yapspd/psp_d ... tml#sec7.2

If you write past the end of the vram, there's no telling what will happen. Generally, you get all kinds of garbage. That's what I got when I tried two frames in the vram before I remembered the size was too big. The "bottom" cannot be remapped somewhere. It's either in the vram entirely, or it's not. There's no MMU in the CPU, and no GART in the VDP.

Technically speaking, if I remember old threads right, the vram mirrors a number of times, but the mirrors are not true mirrors, but do various types of swizzling. So writing past the end of vram is writing to the start of vram with the data bytes reordered. It does NOT go to system memory. That is certainly not what you want. You simply CANNOT put two 720x480 frames into vram. It just doesn't fit. Period.

You can move code around no problem. Just be sure to flush the caches after you do so. Moving code is a simple form of dynamic recompilation. Look at the code for Daedalus for an example of dynamically generating code at an arbitrary address. However, that doesn't apply to prxs. You rely on the firmware to load them, and accept whatever address it puts them at. You don't get to choose that as far as I know.

Don't worry about heaps or stacks. You don't need to know anything about them. Just allocate the frame buffers. Use memalign() to allocate aligned memory from the heap. I cheated on Doom and simply set the address of the buffers to the extra memory in the slim. I knew it wasn't being used as the extra memory wasn't available yet (that came in the next rev of the cfw), and I knew that if you had TV, you had the extra memory. I believe the cursor test does the same thing (hardcoded address in the slim's extra memory).
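
As a rough illustration of the memalign() suggestion, the sketch below allocates a 32-bit display buffer from the heap. The 64-byte alignment, the 768-pixel pitch, and the helper names `alloc_display_buffer` and `is_aligned` are assumptions made for this example; they are not values or functions taken from the cursortest or Doom sources.

```c
#include <assert.h>
#include <malloc.h>   /* memalign() */
#include <stdint.h>
#include <stdlib.h>   /* free() */

#define TV_PITCH   768   /* assumed line pitch in pixels */
#define TV_LINES   480   /* visible lines in progressive mode */
#define ALIGNMENT   64   /* assumed alignment in bytes */

/* Allocates one 32-bit display buffer aligned to ALIGNMENT bytes.
 * Returns NULL on failure; release the buffer with free(). */
static void *alloc_display_buffer(void)
{
    size_t size = (size_t)TV_PITCH * TV_LINES * 4;
    return memalign(ALIGNMENT, size);
}

/* Returns 1 if `p` sits on an ALIGNMENT-byte boundary. */
static int is_aligned(const void *p)
{
    assert(p != NULL);
    return ((uintptr_t)p % ALIGNMENT) == 0;
}
```

Two calls to `alloc_display_buffer()` give the two display buffers; either pointer could then be handed to sceDisplaySetFrameBuf() in place of a vram address. An interlaced buffer would need room for more lines than the 480 assumed here.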

There are currently three TV modes you can choose: Progressive Component, Interlaced Component, and Interlaced Composite. Only Progressive Component is "normal" - you have 480 lines of 720 pixels (768 units wide if that is what you set the pitch to be). Both the Interlaced Component and Interlaced Composite are laid out in a "weird" format - you have the even lines, then a gap, then the odd lines. So you have 240 lines, then a gap of 22 lines, then 240 more lines, then one final line of nothing (total of 503 lines, only 480 visible).

So the code I posted has two methods of blitting to the TV buffer: progressive merely blits all 480 lines at once; interlaced does two blits - one for the even lines, and then a second for the odd lines.
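
The interlaced layout can be written down as a mapping from a progressive line number to a row in the TV buffer. The sketch below is an interpretation, not code from cursortest: the macro and function names are hypothetical, and the choice of which lines land in which field follows the blitScreenToFrameBuf() code posted earlier (source line 1 goes to the top of the buffer, source line 0 goes to row 262). If the post counts lines from 1, its "even lines" are the 0-based odd lines used here.

```c
#include <assert.h>

#define VISIBLE_LINES    480  /* visible lines per frame */
#define FIELD_LINES      240  /* lines per field */
#define GAP_LINES         22  /* blank lines between the two fields */
#define SECOND_FIELD_ROW (FIELD_LINES + GAP_LINES)        /* 262 */
#define TOTAL_ROWS       (SECOND_FIELD_ROW + FIELD_LINES + 1) /* 503 */

/* Maps a 0-based progressive line to its row in the interlaced buffer.
 * Lines 1, 3, 5, ... fill the first field; lines 0, 2, 4, ... fill the
 * second field, which starts at row 262. */
static int interlaced_row(int y)
{
    assert(y >= 0 && y < VISIBLE_LINES);
    if (y & 1)
        return y / 2;
    return SECOND_FIELD_ROW + y / 2;
}

/* Byte offset of that row for a pitch in pixels at 4 bytes per pixel. */
static unsigned long interlaced_offset(int y, int pitch_pixels)
{
    return (unsigned long)interlaced_row(y)
           * (unsigned long)pitch_pixels * 4ul;
}
```

The second field's offset, `interlaced_offset(0, pitch)`, equals the `PSP_LINE_SIZE*262*4` term in the posted blit when `pitch` is PSP_LINE_SIZE, and the last visible row is 501, leaving row 502 as the final unused line.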

Actually, looking at the cursortest code, I don't think I used two system memory buffers for TV out - only one. I guess I was counting on the blit being fast enough to not cause tearing. It really should double buffer that.

Oh well, it was merely some test code. :)
gauri
Posts: 35
Joined: Sun Jan 20, 2008 11:17 pm
Location: Belarus

Post by gauri »

didn't they increase the size of VRAM too on Slim? or sceGeEdramGetSize() still reports 2M?
sorry, no Slim here to check...
moonlight
Posts: 567
Joined: Wed Oct 26, 2005 7:46 pm

Post by moonlight »

gauri wrote:didn't they increase the size of VRAM too on Slim? or sceGeEdramGetSize() still reports 2M?
sorry, no Slim here to check...
Probably in game mode the size returned is 2 MB, as Sony probably doesn't want official devs to detect the "slimness" of a psp.

But I saw that paf calls the function sceGeEdramSetSize (kernel only) through vshbridge. The code is something like this:


Code:

if (slim)
   sceGeEdramSetSize(4*1024*1024);
else
   sceGeEdramSetSize(2*1024*1024);
So BOOM, we have now 2 MB more of edram, you just need a kernel prx to call sceGeEdramSetSize.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Does it really have two more megs of EDRAM? I'll have to try messing with cursor test again, but the last I remember, going past 2M resulted in trash. It didn't act like 4M were there. So we have to call that function sceGeEdramSetSize first and THEN we can use 4M? I'll have to experiment. Thanks for the tip.
moonlight
Posts: 567
Joined: Wed Oct 26, 2005 7:46 pm

Post by moonlight »

Well, I haven't tested it, but it is what paf is doing. I saw pops/popsman doing the same, though a bit differently: popsman unconditionally calls the SetSize function with the 4 MB parameter, then calls sceGeEdramGetSize and returns that.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Okay, now the $10M question - is sceGeEdramSetSize handled by the NID resolver? I found it in the 3.52 sceGe_driver, but it's almost certainly different for 3.80/3.90.

The best place to set this would probably be in the dve manager prx. They are related after all. :) When you set one of the TV modes, you just set the memory as well.
moonlight
Posts: 567
Joined: Wed Oct 26, 2005 7:46 pm

Post by moonlight »

Yes, all sceGe_driver functions are handled by the nids resolver.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Cool! I'll try modifying my cursor test to use 4M and see what happens.

EDIT: Tried... and SUCCEEDED! Setting the memory to 4M works like a charm. :)

Here's a newer cursortest, with updated dvemanager that sets the GE mem to 4M when you set a TV mode. Note in the test program makefile that there are two flags: FULLSCREEN, and INTERLACED. FULLSCREEN tells the test to do TV out. If INTERLACED is then also defined, it does interlaced, otherwise it does progressive. The compiled EBOOT in the arc is interlaced since that was what I was testing last. If someone wants me to recompile it without interlaced (because they can't compile it themselves or something), just let me know.

In progressive mode, it's just like regular - two buffers in the vram that are flipped between drawing and displaying. It works really nice and fast. In interlaced mode, I use one buffer for drawing at the start of vram, then I blit it to a display buffer (also in vram). Unless you do the drawing as interlaced, you still have to copy the drawing buffer to the display buffer to account for the difference in storage for interlaced mode. That makes it not as fast as progressive.

http://www.mediafire.com/?kcmx2ryhqub

This "new" info about 4M should really help folks trying to make TV out who were daunted by the 2M limit. It also means that people who wish to optimize an existing app for the slim could set the edram to 4M and increase the amount of texture caching they do. :)

EDIT: Here's the same test, but progressive mode. No source in this arc since it's just the above recompiled with the INTERLACED define commented out.

http://www.mediafire.com/?vvbnfjpzqjz
edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

works

Post by edepot »

Well, I just took the dvemgr.prx from cursortest, plopped it into Agenaworld, then changed some code (the FRAMESIZE, width and height) and confirmed that it works also. That is neat :) I have an old version of pspsdk, so I couldn't get your program to compile without stripping out gif stuff, but even without the anigif, I notice that your cursor (which is not a gif) has a problem displaying near the bottom of the screen. (This was in progressive mode though. I don't have a composite cable to test interlace stuff). To get 720x480x4x2 working, however, I had to use a framesize of 0x5B000 instead of 0x5A000 (I think this is with each value of size 4 bytes).
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Re: works

Post by J.F. »

edepot wrote:Well, I just took the dvemgr.prx from cursortest, plopped it into Agenaworld, then changed some code (the FRAMESIZE, width and height) and confirmed that it works also. That is neat :) I have an old version of pspsdk, so I couldn't get your program to compile without stripping out gif stuff, but even without the anigif, I notice that your cursor (which is not a gif) has a problem displaying near the bottom of the screen. (This was in progressive mode though. I don't have a composite cable to test interlace stuff). To get 720x480x4x2 working, however, I had to use a framesize of 0x5B000 instead of 0x5A000 (I think this is with each value of size 4 bytes).
Giflib isn't part of the SDK. I posted info on how to compile that and libFLAC quite some time ago. If you want, I can arc the lib and include files for folks.

I'm running the progressive version on my Slim right now, and the cursor looks fine on my TV all the way down to the bottom, but the TV has SOME overscan, so I can't see the ENTIRE display. I know from my centering code in DOOM that it has only a very small overscan, so if it has a problem, it would only be in the last few lines that I can't see on my TV.

Oh, you can use interlaced with component. I test both on the same cable and TV. I don't have a composite cable for testing, but the memory layout of the buffer is the same as for component interlaced.

Also, how do you get a framesize of 0x5a000? That's just 768x480, not 768x480x4. Also, 0x5b000 doesn't even work out to recognizable dimensions in any case.
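
The two framesize values can be checked numerically. The sketch below assumes, as edepot suggests, that FRAMESIZE counts 4-byte pixels rather than bytes; the helper names `frame_pixels` and `whole_lines` are invented for this check.

```c
#include <assert.h>

/* Number of pixels in a frame with the given pitch and line count. */
static unsigned long frame_pixels(unsigned long pitch, unsigned long lines)
{
    assert(pitch > 0);
    return pitch * lines;
}

/* Returns the line count if `pixels` is a whole number of lines at
 * `pitch`, or 0 if it does not divide evenly. */
static unsigned long whole_lines(unsigned long pixels, unsigned long pitch)
{
    assert(pitch > 0);
    return (pixels % pitch == 0) ? pixels / pitch : 0;
}
```

Under that assumption, `frame_pixels(768, 480)` is 0x5A000 pixels, or 0x168000 bytes at 4 bytes per pixel, while `whole_lines(0x5B000, 768)` and `whole_lines(0x5B000, 720)` both return 0, so 0x5B000 is not a whole number of lines at either width.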

EDIT: here's the giflib lib and include files for those who want to recompile the cursortest but don't wish to try compiling the lib themselves.

http://www.mediafire.com/download.php?35gymx05z0c
edepot
Posts: 111
Joined: Sat Apr 09, 2005 3:39 pm

slim

Post by edepot »

well, if you move the cursor along the bottom edge of the screen from left to right or right to left, it seems to pop in and out of invisibility.

I released a quick agenaworld 1.6 with output of 720x480x32x2 from vram. Download at http://www.edepot.com/game.html

I also managed to add proportional fonts in this release.
Details on how the fonts are implemented are at:

http://www.edepot.com/forums/viewforum.php?f=8
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Re: slim

Post by J.F. »

edepot wrote:well, if you move the cursor along the bottom edge of the screen from left to right or right to left, it seems to pop in and out of invisibility.
How far from the bottom? Maybe it's a cursor positioning problem rather than a drawing problem.
edepot wrote:I released a quick agenaworld 1.6 with output of 720x480x32x2 from vram. Download at http://www.edepot.com/game.html
Your description of the game needs to be updated - there's only eight planets, not nine. :P :)