GU_PSM_T8


uberjack
Posts: 34
Joined: Tue Jul 17, 2007 9:09 am
Location: California, USA

Post by uberjack »

Hi everyone,
Up until now I've used GU_PSM_T8 textures with a CLUT. I'm wondering if it's possible to use these textures without a CLUT, and if so, how is the value interpreted? Is it simply a grayscale value 0-255, or is there color packing? What about T4?

Thanks
jean
Posts: 489
Joined: Sat Jan 05, 2008 2:44 am

Post by jean »

T4 should be a 4-bit-addressed CLUT, which means you have 16 possible colors specified in a LUT the usual way. Long ago, in a desperate attempt to use the GE for something like GPGPU, I tried some tricks with the CLUT... using Tx textures without providing a valid CLUT led to a random (or completely black) screen. I see no point in doing so... if you want grayscale, just generate it on the fly. The real mystery is GU_PSM_T32... take a look at http://forums.ps2dev.org/viewtopic.php? ... b2d8c9e506
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

jean wrote:T4 should be a 4-bit-addressed CLUT, which means you have 16 possible colors specified in a LUT the usual way. Long ago, in a desperate attempt to use the GE for something like GPGPU, I tried some tricks with the CLUT... using Tx textures without providing a valid CLUT led to a random (or completely black) screen. I see no point in doing so... if you want grayscale, just generate it on the fly. The real mystery is GU_PSM_T32... take a look at http://forums.ps2dev.org/viewtopic.php? ... b2d8c9e506
No mystery on GU_PSM_T32 or GU_PSM_T16. The GPU fetches 16 or 32 bits at a time instead of 4 or 8, then the palette is addressed according to the shift and mask specified for the texture palette operation. I use this in B2 to accelerate the conversion of 15 and 24 bit Mac video into the proper PSP video data. It's not mysterious - just not well documented. You need to look at examples... like my refresh routines in Basilisk II.
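For anyone following along, here is a rough CPU-side model of the lookup described above. It is only a sketch of the shift-and-mask idea, not code from Basilisk II; the function name `clut_lookup` and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of an indexed-texture fetch: the texel
   (4, 8, 16 or 32 bits wide) is shifted right, masked, and the result
   is used as the index into the palette. */
static uint32_t clut_lookup(const uint32_t *clut, uint32_t clut_entries,
                            uint32_t texel, unsigned shift, uint32_t mask)
{
    assert(shift < 32);                 /* shifting by 32 or more is undefined in C */
    uint32_t index = (texel >> shift) & mask;
    assert(index < clut_entries);       /* the mask must fit the table size */
    return clut[index];
}
```

With a 256-entry table, shift 0 and mask 0xFF select the low byte of a 16- or 32-bit texel, and shift 8 selects the next byte up. In PSPSDK the shift and mask are set through `sceGuClutMode`.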
jean
Posts: 489
Joined: Sat Jan 05, 2008 2:44 am

Post by jean »

Wow....many thanks, guy! I'll take a look...
uberjack
Posts: 34
Joined: Tue Jul 17, 2007 9:09 am
Location: California, USA

Post by uberjack »

Thanks!
Xfacter
Posts: 9
Joined: Wed Feb 28, 2007 10:13 am

Post by Xfacter »

You can also use it to do fullscreen effects by creating a CLUT for whatever the effect is, then shifting/masking and blending for each color channel. FuncLib has a module for effects like this, so you can check that out if you want.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

J.F. wrote:No mystery on GU_PSM_T32 or GU_PSM_T16. The GPU fetches 16 or 32 bits at a time instead of 4 or 8, then the palette is addressed according to the shift and mask specified for the texture palette operation. I use this in B2 to accelerate the conversion of 15 and 24 bit Mac video into the proper PSP video data.
Have you benchmarked how long each pass takes? This could possibly be used to accelerate unchained VGA to linear texture conversion, but it would be pointless if it is slower than the dcache-unfriendly method of splitting up the planar data when a VRAM write occurs.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

crazyc wrote:
J.F. wrote:No mystery on GU_PSM_T32 or GU_PSM_T16. The GPU fetches 16 or 32 bits at a time instead of 4 or 8, then the palette is addressed according to the shift and mask specified for the texture palette operation. I use this in B2 to accelerate the conversion of 15 and 24 bit Mac video into the proper PSP video data.
Have you benchmarked how long each pass takes? This could possibly be used to accelerate unchained VGA to linear texture conversion, but it would be pointless if it is slower than the dcache-unfriendly method of splitting up the planar data when a VRAM write occurs.
I haven't done any specific benchmarking, but it's faster than using the CPU. Let's see... how you would convert the VGA data...

Set the palette for 8 bit to 8888 so that you convert XX to 000000XX. Do a pass over VGA plane 0 with no adding. Then set the palette to convert XX to 0000XX00 and do a pass over plane 1 with adding. Repeat for planes 2 and 3 with the appropriate palettes (00XX0000 and XX000000). Then finally do a normal blit using the texture just built and the real palette. That would work. Overall, you'd be doing 4 smaller blits and one big one (equivalent to two big ones), so I'd say it would be pretty fast. It shouldn't be too hard to make that routine and try it. I'd have the palettes already preset from the start so you could just upload them without making them on the fly... every little bit of time saved helps.
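The passes above can be modelled on the CPU to check that the result really is linear. This is a sketch only, not GU code: `merge_planes` and `linear_pixel` are hypothetical names, and it assumes the usual unchained layout where pixel x lives in plane x % 4 at byte offset x / 4, with planes 2 and 3 shifted to 00XX0000 and XX000000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the four palette passes: plane p goes through a palette that
   maps byte XX to XX << (8 * p). Pass 0 is a plain write, passes 1..3
   add onto what is already there. `words` is the byte count per plane. */
static void merge_planes(const uint8_t *planes[4], uint32_t *dst, size_t words)
{
    assert(planes != NULL && dst != NULL);
    for (unsigned p = 0; p < 4; p++) {
        for (size_t i = 0; i < words; i++) {
            uint32_t texel = (uint32_t)planes[p][i] << (8 * p);
            if (p == 0)
                dst[i] = texel;     /* first pass: no adding */
            else
                dst[i] += texel;    /* later passes: additive blend */
        }
    }
}

/* Read pixel x of the merged buffer the way an 8-bit texture fetch would
   on a little-endian machine: byte (x % 4) of 32-bit word (x / 4). */
static uint8_t linear_pixel(const uint32_t *dst, size_t x)
{
    return (uint8_t)(dst[x / 4] >> (8 * (x % 4)));
}
```

Because each plane lands in its own byte of the 32-bit word, the adds never carry into a neighbouring byte, so the merged buffer can be reinterpreted as a linear 8-bit texture for the final blit with the real palette.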
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

J.F. wrote:Set the palette for 8 bit to 8888 so that you convert XX to 000000XX. Do a pass over VGA plane 0 with no adding. Then set the palette to convert XX to 0000XX00 and do a pass over plane 1 with adding. Repeat for planes 2 and 3 with the appropriate palette. Then finally do a normal blit using the texture just built and the real palette. That would work. Overall, you'd be doing 4 smaller blits and one big one (equivalent to two big ones), so I'd say it would be pretty fast. It shouldn't be too hard to make that routine and try it. I'd have the palettes already preset from the start so you could just upload them without making them on the fly... every little bit of time saved helps.
This is pretty much what I was thinking based on what I saw in your Basilisk drawing code. The fact that it takes 5 passes per frame is why I was wondering about the performance of doing multiple full-frame adds. Looks like I've got some testing to do.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

crazyc wrote:
J.F. wrote:Set the palette for 8 bit to 8888 so that you convert XX to 000000XX. Do a pass over VGA plane 0 with no adding. Then set the palette to convert XX to 0000XX00 and do a pass over plane 1 with adding. Repeat for planes 2 and 3 with the appropriate palette. Then finally do a normal blit using the texture just built and the real palette. That would work. Overall, you'd be doing 4 smaller blits and one big one (equivalent to two big ones), so I'd say it would be pretty fast. It shouldn't be too hard to make that routine and try it. I'd have the palettes already preset from the start so you could just upload them without making them on the fly... every little bit of time saved helps.
This is pretty much what I was thinking based on what I saw in your Basilisk drawing code. The fact that it takes 5 passes per frame is why I was wondering about the performance of doing multiple full-frame adds. Looks like I've got some testing to do.
When you look at the total, it's really only doing the equivalent of two full passes. The first four passes each cover only 1/4 of the screen data. So it does 4 × 1/4 to put the data together into a form that you can then do one full pass over.
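To put numbers on that, here is a small sketch that counts source texels only. The function name is hypothetical, and per-pass setup and blending cost are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Source texels read over all five passes for a width x height 8-bit
   screen: four plane passes of a quarter screen each, plus one full pass. */
static uint32_t total_source_texels(uint32_t width, uint32_t height)
{
    assert(width % 4 == 0);                   /* planar data splits evenly in four */
    uint32_t full_screen = width * height;
    uint32_t one_plane = full_screen / 4;
    return 4 * one_plane + full_screen;       /* always 2 * full_screen */
}
```

For a 320x240 screen that comes to 153600 texels, exactly twice the 76800 pixels on screen.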