Weird screen behavior

Eib
Posts: 7
Joined: Wed Apr 15, 2009 4:56 am

Post by Eib »

Hello everyone. I tried to search for a solution, but couldn't even come up with the right search query :)
So, recently I downloaded the ODE example ( http://forums.ps2dev.org/viewtopic.php? ... hlight=ode ) and started examining it. After some extremely small modifications, I noticed some weird behavior. When I start the resulting .prx via PSPLink, in 90 of 100 tries I get this strange result:
Image
while normally it should look like:
Image
Simply restarting the .prx in PSPLink may give either of these. No recompilation is involved. I reviewed the source and it looks correct, so I've run out of ideas. Maybe someone can look into it as well? Here is the link to the full pack of sources and binaries: http://www.mediafire.com/?sharekey=cf40 ... f6e8ebb871
Thanks.

P.S. Sorry for my bad English.
coolkehon
Posts: 355
Joined: Mon Oct 20, 2008 5:44 am

Post by coolkehon »

I got a similar error when drawing with the GU, but when I restarted it didn't happen. It only happened every so often, so I just ignored it, and now it doesn't happen anymore.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

That "split in half" look is usually due to mixing up the pixel size when drawing - for example, drawing as 16 bit pixels when the display is 32 bit. Maybe your screen pitch is being calculated wrong in a similar way - using 512*2 instead of 512*4. Double check everything related to the screen pitch and pixel formats.
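To make the pitch arithmetic concrete, here is a minimal sketch in plain C that runs on a PC without the PSP SDK. The 512-pixel line width comes from the numbers above; the function and constant names are made up for illustration.

```c
#include <stddef.h>

/* Bytes per pixel for the two framebuffer depths discussed here. */
enum { BPP_16BIT = 2, BPP_32BIT = 4 };

/* Pitch (bytes per scanline) of a buffer that is `buf_width` pixels wide. */
size_t pitch_bytes(size_t buf_width, size_t bytes_per_pixel)
{
    return buf_width * bytes_per_pixel;
}

/* Byte offset of pixel (x, y) inside such a buffer. */
size_t pixel_offset(size_t x, size_t y, size_t buf_width, size_t bytes_per_pixel)
{
    return y * pitch_bytes(buf_width, bytes_per_pixel) + x * bytes_per_pixel;
}
```

With a 512-pixel line, the 16 bit pitch is 1024 bytes and the 32 bit pitch is 2048 bytes, so code that steps by the 16 bit pitch starts every second line in the middle of a 32 bit scanline.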
Eib
Posts: 7
Joined: Wed Apr 15, 2009 4:56 am

Post by Eib »

J.F. wrote:That "split in half" look is usually due to mixing up the pixel size when drawing - for example, drawing as 16 bit pixels when the display is 32 bit. Maybe your screen pitch is being calculated wrong in a similar way - using 512*2 instead of 512*4. Double check everything related to the screen pitch and pixel formats.
Well, I think if I were working with a screen pitch of 512*2 in 32bpp mode, the splitting would be constant on every run. Instead, it is completely random. It looks like some caching problem, but it's hard to tell for sure. Of course, there is a chance I missed something in the code, but there are only 2 draw calls (each in a loop), so it's hard to miss something. Basically, that's why I uploaded the source; maybe a fresh look at it will reveal some mistakes?
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Well, looking at the code, it seems you're working from something meant to be a plug-in prx, not an executable prx. There's no "module_start" in an executable, and you certainly need the crt and start files that you exclude in the makefile. You need to start by converting this into a NORMAL executable. Look at the sample apps that come with the SDK, or any of the apps out there that come with source.
Insert_witty_name
Posts: 376
Joined: Wed May 10, 2006 11:31 pm

Post by Insert_witty_name »

Check you're not overflowing the display list.

Check the alignment of everything you're passing to the GU (ensure your vertex structs are 16 byte aligned by using __attribute__((aligned(16)))).
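A minimal sketch of what that attribute looks like in practice, assuming GCC or Clang. The Vertex layout and the helper names are hypothetical, not taken from the posted source; the code runs on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vertex layout: 32-bit colour plus a float position. */
typedef struct {
    uint32_t color;
    float x, y, z;
} Vertex;

/* A static vertex array forced onto a 16-byte boundary (GCC/Clang syntax). */
static Vertex __attribute__((aligned(16))) quad[4];

/* Returns 1 if `p` sits on a multiple of `alignment` bytes, else 0. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Accessor so callers can inspect the address of the array. */
const void *quad_address(void)
{
    return quad;
}
```

Calling is_aligned(quad_address(), 16) is a cheap sanity check to run once at startup on every buffer that gets handed to the GU.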
willow :--)
Posts: 107
Joined: Sat Jan 13, 2007 11:50 am

Post by willow :--) »

Eib, I'm currently running into the same problem and wondered if you ever found a solution.

I'll try Insert_witty_name's suggestions...
willow :--)
Posts: 107
Joined: Sat Jan 13, 2007 11:50 am

Post by willow :--) »

I spotted some fishy code in my application.

Is it OK/Dangerous/Stupid/Useless/useful to call sceGuTexImage outside of the sceGuStart()...sceGuFinish() block?

I have code that does something like:
sceGuTexImage(0, tex->mTexWidth, tex->mTexHeight, tex->mTexWidth, tex->mBits);
[...]
sceGuStart(GU_DIRECT, list);
[draw stuff]
sceGuFinish();
This doesn't seem alright to me, but I'd like to know if my feeling is right and why.
Edit:
Fixing the above did not change a thing. I'm using 10% of the display list, so I'm now assuming that something is not aligned as it should be.

I read that vertices need to be 8bit aligned and textures need to be 16 bits aligned, is that correct?
Assuming it is:
To get memory for the vertices, I use sceGuGetMemory. Does it give me 8bit aligned space?
To get memory for textures, I use either memalign(16,...) (when stored in RAM), or valloc (from lib pspvram) when they are in VRAM. Do these functions give me 16-aligned space?
willow :--)
Posts: 107
Joined: Sat Jan 13, 2007 11:50 am

Post by willow :--) »

Sorry to bump, but nobody has any idea on that?

It feels to me that if one vertex/texture is not aligned, this specific vertex/texture will look funny, but not the entire screen, so I'm thinking something does not get initialized correctly somewhere else...
willow :--)
Posts: 107
Joined: Sat Jan 13, 2007 11:50 am

Post by willow :--) »

I think in my case the problem was not directly related to textures... other variables were messed up. A huge cleanup of the code apparently made it go away. Seeing the randomness of this, I of course need to confirm it, but using a variable that had previously been freed caused lots of collateral issues.
Edit: Never mind, my issue is still here
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

Can anyone help with willow's issue?
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

On the initial problem:
This random behaviour suggests that you have some kind of overflow somewhere in the code, which can carry over to the GU rendering (since all GU context information is actually stored in system RAM).
However, this can be insanely hard to track down, and I've already had to give up on something similar in one or two of my old projects.
The best solution, if you can't track it down after many tries, is to revert to an old revision of the code that was stable.

willow :--) wrote:I spotted some fishy code in my application.

Is it OK/Dangerous/Stupid/Useless/useful to call sceGuTexImage outside of the sceGuStart()...sceGuFinish() block?

I have code that does something like:
sceGuTexImage(0, tex->mTexWidth, tex->mTexHeight, tex->mTexWidth, tex->mBits);
[...]
sceGuStart(GU_DIRECT, list);
[draw stuff]
sceGuFinish();
This doesn't seem alright to me, but I'd like to know if my feeling is right and why.
Yes, it's wrong to call any sceGu* function outside of a sceGuStart()/sceGuFinish() block (apart from 2 or 3 exceptions, which you shouldn't really need to care about).

willow :--) wrote: Edit:
Fixing the above did not change a thing. I'm using 10% of the display list, so I'm now assuming that something is not aligned how it should.

I read that vertices need to be 8bit aligned and textures need to be 16 bits aligned, is that correct ?
Assuming it is:
To get memory for the vertices, I use sceGuGetMemory. Does it give me 8bit aligned space?
To get memory for textures, I use either memalign(16,...) (when stored in RAM), or valloc (from lib pspvram) when they are in VRAM. Do these functions give me 16-aligned space?
No, 8bit alignment is generally wrong in the PSP (and more specifically GU) context. EVERYTHING has to be AT LEAST 32bit aligned.
For the GU and ME, 128bit (i.e. 16 BYTE) alignment is actually necessary for some things, so maybe you just confused bit and byte alignment.

Regarding alignment of the allocation functions:
sceGuGetMemory is always guaranteed to be aligned on 32bit, malloc is even aligned on 16byte by default (i.e. the same as memalign(16, ..)) iirc, and my valloc routines from lib pspvram align on 16byte (valloc.c version) or 512byte (vram.c version) by default (but can be set up for any other alignment). So you're normally always safe regarding the alignment of allocations.
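If in doubt, the alignment of an allocation can simply be checked at runtime. A minimal PC-side sketch follows, using C11 aligned_alloc as a stand-in for memalign(16, ...); the helper names are hypothetical. Note that the check works in BYTES: 16 bytes is 128 bits, 4 bytes is 32 bits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Round `size` up to the next multiple of `alignment` (a power of two). */
size_t round_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Allocate a buffer whose address is a multiple of 16 bytes (128 bits).
   aligned_alloc stands in for memalign(16, ...) here; release with free(). */
void *alloc_16byte_aligned(size_t size)
{
    return aligned_alloc(16, round_up(size, 16));
}

/* Alignment check in bytes; `alignment_bytes` must be a power of two. */
int is_byte_aligned(const void *p, size_t alignment_bytes)
{
    return ((uintptr_t)p & (alignment_bytes - 1)) == 0;
}
```

The size is rounded up because C11 asks for the requested size to be a multiple of the alignment.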
<Don't push the river, it flows.>
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki

Alexander Berl
Torch
Posts: 825
Joined: Wed May 28, 2008 2:50 am

Post by Torch »

J.F. wrote:Well, looking at the code, it seems you're working from something meant to be a plug-in prx, not an executable prx. There's no "module_start" in an executable, and you certainly need the crt and start files that you exclude in the makefile. You need to start by converting this into a NORMAL executable. Look at the sample apps that come with the SDK, or any of the apps out there that come with source.
?

Isn't the recommended 3.XX way to use a PRX in the EBOOT and not a static ELF? Thus it will still have a module_start (or the compiler will add it).
willow :--)
Posts: 107
Joined: Sat Jan 13, 2007 11:50 am

Post by willow :--) »

Raphael, thanks a lot for the valuable info.
Given the randomness of the bug there is no way I can go back to a revision that works...because as far as I know, they all have the bug.

I believe the bug lies in the library I use (JGE++), as one of our devs (Yeshua) was able to reproduce the issue with the hello world sample that comes with the library. There were also reports more than 2 years ago about such behavior with the library.

I am still investigating. The contents of gu_draw_buffer and gu_contexts don't seem to be impacted. I was expecting one of them to contain garbage if it was an overflow. For example I would have loved to see gu_contexts.texture_mode contain an insane value when I get the purple screen, but that's not the case.

Even the Hello world calls lots of code though, so there are lots of possibilities for an overflow somewhere in the code.

This bug is really frustrating, so I look into it from time to time then forget about it, but if I ever manage to fix it I'll let you know, I guess it could be useful.
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

willow :--) wrote: I believe the bug lies in the library I use (JGE++), as one of our devs (Yeshua) was able to reproduce the issue with the hello world sample that comes with the library. There were also reports more than 2 years ago about such behavior with the library.
Can you provide the code and/or instructions on how to reproduce the error with JGE++ hello world sample?
AlphaDingDong
Posts: 29
Joined: Fri Mar 21, 2008 2:51 pm
Location: The interwebs

Post by AlphaDingDong »

We've been trying to track this down over at psp-programming as well.

Here's a link to his hello world app that reproduces the bug:

http://code.google.com/p/purplescreendebug/

I examined alignment as a possible cause (framebuffer alignment, texture alignment, vertex alignment) but in my testing all incorrect alignments resulted in very different bugs than this one. Mostly just shearing, but in the case of the vertices it simply didn't draw.
willow :--)
Posts: 107
Joined: Sat Jan 13, 2007 11:50 am

Post by willow :--) »

Just to clarify a few things: there is no known way to reproduce the bug 100% of the time with the Hello World. The only proof I have right now that the hello world (link provided above by AlphaDingDong) has the issue is a screenshot by a user I trust.

I also agree with AlphaDingDong: a misalignment in a texture or in a bunch of vertices will cause visual artifacts on that specific texture, but not on the whole screen.

Edit: to compile the hello world:
1) in JGE/ compile HGEParticles (make -f Makefile.hge; mv libhgetools.a lib/psp/)
2) in JGE/ compile JGE++ (make)
3) in purples/01.HelloWorld compile the game (make)
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

You can now press square in our program (the hello world program) until you get the bug.
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

I took a look into JGE render setup and everything that happens in the hello world sample.

Some things I found:
- The initial render setup is very rudimentary, a lot of modes aren't set explicitly. These include:
* GU_CULL_FACE and GU_CLIP_PLANES not disabled in 2D mode (but enabled, which is default anyway, in 3D mode)
* GU_COLOR_TEST, GU_ALPHA_TEST and GU_LIGHTING not disabled (though iirc they are disabled by default, still better to make it explicit)
* Most importantly: No depth buffer allocated, but GU_DEPTH_TEST not disabled and, what fewer people seem to know, depth writes (sceGuDepthMask(GU_TRUE)) not disabled! (On a side note: sceGuDrawBuffer ALWAYS sets the depth buffer to be drawbuffer+512*height*4 if it was not set before, which in the case of JGE will end up at the display buffer)
- JGfx.cpp includes both <vram.h> and <valloc.h>, which is wrong, since those are two alternative libraries (vram is faster but has more allocation size overhead, valloc is slower but has as little allocation overhead as possible). Possibly both libraries are also linked in, so that should be changed too.
- JRenderer::Destroy() is not a safe destruction of all objects involved (see below)

Though those don't quite explain the double-pink-screen, those are points that should be fixed.
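The depth buffer side note can be checked with a little arithmetic. This is a PC-side sketch under stated assumptions: the draw buffer sits at VRAM offset 0, the display buffer follows it directly, both are 32bit with a 512-pixel line width, and the height is 272. The function names are made up for illustration.

```c
#include <stdint.h>

/* Default depth buffer offset as described above:
   draw buffer + 512 * height * 4. */
uint32_t default_depth_offset(uint32_t draw_offset, uint32_t height)
{
    return draw_offset + 512u * height * 4u;
}

/* Offset of a display buffer placed directly behind a 32-bit draw buffer
   (assumed layout, not read from the JGE source). */
uint32_t display_offset_after(uint32_t draw_offset, uint32_t height)
{
    uint32_t color_buffer_bytes = 512u * height * 4u;
    return draw_offset + color_buffer_bytes;
}
```

Under those assumptions both functions return 0x88000 for a 272-line buffer at offset 0, which is why depth writes would land in the display buffer.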

Regarding the new test method, which basically creates a new JRenderer every time SQUARE is pressed - the error is obviously there because the JRenderer::Destroy() function doesn't safely clean up everything; it just calls sceGuTerm(). There's no freeing of allocated VRAM buffers, and no freeing of allocated textures or vertex/object RAM.
This HAS TO cause some graphical glitch at some point.

Regarding the look of the graphical glitch: it actually looks a lot as if the screen got rendered in 16bit mode, but displayed as 32bit (gonna try that out with a test image to verify). Hence there's also a very small possibility that it's either a bug in the sceGu library or maybe even the hardware (though I'd rather not assume that).
For starters, could you try the following:
in JRenderer::StartScene() at the end, add the following line:

Code:

sendCommandi(210,BUFFER_FORMAT);
EDIT: A simple test has strengthened my assumption about the render format. I created a TGA file which was saved in 16bit (5551) at double the height of the original image, filling the lower half with black (0).
Then I hex-edited the TGA file so the header said it was actually 32bit (8888) at half height. The outcome is the same as the screenshots.
http://www.fx-world.org/images/test.tga (hex'ed image)
http://www.fx-world.org/images/test_orig.tga (unhex'ed image)
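The same effect can be worked out on paper. Here is a small PC-side sketch that maps a pixel of the 32bit display back to the 16bit pixel the renderer wrote at the same address, assuming a 512-pixel line width for both; the names are hypothetical.

```c
#include <stdint.h>

#define BUF_WIDTH 512  /* framebuffer line width in pixels (assumed) */

typedef struct { int row; int col; } SrcPixel;

/* For a display pixel (x, y) read as 32-bit, return the first of the two
   16-bit pixels (as rendered) that occupy the same four bytes. */
SrcPixel source_of_display_pixel(int x, int y)
{
    /* Byte offset of the pixel as the 32-bit display sees it. */
    uint32_t byte_offset = (uint32_t)y * BUF_WIDTH * 4u + (uint32_t)x * 4u;
    /* Index of the 16-bit pixel that the renderer wrote at that offset. */
    uint32_t index16 = byte_offset / 2u;
    SrcPixel s;
    s.row = (int)(index16 / BUF_WIDTH);
    s.col = (int)(index16 % BUF_WIDTH);
    return s;
}
```

Display line y ends up holding rendered lines 2y and 2y+1 side by side, so a 272-line image is squeezed into the top 136 display lines as two half-width copies.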

In case the line from above fixes the error, we can be sure that the problem is that the hardware's "render buffer format" (PSM) register is being overwritten somewhere with a value other than 3 (NOT the buffer format variable in the sceGu context, since that one is used every frame to set the display format through sceDisplaySetFrameBuf and obviously sets 32bit correctly).
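For reference, a sketch of what that one command boils down to. It assumes the usual GE command-word layout (8-bit command number in the top byte, 24-bit argument below it) and that BUFFER_FORMAT is the value 3 mentioned above; the function name is made up and is not part of pspgu.

```c
#include <stdint.h>

/* Pack a GE command number and its argument into one 32-bit word.
   Assumed layout: command in bits 31..24, argument in bits 23..0. */
uint32_t ge_command_word(uint32_t cmd, uint32_t argument)
{
    return (cmd << 24) | (argument & 0x00FFFFFFu);
}
```

Under these assumptions, command 210 (0xD2) with argument 3 packs to the single word 0xD2000003 in the display list.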
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

I have fixed the problem with simply calling destroy(), and the issue is still present.
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

Have you added this line:

Code:

sendCommandi(210,BUFFER_FORMAT);
to JRenderer::StartScene()?

EDIT: I see, you only changed JRenderer::Destroy() for engine->End(). This will not solve the problem of JGE not freeing up the VRAM allocations ANYWHERE, i.e. it will always cause problems unless you only destroy/end JGE once during your program's lifetime.
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

I have not committed the changes to the svn as of yet.
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

What header file do I include to use that command you posted above? It doesn't compile by simply adding it...
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

yeshua wrote:What header file do I include to use that command you posted above? It doesn't compile by simply adding it...
It's in pspgu.h... oh wait, it's actually called sceGuSendCommandi. My fault :)
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

Thanks, I believe this fixes the issue. My HelloWorld Program has the "fix" in the svn.

To anyone else with this issue: Just do what Raphael said to do...
Last edited by yeshua on Mon Dec 07, 2009 6:19 am, edited 1 time in total.
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

yeshua wrote:Thanks, I believe this fixes the issue. My HelloWorld Program has the "fix" in the svn.
Well, then I guess the only question left is: when and what causes this hardware register to be overwritten?...
At least it might be worthwhile putting that "fix" into pspgu directly, since it's no problem to reset that register on each sceGuStart(), and I'm also not sure it's just wrong coding in JGE.
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

I'm stumped man, I'm just glad we got a work around.
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

I noticed that sceGuDrawBufferList changes the PSM hardware register but is supposed to be non-permanent (i.e. the value given with sceGuDrawBuffer should be restored when a new display list is started with sceGuStart). So it makes absolute sense to add that command to sceGuStart, since it already resets the draw buffer pointer as it's supposed to, just not the pixel format.
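A toy model of that reasoning, runnable on a PC. None of these names exist in pspgu; the struct and functions are hypothetical and only mimic the behaviour described above, not the actual SDK code.

```c
/* Toy model only: hypothetical names, not the pspgu implementation. */
enum { MODEL_PSM_5551 = 1, MODEL_PSM_8888 = 3 };

typedef struct {
    int context_psm;   /* format remembered from the draw buffer setup */
    int hardware_psm;  /* what the hardware register currently holds   */
} GuModel;

/* Permanent setup: remembers the format and writes the register. */
void model_draw_buffer(GuModel *g, int psm)
{
    g->context_psm = psm;
    g->hardware_psm = psm;
}

/* Temporary override: only the register changes. */
void model_draw_buffer_list(GuModel *g, int psm)
{
    g->hardware_psm = psm;
}

/* Starting a list without the fix: the pixel format register is untouched. */
void model_start_unpatched(GuModel *g)
{
    (void)g;
}

/* Starting a list with the fix: the register is reset from the context. */
void model_start_patched(GuModel *g)
{
    g->hardware_psm = g->context_psm;
}
```

In this model a stale 16bit format survives model_start_unpatched and is only cleared by model_start_patched, which is the behaviour the extra command restores.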

Hmm... need to dig out my SVN login and put a patch in :)

EDIT: Submitted revision 2489, fixing this bug in sceGuStart
yeshua
Posts: 20
Joined: Mon Nov 30, 2009 10:54 am

Post by yeshua »

Wow, there was an error in the sdk...
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

Yes. It's just that the SDK bug would only show up if the PSM register is actively overwritten by use of sceGuDrawBufferList (I don't know of any other function that writes to that register...). But JGE++ doesn't seem to use it anywhere, so the question remains why it happened there anyway.

I'm still suspicious it might be a hardware glitch.