Porting FCEUltra to the PSP

bootsector_
Posts: 15
Joined: Mon Feb 25, 2008 12:56 am

Post by bootsector_ »

Hi guys!

I have wanted to see FCEUltra ported to the PSP for a long time, mainly a version that can run on the Slim and newer models, which don't support the 1.5 kernel.

Searching a little, I discovered that Hamsterburt actually did a port of it. The sound is not perfect and the sources weren't released (I even sent him an email asking for them, but I didn't get an answer). My idea was to use his port as my first (and PSPSDK learning) project, so I could try to improve it.

As I didn't get any sources from him, I decided to take the original FCEUltra 0.98.12 source code and try to port it to the PSP. This is my first PSP-related project and I'm completely ignorant of several PSP-specific development topics. That's why I'm posting here, so you guys may help me! :)

Now, to the fun part! Right now, the emulator doesn't have any menu or file browser. It just runs whatever ROM is placed in the root of your memory stick under the name "nesrom.nes". It supports "basic" input (only what is necessary to play the game) and NO sound.

The video is not perfect: it's not smooth, it stutters sometimes, and if you check Super Mario Bros., you will see a weird "crispy" effect on the green pipes and on the floor. I would like to optimize the video rendering as much as possible before moving on to sound and menus. Maybe it's because I'm not using VRAM? Or because I'm calling unnecessary functions in the main rendering function (PSPVideoRenderFrame)? Or because I'm calling the GU functions without the most suitable parameters/options? I don't have a clue:

GU Init code:

Code: Select all

void PSPVideoInit() {
	// Setup GU
	sceGuInit(); // Turn on the GU
	sceGuStart(GU_DIRECT,list); // Start filling a command list.

	sceGuDrawBuffer(GU_PSM_8888,(void*)0,BUF_WIDTH); // Point out the drawing buffer
	sceGuDispBuffer(SCR_WIDTH,SCR_HEIGHT,(void*)FRAME_SIZE,BUF_WIDTH); // Point out the display buffer
	sceGuDepthBuffer((void*)(FRAME_SIZE*2),BUF_WIDTH); // Point out the depth buffer
	sceGuOffset(2048 - (SCR_WIDTH/2),2048 - (SCR_HEIGHT/2)); // Define current drawing area.
	sceGuViewport(2048,2048,SCR_WIDTH,SCR_HEIGHT); // Center screen in virtual space.
	sceGuDepthRange(0xc350,0x2710); // Tells the GU what value range to use within the depth buffer.
	sceGuScissor(0,0,SCR_WIDTH,SCR_HEIGHT); // Sets up a scissor rect for the screen.
	sceGuEnable(GU_SCISSOR_TEST); // Enables scissor mode: pixels outside the scissor rect are not rendered.
	sceGuFrontFace(GU_CW);
	sceGuEnable(GU_TEXTURE_2D); // Enables texturing of primitives.
	sceGuClearColor(0xff00ff); // Sets current clear color (set before the clear so it takes effect)
	sceGuClear(GU_COLOR_BUFFER_BIT|GU_DEPTH_BUFFER_BIT); // Clears current drawbuffer; issued inside the command list

	sceGuFinish(); // End of command list
	sceGuSync(0,0); // Wait for list to finish executing

	sceDisplayWaitVblankStart(); // Wait for vertical blank start
	sceGuDisplay(GU_TRUE); // VRAM should be displayed on screen.
}
Function used to render every frame generated by FCEUltra:

Code: Select all

void PSPVideoRenderFrame(uint8 *XBuf) {
	sceGuStart(GU_DIRECT,list);

	// setup CLUT texture
	sceGuClutMode(GU_PSM_8888,0,0xff,0); // 32-bit palette
	sceGuClutLoad((256/8),clut256); // upload 32*8 entries (256)
	sceGuTexMode(GU_PSM_T8,0,0,0); // 8-bit image
	sceGuTexImage(0,256,256,256,XBuf);
	sceGuTexFunc(GU_TFX_REPLACE,GU_TCC_RGB);
	sceGuTexFilter(GU_LINEAR,GU_LINEAR);
	//sceGuTexFilter(GU_NEAREST, GU_NEAREST);
	sceGuTexScale(1.0f,1.0f);
	sceGuTexOffset(0.0f,0.0f);
	sceGuAmbientColor(0xffffffff);

	// render sprite
	sceGuColor(0xffffffff);
	struct Vertex* vertices = (struct Vertex*)sceGuGetMemory(2 * sizeof(struct Vertex));
	vertices[0].u = 0; vertices[0].v = 0;
	vertices[0].x = 85; vertices[0].y = 0; vertices[0].z = 0;
	vertices[1].u = 256; vertices[1].v = 256;
	//vertices[1].x = 480; vertices[1].y = 272; vertices[1].z = 0;
	vertices[1].x = 394; vertices[1].y = 290; vertices[1].z = 0;
	sceGuDrawArray(GU_SPRITES,GU_TEXTURE_32BITF|GU_VERTEX_32BITF|GU_TRANSFORM_2D,2,0,vertices);

	// wait for next frame
	sceGuFinish();
	sceGuSync(0,0);

	//sceDisplayWaitVblankStart();
	sceGuSwapBuffers();
}
Main code:

Code: Select all

int main(int argc, char *argv[])
{
	SetupCallbacks();

	sceKernelDcacheWritebackAll();

	scePowerSetClockFrequency(333, 333, 166);

	//XBuf = getStaticVramBuffer(512,272,3);

    if(!(FCEUI_Initialize())) {
		printf("FCEUltra did not initialize.\n");
		return(0);
	}

	FCEUI_SetVidSystem(0); // 0 - NTSC
	FCEUI_SetGameGenie(0);
	FCEUI_DisableSpriteLimitation(1);
	FCEUI_SetSoundVolume(0);
	FCEUI_SetSoundQuality(0);
	FCEUI_SetLowPass(0);
	FCEUI_Sound(0);

    FCEUGI *tmp;

    if((tmp=FCEUI_LoadGame("ms0:/nesrom.nes"))) {
        printf("Game Loaded!\n");
        CurGame=tmp;
    }
    else {
        printf("Didn't load Game!\n");
    }

    PSPInputInitPads();

    PSPVideoInit();

    PSPVideoOverrideNESClut();

	while(CurGame) {//FCEUI_CloseGame turns this false
        DoFun();
    }

	sceGuTerm();

	sceKernelExitGame();

	return 0;
}


void FCEUD_Update(uint8 *XBuf, int32 *tmpsnd, int32 ssize)
{
	PSPVideoRenderFrame(XBuf);
	PSPInputReadPad();
	//printf("FCEUD_Update\n");

//	OutputSound(tmpsnd, ssize);

//	if (Get_NESInput()) {
//		FCEUI_CloseGame();
//		CurGame=0;
//	}
}

void DoFun()
{
    uint8 *gfx;
    int32 *sound;
    int32 ssize;

    FCEUI_Emulate(&gfx, &sound, &ssize, 0);
    FCEUD_Update(gfx, sound, ssize);
}
Any comments, help, hints or explanations are welcome! If you are interested in this and able to help, please let me know and I can add you as a developer on this project's Google Code SVN repository:

http://code.google.com/p/fceupsp/

If you need to look further into the source code, you can always point your browser to:

http://fceupsp.googlecode.com/svn/trunk/

The PSP related modifications are pretty much the following:

:. http://fceupsp.googlecode.com/svn/trunk ... u/Makefile -> PSPSDK specific Makefile

:. http://fceupsp.googlecode.com/svn/trunk ... ivers/psp/ -> PSP specific code

Thanks!

bootsector
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Code: Select all

   vertices[0].u = 0; vertices[0].v = 0;
   vertices[0].x = 85; vertices[0].y = 0; vertices[0].z = 0;
   vertices[1].u = 256; vertices[1].v = 256;
   //vertices[1].x = 480; vertices[1].y = 272; vertices[1].z = 0;
   vertices[1].x = 394; vertices[1].y = 290; vertices[1].z = 0;
   sceGuDrawArray(GU_SPRITES,GU_TEXTURE_32BITF|GU_VERTEX_32BITF|GU_TRANSFORM_2D,2,0,vertices);
That's drawing the emulator screen buffer to the PSP display as one big sprite. The PSP display is 480x272, so using a y coord of 290 is a "bad thing". If you wish to make a 4:3 centered display on the PSP, use a 364x272 display offset to (58,0).
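
As a quick check of those numbers, here is a small sketch in plain C (no PSPSDK calls) that centers a destination rectangle on the 480x272 screen. The struct and function names are made up for illustration and are not part of the port.

```c
#include <assert.h>

#define SCR_WIDTH  480
#define SCR_HEIGHT 272

/* Hypothetical helper type: a destination rectangle on the PSP screen. */
struct DestRect { int x, y, w, h; };

/* Center a w x h rectangle on the screen. */
static struct DestRect center_rect(int w, int h)
{
	struct DestRect r;

	/* The rectangle must fit on the screen, otherwise part of it is cut off. */
	assert(w > 0 && w <= SCR_WIDTH);
	assert(h > 0 && h <= SCR_HEIGHT);

	r.w = w;
	r.h = h;
	r.x = (SCR_WIDTH  - w) / 2;
	r.y = (SCR_HEIGHT - h) / 2;
	return r;
}
```

Calling center_rect(364, 272) gives x = 58 and y = 0, so the two sprite corners would be (58,0) and (422,272).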

For better speed, you really should do that SPRITE operation in slices. Look at sprite examples, or any number of open source apps. Slicing the draw makes it much faster as you take better advantage of the texture cache in the VDP.
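
To make the slicing idea concrete, here is a rough sketch in plain C that only computes the strips, assuming the 256-texel-wide source is stretched onto a wider destination. The Slice struct, make_slices() and the slice width of 32 are assumptions for this example; on the PSP each strip would become one two-vertex GU_SPRITES draw like the one quoted above.

```c
#include <assert.h>

/* Hypothetical plain-C stand-in for one GU_SPRITES pair of vertices. */
struct Slice {
	int u0, u1;   /* source texel columns */
	int x0, x1;   /* destination screen columns */
};

/* Split a sw-texel-wide source into vertical strips of at most `slice`
   texels and map each strip onto a dw-pixel-wide destination starting
   at dx. Returns the number of strips written to out (at most max_out). */
static int make_slices(int sw, int dx, int dw, int slice,
                       struct Slice *out, int max_out)
{
	int n = 0;
	int start;

	assert(sw > 0 && dw > 0 && slice > 0);

	for (start = 0; start < sw && n < max_out; start += slice) {
		int end = (start + slice < sw) ? start + slice : sw;

		out[n].u0 = start;
		out[n].u1 = end;
		/* Both edges come from the same formula, so neighbouring strips
		   share an edge and leave no gaps or overlaps. */
		out[n].x0 = dx + (start * dw) / sw;
		out[n].x1 = dx + (end * dw) / sw;
		n++;
	}
	return n;
}
```

With sw = 256, dx = 58, dw = 364 and slice = 32 this yields 8 strips covering screen columns 58 to 422.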
bootsector_
Posts: 15
Joined: Mon Feb 25, 2008 12:56 am

Post by bootsector_ »

Thanks for the valuable tips, J.F. :)

I've run a couple of experiments and the video is now much better (not perfect IMO, because of some slowdowns and the lack of VSYNC):

:. FCEU core is now using VRAM and rendering frames there directly (not swizzled, though)

Code: Select all

	void* fbp0 = getStaticVramBuffer(BUF_WIDTH,SCR_HEIGHT,GU_PSM_8888);
	void* fbp1 = getStaticVramBuffer(BUF_WIDTH,SCR_HEIGHT,GU_PSM_8888);
	void* zbp = getStaticVramBuffer(BUF_WIDTH,SCR_HEIGHT,GU_PSM_8888);

	vram_buffer = getStaticVramTexture(BUF_WIDTH,SCR_HEIGHT,GU_PSM_8888);

	XBuf = (uint8 *)((unsigned int)vram_buffer|0x40000000);
:. Rendering engine now using slicing:

Code: Select all

#define SLICE_SIZE 8

void PSPVideoRenderFrame(uint8 *XBuf) {
	sceGuStart(GU_DIRECT,list);


	// setup CLUT texture
	sceGuClutMode(GU_PSM_8888,0,0xff,0); // 32-bit palette
	sceGuClutLoad((256/8),clut256); // upload 32*8 entries (256)
	sceGuTexMode(GU_PSM_T8,0,0,0); // 8-bit image
	sceGuTexImage(0,256,256,256,vram_buffer);
	sceGuTexFunc(GU_TFX_REPLACE,GU_TCC_RGB);
	//sceGuTexFilter(GU_LINEAR,GU_LINEAR);
	sceGuTexFilter(GU_NEAREST, GU_NEAREST);
	//sceGuTexScale(2.0f,2.0f);
	//sceGuTexOffset(0.0f,0.0f);
	//sceGuAmbientColor(0xffffffff);

	advancedBlit(0, 0, 256, 240, 112, 16, SLICE_SIZE);

	// wait for next frame
	sceGuFinish();
	sceGuSync(0,0);

	//sceDisplayWaitVblankStart();
	sceGuSwapBuffers();
}

void advancedBlit(int sx, int sy, int sw, int sh, int dx, int dy, int slice)
{
	int start, end;

	// blit maximizing the use of the texture-cache

	for (start = sx, end = sx+sw; start < end; start += slice, dx += slice)
	{
		struct Vertex* vertices = (struct Vertex*)sceGuGetMemory(2 * sizeof(struct Vertex));
		int width = (start + slice) < end ? slice : end-start;

		vertices[0].u = start; vertices[0].v = sy;
		//vertices[0].color = 0;
		vertices[0].x = dx; vertices[0].y = dy; vertices[0].z = 0;

		vertices[1].u = start + width; vertices[1].v = sy + sh;
		//vertices[1].color = 0;
		vertices[1].x = dx + width; vertices[1].y = dy + sh; vertices[1].z = 0;

		//sceGuDrawArray(GU_SPRITES,GU_TEXTURE_16BIT|GU_COLOR_4444|GU_VERTEX_16BIT|GU_TRANSFORM_2D,2,0,vertices);
		sceGuDrawArray(GU_SPRITES,GU_TEXTURE_32BITF|GU_VERTEX_32BITF|GU_TRANSFORM_2D,2,0,vertices);
	}
}
I'm not sure if all the code above is written for best performance, but Super Mario Bros. no longer suffers much from that "lack of VSYNC" effect on the floor region :) Maybe if I add swizzling I could get better results?

I will keep poking this a little bit more to try getting the best video performance possible. Please keep coming with the suggestions and tips :)

If you need to look at additional code (or download/compile a binary so you can actually see how it's going), please check the SVN repository!

Thanks,

bootsector
bootsector_
Posts: 15
Joined: Mon Feb 25, 2008 12:56 am

Post by bootsector_ »

Experimental sound support added. Sound is AWFUL though. It looks like configuring the FCEU core to output 44100 Hz sound is very CPU-intensive (you can notice the slowdowns). Once I figure out how to output 44100 Hz sound correctly, I will change it to 22050 Hz.

Code: Select all

FCEUI_Sound(44100);
Sound output related code:

Code: Select all

    chan = sceAudioChReserve(PSP_AUDIO_NEXT_CHANNEL, PSP_AUDIO_SAMPLE_ALIGN(64*7), PSP_AUDIO_FORMAT_MONO);

Code: Select all

void FCEUD_Update(uint8 *XBuf, int32 *tmpsnd, int32 ssize)
{
	PSPVideoRenderFrame(XBuf);
	PSPInputReadPad();
	PSPSoundOutput(tmpsnd, ssize);
}

Code: Select all

void inline PSPSoundOutput(int32 *tmpsnd, int32 ssize) {
    int i;
    s16 ssound[ssize];

    for (i=0;i<ssize;i++) {
        ssound[i]=tmpsnd[i];
    }
    //sceAudioSetChannelDataLen(chan, ssize<<8);
	sceAudioOutput(chan, PSP_AUDIO_VOLUME_MAX, ssound);
}
Once I figure out how to output sound samples properly (without those noises), I will try to code something like the wavegen.c PSPSDK sample, using a callback. I'm not sure whether a custom approach based on sceAudioOutput plus a new thread would be better, though.
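
One thing that may be worth checking here is the sample count. Assuming roughly 60 frames per second, a 44100 Hz core hands over about 735 samples per frame, while the channel above is reserved for 64*7 = 448 samples per output call. The sketch below is plain C with made-up helper names; it shows that arithmetic plus a conversion from 32-bit to 16-bit samples that clamps instead of truncating. Whether FCEU ever produces out-of-range values is not confirmed here.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate one 32-bit sample to the signed 16-bit range. */
static int16_t clamp_s16(int32_t v)
{
	if (v > INT16_MAX) return INT16_MAX;
	if (v < INT16_MIN) return INT16_MIN;
	return (int16_t)v;
}

/* Convert up to count samples from src to dst, clamping each one.
   Never writes more than dst_len samples; returns how many were written. */
static int convert_samples(const int32_t *src, int count,
                           int16_t *dst, int dst_len)
{
	int i;
	int n = (count < dst_len) ? count : dst_len;

	assert(src != 0 && dst != 0);
	assert(count >= 0 && dst_len >= 0);

	for (i = 0; i < n; i++)
		dst[i] = clamp_s16(src[i]);
	return n;
}

/* Approximate number of samples the core produces per video frame. */
static int samples_per_frame(int sample_rate, int frames_per_second)
{
	assert(frames_per_second > 0);
	return sample_rate / frames_per_second;
}
```

If the per-frame count and the reserved count really differ like this, queueing the converted samples and feeding the channel in fixed-size chunks is one way to reconcile them.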

Any help would be appreciated!

Thanks,

bootsector
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

You can't really swizzle the texture because that would complicate (thus slowing) the emulator screen drawing. The slicing is the best you can do there. You might look into how the screen is updated and speed it up. One thing I've always found that helps - make screen buffers uncached. Cacheable screen buffers flood the caches, slowing the rest of the program. You can make memory uncached on the PSP by ORing 0x40000000 with the pointer. If you do, make sure the buffers are cache aligned (64 bytes), and you flush the dcache after allocation.
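
The address arithmetic for that is small enough to sketch. This is plain C working on 32-bit address values so it can be tried off-device; the macro and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define UNCACHED_MASK 0x40000000u  /* bit ORed into the pointer, as described above */
#define CACHE_LINE    64u          /* cache alignment in bytes, as described above */

/* Round an address up to the next 64-byte boundary. */
static uint32_t align_up_64(uint32_t addr)
{
	/* The mask trick below only works for power-of-two alignments. */
	assert((CACHE_LINE & (CACHE_LINE - 1u)) == 0u);
	return (addr + (CACHE_LINE - 1u)) & ~(CACHE_LINE - 1u);
}

/* Return the uncached alias of a cached address. */
static uint32_t to_uncached(uint32_t addr)
{
	return addr | UNCACHED_MASK;
}

/* Return the cached alias of an uncached address. */
static uint32_t to_cached(uint32_t addr)
{
	return addr & ~UNCACHED_MASK;
}
```

On the PSP itself the flush after allocation could be done with a call such as sceKernelDcacheWritebackInvalidateAll() before the uncached alias is first used.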

You should probably put sound into its own thread and store the sound into a buffer.
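
A minimal sketch of such a buffer, in plain C: the emulator thread writes samples in, the audio thread reads fixed-size chunks out and gets silence when the emulator falls behind. The SampleRing type, the function names and the 4096-sample capacity are assumptions for illustration, and thread creation and synchronization on the PSP are left out.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4096u  /* capacity in samples; must be a power of two */

/* Hypothetical single-producer, single-consumer sample queue. */
struct SampleRing {
	int16_t data[RING_SIZE];
	volatile unsigned head;  /* total samples written by the emulator thread */
	volatile unsigned tail;  /* total samples read by the audio thread */
};

/* Number of samples currently queued. */
static unsigned ring_count(const struct SampleRing *r)
{
	return r->head - r->tail;
}

/* Emulator side: queue up to n samples, return how many fit. */
static unsigned ring_write(struct SampleRing *r, const int16_t *src, unsigned n)
{
	unsigned space = RING_SIZE - ring_count(r);
	unsigned i;

	assert(r != 0 && src != 0);

	if (n > space)
		n = space;
	for (i = 0; i < n; i++)
		r->data[(r->head + i) & (RING_SIZE - 1u)] = src[i];
	r->head += n;
	return n;
}

/* Audio side: always fill dst with n samples, padding with silence when
   not enough are queued. Returns the number of real samples copied. */
static unsigned ring_read(struct SampleRing *r, int16_t *dst, unsigned n)
{
	unsigned avail = ring_count(r);
	unsigned got = (n < avail) ? n : avail;
	unsigned i;

	assert(r != 0 && dst != 0);

	for (i = 0; i < got; i++)
		dst[i] = r->data[(r->tail + i) & (RING_SIZE - 1u)];
	for (i = got; i < n; i++)
		dst[i] = 0;
	r->tail += got;
	return got;
}
```

The audio thread would then loop, reading exactly the number of samples the channel was reserved with and handing them to a blocking output call such as sceAudioOutputBlocking(), while FCEUD_Update only calls ring_write().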
bootsector_
Posts: 15
Joined: Mon Feb 25, 2008 12:56 am

Post by bootsector_ »

Thank you guys for the support so far. There's still a lot to be done, but I think the port is worth a public release in its current state:

http://code.google.com/p/fceupsp/

Download:

http://code.google.com/p/fceupsp/downloads/list

As usual, any feedback/help is welcome. If you'd like to contribute to the project (by coding), please contact me!

Regards,

bootsector