Media Engine?

Discuss the development of new homebrew software, tools and libraries.

Moderators: cheriff, TyRaNiD

hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

For the moment, I'm quite busy with work, so I can only test your code at the weekend. Sorry :(
jockyw2001
Posts: 339
Joined: Thu Sep 29, 2005 4:19 pm

Post by jockyw2001 »

No problem, atm it works fine with kernel mode functions. I can live with that for now.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Here's something people may enjoy - I did a PRX for the MediaEngine. It's mostly based on the code in the previous page for the ME part, and KeyCleaner for the EBOOT/PRX parts. It works fine on my Slim PSP with 3.60 M33.

This doesn't yet have any attributions in the files. Before an "official" release, I wanted to see what you all think I should put in there. Pmpmod is GPL, so this may need to be as well (the prx part, at least).

http://groups.google.com/group/chilly-w ... -v1.0a.zip

Comments and suggestions welcome.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

I've got some code that includes locking, interrupts and exception handling that I suppose I should clean up and post.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

crazyc wrote:I've got some code that includes locking, interrupts and exception handling that I suppose I should clean up and post.
That would be nice. I could work it into the next revision of the prx.
Art
Posts: 642
Joined: Wed Nov 09, 2005 8:01 am

Post by Art »

I've not seen this before.
I would think everyone should be interested in a heap of CPU time for free.
mypspdev
Posts: 178
Joined: Wed Jul 11, 2007 10:30 pm

Post by mypspdev »

Does it work well for you? thanks
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

mypspdev wrote:
Does it work well for you? thanks
I'm going to do another example of using it today. The app with the prx is just a demonstration of functions.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

J.F. wrote:That would be nice. I could work it into the next revision of the prx.
Well, for starters here's some dcache functions.

Code: Select all

#define load_tag(index, hi, lo) __builtin_allegrex_cache(0x10, index);					\
			        __asm__ volatile ("mfc0 %0, $28\nmfc0 %1, $29\n":"=r"(lo), "=r"(hi));

#define store_tag(index, hi, lo) __asm__ (".set push\n"		\
					  ".set noreorder\n"	\
					  "mtc0 %0, $28\n"	\
					  "mtc0 %1, $29\n"	\
					  ".set pop\n"		\
				 ::"r"(lo),"r"(hi));		\
				 __builtin_allegrex_cache(0x11, index); 

void dcache_wb_range(void *addr, int size) {
	int i, j = (int)addr;
	for(i = j; i < size+j; i += 64)
		__builtin_allegrex_cache(0x1a, i);
}

void dcache_wbinv_all() {
	int i;
	for(i = 0; i < 8192; i += 64)
		__builtin_allegrex_cache(0x14, i);
}

void dcache_inv_range(void *addr, int size) {
	int i, j = (int)addr;
	for(i = j; i < size+j; i += 64)
		__builtin_allegrex_cache(0x19, i);
}

void dcache_wb_all() {
	int i, hi, lo;
	for(i = 0; i < 8192; i += 64) {
		load_tag(i, hi, lo);
		if(hi&(1<<20)) __builtin_allegrex_cache(0x1a, ((hi<<13) | i));
		if(lo&(1<<20)) __builtin_allegrex_cache(0x1a, ((lo<<13) | i));
	}
}

void dcache_inv_all() {
	int i;
	store_tag(i, 0, 0);
	for(i = 0; i < 8192; i += 64) {
		__builtin_allegrex_cache(0x13, i);
		__builtin_allegrex_cache(0x11, i);
	}
}

void dcache_wbinv_range(void *addr, int size) {
	int i, j = (int)addr;
	for(i = j; i < size+j; i += 64)
		__builtin_allegrex_cache(0x1b, i);
}
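One thing to watch with the range functions above: the loop starts at addr itself and steps by 64, so if addr isn't 64-byte aligned, the last line touched by the buffer can be skipped. Here is a minimal host-side sketch of an aligned walk, assuming a 64-byte line (taken from the step used above). The names `for_each_line` and `count_line` are made up for illustration, and the callback stands in for `__builtin_allegrex_cache` so the logic can be checked off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u  /* assumed line size, matching the step in the loops above */

/* Hypothetical callback type; on the PSP this is where the cache op would go. */
typedef void (*line_op)(uintptr_t line_addr, void *ctx);

/* Visit every cache line that overlaps [addr, addr + size). */
void for_each_line(uintptr_t addr, uintptr_t size, line_op op, void *ctx) {
    assert(op != NULL);
    if (size == 0)
        return;
    /* Round the first and last byte down to their line boundaries. */
    uintptr_t first = addr & ~(uintptr_t)(LINE_SIZE - 1);
    uintptr_t last  = (addr + size - 1) & ~(uintptr_t)(LINE_SIZE - 1);
    for (uintptr_t line = first; ; line += LINE_SIZE) {
        op(line, ctx);
        if (line == last)
            break;
    }
}

/* Example callback: just counts how many lines were visited. */
void count_line(uintptr_t line_addr, void *ctx) {
    (void)line_addr;
    ++*(int *)ctx;
}
```

For example, a 10-byte buffer at address 60 covers bytes 60..69, which straddle two lines; the unaligned loop would only issue one cache op, while `for_each_line(60, 10, count_line, &n)` visits both.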
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Thanks. Good cache routines can improve the speed of certain things quite a bit.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

J.F. wrote:Thanks. Good cache routines can improve the speed of certain things quite a bit.
The FlushME function confuses me. Unless I'm missing something, it looks like you are flushing the SC.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

crazyc wrote:
J.F. wrote:Thanks. Good cache routines can improve the speed of certain things quite a bit.
The FlushME function confuses me. Unless I'm missing something, it looks like you are flushing the SC.
:)

You're right. I goofed on that one. That's another reason I posted here before releasing. Another goof - you don't need the PRX for setting the signals. SignalME should be a macro in the program. So I have some goofs to fix as well as some extra stuff to add as you post.

Thanks for the heads up!
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Okay, here's my latest. It has some stuff taken out of the kernel mode prx that didn't need to be there in the first place and shuffled into a .c file and .h file you include with the project. The kernel mode prx also takes into account 3.71, so it should work on any 3.xx firmware. It works well on my 3.71 M33 Slim.

The example is a bit more substantial than the first test - it uses the ME to color cycle the screen. The code for this test is based on some old(ish) code I found here on using a framebuffer in the main memory, so this demonstrates a number of things - putting the framebuffer into main memory, getting the info about the framebuffer, setting the debug screenbase so you can still use the debug printf function, and using the ME to draw to the display.

You'll notice I use the non-cached address of the framebuffer on the ME. One thing I learned when I made the first Amiga PPC port of DOOM - don't draw to cached buffers. You flood the cache and slow down the rest of the program. Drawing to non-cached memory means the difference between 10 FPS and 100 FPS... at least on a PPC, so I figure it's probably about the same on the MIPS. :)

http://groups.google.com/group/chilly-w ... est-v1.zip
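
For anyone wondering what "the non-cached address" means in practice, here is a rough sketch of the address arithmetic. It assumes the uncached alias is selected by the 0x40000000 bit, which is the convention commonly used in PSP homebrew; the helper names are made up for this example and addresses are plain 32-bit integers so the snippet can be checked on a PC.

```c
#include <stdint.h>

/* Assumption: bit 30 selects the uncached alias of an address. */
#define UNCACHED_BIT 0x40000000u

/* Return the uncached alias of a cached address. */
uint32_t to_uncached(uint32_t addr) {
    return addr | UNCACHED_BIT;
}

/* Return the cached alias of an address. */
uint32_t to_cached(uint32_t addr) {
    return addr & ~UNCACHED_BIT;
}

/* Non-zero if the address already refers to the uncached alias. */
int is_uncached(uint32_t addr) {
    return (addr & UNCACHED_BIT) != 0;
}
```

Writes through the uncached alias bypass the data cache, so drawing that way doesn't evict the rest of the program's working set.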
jas0nuk
Posts: 137
Joined: Thu Apr 27, 2006 8:00 am

Post by jas0nuk »

Only this link works for me:
http://chilly-willys-ice-flow.googlegro ... est-v1.zip

edit: WTF, now the original one works... oh well :p

Thanks :)
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

jas0nuk wrote:Only this link works for me:
http://chilly-willys-ice-flow.googlegro ... est-v1.zip

edit: WTF, now the original one works... oh well :p

Thanks :)
Probably some delay in google updating all the group links. I have no idea how their web server software works - I just upload the file and it supplies a link. :)
StrmnNrmn
Posts: 46
Joined: Wed Feb 14, 2007 11:32 pm
Location: London, UK
Contact:

Post by StrmnNrmn »

J.F.: In your DisplayTest sample (thanks for this by the way - it's excellent), the prx has the following module declaration:

Code: Select all

PSP_MODULE_INFO("MediaEngine", 0x1006, VERS, REVS);
I appreciate that the 0x1000 flag is for kernel mode. What does 0x6 signify? I can't find any documentation for this.
jas0nuk
Posts: 137
Joined: Thu Apr 27, 2006 8:00 am

Post by jas0nuk »

StrmnNrmn wrote:J.F.: In your DisplayTest sample (thanks for this by the way - it's excellent), the prx has the following module declaration:

Code: Select all

PSP_MODULE_INFO("MediaEngine", 0x1006, VERS, REVS);
I appreciate that the 0x1000 flag is for kernel mode. What does 0x6 signify? I can't find any documentation for this.
http://forums.ps2dev.org/viewtopic.php?p=57753#57753 :)

StrmnNrmn, please read my reply (no. 6) to your Media Engine blog entry :p
StrmnNrmn
Posts: 46
Joined: Wed Feb 14, 2007 11:32 pm
Location: London, UK
Contact:

Post by StrmnNrmn »

jas0nuk wrote:
StrmnNrmn wrote:J.F.: In your DisplayTest sample (thanks for this by the way - it's excellent), the prx has the following module declaration:

Code: Select all

PSP_MODULE_INFO("MediaEngine", 0x1006, VERS, REVS);
I appreciate that the 0x1000 flag is for kernel mode. What does 0x6 signify? I can't find any documentation for this.
http://forums.ps2dev.org/viewtopic.php?p=57753#57753 :)
Awesome - that's exactly the page I failed to find while searching :) I guess 0x1006 is kernel mode/load/start then?

jas0nuk wrote:StrmnNrmn, please read my reply (no. 6) to your Media Engine blog entry :p
Will do - thanks :)
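
To make the 0x1006 breakdown concrete, here is a small sketch that splits the attribute word into flags. The flag values are an assumption: they mirror what I recall of the module attribute enum in PSPSDK's pspmoduleinfo.h (no-stop 0x1, single-load 0x2, single-start 0x4, kernel 0x1000), so check them against your SDK headers. The `MOD_ATTR_*` names and helpers are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed flag values; verify against pspmoduleinfo.h in your PSPSDK. */
enum {
    MOD_ATTR_NO_STOP      = 0x0001,
    MOD_ATTR_SINGLE_LOAD  = 0x0002,
    MOD_ATTR_SINGLE_START = 0x0004,
    MOD_ATTR_KERNEL       = 0x1000
};

/* Non-zero if every bit of flag is set in attr. */
int attr_has(unsigned attr, unsigned flag) {
    return (attr & flag) == flag;
}

/* Write a human-readable list of the set flags into out. */
void describe_attr(unsigned attr, char *out, size_t n) {
    snprintf(out, n, "%s%s%s%s",
             attr_has(attr, MOD_ATTR_KERNEL)       ? "kernel "       : "user ",
             attr_has(attr, MOD_ATTR_SINGLE_LOAD)  ? "single-load "  : "",
             attr_has(attr, MOD_ATTR_SINGLE_START) ? "single-start " : "",
             attr_has(attr, MOD_ATTR_NO_STOP)      ? "no-stop "      : "");
}
```

Under those assumed values, 0x1006 is 0x1000 | 0x4 | 0x2, which matches the "kernel mode/load/start" reading above.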
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

By the way, the latest incarnation of the MediaEngine prx is with the source for SNES9X_TYL 0.4.2 ME for 3xx. You can find an arc here:

http://chilly-willys-ice-flow.googlegro ... 3x-src.zip

It has all the routines that don't need to be in the prx in a separate .c file, and has the ability to invalidate/wbinv the caches on entry/exit of the ME function.

When using the ME on the Slim, remember that you cannot try to change the cpu clock once the ME has been activated or the PSP will freeze.
StrmnNrmn
Posts: 46
Joined: Wed Feb 14, 2007 11:32 pm
Location: London, UK
Contact:

Post by StrmnNrmn »

Cheers J.F. - I'll check that out.
CpuWhiz
Posts: 42
Joined: Mon Jun 04, 2007 1:30 am

Post by CpuWhiz »

I was thinking about how to call kernel functions from the ME. I warn you now that I am not the best at understanding low level hardware or assembly, but I try my best. The following could be wrong, totally off, or just plain stupid. Tell me what you think. The ME program could be a separate ELF with a copy of the SDK with special wrapper functions (described below) and its own version of malloc, free, etc. It would link to ME versions of these libraries. Any library like libmikmod would not need to be changed (just compiled and linked with the ME version of PSPSDK) and to the normal homebrew programmer everything would look the same.

The following memory (obviously needs some adjusting) would not be cached, available to both the main CPU and the ME, and would be outside the heap for both the main CPU and the ME (or set dynamically from a malloc or static variable on the main CPU and passed to some init function). The start address of 0x13370000 is just for the example and would not be the real location. I did not take into account the different memory locations for the main CPU and the ME when I wrote the example code below, so bear with me.

Code: Select all

0x13370000 Main Call Triggered
0x13370001 ME Call Triggered
0x13370100 Main Function Index
0x13370104 Main Function Return Value
0x13370108 Main Function Arg 1
0x1337010C Main Function Arg 2
0x13370110 Main Function Arg 3
0x13370200 ME Function Index
0x13370204 ME Function Arg 1
The process would go like this:
ME code calls the function sceIoOpen from the ME version of PSPSDK as seen below.
This adds the function to our shared memory and waits for the main CPU to finish with it.
The main CPU either via an interrupt or a function called from the main loop checks if it needs to call a kernel function. If it does, it looks up the function index in an array of function pointers and calls that function (handle_sceIoOpen).
The function handle_sceIoOpen (or similar) would pass the proper arguments and set the return value. Then when the kernel function returns would signal the ME to wake up and continue.

Calling a function on the ME (for example you might want to load a song, play, stop, etc) would work the same way but in reverse and with the ME function area in the above example memory map.

Code: Select all

#define FUNC_sceIoOpen  0
#define FUNC_sceIoClose 1

SceUID sceIoOpen(const char *file, int flags, SceMode mode)
{
	// Push the function
	(*((unsigned int*)0x13370100)) = FUNC_sceIoOpen;

	// Set the first argument past the return value
	// (char* so the pointer arithmetic below is in bytes)
	char *current_arg = (char*)0x13370104 + sizeof(SceUID);

	// Push the arguments
	(*((const char**)current_arg)) = file - 0x40000000; // Adjust the pointer for the main CPU memory location
	current_arg += sizeof(const char*);
	(*((int*)current_arg)) = flags;
	current_arg += sizeof(int);
	(*((SceMode*)current_arg)) = mode;

	// Trigger the function and wait
	// (single byte write, so the ME flag at 0x13370001 is not overwritten)
	(*((volatile unsigned char*)0x13370000)) = 1;
	waitForMainReturn();

	return (*((SceUID*)0x13370104));
}

///////////////////////////////////////////////////

void handle_sceIoOpen()
{
	(*((SceUID*)0x13370104)) = sceIoOpen(*((const char**)0x13370108), *((int*)0x1337010C), *((SceMode*)0x13370110));
	signalMeMainCallDone();
}
The above is a little ugly and could be made simpler to the eyes with macros. It might be a little slow so you wouldn't want to call a lot of kernel functions on the ME. One way you might be able to reduce calls is loading a file into memory (if there is room) and having the library (for example libmikmod or libpng) use the copy in memory so all the I/O calls would be done on the main CPU without the need to pass back and forth between the main CPU and the ME. Am I way off? Is there a better way? I might try to code up a example of this if it sounds good (emphasis on might and try). I still need to look over the code posted here a little more and figure out some specifics. I bet someone has already come up with an idea like this and has it in the works... anyway, how does it sound :)
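
As a rough illustration of the same idea with a struct instead of raw addresses, here is a host-side sketch of the mailbox and the function-table dispatch. Everything in it is hypothetical: the `Mailbox` layout, the `FUNC_*` indices and the two toy handlers are stand-ins, not real sce* calls, and both "sides" run in one thread so the logic can be checked on a PC. On real hardware the mailbox would have to live in uncached shared memory and the waiting/signalling would need proper synchronisation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function indices for the dispatch table. */
enum { FUNC_ADD = 0, FUNC_NEG = 1, FUNC_COUNT };

/* Hypothetical shared mailbox; mirrors the memory map idea above. */
typedef struct {
    volatile uint8_t  main_call_pending; /* set by the ME, cleared by the main CPU */
    volatile uint32_t func_index;        /* which handler to run */
    volatile int32_t  ret;               /* return value written by the main CPU */
    volatile int32_t  args[3];           /* up to three arguments */
} Mailbox;

typedef int32_t (*handler_fn)(const volatile int32_t *args);

/* Toy handlers standing in for wrappers such as handle_sceIoOpen. */
int32_t handle_add(const volatile int32_t *a) { return a[0] + a[1]; }
int32_t handle_neg(const volatile int32_t *a) { return -a[0]; }

static const handler_fn handlers[FUNC_COUNT] = { handle_add, handle_neg };

/* ME side: fill in the request, then raise the flag last. */
void me_post_call(Mailbox *mb, uint32_t func, int32_t a0, int32_t a1, int32_t a2) {
    mb->func_index = func;
    mb->args[0] = a0;
    mb->args[1] = a1;
    mb->args[2] = a2;
    mb->main_call_pending = 1;
}

/* Main CPU side: service one pending call. Returns 1 if a call was handled. */
int main_service(Mailbox *mb) {
    if (!mb->main_call_pending)
        return 0;
    uint32_t idx = mb->func_index;
    /* Unknown indices get -1 instead of jumping through a bad pointer. */
    mb->ret = (idx < FUNC_COUNT) ? handlers[idx](mb->args) : -1;
    mb->main_call_pending = 0;
    return 1;
}
```

The ME would call `me_post_call()` and then spin until `main_call_pending` drops back to 0 before reading `ret`; the main CPU would call `main_service()` from its main loop or an interrupt handler.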
SilverSpring
Posts: 110
Joined: Tue Feb 27, 2007 9:43 pm
Contact:

Post by SilverSpring »

The ME firmware has many of these libs already. It would be better to try make use of them instead of creating your own functions.

In <2.50, the ME fw is gzipped inside mebooter.prx. From 2.50+ they are stored as encrypted images in kd/resources.

I've worked out a few of these libs and their locations already, like the ME equivalent sysreg lib, and also cache functions.
Cpasjuste
Posts: 214
Joined: Sun May 29, 2005 8:28 am

Post by Cpasjuste »

J.F., could you write a little sample that decodes MP3 with the ME?
I can't manage it with my limited knowledge :/
StrmnNrmn
Posts: 46
Joined: Wed Feb 14, 2007 11:32 pm
Location: London, UK
Contact:

Post by StrmnNrmn »

J.F. wrote:By the way, the latest incarnation of the MediaEngine prx is with the source for SNES9X_TYL 0.4.2 ME for 3xx. You can find an arc here:

http://chilly-willys-ice-flow.googlegro ... 3x-src.zip
...
J.F./crazyc: I've found something very unusual going on with the cache invalidation functions called from me_loop(). Specifically, I've found that when I set postcache_len to -1 (to cause a call to dcache_wbinv_all()), this doesn't seem to be correctly writing back the data cache, i.e. it's as if the function isn't being called.

What's very confusing is that if I call dcache_wbinv_all() twice in me_loop, everything works as expected.

Can anyone verify that this implementation is correct?

Code: Select all

void dcache_wbinv_all() {
   int i;
   for(i = 0; i < 8192; i += 64)
      __builtin_allegrex_cache(0x14, i);
}
Looking at the MIPS reference manual, it looks like 0x14 corresponds to a 'Fill' operation on the instruction cache? Shouldn't it be 0x15 (i.e. I'd expect the lowest 2 bits to be 0x1 which corresponds to an operation on the data cache...)
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

StrmnNrmn wrote:J.F./crazyc: I've found something very unusual going on with the cache invalidation functions called from me_loop(). Specifically, I've found that when I set postcache_len to -1 (to cause a call to dcache_wbinv_all()), this doesn't seem to be correctly writing back the data cache, i.e. it's as if the function isn't being called.

What's very confusing is that if I call dcache_wbinv_all() twice in me_loop, everything works as expected.

Can anyone verify that this implementation is correct?

Code: Select all

void dcache_wbinv_all() {
   int i;
   for(i = 0; i < 8192; i += 64)
      __builtin_allegrex_cache(0x14, i);
}
Looking at the MIPS reference manual, it looks like 0x14 corresponds to a 'Fill' operation on the instruction cache? Shouldn't it be 0x15 (i.e. I'd expect the lowest 2 bits to be 0x1 which corresponds to an operation on the data cache...)
No, it's definitely 0x14 in the PSP kernel. Interestingly, in the sceKernelDcacheWritebackInvalidateAll, Sony does the cache operation twice in each iteration of the loop just like you suggest.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

The cache routines were from a previous post in this thread, so they may need some work. If the routine needs to be called twice, the change should be made. That might be why there are sometimes issues with the sound on the SNES emu using the ME. If you find any other problems, be sure to post on it. Eventually, we'll have a nice ME prx that's (hopefully) bug-free.

Thinking about the threads I've read on the PSP cache, it seems that most cache commands are run twice, supposedly because the MIPS cache is two-way set associative, meaning you have to do the command twice to affect both sets. So it would make sense that you would either have to call the function twice, or do it like this:

Code: Select all

void dcache_wbinv_all() {
   int i;
   for(i = 0; i < 8192; i += 64) {
      // braces needed so both cache ops run on every iteration
      __builtin_allegrex_cache(0x14, i);
      __builtin_allegrex_cache(0x14, i);
   }
}
which is how you normally see PSP cache code.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

J.F. wrote: Thinking about the threads I've read on the PSP cache, it seems that most cache commands are run twice, supposedly because the MIPS cache is two-way set associative, meaning you have to do the command twice to affect both sets.
That shouldn't be the case as 0x14 is index writeback invalidate and the psp has 128 (8192/64) total cache lines with 64 in each way. Each index should refer to an individual cache line.
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany
Contact:

Post by Raphael »

crazyc wrote:
J.F. wrote: Thinking about the threads I've read on the PSP cache, it seems that most cache commands are run twice, supposedly because the MIPS cache is two-way set associative, meaning you have to do the command twice to affect both sets.
That shouldn't be the case as 0x14 is index writeback invalidate and the psp has 128 (8192/64) total cache lines with 64 in each way. Each index should refer to an individual cache line.
Isn't the cache 16kb? Hence it would be 16384/64 = 256 lines with 128 each way. And since it's index writeback there are only 128 indices, so the function used for(i = 0; i < 8192; i += 64).
<Don't push the river, it flows.>
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki

Alexander Berl
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

Raphael wrote:Isn't the cache 16kb?
I believe the entire cache (i+d) is 16kb, but i'm not 100% sure.
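
The two readings above differ only in the assumed data cache size, so the arithmetic is easy to check. This sketch assumes 64-byte lines and a two-way set-associative cache, as stated in the posts; the `CacheGeometry` type and `cache_geometry` function are made up for the example, and it doesn't settle which size is actually right.

```c
/* Hypothetical helper type: total line count and lines per way. */
typedef struct {
    unsigned lines;          /* total cache lines */
    unsigned lines_per_way;  /* lines (indices) in each way */
} CacheGeometry;

/* Derive the line counts from cache size, line size and associativity. */
CacheGeometry cache_geometry(unsigned size_bytes, unsigned line_bytes, unsigned ways) {
    CacheGeometry g;
    g.lines = size_bytes / line_bytes;
    g.lines_per_way = g.lines / ways;
    return g;
}
```

With an 8 KB data cache, `cache_geometry(8192, 64, 2)` gives 128 lines, 64 per way, so a loop over 8192/64 = 128 indices would touch every line once. With a 16 KB data cache, `cache_geometry(16384, 64, 2)` gives 256 lines, 128 per way, so the same 128-index loop would only cover one way per pass.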
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

I'm too used to caches on regular CPUs, like the 68k or PPC. These MIPS caches are weird.
:)