| Author |
Message |
hlide
Joined: 10 Sep 2006 Posts: 750
|
Posted: Wed Feb 14, 2007 8:39 am Post subject: |
|
|
| For the moment, I'm quite busy with work, so I may only be able to test your code at the weekend. Sorry :(. |
|
| Back to top |
|
 |
jockyw2001
Joined: 29 Sep 2005 Posts: 339
|
Posted: Wed Feb 14, 2007 9:14 am Post subject: |
|
|
| No problem, atm it works fine with kernel mode functions. I can live with that for now. |
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Mon Sep 17, 2007 11:47 am Post subject: |
|
|
Here's something people may enjoy - I did a PRX for the MediaEngine. It's mostly based on the code in the previous page for the ME part, and KeyCleaner for the EBOOT/PRX parts. It works fine on my Slim PSP with 3.60 M33.
This doesn't yet have any attributions in the files. Before an "official" release, I wanted to see what you all think I should put in there. Pmpmod is GPL, so this may need to be as well (the prx part, at least).
http://groups.google.com/group/chilly-willys-ice-flow/web/MediaEnginePRX-v1.0a.zip
Comments and suggestions welcome. |
|
| Back to top |
|
 |
crazyc
Joined: 17 Jun 2005 Posts: 410
|
Posted: Mon Sep 17, 2007 12:58 pm Post subject: |
|
|
| I've got some code that includes locking, interrupts and exception handling that I suppose I should clean up and post. |
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Mon Sep 17, 2007 1:30 pm Post subject: |
|
|
| crazyc wrote: | | I've got some code that includes locking, interrupts and exception handling that I suppose I should clean up and post. |
That would be nice. I could work it into the next revision of the prx. |
|
| Back to top |
|
 |
Art
Joined: 09 Nov 2005 Posts: 647
|
Posted: Mon Sep 17, 2007 7:15 pm Post subject: |
|
|
I've not seen this before.
I would think everyone should be interested in a heap of CPU time for free. |
|
| Back to top |
|
 |
mypspdev
Joined: 11 Jul 2007 Posts: 178
|
Posted: Tue Sep 18, 2007 3:18 am Post subject: |
|
|
Does it work well for you? thanks |
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Tue Sep 18, 2007 5:23 am Post subject: |
|
|
| mypspdev wrote: |
Does it work well for you? thanks |
I'm going to do another example of using it today. The app with the prx is just a demonstration of functions. |
|
| Back to top |
|
 |
crazyc
Joined: 17 Jun 2005 Posts: 410
|
Posted: Tue Sep 18, 2007 12:55 pm Post subject: |
|
|
| J.F. wrote: | | That would be nice. I could work it into the next revision of the prx. | Well, for starters here's some dcache functions.
| Code: | #define load_tag(index, hi, lo) __builtin_allegrex_cache(0x10, index); \
__asm__ volatile ("mfc0 %0, $28\nmfc0 %1, $29\n":"=r"(lo), "=r"(hi));
#define store_tag(index, hi, lo) __asm__ (".set push\n" \
".set noreorder\n" \
"mtc0 %0, $28\n" \
"mtc0 %1, $29\n" \
".set pop\n" \
::"r"(lo),"r"(hi)); \
__builtin_allegrex_cache(0x11, index);
void dcache_wb_range(void *addr, int size) {
int i, j = (int)addr;
for(i = j; i < size+j; i += 64)
__builtin_allegrex_cache(0x1a, i);
}
void dcache_wbinv_all() {
int i;
for(i = 0; i < 8192; i += 64)
__builtin_allegrex_cache(0x14, i);
}
void dcache_inv_range(void *addr, int size) {
int i, j = (int)addr;
for(i = j; i < size+j; i += 64)
__builtin_allegrex_cache(0x19, i);
}
void dcache_wb_all() {
int i, hi, lo;
for(i = 0; i < 8192; i += 64) {
load_tag(i, hi, lo);
if(hi&(1<<20)) __builtin_allegrex_cache(0x1a, ((hi<<13) | i));
if(lo&(1<<20)) __builtin_allegrex_cache(0x1a, ((lo<<13) | i));
}
}
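void dcache_inv_all() {
int i = 0; /* initialised: store_tag() below passes i to the cache op as an index */
store_tag(i, 0, 0); /* zero the tag registers once */
for(i = 0; i < 8192; i += 64) {
__builtin_allegrex_cache(0x13, i);
__builtin_allegrex_cache(0x11, i);
}
}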
void dcache_wbinv_range(void *addr, int size) {
int i, j = (int)addr;
for(i = j; i < size+j; i += 64)
__builtin_allegrex_cache(0x1b, i);
} |
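One thing worth noting about the `*_range` loops above: they start at `addr` itself and step by 64, so a range that does not begin on a line boundary can miss its last cache line. For example, `addr = 0x1030` with `size = 0x40` issues a single op at `0x1030`, although the bytes from `0x1040` to `0x106F` live in the next line. The following is a minimal host-side sketch of a line-aligned walk, assuming a 64-byte line (implied by the stride above) and assuming the cache op acts on whichever line contains the address. The names `for_each_dcache_line`, `line_op_fn` and `record_line` are hypothetical; on the PSP the callback would wrap one `__builtin_allegrex_cache` call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE 64u /* assumption: line size implied by the += 64 stride */

/* Hypothetical callback: on the PSP this would issue one cache op. */
typedef void (*line_op_fn)(uintptr_t line, void *ctx);

/* Visit every cache line overlapped by [addr, addr + size).
   Returns the number of lines visited. */
static size_t for_each_dcache_line(uintptr_t addr, size_t size,
                                   line_op_fn op, void *ctx)
{
    const uintptr_t mask = ~(uintptr_t)(DCACHE_LINE - 1);
    uintptr_t line, last;
    size_t n = 0;

    assert(op != NULL);
    if (size == 0)
        return 0;

    line = addr & mask;              /* round the start down to its line */
    last = (addr + size - 1) & mask; /* line holding the final byte */
    for (;;) {
        op(line, ctx);
        n++;
        if (line == last)
            break;
        line += DCACHE_LINE;
    }
    return n;
}

/* Host-side stand-in that just remembers the last line touched. */
static void record_line(uintptr_t line, void *ctx)
{
    *(uintptr_t *)ctx = line;
}
```

With this walk, the `0x1030` / `0x40` case touches two lines (`0x1000` and `0x1040`) instead of one.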
|
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Tue Sep 18, 2007 2:49 pm Post subject: |
|
|
| Thanks. Good cache routines can improve the speed of certain things quite a bit. |
|
| Back to top |
|
 |
crazyc
Joined: 17 Jun 2005 Posts: 410
|
Posted: Wed Sep 19, 2007 1:13 am Post subject: |
|
|
| J.F. wrote: | | Thanks. Good cache routines can improve the speed of certain things quite a bit. | The FlushME function confuses me. Unless I'm missing something, it looks like you are flushing the SC. |
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Wed Sep 19, 2007 5:31 am Post subject: |
|
|
| crazyc wrote: | | J.F. wrote: | | Thanks. Good cache routines can improve the speed of certain things quite a bit. | The FlushME function confuses me. Unless I'm missing something, it looks like you are flushing the SC. |
:)
You're right. I goofed on that one. That's another reason I posted here before releasing. Another goof - you don't need the PRX for setting the signals. SignalME should be a macro in the program. So I have some goofs to fix as well as some extra stuff to add as you post.
Thanks for the heads up! |
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Tue Oct 02, 2007 10:41 pm Post subject: |
|
|
Okay, here's my latest. It has some stuff taken out of the kernel mode prx that didn't need to be there in the first place and shuffled into a .c file and .h file you include with the project. The kernel mode prx also takes into account 3.71, so it should work on any 3.xx firmware. It works well on my 3.71 M33 Slim.
The example is a bit more substantial than the first test - it uses the ME to color cycle the screen. The code for this test is based on some old(ish) code I found here on using a framebuffer in the main memory, so this demonstrates a number of things - putting the framebuffer into main memory, getting the info about the framebuffer, setting the debug screenbase so you can still use the debug printf function, and using the ME to draw to the display.
You'll notice I use the non-cached address of the framebuffer on the ME. One thing I learned when I made the first Amiga PPC port of DOOM - don't draw to cached buffers. You flood the cache and slow down the rest of the program. Drawing to non-cached memory means the difference between 10 FPS and 100 FPS... at least on a PPC, so I figure it's probably about the same on the MIPS. :)
http://groups.google.com/group/chilly-willys-ice-flow/web/DisplayTest-v1.zip |
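For readers who have not used the non-cached address trick mentioned above, here is a minimal sketch of the address arithmetic. It assumes the usual PSP convention that the uncached mirror of a main-memory address is selected by setting bit `0x40000000`; the helper names and the sample address in the usage note are illustrative only and are not taken from the DisplayTest source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the uncached mirror is selected by address bit 0x40000000. */
#define UNCACHED_BIT ((uintptr_t)0x40000000u)

/* Return the uncached alias of a cached address. */
static inline uintptr_t to_uncached(uintptr_t addr)
{
    return addr | UNCACHED_BIT;
}

/* Return the cached alias of an address (clears the uncached bit). */
static inline uintptr_t to_cached(uintptr_t addr)
{
    return addr & ~UNCACHED_BIT;
}

/* Non-zero if the address already points at the uncached mirror. */
static inline int is_uncached(uintptr_t addr)
{
    return (addr & UNCACHED_BIT) != 0;
}
```

For a framebuffer at a hypothetical cached address `0x08800000`, `to_uncached` yields `0x48800000`. Drawing through that alias bypasses the data cache, so no writeback is needed before the display hardware reads the pixels.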
|
| Back to top |
|
 |
jas0nuk
Joined: 27 Apr 2006 Posts: 137
|
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Wed Oct 03, 2007 2:18 am Post subject: |
|
|
Probably some delay in google updating all the group links. I have no idea how their web server software works - I just upload the file and it supplies a link. :) |
|
| Back to top |
|
 |
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Mon Nov 12, 2007 12:26 am Post subject: |
|
|
J.F.: In your DisplayTest sample (thanks for this by the way - it's excellent), the prx has the following module declaration:
| Code: | | PSP_MODULE_INFO("MediaEngine", 0x1006, VERS, REVS); |
I appreciate that the 0x1000 flag is for kernel mode. What does 0x6 signify? I can't find any documentation for this. |
|
| Back to top |
|
 |
jas0nuk
Joined: 27 Apr 2006 Posts: 137
|
Posted: Mon Nov 12, 2007 4:02 am Post subject: |
|
|
| StrmnNrmn wrote: | J.F.: In your DisplayTest sample (thanks for this by the way - it's excellent), the prx has the following module declaration:
| Code: | | PSP_MODULE_INFO("MediaEngine", 0x1006, VERS, REVS); |
I appreciate that the 0x1000 flag is for kernel mode. What does 0x6 signify? I can't find any documentation for this. | http://forums.ps2dev.org/viewtopic.php?p=57753#57753 :)
StrmnNrmn, please read my reply (no. 6) to your Media Engine blog entry :p |
|
| Back to top |
|
 |
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Mon Nov 12, 2007 5:01 am Post subject: |
|
|
| jas0nuk wrote: | | StrmnNrmn wrote: | J.F.: In your DisplayTest sample (thanks for this by the way - it's excellent), the prx has the following module declaration:
| Code: | | PSP_MODULE_INFO("MediaEngine", 0x1006, VERS, REVS); |
I appreciate that the 0x1000 flag is for kernel mode. What does 0x6 signify? I can't find any documentation for this. | http://forums.ps2dev.org/viewtopic.php?p=57753#57753 :) |
Awesome - that's exactly the page I failed to find while searching :) I guess 0x1006 is kernel mode/load/start then?
| jas0nuk wrote: | | StrmnNrmn, please read my reply (no. 6) to your Media Engine blog entry :p |
Will do - thanks :) |
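As a small illustration of the "kernel mode/load/start" reading of `0x1006`, the sketch below splits an attribute word into named bits. The bit values are the ones I recall from the `PspModuleInfoAttr` enum in PSPSDK's `pspmoduleinfo.h` and should be checked against your copy of the SDK; the `ATTR_*` names and `describe_module_attr` are hypothetical helpers, not SDK symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed attribute bits (verify against pspmoduleinfo.h). */
enum {
    ATTR_NO_STOP      = 0x0001,
    ATTR_SINGLE_LOAD  = 0x0002,
    ATTR_SINGLE_START = 0x0004,
    ATTR_KERNEL       = 0x1000
};

/* Write a space-separated description of the set bits into buf. */
static const char *describe_module_attr(unsigned attr, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s%s%s%s",
             (attr & ATTR_KERNEL)       ? "kernel "       : "",
             (attr & ATTR_SINGLE_START) ? "single-start " : "",
             (attr & ATTR_SINGLE_LOAD)  ? "single-load "  : "",
             (attr & ATTR_NO_STOP)      ? "no-stop "      : "");
    return buf;
}
```

Under those assumed values, `0x1006` is `ATTR_KERNEL | ATTR_SINGLE_START | ATTR_SINGLE_LOAD`, which matches the guess in the post above.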
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Mon Nov 12, 2007 6:14 am Post subject: |
|
|
By the way, the latest incarnation of the MediaEngine prx is with the source for SNES9X_TYL 0.4.2 ME for 3xx. You can find an arc here:
http://chilly-willys-ice-flow.googlegroups.com/web/snes9xTYL-0.4.2me_fw3x-src.zip
It has all the routines that don't need to be in the prx in a separate .c file, and has the ability to invalidate/wbinv the caches on entry/exit of the ME function.
When using the ME on the Slim, remember that you cannot try to change the cpu clock once the ME has been activated or the PSP will freeze. |
|
| Back to top |
|
 |
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Mon Nov 12, 2007 9:13 am Post subject: |
|
|
| Cheers J.F. - I'll check that out. |
|
| Back to top |
|
 |
CpuWhiz
Joined: 04 Jun 2007 Posts: 42
|
Posted: Thu Nov 15, 2007 3:30 am Post subject: |
|
|
I was thinking about how to call kernel functions from the ME. I warn you now that I am not the best at understanding low level hardware or assembly, but I try my best. The following could be wrong, totally off, or just plain stupid. Tell me what you think. The ME program could be a separate elf with a copy of the SDK with special wrapper functions (described below) and its own version of malloc, free, etc. It would link to ME versions of these libraries. Any library like libmikmod would not need to be changed (just compiled and linked with the ME version of PSPSDK) and to the normal homebrew programmer everything would look the same.
The following memory (obviously needs some adjusting) would not be cached, available to both the main CPU and the ME, and would be outside the heap for both the main CPU and the ME (or set dynamically from a malloc or static variable on the main CPU and passed to some init function). The start address of 0x13370000 is just for the example and would not be the real location. I did not take into account the different memory locations for the main CPU and the ME when I wrote the example code below, so bear with me.
| Code: | 0x13370000 Main Call Triggered
0x13370001 ME Call Triggered
0x13370100 Main Function Index
0x13370104 Main Function Return Value
0x13370108 Main Function Arg 1
0x1337010C Main Function Arg 2
0x13370110 Main Function Arg 3
0x13370200 ME Function Index
0x13370204 ME Function Arg 1 |
The process would go like this:
ME code calls the function sceIoOpen from the ME version of PSPSDK as seen below.
This adds the function to our shared memory and waits for the main CPU to finish with it.
The main CPU either via an interrupt or a function called from the main loop checks if it needs to call a kernel function. If it does, it looks up the function index in an array of function pointers and calls that function (handle_sceIoOpen).
The function handle_sceIoOpen (or similar) would pass the proper arguments and set the return value. Then when the kernel function returns would signal the ME to wake up and continue.
Calling a function on the ME (for example you might want to load a song, play, stop, etc) would work the same way but in reverse and with the ME function area in the above example memory map.
| Code: | #define FUNC_sceIoOpen 0
#define FUNC_sceIoClose 1
SceUID sceIoOpen(const char *file, int flags, SceMode mode)
{
// Push the function
(*((volatile unsigned int*)0x13370100)) = FUNC_sceIoOpen;
// Set the first argument past the return value
char *current_arg = (char*)(0x13370104 + sizeof(SceUID)); // char* so the += steps below are valid C
// Push the arguments
(*((const char**)current_arg)) = file - 0x40000000; // Adjust the pointer for the main CPU memory location
current_arg += sizeof(const char*);
(*((int*)current_arg)) = flags;
current_arg += sizeof(int);
(*((SceMode*)current_arg)) = mode;
// Trigger the function and wait (byte write, so the ME flag at 0x13370001 is not overwritten)
(*((volatile unsigned char*)0x13370000)) = 1;
waitForMainReturn();
return (*((volatile SceUID*)0x13370104));
}
///////////////////////////////////////////////////
void handle_sceIoOpen()
{
(*((volatile SceUID*)0x13370104)) = sceIoOpen(*((const char**)0x13370108), *((int*)0x1337010C), *((SceMode*)0x13370110));
signalMeMainCallDone();
} |
The above is a little ugly and could be made easier on the eyes with macros. It might be a little slow so you wouldn't want to call a lot of kernel functions on the ME. One way you might be able to reduce calls is loading a file into memory (if there is room) and having the library (for example libmikmod or libpng) use the copy in memory so all the I/O calls would be done on the main CPU without the need to pass back and forth between the main CPU and the ME. Am I way off? Is there a better way? I might try to code up an example of this if it sounds good (emphasis on might and try). I still need to look over the code posted here a little more and figure out some specifics. I bet someone has already come up with an idea like this and has it in the works... anyway, how does it sound :) |
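The request/dispatch flow described in this post can be tried out on a PC before touching the hardware. The sketch below is a single-threaded simulation under stated assumptions: the `mailbox_t` layout only loosely mirrors the memory map above, the two handlers (`handle_add`, `handle_strlen`) are toy stand-ins for kernel calls, and every name in it is hypothetical. On a real PSP the mailbox would have to live in uncached memory shared by both CPUs, and the "ME side" would spin until the trigger flag clears.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FUNC_ADD = 0, FUNC_STRLEN = 1, FUNC_COUNT }; /* hypothetical indices */

/* Hypothetical mailbox, loosely following the memory map in the post. */
typedef struct {
    volatile uint8_t  main_call_triggered; /* set by ME, cleared by main CPU */
    volatile uint32_t func_index;
    volatile intptr_t ret;
    volatile intptr_t arg[3];
} mailbox_t;

typedef intptr_t (*handler_fn)(volatile mailbox_t *mb);

static intptr_t handle_add(volatile mailbox_t *mb)
{
    return mb->arg[0] + mb->arg[1];
}

static intptr_t handle_strlen(volatile mailbox_t *mb)
{
    return (intptr_t)strlen((const char *)mb->arg[0]);
}

/* Function table indexed by func_index, as the post suggests. */
static const handler_fn handlers[FUNC_COUNT] = { handle_add, handle_strlen };

/* "ME side": fill in the request, then raise the trigger last. */
static void me_post(volatile mailbox_t *mb, uint32_t func,
                    intptr_t a0, intptr_t a1, intptr_t a2)
{
    mb->func_index = func;
    mb->arg[0] = a0;
    mb->arg[1] = a1;
    mb->arg[2] = a2;
    mb->main_call_triggered = 1;
}

/* "Main CPU side": poll once; returns 1 if a request was serviced. */
static int main_poll(volatile mailbox_t *mb)
{
    if (!mb->main_call_triggered)
        return 0;
    if (mb->func_index < FUNC_COUNT)
        mb->ret = handlers[mb->func_index](mb);
    else
        mb->ret = -1;                /* unknown function index */
    mb->main_call_triggered = 0;     /* doubles as the completion signal */
    return 1;
}
```

Bounds-checking the index before the table lookup keeps a corrupted mailbox from turning into a jump through a garbage pointer, which seems worth the extra compare when two CPUs share the structure.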
|
| Back to top |
|
 |
SilverSpring
Joined: 27 Feb 2007 Posts: 115
|
Posted: Thu Nov 15, 2007 10:25 am Post subject: |
|
|
The ME firmware has many of these libs already. It would be better to try to make use of them instead of creating your own functions.
In <2.50, the ME fw is gzipped inside mebooter.prx. From 2.50+ they are stored as encrypted images in kd/resources.
I've worked out a few of these libs and their locations already, like the ME equivalent sysreg lib, and also cache functions. |
|
| Back to top |
|
 |
Cpasjuste
Joined: 29 May 2005 Posts: 214
|
Posted: Wed Nov 21, 2007 8:24 am Post subject: |
|
|
J.F, could you write a little sample to decode mp3 with the ME ?
I'm unable to do it with my little knowledge :/ |
|
| Back to top |
|
 |
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Fri Dec 21, 2007 8:31 am Post subject: |
|
|
J.F./crazyc: I've found something very unusual going on with the cache invalidation functions called from me_loop(). Specifically, I've found that when I set postcache_len to -1 (to cause a call to dcache_wbinv_all()), this doesn't seem to be correctly writing back the data cache, i.e. it's as if the function isn't being called.
What's very confusing is that if I call dcache_wbinv_all() twice in me_loop, everything works as expected.
Can anyone verify that this implementation is correct?
| Code: | void dcache_wbinv_all() {
int i;
for(i = 0; i < 8192; i += 64)
__builtin_allegrex_cache(0x14, i);
} |
Looking at the MIPS reference manual, it looks like 0x14 corresponds to a 'Fill' operation on the instruction cache? Shouldn't it be 0x15 (i.e. I'd expect the lowest 2 bits to be 0x1 which corresponds to an operation on the data cache...) |
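To make the reasoning in this post concrete: in the standard MIPS32 encoding, the 5-bit `CACHE` code is split into a 2-bit cache selector (low bits; 0 = primary instruction, 1 = primary data) and a 3-bit operation (high bits). The sketch below decodes a code under that standard split. It describes generic MIPS32 only and makes no claim about how the Allegrex numbers its cache ops; `mips_cache_code` and `decode_mips32_cache_op` are names made up for this example.

```c
#include <assert.h>

/* Fields of a standard MIPS32 CACHE code (not Allegrex-specific). */
typedef struct {
    unsigned cache; /* bits [1:0]: 0 = I, 1 = D, 2 = tertiary, 3 = secondary */
    unsigned op;    /* bits [4:2]: operation number within that cache */
} mips_cache_code;

/* Split a 5-bit CACHE code into its selector and operation fields. */
static mips_cache_code decode_mips32_cache_op(unsigned code)
{
    mips_cache_code d;
    assert(code < 32);
    d.cache = code & 0x3u;
    d.op = (code >> 2) & 0x7u;
    return d;
}
```

Under this split, `0x14` decodes to cache 0 (instruction) with operation 5, and `0x15` to cache 1 (data) with operation 5, which is exactly why `0x14` looks wrong when read against the generic manual.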
|
| Back to top |
|
 |
crazyc
Joined: 17 Jun 2005 Posts: 410
|
Posted: Fri Dec 21, 2007 11:08 am Post subject: |
|
|
| StrmnNrmn wrote: | J.F./crazyc: I've found something very unusual going on with the cache invalidation functions called from me_loop(). Specifically, I've found that when I set postcache_len to -1 (to cause a call to dcache_wbinv_all()), this doesn't seem to be correctly writing back the data cache, i.e. it's as if the function isn't being called.
What's very confusing is that if I call dcache_wbinv_all() twice in me_loop, everything works as expected.
Can anyone verify that this implementation is correct?
| Code: | void dcache_wbinv_all() {
int i;
for(i = 0; i < 8192; i += 64)
__builtin_allegrex_cache(0x14, i);
} |
Looking at the MIPS reference manual, it looks like 0x14 corresponds to a 'Fill' operation on the instruction cache? Shouldn't it be 0x15 (i.e. I'd expect the lowest 2 bits to be 0x1 which corresponds to an operation on the data cache...) | No, it's definitely 0x14 in the PSP kernel. Interestingly, in the sceKernelDcacheWritebackInvalidateAll, Sony does the cache operation twice in each iteration of the loop just like you suggest. |
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Fri Dec 21, 2007 11:18 am Post subject: |
|
|
The cache routines were from a previous post in this thread, so they may need some work. If the routine needs to be called twice, the change should be made. That might be why there are sometimes issues with the sound on the SNES emu using the ME. If you find any other problems, be sure to post on it. Eventually, we'll have a nice ME prx that's (hopefully) bug-free.
Thinking about the threads I've read on the PSP cache, it seems that most cache commands are run twice, supposedly because the MIPS cache is two-way set associative, meaning you have to do the command twice to affect both sets. So it would make sense that you would either have to call the function twice, or do it like this:
| Code: | void dcache_wbinv_all() {
int i;
for(i = 0; i < 8192; i += 64) {
__builtin_allegrex_cache(0x14, i);
__builtin_allegrex_cache(0x14, i);
}
} |
which is how you normally see PSP cache code. |
|
| Back to top |
|
 |
crazyc
Joined: 17 Jun 2005 Posts: 410
|
Posted: Fri Dec 21, 2007 11:29 am Post subject: |
|
|
| J.F. wrote: |
Thinking about the threads I've read on the PSP cache, it seems that most cache commands are run twice, supposedly because the MIPS cache is two-way set associative, meaning you have to do the command twice to affect both sets. | That shouldn't be the case as 0x14 is index writeback invalidate and the psp has 128 (8192/64) total cache lines with 64 in each way. Each index should refer to an individual cache line. |
|
| Back to top |
|
 |
Raphael

Joined: 17 Jan 2006 Posts: 646 Location: Germany
|
Posted: Fri Dec 21, 2007 12:05 pm Post subject: |
|
|
| crazyc wrote: | | J.F. wrote: |
Thinking about the threads I've read on the PSP cache, it seems that most cache commands are run twice, supposedly because the MIPS cache is two-way set associative, meaning you have to do the command twice to affect both sets. | That shouldn't be the case as 0x14 is index writeback invalidate and the psp has 128 (8192/64) total cache lines with 64 in each way. Each index should refer to an individual cache line. |
Isn't the cache 16kb? Hence it would be 16384/64 = 256 lines with 128 each way. And since it's index writeback there are only 128 indices, so the function used for(i = 0; i < 8192; i += 64). _________________ <Don't push the river, it flows.>
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki
Alexander Berl |
|
| Back to top |
|
 |
crazyc
Joined: 17 Jun 2005 Posts: 410
|
Posted: Fri Dec 21, 2007 12:22 pm Post subject: |
|
|
| Raphael wrote: | | Isn't the cache 16kb? | I believe the entire cache (i+d) is 16kb, but i'm not 100% sure. |
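The two line counts being debated here follow directly from the assumed cache size, so a few lines of arithmetic may help keep them apart. This is only a sketch of the division involved; the 64-byte line and 2-way associativity are the figures used in the posts above, and neither cache size is confirmed by it. `cache_geom` and `cache_geometry` are hypothetical names.

```c
#include <assert.h>

typedef struct {
    unsigned total_lines;   /* size / line size */
    unsigned lines_per_way; /* total lines / number of ways */
} cache_geom;

/* Derive line counts from a cache size, line size and associativity. */
static cache_geom cache_geometry(unsigned size_bytes, unsigned line_bytes,
                                 unsigned ways)
{
    cache_geom g;
    assert(line_bytes != 0 && ways != 0);
    g.total_lines = size_bytes / line_bytes;
    g.lines_per_way = g.total_lines / ways;
    return g;
}
```

With an 8 KB data cache this gives 128 lines, 64 per way (crazyc's numbers); with a 16 KB data cache it gives 256 lines, 128 per way (Raphael's numbers). Either way, the `for(i = 0; i < 8192; i += 64)` loop issues 128 index operations, so which reading is right decides whether one pass reaches every line.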
|
| Back to top |
|
 |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Fri Dec 21, 2007 2:40 pm Post subject: |
|
|
I'm too used to caches on regular CPUs, like the 68k or PPC. These MIPS caches are weird.
:) |
|
| Back to top |
|
 |
|