Media Engine?

Brunni
Posts: 186
Joined: Sat Oct 08, 2005 10:27 pm

Post by Brunni »

Hello
Sorry to bump this topic. I'm trying to continue the research on the Media Engine, but I'm not a very good programmer (the programming itself is fine, but I'm very bad with makefiles, gcc switches, the Unix command line and so on, so this SDK is very difficult for me to work with).
I would like to know how far the research has got: has anything more been done since executing a loop on the ME? I have searched the Internet but found nothing; it seems the idea has been abandoned (too complicated?).
From what I have seen, it is not yet possible to make the ME execute C code directly from the project, because GCC generates absolute jumps (j), which make the ME execute code elsewhere (main memory or cache), and the program crashes shortly after (maybe when the cache is flushed). I should relocate it (maybe the .text section?) to 0xbfc00040, but I don't know how to do that.
I tried crazyc's libme, but after a lot of compilation problems (I also had to convert ./build to a standard Makefile because it must create a PBP file) it compiled and the ELF is loaded (pspMeLoadExec returns 0), but it doesn't work: the ASM code at 0xbfc00040 runs correctly, then jumps to k0 (883000f8), and then the ME seems to crash. Maybe the ELF format generated by my compiler version is different? I can't even tell what it executes since I have no debugger.
Any info about the ME (or at least how to use melib) would be much appreciated, thanks ^^
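
The absolute-jump problem described in this post can be illustrated with the encoding of the MIPS `j` instruction: it stores only the low 28 bits of the target and takes the top 4 bits from the address of its delay slot, so code linked for one 256 MB region and copied into another ends up jumping somewhere unintended. The following is a minimal host-side sketch of that arithmetic; the helper names and the link address 0x08830100 are illustrative assumptions, not taken from any SDK.

```c
#include <stdint.h>

/* Encode a MIPS `j target` instruction: opcode 2 in bits 31..26,
   target word index (target >> 2) in bits 25..0.
   Helper name is hypothetical, for illustration only. */
static uint32_t mips_encode_j(uint32_t target)
{
    return (2u << 26) | ((target >> 2) & 0x03FFFFFFu);
}

/* Address a `j` instruction really jumps to when it executes at `pc`:
   the top 4 bits come from the delay slot address (pc + 4),
   the low 28 bits come from the instruction itself. */
static uint32_t mips_j_destination(uint32_t pc, uint32_t insn)
{
    return ((pc + 4u) & 0xF0000000u) | ((insn & 0x03FFFFFFu) << 2);
}
```

With an instruction built for the illustrative target 0x08830100, executing it at 0x08830000 reaches 0x08830100 as intended, while executing the same instruction word at 0xBFC00040 reaches 0xB8830100. PC-relative branches and register jumps (`jr`, `jalr`) do not have this property, which is one reason hand-written assembly at the boot vector can work while compiled C does not.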
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

Sbrk is linked into newlib and psplibc now, and it is incompatible.
Brunni
Posts: 186
Joined: Sat Oct 08, 2005 10:27 pm

Post by Brunni »

Okay thanks.
I rewrote my own loader, which is simpler: it just resets the ME and loads a main routine for it. Then it can execute code from RAM just like the main processor (or at least I think it should be okay...)
But can you tell me whether you got any further with the ME?

[Edit] I got some more code to work; however, the lack of kernel functions / VRAM access is annoying and terribly limits what we can do with it :(
The base Media Engine implementation accesses the kernel, no? So it should be possible to use it here too?
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

Brunni wrote:The base Media Engine implementation accesses the kernel, no? So it should be possible to use it here too?
The ME kernel doesn't call the PSP kernel, it's completely self contained. BTW, there is something mapped at 0x04000000, where the VRAM is on the main cpu, but I'm not sure what it is.
Brunni
Posts: 186
Joined: Sat Oct 08, 2005 10:27 pm

Post by Brunni »

Yes, I noticed that too, but it's not VRAM. According to Sony's block diagram, the ME has access to VRAM through the main bus.
It might be a stupid question since I know nothing about the PSP kernel, but might it be possible to call OS functions (sceKernel...) manually (i.e. not going through the ME kernel), like a set of ROM calls on low-spec systems?
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

Brunni wrote:It might be a stupid question since I know nothing about the PSP kernel, but might it be possible to call OS functions (sceKernel...) manually (i.e. not going through the ME kernel), like a set of ROM calls on low-spec systems?
Even if the kernel were reentrant or had locks at every entrypoint, any call that touched any mmio would crash.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

is PSP kernel memory mapped into the ME memory space?
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

holger wrote:is PSP kernel memory mapped into the ME memory space?
Yes, at the same address.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

communication between the two cores could be implemented via shared mailbox memory in kernel space or SRAM (a kind of pseudo-RPC, e.g. using NIDs; the ME could then execute kernel calls indirectly through the main core). Together with a simple scheduler on the ME core, general POSIX threading might then work and be useful in general applications; the PSP would appear as a dual-processor system...
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

holger wrote:communication between the two cores could be implemented via shared mailbox memory in kernel space or SRAM (a kind of pseudo-RPC, e.g. using NIDs; the ME could then execute kernel calls indirectly through the main core). Together with a simple scheduler on the ME core, general POSIX threading might then work and be useful in general applications; the PSP would appear as a dual-processor system...
You could do that, but note that the CPUs aren't cache coherent, so you wouldn't want to do it much.
mrbrown
Site Admin
Posts: 1537
Joined: Sat Jan 17, 2004 11:24 am

Post by mrbrown »

Er, there's already RPCs to the ME. Consult your local NID list.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

mrbrown wrote:Er, there's already RPCs to the ME. Consult your local NID list.
Those functions are only usable when the ME kernel is running. If you are running your own code (the ME kernel doesn't appear to provide any interface for that), you have to roll your own RPC.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

crazyc wrote:
holger wrote:communication between the two cores could be implemented via shared mailbox memory in kernel space or SRAM (a kind of pseudo-RPC, e.g. using NIDs; the ME could then execute kernel calls indirectly through the main core). Together with a simple scheduler on the ME core, general POSIX threading might then work and be useful in general applications; the PSP would appear as a dual-processor system...
You could do that, but note that the CPUs aren't cache coherent, so you wouldn't want to do it much.
writes to the mailbox would need to be uncached, sure (a minimal protocol could e.g. transfer only the register set, including the PC of the called function; this would also work bidirectionally). It's also likely that there is an interrupt line connecting the cores, which could be used for notification...

To be sure, the communication interrupt handler could also flush caches. Nevertheless, it sounds like a fair amount of work; we're basically talking about a mini-OS.
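
The mailbox idea discussed here can be sketched as a small state machine around one shared request slot. This is only a host-side model under stated assumptions: all names (`mailbox_t`, `mbox_post`, `mbox_serve_once`, `mbox_collect`) are hypothetical, two integer arguments stand in for the register set, and the cache-coherency and interrupt questions raised in the thread are not modeled at all. On real hardware both cores would have to reach the slot through uncached accesses or explicit cache maintenance.

```c
#include <stdint.h>

/* Function type a request may name; two arguments stand in for $a0/$a1. */
typedef int32_t (*rpc_fn)(int32_t a0, int32_t a1);

enum { MBOX_IDLE = 0, MBOX_REQUEST = 1, MBOX_DONE = 2 };

/* Hypothetical shared slot. Cache behavior is NOT modeled here. */
typedef struct {
    volatile uint32_t state;  /* MBOX_IDLE, MBOX_REQUEST or MBOX_DONE */
    rpc_fn  fn;               /* function the serving core should call */
    int32_t a0, a1;           /* arguments */
    int32_t result;           /* return value (stand-in for $v0) */
} mailbox_t;

/* Caller side: fill in the request, then publish it by writing state last. */
static int mbox_post(mailbox_t *mb, rpc_fn fn, int32_t a0, int32_t a1)
{
    if (mb->state != MBOX_IDLE)
        return -1;            /* slot busy */
    mb->fn = fn;
    mb->a0 = a0;
    mb->a1 = a1;
    mb->state = MBOX_REQUEST;
    return 0;
}

/* Serving side: one polling step. Returns 1 if a request was executed. */
static int mbox_serve_once(mailbox_t *mb)
{
    if (mb->state != MBOX_REQUEST)
        return 0;
    mb->result = mb->fn(mb->a0, mb->a1);
    mb->state = MBOX_DONE;
    return 1;
}

/* Caller side: fetch the result and free the slot. */
static int mbox_collect(mailbox_t *mb, int32_t *out)
{
    if (mb->state != MBOX_DONE)
        return -1;            /* not finished yet */
    *out = mb->result;
    mb->state = MBOX_IDLE;
    return 0;
}

/* Example target function for a request. */
static int32_t add_i32(int32_t a, int32_t b) { return a + b; }
```

Writing the state word last on each side is what lets a single flag carry the handshake; an interrupt between the cores, if one is usable, would replace the polling in `mbox_serve_once` but not the slot layout.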
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

mrbrown wrote:Er, there's already RPCs to the ME. Consult your local NID list.
may be worth an investigation. Transferring the register set of the calling function has the charm of an extremely trivial implementation and minimal overhead: no NIDs need to be resolved, and register sets are saved/restored on interrupts anyway, so there is no need to write much new code.
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

holger wrote:It's also likely that there is an interrupt line connecting the cores, which could be used for notification...
sceSysregInterruptToOther probably triggers an interrupt on the ME. Also cop0 $22 appears to be connected between the two.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

crazyc wrote:BTW, there is something mapped at 0x04000000, where the VRAM is on the main cpu, but I'm not sure what it is.
Have you dumped the MMU map? What physical address belongs to this virtual one?
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

crazyc wrote:
holger wrote:It's also likely that there is an interrupt line connecting the cores, which could be used for notification...
sceSysregInterruptToOther probably triggers an interrupt on the ME. Also cop0 $22 appears to be connected between the two.
INT 31 catches the ME irq on the main core.
mrbrown
Site Admin
Posts: 1537
Joined: Sat Jan 17, 2004 11:24 am

Post by mrbrown »

holger wrote:Have you dumped the MMU map? What physical address belongs to this virtual one?
What MMU?
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

mrbrown wrote:
holger wrote:Have you dumped the MMU map? What physical address belongs to this virtual one?
What MMU?
Do both CPU cores share the same TLB entries, or does the ME use a separate page table for virtual/physical address lookup?
mrbrown
Site Admin
Posts: 1537
Joined: Sat Jan 17, 2004 11:24 am

Post by mrbrown »

Neither CPU has a normal MIPS MMU, but some funky linear address mapping crap instead (Sony has stated the ALLEGREX doesn't, I'm just assuming the ME doesn't either).

Consult your slides :).
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

:) errmmm... which slides? are they available online somewhere?
mrbrown
Site Admin
Posts: 1537
Joined: Sat Jan 17, 2004 11:24 am

Post by mrbrown »

florinsasu
Posts: 47
Joined: Wed Dec 15, 2004 4:23 am

Post by florinsasu »

crazyc wrote:
holger wrote:It's also likely that there is an interrupt line connecting the cores, which could be used for notification...
sceSysregInterruptToOther probably triggers an interrupt on the ME. Also cop0 $22 appears to be connected between the two.
it looks like $22 is 0 on the system core processor and != 0 on the media engine processor. Look at the exception vector at 0xBFC00000; the code is like this:
-----------8<------------
$c0c6 = $v0;
$v0 = $c0r22; //processor detection
if (0 != $v0) goto 0xBFC00040; //exception vector for ME
call $c0c9;
----------->8------------
they use $22 for processor detection, as probably $PrID is the same.
Brunni
Posts: 186
Joined: Sat Oct 08, 2005 10:27 pm

Post by Brunni »

crazyc wrote:BTW, there is something mapped at 0x04000000, where the VRAM is on the main cpu, but I'm not sure what it is.
Referring to:
http://pc.watch.impress.co.jp/docs/2004 ... gai_3a.gif
It could be that 2 MB sub-memory in the block diagram, but I'm not sure. The fact is that it's not the same memory as the one the main processor has access to.
It seems that this eDRAM is faster than the main (32 MB) DDR. If we could relocate there a program whose main loop is too big to fit in the I-cache, we might get a speed improvement (but maybe I'm totally wrong).
What do you think about this? I don't know whether a better RAM speed would improve things a lot, or whether it could even be used as normal memory... (it's 512 bits, right?)
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

Brunni wrote:
crazyc wrote:BTW, there is something mapped at 0x04000000, where the VRAM is on the main cpu, but I'm not sure what it is.
Referring to:
http://pc.watch.impress.co.jp/docs/2004 ... gai_3a.gif
It could be that 2 MB sub-memory in the block diagram, but I'm not sure. The fact is that it's not the same memory as the one the main processor has access to.
It seems that this eDRAM is faster than the main (32 MB) DDR. If we could relocate there a program whose main loop is too big to fit in the I-cache, we might get a speed improvement (but maybe I'm totally wrong).
What do you think about this? I don't know whether a better RAM speed would improve things a lot, or whether it could even be used as normal memory... (it's 512 bits, right?)
The local RAM is at 0x80000000; in my library I use it for heap and stack, and that seems to be what Sony does too. What's at 0x04000000 is probably MMIO, but all I know for sure is that it's not RAM.
Brunni
Posts: 186
Joined: Sat Oct 08, 2005 10:27 pm

Post by Brunni »

Okay thx
I made some attempts to get it to work and synchronize with the main CPU, but I'm having some serious problems. At first I was invalidating the d-cache each time I finished writing information to variables that should be used by the ME, but that slows things down quite a lot.
So I replaced this method with writes to uncached addresses. Although I haven't found this documented anywhere, I assumed that I could also add 0x40000000 to bypass the cache for RAM variables (0x80000000). I did a simple synchronization:

Code:

[... shared variables ...]
typedef unsigned int BOOL;
volatile BOOL bSync;
#define GetUncachedPtr(address)   ((void*)((u32)(address)|0x40000000))

[... main code ...]

void main(void)     {
   volatile BOOL *pbSync;
   pbSync = GetUncachedPtr(&bSync);
   MediaEngine_Boot(MediaEngine_Main);     //Boots the ME to our routine
   *pbSync = 1;
   while (*pbSync);
   printf("Execution done correctly");
}

[... media engine code ...]
void MediaEngine_Main()       {
   volatile BOOL *pbSync;
   pbSync = GetUncachedPtr(&bSync);
   //Message handling loop
   while(1)    {
       if (*pbSync)
           *pbSync = 0;
   }
}

This works. But when things get more complex, it doesn't work anymore. For example, if I add some code (which uses ONLY local variables) after the *pbSync=1 in the main routine, sometimes *pbSync isn't modified (or only much later), as if it had been placed in the cache; the framerate then drops to 1 fps because the write to pbSync is not seen immediately by the ME. The same thing happens if I do:

Code:

[media engine code]
void MediaEngine_Main()       {
   volatile BOOL *pbSync;
   volatile int *pglobalCounter;
   pbSync = GetUncachedPtr(&bSync);
   pglobalCounter = GetUncachedPtr(&globalCounter);
   //Message handling loop
   while(1)    {
       if (*pbSync)      {
           i = *pglobalCounter;
           [Some more code, that accesses global backbuffer]
           (*pglobalCounter) = (*pglobalCounter) + 1;
       }
   }
}

I do not access pglobalCounter anywhere else, so it should work: 'i' should be a counter mirroring globalCounter. But it isn't! It gets stuck, though not if I remove the additional code between the two instructions... I really don't know what's happening. It seems to be caused by the cache flushes of the ME and the main CPU (conflicts), but it shouldn't be, as I access ONLY uncached variables from the Media Engine. It doesn't seem to be completely uncached, though...
I worked on it for 4 whole days, but didn't find any solution. Here is the code: http://infernobox.dyndns.org/brunni/PSP_MediaEngine.zip
If you have any suggestions, I would really appreciate them, thanks. You can modify and use my code if you want, but if you find a solution, I would really appreciate it if you explained it to me. :)
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Post by crazyc »

I'm not sure if 0x08000000 is an alias for main memory on the ME. I'm also not sure if 0x40000000 is an uncached alias for main memory.
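
The doubt about address aliases can be made concrete with plain address arithmetic. On a standard MIPS32 layout, kseg0 (0x80000000 to 0x9FFFFFFF, cached) and kseg1 (0xA0000000 to 0xBFFFFFFF, uncached) both map the low 512 MB of physical memory, so the uncached view of a kseg0 address is found by switching segments, not by OR-ing in 0x40000000. Whether the ME follows this layout is an assumption here (the thread notes that neither CPU has a normal MIPS MMU), and the helper names below are hypothetical.

```c
#include <stdint.h>

#define KSEG1_BASE 0xA0000000u  /* standard MIPS32 uncached, unmapped segment */
#define PHYS_MASK  0x1FFFFFFFu  /* low 512 MB seen through kseg0/kseg1 */

/* Physical address behind a kseg0 or kseg1 virtual address. */
static uint32_t kseg_to_phys(uint32_t va)
{
    return va & PHYS_MASK;
}

/* Uncached (kseg1) view of the same physical address. */
static uint32_t to_kseg1(uint32_t va)
{
    return kseg_to_phys(va) | KSEG1_BASE;
}

/* What the macro in the post above computes: OR in bit 30. */
static uint32_t or_bit30(uint32_t va)
{
    return va | 0x40000000u;
}
```

For a user-space style address such as 0x08801000, `or_bit30` gives 0x48801000. For a kernel-segment address such as 0x88001000 it gives 0xC8001000, which on a standard MIPS32 map lies outside both kseg0 and kseg1, whereas `to_kseg1` gives 0xA8001000. If the shared variables live at 0x88xxxxxx addresses, that difference would be worth checking before suspecting the cache itself.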
xdeadbeef
Posts: 1
Joined: Wed Oct 19, 2005 7:44 am

Running C code on the ME

Post by xdeadbeef »

OK I went ahead and merged code from a few of the posts here, and made an example which allows you to easily run C code on the ME (with the usual limitations on kernel calls etc).

The file is here: http://www.stashbox.org/uploads/1129672725/meccode.zip

it's based on the SDK's basic ME example.
matkeupon
Posts: 26
Joined: Sat Jul 02, 2005 10:58 pm

Post by matkeupon »

Just one dumb question (the answer should be yes, as you talked about the stack): is it possible to call functions from this C code? Is it possible to share a function between the CPU and the ME?
jonny
Posts: 351
Joined: Thu Sep 22, 2005 5:46 pm

Post by jonny »

sorry for the n00b question: is the ME running at the same speed as the main CPU?
and does scePowerSetClockFrequency affect the ME too?

(i'm trying to figure out how much speed i can gain by using it in parallel with the main cpu to do integer YUV to RGB CSC and integer IDCT)

EDIT: forgive me, i've got a color space conversion running ~2x faster ^^ (and got my answers)
xdeadbeef & all the others, thanks for the code
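
For readers unfamiliar with the integer color space conversion mentioned in this post, here is a minimal sketch of the kind of per-pixel work involved. It is not jonny's code: it assumes full-range BT.601 coefficients scaled by 256, and the names `rgb_t`, `clamp_u8` and `yuv_to_rgb` are made up for the example.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_t;

/* Clamp an intermediate value into the 0..255 range of one channel. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Integer YUV -> RGB, assuming full-range BT.601.
   Coefficients are the usual 1.402, 0.344, 0.714, 1.772 scaled by 256. */
static rgb_t yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v)
{
    int d = (int)u - 128;   /* blue-difference chroma, centered on 0 */
    int e = (int)v - 128;   /* red-difference chroma, centered on 0 */
    rgb_t out;
    out.r = clamp_u8(y + (359 * e) / 256);
    out.g = clamp_u8(y - (88 * d + 183 * e) / 256);
    out.b = clamp_u8(y + (454 * d) / 256);
    return out;
}
```

Because each pixel is independent and needs no kernel calls, a loop over this function is the sort of workload that can be split between the two cores, for example by giving each core half of the rows.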