forums.ps2dev.org Homebrew PS2, PSP & PS3 Development Discussions
Raphael

Joined: 17 Jan 2006 Posts: 646 Location: Germany
|
Posted: Sat Jan 27, 2007 9:06 pm Post subject: |
|
|
Yeah, that looks a lot more reliable now. Also much nearer to my approximated cycles, so I'm happy :) _________________ <Don't push the river, it flows.>
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki
Alexander Berl |
hlide
Joined: 10 Sep 2006 Posts: 750
|
Posted: Fri Feb 02, 2007 11:28 pm Post subject: |
|
|
!!! NEW INSTRUCTIONS !!!
MrMr[iCE] asked me whether there is an inverse instruction for vi2c.
To our surprise, mips-opc.c doesn't seem to have one. Why? Is it because they forgot to add the inverse instruction, or because it is unimplemented?
Digging in mips-opc.c, we have:
vi2uc.q --> 0xd03c8080
vi2c.q --> 0xd03d8080
vi2us.q --> 0xd03e8080
vi2s.q --> 0xd03f8080
and
vus2i.p --> 0xd03a0080
vs2i.p --> 0xd03b0080
vi2us.p --> 0xd03e0080
vi2s.p --> 0xd03f0080
Oh well, we can draw some analogies and find out whether vc2i.q and vuc2i.q really exist:
vi2us.p - vus2i.p = 0x00040000
vi2s.p - vs2i.p = 0x00040000
so let's check whether vc2i.s = vi2c.q - (vi2s.p - vs2i.p) = 0xd0398080
| Code: |
asm volatile
(
"lv.q C000, %0\n"
"vi2c.q\tS000, C000\n"
".word 0xD0398081\n" // "vc2i.s\tC010, S000\n"
"sv.q\tC010, %0\n"
: "+m"(res2) : : "memory"
);
printf("%08x %08x %08x %08x\n", res2.x, res2.y, res2.z, res2.w);
|
BINGO! C010 == C000 as a result, so 0xD0398081 seems to act as vc2i.s C010, S000!
Now, why .s instead of .q? Because the source is a scalar VFPU register (vi2c.q has a vector VFPU register as source and a scalar VFPU register as target, so by analogy it makes sense to use .s here instead of .q).
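The opcode arithmetic behind this guess can be sketched in C. The opcode constants are the ones listed from mips-opc.c above; the helper names `guess_inverse` and `with_regs` are made up for illustration, and the register field positions (vs shifted left by 8, vd in the low bits) are only inferred from the `.word` expressions used in this thread.

```c
#include <stdint.h>

/* Opcodes copied from the mips-opc.c listing quoted in this post. */
#define VI2UC_Q 0xd03c8080u
#define VI2C_Q  0xd03d8080u
#define VUS2I_P 0xd03a0080u
#define VS2I_P  0xd03b0080u
#define VI2US_P 0xd03e0080u
#define VI2S_P  0xd03f0080u

/* Hypothetical helper: guess the opcode of a missing inverse instruction by
 * subtracting the distance between a known instruction and its known inverse. */
static uint32_t guess_inverse(uint32_t forward,
                              uint32_t known_forward,
                              uint32_t known_inverse)
{
    return forward - (known_forward - known_inverse);
}

/* Hypothetical helper: OR the register numbers into a base opcode.
 * Field positions are an assumption taken from the .word lines in this thread. */
static uint32_t with_regs(uint32_t opcode, uint32_t vs, uint32_t vd)
{
    return opcode | (vs << 8) | vd;
}
```

Under these assumptions, `guess_inverse(VI2C_Q, VI2S_P, VS2I_P)` gives the base opcode tested above, and `with_regs(..., 0, 1)` gives the exact word placed in the asm block. The same distance applied to `VI2UC_Q` yields a candidate for the unsigned variant.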
OK, now let's check whether vuc2i.s = vi2uc.q - (vi2us.p - vus2i.p) = 0xd0388080
| Code: |
asm volatile
(
"lv.q C000, %0\n"
"vi2uc.q\tS000, C000\n"
".word 0xD0388081\n" // "vuc2i.s\tC010, S000\n"
"sv.q\tC010, %0\n"
: "+m"(res2) : : "memory"
);
printf("%08x %08x %08x %08x\n", res2.x, res2.y, res2.z, res2.w);
|
Hmmmm... C010 != C000
but we always get the same result, namely:
C010.x = (S000[7..0] * 0x01010101) >> 1;
C010.y = (S000[15..8] * 0x01010101) >> 1;
C010.z = (S000[23..16] * 0x01010101) >> 1;
C010.w = (S000[31..24] * 0x01010101) >> 1;
Anyway, if you want to do a complex calculation on an RGBA8888 value, you can do it this way:
vuc2i.s C000, S000
vi2f.q C000, C000, 23
... your calculation here
vf2iz.q C000, C000, 23
vi2uc.q S000, C000
although this instruction isn't the inverse of vi2uc.q.
So now I propose adding both of these instructions to psp-as, which doesn't recognize them.
As for vuc2i.s, if someone has an idea about what its name should be, I would like to hear it.
EDIT: corrected the fact that "uc2i" halves the result.
Last edited by hlide on Sun Mar 18, 2007 10:35 pm; edited 2 times in total |
hlide
Joined: 10 Sep 2006 Posts: 750
|
Posted: Fri Mar 16, 2007 10:51 pm Post subject: |
|
|
Hmm, some suggestions about what these instructions do:
- vsbz.s sd, ss : may change the binary logarithmic scale (the exponent) of a floating point value to 0?
  sd = (-1)^(ss.s) x 2^0 x (1 + ss.m/2^23)
- vsbn.s sd, ss, st : may change the binary logarithmic scale of a floating point value to N?
  sd = (-1)^(ss.s) x 2^N x (1 + ss.m/2^23), where N is given by st
- vlgb.s sd, ss : may give the binary logarithm of a floating point value?
- vwbn.s sd, ss, imm : may give the modulus of a floating point value?
  sd = (-1)^(ss.s) x 2^(N-127) x (1 + (ss.m % 2^N)/2^23), where N is given by imm
Needless to say, these are just speculations... and I may be wrong... |
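To make the first two guesses concrete, here is a small C sketch of what the speculated formulas would compute on an IEEE 754 single. It models only the speculation above, not confirmed hardware behaviour; the function names `vsbz_guess` and `vsbn_guess` are invented for this example, and zero, denormal, infinity and NaN inputs are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Speculated vsbz.s: keep sign and mantissa, force the exponent to 2^0.
 * Assumes a normalized input, as the formula (1 + m/2^23) does. */
static float vsbz_guess(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);              /* reinterpret float as bits */
    bits = (bits & 0x807fffffu) | (127u << 23);  /* biased exponent 127 = 2^0 */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Speculated vsbn.s: same idea, but force the exponent to 2^n.
 * n must stay within -126..127 for the result to be a normal float. */
static float vsbn_guess(float x, int n)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = (bits & 0x807fffffu) | ((uint32_t)(n + 127) << 23);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, 6.0 is 1.5 x 2^2, so the speculated vsbz would return 1.5, and the speculated vsbn with N = 3 would return 1.5 x 2^3. Comparing such values against real hardware output would be one way to test the guess.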
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Sun Mar 18, 2007 8:08 am Post subject: |
|
|
Hmm, I've been playing around with vuc2i, and I've been getting slightly different results. Am I missing something?
| Code: |
u32 col = 0x014080c0;
i4 res;
asm volatile
(
"lv.s S200, %0\n"
".word 0xd0388080 | (8<<8) | (40)\n" // vuc2i.s R200, S200
"sv.q\tR200, %1\n"
: "+m"(col), "+m"(res) : : "memory"
);
printf("Col: %08x\n", col);
printf("Res: %08x %08x %08x %08x\n", res.x, res.y, res.z, res.w);
|
Which results in:
Col: 014080c0
Res: 60606060 40404040 20202020 00808080
So it looks to me like vuc2i.s is actually doing:
| Code: |
vuc2i.s vd.q, vs.s
{
vd.q[0] = (vs.s[0]( 0.. 7) * 0x01010101) >> 1;
vd.q[1] = (vs.s[0]( 8..15) * 0x01010101) >> 1;
vd.q[2] = (vs.s[0](16..23) * 0x01010101) >> 1;
vd.q[3] = (vs.s[0](24..31) * 0x01010101) >> 1;
}
|
I guess the final >>1 prevents the top bit from ever being set, which means that when you use vi2f.q (which is signed) your data correctly comes through as unsigned.
| Quote: |
anyway if you want to make a complex calculus on a RGBA8888, you can do it this way :
vuc2i.s C000, S000
vi2f.q C000, C000, 24
...
|
I think this actually needs to be:
vuc2i.s C000, S000
vi2f.q C000, C000, 23
(at least that works correctly for me - shifting by 24 gives me half-bright colours.)
StrmnNrmn
(Edit - fix code) |
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Sun Mar 18, 2007 8:36 am Post subject: |
|
|
This is a little pedantic, but the pseudo-code on the first page seems to imply that the low-order bits are discarded:
| Code: |
vi2f.q/t/p/s vd, vs, imm 1 0
{
for (i = 0; i < |q/t/p/s|; ++i)
vd[i] = (float)(vs[i] >> imm);
}
|
I think this would more accurately be described as:
| Code: |
vi2f.q/t/p/s vd, vs, imm 1 0
{
for (i = 0; i < |q/t/p/s|; ++i)
vd[i] = (float)(vs[i]) / (float)(1<<imm);
}
|
i.e. the fractional bits are taken into account:
| Code: |
u32 col = 0x00800000;
float res;
asm volatile
(
"lv.s S200, %0\n"
"vi2f.s S200, S200, 24\n"
"sv.s\tS200, %1\n"
: "+m"(col), "+m"(res) : : "memory"
);
printf("Col: %08x\n", col);
printf("Res: %f\n", res);
|
Prints out:
00800000
0.500000
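The difference between the two readings is easy to see off-device with a couple of C helpers. These are models of the two pseudo-code variants above, not of the hardware itself, and the function names are invented for this example.

```c
#include <stdint.h>

/* Reading suggested by the first-page pseudo-code: the low-order
 * (fractional) bits are shifted out before the conversion. */
static float vi2f_shift(int32_t v, int imm)
{
    return (float)(v >> imm);
}

/* Reading that matches the measurement: convert first, then scale,
 * so the fractional bits survive. */
static float vi2f_divide(int32_t v, int imm)
{
    return (float)v / (float)(1u << imm);
}
```

For the input 0x00800000 and a scale of 24, the shift variant yields 0.0 while the divide variant yields 0.5, which is what the PSP printed.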
StrmnNrmn |
hlide
Joined: 10 Sep 2006 Posts: 750
|
Posted: Sun Mar 18, 2007 10:28 am Post subject: |
|
|
| StrmnNrmn wrote: | Hmm, I've been playing around with vuc2i, and I've been getting slightly different results. Am I missing something?
|
Nope, this topic is quite old indeed and hasn't been corrected.
If you looked at all the posts in this topic, especially those by MrMr[iCE] and me, you would find that vuc2i indeed behaves as you say here.
For cycles, you can look at http://wiki.fx-world.org/doku.php?id=general:cycles. Details on instructions still need to be done.
EDIT: in fact, no. There isn't any post correcting vuc2i. Most corrections were made while discussing on IRC, not in this topic. Well, your post is welcome :P
EDIT2: I corrected my post where I found out about "vuc2i", because it is something I already knew but forgot to update in that post. |
StrmnNrmn
Joined: 14 Feb 2007 Posts: 46 Location: London, UK
|
Posted: Sun Mar 18, 2007 10:28 pm Post subject: |
|
|
That's an excellent resource. Somehow I've managed to overlook the URL while searching around on these forums. Thanks :) |
nDEV
Joined: 13 Apr 2007 Posts: 48
|
Posted: Sun Apr 29, 2007 8:47 pm Post subject: |
|
|
Holy crap!
You guys ARE GENIUS...wtf is all this?! i dont understand anything :lol.gif: |
hlide
Joined: 10 Sep 2006 Posts: 750
|
Posted: Sun Apr 29, 2007 9:09 pm Post subject: |
|
|
| nDEV wrote: | Holy crap!
You guys ARE GENIUS...wtf is all this?! i dont understand anything :lol.gif: |
The main processor of the PSP has two coprocessors, an FPU and a VFPU.
The FPU is quite standard and is used for single-precision floating point computation. Every homebrew uses it whenever it uses C float. VFPU probably stands for Vector Floating Point Unit, and it is a very powerful SIMD-like FPU. Very few homebrews use it, or they use it only marginally, because gcc has no knowledge of it the way it has for SSE or AltiVec (only gas, the assembler, does).
The VFPU offers 128 single-precision float registers (instead of 32 for the FPU), which can be arranged to be accessed as:
- 8 non-overlapping 4x4 matrices
- 8 non-overlapping 3x3 or 16 overlapping 3x3 matrices
- 16 non-overlapping 2x2 matrices
In each matrix, we can also access registers as a column or a row, or simply access a single register as an element of the matrix.
That's very powerful for computing with the matrices and vectors used in 2D and 3D. Quaternions are also easier and faster to compute with the VFPU.
SIMD = Single Instruction Multiple Data, see Wikipedia for an explanation. |
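As a mental model, the register file described above can be pictured as a plain C array. This is only an illustration of the layout: the type and function names are hypothetical, and the [matrix][column][row] index order is an arbitrary choice for the sketch, not a statement about the hardware encoding.

```c
/* Hypothetical model: 8 matrices of 4x4 single floats = 128 registers. */
typedef struct {
    float m[8][4][4];  /* [matrix][column][row], order chosen for this sketch */
} VfpuRegs;

/* View one column of a 4x4 matrix as a 4-component vector. */
static void read_column(const VfpuRegs *r, int mtx, int col, float out[4])
{
    for (int row = 0; row < 4; ++row)
        out[row] = r->m[mtx][col][row];
}

/* View one row of the same matrix as a 4-component vector. */
static void read_row(const VfpuRegs *r, int mtx, int row, float out[4])
{
    for (int col = 0; col < 4; ++col)
        out[col] = r->m[mtx][col][row];
}
```

The point of the sketch is that the same 16 registers of a matrix can be read either column-wise or row-wise, so a transposed access costs nothing extra.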
nDEV
Joined: 13 Apr 2007 Posts: 48
|
Posted: Sun Apr 29, 2007 10:26 pm Post subject: |
|
|
Thanks, that's way too advanced for me.
Anyway, impressive work!! |
gauri
Joined: 20 Jan 2008 Posts: 35 Location: Belarus
|
Posted: Sun Jan 20, 2008 11:22 pm Post subject: |
|
|
nice work, guys!
I'm trying to get up to speed on VFPU digging and for now I'm collecting all the info on the VFPU that I can find. I've already dug through this forum.
Any other links you think would be useful? Yep, I've already been to wiki.fx-world.org :-)
P.S. In fact, I'm building (yet another) wiki with PSP info. I'll share the link later, when there is more stuff. _________________ Freelance game industry veteran. 8] |
hlide
Joined: 10 Sep 2006 Posts: 750
|
Posted: Mon Jun 23, 2008 8:30 pm Post subject: |
|
|
LIST UPDATED
Looking at the POPS binary, I found a special use of the VCMOVF/T instruction:
I was wondering what this number 6 meant. As a reminder:
0 : comparison on X component
1 : comparison on Y component
2 : comparison on Z component
3 : comparison on W component
4 : OR comparison on all components
5 : AND comparison on all components
The code using it seems to imply that we can set the components individually according to their own comparisons.
So :
would be equivalent to :
| Code: |
vcmovt.s vd[0], vs[0], 0
vcmovt.s vd[1], vs[1], 1
vcmovt.s vd[2], vs[2], 2
|
I thought I knew every bit of VFPU :) |
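The selector values listed above, including the apparent meaning of 6, can be summarised in a small C model. This is only a sketch of the interpretation given in this post: `vcmovt_model` is an invented name, `cc[i]` stands for the comparison result of component i, and the OR/AND results are recomputed here over the n components being moved.

```c
#include <stdbool.h>

/* Sketch of vcmovt on n components (n <= 4).
 * sel 0..3: use the comparison result of that single component for all lanes
 * sel 4   : use the OR of the comparison results
 * sel 5   : use the AND of the comparison results
 * sel 6   : each lane uses its own comparison result (the case found in POPS) */
static void vcmovt_model(float *vd, const float *vs,
                         const bool cc[4], int sel, int n)
{
    bool any = false, all = true;
    for (int i = 0; i < n; ++i) {
        any = any || cc[i];
        all = all && cc[i];
    }
    for (int i = 0; i < n; ++i) {
        bool take;
        if (sel <= 3)      take = cc[sel];
        else if (sel == 4) take = any;
        else if (sel == 5) take = all;
        else               take = cc[i];
        if (take)
            vd[i] = vs[i];  /* move only when the condition is true */
    }
}
```

With selector 6 and three components, the model does the same work as the three scalar moves shown above, each one gated by its own condition bit.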
MrMr[iCE]
Joined: 03 Oct 2005 Posts: 43
|
Posted: Sun Jul 20, 2008 1:35 pm Post subject: |
|
|
still keeping it going eh hlide?
come see me on irc, you know where to go =) |
anmabagima
Joined: 01 Oct 2009 Posts: 96
|
Posted: Tue Oct 13, 2009 11:53 pm Post subject: |
|
|
Hi there,
This might be a stupid newbie question, but what is the purpose of this? Where and how would I typically use it in my PSP development project?
Regards
AnMaBaGiMa |
jojojoris
Joined: 30 Mar 2008 Posts: 261
|
Posted: Wed Oct 14, 2009 1:20 am Post subject: |
|
|
| anmabagima wrote: | Hi there,
this might be a stupid question of a noob: but what is the intention of this ? Where and how do I use this usually in my PSP development project?
Regards
AnMaBaGiMa |
Do your own research. >:(
Search this forum and use google. _________________
| Code: | int main(){
SetupCallbacks();
makeNiceGame();
sceKernelExitGame();
} |
anmabagima
Joined: 01 Oct 2009 Posts: 96
|
Posted: Wed Oct 14, 2009 1:30 am Post subject: |
|
|
Hi,
Thanks for the hint... I did some research and found this topic. I was searching for something like ASM to use in PSP development to speed things up, and the results pointed me to this thread. But what is it telling me? What I've found out so far is that the PSP has its own dialect of assembler. It doesn't seem to be comparable with ix86 assembler, where you access registers like AX, BX, ESI and so on. So I'm here wondering whether someone could just point me in the right direction...
Thanks... |
|
| Back to top |
|
 |
jojojoris
Joined: 30 Mar 2008 Posts: 261
|
Posted: Wed Oct 14, 2009 3:12 am Post subject: |
|
|
Search for "VFPU" since it's in the topic title.
I bet you don't have any coding experience since you are not even able to figure out what the topic title means. _________________
| Code: | int main(){
SetupCallbacks();
makeNiceGame();
sceKernelExitGame();
} |
dridri
Joined: 31 Jul 2009 Posts: 35
|
Posted: Wed Oct 14, 2009 3:40 am Post subject: |
|
|
| anmabagima wrote: | | What I've found out so far is that the PSP do have it's own dialect of assembler. It seem not to be comparable with ix86 assembler where you accessing registers like AX, BX, ESI and all this stuff |
The PSP uses a MIPS 32bit processor. So in assembler program you can use all MIPS commands and more: FPU and VFPU.
FPU = Floating Point Unit
VFPU = Video FPU => done by the GPU
So the VFPU commands are "ready to use" for maths: calculating a cosinus by using VFPU is +/- 800% faster than a normal cosinus (using libMaths).
And it can do a lot of matrices operations (rotate, translate, multiply...) _________________ I'm French, and 15 years old, so my English is not good... |
anmabagima
Joined: 01 Oct 2009 Posts: 96
|
Posted: Wed Oct 14, 2009 4:31 pm Post subject: |
|
|
| jojojoris wrote: | Search for "VFPU" since it's in the topic title.
I bet you don't have any coding experience since you are not even able to figure out what the topic title means. |
Normally I could start arguing about why I feel I already have some coding experience - take a look at http://www.anmabagima.de/
Look for the project page and you will find my C++ tutorial for DDraw on Windows.
However, my assumption was that we are in this forum to help each other, not to insult anyone... Anyway, thanks to everyone else for the help... |
dridri
Joined: 31 Jul 2009 Posts: 35
|
Posted: Wed Oct 14, 2009 5:19 pm Post subject: |
|
|
I posted ;) _________________ I'm French, and 15 years old, so my English is not good... |
J.F.
Joined: 22 Feb 2004 Posts: 2906
|
Posted: Thu Oct 15, 2009 2:19 am Post subject: |
|
|
| dridri wrote: | | anmabagima wrote: | | What I've found out so far is that the PSP do have it's own dialect of assembler. It seem not to be comparable with ix86 assembler where you accessing registers like AX, BX, ESI and all this stuff |
The PSP uses a MIPS 32bit processor. So in assembler program you can use all MIPS commands and more: FPU and VFPU.
FPU = Floating Point Unit
VFPU = Video FPU => done by the GPU
So the VFPU commands are "ready to use" for maths: calculating a cosinus by using VFPU is +/- 800% faster than a normal cosinus (using libMaths).
And it can do a lot of matrices operations (rotate, translate, multiply...) |
Uh... no. VFPU = VECTOR FPU, and it's another CPU coprocessor, just like the FPU. It just works on vectors instead of single values. The VFPU commands are CPU assembler commands specific to the Allegrex CPU inside the PSP.
Here's one place to start with the vfpu: http://forums.ps2dev.org/viewtopic.php?p=67320#67320 |
dridri
Joined: 31 Jul 2009 Posts: 35
|
Posted: Thu Oct 15, 2009 2:21 am Post subject: |
|
|
Ok, thanks ^^ _________________ I'm French, and 15 years old, so my English is not good... |