Author | Message | Time |
---|---|---|
Paul | Where [esi+3C] = 12345678 AND [esi+3C] is constantly updating with a new DWORD value... My goal is to throw 12345678 into my own static pointer, but in reverse order 87654321. This is what I've been bouncing around in my head... push eax mov al, byte ptr [esi+3F] mov byte ptr [Pointer part 1], al mov al, byte ptr [esi+3E] mov byte ptr [Pointer part 2], al mov al, byte ptr [esi+3D] mov byte ptr [Pointer part 3], al mov al, byte ptr [esi+3C] mov byte ptr [Pointer part 4], al pop eax ret Where the result would = DWORD of 87654321 in [Pointer], plus account for the auto-updating of the DWORD value in [esi+3C]. So if 12345678 changes to 23456789 it will become 98765432 my new static pointer. If I'm rambling I apologize... Anyway, my question is: Is there a better way to do this? | November 16, 2003, 8:03 AM |
Adron | This is how I picture it: [code] mov ecx, 8 xor edx, edx mov eax, [esi+3ch] shifting: shl edx, 4 mov ebx, eax shr eax, 4 and ebx, 15 or edx, ebx loop shifting mov pointer, edx [/code] | November 16, 2003, 1:05 PM |
Kp | Why are you seeking to reverse the nibbles as well? Usually people just want to reverse the ordering at the byte level. :) | November 16, 2003, 4:37 PM |
Skywing | [quote author=Paul link=board=7;threadid=3642;start=0#msg29453 date=1068969824] Where [esi+3C] = 12345678 AND [esi+3C] is constantly updating with a new DWORD value... My goal is to throw 12345678 into my own static pointer, but in reverse order 87654321. This is what I've been bouncing around in my head... push eax mov al, byte ptr [esi+3F] mov byte ptr [Pointer part 1], al mov al, byte ptr [esi+3E] mov byte ptr [Pointer part 2], al mov al, byte ptr [esi+3D] mov byte ptr [Pointer part 3], al mov al, byte ptr [esi+3C] mov byte ptr [Pointer part 4], al pop eax ret Where the result would = DWORD of 87654321 in [Pointer], plus account for the auto-updating of the DWORD value in [esi+3C]. So if 12345678 changes to 23456789 it will become 98765432 my new static pointer. If I'm rambling I apologize... Anyway, my question is: Is there a better way to do this? [/quote] Use the bswap instruction - available on i486 and higher. | November 16, 2003, 4:58 PM |
Kp | [quote author=Skywing link=board=7;threadid=3642;start=0#msg29507 date=1069001935]Use the bswap instruction - available on i486 and higher.[/quote]I started to post the same thing, then noticed that he wants swapping on a per-nibble basis, not per-byte like bswap does. Hence my query to him. Of course, the code he provided us as an example is wrong if he really does want a per-nibble swap. :) | November 16, 2003, 5:02 PM |
Adron | Well, I was assuming that his explanation was correct and the code possibly wrong since he wanted help with the code... I was looking for the bswap, but it wasn't listed in 386intel.txt..... edit: [code] mov eax, [esi+3ch] bswap eax mov ebx, eax and eax, 0f0f0f0fh and ebx, 0f0f0f0f0h shl eax, 4 shr ebx, 4 or eax, ebx mov pointer, eax [/code] | November 16, 2003, 5:04 PM |
Skywing | [quote author=Adron link=board=7;threadid=3642;start=0#msg29510 date=1069002249] Well, I was assuming that his explanation was correct and the code possibly wrong since he wanted help with the code... [/quote] Now he knows how to accomplish either of his listed goals. So, we should have everything covered? :p | November 16, 2003, 5:05 PM |
Paul | Excellent, when I get off work I'll be able to test/compile this code into my project. Thanks much! | November 17, 2003, 1:09 AM |
Skywing | [quote author=Paul link=board=7;threadid=3642;start=0#msg29614 date=1069031379] Excellent, when I get off work I'll be able to test/compile this code into my project. Thanks much! [/quote] So, which did you want to do, anyway? | November 17, 2003, 1:11 AM |
Paul | Byte swapping! | November 17, 2003, 1:12 AM |
CupHead | Along similar lines, I was talking to Sky this morning about byte swapping for other-endian protocols... So he gave me the function: [code] unsigned short bswap(unsigned short u) { return ((u & 0xff) << 8) | (u >> 8); } [/code] This is all well and good, but I decided to write it in ASM for the hell of it and got: [code] WORD ByteSwapWORD( WORD x ) { __asm { mov ax, x and ax, 0xff shl ax, 8 shr x, 8 or x, ax } return x; } [/code] Which is all well and good until you get to DWORDs. Damned if I could figure out how to do the swapping in C, so I went with ASM again, this time coming up with: [code] DWORD ByteSwapDWORD( DWORD x ) { __asm { push edx push ebx mov edx, x // 01 02 03 04 mov ax, dx // dx = 03 04 and ax, 0xff // ax = 03 04 -> 0000 0011 0000 0100 -> 0000 0000 0000 0100 shl ax, 8 // ax = 0000 0100 0000 0000 shr dx, 8 // dx = 0000 0000 0000 0011 or dx, ax // dx = 0000 0100 0000 0011 -> 04 03 shl edx, 16 // edx = 0000 0100 0000 0011 0000 0000 0000 0000 -> 04 03 00 00 xor ebx, ebx mov eax, x // eax = 01 02 03 04 shr eax, 16 // eax = 00 00 01 02 -> 0000 0000 0000 0000 0000 0001 0000 0010 mov bx, ax and ax, 0xff // ax = 0000 0000 0000 0010 shl ax, 8 // ax = 0000 0010 0000 0000 shr bx, 8 // bx = 0000 0000 0000 0001 or bx, ax // bx = 0000 0010 0000 0001 -> 02 01 or edx, ebx // edx = 0000 0100 0000 0011 0000 0010 0000 0001 -> 04 03 02 01 mov x, edx pop ebx pop edx } return x; } [/code] As you can see from the comments, I was having lots of fun working out the binary for the instructions and stuff. Anyway, after finishing this behemoth of a function, I seemed to remember something like this on the forums and whipped out my handy (and free) IA-32 Architecture Software Developer's Manual Volume 2: Instruction Set Reference. I looked up bswap and to my amazement, it did the whole DWORD thing in just one instruction. Well, damn, that was a lot of wasted effort. Then it said see xchg for 16-bit numbers and there was a single instruction that did it for words. *sigh* Final code looks like: [code] WORD ByteSwapWORD( WORD x ) { __asm { mov ax, x xchg ah, al mov x, ax } return x; } DWORD ByteSwapDWORD( DWORD x ) { __asm { mov eax, x bswap eax mov x, eax } return x; } [/code] Anyway, just thought I'd vent some frustration. :P | November 26, 2003, 5:25 PM |
Skywing | You could improve those by making them naked and fastcall. [code] __declspec(naked) unsigned short __fastcall ByteSwapWORD(unsigned short) { __asm { xchg cl, ch mov ax, cx ret } } __declspec(naked) unsigned long __fastcall ByteSwapDWORD(unsigned long) { __asm { bswap ecx mov eax, ecx ret } }[/code] | November 26, 2003, 5:34 PM |
Kp | [quote author=Skywing link=board=7;threadid=3642;start=0#msg31675 date=1069868048] You could improve those by making them naked and fastcall.[/quote] Even better, make them __attribute__((regparm(1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;) Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually. Same with bswap -- just because it's one instruction, it might not be fast. Finally, I'm certain CupHead's dword swapper is bloated. I've inlined something that has that effect several times and it's never been that long. :) | November 26, 2003, 10:23 PM |
CupHead | I'm sure it's bloated too, probably because I used an instruction for each step (and you can see the progression). Obviously there would be faster ways like just swapping the inner bytes and then the outer bytes, but what you see is what I've got. | November 26, 2003, 10:45 PM |
Etheran | [quote author=Kp link=board=7;threadid=3642;start=0#msg31758 date=1069885426] [quote author=Skywing link=board=7;threadid=3642;start=0#msg31675 date=1069868048] You could improve those by making them naked and fastcall.[/quote] Even better, make them __attribute__((regparm(1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;) Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually. Same with bswap -- just because it's one instruction, it might not be fast. Finally, I'm certain CupHead's dword swapper is bloated. I've inlined something that has that effect several times and it's never been that long. :) [/quote]How would you do that in msvc++ ? I can't find regparm or __attribute__ on msdn. [code]__attribute__((regparm(1)))[/code] ... ? | November 27, 2003, 12:24 AM |
Skywing | [quote author=Etheran link=board=7;threadid=3642;start=0#msg31847 date=1069892658] [quote author=Kp link=board=7;threadid=3642;start=0#msg31758 date=1069885426] [quote author=Skywing link=board=7;threadid=3642;start=0#msg31675 date=1069868048] You could improve those by making them naked and fastcall.[/quote] Even better, make them __attribute__((regparm(1))), in which case the argument will be in eax/ax when the function starts, saving you from even having to move it from ecx. ;) Also, it might be worth doing some testing on whether it's faster to xchg or do the exchange manually. Same with bswap -- just because it's one instruction, it might not be fast. Finally, I'm certain CupHead's dword swapper is bloated. I've inlined something that has that effect several times and it's never been that long. :) [/quote]How would you do that in msvc++ ? I can't find regparm or __attribute__ on msdn. [code]__attribute__((regparm(1)))[/code] ... ? [/quote] Those are GCC extensions and are incompatible with VC. | November 27, 2003, 12:44 AM |
Etheran | that's what I thought, but is there any way to do this in vc? I'm thinking no and the only way to do it would be to put the value in eax before you make the function call. [code] int __declspec(naked) myFunction(void); int ret; __asm { mov eax, theVal } ret = myFunction(); [/code] or perhaps this generates a compiler error.. [code] myFunction(); __asm { mov ret, eax } [/code] | November 27, 2003, 12:48 AM |
Skywing | [quote author=Etheran link=board=7;threadid=3642;start=15#msg31869 date=1069894119] that's what I thought, but is there any way to do this in vc? [/quote] No. | November 27, 2003, 12:52 AM |
Kp | [quote author=Skywing link=board=7;threadid=3642;start=15#msg31876 date=1069894347] [quote author=Etheran link=board=7;threadid=3642;start=15#msg31869 date=1069894119] that's what I thought, but is there any way to do this in vc? [/quote] No. [/quote] Which is truly unfortunate, because there's really no reason that I can see why you shouldn't use all three call-clobbered registers for parameter passing (if you're going to pass values in registers at all -- there exist some circumstances (typically when the parameters are ignored for a while) when it's better not to pass them as registers). As an interesting quirk, GCC supports MSVC's _fastcall correctly by creating a two-register pass using ecx,edx; too bad VC can't do the reverse and support GCC's ability to do three-register using eax,edx,ecx. :) | November 27, 2003, 6:53 AM |
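
For anyone who, like CupHead, got stuck on doing the DWORD swap in plain C: below is a minimal sketch of the three operations discussed in this thread (16-bit byte swap, 32-bit byte swap, and the full nibble reversal that Paul's 12345678 -> 87654321 example describes). The function names `byte_swap16`, `byte_swap32` and `nibble_reverse32` are made up for illustration and do not come from any post above; the nibble version follows the same "byte swap, then swap the nibbles inside each byte" idea as Adron's second listing. It assumes a compiler that provides `<stdint.h>`.

```c
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value: 0x1234 -> 0x3412.
 * Hypothetical helper name, not from the thread. */
static uint16_t byte_swap16(uint16_t x)
{
    /* x is promoted to int before shifting; the cast drops the high bits. */
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value: 0x12345678 -> 0x78563412.
 * Same result as the x86 bswap instruction, written portably. */
static uint32_t byte_swap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) <<  8) |
           ((x & 0x00FF0000u) >>  8) |
           ((x & 0xFF000000u) >> 24);
}

/* Reverse all eight nibbles of a 32-bit value: 0x12345678 -> 0x87654321.
 * Step 1 reverses the byte order, step 2 swaps the two nibbles in each byte. */
static uint32_t nibble_reverse32(uint32_t x)
{
    x = byte_swap32(x);
    return ((x & 0x0F0F0F0Fu) << 4) | ((x & 0xF0F0F0F0u) >> 4);
}
```

Something like `byte_swap32(value)` is what you would reach for with other-endian protocol fields, while `nibble_reverse32` only matters if you really want the hex digits mirrored. Whether a compiler turns the portable version into a single bswap is compiler-dependent, so measure before assuming either way.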