General Programming | Lots of NOPs?


Author

Message

Time

Myndfyr

I was taking a look at WoW.exe with drivehappy, and I noticed this:

[code]
00401130 . E9 0B000000 JMP WoW.00401140
00401135 90 NOP
00401136 90 NOP
00401137 90 NOP
00401138 90 NOP
00401139 90 NOP
0040113A 90 NOP
0040113B 90 NOP
0040113C 90 NOP
0040113D 90 NOP
0040113E 90 NOP
0040113F 90 NOP
[/code]

I don't do much assembly analysis; the only work I've done was 8086 and 80286 code with a 16-bit DOS debugger. I was curious why there are so many NOPs: are they used for big block transfers, or for alignment in some way? I could see alignment, but why would there be more than 3 at any given time?

Thanks!

November 30, 2004, 6:48 AM

Arta

Seems there's a JMP right before them. Does that code ever get executed?

November 30, 2004, 7:12 AM

Adron

It's not that uncommon to see functions aligned to 16 bytes. Diablo 1 and Starcraft are or used to be that way. Some compilers use INT 3 instead of NOP to fill out. IIRC it has to do with cache lines, ensuring that as much of the function as possible fits inside the first fetch to cache.

November 30, 2004, 11:56 AM
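
The 16-byte explanation can be checked against the listing in the first post. The sketch below is only illustrative arithmetic, not anything taken from a compiler; `padding_to_boundary` is a made-up helper name, and the 16-byte alignment is the assumption under test. It also shows why more than 3 filler bytes can appear: 4-byte alignment needs at most 3, but 16-byte alignment can need up to 15.

```python
def padding_to_boundary(address: int, alignment: int = 16) -> int:
    """Filler bytes needed so that `address` lands on an `alignment` boundary.

    Hypothetical helper for illustration; `alignment` is assumed to be 16.
    """
    return (-address) % alignment


jmp_address = 0x00401130
jmp_length = 5                      # E9 opcode + 4-byte relative displacement
displacement = 0x0000000B           # "0B000000" read as a little-endian rel32

first_free = jmp_address + jmp_length      # first byte after the JMP
jmp_target = first_free + displacement     # rel32 is relative to the next instruction
pad = padding_to_boundary(first_free)

print(f"{pad} filler bytes; next function starts at {first_free + pad:#010x}")
# prints: 11 filler bytes; next function starts at 0x00401140
```

The result matches the 11 NOPs at 00401135 through 0040113F, and the computed jump target is the same 00401140 boundary.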

iago

I've seen it done with "int 3" only in debug versions. I would imagine it's so, if you pooch something up and end up jumping there, you'll get a debug break rather than a random crash.

November 30, 2004, 1:30 PM

Myndfyr

[quote author=Arta[vL] link=topic=9723.msg90557#msg90557 date=1101798744]
Seems there's a JMP right before them. Does that code ever get executed?
[/quote]
No, of course not. I mentioned that I thought it might be for alignment, but as I said, I was surprised by the number of NOPs rather than by their being there at all.

[quote author=Adron link=topic=9723.msg90563#msg90563 date=1101815777]
It's not that uncommon to see functions aligned to 16 bytes. Diablo 1 and Starcraft are or used to be that way. Some compilers use INT 3 instead of NOP to fill out. IIRC it has to do with cache lines, ensuring that as much of the function as possible fits inside the first fetch to cache.
[/quote]

Thanks Adron! That makes sense.

November 30, 2004, 9:44 PM

Skywing

You might also find the compiler aligning the start of a loop using nops (or other nop-like instructions depending on how many bytes of padding are required, such as lea esp, dword ptr [esp+0]).

December 1, 2004, 8:00 AM
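
As a rough sketch of how padding of a given size could be built from nop-like instructions of different lengths: the table and the greedy selection below are assumptions for illustration, not the behavior of any particular compiler. The encodings are standard 32-bit x86 forms that leave registers and flags unchanged; `FILLERS` and `pick_fillers` are hypothetical names.

```python
# Hypothetical filler table: length in bytes -> (encoding, mnemonic).
# Each entry has no architectural effect in 32-bit code.
FILLERS = {
    1: (bytes([0x90]), "nop"),
    2: (bytes([0x8B, 0xFF]), "mov edi, edi"),
    3: (bytes([0x8D, 0x49, 0x00]), "lea ecx, [ecx+0]"),
    4: (bytes([0x8D, 0x64, 0x24, 0x00]), "lea esp, [esp+0]"),
    7: (bytes([0x8D, 0xA4, 0x24, 0x00, 0x00, 0x00, 0x00]),
        "lea esp, [esp+0] (32-bit displacement)"),
}


def pick_fillers(pad: int) -> list[tuple[bytes, str]]:
    """Greedily cover `pad` bytes, preferring the longest filler first.

    Because a 1-byte nop is in the table, the loop always ends with
    exactly `pad` bytes covered.
    """
    chosen = []
    remaining = pad
    for size in sorted(FILLERS, reverse=True):
        while remaining >= size:
            chosen.append(FILLERS[size])
            remaining -= size
    return chosen


for encoding, mnemonic in pick_fillers(11):
    print(encoding.hex(" "), mnemonic)
```

Using fewer, longer instructions keeps the byte count the same while reducing how many instructions the processor has to decode if the padding is actually executed, which matters for padding in front of a loop that control falls into.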

Adron

[quote author=Skywing link=topic=9723.msg90677#msg90677 date=1101888035]
You might also find the compiler aligning the start of a loop using nops (or other nop-like instructions depending on how many bytes of padding are required, such as lea esp, dword ptr [esp+0]).
[/quote]

Do you have any examples of loop start aligning occurring in practise?

December 1, 2004, 3:04 PM

Skywing

[quote author=Adron link=topic=9723.msg90696#msg90696 date=1101913482]
[quote author=Skywing link=topic=9723.msg90677#msg90677 date=1101888035]
You might also find the compiler aligning the start of a loop using nops (or other nop-like instructions depending on how many bytes of padding are required, such as lea esp, dword ptr [esp+0]).
[/quote]

Do you have any examples of loop start aligning occurring in practise?
[/quote]
The first example I found offhand was:

[code]
.text:00409433 mov al, byte ptr [esp+224h+RootPathName]
.text:00409437 test al, al
.text:00409439 lea esi, [esp+224h+RootPathName]
.text:0040943D jz short loc_409493
.text:0040943F nop
.text:00409440
.text:00409440 loc_409440: ; CODE XREF: sub_409350+141j
.text:00409440 push esi ; lpRootPathName
.text:00409441 call ebp ; GetDriveTypeA
.text:00409443 cmp eax, 5
[/code]

...where the single nop at 0040943F pads the start of the loop to 00409440, a 16-byte boundary.

December 1, 2004, 7:07 PM

Adron

And that one doesn't seem like a particularly worthwhile optimization? Calling an API inside the loop should slow it down enough... Nothing in the compiler to pick out what loops might be better to optimize?

December 2, 2004, 1:00 AM

Skywing

[quote author=Adron link=topic=9723.msg90736#msg90736 date=1101949220]
And that one doesn't seem like a particularly worthwhile optimization? Calling an API inside the loop should slow it down enough... Nothing in the compiler to pick out what loops might be better to optimize?
[/quote]
Well, that was from bnupdate, and judging from the things I've seen in it I think the programmer had the compiler set to "maximally stupid" while building it. It was the first example I ran into.

December 2, 2004, 8:00 PM

Valhalla Legends Forums Archive | General Programming | Lots of NOPs?