I mean, who hasn’t watched “Assembly Language in 100 seconds” by Fireship?
Just looked this up and subscribed to the channel.
Is there any situation where you’d want to remember the opcodes? Disassemblers should give you user-friendly assembly code, without any need to look at the raw numbers. Maybe it’s useful to remember which instructions are pseudo instructions (so you know stuff like jz (jump if zero) being the same as je (jump if equal), making it easier to understand the disassembly), but I don’t think you need to remember the opcode numbers for that.
Edit: Maybe with malware analysis where the malware in question may be obfuscated in interesting ways to make the job of binary analysis harder?
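To make the alias point concrete, here is a minimal Python sketch (not from the thread) of a lookup table for a few x86 short conditional jumps. The names JCC_SHORT and same_instruction are made up for illustration; the opcode bytes are the standard one-byte Jcc rel8 encodings.

```python
# Opcode byte -> every mnemonic an assembler accepts for that byte.
# Only a handful of short (rel8) conditional jumps are listed here.
JCC_SHORT = {
    0x72: ("jb", "jc", "jnae"),
    0x73: ("jae", "jnb", "jnc"),
    0x74: ("je", "jz"),
    0x75: ("jne", "jnz"),
}


def same_instruction(a, b):
    """Return True if mnemonics a and b share one opcode in the table."""
    return any(a in names and b in names for names in JCC_SHORT.values())


print(same_instruction("jz", "je"))   # aliases of opcode 0x74
print(same_instruction("jz", "jne"))  # different opcodes
```

A disassembler has to pick one spelling per opcode, which is why the same bytes can show up as je in one tool and jz in another.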
The important thing is to be important. Engineering has to deal with teammates that don’t have these problems, so they equalize.
Isn’t there a version about mineralogy?
“So this here is a rock”
“Uhh, in english please?”
“Oy! Guv! This here’s a rock, innit?”
“In English, please?”
of course nods along
I feel attacked.
I mean I’m only missing int3
I think it is 0xCC, or in long form 0xCD03
I didn’t even know they released int2
Just yesterday I ran into some chucklehead here on Lemmy that had convinced themselves that the average person would interpret “crypto” to mean SSL rather than cryptocurrency.
You mean things like Bigfoot?
I had one last week here on claiming the average person could feed themselves for years by growing cherry tomatoes from 6 tiny plants. Bro is supposed to be a big-time agricultural bigwig
Makes sense. Human beings don’t actually need proteins or fats.
At least dead ones don’t
That seems like the opposite problem
Holy shit that was weird.
deleted by creator
That’s the one
Now I want to know what int3 does.
https://en.wikipedia.org/wiki/INT_(x86_instruction) (scroll down to INT3)
https://stackoverflow.com/a/61946177
The TL;DR is that it’s used by debuggers to set a breakpoint in code.
For example, if you’re familiar with gdb, one of the simplest ways to make code stop executing at a particular point in the code is to add a breakpoint there.
Gdb replaces the instruction at the breakpoint with 0xCC, which happens to be the opcode for INT 3 — generate interrupt 3. When the CPU encounters the instruction, it generates interrupt 3, following which the kernel’s interrupt handler sends a signal (SIGTRAP) to the debugger. Thus, the debugger will know it’s meant to start a debugging loop there.
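A rough Python sketch of just the patching step, using a bytearray as a stand-in for the debuggee's memory. This is a toy model, not how gdb is implemented (a real debugger writes into another process through the OS); the helper name insert_breakpoint and the sample bytes are assumptions for illustration.

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode


def insert_breakpoint(code, addr):
    """Overwrite the byte at addr with INT3 and return the byte it replaced."""
    original = code[addr]
    code[addr] = INT3
    return original


# push rbp; mov rbp, rsp; mov eax, 42; pop rbp; ret
code = bytearray.fromhex("554889e5b82a0000005dc3")

# Offset 4 is the first byte of `mov eax, 42` (opcode 0xB8).
saved = insert_breakpoint(code, 4)

print(code.hex(" "))  # the byte at offset 4 is now cc
print(hex(saved))     # the displaced opcode byte
```

Only the first byte of the instruction is overwritten, so the rest of `mov eax, 42` is still sitting in memory after the 0xCC.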
Hey thank you!
Not what I thought it was for sure 😃
How does it work if an instruction gets replaced by the INT3 though?
Excellent question!
Before replacing the instruction with INT 3, the debugger keeps a note of what instruction was at that point in the code. When the CPU encounters INT 3, it hands control to the debugger.
When the debugging operations are done, the debugger replaces the INT 3 with the original instruction and moves the instruction pointer back one byte (the length of the INT 3 opcode), thereby ensuring that the original instruction is executed.
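The save, restore and rewind cycle can be sketched as a toy in Python. Everything here (the ToyDebugger class, its method names, the sample bytes) is hypothetical and only models the bookkeeping. As far as I know, real debuggers also single-step the restored instruction and then put the 0xCC back so the breakpoint keeps working; that part is left out.

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode


class ToyDebugger:
    def __init__(self, code):
        self.code = code   # bytearray standing in for the debuggee's memory
        self.saved = {}    # breakpoint address -> original byte

    def set_breakpoint(self, addr):
        # Remember the original byte before overwriting it.
        self.saved[addr] = self.code[addr]
        self.code[addr] = INT3

    def on_trap(self, ip):
        """Handle a trap; ip points just past the INT3 byte that fired.

        Restores the original byte and returns the address to resume at.
        """
        addr = ip - 1  # INT3 is one byte long, so step back one byte
        self.code[addr] = self.saved.pop(addr)
        return addr


# push rbp; mov rbp, rsp; mov eax, 42; pop rbp; ret
code = bytearray.fromhex("554889e5b82a0000005dc3")
original = bytes(code)

dbg = ToyDebugger(code)
dbg.set_breakpoint(4)

# Pretend the CPU executed the INT3 at offset 4, leaving ip at offset 5.
resume_ip = dbg.on_trap(4 + 1)

print(resume_ip, bytes(code) == original)
```

After on_trap the memory is byte-for-byte what it was before the breakpoint, and execution resumes at the start of the original instruction.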
Whoo, that seems complicated. I mean, you already compile a debug version.
Thanks for the explanation!
The debug version you compile doesn’t affect the code; it just stores more information about symbols. The whole shtick about the debugger replacing instructions with INT3 still happens.
You can validate that the code isn’t affected yourself by running objdump on two binaries, one compiled with debug symbols and one without. Otherwise if you’re lazy (like me 😄):
https://stackoverflow.com/a/8676610
And for completeness: https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Debugging-Options.html
Thanks, excellent information!
How come debug exes are bigger? Is the nifty stuff tucked on at the end?
As a bytecode tinkerer, I’d say considering NOP to be global knowledge is a slippery slope.
NOP is $EA, of course, and… um…
…sorry, I’m just a Commodore 64 scrub, I don’t know nothing about this high and mighty Intel 8086 nonsense.
[looking up]
…it’s 0x90 on IA-32? WHAT? Someone told me every processor used 0xEA because that was commonly agreed and readily apparent. …guess I was wrong
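For anyone keeping score, a tiny Python sketch of the two NOP encodings mentioned here. The table only lists the two values from this comment, and the names NOP and nop_pad are invented for the example.

```python
# NOP opcode per architecture, limited to the two named above.
NOP = {
    "6502": 0xEA,  # the CPU family used in the Commodore 64
    "x86": 0x90,
}


def nop_pad(arch, count):
    """Return count NOP bytes for the given architecture."""
    return bytes([NOP[arch]]) * count


print(nop_pad("6502", 3).hex(" "))
print(nop_pad("x86", 3).hex(" "))
```

Same mnemonic, different byte: the opcode value is a property of each instruction set, not something shared between them.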
I thought NOP was 0x90. Edit: oh I just read the rest of the comment.
My daughter told me the other day, “I bet I could figure out a Commodore 64 if I had one.”
Good luck figuring out LOAD “*”,8,1 by yourself, kid.
deleted by creator
I can’t tell if you’re joking and deliberately invoking the original comic above
She meant she could figure it out just playing around with it, not reading a manual or asking around. I told her she’d have to read a manual.
Erm I might be showing my inexperience here.
Is there no equivalent to man LOAD in the Commodore world? Or even just help?
Not that I remember.
You’re sixteen, you’re beautiful, I’m under arrest
I love that I’m getting downvoted for a Ringo Starr reference