Paul Bone

Disassembling JITed code in GDB

I’ve been making changes to the JIT in SpiderMonkey, and sometimes I get a segfault. Okay, so I open it in gdb, and then this happens:

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x0000129af35af5e9 in ?? ()

Not helpful. Maybe there’s something in the stack?

(gdb) backtrace
#0  0x0000129af35af5e9 in  ()
#1  0x0000129af35b107d in  ()
#2  0xfff9800000000000 in  ()
#3  0xfff8800000000002 in  ()
#4  0xfff8800000000002 in  ()

Still not helpful. I’m reasonably confident the crash is in JITed code, which has no debugging symbols or other info, so I don’t know what it was actually executing when it crashed.

In case it’s not apparent, this is a short blog post where I can make notes of one way to get some more information when debugging JITed code.

First of all, those really large addresses (frames 2, 3 and 4) look suspicious. I’m not sure what causes that.

Now, I know the change I made to the JIT, so it’s likely that that’s the code that’s crashing; I just don’t know why. It would help to see what code is being executed:

(gdb) disassemble
No function contains program counter for selected frame.

What it’s trying to say is that the program counter for this frame of the backtrace does not correspond to any function in the C++ program (SpiderMonkey). So, unless we did a call or jump to something invalid, we’re probably executing JITed code.

Let’s get more info:

(gdb) info registers
rax            0x7ffff54b30c0   140737308733632
rbx            0xe4e4e4e400000891       -1953184670468274031
rcx            0xc      12
rdx            0x7ffff54c1058   140737308790872
rsi            0xa      10
rdi            0x7ffff54c1040   140737308790848
rbp            0x7fffffff9438   0x7fffffff9438
rsp            0x7fffffff9418   0x7fffffff9418
r8             0x7fffffff9088   140737488326792
r9             0x8      8
r10            0x7fffffff9068   140737488326760
r11            0x7ffff5d2f128   140737317630248
r12            0x0      0
r13            0x0      0
r14            0x7ffff54a0040   140737308655680
r15            0x0      0
rip            0x129af35af5e9   0x129af35af5e9
eflags         0x10202  [ IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

These are the values in the CPU registers. The debugger uses the rip (program counter), rsp (stack pointer) and rbp (frame pointer) registers to know what it’s executing and to read the stack, including the calls that led to this one. We can use this too: we’re going to use rip to figure out what’s being executed. Its current value is 0x129af35af5e9.

(gdb) dump memory code.raw 0x129af35af5e9 0x129af35af600

Then in a shell:

$ hexdump -C code.raw
00000000  83 03 01 c7 02 4b 00 00  00 e9 82 00 00 00 49 bb  |.....K........I.|
00000010  a8 ab d1 f5 ff 7f 00                              |.......|

I have asked gdb to write the contents of memory, starting at the instruction pointer, to a file named code.raw. Note that on x86-64 you need to write at least 15 bytes, as some instructions can be that long; I have 23 bytes.

I’d normally disassemble code using the objdump program:

$ objdump -d code.raw
objdump: code.raw: File format not recognised

In this case it needs extra clues about the raw data in this file. We tell it the file format (-b binary) and the machine (-m i386), and give the disassembler more information about the machine (-M x86-64).

$ objdump -b binary -m i386 -M x86-64 -D code.raw

code.raw:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   83 03 01                addl   $0x1,(%rbx)
   3:   c7 02 4b 00 00 00       movl   $0x4b,(%rdx)
   9:   e9 82 00 00 00          jmpq   0x90
   e:   49                      rex.WB
   f:   bb a8 ab d1 f5          mov    $0xf5d1aba8,%ebx
  14:   ff                      (bad)
  15:   7f 00                   jg     0x17

Yay. I can see the instruction it crashed on: adding the number 1 to the 32-bit value stored at the address pointed to by rbx. I’d like some more context, so I have to get the instructions that led to this. Note that after the jmpq instruction nothing makes sense; that’s okay since that jump is always taken. (The addresses in the dumps below differ from the one above, presumably because they come from a separate run of the same crash.)

(gdb) dump memory code.raw 0x2ce07c3895e6 0x2ce07c3895f7
...
$ objdump -b binary -m i386 -M x86-64 -D code.raw

code.raw:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   49 8b 1b                mov    (%r11),%rbx
   3:   83 03 01                addl   $0x1,(%rbx)
   6:   c7 02 4b 00 00 00       movl   $0x4b,(%rdx)
   c:   e9 82 00 00 00          jmpq   0x93

When I go back three bytes I get lucky and find another valid instruction that also makes sense.
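
The jump target that objdump prints changes between the two dumps (0x90, then 0x93) even though the instruction bytes are identical. That is because `e9` encodes a jump relative to the end of the instruction, and objdump only knows offsets within the file. Here is a minimal Python sketch of that arithmetic; the function name and the `base` parameter are made up for illustration.

```python
import struct

def jmp_rel32_target(code: bytes, offset: int, base: int = 0) -> int:
    """Return the target of the `jmp rel32` (opcode 0xE9) found at `offset`.

    `base` is the address the first byte of `code` was dumped from;
    leave it at 0 to get file offsets, like objdump does for raw data.
    """
    if code[offset] != 0xE9:
        raise ValueError("not a jmp rel32 instruction")
    # The 4 bytes after the opcode are a signed little-endian displacement.
    (rel32,) = struct.unpack_from("<i", code, offset + 1)
    # The displacement is relative to the end of the 5-byte instruction.
    return base + offset + 5 + rel32

# Instruction bytes of the first dump, which starts at the crashing addl.
first_dump = bytes.fromhex("830301" "c7024b000000" "e982000000")
print(hex(jmp_rel32_target(first_dump, 9)))    # 0x90

# The same code dumped starting three bytes earlier.
second_dump = bytes.fromhex("498b1b") + first_dump
print(hex(jmp_rel32_target(second_dump, 12)))  # 0x93
```

Passing the address the dump started at as `base` gives the jump target in the process rather than in the file.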

(gdb) dump memory code.raw 0x2ce07c3895e5 0x2ce07c3895f7
...
$ objdump -b binary -m i386 -M x86-64 -D code.raw

code.raw:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   00 49 8b                add    %cl,-0x75(%rcx)
   3:   1b 83 03 01 c7 02       sbb    0x2c70103(%rbx),%eax
   9:   4b 00 00                rex.WXB add %al,(%r8)
   c:   00 e9                   add    %ch,%cl
   e:   82                      (bad)
   f:   00 00                   add    %al,(%rax)
        ...

Gibberish. Unfortunately I just have to guess which byte an instruction might begin on, or go back byte-by-byte until I find instructions that make sense. There was quite a bit of experimentation, and a lot more gibberish, until I found:
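
The byte-by-byte search could be scripted. The following is only a sketch, not what I actually did: it assumes you have already dumped a larger chunk of memory that starts some way before the crashing instruction, and the function names and the `max_back` default are invented for illustration. The objdump flags are the same ones used throughout this post.

```python
import subprocess
import tempfile
from typing import Iterator, Tuple

def candidate_starts(code: bytes, crash_offset: int,
                     max_back: int = 16) -> Iterator[Tuple[int, bytes]]:
    """Yield (bytes_back, window) for each start 1..max_back bytes before the crash."""
    for back in range(1, max_back + 1):
        start = crash_offset - back
        if start < 0:
            break  # ran off the beginning of the dump
        yield back, code[start:]

def objdump_raw(window: bytes) -> str:
    """Disassemble raw x86-64 bytes with objdump and return its listing."""
    with tempfile.NamedTemporaryFile(suffix=".raw") as f:
        f.write(window)
        f.flush()
        result = subprocess.run(
            ["objdump", "-b", "binary", "-m", "i386", "-M", "x86-64",
             "-D", f.name],
            capture_output=True, text=True, check=True)
    return result.stdout

def looks_plausible(listing: str) -> bool:
    """Crude filter: reject listings in which objdump printed `(bad)`."""
    return "(bad)" not in listing

# Example use (requires objdump on PATH):
#   for back, window in candidate_starts(code, crash_offset):
#       listing = objdump_raw(window)
#       if looks_plausible(listing):
#           print(f"--- {back} bytes back ---\n{listing}")
```

The filter is deliberately crude. A listing without `(bad)` can still be wrong, so the surviving candidates need reading by eye, checking that the crashing instruction shows up at the expected offset.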

(gdb) dump memory code.raw 0x2ce07c3895dd 0x2ce07c3895f7
...
$ objdump -b binary -m i386 -M x86-64 -D code.raw

code.raw:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   bb 28 f1 d2 f5          mov    $0xf5d2f128,%ebx
   5:   ff                      (bad)
   6:   7f 00                   jg     0x8
   8:   00 49 8b                add    %cl,-0x75(%rcx)
   b:   1b 83 03 01 c7 02       sbb    0x2c70103(%rbx),%eax
  11:   4b 00 00                rex.WXB add %al,(%r8)
  14:   00 e9                   add    %ch,%cl
  16:   82                      (bad)
  17:   00 00                   add    %al,(%rax)
        ...

This is almost correct (except for all the gibberish). At least it starts on an instruction that kind-of makes sense, with a valid-looking memory address. But wait, that instruction uses ebx, a 32-bit register, which is not what I’m expecting since the code I’m JITing works with 64-bit memory addresses. And all that gibberish could be part of a memory address: it has bytes like 0xff and 0x7f in it!
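
One quick way to test that hunch is to read the eight bytes after the 0xbb opcode as a single little-endian integer, since x86-64 stores immediates least-significant byte first. A small Python sketch, with the bytes copied from the dump above:

```python
# The eight bytes that follow the 0xbb opcode byte in the dump above.
imm_bytes = bytes.fromhex("28f1d2f5ff7f0000")

# Read as a 64-bit little-endian value.
address = int.from_bytes(imm_bytes, byteorder="little")
print(hex(address))  # 0x7ffff5d2f128

# Read as only a 32-bit immediate, which is how objdump decoded it.
truncated = int.from_bytes(imm_bytes[:4], byteorder="little")
print(hex(truncated))  # 0xf5d2f128
```

The 64-bit reading is the same value that r11 held in the register dump earlier, which suggests these bytes really are one 64-bit immediate.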

I go back one more byte:

(gdb) dump memory code.raw 0x2ce07c3895dc 0x2ce07c3895f7
...
$ objdump -b binary -m i386 -M x86-64 -D code.raw

code.raw:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   49 bb 28 f1 d2 f5 ff    movabs $0x7ffff5d2f128,%r11
   7:   7f 00 00
   a:   49 8b 1b                mov    (%r11),%rbx
   d:   83 03 01                addl   $0x1,(%rbx)
  10:   c7 02 4b 00 00 00       movl   $0x4b,(%rdx)
  16:   e9 82 00 00 00          jmpq   0x9d

Got it. Now that we have the extra byte at the beginning, that’s one long instruction (which I’ll talk more about in my next article). x86 has prefix bytes for some instructions, which can override some things about the instruction. In this case 0x49 is a REX prefix: 0x48 says this instruction operates on 64-bit data, and the +1 is the extra high bit of the register number, which is what turns rbx into r11.
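
To make the prefix byte concrete, here is a small Python sketch that splits a REX prefix into its four flag bits and works out which register a `mov reg, imm` opcode (0xb8 + register number) targets. The class and function names are made up for this example.

```python
from typing import NamedTuple

class Rex(NamedTuple):
    w: bool  # 64-bit operand size
    r: bool  # extends the ModRM reg field
    x: bool  # extends the SIB index field
    b: bool  # extends ModRM r/m, SIB base, or the opcode's register field

def decode_rex(byte: int) -> Rex:
    """Split a REX prefix (binary 0100WRXB) into its flag bits."""
    if byte & 0xF0 != 0x40:
        raise ValueError(f"{byte:#x} is not a REX prefix")
    return Rex(bool(byte & 8), bool(byte & 4), bool(byte & 2), bool(byte & 1))

def mov_imm_register(rex_byte: int, opcode: int) -> int:
    """Register number targeted by a `mov reg, imm` opcode (0xB8..0xBF)."""
    if not 0xB8 <= opcode <= 0xBF:
        raise ValueError(f"{opcode:#x} is not a mov reg, imm opcode")
    low_bits = opcode - 0xB8               # 3 means rbx/ebx
    return low_bits | (8 if decode_rex(rex_byte).b else 0)

print(decode_rex(0x49))              # Rex(w=True, r=False, x=False, b=True)
print(mov_imm_register(0x49, 0xBB))  # 11, i.e. r11
```

Without the prefix, the same 0xbb opcode targets register 3 with a 32-bit immediate, which is exactly the `mov $0xf5d2f128,%ebx` that objdump showed one byte later.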

And there’s the bug (3rd line). I’m dereferencing this address, the one that I load into r11, twice: once in the mov, and then again during the addl. I should only dereference it once. The cause was that I misunderstood the mnemonics of SpiderMonkey’s macro assembler.
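
A toy Python model of the double dereference, for illustration only. It assumes that rbx in the register dump (0xe4e4e4e400000891) is the 64-bit word the mov read from the address in r11, which fits the dump but is an inference. The `fixed` function is just one plausible shape for the corrected code, not necessarily what the JIT emits after the fix. All names here are hypothetical.

```python
class SegFault(Exception):
    """Raised when the toy machine touches an unmapped address."""

COUNTER_ADDR = 0x7FFFF5D2F128        # the immediate loaded into r11
WORD_AT_COUNTER = 0xE4E4E4E400000891  # rbx at the time of the crash (assumed)

def new_memory() -> dict:
    # Toy model: only a single 64-bit word is mapped.
    return {COUNTER_ADDR: WORD_AT_COUNTER}

def add32(memory: dict, addr: int, amount: int) -> None:
    """Model `addl $amount,(addr)`: add to the low 32 bits stored at addr."""
    if addr not in memory:
        raise SegFault(hex(addr))
    word = memory[addr]
    low = (word + amount) & 0xFFFFFFFF
    memory[addr] = (word & ~0xFFFFFFFF) | low

def buggy(memory: dict) -> None:
    r11 = COUNTER_ADDR     # movabs $0x7ffff5d2f128,%r11
    rbx = memory[r11]      # mov (%r11),%rbx : first dereference
    add32(memory, rbx, 1)  # addl $0x1,(%rbx): second dereference, faults

def fixed(memory: dict) -> None:
    r11 = COUNTER_ADDR     # movabs $0x7ffff5d2f128,%r11
    add32(memory, r11, 1)  # hypothetical fix: addl $0x1,(%r11)

mem = new_memory()
try:
    buggy(mem)
except SegFault as fault:
    print("segfault at", fault)

fixed(mem)
print(hex(mem[COUNTER_ADDR]))
```

In the buggy version the counter’s contents are treated as a pointer, so the add lands on an address that was never mapped.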

Update 2018-08-07

One response to this pointed out that I could have just used:

(gdb) disassemble 0x12345, +0x100

This disassembles a range of memory, and I wouldn’t have had the "No function contains program counter for selected frame." error. They even suggested I could use something like:

(gdb) disassemble $rip-50, +0x100

I’ll definitely try these next time. They might not be the exact syntax; I haven’t tested them.