Gdb python: Difference between revisions

From miki
This page collects information on Python integration in [[gdb]].


== Links ==
* [https://sourceware.org/gdb/onlinedocs/gdb/Python.html Python in GDB manual].
* [https://sourceware.org/gdb/current/onlinedocs/gdb/Python-API.html Python API]
* {{red|New}} &mdash; https://www.pythonsheets.com/appendix/python-gdb.html
:Example for '''Dumping memory''', '''colorizing listing''', '''pretty printer''', '''tracing''' and '''profiling'''.
* {{red|New}} &mdash; https://repo.zenk-security.com/Reversing%20.%20cracking/Hi%20GDB,%20this%20is%20python.pdf
:Nice examples for <code>read_memory</code>, <code>write_memory</code>, <code>parse_and_eval</code>, manipulating and casting GDB <code>Value</code> objects.
<source lang="python">
# Python 2 syntax (print statement, long)
# gdb$ x/dwx $esp
# 0xbffff7ac: 0xb7e4ee46
int_pointer_type = gdb.lookup_type('int').pointer()
stack_address = gdb.Value(0xbffff7ac)
stack_address_pointer = stack_address.cast(int_pointer_type)
content = long(stack_address_pointer.dereference())
print hex(content & 0xffffffff)
# 0xb7e4ee46L

def deref_long_from_addr(addr):
    '''
    Get the value pointed to by addr
    '''
    p_long = gdb.lookup_type('long').pointer()
    val = gdb.Value(addr).cast(p_long).dereference()
    return long(val) & 0xffffffff
</source>

== Tutorial ==
From [https://www.youtube.com/watch?v=PorfLSr3DDI Greg Law video].

* Single line python: for instance <code>python print("Hello World")</code>

* Multi-line python: simply enter <code>python</code>, then finish with <code>end</code>
python
<source lang="python">
import os
print ("My pid is %d" % os.getpid())
</source>
end

* Integration with gdb:
:* <code>python print(gdb.breakpoints()[0].location)</code>
:* <code>python gdb.Breakpoint('7')</code>, to insert breakpoint at line 7 in current file.

* To get output of a gdb execution, use <code>to_string=True</code>:
<source lang="python">
x=gdb.execute("show architecture", to_string=True).strip()
</source>

== How-to ==
=== Define custom gdb command ===
* https://sourceware.org/gdb/onlinedocs/gdb/Commands-In-Python.html#Commands-In-Python

<source lang="python">
class MyCommand(gdb.Command):
    """MyCommand"""

    def __init__(self):
        super(MyCommand, self).__init__("mycommand", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        print("My great command is alive!")
        # Convert the gdb.Value to an integer before formatting it
        print("R0=0x%08X" % int(gdb.parse_and_eval("$R0")))

MyCommand()
</source>

=== Quit program gracefully with custom python breakpoint ===
For instance, the command <code>Quit</code> below (note that gdb would accept the abbreviation <code>Q</code>) sets a temporary breakpoint that quits gdb when a custom <code>exit_function</code> is reached, then jumps to some <code>cleanup_function</code>:
<source lang="python">
JUMP_TO_FUNCTION="cleanup_function"
QUIT_ON_FUNCTION="exit_function"

class QuitBreakpoint(gdb.Breakpoint):
    def stop(self):
        print("Quitting...")
        gdb.execute("quit")

class Quit(gdb.Command):
    """Quit"""

    def __init__(self):
        super(Quit, self).__init__("Quit", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        print("Setting bkp")
        QuitBreakpoint(QUIT_ON_FUNCTION, gdb.BP_BREAKPOINT, temporary=True)
        gdb.execute("set $eip=%s" % JUMP_TO_FUNCTION)
        gdb.execute("continue")

Quit()
</source>

=== Using Capstone to trace program ===
<source lang="python">
# ...
# Main
main()
</source>

=== Define stop hook from Python ===
<source lang="python">
def stop_handler(event):
    gdb.execute('dd')
    gdb.execute('bt')

gdb.events.stop.connect(stop_handler)
</source>

=== Quit gdb from Python ===
Simply call <code>quit()</code>:
<source lang="python">
python quit()
</source>

=== Get architecture ===
Besides <code>show architecture</code>:

<source lang="python">
def is_alive():
    """Check whether the inferior (the debugged program) is running."""
    try:
        return gdb.selected_inferior().pid > 0
    except Exception:
        return False

if is_alive():
    arch = gdb.selected_frame().architecture()
    # A bare 'return' is not valid outside a function, so print the name
    print(arch.name())
</source>

=== Get register values ===
<source lang="python">
# https://stackoverflow.com/questions/6103887/how-do-i-access-the-registers-with-python-in-gdb
# https://programtalk.com/python-examples/gdb.parse_and_eval/

# Using parse_and_eval
print type(gdb.parse_and_eval('$SP')), gdb.parse_and_eval('$SP')
# <type 'gdb.Value'> 0x2730
print str(gdb.parse_and_eval('$SP'))
# 0x2730
print int(str(gdb.parse_and_eval('$SP')),16)
# 10032
print "0x%08x" % gdb.parse_and_eval('$SP')
# 0x00002730
print int(gdb.parse_and_eval('$SP'))
# ERROR!
print long(gdb.parse_and_eval('$SP'))
# 10032
print int(gdb.parse_and_eval('(int)$SP'))
# 10032
print int(gdb.parse_and_eval('(long)$SP'))
# 10032
int(long(gdb.parse_and_eval('$SP')) & 0xffffffff)
# 10032

# Using read_register
print int(str(gdb.selected_frame().read_register('SP')),16)
# 10032
</source>

=== Implement offsetof and sizeof ===
<source lang="python">
# https://programtalk.com/python-examples/gdb.parse_and_eval/

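def to_int(value):
    # Helper not shown in the original snippet (assumed definition):
    # convert a gdb.Value to a Python integer
    return int(value)

def offsetof(struct_name, member_name):
    expr = '(size_t)&(((%s *)0)->%s) - (size_t)((%s *)0)' % \
        (struct_name, member_name, struct_name)
    return to_int(gdb.parse_and_eval(expr))

def sizeof(type_name):
    return to_int(gdb.parse_and_eval('sizeof(%s)' % (type_name)))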
</source>

=== Get description of any GDB object ===
One option is to use the pythonic <code>dir(...)</code>, but invoking <code>help(...)</code> gives the best result:
<source lang="python">
python help(gdb.selected_frame().read_register('SP'))
# Help on Value object:
#
# class Value(__builtin__.object)
# GDB value object
# ...
# cast(...)
# Cast the value to the supplied type.
# ...
# address
# The address of the value.
</source>

=== Get address of a symbol ===
For instance, say <code>__begin_text</code>:

<source lang="python">
python print gdb.parse_and_eval('__begin_text').address
# 0x40000 <__start>
python print str(gdb.parse_and_eval('__begin_text').address)
# 0x40000 <__start>
python print int(gdb.parse_and_eval('__begin_text').address)
# ERROR
python print long(gdb.parse_and_eval('__begin_text').address)
# 262144
python print "0x%08x" % (gdb.parse_and_eval('__begin_text').address)
# 0x00040000
</source>

=== Read / Write / Search memory ===
* https://www.zeuthen.desy.de/dv/documentation/unixguide/infohtml/gdb/Inferiors-In-Python.html
* <code>buffer</code> object: https://docs.python.org/2.7/library/functions.html#buffer
:* Python 3: Use <code>memoryview</code>.
* Examples:
:* https://stackoverflow.com/questions/46572440/gdb-python-module-read-memory-content
:* https://gist.github.com/ricksladkey/bdcd761a5b06e3d670728d8cc96458ba

<source lang="python">
read_memory(address, length)
write_memory(address, buffer [, length])
search_memory(address, length, pattern)
</source>

This is for Python 2:
<source lang="python">
inf = gdb.selected_inferior()
b = inf.read_memory(0x40000,16) # Return a buffer object
print list(b) # ['u', '\x86' ....]
b.tobytes() # ERROR! Python2
str(b) # ERROR! Python2
</source>

Latest revision as of 20:52, 13 March 2021

This page collects information on Python integration in gdb.

Links

Example for Dumping memory, colorizing listing, pretty printer, tracing and profiling.
Nice examples for read_memory, write_memory, parse_and_eval, manipulating and casting GDB Value objects.
# gdb$ x/dwx $esp
# 0xbffff7ac:     0xb7e4ee46
int_pointer_type = gdb.lookup_type('int').pointer()
stack_address = gdb.Value(0xbffff7ac)
stack_address_pointer = stack_address.cast(int_pointer_type)
content = long(stack_address_pointer.dereference())
print hex(content & 0xffffffff)
# 0xb7e4ee46L

def deref_long_from_addr(addr):
    '''
    Get the value pointed to by addr
    '''
    p_long = gdb.lookup_type('long').pointer()
    val = gdb.Value(addr).cast(p_long).dereference()
    return long(val) & 0xffffffff

Tutorial

From Greg Law video.

  • Single line python: for instance python print("Hello World")
  • Multi-line python: simply enter python, then finish with end
python
    import os
    print ("My pid is %d" % os.getpid())
end
  • Integration with gdb:
  • python print(gdb.breakpoints()[0].location)
  • python gdb.Breakpoint('7'), to insert breakpoint at line 7 in current file.
  • To get output of a gdb execution, use to_string=True:
x=gdb.execute("show architecture", to_string=True).strip()
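
The string captured with to_string=True still has to be parsed. The following is a minimal sketch, assuming the message looks like one of the sample strings below (the exact wording varies between GDB versions). parse_architecture is a hypothetical helper, not part of the gdb module:

```python
import re

def parse_architecture(output):
    """Extract the architecture name from the text of `show architecture`."""
    # "auto" mode: the effective architecture follows the word "currently".
    match = re.search(r'currently "?([^")]+)"?\)', output)
    if match is None:
        # Explicit mode: the name follows "set to" or "assumed to be".
        match = re.search(r'(?:set to|assumed to be) "?([^"\s]+?)"?\.?$', output.strip())
    return match.group(1) if match else None

# Sample messages (assumed wording; check the output of your own GDB).
print(parse_architecture('The target architecture is set to "auto" (currently "i386:x86-64").'))
print(parse_architecture('The target architecture is set to "arm".'))
```

Inside GDB, the helper would be fed with gdb.execute("show architecture", to_string=True).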

How-to

Define custom gdb command

class MyCommand( gdb.Command ):
    """MyCommand"""

    def __init__(self):
        super(MyCommand, self).__init__( "mycommand", gdb.COMMAND_USER )

    def invoke(self, arg, from_tty):
        print("My great command is alive!")
        # Convert the gdb.Value to an integer before formatting it
        print("R0=0x%08X" % int(gdb.parse_and_eval("$R0")))

MyCommand()
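
The arg parameter of invoke receives everything typed after the command name as one string. The sketch below shows one way to split it into words. The showargs command and the split_args helper are hypothetical names used only for illustration, and shlex.split stands in for gdb.string_to_argv so that the helper can also be exercised outside GDB:

```python
import shlex

try:
    import gdb  # only available inside GDB
except ImportError:
    gdb = None

def split_args(arg):
    """Split the raw argument string handed to invoke() into words."""
    # Inside GDB, gdb.string_to_argv(arg) does the same job;
    # shlex.split is used here so the helper also runs outside GDB.
    return shlex.split(arg)

if gdb is not None:
    class ShowArgs(gdb.Command):
        """showargs ARG...: print each argument on its own line (illustrative only)."""

        def __init__(self):
            super(ShowArgs, self).__init__("showargs", gdb.COMMAND_USER)

        def invoke(self, arg, from_tty):
            for index, word in enumerate(split_args(arg)):
                print("arg[%d] = %s" % (index, word))

    ShowArgs()
```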

Quit program gracefully with custom python breakpoint

For instance, the command Quit below (note that gdb would accept the abbreviation Q) sets a temporary breakpoint that quits gdb when a custom exit_function is reached, then jumps to some cleanup_function:

JUMP_TO_FUNCTION="cleanup_function"
QUIT_ON_FUNCTION="exit_function"

class QuitBreakpoint(gdb.Breakpoint):
    def stop(self):
        print("Quitting...")
        gdb.execute("quit")

class Quit( gdb.Command ):
    """Quit"""

    def __init__(self):
        super(Quit, self).__init__("Quit", gdb.COMMAND_USER )

    def invoke(self, arg, from_tty):
        print("Setting bkp")
        QuitBreakpoint(QUIT_ON_FUNCTION, gdb.BP_BREAKPOINT, temporary=True)
        gdb.execute("set $eip=%s" % JUMP_TO_FUNCTION)
        gdb.execute("continue")

Quit()

Using Capstone to trace program

import re
import gdb

from capstone import *
from capstone.arm import *
import binascii

# Documentation : https://sourceware.org/gdb/onlinedocs/gdb/Python-API.html
# Server-side : gdbserver --multi CLIENT_ADDR:REMOTE_PORT
# Client-side : gdb-multiarch ./my_exec
# Load the script in GDB CLI : source /path/to/gdb_script_st.py

REMOTE_ADDR = "192.168.20.150"
REMOTE_PORT = "6666"
REMOTE_FILE_PATH = "/path/on/remote/to/my_exec"

PAGINATION = "off"

LOG = "off"
LOG_REDIRECT = "on"
LOG_OVERWRITE = "on"
LOG_FILE = "trace.log"

TRACE_START_ADDR = 0x112fc
TRACE_END_ADDR = 0

exited = False

def read_register(register_name):
    return int(gdb.selected_frame().read_register(register_name)) & 0xffffffff


def process_register(register_name, access, occurences, readen_registers, written_registers, operand_string):
    if register_name not in occurences:
        occurences[register_name] = (match.span() for match in re.finditer(register_name, operand_string))
    start, end = next(occurences[register_name])
    if access & CS_AC_WRITE:
        written_registers.append(register_name)
    else:
        register_value = read_register(register_name)
        readen_registers.append((register_name, register_value, start + occurences["offset"], end + occurences["offset"]))
        occurences["offset"] += len(f"0x{register_value:x}") - len(register_name)


def concrete_operand_string(operand_string, readen_registers):
    for register_name, register_value, start, end in readen_registers:
        operand_string = operand_string[:start] + f"0x{register_value:x}" + operand_string[end:]
    return operand_string


def exit_handler(event):

    global exited

    exited = True
    if hasattr(event, "exit_code"):
        print(f"[!] Program has exited returning {event.exit_code}")
    else:
        print(f"[!] Program has exited (couldn't read exit code)")


class TraceRun (gdb.Command):

    def __init__(self):
        super(TraceRun, self).__init__("trace-run", gdb.COMMAND_RUNNING)


    def invoke(self, arg, from_tty):

        global exited

        exited = False
        inferior = gdb.selected_inferior()
        pc = TRACE_START_ADDR
        breakpoint = gdb.Breakpoint(f"*0x{pc:x}", gdb.BP_BREAKPOINT, temporary=True)
        cs_engine = Cs(CS_ARCH_ARM, CS_MODE_ARM)
        cs_engine.detail = True

        gdb.execute(f"r {arg}")

        if not exited:

            gdb.execute(f"set logging {LOG}")

            while(pc != TRACE_END_ADDR):

                readen_registers = []
                written_registers = []
                occurences = {"offset": 0}

                memory_view = inferior.read_memory(pc, 0x4)
                instruction = next(cs_engine.disasm(memory_view.tobytes(), pc))

                for operand in instruction.operands:
                    if operand.type == ARM_OP_REG:
                        process_register(instruction.reg_name(operand.value.reg), operand.access, occurences, readen_registers, written_registers, instruction.op_str)
                    if operand.type == ARM_OP_MEM:
                        if operand.value.mem.base:
                            process_register(instruction.reg_name(operand.value.mem.base), operand.access, occurences, readen_registers, written_registers, instruction.op_str)
                        if operand.value.mem.index:
                            process_register(instruction.reg_name(operand.value.mem.index), operand.access, occurences, readen_registers, written_registers, instruction.op_str)

                gdb.execute("ni")
                print(f"0x{instruction.address:x}: {instruction.mnemonic} {instruction.op_str}")
                concrete_op_str = concrete_operand_string(instruction.op_str, readen_registers)
                if instruction.op_str != concrete_op_str:
                    print(f"0x{instruction.address:x}: {instruction.mnemonic} {concrete_op_str}")
                for register_name in written_registers:
                    register_value = read_register(register_name)
                    print(f"0x{instruction.address:x}: {register_name} = 0x{register_value:x}")
                pc = read_register("pc")

            gdb.execute("set logging off")
            gdb.execute("c")


def main():
    gdb.execute(f"target extended-remote {REMOTE_ADDR}:{REMOTE_PORT}")
    gdb.execute(f"set remote exec-file {REMOTE_FILE_PATH}")
    gdb.execute(f"set pagination {PAGINATION}")
    gdb.execute(f"set logging file {LOG_FILE}")
    gdb.execute(f"set logging redirect {LOG_REDIRECT}")
    gdb.execute(f"set logging overwrite {LOG_OVERWRITE}")
    gdb.execute(f"set logging off")
    gdb.execute(f"set confirm off")
    gdb.events.exited.connect(exit_handler)


if __name__ == "__main__":
    # Commands registration
    TraceRun()
    # Main
    main()

Disable ptrace detection in a program

Some programs call ptrace to detect whether they are being debugged or traced (for example with strace).

We can set a breakpoint on ptrace to disable this detection:

  • On the first call, ptrace must return 0 to indicate success.
  • On the second call, ptrace must return -1 to indicate that a tracer is already attached (the program itself attached in the first call above).

We use Python variables to store this state, and reset on program exit:

import gdb

PTRACE_ADDR = 0x000153d0

unptrace_breakpoint = None
unptrace_breakpoint_value = 0


def exit_handler(event):

    global unptrace_breakpoint_value

    exited = True
    if hasattr(event, "exit_code"):
        print(f"[!] Program has exited returning {event.exit_code}")
    else:
        print(f"[!] Program has exited (couldn't read exit code)")
    unptrace_breakpoint_value=0


class PtraceBreakpoint (gdb.Breakpoint):
    def stop(self):
        global unptrace_breakpoint_value

        print(f"set $r0={unptrace_breakpoint_value}")
        gdb.execute(f"set $r0={unptrace_breakpoint_value}")
        gdb.execute("set $pc=$lr")
        print(f"unptrace: returned {unptrace_breakpoint_value}!")
        unptrace_breakpoint_value=0xffffffff
        return False

class UnPtrace (gdb.Command):

    def __init__(self):
        global unptrace_breakpoint
        global unptrace_breakpoint_value

        super(UnPtrace, self).__init__("unptrace", gdb.COMMAND_USER)
        unptrace_breakpoint = None
        unptrace_breakpoint_value = 0

    def invoke(self, arg, from_tty):
        # Both names are assigned below, so both must be declared global
        global unptrace_breakpoint, unptrace_breakpoint_value

        if unptrace_breakpoint == None:
            unptrace_breakpoint = PtraceBreakpoint(f"*0x{PTRACE_ADDR:x}", gdb.BP_BREAKPOINT)
            unptrace_breakpoint_value = 0

def main():
    gdb.events.exited.connect(exit_handler)


if __name__ == "__main__":
    # Commands registration
    UnPtrace()
    # Main
    main()
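
The return-value sequence used by the script can be checked without GDB. The class below is a small stand-alone model of that state (PtraceReturnModel is a hypothetical name, and the use of 0xffffffff for -1 assumes a 32-bit register, as in the script):

```python
class PtraceReturnModel:
    """Model of the value the breakpoint writes into r0 on each ptrace call."""

    FAILURE = 0xffffffff  # -1 as an unsigned 32-bit register value

    def __init__(self):
        self.reset()

    def reset(self):
        """Forget earlier calls, e.g. when the program exits."""
        self.next_value = 0

    def on_call(self):
        """Return the value for this call and arm the failure value for later calls."""
        value = self.next_value
        self.next_value = self.FAILURE
        return value

model = PtraceReturnModel()
print([hex(model.on_call()) for _ in range(3)])
# ['0x0', '0xffffffff', '0xffffffff']
```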

Trace and inspect some functions calls

Here we set up breakpoint hooks to inspect calls to read(2) and write(2):

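import gdb

import binascii

READ_ADDR = 0x000152f0
WRITE_ADDR = 0x00015300

# Settings and helper reused from the Capstone tracing script above,
# repeated here so that this script can be sourced on its own
PAGINATION = "off"
LOG_REDIRECT = "on"
LOG_OVERWRITE = "on"
LOG_FILE = "trace.log"

exited = False

def read_register(register_name):
    return int(gdb.selected_frame().read_register(register_name)) & 0xffffffff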

def exit_handler(event):

    global exited

    exited = True
    if hasattr(event, "exit_code"):
        print(f"[!] Program has exited returning {event.exit_code}")
    else:
        print(f"[!] Program has exited (couldn't read exit code)")

read_r0 = None
read_r1 = None
read_r2 = None

class ReadBackBreakpoint (gdb.Breakpoint):
    def stop(self):
        global read_r0, read_r1, read_r2

        if read_r0 != None:
            lr = read_register("pc")        # PC is LR when we created the bkp
            r0 = read_r0
            r1 = read_r1
            r2 = read_r2
            r_bytes = gdb.selected_inferior().read_memory(r1,r2).tobytes()
            print(f"0x{lr-4:08x}: read_({r0},0x{r1:08x},{r2:3}) <-- {binascii.hexlify(r_bytes).decode()}")

        (read_r0,read_r1,read_r2) = (None,None,None)
        return False

class ReadBreakpoint (gdb.Breakpoint):
    def stop(self):
        global read_r0, read_r1, read_r2

        r0 = read_register("r0")
        r1 = read_register("r1")
        r2 = read_register("r2")
        lr = read_register("lr")
        # print(f"0x{lr-4:08x}: read_({r0},0x{r1:08x},{r2:3})")

        # Set up temporary bkp to collect read bytes
        (read_r0,read_r1,read_r2) = (r0,r1,r2)
        breakpoint = ReadBackBreakpoint(f"*0x{lr:x}", gdb.BP_BREAKPOINT, temporary=True)

        return False

class WriteBreakpoint (gdb.Breakpoint):
    def stop(self):
        lr = read_register("lr")
        r0 = read_register("r0")
        r1 = read_register("r1")
        r2 = read_register("r2")
        w_bytes = gdb.selected_inferior().read_memory(r1, r2).tobytes()

        print(f"0x{lr-4:08x}: write({r0},0x{r1:08x},{r2:3}) ==> {binascii.hexlify(w_bytes).decode()}")
        print()
        return False

class HookRW (gdb.Command):

    def __init__(self):
        super(HookRW, self).__init__("hookrw", gdb.COMMAND_USER)
        self.read_bkp = None
        self.write_bkp = None

    def invoke(self, arg, from_tty):
        if self.read_bkp == None:
            self.read_bkp = ReadBreakpoint(f"*0x{READ_ADDR:x}", gdb.BP_BREAKPOINT)
        if self.write_bkp == None:
            self.write_bkp = WriteBreakpoint(f"*0x{WRITE_ADDR:x}", gdb.BP_BREAKPOINT)

def main():
    gdb.execute(f"set pagination {PAGINATION}")
    gdb.execute(f"set logging file {LOG_FILE}")
    gdb.execute(f"set logging redirect {LOG_REDIRECT}")
    gdb.execute(f"set logging overwrite {LOG_OVERWRITE}")
    gdb.execute(f"set logging off")
    gdb.execute(f"set confirm off")
    gdb.events.exited.connect(exit_handler)

if __name__ == "__main__":
    # Commands registration
    HookRW()
    # Main
    main()
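
The formatting of a trace line can be tried outside GDB. This is a minimal sketch in which format_write is a hypothetical helper and a memoryview stands in for the object returned by read_memory. As in the script, lr - 4 is assumed to be the address of the calling instruction:

```python
import binascii

def format_write(lr, fd, address, data):
    """Build one trace line for a write call from its arguments and payload."""
    hex_bytes = binascii.hexlify(bytes(data)).decode()
    return f"0x{lr - 4:08x}: write({fd},0x{address:08x},{len(data):3}) ==> {hex_bytes}"

print(format_write(0x10004, 1, 0x20000, memoryview(b"hi\n")))
# 0x00010000: write(1,0x00020000,  3) ==> 68690a
```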

Define stop hook from Python

def stop_handler (event):
    gdb.execute('dd')
    gdb.execute('bt')

gdb.events.stop.connect (stop_handler)

Quit gdb from Python

Simply call quit():

python quit()

Get architecture

Besides show architecture:

def is_alive():
    """Check whether the inferior (the debugged program) is running."""
    try:
        return gdb.selected_inferior().pid > 0
    except Exception:
        return False

if is_alive():
    arch = gdb.selected_frame().architecture()
    # A bare 'return' is not valid outside a function, so print the name
    print(arch.name())

Get register values

# https://stackoverflow.com/questions/6103887/how-do-i-access-the-registers-with-python-in-gdb
# https://programtalk.com/python-examples/gdb.parse_and_eval/

# Using parse_and_eval
print type(gdb.parse_and_eval('$SP')), gdb.parse_and_eval('$SP')
# <type 'gdb.Value'> 0x2730
print str(gdb.parse_and_eval('$SP'))
# 0x2730
print int(str(gdb.parse_and_eval('$SP')),16)
# 10032
print "0x%08x" % gdb.parse_and_eval('$SP')
# 0x00002730
print int(gdb.parse_and_eval('$SP'))
# ERROR!
print long(gdb.parse_and_eval('$SP'))
# 10032
print int(gdb.parse_and_eval('(int)$SP'))
# 10032
print int(gdb.parse_and_eval('(long)$SP'))
# 10032
int(long(gdb.parse_and_eval('$SP')) & 0xffffffff)
# 10032

# Using read_register
print int(str(gdb.selected_frame().read_register('SP')),16)
# 10032
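
Several snippets on this page mask register values with 0xffffffff. The helpers below sketch that conversion for a 32-bit target (to_unsigned32 and to_signed32 are hypothetical names). They work on plain Python integers, so inside GDB the gdb.Value would first be converted with int() or long() as shown above:

```python
MASK32 = 0xffffffff

def to_unsigned32(value):
    """Keep only the low 32 bits, as a 32-bit register would hold them."""
    return int(value) & MASK32

def to_signed32(value):
    """Interpret the low 32 bits as a two's-complement signed number."""
    value = to_unsigned32(value)
    return value - 0x100000000 if value & 0x80000000 else value

print(to_unsigned32(-1))                 # 4294967295
print(to_signed32(0xffffffff))           # -1
print("0x%08x" % to_unsigned32(10032))   # 0x00002730
```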

Implement offsetof and sizeof

# https://programtalk.com/python-examples/gdb.parse_and_eval/

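def to_int(value):
    # Helper not shown in the original snippet (assumed definition):
    # convert a gdb.Value to a Python integer
    return int(value)

def offsetof(struct_name, member_name):
    expr = '(size_t)&(((%s *)0)->%s) - (size_t)((%s *)0)' % \
        (struct_name, member_name, struct_name)
    return to_int(gdb.parse_and_eval(expr))

def sizeof(type_name):
    return to_int(gdb.parse_and_eval('sizeof(%s)' % (type_name)))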
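
To see what is actually handed to parse_and_eval, the expression can be built and printed outside GDB. In this sketch, offsetof_expr is a hypothetical helper, and the type and member names are placeholders:

```python
def offsetof_expr(struct_name, member_name):
    """Build the C expression that GDB evaluates to obtain a member offset."""
    return '(size_t)&(((%s *)0)->%s) - (size_t)((%s *)0)' % (
        struct_name, member_name, struct_name)

# 'struct foo' and 'bar' are placeholder names.
print(offsetof_expr('struct foo', 'bar'))
# (size_t)&(((struct foo *)0)->bar) - (size_t)((struct foo *)0)
```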

Get description of any GDB object

One option is to use the pythonic dir(...), but invoking help(...) gives the best result:

python help(gdb.selected_frame().read_register('SP'))
# Help on Value object:
# 
# class Value(__builtin__.object)
#   GDB value object
# ...
#   cast(...)
#       Cast the value to the supplied type.
# ...
#   address
#       The address of the value.

Get address of a symbol

For instance, say __begin_text:

python print gdb.parse_and_eval('__begin_text').address
# 0x40000 <__start>
python print str(gdb.parse_and_eval('__begin_text').address)
# 0x40000 <__start>
python print int(gdb.parse_and_eval('__begin_text').address)
# ERROR
python print long(gdb.parse_and_eval('__begin_text').address)
# 262144
python print "0x%08x" % (gdb.parse_and_eval('__begin_text').address)
# 0x00040000
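
When only the string form is at hand (for instance from captured output), the numeric address can be recovered by parsing it. This is a minimal sketch, assuming the text starts with a hexadecimal address as in the output above. parse_address is a hypothetical helper:

```python
def parse_address(text):
    """Return the integer address from text such as '0x40000 <__start>'."""
    return int(text.split()[0], 16)

address = parse_address("0x40000 <__start>")
print(address)              # 262144
print("0x%08x" % address)   # 0x00040000
```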

Read / Write / Search memory

  • Python 3: Use memoryview.
  • Examples:
read_memory(address, length)
write_memory(address, buffer [, length])
search_memory(address, length, pattern)

This is for Python 2:

inf = gdb.selected_inferior()
b = inf.read_memory(0x40000,16)   # Return a buffer object
print list(b) # ['u', '\x86' ....]
b.tobytes() # ERROR! Python2
str(b)      # ERROR! Python2
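
With Python 3, the object returned by read_memory behaves like a memoryview, as used in the Capstone script above. The sketch below shows one way to turn such a buffer into bytes. The to_bytes helper is hypothetical, and a memoryview over a literal stands in for the result of inf.read_memory(...):

```python
import binascii

def to_bytes(buf):
    """Return the content of a memory buffer as a bytes object."""
    if hasattr(buf, "tobytes"):
        return buf.tobytes()      # memoryview (Python 3)
    return bytes(bytearray(buf))  # fallback for other buffer-like objects

sample = memoryview(b"\x75\x86\x00\x01")  # stand-in for inf.read_memory(...)
print(binascii.hexlify(to_bytes(sample)).decode())
# 75860001
```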