This is a short introduction to the filebytes module. filebytes is a Python module that can be used to read and write the following file formats:

  • Executable and Linking Format (ELF),
  • Portable Executable (PE),
  • MachO and
  • OAT (Android Runtime).

Open files

A separate module exists for each file type (elf, pe, mach_o, oat) and has to be imported. Each module defines all the types needed to parse that file type. To open a file, use the class corresponding to the file type you want to read (ELF, PE, MachO, OAT).

from filebytes.elf import *

fileName = 'afile'
elffile = ELF(fileName)

# It is also possible to pass the bytes of a file directly.
with open(fileName, 'rb') as f:
    data = f.read()
elffile = ELF('any filename', data)  # the filename is not used in this case
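
Because each file type has its own class, the caller has to know the format before opening a file. The following is a minimal sketch of one way to decide, using only the standard library to inspect the leading magic bytes. The name `guess_format` and the returned labels are hypothetical and not part of filebytes. An OAT file is an ELF file, so it is reported as 'elf' here.

```python
import struct

ELF_MAGIC = b'\x7fELF'
PE_MAGIC = b'MZ'  # DOS header signature at the start of a PE file
# Mach-O magic numbers, covering both byte orders as they appear on disk
# when the first four bytes are read as a big-endian integer.
MACHO_MAGICS = {0xfeedface, 0xfeedfacf, 0xcefaedfe, 0xcffaedfe}


def guess_format(data):
    """Return 'elf', 'pe', 'mach_o' or None for the given file bytes.

    Hypothetical helper for illustration, not part of filebytes.
    """
    data = bytes(data)
    if data[:4] == ELF_MAGIC:
        return 'elf'
    if data[:2] == PE_MAGIC:
        return 'pe'
    if len(data) >= 4 and struct.unpack('>I', data[:4])[0] in MACHO_MAGICS:
        return 'mach_o'
    return None
```

The returned label could then be used to pick between the ELF, PE and MachO classes.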

Data Access

When a file is opened, its content is parsed and you can access the data via several properties. The file's data is generally held in containers, which have several attributes. The number and names of these attributes depend on the header the container represents. To access the structure that was used to parse the data, use the attribute 'header'. All container types have at least this attribute.

container = file.aHeaderContainer
print(container.header.aHeaderField)

If the header structure points to another region in the file, you can access this data via the 'bytes' attribute. This attribute holds a bytearray containing the bytes that the header structure points to.

container = file.aHeaderContainer
data = container.bytes
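
To make the relationship between 'header' and 'bytes' concrete, here is a small self-contained sketch built on ctypes. It is not how filebytes is implemented. `DemoHeader`, `DemoContainer` and `parse_demo` are hypothetical names, and the 8-byte header layout is invented for the example.

```python
import ctypes


class DemoHeader(ctypes.LittleEndianStructure):
    """Hypothetical header: where a region starts and how long it is."""
    _fields_ = [
        ('offset', ctypes.c_uint32),
        ('size', ctypes.c_uint32),
    ]


class DemoContainer(object):
    """Hypothetical container bundling a parsed header with derived values."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


def parse_demo(data):
    """Parse a DemoHeader from the start of data and copy out its region."""
    header = DemoHeader.from_buffer_copy(data)
    region = bytearray(data[header.offset:header.offset + header.size])
    return DemoContainer(header=header, bytes=region)


# The header says: 4 bytes of payload start at offset 8.
blob = bytearray(b'\x08\x00\x00\x00\x04\x00\x00\x00' + b'DATA' + b'tail')
container = parse_demo(blob)
```

Here `container.header.offset` gives access to the raw header field, while `container.bytes` holds the region the header points to.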

All file types have the attributes 'imageBase' and 'entryPoint'.

ib = file.imageBase
ep = file.entryPoint

ELF

To access the data, the ELF class provides the following properties:

elfheader = elffile.elfHeader     # the ELF header
                                  # ELF structure: EHDR
segments = elffile.segments       # a list of all segments
                                  # ELF structure: PHDR
segments = elffile.programHeaders # same as segments
sections = elffile.sections       # a list of all sections in the file
                                  # ELF structure: SHDR

For example, to access the .got section:

ls = ELF('/bin/ls')

# Select the first section named '.got' (raises IndexError if there is none).
got = [s for s in ls.sections if s.name == '.got'][0]

print('Name: %s' % got.name)
print('Offset: 0x%x' % got.header.sh_offset)
print('Address: 0x%x' % got.header.sh_addr)
print('Size: 0x%x' % got.header.sh_size)
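
The list comprehension above fails with an IndexError when the section does not exist. A minimal sketch of a more forgiving lookup follows. `find_section` is a hypothetical helper, not part of filebytes, and the stand-in objects exist only so the sketch runs without a real ELF file (SimpleNamespace requires Python 3).

```python
from types import SimpleNamespace


def find_section(sections, name):
    """Return the first section called name, or None if there is none."""
    return next((s for s in sections if s.name == name), None)


# Stand-in objects that only mimic the 'name' attribute of a section.
demo_sections = [
    SimpleNamespace(name='.text'),
    SimpleNamespace(name='.got'),
]

got = find_section(demo_sections, '.got')
missing = find_section(demo_sections, '.nonexistent')
```

With a parsed file, the same helper could presumably be called as `find_section(ls.sections, '.got')`.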

The following containers are used for those properties:

class EhdrData(Container):
    """
    header = ElfHeader (EHDR)
    """

class ShdrData(Container):
    """
    header = SectionHeader (SHDR)
    name = string (section name)
    bytes = bytearray (section bytes)
    raw = c_ubyte_array

    .dynamic
    content = List of DYN entries

    .rel & .rela
    relocations = list of REL or RELA

    .dynsym & .symtab
    symbols = list of SYM entries
    """


class PhdrData(Container):
    """
    type = Program Header Type (PHDR)
    header = ProgrammHeader
    bytes = bytearray (segment bytes)
    raw = c_ubyte_array
    vaddr = virtual address (int)
    offset = offset
    """


class SymbolData(Container):
    """
    header = Symbol
    name = string
    type = int
    bind = bind
    """


class RelocationData(Container):
    """
    header = Relocation
    symbol = SymbolData
    type = type of relocation
    """

class DynamicData(Container):
    """
    header = DYN
    tag = value of class DT
    """

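Since only some sections carry a 'symbols' attribute, code that walks all sections has to cope with its absence. The sketch below assumes only the attribute names listed in the containers above. `symbol_names` is a hypothetical helper, and the stand-in objects replace real ShdrData and SymbolData instances (SimpleNamespace requires Python 3).

```python
from types import SimpleNamespace


def symbol_names(sections):
    """Collect the names of all symbols found in the given sections."""
    names = []
    for section in sections:
        # Treat a missing or None 'symbols' attribute as an empty list.
        for symbol in getattr(section, 'symbols', None) or []:
            names.append(symbol.name)
    return names


demo_sections = [
    SimpleNamespace(name='.text'),
    SimpleNamespace(name='.dynsym', symbols=[
        SimpleNamespace(name='printf'),
        SimpleNamespace(name='malloc'),
    ]),
]
names = symbol_names(demo_sections)
```
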
PE

To access the data, the PE class provides the following properties:

from filebytes.pe import *

pefile = PE('afile')

idh = pefile.imageDosHeader           # the DOS header
inh = pefile.imageNtHeaders           # the NT headers
sections = pefile.sections            # a list of all sections
dataDirectory = pefile.dataDirectory  # a list with the data that the DataDirectory entries in the OptionalHeader point to

# Example: inspect the first import descriptor
imports = pefile.dataDirectory[ImageDirectoryEntry.IMPORT]
originalFirstThunk = imports[0].header.OriginalFirstThunk
print(imports[0].dllName)
print(len(imports[0].importNameTable))

The following containers are used for those properties:

class ImageDosHeaderData(Container):
    """
    header = IMAGE_DOS_HEADER
    """

class ImageNtHeaderData(Container):
    """
    header = IMAGE_NT_HEADERS
    """

class SectionData(Container):
    """
    header = IMAGE_SECTION_HEADER
    name = name of the section (str)
    bytes = bytes of section (bytearray)
    raw = bytes of section (c_ubyte_array)
    """

class ImportDescriptorData(Container):
    """
    header = IMAGE_IMPORT_DESCRIPTOR
    dllName = name of dll (str)
    importNameTable = list of IMAGE_THUNK_DATA
    importAddressTable = list of IMAGE_THUNK_DATA
    """

class ImportByNameData(Container):
    """
    header = IMAGE_IMPORT_BY_NAME
    name = name of function (str)
    """


class ThunkData(Container):
    """
    header = IMAGE_THUNK_DATA
    rva = relative virtual address of thunk
    ordinal = None | Ordinal
    importByName = None | ImportByNameData
    """

class ExportDirectoryData(Container):
    """
    header = IMAGE_EXPORT_DIRECTORY
    name = name of dll (str)
    functions = list of FunctionData
    """

class FunctionData(Container):
    """
    name = name of the function (str)
    ordinal = ordinal (int)
    rva = relative virtual address of function (int)
    """

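The import containers nest three levels deep: descriptor, thunk, and import-by-name. A minimal sketch of walking them is shown here. It relies only on the attribute names listed above. `imported_functions` is a hypothetical helper, and the stand-in objects replace real ImportDescriptorData and ThunkData instances (SimpleNamespace requires Python 3).

```python
from types import SimpleNamespace


def imported_functions(import_descriptors):
    """Map each DLL name to the list of functions imported from it."""
    result = {}
    for descriptor in import_descriptors:
        entries = []
        for thunk in descriptor.importNameTable:
            if thunk.importByName is not None:
                # Imported by name
                entries.append(thunk.importByName.name)
            else:
                # Imported by ordinal only
                entries.append('ordinal %s' % thunk.ordinal)
        result[descriptor.dllName] = entries
    return result


demo_imports = [
    SimpleNamespace(dllName='KERNEL32.dll', importNameTable=[
        SimpleNamespace(ordinal=None,
                        importByName=SimpleNamespace(name='ExitProcess')),
        SimpleNamespace(ordinal=17, importByName=None),
    ]),
]
functions = imported_functions(demo_imports)
```

With a parsed file, the argument would presumably be `pefile.dataDirectory[ImageDirectoryEntry.IMPORT]`. Whether that entry is always present has not been verified here.
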
MachO

To access the data the MachO class provides two properties:

mach_header = macho.machHeader
load_commands = macho.loadCommands

The following containers are used for those properties:

class MachHeaderData(Container):
    """
    header = MachHeader
    """

class LoadCommandData(Container):
    """
    header = LoaderCommand
    bytes = bytes of the command (bytearray)
    raw = bytes of the command (c_ubyte_array)

    SegmentCommand
    sections = list of SectionData

    UuidCommand
    uuid = uuid (str)

    TwoLevelHintsCommand
    twoLevelHints = list of TwoLevelHintData

    DylibCommand
    name = name of dylib (str)

    DylinkerCommand
    name = name of dynamic linker
    """

class SectionData(Container):
    """
    header = Section

    """

class TwoLevelHintData(Container):
    """
    header = TwoLevelHint
    """

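Which extra attributes a LoadCommandData carries depends on the command type, so generic code should not assume they are present. The following sketch gathers the UUID and all sections from a list of load commands. `summarize_load_commands` is a hypothetical helper, and the stand-in objects only mimic the attribute names documented above (SimpleNamespace requires Python 3).

```python
from types import SimpleNamespace


def summarize_load_commands(load_commands):
    """Return (uuid, sections) gathered from a list of load commands."""
    uuid = None
    sections = []
    for command in load_commands:
        # Only a UuidCommand is documented to carry 'uuid'.
        if getattr(command, 'uuid', None) is not None:
            uuid = command.uuid
        # Only a SegmentCommand is documented to carry 'sections'.
        sections.extend(getattr(command, 'sections', None) or [])
    return uuid, sections


demo_commands = [
    SimpleNamespace(sections=[SimpleNamespace(header='text'),
                              SimpleNamespace(header='data')]),
    SimpleNamespace(uuid='demo-uuid'),
    SimpleNamespace(name='/usr/lib/dyld'),
]
uuid, sections = summarize_load_commands(demo_commands)
```
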
OAT

Since an OAT file is an ELF file, you can use all the properties of ELF. Additionally, the OAT class provides two OAT-specific properties:

  • oatHeader
  • oatDexHeader

oat_header = oat.oatHeader
oat_dex_header = oat.oatDexHeader  # a list of OatDexFileHeaderData

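As a small illustration of working with the list returned by oatDexHeader, the sketch below reports the name and size of each embedded dex file. It assumes only the 'name' and 'dexBytes' attributes of the entries. `dex_sizes` is a hypothetical helper, and the stand-in object replaces a real entry (SimpleNamespace requires Python 3).

```python
from types import SimpleNamespace


def dex_sizes(oat_dex_headers):
    """Return a list of (name, size in bytes) for each embedded dex file."""
    return [(entry.name, len(entry.dexBytes)) for entry in oat_dex_headers]


demo_entries = [
    SimpleNamespace(name='base.apk', dexBytes=bytearray(b'dex\n035\x00')),
]
sizes = dex_sizes(demo_entries)
```
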
The following containers are used for those properties:

class OatHeaderData(Container):
    """
    header = OatHeader
    keyValueStoreRaw = c_ubyte_array
    keyValueStore = dict 
    """

class OatDexFileHeaderData(Container):
    """
    header = OatDexFileHeader without dexFileLocationSize and dexFileLocation
    name = name of the dex file (str)
    classOffsets = c_uint_array
    dexHeader = DexHeader
    dexRaw = c_ubyte_array
    dexBytes = bytearray
    oatClasses = list of OatClasses
    """