FileBytes - Introduction
This is a short introduction to the filebytes module. filebytes is a Python module which can be used to read and write the following file formats:
- Executable and Linking Format (ELF),
- Portable Executable (PE),
- MachO and
- OAT (Android Runtime).
Open files
A separate module exists for each file type (elf, pe, mach_o, oat) and has to be imported. Each module defines all the types needed for parsing that file type. To open a file, use the class corresponding to the file type you want to read (ELF, PE, MachO, OAT).
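When the file type is not known in advance, one way to pick the right class is to look at the magic bytes at the start of the file first. The following is a minimal sketch using only the standard library. The name `guess_format` is hypothetical and not part of filebytes, the PE check only looks at the DOS `MZ` signature, and fat Mach-O binaries are not handled.

```python
import struct

ELF_MAGIC = b"\x7fELF"
PE_MAGIC = b"MZ"  # DOS signature at the start of a PE file
# Mach-O magic values as seen when the first four bytes are read little-endian.
MACHO_MAGICS = {
    0xFEEDFACE,  # 32-bit, little-endian file
    0xFEEDFACF,  # 64-bit, little-endian file
    0xCEFAEDFE,  # 32-bit, big-endian file
    0xCFFAEDFE,  # 64-bit, big-endian file
}


def guess_format(data):
    """Return 'elf', 'pe', 'mach_o' or None for the given file content."""
    data = bytes(data)
    if data[:4] == ELF_MAGIC:
        # OAT files are ELF files, so they are reported as 'elf' as well.
        return "elf"
    if data[:2] == PE_MAGIC:
        return "pe"
    if len(data) >= 4 and struct.unpack("<I", data[:4])[0] in MACHO_MAGICS:
        return "mach_o"
    return None
```

With the format identified, opening the file with the matching filebytes class looks like this: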
from filebytes.elf import *
fileName = 'afile'
elffile = ELF(fileName)
# It is also possible to use bytes of a file.
with open(fileName, 'rb') as f:
    data = f.read()
elffile = ELF('any filename', data)  # The filename is not used
Data Access
When the file is opened, its content is parsed and you can access the data via several properties. The file's data is generally held in containers which have several attributes. The number and names of these attributes depend on the header the container represents. If you want to access the structure which was used to parse the data, you can use the attribute 'header'. All container types have at least this attribute.
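To make the container idea concrete, here is a small self-contained sketch built on `ctypes`. It is not the filebytes implementation: `DemoHeader`, `DemoContainer` and `parse_demo` are hypothetical stand-ins, and the two-field header layout is invented for illustration. It only shows the pattern of a parsed header structure stored under 'header' next to the bytes that the header points to.

```python
import ctypes


class DemoHeader(ctypes.LittleEndianStructure):
    # Invented layout: where the payload starts and how long it is.
    _fields_ = [("offset", ctypes.c_uint32),
                ("size", ctypes.c_uint32)]


class DemoContainer(object):
    """Plain attribute holder, standing in for a filebytes container."""

    def __init__(self, **attributes):
        for key, value in attributes.items():
            setattr(self, key, value)


def parse_demo(data):
    """Parse the header and copy the region it points to."""
    header_size = ctypes.sizeof(DemoHeader)
    header = DemoHeader.from_buffer_copy(bytes(data[:header_size]))
    payload = bytearray(data[header.offset:header.offset + header.size])
    return DemoContainer(header=header, bytes=payload)


# Header says: payload starts at offset 8 and is 4 bytes long.
blob = b"\x08\x00\x00\x00\x04\x00\x00\x00ABCD"
demo = parse_demo(blob)
```

With a real parsed file, the same attributes are reached like this: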
container = file.aHeaderContainer
print(container.header.aHeaderField)
If the header structure points to another region in the file, you can access this data via the 'bytes' attribute. This attribute holds a bytearray containing the bytes the header structure points to.
container = file.aHeaderContainer
data = container.bytes
All file types have the attributes 'imageBase' and 'entryPoint'.
ib = file.imageBase
ep = file.entryPoint
ELF
To access the data, the ELF class provides the following properties:
elfheader = elffile.elfHeader     # The ELF header
                                  # elf structure: EHDR
segments = elffile.segments       # return a list of segments
                                  # elf structure: PHDR
segments = elffile.programHeaders # same as segments
sections = elffile.sections       # return a list of all sections in the file
                                  # elf structure: SHDR
An example of how to access the .got section:
ls = ELF('/bin/ls')
# Take the first section named '.got' (raises IndexError if there is none)
got = [s for s in ls.sections if s.name == '.got'][0]
print('Name: %s' % got.name)
print('Offset: 0x%x' % got.header.sh_offset)
print('Address: 0x%x' % got.header.sh_addr)
print('Size: 0x%x' % got.header.sh_size)
The following containers are used for those properties:
class EhdrData(Container):
    """
    header = ElfHeader (EHDR)
    """
class ShdrData(Container):
    """
    header = SectionHeader (SHDR)
    name = string (section name)
    bytes = bytearray (section bytes)
    raw = c_ubyte_array
    .dynamic
    content = List of DYN entries
    .rel & .rela
    relocations = list of REL or RELA
    .dynsym & .symtab
    symbols = list of SYM entries
    """
class PhdrData(Container):
    """
    type = Program Header Type (PHDR)
    header = ProgramHeader
    bytes = bytearray (segment bytes)
    raw = c_ubyte_array
    vaddr = virtual address (int)
    offset = offset
    """
class SymbolData(Container):
    """
    header = Symbol
    name = string
    type = int
    bind = bind
    """
class RelocationData(Container):
    """
    header = Relocation
    symbol = SymbolData
    type = type of relocation
    """
class DynamicData(Container):
    """
    header = DYN
    tag = value of class DT
    """
PE
To access the data, the PE class provides the following properties:
idh = pefile.imageDosHeader 
inh = pefile.imageNtHeaders 
sections = pefile.sections            # a list of all sections
dataDirectory = pefile.dataDirectory  # a list with the data that the DataDirectory entries in the OptionalHeader point to
# example
imports = pefile.dataDirectory[ImageDirectoryEntry.IMPORT]
firstThunk = imports[0].header.OriginalFirstThunk
print(imports[0].dllName)
print(len(imports[0].importNameTable))
The following containers are used for those properties:
class ImageDosHeaderData(Container):
    """
    header = IMAGE_DOS_HEADER
    """
class ImageNtHeaderData(Container):
    """
    header = IMAGE_NT_HEADERS
    """
class SectionData(Container):
    """
    header = IMAGE_SECTION_HEADER
    name = name of the section (str)
    bytes = bytes of section (bytearray)
    raw = bytes of section (c_ubyte_array)
    """
class ImportDescriptorData(Container):
    """
    header = IMAGE_IMPORT_DESCRIPTOR
    dllName = name of dll (str)
    importNameTable = list of IMAGE_THUNK_DATA
    importAddressTable = list of IMAGE_THUNK_DATA
    """
class ImportByNameData(Container):
    """
    header = IMAGE_IMPORT_BY_NAME
    name = name of function (str)
    """
class ThunkData(Container):
    """
    header = IMAGE_THUNK_DATA
    rva = relative virtual address of thunk
    ordinal = None | Ordinal
    importByName = None | ImportByNameData
    """
class ExportDirectoryData(Container):
    """
    header = IMAGE_EXPORT_DIRECTORY
    name = name of dll (str)
    functions = list of FunctionData
    """
class FunctionData(Container):
    """
    name = name of the function (str)
    ordinal = ordinal (int)
    rva = relative virtual address of function (int)
    """
MachO
To access the data the MachO class provides two properties:
mach_header = macho.machHeader
load_commands = macho.loadCommands
The following containers are used for those properties:
class MachHeaderData(Container):
    """
    header = MachHeader
    """
class LoadCommandData(Container):
    """
    header = LoaderCommand
    bytes = bytes of the command (bytearray)
    raw = bytes of the command (c_ubyte_array)
    SegmentCommand
    sections = list of SectionData
    UuidCommand
    uuid = uuid (str)
    TwoLevelHintsCommand
    twoLevelHints = list of TwoLevelHintData
    DylibCommand
    name = name of dylib (str)
    DylinkerCommand
    name = name of dynamic linker
    """
class SectionData(Container):
    """
    header = Section
    """
class TwoLevelHintData(Container):
    """
    header = TwoLevelHint
    """
OAT
Since an OAT file is an ELF file, you can use all the properties of ELF. Additionally, the OAT class provides two OAT-specific properties:
- oatHeader
- oatDexHeader
oat_header = oat.oatHeader
oat_dex_header = oat.oatDexHeader  # a list of OatDexFileHeaderData
The following containers are used for those properties:
class OatHeaderData(Container):
    """
    header = OatHeader
    keyValueStoreRaw = c_ubyte_array
    keyValueStore = dict 
    """
class OatDexFileHeaderData(Container):
    """
    header = OatDexFileHeader w/o dexFileLocationSize and dexFileLocation
    name = name of dex file (str)
    classOffsets = c_uint_array
    dexHeader = DexHeader
    dexRaw = c_ubyte_array
    dexBytes = bytearray
    oatClasses = list of OatClasses
    """
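As an illustration of how the OAT containers might be used, the sketch below collects the embedded dex files by name. It relies only on the 'oatDexHeader' property and the 'name' and 'dexBytes' attributes listed above. The helper name `collect_dex_files` is hypothetical, and the stand-in objects exist only so that the sketch runs without filebytes or a real OAT file. With filebytes, the argument would presumably be the object returned by the OAT class.

```python
from types import SimpleNamespace


def collect_dex_files(oat_file):
    """Map each embedded dex file name to a copy of its bytes."""
    return {entry.name: bytes(entry.dexBytes)
            for entry in oat_file.oatDexHeader}


# Stand-in for a parsed OAT file; the name and content are made up.
fake_oat = SimpleNamespace(oatDexHeader=[
    SimpleNamespace(name="/system/app/Demo.apk",
                    dexBytes=bytearray(b"dex\n035\x00")),
])

dex_files = collect_dex_files(fake_oat)
for name, content in dex_files.items():
    print("%s: %d bytes" % (name, len(content)))
```

Returning copies as `bytes` keeps the result independent of the parsed file object, at the cost of duplicating the data in memory.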
