Goblin is an impish, cross-platform binary parsing crate, written in Rust. It supports:
- An ELF32/64 parser, and raw C structs
- A 32/64-bit, zero-copy, endian aware, Mach-o parser, and raw C structs
- A PE32/PE32+ (64-bit) parser, and raw C structs
- A Unix archive parser and loader
Usage
- Goblin requires rustc 1.31.1.
- Add to your Cargo.toml:
  [dependencies]
  goblin = "0.1"
Features
- Awesome crate name
- zero-copy, cross-platform, endian-aware, ELF64/32 implementation – wow!
- zero-copy, cross-platform, endian-aware, 32/64 bit Mach-o parser – zoiks!
- PE 32/64-bit parser – bing!
- a Unix and BSD style archive parser (latter courtesy of @willglynn) – huzzah!
- many cfg options – it will make your head spin, and make you angry when reading the source!
- fuzzed – “I am happy to report that goblin withstood 100 million fuzzing runs, 1 million runs each for seed 1~100.” – @sanxiyn
- tests
libgoblin aims to be your one-stop shop for binary parsing, loading, and analysis.
Goblin primarily supports the following important use cases:
- Core, std-free #[repr(C)] structs, tiny compile time, 32/64 (or both) at your leisure.
- Type punning. Define a function once on a type, but have it work on 32 or 64-bit variants – without really changing anything, and no macros! See examples/automagic.rs for a basic example.
- std mode. This throws in read and write impls via Pread and Pwrite, reading from file, convenience allocations, extra methods, etc. This is for clients who can allocate and want to read binaries off disk.
- Endian_fd. A truly terrible name. This is for binary analysis, like in panopticon or falcon, which need to read binaries of foreign endianness, or as a basis for constructing cross-platform, foreign-architecture binutils. cargo-sym and bingrep are simple examples of this, but the sky is the limit.
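
The type-punning use case can be illustrated without the crate itself. What follows is a minimal sketch under stated assumptions, not goblin's actual definitions: the elf32 and elf64 modules below are simplified stand-ins with a cut-down Header type, and entry_point is a made-up helper. The point is only that when two modules export the same names, aliasing one of them as elf lets a function be written once.

```rust
// Hypothetical stand-ins for per-width struct definitions (NOT goblin's types).
// Both modules export the same names; only the field widths differ.
pub mod elf32 {
    #[repr(C)]
    #[derive(Debug, Clone, Copy)]
    pub struct Header {
        pub e_entry: u32,
        pub e_phnum: u16,
    }
}

pub mod elf64 {
    #[repr(C)]
    #[derive(Debug, Clone, Copy)]
    pub struct Header {
        pub e_entry: u64,
        pub e_phnum: u16,
    }
}

// Pick one module and call it `elf`. Swapping this alias is the only change
// needed to retarget the code below; no macros are involved.
#[cfg(target_pointer_width = "64")]
pub use self::elf64 as elf;
#[cfg(not(target_pointer_width = "64"))]
pub use self::elf32 as elf;

/// Written once against `elf::Header`; works for whichever width is aliased.
pub fn entry_point(header: &elf::Header) -> u64 {
    u64::from(header.e_entry)
}
```

Calling entry_point(&elf::Header { e_entry: 0x1000, e_phnum: 2 }) returns 0x1000 whichever module the alias selects, because the widening conversion is the only width-dependent step.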
Here are some things you could do with this crate (or help to implement so they could be done):
- Write a compiler and use it to generate binaries (all the raw C structs have Pwrite derived).
- Write a binary analysis tool which loads, parses, and analyzes various binary formats, e.g., panopticon or falcon.
- Write a semi-functioning dynamic linker.
- Write a kernel and load binaries using no_std cfg. I.e., it is essentially just struct and const defs (like a C header) – no fd, no output, no std.
- Write a bin2json tool, because why shouldn’t binary formats be in JSON?
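
A tool that handles several binary formats usually has to decide which format it is looking at before it can parse anything. The sketch below shows one way to do that from the leading magic bytes. It is an illustration only and does not use goblin: Format and sniff are hypothetical names, fat (multi-architecture) Mach-O files are left out, and a real PE check would go on to verify the PE signature behind the DOS header.

```rust
/// Container formats this sketch can recognise (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Format {
    Elf,
    MachO,
    Pe,
    Archive,
    Unknown,
}

// Thin Mach-O magic numbers as they appear on disk.
const MACHO_MAGICS: [[u8; 4]; 4] = [
    [0xfe, 0xed, 0xfa, 0xce], // 32-bit, big-endian file
    [0xce, 0xfa, 0xed, 0xfe], // 32-bit, little-endian file
    [0xfe, 0xed, 0xfa, 0xcf], // 64-bit, big-endian file
    [0xcf, 0xfa, 0xed, 0xfe], // 64-bit, little-endian file
];

/// Guess the container format from the first bytes of a file.
pub fn sniff(bytes: &[u8]) -> Format {
    // Unix archives start with an 8-byte ASCII signature.
    if bytes.starts_with(b"!<arch>\n") {
        return Format::Archive;
    }
    // ELF: 0x7f followed by "ELF".
    if bytes.starts_with(b"\x7fELF") {
        return Format::Elf;
    }
    // PE images begin with a DOS header whose first two bytes are "MZ".
    // This alone cannot tell a PE image from a plain DOS executable.
    if bytes.starts_with(b"MZ") {
        return Format::Pe;
    }
    if MACHO_MAGICS.iter().any(|magic| bytes.starts_with(magic)) {
        return Format::MachO;
    }
    Format::Unknown
}
```

Inputs shorter than a signature simply fall through to Format::Unknown, so the function never indexes out of bounds.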
libgoblin is designed to be massively configurable. The current flags are:
- elf64 – 64-bit elf binaries, repr(C) struct defs
- elf32 – 32-bit elf binaries, repr(C) struct defs
- mach64 – 64-bit mach-o repr(C) struct defs
- mach32 – 32-bit mach-o repr(C) struct defs
- pe32 – 32-bit PE repr(C) struct defs
- pe64 – 64-bit PE repr(C) struct defs
- archive – a Unix Archive parser
- endian_fd – parses according to the endianness in the binary
- std – standard library support; it is a flag so that it can be switched off for no_std environments
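
The endian_fd flag above is about reading a binary in the byte order the binary itself declares, rather than the host's. As a rough illustration of that idea for ELF only, the sketch below reads the e_machine field using the byte order recorded in the file's identification bytes. It does not use goblin, and Endian, elf_endianness, read_u16, and elf_machine are hypothetical names chosen for this example.

```rust
/// Byte order of a binary (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Endian {
    Little,
    Big,
}

/// Byte order declared by an ELF file: e_ident[EI_DATA], at byte offset 5.
pub fn elf_endianness(bytes: &[u8]) -> Option<Endian> {
    if !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    match *bytes.get(5)? {
        1 => Some(Endian::Little), // ELFDATA2LSB
        2 => Some(Endian::Big),    // ELFDATA2MSB
        _ => None,
    }
}

/// Read a u16 at `offset`, honouring the given byte order.
/// Returns None if the read would run past the end of the buffer.
pub fn read_u16(bytes: &[u8], offset: usize, endian: Endian) -> Option<u16> {
    let end = offset.checked_add(2)?;
    let raw = bytes.get(offset..end)?;
    let pair = [raw[0], raw[1]];
    Some(match endian {
        Endian::Little => u16::from_le_bytes(pair),
        Endian::Big => u16::from_be_bytes(pair),
    })
}

/// e_machine sits at byte offset 18 in both the 32-bit and 64-bit ELF headers.
pub fn elf_machine(bytes: &[u8]) -> Option<u16> {
    let endian = elf_endianness(bytes)?;
    read_u16(bytes, 18, endian)
}
```

Because the byte order is taken from the file, a big-endian PowerPC object and a little-endian x86-64 object both decode correctly on any host.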
Modules
| Module | Description |
| --- | --- |
| archive | Implements a simple parser and extractor for a Unix Archive. |
| container | Binary container size information and byte-order context |
| elf | The generic ELF module, which gives access to ELF constants and other helper functions, which are independent of ELF bithood. Also defines an Elf struct which implements a unified parser that returns a wrapped Elf64 or Elf32 binary. |
| elf32 | The ELF 32-bit struct definitions and associated values, re-exported for easy “type-punning” |
| elf64 | The ELF 64-bit struct definitions and associated values, re-exported for easy “type-punning” |
| error | A custom Goblin error |
| mach | The Mach-o, mostly zero-copy, binary format parser and raw struct definitions |
| pe | A PE32 and PE32+ parser |
| strtab | A byte-offset based string table. Commonly used in ELF binaries, Unix archives, and even PE binaries. |
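
To make the idea of a byte-offset based string table concrete, here is a minimal sketch, assuming NUL-terminated entries as in ELF. It is not the strtab module's API: the Strtab type and its new and get methods below are hypothetical, and other formats may delimit entries differently.

```rust
/// A hypothetical byte-offset string table over a borrowed buffer.
/// Names are looked up by the offset of their first byte, not by index.
pub struct Strtab<'a> {
    bytes: &'a [u8],
}

impl<'a> Strtab<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Strtab { bytes }
    }

    /// Return the NUL-terminated string that starts at `offset`.
    /// Yields None if the offset is out of range, the entry is not
    /// terminated, or the bytes are not valid UTF-8.
    pub fn get(&self, offset: usize) -> Option<&'a str> {
        let tail = self.bytes.get(offset..)?;
        let len = tail.iter().position(|&b| b == 0)?;
        std::str::from_utf8(&tail[..len]).ok()
    }
}
```

No strings are copied: get hands back a slice of the original buffer, which is why an offset into the middle of an entry yields that entry's suffix.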