Goblin: An Impish, Cross-Platform Binary Parsing Crate, Written in Rust

Goblin is an impish, cross-platform binary parsing crate, written in Rust. It supports:

  • An ELF32/64 parser, and raw C structs
  • A 32/64-bit, zero-copy, endian aware, Mach-o parser, and raw C structs
  • A PE32/PE32+ (64-bit) parser, and raw C structs
  • A Unix archive parser and loader
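Each of the formats listed above can be recognized from its leading bytes. The following is a minimal, standard-library-only sketch of that idea; the `Format` enum and `sniff` function are hypothetical names chosen for illustration and are not goblin's API. Note that `MZ` only identifies the DOS stub of a PE file; a full check would also follow the offset stored at 0x3c to the `PE\0\0` signature.

```rust
/// Container formats recognizable from their leading bytes.
/// (Hypothetical type for illustration, not part of goblin.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Format {
    Elf,
    MachO,
    Pe,
    Archive,
    Unknown,
}

/// Guess the container format from the first bytes of a file.
pub fn sniff(bytes: &[u8]) -> Format {
    if bytes.starts_with(b"\x7fELF") {
        // ELF magic: 0x7f 'E' 'L' 'F'
        Format::Elf
    } else if bytes.starts_with(b"!<arch>\n") {
        // Unix archive global header
        Format::Archive
    } else if bytes.starts_with(b"MZ") {
        // DOS stub that precedes a PE header
        Format::Pe
    } else if bytes.len() >= 4 {
        let magic = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        match magic {
            // Mach-O 32/64-bit magic, in either byte order
            0xfeed_face | 0xfeed_facf | 0xcefa_edfe | 0xcffa_edfe => Format::MachO,
            _ => Format::Unknown,
        }
    } else {
        Format::Unknown
    }
}
```

A caller would typically read the first few bytes of a file, pass them to `sniff`, and then dispatch to the matching parser.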

Usage

  • Goblin requires rustc 1.31.1 or later.
  • Add it to your Cargo.toml:

[dependencies]
goblin = "0.1"

Features

  • Awesome crate name
  • zero-copy, cross-platform, endian-aware, ELF64/32 implementation – wow!
  • zero-copy, cross-platform, endian-aware, 32/64 bit Mach-o parser – zoiks!
  • PE 32/64-bit parser – bing!
  • a Unix and BSD style archive parser (latter courtesy of @willglynn) – huzzah!
  • many cfg options – it will make your head spin, and make you angry when reading the source!
  • fuzzed – “I am happy to report that goblin withstood 100 million fuzzing runs, 1 million runs each for seed 1~100.” – @sanxiyn
  • tests

libgoblin aims to be your one-stop shop for binary parsing, loading, and analysis.

Use-Cases

Goblin primarily supports the following important use cases:

  • Core, std-free #[repr(C)] structs, tiny compile time, 32/64 (or both) at your leisure.
  • Type punning. Define a function once on a type, but have it work on 32 or 64-bit variants – without really changing anything, and no macros! See examples/automagic.rs for a basic example.
  • std mode. This throws in read and write impls via Pread and Pwrite, reading from file, convenience allocations, extra methods, etc. This is for clients who can allocate and want to read binaries off disk.
  • Endian_fd. A truly terrible name. This is for binary analysis tools like panopticon or falcon, which need to read binaries of foreign endianness, or as a basis for constructing cross-platform, foreign-architecture binutils. cargo-sym and bingrep are simple examples of this, but the sky is the limit.
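The type-punning use case can be sketched with nothing but modules and a `use` alias. This is a simplified illustration under stated assumptions: the two `Header` structs below are cut-down stand-ins with hypothetical contents, not goblin's real definitions. Two modules export the same names with different field widths, and code written against the alias compiles unchanged for either one.

```rust
/// Cut-down 32-bit definitions (illustrative stand-ins only).
#[allow(dead_code)]
mod elf32 {
    pub struct Header {
        pub e_entry: u32,
        pub e_phnum: u16,
    }
    /// Size of a real ELF32 file header in bytes.
    pub const SIZEOF_EHDR: usize = 52;
}

/// Cut-down 64-bit definitions with the same names.
#[allow(dead_code)]
mod elf64 {
    pub struct Header {
        pub e_entry: u64,
        pub e_phnum: u16,
    }
    /// Size of a real ELF64 file header in bytes.
    pub const SIZEOF_EHDR: usize = 64;
}

// Change this one line (or gate it behind a cfg) to retarget everything below.
use elf64 as elf;

/// Written once against `elf::Header`; works for whichever module the alias names.
fn describe(h: &elf::Header) -> String {
    format!(
        "entry=0x{:x} phnum={} ehdr_size={}",
        h.e_entry,
        h.e_phnum,
        elf::SIZEOF_EHDR
    )
}
```

Swapping the alias to `elf32` changes the width of `e_entry` and the reported header size without touching `describe`.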

Here are some things you could do with this crate (or help to implement so they could be done):

  • Write a compiler and use it to generate binaries (all the raw C structs have Pwrite derived).
  • Write a binary analysis tool which loads, parses, and analyzes various binary formats, e.g., panopticon or falcon.
  • Write a semi-functioning dynamic linker.
  • Write a kernel and load binaries using no_std cfg. I.e., it is essentially just struct and const defs (like a C header) – no fd, no output, no std.
  • Write a bin2json tool, because why shouldn’t binary formats be in JSON?
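To make the "struct and const defs, like a C header" idea concrete, here is a sketch of an ELF64 file header that uses only `core`. The field layout follows the ELF specification; the Rust names (`Elf64Header`, `SIZEOF_EHDR`, `ET_EXEC`, `ET_DYN`) are chosen for this illustration and may not match goblin's own identifiers.

```rust
use core::mem::size_of;

/// ELF64 file header, laid out exactly as in the C definition.
#[repr(C)]
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Elf64Header {
    pub e_ident: [u8; 16], // magic, class, data encoding, version, padding
    pub e_type: u16,       // object file type
    pub e_machine: u16,    // target architecture
    pub e_version: u32,
    pub e_entry: u64,      // entry point virtual address
    pub e_phoff: u64,      // program header table file offset
    pub e_shoff: u64,      // section header table file offset
    pub e_flags: u32,
    pub e_ehsize: u16,
    pub e_phentsize: u16,
    pub e_phnum: u16,
    pub e_shentsize: u16,
    pub e_shnum: u16,
    pub e_shstrndx: u16,
}

/// Size of the header in bytes, computed at compile time.
pub const SIZEOF_EHDR: usize = size_of::<Elf64Header>();

/// Executable file.
pub const ET_EXEC: u16 = 2;
/// Shared object file.
pub const ET_DYN: u16 = 3;
```

Because nothing here allocates or touches a file descriptor, definitions of this shape can be used from a kernel or other no_std environment.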

CFGS

libgoblin is designed to be massively configurable. The current flags are:

  • elf64 – 64-bit elf binaries, repr(C) struct defs
  • elf32 – 32-bit elf binaries, repr(C) struct defs
  • mach64 – 64-bit mach-o repr(C) struct defs
  • mach32 – 32-bit mach-o repr(C) struct defs
  • pe32 – 32-bit PE repr(C) struct defs
  • pe64 – 64-bit PE repr(C) struct defs
  • archive – a Unix Archive parser
  • endian_fd – parses according to the endianness in the binary
  • std – enables the standard-library conveniences; disable it to build for no_std environments
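What "parses according to the endianness in the binary" means can be sketched for ELF, where byte 5 of the header (`e_ident[EI_DATA]`) declares the byte order: 1 for little-endian, 2 for big-endian. The helper names below (`Endian`, `elf_endianness`, `read_u16`, `elf_machine`) are hypothetical and use only the standard library; they are not goblin's API.

```rust
/// Byte order declared by a binary. (Illustrative type, not goblin's.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Endian {
    Little,
    Big,
}

/// Read the byte order from e_ident[EI_DATA] (offset 5) of an ELF header.
pub fn elf_endianness(bytes: &[u8]) -> Option<Endian> {
    if !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    match *bytes.get(5)? {
        1 => Some(Endian::Little),
        2 => Some(Endian::Big),
        _ => None,
    }
}

/// Read a u16 at `offset`, honoring the given byte order.
pub fn read_u16(bytes: &[u8], offset: usize, endian: Endian) -> Option<u16> {
    let end = offset.checked_add(2)?;
    let s = bytes.get(offset..end)?;
    let raw = [s[0], s[1]];
    Some(match endian {
        Endian::Little => u16::from_le_bytes(raw),
        Endian::Big => u16::from_be_bytes(raw),
    })
}

/// Read e_machine (offset 18) using the byte order the file itself declares.
pub fn elf_machine(bytes: &[u8]) -> Option<u16> {
    let endian = elf_endianness(bytes)?;
    read_u16(bytes, 18, endian)
}
```

The point of the design is that the host's byte order never enters the picture: every multi-byte read is driven by what the file declares, which is what allows a little-endian host to analyze a big-endian binary.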

Modules

  • archive – Implements a simple parser and extractor for a Unix Archive.
  • container – Binary container size information and byte-order context
  • elf – The generic ELF module, which gives access to ELF constants and other helper functions, which are independent of ELF bithood. Also defines an Elf struct which implements a unified parser that returns a wrapped Elf64 or Elf32 binary.
  • elf32 – The ELF 32-bit struct definitions and associated values, re-exported for easy “type-punning”
  • elf64 – The ELF 64-bit struct definitions and associated values, re-exported for easy “type-punning”
  • error – A custom Goblin error
  • mach – The Mach-o, mostly zero-copy, binary format parser and raw struct definitions
  • pe – A PE32 and PE32+ parser
  • strtab – A byte-offset based string table. Commonly used in ELF binaries, Unix archives, and even PE binaries.
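A byte-offset based string table is a blob of NUL-terminated strings, where other structures refer to a name by the offset of its first byte. The lookup can be sketched as follows; `strtab_get` is a hypothetical helper written against the standard library, not the interface of the strtab module.

```rust
/// Return the NUL-terminated string that starts at `offset` in `table`,
/// or None if the offset is out of range, the string is unterminated,
/// or the bytes are not valid UTF-8.
pub fn strtab_get(table: &[u8], offset: usize) -> Option<&str> {
    let tail = table.get(offset..)?;
    let end = tail.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&tail[..end]).ok()
}
```

For the table `b"\0.text\0.data\0"`, offset 1 yields `.text` and offset 7 yields `.data`. An offset may also point into the middle of an entry, so offset 3 yields `ext`, which is how such tables can share suffixes between names.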