Apache Iceberg™ Rust is a native Rust implementation of the Apache Iceberg table format, designed for managing large-scale datasets in data lake environments.
It lets Rust-based applications access and manipulate Iceberg tables directly.
Components of Iceberg-Rust
The project comprises several modular components, each serving a distinct purpose:
- iceberg: Core library for interacting with Iceberg tables.
- iceberg-datafusion: Integration with DataFusion, enabling query execution.
- iceberg-catalog-glue: AWS Glue catalog support.
- iceberg-catalog-hms: Hive Metastore catalog integration.
- iceberg-catalog-memory: In-memory catalog for lightweight use cases.
- iceberg-catalog-rest: REST-based catalog for distributed environments.
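One way to read the list above is that each catalog crate plugs a different backend into a common catalog interface, so application code does not need to know which one is in use. The following is a minimal sketch of that idea using only the standard library. The `Catalog` trait, `InMemoryCatalog`, and `describe` here are hypothetical, simplified names for illustration and are not the actual API of the `iceberg` crate.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified catalog interface (NOT the real `iceberg` API).
/// A catalog maps (namespace, table name) to the location of table metadata.
pub trait Catalog {
    /// Register a table; fails if the table already exists.
    fn register_table(
        &mut self,
        namespace: &str,
        table: &str,
        metadata_location: &str,
    ) -> Result<(), String>;

    /// Return the metadata location of a table, if it is registered.
    fn metadata_location(&self, namespace: &str, table: &str) -> Option<String>;

    /// List the table names in a namespace, sorted alphabetically.
    fn list_tables(&self, namespace: &str) -> Vec<String>;
}

/// Illustrative in-memory backend, loosely analogous in spirit to an
/// in-memory catalog for lightweight use cases.
#[derive(Default)]
pub struct InMemoryCatalog {
    tables: HashMap<(String, String), String>,
}

impl Catalog for InMemoryCatalog {
    fn register_table(
        &mut self,
        namespace: &str,
        table: &str,
        metadata_location: &str,
    ) -> Result<(), String> {
        let key = (namespace.to_string(), table.to_string());
        if self.tables.contains_key(&key) {
            return Err(format!("table {namespace}.{table} already exists"));
        }
        self.tables.insert(key, metadata_location.to_string());
        Ok(())
    }

    fn metadata_location(&self, namespace: &str, table: &str) -> Option<String> {
        self.tables
            .get(&(namespace.to_string(), table.to_string()))
            .cloned()
    }

    fn list_tables(&self, namespace: &str) -> Vec<String> {
        let mut names: Vec<String> = self
            .tables
            .keys()
            .filter(|key| key.0 == namespace)
            .map(|key| key.1.clone())
            .collect();
        names.sort();
        names
    }
}

/// Code written against the trait works with any backend that implements it.
pub fn describe(catalog: &dyn Catalog, namespace: &str) -> String {
    format!("{namespace}: {}", catalog.list_tables(namespace).join(", "))
}
```

Because `describe` takes `&dyn Catalog`, swapping the in-memory backend for a Glue-, Hive-, or REST-style backend would not change the calling code in this sketch. The real crates are asynchronous and considerably richer, so consult their documentation for the actual types and methods.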
Key Features
- Rust Compatibility: Built and tested with stable Rust (minimum supported version 1.77.1). Unstable Rust is used for development tools like clippy and rustfmt, ensuring downstream users are unaffected.
- Layered Architecture:
  - FileIO Abstraction: Powered by Apache OpenDAL, supporting storage backends like Amazon S3, Azure Blob, Google Cloud Storage, and local file systems.
  - Data Format Support: Integration with formats such as Parquet and Avro.
  - High-Level APIs: Includes table readers/writers and support for SQL-like operations.
- Extensibility:
  - Future plans include WebAssembly bindings for browser-based table access and C bindings for integration with tools like DuckDB.
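The layered architecture described above can be pictured as three levels: a storage backend at the bottom, a file-IO facade in the middle, and readers/writers on top. The sketch below illustrates that layering with the standard library only. All names (`Storage`, `MemoryStorage`, `FileIo`, `write_rows`, `read_rows`) are hypothetical and chosen for illustration; they are not the `iceberg` crate's `FileIO` API, and the newline-separated text format stands in for real formats such as Parquet or Avro.

```rust
use std::collections::HashMap;

/// Bottom layer (hypothetical): a backend that stores whole objects by path.
pub trait Storage {
    fn write(&mut self, path: &str, bytes: Vec<u8>);
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory backend, standing in for S3, Azure Blob, GCS, or a local file system.
#[derive(Default)]
pub struct MemoryStorage {
    objects: HashMap<String, Vec<u8>>,
}

impl Storage for MemoryStorage {
    fn write(&mut self, path: &str, bytes: Vec<u8>) {
        self.objects.insert(path.to_string(), bytes);
    }

    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Middle layer (hypothetical): a facade that hides which backend is in use.
pub struct FileIo {
    storage: Box<dyn Storage>,
}

impl FileIo {
    pub fn new(storage: Box<dyn Storage>) -> Self {
        FileIo { storage }
    }

    pub fn put(&mut self, path: &str, bytes: &[u8]) {
        self.storage.write(path, bytes.to_vec());
    }

    pub fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.storage.read(path)
    }
}

/// Top layer (toy writer): serialize rows as newline-separated text.
pub fn write_rows(io: &mut FileIo, path: &str, rows: &[&str]) {
    io.put(path, rows.join("\n").as_bytes());
}

/// Top layer (toy reader): read rows back, or None if the file is
/// missing or is not valid UTF-8.
pub fn read_rows(io: &FileIo, path: &str) -> Option<Vec<String>> {
    let bytes = io.get(path)?;
    let text = String::from_utf8(bytes).ok()?;
    Some(text.lines().map(String::from).collect())
}
```

In this sketch the reader and writer functions only see `FileIo`, so replacing `MemoryStorage` with another `Storage` implementation would leave them unchanged. That separation is the point of a file-IO abstraction. How iceberg-rust and Apache OpenDAL actually divide these responsibilities is documented in their own API references.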
Iceberg-Rust is an open-source project under the Apache Software Foundation (ASF). Contributions are encouraged through:
- Submitting issues or feature requests.
- Participating in discussions via mailing lists or Slack (#rust channel).
- Following the Contributing Guide.
Several prominent projects leverage Iceberg-Rust:
- Databend: A cloud-native data warehouse integrating Iceberg tables.
- Lakekeeper: REST catalog with data access controls.
- RisingWave: Real-time event streaming database.
Iceberg-Rust is licensed under the Apache License 2.0.