Spiral Tables
Spiral Tables is a powerful and flexible way to store, analyze, and query massive and/or multimodal datasets. The data model will feel familiar to users of SQL- or DataFrame-style systems, yet is designed to be more flexible, more powerful, and more useful in the context of modern data processing. Tables are stored in, and queried directly from, object storage.
Key Features
- Sufficiently Schemaless: Spiral supports complex data with nested relationships using column groups. Tables are sparse and support column appends without rewriting existing rows.
- High-throughput Scanning: Saturate network bandwidth with optimized, high-throughput scans.
- Scalable Storage with Flexible Cost-Performance: Built on log-structured merge (LSM) trees and a lakehouse architecture for efficient data storage and retrieval, with an acceleration layer for when performance is critical.
- Python-centric Data Access Layer: Retrieve data in powerful columnar or row-based formats like PyArrow, Pandas, Polars, Dask, PyTorch and more with intuitive projection and filtering syntax.
- Cell Push-down: Keep large values like images, audio, and video seamlessly integrated with your tabular data. Filter rows and read only the parts you need with cell-level filtering. No need for separate storage systems.
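To make the column-group idea concrete, here is a minimal in-memory sketch of a sparse table whose column groups are stored independently. It is a toy model for illustration only, not the Spiral API: `ToyTable`, `write`, `scan`, and the `"group.column"` naming are all hypothetical. It shows how a new column group can be added for some rows without touching existing data, and how a scan can project a few columns and filter rows.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, Any]


@dataclass
class ToyTable:
    """Hypothetical toy model: one independent store per column group."""

    # group name -> row key -> {column name -> value}
    groups: Dict[str, Dict[Any, Row]] = field(default_factory=dict)

    def write(self, group: str, key: Any, values: Row) -> None:
        # Writing to one column group never touches data held in other groups.
        self.groups.setdefault(group, {}).setdefault(key, {}).update(values)

    def scan(
        self,
        columns: List[str],
        where: Optional[Callable[[Row], bool]] = None,
    ) -> Iterator[Tuple[Any, Row]]:
        # Columns are addressed as "group.column"; missing cells come back as None.
        keys = sorted({k for store in self.groups.values() for k in store})
        for key in keys:
            row: Row = {}
            for col in columns:
                group, name = col.split(".", 1)
                row[col] = self.groups.get(group, {}).get(key, {}).get(name)
            # The filter sees only the projected columns.
            if where is None or where(row):
                yield key, row


table = ToyTable()
table.write("meta", 1, {"label": "cat"})
table.write("meta", 2, {"label": "dog"})
# Later, append a new column group for a subset of rows; "meta" is not rewritten.
table.write("image", 1, {"width": 640})

all_rows = list(table.scan(["meta.label", "image.width"]))
dogs = list(table.scan(["meta.label"], where=lambda r: r["meta.label"] == "dog"))
```

In this sketch, row 2 has no `image` data, so its `image.width` cell is simply empty. A real system would persist each group in columnar files on object storage and push the projection and filter down to the storage layer rather than evaluating them in Python.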
Spiral Tables is built with Vortex, our state-of-the-art file format and columnar data toolkit. Vortex’s random access performance advancements enable search and indexing features (coming soon).
Dive Deeper
- PySpiral: Learn how to interact with Spiral Tables using the Python client library.
- Data Model: Understand Spiral’s flexible and powerful data model.
- Best Practices: Tips and tricks for optimizing your use of Spiral Tables.