LingoDB - Revolutionizing Data Processing with Compiler Technology
LingoDB is a data processing system that leverages compiler technology to achieve a high degree of flexibility and extensibility without sacrificing performance. Thanks to declarative sub-operators, it supports a wide range of data-processing workflows beyond relational SQL queries. Furthermore, LingoDB can perform cross-domain optimization by interleaving optimization passes of different domains, and its flexibility enables sustainable support for heterogeneous hardware.
Complex SQL
LingoDB runs complex analytical SQL queries, including all queries of the SSB, TPC-H, TPC-DS, and JOB benchmarks.
Query Optimization
LingoDB implements state-of-the-art query optimizations as compiler passes, which allows for composing custom optimization pipelines, e.g., for cross-domain optimization.
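To make the idea of optimizations as composable passes concrete, here is a minimal sketch in Python. It is not LingoDB's API: the plan nodes (`Scan`, `Filter`, `Project`), the two passes, and `run_pipeline` are all hypothetical names invented for illustration. Each pass is a plain function from plan to plan, so a pipeline is just an ordered list of passes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical toy plan nodes for illustration only; this is not LingoDB's IR.
@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Filter:
    child: object
    predicate: str

@dataclass(frozen=True)
class Project:
    child: object
    columns: Tuple[str, ...]

# A pass is any function that maps a plan to an equivalent plan.
Pass = Callable[[object], object]

def merge_filters(plan):
    """Fuse Filter(Filter(x, p1), p2) into one Filter with a conjunction."""
    if isinstance(plan, Filter):
        child = merge_filters(plan.child)
        if isinstance(child, Filter):
            return Filter(child.child, f"({child.predicate}) AND ({plan.predicate})")
        return Filter(child, plan.predicate)
    if isinstance(plan, Project):
        return Project(merge_filters(plan.child), plan.columns)
    return plan

def push_filter_below_project(plan):
    """Rewrite Filter(Project(x)) to Project(Filter(x)).

    Assumption: the projection only selects columns and does not rename or
    compute them, so the predicate is still valid below it.
    """
    if isinstance(plan, Filter):
        child = push_filter_below_project(plan.child)
        if isinstance(child, Project):
            pushed = push_filter_below_project(Filter(child.child, plan.predicate))
            return Project(pushed, child.columns)
        return Filter(child, plan.predicate)
    if isinstance(plan, Project):
        return Project(push_filter_below_project(plan.child), plan.columns)
    return plan

def run_pipeline(plan, passes: List[Pass]):
    """Apply the passes in order; the order is part of the pipeline design."""
    for optimization_pass in passes:
        plan = optimization_pass(plan)
    return plan

plan = Filter(Project(Filter(Scan("t"), "a > 1"), ("a", "b")), "b < 5")
optimized = run_pipeline(plan, [push_filter_below_project, merge_filters])
```

In this sketch the order of passes matters: pushing the outer filter down first makes the two filters adjacent, so `merge_filters` can fuse them. Running the same two passes in the opposite order leaves two separate filters, which is the kind of interaction a custom pipeline lets you control.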
JIT Query Compilation
LingoDB builds heavily on the MLIR compiler framework to compile queries to efficient machine code with low compilation latency.
Flexibility
LingoDB uses multiple layers of extendable intermediate representations. This approach allows for high flexibility by exchanging layers and targeting different execution platforms.
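The layered approach can be pictured with a small Python sketch. Everything here is an assumption made for illustration (the tuple-based operation, the step names, and both backends); it does not reflect LingoDB's actual representations. A high-level operation is lowered to simpler steps, and the final layer is exchangeable: one backend produces an executable function, another only emits a textual listing.

```python
# Hypothetical top layer: ("select_sum", column, threshold) means
# "sum all values of `column` that are greater than `threshold`".

def lower_to_steps(op):
    """Lower a high-level op into a list of simpler steps (toy middle layer)."""
    kind, column, threshold = op
    if kind != "select_sum":
        raise ValueError(f"unsupported op: {kind}")
    return [("scan", column), ("filter_gt", threshold), ("reduce_sum",)]

def python_backend(steps):
    """Final layer A: turn the steps into a callable over a dict of columns."""
    def run(table):
        values = []
        for step in steps:
            if step[0] == "scan":
                values = list(table[step[1]])
            elif step[0] == "filter_gt":
                values = [v for v in values if v > step[1]]
            elif step[0] == "reduce_sum":
                return sum(values)
        return values
    return run

def text_backend(steps):
    """Final layer B: emit a readable listing instead of executable code."""
    return "\n".join(" ".join(str(part) for part in step) for step in steps)

steps = lower_to_steps(("select_sum", "x", 2))
query = python_backend(steps)          # executable target
listing = text_backend(steps)          # alternative target for the same steps
result = query({"x": [1, 2, 3, 4]})
```

Because both backends consume the same intermediate steps, swapping the target does not require touching the upper layer, which is the property the layered design is meant to provide.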
Apache Arrow
By using Apache Arrow as its in-memory storage format, LingoDB can efficiently interface with other systems and libraries without copying data.
Open Source
LingoDB is open source under the MIT License.
Research Directions
Query Engine Design
Through its flexible design, LingoDB facilitates fundamental research regarding query engine architectures.
Heterogeneous Hardware
By using a layered design with sub-operators and building on MLIR, LingoDB is an ideal research tool for investigating heterogeneous hardware for data processing.
Cross-Domain Optimization and Execution
LingoDB's design allows for representing both SQL queries and other domains, which simplifies research on cross-domain execution and optimization.
Understanding LingoDB
Team
Students
Student | Topic | Advisor(s) | Type |
---|---|---|---|
Robert Imschweiler | Transforming Data Frame Operations from Python to MLIR | Engelke, Jungmair | B.Sc. Thesis |
Florian Drescher | A template-based code generation backend for MLIR | Engelke | Guided Research |
Raoul Zebisch | Sub-Operator Placement on GPUs for accelerating analytical queries | Jungmair | M.Sc. Thesis |
Pascal Ginter | C-Backend, Index-Nested Loop Joins, Query Plan Visualization | Jungmair | Research Assistant |
Let's Work Together. Get in Touch!
Contact us for student theses, collaborations, and research opportunities.