ADBC how-to guides
Each guide solves one specific task. For a continuous walkthrough see the tutorial; for the full supported surface see the reference.
Guides
pyarrow.parquet.read_table → cur.adbc_ingest.
Schema requirements (the _id column), chunked ingest for tables larger than memory, error handling, and rejection modes.
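As a rough illustration of the chunked-ingest idea, the sketch below appends Arrow record batches one at a time and checks each for the _id column before sending it. The helper name ingest_batches and the default mode="append" are assumptions made for this example, not taken from the guide. cur is assumed to be an ADBC DB-API cursor exposing adbc_ingest, and batches can be any iterable of pyarrow.RecordBatch objects.

```python
from typing import Any, Iterable

REQUIRED_COLUMN = "_id"  # the schema requirement named in the guide summary


def ingest_batches(cur: Any, table_name: str, batches: Iterable[Any],
                   mode: str = "append") -> int:
    """Send Arrow record batches to `table_name` one batch at a time.

    `cur` is assumed to be an ADBC DB-API cursor (anything exposing
    `adbc_ingest`). `batches` is any iterable of pyarrow.RecordBatch-like
    objects. Returns the number of rows handed to the driver.
    """
    total_rows = 0
    for index, batch in enumerate(batches):
        # Check the key column before sending this batch to the driver.
        if REQUIRED_COLUMN not in batch.schema.names:
            raise ValueError(
                f"batch {index} has no {REQUIRED_COLUMN!r} column: "
                f"{batch.schema.names}"
            )
        # Assumption: "append" is an ingest mode the driver accepts here.
        cur.adbc_ingest(table_name, batch, mode=mode)
        total_rows += batch.num_rows
    return total_rows
```

With pyarrow installed, pyarrow.parquet.ParquetFile(path).iter_batches(batch_size=...) yields batches lazily, so only one chunk is held in memory at a time. Batches sent before a failing one are not undone by this helper.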
Two directions:
- Read XTDB into a pyarrow.Table and hand off to pandas / polars via to_pandas() / pl.from_arrow(), zero-copy where possible.
- Build a DataFrame in pandas / polars, convert to Arrow, adbc_ingest it.
Covers type-fidelity preservation (categoricals, optional fields, timestamps with timezone).
Point-in-time feature extraction for ML training sets
Extract training-set features as known at label time: a single SQL query against FOR VALID_TIME AS OF produces a pyarrow.Table joined as-of label time, ready for a feature pipeline.
It avoids training on information that wasn’t yet available at the label time, the mistake feature stores are built to prevent.
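To make the as-of idea concrete, here is a small plain-Python sketch of the semantics only. It is not the SQL the guide uses and does not touch XTDB. The record layout (customer, plan, valid_from, valid_to, label_time) is hypothetical, chosen to show why looking up the latest value would leak future information.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical feature history: each version is valid over [valid_from, valid_to).
FEATURE_VERSIONS = [
    {"customer": "c1", "plan": "free",
     "valid_from": utc(2024, 1, 1), "valid_to": utc(2024, 3, 1)},
    {"customer": "c1", "plan": "pro",
     "valid_from": utc(2024, 3, 1), "valid_to": None},  # still current
]

# Hypothetical labels, each stamped with the time the outcome was observed.
LABELS = [{"customer": "c1", "label_time": utc(2024, 2, 15), "churned": True}]


def feature_as_of(versions, customer, at):
    """Return the version of `customer` that was valid at time `at`, or None."""
    for version in versions:
        if version["customer"] != customer:
            continue
        still_open = version["valid_to"] is None
        if version["valid_from"] <= at and (still_open or at < version["valid_to"]):
            return version
    return None


# Join each label with the feature value as it was known at label time.
training_rows = []
for label in LABELS:
    version = feature_as_of(FEATURE_VERSIONS, label["customer"], label["label_time"])
    training_rows.append({**label, "plan": version["plan"] if version else None})

print(training_rows[0]["plan"])  # free (the current value, "pro", would be leakage)
```

The half-open interval check is the same condition a valid-time query applies per row; the difference is that the database evaluates it for you.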
cur.fetch_arrow_table() → DuckDB register_arrow → join with DuckDB-resident data.
When some of your data already lives in DuckDB, Arrow lets you join it against XTDB’s bitemporal data with no serialise/deserialise step between them.
End-to-end in Rust with adbc_core + arrow-rs: connect, ingest a RecordBatch, query, hand off to DataFusion for analytical compute, write the result out to Parquet.
The Arrow types are checked at compile time through the whole pipeline.
The full prepared-statement lifecycle over the wire: prepare() once, bind() + executeQuery() many times, parameter binding via Arrow batches.
Multi-statement transactions, autocommit on/off, rollback semantics, and how XTDB’s same-connection write-then-read works in practice.
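For the autocommit-off case, one common pattern is a small context manager that commits when the block succeeds and rolls back when it raises. The sketch below relies only on the commit() and rollback() methods that any DB-API 2.0 connection provides. The helper name transaction is made up for this example, and nothing here describes XTDB-specific rollback or write-then-read behaviour.

```python
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit on success, roll back on any exception.

    Assumes `conn` is a DB-API 2.0 connection with autocommit disabled,
    so statements run inside the block belong to one transaction.
    """
    try:
        yield conn
    except BaseException:
        # Undo everything executed inside the block, then re-raise.
        conn.rollback()
        raise
    else:
        conn.commit()
```

Used as `with transaction(conn): ...`, every statement executed on cursors from conn inside the block is committed together or not at all, as far as the driver supports that.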