Version: Next

'arrow' Dialect

The Arrow dialect provides types and operations for working with Apache Arrow data structures. It includes types for arrays and for builders (which build chunked arrays), along with the operations needed to load values from arrays and append values to builders.

The operations implemented by this dialect work directly on the physical memory layout and have no knowledge of Apache Arrow's logical types. For example, dates are loaded as integers, strings are loaded as ptr + len, and so on. Dealing with logical types is the responsibility of higher-level dialects.
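The split between physical and logical types can be illustrated outside of MLIR. The sketch below is plain Python, not code from this dialect; it assumes Arrow's date32 layout (a little-endian 32-bit signed integer counting days since 1970-01-01), and the buffer contents and helper names are made up for illustration.

```python
import struct
from datetime import date, timedelta

# Hypothetical values buffer of a date32 array: three little-endian int32 values.
values = struct.pack("<3i", 0, 19000, 19723)

def load_physical(buf: bytes, offset: int) -> int:
    """What a physical load sees: just a 32-bit integer."""
    return struct.unpack_from("<i", buf, offset * 4)[0]

def to_logical_date(days: int) -> date:
    """What a higher-level layer adds: interpreting the integer as a date."""
    return date(1970, 1, 1) + timedelta(days=days)

raw = load_physical(values, 2)
print(raw, to_logical_date(raw))
```

Only `to_logical_date` knows that the integer is a date; the load itself is type-agnostic, which mirrors the division of labor described above.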

Operations

arrow.array.is_valid (::lingodb::compiler::dialect::arrow::IsValidOp)

Returns whether the array element at a given offset is valid.

Syntax:

operation ::= `arrow.array.is_valid` $array `,` $offset attr-dict

Interfaces: InferTypeOpInterface

Operands:

| Operand | Description |
| --- | --- |
| array | represents an anonymous Apache Arrow array, without knowledge of the type stored by it |
| offset | index |

Results:

| Result | Description |
| --- | --- |
| valid | 1-bit signless integer |
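As a rough model of what a validity check involves at the physical level, the following Python sketch reads one bit from an Arrow-style validity bitmap. It assumes Arrow's least-significant-bit-first numbering and that the bitmap may be absent when an array has no nulls; the function name and sample bitmap are hypothetical and do not reflect how this operation is actually lowered.

```python
from typing import Optional

def is_valid(validity: Optional[bytes], offset: int) -> bool:
    """Return True if the element at `offset` is non-null."""
    if validity is None:
        # Arrow allows the validity bitmap to be omitted when there are no nulls.
        return True
    # Bits are numbered least-significant first within each byte.
    return ((validity[offset >> 3] >> (offset & 7)) & 1) == 1

# Hypothetical bitmap for 10 elements; elements 1 and 9 are null.
bitmap = bytes([0b11111101, 0b00000001])
print([is_valid(bitmap, i) for i in range(10)])
# → [True, False, True, True, True, True, True, True, True, False]
```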

arrow.array.load_bool (::lingodb::compiler::dialect::arrow::LoadBoolOp)

Loads a boolean value from an array at a given offset.

Syntax:

operation ::= `arrow.array.load_bool` $array `,` $offset attr-dict

This dedicated operation is necessary because Arrow stores boolean values as a bitset, not as individual bytes.

Interfaces: InferTypeOpInterface

Operands:

| Operand | Description |
| --- | --- |
| array | represents an anonymous Apache Arrow array, without knowledge of the type stored by it |
| offset | index |

Results:

| Result | Description |
| --- | --- |
| value | 1-bit signless integer |
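To show why booleans need their own load, the sketch below packs a list of booleans into a bitset and reads individual elements back. It is an illustration in plain Python under the assumption of Arrow's LSB-first bit order; `pack_bools` and `load_bool` are hypothetical helpers, not part of the dialect or its runtime.

```python
def pack_bools(values):
    """Pack booleans into an Arrow-style bitset (LSB-first within each byte)."""
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def load_bool(bitset: bytes, offset: int) -> bool:
    """Read element `offset`: locate the byte, then extract a single bit."""
    byte_index, bit_index = divmod(offset, 8)
    return bool((bitset[byte_index] >> bit_index) & 1)

flags = [True, False, True, True, False, False, False, True, True]
bits = pack_bools(flags)
print(len(flags), "booleans stored in", len(bits), "bytes")
print([load_bool(bits, i) for i in range(len(flags))] == flags)
```

A byte-addressed load cannot express this, since nine elements share two bytes and each element needs a shift and mask.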

arrow.array.load_fixed_sized (::lingodb::compiler::dialect::arrow::LoadFixedSizedOp)

Loads an arbitrary fixed sized value from an array at a given offset.

Syntax:

operation ::= `arrow.array.load_fixed_sized` $array `,` $offset `->` type($value) attr-dict

Used for loading types that are of fixed size from an Arrow array (e.g., integers, floats, decimals, dates, timestamps). There are no runtime checks to ensure that the type of the value matches the type of the array, so this operation can be used for any fixed sized type.

Operands:

| Operand | Description |
| --- | --- |
| array | represents an anonymous Apache Arrow array, without knowledge of the type stored by it |
| offset | index |

Results:

| Result | Description |
| --- | --- |
| value | any type |
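Conceptually, a fixed sized load reads `width` bytes at `offset * width` in the values buffer, and the requested result type decides how those bytes are interpreted. The Python sketch below models this with `struct` format strings standing in for the result type; it assumes a little-endian values buffer with no array offset, and the helper name is hypothetical.

```python
import struct

def load_fixed_sized(buf: bytes, offset: int, fmt: str):
    """Read element `offset` from `buf`, interpreting it with struct format `fmt`."""
    width = struct.calcsize(fmt)
    return struct.unpack_from(fmt, buf, offset * width)[0]

# Hypothetical values buffers for an int64 array and a float64 array.
int64_values = struct.pack("<3q", 10, -20, 30)
float64_values = struct.pack("<2d", 1.5, 2.25)

print(load_fixed_sized(int64_values, 1, "<q"))    # → -20
print(load_fixed_sized(float64_values, 1, "<d"))  # → 2.25
```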

arrow.array.load_variable_size_binary (::lingodb::compiler::dialect::arrow::LoadVariableSizeBinaryOp)

Loads a variable sized binary value from an array at a given offset.

Syntax:

operation ::= `arrow.array.load_variable_size_binary` $array `,` $offset `->` type($ptr) attr-dict

Used for loading variable sized binary values from an Arrow array (e.g., strings, binary data). It returns both the length of the data and a pointer to it.

Interfaces: InferTypeOpInterface

Operands:

| Operand | Description |
| --- | --- |
| array | represents an anonymous Apache Arrow array, without knowledge of the type stored by it |
| offset | index |

Results:

| Result | Description |
| --- | --- |
| length | 32-bit signless integer |
| ptr | ref type |
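The pointer-plus-length result can be pictured with Arrow's variable-size binary layout, which keeps an offsets buffer of n + 1 entries next to one contiguous data buffer. The sketch below assumes 32-bit little-endian offsets (matching the 32-bit length result) and models the pointer as a `memoryview` into the data buffer; the function and sample buffers are illustrative only.

```python
import struct

def load_variable_size_binary(offsets: bytes, data: bytes, offset: int):
    """Return (length, ptr) for element `offset`; ptr is a view starting at the value."""
    # Element i spans data[offsets[i]:offsets[i + 1]].
    start, end = struct.unpack_from("<2i", offsets, offset * 4)
    return end - start, memoryview(data)[start:]

# Hypothetical array holding "hello", "" and "arrow".
data = b"helloarrow"
offsets = struct.pack("<4i", 0, 5, 5, 10)

length, ptr = load_variable_size_binary(offsets, data, 2)
print(length, bytes(ptr[:length]))
```

No bytes are copied by the load itself; the caller receives a position in the shared data buffer and must use the length to know where the value ends.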

arrow.array_builder.append_bool (::lingodb::compiler::dialect::arrow::AppendBoolOp)

Appends a boolean value to an Arrow array builder.

Syntax:

operation ::= `arrow.array_builder.append_bool` $builder `,` $value ( `,` $valid^ )? attr-dict

This operation appends a boolean value to an Arrow array builder, optionally with a validity flag. This operation is necessary because Arrow stores boolean values as a bitset, not as individual bytes.

Operands:

| Operand | Description |
| --- | --- |
| builder | represents an anonymous Apache Arrow builder (building a chunked array), without knowledge of the type stored by it |
| value | 1-bit signless integer |
| valid | 1-bit signless integer |
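The following toy builder sketches what appending a boolean with an optional validity flag could look like when both values and validity are kept as bitsets. It is a plain Python model, not the runtime's builder: the class name is hypothetical, treating an omitted flag as "valid" is an assumption, and leaving the value bit of a null at 0 is just one possible choice.

```python
class BoolBuilder:
    """Toy builder that keeps values and validity as LSB-first bitsets."""

    def __init__(self):
        self.values = bytearray()
        self.validity = bytearray()
        self.length = 0

    def append(self, value: bool, valid: bool = True) -> None:
        byte_index, bit_index = divmod(self.length, 8)
        if bit_index == 0:
            # Start a fresh, zeroed byte in both bitsets every 8 elements.
            self.values.append(0)
            self.validity.append(0)
        if valid:
            self.validity[byte_index] |= 1 << bit_index
            if value:
                self.values[byte_index] |= 1 << bit_index
        self.length += 1

builder = BoolBuilder()
builder.append(True)
builder.append(False)
builder.append(True, valid=False)  # null: the value bit is left at 0
builder.append(True)
print(bin(builder.values[0]), bin(builder.validity[0]))
# → 0b1001 0b1011
```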

arrow.array_builder.append_fixed_sized (::lingodb::compiler::dialect::arrow::AppendFixedSizedOp)

Appends a fixed sized value to an Arrow array builder.

Syntax:

operation ::= `arrow.array_builder.append_fixed_sized` $builder `,` $value `:` type($value) ( `,` $valid^ )? attr-dict

This operation appends a fixed sized value to an Arrow array builder, optionally with a validity flag. It can be used for any fixed sized type, such as integers, floats, decimals, dates, and timestamps.

Operands:

| Operand | Description |
| --- | --- |
| builder | represents an anonymous Apache Arrow builder (building a chunked array), without knowledge of the type stored by it |
| value | any type |
| valid | 1-bit signless integer |
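For fixed sized values, appending amounts to writing one fixed-width slot per element and recording its validity. The sketch below models this in Python with a `struct` format string standing in for the value type; the class is hypothetical, validity is kept as a simple list rather than a bitmap for brevity, and writing zero into null slots is an assumption made for determinism.

```python
import struct

class FixedSizedBuilder:
    """Toy builder for fixed-width values described by a struct format."""

    def __init__(self, fmt: str):
        self.fmt = fmt
        self.values = bytearray()
        self.validity = []  # one bool per element; a real builder would use a bitmap

    def append(self, value, valid: bool = True) -> None:
        # A null still occupies one slot so that later elements stay addressable
        # at offset * width.
        self.values += struct.pack(self.fmt, value if valid else 0)
        self.validity.append(valid)

builder = FixedSizedBuilder("<i")
builder.append(7)
builder.append(123, valid=False)
builder.append(-1)
print(struct.unpack("<3i", builder.values), builder.validity)
# → (7, 0, -1) [True, False, True]
```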

arrow.array_builder.append_variable_sized_binary (::lingodb::compiler::dialect::arrow::AppendVariableSizeBinaryOp)

Appends a variable sized binary value to an Arrow array builder.

Syntax:

operation ::= `arrow.array_builder.append_variable_sized_binary` $builder `,` $value ( `,` $valid^ )? attr-dict

This operation appends a variable sized binary value to an Arrow array builder, optionally with a validity flag. The binary data is (at the moment) expected to be a util.varlen32 value, which contains pointer and length information. In the future, the operation should accept a pointer and a length directly.

Operands:

| Operand | Description |
| --- | --- |
| builder | represents an anonymous Apache Arrow builder (building a chunked array), without knowledge of the type stored by it |
| value | type representing variable-length data up to 2^31 bytes |
| valid | 1-bit signless integer |
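Appending variable sized data can be modeled as copying the bytes into a shared data buffer and recording the new end position in an offsets list. The Python sketch below follows Arrow's offsets-plus-data layout; the class is hypothetical, a `bytes` object stands in for the util.varlen32 value, and skipping the copy for null elements is an assumption.

```python
class BinaryBuilder:
    """Toy builder using Arrow's offsets + data layout for variable-size values."""

    def __init__(self):
        self.offsets = [0]       # always one more entry than there are elements
        self.data = bytearray()
        self.validity = []       # one bool per element, kept simple for the sketch

    def append(self, value: bytes, valid: bool = True) -> None:
        if valid:
            self.data += value
        # A null (or empty) element repeats the previous offset, giving length 0.
        self.offsets.append(len(self.data))
        self.validity.append(valid)

builder = BinaryBuilder()
builder.append(b"ab")
builder.append(b"ignored", valid=False)
builder.append(b"")
builder.append(b"cde")
print(builder.offsets, bytes(builder.data))
# → [0, 2, 2, 2, 5] b'abcde'
```

Note that a null and an empty value produce the same offsets; only the validity information tells them apart.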

arrow.array_builder.from_ptr (::lingodb::compiler::dialect::arrow::BuilderFromPtr)

Creates a builder value from a pointer to an ArrowColumn builder that is managed by the runtime.

Syntax:

operation ::= `arrow.array_builder.from_ptr` $ptr `->` type($builder) attr-dict

Interfaces: InferTypeOpInterface

Operands:

| Operand | Description |
| --- | --- |
| ptr | ref type |

Results:

| Result | Description |
| --- | --- |
| builder | represents an anonymous Apache Arrow builder (building a chunked array), without knowledge of the type stored by it |

Types

ArrayType

represents an anonymous Apache Arrow array, without knowledge of the type stored by it

Syntax: !arrow.array

ArrayBuilderType

represents an anonymous Apache Arrow builder (building a chunked array), without knowledge of the type stored by it

Syntax: !arrow.builder