'flow' Dialect
A dialect designed to model execution data flow and partitioning.
The flow dialect is used to model regions of dense computation and the data flow between them. MLIR value-semantic tensors are used as the primary data type so that SSA use-def chains provide the bulk of the infrastructure required to perform the computation partitioning and outlining.
The dialect is designed to ingest relatively high-level linear algebra via XLA HLO ops (that also operate on the value-semantic tensor types) and optionally MLIR standard ops for control flow and other actions. After conversion of any higher-level ops that have special semantics in the flow dialect, such as global variables, the rest are partitioned into regions containing simple and compatible computations. Finally, outlining moves the computations into executables and leaves only the execution flow encoded via dispatch operations.
The primary unit of interest is a "dispatch region" containing compatible computations that can be scheduled together efficiently (and safely). "Compatible" here means similarly shaped workloads that indicate how many invocations a computation can be parallelized across when running in an SPMD execution model. Though it depends on the particular runtime backend, this concretely means things like the untiled workload (or tiled workgroups) used in GPU dispatches or similar thread pool executors.
After identification of the dispatchable regions, a set of transformations performs folding and simplification to reduce the total number of dispatches. Heuristics are used in certain cases to more efficiently schedule special ops (such as GEMM), and the design is amenable to profile-guided analysis that can be added in the future.
The resulting outlined executable modules containing the dispatchable code can be translated to one or more backends (such as SPIR-V for Vulkan, or LLVM IR for running on the CPU, etc). The IR that is outlined is untouched and in the input format (such as XLA HLO ops) allowing conversion using any MLIR target that supports ingesting such input. A few special ops are used to communicate statically available information such as the expected workload size, shapes of inputs and outputs, etc.
Operations

Collective communication ops

flow.channel.count (Flow::ChannelCountOp)
Returns the total number of participants in the group
Syntax:
operation ::= `flow.channel.count` $channel `:` type($result)
attr-dict-with-keyword
Returns the total participant count in the collective communicator group.
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| channel | a collective communication channel |

Results:

| Result | Description |
| --- | --- |
| result | index |

flow.channel.default (Flow::ChannelDefaultOp)
Returns a default collective communication channel
Syntax:
operation ::= `flow.channel.default` ($group^)?
`:` type($result)
attr-dict-with-keyword
Returns a channel initialized using the runtime environment.
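For example, a minimal sketch (SSA value names are illustrative) of obtaining the default channel and querying it with the channel ops described on this page:

```mlir
// Obtain the channel provided by the runtime environment.
%channel = flow.channel.default : !flow.channel
// Query the rank of this participant and the total participant count.
%rank = flow.channel.rank %channel : index
%count = flow.channel.count %channel : index
```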
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| group | ::mlir::StringAttr | string attribute |

Results:

| Result | Description |
| --- | --- |
| result | a collective communication channel |

flow.channel.rank (Flow::ChannelRankOp)
Returns the rank of the local participant in the group
Syntax:
operation ::= `flow.channel.rank` $channel `:` type($result)
attr-dict-with-keyword
Returns the rank the channel represents as a participant in a collective group in [0, count).
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| channel | a collective communication channel |

Results:

| Result | Description |
| --- | --- |
| result | index |

flow.channel.split (Flow::ChannelSplitOp)
Splits a collective communication channel
Syntax:
operation ::= `flow.channel.split` $channel `,` $color `,` $key
`:` type($channel) `->` type($result)
attr-dict-with-keyword
Partitions the group associated with the given channel into disjoint subgroups, one for each unique value of color. Each new subgroup contains all participants of the same color, and within each subgroup the key argument is used to define the rank order. When multiple participants in a group use the same key, the tie is broken using their rank in the parent group.
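A hedged sketch of splitting a parent channel into subgroups (the color/key values here are only illustrative):

```mlir
// Participants with the same %color value end up in the same subgroup;
// %key orders ranks within each subgroup.
%color = arith.constant 0 : index
%key = flow.channel.rank %parent : index
%subgroup = flow.channel.split %parent, %color, %key : !flow.channel -> !flow.channel
```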
Interfaces: InferTypeOpInterface, OpAsmOpInterface

Operands:

| Operand | Description |
| --- | --- |
| channel | a collective communication channel |
| color | index |
| key | index |

Results:

| Result | Description |
| --- | --- |
| result | a collective communication channel |

flow.collective.all_gather (Flow::CollectiveAllGatherOp)
Performs all-gather operation
Syntax:
operation ::= `flow.collective.all_gather` $element_type `,` $target `,` $source `,` $channel `:`
`(` type($target) `,` type($source) `,` type($channel) `)` `->`
custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
attr-dict-with-keyword
It gathers data from all ranks and concatenates them on the 0-th dimension.
Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| source | ranked tensor of any type values |
| channel | a collective communication channel |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.collective.all_reduce (Flow::CollectiveAllReduceOp)
Performs all-reduce operation
Syntax:
operation ::= `flow.collective.all_reduce` $reduction_op `,` $element_type `,` $target `,` $source `,` $channel `:`
`(` type($target) `,` type($source) `,` type($channel) `)` `->`
custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
attr-dict-with-keyword
The operation reduces data across all the ranks in the channel.
Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| reduction_op | mlir::iree_compiler::IREE::Flow::CollectiveReductionOpAttr | valid CollectiveReductionOp |
| element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| source | ranked tensor of any type values |
| channel | a collective communication channel |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.collective.all_to_all (Flow::CollectiveAllToAllOp)
Performs all-to-all operation
Syntax:
operation ::= `flow.collective.all_to_all` $element_type `,` $target `,` $source `,` $channel `:`
`(` type($target) `,` type($source) `,` type($channel) `)` `->`
custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
attr-dict-with-keyword
This operation mutually exchanges data across all of the ranks in the channel.
Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| source | ranked tensor of any type values |
| channel | a collective communication channel |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.collective.reduce_scatter (Flow::CollectiveReduceScatterOp)
Performs reduce and scatter operations
Syntax:
operation ::= `flow.collective.reduce_scatter` $reduction_op `,` $element_type `,` $target `,` $source `,` $channel `:`
`(` type($target) `,` type($source) `,` type($channel) `)` `->`
custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
attr-dict-with-keyword
The operation reduces data across all the ranks in the channel and scatters the result to each rank.

Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| reduction_op | mlir::iree_compiler::IREE::Flow::CollectiveReductionOpAttr | valid CollectiveReductionOp |
| element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| source | ranked tensor of any type values |
| channel | a collective communication channel |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.collective.send_recv (Flow::CollectiveSendRecvOp)
Performs a grouped send and receive operation
Syntax:
operation ::= `flow.collective.send_recv` $element_type `,` $target `,` $source `,` $channel `,` $send `,` $recv `:`
`(` type($target) `,` type($source) `,` type($channel) `,` type($send) `,` type($recv) `)` `->`
custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
attr-dict-with-keyword
The operation sends data to the rank specified by send and receives data from the rank specified by recv. If send is -1, this rank will not send any data. If recv is -1, this rank will not receive any data and the output will be all zeros.
Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| source | ranked tensor of any type values |
| channel | a collective communication channel |
| send | index |
| recv | index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

Dispatch ops

flow.dispatch (Flow::DispatchOp)
A dispatch of workgroups across a grid
Syntax:
operation ::= `flow.dispatch` custom<DispatchEntryPoints>($entry_points)
(`[` $workload^ `]`)? ``
`(` $arguments `)` attr-dict `:`
custom<ShapedFunctionType>(ref($arguments),
type($arguments), $argument_dims,
type($results), $result_dims,
$tied_operands)
Dispatches workgroups across a grid defined by the captured workload parameters carrying the information required to compute the workgroup count at runtime. The function for converting the workload into a 3D workgroup count is attached to the dispatch entry point and may contain arbitrary host logic.
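For example, a hedged sketch (the executable and export names are hypothetical):

```mlir
// Dispatch the exported entry point with a 2D workload; the export's
// workgroup count region converts the workload into an XYZ grid.
%0 = flow.dispatch @my_executable::@my_export[%m, %n](%input, %bias)
    : (tensor<8x16xf32>, tensor<16xf32>) -> tensor<8x16xf32>
```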
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), SymbolUserOpInterface, TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| entry_points | ::mlir::ArrayAttr | symbol ref array attribute |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| workload | variadic of index |
| arguments | variadic of any type |
| argument_dims | variadic of index |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| results | variadic of any type |

Executable ops

Executables for outlined regions.

flow.executable_end (Flow::ExecutableEndOp)
Terminator pseudo-op for the executable op
Syntax:
operation ::= `flow.executable_end` attr-dict
Traits: HasParent<IREE::Flow::ExecutableOp>, Terminator

flow.executable.export (Flow::ExecutableExportOp)
Defines an executable entry point for dispatch operations
Syntax:
operation ::= `flow.executable.export` custom<SymbolVisibility>($sym_visibility)
custom<SymbolAlias>($sym_name, $function_ref)
custom<WorkgroupCountRegion>($workgroup_count)
attr-dict-with-keyword
Specifies an exported function with an externally-visible alias. Multiple exports can reference the same internal function.
Each entry point can have a unique workgroup count calculation region. This region takes the workload parameters passed to each flow.dispatch and produces an XYZ workgroup count for the 3D grid dispatch.
Traits: HasParent<IREE::Flow::ExecutableOp>, IsolatedFromAbove

Interfaces: Symbol

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sym_visibility | ::mlir::StringAttr | string attribute |
| sym_name | ::mlir::StringAttr | string attribute |
| function_ref | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |

flow.executable (Flow::ExecutableOp)
Generic executable module
Syntax:
operation ::= `flow.executable` custom<SymbolVisibility>($sym_visibility)
$sym_name
attr-dict-with-keyword
regions
An executable module containing one or more public functions. The contents of the functions are safe to dispatch and can be lowered further to target-specific backend IR representations.
Traits: IsolatedFromAbove, SingleBlockImplicitTerminator<IREE::Flow::ExecutableEndOp>, SingleBlock, SymbolTable, Util_ObjectLike

Interfaces: Symbol

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sym_visibility | ::mlir::StringAttr | string attribute |
| sym_name | ::mlir::StringAttr | string attribute |

Partitioned region ops

flow.dispatch.region (Flow::DispatchRegionOp)
A group of ops
This op is a container/grouping of ops. It represents a fusion group before being lowered to a dispatch region. Ops are collected inside of the region body of the op. Values from parent regions can be captured. Results are yielded with a return terminator and returned from this op.

dispatch.region ops are lowered to dispatch.workgroups ops, which are isolated from above. dispatch.region ops are a more lightweight abstraction for implementing fusion heuristics, i.e., the process of deciding which ops should form a dispatch region.

This op also has a second region: workload_count. The arguments to the region represent the workload for the dispatch, and the region returns the number of workgroups for the dispatch. The region is lowered directly to the workload_count region of dispatch.workgroups.
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| result_dims | variadic of index |
| workload | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | variadic of any type |

flow.dispatch.tensor.load (Flow::DispatchTensorLoadOp)
Loads a tensor from a dispatch input placeholder
Syntax:
operation ::= `flow.dispatch.tensor.load` $source
`,` `offsets` `=` custom<DynamicIndexList>(
$offsets, $static_offsets)
`,` `sizes` `=` custom<DynamicIndexList>(
$sizes, $static_sizes)
`,` `strides` `=` custom<DynamicIndexList>(
$strides, $static_strides)
attr-dict `:` type($source) (`{` $source_dims^ `}`)? `->` type($result)
Loads an input tensor or subtensor from an input placeholder. As each workgroup executes concurrently, all workgroups will receive identical loaded results for regions that may overlap.
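For example, a hedged sketch of loading a static subtensor from a dispatch input placeholder (the binding name is hypothetical):

```mlir
// Load a 2x2 tile starting at (0, 0) from the readonly input placeholder.
%tile = flow.dispatch.tensor.load %input, offsets = [0, 0], sizes = [2, 2], strides = [1, 1]
    : !flow.dispatch.tensor<readonly:tensor<4x4xf32>> -> tensor<2x2xf32>
```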
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), OffsetSizeAndStrideOpInterface, ReifyRankedShapedTypeOpInterface, TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| static_offsets | ::mlir::DenseI64ArrayAttr | i64 dense array attribute |
| static_sizes | ::mlir::DenseI64ArrayAttr | i64 dense array attribute |
| static_strides | ::mlir::DenseI64ArrayAttr | i64 dense array attribute |

Operands:

| Operand | Description |
| --- | --- |
| source | dispatch.tensor |
| source_dims | variadic of index |
| offsets | variadic of index |
| sizes | variadic of index |
| strides | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.dispatch.tensor.store (Flow::DispatchTensorStoreOp)
Stores a tensor into a dispatch output placeholder
Syntax:
operation ::= `flow.dispatch.tensor.store` $value `,` $target
`,` `offsets` `=` custom<DynamicIndexList>(
$offsets, $static_offsets)
`,` `sizes` `=` custom<DynamicIndexList>(
$sizes, $static_sizes)
`,` `strides` `=` custom<DynamicIndexList>(
$strides, $static_strides)
attr-dict `:` type($value) `->` type($target) (`{` $target_dims^ `}`)?
Stores a tensor or subtensor into an output tensor placeholder. As each workgroup executes concurrently, behavior is undefined if more than one workgroup stores into overlapping regions of the full output tensor.
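A hedged sketch mirroring the load example above, storing a tile into a writeonly output placeholder (names are hypothetical):

```mlir
// Store the computed 2x2 tile back at offset (0, 0) of the output placeholder.
flow.dispatch.tensor.store %tile, %output, offsets = [0, 0], sizes = [2, 2], strides = [1, 1]
    : tensor<2x2xf32> -> !flow.dispatch.tensor<writeonly:tensor<4x4xf32>>
```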
Traits: AttrSizedOperandSegments

Interfaces: OffsetSizeAndStrideOpInterface, Util_ShapeAwareOp

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| static_offsets | ::mlir::DenseI64ArrayAttr | i64 dense array attribute |
| static_sizes | ::mlir::DenseI64ArrayAttr | i64 dense array attribute |
| static_strides | ::mlir::DenseI64ArrayAttr | i64 dense array attribute |

Operands:

| Operand | Description |
| --- | --- |
| value | ranked tensor of any type values |
| target | dispatch.tensor |
| target_dims | variadic of index |
| offsets | variadic of index |
| sizes | variadic of index |
| strides | variadic of index |

flow.dispatch.tie_shape (Flow::DispatchTieShapeOp)
Ties a runtime shape to a dispatch I/O argument
Syntax:
operation ::= `flow.dispatch.tie_shape` $operand attr-dict
`:` type($result) (`{` $dynamic_dims^ `}`)?
Metadata op used to tie a runtime-computed shape with dynamic dimensions to a dispatch input/output argument. All uses of the argument should use the pass-through result of this op to allow for SSA-based shape resolution.
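For instance, a hedged sketch tying a runtime-computed dimension to a dispatch argument (names hypothetical):

```mlir
// %dim carries the dynamic dimension computed at runtime; all later uses go
// through %tied so shape information stays attached in SSA form.
%tied = flow.dispatch.tie_shape %binding : !flow.dispatch.tensor<readonly:tensor<?x16xf32>>{%dim}
```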
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), ReifyRankedShapedTypeOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| operand | dispatch.tensor |
| dynamic_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | dispatch.tensor |

flow.dispatch.workgroup.count (Flow::DispatchWorkgroupCountOp)
Returns the total workgroup count of the grid
Syntax:
operation ::= `flow.dispatch.workgroup.count` `[` $dimension `]` attr-dict `:` type($result)
The total number of workgroups along each dimension in the dispatch grid. Represented as a 3D grid classically written as XYZ. Corresponds to the NumWorkgroups SPIR-V built-in and the gridDim CUDA built-in variable.

```mlir
%x = flow.dispatch.workgroup.count[0] : index
%y = flow.dispatch.workgroup.count[1] : index
%z = flow.dispatch.workgroup.count[2] : index
```
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| dimension | ::mlir::IntegerAttr | index attribute |

Results:

| Result | Description |
| --- | --- |
| result | index |

flow.dispatch.workgroup.id (Flow::DispatchWorkgroupIDOp)
Returns the index of the current workgroup in the grid
Syntax:
operation ::= `flow.dispatch.workgroup.id` `[` $dimension `]` attr-dict `:` type($result)
The global workgroup ID of the current workgroup in the range of [0, flow.dispatch.workgroup.count) along each dimension. Represented as a 3D grid classically written as XYZ. Corresponds to the WorkgroupId SPIR-V built-in and the blockIdx CUDA built-in variable.

```mlir
%x = flow.dispatch.workgroup.id[0] : index
%y = flow.dispatch.workgroup.id[1] : index
%z = flow.dispatch.workgroup.id[2] : index
```
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| dimension | ::mlir::IntegerAttr | index attribute |

Results:

| Result | Description |
| --- | --- |
| result | index |

flow.dispatch.workgroup.size (Flow::DispatchWorkgroupSizeOp)
Returns the size of each workgroup in invocations
Syntax:
operation ::= `flow.dispatch.workgroup.size` `[` $dimension `]` attr-dict `:` type($result)
The number of local invocations within the current workgroup along each dimension. Depending on backend this may map to the SIMT thread count or inner loop nest parameters.
Workgroup sizes are not determined at the flow dialect level as they are dependent on the target backend determined when lowering into the HAL. It's still possible to use the symbolic workgroup size inside of dispatch executables as a placeholder for the resolved value once in the HAL.
Represented as a 3D grid classically written as XYZ. Corresponds to the WorkgroupSize SPIR-V built-in and the blockDim CUDA built-in variable.

```mlir
%x = flow.dispatch.workgroup.size[0] : index
%y = flow.dispatch.workgroup.size[1] : index
%z = flow.dispatch.workgroup.size[2] : index
```
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| dimension | ::mlir::IntegerAttr | index attribute |

Results:

| Result | Description |
| --- | --- |
| result | index |

flow.dispatch.workgroups (Flow::DispatchWorkgroupsOp)
A dispatch of workgroups across a 3-dimensional grid
Syntax:
operation ::= `flow.dispatch.workgroups` (`[` $workload^ `]`)? ``
`(` $arguments `)` `:`
custom<ShapedFunctionType>(ref($arguments),
type($arguments), $argument_dims,
type($results), $result_dims,
$tied_operands)
attr-dict-with-keyword
`=` `\n` ` ` ` ` ` `
custom<DispatchWorkgroupBody>(ref(type($arguments)),
ref(type($results)),
$workgroup_body)
`` custom<DispatchWorkgroupsCountRegion>($workgroup_count)
Dispatches some number of workgroups across a 3-dimensional grid. The body region will be invoked for each workgroup with a unique flow.dispatch.workgroup.id in the range of [0, flow.dispatch.workgroup.count) (along each dimension XYZ).
From the outside the dispatch operation has value semantics: some tensors (and optionally other primitive types) are consumed and one or more new result tensors are produced. Inside each workgroup, however, the input and output tensors are available for arbitrary loads and stores. In many cases each workgroup will load some particular tile(s) from the input tensors and store some particular tile(s) to the output tensors unique to that workgroup. Though it's possible for multiple workgroups to load the same regions of the input tensors behavior is undefined if multiple workgroups store to the same regions of the output tensors.
Though the representation is similar to the GPU-style grid dispatch model, we still have not yet allocated buffers, determined the target device for execution, or even completed fully resolving shapes/types/etc. Because of this it's important that the workgroup body uses the flow.dispatch.workgroup.* ops to query the workgroup ID/count/size instead of hardcoding them to a particular set of values. Assume that any workgroup dispatch may end up being specialized for several different target devices and even several different variants for a particular target device (differing workgroup sizes, etc).
Because at this point in the layering devices have not yet been selected the workgroup count cannot be fully evaluated. Instead workload parameters are captured that are then passed to a function that when later evaluated computes the actual workgroup count based on target information. The workload is not limited to the 3D XYZ grid dispatch of the workgroup count and can contain any number of parameters used to compute it.
```mlir
%r = flow.dispatch.workgroups[%c5, %c5](%0, %1)
    : (tensor<5x5xf32>, tensor<5xf32>) -> tensor<5x5xf32> =
  (%arg0: !flow.dispatch.tensor<readonly:tensor<5x5xf32>>,
   %arg1: !flow.dispatch.tensor<readonly:tensor<5xf32>>,
   %arg2: !flow.dispatch.tensor<writeonly:tensor<5x5xf32>>) {
  ...
}
```

The number of results of the operation is equal to the number of results in the type signature ((tensor<5x5xf32>, tensor<5xf32>) -> tensor<5x5xf32>). Each tensor argument and result in the type signature has a corresponding block argument of type !flow.dispatch.tensor. Furthermore, each argument has a corresponding arguments operand.

There are no arguments operands for results, but a result can be tied to an argument by writing the argument operand's SSA value instead of its type: e.g., in the above example, -> %0 would tie the first argument to the result. In that case, there would be no separate block argument for the result.
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments, IsolatedFromAbove

Interfaces: ClosureOpInterface, ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| workload | variadic of index |
| arguments | variadic of any type |
| argument_dims | variadic of index |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| results | variadic of any type |

flow.return (Flow::ReturnOp)
Return from a flow.dispatch_region
Syntax:
operation ::= `flow.return` attr-dict ($operands^ `:` type($operands))?
Returns the given values from the region and back to the host code.
Traits: AlwaysSpeculatableImplTrait, ReturnLike, Terminator

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), RegionBranchTerminatorOpInterface

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| operands | variadic of any type |

Streamable call ops

flow.call (Flow::CallOp)
Calls a streamable external host function
Syntax:
operation ::= `flow.call` $callee
`(` $arguments `)` attr-dict `:`
custom<ShapedFunctionType>(ref($arguments),
type($arguments), $argument_dims,
type($results), $result_dims,
$tied_operands)
Calls a function taking/returning tensor values with stream semantics. Tensors have their shapes captured and may be tied to denote in-place operations. Asynchronous calls must have no side-effects.
Note that returned tensors must have their shapes declared prior to the call as this is what allows the call to be made on the stream. If external host logic is required to compute the shape (avoid at all costs!) a separate func.call can be used outside of the stream to do so. If shapes are unknowable until the operation is performed it should be made as a normal asynchronous host call with 'coarse-fences' instead.
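For example, a hedged sketch of calling a streamable external function with fully declared shapes (the callee name and signature are hypothetical and assume a matching flow.func declaration exists):

```mlir
// Asynchronous, stream-compatible call; all result shapes are known before
// the call so it can be scheduled on the stream.
%result = flow.call @external_fn(%input) : (tensor<4xf32>) -> tensor<4xf32>
```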
Traits: AttrSizedOperandSegments

Interfaces: CallOpInterface, SymbolUserOpInterface, TiedOpInterface, Util_ShapeAwareOp

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| callee | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |

Operands:

| Operand | Description |
| --- | --- |
| arguments | variadic of any type |
| argument_dims | variadic of index |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| results | variadic of any type |

flow.func (Flow::FuncOp)
Streamable function declaration
Syntax:
operation ::= `flow.func` custom<SymbolVisibility>($sym_visibility)
$sym_name
``
custom<ShapedFunctionSignature>($function_type,
$tied_operands,
$arg_attrs,
$res_attrs)
attr-dict-with-keyword
($body^)?
Declares a function that can be called as an asynchronous streaming operation via flow.call. Today only external functions are allowed.
Traits: IsolatedFromAbove

Interfaces: CallableOpInterface, FunctionOpInterface, Symbol

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sym_name | ::mlir::StringAttr | string attribute |
| function_type | ::mlir::TypeAttr | type attribute of function type |
| tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute |
| sym_visibility | ::mlir::StringAttr | string attribute |
| arg_attrs | ::mlir::ArrayAttr | Array of dictionary attributes |
| res_attrs | ::mlir::ArrayAttr | Array of dictionary attributes |

Tensor ops

flow.dispatch.workgroup_count_from_dag_root (Flow::DispatchWorkgroupCountFromDagRootOp)
Workgroup count computed based on iteration range of the root of the DAG for ops within the dispatch.
Syntax:
operation ::= `flow.dispatch.workgroup_count_from_dag_root` attr-dict $operands
Used when tiling + distribution of the root of the DAG (directed acyclic graph) of ops within the dispatch splits the work amongst workgroups. The workload captured is the size of the iteration space of the root of the DAG. This op represents the computation that, given the workload, returns the number of workgroups to use. The backends are responsible for lowering this op into actual computation (typically based on the tile sizes used to tile and distribute the root of the DAG).
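As a hedged illustration of how this appears inside an export's workgroup count region (the workload value names are hypothetical):

```mlir
// Workload values captured at the dispatch site are turned into an XYZ
// workgroup count by the backend when this op is materialized.
%x, %y, %z = flow.dispatch.workgroup_count_from_dag_root %workload_0, %workload_1
```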
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| operands | variadic of index |

Results:

| Result | Description |
| --- | --- |
| x | index |
| y | index |
| z | index |

flow.dispatch.workgroup_count_from_slice (Flow::DispatchWorkgroupCountFromSliceOp)
Placeholder to signify the default workgroup count calculation.
Syntax:
operation ::= `flow.dispatch.workgroup_count_from_slice` attr-dict $operands
The default computation of the number of workgroups (or workgroup count) assumes that the dispatch plus the captured values are enough to compute the workgroup count. It does so by using a program slice of the values within the dispatch that represent the number of workgroups, when available within the dispatch.

Currently the arguments of index type captured by the flow.dispatch.workgroups op are treated as the workload for the operation. It is a requirement that the slice of the program that computes the number of workgroups has these captured values as its leaves.
TODO: This could be generalized in future to allow the slices to encompass arbitrary computation. The computation of the workgroup count can then be done on the device itself, if this is data dependent. In such cases the workload could be more than just values of index types.
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| operands | variadic of index |

Results:

| Result | Description |
| --- | --- |
| x | index |
| y | index |
| z | index |

flow.dispatch.workload.ordinal (Flow::DispatchWorkloadOrdinalOp)

Annotates the values captured as workload within the body of the flow.dispatch.workgroups op.
Syntax:
operation ::= `flow.dispatch.workload.ordinal` attr-dict $operand `,` $ordinal `:` type($operand)
The arguments that represent the captured/returned values of the flow.dispatch.workgroups op, i.e. the signature of the body of the op, are not preserved during IREE's compilation. Since the workloads are derived from the operands captured by the operation, this op denotes the values captured as workloads. This can be used in the backends to map back to the workload values while materializing the workgroup count computation.
TODO: Find a better way to represent this information, either by somehow propagating the signature of the created dispatch workgroup op through the compilation stack until the codegen backends, or as a separate list/attribute that can be plumbed through without using explicit ops.
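A hedged sketch of how the annotation appears inside a dispatch body (the value names are hypothetical):

```mlir
// Marks %arg_dim as the workload value with ordinal 0 so backends can map it
// back when materializing the workgroup count computation.
%workload0 = flow.dispatch.workload.ordinal %arg_dim, 0 : index
```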
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| ordinal | ::mlir::IntegerAttr | index attribute |

Operands:

| Operand | Description |
| --- | --- |
| operand | index |

Results:

| Result | Description |
| --- | --- |
| result | index |

flow.tensor.alloca (Flow::TensorAllocaOp)
An empty tensor allocation with undefined contents
Syntax:
operation ::= `flow.tensor.alloca` `:` type($result) (`{` $result_dims^ `}`)?
attr-dict-with-keyword
Returns a new transient tensor allocation with undefined contents. Subsequent writes must populate any ranges of the tensor that are later read. The resulting tensor may be long-lived and allocated as part of a dedicated allocation. Prefer using flow.tensor.empty whenever possible as this op disables nearly all allocation-related optimizations performed by the compiler. The presence of this op is often an indication of an improper lowering.
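For example, a hedged sketch (the dynamic dimension value is hypothetical):

```mlir
// Transient allocation with one dynamic dimension; contents are undefined
// until written.
%t = flow.tensor.alloca : tensor<?x128xf32>{%batch}
```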
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{MemoryEffects::Allocate on ::mlir::SideEffects::DefaultResource}

Operands:

| Operand | Description |
| --- | --- |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.bitcast (Flow::TensorBitCastOp)
Bitcasts a tensor
Syntax:
operation ::= `flow.tensor.bitcast` $source `:`
type($source) (`{` $source_dims^ `}`)? `->`
type($result) (`{` $result_dims^ `}`)?
attr-dict-with-keyword
Bitcasts a tensor to a new type without modifying the contents.
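For example, a hedged sketch reinterpreting element bits without changing the data:

```mlir
// Reinterpret f32 storage as i32 with the same shape.
%1 = flow.tensor.bitcast %0 : tensor<4xf32> -> tensor<4xi32>
```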
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| source | ranked tensor of any type values |
| source_dims | variadic of index |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.clone (Flow::TensorCloneOp)
Performs a full tensor clone operation
Syntax:
operation ::= `flow.tensor.clone` $operand `:` type($result) (`{` $argument_dims^ `}`)?
attr-dict-with-keyword
Clones the input tensor into an identical output tensor.
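For example, a hedged sketch:

```mlir
// Produce an identical copy of %0.
%1 = flow.tensor.clone %0 : tensor<4x4xf32>
```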
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, HoistableOpInterface, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| operand | ranked tensor of any type values |
| argument_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.constant (Flow::TensorConstantOp)
Tensor constant that can have dynamic dimensions
Syntax:
operation ::= `flow.tensor.constant` attr-dict $value
Allows specifying a tensor constant of IREE-specific types/attributes.
```mlir
%cst = flow.tensor.constant #something_tensor_like : tensor<2x2xf32>
%res = math.absf %cst : tensor<2x2xf32>
```
Traits: AlwaysSpeculatableImplTrait, ConstantLike

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| value | ::mlir::TypedAttr | TypedAttr instance |

Results:

| Result | Description |
| --- | --- |
| result | tensor of any type values |

flow.tensor.dynamic_constant (Flow::TensorDynamicConstantOp)
Tensor constant that can have dynamic dimensions
Syntax:
operation ::= `flow.tensor.dynamic_constant` attr-dict $value `->` type($result)
Allows specifying a tensor constant of IREE-specific types/attributes with a dynamic shape that approximates a value as passed from the user. This disables many optimizations and should only be used when testing or benchmarking and wanting to ensure that dynamic dimension behavior is preserved.
```mlir
%cst = flow.tensor.dynamic_constant #something_tensor_like : tensor<2x2xf32> -> tensor<?x2xf32>
%res = math.absf %cst : tensor<?x2xf32>
```
Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| value | ::mlir::TypedAttr | TypedAttr instance |

Results:

| Result | Description |
| --- | --- |
| result | tensor of any type values |

flow.tensor.empty (Flow::TensorEmptyOp)
An empty tensor carrying metadata but no contents
Syntax:
operation ::= `flow.tensor.empty` `:` type($result) (`{` $result_dims^ `}`)?
attr-dict-with-keyword
Returns a tensor with undefined contents. Subsequent writes must populate any ranges of the tensor that are later read.
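For example, a hedged sketch with one dynamic dimension (the dimension value is hypothetical):

```mlir
// Metadata-only tensor; contents are undefined until written.
%0 = flow.tensor.empty : tensor<?x16xf32>{%batch}
```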
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.load (Flow::TensorLoadOp)
Loads a value from a tensor element
Syntax:
operation ::= `flow.tensor.load` $source (`[` $indices^ `]`)? `:`
type($source) (`{` $source_dims^ `}`)?
attr-dict-with-keyword
Returns the element at the given location from within the tensor.
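For example, a hedged sketch reading a single element:

```mlir
// Read the scalar element at row %c1, column %c2.
%elem = flow.tensor.load %0[%c1, %c2] : tensor<4x4xf32>
```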
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| source | ranked tensor of any type values |
| source_dims | variadic of index |
| indices | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | index or signless integer or floating-point or complex-type or vector of any type values |

flow.tensor.reshape (Flow::TensorReshapeOp)
Reshapes a tensor
Syntax:
operation ::= `flow.tensor.reshape` $source `:`
type($source) (`{` $source_dims^ `}`)? `->`
type($result) (`{` $result_dims^ `}`)?
attr-dict-with-keyword
Reshapes a tensor to a new shape without modifying the contents.
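For example, a hedged sketch (same element count, new shape):

```mlir
// Reinterpret a 4x4 tensor as a flat 16-element tensor.
%1 = flow.tensor.reshape %0 : tensor<4x4xf32> -> tensor<16xf32>
```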
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| source | ranked tensor of any type values |
| source_dims | variadic of index |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.slice (Flow::TensorSliceOp)
Slices out a subregion of a tensor
Syntax:
operation ::= `flow.tensor.slice` $source `[` $start_indices `for` $lengths `]` `:`
type($source) (`{` $source_dims^ `}`)? `->`
type($result) (`{` $result_dims^ `}`)?
attr-dict-with-keyword
Clones a subregion of a tensor.
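For example, a hedged sketch extracting a 2x2 subregion (the index constants are illustrative):

```mlir
// Copy the 2x2 block starting at (%c1, %c1) out of the 4x4 source.
%1 = flow.tensor.slice %0[%c1, %c1 for %c2, %c2] : tensor<4x4xf32> -> tensor<2x2xf32>
```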
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| source | ranked tensor of any type values |
| source_dims | variadic of index |
| start_indices | variadic of index |
| lengths | variadic of index |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.splat (Flow::TensorSplatOp)
Splats a value into a shaped tensor
Syntax:
operation ::= `flow.tensor.splat` $value `:` type($result) (`{` $result_dims^ `}`)?
attr-dict-with-keyword
Returns a tensor initialized to the given primitive value.
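For example, a hedged sketch:

```mlir
// Fill an 8x8 tensor with a scalar constant.
%fill = arith.constant 1.0 : f32
%0 = flow.tensor.splat %fill : tensor<8x8xf32>
```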
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| value | index or signless integer or floating-point or complex-type |
| result_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.store (Flow::TensorStoreOp)
Stores a value into a tensor element
Syntax:
operation ::= `flow.tensor.store` $value `,` $target (`[` $indices^ `]`)? `:`
type($target) (`{` $target_dims^ `}`)?
attr-dict-with-keyword
Returns a tensor with the element at the given index set to the given value.
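For example, a hedged sketch producing an updated tensor value:

```mlir
// Set element (%c0, %c3) to %val; the result is a new tensor value.
%1 = flow.tensor.store %val, %0[%c0, %c3] : tensor<4x4xf32>
```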
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| value | index or signless integer or floating-point or complex-type or vector of any type values |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| indices | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.tie_shape (Flow::TensorTieShapeOp)
Ties a runtime shape to a tensor value
Syntax:
operation ::= `flow.tensor.tie_shape` $operand attr-dict
`:` type($result) (`{` $dynamic_dims^ `}`)?
Metadata op used to tie tensors with their runtime-computed dynamic dimensions. This only exists transiently in the IR as a witness to shape calculations and is removed during lowering.
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), ReifyRankedShapedTypeOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| operand | ranked tensor of any type values |
| dynamic_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

flow.tensor.trace (Flow::TensorTraceOp)
Traces one or more tensor values at runtime
Syntax:
operation ::= `flow.tensor.trace` $key `=` `[`
custom<ShapedOperandList>($values, type($values), $value_dims)
`]` attr-dict-with-keyword
Traces out to a runtime trace sink (console, log file, etc) the given tensors. The key is arbitrary and can be used for identifying the set of values being traced.
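A hedged sketch (the key and traced values are illustrative; the exact list form follows the shaped operand list syntax above):

```mlir
// Emit both tensors to the runtime trace sink under the "debug_values" key.
flow.tensor.trace "debug_values" = [%0 : tensor<4xf32>, %1 : tensor<?x4xf32>{%dim}]
```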
Traits: AttrSizedOperandSegments

Interfaces: ShapeAwareOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| key | ::mlir::StringAttr | string attribute |

Operands:

| Operand | Description |
| --- | --- |
| values | variadic of ranked tensor of any type values |
| value_dims | variadic of index |

flow.tensor.update (Flow::TensorUpdateOp)
Updates a tensor with the contents of another tensor
Syntax:
operation ::= `flow.tensor.update` $update `,` $target `[` $start_indices `]` `:`
type($update) (`{` $update_dims^ `}`)? `->`
custom<ShapedTiedResult>(type($result), $target_dims)
attr-dict-with-keyword
Updates the target tensor with the contents of the update tensor at the given offset indices.
Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| target | ranked tensor of any type values |
| target_dims | variadic of index |
| start_indices | variadic of index |
| update | ranked tensor of any type values |
| update_dims | variadic of index |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of any type values |

Attributes

DummyAttr

Syntax: #flow.dummy

NamedParameterAttr
Named parameter referenced by an optional scope and key
Syntax:
#flow.parameter.named<
::mlir::Type, # type
StringAttr, # scope
StringAttr, # key
DictionaryAttr # config
>
Specifies an externally-defined parameter that can be referenced by an optional scope defining a set of parameters and a key uniquely identifying the parameter within its scope.
Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| type | ::mlir::Type | |
| scope | StringAttr | |
| key | StringAttr | |
| config | DictionaryAttr | |
Type constraints

dispatch.tensor

A placeholder for a dispatch region input/output operand. This can be used to query the metadata about the tensor (such as its shape) as well as both load and store from the backing tensor representation.

dispatch.tensor

A placeholder for a dispatch region input operand. This can be used to query the metadata about the tensor (such as its shape) as well as load from the backing tensor representation.

dispatch.tensor

A placeholder for a dispatch region output operand. This can be used to query the metadata about the tensor (such as its shape) as well as store to the backing tensor representation.
Types

ChannelType

A collective communication channel

Syntax: !flow.channel
Represents a single participant in a collective clique. Multiple channels may exist within the same program to allow for partial operations or hierarchical operations.
In programs that have already been partitioned prior to being compiled there will often exist only one channel and flow.channel.default can be used to reference it. In programs that model SPMD behavior internally, channels can be created or provided by hosting applications.
DummyType

Syntax: !flow.dummy