Developer overview
This guide provides an overview of IREE's project structure and main tools for
developers.
Project code layout
IREE compiler code layout
API/: Public C API
Codegen/: Code generation for compute kernels
Dialect/: MLIR dialects (Flow, HAL, Stream, VM, etc.)
InputConversion/: Conversions from input dialects and preprocessing
IREE runtime code layout
base/: Common types and utilities used throughout the runtime
hal/: Hardware Abstraction Layer for IREE's runtime, with implementations for
hardware and software backends
schemas/: Data storage format definitions, primarily using FlatBuffers
task/: System for running tasks across multiple CPU threads
tooling/: Utilities for tests and developer tools, not suitable for use as-is in
downstream applications
vm/: Bytecode Virtual Machine used to work with IREE modules and invoke IREE
functions
IREE's core compiler accepts programs in supported input MLIR dialects (e.g.
stablehlo, tosa, linalg). Import tools and APIs may be used to convert from
framework-specific formats like TensorFlow SavedModel to MLIR modules.
While programs are ultimately compiled down to modules suitable for running on
some combination of IREE's target deployment platforms, IREE's developer tools
can run individual compiler passes, translations, and other transformations step
by step.
iree-opt
iree-opt is a tool for testing IREE's compiler passes. It is similar to mlir-opt
and runs sets of IREE's compiler passes on .mlir input files. See "conversion"
in MLIR's Glossary for more information. Transformations performed by iree-opt
can range from individual passes performing isolated manipulations to broad
pipelines that encompass a sequence of steps.
Checked-in test .mlir files typically include a RUN block at the top of the
file that specifies which passes should be run and whether FileCheck should be
used to test the generated output.
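As a rough illustration of how such a RUN block maps to commands, the sketch below collects the `// RUN:` lines from a test file, joins lines that end in a backslash, and substitutes the test file's path for `%s`. It is a simplified stand-in for what the lit test runner does, not its implementation. The function name `extract_run_commands` and the sample text are hypothetical, and other substitutions such as `%t` are not handled.

```python
import re


def extract_run_commands(mlir_text, source_path):
    """Collect `// RUN:` lines, join backslash continuations, expand %s."""
    commands = []
    pending = ""
    for line in mlir_text.splitlines():
        match = re.match(r"\s*//\s*RUN:\s?(.*)$", line)
        if not match:
            continue
        piece = match.group(1).strip()
        if piece.endswith("\\"):
            # A trailing backslash continues the command on the next RUN line.
            pending += piece[:-1].rstrip() + " "
            continue
        commands.append((pending + piece).replace("%s", source_path))
        pending = ""
    return commands


# Hypothetical test file header, written only for this illustration.
example = """\
// RUN: iree-opt --split-input-file \\
// RUN:   --iree-util-drop-compiler-hints %s | FileCheck %s
// CHECK-LABEL: @example
"""

for command in extract_run_commands(example, "drop_compiler_hints.mlir"):
    print(command)
```

Printing the commands this way can help when reproducing a failing test by hand outside the test runner.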
Here's an example of a small compiler pass running on a test file:
$ ../iree-build/tools/iree-opt \
  --split-input-file \
  --mlir-print-ir-before-all \
  --iree-util-drop-compiler-hints \
  $PWD/compiler/src/iree/compiler/Dialect/Util/Transforms/test/drop_compiler_hints.mlir
For a more complex example, here's how to run IREE's complete transformation
pipeline targeting the VMVX backend on the fullyconnected.mlir model file:
$ ../iree-build/tools/iree-opt \
  --iree-transformation-pipeline \
  --iree-hal-target-backends=vmvx \
  $PWD/tests/e2e/stablehlo_models/fullyconnected.mlir
iree-compile
iree-compile is IREE's main compiler driver for generating binaries from
supported input MLIR assembly.
For example, to translate simple_abs.mlir to an IREE module:
$ ../iree-build/tools/iree-compile \
  --iree-hal-target-backends=vmvx \
  $PWD/samples/models/simple_abs.mlir \
  -o /tmp/simple_abs_vmvx.vmfb
iree-run-module
The iree-run-module program takes an already translated IREE module as input
and executes an exported function using the provided inputs.
This program can be used in sequence with iree-compile to translate a .mlir
file to an IREE module and then execute it. Here is an example command that
executes the simple_abs_vmvx.vmfb module compiled from simple_abs.mlir above on
IREE's local-task CPU device:
$ ../iree-build/tools/iree-run-module \
  --module=/tmp/simple_abs_vmvx.vmfb \
  --device=local-task \
  --function=abs \
  --input=f32=-2
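The compile-then-run sequence above can also be scripted. This is a minimal sketch that only assembles the two argument lists, using the flags and paths shown in this guide. The helper names are hypothetical, and the `subprocess.run` call is left commented out because it needs a local IREE build.

```python
import shlex

TOOLS_DIR = "../iree-build/tools"  # Build directory used throughout this guide.


def compile_command(source, output, backend="vmvx"):
    """Argument list for compiling a .mlir file to an IREE module."""
    return [
        f"{TOOLS_DIR}/iree-compile",
        f"--iree-hal-target-backends={backend}",
        source,
        "-o",
        output,
    ]


def run_command(module, function, inputs, device="local-task"):
    """Argument list for invoking one exported function of a module."""
    args = [
        f"{TOOLS_DIR}/iree-run-module",
        f"--module={module}",
        f"--device={device}",
        f"--function={function}",
    ]
    args += [f"--input={value}" for value in inputs]
    return args


steps = [
    compile_command("samples/models/simple_abs.mlir", "/tmp/simple_abs_vmvx.vmfb"),
    run_command("/tmp/simple_abs_vmvx.vmfb", "abs", ["f32=-2"]),
]
for step in steps:
    print(shlex.join(step))
    # With a local build available (requires `import subprocess`):
    # subprocess.run(step, check=True)
```

Passing the arguments as a list avoids shell quoting problems for inputs that contain spaces.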
Input scalars are passed as value and input buffers are passed as
[shape]xtype=[value].
Input buffers may also be read from raw binary files or NumPy npy files.
MLIR type        Description                Input example
i32              Scalar                     --input=1234
tensor<i32>      0-D tensor                 --input=i32=1234
tensor<1xi32>    1-D tensor (shape [1])     --input=1xi32=1234
tensor<2xi32>    1-D tensor (shape [2])     --input="2xi32=12 34"
tensor<2x3xi32>  2-D tensor (shape [2, 3])  --input="2x3xi32=[1 2 3][4 5 6]"
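When the values come from a script, the flag strings in the table can be generated instead of typed by hand. The sketch below is one possible way to do that. `scalar_input` and `buffer_input` are hypothetical helpers, and the buffer values are written in the flat, space-separated form rather than with brackets.

```python
from math import prod


def scalar_input(value):
    """Scalars are passed as a bare value, without a type prefix."""
    return f"--input={value}"


def buffer_input(shape, element_type, values):
    """Format a tensor argument as `--input=[shape]xtype=[value]`."""
    if len(values) != prod(shape):
        raise ValueError(f"shape {shape} needs {prod(shape)} values")
    dims = "x".join(str(d) for d in shape)
    # A 0-D tensor has no dimensions, so only the element type remains.
    prefix = f"{dims}x{element_type}" if shape else element_type
    body = " ".join(str(v) for v in values)
    return f"--input={prefix}={body}"


print(scalar_input(1234))
print(buffer_input((), "i32", [1234]))
print(buffer_input((2,), "i32", [12, 34]))
print(buffer_input((2, 3), "i32", [1, 2, 3, 4, 5, 6]))
```

The generated strings contain spaces, so they need quoting when pasted into a shell but not when passed as one element of an argument list.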
Other usage examples
See these test files for advanced usage examples:
Source file: tools/test/iree-run-module.mlir
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | iree-run-module --device=local-task --module=- --function=abs --input="2xf32=-2 3") | FileCheck %s
// RUN: (iree-compile --iree-hal-target-backends=llvm-cpu %s | iree-run-module --device=local-task --module=- --function=abs --input="2xf32=-2 3") | FileCheck %s
// CHECK-LABEL: EXEC @abs
func.func @abs(%input : tensor<2xf32>) -> (tensor<2xf32>) {
  %result = math.absf %input : tensor<2xf32>
  return %result : tensor<2xf32>
}
// INPUT-BUFFERS: result[1]: hal.buffer_view
// INPUT-BUFFERS-NEXT: 2xf32=-2.0 3.0
Source file: tools/test/iree-run-module-inputs.mlir
// Passing no inputs is okay.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | \
// RUN: iree-run-module --device=local-sync --module=- --function=no_input) | \
// RUN: FileCheck --check-prefix=NO-INPUT %s
// NO-INPUT-LABEL: EXEC @no_input
func.func @no_input() {
  return
}
// -----
// Scalars use the form `--input=value`. Type (float/int) should be omitted.
// * The VM does not use i1/i8 types, so i32 VM types are returned instead.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=- \
// RUN: --function=scalars \
// RUN: --input=1 \
// RUN: --input=5 \
// RUN: --input=1234 \
// RUN: --input=-3.14) | \
// RUN: FileCheck --check-prefix=INPUT-SCALARS %s
// INPUT-SCALARS-LABEL: EXEC @scalars
func.func @scalars(%arg0: i1, %arg1: i8, %arg2: i32, %arg3: f32) -> (i1, i8, i32, f32) {
  // INPUT-SCALARS: result[0]: i32=1
  // INPUT-SCALARS: result[1]: i32=5
  // INPUT-SCALARS: result[2]: i32=1234
  // INPUT-SCALARS: result[3]: f32=-3.14
  return %arg0, %arg1, %arg2, %arg3 : i1, i8, i32, f32
}
// -----
// Buffers ("tensors") use the form `--input=[shape]xtype=[value]`.
// * If any values are omitted, zeroes will be used.
// * Quotes should be used around values with spaces.
// * Brackets may also be used to separate element values.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=- \
// RUN: --function=buffers \
// RUN: --input=i32=5 \
// RUN: --input=2xi32 \
// RUN: --input="2x3xi32=1 2 3 4 5 6") | \
// RUN: FileCheck --check-prefix=INPUT-BUFFERS %s
// INPUT-BUFFERS-LABEL: EXEC @buffers
func.func @buffers(%arg0: tensor<i32>, %arg1: tensor<2xi32>, %arg2: tensor<2x3xi32>) -> (tensor<i32>, tensor<2xi32>, tensor<2x3xi32>) {
  // INPUT-BUFFERS: result[0]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: i32=5
  // INPUT-BUFFERS: result[1]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: 2xi32=0 0
  // INPUT-BUFFERS: result[2]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: 2x3xi32=[1 2 3][4 5 6]
  return %arg0, %arg1, %arg2 : tensor<i32>, tensor<2xi32>, tensor<2x3xi32>
}
// -----
// Buffer values can be read from binary files with `@some/file.bin`.
// * numpy npy files from numpy.save or previous tooling output can be read to
// provide 1+ values.
// * Some data types may be converted (i32 -> si32 here) - bug?
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s -o=%t.vmfb && \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=%t.vmfb \
// RUN: --function=npy_round_trip \
// RUN: --input=2xi32=11,12 \
// RUN: --input=3xi32=1,2,3 \
// RUN: --output=@%t.npy \
// RUN: --output=+%t.npy && \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=%t.vmfb \
// RUN: --function=npy_round_trip \
// RUN: --input=*%t.npy) | \
// RUN: FileCheck --check-prefix=INPUT-NUMPY %s
// INPUT-NUMPY-LABEL: EXEC @npy_round_trip
func.func @npy_round_trip(%arg0: tensor<2xi32>, %arg1: tensor<3xi32>) -> (tensor<2xi32>, tensor<3xi32>) {
  // INPUT-NUMPY: result[0]: hal.buffer_view
  // INPUT-NUMPY-NEXT: 2xsi32=11 12
  // INPUT-NUMPY: result[1]: hal.buffer_view
  // INPUT-NUMPY-NEXT: 3xsi32=1 2 3
  return %arg0, %arg1 : tensor<2xi32>, tensor<3xi32>
}
Source file: tools/test/iree-run-module-outputs.mlir
// Tests that execution providing no outputs is ok.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | \
// RUN: iree-run-module --device=local-sync --module=- --function=no_output) | \
// RUN: FileCheck --check-prefix=NO-OUTPUT %s
// NO-OUTPUT-LABEL: EXEC @no_output
func.func @no_output() {
  return
}
// -----
// Tests the default output printing to stdout.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | \
// RUN: iree-run-module --device=local-sync --module=- --function=default) | \
// RUN: FileCheck --check-prefix=OUTPUT-DEFAULT %s
// OUTPUT-DEFAULT-LABEL: EXEC @default
func.func @default() -> (i32, tensor<f32>, tensor<?x4xi32>) {
  // OUTPUT-DEFAULT: result[0]: i32=123
  %0 = arith.constant 123 : i32
  // OUTPUT-DEFAULT: result[1]: hal.buffer_view
  // OUTPUT-DEFAULT-NEXT: f32=4
  %1 = arith.constant dense<4.0> : tensor<f32>
  // OUTPUT-DEFAULT: result[2]: hal.buffer_view
  // OUTPUT-DEFAULT-NEXT: 2x4xi32=[0 1 2 3][4 5 6 7]
  %2 = flow.tensor.dynamic_constant dense<[[0,1,2,3],[4,5,6,7]]> : tensor<2x4xi32> -> tensor<?x4xi32>
  return %0, %1, %2 : i32, tensor<f32>, tensor<?x4xi32>
}
// -----
// Tests explicit output to npy files by producing a concatenated .npy and then
// printing the results in python. This also verifies our npy files can be
// parsed by numpy.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | \
// RUN: iree-run-module --device=local-sync --module=- --function=numpy \
// RUN: --output= \
// RUN: --output=@%t.npy \
// RUN: --output=+%t.npy) && \
// RUN: "%PYTHON" %S/echo_npy.py %t.npy | \
// RUN: FileCheck --check-prefix=OUTPUT-NUMPY %s
func.func @numpy() -> (i32, tensor<f32>, tensor<?x4xi32>) {
  // Output skipped:
  %0 = arith.constant 123 : i32
  // OUTPUT-NUMPY{LITERAL}: 4.0
  %1 = arith.constant dense<4.0> : tensor<f32>
  // OUTPUT-NUMPY-NEXT{LITERAL}: [[0 1 2 3]
  // OUTPUT-NUMPY-NEXT{LITERAL}: [4 5 6 7]]
  %2 = flow.tensor.dynamic_constant dense<[[0,1,2,3],[4,5,6,7]]> : tensor<2x4xi32> -> tensor<?x4xi32>
  return %0, %1, %2 : i32, tensor<f32>, tensor<?x4xi32>
}
// -----
// Tests output to binary files by round-tripping the output of a function into
// another invocation reading from the binary files. Each output is written to
// its own file (optimal for alignment/easier to inspect).
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s -o=%t.vmfb && \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=%t.vmfb \
// RUN: --function=write_binary \
// RUN: --output=@%t.0.bin \
// RUN: --output=@%t.1.bin && \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=%t.vmfb \
// RUN: --function=echo_binary \
// RUN: --input=f32=@%t.0.bin \
// RUN: --input=2x4xi32=@%t.1.bin) | \
// RUN: FileCheck --check-prefix=OUTPUT-BINARY %s
// Tests output to binary files by round-tripping the output of a function into
// another invocation reading from the binary files. The values are appended to
// a single file and read from the single file.
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s -o=%t.vmfb && \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=%t.vmfb \
// RUN: --function=write_binary \
// RUN: --output=@%t.bin \
// RUN: --output=+%t.bin && \
// RUN: iree-run-module --device=local-sync \
// RUN: --module=%t.vmfb \
// RUN: --function=echo_binary \
// RUN: --input=f32=@%t.bin \
// RUN: --input=2x4xi32=+%t.bin) | \
// RUN: FileCheck --check-prefix=OUTPUT-BINARY %s
func.func @write_binary() -> (tensor<f32>, tensor<?x4xi32>) {
  %0 = arith.constant dense<4.0> : tensor<f32>
  %1 = flow.tensor.dynamic_constant dense<[[0,1,2,3],[4,5,6,7]]> : tensor<2x4xi32> -> tensor<?x4xi32>
  return %0, %1 : tensor<f32>, tensor<?x4xi32>
}
func.func @echo_binary(%arg0: tensor<f32>, %arg1: tensor<?x4xi32>) -> (tensor<f32>, tensor<?x4xi32>) {
  // OUTPUT-BINARY{LITERAL}: f32=4
  // OUTPUT-BINARY{LITERAL}: 2x4xi32=[0 1 2 3][4 5 6 7]
  return %arg0, %arg1 : tensor<f32>, tensor<?x4xi32>
}
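The tests above pass raw binary files to --input using the `[shape]xtype=@path` form. As a hedged sketch of producing such a file outside of iree-run-module, the snippet below writes eight 32-bit integers with Python's `struct` module. It assumes the file holds tightly packed little-endian elements in row-major order with no header. That layout is an assumption rather than something this guide states, and `write_raw_i32` is a hypothetical helper.

```python
import os
import struct
import tempfile


def write_raw_i32(path, values):
    """Write values as packed 32-bit signed integers with no header."""
    with open(path, "wb") as f:
        # "<" selects little-endian; assumed to match the host running IREE.
        f.write(struct.pack(f"<{len(values)}i", *values))


path = os.path.join(tempfile.mkdtemp(), "arg1.bin")
write_raw_i32(path, [0, 1, 2, 3, 4, 5, 6, 7])

# The shape and element type are supplied on the command line because the raw
# file carries no metadata of its own.
flag = f"--input=2x4xi32=@{path}"
print(flag)
print(os.path.getsize(path), "bytes")
```

If the layout assumption does not hold for a given target, writing the file with iree-run-module's own --output flag, as the tests do, is the safer route.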
Source file: tools/test/iree-run-module-expected.mlir
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=f32=2.0) | FileCheck %s --check-prefix=SUCCESS-MATCHES
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output="(ignored)") | FileCheck %s --check-prefix=SUCCESS-IGNORED
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=f32=2.1 --expected_f32_threshold=0.1) | FileCheck %s --check-prefix=SUCCESS-THRESHOLD
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | not iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=123 --expected_output=f32=2.0) | FileCheck %s --check-prefix=FAILED-FIRST
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | not iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=f32=4.5) | FileCheck %s --check-prefix=FAILED-SECOND
// RUN: (iree-compile --iree-hal-target-backends=vmvx %s | not iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=4xf32=2.0) | FileCheck %s --check-prefix=FAILED-SHAPE
// SUCCESS-MATCHES: [SUCCESS]
// SUCCESS-THRESHOLD: [SUCCESS]
// SUCCESS-IGNORED: [SUCCESS]
// FAILED-FIRST: [FAILED] result[0]: element at index 0 (-2) does not match the expected (123)
// FAILED-SECOND: [FAILED] result[1]: element at index 0 (2) does not match the expected (4.5)
// FAILED-SHAPE: [FAILED] result[1]: metadata is f32; expected that the view matches 4xf32
func.func @abs(%input : tensor<f32>) -> (tensor<f32>, tensor<f32>) {
  %result = math.absf %input : tensor<f32>
  return %input, %result : tensor<f32>, tensor<f32>
}
iree-check-module
The iree-check-module program takes an already translated IREE module as input
and executes it as a series of googletest tests. This is the test runner for
the IREE check framework.
$ ../iree-build/tools/iree-compile \
  --iree-input-type=stablehlo \
  --iree-hal-target-backends=vmvx \
  $PWD/tests/e2e/stablehlo_ops/abs.mlir \
  -o /tmp/abs.vmfb
$ ../iree-build/tools/iree-check-module \
  --device=local-task \
  --module=/tmp/abs.vmfb
iree-run-mlir
The iree-run-mlir program takes a .mlir file as input, translates it to an IREE
bytecode module, and executes the module.
It is designed for testing and debugging, not production use, and therefore
does some additional work that usually must be explicit, like marking every
function as exported by default and running all of them.
For example, to execute the contents of samples/models/simple_abs.mlir:
# iree-run-mlir <compiler flags> [input.mlir] <runtime flags>
$ ../iree-build/tools/iree-run-mlir \
  --iree-hal-target-backends=vmvx \
  $PWD/samples/models/simple_abs.mlir \
  --input=f32=-2
iree-dump-module
The iree-dump-module program prints the contents of an IREE module FlatBuffer
file.
For example, to inspect the module translated above:
$ ../iree-build/tools/iree-dump-module /tmp/simple_abs_vmvx.vmfb
Useful generic flags
All the IREE tools support reading input values from a file. This is quite
useful for debugging. Use --help with each tool to see which flag to set.
The inputs are expected to be newline-separated. Each input should be either a
scalar or a buffer. Scalars should be in the format type=value and buffers
should be in the format [shape]xtype=[value]. For example:
1x5xf32=1,-2,-3,4,-5
1x5x3x1xf32=15,14,13,12,11,10,9,8,7,6,5,4,3,2,1
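To check such a file before handing it to a tool, each line can be split into its shape, element type, and values. The parser below is a minimal sketch of the `[shape]xtype=[value]` format as described here, not the parser IREE itself uses. `parse_input_line` is a hypothetical name, and only element types that start with `f` or `i` and contain no `x` are handled.

```python
import re
from math import prod


def parse_input_line(line):
    """Split `[shape]xtype=[value]` into (shape, element type, values)."""
    spec, _, body = line.strip().partition("=")
    # Everything before the last "x" is a dimension; the rest is the type.
    *dims, element_type = spec.split("x")
    shape = tuple(int(d) for d in dims)
    # Values may be separated by commas, spaces, or brackets.
    tokens = [t for t in re.split(r"[\s,\[\]]+", body) if t]
    convert = float if element_type.startswith("f") else int
    values = [convert(t) for t in tokens]
    if len(values) > prod(shape):
        raise ValueError(f"{line!r}: too many values for shape {shape}")
    return shape, element_type, values


for text in ("1x5xf32=1,-2,-3,4,-5", "2x3xi32=[1 2 3][4 5 6]", "f32=-2"):
    print(parse_input_line(text))
```

Only an excess of values is rejected, since the test files above show that omitted values are accepted and filled with zeroes.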
--iree-flow-trace-dispatch-tensors
This flag enables tracing of the inputs and outputs of each dispatch function.
Because IREE breaks an ML workload into multiple dispatch functions, this makes
it easier to narrow down test cases. When the flag is on, IREE inserts trace
points before and after each dispatch function: the first trace op records the
inputs and the second records the outputs, so each dispatch function produces
two events.