Developer overview

This guide provides an overview of IREE's project structure and main tools for developers.

Project code layout

/compiler/: MLIR dialects, LLVM compiler passes, module translation code, etc.
- bindings/: Python and other language bindings
/runtime/: Standalone runtime code including the VM and HAL drivers
- bindings/: Python and other language bindings
/integrations/: Integrations between IREE and other frameworks, such as TensorFlow
/tests/: Tests for full compiler->runtime workflows
/tools/: Developer tools (iree-compile, iree-run-module, etc.)
/samples/: Also see the separate https://github.com/iree-org/iree-experimental repository

IREE compiler code layout

API/: Public C API
Codegen/: Code generation for compute kernels
Dialect/: MLIR dialects (Flow, HAL, Stream, VM, etc.)
InputConversion/: Conversions from input dialects and preprocessing

IREE runtime code layout

base/: Common types and utilities used throughout the runtime
hal/: Hardware Abstraction Layer for IREE's runtime, with implementations for hardware and software backends
schemas/: Data storage format definitions, primarily using FlatBuffers
task/: System for running tasks across multiple CPU threads
tooling/: Utilities for tests and developer tools, not suitable for use as-is in downstream applications
vm/: Bytecode Virtual Machine used to work with IREE modules and invoke IREE functions

Developer tools

IREE's core compiler accepts programs in supported input MLIR dialects (e.g. stablehlo, tosa, linalg). Import tools and APIs may be used to convert from framework-specific formats like TensorFlow SavedModel to MLIR modules. While programs are ultimately compiled down to modules suitable for running on some combination of IREE's target deployment platforms, IREE's developer tools can run individual compiler passes, translations, and other transformations step by step.

iree-opt

iree-opt is a tool for testing IREE's compiler passes. It is similar to mlir-opt and runs sets of IREE's compiler passes on .mlir input files. See "conversion" in MLIR's Glossary for more information. Transformations performed by iree-opt can range from individual passes performing isolated manipulations to broad pipelines that encompass a sequence of steps.

Checked-in test .mlir files typically include a RUN block at the top of the file that specifies which passes should be performed and whether FileCheck should be used to test the generated output.
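To make the shape of a RUN block concrete, here is a minimal Python sketch that collects the `RUN:` lines from a test file and joins backslash continuations into full commands. It only approximates how a lit-style test runner reads these lines. The helper name `extract_run_commands` and the sample text are illustrative assumptions, not part of IREE.

```python
import re


def extract_run_commands(text):
    """Collect RUN: lines, joining lines that end in a backslash.

    Simplified approximation of how a lit-style runner reads RUN blocks;
    it does not expand substitutions such as %s or %t.
    """
    commands = []
    pending = ""
    for line in text.splitlines():
        match = re.search(r"RUN:\s*(.*)$", line)
        if not match:
            continue
        part = match.group(1).strip()
        if part.endswith("\\"):
            # Continuation: keep accumulating until a line without a backslash.
            pending += part[:-1].rstrip() + " "
        else:
            commands.append(pending + part)
            pending = ""
    return commands


# Hypothetical test-file header, written only for this example.
SAMPLE = """\
// RUN: iree-opt --split-input-file \\
// RUN:   --iree-util-drop-compiler-hints %s | \\
// RUN: FileCheck %s

// CHECK-LABEL: @example
"""

if __name__ == "__main__":
    for command in extract_run_commands(SAMPLE):
        print(command)
```

In this sample the three `RUN:` lines collapse into one pipeline that feeds the `iree-opt` output to FileCheck, while the `CHECK-LABEL` line is left for FileCheck itself to interpret.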

Here's an example of a small compiler pass running on a test file:

$ ../iree-build/tools/iree-opt \
  --split-input-file \
  --mlir-print-ir-before-all \
  --iree-util-drop-compiler-hints \
  $PWD/compiler/src/iree/compiler/Dialect/Util/Transforms/test/drop_compiler_hints.mlir

For a more complex example, here's how to run IREE's complete transformation pipeline targeting the VMVX backend on the fullyconnected.mlir model file:

$ ../iree-build/tools/iree-opt \
  --iree-transformation-pipeline \
  --iree-hal-target-device=local \
  --iree-hal-local-target-device-backends=vmvx \
  $PWD/tests/e2e/stablehlo_models/fullyconnected.mlir

iree-compile

iree-compile is IREE's main compiler driver for generating binaries from supported input MLIR assembly.

For example, to translate simple_abs.mlir to an IREE module:

$ ../iree-build/tools/iree-compile \
  --iree-hal-target-device=local \
  --iree-hal-local-target-device-backends=vmvx \
  $PWD/samples/models/simple_abs.mlir \
  -o /tmp/simple_abs_vmvx.vmfb

iree-run-module

The iree-run-module program takes an already translated IREE module as input and executes an exported function using the provided inputs.

This program can be used in sequence with iree-compile to translate a .mlir file to an IREE module and then execute it. Here is an example command that executes simple_abs_vmvx.vmfb, compiled from simple_abs.mlir above, on IREE's local-task CPU device:

$ ../iree-build/tools/iree-run-module \
  --module=/tmp/simple_abs_vmvx.vmfb \
  --device=local-task \
  --function=abs \
  --input=f32=-2
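The compile-then-run sequence above can also be scripted. The following is a minimal Python sketch that assembles the two argument lists using only the flags shown in this guide. The helper names `compile_command` and `run_command` and the `tools_dir` default are assumptions made for illustration; the sketch prints the commands rather than executing them.

```python
import shlex


def compile_command(source, output, backend="vmvx",
                    tools_dir="../iree-build/tools"):
    """Build the iree-compile argument list shown in this guide."""
    return [
        f"{tools_dir}/iree-compile",
        "--iree-hal-target-device=local",
        f"--iree-hal-local-target-device-backends={backend}",
        source,
        "-o", output,
    ]


def run_command(module, function, inputs, device="local-task",
                tools_dir="../iree-build/tools"):
    """Build the iree-run-module argument list, one --input per value."""
    args = [
        f"{tools_dir}/iree-run-module",
        f"--module={module}",
        f"--device={device}",
        f"--function={function}",
    ]
    args += [f"--input={value}" for value in inputs]
    return args


if __name__ == "__main__":
    compile_args = compile_command("samples/models/simple_abs.mlir",
                                   "/tmp/simple_abs_vmvx.vmfb")
    run_args = run_command("/tmp/simple_abs_vmvx.vmfb", "abs", ["f32=-2"])
    print(shlex.join(compile_args))
    print(shlex.join(run_args))
    # To actually execute (requires a local IREE build):
    #   import subprocess
    #   subprocess.run(compile_args, check=True)
    #   subprocess.run(run_args, check=True)
```

Keeping the arguments as lists avoids shell quoting problems when an input value contains spaces, such as `2xf32=-2 3`.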

Input scalars are passed as value and input buffers are passed as [shape]xtype=[value].

Input buffers may also be read from raw binary files or Numpy npy files.

MLIR type	Description	Input example
`i32`	Scalar	`--input=1234`
`tensor<i32>`	0-D tensor	`--input=i32=1234`
`tensor<1xi32>`	1-D tensor (shape [1])	`--input=1xi32=1234`
`tensor<2xi32>`	1-D tensor (shape [2])	`--input="2xi32=12 34"`
`tensor<2x3xi32>`	2-D tensor (shape [2, 3])	`--input="2x3xi32=[1 2 3][4 5 6]"`
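As a rough illustration of the formats in the table, this Python sketch builds the value that follows `--input=` from a shape, an element type, and a flat list of values. The function names `format_scalar` and `format_buffer` are hypothetical helpers, not IREE APIs. The sketch emits the flat, space-separated form (as in `--input="2x3xi32=1 2 3 4 5 6"` from the test files below) instead of the bracketed form.

```python
from math import prod


def format_scalar(value):
    """Scalars are passed as the bare value, with no type prefix."""
    return str(value)


def format_buffer(shape, dtype, values):
    """Format a buffer as [shape]xtype=[value].

    shape is a tuple of dimensions; an empty tuple denotes a 0-D tensor.
    This sketch requires every element to be given, even though the tools
    themselves fill omitted values with zeroes.
    """
    expected = prod(shape)  # prod(()) == 1, so a 0-D tensor holds one element
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    prefix = "x".join([*(str(dim) for dim in shape), dtype])
    return f"{prefix}=" + " ".join(str(v) for v in values)


if __name__ == "__main__":
    print("--input=" + format_scalar(1234))
    print("--input=" + format_buffer((), "i32", [1234]))
    print("--input=" + format_buffer((2,), "i32", [12, 34]))
    print("--input=" + format_buffer((2, 3), "i32", [1, 2, 3, 4, 5, 6]))
```

Values containing spaces still need quoting when typed into a shell, as in the table rows above.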

Other usage examples

See these test files for advanced usage examples:

Source file: tools/test/iree-run-module.mlir

// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-task --module=- --function=abs --input="2xf32=-2 3") | FileCheck %s
// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=llvm-cpu %s | \
// RUN:  iree-run-module --device=local-task --module=- --function=abs --input="2xf32=-2 3") | FileCheck %s

// CHECK-LABEL: EXEC @abs
func.func @abs(%input : tensor<2xf32>) -> (tensor<2xf32>) {
  %result = math.absf %input : tensor<2xf32>
  return %result : tensor<2xf32>
}
  // INPUT-BUFFERS: result[1]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: 2xf32=-2.0 3.0

Source file: tools/test/iree-run-module-inputs.mlir

// Passing no inputs is okay.

// RUN: (iree-compile %s | \
// RUN:  iree-run-module --module=- --function=no_input) | \
// RUN: FileCheck --check-prefix=NO-INPUT %s
// NO-INPUT-LABEL: EXEC @no_input
func.func @no_input() {
  return
}

// -----

// Scalars use the form `--input=value`. Type (float/int) should be omitted.
//   * The VM does not use i1/i8 types, so i32 VM types are returned instead.

// RUN: (iree-compile %s | \
// RUN:  iree-run-module --module=- \
// RUN:                  --function=scalars \
// RUN:                  --input=1 \
// RUN:                  --input=5 \
// RUN:                  --input=1234 \
// RUN:                  --input=-3.14) | \
// RUN: FileCheck --check-prefix=INPUT-SCALARS %s
// INPUT-SCALARS-LABEL: EXEC @scalars
func.func @scalars(%arg0: i1, %arg1: i64, %arg2: i32, %arg3: f32) -> (i1, i64, i32, f32) {
  // INPUT-SCALARS: result[0]: i32=1
  // INPUT-SCALARS: result[1]: i64=5
  // INPUT-SCALARS: result[2]: i32=1234
  // INPUT-SCALARS: result[3]: f32=-3.14
  return %arg0, %arg1, %arg2, %arg3 : i1, i64, i32, f32
}

// -----

// Buffers ("tensors") use the form `--input=[shape]xtype=[value]`.
//   * If any values are omitted, zeroes will be used.
//   * Quotes should be used around values with spaces.
//   * Brackets may also be used to separate element values.

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=- \
// RUN:                  --function=buffers \
// RUN:                  --input=i32=5 \
// RUN:                  --input=2xi32 \
// RUN:                  --input="2x3xi32=1 2 3 4 5 6") | \
// RUN: FileCheck --check-prefix=INPUT-BUFFERS %s
// INPUT-BUFFERS-LABEL: EXEC @buffers
func.func @buffers(%arg0: tensor<i32>, %arg1: tensor<2xi32>, %arg2: tensor<2x3xi32>) -> (tensor<i32>, tensor<2xi32>, tensor<2x3xi32>) {
  // INPUT-BUFFERS: result[0]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: i32=5
  // INPUT-BUFFERS: result[1]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: 2xi32=0 0
  // INPUT-BUFFERS: result[2]: hal.buffer_view
  // INPUT-BUFFERS-NEXT: 2x3xi32=[1 2 3][4 5 6]
  return %arg0, %arg1, %arg2 : tensor<i32>, tensor<2xi32>, tensor<2x3xi32>
}

// -----

// Buffer values can be read from binary files with `@some/file.bin`.
//   * numpy npy files from numpy.save or previous tooling output can be read to
//     provide 1+ values.
//   * Some data types may be converted (i32 -> si32 here) - bug?

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx \
// RUN:               -o=%t.vmfb %s && \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=%t.vmfb \
// RUN:                  --function=npy_round_trip \
// RUN:                  --input=2xi32=11,12 \
// RUN:                  --input=3xi32=1,2,3 \
// RUN:                  --output=@%t.npy \
// RUN:                  --output=+%t.npy && \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=%t.vmfb \
// RUN:                  --function=npy_round_trip \
// RUN:                  --input=*%t.npy) | \
// RUN: FileCheck --check-prefix=INPUT-NUMPY %s

// INPUT-NUMPY-LABEL: EXEC @npy_round_trip
func.func @npy_round_trip(%arg0: tensor<2xi32>, %arg1: tensor<3xi32>) -> (tensor<2xi32>, tensor<3xi32>) {
  // INPUT-NUMPY: result[0]: hal.buffer_view
  // INPUT-NUMPY-NEXT: 2xsi32=11 12
  // INPUT-NUMPY: result[1]: hal.buffer_view
  // INPUT-NUMPY-NEXT: 3xsi32=1 2 3
  return %arg0, %arg1 : tensor<2xi32>, tensor<3xi32>
}

Source file: tools/test/iree-run-module-outputs.mlir

// Tests that execution providing no outputs is ok.

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-sync --module=- --function=no_output) | \
// RUN: FileCheck --check-prefix=NO-OUTPUT %s
// NO-OUTPUT-LABEL: EXEC @no_output
func.func @no_output() {
  return
}

// -----

// Tests the default output printing to stdout.

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-sync --module=- --function=default) | \
// RUN: FileCheck --check-prefix=OUTPUT-DEFAULT %s
// OUTPUT-DEFAULT-LABEL: EXEC @default
func.func @default() -> (i32, tensor<f32>, tensor<?x4xi32>) {
  // OUTPUT-DEFAULT: result[0]: i32=123
  %0 = arith.constant 123 : i32
  // OUTPUT-DEFAULT: result[1]: hal.buffer_view
  // OUTPUT-DEFAULT-NEXT: f32=4
  %1 = arith.constant dense<4.0> : tensor<f32>
  // OUTPUT-DEFAULT: result[2]: hal.buffer_view
  // OUTPUT-DEFAULT-NEXT: 2x4xi32=[0 1 2 3][4 5 6 7]
  %2 = flow.tensor.dynamic_constant dense<[[0,1,2,3],[4,5,6,7]]> : tensor<2x4xi32> -> tensor<?x4xi32>
  return %0, %1, %2 : i32, tensor<f32>, tensor<?x4xi32>
}

// -----

// Tests explicit output to npy files by producing a concatenated .npy and then
// printing the results in python. This also verifies our npy files can be
// parsed by numpy.

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-sync --module=- --function=numpy \
// RUN:                  --output= \
// RUN:                  --output=@%t.npy \
// RUN:                  --output=+%t.npy) && \
// RUN:  "%PYTHON" %S/echo_npy.py %t.npy | \
// RUN: FileCheck --check-prefix=OUTPUT-NUMPY %s
func.func @numpy() -> (i32, tensor<f32>, tensor<?x4xi32>) {
  // Output skipped:
  %0 = arith.constant 123 : i32
  // OUTPUT-NUMPY{LITERAL}: 4.0
  %1 = arith.constant dense<4.0> : tensor<f32>
  // OUTPUT-NUMPY-NEXT{LITERAL}: [[0 1 2 3]
  // OUTPUT-NUMPY-NEXT{LITERAL}:  [4 5 6 7]]
  %2 = flow.tensor.dynamic_constant dense<[[0,1,2,3],[4,5,6,7]]> : tensor<2x4xi32> -> tensor<?x4xi32>
  return %0, %1, %2 : i32, tensor<f32>, tensor<?x4xi32>
}

// -----

// Tests output to binary files by round-tripping the output of a function into
// another invocation reading from the binary files. Each output is written to
// its own file (optimal for alignment/easier to inspect).

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx %s -o=%t.vmfb && \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=%t.vmfb \
// RUN:                  --function=write_binary \
// RUN:                  --output=@%t.0.bin \
// RUN:                  --output=@%t.1.bin && \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=%t.vmfb \
// RUN:                  --function=echo_binary \
// RUN:                  --input=f32=@%t.0.bin \
// RUN:                  --input=2x4xi32=@%t.1.bin) | \
// RUN: FileCheck --check-prefix=OUTPUT-BINARY %s

// Tests output to binary files by round-tripping the output of a function into
// another invocation reading from the binary files. The values are appended to
// a single file and read from the single file.

// RUN: (iree-compile --iree-hal-target-device=local \
// RUN:               --iree-hal-local-target-device-backends=vmvx \
// RUN:               -o=%t.vmfb %s && \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=%t.vmfb \
// RUN:                  --function=write_binary \
// RUN:                  --output=@%t.bin \
// RUN:                  --output=+%t.bin && \
// RUN:  iree-run-module --device=local-sync \
// RUN:                  --module=%t.vmfb \
// RUN:                  --function=echo_binary \
// RUN:                  --input=f32=@%t.bin \
// RUN:                  --input=2x4xi32=+%t.bin) | \
// RUN: FileCheck --check-prefix=OUTPUT-BINARY %s

func.func @write_binary() -> (tensor<f32>, tensor<?x4xi32>) {
  %0 = arith.constant dense<4.0> : tensor<f32>
  %1 = flow.tensor.dynamic_constant dense<[[0,1,2,3],[4,5,6,7]]> : tensor<2x4xi32> -> tensor<?x4xi32>
  return %0, %1 : tensor<f32>, tensor<?x4xi32>
}
func.func @echo_binary(%arg0: tensor<f32>, %arg1: tensor<?x4xi32>) -> (tensor<f32>, tensor<?x4xi32>) {
  // OUTPUT-BINARY{LITERAL}: f32=4
  // OUTPUT-BINARY{LITERAL}: 2x4xi32=[0 1 2 3][4 5 6 7]
  return %arg0, %arg1 : tensor<f32>, tensor<?x4xi32>
}

Source file: tools/test/iree-run-module-expected.mlir

// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=f32=2.0) | \
// RUN:  FileCheck %s --check-prefix=SUCCESS-MATCHES
// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output="(ignored)") | \
// RUN:  FileCheck %s --check-prefix=SUCCESS-IGNORED
// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=f32=2.1 --expected_f32_threshold=0.1) | \
// RUN:  FileCheck %s --check-prefix=SUCCESS-THRESHOLD
// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  not iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=123 --expected_output=f32=2.0) | \
// RUN:  FileCheck %s --check-prefix=FAILED-FIRST
// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  not iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=f32=4.5) | \
// RUN:  FileCheck %s --check-prefix=FAILED-SECOND
// RUN: (iree-compile --iree-hal-target-device=local --iree-hal-local-target-device-backends=vmvx %s | \
// RUN:  not iree-run-module --device=local-task --module=- --function=abs --input=f32=-2 --expected_output=f32=-2 --expected_output=4xf32=2.0) | \
// RUN:  FileCheck %s --check-prefix=FAILED-SHAPE

// SUCCESS-MATCHES: [SUCCESS]
// SUCCESS-THRESHOLD: [SUCCESS]
// SUCCESS-IGNORED: [SUCCESS]
// FAILED-FIRST: [FAILED] result[0]: element at index 0 (-2) does not match the expected (123)
// FAILED-SECOND: [FAILED] result[1]: element at index 0 (2) does not match the expected (4.5)
// FAILED-SHAPE: [FAILED] result[1]: metadata is f32; expected that the view matches 4xf32

func.func @abs(%input: tensor<f32>) -> (tensor<f32>, tensor<f32>) {
  %result = math.absf %input : tensor<f32>
  return %input, %result : tensor<f32>, tensor<f32>
}

iree-check-module

The iree-check-module program takes an already translated IREE module as input and executes it as a series of googletest tests. This is the test runner for the IREE check framework.

$ ../iree-build/tools/iree-compile \
  --iree-input-type=stablehlo \
  --iree-hal-target-device=local \
  --iree-hal-local-target-device-backends=vmvx \
  $PWD/tests/e2e/stablehlo_ops/abs.mlir \
  -o /tmp/abs.vmfb

$ ../iree-build/tools/iree-check-module \
  --device=local-task \
  --module=/tmp/abs.vmfb

iree-run-mlir

The iree-run-mlir program takes a .mlir file as input, translates it to an IREE bytecode module, and executes the module.

It is designed for testing and debugging, not production use, and therefore does some additional work that usually must be explicit, like marking every function as exported by default and running all of them.

For example, to execute the contents of samples/models/simple_abs.mlir:

# iree-run-mlir <compiler flags> [input.mlir] <runtime flags>
$ ../iree-build/tools/iree-run-mlir \
  --iree-hal-target-device=local \
  --iree-hal-local-target-device-backends=vmvx \
  $PWD/samples/models/simple_abs.mlir \
  --input=f32=-2

iree-dump-module

The iree-dump-module program prints the contents of an IREE module FlatBuffer file.

For example, to inspect the module translated above:

$ ../iree-build/tools/iree-dump-module /tmp/simple_abs_vmvx.vmfb

Useful generic flags

Read inputs from a file

All the IREE tools support reading input values from a file. This is quite useful for debugging. Use --help with each tool to see which flag to set. The inputs are expected to be newline-separated. Each input should be either a scalar or a buffer. Scalars should be in the format type=value and buffers should be in the format [shape]xtype=[value]. For example:

1x5xf32=1,-2,-3,4,-5
1x5x3x1xf32=15,14,13,12,11,10,9,8,7,6,5,4,3,2,1
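A file in this format can be generated from a script. The sketch below writes one input per line using Python's standard library. The helper name `write_inputs_file` and the output path are assumptions for illustration. The flag that consumes the file is deliberately left out, since it should be looked up with --help for the tool in question.

```python
import os
import tempfile


def write_inputs_file(path, entries):
    """Write newline-separated inputs, one scalar or buffer per line."""
    for entry in entries:
        if "\n" in entry:
            raise ValueError("each input must fit on a single line")
    with open(path, "w", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(entry + "\n")


# The two buffer inputs from the example above.
ENTRIES = [
    "1x5xf32=1,-2,-3,4,-5",
    "1x5x3x1xf32=15,14,13,12,11,10,9,8,7,6,5,4,3,2,1",
]

if __name__ == "__main__":
    # Hypothetical location; any writable path works.
    path = os.path.join(tempfile.gettempdir(), "iree_inputs.txt")
    write_inputs_file(path, ENTRIES)
    with open(path, encoding="utf-8") as handle:
        print(handle.read(), end="")
```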

`--iree-flow-trace-dispatch-tensors`

This flag enables tracing of the inputs and outputs of each dispatch function. Because IREE breaks an ML workload into multiple dispatch functions, this makes it easier to narrow down test cases. When the flag is on, IREE inserts trace points before and after each dispatch function. The first trace op is for inputs, and the second trace op is for outputs, so each dispatch function produces two events.