C API bindings

Overview

The IREE compiler and IREE runtime both have their own C/C++ APIs. This page introduces the available APIs and describes how to use them from your applications.

Note

There are multiple ways to distribute and depend on C/C++ projects, each with varying levels of portability, flexibility, and toolchain compatibility. IREE aims to support common configurations and platforms.

Compiler API

The IREE compiler is structured as a monolithic shared object with a dynamic plugin system allowing for extensions. The shared object exports symbols for versioned API functions.

graph TD
  accTitle: IREE compiler linkage model diagram
  accDescr {
    The libIREECompiler.so or IREECompiler.dll shared object contains pipelines,
    target backends, and general passes as private implementation details.
    Compiler plugins interface with the compiler shared object to extend it with
    custom targets, dialects, etc.
    Applications interface with the compiler shared object through the compiler
    C API's exported symbols.
  }

  subgraph compiler[libIREECompiler.so / IREECompiler.dll]
    pipelines("Pipelines

    • Flow
    • Stream
    • etc.")

    targets("Target backends

    • llvm-cpu
    • vulkan-spirv
    • etc.")

    passes("General passes

    • Const eval
    • DCE
    • etc.")
  end

  plugins("Compiler plugins

    • Custom targets
    • Custom dialects
    • etc.")

  application(Your application)

  compiler <-- "Plugin API<br>(static or dynamic linking)" --> plugins
  compiler -. "Compiler C API<br>(exported symbols)" .-> application

API definitions can be found in the following locations:

| Source location | Overview |
| --- | --- |
| iree/compiler/embedding_api.h | Top-level IREE compiler embedding API |
| iree/compiler/PluginAPI/ directory | IREE compiler plugin API |
| mlir/include/mlir-c/ directory | MLIR C API headers |

Concepts

The compiler API is centered around running pipelines to translate inputs to artifacts. These are modeled via sessions, invocations, sources, and outputs.

stateDiagram-v2
  accTitle: IREE compiler session and invocation state diagram
  accDescr {
    Input files are opened (or buffers are wrapped) as sources in a session.
    Sources are parsed into invocations, which run pipelines.
    Output files are written (or buffers are mapped) for compilation artifacts.
    Sessions can contain multiple sources and run multiple invocations.
  }

  direction LR
  InputFile --> Source1 : open file
  InputBuffer --> Source2 : wrap buffer

  state Session {
    Source1 --> Invocation1
    Source2 --> Invocation2
    Invocation1 --> Invocation1 : run pipeline
    Invocation2 --> Invocation2 : run pipeline
  }

  Invocation1 --> Output1File   : write file
  Invocation1 --> Output1Buffer : map memory
  Invocation2 --> Output2Buffer : map memory

Sessions

A session (iree_compiler_session_t) is a scope where one or more invocations can run.

  • Internally, sessions consist of an MLIRContext and a private set of options.
  • Sessions may activate available plugins based on their options.

Invocations

An invocation (iree_compiler_invocation_t) is a discrete run of the compiler.

  • Invocations run pipelines, consisting of passes, to translate from sources to outputs.

Sources

A source (iree_compiler_source_t) represents an input program, including operations and data.

  • Sources may refer to files or buffers in memory.

Outputs

An output (iree_compiler_output_t) represents a compilation artifact.

  • Outputs can be standalone files or more advanced streams.

Plugins

A plugin extends the compiler with some combination of target backends, options, passes, or pipelines. For documentation on compiler plugins, see compiler/PluginAPI/README.md.

Usage

This snippet shows the general layout of the API. For working examples, see the samples below.

To build a custom tool using the compiler API:

CMakeLists.txt
set(_IREE_COMPILER_API "${_IREE_COMPILER_ROOT}/bindings/c/iree/compiler")
target_include_directories(${_NAME} SYSTEM PRIVATE ${_IREE_COMPILER_API})
target_link_libraries(${_NAME} iree_compiler_bindings_c_loader)
iree_compiler_demo.c
#include <iree/compiler/embedding_api.h>
#include <iree/compiler/loader.h>

int main(int argc, char** argv) {
  // Load the compiler library then initialize it.
  ireeCompilerLoadLibrary("libIREECompiler.so");
  ireeCompilerGlobalInitialize();

  // Create a session to track compiler state and set flags.
  iree_compiler_session_t *session = ireeCompilerSessionCreate();
  ireeCompilerSessionSetFlags(session, argc, argv);

  // Open a file as an input source to the compiler.
  iree_compiler_source_t *source = NULL;
  ireeCompilerSourceOpenFile(session, "input.mlir", &source);

  // Use an invocation to compile from the input source to one or more outputs.
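  iree_compiler_invocation_t *inv = ireeCompilerInvocationCreate(session);
  // Parse the source into the invocation before running a pipeline on it.
  ireeCompilerInvocationParseSource(inv, source);
  ireeCompilerInvocationPipeline(inv, IREE_COMPILER_PIPELINE_STD);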

  // Output the compiled artifact to a file.
  iree_compiler_output_t *output = NULL;
  ireeCompilerOutputOpenFile("output.vmfb", &output);
  ireeCompilerInvocationOutputVMBytecode(inv, output);

  // Cleanup state.
  ireeCompilerInvocationDestroy(inv);
  ireeCompilerOutputDestroy(output);
  ireeCompilerSourceDestroy(source);
  ireeCompilerSessionDestroy(session);
  ireeCompilerGlobalShutdown();
}

Samples

| Project | Source | Description |
| --- | --- | --- |
| iree-org/iree-template-compiler-cmake | hello_compiler.c | Compiler application template |
| iree-org/iree | integrations/pjrt/.../iree_compiler.cc | JIT for TensorFlow + JAX to IREE |
| iree-org/iree | compiler/plugins | In-tree supported compiler plugins |
| iree-org/iree | samples/compiler_plugins/ | In-tree sample compiler plugins |
| nod-ai/iree-amd-aie | plugins/.../iree-amd-aie | Early-phase plugins for interfacing with AMD AIE accelerators |

Runtime API

The IREE runtime is structured as a modular set of library components. Each component is designed to be linked directly into applications and compiled with LTO-style optimizations.

The low level library components can be used directly or through a higher level API.

Caution

Prefer using the low level API directly when writing custom bindings or integrating into larger projects. The high level API is mainly useful as a reference and when building samples.

Each runtime component has its own low level API. The low level APIs are typically verbose as they expose the full flexibility of each underlying system.

graph TD
  accTitle: IREE runtime low level API diagram
  accDescr {
    The IREE runtime includes 'base', 'HAL', and 'VM' components, each with
    their own types and API methods.
    Applications can interface directly with the IREE runtime via the low
    level component APIs.
  }

  subgraph iree_runtime[IREE Runtime]
    subgraph base
      base_types("Types

      • allocator
      • status
      • etc.")
    end

    subgraph hal[HAL]
      hal_types("Types

      • buffer
      • device
      • etc.")

      hal_drivers("Drivers

      • local-*
      • vulkan
      • etc.")
    end

    subgraph vm[VM]
      vm_types("Types

      • context
      • invocation
      • etc.")
    end
  end

  application(Your application)

  base_types & hal_types & hal_drivers & vm_types --> application

The high level 'runtime' API sits on top of the low level components. It is relatively terse but does not expose the full flexibility of the underlying systems.

graph TD
  accTitle: IREE runtime high level API diagram
  accDescr {
    The IREE runtime includes 'base', 'HAL', and 'VM' components, each with
    their own types and API methods.
    A high level "runtime API" sits on top of these component APIs.
    Applications can interface indirectly with the IREE runtime via this
    high level runtime API.
  }

  subgraph iree_runtime[IREE Runtime]
    subgraph base
      base_types("Types

      • allocator
      • status
      • etc.")
    end

    subgraph hal[HAL]
      hal_types("Types

      • buffer
      • device
      • etc.")

      hal_drivers("Drivers

      • local-*
      • vulkan
      • etc.")
    end

    subgraph vm[VM]
      vm_types("Types

      • context
      • invocation
      • etc.")
    end

    runtime_api("Runtime API

    • instance
    • session
    • call")

    base_types & hal_types & hal_drivers & vm_types --> runtime_api
  end

  application(Your application)

  runtime_api --> application

Runtime API header files are organized by component:

| Component header file | Overview |
| --- | --- |
| iree/base/api.h | Base API: type definitions, cross-platform primitives, utilities |
| iree/vm/api.h | VM APIs: loading modules, I/O, calling functions |
| iree/hal/api.h | HAL APIs: device management, synchronization, accessing hardware features |
| iree/runtime/api.h | High level runtime API |

Low level concepts

Base

The 'base' component includes general runtime utilities such as:

  • Memory allocators
  • Status and error handling
  • String manipulation
  • File input and output
  • Event pools and loops
  • Synchronization and threading primitives
  • Tracing and other debugging

As IREE is designed to support a variety of deployment targets, many of these utilities are written to be cross-platform or optional.

VM

IREE uses its own Virtual Machine (VM) at runtime to interpret program instructions on the host system.

Tip - EmitC alternate lowering path

VM instructions may be further lowered to C source code for static or resource constrained deployment.

See the --output-format=vm-c compiler option and the samples in samples/emitc_modules/ for more information.

The VM supports generic operations like loads, stores, arithmetic, function calls, and control flow. The VM builds streams of more complex program logic and dense math into HAL command buffers that are dispatched to hardware backends.

  • VM instances can serve multiple isolated execution contexts.
  • VM contexts are effectively sandboxes for loading modules and running programs.
  • VM modules provide all functionality to execution contexts, including access to hardware accelerators through the HAL. Compiled user programs are also modules.

    stateDiagram-v2
      accTitle: Sample VM Modules
      accDescr {
        Bytecode modules contain program state, program functions, and debug
        information.
        HAL modules contain devices, executables, HAL functions, and HAL types.
        Custom modules may contain external functions and custom types.
      }
    
      state "Bytecode module" as bytecode {
        bytecode_contents: Module state<br>Program funcs<br>Debug info
      }
    
      state "HAL module" as HAL {
        hal_contents: Devices<br>Executables<br>HAL funcs<br>HAL types
      }
    
      state "Parameters module" as Params {
        parameters_contents: Providers
      }
    
      state "Custom module" as custom {
        custom_contents: External funcs<br>Custom types
      }

For more detailed information about the design of the VM, see this design doc.

HAL

IREE uses a Hardware Abstraction Layer (HAL) to model and interact with hardware devices like CPUs, GPUs and other accelerators.

  • HAL drivers are used to enumerate and create HAL devices.
  • HAL devices interface with hardware, such as by allocating device memory, preparing executables, recording and dispatching command buffers, and synchronizing with the host.
  • HAL buffers represent data storage and buffer views represent views into that storage with associated shapes and types (similar to "tensors").
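The split between a buffer (raw storage) and a buffer view (shape and element type layered over that storage) can be sketched in standalone C. This is only an illustration and does not use the IREE headers: `demo_buffer_t`, `demo_buffer_view_t`, and the helper functions are hypothetical names, and the real HAL types carry more state (memory type, usage, encoding, reference counts). The sketch also ignores integer overflow for brevity.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a HAL buffer: untyped storage of a known size.
typedef struct {
  void* data;
  size_t byte_length;
} demo_buffer_t;

#define DEMO_MAX_RANK 4

// Hypothetical stand-in for a buffer view: a shape and element size layered
// over a buffer. The view neither owns nor copies the storage.
typedef struct {
  demo_buffer_t* buffer;
  size_t rank;
  size_t shape[DEMO_MAX_RANK];
  size_t element_size;  // In bytes, e.g. sizeof(float).
} demo_buffer_view_t;

// Returns the number of elements described by the view's shape.
static size_t demo_buffer_view_element_count(const demo_buffer_view_t* view) {
  assert(view != NULL);
  size_t count = 1;
  for (size_t i = 0; i < view->rank; ++i) count *= view->shape[i];
  return count;
}

// Returns the number of bytes of storage the view covers.
static size_t demo_buffer_view_byte_length(const demo_buffer_view_t* view) {
  return demo_buffer_view_element_count(view) * view->element_size;
}

// Initializes |out_view| over |buffer|. Returns 0 on success, or -1 if the
// rank is too large or the shape needs more bytes than the buffer holds.
// |out_view| is only written on success.
static int demo_buffer_view_initialize(demo_buffer_t* buffer, size_t rank,
                                       const size_t* shape,
                                       size_t element_size,
                                       demo_buffer_view_t* out_view) {
  if (rank > DEMO_MAX_RANK) return -1;
  demo_buffer_view_t view;
  view.buffer = buffer;
  view.rank = rank;
  view.element_size = element_size;
  for (size_t i = 0; i < DEMO_MAX_RANK; ++i) {
    view.shape[i] = (i < rank) ? shape[i] : 0;
  }
  if (demo_buffer_view_byte_length(&view) > buffer->byte_length) return -1;
  *out_view = view;
  return 0;
}
```

Wrapping four floats of storage with a shape of `{4}` succeeds, while asking for a `{4, 4}` view over the same storage is rejected because the shape describes more bytes than the buffer holds.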

High level concepts

The high level API uses instances, sessions, and calls to run programs with a small API surface.

stateDiagram-v2
  accTitle: IREE runtime high level API state diagram
  accDescr {
    Instances track sessions and state: options, drivers, devices.
    Sessions track calls and state: a device and bytecode/VM modules.
    Calls track input and output lists.
  }

  state iree_runtime_instance_t {
    instance_state: state<br>- options<br>- drivers<br>- devices

    state iree_runtime_session_t {
      session_state: state<br>- device<br>- VM / bytecode modules
      state iree_runtime_call_t  {
        inputs
        outputs
      }
    }
  }

Instance

An instance (iree_runtime_instance_t) isolates runtime usage and manages device resources.

  • Instances may service multiple sessions to avoid extra device interaction and reuse caches/pools.
  • Separate instances are isolated/sandboxed from one another.

Session

A session (iree_runtime_session_t) contains a set of loaded modules and their state.

  • Sessions that share an instance may share resources directly.
  • Sessions that do not share an instance can transfer resources using import and export APIs.
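The idea that a session holds the state of its loaded modules, so that the same module loaded into two sessions keeps separate state in each, can be shown with a toy model. The sketch below is standalone C and not the IREE API: `demo_module_t`, `demo_session_t`, and both functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical immutable module definition that sessions can share.
typedef struct {
  const char* name;
  int initial_counter;
} demo_module_t;

// Hypothetical session: a reference to the module plus private state.
typedef struct {
  const demo_module_t* module;
  int counter;
} demo_session_t;

// Loads |module| into |out_session|, giving the session its own copy of the
// module's initial state.
static void demo_session_initialize(const demo_module_t* module,
                                    demo_session_t* out_session) {
  assert(module != NULL && out_session != NULL);
  out_session->module = module;
  out_session->counter = module->initial_counter;
}

// Stand-in for calling a stateful function in the module. Only the state of
// the session it is called on changes.
static int demo_session_call_increment(demo_session_t* session) {
  return ++session->counter;
}
```

Two sessions initialized from one module start from the same initial value and then diverge as calls are made on each, while the shared module definition stays untouched.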

Call

A call (iree_runtime_call_t) is a stateful VM function call builder.

  • Calls can be reused to avoid having to construct input lists for each invocation.

Usage

Samples

| Project | Source | Description |
| --- | --- | --- |
| iree-org/iree-template-runtime-cmake | hello_world.c | Runtime application template |
| iree-org/iree | runtime/demo/ | In-tree demos of the high level runtime API |
| iree-org/iree | samples/ | In-tree sample applications |
| iree-org/iree-experimental | runtime-library/ | Shared runtime library builder. Builds libireert.so to aid development |
| iml130/iree-template-cpp | simple_embedding.c | Demo integration into a project |

High level "hello world"

Below are two samples showing how to use the high level runtime API: a "terse" sample and an "explained" sample with more detailed comments.

Source file: runtime/src/iree/runtime/demo/hello_world_terse.c

#include <stdio.h>

#include "iree/runtime/api.h"
#include "iree/runtime/demo/simple_mul_module_c.h"

static void iree_runtime_demo_run_session(iree_runtime_instance_t* instance);
static void iree_runtime_demo_perform_mul(iree_runtime_session_t* session);

//===----------------------------------------------------------------------===//
// 1. Entry point / shared iree_runtime_instance_t setup
//===----------------------------------------------------------------------===//

int main(int argc, char** argv) {
  // Create and configure the instance shared across all sessions.
  iree_runtime_instance_options_t instance_options;
  iree_runtime_instance_options_initialize(&instance_options);
  iree_runtime_instance_options_use_all_available_drivers(&instance_options);
  iree_runtime_instance_t* instance = NULL;
  IREE_CHECK_OK(iree_runtime_instance_create(
      &instance_options, iree_allocator_system(), &instance));

  // All sessions should share the same instance.
  iree_runtime_demo_run_session(instance);

  iree_runtime_instance_release(instance);
  return 0;
}

//===----------------------------------------------------------------------===//
// 2. Load modules and initialize state in iree_runtime_session_t
//===----------------------------------------------------------------------===//

static void iree_runtime_demo_run_session(iree_runtime_instance_t* instance) {
  // TODO(#5724): move device selection into the compiled modules.
  iree_hal_device_t* device = NULL;
  IREE_CHECK_OK(iree_runtime_instance_try_create_default_device(
      instance, iree_make_cstring_view("local-task"), &device));

  // Create one session per loaded module to hold the module state.
  iree_runtime_session_options_t session_options;
  iree_runtime_session_options_initialize(&session_options);
  iree_runtime_session_t* session = NULL;
  IREE_CHECK_OK(iree_runtime_session_create_with_device(
      instance, &session_options, device,
      iree_runtime_instance_host_allocator(instance), &session));
  iree_hal_device_release(device);

  // Load your user module into the session (from memory, from file, etc).
  const iree_file_toc_t* module_file =
      iree_runtime_demo_simple_mul_module_create();
  IREE_CHECK_OK(iree_runtime_session_append_bytecode_module_from_memory(
      session, iree_make_const_byte_span(module_file->data, module_file->size),
      iree_allocator_null()));

  // Run your functions; you should reuse the session to make multiple calls.
  iree_runtime_demo_perform_mul(session);

  iree_runtime_session_release(session);
}

//===----------------------------------------------------------------------===//
// 3. Call a function within a module with buffer views
//===----------------------------------------------------------------------===//

// func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) ->
// tensor<4xf32>
static void iree_runtime_demo_perform_mul(iree_runtime_session_t* session) {
  iree_runtime_call_t call;
  IREE_CHECK_OK(iree_runtime_call_initialize_by_name(
      session, iree_make_cstring_view("module.simple_mul"), &call));

  // %arg0: tensor<4xf32>
  iree_hal_buffer_view_t* arg0 = NULL;
  static const iree_hal_dim_t arg0_shape[1] = {4};
  static const float arg0_data[4] = {1.0f, 1.1f, 1.2f, 1.3f};
  IREE_CHECK_OK(iree_hal_buffer_view_allocate_buffer_copy(
      iree_runtime_session_device(session),
      iree_runtime_session_device_allocator(session),
      IREE_ARRAYSIZE(arg0_shape), arg0_shape, IREE_HAL_ELEMENT_TYPE_FLOAT_32,
      IREE_HAL_ENCODING_TYPE_DENSE_ROW_MAJOR,
      (iree_hal_buffer_params_t){
          .type = IREE_HAL_MEMORY_TYPE_DEVICE_LOCAL,
          .access = IREE_HAL_MEMORY_ACCESS_ALL,
          .usage = IREE_HAL_BUFFER_USAGE_DEFAULT,
      },
      iree_make_const_byte_span(arg0_data, sizeof(arg0_data)), &arg0));
  IREE_CHECK_OK(iree_hal_buffer_view_fprint(
      stdout, arg0, /*max_element_count=*/4096,
      iree_runtime_session_host_allocator(session)));
  IREE_CHECK_OK(iree_runtime_call_inputs_push_back_buffer_view(&call, arg0));
  iree_hal_buffer_view_release(arg0);

  fprintf(stdout, "\n * \n");

  // %arg1: tensor<4xf32>
  iree_hal_buffer_view_t* arg1 = NULL;
  static const iree_hal_dim_t arg1_shape[1] = {4};
  static const float arg1_data[4] = {10.0f, 100.0f, 1000.0f, 10000.0f};
  IREE_CHECK_OK(iree_hal_buffer_view_allocate_buffer_copy(
      iree_runtime_session_device(session),
      iree_runtime_session_device_allocator(session),
      IREE_ARRAYSIZE(arg1_shape), arg1_shape, IREE_HAL_ELEMENT_TYPE_FLOAT_32,
      IREE_HAL_ENCODING_TYPE_DENSE_ROW_MAJOR,
      (iree_hal_buffer_params_t){
          .type = IREE_HAL_MEMORY_TYPE_DEVICE_LOCAL,
          .access = IREE_HAL_MEMORY_ACCESS_ALL,
          .usage = IREE_HAL_BUFFER_USAGE_DEFAULT,
      },
      iree_make_const_byte_span(arg1_data, sizeof(arg1_data)), &arg1));
  IREE_CHECK_OK(iree_hal_buffer_view_fprint(
      stdout, arg1, /*max_element_count=*/4096,
      iree_runtime_session_host_allocator(session)));
  IREE_CHECK_OK(iree_runtime_call_inputs_push_back_buffer_view(&call, arg1));
  iree_hal_buffer_view_release(arg1);

  IREE_CHECK_OK(iree_runtime_call_invoke(&call, /*flags=*/0));

  fprintf(stdout, "\n = \n");

  // -> tensor<4xf32>
  iree_hal_buffer_view_t* ret0 = NULL;
  IREE_CHECK_OK(iree_runtime_call_outputs_pop_front_buffer_view(&call, &ret0));
  IREE_CHECK_OK(iree_hal_buffer_view_fprint(
      stdout, ret0, /*max_element_count=*/4096,
      iree_runtime_session_host_allocator(session)));
  iree_hal_buffer_view_release(ret0);

  iree_runtime_call_deinitialize(&call);
}
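The terse sample releases each buffer view immediately after pushing it onto the call's input list. That is safe because the input list retains what it holds, as the comments in the explained sample state. The ownership pattern can be sketched in standalone C with a toy reference count. Nothing here is IREE API: `demo_object_t`, `demo_list_t`, and the `demo_live_objects` counter are hypothetical and exist only to make the lifetime visible.

```c
#include <assert.h>
#include <stdlib.h>

// For illustration only: counts objects that have not been destroyed yet.
static int demo_live_objects = 0;

// Hypothetical reference-counted object.
typedef struct {
  int ref_count;
  float value;
} demo_object_t;

// Creates an object with one reference owned by the caller.
static demo_object_t* demo_object_create(float value) {
  demo_object_t* object = (demo_object_t*)malloc(sizeof(*object));
  if (!object) return NULL;
  object->ref_count = 1;
  object->value = value;
  ++demo_live_objects;
  return object;
}

static void demo_object_retain(demo_object_t* object) {
  if (object) ++object->ref_count;
}

// Drops one reference and destroys the object when none remain.
static void demo_object_release(demo_object_t* object) {
  if (!object) return;
  assert(object->ref_count > 0);
  if (--object->ref_count == 0) {
    --demo_live_objects;
    free(object);
  }
}

#define DEMO_LIST_CAPACITY 8

// Hypothetical input list that retains every object pushed onto it.
typedef struct {
  demo_object_t* items[DEMO_LIST_CAPACITY];
  size_t count;
} demo_list_t;

static void demo_list_initialize(demo_list_t* list) { list->count = 0; }

// Returns 0 on success or -1 if the list is full.
static int demo_list_push_back(demo_list_t* list, demo_object_t* object) {
  if (list->count == DEMO_LIST_CAPACITY) return -1;
  demo_object_retain(object);  // The list now shares ownership.
  list->items[list->count++] = object;
  return 0;
}

// Releases every reference the list holds.
static void demo_list_clear(demo_list_t* list) {
  for (size_t i = 0; i < list->count; ++i) demo_object_release(list->items[i]);
  list->count = 0;
}
```

After a push followed by a release from the caller, the object stays alive through the list's reference and is destroyed only when the list is cleared.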

Source file: runtime/src/iree/runtime/demo/hello_world_explained.c

#include <stdio.h>

#include "iree/runtime/api.h"

static int iree_runtime_demo_main(void);
static iree_status_t iree_runtime_demo_run_session(
    iree_runtime_instance_t* instance);
static iree_status_t iree_runtime_demo_perform_mul(
    iree_runtime_session_t* session);

#if defined(IREE_RUNTIME_DEMO_LOAD_FILE_FROM_COMMAND_LINE_ARG)

static const char* demo_file_path = NULL;

// Takes the first argument on the command line as a file path and loads it.
int main(int argc, char** argv) {
  if (argc < 2) {
    fprintf(stderr, "usage: session_demo module_file.vmfb\n");
    return 1;
  }
  demo_file_path = argv[1];
  return iree_runtime_demo_main();
}

// Loads a compiled IREE module from the file system.
static iree_status_t iree_runtime_demo_load_module(
    iree_runtime_session_t* session) {
  return iree_runtime_session_append_bytecode_module_from_file(session,
                                                               demo_file_path);
}

#elif defined(IREE_RUNTIME_DEMO_LOAD_FILE_FROM_EMBEDDED_DATA)

#include "iree/runtime/demo/simple_mul_module_c.h"

int main(int argc, char** argv) { return iree_runtime_demo_main(); }

// Loads the bytecode module directly from memory.
//
// Embedding the compiled output into your binary is not always possible (or
// recommended) but is a fairly painless way to get things working on a variety
// of targets without worrying about how to deploy files or pass flags.
//
// In cases like this the module file is in .rodata and does not need to be
// freed; if the memory needs to be released when the module is unloaded then a
// custom allocator can be provided to get a callback instead.
static iree_status_t iree_runtime_demo_load_module(
    iree_runtime_session_t* session) {
  const iree_file_toc_t* module_file =
      iree_runtime_demo_simple_mul_module_create();
  return iree_runtime_session_append_bytecode_module_from_memory(
      session, iree_make_const_byte_span(module_file->data, module_file->size),
      iree_allocator_null());
}

#else
#error "must specify a way to load the module data"
#endif  // IREE_RUNTIME_DEMO_LOAD_FILE_FROM_*

//===----------------------------------------------------------------------===//
// 1. Entry point / shared iree_runtime_instance_t setup
//===----------------------------------------------------------------------===//
// Applications should create and share a single instance across all sessions.

// This would live in your application startup/shutdown code or scoped to the
// usage of IREE. Creating and destroying instances is expensive and should be
// avoided.
static int iree_runtime_demo_main(void) {
  // Set up the shared runtime instance.
  // An application should usually only have one of these and share it across
  // all of the sessions it has. The instance is thread-safe, while the
  // sessions are only thread-compatible (you need to lock if it's required).
  iree_runtime_instance_options_t instance_options;
  iree_runtime_instance_options_initialize(&instance_options);
  iree_runtime_instance_options_use_all_available_drivers(&instance_options);
  iree_runtime_instance_t* instance = NULL;
  iree_status_t status = iree_runtime_instance_create(
      &instance_options, iree_allocator_system(), &instance);

  // Run the demo.
  // A real application would load its models (at startup, on-demand, etc) and
  // retain them somewhere to be reused. Startup time and likelihood of failure
  // varies across different HAL backends; the synchronous CPU backend is nearly
  // instantaneous and will never fail (unless out of memory) while the Vulkan
  // backend may take significantly longer and fail if there are no supported
  // devices.
  if (iree_status_is_ok(status)) {
    status = iree_runtime_demo_run_session(instance);
  }

  // Release the shared instance - it will be deallocated when all sessions
  // using it have been released (here it is deallocated immediately).
  iree_runtime_instance_release(instance);

  int ret = (int)iree_status_code(status);
  if (!iree_status_is_ok(status)) {
    // Dump nice status messages to stderr on failure.
    // An application can route these through its own logging infrastructure as
    // needed. Note that the status is a handle and must be freed!
    iree_status_fprint(stderr, status);
    iree_status_ignore(status);
  }
  return ret;
}

//===----------------------------------------------------------------------===//
// 2. Load modules and initialize state in iree_runtime_session_t
//===----------------------------------------------------------------------===//
// Each instantiation of a module will live in its own session. Module state
// like variables will be retained across calls within the same session.

// Loads the demo module and uses it to perform some math.
// In a real application you'd want to hang on to the iree_runtime_session_t
// and reuse it for future calls - especially if it holds state internally.
static iree_status_t iree_runtime_demo_run_session(
    iree_runtime_instance_t* instance) {
  // TODO(#5724): move device selection into the compiled modules.
  iree_hal_device_t* device = NULL;
  IREE_RETURN_IF_ERROR(iree_runtime_instance_try_create_default_device(
      instance, iree_make_cstring_view("local-task"), &device));

  // Set up the session to run the demo module.
  // Sessions are like OS processes and are used to isolate modules from each
  // other and hold runtime state such as the variables used within the module.
  // The same module loaded into two sessions will see their own private state.
  iree_runtime_session_options_t session_options;
  iree_runtime_session_options_initialize(&session_options);
  iree_runtime_session_t* session = NULL;
  iree_status_t status = iree_runtime_session_create_with_device(
      instance, &session_options, device,
      iree_runtime_instance_host_allocator(instance), &session);
  iree_hal_device_release(device);

  // Load the compiled user module in a demo-specific way.
  // Applications could specify files, embed the outputs directly in their
  // binaries, fetch them over the network, etc.
  if (iree_status_is_ok(status)) {
    status = iree_runtime_demo_load_module(session);
  }

  // Build and issue the call.
  if (iree_status_is_ok(status)) {
    status = iree_runtime_demo_perform_mul(session);
  }

  // Release the session and free all resources.
  iree_runtime_session_release(session);
  return status;
}

//===----------------------------------------------------------------------===//
// 3. Call a function within a module with buffer views
//===----------------------------------------------------------------------===//
// The inputs and outputs of a call are reusable across calls (and possibly
// across sessions depending on device compatibility) and can be set up by the
// application as needed. For example, an application could perform
// multi-threaded buffer view creation and then issue the call from a single
// thread when all inputs are ready. This simple demo just allocates them
// per-call and throws them away.

// Sets up and calls the simple_mul function and dumps the results:
// func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) ->
// tensor<4xf32>
//
// NOTE: this is a demo and as such this performs no memoization; a real
// application could reuse a lot of these structures and cache lookups of
// iree_vm_function_t to reduce the amount of per-call overhead.
static iree_status_t iree_runtime_demo_perform_mul(
    iree_runtime_session_t* session) {
  // Initialize the call to the function.
  iree_runtime_call_t call;
  IREE_RETURN_IF_ERROR(iree_runtime_call_initialize_by_name(
      session, iree_make_cstring_view("module.simple_mul"), &call));

  // Append the function inputs with the HAL device allocator in use by the
  // session. The buffers will be usable within the session and _may_ be usable
  // in other sessions depending on whether they share a compatible device.
  iree_hal_device_t* device = iree_runtime_session_device(session);
  iree_hal_allocator_t* device_allocator =
      iree_runtime_session_device_allocator(session);
  iree_allocator_t host_allocator =
      iree_runtime_session_host_allocator(session);
  iree_status_t status = iree_ok_status();
  {
    // %arg0: tensor<4xf32>
    iree_hal_buffer_view_t* arg0 = NULL;
    if (iree_status_is_ok(status)) {
      static const iree_hal_dim_t arg0_shape[1] = {4};
      static const float arg0_data[4] = {1.0f, 1.1f, 1.2f, 1.3f};
      status = iree_hal_buffer_view_allocate_buffer_copy(
          device, device_allocator,
          // Shape rank and dimensions:
          IREE_ARRAYSIZE(arg0_shape), arg0_shape,
          // Element type:
          IREE_HAL_ELEMENT_TYPE_FLOAT_32,
          // Encoding type:
          IREE_HAL_ENCODING_TYPE_DENSE_ROW_MAJOR,
          (iree_hal_buffer_params_t){
              // Where to allocate (host or device):
              .type = IREE_HAL_MEMORY_TYPE_DEVICE_LOCAL,
              // Access to allow to this memory:
              .access = IREE_HAL_MEMORY_ACCESS_ALL,
              // Intended usage of the buffer (transfers, dispatches, etc):
              .usage = IREE_HAL_BUFFER_USAGE_DEFAULT,
          },
          // The actual heap buffer to wrap or clone and its allocator:
          iree_make_const_byte_span(arg0_data, sizeof(arg0_data)),
          // Buffer view + storage are returned and owned by the caller:
          &arg0);
    }
    if (iree_status_is_ok(status)) {
      IREE_IGNORE_ERROR(iree_hal_buffer_view_fprint(
          stdout, arg0, /*max_element_count=*/4096, host_allocator));
      // Add to the call inputs list (which retains the buffer view).
      status = iree_runtime_call_inputs_push_back_buffer_view(&call, arg0);
    }
    // Since the call retains the buffer view we can release it here.
    iree_hal_buffer_view_release(arg0);

    fprintf(stdout, "\n * \n");

    // %arg1: tensor<4xf32>
    iree_hal_buffer_view_t* arg1 = NULL;
    if (iree_status_is_ok(status)) {
      static const iree_hal_dim_t arg1_shape[1] = {4};
      static const float arg1_data[4] = {10.0f, 100.0f, 1000.0f, 10000.0f};
      status = iree_hal_buffer_view_allocate_buffer_copy(
          device, device_allocator, IREE_ARRAYSIZE(arg1_shape), arg1_shape,
          IREE_HAL_ELEMENT_TYPE_FLOAT_32,
          IREE_HAL_ENCODING_TYPE_DENSE_ROW_MAJOR,
          (iree_hal_buffer_params_t){
              .type = IREE_HAL_MEMORY_TYPE_DEVICE_LOCAL,
              .access = IREE_HAL_MEMORY_ACCESS_ALL,
              .usage = IREE_HAL_BUFFER_USAGE_DEFAULT,
          },
          iree_make_const_byte_span(arg1_data, sizeof(arg1_data)), &arg1);
    }
    if (iree_status_is_ok(status)) {
      IREE_IGNORE_ERROR(iree_hal_buffer_view_fprint(
          stdout, arg1, /*max_element_count=*/4096, host_allocator));
      status = iree_runtime_call_inputs_push_back_buffer_view(&call, arg1);
    }
    iree_hal_buffer_view_release(arg1);
  }

  // Synchronously perform the call.
  if (iree_status_is_ok(status)) {
    status = iree_runtime_call_invoke(&call, /*flags=*/0);
  }

  fprintf(stdout, "\n = \n");

  // Dump the function outputs.
  iree_hal_buffer_view_t* ret0 = NULL;
  if (iree_status_is_ok(status)) {
    // Try to get the first call result as a buffer view.
    status = iree_runtime_call_outputs_pop_front_buffer_view(&call, &ret0);
  }
  if (iree_status_is_ok(status)) {
    // This prints the buffer view out but an application could read its
    // contents, pass it to another call, etc.
    status = iree_hal_buffer_view_fprint(
        stdout, ret0, /*max_element_count=*/4096, host_allocator);
  }
  iree_hal_buffer_view_release(ret0);

  iree_runtime_call_deinitialize(&call);
  return status;
}
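The explained sample threads a single status value through every step, guards each step with a check that the previous ones succeeded, and runs cleanup unconditionally. The same control flow in isolation might look like the standalone sketch below. It is not IREE code: `demo_status_t` is a hypothetical plain enum, whereas the sample's comments note that an IREE status is a handle that must be freed, and the helper functions are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical status codes; a plain enum, so nothing needs to be freed.
typedef enum {
  DEMO_STATUS_OK = 0,
  DEMO_STATUS_INVALID_ARGUMENT,
  DEMO_STATUS_RESOURCE_EXHAUSTED,
} demo_status_t;

static bool demo_status_is_ok(demo_status_t status) {
  return status == DEMO_STATUS_OK;
}

// Allocates |count| floats, each initialized to 1.0f.
static demo_status_t demo_allocate_ones(size_t count, float** out_values) {
  assert(out_values != NULL);
  *out_values = NULL;
  if (count == 0) return DEMO_STATUS_INVALID_ARGUMENT;
  float* values = (float*)malloc(count * sizeof(float));
  if (!values) return DEMO_STATUS_RESOURCE_EXHAUSTED;
  for (size_t i = 0; i < count; ++i) values[i] = 1.0f;
  *out_values = values;
  return DEMO_STATUS_OK;
}

// Multiplies every element by |factor| in place.
static demo_status_t demo_scale(float* values, size_t count, float factor) {
  if (!values) return DEMO_STATUS_INVALID_ARGUMENT;
  for (size_t i = 0; i < count; ++i) values[i] *= factor;
  return DEMO_STATUS_OK;
}

// Runs each step only if all previous steps succeeded, then always cleans up.
// |out_first| is written only when every step succeeds.
static demo_status_t demo_run(size_t count, float factor, float* out_first) {
  float* values = NULL;
  demo_status_t status = demo_allocate_ones(count, &values);
  if (demo_status_is_ok(status)) {
    status = demo_scale(values, count, factor);
  }
  if (demo_status_is_ok(status)) {
    *out_first = values[0];
  }
  free(values);  // free(NULL) is a no-op, so this is safe on every path.
  return status;
}
```

The benefit of this shape is a single exit point: resources acquired by earlier steps are released on both the success and failure paths without duplicating cleanup code at each early return.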

Compiler + Runtime = JIT

The compiler and runtime APIs may be used together to build a "just in time" (JIT) execution engine. JIT compilation allows for last-minute specialization with no prior knowledge of target devices and avoids issues with version drift, but it can also constrain deployment options and usage scenarios.