HAL

-iree-hal-assign-legacy-target-devices

Assigns the HAL devices the module will target to the given list of targets.

Assigns target HAL devices to the module based on the given list.

Options

-target-registry : Target registry containing the list of available devices and backends.
-targetBackends  : List of target backends to assign as device targets.

-iree-hal-assign-target-devices

Assigns the HAL devices the module will target to the given list of target specifications.

Assigns target HAL devices to the module based on the given list of target specifications.

Targets can be specified in several ways depending on whether there are multiple devices, named devices, or devices imported from external files. Human-friendly device aliases can be used as shorthand for IREE::HAL::TargetDevice implementations providing their own configuration. The aliases are identical to those used by #hal.device.alias<>.

If multiple targets are specified, they will be available as multiple distinct devices. A single device may list one or more targets, in which case the first one enumerated that matches at runtime is selected. For example, a gpu device may select between CUDA, HIP, or Vulkan at runtime based on what kind of device the user has and what HAL implementations were compiled into the runtime.

Examples using the canonical flag:

// Two devices, one the local host device and the other a Vulkan device:
--iree-hal-target-device=local
--iree-hal-target-device=vulkan

// One device that uses Vulkan if available and otherwise falls back to the
// local host device:
--iree-hal-target-device=vulkan,local

// Two CUDA devices selected by runtime ordinal; at runtime two --device=
// flags are required to configure both devices:
--iree-hal-target-device=cuda[0]
--iree-hal-target-device=cuda[1]

// A fully-defined target specification:
--iree-hal-target-device=#hal.device.target<"cuda", {...}, [#hal.executable.target<...>]>

// Named device for defining a reference by #hal.device.promise<@some_name>:
--iree-hal-target-device=some_name=vulkan
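To make the shape of these flag values concrete, here is a minimal Python sketch of how the simple alias forms above could be read and how a first-match selection could work. It is a toy model and not IREE's parser: `parse_target_device` and `select_target` are hypothetical names, only the `name=alias[ordinal],alias` forms are handled, and full `#hal.device.target<...>` attributes are deliberately rejected.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Target:
    alias: str                      # e.g. "vulkan", "local", "cuda"
    ordinal: Optional[int] = None   # e.g. 1 for "cuda[1]"


@dataclass
class Device:
    name: Optional[str]             # e.g. "some_name" for "some_name=vulkan"
    targets: List[Target] = field(default_factory=list)


# Hypothetical grammar for this sketch: an alias with an optional [ordinal].
_TARGET_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_-]*)(?:\[(\d+)\])?$")


def parse_target_device(value: str) -> Device:
    """Parses one flag value; each flag occurrence describes one device."""
    if value.startswith("#"):
        raise ValueError("full attribute specifications are out of scope here")
    name = None
    if "=" in value:
        name, value = value.split("=", 1)
    targets = []
    for part in value.split(","):
        match = _TARGET_RE.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognized target: {part!r}")
        ordinal = int(match.group(2)) if match.group(2) is not None else None
        targets.append(Target(match.group(1), ordinal))
    return Device(name, targets)


def select_target(device: Device, available: Set[str]) -> Optional[Target]:
    """Returns the first enumerated target whose alias is available."""
    for target in device.targets:
        if target.alias in available:
            return target
    return None
```

In this model `parse_target_device("vulkan,local")` yields one device with two ordered targets, and `select_target` picks `local` only when `vulkan` is not in the available set.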

Options

-targetDevices : List of target device specifications.

-iree-hal-capture-executable-sources

Captures individual hal.executable.variant source listings and embeds them in the IR.

Captures a source listing of each hal.executable.variant and attaches the source to the variant embedded in the IR. Entry points are assigned locations in the IR relative to the captured source.

Options

-stage : Name used to indicate what stage of compilation is captured.

-iree-hal-configure-executables

Configures hal.executable ops via a nested translation pipeline.

Runs a nested pipeline on each executable to attach target-specific configuration information to variants.

Options

-target-registry : Target registry containing the list of available devices and backends.

-iree-hal-configure-target-executable-variants

Configures hal.executable.variant ops for the specified target backend.

Attaches target-specific configuration information to a variant controlling how code generation operates.

Options

-target-registry : Target registry containing the list of available devices and backends.
-target          : Target backend name whose executable variants will be configured by this pass.

-iree-hal-conversion

Converts from stream and other intermediate dialects into the hal dialect.

Converts supported intermediate dialects (stream, util, and various upstream dialects like cf/scf) into the hal dialect. After conversion, host code that schedules work and performs allocations will act on !hal.device queues and !hal.buffer (and other) resources.

It's expected that executable interface materialization has been performed so that the information required to marshal buffers and operands to the device is available for conversion.

-iree-hal-dump-executable-benchmarks

Dumps standalone hal.executable benchmarks to the provided path.

Dumps one MLIR file per hal.executable containing the executable contents and the host code required to dispatch them with fake buffers and operands. These benchmarks can be run with the iree-benchmark-module tool to microbenchmark individual dispatches outside of the whole program context.

The pass can only be run after executable translation but before host code conversion as the original stream dialect ops are required to synthesize the benchmarks.

There are many caveats with this approach and it will fail to generate benchmarks in many cases such as dynamic shapes, dynamic operands, or stateful data dependencies. Users should always prefer to build dedicated benchmarks in their origin framework that can be guaranteed to match their expectations and use appropriate test data. For example some dispatches may produce NaNs or out-of-bounds accesses with the fake data generated by this pass and either crash or result in unrepresentative performance.

In other words: don't blindly expect this pass to do anything but act as a starting point for microbenchmarking. Verify the outputs, the benchmarking methodology for the particular dispatch, and prepare to do more work. Or just author proper benchmarks in the original framework!

Options

-path : File system path to write each executable benchmark MLIR file.

-iree-hal-dump-executable-sources

Dumps individual hal.executable source listings to the provided path.

Dumps a source listing of each hal.executable and updates the source locations in the IR to point at the produced files. This allows for easy inspection of each executable prior to translation and gives downstream tools that can display source information (Tracy, perf, etc) something more useful than the entire original source program.

Options

-path   : File system path to write each executable source MLIR file.
-prefix : String to prefix the written file names with.

-iree-hal-elide-redundant-commands

Elides stateful command buffer ops that set redundant state.

Identifies sequences of stateful command buffer operations, such as hal.command_buffer.push_descriptor_set, that set redundant state. Such sequences arise from trivial conversion from the stateless stream dialect, and removing them reduces binary size and runtime overhead.
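The idea of eliding redundant state can be illustrated with a small Python toy model. This is only a sketch of the general technique, assuming a flat command list in which `"set"` commands write a value into a named state slot; the command tuples and the `elide_redundant_state` name are hypothetical, and the real pass works on HAL ops and has to account for details this model ignores.

```python
from typing import Dict, List, Tuple

# Hypothetical command encoding: (op, state slot, value).
Command = Tuple[str, str, object]


def elide_redundant_state(commands: List[Command]) -> List[Command]:
    """Drops "set" commands that rewrite a slot with the value it already holds."""
    current: Dict[str, object] = {}
    kept: List[Command] = []
    for op, slot, value in commands:
        if op == "set":
            if slot in current and current[slot] == value:
                continue  # Redundant: the slot already holds this value.
            current[slot] = value
        kept.append((op, slot, value))
    return kept
```

Given a stream that sets the same binding before every dispatch, only the first set and any later set that changes the value survive.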

-iree-hal-hoist-executable-objects

Hoists local executable object annotations to the parent hal.executable.variant.

Finds all hal.executable.objects attrs on all ops within an executable inner module and moves them to the parent hal.executable.variant op.

-iree-hal-initialize-devices

Initializes global device handles based on their specification.

Initializes each global !hal.device based on the specification attribute by building initializers that enumerate and select the appropriate device.

Options

-target-registry : Target registry containing the list of available devices and backends.

-iree-hal-inline-memoize-regions

Inlines hal.device.memoize regions into their parent region.

Inlines any hal.device.memoize ops into their parent region and removes the op. This prevents memoization and has the same behavior as having never formed the memoization regions.

-iree-hal-link-all-executables

Links hal.executable ops into one or more hal.executable ops.

Runs a nested pipeline to link multiple hal.executable ops together if the target backend they are used with requests it.

Options

-target-registry : Target registry containing the list of available devices and backends.

-iree-hal-link-target-executables

Links executables for the specified target backend.

Links together multiple hal.executable ops for the given target backend if desired. Linking allows for intra-module deduplication and amortization of the startup time, code size, and runtime overheads that come from managing hundreds or thousands of executables.

Options

-target-registry : Target registry containing the list of available devices and backends.
-target          : Target backend name whose executables will be linked by this pass.

-iree-hal-materialize-dispatch-instrumentation

Materializes host and device dispatch instrumentation resources on stream IR.

Adds dispatch instrumentation for both host and device prior to materializing interfaces so that the higher-level stream dialect can be used to easily mutate the dispatch sites, executable exports, and resources used for instrumentation storage.

Options

-buffer-size : Power-of-two byte size of the instrumentation buffer.
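As a quick illustration of the power-of-two constraint on `-buffer-size`, the following Python helper checks whether a candidate byte size qualifies. The helper name is hypothetical and this is not how the pass itself validates its option; it only shows what the constraint means.

```python
def is_power_of_two(size: int) -> bool:
    """Returns True if size is a positive power of two (1, 2, 4, 8, ...)."""
    return size > 0 and (size & (size - 1)) == 0
```

For example, 64 MiB (`64 * 1024 * 1024`) satisfies the check, while 100,000,000 does not.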

-iree-hal-materialize-interfaces

Defines hal.executable variants for stream.executable ops.

Defines hal.executable ops and one hal.executable.variant for each required target. The interfaces required to marshal buffers and operands across the host-device boundary are declared on the executables and annotated on the dispatch sites so that subsequent conversion can consume them.

-iree-hal-materialize-resource-caches

Materializes cached globals for device resources.

Scans the program for resource lookups such as hal.executable.lookup and materializes globals initialized on startup. The original lookup ops are replaced with global loads of the cached resources.

-iree-hal-materialize-target-devices

Materializes global device handles based on a hal.device.targets spec.

Materializes global !hal.device ops for the devices specified by the hal.device.targets attribute on the module. An optional default device can be specified for ops that do not specify a device affinity.

Options

-defaultDevice : Which device is considered the default when no device affinity is specified.

-iree-hal-memoize-device-queries

Finds hal.device.query ops and creates variables initialized on startup.

Finds all hal.device.query-related ops that are hoistable and moves them into globals that are initialized on startup. This prevents repeated queries at runtime and allows for optimization as queries are CSEd across the entire program.
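The hoisting described above can be pictured with a small Python toy model: each distinct query becomes one global that is initialized once, and every query site is rewritten into a load of that global. The `(category, key)` tuples, the generated global names, and `memoize_queries` itself are hypothetical and exist only for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical query identity for this sketch: (category, key).
Query = Tuple[str, str]


def memoize_queries(query_sites: List[Query]) -> Tuple[Dict[Query, str], List[str]]:
    """Maps each distinct query to one global and rewrites sites into loads."""
    query_globals: Dict[Query, str] = {}
    loads: List[str] = []
    for query in query_sites:
        if query not in query_globals:
            # First occurrence: create a global to be initialized on startup.
            query_globals[query] = f"_device_query_{len(query_globals)}"
        loads.append(query_globals[query])
    return query_globals, loads
```

Three query sites that contain one repeated query produce two globals and three loads, which is the deduplication effect the pass description refers to.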

-iree-hal-outline-memoize-regions

Outlines hal.device.memoize regions and creates global resources.

Outlines any hal.device.memoize ops in the module by creating functions and per-device globals with initializers.

-iree-hal-preprocess-executables-with-pipeline

Preprocesses each executable with an MLIR pass pipeline.

Runs the given MLIR pass pipeline as parsed by the --pass-pipeline= flag on each hal.executable in the program. The passes must be linked into the compiler to be discovered.

Options

-pipeline : MLIR pass pipeline description to run on each executable.

-iree-hal-preprocess-executables-with-tool

Preprocesses each executable with an external command line tool.

Passes each hal.executable in the program to the given command line tool as stdin and parses the resulting MLIR from stdout to replace them. This is equivalent to iree-hal-preprocess-executables-with-pipeline but allows for an external mlir-opt/iree-opt-like tool to be used containing the pipelines instead of requiring the passes to be linked into the compiler.

Options

-command : stdin->stdout command to run on each hal.executable MLIR op.
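The stdin->stdout contract expected of such a tool can be sketched in Python with `subprocess.run`. This only illustrates the data flow (text in on stdin, replacement text out on stdout); `preprocess_with_tool` is a hypothetical helper, and `IDENTITY_TOOL` is a stand-in command that echoes its input rather than a real MLIR tool.

```python
import subprocess
import sys
from typing import List

# Stand-in "tool" for demonstration only: copies stdin to stdout unchanged.
IDENTITY_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]


def preprocess_with_tool(argv: List[str], executable_text: str) -> str:
    """Feeds the executable text to argv on stdin and returns its stdout."""
    result = subprocess.run(
        argv,
        input=executable_text,
        capture_output=True,
        text=True,
        check=True,  # A non-zero exit status raises CalledProcessError.
    )
    return result.stdout
```

A real tool would parse the MLIR it receives, run its own passes, and print the transformed executable; anything it prints to stdout is what replaces the original.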

-iree-hal-prune-executables

Prunes executable variants and exports that are not referenced.

Prunes executable variants and exports that are not referenced in the module. This is intended to be run late in the pipeline where no new dispatches will be inserted that may require the variants or exports that it removes.
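A toy Python model of this kind of pruning is shown below. It assumes a nested mapping of executable name to variant name to export names, plus a set of `(executable, variant, export)` triples referenced by dispatch sites; both structures and the `prune_unreferenced` name are hypothetical. Dropping an executable that ends up with no variants is an assumption of this sketch, not something stated above.

```python
from typing import Dict, List, Set, Tuple

Executables = Dict[str, Dict[str, List[str]]]  # executable -> variant -> exports
Reference = Tuple[str, str, str]               # (executable, variant, export)


def prune_unreferenced(executables: Executables, referenced: Set[Reference]) -> Executables:
    """Keeps only exports that are referenced and variants that still have exports."""
    pruned: Executables = {}
    for exe_name, variants in executables.items():
        kept_variants: Dict[str, List[str]] = {}
        for variant_name, exports in variants.items():
            kept = [e for e in exports if (exe_name, variant_name, e) in referenced]
            if kept:
                kept_variants[variant_name] = kept
        if kept_variants:  # Assumption: empty executables are dropped too.
            pruned[exe_name] = kept_variants
    return pruned
```

Because the reference set is computed once, running this before new dispatches are inserted could remove something that is needed later, which is why the pass is meant to run late in the pipeline.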

-iree-hal-repeat-dispatches

Repeats each hal.command_buffer.dispatch op one or more times.

Finds all hal.command_buffer.dispatch ops and repeats them the specified number of times by cloning them and inserting a barrier. This is extremely unreliable and nearly always creates incorrect programs that have wildly incorrect end-to-end execution timings. It must only be used when trying to profile (via sampling or performance counters) specific dispatches in-situ with the additional caveat that cache behavior and dispatch overhead are invalid. Do not trust any numbers produced by this method of benchmarking without verifying via external tooling.

This should rarely be used. Prefer instead to build real benchmarks in origin frameworks that, for example, use independent data and ensure correct execution results (as if you're benchmarking known-incorrect results, are you really benchmarking something useful?). Any benchmarking of memory-bound operations using this approach will be questionable (such as matmuls, which we use this for today... heh ;).

Options

-count : Number of times to repeat each dispatch (including the original).
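To show what "including the original" means for `-count`, here is a toy Python model over a list of command strings. The string encoding and `repeat_dispatches` are hypothetical, and placing a barrier before each clone is an assumption of this sketch; the exact barrier placement used by the pass is not specified above.

```python
from typing import List


def repeat_dispatches(commands: List[str], count: int) -> List[str]:
    """Emits each dispatch `count` times in total, with a barrier before each clone."""
    if count < 1:
        raise ValueError("count includes the original and must be at least 1")
    out: List[str] = []
    for command in commands:
        out.append(command)
        if command.startswith("dispatch"):
            for _ in range(count - 1):
                out.append("barrier")  # Assumed placement: between repetitions.
                out.append(command)
    return out
```

With `count=1` the command list is returned unchanged, and with `count=3` each dispatch appears three times with two barriers between the copies.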

-iree-hal-resolve-device-aliases

Resolves #hal.device.alias attributes to their expanded configurations.

Resolves device aliases to the concrete targets using defaults, flags, and registered device configurations.

Options

-target-registry : Target registry containing the list of available devices and backends.

-iree-hal-resolve-device-promises

Resolves #hal.device.promise attributes to their devices.

Resolves promised device affinities to the materialized device globals that were promised. Verifies that all promises are resolved.

-iree-hal-resolve-export-ordinals

Resolves symbolic hal.executable.export references to ordinals.

Severs symbolic references to hal.executable.export ops from dispatch sites by replacing them with the ordinal assigned to the exports. This allows for subsequent passes to collapse the executables into opaque blobs.
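Replacing symbolic references with ordinals can be modeled in a few lines of Python. The sketch assumes ordinals are assigned per executable in declaration order, which is an assumption made for illustration; the data shapes and function names are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_ordinals(executables: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Assigns each export an ordinal within its executable (declaration order)."""
    return {
        (exe_name, export): ordinal
        for exe_name, exports in executables.items()
        for ordinal, export in enumerate(exports)
    }


def resolve_dispatches(
    dispatches: List[Tuple[str, str]],
    ordinals: Dict[Tuple[str, str], int],
) -> List[Tuple[str, int]]:
    """Rewrites (executable, export symbol) sites into (executable, ordinal)."""
    return [(exe, ordinals[(exe, export)]) for exe, export in dispatches]
```

Once dispatch sites carry only ordinals, nothing refers to the export symbols by name any more, which is what lets later passes treat the executable contents as opaque.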

-iree-hal-serialize-all-executables

Converts hal.executable.variants to one or more hal.executable.binary ops.

Runs a nested pipeline on each executable to serialize its variants from their low-level MLIR dialects (such as llvm, spirv, etc) to their target-specific object format (static/shared libraries, SPIR-V, etc).

Options

-target-registry         : Target registry containing the list of available devices and backends.
-debug-level             : Debug level for serialization (0 (no information) to 3 (all information)).
-dump-intermediates-path : Path to write translated executable intermediates (.bc, .o, etc) into for debugging.
-dump-binaries-path      : Path to write translated and serialized executable binaries into for debugging.

-iree-hal-serialize-target-executables

Serializes executables for the specified target backend.

Serializes variants for the target backend from their low-level MLIR dialects (such as llvm, spirv, etc) to their target-specific object format (static/shared libraries, SPIR-V, etc).

Options

-target-registry         : Target registry containing the list of available devices and backends.
-target                  : Target backend name whose executables will be serialized by this pass.
-debug-level             : Debug level for serialization (0 (no information) to 3 (all information)).
-dump-intermediates-path : Path to write translated executable intermediates (.bc, .o, etc) into for debugging.
-dump-binaries-path      : Path to write translated and serialized executable binaries into for debugging.

-iree-hal-strip-executable-contents

Strips executable module contents for reducing IR size during debugging.

A debugging pass for stripping translated executable contents (LLVM dialect, SPIR-V dialect, etc) to reduce IR size and noise from the device-only code.

-iree-hal-substitute-executables

Substitutes hal.executable ops with files on disk.

Substitutes hal.executable ops with externally referenced MLIR files or target-specific object files. When provided a .mlir/.mlirbc file with a top-level hal.executable, the entire executable will be replaced, including all variants contained within. All other files such as .o, .bc, and .spv will be set as external object files on the original executable variants and the original contents will be dropped.

Substitutions can be specified either by providing a file system path containing files that match the executable names in one of the supported formats, or by specifying directly the file each executable name maps to.

Options

-substitutions : Substitution `executable_name=file.xxx` key-value pairs.
-search-path   : Path to source executable substitutions from.
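The `executable_name=file.xxx` pairs and the two substitution behaviors described above can be sketched in Python as follows. This is a toy classifier and not the pass's own logic: `parse_substitutions` and the `"replace_executable"` / `"external_object"` labels are hypothetical names, and the extension check simply mirrors the .mlir/.mlirbc versus everything-else split from the description.

```python
import os
from typing import Dict, List, Tuple

# Per the description: these replace the entire executable; any other file
# is attached as an external object on the original variants.
WHOLE_EXECUTABLE_EXTENSIONS = {".mlir", ".mlirbc"}


def parse_substitutions(pairs: List[str]) -> Dict[str, Tuple[str, str]]:
    """Maps executable name -> (file path, kind of substitution)."""
    result: Dict[str, Tuple[str, str]] = {}
    for pair in pairs:
        name, sep, path = pair.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected executable_name=file, got {pair!r}")
        extension = os.path.splitext(path)[1].lower()
        if extension in WHOLE_EXECUTABLE_EXTENSIONS:
            kind = "replace_executable"
        else:
            kind = "external_object"
        result[name] = (path, kind)
    return result
```

For instance, a pair ending in `.mlir` is classified as a whole-executable replacement, while one ending in `.spv` is classified as an external object.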

-iree-hal-translate-all-executables

Translates hal.executable ops via a nested translation pipeline.

Runs a nested pipeline on each executable to translate its variants from their generic MLIR dialects (such as linalg) to their target-specific dialects (llvm, spirv, etc).

Options

-target-registry : Target registry containing the list of available devices and backends.

-iree-hal-translate-target-executable-variants

Translates hal.executable.variant ops for the specified target backend.

Translates an executable variant for a specific target from its generic MLIR dialects (such as linalg) to the target-specific dialects (llvm, spirv, etc).

Options

-target-registry : Target registry containing the list of available devices and backends.
-target          : Target backend name whose executable variants will be translated by this pass.

-iree-hal-verify-devices

Verifies that all devices can be targeted with the available compiler plugins.

Verifies that #hal.device.target and #hal.executable.target attributes reference targets that are registered with the compiler.

Options

-target-registry : Target registry containing the list of available devices and backends.