At a high level, a phase is a compiler transformation or analysis step that runs in isolation. Phases are meant to do one thing at a time and to compose in a straightforward sequence. See also the Rewriting Guide, since several phase families are built directly on top of Rewriter.
Overview
A Phase is a Stage with a single entry point, run(), which wraps the actual implementation in start().
Phase
Phase is the minimal base class.
A phase has:
- a name or annex,
- access to the current World,
- a run() wrapper for logging and verification,
- a todo() accessor backed by the internal todo_ flag for fixed-point iteration.
A key design point is the internal todo_ flag exposed through todo():
- Note
- A phase sets todo_ = true if its work discovers that another round is needed. PhaseMan uses this to drive fixed-point pipelines.
Typical Shape
A custom phase usually derives from Phase and implements start():
public:
MyPhase(World& world)
: Phase(world, "my_phase") {}
private:
}
};
As opposed to a Pass, a Phase does one thing at a time and does not mix with other Phases.
virtual void start()=0
Actual entry.
Run it with:
virtual void run()
Entry point and generates some debug output; invokes Phase::start.
Analysis
Analysis is the base class for phases that inspect the current world using the Rewriter traversal machinery. It inherits from both Phase and Rewriter, but unlike RWPhase, it rewrites into the same world. In practice, this means Rewriter is used as a structured, graph-aware traversal over ordinary MimIR terms.
- Note
- An Analysis based on Rewriter has a domain of ordinary Defs.
This is often convenient because analysis information can itself be represented as regular MimIR terms. As a result, existing IR machinery applies automatically, including:
- hash-consing / canonical sharing,
- built-in normalizations,
- and other simplifications already provided by the World.
So if your abstract domain fits naturally into MimIR, you can often encode it directly as Def and reuse the existing IR infrastructure.
An Analysis visits:
- all registered annex roots, then
- all external mutables.
During the first part, mim::Analysis::is_bootstrapping() is true. During the second part, it is false.
Typical usage:
- override rewrite(), rewrite_imm(), rewrite_mut(), or node-specific rewrite hooks,
- compute analysis information while traversing reachable IR,
- store that information in side tables,
- set todo_ = true if new information was discovered and another iteration is needed.
Reset Between Iterations
If an analysis participates in a fixed-point loop, it should be ready to run multiple times. The base reset() clears the rewriter state and resets todo_ for the next round.
RWPhase
RWPhase is the base class for phases that rebuild the current world into a new one, thereby eliminating garbage. This is the standard base class for optimization phases that structurally transform IR.
It inherits from both Phase and Rewriter, but here the two worlds differ:
- Note
- To avoid confusion, direct world() access is deleted. Use:
Execution Model
An RWPhase runs in three conceptual steps:
- optionally perform a fixed-point analysis on the old world,
- rewrite reachable old Defs into the new world:
- rewrite annex roots,
- rewrite external mutables;
- swap the old and new worlds.
After the swap, the rewritten world becomes the current one.
- Note
- An RWPhase follows the standard whole-world transformation pattern: analyze the old program, rebuild the transformed program, then replace the old world.
Cleanup
Cleanup is simply an RWPhase with no custom rewrites. Because an RWPhase reconstructs only what is reachable from the world roots, rebuilding automatically eliminates dead and unreachable code.
Bootstrapping
Like Analysis, RWPhase processes annex roots before externals.
While annexes are being rewritten, mim::RWPhase::is_bootstrapping() is true.
This matters because annexes may depend on one another. During bootstrapping, rewrites that require other annexes to already exist in the new world may need to be deferred or skipped.
Optional Pre-Analysis
An RWPhase may be given an associated Analysis. If so, analyze() runs that analysis to a fixed point before rewriting begins.
This is a common pattern:
- the analysis computes facts on the old world,
- the rewrite uses those facts to produce the new world.
If no analysis is needed, analyze() can simply return false.
Typical Shape
public:
MyRWPhase(World& world)
: RWPhase(world, "my_rw_phase") {}
private:
const Def* rewrite_imm_App(const App* app) override {
return Rewriter::rewrite_imm_App(app);
}
};
Rewrites the RWPhase::old_world into the RWPhase::new_world and swaps them afterwards.
Run it with:
PhaseMan
PhaseMan organizes several phases into a pipeline.
It can run them:
- once, in sequence, or
- repeatedly to a fixed point.
A fixed-point PhaseMan reruns the whole pipeline as long as at least one phase sets todo_ = true.
Between iterations, each phase is recreated from its original configuration. This keeps phase-local state from leaking across rounds unless the phase explicitly recomputes it.
- Note
- PhaseMan is the orchestration layer for classical phase pipelines.
Typical Usage
phases.emplace_back(std::make_unique<PhaseA>(world));
phases.emplace_back(std::make_unique<PhaseB>(world));
man.apply(true, std::move(phases));
man.run();
Organizes several Phases in a a pipeline.
std::deque< std::unique_ptr< Phase > > Phases
Use a fixed-point pipeline when phases expose new optimization opportunities for one another.
ClosedMutPhase
ClosedMutPhase is a traversal helper for phases that visit all reachable, closed mutables in the world.
A mutable is relevant here if it is:
- reachable,
- closed, i.e. it has no free variables,
- optionally non-empty, depending on elide_empty.
This is useful for local analyses or transformations that are naturally phrased as:
- Note
- For every reachable closed mutable, inspect or process it.
You override visit(M*), where M defaults to Def but may be restricted to a particular mutable subtype.
Typical Shape
public:
MyClosedPhase(World& world)
: ClosedMutPhase(world, "my_closed_phase", true) {}
private:
void visit(Lam* lam)
override {
}
};
Transitively visits all reachable, closed mutables in the World.
virtual void visit(M *)=0
NestPhase
NestPhase builds on ClosedMutPhase and computes a Nest for each visited mutable.
Use it when your phase needs a structured view of nested control or binding structure rather than just the raw mutable.
Instead of overriding visit(M*), override:
This is convenient for analyses that reason about nesting, dominance-like structure, or hierarchical regions.
Example: SCCP
#pragma once
class SCCP : public RWPhase {
private:
public:
Analysis(World& world)
:
mim::Analysis(world,
"SCCP::Analyzer") {}
auto& lattice() { return lattice_; }
private:
Def* rewrite_mut(Def*) final;
const Def* propagate(const Def* var, const Def* def);
const Def* rewrite_imm_App(const App* app) final;
};
public:
SCCP(World& world)
: RWPhase(world, "SCCP", &analysis_)
, analysis_(world) {}
private:
const Def* lattice(const Def* def) { return analysis_.lattice()[def]; }
const Def* rewrite_imm_App(const App* old_app) final;
Analysis analysis_;
};
}
This Phase will recursively Rewriter::rewrite.
DefMap< const Def * > Def2Def
The provided Sparse Conditional Constant Propagation (SCCP) implementation is a good example of the intended phase structure. Its architecture is:
- an inner Analysis computes propagation facts on the old world,
- an outer RWPhase uses those facts to rebuild a simplified new world.
- Note
- The implementation propagates not only constants but also arbitrary expressions.
Analysis
#include "sccp.h"
Def* SCCP::Analysis::rewrite_mut(Def* mut) {
map(mut, mut);
if (auto var = mut->has_var()) {
map(var, var);
for (auto var : mut->tvars()) {
map(var, var);
lattice_[var] = var;
}
}
for (auto d : mut->deps())
rewrite(d);
return mut;
}
const Def* SCCP::Analysis::propagate(
const Def* var,
const Def* def) {
auto [i, ins] = lattice_.emplace(var, def);
if (ins) {
todo_ = true;
DLOG(
"propagate: {} → {}", var, def);
return def;
}
auto cur = i->second;
if (!cur || def->isa<
Bot>() || cur == def || cur == var)
return cur;
todo_ = true;
if (cur->isa<
Bot>())
return i->second = def;
return i->second = var;
}
const Def* SCCP::Analysis::rewrite_imm_App(
const App* app) {
auto n = app->num_targs();
auto abstr_args = absl::FixedArray<const Def*>(n);
auto abstr_vars = absl::FixedArray<const Def*>(n);
for (size_t i = 0; i != n; ++i) {
auto abstr = rewrite(app->targ(i));
abstr_vars[i] = propagate(lam->tvar(i), abstr);
abstr_args[i] = abstr;
}
auto abstr_var = world().tuple(abstr_vars);
map(lam->var(), abstr_var);
lattice_[lam->var()] = abstr_var;
for (auto d : lam->deps())
rewrite(d);
return world().app(lam, abstr_args);
}
return mim::Analysis::rewrite_imm_App(app);
}
}
#define DLOG(...)
Vaporizes to nothingness in Debug build.
auto lookup(const C &container, const K &key)
Yields pointer to element (or the element itself if it is already a pointer), if found and nullptr ot...
Lam * isa_optimizable(Lam *lam)
These are Lams that are.
The SCCP analysis associates each lambda variable with a lattice value:
- bottom: no useful information yet,
- a concrete expression: this value can be propagated,
- top: keep the variable as-is (a Def maps to itself).
In the implementation, this lattice is stored as a Def2Def map. A nice aspect here is that the propagated value is itself a regular Def. This illustrates the benefit of building analysis on top of Rewriter: the abstract domain can live directly inside MimIR, so canonicalization and normalization come for free.
The analysis traverses the old world and updates the lattice when it sees applications of optimizable lambdas. If new information is discovered, it sets todo_ = true, causing the analysis to rerun until stable. This is a textbook use of Analysis:
- walk the old IR,
- collect facts,
- iterate to a fixed point.
Transformation
#include "sccp.h"
const Def* SCCP::rewrite_imm_App(
const App* old_app) {
if (
auto old_lam = old_app->callee()->isa_mut<
Lam>()) {
if (auto l = lattice(old_lam->var()); l && l != old_lam->var()) {
todo_ = true;
size_t num_old = old_lam->num_tvars();
if (auto i = lam2lam_.find(old_lam); i != lam2lam_.end())
new_lam = i->second;
else {
for (size_t i = 0; i != num_old; ++i) {
auto old_var = old_lam->var(num_old, i);
auto abstr = lattice(old_var);
if (old_var == abstr) new_doms.emplace_back(rewrite(old_lam->dom(num_old, i)));
}
size_t num_new = new_doms.size();
auto new_vars = absl::FixedArray<const Def*>(num_old);
new_lam = new_world().mut_lam(new_doms, rewrite(old_lam->codom()))->set(old_lam->dbg());
lam2lam_[old_lam] = new_lam;
for (size_t i = 0, j = 0; i != num_old; ++i) {
auto old_var = old_lam->var(num_old, i);
auto abstr = lattice(old_var);
if (old_var == abstr) {
auto v = new_lam->var(num_new, j++);
new_vars[i] = v;
} else {
new_vars[i] = rewrite(abstr);
}
}
map(old_lam->var(), new_vars);
new_lam->set(rewrite(old_lam->filter()), rewrite(old_lam->body()));
}
size_t num_new = new_lam->num_vars();
auto new_args = absl::FixedArray<const Def*>(num_new);
for (size_t i = 0, j = 0; i != num_old; ++i) {
auto old_var = old_lam->var(num_old, i);
auto abstr = lattice(old_var);
if (old_var == abstr) new_args[j++] = rewrite(old_app->targ(i));
}
return map(old_app, new_world().app(new_lam, new_args));
}
}
return Rewriter::rewrite_imm_App(old_app);
}
}
Vector< const Def * > DefVec
Once the lattice is stable, the outer SCCP phase starts rewriting. When it sees an application of a lambda whose parameters have propagated values, it rebuilds a specialized lambda:
- parameters with known propagated expressions are removed,
- remaining parameters are kept,
- the lambda body is rewritten with the propagated values substituted,
- the call site is rebuilt with only the remaining arguments.
So SCCP follows the standard RWPhase pattern:
- analyze the old world,
- rewrite into a new world using the computed facts,
- swap worlds.
Why This Split Is Useful
Separating SCCP into analysis and rewrite keeps both parts simple:
- the analysis never mutates or partially rewrites the program,
- the rewrite does not need to discover facts on the fly,
- fixed-point logic stays in the analysis stage where it belongs.
This separation is the main design pattern to follow for nontrivial optimizations.
- Note
- The complete SCCP example fits into roughly 150 lines of C++ source code. Most of the usual compiler boilerplate is absorbed by the existing Analysis, RWPhase, and Rewriter infrastructure, so the implementation can focus on the optimization itself.
Choosing the Right Base Class
A useful rule of thumb is:
- derive from Phase if you just need a custom one-off action,
- derive from Analysis if you want a graph-aware traversal that computes facts on the current world,
- derive from RWPhase if you want to rebuild the world into a transformed new one,
- derive from ClosedMutPhase if you want to visit all reachable closed mutables,
- derive from NestPhase if that visit should come with a computed Nest.
Recommended Design Pattern
For most optimization phases, the preferred structure is:
- write an Analysis that computes facts to a fixed point,
- write an RWPhase that consumes those facts while rebuilding the world.
This keeps analyses and transformations cleanly separated and fits naturally with MimIR’s rewriting-based infrastructure.
Minimal Examples
A simple whole-world rewrite phase:
public:
: RWPhase(world, "simplify") {}
private:
return Rewriter::rewrite_imm_App(app);
}
};
The World represents the whole program and manages creation of MimIR nodes (Defs).
A simple analysis phase:
public:
: Analysis(world, "count_lams") {}
size_t num_lams = 0;
private:
++num_lams;
return mim::Analysis::rewrite_mut_Lam(lam);
}
};
Using both:
CountMutLams analysis(world);
analysis.run();
Compilation Pipelines in Mim
You can also expose your custom phases as axioms in Mim via the compile plugin and build your own compilation pipeline. Mim's default compilation pipeline is defined in the opt plugin.
Summary
Phases are MimIR’s main unit of compiler work.
- Phase is the minimal base abstraction.
- Analysis is for graph-aware fact collection on the current world.
- RWPhase is for rewriting the current world into a transformed new one.
- PhaseMan sequences phases, optionally to a fixed point.
- ClosedMutPhase and NestPhase are traversal helpers for common whole-world inspections.
The key design idea is that MimIR phases are built around structured traversal and rewriting. For substantial optimizations, the usual pattern is: