This guide summaries typical idioms you want to use when working with Thorin as a developer.

Basics

Let's jump straight into an example.

#include <fstream>
 
#include <thorin/driver.h>
 
#include <thorin/fe/parser.h>
#include <thorin/pass/optimize.h>
#include <thorin/util/sys.h>
 
#include <thorin/plug/mem/mem.h>
 
using namespace thorin;
using namespace thorin::plug;
 
int main(int, char**) {
    try {
        Driver driver;
        auto& w = driver.world();
        driver.log().set(&std::cerr).set(Log::Level::Debug);
 
        auto parser = Parser(w);
        for (auto plugin : {"compile", "core"}) parser.plugin(plugin);
 
        // .Cn [%mem.M, I32, %mem.Ptr (I32, 0) .Cn [%mem.M, I32]]
        auto mem_t  = w.annex<mem::M>();
        auto argv_t = w.call<mem::Ptr0>(w.call<mem::Ptr0>(w.type_i32()));
        auto main   = w.mut_fun({mem_t, w.type_i32(), argv_t}, {mem_t, w.type_i32()})->set("main");
 
        auto [mem, argc, argv, ret] = main->vars<4>();
        main->app(false, ret, {mem, argc});
        main->make_external();
 
        optimize(w);
        std::ofstream ofs("hello.ll");
        driver.backend("ll")(w, ofs);
        ofs.close(); // make sure everything is written before clang is invoked
 
        sys::system("clang hello.ll -o hello -Wno-override-module");
        outln("exit code: {}", sys::system("./hello a b c"));
    } catch (const std::exception& e) {
        errln("{}", e.what());
        return EXIT_FAILURE;
    } catch (...) {
        errln("error: unknown exception");
        return EXIT_FAILURE;
    }
 
    return EXIT_SUCCESS;
}

Driver is the first class that you want to allocate. It keeps track of a few "global" variables like some Flags or the Log. Here, the log is set up to output to std::cerr with thorin::Log::Level::Debug (see also Debugging Features).

Then, we load the plugins compile and core, which in turn will load the plugin mem. A plugin consists of a shared object (.so/.dll) and a .thorin file. The shared object contains Passes, normalizers, and so on. The .thorin file contains the axiom declarations and links the normalizers with the axioms. For this reason, we need to allocate the thorin::fe::Parser to parse the .thorin file; thorin::fe::Parser::plugin will also load the shared object. The driver keeps track of all plugins.

Next, we create actual code. Def is the base class for all nodes/expressions in Thorin. Each Def is a node in a graph which is maintained in the World. The World is essentially a big hash set where all Defs are tossed into. The World provides factory methods to create all kind of different Defs. Here, we create the main function. In direct style, its type looks like this:

[%mem.M, I32, %mem.Ptr (I32, 0)] -> [%mem.M, I32]]

Converted to continuation-passing style (CPS) this type looks like this:

.Cn [%mem.M, I32, %mem.Ptr (I32, 0), .Cn [%mem.M, I32]]

The %mem.M type is a type that keeps track of side effects that may occur. Since, main introduces Variables we must create a mutable Lambda (see Immutables vs. Mutables). The only thing main is doing, is to invoke its return continuation with mem and argc as argument:

ret (mem, argc)

It is also important to make main external. Otherwise, Thorin will simply remove this function.

We optimize the program, emit an LLVM assembly file, and compile it via clang. Finally, we execute the generated program with ./hello a b c and output its exit code - which should be 4.

Immutables vs. Mutables

There are two different kind of Defs in Thorin: mutables and immutables:

Immutable	Mutable
must be `const`	may be non-`const`
ops form DAG	ops may be cyclic
no recursion	may be recursive
no Var	has Var; get with Def::var
build ops first, then the actual node	build the actual node first, then set the ops
Hash consed	each new instance is fresh
Def::rebuild	Def::stub

Matching IR

Thorin provides different means to scrutinize Defs. Usually, you will encounter a Def as Ref which is just a wrapper for a const Def*. Its purpose is to resolve "holes" (called Infers in Thorin) that may pop up due to type inference. Matching built-ins, i.e. all subclasses of Def, works differently than matching Axioms.

Upcast for Built-ins

Methods beginning with

isa work like a dynamic_cast with a runtime check and return nullptr if the cast is not possible, while
those beginning with as are more like a static_cast and assert via its isa sibling in the Debug build that the cast is correct.

Upcast

Def::isa/Def::as allows for an upcast that matches both mutables and immutables:

void foo(Ref def) {
    if (auto sigma = def->isa<Sigma>()) {
        // sigma is of type "const Sigma*" and could be a mutable or an immutable
    }
 
    // sigma of type "const Sigma*" and could be an immutable or a mutable
    // asserts, if def is not a Sigma
    auto sigma = def->as_imm<Sigma>();
}

Upcast for Immutables

Def::isa_imm/Def::as_imm allows for an upcast and only matches immutables:

void foo(Ref def) {
    if (auto imm = def->isa_imm()) {
        // imm of type "const Def*" and is an immutable
    }
 
    if (auto sigma = def->isa_imm<Sigma>()) {
        // sigma is of type "const Sigma*" and is an immutable Sigma
    }
 
    // sigma of type "const Sigma*" and *must* be an immutable Sigma - otherwise, asserts
    auto sigma = def->as_imm<Sigma>();
}

Upcast for Mutables

Def::isa_mut/Def::as_mut allows for an upcast and only matches mutables. By doing so, it removes the const qualifier and gives you access to the non-const methods that only make sense for mutables:

void foo(Ref def) {
    if (auto mut = def->isa_mut()) {
        // mut of type "Def*" - "const" has been removed!
        // This gives you access to the non-const methods:
        auto var = mut->var();
        auto stub = mut->stub(world, type, debug)
        // ...
    }
 
    if (auto lam = def->isa_mut<Lam>()) {
        // lam of type "Lam*" - "const" has been removed!
    }
 
    // lam of type "Lam*" and *must* be a mutable Lam.
    // Otherwise, an assert will fire.
    auto lam = def->as<Lam>();
}

Checking via Def::isa/Def::as a Def* has the same effect as using Def::isa_mut/Def::isa_mut since the scrutinee must be already a mutable due to the lack of the const qualifier:

void foo(Def* def) { // note the lack of "const" here
    if (auto sigma = def->isa<Sigma>()) {
        // sigma is of type Sigma* and is a mutable
    }
    if (auto sigma = def->isa_mut<Sigma>()) {
        // sigma is of type Sigma* and is a mutable
    }
 
    // sigma of type "Sigma*" and is a mutable - otherwise, asserts
    auto sigma = def->as<Sigma>();
}

Matching Literals

Often, you want to match a Literal and grab its content. You can use Lit::isa/Lit::as for this:

void foo(Ref def) {
    if (auto lit = Lit::isa(def)) {
        // lit is of type "std::optional<u64>"
        // It's your responsibility that the grabbed value makes sense as u64.
    }
    if (auto lit = Lit::isa<f32>(def)) {
        // lit is of type "std::optional<f32>"
        // It's your responsibility that the grabbed value makes sense as f32.
    }
 
    // asserts if def is not a Lit.
    auto lu64 = Lit::as(def);
    auto lf32 = Lit::as<f32>(def);
}

Summary

The following table summarizes all important casts:

`dynamic_cast` `static_cast`	Returns	If `def` is a ...
`def->isa<Lam>()` `def->as<Lam>()`	`const Lam*`	Lam
`def->isa_imm<Lam>()` `def->as_imm<Lam>()`	`const Lam*`	immutable Lam
`def->isa_mut<Lam>()` `def->as_mut<Lam>()`	`Lam*`	mutable Lam
`Lit::isa(def)` `Lit::as(def)`	std::optional`<`nat_t`>` nat_t	Lit
`Lit::isa<f32>(def)` `Lit::as<f32>(def)`	std::optional`<`f32`>` f32	Lit

Further Casts

There are also some additional checks available that usually come as static methods and either return a pointer or Ref to the checked entity or nullptr. Here are some examples:

void foo(Ref def) {
    if (auto size = Idx::size(def)) {
        // def = .Idx size
    }
 
    if (auto lam = Lam::isa_mut_cn(def)) {
        // def isa mutable Lam of type .Cn T
    }
 
    if (auto pi = Pi::isa_basicblock(def)) {
        // def is a Pi whose co-domain is bottom and which is not returning
    }
}

Matching Axioms

You can match Axioms via

thorin::match which is again similar to a dynamic_cast with a runtime check and returns a wrapped nullptr (see below), if the cast is not possible, or
thorin::force which is again more like a static_cast and asserts via its thorin::match sibling in the Debug build that the cast is correct.

This will yield a Match<Id, D> which just wraps a const D*. Id is the enum of the corresponding tag of the matched Axiom. Usually, D will be an App because most Axioms inhabit a function type. Otherwise, it may wrap a Def or other subclasses of it. For instance, matching %mem.M yields Match<mem::M, Def>.

By default, Thorin assumes that the magic of an Axiom happens when applying the final argument to a curried Axiom. For example, matching a %mem.load will only trigger for the final App of the curried call

%mem.load (T, as) (mem, ptr)

while

%mem.load (T, as)

will not match. The wrapped App inside the Match refers to the last App of the curried call. So in this example

thorin::App::arg() is (mem, ptr) and
thorin::App::callee() is %mem.load (T, as).

Use thorin::App::decurry() to directly get the thorin::App::callee() as thorin::App.

If you want to design an Axiom that returns a function, you can fine-adjust the trigger point of a thorin::match / thorin::force.

w/o Subtags

In order to match an Axiom without any subtags like %mem.load, do this:

void foo(Ref def) {
    if (auto load = match<mem::load>(def)) {
        auto [mem, ptr]            = load->args<2>();
        auto [pointee, addr_space] = load->decurry()->args<2>();
    }
 
    // def must match as a mem::load - otherwise, asserts
    auto load = force<mem::load>(def);
}

w/ Subtags

In order to match an Axiom with subtags like %core.wrap, do this:

void foo(Ref def) {
    if (auto wrap = match<core::wrap>(def)) {
        auto [a, b] = wrap->args<2>();
        auto mode   = wrap->decurry()->arg();
        switch (wrap.id()) {
            case core::wrap::add: // ...
            case core::wrap::sub: // ...
            case core::wrap::mul: // ...
            case core::wrap::shl: // ...
        }
    }
 
    if (auto add = match(core::wrap::add, def)) {
        auto [a, b] = add->args<2>();
        auto mode   = add->decurry()->arg();
    }
 
    // def must match as a core::wrap - otherwise, asserts
    auto wrap = force<core::wrap>(def);
 
    // def must match as a core::wrap::add - otherwise, asserts
    auto add = force(core::wrap::add, def);
}

Summary

The following table summarizes all important casts:

`dynamic_cast` `static_cast`	Returns	If `def` is a ...
`match<mem::load>(def)` `force<mem::load>(def)`	Match`<`mem::load, App`>`	`%mem.load (T, as) (mem, ptr)`
`match<core::wrap>(def)` `force<core::wrap>(def)`	Match`<`core::wrap, App`>`	`%core.wrap.??? s m (a, b)`
`match(core::wrap::add, def)` `force(core::wrap::add, def)`	Match`<`core::wrap, App`>`	`%core.wrap.add s m (a, b)`

Working with Indices

There are essentially three ways of retrieving the number of elements of something in Thorin.

Arity

This is the number of elements of to extract/insert a single element. Note that the number of elements may be unknown at compile time such as in ‹n; 0›.

Proj

thorin::Def::num_projs is the same as thorin::Def::arity, if the arity is a thorin::Lit. Otherwise, it is simply 1. This concept only exists in the C++-API to give the programmer the illusion to work with n-ary functions, e.g.:

for (auto dom : pi->doms()) { /*...*/ }

for (auto var : lam->vars()) { /*...*/ }

But in reality, all functions have exactly one domain and one codomain.

Thresholded Variants

In additition, there are thresholded variants available that are prefixed with a t and take thorin::Flags::scalerize_threshold (--scalerize-threshold) into account. The method thorin::Def::num_tprojs returns the same as thorin::Def::num_projs, but will simply yield 1, if the arity exceeds the threshold. thorin::Def::tproj, thorin::Def::tprojs, thorin::Lam::tvars, etc. work accordingly.

See also:

Shape

TODO

Summary

Expression	Class	artiy	isa_lit_artiy	as_lit_artiy	num_projs	num_tprojs
`(0, 1, 2)`	Tuple	`3`	`3`	`3`	`3`	`3`
`‹3; 0›`	Pack	`3`	`3`	`3`	`3`	`3`
`‹n; 0›`	Pack	`n`	`std::nullopt`	asserts	`1`	`1`
`[.Nat, .Bool, .Nat]`	Sigma	`3`	`3`	`3`	`3`	`3`
`«3; .Nat»`	Arr	`3`	`3`	`3`	`3`	`3`
`«n; .Nat»`	Arr	`n`	`std::nullopt`	asserts	`1`	`1`
`x: [.Nat, .Bool]`	Var	`2`	`2`	`2`	`2`	`2`
`‹32; 0›`	Pack	`32`	`32`	`32`	`32`	`1`

The last line assumes thorin::Flags::scalerize_threshold = 32.

Iterating over the Program

There are several ways of doing this. It depends on what exactly you want to achieve and how much structure you need during the traversal. The simplest way is to kick off with World::externals and recursively run over Def::extended_ops like this:

DefSet done;
for (const auto& [_, mut] : world.externals())
    visit(done, mut);
 
void visit(DefSet& done, const Def* def) {
    if (!done.emplace(def).second) return;
 
    do_sth(def);
 
    for (auto op : def->extended_ops())
        visit(done, op);
}

However, you will most likely want to use the pass or the phase infrastructure.

Table of Contents

Basics

Immutables vs. Mutables

Matching IR

Upcast for Built-ins

Upcast

Upcast for Immutables

Upcast for Mutables

Matching Literals

Summary

Further Casts

Matching Axioms

w/o Subtags

w/ Subtags

Summary

Working with Indices

Arity

Proj

Thresholded Variants

Shape

Summary

Iterating over the Program