Thorin 1.9.0
The Higher ORder INtermediate representation
Loading...
Searching...
No Matches
Developer Guide

This guide summaries typical idioms you want to use when working with Thorin as a developer.

Basics

Let's jump straight into an example.

#include <fstream>
#include <thorin/driver.h>
using namespace thorin;
using namespace thorin::plug;
int main(int, char**) {
try {
Driver driver;
auto& w = driver.world();
driver.log().set(&std::cerr).set(Log::Level::Debug);
auto parser = Parser(w);
for (auto plugin : {"compile", "core"}) parser.plugin(plugin);
// .Cn [%mem.M, I32, %mem.Ptr (I32, 0) .Cn [%mem.M, I32]]
auto mem_t = w.annex<mem::M>();
auto argv_t = w.call<mem::Ptr0>(w.call<mem::Ptr0>(w.type_i32()));
auto main = w.mut_fun({mem_t, w.type_i32(), argv_t}, {mem_t, w.type_i32()})->set("main");
auto [mem, argc, argv, ret] = main->vars<4>();
main->app(false, ret, {mem, argc});
main->make_external();
std::ofstream ofs("hello.ll");
driver.backend("ll")(w, ofs);
ofs.close(); // make sure everything is written before clang is invoked
sys::system("clang hello.ll -o hello -Wno-override-module");
outln("exit code: {}", sys::system("./hello a b c"));
} catch (const std::exception& e) {
errln("{}", e.what());
return EXIT_FAILURE;
} catch (...) {
errln("error: unknown exception");
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
Some "global" variables needed all over the place.
Definition driver.h:17
World & world()
Definition driver.h:25
Log & log()
Definition driver.h:24
auto backend(std::string_view name)
Definition driver.h:76
Log & set(std::ostream *ostream)
Definition log.h:34
Parses Thorin code into the provided World.
Definition parser.h:33
int main(int argc, char **argv)
Definition main.cpp:24
The mem Plugin
Definition mem.h:11
Definition cfg.h:11
std::ostream & errln(const char *fmt, Args &&... args)
Definition print.h:179
void optimize(World &)
Definition optimize.cpp:19
std::ostream & outln(const char *fmt, Args &&... args)
Definition print.h:178

Driver is the first class that you want to allocate. It keeps track of a few "global" variables like some Flags or the Log. Here, the log is set up to output to std::cerr with thorin::Log::Level::Debug (see also Debugging Features).

Then, we load the plugins compile and core, which in turn will load the plugin mem. A plugin consists of a shared object (.so/.dll) and a .thorin file. The shared object contains Passes, normalizers, and so on. The .thorin file contains the axiom declarations and links the normalizers with the axioms. For this reason, we need to allocate the thorin::fe::Parser to parse the .thorin file; thorin::fe::Parser::plugin will also load the shared object. The driver keeps track of all plugins.

Next, we create actual code. Def is the base class for all nodes/expressions in Thorin. Each Def is a node in a graph which is maintained in the World. The World is essentially a big hash set where all Defs are tossed into. The World provides factory methods to create all kind of different Defs. Here, we create the main function. In direct style, its type looks like this:

[%mem.M, I32, %mem.Ptr (I32, 0)] -> [%mem.M, I32]]

Converted to continuation-passing style (CPS) this type looks like this:

.Cn [%mem.M, I32, %mem.Ptr (I32, 0), .Cn [%mem.M, I32]]

The %mem.M type is a type that keeps track of side effects that may occur. Since, main introduces Variables we must create a mutable Lambda (see Immutables vs. Mutables). The only thing main is doing, is to invoke its return continuation with mem and argc as argument:

ret (mem, argc)

It is also important to make main external. Otherwise, Thorin will simply remove this function.

We optimize the program, emit an LLVM assembly file, and compile it via clang. Finally, we execute the generated program with ./hello a b c and output its exit code - which should be 4.

Immutables vs. Mutables

There are two different kind of Defs in Thorin: mutables and immutables:

Immutable Mutable
must be const may be non-const
ops form DAG ops may be cyclic
no recursion may be recursive
no Var has Var; get with Def::var
build ops first, then the actual node build the actual node first, then set the ops
Hash consed each new instance is fresh
Def::rebuild Def::stub

Matching IR

Thorin provides different means to scrutinize Defs. Usually, you will encounter a Def as Ref which is just a wrapper for a const Def*. Its purpose is to resolve "holes" (called Infers in Thorin) that may pop up due to type inference. Matching built-ins, i.e. all subclasses of Def, works differently than matching Axioms.

Upcast for Built-ins

Methods beginning with

  • isa work like a dynamic_cast with a runtime check and return nullptr if the cast is not possible, while
  • those beginning with as are more like a static_cast and assert via its isa sibling in the Debug build that the cast is correct.

Upcast

Def::isa/Def::as allows for an upcast that matches both mutables and immutables:

void foo(Ref def) {
if (auto sigma = def->isa<Sigma>()) {
// sigma is of type "const Sigma*" and could be a mutable or an immutable
}
// sigma of type "const Sigma*" and could be an immutable or a mutable
// asserts, if def is not a Sigma
auto sigma = def->as_imm<Sigma>();
}
const T * as_imm() const
Definition def.h:444
Helper class to retrieve Infer::arg if present.
Definition def.h:87
A dependent tuple type.
Definition tuple.h:9

Upcast for Immutables

Def::isa_imm/Def::as_imm allows for an upcast and only matches immutables:

void foo(Ref def) {
if (auto imm = def->isa_imm()) {
// imm of type "const Def*" and is an immutable
}
if (auto sigma = def->isa_imm<Sigma>()) {
// sigma is of type "const Sigma*" and is an immutable Sigma
}
// sigma of type "const Sigma*" and *must* be an immutable Sigma - otherwise, asserts
auto sigma = def->as_imm<Sigma>();
}
const T * isa_imm() const
Definition def.h:443

Upcast for Mutables

Def::isa_mut/Def::as_mut allows for an upcast and only matches mutables. By doing so, it removes the const qualifier and gives you access to the non-const methods that only make sense for mutables:

void foo(Ref def) {
if (auto mut = def->isa_mut()) {
// mut of type "Def*" - "const" has been removed!
// This gives you access to the non-const methods:
auto var = mut->var();
auto stub = mut->stub(world, type, debug)
// ...
}
if (auto lam = def->isa_mut<Lam>()) {
// lam of type "Lam*" - "const" has been removed!
}
// lam of type "Lam*" and *must* be a mutable Lam.
// Otherwise, an assert will fire.
auto lam = def->as<Lam>();
}
T * isa_mut() const
If this is *mut*able, it will cast constness away and perform a dynamic_cast to T.
Definition def.h:449
A function.
Definition lam.h:97

Checking via Def::isa/Def::as a Def* has the same effect as using Def::isa_mut/Def::isa_mut since the scrutinee must be already a mutable due to the lack of the const qualifier:

void foo(Def* def) { // note the lack of "const" here
if (auto sigma = def->isa<Sigma>()) {
// sigma is of type Sigma* and is a mutable
}
if (auto sigma = def->isa_mut<Sigma>()) {
// sigma is of type Sigma* and is a mutable
}
// sigma of type "Sigma*" and is a mutable - otherwise, asserts
auto sigma = def->as<Sigma>();
}
Base class for all Defs.
Definition def.h:222

Matching Literals

Often, you want to match a Literal and grab its content. You can use Lit::isa/Lit::as for this:

void foo(Ref def) {
if (auto lit = Lit::isa(def)) {
// lit is of type "std::optional<u64>"
// It's your responsibility that the grabbed value makes sense as u64.
}
if (auto lit = Lit::isa<f32>(def)) {
// lit is of type "std::optional<f32>"
// It's your responsibility that the grabbed value makes sense as f32.
}
// asserts if def is not a Lit.
auto lu64 = Lit::as(def);
auto lf32 = Lit::as<f32>(def);
}

Summary

The following table summarizes all important casts:

dynamic_cast
static_cast
Returns If def is a ...
def->isa<Lam>()
def->as<Lam>()
const Lam* Lam
def->isa_imm<Lam>()
def->as_imm<Lam>()
const Lam* immutable Lam
def->isa_mut<Lam>()
def->as_mut<Lam>()
Lam* mutable Lam
Lit::isa(def)
Lit::as(def)
std::optional<nat_t>
nat_t
Lit
Lit::isa<f32>(def)
Lit::as<f32>(def)
std::optional<f32>
f32
Lit

Further Casts

There are also some additional checks available that usually come as static methods and either return a pointer or Ref to the checked entity or nullptr. Here are some examples:

void foo(Ref def) {
if (auto size = Idx::size(def)) {
// def = .Idx size
}
if (auto lam = Lam::isa_mut_cn(def)) {
// def isa mutable Lam of type .Cn T
}
if (auto pi = Pi::isa_basicblock(def)) {
// def is a Pi whose co-domain is bottom and which is not returning
}
}

Matching Axioms

You can match Axioms via

  • thorin::match which is again similar to a dynamic_cast with a runtime check and returns a wrapped nullptr (see below), if the cast is not possible, or
  • thorin::force which is again more like a static_cast and asserts via its thorin::match sibling in the Debug build that the cast is correct.

This will yield a Match<Id, D> which just wraps a const D*. Id is the enum of the corresponding tag of the matched Axiom. Usually, D will be an App because most Axioms inhabit a function type. Otherwise, it may wrap a Def or other subclasses of it. For instance, matching %mem.M yields Match<mem::M, Def>.

By default, Thorin assumes that the magic of an Axiom happens when applying the final argument to a curried Axiom. For example, matching a %mem.load will only trigger for the final App of the curried call

%mem.load (T, as) (mem, ptr)

while

%mem.load (T, as)

will not match. The wrapped App inside the Match refers to the last App of the curried call. So in this example

If you want to design an Axiom that returns a function, you can fine-adjust the trigger point of a thorin::match / thorin::force.

w/o Subtags

In order to match an Axiom without any subtags like %mem.load, do this:

void foo(Ref def) {
if (auto load = match<mem::load>(def)) {
auto [mem, ptr] = load->args<2>();
auto [pointee, addr_space] = load->decurry()->args<2>();
}
// def must match as a mem::load - otherwise, asserts
auto load = force<mem::load>(def);
}

w/ Subtags

In order to match an Axiom with subtags like %core.wrap, do this:

void foo(Ref def) {
if (auto wrap = match<core::wrap>(def)) {
auto [a, b] = wrap->args<2>();
auto mode = wrap->decurry()->arg();
switch (wrap.id()) {
case core::wrap::add: // ...
case core::wrap::sub: // ...
case core::wrap::mul: // ...
case core::wrap::shl: // ...
}
}
if (auto add = match(core::wrap::add, def)) {
auto [a, b] = add->args<2>();
auto mode = add->decurry()->arg();
}
// def must match as a core::wrap - otherwise, asserts
auto wrap = force<core::wrap>(def);
// def must match as a core::wrap::add - otherwise, asserts
auto add = force(core::wrap::add, def);
}
auto match(Ref def)
Definition axiom.h:105
auto force(Ref def)
Definition axiom.h:126

Summary

The following table summarizes all important casts:

dynamic_cast
static_cast
Returns If def is a ...
match<mem::load>(def)
force<mem::load>(def)
Match<mem::load, App> %mem.load (T, as) (mem, ptr)
match<core::wrap>(def)
force<core::wrap>(def)
Match<core::wrap, App> %core.wrap.??? s m (a, b)
match(core::wrap::add, def)
force(core::wrap::add, def)
Match<core::wrap, App> %core.wrap.add s m (a, b)

Working with Indices

There are essentially three ways of retrieving the number of elements of something in Thorin.

Arity

This is the number of elements of to extract/insert a single element. Note that the number of elements may be unknown at compile time such as in ‹n; 0›.

Proj

thorin::Def::num_projs is the same as thorin::Def::arity, if the arity is a thorin::Lit. Otherwise, it is simply 1. This concept only exists in the C++-API to give the programmer the illusion to work with n-ary functions, e.g.:

for (auto dom : pi->doms()) { /*...*/ }
for (auto var : lam->vars()) { /*...*/ }

But in reality, all functions have exactly one domain and one codomain.

Thresholded Variants

In additition, there are thresholded variants available that are prefixed with a t and take thorin::Flags::scalerize_threshold (--scalerize-threshold) into account. The method thorin::Def::num_tprojs returns the same as thorin::Def::num_projs, but will simply yield 1, if the arity exceeds the threshold. thorin::Def::tproj, thorin::Def::tprojs, thorin::Lam::tvars, etc. work accordingly.

See also:

Shape

TODO

Summary

Expression Class artiy isa_lit_artiy as_lit_artiy num_projs num_tprojs
(0, 1, 2) Tuple 3 3 3 3 3
‹3; 0› Pack 3 3 3 3 3
‹n; 0› Pack n std::nullopt asserts 1 1
[.Nat, .Bool, .Nat] Sigma 3 3 3 3 3
«3; .Nat» Arr 3 3 3 3 3
«n; .Nat» Arr n std::nullopt asserts 1 1
x: [.Nat, .Bool] Var 2 2 2 2 2
‹32; 0› Pack 32 32 32 32 1

The last line assumes thorin::Flags::scalerize_threshold = 32.

Iterating over the Program

There are several ways of doing this. It depends on what exactly you want to achieve and how much structure you need during the traversal. The simplest way is to kick off with World::externals and recursively run over Def::extended_ops like this:

DefSet done;
for (const auto& [_, mut] : world.externals())
visit(done, mut);
void visit(DefSet& done, const Def* def) {
if (!done.emplace(def).second) return;
do_sth(def);
for (auto op : def->extended_ops())
visit(done, op);
}
GIDSet< const Def * > DefSet
Definition def.h:60

However, you will most likely want to use the pass or the phase infrastructure.