AnyDSL - A Partial Evaluation Framework for Programming High-Performance Libraries
AnyDSL is a framework for domain-specific libraries (DSLs). These libraries are implemented in our language Impala. To achieve high performance, Impala partially evaluates away the abstractions these libraries impose. Partial evaluation and other optimizations are performed on Thorin, AnyDSL's intermediate representation.
Support
You can ask for support on Discord.
Current Dates
Join the AnyDSL Workshop on July 21, 2022!
AnyDSL Architecture
Embedding of DSLs in Impala
When developing a DSL, people from different areas come together:
- the application developer who just wants to use the DSL,
- the DSL designer who develops domain-specific abstractions, and
- the machine expert who knows the target machine very well and knows how to restructure the code to achieve good performance.
AnyDSL allows a separation of these concerns using
- higher-order functions,
- partial evaluation, and
- triggered code generation.
Application Developer
fn main() {
    let img = load("dragon.png");
    let blurred = gaussian_blur(img);
}
DSL Designer
fn gaussian_blur(field: Field) -> Field {
    let stencil: Stencil = { /* ... */ };
    let mut out: Field = { /* ... */ };
    for x, y in @iterate(out) {
        out.data(x, y) = apply_stencil(x, y, field, stencil);
    }
    out
}
Machine Expert
fn iterate(field: Field, body: fn(int, int) -> ()) -> () {
    let grid = (field.cols, field.rows, 1);
    let block = (128, 1, 1);
    with nvvm(grid, block) {
        let x = nvvm_tid_x() + nvvm_ntid_x() * nvvm_ctaid_x();
        let y = nvvm_tid_y() + nvvm_ntid_y() * nvvm_ctaid_y();
        body(x, y);
    }
}
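The three roles above can also be tried out in a runnable form. The following is a minimal sketch in Rust, not Impala, and it targets the CPU only. Everything in it is an assumption made for illustration: the `Field` layout, the clamp-to-edge border handling, the 3x3 binomial stencil standing in for the elided `Stencil`, and the signature of `iterate`. Rust's generic closures and inlining only approximate what Impala's partial evaluator does; the point is the structure, in which the traversal strategy lives in `iterate` and can be swapped without touching `gaussian_blur` or `main`.

```rust
// Hypothetical CPU-only analogue of the Impala snippets above.

/// A 2D field of f32 values stored row-major (assumed layout, non-empty).
#[derive(Clone, Debug, PartialEq)]
struct Field {
    cols: usize,
    rows: usize,
    data: Vec<f32>,
}

impl Field {
    fn new(cols: usize, rows: usize, value: f32) -> Self {
        Field { cols, rows, data: vec![value; cols * rows] }
    }

    /// Read with clamp-to-edge border handling (an assumption of this sketch).
    fn get_clamped(&self, x: isize, y: isize) -> f32 {
        let cx = x.clamp(0, self.cols as isize - 1) as usize;
        let cy = y.clamp(0, self.rows as isize - 1) as usize;
        self.data[cy * self.cols + cx]
    }
}

// Machine expert: decides how the iteration space is traversed.
// Here it is a plain sequential loop nest over all (x, y) positions.
#[inline(always)]
fn iterate<F: FnMut(usize, usize)>(cols: usize, rows: usize, mut body: F) {
    for y in 0..rows {
        for x in 0..cols {
            body(x, y);
        }
    }
}

// DSL designer: a 3x3 binomial kernel as a stand-in for the Gaussian stencil.
const STENCIL: [[f32; 3]; 3] = [
    [1.0 / 16.0, 2.0 / 16.0, 1.0 / 16.0],
    [2.0 / 16.0, 4.0 / 16.0, 2.0 / 16.0],
    [1.0 / 16.0, 2.0 / 16.0, 1.0 / 16.0],
];

// Weighted sum of the 3x3 neighborhood around (x, y).
fn apply_stencil(x: usize, y: usize, field: &Field) -> f32 {
    let mut sum = 0.0;
    for (j, row) in STENCIL.iter().enumerate() {
        for (i, w) in row.iter().enumerate() {
            let nx = x as isize + i as isize - 1;
            let ny = y as isize + j as isize - 1;
            sum += w * field.get_clamped(nx, ny);
        }
    }
    sum
}

// DSL designer: expresses the blur in terms of `iterate`, without knowing
// how the traversal is mapped to the hardware.
fn gaussian_blur(field: &Field) -> Field {
    let mut out = Field::new(field.cols, field.rows, 0.0);
    let cols = out.cols;
    iterate(field.cols, field.rows, |x, y| {
        out.data[y * cols + x] = apply_stencil(x, y, field);
    });
    out
}

// Application developer: only calls the DSL function.
fn main() {
    let img = Field::new(4, 3, 1.0);
    let blurred = gaussian_blur(&img);
    println!("{:?}", blurred.data);
}
```

Replacing the body of `iterate` with a tiled or multi-threaded traversal would change the schedule while leaving the other two functions as they are, which mirrors the role split described above.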
Selected Results
Rodent: https://github.com/anydsl/rodent
Rodent is a BVH traversal library and renderer implemented using the AnyDSL compiler framework. It is a renderer-generating library that converts a 3D scene into code that is optimized and specialized for that scene on CPUs and GPUs. Compared with state-of-the-art renderers, we obtain the following speedups:
- Embree (Intel): up to 23% faster
- OptiX (NVIDIA): up to 31% faster (megakernel)
- OptiX (NVIDIA): up to 42% faster (wavefront)
Rodent also supports ARM CPUs and AMD GPUs.
Stincilla: https://github.com/anydsl/stincilla
Stincilla is a DSL for stencil codes. We used the Gaussian blur filter as an example and compared it against the implementations in OpenCV 3.0 as a reference. We achieved the following results:
- Intel CPU: 40% faster
- Intel GPU: 25% faster
- AMD GPU: 50% faster
- NVIDIA GPU: 45% faster
- Up to 10x shorter code
RaTrace: https://github.com/anydsl/traversal
RaTrace is a DSL for ray traversal.
- 17% faster on NVIDIA GTX 970 (reference: Aila et al.)
- 11% faster on Intel Core i7-4790 using type inference (reference: Embree)
- 10% slower on Intel Core i7-4790 using auto-vectorization (reference: Embree)
- 1/10th of coding time according to Halstead measures