Disclaimer

This documentation is incomplete and not guaranteed to be up to date.

Introduction

This is a high-level overview of some of the features of the Onyx programming language. A basic knowledge of programming and computer systems is assumed. This documentation is not designed to be read top-to-bottom, so feel free to jump around as it makes sense. Most of the examples can be copied into the main procedure on Onyx Playground.

Hello, Onyx!

The following is the famous "Hello, World!" program, implemented in Onyx.

use core {*}

main :: () {
	println("Hello, World!");
}

Running Onyx

Once your program is saved to hello.onyx, you can compile and run it:

onyx run hello.onyx

Compiling Onyx

You can also compile Onyx to a WebAssembly binary, and run it later:

onyx build hello.onyx -o hello.wasm
onyx run hello.wasm

Philosophy

This section covers some of the high-level design decisions and trade-offs made in the Onyx programming language.

Design Decisions

Preface

The design decisions that shaped Onyx were made over the course of several years and tended to adapt to my preferred programming style throughout that time. I always aimed to keep Onyx's features relatively orthogonal, but there are some overlapping features that target different styles of programming.

Imperative vs Functional

Onyx is an imperative language. You write a sequence of statements that should be executed in the specified order to evaluate your program. This is my preferred style of programming, so that is how I made Onyx.

However, I do enjoy the simplicity of a functional language. The idea of expressing a computation at a higher level appeals to me. Rather than writing a bunch of for-loops, you express what you want to happen, instead of how it should happen.

For this reason, Onyx does have functional-inspired features that make that style of programming accessible. The two features that really make this possible are the pipe operator, and quick procedures. Here is an example of using them together with the core.iter library to express the computation: Sum the squares of the first 5 numbers in a sequence.

use core {iter, println}

main :: () {
    sequence := i32.[5, 2, 4, 9, 29, 8, 2, 8, 3];

    iter.as_iter(sequence)    // Make the iterator
    |> iter.take(5)           // Only take the first 5 elements
    |> iter.map(x => x * x)   // Square each result

    // Sum the squares with a fold operation
    |> iter.fold(0, (x, y) => x + y)
    |> println();             // Print it to the screen
}

While Onyx is largely an imperative language, there are many places where expressing your code in a more functional way like this can actually help readability.

For completeness, here is the same code written in an imperative style.

use core {println}

main :: () {
    sequence := i32.[5, 2, 4, 9, 29, 8, 2, 8, 3];

    sum := 0;
    for value in sequence[0 .. 5] {
        square := value * value;
        sum += square;
    }

    println(sum);
}

Each developer can choose their own style in Onyx, but I want Onyx to be able to support both styles.

Why the ::?

This was inspired by Jai and Odin. It means there is a compile-time constant binding between something (a procedure, struct, union, number, etc.) and a symbol.

Here are some examples:

A_String    :: "A compile-time string"
A_Number    :: 42
A_Struct    :: struct { }
A_Union     :: union { }
A_Procedure :: () { }

This syntax might look strange at first, but it actually simplifies things quite a bit. Notice how every kind of definition looks the same. It's always name :: thing. This means there is no longer a difference between things that are anonymous and things that are nominal. If you want to write an anonymous procedure, you simply leave the binding off of it. This is a silly example, because you couldn't call this procedure without a way to reference it, but it does compile.

// No named procedure
(x: i32, y: i32) -> i32 {
    return x + y;
}

The colon is actually relatively special in Onyx. Anywhere there is a :, a new symbol is being declared. To find (almost) all symbol declarations in a file, you can use the regular expression:

[a-zA-Z0-9]+\s?:

Note, the only exception to this rule is quick procedures, whose syntax does not use the colon, for the sake of being as terse as possible.
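For illustration, here is a quick procedure in action, reusing the core.iter calls from the earlier example (a small sketch, so treat the details as illustrative):

use core {iter, println}

main :: () {
    // 'x => x * 2' is a quick procedure: no colon, no binding, and no
    // declared types; just a parameter name and an expression.
    doubled := iter.as_iter(1 .. 4)
        |> iter.map(x => x * 2)
        |> iter.collect();

    println(doubled);
}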

Semi-colons

To many, semi-colons are (or at least should be) a thing of the past. While I don't entirely disagree, Onyx currently does require them at the end of every statement. This is because of a larger trade-off: Onyx is whitespace agnostic. You can remove any whitespace that is not between a keyword and a symbol, and the program will continue to work.
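For example, the following sketch compiles the same as a conventionally formatted version, because the semi-colons, not the line breaks, separate the statements (not recommended style, just a demonstration):

use core {*}

main::(){x:=1;y:=x+2;println(y);}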

This might not seem that important, but it is part of a larger goal to keep the Onyx language as unopinionated as possible. You should be able to space out and format your code as you please, without the compiler getting in the way. While good style should obviously be used, I don't believe it is the onus of Onyx to enforce style. After more Onyx code exists, it might be worth creating something like onyx fmt, like go fmt, but in the meantime that is not a priority.

You might think, "Why not use newlines as semi-colons?" This is a good point and something I have looked into. There are several features in Onyx that make this a little tricky and would force you to write code in a particular way.

For example, if/else expressions do not work well like this. Here is some code that is ambiguous without semi-colons.

x := foo()

if x == 5 {
    // ...
}

// Could be interpreted as this, which would not compile.

x := foo() if x == 5 {
    // ...
}

You might say that since if is on a new line, it shouldn't join with the previous line. That would work, but then you would have to write if/else expressions on the same line (or at least the if part).

x := foo() if condition
           else otherwise

This might be a worthwhile trade-off in the future, but that is to be decided later.

Why explicitly overloaded procedures?

Onyx uses explicitly overloaded procedures, over the more "traditional" implicitly overloaded procedures. In my experience, implicitly overloaded procedures sound like a good idea, until there are many overloads with complicated types that could be ambiguous. See SFINAE for an example of what I am talking about.

To avoid this, Onyx's overloaded procedures must be explicitly declared and explicitly overloaded, and there is a defined order in which overloads are checked. This does cause slightly more verbose syntax and requires a little more planning, but it simplifies things for the code writer, the code reader, and the compiler writer. I believe it is a win-win-win.
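As a rough sketch of what this looks like in practice (the #match and #overload directives mirror their use in the standard library; treat the exact details here as illustrative rather than authoritative):

use core {println}

// The overloaded procedure and its initial overloads are declared together.
describe :: #match {
    (x: i32) { println("an integer"); },
    (x: str) { println("a string"); },
}

// Additional overloads are registered explicitly, in a defined order.
#overload
describe :: (x: f32) {
    println("a float");
}

main :: () {
    describe(10);
    describe("hello");
    describe(1.5f);
}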

Why WebAssembly?

WebAssembly's (very condensed) History

WebAssembly (WASM) is a new bytecode format that has become one of the largest misnomers in the computing space. While WASM started on the Web, it quickly found uses outside of the web browser. Since it is a platform- and architecture-independent bytecode, it can be used in much the same way as the JVM, Erlang's BEAM, or the .NET CLR. The thing that makes WASM so appealing is that the bytecode format is very simple and unopinionated, while the other bytecode options are closely tied to the programming languages they were designed to run. WASM is meant to be a compilation target for every language.

WASM by design is sandboxed and safe to execute on any system. In order for a WASM binary to do anything, it must import functions from the host environment. In the browser, these would be defined in JavaScript. Outside of the browser, they have to be defined by the WASM runner. To prevent everyone from making their own standard, the WebAssembly System Interface (WASI) was made to cover most common use cases, like file operations and working with standard input/output.

WASI was a great step toward getting WASM out of the browser, but it does leave much to be desired. For example, at the time of writing it does not support networking, which makes writing a whole class of useful programs impossible. To fix this, Wasmer created WASIX, an extended WASI specification that fills in the gaps of the WASI specification.

Note, Onyx fully supports WASIX by compiling with -r wasi -DWASIX.

There is work being done to create the WebAssembly Component Model, which is a way for programs written in a variety of different languages to all interoperate with one another, much like how programs from Java, Kotlin, and Scala can interact because they all run on the JVM. This proposal is nearing completion, but Onyx is waiting until more languages implement it to see how all of the details shake out. It is on the roadmap for Onyx to support it.

Why choose WebAssembly?

While WASM is great for its purpose, its purpose does seem a little niche. Why compile to WASM when you could just compile to native machine code? Why target WASM directly when you could target LLVM, and then get WASM for free, plus all other platforms?

I will preface this by saying that WASM and Onyx are not for every use case. While I hope to see WASM (and Onyx) used in more places, it is not meant to replace everything.

Onyx targets WASM for the following reasons:

  • WASM has a strong future in cross-platform deployment. WASM is already being used as an alternative to Docker containers in serverless deployments. WASM is also being used as a plugin system in editors and game engines. Almost every non-embedded system has some form of WASM capability because WASM is provided by all modern browsers.

  • WASM is safe. With its sandbox, explicit imports, and explicit permissions, WASM (together with WASI) is much safer for the end user when compared to native binaries and other bytecodes.

  • WASM is fast. WASM is simple to compile to, resulting in very fast compilation. WASM is translated to native machine instructions on every platform resulting in very high performance as well. There are even projects that can do this compilation ahead of time, so they can truly compile a WASM binary into a native binary.

  • WASM is easy. Onyx is not my full-time job. I do not have enough time or patience to work with LLVM. While it can produce great results and is an industry-leading technology for good reason, it is not known to be easy to work with. Also, targeting machine code directly would be just as hard and probably more time-consuming.

  • WASM is inconsequential. While counter-intuitive, the fact that Onyx compiles to WASM is mostly transparent to the end user. When using onyx run, Onyx feels like a scripting language, because the WASM details are hidden from the programmer. In production cases where the end-user does not have Onyx installed, see the above bullet point.

While WASM might not be the right choice for your project, Onyx only aims to provide a great developer experience for projects that want to use WASM.

Why use Onyx?

Onyx is a WebAssembly-first language. Onyx aims to make it as easy as possible to start working with WebAssembly. For that reason, Onyx is very well suited for the niche kinds of projects that require using WebAssembly. WebAssembly is growing in popularity outside of the browser because of projects like Wasmer, Wasmtime, and WasmEdge that make it easy to run WebAssembly in a controlled environment. These "controlled environments" could be game engines, where WASM is used as a "script" system; cloud functions, where WASM is used to respond to requests; or plug-in systems for editors and tools.

For more details, see the section on Why WebAssembly?.

Why not use Onyx?

Due to the tradeoffs and choices Onyx has made, Onyx is not suited for every use-case. For that reason, I don't expect Onyx to take off in the same way that Rust or Go took off.

There are many kinds of projects where Onyx will never be able to be used, and that's okay. I only want Onyx to be great for the projects that can use Onyx and WebAssembly. Some projects that Onyx would not be suited for would be:

To drive the point home, there will likely never be a "rewrite it in Onyx" trend like there is with Rust. Onyx is not aiming to replace Rust, Go, Zig, C++, or whatever your favorite language is. Onyx is a new language, filling the rather niche purpose of supporting WebAssembly above all else. I do not see WebAssembly being a limitation of Onyx; rather, I see Onyx pushing the boundaries of WebAssembly.

Onyx's runtime

One interesting point to make is that the Onyx toolchain bundles a WebAssembly runner. This means that developing in Onyx feels just like developing in NodeJS or Python. You run your program with onyx run, just like you would run node or python. The fact that Onyx compiles to WebAssembly only matters when you are trying to ship your project. For that, it is possible (but undocumented) to compile a standalone executable of your project that bundles your WASM code and the runtime. Other than a slightly slower startup time, it feels and acts just like a native executable.

Memory Management in Onyx

Onyx has manually managed memory. Now, you're probably thinking of C, where you have to be careful that every malloc has a matching free. Onyx's memory management has some aspects of that, but Onyx provides many tools that make it much easier to avoid mistakes.

Why manually managed memory?

There are realistically two alternatives, both of which Onyx chooses not to do.

  • Garbage collection
  • Borrow checking semantics

Garbage collection is not really possible in WASM, due to the way WASM is run. There is not a way to have an external source stop execution and do a garbage collection pass.

You might be thinking, but what about WASM GC? WASM GC is not a direct solution for this. WASM GC is a major change affecting the entire programming paradigm in WASM. Instead of structures and compound types being stored in linear memory, they are stored as references to external objects, and are operated on with an entirely new class of instructions. This is why they can be garbage collected: the external runtime can see everywhere they are stored, because they aren't stored in linear memory. Onyx may support WASM GC in the future, but it is not a priority item.

Borrow checking semantics are definitely a viable route to do memory management in WASM. Rust is as popular as it is for a good reason, and the community has made some incredible things in it. However, for Onyx, I did not want to have to "fight" the borrow checker. I wanted to write code how I wanted to write code. There are many projects I work on that I know have memory leaks, but I don't care. I'm just experimenting and want to get something working before I do a final pass. Rust is great for that final pass, and I have nothing against people who use and love Rust. For Onyx, I wanted to have more control because I feel like managing memory is not as hard as people make it out to be; especially when you have the right tools.

The right tool: Custom Allocators

Like Zig and Odin, Onyx has full support for creating custom allocators. Everything that works with memory must explicitly use an Allocator. All core data structures and functions that allocate memory have the option to specify which Allocator to use.

If you've never programmed with custom allocators, this might seem a little weird or complicated, but it actually simplifies programming and memory management.

One thing to realize is that most allocations you make have a well-defined lifetime, or time in which they can be accessed. Sometimes that lifetime can be hard to describe to something like a borrow checker, but allocations break down into four categories:

  • Very short term: Allocations that likely only live to the end of a function.
  • Short term: Allocations that live until the end of the main loop.
  • Infinite: Allocations that will never be freed.
  • Actually manually managed: Allocations you actually have to think about when they are freed.

For very short term allocations, Onyx has the defer keyword. Deferred statements or blocks are executed right before the current block or function exits. They are convenient because you can place the freeing call right next to the allocation call, so it is easy to keep track of. They also prevent you from accidentally forgetting to free something when you add an early return to the function.

Note, the defer keyword is nothing new, and is present in many popular programming languages today.
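Here is a minimal sketch of the pattern, using the builtin make and delete pair together with defer (the dynamic-array append operator << is used purely for illustration):

use core {println}

main :: () {
    // The freeing call sits directly under the allocation, and runs on
    // every path that exits main, including early returns.
    numbers := make([..] i32);
    defer delete(&numbers);

    numbers << 10;
    numbers << 20;
    println(numbers.count);
}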

For short term allocations, Onyx has the temporary allocator. The temporary allocator is a thread-local allocate-only allocator. It simply grows an arena of memory as you allocate from it. You cannot free from the temporary allocator. Instead, you free everything in the temporary allocator, all at once, by calling core.alloc.clear_temp_allocator(). This is generally done at the end or beginning of the main loop of the program.
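A sketch of how that might look inside a main loop; the context.temp_allocator name and the allocator parameter of make are assumptions here, while core.alloc.clear_temp_allocator is the procedure named above:

use core {alloc, println}

main :: () {
    for frame in 0 .. 3 {
        // Scratch data that only needs to live for this iteration.
        // (make with an explicit allocator is assumed for illustration.)
        scratch := make([..] i32, allocator=context.temp_allocator);
        scratch << frame;
        println(scratch.count);

        // Free everything allocated from the temporary allocator, all at once.
        alloc.clear_temp_allocator();
    }
}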

Infinitely living allocations are the easiest. Just don't free it.

I would estimate that only about 5% of the time do you actually have to think about how long an allocation has to live. For that 5%, it might take a little planning and debugging before your memory management works entirely correctly, but I believe that is worth not fighting a borrow checker.

Why should you manage memory?

This may be a hot take, but for many programs, you don't need to manage memory. If your program is only going to do a task and then exit, it does not need to manage its memory.

Take the Onyx compiler as an example. It allocates a lot of memory, but since all of that memory is needed until the moment it exits, there is no point in freeing any of it. The operating system will take care of reclaiming the pages when the compiler exits.

I would argue the only time you need to do memory management is when you have a program that is going to run for a long time. Things like games, web servers, graphical applications, etc. In all of these cases, there is a central main loop that drives the program. This main loop creates a great natural boundary for when certain things can be freed.

The temporary allocator exists for this purpose. You allocate into the temporary allocator when you have something that should be valid for this loop, but should not live past the end of the loop.

The HTTP Server package for Onyx uses this strategy, but even more aggressively. It replaces the main allocator (context.allocator, which is used by default throughout the standard library) with a GC allocator. This allocator tracks every allocation made in it, and can free everything in a single call. Every request handler uses this allocator, so you can intentionally "forget" to free everything, and the GC allocator will automatically free it all when the request has been processed.

Literals

Boolean Literals

Onyx contains the standard boolean literals: true and false. They must be spelled all lowercase, as they are actually just global symbols. This means that if you are feeling particularly evil, you could change what true and false mean in a particular scope, but I highly recommend you don't.

Numeric Literals

Onyx contains the following numeric literals:

123          // Standard integers
0x10         // Hexadecimal integers

4.0          // Floating point
2.3f         // Floating point, of type f32.

'a'          // Character literals, of type u8

Integer literals are special in that they are "un-typed" until they are used. When used, they will become whatever type is needed, provided there is no loss of precision when converting. Here are some examples:


x: i8 = 10;
y := x + 100;  // 100 will be of type i8, and that is okay because
			   // 100 is in the range of 2's-complement signed
			   // 8-bit numbers.


x: i8 = 10;
y := x + 1000; // This will not work, as 1000 does not fit into
			   // an 8-bit number. This will result in a compile
			   // time error.

x: f32 = 10.0f;
y := x + 100;  // 100 will be of type f32. This will work, as 100
			   // fits into the mantissa of a 32-bit floating
			   // point number, meaning that there is no loss
			   // of precision.

Character Literals

Character literals are written in the following way.

'a'

Note, Onyx used to have #char "a" because the single-quote character was being reserved for some other use. That other use did not appear in 3 years of development, so the single-quote was given over to serve as the character literal syntax.

String Literals

Onyx contains the following string-like literals:

"Hello!"       // Standard string literals, of type 'str'.

#cstr "World"  // C-String literals, of type 'cstr'.

"""            // A multi-line string literal, of type 'str'.
Multi          // Note that the data for the multi-line literal
line           // starts right after the third quote, so technically
string         // all of these "comments" would actually be part of the
literal        // literal.
"""

In Onyx, there are 3 string types: str, cstr, and dyn_str. cstr is analogous to a char * in C: a string represented as a pointer to an array of bytes that is expected to end in a '\0' byte. Onyx has this type for compatibility with some C libraries.

Most Onyx programs solely use str, as it is safer and more useful. A str is implemented as a 2-element structure, with a pointer to the data and an integer count. This is safer, as a null-terminator is not necessary, so a buffer-overflow is much harder to cause. To convert a cstr to a str, use string.from_cstr.

dyn_str, or dynamic string, is a string type that allows for modifying the string by appending to it. It is implemented as a dynamic array of u8, so any array function will work with it. To make code more idiomatic and readable, the core.string package also has functions for working with dynamic strings, such as append and insert.
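As a small sketch of dyn_str in use (the exact string.append signature is an assumption based on the description above):

use core {string}

main :: () {
    greeting: dyn_str;

    // Append onto the dynamic string; it grows like any dynamic array.
    // (string.append taking a pointer to the dyn_str is assumed here.)
    string.append(&greeting, "Hello");
    string.append(&greeting, ", Onyx!");

    // greeting now holds the bytes of "Hello, Onyx!".
}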

Built-in constants

null           // Represents an empty pointer
null_proc      // Represents an empty function pointer

You may be wondering why there is a separate value for an empty function pointer. This is due to Onyx's more secure runtime compared to C. In WebAssembly (Onyx's compile target), functions are completely separated from data. Function references are not pointers; they are indices. For this reason, there are two different values that represent "nothing" for their respective contexts.
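A short sketch of the distinction (the function-pointer variable syntax here mirrors the procedure types used elsewhere in this overview):

main :: () {
    // An empty data pointer.
    data_ptr: &i32 = null;

    // An empty function reference (an index, not an address, in WebAssembly).
    callback: () -> void = null_proc;
}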

Declarations

Declaring variables in Onyx is very similar to declaring variables in many other modern programming languages. A single colon (:) is used to declare a variable, optionally followed by its type and/or the initial value for the variable.

<variable name>(, <variable name>)* : <declared type> = <initial value> ;

Inferred Types

If the type of the initial value can be determined, then the declared type of the variable is optional, and it will be inferred from the type of the initial value.

Examples

Here we declare a variable called x of type i32. It is guaranteed that x will be initialized to 0 here.

x: i32;

Here we declare a variable y explicitly as type i32, with an initial value of 10.

y: i32 = 10;

Here we declare a variable z with an inferred type. Since the declared type was omitted, it will copy the type of the initial value. In the absence of other type information, the literal 10 has type i32, so z will be of type i32.

z := 10;

Blocks

There are 3 ways of expressing a block of code in Onyx, depending on the number of statements in the block.

Multi-statement Blocks

The first way is to use curly-braces ({}) to surround the statements in the block, with statements being delimited by a semi-colon.

{
	stmt1;
	stmt2;
	// ...
}

Single-statement Blocks

The second way is to place the do keyword before a statement to create a single-statement block. This is how single-statement bodies are written in if, while, and for statements. You can of course write { stmt; } instead of do stmt; if you prefer.

do stmt;

// More commonly
if some_condition do some_stmt;

Zero-statement Blocks

The third and final way is a little redundant, but it's in the language because it appeals to some people. When a block is needed but contains no statements, three dashes (---) can be used as an equivalent to {}.

if condition ---

switch value {
	case 1 ---
	// ...
}

Bindings

Bindings are a central concept in Onyx. A binding declares that a certain symbol is bound to something in a scope. This "something" can be any compile-time known object. Here is a non-exhaustive list of compile-time known objects:

  • procedures
  • macros
  • structs
  • enums
  • packages
  • constant literals

Syntax

A binding is written in the following way:

symbol_name :: value

This says that symbol_name will mean the same thing as value in the scope that it was declared in. Normally, value is something like a procedure, structure, enum, or some other compile-time known object. However, there is nothing wrong with re-binding a symbol to give it an alternate name.

Note, the ability to alias symbols to other symbols has an interesting consequence. Since names are not inherently part of a procedure or type definition, a procedure or type can have multiple names.

f :: () { ... }
g :: f

Notice that the procedure defined here can be called as either g or f. When reporting errors, the compiler will use the name that was originally bound to the procedure (f).

Use as constants

Onyx does not have a way to specify constant variables. When you declare a variable with :=, it is always modifiable. While constants are very useful, Onyx would suffer from the same problem that C and C++ have with constants: they aren't necessarily constant. You can take a pointer to a constant and use that pointer to modify it. Onyx does not want to make false promises about how constant something is.

That being said, bindings can serve as compile-time constants. You can declare a binding to a constant literal, or something that can be reduced to a constant literal at compile time. Here are some examples.

A_CONSTANT_INTEGER :: 10
A_CONSTANT_FLOAT :: 12.34
A_CONSTANT_STRING :: "a string"

// Since A_CONSTANT_STRING.length and A_CONSTANT_INTEGER are compile-time known
// the addition can happen at compile-time.
A_CONSTANT_COMPUTED_INTEGER :: A_CONSTANT_STRING.length + A_CONSTANT_INTEGER

Targeted Bindings

Bindings can also be placed into a scope other than the current package/file scope. Simply prefix the binding name with the name of the target scope, followed by a period (.).

Foo :: struct {}

// `bar` is bound inside of the `Foo` structure.
Foo.bar :: () {
}

Here the bar procedure is placed inside of the Foo structure. This makes it accessible using Foo.bar. When combined with the method call operator, methods can be defined on types in Onyx in a similar manner to Go.

The target scope does not have to be a structure however; it can also be a package, union, enum, or #distinct type.

Using targeted bindings is very common in many Onyx programs, because it allows for defining procedures that are associated with a type directly on the type. This makes them easier to find, and allows them to be used with the method call operator.
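For example, here is a sketch of a targeted binding used together with the method call operator described later in this overview:

use core {println}

Vec2 :: struct { x, y: f32; }

// Bind 'length_squared' inside the scope of Vec2.
Vec2.length_squared :: (v: Vec2) -> f32 {
    return v.x * v.x + v.y * v.y;
}

main :: () {
    v := Vec2.{ 3, 4 };

    println(Vec2.length_squared(v));  // Through the type's scope.
    println(v->length_squared());     // Through the method call operator.
}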

Program Structure

An important problem that every programming language tackles in a different way is: how do you structure larger, multi-file programs?

While each way of tackling this problem has its own advantages and disadvantages, Onyx takes a relatively simple approach. The following are the core principles:

  • No incremental compilation.
  • Divide files into packages.
  • Dissociate the package hierarchy from the folder hierarchy.

No incremental compilation

Onyx does not do incremental compilation. Everything is recompiled, from scratch, every time. This may seem like a drawback, not a feature, but it simplifies the development process immensely.

Onyx has a number of features that could not be partially compiled in any reasonable way: macros, runtime type information, and static-if statements, to name just a few. Instead of shoehorning a solution into the compiler, Onyx simply avoids partial/incremental compilation.

Onyx's compiler is very fast. While no incredibly large programs have been written in Onyx yet, a simple calculation shows that the compiler could theoretically compile 100-200 thousand lines per second in a larger project. For this reason, incremental compilation is not necessary, as your project will compile almost instantly, no matter its size.

Note, Onyx's compiler is currently still single-threaded. If and when this issue is addressed and a multi-threaded compilation model is achieved, it is not impossible to reach over one-million lines per second.

One large downside of partial compilation is the need to worry: "Did my whole project get recompiled and updated? Or am I still testing old code?" With a poorly configured build system, it is quite easy to cause this issue, which can lead to hours of frustrating debugging. This is another reason Onyx avoids partial compilation. You know for a fact that every time you compile, it is a fresh build.

Divide files into packages

In Onyx, every source file is part of a package. The package a file is part of is declared in the first source line of the file.

// There can be comments before the package declaration.

package foo

func :: () {}

Struct :: struct {}

The above file declares that it is part of the foo package. All symbols declared public in this file (func and Struct) are placed in the public scope of foo.

When another file wants to use these symbols, all it has to do is use foo. It can then use foo to access things inside of the foo package.

package main

use foo

main :: () {
    foo.func();
}

Note, see more about this in the Packages section.

Dissociate the package hierarchy from the file hierarchy

Unlike many other languages, Onyx does not enforce parity between the hierarchy of files on disk and the hierarchy of packages in the program. Any file can be part of any package. While this does come at a readability loss, it offers greater flexibility to the programmer.

TODO Explain why this is a good thing, because it is, trust me.

Loading Files

When the source code for a project is split across multiple files, the Onyx compiler needs to be told where all of these files are, so it knows to load them. This can be done in a couple different ways.

Using the CLI

When running onyx run or onyx build, a list of files is provided. All of these files will be loaded into the program, in their respective packages. This can be a reasonable way of doing things for a small project, but quickly becomes unwieldy.

$ onyx build source1.onyx source2.onyx source3.onyx ...

Using #load directives

The idiomatic way of loading files into a program is using the #load directive. The #load directive is followed by a compile-time string as the file name, and tells the compiler to load that file.

The given file name is used to search relative to the path of the file that contains the #load directive.

The file name can also be of the form "mapped_directory:filename". In this case, the file will be searched for in the given mapped directory. By default, there is only one mapped directory named core, and it is set to $ONYX_PATH/core. Other mapped directories can be set using the command-line argument --map-dir.

Note, the compiler automatically caches the full path to every file loaded, so no file can be loaded more than once, even if multiple #load directives would load it.

// Load file_a from the same directory as the current file.
#load "file_a"

// Load file_b from the mapped directory 'foo'.
#load "foo:file_b"

Using #load_all

Sometimes, every file in a directory needs to be loaded. To make this less tedious, Onyx has the #load_all directive. Much like #load, it takes a compile-time string as the path relative to the current file, and it will load every .onyx file in that path, but it does not perform a recursive search. Any sub-directories will need a separate #load_all directive.

#load_all "/path/to/include"

#load_all "/path/to/include/subdirectory"

Packages

Onyx has a relatively simple code organization system that is similar to other languages. To organize code, Onyx has packages. A package is a collection of files that define the public and private symbols of the package. To use the symbols inside of a package, each file must declare that it is using the package. This is done with the use keyword.

Let's say we have two source files, foo.onyx and bar.onyx. They are part of the packages foo and bar, respectively. If bar.onyx wants to use some code from foo.onyx, it has to use foo before it can do so. This makes the foo package accessible inside of bar.onyx.

// foo.onyx

// Declares that this file is part of the "foo" package.
package foo

some_interesting_code :: () {
    // ...
}
// bar.onyx

// Declares that this file is part of the "bar" package.
package bar

use foo

func :: () {
    foo.some_interesting_code();
}

It is important to note that while it may be a good idea to organize your source files into directories that correspond to the package that they are in, there is no limitation as to which files can be part of which packages. Any file can declare that it is part of any package. There may be a future language feature to optionally limit this.

Scoping

While controversial, Onyx has opted for a public-by-default symbol system. Unless marked otherwise, all symbols inside a package are accessible from outside that package. If there are implementation details to hide, they can be scoped to either the current package or the current file.

Use #package before a binding to declare that the scope is internal to the package. Any file that is part of the same package can see the symbol, but external files cannot.

package foo

public_interface :: () {
    internal_details();
}

#package
internal_details :: () {
    // ...
}

Use #local before a binding to declare that the scope is internal to the file. Only things in the same file can see the symbol.

package foo

public_interface :: () {
    super_internal_details();
}

#local
super_internal_details :: () {
    // ...
}

Note, while Onyx is white-space agnostic, it is common to write the #package and #local directives on a separate line before the binding.

If you have a large set of implementation details, it might be more readable to use the block version of #local and #package.


public_interface :: () {

}

#local {
    lots :: () {
    }

    of :: () {
    }

    internal :: struct {
    }

    details :: enum {
    }
}

Notable packages

There are several package names that have been taken by the Onyx standard library and/or have a special use.

builtin package

The builtin package is notable because it is required by the compiler to exist. It contains several definitions needed by the compiler, for example Optional and Iterator. These definitions live in core/builtin.onyx, which is a slight misnomer because the builtin package is separate from the core module.

builtin is also special because its public scope is mapped to the global scope of the program. This means that anything defined in the package is available without using any packages.
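For instance, the following sketch compiles without any use statements at all, because Optional (written here with the ? shorthand used later in this overview) comes from the builtin package:

main :: () {
    // No 'use' is needed for Optional; it is visible from the global scope.
    maybe_value: ? i32;
}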

runtime package

The runtime package is required by the compiler to exist, as the compiler places several variables in it related to the current operating system and selected runtime.

runtime.vars package

runtime.vars is used for configuration variables. It is the "dumping ground" for symbols defined on the command line with the -D option. Use #defined to tell if a symbol was defined or not.

use runtime

// Compile with -DENABLE_FEATURE to enable the feature
Feature_Enabled :: #defined(runtime.vars.ENABLE_FEATURE)

runtime.platform package

runtime.platform is an abstraction layer used by the core libraries that handles interacting with OS/environment things, such as reading from files and outputting a string.

core package

The core package houses all of the standard library. The way the Onyx packages are structured, the compiler does not know anything about the core package. If someone wanted to, they could replace the entire core library and the compiler would not be affected.

main package

The main package is the default package every file is a part of if no package declaration is made. The standard library expects the main package to have a main procedure that represents the start of execution. It must be of type () -> void or ([] cstr) -> void. If there is not an entrypoint in the program because it is a library, simply use -Dno_entrypoint when compiling, or define a dummy main with no body.
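Here is a sketch of the second accepted form, using string.from_cstr (mentioned in the Literals section) to view each argument as a str:

use core {println, string}

// main can optionally receive the command-line arguments as an array of
// C-strings ([] cstr).
main :: (args: [] cstr) {
    for arg in args {
        println(string.from_cstr(arg));
    }
}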

Use declarations

When a file wants to use code from another package, that package must be explicitly used. This is done with the use declaration. A use declaration binds one or more symbols in the current scope to items in another package. If a use declaration appears at the top level, the bindings are applied at file scope.

Simple case

The simplest use declaration looks like this.

package foo

use bar

func :: () {
    bar.something_interesting();
}

This use declaration says, "bind bar to be the package named bar." This allows func to use bar in its body.

Selective case

To bind to a symbol inside of a package, this syntax can be used.

package foo

use bar {something_interesting}

func :: () {
    something_interesting();
}

This use declaration extracts the symbol something_interesting and binds a symbol of the same name to it in the current file.

Avoiding name conflicts

Sometimes, you want to rename a symbol due to a name conflict, or just to shorten your code. You can do so like this.

package foo

use bar {
    SI :: something_interesting
}

func :: () {
    SI();
}

This use declaration extracts the symbol something_interesting, then binds it to SI in the current file.

Use all

If you want to bring all symbols inside of a package into the current file, say use p {*}.

package foo

use bar { * }

func :: () {
    something_interesting();
}

Use all and package

If you want to bind the package itself, and all symbols inside of the package, simply place the package keyword inside of the {}.

package foo

use bar { package, * }

func :: () {
    something_interesting();
    
    // OR

    bar.something_interesting();
}

Changing package name

If you want to bind the package itself to a different name, provide the alias like in the previous example.

package foo

use bar { B :: package }

func :: () {
    B.something_interesting();
}

Operators

Onyx boasts the typical operators found in any C-inspired programming language, which this section describes briefly.

Math operators

Onyx has the following standard binary operators:

Operator   Use              Works on
+          Addition         integers, floats, multi-pointers
-          Subtraction      integers, floats, multi-pointers
*          Multiplication   integers, floats
/          Division         integers, floats
%          Modulo           integers

Onyx also has the standard unary operators:

Operator   Use        Works on
-          Negation   integers, floats

Comparison operators

Operator   Use                      Works on
==         Equals                   booleans, integers, floats, pointers
!=         Not-equals               booleans, integers, floats, pointers
>          Greater-than             integers, floats
<          Less-than                integers, floats
>=         Greater-than or equals   integers, floats
<=         Less-than or equals      integers, floats

Boolean operators

Onyx has the following binary boolean operators:

Operator   Use   Works on
&&         And   booleans
||         Or    booleans

Onyx has the following unary boolean operator:

Operator   Use   Works on
!          Not   booleans

Implicit Boolean Conversions

In certain circumstances where a boolean value is expected, Onyx will implicitly convert the value to a boolean in the following ways:

  • Pointers: If the pointer is non-null, it is true. If it is null, it is false.
  • Array-like: If the array is empty, it is false. If it is non-empty, it is true.
  • Optionals: If the optional has a value, it is true. Otherwise, it is false.
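Here is a short sketch of these conversions in action; it relies only on the rules listed above (the ? i32 Optional shorthand appears again later in this overview):

use core {println}

main :: () {
    ptr: &i32 = null;
    if !ptr do println("A null pointer converts to false.");

    name := "Joe";
    if name do println("A non-empty string converts to true.");

    maybe: ? i32;
    if !maybe do println("An empty Optional converts to false.");
}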

As an escape-hatch for library writers, it is possible to make anything implicitly cast to bool by overloading the builtin procedure, __implicit_bool_cast. Here is an example of making a custom structure cast to bool implicitly.

use core {println}

Person :: struct {
    age: u32;
    name: str;
}

#overload
__implicit_bool_cast :: (p: Person) -> bool {
    return p.age > 0 && p.name;
}


main :: () {
    p1 := Person.{};
    p2 := Person.{42, "Joe"};

    if !p1 { println("p1 is false"); }
    if  p2 { println("p2 is true"); }
}

Pointer operators

Pointers in Onyx have the following unary operators:

Operator   Use           Works on
&, ^       Address-Of    pointers
*          Dereference   pointers

Note, ^ is being phased-out in favor of & for taking the address of something.

Bitwise operators

Onyx has the following binary bitwise operators:

Operator   Use                            Works on
&          Bitwise-And                    integers
|          Bitwise-Or                     integers
^          Bitwise-Xor                    integers
<<         Bit shift left                 integers
>>         Bit shift right (logical)      integers
>>>        Bit shift right (arithmetic)   integers

Onyx has the following unary bitwise operators:

Operator   Use              Works on
~          Bitwise-Negate   integers

Try/Coalesce operators

Onyx has two special operators that are not given any intrinsic meaning by the compiler. Their use is entirely defined within the standard library. They are the try (?) and coalesce (??) operators.

Try operator (?)

The try operator is a postfix operator that can occur anywhere in an expression. Currently, the try operator is only used by the Optional and Result types. While not enforced by the compiler, the try operator generally acts as an early escape from a procedure.

For Optional, if no value is present, an empty value is returned from the enclosing procedure.

For Result, if an error value is present, the error value is returned from the enclosing procedure.

Here is an example of using the try operator on an Optional.

use core

first :: (arr: [] $T) -> ? T {
    if !arr do return .{};

    return arr[0];
}

double :: (v: $T) -> T {
    return v * 2;
}

compute :: (arr: [] $T) -> ? T {
    v := first(arr)?;
    return double(v);
}

main :: () {
    arr1 := i32.[ 2, 3, 5, 7, 11 ];
    arr2 := i32.[];

    compute(arr1) |> core.println();
    compute(arr2) |> core.println();
}

Coalesce Operator (??)

The coalesce operator is a binary operator that returns the right-side value if the left-side is an empty value. This is defined for Optional and Result types.

Here is an example of the coalesce operator with Optional.

use core

first :: (arr: [] $T) -> ? T {
    if !arr do return .{};

    return arr[0];
}

main :: () {
    arr := i32.[];

    // If first(arr) returns a None value, use 0 instead.
    v := first(arr) ?? 0;
    core.println(v); // Prints 0
}

Cast operators

When two values are not of the same type, one will need to be cast to the other's type. In Onyx, the cast keyword is used. It takes on the two forms described below.

Prefix form

main :: () {
    x: i32 = 10;
    y: f32 = 20.0f;

    // Cast x to be a f32 so it can be added with y.
    result := cast(f32) x + y;

    println(result);
}

Here, a cast is needed to turn x into an f32 so it can be added with y. This cast is in prefix form, which simply means it appears as a prefix to the thing being cast, similar to unary negation (-).

Call form

BaseType :: struct {
    name: str;
}

SubType :: struct {
    use base: BaseType;

    age: u32;
}

main :: () {
    sub_value := SubType.{
        age = 123
    };

    base_ptr: &BaseType = &sub_value;
    
    age_value := cast(&SubType, base_ptr).age;
    println(age_value);
}

In this contrived example, base_ptr is cast to a &SubType using the call form of the cast operator. This form is slightly nicer when you are going to immediately follow the cast operation with a postfix operation, in this case .age. If this were written in prefix form, another set of parentheses would be needed: (cast(&SubType) base_ptr).age.

Auto casting

Sometimes, a cast is necessary for the code to type-check, but it is cumbersome to write the entire cast operation. Maybe the type is too long, or maybe the type is not even available because its package is not used. In these cases, the auto-cast operator can be used.

Auto-cast is written with a double-tilde (~~). This was chosen because there would be no purpose to performing a bitwise negation (~) twice in a row.

To understand auto-cast, treat it like it was a cast(X), and the X is automatically figured out by the compiler. If the auto-cast is not possible, a compile-time error will be reported.

print_float :: (x: f32) {
    printf("{}\n", x);
}

main :: () {
    x := 10;

    // Automatically cast x to an f32, since y is an f32.
    y: f32 = ~~ x;

    // Automatically cast x to an f32, since print_float expects an f32.
    print_float(~~ x);
}

No "magic" casts

Onyx does not have "special" or "magic" casts between two completely unrelated types, such as strings and numbers. Instead, conv.parse and tprintf should be used.

use core {*}

main :: () {
    x := 10;

    // Equivalent to `cast(str)` in some other languages
    x_str := tprintf("{}", x);

    // conv.parse parses a string into the type provided, and returns
    // an optional of that type. So, a default must be provided using ??.
    x_val := conv.parse(i32, x_str) ?? 0;
}

Procedure calls

Calling a procedure uses the traditional postfix () operator. Simply place () after the procedure you would like to call.

// This is discussed in chapter 6, Procedures.
foo :: () -> void {
    // ...
}

main :: () -> void {
    // Using () to call foo.
    foo();
}

Passing Arguments

Arguments are passed in between the (), in a comma-separated fashion. The type of each argument must agree with the expected parameter type, or at least be of a compatible type.

foo :: (x: i32, y: str) -> void {
    // ...
}

main :: () -> void {
    foo(10, "Hello");
}

Named Arguments

Sometimes it is nicer for clarity to explicitly name the arguments being passed. This can be done by specifying the name, then an =, then the value. This specifies that the argument is for the parameter with the same name.

foo :: (x: i32, y: str) -> void {
    // ...
}

main :: () -> void {
    foo(x = 10, y = "Hello");
}

When the arguments are named, their order can be changed.

foo :: (x: i32, y: str) -> void {
    // ...
}

main :: () -> void {
    // The order does not matter here.
    foo(y = "Hello", x = 10);
}

Named arguments are particularly useful when there are a lot of parameters with default values, and you want to modify a small number of them.

// This is a simple example of many defaulted arguments
foo :: (option_1 := true, option_2 := "./tmp", option_3 := 4) -> void {
    // ...
}

main :: () -> void {
    // Override a small number of the default values.
    foo(
        option_2 = "/something_else",
        option_3 = 8
    );
}

If/else operator

Onyx has one ternary operator for inline if-statements. Inspired by Python, it has this form.

true-stmt if condition else false-stmt

Here is a simple example of using it.

use core

main :: () {
    value := 10;

    x := 1 if value < 100 else 0;
    core.println(x); // Prints 1
}

While this operator should be used sparingly for the sake of readable code, it can be very handy in certain circumstances.

Range operator

Ranges in Onyx represent an interval of numbers and are typically used in for-loops and for creating slices of a buffer.

The x .. y binary operator makes a half-open range, representing the set [x, y). For example, the range 1 .. 5 represents a range including 1, 2, 3, and 4, but not 5.

The x ..= y binary operator makes a fully-closed range, representing the set [x, y]. For example, the range 1 ..= 5 represents a range including 1, 2, 3, 4, and 5.

The type of these ranges will either be range or range64, depending if x and y were 32-bit integers or 64-bit integers.

For loop over integers

For-loops support iterating over a range.

// Prints 1 to 10
for x in 1 ..= 10 {
    println(x);
}

Creating a slice

When you have a buffer of data, you can create a slice out of the data by subscripting it with something of type range. It could be a range literal, or any other value of type range.

buf: [1024] u8;
bytes_read := read_data(buf);

// Create a slice referencing the underlying buffer.
// (Nothing is copied in this operation.)
data_read := buf[0 .. bytes_read];

Pipe operator

The pipe (|>) operator is used as syntactic sugar when you want the result of one procedure call to be passed as the first argument to another call. This might sound contrived, but with a well-designed API it can happen often.

The pipe operator transforms the code as follows:

x |> f(y)    into     f(x, y)

As you can see, it simply takes the left-hand side and passes it as the first argument to the procedure call. The operator is left-associative, which simply means the parentheses are automatically inserted to allow for correct chaining of pipes.

Look at this simple API for doing (very simple) computations. On the first line of main, there is an example of using this API with nested function calls. On the second line, there is the equivalent code written using the pipe operator.

use core {println}

add :: (x: i32, y: i32) -> i32 {
    return x + y;
}

negate :: (x: i32) -> i32 {
    return -x;
}

double :: (x: i32) -> i32 {
    return x * 2;
}

main :: () {
    println(double(add(negate(-5), 4)));

    -5 |> negate() |> add(4) |> double() |> println();
}

Piping to other arguments

Sometimes the argument you want to pipe into is not in the first argument slot. When this happens, you can simply place a _ as the argument you want to pipe into. For example,

sub :: (x, y: i32) -> i32 {
    return x - y;
}

main :: () {
    5 |> sub(3) |> println();     // prints 2

    5 |> sub(3, _) |> println();  // prints -2
}

This is very useful when piping to printing/formatting functions that require the format string to be the first argument.

main :: () {
    // This example is a bit contrived, but imagine if '5' was
    // a long concatenation of pipes.
    5
    |> logf(.Info, "The value is {}", _);
}

Iterators and Pipe

The core.iter package in Onyx uses the pipe operator heavily. The package is designed in a way for the Iterator transformation functions to be easily chained.

For example, if you wanted to find the first 5 odd numbers greater than 100, you could write the following iterator.

my_numbers :=
    iter.as_iter(0 .. 100000)        // Convert the range to an iterator.
    |> iter.skip_while(x => x < 100) // Skip numbers less than 100.
    |> iter.filter(x => x % 2 == 1)  // Filter for only odd numbers.
    |> iter.take(5)                  // Only take the first 5 things.
    |> iter.collect();               // Collect the results as an array.

This is a contrived example, but it shows how composable the iter package is, thanks to the pipe operator.

Method call operator

Onyx aims to support multiple styles of programming. The Pipe Operator section describes how a functional style of programming can be achieved. This section will describe how an Object-Oriented style of programming can be done.

The key behind this is the -> operator, also called the "method call operator". It can be understood as a simple shorthand for the following.

foo.method(foo, 123)

This can instead be written as the following.

foo->method(123);

Much like the pipe operator, it makes the left-hand side of the operator the first argument to the function call. However, unlike the pipe operator, it also resolves the function from within the scope of the left-hand side. It also automatically takes the address of the left-hand side if the method expects a pointer as the first argument. These features together make for a good approximation of an inheritance-less OOP programming model.

Object-Oriented Programming

Onyx is not an object-oriented language. It is a data-oriented language, where you should think about the way your data is structured when solving problems.

That does not mean that all object-oriented language features are bad. Sometimes, it is easier to think about something as an "object" with "things" that it can do. When that is the case, Onyx can help.

With the method-call operator (described above), you can write methods on structures and unions.

use core

Foo :: struct {
    name: str;

    // When a function is declared inside of a structure,
    // it can be accessed under the struct's scope, i.e. `Foo.say_name`.
    say_name :: (f: Foo) {
        core.printf("My name is {}.\n", f.name);
    }
}

main :: () {
    foo := Foo.{ "Joe" };
    foo->say_name();
}

Other ways of writing main above would be like so:

main :: () {
    foo := Foo.{ "Joe" };

    foo.say_name(foo);    // Accessing on 'foo' will look to its type's
                          // scope, in this case 'Foo', since 'foo' does
                          // not have a member named 'say_name'.

    Foo.say_name(foo);    // Explicit version as you would see in many 
                          // other languages.
}

Sometimes you want to pass the "object" as a pointer to the method if the method is going to modify the object. As a convenience, the method call operator will do this automatically for you, if it is possible to take the address of the left-hand side. This may feel a little weird, but it is largely intuitive and similar to how many other languages work.

use core

Foo :: struct {
    name: str;

    say_name :: (f: Foo) {
        core.printf("My name is {}\n", f.name);
    }

    // Entirely redundant method, but illustrates passing by pointer.
    set_name :: (f: &Foo, name: str) {
        // f can be modified here because it is passed by pointer.
        f.name = name;
    }
}

main :: () {
    // Create a zero-initialized Foo.
    foo: Foo;

    // Call the set_name method
    foo->set_name("Jane");
    // Note that this is equivalent to the following (notice the &foo).
    // foo.set_name(&foo, "Jane")

    foo->say_name();
}

Virtual Tables

While Onyx does not natively support virtual tables, there is a pattern that can achieve this using used members on structures. Here is an example of the classic "Animals that can speak" inheritance argument.

Create a virtual table structure that will store the function pointers.

Animal_Vtable :: struct {
    // 'greet' is a member of the vtable, and takes a pointer
    // to the object (which this does not concern itself with),
    // as well as the name to greet.
    greet: (rawptr, name: str) -> void;
}

Then, create some implementations of the virtual table as global variables. Note, these could be scoped so they can only be used where you need them, but for this example they are accessible everywhere.

dog_vtable := Animal_Vtable.{
    greet = (d: &Dog, name: str) {
        printf("Woof {}!\n", name);
    }
}

cat_vtable := Animal_Vtable.{
    greet = (d: &Cat, name: str) {
        printf("Meow {}!\n", name);
    }
}

Finally, create the Dog and Cat structures, each with a used member of type Animal_Vtable. This will enable the animal->greet() syntax because greet is accessible as a member in Dog and Cat.

Dog :: struct {
    use vtable: Animal_Vtable = dog_vtable;
}

Cat :: struct {
    use vtable: Animal_Vtable = cat_vtable;
}

Now you can pass a pointer to Dog or a pointer to Cat to any procedure expecting a pointer to an Animal_Vtable, thanks to sub-type polymorphism.

say_greeting :: (animal: &Animal_Vtable, name: str) {
    animal->greet(name);
}

main :: () {
    dog := Dog.{};
    cat := Cat.{};

    say_greeting(&dog, "Joe");
    say_greeting(&cat, "Joe");
}

This is obviously more clunky than object-oriented programming in a language like Java or C++, but that's because Onyx is not an object-oriented language.

This pattern is used in a couple of places throughout the standard library, notably in the io.Stream implementation, which enables reading and writing using the same interface from anything that defines io.Stream_Vtable, including files, sockets, processes and string buffers.

Operator Precedence

Precedence   Operators
1            Assignment (+=, -=, ...)
2            Post-fix (.x, (), ?, [], ->x())
3            Pre-fix (-, !, ~~, ~, *, &)
4            ??
5            %
6            *, /
7            +, -
8            &, |, ^, <<, >>, >>>
9            <=, <, >=, >
10           ==, !=
11           &&, ||
12           |>, ..

Control Flow

Onyx has a standard set of simple control flow mechanisms: if, while, for, switch, and defer. Notably absent is goto, and this is by design.

If

if statements allow the programmer to optionally execute a block of code, if a condition is met.

if-statements in Onyx are written like this:

if condition {
	println("The condition was true!");
}

Notice that parentheses are not needed around the condition. One thing to note is that the syntax for an else-if chain uses the keyword elseif, not else if.

if x >= 100 {
	println("x is greater than 100");
} elseif x >= 10 {
	println("x is greater than 10");
} else {
	println("x is not special.");
}

Initializers

if-statements can also have an initializer, which is a statement that appears before the condition. They allow you to declare variables that are only available in the scope of the if-statement, or any of the else blocks.

can_error :: () -> (i32, bool) ---

if value, errored := can_error(); !errored {
	printf("The value was {}!\n", value);
}

// value is not visible here.

While loops

while-statements are very similar to if-statements, except when the bottom of the while-loop body is reached, the program re-tests the condition, and will loop if necessary.

while-statements have the same syntax as if-statements.

x := 10;
while x >= 0 {
	println(x);
	x -= 1;
}

while statements can also have initializers, meaning the above code could be rewritten as:

while x := 10; x >= 0 {
	println(x);
	x -= 1;
}

while statements can also have an else block after them. The else block is executed if the condition for the while loop was never true.

while false {
	println("Never printed.");
} else {
	println("This will print.");
}

Switch

switch-statements are used to simplify a chain of if-elseif statements. Switch statements look a little different in Onyx compared to, say, C. This is because case blocks are actual blocks, not just jump targets.

value := 10;

switch value {
	case 5 {
		println("The value was 5.");
	}

	case 10 do println("The value was 10.");

	case _ {
		println("The value was not recognized.");
	}
}

_ is used for the default case. The default case must be the last case listed.

It is also possible to match multiple values using a comma-separated list.

value := 10;

switch value {
    case 5, 10, 15 {
        println("The value was 5, 10, or 15.");
    }
}

fallthrough

case blocks in Onyx automatically exit the switch statement after the end of their body, meaning an ending break statement is not needed. If, however, you do want to fall through to the next case like in C, use the fallthrough keyword.

switch 5 {
	case 5 {
		println("The value was 5.");
		fallthrough;
	}

	case 10 {
		println("The value was (maybe) 10.");
	}
}

Ranges

switch statements also allow you to specify a range of values using .. or ..=.

switch 5 {
	case 5 ..= 10 {
		println("The value was between 5 and 10.");
	}
}

Custom Types

switch statements can operate on any type of value, provided that an operator overload for == has been defined.

Point :: struct {x, y: i32;}
#operator == (p1, p2: Point) => p1.x == p2.x && p1.y == p2.y;

switch Point.{10, 20} {
	case .{0,   0} do println("0, 0");
	case .{10, 20} do println("10, 20");
	case _         do println("None of the above.");
}

Tagged Unions

switch statements are very important when working with tagged unions. See the tagged union section for details.

Initializers

switch statements can also optionally have an initializer, like while and if statements.
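
For example, here is a minimal sketch, assuming the initializer uses the same syntax as it does for if and while statements (roll_dice is a hypothetical procedure):

roll_dice :: () -> i32 ---

switch roll := roll_dice(); roll {
    case 6 do println("You rolled a six!");
    case _ do println("Better luck next time.");
}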

Defer

defer-statements allow you to run a statement or block when the enclosing block is exited.

{
	println("1");
	defer println("3");
	println("2");
}

This example will print:

1
2
3

defer statements are pushed onto a stack. When the block exits, they are popped off the stack in reverse order.

{
	defer println("3");
	defer println("2");
	defer println("1");
}

This example will also print:

1
2
3

Create/Destroy Pattern

defer statements enable the following "create/destroy" pattern.

thing := create_something();
defer destroy_something(thing);

Because deferred statements run in any case that execution leaves a block, they safely guarantee that the resource will be destroyed. Also, because defer statements are stacked, they guarantee destroying resources happens in the correct order.

outer_thing := create_outer_thing();
defer destroy_outer_thing(outer_thing);

inner_thing := create_inner_thing(outer_thing);
defer destroy_inner_thing(inner_thing);

For loops

for loops are the most powerful control flow mechanism in Onyx. They enable:

  • Iteration shorthand
  • Custom iterators
  • Removing elements
  • Scoped resources

Range-Based Loop

Here is a basic for loop in Onyx. It will iterate from 1 to 9, as the upper bound is not included.

for i in 1 .. 10 {
    println(i);
}

This for loop is iterating over a range. Ranges represent half-open sets, so the lower bound is included, but the upper bound is not.

Array-Based Loop

for loops can also iterate over array-like types: [N] T, [] T, [..] T. Place a & before the iteration variable to iterate over the array by pointer.

primes: [5] i32 = .[ 2, 3, 5, 7, 11 ];
for value in primes {
    println(value);
}

// This modifies the array so each element
// is double what it was.
for &value in primes {
    // value is a &i32.
    *value *= 2;
}

it

Naming the iteration value is optional. If left out, the iteration value will be called it.

for i32.[2, 3, 5, 7, 11] {
    println(it);
}

Indexed-loops

for loops can optionally have a second iteration value called the index. This index starts at 0, and increments by 1 every iteration. Its default type is i32, but this can be changed.

// Use i32 as type of index
for value, index in i32.[2, 3, 5, 7, 11] {
    printf("{}: {}", index, value);
}

// Explicitly change the type to i64
for value, index: i64 in i32.[2, 3, 5, 7, 11] {
    printf("{}: {}", index, value);
}

Custom Iterator Loops

The final type that for loops can iterate over is Iterator(T). Iterator is a built-in type that represents a generic iterator. An Iterator has 4 elements:

  • data - a pointer to the context for the iterator.
  • next - a function to retrieve the next value out of the iterator.
  • remove - an optional function to remove the current element.
  • close - an optional function to clean up the iterator's context.

The core.iter package provides many utilities for working with iterators.

Here is a basic example of creating an iterator from a range, then using iter.map to double the values. Iterators are lazily evaluated, so none of the actual doubling happens until values are pulled out of the iterator by the for loop.

doubled_iterator := iter.as_iter(1 .. 5)
                 |> iter.map(x => x * 2);

for doubled_iterator {
    println(it);
}

The above for loop loosely translates to the following code.

doubled_iterator := iter.as_iter(1 .. 5)
                 |> iter.map(x => x * 2);

{
    defer doubled_iterator.close(doubled_iterator.data);

    while true {
        it, cont := doubled_iterator.next(doubled_iterator.data);
        if !cont do break;    

        println(it);
    }
}

#no_close

The close function of an Iterator is always called after the loop exits. If this is not the desired behavior, you can add #no_close after for to forego inserting the close call.

doubled_iterator := iter.as_iter(1 .. 5)
                 |> iter.map(x => x * 2);
for #no_close doubled_iterator {
    println(it);
}

// Later:
iter.close(doubled_iterator);

#remove

The final feature of Iterator-based for loops is the #remove directive. If the current Iterator supports it, you can write #remove to remove the current element from the iterator.

// Make a dynamic array from a fixed-size array.
arr := Array.make(u32.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);

// as_iter with a pointer to a dynamic array 
// supports #remove.
for iter.as_iter(&arr) {
    if it % 2 == 0 {
        // Remove all even numbers
        #remove;
    }
}

// Will only print odd numbers
for arr do println(it);

#first

Many times while writing for loops, it is nice to know if this iteration is one of two special values: the first and last iteration. As a convenience, Onyx provides the #first directive in all of its for loops. It is a bool value that is true during the first iteration, and then false afterwards. Note, Onyx does not provide an equivalent #last directive, because in Iterator-based loops, it is impossible to know when the last iteration will happen.

One example of where this is useful is in a formatted printer. Consider this code that prints the elements of an array.

arr := i32.[ 2, 3, 5, 7, 11 ];

for arr {
    if !#first do print(", ");
    print(it);
}

This example will print:

2, 3, 5, 7, 11

Explicitly-typed loop variable

You can optionally provide an explicit type for the loop variable, if you feel it improves code readability. It does not carry any extra semantic meaning, but solely exists for the next reader of the code. If you provide the incorrect type, it is a compile error.

strings := str.["A", "list", "of", "strings"];

// Explicitly say that value is a str
for value: str in strings {
    println(value)
}

Branching

The following keywords can be used to branch from a block of code early.

break

break can be used to jump execution to after the body of the enclosing loop.

// Prints 0 to 5.
for 0 .. 10 {
	println(it);
	if it == 5 do break;
}

continue

continue can be used to jump execution to the condition of the enclosing loop.

// Prints 5 to 9.
for 0 .. 10 {
	if it < 5 do continue;
	println(it);
}

fallthrough

fallthrough is discussed in the switch statement section.

return

return is used to end execution of the current procedure. It is also used to provide return values.

fact :: (n: i32) -> i32 {
	if n <= 1 do return 1;   // Early return
	return fact(n - 1) * n;  // Providing result
}

Do blocks

Do blocks allow you to encapsulate statements into smaller chunks of code whose sole purpose is to evaluate to a value.

Do blocks are expressions, and therefore must return a value. This is done using the return keyword.

main :: () {
    x := 10

    y := do {
        if x > 5 do return "X is greater than 5"
        return "X is less than or equal to 5"
    }

    println(y)
}

Explicit typing

If necessary, you can provide an explicit type to the do block using -> type after the do keyword.

Color :: enum { Red; Green; Blue; }

main :: () {
    col := do -> Color {
        return .Green
    }

    println(col)
}

Internal details

Do blocks were actually a feature that came for "free" when macros were implemented. Every expression macro simply turns into a do block at the call site.

add :: macro (x, y: i32) -> i32 {
    return x + y
}

main :: () {
    x1 := add(3, 4)

    // The above simply desugars into this:
    x2 := do -> i32 {
        return 3 + 4
    }
}

Used Locals

This is an experimental feature that may not work perfectly, and may go away in the future if there is push back or major problems.

A very common pattern in Onyx is to create/allocate a resource, then defer the release/free of the resource on the next line. Take this for an example.

main :: () {
    // Make a dynamic array.
    my_arr := make([..] i32);

    // Delete the array at the end of main.
    defer delete(&my_arr);
}

Whether using dynamic arrays, io.Readers, or os.Files, this pattern is all over the place.

To make this a little easier to type, and to allow the author of the type to define how to clean up the resource, you can simply place the use keyword in front of the variable definition. This will automatically insert a deferred call to the builtin procedure __dispose_used_local. This procedure can be overloaded to define how to dispose of any resource. It already contains an overload that calls delete, which means anything you can call delete on already works with use.

This is the same example as before, but with use instead.

main :: () {
    // Implicitly delete the array at the end of scope.
    use my_arr := make([..] i32);


    // The above code is equivalent to this:
    my_arr := make([..] i32);
    defer __dispose_used_local(&my_arr);
}
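
Because __dispose_used_local is an overloaded procedure, disposal for your own types can plausibly be defined with the #overload syntax described later in the Overloaded procedures section. Here is a hedged sketch; My_Resource and release_resource are hypothetical names:

My_Resource :: struct { handle: i32 }

// Hypothetical procedure that releases the resource.
release_resource :: (r: &My_Resource) {
    // ...
}

#overload
__dispose_used_local :: (r: &My_Resource) {
    release_resource(r);
}

main :: () {
    // release_resource is called automatically at the end of main,
    // through the overload above.
    use res := My_Resource.{ handle = 1 };
}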

Procedures

Procedures allow the programmer to encapsulate behavior inside a reusable form. Other languages call them functions, subroutines or methods. "Procedures" is a super-set of all of those terms.

Syntax

Procedures in Onyx are written simply as: (parameters) -> return_type { body }.

Here is a simple procedure that simply prints Hello!.

say_hello :: () -> void {
    println("Hello!");
}

To explain the different parts of the syntax, here is a broken down version, line by line.

say_hello     // This is the symbol name that the procedure will be bound to.
    ::        // This is the 'bind' operator, as discussed in Chapter 2.5.
    ()        // This is the start of the procedure; an empty list of parameters.
    -> void   // This is the return type, specified using a `->`.
{
    // This is the procedure's body.
    println("Hello!");
}

Anonymous Procedures

Procedures do not have to be named, and can simply exist as expressions. Here, say_hello is assigned at runtime to be an anonymous procedure.

procedure_as_an_expression :: () -> void {

    // Assign the procedure to a local variable
    say_hello := () -> void {
        println("Hello!");
    };

    say_hello();
}

Optional Return Type

If the procedure returns void (i.e. returns nothing), the return type can be completely removed.

say_hello :: () {
    println("Hello, no void!");
}

Parameters

Procedures can take 0 or more parameters. All parameters are passed by value. Parameters that are passed by pointer copy the pointer value, not the data where the pointer is pointing.

Syntax

Procedure parameters are given a name, followed by a :, followed by the type of that parameter. A comma (,) is used to delimit the different parameters.

print_add :: (x: i32, y: i32) {
    printf("{} + {} = {}\n", x, y, x + y);
}

compute_add :: (out: &i32, x: i32, y: i32) {
    *out = x + y;
}
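
To make the pass-by-value rule concrete, here is a small sketch of calling the two procedures above: the integers are copied into the parameters, and compute_add receives a copy of the pointer that still refers to the caller's variable.

main :: () {
    result: i32;

    // 3 and 4 are copied into x and y.
    print_add(3, 4);

    // The pointer value is copied, but both copies point at 'result',
    // so the write inside compute_add is visible here.
    compute_add(&result, 3, 4);
    println(result);    // Prints 7
}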

As a convenience, if two or more parameters have the same type, they can be written using the type only once. In this example, because x and y are the same type, the : i32 is not needed after x.

print_add :: (x, y: i32) {
    // ...
}

Default values

Parameters can have default values. The default value is computed on the caller's side. This means default values are not part of the procedure's type; they are only a convenience provided by a given procedure.

print_msg_n_times :: (n: i32, msg: str = "Hello, World!") {
    for n do println(msg);
}

print_msg_n_times(10);

The type of a defaulted parameter can be omitted if the type of the expression is known.

// Because "Hello, World!" is known to be of type 'str',
// the type of msg can be omitted.
print_msg_n_times :: (n: i32, msg := "Hello, World!") {
    for n do println(msg);
}

print_msg_n_times(10);

Return values

Procedures can return 0 or more values. Return types are specified after procedure arguments using an ->. If multiple return values are desired, the return types have to be enclosed in parentheses. The return keyword is used to specify returned values.

// A single integer return value.
add :: (x, y: i32) -> i32 {
    return x + y;
}

// Returning 2 integers.
swap :: (x, y: i32) -> (i32, i32) {
    return y, x;
}

z := add(2, 3);

a, b := 10, 20;
a, b = swap(a, b);

Note, returned values are passed by value.

Automatic-return type

Sometimes, the exact type of the returned value is cumbersome to write out. In this case, #auto can be provided as the return type. The compiler automatically determines the return type from the first return statement in the procedure.

// #auto would automatically be determined to be:
//   Iterator(i32), bool, str
weird_return_type :: (x: i32) -> #auto {
    return iter.as_iter(1 .. 5) , false, "Hello, World!";
}

In some cases in Onyx, it is actually impossible to write the return type. #auto can be used in this case, and the compiler will figure out what type needs to be there. Look at this example from the standard library.

iter.prod :: (x: $I/Iterable, y: Iterator($Y)) -> #auto { ... }

iter.prod returns an iterator of pairs of the values yielded from the left and right iterators. There is no way to write the return type, because you cannot spell the iterator type produced from x; x is only known to be Iterable, meaning as_iter can be called on it. Think about it: what could you write in place of the ??? in Iterator(Pair(???, Y)) to make it correct?

Calling procedures

Calling any procedure-like thing in Onyx uses the traditional () post-fix operator, with arguments in between. Arguments are separated by commas. Arguments can also be named. Once arguments start being named, all subsequent arguments must be named.

magnitude :: (x, y, z: f32) -> f32 {
    return math.sqrt(x*x + y*y + z*z);
}

// Implicit naming
println(magnitude(10, 20, 30));

// Explicit naming
println(magnitude(10, y=20, z=30));

// Explicit naming, in different order
println(magnitude(z=30, y=20, x=10));

Variadic procedures

Variadic procedures allow a procedure to take an arbitrary number of arguments. This function takes any number of integers and returns their sum. The ..i32 type behaves exactly like a slice of i32 ([] i32).

sum :: (ints: ..i32) -> i32 {
    result := 0;
    for ints {
        result += it;
    }

    return result;
}

println(sum(1, 2, 3, 4, 5));

Variadic procedures can also use the special type any, to represent heterogeneous values being passed to the function. This function prints the type of each of the values given.

print_types :: (arr: ..any) {
    for arr {
        println(it.type);
    }
}

print_types("Hello", 123, print_types);

This example outputs:

[] u8
i32
(..any) -> void

Note, read more about any in the Any section.

Using Runtime Type Information, functions can introspect the values given and perform arbitrary operations. For example, conv.format uses type information to print anything of any type in the program.

// printf uses conv.format for formatting.
printf("{} {} {}\n", "Hello", 123, context);

Polymorphic procedures

Polymorphic procedures allow the programmer to express type-generic code, code that does not care what type is being used. This is by far the most powerful feature in Onyx.

Polymorphic procedures use polymorphic variables. A polymorphic variable is declared using a $ in front of the name. When calling a polymorphic procedure, the compiler will try to solve for all of the polymorphic variables. Then, it will construct a specialized version of the procedure with the polymorphic variables substituted with their corresponding value.

Here is an example of a polymorphic procedure that compares two elements.

min :: (x: $T, y: T) -> T {
    if x < y do return x;
    else     do return y;
}

x := min(10, 20);
y := min(40.0, 30.0);

// Errors
// z := min("Hello", "World");

$T declares T as a polymorphic variable. When min is called with two i32s, the compiler solves for T, finding it to be i32. Then a specialized version of min is constructed that operates on i32s. A very similar thing happens for the second call, except in that case T is f64. Note that an error will occur if min is called with a type that does not define the < operator.

Polymorphic variables can occur deeply nested in a type. The compiler employs pattern matching to solve for the polymorphic variable.

foo :: (x: &[] Iterator($T)) {
    // ...
}

val: &[] Iterator(str);
foo(val);

Here is a simple pattern matching process that the compiler goes through to determine the type of $T.

Variable Type    | Given Type
-----------------+-------------------
&[] Iterator($T) | &[] Iterator(str)
[] Iterator($T)  | [] Iterator(str)
Iterator($T)     | Iterator(str)
$T               | str

If at any point the types do not match, an error is given.

Parameters can also be polymorphic variables. If a $ is placed in front of a parameter, it becomes a compile-time "constant". A specialized version of the procedure is made for each value given.

add_constant :: ($N: i32, v: i32) -> i32 {
    // N is a compile-time known integer here.
    // It is equivalent to writing '5'.
    return N + v;
}

println(add_constant(5, 10));

Types can be passed as constants through polymorphic variables. Consider this example.

make_cubes :: ($T: type_expr) -> [10] T {
    arr: [10] T;
    for 0 .. 10 {
        arr[it] = cast(T) (it * it * it);
    }
    return arr;
}

arr := make_cubes(f32);

Because T is a constant, it can be used in the type of arr, as well as in the return type.

Quick procedures

With polymorphic variables and #auto, it is possible to write a completely type-generic procedure in Onyx.

print_iterator :: (msg: $T, iterable: $I) -> #auto {
    println(msg);
    for iterable {
        println(it);
    }
    return 1234;
}

print_iterator("List:", u32.[ 1, 2, 3, 4 ]);
print_iterator(8675309, 5 .. 10);

No types are given in the procedure above. msg and iterable can be any type, provided that iterable can be iterated over using a for loop. This kind of procedure, one with no type information, is given a special shorthand syntax.

print_iterator :: (msg, iterable) => {
    println(msg);
    for iterable {
        println(it);    
    }
    return 1234;
}

print_iterator("List:", u32.[ 1, 2, 3, 4 ]);
print_iterator(8675309, 5 .. 10);

Here the => signifies that this is a quick procedure. The types of the parameters are left out, and can take on whatever value is provided. Programming using quick procedures feels more like programming in JavaScript or Python, so don't abuse them. They are very useful when passing procedures to other procedures.

map :: (x: $T, f: (T) -> T) -> T {
    return f(x);
}

// Note that the parentheses are optional if
// there is only one parameter.
y := map(5, value => value + 4);
println(y);

You can also have a mix between quick procedures and normal procedures. This example shows an alternative way of writing -> #auto.

// The => could be -> #auto, or -> T.
find_smallest :: (items: [] $T) => {
    small := items[0];
    for items {
        if it < small do small = it;
    }
    return small;
}

println(find_smallest(u32.[6,2,5,1,10]));

Closures

Onyx has experimental support for closures. Currently, this is in the form of explicit closures, where every captured variable has to be declared before it can be used. This restriction will likely be lifted in the future when other internal details are figured out.

To declare a closure, simply add use (...) to the procedure definition, between the arguments and the return type.

main :: () {
    x := 10
    
    // Here, x is captured by value, and a copy is made for
    // this quick procedure.
    f := (y: i32) use (x) -> i32 {
        return y + x
    }

    f(20) |> println() // Prints 30
}

Values can be captured either by value or by pointer. To capture by pointer, simply place a & in front of the variable name.

main :: () {
    x := 10
    
    // Here, x is captured by pointer
    f := (y) use (&x) => {
        *x = 20
        return y + *x
    }

    f(20) |> println() // Prints 40
    println(x)         // Prints 20
}

Currying

A form of function currying is possible in Onyx using chained quick procedures, and passing previous arguments to each subsequent quick procedure.

add :: (x: i32) => (y: i32) use (x) => (z: i32) use (x, y) => {
    return x + y + z
}

main :: () {
    partial_sum := add(1)(2)

    sum1 := partial_sum(3)
    sum2 := partial_sum(10)

    println(sum1)
    println(sum2)
}

Internal details of Closures

Every time a closure is encountered at runtime, a memory allocation must be made to accommodate the memory needed to store the captured values. To do this, a builtin procedure called __closure_block_allocate is called. This procedure is implemented by default to invoke context.closure_allocate. By default, context.closure_allocate allocates a buffer from the temporary allocator. If you want to change how closures are allocated, you can change this procedure pointer to do something different.

main :: () {
    context.closure_allocate = (size: i32) -> rawptr {
        printf("Allocating {} bytes for closure.\n", size)
        
        // Allocate out of the heap
        return context.allocator->alloc(size)
    } 

    x := 10
    f := (y: i32) use (x) => {
        return y + x
    }

    f(20) |> println()  // Prints 30 
}

Overloaded procedures

Overloaded procedures allow a procedure to have multiple implementations, depending on what arguments are provided. Onyx uses explicitly overloaded procedures, as opposed to implicitly overloaded procedures. All overloads for the procedure are listed between the {} of the #match expression, and are separated by commas.

to_int :: #match {
    (x: i32) -> i32 { return x; },
    (x: str) -> i32 { return cast(i32) conv.str_to_i64(x); },
    (x: f32) -> i32 { return cast(i32) x; },
}

println(to_int(5));
println(to_int("123"));
println(to_int(12.34));

The order of procedures does matter. When trying to find the procedure that matches the arguments given, the compiler tries each procedure in a specific order. By default, this order is the lexical order of the procedures listed in the #match body. This order can be changed using the #order directive.

options :: #match {
    #order 10 (x: i32) { println("Option 1"); },
    #order 1  (x: i32) { println("Option 2"); },
}

// Option 2 is going to be called, because it has a smaller order.
options(1);

Lower order values are given higher priority, as they are tried first.

Overloaded procedures as described would not be very useful, as all of the procedures would have to be known when writing the overload list. To fix this issue, Onyx has a second way of using #match to add overload options to an already existing overloaded procedure.

options :: #match {
    (x: i32) { println("Int case."); }
}

// #match can be used as a directive to specify a new overload option for
// an overloaded procedure. Directly after #match is the overloaded procedure,
// followed by the new overload option.
#match options (x: f32) { println("Float case."); }
#match options (x: str) { println("String case."); }

// As an alternative syntax that might become the default for Onyx,
// #overload can be used with a '::' between the overloaded procedure
// and the overload option. This is preferred because it looks more
// like writing a normal procedure, but with `#overload` as a "tag".
#overload
options :: (x: cstr) { println("C-String case."); }

// An order can also be specified like so.
#match options #order 10 (x: i32) { println("Other int case."); }

Sometimes, the ability to add new overload options should be disabled to prevent undesired behavior. For this Onyx has two directives that can be added after #match to change when procedures can be added.

  • #locked - This prevents adding overload options. The only options available are the ones between the curly braces.
  • #local - This allows options to be added, but only within the same file. This can be used to clean up code that would otherwise be over-indented.

Here is an example of using #match #local.

length :: #match #local {}

#overload
length :: (x: u32) => 4

#overload
length :: (x: str) => x.count
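
For comparison, here is a sketch of a #match #locked procedure; the two options listed are the only ones that will ever exist, and a later #overload for it would be an error (describe is a hypothetical procedure):

describe :: #match #locked {
    (x: i32) => "an integer",
    (x: str) => "a string",
}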

Overloaded procedures provide the backbone for type-generic "traits" in Onyx. Instead of making a type/object-oriented system (e.g. Rust's traits), Onyx uses overloaded procedures to provide type-specific functionality for operations such as hashing. Multiple data structures in the core package need to hash a type to a 32-bit integer. Map and Set are two examples. To provide this functionality, Onyx uses an overloaded procedure called hash in the core.hash package. This example shows how to define how a Point structure can be hashed into a u32.

Point :: struct {x, y: i32}

#overload
core.hash.hash :: (p: Point) => cast(u32) (p.x ^ p.y);

Interfaces and where

Interfaces allow for type constraints to be placed on polymorphic procedures. Without them, polymorphic procedures have no way of specifying which types are allowed for their polymorphic variables. Interfaces are best explained through example, so consider the following.

CanAdd :: interface (T: type_expr) {
    t as T;

    { t + t } -> T;
}

T is a type, and t is a value of type T. The body of the interface specifies that two values of type T can be added together, and that the result is of type T. Any expression can go inside of the curly braces, and it will be type checked against the type after the arrow. This interface can be used to constrain which types are allowed in a polymorphic procedure using a where clause.

CanAdd :: interface (T: type_expr) {
    t as T;

    { t + t } -> T;
}

sum_array :: (arr: [] $T) -> T where CanAdd(T) {
    result: T;
    for arr do result += it;
    return result;
}

// This is allowed
sum_array(f32.[ 2, 3, 5, 7, 11 ]);

// This is not, because '+' is not defined for 'str'.
sum_array(str.[ "this", "is", "a", "test" ]);

The second call to sum_array would generate an error anyway when it type checks the specialized procedure with T=str. However, this provides a better error message and upfront clarity to someone calling the function.

Interface constraints can also take on a more basic form, where the expected type is omitted. In this case, the compiler is only checking if there are no errors in the provided expression.

// This does not check if t + t is of type T.
CanAdd :: interface (T: type_expr) {
    t as T;

    t + t;
}

Interfaces can be used in conjunction with #match blocks to perform powerful compile-time switching over procedures. Consider the following extension to the previous example.

CanAdd :: interface (T: type_expr) {
    t as T;

    { t + t } -> T;
}

sum_array :: #match {
    (arr: [] $T) -> T where CanAdd(T) {
        result: T;
        for arr do result += it;
        return result;
    },

    (arr: [] $T) -> T {
        printf("Cannot add {}.", T);

        result: T;
        return result;
    }
}

// This is allowed
sum_array(f32.[ 2, 3, 5, 7, 11 ]);

// This is now allowed, but will print an error.
sum_array(str.[ "this", "is", "a", "test" ]);

First the compiler will check if T is something that can be added, and if it can, the first procedure will be called. Otherwise the second procedure will be called.

Where expressions

Compile-time constant expressions can be used alongside interfaces in where clauses. This gives the programmer even more control over the conditions their function will be called under. For example, ensuring the length of an array fits within a specific range allows us to optimize our code.

sum_array :: #match {
    (arr: [] $T) -> T where CanAdd(T) {
        result: T;
        for arr do result += it;
        return result;
    },

    // This will only be called with fixed-size arrays whose length is between 1 and 4.
    (arr: [$N] $T) -> T where CanAdd(T), N >= 1, N <= 4 {
        // An imaginary macro that duplicates the body N times
        // to avoid the cost of loops.
        return unroll_loop(arr, N, [a, b](a += b));
    },

    (arr: [] $T) -> T {
        printf("Cannot add {}.", T);

        result: T;
        return result;
    }
}

// This will call the [] $T version of sum_array
sum_array(f32.[ 2, 3, 5, 7, 11 ]);

// This will call the [$N] $T version of sum_array
sum_array(f32.[ 1, 2, 3, 4 ]);

// This will also call the [$N] $T version of sum_array
sum_array(f32.[ 1, 2, 3 ]);

Operator overloading

Onyx's operator overloading syntax is very similar to its #match syntax, except #operator is used, followed by the operator to overload. For example, this defines the + operator for str.

#operator + (s1, s2: str) -> str {
    return string.concat(s1, s2);
}

The following operators can be overloaded:

Arithmetic:  +  -   *   / %
Comparison:  == !=  <   <= > >=
Bitwise:     &  |   ^   << >> >>>
Logic:       && ||
Assignment:  += -=  *= /= %=
             &= |=  <<= >>= >>>=
Subscript:   [] []= &[]

Most of these are self-explanatory.

Macros

Macros in Onyx are very much like procedures, with a couple notable differences. When a macro is called, it is expanded at the call site, as though its body was copy and pasted there. This means that macros can access variables in the scope of their caller.

print_x :: macro () {
    // 'x' is not defined in this scope, but it can be used
    // from the enclosing scope.
    println(x);
}

{
    x := 1234;
    print_x();
}

{
    x := "Hello from a macro!";
    print_x();
}

Because macros are inlined at the call site and break traditional scoping rules, they cannot be used as a runtime known value.

There are two kinds of macros: block macros, and expression macros. The distinguishing factor between them is the return type. If a macro returns void, it is a block macro. If it returns anything else, it is an expression macro.

Block and expression macros behave differently with respect to some of the language features. Expression macros behave exactly like an inlined procedure call with dynamic scoping.

add_x_and_y :: macro (x: $T) -> T {
    defer println("Deferred print statement.");
    return x + y;
}

{
    y := 20.0f;
    z := add_x_and_y(30.0f);
    printf("Z: {}\n", z);
}

// This prints:
// Deferred print statement.
// Z: 50.0000

This example shows that defer statements are cleared before the expression macro returns. Also, the return statement is used to return from the macro with a value.

Block macros behave a little differently. defer statements are not cleared, and return statements are used to return from the caller's procedure.

early_return :: macro () {
    return 10;
}

defer_a_statement :: macro () {
    defer println("Deferred a statement.");
}

foo :: () -> i32 {
    defer_a_statement();
    println("About to return.");
    early_return();
    println("Never printed.");
}

// foo() will print:
// About to return.
// Deferred a statement.

In foo, the call to defer_a_statement adds the deferred statement to foo. Then the first println is run. Then the early_return macro returns the value 10 from foo. Finally, the deferred print statement is run.

This distinction between block and expression macros allows for an automatic destruction pattern.

// These are not the actual procedures to use mutexes.
grab_mutex :: macro (mutex: Mutex) {
    mutex_lock(mutex);
    defer mutex_unlock(mutex);
}

critical_procedure :: () {
    grab_mutex(a_mutex);
}

grab_mutex will automatically release the mutex at the end of critical_procedure. This pattern of creating a resource, and then freeing it automatically using defer is very common.

Code Blocks

To make macros even more powerful, Onyx provides compile-time code blocks. Code blocks capture code and treat it as a compile-time object that can be passed around. Use [] {} to create a code block. Use #unquote to "paste" a code block.

say_hello :: [] {
    println("Hello!");
}

#unquote say_hello;

Code blocks are not type checked until they are unquoted, so they can contain references to variables not declared within them.

Code blocks have this syntax because they can optionally take parameters between the []. When unquoting a code block with parameters, you must pass an equal or greater number of arguments in parentheses after the code block's name.

do_something :: ($do_: Code) {
    #unquote do_(1, 2);
    #unquote do_(2, 6);
}

do_something([a, b] {
    println(a + b);
});

Code blocks can be passed to procedures as compile-time values of type Code.

triple :: ($body: Code) {
    #unquote body;
    #unquote body;
    #unquote body;
}

triple([] {
    println("Hello!");
});

Code blocks can be passed to macros without being polymorphic variables, because all parameters to macros are compile-time known.

triple_macro :: macro (body: Code) {
    #unquote body;
    #unquote body;
    #unquote body;
}

triple_macro([] {
    println("Hello!");
});

A single statement/expression in a code block can be expressed as: [](expr)

[](println("Hello"))
// Is almost the same as
[] { println("Hello"); }

The practical difference between []() and [] {} is that the latter produces a block of code with a void return type, while the former takes on the type of the expression inside it. The Array and Slice structures use this feature to create a "lambda/capture-like" syntax for their procedures.

find_largest :: (x: [] $T) -> T {
    return Slice.fold(x, 0, [x, acc](x if x > acc else acc));
}

A code block can also be passed to a macro or procedure simply by placing a block immediately after a function call. This only works if the function call is a statement.

skip :: (arr: [] $T, $body: Code) {
    for n in 0 .. arr.count {
        if n % 2 == 1 do continue;
        it := arr[n];
        #unquote body;
    }
}

// This prints: 2, 5, 11
skip(.[2, 3, 5, 7, 11, 13]) {
    println(it);
}

Types

Primitives

Onyx contains the following primitive types.

void         // Empty, 0-size type

bool         // Booleans

u8  u16      // Unsigned integers: 8, 16, 32, and 64 bit.
u32 u64 

i8  i16      // Signed integers: 8, 16, 32, and 64 bit.
i32 i64

f32 f64      // Floating point numbers: 32 and 64 bit.

rawptr       // Pointer to an unknown type.

type_expr    // The type of a type.

any          // Used to represent any value in the language.

str          // A slice of bytes ([] u8)
cstr         // A pointer to bytes (&u8) with a null-terminator.
dyn_str      // A dynamic string ([..] u8)

range        // Represents a start, end, and step.

v128         // SIMD types.
i8x16 i16x8
i32x4 i64x2
f32x4 f64x2

Pointers

Pointers contain an address of a value of the given type. A &T is a pointer to value of type T. If a pointer is not pointing to anything, its value is null.

Use the & operator to take the address of a value. Note the consistency between the type and operation used to create a pointer.

x: i32  = 10;
p: &i32 = &x;

Use the * operator to retrieve the value out of a pointer. This is not a safe operation, so faults can occur if the pointer is pointing to invalid memory.

x := 10;
p := &x;

printf("*p is {}.\n", *p);

Multi-pointers

Normal pointers in Onyx do not support pointer addition or subscripting (i.e. x[i]). To do this, a multi-pointer must be used.

Multi-pointers are written as [&] T. They implicitly convert to-and-from normal pointer types, so they do not add much to the safety of a program, but they do allow intent to be expressed when using pointers. Consider these two procedures; there is a clear difference between how the pointers are going to be used.

proc_1 :: (out: &i32) {
    *out = 10;
}

proc_2 :: (out: [&] i32) {
    for 10 {
        out[it] = it;
    }
}

Note, pointer addition and subtraction on [&] T step by sizeof(T).

So, cast([&] i32, 0) + 1 == 4.

Pointers vs Multi-Pointers

& T    | [&] T
-------+--------
*t     | t[i]
t.foo  | t + x
==, != | ==, !=

Fixed-size Arrays

Fixed-size arrays store a fixed number of values of any type. A [N] T array holds N values of type T. The [] operator can be used to access elements of the array.

arr: [5] u32;
arr[0] = 1;
arr[1] = 2;
arr[2] = 3;
arr[3] = 4;
arr[4] = 5;

Fixed-size arrays are passed by pointer to procedures. However, the = operator copies the contents of the array to the destination.

mutate_array :: (arr: [5] u32) {
	arr[3] = 1234;
}

arr: [5] u32;
mutate_array(arr);
println(arr[3]);     // Prints 1234

arr2: [5] u32 = arr; // This is an element by element copy.
arr2[3] = 5678;      // This does not modify the original array.
println(arr[3]);     // So this also prints 1234

Fixed-size arrays can be constructed using an array literal. Array literals have the form type.[elements]. The type is optional if the type of the elements can be automatically inferred.

arr := u32.[1, 2, 3, 4];
assert((typeof arr) == [4] u32, "type does not match");

floats := .[5.0f, 6.0f, 7.0f];
assert((typeof floats) == [3] f32, "type does not match");

Array Programming

Fixed-size arrays have builtin array programming support. This allows the +, -, *, / operators to be used with them.

Vector3 :: [3] i32; // A simple three-component vector

a: Vector3 = .[ 1, 2, 3 ];
b: Vector3 = .[ 1, 1, 1 ];

c := a + b; // [ 2, 3, 4 ];
c *= 2;     // [ 4, 6, 8 ];

Builtin Fields

Fixed-size arrays can also be accessed with builtin fields if their length is <= 4. The fields are x, y, z, w or r, g, b, a. Array field access is equivalent to regular indexing and does not affect an array's memory layout.

Color :: [4] f32; // A simple RGBA color

red:   Color = .[ 1, 0, 0, 0 ];
green: Color = .[ 0, 1, 0, 0 ];
blue:  Color = .[ 0, 0, 1, 0 ];

full_opacity: Color = .[ 0, 0, 0, 1 ];

fuchsia := red + full_opacity; // [ 1, 0, 0, 1 ]
fuchsia.b = 1; // Equivalent to fuchsia[2] = 1

teal := green + blue; // [ 0, 1, 1, 0 ]
teal.a = 1;

white := red + green + blue;
white.a = 1;

Slices

Slices are arrays with a runtime-known size. A slice [] T is equivalent to the following structure.

[] T == struct {
	data: &T;
	count: u32;
}

Slices are the most common array-like type used in practice. Slices do not hold their contents directly, but rather refer to them through a pointer.

Slices can be used to represent a sub-array. A slice can be created using the [] operator on an array-like type, but providing a range instead of an integer. Note that the range is half-open, meaning the upper bound is not included.

arr := u32.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

slice: [] u32 = arr[4 .. 7];
for slice {
	println(it);  // Prints 5, 6, 7
}

All array-like types implicitly cast to a slice. The following function works on fixed-size arrays, slices, and dynamic arrays.

product :: (elems: [] $T) -> T {
	result := 1;
	for elems do result *= it;
	return result;
}

data := .[1, 2, 3, 4];
println(product(data));
println(product(data[2 .. 4]));
println(product(Array.make(data)));

Dynamic Arrays

Dynamic arrays have a variable size that can be changed after they are created. A [..] T is a dynamic array of T. Functionality for dynamic arrays is provided in the Onyx standard library in the Array structure, which allows for using methods on dynamic arrays.

use core {println}

arr: [..] i32;
Array.init(&arr);
defer Array.free(&arr);

for 0 .. 10 {
    // Explicitly using Array structure
	Array.push(&arr, it);

    // OR

    // Implicitly using method calls
    arr->push(it)
}

for arr {
	println(it);
}

See the Array structure for a full list of functions provided.

Dynamic arrays store an Allocator to know how to request more memory for the array. By default context.allocator is used. However, an alternate allocator can be specified in Array.make or Array.init.

Because dynamic arrays are so common and useful, Onyx provides some operator overloads for dynamic arrays. The most useful is <<, which is used to append elements.

// Same example as above.
use core {println}

// Dynamic arrays are safely automatically allocated
// on the first push, so there is no need to explicitly
// allocate it if you are using context.allocator.
arr: [..] i32;
defer arr->free();

for 0 .. 10 {
	// No need to take the address 'arr'.
	arr << it;
}

for arr {
	println(it);
}

Structures

Structures are the record type in Onyx. A structure is declared using the struct keyword and is normally bound to a symbol. Members of a structure are declared like declarations in a procedure.

Point :: struct {
	x: i32;
	y: i32;
}

Accessing Members

Member access is done through the . operator. Note that accessing a member on a pointer to a structure uses the same . syntax.

p: Point;
p.x = 10;
p.y = 20;

ptr := &p;
ptr.x = 30;

Structure Literals

Structure literals are a quicker way of creating a value of a struct type. They have the form Type.{ members }. The members can be partially or completely named. The same rules apply for naming members as for naming arguments when calling a procedure. If a value is not provided for a member, and no default value is given in the structure, a zeroed value is used.

// Naming members
p1 := Point.{x=10, y=20};

// Leaving out names. Follows order of members declared in the structure.
p2 := Point.{10, 20};

Defaulted Members

Members can be given default values. These values are used in structure literals if no other value is provided for a member. They are also used by __initialize to initialize a structure.

Person :: struct {
	name: str = "Joe";

	// If the type can be inferred, the type can be omitted.
	age := 30;
}

sally := Person.{ name="Sally", age=42 };
println(sally);

// Because name is omitted, it defaults to "Joe".
joe := Person.{ age=31 };
println(joe);

// Leaving out all members simply sets the members with initializers to
// their default values, and all other members to zero.
joe2 := Person.{};
println(joe2);

Directives

Structures have a variety of directives that can be applied to them to change their properties. Directives go before the { of the structure definition.

Directive | Function
----------+---------------------------------------
#size n   | Set a minimum size
#align n  | Set a minimum alignment
#pack     | Disable automatic padding
#union    | All members are at offset 0 (C union)
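
For instance, because directives go before the {, a packed structure and a C-style union could be written as follows (a brief sketch; the structure names are made up):

Packed_Header :: struct #pack {
    tag:    u8;
    length: u32;
}

Int_Or_Float :: struct #union {
    i: i32;
    f: f32;
}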

Polymorphic Structures

Structures can be polymorphic, meaning they accept a number of compile time arguments, and generate a new version of the structure for each set of arguments.

// A 2d-point in any field.
Point :: struct (T: type_expr) {
	x, y: T;
}

Complex :: struct {
	real, imag: f32;
}

int_point: Point(i32);
complex_point: Point(Complex);

Polymorphic structures are immensely useful when creating data structures. Consider this binary tree of any type.

Tree :: struct (T: type_expr) {
	data: T;
	left, right: &Tree(T);	
}

root: Tree([] u8);

When declaring a procedure that accepts a polymorphic structure, the polymorphic variables can be explicitly listed.

HashMap :: struct (Key: type_expr, Value: type_expr, hash: (Key) -> u32) {
	// ...
}

put :: (map: &HashMap($Key, $Value, $hash), key: Key, value: Value) {
	h := hash(key);
	// ...
}

Or they can be omitted and a polymorphic procedure will be created automatically. The parameters to the polymorphic structure can be accessed as though they were members of the structure.

HashMap :: struct (Key: type_expr, Value: type_expr, hash: (Key) -> u32) {
	// ...
}

put :: (map: &HashMap, key: map.Key, value: map.Value) {
	h := map.hash(key);
	// ...
}

Structure Composition

Onyx does not support inheritance. Instead, a composition model is preferred. The use keyword specifies that all members of a member should be directly accessible.

Name_Component :: struct {
	name: str;
}

Age_Component :: struct {
	age: u32;
}

Person :: struct {
	use name_comp: Name_Component;
	use age_comp:  Age_Component;
}

// 'name' and 'age' are directly accessible.
p: Person;
p.name = "Joe";
p.age = 42;
println(p);

Sub-Type Polymorphism

Onyx supports sub-type polymorphism, which enables a safe and automatic conversion from the pointer type &B to &A if the following conditions are met:

  1. The first member of B is of type A.
  2. The first member of B is used.

Person :: struct {
	name: str;
	age:  u32;
}

Joe :: struct {
	use base: Person;
	pet_name: str;
}

say_name :: (person: &Person) {
	printf("Hi, I am {}.\n", person.name);
}

joe: Joe;
joe.name = "Joe";

// This is safe, because Joe "extends" Person.
say_name(&joe);

In this example, you can pass a pointer to Joe when a pointer to Person is expected, because the first member of Joe is a Person, and that member is used.

Enumerations

Enumerations or "enums" give names to values, resulting in cleaner code. Enums in Onyx are declared much like structures.

Color :: enum {
	Red;
	Green;
	Blue;
}

col := Color.Red;

Notice that enums use ; to separate members.

By default, enum members are automatically assigned incrementing values, starting at 0. So above, Red would be 0, Green would be 1, Blue would be 2. The values can be overridden if desired. A :: is used because these are constant bindings.

Color :: enum {
	Red   :: 123;
	Green :: 456;
	Blue  :: 789;
}

Values are automatically incremented from the previous member if no value is given.

Color2 :: enum {
	Red :: 123;
	Green;  // 124
	Blue;   // 125
}

Values can also be expressed in terms of other members.

Color3 :: enum {
	Red   :: 123;
	Green :: Red + 2;
	Blue  :: Red + Green;
}

By default, enum values are of type u32. This can be changed by specifying the underlying type in parentheses after the enum keyword.

Color :: enum (u8) {
	Red; Green; Blue;
}

Enums can also represent a set of bit-flags, using the #flags directive. In an enum #flags, values are automatically doubled instead of incremented.

Settings :: enum #flags {
	Vsync;        // 1
	Fullscreen;   // 2
	Borderless;   // 4
}

settings: Settings;
settings |= Settings.Vsync;
settings |= Settings.Borderless;
println(settings);

As a convenience, when accessing a member on an enum type, if the type can be determined from context, the type can be omitted.

Color :: enum {
	Red; Green; Blue;
}

color := Color.Red;

// Because something of type Color only makes
// sense to compare with something of type Color,
// Red is looked up in the Color enum. Note the
// leading '.' in front of Red.
if color == .Red {
	println("The color is red.");
}

Tagged Unions

Tagged unions in Onyx can be thought of as an enum, with every variant having a different type associated with it. When the tagged union is one variant, it is storing a value of the corresponding type. A value can only be one of the variants at a time. They are written using the union keyword and look much like structures.

Here is an example of a tagged union.

Value :: union {
    // First variant, called Int, stores an i32.
    Int: i32;

    // Second variant, called String, stores a str.
    String: str;

    // Final variant, called Unknown, stores "void", meaning it does not store anything.
    Unknown: void;
}

This union has three variants called Int, String, and Unknown. They store an i32, str and nothing respectively. Internally there is also an enum made to store these variant tags. You can access it using Value.tag_enum.

To create a value out of a union type, it looks like a structure literal, except there must be exactly one variant listed by name, with its corresponding value.

v1 := Value.{ Int = 123 };
v2 := Value.{ String = "string value" };
v3 := Value.{ Unknown = .{} }; // To spell a value of type 'void', you can use '.{}';

We create three values, one for each variant of the union. To get access to the values inside of the tagged union, we have two options: using a switch statement, or using variant access.

We can use switch statement over our tagged union value, and use a capture to extract the value stored inside.

print_value :: (v: Value) {
    switch v {
        // `n` is the captured value
        // Notice we use `.Int`. This is short for `Value.tag_enum.Int`.
        case .Int as n {
            printf("It's an integer with value {}.\n", n);
        }

        case .String as s {
            printf("Its a string with value {\"}.\n", s);
        }

        // All other cases will be unhandled
        // This is still necessary to satisfy exhaustive matching
        case _ ---
    }
}

print_value(v1);
print_value(v2);
print_value(v3);

We can also directly access the variant on the tagged union. This gives us an optional of the corresponding type. If the current variant matched, we get a Some. If not, we get a None.

println(v1.Int);     // prints Some(123)
println(v1.String);  // prints None

You can use the features of Optionals to work with these results.

Polymorphic unions

Like structures, unions can be polymorphic and take type parameters.

A good example is the Result type from the standard library. It is defined as:

Result :: union (Ok_Type: type_expr, Err_Type: type_expr) {
    Ok: Ok_Type;
    Err: Err_Type;
}

These work exactly like polymorphic structures when it comes to using them in procedure definitions and the like.

// Returns an optional of the error type of the result.
// This is entirely redundant, since `result.Err` would give the same result.
get_err :: (result: Result($Ok, $Err)) -> ? Err {
    return result.Err;
}

Distinct

Distinct types wrap another type in a new distinct type, which allows for strong type checking and operator overloads. Consider this example about representing a timestamp.

use core {println}

Time     :: #distinct u32
Duration :: #distinct i32

#operator - (end, start: Time) -> Duration {
	return Duration.{cast(u32) end - cast(u32) start};
}

start := Time.{1000};
end   := Time.{1600};

duration := end - start;
println(typeof duration);
println(duration);

With distinct types, more semantic meaning can be given to values that otherwise would be nothing more than primitives.

Distinct types can be cast directly to their underlying type, and vice versa. Distinct types cannot be cast directly to a different type.
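
Continuing the Time example above, the conversions in both directions would look like this:

t       := Time.{1000};
raw     := cast(u32) t;     // Distinct type to its underlying type
t_again := cast(Time) raw;  // Underlying type back to the distinct type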

It should be noted that when a distinct type is made, none of the operators defined for the base type are defined for the new type. In the previous example, two Time values would not be comparable unless a specific operator overload was provided.

Time :: #distinct u32
#operator == (t1, t2: Time) => cast(u32) t1 == cast(u32) t2;

Procedure types

Procedure types represent the type of a procedure. They are used when passing a procedure as an argument, or when storing a procedure in a variable or structure member. They are written very similarly to procedures, except they must have a return type, even if it is void.

map :: (x: i32, f: (i32) -> i32) -> i32 {
	return f(x);
}

// Explicit version of a procedure
println(map(10, (value: i32) -> i32 {
	return value * 2;
}));

Using procedure types for parameters enables quick procedures to be passed.

map :: (x: i32, f: (i32) -> i32) -> i32 {
	return f(x);
}

// Quick version of a procedure
// Because 'map' provides the type of the argument
// and return value, this quick procedure can be passed.
println(map(10, x => x * 2));

As a convenience, procedure types can optionally have argument names to clarify what each argument is.

handle_player_moved: (x: i32, y: i32, z: i32) -> void

// Elsewhere in the code base.
handle_player_moved = (x, y, z) => {
	println("Player moved to {} {} {}\n", x, y, z);
}

handle_player_moved(10, 20, 30);
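
Procedure types can also be used for structure members, as mentioned at the start of this section. Here is a small sketch; Button and on_click are hypothetical names:

Button :: struct {
    label:    str;
    on_click: () -> void;
}

main :: () {
    b := Button.{
        label    = "OK",
        on_click = () { println("Clicked!"); }
    };

    // Call the procedure stored in the member.
    b.on_click();
}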

Optional

Optional types in Onyx represent something that may or may not contain a value. They are simply written as ? T, where T is the type that the optional may contain.

You may wonder why it's written as ? T instead of T?. This is to prevent ambiguity. For example, if you saw [] T?, would it be an optional slice of T (([] T)?), or a slice of optional T ([] (T?))? To avoid this problem, the optional specifier is placed in the same position as the pointer/slice/etc. specifiers.

Internally, Optional types are simply defined as a polymorphic union called Optional in the builtin package. They are just given the special ? T syntax to mean Optional(T). These are equivalent.
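
For example, these two declarations describe the same type:

x: ? i32;
y: Optional(i32);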

Using Optionals

Optionals have been designed to be used ergonomically within your codebase, without much overhead.

Here is an incorrect example of a function that gets the last element of an array. It is incorrect because it does not correctly handle the case where the array is empty.

array_last :: (arr: [] $T) -> T {
    return arr[arr.count - 1];
}

array_last(.[1, 2, 3]) // returns 3
array_last(.[])        // undefined behavior

To make this correct, all we need to do is change the return type to ? T and add a check for an empty array. The correct code would look like so.

array_last :: (arr: [] $T) -> ? T {
    // If the array is empty. Equivalent to arr.count == 0
    if !arr {
        // Return an empty instance of ? T, which is a None.
        return .{};
    }

    // This will implicitly cast from a T to a ? T.
    return arr[arr.count - 1];
}

array_last(.[1, 2, 3]); // returns Some(3)
array_last(.[]);        // returns None

If we want to get the value stored in an optional, we have a couple of options. We could:

  • Use one of the builtin methods on Optional, like unwrap, value_or, or or_else.
  • Use the try operator (?) to force getting the value, or return .{} from the nearest block.
  • Use the coalesce operator (??).
  • Use a switch statement with a capture.

// `o` is an optional i32.
o: ? i32 = 123;

v1 := o->unwrap();         // Cause an assertion failure if it doesn't exist
v2 := o->value_or(0);      // Provide a default value
v3 := o->or_else(() => 0); // Provide a default value, wrapped in a function

v4 := o?;     // If no value exists, execute `return .{}`;
v5 := o ?? 0; // Equivalent to `->value_or(0)`;

switch o {
    case .None {
        // No value
    }

    case .Some as v6 {
        // v6 is the value
    }
}

Directives

"Directives" is the generic word for the special meanings keyword-like things that control isolated aspects of the language.

This section describes most of the directives in the language, while some directives are described in more relevant parts of the documentation.

#inject

Note, the below documentation about #inject is out of date and will be removed in the future. This is because #inject is now unnecessary with recent changes to the language. You can read more about the new syntax on the Bindings page, under the "Targeted Bindings" section.

#inject is a special directive compared to most others, because it enables many very powerful features in the language.

The basic idea of #inject is that it injects symbols into a scope, from anywhere in the code base. Using this, you can add methods to structures, add symbols and overloads to a package, and even declare new global types and variables.

Syntax

The inject directive can take on two forms: the singular form, and the block form.

In the singular form, you simply write #inject before a binding, but that binding's target can be nested inside of anything with a scope. For example, here is one way of adding a method to a structure.

Vector2 :: struct {
	x, y: f32;
}

// Without #inject here, you would get a parsing error.
#inject
Vector2.magnitude :: (v: Vector2) -> f32 {
	return math.sqrt(v.x * v.x + v.y * v.y);
}

main :: () {
	v := Vector2.{ 3, 4 };

	println(v->magnitude());
}

Note, while it would be possible to change the syntax so that you would not need to write #inject in this case, I think that could lead to some unexpected bugs. I have not tried it though, so it might turn out to be nice to use.

The powerful thing about #inject is that the definition for Vector2.magnitude does not have to be in the same file, or even the same package. It can even be optionally defined with a static if. Using #inject you can define your own extensions to types provided from any library.

When you have many things to inject into the same place, you can use the block form of #inject. In this form, you write #inject, then the thing to inject into, followed by braces ({}). Inside the braces, any binding you write will be turned into an injected binding into that scope.

Vector2 :: struct {
	x, y: f32;
}

#inject Vector2 {
	add :: (v1, v2: Vector2) => Vector2.{ v1.x + v2.x, v1.y + v2.y };
	sub :: (v1, v2: Vector2) => Vector2.{ v1.x - v2.x, v1.y - v2.y };
	mul :: (v1, v2: Vector2) => Vector2.{ v1.x * v2.x, v1.y * v2.y };
}

main :: () {
	v1 := Vector2.{ 3, 4 };
	v2 := Vector2.{ 5, 6 };


	println(v1->add(v2));         // Using method call syntax
	println(Vector2.sub(v2, v1)); // Using explicit syntax
}

Limitations

You can inject into anywhere a binding can appear, with the exception of procedure bodies. Procedure bodies are isolated to prevent confusion.

But, this means you can inject into any of these things:

  • Packages
  • Structures
  • Unions
  • Enums
  • (probably more, but I am forgetting them at the time of writing)

Making global bindings

To prevent name clutter, Onyx intentionally places every binding into a package. See the Packages section for more details. But sometimes, you want to make a true global binding, one that does not require using any package to access. To do this, you can simply #inject into the builtin package.

This is because the builtin package is special. Its public scope is actually mapped to the global scope of the program. This makes it so everything useful defined in builtin is always accessible to you, things like make, new, delete and context. But because of this, you can #inject into it to define your own globals for your program.

One example use of this is the logf function. It is only defined if you are using a runtime with the core.conv package defined. To do this, there is an #inject into builtin in core/conv/format.onyx for the logf symbol. This way it is always accessible, but only if you are using a runtime with formatting capabilities.
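
As a sketch of this pattern (the binding name here is hypothetical), you could define a global of your own like so:

#inject builtin {
	// MY_APP_VERSION is now accessible from every package,
	// without needing a `use` statement.
	MY_APP_VERSION :: "1.0.0"
}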

#if

#if is a compile-time if statement. It looks like a normal if statement, except its condition must be resolvable at compile time. This is because it controls whether or not the body of the #if statement is included in the compilation.

Static-ifs can be used inside and outside of procedures.

Outside of Procedures

When outside of a procedure, static-ifs can be used to control whether or not certain symbols are defined in the current compilation.

DEBUG_MODE :: true

#if DEBUG_MODE {
	// This procedure is only defined if DEBUG_MODE is true
	debug_only_procedure :: () {
		// ...
	}
} else {
	// This procedure is only defined if DEBUG_MODE is false
	not_a_debug_procedure :: () {
		
	}
}

Static-ifs can contain any top-level "thing" in the language: procedures, structures, unions, and even #load directives. Using this feature, you can optionally include files depending on a condition.

USE_EXTENSIONS :: false
MINIMUM_EXTENSION_VERSION :: 5

#if USE_EXTENSIONS && MINIMUM_EXTENSION_VERSION >= 3 {
	#load "./extensions/v3"
}

Inside of procedures

When inside a procedure, static-ifs can contain statements that will only be included if the static-if resolves to be true.

DEBUG_MODE :: true

draw :: () {
	#if DEBUG_MODE {
		draw_debug_ui();
	}

	// ...
}

Other uses

See the #defined documentation for more uses of static-if statements.

#tag

#tag is used to attach static metadata to various compile-time objects. This metadata can then be accessed using the runtime.info package.

To tag something, simply place one or more #tags before the binding. The order of the tags is preserved when using them.

The metadata that is attached has to be a compile-time known value, because it will be serialized and placed in the data section of the resulting binary. It could be a numeric, string, structure or array literal for example.

Structures

Here is an example of a structure tagged with a string literal.

#tag "option:hidden"
Value :: struct {
    // ...
}

To access tags on a structure, use the get_type_info function from runtime.info to get the Type_Info of the structure, then use ->as_struct() to convert it to a Type_Info_Struct. From there, you can use the .tags property to access the stored data. It is simply an array of any, which can be used with the utilities for any found in core.misc.

use runtime.info { get_type_info }
use core { printf, misc }

main :: () {
    info := get_type_info(Value)->as_struct();
    for tag in info.tags {
        if tag.type == str {
            value := * misc.any_as(tag, str);
            printf("Value: {}\n", value);
        }
    }
}

Structure members

Value :: struct {
    #tag "a value"
    member_name: i32;
}

To access tags on a structure member, follow the same steps as above, and then use the members array on the Type_Info_Struct. Each member's info has a tags array that contains all of the tags defined on the member, in the order they were defined.

use runtime.info { get_type_info }
use core { printf, misc }

main :: () {
    info := get_type_info(Value)->as_struct();
    for member in info.members {
        for tag in member.tags {
            if tag.type == str {
                value := * misc.any_as(tag, str);
                printf("Value: {}\n", value);
            }
        }
    }
}

Unions

Tags on unions behave in exactly the same manner as tags on structures.

Union Variants

Tags on union variants behave in exactly the same manner as tags on structure members.
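
For reference, here is a minimal sketch of what that could look like, assuming the #tag placement mirrors the structure examples above (Shape is a hypothetical type):

#tag "option:hidden"
Shape :: union {
    #tag "the radius"
    Circle: f32;

    #tag "width and height"
    Rectangle: [2] f32;
}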

Procedures

Tag information for procedures is located in the runtime.info.tagged_procedures array. You can either loop through this array manually, or you can use the helper procedure runtime.info.get_procedures_with_tag.

use runtime.info {get_procedures_with_tag}
use core {printf}

Metadata :: struct { name: str }

#tag Metadata.{ "name is foo" }
foo :: () { }

#tag Metadata.{ "name is bar" }
bar :: () { }


main :: () {
    // Provide the type of the tag.
    procs := get_procedures_with_tag(Metadata);
    for p in procs {
        printf("Procedure is: {}\n", p.func);
        printf("Procedure type is: {}\n", p.type);
        printf("Tag is: {}\n", p.tag);
        printf("Procedure is in package: {}\n", p.pack);
    }
}

Globals

Like tagged procedures, tagged global information lives in runtime.info.tagged_globals. You can either loop through it directly, or use the helper procedure runtime.info.get_globals_with_tag.

use runtime.info {get_globals_with_tag}
use core {printf}

Metadata :: struct { name: str }

#tag Metadata.{ "name is foo" }
foo: i32


main :: () {
    // Provide the type of the tag.
    globs := get_globals_with_tag(Metadata);
    for g in globs {
        printf("Global address is: {}\n", g.data);
        printf("Global type is: {}\n", g.type);
        printf("Tag is: {}\n", g.tag);
        printf("Global is in package: {}\n", g.pack);
    }
}

#export

#export adds a procedure to the export-list of the compiled WebAssembly binary. This is a crucial piece of functionality when trying to use Onyx in other environments, such as from JS or in plugin systems.

The syntax for #export looks like this.

#export "export name here" exported_procedure

The name provided must be a compile-time string. The exported procedure can either be a reference to a procedure, or a procedure literal itself.

#export "add" (x, y: i32) -> i32 {
	return x + y;
}
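
For completeness, here is the same export written as a reference to a named procedure instead of a procedure literal.

add :: (x, y: i32) -> i32 {
	return x + y;
}

#export "add" add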

#foreign

The #foreign directive is used to tell the compiler that a function is defined outside of this program. Because Onyx compiles to WebAssembly, this means that the function will be added to the import section of the WASM module. You can read more about what that means here.

The #foreign directive can appear in two different places, depending on which is more convenient.

The first position it can appear in is directly after the return type of a function. In this position, it must be followed by two compile-time known strings that are the module and import name. This terminology is inherited from the WebAssembly specification.

external_procedure :: (arg1: i32, arg2: i32) -> i32 #foreign "host" "add" ---

In this example, host is the module name, and add is the import name.

Foreign blocks

The other position #foreign can appear in is foreign-blocks. In this form, you can declare many foreign procedures at once, so long as they all have the same module name, and their import name matches the name in Onyx that is given to them.

#foreign "host" {
	add :: (arg1: i32, arg2: i32) -> i32 ---
	sub :: (arg1: i32, arg2: i32) -> i32 ---
	mul :: (arg1: i32, arg2: i32) -> i32 ---
}

In this example, add, sub, and mul are all foreign procedures with the module name host. They have the import names add, sub, and mul respectively.

We can validate this using the wasm-objdump tool from the WebAssembly Binary Toolkit. We also have to compile in a special way to not clutter the output with the imports that come from the standard library.

$ onyx build -r custom -o example.wasm example.onyx core/runtime/default_link_options.onyx
$ wasm-objdump -x -j import example.wasm

example.wasm:	file format wasm 0x1

Section Details:

Import[3]:
 - func[0] sig=0 <host.add> <- host.add
 - func[1] sig=0 <host.sub> <- host.sub
 - func[2] sig=0 <host.mul> <- host.mul

When using Onyx from the command line with onyx run, or when running with the WASI backend, these foreign functions will be resolved for you. However, when using JavaScript as your runtime, you will need to provide definitions for each imported procedure. See this MDN article for more details.

#file_contents

#file_contents can be used to add the contents of a file (text or binary) into the data section of the outputted WebAssembly binary, and gives you access to it as a [] u8.

You can use this to embed anything in the binary that you would have had to put in a string literal, or load at runtime.

image_data := #file_contents "image/path/here.png";
pixels     := convert_image_to_pixels(image_data);

This way, there is no file I/O needed to load the image from disk; it is already in the binary, ready to be used.

#defined

When a symbol may or may not be defined due to different compilation flags, you can use #defined to test whether or not it is actually defined.

#defined looks like a procedure with a single argument, which evaluates at compile-time to a boolean expression.

use core {println}

main :: () {
	main_is_defined := #defined(main);  // true
	foo_is_defined  := #defined(foo);   // false
}

One useful feature of #defined is that you can use it to test if a package is defined in the program. This way, you can test for optional extensions in your program, without relying on using the correct flags.

#if #defined(package foo) {
	// We know foo is defined, we can write a procedure that uses it
	uses_foo :: () {
		use foo;
		
		foo.bar();
	}
}

Using with #if

#defined is generally used with #if to conditionally include things depending on if something else was or was not defined.

As an example, you could have a set of procedures that can be overridden by the end-user of your library. But if they want to use the defaults, those can still be defined automatically. A combination of targeted bindings, #defined, and #if makes this work well.

In the library, you would use #if and #defined to test if a certain flag was defined.

package your_library

// Use predefined procedures if user did not override them.
#if !#defined(CUSTOM_PROCEDURES) {
	do_thing_one :: () { println("Default thing 1!"); }
	do_thing_two :: () { println("Default thing 2!"); }
}

Then the consumer of the library can use targeted bindings to define the flag and functions if necessary.

package main

use your_library

// Override procedures with targeted binding.
your_library.CUSTOM_PROCEDURES :: true

your_library.do_thing_one :: () { println("Overridden thing 1!"); }
your_library.do_thing_two :: () { println("Overridden thing 2!"); }

main :: () {
	your_library.do_thing_one(); // Overridden thing 1!
	your_library.do_thing_two(); // Overridden thing 2!
}

#persist

#persist is used to make a static global variable in places that normally would not allow static global variables.

You can define a persistent or static variable in a procedure like so.

count :: () -> i32 {
	// Persistent variables are global variables
	// constrained to the current scope.
	#persist counter: i32;

	counter += 1;
	return counter;	
}

main :: () {
	for 100 {
		println(count());
	}
}

You can define a persistent variable in a structure body, where it will be accessible using the structure name as a namespace.

Foo :: struct {
	#persist foo_counter: i32;

	name: str;


	make :: () -> Foo {
		Foo.foo_counter += 1;
		return Foo.{ tprintf("Foo #{}\n", Foo.foo_counter) };
	}
}

main :: () {
	f1 := Foo.make();
	f2 := Foo.make();

	println(f1);  // Foo #1
	println(f2);  // Foo #2

	println(Foo.foo_counter);
}

#thread_local

#thread_local is used to define a global variable as thread-local. These thread-local variables are unique to each thread, so every thread gets its own copy.

use core.thread
use core.iter
use core {println}

#thread_local
counter: i32;

thread_task :: (_: rawptr) {
	for 0 .. 10000 {
		counter += 1;
	}

	println(counter);
}

main :: () {
	threads := iter.as_iter(0 .. 16)
			|> iter.map(_ => {
				t := new(thread.Thread);
				thread.spawn(t, cast(&void) null, thread_task);
				return t;
			})
			|> iter.collect();

	for t in threads {
		thread.join(t);
	}
}

Note, this example will not work on the Onyx Playground, because it uses multi-threading, which is not supported there.

This program will print 10000 sixteen times, since each thread has its own copy of counter.

#doc

#doc is used to provide doc-strings to bindings. They can appear before most "things" in the language, like structures, unions, procedures, enums, etc.

To use them, simply write #doc followed by a compile-time string.

#doc "This is the documentation for the 'procedure_a'."
procedure_a :: () {
	// ...
}

#doc """
	This multi-line string literal is the documentation
	for procedure_b.
"""
procedure_b :: () {
	// ...
}

Note that you can only have one #doc directive per binding.

These doc-strings are included in the generated .odoc file when compiled with the --doc flag. This binary file is used by onyx-doc-gen to generate HTML documentation for the current compilation. This file can also be easily deserialized into a structure you can work with in Onyx like so.

use core.encoding.osad
use core.doc
use core.os

contents := os.get_contents("documentation.odoc");
docs     := osad.deserialize(doc.Doc, contents)->unwrap();

// See core/doc/doc.onyx for what is inside of `docs`

#deprecated

You can use #deprecated on a procedure to cause a warning whenever it is called. It is not a compile error, but it will show the deprecation message when the program is compiled.

Here is how to use it.

an_old_procedure :: (x, y: i32) -> i32
	#deprecated "This is the deprecation message. Include relevant replacement info here."
{
	// ...
}

The #deprecated directive goes after the return type and before the start of the function body. It must be followed by a compile-time known string that should contain information about how to migrate away from the deprecated function.

Currently, #deprecated can only appear on procedures. While it could be useful on types, it is currently not supported.

#init

#init allows you to define procedures that run before main in your program. This lets you do simple setup and initialization before main is reached.

#init must be followed by a compile-time known procedure with the type signature, () -> void.

#init () {
	println("In #init procedure!");
}

main :: () {
	println("In main!");
}

// Output:
// In #init procedure!
// In main!

You are guaranteed that the runtime has been fully initialized before any #init procedure is invoked. This way, you know that printing and heap allocations will work from #init procedures.

Ordering with #after

The order of #init procedures is undefined and may change as your program changes. However, by using the #after directive, you can specify a dependency of an #init procedure that is guaranteed to be executed before the procedure in question.

global_map: Map(str, i32);

// Bind the #init statement to a symbol.
prepare_map :: #init () {
	global_map = make(Map(str, i32));
}

populate_map :: #init #after prepare_map () {
	global_map->put("A", 1);
	global_map->put("B", 2);
	global_map->put("C", 3);
}

In this example, prepare_map is guaranteed to be run before populate_map because of the #after directive on populate_map.

You can specify as many #after directives as you want on a single #init procedure.

#init
	#after A
	#after B
	#after C
() {
	// ...
}

In this example, the #init procedures A, B and C will be run before this #init procedure.

#error

#error is used to produce a static, compile-time error.

To use it, simply place it outside of any procedure, and include a compile-time string that is the error message.

#error "This is a static error the prevents this program from compiling."

main :: () {
}

#error by itself is almost useless, but when combined with #if, you can achieve something like a static-assertion.

#if !#defined(something_important) {
	#error "'something_important' must be defined to compile."
}

#this_package

This directive is a small hack that can be used when writing macros. Because macros do not have normal scoping, it can be difficult to reference something defined in the same package as the macro, since it might not be visible where the macro is expanded.

#this_package is used to represent the current file's package as an object in which you can look things up.

internal_details :: (x: rawptr, T: type_expr) {
	// ...
}

useful_macro :: macro (x: & $T) {
	#this_package.internal_details(x, T);
}

This pattern is very common in the core libraries of Onyx, where you have a macro that takes a pointer to anything, but it gets expanded to a procedure call that simply passes the pointer and the type of the value.

This has to use #this_package because internal_details is not going to be directly accessible when the macro is expanded. But by specifying that it needs to be looked up in the current package, this problem can be avoided.
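
As a rough sketch of why this matters, here is the macro being used from another package (my_lib is a hypothetical package containing the two definitions above):

package main

use my_lib

main :: () {
	x := 10;

	// The macro body still finds internal_details, because the lookup
	// goes through #this_package (the macro's home package), not the
	// scope where the macro is expanded.
	my_lib.useful_macro(&x);
}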

#wasm_section

When producing the final WebAssembly file, custom sections can be included to add metadata to the binary. The compiler already produces some of these, like names and producers.

Custom sections can be specified in an Onyx program by using the #wasm_section directive. This directive is followed by the custom section name and the contents of the custom section, as compile-time strings.

#wasm_section "my-custom-section" "Custom section data here."

#wasm_section "another-section" #file "path/to/custom/data"

Miscellaneous

Format Strings

When specifying the format string for printf or conv.format, there are a number of options you can use to configure how the resulting string will be formatted. Format specifiers are given between curly braces ({}) in the format string. There is a one-to-one mapping between the curly-brace pairs and the arguments provided to conv.format, at least at the moment.

This table provides brief definitions of what can appear between the curly braces.

Symbol   Use
*        If the variable is a pointer, dereference the pointer and format the result
p        Pretty formatting
.N       Sets the decimal precision when formatting a float to be N digits
bN       Sets the base when formatting an integer to be N
x        Shorthand for b16
wN       Left-pad to N characters long (this might not work for everything)
"        Quote strings in double quotes. Quotes are only added to strs
'        Quote strings in single quotes. Quotes are only added to strs
d        Disable printing enums as strings and print as numbers instead
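
As a brief illustration (a minimal sketch; the comments describe what each specifier requests, per the table above):

use core {printf}

main :: () {
    printf("{.2}\n", 3.14159);  // float formatted with 2 decimal digits
    printf("{b16}\n", 255);     // integer formatted in base 16
    printf("{x}\n", 255);       // shorthand for base 16
    printf("{w6}\n", 42);       // left-padded to 6 characters
}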

Reflection

Reflection provides the ability for a program to introspect itself, and to perform different operations dynamically at runtime based on a value's type or on metadata stored in the compiled program.

In Onyx, reflection is available through the runtime.info package. This package provides utility functions for accessing all type information and metadata (tags) stored in the binary.

Types are Values

Every type in Onyx is given a unique ID at compile time. This ID is not stable, so a separate compilation may choose a different ID for the same nominal type. By having a single integer for every type, Onyx's types can be runtime values as well as compile time values.

In the following example, t is a variable that stores a type.

main :: () {
    t := i32;

    println(t);         // Prints i32
    println(typeof t);  // Prints type_expr, aka the type of a type
}

Under the hood, t is simply storing a 32-bit integer that is the unique ID of i32.

any

This ability to have types as runtime values enables any in Onyx. any is a dynamically typed value, whose type is known at runtime, instead of at compile-time. Under the hood, any looks like this:

any :: struct {
    data: rawptr;
    type: type_expr;
}

As you can see, it stores a data pointer and a runtime-known type. Every any points to a region of memory where the value is actually stored. You can think of any like a "fat-pointer" that stores the pointer, plus the type.

any is typically used as an argument type on a procedure. When a parameter has any as its type, the compiler will implicitly wrap the corresponding argument in an any, placing the argument on the stack, and constructing an any using the pointer to the stack and the type of the argument provided.

uses_any :: (value: any) {
    println(value.type);
}

main :: () {
    uses_any(10);       // Prints i32
    uses_any("Hello");  // Prints str
    uses_any(context);  // Prints OnyxContext
}

any can also be used for variadic arguments of different types.

/// Prints the type of each argument.
many_args :: (values: ..any) {
    for value in values {
        printf("{} ", value.type);
    }
}

main :: () {
    many_args(10, "Hello", context);
    // Prints: i32 str OnyxContext
}

To use the data inside of an any, you have to write code that handles the different types, or kinds of types, that you expect. You can either check for concrete types explicitly, or use runtime type information to handle things dynamically. To get the type information for a given type, use the runtime.info.get_type_info procedure, or the info method on the type_expr.

print_size :: (v: any) {
    size := switch v.type {
        case i32 => 4
        case i64 => 8
        case str => 8
        case _   => -1
    };

    printf("{} is {} bytes.\n", v.type, size);
}

main :: () {
    print_size(10);
    print_size("Hello");
    print_size(context);
}

In this contrived example, print_size checks the type of the any against explicit types using a switch expression, defaulting to -1 if the type is not one of them.

For some applications of any this is perfectly acceptable, but for others, a more generalized approach might be necessary. In such cases, you can use runtime type information to introspect the type.

Using Runtime Type Information

Baked into every Onyx compilation is a type table. This table contains information on every type in the Onyx program, from the members of structures, to the variants of unions, to which polymorphic structure was used to create a structure.

This information is stored in runtime.info.type_table, which is a slice that contains a &Type_Info for each type in the program.

Type_Info stores generic information for every type, such as its size. When given a &Type_Info, you will generally look at its kind member and then cast it to a more specific type to get more information out of it.

In this example, when a structure type is passed in, the function will print all of the members of the structure, including their name, type, and offset.

print_struct_details :: (type: type_expr) {
    info := type->info();
    struct_info := info->as_struct();  // OR cast(&Type_Info_Struct) info
    
    for member in struct_info.members {
        printf("Member name   : {}\n", member.name);
        printf("Member type   : {}\n", member.type);
        printf("Member offset : {} bytes\n", member.offset);
        printf("\n");
    }
}

Foo :: struct {
    first: str;
    second: u32;
    third: &Foo;
}

main :: () {
    print_struct_details(Foo);
}

This prints:

Member name   : first
Member type   : [] u8
Member offset : 0 bytes

Member name   : second
Member type   : u32
Member offset : 8 bytes

Member name   : third
Member type   : &Foo
Member offset : 12 bytes

In this example, runtime type information is used to get the size of the type.

print_size :: (v: any) {
    info := v.type->info();
    size := info.size;     // Every type has a known size

    printf("{} is {} bytes.\n", v.type, size);
}

main :: () {
    print_size(10);
    print_size("Hello");
    print_size(context);
}

JS Interop

Interfacing with JavaScript from Onyx is easy thanks to the core.js package. It was inspired by syscall/js, made by the wonderful people on the Go team.

The core.js package abstracts away the details of managing references to JS values from Onyx, so you are able to write code that uses JS values without caring about all the internal details.

For example, here is a simple program that runs in a web browser. It creates a new button element, adds a click event handler that will call an Onyx function, then adds the button to the page.

use core.js

main :: () {
    // Lookup the document object in the global scope (i.e. window).
    document := js.Global->get("document");

    // Call createElement to make a new button, then set the text of the button.
    button := document->call("createElement", "button");
    button->set("textContent", "Click me!");

    // Call addEventListener to handle the `click` event.
    // Use js.func to wrap an Onyx function to be available from JS.
    button->call("addEventListener", "click", js.func((this, args) => {
        js.Global->call("alert", "Hello from Onyx!");

        return js.Undefined;
    }));

    // Call appendChild on the body to insert the button on the page.
    document->get("body")->call("appendChild", button);
}

While compiling this program, be sure to add the -r js flag, as it specifies you are targeting a JS runtime.

onyx build -o app.wasm -r js program.onyx

This will generate two files, app.wasm and app.wasm.js. The .js file exists to allow you to load and call your Onyx code from JS. Here is a simple HTML page that will load the JS and start the program, which will in turn call main.

<html>
    <head>
        <title>Onyx program</title>
        <script type="module">
            import Onyx from "/app.wasm.js"
            let app = await Onyx.load("/app.wasm")
            app.start()  // Bootstrap program and call main
        </script>
    </head>
    <body>
    </body>
</html>

Load this in your favorite web browser from a local web server and you should see a button on the page. Click it to test the program!

Some internal details

There are some nuances that are worth mentioning about how this library is currently setup.

The .start() method does start the program and invoke your main function, but it does a little more than that: it also bootstraps the standard library, preparing buffers and allocators used by most Onyx programs. For this reason, even if you are not going to do anything in main and solely want to use Onyx as an auxiliary to your main code, you still need to call the .start() method; just leave the main procedure empty.

When you want to invoke a specific Onyx function from JS, you have to do two things. First, the procedure you wish to call has to have the following signature: (js.Value, [] js.Value) -> js.Value. The first argument is the this implicit parameter. The second argument is a slice of js.Values that are the actual arguments. Here is a simple add procedure using this signature.

use core.js

add :: (this: js.Value, args: [] js.Value) -> js.Value {
    a := args[0]->as_int() ?? 0;
    b := args[1]->as_int() ?? 0;

    res := js.Value.from(a + b);

    return res;
}

Second, export the procedure from Onyx using the #export directive.

#export "add" add

Then, you can use the .invoke() method to invoke the procedure with an arbitrary number of arguments.

app.invoke("add", 123, 456); // Returns 579

As a slight aid, if you forget to call .start(), .invoke() will automatically call it for you the first time. So, if you use invoke and are wondering why your main procedure is executing, you likely forgot to call start.

Understanding the API

The API provided by core.js is a very thin wrapper around normal JS operations. The best way to understand it is to understand what each of the methods does in JS. Once you understand how each JS operation maps to the corresponding Onyx method, it is relatively easy to translate JS code into Onyx.

Value.new_object

Creates a new empty object. Equivalent of writing {} in JS.

Value.new_array

Creates a new empty array. Equivalent of writing [] in JS.

Value.from

Converts an Onyx value into a JS value, if possible.

Value.as_bool, Value.as_float, Value.as_int, Value.as_str

Convert a JS value into an Onyx value, if possible.

Value.type

Returns the type of the JS value. Similar to typeof in JS, but it has sensible semantics.

Value.call

Calls a method on an object, passing the object as the this argument. x->call("y", "z") is equivalent to x.y("z") in JS.

Value.invoke

Invokes a function, passing null as the this argument. x->invoke("y") is equivalent to x("y") in JS.

Value.delete

Invokes the delete operator from JS on the property of the object.

Value.new

Invokes the new operator on the value.

Value.get

x->get("y") is equivalent to writing x.y in JS.

Value.set

x->set("y", 123) is equivalent to writing x["y"] = 123 in JS.

Value.length

Shorthand for x->get("length")->as_int() ?? 0, since this operation is so common.

Value.index

x->index(y) is equivalent to writing x[y] in JS.

Value.instance_of

x->instance_of(y) is equivalent to writing x instanceof y in JS.

Value.equals

Returns true if two values are equal.

Value.is_null

Returns if the value contained is null.

Value.is_undefined

Returns if the value contained is undefined.

Value.is_nan

Returns if the value contained is NaN.

Value.truthy

Returns true if the value is considered "truthy" under JS's semantics.

Value.leak

Removes the value from the tracked pool of objects, so it will not automatically be freed.

Value.release

Frees the JS value being stored. After calling this the value should not be used anymore.
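
To tie a few of these operations together, here is a small sketch (it assumes a JS runtime, compiled with -r js, and that as_str returns an optional string in the same way as_int returns an optional integer):

use core.js
use core {println}

main :: () {
    // Create {} in JS, set a property on it, then read the property back.
    obj := js.Value.new_object();
    obj->set("greeting", "hello");

    greeting := obj->get("greeting")->as_str() ?? "";
    println(greeting);
}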

Defining your own JS module

This documentation will be coming soon!