Disclaimer
This documentation is incomplete and not guaranteed to be up to date.
Introduction
This is a high-level overview of some of the features of the Onyx programming language. A basic knowledge of programming and computer systems is assumed. This documentation is not designed to be read top-to-bottom, so feel free to jump around as it makes sense. Most of the examples can be copied into the main procedure on Onyx Playground.
Hello, Onyx!
The following is the famous "Hello, World!" program, implemented in Onyx.
use core {*}
main :: () {
println("Hello, World!");
}
Running Onyx
Once your program is saved to hello.onyx, you can compile and run it:
onyx run hello.onyx
Compiling Onyx
You can also compile Onyx to a WebAssembly binary, and run it later:
onyx build hello.onyx -o hello.wasm
onyx run hello.wasm
Philosophy
This section covers some of the high-level design decisions and trade-offs made in the Onyx programming language.
Design Decisions
Preface
The design decisions that shaped Onyx were made over the course of several years and tended to adapt to my preferred programming style over that time. I always aimed to keep Onyx's features relatively orthogonal, but there are some overlapping features that target different styles of programming.
Imperative vs Functional
Onyx is an imperative language. You write a sequence of statements that should be executed in the specified order to evaluate your program. This is my preferred style of programming, so it is the style Onyx is built around.
However, I do enjoy the simplicity of a functional language. The idea of expressing a computation at a higher level appeals to me. Instead of writing a series of for-loops, you express what you want to happen rather than how it should happen.
For this reason, Onyx does have functional-inspired features that make that style of programming accessible. The two features that really make this possible are the pipe operator and quick procedures.
Here is an example of using them together with the core.iter library to express the computation: sum the squares of the first 5 numbers in a sequence.
use core {iter, println}
main :: () {
sequence := i32.[5, 2, 4, 9, 29, 8, 2, 8, 3];
iter.as_iter(sequence) // Make the iterator
|> iter.take(5) // Only take the first 5 elements
|> iter.map(x => x * x) // Square each result
// Sum the squares with a fold operation
|> iter.fold(0, (x, y) => x + y)
|> println(); // Print it to the screen
}
While Onyx is largely an imperative language, there are many places where expressing your code in a more functional way like this can actually help readability.
For completeness, here is the same code written in an imperative style.
use core {println}
main :: () {
sequence := i32.[5, 2, 4, 9, 29, 8, 2, 8, 3];
sum := 0;
for value in sequence[0 .. 5] {
square := value * value;
sum += square;
}
println(sum);
}
Each developer can choose their own style in Onyx; I want the language to support both.
Why the ::?
This was inspired by Jai and Odin. It means there is a compile-time constant binding between something (a procedure, struct, union, number, etc.) and a symbol.
Here are some examples:
A_String :: "A compile-time string"
A_Number :: 42
A_Struct :: struct { }
A_Union :: union { }
A_Procedure :: () { }
This syntax might look strange at first, but it actually simplifies things quite a bit.
Notice how every kind of definition looks the same. It's always name :: thing.
This means there is no longer a difference between things that are anonymous and things that are nominal. If you want to write an anonymous procedure, you simply leave the binding off of it. This is a silly example, because without a binding there is no way to reference and call this procedure, but it does compile.
// No named procedure
(x: i32, y: i32) -> i32 {
return x + y;
}
The colon is actually relatively special in Onyx. Anywhere there is a :, a new symbol is being declared. To find (almost) all symbol declarations in a file, you can use the regular expression:
[a-zA-Z0-9]+\s?:
Note, the only exception to this rule is quick procedures, whose syntax does not use the colon, for the sake of being as terse as possible.
Semi-colons
To many, semi-colons are (or at least should be) a thing of the past. While I don't entirely disagree, Onyx currently does require them at the end of every statement. This is because of a larger trade-off: Onyx is whitespace agnostic. You can remove any whitespace that is not between a keyword and a symbol, and the program will continue to work.
This might not seem that important, but it is part of a larger goal to keep the Onyx language as unopinionated as possible. You should be able to space out and format your code as you please, without the compiler getting in the way. While good style should obviously be used, I don't believe it is Onyx's place to enforce style. After more Onyx code exists, it might be worth creating something like onyx fmt, akin to go fmt, but in the meantime that is not a priority.
You might think, "Why not use newlines as semi-colons?" This is a good point and something I have looked into. There are several features in Onyx that make this a little tricky and would force you to write code in a particular way.
For example, if/else expressions do not work well like this. Here is some code that is ambiguous without semi-colons.
x := foo()
if x == 5 {
// ...
}
// Could be interpreted as this, which would not compile.
x := foo() if x == 5 {
// ...
}
You might say that since if is on a new line, it shouldn't join with the previous line. That would work, but then you would have to write if/else expressions on the same line (or at least the if part).
x := foo() if condition
else otherwise
This might be a worthwhile trade-off in the future, but that is to be decided later.
Why explicitly overloaded procedures?
Onyx uses explicitly overloaded procedures, over the more "traditional" implicitly overloaded procedures. In my experience, implicitly overloaded procedures sound like a good idea until there are many overloads with complicated types that could be ambiguous. See SFINAE as an example of what I am talking about.
To avoid this, Onyx's overloaded procedures must be explicitly declared and explicitly overloaded, and there is a defined order in which overloads are checked. It does result in slightly more verbose syntax and a little more planning, but it simplifies things for the code writer, the code reader, and the compiler writer. I believe it is a win-win-win.
Why WebAssembly?
WebAssembly's (very condensed) History
WebAssembly (WASM) is a new bytecode format that has become one of the largest misnomers in the computing space. While WASM started on the Web, it quickly found uses outside of the web browser. Since it is a platform and architecture independent bytecode, it can be used in much the same way as the JVM, Erlang's BEAM, or the .NET CLR. The thing that makes WASM so appealing is that the bytecode format is very simple and unopinionated, while the other bytecode options are very tied to the programming languages they run. WASM is meant to be a compilation target for every language.
WASM by design is sandboxed and safe to execute on any system. In order for a WASM binary to do anything, it must import functions from the host environment. In the browser, these would be defined in JavaScript. Outside of the browser, they have to be defined by the WASM runner. To prevent everyone from making their own standard, the WebAssembly Systems Interface (WASI) was made to cover most common use cases, like file operations and working with standard input/output.
WASI was a great step to get WASM out of the browser, but it does leave much to be desired. For example, at the time of writing it does not support networking, which makes writing a whole class of useful programs impossible. To fix this, Wasmer created WASIX, or an extended WASI specification, that fills the gaps in the WASI specification.
Note, Onyx fully supports WASIX by compiling with -r wasi -DWASIX.
There is work being done to create the WebAssembly Component Model, which is a way for programs written in a variety of different languages to all interoperate with one another, much like how programs from Java, Kotlin, and Scala can interact because they all run on the JVM. This proposal is nearing completion, but Onyx is waiting until there are more languages implementing it to see how all of the details shake out. It is on the roadmap for Onyx to support it.
Why choose WebAssembly?
While WASM is great for its purpose, its purpose does seem a little niche. Why compile to WASM when you could just compile to native machine code? Why target WASM directly when you could target LLVM, and then get WASM for free, plus all other platforms?
I will preface this by saying that WASM and Onyx are not for everyone's use case. While I hope to see WASM (and Onyx) used in more places, it is not meant to replace everything.
Onyx targets WASM for the following reasons:
- WASM has a strong future in cross-platform deployment. WASM is already being used as an alternative to Docker containers in serverless deployments. WASM is also being used as a plugin system in editors and game engines. Almost every non-embedded system has some form of WASM capability because WASM is provided by all modern browsers.
- WASM is safe. With its sandbox, explicit imports, and explicit permissions, WASM and WASI are much safer for the end-user when compared to native binaries and other bytecodes.
- WASM is fast. WASM is simple to compile to, resulting in very fast compilation. WASM is translated to native machine instructions on every platform, resulting in very high performance as well. There are even projects that can do this compilation ahead of time, so they can truly compile a WASM binary into a native binary.
- WASM is easy. Onyx is not my full-time job. I do not have enough time or patience to work with LLVM. While it can produce great results and is an industry-leading technology for good reason, it is not known to be easy to work with. Also, targeting machine code directly would be just as hard and probably more time-consuming.
- WASM is inconsequential. While counter-intuitive, the fact that Onyx compiles to WASM is mostly transparent to the end user. When using onyx run, Onyx feels like using a scripting language, because the WASM details are hidden from the programmer. In production cases where the end-user does not have Onyx installed, see the above bullet point.
While WASM might not be the right choice for your project, Onyx only aims to provide a great developer experience for projects that want to use WASM.
Why use Onyx?
Onyx is a WebAssembly first language. Onyx aims to make it as easy as possible to start working with WebAssembly. For that reason, Onyx is very well suited for the niche kinds of projects that require using WebAssembly. WebAssembly is growing in popularity outside of the browser because of projects like Wasmer, Wasmtime, and WasmEdge that make it easy to run WebAssembly in a controlled environment. These "controlled environments" could be game engines, where WASM is used as a "script" system; cloud functions, where WASM is used to respond to requests; plug-in systems for editors or tools.
For more details, see the section on Why WebAssembly?.
Why not use Onyx?
Due to the tradeoffs and choices Onyx has made, Onyx is not suited for every use-case. For that reason, I don't expect Onyx to take off in the same way that Rust or Go took off.
There are many kinds of projects where Onyx will never be able to be used, and that's okay. I only want Onyx to be great for the projects that can use Onyx and WebAssembly. Some projects that Onyx would not be suited for would be:
- Very performance critical desktop applications
- Native libraries, though Wasmer does have a way to do this
- Embedded environments
To drive the point home, there will likely never be a rewrite it in Onyx trend like there is with Rust. Onyx is not aiming to replace Rust, Go, Zig, C++, or whatever your favorite language is. Onyx is a new language, filling the rather niche purpose of supporting WebAssembly above all else. I do not see WebAssembly as a limitation of Onyx; rather, I see Onyx pushing the boundaries of WebAssembly.
Onyx's runtime
One interesting point to make is that the onyx toolchain bundles a WebAssembly runner. This means that when developing in Onyx, it will feel just like you are developing in NodeJS or Python. You run your program with onyx run, just like you would run node or python.
The fact that Onyx compiles to WebAssembly only matters when you are trying to ship your project.
For that, it is possible (but undocumented) to compile a standalone executable of your project
that bundles your WASM code and the runtime. Other than a slightly slower startup time, it
feels and acts just like a native executable.
Memory Management in Onyx
Onyx has manually managed memory. Now, you're probably thinking of C, where you have to be careful that every malloc has a matching free. Onyx's memory management has some aspect of that, but Onyx has many things that make it much easier to not make mistakes.
Why manually managed memory?
There are realistically two alternatives, both of which Onyx chooses not to do.
- Garbage collection
- Borrow checking semantics
Garbage collection is not really possible in WASM, due to the way WASM is run. There is not a way to have an external source stop execution and do a garbage collection pass.
You might be thinking, but what about WASM GC? WASM GC is not a direct solution for this. WASM GC is a major change affecting the entire programming paradigm in WASM. Instead of structures and compound types being stored in linear memory, they are stored as references to external objects and are operated on with an entirely new class of instructions. They can be garbage collected because the external runtime can see everywhere they are stored, since they aren't stored in linear memory. Onyx may support WASM GC in the future, but it is not a priority item.
Borrow checking semantics are definitely a viable route to do memory management in WASM. Rust is as popular as it is for a good reason, and the community has made some incredible things in it. However, for Onyx, I did not want to have to "fight" the borrow checker. I wanted to write code how I wanted to write code. There are many projects I work on that I know have memory leaks, but I don't care. I'm just experimenting and want to get something working before I do a final pass. Rust is great for that final pass, and I have nothing against people who use and love Rust. For Onyx, I wanted to have more control because I feel like managing memory is not as hard as people make it out to be; especially when you have the right tools.
The right tool: Custom Allocators
Like Zig and Odin, Onyx has full support for creating custom allocators.
Everything that works with memory must explicitly use an Allocator. All core data structures and functions that allocate memory have the option to specify which Allocator to use.
If you've never programmed with custom allocators, this might seem a little weird or complicated, but it actually simplifies programming and memory management.
One thing to realize is that most allocations you make have a well-defined lifetime, or time in which they can be accessed. Sometimes that lifetime can be hard to describe to something like a borrow checker, but lifetimes break down into four categories:
- Very short term: Allocations that likely only live to end of a function.
- Short term: Allocations that live until the end of the main loop.
- Infinite: Allocations that will never be freed.
- Actually manually managed: Allocations you actually have to think about when they are freed.
For very short term allocations, Onyx has the defer keyword.
Deferred statements or blocks are executed right before the current block or function exits.
They are convenient because you can place the freeing call right next to the allocation call,
so it is easy to keep track of. Also, it prevents accidentally forgetting to free something
because you added an early return to the function.
Note, the defer keyword is nothing new, and is present in many popular programming languages today.
For short term allocations, Onyx has the temporary allocator. The temporary allocator is a thread-local allocate-only allocator. It simply grows an arena of memory as you allocate from it. You cannot free from the temporary allocator. Instead, you free everything in the temporary allocator, all at once, by calling core.alloc.clear_temp_allocator(). This is generally done at the end or beginning of the main loop of the program.
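Here is a minimal sketch of that pattern, assuming (as its name suggests) that tprintf allocates its result from the temporary allocator:
use core {alloc, printf, tprintf}
main :: () {
    // Stand-in for a program's main loop.
    for frame in 0 .. 3 {
        // The formatted string lives in temporary memory, so it is only
        // valid until the temporary allocator is cleared.
        message := tprintf("frame number {}", frame);
        printf("{}\n", message);
        // Free everything in the temporary allocator at once,
        // typically at the beginning or end of each iteration.
        alloc.clear_temp_allocator();
    }
}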
Infinitely living allocations are the easiest. Just don't free it.
I would estimate that only about 5% of the time do you actually have to think about how long an allocation has to live. For that 5%, it might take a little planning and debugging before your memory management is working entirely correctly, but I believe that is worth not fighting a borrow checker.
Why should you manage memory?
This may be a hot take, but for many programs, you don't need to manage memory. If your program is only going to do a task and then exit, it does not need to manage its memory.
Take the Onyx compiler as an example. It allocates a lot of memory, but since all of that memory is needed until the moment it exits, there is no point in freeing any of it. The operating system will take care of reclaiming the pages when the compiler exits.
I would argue the only time you need to do memory management is when you have a program that is going to run for a long time. Things like games, web servers, graphical applications, etc. In all of these cases, there is a central main loop that drives the program. This main loop creates a great natural boundary for when certain things can be freed.
The temporary allocator exists for this purpose. You allocate into the temporary allocator when you have something that should be valid for this loop, but should not live past the end of the loop.
The HTTP Server package for Onyx uses this strategy, but even more aggressively. It replaces the main allocator (aka context.allocator, which is used by default throughout the standard library) with a GC allocator. This allocator tracks every allocation made in it and can free everything in a single call. Every request handler uses this allocator, so you can intentionally forget to free everything, and the GC allocator will automatically free it all when the request is processed.
Literals
Boolean Literals
Onyx contains the standard boolean literals: true and false. They must be spelled all-lowercase as they are actually just global symbols. This means that if you are feeling particularly evil, you could change what true and false mean in a particular scope, but I highly recommend you don't.
Numeric Literals
Onyx contains the following numeric literals:
123 // Standard integers
0x10 // Hexadecimal integers
4.0 // Floating point
2.3f // Floating point, of type f32.
'a' // Character literals, of type u8
Integer literals are special in that they are "un-typed" until they are used. When used, they will become whatever type is needed, provided that there is no loss of precision when converting. Here are some examples:
x: i8 = 10;
y := x + 100; // 100 will be of type i8, and that is okay because
// 100 is in the range of 2's-complement signed
// 8-bit numbers.
x: i8 = 10;
y := x + 1000; // This will not work, as 1000 does not fit into
// an 8-bit number. This will result in a compile
// time error.
x: f32 = 10.0f;
y := x + 100; // 100 will be of type f32. This will work, as 100
// fits into the mantissa of a 32-bit floating
// point number, meaning that there is no loss
// of precision.
Character Literals
Character literals are written in the following way.
'a'
Note, Onyx used to have #char "a" because the single-quotation character was being reserved for some other use. That other use did not appear in 3 years of development, so the single-quotation mark was given up to serve as a character literal.
String Literals
Onyx contains the following string-like literals:
"Hello!" // Standard string literals, of type 'str'.
#cstr "World" // C-String literals, of type 'cstr'.
""" // A multi-line string literal, of type 'str'.
Multi // Note that the data for the multi-line literal
line // starts right after the third quote, so technically
string // all of these "comments" would actually be part of the
literal // literal.
"""
In Onyx, there are 3 string types: str, cstr, and dyn_str. cstr is analogous to a char * in C. It is a string represented as a pointer to an array of bytes that is expected to end in a '\0' byte. Onyx has this type for compatibility with some C libraries.
Most Onyx programs solely use str, as it is safer and more useful. A str is implemented as a 2-element structure, with a pointer to the data and an integer count. This is safer, as a null-terminator is not necessary, so a buffer-overflow is much harder. To convert a cstr to a str, use string.from_cstr.
dyn_str, or dynamic string, is a string type that allows for modifying the string by appending to it. It is implemented as a dynamic array of u8, so any array function will work with it. To make more idiomatic and readable code, the core.string package also has functions for working with dynamic strings, such as append and insert.
Built-in constants
null // Represents an empty pointer
null_proc // Represents an empty function pointer
You may be wondering why there is a separate value for an empty function pointer. This is due to the more secure runtime of Onyx compared to C. In WebAssembly (Onyx's compile target), functions are completely separated from data. Function references are not pointers; they are indices. For this reason, there are two different values that represent "nothing" for their respective contexts.
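As a small illustrative sketch (the pointer and procedure types here are arbitrary):
use core {println}
main :: () {
    // A data pointer uses null as its empty value...
    data_ptr: &i32 = null;
    // ...while a function pointer uses null_proc.
    func_ptr: () -> void = null_proc;
    // Pointers implicitly convert to booleans (see the Operators chapter),
    // so a null pointer tests as false.
    if !data_ptr do println("data_ptr is empty");
}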
Declarations
Declaring variables in Onyx is very similar to declaring variables in many other modern programming languages. A single colon (:) is used to declare a variable, optionally followed by its type and/or the initial value for the variable.
<variable name>(, <variable name>)* : <declared type> = <initial value> ;
Inferred Types
If the type of the initial value can be determined, then the declared type of the variable is optional, and it will be inferred from the type of the initial value.
Examples
Here we declare a variable called x of type i32. It is guaranteed that x will be initialized to 0 here.
x: i32;
Here we declare a variable y explicitly as type i32, with an initial value of 10.
y: i32 = 10;
Here we declare a variable z with an inferred type. Since the declared type was omitted, it will copy the type of the initial value. In the absence of other type information, the literal 10 has type i32, so z will be of type i32.
z := 10;
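Following the grammar above, multiple names can also be declared in one statement. Here is a small sketch (the names are purely illustrative):
use core {println}
main :: () {
    // Two names declared with one type. Both are zero-initialized,
    // just like a single declaration without an initial value.
    width, height: i32;
    width = 1280;
    height = 720;
    println(width * height);
}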
Blocks
There are 3 ways of expressing a block of code in Onyx, depending on the number of statements in the block.
Multi-statement Blocks
The first way is to use curly-braces ({}) to surround the statements in the block, with statements being delimited by a semi-colon.
{
stmt1;
stmt2;
// ...
}
Single-statement Blocks
The second way is to place the do keyword before the statement to create a single-statement block. This is required in if, while, and for statements. You can of course write { stmt; } instead of do stmt; if you prefer.
do stmt;
// More commonly
if some_condition do some_stmt;
Zero-statement Blocks
The third and final way is a little redundant, but it's in the language because it can be appealing to some people. When there needs to be a block, but no statements are needed, three dashes, ---, can be used as an equivalent to {}.
if condition ---
switch value {
case 1 ---
// ...
}
Bindings
Bindings are a central concept in Onyx. A binding declares that a certain symbol is bound to something in a scope. This "something" can be any compile-time-known object. Here is a non-exhaustive list of some of the compile-time-known objects:
- procedures
- macros
- structs
- enums
- packages
- constant literals
Syntax
A binding is written in the following way:
symbol_name :: value
This says that symbol_name will mean the same thing as value in the scope that it was declared in. Normally, value is something like a procedure, structure, enum, or some other compile-time-known object. However, there is nothing wrong with re-binding a symbol to give it an alternate name.
Note, the ability to alias symbols to other symbols has an interesting consequence. Since names are not inherently part of a procedure or type definition, a procedure or type can have multiple names.
f :: () { ... }
g :: f
Notice that the procedure defined here can be called g or f. When reporting errors, the compiler will use the name that was originally bound to the procedure (f).
Use as constants
Onyx does not have a way to specify constant variables. When you declare a variable with :=, it is always modifiable. While constants are very useful, Onyx would suffer from the same problem that C and C++ have with constants: they aren't necessarily constant. You can take a pointer to a constant and use that to modify it. Onyx does not want to make false promises about how constant something is.
That being said, bindings can serve as compile-time constants. You can declare a binding to a constant literal, or something that can be reduced to a constant literal at compile time. Here are some examples.
A_CONSTANT_INTEGER :: 10
A_CONSTANT_FLOAT :: 12.34
A_CONSTANT_STRING :: "a string"
// Since A_CONSTANT_STRING.length and A_CONSTANT_INTEGER are compile-time known
// the addition can happen at compile-time.
A_CONSTANT_COMPUTED_INTEGER :: A_CONSTANT_STRING.length + A_CONSTANT_INTEGER
Targeted Bindings
Bindings can also be placed into a scope other than the current package/file scope. Simply prefix the binding name with the name of the target scope, followed by a dot (.).
Foo :: struct {}
// `bar` is bound inside of the `Foo` structure.
Foo.bar :: () {
}
Here the bar procedure is placed inside of the Foo structure. This makes it accessible using Foo.bar. When combined with the method call operator, methods can be defined on types in Onyx in a similar manner to Go.
The target scope does not have to be a structure, however; it can also be a package, union, enum, or #distinct type.
Using targeted bindings is very common in many Onyx programs, because it allows for defining procedures that are associated with a type directly on the type. This makes them easier to find, and able to be used by the method call operator.
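For example, here is a small sketch that defines a procedure directly on a struct and calls it as a method (the Vec2 type and sum procedure are purely illustrative):
use core {println}
Vec2 :: struct { x, y: i32; }
// `sum` is bound inside the Vec2 scope, so it is accessed as Vec2.sum.
Vec2.sum :: (v: Vec2) -> i32 {
    return v.x + v.y;
}
main :: () {
    v := Vec2.{3, 4};
    // The method call operator passes v as the first argument.
    println(v->sum()); // prints 7
}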
Program Structure
An important problem that every programming language tackles in a different way is: how do you structure larger, multi-file programs?
While each way of tackling this problem has its own advantages and disadvantages, Onyx takes a relatively simple approach. The following are the core principles:
- No incremental compilation.
- Divide files into packages.
- Dissociate the package hierarchy from the folder hierarchy.
No incremental compilation
Onyx does not do incremental compilation. Everything is recompiled, from scratch, every time. This may seem like a drawback, not a feature, but it simplifies the development process immensely.
Onyx has a number of features that could not be partially compiled in any reasonable way: macros, runtime type information, and static-if statements, to name just a few. Instead of shoehorning a solution for this into the compiler, Onyx simply avoids partial/incremental compilation.
Onyx's compiler is very fast. While no incredibly large programs are written in Onyx yet, a simple calculation shows that the compiler could theoretically compile 100-200 thousand lines per second in a larger project. For this reason, incremental compilation is not necessary, as your project will compile almost instantly, no matter its size.
Note, Onyx's compiler is currently still single-threaded. If and when this issue is addressed and a multi-threaded compilation model is achieved, it is not impossible to reach over one-million lines per second.
One large downside of partial compilation is the need to worry about, "Did my whole project get recompiled and updated? Or am I still testing old code?" With a poorly configured build system, it is quite easy to cause this issue, which can lead to hours of frustrating debugging. This is another reason Onyx avoids partial compilation. You know for a fact that every time you compile, it is a fresh build.
Divide files into packages
In Onyx, every source file is part of a package. The package a file is part of is declared on the first source line of the file.
// There can be comments before the package declaration.
package foo
func :: () {}
Struct :: struct {}
The above file declares that it is part of the foo package. All symbols declared public in this file (func and Struct) are placed in the public scope of foo.
When another file wants to use these symbols, all it has to do is use foo. Then it can use foo to access things inside of the foo package.
package main
use foo
main :: () {
foo.func();
}
Note, see more about this in the Packages section.
Dissociate the package hierarchy from the file hierarchy
Unlike in many other languages, Onyx does not enforce parity between the hierarchy of files on disk and the hierarchy of packages in the program. Any file can be part of any package. While this does come at a readability loss, it offers greater flexibility to the programmer.
TODO Explain why this is a good thing, because it is, trust me.
Loading Files
When the source code for a project is split across multiple files, the Onyx compiler needs to be told where all of these files are, so it knows to load them. This can be done in a couple different ways.
Using the CLI
When running onyx run or onyx build, a list of files is provided. All of these files will be loaded into the program, in their respective packages. This can be a reasonable way of doing things for a small project, but quickly becomes unwieldy.
$ onyx build source1.onyx source2.onyx source3.onyx ...
Using #load directives
The idiomatic way of loading files into a program is using the #load directive. The #load directive is followed by a compile-time string as the file name, and tells the compiler to load that file. The given file name is used to search relative to the path of the file that contains the #load directive.
The file name can also be of the form "mapped_directory:filename". In this case, the file will be searched for in the given mapped directory. By default, there is only one mapped directory named core, and it is set to $ONYX_PATH/core. Other mapped directories can be set using the command-line argument --map-dir.
Note, the compiler automatically caches the full path to every file loaded, so no file can be loaded more than once, even if multiple #load directives would load it.
// Load file_a from the same directory as the current file.
#load "file_a"
// Load file_b from the mapped directory 'foo'.
#load "foo:file_b"
Using #load_all
Sometimes, every file in a directory needs to be loaded.
To make this less tedious, Onyx has the #load_all directive. Much like #load, it takes a compile-time string as the path relative to the current file, and it will load every .onyx file in that path, but it does not perform a recursive search. Any sub-directories will need a separate #load_all directive.
#load_all "/path/to/include"
#load_all "/path/to/include/subdirectory"
Packages
Onyx has a relatively simple code organization system that is similar to other languages.
To organize code, Onyx has packages.
A package is a collection of files that define the public and private symbols of the package. To use the symbols inside of a package, each file must declare that it is using the package. This is done with the use keyword.
Let's say we have two source files, foo.onyx and bar.onyx. They are part of the packages foo and bar respectively. If bar.onyx wants to use some code from foo.onyx, it has to use foo before it can do so. This makes the foo package accessible inside of bar.onyx.
// foo.onyx
// Declares that this file is part of the "foo" package.
package foo
some_interesting_code :: () {
// ...
}
// bar.onyx
// Declares that this file is part of the "bar" package.
package bar
use foo
func :: () {
foo.some_interesting_code();
}
It is important to note that while it may be a good idea to organize your source files into directories that correspond to the packages they are in, there is no limitation as to which files can be part of which packages. Any file can declare that it is part of any package. There may be a future language feature to optionally limit this.
Scoping
While controversial, Onyx has opted to have a public by default symbol system. Unless marked otherwise, all symbols inside a package are accessible from outside that package. If there are implementation details to hide, they can be scoped to either the current package or the current file.
Use #package before a binding to declare that the binding is internal to the package. Any file that is part of the same package can see the symbol, but external files cannot.
package foo
public_interface :: () {
internal_details();
}
#package
internal_details :: () {
// ...
}
Use #local before a binding to declare that the binding is internal to the file. Only things in the same file can see the symbol.
package foo
public_interface :: () {
super_internal_details();
}
#local
super_internal_details :: () {
// ...
}
Note, while Onyx is white-space agnostic, it is common to write the #package and #local directives on a separate line before the binding.
If you have a large set of implementation details, it might be more readable to use the block version of #local and #package.
public_interface :: () {
}
#local {
lots :: () {
}
of :: () {
}
internal :: struct {
}
details :: enum {
}
}
Notable packages
There are several package names that have been taken by the Onyx standard library and/or have a special use.
builtin package
The builtin package is notable because it is required by the compiler to exist. It contains several definitions needed by the compiler, for example Optional and Iterator. These definitions live in core/builtin.onyx, which is a slight misnomer because the builtin package is separate from the core module.
builtin is also special because its public scope is mapped to the global scope of the program. This means that anything defined in the package is available without using any packages.
runtime package
The runtime package is required by the compiler to exist, as the compiler places several variables in it related to the current operating system and selected runtime.
runtime.vars package
runtime.vars is used for configuration variables. It is the "dumping ground" for symbols defined on the command line with the -D option. Use #defined to tell if a symbol was defined or not.
use runtime
// Compile with -DENABLE_FEATURE to enable the feature
Feature_Enabled :: #defined(runtime.vars.ENABLE_FEATURE)
runtime.platform package
runtime.platform is an abstraction layer used by the core libraries that handles interacting with OS/environment things, such as reading from files and outputting a string.
core package
The core package houses all of the standard library. The way the Onyx packages are structured, the compiler does not know anything about the core package. If someone wanted to, they could replace the entire core library and the compiler would not be affected.
main package
The main package is the default package every file is a part of if no package declaration is made. The standard library expects the main package to have a main procedure that represents the start of execution. It must be of type () -> void or ([] cstr) -> void.
If there is no entrypoint in the program because it is a library, simply use -Dno_entrypoint when compiling, or define a dummy main with no body.
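As a minimal sketch of the second signature, assuming string.from_cstr (mentioned in the section on string literals) for converting each argument:
use core {println, string}
// A main procedure matching the ([] cstr) -> void form,
// which receives the command-line arguments.
main :: (args: [] cstr) {
    for arg in args {
        println(string.from_cstr(arg));
    }
}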
Use declarations
When a file wants to use code from another package, that package must be explicitly used. This is done with the use declaration. A use declaration binds one or more symbols in the current scope to items in another package. If a use declaration is done at the top-level, the bindings are applied at file scope.
Simple case
The simplest use declaration looks like this.
package foo
use bar
func :: () {
bar.something_interesting();
}
This use declaration says, "bind bar to be the package named bar." This allows func to use bar in its body.
Selective case
To bind to a symbol inside of a package, this syntax can be used.
package foo
use bar {something_interesting}
func :: () {
something_interesting();
}
This use declaration extracts the symbol something_interesting and binds a symbol of the same name to it in the current file.
Avoiding name conflicts
Sometimes, you want to rename a symbol due to a name conflict, or just to shorten your code. You can do so like this.
package foo
use bar {
SI :: something_interesting
}
func :: () {
SI();
}
This use declaration extracts the symbol something_interesting, then binds it to SI in the current file.
Use all
If you want to bring all symbols inside of a package into the current file, say use p {*}.
package foo
use bar { * }
func :: () {
something_interesting();
}
Use all and package
If you want to bind the package itself, and all symbols inside of the package, simply place the package keyword inside of the {}.
package foo
use bar { package, * }
func :: () {
something_interesting();
// OR
bar.something_interesting();
}
Changing package name
If you want to bind the package itself to a different name, provide the alias like in the previous example.
package foo
use bar { B :: package }
func :: () {
B.something_interesting();
}
Operators
Onyx boasts the typical operators found in any C-inspired programming language, which this section describes briefly.
Math operators
Onyx has the following standard binary operators:
Operator | Use | Works on |
---|---|---|
+ | Addition | integers, floats, multi-pointers |
- | Subtraction | integers, floats, multi-pointers |
* | Multiplication | integers, floats |
/ | Division | integers, floats |
% | Modulo | integers |
Onyx also has the standard unary operators:
Operator | Use | Works on |
---|---|---|
- | Negation | integers, floats |
Comparison operators
Operator | Use | Works on |
---|---|---|
== | Equals | booleans, integers, floats, pointers |
!= | Not-equals | booleans, integers, floats, pointers |
> | Greater-than | integers, floats |
< | Less-than | integers, floats |
>= | Greater-than or equals | integers, floats |
<= | Less-than or equals | integers, floats |
Boolean operators
Onyx has the following binary boolean operators:
Operator | Use | Works on |
---|---|---|
&& | And | booleans |
|| | Or | booleans |
Onyx has the following unary boolean operator:
Operator | Use | Works on |
---|---|---|
! | Not | booleans |
Implicit Boolean Conversions
In certain circumstances where a boolean value is expected, Onyx will implicitly convert the value to a boolean in the following ways:
- Pointers: If the pointer is non-null, it is true. If it is null, it is false.
- Array-like: If the array is empty, it is false. If it is non-empty, it is true.
- Optionals: If the optional has a value, it is true. Otherwise, it is false.
As an escape-hatch for library writers, it is possible to make anything implicitly cast to bool by overloading the builtin procedure __implicit_bool_cast. Here is an example of making a custom structure cast to bool implicitly.
Person :: struct {
age: u32;
name: str;
}
#overload
__implicit_bool_cast :: (p: Person) -> bool {
return p.age > 0 && p.name;
}
main :: () {
p1 := Person.{};
p2 := Person.{42, "Joe"};
if !p1 { println("p1 is false"); }
if p2 { println("p2 is true"); }
}
Pointer operators
Pointers in Onyx have the following unary operators:
Operator | Use | Works on |
---|---|---|
& , ^ | Address-Of | any addressable value |
* | Dereference | pointers |
Note, ^ is being phased out in favor of & for taking the address of something.
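A short sketch of both operators together:
use core {println}
main :: () {
    x := 10;
    p := &x;    // Take the address of x, giving a &i32.
    *p = 20;    // Dereference the pointer to modify x.
    println(x); // Prints 20
}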
Bitwise operators
Onyx has the following binary bitwise operators:
Operator | Use | Works on |
---|---|---|
& | Bitwise-And | integers |
\| | Bitwise-Or | integers |
^ | Bitwise-Xor | integers |
<< | Bit shift left | integers |
>> | Bit shift right (logical) | integers |
>>> | Bit shift right (arithmetic) | integers |
Onyx has the following unary bitwise operators:
Operator | Use | Works on |
---|---|---|
~ | Bitwise-Negate | integers |
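A small example of these operators (the printed values assume the default 32-bit signed integer type):
use core {printf}
main :: () {
    x := 0x0F;
    y := 0xF0;
    printf("{}\n", x & y);  // 0
    printf("{}\n", x | y);  // 255
    printf("{}\n", x ^ y);  // 255
    printf("{}\n", x << 4); // 240
    printf("{}\n", ~x);     // -16
}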
Try/Coalesce operators
Onyx has two special operators that are not given any intrinsic meaning by the compiler.
Their use is entirely defined within the standard library.
They are the try (?) and the coalesce (??) operators.
Try operator (?)
The try operator is a postfix operator that can occur anywhere in an expression.
Currently, the try operator is only used by the Optional and Result types. While not enforced by the compiler, the try operator generally acts as an early escape from a procedure. For Optional, if no value is present, an empty value is returned from the enclosing procedure. For Result, if an error value is present, that error is returned from the enclosing procedure.
Here is an example of using the try operator on an Optional.
use core
first :: (arr: [] $T) -> ? T {
if !arr do return .{};
return arr[0];
}
double :: (v: $T) -> T {
return v * 2;
}
compute :: (arr: [] $T) -> ? T {
v := first(arr)?;
return double(v);
}
main :: () {
arr1 := i32.[ 2, 3, 5, 7, 11 ];
arr2 := i32.[];
compute(arr1) |> core.println();
compute(arr2) |> core.println();
}
Coalesce Operator (??)
The coalesce operator is a binary operator that returns the right-side value if the left-side is an empty value. This is defined for the Optional and Result types.
Here is an example of the coalesce operator with Optional.
use core
first :: (arr: [] $T) -> ? T {
if !arr do return .{};
return arr[0];
}
main :: () {
arr := i32.[];
// If first(arr) returns a None value, use 0 instead.
v := first(arr) ?? 0;
core.println(v); // Prints 0
}
Cast operators
When two values are not of the same type, one will need to be cast to the other's type. In Onyx, the cast keyword is used. It takes on the two forms described below.
Prefix form
main :: () {
x: i32 = 10;
y: f32 = 20.0f;
// Cast x to be a f32 so it can be added with y.
result := cast(f32) x + y;
println(result);
}
Here, a cast is needed to turn x into an f32 so it can be added with y. This cast is in prefix form, which simply means it appears as a prefix to the thing being cast, similar to unary negation (-).
Call form
BaseType :: struct {
name: str;
}
SubType :: struct {
use base: BaseType;
age: u32;
}
main :: () {
sub_value := SubType.{
age = 123
};
base_ptr: &BaseType = &sub_value;
age_value := cast(&SubType, base_ptr).age;
println(age_value);
}
In this contrived example, base_ptr is cast to a &SubType using the call form of the cast operator. This form is slightly nicer when you are going to immediately follow the cast operation with a postfix operation, in this case .age. If this was written in prefix form, another set of parentheses would be needed: (cast(&SubType) base_ptr).age.
Auto casting
Sometimes, a cast is necessary for the code to type check, but it is cumbersome to type the entire cast operation. Maybe the type is too long, or maybe the type is not even available because the package is not used. In these cases, the auto-cast operator can be used.
Auto-cast is written with a double-tilde (~~). This was chosen because there would be no purpose to performing a bitwise negation (~) twice in a row. To understand auto-cast, treat it like a cast(X), where the X is automatically figured out by the compiler. If the auto-cast is not possible, a compile-time error will be reported.
print_float :: (x: f32) {
printf("{}\n", x);
}
main :: () {
x := 10;
// Automatically cast x to an f32, since y is an f32.
y: f32 = ~~ x;
// Automatically cast x to an f32, since print_float expects an f32.
print_float(~~ x);
}
No "magic" casts
Onyx does not have "special" or "magic" casts between two completely unrelated types, such as strings and numbers.
Instead, conv.parse and tprintf should be used.
main :: () {
x := 10;
// Equivalent to `cast(str)` in some other languages
x_str := tprintf("{}", x);
// conv.parse parses a string into the type provided, and returns
// an optional of that type. So, a default must be provided using ??.
x_val := conv.parse(i32, x_str) ?? 0;
}
Procedure calls
Calling a procedure uses the traditional postfix () operator. Simply place () after the procedure you would like to call.
// This is discussed in chapter 6, Procedures.
foo :: () -> void {
// ...
}
main :: () -> void {
// Using () to call foo.
foo();
}
Passing Arguments
Arguments are passed in between the (), in a comma-separated fashion. The type of each argument must agree with the expected parameter type, or at least be of a compatible type.
foo :: (x: i32, y: str) -> void {
// ...
}
main :: () -> void {
foo(10, "Hello");
}
Named Arguments
Sometimes it is nicer for clarity to explicitly name the arguments being passed. This can be done by specifying the name, then an =, then the value. This specifies that the argument is for the parameter with the same name.
foo :: (x: i32, y: str) -> void {
// ...
}
main :: () -> void {
foo(x = 10, y = "Hello");
}
When the arguments are named, their order can be changed.
foo :: (x: i32, y: str) -> void {
// ...
}
main :: () -> void {
// The order does not matter here.
foo(y = "Hello", x = 10);
}
Named arguments are particularly useful when there are a lot of parameters with default values, and you want to modify a small number of them.
// This is a simple example of many defaulted arguments
foo :: (option_1 := true, option_2 := "./tmp", option_3 := 4) -> void {
// ...
}
main :: () -> void {
// Override a small number of the default values.
foo(
option_2 = "/something_else",
option_3 = 8
);
}
If/else operator
Onyx has one ternary operator for inline if-statements. Inspired by Python, it has this form.
true-stmt if condition else false-stmt
Here is a simple example of using it.
use core
main :: () {
value := 10;
x := 1 if value < 100 else 0;
core.println(x); // Prints 1
}
While this operator should be used sparingly for the sake of readable code, it can be very handy in certain circumstances.
Range operator
Ranges in Onyx represent an interval of numbers and are typically used in for-loops and for creating slices of a buffer.
The x .. y binary operator makes a half-open range, representing the set [x, y). For example, the range 1 .. 5 represents a range including 1, 2, 3, and 4, but not 5.
The x ..= y binary operator makes a fully-closed range, representing the set [x, y]. For example, the range 1 ..= 5 represents a range including 1, 2, 3, 4, and 5.
The type of these ranges will be either range or range64, depending on whether x and y are 32-bit or 64-bit integers.
For loop over integers
For-loops support iterating over a range.
// Prints 1 to 10
for x in 1 ..= 10 {
println(x);
}
Creating a slice
When you have a buffer of data, you can create a slice out of the data by subscripting it with something of type range. It could be a range literal, or any other value of type range.
buf: [1024] u8;
bytes_read := read_data(buf);
// Create a slice referencing the underlying buffer.
// (Nothing is copied in this operation.)
data_read := buf[0 .. bytes_read];
Pipe operator
The pipe (|>) operator is used as syntactic sugar when you want the result of one procedure call to be passed as the first argument to another call. This might sound contrived, but with a well-designed API it can happen often.
The pipe operator transforms the code as follows:
x |> f(y) into f(x, y)
As you can see, it simply takes the left-hand side and passes it as the first argument to the procedure call. The operator is left-associative, which simply means the parentheses are automatically inserted to allow for correct chaining of pipes.
Look at this simple API for doing (very simple) computations.
On the first line of main, there is an example of using this API with nested function calls. On the second line, there is the equivalent code written using the pipe operator.
add :: (x: i32, y: i32) -> i32 {
return x + y;
}
negate :: (x: i32) -> i32 {
return -x;
}
double :: (x: i32) -> i32 {
return x * 2;
}
main :: () {
println(double(add(negate(-5), 4)));
-5 |> negate() |> add(4) |> double() |> println();
}
Piping to other arguments
Sometimes the argument you want to pipe into is not in the first argument slot. When this happens, you can simply place a _ as the argument you want to pipe into. For example,
sub :: (x, y: i32) -> i32 {
return x - y;
}
main :: () {
5 |> sub(3) |> println(); // prints 2
5 |> sub(3, _) |> println(); // prints -2
}
This is very useful when piping to printing/formatting functions that require the format string to be the first argument.
main :: () {
// This example is a bit contrived, but imagine if '5' was
// a long concatenation of pipes.
5
|> logf(.Info, "The value is {}", _);
}
Iterators and Pipe
The core.iter package in Onyx uses the pipe operator heavily. The package is designed in a way that allows Iterator transformation functions to be easily chained.
For example, if you wanted to find the first 5 odd numbers greater than 100, you could write the following iterator.
my_numbers :=
iter.as_iter(0 .. 100000) // Convert the range to an iterator.
|> iter.skip_while(x => x < 100) // Skip numbers less than 100.
|> iter.filter(x => x % 2 == 1) // Filter for only odd numbers.
|> iter.take(5) // Only take the first 5 things.
|> iter.collect(); // Collect the results as an array.
This is a contrived example, but it shows how composable the iter package is, thanks to the pipe operator.
Method call operator
Onyx aims to support multiple styles of programming. The Pipe Operator section describes how a functional style of programming can be achieved. This section will describe how an Object-Oriented style of programming can be done.
The key behind this is the -> operator, also called the "method call operator". It can be understood as a simple shorthand for the following.
foo.method(foo, 123)
This can instead be written as the following.
foo->method(123);
Much like the pipe operator, it makes the left-hand side of the operator the first argument to the function call. However, unlike the pipe operator, it also resolves the function from within the scope of the left-hand side. It also automatically takes the address of the left-hand side if the method expects a pointer as the first argument. These features together make for a good approximation to an inheritance-less OOP programming model.
Object-Oriented Programming
Onyx is not an object-oriented language. It is a data-oriented language, where you should think about the way your data is structured when solving problems.
That does not mean that all object-oriented language features are bad. Sometimes, it is easier to think about something as an "object" with "things" that it can do. When that is the case, Onyx can help.
With the method-call operator (described above), you can write methods on structures and unions.
use core
Foo :: struct {
name: str;
// When a function is declared inside of a structure,
// it can be accessed under the struct's scope, i.e. `Foo.say_name`.
say_name :: (f: Foo) {
core.printf("My name is {}.\n", f.name);
}
}
main :: () {
foo := Foo.{ "Joe" };
foo->say_name();
}
Other ways of writing main above would be like so:
main :: () {
foo := Foo.{ "Joe" };
foo.say_name(foo); // Accessing on 'foo' will look to its types
// scope, in this case 'Foo', since 'foo' does
// not have a member named 'say_name'.
Foo.say_name(foo); // Explicit version as you would see in many
// other languages.
}
Sometimes you want to pass the "object" as a pointer to the method if the method is going to modify the object. As a convenience, the method call operator will do this automatically for you, if it is possible to take the address of the left-hand side. This may feel a little weird, but it is largely intuitive and similar to how many other languages work.
use core
Foo :: struct {
name: str;
say_name :: (f: Foo) {
core.printf("My name is {}\n", f.name);
}
// Entirely redundant method, but illustrates passing by pointer.
set_name :: (f: &Foo, name: str) {
// f can be modified here because it is passed by pointer.
f.name = name;
}
}
main :: () {
// Create a zero-initialized Foo.
foo: Foo;
// Call the set_name method
foo->set_name("Jane");
// Note that this is equivalent to the following (notice the &foo).
// foo.set_name(&foo, "Jane")
foo->say_name();
}
Virtual Tables
While Onyx does not natively support virtual tables, there is a pattern that can achieve this using use'd members on structures. Here is an example of the classic "Animals that can speak" inheritance argument.
Create a virtual table structure that will store the function pointers.
Animal_Vtable :: struct {
// 'greet' is a member of the vtable, and takes a pointer
// to the object (which this does not concern itself with),
// as well as the name to greet.
greet: (rawptr, name: str) -> void;
}
Then, create some implementations of the virtual table as global variables. Note, these could be scoped so they can only be used where you need them, but for this example they are accessible everywhere.
dog_vtable := Animal_Vtable.{
greet = (d: &Dog, name: str) {
printf("Woof {}!\n", name);
}
}
cat_vtable := Animal_Vtable.{
greet = (d: &Cat, name: str) {
printf("Meow {}!\n", name);
}
}
Finally create the Dog and Cat structures, with a use'd member of type Animal_Vtable. This will enable the animal->greet() syntax because greet is accessible as a member in Dog and Cat.
Dog :: struct {
use vtable: Animal_Vtable = dog_vtable;
}
Cat :: struct {
use vtable: Animal_Vtable = cat_vtable;
}
Now you can pass a pointer to Dog or a pointer to Cat to any procedure expecting a pointer to an Animal_Vtable, thanks to Sub-Type Polymorphism.
say_greeting :: (animal: &Animal_Vtable, name: str) {
animal->greet(name);
}
main :: () {
dog := Dog.{};
cat := Cat.{};
say_greeting(&dog, "Joe");
say_greeting(&cat, "Jane");
}
This is obviously more clunky than object-oriented programming in a language like Java or C++, but that's because Onyx is not an object-oriented language.
This pattern is used in a couple of places throughout the standard library, notably in the io.Stream implementation, which enables reading and writing using the same interface from anything that defines an io.Stream_Vtable, including files, sockets, processes, and string buffers.
Operator Precedence
Precedence | Operators |
---|---|
1 | Assignment (+= , -= , ...) |
2 | Post-fix (.x , () , ? , [] , ->x() ) |
3 | Pre-fix (- , ! , ~~ , ~ , * , & ) |
4 | ?? |
5 | % |
6 | * , / |
7 | + , - |
8 | & , | , ^ , << , >> , >>> |
9 | <= , < , >= , > |
10 | == , != |
11 | && , || |
12 | |> , .. |
Control Flow
Onyx has a standard set of simple control flow mechanisms: if, while, for, switch, and defer. Notably absent is goto, and this is by design.
If
if statements allow the programmer to optionally execute a block of code, if a condition is met. if-statements in Onyx are written like this:
if condition {
println("The condition was true!");
}
Notice that parentheses are not needed around the condition. One thing to note is that the syntax for an else-if chain uses the keyword elseif, not else if.
if x >= 100 {
println("x is greater than 100");
} elseif x >= 10 {
println("x is greater than 10");
} else {
println("x is not special.");
}
Initializers
if-statements can also have an initializer, which is a statement that appears before the condition. They allow you to declare variables that are only available in the scope of the if-statement, or any of the else blocks.
can_error :: () -> (i32, bool) ---
if value, errored := can_error(); !errored {
printf("The value was {}!\n", value);
}
// value is not visible here.
While loops
while-statements are very similar to if-statements, except when the bottom of the while-loop body is reached, the program re-tests the condition, and will loop if necessary. while-statements have the same syntax as if-statements.
x := 10;
while x >= 0 {
println(x);
x -= 1;
}
while statements can also have initializers, meaning the above code could be rewritten as:
while x := 10; x >= 0 {
println(x);
x -= 1;
}
while statements can also have an else block after them. The else block is executed if the condition for the while loop was never true.
while false {
println("Never printed.");
} else {
println("This will print.");
}
Switch
switch-statements are used to simplify a chain of if-elseif statements. Switch statements look a little different in Onyx compared to, say, C. This is because case blocks are actually blocks, not just jump targets.
value := 10;
switch value {
case 5 {
println("The value was 5.");
}
case 10 do println("The value was 10.");
case _ {
println("The value was not recognized.");
}
}
_
is used for the default case. The default case must be listed last.
It is also possible to match multiple values using a comma-separated list.
value := 10;
switch value {
case 5, 10, 15 {
println("The value was 5, 10, or 15.");
}
}
fallthrough
case
blocks in Onyx automatically exit the switch
statement after the end of their body, meaning an ending break
statement is not needed. If you do however want to fallthrough to the next case like in C, use the fallthrough
keyword.
switch 5 {
case 5 {
println("The value was 5.");
fallthrough;
}
case 10 {
println("The value was (maybe) 10.");
}
}
Ranges
switch
statements also allow you to specify a range of values using ..
or ..=
.
switch 5 {
case 5 ..= 10 {
println("The value was between 5 and 10.");
}
}
Custom Types
switch
statements can operate on any type of value, provided that an operator overload for ==
has been defined.
Point :: struct {x, y: i32;}
#operator == (p1, p2: Point) => p1.x == p2.x && p1.y == p2.y;
switch Point.{10, 20} {
case .{0, 0} do println("0, 0");
case .{10, 20} do println("10, 20");
case _ do println("None of the above.");
}
Tagged Unions
switch
statements are very important when working with tagged unions. See the tagged union
section for details.
Initializers
switch
statements can also optionally have an initializer, like while
and if
statements.
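Here is a small sketch of a switch with an initializer. It assumes the initializer takes the same statement-then-condition form as if and while; the get_status procedure is hypothetical.
get_status :: () -> i32 ---
switch status := get_status(); status {
    case 0 do println("Everything is OK.");
    case _ do printf("Error code {}.\n", status);
}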
Defer
defer
-statements allow you to run a statement or block when the enclosing block is exited.
{
println("1");
defer println("3");
println("2");
}
This example will print:
1
2
3
defer
statements are pushed onto a stack. When the block exits, they are popped off the stack in reverse order.
{
defer println("3");
defer println("2");
defer println("1");
}
This example will also print:
1
2
3
Create/Destroy Pattern
defer
statements enable the following "create/destroy" pattern.
thing := create_something();
defer destroy_something(thing);
Because deferred statements run in any case that execution leaves a block, they safely guarantee that the resource will be destroyed. Also, because defer
statements are stacked, they guarantee destroying resources happens in the correct order.
outer_thing := create_outer_thing();
defer destroy_outer_thing(outer_thing);
inner_thing := create_inner_thing(outer_thing);
defer destroy_inner_thing(inner_thing);
For loops
for
loops are the most powerful control flow mechanism in Onyx. They enable:
- Iteration shorthand
- Custom iterators
- Removing elements
- Scoped resources
Range-Based Loop
A basic for
loop in Onyx. This will iterate from 1 to 9, as the upper bound is not included.
for i in 1 .. 10 {
println(i);
}
This for
loop is iterating over a range
. Ranges represent half-open sets, so the lower bound is included, but the upper bound is not.
Array-Based Loop
for
loops can also iterate over array-like types: [N] T
, [] T
, [..] T
. Use &
after for
to iterate over the array by pointer.
primes: [5] i32 = .[ 2, 3, 5, 7, 11 ];
for value in primes {
println(value);
}
// This modifies the array so each element
// is double what it was.
for &value in primes {
// value is a &i32.
*value *= 2;
}
it
Naming the iteration value is optional. If left out, the iteration value will be called it
.
for i32.[2, 3, 5, 7, 11] {
println(it);
}
Indexed-loops
for
loops can optionally have a second iteration value called the index.
This index starts at 0, and increments by 1 every iteration.
Its default type is i32
, but this can be changed.
// Use i32 as type of index
for value, index in i32.[2, 3, 5, 7, 11] {
printf("{}: {}", index, value);
}
// Explicitly change the type to i64
for value, index: i64 in i32.[2, 3, 5, 7, 11] {
printf("{}: {}", index, value);
}
Custom Iterator Loops
The final type that for
loops can iterate over is Iterator(T)
. Iterator
is a built-in type that represents a generic iterator. An Iterator
has four elements:
- data - a pointer to the context for the iterator.
- next - a function to retrieve the next value out of the iterator.
- remove - an optional function to remove the current element.
- close - an optional function to clean up the iterator's context.

The core.iter package provides many utilities for working with iterators.
Here is a basic example of creating an iterator from a range
, then using iter.map
to double the values. Iterators are lazily evaluated, so none of the actual doubling happens until values are pulled out of the iterator by the for
loop.
doubled_iterator := iter.as_iter(1 .. 5)
|> iter.map(x => x * 2);
for doubled_iterator {
println(it);
}
The above for
loop loosely translates to the following code.
doubled_iterator := iter.as_iter(1 .. 5)
|> iter.map(x => x * 2);
{
defer doubled_iterator.close(doubled_iterator.data);
while true {
it, cont := doubled_iterator.next(doubled_iterator.data);
if !cont do break;
println(it);
}
}
#no_close
The close
function of an Iterator
is always called after the loop exits. If this is not the desired behavior, you can add #no_close
after for
to forego inserting the close
call.
doubled_iterator := iter.as_iter(1 .. 5)
|> iter.map(x => x * 2);
for #no_close doubled_iterator {
println(it);
}
// Later:
iter.close(doubled_iterator);
#remove
The final feature of Iterator
-based for
loops is the #remove
directive. If the current Iterator
supports it, you can write #remove
to remove the current element from the iterator.
// Make a dynamic array from a fixed-size array.
arr := Array.make(u32.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
// as_iter with a pointer to a dynamic array
// supports #remove.
for iter.as_iter(&arr) {
if it % 2 == 0 {
// Remove all even numbers
#remove;
}
}
// Will only print odd numbers
for arr do println(it);
#first
Many times while writing for
loops, it is nice to know whether the current iteration is the first or the last one.
As a convenience, Onyx provides the #first
directive in all of its for
loops.
It is a bool
value that is true
during the first iteration, and then false afterwards.
Note, Onyx does not provide an equivalent #last
directive, because in Iterator
-based loops, it is impossible to know when the last iteration will happen.
One example of where this is useful is in a formatted printer. Consider this code that prints the elements of an array.
arr := i32.[ 2, 3, 5, 7, 11 ];
for arr {
if !#first do print(", ");
print(it);
}
This example will print:
2, 3, 5, 7, 11
Explicitly-typed loop variable
You can optionally provide an explicit type for the loop variable, if you feel it improves code readability. It does not carry any extra semantic meaning, but solely exists for the next reader of the code. If you provide the incorrect type, it is a compile error.
strings := str.["A", "list", "of", "strings"];
// Explicitly say that value is a str
for value: str in strings {
println(value)
}
Branching
The following keywords can be used to branch from a block of code early.
break
break
can be used to jump execution to after the body of the enclosing loop.
// Prints 0 to 5.
for 0 .. 10 {
println(it);
if it == 5 do break;
}
continue
continue
can be used to jump execution to the condition of the enclosing loop.
// Prints 5 to 9.
for 0 .. 10 {
if it < 5 do continue;
println(it);
}
fallthrough
fallthrough
is discussed in the switch
statement section.
return
return
is used to end execution of the current procedure. It is also used to provide return values.
fact :: (n: i32) -> i32 {
if n <= 1 do return 1; // Early return
return fact(n - 1) * n; // Providing result
}
Do blocks
Do blocks allow you to encapsulate statements into smaller chunks of code whose sole purpose is to evaluate to a value.
Do blocks are expressions, and therefore must return a value.
This is done using the return
keyword.
main :: () {
x := 10
y := do {
if x > 5 do return "X is greater than 5"
return "X is less than or equal to 5"
}
println(y)
}
Explicit typing
If necessary, you can provide an explicit type to the do block using -> type
after
the do
keyword.
Color :: enum { Red; Green; Blue; }
main :: () {
col := do -> Color {
return .Green
}
println(col)
}
Internal details
Do blocks were actually a feature that came for "free" when macros were implemented. Every expression macro
simply turns into a do
block at the call site.
add :: macro (x, y: i32) -> i32 {
return x + y
}
main :: () {
x1 := add(3, 4)
// The above simply desugars into this:
x2 := do -> i32 {
return 3 + 4
}
}
Used Locals
This is an experimental feature that may not work perfectly, and may go away in the future if there is pushback or there are major problems.
A very common pattern in Onyx is to create/allocate a resource, then defer
the release/free of the resource on the next line. Take this for an example.
main :: () {
// Make a dynamic array.
my_arr := make([..] i32);
// Delete the array at the end of main.
defer delete(&my_arr);
}
Whether using dynamic arrays, io.Reader
s, or os.File
s, this pattern is all over the place.
To make it a little easier to type, and to allow the author of the type to define how to
clean up the resource allocation, you can simply place the use
keyword in front of the variable
definition. This will automatically insert a deferred call to the builtin procedure __dispose_used_local
.
This procedure can be overloaded to define how to dispose of any resource. It also contains an
overload for delete
, which means anything you can call delete
on, you can already use
.
This is the same example as before, but with use
instead.
main :: () {
    // Implicitly delete the array at the end of scope.
    use my_arr := make([..] i32);

    // The above line is equivalent to writing:
    //    my_arr := make([..] i32);
    //    defer __dispose_used_local(&my_arr);
}
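For example, here is a sketch of how a custom resource type could opt into use. The Connection type and close_connection procedure are made up for illustration; the overload itself just uses the #overload syntax described in the Overloaded procedures section.
Connection :: struct {
    handle: i32;
}

close_connection :: (c: &Connection) {
    // Release whatever the handle refers to.
    c.handle = -1;
}

#overload
__dispose_used_local :: (c: &Connection) {
    close_connection(c);
}

main :: () {
    // close_connection is automatically deferred to the end of the scope.
    use conn := Connection.{ handle = 3 };
}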
Procedures
Procedures allow the programmer to encapsulate behavior inside a reusable form. Other languages call them functions, subroutines or methods. "Procedures" is a super-set of all of those terms.
Syntax
Procedures in Onyx are written simply as: (parameters) -> return_type { body }
.
Here is a simple procedure that simply prints Hello!
.
say_hello :: () -> void {
println("Hello!");
}
To explain the different parts of the syntax, here is a broken down version, line by line.
say_hello // This is the symbol name that the procedure will be bound to.
:: // This is the 'bind' operator, as discussed in Chapter 2.5.
() // This is the start of the procedure; an empty list of parameters.
-> void // This is the return type, specified using a `->`.
{
// This is the procedure's body.
println("Hello!");
}
Anonymous Procedures
Procedures do not have to be named, and can simply exist as expressions.
Here, say_hello
is assigned at runtime to be an anonymous procedure.
procedure_as_an_expression :: () -> void {
// Assign the procedure to a local variable
say_hello := () -> void {
println("Hello!");
};
say_hello();
}
Optional Return Type
If the procedure returns void
(i.e. returns nothing), the return type can be completely removed.
say_hello :: () {
println("Hello, no void!");
}
Parameters
Procedures can take 0 or more parameters. All parameters are passed by value. Parameters that are passed by pointer copy the pointer value, not the data where the pointer is pointing.
Syntax
Procedure parameters are given a name, followed by a :
, followed by the type of that parameter. A comma (,
) is used to delimit the different parameters.
print_add :: (x: i32, y: i32) {
printf("{} + {} = {}\n", x, y, x + y);
}
compute_add :: (out: &i32, x: i32, y: i32) {
*out = x + y;
}
As a convenience, if two or more parameters have the same type, they can be written using the type only once.
In this example, because x
and y
are the same type, the : i32
is not needed after x
.
print_add :: (x, y: i32) {
// ...
}
Default values
Parameters can have default values. The default value is computed on the caller's side. This means default values are not part of the procedure's type; they are only a convenience provided by a given procedure.
print_msg_n_times :: (n: i32, msg: str = "Hello, World!") {
for n do println(msg);
}
print_msg_n_times(10);
The type of a defaulted parameter can be omitted if the type of the expression is known.
// Because "Hello, World!" is known to be of type 'str',
// the type of msg can be omitted.
print_msg_n_times :: (n: i32, msg := "Hello, World!") {
for n do println(msg);
}
print_msg_n_times(10);
Return values
Procedures can return 0 or more values. Return types are specified after procedure arguments using an ->
. If multiple return values are desired, the return types have to be enclosed in parentheses. The return
keyword is used to specify returned values.
// A single integer return value.
add :: (x, y: i32) -> i32 {
return x + y;
}
// Returning 2 integers.
swap :: (x, y: i32) -> (i32, i32) {
return y, x;
}
z := add(2, 3);
a, b := 10, 20;
a, b = swap(a, b);
Note, returned values are passed by value.
Automatic-return type
Sometimes, the exact type of the returned value is cumbersome to write out. In this case, #auto
can be provided as the return type. It automatically determines the return type given the first return
statement in the procedure.
// #auto would automatically be determined to be:
// Iterator(i32), bool, str
weird_return_type :: (x: i32) -> #auto {
return iter.as_iter(1 .. 5) , false, "Hello, World!";
}
In some cases in Onyx, it is actually impossible to write the return type. #auto
can be used in this case, and the compiler will figure out what type needs to be there. Look at this example from the standard library.
iter.prod :: (x: $I/Iterable, y: Iterator($Y)) -> #auto { ... }
iter.prod
returns an iterator of pairs of the two values yielded from the left and right iterators.
There is no way to write the return type, because you cannot spell the exact Iterator type that x produces; x is only known to be Iterable, meaning you can call as_iter on it. Think about it: what could you write in Iterator(Pair(???, Y)) to make it correct?
Calling procedures
Calling any procedure-like thing in Onyx uses the traditional ()
post-fix operator, with arguments in between. Arguments are separated by commas. Arguments can also be named. Once arguments start being named, all subsequent arguments must be named.
magnitude :: (x, y, z: f32) -> f32 {
return math.sqrt(x*x + y*y + z*z);
}
// Implicit naming (positional arguments)
println(magnitude(10, 20, 30));
// Explicit naming
println(magnitude(10, y=20, z=30));
// Explicit naming, in different order
println(magnitude(z=30, y=20, x=10));
Variadic procedures
Variadic procedures allow a procedure to take an arbitrary number of arguments. This function takes any number of integers and returns their sum. The ..i32
type behaves exactly like a slice of i32
([] i32
).
sum :: (ints: ..i32) -> i32 {
result := 0;
for ints {
result += it;
}
return result;
}
println(sum(1, 2, 3, 4, 5));
Variadic procedures can also use the special type any
, to represent heterogeneous values being passed to the function. This function prints the type of each of the values given.
print_types :: (arr: ..any) {
for arr {
println(it.type);
}
}
print_types("Hello", 123, print_types);
This example outputs:
[] u8
i32
(..any) -> void
Note, read more about any in the Any section.
Using Runtime Type Information, functions can introspect the values given and perform arbitrary operations. For example, conv.format
uses type information to print anything of any type in the program.
// printf uses conv.format for formatting.
printf("{} {} {}\n", "Hello", 123, context);
Polymorphic procedures
Polymorphic procedures allow the programmer to express type-generic code, code that does not care what type is being used. This is by far the most powerful feature in Onyx.
Polymorphic procedures use polymorphic variables. A polymorphic variable is declared using a $
in front of the name. When calling a polymorphic procedure, the compiler will try to solve for all of the polymorphic variables. Then, it will construct a specialized version of the procedure with the polymorphic variables substituted with their corresponding value.
Here is an example of a polymorphic procedure that compares two elements.
min :: (x: $T, y: T) -> T {
if x < y do return x;
else do return y;
}
x := min(10, 20);
y := min(40.0, 30.0);
// Errors
// z := min("Hello", "World");
$T
declares T
as a polymorphic variable. When min
is called with two i32
s, the compiler solves for T
, finding it to be i32
. Then a specialized version of min
is constructed that operates on i32
s. A very similar thing happens for the second call, except in that case T
is f64
. Notice that an error will occur if min
is called with something that does not define the operator <
for T
.
Polymorphic variables can occur deeply nested in a type. The compiler employs pattern matching to solve for the polymorphic variable.
foo :: (x: &[] Iterator($T)) {
// ...
}
val: &[] Iterator(str);
foo(val);
Here is a simple pattern matching process that the compiler goes through to determine the type of $T
.
Variable Type | Given Type |
---|---|
&[] Iterator($T) | &[] Iterator(str) |
[] Iterator($T) | [] Iterator(str) |
Iterator($T) | Iterator(str) |
$T | str |
If at any point the types do not match, an error is given.
Parameters can also be polymorphic variables. If a $
is placed in front of a parameter, it becomes a compile-time "constant". A specialized version of the procedure is made for each value given.
add_constant :: ($N: i32, v: i32) -> i32 {
// N is a compile-time known integer here.
// It is equivalent to writing '5'.
return N + v;
}
println(add_constant(5, 10));
Types can be passed as constants through polymorphic variables. Consider this example.
make_cubes :: ($T: type_expr) -> [10] T {
arr: [10] T;
for 0 .. 10 {
arr[it] = cast(T) (it * it * it);
}
return arr;
}
arr := make_cubes(f32);
Because T
is a constant, it can be used in the type of arr
, as well as in the return type.
Quick procedures
With polymorphic variables and #auto
, it is possible to write a completely type-generic procedure in Onyx.
print_iterator :: (msg: $T, iterable: $I) -> #auto {
println(msg);
for iterable {
println(it);
}
return 1234;
}
print_iterator("List:", u32.[ 1, 2, 3, 4 ]);
print_iterator(8675309, 5 .. 10);
No types are given in the procedure above. msg
and iterable
can be any type, provided that iterable
can be iterated over using a for
loop. This kind of procedure, one with no type information, is given a special shorthand syntax.
print_iterator :: (msg, iterable) => {
println(msg);
for iterable {
println(it);
}
return 1234;
}
print_iterator("List:", u32.[ 1, 2, 3, 4 ]);
print_iterator(8675309, 5 .. 10);
Here the =>
signifies that this is a quick procedure. The types of the parameters are left out, and can take on whatever value is provided. Programming using quick procedures feels more like programming in JavaScript or Python, so don't abuse them. They are very useful when passing procedures to other procedures.
map :: (x: $T, f: (T) -> T) -> T {
return f(x);
}
// Note that the parentheses are optional if
// there is only one parameter.
y := map(5, value => value + 4);
println(y);
You can also have a mix between quick procedures and normal procedures. This example shows an alternative way of writing -> #auto
.
// The => could be -> #auto, or -> T.
find_smallest :: (items: [] $T) => {
small := items[0];
for items {
if it < small do small = it;
}
return small;
}
println(find_smallest(u32.[6,2,5,1,10]));
Closures
Onyx has experimental support for closures. Currently, this is in the form of explicit closures, where every captured variable has to be declared before it can be used. This restriction will likely be lifted in the future when other internal details are figured out.
To declare a closure, simply add use (...)
to the procedure definition,
between the arguments and the return type.
main :: () {
x := 10
// Here, x is captured by value, and a copy is made for
// this quick procedure.
f := (y: i32) use (x) -> i32 {
return y + x
}
f(20) |> println() // Prints 30
}
Values can be captured either by value or by pointer.
To capture by pointer, simply place a &
in front of the variable name.
main :: () {
x := 10
// Here, x is captured by pointer
f := (y) use (&x) => {
*x = 20
return y + *x
}
f(20) |> println() // Prints 40
println(x) // Prints 20
}
Currying
A form of function currying is possible in Onyx using chained quick procedures, and passing previous arguments to each subsequent quick procedure.
add :: (x: i32) => (y: i32) use (x) => (z: i32) use (x, y) => {
return x + y + z
}
main :: () {
partial_sum := add(1)(2)
sum1 := partial_sum(3)
sum2 := partial_sum(10)
println(sum1)
println(sum2)
}
Internal details of Closures
Every time a closure is encountered at runtime, a memory allocation must be made
to accommodate the memory needed to store the captured values. To do this, a
builtin procedure called __closure_block_allocate
is called. This procedure is
implemented by default to invoke context.closure_allocate
. By default,
context.closure_allocate
allocates a buffer from the temporary allocator. If you want to change
how closures are allocated, you can change this procedure pointer to do something
different.
main :: () {
context.closure_allocate = (size: i32) -> rawptr {
printf("Allocating {} bytes for closure.\n", size)
// Allocate out of the heap
return context.allocator->alloc(size)
}
x := 10
f := (y: i32) use (x) => {
return y + x
}
f(20) |> println() // Prints 30
}
Overloaded procedures
Overloaded procedures allow a procedure to have multiple implementations, depending on what arguments are provided. Onyx uses explicitly overloaded procedures, as opposed to implicitly overloaded procedures. All overloads for the procedure are listed between the {}
of the #match
expression, and are separated by commas.
to_int :: #match {
(x: i32) -> i32 { return x; },
(x: str) -> i32 { return cast(i32) conv.str_to_i64(x); },
(x: f32) -> i32 { return cast(i32) x; },
}
println(to_int(5));
println(to_int("123"));
println(to_int(12.34));
The order of procedures does matter. When trying to find the procedure that matches the arguments given, the compiler tries each overload in a specific order. By default, this order is the lexical order of the procedures listed in the #match
body. This order can be changed using the #order
directive.
options :: #match {
#order 10 (x: i32) { println("Option 1"); },
#order 1 (x: i32) { println("Option 2"); },
}
// Option 2 is going to be called, because it has a smaller order.
options(1);
The lower order values are given higher priority, as they are ordered first.
Overloaded procedures as described would not be very useful, as all of the procedures would have to be known when writing the overload list. To fix this issue, Onyx has a second way of using #match
to add overload options to an already existing overloaded procedure.
options :: #match {
(x: i32) { println("Int case."); }
}
// #match can be used as a directive to specify a new overload option for
// an overloaded procedure. Directly after #match is the overloaded procedure,
// followed by the new overload option.
#match options (x: f32) { println("Float case."); }
#match options (x: str) { println("String case."); }
// As an alternative syntax that might become the default for Onyx,
// #overload can be used with a '::' between the overloaded procedure
// and the overload option. This is preferred because it looks more
// like writing a normal procedure, but with `#overload` as a "tag".
#overload
options :: (x: cstr) { println("C-String case."); }
// An order can also be specified like so.
#match options #order 10 (x: i32) { println("Other int case."); }
Sometimes, the ability to add new overload options should be disabled to prevent undesired behavior. For this Onyx has two directives that can be added after #match
to change when procedures can be added.
- #locked - This prevents adding overload options. The only options available are the ones between the curly braces.
- #local - This allows options to be added, but only within the same file. This can be used to clean up code that is over-indented.
Here is an example of using #match #local
.
length :: #match #local {}
#overload
length :: (x: u32) => 4
#overload
length :: (x: str) => x.count
Overloaded procedures provide the backbone for type-generic "traits" in Onyx. Instead of making a type/object-oriented system (e.g. Rust's traits), Onyx uses overloaded procedures to provide type-specific functionality for operations such as hashing. Multiple data structures in the core
package need to hash a type to a 32-bit integer. Map
and Set
are two examples. To provide this functionality, Onyx uses an overloaded procedure called hash
in the core.hash
package. This example shows how to define how a Point
structure can be hashed into a u32
.
Point :: struct {x, y: i32}
#overload
core.hash.hash :: (p: Point) => cast(u32) (p.x ^ p.y);
Interfaces and where
Interfaces allow for type constraints to be placed on polymorphic procedures. Without them, polymorphic procedures have no way of specifying which types are allowed for their polymorphic variables. Interfaces are best explained through example, so consider the following.
CanAdd :: interface (T: type_expr) {
t as T;
{ t + t } -> T;
}
T
is a type, and t
is a value of type T
. The body of the interface is specifying that two values of type T
can be added together and the result is of type T
. Any expression can go inside of the curly braces, and it will be type checked against the type after the arrow. This interface can be used to constrain which types are allowed in a polymorphic procedure using a where
clause.
CanAdd :: interface (T: type_expr) {
t as T;
{ t + t } -> T;
}
sum_array :: (arr: [] $T) -> T where CanAdd(T) {
result: T;
for arr do result += it;
return result;
}
// This is allowed
sum_array(f32.[ 2, 3, 5, 7, 11 ]);
// This is not, because '+' is not defined for 'str'.
sum_array(str.[ "this", "is", "a", "test" ]);
The second call to sum_array
would generate an error anyway when it type checks the specialized procedure with T=str
. However, this provides a better error message and upfront clarity to someone calling the function.
Interface constraints can also take on a more basic form, where the expected type is omitted. In this case, the compiler is only checking if there are no errors in the provided expression.
// This does not check if t + t is of type T.
CanAdd :: interface (T: type_expr) {
t as T;
t + t;
}
Interfaces can be used in conjunction with #match
blocks to perform powerful compile-time switching over procedures. Consider the following extension to the previous example.
CanAdd :: interface (T: type_expr) {
t as T;
{ t + t } -> T;
}
sum_array :: #match {
(arr: [] $T) -> T where CanAdd(T) {
result: T;
for arr do result += it;
return result;
},
(arr: [] $T) -> T {
printf("Cannot add {}.", T);
result: T;
return result;
}
}
// This is allowed
sum_array(f32.[ 2, 3, 5, 7, 11 ]);
// This is now allowed, but will print an error.
sum_array(str.[ "this", "is", "a", "test" ]);
First the compiler will check if T
is something that can be added, and if it can, the first procedure will be called. Otherwise the second procedure will be called.
Where expressions
Compile-time constant expressions can be used alongside interfaces in where
clauses. This gives the programmer even more control over the conditions their function will be called under. For example, ensuring the length of an array fits within a specific range allows us to optimize our code.
sum_array :: #match {
(arr: [] $T) -> T where CanAdd(T) {
result: T;
for arr do result += it;
return result;
},
// This will only be called with fixed-size arrays whose length is between 1 and 4.
(arr: [$N] $T) -> T where CanAdd(T), N >= 1, N <= 4 {
// An imaginary macro that duplicates the body N times
// to avoid the cost of loops.
return unroll_loop(arr, N, [a, b](a += b));
},
(arr: [] $T) -> T {
printf("Cannot add {}.", T);
result: T;
return result;
}
}
// This will call the [] $T version of sum_array
sum_array(f32.[ 2, 3, 5, 7, 11 ]);
// This will call the [$N] $T version of sum_array
sum_array(f32.[ 1, 2, 3, 4 ]);
// This will also call the [$N] $T version of sum_array
sum_array(f32.[ 1, 2, 3 ]);
Operator overloading
Onyx's operator overloading syntax is very similar to its #match
syntax, except #operator
is used, followed by the operator to overload. For example, this defines the +
operator for str
.
#operator + (s1, s2: str) -> str {
return string.concat(s1, s2);
}
The following operators can be overloaded:
Arithmetic: + - * / %
Comparison: == != < <= > >=
Bitwise: & | ^ << >> >>>
Logic: && ||
Assignment: += -= *= /= %= &= |= <<= >>= >>>=
Subscript: [] []= &[]
Most of these are self-explanatory.
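The subscript overloads are probably the least obvious, so here is a sketch of overloading [] for a made-up Ring type that wraps indices around the end of its buffer. The []= and &[] forms take additional arguments and are not shown here; consult the core library (for example Map) for reference.
Ring :: struct {
    data: [] i32;
}

// Indexing wraps around the end of the buffer.
#operator [] (r: Ring, i: u32) -> i32 {
    return r.data[i % r.data.count];
}
With this in place, r[12] reads an element no matter how long the buffer is.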
Macros
Macros in Onyx are very much like procedures, with a couple of notable differences. When a macro is called, it is expanded at the call site, as though its body were copied and pasted there. This means that macros can access variables in the scope of their caller.
print_x :: macro () {
// 'x' is not defined in this scope, but it can be used
// from the enclosing scope.
println(x);
}
{
x := 1234;
print_x();
}
{
x := "Hello from a macro!";
print_x();
}
Because macros are inlined at the call site and break traditional scoping rules, they cannot be used as a runtime known value.
There are two kinds of macros: block macros, and expression macros. The distinguishing factor between them is the return type. If a macro returns void
, it is a block macro. If it returns anything else, it is an expression macro.
Block and expression macros behave differently with respect to some of the language features. Expression macros behave exactly like an inlined procedure call with dynamic scoping.
add_x_and_y :: macro (x: $T) -> T {
defer println("Deferred print statement.");
return x + y;
}
{
y := 20.0f;
z := add_x_and_y(30.0f);
printf("Z: {}\n", z);
}
// This prints:
// Deferred print statement.
// Z: 50.0000
This example shows that defer
statements are cleared before the expression macro returns. Also, the return
statement is used to return from the macro
with a value.
Block macros behave a little differently. defer
statements are not cleared, and return
statements are used to return from the caller's procedure.
early_return :: macro () {
return 10;
}
defer_a_statement :: macro () {
defer println("Deferred a statement.");
}
foo :: () -> i32 {
defer_a_statement();
println("About to return.");
early_return();
println("Never printed.");
}
// foo() will print:
// About to return.
// Deferred a statement.
In foo
, the call to defer_a_statement
adds the deferred statement to foo
. Then the first println
is run. Then the early_return
macro returns the value 10 from foo
. Finally, the deferred print statement is run.
This distinction between block and expression macros allows for an automatic destruction pattern.
// These are not the actual procedures to use mutexes.
grab_mutex :: macro (mutex: Mutex) {
mutex_lock(mutex);
defer mutex_unlock(mutex);
}
critical_procedure :: () {
grab_mutex(a_mutex);
}
grab_mutex
will automatically release the mutex at the end of critical_procedure
. This pattern of creating a resource, and then freeing it automatically using defer
is very common.
Code Blocks
To make macros even more powerful, Onyx provides compile-time code blocks. Code blocks capture code and treat it as a compile-time object that can be passed around. Use [] {}
to create a code block. Use #unquote
to "paste" a code block.
say_hello :: [] {
println("Hello!");
}
#unquote say_hello;
Code blocks are not type checked until they are unquoted, so they can contain references to variables not declared within them.
Code blocks have this syntax because they can optionally take parameters between their []
. When unquoting a code block with parameters,
you must pass an equal or greater number of arguments in parentheses after the variable name.
do_something :: ($do_: Code) {
#unquote do_(1, 2);
#unquote do_(2, 6);
}
do_something([a, b] {
println(a + b);
});
Code blocks can be passed to procedures as compile-time values of type Code
.
triple :: ($body: Code) {
#unquote body;
#unquote body;
#unquote body;
}
triple([] {
println("Hello!");
});
Code blocks can be passed to macros without being polymorphic variables, because all parameters to macros are compile-time known.
triple_macro :: macro (body: Code) {
#unquote body;
#unquote body;
#unquote body;
}
triple_macro([] {
println("Hello!");
});
A single statement/expression in a code block can be expressed as: [](expr)
[](println("Hello"))
// Is almost the same as
[] { println("Hello"); }
The practical difference between []()
and [] {}
is that the latter produces a block of code that has a void return type, while the former takes on the type of the expression inside it. The Array
and Slice
structures use this feature for creating a "lambda/capture-like" syntax for their procedures.
find_largest :: (x: [] $T) -> T {
return Slice.fold(x, 0, [x, acc](x if x > acc else acc));
}
A code block can also be passed to a macro or procedure simply by placing a block immediately after a function call. This only works if the function call is a statement.
skip :: (arr: [] $T, $body: Code) {
for n in 0 .. arr.count {
if n % 2 == 1 do continue;
it := arr[n];
#unquote body;
}
}
// This prints: 2, 5, 11
skip(.[2, 3, 5, 7, 11, 13]) {
println(it);
}
Types
Primitives
Onyx contains the following primitive types.
void // Empty, 0-size type
bool // Booleans
u8 u16 // Unsigned integers: 8, 16, 32, and 64 bit.
u32 u64
i8 i16 // Signed integers: 8, 16, 32, and 64 bit.
i32 i64
f32 f64 // Floating point numbers: 32 and 64 bit.
rawptr // Pointer to an unknown type.
type_expr // The type of a type.
any // Used to represent any value in the language.
str // A slice of bytes ([] u8)
cstr // A pointer to bytes (&u8) with a null-terminator.
dyn_str // A dynamic string ([..] u8)
range // Represents a start, end, and step.
v128 // SIMD types.
i8x16 i16x8
i32x4 i64x2
f32x4 f64x2
Pointers
Pointers contain an address of a value of the given type. A &T
is a pointer to value of type T
. If a pointer is not pointing to anything, its value is null
.
Use the &
operator to take the address of a value. Note the consistency between the type and operation used to create a pointer.
x: i32 = 10;
p: &i32 = &x;
Use the *
operator to retrieve the value out of a pointer. This is not a safe operation, so faults can occur if the pointer is pointing to invalid memory.
x := 10;
p := &x;
printf("*p is {}.\n", *p);
Multi-pointers
Normal pointers in Onyx do not support pointer addition or subscripting, i.e. x[i]
.
To do this, a multi-pointer must be used.
Multi-pointers are written as [&] T
. They implicitly convert to-and-from normal pointer types, so they do not add much to the safety of a program, but they do allow intent to be expressed when using pointers. Consider these two procedures; there is a clear difference between how the pointers are going to be used.
proc_1 :: (out: &i32) {
*out = 10;
}
proc_2 :: (out: [&] i32) {
for 10 {
out[it] = it;
}
}
Note, pointer addition and subtraction on [&] T steps by sizeof(T). So, cast([&] i32, 0) + 1 == 4.
Pointers vs Multi-Pointers
& T | [&] T |
---|---|
*t | t[i] |
t.foo | t + x |
== , != | == , != |
Fixed-size Arrays
Fixed-size arrays store a fixed number of values of any type. A [N] T
array holds N
values of type T
. The []
operator can be used to access elements of the array.
arr: [5] u32;
arr[0] = 1;
arr[1] = 2;
arr[2] = 3;
arr[3] = 4;
arr[4] = 5;
Fixed-size arrays are passed by pointer to procedures. However, the =
operator copies the contents of the array to the destination.
mutate_array :: (arr: [5] u32) {
arr[3] = 1234;
}
arr: [5] u32;
mutate_array(arr);
println(arr[3]); // Prints 1234
arr2: [5] u32 = arr; // This is an element by element copy.
arr2[3] = 5678; // This does not modify the original array.
println(arr[3]); // So this also prints 1234
Fixed-size arrays can be constructed using an array literal. Array literals have the form type.[elements]
. The type
is optional if the type of the elements can be automatically inferred.
arr := u32.[1, 2, 3, 4];
assert((typeof arr) == [4] u32, "type does not match");
floats := .[5.0f, 6.0f, 7.0f];
assert((typeof floats) == [3] f32, "type does not match");
Array Programming
Fixed-size arrays have builtin array programming support. This allows the +
, -
, *
, /
operators to be used with them.
Vector3 :: [3] i32; // A simple three-component vector
a: Vector3 = .[ 1, 2, 3 ];
b: Vector3 = .[ 1, 1, 1 ];
c := a + b; // [ 2, 3, 4 ];
c *= 2; // [ 4, 6, 8 ];
Builtin Fields
Fixed-size arrays can also be accessed with builtin fields if their length is <= 4. The fields are x
, y
, z
, w
or r
, g
, b
, a
. Array field access is equivalent to regular indexing and does not affect an array's memory layout.
Color :: [4] f32; // A simple RGBA color
red: Color = .[ 1, 0, 0, 0 ];
green: Color = .[ 0, 1, 0, 0 ];
blue: Color = .[ 0, 0, 1, 0 ];
full_opacity: Color = .[ 0, 0, 0, 1 ];
fuchsia := red + full_opacity; // [ 1, 0, 0, 1 ]
fuchsia.b = 1; // Equivalent to fuchsia[2] = 1
teal := green + blue; // [ 0, 1, 1, 0 ]
teal.a = 1;
white := red + green + blue;
white.a = 1;
Slices
Slices are arrays with a runtime known size. A slice [] T
is equivalent to the following structure.
[] T == struct {
data: &T;
count: u32;
}
Slices are the most common array-like type used in practice. Slices do not hold the data of their contents directly, but rather through a pointer.
Slices can be used to represent a sub-array. A slice can be created using the []
operator on an array-like type, but providing a range
instead of an integer. Note that the range is half-open, meaning the upper bound is not included.
arr := u32.[1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
slice: [] u32 = arr[4 .. 7];
for slice {
println(it); // Prints 5, 6, 7
}
All array-like types implicitly cast to a slice. The following function works on fixed-size arrays, slices, and dynamic arrays.
product :: (elems: [] $T) -> T {
result := 1;
for elems do result *= it;
return result;
}
data := .[1, 2, 3, 4];
println(product(data));
println(product(data[2 .. 4]));
println(product(Array.make(data)));
Dynamic Arrays
Dynamic arrays have a variable size that can be changed after they are created. A [..] T
is a dynamic array of T
. Functionality for dynamic arrays is provided in the Onyx standard library in the Array
structure, which allows for using methods on dynamic arrays.
use core {println}
arr: [..] i32;
Array.init(&arr);
defer Array.free(&arr);
for 0 .. 10 {
// Explicitly using Array structure
Array.push(&arr, it);
// OR
// Implicitly using method calls
arr->push(it)
}
for arr {
println(it);
}
See the Array structure for a full list of the provided functions.
Dynamic arrays store an Allocator to know how to request more memory for the array. By default context.allocator
is used. However, an alternate allocator can be specified in Array.make
or Array.init
.
Because dynamic arrays are so common and useful, Onyx provides some operator overloads for dynamic arrays. The most useful is <<
, which is used to append elements.
// Same example as above.
use core {println}
// Dynamic arrays are safely and automatically allocated
// on the first push, so there is no need to explicitly
// allocate them if you are using context.allocator.
arr: [..] i32;
defer arr->free();
for 0 .. 10 {
// No need to take the address of 'arr'.
arr << it;
}
for arr {
println(it);
}
Structures
Structures are the record type in Onyx. A structure is declared using the struct
keyword and is normally bound to a symbol. Members of a structure are declared like declarations in a procedure.
Point :: struct {
x: i32;
y: i32;
}
Accessing Members
Member access is done through the .
operator. Note that accessing a member on a pointer to a structure uses the same .
syntax.
p: Point;
p.x = 10;
p.y = 20;
ptr := &p;
ptr.x = 30;
Structure Literals
Structure literals are a quicker way of creating a value of a struct type. They have the form, Type.{ members }
. The members
can be partially or completely named. The same rules apply for naming members as for naming arguments when calling a procedure. If a value is not provided for a member, and no default value is given in the structure, a zeroed value is used.
// Naming members
p1 := Point.{x=10, y=20};
// Leaving out names. Follows order of members declared in the structure.
p2 := Point.{10, 20};
Defaulted Members
Members can be given default values. These values are used in structure literals if no other value is provided for a member. They are also used by __initialize
to initialize a structure.
Person :: struct {
name: str = "Joe";
// If the type can be inferred, the type can be omitted.
age := 30;
}
sally := Person.{ name="Sally", age=42 };
println(sally);
// Because name is omitted, it defaults to "Joe".
joe := Person.{ age=31 };
println(joe);
// Leaving out all members simply sets the members with initializers to
// their default values, and all other members to zero.
joe2 := Person.{};
println(joe2);
Directives
Structures have a variety of directives that can be applied to them to change their properties. Directives go before the {
of the structure definition.
Directive | Function |
---|---|
#size n | Set a minimum size |
#align n | Set a minimum alignment |
#pack | Disable automatic padding |
#union | All members are at offset 0 (C union) |
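For example, here is a sketch of a packed structure; the layout note in the comment assumes the usual C-like alignment rules that #pack disables.
// Without #pack, three bytes of padding would normally be inserted
// after 'kind' so that 'length' starts on a 4-byte boundary.
Packet_Header :: struct #pack {
    kind:   u8;
    length: u32;
}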
Polymorphic Structures
Structures can be polymorphic, meaning they accept a number of compile time arguments, and generate a new version of the structure for each set of arguments.
// A 2d-point in any field.
Point :: struct (T: type_expr) {
x, y: T;
}
Complex :: struct {
real, imag: f32;
}
int_point: Point(i32);
complex_point: Point(Complex);
Polymorphic structures are immensely useful when creating data structures. Consider this binary tree of any type.
Tree :: struct (T: type_expr) {
data: T;
left, right: &Tree(T);
}
root: Tree([] u8);
When declaring a procedure that accepts a polymorphic structure, the polymorphic variables can be explicitly listed.
HashMap :: struct (Key: type_expr, Value: type_expr, hash: (Key) -> u32) {
// ...
}
put :: (map: &HashMap($Key, $Value, $hash), key: Key, value: Value) {
h := hash(key);
// ...
}
Or they can be omitted and a polymorphic procedure will be created automatically. The parameters to the polymorphic structure can be accessed as though they were members of the structure.
HashMap :: struct (Key: type_expr, Value: type_expr, hash: (Key) -> u32) {
// ...
}
put :: (map: &HashMap, key: map.Key, value: map.Value) {
h := map.hash(key);
// ...
}
Structure Composition
Onyx does not support inheritance. Instead, a composition model is preferred. The use
keyword specifies that all members of a member should be directly accessible.
Name_Component :: struct {
name: str;
}
Age_Component :: struct {
age: u32;
}
Person :: struct {
use name_comp: Name_Component;
use age_comp: Age_Component;
}
// 'name' and 'age' are directly accessible.
p: Person;
p.name = "Joe";
p.age = 42;
println(p);
Sub-Type Polymorphism
Onyx supports sub-type polymorphism, which enables a safe and automatic conversion from the pointer type &B
to &A
if the following conditions are met:
- The first member of B is of type A.
- The first member of B is used.
Person :: struct {
name: str;
age: u32;
}
Joe :: struct {
use base: Person;
pet_name: str;
}
say_name :: (person: &Person) {
printf("Hi, I am {}.\n", person.name);
}
joe: Joe;
joe.name = "Joe";
// This is safe, because Joe "extends" Person.
say_name(&joe);
In this example, you can pass a pointer to Joe
when a pointer to Person
is expected,
because the first member of Joe
is a Person
, and that member is used.
Enumerations
Enumerations or "enums" give names to values, resulting in cleaner code. Enums in Onyx are declared much like structures.
Color :: enum {
Red;
Green;
Blue;
}
col := Color.Red;
Notice that enums use ;
to delimit members.
By default, enum members are automatically assigned incrementing values, starting at 0. So above, Red
would be 0, Green
would be 1, Blue
would be 2. The values can be overridden if desired. A ::
is used because these are constant bindings.
Color :: enum {
Red :: 123;
Green :: 456;
Blue :: 789;
}
Values are automatically incremented from the previous member if no value is given.
Color2 :: enum {
Red :: 123;
Green; // 124
Blue; // 125
}
Values can also be expressed in terms of other members.
Color3 :: enum {
Red :: 123;
Green :: Red + 2;
Blue :: Red + Green;
}
By default, enum values are of type u32
. This can also be changed by specifying the underlying type in parentheses after the enum
keyword.
Color :: enum (u8) {
Red; Green; Blue;
}
Enums can also represent a set of bit-flags, using the #flags
directive. In an enum #flags
, values are automatically doubled instead of incremented.
Settings :: enum #flags {
Vsync; // 1
Fullscreen; // 2
Borderless; // 4
}
settings: Settings;
settings |= Settings.Vsync;
settings |= Settings.Borderless;
println(settings);
As a convenience, when accessing a member on an enum
type, if the type can be determined from context, the type can be omitted.
Color :: enum {
Red; Green; Blue;
}
color := Color.Red;
// Because something of type Color only makes
// sense to compare with something of type Color,
// Red is looked up in the Color enum. Note the
// leading '.' in front of Red.
if color == .Red {
println("The color is red.");
}
Tagged Unions
Tagged unions in Onyx can be thought of as an enum
, with every variant having a different type associated with it.
When the tagged union is one variant, it is storing a value of the corresponding type.
A value can only be one of the variants at a time.
They are written using the union
keyword and look much like structures.
Here is an example of a tagged union.
Value :: union {
// First variant, called Int, stores an i32.
Int: i32;
// Second variant, called String, stores a str.
String: str;
// Final variant, called Unknown, stores "void", meaning it does not store anything.
Unknown: void;
}
This union has three variants called Int
, String
, and Unknown
.
They store an i32
, str
and nothing respectively.
Internally there is also an enum
made to store these variant tags. You can access it using Value.tag_enum
.
To create a value out of a union type, it looks like a structure literal, except there must be exactly one variant listed by name, with its corresponding value.
v1 := Value.{ Int = 123 };
v2 := Value.{ String = "string value" };
v3 := Value.{ Unknown = .{} }; // To spell a value of type 'void', you can use '.{}';
We create three values, one for each variant of the union.
To get access to the values inside of the tagged union, we have two options. Using a switch
statement, or using variant access.
We can use switch
statement over our tagged union value, and use a capture to extract the value stored inside.
print_value :: (v: Value) {
switch v {
// `n` is the captured value
// Notice we use `.Int`. This is short for `Value.tag_enum.Int`.
case .Int as n {
printf("It's an integer with value {}.\n", n);
}
case .String as s {
printf("Its a string with value {\"}.\n", s);
}
// All other cases will be unhandled
// This is still necessary to satisfy exhaustive matching
case _ ---
}
}
print_value(v1);
print_value(v2);
print_value(v3);
We can also directly access the variant on the tagged union. This gives us an optional of the corresponding type.
If the current variant matched, we get a Some
. If not, we get a None
.
println(v1.Int); // prints Some(123)
println(v1.String); // prints None
You can use the features of Optional
s to work with these results.
Polymorphic unions
Like structures, unions can be polymorphic and take type parameters.
A good example is the Result
type from the standard library.
It is defined as:
Result :: union (Ok_Type: type_expr, Err_Type: type_expr) {
Ok: Ok_Type;
Err: Err_Type;
}
These work exactly like polymorphic structures when it comes to using them in procedure definitions and the like.
// Returns an optional of the error type of the result.
// This is entirely redundant, since `result.Err` would give the same result.
get_err :: (result: Result($Ok, $Err)) -> ? Err {
return result.Err;
}
Distinct
Distinct types wrap another type in a new distinct type, which allows for strong type checking and operator overloads. Consider this example about representing a timestamp.
use core {println}
Time :: #distinct u32
Duration :: #distinct i32
#operator - (end, start: Time) -> Duration {
return Duration.{cast(u32) end - cast(u32) start};
}
start := Time.{1000};
end := Time.{1600};
duration := end - start;
println(typeof duration);
println(duration);
With distinct types, more semantic meaning can be given to values that otherwise would be nothing more than primitives.
Distinct types can be cast directly to their underlying type, and vice versa. Distinct types cannot be cast directly to a different type.
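For example, a small sketch reusing the Time type from above:
t := Time.{1000};
raw := cast(u32) t;            // distinct type to its underlying type
t2 := cast(Time) (raw + 500);  // underlying type back to the distinct type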
It should be noted that when a distinct type is made, none of the operators defined for the base type are defined for the new type. In the previous example, two Time
values would not be comparable unless a specific operator overload was provided.
Time :: #distinct u32
#operator == (t1, t2: Time) => cast(u32) t1 == cast(u32) t2;
Procedure types
Procedure types represent the type of a procedure. They are used when passing a procedure as an argument, or storing a procedure in a variable or structure member. They are written very similarly to procedures, except they must have a return type, even if that type is void.
map :: (x: i32, f: (i32) -> i32) -> i32 {
return f(x);
}
// Explicit version of a procedure
println(map(10, (value: i32) -> i32 {
return value * 2;
}));
Using procedure types for parameters enables quick procedures to be passed.
map :: (x: i32, f: (i32) -> i32) -> i32 {
return f(x);
}
// Quick version of a procedure
// Because 'map' provides the type of the argument
// and return value, this quick procedure can be passed.
println(map(10, x => x * 2));
As a convenience, procedure types can optionally have argument names to clarify what each argument is.
handle_player_moved: (x: i32, y: i32, z: i32) -> void
// Elsewhere in the code base.
handle_player_moved = (x, y, z) => {
println("Player moved to {} {} {}\n", x, y, z);
}
handle_player_moved(10, 20, 30);
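As a sketch of the structure-member case, here is a made-up Button type that stores a click handler; the names are purely illustrative.
Button :: struct {
    label:    str;
    on_click: (i32) -> void;
}

b := Button.{
    label    = "Submit",
    on_click = (id: i32) { printf("Button {} was clicked.\n", id); }
};

b.on_click(1);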
Optional
Optional types in Onyx represent something that may or may not contain a value.
They are simply written as ? T
, where T
is the type that the optional may contain.
You may wonder why it's written as ? T instead of T?. This is to prevent ambiguity. For example, if you see [] T?, is this an optional slice of T (([] T)?), or a slice of optional T ([] (T?))? To avoid this problem, the optional specifier is placed in the same position as the pointer/slice/etc. specifiers.
Internally, Optional types are simply defined as a polymorphic union called Optional
in the builtin
package.
They are just given the special ? T
syntax to mean Optional(T)
. These are equivalent.
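A small sketch of that equivalence:
x: ? i32;          // syntactic sugar for the line below
y: Optional(i32);  // exactly the same type as '? i32'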
Using Optionals
Optionals have been designed to be used ergonomically within your codebase, without much overhead.
Here is an incorrect example of a function that gets the last element of an array. It is incorrect because it does not correctly handle the case where the array is empty.
array_last :: (arr: [] $T) -> T {
return arr[arr.count - 1];
}
array_last(.[1, 2, 3]) // returns 3
array_last(.[]) // undefined behavior
To make this correct, all we need to change is the return type to ? T
, and add a check for an empty array.
The correct code would look like so.
array_last :: (arr: [] $T) -> ? T {
// If the array is empty. Equivalent to arr.count == 0
if !arr {
// Return an empty instance of ? T, which is a None.
return .{};
}
// This will implicitly cast from a T to a ? T.
return arr[arr.count - 1];
}
array_last(.[1, 2, 3]); // returns Some(3)
array_last(.[]); // returns None
If we wanted to get the value stored in an optional, we have a couple of options. We could,
- Use one of the builtin methods on Optional, like unwrap, value_or, or or_else.
- Use the try operator (?), which gets the value or returns .{} from the nearest block.
- Use the coalesce operator (??).
- Use a switch statement with a capture.
// `o` is an optional i32.
o: ? i32 = 123;
v1 := o->unwrap(); // Cause an assertion failure if it doesn't exist
v2 := o->value_or(0); // Provide a default value
v3 := o->or_else(() => 0); // Provide a default value, wrapped in a function
v4 := o?; // If no value exists, execute `return .{}`;
v5 := o ?? 0; // Equivalent to `->value_or(0)`;
switch o {
case .None {
// No value
}
case .Some as v6 {
// v6 is the value
}
}
Directives
"Directives" is the generic word for the special meanings keyword-like things that control isolated aspects of the language.
This section describes most of the directives in the language, while some directives are described in more relevant parts of the documentation.
#inject
Note, the below documentation about
#inject
is out of date and will be removed in the future. This is because #inject
is now unnecessary with recent changes to the language. You can read more about the new syntax on the Bindings page, under the "Targeted Bindings" section.
#inject
is a special directive compared to most others, because it enables
many very powerful features in the language.
The basic idea of #inject
is that it injects symbols into a scope, from
anywhere in the code base. Using this, you can add methods to structures,
add symbols and overloads to a package, and even declare new global types
and variables.
Syntax
The inject directive can take on two forms: the singular form, and the block form.
In the singular form, you simply write #inject
before a binding, but that binding's
target can be nested inside of anything with a scope. For example, here is one way
of adding a method to a structure.
Vector2 :: struct {
x, y: f32;
}
// Without #inject here, you would get a parsing error.
#inject
Vector2.magnitude :: (v: Vector2) -> f32 {
return math.sqrt(v.x * v.x + v.y * v.y);
}
main :: () {
v := Vector2.{ 3, 4 };
println(v->magnitude());
}
Note, while it would be possible to change the syntax so you would not need to write #inject here, I think that could lead to some unexpected bugs. I have not tried it though, so it might turn out to be nice to use.
The powerful thing about #inject
is that the definition for Vector2.magnitude
does not have to be in the same file, or even the same package. It can even
be optionally defined with a static if. Using #inject
you can
define your own extensions to types provided from any library.
When you have many things to inject into the same place, you can use the block form
of #inject
. In this form, you write #inject
, then the thing to inject into, followed by
braces ({}
). Inside the braces, any binding you write will be turned into an
injected binding into that scope.
Vector2 :: struct {
x, y: f32;
}
#inject Vector2 {
add :: (v1, v2: Vector2) => Vector2.{ v1.x + v2.x, v1.y + v2.y };
sub :: (v1, v2: Vector2) => Vector2.{ v1.x - v2.x, v1.y - v2.y };
mul :: (v1, v2: Vector2) => Vector2.{ v1.x * v2.x, v1.y * v2.y };
}
main :: () {
v1 := Vector2.{ 3, 4 };
v2 := Vector2.{ 5, 6 };
println(v1->add(v2)); // Using method call syntax
println(Vector2.sub(v2, v1)); // Using explicit syntax
}
Limitations
You can inject into any place a binding can appear, with the exception of procedure bodies. Procedure bodies are isolated to prevent confusion.
But, this means you can inject into any of these things:
- Packages
- Structures
- Unions
- Enums
- (probably more, but I am forgetting them at the time of writing)
Making global bindings
To prevent name clutter, Onyx intentionally places every binding into a package.
See the Packages section for more details. But sometimes,
you want to make a true global binding, one that does not require using any package
to access. To do this, you can simply #inject
into the builtin
package.
This is because the builtin
package is special. Its public scope is actually mapped
to the global scope of the program. This makes it so everything useful defined in
builtin
is always accessible to you, things like make
, new
, delete
and context
.
But because of this, you can #inject
into it to define your own globals for your program.
One example use of this is the logf
function. It is only defined if you are using a
runtime with the core.conv
package defined. To do this, there is an #inject
into
builtin
in core/conv/format.onyx
for the logf
symbol. This way it is always accessible,
but only if you are using a runtime with formatting capabilities.
#if
#if
is a compile-time if statement. It looks like a normal if
statement, except its condition must be resolvable at compile time.
This is because it controls whether or not the body of the #if
statement is included in the compilation.
Static-ifs can be used inside and outside of procedures.
Outside of Procedures
When outside of a procedure, static-ifs can be used to control whether or not certain symbols are defined in the current compilation.
DEBUG_MODE :: true
#if DEBUG_MODE {
// This procedure is only defined if DEBUG_MODE is true
debug_only_procedure :: () {
// ...
}
} else {
// This procedure is only defined if DEBUG_MODE is false
not_a_debug_procedure :: () {
}
}
Static-ifs can contain any top-level "thing" in the language: procedures, structures,
unions, and even #load
directives. Using this feature, you can optionally
include files depending on a condition.
USE_EXTENSIONS :: false
MINIMUM_EXTENSION_VERSION :: 5
#if USE_EXTENSIONS && MINIMUM_EXTENSION_VERSION >= 3 {
#load "./extensions/v3"
}
Inside of procedures
When inside a procedure, static-ifs can contain statements that will only be
included if the static-if resolves to be true
.
DEBUG_MODE :: true
draw :: () {
#if DEBUG_MODE {
draw_debug_ui();
}
// ...
}
Other uses
See the #defined
documentation for more uses of static-if statements.
#tag
#tag
is used to attach static metadata to various compile-time objects.
This metadata can then be accessed using the runtime.info
package.
To tag something, simply place one or more #tag
s before the binding.
The order of the tags is preserved when using them.
The metadata that is attached has to be a compile-time known value, because it will be serialized and placed in the data section of the resulting binary. It could be a numeric, string, structure or array literal for example.
Structures
Here is an example of a structure tagged with a string literal.
#tag "option:hidden"
Value :: struct {
// ...
}
To access tags on a structure, use the get_type_info
function from runtime.info
to
get the Type_Info
of the structure. Then use ->as_struct()
to convert it to a
Type_Info_Struct
. Then you can use the .tags
property to access the stored data.
It is simply an array of any
, which can be used with the utilities for any
found
in core.misc
.
use runtime.info { get_type_info }
use core { printf, misc }
main :: () {
info := get_type_info(Value)->as_struct();
for tag in info.tags {
if tag.type == str {
value := * misc.any_as(tag, str);
printf("Value: {}\n", value);
}
}
}
Structure members
Value :: struct {
#tag "a value"
member_name: i32;
}
To access tags on a structure member, do the same steps as above, and then use
the members
array on the Type_Info_Struct
. On each member's info there is a tags
array that contains all the tags defined on the member, in the order they were
defined.
use runtime.info { get_type_info }
use core { printf, misc }
main :: () {
info := get_type_info(Value)->as_struct();
for member in info.members {
for tag in member.tags {
if tag.type == str {
value := * misc.any_as(tag, str);
printf("Value: {}\n", value);
}
}
}
}
Unions
Tags on unions behave in exactly the same manner as tags on structures.
Union Variants
Tags on union variants behave in exactly the same manner as tags on structure members.
Procedures
Tag information for procedures is located in the runtime.info.tagged_procedures
array.
You can either loop through this array manually, or you can use the helper procedure
runtime.info.get_procedures_with_tag
.
use runtime.info {get_procedures_with_tag}
use core {printf}
Metadata :: struct { name: str }
#tag Metadata.{ "name is foo" }
foo :: () { }
#tag Metadata.{ "name is bar" }
bar :: () { }
main :: () {
// Provide the type of the tag.
procs := get_procedures_with_tag(Metadata);
for p in procs {
printf("Procedure is: {}\n", p.func);
printf("Procedure type is: {}\n", p.type);
printf("Tag is: {}\n", p.tag);
printf("Procedure is in package: {}\n", p.pack);
}
}
Globals
Like tagged procedures, tagged global information lives in runtime.info.tagged_globals
.
You can either loop through it directly, or use the helper procedure runtime.info.get_globals_with_tag
.
use runtime.info {get_globals_with_tag}
use core {printf}
Metadata :: struct { name: str }
#tag Metadata.{ "name is foo" }
foo: i32
main :: () {
// Provide the type of the tag.
globs := get_globals_with_tag(Metadata);
for g in globs {
printf("Global address is: {}\n", g.data);
printf("Global type is: {}\n", g.type);
printf("Tag is: {}\n", g.tag);
printf("Global is in package: {}\n", g.pack);
}
}
#export
#export
adds a procedure to the export-list of the compiled WebAssembly
binary. This is a crucial piece of functionality when trying to use Onyx
in other environments, such as from JS or in plugin systems.
The syntax for #export
looks like this.
#export "export name here" exported_procedure
The name provided must be a compile-time string. The exported procedure can either be a reference to a procedure, or a procedure literal itself.
#export "add" (x, y: i32) -> i32 {
return x + y;
}
#foreign
The #foreign
directive is used to tell the compiler that a function is defined outside
of this program. Because Onyx compiles to WebAssembly, this means that the function will
be added to the import section of the WASM module. You can read more about what that
means here.
The #foreign
directive can appear in two different places, depending on which is
more convenient.
The first position it can appear in is directly after the return type of a function. In this position, it must be followed by two compile-time known strings that are the module and import name. This terminology is inherited from the WebAssembly specification.
external_procedure :: (arg1: i32, arg2: i32) -> i32 #foreign "host" "add" ---
In this example, host
is the module name, and add
is the import name.
Foreign blocks
The other position #foreign
can appear in is foreign-blocks.
In this form, you can declare many foreign procedures at once, so long
as they all have the same module name, and each import name matches the
name given to the procedure in Onyx.
#foreign "host" {
add :: (arg1: i32, arg2: i32) -> i32 ---
sub :: (arg1: i32, arg2: i32) -> i32 ---
mul :: (arg1: i32, arg2: i32) -> i32 ---
}
In this example, add
, sub
, and mul
are all foreign procedures with the module
name host
. They have the import names add
, sub
, and mul
respectively.
We can validate this using the wasm-objdump
tool from the WebAssembly Binary Toolkit.
We also have to compile in a special way to not clutter the output with
the imports that come from the standard library.
$ onyx build -r custom -o example.wasm example.onyx core/runtime/default_link_options.onyx
$ wasm-objdump -x -j import example.wasm
example.wasm: file format wasm 0x1
Section Details:
Import[3]:
- func[0] sig=0 <host.add> <- host.add
- func[1] sig=0 <host.sub> <- host.sub
- func[2] sig=0 <host.mul> <- host.mul
When using Onyx from the command line with
onyx run
, or when running with the WASI backend, these foreign functions will be resolved for you. However, when using JavaScript as your runtime, you will need to provide definitions for each imported procedure. See this MDN article for more details.
#file_contents
#file_contents
can be used to add the contents of a file (text or binary) into
the data section of the outputted WebAssembly binary, and gives you access to it
as a [] u8
.
You can use this to embed anything in the binary that you would have had to put in a string literal, or load at runtime.
image_data := #file_contents "image/path/here.png";
pixels := convert_image_to_pixels(image_data);
This way, no file I/O is needed to load the image from disk; it is already in the binary, ready to be used.
#defined
When a symbol may or may not be defined due to different compilation flags,
you can use #defined
to test whether or not it is actually defined.
#defined
looks like a procedure with a single argument, which evaluates
at compile-time to a boolean expression.
use core {println}
main :: () {
main_is_defined := #defined(main); // true
foo_is_defined := #defined(foo); // false
}
One useful feature of #defined
is that you can use it to test if a
package is defined in the program. This way, you can test for optional
extensions in your program, without relying on using the correct flags.
#if #defined(package foo) {
// We know foo is defined, we can write a procedure that uses it
uses_foo :: () {
use foo;
foo.bar();
}
}
Using with #if
#defined
is generally used with #if
to conditionally include things
depending on if something else was or was not defined.
As an example, you could have a set of procedures that can be overridden by the end-user
of your library. But if they want to use the defaults, those can still be defined
automatically. A combination of targeted bindings, #defined
, and #if
makes this work well.
In the library, you would use #if
and #defined
to test if a certain flag was defined.
package your_library
// Use predefined procedures if user did not override them.
#if !#defined(CUSTOM_PROCEDURES) {
do_thing_one :: () { println("Default thing 1!"); }
do_thing_two :: () { println("Default thing 2!"); }
}
Then the consumer of the library can use targeted bindings to define the flag and functions if necessary.
package main
use your_library
// Override procedures with targeted binding.
your_library.CUSTOM_PROCEDURES :: true
your_library.do_thing_one :: () { println("Overridden thing 1!"); }
your_library.do_thing_two :: () { println("Overridden thing 2!"); }
main :: () {
your_library.do_thing_one(); // Overridden thing 1!
your_library.do_thing_two(); // Overridden thing 2!
}
#persist
#persist
is used to make a static global variable in places that normally would not have
static global variables.
You can define a persistent or static variable in a procedure like so.
count :: () -> i32 {
// Persistent variables are global variables
// constrained to the current scope.
#persist counter: i32;
counter += 1;
return counter;
}
main :: () {
for 100 {
println(count());
}
}
You can define a persistent variable in a structure body, where it will be accessible using the structure name as a namespace.
Foo :: struct {
#persist foo_counter: i32;
name: str;
make :: () -> Foo {
Foo.foo_counter += 1;
return Foo.{ tprintf("Foo #{}\n", Foo.foo_counter) };
}
}
main :: () {
f1 := Foo.make();
f2 := Foo.make();
println(f1); // Foo #1
println(f2); // Foo #2
println(Foo.foo_counter);
}
#thread_local
#thread_local
is used to define a global variable as thread-local.
Thread-local variables are not shared across threads; every thread gets its own copy.
use core.thread
use core.iter
use core {println}
#thread_local
counter: i32;
thread_task :: (_: rawptr) {
for 0 .. 10000 {
counter += 1;
}
println(counter);
}
main :: () {
threads := iter.as_iter(0 .. 16)
|> iter.map(_ => {
t := new(thread.Thread);
thread.spawn(t, cast(&void) null, thread_task);
return t;
})
|> iter.collect();
for t in threads {
thread.join(t);
}
}
Note, this example will not work on the Onyx Playground, because it uses multi-threading, which is not supported there.
This program will print 10000
, sixteen times, since each thread
has its own copy of counter
.
#doc
#doc
is used to provide doc-strings to bindings. They can appear before
most "things" in the language, like structures, unions, procedures, enums, etc.
To use them, simply write #doc
followed by a compile-time string.
#doc "This is the documentation for the 'procedure_a'."
procedure_a :: () {
// ...
}
#doc """
This multi-line string literal is the documentation
for procedure_b.
"""
procedure_b :: () {
// ...
}
Note that you can only have one #doc
directive per binding.
These doc-strings are included in the generated .odoc
file when compiled
with the --doc
flag. This binary file is used by onyx-doc-gen
to generate HTML documentation for the current compilation. This file can
also be easily deserialized into a structure you can work with in Onyx
like so.
use core.encoding.osad
use core.doc
use core.os
contents := os.get_contents("documentation.odoc");
docs := osad.deserialize(doc.Doc, contents)->unwrap();
// See core/doc/doc.onyx for what is inside of `docs`
#deprecated
You can use #deprecated
on a procedure to cause a warning whenever it is called.
It is not a compile error, but it will show the deprecation message when the program
is compiled.
Here is how to use it.
an_old_procedure :: (x, y: i32) -> i32
#deprecated "This is the deprecation message. Include relevant replacement info here."
{
// ...
}
The #deprecated
directive goes after the return type and before the start of the function body.
It must be followed by a compile-time known string that explains how to migrate away
from using the deprecated function.
Currently, #deprecated
can only appear on procedures. While it could be useful on types,
it is currently not supported.
#init
#init
allows you to define procedures that run before the main
in your program.
This lets you do simple setup and initialization before main
is reached.
#init
must be followed by a compile-time known procedure with the type signature () -> void
.
#init () {
println("In #init procedure!");
}
main :: () {
println("In main!");
}
// Output:
// In #init procedure!
// In main!
You are guaranteed that the runtime has been fully initialized before any #init
procedure is invoked. This way, you know that printing and heap allocations will
work from #init
procedures.
Ordering with #after
The order of #init
procedures is undefined, and may change whenever you change your program.
However, by using the #after
directive, you can specify a dependency of an #init
procedure; that dependency is guaranteed to be executed before the procedure in question.
global_map: Map(str, i32);
// Bind the #init statement to a symbol.
prepare_map :: #init () {
global_map = make(Map(str, i32));
}
populate_map :: #init #after prepare_map () {
global_map->put("A", 1);
global_map->put("B", 2);
global_map->put("C", 3);
}
In this example, prepare_map
is guaranteed to be run before populate_map
because
of the #after
directive on populate_map
.
You can specify as many #after
directives as you want on a single #init
procedure.
#init
#after A
#after B
#after C
() {
// ...
}
In this example, the #init
procedures A
, B
and C
will be run before this #init
procedure.
#error
#error
is used to produce a static, compile-time error.
To use it, simply place it outside of any procedure, and include a compile-time string that is the error message.
#error "This is a static error the prevents this program from compiling."
main :: () {
}
#error
by itself is almost useless, but when combined with
#if
, you can achieve something like a static-assertion.
#if !#defined(something_important) {
#error "'something_important' must be defined to compile."
}
#this_package
This directive is a small hack that can be used when writing macros.
Because macros do not have normal scoping, it can be difficult
to reference something that is defined in the same package as the
macro, since when the macro is expanded it might not be visible.
#this_package
is used to represent the current file's package as an
object in which you can look things up.
internal_details :: (x: rawptr, T: type_expr) {
// ...
}
useful_macro :: macro (x: & $T) {
#this_package.internal_details(x, T);
}
This pattern is very common in the core libraries of Onyx, where you have a macro that takes a pointer to anything, but it gets expanded to a procedure call that simply passes the pointer and the type of the value.
This has to use #this_package
because internal_details
is not going
to be directly accessible when the macro is expanded. But by specifying
that it needs to be looked up in the current package, this problem can
be avoided.
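For completeness, here is a small usage sketch of the macro defined above. The call site never references internal_details directly; the expanded macro body finds it through #this_package.
main :: () {
    value := 10;
    // Expands to #this_package.internal_details(&value, i32);
    useful_macro(&value);
}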
#wasm_section
When producing the final WebAssembly file, custom sections can be included to add metadata to the binary.
The compiler already produces some of these like names
, and producers
.
Custom sections can be specified in an Onyx program by using the #wasm_section
directive.
This directive is followed by the custom section name and the contents of the custom section,
as compile-time strings.
#wasm_section "my-custom-section" "Custom section data here."
#wasm_section "another-section" #file "path/to/custom/data"
Miscellaneous
Format Strings
When specifying the format string for printf
or conv.format
, there are a number of options you can
use to configure how the resulting string will be formatted. Format specifiers are specified between
curly-braces ({}
) in the format string. There is a one-to-one mapping between the number of curly-braces
and arguments provided to conv.format
, at least at the moment.
This table provides brief definitions of what can appear between the curly braces.
| Symbol | Use |
|---|---|
| * | If the variable is a pointer, dereference the pointer and format the result |
| p | Pretty formatting |
| .N | Sets the decimal precision when formatting a float to N digits |
| bN | Sets the base when formatting an integer to N |
| x | Shorthand for b16 |
| wN | Left-pad to N characters long (this might not work for everything) |
| " | Quote strings in double quotes. Quotes are only added to str values |
| ' | Quote strings in single quotes. Quotes are only added to str values |
| d | Disable printing enums as strings and print them as numbers instead |
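Here is a short illustrative sketch of a few of these specifiers in use; the exact output may vary slightly between compiler versions.
use core {printf}
main :: () {
    printf("{.2}\n", 3.14159);   // ".N": two digits of decimal precision
    printf("{b2}\n", 10);        // "bN": format the integer in base 2
    printf("{x}\n", 255);        // "x": shorthand for b16
    printf("{w6}\n", 42);        // "wN": left-pad to 6 characters
}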
Reflection
Reflection provides the ability for a program to introspect itself and perform different operations dynamically at runtime, based on a value's type or on metadata in the compiled program.
In Onyx, reflection is available through the runtime.info
package.
This package provides utility functions for accessing all type information
and metadata (tags) stored in the binary.
Types are Values
Every type in Onyx is given a unique ID at compile time. This ID is not stable, so a separate compilation may choose a different ID for the same nominal type. By having a single integer for every type, Onyx's types can be runtime values as well as compile time values.
In the example below, t
is a variable that stores a type.
main :: () {
t := i32;
println(t); // Prints i32
println(typeof t); // Prints type_expr, aka the type of a type
}
Under the hood, t
is simply storing a 32-bit integer that is the
unique ID of i32
.
any
This ability to have types as runtime values enables any
in Onyx.
any
is a dynamically typed value, whose type is known at runtime,
instead of at compile-time. Under the hood, any
looks like this:
any :: struct {
data: rawptr;
type: type_expr;
}
As you can see, it stores a data pointer and a runtime-known type.
Every any
points to a region of memory where the value is actually stored.
You can think of any
like a "fat-pointer" that stores the pointer, plus the type.
any
is typically used as an argument type on a procedure.
When a parameter has any
as its type, the compiler will implicitly
wrap the corresponding argument in an any
, placing the argument on the stack,
and constructing an any
using the pointer to the stack and the type of the
argument provided.
uses_any :: (value: any) {
println(value.type);
}
main :: () {
uses_any(10); // Prints i32
uses_any("Hello"); // Prints str
uses_any(context); // Prints OnyxContext
}
any
can also be used for variadic arguments of different types.
/// Prints the type of each of the given values.
many_args :: (values: ..any) {
for value in values {
printf("{} ", value.type);
}
}
main :: () {
many_args(10, "Hello", context);
// Prints: i32 str OnyxContext
}
To use the data inside of an any
, you have to write code that handles the different
types, or kinds of types that you expect. You can either check for concrete types explicitly,
or use runtime type information to handle things dynamically. To get the type information for a given type,
use the runtime.info.get_type_info
procedure, of the info
method on the type_expr
.
print_size :: (v: any) {
size := switch v.type {
case i32 => 4
case i64 => 8
case str => 8
case _ => -1
};
printf("{} is {} bytes.\n", v.type, size);
}
main :: () {
print_size(10);
print_size("Hello");
print_size(context);
}
In this contrived example, print_size
checks the type of the any
against explicit
types using a switch expression, defaulting to -1 if the type is not one of them.
For some applications of any
this is perfectly acceptable, but for others, a more
generalized approach might be necessary. In such cases, you can use runtime type information to introspect the type.
Using Runtime Type Information
Baked into every Onyx compilation is a type table. This table contains information on every type in the Onyx program, from the members of structures, to the variants of unions, to which polymorphic structure was used to create a structure.
This information is stored in runtime.info.type_table
, which is a slice that contains
a &Type_Info
for each type in the program.
Type_Info
stores generic information for every type, such as the size.
When given a &Type_Info
, you will generally cast it to another type to get more
information out of it by using the kind
member.
In this example, when a structure type is passed in, the function will print all of the members of the structure, including their name, type, and offset.
print_struct_details :: (type: type_expr) {
info := type->info();
struct_info := info->as_struct(); // OR cast(&Type_Info_Struct) info
for member in struct_info.members {
printf("Member name : {}\n", member.name);
printf("Member type : {}\n", member.type);
printf("Member offset : {} bytes\n", member.offset);
printf("\n");
}
}
Foo :: struct {
first: str;
second: u32;
third: &Foo;
}
main :: () {
print_struct_details(Foo);
}
This prints:
Member name : first
Member type : [] u8
Member offset : 0 bytes
Member name : second
Member type : u32
Member offset : 8 bytes
Member name : third
Member type : &Foo
Member offset : 12 bytes
In this example, runtime type information is used to get the size of the type.
print_size :: (v: any) {
info := v.type->info();
size := info.size; // Every type has a known size
printf("{} is {} bytes.\n", v.type, size);
}
main :: () {
print_size(10);
print_size("Hello");
print_size(context);
}
JS Interop
Interfacing with JavaScript from Onyx is easy thanks to the core.js
package. It was inspired by
syscall/js
, made by the wonderful people on the Go team.
The core.js
package abstracts away the details of managing references to JS values from Onyx,
so you are able to write code that uses JS values without caring about all the internal details.
For example, here is a simple program that runs in a web browser. It creates a new button
element,
adds a click
event handler that will call an Onyx function, then adds the button to the page.
use core.js
main :: () {
// Lookup the document object in the global scope (i.e. window).
document := js.Global->get("document");
// Call createElement to make a new button, then set the text of the button.
button := document->call("createElement", "button");
button->set("textContent", "Click me!");
// Call addEventListener to handle the `click` event.
// Use js.func to wrap an Onyx function to be available from JS.
button->call("addEventListener", "click", js.func((this, args) => {
js.Global->call("alert", "Hello from Onyx!");
return js.Undefined;
}));
// Call appendChild on the body to insert the button on the page.
document->get("body")->call("appendChild", button);
}
While compiling this program, be sure to add the -r js
flag, as it specifies you are targeting
a JS runtime.
onyx build -o app.wasm -r js program.onyx
This will generate two files, app.wasm
and app.wasm.js
. The .js
file exists to allow you
to load and call your Onyx code from JS. Here is a simple HTML page that will load the
JS and start the program, which will in turn call main
.
<html>
<head>
<title>Onyx program</title>
<script type="module">
import Onyx from "/app.wasm.js"
let app = await Onyx.load("/app.wasm")
app.start() // Bootstrap program and call main
</script>
</head>
<body>
</body>
</html>
Load this in your favorite web browser from a local web server and you should see a button on the page. Click it to test the program!
Some internal details
There are some nuances worth mentioning about how this library is currently set up.
The .start()
method does start the program and invoke your main
function, but it also
does a little more: it bootstraps the standard library, preparing buffers and allocators
used by most Onyx programs. For this reason, even if you are not going to do anything in your
main program and solely want to use Onyx as auxiliary to your main code, you still need to call
the .start()
method; just leave the main
procedure empty.
When you want to invoke a specific Onyx function from JS, you have to do two things.
First, the procedure you wish to call has to have the following signature: (js.Value, [] js.Value) -> js.Value
.
The first argument is the this
implicit parameter. The second argument is a slice of
js.Value
s that are the actual arguments. Here is a simple add
procedure using this signature.
use core.js
add :: (this: js.Value, args: [] js.Value) -> js.Value {
a := args[0]->as_int() ?? 0;
b := args[1]->as_int() ?? 0;
res := js.Value.from(a + b);
return res;
}
Second, export the procedure from Onyx using the #export
directive.
#export "add" add
Then, you can use the .invoke()
method to invoke the procedure with an arbitrary number of arguments.
app.invoke("add", 123, 456); // Returns 579
As a slight aid, if you forget to call .start()
, .invoke()
will automatically call it for you the
first time. So, if you use invoke and are wondering why the main
of your program is executing,
you likely forgot to call start.
Understanding the API
The API provided by core.js
is a very thin wrapper around normal JS operations.
The best way to understand it is to understand what each of the methods does in JS.
Once you understand how each JS operation maps to the corresponding Onyx method,
it is relatively easy to translate JS code into Onyx.
Value.new_object
Creates a new empty object. Equivalent of writing {}
in JS.
Value.new_array
Creates a new empty array. Equivalent of writing []
in JS.
Value.from
Converts an Onyx value into a JS value, if possible.
Value.as_bool
, Value.as_float
, Value.as_int
, Value.as_str
Convert a JS value into an Onyx value, if possible.
Value.type
Returns the type of the JS value. Similar to typeof
in JS,
but it has sensible semantics.
Value.call
Calls a method on an object, passing the object as the this
argument. x->call("y", "z")
is equivalent to x.y("z")
in JS.
Value.invoke
Invokes a function, passing null
as the this
argument.
x->invoke("y")
is equivalent to x("y")
in JS.
Value.delete
Invokes the delete
operator from JS on the property of the object.
Value.new
Invokes the new
operator on the value.
Value.get
x->get("y")
is equivalent to writing x.y
in JS.
Value.set
x->set("y", 123)
is equivalent to writing x["y"] = 123
in JS.
Value.length
Shorthand for x->get("length")->as_int() ?? 0
, since this operation is so common.
Value.index
x->index(y)
is equivalent to writing x[y]
in JS.
Value.instance_of
x->instance_of(y)
is equivalent to writing x instanceof y
in JS.
Value.equals
Returns true
if two values are equal.
Value.is_null
Returns if the value contained is null
.
Value.is_undefined
Returns if the value contained is undefined
.
Value.is_nan
Returns if the value contained is NaN
.
Value.truthy
Returns true
if the value is considered "truthy" under JS's semantics.
Value.leak
Removes the value from the tracked pool of objects, so it will not automatically be freed.
Value.release
Frees the JS value being stored. After calling this the value should not be used anymore.
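To tie a few of these together, here is a hedged sketch (compiled with -r js) that uses only operations listed above; it assumes set and call accept plain Onyx values, as the button example earlier does with strings.
use core.js
use core {println}
main :: () {
    // Value.new_object: equivalent of writing {} in JS.
    obj := js.Value.new_object();
    // Value.set: obj["greeting"] = "hello"
    obj->set("greeting", "hello");
    // Value.get + Value.as_str: read obj.greeting back as an Onyx string.
    println(obj->get("greeting")->as_str() ?? "");
    // Value.get + Value.call: JSON.stringify(obj)
    json := js.Global->get("JSON")->call("stringify", obj);
    println(json->as_str() ?? "");
    // Value.release: free the tracked references when finished.
    json->release();
    obj->release();
}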
Defining your own JS module
This documentation will be coming soon!