Variables and scaffolds
Welcome back! Today, I wanted to talk about variables in Ribbon. But first, let's review my motivation!
The ultimate variable
In Rust, there is an ultimate variable type. It's an Arc<RwLock<T>>. It has the most utility of any variable type. But what is it?
Broken down:
- Arc means atomically reference counted. This means that the value can be shared by references without deep copying, and these references are tracked (counted), thus garbage collection is unnecessary. Additionally, this sharing can occur across different threads, since the reference counting is thread-safe (atomic).
- RwLock is a thread-safe read-write lock. It's a lock, similar to a mutex, that provides interior mutability. This means that even when sharing references, we do not violate Rust's strict rule that a mutable reference must be unique, since obtaining a mutable reference can only be done (safely) by going through the RwLock.
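To make the breakdown concrete, here is a minimal Rust sketch of an Arc<RwLock<T>> shared between threads. The function name fill_shared_list is made up for this illustration; only the standard library is used.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Spawns `workers` threads that each push one number into a shared list.
/// (`fill_shared_list` is a hypothetical name used only for this sketch.)
fn fill_shared_list(workers: usize) -> Vec<usize> {
    // One allocation, shared by every handle cloned from `shared`.
    let shared: Arc<RwLock<Vec<usize>>> = Arc::new(RwLock::new(Vec::new()));

    let handles: Vec<_> = (0..workers)
        .map(|i| {
            // Cloning the Arc bumps the atomic count; the Vec is not copied.
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // `write()` gives exclusive access while the guard is alive.
                shared.write().unwrap().push(i);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // `read()` gives shared access; many readers may hold it at once.
    let mut result = shared.read().unwrap().clone();
    result.sort(); // Thread scheduling decides the push order, so sort.
    result
}

fn main() {
    println!("{:?}", fill_shared_list(4));
}
```

The only way to reach the Vec mutably is through write(), which is the "safe" path the second bullet refers to.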
But this is also the most expensive variable type. If you don't need sharing across threads, you can use a thread-unsafe alternative: Rc<RefCell<T>>. If you don't need shared mutability, you can use Arc<T> (thread-safe) or Rc<T> (thread-unsafe). If you don't need sharing at all, you can use the traditional T or Box<T>.
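For comparison, here is a small sketch of the single-threaded alternative, Rc<RefCell<T>>. The function name bump_through_two_handles is invented for this example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shares one counter between two handles on a single thread.
/// Returns the final value and the number of live handles.
fn bump_through_two_handles() -> (i32, usize) {
    let first: Rc<RefCell<i32>> = Rc::new(RefCell::new(0));
    // Non-atomic count increment: cheaper than Arc, but cannot cross threads.
    let second = Rc::clone(&first);

    // RefCell checks the borrow rules at runtime instead of taking a lock.
    *first.borrow_mut() += 1;
    *second.borrow_mut() += 10;

    let value = *first.borrow();
    (value, Rc::strong_count(&first))
}

fn main() {
    let (value, handles) = bump_through_two_handles();
    println!("value = {value}, handles = {handles}");
}
```

Both handles see the same counter, just like with Arc<RwLock<T>>, but the compiler will refuse to send either handle to another thread.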
This has always bothered me. The implementation you should pick comes simply from your needs.
Theoretically the compiler has all of the knowledge about a variable's usage, and could pick the
ideal implementation for you. If it cannot for whatever reason (e.g. the variable is exported and
therefore has unknown usage), then using the default Arc<RwLock<T>> (if mutable) or Arc<T> (if immutable) is always the best and safest choice.
From a programmer's perspective, it would be very nice if every variable were treated as Rust's Arc<RwLock<T>> type, and then, based on usage, the optimizer could choose a better implementation.
Now, there are some cases where a Rust programmer would tell you that they chose an implementation
because of a hard constraint. That is a valid concern in some cases. But even in this case, Rust's
approach doesn't explicitly state what that constraint was. For example, it's quite easy for a developer to refactor an Rc into an Arc just to "fix a bug", not realizing they defeated the whole point of using an Rc for a specific reason. I designed Ribbon to explicitly state these constraints.
Ribbon's variable semantics
In Ribbon, you can declare a variable as:
var x: T // Corresponds to Rust's `Arc<RwLock<T>>`
let x: T // Corresponds to Rust's `Arc<T>`
If, for whatever performance reason, you do not want atomic reference counting, and hence cannot
share the reference across threads, you would impose a constraint @Local.
var x: T and @Local // Corresponds to Rust's `Rc<RefCell<T>>`
let x: T and @Local // Corresponds to Rust's `Rc<T>`
If you want everything within a block to be local, you can just provide the constraint to the whole block.
@Local {
var x: T // Corresponds to Rust's `Rc<RefCell<T>>`
let x: T // Corresponds to Rust's `Rc<T>`
...
}
And if you want something in a @Local block to be thread-safe, you can use @Global inside. Additionally, if you need a variable to actually be stored by value (e.g. a plain T in Rust), you can use @Owned. Note that the @Local, @Global, and @Owned constraints may change names in the future as I refine their semantics.
Also note that the shared reference counting is only a semantic model of how variables work. Depending on usage or the particular data types involved, the optimizer may change their implementation to a more efficient one.
Copy on write
Similar to Swift, Ribbon uses copy-on-write semantics.
var a = big_list
let b = a // `b` and `a` both share `big_list`. No copy occurs yet.
a ++= [42] // `a` is mutated while `b` has the old value. A clone occurs.
The implementation gets a little complex to fully support this, which I describe later in this post.
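If you want to see the same idea in Rust terms, the standard library's Arc::make_mut gives copy-on-write behavior. This is only an analogy for the Ribbon snippet above, not Ribbon's implementation, and the function name is made up for the example.

```rust
use std::sync::Arc;

/// Mirrors the Ribbon snippet: returns `a`, `b`, and whether the write
/// caused `a` to stop sharing storage with `b`.
fn append_with_copy_on_write() -> (Vec<i32>, Vec<i32>, bool) {
    let big_list = vec![1, 2, 3];

    let mut a = Arc::new(big_list);
    let b = Arc::clone(&a); // `a` and `b` share one allocation. No copy yet.
    let shared_before = Arc::ptr_eq(&a, &b);

    // `make_mut` clones the Vec here only because `b` still references it.
    Arc::make_mut(&mut a).push(42);
    let shared_after = Arc::ptr_eq(&a, &b);

    ((*a).clone(), (*b).clone(), shared_before && !shared_after)
}

fn main() {
    let (a, b, diverged) = append_with_copy_on_write();
    println!("a = {a:?}, b = {b:?}, diverged on write = {diverged}");
}
```

If `b` had already been dropped, make_mut would have mutated the list in place without cloning, which is what makes the scheme cheap in the common case.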
Shared references
If you do want to have two variables refer to the same place in memory, you can use shared mutable references. This looks like:
var a = 12
let &b = &a
a := 13
print(b) // 13
You can kind of interpret the second line as "Let the address of b be equal to the address of a". It has a very satisfying elegance, don't you think?
Scaffolding
There's an additional mechanism I'm calling scaffolding. The idea is quite simple: what if you could do element access on an element that doesn't exist yet?
var x = [1, 2]
x[2] := 3
print(x) // [1, 2, 3]
You've probably seen this in some languages before, like JavaScript. But scaffolding extends to all collection types, to any arbitrary depth! Check this out!
var x = []
x[0].a := 1
x[0].b.c := "Hi"
x[0].b.d := "Hello"
print(x) // [(a: 1, b: (c: "Hi", d: "Hello"))]
I'm calling it scaffolding because you can "scaffold" up your struct or list. Any "empty" place that could exist permits element access as if it did exist. However, Ribbon only actually allocates storage if an assignment is performed or a mutable reference is created. Otherwise the access evaluates to an empty value (null) and doesn't allocate any storage at all.
In particular, I'm calling a reference to an element that doesn't exist yet a scaffold. For example, the x[0].b.c from above is a scaffold since it denotes a place in memory which doesn't exist yet, but that will be constructed in that line of code.
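Here is a rough Rust sketch of how scaffolding could behave, under my own assumptions. The Value and Step types and the get and scaffold methods are hypothetical names invented for this example, not the interpreter's real API. The sketch also assumes that writing through a path replaces any non-container value it passes through, which the description above does not specify.

```rust
use std::collections::BTreeMap;

// Hypothetical value model; not the interpreter's real types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i64),
    Str(String),
    List(Vec<Value>),
    Record(BTreeMap<String, Value>),
}

/// One step of an access path: `[index]` or `.field`.
enum Step {
    Index(usize),
    Field(&'static str),
}

impl Value {
    /// Read-only access: a missing place reads as Null and allocates nothing.
    fn get(&self, path: &[Step]) -> &Value {
        static NULL: Value = Value::Null;
        let mut current = self;
        for step in path {
            current = match (current, step) {
                (Value::List(items), Step::Index(i)) => items.get(*i).unwrap_or(&NULL),
                (Value::Record(fields), Step::Field(name)) => fields.get(*name).unwrap_or(&NULL),
                _ => &NULL,
            };
        }
        current
    }

    /// Mutable access: builds every missing container along the path.
    /// Assumption: a non-container value on the path is overwritten.
    fn scaffold(&mut self, path: &[Step]) -> &mut Value {
        let mut current = self;
        for step in path {
            current = match step {
                Step::Index(i) => {
                    if !matches!(current, Value::List(_)) {
                        *current = Value::List(Vec::new());
                    }
                    match current {
                        Value::List(items) => {
                            if items.len() <= *i {
                                items.resize(*i + 1, Value::Null); // Pad with nulls.
                            }
                            &mut items[*i]
                        }
                        _ => unreachable!(),
                    }
                }
                Step::Field(name) => {
                    if !matches!(current, Value::Record(_)) {
                        *current = Value::Record(BTreeMap::new());
                    }
                    match current {
                        Value::Record(fields) => {
                            fields.entry(name.to_string()).or_insert(Value::Null)
                        }
                        _ => unreachable!(),
                    }
                }
            };
        }
        current
    }
}

/// Replays the Ribbon example: x[0].a, x[0].b.c, x[0].b.d.
fn build_example() -> Value {
    use Step::{Field, Index};
    let mut x = Value::List(Vec::new());
    *x.scaffold(&[Index(0), Field("a")]) = Value::Int(1);
    *x.scaffold(&[Index(0), Field("b"), Field("c")]) = Value::Str("Hi".to_string());
    *x.scaffold(&[Index(0), Field("b"), Field("d")]) = Value::Str("Hello".to_string());
    x
}

fn main() {
    println!("{:?}", build_example());
}
```

The split between get and scaffold is the important part: reading never allocates, while taking a mutable place is what constructs the missing structure.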
Internal implementation (advanced)
Because of shared mutable references, the actual internal representation of a variable in the
interpreter I have written in Rust uses an Arc<Arc<RwLock<Value>>>.
The outer Arc is for shared mutable references, the inner Arc is used for sharing values as part of copy-on-write, and the RwLock allows the interior mutability.
Here's how the operations work.
- Shared mutable references: Declaring a shared mutable reference (e.g. let &b = &a) just clones the outer Arc from a into b. The inner Arc and the RwLock<Value> are shared without being copied.
- Copies: Declaring a copy (e.g. let b = a) just clones the outer and inner Arc from a into b. The RwLock<Value> is shared without being copied.
- Assignments: Assigning to a variable (e.g. b := 123) replaces the value of the RwLock, requiring the inner Arc be unshared (i.e. have a strong count of 1) by deep cloning it as necessary. This is called copy on write. For example:

  var a = [1, 2]
  let b = a // `b` and `a` share the same `RwLock<Value>`
  a[2].c := 3 // This assignment clones the list, then edits index 2
  print(a) // [1, 2, (c: 3)]
  print(b) // [1, 2]
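The three operations above can be sketched in Rust roughly as follows. Treat this as an illustration under stated assumptions rather than the interpreter's code: the Var type and its methods are hypothetical, and the layers are arranged as Arc<RwLock<Arc<Value>>> instead of Arc<Arc<RwLock<Value>>>, purely because that lets a short example swap in a freshly cloned value through the lock. The real interpreter may handle that step differently.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical value model for the sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<Value>),
}

/// Hypothetical variable handle (layering is an assumption of this sketch):
/// - outer Arc: the variable's slot, shared by `&` references
/// - RwLock:    lets the slot be mutated or repointed
/// - inner Arc: the value, shared by copies until one of them writes
struct Var(Arc<RwLock<Arc<Value>>>);

impl Var {
    fn new(value: Value) -> Var {
        Var(Arc::new(RwLock::new(Arc::new(value))))
    }

    /// `let &b = &a`: clone only the outer Arc, so both names use one slot.
    fn share(&self) -> Var {
        Var(Arc::clone(&self.0))
    }

    /// `let b = a`: a new slot that points at the same inner value.
    fn copy(&self) -> Var {
        let guard = self.0.read().unwrap();
        let inner: Arc<Value> = Arc::clone(&*guard);
        Var(Arc::new(RwLock::new(inner)))
    }

    /// Assignment-style mutation with copy on write: the value is deep
    /// cloned first only if another slot still shares it.
    fn update(&self, edit: impl FnOnce(&mut Value)) {
        let mut slot = self.0.write().unwrap();
        edit(Arc::make_mut(&mut *slot));
    }

    /// Returns a snapshot of the current value.
    fn get(&self) -> Value {
        (**self.0.read().unwrap()).clone()
    }
}

/// Returns the final values seen through `a`, a shared reference to `a`,
/// and a copy of `a` taken before the write.
fn demo() -> (Value, Value, Value) {
    let a = Var::new(Value::List(vec![Value::Int(1), Value::Int(2)]));
    let b = a.copy(); // Shares the list with `a` until someone writes.
    let r = a.share(); // Same slot as `a`.

    a.update(|value| {
        if let Value::List(items) = value {
            items.push(Value::Int(3));
        }
    });

    (a.get(), r.get(), b.get())
}

fn main() {
    let (a, r, b) = demo();
    println!("a = {a:?}\nr = {r:?}\nb = {b:?}");
}
```

In this sketch the write is visible through the shared reference r but not through the copy b, which matches the behavior described in the list above.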
Summary
Hopefully you can see here that the variables in Ribbon are extremely flexible. They are designed to make prototyping easy, especially with scaffolds! However, once the optimizer phase is built, the concrete implementation of a variable can be optimized according to how it is used. It's the best of both flexibility and performance!
I hope you enjoyed this blog entry! Next week I'll finally be discussing enums. Until then, take care!