Multithreading in Ribbon
Hello! I must preface this post by saying that a lot of ideas I'm sharing here are still experimental and haven't even been implemented in my interpreter yet. I suppose I've been blogging faster than I am developing. But, regardless, let me show you my plans!
Motivation
As is tradition, let us start with a motivating example. Suppose we are to download a list of files from the internet. Here's what the code might look like:
let endpoints_to_download = [ ... ] // Suppose this has 100 items
for endpoint in endpoints_to_download {
let data = std.net.download("http://example.com/\(endpoint)")
std.fs.write("./\(endpint).txt", data)
print("Finished downloading '\(endpoint)'")
}
Now, in most programming languages today, this code is sequential. Each iteration downloads a file, saves it to disk, and then prints that it completed, before continuing to the next iteration. This is a very safe interpretation.
There's theoretical optimizations to be made, such as performing downloads and disk writes in
parallel. Even the print statement can be parallelized. But you run the risk of the behavior of the
program not matching the user's expectation. For instance, they might fully anticipate that the
print statements are executed in the same order as the list endpoints_to_download
. Or
the server they are downloading from might be rate-limited, and so few downloads can be executed
simultaneously.
There's many complications that could make naive optimizations result in incorrect behavior, or at least behavior that the programmer finds unexpected and unwanted.
That being said, it's still clear there's some easy optimizations that could be done. Even if we're only allowed one download, one disk write, and one print statement to ever run simultaneously, we can still have a pipelined implementation that actually does run them in parallel while matching the behavior of a sequential implementation.
My plan is for Ribbon's interpreter to support that optimization natively. This is a big can of worms, so I'll try to keep it brief and pedagological.
Instruction reordering
I'm sure many of you know that modern CPUs can do instruction reordering. If an instruction is waiting to read memory, the CPU can look ahead for other instructions that it can execute, instructions that don't depend on the result of the memory read. Of course, the CPU can never commit the results of these future instructions. For example, if the memory read yields a segmentation fault, the CPU should have raised an exception at that instruction, and all future instructions should have never been executed.
Well, I want Ribbon's interpreter to have a similar feature. But since I get to design the abstract machine for the Ribbon language, I get to figure out the semantics that supports such a feature while being as simple as possible. And I think I have it.
Ribbon evaluation semantics
In Ribbon, every expression is evaluated asynchronously. It's as if every single
line of code was asynchronous, and returned a Future
or Promise
. It's
worth noting that asynchronous is really just a synonym for reorderable. For
example:
let a = a_long_computation()
let b = another_long_computation()
In the code above, assuming both functions are pure and have no side effects, Ribbon gives no guarantees about which computation executes first. It's as if you could reorder both lines.
The only instructions which are not reorderable are instructions that are not pure, which is basically instructions which read from external inputs or write to external outputs (hence forth called external I/O).
Except, here's where we can improve the semantics to make the aforementioned optimization possible. What if we could describe what instructions are reorderable?
Hear me out. Typically we expect print
statements to execute sequentially. But what if
we could allow such statements to run in any order?
@Async(print) {
print("A")
print("B")
print("C")
}
This block of code does not make any guarantees about the execution order of the print statements.
For instance, B C A
is a legal behavior. And so is A B C
, which is likely
what an optimizer would choose. But here's where this could be clever.
@Async(print) {
for task in tasks {
task.execute()
print("Completed task \(task.name)")
}
}
If we assume that Task.execute
is a pure function and can already be reordered, then
this code is semantically equivalent to spawning a bunch of threads and having them all execute in
parallel.
We could go one step further, and specify this Async
-ness in the function declaration!
I'm thinking something like:
let download(url: HTTP.URL) -> Binary: @Async = {
...
}
To explain, download
is a lambda which has two constraints: the lambda type is
(url: HTTP.URL) -> Binary
and the lambda satisfies @Async
, meaning that
calls to this lambda can be reordered with other calls to this lambda, even if this lambda
interfaces with external I/O. If I wanted to omit types, could also write this as
let download(url): @Async = ...
.
This would be nice, wouldn't it? You'd get easy and correct parallelism within your sequential code, simply because the interpreter understands what it is allowed to run asynchronously and not.
Closing remarks
There's a lot left for me to explore here, so unfortunately this is where I'll have to stop for today's post. I still have to experiment with implementations to really figure out how feasible this is. And I haven't actually been able to focus on my interpreter lately due to a new side project of mine...
I'm making a website builder! It's still very barebones, but I'm trying to turn it into a commercial product within the next month. I might blog about it, or I might not. It's certainly not as interesting as a programming language, but there are some interesting problems I solved while building it so far.
Anyways, it pains me to say it, but I think I'm going to cut down on my blog posts from weekly to monthly. Since today is the last Friday of the month, I think it's reasonable to say my new posting schedule will be the last Friday of the month. Once I finish up the website builder, I'll be able to resume spending a lot of time working on Ribbon, and we'll see if I can pick up the blogging pace back to weekly.
Regardless, it means a lot to me that you're reading this. I hope you enjoy these blogs as much as I do. I look forward to seeing you again for next month's post! Until then, take extra good care!