Revisiting WebAssembly Cross Compilation in January 2024

I was recently contacted by someone trying to follow a
previous blog post
on cross compiling C to WebAssembly. Things weren’t working; it seemed the tooling had changed.

This is a short post to look at how to cross compile an open source library to WebAssembly
using the current tooling (as of 2024-01-07).

As before I’ll use the GNU Scientific Library,
which (at the time of writing) is version 2.7.1.

The toolchain

As before I used the clang toolchain from wasi-sdk.
I downloaded wasi-sdk-21.0-linux.tar.gz
and unpacked it to $HOME/local/wasi-sdk-21.0.

I put the following shell script in the root folder of the GSL.

WASI_SDK_HOME=$HOME/local/wasi-sdk-21.0

export PATH=$WASI_SDK_HOME/bin:$PATH

# Copy the wasm aware config files.
cp $WASI_SDK_HOME/share/misc/config.* .

rm -rf ./build
mkdir build
cd build

CC=clang \
CFLAGS="-fno-trapping-math --sysroot=$WASI_SDK_HOME/share/wasi-sysroot -mthread-model single" \
CPP=clang-cpp \
AR=llvm-ar \
RANLIB=llvm-ranlib \
NM=llvm-nm \
LD=wasm-ld \
../configure \
--prefix=$HOME/local/gsl-2.7 \
--host=wasm32-wasi \
--enable-shared=no

# build and install it.
make
make install

There are three key changes from the previous method.

First, the autoconf config files in the root folder of the GSL are overwritten by
those in the wasi-sdk with cp $WASI_SDK_HOME/share/misc/config.* ..
This provides the information autoconf needs to configure the project for
WebAssembly.

Second, the compiler flags are now --sysroot=$WASI_SDK_HOME/share/wasi-sysroot,
rather than the previous --target=wasm32-wasi.

Lastly make should be called before make install for the project to build correctly.

Testing

I used the same example program as before.

#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>

int main (int argc, char** argv)
{
    double x = 5.0;
    double y = gsl_sf_bessel_J0 (x);
    printf ("J0(%g) = %.18e\n", x, y);
    return 0;
}

The following script was used to compile it.

set -x

WASI_SDK_HOME=$HOME/local/wasi-sdk-21.0
GSL_HOME=$HOME/local/gsl-2.7

export PATH=$WASI_SDK_HOME/bin:$PATH

clang \
example.c -o example \
-O2 --sysroot=$WASI_SDK_HOME/share/wasi-sysroot \
-I$GSL_HOME/include -L$GSL_HOME/lib -lgsl -lm -lc

The executable was indeed WebAssembly.

$ file example
example: WebAssembly (wasm) binary module version 0x1 (MVP)

As before we can test it with wasmer.
I downloaded wasmer-linux-amd64.tar.gz
(version 4.2.5) and unpacked it to $HOME/local/wasmer-4.2.5.

$ $HOME/local/wasmer-4.2.5/bin/wasmer example 
J0(5) = -1.775967713143382642e-01

Success!

Thoughts

It was worth the effort re-visiting this. I think my original
post was incomplete (I suspect I had copied the config files
in the previous attempt and forgotten), but also there were
some real changes.

Using asyncio start_tls in Python 3.11

An upgradable stream starts life as a plain old socket connection, but is capable of being “upgraded” to use Transport Layer Security (TLS). This is sometimes known as STARTTLS. Common examples of this are SMTP, LDAP, and HTTP proxy tunneling with CONNECT.

This has been broken in Python, but is fixed in version 3.11!

To make things work you will need an SSL certificate and key, and for that certificate to be trusted by a certificate chain.

You can find a gist for this here.

Server

The server starts without TLS. When a client connects, the server responds to three messages:

  • PING — the server responds with PONG.
  • STARTLS — the server upgrades the connection to TLS.
  • QUIT — the server closes the client connection.

Let’s see the code.

import asyncio
from asyncio import StreamReader, StreamWriter
from functools import partial
from os.path import expanduser
import socket
import ssl

async def handle_client(
        ctx: ssl.SSLContext,
        reader: StreamReader,
        writer: StreamWriter
) -> None:
    print("Client connected")

    while True:
        request = (await reader.readline()).decode('utf8').rstrip()
        print(f"Read '{request}'")

        if request == 'QUIT':
            break

        elif request == 'PING':
            print("Sending pong")
            writer.write(b'PONG\n')
            await writer.drain()

        elif request == 'STARTTLS':
            print("Upgrading connection to TLS")
            await writer.start_tls(ctx)

    print("Closing client")
    writer.close()
    await writer.wait_closed()
    print("Client closed")

async def run_server():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_verify_locations(cafile="/etc/ssl/certs/ca-certificates.crt")
    ctx.load_cert_chain(
        expanduser("~/.keys/server.crt"),
        expanduser("~/.keys/server.key")
    )

    handler = partial(handle_client, ctx)

    print("Starting server")
    server = await asyncio.start_server(handler, socket.getfqdn(), 10001)

    async with server:
        await server.serve_forever()

if __name__ == '__main__':
    asyncio.run(run_server())

Looking at the run_server function, the first job is to build the SSL context. After creating the context, the certificate authority bundle is loaded, then the certificate and key.

The partial function binds the SSL context as the first argument to the handle_client callback.

The server is then started without TLS and set running. The fully qualified domain name (FQDN) is used for the host (the “any” address “0.0.0.0” would be fine, but the FQDN works better on Windows).
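The host choice can be checked on its own; socket.getfqdn asks the resolver for the fully qualified name and falls back to the bare hostname when no domain is configured:

```python
import socket

# getfqdn() resolves the machine's fully qualified domain name;
# it falls back to the plain hostname if no domain is configured.
name = socket.getfqdn()
print(name)
```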

When a client connects the handle_client function is called. The function begins a loop. For each iteration the loop starts by reading a line from the client. If it reads PING it writes PONG. If it reads QUIT it breaks out of the loop and closes the connection. If it reads STARTTLS it calls await start_tls(ctx) to upgrade the connection. That’s all there is to it! Very neat.

Client

This time let’s start with the code.

import asyncio
import socket
import ssl

async def start_client():

    print("Connect to the server using the fully qualified domain name")
    reader, writer = await asyncio.open_connection(socket.getfqdn(), 10001)

    print(f"The server certificate is {writer.get_extra_info('peercert')}")

    print("Sending PING")
    writer.write(b'PING\n')
    response = (await reader.readline()).decode('utf-8').rstrip()
    print(f"Received: {response}")

    print("Sending STARTTLS")
    writer.write(b'STARTTLS\n')

    print("Upgrade the connection to TLS")
    ctx = ssl.create_default_context(
        purpose=ssl.Purpose.SERVER_AUTH,
        cafile='/etc/ssl/certs/ca-certificates.crt'
    )
    await writer.start_tls(ctx)

    print(f"The server certificate is {writer.get_extra_info('peercert')}")

    print("Sending PING")
    writer.write(b'PING\n')
    response = (await reader.readline()).decode('utf-8').rstrip()
    print(f"Received: {response}")

    print("Sending QUIT")
    writer.write(b'QUIT\n')
    await writer.drain()

    print("Closing client")
    writer.close()
    await writer.wait_closed()
    print("Client disconnected")

if __name__ == '__main__':
    asyncio.run(start_client())

The client starts by opening a connection without TLS. As with the server the FQDN is used, but this time it’s important. When the client upgrades to TLS the host must match the certificate, so an IP address won’t work. There is a choice here though, as the start_tls call takes the host name as an optional argument. After connecting, the client checks to see if there’s an SSL server certificate with the get_extra_info("peercert") call. This should return None.

Next the client writes a PING to the server over the unencrypted stream and reads the result (which should be PONG).

The next step is to upgrade the connection. The client writes STARTTLS to instruct the server to start the handshake. An SSL context is then made, and the start_tls(ctx) function is called on the writer. The client then checks for an SSL server certificate with get_extra_info("peercert"), which should now exist.

The client then writes PING over the now encrypted stream and reads the result (which should be PONG).

Finally the client writes QUIT and closes the connection.

Thoughts

It’s been a long time coming, but the result is so simple!

Good luck with your coding.

A Python client for .Net NegotiateStream

I find the .Net NegotiateStream class useful for internal services as it provides single sign-on, authentication and basic encryption. However, as a Python programmer, I couldn’t find a client library.

TL;DR

For the impatient the repo can be found on GitHub.

The Protocol

I won’t go into detail regarding the protocol, but it has two key features: a handshake (during which credentials are exchanged), and then the encryption/decryption of data sent/received.

The handshake stage is widely used by Windows intranet servers to avoid the need for entering credentials, and involves the generation and consumption of tokens which are base64 encoded and passed as HTTP headers. I’ve done this before using the
pyspnego package
for the SSPI authentication layer
of a Python web server. The SPNEGO client generates a token that is passed to the server, which responds with a new token, and so on until authentication is complete.

The wire protocol uses the same tokens, but has different “headers”. This was revealed by inspecting the code from the
net.tcp-proxy project.
During the handshake the header takes the form:

struct.pack(">BBBH", state, major, minor, payload_size)

The state can be one of “done” (0x14), “error” (0x15) and “in progress” (0x16). The major.minor version is 1.0, and the payload size is the length of the generated token. When the handshake is complete the header changes to simply the payload size, but as an unsigned int.
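As a quick sanity check, the two header forms can be packed and unpacked with the struct module (the field values here are illustrative):

```python
import struct

# Handshake header: state byte, major, minor, payload size (big-endian).
hdr = struct.pack(">BBBH", 0x16, 1, 0, 42)   # "in progress", version 1.0
assert len(hdr) == 5
assert struct.unpack(">BBBH", hdr) == (0x16, 1, 0, 42)

# After the handshake the header is just the payload size as an unsigned int.
data_hdr = struct.pack("<I", 1024)
assert len(data_hdr) == 4
```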

Let’s look at some code! First I needed to handle the handshake header record.

from __future__ import annotations

import enum
import struct

class HandshakeState(enum.IntEnum):
    DONE = 0x14
    ERROR = 0x15
    IN_PROGRESS = 0x16


class HandshakeRecord:

    FORMAT = ">BBBH"

    def __init__(
            self,
            state: HandshakeState,
            major: int,
            minor: int,
            payload_size: int
    ) -> None:
        self.state = state
        self.major = major
        self.minor = minor
        self.payload_size = payload_size

    def pack(self) -> bytes:
        return struct.pack(
            self.FORMAT,
            self.state,
            self.major,
            self.minor,
            self.payload_size
        )

    @classmethod
    def unpack(cls, buf: bytes) -> HandshakeRecord:
        (state, major, minor, payload_size) = struct.unpack(cls.FORMAT, buf)
        return HandshakeRecord(state, major, minor, payload_size)
With that, and the help of pyspnego I can implement reading and writing.

class NegotiateStream(Stream):

    def __init__(self, hostname: str, socket_: socket.socket) -> None:
        super().__init__(socket_)
        self._handshake_state = HandshakeState.IN_PROGRESS
        self._client = spnego.client(hostname=hostname)

    def write(self, data: bytes) -> None:
        if self._handshake_state == HandshakeState.IN_PROGRESS:
            handshake = HandshakeRecord(self._handshake_state, 1, 0, len(data))
            header = handshake.pack()
            self.send(header + data)
        else:
            while data:
                chunk = self._client.wrap(data[:0xFC30])
                header = struct.pack('<I', len(chunk.data))
                self.send(header + chunk.data)
                data = data[0xFC30:]

    def read(self) -> bytes:
        if self._handshake_state == HandshakeState.DONE:

            payload_size = struct.unpack('<I', self.recv(4))[0]
            payload = self.recv(payload_size)
            unencrypted = self._client.unwrap(payload)
            return unencrypted.data

        buf = self.recv(struct.calcsize(HandshakeRecord.FORMAT))
        handshake = HandshakeRecord.unpack(buf)

        self._handshake_state = handshake.state

        if self._handshake_state != HandshakeState.ERROR:
            return self.recv(handshake.payload_size)

        if handshake.payload_size == 0:
            raise IOError("Negotiate error")

        payload = self.recv(handshake.payload_size)
        _, error = struct.unpack('>II', payload)
        raise IOError(f"Negotiate error: {error}")

    ...

Note the different headers. When the handshake is “in progress” the handshake style of header is used. After authentication the shorter form is used, and the client has to encrypt and decrypt the data.
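The chunked write path can be illustrated in isolation. This is just a sketch of the slicing logic in write() above, where 0xFC30 is the maximum number of bytes wrapped into a single record:

```python
CHUNK = 0xFC30  # maximum bytes wrapped into a single record

def chunks(data: bytes, size: int = CHUNK):
    # Yield successive slices, mirroring the while-loop in write().
    while data:
        yield data[:size]
        data = data[size:]

parts = list(chunks(b'x' * (2 * CHUNK + 10)))
assert [len(p) for p in parts] == [CHUNK, CHUNK, 10]
```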

Finally the handshake itself.

class NegotiateStream:

    ...

    def authenticate_as_client(self) -> None:
        in_token: Optional[bytes] = None
        while not self._client.complete:
            out_token = self._client.step(in_token)
            if not self._client.complete:
                self.write(out_token)
                in_token = self.read()

    ...

This turned out to be pretty simple!

I coded up a simple client:

import socket

from jetblack_negotiate_stream import NegotiateStream

def main():
    hostname = socket.gethostname()
    port = 8181

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((hostname, port))

        stream = NegotiateStream(hostname, sock)

        stream.authenticate_as_client()
        for data in (b'first line', b'second line', b'third line'):
            stream.write(data)
            response = stream.read()
            print("Received: ", response)

        print("Done")


if __name__ == '__main__':
    main()

And a simple C# echo server:

using System;
using System.Net;
using System.Net.Security;
using System.Net.Sockets;
using System.Text;

namespace NegotiateStreamServer
{
    internal class Program
    {
        static void Main(string[] args)
        {
            var listener = new TcpListener(IPAddress.Any, 8181);
            listener.Start();

            while (true)
            {
                Console.WriteLine("Listening ...");
                var client = listener.AcceptTcpClient();

                try
                {
                    Console.WriteLine("... Client connected.");

                    Console.WriteLine("Authenticating...");
                    var stream = new NegotiateStream(client.GetStream(), false);
                    stream.AuthenticateAsServer();

                    Console.WriteLine(
                        "... {0} authenticated using {1}",
                        stream.RemoteIdentity.Name,
                        stream.RemoteIdentity.AuthenticationType);

                    var buf = new byte[4096];
                    for (var i = 0; i < 4; ++i)
                    {
                        var bytesRead = stream.Read(buf, 0, buf.Length);
                        var message = Encoding.UTF8.GetString(buf, 0, bytesRead);
                        Console.WriteLine(message);
                        stream.Write(buf, 0, bytesRead);
                    }
                    stream.Close();
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.ToString());
                }
            }
        }
    }
}

What a lot of code the server is. If you run the server, then the client, you should see the handshake, and the messages passing to and fro, encrypted.

As most of my Python servers are async, I made a couple of async versions. One is a straight async version of the synchronous socket client. The second is more “asyncio” in style. I won’t show the code here, but here is a demo program using it:

import asyncio
import socket

from jetblack_negotiate_stream import open_negotiate_stream

async def main():
    hostname = socket.gethostname()
    port = 8181

    reader, writer = await open_negotiate_stream(hostname, port)

    for data in (b'first line', b'second line', b'third line'):
        writer.write(data)
        await writer.drain()
        response = await reader.read()
        print("Received: ", response)

    writer.close()
    await writer.wait_closed()

if __name__ == '__main__':
    asyncio.run(main())

By using the same pattern as asyncio.open_connection, the negotiation gets tidied away inside open_negotiate_stream, providing a clean, consistent interface.

If you find any bugs, or wish to add features, please post issues and pull requests to the repo.

Setting up a Dell Inspiron 16 Plus (7610) for Linux

Introduction

I recently bought a Dell Inspiron 16 Plus (7610) for use with Linux.
There have been a number of issues that have been solved by smart people on the internet.
This is a short list of things that I have done to make things work.

I used the Ubuntu 21.10 distribution.

Status

The trackpad is sometimes still weird, but it doesn’t get in the way too much.

Hardware

  • CPU: i7-11800H
  • Memory: 16G
  • Graphics: Intel UHD Graphics
  • Disk: 512GB

Note: I specifically chose the Intel UHD graphics to avoid issues with proprietary NVIDIA drivers.

The issues

I’ve installed Ubuntu 21.10.

  • Secure boot
  • Flickering screen
  • Battery drain while lid closed
  • Short battery life when unplugged
  • Trackpad weirdness

Secure boot

This is the first laptop I’ve had with secure boot.
When installing Ubuntu 21.10 from a USB stick you get a prompt about “secure boot”.
I went off and googled this, and by the time I got back it had timed out and made a decision for me.
I installed a few times, but ended up just switching secure boot off in the BIOS settings.
This hasn’t caused me any problems.

Flickering Screen

This is due to a kernel driver default setting.
You can find the answer and links to explanations here.

I added the args i915.enable_psr=0 to the grub command line in /etc/default/grub.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.enable_psr=0"

Then

$ sudo update-grub
$ reboot

Battery drain while lid closed

I noticed the battery losing power when the lid was closed and the power unplugged.
This seems to have something to do with the sleep mode the laptop enters.
I found an answer here.

I added the args mem_sleep_default=deep to the grub command line in /etc/default/grub.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.enable_psr=0 mem_sleep_default=deep"

Then

$ sudo update-grub
$ reboot

Short battery life when unplugged

This is a problem with all Linux laptops, and it is not specific to the Dell. It seems the way manufacturers get a long battery life when unplugged
is by turning things down (like cpu frequency) or turning things off (like turbo boost). These get turned back up when
the system notices the load requirements have increased.

To solve this you need some power management software.
There are a number of packages out there. I went for
auto-cpufreq.

$ sudo snap install auto-cpufreq

I also decided to add the boot flag intel_pstate=disable to /etc/default/grub
as recommended.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.enable_psr=0 mem_sleep_default=deep intel_pstate=disable"

Then

$ sudo update-grub
$ reboot

The thermald package is also
recommended, but that was already installed.

Trackpad weirdness

Sometimes the trackpad is weird.
The cursor movement becomes unresponsive, and clicks don’t work.
It seems to be more of a problem when the laptop is plugged in, but this may be anecdotal.

There are two views on this.

Power Management

The first is that it is an issue with power management on the trackpad.
You can find a discussion of that here.

To address this I created a file /lib/systemd/system/disable-trackpad-pm.service with the contents:

[Unit]
Description=Disables trackpad power management to work around input delays

[Service]
Type=oneshot
ExecStart=/bin/sh -c "echo on > /sys/bus/i2c/devices/i2c-0/device/power/control"

[Install]
WantedBy=multi-user.target

Then:

$ sudo systemctl daemon-reload
$ sudo systemctl enable disable-trackpad-pm
$ reboot

This has helped, but I still get the weirdness occasionally, especially when the laptop is plugged in.

Mechanical

The second view is that there is a mechanical problem.
The thought is that the battery (which is positioned under the trackpad)
interferes with the trackpad.
One solution proposed is to add some material (often plastic) between the battery and the trackpad.
You can find the instructions here.

I haven’t tried this yet, but I’m thinking about it, unless another driver issue gets found in the next month or so.

Thoughts

I really like the laptop (apart from the numeric keypad; why?),
and it’s been frustrating battling through these issues.

I’m hoping someone gets to the bottom of the trackpad issue, as
this is an excellent machine and great value for money.

How To Use clang With vscode On Ubuntu 20.04

Introduction

I wanted to use the clang compiler for a C++ project, using vscode as the IDE;
simples? No!!!

I’m running Ubuntu 20.04, and it seemed the obvious approach was to install the
default llvm tool chain (sudo apt install llvm), write the obligatory “hello
world” program, and go to the vscode debug page and click “create a launch.json
file”. So far so good. Now I hit the play button and I get the error “unable
to find /usr/local/bin/lldb-mi”. WTF?

Now we go down the rabbit hole. You may choose to take the blue pill and move
on, or skip to the solution.

lldb-mi

Googling the error provided some useful facts.

  • lldb is the debugger for the llvm tool-chain,
  • lldb-mi is an executable which is not part of the base llvm project,
  • Ubuntu or llvm have stopped shipping this executable with the llvm package.

Given we are C++ programmers and therefore not lightweight, we choose to build
lldb-mi. After cloning the project
we get an error creating the cmake build file with the default gcc compiler
which complains about lib_lldb. Google has little to offer here. I bite the
bullet and decide to build the entire llvm tool-chain.

llvm

After cloning llvm and following the build instructions the build starts using
gcc as the compiler. I notice that 10% of the build has occurred in 10
minutes, so I go for a coffee. At some point the build crashes on my 12 core
64GB machine, as it runs out of memory!

After a little more googling I install the lld linker and set the build type
to “release” to reduce memory usage. I kick off the build again, and now I get
a compilation error complaining about _Atomic being undefined. Again googling
doesn’t help much, so I decide to use the llvm tool chain to build itself. After
installing llvm and setting the C and C++ compiler to clang/clang++ I try again
and it works!

lldb-mi revisited

Using my compiled llvm tool chain I can successfully compile lldb-mi, and
vscode now allows me to debug my program. Superb!

Solution

Install the llvm tool chain with the lld linker. Also use ninja instead of
make (cmake should be installed already).

sudo apt install lld
sudo apt install llvm
sudo apt install ninja-build

Clone llvm and build it. I used version 12.

git clone --depth 1 --branch release/12.x git@github.com:llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
LLVM_PROJECTS="clang;clang-tools-extra;compiler-rt;debuginfo-tests;libc;libclc;libcxx;libcxxabi;libunwind;lld;lldb;mlir;openmp;parallel-libs;polly;pstl"
LLVM_TARGETS="X86"
INSTALL_PREFIX=$HOME/local/llvm-12.0.0
C_COMPILER=/usr/bin/clang
CXX_COMPILER=/usr/bin/clang++
LINKER=lld
cmake \
-G Ninja \
-DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_USE_LINKER=$LINKER \
-DCMAKE_C_COMPILER=$C_COMPILER \
-DCMAKE_CXX_COMPILER=$CXX_COMPILER \
-DLLVM_ENABLE_PROJECTS="$LLVM_PROJECTS" \
-DLLVM_TARGETS_TO_BUILD="$LLVM_TARGETS" \
../llvm
cmake --build .
cmake --build . --target install

Build lldb-mi:

git clone git@github.com:lldb-tools/lldb-mi.git
cd lldb-mi
mkdir build
cd build
cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX ..
cmake --build .
cmake --build . --target install

Make sure the tool-chain is on your path:

export PATH=$HOME/local/llvm-12.0.0/bin:$PATH

The vscode generated files looked like this:

launch.json

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "clang++ - Build and debug active file",
            "type": "cppdbg",
            "request": "launch",
            "program": "${fileDirname}/${fileBasenameNoExtension}",
            "args": [],
            "stopAtEntry": false,
            "cwd": "${fileDirname}",
            "environment": [],
            "externalConsole": false,
            "MIMode": "lldb",
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                }
            ],
            "preLaunchTask": "C/C++: clang++ build active file",
            "miDebuggerPath": "/home/rob/local/llvm-12.0.0/bin/lldb-mi"
        }
    ]
}

tasks.json

{
    "tasks": [
        {
            "type": "cppbuild",
            "label": "C/C++: clang++ build active file",
            "command": "/home/rob/local/llvm-12.0.0/bin/clang++",
            "args": [
                "-g",
                "${file}",
                "-o",
                "${fileDirname}/${fileBasenameNoExtension}"
            ],
            "options": {
                "cwd": "${fileDirname}"
            },
            "problemMatcher": [
                "$gcc"
            ],
            "group": {
                "kind": "build",
                "isDefault": true
            },
            "detail": "Task generated by Debugger."
        }
    ],
    "version": "2.0.0"
}

Epilogue

It’s a day of my life that I won’t get back, but I’m loving the speed of the
compiler and the more sensible error messages.

Using Typed Arrays and Finalizers with a WebAssembly DataFrame

Introduction

In my previous posts on data frames we looked at how to:

The main problem with that implementation was the marshalling layer which was
hand written. It required all arrays to be copied in and out of the WebAssembly
instance, with manual allocation and freeing of memory.

This implementation introduces a marshalling layer which makes use of the
FinalizationRegistry,
which provides a callback for managing the destruction of objects.

You can find the source code here. To run the examples you will need a version of
node which supports the --harmony-weak-refs flag (I’m using 14.5.0), the
wasi-sdk 11
(with its bin directory on your path) and the
wabt 1.0.16
toolkit (with its bin directory on your path).

Usage

You can find more information about the implementation of the marshalling layer in
this post.
What I want to talk about here is usability.

Rather than bundle a fixed set of functions with the data frame I wanted it to be a structural object where the functions are
provided by an extensible repository. This way the user
isn’t limited to the operations defined by the data frame implementation.

A C function for dividing two arrays might be written as follows -

__attribute__((used)) double* divideFloat64Arrays (double* array1, double* array2, unsigned int length)
{
    double* result = (double*) malloc(length * sizeof(double));
    if (result == 0)
        return 0;

    for (int i = 0; i < length; ++i) {
        result[i] = array1[i] / array2[i];
    }

    return result;
}

The functions are registered in JavaScript as follows -

DataFrame.registerFunction(
    // The function or operation name.
    Symbol.for('/'),
    new FunctionPrototype(
        // The arguments
        [
            new In(new TypedArrayType(new Float64Type(), null)),
            new In(new TypedArrayType(new Float64Type(), null)),
            new In(new Uint32Type())
        ],
        // The return type with automatic length discovery.
        new TypedArrayType(new Float64Type(), (i, args) => args[2])
    ),
    // The bound function
    wasi.instance.exports.divideFloat64Arrays
)

which allows the following -

const df = DataFrame.fromObject(
    // The data
    [
        { name: 'mary', height: 175.2, weight: 65.1 },
        { name: 'fred', height: 183.7, weight: 82.2 }
    ],
    // The types
    { name: Array, height: Float32Array, weight: Float64Array}
)

// Calculations are performed by the bound function in the WebAssembly instance
df['approx_density'] = df['height'] / df['weight']

Functions can also be added for performing calculations directly in JavaScript.
For example -

Series.registerFunction(
    // The function name
    Symbol.for('**'),
    // A null prototype is used for JavaScript functions
    null,
    // The bound function implemented in JavaScript
    (lhs, rhs, length) => lhs.map((value, index) => value ** rhs))

const height = Series.from('height', [1.82, 1.72, 1.64, 1.88], Float64Array)
const sqrHeight = height ** 2

There is a restriction on function prototypes. The first argument is always the
array from the first series, and the last argument is always the length of the
first array.

Although we have used Symbol.for to define the function for names that map to
an operation, we can use a string for arbitrary calculations -

Series.registerFunction(
    // The function name
    'fillna',
    // A null prototype is used for JavaScript functions
    null,
    // The bound function implemented in JavaScript
    (lhs, fill, length) => lhs.map(value => Number.isNaN(value) ? fill : value))

const height = Series.from('height', [1.82, 1.72, Number.NaN, 1.88], Float64Array)
const maxHeight = height.fillna(0)

The WASI Marshaller

In order for the data frame to be aware of the WebAssembly instance and the
associated marshalling support, it must be initialized.

// Read the wasm file.
const buf = fs.readFileSync(fileName)

// Create the Wasi instance passing in environment variables.
const envVars = {}
const wasi = new Wasi(envVars)

// Instantiate the wasm module.
const res = await WebAssembly.instantiate(buf, {
    wasi_snapshot_preview1: wasi.imports()
})

// Initialize the wasi instance
wasi.init(res.instance)

// Register the functions
registerUnmarshalledFunctions(wasi)
registerInt32Functions(wasi)
registerFloat64Functions(wasi)

// Initialize the data frame
DataFrame.init(wasi)

Thoughts

Efficiency has improved enormously. As typed arrays are visible both
within the JavaScript interpreter and the WebAssembly instance, we only need to
pass the reference to the arrays when performing WebAssembly calculations. The
finalizer support means we don’t need to manually allocate, copy and free memory.

We’ve done all this without adding much clutter beyond specifying the series
type. This could be improved by adding some heuristics to guess the types.

All this means it’s easier to use the power of WebAssembly without adding too
much burden to the data scientist.

While there is still much left to do (indices, selecting, support for more types
such as timestamps), this feels like a good step forward.

WASI Marshalling with Finalizers

In my
previous post
I described a marshalling library I’d written for WebAssembly (see
here for the source). There’s one
problem I have with it; too much copying!

But there’s a good reason for the copying. In order to ensure all allocated memory
is freed, the marshaller goes through three steps: allocating memory and
copying data, calling the function, then copying results and freeing the memory.
Without this control memory will leak.

Enter finalizers …

FinalizationRegistry

Until recently there has been no way to hook into the end of a JavaScript
object’s lifecycle. There is a “stage 3” proposal that is going into production:
FinalizationRegistry.
This is currently available in node (since 13.0.0), but must be enabled with the
flag --harmony-weak-refs.
It should be available in chrome (84), and firefox (79) in a few days.

When an object registers with the finalization registry it gets called back when
the object gets garbage collected. If we register our WebAssembly objects then we
can free the memory we have allocated when the garbage gets collected.

How does finalizing work?

I tried and tried to get finalizing to work. The following called the finalizer
cleanup function, but only when the program exited.

// run with node --harmony-weak-refs

const unfinalizedTags = new Set()
let counter = 0

const registry = new FinalizationRegistry(held => {
    console.log('cleanup')
    for (const tag of held) {
        console.log(tag)
        unfinalizedTags.delete(tag)
    }
})

// Nothing gets finalized here.
for (let counter = 0; counter < 1000; ++counter) {
    let array = new Array(1000000)
    const tag = "tag" + counter
    unfinalizedTags.add(tag)
    // Register with the finalizer.
    registry.register(array, tag)
    for (let i = 0; i < array.length; ++i) {
        array[i] = Math.random()
    }
    const sum = array.reduce((acc, value) => acc + value, 0)
    console.log(sum / array.length, counter)
    array = null
}

// The finalizer gets called with all the registered objects on exit.

The following does call the finalizer while the code is running.

const unfinalizedTags = new Set()
let counter = 0

const registry = new FinalizationRegistry(held => {
  console.log('cleanup')
  for (const tag of held) {
    console.log(tag)
    unfinalizedTags.delete(tag)
  }
})

function makePointlessGarbage() {
  let array = new Array(1000000)
  const tag = "tag" + counter++
  unfinalizedTags.add(tag)
  registry.register(array, tag)
  for (let i = 0; i < array.length; ++i) {
    array[i] = Math.random()
  }
  const sum = array.reduce((acc, value) => acc + value, 0)
  console.log(sum / array.length, counter)
  array = null
  if (counter < 100) {
    setTimeout(makePointlessGarbage, 10)
  } else {
    console.log(`unfinalizedTags.size=${unfinalizedTags.size}`)
  }
}

setTimeout(makePointlessGarbage, 10)

This version calls the finalizer as the program is running but doesn’t finalize
on exit, leaving some objects unfinalized. I’m not so concerned about the
unfinalized objects with respect to memory management, as the WebAssembly
instance will be destroyed on exit.

It seems that the program needs to
give up control (with setTimeout) to allow the garbage collection to run. It
also seems that the cleanup function gets passed an iterable of values to free,
which is not how I read the documentation.

Implementation

Arrays

Here is how we might un-marshall an array using the copy then free pattern.

unmarshall (memoryManager, address, array) {
  try {
    // Create a temporary typed array
    const typedArray = new this.type.TypedArrayType(
      memoryManager.memory.buffer,
      address,
      array.length)
    // Return a copy of the array.
    return Array.from(typedArray)
  } finally {
    // Free the memory
    memoryManager.free(address)
  }
}

We can change this to simply register the object for freeing. Note that creating the typed array is not really creating the array: it just provides a view into the
memory of the element type we want.

unmarshall (memoryManager, address, array) {
  // Create a typed array
  const typedArray = new this.type.TypedArrayType(
    memoryManager.memory.buffer,
    address,
    array.length)
  // register for cleanup
  memoryManager.freeWhenFinalized(typedArray, address)
  // Return the typed array
  return typedArray
}

This implementation is much more lightweight. In the previous version there was
a point in time where both arrays existed in memory, which could be a problem
for big data. However the “copy then free” version could resolve arrays of pointers, as
the Array can hold references to other arrays, rather than just numbers.

Strings

Strings present a minor problem. Once we have un-marshalled a byte array to a
string we have nowhere to keep the address from the WebAssembly instance. To
solve this I created a StringBuffer which is an extension of Uint8Array,
with a couple of class methods for marshalling and un-marshalling, and an
instance method to decode the string.
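A minimal sketch of what such a StringBuffer might look like (the method bodies here are my own illustration of the idea, not necessarily the package's exact API):

```javascript
// Hypothetical sketch: a Uint8Array subclass that remembers the
// WebAssembly address it wraps and can decode itself to a string.
class StringBuffer extends Uint8Array {
  // Copy a JavaScript string into WebAssembly memory, with a trailing
  // nul so C code can treat it as a string (marshalling).
  static marshall (memoryManager, string) {
    const bytes = new TextEncoder().encode(string)
    const address = memoryManager.malloc(bytes.length + 1)
    const buf = new StringBuffer(memoryManager.memory.buffer, address, bytes.length + 1)
    buf.set(bytes)
    buf[bytes.length] = 0
    return buf
  }

  // Wrap a string WebAssembly allocated, registering its address for
  // cleanup when the buffer is garbage collected (un-marshalling).
  static unmarshall (memoryManager, address, length) {
    const buf = new StringBuffer(memoryManager.memory.buffer, address, length)
    memoryManager.freeWhenFinalized(buf, address)
    return buf
  }

  // Decode the bytes (excluding any trailing nul) to a string.
  toString () {
    const end = this.indexOf(0)
    return new TextDecoder().decode(this.subarray(0, end === -1 ? this.length : end))
  }
}
```
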

Pointers

I implemented an AddressType which simply sets the contents of a Pointer to
the address. This works, but feels like a weak solution. I need some more use
cases before I know what to do here.
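The idea can be sketched in a few lines (the field and method names are my own illustration):

```javascript
// Hypothetical sketch: a Pointer holder, and an AddressType that
// un-marshalls a WebAssembly address by storing it in the Pointer.
class Pointer {
  constructor (contents = null) {
    this.contents = contents
  }
}

class AddressType {
  // On the way out of WebAssembly we simply record the raw address.
  unmarshall (memoryManager, address, pointer) {
    pointer.contents = address
    return pointer
  }
}
```
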

Thoughts

We can now pass typed arrays, strings and pointers between JavaScript and a
WebAssembly instance with the memory being cleaned up through the garbage
collector and finalizers.

This is a huge win for big data applications, as only one copy of the data needs
to exist, reducing the memory footprint, and time spent copying between domains.

WASI Marshalling

In this series of posts I’ve done a lot of experimentation. An element which
I’ve had to implement a number of times is calling functions defined in a
WebAssembly module.

Calling a WebAssembly function involves transferring data to an instance,
calling a function, and retrieving the results. This is known as marshalling.

The source code for the examples can be found
here
and the marshalling package
here.

Our First C Function

Here is an example function written in C and compiled to WebAssembly.

__attribute__((used)) double* multipleFloat64ArraysReturningPtr (
  double* array1,
  double* array2,
  int length)
{
  double* result = (double*) malloc(length * sizeof(double));
  if (result == 0)
    return 0;

  for (int i = 0; i < length; ++i) {
    result[i] = array1[i] + array2[i];
  }

  return result;
}

At the highest level the function takes two floating point arrays, combines
them element-wise (despite the name, the body adds them), and returns the
result. At a lower level, two pointers to double arrays are passed with a
length. A result pointer is allocated with the required length, the result is
stored in it, and finally the pointer to the result array is returned.

To make this happen, first memory has to be allocated for the two input arrays
in the WebAssembly module. The data must then be copied into the arrays. When
the function is called it creates a third array by allocating space in the
WebAssembly. The pointer to this memory is then returned. The JavaScript must
copy the result array, then free the two input arrays and the result array.

Our First JavaScript Prototype

I’ve written a package to simplify the marshalling of data between JavaScript
and WebAssembly. Here is how to create a binding for the function.

const proto = new FunctionPrototype(
  [
    new In(new ArrayType(new Float64Type())),
    new In(new ArrayType(new Float64Type())),
    new In(new Int32Type())
  ],
  new ArrayType(new Float64Type(), 4))

This matches the C prototype.

__attribute__((used)) double* multipleFloat64ArraysReturningPtr (
  double* array1,
  double* array2,
  int length)

Note that the return type also declares the length of the returned array. The
C function only returns a pointer to the start of the array, so we need
the length. In the real world we'd probably wrap prototypes that require
length parameters in a plain function.
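For instance, a hypothetical helper (not part of the package) might derive the length argument from the inputs:

```javascript
// Hypothetical sketch: wrap a prototype so callers never pass a length.
// `proto` is a FunctionPrototype and `wasi` the instantiated module.
function makeBinaryArrayFunction (proto, wasi, name) {
  const func = wasi.instance.exports[name]
  return (array1, array2) => {
    if (array1.length !== array2.length) {
      throw new RangeError('arrays must have the same length')
    }
    // Supply the length argument from the arrays themselves.
    return proto.invoke(wasi.memoryManager, func, array1, array2, array1.length)
  }
}
```
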

We can invoke the function as follows.

const result = proto.invoke(
  wasi.memoryManager,
  wasi.instance.exports.multipleFloat64ArraysReturningPtr,
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  4)

As well as the function arguments we provide wasi.memoryManager to allow
access to the WebAssembly memory and wasi.instance to retrieve the function.

When the function is invoked three things happen. First the input arguments are
marshalled. Memory is allocated for each array, and the array data is copied
into that memory. The integer value can be passed directly.

Second the function is called with the marshalled arguments and the return value
is received.

Third the memory for the input arguments is freed, the result value is copied
from the WebAssembly memory to a JavaScript array, and the memory that was
allocated inside the WebAssembly instance is freed.
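The shape of those three phases can be sketched like this (a simplification under my own assumptions, not the package's actual source; each type is assumed to expose marshall/unmarshall/free hooks):

```javascript
// Sketch: marshall the arguments, call the export, then unmarshall
// the result and free whatever was allocated for the inputs.
function invoke (memoryManager, func, argTypes, returnType, args) {
  // 1. Marshall: copy each argument into WebAssembly memory.
  const marshalled = args.map((arg, i) => argTypes[i].marshall(memoryManager, arg))
  try {
    // 2. Call the exported function with the marshalled values.
    const result = func(...marshalled)
    // 3. Unmarshall: copy the result back out (freeing it as we go).
    return returnType ? returnType.unmarshall(memoryManager, result) : undefined
  } finally {
    // Free the memory allocated for the inputs.
    args.forEach((arg, i) => argTypes[i].free(memoryManager, marshalled[i], arg))
  }
}
```
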

Memory

While not strictly part of WASI, memory management is the first problem to
solve. When a WebAssembly module is instantiated, a memory buffer is either
passed in, or created by the instance. When using the standard C library
provided by wasi-libc the memory
management functions malloc and free can be exported from the instance.

After instantiation they are held in the following class.

class MemoryManager {
  constructor (memory, malloc, free) {
    this.memory = memory
    this.malloc = malloc
    this.free = free
    this.dataView = new DataView(this.memory.buffer)
  }
}

An array of floats might be marshalled as follows.

function marshallFloat64Array(memoryManager, array) {
  const address = memoryManager.malloc(Float64Array.BYTES_PER_ELEMENT * array.length)
  const typedArray = new Float64Array(memoryManager.memory.buffer, address, array.length)
  typedArray.set(array)
  return address
}

An array that was allocated inside the WebAssembly instance might be
un-marshalled as follows.

function unmarshallFloat64Array(memoryManager, address, length) {
  try {
    const typedArray = new Float64Array(memoryManager.memory.buffer, address, length)
    return Array.from(typedArray)
  } finally {
    memoryManager.free(address)
  }
}

Our Second C Function

Here is our second C function.

__attribute__((used)) void multipleFloat64ArraysWithOutputArray (
  double* array1,
  double* array2,
  double* result,
  int length)
{
  for (int i = 0; i < length; ++i) {
    result[i] = array1[i] + array2[i];
  }
}

This time the calling function allocates the memory for the output array and
passes it in.

Our Second JavaScript Function Prototype

The JavaScript function prototype looks as follows.

const proto2 = new FunctionPrototype(
  [
    new In(new ArrayType(new Float64Type())),
    new In(new ArrayType(new Float64Type())),
    new Out(new ArrayType(new Float64Type())),
    new In(new Int32Type())
  ]
)

We don’t need to provide a return type as nothing is returned. The output
argument is wrapped in an Out class to inform the marshaller to unpack it
after the function is called.
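The wrappers themselves can be as simple as direction flags around a type; a sketch of the idea (the class names match the package, but the fields are my own assumption):

```javascript
// Sketch: In/Out tag a type with a direction so the marshaller knows
// whether to copy data into WebAssembly memory, back out, or both.
class In {
  constructor (type) {
    this.type = type
    this.isInput = true
    this.isOutput = false
  }
}

class Out {
  constructor (type) {
    this.type = type
    this.isInput = false
    this.isOutput = true
  }
}
```
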

The function is called as follows.

const output = new Array(4)
proto2.invoke(
  wasi.memoryManager,
  wasi.instance.exports.multipleFloat64ArraysWithOutputArray,
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  output,
  4)

Because we supplied the output array with the correct length, we didn’t need
to supply a length argument to ArrayType.

Where’s WASI?

So far there’s been no need for WASI. That ends with strings.

WASI enters the picture when we need to leave the WebAssembly sandbox and
interact with the system in which we're running. It's not obvious why this
should happen with strings, which pass through memory as UTF-8 bytes. The
answer is that as soon as the C standard library needs to decode a string it
will want to know about our locale (our language environment). To do this it
checks the environment variables, which exist outside the sandbox!

This introduces the first of our system call requirements: environ_sizes_get
and environ_get. The first call reports the amount of space required to store
the environment variables; the second fetches them.
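A minimal implementation of the pair might look like this (a sketch: the makeEnvironImports helper and memoryManager wrapper are my own illustration; the pointer-table-plus-buffer layout is the WASI preview1 convention):

```javascript
const WASI_ESUCCESS = 0

// Sketch: serve environment variables to the WebAssembly instance.
// `env` is a plain object of KEY: value pairs; `memoryManager` wraps
// the instance's memory.
function makeEnvironImports (env, memoryManager) {
  // Encode each variable as a nul-terminated "KEY=value" byte string.
  const encoder = new TextEncoder()
  const entries = Object.entries(env).map(
    ([key, value]) => encoder.encode(key + '=' + value + '\0'))

  return {
    // Report how much space the caller must allocate.
    environ_sizes_get (environCount, environBufSize) {
      const view = new DataView(memoryManager.memory.buffer)
      view.setUint32(environCount, entries.length, true)
      view.setUint32(environBufSize,
        entries.reduce((acc, e) => acc + e.length, 0), true)
      return WASI_ESUCCESS
    },

    // Write the pointer table and the string data itself.
    environ_get (environ, environBuf) {
      const view = new DataView(memoryManager.memory.buffer)
      const bytes = new Uint8Array(memoryManager.memory.buffer)
      let bufOffset = environBuf
      entries.forEach((entry, i) => {
        view.setUint32(environ + i * 4, bufOffset, true)
        bytes.set(entry, bufOffset)
        bufOffset += entry.length
      })
      return WASI_ESUCCESS
    }
  }
}
```
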

The other need for WASI is when the code interacts with stdout/stderr. While
we might think this is unnecessary, it is common for C libraries to report
errors through the perror function, which writes a description of the current
error to stderr.

There is a small set of functions we need to support here. The final outcome
is the following set of imports to the WebAssembly module.

const res = await WebAssembly.instantiate(buf, {
  wasi_snapshot_preview1: {
    environ_get: (environ, environBuf) => wasi.environ_get(environ, environBuf),
    environ_sizes_get: (environCount, environBufSize) => wasi.environ_sizes_get(environCount, environBufSize),
    proc_exit: rval => wasi.proc_exit(rval),
    fd_close: fd => wasi.fd_close(fd),
    fd_seek: (fd, offset_low, offset_high, whence, newOffset) => wasi.fd_seek(fd, offset_low, offset_high, whence, newOffset),
    fd_write: (fd, iovs, iovsLen, nwritten) => wasi.fd_write(fd, iovs, iovsLen, nwritten),
    fd_fdstat_get: (fd, stat) => wasi.fd_fdstat_get(fd, stat)
  }
})

With this relatively small set of imports we can support the vast majority of
publicly available C libraries.

Strings & Stdout/Stderr

You can check out the source code for the implementations and use of these.
They are not typically used from JavaScript, and are only provided to allow
imported C libraries to run.

Wiring It Up

The example project shows how this works with node and in the browser. Here’s
the browser version.

<script>
  const {
    Wasi,
    Float64Type,
    ArrayType,
    Int32Type,
    StringType,
    FunctionPrototype,
    In,
    Out
  } = wasiMarshalling

  // Create the Wasi instance passing in environment variables.
  const wasi = new Wasi({})

  // Instantiate the wasm module.
  WebAssembly.instantiateStreaming(
    fetch('example.wasm'), {
      wasi_snapshot_preview1: wasi.imports()
    })
    .then(res => {
      // Initialise the wasi instance
      wasi.init(res.instance)

      // The first example takes in two arrays of the same length and
      // multiplies them, returning a third array.
      const proto1 = new FunctionPrototype(
        [
          new In(new ArrayType(new Float64Type())),
          new In(new ArrayType(new Float64Type())),
          new In(new Int32Type())
        ],
        new ArrayType(new Float64Type(), 4)
      )

      const result1 = proto1.invoke(
        wasi.memoryManager,
        wasi.instance.exports.multipleFloat64ArraysReturningPtr,
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        4)
      console.log(result1)
      ...
</script>

Thoughts

We’ve got a generic marshalling layer between JavaScript and WebAssembly which
is pretty cool! We’ve also got enough WASI to drop in publicly available
C libraries.

We've not yet addressed C++ libraries, and we're waiting for the flang compiler
to be incorporated into the LLVM toolchain, so there's a way to go before we
have access to everything we need to use JavaScript to do real work with
native libraries.

How to Cross Compile a C library to WebAssembly for use with JavaScript

In this post I'll explore how to use an existing C library from JavaScript with WebAssembly.

This uses information from previous posts for
dataframes,
and
wasi.

You can find the source code
here.

The Toolchain

In this example I used the ready made clang toolchain from
wasi-sdk. I downloaded
wasi-sdk-11.0-linux.tar.gz
and unpacked it in /opt/wasi-sdk as follows.

cd /tmp
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-11/wasi-sdk-11.0-linux.tar.gz
cd /opt
sudo tar xvf /tmp/wasi-sdk-11.0-linux.tar.gz
sudo ln -s wasi-sdk-11.0 wasi-sdk
rm /tmp/wasi-sdk-11.0-linux.tar.gz

I also installed wabt 1.0.16 and
wasmer, and unpacked them in /opt. I then
put the bin directories of all three on my path.

Cross Compiling

I decided to use the
GNU Scientific Library
as it is written in plain old C, and I can use it to further the
dataframes
project I have discussed in previous posts.

Compiling it proved to be remarkably straightforward. After downloading and
unpacking the tarball I did the following.

# Ensure wasi-sdk is on the path
export PATH=/opt/wasi-sdk/bin:$PATH

# Go to the project folder and make a build folder.
cd gsl-2.6
mkdir build
cd build

# Configure the project setting the clang toolchain
CC=clang \
CFLAGS="-fno-trapping-math --target=wasm32-wasi -mthread-model single" \
CPP=clang-cpp \
AR=llvm-ar \
RANLIB=llvm-ranlib \
NM=llvm-nm \
LD=wasm-ld \
../configure \
--prefix=/opt/gsl-2.6 \
--host=wasm32-wasi \
--enable-shared=no

# build and install it (you need to have permissions to /opt/gsl-2.6)
make install

Amazingly that just worked. The result of the install was the folder
/opt/gsl-2.6 with three sub-folders: include, lib, and share.

Testing the Static Library

I created an example.c with the following contents taken from the
gsl documentation.

#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>

int main (int argc, char** argv)
{
  double x = 5.0;
  double y = gsl_sf_bessel_J0 (x);
  printf ("J0(%g) = %.18e\n", x, y);
  return 0;
}

I compiled this in the following manner (using the wasi-sdk clang).

clang example.c -o example -O2 --target=wasm32-wasi -I/opt/gsl-2.6/include -L/opt/gsl-2.6/lib -lgsl -lm -lc

No errors! But is it really a wasm file? We can test with the file utility.

$ file ./example
example: WebAssembly (wasm) binary module version 0x1 (MVP)

This is all very cool, but what’s in the static library? Let’s take a look.

# Make sure we're using the llvm archive function.
$ which ar
/opt/wasi-sdk/bin/ar
# Find the first file in the library.
$ ar t /opt/gsl-2.6/lib/libgsl.a | head -1
version.o
# Extract the file
$ ar xv /opt/gsl-2.6/lib/libgsl.a version.o
x - version.o
# What kind of object file is it?
$ file version.o
version.o: WebAssembly (wasm) binary module version 0x1 (MVP)

It seems that the "object" files are themselves wasm, which get linked into the
final module, which is itself wasm. Interesting!

We can run the “executable” file with wasmer. Wasmer is a runtime engine for WebAssembly
modules.

$ /opt/wasmer/bin/wasmer example
J0(5) = -1.775967713143382642e-01

It works :) Let’s see how to use the library in our JavaScript code.

Create the JavaScript

This is definitely TL;DR

Making a concise example of this would have been much shorter, but I'm exploring
the data science possibilities of WebAssembly, so what I've done is wire the
functions into my growing dataframe/series code.

First I need to create a helper to call aggregate functions. I do this in
the file wasi-memorymanager.js.

class WasmMemoryManager {

  ...

  invokeAggregateFunction(func, array, typedArrayType) {
    let input = null

    try {
      input = this.createTypedArray(typedArrayType, array.length)
      input.set(array)

      return func(input.byteOffset, array.length)
    } finally {
      // Guard against createTypedArray throwing before input is set.
      if (input !== null) {
        this.free(input.byteOffset)
      }
    }
  }

  ...
}

To use the aggregate function I add it to the setup-wasi.js.

...

arrayMethods.set(
  'mean',
  makeAggregateOperation(
    wasi.wasiMemoryManager,
    null,
    (array, length) => gsl_stats_mean(array, 1, length),
    (series) => series.array.reduce((a, b) => a + b, 0) / series.array.length
  )
)

...

You can see here I’m calling the raw gsl function gsl_stats_mean (I needed the
arrow function to provide a stride parameter of 1). What I want to show is
that there is little or no requirement for a binding library. We’re using generic
marshalling code.

The Series.js needs a small patch to handle functions that return a single
value, rather than a new series. I added a check for the return value of the
function.

class Series {
  constructor(...) {
    ...

    return new Proxy(this, {
      get: (obj, prop, receiver) => {
        if (prop in obj) {
          return Reflect.get(obj, prop, receiver)
        } else if (arrayMethods.has(prop)) {
          return (...args) => {
            const [value, type] = arrayMethods.get(prop)(obj, ...args)
            if (value instanceof Array) {
              return new Series('', value, type)
            } else {
              // An aggregated value
              return value
            }
          }
        } else {
          return Reflect.get(obj.array, prop, receiver.array)
        }
      },

  ...

Dude Where’s My Function?

Ideally we would like to call the function without any extra effort, but a
function from a static library will not be included in the final module unless
there is a reference to it. This means we need a C file to create a reference so
it gets exported. I did this as follows.

#include <gsl/gsl_statistics.h>

void force_gsl_imports()
{
  void* funcs[] = {
    gsl_stats_mean
  };
}

This is all we need to ensure the function is included in the emitted wasm module. A
specific binding function is unnecessary.

The example program

Now we just need to see if it works.

const height = new Series('height', [1.82, 1.72, 1.64, 1.88], 'double')
const avg = height.mean()
console.log('mean', avg)

Success!

Thoughts

The thing that surprised me the most was how simple it was to create the static
library (although it’s possible I just got lucky with the library I chose).

My example was too verbose, but I'm hoping you're following the journey with me
and have read my previous posts.

The gsl already provides a BLAS library, so we could write a JavaScript matrix
class and do linear algebra. My hope is that the anticipated FORTRAN compiler
(flang) will be just as easy to create a library from.

Handling Stdout/Stderr with JavaScript and WebAssembly

In my
previous post
I found out how to pass strings between JavaScript and WebAssembly. The problem
with that solution was that I had to export console.log to the WebAssembly
module. At some point I want to be able to take some library source code “off
the shelf”, compile and use it without modification.
Sadly many libraries will use stdout/stderr
with printf, puts, or perror.

This is going to mean more work at the WASI layer!

You can find the source code for this post
here.

The Problem

If a program issues a call like printf it’s often sending its output to the
terminal, or possibly to a file. In any case this is certainly escaping the
WebAssembly sandbox. The
wasi-libc
library was built with the expectation of a WASI implementation. We can use this
to handle enough stdio to suit our needs.

The C Code

Here is the C code I’d like to provide WASI support for.

#define __wasi__

#include <stdlib.h>
#include <stdio.h>
#include <locale.h>

int is_locale_initialised = 0;

static void initLocale()
{
  // The locale must be initialised before using
  // multi byte characters.
  is_locale_initialised = 1;
  setlocale(LC_ALL, "");
}

__attribute__((used)) void callPerror(char* ptr)
{
  if (is_locale_initialised == 0)
    initLocale();

  perror("Help!");
}

__attribute__((used)) void writeToStdout(char* ptr)
{
  if (is_locale_initialised == 0)
    initLocale();

  fputs(ptr, stdout);
  fflush(stdout);
}

__attribute__((used)) void writeToStderr(char* ptr)
{
  if (is_locale_initialised == 0)
    initLocale();

  fputs(ptr, stderr);
  fflush(stderr);
}

Clearly the line at the start, #define __wasi__, needs some explanation. Without
it, compilation broke with the error
#error <wasi/api.h> is only supported on WASI platforms. When I checked the
include file, __wasi__ was the culprit. Defining it enabled the
stdio functionality.

The JavaScript Example

This is basically a clone of the previous post. The example code is as follows.

const setupWasi = require('./setup-wasi')

async function main() {
  // Setup the WASI instance.
  const wasi = await setupWasi('./stdio-example.wasm')

  // Get the functions exported from the WebAssembly
  const {
    writeToStdout,
    writeToStderr,
    callPerror
  } = wasi.instance.exports

  let ptr1 = null
  try {
    ptr1 = wasi.wasiMemoryManager.convertFromString('To stdout\n')
    writeToStdout(ptr1.byteOffset)
  } finally {
    wasi.wasiMemoryManager.free(ptr1)
  }

  try {
    ptr1 = wasi.wasiMemoryManager.convertFromString('To stderr\n')
    writeToStderr(ptr1.byteOffset)
  } finally {
    wasi.wasiMemoryManager.free(ptr1)
  }

  callPerror()
}

main().then(() => console.log('Done'))

I’ve used the helper function convertFromString to create the string in the
memory space of the WebAssembly instance.
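The helper itself might be sketched as follows (an illustration of the idea, assuming a memory manager that exposes malloc and the instance's memory, rather than the post's exact implementation):

```javascript
// Sketch: copy a JavaScript string into WebAssembly memory as
// nul-terminated UTF-8, returning the typed array that wraps it.
function convertFromString (memoryManager, string) {
  const bytes = new TextEncoder().encode(string)
  const address = memoryManager.malloc(bytes.length + 1)
  const buf = new Uint8Array(memoryManager.memory.buffer, address, bytes.length + 1)
  buf.set(bytes)
  buf[bytes.length] = 0 // C expects a nul terminator
  return buf
}
```
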

The WASI Code

Running the code exposes the calls we need to implement. It turned out we need:
fd_close, fd_seek, fd_write, and fd_fdstat_get.

My goal here is to implement stdout and stderr with console.log and
console.error. Here is my implementation.

fd_close = fd => {
  return WASI_ESUCCESS
}

fd_seek = (fd, offset_low, offset_high, whence, newOffset) => {
  return WASI_ESUCCESS
}

fd_write = (fd, iovs, iovsLen, nwritten) => {
  // We only care about stdout or stderr
  if (!(fd === STDOUT || fd === STDERR)) {
    return WASI_ERRNO_BADF
  }

  const view = new DataView(this.wasiMemoryManager.memory.buffer)

  // Create a Uint8Array for each buffer
  const buffers = Array.from({ length: iovsLen }, (_, i) => {
    const ptr = iovs + i * 8;
    const buf = view.getUint32(ptr, true);
    const bufLen = view.getUint32(ptr + 4, true);
    return new Uint8Array(this.wasiMemoryManager.memory.buffer, buf, bufLen);
  })

  const textDecoder = new TextDecoder()

  // Turn each buffer into a utf-8 string.
  let written = 0;
  let text = ''
  buffers.forEach(buf => {
    text += textDecoder.decode(buf)
    written += buf.byteLength
  });

  // Return the bytes written.
  view.setUint32(nwritten, written, true);

  // Send the output to the console.
  if (fd === STDOUT) {
    this.stdoutText = drainWriter(console.log, this.stdoutText, text)
  } else if (fd === STDERR) {
    this.stderrText = drainWriter(console.error, this.stderrText, text)
  }

  return WASI_ESUCCESS;
}

fd_fdstat_get = (fd, stat) => {
  // We only care about stdout or stderr
  if (!(fd === STDOUT || fd === STDERR)) {
    return WASI_ERRNO_BADF
  }

  const view = new DataView(this.wasiMemoryManager.memory.buffer)
  // filetype
  view.setUint8(stat + 0, WASI_FILETYPE_CHARACTER_DEVICE);
  // fdflags
  view.setUint32(stat + 2, WASI_FDFLAGS_APPEND, true);
  // rights base
  view.setBigUint64(stat + 8, WASI_RIGHTS_FD_WRITE, true);
  // rights inheriting
  view.setBigUint64(stat + 16, WASI_RIGHTS_FD_WRITE, true);

  return WASI_ESUCCESS;
}

For fd_close and fd_seek there's nothing to do: we can't close console.log,
and it's not random access so we can't seek.
The fd_fdstat_get function was a bit of a guess with
the rights inheriting, but it worked. The fd_write function needed to be
sensitive to newlines, as there's no way to suppress a newline in
the console functions. The text received gets appended until a newline arrives,
then we report it. The drainWriter function is as follows.

function drainWriter (write, prev, current) {
  let text = prev + current
  // Note: text.split('\n', 2) would silently drop anything after the
  // second newline, so walk the newlines explicitly.
  let index
  while ((index = text.indexOf('\n')) !== -1) {
    write(text.slice(0, index))
    text = text.slice(index + 1)
  }
  return text
}

Thoughts

I’m feeling quite pleased. We’ve added some complication with the WASI
stubs, but the code is still fairly small and understandable. My goal
is to get to a point where I can drop in a C library and compile it to
WebAssembly without modification. At the moment this seems like a
realistic possibility.