Bigcanvas: Technology
How does one store the contents of an infinite canvas in a computer’s finite memory? One cheats. In this case, by taking advantage of the fact that the canvas may be infinite, but people’s drawings are quite finite. We simply don’t store the empty regions.
To that end, the canvas is broken up into tiles of 256×256 pixels. I chose that size fairly arbitrarily based on gut feeling. Large tiles are probably more compressible and reduce the number of tiles we have to fetch, but smaller tiles lead to more fine-grained updates so the client has less data to upload. The decisive argument was something else entirely: if I eventually implement locking (more on that later), it will be easiest to do on a per-tile basis, so the tiles should not be too large.
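As an aside, mapping a canvas pixel coordinate to the tile that contains it is just floor division by 256. A minimal Go sketch (the function name is mine, not taken from the actual code), which has to be slightly careful so that negative coordinates land on the correct tile:

```go
// tileCoord maps a canvas pixel coordinate to the coordinate of the
// 256x256 tile containing it. Go's integer division truncates toward
// zero, so negative pixels need an explicit floor to end up on the
// correct tile (pixel -1 lives on tile -1, not tile 0).
func tileCoord(pixel int64) int64 {
	t := pixel / 256
	if pixel < 0 && pixel%256 != 0 {
		t--
	}
	return t
}
```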
The naming of tiles
These tiles sit in an infinite grid, with a coordinate system stretching out from -∞ on one end to +∞ on the other. To keep storage costs finite, we don’t store the infinite number of leading zeros of coordinates; in other words, 42 is stored as 42 and not as …0000000000042. (Even then, it is possible for coordinates to get so large that they don’t fit in the computer’s memory, so this technically limits the canvas size somewhat. It’s still unimaginably much bigger than the known universe.)
For efficiency, these coordinates are not represented in decimal, but in base-256, as a byte array. To represent negative numbers, the format uses “256’s complement”: from 0 to 127 everything is normal, but at 128 the number line “wraps around”, so 128 is interpreted as -128, 129 as -127, 130 as -126, and so on up to 255, which is interpreted as -1. For numbers whose leading byte would otherwise be misread as a sign, we prefix a byte with a value of 0 or 255 to signify on which side of 0 we are. I had some fun writing a base-256 arithmetic library for this; first in JavaScript, then in Go, then once more in Java.
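To make the “256’s complement” idea concrete, here is a rough Go sketch of how encoding and decoding might look for values that fit in an int64. The real library works on arbitrary-precision byte arrays, and the function names here are purely illustrative:

```go
package main

import "fmt"

// encode turns a signed integer into a minimal big-endian "256's complement"
// byte slice: the top bit of the first byte carries the sign, so a 0x00 or
// 0xFF prefix byte is kept only when it is needed to get the sign right.
func encode(n int64) []byte {
	// Start with the full 8-byte two's-complement representation.
	buf := make([]byte, 8)
	for i := 7; i >= 0; i-- {
		buf[i] = byte(n)
		n >>= 8
	}
	// Strip redundant leading 0x00 (non-negative) or 0xFF (negative) bytes,
	// as long as the next byte still encodes the correct sign on its own.
	i := 0
	for i < 7 {
		if buf[i] == 0x00 && buf[i+1] < 0x80 {
			i++
		} else if buf[i] == 0xFF && buf[i+1] >= 0x80 {
			i++
		} else {
			break
		}
	}
	return buf[i:]
}

// decode is the inverse: sign-extend from the first byte, then shift in the rest.
func decode(b []byte) int64 {
	var n int64
	if b[0] >= 0x80 {
		n = -1 // as if preceded by infinitely many 0xFF bytes
	}
	for _, c := range b {
		n = n<<8 | int64(c)
	}
	return n
}

func main() {
	fmt.Printf("%x\n", encode(42))    // 2a
	fmt.Printf("%x\n", encode(-1))    // ff
	fmt.Printf("%x\n", encode(200))   // 00c8 (prefix needed: 0xc8 alone would mean -56)
	fmt.Println(decode(encode(-126))) // -126
}
```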
Inside a tile
In the old Bigcanvas, the content of a tile was a series of strokes, where a stroke was just a series of points. When somebody drew something, the new strokes were simply concatenated onto those that were already there. After seeing Webcanvas, I realized that this approach was going to be severely limiting. Not only does it preclude more advanced operations like blur, especially near tile edges, but it also makes tiles download and render more slowly the more has been drawn on them. That clearly doesn’t scale!
So I decided to go with a more traditional approach instead: storing each tile as a PNG image. (We can’t use JPEG, despite its smaller size, because the repeated decompression and recompression caused by incremental updates would quickly ruin image quality.)
The obvious drawback is that changes are no longer a simple concatenation. Either we have to merge PNG files on the server, or the client has to send the entire updated PNG across the wire. This turned out to be a rather big topic, about which I’ll write more in the next post.
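To give a flavour of the server-side option: the merge itself is cheap with Go’s standard image packages. This is only a sketch, and mergeTile is a hypothetical helper, not necessarily what the real server ends up doing (that’s for the next post):

```go
package tiles

import (
	"bytes"
	"image"
	"image/draw"
	"image/png"
)

// mergeTile decodes the stored tile and an incoming patch, draws the patch
// over the tile (respecting alpha), and returns the re-encoded PNG bytes.
func mergeTile(tilePNG, patchPNG []byte) ([]byte, error) {
	tile, err := png.Decode(bytes.NewReader(tilePNG))
	if err != nil {
		return nil, err
	}
	patch, err := png.Decode(bytes.NewReader(patchPNG))
	if err != nil {
		return nil, err
	}

	// Copy the existing tile into a mutable RGBA image, then composite
	// the patch on top of it.
	out := image.NewRGBA(image.Rect(0, 0, 256, 256))
	draw.Draw(out, out.Bounds(), tile, image.Point{}, draw.Src)
	draw.Draw(out, out.Bounds(), patch, image.Point{}, draw.Over)

	var buf bytes.Buffer
	if err := png.Encode(&buf, out); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```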
Storage
The old Bigcanvas stored its tiles on the filesystem. Each tile was a file, which made appending strokes trivially easy, but it did lead to performance problems, despite the care I took to create a hierarchical directory structure to avoid huge numbers of files in one directory.
So in the new version, I shopped around for a database system, and eventually settled on Redis. It’s a key-value store at heart, but the values are structured: you can have sets, maps, arrays and such. Everything is in-memory, so it’s super fast, but it can also persist to disk. And it can be sharded by key across multiple machines, so it should scale well if it ever comes to that.
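The storage layer itself can stay very simple. Here is a sketch assuming the go-redis client and a made-up, human-readable tile:&lt;x&gt;:&lt;y&gt; key scheme; as described below, the real keys and values are protobuf-encoded:

```go
package storage

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// TileStore reads and writes tile PNGs in Redis, keyed by tile coordinates.
type TileStore struct {
	rdb *redis.Client
}

func NewTileStore(addr string) *TileStore {
	return &TileStore{rdb: redis.NewClient(&redis.Options{Addr: addr})}
}

// SaveTile overwrites the stored PNG for the tile at (x, y).
func (s *TileStore) SaveTile(ctx context.Context, x, y int64, pngData []byte) error {
	return s.rdb.Set(ctx, tileKey(x, y), pngData, 0).Err()
}

// LoadTile returns the stored PNG, or (nil, nil) if the region is still empty.
func (s *TileStore) LoadTile(ctx context.Context, x, y int64) ([]byte, error) {
	data, err := s.rdb.Get(ctx, tileKey(x, y)).Bytes()
	if err == redis.Nil {
		return nil, nil // nothing drawn here yet, so nothing stored
	}
	return data, err
}

func tileKey(x, y int64) string {
	return fmt.Sprintf("tile:%d:%d", x, y)
}
```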
The biggest risk here is that the necessary storage will grow too quickly, and I’ll end up paying too much for RAM. In that case, I’ll probably swap Redis out for a disk-based DBMS like MongoDB or plain old MySQL.
My god, it’s full of bits
Initially, I did a web-based prototype of the new Bigcanvas, which like the old version used websockets for communicating with the server. Before I made the switch to PNG, I made up my own binary serialization format to keep uploads and downloads small and efficient. But after a while, I realized that I was just reinventing Google’s protocol buffers, badly, so I bit the bullet and ported it over to protobufs. Since protobufs don’t come with an official JavaScript version (binary manipulation in JavaScript is not really fast), and since ProtoBuf.js seemed like overkill, I rolled my own. OK, maybe I mostly did it because it seemed like fun.
So now, everything is protobufs. Messages from server to client, from client to server, keys in Redis, values in Redis… Serialization and deserialization are fast and efficient, and unlike JSON you get a strongly typed object to work with. And as long as you follow the rules, protobufs are forwards and backwards compatible, so schema changes should not require a data migration. That makes it very flexible, which is just what I needed, since I can’t know where this project is headed yet.
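For illustration, here is what a hypothetical schema might look like (these message and field names are made up, not Bigcanvas’s actual definitions). Fields are identified on the wire by number, so adding a field with a fresh number breaks neither old readers nor old data:

```proto
// Hypothetical schema for illustration only.
syntax = "proto3";

message Tile {
  sint64 x = 1;        // tile coordinate
  sint64 y = 2;
  bytes png_data = 3;  // current contents of the tile as a PNG

  // Added in a later schema version: old readers skip the unknown field,
  // and old data simply decodes with revision = 0. No migration needed.
  uint64 revision = 4;
}
```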
Shortly thereafter, Google released gRPC, an RPC protocol based on protocol buffers exchanged over HTTP/2. It comes with a “javanano” version that promises to be efficient on mobile, which seemed useful. But my main reason for switching from my hand-rolled websockets-based protocol to gRPC was that it provides a proper request-response model, which I needed for reliable concurrent updates (again, more on that in a later post). And, OK, maybe also because it seemed like fun to try it out. Verdict: once you get everything to compile, it works like a charm!
Using HTTP/2 does exclude today’s JavaScript clients, but that wasn’t as big a deal as I thought. Getting from PNG bytes on a websocket to an image on the screen and back again turned out to be possible, but fairly slow and painful, and that was on a beefy desktop machine. So I decided that I would need a different solution for JavaScript anyway, probably just serving up the images via HTTP. Since I’m going after mobile first, that can wait.