How does one store the contents of an infinite canvas into a computer’s finite memory? One cheats. In this case, by taking advantage of the fact that the canvas may be infinite, but people’s drawings are quite finite. We simply don’t store the empty regions.
To that end, the canvas is broken up into tiles of 256×256 pixels. I chose that size fairly arbitrarily based on gut feeling. Large tiles are probably more compressible and reduce the number of tiles we have to fetch, but smaller tiles lead to more fine-grained updates so the client has less data to upload. The decisive argument was something else entirely: if I eventually implement locking (more on that later), it will be easiest to do on a per-tile basis, so the tiles should not be too large.
The naming of tiles
These tiles sit in an infinite grid, with a coordinate system stretching out from -∞ on one end to +∞ on the other. To keep storage costs finite, we don’t store the infinite number of leading zeros of coordinates; in other words,
42 and not
…0000000000042. (Even then, it is possible for coordinates to get so large that they don’t fit in the computer’s memory, so this technically limits the canvas size somewhat. It’s still unimaginably much bigger than the known universe.)
Inside a tile
In the old Bigcanvas, the content of a tile was a series of strokes, where a stroke was just a series of points. When somebody drew something, the strokes were simply concatenated onto those that were already there. After seeing Webcanvas, I realized that this approach was going to be severely limiting. Not only does it preclude more advanced operations like blur, especially near tile edges, but it also makes tiles download and render more slowly the more is drawn to them. That clearly doesn’t scale!
So I decided to go with a more traditional approach instead: storing each tile as a PNG image. (We can’t use JPEG for its smaller size, because repeated decompression and recompression caused by incremental updates would quickly ruin image quality.)
The obvious drawback is that changes are no longer a simple concatenation. Either we have to merge PNG files on the server, or the client has to send the entire updated PNG across the wire. This turned out to be a rather big topic, about which I’ll write more in the next post.
The old Bigcanvas stored its tiles on the filesystem. Each tile was a file, which made appending strokes trivially easy, but it did lead to performance problems, despite the care I took to create a hierarchical directory structure to avoid huge numbers of files in one directory.
So in the new version, I shopped around for a database system, and eventually settled on Redis. It’s a key-value store at heart, but the values are structured: you can have sets, maps, arrays and such. Everything is in-memory, so it’s super fast, but it can also persist to disk. And it can be sharded by key across multiple machines, so it should scale well if it ever comes to that.
The biggest risk here is that the necessary storage will grow too quickly, and I’ll end up paying too much for RAM. In that case, I’ll probably swap Redis out for a disk-based DBMS like MongoDB or plain old MySQL.
My god, it’s full of bits
So now, everything is protobufs. Messages from server to client, from client to server, keys in Redis, values in Redis… Serialization and deserialization is fast and efficient, and unlike JSON you get a strongly typed object to work with. And as long as you follow the rules protobufs are forwards and backwards compatible, so any schema changes should not require a data migration. So it’s very flexible, which is just what I needed, since I can’t know where this project is headed yet.
Shortly thereafter, Google released gRPC, an RPC protocol based on protocol buffers being exchanged over HTTP2. It comes with a “javanano” version that promises to be efficient on mobile, which seemed useful. But my main reason for switching from my hand-rolled websockets-based protocol to gRPC is that it provided a proper request-response model, which I needed for reliable concurrent updates (again, more on that in a later post). And, OK, maybe also because it seemed like fun to try it out. Verdict: once you get everything to compile, it works like a charm!