At Avocode, we render designs from scratch. We have developed a specialized hardware-accelerated rendering engine in C++ that also runs in the browser with the help of Emscripten and WebAssembly.
For us, the raw performance of the code itself is not a big issue, since we use WebGL for all heavy-duty tasks. This lets us fully utilize the specialized graphics hardware present in all modern computers, which is ideal for this purpose. Even though WebGL is more limited than its native counterparts, such as OpenGL and DirectX, it is almost equal in performance.
So what challenges did we face as we developed this engine?
We had to study the behavior of the supported graphics editors and try to reproduce their every quirk as faithfully as possible. A large portion of the editors’ functionality basically had to be reimplemented: rasterization, composition, blending, masking, transformations, filters, and effects. We have to be ready for even the most unlikely combinations, because with the volume of designs we process (over 100,000 a month), anything that is possible will probably appear at some point. Should a layer’s shadow be visible through its semi-transparent border? How does a layer’s blending mode interact with its fill’s blending mode and the parent group’s blending mode? The answer is usually: it depends.
What we have found is that each editor has a different “personality” in this respect. For example, Photoshop, while slightly intimidating at first with its large number of in-depth configuration options, actually turned out to be the most consistent and predictable of them all, in an almost scientific way. If something unexpected happens in Photoshop, chances are that it actually makes perfect sense and you simply don’t understand it well enough yet.
Sketch, on the other hand, proved to be the least robust. It was very easy to come up with situations that the program hadn’t been prepared for, at which point the results were completely unpredictable. For example, adding a certain effect to a layer may inexplicably change the drawing order of other elements in the scene.
At the same time, the graphics editors are constantly being updated, so we have to stay on our toes and implement new features as they come. Most recently, Sketch put us to work with its Smart Layout feature, whose exact behavior is somewhat unclear.
Avocode Cloud API
Since we have already done the work of writing parsers and rendering support for the most used proprietary formats, we’d like to open this technology via an on-demand Cloud API. If you need to request data from design files and would like to use our Cloud API, please contact us for a potential partnership at firstname.lastname@example.org.
One component of this project in particular, which seemed innocuous at first, has blown up into one of our most significant endeavors: text rendering.
“What’s so hard about rendering text?” you may ask. Well, the problem is not so much the rendering itself as laying out the characters and, more specifically, doing this in exactly the same way as the original editors. There are many reasons why this is a challenging task. For example, the world’s languages use many completely different scripts, some of which follow different rules and may, for instance, be written from right to left.
Since we are not aiming only to display simple strings such as “Hello world,” but fully formatted multi-line, multi-paragraph text layers, many variables come into play, such as the typeface, font size, and style, horizontal and vertical alignment, justification, letter spacing, kerning, some of which may differ for each individual character. Again, it is not that hard to process all of that and produce the correct result. But our job is to lay out each individual character at precisely the same position as the source editor would. It is no big surprise that each editor behaves differently. And they do not make this any easier for us. For example, we have no indication of how a paragraph has been broken up into lines, but our users expect the same result. And if we are off by just one pixel, a long word might no longer fit on its line and will be moved to the next one, causing a chain reaction on the remaining lines. Of course, cases like this still occur, and we are still actively working on our text layout algorithm.
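To illustrate the chain reaction described above, here is a minimal sketch of line breaking. It assumes a simple greedy first-fit strategy over pre-measured word widths, which is not necessarily what any of the editors do. The function name breakLines and its parameters are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Greedy first-fit line breaking over pre-measured word widths (in pixels).
// This is an illustrative assumption, not the algorithm of any specific editor.
// Returns the number of words placed on each line.
std::vector<std::size_t> breakLines(const std::vector<double>& wordWidths,
                                    double spaceWidth,
                                    double maxLineWidth) {
    assert(maxLineWidth > 0);
    std::vector<std::size_t> lines;
    double used = 0;        // width occupied on the current line
    std::size_t count = 0;  // words on the current line

    for (double w : wordWidths) {
        // Width of the line if this word were appended to it.
        const double needed = (count == 0) ? w : used + spaceWidth + w;
        if (count > 0 && needed > maxLineWidth) {
            // The word does not fit: close the line and start a new one.
            lines.push_back(count);
            used = w;
            count = 1;
        } else {
            // The word fits (or it is alone on the line and must stay there).
            used = needed;
            ++count;
        }
    }
    if (count > 0) lines.push_back(count);
    return lines;
}
```

With six words of 40 px each and a 10 px space, a 140 px text box holds three words per line, giving two lines. Shrinking the box by a single pixel to 139 px pushes the third word down, and every following line shifts with it, giving three lines of two words each.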
Another big issue is the management of font files. As you might know, if you use a non-standard font and transfer the design file to a different computer where this font isn’t installed, the text will either be drawn with the wrong font, or there might only be a cached image of the text that cannot be edited without selecting another font. All in all, the problem is that the typefaces used are not embedded in the file. We do manage a database of common typefaces, but the font file on our server may not always perfectly match the one on the designer’s computer. It could be a newer revision with different metrics, a version with a different character subset, or possibly a wholly different typeface that has the same name. For cases like these to work properly, our users have to upload their font files to Avocode.
Monroe Machine - asynchronous graphics engine
Throughout the development process, our goals and priorities were constantly shifting. What was originally meant to be a server-side service that produced rendered images was now to become a core component of the Avocode front-end, running in real time in the user’s browser. Given this unexpected turn of events and certain limitations imposed on apps running in the browser, our engine was not fully prepared for this paradigm shift.
The single most severe shortcoming was that once rendering had started, there was no way to stop or pause it, at least not without multi-threading, a feature not available in browsers at the time. This meant that for the entire duration of rendering, the Avocode app would be locked up and the user unable to interact with it in any way. To avoid rebuilding the whole project with this consideration in mind, we came up with what is now known as the “Monroe Machine” architecture. The main idea is that while the program’s structure remains the same, all procedures that would cause actual rendering to occur are replaced by ones that write down a symbolic command for that operation. Since nothing is actually drawn, this process is almost instantaneous and therefore doesn’t need to be interrupted. The result is a sequence of instructions in binary form, essentially an intermediate bytecode, which is then fed, one instruction at a time, into an interpreter that handles the actual rendering. Most importantly, the interpretation may be halted at any point between two instructions. This allows us to alternate between discrete rendering steps and responding to user input, or to restart the process with updated parameters.
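The record-then-interpret idea can be sketched roughly as follows. This is a simplified illustration and not the actual engine code. The opcodes, the Command struct, and the Recorder and Interpreter classes are all hypothetical, and a plain vector of structs stands in for the binary instruction stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes; the real instruction set is not described here.
enum class Op : std::uint8_t { Clear, DrawRect, Blend };

struct Command {
    Op op;
    std::int32_t arg;  // placeholder operand
};

// Recording phase: "drawing" calls only append commands, so this is fast.
class Recorder {
public:
    void clear() { program_.push_back({Op::Clear, 0}); }
    void drawRect(std::int32_t id) { program_.push_back({Op::DrawRect, id}); }
    void blend(std::int32_t mode) { program_.push_back({Op::Blend, mode}); }
    const std::vector<Command>& program() const { return program_; }

private:
    std::vector<Command> program_;
};

// Interpretation phase: executes a bounded number of commands per call,
// so the caller can handle user input between calls.
class Interpreter {
public:
    explicit Interpreter(const std::vector<Command>& program)
        : program_(program) {}

    // Runs at most `budget` commands; returns true once the program is done.
    bool step(std::size_t budget) {
        assert(pc_ <= program_.size());
        for (std::size_t i = 0; i < budget && pc_ < program_.size(); ++i) {
            execute(program_[pc_++]);
        }
        return done();
    }

    bool done() const { return pc_ >= program_.size(); }
    std::size_t executed() const { return executed_; }

    // Start over, e.g. after the parameters have changed.
    void restart() { pc_ = 0; executed_ = 0; }

private:
    // Stand-in for the real rendering work of one instruction.
    void execute(const Command&) { ++executed_; }

    const std::vector<Command>& program_;  // must outlive the interpreter
    std::size_t pc_ = 0;                   // index of the next command
    std::size_t executed_ = 0;
};
```

A host loop would call step() with a small budget once per frame, process pending input events in between, and call restart() whenever the view parameters change.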
Ironically, even though this architecture was mainly created due to the lack of multi-threading, it actually made it easy to add multi-threading support to our rendering engine. The architecture is almost ideal for this, because all we had to do was add a second interpreter that runs along the instruction sequence ahead of the main one and prepares partial results that can be computed in a secondary thread. These are typically certain CPU-intensive calculations, as GPU rendering does not benefit from running in multiple threads - it has its own hardware-level multi-threading. The only consideration is that the preprocessing interpreter must not pass the main one and it shouldn’t get too far ahead, because then the partial results may take up too much memory before they get consumed. At the same time, running the second interpreter is fully optional, so the application still works perfectly in a single-threaded environment.
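The bounded look-ahead rule between the two interpreters might look something like the sketch below. To keep it short and deterministic, it leaves out the actual threads and synchronization and only models the two cursors. The class name LookaheadWindow, the function expensive, and the parameter maxLead are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Stand-in for a CPU-intensive calculation belonging to instruction `i`.
inline long expensive(std::size_t i) { return static_cast<long>(i * i); }

// Models the main cursor and the preprocessing cursor over one program.
// In a real engine the preprocessing side would run on a secondary thread
// and need synchronization, which is omitted here.
class LookaheadWindow {
public:
    LookaheadWindow(std::size_t count, std::size_t maxLead)
        : cache_(count), maxLead_(maxLead) {
        assert(maxLead > 0);
    }

    // Preprocessing interpreter: prepares one partial result if allowed.
    // Returns false when it must wait (too far ahead) or nothing is left.
    bool preprocessOne() {
        if (prePos_ < mainPos_) prePos_ = mainPos_;  // skip work already done
        if (prePos_ >= cache_.size()) return false;
        if (prePos_ >= mainPos_ + maxLead_) return false;  // cap memory use
        cache_[prePos_] = expensive(prePos_);
        ++prePos_;
        ++pending_;
        return true;
    }

    // Main interpreter: uses the prepared result if there is one and
    // otherwise computes it itself, so preprocessing stays optional.
    long consumeNext() {
        assert(mainPos_ < cache_.size());
        std::optional<long>& slot = cache_[mainPos_];
        long value = 0;
        if (slot) {
            value = *slot;
            slot.reset();  // release the partial result once consumed
            --pending_;
        } else {
            value = expensive(mainPos_);
        }
        ++mainPos_;
        return value;
    }

    bool finished() const { return mainPos_ >= cache_.size(); }
    std::size_t pending() const { return pending_; }  // results not yet consumed

private:
    std::vector<std::optional<long>> cache_;
    std::size_t maxLead_;
    std::size_t mainPos_ = 0;
    std::size_t prePos_ = 0;
    std::size_t pending_ = 0;
};
```

Because consumeNext() falls back to computing the value itself, the same code path works whether or not preprocessOne() is ever called, which mirrors the optional nature of the second interpreter.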
The Stage - design canvas in the browser
Let’s see what we can do with any Sketch, XD, PSD, AI, and Figma design in the browser.
This is an Adobe XD design linked from Avocode. You can zoom in and pan to see how sharply the design is rendered. The design canvas behind it is called The Stage, and we can build pretty much anything on top of it. We can show you layers, show more artboards at once (even across formats), enable rapid prototyping, and more.
So how does The Stage work?
Since we have moved from the rendering service paradigm to a dynamic real-time viewer, we’ve been gradually improving our design view, which we call “the stage.” At this point, we are already able to draw a large number of artboards from scratch simultaneously and at any level of magnification. As the user is panning and zooming in the stage, we fill in the missing parts on the fly. For this, we use an approach similar to Google Maps. At each magnification level, we divide the plane into square tiles. We render all of the visible ones, as well as some that are just outside the user’s field of view - to be prepared for when they pan towards them. To save memory, some tiles that become too far out of view are dropped. When available, we make use of a fixed-scale pre-rendered image of the whole design to fill in the unfinished areas. Here is an animation of what that might look like:
Similarly, when zooming in or out, the view is temporarily filled by the scaled previous state of the stage, as well as the pre-render.
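The tile selection described above can be sketched as follows. This is only an illustration under stated assumptions: the viewport is given in pixels at the current magnification level, and the tile size, the margin, and the keep radius are free parameters. None of the names or values come from the actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Viewport rectangle in pixels at the current magnification level.
struct Rect { double left, top, right, bottom; };

// Inclusive range of tile indices.
struct TileRange { int x0, y0, x1, y1; };

struct TileId { int x, y; };

// Tiles overlapped by `view`, grown by `pad` tiles on every side.
TileRange coveredRange(const Rect& view, double tileSize, int pad) {
    assert(tileSize > 0 && pad >= 0);
    assert(view.right > view.left && view.bottom > view.top);
    TileRange r;
    r.x0 = static_cast<int>(std::floor(view.left / tileSize)) - pad;
    r.y0 = static_cast<int>(std::floor(view.top / tileSize)) - pad;
    r.x1 = static_cast<int>(std::ceil(view.right / tileSize)) - 1 + pad;
    r.y1 = static_cast<int>(std::ceil(view.bottom / tileSize)) - 1 + pad;
    return r;
}

// Visible tiles plus a margin just outside the view, in row-major order.
std::vector<TileId> tilesToRender(const Rect& view, double tileSize, int margin) {
    const TileRange r = coveredRange(view, tileSize, margin);
    std::vector<TileId> tiles;
    for (int y = r.y0; y <= r.y1; ++y) {
        for (int x = r.x0; x <= r.x1; ++x) {
            tiles.push_back(TileId{x, y});
        }
    }
    return tiles;
}

// A cached tile is dropped once it lies outside the larger "keep" range.
bool shouldDrop(const TileId& tile, const Rect& view, double tileSize,
                int keepRadius) {
    const TileRange r = coveredRange(view, tileSize, keepRadius);
    return tile.x < r.x0 || tile.x > r.x1 || tile.y < r.y0 || tile.y > r.y1;
}
```

Choosing a keep radius larger than the render margin avoids dropping a tile and then immediately re-rendering it when the user pans back and forth by a small amount.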
Unlike maps, though, we also have to update tiles that have already been rendered, since it is possible to change the design view in certain ways - currently mainly by changing the visibility of specific layers.
In the future, however, we have big plans for editing designs, so stay tuned!