Autotiling Adventures Part II: Procedural masks for biomes and rivers

Biome masks


In the last article, I described a way to autotile multiple biomes using a minimal set of mask shapes. I used a custom map for testing. This time, I use some shaders to generate a nice big set of masks. For example, I can generate 32 variations of each of the 4 shapes at 256×256 resolution. As we have 1 shape per RGBA texture component (our masks are grayscale), we need 32 RGBA textures, or a single 32-slice array. Stitching them together, the procedural masks look like this (rows: variations, columns: shapes):


These masks are generated using Perlin noise, and then they are post-processed to remove floating islands. Here’s how:

  • We know that each shape contains 1 or 2 white regions and always 1 black region
  • Detect all the black regions, sort them by area, and replace all but the largest with white (so we satisfy the “always 1 black region” criterion)
  • Detect all the white regions, sort them by area, and replace all but the largest 1 (or 2) with black (so we satisfy the “always 1 or 2 white regions” criterion)
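The clean-up steps above can be sketched on the CPU as a flood fill over a binary mask. This is only an illustration, not the shader-based implementation: it assumes the noise has already been thresholded to 0 (black) and 1 (white), that regions are 4-connected, and the function names are hypothetical.

```python
from collections import deque

def label_regions(mask, value):
    """Return all 4-connected regions (lists of (row, col)) whose cells equal `value`."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] != value or seen[r][c]:
                continue
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] and mask[ny][nx] == value:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def keep_largest(mask, value, keep):
    """Flip every region of `value` to the other colour, except the `keep` largest ones."""
    regions = sorted(label_regions(mask, value), key=len, reverse=True)
    for region in regions[keep:]:
        for y, x in region:
            mask[y][x] = 1 - value
    return mask

def remove_islands(mask, white_regions):
    """Black pass first (keep 1 region), then white pass (keep 1 or 2 regions)."""
    keep_largest(mask, 0, 1)
    keep_largest(mask, 1, white_regions)
    return mask
```

Running the black pass first means that any white island removed afterwards is absorbed by the single remaining black region, so the second pass cannot create a new black region.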

Here are the steps visually: left is the original image, middle is with extraneous black areas removed, right is the final, with the extraneous white areas removed:


At this stage, we calculate the distance field for each of the masks. The distance field is 256×256 at this point. The maximum distance in the distance field is the length of the diagonal, diag = 256*sqrt(2); we normalize the values in the distance field from (-diag,diag) to (0,1), to be resolution independent. We then downsample the distance field to 32×32, which is still enough to reconstruct the shape nicely. The data is stored in an RGBA8 texture. If each variation is an array slice, we end up with a 32x32xN texture array. To give some perspective, for 64 variations we need 32*32*64*4 bytes = 256K of memory, which is very little. Add a bit extra for the mipmaps (which are good for filtering when zooming out further), and we’re done with the biome masks.
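The normalization and the memory estimate above can be written out as a small sketch. The function names are hypothetical, and the 8-bit quantisation step is an assumption about how a normalized value ends up in one RGBA8 channel.

```python
import math

def normalize_distance(d, size=256):
    """Map a signed distance in (-diag, diag) to (0, 1), where diag = size * sqrt(2)."""
    diag = size * math.sqrt(2)
    return (d + diag) / (2 * diag)

def to_unorm8(value):
    """Quantise a normalized value to one 8-bit channel (assumed storage step)."""
    return round(min(max(value, 0.0), 1.0) * 255)

def array_bytes(size=32, slices=64, channels=4, with_mips=False):
    """Bytes for a size x size x slices texture array at 1 byte per channel."""
    texels = 0
    level = size
    while level >= 1:
        texels += level * level
        if not with_mips:
            break
        level //= 2  # next mip level has half the resolution
    return texels * slices * channels
```

With the defaults, `array_bytes()` gives 262144 bytes (256K), and a full mip chain adds roughly one third on top of that.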

Rendering the biome masks

Last time I described a way to render the masks, by rendering a subset of tiles per layer. This is far from optimal (it was approach v1 after all). So, here’s a better one:

  • We observe that every tile has to be rendered (duh). That means we need a dense 2D data structure, with tile data per element. So, tile positions are now implicit.
  • We observe that we have up to 4 layers per tile. The info that we need per layer is the layer index (4 bits), the mask index (3 bits) and the transform index (3 bits). That makes 10 bits per layer, so 40 bits in total. So we place the data in a 64-bit data structure (e.g. RGBA16 or RG32) and we have 24 bits to spare.
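As an illustration of the per-tile layout described above, here is a minimal packing sketch. The field order within each 10-bit entry and the slot order within the word are assumptions, and how "valid" layers are flagged is not covered here.

```python
BITS_PER_LAYER = 10  # 4 bits layer index + 3 bits mask index + 3 bits transform index

def pack_tile(layers):
    """Pack up to 4 (layer_index, mask_index, transform_index) tuples into one integer."""
    if len(layers) > 4:
        raise ValueError("at most 4 layers per tile")
    word = 0
    for slot, (layer, mask, xform) in enumerate(layers):
        if not (0 <= layer < 16 and 0 <= mask < 8 and 0 <= xform < 8):
            raise ValueError("field out of range")
        entry = layer | (mask << 4) | (xform << 7)
        word |= entry << (slot * BITS_PER_LAYER)
    return word

def unpack_tile(word):
    """Inverse of pack_tile; always returns 4 entries."""
    layers = []
    for slot in range(4):
        entry = (word >> (slot * BITS_PER_LAYER)) & 0x3FF
        layers.append((entry & 0xF, (entry >> 4) & 0x7, (entry >> 7) & 0x7))
    return layers
```

The packed value only occupies the low 40 bits, so the upper bits of a 64-bit texel stay free for other per-tile data.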

Now we render the visible grid, and we sample this data structure to reconstruct the mask. The pseudocode is roughly as follows:

for each pixel:
  calculate tile index and offset in tile
  shift output position by half a tile // for corner offset
  sample autotile data based on tile index
  set output color as 0
  for each valid layer
    transform tile offset using layer transform
    sample mask using transformed coordinates and mask index
    calculate color based on layer
    blend output color with current color based on mask value
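A CPU-side sketch of the same loop, for illustration only. The direction of the half-tile shift, the linear blend, and the callables `apply_transform`, `sample_mask` and `layer_color` are all assumptions standing in for what the shader does.

```python
import math

def tile_index_and_offset(px, py, tile_size):
    """Shift by half a tile, then split into integer tile index and [0,1) offset."""
    sx, sy = px + tile_size / 2, py + tile_size / 2
    tx, ty = math.floor(sx / tile_size), math.floor(sy / tile_size)
    return (tx, ty), (sx / tile_size - tx, sy / tile_size - ty)

def blend(dst, src, mask_value):
    """Linear interpolation from dst to src by mask_value in [0, 1]."""
    return tuple(d + (s - d) * mask_value for d, s in zip(dst, src))

def shade(layers, offset, apply_transform, sample_mask, layer_color):
    """Accumulate the layers of one tile, lowest priority first."""
    color = (0.0, 0.0, 0.0)
    for layer, mask_index, xform in layers:
        u, v = apply_transform(offset, xform)
        color = blend(color, layer_color(layer), sample_mask(mask_index, u, v))
    return color
```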


River (and road) masks

River masks are slightly different to biome masks and have the following characteristics:

  • The tiles where we need river masks are few: for my map, it was 1.5% of the total tiles.
  • It is not beneficial any more to use corner offsets.
  • There is no diagonal river connection.
  • All river tiles connect to at least one river tile.
  • There is always a source/origin tile for rivers. The origin tile is always connected to one other tile.

Given the above, we realize that we really need 5 different masks: origin, line, corner, t-junction and cross. Below is a list of examples:

We follow the same process as with the biome masks: we remove extraneous white/black regions, calculate distance fields and downsample to 32×32.
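To make the five shapes concrete, here is a hedged sketch of how a river tile could be classified from its 4-neighbour connectivity. The N/E/S/W bit order, the canonical orientation of each shape and the function names are assumptions chosen for illustration.

```python
# Neighbour bits (assumed ordering): a set bit means "that neighbour is also a river tile".
N, E, S, W = 1, 2, 4, 8

# Canonical connectivity per shape (assumed orientations).
CANONICAL = {
    "origin": N,
    "line": N | S,
    "corner": N | E,
    "t-junction": N | E | S,
    "cross": N | E | S | W,
}

def rotate_cw(bits):
    """Rotate a 4-bit N/E/S/W pattern by 90 degrees clockwise (N->E->S->W->N)."""
    return ((bits << 1) | (bits >> 3)) & 0xF

def classify(bits):
    """Return (shape, quarter_turns) such that rotating the canonical shape gives `bits`."""
    for shape, canon in CANONICAL.items():
        rotated = canon
        for turns in range(4):
            if rotated == bits:
                return shape, turns
            rotated = rotate_cw(rotated)
    raise ValueError("a river tile must connect to at least one neighbour")
```

All 15 non-empty connectivity patterns map to one of the five shapes plus a rotation, which is why no further masks are needed when diagonal connections are excluded.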

Here’s also a video that demonstrates all the mask shapes, procedurally generated, parameterized by time:

As you can see, for the river masks there are typically big black holes in the middle, but they are filled out by the process I described earlier.

River and road rendering

The process is a bit different to the biome mask rendering. Now we have a sparse set of tiles that contain river/road data. The tile data required are 3 bits for the mask index, 3 bits for the transform, and 10 bits for each of the x,y coordinates of the tile (I’m using 512×512 maps for the overworld and I doubt I’d use 2048 or larger). The pseudocode for rendering is similar and a bit simpler compared to the biome mask rendering: for a tile, we unpack the autotile data, we calculate the output position based on the x,y values, and we sample the mask using transformed coordinates.
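The sparse tile record adds up to 3 + 3 + 10 + 10 = 26 bits, so it fits in a single 32-bit value. A minimal sketch of such a layout follows; the field order and function names are assumptions.

```python
def pack_river_tile(mask_index, transform, x, y):
    """Pack mask (3 bits), transform (3 bits) and x, y (10 bits each) into one integer."""
    if not (0 <= mask_index < 8 and 0 <= transform < 8):
        raise ValueError("mask/transform out of range")
    if not (0 <= x < 1024 and 0 <= y < 1024):
        raise ValueError("coordinates need more than 10 bits")
    return mask_index | (transform << 3) | (x << 6) | (y << 16)

def unpack_river_tile(word):
    """Inverse of pack_river_tile: returns (mask_index, transform, x, y)."""
    return (word & 0x7, (word >> 3) & 0x7, (word >> 6) & 0x3FF, (word >> 16) & 0x3FF)
```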

The river/road render passes take place immediately after the biome mask pass, as they use a sparse tile renderer (biome masks rendering uses a dense tile renderer) and they don’t use corner offsets.

Putting it all together


Here’s a video that shows all masks on the map, compared to the single color per pixel:

As a note, the original single-color-per-pixel version has some additional color variation based on the vegetation density, which the new masked version does not have yet. Also, I think there’s an indexing bug for the variations, as I should have 64 different variations per shape but we can see the occasional repetition.

TODOs for next are coastal water animation utilising the distance field, color variation of the biomes (they still look quite flat) and prop locations per tile for placement of trees, etc.

Autotiling adventures

So, we have a procedurally generated biome map, where each pixel is an individual biome. If we zoom in, it’s obviously quite pixelated. If we add sprites, it just doesn’t look right (besides the lighting and colors).

We have reasonably detailed sprites on single-colored squares. It’s just ugly. We need to add some texture/detail.



Enter auto-tiling (or transition tiles): a method to automatically place tiles with texture so that each tile matches perfectly with its neighbours. It’s a bit of a wild west out there in terms of approaches, so here are some resources (or resource lists) that I found useful:

Quite a few.

There are two main ways to consider art when using autotiles: using masks, or using premade textures. A good example is shown here:

Autotiling example

Blend masks example

The premade tiles have the obvious benefit that they can be very nicely done, but of course they are tied to the content they represent. The blend masks do not look as good, but are easier to develop, and they are more flexible in terms of what textures we want to seamlessly mix. I decided to use masks as I want transitions between any pair of biomes: for 16 biome types, that’s 120 unique combinations. It’s not an option to ask an artist to develop 120 different autotiles, as that needs quite a bit of money and time. Also, that would have no variation; each autotile would be replicated all over the place, so it would be easy to distinguish patterns.


Grid shifting

The first naive thought that comes to mind (and I went with it for a while actually) is “ok, we have a tile, it is neighbour to 4 or 8 other tiles, so generate masks according to that relationship”. Example here. As one can see, the 4-connected version is less interesting than the 8-connected version (and we don’t want less-interesting), but the 8-connected version results in a lot of combinations! So what do we do? Well, we shift the grid. This way, we always have 4 potentially different tiles (quarters of them anyway)

Below, we shift the whole grid top-left by half a tile. Now, each grid cell (red) always contains parts of 4 tiles.

While this is mentioned in a few articles, it’s demonstrated perfectly in a non-technical article, here. That’s what sold me, as I find the results amazing!


Reducing unique combinations

So, that’s what I mostly found so far in reference material. Now, a 2×2 grid as described can contain 4 different biomes. For a single layer, each of the 4 quarters is either covered or not. That’s 4 bits, therefore 16 possible total combinations/arrangements. Here’s what they look like (source):


In the “16 most basic tiles” above, we can observe the following:

  • No. 16 can be expressed by transforming No. 15 (180 deg rotation)
  • Nos. 11, 13, 14 can be expressed by transforming No. 10 (90, 180, 270 deg rotation)
  • Nos. 3, 9, 7 can be expressed by transforming No. 1 (90, 180, 270 deg rotation)
  • Nos. 2, 6, 8 can be expressed by transforming No. 4 (90, 180, 270 deg rotation)
  • Nos. 5 and 12 contain no spatially varying data
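The observations above can be checked by brute force: enumerate all 16 corner patterns and group them by 90-degree rotation. The bit assignment (TL=1, TR=2, BR=4, BL=8) is an assumption and does not correspond to the tile numbering in the image.

```python
def rotate_cw(mask):
    """Rotate a 2x2 corner pattern 90 degrees clockwise (TL->TR->BR->BL->TL)."""
    return ((mask << 1) | (mask >> 3)) & 0xF

def rotation_classes():
    """Group the 16 patterns into classes that are rotations of each other."""
    seen, classes = set(), []
    for mask in range(16):
        if mask in seen:
            continue
        orbit, m = [], mask
        while m not in orbit:
            orbit.append(m)
            m = rotate_cw(m)
        seen.update(orbit)
        classes.append(orbit)
    return classes
```

This yields 6 classes: the empty and full patterns, plus 4 classes with spatially varying data (one corner, two adjacent corners, two diagonal corners, three corners).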

This implies that the only unique tiles are 1,4,10,15,5,12. Furthermore, the only unique tiles with spatially varying data are 1,4,10,15. So, that is 4 tiles instead of 16. We can arrange such a mask of 4 tiles like this:

This has a nice continuous shape, if for example we want to ask an artist to draw some of those. Note that with this arrangement, the transformation will differ, as now the masks are already transformed compared to what I showed above. What’s really important is that the amount of white vs black at the borders that contain both needs to always match, so that tiles are seamlessly combined. In my case above, I’m splitting them at 50%, but that’s of course configurable. What I’m not going to cover is support for variable black/white border percentages while ensuring that they match. I’ve given it some thought and it gets very complicated: there are many more complications involved and I’m not sure if it’s worth it in the end.

So, now we have 4 unique combinations. These can be nicely stored in an RGBA texture (one mask per channel) by converting the above 1×4 tile image. In the shader, given a mask value in [0,15], we effectively do the following:

mask = ... // obtain mask value from 4 underlying biome map tiles. Value in [0,15]
(mask_unique, transform) = get_transform(mask); // use a lookup table to get the unique mask index [0,3] and the transform needed
uv2 = apply_transform(uv, transform); // transform the texture coordinates
mask_rgba = sample_mask_texture(uv2); // sample the mask values
mask_value = -1;
switch (mask_unique)
{
    case 4:  mask_value = 0; break; // whole mask is empty
    case 5:  mask_value = 1; break; // whole mask is full
    default: mask_value = mask_rgba[mask_unique]; // get the component that we need
}

Most of the above can be done in the vertex shader, whereas the last two steps (sampling the texture and getting the component) need to be done in the pixel shader. So, it’s quite cheap.
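A possible shape for the lookup table and the coordinate transform is sketched below. It assumes the corner bits are TL=1, TR=2, BR=4, BL=8, that uv has v pointing down, and that the four channels hold the patterns in `CANONICAL`; none of these choices come from the actual shader.

```python
CANONICAL = (1, 3, 5, 7)  # assumed channel order: one corner, edge, diagonal, three corners
EMPTY, FULL = 4, 5        # special indices for "whole mask is empty/full"

def rotate_cw(mask):
    """Rotate a 2x2 corner pattern 90 degrees clockwise (TL->TR->BR->BL->TL)."""
    return ((mask << 1) | (mask >> 3)) & 0xF

def build_lut():
    """Map each mask value in [0,15] to (unique mask index, clockwise quarter turns)."""
    lut = {0: (EMPTY, 0), 15: (FULL, 0)}
    for channel, canon in enumerate(CANONICAL):
        m = canon
        for turns in range(4):
            lut.setdefault(m, (channel, turns))
            m = rotate_cw(m)
    return lut

def apply_transform(uv, turns):
    """Rotate uv counter-clockwise about the tile centre, once per quarter turn.

    Sampling the canonical mask at the inversely rotated coordinate is what
    makes the mask itself appear rotated clockwise.
    """
    u, v = uv
    for _ in range(turns):
        u, v = v, 1.0 - u
    return u, v
```

Since the table has only 16 entries, it could just as well be a constant array in the shader.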

Rendering tiles

So, we have a method to render tiles given a very small number of masks. How do we render the tiles? Here’s the naive approach, for a 512×512 biome map:

  • We have 16 biome layers, so I assign each a priority. E.g. shallow water covers coast. Water covers shallow water and coast. Deep water covers water, shallow water and coast. And so on.
  • For each layer, we generate tile render data as follows:
    • For each tile corner in the biome (513×513 grid of points)
      • Sample the 4 adjacent biome types (clamp to border if out of bounds)
      • Create the mask where we set 1 if the layer is equal or higher priority than current, or 0 if the layer is of lower priority than current
      • Based on the mask value, calculate unique mask index and transform, and store in this tile’s render data
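The per-corner mask generation above could look roughly like this. The corner-to-tile indexing, the bit order (TL=1, TR=2, BR=4, BL=8) and the `priority` mapping are assumptions for the sake of the example.

```python
def corner_mask(biomes, priority, layer, cx, cy):
    """4-bit mask for grid corner (cx, cy) of a biome map given as rows of biome ids.

    A bit is set when the sampled tile's biome has equal or higher priority
    than `layer`. Corners range over (width + 1) x (height + 1) points.
    """
    h, w = len(biomes), len(biomes[0])

    def sample(x, y):
        # clamp to the border when the corner lies on the map edge
        x = min(max(x, 0), w - 1)
        y = min(max(y, 0), h - 1)
        return biomes[y][x]

    # the four tiles sharing this corner, in TL, TR, BR, BL order
    tiles = (sample(cx - 1, cy - 1), sample(cx, cy - 1), sample(cx, cy), sample(cx - 1, cy))
    mask = 0
    for bit, biome in enumerate(tiles):
        if priority[biome] >= priority[layer]:
            mask |= 1 << bit
    return mask
```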

So, now we have a list of 513x513x16 tiles = 4.21 million. That’s quite a lot. But as I said, that’s the naive version. Observations:

  • When the unique mask index corresponds to constant 0 (mask_unique index == 4), we don’t need to consider the tile for rendering.
  • When all of the four biome values in a tile are of higher priority than the current layer, this means that the tile will be completely covered by higher priority layer tiles, and therefore we don’t need to render it.
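Expressed as a predicate, the two culling rules might look like the sketch below. It assumes the 4-bit corner mask is 0 exactly when the unique mask index is the constant-0 case, and that `priority` maps a biome id to its priority.

```python
def needs_render(mask, tile_biomes, priority, layer):
    """Return False when a (tile, layer) pair can be skipped entirely."""
    if mask == 0:
        # constant-0 mask: this layer contributes nothing to the tile
        return False
    if all(priority[b] > priority[layer] for b in tile_biomes):
        # fully covered by higher-priority layers
        return False
    return True
```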

By applying these two, for my test map I reduced the number of tiles to 0.4 million, which is 10x better. Of course, that’s still a lot, but it doesn’t take into account any spatial hierarchy and other optimisations that could be done.

Here are some examples using the above un-nice mask. Zoomed-out:


Ok, so my mask looks bad, and there’s little to no variation, so you can see patterns everywhere.

Increasing variation

Using 256×256 masks, a single RGBA texture needs 256K of memory. We can have a texture array of such masks, using however many layers we can afford memory-wise. At runtime, we can select the layer based on various conditions. E.g. some texture layers could contain transition masks for particular biomes, or more generally, we can select a layer based on a function of the tile coordinates.
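One simple "function of the tile coordinates" is an integer hash reduced modulo the number of array layers. The sketch below is just one possible choice; the constants are arbitrary mixing values and the function name is hypothetical.

```python
def variation_index(tx, ty, num_variations):
    """Pick a texture array layer for tile (tx, ty) deterministically."""
    h = ((tx * 73856093) ^ (ty * 19349663)) & 0xFFFFFFFF
    # a couple of mixing rounds so that neighbouring tiles do not get neighbouring layers
    h ^= h >> 16
    h = (h * 0x45D9F3B) & 0xFFFFFFFF
    h ^= h >> 16
    return h % num_variations
```

Because the result depends only on the coordinates, a tile always picks the same variation without storing anything extra per tile.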


Next post will be about procedurally-generating lots of masks, using distance fields versus using binary masks, and also determination of locations for placement of props.