Autotiles, instancing and object clusters

New approach: using instancing
Attempted approach: using autotiling
Original approach: using nothing

The problem

Some of the blocking tiles in the game are things like a tree, a cactus, etc. Occasionally, I want to use these tiles to represent blocked cells in an outdoor map. But if I just put one such sprite per cell, the result looks poor (see bottom image above, “original approach”). So I thought, “ok, let’s try to create an autotile version of the trees”.

In the meantime, I’ve developed a helper tool to assist with creating autotiles (rug/fence/blob) from a selection of input tiles:

Autotile tool: blob

… So I hacked away at that code a bit, to automatically place sprites that respect the edge restrictions, effectively creating the autotile blob automatically from any single sprite. Example output:

Autotile tool: blob, automatic placement based on edges

While I was super happy initially, I soon realized that it would only work under very specific circumstances (symmetric sprites, placed appropriately at particular spots), and in order to cover all scenarios, I would need to automatically create a lot more sprites. So, after running into a lot of restrictions, I went for plan B: reuse some code that I already have for the overworld. That code uses Poisson disk sampling to create instances of things to populate the overworld.

Sprite shader refactoring

The problem was that the shader used by that code was restricted to the overworld vegetation, so I needed to generalise it. I took a hard look at the miscellaneous shaders that I’m using for sprites (anything that uses texture atlases) and I found ones for the following:

  • GUI
  • Static objects
  • Moving objects
  • Moving object shadows
  • Moving objects occluded areas
  • Vegetation normal
  • Vegetation shadows
  • Vegetation decal normal
  • [Future] static object decals
  • [Future] moving object decals

So, lots of combinations. So I delved into Unity’s multi_compile pragma and custom, manual shader variants, and I came up with the following scheme, with 3 different shader variant axes for sprites:

  • Orientation: Standing or decal
  • Sprite type: Static, moving or “splat”
  • Render type: Regular, shadow or occluded

GUI is still its own thing, but all the rest can be expressed with one value per “axis” above. While Unity nicely allows keywords to configure the multi_compile option, such configuration cannot change blend settings, z settings and core things like that. So, variants based on Render type (regular, shadow, occluded) are all different shader files that set some defines and include the common shader code. The rest of the variants are just expressed with #ifdef. Here’s what the “Regular” render type variant shader looks like:

Shader "Sprite/TextureAtlasSpriteRegular"
{
	Properties
	{
		g_TextureAtlasSprites("TextureAtlasSprites", 2DArray) = "white" {}
		g_TextureAtlasConstants("TextureAtlasConstants", Vector) = (32,32,1,0)
		g_RealTime("Real time", int) = 0
		g_RenderingMoveSpeed("Rendering move speed", float) = 1
	}
		SubShader
		{
			Tags { "Queue" = "AlphaTest" "RenderType" = "Opaque" }
			LOD 100

			AlphaToMask On

			Pass
			{
				CGPROGRAM
				#pragma vertex vert
				#pragma fragment frag
				#pragma target 4.5


				#pragma multi_compile_instancing
				//#pragma instancing_options procedural:setup

				#include "UnityCG.cginc"

				#include "Assets/Shaders/common.cginc"
				#include "Assets/Shaders/sprite.cginc"
				#include "Assets/Shaders/noise/random.cginc"

				// We don't need this, as we don't have gameobjects and materials for each

				// The shared vertex/fragment code for all variants lives in this include
				#include "Assets/Shaders/Sprite/TextureAtlasSprite_common.cginc"
				ENDCG
			}
		}
}


So, now all the sprite code for all the variants is in a single source file, which is super convenient for editing. This approach now allows easy proper shadows for any object (static or moving) among other things.

Benefits of the new system: everything has proper shadows! fountain, chest, character, door.

As this was a hell of a tangent, to solve the original problem, I wrote a pseudo-autotile algorithm class called “Splat” where, if I’ve specified it, instead of autotiling it creates an instance buffer and renders that with the Splat render variant (which includes shadows). This results in the first image shown on the page, where we have nice randomized trees including shadows. And, even though I’m not showing it here, we can use a variety of tree types, which is very, very convenient (with autotiling that would be near impossible).

Spritesheet to Unity

I’ve made a few posts already about spritesheets, atlases, etc, as I can’t seem to make up my mind, especially as my sprite needs change constantly: I don’t really have complete art and I’m trying to get away with a bit of DCSS tiles and a bit of Oryx tiles at the moment.

Originally, I used a 2D texture atlas + a JSON file with the description of what sprites are where. It was a nightmare to edit. Also, at runtime, filtering was difficult: Unity provides only so much freedom with sampling, and with texture atlases care needs to be taken at the edges to prevent bleeding and to filter correctly. So, difficult to edit, and difficult to render. Booh.

Then I thought “Ok, let’s use texture arrays in Unity”. So atlas+JSON as source data, then conversion to a texture array in Unity for runtime. Rendering is now easy, without any filtering issues. I do have a limit of a maximum of 2048 sprites per atlas, which is not great, but my 32-bit sprite instance data has now 8 whole bits free as a result, as I need only 11 bits to represent the texture index. On the minus side, editing was still hard.
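To illustrate the bit budget, here is a small Python sketch that packs an 11-bit texture index into a 32-bit word. The field layout (index in the low bits, 13 bits of `other` data, 8 spare bits at the top) is purely an assumption for illustration; the actual layout of the instance data is not described in this post.

```python
# Hypothetical layout of a 32-bit sprite instance word (assumed, not the game's real layout):
#   bits  0..10  texture array index (11 bits -> up to 2048 sprites)
#   bits 11..23  other per-instance data (13 bits)
#   bits 24..31  spare (8 bits)
INDEX_BITS = 11
OTHER_BITS = 13
SPARE_BITS = 8

INDEX_MASK = (1 << INDEX_BITS) - 1
OTHER_MASK = (1 << OTHER_BITS) - 1
SPARE_MASK = (1 << SPARE_BITS) - 1


def pack_instance(index, other=0, spare=0):
    """Pack the three fields into one unsigned 32-bit integer."""
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("texture index must fit in 11 bits (0..2047)")
    if not 0 <= other <= OTHER_MASK or not 0 <= spare <= SPARE_MASK:
        raise ValueError("field out of range")
    return index | (other << INDEX_BITS) | (spare << (INDEX_BITS + OTHER_BITS))


def unpack_instance(word):
    """Inverse of pack_instance: returns (index, other, spare)."""
    index = word & INDEX_MASK
    other = (word >> INDEX_BITS) & OTHER_MASK
    spare = (word >> (INDEX_BITS + OTHER_BITS)) & SPARE_MASK
    return index, other, spare
```

With 11 bits the largest valid index is 2047, which is where the 2048-sprites-per-atlas limit comes from.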

The last few weeks, I had the sudden realisation that the atlas+JSON format as source data is very, very pointless, as I’m converting to arrays in Unity anyway. So, I went back to the basic form, which is files-and-folders. One file per sprite, some special naming format for animations, folders and subfolders for grouping and … magic! Now the spritesheet is very, very easily editable. Tiles can be previewed directly in explorer, I can change sprite names at will, add/remove tiles, and do some more stuff (more next few weeks), and it’s all very, very easy. When I’m done with editing, I run some Unity script that converts that to an array (still limit of 2048 max per atlas applies), and that’s it. I feel like I’ve been making my life more difficult with the 2D texture atlas format. So, the new atlas format will be the final (barring minor mods), as there’s no problem point really.

With such a simple “loose” format, it’s quite easy to write python scripts/tools to process the spritesheets, e.g. rename sprites or mass-rename animations, create distance fields per sprite, do some autotiling work, etc.


Since this is the first post about audio (I think?), I guess I can afford not being creative in naming the post. So, Unity already has audio facilities, which while nice, still leave some bits to be desired, especially if you’re going Wild West without using gameobjects much, like I do.

AudioListener and AudioSource

The basics that I bothered to research require 2 things: an audio listener and audio sources, both components. Nice and simple. Normally you hook the audio listener to the player/camera gameobject, and audio source components are hooked to the gameobjects in your world. Since I’m not using gameobjects much, I’ve got an AudioListener on the camera gameobject, and a number of AudioSource components on my 2nd gameobject (called Scripts). The audio sources really represent audio types, and there are 7 of them:

  • Background music x2 : self-explanatory
  • Ambient sounds primary x2 : biome sounds for overworld, e.g. forest, swamp, open sea
  • Ambient sounds secondary x2: secondary ambience for overworld, e.g. river or shore
  • Positional sounds : misc sounds like doors opening/closing, secrets found, etc

The positional sounds source always uses the .PlayOneShot() function which, as the name suggests, immediately plays a sound, and is capable of mixing sounds, so I can reuse it to play many sounds. Simple!

The others (which are all looping channels) got a bit more complicated. I wanted crossfade, you see, and the solutions I found out there were not up to snuff. Snapshots something something, mixers, more and more components and gameobjects, or coroutines that you fire and forget, but if you want more control, tough luck.

So, I’m using a ping-pong pattern for the channels, so I need 2 of each. Thanks computer graphics for teaching me tricks! When you’re in the forest and you go to the desert, the “active” forest ambience channel starts fading out, and the 2nd buffer becomes active and fades in the desert audio. That’s it really! The fade in/out happens in the update function. In code, when we move around tiles in the overworld, I just set the desired volume level, like “for ping-pong buffer 0 of ambient sounds primary, set volume to 1 and set audio clip to forest-ambience”. Then the update function will dutifully interpolate whatever volume we currently have towards the desired volume. I’m quite happy with the result, and especially that it took a few hours overall to set up. I might have spent more time searching for biome sounds really.
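The ping-pong idea can be sketched in a few lines. This is a language-neutral illustration in Python rather than the actual Unity code; `Channel`, `PingPongChannel` and `fade_speed` are made-up names, and a real version would drive the volume and clip of two AudioSource components instead of plain fields.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    clip: str = ""
    volume: float = 0.0          # current volume, 0..1
    target_volume: float = 0.0   # desired volume, 0..1


class PingPongChannel:
    """Two buffers for one looping audio type: one fades in while the other fades out."""

    def __init__(self, fade_speed=1.0):
        self.buffers = [Channel(), Channel()]
        self.active = 0
        self.fade_speed = fade_speed  # volume units per second (assumed)

    def play(self, clip):
        current = self.buffers[self.active]
        if current.clip == clip:
            current.target_volume = 1.0
            return
        # Fade out the current buffer and make the other one active
        current.target_volume = 0.0
        self.active = 1 - self.active
        nxt = self.buffers[self.active]
        nxt.clip = clip
        nxt.target_volume = 1.0

    def update(self, dt):
        # Move each volume towards its target, by at most fade_speed * dt
        step = self.fade_speed * dt
        for ch in self.buffers:
            delta = ch.target_volume - ch.volume
            ch.volume += max(-step, min(step, delta))
```

One simplification to be aware of: if `play` is called again before the previous fade-out has finished, the buffer that is still fading gets its clip swapped immediately.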

Here’s a video that demonstrates the above, plus some new biome-specific tilesets. So, you enter a dungeon in the tundra, you get tundra maps, etc. The video also uses some music that I made ages ago, as I’m eager to use this game as an outlet for all my procedural, algorithmic and creative needs.

(Towards the end of the video, I’m trying hard to find the dungeon entry, but it’s a large map so I started teleporting around, gave up and turned the FoV off, bad cheater I know 🙂 )

Procedural Generation of Painterly Mountains

Last time I showed the revamped world look, with Poisson disc distributions of vegetation. Mountains were absent in that version. The reason is, I don’t have any good graphics for mountains. To add to the problem, I would need mountains that could be applicable to many biomes, and that’s not that easy either! Things that I find available online are tile-able far-zoomed-out mountains, or 2D backdrop style.

For years I had been tempted with the idea to procedurally generate mountains, and I guess the time came to try it out.


  • Lots of mountain variation
  • Ability to generate mountains for multiple biomes
  • Mountains should be somewhere between pixel art and painterly, as found in a good-looking retro-style 2D game
  • Be able to overlay mountains together to make mountain ranges
  • Mountains should be billboards rather than decals: The projection should be top-down oblique, like this


  • Perlin noise added to a sort-of-bell-curve, to generate the outline
  • Pick the top point and generate a downwards “main” ridge, using some more Perlin noise
  • Maybe generate some mini ridges at stationary points, mostly using the same settings but a smaller length
  • Identify the “left” and “right” sides of the mountain, and make sure the “left” side is lit better
  • Calculate a distance field from the main ridge, and use it for shading
  • Calculate a Dijkstra map using all ridge points and outline points as goals, and use it for shading
  • Calculate a downwards slope direction for each of the “left” and “right” sides, and use that to distort a Perlin noise domain, which we sample to change the shading even more
  • Use Perlin noise to calculate the tree line, also based on the highest peak
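For readers unfamiliar with Dijkstra maps, here is a minimal Python sketch of the idea: every cell stores its path distance to the nearest goal. It assumes a 4-connected grid with uniform step cost (in which case a multi-source breadth-first search gives the same result as Dijkstra); the `passable` callback is a hypothetical hook, e.g. to restrict the search to pixels inside the mountain outline.

```python
from collections import deque


def dijkstra_map(width, height, goals, passable=None):
    """Distance in steps from every cell to the nearest goal cell.

    goals: iterable of (x, y) cells, e.g. ridge and outline points.
    passable: optional function (x, y) -> bool; cells where it is False are skipped.
    """
    INF = float("inf")
    dist = [[INF] * width for _ in range(height)]
    queue = deque()
    for x, y in goals:
        dist[y][x] = 0
        queue.append((x, y))
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < width and 0 <= ny < height):
                continue
            if passable is not None and not passable(nx, ny):
                continue
            if dist[ny][nx] > dist[y][x] + 1:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((nx, ny))
    return dist
```

The resulting distances can then be remapped to a shading factor, for example darker near the goals and lighter away from them.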

What contributes to the mountainside luminance?

All the below are luminance factors that get multiplied together to give the final luminance

  • shading based on pathfinding distance to the outline (a bit darker near the outline)
  • side: 1 if on the left side, 0.75 if on the right side of the main ridge
  • shading based on pathfinding distance to the main ridge (a bit lighter near the main ridge)
  • domain-warped Perlin noise, with a different distortion based on the side of the mountain (left/right)
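As a sketch of how such factors might combine, the Python below multiplies the four terms together. Only the 1 / 0.75 side factor comes from the list above; the ramp ranges (0.8 to 1.0, 1.1 to 1.0), the 8-pixel falloff and the noise remapping are invented placeholder values.

```python
def ramp(distance, max_distance, near, far):
    """Linearly blend from `near` (at distance 0) to `far` (at distance >= max_distance)."""
    t = min(max(distance / max_distance, 0.0), 1.0)
    return near + (far - near) * t


def mountainside_luminance(dist_to_outline, dist_to_ridge, on_left_side, warped_noise):
    """All inputs are per-pixel; warped_noise is a domain-warped noise sample in 0..1."""
    outline_factor = ramp(dist_to_outline, 8.0, 0.8, 1.0)  # a bit darker near the outline
    side_factor = 1.0 if on_left_side else 0.75            # the lit side vs the shaded side
    ridge_factor = ramp(dist_to_ridge, 8.0, 1.1, 1.0)      # a bit lighter near the main ridge
    noise_factor = 0.9 + 0.2 * warped_noise                # small variation around 1
    return outline_factor * side_factor * ridge_factor * noise_factor
```

Because the terms are multiplied, each one can be tuned independently without the others needing to change.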

Overworld Graphics Redux: Vegetation

New graphics (WIP)

Before I start rambling on details, just a little bit of motivation on why the overworld graphics need to be worked on. For reference, here’s how it looked a few months earlier:

Old, HoMM 3-style graphics

So far I’ve been using HoMM 3 assets as a temporary placeholder solution, and of course this would need to change, as it’s fine for a tech demo, but not for anything publishable. I love HoMM 3’s artstyle, and if at some point my game is nearer completion and I got the budget, I’ll hire an artist and point my finger at HoMM 3, pleading for more of the same, but different. But here we are now, and we’ll make do with the fantastic 16-bit tiles from Oryx.


Many 2D games (such as HoMM 3) use a 2D grid for placing things such as walls, floors, objects, trees, creatures, etc. Techniques such as autotiling, in addition to well-designed art, can hide the nature of the grid. HoMM 3 is again a really good example of this:

A HoMM 3 level in the map editor, showing the grid nature of the graphics

Another very good animated example is from Warcraft II:


So, to maximally utilise this trick, we need good art. To render this on screen is very very cheap: For a single layer, a single tile is assigned per grid cell. Combining multiple layers and transitioning between tilesets can be a more challenging task.


In the game, the overworld is a grid, where each cell stores details about the contained biome, for example temperature/humidity/altitude, if it’s a river, if it’s sea or a lake, if it’s a hill or a mountain, etc.

The art requirements for the overworld are roughly as follows:

  1. Tiles and variations for backgrounds of each biome
  2. A way to do transitions between biomes [using transition masks]
  3. A way to depict varying vegetation per biome [this post]
  4. A way to depict hills and mountains
  5. A way to depict decorative props in each biome (e.g. skulls in the desert) [should be very similar to vegetation]

In the above, [1] is currently using HoMM assets, but it’s very simple to replace, and I will do so shortly, with Oryx tiles to begin with. This post will focus on vegetation.

For enough variation for all biomes, a lot of art is needed. Add to that the autotiling art requirements, and that becomes quite a big task. So, what do we do? As usual, let the computer do the hard work.

Vegetation Distribution using Instancing + Poisson Disk Sampling

Instead of carefully designing tilesets, a different approach is to just use basic art elements (a single tree, a single bush, etc) and distribute them nicely. We do not have to be restricted by the grid anymore: e.g. a tree can be placed anywhere in the continuous 2D space. As one might imagine, for a large overworld, we will need a lot of trees. In this case, as it turned out, half a million of them. The best way to render multiple objects of the same type is using instancing. Any reasonable game/graphics engine or API should provide such functionality.

A standard way to distribute vegetation is Poisson Disk Sampling, as it has some desirable characteristics, most importantly a minimum distance between each pair of elements. We can use this to generate positions of vegetation elements within a single tile. For example, a dense forest tile could contain 8 trees, whereas a desert might contain a single cactus element. Therefore, we can pre-generate multiple variations of Poisson sample sets for the most dense scenario (8 elements per tile) and use those for calculating the position of each vegetation element. Here is what a pre-generated sample set looks like (8 variations):
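One simple way to pre-generate such sets is naive “dart throwing”, sketched below in Python. This is not necessarily how the sets in the game were generated; the minimum distance of 0.25 and the seed are arbitrary choices, and the sketch ignores distances to samples in neighbouring tiles.

```python
import math
import random


def poisson_sample_set(count, min_dist, rng, max_attempts=10000):
    """Dart throwing in the unit square: accept a random point only if it is
    at least `min_dist` away from every point accepted so far."""
    points = []
    attempts = 0
    while len(points) < count and attempts < max_attempts:
        attempts += 1
        p = (rng.random(), rng.random())
        if all(math.dist(p, q) >= min_dist for q in points):
            points.append(p)
    if len(points) < count:
        raise RuntimeError("min_dist is too large for the requested count")
    return points


def make_sample_sets(num_sets=64, count=8, min_dist=0.25, seed=0):
    """Pre-generate `num_sets` variations of `count` positions each."""
    rng = random.Random(seed)
    return [poisson_sample_set(count, min_dist, rng) for _ in range(num_sets)]
```

For larger sample counts, Bridson's algorithm is the usual faster alternative, but for 8 points per set the brute-force check is more than enough.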

So, how do we generate the positions for all trees? Here’s some pseudocode:

// 64 variations of 8 positions within the unit square
vec2 poisson_sample_sets[64][8] = ... 
for each grid cell on the map:
	// select a random set, consistently for this cell
	rand0 = hash( cell_coordinate )
	pset = poisson_sample_sets[ rand0 % 64 ]
	N = calculate number of vegetation elements for cell
	// create a random starting element within this sample set (8 entries)
	i0 = hash( cell_coordinate + 123 ) % 8
	for each i in 0..N-1:
		// wrap around the whole set, so cells sharing a set can pick different subsets
		sample = pset[(i0 + i) % 8]

So, we need to randomize a lot, but also be consistent: e.g. the elements for each tile must all use the same sample set. Also, if 2 tiles use the same sample set and need to place 4 out of 8 trees, starting at different positions in the sample set gives greater variety.
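A runnable Python take on the per-cell selection is sketched below. The integer mixer in `hash2` is an arbitrary choice (any deterministic hash of the cell coordinate would do), and the start index wraps over the full sample set so that cells placing fewer than 8 elements can end up with different subsets.

```python
def hash2(x, y, salt=0):
    """Small deterministic integer hash of a cell coordinate (arbitrary mixer)."""
    h = (x * 73856093) ^ (y * 19349663) ^ (salt * 83492791)
    h &= 0xFFFFFFFF
    h ^= h >> 16
    h = (h * 0x45D9F3B) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def cell_samples(cx, cy, count, sample_sets):
    """World-space positions of `count` vegetation elements in cell (cx, cy).

    sample_sets: list of pre-generated sets, each a list of (x, y) in the unit square.
    """
    # All elements of a cell use the same sample set
    pset = sample_sets[hash2(cx, cy) % len(sample_sets)]
    # Random starting element, consistent for this cell
    start = hash2(cx, cy, salt=123) % len(pset)
    positions = []
    for i in range(count):
        sx, sy = pset[(start + i) % len(pset)]
        positions.append((cx + sx, cy + sy))
    return positions
```

Because everything derives from the cell coordinate, calling `cell_samples` twice for the same cell always gives the same positions, so nothing needs to be stored per cell.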

A simple way to utilize this, is to pre-generate the positions of each tree and simply render those using instancing. For actual numbers, I’ll use the real numbers that I have for a standard overworld map:

  • 28911 tiles, 1 tree per tile (sparse vegetation: deserts, tundra, etc)
  • 31563 tiles, 2 trees per tile (total: 63126 instances)
  • 40686 tiles, 4 trees per tile (total: 162744 instances)
  • 37952 tiles, 8 trees per tile (total: 303616 instances, dense vegetation: jungle, swamp, etc)

So the above is about 558,000 instances. The memory requirements using 16 bits for each coordinate (it’s enough) will be about 2.2MB (2 coordinates × 2 bytes × 558,397 instances), so not bad! We just have to figure out in the shader:

  • which tile we’re on =>
  • what biome we’re on =>
  • what trees are ok to use for this biome =>
  • pick a tree!
  • [bonus] scale the tree randomly between 90%-110%
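The chain above can be sketched on the CPU side in Python (the real thing runs in the shader). The biome table, the `biome_of_tile` lookup and the hash constants are all hypothetical stand-ins; only the 90%-110% scale range comes from the list.

```python
import math


def hash_u32(n):
    """Integer bit mixer producing a pseudo-random 32-bit value (arbitrary constants)."""
    n &= 0xFFFFFFFF
    n ^= n >> 16
    n = (n * 0x7FEB352D) & 0xFFFFFFFF
    n ^= n >> 15
    n = (n * 0x846CA68B) & 0xFFFFFFFF
    n ^= n >> 16
    return n


# Hypothetical table: biome name -> sprite indices that are ok to use there
TREES_PER_BIOME = {"desert": [10, 11], "jungle": [20, 21, 22, 23]}


def pick_tree(instance_id, world_pos, biome_of_tile):
    """Return (sprite_index, scale) for one instance."""
    tile = (math.floor(world_pos[0]), math.floor(world_pos[1]))  # which tile we're on
    biome = biome_of_tile(tile)                                  # what biome we're on
    choices = TREES_PER_BIOME[biome]                             # trees ok for this biome
    h = hash_u32(instance_id)
    sprite = choices[h % len(choices)]                           # pick a tree
    scale = 0.9 + 0.2 * ((h >> 8) & 0xFFFF) / 0xFFFF             # 90%..110%
    return sprite, scale
```

Using different bits of the same hash for the sprite choice and the scale avoids needing a second hash per instance.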

Rendering the instances should be blazing fast, and if it’s not, you can use linear quadtrees with Morton order, which will definitely make it blazing fast (I’ve been using this for neuroscience data, 2 orders of magnitude greater in number). Actually, I should implement that next, as when the lockdown is over, I might develop more on the laptop.
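For reference, Morton order interleaves the bits of the x and y coordinates, so that cells that are close in 2D tend to be close in the sorted 1D list. A minimal Python sketch for 16-bit coordinates, using the standard bit-spreading masks:

```python
def part1by1(n):
    """Spread the lower 16 bits of n so that there is a zero bit between each of them."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x, y):
    """Interleave x (even bits) and y (odd bits) into one Morton code."""
    return part1by1(x) | (part1by1(y) << 1)
```

Sorting instances by `morton2d(tile_x, tile_y)` means every quadtree node covers one contiguous range of the instance buffer, so a visible region maps to a handful of ranges to draw.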

So, what does the distribution look like in practice? Here are a few screenshots using different numbers of available Poisson sample sets:

Just a single poisson set. Grid visible in dense areas. Sparse areas still look varied because of the randomisation of the starting sample index
2 poisson sets
4 poisson sets
8 poisson sets. Even dense areas do not show repetition

Note: Care needs to be taken so that samples do not end up in rivers or at sea. I do that by checking the tile and neighbours. I split the unit-space in a 3×3 subgrid, calculate “isGround” values for each subtile based on biome data, and discard samples that fall into a subtile that is not set as ground.
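The subtile check can be sketched as follows in Python. How the 3×3 `is_ground` values are derived from the tile and its neighbours is not shown; here they are simply passed in.

```python
def filter_ground_samples(samples, is_ground):
    """Keep only the samples that fall into a subtile flagged as ground.

    samples: (x, y) positions in the unit square of one tile.
    is_ground: 3x3 nested list of booleans, indexed as is_ground[row][col].
    """
    kept = []
    for x, y in samples:
        col = min(int(x * 3), 2)  # clamp so that x == 1.0 maps to the last column
        row = min(int(y * 3), 2)
        if is_ground[row][col]:
            kept.append((x, y))
    return kept
```

For example, a tile with a river running along its right edge would have the right column set to False, and any tree sample landing there is dropped.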

Z-layers: Decals vs Billboards

The previous images use a trick to handle the overlaps correctly. Well ok, it’s not really a trick, it’s standard Z-buffer, we just need to be careful with the coordinates of our rendered quads.

Sprites such as trees are also called “billboards” in 3D graphics: they look like they are facing the viewer. The sprites typically look like a picture taken in front of the tree: the bottom part is the trunk, and the top is the canopy. Therefore we can say that the Y axis roughly corresponds to height. Here are some examples:


Some other sprites, such as flowers or bushes, look as viewed from above, rather than from the front (as was with trees). In this case, the image Y axis does not correspond to height anymore, but corresponds to depth instead. Let’s call these “decals”, as they are like stickers over the terrain. Several shown below:


These two behave in fundamentally different ways in two related aspects: depth perception and shadows.

Decals don’t really have depth, as they are like stickers: nothing is “behind” them, as only the background is under them. Trees on the other hand have depth. Things can be behind trees. Here’s an in-game example of the Toothy Troll hiding behind some conifer trees, and in front of some other trees

I’m coming for you, hobbits!

whereas flowers are not a good place to hide:

A stomp (err, stroll) in the meadows

In order to achieve this depth effect, we need to manipulate the depth of the rendered quad vertices. But first, a bit about the camera used: it’s an orthographic camera from an overhead view, so Z is camera depth, which also represents the world’s height. Therefore, the background is always at Z=0.

When we’re rendering sprites, such as the troll or the trees, the bottom part touches the ground (Z=0) while the top part has some height (e.g. Z=1). By doing just this, we’ve ensured correct rendering. Below is an example of 3 trees rendered like this, in 3 subsequent grid cells (side view):

You can see that the camera ray might not reach the trunk of the middle tree as it might be obscured by the canopy of the right tree. So, because of the need for depth, we need to use alpha masking instead of blending.

The information about what’s billboard or decal can be encoded along with other per-sprite data, and just needs a single bool flag (or 1 bit).
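In vertex terms, the difference between the two sprite kinds could look like the Python sketch below. It assumes the top edge of a billboard is raised to a Z equal to the sprite's height, which is an illustrative choice; the post only says the top part has “some height”.

```python
def sprite_quad(x, y, width, height, is_billboard):
    """Corners of a sprite quad as (x, y, z): bottom-left, bottom-right, top-right, top-left.

    Decals lie flat on the ground (z = 0 everywhere). Billboards keep their
    bottom edge on the ground and raise their top edge, so the Z-buffer can
    sort them against sprites in neighbouring cells.
    """
    top_z = float(height) if is_billboard else 0.0
    return [
        (x, y, 0.0),
        (x + width, y, 0.0),
        (x + width, y + height, top_z),
        (x, y + height, top_z),
    ]
```

The `is_billboard` argument corresponds to the single per-sprite flag mentioned above.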

Billboard Shadows

Billboards, because they encode height, can typically cast shadows. We’d expect trees and creatures to cast shadows, but not necessarily flowers and bushes (decals). The easiest way to cast shadows is to render an additional pass with all instances, with a couple of changes:

  • Adjust the quad geometry so that it’s sheared
  • Use black/grey instead of colour

Here’s a quad and its “shadow” transformation: it fakes a light source from the top left (=> right shearing) that casts a perspective shadow (diminution effect)
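A possible version of that transformation, as a Python sketch: the bottom edge stays where the sprite touches the ground, while the top edge is pushed to the right, lowered and narrowed. The `shear`, `squash` and `taper` parameters and their default values are invented for illustration.

```python
def shadow_quad(x, y, width, height, shear=0.5, squash=0.5, taper=0.8):
    """2D corners of the shadow quad: bottom-left, bottom-right, top-right, top-left.

    shear:  how far the top edge moves right, as a fraction of the height
    squash: how much shorter the shadow is than the sprite
    taper:  width of the top edge relative to the bottom (the diminution effect)
    """
    top_y = y + height * squash
    top_w = width * taper
    top_center_x = x + width / 2 + height * shear
    return [
        (x, y),
        (x + width, y),
        (top_center_x + top_w / 2, top_y),
        (top_center_x - top_w / 2, top_y),
    ]
```

The shadow pass would then render the same sprite texture on this quad, outputting black/grey instead of the sprite colour.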

Below: with and without the shadows:

With shadows
Without shadows (except troll)

I think it’s much better with shadows! And they come for free really, development-wise.

To simulate soft shadows, we can use a distance field, that records distance to the silhouette of the sprite, from inside the sprite. I maintain such distance fields for all sprites as they are useful in more cases, but here we can map distance to shadow strength via a smooth curve.
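One common choice for such a smooth curve is smoothstep, sketched here in Python. The `softness` (distance in pixels at which the shadow reaches full strength) and `max_strength` values are assumptions, not the values used in the game.

```python
def smoothstep(edge0, edge1, x):
    """Classic Hermite smoothstep: 0 below edge0, 1 above edge1, smooth in between."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def shadow_strength(distance_inside, softness=4.0, max_strength=0.6):
    """Map the distance to the silhouette (measured inside the sprite) to shadow opacity."""
    return max_strength * smoothstep(0.0, softness, distance_inside)
```

Pixels right at the silhouette get almost no shadow, and the shadow ramps up to full strength a few pixels inside, which reads as a soft edge.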

Pixelated river flow

Finally, I’ve added some pixelated mild noise on rivers, to have some animation but without using any flow direction. Here’s an image, but this is better seen in a video

Weather and Performance

First, regarding this blog’s posts: Lately I haven’t been doing anything that’s big or cohesive enough for a blog post, and that’s why the lack of posts. But this week, the “performance” section was pretty critical, so here we are.

Two main bits of work presented here: pretty graphics (I hope) and some really educational (for me) optimisation work. I’ll start with the visuals, as the latter is a journey by itself.

Fog, heat haze and time-of-day lighting

All these are things I’ve wanted to find an excuse to do at some point, so here we are. Fog ended up pretty good. It’s some simple pixelated perlin noise, by applying banding to both the output luminance (to create the limited-color-palette effect) AND to the 2D coordinates used to sample the noise function (to make fog look blocky). But we don’t band the 3rd noise coordinate, which is time, so the effect remains nice and smooth. Fog will be applied In The Future when the biome’s humidity is pretty high, and it’s late at night or early in the day (I know, it’s a gross simplification, but I don’t plan for soil humidity simulation)
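The double banding can be sketched like this in Python. `stand_in_noise` is only a placeholder for a real 3D Perlin noise function, and the number of cells and luminance levels are made-up values.

```python
import math


def band(value, steps):
    """Quantise a value down to a multiple of 1/steps."""
    return math.floor(value * steps) / steps


def stand_in_noise(x, y, t):
    """Placeholder for 3D Perlin noise: any smooth function returning 0..1 works here."""
    return 0.5 + 0.5 * math.sin(x * 1.7 + t) * math.cos(y * 2.3 - 0.5 * t)


def fog_luminance(u, v, time, cells=32, levels=4, noise=stand_in_noise):
    """Blocky, palette-limited fog value for UV coordinates in 0..1."""
    # Band the 2D coordinates so the fog looks blocky
    bu = band(u, cells)
    bv = band(v, cells)
    # Time is NOT banded, so the animation stays smooth
    n = noise(bu * cells, bv * cells, time)
    # Band the output luminance for the limited-colour-palette effect
    return band(min(n, 0.999999), levels)
```

All pixels inside the same coordinate band get the same noise sample, and the final value snaps to one of `levels` shades.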

Heat haze is again pretty simple: we just sample the noise function and adjust the horizontal UVs slightly, in a periodic manner. This will be applied In The Future mostly in the deserts during daytime, or in any other case where the ambient temperature is very high.

Time-of-day is a cheat at the moment (i.e. possibly forever), and applies some curves to the RGB components. Normally, the professional way to do that is using color grading, for which you need an artist. Until I get an artist or learn how to do it myself, RGB curves it is. For each discrete time-of-day preset (dawn, noon, dusk, night) we have 3 LUTs, one per color component. So I just simply fetch the RGB pixel color, pass it through the LUTs, and get another one. The LUTs are generated from curves in the GUI, as Unity provides some nice animation curves that can be used for this, and they are stored as a texture. In the shader, we sample the values and blend based on time of day. Still need to do this final bit for smooth transitions
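A CPU-side Python sketch of the LUT lookup and of the planned blend between two presets. The 256-entry 8-bit LUTs and the linear blend are assumptions; in the shader the LUTs are sampled from a texture instead.

```python
def apply_luts(rgb, luts):
    """rgb: three ints in 0..255; luts: three 256-entry lists, one per colour channel."""
    return tuple(lut[c] for c, lut in zip(rgb, luts))


def blend_presets(rgb, luts_a, luts_b, t):
    """Blend the results of two time-of-day presets; t = 0 gives preset A, t = 1 gives B."""
    a = apply_luts(rgb, luts_a)
    b = apply_luts(rgb, luts_b)
    return tuple(round(ca + (cb - ca) * t) for ca, cb in zip(a, b))
```

For example, halfway between noon and dusk, `t` would be 0.5 with the noon LUTs as preset A and the dusk LUTs as preset B.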

Bursting the optimisation bottlenecks

So, below is a summary of this week’s optimisation journey, itself summarized with: “In Unity, for performance, go Native and go Burst”.

C++ to C# to C++: There And Back Again

My world generation phase was fast in C++, but in C# it’s SLOW. Generating the 512×512 biome map, precalculating all paths between all cities, generating factions, relations, and territory control costs a lot. In C# that is 4 minutes. You click the button, go make some coffee, and the world may have been generated. In C++ it was much faster. Needless to say, when I first ported, I immediately implemented caching of the various stages, so that I don’t grow old(er) waiting. This week I decided to have a look and see if things can be sped up, as I can’t be hiding from the problem forever.

Pathfinding back to C++: Success!

The first thought was obviously, “why of course, move things to the C++ plugin”. Since my old code was C++ and was ported to C#, this was not exactly a daunting task, as I copied C++ code from the old project to the plugin. First major offender was the pathfinding. Reference blog post. Now I’m generating 752 routes that connect 256 cities in the map, and also precalculating some quantities that greatly accelerate pathfinding searches, which involves 8 Dijkstra map calculations on the entire 512×512 map. Here is the first kicker. From 2 minutes, the process now takes 4 seconds. Needless to say, that caused extreme joy, and set the blinders on, focused on reducing those 4 minutes for the world generation back to several seconds. Next candidate? Territory control!

Territory control back to C++: Success? Eventually!

Drunk with optimisation success, I wanted to get the same boost for the territory control. Reference blog post about territories. In C#, running the process once for each city (256 of them) takes a total of 6-7 seconds. So I ported the code, and the time went down to 3.5 seconds. Hmmm, not great. But why? Apparently, I had not understood marshalling (moving data between C# and C++) correctly. Every time I passed an array, I thought I was passing pointers, but C# was copying memory under the hood. So for each of those 256 calls, I was copying back-and-forth a few 512×512 maps, so around 5 megabytes worth of data transfers. Needless to say, that’s bad, so I tried to find ways to just pass pointers. And there is a Unity-way, using native arrays. I switched to native arrays (not too easy but ok), and the time went down from 6-7 seconds in C#, to 285ms! But all is not rosy, as native arrays are not perfect (see below section) and also it’s a bit fussier to call the DLL functions: create an unsafe block, in there get the void* pointer from the native array and cast to IntPtr, and then send the IntPtr to the plugin.

Interlude: NativeArray vs managed arrays

Unity provides NativeArrays which are great for use with native plugins and their job system. But there are 2 major downsides. One: you need to remember to dispose them. Well ok, it’s not so bad, I’m trained to do that anyway through C++, it’s just more code to manage. The second is that their elements are expensive to access through C#. If I loop through a big native array (say a quarter of a million elements), it will take at least an order of magnitude more time to just access the data, read or write. So you shouldn’t just replace everything with native arrays.

One fun tidbit. You need to remember to call Dispose() when you’re done with a resource. All my systems might store Native2D arrays, and the obvious thing to do is, whenever I add a new NativeArray variable, also remember to put it in the Dispose function of that system. But here is where reflection comes to the rescue! This code is put in the base system class:

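public void Dispose() 
{
	var type = GetType();
	foreach (var f in type.GetFields(BindingFlags.Public | 
									 BindingFlags.NonPublic | 
									 BindingFlags.Instance)) // Instance is required to find instance fields
		if (typeof(IDisposable).IsAssignableFrom(f.FieldType))
			(f.GetValue(this) as IDisposable)?.Dispose();
}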

This beauty here does the following cheat: it finds all variables that implement the IDisposable interface, and calls the Dispose function. So, when I add a new NativeArray variable in a system, I need to remember absolutely nothing, as this function will find it for me and call Dispose. I love reflection!

Generating city locations: Time to Burst

Next candidate to optimize was a different beast: the generation of city locations. This is not easy to do in a C++ plugin because it references a lot of data from the database, e.g. creature race properties (where they like to live), city type information, etc. So, it has to be done in Unity-land. And Unity-lands’ performance poster child is the Job system with the Burst compiler.

So far I had ignored Unity’s Job system, but no more. Jobs are a nice(?) framework to write multithreaded code. The parallel jobs especially, really feel like writing shaders, including the gazillion restrictions and boilerplate before/after execution 🙂 More like pixel shaders rather than compute shaders, because probably I still know very little on how to use jobs.

I dutifully converted the parts where I was looping through all 512×512 (roughly 262,000) world tiles to do calculations, and I ended up with 3 jobs: 2 that can run in parallel with each other and are themselves parallel, and one that’s not parallel. Here are the intensive tasks performed:

  • Calculate an integral image of all food/common materials per world tile (this allows for very fast evaluation of how much food/materials are contained in a rectangular area). This was converted to C++ plugin.
  • Job 1: For each tile, calculate how eligible is each race to live there (depends on biome)
  • Job 2: For each tile, for each city level, calculate approximate amount of food/materials available.
  • Job 3: Given a particular race and city level, calculate which tile is the best candidate
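The integral image (summed-area table) from the first bullet is worth a quick illustration. This is a plain Python sketch of the standard technique, not the C++ plugin code:

```python
def integral_image(grid):
    """Summed-area table with one extra row and column of zeros:
    sat[y][x] holds the sum of grid[0..y-1][0..x-1]."""
    h, w = len(grid), len(grid[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += grid[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_sum(sat, x0, y0, x1, y1):
    """Sum of grid values over x0 <= x < x1 and y0 <= y < y1, using 4 lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
```

After one pass over the map to build the table, the amount of food or materials in any rectangle around a candidate city tile costs four lookups, regardless of the rectangle size.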

And now the numbers… Originally, the code took about 18 seconds. By converting the code to use jobs, it took 11.8 seconds. By using the Burst compiler to run the jobs, it took 863ms. By removing safety checks (not needed really as the indexing patterns are simple), the code takes 571ms. So, from 18 seconds, down to 571ms, not bad for a low-ish hanging fruit! There was no micro-optimisation or anything tedious mind you.

Final remarks

So, native containers and jobs using Burst are definitely the way to go. For existing code out there (e.g. Delaunay triangulation or distance field calculation) that you wouldn’t want to rewrite as jobs, C++ will do the trick very efficiently by passing a NativeArray’s void* pointer. Native containers need to be used with care and be disposed properly.

What’s next?

Pathfinding at the moment takes 4 seconds in the plugin. Since pathfinding is such a common performance bottleneck (so, worth job-ifying), and my overworld calculations can be done in parallel (all 752 paths, and all 8 full Dijkstra maps), I have high expectations, but it’s going to be a bit of work.

Effects & Enchantments

A typical RPG/roguelike has equipment, cards, skills etc that can all provide bonuses or penalties in various statistics, primary or derived. For example, movement speed, attack speed, maximum health, etc. There are possibly lots of ways to implement them. After a few unsuccessful theoretically-efficient approaches, the current revision looks reasonable.

Temporary, permanent, recurring and conditional effects

Examples that should be possible include:

  • Potion of healing: one-off, adjust current health [permanent]
  • Potion of regeneration: every N seconds, adjust current health. Stop after N*K seconds [permanent, recurring]
  • Potion of speed: movement speed increased by 20%, for 1 minute [temporary]
  • Potion of remove blindness: one-off, sets blindness to false ONLY if the creature is not naturally blind [permanent, conditional]
  • Sword of orc slaying: +50% weapon damage (applied when we use an ability that uses the equipped weapon), only if we’re attacking orcs [temporary, conditional]
  • Boots of the desert: +50% movement speed when crossing the desert biome [conditional]

So, observing such use-cases, I made the following design choices:

  • All types of effects/modifiers can have conditions
  • Permanent effects write directly to the actual values they affect. A health potion updates the actual health value. If they have conditions, they check them once, before they apply the permanent effect.
  • Temporary effects are stored in a separate list. Every time we want to get a particular value (weapon damage, max health, etc), we need to run a system function that gets the base value and applies all effects (if any of them has conditions, it’s applied if the condition is satisfied)
  • Events are set up for recurring permanent effects
  • Events are set up for temporary effects, to remove them from the list when the effect expires.
  • Temporary effects can be applied when an item is being carried (figurine), equipped (sword) or used (potion).

A drawback of the separate list for temporary effects is that we have to maintain indirect access to all variables that could be modified by effects, so that the access function always takes into account any effects. Additionally, we can’t even cache effects as they can be conditional. So for example, we have functions like “EvaluateMaxHealth”, “EvaluateMovementSpeed” etc, that get the base maximum health, then look for any effects that target max health (and pass the conditions, if any) and generate the final value.
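As a rough illustration of what such an evaluator could look like, here is a minimal Python sketch. All names (Creature, NumericEffect, evaluate_stat) are hypothetical and not the game's actual code, and the stacking rule (sum flat bonuses, then apply summed percentages) is an assumption made only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Creature:
    base_stats: Dict[str, float]
    biome: str = "plains"
    temporary_effects: list = field(default_factory=list)  # list of NumericEffect


@dataclass
class NumericEffect:
    """A temporary numerical modifier (hypothetical layout)."""
    target: str                 # stat this effect modifies, e.g. "movement_speed"
    flat: float = 0.0           # additive bonus, e.g. +50 health
    percent: float = 0.0        # multiplicative bonus, 0.2 means +20%
    condition: Optional[Callable[[Creature], bool]] = None  # None = always active


def evaluate_stat(creature: Creature, target: str) -> float:
    """Start from the base value and apply every matching effect whose condition holds."""
    flat = 0.0
    percent = 0.0
    for effect in creature.temporary_effects:
        if effect.target != target:
            continue
        # Conditions are re-checked on every call, which is why results are not cached.
        if effect.condition is not None and not effect.condition(creature):
            continue
        flat += effect.flat
        percent += effect.percent
    # Assumed stacking rule: flat bonuses first, then the summed percentages.
    return (creature.base_stats[target] + flat) * (1.0 + percent)


def evaluate_movement_speed(creature: Creature) -> float:
    return evaluate_stat(creature, "movement_speed")


hero = Creature(base_stats={"movement_speed": 10.0})
# Potion of speed: +20%, unconditional while it lasts.
hero.temporary_effects.append(NumericEffect("movement_speed", percent=0.2))
# Boots of the desert: +50%, only while crossing the desert biome.
hero.temporary_effects.append(
    NumericEffect("movement_speed", percent=0.5,
                  condition=lambda c: c.biome == "desert"))
```

Calling evaluate_movement_speed(hero) before and after setting hero.biome to "desert" gives different results without touching the base value, which is the point of keeping temporary effects in a separate list.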

In the temporary effects list we have 3 types of effects, based on what they modify:

  • Numerical. +1 Skill, +50 health, +15% attack speed.
  • Boolean. Set paralyzed or not, deaf or not, able to fly or not, etc.
  • Option. Set field of vision algorithm. Probably more to come.

Overall the system should be very flexible and allow for weird effects expressed naturally by data driving, e.g. a potion of lycanthropy that only works in a full moon, over the dead body of a wolf (a date condition and a condition that there’s an item pile that contains a wolf corpse under the feet of the player). The only important rule to remember when developing effects is that any values that can be affected by temporary effects need to be accessed through these evaluator functions, unless we do not want to take modifiers into account.

Here is a video that shows a few potions in action: healing, emergency healing (only when health is < 20%), alacrity (player moves at incredible speed), X-ray vision (see through walls), oracle (see entire map):

Here’s another video that shows the death particle system adjusted to go towards the direction of the hit:

Skill System Redux


Last time I worked on the skill/stat system, I didn’t even touch active abilities, like feats in DnD. My main problems with my skill breakdown were:

  • Too many skills: around 50. Easier to navigate by using categories, but still.
  • Skill progression is difficult because of the skill breadth. Difficult to balance a jack-of-all-trades and a focused grandmaster of a few skills, with interesting skill progression/bonuses
  • The options were many, and the bad options could be many too. So, level ups would be a bit confusing and prone to mistakes and bad builds
  • Only some of these skills would enable DnD style feats, but I hadn’t thought that part out, and it would possibly be imbalanced.

So I engaged in some thinking, and some more thinking, and tried to recall bits of advice and suggestions by a variety of game/RPG design people, most importantly for what I want the core experience to be like. And the cornerstone pillars of the game’s experience are combat and exploration. But at the same time, I don’t want to ignore stealth or NPC interactions in cities/factions/elsewhere, so these exist but are of lesser importance, and this should be reflected in the system.

New approach

So, while the skills needed revamping, I like the attributes and the mastery levels. So, here are the main components of the current train of thought:

  • There are about 20 skills in 4 categories: offense, defense, arcana and misc.
  • Players can put points in each skill up to a limit of 15 skill points.
  • Players can improve their mastery of a skill given sufficient points and training from an NPC
  • Improving the mastery of a skill gives new passive bonuses (e.g. evasion chance when adding points in light armour skill). Points in a skill improve those bonuses.
  • Active abilities (think DnD feats, or ToME active skills) can be learnt from NPCs or scrolls, if player satisfies requirements in terms of skills, attributes and masteries. For example, crafting light armour would require mastery in both crafting and light armour.
  • Each level, the player can allocate 3 skill points, to a total of 90 skill points at level 30. Only one skill can be trained to grandmaster level, and thus reach the 15th point

Here’s the current list of skills:

Dual wielding
One handed
Two handed
Ranged and thrown
Light armour
Medium armour
Heavy armour
Command magic
Alteration magic
Divination magic
Creation magic
Destruction magic
Crafting and alchemy

The idea behind this is that skills reflect play style. My goal is to make as many as possible viable play styles, mixing arcane with melee, etc.


Previously I had to manually author archetypes, as not all combinations of skill points would be valid. With the new approach, it should be easier to write an automatic generator of characters that does not use any predefined limits in masteries etc. If a barbarian wants to learn meditation, more power to them, it’s going to be useful still. The thing to be careful about is the related attribute: if your barbarian has very low intelligence, it’s an indicator that he/she won’t really master that skill. The goals of the archetype creation are:

  • Maximum diversity
  • Minimum bad-looking builds
  • Fully procedural builds (“make me a character”)
  • User-guided builds (“make me a grandmaster in destruction magic, with some other nice skills too”)
  • Minimum data entry / configuration effort

The new archetype generator is parameterized on a list of target skill/mastery combos to achieve (optional), and a “well-roundedness” factor, which represents how hyper-focussed or jack-of-all trades we want the character to be. It works roughly in the following way:

  • Character creation: If we have target mastery combos, roll and initialize stats to satisfy requirements as closely as possible
  • Assigning attributes: Try to satisfy criteria. If done, allocate based on the well-roundedness, between completely randomly (well-rounded) and to the highest attribute (focused)
  • Assigning skills: Try to satisfy criteria. If done, allocate based on attributes and well-known combos, e.g.:
    • dual-wield + one-handed = good
    • two-handed + dual-wield / shield = bad
    • meditation + any magic = good
    • sneak + heavy armour = bad
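To make the well-roundedness idea a bit more concrete, here is a small Python sketch of the attribute step only. It is one possible reading of the description above, not the actual generator: the function name and attribute names are made up, and it assumes that well-roundedness acts as the per-point probability of picking a random attribute rather than the highest one.

```python
import random


def allocate_attribute_points(attributes, points, well_roundedness, rng):
    """Spend `points` one at a time and return a new attribute dict.

    well_roundedness = 0 -> every point goes to the currently highest attribute.
    well_roundedness = 1 -> every point goes to a uniformly random attribute.
    Values in between mix the two (assumed interpretation).
    """
    result = dict(attributes)
    names = sorted(result)  # fixed order so ties and random picks are reproducible
    for _ in range(points):
        if rng.random() < well_roundedness:
            name = rng.choice(names)
        else:
            name = max(names, key=result.get)
        result[name] += 1
    return result


rng = random.Random(42)
start = {"strength": 10, "agility": 12, "intelligence": 14}
focused = allocate_attribute_points(start, 5, 0.0, rng)
rounded = allocate_attribute_points(start, 5, 1.0, rng)
```

With well-roundedness at 0 all five points land on intelligence here, while at 1 they are scattered. The skill step would work along the same lines, with the good/bad combos above acting as weights on the candidate skills.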

And that’s it! Yes it’s oversimplified a bit, but the archetype generation code is less than 300 lines, and is much, much, much simpler than the old approach. So, what characters does it generate? Plotting time again!

Well-roundedness = 0, target Destruction Magic GM: What looks like a typical elemental mage, plus misc skills as he’s physically weak
Well-roundedness = 1, target Destruction Magic GM: A more well-rounded machine of destruction
Well-roundedness = 0.5, target Destruction Magic GM: Something like the above, but in-between
Well-roundedness = 0, targets Dual Wielding GM and Light Armour Expert. Adding relevant athletics and dual-wielding; we are quite agile after all. At the later levels we develop Leadership as well, since we’re high on the Charisma.

So, archetypes look like they are working as intended. Next time, instead of fast encounter resolution like last time, I’m going to deal with HP/MP and attempt something more concrete, like spawning a few aggressive creatures with levels, and progressing with connecting skills to active abilities.

A little bit about locking

Let locking be the means to prevent the use of some sort of device.

Things that can be locked:

  • Door (use = open)
  • Chest (use = open)
  • Magic portal (use = enter)
  • Fountain (use = drink)

Types of unlock conditions

  • Have XYZ key in inventory [active] [E]
  • Actor being of a specific race, or having XYZ traits [E]
  • Have all fragments of the key in inventory [active] [E]
  • Time being midnight [W]
  • A pressure plate being pressed [W]
  • A lever having been pulled [W]
  • A creature being dead [W]
  • Any combination of the above, etc

Entity and world conditions

The unlock conditions are grouped as [E]ntity conditions, where the entity that tries to unlock should evaluate those to true, and [W]orld conditions, where when the state of the world changes with respect to the given conditions, the lock might activate/deactivate.

An actor can’t explicitly lock or unlock a door with world conditions, as they need to change the state of the world in order to get these doors locked/unlocked (e.g. leave an item on a pressure plate, etc)

Active and passive entity conditions

[Active] entity conditions are ones that sort of require the user to do something, if we play out the scenario. Obviously no door opens because a key is in our pocket, but if it is in our pocket and we handle the door, and the door is locked, we can imagine us getting active and using that key in the lock.

Passive entity conditions don’t need the actor to do something explicit. The actor handles the door, but the check happens in the background. The main difference gameplay-wise is that if there’s a lock with a passive condition, after unlocking, the actor can’t lock it again, as it is beyond the actor’s actions.


Locks can specify an unlock direction, so that if we somehow ended up in a locked room without the key (e.g. stepped on a teleporter trap), we can open it from the inside. Doing this bypasses all conditions, entity or world. If we do have world conditions, the next time any of the conditions changes (to any value), the lock will get enabled/disabled appropriately.
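The rules above can be summarised in a short Python sketch. This is only an illustration with made-up names (Lock, try_unlock, on_world_changed). How locks that mix entity and world conditions behave is not spelled out in the text, so the sketch simply refuses explicit unlocking whenever world conditions are present.

```python
from dataclasses import dataclass, field


@dataclass
class Lock:
    entity_conditions: list = field(default_factory=list)  # callables: actor -> bool
    world_conditions: list = field(default_factory=list)   # callables: world -> bool
    unlockable_from_inside: bool = False
    locked: bool = True

    def on_world_changed(self, world):
        """Re-evaluate world conditions whenever any of them changes value."""
        if self.world_conditions:
            self.locked = not all(cond(world) for cond in self.world_conditions)

    def try_unlock(self, actor, from_inside=False):
        """Explicit unlock attempt by an actor; returns True on success."""
        if from_inside and self.unlockable_from_inside:
            self.locked = False  # bypasses all conditions, entity or world
            return True
        if self.world_conditions:
            return False  # assumed: the world state has to change instead
        if all(cond(actor) for cond in self.entity_conditions):
            self.locked = False
            return True
        return False


# Scenario: a door held shut by three pressure plates, openable from inside.
world = {"plates": [False, False, False]}
door_lock = Lock(
    world_conditions=[lambda w, i=i: w["plates"][i] for i in range(3)],
    unlockable_from_inside=True,
)
# A chest that needs a specific key in the actor's inventory.
chest_lock = Lock(entity_conditions=[lambda a: "rusty key" in a["inventory"]])
```

Note that after unlocking door_lock from the inside, the next call to on_world_changed locks it again unless all three plates are pressed, matching the behaviour described above.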

Here’s a video that shows a scenario of a lock with world-state conditions only: 3 pressure plates that need to be pressed, but we can also unlock from inside

Player movement, levels, objects

Given the field of vision implementation from last time, I decided it was time to test it and make the game a bit interactive, by allowing the user control of a character. This has been really important, as it has forced me to focus on making level transitions, ensuring movement cost maps and visibility maps work ok, ensuring that save/load works correctly, and a myriad of small little things. Below are a few videos and a list of things done since the last blog post:

Animated and fuzzy fog of war
  • Better fog of war this time, implemented as a simple pixel shader. It’s simple, it’s fast, looks better and does the job for now. The video also shows from about here that now sprites flip horizontally when moving towards the other direction, while when moving vertically they preserve whatever direction they were facing. This costs 1 bit in the 128-bit data per moving sprite, not a big loss or cost 🙂
  • Added functionality for level movement costs (slightly different than overworld moving costs), consisting of background (walls/floor/liquids) and static objects.
  • A* and all other path calculators take into account that diagonal movement is only allowed when the related cardinal directions are passable.
  • Creatures have light sensitivity, and overworld and dungeons have light levels, that affect line of sight radius.
  • I wanted to refactor a bit of the territory system regarding propagation of influence by replacing the data per tile from class (reference type) to struct (value type), so that led to an exciting journey of more changes, fixes, bug discoveries and further bug fixes, and now it seems to be back on track, better, with less code and fewer bugs.
  • Field of vision optimisation: when the player/sensor moves to an adjacent tile, we only clear the visible data from the surrounding circle with LoS+1 radius, instead of clearing the entire map. This of course is not uncommon, but it had to be done, as the related performance cost went from 50ms to 0.2ms when the player moves in the overworld, because the refreshed tiles went from 512×512 to 20×20.
  • So far I wanted to have entity “configurations” which are objects that store the exact information to generate an entity, but decided against that due to the cases when entities have to reference other entities during the generation before the entities are created. So now, for example, when procedurally generating a lock and key, I have to create both entities, configure them, put the reference of the door needing the particular key entity to be unlocked, and then call the magic function “EntityBeginPlay” which makes the entity visible to the game, listeners, other entities etc.
  • Level objects (fountains, chests, etc) can now affect movement and visibility. Can now push level objects and update movement/visibility maps appropriately. Also, added doors, and can open and close them at will, blocking movement/visibility. Also, as a fun sanity check, when pushing a fountain into an open doorway, the door can’t close.
  • Explored map disintegrates a bit when revisiting a level. Should later do it based on last visit time. Static objects disappear in explored areas when revisiting a map
  • Save/Load works from overworld and levels
  • Sparse 2D multimaps to store level objects and creatures
  • Ctrl+click moves character towards highlighted path, right click cancels path (this is for fast debugging, should later change with the introduction of turn system)
  • Slightly more flexible sprite rendering, with a list of animations and indices per animation type. So I have a “default”, “moving”, “death” etc animation types, and I can have for example “door closed” “door open” “door locked” as different animations per type.
  • Perlin noise precomputed inverse distribution function (IDF) and cumulative distribution function (CDF). I mainly wanted the IDF for cavern generation, so that a single scalar variable “density” controls how open or claustrophobic a cavern map is, and so that density varies in a linear way. My caverns are generated using thresholded Perlin noise, but the Perlin noise distribution is not uniform, so the threshold value does not exhibit linear behaviour. The IDF can be precalculated and used instead: we feed it a probability value (which can vary linearly) and we get as output the threshold value to use. So, I did a test with this new variation, and for 10% density to 90% density the final map (after connectivity, etc) looks as follows:

White is open space. The maps get progressively more constrained in a linear way. The last map is very constrained, therefore a lot of parts have been discarded post-connectivity
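For the curious, the density-to-threshold lookup can be sketched in a few lines of Python. This is a simplified illustration with hypothetical names: it builds an empirical IDF table from sorted samples, and it uses the mean of four uniform random numbers as a stand-in for noise values (that is not Perlin noise, just something with a similarly non-uniform, bell-like distribution). It also assumes a cell is blocked when its noise value is below the threshold.

```python
import random


def build_idf(samples, resolution=256):
    """Precompute an inverse distribution table: entry i holds the sample value
    below which roughly i / (resolution - 1) of all samples fall."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    return [ordered[round(i * last / (resolution - 1))] for i in range(resolution)]


def threshold_for_density(idf, density):
    """Map a linear density in [0, 1] to the noise threshold that yields it."""
    density = min(max(density, 0.0), 1.0)
    return idf[round(density * (len(idf) - 1))]


def blocked_fraction(samples, threshold):
    """Fraction of cells that end up blocked (noise value below the threshold)."""
    return sum(value < threshold for value in samples) / len(samples)


rng = random.Random(7)
# Stand-in for noise samples: NOT Perlin noise, only a non-uniform distribution.
noise = [sum(rng.random() for _ in range(4)) / 4 for _ in range(20000)]
idf = build_idf(noise)
thresholds = [threshold_for_density(idf, d / 10) for d in range(1, 10)]
```

Thresholding the samples with each of these values gives a blocked fraction close to the requested density, whereas using the density directly as the threshold would not, since the underlying distribution is not uniform.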

Here’s a video showcasing:

  • Level transitions and overworld-level transitions
  • Fast-path traversal
  • Pushing objects
  • Opening-closing doors
  • Degradation of explored map after leaving level