City-state relations

There are lots of city-states in the world, each with its own race composition, core values, guilds, and alignment, among other things. As history has taught us, nations can like or dislike each other in varying degrees. In the game, I have the following simple scale: Hatred, Dislike, Annoyed, Neutral, Accept, Like, Love. So, how do we set up relations between city-states?

  • Alignment. Chaotic good and lawful evil cities are (almost) never going to be friendly. Similarly aligned cities are more likely to have positive bonds with each other. So, alignment is used to bias a relation towards love/hate.
  • Common core values. City-states with similar core values are more likely to be friendly, and vice versa. Since each city has an array of weights per core value, we can just treat the arrays as N-dimensional vectors and calculate the Euclidean distance between them (the Euclidean norm of their difference). As the major core value dictates the government, we can give a bit of extra weight to the major core values of each pair of cities.
  • Proximity. The farther two city-states are, the more unlikely they are to have (or have had) any significant connection. So, relations are dampened based on distance. I use a simple linear scale to do that.

So, the whole process can be described as follows:

  • Generate a random relation value
  • Add a bias based on alignment and common core values
  • Scale by a user-defined parameter, to control how “emotional” the relations between city-states are.
  • Dampen based on proximity
  • Add a user-defined bias, to control the general relation “direction” towards love/hate.
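The steps above can be sketched in Python as follows. This is a minimal sketch rather than the game code: the function names, the equal split between the alignment and core-value biases, the `major_weight` value, the encoding of alignment as two axes in {-1, 0, 1} and the [-1, 1] value range are all assumptions made for illustration.

```python
import math
import random

LEVELS = ["Hatred", "Dislike", "Annoyed", "Neutral", "Accept", "Like", "Love"]


def alignment_affinity(a, b):
    """a, b: (law_axis, good_axis) pairs, each axis in {-1, 0, 1}.
    Returns 1 for identical alignments and -1 for opposite corners."""
    steps = abs(a[0] - b[0]) + abs(a[1] - b[1])  # 0..4
    return 1.0 - steps / 2.0


def core_value_similarity(a, b, major_weight=2.0):
    """a, b: core-value weights that each sum to 1. Returns a value in [-1, 1].
    The major core value of either city gets extra weight in the distance."""
    major_a = max(range(len(a)), key=lambda i: a[i])
    major_b = max(range(len(b)), key=lambda i: b[i])
    sq = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        w = major_weight if i in (major_a, major_b) else 1.0
        sq += w * (x - y) ** 2
    max_dist = math.sqrt(2.0 * major_weight)  # two cities devoted to different single values
    return 1.0 - 2.0 * math.sqrt(sq) / max_dist


def relation_value(rng, align_a, align_b, values_a, values_b,
                   distance, max_distance, scale=1.0, bias=0.0):
    value = rng.uniform(-1.0, 1.0)                            # random relation value
    value += 0.5 * alignment_affinity(align_a, align_b)       # alignment bias
    value += 0.5 * core_value_similarity(values_a, values_b)  # core-value bias
    value *= scale                                            # how "emotional" relations are
    value *= max(0.0, 1.0 - distance / max_distance)          # linear proximity dampening
    value += bias                                             # general love/hate direction
    return max(-1.0, min(1.0, value))


def relation_label(value):
    """Map a value in [-1, 1] onto the seven-step scale."""
    return LEVELS[round((value + 1.0) / 2.0 * (len(LEVELS) - 1))]
```

With this layout, two cities at the maximum distance end up exactly at the user-defined bias, which matches the idea that far-away city-states have no significant connection.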

So it’s quite simple and generates some nice results I think. Here’s an example, varying the user-defined parameters. The purple territory is the city-state in question, shades of green represent accept/like/love while shades of red represent annoyed, dislike, hatred. Black is indifferent/neutral.

Scale = 1, bias = 0.0

Scale = 1, bias = 0.4

Scale = 2, bias = 0.4


What makes a city-state

Figure 1 Above, the territory of a selected city is highlighted with red (bottom left), while all other territories have a darker shade. On the right, city-state information is displayed.

Now that territories have been defined, we need to flesh out the city-states a bit more. The city-states will act as adventuring hubs, source for quests, target of other quests, and will also have a gameplay mind of their own. So, what makes a city-state?


Each city has a race composition (out of about 10 currently), defined as percentages. The racial variety will depend on the level of the city. A small village will be composed of a single race, while a buzzing metropolis could have more than 5 races.


Cities use the DnD alignment axes ( Lawful/Neutral/Chaotic and Good/Neutral/Evil) to represent what they stand for, and their goals. A chaotic neutral character would obviously prefer a similarly aligned city, as the interests would better match.


Cities have 4 main statistics: Population, Influence, Wealth and Military. Influence directly affects the city-state territory, and the rest are self-explanatory. They will have gameplay effects, but these are not quite fleshed out yet.


Cities have food and basic material reserves. These resources depend on the biome tiles within a city’s territory. Rare resources (silver, gold, crystals and more) can also exist in a city’s territory, and they need to be mined to provide the prestigious resources. Such mines can be the target of sabotage, or places where disastrous events take place, and so on.

Core Values

Each city has a “score” (more like a weighted percentage) for each of several “core values”. These are Arcane Knowledge, Military Prowess, Prestige, Piety, Commerce, Prosperity and Technology. The score in each of these values would represent the strategies and interests of the city-state. E.g. a city-state interested in Arcane Knowledge would be looking for magical relics, while a Commerce-oriented one would be more interested in setting up a complex trade network. Military city-states would be launching attacks on other city-states that they hate, or against any other invaders. In terms of future implementation, when trying to choose between behaviors classified with these core values, the weighting would affect the choices.

Guild chapters

In the game there will be several guilds, and each guild can have chapters in different cities.

Guilds can be of “class”, “lore” and “trade” type. Class guilds are the fighters guild, clerics guild, rangers guild etc, so that the members can be adventuring classes. Lore guilds are the explorers guild, historians guild, seekers guild, etc, which are interested in uncovering information about the world and long-forgotten relics. Trade guilds are blacksmiths, jewelers, alchemists etc, who unionize and typically sell things or services. Everybody gives quests, but PCs will possibly not be able to join trade guilds: as a player you don’t want to be in the shop all day serving customers, or reading books to do research.

Guilds have alignment, so that a chaotic evil guild will not exist in a good-aligned city (unless they’re operating secretly).

Guilds can have biome requirements, for example a pirates’ coven will only exist at a coastal city, or a druid enclave would exist only in a village or a town, near woods.

Some guilds are secret, for example a pirates’ guild, a necromancers’ guild, an assassins’ guild etc. To find those, certain conditions need to be met.

Guilds give quests, some might have initiation quests, and there will be quests (and rewards) for advancing in rank.

Some guilds might not like some other guilds, due to them being competitors or having very different values. This would be reflected in how guild members treat each other.


City-states optionally have relations with other city-states. Generally, the closer a city-state is to another, the more likely the relationship is to be polarized (love or hate), compared to city-states that are very far from each other. The reasoning behind this is of course friction and interaction due to proximity. Additionally, like-mindedness in terms of alignment and core values would affect the relations, as a chaotic good city-state could never be friendly to a lawful evil one.


So that’s it for now, next time it’s going to be more on relations and also routes between cities and mines.


Figure 2 Another city-state territory and information, similar to Figure 1

Overworld Territories


City-states rule the world. For the whole overworld, using my current projections, there will be a maximum of about 250 cities. For a 512×512 overworld, this would be very roughly a city per 32×32 grid. Given that a grid cell would represent about 10 sq km, this means a very sparsely populated overworld, which doesn’t resemble the Middle Ages all that much. That’s fine though, as the alternative/realistic version would be a city/town/hamlet per grid tile, and that would make a quarter of a million such towns.

Borders and growth

City-states have areas of influence, which define their territory and borders. The area of influence is directly related to how difficult it is to cross the terrain. In that sense, sea and high mountains are in general more difficult than other terrains, and the difficulty only increases in temperature/humidity/vegetation extremes. With that in mind, we can create an “influence cost map”, which dictates how influence, generated by its sources (cities at the moment), is reduced while radiating outwards.

Borders between city states

Another interesting point is how to deal with borders between city-states. I found that the simple way of “whoever exerts greater influence on the tile, owns the tile” is unsatisfactory, as city-states with even slightly greater influence can quickly overcome the whole territory of another city state. For example, a city state gains 4% influence, and suddenly it wins over 60% of another city state’s total territory. To deal with that, I still compare the influences, but they are scaled by a factor related to the inverse squared distance of the grid cell in question to the city states: tiles close to city-states (and owned by them)  are much more difficult to be won over by some other city state, especially if it’s far away.
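As a rough illustration of the distance-scaled comparison described above, here is a small Python sketch. The exact scaling factor is not given in the text, so the `1 / (1 + falloff * d²)` form, the `falloff` parameter and the function names are assumptions; only the idea of weighting influence by the inverse squared distance to the city comes from the description.

```python
def ownership_score(influence_at_tile, tile, city, falloff=1.0):
    """Influence at the tile, scaled by an inverse-squared-distance factor.
    The '1 +' term is an assumption that avoids dividing by zero on the city tile."""
    dx = tile[0] - city[0]
    dy = tile[1] - city[1]
    return influence_at_tile / (1.0 + falloff * (dx * dx + dy * dy))


def tile_owner(tile, claims):
    """claims: list of (city_id, city_position, influence_at_tile) tuples."""
    best = max(claims, key=lambda c: ownership_score(c[2], tile, c[1]))
    return best[0]


# A tile next to city "A" stays with "A", even though distant city "B"
# exerts 4% more raw influence on it.
claims = [("A", (0, 0), 50.0), ("B", (10, 0), 52.0)]
print(tile_owner((1, 0), claims))
```

Comparing raw influence alone would hand this tile to "B"; with the scaling, a city has to be much stronger to win tiles that sit right next to a rival.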


A couple of years ago I developed an algorithm for this, with “nations” in mind (a max of 16 of them). The algorithm was fast, a bit buggy and complicated. Also, the code was messy. I tried to read and understand it, and realized I’d be better off writing something from scratch, and it had to be simple. As an attempt at documentation, here’s the algorithm in all its glory.

Data format

First things first. For each tile, we store the source ID and the influence decay so far, from the source location until the tile. The decay will be the accumulation of decays along the “shortest” 8-connected path starting from the source.


Create the influence decay map, that stores for each tile how much influence is reduced on crossing the tile horizontally or vertically (for diagonal crossing, it’s scaled by sqrt(2) )

Adding/Removing/Changing an influence source

All these cases are handled mostly in the same way, which is important for simplicity.

-> Update local cache

We keep a very small per-source cache that stores the location and influence of each source, and that’s it. We update the cache by adding/removing/modifying entries as needed.

-> Calculate front

We calculate all the border tiles of a source in the following way. We start with the influence source location (the city), and we slowly expand outwards in the 4 directions (left, right, top, bottom). We expand towards a direction if a point is within the influence of the source. Below is an image that represents this outwards expansion (the center of the concentric squares is the influence source). In the image, I mark the boundary points; these are the points that have at least one neighbour that is outside the source’s influence (they could belong to a different city, or they might be neutral). So, we slowly grow this axis-aligned bounding box of the source’s points, while adding all the border points to a “to process” list.


-> Process queue

We now have a queue with a list of points to be processed. These are all the boundary points (for a newly created source, its location is the one and single boundary point). The queue is a priority queue: we add points accompanied by weights, where the weights represent how important it is to process the points first. The reason why a priority queue is important is that we want to avoid traversing tiles multiple times. Imagine the following scenario:

  • Tile (32,45) needs to be processed. We figure out that it should belong to source S0, and it will have an influence of 50
  • Tile (32,45) needs to be processed again. Because we arrived here from a different path, we realize that it should actually belong to source S1, and it will have an influence of 62
  • Tile (32,45) needs to be processed again. We arrived from a different path again, but from the same source (so we effectively used a shortcut, e.g. around the mountains that cost much to pass through). So, while it will still belong to source S1, it will now have an influence of 65.

So, a tile can be unnecessarily processed several times, while we can avoid that if we process the last one first (S1, 65 influence), as the others simply have lesser influence and will not cause the tile to get re-processed. The above also hints on how the processing is done: as a form of floodfill. While on a tile, we figure out who the owner is, how much influence is there at the tile (source influence – influence decay till point), and which neighbouring tiles do we need to process. The queue processing algorithm can be summarized as follows:

  • Get the top point in the queue
  • Check if it’s obsolete. It would be obsolete if the stored weight is lower than the current weight (the weights are just the influence values). If it’s obsolete, repeat from above
  • Check the 8 neighbours and calculate the scores as if the neighbours owned the tile. So, we pretty much calculate if they should own it
  • If our influence is negative and there’s no better neighbour, the tile needs to be reclaimed by nature. If that’s the case, add to the queue all neighbours that are of the same source
  • If our influence is positive and there’s no better neighbour, we’re ok and we need to see if we need to expand this source id to neighbouring tiles. By comparing influences, we add potential candidates to the queue.
  • If there’s a better neighbour, we replace this tile. The new influence information can propagate further, so we check all neighbours to find out potential processing candidates and add them to the queue.
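To make the role of the priority queue concrete, here is a simplified Python sketch of the flood fill. It is not the incremental update described above: it rebuilds the whole map from scratch for a set of sources, so the "reclaimed by nature" branch is not needed. The function name, the grid layout and the choice to charge the decay of the tile being entered are assumptions.

```python
import heapq
import math


def propagate(decay, sources):
    """decay: 2D list of per-tile crossing costs.
    sources: {source_id: ((x, y), influence)}.
    Returns (owner, influence) grids; owner is None where no influence reaches."""
    height, width = len(decay), len(decay[0])
    owner = [[None] * width for _ in range(height)]
    influence = [[0.0] * width for _ in range(height)]
    heap = []
    for sid, ((x, y), strength) in sources.items():
        # heapq is a min-heap, so influence is negated to pop the strongest claim first
        heapq.heappush(heap, (-strength, x, y, sid))
    while heap:
        neg_value, x, y, sid = heapq.heappop(heap)
        value = -neg_value
        if value <= influence[y][x]:
            continue  # obsolete entry: the tile already holds a claim at least as strong
        owner[y][x] = sid
        influence[y][x] = value
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue
                nx, ny = x + dx, y + dy
                if not (0 <= nx < width and 0 <= ny < height):
                    continue
                # diagonal crossings cost sqrt(2) times the tile's decay
                step = decay[ny][nx] * (math.sqrt(2.0) if dx and dy else 1.0)
                candidate = value - step
                if candidate > influence[ny][nx]:
                    heapq.heappush(heap, (-candidate, nx, ny, sid))
    return owner, influence
```

Because the strongest pending claim is always popped first, each tile is settled the first time it is written, and weaker entries for the same tile are discarded by the obsolete check instead of triggering another round of neighbour processing.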


Here’s a video that demonstrates border growth for 256 city states.


Nations, Races and Cities

One thing that has been bugging me for a while is the administrative aspect of the world’s population. Several races coexist in the world, sometimes not so peacefully. One of the important questions is, how to divide the population groups? This can answer questions such as who lives where, who likes whom, how does the player interact with each group, etc. The context is always important: this is a game where the player controls a single character over the course of maybe several in-game decades, in a persistent and procedural world, where there are lots of self-sufficient cities with guilds, shops etc.


One way of dividing them is DnD/Forgotten Realms style using nations. Each nation has different government type (plutocracy, magocracy, autocracy) and has citizens of potentially many races. This is a nice, “realistic” division, but comes with a number of complications for a type of game where you’re an adventurer going around in the world and doing stuff. Just a few issues below:

  • Nation conflicts and diplomacy. When there are several nations, it’s only natural that over the course of time conflicts arise. If relations were all nice among them, there’s no reason not to be unified. Modeling nation-wide war in a game where the player controls a single or a few characters can be problematic. What happens if nation A wages war on nation B? What does that mean for the player? Can the player not safely visit several cities of nation A or B anymore?
  • Nation-wide AI, city-wide AI, unit AI. With nations, there’s more AI to develop. What responsibilities does a nation have? What can it do? Found cities, wage wars, etc? We’re getting heavily into 4x game territory, and that becomes a bit much for solo development.
  • Nation identity. When creating a number of fictional nations, effort must be put in so that the nations are unique, individual, interesting. I’m personally very averse to fiction where the differences are superficial. For example, picking a name from a generator list, rolling a die for government, rolling a die for alignment, etc and that’s it. That’s not enough. Forgotten Realms for example has quite good depth for each nation, but that’s over tons of books and game supplements. Solo development cannot afford such depth.


Another way to split nations is by race, where the race would also be the nation. You have dwarves, elves, humans, etc. That’s nice and simple and at least addresses the “identity” issue above. There are still potential conflicts and AI to be modeled though, but they are nothing compared to the effort in creating a striking identity for a nation. The identity of the races still needs to be developed to escape the confines of the generic high fantasy elf/dwarf and other races, but that would be done for the nation-based division anyway, as a separate task. ( Create identity for dwarves, humans, orcs, etc, also create identity for nation of Whatever and Etcetera). Issues with race-based division:

  • Not sensible. It just doesn’t make much sense that nations, at least the “good” ones, would deny citizenship to other intelligent species, as long as they could all co-exist peacefully. Of course dwarves could be reserved towards other races, or elves be haughty and racist, but having that at 100% everywhere makes the races one-dimensional and not so believable.
  • Player interaction limitation. In the game, several guilds exist that the player can join. If the player is from race A, do you get excluded from guilds in cities of race B? Otherwise, why would they be the special cases of allowing members of different race? It would take quite a bit of writing effort to make that sensible and believable. The game would force you to be part of a big group, where inadvertently you’d have situations of “us vs them” at a nation/race scale.

Multicultural city-states

Another way to divide the world is to ignore the nation-wide scale using city states (Elder Scrolls cities feel sort-of like that). Cities are self-sufficient, can contain guilds and various other buildings and people that a player can interact, and have the following advantages:

  • No need for full-out warfare. A nation gathering army and ganging up to attack a city is much easier than a city gathering army to attack another city. So all-out warfare between city-states is not something that one can expect to naturally happen. Subterfuge on the other hand is much more likely, and can create very interesting scenarios: for example a guild hires you to steal a relic from a rival guild of another city-state, or the government hires you to sabotage a mine operation that belongs to another city-state.
  • Multicultural. A city-state can optionally be multi-race, or single-race, or anything in-between. A fully dwarven city in the mountain is as plausible as a hillside settlement with hill dwarves and hill orcs, as long as the races can coexist peacefully.
  • No nation-wide simulation layer. Easier to develop as there’s no need for nation AI. Nations don’t need to found new cities, as the time scale of the game is not centuries (which would take for a city to start, grow and have some sort of history). There’s no resource management layer between cities and nations (does a mine’s ore go into nation coffers? Does a nation have resources, or they are per-city?)
  • Emergent identity. The city-state types are used as a prototype several times, but the simulated and played history can affect the development of individual city-states. A dwarven city in the mountains can start as poor, if surrounded by a not-so-resourceful environment, but its fate might change if they discover a rare mineral nearby and mine it. A different dwarven city could have a completely different fate, for example it could be eradicated by Unknown Destructive Forces or sabotaged to oblivion. The basic identity (what does a dwarven city look like, how does it function, etc) is written for a limited number of prototypes, while the emergent identity via history generation and playing is what will make it rich.


The above are just some thoughts; not final, but representative of my current development mindset. While originally I started with the race-based approach, I’ll be using the city-state paradigm unless I can think of a blocker.

Autotiling Adventures Part IV: Mountains, trees and props

Previously I utilized HoMM 3 assets for extracting biome detail textures.

This time, the next logical step is to add more foreground detail, such as trees, rocks, mountains and other props. I continue to utilize HoMM 3 assets, as they work pretty well and are relatively close to how I’d like things to look.  Of course I’ll eventually have to make my own, as these are not mine, but that’s a problem for much, much later.

The pipeline from start to end for adding such foreground detail can be summarized by the following: First, identify the assets of interest. Then, pack them in a texture atlas and create an associated data file with per-prop information. Then, write the logic to place props on a map based on biome information. Finally, use the prop placement data to render the props onscreen.

Step 0: Tools

In order to process the assets, we need to understand them first. Kudos to the following, as without these I wouldn’t have done much:

Using the above, we can observe the assets and look at their properties (map editor), we can get the lists of all assets and asset types for the maps (MMArchive), and also extract all the images (python utilities)

Step 1: Identify assets of interest

This was a tedious process, going through each asset and identifying if it’s suitable to use as foreground detail. The source data were the following: 1) a text file with asset properties, such as suitable environments, asset type, image names, etc 2) a text file with all asset types 3) a large set of images

I’ve never worked with a tile-based engine before, so I observed several things that look like common sense:

  • Good overlapping look can be achieved when rendering props in the map top-to-bottom (makes sense as in  a top-down view, elements on the top are further back), right-to-left (no idea why)
  • Each asset is logically divided into 32×32 pixel regions, which I call subtiles. The subtiles are used in HoMM as single grid cells (e.g. a unit occupies a single subtile). Game data store movement blocking and entrance masks using subtiles. The maximum subtile number is 8×6, which means a 256×192 image.
  • Some multi-subtile assets could safely overlap, typically mountains and trees

Step 2: Generate texture atlas and data file

Having a list of assets of interest, we can now pack the images into an atlas and also save information per element. The information stored is:

  • General category, such as “Landscape features”, “Vegetation”, “Props”. Used for filtering assets.
  • Specific category, such as “Craters”, “Mountains”, “Trees”. Used for filtering assets.
  • Subcategory, such as “Oak trees”, “Rock”. Used for filtering assets.
  • Element ID, such as “avlmtdr2” (unique names used in the game data). Used for unique asset addressing
  • Composition group, such as “avlmtdr” (unique names without the number suffix, that indicates a group). Used to determine safe overlap of assets
  • Subtile num, such as [3, 4]. Used to determine the region that the asset covers
  • Subtile occupancy mask, 8×8 bits. Used to determine the per-subtile “logical” coverage
  • Subtile render mask, 8×8 bits. Used to determine if we need to render a subtile or not
  • Biome mask, 16 bits. Used to determine the biomes that the occupancy-marked tiles can be placed on

The difference between the occupancy and the render mask is as follows:

  • The render mask sets bits of tiles that contain at least 1 pixel with a non-zero alpha value
  • The occupancy mask sets bits of tiles that act as blockers in the map.
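As an illustration of the render mask, here is a small Python sketch that derives it from an asset's alpha channel. The bit layout (bit index = row × 8 + column, inside a 64-bit integer) and the function names are assumptions; the text only says that the masks are 8×8 bits and that a render bit is set when a subtile has at least one pixel with non-zero alpha.

```python
SUBTILE = 32    # subtile size in pixels
MASK_COLS = 8   # masks are 8x8 bits


def render_mask(alpha):
    """alpha: list of pixel rows holding alpha values; the image dimensions are
    multiples of 32. Sets one bit per subtile that has any non-transparent pixel."""
    mask = 0
    rows = len(alpha) // SUBTILE
    cols = len(alpha[0]) // SUBTILE
    for r in range(rows):
        for c in range(cols):
            visible = any(
                alpha[y][x] > 0
                for y in range(r * SUBTILE, (r + 1) * SUBTILE)
                for x in range(c * SUBTILE, (c + 1) * SUBTILE)
            )
            if visible:
                mask |= 1 << (r * MASK_COLS + c)
    return mask


def subtile_set(mask, row, col):
    """True if the bit for the given subtile is set in the mask."""
    return bool((mask >> (row * MASK_COLS + col)) & 1)
```

The occupancy mask could use the same bit layout, but it cannot be derived from the pixels: it comes from the blocking information in the game data.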

Here’s an example:

The red subtiles here mark the occupancy, while all other marked tiles (plus two unmarked with a bit of shadow) mark the renderable parts.

Below is a packed atlas using all assets of interest.

Some assets are animations, in which case we make sure they’re on the same line. I used this packing code. Atlas rectangles are named, and they use the Element ID as described above. One more interesting detail about the atlas is that the elements are packed in multiples of 32 pixels, so that means that I can have 6 mipmap levels and still not have any asset bleeding. I also generate a JSON file with the properties of each element.

Step 3: Place props based on biome data

Admittedly, I didn’t put a supreme amount of effort into placing things. Little effort yielded reasonable results, so that’s ok for now. The placement is really simple. It consists of four stages: placing mountains, trees, props and cleaning up.

One important bit: Below, I use the term “map subtile”. I split the overworld map tiles to 2×2 smaller tiles: these (map) subtiles correspond to the size of the subtiles of the assets. An asset using 6×3 subtiles (192 x 96 pixels) will be mapped to 3×1.5 overworld tiles.

Placing mountains

First, we go through each subtile on the map whose elevation is high and we try to use it as a starting point for placing any of the assets marked as “Mountains”. The placement condition is that all subtiles with an occupancy bit set need to display biome compatibility based on the tile they’re on and the asset’s biome mask. We also prohibit occupied tiles being placed on river tiles, as the composition becomes quite difficult. Overlaps are allowed, as long as the overlapping assets are in the same composition group.
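The placement condition can be sketched in Python as below. The data layout is hypothetical: an asset is a dict with a set of occupied subtile offsets, a biome bitmask and a composition group, and the map is represented by plain dicts and sets keyed by map-subtile position.

```python
def composition_group(element_id):
    """Element ID without its number suffix, e.g. 'avlmtdr2' -> 'avlmtdr'."""
    return element_id.rstrip("0123456789")


def can_place(asset, origin, biomes, rivers, occupied):
    """asset: {'occupancy': set of (col, row), 'biome_mask': int, 'group': str}
    biomes: {(x, y): biome index}, rivers: set of (x, y),
    occupied: {(x, y): composition group of the asset already placed there}."""
    ox, oy = origin
    for col, row in asset["occupancy"]:
        pos = (ox + col, oy + row)
        if pos not in biomes:
            return False  # asset would stick out of the map
        if not (asset["biome_mask"] >> biomes[pos]) & 1:
            return False  # biome not allowed for this asset
        if pos in rivers:
            return False  # occupied subtiles may not sit on rivers
        if pos in occupied and occupied[pos] != asset["group"]:
            return False  # overlap only within the same composition group
    return True
```

The same check would apply to trees and props; only the asset filter and the way starting points are chosen differ between the stages.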

Placing trees

Next, we go through each land subtile on the map and we use the vegetation density as the probability that the subtile will be tried as a starting point for vegetation. This time, we filter the assets by only selecting ones in the category “Trees”. Biome compatibility and river prohibition are applied as in the case of mountain placement.

Placing props

Next, we generate a large number of random positions on the map, and we use them as potential starting locations for props. Yet again, we filter the assets appropriately to get the candidates of interest ( skulls, logs, stumps, reefs, etc) and we apply the same conditions as in mountains and trees: biome compatibility and river prohibition.

Cleaning up the data

Now we have a list of prop references and prop offsets per world subtile. We sort these so that the ones on the bottom left will be rendered last. Additionally, for each prop, we identify if all of its occupant subtiles are 5 layers deep or more, and we remove those assets as they will never be rendered (we will only render up to 4 layers of such props). Finally, we build the GPU texture from the resulting data.
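The sorting and culling can be sketched like this in Python. It reflects one reading of the description: y grows downwards, props are drawn top-to-bottom and right-to-left, and a prop is dropped when it is not among the top four layers on any subtile it covers. The dict layout and the names are assumptions.

```python
from collections import defaultdict

MAX_LAYERS = 4


def render_order(props):
    """Top-to-bottom, then right-to-left, so bottom-left props are drawn last."""
    return sorted(props, key=lambda p: (p["y"], -p["x"]))


def cull_hidden(props):
    """props: dicts with 'x', 'y' (map-subtile origin) and 'subtiles' (set of (col, row)).
    Keeps a prop if it is within the top MAX_LAYERS on at least one subtile."""
    ordered = render_order(props)
    stacks = defaultdict(list)  # subtile position -> prop indices in render order
    for index, prop in enumerate(ordered):
        for col, row in prop["subtiles"]:
            stacks[(prop["x"] + col, prop["y"] + row)].append(index)
    visible = set()
    for stack in stacks.values():
        visible.update(stack[-MAX_LAYERS:])  # the last-drawn entries end up on top
    return [prop for index, prop in enumerate(ordered) if index in visible]
```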

Step 4: Render props on map

The structure used for rendering is an “image” where each pixel stores information about what props need to be rendered where. Instead of having a vector of (atlas_element_index, map_location), which is not too GPU-friendly, especially if we have 100k props, we take another approach: For each subtile on the map, we store references to up to four subtiles of props. The required data per layer are the following, and they easily fit into a 128-bit texel (RGBA32).

  • Atlas element number of animation frames
  • Atlas element rect corner X
  • Atlas element rect corner Y
  • Atlas stride X ( when we have an animation, use the stride to jump to other frames)
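One possible way to fit this into a 128-bit texel is sketched below in Python: each of the four layers gets one 32-bit channel, and the four fields take 8 bits each. This layout is an assumption, not necessarily the one used in the game. It relies on the atlas being packed in multiples of 32 pixels, so rect corners and strides can be stored in subtile units, and it uses a frame count of 0 to mark an empty layer.

```python
import struct


def pack_layer(num_frames, rect_x, rect_y, stride_x):
    """Pack four 8-bit fields (rect and stride in subtile units) into one 32-bit word."""
    for value in (num_frames, rect_x, rect_y, stride_x):
        if not 0 <= value < 256:
            raise ValueError("field does not fit in 8 bits")
    return num_frames | (rect_x << 8) | (rect_y << 16) | (stride_x << 24)


def unpack_layer(word):
    """Inverse of pack_layer: returns (num_frames, rect_x, rect_y, stride_x)."""
    return (word & 0xFF, (word >> 8) & 0xFF, (word >> 16) & 0xFF, (word >> 24) & 0xFF)


def pack_texel(layers):
    """layers: up to four (num_frames, rect_x, rect_y, stride_x) tuples, bottom layer first.
    Returns the 16 bytes of one RGBA32 texel; unused channels stay 0."""
    words = [pack_layer(*layer) for layer in layers[:4]]
    words += [0] * (4 - len(words))
    return struct.pack("<4I", *words)
```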

The size of the rect that we render is conveniently constant: it’s the size of the subtile. This, appropriately optimized **should** be quite efficient, but alas, the rendering shader is very slow on an intel-powered laptop with an oldish card. But that’s a different story. Here’s a video with the results.



There’s still room for improvement (there always is), but I need to proceed to framework improvements, so for now, this is the sort of map visuals that will be used. Well, with fewer reefs for sure 🙂

Autotiling Adventures Part III: Detail biome textures and animated coastal waves

Previously I generated procedural masks for biome and rivers, but using constant color for each biome (river too). So, I ended up with nice outlines, but still the result was looking flat. So, I thought I’d add some procedural variation to the color using perlin noise. Needless to say, the result was underwhelming. So, after quite a bit of hunting, I rediscovered a website that I had stumbled upon years ago:  The Spriters Resource! What got me really excited back then (even though I eventually forgot) is that I found there tile art for, among other games, Heroes of Might and Magic 3!

Heroes of Might and Magic III

And, surely enough, found the section for the world tiles! (the bg folder in the zip file). Of course, these are commercial assets so I can only use them for testing things out, but they are perfect for that, as I wanted to go for that art style anyway.

So, each terrain type has a bunch of 32×32 images that represent the terrain in its entirety, or transition tiles. I was too lazy to search online if there’s any rhyme or reason to the naming of those files, so I did the natural thing: run some batched image processing using ImageMagick to identify the tileable images.

Step 1: Find out the seamless tiles in all directions

To find out which images tile with themselves, I ran a tiling script. A Python call for the ImageMagick command line looks like this:

"magick montage {}-geometry +0+0 {}".format( (file_in + " ") * numrepeats * numrepeats, file_out)

where file_in is the input 32×32 file, file_out is the output tiled file, and numrepeats is the number of tile repeats in each axis.
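For completeness, here is a sketch of how such a call could be wrapped with `subprocess`, passing the arguments as a list instead of one formatted string. The function names are made up for illustration, and the explicit `-tile NxN` option is an addition that is not in the command above; it forces a square layout rather than relying on montage's automatic arrangement.

```python
import subprocess


def montage_command(file_in, file_out, numrepeats):
    """Build the argument list that tiles file_in numrepeats x numrepeats times."""
    return (
        ["magick", "montage"]
        + [file_in] * (numrepeats * numrepeats)
        + ["-tile", "{0}x{0}".format(numrepeats), "-geometry", "+0+0", file_out]
    )


def tile_image(file_in, file_out, numrepeats=3):
    """Run ImageMagick; raises CalledProcessError if the command fails."""
    subprocess.run(montage_command(file_in, file_out, numrepeats), check=True)
```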

Results look like this:

Good tile

Bad tile (it’s a transition tile)

So, great, now I have a list of tiles that are seamless with themselves. But, would they be seamless with other tiles?

Step 2: Add labels to tiles

Files are of the form “watrtl14.png”, “tgrd023.png”, etc. So, a prefix for the terrain type, and a number for the id. So, the next step is to create images with a label in the middle of the image displaying the tile id:

"magick convert {} -gravity center -annotate +0+0 {} {}".format(file, label, file_out)

Result is like this:


Step 3: Montage of different, labeled tiles

So now here comes the fun part. As we have a version of all the images labeled, I run a montage again as in step 1, but with the following changes:

  • Use the labeled images as the base tiles
  • If we have 20 images for a terrain, I sample from this set randomly to populate a 12×12 tile grid. This will show what doesn’t match with what else! Here’s an example

All tiles match well with each other!


There are some tile IDs that are darker compared to their neighbours. Reading the labels, I can quickly identify them: 17,22,23,24,25

Step 4: Assemble the texture array

Now I select which image set will be used for which biome, e.g. the water images are used for the water biomes (4 of them), by creating variants that are slightly processed in terms of histogram/levels. For each biome, I select a random subset of 16 images. As I have 16 biomes, I end up with 256 images.

So, after a bit of work, the resulting texture looks like this:

Well, in reality I’m using a source image of 32×8192 which gets interpreted as a texture array of 256 slices, so that I don’t have to write manual code for correct mipmapping in the texture atlas. From a quick performance test, there didn’t seem to be much difference.
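The arithmetic behind the 32×8192 source image is simple: 16 biomes × 16 variants = 256 slices, each 32 pixels tall, stacked vertically. A small Python sketch of the indexing follows; the biome-major ordering of slices is an assumption.

```python
TILE = 32
NUM_BIOMES = 16
VARIANTS_PER_BIOME = 16


def slice_index(biome, variant):
    """Texture array slice for a biome and one of its 16 variants (biome-major order)."""
    return biome * VARIANTS_PER_BIOME + variant


def slice_rows(index):
    """Pixel rows [top, bottom) that the slice covers in the 32x8192 source image."""
    top = index * TILE
    return top, top + TILE


# The last slice ends exactly at the bottom of the source image.
print(slice_rows(slice_index(NUM_BIOMES - 1, VARIANTS_PER_BIOME - 1)))
```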

So that’s all for how we create the detail texture atlas! Now onto applying it. There’s not much to write about how to apply it, as I’m just sampling the texture instead of using  a constant color. So, here’s a before/after comparison:

For the observant readers: there’s a slight artifact at the tile borders in the above images (and the video below). This was caused by an incorrect fract() operation on the UV coordinates and has now been fixed. Here’s the associated video in all its animated glory:

The video shows before-and-after the detail textures, the coastal animation, scrolling and zooming in/out on the map.  For zoom in/out, I’m using texture filtering like this:

  • min filter: linear mipmap linear (to prevent noise at zoomed out level)
  • mag filter: nearest (I still want it to look pixely when you zoom in)
  • wrap mode: repeat ( it’s a texture array, so the filtering is taken care of automatically)

Nevertheless, I found some other resource for future reference for manual filtering/mipmapping if at some point I have to use a regular atlas rather than a texture array.

Coastal waves animation

So that’s a nice-looking gimmick, but I’m going to write a bit about how it was implemented anyway.

Remember, instead of storing bitmasks for the biome transitions, we’re storing distance fields to the boundary. When rendering, I process the layers one by one, and I’m blending the biome colours. It helps that the seacoast is biome type 0 and the water biomes are types 1,2,3 or 4. So, if there’s coast, it will always be the biome type in layer index 0. And if there’s water, it will be right after. So, I need the following conditions to be true:

  • current layer is a water layer
  • first layer is coast
  • pixel distance field value from boundary of current layer is negative (pixel is within the mask, i.e. our “current” biome)

We need to have recorded the distance field value of the biome in layer index 1, regardless of the layer we’re on. This ensures that this is the distance of the first water layer to the coast, which is what we need (the “to the coast” is the crucial bit, as the distance field records values against the previous layer, and we don’t want the abyssal sea distance to deep sea, if coast, deep sea and abyssal sea are layers on the same tile).

So, now that we have these, we need to compute the waves. For the waves, I’ll let code do the talking, as it is noise, domain warping, and the usual:

float t = g_TotalTime*3;
t += 4.1*( snoise2(var_actual_pos*1)*0.5 + 0.5);
float cmpDist = 0.45 + 0.02*(sin(t)*0.5 + 0.5);
cmpDist = 0.41; // fixed override of the animated threshold above
if ( layer1dist > cmpDist)
{
    vec3 coastal_water_color = vec3(0.7,0.95,1.0);
    // Put crests at certain distances
    float distFromBoundary = layer1dist;
    float phase = -0.5*t + 2.0*( snoise2(var_actual_pos*10)*0.5 + 0.5) ;
    float dmin = cmpDist;
    float dmax = 0.5;
    float dn = (clamp(distFromBoundary,dmin,dmax) - dmin) / (dmax-dmin);
    // sharpen the result with pow. adjust the phase with time
    float mixFactor = pow( sin(phase + dn*11.0)*0.5 + 0.5, 4.0);
    mixFactor *= smoothstep( -0.4, 0.4, snoise3( vec3(var_actual_pos*2.0 + vec2(1000),t*0.05)));
    mixFactor *= dn; // smoothly fade out the wave at the boundary distance
    // blend by mixFactor: ... = mix(,, mixFactor);
}

One note about the cmpDist variable: 0.5 is exactly at the boundary (I encode a signed distance field in [0,1]), and a value slightly away from the coast would be around 0.45.
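
To make the crest maths easier to poke at outside a shader, here is a minimal Python sketch of just the deterministic part of the listing above. It leaves out both noise terms (snoise2 and snoise3) on purpose, and the names crest_factor, crests and sharpness are made up for illustration; the defaults mirror the constants in the shader code.

```python
import math

def crest_factor(dist, phase, dmin=0.41, dmax=0.5, crests=11.0, sharpness=4.0):
    """Wave-crest blend factor for an encoded distance value in [0, 1]."""
    # Normalise the distance to [0, 1] between the comparison distance (dmin)
    # and the boundary (dmax = 0.5 in the encoded distance field).
    dn = (min(max(dist, dmin), dmax) - dmin) / (dmax - dmin)
    # Periodic crests along the distance axis, sharpened with a power.
    crest = (math.sin(phase + dn * crests) * 0.5 + 0.5) ** sharpness
    # Fade the wave out where dn reaches 0.
    return crest * dn
```

Animating phase over time moves the crests along the distance axis; the noise terms in the shader then break up the regularity.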

Next time I’ll try my luck with prop placement, and I’ll see if I can extract any sample props from that HoMM resource again for test use. But I might actually stop soon with the art, as I think now it should be good enough.

Autotiling Adventures Part II: Procedural masks for biomes and rivers

Biome masks


In the last article, I described a way to autotile multiple biomes using a minimal set of mask shapes. I used a custom map for testing. This time, I use some shaders to generate a nice big set of masks. In particular, I can generate for example 32 variations of each of the 4 shapes at 256×256 resolution. As we have 1 shape per RGBA texture component (our masks are grayscale), we need 32 RGBA textures, or a single 32-slice array. Stitching them up, the procedural masks look like this: (rows: variations, columns: shapes)


These masks are generated using perlin noise, and then they are post-processed to remove floating islands. Here’s how:

  • We know that each shape contains 1 or 2 white regions and always 1 black region
  • Detect all the black regions, sort them by area, and replace all but the largest with white (so we satisfy “always 1 black region” criterion)
  • Detect all white regions, sort them by area, and replace all but the largest 1 (or 2) with black (so we satisfy “always 1 or 2 white region” criterion)
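
The cleanup steps above can be sketched roughly as follows. This is a plain-Python illustration, not the actual tool: it assumes the mask is a list of rows with 0 for black and 1 for white, it assumes 4-connectivity for what counts as a region, and the function names are hypothetical.

```python
from collections import deque

def find_regions(grid, value):
    """Return 4-connected regions of cells equal to `value`, as lists of (row, col)."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if grid[r][c] != value or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited cell.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def keep_largest(grid, value, keep):
    """Flip every region of `value` except the `keep` largest ones (in place)."""
    regions = sorted(find_regions(grid, value), key=len, reverse=True)
    for region in regions[keep:]:
        for y, x in region:
            grid[y][x] = 1 - value

def remove_islands(grid, white_regions):
    """Apply both passes: one black region, then 1 or 2 white regions."""
    keep_largest(grid, 0, 1)
    keep_largest(grid, 1, white_regions)
    return grid
```

The order matters: black holes are filled first, so a white region with a hole in it counts with its full area when the white regions are ranked.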

Here are the steps visually: left is the original image, middle is with extraneous black areas removed, right is the final, with the extraneous white areas removed:


At this stage, we calculate the distance field for each of the masks. The distance field is 256×256 at this point. The maximum distance in the distance field is the length of the diagonal diag = 256*sqrt(2); we normalize the values in the distance field from (-diag,diag) to (0,1), to be resolution independent. We now downsample the distance field to 32×32, so that it can still reconstruct the shape nicely. The data is stored in an RGBA8 texture. If each variation is an array slice, we end up with a texture array 32x32xN. To give some perspective, for 64 variations we need 32*32*64*4 bytes = 256K of memory, which is very little. Add a bit of extra for the mipmaps (which are good for filtering when zooming out further), and we’re settled with the biome masks.
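
As a quick sanity check of the numbers above, here is a small Python sketch of the normalisation and the memory estimate. The function names are made up, and the 8-bit quantisation step is an assumption about how a value ends up in one RGBA8 channel.

```python
import math

SOURCE_SIZE = 256
DIAG = SOURCE_SIZE * math.sqrt(2)  # maximum possible distance in the field

def normalise_distance(d):
    """Map a signed distance in (-diag, diag) to (0, 1); the boundary (d = 0) lands on 0.5."""
    return d / (2.0 * DIAG) + 0.5

def to_byte(n):
    """Quantise a normalised value to one 8-bit channel (assumed storage step)."""
    return max(0, min(255, round(n * 255)))

def mask_array_bytes(size=32, variations=64, channels=4):
    """Memory for a size x size x variations texture array, one byte per channel."""
    return size * size * variations * channels
```

With the defaults, mask_array_bytes() gives 262144 bytes, which is the 256K quoted above (before mipmaps).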

Rendering the biome masks

Last time I described a way to render the masks, by rendering a subset of tiles per layer. This is far from optimal (it was approach v1 after all). So, here’s a better one:

  • We observe that every tile has to be rendered (duh). That means, we need a dense 2D data structure, with tile data per element. So, tile positions are now implicit.
  • We observe that we have up to 4 layers per tile. The info that we need per layer is the layer index (4 bits), the mask index (3 bits) and the transform index (3 bits). That makes 10 bits per layer, so 40 bits in total. So we place the data in a 64-bit data structure (e.g. RGBA16 or RG32) and we have 24 bits to spare.
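
To illustrate the bit budget, here is a hedged Python sketch of packing and unpacking the per-tile data. The field order inside each 10-bit entry, and passing the number of valid layers separately, are assumptions made for the example; the text above only fixes the field widths.

```python
LAYER_BITS, MASK_BITS, XFORM_BITS = 4, 3, 3
BITS_PER_LAYER = LAYER_BITS + MASK_BITS + XFORM_BITS  # 10 bits

def pack_tile(layers):
    """Pack up to 4 (layer, mask, transform) triples into one integer (40 bits max)."""
    assert len(layers) <= 4
    packed = 0
    for i, (layer, mask, xform) in enumerate(layers):
        assert 0 <= layer < 16 and 0 <= mask < 8 and 0 <= xform < 8
        entry = layer | (mask << LAYER_BITS) | (xform << (LAYER_BITS + MASK_BITS))
        packed |= entry << (i * BITS_PER_LAYER)
    return packed

def unpack_tile(packed, count):
    """Recover the first `count` (layer, mask, transform) triples."""
    layers = []
    for i in range(count):
        entry = (packed >> (i * BITS_PER_LAYER)) & ((1 << BITS_PER_LAYER) - 1)
        layers.append((entry & 0xF, (entry >> 4) & 0x7, (entry >> 7) & 0x7))
    return layers
```

In a shader the same shifts and masks would be applied to the integer texel fetched for the tile.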

Now we render the visible grid, and we sample this data structure to reconstruct the mask. The pseudocode is roughly as follows:

for each pixel:
  calculate tile index and offset in tile
  shift output position by half a tile // for corner offset
  sample autotile data based on tile index
  set output color as 0
  for each valid layer
    transform tile offset using layer transform
    sample mask using transformed coordinates and mask index
    calculate color based on layer
    blend output color with current color based on mask value


River (and road) masks

River masks are slightly different to biome masks and have the following characteristics:

  • The tiles where we need river masks are few: for my map, it was 1.5% of the total tiles.
  • It is not beneficial any more to use corner offsets.
  • There is no diagonal river connection.
  • All river tiles connect to at least one river tile.
  • There is always a source/origin tile for rivers. The origin tile is always connected to one other tile.

Given the above, we realize that we really need 5 different masks: origin, line, corner, t-junction and cross. Below is a list of examples:
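
A minimal sketch of how the shape could be chosen from the 4-neighbour connectivity, assuming each flag says whether the neighbouring tile in that direction is also a river tile. The function name and the shape strings are just for illustration, and picking the rotation for each shape is left out.

```python
def classify_river_tile(north, east, south, west):
    """Pick one of the five mask shapes from the 4-neighbour river connectivity."""
    count = north + east + south + west
    if count == 1:
        return "origin"
    if count == 2:
        # Opposite neighbours form a straight line, adjacent ones a corner.
        return "line" if (north and south) or (east and west) else "corner"
    if count == 3:
        return "t-junction"
    if count == 4:
        return "cross"
    raise ValueError("a river tile connects to at least one other river tile")
```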

We follow the same process as with the biome masks: we remove extraneous white/black regions, calculate distance fields and downsample to 32×32.

Here’s also a video that demonstrates all the mask shapes, procedurally generated, parameterized by time:

As you can see, for the river masks there are typically big black holes in the middle, but they are filled out by the process I described earlier.

River and road rendering

The process is a bit different to the biome mask rendering. Now we have a sparse set of tiles that contain river/road data. The tile data required are 3 bits for the mask index, 3 bits for the transform, and 10 bits for each of the x,y coordinates of the tile (I’m using 512×512 maps for the overworld and I doubt I’d use 2048 or larger). The pseudocode for rendering is similar and a bit simpler compared to the biome mask rendering: for a tile, we unpack the autotile data, we calculate the output position based on the x,y values, and we sample the mask using transformed coordinates.
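
The sparse tile record adds up to 3 + 3 + 10 + 10 = 26 bits, so it fits in a single 32-bit integer. Here is a small Python sketch of one possible layout; the bit positions are an assumption for the example, only the field widths come from the description above.

```python
def pack_river_tile(mask, xform, x, y):
    """Pack mask (3 bits), transform (3 bits) and x, y (10 bits each) into 26 bits."""
    assert 0 <= mask < 8 and 0 <= xform < 8
    assert 0 <= x < 1024 and 0 <= y < 1024
    return mask | (xform << 3) | (x << 6) | (y << 16)

def unpack_river_tile(packed):
    """Return (mask, transform, x, y) from a packed river tile."""
    return (packed & 0x7,
            (packed >> 3) & 0x7,
            (packed >> 6) & 0x3FF,
            (packed >> 16) & 0x3FF)
```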

The river/road render passes take place immediately after the biome mask pass, as they use a sparse tile renderer (biome masks rendering uses a dense tile renderer) and they don’t use corner offsets.

Putting it all together


Here’s a video that shows all masks on the map, compared to the single color per pixel:

As a note, the original single-color-per-pixel has some additional color variation based on the vegetation density, that the new masked version does not have yet. Also, I think there’s an indexing bug for the variations, as I should have 64 different variations per shape but we can see the occasional repetition.

TODOs for next are coastal water animation utilising the distance field, color variation of the biomes (they still look quite flat) and prop locations per tile for placement of trees, etc.

Autotiling adventures

So, we have a procedurally generated biome map, where each pixel is an individual biome. If we zoom in, it’s obviously quite pixelated. If we add sprites, it just doesn’t look right (besides the lighting and colors).

We have reasonably detailed sprites on single-colored squares. It’s just ugly. We need to add some texture/detail.



Enter auto-tiling (or transition tiles): a method to automatically place tiles with texture so that each tile matches perfectly with its neighbours. It’s a bit of a wild west out there in terms of approaches, so here are some resources (or resource lists) that I found useful:

Quite a few.

There are two main ways to consider art when using autotiles: using masks, or using premade textures. A good example is shown here:

Autotiling example

Blend masks example

The premade tiles have the obvious benefit that they can be very nicely done, but of course they are tied to the content they represent.  The blend masks do not look as good, but are easier to develop, and they are more flexible in terms of what textures we want to seamlessly mix. I decided to use masks as I want transitions between any biome: for 16 biome types, that’s 120 unique combinations. It’s not an option to ask an artist to develop 120 different autotiles, that needs quite a bit of money and time. And also, that would have no variation; each autotile would be replicated all over the place, so it would be easy to distinguish patterns.
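
The 120 figure is just the number of unordered pairs of distinct biomes, which is easy to check (the function name is made up for the example):

```python
from math import comb

def transition_pairs(biome_count):
    """Number of unordered pairs of distinct biomes that may need a transition."""
    return comb(biome_count, 2)
```

For 16 biome types, transition_pairs(16) gives 120, and the count grows quadratically as biomes are added.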


Grid shifting

The first naive thought that comes to mind (and I went with it for a while actually) is “ok, we have a tile, it is neighbour to 4 or 8 other tiles, so generate masks according to that relationship”. Example here. As one can see, the 4-connected version is less interesting than the 8-connected version (and we don’t want less-interesting), but the 8-connected version results in a lot of combinations! So what do we do? Well, we shift the grid. This way, we always have 4 potentially different tiles (quarters of them anyway)

Below, we shift the whole grid top-left by half a tile. Now, each grid cell (red) always contains parts of 4 tiles.

While this is mentioned in a few articles, it’s demonstrated perfectly in a non-technical article, here. That’s what sold me, as I find the results amazing!


Reducing unique combinations

So, that’s what I mostly found so far in reference material. Now, a 2×2 grid as described can contain 4 different biomes. Each of the 4 quarters is either inside or outside the mask, so that’s 4 bits, therefore 16 possible total combinations/arrangements. Here’s what they look like (source):


In the “16 most basic tiles” above, we can observe the following:

  • Tile 16 can be expressed by transforming tile 15 (180 deg rotation)
  • Tiles 11, 13, 14 can be expressed by transforming tile 10 (90, 180, 270 deg rotation)
  • Tiles 3, 9, 7 can be expressed by transforming tile 1 (90, 180, 270 deg rotation)
  • Tiles 2, 6, 8 can be expressed by transforming tile 4 (90, 180, 270 deg rotation)
  • Tiles 5 and 12 contain no spatially varying data

This implies that the only unique tiles are 1,4,10,15,5,12. Furthermore, the only unique tiles with spatially varying data are 1,4,10,15. So, that is 4 tiles instead of 16. We can arrange such a mask of 4 tiles like this:
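
The same reduction can be checked mechanically. The sketch below is only an illustration: it numbers the masks 0 to 15 by their corner bits (ordered top-left, top-right, bottom-right, bottom-left, which is an assumption and not the 1 to 16 numbering of the figure) and groups them by 90 degree rotations.

```python
def rotate_cw(mask):
    """Rotate a 4-bit corner mask by 90 degrees clockwise (bits ordered TL, TR, BR, BL)."""
    return ((mask << 1) | (mask >> 3)) & 0xF

def build_lookup():
    """Map each mask in [0, 15] to (canonical mask, number of clockwise quarter turns)."""
    lookup = {}
    for mask in range(16):
        if mask in lookup:
            continue
        current = mask
        for turns in range(4):
            # The first time a mask is reached, record how it derives from `mask`.
            lookup.setdefault(current, (mask, turns))
            current = rotate_cw(current)
    return lookup

def unique_masks(lookup):
    """Canonical masks, i.e. one representative per rotation group."""
    return sorted({canonical for canonical, _ in lookup.values()})
```

This gives six representatives: the empty and full masks plus four spatially varying ones (one corner, two adjacent corners, two diagonal corners, three corners), matching the count above. A table like this is the kind of thing a get_transform lookup can be built from.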

This has a nice continuous shape, if for example we want to ask an artist to draw some of those. Note that with this arrangement, the transformation will differ, as now the masks are already transformed compared to what I showed above. What’s really important is that the amount of white vs black at the borders that contain both needs to always match, so that tiles are seamlessly combined. In my case above, I’m splitting them at 50%, but that’s of course configurable. What I’m not going to cover, as I’ve given it some thought and it gets very complicated, is supporting variable black/white border percentages while ensuring that they match. There are many more complications involved and I’m not sure if it’s worth it in the end.

So, now we have 4 unique combinations. These can be nicely stored in an RGBA texture (one mask per channel) by converting the above 1×4 tile image. In the shader, given a mask value in [0,15], we effectively do the following:

mask = ... // obtain mask value from 4 underlying biome map tiles. Value in [0,15]
(mask_unique, transform) = get_transform(mask); // use a lookup table to get the unique mask index (0-3 for the textured masks, 4 = empty, 5 = full) and the transform needed
uv2 = apply_transform(uv, transform); // transform the texture coordinates
mask_rgba = sample_mask_texture(uv2); // sample the mask values
mask_value = -1;
switch (mask_unique)
{
    case 4:  mask_value = 0; break; // whole mask is empty
    case 5:  mask_value = 1; break; // whole mask is full
    default: mask_value = mask_rgba[mask_unique]; // get the component that we need
}

Most of the above can be done in the vertex shader, whereas the last two steps (sampling the texture and getting the component) need to be done in the pixel shader. So, it’s quite cheap.

Rendering tiles

So, we have a method to render tiles given a very small number of masks. How do we render the tiles? Here’s the naive approach, for a 512×512 biome map:

  • We have 16 biome layers, so I assign each a priority. E.g. shallow water covers coast. Water covers shallow water and coast. Deep water covers water, shallow water and coast. And so on.
  • For each layer, we generate tile render data as follows:
    • For each tile corner in the biome (513×513 grid of points)
      • Sample the 4 adjacent biome types (clamp to border if out of bounds)
      • Create the mask where we set 1 if the layer is equal or higher priority than current, or 0 if the layer is of lower priority than current
      • Based on the mask value, calculate unique mask index and transform, and store in this tile’s render data

So, now we have a list of 513x513x16 tiles = 4.21 million. That’s quite a lot. But as I said, that’s the naive version. Observations:

  • When the unique mask index corresponds to constant 0 (mask_unique index == 4), we don’t need to consider the tile for rendering.
  • When all of the four biome values in a tile are of higher priority than the  current layer, this means that the tile will be completely covered by higher priority layer tiles, and therefore we don’t need to render it.
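
Here is a rough Python sketch of the naive per-layer generation together with the two culling rules. It assumes the biome value doubles as its priority (a higher value covers a lower one) and that corner bits are ordered top-left, top-right, bottom-right, bottom-left; both are simplifications for the example, and the function names are hypothetical.

```python
def corner_samples(biome, cx, cy):
    """Biome values of the 4 tiles around grid corner (cx, cy), clamped to the map border."""
    h, w = len(biome), len(biome[0])

    def clamp(v, size):
        return max(0, min(size - 1, v))

    # Order: top-left, top-right, bottom-right, bottom-left.
    return [biome[clamp(cy + dy, h)][clamp(cx + dx, w)]
            for dy, dx in ((-1, -1), (-1, 0), (0, 0), (0, -1))]

def layer_tiles(biome, layer):
    """Yield (cx, cy, mask) for every corner tile of `layer` that survives culling."""
    h, w = len(biome), len(biome[0])
    for cy in range(h + 1):
        for cx in range(w + 1):
            samples = corner_samples(biome, cx, cy)
            # Bit set where the sampled biome has equal or higher priority.
            mask = sum(1 << i for i, b in enumerate(samples) if b >= layer)
            if mask == 0:
                continue  # constant-empty mask: nothing to draw
            if all(b > layer for b in samples):
                continue  # fully covered by higher-priority layers
            yield cx, cy, mask
```

On a 512×512 map this loops over the 513×513 corners once per layer, so it is still the naive version; the culling only reduces what gets stored and rendered.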

By applying these two, for my test map I reduced the number of tiles to 0.4 million, which is 10x better. Of course, that’s still a lot, but it doesn’t take into account any spatial hierarchy and other optimisations that could be done.

Here are some examples using the above un-nice mask. Zoomed-out:


Ok, so my mask looks bad, and there’s little to no variation, so you can see patterns everywhere.

Increasing variation

Using 256×256 masks, a single RGBA texture needs 256K of memory. We can have a texture array of such masks, using however many layers we can afford memory-wise. At runtime, we can select the layer based on various conditions. E.g. some texture layers could contain transition masks for particular biomes, or more generally, we can select a layer based on a function of the tile coordinates.
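
One possible "function of the tile coordinates" is a cheap integer hash, sketched below in Python. The hash constants are arbitrary choices for the example and the function name is hypothetical; any hash that scatters neighbouring coordinates reasonably well would do.

```python
def variation_index(x, y, variation_count):
    """Pick a mask variation (texture array layer) for tile (x, y), x and y >= 0."""
    h = (x * 73856093) ^ (y * 19349663)
    h = ((h ^ (h >> 13)) * 0x5BD1E995) & 0xFFFFFFFF
    h ^= h >> 15
    return h % variation_count
```

The result is deterministic per tile, so the same tile always gets the same variation without storing anything extra.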


Next post will be about procedurally-generating lots of masks, using distance fields versus using binary masks, and also determination of locations for placement of props.

Shader variables

Since the game will utilize graphics quite a bit in the style of old SNES-era games (multiple layers, lots of sprites), that means using a rendering engine which is above trivial level. Additionally, since much of rendering will be based on procedural techniques, that means lots of shaders. Lots of shaders requires configurability of said shaders using uniform variables. And this is the topic of this post.

A shader variable (ShaderVar) is an abstraction for such uniform variables. The abstraction allows manipulation of the value via ImGui (using optional minmax ranges for integer/float variables and vectors) and updating the values in the OpenGL state. These variable abstractions can also be used to solve the problem of automatic binding of textures, as in OpenGL it can be a bit of a pain to manage. Finally, we can add stat gathering functionality to identify at a rendercall if there are any variables which haven’t been set, which can be quite useful for debugging. A brief overview is the following:

Effect loading. Inspect the loaded effect (program) for uniform variables that are used by the shaders. Make two lists: one for textures/buffers and one for other values. The index in the sampler list is used as the texture unit to which we bind the corresponding texture or sampler.

ShaderVars class.  An abstraction for a group of ShaderVar objects. Each object has a name and value, and a uniform location for an arbitrary number of effects. That means that we can do the following:

SetShaderVar<float>( shaderVars, "g_Speed", 0.5f); // Set the value 0.5f to the variable g_Speed
UpdateShaderVars( shaderVars, fx1);// If g_Speed exists in fx1, it's set as 0.5f
UpdateShaderVars( shaderVars, fx2);// If g_Speed exists in fx2, it's set as 0.5f

It’s not really complicated underneath, but it serves as a nice abstraction to not deal with strings in the underlying implementations, as we’re dealing directly with uniform locations and vectors of such locations. At the moment I’m using strings for setting values, but this can (and will) be changed to use other forms, such as properties.

Global and local ShaderVars. When we’re about to render, we can update the shader using several such blocks. For example, one block could be globals for the whole application (window width, height), others could be globals for the current frame (current time) or something more specific, such as common values for overworld rendering (the grid section that is currently in view, etc.). These globals can be stored in the registry and fetched using a handle. After the globals are set, we can update the effect using any local shader variables. In case of a clash, we override with the most local version of the variable. Such overwrites can also be detected, warning for any misuse of the system.
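
The override rule can be sketched independently of any graphics API. The Python below is only an illustration of "most local wins, and clashes are reported"; the function name, the block representation as (name, dict) pairs and the example values are all hypothetical.

```python
def resolve_shader_vars(blocks):
    """Merge (block name, values) pairs ordered from most global to most local.

    Returns the merged values and a list of
    (variable, overridden block, overriding block) clashes.
    """
    merged, owner, overrides = {}, {}, []
    for block_name, values in blocks:
        for var, value in values.items():
            if var in merged:
                overrides.append((var, owner[var], block_name))
            merged[var] = value
            owner[var] = block_name
    return merged, overrides
```

For example, if a global block and a renderable's local block both set g_Color, the local value ends up in the merged result and the clash is returned so that it can be logged as a warning.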

Here’s what a few sections look like in the config files:

// Some shadervar blocks
"ShaderVars" : [
    { "GlobalPerApplication" : {
        "@factory" : "ShaderVarsSeparate",
        "ShaderVars" : [ 
        ]
    }},
    { "GlobalPerFrame" : {
        "@factory" : "ShaderVarsSeparate",
        "ShaderVars" : [ 
            {"Name" : "g_TotalTime", "@factory" : "ShaderVarFloat"}
        ]
    }},
    { "GlobalOverworld" : {
        "@factory" : "ShaderVarsSeparate",
        "ShaderVars" : [ 
            {"Name" : "g_HeightScale", "@factory" : "ShaderVarFloat", "Values" : [0.0], "Min" : 0, "Max" : 4},
            {"Name" : "g_BiomeMap", "@factory" : "ShaderVarTextureStatic", "Values" : ["biome"]},
            {"Name" : "g_SpriteOffsetY", "@factory" : "ShaderVarFloat", "Values" : [0.5], "Min" : 0, "Max" : 1},
            {"Name" : "g_TileMapRects", "@factory" : "ShaderVarTextureBufferStatic", "Values" : ["dcss_rects"]},
            {"Name" : "g_TileMap", "@factory" : "ShaderVarTextureStatic", "Values" : ["dcss"]},
            {"Name" : "g_ResourcesMap", "@factory" : "ShaderVarTextureStatic", "Values" : ["resources"]}
        ]
    }},
    { "GlobalFlashing" : {
        "@factory" : "ShaderVarsSeparate",
        "ShaderVars" : [ 
            {"Name" : "g_FlashMinIntensity", "@factory" : "ShaderVarFloat", "Values" : [0.5], "Min" : 0, "Max" : 1},
            {"Name" : "g_FlashMaxIntensity", "@factory" : "ShaderVarFloat", "Values" : [1.0], "Min" : 0, "Max" : 1},
            {"Name" : "g_FlashPeriod", "@factory" : "ShaderVarFloat", "Values" : [2.0], "Min" : 0, "Max" : 5}
        ]
    }}
],
// Some renderers. They can use shadervar blocks
{"OverworldDense" : {
    "@factory" : "RendererGrid2Dense",
    "Fx" : "OverworldDense",
    "ShaderVars" : ["GlobalPerFrame", "GlobalOverworld"],
    "DepthTest" : true
}},
{"GridSparseHighlight" : {
    "@factory" : "RendererGrid2Sparse",
    "Fx" : "GridSparseHighlight",
    "TextureSamplers" : { "g_TileMap" : "nearest_clamp" },
    "ShaderVars" : ["GlobalPerFrame", "GlobalFlashing","GlobalOverworld"],
    "DepthTest" : true
}},
// A renderable. They can use local shadervar blocks
{ "griddense" : { 
    "@factory" : "RenderableTileGrid2Widget",
    "Renderer" : "OverworldDense",
    "ShaderVars" : {
        "@factory" : "ShaderVarsSeparate",
        "ShaderVars" : [
            {"Name" : "g_Color", "@factory" : "ShaderVarColor", "Values" : [[255,255,255,255]]}
        ]
    }
}}

Note: The reason I’m using an additional ShaderVars abstraction is because in the future I want to consider having uniform buffer objects for many shader variable blocks, as it’s more optimal. But of course, this will only happen when the slowdowns begin, which is not now.

So, that’s it for this time. I’m also currently toying with introducing framebuffer objects in the system (so that renderers and renderables can be configured via script to render to an offscreen surface) so that we can have more flexible render paths. And also what’s coming is an autotiling implementation, using all these.

Automatic pixel art from photos

Disclaimer: Properly authored pixel art is awesome. Automated pixel art is fast food: great when you don’t have enough money (to hire) or time (to author). And does the trick when you’re starving.

I’m not an artist, I love pixel art, and frequently I want something here and now. So, how do I get copious amounts of pixel art without bugging a pixel artist or becoming one myself? Software of course. The style that I’m after is retro 90’s look: slightly pixelated and with a limited, painterly color palette. A few examples:

Old game art, fantastic colours, painterly look:

Pixel art, great mood and selection of colours


Great tile design (sprites are sometimes too “cute” for me unfortunately) and great colours. I bought them as I love them! 🙂

So, while I don’t hope to automatically generate stuff of quality like the above from photos, I made a tool to convert landscape photos to pixel-art style.  There are two components to the process:

  • Palettization (Mapping full colour range to a limited palette)
  • Pixelation (Downscaling the image to look a bit retro)

My approach is quite simple, and is as follows:

  • Load the source image
  • Select a color difference function (I used code from here)
  • Convert image pixels to color space used by the difference function
  • Select a palette. I got some from here. Additionally, I got all the unique colors in Oryx tileset and made a palette out of them too (the largest palette: about 1000 colors)
  • Convert palette pixels to color space used by the difference function
  • For each pixel, select the palette entry that minimizes the color difference between itself and the source pixel.
  • Downscale the image by a factor. For each block of pixels NxN that corresponds to 1×1 pixel in the downscaled image, fetch the palette color that appears the most times.
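
The last two steps can be sketched in a few lines of plain Python. Note the swap: this example uses squared Euclidean distance in RGB as the colour difference purely to stay self-contained, whereas the tool described here uses the CIE difference functions in their own colour spaces. Images are assumed to be lists of rows of (r, g, b) tuples, and all function names are made up.

```python
from collections import Counter

def nearest_palette_index(pixel, palette):
    """Index of the palette colour with the smallest squared RGB distance to `pixel`."""
    return min(range(len(palette)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])))

def palettize(image, palette):
    """Replace every pixel by the index of its closest palette entry."""
    return [[nearest_palette_index(p, palette) for p in row] for row in image]

def downscale_mode(indices, factor):
    """Replace each factor x factor block by its most frequent palette index."""
    h, w = len(indices) // factor, len(indices[0]) // factor
    out = []
    for by in range(h):
        row = []
        for bx in range(w):
            block = [indices[by * factor + y][bx * factor + x]
                     for y in range(factor) for x in range(factor)]
            row.append(Counter(block).most_common(1)[0][0])
        out.append(row)
    return out
```

Picking the most frequent index per block, rather than averaging colours, keeps every output pixel on the palette.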

And that’s it! So, I tried the above on a few images (found in google, none of them is mine), and I got … very mixed results. Below I’ll show the original image and a few good/bad/quirky results.


Cie94 w/ Oryx, downscaled 2x. Good

Cie2000 w/ famicube, downscaled 2x. Not good

Cie94, GraphicArts w/ psygnosia, downscaled 2x. Lo-spec but good!


Cie94 GraphicArts w/ Oryx, downscaled 2x, good

Cie94 GraphicArts w/ famicube, downscaled 2x, good. Sharks are a bit of a problem as the sea water bleeds in

Cie1976 w/ Endesga-16, downscaled 2x, bad.

Euclidean distance w/ Oryx, downscaled 2x, bad


Cie2000 w/ oryx, downscaled just 1x, it’s way too realistic.

Cie94 GraphicArts w/ Oryx, downscaled 3x, a bit better, but still a bit realistic

Cie2000 GraphicArts w/ Endesga-32, downscaled 3x. Not as realistic, but a bit worse quality.


Cie1976 w/ oryx, downscaled 1x. A bit too realistic

Cie1976 w/ famicube, downscaled 1x. A bit too damaged and noisy.


Cie2000 w/ oryx, downscaled 1x. A bit too realistic. Additional downscale would destroy the geometrical details.

Cie2000 w/ famicube, not good.

Cie94 Textiles w/ psygnosia, quite bad.

Underwater ruins

Cie2000 w/ oryx, downscaled 2x. This looks great! Good for a change.

Cie2000 w/ famicube, not so great.

Cie2000 w/ psygnosia, not great either. the water is gray and the shark is bluish. Well … no.


Cie2000 w/ oryx, doable

Cie2000 w/ endesga, quite bad, but at least it’s good at making the JPG artifacts very very visible.

Cie2000 w/ psygnosia, not that bad actually! Even if quite lo-spec.


Cie94 Textiles, w/ aap64, downscaled 3x. A bit too damaged, but I like it

Cie2000 w/ oryx, downscaled 3x. It’s good, but a bit too realistic


So, the experiment was a failure, but I learned a few things from it:

  • Most important: The visual appeal of the results greatly depends on the colours used in the original. A grayish brown image won’t magically transform to colourful, just because the target palette is. And a simple color distance doesn’t solve the issue. We need a more sophisticated color transfer
  • Distinguish between surface texture and geometric silhouettes: surface texture colours need to be somewhat flattened, while silhouettes need to be preserved
    • could use a bilateral filter, and edge detection
  • Consider dithering. Can reduce color error, but do we want that? It certainly helps with the blotches/banding.
  • Using a palette with lots of colours doesn’t mean we should strive to use all of them. The color distance metric tries to preserve the original colours, which would be realistic. We don’t want that.
  • Pick the brains of pixel artists for their approach (Duh)
  • Use high quality images, with minimal JPG compression artifacts. (Duh, but I was too lazy for this one)
  • Use Photoshop/GIMP/etc. The more sophisticated the algorithm gets, the more tedious it is to write/update a custom tool to do that.