Demiurge: On Procedural Generation

I avoided the term procedural generation in my first post about Demiurge because I didn’t want to define it. However, since Demiurge itself is most accurately described as “a procedural terrain generation engine,” I should probably clarify what that means.

Procedural generation broadly refers to the creation (generation) of some form of data by means of an algorithm or automated routine (procedure). Many kinds of data can be generated in this way, but probably the most famous usage is for the generation of simulated terrain. You’ve probably encountered procedurally generated terrain at some point. Commercial products like World Creator and Terragen have customers in the marketing and film industries — many “fly over” scenes have procedural terrains in the background — and video games like No Man’s Sky and Minecraft rely on procedural generation, and the theoretically infinite worlds it enables, as part of their core offering.

[Note: The rest of this post becomes somewhat narrative, describing my experience leading up to the first coding work on the Demiurge Project. There is some technical information in there, but it’s mostly anecdotal; if you’re just looking for tech, it’s probably safe to skip to the next blog post. Also, as something of a disclaimer, I’ve taken some “creative liberties” with the particulars and ordering of events in Demiurge’s development. There are two reasons for this: (1) the actual events described were quite disjointed and spread out over a very long time, and (2) some of these events took place years ago and I don’t quite remember exactly how everything transpired.]

Procedural generation has been around for a very long time, and (as you can see from the links above) many products have been made to provide this technology. When I first realized I was going to need a procedural generation engine to help me create my maps, I didn’t expect to make one myself; I assumed I would just find an existing product and license that. However…

State of the Market: Products

Circa 2017 (and probably today as well, though I haven’t checked lately), procedural generation products were intended for two kinds of people: professional digital creators and Minecraft players. I won’t dwell on the products intended for Minecraft players; they’re fun, but their use case was so fundamentally different from mine that there was essentially no overlap. The products intended for digital creators were much more promising, but in the end, it just wasn’t meant to be.

Digital creators (CGI artists, game developers, etc.) spend their lives learning and using tools to create things in 3D. Utilities built for these professionals often offer an incredible amount of control, detail, and power; but because of that, they are expensive and insanely hard to use. Moreover, because of their focus on visual fidelity, nearly all of these products had to make significant sacrifices with regard to scale. In short, these tools are designed to generate constrained, often single-shot visual outputs that look flawless, and they demand the energy and expertise necessary to produce something like that.

Screenshot from my own copy of WorldCreator 1.0. It’s complicated, it’s constrained, and it’s costly, but wow does it look pretty!

But for my purposes, I didn’t really care how the output looked. (Well, I cared a little.) I also didn’t have the budget, in time or money, to start learning an extremely advanced tool in an unfamiliar field. What I needed was coherent and readily-available information. How big is this map? Where are the major waterways? If a character were to walk from A to B, how many rivers would they have to ford? How many high mountain passes would they cross? How many days would they have to walk, and how diligent would they have to be about carrying their own supplies given terrain hostility and frequency of permanent settlements along the route?

What do you mean, “Those aren’t standard features”? Why not? Is there not some kind of “narrative plausibility module” somewhere in your visual art software?

State of the Art: Academia

Ultimately, I came to the conclusion that if I wanted a procedural terrain engine suited to my needs, I would have to make it myself. Thus, with substantially more enthusiasm than that last statement might imply, I began to look past the existing procedural generation engines toward the source of the ideas that powered them: academia.

I should clarify: when I say academia, what I really mean is research papers, also commonly referred to as white papers. Documents like that are ideologically associated with academia proper, which is why all these terms are used somewhat interchangeably. In the software industry, however, research papers are produced by a wide variety of institutions: academies, corporate R&D bodies, standards organizations, and even unaffiliated individuals (Bram Cohen famously created the phenomenally influential BitTorrent protocol essentially on his own). Overwhelmingly, these sorts of research papers are made available for free on the Internet.

Academic and industrial interests often go hand-in-hand in the software industry, so I wasn’t expecting the capabilities I needed to be very popular as research topics. However, the body of research on procedural generation is vast and varied (at a guess, I’d say 5% at most of researched topics have been turned into actual commercial products), and while I didn’t find exactly what I was looking for, there was more than enough information to get me started.

These weeks, when I was just reading research papers and building an understanding, were the very beginning of my efforts on Demiurge, and there were three papers in particular that influenced me at the time: F. Kenton Musgrave’s Procedural Fractal Terrains, which I believe is actually a chapter of a book; Realtime Procedural Terrain Generation by Jacob Olsen; and Terrain Generation Using Procedural Models Based on Hydrology by Genevaux et al. Each of these papers represents a different “axis” of procedural terrain generation.

The Musgrave document is not so much a research paper as a high-level technical overview, but the information it contains is so valuable that I consider it the foundation upon which everything else is built. The most important idea that I learned from this paper was that noise, such as Perlin or Simplex noise, could be transformed into surprisingly plausible representations of real-world topographies using a small number of very simple operations. I had, of course, seen “fractal terrains” before (see page 2-3), but it wasn’t until I saw the Bryce 4 “ridges” output (page 2-9) and Musgrave’s description of how that was done that I realized techniques like this could be used to give usable, believable representations of mountainous topographies. I’ll speak in more detail about these techniques when I post about mountain noise. For now, though, just know that it was while reading the Musgrave paper that it first hit me: “I might actually be able to pull this off.”

I had a slightly different reaction to the Olsen paper. That paper deals with a lot of topics, but I was particularly interested in its discussion of erosion because I was evaluating whether Demiurge should be geology-first or hydrology-first: in short, whether rivers should be placed around mountains or whether mountains should be placed around rivers. Geology- versus hydrology-first is important and complex enough to merit its own post — in fact, I think that will be the subject of the next Demiurge post — but it was reading the Olsen paper (and other, similar geology-first papers) that led me to the all-important conclusion: Demiurge should definitely be hydrology-first.

Once I knew I was looking for hydrology-first algorithms, I didn’t have to read many more papers before I felt sure of that conclusion. Most procedural generation papers are geology-first, hydrology-second (or geology-first, hydrology-never); but as soon as I’d read the Genevaux paper, I knew hydrology-first was right for Demiurge. The idea of placing mountains around a predetermined river system seemed strange at first because it’s backwards from how the real-world process seems to work. But Genevaux’s results were impressive and plausible enough that I decided not to dismiss the approach without first having tested it myself. That “test” is still a core component of Demiurge today.
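To make "placing mountains around a predetermined river system" a little more concrete, here is a deliberately tiny Python sketch. It is neither the Genevaux algorithm nor Demiurge's implementation. It assumes the river is already given as a list of grid cells ordered from mouth to source, gives each river cell an elevation that rises upstream, and then grows the surrounding land outward with a breadth-first search so that every land cell sits higher than the cell it was reached from. The function name and the `river_slope` and `bank_slope` parameters are hypothetical.

```python
from collections import deque

def hydrology_first_heightmap(width, height, river, river_slope=1.0, bank_slope=2.0):
    """Build a heightmap around a predetermined river (illustrative sketch only).

    river: non-empty list of distinct (x, y) cells inside the grid,
           ordered from mouth to source and connected by shared edges.
    Returns a list of rows; elevation[y][x] is the height of cell (x, y).
    """
    elevation = [[None] * width for _ in range(height)]
    queue = deque()

    # Step 1: the river comes first. It climbs steadily from mouth to source.
    for i, (x, y) in enumerate(river):
        elevation[y][x] = i * river_slope
        queue.append((x, y))

    # Step 2: land is grown outward from the river. Each new cell is placed
    # above the cell it was reached from, so it always has a downhill neighbor.
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and elevation[ny][nx] is None:
                elevation[ny][nx] = elevation[y][x] + bank_slope
                queue.append((nx, ny))
    return elevation

# Usage: a straight river running along row 4 of an 8x8 grid.
river = [(x, 4) for x in range(8)]
terrain = hydrology_first_heightmap(8, 8, river)
```

Because every cell other than the river mouth has at least one strictly lower neighbor, water dropped anywhere on this toy map can always run downhill to the river and then to the mouth. Drainage that works by construction is the appeal of the hydrology-first ordering, even though a real engine needs far more than a constant slope to look like mountains.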

And so, with these three papers in hand, I was finally ready to begin writing code for the Demiurge Project. I didn’t need what I’d learned from all of the papers immediately, though elements of all three eventually made it into the project. In the beginning, however, I simply knew that I was going to try to make a hydrology-first procedural terrain generation engine, so the first task would be to procedurally generate a river system. The resulting prototype was the very first bit of functioning code I wrote for Demiurge.

But that will be the subject of another post.

–Murray