Where we're going we don't need roads

This is the fourth article in the Back to the Future theme week series.


Welcome to the future!

Oct 21, 2015 shown on the Back to the Future time machine's display.

In the movie Back to the Future Part II (1989) the protagonists travel to the future, to the distant date October 21, 2015. That day is today, so we'll have a look at what parser interactive fiction looks like — and could look like — in this futuristic year of 2015.

Let's start by establishing the baseline. The grand illusion is that an average parser IF story looks like this.

An abstract but complex node graph showing multiple nodes branching and convening

Not a parser story structure. [source]

This recent document from One More Story Games categorizes interactive story flows. Into which category does an average parser IF story fall? It's not what the document says it is.

An average parser IF story flow looks something like this.

Contrary to popular belief, the traditional story structure of parser IF is strictly linear. The gameplay is more often than not a linear stream of puzzles leading to one conclusion. In other words, the story in each playthrough is more or less identical regardless of what actions you take during the game. You might have the choice to solve some puzzles in any order, or choose the order in which you talk to NPCs, or uncover pieces of the backstory at your own pace, but the number of meaningful choices is almost always effectively zero.

I know this is blasphemous, but to quote an unrelated movie: search your feelings, you know it to be true.

The illusion of free choice is created by the geography. Note that the PDF linked above has fallen into this same trap. It puts IF into a "connected map" category on page 2, but it's conflating story and world geography, which are completely different things. Moving around in the game world doesn't count as a story. This bears repeating: an open game world does not equal a branching narrative.

That is not to say that all parser IF is linear, but there are only a handful of exceptions. It is on one hand a little bit surprising, considering the impact of Galatea (2000), and on the other not at all surprising, considering the amount of work needed to pull it off and the lack of tools designed for that specific purpose.

This is not criticism. Branching narrative is not and should not be a value by itself. But let's not fool ourselves by pretending that parser games are the shining paragon of branching narrative.

What are parser IF's strengths then? Here are a few:

  • World exploration
  • Storytelling
  • Adventure game puzzles
  • Wordplay

By adventure game puzzles I mean the standard use-thing-on-other-thing puzzles, which are pretty much fully explored by now. There might be something new to discover, but mostly everything is a variation of things already seen hundreds of times.

To me world exploration is still the main selling point of IF, and the world model is where I see the biggest potential going forward.

Sandbox worlds

After that rather lengthy introduction, let's take a look at another kind of map.

Partial map of Zork I. [source]

This resembles the earlier storybook structure quite a lot. The difference is that it depicts a physical map of a parser game's geography instead of its story structure.

If we lay out a linear plot on top of the spatial map, it might look something like this (imaginary example):

Linear plot laid out on top of Zork's map

The red dots are story events placed on the locations where they happen. The story world exists only to support the plot, or to serve as a setting for the puzzles. And there's nothing wrong with that – story comes first. But what if we turned the setup the other way around?

Here's the story structure of AAA sandbox games like Grand Theft Auto:

Sandbox story flow. [source]

There's a central storyline, and side missions sprinkled around the map that are unlocked as you progress in the game. Once the side missions become available you can complete them in any order.
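The unlocking logic is simple enough to sketch in a few lines. This is only an illustration under my own assumptions: the `Mission` class, the `available` function and the mission names are all made up for the example and don't come from any existing IF system.

```python
from dataclasses import dataclass


@dataclass
class Mission:
    name: str
    # Names of missions that must be completed before this one unlocks.
    requires: frozenset = frozenset()
    main_story: bool = False


def available(missions, completed):
    """Return the names of missions that are unlocked but not yet completed."""
    return [
        m.name
        for m in missions
        if m.name not in completed and m.requires <= completed
    ]


# Hypothetical content: a linear main storyline plus two side missions
# that unlock once the first chapter is done.
MISSIONS = [
    Mission("chapter 1", main_story=True),
    Mission("chapter 2", frozenset({"chapter 1"}), main_story=True),
    Mission("chapter 3", frozenset({"chapter 2"}), main_story=True),
    Mission("side: lost cat", frozenset({"chapter 1"})),
    Mission("side: street race", frozenset({"chapter 1"})),
]

print(available(MISSIONS, set()))
print(available(MISSIONS, {"chapter 1"}))
```

At the start only the first chapter is on offer. Once it's completed, the second chapter and both side missions open up at the same time, and the player is free to pick any of them next.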

Parser IF already has the geography of a sandbox world, so why not take advantage of the fact? Instead of building the world to support the story, we could make the world the primary focus. Now the plot becomes a string of interconnected, self-contained "missions".

This requires some additional work, but also has a payoff. It would also require stepping away from the strict expectations of realism in IF, the same way that GTA and other sandbox games are often somewhat unrealistic: after failing a mission the game world resets and lets you retry as many times as you want, for example.

Crowdsourced content

One largely untapped potential is shared worlds. There are some shared settings like the Andromeda series, Flexible Survival and Kerkerkruip. Andromeda is a series of games written by different authors sharing the same setting. Flexible Survival and Kerkerkruip are single games written and continuously expanded by multiple people. Flexible Survival is an especially impressive result and by far the largest parser game ever made, but sadly a pornographic furry game is unlikely to ever get mainstream recognition no matter what its technical achievements are.

What I'm envisioning is a multiplayer setting like Guncho but with a common world and developer tools built into the environment. It would also help with the content creation problem.

In a shared environment the players could make content as they're playing by filling in the blanks that have not yet been completed, or creating their own spaces inside the shared world. MUDs offer this kind of functionality but they often lack the kind of direction I'm looking for: letting everyone do whatever they want will usually create a mishmash of varying quality. There would have to be some kind of editorial role that would maintain integrity and quality.

This is something that has seen some implementations (mainly on the MUD front) but they lack the final touch that would make it actually work outside niche environments.

Alternative interfaces and genre hybrids

What is the key feature of parser IF? For some it might be the parser itself, but to me it's the world model. If we left everything else as is but changed the user interface to something else, I think it could work just as well. In the current mobile-dominated landscape there's pressure to move away from typing, primarily because it is a significantly inconvenient mode of interaction on touchscreen devices, and secondarily because the parser is generally considered to have a steep learning curve for newcomers.

The current alternatives, mainly turning parts of story text into clickable hyperlinks, with or without context menus that appear on click (for example, clicking on the word "door" opens up a menu with verbs "open", "close", "unlock" and so on), haven't been very successful. I don't have an immediate suggestion as to what would be a working alternative, but I'm sure there's something that could be made to work. More experimentation is needed.

Taking the alternative interface idea even further, parser IF would have a lot to offer to other gaming genres. What parser IF really does well is the world model. Using an IF engine to power a non-IF game could create spectacularly deep game worlds. Even a graphical adventure game using an IF engine under the hood could be worth trying.

Shark still looks fake

This is the third article in the Back to the Future theme week series.


A holographic shark in the future

The parser is a curious piece of technology. Since its first appearance about 40 years ago it has survived almost unchanged to this day, apart from superficial improvements.

Parsers come in two generations of sophistication. The first generation is the now practically extinct two-word parser that only understands commands in the form of VERB or VERB NOUN. The second generation is the modern parser that understands VERB NOUN PREPOSITION NOUN and VERB [NOUN PREPOSITION] TEXT where "text" is freeform input.
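To make the difference concrete, here's a rough sketch of both generations. It is a toy under my own assumptions, not how any real development system implements its parser: the vocabulary sets and both function names are invented for the example, and it ignores articles, pronouns and disambiguation entirely.

```python
# Hypothetical vocabulary; a real game would have far more of each.
VERBS = {"look", "take", "open", "put", "unlock", "say", "ask"}
PREPOSITIONS = {"in", "on", "with", "to"}
# Verbs that end in freeform text, with the word that introduces the text.
FREEFORM_VERBS = {"say": None, "ask": "about"}


def parse_two_word(command):
    """First generation: VERB or VERB NOUN, anything else is rejected."""
    words = command.lower().split()
    if not words or len(words) > 2 or words[0] not in VERBS:
        return None
    return {"verb": words[0], "noun": words[1] if len(words) == 2 else None}


def parse_second_gen(command):
    """Second generation: VERB NOUN PREPOSITION NOUN, or VERB [NOUN PREPOSITION] TEXT."""
    words = command.lower().split()
    if not words or words[0] not in VERBS:
        return None
    verb, rest = words[0], words[1:]
    if verb in FREEFORM_VERBS:
        prep = FREEFORM_VERBS[verb]
        if prep in rest:
            i = rest.index(prep)
            return {
                "verb": verb,
                "noun": " ".join(rest[:i]),
                "preposition": prep,
                "text": " ".join(rest[i + 1:]),
            }
        return {"verb": verb, "text": " ".join(rest)}
    # Split on the first preposition that has words on both sides of it.
    for i, word in enumerate(rest):
        if word in PREPOSITIONS and 0 < i < len(rest) - 1:
            return {
                "verb": verb,
                "noun": " ".join(rest[:i]),
                "preposition": word,
                "second_noun": " ".join(rest[i + 1:]),
            }
    return {"verb": verb, "noun": " ".join(rest) or None}


print(parse_two_word("PUT KEY IN BOX"))
print(parse_second_gen("PUT KEY IN BOX"))
print(parse_second_gen("ASK BOB ABOUT TIME MACHINE"))
```

The two-word parser gives up on PUT KEY IN BOX, while the second-generation one splits it into a verb, two nouns and a preposition. Both still depend completely on the word lists the author has supplied.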

The third generation, which we don't yet have, is a parser that understands any reasonable input and is able to transform the player's intent into lower level tokens for the story engine. For example, a third-generation parser would know how to interpret the intent from OK LET'S TAKE A PEEK INSIDE THAT MAILBOX NOW and tokenize it into [LOOK IN] [MAILBOX], all without the author having to anticipate and write grammar rules for that specific input or even those specific words.

The good news is that we probably have the technology to make a third-generation parser, at least to a reasonable extent. It would solve a big part of the age-old tutorial problem and remove many causes of frustration associated with the parser. It would take a dedicated team some serious effort and university-level research but it's certainly doable.

The bad news is that there's another piece of the puzzle that needs to complement the parser, or that kind of sophistication would completely go to waste. It's not enough for the story to understand input; it also has to respond to it appropriately.

Let's look at an example. In the past a relatively common complaint was that the parser didn't understand commands with adverbs, like OPEN DOOR CAREFULLY. In reality patching the parser to recognize adverbs is trivial in most modern development systems. A story that "understands" adverbs would be expected to respond to input something like this:

CLOSE DOOR

You close the door.

CLOSE DOOR QUIETLY

You close the door, careful not to make a sound.

CLOSE DOOR AGGRESSIVELY

You slam the door closed!

Where are those different responses coming from? It's not the parser that spawns them. Someone has to write the text, either as default responses to the CLOSE DOOR [ADVERB] action or as custom responses for interacting with this specific door.

Realistically you'd group synonymous and closely related adverbs together, which would leave you with maybe 3-5 separate adverb groups. Even in the best case scenario you'd have to write up to five extra custom responses for every action to take into account all reasonable user commands.
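As a rough illustration of what that grouping means for the author, here's a minimal sketch. The group names, the adverb list and the `respond` function are my own inventions for the example; the three response texts are the ones from the door transcript above.

```python
# Hypothetical adverb groups: synonyms collapse into one group name.
ADVERB_GROUPS = {
    "quietly": "careful", "carefully": "careful", "gently": "careful",
    "aggressively": "forceful", "violently": "forceful", "angrily": "forceful",
    "quickly": "fast", "hastily": "fast",
}

# Hand-written responses keyed by (verb, noun, adverb group).
# The group None holds the plain response with no adverb.
RESPONSES = {
    ("close", "door", None): "You close the door.",
    ("close", "door", "careful"): "You close the door, careful not to make a sound.",
    ("close", "door", "forceful"): "You slam the door closed!",
}


def respond(verb, noun, adverb=None):
    """Pick the most specific response, falling back to the plain one.

    Raises KeyError if the action has no plain response at all.
    """
    group = ADVERB_GROUPS.get(adverb)
    return RESPONSES.get((verb, noun, group), RESPONSES[(verb, noun, None)])


print(respond("close", "door", "quietly"))
print(respond("close", "door", "quickly"))
```

The lookup itself is trivial. The cost is in the `RESPONSES` table: every row is text that someone has to write and test, and any adverb group without its own row (like "fast" here) silently falls back to the plain response.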

After the author has done all that work, the end result is perhaps mildly interesting for the player to explore but as yet has no real meaning within the story. It's not really worth the huge amount of extra work just to acknowledge the adverbs the player has used by varying the game's responses. If you drop a glass violently instead of carefully, shouldn't it break? Shouldn't NPCs react differently if you talk to them amicably or aggressively? How should the parser respond if the command is nonsensical, like CLOSE DOOR LOVINGLY?

Any adverb-aware system that had any real effect on the gameplay would suffer from a combinatorial explosion of both all the extra responses that would need to be written and the results of actions that it would have to take into account. Making such a game wouldn't be beyond imagination but it would practically require dedicated effort from a full-time team. Apart from trivially short works it wouldn't be feasible for a solo hobbyist.

Doc Brown wearing a "mind reading device" on his head

"Do you know what this means? It means that this damn thing doesn't work at all!"

To further illustrate why the parser and prose are inseparable, consider a story with a parser that has a human-level understanding of language but the story engine isn't sophisticated enough to deal with the input.

First a neutral command. This is what you'd generally see in any standard parser game.

TAKE ALMANAC

You reach for the almanac very carefully, holding your breath so that Mr. Strickland won't notice you...

Now imagine a more complex command:

DASH TOWARDS THE ALMANAC AND GRAB IT QUICKLY

The story's prepared response to a neutral TAKE ALMANAC command would be almost completely the opposite of what the player's intention is. If the story engine ignores everything in the command except the basic intent, the result is this:

DASH TOWARDS THE ALMANAC AND GRAB IT QUICKLY

You reach for the almanac very carefully, holding your breath so that Mr. Strickland won't notice you...

The effect is jarring, and because the real command is masked, it looks like the parser has almost completely misunderstood the player's intent.

Another option would be to communicate the lower level command to which the parser reduces the original command.

DASH TOWARDS THE ALMANAC AND GRAB IT QUICKLY

[–> TAKE ALMANAC]
You reach for the almanac very carefully, holding your breath so that Mr. Strickland won't notice you...

This would justify the discrepancy between the input and the response, but exposing the internal workings of the parser would make the complex parser even more of a gimmick. Once the player notices the pattern there's no point in continuing to write the "natural" phrases when it's obvious that they're just going to be reduced to the bare minimum. (As a teaching device it wouldn't be that bad though.)
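The reduce-and-echo step from the transcript above could be sketched like this. Note that the keyword matching here is only a crude stand-in for real language understanding, which is the hard part and which this sketch doesn't attempt. The keyword table, the noun list and both function names are assumptions made up for the example.

```python
# Hypothetical keyword table standing in for real language understanding.
INTENT_KEYWORDS = {
    "take": {"take", "grab", "get", "snatch"},
    "drop": {"drop", "discard"},
}
KNOWN_NOUNS = {"almanac"}


def reduce_command(raw):
    """Reduce freeform input to a (verb, noun) pair, or None if unclear."""
    words = raw.lower().replace(",", " ").split()
    verb = next(
        (v for v, keys in INTENT_KEYWORDS.items() if keys & set(words)), None
    )
    noun = next((w for w in words if w in KNOWN_NOUNS), None)
    if verb is None or noun is None:
        return None
    return verb, noun


def echo(raw):
    """Format the reduced command the way the transcript above shows it."""
    reduced = reduce_command(raw)
    if reduced is None:
        return None
    verb, noun = reduced
    return f"[-> {verb.upper()} {noun.upper()}]"


print(echo("DASH TOWARDS THE ALMANAC AND GRAB IT QUICKLY"))
```

Everything in the input except GRAB and ALMANAC is thrown away, which is exactly the pattern the player would soon learn to exploit by typing TAKE ALMANAC directly.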

The third option is to discard any command that conflicts with what the story is expecting, but that would not end well. It would lead to either horrible guess-the-verb-and-adverb puzzles or incredible frustration when the parser seemingly understands the basic intent but refuses to carry out the action.

The final option is to write separate responses for every type of intent, but this has the same problems as mentioned above, most notably the multiplied effort required to write the text and test all the combinations.

To sum it up: A system with a mismatch between the capabilities of the parser and the capabilities of the engine is not viable. The two are intrinsically linked. We're in a situation where improving the parser requires advancements in technology in several areas, both in understanding the input and generating the response. A smart parser is somewhat feasible, but only if the content generation problem is solved.

If all this sounds too pessimistic, fear not! Tomorrow we'll explore some untapped potential that could already be available with the tech we have now.