Tuesday, September 9, 2008

Content Pipelines and Build Processes (Part One?)

Perhaps the most important thing I've learned since graduating from the Guildhall and working in a couple of professional studios is the effectiveness of a good content pipeline and build process. These are things that were mentioned in passing at Guildhall lectures but that didn't register with me at the time. Since then, two experiences have shown me why content pipelines matter. The first was working with a bad pipeline and a worse build process at my former company. The second was needing to roll my own pipeline for my hobbyist engine.

To be clear, what I mean by content pipelines and build processes are the answers to the questions "How do you get data into your game?" and "How do you produce code and data builds?" respectively. The two problems are closely related, but build processes are sometimes limited in scope to only building code, with data being processed at a different time (especially when baking content for consoles).

Given the popularity of the Unreal Engine, it's worth looking at how their content pipeline works. The decision was made long, long ago for the original Unreal that the game's raw and baked files should be the same (at least for levels and asset packages). There are a lot of interesting implications to that, and it's probably worth arguing whether the original rationale for this decision is still valid today, but that's the way it has stayed for over ten years of development of the engine. What it means with regard to the pipeline and build process is that there is no process.

In Unreal, content creators import raw files (textures, meshes, etc.) and produce package files--binary bundles that blur the line between raw and baked because they can be edited in UnrealEd but are also ready for use at runtime. Likewise, level files never need to be exported or baked into any other form; the raw level files that designers work on are the same files that the player loads in the finished game. Of course, Unreal does have baking of a sort, but it happens in-editor, when the designer clicks the Build button, instead of as an automated process, and the product is saved into the original file. For PC-based Unreal games, that's more or less all there is. Iteration time is fast, because all the data is already in its final form. For console games, the data will be baked for the target platform, which means fixing the endianness, building seek-free packages for fast loading, etc. It's a bit slower but not terrible, and the baked data can be cached and only rebaked when it changes, so iteration time is still relatively light.

In stark contrast to the Unreal model is the pipeline and build process used at my former place of employment. We used an in-house-developed engine, and there seemed to have been no clear vision as it grew, resulting in a monstrous pile of conflicting tools and methodologies. That doesn't mean that the general design of the pipeline was bad; it was just burdened with years of baggage from hastily made, myopic decisions.

For most assets in this engine, the raw files were standard filetypes (TGAs for textures, AIFF for samples, INIs for configurations, etc.). The rest were custom text formats for worlds, particle definitions, and other non-standard asset types. (Unrelated note: As far as I recall, there were no custom binary formats; however, the text files that the world editor produced were nondeterministic, with the contents reordered apparently at random each time the file was saved.) Because the raw files and baked files were different formats, every file had to be baked before it could be viewed in the game. That's pretty common. Making it more difficult was a complex dependency chain that I have to assume was a too-early step to produce seek-free packages. If a texture was changed, it was not enough merely to rebake the texture; every world, mesh, or UI element that referenced that texture needed to be rebaked as well. And there was no automated dependency checking. And the dependency graph was so convoluted that no programmer on the team understood it, let alone the content creators. The implications? Almost every time a content creator changed anything, they rebuilt everything. That was an iteration time of almost two hours, sometimes just to verify something as simple as a texture change.
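To make the problem concrete, here's a minimal sketch of that transitive-rebake behavior. The asset names and the dependency table are entirely hypothetical; the point is just that when baked files embed copies of their dependencies, one changed texture dirties everything that (directly or indirectly) inlines it:

```python
# Hypothetical dependency table: each baked asset maps to the assets
# whose data it embeds a copy of at bake time.
DEPENDENCIES = {
    "world_01":  ["mesh_rock", "ui_hud"],
    "mesh_rock": ["tex_stone"],
    "ui_hud":    ["tex_stone", "tex_font"],
}

def assets_to_rebake(changed_asset):
    """Return every asset that must be rebaked when changed_asset changes,
    computed as the transitive closure of 'depends on' over the table."""
    dirty = {changed_asset}
    while True:
        # Sweep for assets that depend on anything already dirty.
        newly_dirty = {
            asset
            for asset, deps in DEPENDENCIES.items()
            if asset not in dirty and any(d in dirty for d in deps)
        }
        if not newly_dirty:
            break
        dirty |= newly_dirty
    return dirty

# Changing one texture dirties the mesh and UI that inline it,
# and then the world that inlines the mesh.
print(sorted(assets_to_rebake("tex_stone")))
```

With no automated version of even this simple check, content creators had no way to know what was stale, which is why "rebuild everything" became the default.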

Ironically, I decided after that experience to implement a very similar pipeline in my hobby engine. It's a fairly obvious paradigm: develop raw files, bake them, run the game with the baked files. Having noted that the problems I observed on the aforementioned project were due to issues other than this core design, I figured that I could implement a simple system this way and use my experience to avoid the pitfalls. In practice, it has worked surprisingly well. Given that my only target platform is the PC, I don't have to worry about seek-free packages, so there are no dependency issues in the build process. When I change a texture, I only need to rebake the texture file. Worlds and meshes simply reference the texture by name instead of storing the texture inline. (As an aside, it seems that it would be far simpler to do this seek-free stuff as a late post-bake step, the way Unreal does it. I haven't thought it through completely, but I can't see why it would be done any earlier.) I use timestamp checking to determine what needs to be rebaked, so builds are minimal. Iteration time is only an issue when building world files, because of the heavy lighting calculations they require. Most other changes take less than a second to build. After baking, I can either use the loose baked files to run the game or zip everything up into compressed package files.
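The timestamp check at the heart of that minimal build is simple enough to sketch in a few lines. This is a generic illustration, not my engine's actual code; the `bake_texture` callback and file extensions are placeholders:

```python
import os

def needs_rebake(raw_path, baked_path):
    """A baked file is stale if it doesn't exist yet, or if its raw
    source was modified more recently than the last bake."""
    if not os.path.exists(baked_path):
        return True
    return os.path.getmtime(raw_path) > os.path.getmtime(baked_path)

def bake_all(assets, bake_fn):
    """Rebake only the stale assets; up-to-date files are skipped,
    which is what keeps incremental builds near-instant."""
    for raw_path, baked_path in assets:
        if needs_rebake(raw_path, baked_path):
            bake_fn(raw_path, baked_path)

# Hypothetical usage: each raw file pairs with one baked output.
# bake_all([("rock.tga", "rock.tex")], bake_texture)
```

Because each baked file depends on exactly one raw file (textures are referenced by name, never inlined), this one comparison is the entire dependency analysis.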

There's more I could say on this subject, so I'll probably be following this up with a second part eventually. I'd like to write about the horror of the automated build process on my last project and delve a bit further into my own system. I'll also try to revisit the loose ramblings from this article and tie them together into a more concise evaluation of how each engine answers the questions I posed at the start.

For more information on content pipelines (including some refreshing examples of pipeline design with some forethought), check out these slides and audio from Gamefest 2008: Next-Gen Content Pipelines: A Study of 1st Party Titles
