Building an immersive soundscape in Shadow of the Tomb Raider - full Q&A
The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.
A little while back I got the chance to chat with Shadow of the Tomb Raider audio director Rob Bridgett and composer Brian D'Oliveira to find out how the pair created the soundscape for the third entry in Square Enix's gritty Tomb Raider reboot.
The combined transcript from those two Q&As clocked in at over 5000 words, meaning I had to cut a lot of good stuff from the final article (which you can find right here). But, rather than leaving all of those juicy game dev tidbits on the cutting room floor, I thought I'd post the complete transcript to my blog so anyone keen to learn more can feast like a king.
You can check out my chat with Brian directly below, or scroll further down for the full Q&A with Rob.
Gamasutra: Brian, you said that you wanted to get 'into the texture' of what darkness is during your three years working on Shadow. How did your understanding of that concept evolve over time?
Brian D'Oliveira: Having the opportunity to contribute to this project was uncanny because until Shadow of the Tomb Raider, my work as a composer focused a lot on storytelling using the qualitative and textural aspects of sound, plus its ability to express strong yet complex emotions in powerful ways. My main sources of inspiration for this have always been the inner human condition, ancient cultures, and nature.
For both Rob Bridgett (audio director) and myself, it was apparent that the score would ultimately end up being a multifaceted tapestry of sounds, with “fear” being the emotional thread that would interweave and tie together the overall sonic fabric of “darkness.” So the first couple of years were mostly spent workshopping and researching together to really identify these core foundational emotions within Lara's world. In tandem with the creative team, we built a musical sound language that ultimately combined in a myriad of ways to become the sound world that we now experience in Shadow of the Tomb Raider.
With such a strong character as Lara, and an equally inspiring setting and storyline, the game inspired all of us to really dig deep, and for me, it was a chance to incorporate my passion for ancient cultures and the immense richness of sounds that they had left behind in both a genuine and respectful way.
Very early in the process, it became apparent that the best way to achieve this was to creatively limit the musical palette to natural acoustic sources, using original instruments or the most accurate representations of them possible. All of the music is performed entirely live, and there is no looping or digital over-editing; it's purposefully imperfect and natural.
When not in the studio producing, I would continuously be immersed in research, traveling (in hunt of new instruments and knowledge), practicing, and refining these newfound musical treasures. Once I was back in the studio, this allowed me to be able to hit the ground running.
As we iterated and experimented, we were amazed by how much these textures that were being generated were evoking and enhancing a sense of physicality and depth to the environments and gameplay. I still remember the feeling of wonder and excitement when we first started juxtaposing and combining these musical textures with the sound design during gameplay and in the environments. This was just the beginning of the magical journey!
I've heard you came back from Mexico with bags and bags of new instruments, and they all seem incredibly unique. Was it tough to marry together so many contrasting sounds into a cohesive score and soundscape?
Since I had already spent a lot of time planning and researching during pre-production, I had very carefully selected and even commissioned instruments to be made for me with this in mind. Subsequently, once I was back in the studio with them, I had an overwhelming palette to draw from. What really helped me make sense of these newfound musical treasures was the second part of my research phase, where I also brought over one of the leading masters of Mexican Pre-Columbian music (hailing originally from the Otomi tribe), Ramiro Ramirez Duarte, whom I had met while there.
For over a month, we lived and performed together with these instruments almost every day, in a show that was part of my artist residency at the SAT Montreal dome venue. So not only did I practice and perform this music, I was also able to learn a lot from the audience reactions and got to know the music and instruments at a much deeper level. All of the invaluable musical internalization that I experienced significantly contributed to my ability to get into the Pre-Columbian mode. Ultimately, it allowed me to create music that was natural yet focused regardless of the type or source of the instrument.
Were there any instruments you had to leave behind, that you would've loved to have seen in the final product?
There were some insanely massive drums, over 10 feet in diameter, that would have added an even more epic dimension, but they just would not have fit in the airplane without it costing an astronomical sum! I was also looking into building some gigantic life-size death whistles, but that would have taken many more months of research and time that I did not have available.
What was it like trying to effectively combine those instruments with their more conventional counterparts, such as the cello you felt represented Lara?
As I spent a lot of time absorbing and getting into the music, the biggest epiphany that occurred during this process was the realization that expressing music from the viewpoint of the Pre-Columbian state of mind was accomplished with the understanding that all beings are intrinsically and unequivocally interconnected. This is a big reason why it is implicit in their ritual practices and daily lives. The deeper I went, the more my compositional methodology transformed, and I eventually reached a point of musical transcendence where during the recordings I became a medium - without the need for thoughts or planning. It was a matter of intent and then emotive expression. Oftentimes it even felt as if the instruments and melodies were playing themselves. This also became the case for all of the instruments in my arsenal, and the cellos especially lent themselves to this approach. A lot of the thematic melodies and textural rhythms that I played on them are a direct result of this!
How did you approach recording the instruments you brought back? What tools (software/hardware) and techniques did you use to ensure those magic rocks and the death whistle sounded as authentic as possible in-game?
About a year and a half into the production we moved into our new facility that also happens to be the original studio built and used by RCA Victor in the 1940s! The room was designed and built with the concept that it would resonate like a musical instrument. There is a very particular sound you get there, and it is one I have not experienced anywhere else. When I play and record, you can hear the room breathe and react to the notes you play with infinite acoustic variation. Given the vast number of instruments in the room, there are always random sympathetic vibrations that occur.
My chief recording and mixing engineer, Jera Cravo, has a vast amount of experience and knowledge. He was able to effectively combine the best of tried and true techniques along with new tech and workflow hacks, which allowed us to work extremely fast and efficiently. We recorded using multiple stereo microphone sets, combined with UAD unison mic preamps, so depending on our relative location in the room with the microphone we had almost an infinite selection of colors for sound focus, reverberation, and panning position. Most of the stones and death whistles you hear are non-effected recordings using this approach and feel almost as if they were processed binaurally because of the spatial definition present. Even perceived movements in panning are real physical movements that have been purposefully incorporated as part of the performance.
What was the biggest technical challenge you encountered during the entire process, and how did you overcome it? (Feel free to really dig into the nuts and bolts of audio production here).
Creating interactive music within the tight timelines that video game production demands is an ongoing challenge for this medium, and as a result, our workflow is continuously changing and evolving. With Shadow of the Tomb Raider, our choice to play and record every single note so that we could deliver final quality assets from day one was both our biggest strength and challenge because it takes a massive amount of time and effort compared to using more standard methods relying on pre-recorded sample banks. But this is also where workflow and research saved the day! I have a passionate and dedicated studio team that supports my composing process. Together with our fantastic recording room and recording methods, this allowed us to efficiently get final quality sound without the need for extensive mixing or processing. In parallel, Jera would be mixing, mastering, stemming and delivering final tracks that would be sent to the audio team at Eidos so they could start plugging them into the game engine right away. During these bursts of deadline crunches, we did not have a lot of time to iterate or change, so the music had to be pretty much dead on in terms of fitting the brief, intent and gameplay interactions. This is where all the pre-production iteration and research really paid off because, for the most part, it went smoothly.
One aspect that also greatly helped was the stems. They meant that even if an element of the story or direction changed between commissioning the music and implementing it, Rob was able to quickly and easily re-appropriate stems from other tracks, or other stems that could meet the newly adjusted demands on-screen. This is akin to how music editors and music mixers work with composers in film, and it suited the rapid workflow on Shadow of the Tomb Raider.
And on the flip side - what was the most rewarding moment during production?
Given the heart and soul we poured into making this music, it was very gratifying to receive positive feedback and reactions from the audio and creative team at Eidos! And even more so it was a joy when getting to sit in on gameplay with the newly integrated music and see Lara’s world and character come to life. I still remember getting goosebumps when I first saw her walking through Paititi, and hearing its theme reveal this rare and beautiful world.
Rob says he hoped to blur the lines between a conventional score and traditional sound effects - did that affect your creative process at all?
It very much did and was a massive source of inspiration for my research and experimentation from the very beginning. Rob had a very clear vision, and I am fortunate that we resonated on the same level and were able to very quickly get into a creative flow that was both focused yet open creatively. As we went deeper, we both realized that the musicians in these cultures were, in fact, akin to sound designers because they made sounds with both instruments and sound makers such as death whistles, to continuously invoke and accompany almost all facets of their daily lives! This was the catalyst into a journey that started and culminated with Shadow of the Tomb Raider, but will most likely continue to be a big part of my creative process for the rest of my life.
How did you know when the score was 'done'? What were the boxes each track had to tick on an emotional/practical level?
On the emotional level, I usually know a track is done when I have a synesthetic reaction. I am able to close my eyes while listening to the music and vividly feel and envision the scene playing before me. I can even remember smelling the jungle at certain times!
From a practical perspective, a track has to be able to have sufficient elements and arrangement variations so that it is able to enhance gameplay and seamlessly hit the emotional cues that are intended.
What tips would you give to other composers looking to work in the world of games? Is there a 'golden rule' unique to the industry?
I do not believe that there is a “golden rule” that is unique to this industry. It's in fact pertinent to all creative fields and facets of our lives - it is honesty and a burning passion! I eat, sleep, and breathe my craft, and I do not differentiate between my personal and work life since they are one and the same. If you too have this passion and are able to sacrifice and dedicate your time regardless of the outcome, then you might just have the right mindset to delve into this challenging yet rewarding path.
The biggest reason that I love working with games is that for the most part I am allowed to truly express myself as an artist and create music that I love. At the same time, it’s an ever-changing and complex art form that demands sensitivity and creative focus unlike any other industry. My advice is to not be a “jack of all trades” and use the same sounds that the de facto majority of composers seem to be using these days! Instead, converge on what you love and how you express yourself as an artist, continually learn and discover new sounds and/or instruments, and the right people and projects will gravitate accordingly. Get to know and experiment not only with your musical tool sets, but also with current interactive audio middleware like Audiokinetic’s Wwise or FMOD. Be capable of integrating your sounds and music within the in-game engine. Even if you end up working with an audio development team that implements your music, it is essential to understand this process so that you are able to leverage this aspect to its full extent.
And last but not least, balance your hermit production phases by being social and spending time with both the people you work with as well as new friends and possible collaborators. When you can, take time to travel, experience life, and have spontaneous adventures - this will ultimately make your music much more profound and rich!
Now, here's the complete transcript of our Q&A with Shadow's audio director, Rob Bridgett.
Gamasutra: Rob, you've expressed a desire to move away from the more traditional melodic audio present in previous Tomb Raider titles by blurring the lines between the main score and traditional sound effects. Could you talk us through that process?
Rob Bridgett: As you say, it is all about balance. To clarify the music direction for Shadow of the Tomb Raider, yes, it is about darkness and fear, but we never completely move away from melody and we certainly don’t move away from recognizable themes. We absolutely keep these elements as core emotional, storytelling and character pillars for the franchise. We still have themes, melody, and musical hooks for the player. In many ways, this may even be the most melodically memorable of the recent Tomb Raider scores because we use melody quite sparingly and only in extremely memorable and emotional scenes.
Having said that, when the music needs to become dark, and be about fear, this is where we move into our more visceral, textural and effects-based music signature. And remember, this isn’t a new thing for Tomb Raider because we have the instrument on Tomb Raider 2013, which is pretty much this same idea. Tomb Raider in general has always had the door open to this kind of thing, but we just took that idea and really ran with it this time around.
What became our central pillar of fear is this interesting space that we enter when music starts to become unrecognizable, almost like a sound effect. Very important for us is the idea that the player is unsure whether what they are hearing is score, telling them how to feel emotionally, or a locatable sound in the physical space, telling them that something is in the same place as them. That state of not knowing what the player is hearing, if it is real or imagined, is the state of fear for us.
Emotionally, sound and music are often doing the same thing, giving us cues on how to feel, though traditionally, we expect music to perform this emotional role while sound performs the role of “reality.” We decided we can play against those traditions to create the emotional effect we wanted. We place a lot of these musical textures and effects (death whistle, Brian’s voice and breaths) in 3D, and in the space itself to really amplify this sensation of anxiety. The Mayan and Aztec instruments we used are so unfamiliar to the audience, musically, that they work perfectly in this context, as does our re-use of the original Tomb Raider instrument and the Ambient Music Design Martin Stig Andersen helped us with for underwater sequences. This is all part of the vision to move away from trying to replicate reality with sound, and instead trying to convey a state of mind to the player. They begin to figure out that yes, our jungles and HUB environments are about feeling grounded, and sound as we’d expect to hear them in reality. But when we enter these tomb spaces or darker areas of the jungle, that sense of reality is suspended, and psychology moves to the foreground like it would if the player actually went into these kinds of spaces. In a weird way, that is even more true and real to what we experience as humans with our fears.
During our mix, we even made the decision to release the music from the shackles of the LR channels and pan it spatially much deeper into the surrounds. The music in the stereo field just felt too safe, like it wasn’t reaching out to the player and into their living room. Again, this ambiguity and anxiety about what causes sounds the player is hearing is the central component of our concept of fear for the soundtrack as a whole.
To your point about balance, that is the whole key to using all these elements effectively. We cannot use this fear technique too much, because, like anything, it becomes tiring and annoying, much in the same way that melody and themes become tiring and overwhelming if used too much. Having a game and a story that is very well shaped in terms of action, traversal, story pacing, darkness, light, fear, hope, helps us a ton in this regard. That is actually one of the cool things about working on a Tomb Raider title, the overall shape of the experience is a joy to work with for sound and music. It is always important that we respect that overall dramatic shape with our approach.
What did you do to make the audio experience as seamless as possible in-game? Taking into account the constantly shifting states Lara finds herself in -- swimming underwater, scrambling around caves and cliffs, and leaping through treetops.
There are two parts to a seamless experience. One is to respect the dramatic shape of the gameplay and story with the audio content we create. The other is to think about transitions on an emotional level. For example, when the character learns certain story information as a sudden shock, or when the environmental changes that happen to our character become sudden or gradual, it is often hardest to get right. This happens because as you build the game, these moment-to-moment feelings and motivations change as the story evolves. Brian and I worked continually on sketches and work-in-progress stems that my team and I could test out and quickly review. The same is true of our sound team on the effects and ambiences.
The other part is implementation because at that point, it is almost purely technical; ensuring that transitions are not jarring, unless they need to be, which can be punchlist-driven (lists of tasks that are logged for the implementers to address). That said, “transitions” in video games can be technically quite challenging because of how the sounds are loaded or streamed into the available memory (stuff the player doesn’t care about or get exposed to). As a result, we always have big challenges to overcome on the technical side. Our end goal is for the experience to feel seamless, polished and movie-like in terms of presentation, and not suffer the typical video game sound tropes like sound and music suddenly stopping just because we are loading something. Our tools and pipelines determine all this, and we have been very focused from day one on having tools and workflows that put the audio designer first, so that person can focus on the content, and not on fighting with the tools to get something simple to happen.
You've said that you wanted to treat the score as source music in the hub worlds, and that you tried to make it seem as if NPCs are actually playing those instruments. What specific audio and animation challenges did that present, and how did you overcome them?
In the Paititi hub, it is extremely important that we have the player feeling like the entire city is full of rituals, sound, music and life. Our research on pre-Columbian societies points to sound and music being an integral, unmissable part of everyday life. It is another area where what we would normally think of as “music” in fact has a sound source and a directly active role in the world itself – working more in the traditional “sound effect” realm.
The musicians in the game are playing the instruments. With the teponaztli player in Paititi, for example, each note is triggered (and randomized) when he hits the instrument in the animation. All of that has been cut and hand-placed by a sound implementer, so we have something that matches what he is doing. The animation drives the music that players hear from that specific musician. Unfortunately, in this case, it would not work the other way around, with the animation matching the music, because the animation loops have some time limits that we need to respect. Really, these are nice ambient details that help the player feel that the world is alive. We have musicians that players “see” making this sound, but what they don’t see, and what they do hear, is a lot more mid-distance musicians playing similar sounding instruments. This gives the feeling that the city is completely full of musicians, ceremonial music and activity without the overhead of having the animation team support our entire music soundscape.
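Editor's note: the animation-driven approach Rob describes can be sketched roughly as follows. This is a minimal illustration in Python, not the studio's actual implementation; the class name, the function name, the sample names and the hit timings are all hypothetical. It assumes the animation loop exposes a list of "hit" times and that each hit should play a randomly chosen note that differs from the previous one.

```python
import random


class AnimationDrivenInstrument:
    """Plays one randomized note each time an animation 'hit' event fires."""

    def __init__(self, samples, seed=None):
        if len(samples) < 2:
            raise ValueError("need at least two samples to avoid repeats")
        self.samples = list(samples)
        self.rng = random.Random(seed)
        self.last = None

    def on_hit(self):
        # Pick any sample except the one played last, so consecutive hits differ.
        choices = [s for s in self.samples if s != self.last]
        self.last = self.rng.choice(choices)
        return self.last


def play_loop(hit_times, loop_length, loops, instrument):
    """Return (time_in_seconds, sample) pairs for repeated runs of one animation loop.

    hit_times are offsets (hypothetical values) inside a single loop at which
    the animated musician strikes the instrument.
    """
    events = []
    for i in range(loops):
        for t in hit_times:
            events.append((i * loop_length + t, instrument.on_hit()))
    return events


if __name__ == "__main__":
    drum = AnimationDrivenInstrument(["note_low", "note_mid", "note_high"], seed=7)
    for when, sample in play_loop([0.2, 0.9, 1.4], loop_length=2.0, loops=2, instrument=drum):
        print(f"{when:4.1f}s -> {sample}")
```

Because the audio follows the animation's hit times, the animation loop can stay at a fixed length while the randomized note choice keeps the performance from sounding like an obvious repeat.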
Whether we are in the jungle or in a village HUB, we work a lot on this idea of “middle-distance,” which is a place where we can still hear the sounds being generated by whatever it happens to be: birds, monkeys, musicians and crowds. However, we do not need to see them rendered on screen to believe they are there. For this to work, you need a few things that are very close to the player (like a bird flying off, or a musician playing one of the instruments they hear), and are visible. That makes the world around them sound similar to everything they are hearing. Then, the player accepts that middle-distance layer we’ve added as being present and believable.
We replicate this “music-in-the-world” idea within our smaller hubs, by having more modern Mexican or Peruvian source music played on radios in the spaces. Again, this is to give the player the sense of the locale as well as a way to navigate the space. By grounding music as an integral part of the life and culture of these places, source music plays a massive role in the environmental storytelling of this game.
You've also mentioned that mixing in Dolby Atmos helped exaggerate the game's sense of verticality? How exactly - what benefits did the tech bring to the table?
My team is always very ambitious to push the awe and spectacle of audio just as far, if not much further, than what we see visually. Sound is one of the most immersive elements of a video game and we have a big responsibility to uphold our end of the audio-visual contract, particularly with a game that is as gorgeous as Shadow. When you think about it, the player is looking through a small rectangular window into our world, visually, but everything that is around the player off-screen in this 3D world is rendered in sound all the time. The persistence of things on-screen and what is moving behind the player immerses them in this world, believing that what is behind them, and no longer on screen, is still there, and it persists once the player moves the camera away.
Because we knew we’d be using Atmos, about halfway through our development we decided to completely shift our propagation of sound in the game away from 2D, pre-rendered ambiences, and move fully into object-based, 3D prefab sounds for everything like animals, insects, birds, water, rain, dust, debris, even wind and air tone. Our level editors author all of that in 3D. It is actually faster to work this way for us, and it gives us a true 3D effect when the player moves the camera, as all those 3D objects translate exactly as you would expect them to. It’s pretty much a VR approach to sound, but done for a non-VR title and on a vast scale. And even though we talk about Atmos a lot, we also support every other 3D spatial audio technology available to us such as Windows Sonic, and Sony’s proprietary Spatial 3D audio platform. If done correctly, what all of these spatial audio technologies bring to the table is the ability to completely wrap the player in your 3D environments.
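Editor's note: to illustrate why object-based sounds "translate" when the camera moves, here is a small Python sketch that works out where a sound object sits relative to the listener. It is only an assumption-laden illustration, not how the game's engine or any spatial audio renderer is implemented. The function name and the coordinate convention (x to the right, y up, z forward at zero yaw, positive yaw turning the listener to the right) are choices made for this example.

```python
import math


def relative_direction(listener_pos, listener_yaw_deg, source_pos):
    """Return (azimuth_deg, elevation_deg, distance) of a source relative to the listener.

    Azimuth is positive to the listener's right, elevation is positive upward.
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    dz = source_pos[2] - listener_pos[2]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)

    # Direction in world space, then rotated by the listener's yaw.
    world_azimuth = math.degrees(math.atan2(dx, dz))
    azimuth = (world_azimuth - listener_yaw_deg + 180.0) % 360.0 - 180.0

    # Height above or below ear level, which is what height speakers can express.
    elevation = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    return azimuth, elevation, distance


if __name__ == "__main__":
    bird = (0.0, 10.0, 10.0)  # a hypothetical bird ahead of and above the player
    print(relative_direction((0.0, 0.0, 0.0), 0.0, bird))
    print(relative_direction((0.0, 0.0, 0.0), 90.0, bird))  # player turns right
```

With a pre-rendered 2D ambience the bird would stay baked into the same speakers. With a positioned object, turning the camera to the right moves the bird to the listener's left while its height is preserved.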
Having said that, when pushing the immersion in a “believable” world, take a moment to think about what we said earlier about fear, tombs and dark jungle spaces all messing with the player’s head and playing spatial sounds, musical sounds, that may or may not actually be there. Having environmental sound that is all around the player and they hear what might or might not be music, effects or something the player can’t understand from their models of reality is suddenly an extremely powerful and potent fear effect. We’re using every trick of immersive “reality”, to tell the player something is real, that they are there, and that way, we put them into an authentic state of fear. It is uncomfortable, possibly too uncomfortable for some people. To me, that is almost the real trick to what Atmos, and other spatial audio technologies bring to games and spatial storytelling in general. This idea of spatial storytelling is extremely fascinating to me. Yes, you can present the same story in mono, but if you present and experience that story spatially, it physically punctures the screen, and is rendered into our real 3D world. This allows our storytelling to enter the 3D space in a meaningful way.
Back to the nuts and bolts, it was also important that we mixed the game in the highest possible speaker configuration (7.1.4) because the translation down to 7.1, 5.1, 2.1 and 2.0 will always be correct, and you obviously can’t do that the other way.
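Editor's note: the point about translating down from 7.1.4 can be illustrated with a simple fold-down chain. The Python sketch below is not Dolby's renderer or the game's mixer. The channel names, the function names and the single gain of roughly -3 dB are illustrative assumptions. Each step folds channels that the smaller layout lacks into their nearest remaining neighbours.

```python
G = 0.7071  # roughly -3 dB; an illustrative gain, not a published renderer value

BED_71 = ("L", "R", "C", "LFE", "Ls", "Rs", "Lrs", "Rrs")
HEIGHTS = ("Ltf", "Rtf", "Ltr", "Rtr")


def silent_714():
    """One silent sample frame for a 7.1.4 layout (hypothetical channel names)."""
    return {name: 0.0 for name in BED_71 + HEIGHTS}


def fold_714_to_71(frame):
    """Fold the four height channels into the ear-level channels beneath them."""
    out = {name: frame[name] for name in BED_71}
    out["L"] += G * frame["Ltf"]
    out["R"] += G * frame["Rtf"]
    out["Lrs"] += G * frame["Ltr"]
    out["Rrs"] += G * frame["Rtr"]
    return out


def fold_71_to_51(frame):
    """Fold the rear surrounds into the side surrounds."""
    out = {name: frame[name] for name in ("L", "R", "C", "LFE")}
    out["Ls"] = frame["Ls"] + G * frame["Lrs"]
    out["Rs"] = frame["Rs"] + G * frame["Rrs"]
    return out


def fold_51_to_stereo(frame):
    """Fold centre and surrounds into left and right; the LFE is dropped here."""
    return {
        "L": frame["L"] + G * frame["C"] + G * frame["Ls"],
        "R": frame["R"] + G * frame["C"] + G * frame["Rs"],
    }


if __name__ == "__main__":
    frame = silent_714()
    frame["Ltf"] = 1.0  # a sound placed only in the top-front-left speaker
    print(fold_51_to_stereo(fold_71_to_51(fold_714_to_71(frame))))
```

Every fold-down is a fixed sum, so a sound placed overhead still lands somewhere sensible in stereo. Going the other way would mean guessing which part of a two-channel signal belonged overhead, which is the information the smaller mix never contained.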
Coincidentally, this third installment of the reboot trilogy was very much about exaggerated verticality, exploring more vertical traversal features and spaces using the rope. Narratively, we mirrored Lara’s emotional arc with descending and ascending gameplay and architecture. Height, in particular, is extremely important to this game. In a franchise like Tomb Raider, the environments, vistas and locations are a huge part of the spectacle of the experience. That’s why you’ll hear a lot of us on the dev team talk about the environments as “characters.” Every environment needs to have distinct personalities and feel like a genuine, believable place. The technology of spatial sound, which lets us express height information, is really a perfect fit for our game.
How did you translate the musical elements and sounds Brian submitted - such as the death whistle - and implement them effectively in a 3D game space?
The way I have worked with Brian has been very collaborative over the full three years of the project. There has been a lot of trust on both sides. Brian has also done field recording and effects work for us, making it an interesting and unique collaboration. Musically, Brian will deliver stems to me for any musical piece that is delivered. These are specific elements of a musical piece that have been isolated, like only the percussion, violins, strings or FX etc. It was in these FX stems that I first got to play around with the death whistles and conch horns, and was able to start playing around with those sounds in the 3D space itself.
The idea is always that these sounds are caused by something that is never explained. It is off-screen, in the far or mid-distance (mid-distance is better, because it makes players more uncomfortable). It could be someone or something making those sounds. It could be an architectural design made to instill fear into those who enter that space. I like the idea that the player asks these questions of the sounds they hear. What is that? What is causing that sound? In particular, Brian had so many different sounding death whistles, and they offered a really great sound palette for these kinds of architecturally ambiguous sounds.
Finally, what do you feel is the most important creative lesson you've learned during your time on Shadow?
There are many things to take away from every project. For me, three things really defined this project and its outcomes.
Vision: For me as a director, a clear vision is vital. If the team has complete creative freedom, they will fail. If they do fail, that is a failure of the creative direction not being clear, not a failure of their own. It is important to make sure everyone understands the box we are working in. The vision is something that we always need to understand, communicate, demonstrate and articulate with every level of detail.
Onscreen Review: The directors review the content all the time together, in a multi-discipline group, and we challenge each other across all departments every day. Putting our work on-screen and reviewing it every day like this makes a huge impact. There is no other way to truly understand what works and what doesn’t. To be honest, this kind of attention for audio can often feel like a negative thing, and can be very challenging, especially if you are working with temporary mixes and temporary content. When we go into reviews, the expectation is always that what is being shown is final quality. This increased scrutiny is often precisely what we need to make sure every discipline is supporting the other in the right ways. We can’t wait for sound at the end, and we’ve never done that on this project.
Pressure: Finally, pressure is essential to make something amazing. Without being ambitious, without wanting to do better than the day before and without challenging things, you will never put something truly exceptional on screen.