
Multiplayer Sound Design

February 2010 by psneville

When implementing a soundscape for a game, it is not enough to attach audio clips to the correct game objects and hope the engine places the audio in 3D space correctly. Unless the engine includes sound propagation, advanced audio occlusion and DSP APIs (and an audio engineer who can use them well), merely tagging audio in space usually results in weak, watered-down sounds, particularly for the local player in a multiplayer game.

It’s also subtly disconcerting to have a single sound (an explosion, for example) with the exact same waveform representing both local and remote audio, even though that is realistic. It is often more effective to diverge from realism and employ different filters, and sometimes different waveforms altogether. I won’t argue for this statement intellectually, but we can trust our ears to tell us when it is true.

Unity3D, the engine of choice for many indie devs these days, is an example of a great tool that still does not surface these kinds of audio APIs (although it includes FMOD as of version 2.6, those APIs are not yet exposed through Unity). And even packages such as UDK, which boast powerful audio features, lack a sound propagation engine.

Bad Local Player Gun

For example, here’s a weak, watered-down effect produced by a single sound used to represent a large mortar round firing when it is imported in Unity to be placed in 3D space:


http://psneville.com/audio/multiplayer/Gun_Local_Bad.mp3


Good Local Player Gun

We want that sound for the local player to be a lot more powerful and there’s no need for it to be directed in 3D space or filtered through a rolloff, since the camera (which is the primary audio listener in most games) is right on top of the local player.

Here’s what it sounds like after adjusting the implementation but not changing the actual audio file at all:


http://psneville.com/audio/multiplayer/Gun_Local.mp3



Eliminating the unneeded 3D effects and applying some volume, pitch and EQ changes in the engine makes it a full, omnipresent stereo sound for the local player who fired the gun, without modifying the actual audio file itself.

Remote Players

When that same gunshot is heard by a remote player in the same game as the player who fired the gun, we want the remote player to hear more bass and less treble, because lower frequencies travel differently than higher frequencies. It is the same reason you hear the thumping bass and kick drum from a radio when a car passes but can’t hear the hi-hats or a female vocal. Using the previous example sound, the remote player should not hear the gun clip sound at the end of the shot. This is a matter of EQ, not master volume, and if the audio engine can’t do that well, then you need a separate file to represent it, even though it is the same logical sound caused by the same object and action in the game.
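To make the "more bass, less treble" idea concrete, here is a minimal sketch of a one-pole low-pass filter in Python. It is not what Unity or any other engine does internally; the 800 Hz cutoff, the test tones and the function names are my own assumptions for illustration. In practice you would apply this kind of EQ in your audio editor or through the engine's DSP, if it has one.

```python
import math

SAMPLE_RATE = 44100   # samples per second (assumed)
CUTOFF_HZ = 800.0     # hypothetical cutoff for the "remote" version


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """One-pole low-pass filter: passes bass, attenuates treble."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    prev = 0.0
    for x in samples:
        # Move a fraction of the way toward the new input sample.
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def sine(freq_hz, sample_rate_hz, count):
    """Generate `count` samples of a unit-amplitude sine tone."""
    step = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [math.sin(step * i) for i in range(count)]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


# One second each of a 60 Hz "kick drum" tone and an 8 kHz "hi-hat" tone.
bass = low_pass(sine(60.0, SAMPLE_RATE, SAMPLE_RATE), CUTOFF_HZ, SAMPLE_RATE)
treble = low_pass(sine(8000.0, SAMPLE_RATE, SAMPLE_RATE), CUTOFF_HZ, SAMPLE_RATE)

# Skip the first samples so the filter has settled before measuring.
print("bass peak:", peak(bass[2000:]))
print("treble peak:", peak(treble[2000:]))
```

The low tone passes through nearly untouched while the high tone is strongly attenuated, which is the passing-car effect described above.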

For example, at a distance of 50 meters and rotated away from the gun source, you might want the remote player to hear this gunfire like this:


http://psneville.com/audio/multiplayer/Gun_Remote.mp3


It is still the same sound source, and the produced sound is still stereo, but there is no clip sound, there is less treble, the sound has been EQ’d to carve out a space in the frequency range which doesn’t muddle the more important local sounds this player is causing, etc. It also still leverages the game’s audio engine — the engine is placing it to the left (at about 9 o’clock) in 3D space and applying a rolloff factor as well as slightly varying the pitch.
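For readers who want to see what "applying a rolloff factor" and "slightly varying the pitch" can look like in numbers, here is a rough Python sketch. The rolloff uses a generic inverse-distance model, which is an assumption on my part and not necessarily the curve Unity uses; the function names, the 1 meter minimum distance and the 5% pitch spread are all hypothetical.

```python
import random


def inverse_distance_gain(distance_m, min_distance_m=1.0, rolloff=1.0):
    """Volume multiplier for a sound heard at distance_m (generic model).

    At or inside min_distance_m the sound plays at full volume; beyond it
    the gain falls off, faster for larger rolloff factors.
    """
    d = max(distance_m, min_distance_m)
    return min_distance_m / (min_distance_m + rolloff * (d - min_distance_m))


def jittered_pitch(base_pitch=1.0, spread=0.05, rng=random):
    """Pitch multiplier varied randomly by up to +/- spread."""
    return base_pitch * rng.uniform(1.0 - spread, 1.0 + spread)


# Example: the remote gunshot heard from 50 meters away.
gain_at_50m = inverse_distance_gain(50.0)
pitch = jittered_pitch(rng=random.Random(2010))
print("gain at 50 m:", gain_at_50m)
print("pitch multiplier:", pitch)
```

With these defaults a shot at 50 meters plays at about 2% of full volume, which is one reason the remote version of a file often needs to be mastered much louder than you would expect.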

Technical Strategy

This is a fairly simple solution that works even in the absence of powerful audio engines:

Set up multiple audio assets with varied characteristics and use them in combination with the engine’s audio features to represent one single logical sound. In Unity, for example, this can be accomplished by scripting a wrapper that replaces AudioClip and holds an internal array of source audio clip pairs. Each pair has two members with different stereo/mono and 3D/omnipresent settings: one for the local player and one for the remote players to hear when the audio is played. The remote clip can be the same physical audio file as the local clip if that makes sense (though again, sometimes it is better to diverge from reality here), but with different asset import settings applied.

Each pair in this wrapper object should also carry a variable representing the ideal distance at which its sound files should be played, so that the implementation code can quickly iterate through the array to find the correct clip pair. It would then select which clip in the pair to play based on whether the current player is the local source of the sound or whether the sound was generated by a remote player. Code summary: (1) find which audio pair is best for a given distance, and (2) choose one member of the pair based on whether the sound originated locally or remotely.
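The two-step code summary can be sketched roughly as follows. This is Python rather than Unity script, and it is only an illustration of the lookup logic: the class names, field names, distances and asset names are all hypothetical, and I am assuming "best for a given distance" means the pair whose ideal distance is closest.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClipPair:
    """Two assets for one logical sound at one ideal distance."""
    ideal_distance_m: float
    local_clip: str    # non-3D asset for the player who caused the sound
    remote_clip: str   # 3D asset for everyone else


@dataclass
class LogicalSound:
    """Wrapper that stands in for a single clip (hypothetical design)."""
    name: str
    pairs: List[ClipPair]

    def select_clip(self, distance_m: float, is_local: bool) -> str:
        if not self.pairs:
            raise ValueError(f"no clip pairs configured for {self.name!r}")
        # Step 1: find the pair whose ideal distance is closest.
        best = min(self.pairs,
                   key=lambda p: abs(p.ideal_distance_m - distance_m))
        # Step 2: pick the member based on who originated the sound.
        return best.local_clip if is_local else best.remote_clip


# Hypothetical asset names, for illustration only.
mortar = LogicalSound("mortar_fire", [
    ClipPair(0.0, "mortar_near_local", "mortar_near_remote"),
    ClipPair(50.0, "mortar_mid_local", "mortar_mid_remote"),
    ClipPair(200.0, "mortar_far_local", "mortar_far_remote"),
])

print(mortar.select_clip(3.0, is_local=True))
print(mortar.select_clip(40.0, is_local=False))
```

The distance here only chooses which pair to use. Once a remote clip is chosen, the engine is still responsible for positioning it in 3D space.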

Of course, the audio engine will place the ‘remote’ member of the audio pair into 3D space, and your distance variable is not used for that purpose (it is used only to select the proper audio pair, not actually to apply distance). But again, unless the engine exposes strong audio APIs, that placement needs to be coupled with filtering that you apply either (a) through the game engine’s audio import settings and tools, generating a second asset even if the sound source is the same single file; or (b) externally, bounced into a new version of the file itself, if the game engine lacks that sort of audio support. This filtering is the reason the remote member of the audio pair exists.

If you have two different physical files for a single pair, one for the local player and one for all remote players, then it’s quite likely that the local members of long-distance pairs will have distance effects baked into the file itself. The remote versions, by contrast, need to be much louder and carry no baked-in distance effect, only EQ and the like, because the game engine will push the sound into space appropriately.

Finally, the sound is filtered by the local game engine. In Unity this currently means pitch, volume and rolloff factors, while in other engines EQ and other DSP effects such as reverb, distortion and gates can also be applied in real time. These effects work in conjunction with this general implementation approach, not in conflict with it.

Unity-Specific Tip

In Unity v2.6, I find it best to employ non-3D stereo sounds when the local player is the cause of big BOOM sounds, like cannons and explosions, while the exact same logical sound plays a different clip imported as a 3D sound (either mono or stereo, based on what sounds correct) for remote players. This is not realistic, but players expect immediate feedback and expect their own actions to have a massive effect. They are less demanding when it comes to the effects caused by other players.
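As a rough illustration of that rule, here is a small Python sketch that decides how to play a sound when a "player fired" event arrives over the network. None of this is Unity API: the settings class, the player ids and the volume numbers are hypothetical, and in Unity the 3D versus non-3D choice is made in the asset import settings and not at play time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackSettings:
    spatialized: bool  # True: positioned in 3D by the engine; False: plain stereo
    volume: float      # 0.0 to 1.0 before any engine rolloff (hypothetical values)


def settings_for(source_player_id: str, local_player_id: str) -> PlaybackSettings:
    """Pick playback settings for a big BOOM caused by source_player_id."""
    if source_player_id == local_player_id:
        # The local player caused it: full, omnipresent stereo, no rolloff.
        return PlaybackSettings(spatialized=False, volume=1.0)
    # Someone else caused it: let the engine place it in 3D space.
    return PlaybackSettings(spatialized=True, volume=0.8)


print(settings_for("player_1", "player_1"))
print(settings_for("player_2", "player_1"))
```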

Final Words

As a final example, here’s an explosion heard by the local player:


http://psneville.com/audio/multiplayer/Explosion_Local.mp3


Here is that same explosion as heard from a much greater distance by a remote player in the same multiplayer game, positioned in 3D space by the game engine, after import settings were tweaked to generate a new asset with very different EQ applied:


http://psneville.com/audio/multiplayer/Explosion_Remote.mp3


A final word on this approach:

Combining multiple audio files with the proper code settings takes a great deal of tweaking and listening, as it’s a combination of programming and music production, which is most efficiently performed if the sound designer and audio engineer are sitting side-by-side (or if they are the same person).

  1. Joel Walsh says:

    Great ideas in this article ! The unity specific info is helpful too thanks for sharing this.

  2. psneville says:

    Thanks, Joel! I checked out your site — terrific work you have there! I’ll forward your link to some friends as well.

    Cheers,
    Sean