Hearing Audio Editing
Do you hear editing when you listen to a track?
Yes, of course.
Do you hear it because the editing introduces certain
‘telltale’ sonic artifacts?
For the most part, yes. Vocals are the main element of a pop song, so they
are very forward in the mix and it gives you the chance to hear any editing
on them fairly well. So hearing tuning or timing on lead vocals is pretty
common. Aside from extreme tuning (think T-Pain or Cher), some things you
might hear on vocals are additional overtones or stuttering consonants, if
a vocal track is timed using certain digital tools like ElasticAudio in
Pro Tools, FlexTime in Logic or the audio plug-in Vocalign. This can
have a huge impact on the vocal performance and tone, especially when
fully timing, or quantizing, the vocals. If you take the time to chop and
move a vocal part around by hand and re-fade the tracks, however, it can
be hard to hear what I’m talking about. One method is not better than the
other; you just get different results, and they require different amounts
of time and effort.
106 Alastair Sims with Jay Hodgson
Can you describe some more of these artifacts? For
example, I know that if the attack time setting on your
auto-tuner is dialed into a setting that’s too fast for the part,
and the vocalist has a pronounced vibrato, the result will be
what I can only describe as a robotic ‘warbling’.
You can get ‘the warble’ when your auto-tuner snaps between two different
notes very rapidly, for sure. The other thing you get with tuning is, if you tune
the note too far from its original pitch, the track will start to sound weird and
move out of what sounds like a normal singing range. The singer will either
sound like a chipmunk or like they are trying too hard to sound ‘manly’.
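One way to picture ‘the warble’ is with a minimal Python sketch. It assumes a deliberately crude tuner that snaps every analysis frame to the nearest semitone with no retune delay, and a singer whose vibrato is a pure sine wave. The function names, the 100 frames-per-second rate and the vibrato figures are illustrative assumptions, not a description of how any particular auto-tuner works.

```python
import math

def snap_to_semitone(midi_pitch):
    """Hard-snap a fractional MIDI pitch to the nearest semitone (no retune delay)."""
    return int(math.floor(midi_pitch + 0.5))

def vibrato_track(center_midi, depth_semitones, rate_hz, duration_s, frames_per_s=100):
    """Fractional MIDI pitch per analysis frame for a sine-shaped vibrato."""
    frame_count = int(duration_s * frames_per_s)
    return [
        center_midi + depth_semitones * math.sin(2 * math.pi * rate_hz * i / frames_per_s)
        for i in range(frame_count)
    ]

def count_note_flips(snapped_notes):
    """Count how often the tuner output jumps from one note to another."""
    return sum(1 for a, b in zip(snapped_notes, snapped_notes[1:]) if a != b)

# Hypothetical singer: slightly sharp of A4 (MIDI 69), wide vibrato, one second long.
wide = [snap_to_semitone(p) for p in vibrato_track(69.3, 0.6, 5.5, 1.0)]
# Same singer with a narrow vibrato that never crosses the halfway point between notes.
narrow = [snap_to_semitone(p) for p in vibrato_track(69.3, 0.1, 5.5, 1.0)]

print("notes hit with wide vibrato:", sorted(set(wide)))
print("note flips, wide vibrato:", count_note_flips(wide))
print("note flips, narrow vibrato:", count_note_flips(narrow))
```

In this toy model the wide vibrato keeps crossing the halfway point between A4 and the note above it, so the output flips between two notes several times a second, while the narrow vibrato stays on one note. A slower retune setting would smooth those jumps out rather than snapping to each one.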
On the timing side of things, you’d probably have the easiest time hearing
artifacts from the Vocalign plug-in. That processor aligns two different
takes, which is very useful when you want doubled vocals that are very
tight. Hard consonants like kuh or guh are lower in volume than many of
the other elements in speech and singing, but they appear as quite sharp
transients in the waveform. Sometimes Vocalign will misinterpret the sonic
nature of those transients and will double or triple them, resulting in an
obvious ‘flamming’ in the vocals.
Similarly, in Pro Tools, you might use something like ElasticAudio or
X-Form to time a vocal track. In Logic, you’d probably use the FlexTime
algorithm. What you get from these editing tools, when they’re pushed,
is a weird ‘carrier’ frequency underneath the recorded parts. There’s
a strange midrange frequency, or tone, that they add to everything. It
makes singers sound like a duck or a frog, I’d say; they add a distinct
resonance to vocals that wasn’t there before, and it changes the way
those vocals sound. This artifact arises most obviously when the edited
track is stretched to extremes and there’s not enough recorded audio to
support a time stretch. When you’re stretching audio to the point that
you’re exceeding the limits of what was actually recorded, the plug-ins
need to add in frequencies or repeat audio to ‘fill in’ the silences that
result, and you get that ‘ducky’ or ‘froggy’ resonant sound.
Audio Editing In/and Mixing 107
Download and Listen to tracks 6.3 and 6.4
Have you heard ‘ducky’ or ‘froggy’ vocals on
Yes, definitely. Most Top 40 tracks are obviously edited nowadays—editing
isn’t something that engineers feel they need to shamefully conceal anymore—so you’re going to hear evidence of editing everywhere.
You can hear some X-Form-like processing in the song ‘The Heart
Wants What It Wants’ by Selena Gomez. Right off the top you can hear it
on the lead vocal.
We’ve covered the sound of editing on vocal parts. Can you
describe similar ‘telltale’ sounds of editing that you might
hear on other instruments?
On all the other instruments, the most typical way to edit is using the
‘chopping and fading’ method, that is, by cutting audio, fading each cut
and aligning them all to the grid or some broader timing scheme. You can
also use ElasticAudio and FlexTime on those instruments, if you want, but
I would generally not advise it, because the chopping and fading method is
more transparent. You don’t get the same sort of artifacting I just told you
about when you chop and fade. Though again, it is more time consuming.
In terms of artifacts from the chop and fade method, they will vary, but
they will always come down to the fade and the phase relationship of the
two clips or regions being faded together. There are three ways to ‘hear
the fade’, as it were.
- A large gap that has been smoothed
- An in-phase or out-of-phase fade
- A shortened sustain
The first, a filled gap in the audio, is the most common. When a piece
of audio is played out of time, the usual practice is to cut at the
beginning of each note, or at the transient of each hit, and quantize the
resulting pieces. As a consequence of the quantization process, though, two
pieces of audio might be moved apart from one another, leaving a gap in the
audio. The simplest way to remedy this gap is to pull back the beginning
of the second note or hit until the gap is filled. Because you are pulling
out the beginning of the second clip, the section that fills the gap is
heard twice: once at the end of the first note and again in the part of the
second clip used to fill the gap. This causes an audible doubling or
stuttering of that part of the audio file. To solve this repeat artifact,
you simply extend the fade back into the first note, blending the two notes
together into one smooth, longer note. (Extending the fade towards the
second note would mean the transient or beginning of the second note would
be repeated, which only makes the edit more audible.)
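To see where these gaps come from, here is a minimal Python sketch of the cut-and-quantize step. It assumes clips are cut exactly at each onset, so every clip runs until the next onset, and that times are whole milliseconds. The function names and the example onsets are hypothetical.

```python
def quantize_starts(onsets_ms, grid_ms):
    """Move each clip start to the nearest grid line."""
    return [round(onset / grid_ms) * grid_ms for onset in onsets_ms]

def find_gaps(onsets_ms, end_ms, grid_ms):
    """Return (clip_index, gap_ms) for every clip that no longer touches the clip before it.

    Each clip keeps its original length: it was cut at its own onset and
    runs until the next onset (or until end_ms for the last clip).
    """
    lengths = [b - a for a, b in zip(onsets_ms, onsets_ms[1:] + [end_ms])]
    starts = quantize_starts(onsets_ms, grid_ms)
    gaps = []
    for i in range(1, len(starts)):
        previous_end = starts[i - 1] + lengths[i - 1]
        gap = starts[i] - previous_end
        if gap > 0:
            # Pulling this clip's start back by gap_ms fills the hole, but it
            # replays the last gap_ms of the previous note's audio.
            gaps.append((i, gap))
    return gaps

# Hypothetical eighth notes at 120 bpm (250 ms grid), played a little unevenly.
played_onsets = [0, 270, 480, 750]
print(find_gaps(played_onsets, end_ms=1000, grid_ms=250))
```

With these example onsets, the third clip is pushed 20 ms later while the second is pulled 20 ms earlier, which opens a 40 ms gap before the third clip. That is the span that gets heard twice once the clip start is pulled back, and the span the fade has to cover.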
The second type of ‘fade artifact’ you can hear is a fade that is either
in phase or out of phase. Imagine again two notes that have been cut
and quantized, and they have moved either very little or exactly half a
wavelength or one full wavelength of the note being played. If the two
are moved only a very small amount, then when you make a fade,
particularly a longer fade, you will hear comb filtering. The best way
to avoid comb filtering is to make the fade as short as possible. If
the audio clips are moved over by either one full wavelength or half a
wavelength of the note played, you will have a fade that is either
perfectly in phase (one wavelength) or out of phase (half a wavelength).
This will lead to a quick volume increase in the case of in-phase audio
or a volume decrease in the case of out-of-phase audio. To avoid these
changes in volume, you need to change the type of fade you’re using.
There are two basic shapes of fades: equal gain and equal power. Equal
gain is a linear fade, while equal power is logarithmic. To avoid a
volume increase in the case of in-phase audio, you would use an equal
gain fade, and to avoid a volume decrease you would use an equal power
fade. While the chances are small that the two audio files will phase
match perfectly in an additive or subtractive way, it will happen from
time to time. I suggest always starting with a short equal power fade,
around five to ten milliseconds, and changing it as needed.
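The difference between the two fade shapes can be checked with a small Python sketch. It is an idealized model, not any DAW's actual fade code: both clips are treated as steady sine waves of the same pitch and level, the equal gain fade is a straight line, and the equal power fade is assumed to be a sine/cosine curve, which is one common way to build it. All names here are illustrative.

```python
import math

def equal_gain(t):
    """Linear crossfade: the two gains always sum to 1."""
    return 1.0 - t, t

def equal_power(t):
    """Sine/cosine crossfade (assumed shape): the squared gains always sum to 1."""
    angle = t * math.pi / 2.0
    return math.cos(angle), math.sin(angle)

def summed_level(fade, t, offset_wavelengths):
    """Peak level of two unit sine waves of the same pitch, offset from each
    other by offset_wavelengths, at position t (0..1) in the crossfade."""
    gain_out, gain_in = fade(t)
    phase = 2.0 * math.pi * offset_wavelengths
    return math.sqrt(gain_out ** 2 + gain_in ** 2
                     + 2.0 * gain_out * gain_in * math.cos(phase))

def to_db(level):
    """Convert a linear level to decibels relative to a single unfaded clip."""
    return 20.0 * math.log10(level)

midpoint = 0.5
for name, fade in (("equal gain", equal_gain), ("equal power", equal_power)):
    in_phase = summed_level(fade, midpoint, 1.0)   # moved by one full wavelength
    quarter = summed_level(fade, midpoint, 0.25)   # partly out of phase
    print(f"{name}: in phase {to_db(in_phase):+.1f} dB, "
          f"quarter wavelength {to_db(quarter):+.1f} dB")
```

In this model the equal gain fade holds an in-phase pair at its original level, while the equal power fade lets it rise by about 3 dB at the midpoint. With a quarter-wavelength offset the roles reverse: equal power holds the level and equal gain dips by about 3 dB. A pair offset by exactly half a wavelength cancels at the midpoint under either shape in this model, which is one more reason to keep the fade short.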
The third type of ‘fade artifact’ is when the sustain of a note is shortened
because you’ve moved two notes closer together. Imagine a piano with the
sustain pedal held down, allowing notes to ring out under one another. If you
were to cut up passages played on this piano, move them closer together and
then fade them, you would get a quick drop in volume between the sustain of
the first note and the now-moved-forward second note, which carries a fixed
amount of sustain bleeding over from the first note. This makes the source
sound disjointed and jumpy. To fix this type of edit, you make the fade
longer, even to the point of it being most of the length of the first note,
so that the jump between different sustain levels is smoothed out across
the fade, making for a natural-sounding note transition.
While editing, these three types of fade happen all over the place and all
together. You might have an out-of-phase fade on two notes that have been
pushed closer together, so you’re also hearing a jump in sustain volume.
This is just what happens. You have to spend time adjusting the size, shape
and location of the fade. If that doesn’t work, find the same note later in
the take and use that instead; if the note is only played once, you can try
adjusting the length of the note using time expansion or compression tools
or, if the problem is really that bad, even record another pass of the song.
What are some terms that editing engineers use?
Are there any that you hear a lot?
The hard thing with [terminology] when describing music is that the words
are descriptive, not objective, similar to trying to describe the taste of
food. You might say something like “That sounds froggy” or “That sounds
ducky” because of certain types of vocal editing. Or perhaps when you get
those short fades and sustains getting cut off, like I just described, you’d
say “It sounds choppy” or “It sounds lumpy”. These are terms I use often
when describing editing, because that’s what they sound like to me. A few
other terms I use frequently, and throughout this article, are tuning (using
pitch correction), timing (quantizing audio), and chopping or cutting (the
act of separating the notes in a performance to be quantized). A big one
I haven’t mentioned is ‘triggers’. Triggers are used in drum editing; think
of them as placeholders for where the exact beginning of the drum hit is.
I use these to help time the drums properly, as well as to properly put in
drum samples for mixing.
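As a rough illustration of what a trigger marks, the following Python sketch places a marker wherever a signal first crosses a level threshold. This is a simplified assumption about how triggers might be generated, not how any specific drum-editing tool does it. The function name, the threshold and the hold-off value are hypothetical.

```python
def place_triggers(samples, threshold, holdoff):
    """Return the sample indices where a hit is taken to begin.

    A trigger is placed when the absolute level reaches the threshold,
    and no new trigger is allowed until holdoff samples have passed, so
    the body of one hit is not marked as several hits.
    """
    triggers = []
    last_trigger = -holdoff
    for index, sample in enumerate(samples):
        if abs(sample) >= threshold and index - last_trigger >= holdoff:
            triggers.append(index)
            last_trigger = index
    return triggers

# Toy signal: two hits, each with a sharp attack and a quick decay.
signal = [0.0, 0.0, 0.9, 0.5, 0.2, 0.0, 0.0, 0.8, 0.4, 0.0]
print(place_triggers(signal, threshold=0.6, holdoff=3))  # prints [2, 7]
```

Each index marks the start of a hit. Those positions are what you would move to the grid when timing the drums, and where a drum sample would be placed for mixing. Real material needs a hold-off long enough to cover the decay of each hit, or a single hit can be marked twice.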
What do you think the future holds for audio editing?
Editing has been commonplace in music production for years, and it has
even started to be used artistically within music. I see editing becoming
even more integrated into the production workflow. It is used for its
unique and different sounds, to save time and money in the studio,
and to help artists further their craft, create new sounds
and better their performances.