When put to a grueling test, which codec holds up the best?
To test out the capabilities of these various codecs, I have designed a custom sound file that will push the codecs to their limits. The sound file plays a riff using the sawtooth waveform, which is an infinite summation of sine waves, then plays it again with the square waveform, a different kind of infinite sine wave summation. Then a second riff is played with each type of waveform. The sawtooth waveform is aptly named, because it looks like the teeth of a handsaw, while the square wave looks...well, square. The "waveform" is a plot of the amplitude of the wave vs. time. Perhaps it's better just to see the pictures. The point is that round, smooth waveforms (like sine waves) are the simplest type of waveform, and are thus really easy for codecs to deal with, so they are basically crap for codec testing. When you want to see how far a codec will go, you need nasty, jagged waveforms. This makes it easy to hear when the codec is messing up the sound. Here are the waveforms used:
The Sawtooth Waveform
The Square Waveform
You can see that the corners of these waveforms are not perfectly sharp. This is because the software used to create the sound cannot actually add up an infinite number of sine waves to obtain a perfect sawtooth or square wave. That would take an infinite amount of time, so it settles for a pretty good approximation. These waveforms are certainly good enough for this test.
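To make the "summation of sine waves" idea concrete, here is a minimal sketch of the standard Fourier partial sums for both waveforms. The function names are mine, and I am not claiming this is how the sound software built its waveforms; it only shows why a finite number of sine waves gives rounded corners.

```python
import math

def sawtooth_partial_sum(t, freq, n_harmonics):
    """Approximate a sawtooth at time t (seconds) using the first n harmonics."""
    total = 0.0
    for k in range(1, n_harmonics + 1):
        # Every harmonic k is present, with amplitude 1/k and alternating sign.
        total += ((-1) ** (k + 1)) * math.sin(2 * math.pi * k * freq * t) / k
    return (2 / math.pi) * total

def square_partial_sum(t, freq, n_harmonics):
    """Approximate a square wave at time t using the first n odd harmonics."""
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1  # only odd harmonics: 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * freq * t) / k
    return (4 / math.pi) * total

if __name__ == "__main__":
    # Sample a 1 Hz wave a quarter of the way through its cycle.
    for n in (1, 5, 50, 1000):
        saw = sawtooth_partial_sum(0.25, 1.0, n)
        sq = square_partial_sum(0.25, 1.0, n)
        print(f"{n:5d} harmonics: sawtooth={saw:.4f} square={sq:.4f}")
```

The more harmonics you add, the closer the values get to the ideal waveform (0.5 for this sawtooth and 1.0 for the square wave at that point in the cycle), but no finite sum ever gets the corners perfectly sharp.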
After creating a source WAV file, I tried three codecs: MP3, Ogg Vorbis, and Windows Media Audio 9. To encode the WAV file to MP3, I used RazorLame, long reputed to be the best program around for MP3-creation. For Ogg Vorbis, I used OggDropXP, the official encoder of those crazy people who make all the Ogg stuff. I used Windows Media Player 9 to encode the WMA files.
To better ascertain which codec was best in which bitrate range, I encoded the file with each codec at many different bitrates, starting low at 48 kbps (kilobits per second), then proceeding through 64, 96, 128, 160, 192, and 224 kbps, and finally 256 kbps. WMA was unable to go higher than 192 kbps because, for some unknown reason, Microsoft has limited Media Player to that bitrate. 192 kbps is certainly high enough to get a good representation of the codec's abilities, however, so it's all good. Higher-bitrate files of course yield better quality, but also larger file sizes.
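The tradeoff between bitrate and file size is simple arithmetic, and a rough sketch might look like the following. The 30-second clip length is a made-up number for illustration, and the comparison figure assumes a CD-quality source (44.1 kHz, 16-bit, stereo), which may not match the actual test file.

```python
# Bitrates used in the test, in kilobits per second.
BITRATES_KBPS = [48, 64, 96, 128, 160, 192, 224, 256]

# Assumption: the source WAV is CD-quality PCM (44.1 kHz, 16-bit, stereo).
WAV_KBPS = 44100 * 16 * 2 / 1000

def expected_size_bytes(bitrate_kbps, duration_s):
    """Nominal size of a constant-bitrate file, ignoring headers and tags."""
    # 1 kbps is 1000 bits per second, and there are 8 bits in a byte.
    return bitrate_kbps * 1000 * duration_s / 8

if __name__ == "__main__":
    clip_seconds = 30  # hypothetical clip length, not taken from the test
    for kbps in BITRATES_KBPS:
        size_kb = expected_size_bytes(kbps, clip_seconds) / 1000
        ratio = WAV_KBPS / kbps
        print(f"{kbps:3d} kbps: about {size_kb:6.0f} kB, "
              f"roughly {ratio:.1f}x smaller than the WAV")
```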
After getting a 48 kbps file for each codec, I popped the three files and the source WAV file into a program that can play all three encoded sound formats: Winamp 5. I listened to them carefully and compared them with the source WAV file to hear how they differed. I repeated this test for each bitrate. I thought it would be best to compare the three codecs at each particular bitrate, then to compare each individual trial with the source.
At 48 kbps, all the files sound terrible. The MP3 sounds like you have cotton stuffed in your ears. The MP3 codec, rather than attempting to work the higher frequencies into the file, simply gives up and drops the high frequencies. This gives the file a muffled, muted sound. The Ogg file makes a gallant attempt to encode the high frequencies, but fails miserably at such a low bitrate. It has a nasty flanging effect in the high frequencies. The WMA file neither gives up on the high frequencies, nor spreads itself too thin trying to represent them perfectly. Instead, it has a sort of roaring effect in the low frequencies, perhaps devoting more of the file to the highs. The MP3 sounds the most pleasing to the ear, because it simply cuts the highs. Ogg sounds the most accurate, but also the worst because it spreads itself too thin. WMA falls somewhere in between, sounding somewhat accurate, and somewhat bad.
At 64 kbps, the files still sound quite bad. The MP3 manages to cut less of the frequency spectrum than before. The Ogg file has less of a flanging effect, but it is still there and still annoying. The WMA file still has the roaring sound, but it is diminished. Again, the Ogg file sounds the worst but the most accurate, the MP3 the best but the least like the original sound, and the WMA file somewhere in between.
At 96 kbps, the files are barely listen-worthy. The MP3 sounds pleasant enough to listen to, but is still quite distinguishable from the source file. For low-frequency notes, the Ogg and WMA files are indistinguishable from the source, but they still have their respective flanging and roaring issues for the higher notes.
At 128 kbps, the MP3 still takes a portion off the top frequencies, the Ogg file is barely distinguishable by way of a tiny flange, and the WMA takes only the barest amount off the high frequencies.
At 160 kbps, the Ogg and WMA files are indistinguishable (at least to my ears) from the source. The MP3 still takes a bit off the top, but you won't notice it unless you really listen closely.
For 192 kbps and above, all the encoded files are indistinguishable from the source file.
In conclusion, the WMA and Ogg files outperformed the MP3 by the barest margin (they are, after all, newer technologies), and were, at least in this test, of comparable quality. The Ogg Vorbis codec sounded the worst before becoming indistinguishable, yet offered the most accurate representation of the original sound.
While the first test was designed merely to test the theoretical limits of the codec, most songs do not actually use infinite summation waveforms designed specifically to be difficult to encode. A real song offers a different challenge to the three codecs: How small can you make the file before it starts to sound bad? The flanging/roaring problems experienced by Ogg and WMA are less of a problem in this test, so we get to see the practical applications of the codecs.
I chose for this real-world test the intro to the song Smells Like Teen Spirit. I did this for several reasons. First, I like the song and I have a CD to rip it from. Second, it has very crisp drum hits, plenty of crash cymbal, and lots and lots of higher harmonics from the distorted guitars to give the codec a workout. Third, the general production quality on the album Nevermind is superb, among the best I've ever heard.
I did the exact same test as before with the Teen Spirit intro, so here are...
At 48 kbps, the MP3 suffers from the same problem as before. The high frequencies are simply nonexistent, which means that the crash and hi-hat cymbals go practically unheard. There is just the barest impression that there is something hissing in the background, like we are listening from inside a sealed cardboard box. The Ogg Vorbis file sounds absolutely STUNNING for the bitrate used! There is some light flanging on the cymbals, and it is still quite distinguishable from the original, but it really doesn't sound bad at all. The WMA file sounds better than the MP3, but still rather muted, missing a significant portion of the high frequencies.
At 64 kbps, the MP3 sounds only a bit better than at 48 kbps. The highs are still missing, and it sounds muffled. The Ogg file surprisingly doesn't improve much, despite 33% added bitrate. There is light flanging, and it is still distinguishable. The WMA file sounds basically like the MP3, except it has consistently more of the high frequencies present. Still a bit muted-sounding.
At 96 kbps, the MP3 finally brings in the main frequency of the hi-hat cymbal, a crucial part of the song's intro. This improves its sound drastically, but it is still missing a fair amount of the highs. The Ogg file firms up and does a remarkable job, being nearly indistinguishable from the source file, except for the hi-hat sounding slightly louder and rougher than in the original. The WMA file sounds fairly well-rounded, but falls somewhere in between the MP3 and Ogg files in terms of clarity of the high frequencies.
At 128 kbps (seems to be the standard for online music), the MP3 is sadly still lacking some of the crisp highs of the source file. The Ogg file is now indistinguishable to my ears. The WMA file, surprisingly, is also bordering on indistinguishable (I guessed it only 2/3 of the time).
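A hit rate like 2/3 is hard to interpret without knowing how many trials were run, since pure guessing between two files is right half the time. Here is a small sketch of the usual binomial calculation. The trial counts in it are hypothetical, because the number of listening trials is not recorded here.

```python
from math import comb

def chance_of_at_least(correct, trials, p_chance=0.5):
    """Probability of getting at least `correct` right out of `trials` by guessing."""
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

if __name__ == "__main__":
    # Hypothetical trial counts, all with the same 2/3 hit rate.
    for correct, trials in [(2, 3), (8, 12), (20, 30)]:
        p = chance_of_at_least(correct, trials)
        print(f"{correct}/{trials} correct: {p:.3f} chance of doing this well by guessing")
```

The same 2/3 hit rate looks a lot more convincing over 30 trials than over 3, which is why "bordering on indistinguishable" is a fair description for a short run.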
At 160 kbps, the MP3 is barely distinguishable. The main difference between it and the source still lies in its inability to deal with high frequencies (the cymbals, in this case). The Ogg and WMA files are indistinguishable.
At 192 kbps and above, all the files are indistinguishable from the source.
Despite its near-universal acceptance in devices of all sorts, the MP3 codec falls slightly short of the other two codecs in performance. The Ogg Vorbis codec really shines, becoming indistinguishable at an average bitrate of 128 kbps. The WMA 9 codec is just barely behind the Ogg codec, with only very subtle differences.
To summarize the codec settings again, the comparison was CBR (constant bitrate) MP3 vs. ABR (average bitrate) Ogg Vorbis vs. CBR WMA 9. These are the default settings for each program, and in my experience, changing MP3 to VBR (variable bitrate) instead of CBR makes no difference in quality for files that have constant sound. It is quite a different story, however, for files with long silences in them, like the audio track from a movie. VBR's ability to cut the bitrate down to 32 kbps or a similar value during the pauses can save you significant file space with no loss in quality. For the sound files I used, CBR and VBR are going to perform identically.
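To put a number on the VBR savings, here is a quick sketch that averages the bitrate over the loud and quiet parts of a track. The 60/40 split between dialogue and silence is invented for illustration; only the 32 kbps floor comes from the discussion above.

```python
def average_bitrate_kbps(segments):
    """Time-weighted average bitrate for a list of (duration_s, bitrate_kbps) pairs."""
    total_kilobits = sum(duration * kbps for duration, kbps in segments)
    total_seconds = sum(duration for duration, _ in segments)
    return total_kilobits / total_seconds

if __name__ == "__main__":
    # Hypothetical movie track: 60 s of dialogue, 40 s of near-silence.
    vbr_segments = [(60, 192), (40, 32)]
    cbr_kbps = 192

    vbr_kbps = average_bitrate_kbps(vbr_segments)
    savings = (1 - vbr_kbps / cbr_kbps) * 100
    print(f"VBR averages {vbr_kbps:.0f} kbps vs. {cbr_kbps} kbps CBR "
          f"({savings:.0f}% smaller file)")
```

For a file with no quiet stretches, every segment sits at the same bitrate and the average comes out equal to the CBR figure, which matches what I saw with these test files.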
Some quirks about the codecs: OggDropXP, being set to ABR by default, has a quality slider that I fiddled with for a long time until I got the file size to come out very close to that of the MP3 file. Sometimes it was impossible to get it to output at exactly a particular average bitrate. Windows Media Player has a slider with stops at 48, 64, 96, 128, 160, and 192 kbps. The output MP3 files were about 2% smaller than the advertised bitrate should have produced (a real bitrate of 188 kbps for a target bitrate of 192 kbps). The WMA files came out 2% larger than predicted by the target bitrate (196 kbps in reality for a 192 kbps target). This isn't bad, really, just interesting. The Ogg files were variable bitrate; their average bitrate was matched with that of the MP3s, so they also ended up 2% smaller than labeled.
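If you want to check the real bitrate of your own files, the calculation is just file size divided by duration. This is a minimal sketch. The 188 and 196 kbps figures are the ones quoted above, while the file size and clip length in the example are made up.

```python
def real_bitrate_kbps(file_size_bytes, duration_s):
    """Average bitrate implied by a file's size and playing time."""
    return file_size_bytes * 8 / duration_s / 1000

def deviation_percent(real_kbps, target_kbps):
    """How far the real bitrate is from the target, as a signed percentage."""
    return (real_kbps - target_kbps) / target_kbps * 100

if __name__ == "__main__":
    # Figures quoted in the text for a 192 kbps target.
    print(f"MP3: {deviation_percent(188, 192):+.1f}%")
    print(f"WMA: {deviation_percent(196, 192):+.1f}%")

    # Hypothetical file: 705,000 bytes for a 30-second clip.
    print(f"{real_bitrate_kbps(705_000, 30):.0f} kbps")
```

Note that this counts headers and tags as audio data, so very short files will look like they have a slightly higher bitrate than they really do.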
MP3 became flawless to my (fairly sensitive) human ears at 192 kbps, Ogg Vorbis at 128 kbps, and WMA at 160 kbps. I therefore make the following recommendations: If you have devices like car stereos and MP3 players that you want to be able to play your music on, use MP3 at 192 kbps. If you want to keep as much music on your computer as you can (smallest file size while maintaining quality), use Ogg Vorbis at 128 kbps. If you don't want to install any more software than is absolutely necessary, use WMA 9 at 160 kbps. My personal choice is Ogg Vorbis at 128 kbps (Quality=4.00).
The good news is, you don't have to take my word for it! Listen for yourself, and make your own judgments on the relative quality of the codecs. You can download the two audio tests right here:
Square & Sawtooth Test
Smells Like Teen Spirit Test
Each compressed file contains the source sound file, and all encoded files used in the tests. You will need the Ogg Directshow Filters to play the Ogg Vorbis files. Windows Media Player or Winamp can each play all four file types (the three tested and the source WAV file).
The equipment used in this test was as follows: A computer, running Windows XP, and outfitted with a Creative SoundBlaster Audigy sound card and Logitech's Z-560 4.1 speakers. Decent studio headphones were also employed to test distinguishability.
Trivia: The names "Ogg" and "Vorbis" were taken from Terry Pratchett's fantasy book series "Discworld". Nanny Ogg is a mischievous but kind witch of Lancre. Inquisitor Vorbis heads the Omnian Inquisition and enjoys torturing unbelievers.