The dismal sound quality of some popular music and a simple solution

Most of the music we listen to on a daily basis falls under the general categories of electronic, classical or jazz.  The sound system in our main listening area is a fairly good one composed of 600 series Thiel speakers powered by a pair of Parasound JC1 monoblock amps.  The CD transport is a Sony XA5400ES and the digital-to-analog conversion is handled by the DACs in an Anthem D2V processor.  This system is more than capable of letting the music recorded on standard redbook CDs shine and the music we typically listen to sounds terrific on it.  The sound is full, rich, expansive, exciting and alive.

Lately we have been spending time listening to some music that is intended for a broader mass-market than most of the music we play.  The recording companies spend a lot more money producing this more popular music and promote and market it much more aggressively because they end up making much more profit from it.  In many cases – not all, but many – the music sounds terrible, especially in comparison with the other types of music we listen to.  The sound is thin, flat, limited, dull and lifeless.  The music is good, the sound of the music is dismal.  What’s going on?

One possibility is that the difference lies in our sound system.  It’s more sophisticated, refined and expensive than the sound systems many people have in their homes.  Maybe the differences we’re hearing are only discernible on very high-end “stereophile” systems.  This isn’t the problem.  The difference in sound quality is striking and although it might not be apparent on a $200 stereo-in-a-box from Best Buy, it would be easy to hear on a system that costs 5% or 10% of the price of the system we listen to.  The problem isn’t the gear, it’s how the music is produced for the consumer.

Music is mixed and mastered to appeal to the perceived tastes of the kind of listener that likes a particular kind of music.  Maybe the problem lies in the belief of the big record companies that people who listen to music marketed for mass appeal are less sophisticated, less discerning, and more easily satisfied by cheap, shitty sound production than people who listen to classical, jazz or electronic music.  This isn’t the problem either.  I don’t know what beliefs about different groups of listeners are prevalent within the large record companies but the idea that people who listen to music designed for mass appeal can’t hear the difference between well-produced and poorly produced sound is too idiotic to consider.

Music is also mixed and mastered to sound good on the playback technology on which it will typically be heard.  This is where the problem lies.  Music produced for the mass market is mixed and mastered to sound good when encoded as an MP3 and played through earbuds attached to an MP3 player because this is the preferred technology for many people who enjoy mass produced music.  Both the storage format (MP3) and the playback technology (earbuds) put severe limitations on sound quality.

MP3 is designed to reduce the amount of data used to capture music so that music files will use less bandwidth.  The general idea is analogous to decreasing the resolution of your computer monitor.  With lower resolutions there is less data to process and you can get by with a less powerful graphics card in your computer.  You also get a crappier picture.  Think of the difference between current HDTV and the old-style standard TV as something like the difference between MP3 and redbook CD.  Or look at the difference between the image on your computer monitor set at 800 x 600 resolution and, say, 1920 x 1200 or 1600 x 1200.  MP3 reduces the resolution of a CD by removing about 90% of the information that is present on the CD.

That’s a lot of information to get rid of.  What do they cut out?  This is where the limitations of earbuds come into the picture.  Earbuds can’t produce either the high frequencies that give music expansiveness and air or the low frequencies that give it weight and power.  Because of this, music that is mixed for MP3 playback through earbuds simply cuts off both the high and low frequencies at artificial levels to eliminate them from the recording.  This eliminates a lot of the original music but not enough to produce the 90% reduction that characterizes standard MP3 recordings.  To get down to keeping only 10% of the information on the original CD, MP3 thins the music it keeps by eliminating music that is claimed to be redundant or unhearable.

Dynamic range refers to the difference between the loudest and softest passages in a song.  Music produced for mass appeal has almost no dynamic range at all because everything is mixed to play back at very close to maximum loudness.  (The trend toward producing music this way is often referred to as the loudness war).  Dynamic range can make music exciting and emotionally arousing as, for example, the music builds to a powerful and overwhelming crescendo or a kick drum, horn fanfare or guitar chord leaps out of the mix to make an emotional statement.  This all gets lost when the music is mixed at maximum loudness.  Emotional highs and lows are replaced with a constant blaring drone.  However, it works very well for MP3 playback over earbuds where listening takes place in noisy environments in which the music has to drown out ambient noise, and the combined limitations of the MP3 format and earbud technology make it difficult or impossible to produce dynamic range effects anyway.
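To put a rough number on the idea, here is a minimal Python sketch that measures dynamic range as the gap, in decibels, between the loudest and quietest short stretch of a signal.  Everything in it is an assumption made for illustration: the two test "songs" are synthetic sine tones, the 0.1 second window is arbitrary, and real loudness metering is considerably more involved than a simple RMS comparison.

```python
import math


def sine(amplitude, n_samples, freq_hz=440.0, sample_rate=44_100):
    """Generate a plain sine tone; stands in for a passage of music."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dynamic_range_db(samples, window=4_410):
    """Difference in dB between the loudest and quietest window (0.1 s each)."""
    levels = [rms(samples[i:i + window])
              for i in range(0, len(samples) - window + 1, window)]
    levels = [level for level in levels if level > 0]  # skip digital silence
    return 20 * math.log10(max(levels) / min(levels))


# Hypothetical "dynamic" track: a quiet passage followed by a loud one.
dynamic_track = sine(0.05, 44_100) + sine(0.9, 44_100)

# Hypothetical "loudness war" track: everything pushed close to maximum.
squashed_track = sine(0.9, 44_100) + sine(0.95, 44_100)

print(f"dynamic track:  {dynamic_range_db(dynamic_track):.1f} dB")
print(f"squashed track: {dynamic_range_db(squashed_track):.1f} dB")
```

With these made-up signals the first track comes out at roughly 25 dB and the second at well under 1 dB, which is the kind of gap the paragraph above is describing.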

Cutting out the highs, cutting out the lows, thinning out what is left, and mixing everything at maximum loudness works well for MP3s and earbuds.  It also produces music that is thin, flat, dull and lifeless.  When you listen to music that has been mixed and mastered for MP3 and earbuds on equipment that allows you to hear everything that is present in the recording, you hear that there’s not very much there to hear.

Why don’t classical, jazz and a lot of electronic music have this problem?  The main audiences for classical and jazz are not listening to MP3s through earbuds.  They’re listening to CDs played over sound systems.  If classical and jazz were mixed and mastered with the same limitations as music intended for the MP3/earbud market, it would sound terrible and the audience wouldn’t buy it.  Much electronic music is dance or club oriented.  The dull and limited MP3 mixes would die a quick death when played through a club’s powerful sound system.  Another factor that probably contributes to electronic music being mixed well is that for electronic music the musician and the music producer are often one and the same person.  It’s hard to make a piece of music you think sounds great and then kill it by mixing it for MP3 and earbuds.

There’s a very simple solution for this problem – create two mixes, one optimized for MP3 and earbuds and one optimized for redbook CD.  This would seem to be good for everyone.  It’s good for people who buy music because both consumers who prefer MP3 and earbuds and consumers who prefer CDs and home sound systems get music that is produced to sound as good as it can using the playback technology they prefer. It’s good for the record companies in several ways.  First, the record company’s cost for creating the second mix is negligible in comparison with the cost of producing, promoting and marketing a high profile group.  Second, there will be some segment of the market who will buy the music twice, one version to sound great on their MP3 player and another version to sound great through their home sound system.  Third, there will also be a segment of the market that will come to prefer the full, rich, high resolution sound that MP3s and earbuds can’t reproduce.  Those listeners will have a tendency to buy the better sounding mixes of the music they loved in their MP3 days.  The music industry thrives on repackaging the same music over and over again so they can sell it to people more than once.  Releasing dual mixes optimized for different playback technologies fits right into that strategy.  Finally, the music industry has been having fits about their loss of revenue as a significant segment of their market moved from CDs to MP3s.  One solution to that problem seems so obvious that it’s surprising it hasn’t become commonplace.  Give people something on CDs that they can’t get on MP3s.  If you’ve read this far, you know what that something is – the much higher quality sound that is possible with music mixed and mastered to make full use of the sound reproduction possibilities inherent in the redbook CD format.  The music companies get an expanded market for a very small additional cost.  
The music listener gets music that sounds great on whatever technology they prefer.  Everybody wins.

01/11/2011

The (Deteriorating) Sound of Music

In a recent review of Dr Dog’s We All Belong I wrote:

On decent sound equipment the lo-fi limitations of “We All Belong” are obvious and an obstacle to enjoying the music.  On lower quality gear it sounds fine which means that it’ll sound okay ripped to MP3 or MP4.  I expect there will be more and more of this as the record industry puts out popular music engineered to sound good on iPods even though they are capable of so much more.

A few days later I was sent an article about how recording engineers are unhappy that they are required to record music designed to be listened to on iPods.  Their complaint is that both compression into MP3 and MP4 formats and sound reproduction through the iPod’s earbuds put such limitations on sound quality that the sonic characteristics that can make music exciting and exhilarating are lost.  The article described sound engineering for the iPod as engineering to the lowest common technical denominator.  The engineers’ complaints are a bit disingenuous because popular music has always been recorded to sound good on the most popular playback platform of the day and this has usually involved compromising the sound quality of the recording.  The difference between today and yesterday in this regard is that the sound quality of MP3 and MP4 played back through iPods is so low that music engineered for these formats sounds like a huge step backward.

In a previous post (Why MP3s (& MP4s) Suck) I examined the basic reason why compression into the MP3 and MP4 formats produces markedly low quality sound.  It’s simple arithmetic – over 90% of the musical information present on a standard CD is lost when compressed to MP3 or MP4 so it’s no real surprise that it sounds terrible when played back through decent quality sound gear.  While the recording engineers were unhappy about this general problem with music compression, they also pointed out several specific ways sound quality is compromised in order to make the music sound better on iPods.

One problem is that high frequencies that will make music sound sharp, penetrating, immediate, open ended and rich when properly reproduced from a CD sound harsh, grating and abrasive when the music is compressed and played back through earbuds.  The solution is to preferentially eliminate the high frequencies from the mix which turns music that was alive and exciting into something flat and dull. 

A second problem is that music engineered to be played in compressed formats through earbuds is equalized to play at a consistently loud level throughout.  This means that the dynamic range of the music, which is the difference between the quietest and loudest passages on the record, is reduced to practically zero.  Think about how the commercials on TV often sound so much louder than the show.  They’re not; the loudness level of broadcast material is regulated so that it cannot exceed specified limits.  The difference is that the sound in the show has a relatively wide dynamic range and the sound of the commercial doesn’t.  The commercial is engineered so that all of its sound is equally loud and at the highest level allowed.  They’re trying to attract your attention without seeming to consider that the attention they’re getting is usually of the “Turn that fucking thing off!!” variety.

The loss of dynamic range is important because dynamic range carries a good deal of emotional information to the listener.  For example, classical music is often recorded with a very wide dynamic range and the swelling crescendos of an orchestra can sound compelling and uplifting even to people who don’t usually listen to classical music.  Another example would be the quiet-loud dynamic of grunge which elicits bursts of excitement in listeners.  Imagine how difficult it would be for an actor to convey differences in emotion if every line had to be delivered at the loudest possible volume.  Elimination of dynamic range by recording at consistently high levels throughout is done for several reasons.  Because of the extreme sound limitations of compressed formats they cannot reproduce the sound characteristics that make dynamic range emotionally compelling.  There is just too much information thrown away in the compression process for changes in dynamic range to be effective.  Also, iPods are designed to be used in public where ambient noise competes with the low quality sound being put out by the earbuds.  Unwaveringly high loudness levels are designed to drown out competing sounds from the listener’s environment.

Fortunately not all music is being engineered to such low standards.  Justice’s “Cross” and Digitalism’s “Idealism” are two dance/rock/electronica CDs that are very well recorded and sound stop-you-dead-in-your-tracks spectacular when heard through a quality sound system.  Artists and groups like Aimee Mann, Los Lobos and Cafe Tacuba have consistently put out very well recorded CDs. 

All of this leads to an interesting idea.  The music industry has been bitching and moaning incessantly about how CD sales are down and they’re losing so much money because people have been ripping music through peer-to-peer file sharing without paying for it.  The industry response has been almost exclusively to try to find a way to cash in on the MP3 format with the result that we have iTunes and all of its imitators.  As a revenue stream for the music industry this is fine but it’s hardly the only thing they can do.  An obvious idea that the music industry appears to have completely ignored is to give people something on CD that they can’t get on MP3 – high quality sound – and then promote the difference.  They have the technology but they appear to be clueless about what to do with it. 

Consider the recent SACD and DVD-Audio debacle.  Both of these formats are the mirror opposites of MP3 and MP4 compression; they provide much more sound information than a typical redbook CD rather than much less.  When played back through the appropriate equipment they sound terrific, much better than CDs and infinitely better than MP3s.  SACD and DVD-Audio also allow 5.1 surround sound playback which opens up worlds of possibility for musical good times none of which are possible in compressed formats.  So what did the music industry do?  They totally screwed the pooch.  First, they got involved in yet another format war.  (Do these people never learn?)  Then they completely botched promoting the enhanced formats so that most music consumers didn’t and still don’t have any clear idea of what they were all about.  Then they engaged in jaw-droppingly stupid marketing.  For example, one of the important advantages that Sony’s SACD format had over DVD-Audio is that SACD discs can contain both SACD and regular CD versions of an album.  You can give people both formats for the same price which encourages them to upgrade their sound system (by purchasing Sony’s SACD equipment) without penalizing them for buying CDs during the time it takes to upgrade one piece at a time.  So what did Sony do?  Ignored their advantage and shot themselves in the foot by releasing SACD-only discs that did not include regular CD mixes.  Finally the industry decided that only an older audience would be interested in the high quality formats or would spend the money to upgrade their systems so they devoted the majority of their high quality format releases to surround sound remixes of boomer bands like Pink Floyd, Big Brother and the Holding Company, and early Grateful Dead.  Like younger people don’t care about how music sounds and wouldn’t turn on to music that sounded really, really good?
This was all music that was originally designed and recorded for stereo and while the surround remixes are (sometimes) very cool, they are not half as cool as music that was built from the ground up for 5.1 would be.  Rather than abandon surround sound to the boomer nostalgia market, why not let current musicians have a go?  Imagine bands like Justice, Battles, Arcade Fire, Digitalism, Of Montreal, or Ojos de Brujo being given access to the technology and the support to produce surround sound recordings.  Think they wouldn’t produce music that would knock you out? and wouldn’t give people a reason to buy CDs?

You don’t need high-resolution sound formats like SACD to engineer recordings that sound much better than music played through iPods.  Any CD will do the trick and I’d imagine that enough people will care enough about having music that sounds terrific to buy CDs in addition to or in place of downloaded MP3 versions of the same songs.  In fact, why not release albums in two mixes?  You could have the shit-sound iPod version for download that is engineered to sound good given the severe technical limitations of compression and earbuds and the full-sound version that is engineered to sound as good and as rich as possible.  Of course, the music industry isn’t going to market the downloadable version as the “shit mix”.  They’re going to say it’s a specially engineered mix designed for your iPod and probably charge you more for it because it’s special, but so what?  The additional cost for the recording industry of producing a second mix would be minimal and you could let the consumer decide if they wanted to hear one, the other or both.  Apple isn’t likely to think this is a good idea but that shouldn’t stop anyone from small independents to the major music industry players from carrying it out.  Give us the best of both worlds and you’ll make more money. 

Being able to listen to music anywhere with an iPod is a wonderful technological development.  However, the self-interests of the recording industry and the companies that manufacture the cheap gear and cheaply operated hosting services that make this type of music possible shouldn’t obscure the fact that compressed music played through iPods comes at a significant loss in sound quality.  It doesn’t have to be this way.

09/19/2007

Why MP3s (& MP4s) suck

In 2005 I bought Laura, my wife, an iPod for Christmas.  She loaded up a bunch of tunes, put it on shuffle play, and fell in love.  Shuffle play would be very nice to have on our main sound system and Apple said you can have “CD quality sound” if you connect your iPod up to your main sound system using their dock.  Sounded good to me.  I hooked Laura’s iPod up to our sound system according to the Apple instructions to give it a test run.  I was fully prepared for the sound quality from the iPod to be less than the CD no matter what Apple said about “CD quality sound” but I thought we’d be using the iPod for shuffle play when we were busy doing things around the house and sound quality wouldn’t be so important.  When we wanted to listen, we’d use the CD player.  We picked a song that she had ripped to the iPod at random, cued up the CD the song had been ripped from in the CD player, and got ready to compare the two.  First I played about 30 seconds of the CD to have a reference, stopped it, and then played the same song on the iPod.  It literally hadn’t played 10 seconds before we both knew we wouldn’t pay 20 cents for sound that bad let alone the $200 or whatever it was that iPods cost at the time.  “CD quality sound”?  Not even remotely close.  I like music.  A lot.  I’d go out of my way to avoid having to listen to music that sounded this bad.

I hadn’t paid much attention to MP3 technology because I hadn’t had any interest in downloading music from the net.  I thought the awful sound might be due to something we were doing wrong based on not knowing about the tech so I started to learn about it.  Took about 5 minutes to see what the problem is.  With MP3 and MP4 compression you’re throwing away most of the information present on the CD; it’s not surprising that what you have left sounds terrible.

Music is sound and sound propagates as a wave that is continuous in time.  Digital recording technology is not continuous in time, it records information in discrete bits.  When music is digitally recorded the continuous sound wave is sampled many times each second.  In other words, very brief snapshots of the sound wave are taken very quickly and then these short sound samples are strung together at playback so that it sounds to the listener like you have one continuous sound.  Because the bits of sound that fall in between each of the samples are lost, the digital recording has less information in it than the original sound wave.  This is why a digital recording never sounds just like a live instrument or voice.  The more samples you take each second, the more of the original sound wave you capture and the better the music sounds (assuming each of the samples is the same size).  The number of samples taken per second is called the sample rate.  Multiply the sample rate by the size of each sample in bits and by the number of channels and you get the bitrate, which is the amount of data recorded each second.  It is usually measured in kilobits per second, abbreviated as kb/s.
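As a sketch of how these numbers work out for a standard CD, the short Python snippet below computes the bitrate of uncompressed stereo audio.  The three parameters are the standard redbook CD values (44,100 samples per second, 16 bits per sample, two channels); they are supplied here for illustration, and the function name is made up.

```python
# Standard redbook CD audio parameters (supplied here for illustration).
SAMPLE_RATE_HZ = 44_100   # samples taken per second, per channel
BITS_PER_SAMPLE = 16      # size of each sample
CHANNELS = 2              # stereo: left and right


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed audio in kilobits per second."""
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    return bits_per_second / 1000


cd_bitrate = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
print(f"CD bitrate: {cd_bitrate} kb/s")  # prints "CD bitrate: 1411.2 kb/s"
```

Halve the sample rate, the sample size or the number of channels and the bitrate halves with it, which is the trade-off the parenthetical above about samples being the same size is pointing at.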

MP3 and MP4 are compression formats that are designed to shrink the size of digitized files so that they take up less space when stored and less bandwidth when moved.  They are called “lossy” compression formats because information that was present in the original file is thrown away or lost when the file is compressed.  The designers of lossy compression formats for music claim that some of the info that is thrown away is “inaudible, or less audible to human hearing”.  Maybe it is, maybe it isn’t.  And what, exactly, does “less audible” mean?  Lots of room for marketing bullshit like “CD quality sound” there.

The question you want to ask is, how much info are they throwing away?  The bitrate for standard recording on a CD is 1411.2 kb/s.  MP3 and MP4 allow for variable bitrates with the trade off being smaller files but lower quality with lower bitrates and higher sound quality but larger files with higher bitrates.  The minimum “it’s good enough” bitrate (in other words, the crappiest sound quality they think people will pay money to hear) that has become something like an online standard is 128 kb/s.  This is the bitrate iTunes uses, for example.  Now compare the amount of info on the CD and the amount of info on the MP3 or 4 file ripped from the CD; 1411.2 kb/s of info on the CD and 128 kb/s left in the MP file.  An MP3 or 4 file has only kept a wee bit over 9% of the info that is on the CD.  In other words, a wee bit less than 91% of the musical info on the CD is thrown away when it is compressed into MP3 or MP4 format. 
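The arithmetic is easy to check for yourself.  This is a minimal Python sketch using the two bitrates quoted above; the function name is invented for the example, and the calculation only compares raw amounts of data per second, which is the sense in which “info” is being used here.

```python
CD_KBPS = 1411.2   # bitrate of a standard CD, as quoted above
MP3_KBPS = 128.0   # the common download bitrate, as quoted above


def fraction_kept(compressed_kbps, source_kbps):
    """Share of the source's data per second that survives compression."""
    return compressed_kbps / source_kbps


kept = fraction_kept(MP3_KBPS, CD_KBPS)
print(f"kept: {kept:.1%}, thrown away: {1 - kept:.1%}")
# prints "kept: 9.1%, thrown away: 90.9%"
```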

91% of the info is lost!?  Are you kidding?  Nope.  If you threw away 91% of a two hour movie, you’d be left with an 11 minute film.  How good would your favorite movie be if all of it except for 11 minutes was cut?  For a lot of films you’d still be watching the opening credits when the movie ended.  If you threw away 91% of an hour long TV show, all you’d have left would be commercials – and you wouldn’t even have half of those.    How happy would you be if 9 out of every 10 songs on your iPod were deleted?  How well would your head do if your body was cut off and thrown away?  The question shouldn’t be why does it sound so bad, it should be why does it sound like anything at all.

The problem here is related to bandwidth, in both senses of the term.  “Bandwidth” traditionally refers to a range of frequencies such as the range of audio frequencies, or the bandwidth, that humans can hear.  When MP3s and 4s eliminate info from the original CD source they probably shrink this bandwidth by cutting off the top and bottom ends of the human auditory bandwidth – they cut off the high treble and the low bass.  In order to achieve a 91% reduction they must also eliminate a lot of info from within the bandwidth they keep.  By cutting way back on the richness of the info present in the bandwidth they retain, all of which is info people make use of when listening to music, MP3s and 4s are severely compromising the quality of the music as originally recorded.

This drastic reduction of info within the bandwidth of human hearing that results in such a marked decrease in sound quality is carried out in order to save “bandwidth” in the “data transfer rate” sense of the term.  Internet connections, fast as some of them currently are, cannot move the enormous amount of data recorded at a 1411.2 kb/s bitrate at speeds that will make consumers happy and encourage them to download music from the net.  Files recorded at that bitrate also take up a lot more storage space on the hard drives of playback devices like iPods.  Would you have been willing to pay whatever you did for your iPod if it only held one tenth as many songs?
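To see what that trade-off looks like in practice, here is a rough Python sketch that estimates file size and download time at the two bitrates.  The four minute song and the 1,500 kb/s connection are hypothetical figures picked for illustration, and the estimate ignores file headers and network overhead.

```python
def file_size_mb(bitrate_kbps, duration_s):
    """Approximate file size in megabytes (1 MB = 1,000,000 bytes)."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000


def download_time_s(bitrate_kbps, duration_s, link_kbps):
    """Seconds needed to move the file over a link of the given speed."""
    return bitrate_kbps * duration_s / link_kbps


SONG_SECONDS = 4 * 60   # hypothetical four minute song
LINK_KBPS = 1_500       # hypothetical home connection speed

for label, kbps in (("CD", 1411.2), ("MP3", 128.0)):
    size = file_size_mb(kbps, SONG_SECONDS)
    wait = download_time_s(kbps, SONG_SECONDS, LINK_KBPS)
    print(f"{label}: {size:.1f} MB, about {wait:.0f} seconds to download")
```

Under these assumptions the CD-quality file is roughly 42 MB against a bit under 4 MB for the MP3, the same eleven-to-one ratio as the bitrates themselves.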

The multinational companies marketing the hardware you buy to listen to music in MP3 and 4 formats tell you that the sound they produce is indistinguishable from the sound present on a CD of the same music.  “CD quality sound!”  They tell you panels of experts say so.  If you believe this, I have a beautiful parcel of land in sunny Florida you’d surely be interested in buying.  Another approach they take is to tell you that, yes, CDs will sound better than MP3s and 4s but only if you have a really expensive “audiophile” sound system that is well beyond the means of most people.  The people who make cheap, crappy electronic equipment that can be mass-produced in low-wage factories outside the US and sold for enormous profits at Best Buy would love it if you believe this.  You’re going to have to have better sound reproduction equipment than an iPod or a $250 stereo-in-a-box to hear the difference between a CD and an MP3, but not that much better.  Another way to look at it is this:  The difference between an MP3 or 4 and the same music on a CD is huge.  If your sound reproduction gear doesn’t let you hear it, you’ve got shitty gear.  Most CDs sound infinitely better on a decent entry level sound system.  Just because you can’t hear that it’s there doesn’t mean you’re not missing it.

Discussion of other issues with limitations in sound quality of music heard in compressed formats can be found in the post The (Deteriorating) Sound of Music.

02/06/2007