#PONO #hdaudio @Neilyoung - This is an opinion blog posted via @twitlonger. Please retweet.
First I want to say that I have extremely high respect for Neil Young as a musician. I believe his motives behind the #PONO are pure, but I also believe the project comes from a lack of understanding of how digital audio actually works.
The PONO device actually looks like a fantastic device for playing audio, but not because of so-called "high definition". High definition is meaningless in audio. It is a marketing term that started showing up in the audio world because it has real quality, and thus marketing value, in the video world.
With video, what is displayed on our television sets is a bitmap, a finite number of pixels. Higher definition means more pixels, and more pixels means more information that is usable to our eyes. The term high definition has very real application in video, and as a result the term has taken on incredible market value. That market value is being misapplied in the audio world to take money away from people under the false claim that it produces better sound than standard CD audio.
Digital audio in the standard Pulse Code Modulation (PCM) format has two components that affect the ability to reproduce an analog sound wave: the bit depth and the sampling frequency.
The Red Book CD standard uses a bit depth of 16 bits. That means each sample on each channel has 2^16 possible values, which is 65536.
This is what determines the possible dynamic range of your audio: how big the decibel difference between the quietest sound and the loudest sound can be. With 16-bit audio, the dynamic range is about 96 decibels.
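As a rough sketch of where the 96 decibel figure comes from: the ratio between the largest and smallest value a sample can hold is 2^bits, and expressing that ratio in decibels is 20 times its base-10 logarithm. The function name below is made up for illustration.

```python
import math

def dynamic_range_db(bit_depth: int) -> float:
    """Ratio between the number of levels and a single step, in decibels."""
    levels = 2 ** bit_depth
    return 20 * math.log10(levels)

for bits in (16, 24):
    print(f"{bits}-bit: {2 ** bits} levels, about {dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of levels, which adds roughly 6 decibels.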
No music even comes close to that much dynamic range, and there is a reason why. Your volume knob will likely be turned up so that the quietest part of a song can be heard above the background noise in your listening environment.
In a quiet rural area, the ambient noise is already at about 30 decibels. It's even higher in cities, in your car, at your office, etc.
30 decibels + 96 decibels = 126 decibels. That will cause your ears to bleed if not worse, and it is doubtful your speakers are even capable of producing it.
16-bit audio has enough dynamic range for human hearing.
It is true that mastering is often done at 24-bit or 32-bit float or even 48-bit, but the reason for this has nothing to do with what humans can hear. It has to do with extra headroom to avoid clipping. Here's why.
Recordings are made at line level, and in the digital domain every sample must sit at or below 0 dB. Here 0 dB and negative dB do not mean no sound. Digital levels are measured relative to full scale (dBFS), so 0 dB is simply the largest value a sample can hold and everything quieter is a negative number. This is a different scale from the sound pressure level used for acoustic loudness, where 0 dB is roughly the quietest sound a person can hear and breathing is about 10 dB.
Digital audio enforces this: 0 dB is the loudest you can record, and it simply cannot record signals hotter than that. If you attempt to, the result is a distortion called clipping. However, if you record so that the loudest sound in your recording is far below that, you lose dynamic range.
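A minimal sketch of what clipping looks like in the numbers, assuming signed 16-bit samples and a made-up test signal (one cycle of a sine wave). The function name and the gain values are illustrative only.

```python
import math

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample

def record_16bit(signal, gain):
    """Scale a signal in [-1.0, 1.0] by gain and store it as 16-bit integers.

    Anything that would exceed the 16-bit limits is flattened to the limit,
    which is the clipping distortion described above.
    """
    samples = []
    for value in signal:
        scaled = round(value * gain * FULL_SCALE)
        samples.append(max(-FULL_SCALE - 1, min(FULL_SCALE, scaled)))
    return samples

# One cycle of a sine wave, 16 points (hypothetical test signal).
sine = [math.sin(2 * math.pi * n / 16) for n in range(16)]

clean = record_16bit(sine, gain=1.0)  # peak lands exactly on full scale
hot = record_16bit(sine, gain=2.0)    # about 6 dB too hot: the tops get cut off

print("samples stuck at full scale, clean:", sum(1 for s in clean if s == FULL_SCALE))
print("samples stuck at full scale, hot:  ", sum(1 for s in hot if s == FULL_SCALE))
```

In the hot recording several neighboring samples sit at the same maximum value, so the rounded top of the wave becomes a flat line.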
It is hard to know ahead of time what your loudest sound is going to be. Recording at 24-bit solves that problem because you can be conservative in your guess and target a peak of, say, negative 12 decibels (some even suggest negative 18). While that doesn't use the full dynamic range 24-bit has to offer, it means that if you guessed wrong and the peak volume goes above your guess, it is still probably not going to clip.
The dynamic range you recorded is not the full dynamic range 24-bit is capable of, but it is more than the dynamic range that 16-bit is capable of, which in turn is more than the dynamic range you or I can safely reproduce on our speakers. So after the mastering is done, you can peak normalize the 24-bit audio and reduce it to 16-bit audio, resulting in playback that genuinely represents the full dynamic range of the recording.
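Here is a simplified sketch of that last step, assuming a made-up 24-bit take that peaks near negative 12 decibels. The function names are hypothetical, and a real mastering chain would also add dither before rounding to 16 bits, which this leaves out.

```python
import math

def peak_dbfs(samples, full_scale):
    """Peak level relative to full scale, in dBFS (0 is the maximum)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak / full_scale)

def normalize_to_16bit(samples_24):
    """Peak normalize 24-bit integer samples, then round them to 16-bit."""
    peak = max(abs(s) for s in samples_24)
    gain = 32767 / peak  # the loudest sample lands on 16-bit full scale
    return [round(s * gain) for s in samples_24]

# Hypothetical 24-bit take whose loudest sample peaks near -12 dBFS.
FS_24 = 2 ** 23 - 1
target_peak = FS_24 * 10 ** (-12 / 20)
take = [round(target_peak * math.sin(2 * math.pi * n / 64)) for n in range(64)]

master = normalize_to_16bit(take)
print(f"24-bit take peak:   {peak_dbfs(take, FS_24):.1f} dBFS")
print(f"16-bit master peak: {peak_dbfs(master, 32767):.1f} dBFS")
```

The 12 decibels of safety margin cost nothing in the final product, because the normalized master uses the whole 16-bit range.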
That is why mastering is done at bit depths higher than 16 bits. It is not because there is an audible advantage in dynamic range to the human ear.
The other aspect of PCM audio is the sampling frequency. PCM audio is discrete mathematics. Samples of the analog waveform are taken at regular intervals, and the number of samples taken per second is called the sampling frequency. In standard CD audio there are 44,100 samples for every second of audio. For DVD and Blu-ray, there are 48,000 samples for every second of audio. For so-called "high definition" audio, there are usually either 96,000 or 192,000 samples for every second of audio.
More samples means better, right?
To help explain this, I will take you back to your Intermediate Algebra class.
If you have an analog line of the form f(x)=mx+b, you can determine the formula to reconstruct that line from just two sample points. Taking 27 sample points from a line will not result in a better line, it only results in additional resources needed to store those sample points. Two is all you need to perfectly reconstruct an analog line.
If you have an analog parabola of the form f(x)=ax^2 + bx + c, you can determine the formula to reconstruct that parabola from just three sample points. Taking 41 sample points from a parabola will not result in a better parabola, all you need is 3 distinct points, any three points.
And so forth.
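The parabola case can be checked with a short sketch. It uses Lagrange interpolation, which is one standard way to rebuild a polynomial from sample points, and exact fractions so that no rounding error creeps in. The coefficients and sample positions are arbitrary choices for the example.

```python
from fractions import Fraction

def lagrange(points):
    """Return a function evaluating the unique polynomial through the given points."""
    def evaluate(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(points):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return evaluate

def parabola(x):
    return 2 * x * x - 3 * x + 5  # hypothetical a=2, b=-3, c=5

three = [(x, parabola(x)) for x in (-1, 0, 4)]     # 3 sample points
many = [(x, parabola(x)) for x in range(-20, 21)]  # 41 sample points

from_three = lagrange(three)
from_many = lagrange(many)

# Evaluate at x = 25, which is not one of the sampled positions.
print(from_three(25), from_many(25), parabola(25))
```

All three values agree: the 38 extra sample points add storage, not accuracy.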
Pulse Code Modulation used with digital audio is similar.
Given a sampling frequency of F, you can *perfectly* reproduce the analog waves as long as the frequency of those waves is lower than F/2. This is the Nyquist-Shannon sampling theorem and it is what makes digital audio possible.
A sample rate of 44.1 kHz used in CD audio is capable of *perfectly* reproducing all audio frequencies below 22.05 kHz. This is solid mathematics and if it weren't solid, digital audio would not work.
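A small sketch of the theorem in action, using the Whittaker-Shannon interpolation formula that goes with it. A 15 kHz tone (my choice for the example) is sampled at 44.1 kHz, so there are only about three samples per cycle, and then its value is rebuilt at a moment between two samples. The theorem assumes infinitely many samples. This sketch uses a finite window, so the match is very close rather than exact.

```python
import math

SAMPLE_RATE = 44_100  # CD sampling frequency in Hz
TONE_HZ = 15_000      # test tone, below the 22.05 kHz limit

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def tone(t):
    return math.sin(2 * math.pi * TONE_HZ * t)

# Samples taken only at whole multiples of 1/44100 of a second.
N = 4000
indices = range(-N, N + 1)
samples = [tone(n / SAMPLE_RATE) for n in indices]

def reconstruct(t):
    """Whittaker-Shannon interpolation from the stored samples."""
    position = t * SAMPLE_RATE
    return sum(s * sinc(position - n) for s, n in zip(samples, indices))

# A moment that falls between sample 0 and sample 1.
t = 0.37 / SAMPLE_RATE
print(f"true value:    {tone(t):+.4f}")
print(f"reconstructed: {reconstruct(t):+.4f}")
```

The reconstruction does not connect the dots with straight lines. It recovers the one band-limited wave that passes through every sample.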
Humans can only hear audio frequencies from about 20 Hz to 20 kHz. Adult humans often can't hear frequencies above 15 kHz. Music rarely goes above 10 kHz.
The sampling frequency used in the audio CD is sufficient for perfectly reproducing every audio frequency that you or I are physically capable of hearing. It may not be the best sampling frequency for cats or bats, but it is sufficient for humans. Higher sample rates have more data, but that additional data does not have any value in the reconstruction of analog audio waves that you and I can hear.
Again, digital mastering is often done at higher sample rates, but this is to reduce the build-up of rounding errors during the mastering process. When the mastering is finished, re-sampling to 44.1 kHz or 48 kHz is not going to lessen the ability to perfectly reproduce the analog audio waves.
Not down-sampling does, however, increase the cost of playing back that audio.
When digital audio is played, it has to go through a conversion process from PCM data to the analog line level signal that is sent to an amplifier and then through your speakers, where your speakers vibrate to reproduce the sound we hear.
This conversion process is called DAC - Digital to Analog Conversion.
In the modern PC era, most computers originally were only able to convert 48 kHz sample rates. Other sample rates had to be re-sampled on the fly to 48 kHz. This is why historically CD-ROM drives had an analog cable to the sound card. The CD-ROM drive would turn the 44.1 kHz CD audio into analog and send that analog audio to the sound card so your PC would not have to re-sample it.
Nowadays, starting I believe with the Sound Blaster 2, most sound cards have DAC capabilities for both 44.1 kHz and 48 kHz. For those that can only do 48 kHz, processors are powerful enough that audio CDs are just re-sampled on the fly to 48 kHz before going through the DAC.
The so-called "High Definition" audio needlessly adds two additional sample rates - 96 kHz and 192 kHz.
Playback on most PCs will result in the audio being re-sampled down to 48 kHz or 44.1 kHz before playback, meaning the audio you hear isn't even using all the samples that are in the high-def audio file to begin with. For playback without re-sampling, your sound card or digital audio receiver has to be able to decode 96 kHz and 192 kHz PCM data and that needlessly drives up the cost of playback equipment because it isn't going to produce a better analog sound wave. It mathematically can't, not in our hearing range.
In fact it is quite likely your audio system has a band limiter that removes audio frequencies above human hearing, simply because attempting to send them through speakers designed for human hearing can result in distortion. We don't want the high frequencies we can't hear, so it is ridiculous to play back audio using a sample rate where the only benefit is the ability to reproduce those high frequencies that can cause distortion in the frequencies we can hear.
High Definition Audio quite simply is a fraud. It is snake oil. The higher sample rate and higher bit depth do not benefit the audio, all they do is increase the file size and the complexity of playback.
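To put a number on the file size, here is a back-of-the-envelope sketch for uncompressed stereo PCM. Lossless compression such as flac shrinks both figures, so treat this as the raw data rate rather than what lands on disk. The function name is made up for the example.

```python
def pcm_bytes_per_minute(sample_rate_hz, bit_depth, channels=2):
    """Size of one minute of uncompressed PCM audio, in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * 60

cd = pcm_bytes_per_minute(44_100, 16)
hd = pcm_bytes_per_minute(192_000, 24)

print(f"16-bit / 44.1 kHz: {cd / 1e6:.1f} MB per minute")
print(f"24-bit / 192 kHz:  {hd / 1e6:.1f} MB per minute ({hd / cd:.1f}x larger)")
```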
Does that mean high-definition audio should be avoided? Unfortunately no, there is *sometimes* value to purchasing high definition audio. Not because it is 24-bit or has a higher sample rate, but rather because *sometimes* it is higher fidelity.
CD mastering is suffering from something called the "loudness war". I used to blame the mastering engineers, but when I continually bitched about it, some of them told me it really isn't their choice. They have a job to do, and they are doing what they are told to do. If they don't compress the dynamic range to make it "louder", the artist and/or producer fires them and goes with someone who will.
What is happening is that during the mastering they are intentionally reducing the fidelity, the dynamic range, to make the audio sound louder. There are two reasons for this:
A) If it is louder, it stands out more compared with other songs on the radio. So there is marketing value.
B) Increasingly music is listened to in a car or through ear-buds while there is quite a bit of background noise. By reducing the dynamic range, the quiet parts can be brought above the ambient noise and heard.
High-fidelity stereos in the living room are, to many people, a thing of the past, so they generally are not mastering for them.
A clear visual representation - and credit where credit is due, I found these images on http://forums.stevehoffman.tv/
1993 Remaster of Jesus Christ Superstar: http://s1246.photobucket.com/user/AliceWonder32/media/JCS-1993_zps29785451.jpg.html
2012 Remaster of Jesus Christ Superstar: http://s1246.photobucket.com/user/AliceWonder32/media/JCS-2012_zps9d9c3668.jpg.html
Compare the images, you can quite easily see the difference. The quiet parts in the 2012 remaster are much louder than they are in the 1993 remaster. That's the loudness war.
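For those who would rather see numbers than waveforms, here is a toy sketch of dynamic range compression. It is not how a real mastering compressor works (those follow the signal level over time and usually work in decibels). The threshold, ratio, and test track are all made up. It only shows the basic idea: squash the loud parts, then turn everything up.

```python
import math

def to_db(ratio):
    return 20 * math.log10(ratio)

def compress(samples, threshold=0.25, ratio=4.0):
    """Toy compressor: the part of each sample above the threshold is divided
    by ratio, then the whole track is turned up so its peak is at full scale."""
    squashed = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        squashed.append(math.copysign(level, s))
    makeup = 1.0 / max(abs(s) for s in squashed)
    return [s * makeup for s in squashed]

# Hypothetical track: a quiet passage peaking at 0.05, then a loud one peaking at 1.0.
quiet = [0.05 * math.sin(2 * math.pi * n / 32) for n in range(64)]
loud = [1.0 * math.sin(2 * math.pi * n / 32) for n in range(64)]
track = quiet + loud

def quiet_to_loud_gap_db(samples):
    quiet_peak = max(abs(s) for s in samples[:64])
    loud_peak = max(abs(s) for s in samples[64:])
    return to_db(loud_peak / quiet_peak)

print(f"before: {quiet_to_loud_gap_db(track):.1f} dB between quiet and loud peaks")
print(f"after:  {quiet_to_loud_gap_db(compress(track)):.1f} dB")
```

Both versions peak at full scale, but in the compressed one the quiet passage sits much closer to the loud one, which is the same thing the two waveform images show.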
With SACD and high definition audio, the target market is people with more money who are more likely to have quality home stereo systems that they use, rather than just listening to all their music through an iPhone or $20 speakers attached to a PC.
As a result, *sometimes* the mastering is intentionally done better than it is for CD, with higher fidelity. In cases where the mastering is done differently with higher fidelity in mind, it may be worth it to purchase the high definition version of the audio simply because you are getting a better master.
And that makes me sad, the audience buying CDs is being dumbed down with respect to what good music should sound like.
What I would like to see is not a high definition audio service, but a high fidelity audio service.
Quality mastering with dynamic range intact but without the unnecessary bloat of 24 bits per channel or absurdly high sampling frequencies.
16 bit 48 kHz flac files for typically $1.69 per track.
128 kbps opus / aac or 192 kbps mp3 encoding of same tracks at $1.19 per track.
That is what I would like to see.
And unlike "high definition" audio, where some crappy over-compressed masters are sold as high definition, only properly mastered high fidelity albums would be offered.
That is what I would like to see.
Please re-tweet this, and if you use bitcoin and feel so inclined, you can send a small something to: