SoundStage! Surrounded! - Digital Evolution -- Part One (4/2003)

Surrounded!

April 2003

Digital Evolution -- Part One

I originally formed the idea for "Surrounded" after becoming convinced of surround sound’s ability to create a better musical experience in the home. Coming to this realization posed what I thought would be considerable obstacles, from both philosophical and logistical standpoints. I mean, I’m an audiophile, and you and I both know that we audiophiles can be slow to change our long-held beliefs.

The philosophical speed bump turned out to be quite easy to overcome. If I could have better sound, which equated to more musical realism, which led to more enjoyment, I could suppress any feelings of nostalgia for two-channel-only audio. In this instance, my logical self quickly won out.

The logistical issue was one I had partially addressed when adding home-theater capability to my system. Of course, multichannel music requires arguably even more precise speaker placement and greater system resolution, so it was not accomplished without planning and a considerable financial outlay. But I came to find that the promise of better music in the home conquers almost any tradition, budgetary consideration, or décor issue -- as many of you already know.

Just when I thought I was home free -- "I’d made a seamless transition," I thought to myself, "like a well-designed crossover, heh, heh, heh..." -- another obstacle smacked me in the head. I didn’t have the technical knowledge to really communicate to our readers the finer points of multichannel music and the high-resolution formats. Sure, I could listen and from that standpoint things were clear, but I wanted to keep pace with the technical aspects, too. It’s a good thing a writer has his sources, or resources in this case.

Thus was born the idea to develop a primer for digital audio that would take us through CD and into both DVD-Audio and SACD. The goal is not to declare a winner, but to define the participants. So, armed with a desire to understand the two principals in the high-resolution multichannel-music arena, I enlisted the fine gentlemen at Switzerland-based Anagram Technologies to help. Anagram Technologies, for those who don’t know, is a Swiss company that provides digital solutions to some of the brightest manufacturers in audio. These companies include Cairn, Audio Aero, Camelot Technology, Audiomecca, Talk Electronics, and Nagra, just to name a handful. Orpheus Labs -- a company wholly owned by Anagram’s principals and maker of the very fine Orpheus Two multichannel preamplifier -- has benefited from their knowledge base as well. And with that, off we go into our conversation with Florian Cossy and Thierry Heeb.

Jeff Fritz: For those not familiar, please tell us about Anagram Technologies, Orpheus Laboratories, and your background in electronics design.

Florian Cossy: The Anagram story is, first of all, a story of friendship between Thierry Heeb [Anagram Technologies’ DSP engineer] and myself. We have been friends for about 13 years now, both having related university cursus [degree] -- Thierry is a mathematical engineer and I am an electrical engineer. We both had the opportunity to work as consultants for Goldmund back in 1996. As time progressed, we agreed less and less with Goldmund’s "philosophy" and we decided to create our own company. Anagram Ltd. was born.

The goal of this first company was to provide innovative A/D and D/A conversion solutions to the high-end-audio domain. We developed the ATF module and then found a customer base -- Audio Aero and Cairn were the first ones. What is really funny is that Goldmund has never been a customer, even though they have tried our A/D and D/A solution and found it outstanding!

We were not able to do it all ourselves and we decided to expand. In May 2000, Reynald Gentizon became the third person to join the company as a partner, and we decided to change the structure of the company: Anagram Ltd. became Orpheus Laboratories Ltd. and we created Anagram Technologies Inc. The goals were quite different: Orpheus Laboratories was used as a demonstration brand for Anagram Technologies’ solutions, so that our potential customers would have working units to evaluate. Daniel Oertli joined us in late 2000, and he is the last partner in both companies.

Orpheus is today growing and seems to interest audiophiles all around the world. That's why we have developed an entire system with even more new products due to be announced this year.

Anagram has also grown a lot since 2000. We now have many different types of customers -- high-end audio, mass-market audio, semiconductors, and even automotive companies. Some of them do not want to be named. We are seven people, and we will be nine by April 1, 2003. The goal today is not only to provide solutions to high-end audio manufacturers, but also mass market -- not only in the audio domain but also in the video and measurement domains.

Today, we are trying to separate both companies in order to clarify for people what we do.

JF: As a beginning to our discussion on the high-resolution digital formats -- namely SACD and DVD-Audio -- can you break down the concept of the CD?

FC: In its mechanical construction, a CD can be seen as similar to the LP: You have a spiral track that covers the disc between the diameters of 50mm and 116mm on the 120mm disc. The track pitch is 1.6 micrometers.
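To get a feel for these dimensions, here is a minimal Python sketch that estimates how many turns the spiral makes and roughly how long the track is. It uses only the three figures quoted above; treating the spiral as a band whose area is divided by the track pitch is our simplifying assumption, and the function names are purely illustrative.

```python
import math

# Figures quoted in the interview.
INNER_DIAMETER_MM = 50.0
OUTER_DIAMETER_MM = 116.0
TRACK_PITCH_MM = 1.6e-3  # 1.6 micrometers

def spiral_turns(inner_d, outer_d, pitch):
    """Number of turns: radial width of the recorded band divided by the pitch."""
    return (outer_d - inner_d) / 2.0 / pitch

def spiral_length_mm(inner_d, outer_d, pitch):
    """Approximate track length: area of the recorded band divided by the pitch."""
    r_in = inner_d / 2.0
    r_out = outer_d / 2.0
    return math.pi * (r_out ** 2 - r_in ** 2) / pitch

turns = spiral_turns(INNER_DIAMETER_MM, OUTER_DIAMETER_MM, TRACK_PITCH_MM)
length_km = spiral_length_mm(INNER_DIAMETER_MM, OUTER_DIAMETER_MM, TRACK_PITCH_MM) / 1e6

print(f"about {turns:,.0f} turns and {length_km:.1f} km of track")
```

Under these assumptions the spiral makes a little over 20,000 turns and runs for several kilometers.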

A CD has a table of contents that gives the position of each track -- from a logical (not physical) standpoint -- and the length of the track. This can be seen as similar to the index of a book.

Data is stored with redundancy, which means that when the pickup reads the data, there are control bits that confirm a correct reading (checksum bits for example¹). If a reading error occurs, the decoder either has enough information in the redundant bits to correct it, or it has to use an algorithm to fill in the missing information in the datastream.

JF: How, mathematically and from a physical-structure standpoint, is the maximum resolution of a CD determined?

Thierry Heeb: First of all one has to understand that the CD is a digital medium. That is, all the information on the CD can either be expressed as a "1" or as a "0." There are no in-between values possible. This fundamental piece of information is called a "bit."

As with any digital medium, a certain number of bits are grouped together to form a significant piece of information called a "word." According to the Red Book (i.e., the specifications for the CD format), a word is formed of 16 bits. One 16-bit word represents a sample of one audio channel. There are two audio channels on a CD.

The sampling frequency of the CD is specified to be 44.1kHz. To state it simply: CD is encoded as 16-bit PCM at 44.1kHz. (For more information on this, please refer to the answer to the next question.) Each bit on the CD is represented as either a small hole or a small bump (depending on its value of 0 or 1) that will deflect the laser differently allowing the recognition of a 1 or a 0.
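The three numbers given so far (16-bit words, two channels, 44.1kHz) are enough to work out the raw audio data rate. The short Python sketch below does that arithmetic; it counts only the audio samples themselves and ignores the redundancy and control bits mentioned earlier, and the names are ours.

```python
# Figures quoted in the interview.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM payload in bits per second (no error-correction overhead)."""
    return sample_rate_hz * bits_per_sample * channels

bit_rate = pcm_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
bytes_per_second = bit_rate // 8
bytes_per_minute = bytes_per_second * 60

print(f"{bit_rate:,} bits per second")
print(f"{bytes_per_second:,} bytes per second")
print(f"{bytes_per_minute:,} bytes per minute of stereo audio")
```

That works out to 1,411,200 bits per second, or a little over 10 million bytes for each minute of music.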

Let’s dig deeper into the matter of CD resolution. With 16 bits per word, you can actually code 2¹⁶, or 65,536, different values. Imagine having 16 LEDs, and each of them can be turned on or turned off. The first LED will give you two choices (on or off), the same for the second LED, and so on up to the 16th LED. So we end up with 2 x 2 x 2 x 2… (16 times), which calculates to 65,536 different values. But how does this relate, first to numbers, and then to an analog audio signal?

Let us put a numerical value (N) on the 16 bits described above. Let us call the bits in the word we are considering B0, B1, and so on up to B15, and compute N according to the following formula:

N = -B15 * 2¹⁵ + B14 * 2¹⁴ + B13 * 2¹³ + ... + B2 * 2² + B1 * 2¹ + B0 * 2⁰

This gives a univocal [unambiguous] mapping of our 16 bits to the number range -32,768 to +32,767. This number range can then be mapped to an analog voltage with proportional values. This is indeed what a D/A converter does. If we imagine a 4V peak-to-peak output signal, -32,768 would correspond to -2V, +32,767 to +2V, and 0 to 0V. Intermediate values are mapped linearly.
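The formula and the voltage mapping can be tried out in a few lines of Python. This is only a sketch: the function names are ours, and the choice to divide by 32,768 (so that -32,768 lands exactly on -2V and +32,767 lands one step short of +2V) is an assumption about how an idealized converter would scale.

```python
def word_to_number(word):
    """Apply N = -B15*2^15 + B14*2^14 + ... + B0*2^0 to a 16-bit word."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit word")
    n = 0
    for k in range(16):
        bit = (word >> k) & 1                       # Bk
        weight = -(2 ** 15) if k == 15 else 2 ** k  # B15 carries a negative weight
        n += bit * weight
    return n

def number_to_volts(n, peak_volts=2.0):
    """Idealized linear mapping of -32,768..+32,767 onto roughly -2V..+2V."""
    return n / 32768 * peak_volts

for word in (0x0000, 0x0001, 0x7FFF, 0x8000, 0xFFFF):
    n = word_to_number(word)
    print(f"{word:016b} -> {n:7d} -> {number_to_volts(n):+.5f} V")
```

Note how the word with only B15 set gives the most negative value, and the word with every bit set gives -1.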

As can be seen from the above, the smallest signal variation that can be coded on 16 bits is equal to 1/32,768 x full scale. In other words, we have a resolution of 1/32,768 with a 16-bit coded signal, which corresponds to about -96dB THD+N, as each bit represents about -6dB.

In comparison, whereas a 16-bit signal has a precision of 1/32,768, a 24-bit signal has a precision of 1/8,388,608, or 256 times better than 16 bit.
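The "about 6dB per bit" rule of thumb and the 256-times comparison are easy to check numerically. The Python sketch below converts step ratios to decibels with the usual 20 x log10 formula; the names are ours. Whether 16 bits are counted as 65,536 steps overall or as 32,768 steps on each side of zero shifts the result by one bit's worth, so both figures are printed.

```python
import math

def ratio_to_db(ratio):
    """Express an amplitude ratio in decibels."""
    return 20.0 * math.log10(ratio)

db_per_bit = ratio_to_db(2)            # each extra bit doubles the number of steps
db_for_65536_steps = ratio_to_db(2 ** 16)
db_for_32768_steps = ratio_to_db(2 ** 15)
improvement_24_vs_16 = 2 ** 23 // 2 ** 15

print(f"one bit:       {db_per_bit:.2f} dB")
print(f"65,536 steps:  {db_for_65536_steps:.2f} dB")
print(f"32,768 steps:  {db_for_32768_steps:.2f} dB")
print(f"24-bit vs 16-bit precision: {improvement_24_vs_16} times finer")
```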

There are methods to enhance the apparent resolution of CD past the theoretical 16 bits. These techniques are either based on dithering or noise shaping. Dithering acts by adding a pseudo-random low-level signal to the audio to be coded. This pseudo-random sequence cleans out the quantification [quantization] noise (caused by truncating a value to 16 bits). Noise shaping, on the other hand, works by moving unwanted noise to less-critical parts of the spectrum. Indeed, with those techniques, it is possible to get more than 16 bits of resolution on part of the spectrum of a 16-bit coded signal.
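A small experiment shows why adding noise can reveal detail below one step. The Python sketch below is an illustration under our own assumptions, not the scheme used by any particular product: it works in units of one least-significant step, uses triangular (TPDF) dither built from two uniform random numbers, and does not attempt noise shaping. A steady level of 0.3 of a step always rounds to zero without dither, but with dither the average of many rounded samples comes back to about 0.3.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole step (one step = one least-significant bit)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular dither between -1 and +1 step (difference of two uniforms)."""
    return rng.random() - rng.random()

def average_output(value, n_samples, rng, dither):
    """Quantize the same value many times and return the mean result."""
    total = 0
    for _ in range(n_samples):
        noise = tpdf_dither(rng) if dither else 0.0
        total += quantize(value + noise)
    return total / n_samples

rng = random.Random(2003)   # fixed seed so the run is repeatable
quiet_value = 0.3           # a signal level smaller than one step

plain = average_output(quiet_value, 20_000, rng, dither=False)
dithered = average_output(quiet_value, 20_000, rng, dither=True)

print(f"without dither: {plain:.3f}")
print(f"with dither:    {dithered:.3f}")
```

The price is a slightly higher noise floor, which is the part noise shaping then tries to move to where it is less audible.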

JF: Give us a synopsis of PCM audio, its benefits, and limitations.

TH: PCM stands for pulse code modulation and is based on Shannon’s Sampling Theorem.

Shannon’s Sampling Theorem states that a stationary and band-limited signal can be exactly reconstructed from its samples, provided the sampling frequency is higher than twice the maximum frequency present in the signal.
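What happens when the condition is not met can be shown with a few lines of Python. In this sketch (the tone frequencies and function name are our own choices), a 30kHz cosine is sampled at 44.1kHz, which is less than twice its frequency. The resulting samples are indistinguishable from those of a 14.1kHz cosine, so the original tone cannot be recovered from them.

```python
import math

FS_HZ = 44_100  # CD sampling frequency

def sample_cosine(freq_hz, n_samples, fs_hz=FS_HZ):
    """Return n_samples of a unit-amplitude cosine sampled at fs_hz."""
    return [math.cos(2.0 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]

inside_band = sample_cosine(14_100, 64)   # below half the sampling frequency
outside_band = sample_cosine(30_000, 64)  # above half the sampling frequency

max_difference = max(abs(a - b) for a, b in zip(inside_band, outside_band))
print(f"largest difference between the two sample sets: {max_difference:.2e}")
```

This is why the signal has to be band-limited before it is sampled.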

A good analogy to PCM is a movie track. The movie track is made of a succession of still pictures (at about 24 images/second). One can consider each of these pictures as a sample of the movie -- it’s like taking a still picture 24 times per second. When the movie is projected in a theater, what is projected is indeed the succession of the still pictures. The pictures’ change rate is high enough that we perceive the image as moving continuously in time. Indeed, the movie picture is an excellent example of a sampled system.

There is one major difference between the movie and PCM audio: The still picture we were describing is made of analog pictures in the sense that they are recorded on film. Imagine now that we are scanning those pictures into a computer. We tell the computer to use a certain resolution, say for instance 640 x 400 points. This means that the image will be cut into 640 columns and 400 rows, producing a large number of little squares (pixels), and the image will be constant within a given pixel. Think in these terms: We replaced the analog picture (without pixelization) with a digital picture with a given resolution (the pixel size).

So let us link this analogy to audio, and PCM in particular. We start with an analog signal.

First we start by taking "still pictures" of the signal at a high rate. This is the process of "sampling."

Then we "pixelize" each of these "still pictures." This is called quantification [quantization]: We choose the resolution of the signal (16, 20, or 24 bits) and associate each sample with the corresponding numerical value.


The number sequence given by the sampling and quantification of an audio signal as described above is the PCM representation of the given audio signal.

Benefits of PCM

Easy to understand when compared to more sophisticated modulations such as PWM, DSD, et cetera.
Natural way of expressing an audio signal.
Linear by nature.
Easy processing of PCM-coded signals.
A very large base of PCM-compatible digital audio gear in the field.

Limitations of PCM

Limited bandwidth (at least with 44.1kHz or 48kHz) for transient reproduction.
Not suitable for direct digital amplification.
Loss of phase information in the higher part of the spectrum on short signals.

In part two we’ll delve into the DVD as a storage device for SACD and DVD-Audio, and discuss Meridian Lossless Packing (MLP) and Direct Stream Digital (DSD).

...Jeff Fritz
jeff@soundstage.com

¹ A computed value which depends on the contents of a block of data and which is transmitted or stored along with the data in order to detect corruption of the data. The receiving system re-computes the checksum based upon the received data and compares this value with the one sent with the data. If the two values are the same, the receiver has some confidence that the data was received correctly.

The checksum may be 8 bits (modulo 256 sum), 16, 32, or some other size. It is computed by summing the bytes or words of the data block, ignoring overflow. The checksum may be negated so that the total of the data words plus the checksum is zero.
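The simple 8-bit checksum described in this note can be sketched in Python as below. The function names and the sample data are ours, and this illustrates only the generic checksum of the footnote; the CD format itself relies on a considerably stronger error-correcting code that can repair errors as well as detect them.

```python
def checksum8(data: bytes) -> int:
    """Negated modulo-256 sum, so that data bytes plus checksum total zero."""
    return (-sum(data)) % 256

def verify(data: bytes, checksum: int) -> bool:
    """Re-compute the sum on the receiving side and compare."""
    return (sum(data) + checksum) % 256 == 0

block = b"SoundStage"
stored_checksum = checksum8(block)

corrupted = b"SoundStagf"  # last byte changed by one

print("intact block passes:   ", verify(block, stored_checksum))
print("corrupted block passes:", verify(corrupted, stored_checksum))
```

A checksum this small will miss some errors (for example, two changes that cancel each other out), which is why the note says only that the receiver has "some confidence" in the data.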

