Vocabulary
- Absorption
- In absorption, wave motion is attenuated—usually by conversion into
heat. Usually the unabsorbed part of the original wave is reflected.
- AES
- Audio Engineering Society; an American organization of audio
engineers which standardizes audio‐related technology and provides a
common forum for experts in the field.
- AES10 or MADI
- Multichannel Audio Digital Interface; a unidirectional multichannel
digital audio transmission standard originated by the AES. MADI is
based on the FDDI (Fibre Distributed Data Interface) transmission
format, but usually uses coaxial cable instead of optical fibre.
Accommodates up to 56 channels and 24 bits per sample. Used for
point‐to‐point multichannel digital audio connections in studio and
broadcasting environments.
- AES/EBU digital audio bus (AES3)
- A digital sound transmission standard which is based on a synchronous,
self‐clocking RS‐422A compatible physical layer on top of which stereo
digital audio and associated data (called sub‐channel data) are
transmitted. The standard has been strongly influenced by CD
technology, and is mainly used between digital studio equipment. The
standard specifies multiple sample rates (32 kHz to 48 kHz) and sample
bit depths (up to 24 bits per sample). Originally developed by AES,
later adopted by EBU, hence the name; the correct, official name for
the standard is AES3‐1985.
- Beating
- A phenomenon produced by the interference of two sinusoidal sounds
at sufficiently close frequencies; the audible sensation is that of a
single sound, periodically modulated in amplitude by a second one.
There are two reasons for beating: human hearing is inherently unable
to separate two frequencies that lie very close to each other, and
the sum of two such sinusoids is mathematically equivalent to a
sinusoidally amplitude‐modulated sine wave.
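The mathematical equivalence mentioned under Beating follows from the sum‐to‐product identity for sines. The following is a minimal sketch of it; the function names and the example frequencies are our own illustrative choices, not part of any standard.

```python
import math

def two_tone(f1, f2, t):
    """Plain sum of two unit-amplitude sines at f1 and f2 Hz."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def modulated_tone(f1, f2, t):
    """The same signal written as a slow envelope times a fast carrier."""
    envelope = 2 * math.cos(math.pi * (f1 - f2) * t)  # varies slowly
    carrier = math.sin(math.pi * (f1 + f2) * t)       # at the mean frequency
    return envelope * carrier

def beat_rate(f1, f2):
    """Loudness maxima per second: the envelope magnitude repeats at |f1 - f2|."""
    return abs(f1 - f2)
```

For instance, tones at 440 Hz and 443 Hz are heard as a single tone at about 441.5 Hz which swells three times per second.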
- Causality
- A property of systems which states that the system “does not laugh
before tickled”: nonzero output only results from nonzero input
and the output never occurs before the input. Causality is a necessary
condition for realizability—in the real world, systems simply do
not know their future history…
- Clipping/overdrive/saturation
- When circuits or transmission media are driven past the point of
their maximum input amplitude, they tend to limit the signal to its
maximum value. This can happen sharply (digital full scale) or softly
(the sigmoid type limiting action of analog tape) and results in the
effect of hard or soft limiting, respectively. Limiting produces
heavy sidebanding and, consequently, harsh and nonconsonant distortion.
Synonymous terms are overdrive (especially when speaking of amplifiers)
and saturation (taken from tube amplifier terminology).
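The difference between hard and soft limiting can be sketched with two simple transfer curves. This is only an illustration: the hyperbolic tangent is used here as a convenient stand‐in for a sigmoid type curve, not as a model of any particular tape or amplifier, and the function names are our own.

```python
import math

def hard_clip(x, limit=1.0):
    """Hard limiting: the signal is cut off abruptly at +/- limit."""
    return max(-limit, min(limit, x))

def soft_clip(x, limit=1.0):
    """Soft limiting: nearly linear for small inputs, levelling off
    smoothly toward +/- limit (tanh is an assumed example curve)."""
    return limit * math.tanh(x / limit)
```

Small signals pass through both curves almost unchanged; only the way the limit is approached differs.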
- CODEC
- Coder/decoder. When talking about data transmission, a coder/decoder
is a device or algorithm which works on a bidirectional data link,
coding transmitted and decoding received data. Audio codecs usually
use computer files, multimedia data streams or TV broadcast channels
for their data.
- Composite
- A loose term used in this tutorial to mean sound signals that originate
not from a single sound source but from several, notably instruments.
Examples include most recordings and natural sounds.
- DCC
- Digital Compact Cassette; an attempt by Philips at establishing a
digital successor to the common analog Compact Cassette (C‐cassette).
The format utilizes a specially shielded cartridge with the dimensions
of the traditional C‐cassette. The medium is similar as well, so DCC
players can play analog cassettes. Because of this (and because
recording on both sides of the tape would have been difficult), rotary
heads were not used. Although multiple parallel tracks are used, the
data rate of the tape is severely limited by the design, so data
compression has to be used. The method chosen by Philips is MPEG‐1
layer 1 audio coding, only under the name PASC. DCC is now officially
a dead format, killed by technical difficulties and heavy competition
from Sony’s MiniDisc.
- decibel
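- A logarithmic unit of ratios. Defined as ten times the base‐ten
logarithm of the measured power divided by a reference power. For
amplitude quantities such as voltage or sound pressure, whose square
is proportional to power, this becomes twenty times the logarithm of
the amplitude ratio. For different quantities measured and different
reference levels we get all sorts of decibel measures. The most
typical are dB SPL (RMS sound pressure/RMS pressure at the threshold
of hearing), dBV (RMS voltage/1 volt), dBu (RMS voltage/0.775 volts)
and dBW (power/1 watt).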
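As a rough sketch of the arithmetic, the two helpers below compute decibel values for power and amplitude ratios. The names are our own, and the reference constants assume the usual conventions that dBV is referenced to 1 V RMS and dBu to about 0.775 V RMS (the voltage that dissipates 1 mW in 600 ohms).

```python
import math

def power_db(power, reference):
    """Decibels for a ratio of powers: 10 * log10(P / Pref)."""
    return 10.0 * math.log10(power / reference)

def amplitude_db(amplitude, reference):
    """Decibels for a ratio of amplitudes (voltage, pressure):
    20 * log10(A / Aref), since power goes as amplitude squared."""
    return 20.0 * math.log10(amplitude / reference)

# Assumed reference levels for the common voltage measures.
DBV_REFERENCE_VOLTS = 1.0
DBU_REFERENCE_VOLTS = math.sqrt(0.001 * 600.0)  # about 0.775 V RMS
```

Doubling a power adds about 3 dB, while doubling a voltage adds about 6 dB, because the power then quadruples.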
- EBU
- European Broadcasting Union; an organization formed originally by
national radio stations in Europe. Specializes in broadcast audio
distribution technology. Current standardization efforts include
terrestrial digital radio, both for audio and various kinds of data.
- Harmonic (overtone)
- Given a signal, we can decompose it with the Fourier transform. Then,
a harmonic (of some particular frequency present in the transform)
is any frequency (also present in the analysis) which is at a whole
number ratio to our base tone. If the signal is periodic, every
partial present in the analysis is harmonic. The term implies an
underlying base tone to which the harmonic in question is related.
(Thus, one doesn’t say that components present in a composite signal
are necessarily harmonics, even though they may appear in integer
frequency ratios.)
- HRTF
- Head‐related transfer function; the transfer function of the system
resulting from the linear filtering action of placing the human
body (especially the head) in a sound field. The main components
arise from shoulder and ear lobe reflections and from diffraction
effects on sound travelling around the head. Also used to denote
the impulse response of such a system or any processing method
simulating such a system (the usage is quite fuzzy indeed).
- IMA
- International MIDI Association; a consortium formed as a place for
users of MIDI and related software to discuss their problems and
propositions. IMA keeps close contacts with MMA to relay user input
and suggestions to manufacturers.
- Interpolation
- Interpolation is a technique used to reconstruct waveforms from
discrete samples taken from them. Many different such techniques
exist, differing by their underlying mathematical structure. The most
common ones are based on fitting a polynomial of some degree to
the sample data. From this come the terms linear, quadratic, cubic
and so on. These refer to the degree of the polynomial that is fitted.
Interpolation methods are also named after the families of polynomials
used (Chebyshev, Legendre etc.) and their construction (NURBS).
Common to all these methods is that they strive for optimality in
some sense: most try to achieve smooth approximations
with the resulting curve passing through the data points. When used
to reconstruct acoustic waveforms from evenly spaced samples, polynomial
interpolation is never the optimal way. Instead, reconstruction by
approximations to perfect lowpass reconstruction filters should be
used.
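To make the contrast concrete, here is a small sketch comparing linear interpolation with a truncated, Hann‐windowed sinc kernel, which is one simple approximation to an ideal lowpass reconstruction filter. The function names, the window and the default kernel width are assumptions chosen for illustration only.

```python
import math

def linear_interpolate(samples, position):
    """First-degree polynomial through the two neighbouring samples."""
    i = int(math.floor(position))
    frac = position - i
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]

def sinc(x):
    """Normalized sinc, the impulse response of an ideal lowpass filter."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sinc_interpolate(samples, position, half_width=16):
    """Weighted sum of nearby samples using a Hann-windowed sinc kernel."""
    center = int(math.floor(position))
    total = 0.0
    for n in range(center - half_width + 1, center + half_width + 1):
        if 0 <= n < len(samples):
            d = position - n  # distance from sample n, |d| <= half_width
            window = 0.5 + 0.5 * math.cos(math.pi * d / half_width)
            total += samples[n] * sinc(d) * window
    return total
```

For a sampled sine wave evaluated halfway between two samples, the linear version lands noticeably below the true curve, while the windowed sinc stays much closer. The price is more arithmetic per output value.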
- Medium
- A scientific term used to denote the underlying substance or space
where waves travel. In the context of audio systems, air is usually
the medium, although compressible liquids and solids can also
transmit sound.
- MMA
- MIDI Manufacturers’ Association; a consortium formed to promote and
refine the MIDI specification and to guide in the implementation of
the standard. MMA extensions to the original MIDI specification
include MIDI time signaling and SDS.
- Modulation
- The variation of some characteristic of a signal or a parameter of
an algorithm producing the signal to achieve some specific goal.
Examples include amplitude modulation (AM, tremolo), the time‐variant
scaling of a signal, and frequency modulation (FM, vibrato), the
variation of the repetition rate of a (quasi‐)periodic signal.
- MP3
- MPEG‐1 or MPEG‐2 (sic!) layer 3 audio coding; a lossy, perceptual
audio coding format widely used for the transmission of stereophonic
sound, both in commercial and non‐commercial environments. Layer 3 is
the most sophisticated of the 3 layers specified for MPEG‐1 and MPEG‐2.
(They share the same audio bitstream formats, only the allowed bitrates
differ. Funnily enough, MPEG‐2 allows only three of the lower bitrates.)
The standard does not specify the codec, per se, only the bitstream
syntax. However, implementation issues have stabilized fairly well
by now. MP3 offers excellent audio quality for music and similar
sound encountered on soundtracks at relatively low bitrates (in the
range from 48 kbps to 192 kbps). It is not suitable for very low bitrate
speech coding, for which different methods exist. The acronym comes
from the common filename extension used for files of this content.
(FYI: Philips used MPEG‐1 layer 1 audio coding in DCC, only under
the name PASC.)
- MPEG
- Moving Picture Experts Group; a joint consortium of motion picture
engineers. Standardizes movie related material. Commonly known for
its MPEG‐1, MPEG‐2 and MPEG‐4 standards, which pertain to the digital
coding and transmission of moving picture and associated sound
(MPEG‐1 and MPEG‐2), and multimedia (MPEG‐4, in draft stage).
- Partial
- Given a signal, we can decompose it with the Fourier transform. Then,
a partial (of some particular frequency present in the transform)
is any frequency present in the analysis. The term implies an
underlying tone of which the partial in question is a part.
(Thus, one doesn’t say that components present in a composite signal
are necessarily partials.)
- PQ code(d)
- PQ refers to the first two (of the eight, named from P to W)
subchannel bits on CDs. These are used to carry auxiliary data, such
as track information, the table of contents (TOC), catalog numbers,
ISRC (International Standard Recording Code) information, de‐emphasis
status, SCMS copy propagation control and so on. The majority of
this information is carried over the Q channel, accumulated in 98‐bit
frames, whereas the P channel carries a simplistic code denoting
the starts and ends of CD tracks, lead‐in and lead‐out areas. Most
current CD players are sophisticated enough not to use the P channel
code at all, since all relevant information is also available
through the more sophisticated Q coding scheme. The addition of PQ
code is a major portion of the CD mastering process; masters bound
for the manufacturing plant are often simply referred to as being
“PQ coded” or “PQed”.
- RCA
- Radio Corporation of America; a long‐gone manufacturer of radio
and audio equipment in the US.
- A connector used to interface audio equipment at consumer line
level (nominally -10 dBV) in an unbalanced mode of transmission.
These take the form of an 8 mm wide ground plug/jack with a
concentric 2 mm plug/jack (opposite polarity) in the center
for signal and are usually colored white or black for the left
channel and red for the right. Originated by RCA the company.
Currently the most common way of interfacing consumer audio
equipment.
- SCMS
- Serial Copy Management System; a protocol used for restricting digital
copying of audio material in consumer applications. Based on sub‐channel
coding of generation identifiers and copy protection bits on digital
audio media, such as DATs and CDs. Only implemented in consumer mode
applications; pro mode applications ignore SCMS. AES/EBU in pro mode
cannot even convey SCMS information.
- Shift‐invariance or time‐invariance
- A property of (linear) systems which states that the response of the
system does not depend on the time an input is applied. A FIR or IIR
filter with constant coefficients is a prime example. A flanger or
a tremolo is a prime example to the contrary.
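The property can be probed numerically: delay the input and then process it, or process it and then delay the output, and compare the results. The sketch below does this for a small FIR filter and a tremolo. All names, coefficients and rates are made up for illustration, and a test with a single signal can only refute shift‐invariance, never prove it.

```python
import math

def fir_filter(signal, coeffs=(0.5, 0.3, 0.2)):
    """FIR filter with constant coefficients and zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out

def tremolo(signal, rate=0.05, depth=0.5):
    """Gain that depends on absolute time n (rate in cycles per sample)."""
    return [x * (1.0 + depth * math.sin(2 * math.pi * rate * n))
            for n, x in enumerate(signal)]

def delay(signal, shift):
    """Shift the signal later by `shift` samples, keeping the length."""
    return [0.0] * shift + list(signal[:len(signal) - shift])

def is_shift_invariant(system, signal, shift, tol=1e-12):
    """Compare system(delayed input) with delayed system(input)."""
    a = system(delay(signal, shift))
    b = delay(system(signal), shift)
    return all(abs(p - q) <= tol for p, q in zip(a, b))
```

With a short burst as the test signal, the FIR filter passes the check, whereas the tremolo fails it because its gain at the moment the burst arrives has changed.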
- SMDI
- SCSI Musical Data Interchange; a data interchange standard originated
in 1991 by Peavey Electronics. In the late 1980s and early 1990s,
samplers were coming into fashion and a standardized way to exchange
sample data was needed. As MIDI was quite old and extremely slow (MIDI
choke was a problem even then), it was seen that a new bus was needed.
As the SCSI (Small Computer System Interface) bus already existed
and had proven to be interoperable, SMDI leveraged the existing
technology. Nowadays SMDI can be used to convey all kinds of information
besides pure sample data and is invaluable whenever samplers need to
be integrated with the rest of the studio. As an added bonus, computer
connectivity and use of existing SCSI hard drives became possible.
- SDS
- Sample Dump Standard; standardized by the MIDI Manufacturers’ Association,
this protocol allows unified downloading of sample data to synthesizers
and samplers through the MIDI bus. Utilizes SysEx messages and offers
two separate modes: open loop and closed loop. Open loop corresponds
to the usual MIDI connection topology; in closed loop configuration
a separate return cable is used to provide feedback. SDS is extraordinarily
slow, even in the context of the MIDI physical layer. In addition,
operating SDS reliably is quite difficult (to use SDS in closed loop
mode, the physical cabling has to be changed, for instance) and so
the standard is not currently widely deployed in studio environments.
- Sidebands
- Frequency components added to a signal when it is put through a
suitable modulation process. AM and FM in particular produce sidebands.
The name
implies roughly symmetrical placing of the added components relative
to the original unmodulated signal.
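For amplitude modulation the symmetric placement is easy to verify numerically. The sketch below modulates a sine carrier and measures a few components with a directly computed DFT. The bin numbers, the modulation depth and the helper names are arbitrary illustrative choices.

```python
import cmath
import math

def dft_amplitude(signal, k):
    """Amplitude of the sinusoid at DFT bin k (valid for 0 < k < N/2)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / n

def am_signal(n, carrier_bin, mod_bin, depth):
    """Sine carrier whose amplitude is scaled by 1 + depth * cos(...)."""
    return [(1.0 + depth * math.cos(2 * math.pi * mod_bin * i / n))
            * math.sin(2 * math.pi * carrier_bin * i / n)
            for i in range(n)]

signal = am_signal(256, carrier_bin=40, mod_bin=5, depth=0.5)
carrier = dft_amplitude(signal, 40)    # the original component
lower = dft_amplitude(signal, 35)      # sideband at carrier - modulator
upper = dft_amplitude(signal, 45)      # sideband at carrier + modulator
elsewhere = dft_amplitude(signal, 50)  # a bin where nothing was added
```

The carrier keeps its amplitude, and each of the two sidebands appears at half the modulation depth, one modulator frequency below and one above the carrier.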
- SMPTE
- Society of Motion Picture and Television Engineers; an organization
of motion picture and television technology experts that standardizes
technical aspects of moving picture and related data (such as audio)
transmission and coding, such as frame rates, time codes and modulation
techniques. Responsible for the time code format of the same name
which is commonly used in broadcasting, film production and professional
audio applications as a common synchronisation standard to relate
pieces of audiovisual presentations together.
- S/PDIF or IEC‐958
- Sony/Philips Digital Interface; a consumer derivative of the AES/EBU
bus. Standardized by the International Electrotechnical Commission
under the name IEC‐958, but marketed as S/PDIF for consumer
applications. (Technically, these are two different standards but in
practice, they are almost identical. They interoperate perfectly.)
Uses simplified AES/EBU (consumer mode) and includes provisions
for copy management through SCMS. Used primarily for digital audio
transmission in consumer applications, such as CD players, DATs,
MD players, and DCC recorders. Applied on top of both electrical
and optical interfaces.
- SPL
- Sound Pressure Level; a measure of the effective (RMS) sound
pressure, and thereby of average sound power. It is defined on the
decibel scale relative to the threshold of hearing (nominally 20 µPa
RMS in air).
- Stability
- A property of linear systems stating that given any bounded input,
the system will produce a bounded output. Other ways of saying the
same thing are that the system function of a causal discrete‐time
system has no poles on or outside the unit circle, that the impulse
response approaches zero as time goes to infinity or that the system
does not exhibit self‐oscillation.
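For a causal, discrete‐time second‐order section the pole criterion can be checked directly. The sketch below assumes a transfer function whose denominator is written as 1 + a1 z^-1 + a2 z^-2. The function names and this coefficient convention are our own choices for illustration.

```python
import cmath

def biquad_poles(a1, a2):
    """Poles of H(z) = B(z) / (1 + a1 z^-1 + a2 z^-2),
    that is, the roots of z^2 + a1 z + a2."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return ((-a1 + disc) / 2.0, (-a1 - disc) / 2.0)

def is_stable(a1, a2):
    """True when both poles lie strictly inside the unit circle."""
    return all(abs(p) < 1.0 for p in biquad_poles(a1, a2))
```

A resonator with a1 = 0 and a2 = 0.5 has poles at a radius of about 0.71 and is stable; with a2 = 1.5 the poles move outside the unit circle and the output grows without bound.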
- Subband coding
- A generic term describing a method of coding signals where filter
banks are used to divide the signal into several (frequency) subbands
and their output is encoded instead of the signal itself. Decoding
is done with a reconstruction filter bank. Includes transform coding
as a subset.
- Subchannel coding
- The transmission of auxiliary data on CD data frame subchannel bits.
Includes PQ coding of track and SCMS data, as well as the additional
data oriented applications standardized as CD+G and CD+MIDI.
Later, the same coding was transferred to AES/EBU frames and DAT tape.
- Superposition
- The addition together of multiple signals.
- Mathematically, the superposition principle characterizes
linear systems. What it says is that, first, if we input two
signals to the system and add the respective outputs, we get
the same result as we would get by inputting the sum of the
original signals and observing the output (additivity).
Second, if we amplitude scale the signal by a constant and
observe the system output, the result is equal to inputting
the unscaled signal and only after that scaling with the
constant (homogeneity). All this put into a single formula
gives the superposition principle. It is usually applied
backwards when we already know the system obeys
linearity.
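Written out as a single formula, additivity and homogeneity combine as follows. The notation is our own assumption: T denotes the system, x_1 and x_2 are arbitrary input signals, and a and b are arbitrary constants.

```latex
T\{a\,x_1 + b\,x_2\} = a\,T\{x_1\} + b\,T\{x_2\}
```

Setting a = b = 1 gives additivity, and setting b = 0 gives homogeneity.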
- Transform coding
- A generic term describing a method of coding signals where a suitable
(linear) transformation (such as the FFT or the DCT) is used to break
up the signal before coding. Decoding is accomplished by the inverse
transform (or an approximation to it, if the coding process is lossy).
Specifically, the term often implies block coding: the signal is
first divided into blocks and the transformation is applied separately
to each of these. Transform coding is a special case of subband
coding. The idea is to concentrate the energy (variance) of the
signal into as few coefficients as possible, giving space‐efficient
representation and leading to compression.
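The energy concentration idea can be sketched with a directly computed DCT on one small block. This is an illustration only: the orthonormal DCT‐II used here, the block of eight samples and the helper names are assumptions, not the scheme of any particular codec.

```python
import math

def dct(block):
    """Orthonormal DCT-II of one block, computed directly."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(block))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def keep_largest(coeffs, count):
    """Zero all but the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]),
                   reverse=True)
    keep = set(order[:count])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]

block = [i / 8.0 for i in range(8)]  # a smooth ramp as example data
coeffs = dct(block)
approximation = idct(keep_largest(coeffs, 2))
```

For a smooth block like this ramp, two of the eight coefficients already carry nearly all of the energy, so a coder can spend its bits on those and describe the rest coarsely or not at all.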
- Vocoding
- The superimposition of the estimated varying short‐term spectral
envelope of a signal on another. Used as an effect to create
illusions of singing instruments and other spectral hybrids of
separate sound sources.