Sound is something most of us know and love. But although we hear sounds every day of our lives, there are many aspects of the experience that one usually doesn’t pay attention to. These are the basic questions about the nature of sound: why we hear anything at all, and the mechanics underlying the sensation. It is not obvious how sound is actually generated in physical objects, why different objects do not all sound alike, or how all this relates to our sensory apparatus. When we take a closer look at sound as a physical phenomenon, many of its interesting characteristics come into proper focus. The relevant topics include the wave characteristics of sound: diffraction, reflection, interference and so on. Taking into account the peculiarities of the human sensory organs and the psychology of perception in general, one is led into the field of psychoacoustics, the study of human auditory perception and its underlying mechanisms. This is a field rich in applications and interesting discoveries, not all of which seem intuitive at first sight.

As one example of such an application, consider perceptual sound codecs: how can an MP3 codec throw 90% of an audio signal away and still reproduce a perceptually near-perfect replica of the original?

Sound is a wave phenomenon. That is something we’re all told in high-school physics courses. But usually no time is left for a nice intuitive picture of the thing to build up. That’s what we will try to construct next.

To create waves we need a medium in which to put them. Obviously the
medium needs to be elastic in order to generate any waves at all.
Additionally there must be a kind of stiffness which holds the
adjacent parts of the medium together. This is what makes wave
propagation, a form of energy transfer, possible. It also determines
the kinds of vibration which can propagate in the medium: unlike
solids, gases and liquids lack transverse binding forces, so only
longitudinal energy transfer is possible. This is why sound cannot be
polarised. The other important intrinsic property of the medium is its
density; together, density and the strength of the binding forces
determine the speed of wave motion in the medium.
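This relationship can be made concrete. For a gas, the effective stiffness is the adiabatic bulk modulus K = γP (a standard ideal-gas result), and the wave speed is c = √(K/ρ). A minimal sketch, using textbook figures for air:

```python
import math

# Speed of sound from the medium's stiffness and density.
# For a gas the relevant stiffness is the adiabatic bulk modulus
# K = gamma * P (assuming ideal-gas behaviour).
gamma = 1.4          # adiabatic index of air (diatomic gas)
P = 101_325.0        # atmospheric pressure, Pa
rho = 1.204          # density of air at 20 degrees C, kg/m^3

K = gamma * P                # stiffness (bulk modulus), Pa
c = math.sqrt(K / rho)       # wave speed, m/s

print(f"speed of sound = {c:.0f} m/s")   # about 343 m/s
```

A stiffer medium (larger K) speeds the wave up; a denser one slows it down, which is why sound travels faster in warm air than in cold.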

Note that both density and the stiffness can vary within the substance: the medium can be inhomogeneous. In addition to this, the binding forces can be different depending on direction. In the latter case we talk about anisotropy, which fortunately does not occur in gases.

In the case of sound, the density is just the usual mass per unit volume of the gas, and the stiffness is born of a balance between the repulsive forces among molecules and their mean kinetic energy (temperature). In a closed system of moving particles the total kinetic energy stays constant, and statistical physics predicts that the mean distance between the mutually repellent particles will tend to even out. Similarly, the expected velocity (with direction) of the particles over an arbitrary volume will be zero: at large scales the gas tends to stay put even if the molecules themselves move quite a bit. Only after we upset the balance do the mean properties ripple. These mean properties are used as a stepping stone to the analytic model of a sound field, which simply forgets that the molecular level ever existed, assigns a real velocity vector and a real pressure value to each point in space and lets these vary over time. From Newton’s laws we then get a relatively simple partial differential equation which says that pressure gradients cause acceleration of the particles and that the velocity at a point causes the pressures around it to change.
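Written out, under the usual small-amplitude (linearising) assumptions, and with ρ₀ the rest density and c the speed of sound, those two statements become:

```latex
\rho_0 \frac{\partial \vec{v}}{\partial t} = -\nabla p,
\qquad
\frac{\partial p}{\partial t} = -\rho_0 c^2 \,\nabla \cdot \vec{v}.
```

The first equation says pressure gradients accelerate the medium; the second says that where the velocity field converges, pressure builds up.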

Often we make the further assumption that at each point, the velocity vector equals the gradient of the pressure field, though this way we lose part of the generality of the model.

To get a hold of the process of sound propagation, we first look at a
simple example: a point source. This is simply one point in
which we explicitly control the sound pressure or the velocity vector of
the field. In practice, neither of these can be controlled independently
of the other. When we excite the medium by creating a disturbance in its
structure, coupling between adjacent particles makes the disturbance try
to even out, globally. The forces arising from the new inhomogeneity
accelerate the particles toward lower pressure. But this, assuming the
pressurised region is point‐like, can only mean that the disturbance
moves outward. The crowded particles get pushed away from the centre of
the region. This makes them move and, consequently, pushes fresh
molecules out of the way. Voilà: motion. Once this has happened, the
pressure is evened out. It is worth pointing out that the pressure wave
does not get ironed out in the inward direction. This is due to the
inertia of the molecules—once the pressure sets them in motion, the
pressure moves in the direction the first molecules go in. What happens
is similar to slamming a pool ball against another. Since all this has
happened through what are almost 100% elastic collisions of particles,
little energy is lost. (Some is, to the mean kinetic energy of
particles, raising the gas temperature. This accounts for some of the
attenuation that sound experiences while travelling.) As long as no new
particles are brought into play the net effect is that of making the
pressurised region move outwards from the point source. Note that
individual particles do *not* move appreciable distances in the
action but stop after transferring their kinetic energy to the next one.
This is a characteristic of wave phenomena: seen at a larger scale,
energy moves, not the medium. Put another way, turning up the volume
knob does not produce a tropical storm.
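The outward march of a pressurised region can be watched numerically. The sketch below (a plain finite-difference discretisation of the 1D wave equation; the grid size, pulse shape and step count are arbitrary choices) starts with a single pressure bump in a medium at rest and shows it splitting into two pulses that travel away from the centre, exactly as described above:

```python
import numpy as np

# 1D wave equation p_tt = c^2 p_xx, explicit finite differences.
n = 200                 # number of grid cells
c, dx = 1.0, 1.0        # wave speed and grid spacing (arbitrary units)
dt = dx / c             # Courant number 1: the scheme is exact here
r2 = (c * dt / dx) ** 2

def laplacian(q):
    lap = np.zeros_like(q)
    lap[1:-1] = q[:-2] - 2.0 * q[1:-1] + q[2:]
    return lap

x = np.arange(n)
p_prev = np.exp(-0.05 * (x - n // 2) ** 2)       # initial pressure bump
p = p_prev + 0.5 * r2 * laplacian(p_prev)        # first step, zero velocity

for _ in range(59):                              # advance to t = 60 steps
    p_next = 2.0 * p - p_prev + r2 * laplacian(p)
    p_prev, p = p, p_next

# The single bump has split into two half-height pulses moving outward.
left_peak = int(np.argmax(p[: n // 2]))
right_peak = int(n // 2 + np.argmax(p[n // 2:]))
print(left_peak, right_peak)   # peaks about 60 cells either side of centre
```

Note that the bump does not spread back over the centre: the two halves keep marching outward, and the pressure between them returns to equilibrium, mirroring the inertia argument above.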

It is very important to distinguish between such concepts as the pressure field (scalar in 3D), the density field (scalar in 3D), the velocity field (3D vector in 3D), the gradient fields of the two scalar fields (3D vectors in 3D), the time-derivative fields of the scalar ones (scalars in 3D) and all the further fields derived from these. When bypassing the mathematical notation, it is exceptionally easy to confuse the first and second time derivatives of the pressure field with velocities and accelerations (which are taken with respect to spatial coordinates and are, hence, vector fields).

Above, we compressed the air at the source. But the same principle of
wave transmission applies if the opposite is done—namely, if we create
a depressurisation zone. In this case, the surrounding particles move in
and the zone moves outward again. Also, the amount of depressurisation
is significant: the more violent the original disturbance, the bigger
the propagating bump in the medium. Actually, aside from the energy of
the original defect getting spread over a larger area and thus growing
fainter and fainter per unit volume, one can accurately reconstruct a
series of more or less violent disturbances at a point by measuring the
local air pressure some distance removed from the sound source. This
is, roughly, how sound propagates through air in free space and is
experienced from afar.

A few things have to be noted about sound radiation. The first thing is
that we speak about pressures. The importance of this is seen when
thinking about a speaker cone that moves very slowly. In this case, the
air has time to escape from in front of the cone instead of forming a
high-pressure zone. Evidently, efficient radiation is not possible. We see
that in order to emit considerable radiation, rapid variations in pressure
or large radiators must be used. This is typical of wave emission—it
is why microwave radio transmission requires only small antennas while
low frequency AM radio often employs dipole antennas that are tens
of meters long. From this we come to the second point: in order to
continuously emit sound, we cannot just move the speaker cone further
and further ahead. Instead, the cone has to come back before the air has
time to escape around the edges. In physics, the situation is discussed
in terms of coupled systems and impedance matching. The principle is,
sound emitters work best when the inertia of the medium keeps the medium
from moving appreciably and on the other hand the emitter’s own inertia
isn’t large enough to make it hard to move. Back and forth motion is
the normal mode of wave transmission, not the impulses we have discussed
so far. A special case of such movement occurs when the motion repeats
at a constant rate and each cycle involves the same precise pattern of
movement. In this case we speak of periodic motion and
periodic sound/signals. The rate of repetition of a periodic motion is
dubbed the *frequency*, with hertz (Hz) as its unit. The hertz is the
SI unit meaning ‘times per second’. As each part of a wave travels
at a constant velocity and, at each fixed point in space, the vibratory
motion repeats at a constant rate, we see that one period of the motion
is always exactly duplicated in a certain interval of space that depends
only on the speed of the wave motion and the frequency of the wave we
drive through the medium. As we work in a single medium, the speed stays
constant, so the length of our interval depends only on the frequency
of our vibration. This is the *wavelength* corresponding to the
frequency, with an inverse dependency on it. Because of the properties
of what we will come to know as linear systems, a certain type of
periodic wave has a very special position in our treatment. This wave is
the sinusoid. It is the smooth, endless, periodic function
which we bump into in trigonometry. The sine wave has the property
that when put in a linear system (in our case, transmitted through air),
it comes through as a sine wave with the same frequency. The only
variation comes about in the form of a time lag and a change in
strength. When a combination of sine waves of different frequencies
is introduced, they go through as if the other waves weren’t even
present.
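This defining property of the sinusoid can be checked numerically. In the sketch below the ‘linear system’ is an arbitrary moving-average filter standing in for transmission through air (purely an illustrative choice); a least-squares fit confirms that the output is still a sinusoid of the input frequency, changed only in amplitude and phase:

```python
import numpy as np

# A sine fed through a linear time-invariant system comes out as a sine
# of the same frequency, changed only in amplitude and phase.
fs = 1000          # sample rate, Hz
f = 50             # input sine frequency, Hz
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * f * t)

kernel = np.ones(9) / 9            # moving average: linear, time-invariant
y = np.convolve(x, kernel, mode="same")

# Fit y (away from the edges) to A*sin + B*cos at the input frequency:
# a near-zero residual means the output is still a pure 50 Hz sinusoid.
s, co = np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)
sl = slice(20, -20)
A, B = np.linalg.lstsq(np.column_stack([s[sl], co[sl]]), y[sl], rcond=None)[0]
residual = np.max(np.abs(y[sl] - (A * s[sl] + B * co[sl])))
gain = np.hypot(A, B)
print(f"gain={gain:.3f}, residual={residual:.2e}")
```

The residual is at the level of floating-point noise: nothing but a scaled, shifted 50 Hz sine comes out, no new frequencies appear.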

In later parts of the text, when we talk about sound we usually mean pressure variations measured at a point. This is because our ears are small compared to the wavelengths of audible sound: with good accuracy we can say that ears are pointlike with regard to sound fields. Thus few humans ever fully grasp the real, complex vibrational patterns which occur in three-dimensional spaces; evolution has not equipped our brains for such analysis. This fact is a double-edged sword, really. It would be nice to actively understand all the phenomena involved in sound transmission, since all such things affect what we hear; on the other hand, the mathematical description and manipulation of wave phenomena in two or more dimensions quickly becomes quite unwieldy. It is quite a relief to scientists, engineers, technicians and artists alike that such considerations are not strictly necessary to fool our hearing.

When nonsinusoidal sources and/or a number of radiators and/or closed
spaces are considered, things get interesting. At once we note something
called *interference*. It is what happens when more than
one source is placed in the same space. At each point in space, the
individual contributions of our moving pressure zones (one for each
emitter) just add up. We get what is called linear wave
transmission. The name comes from mathematics and means, roughly, that
given a bunch of signals, we can first add and then feed through a system
or first feed through the system and then add, with equal results. To a
considerable degree, this is what happens with sound. In spite of its
rather technical connotations, linearity is a true friend. Without it,
there would be little hope of understanding anything about sound at an
undergraduate level.

Said in another way, at small to moderate amplitudes, sound transmission in large scale obeys a second order linear partial differential equation, called the wave equation, which is seen in all branches of physics and is covered early on in physics education. As is well known, once we know some solutions to a linear differential equation, we get more by scaling and summing.
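For reference, with p the pressure deviation and c the speed of sound, the wave equation and the superposition property just mentioned read:

```latex
\frac{\partial^2 p}{\partial t^2} = c^2 \nabla^2 p,
\qquad
\text{and if } p_1, p_2 \text{ are solutions, so is } a\,p_1 + b\,p_2
\text{ for any constants } a, b.
```

The second statement is all that linearity means here: solutions can be scaled and summed freely, and the result is again a valid sound field.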

Now, as periodic waves interfere, it is interesting to see what happens
in a single, fixed point in space as time evolves. Let’s suppose we have
a one‐dimensional string where a single sinusoidal sound source is present.
We know that the pressure in a single point reflects that of the source
at any place, save the time lag it takes the vibratory motion to reach
our point and the attenuation resulting from friction and other damping
forces. If we now add a second source with an identical frequency but
a different placing on the string, we get *standing waves*. How
does this happen? Think about the peak of one period of the motion. As
it leaves the two sources, it travels at a constant velocity away from
them. Precisely at the middle, the two waves meet and we let them
interfere; they add together. The same applies for the valley parts of
the wave. So in the middle, we get twice the amplitude. We say the two
sounds are *in phase* with each other. Let’s take another point,
this time choosing it so that the time to get from source 1 to the point
is greater than the time to get there from source 2 by precisely half
a period; that is, the difference between the distances to the two
sources is a whole number of wavelengths plus half a wavelength. This time, the
sinusoids always arrive at our point precisely when they cancel each
other out. So in this point, we never observe vibratory motion. Points
of these two kinds occur repeatedly over the entire length of our string,
with the amplitude of the sinusoidal motion varying between them from zero
to double the source amplitude.
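The node/antinode reasoning above can be condensed into a few lines of Python. Treating each source as a phasor whose phase is set by the distance travelled (a standard device, with source positions chosen arbitrarily), the midpoint shows the doubled amplitude and a point whose distance difference is half a wavelength shows complete cancellation:

```python
import numpy as np

# Two in-phase sinusoidal sources on a line.  The steady-state amplitude
# at a point is the magnitude of the phasor sum exp(i*k*d1) + exp(i*k*d2),
# where d1, d2 are the distances to the sources and k = 2*pi/wavelength.
wavelength = 2.0
k = 2 * np.pi / wavelength
x1, x2 = 0.0, 10.0                      # source positions (arbitrary)

def amplitude(x):
    d1, d2 = abs(x - x1), abs(x - x2)
    return abs(np.exp(1j * k * d1) + np.exp(1j * k * d2))

mid = amplitude(5.0)     # equal distances: waves in phase, amplitude doubles
node = amplitude(5.5)    # distance difference 1.0 = half a wavelength: silence
print(f"midpoint {mid:.3f}, node {node:.3f}")
```

Sliding the observation point along the line, the amplitude sweeps smoothly between these two extremes, tracing out the standing-wave pattern.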

The last example was very simple, as only one-dimensional effects were considered. In two dimensions we get a nice interference pattern, where our special points recur wherever the difference of the distances to the sources is, again, a whole multiple of half the wavelength.

We remember from high-school geometry that, given two fixed points, the curve traced by the points whose distances from the two fixed points differ by a constant is a hyperbola. So the knots and humps of our interference pattern on a plane occur on hyperbolas with the point sound sources as foci, their spacing determined by the wavelength of the sound. The same deduction goes for the 3D case, only the sound field is quite a lot more difficult to visualise. We get, logically enough, hyperboloids. (To see this, put a line through the two point sources, rotate a plane about this line and repeat the two-dimensional reasoning on each such plane.)
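A quick numerical check of the geometry (the particular foci and branch below are arbitrary choices): along one branch of a hyperbola with the sources at its foci, the difference of the distances to the foci stays constant, so every point of the branch satisfies the same interference condition:

```python
import numpy as np

# Nodes and humps of a two-source interference pattern lie on hyperbolas
# with the sources as foci.  Parametrise one branch of the hyperbola
# x^2/a^2 - y^2/b^2 = 1 (foci at (+/-c0, 0), c0^2 = a^2 + b^2) and check
# that |d1 - d2| is constant along it.
c0 = 3.0                       # half the source separation (focal distance)
a = 1.25                       # half the constant distance difference
b = np.sqrt(c0**2 - a**2)

t = np.linspace(-2.0, 2.0, 101)
x, y = a * np.cosh(t), b * np.sinh(t)   # standard hyperbola parametrisation

d1 = np.hypot(x - c0, y)       # distance to source at (+c0, 0)
d2 = np.hypot(x + c0, y)       # distance to source at (-c0, 0)
diff = np.abs(d1 - d2)
print(diff.min(), diff.max())  # both equal 2*a = 2.5
```

With a wavelength of 1, a constant difference of 2.5 wavelengths corresponds to 2 wavelengths plus one half: every point on this branch is a node.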

One should note that when different frequencies are combined, the result
is more complex, since now we cannot combine the resultant vibration
pointwise into a single sinusoid. But keeping to two, close frequencies,
we get an interesting phenomenon called *beating*. When two
frequencies that are close to each other are combined, we get, not an
audible combination of the two, but the frequency in the middle of the
two, varying sinusoidally in amplitude at the rate of the difference
between the two original frequencies. This is seen as follows. Suppose
we have two sine waves with frequencies