FAQ : MONITORS DEMYSTIFIED [Archive] - Blackout Audio Techno Forums

DJZeMig_L

02-01-2004, 10:15 PM

Part 1: Compromise and Approximation

Monitor speakers affect nearly all the decisions we make when recording and mixing — yet most of us know very little about how they are designed, and why they sound the way they do. In the first of a new series, Phil Ward explains what goes into the design of typical passive nearfields, and the effects they can have on what we record.

If you stop to think about it — though few people ever do — monitors are a fantastically important part of our studios. You use these lumps of chipboard, plastic, paper and glue on every recording you make, and they colour your sound more fundamentally than pretty much anything else. Perhaps the lack of consideration they are normally given is because monitors are so simple to use. They rarely need adjustment, have no software, no internal memory, no processor, no user manual (well, at least, not one you can find), never need an update and probably haven't even got a power socket.

Once a month or so you'll probably see, and no doubt even read, a monitor review in this very magazine. Typically, the reviewer will describe the design of the product, perhaps write a little about the engineering propaganda being disseminated by the manufacturer and then move on to describe some of the subjective audible characteristics that the monitor 'imprints' on recordings.

This imprinting process is analogous to another situation involving a piece of equipment many of us have in our studios, which is also called a monitor, rather confusingly; the screen on your computer. If you do any graphic design and/or computer-based illustration, you'll know how easy it is to be caught out by the colour distortions that screens add to everything you draw. It's especially frustrating if you also use a printer. The screen and printer colours are very unlikely to match and neither will be nominally 'accurate'. However, in the case of colour-matching, there are calibration procedures, software patches and set-up routines that, if you have the time and patience, can be used to minimise the difference between input and output.

There's nothing of this kind with monitor speakers, however, even though the basic problem — matching an input signal with an output — is similar. So why are speakers different? How is it that a 'simple' pair of speakers can imprint such audible characteristics on music? What are the mechanisms at play, and given the existence of these mechanisms, how do speaker designers decide how their products should sound? Do the specifications that manufacturers publish have any value? And perhaps most importantly, is it possible to be smarter in our choice and use of monitors so that we can minimise, or at least better understand, the contribution they make to the sound of our recordings? Well, yes it is, and with only the analysis of a typical, smallish nearfield monitor for a safety net (and no, I'm not going to tell you which one it is, that wouldn't be fair), I'm going to make an attempt over the next couple of issues to shed a little light on the dark, mysterious world of monitor speakers and what horrible things they might be doing to your lovingly crafted music.

The Art Of Compromise

It's no fun being a speaker designer — your entire professional career is devoted to making the least awful sounds you can out of beautiful ones. Give speaker designers the perfectly recorded sound of a Stradivarius, a Martin, a Steinway or a voice and deep inside they'll know the very best they can do is compromise and approximate (slice a speaker designer and, like Blackpool rock, you'd probably find those two words printed right through the middle). The compromises start as soon as a new product is conceived and they arrive from three directions: first, the inconvenient fact that our ears are cleverer at listening than speakers are at speaking; second, the more inconvenient fact that users tend not to have unlimited space or cash; and third, the even more inconvenient fact that, as yet, nobody has developed a technique for telling the laws of physics who's boss. However, because they are the fundamental villains of the piece, I'll start this month with the laws of physics.

Relatively speaking, music is a wide-bandwidth beast with a wide dynamic range. A speaker is limited in both bandwidth and dynamic range (the latter perhaps to a far greater degree than many realise). When speaker designers try to widen both the bandwidth and the dynamic range of their products, the laws of physics exact a heavy penalty and the listener hears it happening. Following is an explanation of the classic low-frequency bandwidth versus dynamics trade-off, illustrated by a few acoustic measurements from the typical nearfield monitor I borrowed to write this piece.

Pass The Port

Like many speakers we can all name (there are pictures of several dotted around this piece in case your memory needs jogging), our No-Name Acoustics nearfield comprises a couple of drive units mounted on the front surface of an 'airtight' box. The box is there to suppress the output from the rear of the bass driver so that, at low frequencies, it doesn't cancel the output from the front. In addition to this (and providing somewhere for you to put your soft-toy studio mascots), the box also fundamentally defines the low-frequency limit for the system, as the 'stiffness' of the air inside makes it harder and harder for the drive unit cone to move as the output frequency falls (the cone displacement required to generate constant acoustic power increases fourfold for every octave you move down the frequency spectrum).
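To put rough numbers on how quickly cone movement grows as frequency falls, here is a minimal sketch. It assumes the bass driver behaves as a rigid piston in a large baffle, radiating into half-space at wavelengths much longer than the cone is wide. The function name, the cone area and the 90dB target are illustrative assumptions, not measurements of the No-Name.

```python
import math

AIR_DENSITY = 1.2  # kg/m^3, approximate value for air at room temperature


def peak_excursion_m(freq_hz, spl_db, cone_area_m2, distance_m=1.0):
    """Peak cone excursion needed for a given SPL (rigid baffled piston assumed)."""
    p_rms = 20e-6 * 10 ** (spl_db / 20)   # sound pressure in pascals
    p_peak = math.sqrt(2) * p_rms
    omega = 2 * math.pi * freq_hz
    # Simple source in half-space: p = rho * S * x * omega^2 / (2 * pi * r)
    return p_peak * 2 * math.pi * distance_m / (AIR_DENSITY * cone_area_m2 * omega ** 2)


# Hypothetical 165mm-class driver (cone area is an assumed figure)
for f in (160, 80, 40):
    x_mm = 1000 * peak_excursion_m(f, spl_db=90, cone_area_m2=0.0135)
    print(f"{f:>4} Hz: {x_mm:.2f} mm peak")
```

Each halving of frequency multiplies the required excursion by four, which is why small boxes run out of cone travel so quickly at the bottom end.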

The designers of the No-Name have chosen to reduce the bandwidth-limiting effect of the box by employing 'reflex loading' — a hole in the box extended by an internal tube in an arrangement often known as a 'port'. Reflex loading extends the low-frequency bandwidth of a speaker by adding a 'helper' resonance to the system. At really low frequencies, the 'slug' of air inside the tube simply pumps backwards and forwards out of phase in response to movement of the bass driver, and so contributes nothing to the acoustic output. But at the port's resonant frequency (which is defined by the cross-sectional area and length of the tube, and the volume of the box), the slug moves in phase with the driver, adding significant extra acoustic output and reducing the movement of the drive unit cone. Sounds like a free lunch, doesn't it? But, as ever, there's no such thing.
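The dependence of port tuning on tube area, tube length and box volume can be sketched with the textbook Helmholtz resonator formula. This is a rough estimate only: the end correction of 1.46 times the port radius is a commonly quoted approximation and an assumption here, and the box and port dimensions are invented for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C


def port_tuning_hz(box_volume_m3, port_radius_m, port_length_m, end_correction=1.46):
    """Helmholtz estimate of port resonance; end_correction is an assumed factor."""
    area = math.pi * port_radius_m ** 2
    # The moving slug of air is a little longer than the physical tube
    effective_length = port_length_m + end_correction * port_radius_m
    return SPEED_OF_SOUND / (2 * math.pi) * math.sqrt(
        area / (box_volume_m3 * effective_length)
    )


# Hypothetical 10-litre box with a 50mm diameter, 120mm long port
print(f"{port_tuning_hz(0.010, 0.025, 0.120):.1f} Hz")
```

A bigger box or a longer tube lowers the tuning; a wider tube raises it.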

Here's why not. Firstly, the extension of low-frequency bandwidth produced by reflex loading comes at the expense of dynamic accuracy, as the amplifier now not only has to control the movement of the drive unit cone but also the air in the tube. Secondly, reflex loading causes the system to display rapid change of phase with frequency, and if you express phase change as time you can see that a reflex-loaded speaker effectively adds a delay to low frequencies. That time delay can be expressed as a distance (ie. the speed of sound multiplied by time — more on this later) with the result that when speaker designers choose reflex loading they are also effectively choosing to move the bass player's fundamentals back three metres or so. And you thought kicking him was the only way to do that...

Thirdly, as soon as you increase the sound level to a point where the air passing through the port becomes turbulent (as any substance does eventually when put through a pipe at a certain speed — it's those dastardly laws of physics again), the air becomes a non-linear mess. At best, the port will make some odd farting noises, at worst, it effectively stops working at all. However, you can make the air continue to flow in a linear, non-turbulent fashion at higher sound levels by designing the exit surface of the port in a flared shape. Unsurprisingly, the manufacturers of the No-Name Nearfield have taken exactly this approach (see the 'Load Of Balls?' box on page 194 for another way of reducing port turbulence).

Fourthly, with or without generous flaring on entry and exit, a reflex port is fundamentally non-linear. The acoustic impedance (ie. resistance to movement) of the big wide world on the outside of the box is obviously not the same as that inside the box, so the flow dynamics as the air in the port rushes outward are not the same as when it rushes inwards. As a result, the average air pressure inside the box will drift away from the nominal atmospheric pressure outside, and the bass driver's voice coil will take on an average offset away from its nominal rest position. This offset can be relatively innocuous and result only in a slight increase in low-frequency harmonic distortion, but if the driver happens to have non-linearities in its magnet system that cause an offset in the same direction, the result can be completely disastrous as the cone suddenly 'locks out' at one end of its travel. I've seen this happen in a reflex-loaded bass guitar cabinet fitted with a very well-regarded American bass driver. It's quite an impressive trick, and boy does it quieten the bass player...

Grasping The Graphs

That gives you an idea of the theoretical compromises involved in reflex loading. Now let's see how they affect the No-Name Nearfield in practice, by examining its frequency response graphically.

Figure 1 shows the actual low-frequency response of the No-Name Nearfield -- the black line shows the sum of the output of the driver and the output of the reflex port from 20Hz to 1kHz, plotted as output level in dB against frequency in Hz. The designers have chosen to give the No-Name a slightly overdamped response, where serious roll-off commences below around 55Hz (a bass guitar's bottom E has its fundamental at around 41Hz). I wrote 'chosen', because tweaking the various parameters of box volume, port resonance, and drive unit allows an almost infinite number of different responses to be achieved; for example, responses that maximise bandwidth at the expense of response accuracy, time-domain behaviour and power handling at one end of the scale and, at the other end, responses that use the port rather more subtly -- more perhaps as an aid to power handling (by reducing the amount the cone moves, or excursion, as designers like to call it) than to bandwidth extension. The green overlay in Figure 1 shows how the low-frequency response of the No-Name would look if it were not a reflex system, but a portless, closed box. This perhaps isn't quite fair, because the parameters for the No-Name box and driver were presumably chosen in the knowledge that a port was going to be used, but the overlay does illustrate the bandwidth extension offered by the port (as a closed box, the No-Name is 3dB down at around 100Hz compared to its ported self), together with its characteristic fast roll-off.
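The 'fast roll-off' of a ported box compared with a closed one can be illustrated with idealised filters. The sketch below assumes a second-order Butterworth high-pass for the closed box and a fourth-order one for the reflex box. These are textbook stand-ins chosen for simplicity, not the No-Name's actual (slightly overdamped) alignment.

```python
import math


def highpass_level_db(freq_hz, cutoff_hz, order):
    """Level of an ideal Butterworth high-pass of the given order, in dB."""
    ratio = (freq_hz / cutoff_hz) ** order
    return 20 * math.log10(ratio / math.sqrt(1 + ratio ** 2))


def slope_db_per_octave(cutoff_hz, order):
    """Roll-off slope measured well below cutoff (between fc/16 and fc/8)."""
    return (highpass_level_db(cutoff_hz / 8, cutoff_hz, order)
            - highpass_level_db(cutoff_hz / 16, cutoff_hz, order))


for name, order in (("closed box (2nd order)", 2), ("reflex box (4th order)", 4)):
    print(f"{name}: {slope_db_per_octave(55.0, order):.1f} dB/octave")
```

In this idealised picture the port buys extension down to a lower cutoff, but below that point the output falls away twice as steeply as it would from a closed box.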

The black curve on Figure 2 shows the impulse response of the No-Name over the same frequency range — in other words, a plot of the excursion of the speaker cone against time when a fast click or impulse is put through it. If you're surprised that the movement of the cone is expressed on the left-hand axis in Volts, don't be — speakers work because a voltage put through a moving-coil driver creates movement, after all. The negative voltages simply indicate that the cone is behind its usual rest position (which is equal to 0V).

Now you know what the graph means, can you see how much difficulty the speaker has in stopping after the impulse is put through it? The characteristic frequency of the ringing overhang is equal to the resonant frequency of the reflex port. If a bass player, say, were to play a note at that frequency and stop instantaneously (we can all dream...) the speaker would add that resonant tail to his playing (in fact he wouldn't need to be playing a note at exactly the port resonance, just somewhere near it would do). The No-Name Nearfield actually has a pretty well-behaved and well-damped port, but in extreme circumstances, where a highly resonant port has been chosen, the port resonant frequency overhang can begin to affect the accuracy with which a speaker reproduces pitch. It can make the pitch of things sound slightly hazy, if not definitely out of tune. And you thought it was that bass player trying his fretless... Again, there's a green overlay in Figure 2 showing the time-domain behaviour of the No-Name with the port blocked up. Notice how much better the cone stops moving?

Figure 3 is a graph of delay in milliseconds plotted against frequency in Hertz. The black curve illustrates the added time delay of the No-Name at low frequencies (this is known as the 'group delay' and is phase change expressed as time, as explained earlier). Again, a green overlay shows what happens without the port. Comparing the black line with the green, and taking a specific example, you can see from the graph that a bottom E at around 41Hz with the port occurs over 10 milliseconds later than without it. Since the speed of sound is roughly 330 metres per second, this delay is equivalent to moving bottom E back over three metres (as mentioned earlier). However, the real-world specific audibility of such low-frequency time delay effects is a subject of much discussion and argument among speaker folk and hi-fi reviewers. I sit firmly on the fence, and I've included the group delay graph primarily to illustrate the complexity of the issue (and so that I could do that gag about the bass player's fundamentals). There are so many factors that influence the perception of low-frequency performance that it's pretty much impossible to identify one and be sure of its guilt. I think there's little doubt though that the choice of a low-frequency response that seriously distorts the dynamic, temporal and even pitch information present in the signal, all in the name of a little more bandwidth, is not the right one for a monitoring tool.
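For anyone who wants to see where a group delay figure comes from, here is a minimal sketch that turns phase change into time, and time into distance. It again assumes idealised Butterworth high-pass responses (second order for the closed box, fourth order for the reflex box) with a made-up 50Hz cutoff, so the numbers are illustrative and not those of the No-Name.

```python
import cmath
import math

SPEED_OF_SOUND = 330.0  # m/s, the round figure used in the text

# Normalised Butterworth denominators, highest power of s first (assumed models)
CLOSED_BOX = [1.0, math.sqrt(2), 1.0]
REFLEX_BOX = [1.0, 2.6131, 3.4142, 2.6131, 1.0]


def response(freq_hz, cutoff_hz, denominator):
    """Complex response of a high-pass filter: s^n divided by the polynomial."""
    s = 1j * freq_hz / cutoff_hz
    order = len(denominator) - 1
    den = sum(c * s ** (order - i) for i, c in enumerate(denominator))
    return s ** order / den


def group_delay_s(freq_hz, cutoff_hz, denominator, step_hz=0.01):
    """Group delay = minus the rate of change of phase with angular frequency."""
    upper = response(freq_hz + step_hz, cutoff_hz, denominator)
    lower = response(freq_hz - step_hz, cutoff_hz, denominator)
    phase_change = cmath.phase(upper / lower)  # tiny step, so no unwrapping needed
    return -phase_change / (2 * math.pi * 2 * step_hz)


def delay_to_distance_m(delay_s):
    """Distance sound travels in the given time."""
    return SPEED_OF_SOUND * delay_s


closed = group_delay_s(50.0, 50.0, CLOSED_BOX)
reflex = group_delay_s(50.0, 50.0, REFLEX_BOX)
extra = reflex - closed
print(f"extra delay {1000 * extra:.1f} ms, about {delay_to_distance_m(extra):.1f} m")
```

The last function is the whole of the bass player gag: 10 milliseconds at 330 metres per second is 3.3 metres.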

Perhaps the most interesting measurement on the low-frequency end of the No-Name Nearfield — especially in the context of a speaker's dynamic range failings — is illustrated in Figure 4. This graph of output level against frequency shows the differing frequency response at two different (but fairly low) drive levels. This time the green curve has nothing to do with the presence of the port or not — it simply represents the response of the ported No-Name at a higher drive level than the black curve. This shows the compression introduced predominantly by the port, as the two curves should be the same. The difference is only about 1dB from 30 to 60Hz but then the green curve's drive level wasn't particularly high either. At higher levels I would expect the port compression effect to become quickly more obvious. And if the No-Name didn't have a reasonably well-flared port exit I'd expect the port compression to become, if not specifically audible, then certainly a significant influence on the way compression was used at a mix.
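The comparison behind a plot like Figure 4 is simple arithmetic: measure at two drive levels, subtract the known increase in drive, and whatever is left over is compression. A minimal sketch follows, with invented readings. The function name and data are assumptions for illustration only.

```python
def compression_db(low_level_curve_db, high_level_curve_db, drive_increase_db):
    """Shortfall at each frequency point; negative values mean compression."""
    return [
        (high - low) - drive_increase_db
        for low, high in zip(low_level_curve_db, high_level_curve_db)
    ]


# Hypothetical readings at 40Hz, 80Hz and 160Hz, with drive raised by 10dB
low = [78, 84, 86]
high = [87, 94, 96]
print(compression_db(low, high, 10))
```

A perfectly linear speaker would give zero at every frequency, which is what 'the two curves should be the same' means.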

Now, you probably read that paragraph and felt cheated because I said it was interesting. Bear with me — here's why. The No-Name was lent to me by its owner (brave man), and one of his subjective feelings about the product is that it sounds a little 'bass-light' until it's being driven reasonably hard. Now that looks to be at odds with Figure 4, which shows the low-frequency level decreasing as the product is driven harder. So what's going on? Well, it's hard to say exactly without carrying out a lot more measurements, but based on my experience with other monitors, I am in a position to speculate why the readings I have taken seem at odds with the owner's carefully-formed, long-term opinion. Firstly, it's well known that, far from being flat, the frequency response of the human ear is level-dependent at both frequency extremes. Secondly, the actual situation is far more complex than can be revealed by one or two measurements. It's quite possible, for example, that after initially falling with increasing level, the response of the No-Name develops a resonant peak as its port starts to misbehave more seriously. Or perhaps the resonant low-frequency energy that the speaker begins to generate as other compression effects and non-linearities begin to unfold is perceived as the missing bass. Whatever the underlying mechanisms for the subjective feeling, measurements of the No-Name reveal that it has a pretty well-behaved and sensible low-frequency response. So, rather than drive the speakers into non-linear behaviour in order to feel they are working properly, it might be better to change their position in the room (nearer to rear wall or corner) so that they sound 'right' at lower levels, which produces more linear behaviour. In turn, this 'linear' solution is likelier to result in monitoring that more accurately reflects the recording and the mix.

Tips For The Ported

All of the above might read like a character assassination of any speaker with a reflex port, but that's really not my intention. Thoughtfully and carefully conceived reflex loading (and I'd include the No-Name in that category) can work well, but the technique by its very nature has characteristics that it pays to be aware of. So, if you have a pair, how best to work with reflex-loaded monitors? Maybe try a few of the following little experiments.

A Load Of Balls?
Further to the subject of avoiding turbulence in speaker ports, the recently introduced Nautilus hi-fi speaker range from B&W features a technology they call FlowPort. FlowPort is an array of small hemispherical dimples set into the flared exit surface of the reflex port. The argument proposed is that the dimples work by helping the airflow remain non-turbulent "in much the same way as the remarkably similar dimples on the surface of golf balls". I've absolutely no idea if this really works or is significant, but the engineer (or was it a marketing man, I wonder?) who came up with a technology that ties a range of deeply stylish, very male-oriented and high-value hi-fi speakers to the similarly upmarket, male-oriented sport of golf deserves some sort of gong. That's marketing genius, that is.

• Find out what frequency the port is tuned to. It'll be somewhere between 30Hz and 80Hz, and the smaller the speaker, the higher it's going to be. If the manufacturer publishes technical data including an impedance curve, the port frequency (for reasons I don't have space to go into here) is the frequency of the impedance minimum between the two low-frequency peaks. Figure 5 is the impedance curve for the dear old No-Name, which reveals its port frequency to be around 43Hz. If you can't get hold of an impedance curve, play a sine wave patch from a keyboard at a reasonable level and just look at the movement of the speaker cone as you sweep the note up an octave and a half from low B to E. The point at which both the cone moves the least and there's a healthy breeze blowing from the port is the tuning frequency. Once you know the port frequency, you can be pretty certain that if your monitors are going to misbehave, that's where they'll do it. It's useful to know the nearest musical note to the tuning frequency, too, because you can then be aware that any consistent problems associated with that note may well be a monitor artifact and can perhaps be safely ignored (so no more worrying why that F sharp always sounds all boomy and slightly out of tune...).

• Have a good critical listen to some simple low-frequency material recorded at different levels. Maybe record a bass guitar or keyboard piece especially for the purpose. Does the quality and character of the sound change alarmingly with level? If it does, you can bring this knowledge to bear when you record, or more likely when you mix.

• Put a sock in it -- literally. If you have a pair of passive reflex-loaded monitors you'll do no harm just having a listen to how they behave with the ports blocked. The measurements on the No-Name Nearfield illustrate the fundamental change that blocking the port can bring about on the low-frequency behaviour of the system. If you have a bass problem on a mix, blocking the port is as useful as trying a different monitor or even room -- perhaps more so, because you're only changing one variable. Don't, however, just listen to bass level -- there's bound to be less bass with the port blocked. Listen instead to how the bass character changes with level, and to how clearly you can identify the pitch of the bass notes in each case -- with port blocked and unblocked.
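If you do have impedance data in numerical form, the first tip above (find the port tuning, then the nearest musical note) can be sketched as follows. The impedance readings are invented, the function names are hypothetical, and the search simply assumes the data covers only the low-frequency region where the two peaks live.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def port_tuning_from_impedance(freqs_hz, impedance_ohms):
    """Frequency of the impedance dip between the two tallest peaks."""
    peaks = [
        i for i in range(1, len(impedance_ohms) - 1)
        if impedance_ohms[i - 1] < impedance_ohms[i] > impedance_ohms[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("need two low-frequency impedance peaks")
    # Keep the two tallest peaks, then put them back in frequency order
    first, second = sorted(sorted(peaks, key=lambda i: impedance_ohms[i])[-2:])
    dip = min(range(first, second + 1), key=lambda i: impedance_ohms[i])
    return freqs_hz[dip]


def nearest_note(freq_hz, a4_hz=440.0):
    """Nearest equal-tempered note name, with A4 tuned to a4_hz."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Hypothetical impedance readings (ohms) at a handful of frequencies (Hz)
freqs = [20, 30, 40, 50, 60, 70, 80]
ohms = [8, 30, 12, 7, 14, 28, 9]
tuning = port_tuning_from_impedance(freqs, ohms)
print(tuning, nearest_note(tuning))
```

Real data will be finer-grained and noisier than this, so a little smoothing before the peak search may be needed.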

Next month, we'll explore more deep and dark speaker mysteries, this time further up the bandwidth, including the complete tosh that is the typical 'frequency response' curve. Now, where's my anorak got to...?

Published in SOS October 2000

DJZeMig_L

02-01-2004, 10:17 PM

Part 2: Frequency Response Secrets

What do the manufacturer's frequency response figures tell you about your studio monitors? Less than you might think, as Phil Ward discovers...

Last month, I began this short series by describing and illustrating how the characteristics of your studio monitors can profoundly affect all that you record. Simply because it seemed logical to start at the bottom, Part 1 dealt with the low end of the audio spectrum. This month, still with the No-Name Acoustics Nearfield (our average anonymous studio monitor) as willing guinea-pig, I will attempt to make you aware of how your monitors behave across the rest of the audio frequency band.

High Anxiety

Perhaps the most obvious way in which a speaker imprints its character on music is through its frequency response in the mid to high frequency band -- say, 200Hz to 8kHz. This is the region where musical sounds carry the majority of their information and energy, and the region where the performance and integration of the woofer and tweeter are probably the dominant factors. Only probably, though, as there are other dark and mysterious factors to consider. I'll come to those later...

Conclusions are often drawn about the drive units and probable tonal qualities of a speaker from its measured frequency response, and at first glance the concept of frequency response is very simple. In the case of a consumer electronics product, a frequency response that is not flat simply modifies the tonal characteristics of the audio signal. (Applying EQ to a signal is nothing more than intentionally bending the frequency response.)

If only it were thus with speakers. Manufacturers' specifications of frequency response (you know the kind of thing -- "50Hz-20kHz ±3dB") are such a simplification of the real-life situation as to be almost meaningless. In fact, they're probably worse than meaningless, as their simplicity lulls us all into a false sense of understanding.

Frequency Facts

There are basically three factors that complicate the issue of interpreting a speaker's frequency response:

* Where? Where was the speaker when its response was measured, and where was the measurement microphone in relation to the speaker?

* When? Was the measured response a short snapshot or the integration (combining and averaging) of signals over some longer time?

* What's listening? How do we correlate a response measurement made via a microphone with the ears and brain of a real person? And remember, as I said in last month's instalment, our ears are cleverer at listening than speakers are at singing.

Location, Location, Location

'Where' is probably the most straightforward of these three issues, so I'll try to deal with that first. Manufacturers do occasionally supply some 'where' information with published frequency response specifications. They'll say something like, "microphone at 1m on tweeter axis". They might even specify the environment in which the measurement was made. So, just because we still have the No-Name ready and willing to be subjected to more investigation, let's measure its frequency response with a microphone at 1m on the tweeter axis -- which you can see in Figure 1. Looks pretty respectable (50Hz-20kHz ±3dB) in the context of a device made of wobbling bits of plastic screwed into a chipboard box.

But what happens if we move the microphone? The green overlay on Figure 1 is the frequency response with the microphone still at 1m distance, but moved 20 degrees downward with respect to the tweeter axis. The response is a mess, and two questions probably arise. Firstly, why does moving the microphone by around 35cm turn a nice flat response curve into a section through the Himalayas? And secondly, if you were listening to the speakers and moved your head a similar distance, how come you wouldn't be conscious of such gross tonal changes? After all, if you took a near 10dB chunk out of the signal at 2kHz with an equaliser, you'd hear it!

The answer to the first question is actually pretty straightforward (unlike the answer to the second). The chunk of response missing around 2kHz is simply caused by destructive interference between the output of the two drive units. Over the region of the speaker's bandwidth where both drive units contribute to the output (this overlap is illustrated in Figure 2), there will be points in space where the path lengths from each driver differ by odd multiples of half a wavelength, and, in a throwback to school physics 'ripple tank' experiments, silence breaks out. The second obvious change between the response at two different microphone positions -- the faster roll-off above 10kHz -- is caused by the directional characteristics of the tweeter. Any radiating diaphragm will begin to become directional as the wavelength of the radiated energy approaches the size of the diaphragm, and when we moved the microphone 'off axis' the tweeter directionality began to show. With phenomena such as interference and directivity at play, each different microphone position will have a unique frequency response, and the arbitrary choice of "microphone at 1m on tweeter axis" for a specification is just that -- arbitrary.
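The geometry of this cancellation is easy to sketch. The code below treats the two drivers as point sources one above the other and finds the frequencies at which their path lengths to the microphone differ by an odd number of half wavelengths. The 150mm driver spacing is an assumed figure, and the sketch deliberately ignores the phase shift and level differences introduced by the crossover, so it will not predict the exact notch frequency of a real speaker.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C


def path_difference_m(driver_spacing_m, distance_m, angle_down_deg):
    """Difference in path length from tweeter and woofer to the microphone."""
    angle = math.radians(angle_down_deg)
    # Tweeter at the origin, woofer directly below it, microphone below the axis
    mic_x = distance_m * math.cos(angle)
    mic_y = -distance_m * math.sin(angle)
    tweeter_path = distance_m
    woofer_path = math.hypot(mic_x, mic_y + driver_spacing_m)
    return abs(tweeter_path - woofer_path)


def null_frequencies_hz(path_diff_m, count=3):
    """Frequencies where the path difference is an odd number of half wavelengths."""
    return [(2 * k + 1) * SPEED_OF_SOUND / (2 * path_diff_m) for k in range(count)]


diff = path_difference_m(driver_spacing_m=0.15, distance_m=1.0, angle_down_deg=20)
print(f"path difference {100 * diff:.1f} cm")
print([round(f) for f in null_frequencies_hz(diff)])
```

Only the nulls that fall inside the band where both drivers are working will show up as holes in the response.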

Before I move on to answering the second question, Figure 2 and Figure 3 (the latter being a curve showing the No-Name's response 20 degrees off-axis horizontally) reveal something unusual, and perhaps significant, about the speaker's design. Figure 3 shows that the No-Name is a good performer horizontally off-axis. The expected roll-off in very high-frequency energy is present, but through the mid-range and driver overlap region the off-axis and on-axis curves are pretty close together. The reason for this good behaviour is the choice of a low (2kHz) crossover frequency between the two drivers -- 3kHz or above would be more usual -- and also the reasonably gentle crossover filter slopes. At 2kHz, firstly, the bass unit is still reasonably non-directional; and secondly, the wavelength (0.17m) is larger than the distance between the two drivers, which helps to reduce the severity of the cancellation effects mentioned earlier. It's not all good news, though, because a low crossover frequency and a low filter slope (electrically, the high-pass filter is 6dB/octave) are likely to put the tweeter under significant displacement and thermal stress. I would not be at all surprised to find that the No-Name displays relatively high levels of distortion around 1-3kHz, nor to hear that it has a reputation for tweeter failure.

Meanwhile, it's time to answer the second question: why don't we hear the gross off-axis response changes? The answer is as much associated with the 'When?' and 'What's listening?' questions as it is with the 'Where?' question, and it goes something like this. It's all too easy, when considering the mechanisms of hearing, to rely on the analogy of ear as 'microphone' and brain as 'recorder'. This analogy is not entirely without foundation (on the very basic level of eardrum and microphone diaphragm, for example), but once we start to consider how the brain interprets the ear's 'output', the situation is rather less clear-cut.

Perhaps the biggest intellectual hurdle to jump is understanding that the brain combines and averages over time the signals received from the ear. In the case of a loudspeaker in a room where reflections from walls, floor and ceiling ensure multiple paths from speaker to ear, the tonal balance perceived by a person in the room is made up from the integrated average of many different 'frequency responses', all arriving at different times. The integration 'window' is around 15ms wide, but varies with person and frequency. This is why we're not really aware of the gross response anomalies that are often revealed by a single frequency response measurement and, similarly, why we can perceive a speaker as highly coloured when its axial frequency response appears to be flat (take a bow, the majority of horn-loaded speakers). This psychoacoustic integration phenomenon has some significant implications for the design of monitors and how we use them. I'm sure you knew there was going to be a point to all this!

Practical Magic

Find out a little about the off-axis response of your monitors. Don't necessarily mistrust a published specification but treat it as the marketing material it almost certainly is. You can work out much of what you need to know simply by looking. If you have monitors with a large bass/mid driver (say, 200mm or more) and a high crossover frequency (above 3kHz) you can be pretty certain that, however flat the axial frequency response, the off-axis response won't be. Conversely if your monitors have a smaller bass/mid unit and a lower crossover frequency they'll in all probability be better behaved off-axis.
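The 'just by looking' check above can be turned into a back-of-envelope calculation. The sketch below uses the rule of thumb from earlier in this piece, that a diaphragm starts to become directional once the wavelength shrinks to about its own size, and flags a driver that is asked to work above that point. It is a crude indicator only: the threshold is an assumption, and the nominal driver diameter overstates the size of the moving cone.

```python
SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C


def directivity_onset_hz(diaphragm_diameter_m):
    """Frequency at which the wavelength equals the diaphragm diameter."""
    return SPEED_OF_SOUND / diaphragm_diameter_m


def likely_uneven_off_axis(driver_diameter_m, crossover_hz):
    """True if the bass/mid driver is already beaming below the crossover."""
    return crossover_hz > directivity_onset_hz(driver_diameter_m)


# Hypothetical examples: a 200mm driver crossed at 3kHz, a 130mm driver at 2kHz
print(likely_uneven_off_axis(0.200, 3000))
print(likely_uneven_off_axis(0.130, 2000))
```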

Does your room or monitor setup encourage strong early reflections? A monitor with poor horizontal off-axis performance, positioned relatively close to a side wall, is likely to sound coloured through the mid-range, because a large proportion of the sound you hear is actually the reflected off-axis response. Try changing the inward angle of the monitors -- aim them either towards a point well in front of the listening position or aim them both straight out into the room. The mid coloration might just be suppressed as you change the response shape of the side-wall reflections. Remember, though, that if you're no longer listening on the tweeter axis the overall balance might become a little less bright.

Don't ask me why, but many people seem to think the natural position for speakers is on the shorter wall firing down the room. Quite often, however, they'll work better on the long wall firing across the room, simply because the first side-wall reflection will then be less prominent at the listening position.

Could you suppress early side-wall reflections by the strategic application of some diffusing and/or absorptive material? It's pretty easy to work out the region of the side walls that will generate a strong reflection to your listening position, so a quick experiment with a folded duvet hung in the right place might show that a tonal problem you've been equalising for years is actually a monitor dispersion artifact. Careful, though -- over-damp the room and you'll end up mixing everything too bright and adding too much reverb.
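Working out where on the side wall to hang the duvet is a matter of mirror-image geometry: imagine the speaker reflected in the wall, draw a straight line from that image to your ears, and the point where the line crosses the wall is the reflection point. A minimal sketch follows, assuming a flat wall and a simple two-dimensional plan view. The positions and the function name are made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C


def side_wall_reflection(speaker_xy, listener_xy):
    """Reflection point along the wall and extra delay of the reflected path.

    The wall is the line x = 0. In each (x, y) pair, x is the distance from
    the wall and y is the position along it, both in metres.
    """
    sx, sy = speaker_xy
    lx, ly = listener_xy
    # Line from the mirror image of the speaker (-sx, sy) to the listener
    fraction = sx / (sx + lx)
    reflection_y = sy + fraction * (ly - sy)
    direct = math.hypot(lx - sx, ly - sy)
    reflected = math.hypot(lx + sx, ly - sy)
    delay_ms = 1000 * (reflected - direct) / SPEED_OF_SOUND
    return reflection_y, delay_ms


# Hypothetical layout: speaker and listener both 1m from the wall, 2m apart
point, delay = side_wall_reflection((1.0, 0.0), (1.0, 2.0))
print(f"treat the wall around y = {point:.2f} m; reflection arrives {delay:.1f} ms late")
```

A reflection arriving only a couple of milliseconds after the direct sound is one the ear is likely to blend into the perceived tonal balance, which is why treating that patch of wall can change the mid-range colour.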

I'd never advocate turning a typical pair of nearfield monitors on their sides (if that's how you use yours, you really will be fighting against dispersion quirks), but doing so is an interesting experiment, because it may help you understand how your monitor dispersion and listening-room characteristics influence the sounds that you record. Listen for changes in tonal quality and coloration through the mid-band with monitors horizontal and then vertical. Perhaps you could record a little spoken voice and use it as a monitor coloration test.

Delayed Reaction

A further significant frequency response 'When' issue is illustrated by Figure 4. This is a 'waterfall' plot of the No-Name monitor, from 200Hz to 12kHz, showing that it's not only bass players who suffer from 'extras' added by the monitors (see last month's instalment). A waterfall plot can be pictured as a series of frequency response 'slices' recorded in the few milliseconds after a speaker stops playing a wide-band noise signal. Time runs on the Z axis, where the curve at the back (at 0ms) is the steady-state frequency response of the speaker. The plot for the No-Name shows that there's still a healthy mid-range output 3ms or so after everything should have stopped. This delayed output is down to three primary causes:

1. Resonance effects in the bass/mid driver cone and surround. For as long as speaker designers have been plying their trade, the search has been on for the perfect material for a speaker cone (or diaphragm, as the better-educated prefer to call them). This material would cost the same as cardboard, have the density of air, and possess a stiffness tending towards infinity. Any speaker designer who found such a material would, of course, immediately stop designing speakers and make trillions in more profitable fields from such a remarkable discovery.

In the absence of this material, however, as frequency rises and a cone is asked to accelerate more and more rapidly, there comes a point where it stops moving as a single entity and enters an often resonant mode of behaviour known as 'break-up'. Break-up simply describes the behaviour of the cone above the level where its mechanical stiffness can withstand the forces of acceleration.

Two mechanisms come to the rescue of the speaker designer and allow a cone to work at frequencies above break-up. The first is the inherent self-damping of the material. Bells ring because they have little self-damping. Make a bell from a typical speaker-cone material and -- well, not too many folk are going to come to church. It's no coincidence that speaker designers attracted by the lowish density and high stiffness of aluminium find, once they have a working cone, that it demonstrates a variety of high-Q resonant modes (often referred to as bell modes) above its break-up frequency.

The second mechanism is the edge damping offered by the cone's surround (usually made of a natural rubber, but sometimes polyurethane foam or a PVC-derivative material). This, however, is notoriously hard to get exactly and consistently right, and it's often the case that a material which offers good damping properties above break-up is hopelessly stiff and sluggish at low frequencies.

2. Diffraction effects from the edges of the cabinet. (The edges behave as secondary acoustic sources, and as they're further from the microphone their energy arrives later.) It's an inescapable fact that speakers, even professional monitor speakers, haven't shaken off their furniture heritage. There's a long tradition and production infrastructure of cabinet-making in the design and manufacture of speakers, and this encourages rectilinear shapes with sharp edges. Trouble is, sharp and rectilinear are exactly the characteristics that encourage edge diffraction. Radiusing the edges can help, but to improve matters at anything other than very high frequencies the radii required are usually well beyond the capabilities of cost-effective woodwork. And before you go away and start designing a range of extravagantly curved speaker enclosures, be warned that you wouldn't be the first. It seems that folk like their speakers to be furniture-style, and companies that have gone the curved route have, by and large, found it to be a downward curve.

3. The cabinet walls themselves moving in response to the mechanical vibrations of the drivers. This phenomenon is a subject in itself, because a simple calculation comparing the total area of the cabinet walls to the radiating area of the drivers shows that the walls have over 40 times the area of the drivers. And it doesn't need a genius to appreciate that a surface 40 times the area of the bass/mid driver doesn't need to move much to generate significant acoustic energy. There seems to be a far greater emphasis on reducing the resonant contribution of the cabinet among hi-fi speaker designers than among those working in professional audio. Maybe pro-audio will catch up one day -- not that the efforts or techniques of hi-fi folk have often been particularly successful!
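The area comparison in cause 3 is easy to try for yourself. The sketch below uses made-up dimensions for a small two-way box (they are not the No-Name's, which the article doesn't give) and treats the cone as a flat piston, so take the result as indicative only.

```python
import math


def box_surface_area(width, height, depth):
    """Total outside area of a rectangular cabinet, in square metres."""
    return 2.0 * (width * height + width * depth + height * depth)


def piston_area(effective_diameter):
    """Radiating area of a cone treated as a flat circular piston."""
    return math.pi * (effective_diameter / 2.0) ** 2


# Hypothetical nearfield: 200 x 320 x 250mm cabinet, and a bass/mid driver
# with an effective radiating diameter of 100mm.
cabinet = box_surface_area(0.20, 0.32, 0.25)
cone = piston_area(0.10)
ratio = cabinet / cone

# Wall movement that would shift as much air as 1mm of cone travel, if every
# panel moved in unison (real panels don't, so this is a crude lower bound).
equivalent_wall_mm = 1.0 / ratio

print(f"Cabinet area {cabinet:.3f} m^2, cone area {cone:.4f} m^2")
print(f"Ratio {ratio:.0f}:1, equivalent wall movement {equivalent_wall_mm:.3f}mm")
```

With these assumed numbers the ratio comes out in the same region as the "over 40 times" quoted above, which is why a few hundredths of a millimetre of panel movement is worth worrying about.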

There are many different engineering approaches to solving problem number three. However, the two that really stand a chance of working -- very hi-tech enclosure materials or sophisticated mechanical isolation techniques -- are far too expensive to implement in the tight-margin world of small nearfield monitors. The job needs that same cheap, light, stiff material as the cone, really -- just as long as it will take a real wood veneer....
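To make the waterfall idea from the start of this section a little more concrete, here is a toy Python sketch of one common way such plots are computed: take an impulse response, chop off progressively more of its start, and measure what is left at each frequency. That method (often called cumulative spectral decay) is an assumption on my part, as is the invented impulse response with a single 1kHz resonance; the article doesn't say how Figure 4 was generated.

```python
import cmath
import math

FS = 48000  # sample rate in Hz (assumed)


def toy_impulse_response(n=480, f_res=1000.0, decay_ms=2.0):
    """Unit impulse plus one exponentially decaying resonance (invented data)."""
    tau = decay_ms / 1000.0
    h = [0.3 * math.exp(-i / FS / tau) * math.sin(2 * math.pi * f_res * i / FS)
         for i in range(n)]
    h[0] += 1.0
    return h


def level_db(h, start, freq):
    """Level in dB at one frequency, using only the samples from `start` on."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / FS)
              for i, x in enumerate(h[start:]))
    return 20.0 * math.log10(max(abs(acc), 1e-12))


def waterfall(h, freqs, slice_ms):
    """Map each slice time (ms) to a list of levels, one per frequency."""
    return {ms: [level_db(h, int(ms * FS / 1000), f) for f in freqs]
            for ms in slice_ms}


freqs = [1000.0, 5000.0]
wf = waterfall(toy_impulse_response(), freqs, [0, 1, 2, 3])
for ms, levels in wf.items():
    print(ms, "ms:", ", ".join(f"{f:.0f}Hz {lv:.1f}dB" for f, lv in zip(freqs, levels)))
```

In this toy case the 1kHz column fades slowly from slice to slice while the 5kHz column collapses as soon as the initial impulse is excluded, which is the ridge-along-the-time-axis pattern a resonance leaves on a real waterfall plot.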

Compression Point

There's just one further phenomenon that I want to cover before you lose all faith in the noise your monitors seem to think is the music you recorded. It's compression again. In Part 1, I described the mechanism of port compression in reflex-loaded monitors. This time it's the turn of wide-band compression, the cause of which is predominantly heat.

Around 99 percent of the power your amplifier delivers into your monitors is dissipated as heat, and as various components within the speakers warm up they begin to distort the frequency response. Figure 5 shows the result of subtracting a No-Name frequency response curve measured at 6V (4.5W into 8Ω) from one measured at 1V (0.125W into 8Ω). Ideally, Graph E would be a straight line at 0dB, but already the temperature rise in the bass unit's voice-coil has caused its electrical resistance to increase and the driver's output level to compress by half a dB. Higher drive levels can cause the voice-coil to reach 200-300°C and can very easily result in two or three dB of compression.
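The arithmetic behind those figures can be sketched in a few lines of Python. The power calculation is just V squared over R. The compression estimate is a deliberately crude model of my own choosing: it assumes a copper voice-coil (temperature coefficient of roughly 0.0039 per °C) and treats the driver as a pure resistance fed from a voltage source, which a real driver is not, so treat its output as a rough guide rather than a prediction.

```python
import math

ALPHA_COPPER = 0.0039  # per degree C, approximate; assumes a copper voice-coil


def power_watts(volts, ohms):
    """Power delivered into a purely resistive load."""
    return volts ** 2 / ohms


def compression_db(temp_rise_c, alpha=ALPHA_COPPER):
    """Level lost when coil resistance rises, for a purely resistive driver."""
    return 20.0 * math.log10(1.0 + alpha * temp_rise_c)


def temp_rise_for(db, alpha=ALPHA_COPPER):
    """Inverse of compression_db: the temperature rise implied by a given loss."""
    return (10.0 ** (db / 20.0) - 1.0) / alpha


low = power_watts(1.0, 8.0)
high = power_watts(6.0, 8.0)
step_db = 10.0 * math.log10(high / low)
rise = temp_rise_for(0.5)

print(f"{low}W and {high}W, a {step_db:.1f}dB difference in drive level")
print(f"Half a dB of compression implies a rise of roughly {rise:.0f} degrees C")
```

Under these assumptions the half a dB mentioned above corresponds to a fairly modest warming of the coil, which fits the point that compression is already visible at only a few watts.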

But simple, wide-band compression isn't the only mechanism at play. The crossover filter circuits in passive speakers are dependent for their response accuracy on the input impedance presented by the drive unit. As the voice-coil resistance increases with temperature, the effective filter response's shape can stray very far from that intended and can introduce all sorts of errors in the system response. These thermal compression effects are notoriously difficult to predict or tie down -- they are entirely signal and signal-history dependent, for a start. For example, if you drive a monitor hard from cold, thermal compression won't occur until the temperature has 'caught up'. Similarly, if you have been monitoring at high levels and suddenly turn everything down, the response distortions resulting from the high level will persist, as the voice-coils take a while to cool.
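To see why a warming voice-coil upsets a passive crossover, consider the simplest possible case: a single series inductor feeding the bass/mid driver, with the driver treated as a plain resistor. Real crossovers are usually higher-order and real drivers are not pure resistances, and the component values below are invented for the example, so this is only a sketch of the mechanism.

```python
import math

ALPHA_COPPER = 0.0039  # per degree C, approximate; assumes a copper voice-coil


def inductor_for_corner(r_load, f_corner):
    """Series inductance (henries) giving a -3dB point at f_corner."""
    return r_load / (2.0 * math.pi * f_corner)


def corner_frequency(r_load, inductance):
    """-3dB frequency of a first-order low-pass: series L into resistive load."""
    return r_load / (2.0 * math.pi * inductance)


def response_db(freq, r_load, inductance):
    """Level across the load relative to the filter input."""
    x_l = 2.0 * math.pi * freq * inductance
    return 20.0 * math.log10(r_load / math.hypot(r_load, x_l))


# Hypothetical design: 6 ohm coil when cold, crossover intended at 3kHz.
r_cold = 6.0
inductance = inductor_for_corner(r_cold, 3000.0)

# The same coil after a 100 degree C rise.
r_hot = r_cold * (1.0 + ALPHA_COPPER * 100.0)

f_cold = corner_frequency(r_cold, inductance)
f_hot = corner_frequency(r_hot, inductance)
print(f"Corner moves from {f_cold:.0f}Hz to {f_hot:.0f}Hz")
print(f"Filter loss at 3kHz: cold {response_db(3000.0, r_cold, inductance):.2f}dB, "
      f"hot {response_db(3000.0, r_hot, inductance):.2f}dB")
```

In this simplified circuit the corner frequency scales directly with the load resistance, so the hot driver is rolled off later than the designer intended, on top of the broadband level loss caused by the hotter coil itself.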

The ideal solution to these effects is, of course, for designers to do their job properly and engineer speakers such that thermal compression doesn't occur. If only it were that easy. In terms of adding manufacturing and component cost, minimising thermal compression is, after fixing the cabinet resonance, about as expensive a design aim as you can get. And, of course, compression is a hidden effect -- it happens without you being specifically aware of it but profoundly influences the way you mix. If memory serves me correctly, the issue of monitors wielding influence over the way you mix is where I came in...

Thanks to Phil Knight for his help in generating the measurements used in this article.

Published in SOS November 2000 Friday 2nd January 2004