Post by Martin John Butler on Feb 5, 2016 10:29:00 GMT -6
I rarely visit Gearslutz anymore, but I touch base there every once in a while if a manufacturer comments. Steven Slate was talking about blind tests of Slate's preamp emulations against the hardware. I replied with this, risking typical GS vitriol. It's lengthy, so grab some coffee, but I feel that more knowledgeable people than me have finally said what I struggled to express well previously.
Last month's Stereophile editorial page was about this very issue, blind testing. Obviously, Mr. Slate swears by them. "See, they can't tell a difference, in fact, they prefer the plug-in" could be his mantra. But unfortunately, there's more to the story than meets the eye, and ears in this case.
First, there's no accounting for an individual's abilities as a critical listener, even among experienced AEs.
Second, there's the "test" factor: some people respond negatively to testing itself, which elevates all sorts of things, like blood pressure, anxiety, insecurities, etc.
Third, IME, differences between things audio don't always reveal themselves immediately. It might take a few hours, days, or weeks, but once noticed, you can pinpoint the difference again and again. Like a small yellow egg stain on an otherwise clean tie, your eyes (or ears, in this case) go right to it.
This is such a contentious debate that the conclusion I always end up with is to trust my own ears, and if others feel differently, well, good luck to you, honestly.
This article from last month's Stereophile Op-Ed page is worth every minute it takes to read, as I couldn't have said it better myself. John Atkinson was an expert before Steven Slate wore diapers. I like Mr. Slate, I like and use many of his products, and I love that he's trying to bring innovation to the musician without deep pockets, but let's face it, he's 50% P.T. Barnum as well.
So here ya go:
To the Simple, Everything Appears Simple, by John Atkinson (Editor, Stereophile Magazine)
"My spirits sank as I read the comments on Stereophile's Facebook page. In the November issue, we had published reviews of UpTone Audio's USB Regen device by Kalman Rubinson, Michael Lavorgna, and myself. Michael and Kal had enthused about the positive effect the USB Regen had made, but I could detect no measurable difference. On Facebook, Dan Madden had written, "I think a device like this would need a blind listening test to verify that a listener could hear the difference in a statistically measurable way, in a very high percentage of times."
I have no argument with that statement. But then, Madden went on to say, "Have someone hook up this gizmo on YOUR system, and then have you listen to it with the same song 10 times with and without it connected randomly, and if you get the 'better sound with it' right 9 times out of 10 then I would be convinced that it makes a difference to the sound."
Sounds like a simple test, but designing a blind test that can be used to confirm or deny that a real but small audible difference exists is far from simple. In the formal statistical analysis of the test results, you can't prove a negative; you can conclude only that, under the circumstances of the test, no difference could be detected. By contrast, a statistically significant positive identification can be regarded as universal proof that a difference is detectable. But that analysis depends on the test examining just one variable—the difference being examined—and, as I have repeatedly discussed in this magazine, the blind-testing methodology itself can be an interfering variable in the test. The fact that the listener is in a different state of mind in a blind test than he or she would be when listening to music becomes a factor.
Rigorous blind testing, if it is to produce valid results, thus becomes a lengthy and time-consuming affair using listeners who are experienced and comfortable with the test procedure. Otherwise, the results of the test become randomized, hence meaningless.
In the words of famed mastering engineer Bob Katz: "There is no such thing as a 'casual' blind test. Blind tests are a serious business. Experimenters need training how to perform blind tests well. Blind tests can fail (produce statistically invalid results) if the experimenter neglected one critical detail. Weeks of intensive study are required to learn how to perform blind tests. Then weeks of preparation to create the test. Then weeks of testing to follow."
Some probably think it paradoxical for the editor of a magazine based primarily on the concept of judging audio components by listening to them under sighted conditions to be commenting on blind-testing methodology. However, since the very first blind listening test I took part in, in 1977, organized by the late James Moir for Hi-Fi News magazine, I have been involved in well over 100 such tests, as listener, proctor, or organizer. My opinion on their efficacy and how difficult it is to get valid results and not false negatives—ie, reporting that no difference could be heard when a small but real audible difference exists—has been formed as the result of that experience.
There is, in fact, a formal discipline devoted to the design of blind tests, based on recommendations formulated by the International Telecommunications Union in its document ITU-R BS1116-3 (footnote 1). Katz was summarizing the ITU guidelines and their consequences; the context for his comments was a workshop at the 139th Audio Engineering Society Convention (footnote 2), held last October in New York, on the audibility of possible improvements in sound quality made by recording and playing back audio with bit depths greater than the CD's 16 and sample rates higher than the CD's 44.1kHz.
This is a contentious subject. On the Stereophile website forum last summer, reader David Harper wrote, "Humans do not hear any difference between 16-bit/44.1kHz and any higher bit/sampling rate. This is established fact."
Harper was referring to a 2007 paper by E. Brad Meyer and David R. Moran that "proved" that there was no sonic advantage to high-resolution audio formats (footnote 3). Their conclusion ran counter to the experience of many recording engineers, academics, and audiophiles, but other than doubts over their methodology and the fact that their source material was of unknown provenance, Meyer and Moran's paper seemed to be the final formal word on the matter.
Until now. The AES workshop in which Bob Katz was taking part also featured presentations by legendary recording engineer George Massenburg (now a Professor at McGill University, in Montreal) and binaural recording specialist Bob Schulein. But it was the first presentation—by Joshua Reiss, of Queen Mary University, in London, and a member of the AES Board of Governors—that caught my attention.
Some 80 papers have now been published on high-resolution audio, about half of which included blind tests. The results of those tests, however, have been mixed, which would seem to confirm Meyer and Moran's findings. However, around 20 of the published tests included sufficient experimental detail and data to allow Dr. Reiss to perform a meta-analysis—literally, an analysis of the analyses (footnote 4). Reiss showed that, although the individual tests had mixed results, the overall result was that trained listeners could distinguish between hi-rez recordings and their CD equivalents under blind conditions, and to a high degree of statistical significance."
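A quick footnote of my own on the numbers in that article. Madden's "9 out of 10" criterion, and the way a meta-analysis can pull a significant result out of individually inconclusive studies, can both be sketched with a little standard-library Python. This is purely my own illustration, not from the editorial: the p-values in the second example are made up, and Fisher's method here is just an illustrative stand-in, not the (more sophisticated) procedure Reiss actually used.

```python
# Sketch of (1) the binomial significance of "9 correct out of 10" under
# pure guessing, and (2) Fisher's method for combining p-values from
# independent studies (an illustrative stand-in for a real meta-analysis).
import math

def binomial_p_value(successes: int, trials: int, chance: float = 0.5) -> float:
    """P(X >= successes) if every trial is a coin flip with P(correct) = chance."""
    return sum(
        math.comb(trials, k) * chance**k * (1 - chance)**(trials - k)
        for k in range(successes, trials + 1)
    )

def fisher_combined_p(p_values: list) -> float:
    """Fisher's method: -2 * sum(ln p_i) follows a chi-squared distribution
    with 2k degrees of freedom under the null. For even df = 2k the
    chi-squared survival function has the closed form used below."""
    stat = -2.0 * sum(math.log(p) for p in p_values)
    k = len(p_values)      # degrees of freedom = 2k
    half = stat / 2.0
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))

# Madden's test: 9 correct out of 10 when guessing would average 5.
print(binomial_p_value(9, 10))   # ~0.0107, i.e. roughly 1 chance in 93

# Three hypothetical studies, each p = 0.08: none significant at 0.05
# on its own, yet the combined result is (~0.019).
print(fisher_combined_p([0.08, 0.08, 0.08]))
```

This is the sense in which Reiss's result goes beyond Meyer and Moran: individual tests can each fall short of significance while the pooled evidence does not, provided the studies report enough detail to be combined.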