We are heading into the closing stretch of 2017 and the audio streaming market is finally inflecting. Collectively streaming services added significant subscribers and subscription revenue and the global revenue outlook for the music industry’s future has gone from bleak to hopeful.

As Spotify, Apple Music, and now Pandora continue battling for the limited pool of premium subscribers in the U.S. and Western markets, focus is shifting back towards ad-supported listening as growth in the premium subscriber pool slows. One might imagine that such investments mean the digital audio ad-tech space is about to go through a period of explosive growth. To better understand what is going on in the audio advertising space, we sat down with the architects of Saavn’s advertising product and strategy: VP of Ad-Platform Gaurav Kaushik (aka ‘GKs,’ pronounced “Jeeks”) and SVP of Special Projects Gavin Byrne. They both say they’re glad that the rest of the streaming ecosystem is finally catching up and focusing on creating high-quality brand solutions in the audio space.

So with streaming rapidly becoming the dominant mode of music consumption, how has your interaction with brands and advertisers changed in the last year?

I can table set Saavn’s macro view and GKs can speak more specifically on advertiser interactions over the last year. I think two important market evolutions over the last few years have led to many of the changes we are now seeing.

A historical timeline of both ad-supported free audio and possessed audio. While disruptive technologies are almost constant for possessed content, ad-supported free audio has technologically remained largely the same since the ’60s.

First, music and audio were overlooked as mediums in the evolution of digital advertising. We had IAB digital display advertising and then we had IAB video. (The Interactive Advertising Bureau is the standards organization for the digital advertising ecosystem.) Audio was overlooked. That’s primarily because music, at least since the invention of the LP, was an “owned” medium outside of terrestrial radio. Until consumers began converting over to streaming, brands effectively had to rely on terrestrial radio to connect with consumers through audio. This led to a delay in the development of the digital ad tech stack to support audio. While this means that audio isn’t currently as present in digital marketers’ minds as display, search, video, or social, it also means that audio doesn’t have all of the bad ad baggage that the rest of the digital media ecosystem does–all of the ad fraud and bad practices.

The second thing, which is a symptom of the first, is that because of the lack of quality streaming music inventory created by publishers, the collective media sciences–data, brain, and creative–have not had the same level of investment that search, social, display and video have had. That is rapidly changing with more research taking advantage of advanced brain scanning like fMRI. With those advanced brain imaging technologies, all of the major research universities (UC Berkeley, MIT, Harvard) have auditory cortex research underway.

 

Does that prompt you to put even more resources behind Saavn’s ad-supported users?

From day one, Saavn has been focused on being an ad-supported product. It’s what consumers in our markets require, so we knew from the beginning we had to invest in creating a high-quality and high-impact impression for brands, but one that would also resonate with consumers.

I think what you’re currently seeing is that many of the world’s other streaming services that may have been focused only on premium subscribers are now understanding that free ad-supported streaming is not just an on-boarding platform for premium subscribers. It’s actually a whole standalone business, if not their primary business.

 

What are the brands that you are working with asking for?

Data. Data, rightfully, has become the focal point both for Saavn and for almost every brand campaign in the last year. It’s used to shape the creative, features, and the call to action for every campaign now. With so much inefficiency in traditional media and even other digital mediums, being able to reach the right consumer, with the right creative, and with a high-impact premium impression is becoming more critical by the day. Data is the key to that. One profile of target consumers gets Creative A, with a call to action to set a reminder. The second set gets Creative B, with a click to call, tweet, post to Facebook, or download.

 

Are brands actually utilizing all of those actions in campaigns today?

Oh, completely. Whatever actions you can do on a phone, we can and will replicate those actions with Saavn’s Spot product. Saavn is mobile-first and our markets are mobile-first or mobile-only, so our consumers know those actions and expect to use them. The key again is the data, and that is a significant advantage that we have as a music streaming company.

 

Why would Saavn have an advantage as a music streaming company?

Our targeting data is unique, private, and most importantly three dimensional, not two.  The easiest way to understand the power of the data set is first, all a consumer has to do on Saavn is listen to the music they enjoy. They don’t need to expose any personal data to the world, yet the music a person listens to and the patterns they create–their audio DNA so to speak–is unique to them.  

Second, the experience of listening to music since the invention of headphones is often a deeply private and personal one. This is valuable compared to, let’s say, social, where the “you” that is presented is the one you want presented to the world. It is the marketing version of you. Once you put your headphones on and you hit play, that’s the real you. Take Gavin. He might look business-casual-death-metal on the outside, but when the headphones go on, he might be 100% K-pop.

Then the final amazing part is that our content data is three dimensional since music is the only repeat content medium. When you watch a movie, you generally watch it once. When you watch a TV show? Once. Video clip? Once. Newspaper article? Once. Blog post? Once. Book? Once. But when you find music that you like, you listen to it over and over and over again.

A close-up look at a single user’s Audio DNA and Audio Gene. Saavn believes the utility of its Audio DNA profiling system has significant scale. When combined with 1st- and 3rd-party sources, the potential utility becomes limitless.

So not only can we tell what content categories you like in the way that social or search can, but we can tell how much you like each individual piece of content, hence making the profiles three dimensional not just two.  

With Audio-DNA, Saavn can identify users with inclinations towards certain types of content and thus brand categories. For example, identifying a single user with an actual intent, i.e. to purchase a new entry-level vehicle, allows Saavn to find others with similar inclinations, creating an “Affinity Relative Group”. Custom Audio-Genes can be created for content recommendations or as targeting solutions for brands.

When you combine the three of these you have a powerful targeting solution–one that can identify a single user with a strong inclination towards certain types of messaging or product categories. You can then use that audio DNA to find closely related individuals who will most likely share similar taste profiles, and again the user has not had to put any of their personal information publicly at risk. When targeting cohorts with multiple campaign creatives, brands can very easily see which attributes are affecting performance.
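To make the "find closely related individuals" idea concrete, here is a minimal Python sketch. It is not Saavn's Audio DNA system, whose internals are not described here. It assumes each user is represented by hypothetical play counts per genre (so repeat listening carries weight, the "third dimension") and uses plain cosine similarity, a swapped-in textbook technique, to pick users near a seed user. The names `cosine_similarity`, `affinity_group`, the threshold, and the sample data are all assumptions made for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse play-count profiles (dicts)."""
    shared = set(a) & set(b)
    dot = sum(a[key] * b[key] for key in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty profile has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def affinity_group(seed, profiles, threshold=0.8):
    """Return the ids of users whose listening profile is close to the seed's."""
    seed_profile = profiles[seed]
    return sorted(
        user
        for user, profile in profiles.items()
        if user != seed
        and cosine_similarity(seed_profile, profile) >= threshold
    )


# Hypothetical data: genre -> number of plays (repeat listens count).
profiles = {
    "seed": {"bollywood": 40, "punjabi": 10},
    "user_a": {"bollywood": 35, "punjabi": 12},
    "user_b": {"k-pop": 50, "bollywood": 1},
}

print(affinity_group("seed", profiles))
```

Because the profiles hold only listening counts, nothing in this sketch needs a name, a post, or any other public personal information, which is the point made above.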

 

Aren’t brands reluctant to make multiple creatives due to cost concerns and can we get into the actual mechanics of the spot model?

That’s actually one of the many beautiful things about audio as a digital marketing medium: its low production costs. While video is extremely popular as a rich media medium, its costs can be factorially higher than audio. Additionally, once you have an audio production in place, incremental creative output costs are marginal. You need smart writing, good VO and sound design, but you can record one set of creative where the brand is speaking from the moon, telling you to tap and call to schedule a test drive, and in the same session record a concept where the brand is in ancient Rome and bundle that together in post-production. Good creative iteration is part of the power of the medium and platform. In fact, in order for Saavn to understand the edges and best practices of the creative iteration process, it has effectively built a full-blown creative agency internally, both as an R&D effort and to assist brands that may not have creative on the shelf to use on Saavn.

 

Wait. So Saavn has a creative agency?

Effectively, yes we do. Internally our production team is really our creative sciences group. Basically our vanguard… astronaut… audio… artists… exploring how to speak on a new canvas. Creative production and ad-tech are really two of our most valuable and talented teams, doing epic work pushing the edges with the medium and our spot unit. I mean, our goal is to make our spot a default creative line item as part of brands’ 360 campaigns globally and to help them understand the creative actions and options they have with the Saavn Spot canvas.

 

So why is the Saavn Spot such a high quality impression?

Genuinely (hand on chest),  I seriously believe we have created the highest quality impression on mobile. That’s demonstrated by our performance across thousands of campaigns. On average, our brands experience more than 4% brand engagement.

 

Those are ad clicks and taps?

Yes. But while we are proud of those performance numbers we don’t think that they fully capture the relative value of the unit. I mean, from the ground up we built the unit to be the highest quality impression. We wanted the unit to simply make sense for the medium, the brand, and most importantly, the consumers. We knew three years ago that consumer time on these devices was a precious resource in the app ecosystem. We knew that if we set out to achieve the goal of properly valuing that consumer time, we could build a quality ad product experience. And that is a very rare commodity in this world. We are actually working with Millward Brown and Nielsen to quantify that full impression value. A serious work in progress, no doubt. But the goal is to value our users’ time properly, and in a way that makes sense to them.  

 

You’re not trying to maximize audio impressions?

We didn’t see the point long term. You don’t value the user experience going that route. And revenue growth demands would eventually catch up with you and degrade the user experience, and I had just come from the core product team so building that model would basically be heresy. Not just that. It just doesn’t make sense for the brands either. “Ad blindness” is a real thing. You can degrade an ad unit’s impression value by not accounting for the consumer experience itself. Going down that road is no way to start to build a meaningful ecosystem–a system where brands and consumers can coexist and benefit from the terms. For consumers in the music streaming ecosystem, the commodity of value is time. Time taken away from listening to the music they love and the memories they have, or the ones they’re making right at that very moment. So we value every second of a listener’s time when we allow a brand to speak with them. We do this for both the brand’s benefit and the consumer’s. The impression is cleaner, free of consumer noise and clutter, and the listener’s time is properly valued. For brands, the commodity is speaking cleanly to their consumer. Digital is full of fraud, blind impressions, background calls, etc.–unwanted at best, misinterpreted by consumers at worst. Brands can have quality rich media messaging access without training users to move their finger or cursor to the lower right corner and wait. We have a better vehicle for this. The engine that powers this is just a simple ratio, a ratio between user music streaming time and the duration of brand messaging to that consumer. The current ratio is one to sixty, meaning a consumer is provided sixty seconds of ad-free music for every second a brand is allowed to speak with a listener during their session.

 

So users get more streaming time in proportion to the ad duration?

Yes. We have found that you get the optimal performance from the unit at about 12-15 seconds in duration, so the consumer gets about 12-15 minutes of uninterrupted music streaming at the current ratio. During that time, any time the app is viewed the brand has 100% share of the display layer creative. So consumers on our platform know when they hear a Saavn spot they have until they hear the next break to engage with what that brand is offering. That just makes sense. These are mobile devices. They are in our pocket, in our bag, or we are jogging, biking, or driving.  Listeners are generally doing something when they are listening to Saavn on mobile devices.  That’s what makes audio the medium for mobile advertising–it’s the multitasking medium in an ever present smartphone mobile multitasking world. Take a look around at a commuter in a major urban metro–the number of earbuds in while fingers are typing or tapping. Audio is always there, always on, and it’s the only medium capable of delivering a consistent quality experience on smartphones. We built a native ad experience around that thinking.   
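The arithmetic behind that ratio is simple enough to sketch in a few lines of Python. This is only an illustration of the one-to-sixty figure quoted in the interview, not Saavn's scheduling code. The function names `ad_free_seconds` and `ad_share_of_session` are hypothetical.

```python
# The one-to-sixty ratio quoted in the interview: sixty seconds of ad-free
# music for every second of brand messaging.
AD_FREE_SECONDS_PER_AD_SECOND = 60


def ad_free_seconds(spot_seconds, ratio=AD_FREE_SECONDS_PER_AD_SECOND):
    """Seconds of uninterrupted music earned by a spot of the given length."""
    if spot_seconds < 0:
        raise ValueError("spot_seconds must be non-negative")
    return spot_seconds * ratio


def ad_share_of_session(ratio=AD_FREE_SECONDS_PER_AD_SECOND):
    """Fraction of total listening time taken up by brand messaging."""
    return 1 / (1 + ratio)


for spot in (12, 15):
    minutes = ad_free_seconds(spot) / 60
    print(f"{spot}s spot -> {minutes:.0f} min of ad-free music")

print(f"ad share of session: {ad_share_of_session():.2%}")
```

Under these assumptions a 12-second spot buys 12 minutes of music and a 15-second spot buys 15, matching the figures above, and brand messaging takes up 1/61 of a session (about 1.6%) regardless of spot length.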

 

How does a native ad experience like that affect you in the big shift to programmatic audio?

The experience is created in how we deliver the creatives. When they are broken apart into commodity elements of audio and display, they are of far lesser value than when they act together as one–as an audio call to action to the display. Like GKs said, listeners are usually doing something when streaming audio and they need time to stop what they are doing, navigate to the app and engage with the brand offering. We are training our users to expect that consideration with our spot model. So the elemental pieces are still standard units that can be utilized programmatically. They just don’t perform nearly as well as the Saavn Spot.    

That, and our audio inventory is already available programmatically via our partner AdsWizz. We utilize DAAST with AdsWizz; it’s the IAB’s new audio ad standard.

 

Does that mean you’re DAAST-focused as a platform?

No, not at all. We are DAAST (Digital Audio Ad Serving Template), VAST (Video Ad Serving Template), and VPAID (Video Player-Ad Interface Definition) compliant. We focused on our partnership with AdsWizz mainly because we really believe in the medium and support a distinct layer in the ad sphere for just audio. So we adopted DAAST early. We were certainly one of the first few and I think it will evolve very rapidly. We don’t want to knock video, but as individuals we all experience the world on mobile devices. I think the fundamental understanding that’s coming is that smartphones are an audio-first medium. With that realization and the clean start with DAAST, in terms of current best practices and fraud reduction, I think you will see a dramatic shift in creative and budget focus. Our audio inventory will be made available to DSPs (Demand Side Platforms: the software used by marketers to buy advertising inventory on exchanges) using audio-VAST very shortly.

 

So a bunch of programmatic news coming soon?

You may hear some non-programmatic news first.

We launched on Alexa powered devices as part of Amazon’s launch of both Echo devices and the Alexa service in India.  We are big believers in voice as a platform for Saavn as an experience and as a platform for brands.  We will continue to develop for Alexa and other voice assistant platforms over the coming months.   

We also are continuing to improve our Brand Channel experience for brands that want a more consistent presence and a longer-term relationship with their consumers directly on Saavn.  We are finding that many brands would like to attach themselves to certain content genres, becoming a de facto curator or DJ of a music type.  So we are continuing to work on enhancements that help them do just that.

We also have a major enhancement that will improve the sponsorship experience of Original Programming audio show content, making that inventory available programmatically when it is not sponsored. We’re continuing to improve the ways we can utilize DAAST more creatively for brands and communication, with a more robust sequential feature and action set before our PMP (Private Marketplace – a private ad exchange where access to a publisher’s inventory can be controlled) launch.

…and a lot of solid targeting insights that will come out of a large project by our data science and engineering teams… yeah… a lot more coming for the whole audio space and for Saavn’s ecosystem specifically…I mean we are really only at the beginning phase of the mobile ad-supported audio streaming medium.

 

On the sciences part: earlier you mentioned that the research universities are adding quite a bit of understanding to how our brains interact with audio. Anything interesting of note?

It’s very cool stuff. You can see where it can go with deeper understanding. I might botch this up a little bit in paraphrased translation, but effectively when your brain sees something, it trusts it. You see a photograph of a dog–it’s a photograph of a dog. Duh, right? That’s what a whole region of our brain is dedicated to. It’s to deal with just imaging. Audio works very differently. The study suggested that hearing the word “dog” connects with far more regions of the brain. Probably because of the inherent versatility and evolutionary experience of spoken language, our brains pre-compile every variation and connection to the word when heard. It activates everything associated with a dog–a smell, an image, a sound, a touch. Those are not envisioned but activated through past and much more personal connections. It was something like that.

I think as those types of creative sciences are better understood by brands and agencies, you are going to see a lot more investment in the medium. There are various studies on music’s effect on moods, but I don’t think anyone needed a study to feel that. Music can access so many different emotional pathways throughout the brain. We can often find ourselves looping on the same songs, effectively emotionally stimming.

That’s actually a really cool topic that we have done some work on, but we should give that a whole other episode.