We treat audio in an old-fashioned way. It’s time to innovate!
HTML, the language of the web, is a markup language. It uses markers so a web browser can interpret how the text should be presented on a webpage. Titles, paragraphs and other parts of the text can therefore be restyled without changing the text itself, only its representation in HTML.
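A tiny illustration of that idea (the tags are standard HTML; the text is just an example):

```html
<!-- The content is identical in both lines; only the markers differ,
     so the browser presents the same words in two different ways. -->
<h1>Love + Radio</h1>  <!-- presented as a title -->
<p>Love + Radio</p>    <!-- presented as a paragraph -->
```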
If you’re reading this text on your smartphone, the sidebar of this page will be shown down below, and the header image will be resized to a smaller size. On an iPad the content might be sized differently and larger than on your smartphone. This is what we call responsive design: design that changes the way content is presented across different kinds of browsers and display sizes.
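In CSS this is a few lines of work. A minimal sketch, assuming hypothetical class names like `.sidebar` and `.header-image` (the `@media` rule itself is standard CSS):

```css
/* Default layout: sidebar next to the content. */
.sidebar { float: right; width: 30%; }

/* On narrow screens (e.g. a smartphone), move the sidebar
   below the content and shrink the header image. */
@media (max-width: 600px) {
  .sidebar { float: none; width: 100%; }
  .header-image { max-height: 120px; }
}
```

The content itself never changes; only its presentation does, which is exactly the flexibility audio is missing.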
We can show text and images in such a flexible way because we designed it that way. Text can be presented in any font, size and color. And although images are a little less flexible, we can resize them, crop them, layer them, animate them, put text over them, align them, make them transparent and so on. This way the combination of text and images can be presented on a webpage in millions of ways.
And what if we need to show text on Google Glass or on a watch? Text and images need to be fully flexible or else we would need different versions of the same content for all the different devices.
Audio on the web is often treated in the same way as video, represented in a large container, a box shown on the webpage. Take this fantastic Love + Radio podcast for example and listen to the first few minutes:
What I noticed by listening:
- an introduction of WBEZ Chicago Public Media
- an introduction of the L+R host Nick van der Kolk
- the voice of the person being interviewed: Andre Taylor
- some background music and nifty sound design
The way we tell stories through audio is somewhat related to the way we do it with text and images. Most blog posts, like the Love + Radio podcast, have an introductory text. We quote people in text. And images and background images are used to create some additional feel.
But the way we publish this kind of audio is in no way as flexible as how we publish text and images. We spend most of our time recording all the parts of the podcast, editing them together into a story and then uploading the whole thing as one big file. We cannot change anything after it’s published without going back to the source files, editing them together again and re-uploading the whole thing. It’s very time-consuming to work in such a fixed format.
What if we could disable the introduction of this audio file so we can write our own introduction for it? Or maybe even better: record our own voice as an introduction to the piece? What if we want to remove the background music, or lower its volume? What if we want to quote only a small part of it?
Online content consists of small pieces of data which we can represent in all sorts of different ways. We’re remixing that data all the time on millions of webpages. This is how the web works for text and images. And sure, working with audio is a lot more complicated, but we need to change the way we’re using it online. We need to break audio into smaller pieces of data, so we can glue them together in different ways. Sometimes with additional music. Sometimes with new introductions. Sometimes with still images shown in sync with the audio. Sometimes with moving images. As fluid data.
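To make this concrete, here is a sketch of what such a format could look like, assuming a hypothetical segment model (the filenames, IDs and the `remix` helper are invented for illustration; no such standard exists today):

```javascript
// A hypothetical data model: a podcast published as separate segments
// instead of one big file. Each segment can be swapped, muted or mixed.
const episode = [
  { id: "station-id", kind: "intro", src: "wbez-intro.mp3", enabled: true, gain: 1.0 },
  { id: "host-intro", kind: "intro", src: "nick-intro.mp3", enabled: true, gain: 1.0 },
  { id: "interview",  kind: "voice", src: "andre-taylor.mp3", enabled: true, gain: 1.0 },
  { id: "music-bed",  kind: "music", src: "bed.mp3", enabled: true, gain: 0.4 },
];

// "Remix" an episode by overriding properties of individual segments,
// without touching the original source files.
function remix(segments, overrides) {
  return segments.map(s => ({ ...s, ...(overrides[s.id] || {}) }));
}

const myVersion = remix(episode, {
  "station-id": { enabled: false },          // drop the station intro
  "host-intro": { src: "my-own-intro.mp3" }, // swap in our own voice
  "music-bed":  { gain: 0.1 },               // turn the music down
});

const playlist = myVersion.filter(s => s.enabled).map(s => s.src);
console.log(playlist);
// → ["my-own-intro.mp3", "andre-taylor.mp3", "bed.mp3"]
```

The point is not this particular code, but the shift it represents: the published episode becomes data to be recombined, the way HTML already treats text and images.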
(This text was also published on Medium with a few slight differences in wording.)