
Accessibility at ESF
Creating Accessible Audio Content


Transcripts are required for audio accessibility. For most media, such as recordings of teleconferences, you only need to obtain or create a transcript and post it to provide basic accessibility.

  • Example transcript for presentation
  • Example transcript for podcast
  • Use speech recognition software

    Using speech recognition software, such as Dragon NaturallySpeaking, might be a viable option if you have a lot of media with a single speaker, such as a regular podcast that is mostly you speaking. Such software requires "training" for a particular voice, so if your audio is interviews with different people, this won't work as well. Keep in mind that any software-only option will require some editing to correct mistakes.

  • Use audio-to-text service

    There are now many free and for-a-fee online services that will generate a transcript for you, including YouTube. These will almost always require a lot of editing to correct mistakes.

  • Type the transcript yourself

    Unless you are an excellent typist, doing it yourself is likely to be frustrating. It's probably worth paying someone else to do it. If you do it yourself, plan for the typing to take at least three times the length of the audio, e.g., half an hour for a 10-minute podcast.

    There is free software that can help by slowing down the audio playback and providing easy pause buttons, such as Express Scribe Transcription Playback Software.

  • Pay someone to make the transcript

    See the transcription services page.

Create high-quality audio

  • Use high-quality microphone(s) and recording software.
  • When feasible, record in a room that is isolated from all external sounds.
  • Avoid rooms with hard surfaces, such as tile or wood floors.

Use low background audio

When the main audio is a person speaking and you have background music, set the levels so people with hearing or cognitive disabilities can easily distinguish the speaking from the background.

Specifically, make the background sounds at least 20 decibels lower than the foreground speech content (with the exception of occasional sounds that last for only one or two seconds).
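As a rough way to check the 20-decibel guideline, you could compare the average (RMS) level of a speech-only passage with that of a background-only passage. The sketch below is illustrative only: the function names are hypothetical, and it assumes you have already extracted the two passages as lists of sample amplitudes at the same scale. A 20 dB difference corresponds to the speech having ten times the RMS amplitude of the background.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(speech, background):
    """Decibels by which the speech RMS level exceeds the background RMS level."""
    speech_rms = rms(speech)
    background_rms = rms(background)
    if background_rms == 0:
        return math.inf   # silent background: speech is infinitely louder
    if speech_rms == 0:
        return -math.inf  # silent "speech" passage
    # Amplitude ratios convert to decibels as 20 * log10(ratio).
    return 20 * math.log10(speech_rms / background_rms)


def meets_20_db_guideline(speech, background, required_db=20.0):
    """True if the background is at least required_db quieter than the speech."""
    return level_difference_db(speech, background) >= required_db


if __name__ == "__main__":
    # Synthetic stand-ins for real audio: the same tone at different amplitudes.
    tone = [math.sin(2 * math.pi * i / 50) for i in range(1000)]
    speech = [0.5 * s for s in tone]
    music = [0.04 * s for s in tone]
    print(round(level_difference_db(speech, music), 1), "dB")
    print("Meets guideline:", meets_20_db_guideline(speech, music))
```

A check like this only measures overall levels. It does not account for the brief one- or two-second sounds that the guideline exempts, so treat the result as a starting point and listen to the mix as well.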

Avoid sounds that can be distracting or irritating, such as some high pitches and repeating patterns.

More information is in Understanding Success Criterion 1.4.7: Low or No Background Audio (AAA).

Speak clearly and slowly

Speak clearly. This is important for people wanting to understand the content, and for captioners.

Speak as slowly as appropriate. This will enable listeners to understand better, and make the timing better for captions and sign language.

Give people time to process information

Pause between topics.

Use clear language

Avoid or explain jargon, acronyms, and idioms. For example, expressions such as "raising the bar" can be interpreted literally by some people with cognitive disabilities and can be confusing.

Provide redundancy for sensory characteristics

Make your information work for people who cannot see and/or cannot hear.

For example, instead of saying:

| Attach this to the green end.

say:

| Attach the small ring to the green end, which is the larger end.

More information that primarily addresses web pages, yet is relevant to audio and video, is in Understanding Success Criterion 1.3.3: Sensory Characteristics (A).