An introduction to video captions and subtitles

Video is one of the top sources of information for people all over the world, and statistics show that video consumption increases each year.

Whichever channels you use to distribute your video content, your goal is to get as much engagement as possible and make sure viewers get your message.

We won’t talk about the strategies used to shoot or edit the videos, but we want to highlight one of the techniques that often gets overlooked: using captions or subtitles.

Although the terms “caption” and “subtitle” are often used interchangeably, they each serve a slightly different purpose. Captions have traditionally been used to assist hearing-impaired viewers, whereas subtitles are generally used for translation.

They both appear on screen synchronized with the visual content and display dialogue and other audio information. Captions can also identify the speakers and describe sound effects (“knocking on the door”, “dog barking”, “laughter”, etc.).

More recently, with the rise of online video and social media, the term subtitle has grown in prominence over the term caption and is most commonly used for muted video on social media.

Why use captions/subtitles?

A large number of users depend on this feature:

people with hearing disabilities
people with cognitive and learning disabilities who need to both see and hear the content to better understand it
users who are not fluent in the language used in the video

Some additional benefits of captioning your video include:

viewers can follow the content on mute in silent environments, or in loud environments where the sound cannot be heard
children and adults can use captions to practise reading or to learn a second language
users are more likely to watch a video that autoplays on mute on social networks if it includes captions

It is worth noting that most modern browsers will only autoplay a video if it is muted, which is another reason to add captions to your videos.

Closed captions vs open captions

There are two ways to present captions to users: closed captions (CC) and open captions.

Closed captions come as a separate file alongside the video, and the player must support displaying them. They are not compatible with every media player or streaming platform, and they sometimes require the user to enable them.

Open captions are easy to use because they do not require any special player functionality: they are embedded directly into the video and need no additional action from the user.
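To make the difference concrete, here is a minimal Python sketch that builds two command lines for the open-source ffmpeg tool. ffmpeg is not mentioned in this article and is used purely as an illustration; the function names and file names are hypothetical, and the first command assumes an ffmpeg build that includes the subtitles filter.

```python
import shlex


def open_captions_command(video: str, srt: str, output: str) -> list[str]:
    """Burn the captions into the picture so every player shows them."""
    # The "subtitles" video filter draws the text onto each frame,
    # which means the video has to be re-encoded.
    # Assumption: simple file names with no characters that need filter escaping.
    return ["ffmpeg", "-i", video, "-vf", f"subtitles={srt}", output]


def closed_captions_command(video: str, srt: str, output: str) -> list[str]:
    """Add the captions as a separate text track the viewer can switch on."""
    # Audio and video are copied untouched; only the SRT is converted
    # to the text track format used inside MP4 files (mov_text).
    return ["ffmpeg", "-i", video, "-i", srt, "-c", "copy", "-c:s", "mov_text", output]


if __name__ == "__main__":
    # Print the commands rather than running them, so nothing is executed here.
    print(shlex.join(open_captions_command("input.mp4", "captions.srt", "open.mp4")))
    print(shlex.join(closed_captions_command("input.mp4", "captions.srt", "closed.mp4")))
```

The first command produces a video that looks the same everywhere, while the second leaves it up to the player to support and display the caption track.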

SRT files

One of the standard file formats that contains the text, the timing and the order in which the subtitles appear is SRT (SubRip).

The structure is very basic, and a file can easily be created in any text editor. Each entry consists of:

a sequence number, starting at 1 and increasing by one for each subtitle
the time when the text appears on screen and the time when it disappears, written as hours:minutes:seconds,milliseconds and separated by -->
the actual text to display
a blank line, indicating the end of the entry

1
00:00:00,540 --> 00:00:03,120
Hi, my name's Scott Ko, as an entrepreneur,

2
00:00:03,180 --> 00:00:07,680
I cannot overstate how important it is these days to use video as a tool to
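Because the format is this regular, it is also easy to read with a few lines of code. The following is a minimal Python sketch, not a complete SRT parser: the `Cue` class and the function names are made up for illustration, and it assumes well-formed input like the sample above.

```python
import re
from dataclasses import dataclass

# Matches an SRT timestamp such as 00:00:03,120 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

SAMPLE = """1
00:00:00,540 --> 00:00:03,120
Hi, my name's Scott Ko, as an entrepreneur,

2
00:00:03,180 --> 00:00:07,680
I cannot overstate how important it is these days to use video as a tool to
"""


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(timestamp: str) -> int:
    """Convert an SRT timestamp to milliseconds."""
    match = TIMESTAMP.fullmatch(timestamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(content: str) -> list[Cue]:
    """Split the content on blank lines and parse each entry into a Cue."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        start, end = lines[1].split("-->")
        # Everything after the timing line is subtitle text (it may span lines).
        cues.append(Cue(int(lines[0]), to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues


if __name__ == "__main__":
    for cue in parse_srt(SAMPLE):
        seconds_on_screen = (cue.end_ms - cue.start_ms) / 1000
        print(cue.index, seconds_on_screen, cue.text)
```

Working in milliseconds keeps the arithmetic exact, which is handy when checking how long each subtitle stays on screen.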

Automated subtitles and captions

Creating an SRT file by hand would be a tedious task requiring someone to listen to the audio of a video and type out the audio as text. Not only that, they would need to record the timing and monitor how many words to show on the screen at any given time so that the viewer has time to read them.

Luckily, with AI it is possible to automatically transcribe a video and generate an SRT file. Services such as AWS Transcribe, along with alternatives from Google and Microsoft, will take an audio or video file and convert the speech to text with all the timing information included. You can then use a third-party library to convert the vendor format into an industry-standard SRT file.
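As a rough idea of what such a conversion involves, here is a minimal Python sketch. The input shape is an assumption: a flat list of word items with `type`, `content`, `start_time` and `end_time` keys, loosely modeled on word-level transcription output. Real vendor formats differ, so check the documentation of the service you use. The function names and the `max_words` limit are also hypothetical.

```python
def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(items: list[dict], max_words: int = 8) -> str:
    """Group timed words into SRT entries of at most max_words words each."""
    cues = []  # each cue is [start_seconds, end_seconds, list_of_words]
    for item in items:
        if item["type"] == "punctuation":
            # Assumed: punctuation items carry no timing, so attach them
            # to the previous word instead of starting a new one.
            if cues:
                cues[-1][2][-1] += item["content"]
            continue
        start, end = float(item["start_time"]), float(item["end_time"])
        if not cues or len(cues[-1][2]) >= max_words:
            cues.append([start, end, [item["content"]]])
        else:
            cues[-1][1] = end  # the cue now lasts until this word ends
            cues[-1][2].append(item["content"])
    entries = [
        f"{number}\n{srt_time(start)} --> {srt_time(end)}\n{' '.join(words)}\n"
        for number, (start, end, words) in enumerate(cues, start=1)
    ]
    return "\n".join(entries)


if __name__ == "__main__":
    words = [
        {"type": "pronunciation", "content": "Hi", "start_time": "0.54", "end_time": "0.80"},
        {"type": "punctuation", "content": ","},
        {"type": "pronunciation", "content": "my", "start_time": "0.85", "end_time": "1.00"},
        {"type": "pronunciation", "content": "name's", "start_time": "1.00", "end_time": "1.30"},
        {"type": "pronunciation", "content": "Scott", "start_time": "1.30", "end_time": "1.70"},
    ]
    print(words_to_srt(words, max_words=3))
```

Capping the number of words per entry is the simplest way to keep each subtitle readable; a production tool would more likely also split on sentence boundaries and pauses.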

Automatically add subtitles to video using code

If you are interested in how you can use the Shotstack API to automatically add subtitles and open captions to your video using code, we have written an in-depth PHP captions guide.

Get started with Shotstack's video editing API in two steps:

Sign up for free to get your API key.

Send an API request to create your video:

curl --request POST 'https://api.shotstack.io/v1/render' \
--header 'Content-Type: application/json' \
--header 'x-api-key: YOUR_API_KEY' \
--data-raw '{
  "timeline": {
    "tracks": [
      {
        "clips": [
          {
            "asset": {
              "type": "video",
              "src": "https://shotstack-assets.s3.amazonaws.com/footage/beach-overhead.mp4"
            },
            "start": 0,
            "length": "auto"
          }
        ]
      }
    ]
  },
  "output": {
    "format": "mp4",
    "size": {
      "width": 1280,
      "height": 720
    }
  }
}'