Create AI Assets with Shotstack

The Shotstack provider allows you to generate assets using the Create API and includes the following services:

Image to Video: Convert images to videos.
Text to Image: Convert text to images.
Text to Speech: Convert text to speech using a choice of voices and languages.
Text Generation: Generate text using a text prompt, powered by GPT-4.

Setup

The Shotstack provider is a built-in provider and does not require any additional setup or configuration.

Adding credentials

The Shotstack provider is set up and enabled by default and does not require any credentials. Access may be limited based on your subscription plan.

Services

The Shotstack provider includes the following services:

Image to video

Image to video converts an image into a 4-second video. Provide a URL to an image to generate an mp4 video file. The video file can be used inside of your Edit or as a standalone asset.

You can optionally set two parameters, guidanceScale and motion:

guidanceScale dictates how strongly the video sticks to the original image. Use lower values to allow the model more freedom to make changes and higher values to correct motion distortions.
motion dictates the amount of motion in the video. Lower values generally result in less motion in the output video, while higher values generally result in more motion.

Only PNG and JPG files of the following dimensions are supported:

1024x576
576x1024
768x768

{
    "provider": "shotstack",
    "options": {
        "type": "image-to-video",
        "imageUrl": "https://shotstack-assets.s3.amazonaws.com/images/wave-barrel-sm.jpg",
        "guidanceScale": 1.8,
        "motion": 127
    }
}

This will generate a video file from the supplied image, using the guidanceScale and motion values set in the payload.
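The file type and dimension constraints above can be checked locally before a request is sent. The following is a minimal sketch, not part of any Shotstack SDK: the helper name `build_image_to_video_payload` is hypothetical, the caller is assumed to already know the source image's width and height, and the file type is inferred from the URL path, which is an assumption that will not hold for URLs without a file extension.

```python
from urllib.parse import urlparse

# Dimensions and file types listed in this section.
SUPPORTED_DIMENSIONS = {(1024, 576), (576, 1024), (768, 768)}
SUPPORTED_EXTENSIONS = (".png", ".jpg")


def build_image_to_video_payload(image_url, width, height,
                                 guidance_scale=None, motion=None):
    """Hypothetical helper: validate the source image, then build the payload.

    width and height describe the source image. They are used only for the
    local check and are not sent to the API.
    """
    # ASSUMPTION: the file type can be read from the URL path.
    path = urlparse(image_url).path.lower()
    if not path.endswith(SUPPORTED_EXTENSIONS):
        raise ValueError("image must be a PNG or JPG file")
    if (width, height) not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"unsupported image dimensions: {width}x{height}")

    options = {"type": "image-to-video", "imageUrl": image_url}
    # Optional parameters are included only when the caller sets them.
    if guidance_scale is not None:
        options["guidanceScale"] = guidance_scale
    if motion is not None:
        options["motion"] = motion
    return {"provider": "shotstack", "options": options}
```

Calling the helper with a 1024x576 JPG and both optional values produces a payload of the same shape as the example above. An unsupported size such as 800x600 raises a ValueError before anything is sent.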

Info

Shotstack's Image to Video API uses Stability AI's Stable Video Diffusion model.

Text to image

Text to image converts a prompt into an image. The image file can be used inside of your Edit or as a standalone asset.

{
    "provider": "shotstack",
    "options": {
        "type": "text-to-image",
        "prompt": "A detailed illustration of Mars, showcasing its orange-red surface, with Olympus Mons and Valles Marineris prominently displayed.",
        "width": 1024,
        "height": 1024
    }
}

Info

Both width and height must be multiples of 64.
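The multiple-of-64 rule is easy to enforce before building a request. The sketch below is illustrative only: both function names are hypothetical, and rounding a requested size to the nearest multiple of 64 is one possible policy, not something the API does for you.

```python
def round_to_multiple_of_64(value):
    """Round a size to the nearest multiple of 64, with a minimum of 64."""
    return max(64, ((value + 32) // 64) * 64)


def build_text_to_image_payload(prompt, width, height):
    """Hypothetical helper: reject sizes that are not multiples of 64."""
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % 64 != 0:
            raise ValueError(
                f"{name} must be a positive multiple of 64, got {value}"
            )
    return {
        "provider": "shotstack",
        "options": {
            "type": "text-to-image",
            "prompt": prompt,
            "width": width,
            "height": height,
        },
    }
```

For example, a requested width of 1000 rounds to 1024, which `build_text_to_image_payload` then accepts. Passing 1000 directly raises a ValueError.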

Text to speech

Text to speech converts text to speech using a choice of voices and languages. Provide a string of text and a voice and language to generate an mp3 audio file. The audio file can be used as a soundtrack for a video or as a standalone asset.

Using the Create API, the following payload can be used to generate an audio file:

{
    "provider": "shotstack",
    "options": {
        "type": "text-to-speech",
        "text": "The future of media production is here with the help of the Shotstack Create API",
        "voice": "Matthew"
    }
}

This will generate an mp3 audio file using the Matthew voice and the text provided.

Translating text

To generate an audio file in a different language, for example Korean, you can use the language option. Note that when setting a language you must also set the voice option to a voice that supports the language. For a full list of languages and voices see the Shotstack text-to-speech options API documentation.

{
    "provider": "shotstack",
    "options": {
        "type": "text-to-speech",
        "text": "The future of media production is here with the help of the Shotstack Create API",
        "voice": "Seoyeon",
        "language": "ko-KR"
    }
}

The above example creates an audio file in Korean. The English text is translated into Korean and then spoken in Korean.

Newscaster mode

The Shotstack text-to-speech service supports a newscaster mode, which generates an audio file that sounds like a newsreader. To enable newscaster mode, set the newscaster option to true. The example below uses the Matthew voice.

{
    "provider": "shotstack",
    "options": {
        "type": "text-to-speech",
        "text": "The future of media production is here with the help of the Shotstack Create API",
        "voice": "Matthew",
        "newscaster": true
    }
}

Note that newscaster mode is only supported by Matthew and Joanna for US English (en-US), Lupe for US Spanish (es-US), and Amy for British English (en-GB).
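The voice restrictions for newscaster mode can be encoded as a small lookup so that invalid combinations are caught before a request is made. This is a sketch under stated assumptions: the helper name `build_text_to_speech_payload` is hypothetical, and treating a mismatched language as an error is an interpretation of the voice and language pairs listed above, not documented API behaviour.

```python
# Voice and language pairs that support newscaster mode, as listed above.
NEWSCASTER_VOICES = {
    "Matthew": "en-US",
    "Joanna": "en-US",
    "Lupe": "es-US",
    "Amy": "en-GB",
}


def build_text_to_speech_payload(text, voice, language=None, newscaster=False):
    """Hypothetical helper: build a text-to-speech payload with local checks."""
    if newscaster:
        if voice not in NEWSCASTER_VOICES:
            raise ValueError(f"voice {voice!r} does not support newscaster mode")
        expected = NEWSCASTER_VOICES[voice]
        # ASSUMPTION: an explicit language must match the voice's listed one.
        if language is not None and language != expected:
            raise ValueError(
                f"newscaster mode for {voice} expects language {expected}"
            )

    options = {"type": "text-to-speech", "text": text, "voice": voice}
    # Optional fields are added only when requested.
    if language is not None:
        options["language"] = language
    if newscaster:
        options["newscaster"] = True
    return {"provider": "shotstack", "options": options}
```

With this helper, Matthew with newscaster enabled yields the same payload as the example above, while Seoyeon with newscaster enabled raises a ValueError.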

Text generator

The text generation service uses the OpenAI GPT-4 model to generate text from a prompt. Provide a string of text and the service will generate a new string of text based on the prompt.

Using the Create API, the following payload can be used to generate text:

{
    "provider": "shotstack",
    "options": {
        "type": "text-generator",
        "prompt": "Create a sentence about the Shotstack API that includes every letter of the alphabet"
    }
}

This will generate a string of text based on the prompt provided and return the URL of a text file containing the generated text.
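Each payload on this page is sent to the Create API as a JSON request body. The sketch below shows one way to do that with Python's standard library. The endpoint URL and the `x-api-key` header name are assumptions that this page does not state, so confirm both against the Create API reference before use. The function names are hypothetical, and the shape of the response is not covered here.

```python
import json
import urllib.request

# ASSUMPTION: endpoint URL; check the Create API reference for your environment.
CREATE_API_URL = "https://api.shotstack.io/create/stage/assets"


def build_create_request(payload, api_key, url=CREATE_API_URL):
    """Build, but do not send, a POST request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API key is passed in an x-api-key header.
            "x-api-key": api_key,
        },
    )


def submit(payload, api_key, url=CREATE_API_URL):
    """Send the request and return the decoded JSON response body."""
    request = build_create_request(payload, api_key, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect or log the exact body and headers without making a network call.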
