Create AI Assets with Shotstack
The Shotstack provider allows you to generate assets using the Create API and includes the following services:
- Image to Video: Convert images to videos.
- Text to Image: Convert text to images.
- Text to Speech: Convert text to speech using a choice of voices and languages.
- Text Generation: Generate text using a text prompt, powered by GPT-4.
Setup
The Shotstack provider is a built-in provider and does not require any additional setup or configuration.
Adding credentials
The Shotstack provider is set up and enabled by default and does not require any credentials. Access may be limited based on your subscription plan.
Services
The Shotstack provider includes the following services:
Image to video
Image to video converts an image into a 4-second video. Provide a URL to an image to generate an mp4 video file. The video file can be used inside of your Edit or as a standalone asset.
You can optionally set two parameters, guidanceScale and motion:
- guidanceScale dictates how strongly the video sticks to the original image. Use lower values to allow the model more freedom to make changes and higher values to correct motion distortions.
- motion dictates the amount of motion in the video. Lower values generally result in less motion in the output video, while higher values generally result in more motion.
Only PNG and JPG files of the following dimensions are supported:
- 1024x576
- 576x1024
- 768x768
{
  "provider": "shotstack",
  "options": {
    "type": "image-to-video",
    "imageUrl": "https://shotstack-assets.s3.amazonaws.com/images/wave-barrel-sm.jpg",
    "guidanceScale": 1.8,
    "motion": 127
  }
}
This will generate a video file of the supplied image, using the guidanceScale and motion values set in the payload.
Shotstack's Image to Video API uses Stability AI's Stable Video Diffusion model.
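The input constraints above can be checked before a request is made. The following is a minimal Python sketch, not part of the Shotstack API: the function name build_image_to_video_payload and its width and height arguments are hypothetical, and it assumes you already know the dimensions of the source image. Treating .jpeg as a JPG file is also an assumption.

```python
# Supported input sizes, taken from the list above.
SUPPORTED_DIMENSIONS = {(1024, 576), (576, 1024), (768, 768)}
# Assumption: ".jpeg" is accepted as a JPG file.
SUPPORTED_EXTENSIONS = (".png", ".jpg", ".jpeg")


def build_image_to_video_payload(image_url, width, height,
                                 guidance_scale=None, motion=None):
    """Return an image-to-video payload, or raise ValueError for unsupported input.

    width and height describe the source image. They are used for
    validation only and are not included in the payload.
    """
    # Ignore any query string when checking the file extension.
    path = image_url.split("?", 1)[0].lower()
    if not path.endswith(SUPPORTED_EXTENSIONS):
        raise ValueError("only PNG and JPG images are supported")
    if (width, height) not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"unsupported image size: {width}x{height}")

    options = {"type": "image-to-video", "imageUrl": image_url}
    # Optional parameters are only included when the caller sets them.
    if guidance_scale is not None:
        options["guidanceScale"] = guidance_scale
    if motion is not None:
        options["motion"] = motion
    return {"provider": "shotstack", "options": options}
```

Leaving guidance_scale and motion unset omits them from the payload, so the service falls back to its own defaults.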
Text to image
Text to image converts a prompt into an image. The image file can be used inside of your Edit or as a standalone asset.
{
  "provider": "shotstack",
  "options": {
    "type": "text-to-image",
    "prompt": "A detailed illustration of Mars, showcasing its orange-red surface, with Olympus Mons and Valles Marineris prominently displayed.",
    "width": 1024,
    "height": 1024
  }
}
Both width and height must be multiples of 64.
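The multiple-of-64 rule is easy to enforce in client code. The Python sketch below is illustrative only: build_text_to_image_payload and round_to_multiple_of_64 are hypothetical helper names, not Shotstack functions.

```python
def round_to_multiple_of_64(value):
    """Round a positive integer to the nearest multiple of 64 (minimum 64)."""
    return max(64, (value + 32) // 64 * 64)


def build_text_to_image_payload(prompt, width, height):
    """Return a text-to-image payload, rejecting sizes that are not multiples of 64."""
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % 64 != 0:
            raise ValueError(f"{name} must be a positive multiple of 64, got {value}")
    return {
        "provider": "shotstack",
        "options": {
            "type": "text-to-image",
            "prompt": prompt,
            "width": width,
            "height": height,
        },
    }
```

For example, a requested width of 1000 can be passed through round_to_multiple_of_64 first, which gives 1024.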
Text to speech
Text to speech converts text to speech using a choice of voices and languages. Provide a string of text and a voice and language to generate an mp3 audio file. The audio file can be used as a soundtrack for a video or as a standalone asset.
Using the Create API, the following payload can be used to generate an audio file:
{
  "provider": "shotstack",
  "options": {
    "type": "text-to-speech",
    "text": "The future of media production is here with the help of the Shotstack Create API",
    "voice": "Matthew"
  }
}
This will generate an mp3 audio file using the Matthew voice and the text provided.
Translating text
To generate an audio file in a different language, for example Korean, you can use the language option. Note that when setting a language you must also set the voice option to a voice that supports the language. For a full list of languages and voices see the Shotstack text-to-speech options API documentation.
{
  "provider": "shotstack",
  "options": {
    "type": "text-to-speech",
    "text": "The future of media production is here with the help of the Shotstack Create API",
    "voice": "Seoyeon",
    "language": "ko-KR"
  }
}
The above example creates an audio file in Korean. The English text is translated to and spoken in Korean.
Newscaster mode
The Shotstack text-to-speech service supports a newscaster mode which allows you to generate an audio file that sounds like a newsreader. To enable newscaster mode, set the newscaster option to true. Newscaster mode is only supported by a limited set of voices, including Matthew.
{
  "provider": "shotstack",
  "options": {
    "type": "text-to-speech",
    "text": "The future of media production is here with the help of the Shotstack Create API",
    "voice": "Matthew",
    "newscaster": true
  }
}
Note that newscaster mode is only supported by Matthew and Joanna for US English (en-US), Lupe for US Spanish (es-US), and Amy for British English (en-GB).
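The voice restriction for newscaster mode can be checked before building a payload. The Python sketch below is an illustration, not Shotstack code: build_text_to_speech_payload is a hypothetical helper, and the voice table contains only the four voices named above. Rejecting a language that does not match the voice is one reading of that restriction and is an assumption.

```python
# Voices that support newscaster mode, with their language, as listed above.
NEWSCASTER_VOICES = {
    "Matthew": "en-US",
    "Joanna": "en-US",
    "Lupe": "es-US",
    "Amy": "en-GB",
}


def build_text_to_speech_payload(text, voice, language=None, newscaster=False):
    """Return a text-to-speech payload, validating newscaster mode if requested."""
    if newscaster:
        if voice not in NEWSCASTER_VOICES:
            raise ValueError(f"voice {voice!r} does not support newscaster mode")
        expected = NEWSCASTER_VOICES[voice]
        # Assumption: an explicit language must match the voice's newscaster language.
        if language is not None and language != expected:
            raise ValueError(f"newscaster mode for {voice} requires {expected}")

    options = {"type": "text-to-speech", "text": text, "voice": voice}
    # language and newscaster are optional and only sent when set.
    if language is not None:
        options["language"] = language
    if newscaster:
        options["newscaster"] = True
    return {"provider": "shotstack", "options": options}
```

Calling the helper with voice "Seoyeon" and newscaster=True raises a ValueError, while the same voice with language "ko-KR" and no newscaster flag produces the Korean payload shown earlier.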
Text generator
The text generation service uses the OpenAI GPT-4 model to generate text from a prompt. Provide a string of text as the prompt and the service will generate a new string of text based on it.
Using the Create API, the following payload can be used to generate text:
{
  "provider": "shotstack",
  "options": {
    "type": "text-generator",
    "prompt": "Create a sentence about the Shotstack API that includes every letter of the alphabet"
  }
}
This will generate a string of text based on the prompt provided and return the URL of a text file containing the generated text.
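Any of the payloads on this page can be wrapped in an HTTP request. The Python sketch below only assembles the request using the standard library and does not send it. The endpoint pattern, the "stage" environment name, and the x-api-key header are assumptions that do not come from this page, so confirm them against the Create API reference. build_create_request is a hypothetical helper name, and the API key here refers to your Shotstack account key, not provider credentials.

```python
import json
import urllib.request

# Assumption: endpoint pattern for the Create API; verify in the API reference.
CREATE_API_URL = "https://api.shotstack.io/create/{environment}/assets"


def build_create_request(payload, api_key, environment="stage"):
    """Assemble (but do not send) a POST request carrying a Create API payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        CREATE_API_URL.format(environment=environment),
        data=body,
        # Assumption: the account API key is passed in an x-api-key header.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


payload = {
    "provider": "shotstack",
    "options": {
        "type": "text-generator",
        "prompt": "Create a sentence about the Shotstack API that includes every letter of the alphabet",
    },
}
request = build_create_request(payload, api_key="YOUR_API_KEY")
# Sending is left to the caller, for example with urllib.request.urlopen(request).
```

Keeping request assembly separate from sending makes the payload easy to inspect or log before any network call is made.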