Generative AI API

Unify Generative AI Media Production

Connect to built-in and third-party Generative AI providers through a single API, and rapidly combine text-to-voice, text-to-image and text-to-video in your templates, applications and workflows.

GET STARTED Talk to an expert

Unify multiple AI providers using a single SDK

Unifying Generative AI

Connecting the industry leaders in Generative AI through a unified API

OpenAI
Stability AI
D-ID
ElevenLabs
HeyGen

Remove the complexity of integrating with multiple AI providers

Your video production, application or workflow might need AI-generated assets from multiple service providers. But do you really want to spend your time installing libraries, jumping from website to website and getting confused by their docs and SDKs?


Fragmented ecosystem

Generative AI APIs are fragmented and messy

As the generative AI landscape evolves, it is becoming increasingly fragmented, with new APIs and services being released every day. To build a cohesive media experience you have to integrate multiple AI providers, study each one's docs, install their SDKs, and end up with a bloated mess that goes out of date as soon as it hits production.

Disconnected services

No other platform brings everything together

Generating a seamless video experience can involve using one API for text to speech, another for text to image, and one more for text to video. And then you need to edit it using yet another service. The result is a convoluted mess of code, SDKs and API keys that is difficult to maintain and manage.

"It would have taken a lot of research on what technologies we needed to leverage technically for us to achieve the desired outcome. This would have taken at least two months of engineering time for a simple use case, and up to 6 months if the scope widened."
Hector Zarate, Spotify

You could spend thousands of hours developing your own video editing capabilities, increasing time to market and costing money.

A single API endpoint to unify Generative AI media creation

Bring all your Generative AI media production needs into a single platform and connect via one simple-to-use API endpoint and a single API key. Rapidly prototype using Shotstack's built-in text-to-speech, GPT-4 powered text generation, and Stable Diffusion text-to-image and image-to-video services.

Or combine generated assets from leading providers like OpenAI, Stability AI, ElevenLabs, HeyGen and D-ID into a single creative output. Shotstack is the centralised API hub for generative AI media production.
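
As a rough illustration of the idea, the Python sketch below shows how one request shape could cover several providers. The function name build_generation_request, the field names provider, options and type, and the provider identifiers are all assumptions made for illustration and are not taken from the Shotstack API reference. The sketch only builds a payload and makes no network call.

```python
import json
from typing import Any

# Hypothetical provider identifiers, used for illustration only.
KNOWN_PROVIDERS = {"shotstack", "openai", "stability-ai", "elevenlabs", "heygen", "d-id"}


def build_generation_request(provider: str, asset_type: str, **options: Any) -> dict[str, Any]:
    """Build one provider-agnostic payload (assumed shape, not the documented schema)."""
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    # The same envelope is used whichever provider is chosen.
    return {"provider": provider, "options": {"type": asset_type, **options}}


if __name__ == "__main__":
    voice = build_generation_request("shotstack", "text-to-speech", text="Hello world")
    image = build_generation_request("stability-ai", "text-to-image", prompt="A red kite")
    print(json.dumps([voice, image], indent=2))
```

In a design like this, switching providers means changing one argument rather than adopting a different SDK. The real request schema may differ, so check the API reference before relying on any of these names.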


AI Image Generation API

Text to Image

Use the Shotstack built-in text-to-image service to generate images from a simple text prompt. Or use your own Stability AI key to plug into the latest Stable Diffusion models and combine generated images with our other APIs and services.

Image to Video API

Image to Video

Use the Shotstack image-to-video service to bring your images to life as video. Provide the URL of an image and AI will turn it into a short video to use in your edits.

GPT-4 Text Generation API

Text to Text

Get direct access to GPT-4 using the built-in Shotstack text generator to generate summaries, scripts and messaging for your videos. Or bring your own API key and connect directly to OpenAI to generate text from prompts.

AI Voice Overs API

Text to Speech

Use AI to generate voice overs for your videos. Use the built-in Shotstack text-to-speech service or bring your own API key and connect to ElevenLabs. Combine voices, accents and translation with Shotstack video editing templates to create voice overs for every audience.

AI Avatars API

Text to Avatar

Bring your own API key and connect to D-ID or HeyGen. Generate AI talking avatars by providing the text and selecting a character. With a unified endpoint you can switch between providers and generate avatars at scale.

Bring Everything Together

Text to Video

Bring everything together using the Shotstack platform and video editing templates to provide an AI text-to-video service. Unify all your AI-generated media in one place and combine voice overs, images, avatars, text and video to scale your AI media production.
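
To make "combining" concrete, here is a minimal Python sketch of arranging generated assets on a shared timeline. The Clip class, the build_timeline function, the field names and the example.com URLs are hypothetical placeholders rather than Shotstack's actual template schema. The sketch only shows the general idea of grouping assets into tracks ordered by start time.

```python
from dataclasses import dataclass, asdict


@dataclass
class Clip:
    kind: str      # e.g. "image", "audio", "video" (illustrative labels)
    src: str       # URL of a generated asset (placeholder values below)
    start: float   # seconds from the start of the video
    length: float  # duration in seconds


def build_timeline(clips: list[Clip]) -> dict:
    """Group clips into one track per kind, each sorted by start time."""
    tracks: dict[str, list[dict]] = {}
    for clip in sorted(clips, key=lambda c: c.start):
        tracks.setdefault(clip.kind, []).append(asdict(clip))
    # Total duration is the latest end time across all clips.
    duration = max((c.start + c.length for c in clips), default=0.0)
    return {"tracks": list(tracks.values()), "duration": duration}


if __name__ == "__main__":
    plan = build_timeline([
        Clip("audio", "https://example.com/voiceover.mp3", 0.0, 10.0),
        Clip("image", "https://example.com/scene-1.png", 0.0, 5.0),
        Clip("image", "https://example.com/scene-2.png", 5.0, 5.0),
    ])
    print(plan)
```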


Join companies large and small rendering thousands of videos every day

Coca Cola

Experience Shotstack for yourself, with no risk, and generate your first video in 15 minutes.