How to edit a picture-in-picture video using Node.js

Before getting started, you might want to see an example application we built that layers videos to create a picture-in-picture effect.

Picture-in-picture has become ubiquitous across the media landscape. Browsers offer it so you can keep watching a video while scrolling a web page, and many YouTube creators use insets to provide commentary on their videos.

This guide will walk you through creating a simple application that can be used to add picture-in-picture functionality to your videos. For this tutorial we are using Node.js.

Let's get started

Sign up for an API key

You can sign up for a free Shotstack developer account and receive your key, which you will need to make calls to the API. If you haven't used the API before, a good place to start is our Hello World tutorial.

Node.js

Our script will be written in Node.js and we'll keep it as simple as possible, with minimal dependencies.

Setting the scene

We're going to build a YouTube listicle video discussing my top 5 favourite games for OSX. We have a bunch of media assets, such as game footage and video commentary, in different aspect ratios, resolutions and file types, which we will assemble into a composited video.

JSON

A Shotstack video edit is simply a JSON file comprising a timeline, clips, transitions and effects. It is posted to the API, which takes care of the rendering process and produces an mp4 video file.

In the JSON below, we place parts of our game footage in series and overlay our scaled-down video commentary in the bottom right corner of those videos.

{
"timeline": {
"background": "#000000",
"tracks": [
{
"clips": [
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/deponia_pip.mov"
},
"start": 0,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/hacknet_pip.mov"
},
"start": 10,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/beholder_pip.mov"
},
"start": 20,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/pinstripe_pip.mov"
},
"start": 30,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/thimbleweed_pip.mov"
},
"start": 40,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
}
]
},
{
"clips": [
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/deponia.mkv",
"volume": 0.1,
"trim": 10
},
"start": 0,
"length": 10
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/hacknet.mkv",
"volume": 0.1,
"trim": 10
},
"start": 10,
"length": 10
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/beholder.mkv",
"volume": 0.1,
"trim": 10
},
"start": 20,
"length": 10
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/pinstripe.mkv",
"volume": 0.1,
"trim": 40
},
"start": 30,
"length": 10
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/thimbleweed.mkv",
"volume": 0.1,
"trim": 10
},
"start": 40,
"length": 10
}
]
}
]
},
"output": {
"format": "mp4",
"resolution": "sd"
}
}

You could submit this JSON payload directly to the API using cURL or Postman, but for this tutorial we will save the JSON in a file called template.json and read it in using our Node.js script.
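The clips on each track differ only in their source file, start time and trim point, so template.json does not have to be written by hand. The sketch below is one possible way to generate the same edit from a list of games. The buildPipEdit helper, the games array and the ASSET_BASE constant are names invented for this illustration and are not part of the Shotstack API.

```javascript
// Assumption: all assets live under the same base URL used in the JSON above.
const ASSET_BASE = 'https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/';

// Hypothetical helper: builds the two-track picture-in-picture edit.
const buildPipEdit = (games, clipLength = 10) => {
    // Top track: the scaled-down commentary, pinned to the bottom right.
    const commentaryClips = games.map((game, index) => ({
        asset: { type: 'video', src: `${ASSET_BASE}${game.name}_pip.mov` },
        start: index * clipLength,
        length: clipLength,
        scale: 0.2,
        position: 'bottomRight',
        offset: { x: -0.05, y: 0.1 }
    }));

    // Bottom track: the full-size game footage, played one clip after another.
    const footageClips = games.map((game, index) => ({
        asset: {
            type: 'video',
            src: `${ASSET_BASE}${game.name}.mkv`,
            volume: 0.1,
            trim: game.trim
        },
        start: index * clipLength,
        length: clipLength
    }));

    return {
        timeline: {
            background: '#000000',
            tracks: [{ clips: commentaryClips }, { clips: footageClips }]
        },
        output: { format: 'mp4', resolution: 'sd' }
    };
};

// File names and trim values taken from the JSON above.
const games = [
    { name: 'deponia', trim: 10 },
    { name: 'hacknet', trim: 10 },
    { name: 'beholder', trim: 10 },
    { name: 'pinstripe', trim: 40 },
    { name: 'thimbleweed', trim: 10 }
];

const edit = buildPipEdit(games);

// To save the result for the script in the next section:
// require('fs').writeFileSync('template.json', JSON.stringify(edit, null, 2));
```

Keeping the per-game details in one array means that adding a sixth game, or changing the clip length, only requires a change in one place.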

Node.js application

Create a new Node.js script file and add the code below. The script reads the JSON template file, POSTs it to the API render endpoint and then polls the API to retrieve the render status. Rendering takes around 30 seconds, after which the URL of the finished video is output to the console so you can download and view it. You will need to install the dotenv and axios node modules before running the script.

require('dotenv').config();
const axios = require('axios');

const shotstackUrl = 'https://api.shotstack.io/stage/';
// Either declare your API key in your .env file, or set this variable with your API key right here.
const shotstackApiKey = process.env.SHOTSTACK_API_KEY;

const json = require('./template.json');

/**
 * Post the JSON video edit to the Shotstack API
 *
 * @param {String} json The JSON edit read from template.json
 */
const renderVideo = async (json) => {
    const response = await axios({
        method: 'post',
        url: shotstackUrl + 'render',
        headers: {
            'x-api-key': shotstackApiKey,
            'content-type': 'application/json'
        },
        data: json
    });

    return response.data;
}

/**
 * Get the status of the render task from the Shotstack API
 *
 * @param {String} uuid The render id of the current video render task
 */
const pollVideoStatus = async (uuid) => {
    const response = await axios({
        method: 'get',
        url: shotstackUrl + 'render/' + uuid,
        headers: {
            'x-api-key': shotstackApiKey,
            'content-type': 'application/json'
        },
    });

    const { status, url, error } = response.data.response;

    if (status === 'done') {
        console.log('Succeeded: ' + url);
    } else if (status === 'failed') {
        console.error('Failed with the following error: ' + error);
    } else {
        // Still in progress: log the status and check again in 3 seconds.
        // The catch makes sure a failed status request is reported.
        console.log(status + '...');
        setTimeout(() => pollVideoStatus(uuid).catch(console.error), 3000);
    }
}

// Run the script
(async () => {
    try {
        const render = await renderVideo(JSON.stringify(json));
        await pollVideoStatus(render.response.id);
    } catch (err) {
        console.error(err);
    }
})();
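
The script above keeps polling for as long as the render is neither done nor failed. If you would rather give up after a fixed number of attempts, the polling could be written as a loop instead. This is only a sketch: waitForRender, sleep and the maxAttempts and intervalMs options are hypothetical names, and the getStatus callback is assumed to resolve to the same object the script reads from response.data.response.

```javascript
// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Hypothetical helper: call getStatus until the render is done or failed.
 *
 * @param {Function} getStatus Async function returning an object with a status field
 * @param {Object} options intervalMs between checks and maxAttempts before giving up
 */
const waitForRender = async (getStatus, { intervalMs = 3000, maxAttempts = 40 } = {}) => {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const render = await getStatus();

        if (render.status === 'done' || render.status === 'failed') {
            return render;
        }

        console.log(render.status + '...');
        await sleep(intervalMs);
    }

    throw new Error('Render did not finish after ' + maxAttempts + ' attempts');
};

// Example wiring with the script above (not executed here):
// const render = await waitForRender(async () => {
//     const response = await axios({
//         method: 'get',
//         url: shotstackUrl + 'render/' + uuid,
//         headers: { 'x-api-key': shotstackApiKey }
//     });
//     return response.data.response;
// });
```

Passing the status request in as a function keeps the loop independent of axios, which also makes it easy to exercise with a stubbed response.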

Initial result

Our first draft will look like the video below:

Pretty straightforward, right? There is still room for improvement, though. It isn't yet clear what the video is about, and it doesn't feel like a listicle at all. The clips don't transition very nicely, and there's generally no context on what's going on outside of the commentary.

Final touches

The JSON below adds several HTML assets. These assets use HTML and CSS to build basic animations that provide context on the game being discussed. We'll also add some transitions to the game footage to move more organically from one item to the next, and include an initial title scene that makes it clear what the video is all about.

{
"timeline": {
"background": "#000000",
"tracks": [
{
"clips": [
{
"asset": {
"type": "title",
"text": "Top 5 Steam games on OSX",
"style": "blockbuster",
"color": "#ffffff",
"size": "large",
"background": "#000000",
"position": "center"
},
"start": 0,
"length": 3,
"transition":{
"in": "fade",
"out": "fade"
}
}
]
},
{
"clips": [
{
"asset": {
"type": "html",
"html": "<div>5</div>",
"css": "div {font-family: \"Lato\";font-size: 90px; font-weight: bold; padding: 5%;}",
"width": 150,
"height": 150,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideRight",
"out": "slideLeft"
},
"start": 3,
"length": 4,
"position": "bottomLeft",
"offset":{
"x": 0.05,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Daedalic Entertainment, 2012</div>",
"css": "div {font-family: \"Lato\";font-size: 18px; font-weight: bold; padding: 5%;}",
"width": 300,
"height": 50,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideUp",
"out": "slideDown"
},
"start": 3,
"length": 4,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Deponia</div>",
"css": "div {font-family: \"Lato\";font-size: 60px; font-weight: bold; padding: 5%;}",
"width": 400,
"height": 100,
"background": "#bdc3c7",
"position": "center"
},
"transition": {
"in": "slideDown",
"out": "slideUp"
},
"start": 3,
"length": 4,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.219
}
},
{
"asset": {
"type": "html",
"html": "<div>4</div>",
"css": "div {font-family: \"Lato\";font-size: 90px; font-weight: bold; padding: 5%;}",
"width": 150,
"height": 150,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideRight",
"out": "slideLeft"
},
"start": 11,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.05,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Fractal Alligator, 2015</div>",
"css": "div {font-family: \"Lato\";font-size: 18px; font-weight: bold; padding: 5%;}",
"width": 300,
"height": 50,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideUp",
"out": "slideDown"
},
"start": 11,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Hacknet</div>",
"css": "div {font-family: \"Lato\";font-size: 60px; font-weight: bold; padding: 5%;}",
"width": 400,
"height": 100,
"background": "#bdc3c7",
"position": "center"
},
"transition": {
"in": "slideDown",
"out": "slideUp"
},
"start": 11,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.219
}
},
{
"asset": {
"type": "html",
"html": "<div>3</div>",
"css": "div {font-family: \"Lato\";font-size: 90px; font-weight: bold; padding: 5%;}",
"width": 150,
"height": 150,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideRight",
"out": "slideLeft"
},
"start": 21,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.05,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Warm Lamp Games, 2016</div>",
"css": "div {font-family: \"Lato\";font-size: 18px; font-weight: bold; padding: 5%;}",
"width": 300,
"height": 50,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideUp",
"out": "slideDown"
},
"start": 21,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Beholder</div>",
"css": "div {font-family: \"Lato\";font-size: 60px; font-weight: bold; padding: 5%;}",
"width": 400,
"height": 100,
"background": "#bdc3c7",
"position": "center"
},
"transition": {
"in": "slideDown",
"out": "slideUp"
},
"start": 21,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.219
}
},
{
"asset": {
"type": "html",
"html": "<div>2</div>",
"css": "div {font-family: \"Lato\";font-size: 90px; font-weight: bold; padding: 5%;}",
"width": 150,
"height": 150,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideRight",
"out": "slideLeft"
},
"start": 31,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.05,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Atmos Games, 2017</div>",
"css": "div {font-family: \"Lato\";font-size: 18px; font-weight: bold; padding: 5%;}",
"width": 300,
"height": 50,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideUp",
"out": "slideDown"
},
"start": 31,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Pinstripe</div>",
"css": "div {font-family: \"Lato\";font-size: 60px; font-weight: bold; padding: 5%;}",
"width": 400,
"height": 100,
"background": "#bdc3c7",
"position": "center"
},
"transition": {
"in": "slideDown",
"out": "slideUp"
},
"start": 31,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.219
}
},
{
"asset": {
"type": "html",
"html": "<div>1</div>",
"css": "div {font-family: \"Lato\";font-size: 90px; font-weight: bold; padding: 5%;}",
"width": 150,
"height": 150,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideRight",
"out": "slideLeft"
},
"start": 41,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.05,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Terrible Toybox, 2017</div>",
"css": "div {font-family: \"Lato\";font-size: 18px; font-weight: bold; padding: 5%;}",
"width": 300,
"height": 50,
"background": "#ecf0f1",
"position": "center"
},
"transition": {
"in": "slideUp",
"out": "slideDown"
},
"start": 41,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.15
}
},
{
"asset": {
"type": "html",
"html": "<div>Thimbleweed Park</div>",
"css": "div {font-family: \"Lato\";font-size: 40px; font-weight: bold; padding: 5%;}",
"width": 400,
"height": 100,
"background": "#bdc3c7",
"position": "center"
},
"transition": {
"in": "slideDown",
"out": "slideUp"
},
"start": 41,
"length": 5,
"position": "bottomLeft",
"offset":{
"x": 0.18,
"y": 0.219
}
}
]
},
{
"clips": [
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/deponia_pip.mov"
},
"start": 3,
"length": 7,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/hacknet_pip.mov"
},
"start": 10,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/beholder_pip.mov"
},
"start": 20,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/pinstripe_pip.mov"
},
"start": 30,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/thimbleweed_pip.mov"
},
"start": 40,
"length": 10,
"scale": 0.2,
"position": "bottomRight",
"offset": {
"x": -0.05,
"y": 0.1
},
"transition":{
"out": "fade"
}
}
]
},
{
"clips": [
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/deponia.mkv",
"volume": 0.1,
"trim": 10
},
"start": 3,
"length": 7,
"transition":{
"in": "fade",
"out": "fade"
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/hacknet.mkv",
"volume": 0.1,
"trim": 10
},
"start": 10,
"length": 10,
"transition":{
"in": "fade",
"out": "fade"
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/beholder.mkv",
"volume": 0.1,
"trim": 10
},
"start": 20,
"length": 10,
"transition":{
"in": "fade",
"out": "fade"
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/pinstripe.mkv",
"volume": 0.1,
"trim": 40
},
"start": 30,
"length": 10,
"transition":{
"in": "fade",
"out": "fade"
}
},
{
"asset": {
"type": "video",
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/pip/thimbleweed.mkv",
"volume": 0.1,
"trim": 10
},
"start": 40,
"length": 10,
"transition":{
"in": "fade",
"out": "fade"
}
}
]
}
],
"fonts": [
{
"src": "https://shotstack-assets.s3-ap-southeast-2.amazonaws.com/fonts/Lato-Bold.ttf"
}
]
},
"output": {
"format": "mp4",
"resolution": "sd"
}
}
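
Each game in the JSON above gets the same three HTML clips (rank, developer and year, title), with only the text, start time and length changing. As a rough sketch, those clips could be generated with a small helper rather than copied by hand. The names htmlClip and lowerThird are hypothetical and used only for this illustration; the sizes, colours, transitions and offsets are copied from the JSON above.

```javascript
// Hypothetical helper: builds one HTML clip anchored to the bottom left of the frame.
const htmlClip = ({ text, fontSize, width, height, background, transition, offset, start, length }) => ({
    asset: {
        type: 'html',
        html: `<div>${text}</div>`,
        css: `div {font-family: "Lato";font-size: ${fontSize}px; font-weight: bold; padding: 5%;}`,
        width,
        height,
        background,
        position: 'center'
    },
    transition,
    start,
    length,
    position: 'bottomLeft',
    offset
});

// Hypothetical helper: returns the three clips that make up one lower third.
const lowerThird = ({ rank, title, credit, start, length, titleSize = 60 }) => [
    // Rank number box
    htmlClip({
        text: rank, fontSize: 90, width: 150, height: 150, background: '#ecf0f1',
        transition: { in: 'slideRight', out: 'slideLeft' },
        offset: { x: 0.05, y: 0.15 }, start, length
    }),
    // Developer and year
    htmlClip({
        text: credit, fontSize: 18, width: 300, height: 50, background: '#ecf0f1',
        transition: { in: 'slideUp', out: 'slideDown' },
        offset: { x: 0.18, y: 0.15 }, start, length
    }),
    // Game title
    htmlClip({
        text: title, fontSize: titleSize, width: 400, height: 100, background: '#bdc3c7',
        transition: { in: 'slideDown', out: 'slideUp' },
        offset: { x: 0.18, y: 0.219 }, start, length
    })
];

// Two of the five entries from the JSON above; the longer title uses a smaller font.
const lowerThirdClips = [
    ...lowerThird({ rank: 5, title: 'Deponia', credit: 'Daedalic Entertainment, 2012', start: 3, length: 4 }),
    ...lowerThird({ rank: 1, title: 'Thimbleweed Park', credit: 'Terrible Toybox, 2017', start: 41, length: 5, titleSize: 40 })
];
```

The resulting array can be used as the clips value of the HTML track in the timeline.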

Final result

The final output is a professionally edited listicle video with picture-in-picture video commentary. PewDiePie would be jealous.

Conclusion

This guide shows you how to build an application that places a scaled video on top of another video, creating a picture-in-picture effect. We also used the HTML asset type and the built-in slide transitions to add animated lower-third titles.

To demonstrate how you could use the techniques described in this how-to guide, we have built our own open-source picture-in-picture generator, which you can use to generate picture-in-picture videos. The complete source code is available on GitHub, and you could use it as a starting point to build your own application.

Jeff Shillitto

BY DERK ZOMER
27th November, 2020