How to Create Custom Picture-by-Picture Looks for Remote Streaming Production

As the world was locked down due to the COVID-19 pandemic, the adoption of video conference software-as-a-service (SaaS) solutions for corporate, government, educational, and consumer use grew appreciably in 2020. The meteoric rise of Zoom, Microsoft Teams, Google Meet, and Cisco Webex was not without security and encryption issues. But while the media circulated reports of Zoombombers and rent-a-goat video participants (replaced late in the year with rent-a-Santa), people at all levels used video conferences as a way to meet, connect, govern, communicate, educate, and much more.

Video conference services all strive to maintain a latency below 150 milliseconds, which is the maximum delay before conversations start to feel unnatural. They also need to be easy to use across a very broad user base (from elementary students to seniors and every age group in between). All of these services have in common the ability to share video and audio feeds, typically from an often suboptimal webcam with a built-in microphone. More advanced users have mirrorless video cameras and headsets or podcast-style microphones to look or sound better, or at least as good as the heavy compression and low bitrate of video conference services allow.

Participants can toggle their display between showing only the active speaker and a “tiled” or “gallery” view that can show up to 49 participants in a grid. “Pin” and “spotlight” controls override the active speaker. Video conference services also let users or presenters share their screen, which is one way to show a PowerPoint presentation, photos, a website, or videos to meeting participants. Some services allow the simultaneous sharing of a screen and a small video feed in a display format referred to as picture by picture (PxP). This is similar to picture in picture (PiP), except PxP composites the two signals side by side, whereas PiP composites one signal on top of the other.
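To make the difference between the two looks concrete, here is a minimal Python sketch that computes where each source would sit on a 1920x1080 canvas. The `Rect` class, the function names, the 75% main-pane share, and the 40-pixel margin are all illustrative assumptions, not settings taken from any video conference service or switcher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangle in canvas pixels: top-left corner plus size."""
    x: int
    y: int
    w: int
    h: int


def pxp_layout(canvas_w=1920, canvas_h=1080, main_share=0.75):
    """Picture by picture: two 16:9 panes side by side, never overlapping."""
    main_w = int(canvas_w * main_share)
    side_w = canvas_w - main_w
    # Keep each pane at 16:9 and center it vertically on the canvas.
    main_h = main_w * 9 // 16
    side_h = side_w * 9 // 16
    main = Rect(0, (canvas_h - main_h) // 2, main_w, main_h)
    side = Rect(main_w, (canvas_h - side_h) // 2, side_w, side_h)
    return main, side


def pip_layout(canvas_w=1920, canvas_h=1080, scale=0.25, margin=40):
    """Picture in picture: a small inset drawn on top of a full-frame source."""
    background = Rect(0, 0, canvas_w, canvas_h)
    inset_w = int(canvas_w * scale)
    inset_h = int(canvas_h * scale)
    # Place the inset in the lower-right corner, offset by the margin.
    inset = Rect(canvas_w - inset_w - margin,
                 canvas_h - inset_h - margin,
                 inset_w, inset_h)
    return background, inset


def overlaps(a, b):
    """True if the two rectangles share any canvas area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


if __name__ == "__main__":
    print("PxP:", pxp_layout(), "overlap:", overlaps(*pxp_layout()))
    print("PiP:", pip_layout(), "overlap:", overlaps(*pip_layout()))
```

With these assumed defaults, the PxP panes do not overlap, so neither source hides part of the other, while the PiP inset always covers a corner of the full-frame source.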

In this article, I will discuss PxP and PiP functionality. By design and necessity, video conference services are really easy to use, but they lack advanced controls for independently regulating the positions and scale of video and content-sharing signals.

If you’re looking for a way to elevate your livestreaming productions beyond what clients can do themselves, then you will want to take over more control in your productions, starting with the PxP and PiP looks, using external hardware and/or software. The benefit of this approach is that your recordings will generate a standard HD signal for later editing and on-demand viewing on different platforms.

Bypassing Stock PxP/PiP Looks

If you want to take more control of your production looks, you won’t find the controls hidden deep in your video conference service’s advanced settings. You have to work within the controls that are available to you: a single video and audio input, plus the ability of presenters to “spotlight” a video signal that all participants will see.

The trick is that instead of using your webcam to send a video signal to your video conference service, you need to use a compatible hardware or software solution that is USB Video Class (UVC)-compliant. UVC is a webcam-type signal that video conference services can use and is not the same type of signal that a hardware PCIe video capture card would use. If you composite the PxP or PiP signal yourself, before your video conference SaaS sees it, then you aren’t restricted by the service’s controls and limitations.

Hardware Solutions

One approach is to use a hardware video switcher to connect multiple inputs and, using the video switcher’s controls, composite your video and computer signals as a PxP or PiP look. Then, connect your video switcher to a USB capture card. This, of course, assumes that your video switcher has a PxP or PiP look.

Individual video signals can be converted using a USB converter connected to a video camera with an HDMI or HD-SDI output or a computer with an HDMI output. These USB converters are available with USB 3.0, USB-C, and Thunderbolt 3 connectors. You can connect one or multiple USB capture cards, depending on your computer system’s USB controller capabilities.

Another approach is to use a hardware video switcher that has a PiP or PxP look and a USB output. The Blackmagic Design ATEM Mini (Figure 1, below) and ATEN Camlive Pro (Figure 2, below Figure 1) are two models that sell for less than $400 and offer both PiP and PxP looks and a UVC webcam-type USB output.

Figure 1. The Blackmagic Design ATEM Mini can be used to create a PxP look. There are no controls for cropping the thumbnail image, but I can scale and move the position.

Figure 2. The ATEN Camlive Pro has advanced controls for creating a PxP look, including cropping individual sources.

The advantage of using a hardware switcher with a USB output is that you need only a single USB connection. In addition, these models often cost less and are smaller than traditional professional hardware video switchers. The disadvantage of these models is that they have a limited set of controls to adjust the size, position, and cropping of the PiP and PxP looks. Not all models support the more desirable PxP look, and although some let you change settings using controls and buttons on the hardware, most require initial setup and advanced controls to be adjusted on a connected computer or mobile app.
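To see why cropping controls matter for a PxP look, consider fitting a 16:9 camera feed into a pane with a different shape. The sketch below is a hypothetical helper, not how either switcher implements cropping. It computes the centered region of a source that matches a pane's aspect ratio, so the source can then be scaled into the pane without distortion. The function name and the example pane sizes are assumptions.

```python
def center_crop(src_w, src_h, pane_w, pane_h):
    """Return (x, y, w, h) of the centered source region matching the pane's shape."""
    # Compare aspect ratios by cross-multiplying to stay in integer math.
    if src_w * pane_h > src_h * pane_w:
        # Source is wider than the pane: keep full height, trim the sides.
        crop_w = src_h * pane_w // pane_h
        crop_h = src_h
    else:
        # Source is as tall as or taller than the pane: keep full width,
        # trim the top and bottom.
        crop_w = src_w
        crop_h = src_w * pane_h // pane_w
    x = (src_w - crop_w) // 2
    y = (src_h - crop_h) // 2
    return x, y, crop_w, crop_h


if __name__ == "__main__":
    # A tall presenter pane (480x1080) beside the slides needs a narrow crop.
    print(center_crop(1920, 1080, 480, 1080))
    # A 16:9 pane (1440x810) needs no crop at all, only scaling.
    print(center_crop(1920, 1080, 1440, 810))
```

Without a crop control, a thumbnail can only be scaled and moved, so a 16:9 source placed in a non-16:9 pane either leaves empty space or has to be distorted.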
