Video Annotation Feature Overview
Join Heartex Label Studio product designer Dan Talata for a look ahead at an often-requested new feature: Video Annotation.
Transcript
Michael
Welcome to another webinar in our Label Studio tutorial series. Today’s session is a bit different—we’re giving you an early look at a highly requested upcoming feature: video annotation. This is a major update and something our team has been working hard to deliver. We're excited to show you what’s coming.
Joining me today is Dan Talata, a product designer at Heartex who’s been deeply involved in this feature. We'll also be joined later by Heartex co-founder Max Tkachenko, who’ll help answer questions after the demo.
Before we begin, a few quick housekeeping notes:
You can ask questions in our Slack community, specifically in the #webinars channel. Today, we’re also taking questions directly in the YouTube chat. We prefer Slack so we can keep a record, but feel free to use either.
You can also register for upcoming webinars on our website—we’re finalizing the schedule through Q1. One exciting event on the horizon is a joint session with Layout Parser at the end of January.
As for community updates, the Label Studio Champions program continues to grow. Congrats to Nitish Kumar, now leading the leaderboard. He’s been answering questions, fixing bugs, and contributing across GitHub and Slack. Keep an eye out for weekly tasks—like joining this webinar live—which earn you points.
We also published new community content: an article on integrating Label Studio’s frontend into Angular apps. You’ll find the link in this video’s description.
We encourage you to check out our Slack, GitHub, Substack newsletter, and enterprise offerings at heartex.com. And with that, I’ll hand it over to Dan.
Dan
Thanks, Michael. I’m Dan, a product designer at Heartex. My focus is on making the Label Studio interface more user-friendly and efficient. Today, I’ll walk you through how video annotation works in the upcoming release.
Let’s dive in.
I’m using Label Studio version 1.3 here, but that may change slightly by the time this is released. I’ll start by creating a project called “Traffic” in the Research workspace, and upload a video clip I prepared for this demo.
After setting up the labels—Vehicle, Human, and Tree—I configure the frame rate at 30 FPS. In future versions, Label Studio will detect frame rate automatically.
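A labeling configuration for this kind of project might look like the sketch below. This is a hypothetical example, not the exact config from the demo; the tag names and the frameRate attribute reflect Label Studio's video support and may differ slightly in the release Dan is previewing.

```xml
<View>
  <!-- Labels applied to video regions -->
  <Labels name="videoLabels" toName="video">
    <Label value="Vehicle"/>
    <Label value="Human"/>
    <Label value="Tree"/>
  </Labels>
  <!-- Frame rate is set manually here; future versions will detect it -->
  <Video name="video" value="$video" frameRate="30"/>
  <VideoRectangle name="box" toName="video"/>
</View>
```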
Once the project is created, you’ll see one task with the uploaded video. Opening the task brings up the new video annotation interface.
The interface includes a video player, playback controls, a frame input field, and the timeline. The timeline shows one column per video frame—in this case, 12 seconds at 30 FPS, so 360 frames.
To annotate, I select a label—say, Vehicle—and draw a bounding box on the first frame. This creates a purple line on the timeline, which we call a lifespan. The dot at the start is a keyframe, created automatically when I adjust or move the region. As the object moves, I reposition it frame by frame, and keyframes are added automatically.
Between keyframes, the region is interpolated. This means you don’t have to manually label every frame—just the key ones. You can drag the cursor across the timeline and adjust as needed.
If an object exits the frame—for example, a palm tree going out of view—you can end the region's lifespan by clicking a button that removes it from that point onward. If needed, you can bring it back by reintroducing the lifespan.
You can also delete individual keyframes. If a keyframe is removed, the system interpolates between the nearest keyframes on either side. And if the result isn’t ideal, you can simply add the keyframe back.
The goal is to make video annotation fast, flexible, and intuitive.
In addition to manual labeling, you can also use machine learning to pre-annotate videos. In another project, we used a model to detect runners in a sports event. The system automatically tracked their movements across frames. You can then refine these predictions—like selecting all runners and assigning them the correct label, such as "Woman," using keyboard shortcuts.
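For readers curious what a model-generated pre-annotation might look like, here is a sketch of a single predicted video region as a Python dict. The field names (videorectangle type, keyframe sequence) are illustrative assumptions about the import format, and the model name is hypothetical.

```python
# Sketch of one pre-annotated video region, assuming a "videorectangle"
# result type with a keyframe sequence. Field names are illustrative
# and may differ in the released import format.
prediction = {
    "model_version": "runner-detector-v1",  # hypothetical model name
    "result": [
        {
            "from_name": "videoLabels",
            "to_name": "video",
            "type": "videorectangle",
            "value": {
                "labels": ["Woman"],
                "sequence": [
                    # One entry per keyframe; frames in between are interpolated.
                    # "enabled": False ends the lifespan (e.g. occlusion).
                    {"frame": 1,  "x": 10, "y": 20, "width": 5, "height": 12, "enabled": True},
                    {"frame": 90, "x": 40, "y": 22, "width": 5, "height": 12, "enabled": False},
                ],
            },
        }
    ],
}
```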
Sometimes, objects are briefly occluded—for example, a runner disappearing behind a display. The model turns off the region’s lifespan during occlusion and reactivates it when the object reappears. You can do this manually as well.
The right panel lists all regions, just like in image annotation. You can enter full-screen mode, scroll through the timeline, and use the overview progress bar to track changes and annotations across the video.
Michael, do we have any questions?
Michael
Yes. One person asked how interpolation works between keyframes. Is any ML applied?
Dan
No ML is involved in the interpolation itself. It’s simple math—Label Studio linearly interpolates the region’s position and size between keyframes. If the object doesn’t follow a straight line, you’ll need to add more keyframes to keep the tracking accurate.
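The linear interpolation Dan describes can be sketched in a few lines of Python. This is a minimal illustration of the idea, not Label Studio's actual implementation; the keyframe dict shape is assumed for the example.

```python
def interpolate_region(kf_a, kf_b, frame):
    """Linearly interpolate a bounding box between two keyframes.

    kf_a and kf_b are dicts with "frame", "x", "y", "width", "height".
    frame must satisfy kf_a["frame"] <= frame <= kf_b["frame"].
    """
    span = kf_b["frame"] - kf_a["frame"]
    t = (frame - kf_a["frame"]) / span if span else 0.0
    # Each geometric property moves in a straight line between keyframes.
    return {
        key: kf_a[key] + t * (kf_b[key] - kf_a[key])
        for key in ("x", "y", "width", "height")
    }

# Halfway between frames 10 and 20, the box sits midway between the keyframes.
a = {"frame": 10, "x": 0.0, "y": 0.0, "width": 10.0, "height": 10.0}
b = {"frame": 20, "x": 20.0, "y": 10.0, "width": 10.0, "height": 10.0}
mid = interpolate_region(a, b, 15)
```

This also makes clear why curved motion needs extra keyframes: the interpolated path is always a straight line between the two nearest keyframes.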
Michael
And is there a way to modify the interpolation algorithm?
Dan
Yes. You can replace the front-end callback responsible for interpolation and use your own logic or model-based tracking.
Before we continue, I want to show a quick preview of what the video annotation interface will look like soon. Michael, can you play the video?
[Preview Video Audio – Narration Summary]
The upcoming video annotation interface introduces new video controls for playback, frame navigation, and clip timing. Keyframes and lifespans are shown in the timeline, with automatic interpolation between frames.
Two new panels are being added:
The Outliner, for managing and grouping regions
The Details Panel, for adding metadata and seeing keyframe/frame-specific info
These features aim to streamline video annotation across use cases.
Dan
So that’s where we’re headed. The video annotation feature is currently in beta and will soon support new interface elements like the Outliner and Details Panel. The timeline is a key part of this new functionality.
Join our Slack community—we’ll be sharing the preview video there and collecting feedback. Your input helps us tailor the experience to your needs.
Michael
Let’s take a few more questions.
Someone asked about syncing video annotation with time series data. How do you plan to support that?
Dan
That’s part of what we’re working on. Label Studio is known for being highly customizable. From an interface perspective, it’s possible to synchronize video with time series playback. Max, want to comment on the technical side?
Max
Yes, we’ve introduced a synchronization layer we call the “sync bus.” It will allow us to sync video with other sources like audio and time series. Right now, it’s possible using a somewhat hacky template, but a more integrated version is on the way.
Michael
Another question came in about object tracking. Could you integrate real-time tracking algorithms like Boosting, MIL, or KCF instead of keyframes?
Dan
Those are valid approaches. While Label Studio doesn’t run object tracking in real time on the front end, you can absolutely use those models on the backend to generate pre-annotations. It’s more about integrating that ML pipeline into the system. We’re open to feedback and feature requests—please post in the #feature-requests Slack channel.
Michael
Here’s another good one: Can I label long videos—like 2 to 3 hours?
Dan
We’ve tested up to 30-minute videos so far. Two to three hours should be theoretically possible, but performance depends on your browser and system memory. We’ll continue testing and improving scalability.
Michael
Any updates on the release timeline?
Dan
Video annotation will roll out to Enterprise users next week and to open source within a month.
Michael
Great! That’s all for today. Thank you to everyone who joined. A special thanks to Dan and Max for the demo and Q&A. We’ll be sending the recording to those who couldn’t attend live—and yes, we know you’ve been asking!
See you next time.
Dan
Thanks, everyone! Please share feedback—we’d love to hear from you.
Max
Take care, all!