Neural Search with Jina AI & Label Studio
Join us for a conversation about open source neural search featuring Alex CG, Developer Relations Lead at Jina AI (maintainers of the Jina open source neural search framework), hosted by Michael Ludden, Head of Community at Heartex (maintainers of the Label Studio open source project).
Transcript
Michael: We’re live—thanks everyone for joining. Today we’re lucky to have the Developer Relations Lead from Jina AI, an open source neural search framework. Alex CG is here with us. Is that how you introduce yourself, by the way—Alex CG?
Alex: Yeah, my parents are terrible people. They gave me a really long double-barreled surname. I sound really British, but I’m not. I just go with CG.
Michael: Nice. What’s the full version?
Alex: Cureton-Griffiths. A French-Welsh combo. Two great tastes that taste terrible together.
Michael: No comment on that one. Alright, a couple things to run through before we dive in. I think most of the audience today is from the Label Studio community, but there may be some folks here from Jina AI too. If you’re not already in our Slack channel, definitely join—it’s a great place to ask questions and connect. The engineering team is incredibly responsive. We’re big on community, so it’s a good idea to hop in. Also, check out our webinars page—upcoming events are listed there, and you can RSVP to get calendar invites. We’re adding more details and more sessions as we go, and working on planning them further in advance. You can also find YouTube replays of past sessions. Just a note: these live stream links stop working after the stream ends. We edit the intros and outros before re-uploading, so check the webinars page for the updated links.
Obviously I’ll let you introduce your tool, but if folks want to follow along, you can visit get.jina.ai to go to the GitHub repo. And if anyone isn’t familiar with Label Studio, here are a few quick links. Label Studio is an open source data labeling tool that lets you annotate pretty much any data type—and more are on the way. You can try it at labelstud.io. We’ve also got a playground where you can test out different templates, and a Substack newsletter at labelstudio.substack.com. Our GitHub repo is also linked from the main site. That’s it for my plugs. Alex, take it away.
Alex: Thanks, Michael. As mentioned, I’m the Developer Relations Lead at Jina AI, a startup based in Berlin—Germany, not the one in New Hampshire or wherever else there’s another Berlin. We provide an open source neural search framework, and today I’ll walk through what it does, how it works, and show some code examples. Let me share my screen and we’ll go from there.
Michael: Yep, I can see it. Cute little astronauts and whatnot.
Alex: Yeah, this is a deck I’ve presented many times. Haven’t had time to create something custom, but it really dives into what Jina is. So—Jina is all about search, solving problems with search. This version of the deck is set up for a workshop, hence the word "workshop," and the sample project is building a search engine for memes. We’ll look at how the problem is framed, what Jina does, and if we have time, we’ll look at some of the code. Don’t worry if you don’t have a background in deep learning. I majored in Chinese at university. Everything I know about Python and AI came from YouTube and a lot of trial and error.
Let’s say we want to build a meme search engine—for example, to find memes about Winnie the Pooh and food. The system pulls up things like "finger food" and "cheese puffs," and even if the exact word "food" isn’t in the meme, it still finds related content. That’s because the model has been pretrained on massive datasets and understands that cereal, breakfast, and soup are food-related concepts. You can also search by keyword—like “chocolate”—and it’ll return relevant memes with different templates.
We built an actual demo of this at examples.jina.ai, but I think our AWS backend is down again. So typical demo effect. I’ll try to get that fixed tomorrow.
What I built there is a search engine that lets you type in text and find memes with similar text—or upload an image and find memes with visually similar images. That’s what Jina is really good at: searching across data types. When most people think of search, they think of a text box—Google, Twitter, Stack Overflow. But search today is so much more. Uploading an image to find products? That’s search. Asking a chatbot a question? That’s search too. Tinder? Arguably a form of search.
Neural search means using deep learning models to generate embeddings—mathematical representations of content—and then comparing those embeddings to find similarities. Take a model like CLIP, feed it images of Bulbasaur or Charmander, and it learns patterns: curves, colors, features. Jina lets you do similarity search on those embeddings, so if you upload a picture of Bulbasaur, it finds other images close to that in embedding space. You don’t need to know how the model works internally—it just works.
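For readers following along, here is a minimal sketch of that embed-and-compare loop, independent of Jina itself. It assumes the sentence-transformers package and its public clip-ViT-B-32 checkpoint; the image files and query text are placeholders.

```python
# Embedding-based similarity search in miniature, assuming the
# sentence-transformers package; image paths are placeholders.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# CLIP maps text and images into the same embedding space.
model = SentenceTransformer("clip-ViT-B-32")

# Embed a tiny "index" of images.
image_paths = ["bulbasaur.png", "charmander.png", "pikachu.png"]
image_embeddings = model.encode([Image.open(p) for p in image_paths])

# Embed a query; this could just as easily be another image.
query_embedding = model.encode("a small green dinosaur with a plant bulb")

# Cosine similarity ranks the index; the highest score is the best match.
scores = util.cos_sim(query_embedding, image_embeddings)[0]
best = int(scores.argmax())
print(f"closest match: {image_paths[best]} (score={scores[best].item():.3f})")
```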
One of the nice things about this is that it’s semantic. It understands meaning. The same way a human sees a round yellow shape and knows it’s Pikachu, the model can do the same. And it’s data-type agnostic. You can search across images, text, audio, even 3D meshes or proteins. And if you want to switch languages, it’s as simple as swapping in a French model instead of an English one.
But integrating all this—models, data, infrastructure—is hard. That’s where Jina comes in. We simplify the infrastructure and let you focus on your application. Our core concepts are: documents (the basic unit of data), executors (operations on data), and flows (chains of executors). You can break down a book into pages, encode each page, and index them for fast search.
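As a rough illustration of those three concepts in code, the sketch below assumes the Jina 2.x Python API (call signatures shift between releases); the toy executor and page texts are made up for the example.

```python
# Documents, executors, and flows in miniature, assuming the Jina 2.x API.
from jina import Document, DocumentArray, Executor, Flow, requests

# Documents: the basic unit of data -- here, one Document per book page.
pages = DocumentArray(Document(text=p) for p in ["page one...", "page two..."])

# Executors: operations on Documents. This toy one just uppercases text.
class Shout(Executor):
    @requests
    def shout(self, docs: DocumentArray, **kwargs):
        for doc in docs:
            doc.text = doc.text.upper()

# Flows: chains of executors that Documents stream through.
f = Flow().add(uses=Shout)

with f:
    f.post("/", inputs=pages, on_done=lambda resp: print(resp.docs[0].text))
```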
We’ve also launched Jina Hub. Think of it like apt-get or Homebrew for neural search. You can install prebuilt executors like a CLIP encoder, sentence splitter, or simple indexer. Just reference them with jinahub://, and it handles the rest; no coding required. These run in Docker and are easy to chain together.
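A Flow that chains prebuilt Hub executors might look like the sketch below; CLIPImageEncoder and SimpleIndexer are named as examples of published Hub executors, and the jinahub+docker:// scheme pulls and runs each one as a container.

```python
# Chaining Hub executors, assuming these names are published on Jina Hub.
from jina import Flow

f = (
    Flow()
    .add(uses="jinahub+docker://CLIPImageEncoder")  # encode images with CLIP
    .add(uses="jinahub+docker://SimpleIndexer")     # store embeddings for lookup
)
```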
If this were a full workshop, I’d take it slower. But we also have a Colab notebook that walks through building a meme search engine. It pulls in a dataset, creates documents, encodes them, and builds an index. You can query it with simple text like "school" and get memes back that match that concept—even if they don’t use the word "school" explicitly. The model infers meaning.
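The query step from that walk-through looks roughly like this; it assumes the Flow f from the sketch above has already indexed the memes, and the score key depends on which indexer is used.

```python
# Query sketch: send plain text through the Flow and read back the matches.
# Assumes 'f' is the index/search Flow from the previous sketch; call
# signatures vary across Jina releases, so treat this as illustrative.
from jina import Document

with f:
    responses = f.search(Document(text="school"), return_results=True)
    for match in responses[0].docs[0].matches:
        print(match.uri, match.scores["cosine"].value)
```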
All of the code is open source. It’s on my GitHub at github.com/alexcg1, including the full meme search example.
Michael: That was great. Super interesting and fun. Can we go back to that slide listing what Jina can search—text, images, audio, video, 3D mesh, and proteins?
Alex: Sure.
Michael: First of all: proteins?
Alex: Yeah, we worked with Major League Hacking fellows who started building a search engine for amino acids. Hugging Face has a model that encodes protein sequences—basically long strings of characters—and lets you find similar ones.
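As a sketch of that idea: Rostlab/prot_bert is one public Hugging Face checkpoint for protein sequences; its tokenizer expects space-separated amino acids, and the sequence below is a toy example.

```python
# Embedding a protein sequence with a public Hugging Face checkpoint.
# 'Rostlab/prot_bert' expects amino acids separated by spaces; the
# sequence here is a toy example, not real data.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertModel.from_pretrained("Rostlab/prot_bert")

sequence = " ".join("MKTAYIAKQR")  # "M K T A Y I A K Q R"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    # Mean-pool token embeddings into one vector for the whole sequence.
    embedding = model(**inputs).last_hidden_state.mean(dim=1)

# The resulting vector can be compared to other sequences with cosine similarity.
print(embedding.shape)
```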
Michael: Okay, that’s amazing. I also wanted to ask about 3D meshes. How does that work?
Alex: Good question. We’re using models that understand 3D geometry—not just 2D bitmaps. So when you submit a mesh, it doesn’t just analyze an image from a single angle. It can compare the structure and shape of objects in 3D space. This is especially useful for game asset search, where you might want to find similar models across large libraries.
Michael: What about video?
Alex: For video, it depends. You might break videos into scenes, extract representative frames, and then index those. Or you could search videos by uploading a related image and finding similar scenes. Some models let you search across modalities too. CLIP, for example, puts text and images in the same embedding space, and extensions like AudioCLIP add audio to that space.
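A simple version of the scene-frame approach might look like the sketch below, which uses OpenCV to sample frames; the file name and sampling interval are placeholders, and each sampled frame would then go through the same image encoder as any other picture.

```python
# Sampling representative frames from a video with OpenCV; the path
# and interval are placeholders for whatever your pipeline uses.
import cv2

def sample_frames(path: str, every_n: int = 30):
    """Yield every n-th frame of a video as a numpy array."""
    cap = cv2.VideoCapture(path)
    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            yield frame
        i += 1
    cap.release()

# Each sampled frame can be embedded like any other image and indexed.
frames = list(sample_frames("meme_compilation.mp4"))
```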
Michael: So theoretically, I could type in the word “chainsaw” and get back an image of a chainsaw, the sound of a chainsaw, or a video of someone using one?
Alex: Exactly. As long as the models you're using share the same embedding space, it’s totally doable. It’s not magic—but it’s getting close.
Michael: Okay, final question. I studied musical theater at UCLA. I’ve always wanted a search engine where I can hum a melody and it tells me what song it is. Why doesn’t that exist yet?
Alex: It’s technically possible, but you’d need a model trained on melodic patterns. If you’re humming the melody, it needs to isolate and recognize that sequence—and that’s hard. But if a model exists that can do that, you could absolutely plug it into Jina.
Michael: Amazing. I hope someone builds that. Anyway, this has been a fantastic presentation—thank you, Alex.
Alex: Thank you! If folks want to try it out, go to get.jina.ai, and if you want to reach me, I’m @alexcg on Twitter or on the Jina Slack at slack.jina.ai. Would love for you to play around with it.
Michael: Perfect. And if you’re looking for a way to label and evaluate your data—whether it’s RAG, model comparisons, or fine-tuning—you can check out Label Studio. Thanks again, everyone!