spaCy
Overview
spaCy is an open-source natural language processing library widely used for Named Entity Recognition (NER), part-of-speech tagging, and syntactic parsing. When connected to Label Studio, it enables teams to accelerate NER labeling by automatically detecting entities in text, generating predictions that annotators can quickly review and correct.
How it connects to Label Studio
spaCy integrates with Label Studio through the Label Studio ML Backend. You can run the spacy example service included with the ML Backend to provide online pre-labeling: Label Studio sends text data to the backend, which runs a spaCy model to identify entities and returns them as pre-annotations in the labeling interface. Annotators can then refine those suggestions directly in the UI.
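The request and response flow described above can be sketched roughly as follows. This is a minimal illustration, not the code of the bundled example service: the function names `spans_to_prediction` and `predict_tasks` are hypothetical, and the `from_name`/`to_name` values ("label" and "text") and the `text` data key are assumptions that must match your own labeling configuration. The entity finder is passed in as a plain callable so the sketch runs without spaCy installed.

```python
from typing import Any, Callable, Dict, List, Tuple

# (start_char, end_char, label) triples. With spaCy these could be built from
# doc.ents as (ent.start_char, ent.end_char, ent.label_).
Span = Tuple[int, int, str]


def spans_to_prediction(
    text: str,
    spans: List[Span],
    from_name: str = "label",  # assumed name of the <Labels> tag in the config
    to_name: str = "text",     # assumed name of the <Text> tag in the config
    model_version: str = "spacy-sketch",  # hypothetical version string
) -> Dict[str, Any]:
    """Convert character-offset entity spans into one Label Studio prediction."""
    result = [
        {
            "from_name": from_name,
            "to_name": to_name,
            "type": "labels",
            "value": {
                "start": start,
                "end": end,
                "text": text[start:end],
                "labels": [label],
            },
        }
        for start, end, label in spans
    ]
    return {"model_version": model_version, "result": result}


def predict_tasks(
    tasks: List[Dict[str, Any]],
    find_entities: Callable[[str], List[Span]],
    text_key: str = "text",  # assumed key under task["data"]
) -> List[Dict[str, Any]]:
    """Return one prediction per incoming task, in the same order."""
    predictions = []
    for task in tasks:
        text = task["data"][text_key]
        predictions.append(spans_to_prediction(text, find_entities(text)))
    return predictions
```

With spaCy installed, `find_entities` could be something like `lambda text: [(e.start_char, e.end_char, e.label_) for e in nlp(text).ents]`, where `nlp` is a loaded pipeline such as `spacy.load("en_core_web_sm")`.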
You can also use spaCy and Label Studio together offline:
- Import predictions generated by spaCy into Label Studio for review and correction.
- Export annotations from Label Studio in a spaCy-compatible format to continue model training.
This bidirectional setup lets teams use Label Studio as both an annotation environment and a feedback loop for improving spaCy models.
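As a rough sketch of the offline round trip, the two helpers below build an importable task that carries spaCy-style spans as a prediction, and convert a Label Studio JSON export back into `(text, {"entities": [...]})` tuples. The helper names are hypothetical, the `"label"`/`"text"` tag names and the `text` data key are assumptions about the labeling configuration, and the export handling assumes the standard JSON export layout with an `annotations` list per task.

```python
from typing import Any, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label)


def to_import_task(text: str, spans: List[Span]) -> Dict[str, Any]:
    """Wrap spans as a Label Studio task with a single prediction attached."""
    result = [
        {
            "from_name": "label",  # assumed <Labels> tag name
            "to_name": "text",     # assumed <Text> tag name
            "type": "labels",
            "value": {"start": s, "end": e, "text": text[s:e], "labels": [lab]},
        }
        for s, e, lab in spans
    ]
    return {"data": {"text": text}, "predictions": [{"result": result}]}


def export_to_spacy(
    tasks: List[Dict[str, Any]],
) -> List[Tuple[str, Dict[str, List[Span]]]]:
    """Turn exported tasks into (text, {"entities": [...]}) training tuples."""
    examples = []
    for task in tasks:
        text = task["data"]["text"]
        # Use the first annotation that was not skipped by the annotator.
        kept = [a for a in task.get("annotations", []) if not a.get("was_cancelled")]
        if not kept:
            continue
        spans = [
            (r["value"]["start"], r["value"]["end"], label)
            for r in kept[0]["result"]
            if r.get("type") == "labels"
            for label in r["value"]["labels"]
        ]
        examples.append((text, {"entities": sorted(spans)}))
    return examples
```

A list of `to_import_task` outputs can be written to a JSON file and imported into a project. On the way back, each tuple from `export_to_spacy` can be turned into a spaCy v3 training example with `Example.from_dict(nlp.make_doc(text), annotations)`. Keeping only the first non-skipped annotation is a simplification, so projects with several annotators per task would need their own agreement rule.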