Labelling Functions
What are labelling functions and how do you create them?

What are Labelling Functions?

A labelling function is a heuristic that takes a datapoint and either returns a label or abstains (returns nothing). For sequence tagging tasks, you return a span of text where you think the label should be applied.
Importantly, labelling functions don't need to be 100% accurate or label every datapoint. They can be quite noisy, and several labelling functions can label the same datapoint. Later, once all of the labelling functions have been applied, you train a Label Model that works out the most likely gold label for each datapoint.
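As a minimal sketch of the "label or abstain" idea, here is a labelling function for a simple classification task. It is plain Python rather than the Programmatic API, and the function name, the label "COMPLAINT", and the keyword list are hypothetical:

```python
from typing import Optional

# Hypothetical label name and keyword list, for illustration only.
COMPLAINT_KEYWORDS = {"refund", "broken", "disappointed"}


def lf_complaint_keywords(text: str) -> Optional[str]:
    """Return "COMPLAINT" if a keyword is present, otherwise abstain."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & COMPLAINT_KEYWORDS:
        return "COMPLAINT"
    return None  # abstain: this heuristic has no opinion


texts = [
    "The kettle arrived broken.",
    "Great service, thank you!",
]
labels = [lf_complaint_keywords(t) for t in texts]
```

The function labels the first text and abstains on the second, which is fine: other labelling functions may cover the datapoints this one misses.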

Example types of Labelling Functions

Some common types of labelling functions are:
  • Keyword searches: looking for specific words in a sentence
  • Pattern matching (e.g. regex): looking for specific syntactic patterns
  • Third-party models: using a pre-trained model (or a model for a different task)
  • Distant supervision: using external knowledge bases or dictionaries
  • Human-generated labels: actual human labels are also a noisy guess at the correct label and can be included.
You can find templates for these in the Programmatic interface. In your code box, just click the Templates button.
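To make the first two types concrete, the sketch below shows a keyword search and a regex pattern match that each return character spans. It does not reproduce the Programmatic templates. The keyword list, the price pattern, and the function names are assumptions chosen for illustration, and spans are represented as plain (start_character, end_character) tuples:

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]  # (start_character, end_character)

# Hypothetical keyword list and pattern, for illustration only.
CURRENCY_KEYWORDS = ["dollars", "euros", "pounds"]
PRICE_PATTERN = re.compile(r"[$€£]\d+(?:\.\d{2})?")


def keyword_spans(text: str) -> List[Span]:
    """Keyword search: one span per whole-word occurrence of a keyword."""
    spans = []
    for keyword in CURRENCY_KEYWORDS:
        pattern = rf"\b{re.escape(keyword)}\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            spans.append(match.span())
    return sorted(spans)


def price_spans(text: str) -> List[Span]:
    """Pattern matching: one span per price-like string such as $4.99."""
    return [match.span() for match in PRICE_PATTERN.finditer(text)]


text = "It costs $4.99, or about five dollars."
price_matches = [text[start:end] for start, end in price_spans(text)]
keyword_matches = [text[start:end] for start, end in keyword_spans(text)]
```

Slicing the text with the returned spans recovers "$4.99" for the pattern function and "dollars" for the keyword function. Neither heuristic is complete on its own, which is the point of combining several of them.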

Labelling Functions in Programmatic

In Programmatic, a labelling function takes a row from your dataframe as input and returns a list of spans as output. A span is a tuple of (start_character, end_character). You can access any of the attributes of a row using dot notation:
  • row.text will get the text
  • row.doc will get the spaCy Doc
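For example, the following labelling function tags every token that appears in a product list:
from typing import List

import pandas as pd

# Assumes the product names are in the first column of the CSV.
PRODUCT_LIST = set(pd.read_csv("../product_list.csv").iloc[:, 0])

def check_product_list(row: datapoint) -> List[Span]:
    # datapoint and Span are assumed to be provided by Programmatic.
    spans = []
    for word in row.doc:  # iterate over the spaCy tokens
        if word.text in PRODUCT_LIST:
            # word.idx is the token's character offset in row.text
            spans.append(Span(start=word.idx, end=word.idx + len(word.text)))
    return spans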
The Programmatic interface is designed to make the following workflow as simple as possible:
  1. Write an initial version of a labelling function
  2. Spot check its performance by looking at its output on datapoints in the training set
     This will be shown in the tabs on the right side. "Labelling function results" shows you all the results. If you have any ground truth labels, then the "Matched ground truths" and "Missed ground truths" tabs will show you where you are going right and wrong.
  3. Look at the estimated precision and recall of each labelling function
  4. Refine and debug to improve coverage or accuracy as necessary
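The text does not say how Programmatic estimates precision and recall. As a rough sketch of what those two numbers mean for a span-returning labelling function, the example below assumes you have some ground truth spans and counts a predicted span as correct only if it matches a ground truth span exactly. The function name and the toy data are hypothetical:

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start_character, end_character)


def precision_recall(
    predicted: List[Set[Span]], gold: List[Set[Span]]
) -> Tuple[float, float]:
    """Exact-match precision and recall, pooled over all datapoints."""
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    n_correct = sum(len(p & g) for p, g in zip(predicted, gold))
    precision = n_correct / n_predicted if n_predicted else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    return precision, recall


# Toy data: one set of spans per datapoint.
predicted = [{(0, 5)}, {(3, 8), (10, 14)}, set()]
gold = [{(0, 5)}, {(3, 8)}, {(2, 6)}]
precision, recall = precision_recall(predicted, gold)
```

Here two of the three predicted spans are correct and two of the three ground truth spans are found, so precision and recall are both 2/3. The extra span on the second datapoint lowers precision, and the abstention on the third lowers recall.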
The goal of developing labelling functions is to produce a high-quality training dataset that can then be used to train a traditional machine learning model. Ideally, for each label that you want to apply, you would have a handful of labelling functions that overlap a bit. In the best case, your labelling functions would be relatively independent of each other.

What makes a good labelling function?

You don't need to have perfect labelling functions, but you should strive to have a handful of high-precision (>50%), high-recall labelling functions for each tag that you care about.
In general, lots of unrelated overlapping functions are better than only one or two very good functions.
How detrimental bad labelling functions are depends a lot on what makes them “bad”. A very high-coverage labelling function with low precision that doesn't overlap much with other labelling functions is much worse than a high-precision, low-coverage labelling function.
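Coverage and overlap can be made concrete with a small sketch. It assumes each labelling function's output is stored as one entry per datapoint, with None for an abstention. The function names and the votes are hypothetical:

```python
from typing import Dict, List, Optional

# Hypothetical votes: one list per labelling function, one entry per
# datapoint; None means the function abstained.
votes: Dict[str, List[Optional[str]]] = {
    "lf_keywords": ["A", None, "A", None],
    "lf_regex": ["A", "B", None, None],
}


def coverage(lf_votes: List[Optional[str]]) -> float:
    """Fraction of datapoints on which the function returns a label."""
    return sum(v is not None for v in lf_votes) / len(lf_votes)


def overlap(a: List[Optional[str]], b: List[Optional[str]]) -> float:
    """Fraction of datapoints that both functions label."""
    both = sum(x is not None and y is not None for x, y in zip(a, b))
    return both / len(a)


keyword_coverage = coverage(votes["lf_keywords"])
pair_overlap = overlap(votes["lf_keywords"], votes["lf_regex"])
```

In this toy data each function covers half of the datapoints, but they overlap on only a quarter of them, so on most labelled datapoints a single function's vote goes unchecked.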
If you do have a high-coverage labelling function with low precision, then it's good to make sure you have a few other labelling functions that overlap with it. A useful baseline model to have in mind is to imagine what would happen if the labels were decided just by majority vote.
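The majority-vote baseline can be sketched in a few lines. This is only the mental model described above, not the Label Model that Programmatic trains. The labels and the three labelling functions behind each row are hypothetical:

```python
from collections import Counter
from typing import List, Optional


def majority_vote(votes: List[Optional[str]]) -> Optional[str]:
    """Most common non-abstaining label; None on a tie or if all abstain."""
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no majority
    return ranked[0][0]


# Each row holds the votes of three hypothetical labelling functions
# (one noisy high-coverage function, then two more precise ones)
# for a single datapoint.
datapoint_votes = [
    ["PRODUCT", "PRODUCT", None],   # the functions agree
    ["PRODUCT", None, None],        # only the noisy function fires
    ["PRODUCT", "OTHER", "OTHER"],  # the noisy function is outvoted
]
resolved = [majority_vote(v) for v in datapoint_votes]
```

In the second row the noisy function decides the label on its own, right or wrong. In the third row the overlapping functions outvote it, which is why overlap matters most for high-coverage, low-precision functions.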