What is Programmatic?
Quickly build high quality datasets for NLP without manual annotation.
Humanloop Programmatic is a pip-installable app for rapidly building large weakly-supervised datasets. That means you can use rules created by experts, external knowledge bases, or even other models to automatically create large amounts of training data for downstream applications, and do it quickly.
NLP engineers and data scientists use programmatic labelling to automatically build high quality datasets in hours.
Go to our quick start to get going immediately or read on for more info.
Walkthrough of Programmatic

We built Programmatic to make getting data for NLP much easier

Getting annotated data is often one of the biggest bottlenecks for machine learning. For NLP projects, this can be even harder because annotation often needs domain expertise, or you might be working with private data. Either constraint makes it hard to outsource annotation.
Whilst onboarding customers to the Humanloop platform, we saw again and again that many teams had domain knowledge or rule-based systems that could usefully augment manual annotation.
We've built Programmatic from the ground up to help you supercharge your datasets. Everything in the app has been crafted to make the feedback near-instantaneous so you can rapidly iterate on rules and label faster than ever before.

How does it work?

With Programmatic you replace or augment hand-labelled data with approximate labels generated by rules, known as labelling functions. These rules can be simple, like checking for the presence of a keyword, or more complex, like using a different model. The clever part of programmatic labelling is what happens after you've created your labelling functions.
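To make the idea of a rule concrete, here is a minimal sketch in plain Python of two keyword-based labelling functions for a toy spam task. It does not use Programmatic's own interface: the label names, the `ABSTAIN` convention, and the function names are all assumptions made up for illustration.

```python
import re
from typing import Callable, List, Optional

# Hypothetical labels for a toy spam task (illustrative only).
SPAM = "spam"
NOT_SPAM = "not_spam"
ABSTAIN = None  # assumed convention: a rule that does not fire gives no label

Rule = Callable[[str], Optional[str]]


def mentions_free(text: str) -> Optional[str]:
    """Keyword rule: vote SPAM if the whole word 'free' appears."""
    if re.search(r"\bfree\b", text, flags=re.IGNORECASE):
        return SPAM
    return ABSTAIN


def mentions_meeting(text: str) -> Optional[str]:
    """Keyword rule: vote NOT_SPAM if the text mentions a meeting."""
    if "meeting" in text.lower():
        return NOT_SPAM
    return ABSTAIN


def apply_rules(texts: List[str], rules: List[Rule]) -> List[List[Optional[str]]]:
    """Return one row per text, with one vote (or None) per rule."""
    return [[rule(text) for rule in rules] for text in texts]


RULES: List[Rule] = [mentions_free, mentions_meeting]
```

Each rule is cheap to write and is allowed to be wrong or to abstain. The resulting grid of votes is noisy, which is why a combining step is needed afterwards.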
Programmatic uses a de-noising model (sometimes called a "label model") to combine all of the programmatically generated labels you produce into a single cleaned-up dataset. You can find more details on how this works in our docs, in this blog, or through other packages like Snorkel and skweak. You don't need to understand the de-noising, the maths, or the background libraries to benefit from the tool, though.
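For intuition only, the simplest way to combine several noisy votes into one label is a majority vote. The sketch below is a deliberately simple stand-in, not the de-noising model that Programmatic uses. Label models in this family typically go further and try to account for how reliable each rule is. The function name and the use of `None` for "no vote" are assumptions for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(votes: Sequence[Optional[str]]) -> Optional[str]:
    """Combine one example's rule votes into a single label.

    None means a rule abstained. Returns None when every rule
    abstained or when the top two labels are tied.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None  # no rule fired for this example
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two most common labels
    return ranked[0][0]
```

A majority vote treats every rule as equally trustworthy, which is exactly the limitation a label model is meant to address.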