All Classes and Interfaces
Class
Description
A feature extractor that combines multiple feature extractors into one.
A composite evaluator that logs iteration statistics along with both instance and token accuracy metrics on test data.
Configuration settings for ConllOutputEvaluator.
Builder for constructing ConllOutputConfiguration instances.
An evaluator that outputs predicted and actual labels in CoNLL format.
A conditional random field (CRF) tagger that assigns tags to tokens in an input string.
Interface for training Conditional Random Field (CRF) models.
Extracts features from tokens within a sequence for use in CRF model training and tagging.
A positioned token that includes extracted features.
A sequence of tokens with extracted features.
A positioned token that includes both extracted features and a training label.
A sequence of tokens with both extracted features and training labels.
A sequence of tokens representing user input for tagging.
Exception thrown when input provided to a preprocessing component is invalid.
A feature extractor that generates features based on the length of a sequence.
Builder for LengthFeatureExtractor.
A MALLET-based implementation of CrfTagger that uses a pre-trained CRF model to tag input sequences.
A CRF trainer implementation using the MALLET (MAchine Learning for LanguagE Toolkit) library.
Configuration settings for MalletCrfTrainer.
Builder for constructing MalletCrfTrainerConfiguration instances.
Configuration settings for ModelOutputEvaluator.
Builder for constructing ModelOutputConfiguration instances.
An evaluator that writes the current transducer (model) to file using Java serialization.
A feature extractor that matches tokens against a regex pattern.
Builder for PatternMatchingFeatureExtractor.
A token with its position within a sequence.
A feature extractor that generates features based on token position within a sequence.
Builder for PositionFeatureExtractor.
Represents an ordered sequence of positioned tokens.
Utility methods for serializing and deserializing objects to and from files.
A container for training and test data splits.
A tag provider for string tags with a defined tag set and starting tag.
A feature extractor that generates features based on prefixes or suffixes of tokens.
Builder for SubstringFeatureExtractor.
A positioned token that includes the scores for each tag.
A sequence of tokens that have been tagged by a CRF model with associated features and tag scores.
Provides tag information for CRF models, including conversion, enumeration, and the starting state.
A tag with its associated confidence score.
Converts raw text input into a sequence of tokens.
Reads training data and converts it into a stream of training sequences.
A positioned token that includes a training label.
Generates schema definitions for validating CRF training data.
A sequence of tokens with training labels.
A container for training and test data splits.
A feature extractor that applies a transformation function to each token to produce features.
Unchecked exception thrown when an error occurs during conditional random field (CRF) operations.
Specifies the weight storage strategy for CRF training.
A tokenizer that splits input on whitespace characters.
A feature extractor that extracts features from the current token and neighboring tokens within a window.
Builder for constructing WindowFeatureExtractor instances.
A function that maps a feature to a new feature based on its relative position.
Reads and writes CRF training data in XML format, and generates XSD schemas for tag validation.
A feature extractor that checks whether tokens are present in a set of values loaded from an XML file using XPath expressions.
Builder for XPathFeatureExtractor.
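
Several entries above describe a preprocessing flow: a tokenizer splits input on whitespace, and a window-based extractor derives features from the current token and its neighbors, labeled by relative position. The following is a minimal, self-contained sketch of that idea only. It does not use this library's API: the class WindowFeatureSketch, the methods tokenize and windowFeatures, and the "w[offset]=token" feature naming are all hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from the library.
public class WindowFeatureSketch {

    /** Splits input on runs of whitespace; blank input yields an empty list. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /**
     * Builds features for the token at index i from every token within
     * `window` positions on either side, labeling each with its relative
     * offset. Offsets that fall outside the sequence are skipped.
     */
    static List<String> windowFeatures(List<String> tokens, int i, int window) {
        List<String> features = new ArrayList<>();
        for (int offset = -window; offset <= window; offset++) {
            int j = i + offset;
            if (j >= 0 && j < tokens.size()) {
                // Assumed naming scheme: w[relative offset]=token text
                features.add("w[" + offset + "]=" + tokens.get(j));
            }
        }
        return features;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("2 cups  flour");
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.get(i) + " -> " + windowFeatures(tokens, i, 1));
        }
    }
}
```

In the library itself, the equivalent behavior would presumably be configured through the builders listed above (for example the builder for WindowFeatureExtractor); consult each class's own page for the actual method names and options.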