Index
A
- apply(F, int) - Method in interface org.coordinatekit.crf.core.preprocessing.WindowFeatureMapper
-
Applies this mapper to transform a feature based on its relative position.
B
- build() - Method in class org.coordinatekit.crf.core.preprocessing.LengthFeatureExtractor.Builder
-
Builds the feature extractor.
- build() - Method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor.Builder
-
Builds the feature extractor.
- build() - Method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor.Builder
-
Builds the feature extractor.
- build() - Method in class org.coordinatekit.crf.core.preprocessing.SubstringFeatureExtractor.Builder
-
Builds the feature extractor.
- build() - Method in class org.coordinatekit.crf.core.preprocessing.WindowFeatureExtractor.Builder
-
Builds a new WindowFeatureExtractor with the configured parameters.
- build() - Method in class org.coordinatekit.crf.core.preprocessing.XPathFeatureExtractor.Builder
-
Builds the feature extractor.
- build() - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration.Builder
-
Builds the configuration with the current settings.
- build() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Builds the configuration with the current settings.
- build() - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration.Builder
-
Builds the configuration with the current settings.
- builder() - Static method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor
-
Creates a new builder for PositionFeatureExtractor.
- builder() - Static method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration
-
Returns a new ConllOutputConfiguration.Builder instance for constructing a configuration.
- builder() - Static method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns a new MalletCrfTrainerConfiguration.Builder instance for constructing a configuration.
- builder() - Static method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration
-
Returns a new ModelOutputConfiguration.Builder instance for constructing a configuration.
- builder(int) - Static method in class org.coordinatekit.crf.core.preprocessing.LengthFeatureExtractor
-
Creates a new builder with the specified length upper limit.
- builder(InputStream, String) - Static method in class org.coordinatekit.crf.core.preprocessing.XPathFeatureExtractor
-
Creates a new builder with the specified XML input stream and XPath expression.
- builder(String) - Static method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor
-
Creates a new builder with the specified pattern string.
- builder(String, boolean) - Static method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor
-
Creates a new builder with the specified pattern string and case sensitivity.
- builder(Function<String, F>) - Static method in class org.coordinatekit.crf.core.preprocessing.SubstringFeatureExtractor
-
Creates a new builder with the specified feature mapper function.
- builder(Pattern) - Static method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor
-
Creates a new builder with the specified compiled pattern.
- builder(FeatureExtractor<F>, WindowFeatureMapper<F>) - Static method in class org.coordinatekit.crf.core.preprocessing.WindowFeatureExtractor
-
Creates a new builder for constructing a WindowFeatureExtractor.
C
- caseSensitive(boolean) - Method in class org.coordinatekit.crf.core.preprocessing.XPathFeatureExtractor.Builder
-
Sets whether token matching should be case-sensitive.
- CompositeFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that combines multiple feature extractors into one.
- CompositeTestAccuracyEvaluator - Class in org.coordinatekit.crf.mallet.train
-
A composite evaluator that logs iteration statistics along with both instance and token accuracy metrics on test data.
- CompositeTestAccuracyEvaluator(InstanceList, String) - Constructor for class org.coordinatekit.crf.mallet.train.CompositeTestAccuracyEvaluator
-
Creates a new composite evaluator for the specified test data.
- configuration - Variable in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
The configuration parameters controlling the training process.
- Configuration Options - Search tag in package org.coordinatekit.crf.mallet.train
- Section
- conllOutputConfiguration() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the configuration for CoNLL output.
- conllOutputConfiguration(ConllOutputConfiguration) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the configuration for CoNLL output.
- ConllOutputConfiguration - Class in org.coordinatekit.crf.mallet.train
-
Configuration settings for ConllOutputEvaluator.
- ConllOutputConfiguration.Builder - Class in org.coordinatekit.crf.mallet.train
-
Builder for constructing ConllOutputConfiguration instances.
- conllOutputEnabled() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns whether CoNLL output is enabled during training.
- conllOutputEnabled(boolean) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets whether to enable CoNLL output during training.
- ConllOutputEvaluator - Class in org.coordinatekit.crf.mallet.train
-
An evaluator that outputs predicted and actual labels in CoNLL format.
- ConllOutputEvaluator(InstanceList, String, ConllOutputConfiguration) - Constructor for class org.coordinatekit.crf.mallet.train.ConllOutputEvaluator
-
Creates a new CoNLL output evaluator with the specified configuration.
- createCrf(InstanceList) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
Creates a CRF model initialized with states derived from the training data.
- createCrfTrainer(CRF) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
Creates and configures a threaded CRF trainer for the given model.
- createMalletSequences(CRF, Sequence<FeaturePositionedToken<F>>) - Method in class org.coordinatekit.crf.mallet.tag.MalletCrfTagger
-
Converts a sequence of feature-positioned tokens into a MALLET feature vector sequence.
- CRF_SCHEMA_NAMESPACE_URI - Static variable in class org.coordinatekit.crf.core.io.XmlTrainingData
-
The namespace URI for CRF structural elements.
- CrfTagger<F, T> - Interface in org.coordinatekit.crf.core.tag
-
A conditional random field (CRF) tagger that assigns tags to tokens in an input string.
- CrfTrainer - Interface in org.coordinatekit.crf.core.train
-
Interface for training Conditional Random Field (CRF) models.
D
- decode(String) - Method in class org.coordinatekit.crf.core.StringTagProvider
- decode(String) - Method in interface org.coordinatekit.crf.core.TagProvider
-
Converts a string representation to a typed tag value.
- defaults() - Static method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration
-
Returns a configuration with all default values.
- defaults() - Static method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns a configuration with all default values.
- defaults() - Static method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration
-
Returns a configuration with all default values.
- DENSE - Enum constant in enum class org.coordinatekit.crf.mallet.train.WeightsType
-
Use dense weight storage for all features.
- deserialize(Class<T>, Path) - Static method in class org.coordinatekit.crf.core.util.Serializables
-
Deserializes an object from a file.
E
- encode(String) - Method in class org.coordinatekit.crf.core.StringTagProvider
- encode(T) - Method in interface org.coordinatekit.crf.core.TagProvider
-
Converts a typed tag value to its string representation.
- ending(boolean) - Method in class org.coordinatekit.crf.core.preprocessing.SubstringFeatureExtractor.Builder
-
Sets whether to extract from the end (suffix) or beginning (prefix) of the token.
- equals(Object) - Method in record class org.coordinatekit.crf.mallet.train.SimpleTrainingTestSplit
-
Indicates whether some other object is "equal to" this one.
- evaluate(TransducerTrainer) - Method in class org.coordinatekit.crf.mallet.train.ModelOutputEvaluator
- evaluateInstanceList(TransducerTrainer, InstanceList, String) - Method in class org.coordinatekit.crf.mallet.train.CompositeTestAccuracyEvaluator
- evaluateInstanceList(TransducerTrainer, InstanceList, String) - Method in class org.coordinatekit.crf.mallet.train.ConllOutputEvaluator
- evaluateInstanceList(TransducerTrainer, InstanceList, String) - Method in class org.coordinatekit.crf.mallet.train.ModelOutputEvaluator
- Example: Combining Feature Extractors - Search tag in package org.coordinatekit.crf.core.preprocessing
- Section
- Example Usage - Search tag in package org.coordinatekit.crf.mallet.tag
- Section
- Example Usage - Search tag in package org.coordinatekit.crf.mallet.train
- Section
- extract(Sequence<? extends PositionedToken>) - Method in interface org.coordinatekit.crf.core.preprocessing.FeatureExtractor
-
Extracts features for all tokens in a sequence.
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.CompositeFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in interface org.coordinatekit.crf.core.preprocessing.FeatureExtractor
-
Extracts features for the token at the specified position within a sequence.
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.LengthFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.SubstringFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.TransformingFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.WindowFeatureExtractor
- extractAt(Sequence<? extends PositionedToken>, int) - Method in class org.coordinatekit.crf.core.preprocessing.XPathFeatureExtractor
- extractTraining(Sequence<? extends TrainingPositionedToken<T>>) - Method in interface org.coordinatekit.crf.core.preprocessing.FeatureExtractor
-
Extracts features for all tokens in a training sequence, preserving tag information.
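The `ending(boolean)` entry in this section, together with `length(int)` and `includeIfLessThanLength(boolean)` elsewhere in the index, describes prefix and suffix extraction. The following is a minimal, self-contained sketch of that idea. `AffixSketch` and `affix` are hypothetical names, not part of the library, and the treatment of tokens shorter than the requested length (return the whole token or nothing) is an assumption about how such a flag might behave.

```java
import java.util.Optional;

// Hypothetical stand-in; this is not the library's SubstringFeatureExtractor.
final class AffixSketch {

    /**
     * Returns the suffix (fromEnd = true) or prefix (fromEnd = false) of the token.
     * If the token is shorter than the requested length, the whole token is
     * returned when includeIfShorter is true; otherwise nothing is returned.
     * That short-token rule is an assumption made for this sketch.
     */
    public static Optional<String> affix(String token, int length,
                                         boolean fromEnd, boolean includeIfShorter) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        if (token.length() < length) {
            return includeIfShorter ? Optional.of(token) : Optional.empty();
        }
        String part = fromEnd
                ? token.substring(token.length() - length) // last `length` characters
                : token.substring(0, length);              // first `length` characters
        return Optional.of(part);
    }

    public static void main(String[] args) {
        System.out.println(affix("Street", 3, false, false)); // Optional[Str]
        System.out.println(affix("Street", 3, true, false));  // Optional[eet]
        System.out.println(affix("St", 3, true, true));       // Optional[St]
        System.out.println(affix("St", 3, true, false));      // Optional.empty
    }
}
```

In the library itself, the extracted substring would presumably be passed to the `Function<String, F>` supplied to `builder(Function<String, F>)` to produce a feature; the sketch stops at the substring.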
F
- featureExtractor - Variable in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
The feature extractor for converting tokens to feature sets during training.
- FeatureExtractor<F> - Interface in org.coordinatekit.crf.core.preprocessing
-
Extracts features from tokens within a sequence for use in CRF model training and tagging.
- Feature Extractors - Search tag in package org.coordinatekit.crf.core.preprocessing
- Section
- FeaturePositionedToken<F> - Interface in org.coordinatekit.crf.core.preprocessing
-
A positioned token that includes extracted features.
- features() - Method in interface org.coordinatekit.crf.core.preprocessing.FeaturePositionedToken
-
Returns the set of features extracted for this token.
- FeatureSequence<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A sequence of tokens with extracted features.
- FeatureSequence(List<String>, List<Set<F>>) - Constructor for class org.coordinatekit.crf.core.preprocessing.FeatureSequence
-
Constructs a new feature sequence from the given tokens and features.
- FeatureTrainingPositionedToken<F, T> - Interface in org.coordinatekit.crf.core.preprocessing
-
A positioned token that includes both extracted features and a training label.
- FeatureTrainingSequence<F, T> - Class in org.coordinatekit.crf.core.preprocessing
-
A sequence of tokens with both extracted features and training labels.
- FeatureTrainingSequence(List<String>, List<T>, List<Set<F>>) - Constructor for class org.coordinatekit.crf.core.preprocessing.FeatureTrainingSequence
-
Constructs a new feature training sequence from the given tokens, tags, and features.
- filePrefix() - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration
-
Returns the prefix for output file names.
- filePrefix() - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration
-
Returns the prefix for output file names.
- filePrefix(String) - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration.Builder
-
Sets the prefix for output file names.
- filePrefix(String) - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration.Builder
-
Sets the prefix for output file names.
- fileSuffix() - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration
-
Returns the suffix (extension) for output file names.
- fileSuffix() - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration
-
Returns the suffix (extension) for output file names.
- fileSuffix(String) - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration.Builder
-
Sets the suffix (extension) for output file names.
- fileSuffix(String) - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration.Builder
-
Sets the suffix (extension) for output file names.
- firstFeature(F) - Method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor.Builder
-
Sets the feature to emit for the first token in a sequence.
- fullyConnected() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns whether to create a fully connected CRF state machine.
- fullyConnected(boolean) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets whether to create a fully connected CRF state machine.
G
- gaussianVariance() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the Gaussian prior variance for L2 regularization.
- gaussianVariance(double) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the Gaussian prior variance for L2 regularization.
- generateSchema(OutputStream) - Method in interface org.coordinatekit.crf.core.io.TrainingSchemaGenerator
-
Generates a schema and writes it to the specified output stream.
- generateSchema(OutputStream) - Method in class org.coordinatekit.crf.core.io.XmlTrainingData
- get(int) - Method in class org.coordinatekit.crf.core.InputSequence
- get(int) - Method in class org.coordinatekit.crf.core.preprocessing.FeatureSequence
- get(int) - Method in class org.coordinatekit.crf.core.preprocessing.FeatureTrainingSequence
- get(int) - Method in class org.coordinatekit.crf.core.preprocessing.TrainingSequence
- get(int) - Method in interface org.coordinatekit.crf.core.Sequence
-
Returns the token at the specified position in this sequence.
- get(int) - Method in class org.coordinatekit.crf.core.tag.TaggedSequence
H
- hashCode() - Method in record class org.coordinatekit.crf.mallet.train.SimpleTrainingTestSplit
-
Returns a hash code value for this object.
- hasLengthFeatureMapper(Function<Integer, F>) - Method in class org.coordinatekit.crf.core.preprocessing.LengthFeatureExtractor.Builder
-
Sets the mapper function to generate features for lengths the sequence has.
I
- includeCurrentToken(boolean) - Method in class org.coordinatekit.crf.core.preprocessing.WindowFeatureExtractor.Builder
-
Sets whether to include features from the current token.
- includeIfLessThanLength(boolean) - Method in class org.coordinatekit.crf.core.preprocessing.SubstringFeatureExtractor.Builder
-
Sets whether to include tokens shorter than the configured length.
- input() - Method in exception class org.coordinatekit.crf.core.preprocessing.InvalidInputException
-
Returns the invalid input that caused this exception.
- InputSequence - Class in org.coordinatekit.crf.core
-
A sequence of tokens representing user input for tagging.
- InputSequence(List<String>) - Constructor for class org.coordinatekit.crf.core.InputSequence
-
Creates a new sequence from the given list of string tokens.
- InvalidInputException - Exception Class in org.coordinatekit.crf.core.preprocessing
-
Exception thrown when input provided to a preprocessing component is invalid.
- InvalidInputException(String) - Constructor for exception class org.coordinatekit.crf.core.preprocessing.InvalidInputException
-
Creates a new exception for invalid input without a specific reason.
- InvalidInputException(String, String) - Constructor for exception class org.coordinatekit.crf.core.preprocessing.InvalidInputException
-
Creates a new exception for invalid input with a specific reason.
- iterationInterval() - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration
-
Returns the iteration interval for writing output files.
- iterationInterval() - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration
-
Returns the iteration interval for writing model files.
- iterationInterval(int) - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration.Builder
-
Sets the iteration interval for writing output files.
- iterationInterval(int) - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration.Builder
-
Sets the iteration interval for writing model files.
- iterations() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the maximum number of training iterations.
- iterations(int) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the maximum number of training iterations.
- iterator() - Method in class org.coordinatekit.crf.core.InputSequence
- iterator() - Method in class org.coordinatekit.crf.core.preprocessing.FeatureSequence
- iterator() - Method in class org.coordinatekit.crf.core.preprocessing.FeatureTrainingSequence
- iterator() - Method in class org.coordinatekit.crf.core.preprocessing.TrainingSequence
- iterator() - Method in class org.coordinatekit.crf.core.tag.TaggedSequence
L
- lacksLengthFeatureMapper(Function<Integer, F>) - Method in class org.coordinatekit.crf.core.preprocessing.LengthFeatureExtractor.Builder
-
Sets the mapper function to generate features for lengths the sequence lacks.
- lastFeature(F) - Method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor.Builder
-
Sets the feature to emit for the last token in a sequence.
- length(int) - Method in class org.coordinatekit.crf.core.preprocessing.SubstringFeatureExtractor.Builder
-
Sets the length of the substring to extract.
- LengthFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that generates features based on the length of a sequence.
- LengthFeatureExtractor.Builder<F> - Class in org.coordinatekit.crf.core.preprocessing
-
Builder for LengthFeatureExtractor.
M
- MalletCrfTagger<F, T> - Class in org.coordinatekit.crf.mallet.tag
-
A MALLET-based implementation of CrfTagger that uses a pre-trained CRF model to tag input sequences.
- MalletCrfTagger(FeatureExtractor<F>, Path, TagProvider<T>, Tokenizer) - Constructor for class org.coordinatekit.crf.mallet.tag.MalletCrfTagger
-
Creates a new tagger by loading a serialized CRF model from the specified path.
- MalletCrfTrainer<F, T> - Class in org.coordinatekit.crf.mallet.train
-
A CRF trainer implementation using the MALLET (MAchine Learning for LanguagE Toolkit) library.
- MalletCrfTrainer(FeatureExtractor<F>, TagProvider<T>, TrainingDataSequencer<T>) - Constructor for class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
Creates a new trainer with the specified components and default configuration.
- MalletCrfTrainer(FeatureExtractor<F>, TagProvider<T>, TrainingDataSequencer<T>, MalletCrfTrainerConfiguration) - Constructor for class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
Creates a new trainer with the specified components and configuration.
- MalletCrfTrainerConfiguration - Class in org.coordinatekit.crf.mallet.train
-
Configuration settings for MalletCrfTrainer.
- MalletCrfTrainerConfiguration.Builder - Class in org.coordinatekit.crf.mallet.train
-
Builder for constructing MalletCrfTrainerConfiguration instances.
- mapSequenceToInstance(Alphabet, LabelAlphabet, TrainingSequence<T>) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
Converts a training sequence into a MALLET Instance.
- matchedFeature(F) - Method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor.Builder
-
Sets the feature to emit when a token matches the pattern.
- modelOutputConfiguration() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the configuration for model checkpoint output.
- modelOutputConfiguration(ModelOutputConfiguration) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the configuration for model checkpoint output.
- ModelOutputConfiguration - Class in org.coordinatekit.crf.mallet.train
-
Configuration settings for ModelOutputEvaluator.
- ModelOutputConfiguration.Builder - Class in org.coordinatekit.crf.mallet.train
-
Builder for constructing ModelOutputConfiguration instances.
- modelOutputEnabled() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns whether model checkpoint output is enabled during training.
- modelOutputEnabled(boolean) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets whether to enable model checkpoint output during training.
- ModelOutputEvaluator - Class in org.coordinatekit.crf.mallet.train
-
An evaluator that writes the current transducer (model) to file using Java serialization.
- ModelOutputEvaluator(ModelOutputConfiguration) - Constructor for class org.coordinatekit.crf.mallet.train.ModelOutputEvaluator
-
Creates a new model output evaluator with the specified configuration.
N
- notMatchedFeature(F) - Method in class org.coordinatekit.crf.core.preprocessing.PatternMatchingFeatureExtractor.Builder
-
Sets the feature to emit when a token does not match the pattern.
- notPresentFeature(F) - Method in class org.coordinatekit.crf.core.preprocessing.XPathFeatureExtractor.Builder
-
Sets the feature to emit when a token is not present in the value set.
O
- of(FeatureExtractor<F>...) - Static method in class org.coordinatekit.crf.core.preprocessing.CompositeFeatureExtractor
-
Creates a new composite feature extractor from the specified extractors.
- org.coordinatekit.crf.core - package org.coordinatekit.crf.core
-
Core abstractions and interfaces for Conditional Random Fields (CRF) sequence labeling.
- org.coordinatekit.crf.core.io - package org.coordinatekit.crf.core.io
-
Input/output components for reading and writing CRF training data.
- org.coordinatekit.crf.core.preprocessing - package org.coordinatekit.crf.core.preprocessing
-
Preprocessing components for tokenization and feature extraction in CRF pipelines.
- org.coordinatekit.crf.core.tag - package org.coordinatekit.crf.core.tag
-
Tagging interfaces and result types for CRF sequence labeling.
- org.coordinatekit.crf.core.train - package org.coordinatekit.crf.core.train
-
Training interfaces for CRF model construction.
- org.coordinatekit.crf.core.util - package org.coordinatekit.crf.core.util
-
Utility classes for CRF operations.
- org.coordinatekit.crf.mallet.tag - package org.coordinatekit.crf.mallet.tag
-
MALLET-based CRF tagging implementation.
- org.coordinatekit.crf.mallet.train - package org.coordinatekit.crf.mallet.train
-
MALLET-based CRF training implementation.
- outputDirectory() - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration
-
Returns the directory in which to write output files.
- outputDirectory() - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration
-
Returns the directory in which to write model files.
- outputDirectory(Path) - Method in class org.coordinatekit.crf.mallet.train.ConllOutputConfiguration.Builder
-
Sets the directory in which to write output files.
- outputDirectory(Path) - Method in class org.coordinatekit.crf.mallet.train.ModelOutputConfiguration.Builder
-
Sets the directory in which to write model files.
P
- PatternMatchingFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that matches tokens against a regex pattern.
- PatternMatchingFeatureExtractor.Builder<F> - Class in org.coordinatekit.crf.core.preprocessing
-
Builder for PatternMatchingFeatureExtractor.
- position() - Method in interface org.coordinatekit.crf.core.PositionedToken
-
Returns the zero-based position of this token within its sequence.
- PositionedToken - Interface in org.coordinatekit.crf.core
-
A token with its position within a sequence.
- PositionFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that generates features based on token position within a sequence.
- PositionFeatureExtractor.Builder<F> - Class in org.coordinatekit.crf.core.preprocessing
-
Builder for PositionFeatureExtractor.
- positionFromEndFeatureMapper(Function<Integer, F>) - Method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor.Builder
-
Sets the mapper function to generate features based on position from the end.
- positionFromStartFeatureMapper(Function<Integer, F>) - Method in class org.coordinatekit.crf.core.preprocessing.PositionFeatureExtractor.Builder
-
Sets the mapper function to generate features based on position from the start.
- presentFeature(F) - Method in class org.coordinatekit.crf.core.preprocessing.XPathFeatureExtractor.Builder
-
Sets the feature to emit when a token is present in the value set.
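The position-related entries in this section (`position()`, `positionFromStartFeatureMapper`, `positionFromEndFeatureMapper`) and the `firstFeature`/`lastFeature` entries elsewhere in the index can be illustrated with a small sketch. It assumes zero-based positions, as `position()` documents, and additionally assumes that the distance from the end is `size - 1 - index`, which the index does not confirm. `PositionSketch`, `positionFeatures`, and the `FIRST`/`LAST` labels are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in; this is not the library's PositionFeatureExtractor.
final class PositionSketch {

    /**
     * Builds position features for the token at `index` in a sequence of `size` tokens.
     * Emits FIRST and/or LAST markers, then one feature from each mapper.
     * The from-end distance is computed as size - 1 - index (an assumption).
     */
    public static List<String> positionFeatures(int index, int size,
                                                Function<Integer, String> fromStart,
                                                Function<Integer, String> fromEnd) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(
                    "index " + index + " out of range for size " + size);
        }
        List<String> features = new ArrayList<>();
        if (index == 0) {
            features.add("FIRST");
        }
        if (index == size - 1) {
            features.add("LAST");
        }
        features.add(fromStart.apply(index));          // distance from the first token
        features.add(fromEnd.apply(size - 1 - index)); // distance from the last token
        return features;
    }

    public static void main(String[] args) {
        Function<Integer, String> start = i -> "START+" + i;
        Function<Integer, String> end = i -> "END-" + i;
        System.out.println(positionFeatures(0, 3, start, end)); // [FIRST, START+0, END-2]
        System.out.println(positionFeatures(2, 3, start, end)); // [LAST, START+2, END-0]
    }
}
```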
R
- randomSeed() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the random seed for reproducible data splitting.
- randomSeed(int) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the random seed for data splitting.
- read(InputStream) - Method in interface org.coordinatekit.crf.core.io.TrainingDataSequencer
-
Reads training data from an input stream and returns a stream of training sequences.
- read(InputStream) - Method in class org.coordinatekit.crf.core.io.XmlTrainingData
- read(Path) - Method in interface org.coordinatekit.crf.core.io.TrainingDataSequencer
-
Reads training data from a file path and returns a stream of training sequences.
- reason() - Method in exception class org.coordinatekit.crf.core.preprocessing.InvalidInputException
-
Returns the reason why the input is invalid.
- Result Types - Search tag in package org.coordinatekit.crf.mallet.tag
- Section
S
- score() - Method in interface org.coordinatekit.crf.core.tag.TagScore
-
Returns the confidence score for this tag.
- Sequence<E> - Interface in org.coordinatekit.crf.core
-
Represents an ordered sequence of positioned tokens.
- SEQUENCE_ELEMENT_NAME - Static variable in class org.coordinatekit.crf.core.io.XmlTrainingData
-
The local name of the element that contains a training sequence.
- Serializables - Class in org.coordinatekit.crf.core.util
-
Utility methods for serializing and deserializing objects to and from files.
- serialize(T, Path) - Static method in class org.coordinatekit.crf.core.util.Serializables
-
Serializes an object to a file.
- SimpleTrainingTestSplit - Record Class in org.coordinatekit.crf.mallet.train
-
A container for training and test data splits.
- SimpleTrainingTestSplit(InstanceList, InstanceList) - Constructor for record class org.coordinatekit.crf.mallet.train.SimpleTrainingTestSplit
-
Creates an instance of a SimpleTrainingTestSplit record class.
- size() - Method in class org.coordinatekit.crf.core.InputSequence
- size() - Method in class org.coordinatekit.crf.core.preprocessing.FeatureSequence
- size() - Method in class org.coordinatekit.crf.core.preprocessing.FeatureTrainingSequence
- size() - Method in class org.coordinatekit.crf.core.preprocessing.TrainingSequence
- size() - Method in interface org.coordinatekit.crf.core.Sequence
-
Returns the number of tokens in this sequence.
- size() - Method in class org.coordinatekit.crf.core.tag.TaggedSequence
- size() - Method in interface org.coordinatekit.crf.mallet.train.TrainingTestSplit
-
Returns the total number of instances across both training and test sets.
- SOME_DENSE - Enum constant in enum class org.coordinatekit.crf.mallet.train.WeightsType
-
Use a hybrid approach with some dense and some sparse weights.
- SPARSE - Enum constant in enum class org.coordinatekit.crf.mallet.train.WeightsType
-
Use sparse weight storage for all features.
- splitTrainingData(Collection<Path>) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
Reads training data and splits it into training and test sets.
- startingTag() - Method in class org.coordinatekit.crf.core.StringTagProvider
- startingTag() - Method in interface org.coordinatekit.crf.core.TagProvider
-
Returns the starting/fallback tag for the CRF.
- stream() - Method in class org.coordinatekit.crf.core.InputSequence
- stream() - Method in class org.coordinatekit.crf.core.preprocessing.FeatureSequence
- stream() - Method in class org.coordinatekit.crf.core.preprocessing.FeatureTrainingSequence
- stream() - Method in class org.coordinatekit.crf.core.preprocessing.TrainingSequence
- stream() - Method in interface org.coordinatekit.crf.core.Sequence
-
Returns a sequential Stream over the tokens in this sequence.
- stream() - Method in class org.coordinatekit.crf.core.tag.TaggedSequence
- StringTagProvider - Class in org.coordinatekit.crf.core
-
A tag provider for string tags with a defined tag set and starting tag.
- StringTagProvider(String) - Constructor for class org.coordinatekit.crf.core.StringTagProvider
-
Constructs a new string tag provider with the starting tag.
- StringTagProvider(Collection<String>, String) - Constructor for class org.coordinatekit.crf.core.StringTagProvider
-
Constructs a new string tag provider with the specified tags and starting tag.
- SubstringFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that generates features based on prefixes or suffixes of tokens.
- SubstringFeatureExtractor.Builder<F> - Class in org.coordinatekit.crf.core.preprocessing
-
Builder for SubstringFeatureExtractor.
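The `Serializables` entries in this section list `serialize(T, Path)` and, under D, `deserialize(Class<T>, Path)`. The sketch below shows one way such a pair could be written with standard Java serialization. Whether the library's utility actually uses `ObjectOutputStream` and `ObjectInputStream` is an assumption; the index only says it writes objects to and reads them from files. `SerializationSketch` and `roundTrip` are hypothetical names.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in; this is not the library's Serializables class.
final class SerializationSketch {

    // Writes the object to the file using standard Java serialization.
    public static <T extends Serializable> void serialize(T object, Path path)
            throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
            out.writeObject(object);
        }
    }

    // Reads one object back and checks that it has the expected type.
    public static <T> T deserialize(Class<T> type, Path path)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            return type.cast(in.readObject());
        }
    }

    // Demo helper: writes to a temporary file, reads it back, then deletes the file.
    public static <T extends Serializable> T roundTrip(T value, Class<T> type) {
        try {
            Path file = Files.createTempFile("crf-sketch", ".ser");
            try {
                serialize(value, file);
                return deserialize(type, file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("B-STREET", String.class)); // prints B-STREET
    }
}
```

Passing the expected class to `deserialize` lets the cast fail early with a `ClassCastException` if the file holds a different type, which is a common reason for a `Class<T>` parameter in this kind of helper.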
T
- tag() - Method in interface org.coordinatekit.crf.core.preprocessing.FeatureTrainingPositionedToken
-
Returns the training tag (label) for this token.
- tag() - Method in interface org.coordinatekit.crf.core.preprocessing.TrainingPositionedToken
-
Returns the training tag (label) for this token.
- tag() - Method in interface org.coordinatekit.crf.core.tag.TaggedPositionedToken
-
Returns the highest-scoring tag for this token.
- tag() - Method in interface org.coordinatekit.crf.core.tag.TagScore
-
Returns the tag value.
- tag(int) - Method in interface org.coordinatekit.crf.core.tag.TaggedPositionedToken
-
Returns an iterator over the top N tags for this token, ordered by score descending.
- tag(String) - Method in interface org.coordinatekit.crf.core.tag.CrfTagger
-
Tags the tokens in the input string using the CRF model.
- tag(String) - Method in class org.coordinatekit.crf.mallet.tag.MalletCrfTagger
- TaggedPositionedToken<F, T> - Interface in org.coordinatekit.crf.core.tag
-
A positioned token that includes the scores for each tag.
- TaggedSequence<F, T> - Class in org.coordinatekit.crf.core.tag
-
A sequence of tokens that have been tagged by a CRF model with associated features and tag scores.
- TaggedSequence(List<String>, List<Set<F>>, List<Map<T, Double>>) - Constructor for class org.coordinatekit.crf.core.tag.TaggedSequence
-
Creates a new tagged sequence from parallel lists of tokens, features, and tag scores.
- tagProvider - Variable in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
The tag provider defining available tags and their encoding/decoding.
- TagProvider<T> - Interface in org.coordinatekit.crf.core
-
Provides tag information for CRF models, including conversion, enumeration, and the starting state.
- tags() - Method in class org.coordinatekit.crf.core.StringTagProvider
- tags() - Method in interface org.coordinatekit.crf.core.TagProvider
-
Returns all valid tags in the model's label space.
- TagScore<T> - Interface in org.coordinatekit.crf.core.tag
-
A tag with its associated confidence score.
- tagScores() - Method in interface org.coordinatekit.crf.core.tag.TaggedPositionedToken
-
Returns an iterator over all tag scores for this token, ordered by score descending.
- test() - Method in record class org.coordinatekit.crf.mallet.train.SimpleTrainingTestSplit
-
Returns the value of the test record component.
- test() - Method in interface org.coordinatekit.crf.mallet.train.TrainingTestSplit
-
Returns the instances to use for model evaluation.
- threads() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the number of threads to use for parallel training.
- threads(int) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the number of threads for parallel training.
- token() - Method in interface org.coordinatekit.crf.core.PositionedToken
-
Returns the token value.
- tokenize(String) - Method in interface org.coordinatekit.crf.core.preprocessing.Tokenizer
-
Tokenizes the input string into a sequence of tokens.
- tokenize(String) - Method in class org.coordinatekit.crf.core.preprocessing.WhitespaceTokenizer
-
Tokenizes the input string into a sequence of tokens.
- Tokenizer - Interface in org.coordinatekit.crf.core.preprocessing
-
Converts raw text input into a sequence of tokens.
- toString() - Method in record class org.coordinatekit.crf.mallet.train.SimpleTrainingTestSplit
-
Returns a string representation of this record class.
- train(Path, Path) - Method in interface org.coordinatekit.crf.core.train.CrfTrainer
-
Trains a CRF model using the training data at the specified path and saves the model to the specified output path.
- train(Collection<Path>, Path) - Method in interface org.coordinatekit.crf.core.train.CrfTrainer
-
Trains a CRF model using the training data at the specified paths and saves the model to the specified output path.
- train(Collection<Path>, Path) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
- training() - Method in record class org.coordinatekit.crf.mallet.train.SimpleTrainingTestSplit
-
Returns the value of the training record component.
- training() - Method in interface org.coordinatekit.crf.mallet.train.TrainingTestSplit
-
Returns the training instances to use for model training.
- Training Data Format - Search tag in package org.coordinatekit.crf.core.io
-
Section
- trainingDataSequencer - Variable in class org.coordinatekit.crf.mallet.train.MalletCrfTrainer
-
The sequencer for reading training data from files into training sequences.
- TrainingDataSequencer<T> - Interface in org.coordinatekit.crf.core.io
-
Reads training data and converts it into a stream of training sequences.
- trainingFraction() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the fraction of data to use for training.
- trainingFraction(double) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the fraction of data to use for training.
- TrainingPositionedToken<T> - Interface in org.coordinatekit.crf.core.preprocessing
-
A positioned token that includes a training label.
- TrainingSchemaGenerator - Interface in org.coordinatekit.crf.core.io
-
Generates schema definitions for validating CRF training data.
- TrainingSequence<T> - Class in org.coordinatekit.crf.core.preprocessing
-
A sequence of tokens with training labels.
- TrainingSequence(List<String>, List<T>) - Constructor for class org.coordinatekit.crf.core.preprocessing.TrainingSequence
-
Constructs a new training sequence from the given tokens and tags.
- TrainingTestSplit - Interface in org.coordinatekit.crf.mallet.train
-
A container for training and test data splits.
- TransformingFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that applies a transformation function to each token to produce features.
- TransformingFeatureExtractor(Function<String, Set<F>>) - Constructor for class org.coordinatekit.crf.core.preprocessing.TransformingFeatureExtractor
-
Creates a new transforming feature extractor with the specified transformation function.
U
- UncheckedCrfException - Exception Class in org.coordinatekit.crf.core
-
Unchecked exception thrown when an error occurs during conditional random field (CRF) operations.
- UncheckedCrfException() - Constructor for exception class org.coordinatekit.crf.core.UncheckedCrfException
-
Constructs a new exception with no detail message.
- UncheckedCrfException(String) - Constructor for exception class org.coordinatekit.crf.core.UncheckedCrfException
-
Constructs a new exception with the specified detail message.
- UncheckedCrfException(String, Throwable) - Constructor for exception class org.coordinatekit.crf.core.UncheckedCrfException
-
Constructs a new exception with the specified detail message and cause.
- UncheckedCrfException(Throwable) - Constructor for exception class org.coordinatekit.crf.core.UncheckedCrfException
-
Constructs a new exception with the specified cause.
V
- valueOf(String) - Static method in enum class org.coordinatekit.crf.mallet.train.WeightsType
-
Returns the enum constant of this class with the specified name.
- values() - Static method in enum class org.coordinatekit.crf.mallet.train.WeightsType
-
Returns an array containing the constants of this enum class, in the order they are declared.
W
- weightsType() - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration
-
Returns the weight storage type for the CRF.
- weightsType(WeightsType) - Method in class org.coordinatekit.crf.mallet.train.MalletCrfTrainerConfiguration.Builder
-
Sets the weight storage type.
- WeightsType - Enum Class in org.coordinatekit.crf.mallet.train
-
Specifies the weight storage strategy for CRF training.
- WhitespaceTokenizer - Class in org.coordinatekit.crf.core.preprocessing
-
A tokenizer that splits input on whitespace characters.
- WhitespaceTokenizer() - Constructor for class org.coordinatekit.crf.core.preprocessing.WhitespaceTokenizer
-
Creates a new whitespace tokenizer.
- windowAfter(int) - Method in class org.coordinatekit.crf.core.preprocessing.WindowFeatureExtractor.Builder
-
Sets the number of tokens to look forward from the current position.
- windowBefore(int) - Method in class org.coordinatekit.crf.core.preprocessing.WindowFeatureExtractor.Builder
-
Sets the number of tokens to look back from the current position.
- WindowFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that extracts features from the current token and neighboring tokens within a window.
- WindowFeatureExtractor.Builder<F> - Class in org.coordinatekit.crf.core.preprocessing
-
Builder for constructing WindowFeatureExtractor instances.
- WindowFeatureMapper<F> - Interface in org.coordinatekit.crf.core.preprocessing
-
A function that maps a feature to a new feature based on its relative position.
X
- XmlTrainingData<T> - Class in org.coordinatekit.crf.core.io
-
Reads and writes CRF training data in XML format, and generates XSD schemas for tag validation.
- XmlTrainingData(TagProvider<T>) - Constructor for class org.coordinatekit.crf.core.io.XmlTrainingData
-
Constructs an XmlTrainingData instance for reading training data.
- XmlTrainingData(TagProvider<T>, String) - Constructor for class org.coordinatekit.crf.core.io.XmlTrainingData
-
Constructs an XmlTrainingData instance with a target namespace for schema generation.
- XPathFeatureExtractor<F> - Class in org.coordinatekit.crf.core.preprocessing
-
A feature extractor that checks whether tokens are present in a set of values loaded from an XML file using XPath expressions.
- XPathFeatureExtractor.Builder<F> - Class in org.coordinatekit.crf.core.preprocessing
-
Builder for XPathFeatureExtractor.