Applies the learning algorithm
the dataset with feature vectors (Spark DataFrame).
the dataset with the annotations (spark Dataset of types.BinaryAnnotation).
maximum number of iterations for the GradientDescent algorithm
threshold on the change in log-likelihood for the gradient descent algorithm
learning rate for the gradient descent algorithm
prior (Beta distribution hyperparameters) for the estimation of the probability that an annotator correctly classifies positive instances
prior (Beta distribution hyperparameters) for the estimation of the probability that an annotator correctly classifies negative instances
prior for the weights of the logistic regression model
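To make the role of the three priors concrete, the following sketch restates how they enter the model in Raykar et al. (2010), the reference cited for this algorithm. The symbols (a_1, a_2, b_1, b_2, Gamma, mu_i) follow the paper and are assumptions with respect to this library, whose internal parameterization may differ. Here y_i^j is the label given by annotator j to instance i, mu_i is the current estimate of the probability that the true label of instance i is positive, and the sums run over the instances labeled by annotator j.

```latex
\begin{align*}
% Sensitivity and specificity of annotator j, each with a Beta prior
\alpha^j &= \Pr(y^j = 1 \mid y = 1) \sim \mathrm{Beta}(a_1^j, a_2^j), \\
\beta^j  &= \Pr(y^j = 0 \mid y = 0) \sim \mathrm{Beta}(b_1^j, b_2^j), \\
% Zero-mean Gaussian prior on the logistic regression weights
w &\sim \mathcal{N}(0, \Gamma^{-1}), \\
% MAP updates of the annotator parameters in the M-step
\hat{\alpha}^j &= \frac{a_1^j - 1 + \sum_i \mu_i \, y_i^j}
                       {a_1^j + a_2^j - 2 + \sum_i \mu_i}, \\
\hat{\beta}^j  &= \frac{b_1^j - 1 + \sum_i (1 - \mu_i)(1 - y_i^j)}
                       {b_1^j + b_2^j - 2 + \sum_i (1 - \mu_i)}.
\end{align*}
```

Since the mean of a Beta(a_1, a_2) distribution is a_1 / (a_1 + a_2), larger hyperparameters with the same ratio express stronger prior confidence in an annotator, and the updates above then move more slowly away from that prior.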
0.1.5
Provides functions for transforming an annotation dataset into a standard label dataset using the RaykarBinary algorithm.
This algorithm only works with types.BinaryAnnotation datasets. There are versions for the types.MulticlassAnnotation (RaykarMulti) and types.RealAnnotation (RaykarCont).
It returns a types.RaykarBinaryModel with the estimated ground truth for each example, the estimated precision of each annotator, the learned weights of the logistic regression model, and the log-likelihood of the model.
A usage example can be found in the examples folder, which also includes an example of how to add prior confidence on the annotators.
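As a rough illustration of the usage described above, the following is a minimal sketch intended for a spark-shell session with the library on the classpath. Several names are assumptions rather than details confirmed by this page: the package path, the Parquet file paths (hypothetical placeholders), the accessor names on the returned model, and the types.BinarySoftLabel type. Only the first two arguments are passed, assuming the remaining parameters have default values.

```scala
// Assumed package layout; adjust to the installed version of the library.
import com.enriquegrodrigo.spark.crowd.methods.RaykarBinary
import com.enriquegrodrigo.spark.crowd.types._

// In spark-shell, `spark` and its implicits are already in scope;
// the import is repeated here so the encoder for `.as[...]` is explicit.
import spark.implicits._

// Hypothetical file paths: one file with feature vectors,
// one with the annotations given by each annotator.
val exampleFile = "data/binary-data.parquet"
val annFile = "data/binary-ann.parquet"

// Feature vectors as a Spark DataFrame.
val exampleData = spark.read.parquet(exampleFile)

// Annotations as a Spark Dataset of types.BinaryAnnotation.
val annData = spark.read.parquet(annFile).as[BinaryAnnotation]

// Apply the learning algorithm, assuming defaults for all other parameters.
val model = RaykarBinary(exampleData, annData)

// Accessor names below are assumptions about types.RaykarBinaryModel.
val groundTruth = model.getMu().as[BinarySoftLabel]   // estimated ground truth per example
val annotatorPrecision = model.getAnnotatorPrecision() // estimated precision per annotator
val logLikelihood = model.getLogLikelihood()           // log-likelihood of the fitted model
```

The examples folder remains the authoritative source for the exact call, including how the prior hyperparameters are passed.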
0.1.5
Raykar, Vikas C., et al. "Learning from crowds." Journal of Machine Learning Research 11.Apr (2010): 1297-1322.