Applies the learning algorithm
the dataset with the feature vectors (Spark DataFrame).
the dataset with the annotations (Spark Dataset of types.MulticlassAnnotation).
maximum number of iterations for the gradient descent algorithm
threshold on the change in log-likelihood between iterations, used as a stopping criterion for the gradient descent algorithm
learning rate for the gradient descent algorithm
prior (Dirichlet distribution hyperparameters) for the estimation of the probability that an annotator assigns one class given that the true class is another
prior for the weights of the logistic regression model
0.2.0
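As a rough illustration of how the three optimisation parameters above interact, the following is a minimal sketch and not the library's implementation. The object `GradientDescentSketch`, its `run` method, and the parameter names `maxIter`, `threshold` and `learningRate` are hypothetical, and the model is reduced to a single weight. The loop stops when either the iteration cap is reached or the log-likelihood changes by no more than the threshold.

```scala
// Hypothetical sketch: NOT the library's code, only an illustration of
// the roles of the iteration cap, the threshold and the learning rate.
object GradientDescentSketch {

  /** Final weight, its log-likelihood and the number of iterations used. */
  final case class Result(weight: Double, logLikelihood: Double, iterations: Int)

  def run(
      logLikelihood: Double => Double, // objective to maximise
      gradient: Double => Double,      // derivative of the objective
      start: Double,                   // initial weight
      maxIter: Int,                    // maximum number of iterations
      threshold: Double,               // minimum log-likelihood change to keep going
      learningRate: Double             // step size
  ): Result = {
    var w = start
    var current = logLikelihood(w)
    var iter = 0
    var change = Double.PositiveInfinity
    while (iter < maxIter && change > threshold) {
      // Ascend the log-likelihood, i.e. descend on its negative.
      w = w + learningRate * gradient(w)
      val next = logLikelihood(w)
      change = math.abs(next - current)
      current = next
      iter += 1
    }
    Result(w, current, iter)
  }

  def main(args: Array[String]): Unit = {
    // Toy concave objective with its maximum at w = 3.
    val ll: Double => Double = w => -(w - 3.0) * (w - 3.0)
    val grad: Double => Double = w => -2.0 * (w - 3.0)
    println(run(ll, grad, start = 0.0, maxIter = 1000, threshold = 1e-6, learningRate = 0.1))
  }
}
```

In this sketch a smaller threshold or a larger iteration cap trades running time for a tighter fit, while a learning rate that is too large can make the log-likelihood oscillate instead of converging.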
Provides functions for transforming an annotation dataset into a standard label dataset using the Raykar algorithm for multiclass problems.
This algorithm only works with types.MulticlassAnnotation datasets. There are versions for types.BinaryAnnotation (RaykarBinary) and types.RealAnnotation (RaykarCont) datasets.
It returns a types.RaykarMultiModel with information about the estimation of the ground truth for each example (probability for each class), the annotator precision estimated by the model, the weights of the one-vs-all logistic regression models learned (one per class), and the log-likelihood of the model.
An example can be found in the examples folder. It also shows how to add prior confidence on the annotators.
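To make the idea of prior confidence on an annotator more concrete, here is a minimal sketch that is not taken from the library. It assumes the Dirichlet hyperparameters act as pseudo-counts added to the observed counts of one row of an annotator's confusion matrix (the posterior-mean form); the library's actual update may differ, for example by using a MAP estimate. The object `DirichletPriorSketch` and the method `smoothedRow` are hypothetical names.

```scala
// Hypothetical sketch: NOT the library's code. It shows Dirichlet
// hyperparameters used as pseudo-counts for one confusion-matrix row.
object DirichletPriorSketch {

  /** Estimates P(annotation = k | true class = c) for a fixed true class c.
    *
    * @param counts counts(k): times the annotator answered k on examples of class c
    * @param prior  prior(k): Dirichlet hyperparameter (pseudo-count) for answer k
    */
  def smoothedRow(counts: Array[Double], prior: Array[Double]): Array[Double] = {
    require(counts.length == prior.length, "one entry per class is expected")
    val total = counts.sum + prior.sum
    require(total > 0.0, "counts and prior cannot all be zero")
    // Posterior mean of a Dirichlet-multinomial model.
    counts.zip(prior).map { case (c, a) => (c + a) / total }
  }

  def main(args: Array[String]): Unit = {
    val counts = Array(8.0, 1.0, 1.0)
    // Uniform prior: little prior confidence in the annotator.
    println(smoothedRow(counts, Array(1.0, 1.0, 1.0)).mkString(", "))
    // Larger pseudo-count on the correct class: more prior confidence.
    println(smoothedRow(counts, Array(10.0, 1.0, 1.0)).mkString(", "))
  }
}
```

Under this assumption, raising the hyperparameter of the correct class pulls the estimated probability of a correct answer upwards, and the effect fades as more annotations are observed.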
0.1.5
Raykar, Vikas C., et al. "Learning from crowds." Journal of Machine Learning Research 11.Apr (2010): 1297-1322.