Slide 15
Spark MLlib
One of the most actively developed libraries.
• classification: logistic regression, linear support vector machine (SVM), naive Bayes, classification tree
• regression: generalized linear models (GLMs), regression tree
• collaborative filtering: alternating least squares (ALS)
• clustering: k-means
• decomposition: singular value decomposition (SVD), principal component analysis (PCA)
• statistics: summary statistics
• evaluation: binary classification
• optimization: gradient descent, L-BFGS
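The list pairs models (such as logistic regression) with optimizers (such as gradient descent). As a rough illustration of how the two fit together, here is a minimal single-machine sketch in plain Python. It is not MLlib code and does not use Spark; the function names, step size, iteration count, and toy data are all assumptions made up for this example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(points, labels, step_size=0.5, iterations=200):
    """Batch gradient descent on the mean logistic loss (hypothetical helper).

    points: list of feature lists; labels: list of 0/1 values.
    Returns (weights, intercept).
    """
    n = len(points)
    dim = len(points[0])
    weights = [0.0] * dim
    intercept = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + intercept
            error = sigmoid(score) - y  # derivative of the loss w.r.t. the score
            for j in range(dim):
                grad_w[j] += error * x[j]
            grad_b += error
        # Step against the averaged gradient.
        weights = [w - step_size * g / n for w, g in zip(weights, grad_w)]
        intercept -= step_size * grad_b / n
    return weights, intercept


def predict(weights, intercept, x):
    score = sum(w * xi for w, xi in zip(weights, x)) + intercept
    return 1 if sigmoid(score) >= 0.5 else 0


# Made-up toy data: one feature, label 1 when the feature is positive.
points = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, intercept = train_logistic_regression(points, labels)
predictions = [predict(weights, intercept, x) for x in points]
```

On this toy data the learned weight is positive and `predictions` matches `labels`. MLlib's own implementations run distributed on Spark, which this single-machine sketch does not attempt to show.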