SVM - Support Vector Machines
Support Vector Machines are powerful classifiers that work well for both linear and non-linear classification problems.

Creating an SVM Model
Key Methods
setType
Sets the type of SVM formulation.
Parameters:
- val (int): SVM type. See SVM::Types:
  - SVM::C_SVC (100): C-Support Vector Classification for n-class classification
  - SVM::NU_SVC (101): ν-Support Vector Classification with parameter ν
  - SVM::ONE_CLASS (102): Distribution estimation (one-class SVM)
  - SVM::EPS_SVR (103): ε-Support Vector Regression
  - SVM::NU_SVR (104): ν-Support Vector Regression
setKernel
Initializes the model with one of the predefined kernels.
Parameters:
- kernelType (int): Kernel type. See SVM::KernelTypes:
  - SVM::LINEAR (0): Linear kernel; no mapping is done
  - SVM::POLY (1): Polynomial kernel
  - SVM::RBF (2): Radial basis function kernel; a good choice in most cases
  - SVM::SIGMOID (3): Sigmoid kernel
  - SVM::CHI2 (4): Exponential Chi² kernel
  - SVM::INTER (5): Histogram intersection kernel
setC
Sets the parameter C of an SVM optimization problem.
Parameters:
- val (double): Parameter C, used by C_SVC, EPS_SVR, and NU_SVR
setGamma
Sets the parameter γ of a kernel function.
Parameters:
- val (double): Parameter gamma, used by the POLY, RBF, SIGMOID, and CHI2 kernels
train
Trains the SVM model.
Parameters:
- trainData (Ptr<TrainData>): Training data
- flags (int): Optional flags
predict
Predicts responses for input samples.
Parameters:
- samples (InputArray): Input samples; a floating-point matrix
- results (OutputArray): Optional output matrix of results
- flags (int): Optional flags
trainAuto
Trains an SVM with optimal parameters chosen by cross-validation.
Parameters:
- data (Ptr<TrainData>): Training data
- kFold (int): Cross-validation parameter (default: 10)
- Cgrid (ParamGrid): Grid for the C parameter
- gammaGrid (ParamGrid): Grid for the gamma parameter
- pGrid (ParamGrid): Grid for the p parameter
- nuGrid (ParamGrid): Grid for the nu parameter
- coeffGrid (ParamGrid): Grid for the coeff parameter
- degreeGrid (ParamGrid): Grid for the degree parameter
- balanced (bool): If true, creates more balanced cross-validation subsets
getSupportVectors
Retrieves all the support vectors.
Returns: Mat - a matrix in which the support vectors are stored as rows
Example Usage
KNearest - K-Nearest Neighbors
The K-Nearest Neighbors algorithm finds the k nearest neighbors of a sample and predicts the response based on their values.

Creating a KNearest Model
Key Methods
setDefaultK
Sets the default number of neighbors used by the predict method.
Parameters:
- val (int): Number of neighbors (must be greater than 1)
setIsClassifier
Sets whether a classification or a regression model should be trained.
Parameters:
- val (bool): true for classification, false for regression
setAlgorithmType
Sets the algorithm type.
Parameters:
- val (int): Algorithm type:
  - KNearest::BRUTE_FORCE (1): Brute-force search
  - KNearest::KDTREE (2): KD-tree based search
findNearest
Finds the neighbors of input vectors and predicts their responses.
Parameters:
- samples (InputArray): Input samples (one per row)
- k (int): Number of nearest neighbors
- results (OutputArray): Vector of prediction results
- neighborResponses (OutputArray): Optional output of the neighbors' responses
- dist (OutputArray): Optional output of the distances to the neighbors
Example Usage
DTrees - Decision Trees
Decision trees are tree-based classifiers that recursively split the data based on feature values.

Creating a Decision Tree Model
Key Methods
setMaxDepth
Sets the maximum possible depth of the tree.
Parameters:
- val (int): Maximum depth (the root node has depth 0)
setMinSampleCount
Sets the minimum number of samples required to split a node.
Parameters:
- val (int): Minimum sample count (default: 10)
setMaxCategories
Sets the maximum number of clusters that possible values of a categorical variable are grouped into.
Parameters:
- val (int): Maximum number of categories (default: 10)
setCVFolds
Sets the number of folds used in cross-validation pruning.
Parameters:
- val (int): Number of folds (default: 10)
setPriors
Sets a priori class probabilities.
Parameters:
- val (Mat): Array of class probabilities, ordered by class label
Example Usage
Boost - Boosted Trees
A boosted tree classifier combines multiple weak classifiers into a single strong one.

Creating a Boost Model
Key Methods
setBoostType
Sets the type of boosting algorithm.
Parameters:
- val (int): Boost type:
  - Boost::DISCRETE (0): Discrete AdaBoost
  - Boost::REAL (1): Real AdaBoost (default; works well with categorical data)
  - Boost::LOGIT (2): LogitBoost (good for regression)
  - Boost::GENTLE (3): Gentle AdaBoost (good with regression data)
setWeakCount
Sets the number of weak classifiers.
Parameters:
- val (int): Number of weak classifiers (default: 100)
setWeightTrimRate
Sets the weight trim rate, a threshold used to save computational time: samples with a small enough summary weight do not participate in the next training iteration.
Parameters:
- val (double): Weight trim rate, between 0 and 1 (default: 0.95)
Example Usage
RTrees - Random Trees (Random Forest)
Random Forest is an ensemble learning method that constructs multiple decision trees and aggregates their predictions.

Creating a Random Trees Model
Key Methods
setCalculateVarImportance
Enables or disables calculation of variable importance.
Parameters:
- val (bool): true to calculate variable importance during training
setActiveVarCount
Sets the size of the randomly selected subset of features tested at each tree node.
Parameters:
- val (int): Number of active variables (0 means the square root of the total number of features)
setTermCriteria
Sets the termination criteria for training.
Parameters:
- val (TermCriteria): Criteria specifying the maximum number of iterations and/or the desired accuracy
getVarImportance
Returns the variable importance array.
Returns: Mat - the variable importance vector (empty unless importance calculation was enabled during training)
getVotes
Returns the result of each individual tree in the forest.
Parameters:
- samples (InputArray): Samples for which votes will be calculated
- results (OutputArray): Matrix where the results will be written
- flags (int): Flags defining the type of RTrees
Example Usage
NormalBayesClassifier - Naive Bayes
A Bayes classifier for normally distributed data, based on Bayesian statistics.

Creating a Naive Bayes Model
Key Methods
train
Trains the Bayes classifier.
Parameters:
- trainData (Ptr<TrainData>): Training data
- flags (int): Optional flags
predict
Predicts responses for input samples.
Parameters:
- samples (InputArray): Input samples
- results (OutputArray): Output predictions
- flags (int): Optional flags
predictProb
Predicts the response and returns the per-class probabilities.
Parameters:
- inputs (InputArray): Input vectors (one or more)
- outputs (OutputArray): Predicted classes
- outputProbs (OutputArray): Output probabilities for each class
- flags (int): Optional flags
Example Usage
The Naive Bayes classifier assumes that features are normally distributed and independent. It works best when these assumptions hold true.
See Also
- Regression Algorithms - Linear and logistic regression methods
- Clustering Algorithms - K-means and EM clustering
- StatModel Base Class - Base class for all ML models
