Applied machine learning developers have many open-source machine learning frameworks to choose from, e.g., Scikit-Learn, Spark MLlib, ML.NET, etc. These frameworks provide a user-friendly high-level interface to algorithms, but their implementations resort to low-level languages and optimizations. Such low-level C/C++, C#, or Java code is far from the original mathematical notation that is preferable for machine learning algorithms research and development. The Semagle Framework makes the most of low-level C#-like constructs for performance optimization and of high-level semi-mathematical F# notation for joining the algorithm blocks. Modularization of algorithms into fine-grained blocks makes research and development of new implementations for the same family of problems straightforward.
The Semagle Framework includes the following principal F# libraries:
- Metrics Library (Semagle.MachineLearning.Metrics)
- Vectors Library (Semagle.Numerics.Vectors)
- SVM Library (Semagle.MachineLearning.SVM)
- Structural SVM Library (Semagle.MachineLearning.SSVM)
Metrics Library
Semagle.MachineLearning.Metrics library implements performance evaluation metrics for classification, regression, and information retrieval:
Accuracy is used for two-class and one-class classification problems.
\[accuracy = \frac{1}{n} \sum_{i=1}^n \mathbb{1}_{predicted_i = expected_i}\]

let accuracy (expected: 'Y[]) (predicted: 'Y[]) =
    ...

Classification.accuracy test_y predict_y
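For intuition, such a metric can be computed with a single fold over the paired arrays; the following is only an illustrative sketch, not the library's actual implementation:

/// Illustrative sketch: fraction of predictions that match the expected labels
let accuracySketch (expected: 'Y[]) (predicted: 'Y[]) =
    let matches = Array.fold2 (fun m e p -> if e = p then m + 1 else m) 0 expected predicted
    float matches / float (Array.length expected)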
Recall is used for classification and sequence labeling problems.
\[recall = \frac{\sum_{i=1}^n \mathbb{1}_{predicted_i=expected_i=+1}}{\sum_{i=1}^n \mathbb{1}_{expected_i=+1}}\]

let recall (expected: 'Y[]) (predicted: 'Y[]) =
    ...

Classification.recall test_y predict_y
Precision is used for classification and sequence labeling problems.
\[precision = \frac{\sum_{i=1}^n \mathbb{1}_{predicted_i=expected_i=+1}}{\sum_{i=1}^n \mathbb{1}_{predicted_i=+1}}\]

let precision (expected: 'Y[]) (predicted: 'Y[]) =
    ...

Classification.precision test_y predict_y
F1 is used for classification and sequence labeling problems.
\[F_1 = 2 \times \frac{recall \times precision}{recall + precision}\]

let f1 (expected: 'Y[]) (predicted: 'Y[]) =
    ...

Classification.f1 test_y predict_y
Mean squared error (MSE) is used for regression problems.
\[MSE = \frac{1}{n}\sum_{i=1}^n (predicted_i - expected_i)^2\]

let mse (expected: float32[]) (predicted: float32[]) =
    ...

Regression.mse test_y predict_y
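As a toy illustration of the classification metrics above (assuming two-class labels encoded as +1.0f/-1.0f, the convention the SVM library below uses):

let test_y    = [| +1.0f; -1.0f; +1.0f; +1.0f |]
let predict_y = [| +1.0f; +1.0f; -1.0f; +1.0f |]

let acc = Classification.accuracy  test_y predict_y  // 0.5
let p   = Classification.precision test_y predict_y  // 2/3
let r   = Classification.recall    test_y predict_y  // 2/3
let f   = Classification.f1        test_y predict_y  // 2/3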
Vectors Library
Semagle.Numerics.Vectors library provides DenseVector and SparseVector vector types. These data types implement operations required by standard SVM kernels. SparseVector is also a basic data type for the Structured SVM implementation.

DenseVector is constructed from the array of float32 values. This type of vector stores zero and nonzero values and suits well for feature vectors with many nonzero elements.
let a = DenseVector([| 1.0f; 3.0f; 3.0f; 4.0f; 8.0f |])
SparseVector is constructed from the array of int indices of nonzero elements and the array of float32 values of nonzero elements. This type of vector suits well for feature vectors with a few nonzero elements.
let a = SparseVector([| 0; 1; 3; 5 |], [| 1.0f; 2.0f; 4.0f; 6.0f |])
DenseVector and SparseVector support the following operations:
- indexing a.[1] and slicing a.[1..3], a.[..3], a.[2..], but implementations of SparseVector operations are more expensive because they require binary search for finding item indices;
- elementwise addition a + b, subtraction a - b, multiplication a * b, and division a / b;
- unary negation -a, and multiplication a * 1.5f and division a / 2.0f by a scalar;
- dot product a .* b and Euclidean distance ||a - b||.
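Putting these operations together, a short illustrative snippet with made-up values (assuming the types live in the Semagle.Numerics.Vectors namespace):

open Semagle.Numerics.Vectors

let a = DenseVector([| 1.0f; 3.0f; 3.0f; 4.0f; 8.0f |])
let b = DenseVector([| 2.0f; 0.0f; 1.0f; 1.0f; 1.0f |])

let sum  = a + b      // elementwise sum
let dot  = a .* b     // dot product
let tail = a.[2..]    // slice starting from the third element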
SVM Library
Semagle.MachineLearning.SVM library provides a high-level interface for classification and regression support vector machines and a low-level interface for sequential minimal optimization [1], [2]. All interfaces are data-type agnostic and interact with samples through kernel functions only:
type Kernel<'X> = 'X -> 'X -> float
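Because a kernel is just a function, custom kernels can be written directly. For example, a linear and an RBF kernel over DenseVector could be sketched as follows (an illustration only, assuming the dot product operator .* returns float32 as in the Vectors library above):

let linear : Kernel<DenseVector> = fun x x' -> float (x .* x')

let rbf (gamma : float) : Kernel<DenseVector> =
    fun x x' ->
        let d = x - x'                     // elementwise difference
        exp (-gamma * float (d .* d))      // exp(-gamma * ||x - x'||^2)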
A high-level interface includes the following models:
- TwoClass for two-class classification;
- OneClass for two-class classification with training on positive samples only;
- MultiClass for multiclass classification;
- Regression for real-valued regression.
type SVM<'X,'Y> =
    /// SVM model for two class classification
    | TwoClass of Kernel<'X> * ('X[]) * (float[]) * float
    /// SVM model for one class classification
    | OneClass of Kernel<'X> * ('X[]) * (float[]) * float
    /// SVM model for regression
    | Regression of Kernel<'X> * ('X[]) * (float[]) * float
    /// SVM model for multiclass classification
    | MultiClass of Kernel<'X> * (('Y * 'Y * ('X[]) * (float[]) * float)[])
Prediction functions for different problems are similar:
module TwoClass =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...

module OneClass =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...

module Regression =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...

module MultiClass =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...
They are all based on a weighted sum of kernel values over the support vectors:
module SVM =
    let inline predict (x : 'X) (K : Kernel<'X>) (X : 'X[]) (A : float[]) (b : float) =
        Array.fold2 (fun sum x_i a_i -> sum + a_i * (float (K x_i x))) b X A
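In mathematical notation, this fold computes the usual SVM decision value for support vectors $x_i$ with weights $a_i$ and bias $b$:

\[f(x) = \sum_{i=1}^n a_i K(x_i, x) + b\]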
The matching problem solvers show how to apply the generic sequential minimal optimization solver to specific problems:
Two-class and multiclass problem solvers build classification models for data type X. Labels for two-class classifiers are +1/-1; for multiclass, they have an arbitrary data type Y:
/// Optimization parameters for C_SVC problem
type C_SVC = {
    /// The penalty for +1 class instances
    C_p : float;
    /// The penalty for -1 class instances
    C_n : float;
}

/// Two class C Support Vector Classification (SVC) problem solver
let C_SVC (X : 'X[]) (Y : float32[]) (K : Kernel<'X>) (parameters : C_SVC) (options : OptimizationOptions) = ...

/// Multiclass C Support Vector Classification (SVC) problem solver
let C_SVC_M (X : 'X[]) (Y : 'Y[]) (K : Kernel<'X>) (parameters : C_SVC) (options : OptimizationOptions) = ...
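A rough end-to-end usage sketch follows; the tiny dataset, the options value, and the way the solver and kernel are brought into scope are assumptions made for illustration, not part of the library's documentation:

// Toy training set: two DenseVector samples with +1/-1 labels
let train_x = [| DenseVector([| 0.0f; 0.0f |]); DenseVector([| 1.0f; 1.0f |]) |]
let train_y = [| -1.0f; +1.0f |]

let K = rbf 0.5                              // the RBF kernel sketched above
let parameters = { C_p = 1.0; C_n = 1.0 }    // equal penalties for both classes
let options = ...                            // an OptimizationOptions value (construction elided)

let model = C_SVC train_x train_y K parameters options
let y_new = TwoClass.predict model (DenseVector([| 0.9f; 1.1f |]))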
One-class problem solver builds classification models for data type X and +1/-1 labels for an input dataset with positive samples only.
/// Optimization parameters for OneClass problem
type OneClass = {
    /// The fraction of support vectors
    nu : float;
}

/// OneClass problem solver
let OneClass (X : 'X[]) (K : Kernel<'X>) (parameters : OneClass) (options : OptimizationOptions) = ...
Regression problem solver builds regression models for data type X and float32 labels.
/// Optimization parameters for C_SVR problem
type C_SVR = {
    /// The boundary of the approximated function
    eta : float;
    /// The penalty
    C : float;
}

/// C Support Vector Regression (SVR) problem solver
let C_SVR (X : 'X[]) (Y : float32[]) (K : Kernel<'X>) (parameters : C_SVR) (options : OptimizationOptions) = ...
The sequential minimal optimization solver is a low-level building block for the above problem-specific solvers and for the Structured SVM optimization below:
/// General parameters for C_SMO problem
type C_SMO = {
    /// Initial feasible values of optimization variables
    A : float[];
    /// Per-sample penalties
    C : float[];
    /// The linear term of the optimized function
    p : float[];
}

/// Sequential Minimal Optimization (SMO) problem solver
let C_SMO (X : 'X[]) (Y : float32[]) (Q : Q) (parameters : C_SMO) (options : OptimizationOptions) = ...
Structural SVM Library
Semagle.MachineLearning.SSVM library provides a high-level interface for multiclass classification and sequence labeling, and a low-level cutting-plane solver for the “1-slack” [3] structural SVM formulation. All high-level interfaces require two functions:
- FeatureFunction creates a feature vector for an arbitrary data type X;
- LossFunction defines a loss value for predicted y and expected y' labels of an arbitrary data type Y.
/// Simple feature function
type FeatureFunction<'X> = 'X -> Vector

/// Structured SVM loss function type
type LossFunction<'Y> = (* y *) 'Y -> (* y' *) 'Y -> float
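For example, the common 0/1 loss fits this signature directly (an illustrative definition, not something the library ships):

/// 0/1 loss: full penalty whenever the predicted label differs from the expected one
let zeroOneLoss : LossFunction<string> = fun y y' -> if y = y' then 0.0 else 1.0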
Multiclass classification model builds a linear classification model in the combined feature and label space:
/// Structured SVM model for multiclass classification
type MultiClass<'X,'Y> = MultiClass of FeatureFunction<'X> * (* W *) float[] * 'Y[]

module MultiClass =
    /// Optimization parameters for the multiclass structured SVM
    type MultiClass<'Y> = {
        /// Penalty for slack variables
        C : float;
        /// Loss function
        loss : LossFunction<'Y>
    }

    /// Learn structured SVM model for multiclass classification
    let learn (X : 'X[]) (Y : 'Y[]) (F : FeatureFunction<'X>) (parameters : MultiClass<'Y>) (options : OneSlack.OptimizationOptions) = ...

    /// Predict class by multiclass structured model
    let predict (MultiClass(F, W, Y) : MultiClass<'X,'Y>) (x : 'X) : 'Y = ...
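A usage sketch with string labels and the 0/1 loss from above; the training data, the feature function, the options value, and the new sample x are elided assumptions for illustration:

let train_x : DenseVector[] = ...                  // training samples
let train_y : string[] = ...                       // e.g. "spam" / "ham" / "other"
let F : FeatureFunction<DenseVector> = ...         // maps a sample to its feature Vector

let parameters : MultiClass.MultiClass<string> = { C = 1.0; loss = zeroOneLoss }
let model = MultiClass.learn train_x train_y F parameters options
let y = MultiClass.predict model x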
Sequence labeling model builds a linear classification model in the combined feature and state space:
/// Structured SVM model for sequence labeling
type Sequence<'X,'Y> = Sequence of FeatureFunction<'X> * (* W *) float[] * 'Y[]

module Sequence =
    /// Optimization parameters for the sequence structured SVM
    type Sequence<'Y> = {
        /// Penalty for slack variables
        C : float;
        /// Loss function
        loss : LossFunction<'Y>
    }

    /// Learn structured SVM model for sequence labeling
    let learn (X : 'X[][]) (Y : 'Y[][]) (F : FeatureFunction<'X>) (parameters : Sequence<'Y>) (options : OneSlack.OptimizationOptions) = ...

    /// Predict labels sequence by sequence structured SVM
    let predict (Sequence(F, W, Y) : Sequence<'X, 'Y>) (X : 'X[]) : 'Y[] = ...
The “1-slack” cutting-plane solver is a low-level building block for the above problem-specific Structural SVM solvers. It requires a JointFeatureFunction that maps the $\mathcal{X} \times \mathcal{Y}$ space into a sparse vector space (vectors with predominantly zero elements). As a result of optimization, the solver returns a weight vector for the sparse vector space features:
/// Joint feature function type
type JointFeatureFunction<'X,'Y> = (* x *) 'X -> (* y *) 'Y -> SparseVector

/// 1-slack optimization problem solver
let oneSlack (X : 'X[]) (Y : 'Y[]) (JF : JointFeatureFunction<'X,'Y>) (parameters : OneSlack<'X,'Y>) (options : OptimizationOptions) = ...
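For intuition, the multiclass case above corresponds to a joint feature map that copies the sample's features into the block belonging to the candidate label. The helper below is a hypothetical illustration, not part of the library:

/// Hypothetical joint feature map for multiclass classification:
/// place the sample's features into the block that corresponds to the label y.
let multiClassJointFeatures (dimensions : int) (labels : 'Y[]) (features : 'X -> float32[]) : JointFeatureFunction<'X,'Y> =
    fun x y ->
        let offset = dimensions * (Array.findIndex ((=) y) labels)
        let f = features x
        SparseVector(Array.init f.Length (fun i -> offset + i), f)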
Check code, documentation, samples and API reference for more details. Happy hacking!
References

1. Chang, C.-C. & Lin, C.-J. LIBSVM: A Library for Support Vector Machines // ACM Trans. Intell. Syst. Technol. v.2, 2011, p. 1–39.
2. Fan, R.-E., Chen, P.-H. & Lin, C.-J. Working Set Selection Using Second Order Information for Training Support Vector Machines // J. Mach. Learn. Res. v.6, 2005, p. 1889–1918.
3. Joachims, T., Finley, T. & Yu, C.-N. J. Cutting-Plane Training of Structural SVMs // Mach. Learn. v.77, 2009, p. 27–59.