The fastFM API reference
The MCMC module
- class fastFM.mcmc.FMClassification(n_iter=100, init_stdev=0.1, rank=8, random_state=123, copy_X=True)
Factorization Machine classification with an MCMC solver.
Parameters: - n_iter (int, optional) – The number of samples for the MCMC sampler, the number of iterations over the training set for ALS, and the number of steps for SGD.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
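The three fitted attributes above are the parameters of a second-order factorization machine. As a minimal sketch of how a single prediction could be computed from them, the pure-Python code below evaluates the model for one dense sample, assuming V is stored with one row per factor (shape (rank, n_features), as documented). The function names and toy numbers are illustrative and are not part of fastFM.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score for one dense sample x.

    w0: bias, w: n_features linear weights,
    V: list of rank rows, each holding n_features factor weights.
    Uses the linear-time form of the pairwise term:
    0.5 * sum_f ((sum_j V[f][j] x[j])^2 - sum_j (V[f][j] x[j])^2).
    """
    linear = sum(w_j * x_j for w_j, x_j in zip(w, x))
    pairwise = 0.0
    for v_f in V:
        s = sum(v * x_j for v, x_j in zip(v_f, x))
        s_sq = sum((v * x_j) ** 2 for v, x_j in zip(v_f, x))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same score, written as an explicit sum over feature pairs i < j."""
    n = len(x)
    score = w0 + sum(w[j] * x[j] for j in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(v_f[i] * v_f[j] for v_f in V)
            score += dot * x[i] * x[j]
    return score


# Toy parameters (hypothetical values, rank = 2, n_features = 3).
x = [1.0, 2.0, 0.0]
w0 = 0.5
w = [0.1, -0.2, 0.3]
V = [[1.0, 0.5, 2.0],
     [0.0, 1.0, -1.0]]
score = fm_predict(x, w0, w, V)
```

Both functions give the same value; the first avoids the quadratic loop over feature pairs.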
- fit_predict(X_train, y_train, X_test)
Return predicted class labels for the test samples, based on the average class probabilities of the posterior estimates. Use only with MCMC!
Parameters: - X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y_train (array, shape (n_samples)) – The targets have to be encoded as {-1, 1}.
- X_test (scipy.sparse.csc_matrix, (n_test_samples, n_features)) –
Returns: y_pred – Predicted class labels.
Return type: array, shape (n_test_samples)
- fit_predict_proba(X_train, y_train, X_test)
Return the average class probabilities of the posterior estimates for the test samples. Use only with MCMC!
Parameters: - X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y_train (array, shape (n_samples)) – The targets have to be encoded as {-1, 1}.
- X_test (scipy.sparse.csc_matrix, (n_test_samples, n_features)) –
Returns: y_pred – Probability estimates for the class with the lowest classification label.
Return type: array, shape (n_test_samples)
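The classification methods expect targets encoded as {-1, 1}. A minimal sketch of one way to convert arbitrary binary labels into that encoding is shown below; the helper name encode_labels is hypothetical and the choice to map the smaller label to -1 is an assumption made for this example.

```python
def encode_labels(y):
    """Map a list of binary labels to the {-1, 1} encoding.

    The smaller of the two distinct labels becomes -1, the larger becomes 1.
    """
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes, got %r" % (classes,))
    smaller = classes[0]
    return [-1 if label == smaller else 1 for label in y]


y_raw = [0, 1, 1, 0, 1]          # e.g. labels loaded as 0/1
y_train = encode_labels(y_raw)   # ready to pass as y_train
```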
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: T – The predicted class labels.
Return type: array, shape (n_samples)
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self
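The <component>__<parameter> naming used by set_params can be illustrated with a small sketch that splits such a name at the first double underscore. This only demonstrates the naming convention; split_param_name is a hypothetical helper and is not how the estimator itself is implemented.

```python
def split_param_name(name):
    """Split 'component__parameter' at the first double underscore.

    Returns (None, name) for a plain parameter of the estimator itself.
    """
    component, sep, parameter = name.partition("__")
    if not sep:
        return None, name
    return component, parameter


# 'fm' is an assumed step name in a hypothetical pipeline.
nested = split_param_name("fm__rank")
plain = split_param_name("rank")
```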
- class fastFM.mcmc.FMRegression(n_iter=100, init_stdev=0.1, rank=8, random_state=123, copy_X=True)
Factorization Machine regression with an MCMC solver.
Parameters: - n_iter (int, optional) – The number of samples for the MCMC sampler, the number of iterations over the training set for ALS, and the number of steps for SGD.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
- fit_predict(X_train, y_train, X_test, n_more_iter=0)
Return the average of the posterior estimates for the test samples.
Parameters: - X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y_train (array, shape (n_samples)) –
- X_test (scipy.sparse.csc_matrix, (n_test_samples, n_features)) –
- n_more_iter (int) – Number of iterations to continue from the current coefficients.
Returns: T
Return type: array, shape (n_test_samples)
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: T – The predicted target values.
Return type: array, shape (n_samples)
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self
The ALS module
- class fastFM.als.FMClassification(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=None)
Factorization Machine classification trained with an ALS (coordinate descent) solver.
Parameters: - n_iter (int, optional) – The number of samples for the MCMC sampler, the number of iterations over the training set for ALS, and the number of steps for SGD.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
- l2_reg_w (float) – L2 penalty weight for linear coefficients.
- l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
- l2_reg (float) – L2 penalty weight for all coefficients (default=0).
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
- fit(X_train, y_train, n_more_iter=0)
Fit model with specified loss.
Parameters: - X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y_train (float | ndarray, shape = (n_samples, )) – The targets have to be encoded as {-1, 1}.
- n_more_iter (int) – Number of iterations to continue from the current coefficients.
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: y – Class labels.
Return type: array, shape (n_samples)
- predict_proba(X_test)
Return probabilities.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: y – Class probability for the class with the smaller label.
Return type: array, shape (n_samples)
- score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since you require for each sample that each label set be correctly predicted.
Parameters: - X (array-like, shape = (n_samples, n_features)) – Test samples.
- y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True labels for X.
- sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns: score – Mean accuracy of self.predict(X) with respect to y.
Return type: float
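The mean accuracy returned by score is the fraction of samples whose predicted label equals the true label. A minimal sketch of that computation, ignoring sample_weight and using a hypothetical helper name, could look like this:

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label matches the true one."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Toy labels in the {-1, 1} encoding used by the classifiers.
y_true = [1, -1, 1, 1]
y_pred = [1, -1, -1, 1]
accuracy = mean_accuracy(y_true, y_pred)  # 3 of 4 correct
```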
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self
- class fastFM.als.FMRegression(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=0)
Factorization Machine regression trained with an ALS (coordinate descent) solver.
Parameters: - n_iter (int, optional) – The number of samples for the MCMC sampler, the number of iterations over the training set for ALS, and the number of steps for SGD.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
- l2_reg_w (float) – L2 penalty weight for linear coefficients.
- l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
- l2_reg (float) – L2 penalty weight for all coefficients (default=0).
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
- fit(X_train, y_train, n_more_iter=0)
Fit model with specified loss.
Parameters: - X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y_train (float | ndarray, shape = (n_samples, )) –
- n_more_iter (int) – Number of iterations to continue from the current coefficients.
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: T – The predicted target values.
Return type: array, shape (n_samples)
- score(X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters: - X (array-like, shape = (n_samples, n_features)) – Test samples.
- y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True values for X.
- sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns: score – R^2 of self.predict(X) with respect to y.
Return type: float
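The R^2 definition above can be worked through on a few numbers. The sketch below follows the stated formula 1 - u/v directly, ignores sample_weight, and assumes y_true is not constant (otherwise v would be zero); the helper name is hypothetical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination, 1 - u/v, as defined in the text."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

good = r2(y_true, [1.0, 2.0, 3.0, 5.0])      # one prediction off by 1
constant = r2(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean
bad = r2(y_true, [4.0, 3.0, 2.0, 1.0])       # reversed order
```

Here v is 5.0, so the three cases illustrate a score close to 1, a score of 0 for the constant mean predictor, and a negative score for a model that does worse than the mean.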
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self
The SGD module
- class fastFM.sgd.FMClassification(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0, l2_reg_V=0, l2_reg=None, step_size=0.1)
Factorization Machine classification trained with a stochastic gradient descent solver.
Parameters: - n_iter (int, optional) – The number of iterations over individual samples.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
- l2_reg_w (float) – L2 penalty weight for linear coefficients.
- l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
- l2_reg (float) – L2 penalty weight for all coefficients (default=0).
- step_size (float) – Step size for the SGD solver. The solver uses a fixed step size and might require tuning of the number of iterations n_iter.
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
- fit(X, y)
Fit model with specified loss.
Parameters: - X (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y (float | ndarray, shape = (n_samples, )) – The targets have to be encoded as {-1, 1}.
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: y – Class labels.
Return type: array, shape (n_samples)
- predict_proba(X_test)
Return probabilities.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: y – Class probability for the class with the smaller label.
Return type: array, shape (n_samples)
- score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since you require for each sample that each label set be correctly predicted.
Parameters: - X (array-like, shape = (n_samples, n_features)) – Test samples.
- y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True labels for X.
- sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns: score – Mean accuracy of self.predict(X) with respect to y.
Return type: float
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self
- class fastFM.sgd.FMRegression(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=0, step_size=0.1)
Factorization Machine regression trained with a stochastic gradient descent solver.
Parameters: - n_iter (int, optional) – The number of iterations over individual samples.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
- l2_reg_w (float) – L2 penalty weight for linear coefficients.
- l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
- l2_reg (float) – L2 penalty weight for all coefficients (default=0).
- step_size (float) – Step size for the SGD solver. The solver uses a fixed step size and might require tuning of the number of iterations n_iter.
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
- fit(X, y)
Fit model with specified loss.
Parameters: - X (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- y (float | ndarray, shape = (n_samples, )) –
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: T – The predicted target values.
Return type: array, shape (n_samples)
- score(X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters: - X (array-like, shape = (n_samples, n_features)) – Test samples.
- y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True values for X.
- sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns: score – R^2 of self.predict(X) with respect to y.
Return type: float
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self
The Ranking module
- class fastFM.bpr.FMRecommender(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=0, step_size=0.1)
Factorization Machine recommender with a pairwise (BPR) loss solver.
Parameters: - n_iter (int, optional) – The number of iterations over individual samples.
- init_stdev (float, optional) – The standard deviation used to initialize the parameters.
- random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
- rank (int) – The rank of the factorization used for the second-order interactions.
- l2_reg_w (float) – L2 penalty weight for linear coefficients.
- l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
- l2_reg (float) – L2 penalty weight for all coefficients (default=0).
- step_size (float) – Step size for the SGD solver. The solver uses a fixed step size and might require tuning of the number of iterations n_iter.
Attributes: - w0_ (float) – Bias term.
- w_ (float | array, shape = (n_features)) – Coefficients for the linear combination.
- V_ (float | array, shape = (rank_pair, n_features)) – Coefficients of the second-order factor matrix.
- fit(X, pairs)
Fit model with specified loss.
Parameters: - X (scipy.sparse.csc_matrix, (n_samples, n_features)) –
- pairs (float | ndarray, shape = (n_compares, 2)) – Each row i defines a pair of samples such that the first returns a higher value than the second: FM(X[pairs[i, 0]]) > FM(X[pairs[i, 1]]).
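As a rough sketch of how such a comparison array might be prepared, the code below builds every ordered pair of sample positions whose known relevance values differ, placing the preferred sample first. It assumes that the entries of each pair are row indices into X, and the helper name build_pairs and the relevance values are invented for this example.

```python
def build_pairs(relevance):
    """Return all (i, j) index pairs with relevance[i] > relevance[j].

    Each pair says: sample i should be ranked above sample j.
    """
    pairs = []
    for i, r_i in enumerate(relevance):
        for j, r_j in enumerate(relevance):
            if r_i > r_j:
                pairs.append((i, j))
    return pairs


# Hypothetical relevance for three rows of X (higher means preferred).
relevance = [3, 1, 2]
pairs = build_pairs(relevance)
```

The resulting list has shape (n_compares, 2) once converted to an array. Ties produce no pair, since neither sample is preferred.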
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
- predict(X_test)
Return predictions.
Parameters: X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns: T – The predicted values.
Return type: array, shape (n_samples)
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Return type: self