The fastFM API reference

The MCMC module

class fastFM.mcmc.FMClassification(n_iter=100, init_stdev=0.1, rank=8, random_state=123, copy_X=True)[source]

Factorization Machine Classification with an MCMC solver.

Parameters:
  • n_iter (int, optional) – The number of samples for the MCMC sampler, number of iterations over the training set for ALS and number of steps for SGD.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.
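The three fitted attributes combine in the standard second-order factorization machine equation: bias, plus the linear term, plus pairwise interactions weighted by inner products of factor columns. The following is a minimal pure-Python sketch of that equation for one dense sample, not fastFM's own code. It assumes V_ is indexed as V[f][i] (factor f, feature i) and that rank_pair corresponds to the rank parameter; fm_predict is a hypothetical helper name.

```python
def fm_predict(x, w0, w, V):
    """Score one dense sample x with a second-order factorization machine.

    V is indexed as V[f][i] (factor f, feature i), which is an assumed
    reading of the documented shape (rank_pair, n_features).
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = 0.0
    for v_f in V:
        # Identity: sum_{i<j} v_i v_j x_i x_j = 0.5 * ((sum v_i x_i)^2 - sum (v_i x_i)^2)
        s = sum(v * x_i for v, x_i in zip(v_f, x))
        s_sq = sum((v * x_i) ** 2 for v, x_i in zip(v_f, x))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


x = [1.0, 0.0, 2.0]
w0 = 0.5
w = [0.1, -0.2, 0.3]
V = [[1.0, 2.0, 3.0],
     [0.5, -1.0, 0.0]]
score = fm_predict(x, w0, w, V)
```

The inner loop uses the usual reformulation that avoids enumerating every feature pair, so the cost grows linearly with the number of features per factor.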

fit_predict(X_train, y_train, X_test)[source]

Fit the model and return class labels predicted from the averaged posterior estimates of the test samples. Use only with MCMC!

Parameters:
  • X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y_train (array, shape (n_samples)) – the targets have to be encoded as {-1, 1}.
  • X_test (scipy.sparse.csc_matrix, (n_test_samples, n_features)) –
Returns:

y_pred – Returns predicted class labels.

Return type:

array, shape (n_test_samples)
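Because y_train must use the {-1, 1} encoding, labels stored in another form (for example 0/1) need to be mapped first. A small sketch of one way to do that follows; encode_labels and its positive argument are hypothetical names, not part of fastFM.

```python
def encode_labels(labels, positive):
    """Map binary labels to {-1, 1}; `positive` names the label that becomes 1."""
    if len(set(labels)) > 2:
        raise ValueError("expected at most two distinct labels")
    return [1 if label == positive else -1 for label in labels]


y_raw = [0, 1, 1, 0]
y_train = encode_labels(y_raw, positive=1)
```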

fit_predict_proba(X_train, y_train, X_test)[source]

Return average class probabilities of posterior estimates of the test samples. Use only with MCMC!

Parameters:
  • X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y_train (array, shape (n_samples)) – the targets have to be encoded as {-1, 1}.
  • X_test (scipy.sparse.csc_matrix, (n_test_samples, n_features)) –
Returns:

y_pred – Returns probability estimates for the class with lowest classification label.

Return type:

array, shape (n_test_samples)
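"Average class probabilities of posterior estimates" can be read as a per-sample mean over the draws produced by the sampler. The sketch below illustrates only that averaging step on made-up numbers. It assumes every draw yields one probability per test sample, and it is not taken from the fastFM implementation.

```python
def average_draws(draws):
    """Average per-sample predictions column-wise over posterior draws."""
    n_draws = len(draws)
    return [sum(column) / n_draws for column in zip(*draws)]


# Three hypothetical draws, each with probabilities for two test samples.
draws = [
    [0.25, 0.5],
    [0.75, 0.5],
    [0.5, 0.5],
]
mean_proba = average_draws(draws)
```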

get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:T – The labels are returned for classification.
Return type:array, shape (n_samples)
set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self
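The <component>__<parameter> convention can be illustrated by splitting a parameter name at the first double underscore. This is a simplified sketch of the naming scheme only, not the estimator's actual code; split_param and the step name "fm" are hypothetical.

```python
def split_param(name):
    """Split 'component__parameter' into its two parts.

    Returns (None, name) for a plain parameter with no component prefix.
    """
    component, sep, parameter = name.partition("__")
    if not sep:
        return None, name
    return component, parameter


nested = split_param("fm__rank")
plain = split_param("rank")
```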
class fastFM.mcmc.FMRegression(n_iter=100, init_stdev=0.1, rank=8, random_state=123, copy_X=True)[source]

Factorization Machine Regression with an MCMC solver.

Parameters:
  • n_iter (int, optional) – The number of samples for the MCMC sampler, number of iterations over the training set for ALS and number of steps for SGD.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.

fit_predict(X_train, y_train, X_test, n_more_iter=0)[source]

Return average of posterior estimates of the test samples.

Parameters:
  • X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y_train (array, shape (n_samples)) –
  • X_test (scipy.sparse.csc_matrix, (n_test_samples, n_features)) –
  • n_more_iter (int) – Number of iterations to continue from the current coefficients.
Returns:

T

Return type:

array, shape (n_test_samples)

get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:T – The predicted target values.
Return type:array, shape (n_samples)
set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self

The ALS module

class fastFM.als.FMClassification(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=None)[source]

Factorization Machine Classification trained with an ALS (coordinate descent) solver.

Parameters:
  • n_iter (int, optional) – The number of samples for the MCMC sampler, number of iterations over the training set for ALS and number of steps for SGD.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
  • l2_reg_w (float) – L2 penalty weight for linear coefficients.
  • l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
  • l2_reg (float) – L2 penalty weight for all coefficients (default=0).
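The two penalty weights apply to different coefficient groups: l2_reg_w to the linear coefficients and l2_reg_V to the pairwise factors. A minimal sketch of such a penalty term is shown below. The exact scaling used by the solver (for instance a factor of 1/2) and the handling of the bias term are not stated here, so this form is an assumption; l2_penalty is a hypothetical helper.

```python
def l2_penalty(w, V, l2_reg_w, l2_reg_V):
    """Sum of squared coefficients, weighted separately for w and V."""
    w_term = l2_reg_w * sum(w_i ** 2 for w_i in w)
    V_term = l2_reg_V * sum(v ** 2 for row in V for v in row)
    return w_term + V_term


penalty = l2_penalty([1.0, -2.0], [[0.5, 0.5]], l2_reg_w=0.1, l2_reg_V=0.1)
```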
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.

fit(X_train, y_train, n_more_iter=0)[source]

Fit model with specified loss.

Parameters:
  • X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y_train (float | ndarray, shape = (n_samples, )) – the targets have to be encoded as {-1, 1}.
  • n_more_iter (int) – Number of iterations to continue from the current coefficients.
get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:y – Class labels
Return type:array, shape (n_samples)
predict_proba(X_test)

Return probabilities

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:y – Class probability for the class with the smaller label.
Return type:array, shape (n_samples)
score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:
  • X (array-like, shape = (n_samples, n_features)) – Test samples.
  • y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True labels for X.
  • sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns:

score – Mean accuracy of self.predict(X) wrt. y.

Return type:

float
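Mean accuracy is the fraction of samples whose predicted label equals the true label. The sketch below shows the unweighted case only; it ignores sample_weight, and mean_accuracy is a hypothetical stand-in rather than the method's implementation.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


acc = mean_accuracy([1, -1, 1, 1], [1, -1, -1, 1])
```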

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self
class fastFM.als.FMRegression(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=0)[source]

Factorization Machine Regression trained with an ALS (coordinate descent) solver.

Parameters:
  • n_iter (int, optional) – The number of samples for the MCMC sampler, number of iterations over the training set for ALS and number of steps for SGD.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
  • l2_reg_w (float) – L2 penalty weight for linear coefficients.
  • l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
  • l2_reg (float) – L2 penalty weight for all coefficients (default=0).
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.

fit(X_train, y_train, n_more_iter=0)[source]

Fit model with specified loss.

Parameters:
  • X_train (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y_train (float | ndarray, shape = (n_samples, )) –
  • n_more_iter (int) – Number of iterations to continue from the current coefficients.
get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:T – The predicted target values.
Return type:array, shape (n_samples)
score(X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

Parameters:
  • X (array-like, shape = (n_samples, n_features)) – Test samples.
  • y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True values for X.
  • sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns:

score – R^2 of self.predict(X) wrt. y.

Return type:

float
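The definition R^2 = 1 - u/v can be checked by hand on a few values. The sketch below follows that formula for the unweighted, single-output case; r_squared is a hypothetical helper, and it does not guard against v = 0 (all true values identical).

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - u/v, without sample weights."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0]
perfect = r_squared(y_true, [1.0, 2.0, 3.0])
constant = r_squared(y_true, [2.0, 2.0, 2.0])
reversed_fit = r_squared(y_true, [3.0, 2.0, 1.0])
```

A perfect fit gives 1.0, predicting the mean everywhere gives 0.0, and the reversed predictions give a negative score.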

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self

The SGD module

class fastFM.sgd.FMClassification(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0, l2_reg_V=0, l2_reg=None, step_size=0.1)[source]

Factorization Machine Classification trained with a stochastic gradient descent solver.

Parameters:
  • n_iter (int, optional) – The number of iterations over individual samples.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
  • l2_reg_w (float) – L2 penalty weight for linear coefficients.
  • l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
  • l2_reg (float) – L2 penalty weight for all coefficients (default=0).
  • step_size (float) – Step size for the SGD solver. The solver uses a fixed step size and might require tuning of the number of iterations n_iter.
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.
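The interplay between n_iter and step_size can be illustrated with a toy solver: every iteration updates the coefficients from a single sample, and the step size never decays. The sketch below uses a bias-free linear least-squares model instead of a factorization machine to stay short, and it cycles through the samples in order. Both choices are assumptions for illustration and say nothing about how fastFM selects samples internally.

```python
def sgd_fixed_step(X, y, n_iter, step_size):
    """Least-squares SGD for a bias-free linear model, one sample per iteration."""
    w = [0.0] * len(X[0])
    for t in range(n_iter):
        i = t % len(X)  # assumption: visit samples cyclically
        x_i = X[i]
        error = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) - y[i]
        # Gradient of 0.5 * error**2 with respect to w is error * x_i.
        w = [w_j - step_size * error * x_j for w_j, x_j in zip(w, x_i)]
    return w


X = [[1.0], [2.0]]
y = [3.0, 6.0]  # generated by w = 3
w_short = sgd_fixed_step(X, y, n_iter=2, step_size=0.1)
w_long = sgd_fixed_step(X, y, n_iter=50, step_size=0.1)
```

With only two single-sample updates the estimate is still far from 3, while fifty updates bring it close, which is why n_iter usually has to be tuned together with step_size.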

fit(X, y)[source]

Fit model with specified loss.

Parameters:
  • X (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y (float | ndarray, shape = (n_samples, )) – the targets have to be encoded as {-1, 1}.
get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:y – Class labels
Return type:array, shape (n_samples)
predict_proba(X_test)

Return probabilities

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:y – Class probability for the class with the smaller label.
Return type:array, shape (n_samples)
score(X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:
  • X (array-like, shape = (n_samples, n_features)) – Test samples.
  • y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True labels for X.
  • sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns:

score – Mean accuracy of self.predict(X) wrt. y.

Return type:

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self
class fastFM.sgd.FMRegression(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=0, step_size=0.1)[source]

Factorization Machine Regression trained with a stochastic gradient descent solver.

Parameters:
  • n_iter (int, optional) – The number of iterations over individual samples.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
  • l2_reg_w (float) – L2 penalty weight for linear coefficients.
  • l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
  • l2_reg (float) – L2 penalty weight for all coefficients (default=0).
  • step_size (float) – Step size for the SGD solver. The solver uses a fixed step size and might require tuning of the number of iterations n_iter.
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.

fit(X, y)[source]

Fit model with specified loss.

Parameters:
  • X (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • y (float | ndarray, shape = (n_samples, )) –
get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:T – The predicted target values.
Return type:array, shape (n_samples)
score(X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

Parameters:
  • X (array-like, shape = (n_samples, n_features)) – Test samples.
  • y (array-like, shape = (n_samples) or (n_samples, n_outputs)) – True values for X.
  • sample_weight (array-like, shape = [n_samples], optional) – Sample weights.
Returns:

score – R^2 of self.predict(X) wrt. y.

Return type:

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self

The Ranking module

class fastFM.bpr.FMRecommender(n_iter=100, init_stdev=0.1, rank=8, random_state=123, l2_reg_w=0.1, l2_reg_V=0.1, l2_reg=0, step_size=0.1)[source]

Factorization Machine Recommender with pairwise (BPR) loss solver.

Parameters:
  • n_iter (int, optional) – The number of iterations over individual samples.
  • init_stdev (float, optional) – Standard deviation used to initialize the parameters.
  • random_state (int, optional) – The seed of the pseudo-random number generator that initializes the parameters and the MCMC chain.
  • rank (int) – The rank of the factorization used for the second order interactions.
  • l2_reg_w (float) – L2 penalty weight for linear coefficients.
  • l2_reg_V (float) – L2 penalty weight for pairwise coefficients.
  • l2_reg (float) – L2 penalty weight for all coefficients (default=0).
  • step_size (float) – Step size for the SGD solver. The solver uses a fixed step size and might require tuning of the number of iterations n_iter.
w0_

float – bias term

w_

float | array, shape = (n_features) – Coefficients for linear combination.

V_

float | array, shape = (rank_pair, n_features) – Coefficients of second order factor matrix.

fit(X, pairs)[source]

Fit model with specified loss.

Parameters:
  • X (scipy.sparse.csc_matrix, (n_samples, n_features)) –
  • pairs (float | ndarray, shape = (n_compares, 2)) – Each row i defines a pair of samples such that the first returns a higher value than the second: FM(X[i, 0]) > FM(X[i, 1]).
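One way to produce the comparison pairs is to derive them from known relevance scores: every pair of samples with different scores contributes one row, with the preferred sample first. The sketch below assumes each entry of a pair is a row index into X, which is an interpretation of the description above; build_pairs and the example scores are hypothetical.

```python
def build_pairs(scores):
    """Return (preferred, other) index pairs for all samples with unequal scores."""
    pairs = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if scores[i] > scores[j]:
                pairs.append((i, j))
    return pairs


# Hypothetical relevance scores for three rows of X.
pairs = build_pairs([3, 1, 2])
```

The result has shape (n_compares, 2) once converted to an array; ties produce no pair, so n_compares can be smaller than the number of possible combinations.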
get_params(deep=True)

Get parameters for this estimator.

Parameters:deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:params – Parameter names mapped to their values.
Return type:mapping of string to any
predict(X_test)

Return predictions

Parameters:X_test (scipy.sparse.csc_matrix, (n_samples, n_features)) –
Returns:T – The labels are returned for classification.
Return type:array, shape (n_samples)
set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
Return type:self