flexible_linear module

Regularized linear regression with custom training and regularization costs.

FlexibleLinearRegression is a scikit-learn-compatible linear regression estimator that allows specification of arbitrary training and regularization cost functions.

For a linear model:

\[\textrm{predictions} = X \cdot W\]

this model attempts to find \(W\) by minimizing:

\[\min_{W} \left\{ \textrm{cost}(X \cdot W - y) + C \cdot \textrm{reg\_cost}(W) \right\}\]

for given training data \(X, y\). Here \(C\) is the regularization strength, and \(\textrm{cost}\) and \(\textrm{reg\_cost}\) are customizable cost functions (e.g., the squared \(\ell^2\) norm or the \(\ell^1\) norm).

Note: In reality, we fit an intercept (bias coefficient) as well. Think of \(X\) in the above as having an extra column of 1’s.

Ideally, the cost functions should be convex and continuously differentiable.

We provide some cost functions: see l1_cost_func(), l2_cost_func(), japanese_cost_func() (or the cost_func_dict dictionary). If you want to use a custom cost function, it should be of the form:

def custom_cost_func(z, **opts):
    # <code to compute cost and gradient>
    return cost, gradient

where cost is a float, gradient is an array of the same dimensions as z, and you may specify any number of keyword arguments.
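
For example, a hypothetical custom cost implementing a normalized p-th power of the residuals might look like the following (the name and the exponent option are illustrative; p would be passed via cost_opts or reg_cost_opts):

import numpy as np

def power_cost_func(z, p=2.0):
    # Hypothetical example: cost(z) = (1/(p*n)) * sum_i |z_i|^p,
    # convex for p >= 1 and smooth for p > 1.
    n = z.size
    cost = np.sum(np.abs(z) ** p) / (p * n)
    # gradient: sign(z_i) * |z_i|^(p-1) / n
    gradient = np.sign(z) * np.abs(z) ** (p - 1) / n
    return cost, gradient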

exception flexible_linear.FitError(message, res)[source]

Bases: Exception

Exception raised when fitting fails.

message

str – Error message.

res

scipy.optimize.OptimizeResult – Results returned by scipy.optimize.minimize. See SciPy documentation on OptimizeResult for details.
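
A sketch of catching a failed fit and inspecting the optimizer's diagnostics (the random data here is purely illustrative):

import numpy as np
from flexible_linear import FlexibleLinearRegression, FitError

X = np.random.randn(20, 3)
y = np.random.randn(20)

try:
    model = FlexibleLinearRegression().fit(X, y)
except FitError as err:
    print(err.message)       # human-readable error message
    print(err.res.message)   # diagnostics from scipy.optimize.minimize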

class flexible_linear.FlexibleLinearRegression(C=1.0, cost_func='l2', cost_opts=None, reg_cost_func='l2', reg_cost_opts=None)[source]

Bases: sklearn.base.BaseEstimator

Regularized linear regression with custom training/regularization costs.

Parameters:
  • C (Optional[float]) – Nonnegative regularization coefficient. (Zero means no regularization.)
  • cost_func (Optional[callable or str]) – Training cost function. If not callable, should be one of ‘l1’, ‘l2’, or ‘japanese’.
  • cost_opts (Optional[dict]) – Parameters to pass to cost_func.
  • reg_cost_func (Optional[callable or str]) – Regularization cost function. If not callable, should be one of ‘l1’, ‘l2’, or ‘japanese’.
  • reg_cost_opts (Optional[dict]) – Parameters to pass to reg_cost_func.
coef_

ndarray – Weight vector of shape (n_features+1,). (coef_[0] is the intercept coefficient.)

fit(X, y)[source]

Fit the model.

Parameters:
  • X (ndarray) – Training data of shape (n_samples, n_features).
  • y (ndarray) – Target values of shape (n_samples,).
Returns:

self

Raises:

FitError – If the fitting failed.

predict(X)[source]

Predict using the model.

Parameters:
  • X (ndarray) – Data of shape (n_samples, n_features).
Returns:

y (ndarray) – Predicted values of shape (n_samples,).
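
A minimal end-to-end sketch of fitting and predicting (synthetic data; the hyperparameter values are illustrative, not recommendations):

import numpy as np
from flexible_linear import FlexibleLinearRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = 1.5 + X @ np.array([2.0, -1.0]) + 0.1 * rng.randn(100)

# Robust 'japanese' training cost with a mild l2 regularizer.
model = FlexibleLinearRegression(C=0.01, cost_func='japanese',
                                 cost_opts={'eta': 0.1})
model.fit(X, y)

print(model.coef_)           # [intercept, weight_1, weight_2]
print(model.predict(X[:5]))  # predictions for the first five samples
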
flexible_linear.cost_func_dict = {'l1': <function l1_cost_func>, 'l2': <function l2_cost_func>, 'japanese': <function japanese_cost_func>}

Dictionary of implemented cost functions.

flexible_linear.japanese_cost_func(z, eta=0.1)[source]

‘Japanese bracket’ cost and gradient

Computes cost and gradient for the cost function:

\[\mathrm{cost}(z) = \frac{\eta^2}{n} \sum_{i=1}^n \left( \sqrt{ 1 + \left( \frac{z_i}{\eta} \right)^2 } - 1 \right) \,.\]

This cost function interpolates componentwise between the normalized squared \(\ell^2\) cost (for \(|z_i| \ll \eta\)) and, up to a factor of \(\eta\), the normalized \(\ell^1\) cost (for \(|z_i| \gg \eta\)); it is thus useful for reducing the impact of outliers (or when dealing with heavy-tailed rather than Gaussian noise). Unlike the \(\ell^1\) norm, this cost function is smooth.

The key to understanding this is that the Japanese bracket

\[\langle z \rangle := \sqrt{ 1 + |z|^2 }\]

satisfies these asymptotics:

\[\begin{split}\sqrt{ 1 + |z|^2 } - 1 \approx \begin{cases} \frac12 |z|^2 & \text{for $|z| \ll 1$} \\ |z| & \text{for $|z| \gg 1$} \end{cases} \,.\end{split}\]
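
Indeed, a Taylor expansion gives \(\sqrt{1 + |z|^2} = 1 + \tfrac12 |z|^2 + O(|z|^4)\) for \(|z| \ll 1\), while factoring out \(|z|\) gives \(\sqrt{1 + |z|^2} = |z| \sqrt{1 + |z|^{-2}} = |z| + O(|z|^{-1})\) for \(|z| \gg 1\).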
Parameters:
  • z (ndarray) – Input vector.
  • eta (Optional[float]) – Positive scale parameter.
Returns:

Tuple[float, ndarray] – The cost and gradient (same shape as z).
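
A sketch consistent with the formula above (the actual implementation may differ in details):

import numpy as np

def japanese_cost_func(z, eta=0.1):
    n = z.size
    bracket = np.sqrt(1.0 + (z / eta) ** 2)   # the Japanese bracket <z_i/eta>
    cost = eta ** 2 / n * np.sum(bracket - 1.0)
    # d/dz_i [ (eta^2/n) * (sqrt(1 + (z_i/eta)^2) - 1) ] = z_i / (n * bracket_i)
    gradient = z / (n * bracket)
    return cost, gradient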

flexible_linear.l1_cost_func(z)[source]

Normalized \(\ell^1\) cost and gradient

\[\mathrm{cost}(z) = \frac{1}{n} ||z||_{\ell^1} = \frac{1}{n} \sum_{i=1}^n |z_i| \,.\]

Note

This cost is not differentiable. For a smooth alternative, see japanese_cost_func().

Parameters:
  • z (ndarray) – Input vector.
Returns:

Tuple[float, ndarray] – The cost and gradient (same shape as z).
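
A sketch consistent with the formula, using the sign function as a (sub)gradient:

import numpy as np

def l1_cost_func(z):
    n = z.size
    cost = np.sum(np.abs(z)) / n
    gradient = np.sign(z) / n   # subgradient; equals 0 where z_i == 0
    return cost, gradient
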
flexible_linear.l2_cost_func(z)[source]

Normalized squared \(\ell^2\) cost and gradient

\[\mathrm{cost}(z) = \frac{1}{2n} ||z||_{\ell^2}^2 = \frac{1}{2n} \sum_{i=1}^n |z_i|^2 \,.\]
Parameters:
  • z (ndarray) – Input vector.
Returns:

Tuple[float, ndarray] – The cost and gradient (same shape as z).
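
A sketch consistent with the formula:

import numpy as np

def l2_cost_func(z):
    n = z.size
    cost = np.sum(z ** 2) / (2 * n)
    gradient = z / n
    return cost, gradient
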
flexible_linear.test_estimator()[source]
flexible_linear.test_gradients()[source]