flexible_linear module¶
Regularized linear regression with custom training and regularization costs.
FlexibleLinearRegression is a scikit-learn-compatible linear regression estimator that allows specification of arbitrary training and regularization cost functions.
For a linear model
\[y \approx X W \,,\]
this model attempts to find \(W\) by minimizing
\[\textrm{cost}(X W - y) + C \cdot \textrm{reg\_cost}(W)\]
for given training data \(X, y\). Here \(C\) is the regularization strength and \(\textrm{cost}\) and \(\textrm{reg\_cost}\) are customizable cost functions (e.g., the squared \(\ell^2\) norm or the \(\ell^1\) norm).
Note: In practice, an intercept (bias coefficient) is fit as well; think of \(X\) in the above as having an extra column of 1’s.
Ideally, the cost functions should be convex and continuously differentiable.
We provide some cost functions: see l1_cost_func(), l2_cost_func(), and japanese_cost_func() (or the cost_func_dict dictionary). If you want to use a custom cost function, it should be of the form:
def custom_cost_func(z, **opts):
    # <code to compute cost and gradient>
    return cost, gradient
where cost is a float, gradient is an array of the same dimensions as z, and you may specify any number of keyword arguments.
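For example, a custom cost following this interface might be a scaled squared-error cost (a sketch; the name and the `weight` keyword are illustrative, not part of the module):

```python
import numpy as np

def scaled_l2_cost_func(z, weight=1.0):
    # Hypothetical custom cost: cost(z) = (weight / n) * sum(z_i^2),
    # with gradient (2 * weight / n) * z. `weight` is an illustrative
    # keyword option of the kind **opts is meant to carry.
    z = np.asarray(z, dtype=float)
    n = z.size
    cost = weight * np.sum(z ** 2) / n
    gradient = 2.0 * weight * z / n
    return cost, gradient
```
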
exception flexible_linear.FitError(message, res)[source]¶
Bases: Exception
Exception raised when fitting fails.
message¶
str – Error message.
res¶
scipy.optimize.OptimizeResult – Results returned by scipy.optimize.minimize. See the SciPy documentation on OptimizeResult for details.
class flexible_linear.FlexibleLinearRegression(C=1.0, cost_func='l2', cost_opts=None, reg_cost_func='l2', reg_cost_opts=None)[source]¶
Bases: sklearn.base.BaseEstimator
Regularized linear regression with custom training/regularization costs.
Parameters: - C (Optional[float]) – Nonnegative regularization coefficient. (Zero means no regularization.)
- cost_func (Optional[callable or str]) – Training cost function. If not callable, should be one of ‘l1’, ‘l2’, or ‘japanese’.
- cost_opts (Optional[dict]) – Parameters to pass to cost_func.
- reg_cost_func (Optional[callable or str]) – Regularization cost function. If not callable, should be one of ‘l1’, ‘l2’, or ‘japanese’.
- reg_cost_opts (Optional[dict]) – Parameters to pass to reg_cost_func.
coef_¶
ndarray – Weight vector of shape (n_features+1,). (coef_[0] is the intercept coefficient.)
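To make the fit problem concrete, here is a self-contained sketch of the objective under the default squared \(\ell^2\) costs, solved directly with scipy.optimize.minimize. This is not the module's implementation; in particular, whether the intercept coefficient is regularized is an assumption here (it does not matter for C=0):

```python
import numpy as np
from scipy.optimize import minimize

def l2_cost(z):
    # Squared l2 cost normalized by length, with its gradient.
    z = np.asarray(z, dtype=float)
    return np.sum(z ** 2) / z.size, 2.0 * z / z.size

def fit_objective(W, X1, y, C):
    # X1 is X with a prepended column of ones, so W[0] is the intercept.
    cost, grad = l2_cost(X1 @ W - y)
    reg, reg_grad = l2_cost(W)  # assumption: intercept regularized too
    return cost + C * reg, X1.T @ grad + C * reg_grad

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.0, -2.0]) + 0.5          # noiseless linear data
X1 = np.hstack([np.ones((len(X), 1)), X])    # intercept column
res = minimize(fit_objective, np.zeros(3), args=(X1, y, 0.0), jac=True)
# With C=0 this reduces to ordinary least squares on noiseless data,
# so res.x recovers approximately [0.5, 1.0, -2.0].
```
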
flexible_linear.cost_func_dict = {'l2': <function l2_cost_func>, 'japanese': <function japanese_cost_func>, 'l1': <function l1_cost_func>}¶
Dictionary of implemented cost functions.
flexible_linear.japanese_cost_func(z, eta=0.1)[source]¶
‘Japanese bracket’ cost and gradient
Computes cost and gradient for the cost function:
\[\mathrm{cost}(z) = \frac{\eta^2}{n} \sum_{i=1}^n \left( \sqrt{ 1 + \left( \frac{z_i}{\eta} \right)^2 } - 1 \right) \,.\]
This cost function interpolates componentwise between the squared \(\ell^2\) norm (for \(|z_i| \ll \eta\)) and the \(\ell^1\) norm (for \(|z_i| \gg \eta\)) and is thus useful for reducing the impact of outliers (or when dealing with heavy-tailed rather than Gaussian noise). Unlike the \(\ell^1\) norm, this cost function is smooth.
The key to understanding this is that the Japanese bracket
\[\langle z \rangle := \sqrt{ 1 + |z|^2 }\]satisfies these asymptotics:
\[\begin{split}\sqrt{ 1 + |z|^2 } - 1 \approx \begin{cases} \frac12 |z|^2 & \text{for $|z| \ll 1$} \\ |z| & \text{for $|z| \gg 1$} \end{cases} \,.\end{split}\]
Parameters: - z (ndarray) – Input vector.
- eta (Optional[float]) – Positive scale parameter.
Returns: Tuple[float, ndarray] – The cost and gradient (same shape as z).
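Following the formula above, this cost and its gradient can be sketched directly in NumPy (an illustrative reimplementation, not the module's source):

```python
import numpy as np

def japanese_cost(z, eta=0.1):
    # cost(z) = (eta^2 / n) * sum(sqrt(1 + (z_i/eta)^2) - 1);
    # differentiating each term gives gradient_i = z_i / (n * sqrt(1 + (z_i/eta)^2)).
    z = np.asarray(z, dtype=float)
    n = z.size
    root = np.sqrt(1.0 + (z / eta) ** 2)
    cost = eta ** 2 * np.sum(root - 1.0) / n
    gradient = z / (n * root)
    return cost, gradient
```

Note that the gradient is bounded by \(1/n\) in each component, which is what damps the influence of large outlier residuals.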
flexible_linear.l1_cost_func(z)[source]¶
Normalized \(\ell^1\) cost and gradient
\[\mathrm{cost}(z) = \frac{1}{n} ||z||_{\ell^1} = \frac{1}{n} \sum_{i=1}^n |z_i| \,.\]
Note: This cost is not differentiable. For a smooth alternative, see japanese_cost_func().
Parameters: z (ndarray) – Input vector.
Returns: Tuple[float, ndarray] – The cost and gradient (same shape as z).
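A sketch of this cost per the formula, using np.sign(z)/n as a subgradient (zero at the kink); the module's actual handling of the nondifferentiable point may differ:

```python
import numpy as np

def l1_cost(z):
    # Normalized l1 cost: cost(z) = sum(|z_i|) / n.
    # np.sign supplies a subgradient, since |z_i| is not
    # differentiable at z_i = 0.
    z = np.asarray(z, dtype=float)
    n = z.size
    return np.sum(np.abs(z)) / n, np.sign(z) / n
```
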