class kenchi.outlier_detection.base.BaseOutlierDetector

Bases: sklearn.base.BaseEstimator, abc.ABC

Base class for all outlier detectors in kenchi.

References

[1]	Kriegel, H.-P., Kröger, P., Schubert, E., and Zimek, A., “Interpreting and unifying outlier scores,” In Proceedings of SDM, pp. 13-24, 2011.

anomaly_score(X=None, normalize=False)

Compute the anomaly score for each sample.

Parameters:	X (array-like of shape (n_samples, n_features), default None) – Data. If None, compute the anomaly score for each training sample.
	normalize (bool, default False) – If True, return the normalized anomaly score.
Returns:	anomaly_score – Anomaly score for each sample.
Return type:	array-like of shape (n_samples,)
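
The `normalize` option presumably relates to the score unification described in [1]. As a minimal sketch, assuming Gaussian scaling (one of the schemes discussed there; this page does not state which scheme kenchi applies), raw scores can be mapped into the range [0, 1]. The helper name `gaussian_scaling` and the sample values are hypothetical and not part of kenchi.

```python
import math
import statistics


def gaussian_scaling(scores):
    """Map raw anomaly scores to [0, 1] via Gaussian scaling (assumed scheme)."""
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0.0:
        # All scores are identical, so no sample stands out.
        return [0.0 for _ in scores]
    # Scores at or below the mean are clipped to 0; large scores approach 1.
    return [
        max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2.0))))
        for s in scores
    ]


raw = [0.1, 0.2, 0.15, 0.12, 3.0]  # made-up raw scores, last one is extreme
normalized = gaussian_scaling(raw)
```

Under this scheme only samples scoring above the mean receive a nonzero normalized score, which makes scores from different detectors roughly comparable.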

decision_function(X=None, threshold=None)

Compute the decision function of the given samples.

Parameters:	X (array-like of shape (n_samples, n_features), default None) – Data. If None, compute the decision function of the training samples.
	threshold (float, default None) – User-provided threshold.
Returns:	shifted_score_samples – Shifted opposite of the anomaly score for each sample. Negative scores represent outliers and positive scores represent inliers.
Return type:	array-like of shape (n_samples,)
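
Read literally, "shifted opposite of the anomaly score" suggests negating the score and shifting it by the threshold, so that zero falls exactly at the threshold. The following is a minimal sketch under that assumption; `shifted_opposite` is a hypothetical helper, not a kenchi function, and the scores are made up.

```python
def shifted_opposite(anomaly_scores, threshold):
    """Return threshold - score for each sample (assumed reading of the docs)."""
    return [threshold - s for s in anomaly_scores]


scores = [0.2, 0.9, 1.5]  # made-up anomaly scores
decision = shifted_opposite(scores, threshold=1.0)
# Samples scoring above the threshold get a negative value (outliers);
# samples scoring below it get a positive value (inliers).
```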

fit(X, y=None)

Fit the model according to the given training data.

Parameters:	X (array-like of shape (n_samples, n_features)) – Training data.
	y (ignored) – Not used.
Returns:	self – Return self.
Return type:	object

fit_predict(X, y=None)

Fit the model according to the given training data and predict whether each training sample is an outlier.

Parameters:	X (array-like of shape (n_samples, n_features)) – Training data.
	y (ignored) – Not used.
Returns:	y_pred – Return -1 for outliers and +1 for inliers.
Return type:	array-like of shape (n_samples,)

plot_anomaly_score(X=None, normalize=False, **kwargs)

Plot the anomaly score for each sample.

Parameters:	X (array-like of shape (n_samples, n_features), default None) – Data. If None, plot the anomaly score for each training sample.
	normalize (bool, default False) – If True, plot the normalized anomaly score.
	ax (matplotlib Axes, default None) – Target axes instance.
	bins (int, str or array-like, default 'auto') – Number of histogram bins.
	figsize (tuple, default None) – Tuple denoting the figure size of the plot.
	filename (str, default None) – If provided, save the current figure.
	hist (bool, default True) – If True, plot a histogram of anomaly scores.
	kde (bool, default True) – If True, plot a Gaussian kernel density estimate.
	title (string, default None) – Axes title. To disable, pass None.
	xlabel (string, default 'Samples') – X axis title label. To disable, pass None.
	xlim (tuple, default None) – Tuple passed to `ax.set_xlim`.
	ylabel (string, default 'Anomaly score') – Y axis title label. To disable, pass None.
	ylim (tuple, default None) – Tuple passed to `ax.set_ylim`.
	**kwargs (dict) – Other keywords passed to `ax.plot`.
Returns:	ax – Axes on which the plot was drawn.
Return type:	matplotlib Axes

plot_roc_curve(X, y, **kwargs)

Plot the Receiver Operating Characteristic (ROC) curve.

Parameters:	X (array-like of shape (n_samples, n_features)) – Data.
	y (array-like of shape (n_samples,)) – Labels.
	ax (matplotlib Axes, default None) – Target axes instance.
	figsize (tuple, default None) – Tuple denoting the figure size of the plot.
	filename (str, default None) – If provided, save the current figure.
	title (string, default 'ROC curve') – Axes title. To disable, pass None.
	xlabel (string, default 'FPR') – X axis title label. To disable, pass None.
	ylabel (string, default 'TPR') – Y axis title label. To disable, pass None.
	**kwargs (dict) – Other keywords passed to `ax.plot`.
Returns:	ax – Axes on which the plot was drawn.
Return type:	matplotlib Axes

predict(X=None, threshold=None)

Predict whether each sample is an outlier.

Parameters:	X (array-like of shape (n_samples, n_features), default None) – Data. If None, predict whether each training sample is an outlier.
	threshold (float, default None) – User-provided threshold.
Returns:	y_pred – Return -1 for outliers and +1 for inliers.
Return type:	array-like of shape (n_samples,)
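
The -1/+1 labels follow from the sign convention of `decision_function` (negative for outliers, positive for inliers). A minimal sketch of that mapping is shown here; `to_labels` is a hypothetical helper, and treating a value of exactly zero as an inlier is an assumption this page does not confirm.

```python
def to_labels(decision_values):
    """Map decision values to +1 (inlier) or -1 (outlier)."""
    # Assumption: a decision value of exactly 0 counts as an inlier.
    return [1 if d >= 0 else -1 for d in decision_values]


labels = to_labels([0.8, 0.1, -0.5])  # made-up decision values
```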

predict_proba(X=None)

Predict class probabilities for each sample.

Parameters:	X (array-like of shape (n_samples, n_features), default None) – Data. If None, predict class probabilities for each training sample.
Returns:	y_score – Class probabilities.
Return type:	array-like of shape (n_samples, n_classes)
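
With two classes (inlier and outlier), the returned array has one row per sample and one column per class. As a rough sketch, assuming the outlier probability is a score already normalized to [0, 1] and that the columns are ordered inlier first, outlier second (neither detail is stated on this page), the rows could be assembled as follows. `stack_probabilities` is a hypothetical helper.

```python
def stack_probabilities(outlier_probs):
    """Build rows of [P(inlier), P(outlier)] from outlier probabilities."""
    # Assumption: column 0 is the inlier class, column 1 the outlier class.
    return [[1.0 - p, p] for p in outlier_probs]


proba = stack_probabilities([0.0, 0.25, 0.95])  # made-up probabilities
```

Each row sums to 1, so either column determines the other.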

score_samples(X=None)

Compute the opposite of the anomaly score for each sample.

Parameters:	X (array-like of shape (n_samples, n_features), default None) – Data. If None, compute the opposite of the anomaly score for each training sample.
Returns:	score_samples – Opposite of the anomaly score for each sample.
Return type:	array-like of shape (n_samples,)

to_pickle(filename, **kwargs)

Persist an outlier detector object.

Parameters:	filename (str or pathlib.Path) – Path of the file in which the detector is to be stored.
	**kwargs (dict) – Other keywords passed to `sklearn.externals.joblib.dump`.
Returns:	filenames – List of file names in which the data is stored.
Return type:	list