kenchi.pipeline module

kenchi.pipeline.make_pipeline(*steps)[source]

Construct a Pipeline from the given estimators. This is a shorthand for the Pipeline constructor: it neither requires nor permits naming the estimators. Instead, each name is set automatically to the lowercased name of the estimator's type.

Parameters: *steps (list) – List of estimators.
Returns: p
Return type: Pipeline
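The automatic naming rule can be illustrated with a minimal, standard-library-only sketch. The helper `name_estimators` and the two empty classes are hypothetical stand-ins, not part of kenchi; the numeric suffix for repeated types is an assumption borrowed from scikit-learn's behaviour and is not stated on this page.

```python
from collections import Counter


class StandardScaler:  # hypothetical stand-in, not the scikit-learn class
    pass


class MiniBatchKMeans:  # hypothetical stand-in, not a kenchi class
    pass


def name_estimators(estimators):
    """Pair each estimator with the lowercased name of its type."""
    names = [type(est).__name__.lower() for est in estimators]
    totals = Counter(names)
    seen = Counter()
    named = []
    for name, est in zip(names, estimators):
        if totals[name] > 1:
            # Assumption: repeated types get a numeric suffix, as in scikit-learn.
            seen[name] += 1
            name = f"{name}-{seen[name]}"
        named.append((name, est))
    return named


steps = name_estimators([StandardScaler(), MiniBatchKMeans()])
step_names = [name for name, _ in steps]
print(step_names)
```

The resulting list of (name, estimator) pairs has the same shape as the `steps` argument that the Pipeline constructor expects.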
class kenchi.pipeline.Pipeline(steps, memory=None)[source]

Bases: sklearn.pipeline.Pipeline

Pipeline of transforms with a final estimator.

Parameters:
  • steps (list) – List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.
  • memory (instance of joblib.Memory or string, default None) – Used to cache the fitted transformers of the pipeline. By default, no caching is performed. If a string is given, it is the path to the caching directory. Enabling caching triggers a clone of the transformers before fitting. Therefore, the transformer instance given to the pipeline cannot be inspected directly. Use the attribute named_steps or steps to inspect estimators within the pipeline. Caching the transformers is advantageous when fitting is time consuming.
named_steps

dict – Read-only attribute to access any step by its user-given name. Keys are step names and values are the corresponding steps (estimators).

anomaly_score(X, normalize=False)[source]

Apply transforms, and compute the anomaly score for each sample with the final estimator.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data.
  • normalize (bool, default False) – If True, return the normalized anomaly score.
Returns:

anomaly_score – Anomaly score for each sample.

Return type:

array-like of shape (n_samples,)
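The "apply transforms, then score with the final estimator" flow can be sketched without any dependencies. Everything below is illustrative: `Center`, `DistanceScorer`, and `pipeline_anomaly_score` are hypothetical toy names, and the squared-distance score is a placeholder rather than the scoring rule of any kenchi estimator.

```python
class Center:
    """Toy transformer: subtracts the per-feature means learned in fit."""

    def fit(self, X):
        n_samples = len(X)
        self.mean_ = [sum(col) / n_samples for col in zip(*X)]
        return self

    def transform(self, X):
        return [[x - m for x, m in zip(row, self.mean_)] for row in X]


class DistanceScorer:
    """Toy final estimator: squared distance from the origin as the score."""

    def fit(self, X):
        return self

    def anomaly_score(self, X):
        return [sum(x * x for x in row) for row in X]


def pipeline_anomaly_score(transformers, final_estimator, X):
    """Pass X through every fitted transformer, then score the result."""
    Xt = X
    for transformer in transformers:
        Xt = transformer.transform(Xt)
    return final_estimator.anomaly_score(Xt)


X_train = [[0.0, 0.0], [2.0, 2.0], [1.0, 1.0]]
center = Center().fit(X_train)
scorer = DistanceScorer().fit(center.transform(X_train))

scores = pipeline_anomaly_score([center], scorer, [[1.0, 1.0], [4.0, 1.0]])
print(scores)  # one score per input sample
```

The output has one entry per sample, matching the documented return shape (n_samples,).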

featurewise_anomaly_score(X)[source]

Apply transforms, and compute the feature-wise anomaly scores for each sample with the final estimator.

Parameters: X (array-like of shape (n_samples, n_features)) – Data.
Returns: anomaly_score – Feature-wise anomaly scores for each sample.
Return type: array-like of shape (n_samples, n_features)
Raises: ValueError
plot_anomaly_score(X, **kwargs)[source]

Apply transforms, and plot the anomaly score for each sample with the final estimator.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data.
  • ax (matplotlib Axes, default None) – Target axes instance.
  • bins (int, str or array-like, default 'auto') – Number of histogram bins.
  • figsize (tuple, default None) – Tuple denoting figure size of the plot.
  • filename (str, default None) – If provided, save the current figure.
  • hist (bool, default True) – If True, plot a histogram of anomaly scores.
  • kde (bool, default True) – If True, plot a Gaussian kernel density estimate.
  • title (string, default None) – Axes title. To disable, pass None.
  • xlabel (string, default 'Samples') – X axis title label. To disable, pass None.
  • xlim (tuple, default None) – Tuple passed to ax.set_xlim.
  • ylabel (string, default 'Anomaly score') – Y axis title label. To disable, pass None.
  • ylim (tuple, default None) – Tuple passed to ax.set_ylim.
  • **kwargs (dict) – Other keywords passed to ax.plot.
Returns:

ax – Axes on which the plot was drawn.

Return type:

matplotlib Axes

plot_graphical_model

Apply transforms, and plot the Gaussian Graphical Model (GGM) with the final estimator.

Parameters:
  • ax (matplotlib Axes, default None) – Target axes instance.
  • figsize (tuple, default None) – Tuple denoting figure size of the plot.
  • filename (str, default None) – If provided, save the current figure.
  • random_state (int, RandomState instance, default None) – Seed of the pseudo random number generator.
  • title (string, default 'GGM (n_clusters, n_features, n_isolates)') – Axes title. To disable, pass None.
  • **kwargs (dict) – Other keywords passed to nx.draw_networkx.
Returns:

ax – Axes on which the plot was drawn.

Return type:

matplotlib Axes
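This page does not say how the final estimator derives the graph. As background, in a Gaussian graphical model two features are joined by an edge exactly when the corresponding off-diagonal entry of the precision (inverse covariance) matrix is nonzero. The sketch below applies that textbook definition; `graph_from_precision` and its `tol` threshold are hypothetical, and "isolates" is taken to mean nodes with no edges.

```python
def graph_from_precision(precision, tol=1e-8):
    """Return (edges, isolates) implied by a symmetric precision matrix."""
    n_features = len(precision)
    edges = [
        (i, j)
        for i in range(n_features)
        for j in range(i + 1, n_features)
        if abs(precision[i][j]) > tol  # nonzero entry => conditional dependence
    ]
    connected = {node for edge in edges for node in edge}
    isolates = [i for i in range(n_features) if i not in connected]
    return edges, isolates


# Hypothetical 4-feature precision matrix; feature 3 depends on nothing else.
precision = [
    [2.0, -0.5, 0.0, 0.0],
    [-0.5, 2.0, 0.3, 0.0],
    [0.0, 0.3, 2.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
edges, isolates = graph_from_precision(precision)
print(edges, isolates)
```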

plot_partial_corrcoef

Apply transforms, and plot the partial correlation coefficient matrix with the final estimator.

Parameters:
  • ax (matplotlib Axes, default None) – Target axes instance.
  • cbar (bool, default True) – Whether to draw a colorbar.
  • figsize (tuple, default None) – Tuple denoting figure size of the plot.
  • filename (str, default None) – If provided, save the current figure.
  • title (string, default 'Partial correlation') – Axes title. To disable, pass None.
  • **kwargs (dict) – Other keywords passed to ax.pcolormesh.
Returns:

ax – Axes on which the plot was drawn.

Return type:

matplotlib Axes
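For orientation, partial correlation coefficients are conventionally obtained from a precision matrix P as -P[i][j] / sqrt(P[i][i] * P[j][j]) for i != j. The sketch below implements that standard formula in plain Python; `partial_corrcoef` is a hypothetical helper, the unit diagonal is one common convention, and whether kenchi's estimators compute the matrix this way is an assumption.

```python
import math


def partial_corrcoef(precision):
    """Partial correlation matrix from a symmetric positive-definite precision matrix."""
    n_features = len(precision)
    scale = [math.sqrt(precision[i][i]) for i in range(n_features)]
    return [
        [
            1.0 if i == j else -precision[i][j] / (scale[i] * scale[j])
            for j in range(n_features)
        ]
        for i in range(n_features)
    ]


# Hypothetical precision matrix: features 0 and 1 are conditionally dependent.
precision = [
    [4.0, -2.0, 0.0],
    [-2.0, 4.0, 0.0],
    [0.0, 0.0, 1.0],
]
rho = partial_corrcoef(precision)
print(rho[0][1])
```

A negative off-diagonal precision entry yields a positive partial correlation, which is why the sign is flipped in the formula.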

plot_roc_curve(X, y, **kwargs)[source]

Apply transforms, and plot the Receiver Operating Characteristic (ROC) curve with the final estimator.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data.
  • y (array-like of shape (n_samples,)) – Labels.
  • ax (matplotlib Axes, default None) – Target axes instance.
  • figsize (tuple, default None) – Tuple denoting figure size of the plot.
  • filename (str, default None) – If provided, save the current figure.
  • title (string, default 'ROC curve') – Axes title. To disable, pass None.
  • xlabel (string, default 'FPR') – X axis title label. To disable, pass None.
  • ylabel (string, default 'TPR') – Y axis title label. To disable, pass None.
  • **kwargs (dict) – Other keywords passed to ax.plot.
Returns:

ax – Axes on which the plot was drawn.

Return type:

matplotlib Axes
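The FPR and TPR values on the two axes can be computed from labels and anomaly scores as in the following dependency-free sketch. `roc_points` is a hypothetical helper; it assumes that label 1 marks an anomaly, label 0 a normal sample, and that higher scores mean "more anomalous". The label encoding the library expects is not specified on this page.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in y_true")

    # Visit samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for k, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = k + 1 == len(order)
        if is_last or scores[order[k + 1]] != scores[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, scores)
print(list(zip(fpr, tpr)))
```

Plotting `tpr` against `fpr` gives the curve; a perfect ranking passes through the point (0, 1).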