Example (1): Baseline Isolation Forest without Hyperparameter Tuning
======================================================================

Prerequisites
---------------

* Python 3.12 (recommended)
* Files on disk:

  * ``database/train_dataset_severson.db`` (benchmark labels per cycle)

* (Optional) LaTeX installation if you want Matplotlib to render text with
  LaTeX:

  * A TeX distribution (e.g., TeX Live/MacTeX/MiKTeX), dvipng, and fonts
    like cm-super.
  * Don't have LaTeX installed? Either install it, or set
    ``rcParams["text.usetex"] = False``.

Before running the example in the ``machine_learning/baseline_models``
section, please evaluate whether the global directory path specified in
``src/osbad/config.py`` needs to be updated:

.. code-block:: python

    # Modify this global directory path if needed
    PIPELINE_OUTPUT_DIR = Path.cwd().joinpath("artifacts_output_dir")

The following example of running a baseline Isolation Forest model (without
hyperparameter tuning) is also provided as a notebook in
``machine_learning/baseline_models/severson_data_source/ml_01_iforest_baseline_severson.ipynb``.

Step-1: Load libraries
---------------------------

Import the libraries into your local development environment, including the
``osbad`` library for benchmarking anomaly detection.

* ``Path`` is used for robust, cross-platform file paths.
* ``pprint`` pretty-prints data structures for readable diagnostics.
* ``duckdb`` is the embedded analytical database engine storing the dataset.
* ``optuna`` is a hyperparameter optimization framework (available for
  later tuning workflows).
* ``bconf``: project config utilities (e.g., where to write artifacts).
* ``BenchDB``: a thin layer around DuckDB that provides convenience loaders.
* ``CycleScaling``: implements the statistical feature transformation
  methods for scaling cycle data.
* ``ModelRunner``, ``hp``, ``modval``, ``bviz``: modeling,
  hyperparameters, model validation, and visualization helpers for the
  benchmarking study.

.. code-block:: python

    # Standard library
    from pathlib import Path
    import pprint

    # Third-party libraries
    import duckdb
    import pandas as pd
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    import numpy as np
    import optuna

    # Custom osbad library for anomaly detection
    import osbad.config as bconf
    import osbad.hyperparam as hp
    import osbad.modval as modval
    import osbad.viz as bviz
    from osbad.database import BenchDB
    from osbad.scaler import CycleScaling
    from osbad.model import ModelRunner

Step-2: Load Benchmarking Dataset
------------------------------------

* Define the path to the DuckDB database file using the ``DB_DIR`` from
  ``bconf``.
* Create a DuckDB connection (read-only) and load the full training dataset
  from the ``df_train_dataset_sv`` table.
* Retrieve the unique cell indices available in the training dataset.

.. code-block:: python

    # Path to database directory
    DB_DIR = bconf.DB_DIR

    db_filepath = DB_DIR.joinpath("train_dataset_severson.db")

    # Create a DuckDB connection
    con = duckdb.connect(
        db_filepath,
        read_only=True)

    # Load all training dataset from duckdb
    df_duckdb = con.execute(
        "SELECT * FROM df_train_dataset_sv").fetchdf()

    # Get the cell index of training dataset
    unique_cell_index_train = df_duckdb["cell_index"].unique()
    print(f"Unique cell index: {unique_cell_index_train}")

Step-3: Filter Dataset for a Selected Cell
---------------------------------------------

* Pick a specific cell based on ``selected_cell_label``, which identifies
  the experimental data corresponding to one unique cell.
* Create an artifacts folder for that cell, where you can save figures,
  tables, or model outputs related to this cell.
* Filter the loaded dataset for the selected cell only and extract the
  ground-truth outlier cycle indices.

.. code-block:: python

    # Get the cell-ID from cell_inventory
    selected_cell_label = "2017-05-12_5_4C-70per_3C_CH17"

    # Create a subfolder to store fig output
    # corresponding to each cell-index
    selected_cell_artifacts_dir = bconf.artifacts_output_dir(
        selected_cell_label)

    # Filter dataset for specific selected cell only
    df_selected_cell = df_duckdb[
        df_duckdb["cell_index"] == selected_cell_label]

    # Anomalous cycle has label = 1
    # Normal cycle has label = 0
    # true outliers from benchmarking dataset
    df_true_outlier = df_selected_cell[
        df_selected_cell["outlier"] == 1]

    # Get the cycle index of anomalous cycle
    true_outlier_cycle_index = (
        df_true_outlier["cycle_index"].unique())

Step-4: Drop True Labels
-----------------------------

* Drop the true outlier labels (denoted as ``outlier``) from the dataframe
  and select only the relevant features for machine learning:

  * ``cell_index``: The cell-ID for data and model versioning purposes.
  * ``cycle_index``: The cycle number of each cell.
  * ``discharge_capacity``: Discharge capacity of the cell.
  * ``voltage``: Discharge voltage of the cell.

* Initialize ``BenchDB`` for the selected cell and load the benchmarking
  dataset from the training partition.
* Extract the true outlier cycle indices for later evaluation.

.. code-block:: python

    # Import the BenchDB class
    # Load only the dataset based on the selected cell
    benchdb = BenchDB(
        db_filepath,
        selected_cell_label)

    # load the benchmarking dataset
    df_selected_cell = benchdb.load_benchmark_dataset(
        dataset_type="train")

    if df_selected_cell is not None:

        filter_col = [
            "cell_index",
            "cycle_index",
            "discharge_capacity",
            "voltage"]

        # Drop true labels from the benchmarking dataset
        # and filter for selected columns only
        df_selected_cell_without_labels = benchdb.drop_labels(
            df_selected_cell,
            filter_col)

        # Extract true outliers cycle index from benchmarking dataset
        true_outlier_cycle_index = benchdb.get_true_outlier_cycle_index(
            df_selected_cell)

Step-5: Plot Cycle Data without Labels
-----------------------------------------

* Visualize the cycling data for the selected cell without displaying the
  true outlier labels. This represents what the model sees before training.

.. code-block:: python

    # Plot cell data with true anomalies
    # If the true outlier cycle index is not known,
    # cycling data will be plotted without labels
    benchdb.plot_cycle_data(
        df_selected_cell_without_labels)

    output_fig_filename = (
        "cycle_data_nolabel_"
        + selected_cell_label
        + ".png")

    fig_output_path = (
        selected_cell_artifacts_dir
        .joinpath(output_fig_filename))

    plt.savefig(
        fig_output_path,
        dpi=600,
        bbox_inches="tight")

    plt.show()


.. image:: docs_figure/ml_01_severson_iforest_baseline/cycle_data_nolabel_2017-05-12_5_4C-70per_3C_CH17.png
   :height: 398px
   :width: 600px
   :alt: Cycle data without labels from ``2017-05-12_5_4C-70per_3C_CH17``
   :align: center

Step-6: Statistical Feature Transformation
---------------------------------------------

To help separate abnormal cycles from normal cycles, a statistical feature
transformation method is applied using the median and IQR of the input
features:

.. math::

    x_\textrm{scaled} = x_i - \left[\frac{\textrm{median}(X)^{2}}
    {\textrm{IQR}(X)}\right],

where the IQR can be calculated from the third (75th percentile) and first
quartile (25th percentile) of the input vector
(:math:`\textrm{IQR}(X) = Q_3(X) - Q_1(X)`).
Here, :math:`\textrm{median}(X)^2` preserves the physical unit of the
original feature after transformation. Feature scaling is implemented on
both the capacity and voltage data.

Capacity scaling
^^^^^^^^^^^^^^^^^

.. code-block:: python

    # Instantiate the CycleScaling class
    scaler = CycleScaling(
        df_selected_cell=df_selected_cell_without_labels)

    # Implement median IQR scaling on the discharge capacity data
    df_capacity_med_scaled = scaler.median_IQR_scaling(
        variable="discharge_capacity",
        validate=True)

    # Plot the histogram and boxplot of the scaled data
    ax_hist = bviz.hist_boxplot(
        df_variable=df_capacity_med_scaled["scaled_discharge_capacity"])

    ax_hist.set_xlabel(
        r"Discharge capacity [Ah]",
        fontsize=12)
    ax_hist.set_ylabel(
        r"Count",
        fontsize=12)

    output_fig_filename = (
        "cap_scaling_"
        + selected_cell_label
        + ".png")

    fig_output_path = (
        selected_cell_artifacts_dir
        .joinpath(output_fig_filename))

    plt.savefig(
        fig_output_path,
        dpi=600,
        bbox_inches="tight")

    plt.show()

Voltage scaling
^^^^^^^^^^^^^^^^^

.. code-block:: python

    # Implement median IQR scaling on the discharge voltage data
    df_voltage_med_scaled = scaler.median_IQR_scaling(
        variable="voltage",
        validate=True)

    # Plot the histogram and boxplot of the scaled data
    ax_hist = bviz.hist_boxplot(
        df_variable=df_voltage_med_scaled["scaled_voltage"])

    ax_hist.set_xlabel(
        r"Scaled voltage [V]",
        fontsize=12)
    ax_hist.set_ylabel(
        r"Count",
        fontsize=12)

    output_fig_filename = (
        "voltage_scaling_"
        + selected_cell_label
        + ".png")

    fig_output_path = (
        selected_cell_artifacts_dir
        .joinpath(output_fig_filename))

    plt.savefig(
        fig_output_path,
        dpi=600,
        bbox_inches="tight")

    plt.show()

Scatter histogram
^^^^^^^^^^^^^^^^^^

* Create a scatterplot with histograms to display the distribution for the
  x-axis and y-axis:

  * The salmon color corresponds to the x-axis (``discharge_capacity``).
  * The grey color corresponds to the y-axis (``voltage``).

.. code-block:: python

    axplot = bviz.scatterhist(
        xseries=df_selected_cell_without_labels["discharge_capacity"],
        yseries=df_selected_cell_without_labels["voltage"],
        cycle_index_series=df_selected_cell_without_labels["cycle_index"])

    axplot.set_xlabel(
        r"Capacity [Ah]",
        fontsize=12)
    axplot.set_ylabel(
        r"Voltage [V]",
        fontsize=12)

    plt.show()

Before applying the scaling transformation, the scatterplot with histograms
shows that the anomalous cycles are closely clustered with the normal
cycles, making it difficult for the model to learn a clear decision boundary.
After applying the scaling transformation, the anomalous cycles are more
separated from the normal cycles, which can help the model to better identify
the anomalies.

.. image:: docs_figure/ml_01_severson_iforest_baseline/scatterhist_no_scaling_2017-05-12_5_4C-70per_3C_CH17.png
   :height: 591 px
   :width: 600 px
   :alt: Scatterplot with histograms before scaling
   :align: center

After applying the scaling transformation:

.. code-block:: python

    axplot = bviz.scatterhist(
        xseries=df_capacity_med_scaled["scaled_discharge_capacity"],
        yseries=df_voltage_med_scaled["scaled_voltage"],
        cycle_index_series=df_selected_cell_without_labels["cycle_index"])

    axplot.set_xlabel(
        r"Scaled capacity [Ah]",
        fontsize=12)
    axplot.set_ylabel(
        r"Scaled voltage [V]",
        fontsize=12)

    plt.show()

.. image:: docs_figure/ml_01_severson_iforest_baseline/scatterhist_with_scaling_2017-05-12_5_4C-70per_3C_CH17.png
   :height: 585 px
   :width: 600 px
   :alt: Scatterplot with histograms after scaling
   :align: center

Step-7: Physics-informed Feature Extraction
---------------------------------------------

As the anomalies in this dataset are collective due to a continuous series
of abnormal voltage and current measurements, the collective anomalies of a
given cycle can be transformed into cycle-wise point anomalies.

.. math::

    \Delta Q_\textrm{scaled,max,cyc} = \underset{\textrm{cyc}}{\max}
    (Q_{\textrm{scaled},{k+1}} - Q_{\textrm{scaled},{k}}),

.. math::

    \Delta V_\textrm{scaled,max,cyc} = \underset{\textrm{cyc}}{\max}
    (V_{\textrm{scaled},{k+1}} - V_{\textrm{scaled},{k}}),

where :math:`\Delta V_\textrm{scaled,max,cyc}` is the maximum scaled
voltage difference per cycle, :math:`\Delta Q_\textrm{scaled,max,cyc}` is
the maximum scaled capacity difference per cycle, and :math:`k` denotes the
index of each recorded data point.

Feature max dQ
^^^^^^^^^^^^^^^

.. code-block:: python

    # maximum scaled capacity difference per cycle
    df_max_dQ = scaler.calculate_max_diff_per_cycle(
        df_scaled=df_capacity_med_scaled,
        variable_name="scaled_discharge_capacity")

    # Update the column name to include dQ into the name
    df_max_dQ.columns = [
        "max_diff_dQ",
        "log_max_diff_dQ",
        "cycle_index"]

Feature max dV
^^^^^^^^^^^^^^^

.. code-block:: python

    # maximum scaled voltage difference per cycle
    df_max_dV = scaler.calculate_max_diff_per_cycle(
        df_scaled=df_voltage_med_scaled,
        variable_name="scaled_voltage")

    # Update the column name to include dV into the name
    df_max_dV.columns = [
        "max_diff_dV",
        "log_max_diff_dV",
        "cycle_index"]

Merge features
^^^^^^^^^^^^^^^

* Merge the ``df_max_dV`` and ``df_max_dQ`` dataframes into a single
  feature dataframe, removing duplicated ``cycle_index`` columns.

.. code-block:: python

    # Merge both df_max_dV, df_max_dQ into a df
    # Remove the duplicated cycle_index column
    df_merge = pd.concat([df_max_dV, df_max_dQ], axis=1)
    df_merge_features = df_merge.loc[
        :, ~df_merge.columns.duplicated()].copy()

Step-8: Bubble Plot Visualization
------------------------------------

* Calculate the bubble size ratios from the feature distributions for
  plotting.
* Plot bubble charts both with and without log transformation to
  visualize the separation of anomalous cycles.

Bubble plot without log transformation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    # Calculate the bubble size ratio for plotting
    df_bubble_size_dQ = bviz.calculate_bubble_size_ratio(
        df_variable=df_max_dQ["max_diff_dQ"])

    df_bubble_size_dV = bviz.calculate_bubble_size_ratio(
        df_variable=df_max_dV["max_diff_dV"])

    # Get the cycle count to label the bubbles
    unique_cycle_count = (
        df_selected_cell_without_labels["cycle_index"]
        .unique())

    # bubble size for multivariate anomalies
    bubble_size = (
        np.abs(df_bubble_size_dV)
        * np.abs(df_bubble_size_dQ))

    # Plot the bubble chart and label the outliers
    axplot = bviz.plot_bubble_chart(
        xseries=df_merge_features["max_diff_dQ"],
        yseries=df_merge_features["max_diff_dV"],
        bubble_size=bubble_size,
        unique_cycle_count=unique_cycle_count,
        cycle_outlier_idx_label=true_outlier_cycle_index)

    axplot.set_xlabel(
         r"$\Delta Q_{\mathrm{scaled,max,cyc}}$ [Ah]",
         fontsize=12)
    axplot.set_ylabel(
         r"$\Delta V_{\mathrm{scaled,max,cyc}}$ [V]",
         fontsize=12)

    plt.show()

Bubble plot with log transformation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Log transformation helps to improve the visualization of closely clustered
data points and to better separate the anomalous cycles from the normal
cycles, especially when the feature values span several orders of magnitude.

.. code-block:: python

    # Plot the bubble chart and label the outliers
    axplot = bviz.plot_bubble_chart(
        xseries=df_merge_features["log_max_diff_dQ"],
        yseries=df_merge_features["log_max_diff_dV"],
        bubble_size=bubble_size,
        unique_cycle_count=unique_cycle_count,
        cycle_outlier_idx_label=true_outlier_cycle_index)

    axplot.set_title(
        f"Cell {selected_cell_label}", fontsize=13)

    axplot.set_xlabel(
         r"$\log(\Delta Q_{\mathrm{scaled,max,cyc}})$ [Ah]",
         fontsize=12)
    axplot.set_ylabel(
         r"$\log(\Delta V_{\mathrm{scaled,max,cyc}})$ [V]",
         fontsize=12)

    output_fig_filename = (
        "multivariate_bubble_plot_"
        + selected_cell_label
        + ".png")

    fig_output_path = (
        selected_cell_artifacts_dir
        .joinpath(output_fig_filename))

    plt.savefig(
        fig_output_path,
        dpi=600,
        bbox_inches="tight")

    plt.show()

.. image:: docs_figure/ml_01_severson_iforest_baseline/multivariate_bubble_plot_2017-05-12_5_4C-70per_3C_CH17.png
   :height: 482.98 px
   :width: 600 px
   :alt: Multivariate bubble plot from ``2017-05-12_5_4C-70per_3C_CH17``
   :align: center

Step-9: Baseline Isolation Forest (without hyperparameter tuning)
-------------------------------------------------------------------

* Create a ``ModelRunner`` instance with the selected features
  (``log_max_diff_dQ``, ``log_max_diff_dV``) and the cell label.
* Build the training input matrix ``Xdata``
  (shape: n_cycles × n_features).
* Instantiate the baseline Isolation Forest model using
  ``cfg.baseline_model_param()`` (default hyperparameters, no tuning).
* Fit the model, compute probabilistic outlier scores, and extract the
  predicted outlier cycle indices using a threshold of ``0.7``.

.. code-block:: python

    selected_feature_cols = (
        "log_max_diff_dQ",
        "log_max_diff_dV")

    # Instantiate ModelRunner with selected features and cell_label
    runner = ModelRunner(
        cell_label=selected_cell_label,
        df_input_features=df_merge_features,
        selected_feature_cols=selected_feature_cols
    )

    # create Xdata array
    Xdata = runner.create_model_x_input()

    # Extract the model configuration for Isolation Forest
    cfg = hp.MODEL_CONFIG["iforest"]

    # create model instance without hyperparameter tuning
    model = cfg.baseline_model_param()
    model.fit(Xdata)

    # Predict probabilistic outlier score
    proba = model.predict_proba(Xdata)

    # Get predicted outlier cycle and score from
    # the probabilistic outlier score
    (pred_outlier_indices,
     pred_outlier_score) = runner.pred_outlier_indices_from_proba(
        proba=proba,
        threshold=0.7,
        outlier_col=cfg.proba_col
    )

    print("Predicted anomalous cycles:")
    print(pred_outlier_indices)
    print("-"*70)
    print("Predicted corresponding outlier score:")
    print(pred_outlier_score)

To inspect the default hyperparameters of the baseline model:

.. code-block:: python

    # Access the default hyperparameters without tuning
    baseline_model_param = model.get_params()
    pprint.pp(baseline_model_param)

Step-10: Predict Probabilistic Anomaly Score Map
---------------------------------------------------

* ``pred_outlier_indices`` is a list of cycle indices predicted as
  anomalous by the baseline Isolation Forest model. Using ``.isin()``,
  the dataframe is filtered to keep only cycles identified as anomalies.
* A new column, ``outlier_prob``, is added to store the outlier probability
  computed by the model, making it easy to track how confidently the
  algorithm flags each cycle.
* ``runner.predict_anomaly_score_map`` generates a 2D contour map of
  anomaly scores (outlier probability).

.. code-block:: python

    # Filter the selected features based on predicted outlier indices
    df_outliers_pred = df_merge_features[
        df_merge_features["cycle_index"]
        .isin(pred_outlier_indices)].copy()

    df_outliers_pred["outlier_prob"] = pred_outlier_score

    # Plot the anomaly score map
    axplot = runner.predict_anomaly_score_map(
        selected_model=model,
        model_name="Isolation Forest",
        xoutliers=df_outliers_pred["log_max_diff_dQ"],
        youtliers=df_outliers_pred["log_max_diff_dV"],
        pred_outliers_index=pred_outlier_indices,
        threshold=0.7)

    axplot.set_xlabel(
         r"$\log(\Delta Q_{\mathrm{scaled,max,cyc}})$ [Ah]",
         fontsize=12)
    axplot.set_ylabel(
         r"$\log(\Delta V_{\mathrm{scaled,max,cyc}})$ [V]",
         fontsize=12)

    output_fig_filename = (
        "isolation_forest_"
        + selected_cell_label
        + ".png")

    fig_output_path = (
        selected_cell_artifacts_dir
        .joinpath(output_fig_filename))

    plt.savefig(
        fig_output_path,
        dpi=600,
        bbox_inches="tight")

    plt.show()

.. image:: docs_figure/ml_01_severson_iforest_baseline/baseline_iforest_2017-05-12_5_4C-70per_3C_CH17.png
   :height: 424.39 px
   :width: 600 px
   :alt: iForest confusion matrix from ``2017-05-12_5_4C-70per_3C_CH17``
   :align: center

The figure shows the anomaly score map produced by the baseline Isolation
Forest model:

* **Background Heatmap**:

  * Red regions: high anomaly probability (more likely to contain outliers).
  * Blue/white regions: low anomaly probability (normal cycles).

* **Dashed Black Contour**:

  * Represents the decision boundary defined by the Isolation Forest
    threshold. Points outside are considered anomalies.

* **Black Dots**:

  * Represent the majority of normal cycles (inlier data).

* **Yellow Stars with Labels**:

  * Mark the detected anomalous cycles. Their positions in the 2D feature
    space highlight where they deviate from typical battery behavior.

* **Colorbar (right)**:

  * Quantifies anomaly probability (0 = normal, 1 = highly anomalous).

Histogram of the anomaly score
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    outlier_score = model.decision_function(Xdata)

    fig, ax = plt.subplots(figsize=(10, 6))

    ax.hist(
        outlier_score,
        color="skyblue",
        edgecolor="black",
        bins=25)

    ax.set_xlabel(
        "Predicted anomaly score",
        fontsize=12)

    ax.grid(
        color="grey",
        linestyle="-",
        linewidth=0.25,
        alpha=0.7)

    plt.show()

Step-11: Model Performance Evaluation
-----------------------------------------

* Map predicted outlier indices to the benchmark dataset:

  * ``df_selected_cell`` holds cycle-level records and the ground-truth
    label (e.g., ``outlier`` = 1 for anomalous cycles, else 0).
  * ``pred_outlier_indices`` is the list of cycle indices flagged by the
    model.

* ``modval.evaluate_pred_outliers(...)`` returns a tidy DataFrame with:

  * ``cycle_index``: Cell discharge cycle index.
  * ``true_outlier``: ground truth (0/1).
  * ``pred_outlier``: model prediction (0/1) for the same cycles.

* ``modval.generate_confusion_matrix(...)`` aggregates counts of:

  * ``True Negative (TN)``: predicted 0, truth 0.
  * ``False Positive (FP)``: predicted 1, truth 0.
  * ``False Negative (FN)``: predicted 0, truth 1.
  * ``True Positive (TP)``: predicted 1, truth 1.

.. code-block:: python

    # Compare predicted probabilistic outliers against true outliers
    # from the benchmarking dataset
    df_eval_outlier = modval.evaluate_pred_outliers(
        df_benchmark=df_selected_cell,
        outlier_cycle_index=pred_outlier_indices)

Confusion matrix
^^^^^^^^^^^^^^^^^^

.. code-block:: python

    axplot = modval.generate_confusion_matrix(
        y_true=df_eval_outlier["true_outlier"],
        y_pred=df_eval_outlier["pred_outlier"])

    axplot.set_title(
        "Isolation Forest",
        fontsize=16)

    output_fig_filename = (
        "conf_matrix_iforest_"
        + selected_cell_label
        + ".png")

    fig_output_path = (
        selected_cell_artifacts_dir
        .joinpath(output_fig_filename))

    plt.savefig(
        fig_output_path,
        dpi=600,
        bbox_inches="tight")

    plt.show()

.. image:: docs_figure/ml_01_severson_iforest_baseline/conf_matrix_iforest_2017-05-12_5_4C-70per_3C_CH17.png
   :height: 480px
   :width: 600 px
   :alt: iForest confusion matrix from ``2017-05-12_5_4C-70per_3C_CH17``
   :align: center

Evaluation metrics
^^^^^^^^^^^^^^^^^^^^

In this study, five different metrics are used to evaluate model performance:

* **Accuracy**: :math:`\frac{\textrm{TP} + \textrm{TN}}{\textrm{Total prediction}}`
* **Precision**: :math:`\frac{\textrm{TP}}{\textrm{TP + FP}}`
* **Recall**: :math:`\frac{\textrm{TP}}{\textrm{TP + FN}}`
* **F1-score**: :math:`\frac{2(\textrm{Precision}\times \textrm{Recall})}{\textrm{Precision} + \textrm{Recall}}`
* **MCC**: :math:`\frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN+FN)}}`

.. code-block:: python

    df_current_eval_metrics = modval.eval_model_performance(
        model_name="iforest",
        selected_cell_label=selected_cell_label,
        df_eval_outliers=df_eval_outlier)

Step-12: Export Evaluation Metrics
-------------------------------------

* Export the evaluation metrics to a CSV file for record-keeping and
  comparison across models.

.. code-block:: python

    # Export current metrics to CSV
    metrics_eval_filepath = Path.cwd().joinpath(
        "eval_metrics_no_hp_severson.csv")

    hp.export_current_model_metrics(
        model_name="iforest",
        selected_cell_label=selected_cell_label,
        df_current_eval_metrics=df_current_eval_metrics,
        export_csv_filepath=metrics_eval_filepath,
        if_exists="replace")