osbad.modval
============

.. py:module:: osbad.modval

.. autoapi-nested-parse::

   Evaluation utilities for benchmarking anomaly detection models.

   This module provides functions to compare predicted anomalous cycles
   against ground-truth labels from a benchmarking dataset. It includes
   utilities for aligning predictions with labels, visualizing results via
   confusion matrices, and summarizing performance metrics.

   Key features:
       - ``evaluate_pred_outliers``: Aligns predicted outlier indices with the
         benchmarking dataset and produces a DataFrame containing cycle-wise
         true and predicted outlier labels.
       - ``generate_confusion_matrix``: Generates a customized confusion
         matrix heatmap, highlighting correct predictions in palegreen and
         misclassifications in salmon.
       - ``eval_model_performance``: Computes and prints standard evaluation
         metrics (accuracy, precision, recall, F1-score, Matthews correlation
         coefficient) and returns them in a single-row DataFrame.

       .. code-block:: python

           import osbad.modval as modval


Module Contents
---------------

.. py:data:: ROOT_DIR

.. py:data:: PATH_TO_ENV_VARIABLE

.. py:data:: USE_LATEX
   :value: True


.. py:function:: evaluate_pred_outliers(df_benchmark: pandas.DataFrame, outlier_cycle_index: numpy.ndarray) -> pandas.DataFrame

   Compare the predicted outliers with the true outliers for each cycle
   and return the result as a new DataFrame.

   :param df_benchmark: Benchmarking dataset of the selected
                        cell.
   :type df_benchmark: pd.DataFrame
   :param outlier_cycle_index: Cycle indices of the outliers predicted by
                               statistical methods or ML models.
   :type outlier_cycle_index: np.ndarray

   :returns: A dataframe with predicted outliers and true
             outliers from the benchmarking dataset for each cycle.
   :rtype: pd.DataFrame

   .. rubric:: Example

   .. code-block:: python

       df_eval_outlier_sd_dV = modval.evaluate_pred_outliers(
           df_benchmark=df_selected_cell,
           outlier_cycle_index=std_outlier_dV_index)
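
   The alignment step can be pictured as flagging every cycle whose index
   appears in ``outlier_cycle_index``. The following is a minimal sketch of
   that idea, not the implementation in ``osbad``. The helper name
   ``align_pred_outliers`` and the benchmark column names ``cycle_index`` and
   ``outlier`` are assumptions made for illustration; only the output columns
   ``true_outlier`` and ``pred_outlier`` are taken from this page.

   .. code-block:: python

       import numpy as np
       import pandas as pd


       def align_pred_outliers(df_benchmark, outlier_cycle_index):
           """Hypothetical helper: label each cycle as true/predicted outlier."""
           # Collapse to one row per cycle. "cycle_index" and "outlier" are
           # assumed column names, not confirmed by the osbad documentation.
           df_cycles = (
               df_benchmark.groupby("cycle_index", as_index=False)["outlier"].max()
           )
           df_eval = df_cycles.rename(columns={"outlier": "true_outlier"})
           df_eval["true_outlier"] = df_eval["true_outlier"].astype(int)
           # 1 if the cycle index was flagged by the model, 0 otherwise.
           df_eval["pred_outlier"] = (
               df_eval["cycle_index"].isin(outlier_cycle_index).astype(int)
           )
           return df_eval


       # Toy benchmark: three cycles, two rows each; cycle 1 is a true outlier.
       df_demo = pd.DataFrame({
           "cycle_index": [0, 0, 1, 1, 2, 2],
           "outlier": [0, 0, 1, 1, 0, 0],
       })
       # The model flags cycles 1 and 2, so cycle 2 is a false positive.
       df_eval_demo = align_pred_outliers(df_demo, np.array([1, 2]))

   Keeping both labels in one cycle-indexed DataFrame lets the same table
   feed the confusion matrix and the metric summary without re-aligning.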


.. py:function:: generate_confusion_matrix(y_true: Union[pandas.Series, numpy.ndarray], y_pred: Union[pandas.Series, numpy.ndarray]) -> matplotlib.axes._axes.Axes

   Generate a custom confusion matrix heatmap in which correct
   predictions are colored palegreen and misclassifications are
   colored salmon.

   :param y_true: True outliers from the
                  benchmarking dataset.
   :type y_true: pd.Series | np.ndarray
   :param y_pred: Predicted outliers from the
                  statistical methods or ML models.
   :type y_pred: pd.Series | np.ndarray

   :returns: Matplotlib axes for additional
             external customization.
   :rtype: matplotlib.axes._axes.Axes

   .. rubric:: Example

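   .. code-block:: python

       import matplotlib.pyplot as plt
       import numpy as np
       import osbad.modval as modval

       # df_selected_cell and std_outlier_dV_index come from earlier steps.
       df_eval_outlier_sd_dV = modval.evaluate_pred_outliers(
           df_benchmark=df_selected_cell,
           outlier_cycle_index=std_outlier_dV_index)

       # Compare the true and predicted labels in a confusion matrix.
       axplot = modval.generate_confusion_matrix(
           y_true=np.array(df_eval_outlier_sd_dV["true_outlier"]),
           y_pred=np.array(df_eval_outlier_sd_dV["pred_outlier"]))

       # The title uses LaTeX markup for the feature name.
       fig_title = r"SD on $\Delta V_\textrm{scaled,max,cyc}$\newline"
       axplot.set_title(fig_title + "\n", fontsize=16)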

       plt.show()
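
   To make the coloring rule concrete, the sketch below computes the 2x2
   counts with NumPy and assigns a color name to each cell. It is an
   illustration under assumptions, not the plotting code used by
   ``generate_confusion_matrix``: the helper ``confusion_counts`` and the
   array ``cell_colors`` are hypothetical names, and the row/column order
   (rows = true label, columns = predicted label) is an assumed convention.

   .. code-block:: python

       import numpy as np


       def confusion_counts(y_true, y_pred):
           """Hypothetical helper: 2x2 counts, rows = true, cols = predicted."""
           y_true = np.asarray(y_true).astype(int)
           y_pred = np.asarray(y_pred).astype(int)
           counts = np.zeros((2, 2), dtype=int)
           # Increment cell (true label, predicted label) once per cycle.
           np.add.at(counts, (y_true, y_pred), 1)
           return counts


       # Diagonal cells (TN, TP) are correct predictions; off-diagonal cells
       # (FP, FN) are misclassifications.
       cell_colors = np.where(np.eye(2, dtype=bool), "palegreen", "salmon")

       counts_demo = confusion_counts([0, 0, 1, 1, 0], [0, 1, 1, 0, 0])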


.. py:function:: eval_model_performance(model_name, selected_cell_label: str, df_eval_outliers: pandas.DataFrame) -> pandas.DataFrame

   Evaluate and summarize model performance metrics.

   This function computes model performance metrics (accuracy,
   precision, recall, F1-score, and Matthews correlation coefficient)
   using ground-truth and predicted outlier labels. It prints each
   metric to the console and returns the results as a one-row DataFrame
   for the specified model and cell.

   :param model_name: Name of the machine learning model being
                      evaluated.
   :type model_name: str
   :param selected_cell_label: Identifier for the evaluated cell.
   :type selected_cell_label: str
   :param df_eval_outliers: DataFrame containing two columns:

                            - ``true_outlier``: Ground-truth outlier labels.
                            - ``pred_outlier``: Predicted outlier labels.
   :type df_eval_outliers: pd.DataFrame

   :returns: Single-row DataFrame with the evaluation metrics and
             metadata including ``ml_model`` and ``cell_index``.
   :rtype: pd.DataFrame

   .. rubric:: Example

   .. code-block:: python

       df_current_eval_metrics = modval.eval_model_performance(
           model_name="iforest",
           selected_cell_label=selected_cell_label,
           df_eval_outliers=df_eval_outlier)

   .. note::

       - Both ``true_outlier`` and ``pred_outlier`` must be binary
         labels where ``0`` = inlier and ``1`` = outlier.
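
   For reference, the five metrics can be derived from the counts of true
   and false positives and negatives. The sketch below shows those formulas
   on binary labels; it is not the code behind ``eval_model_performance``.
   The helper name ``summarize_metrics``, the metric column names, and the
   convention of reporting ``0.0`` when a denominator is zero are all
   assumptions. Only ``ml_model`` and ``cell_index`` are documented above.

   .. code-block:: python

       import math

       import pandas as pd


       def summarize_metrics(model_name, cell_label, df_eval_outliers):
           """Hypothetical helper: compute five metrics from binary labels."""
           y_true = df_eval_outliers["true_outlier"].astype(int)
           y_pred = df_eval_outliers["pred_outlier"].astype(int)

           tp = int(((y_true == 1) & (y_pred == 1)).sum())
           tn = int(((y_true == 0) & (y_pred == 0)).sum())
           fp = int(((y_true == 0) & (y_pred == 1)).sum())
           fn = int(((y_true == 1) & (y_pred == 0)).sum())

           def safe_div(num, den):
               # Assumed convention: report 0.0 when the denominator is zero.
               return num / den if den else 0.0

           precision = safe_div(tp, tp + fp)
           recall = safe_div(tp, tp + fn)
           mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
           # Metric column names are illustrative, not the library's.
           return pd.DataFrame([{
               "ml_model": model_name,
               "cell_index": cell_label,
               "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
               "precision": precision,
               "recall": recall,
               "f1_score": safe_div(2 * precision * recall, precision + recall),
               "mcc": safe_div(tp * tn - fp * fn, mcc_den),
           }])


       # Toy labels: one true positive, one miss, one false alarm.
       df_demo = pd.DataFrame({
           "true_outlier": [1, 1, 0, 0, 0, 0],
           "pred_outlier": [1, 0, 1, 0, 0, 0],
       })
       df_metrics_demo = summarize_metrics("iforest", "demo_cell", df_demo)

   Because anomalous cycles are usually rare, accuracy alone can look high
   even for a poor detector; reading it alongside recall and the Matthews
   correlation coefficient gives a more balanced picture.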