osbad.config

Module Contents

osbad.config.HP_DATA_SOURCE: Literal['tohoku', 'severson'] = 'severson'

Active hyperparameter data source selection. Determines which dataset-specific config is loaded throughout the pipeline.

Important

Set HP_DATA_SOURCE to either 'tohoku' or 'severson', depending on the dataset being analyzed. This selection controls which hyperparameter configuration is loaded from the machine_learning/hp_config_schema directory.
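Because the setting accepts only two values, a typo would otherwise surface late in the pipeline. The sketch below shows one way to check a candidate value against the documented choices at runtime. The helper validate_data_source and the alias DataSource are hypothetical and are not part of osbad.config.

```python
from typing import Literal, get_args

# Mirrors the documented annotation of HP_DATA_SOURCE.
DataSource = Literal["tohoku", "severson"]


def validate_data_source(value: str) -> str:
    """Hypothetical helper: reject values outside the documented choices."""
    allowed = get_args(DataSource)  # the two literal strings as a tuple
    if value not in allowed:
        raise ValueError(
            f"HP_DATA_SOURCE must be one of {allowed}, got {value!r}")
    return value


# 'severson' is the documented default.
HP_DATA_SOURCE = validate_data_source("severson")
```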

osbad.config.PIPELINE_OUTPUT_DIR

Global directory path for storing pipeline artifacts.

All figures, plots, and intermediate artifacts generated by the pipeline or Jupyter notebooks are written to this directory. If the directory does not already exist, it will be created at runtime.

Important

PIPELINE_OUTPUT_DIR defines the root location where all results (per-cell artifacts, exported plots, metrics, and hyperparameters) are stored. Ensure this path points to a valid writable location before running the pipeline.
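A quick pre-flight check can confirm that the output location exists and is writable before a long run. This is a minimal sketch, not the library's own logic: ensure_output_dir is a hypothetical helper, and a temporary directory stands in for the real PIPELINE_OUTPUT_DIR.

```python
import os
import tempfile
from pathlib import Path


def ensure_output_dir(path: Path) -> Path:
    """Hypothetical helper: create the directory if missing, then check access."""
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"Output directory is not writable: {path}")
    return path


# Stand-in location for the demo; the real value is
# osbad.config.PIPELINE_OUTPUT_DIR.
demo_output_dir = ensure_output_dir(
    Path(tempfile.mkdtemp()) / "pipeline_output")
```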

osbad.config.find_repo_root(marker: str = 'pyproject.toml')

Locate the root directory of the repository by searching for a marker file.

This function starts from the current working directory and traverses upwards in the directory hierarchy until it finds a directory containing the specified marker file (default is “pyproject.toml”). If the marker file is not found, an exception is raised.

Parameters:

marker (str) – The name of the marker file to search for. Default is “pyproject.toml”.

Returns:

The path to the root directory of the repository.

Return type:

pathlib.Path

Raises:

FileNotFoundError – If the marker file is not found in any parent directories up to the filesystem root.
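The upward search described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library source: the function name find_repo_root_sketch and the extra start parameter (added so the search can be demonstrated from any directory) are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_repo_root_sketch(
        marker: str = "pyproject.toml",
        start: Optional[Path] = None) -> Path:
    """Walk from `start` (default: cwd) up to the filesystem root."""
    current = (start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent in turn.
    for candidate in (current, *current.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(
        f"Marker file {marker!r} not found in {current} or its parents")
```

Calling find_repo_root_sketch() from a notebook nested several levels inside a repository would return the first ancestor directory that contains pyproject.toml.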

osbad.config.ROOT_DIR

osbad.config.DB_DIR

osbad.config.artifacts_output_dir(selected_cell_label: str) → pathlib.PosixPath

Ensure and return the artifacts directory for a given cell.

Creates (if missing) a per-cell subdirectory under PIPELINE_OUTPUT_DIR and returns its path. All figures and artifacts for the selected cell should be written to this location.

Parameters:

selected_cell_label (str) – Identifier of the evaluated cell used to name the subdirectory.

Returns:

Path to the cell-specific artifacts directory.

Return type:

pathlib.PosixPath
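The create-if-missing behavior can be illustrated with a short sketch. It is an approximation of the documented behavior, not the actual implementation: artifacts_output_dir_sketch is a hypothetical name, the cell label is made up, and a temporary directory stands in for PIPELINE_OUTPUT_DIR.

```python
import tempfile
from pathlib import Path

# Stand-in for osbad.config.PIPELINE_OUTPUT_DIR.
PIPELINE_OUTPUT_DIR = Path(tempfile.mkdtemp())


def artifacts_output_dir_sketch(selected_cell_label: str) -> Path:
    """Return the per-cell directory, creating it on first use."""
    cell_dir = PIPELINE_OUTPUT_DIR / selected_cell_label
    # exist_ok=True makes repeated calls for the same cell safe.
    cell_dir.mkdir(parents=True, exist_ok=True)
    return cell_dir


cell_dir = artifacts_output_dir_sketch("cell_001")  # hypothetical label
figure_path = cell_dir / "outliers.png"             # hypothetical file name
```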

osbad.config.create_json_hp_config(output_json_filepath: str, hp_dict: dict)

Create and save a JSON file containing hyperparameter settings.

This function writes a dictionary of hyperparameter configurations to a JSON file at the specified path.

Parameters:
  • output_json_filepath (str) – Path to save the output JSON file.

  • hp_dict (dict) – Dictionary containing hyperparameter configurations with labeled keys.

Returns:

None. The JSON file is written to the specified location as a side effect.

Return type:

None

Example

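from pathlib import Path

import osbad.config as bconf  # bconf is assumed to be an alias for osbad.config

# total_cycle_count (the number of cycles in the selected cell's data)
# must be defined before this snippet runs.
hp_schema_iforest = {
    "contamination": {"low": 0.0, "high": 0.5},
    "n_estimators": {"low": 100, "high": 500},
    "max_samples": {"low": 100, "high": total_cycle_count},
    "threshold": {"low": 0.0, "high": 1.0}
}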

iforest_hp_config_filepath = (
    Path.cwd()
    .parent.parent.parent
    .joinpath(
        "machine_learning",
        "hp_config_schema",
        "iforest_hp_config.json"))

bconf.create_json_hp_config(
    iforest_hp_config_filepath,
    hp_dict=hp_schema_iforest)
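
For readers who want to see what writing such a schema involves, the following self-contained sketch does the equivalent with the standard json module. It is a stand-in under stated assumptions, not the library's implementation: write_hp_config_sketch is a hypothetical name, the indentation setting is a guess, and the file is written to a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def write_hp_config_sketch(output_json_filepath, hp_dict: dict) -> None:
    """Serialize a hyperparameter schema to a JSON file."""
    with open(output_json_filepath, "w", encoding="utf-8") as f:
        json.dump(hp_dict, f, indent=4)  # indent is an assumed choice


# A reduced schema with the same low/high layout as the example above.
hp_schema = {
    "contamination": {"low": 0.0, "high": 0.5},
    "n_estimators": {"low": 100, "high": 500},
}

out_path = Path(tempfile.mkdtemp()) / "iforest_hp_config.json"
write_hp_config_sketch(out_path, hp_schema)
```
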
osbad.config.load_json_hp_config(input_json_filepath: str) → dict

Load hyperparameter configuration from a JSON file.

This function reads a JSON file containing hyperparameter configurations and returns the contents as a dictionary.

Parameters:

input_json_filepath (str) – Path to the JSON file containing hyperparameter configuration.

Returns:

Dictionary containing the loaded hyperparameter configurations.

Return type:

dict

Example

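from pathlib import Path

import osbad.config as bconf  # bconf is assumed to be an alias for osbad.config

iforest_hp_config_filepath = (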
    Path(__file__)
    .parent.parent.parent
    .joinpath(
        "machine_learning",
        "hp_config_schema",
        "iforest_hp_config.json"))

bconf.load_json_hp_config(iforest_hp_config_filepath)
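
Once loaded, the dictionary can be inspected like any other. The sketch below reads a small demo file with the standard json module and checks that every entry defines an ordered range. The names load_hp_config_sketch and check_ranges are hypothetical, the demo file is created in a temporary directory, and the range check is an added illustration rather than documented library behavior.

```python
import json
import tempfile
from pathlib import Path


def load_hp_config_sketch(input_json_filepath) -> dict:
    """Read a hyperparameter schema from a JSON file."""
    with open(input_json_filepath, "r", encoding="utf-8") as f:
        return json.load(f)


def check_ranges(hp_config: dict) -> None:
    """Raise if any entry has a lower bound above its upper bound."""
    for name, bounds in hp_config.items():
        if bounds["low"] > bounds["high"]:
            raise ValueError(f"Invalid range for {name}: {bounds}")


# Create a small demo file so the example runs on its own.
demo_path = Path(tempfile.mkdtemp()) / "iforest_hp_config.json"
demo_path.write_text(
    json.dumps({"threshold": {"low": 0.0, "high": 1.0}}), encoding="utf-8")

hp_config = load_hp_config_sketch(demo_path)
check_ranges(hp_config)
```
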
class osbad.config.CustomFormatter(fmt=None, datefmt=None, style='%', validate=True, *, defaults=None)

Bases: logging.Formatter

Custom logging formatter with colorized output.

This formatter applies ANSI escape codes to color messages by logging level and selects a format string per level. INFO messages contain only the message text, while DEBUG, WARNING, ERROR, and CRITICAL messages also include the timestamp, logger name, level name, file name, and line number for better context.

Logging level styles:
  • INFO: Grey text, message only.

  • DEBUG: Red text with timestamp, name, file, and line number.

  • WARNING: Bold red text with extended debug-style format.

  • ERROR: Bold red text with extended debug-style format.

  • CRITICAL: Bold red text with extended debug-style format.

grey = '\x1b[38;21m'
yellow = '\x1b[33;21m'
red = '\x1b[31;21m'
bold_red = '\x1b[31;1m'
reset = '\x1b[0m'
debug_format = """%(asctime)s - %(name)s - %(levelname)s
%(message)s (%(filename)s:%(lineno)d)"""
info_format = '%(message)s'
FORMATS
format(record)

Format the specified record as text.

The record’s attribute dictionary is used as the operand to a string formatting operation which yields the returned string. Before formatting the dictionary, a couple of preparatory steps are carried out. The message attribute of the record is computed using LogRecord.getMessage(). If the formatting string uses the time (as determined by a call to usesTime()), formatTime() is called to format the event time. If there is exception information, it is formatted using formatException() and appended to the message.
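
A level-dependent formatter of this kind is commonly built by looking up a format string per record level and delegating to a plain logging.Formatter. The sketch below follows that pattern and reuses the attribute values listed above. It is not the library source: the class name LevelFormatterSketch, the contents of FORMATS, the fallback for unlisted levels, and the logger name are all assumptions.

```python
import io
import logging


class LevelFormatterSketch(logging.Formatter):
    grey = "\x1b[38;21m"
    bold_red = "\x1b[31;1m"
    reset = "\x1b[0m"
    debug_format = ("%(asctime)s - %(name)s - %(levelname)s\n"
                    "%(message)s (%(filename)s:%(lineno)d)")
    info_format = "%(message)s"

    # Assumed mapping from level to colorized format string.
    FORMATS = {
        logging.INFO: grey + info_format + reset,
        logging.ERROR: bold_red + debug_format + reset,
    }

    def format(self, record):
        # Levels without an entry fall back to the uncolored extended format.
        fmt = self.FORMATS.get(record.levelno, self.debug_format)
        return logging.Formatter(fmt).format(record)


# Attach the formatter to an in-memory stream for demonstration.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(LevelFormatterSketch())

logger = logging.getLogger("osbad_formatter_demo")  # hypothetical name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False

logger.info("pipeline started")
```

With this setup, the INFO record is emitted as the bare message wrapped in the grey and reset escape codes, while an ERROR record would carry the timestamp, logger name, level, file name, and line number.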