API Reference
=============

This document provides comprehensive reference for NASBench-101/201/301 APIs, including architecture representations, method signatures, return types, and benchmark-specific details.

Benchmarks Overview
-------------------

.. list-table::
   :header-rows: 1
   :widths: 15 25 30 15 15

   * - Benchmark
     - Datasets
     - Available Splits
     - Primary Metrics
     - Training Epochs
   * - NASBench-101
     - CIFAR-10
     - train, val, test
     - train/val/test accuracy
     - 4, 12, 36, 108
   * - NASBench-201
     - CIFAR-10, CIFAR-100, ImageNet16-120
     - train, val, test
     - train/val/test accuracy, losses
     - 0-199 (200 epochs total)
   * - NASBench-301
     - CIFAR-10, CIFAR-100
     - val, test
     - surrogate val/test accuracy
     - N/A (surrogate-based)

Architecture Representations
----------------------------

Each benchmark uses a different architecture representation:

**NASBench-101 (Arch101):**

- Dataclass with two fields:

  - ``adjacency``: list[list[int]] — 7×7 adjacency matrix
  - ``operations``: list[str] — 7 operations from ['input', 'conv3x3-bn-relu', 'conv1x1-bn-relu', 'maxpool3x3', 'output']

- Example:

.. code-block:: python

   Arch101(
       adjacency=[[0, 1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 1, 1, 0, 0],
                  ...],
       operations=['input', 'conv3x3-bn-relu', 'conv1x1-bn-relu', ..., 'output']
   )

**NASBench-201 (String):**

- Architecture string format: ``|op~0|+|op~0|op~1|+|op~0|op~1|op~2|``
- 6 edges connecting 4 nodes in a cell
- 5 operations per edge: ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3']
- Total search space: 5^6 = 15,625 unique architectures
- Each architecture maps to a canonical index (0-15624)

- Example:

.. code-block:: python

   '|none~0|+|skip_connect~0|nor_conv_1x1~1|+|nor_conv_3x3~0|avg_pool_3x3~1|skip_connect~2|'

**NASBench-301 (Dict):**

- DARTS-style architecture with normal and reduction cells
- Dictionary with 'normal' and 'reduce' keys
- Each cell: list of (operation, predecessor_node) tuples
- 8 operations: ['max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5', 'none']
- 4 intermediate nodes per cell, each with 2 input edges

- Example:

.. code-block:: python

   {
       'normal': [('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ...],
       'reduce': [('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ...]
   }

Common API Surface
------------------

All benchmarks expose the following core methods.

Initialization
~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench101, NASBench201, NASBench301

   # Using explicit path
   api = NASBench201('/path/to/nb201.pkl', verbose=True)

   # Using environment variable
   api = NASBench201(verbose=True)  # Reads from NASBENCH201_PATH

**Constructor Args:**

- ``data_path``: Optional[str] — path to pickled benchmark data; if None, reads from environment variable
- ``verbose``: bool — enable/disable all logging output (default: True)

**Environment Variables:**

- ``NASBENCH101_PATH`` — path to NB101 pickle file
- ``NASBENCH201_PATH`` — path to NB201 pickle file
- ``NASBENCH301_PATH`` — path to NB301 pickle file

get_statistics
~~~~~~~~~~~~~~

Get statistics about the loaded benchmark data.

.. code-block:: python

   stats = api.get_statistics()

**Returns:** dict — benchmark statistics

**Return Format by Benchmark:**

- NB101: ``{'benchmark': 'nasbench101', 'architectures': int, 'records': int}``
- NB201: ``{'benchmark': 'nasbench201', 'entries': int}``
- NB301: ``{'benchmark': 'nasbench301', 'files': int}``

random_sample
~~~~~~~~~~~~~

Sample random architectures from the benchmark search space.

.. code-block:: python

   samples = api.random_sample(n=5, seed=123)

**Args:**

- ``n``: int — number of samples (default: 1)
- ``seed``: Optional[int] — RNG seed for reproducibility

**Returns:**

- **NB101**: list[Arch101] — list of Arch101 dataclass objects
- **NB201**: list[str] — list of architecture strings
- **NB301**: list[int] — indices for entries in the loaded dataset (falls back to synthetic architecture dicts if raw entries are unavailable)

iter_all
~~~~~~~~

Iterate over all available architectures in the loaded data.

.. code-block:: python

   for arch in api.iter_all():
       result = api.query(arch, dataset='cifar10', split='val')

**Returns:**

- **NB101**: Iterator[Arch101]
- **NB201**: Iterator[str] — architecture strings
- **NB301**: Iterator[int] — indices in loaded data

get_index
~~~~~~~~~

Get an identifier or index for an architecture.

.. code-block:: python

   # NB201: Convert arch string to numeric index
   idx = api.get_index('|none~0|+|skip_connect~0|nor_conv_1x1~1|+|...')
   # Returns: 12345 (int in range 0-15624)

   # NB101: Get hash identifier
   hash_id = api.get_index(arch_obj)
   # Returns: 'a3f5b2...' (SHA256 hash string)

   # NB301: Find index in loaded data
   idx = api.get_index(arch_dict)
   # Returns: 42 or None

**Args:**

- ``arch``: Architecture representation (type depends on benchmark)

  - NB101: Arch101 object
  - NB201: str (architecture string)
  - NB301: dict (architecture dict)

**Returns:**

- **NB101**: str — stable SHA256 hash identifier
- **NB201**: int — canonical index (0-15624)
- **NB301**: Optional[int] — index in loaded data, or None if not found

available_budgets
~~~~~~~~~~~~~~~~~

List available training budgets (epochs) for a dataset/split combination.

.. code-block:: python

   budgets = api.available_budgets(dataset='cifar10', split='val')
   # Returns e.g. [199, 200] for NB201 validation

**Args:**

- ``dataset``: Optional[str] — target dataset (defaults to all datasets)
- ``split``: Optional[str] — target split (defaults to all splits)

**Returns:** Optional[list] — sorted list of budgets if tracked; None when budgets are not defined for the benchmark.

- **NB101**: returns ``None`` (budgets not tracked)
- **NB201**: list of available epochs per dataset/split based on original training logs
- **NB301**: epochs derived from per-entry learning curves (validation) or final declared budget (test)

exists
~~~~~~

Validate whether a combination of dataset, split, budget, and architecture is supported without issuing a full ``query``.

.. code-block:: python

   api.exists(dataset='cifar10', split='val', budget=199)  # -> True

**Args:**

- ``dataset``: Optional[str]
- ``split``: Optional[str]
- ``budget``: Optional[Any]
- ``arch``: Optional[Any] — architecture representation

**Returns:** bool — True if every provided component is supported, False otherwise.

query
~~~~~

Query performance metrics for an architecture from loaded data.

.. code-block:: python

   # NB201 example
   result = api.query(
       arch='|none~0|+|skip_connect~0|nor_conv_1x1~1|+|...',
       dataset='cifar10',
       split='val',
       seed=777,
       budget=199
   )
   print(f"Validation accuracy: {result['metric']:.2f}%")
   print(f"Training time: {result['cost']:.2f}s")

**Args:**

- ``arch``: Architecture representation (depends on benchmark)

  - **NB101**: Arch101 object
  - **NB201**: str (architecture string)
  - **NB301**: Any (dict or index)

- ``dataset``: str — dataset name

  - **NB101**: 'cifar10'
  - **NB201**: 'cifar10', 'cifar100', 'ImageNet16-120'
  - **NB301**: 'cifar10', 'cifar100'

- ``split``: str — data split

  - **NB101**: 'train', 'val', 'test'
  - **NB201**: 'train', 'val', 'test'
  - **NB301**: 'val', 'test'

- ``seed``: Optional[int] — random seed (default varies by benchmark)

  - **NB201**: default 777 (official NB201 seed)
  - **NB101/NB301**: unused

- ``budget``: Optional[Any] — training budget

  - **NB101**: unused (returns final recorded metrics)
  - **NB201**: epoch number 0-199 (default: 199 for final epoch)
  - **NB301**: epoch index for validation curves (defaults to final epoch); test split always reports the declared final budget

**Returns:** dict with the following keys:

.. code-block:: python

   {
       'metric': Optional[float],      # Primary metric (e.g., accuracy %)
       'metric_name': str,              # Name of metric (e.g., 'val_acc')
       'cost': Optional[float],         # Training time in seconds
       'std': Optional[float],          # Standard deviation (if available)
       'info': dict                     # Additional metadata and raw data
   }

**Return Value Details:**

- **NB101**: Returns a tuple ``(info_dict, metrics_by_budget)`` by default.

  - ``info_dict`` contains ``module_adjacency``, ``module_operations``, ``module_hash``, and aggregate training metadata.
  - ``metrics_by_budget`` is a dict keyed by epoch budgets (4/12/36/108), where each value is a list of up to three run dictionaries. Each run dictionary mirrors the native NASBench metrics: ``halfway_*`` and ``final_*`` keys as well as training times.
  - ``average=True`` collapses each budget to a single averaged metrics dictionary.
  - ``summary=True`` restores the condensed dict (``metric``, ``metric_name``, ``cost``, ``std``, ``info``) for backwards compatibility.

- **NB201 / NB301**: Return a dictionary with keys:

  - ``metric``: Accuracy percentage (e.g., 94.5) or None if not available
  - ``metric_name``: Describes the metric, typically ``{split}_acc``
  - ``cost``: Training/evaluation time in seconds, or None
  - ``std``: Standard deviation of the metric across multiple runs (rarely used)
  - ``info``: Dictionary containing additional information

    - **NB201**: arch_index, dataset, split, seed, epoch, arch_str, params, flop
    - **NB301**: Entry metadata (index, dataset, epochs available/used, declared budget, optimizer tag, JSON path)

NASBench-101 Specifics
------------------------

Import and Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench101

   api = NASBench101('/path/to/nasbench_only108.pkl', verbose=True)
   # Or use environment variable
   api = NASBench101(verbose=True)

Dataset and Splits
~~~~~~~~~~~~~~~~~~

- **Single dataset**: CIFAR-10 only
- **Splits**: train, val, test
- **Training epochs**: 4, 12, 36, 108 (typically query final epoch 108)

Architecture Type (Arch101)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import Arch101

   arch = Arch101(
       adjacency=[[0, 1, 1, 0, 0, 0, 0], ...],  # 7×7 matrix
       operations=['input', 'conv3x3-bn-relu', ..., 'output']  # 7 ops
   )

Operations
~~~~~~~~~~

Available operations (from ``op_set()``):

- 'input' (fixed at node 0)
- 'conv3x3-bn-relu'
- 'conv1x1-bn-relu'
- 'maxpool3x3'
- 'output' (fixed at node 6)

encode / decode / id
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Encode Arch101 to native strings
   encoding = api.encode(arch)
   # Returns: {'adjacency_str': '0110000...', 'operations_str': 'input,conv3x3-bn-relu,...'}

   # Decode encoding to Arch101
   arch = api.decode(encoding)

   # Get stable hash ID
   arch_id = api.id(arch)
   # Returns: 'a3f5b2c8...' (SHA256 hash)

get_index
~~~~~~~~~

.. code-block:: python

   # Returns the same as id() for consistency
   hash_id = api.get_index(arch)
   # Returns: 'a3f5b2c8...'

random_sample
~~~~~~~~~~~~~

.. code-block:: python

   archs = api.random_sample(n=10, seed=42)
   # Returns: list of 10 Arch101 objects sampled from loaded data

iter_all
~~~~~~~~

.. code-block:: python

   for arch in api.iter_all():
       result = api.query(arch, dataset='cifar10', split='test')
       print(f"Test acc: {result['metric']:.2f}%")

query
~~~~~

.. code-block:: python

   info, metrics = api.query(arch, dataset='cifar10', split='val')
   # metrics -> {4: [run_dict, ...], 12: [...], 36: [...], 108: [...]}
   averaged = api.query(arch, dataset='cifar10', split='val', average=True)[1]
   summary = api.query(arch, dataset='cifar10', split='val', summary=True)

**Args:**

- ``arch``: Arch101 — architecture object
- ``dataset``: str — 'cifar10' (only dataset available)
- ``split``: str — 'train', 'val', or 'test'
- ``seed``: Optional[int] — unused
- ``budget``: Optional[Any] — unused (all budgets available in ``metrics``)
- ``average``: Optional[bool] — return averaged metrics per budget when True
- ``summary``: Optional[bool] — return condensed dict (legacy shape) when True

**Returns:**

- Tuple ``(info_dict, metrics_by_budget)`` when ``summary=False`` (default)
- Condensed dict when ``summary=True``

train_time
~~~~~~~~~~

Get training time for an architecture.

.. code-block:: python

   time_sec = api.train_time(arch, dataset='cifar10')
   # Returns: float (seconds) or None

mutate
~~~~~~

Apply a mutation to an architecture.

.. code-block:: python

   import random
   rng = random.Random(42)
   mutated = api.mutate(arch, rng=rng, kind='edge_toggle')

**Mutation kinds:**

- 'edge_toggle' — flip an edge in the adjacency matrix
- 'op_swap' — swap two operations

NASBench-201 Specifics
------------------------

Import and Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench201

   api = NASBench201('/path/to/NASBench-201-v1_1-096897.pth', verbose=True)
   # Or use environment variable
   api = NASBench201(verbose=True)

Dataset and Splits
~~~~~~~~~~~~~~~~~~

- **Datasets**: CIFAR-10, CIFAR-100, ImageNet16-120
- **Splits**: train, val, test
- **Training epochs**: 0-199 (200 epochs total)
- **Common budget values**: 12 (early), 199 (final epoch)
- **Default seed**: 777 (official NB201 seed)

Architecture Representation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NB201 uses **architecture strings** as the primary representation:

.. code-block:: python

   arch_str = '|none~0|+|skip_connect~0|nor_conv_1x1~1|+|nor_conv_3x3~0|avg_pool_3x3~1|skip_connect~2|'

**Format details:**

- Cell with 4 nodes (node 0 is input, nodes 1-3 are intermediate, node 4 is output)
- 6 edges: (1←0), (2←0), (2←1), (3←0), (3←1), (3←2)
- Each edge has one operation from: ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3']
- String format: ``|op~src|+|op~src|op~src|+|op~src|op~src|op~src|``

**Index mapping:**

- Each architecture has a canonical integer index: 0 to 15,624
- Use ``get_index(arch_str)`` to convert string → index

random_sample
~~~~~~~~~~~~~

.. code-block:: python

   arch_strs = api.random_sample(n=5, seed=42)
   # Returns: ['|none~0|+|...', '|skip_connect~0|+|...', ...]

**Returns**: list[str] — architecture strings

iter_all
~~~~~~~~

.. code-block:: python

   for arch_str in api.iter_all():
       idx = api.get_index(arch_str)
       print(f"Architecture {idx}: {arch_str}")

**Returns**: Iterator[str] — yields architecture strings

get_index
~~~~~~~~~

Convert an architecture string to its canonical integer index.

.. code-block:: python

   idx = api.get_index('|none~0|+|skip_connect~0|nor_conv_1x1~1|+|...')
   # Returns: 12345 (int in range 0-15624)

**Args:**

- ``arch``: str — NB201 architecture string

**Returns:** int — index (0-15624)

**Raises:** ValueError if architecture string is invalid

query
~~~~~

.. code-block:: python

   result = api.query(
       arch='|none~0|+|skip_connect~0|nor_conv_1x1~1|+|...',
       dataset='cifar10',
       split='val',
       seed=777,      # Default seed
       budget=199     # Final epoch
   )

**Args:**

- ``arch``: str — NB201 architecture string
- ``dataset``: str — 'cifar10', 'cifar100', or 'ImageNet16-120'
- ``split``: str — 'train', 'val', or 'test'
- ``seed``: Optional[int] — data seed (default: 777)
- ``budget``: Optional[int] — epoch number 0-199 (default: 199)

**Returns:** dict with keys:

- ``metric``: accuracy percentage (e.g., 91.23)
- ``metric_name``: '{split}_acc'
- ``cost``: training/eval time in seconds
- ``std``: None (not used)
- ``info``: dict with arch_index, dataset, split, seed, epoch, arch_str, params, flop

**Split-specific behavior:**

- 'train': Returns training accuracy at specified epoch
- 'val': Returns validation accuracy (uses 'x-valid@epoch' keys in data)
- 'test': Returns test accuracy (uses 'ori-test@epoch' keys, falls back to validation)

NASBench-301 Specifics
------------------------

Import and Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench301

   api = NASBench301('/path/to/nb301_data.pkl', verbose=True)
   # Or use environment variable
   api = NASBench301(verbose=True)

Dataset and Splits
~~~~~~~~~~~~~~~~~~

- **Datasets**: CIFAR-10, CIFAR-100
- **Splits**: val, test (no train split for surrogates)
- **Training epochs**: Validation learning curves provide per-epoch accuracies; the test split reports metrics at the declared final budget for each entry.

Architecture Representation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NB301 uses **DARTS-style architecture dictionaries**:

.. code-block:: python

   arch = {
       'normal': [
           ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),  # Node 1 inputs
           ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),  # Node 2 inputs
           ('sep_conv_3x3', 1), ('skip_connect', 0),  # Node 3 inputs
           ('skip_connect', 0), ('dil_conv_3x3', 2)   # Node 4 inputs
       ],
       'reduce': [
           ('max_pool_3x3', 0), ('max_pool_3x3', 1),
           ('skip_connect', 2), ('max_pool_3x3', 0),
           ('max_pool_3x3', 0), ('skip_connect', 2),
           ('skip_connect', 2), ('max_pool_3x3', 1)
       ]
   }

**Format details:**

- Two cells: 'normal' and 'reduce' (reduction cell)
- Each cell has 4 intermediate nodes
- Each node selects 2 operations from previous nodes (including input nodes 0 and 1)
- 8 operations: ['max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3', 'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5', 'none']
- Each entry is a tuple: (operation_name, predecessor_node_index)

random_sample
~~~~~~~~~~~~~

.. code-block:: python

   indices = api.random_sample(n=3, seed=42)
   # Returns: [102, 4096, 7123]

**Returns**: list[int] — dataset entry indices (falls back to architecture dict samples if raw entries are unavailable)

iter_all
~~~~~~~~

.. code-block:: python

   for idx in api.iter_all():
       print(f"Architecture index: {idx}")

**Returns**: Iterator[int] — yields indices in loaded data

get_index
~~~~~~~~~

Find the index of an architecture in loaded data.

.. code-block:: python

   idx = api.get_index(arch_dict)
   # Returns: 42 (int) or None if not found

**Args:**

- ``arch``: Any — architecture dict, dataset index, or entry path string

**Returns:** Optional[int] — index in loaded data, or None if not found

query
~~~~~

.. code-block:: python

   result = api.query(
       arch=0,           # dataset index
       dataset='cifar10',
       split='val',
       budget=50,        # epoch index
   )

**Args:**

- ``arch``: Any — dataset index (int), entry path (str), or architecture dict with 'normal'/'reduce' keys
- ``dataset``: str — 'cifar10' or 'cifar100'
- ``split``: str — 'val' or 'test'
- ``seed``: Optional[int] — unused
- ``budget``: Optional[int] — epoch index for validation curves (defaults to final epoch); ignored for test split

**Returns:** dict with keys: metric, metric_name, cost, std, info (runtime in seconds, dataset metadata, epochs available/used, declared budget, optimizer tag, and JSON path)

**Split behavior:**

- ``val``: accuracy from the per-entry validation learning curve; budgets beyond the recorded length fall back to the final epoch.
- ``test``: reported test accuracy at the declared final budget (the ``budget`` argument is ignored).

Complete Usage Examples
-----------------------

NASBench-101 Example
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench101

   # Initialize
   api = NASBench101(verbose=True)
   stats = api.get_statistics()
   print(f"Loaded {stats['architectures']} architectures")

   # Sample architectures
   archs = api.random_sample(n=5, seed=42)

   # Query performance
   for arch in archs:
       result = api.query(arch, dataset='cifar10', split='test')
       print(f"Test accuracy: {result['metric']:.2f}%")
       print(f"Training time: {result['cost']:.2f}s")

NASBench-201 Example
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench201

   # Initialize
   api = NASBench201(verbose=True)

   # Sample architecture strings
   arch_strs = api.random_sample(n=3, seed=777)

   # Query on multiple datasets
   for arch_str in arch_strs:
       idx = api.get_index(arch_str)
       print(f"\nArchitecture {idx}:")

       for dataset in ['cifar10', 'cifar100', 'ImageNet16-120']:
           result = api.query(
               arch=arch_str,
               dataset=dataset,
               split='test',
               seed=777,
               budget=199
           )
           print(f"  {dataset} test acc: {result['metric']:.2f}%")

   # Iterate all architectures
   count = 0
   for arch_str in api.iter_all():
       count += 1
       if count > 5:
           break
       result = api.query(arch_str, dataset='cifar10', split='val')
       print(f"Arch {count}: val_acc = {result['metric']:.2f}%")

NASBench-301 Example
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench301

   # Initialize
   api = NASBench301(verbose=True)

   # Sample dataset indices
   arch_indices = api.random_sample(n=2, seed=42)

   # Query performance at multiple epochs
   for idx in arch_indices:
       final_val = api.query(idx, dataset='cifar10', split='val')
       mid_val = api.query(idx, dataset='cifar10', split='val', budget=50)
       print(f"Index {idx}: final={final_val['metric']:.2f}% | mid@50={mid_val['metric']:.2f}%")

Error Handling
--------------

Common Exceptions
~~~~~~~~~~~~~~~~~

**ValueError**:

- Invalid architecture string format (NB201)
- Architecture index out of range
- Invalid dataset or split name

**FileNotFoundError**:

- Pickle file not found at specified path
- Environment variable not set

**KeyError**:

- Data format mismatch (e.g., missing expected keys in pickle)

Example Error Handling
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from nasbenchapi import NASBench201

   try:
       api = NASBench201('/path/to/data.pkl', verbose=True)
   except FileNotFoundError:
       print("Data file not found. Please set NASBENCH201_PATH or provide valid path.")
       exit(1)

   try:
       result = api.query(
           arch='|invalid~format|',
           dataset='cifar10',
           split='val'
       )
   except ValueError as e:
       print(f"Invalid architecture: {e}")

Verbose Logging Control
-----------------------

All benchmarks support a ``verbose`` parameter to control logging output:

.. code-block:: python

   # Enable all logging (default)
   api = NASBench201(verbose=True)
   # Outputs:
   # Loading NB201 from /path/to/file.pkl (2.1 GB)
   # Reading: 100%|██████████| 2.1G/2.1G [00:15<00:00]
   # Unpickling data...
   # Unpickling complete.
   # [NB201] Loaded 15625 architectures (source=arch2infos)

   # Disable all logging (silent mode)
   api = NASBench201(verbose=False)
   # No output

Logging includes:

- File loading progress bars (via tqdm)
- Unpickling status messages
- Data summary and statistics
- Warning messages (e.g., mapping failures)