Skip to content

Reference

HeaderAnnotator

Annotates DataFrame exports with metadata headers for Excel reports.

This class reads a CSV "specification" file describing metadata for fields and applies that metadata as annotated header rows when exporting DataFrames to Excel. It ensures consistent column ordering and makes reports self-documenting.

Experimental feature

HeaderAnnotator and HeaderSpec are experimental and may change without warning.

Responsibilities:

  • Parse a CSV specification file that defines metadata for fields.
  • Map specification metadata onto DataFrame columns.
  • Prepend metadata rows (header blocks) above data rows in exported Excel files.

__init__(spec_csv_path, spec_path_col, drop_empty_columns, spec_ignore_cols=None)

Initialize a HeaderAnnotator.

Parameters:

Name Type Description Default
spec_csv_path str | Path

Path to the CSV file containing header specifications.

required
spec_path_col str

Column name in the spec that defines the canonical field paths.

required
drop_empty_columns bool

If True, columns with no matching data will be dropped from the annotated DataFrame.

required
spec_ignore_cols list[str] | None

Optional list of spec columns to ignore and drop.

None
Notes
  • The specification CSV is normalized automatically (duplicate rows dropped, hyperlinks applied if *_link columns are present, and known typos corrected).
  • The spec_path_col must contain canonical dot-notation paths (without indices).

apply_metadata_header(df)

Annotate a DataFrame with specification metadata.

Prepends one or more metadata rows (from the spec CSV) above the actual data rows, ensuring consistent column ordering. The result is ready to export directly to Excel with self-documenting headers.

Parameters:

Name Type Description Default
df DataFrame

Input DataFrame. Must contain a 'path_to_root' column.

required

Returns:

Type Description
DataFrame

DataFrame with metadata rows stacked on top of the original data rows.

to_excel_with_metadata(writer, sheet_name, df, *, index=False, header=True, auto_filter=True, styler_fn=None) staticmethod

Write a DataFrame or Styler to an Excel worksheet, including metadata header rows.

The header rows are styled in gray and frozen, while the data block may have styling applied through a custom styler function.

Parameters:

Name Type Description Default
writer

An ExcelWriter object.

required
sheet_name str

Target worksheet name.

required
df DataFrame | Styler

Data or styled DataFrame including metadata rows.

required
index bool

Write index column. Default False.

False
header bool

Write column headers. Default True.

True
auto_filter bool

Add an autofilter. Default True.

True
styler_fn Callable

Function applied to style the body rows.

None

Returns:

Name Type Description
Worksheet Worksheet

The created xlsxwriter worksheet object.

HeaderSpec dataclass

Dataclass wrapper for header specification files.

Experimental feature

HeaderSpec is considered experimental and WILL change without warning.

Encapsulates configuration for creating a HeaderAnnotator. Use this class when you want to bundle spec file location and options together.

Attributes:

Name Type Description
spec_csv_path str

Path to the CSV file with header metadata.

spec_path_col str

Column in the spec that defines canonical paths. Default "path".

spec_ignore_cols list[str]

List of spec fields to ignore when loading.

drop_empty_columns bool

If True, drop columns that are empty after merging.

Methods:

Name Description
get_annotator

Create a HeaderAnnotator configured with this spec.

Specification CSV

The specification CSV provides metadata for DataFrame columns that are exported to Excel (diff and population report). It defines a metadata block whit info about the columns.

File structure

The CSV must contain at least one column:

  • path (required): A canonical field path this row describes.
  • Written in dot notation, without numeric indices.
  • Examples:
    • SingleSwitch.@puic or SingleSwitch.extension.MicroNode.@railConnectionRef.

Other columns will be rendered as documentation rows above the table header.

Example metadata columns include:

  • description: Explanation of the field’s meaning.
  • datatype: Expected type (e.g. string, number, enum:…).
  • required: Whether the field is mandatory (yes/no).
  • domain: Controlled vocabulary or external reference.
Example
path,label,description,datatype,required,documentation,documentation_link
SingleSwitch.@puic,PUIC,Unique object identifier,string,yes,PUIC spec,https://docs.example/puic
SingleSwitch.parent,Parent PUIC,Parent object reference,string,no,Parent ref,https://docs.example/parent
SingleSwitch.extension.MicroNode.@railConnectionRef,RailConn Ref,Reference to rail connection,string,no,Ext ref,https://docs.example/railconn

If a column X has a companion X_link column, the loader will replace X with an Excel HYPERLINK formula pointing to X_link.

Example:

  • documentation + documentation_link=HYPERLINK("https://docs.example/puic", "PUIC spec")
Excel export

The world runs on Excel so we generate an Excel sheet where metadata rows are stacked above the data rows, making the report self-documenting.

get_annotator()

Return a HeaderAnnotator configured with this spec.