Reference¶
HeaderAnnotator
¶
Annotates DataFrame exports with metadata headers for Excel reports.
This class reads a CSV "specification" file describing metadata for fields and applies that metadata as annotated header rows when exporting DataFrames to Excel. It ensures consistent column ordering and makes reports self-documenting.
Experimental feature
HeaderAnnotator and HeaderSpec are experimental and may change without warning.
Responsibilities:
- Parse a CSV specification file that defines metadata for fields.
- Map specification metadata onto DataFrame columns.
- Prepend metadata rows (header blocks) above data rows in exported Excel files.
__init__(spec_csv_path, spec_path_col, drop_empty_columns, spec_ignore_cols=None)
¶
Initialize a HeaderAnnotator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
spec_csv_path |
str | Path
|
Path to the CSV file containing header specifications. |
required |
spec_path_col |
str
|
Column name in the spec that defines the canonical field paths. |
required |
drop_empty_columns |
bool
|
If True, columns with no matching data will be dropped from the annotated DataFrame. |
required |
spec_ignore_cols |
list[str] | None
|
Optional list of spec columns to ignore and drop. |
None
|
Notes
- The specification CSV is normalized automatically (duplicate rows dropped, hyperlinks applied if *_link columns are present, and known typos corrected).
- The spec_path_col must contain canonical dot-notation paths (without indices).
apply_metadata_header(df)
¶
Annotate a DataFrame with specification metadata.
Prepends one or more metadata rows (from the spec CSV) above the actual data rows, ensuring consistent column ordering. The result is ready to export directly to Excel with self-documenting headers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df |
DataFrame
|
Input DataFrame. Must contain a 'path_to_root' column. |
required |
Returns:
Type | Description |
---|---|
DataFrame
|
DataFrame with metadata rows stacked on top of the original data rows. |
to_excel_with_metadata(writer, sheet_name, df, *, index=False, header=True, auto_filter=True, styler_fn=None)
staticmethod
¶
Write a DataFrame or Styler to an Excel worksheet, including metadata header rows.
The header rows are styled in gray and frozen, while the data block may have styling applied through a custom styler function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
writer |
An ExcelWriter object. |
required | |
sheet_name |
str
|
Target worksheet name. |
required |
df |
DataFrame | Styler
|
Data or styled DataFrame including metadata rows. |
required |
index |
bool
|
Write index column. Default False. |
False
|
header |
bool
|
Write column headers. Default True. |
True
|
auto_filter |
bool
|
Add an autofilter. Default True. |
True
|
styler_fn |
Callable
|
Function applied to style the body rows. |
None
|
Returns:
Name | Type | Description |
---|---|---|
Worksheet |
Worksheet
|
The created xlsxwriter worksheet object. |
HeaderSpec
dataclass
¶
Dataclass wrapper for header specification files.
Experimental feature
HeaderSpec is considered experimental and WILL change without warning.
Encapsulates configuration for creating a HeaderAnnotator. Use this class when you want to bundle spec file location and options together.
Attributes:
Name | Type | Description |
---|---|---|
spec_csv_path |
str
|
Path to the CSV file with header metadata. |
spec_path_col |
str
|
Column in the spec that defines canonical paths. Default "path". |
spec_ignore_cols |
list[str]
|
List of spec fields to ignore when loading. |
drop_empty_columns |
bool
|
If True, drop columns that are empty after merging. |
Methods:
Name | Description |
---|---|
get_annotator |
Create a HeaderAnnotator configured with this spec. |
Specification CSV¶
The specification CSV provides metadata for DataFrame columns that are exported to Excel (diff and population report). It defines a metadata block whit info about the columns.
File structure¶
The CSV must contain at least one column:
- path (required): A canonical field path this row describes.
- Written in dot notation, without numeric indices.
- Examples:
SingleSwitch.@puic
orSingleSwitch.extension.MicroNode.@railConnectionRef
.
Other columns will be rendered as documentation rows above the table header.
Example metadata columns include:
- description: Explanation of the field’s meaning.
- datatype: Expected type (e.g.
string
,number
,enum:…
). - required: Whether the field is mandatory (
yes
/no
). - domain: Controlled vocabulary or external reference.
Example¶
path,label,description,datatype,required,documentation,documentation_link
SingleSwitch.@puic,PUIC,Unique object identifier,string,yes,PUIC spec,https://docs.example/puic
SingleSwitch.parent,Parent PUIC,Parent object reference,string,no,Parent ref,https://docs.example/parent
SingleSwitch.extension.MicroNode.@railConnectionRef,RailConn Ref,Reference to rail connection,string,no,Ext ref,https://docs.example/railconn
Hyperlink support¶
If a column X
has a companion X_link
column, the loader will replace X
with an Excel
HYPERLINK formula pointing to X_link
.
Example:
documentation
+documentation_link
→=HYPERLINK("https://docs.example/puic", "PUIC spec")
Excel export¶
The world runs on Excel so we generate an Excel sheet where metadata rows are stacked above the data rows, making the report self-documenting.
get_annotator()
¶
Return a HeaderAnnotator
configured with this spec.