TXT, CSV, and Parquet custom parser walkthrough
This walkthrough covers the directory-based parsers for TXT, CSV, and Parquet payloads.
Main Entry Points
mario.parse_from_txt(...)
mario.parse_from_parquet(...)
Key arguments and directory layout
Use mario.parse_from_txt(...) for TXT or CSV payloads and mario.parse_from_parquet(...) for Parquet payloads.
Key arguments:
path: directory containing the files to parse
table: choose "IOT" or "SUT"
mode: choose "flows" or "coefficients"
flat: set True for long-format payloads
sep and _format: TXT or CSV only
matrix_layouts: optional semantic declaration for non-standard matrix layouts
tech_assumption: optional SUT selector for IT or PT
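The argument choices listed above can be checked before the parser is called. The following is a minimal sketch, not part of MARIO: `build_txt_kwargs` is a hypothetical helper, and the default values for `sep` and `_format` are placeholders chosen for the sketch rather than MARIO's own defaults.

```python
# Allowed values copied from the argument list above.
ALLOWED_TABLES = {"IOT", "SUT"}
ALLOWED_MODES = {"flows", "coefficients"}


def build_txt_kwargs(path, table, mode, flat=False, sep=",", _format="csv"):
    """Hypothetical helper: validate the documented choices and return keyword arguments."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"table must be one of {sorted(ALLOWED_TABLES)}, got {table!r}")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    return {
        "path": str(path),
        "table": table,
        "mode": mode,
        "flat": bool(flat),
        "sep": sep,          # placeholder default, assumed for this sketch
        "_format": _format,  # placeholder default, assumed for this sketch
    }


kwargs = build_txt_kwargs("/path/to/iot_export_csv/flows", table="IOT", mode="flows")

# A misspelled choice is rejected before any file is read.
try:
    build_txt_kwargs("/path/to/iot_export_csv/flows", table="IO", mode="flows")
    rejected = False
except ValueError:
    rejected = True
```

With MARIO installed, the resulting dictionary could then be passed on as `mario.parse_from_txt(**kwargs)`.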
Matrix-per-file payloads look like:
custom_txt_database/
├── Z.csv
├── Y.csv
├── V.csv
├── E.csv
├── EY.csv
└── units.csv
Flat payloads can use one combined data file plus units, or one flat file per matrix plus units. The same directory logic applies to Parquet.
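For the matrix-per-file layout listed above, a quick pre-flight check can report which files are absent before parsing starts. This is a sketch under assumptions: `missing_matrix_files` is a hypothetical helper, and whether MARIO requires every one of these files is not stated here, so the check only compares a directory against the listing.

```python
import tempfile
from pathlib import Path

# File stems copied from the directory listing above.
EXPECTED_STEMS = ("Z", "Y", "V", "E", "EY", "units")


def missing_matrix_files(directory, extension="csv"):
    """Hypothetical helper: list the expected files that are absent from `directory`."""
    directory = Path(directory)
    if not directory.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    expected = [f"{stem}.{extension}" for stem in EXPECTED_STEMS]
    return [name for name in expected if not (directory / name).is_file()]


# Demo on a throwaway directory that holds only two of the six files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Z.csv", "units.csv"):
        (Path(tmp) / name).write_text("")
    missing = missing_matrix_files(tmp)

print(missing)
```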
Packaged example directories
The parser examples below use exported database folders bundled with the documentation.
Extract each archive locally and point path to the inner flows or coefficients directory shown in the code examples below.
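The extract-and-point step can be scripted with the standard library. The sketch below assumes the archive is in a format that `shutil.unpack_archive` recognises from its file extension (for example zip or tar); `extract_and_locate` and the archive name `example_export` are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def extract_and_locate(archive, target_dir, mode="flows"):
    """Hypothetical helper: unpack `archive` and return the first directory named `mode`."""
    target_dir = Path(target_dir)
    shutil.unpack_archive(str(archive), str(target_dir))
    matches = sorted(p for p in target_dir.rglob(mode) if p.is_dir())
    if not matches:
        raise FileNotFoundError(f"no '{mode}' directory found under {target_dir}")
    return matches[0]


# Demo with a stand-in archive built on the fly; "example_export" is a made-up name.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    (source / "example_export" / "flows").mkdir(parents=True)
    (source / "example_export" / "flows" / "Z.csv").write_text("")
    archive = shutil.make_archive(str(Path(tmp) / "example_export"), "zip", root_dir=source)

    flows_dir = extract_and_locate(archive, Path(tmp) / "extracted", mode="flows")
    found_name = flows_dir.name
    has_z = (flows_dir / "Z.csv").is_file()
```

The returned directory is the value that would be passed as `path` to the parser.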
import mario
Matrix-per-file TXT or CSV
Use flat=False for the historical matrix-per-file layout.
db = mario.parse_from_txt(
path="/path/to/iot_export_csv/flows",  # inner flows directory of the extracted export
table="IOT",
mode="flows",
_format="csv",
flat=False,  # one file per matrix
)
INFO Parser: txt reading IOT flows from /path/to/MARIO/mario/test/supporting_files/iot_export_csv/flows in matrix mode (csv).
INFO Parser: Reading flows from txt files.
INFO Parser: Reading files finished.
INFO Parser: Investigating possible identifiable errors.
INFO Parser: parsing database finished.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: txt state ready for IOT.
INFO Metadata: initialized.
Flat TXT or CSV
Use flat=True for long-format payloads. MARIO accepts either one combined data file or separate flat files per matrix, as long as units is present.
db = mario.parse_from_txt(
path="/path/to/iot_export_csv/coefficients",  # inner coefficients directory, matching mode
table="IOT",
mode="coefficients",
_format="csv",
flat=True,  # long-format payload
)
INFO Parser: txt reading IOT coefficients from /path/to/MARIO/mario/test/supporting_files/iot_export_csv/coefficients in flat mode (csv).
INFO Parser: reading coefficients from flat txt files.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: txt state ready for IOT.
INFO Metadata: initialized.
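To make "long-format" concrete, the sketch below reshapes a small wide matrix into one record per cell. It illustrates the general idea only: the labels and values are made up, and the field names `row`, `column`, and `value` are assumptions for the example, not the column schema MARIO expects in a flat file.

```python
def wide_to_long(matrix):
    """Flatten {row_label: {column_label: value}} into one record per cell."""
    return [
        {"row": row_label, "column": column_label, "value": value}
        for row_label, columns in matrix.items()
        for column_label, value in columns.items()
    ]


# Made-up 2x2 matrix with placeholder labels "A" and "B".
wide = {
    "A": {"A": 1.0, "B": 2.0},
    "B": {"A": 3.0, "B": 4.0},
}

long_rows = wide_to_long(wide)
for record in long_rows:
    print(record)
```

A wide matrix with two rows and two columns becomes four records, which is why a flat payload can hold several matrices in one combined file.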
Flat Parquet
The same logic applies to Parquet exports.
db = mario.parse_from_parquet(
path="/path/to/iot_export_parquet/flows",  # inner flows directory, matching mode
table="IOT",
mode="flows",
flat=True,  # long-format payload
)
INFO Parser: parquet reading IOT flows from /path/to/MARIO/mario/test/supporting_files/iot_export_parquet/flows in flat mode.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: parquet state ready for IOT.
INFO Metadata: initialized.
Reference notes and caveats
These parsers are directory-based. path must point to a directory, not to an individual file.
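One way to guard against passing a file where a directory is expected is to normalise the path first. This is a sketch only: `as_payload_dir` is a hypothetical helper, and falling back to the parent directory of a file is a design choice made for the example, not MARIO behaviour.

```python
import tempfile
from pathlib import Path


def as_payload_dir(path):
    """Hypothetical helper: return a directory, using the parent when given a file."""
    candidate = Path(path)
    if candidate.is_dir():
        return candidate
    if candidate.is_file():
        # A file such as .../flows/Z.csv was given; the parser needs .../flows.
        return candidate.parent
    raise FileNotFoundError(f"{candidate} does not exist")


# Demo: both the directory and a file inside it resolve to the same directory.
with tempfile.TemporaryDirectory() as tmp:
    payload_file = Path(tmp) / "Z.csv"
    payload_file.write_text("")
    from_dir_ok = as_payload_dir(tmp) == Path(tmp)
    from_file_ok = as_payload_dir(payload_file) == Path(tmp)
```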
Use flat=True for long-format payloads. For TXT or CSV parsing, _format and sep matter; for Parquet parsing they do not.
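The split between text-only and shared arguments can be expressed as a small dispatcher. The sketch below is an assumption-laden illustration: `plan_parse_call` is a hypothetical helper, and it assumes Parquet files carry a `.parquet` suffix and text payloads a `.txt` or `.csv` suffix. It returns the parser name and keyword arguments instead of calling MARIO directly.

```python
import tempfile
from pathlib import Path

TEXT_SUFFIXES = {".txt", ".csv"}
TEXT_ONLY_OPTIONS = ("sep", "_format")  # ignored for Parquet, per the note above


def plan_parse_call(directory, **options):
    """Hypothetical helper: pick a parser name and the keyword arguments to send it."""
    suffixes = {p.suffix.lower() for p in Path(directory).iterdir() if p.is_file()}
    if ".parquet" in suffixes:
        kept = {k: v for k, v in options.items() if k not in TEXT_ONLY_OPTIONS}
        return "parse_from_parquet", {"path": str(directory), **kept}
    if suffixes & TEXT_SUFFIXES:
        return "parse_from_txt", {"path": str(directory), **options}
    raise ValueError(f"no TXT, CSV, or Parquet files found in {directory}")


# Demo: a directory holding a Parquet file drops the text-only options.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Z.parquet").write_text("")
    parser_name, call_kwargs = plan_parse_call(
        tmp, table="IOT", mode="flows", flat=True, sep=",", _format="csv"
    )
```

With MARIO installed, the planned call could be executed as `getattr(mario, parser_name)(**call_kwargs)`.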
The same semantic rules used for custom Excel parsing also apply here: if an IOT layout carries extra semantic levels, declare them through matrix_layouts= instead of relying on filename conventions alone.
These formats are usually preferable to Excel when the data already comes from a MARIO export or from an automated preprocessing workflow.