Excel custom parser walkthrough#
This notebook shows how to parse existing MARIO-readable Excel workbooks with mario.parse_from_excel(...). It uses the small test workbooks shipped with MARIO, so the focus is on reading a completed workbook rather than generating a new template.
What this notebook covers#
parse standard IOT and SUT Excel workbooks;
parse Excel workbooks that need explicit
matrix_layouts;inspect the parsed baseline blocks and index labels.
Creating a blank workbook from sets and units is covered in the dedicated workflow on providing a custom database.
Example workbooks#
The examples use these packaged workbooks:
test_IOT_standard.xlsx: standard IOT layout;test_IOT_special.xlsx: IOT layout with extra semantic levels onVandE;test_SUT_standard.xlsx: standard SUT layout;test_SUT_special.xlsx: SUT layout with extra semantic levels on satellite extensions.
The code below uses paths relative to this notebook source location in the MARIO repository. If you run the notebook elsewhere, replace the path strings with the location of your local Excel files.
Main entry point and key arguments#
Use mario.parse_from_excel(...) when path points to one completed MARIO-readable workbook.
Key arguments:
path: workbook to parse;table: choose"IOT"or"SUT";mode: choose"flows"or"coefficients";data_sheet: optional sheet selector for the table payload;unit_sheet: optional sheet selector for the units sheet;matrix_layouts: optional semantic declaration for non-standard matrix layouts;tech_assumption: optionalSUTselector forITorPT.
Workbook structure and packaged examples#
path must point to one Excel workbook.
custom_database.xlsx
├── data sheet
└── units sheet
The data sheet can use the canonical MARIO matrix layout exported by MARIO or a flat template generated with mario.write_parse_template(...). The units sheet must contain the units for the labels used by sectors, activities, commodities, factors, final demand, and satellites.
Packaged example workbooks used in this notebook:
Standard and special layouts#
In the standard workbook layout, the table uses three index levels on both rows and columns: region, level name, and level label. Use mode="flows" or mode="coefficients" to tell MARIO how to interpret the numbers.
For non-standard layouts, use matrix_layouts= to declare extra semantic layers, especially on V and E. VY and EY inherit the same extra layout as V and E, so you do not declare them separately.
[1]:
import mario
Parse a standard IOT workbook#
For a standard IOT workbook, only the file path, table kind and mode are required.
[2]:
iot = mario.parse_from_excel(
path="/path/to/test_IOT_standard.xlsx",
table="IOT",
mode="flows",
)
INFO Parser: excel reading IOT flows from /path/to/test_IOT_standard.xlsx.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: excel state ready for IOT.
INFO Metadata: initialized.
[3]:
iot.meta.table
[3]:
'IOT'
[4]:
iot.get_index("Region")
[4]:
['Reg1', 'Reg2']
[5]:
iot.get_index("Sector")
[5]:
['Agriculture', 'Services', 'Industry']
[6]:
iot.V
[6]:
| Region | Reg1 | Reg2 | ||||
|---|---|---|---|---|---|---|
| Level | Sector | Sector | ||||
| Item | Agriculture | Industry | Services | Agriculture | Industry | Services |
| Item | ||||||
| Taxes | 5.517199e+05 | 3.047096e+06 | 5.010693e+06 | 4908.415181 | 36852.481913 | 143818.770826 |
| Wages | 2.576824e+06 | 6.311953e+06 | 3.339472e+07 | 9698.892565 | 213257.479272 | 754513.227394 |
| Capital | 2.887880e+06 | 5.616641e+06 | 2.366083e+07 | 30784.073998 | 101949.241428 | 529918.086220 |
Parse an IOT workbook with custom matrix layouts#
Use matrix_layouts when specific matrices carry extra semantic levels beyond the default layout. In this example, V is indexed by region and sector, while E is indexed by region.
[7]:
iot_special = mario.parse_from_excel(
path="/path/to/test_IOT_special.xlsx",
table="IOT",
mode="flows",
matrix_layouts={
"V": ("Region", "Sector"),
"E": "Region",
},
)
INFO Parser: excel reading IOT flows from /path/to/test_IOT_special.xlsx.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: excel state ready for IOT.
INFO Metadata: initialized.
[8]:
iot_special.V
[8]:
| Region | Reg1 | Reg2 | ||||||
|---|---|---|---|---|---|---|---|---|
| Sector | Agriculture | Services | Industry | Agriculture | Services | Industry | ||
| Region | Sector | Factor of production | ||||||
| Reg1 | Agriculture | Taxes | 2.206879e+05 | 2.004277e+06 | 1.218838e+06 | 1963.366072 | 57527.508330 | 14740.992765 |
| Services | Taxes | 5.517199e+04 | 5.010693e+05 | 3.047096e+05 | 490.841518 | 14381.877083 | 3685.248191 | |
| Industry | Taxes | 2.758599e+04 | 2.505346e+05 | 1.523548e+05 | 245.420759 | 7190.938541 | 1842.624096 | |
| Reg2 | Agriculture | Taxes | 1.931020e+05 | 1.753742e+06 | 1.066483e+06 | 1717.945313 | 50336.569789 | 12898.368670 |
| Services | Taxes | 2.758599e+04 | 2.505346e+05 | 1.523548e+05 | 245.420759 | 7190.938541 | 1842.624096 | |
| Industry | Taxes | 2.758599e+04 | 2.505346e+05 | 1.523548e+05 | 245.420759 | 7190.938541 | 1842.624096 | |
| Reg1 | Agriculture | Wages | 1.030730e+06 | 1.335789e+07 | 2.524781e+06 | 3879.557026 | 301805.290958 | 85302.991709 |
| Services | Wages | 2.576824e+05 | 3.339472e+06 | 6.311953e+05 | 969.889257 | 75451.322739 | 21325.747927 | |
| Industry | Wages | 1.288412e+05 | 1.669736e+06 | 3.155977e+05 | 484.944628 | 37725.661370 | 10662.873964 | |
| Reg2 | Agriculture | Wages | 9.018884e+05 | 1.168815e+07 | 2.209184e+06 | 3394.612398 | 264079.629588 | 74640.117745 |
| Services | Wages | 1.288412e+05 | 1.669736e+06 | 3.155977e+05 | 484.944628 | 37725.661370 | 10662.873964 | |
| Industry | Wages | 1.288412e+05 | 1.669736e+06 | 3.155977e+05 | 484.944628 | 37725.661370 | 10662.873964 | |
| Reg1 | Agriculture | Capital | 1.155152e+06 | 9.464332e+06 | 2.246656e+06 | 12313.629599 | 211967.234488 | 40779.696571 |
| Services | Capital | 2.887880e+05 | 2.366083e+06 | 5.616641e+05 | 3078.407400 | 52991.808622 | 10194.924143 | |
| Industry | Capital | 1.443940e+05 | 1.183041e+06 | 2.808320e+05 | 1539.203700 | 26495.904311 | 5097.462071 | |
| Reg2 | Agriculture | Capital | 1.010758e+06 | 8.281290e+06 | 1.965824e+06 | 10774.425899 | 185471.330177 | 35682.234500 |
| Services | Capital | 1.443940e+05 | 1.183041e+06 | 2.808320e+05 | 1539.203700 | 26495.904311 | 5097.462071 | |
| Industry | Capital | 1.443940e+05 | 1.183041e+06 | 2.808320e+05 | 1539.203700 | 26495.904311 | 5097.462071 | |
Parse a standard SUT workbook#
For SUT workbooks, pass the table kind as "SUT". You can also pass tech_assumption= when you want MARIO to record the technology assumption used downstream.
[9]:
sut = mario.parse_from_excel(
path="/path/to/test_SUT_standard.xlsx",
table="SUT",
mode="flows",
tech_assumption="IT",
)
INFO Parser: excel reading SUT flows from /path/to/test_SUT_standard.xlsx.
INFO Parser: state payload ready with 10 canonical blocks.
INFO Parser: excel state ready for SUT.
INFO Metadata: initialized.
[11]:
sut.meta.table
[11]:
'SUT'
[12]:
sorted(sut["baseline"].keys())
[12]:
['EY',
'Ea',
'Ec',
'S',
'U',
'VY',
'Va',
'Vc',
'Xa',
'Xc',
'Ya',
'Yc',
'ea',
's',
'u',
'wcc']
[13]:
sut.E
INFO Resolver: resolving E for baseline.
INFO Resolver: trying E via concat.
INFO Resolver: resolved E via concat.
[13]:
| Region | Region 1 | Region 2 | Region 1 | Region 2 | ||||
|---|---|---|---|---|---|---|---|---|
| Level | Activity | Activity | Commodity | Commodity | ||||
| Item | Manufacturing | Services | Manufacturing | Services | Goods | Services | Goods | Services |
| Item | ||||||||
| Employment | 3.752922e+06 | 6.547713e+05 | 1.926382e+04 | 3.939633e+03 | 0.0 | 0.0 | 0.0 | 0.0 |
| CO2 | 2.220336e+13 | 8.884627e+12 | 1.910855e+11 | 4.839515e+10 | 0.0 | 0.0 | 0.0 | 0.0 |
Parse a SUT workbook with custom matrix layouts#
The same matrix_layouts mechanism applies to SUT workbooks. Here, the satellite-account matrices are declared with region and commodity detail.
[14]:
sut_special = mario.parse_from_excel(
path="/path/to/test_SUT_special.xlsx",
table="SUT",
mode="flows",
tech_assumption="IT",
matrix_layouts={
"E": ("Region", "Commodity"),
},
)
INFO Parser: excel reading SUT flows from /path/to/test_SUT_special.xlsx.
INFO Parser: state payload ready with 10 canonical blocks.
INFO Parser: excel state ready for SUT.
INFO Metadata: initialized.
[15]:
sut_special.E
INFO Resolver: resolving E for baseline.
INFO Resolver: trying E via concat.
INFO Resolver: resolved E via concat.
[15]:
| Region | Region 1 | Region 2 | Region 1 | Region 2 | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Level | Activity | Activity | Commodity | Commodity | ||||||
| Item | Manufacturing | Services | Manufacturing | Services | Goods | Services | Goods | Services | ||
| Region | Commodity | Satellite account | ||||||||
| Region 1 | Goods | Employment | 1.125877e+06 | 1.964314e+05 | 5.779146e+03 | 1.181890e+03 | 0.0 | 0.0 | 0.0 | 0.0 |
| Services | Employment | 7.505844e+05 | 1.309543e+05 | 3.852764e+03 | 7.879266e+02 | 0.0 | 0.0 | 0.0 | 0.0 | |
| Region 2 | Goods | Employment | 1.501169e+06 | 2.619085e+05 | 7.705528e+03 | 1.575853e+03 | 0.0 | 0.0 | 0.0 | 0.0 |
| Services | Employment | 3.752922e+05 | 6.547713e+04 | 1.926382e+03 | 3.939633e+02 | 0.0 | 0.0 | 0.0 | 0.0 | |
| Region 1 | Goods | CO2 | 7.771177e+12 | 3.109620e+12 | 6.687991e+10 | 1.693830e+10 | 0.0 | 0.0 | 0.0 | 0.0 |
| Services | CO2 | 3.330504e+12 | 1.332694e+12 | 2.866282e+10 | 7.259272e+09 | 0.0 | 0.0 | 0.0 | 0.0 | |
| Region 2 | Goods | CO2 | 8.881345e+12 | 3.553851e+12 | 7.643418e+10 | 1.935806e+10 | 0.0 | 0.0 | 0.0 | 0.0 |
| Services | CO2 | 2.220336e+12 | 8.884627e+11 | 1.910855e+10 | 4.839515e+09 | 0.0 | 0.0 | 0.0 | 0.0 | |
Practical notes#
mode="flows"reads flow matrices; usemode="coefficients"only for coefficient workbooks.matrix_layoutsmust use MARIO semantic index names, such asRegion,Sector,Activity,Commodity,Factor of production,Final demand, andSatellite account.Fully empty matrices are accepted and filled with zeros; partially empty matrices are treated as invalid input.
Layout notes and caveats#
The workbook that mario.parse_from_excel(...) reads is always one Excel file with at least a data sheet and a units sheet.
In the standard layout, rows and columns use three index levels: region, level name, and level label. The units sheet lists the units for sectors or activities or commodities, factors of production, and satellite accounts used by the workbook. If the sheet names are not the defaults, use data_sheet= and unit_sheet= explicitly.
For non-standard layouts, matrix_layouts= declares the extra semantic levels carried by matrices such as V and E. When extra levels are defined for V or E, MARIO applies the same logic to VY and EY, so they do not need separate declarations.
Practical caveats:
Excel is usually the easiest custom entry point, but it is not the best format for large roundtrip workflows.
mode=is never inferred automatically: choose"flows"or"coefficients"explicitly.When you define extra indices for
EorV, provide the complete label set under those indices, not only the labels visible in one sub-block.