Aggregation#
This notebook covers both classic workbook-based aggregation and workbook-free regional aggregation through region_aggregation.
[1]:
import mario
db = mario.load_test("IOT")
INFO Parser: excel reading IOT flows from /Users/lorenzorinaldi/Documents/GitHub/MARIO/mario/test/tables/test_IOT_standard.xlsx.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: excel state ready for IOT.
INFO Metadata: initialized.
Generate an empty Excel template#
Open the generated workbook if you want to see the format MARIO expects.
[ ]:
db.get_aggregation_excel(
path="/path/to/empty_aggregation_template.xlsx",
overwrite=True,
)
Apply a filled Excel workbook#

The example workbook groups Factors of production as “Value added”, and Regions into just one “World” region. The other levels are left blank, since they don’t need to be filled if the user’s intention is to leave them untouched.
Download the packaged example workbooks#
The exact workbooks used in this example are available here:
[7]:
aggregated_db = db.aggregate(
"/path/to/aggregation_iot_filled.xlsx",
inplace=False,
ignore_nan=True, # if you didn't aggregate all labels in all sets, this will ignore empty cells.
)
WARNING nan values for the aggregation of Satellite account for following items ignored
['Employment', 'CO2']
WARNING nan values for the aggregation of Consumption category for following items ignored
['Final demand']
WARNING nan values for the aggregation of Sector for following items ignored
['Agriculture', 'Services', 'Industry']
INFO Resolver: resolving X for baseline.
INFO Resolver: trying X via formula build_iot_X_from_Z_Y.
INFO Resolver: resolved X via formula build_iot_X_from_Z_Y.
INFO Aggregation: scenario: `baseline` aggregated.
INFO Resolver: resolving z for baseline.
INFO Resolver: trying z via formula build_iot_z_from_Z_X.
INFO Resolver: resolved z via formula build_iot_z_from_Z_X.
INFO Resolver: resolving v for baseline.
INFO Resolver: trying v via formula build_iot_v_from_V_X.
INFO Resolver: resolved v via formula build_iot_v_from_V_X.
INFO Resolver: resolving e for baseline.
INFO Resolver: trying e via formula build_iot_e_from_E_X.
INFO Resolver: resolved e via formula build_iot_e_from_E_X.
Inspect the aggregated table#
Compare aggregated and non-aggregated databases
[8]:
aggregated_db
[8]:
name = IOT test (standard)
table = IOT
scenarios = ['baseline']
Factor of production = 1
Satellite account = 2
Consumption category = 1
Region = 1
Sector = 3
[9]:
db
[9]:
name = IOT test (standard)
table = IOT
scenarios = ['baseline']
Factor of production = 3
Satellite account = 2
Consumption category = 1
Region = 2
Sector = 3
Let’s also obtain the Z matrix for the aggregated database and compare it with the one of the non-aggregated database
[10]:
aggregated_db.Z
[10]:
| Region | World | ||||
|---|---|---|---|---|---|
| Level | Sector | ||||
| Item | Agriculture | Industry | Services | ||
| Region | Level | Item | |||
| World | Sector | Agriculture | 9.366075e+05 | 5.441638e+06 | 1.094425e+06 |
| Industry | 1.140438e+06 | 2.344857e+07 | 1.090801e+07 | ||
| Services | 1.913158e+06 | 1.009323e+07 | 3.141690e+07 | ||
[11]:
db.Z
[11]:
| Region | Reg1 | Reg2 | ||||||
|---|---|---|---|---|---|---|---|---|
| Level | Sector | Sector | ||||||
| Item | Agriculture | Industry | Services | Agriculture | Industry | Services | ||
| Region | Level | Item | ||||||
| Reg1 | Sector | Agriculture | 9.308651e+05 | 5.357799e+06 | 1.075278e+06 | 2363.029548 | 50607.806861 | 10341.916110 |
| Industry | 1.122397e+06 | 2.289284e+07 | 1.065096e+07 | 1413.697219 | 134910.657543 | 30367.209709 | ||
| Services | 1.893586e+06 | 9.737509e+06 | 3.059785e+07 | 1301.725200 | 27813.857216 | 58525.384254 | ||
| Reg2 | Sector | Agriculture | 4.168587e+02 | 2.596041e+03 | 8.430071e+02 | 2962.557738 | 30635.045696 | 7961.643900 |
| Industry | 6.890987e+03 | 1.335193e+05 | 6.555829e+04 | 9736.931972 | 287300.673355 | 161126.940614 | ||
| Services | 1.590090e+03 | 1.555045e+04 | 3.843707e+04 | 16680.335207 | 312355.775242 | 722089.042972 | ||
Workbook-free regional aggregation for known MRIOs#
The same aggregation method can bypass the use of an Excel template if the user wants just to aggregate the Region level. This is possible when using “MARIO-known” MRIO tables (e.g. EXIOBASE, EORA, OECD-ICIO…), thanks to an internal mapping process MARIO performs with built-in coverage rules, without asking you to fill the Region sheet manually.
The mapping is done in two steps. First, MARIO uses its packaged Country_coverage workbook to reconcile source-specific region codes with ISO3 labels. Then, for labels that still need to be resolved, it falls back to `country_converter <IndEcol/country_converter>`__, which provides the country metadata used to assign preset groups such as continents, UN regions, EU, OECD, G7, and G20.
For region_aggregation, the accepted preset strings are "continent", "UNregion", "EU", "OECD", "G7", and "G20". You can also pass an explicit mapping as a dict, pandas.Series, or pandas.DataFrame if you want full control over the target Region labels.
Below, a guided example using EXIOBASE.
[14]:
import mario
mario.set_log_verbosity("critical")
exio_db = mario.parse_exiobase(
path="/path/to/IOT_2024_ixi.zip", # or the extracted dataset directory
table="IOT",
unit="Monetary",
)
exio_db.regions # print regions
[14]:
['AT',
'BE',
'BG',
'CY',
'CZ',
'DE',
'DK',
'EE',
'ES',
'FI',
'FR',
'GR',
'HR',
'HU',
'IE',
'IT',
'LT',
'LU',
'LV',
'MT',
'NL',
'PL',
'PT',
'RO',
'SE',
'SI',
'SK',
'GB',
'US',
'JP',
'CN',
'CA',
'KR',
'BR',
'IN',
'MX',
'RU',
'AU',
'CH',
'TR',
'TW',
'NO',
'ID',
'ZA',
'WA',
'WL',
'WE',
'WF',
'WM']
[15]:
exio_by_continent = exio_db.aggregate(
io=None,
levels="Region",
region_aggregation="continent",
inplace=False,
)
exio_by_continent.regions
[15]:
['America', 'Asia', 'Oceania', 'Africa', 'Europe']
Another example below, using OECD-ICIO tables, aggregating "G20" countries
[26]:
oecd_db = mario.parse_oecd(
path="/path/to/OECD_ICIO/folder_or_csv",
)
oecd_db
[26]:
name = OECD ICIO 2022 regular
table = IOT
scenarios = ['baseline']
Factor of production = 2
Satellite account = 1
Consumption category = 6
Region = 81
Sector = 50
[27]:
oecd_db.regions
[27]:
['AGO',
'ARE',
'ARG',
'AUS',
'AUT',
'BEL',
'BGD',
'BGR',
'BLR',
'BRA',
'BRN',
'CAN',
'CHE',
'CHL',
'CHN',
'CIV',
'CMR',
'COD',
'COL',
'CRI',
'CYP',
'CZE',
'DEU',
'DNK',
'EGY',
'ESP',
'EST',
'FIN',
'FRA',
'GBR',
'GRC',
'HKG',
'HRV',
'HUN',
'IDN',
'IND',
'IRL',
'ISL',
'ISR',
'ITA',
'JOR',
'JPN',
'KAZ',
'KHM',
'KOR',
'LAO',
'LTU',
'LUX',
'LVA',
'MAR',
'MEX',
'MLT',
'MMR',
'MYS',
'NGA',
'NLD',
'NOR',
'NZL',
'PAK',
'PER',
'PHL',
'POL',
'PRT',
'ROU',
'RUS',
'SAU',
'SEN',
'SGP',
'STP',
'SVK',
'SVN',
'SWE',
'THA',
'TUN',
'TUR',
'TWN',
'UKR',
'USA',
'VNM',
'ZAF',
'ROW']
[28]:
oecd_by_g20 = oecd_db.aggregate(
io = None,
levels="Region",
region_aggregation="G20",
inplace=False,
)
oecd_by_g20
[28]:
name = OECD ICIO 2022 regular
table = IOT
scenarios = ['baseline']
Factor of production = 2
Satellite account = 1
Consumption category = 6
Region = 39
Sector = 50
[29]:
oecd_by_g20.regions
[29]:
['G20',
'SGP',
'ARE',
'THA',
'COL',
'NZL',
'COD',
'PER',
'KAZ',
'CHL',
'STP',
'NGA',
'BGD',
'TWN',
'ROW',
'ISR',
'UKR',
'EGY',
'CIV',
'HKG',
'ISL',
'KHM',
'BRN',
'CHE',
'CRI',
'JOR',
'LAO',
'NOR',
'VNM',
'MMR',
'PAK',
'MAR',
'SEN',
'TUN',
'BLR',
'AGO',
'CMR',
'PHL',
'MYS']