Aggregation#

This notebook covers both classic workbook-based aggregation and workbook-free regional aggregation through region_aggregation.

[1]:

import mario
db = mario.load_test("IOT")

INFO Parser: excel reading IOT flows from /Users/lorenzorinaldi/Documents/GitHub/MARIO/mario/test/tables/test_IOT_standard.xlsx.
INFO Parser: state payload ready with 6 canonical blocks.
INFO Parser: excel state ready for IOT.
INFO Metadata: initialized.

Generate an empty Excel template#

Open the generated workbook if you want to see the format MARIO expects.

[ ]:

db.get_aggregation_excel(
    path="/path/to/empty_aggregation_template.xlsx",
    overwrite=True,
)

Apply a filled Excel workbook#

The example workbook groups Factors of production as “Value added”, and Regions into just one “World” region. The other levels are left blank, since they don’t need to be filled if the user’s intention is to leave them untouched.

Download the packaged example workbooks#

The exact workbooks used in this example are available here:

[7]:

aggregated_db = db.aggregate(
    "/path/to/aggregation_iot_filled.xlsx",
    inplace=False,
    ignore_nan=True, # if you didn't aggregate all labels in all sets, this will ignore empty cells.
)

WARNING nan values for the aggregation of Satellite account for following items ignored
['Employment', 'CO2']
WARNING nan values for the aggregation of Consumption category for following items ignored
['Final demand']
WARNING nan values for the aggregation of Sector for following items ignored
['Agriculture', 'Services', 'Industry']
INFO Resolver: resolving X for baseline.
INFO Resolver: trying X via formula build_iot_X_from_Z_Y.
INFO Resolver: resolved X via formula build_iot_X_from_Z_Y.
INFO Aggregation: scenario: `baseline` aggregated.
INFO Resolver: resolving z for baseline.
INFO Resolver: trying z via formula build_iot_z_from_Z_X.
INFO Resolver: resolved z via formula build_iot_z_from_Z_X.
INFO Resolver: resolving v for baseline.
INFO Resolver: trying v via formula build_iot_v_from_V_X.
INFO Resolver: resolved v via formula build_iot_v_from_V_X.
INFO Resolver: resolving e for baseline.
INFO Resolver: trying e via formula build_iot_e_from_E_X.
INFO Resolver: resolved e via formula build_iot_e_from_E_X.

Inspect the aggregated table#

Compare aggregated and non-aggregated databases

[8]:

aggregated_db

[8]:

name = IOT test (standard)
table = IOT
scenarios = ['baseline']
Factor of production = 1
Satellite account = 2
Consumption category = 1
Region = 1
Sector = 3

[9]:

db

[9]:

name = IOT test (standard)
table = IOT
scenarios = ['baseline']
Factor of production = 3
Satellite account = 2
Consumption category = 1
Region = 2
Sector = 3

Let’s also obtain the Z matrix for the aggregated database and compare it with the one of the non-aggregated database

[10]:

aggregated_db.Z

[10]:

		Region	World
		Level	Sector
		Item	Agriculture	Industry	Services
Region	Level	Item
World	Sector	Agriculture	9.366075e+05	5.441638e+06	1.094425e+06
		Industry	1.140438e+06	2.344857e+07	1.090801e+07
		Services	1.913158e+06	1.009323e+07	3.141690e+07

[11]:

db.Z

[11]:

		Region	Reg1			Reg2
		Level	Sector			Sector
		Item	Agriculture	Industry	Services	Agriculture	Industry	Services
Region	Level	Item
Reg1	Sector	Agriculture	9.308651e+05	5.357799e+06	1.075278e+06	2363.029548	50607.806861	10341.916110
		Industry	1.122397e+06	2.289284e+07	1.065096e+07	1413.697219	134910.657543	30367.209709
		Services	1.893586e+06	9.737509e+06	3.059785e+07	1301.725200	27813.857216	58525.384254
Reg2	Sector	Agriculture	4.168587e+02	2.596041e+03	8.430071e+02	2962.557738	30635.045696	7961.643900
		Industry	6.890987e+03	1.335193e+05	6.555829e+04	9736.931972	287300.673355	161126.940614
		Services	1.590090e+03	1.555045e+04	3.843707e+04	16680.335207	312355.775242	722089.042972

Workbook-free regional aggregation for known MRIOs#

The same aggregation method can bypass the use of an Excel template if the user wants just to aggregate the Region level. This is possible when using “MARIO-known” MRIO tables (e.g. EXIOBASE, EORA, OECD-ICIO…), thanks to an internal mapping process MARIO performs with built-in coverage rules, without asking you to fill the Region sheet manually.

The mapping is done in two steps. First, MARIO uses its packaged Country_coverage workbook to reconcile source-specific region codes with ISO3 labels. Then, for labels that still need to be resolved, it falls back to `country_converter <IndEcol/country_converter>`__, which provides the country metadata used to assign preset groups such as continents, UN regions, EU, OECD, G7, and G20.

For region_aggregation, the accepted preset strings are "continent", "UNregion", "EU", "OECD", "G7", and "G20". You can also pass an explicit mapping as a dict, pandas.Series, or pandas.DataFrame if you want full control over the target Region labels.

Below, a guided example using EXIOBASE.

[14]:

import mario

mario.set_log_verbosity("critical")

exio_db = mario.parse_exiobase(
    path="/path/to/IOT_2024_ixi.zip",  # or the extracted dataset directory
    table="IOT",
    unit="Monetary",
)

exio_db.regions  # print regions

[14]:

['AT',
 'BE',
 'BG',
 'CY',
 'CZ',
 'DE',
 'DK',
 'EE',
 'ES',
 'FI',
 'FR',
 'GR',
 'HR',
 'HU',
 'IE',
 'IT',
 'LT',
 'LU',
 'LV',
 'MT',
 'NL',
 'PL',
 'PT',
 'RO',
 'SE',
 'SI',
 'SK',
 'GB',
 'US',
 'JP',
 'CN',
 'CA',
 'KR',
 'BR',
 'IN',
 'MX',
 'RU',
 'AU',
 'CH',
 'TR',
 'TW',
 'NO',
 'ID',
 'ZA',
 'WA',
 'WL',
 'WE',
 'WF',
 'WM']

[15]:

exio_by_continent = exio_db.aggregate(
    io=None,
    levels="Region",
    region_aggregation="continent",
    inplace=False,
)

exio_by_continent.regions

[15]:

['America', 'Asia', 'Oceania', 'Africa', 'Europe']

Another example below, using OECD-ICIO tables, aggregating "G20" countries

[26]:

oecd_db = mario.parse_oecd(
    path="/path/to/OECD_ICIO/folder_or_csv",
)

oecd_db

[26]:

name = OECD ICIO 2022 regular
table = IOT
scenarios = ['baseline']
Factor of production = 2
Satellite account = 1
Consumption category = 6
Region = 81
Sector = 50

[27]:

oecd_db.regions

[27]:

['AGO',
 'ARE',
 'ARG',
 'AUS',
 'AUT',
 'BEL',
 'BGD',
 'BGR',
 'BLR',
 'BRA',
 'BRN',
 'CAN',
 'CHE',
 'CHL',
 'CHN',
 'CIV',
 'CMR',
 'COD',
 'COL',
 'CRI',
 'CYP',
 'CZE',
 'DEU',
 'DNK',
 'EGY',
 'ESP',
 'EST',
 'FIN',
 'FRA',
 'GBR',
 'GRC',
 'HKG',
 'HRV',
 'HUN',
 'IDN',
 'IND',
 'IRL',
 'ISL',
 'ISR',
 'ITA',
 'JOR',
 'JPN',
 'KAZ',
 'KHM',
 'KOR',
 'LAO',
 'LTU',
 'LUX',
 'LVA',
 'MAR',
 'MEX',
 'MLT',
 'MMR',
 'MYS',
 'NGA',
 'NLD',
 'NOR',
 'NZL',
 'PAK',
 'PER',
 'PHL',
 'POL',
 'PRT',
 'ROU',
 'RUS',
 'SAU',
 'SEN',
 'SGP',
 'STP',
 'SVK',
 'SVN',
 'SWE',
 'THA',
 'TUN',
 'TUR',
 'TWN',
 'UKR',
 'USA',
 'VNM',
 'ZAF',
 'ROW']

[28]:

oecd_by_g20 = oecd_db.aggregate(
    io = None,
    levels="Region",
    region_aggregation="G20",
    inplace=False,
)

oecd_by_g20

[28]:

name = OECD ICIO 2022 regular
table = IOT
scenarios = ['baseline']
Factor of production = 2
Satellite account = 1
Consumption category = 6
Region = 39
Sector = 50

[29]:

oecd_by_g20.regions

[29]:

['G20',
 'SGP',
 'ARE',
 'THA',
 'COL',
 'NZL',
 'COD',
 'PER',
 'KAZ',
 'CHL',
 'STP',
 'NGA',
 'BGD',
 'TWN',
 'ROW',
 'ISR',
 'UKR',
 'EGY',
 'CIV',
 'HKG',
 'ISL',
 'KHM',
 'BRN',
 'CHE',
 'CRI',
 'JOR',
 'LAO',
 'NOR',
 'VNM',
 'MMR',
 'PAK',
 'MAR',
 'SEN',
 'TUN',
 'BLR',
 'AGO',
 'CMR',
 'PHL',
 'MYS']

Download this notebook