Greenhouse gas calculations#
This notebook shows an advanced calc_ghg(...) workflow in MARIO. It starts by inspecting the built-in GHG profile registry, shows how to check which gases are covered for each supported database profile, and then compares harmonized GHG results across EXIOBASE, EORA26, and ADB.
For built-in profiles, calc_ghg(...) also accepts time_horizon and ipcc_report to resolve profile-specific GWP values when multiple IPCC variants are available.
The workflow is organized in four steps:
inspect the preconfigured
calc_ghgprofiles and verify which gases are covered by the profiles used in this comparisoncompute regional
PBAandCBAtotals for the three databasesharmonize units and conclude with a cross-database chart for selected countries
Note (beta): This workflow uses a beta GHG aggregation method. It will be updated soon to support additional
time_horizonoptions and broader satellite coverage across all environmentally extended databases currently supported by MARIO parsing.
1. Inspect built-in calc_ghg profiles#
Database.calc_ghg(...) can use a built-in profile or a profile passed explicitly with profile=. The available profiles currently are exiobase_monetary, exiobase_hybrid, eora, adb, gloria, emerging.
In the built-in registry, some gases are stored as scalar factors and some as nested mappings. In practice, CO2 and total GHG remain scalar (1), while CH4 and N2O can expose a structure like {time_horizon: {ipcc_report: factor}}.
This means that time_horizon and ipcc_report are only relevant when calc_ghg(...) uses a built-in profile. If you pass a custom flat gwp={...} mapping, the behavior stays unchanged.
[2]:
import mario
profile = 'exiobase_monetary'
mario.ops.ghg.GHG_PROFILES[profile]
[2]:
{'match': ('exiobase', 'monetary'),
'unit': 'kg CO2eq',
'gwp': {'CO2 - combustion - air': 1,
'CH4 - combustion - air': {100: {'AR6': 29.8, 'AR5': 30, 'AR4': 25}},
'N2O - combustion - air': {100: {'AR6': 273, 'AR5': 265, 'AR4': 298}}}}
2. Configure the comparison inputs#
The next cell defines the three databases used in the comparison and the helper functions that keep the workflow aligned across sources.
Before running it, update shared_folder so it points to your local folder with the database files used in this workflow.
In particular, it does three things:
defines the file locations, versions, and years to import
reshapes each database into a common
PBAandCBAoutput layoutreconciles source-specific region labels to ISO3 codes with the packaged country coverage workbook
Notes#
Replace
YOUR_PATH_TO_INPUT_OUTPUT_DATABASESwith the local parent folder that containsEXIOBASE,EORA, andADB.{version},{table}, and{year}are filled automatically from thepropertiesdictionary.Folder and file names should match exactly, including spaces in
Air emissionsand in the ADB workbook names.
[ ]:
import pandas as pd
mario.set_log_verbosity('critical')
from mario.clusters.coverage import load_concordance
ghgs = pd.DataFrame()
shared_folder = 'YOUR_PATH_TO_INPUT_OUTPUT_DATABASES'
properties = {
'exiobase_monetary': {
'path': f'{shared_folder}/EXIOBASE/{{version}}/{{table}}_{{year}}_ixi.zip',
'name': 'EXIOBASE', 'table': 'IOT', 'versions': ['3.10.2'], 'years': [2010,2015,2018,2021],
},
'eora': {
'path': f'{shared_folder}/EORA/EORA1/Eora26/Eora26_{{year}}_bp',
'name': 'EORA26','table': 'IOT','versions': ['199.82'],'years': [2010,2015],
},
'adb': {
'path': f'{shared_folder}/ADB/{{version}}',
'path_satellites': f'{shared_folder}/ADB/Air emissions/{{year}} EE-MRIOT (Air Emissions).xlsx',
'name': 'ADB','table': 'IOT','versions': ['62 economies'],'years': [2018,2021,2023],
},
}
CONCORDANCE = load_concordance()
SOURCE_CONCORDANCE_HINTS = {
'adb': 'ADB',
'eora': 'EORA1',
'exiobase': 'EXIOBASE',
}
ISO3_VALUES = {
str(value).strip().upper() for value in CONCORDANCE['ISO3'].dropna()
}
def ghg_rearrangement(db, name, version, table, year):
df_pba = db.E.loc["GHG", :].T.groupby(level=0).sum() + db.EY.loc["GHG", :].T.groupby(level=0).sum()
df_cba = db.F.loc["GHG", :].T.groupby(level=0).sum() + db.EY.loc["GHG", :].T.groupby(level=0).sum()
df_pba = df_pba.to_frame()
df_cba = df_cba.to_frame()
unit = db.units['Satellite account'].loc['GHG','unit']
df_pba.columns = pd.MultiIndex.from_tuples([(name, version, table, year, "PBA", unit)], names=['database', 'version', 'table', 'year', 'accounting', 'unit'])
df_cba.columns = pd.MultiIndex.from_tuples([(name, version, table, year, "CBA", unit)], names=['database', 'version', 'table', 'year', 'accounting', 'unit'])
df = pd.concat([df_pba, df_cba], axis=1)
return df
def reconcile_to_iso3(df, source):
source_value = str(source).strip()
source_column = next(
(
column
for token, column in SOURCE_CONCORDANCE_HINTS.items()
if token in source_value.casefold()
),
source_value,
)
if source_column not in CONCORDANCE.columns:
raise KeyError(f"No concordance column found for source '{source}'")
lookup = {}
for iso3, source_code in CONCORDANCE[['ISO3', source_column]].dropna().itertuples(index=False):
lookup.setdefault(str(source_code).strip().casefold(), str(iso3).strip().upper())
iso3_index = []
for label in df.index:
label_text = str(label).strip()
iso3 = lookup.get(label_text.casefold())
if iso3 is None and label_text.upper() in ISO3_VALUES:
iso3 = label_text.upper()
iso3_index.append(iso3 if iso3 is not None else label_text)
df = df.copy()
df.index = iso3_index
df.index.name = 'Region'
return df
3. Parse EXIOBASE and compute GHG#
EXIOBASE has a dedicated exiobase_monetary profile. The loop below parses each selected year, computes the aggregate GHG row with calc_ghg(profile='exiobase_monetary', time_horizon=100, ipcc_report='AR6'), and stores harmonized regional totals in the comparison table.
For this profile, time_horizon and ipcc_report affect the built-in CH4 and N2O factors, while CO2 remains equal to 1.
[3]:
p = properties['exiobase_monetary']
name, table, path = p['name'], p['table'], p['path']
for version in p['versions']:
for year in p['years']:
print(f'=== {name} {version} {table} ixi {year} ===')
db = mario.parse_exiobase(
path=path.format(version=version, year=year, table=table),
table=table,
unit='Monetary',
)
db.calc_ghg(profile='exiobase_monetary', time_horizon=100, ipcc_report='AR6')
ghgs_df = ghg_rearrangement(db, name, version, table, year)
ghgs_df = reconcile_to_iso3(ghgs_df, name)
ghgs = pd.concat([ghgs, ghgs_df], axis=1)
=== EXIOBASE 3.10.2 IOT ixi 2010 ===
=== EXIOBASE 3.10.2 IOT ixi 2015 ===
=== EXIOBASE 3.10.2 IOT ixi 2018 ===
=== EXIOBASE 3.10.2 IOT ixi 2021 ===
4. Parse EORA26 and compute GHG#
EORA26 uses a broader profile with many sub-source rows. The same pattern applies here: parse the database, run calc_ghg(profile='eora', time_horizon=100, ipcc_report='AR6'), reconcile regions to ISO3, and append the result to the common comparison table.
[4]:
p = properties['eora']
name, table, path = p['name'], p['table'], p['path']
for version in p['versions']:
for year in p['years']:
print(f'=== {name} {version} {table} ixi {year} ===')
db = mario.parse_eora(
path=path.format(year=year),
multi_region=True,
)
db.calc_ghg(profile='eora', time_horizon=100, ipcc_report='AR6')
ghgs_df = ghg_rearrangement(db, name, version, table, year)
ghgs_df = reconcile_to_iso3(ghgs_df, name)
ghgs = pd.concat([ghgs, ghgs_df], axis=1)
=== EORA26 199.82 IOT ixi 2010 ===
=== EORA26 199.82 IOT ixi 2015 ===
5. Parse ADB and compute GHG#
ADB already exposes a total GHG extension profile. After parsing the database and satellite workbook, the loop below computes GHG, harmonizes region labels, and appends the result to the same table used for EXIOBASE and EORA.
Here time_horizon and ipcc_report are not needed in practice, because the built-in profile already points to a total GHG row with scalar factor 1.
[5]:
p = properties['adb']
name, table, path = p['name'], p['table'], p['path']
for version in p['versions']:
for year in p['years']:
print(f'=== {name} {version} {table} ixi {year} ===')
db = mario.parse_adb(
path=path.format(version=version),
year=year,
add_extensions=p['path_satellites'].format(year=year),
)
db.calc_ghg(profile='adb')
ghgs_df = ghg_rearrangement(db, name, version, table, year)
ghgs_df = reconcile_to_iso3(ghgs_df, name)
ghgs = pd.concat([ghgs, ghgs_df], axis=1)
=== ADB 62 economies IOT ixi 2018 ===
=== ADB 62 economies IOT ixi 2021 ===
=== ADB 62 economies IOT ixi 2023 ===
6. Spot-check one harmonized country#
Before reshaping and plotting, it is useful to inspect one row after all three databases have been appended. The cell below checks the full USA record after profile application and ISO3 reconciliation.
[6]:
ghgs.loc['USA',:]
[6]:
database version table year accounting unit
EXIOBASE 3.10.2 IOT 2010 PBA kg 5.513016e+12
CBA kg 5.979904e+12
2015 PBA kg 5.086859e+12
CBA kg 5.382269e+12
2018 PBA kg 5.076462e+12
CBA kg 5.406201e+12
2021 PBA kg 4.662891e+12
CBA kg 5.250987e+12
EORA26 199.82 IOT 2010 PBA Gg 7.591209e+06
CBA Gg 8.594610e+06
2015 PBA Gg 7.591209e+06
CBA Gg 8.327947e+06
ADB 62 economies IOT 2018 PBA Gigagrams of Carbon Dioxide equivalent (Gg of CO2e) 6.290197e+06
CBA Gigagrams of Carbon Dioxide equivalent (Gg of CO2e) 6.608713e+06
2021 PBA Gigagrams of Carbon Dioxide equivalent (Gg of CO2e) 6.022567e+06
CBA Gigagrams of Carbon Dioxide equivalent (Gg of CO2e) 6.509001e+06
2023 PBA Gigagrams of Carbon Dioxide equivalent (Gg of CO2e) 5.968822e+06
CBA Gigagrams of Carbon Dioxide equivalent (Gg of CO2e) 6.364799e+06
Name: USA, dtype: float64
7. Harmonize units and prepare a tidy table#
calc_ghg preserves the native unit of each profile, so the three sources are not directly comparable yet. The next cell reshapes the aggregated output to long format and converts everything to Mt CO2.
[7]:
df = ghgs.unstack().to_frame()
df.columns = ['value']
df.reset_index(inplace=True)
factors = pd.Series(index=df.index, dtype=float)
factors[df['unit'].eq('kg')] = 1e-9
factors[df['unit'].str.contains('Gg', na=False)] = 1e-3
if factors.isna().any():
raise ValueError(f"Unsupported units found: {df.loc[factors.isna(), 'unit'].unique().tolist()}")
df = df.copy()
df['value'] = df['value'] * factors
df['unit'] = 'Mt CO2'
df
[7]:
| database | version | table | year | accounting | unit | Region | value | |
|---|---|---|---|---|---|---|---|---|
| 0 | EXIOBASE | 3.10.2 | IOT | 2010 | PBA | Mt CO2 | AUT | 67.120733 |
| 1 | EXIOBASE | 3.10.2 | IOT | 2010 | PBA | Mt CO2 | AUS | 406.737233 |
| 2 | EXIOBASE | 3.10.2 | IOT | 2010 | PBA | Mt CO2 | BEL | 108.324240 |
| 3 | EXIOBASE | 3.10.2 | IOT | 2010 | PBA | Mt CO2 | BGR | 43.529051 |
| 4 | EXIOBASE | 3.10.2 | IOT | 2010 | PBA | Mt CO2 | BRA | 407.413978 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3505 | ADB | 62 economies | IOT | 2023 | CBA | Mt CO2 | WSM | NaN |
| 3506 | ADB | 62 economies | IOT | 2023 | CBA | Mt CO2 | YEM | NaN |
| 3507 | ADB | 62 economies | IOT | 2023 | CBA | Mt CO2 | ZMB | NaN |
| 3508 | ADB | 62 economies | IOT | 2023 | CBA | Mt CO2 | ZWE | NaN |
| 3509 | ADB | 62 economies | IOT | 2023 | CBA | Mt CO2 | RoW | 6672.575371 |
3510 rows × 8 columns
8. Compare the three databases visually#
The final chart keeps PBA and CBA side by side and facets by country. This makes it easier to compare both the scale of emissions and the accounting convention differences across EXIOBASE, EORA26, and ADB.
[8]:
import math
import plotly.express as px
plot_df = df.dropna(subset=['value']).copy()
plot_df['database_version'] = plot_df['database'].astype(str) + ' | ' + plot_df['version'].astype(str)
selected_regions = ["ITA", "DEU", "FRA", "CHN", "USA", "BRA", "RUS", "AUS"] # use sorted(plot_df['Region'].dropna().unique()) for all regions
available_regions = set(plot_df['Region'].dropna().astype(str))
regions = [region for region in selected_regions if region in available_regions]
if not regions:
raise ValueError(f"No data found for selected regions: {selected_regions}")
plot_df = plot_df[plot_df['Region'].isin(regions)].copy()
years = sorted(plot_df['year'].dropna().unique())
n_cols = min(4, len(regions))
n_rows = math.ceil(len(regions) / n_cols)
fig = px.bar(
plot_df.sort_values(['Region', 'year', 'database', 'version', 'accounting']),
x='year',
y='value',
color='database_version',
pattern_shape='accounting',
barmode='group',
facet_col='Region',
facet_col_wrap=n_cols,
category_orders={
'Region': regions,
'year': years,
'database_version': sorted(plot_df['database_version'].unique()),
},
hover_data=['database', 'version', 'accounting', 'unit'],
labels={
'year': '',
'value': '',
'database_version': 'Database + version',
'accounting': 'Accounting',
},
title='PBA and CBA GHG emissions for selected countries in different MRIO databases (MtonCO2eq)',
)
fig.for_each_annotation(lambda a: a.update(text=a.text.split('=')[-1]))
fig.update_xaxes(type='category')
fig.update_yaxes(matches=None, showticklabels=True)
fig.update_layout(
template='plotly_white',
width=1400,
height=max(400, 220 * n_rows),
legend_title_text='Database + version',
)
html_path = 'ghg_clustered_by_region.html'
fig.write_html(html_path)
fig.show()