StatCan parser walkthrough#
This notebook is the practical guide for parsing Statistics Canada supply-use and symmetric input-output tables in MARIO.
What this notebook covers#
where the StatCan tables come from;
when direct online parsing is enough and when a local cache is useful;
how SUT and IOT workflows differ;
how year=, level=, geo=, and valuation= are used;
what to expect from download=True;
which caveats matter for this API-driven parser.
Relevant source pages#
Official StatCan SUT catalogue: Supply and use tables
Official StatCan IOT catalogue: Symmetric input-output tables
StatCan WDS documentation: WDS user guide
MARIO can query the official WDS API directly, so there is no need to download the raw .csv bundles manually unless you want a local cache.
Main entry point#
For normal user workflows, the public entry point is:
mario.parse_statcan(...)
The same function supports:
SUT tables;
IOT tables;
summary, detail, and link1997 levels where available;
direct online parsing or local cached raw files.
Key arguments#
The key public arguments are:
year: reference year to parse;
table: choose "SUT" or "IOT";
level: choose summary, detail, or link1997 when supported;
geo: geography label such as Canada or one province/territory;
valuation: only for IOT, usually basic or purchaser;
path: optional directory for locally cached raw files;
download: when True, MARIO downloads the raw WDS bundle into path and then parses it locally.
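As a rough illustration of how these arguments fit together, the sketch below assembles a keyword dictionary for mario.parse_statcan. The helper name build_statcan_kwargs is hypothetical and not part of MARIO. It only encodes the rules listed above: valuation is passed for IOT only, and geo, path, and download are left out unless they are set.

```python
def build_statcan_kwargs(year, table, level, geo=None,
                         valuation=None, path=None, download=False):
    """Hypothetical helper (not part of MARIO): collect parse_statcan arguments."""
    if table not in ("SUT", "IOT"):
        raise ValueError(f"table must be 'SUT' or 'IOT', got {table!r}")
    if valuation is not None and table != "IOT":
        raise ValueError("valuation applies to IOT tables only")

    kwargs = {"year": year, "table": table, "level": level}
    # Optional arguments are added only when the caller sets them.
    if geo is not None:
        kwargs["geo"] = geo
    if valuation is not None:
        kwargs["valuation"] = valuation
    if path is not None:
        kwargs["path"] = path
    if download:
        kwargs["download"] = True
    return kwargs


kwargs = build_statcan_kwargs(2022, "IOT", level="detail", valuation="basic")
print(kwargs)

# With MARIO installed and network access, the dictionary can be unpacked:
# import mario
# db = mario.parse_statcan(**kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to log or store the exact selection next to the parsed database.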
SUT versus IOT#
Use table="SUT" when you want the native split structure with S, U, Yc, Ya, Va, and Vc.
Use table="IOT" when you want the symmetric input-output representation with Z, Y, and V.
For SUT, level="link1997" is also available. For IOT, valuation= matters because the tables are exposed at both basic and purchaser prices.
Direct online parsing versus local cache#
Use direct online parsing when you just want the database and do not need the raw files afterwards.
Use download=True together with path=... when you want MARIO to keep the raw WDS .csv bundle locally. This is the most useful option when you parse the same StatCan table repeatedly or want a reproducible local cache.
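One way to choose between the two modes is to look at the cache folder first. The sketch below is an assumption-laden illustration, not MARIO behaviour: the helper names are hypothetical, any .csv file under path is treated as a usable cache (it does not check which table the file belongs to), and it assumes that passing path without download=True makes the parser read the local files.

```python
from pathlib import Path
import tempfile


def has_cached_csv(path):
    """Return True if `path` is a directory holding at least one .csv file."""
    folder = Path(path)
    return folder.is_dir() and any(folder.rglob("*.csv"))


def cache_kwargs(path):
    """Hypothetical helper: request a download only when no .csv is cached yet."""
    return {"path": str(path), "download": not has_cached_csv(path)}


# Offline demonstration with a temporary folder instead of a real cache.
with tempfile.TemporaryDirectory() as tmp:
    empty = cache_kwargs(tmp)                      # nothing cached yet
    (Path(tmp) / "cache").mkdir()
    (Path(tmp) / "cache" / "example.csv").write_text("placeholder\n")
    filled = cache_kwargs(tmp)                     # a .csv is now present

print(empty["download"], filled["download"])
```

The returned dictionary can be merged into the other parse_statcan arguments, for example `mario.parse_statcan(year=2022, table="SUT", level="detail", geo="Quebec", **cache_kwargs("/path/to/StatCan"))`.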
Cache layout and supported values#
path is optional and acts as a local WDS cache:
StatCan/
└── cache/
    ├── 36-10-0438-01_*.csv
    └── 36-10-0001-01_*.csv
Supported levels:
SUT: summary, detail, link1997;
IOT: summary, detail.
For IOT, valuation can be basic or purchaser.
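The supported combinations above can be written down as a small lookup, which is handy for catching a typo before any network request is made. This is a minimal sketch: check_statcan_choice is a hypothetical helper, not a MARIO function, and the tables below only restate the values listed in this section.

```python
SUPPORTED_LEVELS = {
    "SUT": ("summary", "detail", "link1997"),
    "IOT": ("summary", "detail"),
}
IOT_VALUATIONS = ("basic", "purchaser")


def check_statcan_choice(table, level, valuation=None):
    """Hypothetical pre-flight check; raises ValueError on an unsupported choice."""
    if table not in SUPPORTED_LEVELS:
        raise ValueError(f"unknown table {table!r}")
    if level not in SUPPORTED_LEVELS[table]:
        raise ValueError(f"level {level!r} is not listed for {table}")
    if table == "IOT" and valuation is not None and valuation not in IOT_VALUATIONS:
        raise ValueError(f"valuation {valuation!r} is not listed for IOT")
    return True


check_statcan_choice("SUT", "link1997")             # accepted
check_statcan_choice("IOT", "summary", "purchaser")  # accepted
try:
    check_statcan_choice("IOT", "link1997")          # link1997 is SUT-only
except ValueError as error:
    print(error)
```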
[1]:
import mario
Parse a Canada SUT directly from WDS#
This is the simplest StatCan workflow.
[2]:
db = mario.parse_statcan(
    year=2022,
    table="SUT",
    level="summary",
    geo="Canada",
)
db
INFO Parser: requesting StatCan WDS metadata for table 36100438.
INFO Parser: downloading StatCan table 36100438 from https://www150.statcan.gc.ca/n1/tbl/csv/36100438-eng.zip.
INFO Parser: StatCan SUT payload ready with shapes S=(32, 63), U=(63, 32), Yc=(63, 18), Va=(9, 32), Vc=(9, 63).
INFO Metadata: initialized.
[2]:
name = StatCan SUT summary Canada 2022
table = SUT
tech_assumption = industry-based
scenarios = ['baseline']
Activity = 32
Commodity = 63
Factor of production = 9
Satellite account = 1
Consumption category = 18
Region = 1
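The shapes in the log line can be cross-checked against the counts in the summary. The sketch below copies the shapes from the log by hand and tests that they agree with each other. The reading of the axes (S as activities by commodities, U as commodities by activities, and so on) is inferred from the printed counts rather than taken from MARIO documentation, and check_sut_shapes is a hypothetical helper.

```python
def check_sut_shapes(shapes):
    """Return the names of blocks whose shape disagrees with S (empty = consistent)."""
    n_activities, n_commodities = shapes["S"]
    n_factors = shapes["Va"][0]
    expected = {
        "U": (n_commodities, n_activities),
        "Va": (n_factors, n_activities),
        "Vc": (n_factors, n_commodities),
    }
    problems = [name for name, shape in expected.items() if shapes[name] != shape]
    # Yc rows are commodities; its column count is the number of consumption categories.
    if shapes["Yc"][0] != n_commodities:
        problems.append("Yc")
    return problems


# Shapes copied from the log output of the Canada summary SUT for 2022.
summary_2022 = {"S": (32, 63), "U": (63, 32), "Yc": (63, 18),
                "Va": (9, 32), "Vc": (9, 63)}
print(check_sut_shapes(summary_2022))
```

An empty list means the blocks line up with 32 activities, 63 commodities, and 9 factors of production, matching the printed summary.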
Parse a provincial SUT#
The SUT tables expose provinces and territories as separate geographies. The parser expects the published StatCan geography label.
[3]:
db = mario.parse_statcan(
    year=2022,
    table="SUT",
    level="detail",
    geo="Ontario",
)
db
INFO Parser: requesting StatCan WDS metadata for table 36100478.
INFO Parser: downloading StatCan table 36100478 from https://www150.statcan.gc.ca/n1/tbl/csv/36100478-eng.zip.
INFO Parser: StatCan SUT payload ready with shapes S=(216, 473), U=(473, 216), Yc=(473, 289), Va=(9, 216), Vc=(9, 473).
INFO Metadata: initialized.
[3]:
name = StatCan SUT detail Ontario 2022
table = SUT
tech_assumption = industry-based
scenarios = ['baseline']
Activity = 216
Commodity = 473
Factor of production = 9
Satellite account = 1
Consumption category = 289
Region = 1
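Because each province or territory is a separate geography, parsing several of them amounts to one call per label. A minimal sketch follows. parse_geographies is a hypothetical helper, and the parser is passed in as an argument so the example runs offline with a stand-in function; with MARIO installed, mario.parse_statcan would take its place.

```python
def parse_geographies(geos, parser, **common):
    """Call `parser` once per geography label and collect the results by label."""
    results = {}
    for geo in geos:
        results[geo] = parser(geo=geo, **common)
    return results


def fake_parser(**kwargs):
    """Stand-in for mario.parse_statcan so the sketch runs without network access."""
    return f"{kwargs['table']} {kwargs['level']} {kwargs['geo']} {kwargs['year']}"


dbs = parse_geographies(
    ["Ontario", "Quebec"],
    fake_parser,          # replace with mario.parse_statcan for real data
    year=2022,
    table="SUT",
    level="detail",
)
print(dbs["Ontario"])
print(dbs["Quebec"])
```

With the real parser, every call goes through WDS unless a local cache is configured, so combining the loop with path= and download=True may be worthwhile.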
Parse an IOT at basic prices#
For IOT, pass valuation="basic" or valuation="purchaser". The call below omits geo=, and the resulting database is labelled Canada.
[4]:
db = mario.parse_statcan(
    year=2022,
    table="IOT",
    level="detail",
    valuation="basic",
)
db
INFO Parser: requesting StatCan WDS metadata for table 36100001.
INFO Parser: downloading StatCan table 36100001 from https://www150.statcan.gc.ca/n1/tbl/csv/36100001-eng.zip.
INFO Parser: StatCan IOT payload ready with shapes Z=(252, 252), Y=(252, 306), V=(8, 252).
INFO Metadata: initialized.
[4]:
name = StatCan IOT detail Canada 2022 Basic price
table = IOT
scenarios = ['baseline']
Factor of production = 8
Satellite account = 1
Consumption category = 306
Region = 1
Sector = 252
Parse an IOT at purchaser prices#
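The call has the same form as for basic prices; only the valuation selector changes. This example also switches to level="summary", so the dimensions differ from the previous cell.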
[5]:
db = mario.parse_statcan(
    year=2022,
    table="IOT",
    level="summary",
    valuation="purchaser",
)
db
INFO Parser: requesting StatCan WDS metadata for table 36100084.
INFO Parser: downloading StatCan table 36100084 from https://www150.statcan.gc.ca/n1/tbl/csv/36100084-eng.zip.
INFO Parser: StatCan IOT payload ready with shapes Z=(32, 32), Y=(32, 28), V=(7, 32).
INFO Metadata: initialized.
[5]:
name = StatCan IOT summary Canada 2022 Purchaser price
table = IOT
scenarios = ['baseline']
Factor of production = 7
Satellite account = 1
Consumption category = 28
Region = 1
Sector = 32
Keep the raw WDS files locally#
When download=True, MARIO stores the downloaded raw .csv bundle inside path and then parses the local file. This is useful when you want to avoid re-downloading the same table.
[6]:
db = mario.parse_statcan(
    year=2022,
    table="SUT",
    level="detail",
    geo="Quebec",
    path="/path/to/StatCan",
    download=True,
)
db
INFO Parser: reading local StatCan CSV statcan_36100478_sut_detail.csv.
INFO Parser: StatCan SUT payload ready with shapes S=(217, 472), U=(472, 217), Yc=(472, 289), Va=(9, 217), Vc=(9, 472).
INFO Metadata: initialized.
[6]:
name = StatCan SUT detail Quebec 2022
table = SUT
tech_assumption = industry-based
scenarios = ['baseline']
Activity = 217
Commodity = 472
Factor of production = 9
Satellite account = 1
Consumption category = 289
Region = 1
Caveats#
the parser is API-driven, so network availability matters more than for local-file parsers;
download=True is optional, not required;
this walkthrough focuses on the economic StatCan tables only;
environmental extensions are intentionally left out here.
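Since network availability matters for this parser, a small retry wrapper can make scripted runs more robust. The sketch below is generic Python, not a MARIO feature. call_with_retries is a hypothetical helper, and retrying on OSError (which includes ConnectionError and timeouts) is an assumption, because the exceptions MARIO raises on a failed request are not described here.

```python
import time


def call_with_retries(func, attempts=3, delay=2.0, retry_on=(OSError,)):
    """Call `func()` up to `attempts` times, sleeping `delay` seconds between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise          # out of attempts: re-raise the last error
            time.sleep(delay)


# Offline demonstration: a function that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "parsed"


result = call_with_retries(flaky, attempts=3, delay=0.0)
print(result, calls["n"])
```

With MARIO installed, the parser call can be wrapped in a lambda, for example `call_with_retries(lambda: mario.parse_statcan(year=2022, table="SUT", level="summary", geo="Canada"))`.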