Ingestion

Purpose

Brings data into the portal. Pipelines write to the datastore in two modes:

  • Event-driven — when a dataset or resource is created or updated in CKAN, ckanext-aircan triggers an Airflow DAG to load it.
  • Scheduled — for external sources that publish on their own cadence, DAGs run on a fixed schedule and pull on each tick.
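As a rough illustration of the event-driven mode, the sketch below builds the HTTP request that would start one DAG run for a changed resource. It assumes Airflow 2's stable REST API (POST /api/v1/dags/{dag_id}/dagRuns with a "conf" body). The base URL, the DAG id, and the conf keys are hypothetical placeholders, and this is not necessarily how ckanext-aircan issues the trigger.

```python
import json
from urllib.request import Request

# Hypothetical values for illustration; not taken from the portal's configuration.
AIRFLOW_BASE_URL = "http://localhost:8080"
LOAD_DAG_ID = "load_resource_to_datastore"


def build_trigger_request(resource_id: str, resource_url: str) -> Request:
    """Build (but do not send) a request that starts one DAG run for a resource."""
    # Airflow passes the "conf" object through to the DAG run, so the DAG
    # can read which resource it is supposed to load. The key names here
    # are assumptions.
    body = {"conf": {"resource_id": resource_id, "resource_url": resource_url}}
    return Request(
        url=f"{AIRFLOW_BASE_URL}/api/v1/dags/{LOAD_DAG_ID}/dagRuns",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_trigger_request("abc123", "https://example.org/data.csv")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The scheduled mode needs no such trigger: the Airflow scheduler starts those runs itself from the schedule declared on each DAG.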

Tech stack

Layer             Tech
Orchestrator      Apache Airflow
DAG language      Python
Target            Datastore (BigQuery or DuckLake)
CKAN integration  ckanext-aircan

Repo locations

See also


Last reviewed: 2026-05-04
