CKAN
The open-source data catalog used as the source of truth for datasets and metadata.
Purpose
CKAN is the core backend of the data portal — the source of truth for every dataset, resource, organisation, and user. It owns the canonical metadata, storage references, and authorisation model that the frontend and downstream services build on.
Key responsibilities:
- Manage datasets, resources, organisations, and users.
- Own the metadata schema and validation rules.
- Index datasets in Solr for search.
- Authorise actions through its role and permission model.
- Expose the Action API (RPC-style, JSON over HTTP) for managing and querying datasets and metadata.
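As a rough illustration of the API responsibility above, the sketch below queries the `package_search` action of CKAN's Action API, which is served from the Solr index. The base URL `https://ckan.example.org` and the helper names are hypothetical, and the token handling assumes an API token created in the CKAN user profile; treat it as a minimal sketch rather than the portal's actual client code.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with the portal's real CKAN address.
CKAN_URL = "https://ckan.example.org"


def build_search_url(base_url, query, rows=10):
    """Build a package_search URL for CKAN's Action API (version 3)."""
    params = urllib.parse.urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def extract_dataset_names(payload):
    """Return dataset names from a decoded package_search response."""
    if not payload.get("success"):
        raise RuntimeError(f"CKAN returned an error: {payload.get('error')}")
    return [dataset["name"] for dataset in payload["result"]["results"]]


def search_datasets(base_url, query, api_token=None, rows=10):
    """Call package_search over HTTP and return matching dataset names."""
    request = urllib.request.Request(build_search_url(base_url, query, rows))
    if api_token:
        # CKAN reads API tokens from the Authorization header.
        request.add_header("Authorization", api_token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_dataset_names(json.load(response))
```

A call such as `search_datasets(CKAN_URL, "rainfall", api_token="...")` would return the names of matching datasets. Because ckanext-noanonaccess blocks anonymous access, requests without a token are likely to be rejected on this portal.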
Tech stack
| Layer | Tech |
|---|---|
| Core | CKAN (Python) |
| Database | PostgreSQL |
| Search | Solr |
Extensions
| Extension | Repo | Purpose |
|---|---|---|
| ckanext-noanonaccess | https://github.com/datopian/ckanext-noanonaccess | Blocks anonymous access — forces login before any page can be viewed. |
| ckanext-pdfview | https://github.com/ckan/ckanext-pdfview | Renders PDF resources inline in the browser. |
| ckanext-geoview | https://github.com/ckan/ckanext-geoview | Previews geospatial resources (GeoJSON, WMS, KML, etc.) on a map. |
| ckanext-scheming | https://github.com/ckan/ckanext-scheming | Defines custom dataset/resource/organisation metadata schemas via YAML/JSON. |
| ckanext-auth | https://github.com/datopian/ckanext-auth | Adds extended authentication features (e.g. 2FA, login throttling, password policies). |
| ckanext-s3filestore | https://github.com/shubham-mahajan/ckanext-s3filestore | Stores uploaded resources in S3-compatible object storage (Cloudflare R2 in this fork). |
| ckanext-aircan | https://github.com/datopian/ckanext-aircan | Integrates CKAN with Airflow to run datapush/ETL pipelines as DAGs. |
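One way to confirm that the extensions in the table are actually loaded is to compare them against the `extensions` list returned by CKAN's `status_show` action. The sketch below works on an already decoded response. The plugin names in `EXPECTED_PLUGINS` are assumptions based on the upstream projects and should be checked against each extension's README and the `ckan.plugins` setting of this deployment.

```python
# Assumed plugin names; verify against each extension's README.
EXPECTED_PLUGINS = [
    "noanonaccess",
    "pdf_view",
    "geo_view",
    "scheming_datasets",
    "s3filestore",
]


def missing_plugins(status_payload, expected=EXPECTED_PLUGINS):
    """Return expected plugin names absent from a status_show response."""
    loaded = set(status_payload["result"]["extensions"])
    return sorted(set(expected) - loaded)


# Example with a hand-written payload shaped like a status_show response.
example_status = {
    "success": True,
    "result": {
        "ckan_version": "2.10.0",  # illustrative value only
        "extensions": ["noanonaccess", "pdf_view", "scheming_datasets"],
    },
}
print(missing_plugins(example_status))
```

With the hand-written payload above, the function reports `geo_view` and `s3filestore` as missing. In practice the payload would come from a request to `/api/3/action/status_show`.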
Last reviewed: 2026-05-04