Digital Land Pipeline

Python command-line tools for collecting and converting resources into a dataset

Installation

pip3 install digital-land

Command line

$ digital-land --help

Usage: digital-land [OPTIONS] COMMAND [ARGS]...

Options:
  -d, --debug / --no-debug
  -n, --dataset TEXT
  -p, --pipeline-dir PATH
  -s, --specification-dir PATH
  --help                        Show this message and exit.

Commands:
  build-datasette                build docker image for datasette
  collect                        fetch resources from collection endpoints
  collection-add-source          Add a new source and endpoint to a collection
  collection-check-endpoints     check logs for failing endpoints
  collection-list-resources      list resources for a pipeline
  collection-pipeline-makerules  generate pipeline makerules for a collection
  collection-save-csv            save collection as CSV package
  convert                        convert a resource to CSV
  dataset-create                 create a dataset from processed resources
  dataset-entries                dump dataset entries as csv
  fetch                          fetch resource from a single endpoint
  pipeline                       process a resource
  add-endpoint-and-lookups       add batch of endpoints from csv
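
The usage line above places global options before the command name. As a minimal sketch, assuming you want to drive the tool from a Python script, the hypothetical helper `build_command` below assembles an argument list using only the options listed in the help output. The helper name, its parameters and the example command are illustrative assumptions and are not part of the digital-land package.

```python
def build_command(command, dataset=None, pipeline_dir=None,
                  specification_dir=None, debug=False, args=()):
    """Assemble a digital-land argument list (hypothetical helper).

    Global options go before the command name, matching the usage line:
    digital-land [OPTIONS] COMMAND [ARGS]...
    """
    argv = ["digital-land"]
    if debug:
        argv.append("--debug")
    if dataset is not None:
        argv += ["--dataset", str(dataset)]
    if pipeline_dir is not None:
        argv += ["--pipeline-dir", str(pipeline_dir)]
    if specification_dir is not None:
        argv += ["--specification-dir", str(specification_dir)]
    argv.append(command)
    # Command-specific arguments are passed through unchanged.
    argv += [str(a) for a in args]
    return argv


# Build (but do not run) a request for the help text of one command.
print(build_command("convert", debug=True, args=["--help"]))
```

The returned list can be passed to `subprocess.run(..., check=True)` once digital-land is installed. Keeping it as a list rather than a single string avoids shell quoting problems with paths.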

Development environment

The GDAL tools are required to convert geographic data and for all of the tests to pass.

The Makefile depends on GNU make. On macOS, install make using brew and run gmake.

Development requires Python 3.6.2 or later. We recommend using a virtual environment:

make init
make
python -m digital-land --help
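
Before running the commands above, it can help to confirm that the prerequisites are in place. The following is a minimal sketch, not part of this repository. The function names are hypothetical, and probing for `ogr2ogr` is an assumption about which GDAL command-line tool to look for.

```python
import shutil
import sys

MINIMUM_PYTHON = (3, 6, 2)  # minimum version stated in this README


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter is new enough."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) >= MINIMUM_PYTHON


def find_tool(name):
    """Return the path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


if not python_is_supported():
    print("Python 3.6.2 or later is required")

# ogr2ogr ships with GDAL; choosing it as the probe is an assumption.
if find_tool("ogr2ogr") is None:
    print("GDAL tools not found on PATH; geographic conversion may fail")
```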

Commands Guide

add-endpoint-and-lookups

This command allows for adding multiple endpoints and lookups for datasets within a given collection, driven by entries in a csv file.

Detailed instructions for running this command can be found in the Data Operations manual within the MHCLG technical documentation repository.

Use with caution

(Currently only tested successfully on the Brownfield Land collection.)

Release procedure

Update the tagged version number:

make bump

Build the wheel and egg files:

make dist

Push to GitHub:

git push && git push --tags

Wait for the continuous integration tests to pass and then upload to PyPI:

make upload
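
The release steps above can be sketched as a small script. This is an illustration under stated assumptions rather than tooling from this repository: the names `BEFORE_CI`, `AFTER_CI` and `run_steps` are hypothetical, and the wait for continuous integration remains a manual step between the two calls.

```python
import subprocess

# Commands taken from the steps above, split around the manual wait for CI.
BEFORE_CI = [
    ["make", "bump"],
    ["make", "dist"],
    ["git", "push"],
    ["git", "push", "--tags"],
]
AFTER_CI = [["make", "upload"]]


def run_steps(steps, dry_run=True):
    """Run each command in order, stopping at the first failure.

    With dry_run=True the commands are only printed, never executed.
    Returns the commands as a list of strings.
    """
    planned = []
    for argv in steps:
        line = " ".join(argv)
        planned.append(line)
        print(("would run: " if dry_run else "running: ") + line)
        if not dry_run:
            # check=True raises on a non-zero exit, so later steps are skipped.
            subprocess.run(argv, check=True)
    return planned


run_steps(BEFORE_CI)  # dry run by default
# Only once the continuous integration tests have passed:
run_steps(AFTER_CI)
```

Defaulting to a dry run makes it harder to publish by accident. Pass `dry_run=False` to execute the commands.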

Notebooks

Notebooks containing code that could be useful when debugging the system have been added. Jupyter isn't currently installed as part of the development environment, so before running them you may need to install it:

pip install jupyterlab

The notebooks are as follows:

  • debug_resource_transformation.ipynb - given a resource and a dataset, this downloads the resource and the relevant information needed to process it. This is very useful for replicating errors that occur in this step.

Licence

The software in this project is open source and covered by the LICENSE file.

Individual datasets copied into this repository may have specific copyright and licensing; otherwise, all content and data in this repository is © Crown copyright and available under the terms of the Open Government Licence v3.0.
