-
Notifications
You must be signed in to change notification settings - Fork 0
Make File for AmphibiaWeb
NOTE: This code base has been deprecated in favor of https://github.com/VertNet/post-harvest-processor.
This page describes the process for creating a subset snapshot file from the latest VertNet harvest for download from Google Cloud Storage. This example is to create a file for use by AmphibiaWeb.
Prerequisites: Make sure the table vertnet_latest_amphibians reflects the latest data from the most recent BigQuery fullDump (e.g., table full_20150120). See the page https://github.com/VertNet/bigquery/wiki/Make-Amphibian-Snapshot-Table,
To create a current copy of the data set for AmphibiaWeb, delete the table vertnet_latest_amphibiaweb.
Select the table vertnet_latest_amphibians.
Click on the Show Options button.
Select the following options:
- Write Preference: Overwrite table
- Results Size: Allow Large Results
- Results Schema: Flatten Results
In the Select Destination Table dialog:
- Project: vertnet-portal
- Dataset: dumps
- Table ID: vertnet_latest_amphibiaweb
Enter the following in the New Query text box and click on Run Query:
SELECT
type,
institutioncode,
collectioncode,
catalognumber,
occurrenceid,
associatedmedia,
basisofrecord,
eventdate,
year,
month,
day,
highergeography,
continent,
country,
countrycode,
stateprovince,
county,
locality,
verbatimelevation,
decimallatitude,
decimallongitude,
geodeticdatum,
coordinateuncertaintyinmeters,
georeferencedby,
georeferenceddate,
georeferenceprotocol,
georeferencesources,
scientificname,
order,
family,
genus,
specificepithet,
scientificnameauthorship,
dc_references
FROM [dumps.vertnet_latest_amphibians]
Click on the Run Query button.
When the table vertnet_latest_amphibians is complete, it can be exported to Google Cloud Storage for public access. To do so, move the cursor to the vertnet_latest_amphibians table in the dumps folder. click on the drop-down button and select 'Export table'. Select CSV as the export format, GZIP as the compression format, and gs://vertnet-byclass/vertnet_latest_amphibiaweb.csv.gz as the Google Cloud Storage URI. It will take a few minutes to generate the file. In the meantime the BigQuery console will show that a job is in progress. When it is finished, if there are no errors, go to the Google Cloud Storage console at https://console.developers.google.com/project/vertnet-portal/storage/browser/vertnet-byclass/. Click on the Shared Publicly check box to make the file openly accessible at http://storage.googleapis.com/vertnet-byclass/vertnet_latest_amphibiaweb.csv.gz.