CDDA data collection 2018

Released on 2018-02-01 by: Marek Staron

To: EIONET PCPs and NRCs for Biodiversity data and information

Cc: Eionet National Focal Points

From: Mette Lund and Marek Staron, EEA

Dear Colleague,

We are writing to inform you of the detailed plans for the CDDA annual data collection, which is, as you will know, one of the agreed EIONET core data flows. We hope that this letter will be useful as you organise a programme of work with your network in order to validate the existing data published by EEA and to provide any new information available by the delivery deadline of 30 April 2018.

New CDDA reference page

Reporting instructions for data flows hosted by the EEA are made available from one common access point: CDR help. The new CDDA reference page, with all instructions and supporting material for the reporting, is found at http://cdr.eionet.europa.eu/help/cdda/. Please visit the CDDA reference page and familiarise yourself with the material.

New reporting mechanism and new data model

A new reporting mechanism is ready to receive your data. The focus of the new mechanism is 1) to re-use the INSPIRE Protected Sites data from countries implementing the INSPIRE Directive and 2) to automate the QC and the processing of the deliveries into the final European dataset.

Following the process which removed outdated and redundant information and adapted to the impacts of the INSPIRE Directive, the new CDDA data model will now be used for the reporting. The new model foresees that many countries will have the spatial data ready via INSPIRE Protected Sites and hence divides the data into Type 1 and Type 2, following the Linked approach. This means that countries providing spatial data on protected sites via INSPIRE should re-use those datasets for the CDDA Type 1 reporting and may use the pre-loaded Excel template for the CDDA Type 2 reporting. We recommend that countries that are not implementing INSPIRE use the pre-loaded Excel and shape file templates (see below) for both CDDA Type 1 and Type 2 reporting. The new data model and the reporting mechanism are described in detail in the CDDA reporting guidelines (note now version 1.1).

The reporting deadline has moved from the originally announced 15 April to 30 April to give you two more weeks to get past the new automatic QC on CDR. The new QC operates with so-called “blockers”: if your data delivery does not conform to the specifications in the CDDA reporting guidelines and in the Data Dictionary, you will not be able to finalise the upload. In that case, the QC feedback report on CDR will indicate the issues found in the data. You will have to correct the data accordingly in order to finalise the delivery and release the files on CDR.

You may test the upload of your data in the CDRSandbox. The CDRSandbox is a training and testing environment similar to the real CDR. In CDRSandbox, users can test the delivery process, quality control checks and workflow associated with the different reporting obligations. CDRSandbox can also be used for training of new reporters or during the preparation of data deliveries. More instructions about the use of CDRSandbox are found on the CDDA reference page.

The reporting mechanism using the Linked approach is new. We have tested it thoroughly before this call, but it is likely that you will find issues that we did not foresee. Please do not hesitate to send us your questions. We also encourage you to experiment with smaller data samples in the CDRSandbox to get familiar with the uploading and QC process prior to the April deadline.

The Designation types reporting webform is not yet ready for your updates. It will become available later this year and you will receive a notification with further instructions when it is operational. The Designation types registry is available with designation types as reported in 2017. It will be used in the automatic QC.

Pre-loaded template files

As in previous years, the EEA has prepared a data package with pre-loaded template files and a data quality report for each country. We strongly recommend using the shape file template for the Type 1 reporting if your country is not providing INSPIRE Protected Sites. For the Type 2 reporting, we strongly recommend using the Excel file template, unless your country already has an operational workflow that will provide a valid xml file from a national database. The data package contains:

  • CDDA_2018_**_ProtectedSite_3035_<date of creation>.gml – For your information, this is an example of a correct gml file based on your latest delivery. If you reported site points or boundaries in overseas territories, you will find a file with those sites indicating the EPSG code 4258 in the file name in addition to the file with EPSG code 3035 in the file name.
  • CDDA_2018_**_ProtectedSite_polygon_3035_<date of creation>.shp – Template shape file pre-loaded with site boundaries made from the latest national spatial data provided by you. If you reported site boundaries in overseas territories, you will find a file with those sites indicating the EPSG code 4258 in the file name in addition to the file with EPSG code 3035 in the file name.
  • CDDA_2018_**_ProtectedSite_point_3035_<date of creation>.shp – Template shape file pre-loaded with site points made from the latest national spatial data provided by you. If you reported site points in overseas territories, you will find a file with those sites indicating the EPSG code 4258 in the file name in addition to the file with EPSG code 3035 in the file name.
  • CDDA_2018_**_type2data_<date of creation>.xml – For your information, this is an example of a correct xml file based on your latest delivery.
  • CDDA_2018_**_type2data_<date of creation>.xls - CDDA Data Dictionary based template Excel file pre-loaded with the latest national tabular data provided by you.
  • CDDA_2018_**_sitesMissingCriticalInfo_<date of creation>.xls – If your latest delivery had sites without geographical reference, they will be listed here. Sites missing geographical reference are no longer accepted in CDDA.
  • QA_CDDA_v15_v2017_**.pdf – Country specific report on the quality of the latest CDDA data delivery.
  • QA_CDDA_v15_v2017.pdf – Summary report on the quality of the latest European CDDA data.

** is for country code or name
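As a rough illustration of the naming pattern of the spatial files listed above, the following Python sketch splits such a file name into its parts. The regular expression, the function name describe_spatial_file, the country placeholder XX and the date format in the example are assumptions made for illustration only; they are not an official specification of the package file names.

```python
import re

# Hypothetical pattern inferred from the file names listed in this letter.
# The exact form of the country part and of <date of creation> is an assumption.
SPATIAL_FILE = re.compile(
    r"^CDDA_2018_(?P<country>[^_]+)_ProtectedSite_"
    r"(?:(?P<geometry>polygon|point)_)?"
    r"(?P<epsg>3035|4258)_(?P<created>[^.]+)\.(?P<ext>gml|shp)$"
)


def describe_spatial_file(name):
    """Return the parts of a spatial package file name, or None if it does not match."""
    match = SPATIAL_FILE.match(name)
    return match.groupdict() if match else None


# Example with a made-up file name for an overseas-territory polygon template.
info = describe_spatial_file("CDDA_2018_XX_ProtectedSite_polygon_4258_20180201.shp")
print(info)
```

A file name carrying the EPSG code 4258 would, under these assumptions, be recognised as one of the additional files for overseas territories.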

Each of the country data packages is uploaded to your country folder. The list of links is available from the CDDA reference page. Nominated data reporters and national focal points have access to this envelope.

Update of the CDDA data (Type 1 and Type 2)

The pre-loaded template files contain the latest national data provided by you in a structure defined by the new data model. If you use the templates, as recommended, it is very important that you verify the content. It was transformed into the new model based on certain assumptions (described in Chapter 8 of the CDDA reporting guidelines) and the result may differ from your actual data. Where necessary, the content should be corrected and updated. The reports on quality of the data, included in the package, provide information on problems and errors we have detected in your previous deliveries. Please try to correct them or provide an explanation if it is not possible.

Do not try to use the old database templates. They are no longer supported and you will not be able to release the data and finalise the data delivery on CDR.

Key issues

The cddaId codes

The site code cddaId is the thematic identifier for the CDDA dataset. It is vital that all Type 2 records are given correct cddaIds. The cddaId is identical to the WDPA ID used as the unique identifier by the World Database on Protected Areas (WDPA).

The web service for allocation of cddaId for new sites is available at:

http://dd.eionet.europa.eu/services/siteCodes

The service has been updated with the information from last year's reporting cycle. We kindly ask you to use this service to reserve new codes for your sites. A detailed user guide is available for download at the site code allocation web service page. Please note that the layout of the web page has changed compared to the user guide, but the functionality of the page remains the same.

The localId and PSlocalId codes

All Type 1 features and their corresponding Type 2 records must be assigned correct and identical localId and PSlocalId. The localId and PSlocalId are the principal linking elements, which allow Type 1 and Type 2 data to be properly joined. A screen cast available on the CDDA reference page demonstrates some of the issues to consider.
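Before uploading, it may help to check that the identifiers on both sides actually match. The Python sketch below is one possible way to do this, assuming that the localId values have already been read from the Type 1 features and the PSlocalId values from the Type 2 records. The function name find_unlinked and the sample identifiers are hypothetical, and the sketch is not part of the CDR QC.

```python
def find_unlinked(type1_local_ids, type2_pslocal_ids):
    """Compare identifiers from Type 1 features and Type 2 records.

    Both arguments are iterables of identifier strings. Returns two sorted
    lists: identifiers found only in Type 1, and identifiers found only in Type 2.
    """
    type1 = set(type1_local_ids)
    type2 = set(type2_pslocal_ids)
    return sorted(type1 - type2), sorted(type2 - type1)


# Made-up identifiers for illustration only.
only_type1, only_type2 = find_unlinked(
    ["site-001", "site-002", "site-003"],
    ["site-002", "site-003", "site-004"],
)
print("Type 1 features without a Type 2 record:", only_type1)
print("Type 2 records without a Type 1 feature:", only_type2)
```

Two empty lists would indicate that every Type 1 feature has a matching Type 2 record and the other way round.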

Data delivery and automatic QA

Your quality checked data should be uploaded, after consultation with the National Focal Point, to the Central Data Repository, by the deadline of: 30 April 2018.

Please create a new envelope for the 2018 reporting in the CDR CDDA collection of your country and upload the data.

The delivery should consist of the following parts:

  • Type 1 spatial data (GML format)
  • Type 2 data (XLS or XML format)
    • DesignatedArea
    • LinkedDataset
  • Additional information - any supporting or clarifying information that you think is necessary in order to explain your delivery. Use any format you think will serve the purpose.

The Type 1 data files must be submitted as valid GML files. If your country does not have the Type 1 data files available as INSPIRE Protected Sites, you may use the shape file templates and then convert them to GML files using the conversion tool. Do not upload the shape files to CDR.

The Type 2 data files are reported as Excel files. The Excel data files, based on the Excel file templates, are automatically converted to XML files upon upload to CDR.

Alternatively, the Type 2 data files may be uploaded to CDR directly as XML files, valid according to the XML schema available in Data Dictionary at http://dd.eionet.europa.eu/v2/dataset/3344/schema-dst-3344.xsd. A screen cast available in the CDDA reference page demonstrates some issues to consider when creating the XML file.

Partial deliveries to the CDR envelope are not accepted, i.e. it will not be possible to release the envelope if files are missing. All Type 1 and Type 2 data must be present in the envelope.

Please do not upload nested zip files. Nested zip files will not be properly processed.

You can split data into multiple files, but each Type 2 file must contain both DesignatedArea data and the respective LinkedDataset data.

If you split a delivery into multiple files, all must be uploaded into the same envelope.
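If you package your files in a zip archive, a quick local check for nested zip files can save an upload attempt. The following Python sketch uses only the standard zipfile module. The function name nested_archives is hypothetical, and the check looks only at the .zip file extension, which is a simplifying assumption and not the test applied by CDR.

```python
import io
import zipfile


def nested_archives(zip_bytes):
    """Return the names of members of a zip archive that are themselves .zip files."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(".zip")
        ]


# Build a small in-memory example: an outer archive that wrongly contains a zip.
inner = io.BytesIO()
with zipfile.ZipFile(inner, "w") as z:
    z.writestr("sites.gml", "<example/>")

outer = io.BytesIO()
with zipfile.ZipFile(outer, "w") as z:
    z.writestr("inner.zip", inner.getvalue())
    z.writestr("type2data.xml", "<example/>")

print(nested_archives(outer.getvalue()))
```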

Only the most recently released envelope will be processed (when it is also “technically accepted” in the Final feedback stage, see below). Data delivered in older envelopes will not enter the European dataset.

After you upload your data, you must test them by using the ‘Run automatic QA’ function of the CDR envelope. The full list of tests is available in the CDDA reference page. You must use the ‘Run automatic QA’ function at least once before you try the ‘Release the envelope’ function.

The results of the QC tests will be stored in the Feedback section of the envelope. Please check the QC tests and correct your data if necessary. If the dataset is unfit for release (it contains “blockers”), it will be indicated in the QC feedback. If you try to release an envelope that contains data files with blockers, it will fail and the envelope will return to Draft status.

Other types of issues identified by the QC tests (“errors”, “warnings”) will not prevent the envelope’s release, but you should still review and, where possible, resolve them. See also the Scoring section below.

When successfully released, the envelope enters the Final feedback stage and the envelope will be locked. In this step, the Data Processor (ETC/BD) will perform additional checks and decide whether the envelope can be "technically accepted" or they will ask for a correction and redelivery from you ("correction requested").

Only technically accepted envelopes will be harvested and processed into the European dataset. If the Data Processor asks for a correction of your delivery, the detailed reasons will be provided in the envelope as manual feedback and you will be notified about it.

When a delivery is technically accepted, the envelope will be closed. When correction is requested for a delivery, the envelope will also be closed. Any new delivery, including redelivery of corrected data, will require that you create a new envelope.

Scoring criteria

The scoring of the reporting will follow the Eionet core data flows evaluation criteria for the Timeliness parameter:

  • Timely delivery 30 April (the deadline)
  • Small delay 30 May (+30 days)
  • Serious delay 1 June (and thereafter)
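The timeliness categories above can be expressed as a small Python sketch. The function name timeliness is hypothetical. Two points are assumptions: that the date of the envelope's release is the date being scored, and that any date after 30 May counts as a serious delay (the list above does not mention 31 May explicitly).

```python
from datetime import date

DEADLINE = date(2018, 4, 30)
SMALL_DELAY_LIMIT = date(2018, 5, 30)  # the deadline plus 30 days


def timeliness(release_date):
    """Classify a release date using the timeliness categories of this letter."""
    if release_date <= DEADLINE:
        return "Timely delivery"
    if release_date <= SMALL_DELAY_LIMIT:
        return "Small delay"
    # Assumption: everything after 30 May is treated as a serious delay.
    return "Serious delay"


print(timeliness(date(2018, 4, 27)))
print(timeliness(date(2018, 5, 15)))
print(timeliness(date(2018, 6, 4)))
```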

For the Data Quality parameter, in 2018, the scoring will be the following:

  • Basic test failed: Generation of "blocker" messages during the CDR QC test.
  • Basic test passed: Only "error", "warning" or "ok" messages have been generated.
  • All tests passed: Only "warning" or "ok" messages have been generated.
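To make the three data quality levels concrete, the Python sketch below derives the score from the message levels found in a QC feedback report. The function name data_quality and the plain list of level strings are assumptions for illustration; the actual feedback on CDR is not delivered in this form.

```python
def data_quality(levels):
    """Map the QC message levels of a delivery to the 2018 data quality score."""
    found = {level.lower() for level in levels}
    if "blocker" in found:
        return "Basic test failed"
    if "error" in found:
        return "Basic test passed"
    # Only "warning" or "ok" messages remain.
    return "All tests passed"


print(data_quality(["ok", "warning", "error"]))
```

Under this reading, a single "blocker" message decides the score regardless of how many other tests were passed.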

Anticipated changes in 2019

Content-related fields will become “blockers” in 2019 if they are mandatory. Until now, the concept of Mandatory and Optional fields was not strictly applied to CDDA. In 2018, we apply a stricter QC with regard to the provision of Mandatory information on identifiers and other fields important for the structure and integrity of the CDDA database (see the list of QC tests). From 2019, we will extend the strict application of QC for mandatory fields to include the content-related fields legalFoundationDate, majorEcosystemType, iucnCode and siteArea.

Helpdesk and support

  • In the case of login problems or any other technical difficulties with Eionet web services, please contact Eionet Helpdesk.
  • In the case of questions related to the content of the data requested, please contact CDDA helpdesk. This support is provided by ETC/BD staff.
  • In case countries repeatedly ask the same kind of questions on the new reporting mechanism or the new data model, we will offer additional screen casts or a webinar to cover the items in question.

If the people working directly on the CDDA data delivery in your country are not PCPs or NRCs, please copy this letter to them so that they are aware of the wider context of their work at the European level.

We would like to take this opportunity to thank you for your continued support. We look forward to another successful annual data flow.

Mette Lund - Biodiversity data and information systems
Marek Staron - Water and biodiversity data flows
European Environment Agency