It is proposed that the new, more harmonised approach, with evaluation criteria that are simple yet still meaningful, be based on the overall logic outlined below. As far as possible, it makes use of the results from Reportnet's automatic data quality checking routines.
Most core data flows use the XML 1.0 standard as their data exchange format, which allows data quality to be checked automatically in Reportnet immediately after data upload. Over the years, data quality checks have been defined for most of the major data flows. It is assumed that, for each of the core data flows, automated data quality checks will be defined where they are missing, while existing ones will be expanded and refined.
The proposal is that, in the evaluation of deliveries under the core data flows, scores of 0 to 4 points (formerly known as smileys) will be given, according to the table below. Following existing practice from evaluating the earlier set of priority data flows, the main components of the scoring are a delivery's timeliness and its data quality. Scoring results from individual core data flows will be aggregated by country, so that an average score for the country can be derived, typically expressed as achievement (performance) in percent. Additionally, average performance values can be calculated for a single data flow (across all countries), as well as average scores for different country groupings (across all data flows). More details and definitions for the individual scoring categories are given further below.
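The aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes that performance in percent is the mean score divided by the maximum of 4 points, and the country codes, data flow names and scores are hypothetical placeholders, not real evaluation results.

```python
from collections import defaultdict
from statistics import mean

MAX_SCORE = 4  # scores range from 0 to 4 points


def performance_percent(scores):
    """Assumed formula: mean score as a percentage of the maximum score."""
    return 100.0 * mean(scores) / MAX_SCORE


def aggregate(deliveries, key_index):
    """Group (country, data_flow, score) records by one column and average them."""
    groups = defaultdict(list)
    for record in deliveries:
        groups[record[key_index]].append(record[2])
    return {key: performance_percent(values) for key, values in groups.items()}


# Hypothetical records: (country, data flow, score)
deliveries = [
    ("AA", "flow-1", 4),
    ("AA", "flow-2", 2),
    ("BB", "flow-1", 3),
    ("BB", "flow-2", 0),
]

by_country = aggregate(deliveries, 0)  # average per country, across data flows
by_flow = aggregate(deliveries, 1)     # average per data flow, across countries
```

With these placeholder values, country "AA" reaches 75 percent and data flow "flow-2" reaches 25 percent. Averages for country groupings could be obtained in the same way by grouping on a group label instead of the country code.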
The definitions for the categories used in the above matrix are given below. Similar definitions have largely been used in the past for the priority data flows. However, the earlier priority data flow criteria were substantially complex and varied, which often obscured the underlying principles and commonalities.
The timeliness criteria refer to the actual reporting date, in comparison with the reporting deadline and the length of the reporting cycle. The logic is based on a process-oriented approach, taking into account how much effort is needed for handling a given delivery in subsequent data processing steps.
The evaluation categories are based on the results of (automated) data quality checks, which inspect the format and completeness of the delivery, as well as its internal and external consistency. Typically, the checking rules first verify the presence of all mandatory values (completeness check). This is followed by internal consistency checks, such as uniqueness of primary keys, references between optional data elements (conditional tests), and checks for referential integrity between tables or between GIS data and attribute data. A further option for checking internal consistency is to test the incoming data against earlier deliveries under the same data flow, e.g. in the form of outlier checks. Finally, external consistency checks can be executed, in which delivered data are compared against external reference data, e.g. code lists and nomenclatures provided by Eurostat or other international data providers.
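To make the sequence of checks more concrete, the following Python sketch runs a completeness check, a primary key uniqueness check and a code list check over a small set of records. It is not Reportnet's actual implementation: the function name, the field names (station_id, country, value) and the code list are hypothetical and chosen only for illustration.

```python
def check_delivery(rows, mandatory, key, code_lists):
    """Return a list of findings for a delivery given as a list of dicts."""
    findings = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Completeness: every mandatory field must carry a value.
        for field in mandatory:
            if row.get(field) in (None, ""):
                findings.append(f"row {i}: missing mandatory value '{field}'")
        # Internal consistency: the primary key must be unique.
        key_value = row.get(key)
        if key_value is not None:
            if key_value in seen_keys:
                findings.append(f"row {i}: duplicate key '{key_value}'")
            seen_keys.add(key_value)
        # External consistency: coded values must appear in the reference list.
        for field, allowed in code_lists.items():
            value = row.get(field)
            if value not in (None, "") and value not in allowed:
                findings.append(f"row {i}: '{value}' not in code list for '{field}'")
    return findings


# Hypothetical delivery with three deliberate problems.
rows = [
    {"station_id": "S1", "country": "AA", "value": 1.2},
    {"station_id": "S1", "country": "ZZ", "value": 3.4},
    {"station_id": "S2", "country": "BB", "value": None},
]
findings = check_delivery(
    rows,
    mandatory=["station_id", "country", "value"],
    key="station_id",
    code_lists={"country": {"AA", "BB"}},
)
```

For the example records, the function reports a duplicate key and an unknown country code in the second row, and a missing mandatory value in the third. Referential integrity between tables and outlier checks against earlier deliveries would follow the same pattern but need additional inputs, so they are left out of this sketch.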
Document last modified: 2017/02/22.