
Validation reports user guide


  1. Standardization of the Validation Reports
  2. Retrieval of the Validation Report
  3. Structure of the Validation Report (HRR)
  4. See also

 

For service support related to Validation Reports, please contact:

ESTAT-DATA-METADATA-SERVICES@ec.europa.eu

Last update : 06-11-2023


 

Standardization of the Validation Reports

In collaboration with National Statistical Institutes, Eurostat has introduced an SDMX-compliant XML schema for validation reports. This Machine-Readable Report (MRR) contains the results of the validation event (e.g. errors in data, error messages) and can be processed by Eurostat’s Validation Report Formatter Service (VRF) to produce an additional, user-friendly Human Readable Report (HRR) in HTML format. The HTML report enhances the readability and clarity of the information provided and presents information on validation errors in a transparent and ordered manner.

Eurostat’s validation services (STRUVAL and CONVAL) currently produce the SDMX-compliant MRR and the associated HRR.

 

Retrieval of the Validation Report

The data provider may access the Validation Report as EDAMIS feedback. The Validation Report is never sent directly (e.g. by email) to data providers due to possible data confidentiality and security constraints. The EDAMIS service may, however, be configured to send an email informing users that the report is available.

Data providers receive the Validation Report in a human-readable format aimed at statisticians and other generalist groups. The HTML and Excel formats described in this guide are the most widely used; however, the human-readable report can also be generated in CSV. The machine-readable XML format is currently only available for Eurostat internal use.

EDAMIS - received feedback files

 

Validation Report - HTML

The Validation Report consists of a Header, an Overview section (optional), the Error Summary and the Error Details. The Header contains metadata on the validation process and a general overview of the results of the validation event. In the Overview section, statistical production domains may opt to display additional information on the validation process and the data itself. The Error Summary lists all rules that the data failed, together with the error message describing each issue. Finally, the Error Details contain the detailed list of all error occurrences, grouped by unique root cause.

 

Header section

The Header section provides an overview of the validation event: the aggregated results, and the service(s) and assets involved. The Header consists of an information box on process metadata (left) and a counter of validation failures per severity (right). In addition, the Header provides messages related to the outcome of the validation process. If a report is generated in the Acceptance environment, this is also indicated in the Header.

validation report - header section

 

The metadata list contains the following entries:

Data Provider: Identifier of the data provider country or organization. This may be the country code or the name of the organization.
Data Submitted: Date and time when the dataset was submitted in EDAMIS.
Process Type: Indicates if the validation instance is pre-validation or official transmission. If a pre-validation process is invoked, the process type is 'PRE-VALIDATION'. If official transmission is invoked, the process type is 'OFFICIAL TRANSMISSION'. Please also see Messaging below.
Number of observations: Total number of observations checked.
Input format: Format of the submitted data file.
Validated Dataset: Name of dataset validated; EDAMIS Dataset ID (including the version).
DSD: DSD Artefact ID as listed in the Euro SDMX Registry.
Dataflow: Dataflow Artefact ID as listed in the Euro SDMX Registry.
Constraint: Constraint Artefact ID as listed in the Euro SDMX Registry.
Ruleset: ID of the content validation programs executed.
Report Generated: Date and time when the Validation Report is created.
Validation service: Name and version of validation service(s) called. In the Eurostat context, the services may include STRUVAL (structural validation) or CONVAL (content validation).

 

Please note that entries that are not relevant for the individual validation event will not appear in the report. For example, the use of Constraints is optional, so the Constraint label may not appear.

 

Error occurrence counter

The Header displays a counter with the total of issues encountered, grouped by Severity. 

Error: Blocking. The data is rejected and the identified issue must be corrected before re-submission.
Warning: Non-blocking. The validation process detected an issue where expert evaluation and possible correction is required before the acceptance of the data.
Info: Non-blocking. Information on the data is provided.
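
As an illustration of how the counter relates to the severity levels above, the sketch below tallies issues per Severity. It assumes that the counter sums error occurrences; the record layout (`rule`, `severity`, `occurrences`) and the rule IDs are hypothetical and are not taken from the MRR schema.

```python
from collections import Counter

# Hypothetical error groups; field names and rule IDs are illustrative only.
error_groups = [
    {"rule": "RULE_A", "severity": "Error", "occurrences": 3},
    {"rule": "RULE_B", "severity": "Warning", "occurrences": 2},
    {"rule": "RULE_C", "severity": "Error", "occurrences": 1},
]

def count_by_severity(groups):
    """Sum occurrences per severity, always returning all three levels."""
    totals = Counter({"Error": 0, "Warning": 0, "Info": 0})
    for group in groups:
        totals[group["severity"]] += group["occurrences"]
    return dict(totals)

severity_counter = count_by_severity(error_groups)
print(severity_counter)
```

A severity with no issues is kept in the result with a count of zero, so all three levels can always be displayed.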

 

Messaging

The Validation Report includes general purpose (not error specific) messages to inform data providers of specific circumstances of the validation flow instance.

 

'Official Transmission': The label appears for validation flow instances that are intended as official data transmissions, with no pre-validation option selected in EDAMIS.
'Pre-Validation. Data is not officially transmitted to Eurostat.': The label appears for validation flow instances that are intended as pre-validation of the data, with the pre-validation option selected at data submission in EDAMIS.
'Validation ended with success.': The validation process concluded with no Error or Warning severity issues detected.
'Validation completed, review of data is required.': The validation process concluded with no Error and at least one Warning severity issue detected.
'Validation ended with errors found.': The validation process concluded with at least one Error severity issue detected.
'The report is based on confidential data. Some values might have been removed.': Datasets may contain data defined as confidential. In such cases, all elements of the report that are or may be confidential are removed. These elements include values for CONCEPT_VALUE, and error messages that may include the values for CONCEPT_VALUE.
'Error limit reached.': Validation services are configured to terminate after reaching a pre-set number of validation errors. If the error limit is reached, the report indicates this fact. Please note that the data may contain further, unreported errors that have not yet been identified and may be detected on re-submission of the data. The current error cap is 10,000 error occurrences.
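
The outcome messages above follow directly from the severity counts. The sketch below shows one way to express that selection; the function names are hypothetical, and the comparison of the cap against the total number of occurrences is an assumption, since this guide does not state exactly what the services count.

```python
ERROR_CAP = 10_000  # current cap on error occurrences stated in this guide

def outcome_message(error_count, warning_count):
    """Pick the outcome message from the number of Error and Warning issues."""
    if error_count >= 1:
        return "Validation ended with errors found."
    if warning_count >= 1:
        return "Validation completed, review of data is required."
    return "Validation ended with success."

def error_limit_reached(total_occurrences, cap=ERROR_CAP):
    """Assumption: the limit applies to the total number of occurrences found."""
    return total_occurrences >= cap

print(outcome_message(error_count=0, warning_count=2))
print(error_limit_reached(10_000))
```

Info severity issues do not influence the outcome message, which matches the definitions above: only Error and Warning counts are referenced.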

 

Overview section

Production domains may choose to complement the validation results with additional information about the validation event, e.g. calculate statistics on data quality or compliance metrics. Such calculations are not considered part of the data validation itself and therefore are displayed separately in the report, under the Overview header, above the Error Summary. The metrics are produced by the CONVAL service and use a non-standard severity that has no impact on the validation flow. 

COOL - overview section

 

Error Summary section

The Error Summary provides the complete list of Rules that the data failed to fulfill. Further, the Summary includes:

Original order: The errors are listed in order of their detection (order in the rule set) by default.
Rule: ID of the failed Rule.
Severity: Error/Warning/Info.
Occurrences: Total number of occurrences per Rule.
Error Message: Complete error message text.

 

The Error Summary and Full report may be exported to Excel using the button in the top right of the section (highlighted in green below). A search bar helps to find details in the list.

COOL - overview summary export functionality
COOL - overview summary exported example

 

Error details section

The Error Details section lists all occurrences of errors for a specific rule detected by the validation services. The section presents:

  • the rule name
  • severity level
  • total number of occurrences
  • number of occurrences reported (STRUVAL only: the number of occurrences displayed in the report is limited to the first 5 in order of detection)
  • error message
  • the complete list of series keys (the cross-products of Concept Name-Concept Value pairs) in a tabular format
  • an Additional details entry, displayed when a secondary message is defined for a Rule to provide instructions for resolution

Note: For structural validation, no Rule names are defined; the Rule entry will instead display an error ID that may be helpful for diagnostic purposes.
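
The difference between the total number of occurrences and the number of occurrences reported can be sketched as follows. The function name and the sample data are hypothetical; the assumption that services other than STRUVAL display every occurrence is inferred from the "STRUVAL only" remark above.

```python
STRUVAL_DISPLAY_LIMIT = 5  # "first 5 in order of detection", per the list above

def occurrence_counts(occurrences, service):
    """Return (total, reported) for one rule's occurrences, in detection order."""
    total = len(occurrences)
    if service == "STRUVAL":
        reported = list(occurrences[:STRUVAL_DISPLAY_LIMIT])
    else:
        # Assumption: no per-rule display limit for other services.
        reported = list(occurrences)
    return total, reported

sample = [f"occurrence {i}" for i in range(1, 9)]
total, reported = occurrence_counts(sample, "STRUVAL")
print(total, len(reported))
```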

COOL - error detailed section

 

Similarly to the Error Summary, the table may be exported to Excel or CSV formats, copied to the clipboard or printed using the button on the right side of the section.

In case the validation process detects no errors of any severity, the Error Details section does not appear in the report.

 

Grouping of results

Errors detected in the data are grouped by identifying the unique root cause that prompted them. The grouping logic is displayed below: if two errors differ at any step, they are presented separately, whereas errors with all four attributes identical are considered multiple occurrences of the same error group.

COOL - Grouping results diagram
Error Code: High-level category of error (not visible in the reports).
Rule: ID of the failed Rule.
Concept Name: Name of affected dimension.
Error message: Complete error message text.

 

Note: A single root cause may trigger multiple error types, and these will be listed separately (e.g. a code is unexpected and also violates a length constraint).

Errors generated by technical issues detected in the file (e.g. structurally incorrect dataset) will also appear as an error group, with 1 occurrence and no location defined. 
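
The grouping described in this section amounts to using the four attributes as a composite key. The sketch below illustrates this under stated assumptions: the record layout, error codes, rule IDs and messages are all hypothetical and are not taken from the validation services.

```python
from collections import defaultdict

# Hypothetical flat list of detected errors; all identifiers are illustrative.
detected = [
    {"error_code": "CODE_1", "rule": "RULE_A", "concept_name": "FREQ",
     "message": "Unexpected code", "series_key": {"FREQ": "X", "REF_AREA": "BE"}},
    {"error_code": "CODE_1", "rule": "RULE_A", "concept_name": "FREQ",
     "message": "Unexpected code", "series_key": {"FREQ": "X", "REF_AREA": "FR"}},
    {"error_code": "CODE_2", "rule": "RULE_A", "concept_name": "FREQ",
     "message": "Length constraint violated", "series_key": {"FREQ": "X", "REF_AREA": "BE"}},
]

def group_errors(errors):
    """Group errors that share Error Code, Rule, Concept Name and Error message."""
    groups = defaultdict(list)
    for error in errors:
        key = (error["error_code"], error["rule"],
               error["concept_name"], error["message"])
        groups[key].append(error["series_key"])
    return dict(groups)

groups = group_errors(detected)
for key, series_keys in groups.items():
    print(key, len(series_keys))
```

In this sample, the first two errors fall into one group with two occurrences, while the third forms its own group because its Error Code and message differ, even though it concerns the same series key.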

 

Filtering reports due to confidentiality constraints

Eurostat policy prohibits the inclusion of confidential data in Validation Reports distributed to external parties, including national statistical institutes. In such cases, the data provider will receive a report where potentially confidential elements are filtered out and marked as {REMOVED - CONFIDENTIAL}. Eurostat domain managers collecting and processing confidential data will receive an unfiltered report where all information is retained.

The filtering only affects CONCEPT_VALUE.
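
A minimal sketch of this filtering is given below. It assumes that every concept value in a series key is masked for external recipients; the function name, the `is_eurostat_domain_manager` flag and the row layout are hypothetical.

```python
PLACEHOLDER = "{REMOVED - CONFIDENTIAL}"

def series_keys_for_recipient(series_keys, is_eurostat_domain_manager):
    """Return copies of the series-key rows, masking concept values if needed.

    Only the values are replaced; concept names are kept.
    """
    if is_eurostat_domain_manager:
        # Unfiltered report: all information is retained.
        return [dict(row) for row in series_keys]
    return [{concept_name: PLACEHOLDER for concept_name in row}
            for row in series_keys]

rows = [{"FREQ": "A", "REF_AREA": "BE"}]
print(series_keys_for_recipient(rows, is_eurostat_domain_manager=False))
```

The function returns new rows rather than modifying its input, so the same source data can be used to produce both the filtered and the unfiltered report.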

 

Validation Report - Excel

If the original input dataset is in XLSM or XLSX format, there is an option to generate the Validation Report by indicating the errors directly in the original file. Note that this method does not work with files in XLS format.

Cells where a rule violation is detected are highlighted in orange, and the Error Message is displayed in a comment box attached to the cell, along with the Severity.

In case multiple errors are associated with a single cell, all Error Messages are listed in the same comment box.
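
The merging of several errors into one comment box can be sketched as below. The comment layout (`[Severity] message`, one per line), the function name and the sample cell references are assumptions for illustration; this guide does not specify the exact text format of the comment.

```python
from collections import defaultdict

def build_cell_comments(cell_errors):
    """Map each cell reference to one comment text listing all of its errors."""
    lines_by_cell = defaultdict(list)
    for cell, severity, message in cell_errors:
        lines_by_cell[cell].append(f"[{severity}] {message}")
    return {cell: "\n".join(lines) for cell, lines in lines_by_cell.items()}

# Hypothetical errors as (cell, severity, message) tuples.
cell_comments = build_cell_comments([
    ("B4", "Error", "Value must not be negative"),
    ("B4", "Warning", "Value deviates from previous period"),
    ("C7", "Info", "Value is provisional"),
])
print(cell_comments["B4"])
```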

Certain validation rules would, by design, require more than one cell to be highlighted in the report, e.g. when an error points to multiple cells used in a calculation. To avoid a possibly confusing visualization, in such cases the report currently does not highlight multiple cells and associates the error message with a single cell. If the error needs to be referenced across multiple tabs in a file, the highlighting is applied to the first affected cell in each tab.

Excel reports may be complemented with an HTML report in case an executive summary is also desired.

COOL - validation report in excel format


See also

STRUVAL error codes and messages