
Validation report user guide

Fernando MORENTE-ORIA • 29 December 2023

VALIDATION REPORTS USER GUIDE

  1. Standardization of the Validation Reports
  2. Retrieval of the Validation Report
  3. Validation Report
  4. See also

 

For service support related to Validation Reports, please contact:

ESTAT-DATA-METADATA-SERVICES@ec.europa.eu

Last reviewed: 16-08-2024

COOL Production

SDMX Converter

 

Standardization of the Validation Reports

In collaboration with National Statistical Institutes, Eurostat has introduced an SDMX-compliant XML schema for validation reports. This Machine-Readable Report (MRR) contains the outcomes of the validation event (e.g. errors in data, error messages) and can be processed by Eurostat’s Validation Report Formatting Service (VRF) to produce a user-friendly Human Readable Report (HRR) in HTML format. The HTML report enhances the readability and clarity of the information provided, and presents information on validation errors and process metadata in a transparent and ordered manner. The HTML report is available for all Data Providers.

 

Retrieval of the Validation Report

The data provider may access the Validation Report via the EDAMIS feedback. The Validation Report is never sent directly (e.g. by email) to data providers due to possible data confidentiality and security constraints. However, the EDAMIS service may be configured to send an email informing users that the report is available.

Data providers receive the Validation Report in HTML format aimed at statisticians and other generalist groups. Certain statistical domains may also receive an additional report in Excel format, depending on the input file format in use (see below). The machine-readable XML format is currently only available for Eurostat internal use.

EDAMIS - received feedback files

 

Validation Report

The Validation Report consists of a Header, an optional Overview section, the Summary and the Details. The Header contains metadata on the validation process and a general overview of its results. In the Overview section, statistical production domains may opt to display additional information on the validation process and the data itself. The Summary lists all rules that the data violated, with the error message describing the issue, grouped by unique root cause. Finally, the Details contain the detailed list of all error occurrences.

 

Header section

The Header section provides an overview of the validation event. The Header consists of general messaging related to process type and process outcome; an aggregated counter of flagged issues, per severity level; report export functionality; and dataset specific process metadata along with general process metadata.

 

The Header presents the following information:

Dataset specific info

  • Dataset Name: Name of the submitted dataset, as per the Edamis naming convention.
  • Data Provider: Identifier of the data provider country or organization. This may be the country code or the name of the organization.
  • Last validation service: Last validation service that was called in the process.
  • Ruleset: ID of the content validation programs executed.
  • DSD: DSD Artefact ID as listed in the Euro SDMX Registry.
  • Dataflow: Dataflow Artefact ID as listed in the Euro SDMX Registry.
  • Input format: Format of the submitted data file.
  • Number of observations: Total number of observations checked.

General process metadata

  • Data Submitted: Date and time when the dataset was submitted in Edamis.
  • Process Type: Indicates if the validation instance is pre-validation or official transmission. If a pre-validation process is invoked, the process type is 'PRE-VALIDATION'. If official transmission is invoked, the process type is 'OFFICIAL TRANSMISSION'. Please also see Messaging below.
  • Report Generated: Date and time when the Validation Report is created.
  • Validation service: Name and version of validation service(s) called. In the Eurostat context, the services may include STRUVAL (structural validation) or CONVAL (content validation).

 

Please note that entries that are not relevant for the individual validation event will not appear in the report, e.g. if a flow is configured not to use CONVAL, the element Ruleset, a CONVAL asset, will not appear.

 

Error occurrence counter

The Header displays a counter with the total of issues encountered, grouped by Severity. 

  • Error: Blocking. The data is rejected and the identified issue must be corrected before re-submission.
  • Warning: Non-blocking. The validation process detected an issue where expert evaluation and possible correction are required before the acceptance of the data.
  • Info: Non-blocking. Information on the data is provided.
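As an illustration of the counter described above, the following is a minimal sketch that tallies issues per severity level. The dictionary layout of an issue and the helper names are assumptions made for this example; they do not reflect the actual report schema or service code.

```python
from collections import Counter

# Severity labels as named in this guide. The issue layout (a dict with a
# "severity" key) is an assumption for illustration only.
SEVERITIES = ("Error", "Warning", "Info")


def count_by_severity(issues):
    """Return the number of issues for each severity level."""
    counts = Counter(issue["severity"] for issue in issues)
    return {severity: counts.get(severity, 0) for severity in SEVERITIES}


def is_blocking(counts):
    """Only the Error severity is blocking according to this guide."""
    return counts["Error"] > 0


# Hypothetical issues, not taken from a real report.
issues = [
    {"rule": "R1", "severity": "Error"},
    {"rule": "R1", "severity": "Error"},
    {"rule": "R2", "severity": "Warning"},
]
counts = count_by_severity(issues)
print(counts, "blocking:", is_blocking(counts))
```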

 

Messaging

The Validation Report includes general purpose (not error specific) messages to inform data providers of specific circumstances of the validation flow instance.

  • 'Official Data Transmission': The label appears for validation flow instances that are intended as official data transmissions, with no pre-validation option selected in EDAMIS.
  • 'Pre-Validation. This is not an official transmission. The data is validated but not retained by Eurostat.': The label appears for validation flow instances that are intended as pre-validation of the data, with the pre-validation option selected at data submission in EDAMIS.
  • 'Validation ended with success.': The validation process concluded with no Error or Warning severity issues detected.
  • 'Validation completed, review of data is required.': The validation process concluded with no Error and at least one Warning severity issue detected.
  • 'Validation ended with errors found.': The validation process concluded with at least one Error severity issue detected.
  • 'Error limit reached.': Validation services are configured to terminate after reaching a pre-set number of validation errors. If the error limit is reached, the report indicates this fact. Please note that the data may contain further, unreported errors that have not yet been identified and may be detected on re-submission of the data. The current error cap is 10,000 error occurrences.
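The outcome messages above follow directly from the severity counts. The sketch below shows one way to express that decision logic. The function names are hypothetical, and treating the cap as "reached when the occurrence count equals or exceeds the limit" is an assumption, since this guide does not state how the services count towards it.

```python
ERROR_LIMIT = 10_000  # current error cap stated in this guide


def outcome_message(errors, warnings):
    """Pick the outcome message from the Error and Warning counts."""
    if errors >= 1:
        return "Validation ended with errors found."
    if warnings >= 1:
        return "Validation completed, review of data is required."
    return "Validation ended with success."


def error_limit_reached(error_occurrences, limit=ERROR_LIMIT):
    """Assumption: the limit counts as reached once the cap is met."""
    return error_occurrences >= limit


print(outcome_message(errors=0, warnings=2))
```

Info severity issues do not influence the outcome message, which is why the sketch takes only the Error and Warning counts.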

 

Full Report Export

The content of the report (including the Summary and Details) may be exported to Excel using the Full Report Export button in the Header. The export file presents the Summary in a separate tab and each Rule, with all its occurrences, in an individual tab.

 

COOL - overview summary exported example

 

 

Overview section

Production domains may choose to complement the validation results with additional information about the validation event, e.g. calculate statistics on data quality or compliance metrics. Such calculations are not considered part of the data validation itself and therefore are displayed separately in the report, under the Overview header, above the Error Summary. The metrics are produced by the CONVAL service and use a non-standard severity that has no impact on the validation flow. 

COOL - overview section

 

Error Summary section

The Error Summary provides the complete list of Rules that the data failed to fulfill. For each Rule, the Summary shows:

  • Original order: The errors are listed in order of their detection (order in the rule set) by default.
  • Rule: ID of the failed Rule.
  • Severity: Error/Warning/Info.
  • Occurrences: Total number of occurrences per Rule.
  • Error Message: Complete error message text.
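To make the Summary columns concrete, here is a minimal sketch that collapses individual occurrences into one row per Rule, kept in order of first detection. The field names and the input layout are assumptions for illustration, and the sketch simplifies by assuming each Rule carries a single severity and message.

```python
def build_summary(occurrences):
    """Collapse occurrences into one Summary row per Rule.

    Rows keep the order in which each Rule was first detected.
    """
    rows = {}
    for occ in occurrences:
        rule = occ["rule"]
        if rule not in rows:
            rows[rule] = {
                "original_order": len(rows) + 1,
                "rule": rule,
                "severity": occ["severity"],
                "occurrences": 0,
                "error_message": occ["message"],
            }
        rows[rule]["occurrences"] += 1
    return list(rows.values())


# Hypothetical occurrences, not taken from a real report.
occurrences = [
    {"rule": "R1", "severity": "Error", "message": "Value out of range."},
    {"rule": "R2", "severity": "Warning", "message": "Unusual growth rate."},
    {"rule": "R1", "severity": "Error", "message": "Value out of range."},
]
for row in build_summary(occurrences):
    print(row)
```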

 

Details section

Details lists all occurrences of errors for a specific rule detected by the validation services. The section presents:

  • the rule name
  • severity level
  • total number of occurrences
  • number of occurrences reported - STRUVAL only, the number of occurrences displayed in the report is limited to the first 5 in the order of detection
  • error message
  • the complete list of series keys (the cross-products of Concept Name-Concept Value pairs) in a tabular format
  • in case a secondary message is defined for a Rule to provide instructions for resolution, an Additional details entry will display it

Note: For structural validation, no Rule names are defined. Instead, the Rule entry displays an error ID that may be helpful for diagnostic purposes.
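The difference between the total number of occurrences and the number reported can be sketched as follows. The function name and the representation of a series key as a dictionary of Concept Name to Concept Value are assumptions for this example, not the service's actual data model.

```python
STRUVAL_DISPLAY_LIMIT = 5  # first 5 occurrences, as stated for STRUVAL


def occurrences_to_display(occurrences, service):
    """Return the occurrences listed in the Details for one rule.

    For STRUVAL, only the first 5 in order of detection are shown;
    for other services, all occurrences are listed.
    """
    if service == "STRUVAL":
        return occurrences[:STRUVAL_DISPLAY_LIMIT]
    return list(occurrences)


# Hypothetical series keys as Concept Name -> Concept Value pairs.
keys = [{"FREQ": "A", "GEO": "XX", "TIME_PERIOD": str(2015 + i)} for i in range(8)]

shown = occurrences_to_display(keys, "STRUVAL")
print(f"total occurrences: {len(keys)}, reported: {len(shown)}")
```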

COOL - error detailed section

 

In case the validation process detects no issues of any severity, the Error Details section does not appear in the report.

 

Grouping of results

Errors detected in the data are grouped by the unique root cause that prompted them. The grouping logic is displayed below: errors that differ at any step are presented separately, whereas errors with all four attributes identical are considered multiple occurrences of the same error group.

COOL - Grouping results diagram
  • Error Code: High level category of error (not visible in the reports).
  • Rule: ID of the failed Rule.
  • Concept Name: Name of affected dimension.
  • Error message: Complete error message text.

 

Note: A single root cause may trigger multiple error types, and these will be listed separately (e.g. a code is unexpected and also violates a length constraint).

Errors generated by technical issues detected in the file (e.g. structurally incorrect dataset) will also appear as an error group, with 1 occurrence and no location defined.
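The grouping rule described in this section amounts to keying each error on the four attributes and collecting errors with identical keys. The sketch below illustrates this under assumed field names (`error_code`, `rule`, `concept_name`, `message`); the error codes and rule IDs in the sample data are invented for the example.

```python
from collections import defaultdict

# The four grouping attributes, in the order given in this guide.
GROUP_KEYS = ("error_code", "rule", "concept_name", "message")


def group_errors(errors):
    """Group errors whose four attributes are all identical."""
    groups = defaultdict(list)
    for error in errors:
        key = tuple(error[name] for name in GROUP_KEYS)
        groups[key].append(error)
    return dict(groups)


# Hypothetical errors, not taken from a real report.
errors = [
    {"error_code": "E1", "rule": "R1", "concept_name": "GEO",
     "message": "Code not allowed.", "location": "row 3"},
    {"error_code": "E1", "rule": "R1", "concept_name": "GEO",
     "message": "Code not allowed.", "location": "row 9"},
    # Same rule and message, different concept: a separate group.
    {"error_code": "E1", "rule": "R1", "concept_name": "FREQ",
     "message": "Code not allowed.", "location": "row 4"},
    # Technical error: one occurrence, no concept and no location.
    {"error_code": "E9", "rule": "R0", "concept_name": None,
     "message": "File is not well-formed.", "location": None},
]

for key, members in group_errors(errors).items():
    print(key, "occurrences:", len(members))
```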

 

Inter-dataset validation

Inter-dataset validation is a process where multiple data files submitted simultaneously are validated first individually, then against each other. The process results in a single report incorporating complete information about all process steps. The dataset-specific metadata in the Header is broken down on file level (with the inter-dataset step also presented with limited metadata), and a separate Summary is generated for each.

 

Validation Report - Excel

If the original input dataset is in XLSM or XLSX format, there is an option to generate the Validation Report by indicating the errors directly in the original file. Note that this method does not work with files in XLS format.

Cells where a rule violation is detected are highlighted in orange color and the Error Message is displayed in a comment box attached to the cell, along with the Severity.

In case multiple errors are associated with a single cell, all Error Messages are listed in the same comment box.
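The handling of several errors on one cell can be pictured as merging all messages for that cell into a single comment text. The following is a minimal sketch in plain Python; the input layout, the function name and the "Severity: message" line format are assumptions for illustration and are not the actual comment format used in the report.

```python
from collections import defaultdict


def comments_by_cell(findings):
    """Merge all findings for the same cell into one comment text."""
    merged = defaultdict(list)
    for finding in findings:
        cell = (finding["sheet"], finding["cell"])
        merged[cell].append(f"{finding['severity']}: {finding['message']}")
    return {cell: "\n".join(lines) for cell, lines in merged.items()}


# Hypothetical findings, not taken from a real report.
findings = [
    {"sheet": "Data", "cell": "C7", "severity": "Error",
     "message": "Value must be numeric."},
    {"sheet": "Data", "cell": "C7", "severity": "Warning",
     "message": "Value differs strongly from previous period."},
    {"sheet": "Data", "cell": "D2", "severity": "Info",
     "message": "Flag provided without value."},
]

for cell, text in comments_by_cell(findings).items():
    print(cell, "->", repr(text))
```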

Some validation rules are designed in a way that would require highlighting more than one cell in the report, e.g. pointing to multiple cells used in a calculation. To avoid a possibly confusing visualization, the report currently does not highlight multiple cells in such cases and associates the error message with a single cell. If the error needs to be referenced across multiple tabs in a file, the highlighting is applied to the first affected cell in each tab.

Excel reports may be complemented with an HTML report in case an executive summary is also desired.

COOL - validation report in excel format

 

See also

STRUVAL error codes and messages