When a data file is selected in the send data file screen, its name is parsed and analysed against the DSNC (Dataset Naming Convention) so that the form can be pre-filled automatically.
Dataset Naming Convention (DSNC) at a glance:
DATASET ID:
DOMAIN ID | _ | DATASET STRUCTURE ID | _ | PERIODICITY |
Field | Length | Description/Remark |
DOMAIN ID | 1..8 | Identifies the statistical domain (a group of datasets closely linked together). Only digits from 0 to 9 and capital letters can be used. |
DATASET STRUCTURE ID | 1..7 | Identifies the Dataset Structure (associated with one or several statistical tables). Only digits from 0 to 9 and capital letters can be used. |
PERIODICITY OR PERIODICITIES | 1 | |
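The structure of the dataset ID described above can be checked with a regular expression. The following is a minimal sketch, not the tool's actual implementation: the names `DATASET_ID_RE` and `parse_dataset_id` are hypothetical, and because the table does not list the valid periodicity codes, the sketch assumes that any single capital letter or digit is accepted.

```python
import re

# Hypothetical pattern built from the field lengths in the table above.
# Assumption: the periodicity is any single capital letter or digit,
# since the valid codes are not listed in this section.
DATASET_ID_RE = re.compile(
    r"(?P<domain>[A-Z0-9]{1,8})"
    r"_(?P<structure>[A-Z0-9]{1,7})"
    r"_(?P<periodicity>[A-Z0-9])"
)


def parse_dataset_id(text):
    """Return the three parts of a dataset ID, or None if it is malformed."""
    match = DATASET_ID_RE.fullmatch(text)
    return match.groupdict() if match else None


print(parse_dataset_id("RAIL_E_Q"))
print(parse_dataset_id("rail_e_q"))
```

With the example used later in this section, `RAIL_E_Q` splits into domain `RAIL`, dataset structure `E` and periodicity `Q`, while the lowercase variant is rejected.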
DATASET OCCURRENCE ID
DATASET ID | _ | FROM | _ | YEAR | _ | PERIOD | _ | [TO] | _ | [OPTION] | . | FORMAT |
Field | Length | Description/Remark |
DATASET ID (mandatory) | See above | See above |
FROM (mandatory) | 2 | Code of the country to which the primary data-providing organisation belongs. ISO2 country codes are used, with several exceptions. |
YEAR (mandatory) | 4 | Four-digit representation of the reporting year, “YYYY”. For non-periodic datasets: “0000” or the reporting year. |
PERIOD (mandatory) | 4 | Four-digit representation of the period within the reporting year, or the sequence number for non-periodic datasets. Acceptable values depend on the periodicity:
|
TO (optional) | 2 | Code of the country to which the primary data-receiving organisation belongs. The same rules apply as for FROM.
Note: This field is used mainly for transmissions sent from Eurostat. |
Optional field(s) | 1..220 | Optional information given by the data sender. Its use is not recommended, and it is ignored by Eurostat tools during processing. |
FORMAT | 20 | Examples: “XML”, “GES” (GESMES), “CSV”, “FLR” (fixed-length records), “DOCX”, “XLSX”, etc. |
Composition constraints and limitations for fields:
- “YEAR” and “PERIOD”: If a dataset occurrence contains data for several years or periods, use only the agreed year/period (the one specified in the calendar of the dataset).
- Optional field(s): Only letters “A” to “Z”, digits “0” to “9” and “_” are allowed.
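The character constraint on the optional field(s) can be expressed as a single pattern. This is an illustrative sketch only: `is_valid_optional` is a hypothetical helper, and it assumes that the 1 to 220 length limit from the table applies to the optional part as a whole.

```python
import re

# Assumption: the whole optional part is checked at once,
# and its total length is limited to 1 to 220 characters.
OPTIONAL_RE = re.compile(r"[A-Z0-9_]{1,220}")


def is_valid_optional(text):
    """True if text uses only A-Z, 0-9 and "_" and has a permitted length."""
    return OPTIONAL_RE.fullmatch(text) is not None


print(is_valid_optional("V0002"))
print(is_valid_optional("v0002"))
```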
The file name can follow the DSNC either partially or completely:
- Partially: a valid dataset ID is found at the beginning of the file name.
In this case, only the fields up to the first inconsistency are pre-filled. The other fields remain empty and must be filled in manually by the user.
Values extracted by the DSNC analysis must be compatible with the filtered lists of elements available in the corresponding fields.
- Completely: the "Dataset ID", "From", "To", "Year" and "Period" fields are filled in automatically.
The naming convention relates to 3 levels:
- Level 1: the dataset ID (e.g. RAIL_E_Q)
- Level 2: the dataset occurrence ID (e.g. RAIL_E_Q_UK_2003_0002_EU)
- Level 3: the datafile ID (e.g. RAIL_E_Q_UK_2003_0002_EU_V0002.GES).
The DSNC parsing is based on the second level, the dataset occurrence ID.
Automatic pre-filling of the form proceeds until the first inconsistency with the DSNC is detected.
For instance, “RAIL_E_Q_toto.csv” is sufficient to pre-fill the "Dataset ID" field, and “RAIL_E_Q_UK_2003_titi.csv” is sufficient to pre-fill the "Dataset ID", "From" and "Year" fields.
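The "pre-fill until the first inconsistency" behaviour can be sketched as follows. This is not the application's actual parser: the function `prefill` and its patterns are hypothetical, country codes are assumed to be two capital letters, the periodicity is assumed to be any single capital letter or digit, and the compatibility check against the filtered lists of the form is not modelled.

```python
import re

# Hypothetical patterns for the fields that follow the dataset ID.
# Assumption: country codes are two capital letters.
FOLLOWING_FIELDS = [
    ("From", r"[A-Z]{2}"),
    ("Year", r"[0-9]{4}"),
    ("Period", r"[0-9]{4}"),
    ("To", r"[A-Z]{2}"),
]


def prefill(file_name):
    """Return the form fields recoverable from file_name, in DSNC order."""
    stem = file_name.rsplit(".", 1)[0]  # drop the FORMAT extension, if any
    tokens = stem.split("_")
    filled = {}

    # The dataset ID (domain, structure, periodicity) must come first.
    if len(tokens) < 3:
        return filled
    domain, structure, periodicity = tokens[:3]
    if not (
        re.fullmatch(r"[A-Z0-9]{1,8}", domain)
        and re.fullmatch(r"[A-Z0-9]{1,7}", structure)
        and re.fullmatch(r"[A-Z0-9]", periodicity)
    ):
        return filled
    filled["Dataset ID"] = "_".join(tokens[:3])

    # Stop at the first token that does not fit the expected field.
    for (name, pattern), token in zip(FOLLOWING_FIELDS, tokens[3:]):
        if not re.fullmatch(pattern, token):
            break
        filled[name] = token
    return filled


print(prefill("RAIL_E_Q_toto.csv"))
print(prefill("RAIL_E_Q_UK_2003_titi.csv"))
print(prefill("RAIL_E_Q_UK_2003_0002_EU_V0002.GES"))
```

One limitation of this sketch: because "To" is optional, an optional field consisting of exactly two capital letters would be taken for a "To" country code. Checking the extracted value against the list of known codes, as the form does, would resolve that ambiguity.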