Issue 09 - Experimental indices in tourism statistics

Experimental Indices in Tourism Statistics

In the last few years, tourism trends have changed on both the demand and the supply side. Rapidly evolving and increasingly available technologies require national statistical institutes (NSIs) to continuously adapt their IT systems for collecting data, including data from unstructured big data sources, and for verifying and processing them. This also pushes NSIs to improve their methods and techniques for acquiring new types of data and to implement innovative tools that improve the consistency and comparability of the results they produce, including in the area of tourism statistics.

Today's digital world offers many big data sources that may help make tourism statistics more up to date. Tourism is a very specific industry, shaped by travellers' behaviour and limitations. According to Regulation (EU) No 692/2011 on tourism statistics: "tourism means the activity of visitors taking a trip to a main destination outside their usual environment, for less than a year, for any main purpose, including business, leisure or other personal purposes, other than to be employed by a resident entity in the place visited". Classical statistics are often not enough to give a complete picture of tourism in today's world. Therefore, the aim of use case 4, "Experimental indices in tourism statistics", is to develop experimental indices based on big data, particularly data from web portals. Tourism is also an industry that cuts across many economic activities and sectors of the economy. It includes accommodation, food and beverage, rental, transport, cultural, sport, travel agency and tour operator services.

The project therefore draws on more diverse information than data directly related to tourism alone. This can include the price of accommodation, airline tickets and local transport, the cost of particular products, and the price of tours. Such data can be used to impute missing data in the tourist travel survey, which is carried out in the countries of the European Union. It can also be crucial for detecting and monitoring rapid changes in the prices offered for tourism services. This is particularly important during shocks, such as the pandemic, which can disrupt the entire tourism industry.
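To make the idea of monitoring rapid price changes more concrete, the sketch below builds a simple median-price index from scraped offers and flags periods in which the index moves sharply. It is only an illustration under stated assumptions, not the method used in the project: the function names, the 10% threshold, the choice of the median and the sample prices are all hypothetical.

```python
from statistics import median


def price_index(prices_by_period, base_period):
    """Median-price index per period, with the base period set to 100."""
    medians = {p: median(v) for p, v in prices_by_period.items() if v}
    base = medians[base_period]
    return {p: 100.0 * m / base for p, m in medians.items()}


def flag_rapid_changes(index, threshold_pct=10.0):
    """Return (period, change in %) where the index moved by more than
    threshold_pct compared with the previous period."""
    # Assumes period labels sort chronologically (zero-padded ISO weeks here).
    periods = sorted(index)
    flagged = []
    for prev, curr in zip(periods, periods[1:]):
        change = 100.0 * (index[curr] - index[prev]) / index[prev]
        if abs(change) > threshold_pct:
            flagged.append((curr, round(change, 1)))
    return flagged


# Hypothetical scraped nightly accommodation prices (EUR) per ISO week.
scraped = {
    "2022-W10": [80, 95, 100, 120],
    "2022-W11": [82, 96, 102, 118],
    "2022-W12": [60, 70, 75, 90],
}

index = price_index(scraped, base_period="2022-W10")
print(index)
print(flag_rapid_changes(index))
```

The median is used here instead of the mean because scraped offers often contain extreme prices. A production index would also need to control for changes in the mix of offers between periods.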

The number of portals publishing the same kind of information of interest is very large, and new ones appear daily. Whether web portals can be used for statistical purposes therefore depends on assessing them properly against defined criteria. Particular attention should be given to portals with a high frequency of data updates, a large number of offers, and popularity among users, which can be verified through specialised websites dedicated to web traffic analysis. Another consideration is that portals, especially those offering rental of tourist facilities or ticket booking, are built on the latest IT solutions and technologies, which makes scraping them significantly more difficult. It is therefore worthwhile to analyse the portals against technological criteria as well. Most off-the-shelf solutions for automatic data downloading cannot handle dynamically loaded content, pop-ups or CAPTCHAs. That said, while the criteria for evaluating information resources are common and need to be compiled only once, the criteria for evaluating technological parameters must be adapted to the knowledge and skills of the developers.
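One possible way to operationalise such an assessment is a simple weighted score for the information criteria, reported alongside a count of technological obstacles. The sketch below is a hypothetical illustration only: the `Portal` fields, the weights and the sample portals are assumptions and do not come from the project.

```python
from dataclasses import dataclass


@dataclass
class Portal:
    name: str
    updates_per_week: float  # frequency of data updates
    offers: int              # number of published offers
    monthly_visits: int      # popularity, e.g. from a web traffic analysis site
    dynamic_content: bool    # content is loaded dynamically
    popups: bool             # pop-ups interrupt page loading
    captcha: bool            # CAPTCHA protection is present


# Assumed weights for the information criteria; they sum to 1.
WEIGHTS = {"updates_per_week": 0.4, "offers": 0.3, "monthly_visits": 0.3}


def information_score(portal, candidates):
    """Weighted sum of the criteria, each scaled by the best candidate value."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        best = max(getattr(p, field) for p in candidates)
        score += weight * getattr(portal, field) / best
    return score


def obstacles(portal):
    """Number of technological obstacles a scraper would have to handle."""
    return sum([portal.dynamic_content, portal.popups, portal.captcha])


# Hypothetical candidate portals.
candidates = [
    Portal("portal_a", 7, 50_000, 2_000_000, True, True, True),
    Portal("portal_b", 1, 10_000, 500_000, False, False, False),
    Portal("portal_c", 7, 25_000, 1_000_000, True, False, False),
]

ranked = sorted(
    candidates, key=lambda p: information_score(p, candidates), reverse=True
)
for p in ranked:
    print(p.name, round(information_score(p, candidates), 2), obstacles(p))
```

Keeping the obstacle count separate from the information score mirrors the distinction made above: the information criteria can be fixed once, while the weight given to technological obstacles depends on what the development team is able to handle.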

Web scraping of portals began in 2022, but this is only the beginning. Work is currently underway to combine data from web sources with data from statistical sources. Later in 2023, work will focus on developing methods for creating new indicators.

Published 10 January 2023