Smart surveys - WP 2 Methodology

To develop a comprehensive research methodology for implementing smart surveys, we have identified four major challenges that hinder their integration into data collection for European official statistics:

1. Effectively recruiting and retaining participants for smart surveys, particularly in hard-to-reach societal groups.

2. Use of machine learning models to enhance Human-Computer Interaction in smart surveys.

3. Designing smart surveys from a User Experience (UX) or usability perspective, involving respondents, and managing human-computer interaction with sensor data after it has been processed by machine learning models.

4. Integration of data from smart surveys with traditional survey methods by estimating the mode effect (i.e., differences between smart and traditional data collection).

To address these challenges, the project will conduct a number of small- and large-scale field tests until the end of the project in 2025.

In practice, there are most likely multiple successful methodologies for conducting smart surveys, and these may vary with local circumstances. For instance, in some countries, interviewers may play a significant role both in recruiting and retaining participants for smart surveys (issue 1) and in enhancing the usability of the app (issue 3).

Similarly, countries may rely to a greater or lesser extent on traditional non-smart surveys in combination with smart surveys to produce official statistics (issue 4).

As a final example, the data available for training and retraining machine learning models in smart surveys may differ both between and within countries, and may change over time (issue 2).

One of the key objectives of this work package is to determine which combinations of smart survey designs are effective and which are not. To accommodate the variations between countries, we are conducting field experiments and usability tests in multiple countries to gain comprehensive insights.

A central aim of this work package is to identify trade-offs between design features in smart surveys. One significant trade-off is between recruiting and retaining participants (issue 1) and the mode effect (issue 4). For instance, offering alternative data collection modes, such as web or paper diaries, alongside smart surveys may increase response rates during recruitment. However, this approach may lead to differences in the data across modes (mode effects, issue 4). The more alternative modes are offered, the harder it becomes to estimate mode effects and to integrate data from multiple modes.
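As a rough illustration of what estimating a mode effect can involve, the sketch below compares the mean of one survey variable between a smart-survey group and a traditional-mode group. It is a minimal example under strong assumptions, not the project's method: respondents are assumed to be randomly assigned to a mode (so the difference is not confounded with who chooses which mode), the two samples are independent, and no survey weights are used. The function name `estimate_mode_effect` and all figures are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def estimate_mode_effect(smart, traditional):
    """Return (difference in means, standard error) for two independent samples.

    Assumes random assignment to mode and unweighted data (illustrative only).
    """
    diff = mean(smart) - mean(traditional)
    # Standard error of a difference between two independent sample means,
    # using the sample variances (unequal-variance form).
    se = sqrt(variance(smart) / len(smart) + variance(traditional) / len(traditional))
    return diff, se


# Made-up weekly grocery expenditure in EUR, for illustration only.
smart_app = [52.0, 61.5, 48.0, 70.0, 55.5]
paper_diary = [45.0, 58.0, 41.5, 63.0, 50.0]

effect, se = estimate_mode_effect(smart_app, paper_diary)
print(f"estimated mode effect: {effect:.2f} EUR (standard error {se:.2f})")
```

With every additional mode, a further pairwise comparison of this kind is needed and each group becomes smaller for a fixed total sample, which is one way to see why offering more modes makes mode effects harder to estimate.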

Another trade-off concerns the use of machine learning models (issue 2) and the usability of smart surveys (issue 3). Smart surveys are designed to measure things that respondents may find impossible or very difficult to report, such as the start time of a specific activity or the exact amount spent during grocery shopping. Effective machine learning models can make the response task easier for the respondent.

For example, automated classification of images of shopping receipts can reduce the burden on respondents and improve the accuracy of measurements. However, if machine learning models perform poorly, because of low-quality images or difficulties in classifying products, respondents may be presented with inaccurate data. When respondents need to manually correct the output of machine learning models, this can create usability issues (issue 3) and may reduce participant engagement (issue 1).

This work package will provide insights into these trade-offs by conducting field experiments that vary design aspects of smart surveys. An overview of the design of all tests will be published in summer 2024, and the findings from all tests, together with recommendations for a research methodology for smart surveys, will be released by the end of the project in spring 2025. This deliverable does not examine the trade-offs between design elements in detail. Instead, it reviews earlier research on recruiting and retaining participants, machine learning, usability, and the mode effect in data integration, in four chapters respectively.
