Privately Held Data – Tools and Techniques 1st edition - 2024

Course Leader: Christian Kauth
Target Group: Official statisticians wishing to develop a toolbox for acquiring privately held data at national level.
Entry Qualifications
  • Sound command of English. Participants should be able to make short interventions and participate actively in discussions.
  • Active or passive knowledge of Python and HTML/CSS is beneficial. Participants may choose to either actively code the exercises or to passively learn from the provided solutions.

 

Objective(s)

This course aims to democratize access to privately held data and its valorisation.

  • Get to know the techniques and tools to access privately held data
  • Understand the stages of data valorisation pipelines, including data storage and accessibility
  • Become familiar with the European Data Strategy and Digital Strategy
  • Experiment with web-crawling, web-scraping, web-automation and API interfacing techniques
  • Gain hands-on experience with diverse data mining techniques from privately held data sources
Contents
  • Demystification of the techniques and tools involved in data access
  • Data valorisation pipeline, stages, architectures and accessibility (MongoDB database, Swagger & OpenAPI design)
  • Pillars, instruments, stakeholders and enablers of the European Data Strategy
  • Introduction to web-crawling (Scrapy), web-scraping (Beautiful Soup), web-automation (Selenium) and API interfacing (Swagger)
  • Lab sessions: Data mining from privately held data sources
  • Development of data miners for news (rtl.lu), reviews (yelp.com), tourism (airbnb.com), and social media (twitter.com)
Expected Outcome: Participants will understand the current data landscape and the undertakings of the European data and digital strategies. They will know the techniques and tools involved in the stages of a data processing pipeline, from access to privately held data through to data valorisation. Participants will have seen how artificial intelligence, blockchain and storage infrastructure benefit data management, and will have acquired hands-on experience (either through active coding or passive understanding of the provided solutions) with privately held data on news sites (rtl.lu), review sites (yelp.com), tourism sites (airbnb.com) and social media (twitter.com). Participants will learn how to store scraped data in databases (MongoDB) and make it accessible (REST API).
Training Methods
  • Presentations and lectures (50%)
  • Exchange of views/experiences (10%)
  • Guided Hands-on Lab sessions (40%)
Required Reading: None
Suggested Reading: None
Required Preparation
Trainer(s)/Lecturer(s): Christian Kauth

 

1st edition

Practical Information
When: 22–24.04.2024
Duration: 3 days
Where: Cologne, Germany
Organiser: ICON INSTITUTE Public Sector GmbH
Application via National Contact Point: deadline 26.02.2024