Artificial intelligence: Machine and Statistical Learning - 2024

Machine and Statistical Learning 
Course Leader: Herbert Kruitbosch
Target Group: Python programmers who want to apply machine and deep learning to software engineering and research questions.
Entry Qualifications
  • Sound command of English. Participants should be able to make short interventions and participate actively in discussions.
  • Programming in Python
  • Some experience looking at tabular data with scatter plots and histograms
Objective(s)
  • Validating machine learning methods and reporting their results.
  • Mathematical concepts of supervised and unsupervised machine and deep learning models, like PCA, SVM, trees, ensembles and neural networks.
  • Use of scikit-learn, matplotlib, pandas, tensorflow and keras to design models and perform machine learning experiments.
  • Understanding of symbolic computation for backpropagation and gradient descent
  • Model selection, hyperparameter tuning and practical considerations.
  • Understanding the "lego bricks" (building blocks) of neural networks (deep learning) and their numerical issues, in particular vanishing gradients.
  • Use of pretrained models for text and vision applications with libraries like deeppavlov and detectron2.
Contents
  • Recap of tabular data, scatter plots and histograms
  • Cross validation, overfitting and data sets
  • The field: unsupervised and supervised learning, and reinforcement learning (RL is not discussed in detail). Regression and classification. Dimensionality reduction and clustering. Machine learning and official statistics, misclassification bias.
  • Data types: tabular, visual (images), textual, time series, audio, etc.
  • Neural networks, backpropagation, logistic sigmoid, (categorical) cross entropy, optimization, vanishing gradients, weight initialisation, ReLU (and others), softmax, loss and metrics, gradient descent flavours, learning rate, learning curves, regularisation
  • Practical tips: prioritizing problems, orthogonalisation, categorizing misclassifications, shortcut pipelines, loss and metrics considerations, multiple objectives, existing libraries, models and APIs.
  • Convolutional neural networks, residual layers, deconvolutions.
  • Lecture on models for text: TF-IDF and Word2Vec, recurrent networks, transformer networks, CTC loss
Expected Outcome

Practical skills to use scikit-learn and similar libraries to answer (research) questions with machine learning

Understanding of different machine learning models and neural networks

An overview of, and basic skills in, neural networks for text mining and computer vision

Training Methods
  • Lectures
  • Programming exercises
Required Reading: None
Suggested Reading

Suggestions for those who would like a structured overview of machine or deep learning. They are not required at all.

  • Python Machine Learning - Effective algorithms for practical machine learning and deep learning, Sebastian Raschka, Vahid Mirjalili
  • Deep Learning with Keras, Antonio Gullì, Sujit Pal (practical)
  • Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville (theoretical)

Interesting papers which improved the field of deep learning significantly:

  • Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot, Yoshua Bengio
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Sergey Ioffe, Christian Szegedy
  • Deep Residual Learning for Image Recognition, Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Required Preparation: You’ll be sent two exercises a few weeks in advance, so that problems with logging in to Google Colaboratory or with your browser can be resolved before the course.
Trainer(s)/Lecturer(s)

Herbert Kruitbosch (University of Groningen)

Marco Puts (Statistics Netherlands)

Practical Information
When: 19-21.03.2024
Duration: 3 days
Where: The Hague, Netherlands
Organiser: ICON-INSTITUT Public Sector GmbH
Application: via National Contact Point (deadline: 22.01.2024)