Artificial intelligence: Machine and Statistical Learning |
Course Leader | Herbert Kruitbosch |
Target Group | This course is intended for NSI staff from any field who want to learn more about Machine and Statistical Learning (MSL) methods and techniques. MSL is concerned with algorithms that automatically improve their performance through 'learning'. It has emerged mainly from statistics and artificial intelligence, and has connections to a variety of related subjects, including computer science and pattern recognition. This course presents an overview of fundamental concepts, common techniques, and algorithms in MSL. It covers basic topics such as dimensionality reduction, classification and regression, and clustering, as well as more recent topics such as ensemble learning/boosting, support vector machines, and kernel methods. The course provides participants with the basic intuition behind modern MSL methods. |
Entry Qualifications | |
Objective(s) | Validation and reporting of results of machine learning methods.
Mathematical concepts of supervised and unsupervised machine and deep learning models, such as PCA, SVM, trees, ensembles, and neural networks.
Use of scikit-learn, matplotlib, pandas, tensorflow, and keras to design models and perform machine learning experiments.
Understanding of symbolic computation for backpropagation and gradient descent.
Model selection, hyper-parameter tuning, and practical considerations.
Understanding the 'lego bricks' of neural networks (deep learning) and their numerical issues, in particular vanishing gradients.
Use of pretrained models for text and vision applications with libraries like deeppavlov and detectron2.
|
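The model-design and hyper-parameter-tuning objectives above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not course material: the iris dataset, the SVM classifier, and the grid of C values are assumptions chosen purely for concreteness.

```python
# Minimal sketch: model selection and hyper-parameter tuning with
# scikit-learn, one of the libraries named in the objectives.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Pipeline: standardise features, then fit a support vector classifier.
pipe = make_pipeline(StandardScaler(), SVC())

# Hyper-parameter tuning via 5-fold cross-validated grid search.
grid = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)           # selected hyper-parameters
print(grid.score(X_test, y_test))  # accuracy on the held-out test set
```

Evaluating only on a held-out test set that played no role in the grid search is exactly the validation discipline the objectives refer to.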
Contents | The sessions consist of both basic methodology and practical exercises.
Fundamental concepts of MSL: supervised, unsupervised, reinforcement, and ensemble learning; maximum likelihood; loss functions; train, test, and validation sets/errors; bias-variance trade-off; model complexity, regularisation, and overfitting.
Dimensionality reduction, e.g., principal component analysis, linear discriminant analysis.
Validation and resampling, e.g., cross-validation, bootstrap, jackknife.
Regression and variable selection, e.g., linear regression, model selection, ridge and lasso.
Classification and clustering, e.g., Bayes classifiers, k-nearest neighbours, logistic regression, k-means, hierarchical clustering, mixture models.
Decision trees and ensembles, e.g., classification and regression trees, bagging, Random Forests; Bayesian networks.
Pattern recognition and anomaly detection, e.g., kernel methods and support vector machines.
Introduction to artificial intelligence, neural networks, and deep learning.
Practice using the R or Python language and packages.
|
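Several of the listed topics (dimensionality reduction with PCA, k-nearest-neighbours classification, and cross-validation) fit together naturally. The snippet below is a hedged sketch using scikit-learn's bundled digits dataset; the dataset, the 16-component projection, and k=5 are illustrative assumptions, not choices taken from the course.

```python
# Minimal sketch: PCA for dimensionality reduction, followed by
# cross-validated k-nearest-neighbours classification.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)

# Project the 64-dimensional digit images onto 16 principal components,
# then classify each image by its 5 nearest neighbours in that space.
pipe = make_pipeline(PCA(n_components=16),
                     KNeighborsClassifier(n_neighbors=5))

# 5-fold cross-validation estimates the out-of-sample accuracy.
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```

Wrapping PCA and the classifier in a single pipeline ensures the projection is re-fitted on each training fold, avoiding leakage from the validation fold.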
Expected Outcome | Practical skills in using scikit-learn and similar libraries to answer (research) questions with machine learning. Understanding of different machine learning models and neural networks. An overview of, and basic skills in, neural networks for text mining and computer vision. |
Training Methods | Lectures and programming exercises
|
Required Reading | None |
Suggested Reading | These suggestions are for those who would like a structured overview of machine or deep learning; they are not required.
Python Machine Learning: Effective Algorithms for Practical Machine Learning and Deep Learning, Sebastian Raschka, Vahid Mirjalili.
Deep Learning with Keras, Antonio Gullì, Sujit Pal (practical).
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville (theoretical).
Interesting papers that significantly advanced the field of deep learning:
Understanding the Difficulty of Training Deep Feedforward Neural Networks, Xavier Glorot, Yoshua Bengio.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Sergey Ioffe, Christian Szegedy.
Deep Residual Learning for Image Recognition, Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
|
Required Preparation | Participants must complete and send in two exercises a few weeks in advance, so that issues with logging in to Google Colaboratory and browser problems can be resolved before the course. |
Trainer(s)/ Lecturer(s) | Herbert Kruitbosch (University of Groningen) Marco Puts (Statistics Netherlands) |