Practical machine learning for spatial data
This course gives a practical introduction to machine learning for spatial data, covering both shallow learning and deep learning models, especially convolutional neural networks (CNNs).
The course consists of lectures and hands-on exercises. The exercises use different Python libraries: scikit-learn for the shallow learning exercises, run in Notebooks or on local PCs, and Keras and solaris with PyTorch for the deep learning exercises, run on Puhti-AI.
After the course, participants should have the skills and knowledge needed to begin applying machine learning and deep learning to different tasks, and to use the GPU resources available at CSC for training and deploying their own neural networks.
Prerequisites:
- Basics of geoinformatics: data types, formats.
- Basics of Python. The course includes a fair amount of reading and writing Python code, so you should be able to follow Python syntax. If you need to refresh your Python skills, you can go through the materials of the Helsinki University GeoPython course.
- Very basic Linux commands: cd, ls, mv, cp, rm, chmod, less, tail, echo, mkdir, pwd. If these are unfamiliar, take a look at, for example, the first two modules of LinuxSurvival.
The course is similar to the Practical machine learning for spatial data course held in autumn 2019.
Lecture: Introduction to machine learning
Lecture: Introduction to exercises, preparing spatial data for machine learning
Exercise 1: Preparing vector data for regression
Exercise 2: Preparing raster data and labels for clustering and classification
Lecture: Shallow machine learning models
Exercise 3: Shallow regression with scikit-learn
Exercise 4: Image segmentation using k-means with scikit-learn
Exercise 5: Image classification using shallow classifiers, grid search with scikit-learn
Lecture: Introduction to deep learning models
Lecture: Fully connected neural networks
Lecture: Puhti GPUs and batch jobs
Exercise 6: Fully connected regressor with Keras
Exercise 7: Fully connected classifier with Keras
Lecture: Convolutional neural networks (CNN)
Exercise 8: CNN-based image segmentation with Keras
Lecture: GIS software supporting machine learning for spatial data
Lecture: Introduction to solaris
Exercise 9: CNN-based image segmentation with solaris
Time reserved for delays in the timetable.
We will also have short breaks during the morning and afternoon sessions.
Course exercise materials (under development): https://github.com/csc-training/geocomputing/tree/master/machineLearning
Technical requirements for participants' local computers:
- For working with Puhti:
  - On Windows: PuTTY and WinSCP/FileZilla or some other similar tool.
  - On Mac and Linux: FileZilla, if you are not familiar with moving files with scp.
- ArcGIS, QGIS or some other GIS tool for viewing the files (GeoTIFF, JPEG2000, Shapefile, GeoPackage).
- Spyder, PyCharm or some other Python IDE, or Notepad++ or some other text editor. For Windows users it is important that the tool can change the end-of-line character type from Windows to Linux. (Basic Notepad and Word are not suitable.)
- If you want to do the first-day exercises locally rather than with Notebooks: Python with Spyder or some other Python IDE. The required Python packages are listed here: https://github.com/csc-training/geocomputing/blob/master/machineLearning/gis.yml Conda installation is warmly recommended.
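If your editor cannot change the end-of-line character type, the conversion from Windows to Linux line endings mentioned above can also be done with a few lines of Python. The following is only a minimal sketch: the function names `to_unix_line_endings` and `convert_file` and the file name `my_batch_job.sh` are hypothetical and not part of the course materials.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Return `data` with Windows (CRLF) and lone CR line endings changed to LF."""
    # Replace CRLF first so that a CRLF pair does not become two LF characters.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place using Linux (LF) line endings."""
    # Work in binary mode so Python does not translate line endings itself.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(to_unix_line_endings(data))


if __name__ == "__main__":
    import os

    # Hypothetical file name: replace with a script you plan to copy to Puhti.
    example_path = "my_batch_job.sh"
    if os.path.exists(example_path):
        convert_file(example_path)
```

Run this on the Windows machine before copying scripts to Puhti. Files that already use Linux line endings are left unchanged.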