Course overview and motivation
- Are you working with geospatial data and running close to the limits of your own computing environment?
- Are you curious about how you can take your geospatial data processing and analysis to the next level?
- Or maybe you have been using a supercomputer already, but would like to make sure you are getting the most out of it?
→ This intensive two-day course is intended for you!
In this course we will learn the basics of geocomputing on a supercomputer through a combination of lectures and hands-on activities. The main focus of the course is the Puhti supercomputer, where all hands-on exercises will be done. The CSC services discussed in this course are free of charge for academic research, education and training purposes for Finnish higher education institutions and state research institutes (subsidized by the Ministry of Education and Culture, Finland).
Most of the course content also applies to the LUMI supercomputer, which is available for academic users and companies.
The course is meant both for academic researchers planning to use the Puhti supercomputer and for data analysts from private companies planning to use LUMI.
If you are still unsure whether supercomputers could benefit you, see the CSC geocomputing page.
Course materials will be published here: https://csc-training.github.io/geocomputing_course/
Using supercomputers requires the following technical skills:
- Domain knowledge: spatial analysis tools and data (prerequisite)
- Linux basic commands (prerequisite)
- Supercomputer basics - this course is mainly about this
- Scripting skills, including how to write parallel scripts, in one of these: Python, R, bash, Julia, MATLAB etc. (prerequisite)
- In this course we only briefly cover parallelizing Python and R code.
Learning outcomes
After the course, participants should have the skills and knowledge needed to start using CSC's supercomputer Puhti for their spatial analysis and spatial data processing tasks.
In detail, participants will learn:
- How to get an account and access to Puhti (as part of the prerequisites).
- How to connect to a supercomputer, and where to store your data (Allas).
- How to use the modules and the batch job system.
- How to install your own software on a supercomputer (Tykky).
- How to run your R or Python scripts or GDAL commands on one or several cores.
- How to use QGIS and other pre-installed GIS software via the Puhti web interface.
- How to get help.
Preliminary schedule
Day 1
09:00 - 10:15 Welcome, Practicalities and Introduction to CSC
10:15 - 10:45 Coffee break
10:45 - 12:00 Introduction to supercomputers and Puhti web interface
12:00 - 13:00 Lunch break
13:00 - 14:15 Running jobs on Puhti, GIS tools and data in Puhti
14:15 - 14:45 Coffee break
14:45 - 16:00 Command line tools exercise with GDAL
Day 2
09:00 - 10:15 Python exercise
10:15 - 10:45 Coffee break
10:45 - 12:00 Parallelization, high throughput, installing own software, adapting own scripts to the supercomputer
12:00 - 13:00 Lunch break
13:00 - 14:00 Data storage and moving data, Allas
14:00 - 14:15 Wrap-up, Q&A
14:15 - 14:45 Coffee break
14:45 - 16:00 R exercise (optional)
Prerequisites
To make this course as enjoyable as possible for you and to make sure you can get something out of it, we expect you to know about the following:
- General understanding of geoinformatics, vector and raster data, coordinate systems.
- General understanding of either Python, R or use of command line tools, e.g. GDAL, PDAL, ...
- Basic Unix commands (know how to use these commands in a terminal): cd, ls, mv, cp, rm, chmod, less, tail, echo, mkdir, pwd.
- Here are some resources to acquire these skills:
- UNIX tutorial for beginners (the first two topics are a good start, try also some editor)
- Basic Linux Commands 10 min tutorial video (sit back and watch)
- CSC Linux Cheat Sheet (one page summary of the most important Linux commands – handy to have near you during the course)
Practical information
This course is offered free of charge, but registration is required (deadline: 1.10). You can choose to attend the course at the CSC office in Espoo or remotely. The course does not include any catering, only coffee/tea. Several lunch restaurants are nearby.
Participants at the CSC office are provided with a training PC. Online participants need their own computer with Zoom; browser-based Zoom should be enough. Two screens are highly recommended for online participation.
All hands-on activities of this course will be carried out with the web interface of CSC's supercomputer Puhti, which you can access via your favorite web browser. For this, you do not need any additional software installed on your own computer.
Using Puhti requires a CSC account. If you do not yet have an account for CSC's services, further instructions for applying for one will be provided a week before the course.
If you later find out that you cannot attend the course, please let us know by sending an email to event-support@csc.fi so that people on the waitlist can fill your spot.