Workshop: Satellite Data Analysis and Machine Learning Classification with QGIS – Part 1
Would you like to improve your skills in satellite image processing using Machine Learning? Do you know the basics of GIS (geographic information system) or QGIS (an open-source application to view, edit and analyze geospatial data)? Then please feel free to join this workshop! It introduces how classification of satellite imagery can be done with QGIS by showing how to retrieve, process and classify satellite imagery, as well as how to assess the performance of machine learning algorithms through the error matrix and accuracy indexes.
The workshop involves two QGIS plugins: Semi-automatic Classification Plugin (SCP) and dzetsaka. SCP is used for the majority of preprocessing operations, such as retrieval of the Sentinel 2 imagery for an area of interest, DOS (Dark Object Subtraction) atmospheric correction, selection of specific bands for classification, creation of composites and computation of band algebra (e.g., the Normalized Difference Vegetation Index, NDVI). The dzetsaka plugin is used to detect and classify built-up areas, starting from the preprocessed satellite imagery, with the Gaussian Mixture Model, Random Forest and K-Nearest Neighbors machine learning algorithms.
Besides the two plugins, some core QGIS functionalities are included in the workshop for clipping satellite imagery and creating a vector file of training data. Lastly, the outcomes of the machine learning algorithms are compared with the global map of human settlements, GHS-BUILT (Sentinel-1), produced by the Joint Research Centre (JRC) of the European Commission, to assess their performance. Before being used for this assessment, GHS-BUILT (Sentinel-1) is adapted to the coordinate reference system, resolution, and classes of the classification outcomes. This adaptation involves many isolated operations (reprojection, tile merging, resampling, and reclassification). For this reason, the QGIS Graphical Modeler is introduced in the exercise, because it allows a chain of operations to be automated. Besides the adaptation of GHS-BUILT (Sentinel-1), the computation of the error matrix and accuracy indexes for each classification outcome is integrated into the Graphical Modeler too.
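As a rough illustration of the error matrix and accuracy indexes mentioned above, the following minimal sketch compares a classified map with a reference map pixel by pixel. It is not the workflow built in the Graphical Modeler; the function names, the class codes (0 = other, 1 = built-up) and the sample values are assumptions made for this example only.

```python
def error_matrix(reference, predicted, classes):
    """Build an error matrix: rows = classified class, columns = reference class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[pred]][index[ref]] += 1
    return matrix


def accuracy_indexes(matrix):
    """Overall, user's (per row) and producer's (per column) accuracy."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    users = [matrix[i][i] / row_totals[i] if row_totals[i] else None
             for i in range(n)]
    producers = [matrix[i][i] / col_totals[i] if col_totals[i] else None
                 for i in range(n)]
    return {"overall": correct / total, "users": users, "producers": producers}


# Hypothetical pixel values: 0 = other, 1 = built-up
reference = [1, 1, 1, 0, 0, 0, 0, 0]   # e.g. the adapted reference map
predicted = [1, 1, 0, 0, 0, 0, 1, 0]   # e.g. one classification outcome

matrix = error_matrix(reference, predicted, classes=[0, 1])
indexes = accuracy_indexes(matrix)
print(matrix)
print(indexes)
```

User's accuracy is computed along the rows (how reliable a classified pixel is), and producer's accuracy along the columns (how much of the reference class was found).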
Please see the two one-pager guidelines for the software requirements and the datasets for the exercises and download relevant software and datasets before the workshop.
- Information and Requirements for Workshop
- Instructions for Attendees (Updated 7 May 2021)
- Now with two links to download the data (Google drive; Zenodo) and FAQs:
- Can I use a different version of QGIS?
- How can I find version 3.10 on the QGIS website?
- How to check if I successfully installed scikit-learn?
- Slides set of Part 1, Satellite Data Analysis and Machine Learning Classification with QGIS (Posted 26 April 2021)
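For the scikit-learn question in the FAQs above, one common way to check an installation is to try importing the package and print its version. The sketch below is not taken from the attendee instructions; the helper name `installed_version` is hypothetical. It is most useful when run in the QGIS Python Console, so that it checks the Python interpreter QGIS itself uses.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


# scikit-learn is imported under the name "sklearn"
print("scikit-learn:", installed_version("sklearn"))
```

A printed version number means the import succeeded; `None` means scikit-learn is not available to that interpreter.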
The workshop has two parts:
- Part 1: 27 April 2021, 14:00 – 16:00 CEST
- (Optional) 13:00 – 13:45 CEST Demo session on how to install the required software for the workshop on Windows OS.
- 14:00 – 16:00 CEST Workshop Part 1
- Part 2: 11 May 2021, 14:00 – 16:00 CEST
- Presentation of panellists.
[09:00] Before we start
- Make sure you fulfill all the requirements to follow the workshop.
[10:00] QGIS Introduction
- Main graphical user interface: menu, toolbars, and layers panel (sorted according to the layers you are using).
- Display/hide panels and toolbars: under View → Panels or View → Toolbars you can select what to show in your interface.
[14:00] QGIS plugins
- Install QGIS plugins: Main bars → Plugins → Manage and Install Plugins…
- Semi-automatic Classification Plugin (SCP): used today.
- QuickMapServices (QMS): Will be used in the second part of the workshop to collect training data.
- dzetsaka: Classification tool (Gaussian mixture model, random forest and KNN algorithms) to be used in the second part of the workshop for classification.
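To give an idea of what a classifier such as KNN does with training data, here is a minimal pure-Python sketch of k-nearest-neighbours voting on pixel band values. It is an illustration only and not dzetsaka's implementation; the function name, the two-band samples and the class labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(pixel, training, k=3):
    """Return the majority label among the k training samples closest to pixel.

    training is a list of (band_values, label) pairs.
    """
    nearest = sorted(training, key=lambda sample: math.dist(pixel, sample[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training samples: (red, near-infrared) reflectance and a label
training = [
    ((0.25, 0.28), "built-up"),
    ((0.30, 0.32), "built-up"),
    ((0.27, 0.30), "built-up"),
    ((0.05, 0.45), "other"),
    ((0.04, 0.50), "other"),
    ((0.06, 0.40), "other"),
]

print(knn_classify((0.26, 0.29), training))
print(knn_classify((0.05, 0.48), training))
```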
[18:00] Practical part – Installing plugins
- Semi-automatic Classification Plugin (SCP)
[23:00] About Satellite Imagery
- Multispectral satellite imagery: sensors capture electromagnetic radiation from the Earth in different wavelength ranges (bands).
- Sentinel 2: 13 spectral bands (4 bands at 10 m, 6 bands at 20 m, 3 bands at 60 m spatial resolution). Frequent updates (every 5 days).
[27:00] Load Auxiliary Vector Data
- Load vector working area: a rectangular area of interest (ai Hashimiyya village); the vector file is named “working_area.shp”.
- Apply predefined vector style (red line with transparent background).
- Result: You can delimit an area of work.
[34:00] Practical Part – Application of vector
– Layer → add vector layer → copy path of the data → add
– Layers → right click → properties → select outline and style
[38:00] Start Using SCP plugin
- SCP graphical interface: SCP now is in the menu bar, toolbar and dock panel.
- SCP data provider: Using data from ESA Sentinel-2 from Copernicus Open Access Hub.
[42:00] SCP plugin – Image download
- Download satellite imagery using SCP: SCP → Download products → Login data, then insert your credentials for a data provider. If you do not have credentials, select Use alternative search for Sentinel-2. Then use the search tab to provide the information about the image to download.
- Important note: An L1C image is not corrected for atmospheric effects. To get a corrected image, L2A must be chosen, but it’s an older image (2018, Europe).
- You can correct L1C images with the SNAP tool, available at http://step.esa.int/main/
- Before downloading the image, you need to set some options in the search tab of the SCP plugin (unflag “Only preview layers”, unflag “Preprocess images”, flag “Load bands in QGIS”).
[60:00] Practical part – Set up data provider and download data.
[83:00] End of break
- Load previously downloaded image.
- Image consists of 12 jp2 files (one for each band), GML file (mask for clouds), XML files (metadata).
- Add jp2 layers to the project and you will see the working area and the images in grayscale.
[86:00] SCP plugin – Image Preprocessing
- Satellite images preprocessing: Using SCP → preprocessing → Sentinel-2 (satellite).
- Load the satellite bands and the metadata. Flag the DOS1 atmospheric correction option. Unflag the other preprocessing options.
- The 60 m bands are not preprocessed because of their low resolution. Therefore, remove bands 1, 9 and 10 and keep only the 20 m and 10 m bands.
- Load the preprocessed image: Layer → Add Layer → Add Raster Layer. Look for the files, add them and close the dialog. They will be displayed in grayscale.
- Remove the original layers from QGIS (their names start with L1C).
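The idea behind dark object subtraction can be sketched in a few lines. This is a deliberately simplified illustration and not SCP's DOS1 implementation: it assumes the band is already expressed as top-of-atmosphere reflectance, uses the band minimum as the dark object, and assumes that a truly dark surface reflects about 1%. The function name and sample values are hypothetical.

```python
def dos1_correct(toa_reflectance, dark_object_reflectance=0.01):
    """Subtract an estimated haze value from every valid pixel of one band.

    toa_reflectance is a flat list of pixel values; None marks no-data pixels.
    """
    valid = [v for v in toa_reflectance if v is not None]
    dark_value = min(valid)  # simplification: darkest pixel is the dark object
    # Anything above the assumed 1% dark-object reflectance is treated as haze
    haze = max(dark_value - dark_object_reflectance, 0.0)
    return [None if v is None else max(v - haze, 0.0) for v in toa_reflectance]


band = [0.05, 0.10, None, 0.30]   # hypothetical reflectance values
corrected = dos1_correct(band)
print(corrected)
```

The correction is applied band by band, since the haze contribution differs with wavelength.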
[89:00] Practical part – Load previously downloaded images and preprocess them
[103:00] SCP plugin – Band set
- Classification will be performed only on the preprocessed bands (i.e., those with 10 m and 20 m resolution).
- SCP: Band set. Arrange the bands in the order they should have in the output. You can sort them by wavelength or by band number (using the green arrows), making sure that band 8A comes after band 8.
- Check “Create virtual raster of band set” and click “Run” to save the virtual raster in .vrt (virtual raster) format.
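The band selection and ordering described above can be sketched as follows. The band names follow the Sentinel-2 file naming (B01 to B12, plus B8A); the helper name `band_sort_key` and the shuffled input list are assumptions for this example, and the sketch does not reproduce what SCP does internally.

```python
import re


def band_sort_key(name):
    """Sort key that places B8A directly after B08."""
    match = re.fullmatch(r"B(\d+)(A?)", name)
    if match is None:
        raise ValueError(f"Unexpected band name: {name}")
    return (int(match.group(1)), match.group(2))


# Bands 1, 9 and 10 have 60 m resolution and are left out of the band set
SIXTY_METRE_BANDS = {"B01", "B09", "B10"}

all_bands = ["B8A", "B12", "B02", "B01", "B11", "B03", "B10",
             "B04", "B09", "B05", "B06", "B07", "B08"]

band_set = sorted((b for b in all_bands if b not in SIXTY_METRE_BANDS),
                  key=band_sort_key)
print(band_set)
```

Sorting on the pair (number, suffix) is what keeps 8A after 8; a plain alphabetical sort would put B8A last.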
[110:00] QGIS core – Clip raster/image
- Clip the full raster to keep only the portion that falls within the working area; the full image is larger than needed, and a smaller raster reduces computation. Select the clipping extent based on the working_area vector. The result is saved as a .tif raster on your local disk.
- Styling a multispectral raster layer: Symbology → multiband color → [band 03 red; band 02 green; band 01 blue]. You will have the RGB image.
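Clipping by extent amounts to converting the bounding box of the working area into row and column ranges of the raster. The sketch below shows that conversion on a tiny grid; it is not the QGIS clipping tool. It assumes a north-up raster with square pixels, and the function name, coordinates and values are hypothetical.

```python
import math


def clip_to_extent(raster, origin_x, origin_y, pixel_size, extent):
    """Return the part of raster that overlaps extent.

    raster: list of rows, ordered north to south.
    origin_x, origin_y: map coordinates of the top-left corner.
    extent: (xmin, ymin, xmax, ymax) in the same coordinate system.
    """
    xmin, ymin, xmax, ymax = extent
    n_rows, n_cols = len(raster), len(raster[0])
    # Columns grow eastwards, rows grow southwards from the top-left origin
    col_start = max(math.floor((xmin - origin_x) / pixel_size), 0)
    col_stop = min(math.ceil((xmax - origin_x) / pixel_size), n_cols)
    row_start = max(math.floor((origin_y - ymax) / pixel_size), 0)
    row_stop = min(math.ceil((origin_y - ymin) / pixel_size), n_rows)
    return [row[col_start:col_stop] for row in raster[row_start:row_stop]]


# 4 x 4 raster with 10 m pixels, top-left corner at (0, 40)
raster = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]

clipped = clip_to_extent(raster, origin_x=0, origin_y=40, pixel_size=10,
                         extent=(10, 10, 30, 30))
print(clipped)
```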
[117:00] Practical part – Create virtual raster of band set and clip it to the working area
[128:00] SCP plugin – Band algebra
- This is a part you can try by yourself if you are interested.
- Computation of typical indexes: NDVI to quantify green vegetation.
- Algebra in QGIS: useful for analysing the information in an image.
- NDVI expression: you can identify grassland, forest, etc.
- NDVI visualization.
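NDVI is computed per pixel as (NIR - Red) / (NIR + Red); for Sentinel-2 the red band is band 4 and the near-infrared band is band 8. The following minimal sketch applies the expression to a few hypothetical reflectance pairs. The function name and the sample values are assumptions for this example.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel.

    Returns None when both bands are zero, where the ratio is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical (red, nir) reflectance pairs
pixels = [(0.05, 0.45),   # dense green vegetation
          (0.25, 0.25),   # bare or built-up surface
          (0.0, 0.0)]     # no data

values = [ndvi(red, nir) for red, nir in pixels]
print(values)
```

Values close to 1 indicate dense green vegetation, while values near 0 or below indicate surfaces with little or no vegetation.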
[132:00] Acknowledgements from Prof. Maria Brovelli to assistants
Speakers, Panelists and Moderators
MARIA ANTONIA BROVELLI, Professor of Geographic Information Systems and Digital Mapping, Politecnico di Milano
Degree with honors in Physics, PhD in Geodesy and Cartography. She is Professor of GIS at the Politecnico di Milano (PoliMI) and member of the School of Doctoral Studies in Data Science at “Roma La Sapienza”. From 1997 to 2010 she was the Director of the “Geomatics Lab” of PoliMI. From 2011 to 2016 she was the Vice Rector of PoliMI for the Como Campus.
She is the chair of ISPRS WG IV/4 “Collaborative crowdsourced cloud mapping (C3M)”; member of ESA ACEO (Advisory Committee of Earth Observation); co-chair of the United Nations Open GIS Initiative, Deputy-Chair of the UN-GGIM (Global Geospatial Information Management) Academic Network, mentor of the PoliMI Chapter of YouthMappers (PoliMappers). She is author of 108 scientific indexed publications and Guest Editor of 11 Special Issues.
Her research activity is in the field of geomatics. Her interests have been varied, starting from geodesy and radar-altimetry and moving later to GIS, webGIS, geospatial web platforms, VGI (Volunteered Geographic Information), Citizen Science and Big Geo Data. She participates in and leads research on these topics within the frameworks of both national and international projects and scientific networks. One of her main interests is Open Source GIS, where she plays a leading role worldwide.
GORICA BRATIC, Research Fellow, Politecnico di Milano
Eng. Gorica Bratic obtained her BSc degree in Environmental Engineering at the University of Novi Sad, Faculty of Technical Sciences, in 2015 and her MSc degree in Environmental and Geomatics Engineering at Politecnico di Milano in 2018. In June 2018 she joined the GEOlab team of Politecnico di Milano as a temporary research fellow, and in November 2018 she started her PhD in the Department of Civil and Environmental Engineering. Her main field of interest is inter-comparison and validation of global high-resolution land cover maps. Her research relies on Free and Open Source Software technologies.