TAIGA: A Novel Dataset for Multitask Learning of Continuous and Categorical Forest Variables From Hyperspectral Imagery

Matti Mõttus, Phu Pham, Eelis Halme, Matthieu Molinier, Hai Cu, Jorma Laaksonen

Research output: Contribution to journal › Article › Scientific › peer-review

5 Citations (Scopus)


The spectral and spatial resolutions of modern optical Earth observation data are continuously increasing. To fully utilize the data, integrate them with other information sources, and create applications relevant to real-world problems, extensive training data are required. We present TAIGA, an open dataset including continuous and categorical forestry data, accompanied by airborne hyperspectral imagery with a pixel size of 0.7 m. The dataset contains over 70 million labeled pixels belonging to more than 600 forest stands. To establish a baseline on the TAIGA dataset for multitask learning, we trained and validated a convolutional neural network to simultaneously retrieve 13 forest variables. Due to the size of the imagery, the training and testing sets were independent, with strictly no overlap for patches up to 45 × 45 pixels. Our retrieval results show that including both spectral and textural information improves the accuracy of mapping key boreal forest structural characteristics, compared with an earlier study that used only spectral information from the same image. TAIGA responds to the increased availability of hyperspectral and very high resolution imagery, and includes the forest variables relevant to forestry and environmental applications. We propose the dataset as a new benchmark for spatial-spectral methods that overcomes the limitations of widely used small-scale hyperspectral datasets.
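The no-overlap condition on training and testing patches can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, and it assumes square, axis-aligned patches centred on labeled pixels given as (row, column) coordinates.

```python
def patches_overlap(center_a, center_b, patch_size=45):
    """Return True if two square patches centred on the given
    (row, col) pixels share at least one pixel.

    Hypothetical helper: assumes both patches are axis-aligned
    and have the same side length `patch_size` (in pixels).
    """
    d_row = abs(center_a[0] - center_b[0])
    d_col = abs(center_a[1] - center_b[1])
    # Two equally sized squares overlap only when their centres are
    # closer than one patch width along BOTH axes.
    return d_row < patch_size and d_col < patch_size


def split_is_independent(train_centers, test_centers, patch_size=45):
    """Return True if no training patch overlaps any testing patch.

    Brute-force pairwise check, adequate for illustration only.
    """
    return not any(
        patches_overlap(a, b, patch_size)
        for a in train_centers
        for b in test_centers
    )


if __name__ == "__main__":
    train = [(0, 0), (10, 10)]
    test = [(100, 100)]
    print(split_is_independent(train, test))
```

At the stated pixel size of 0.7 m, a 45-pixel patch spans about 31.5 m on the ground, so under these assumptions the centres of a training patch and a testing patch must be at least 45 pixels apart along one image axis.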

Original language: English
Article number: 5521711
Journal: IEEE Transactions on Geoscience and Remote Sensing
Publication status: Published - 7 Jan 2022
MoE publication type: A1 Journal article-refereed


  • boreal forest
  • convolutional neural networks
  • Data models
  • Deep learning
  • Forestry
  • Hyperspectral imaging
  • multitask learning
  • Soil
  • Spatial resolution
  • Training
  • hyperspectral imaging (HSI)

