DK Forest LiDAR - Predictor Data Overview

Predictor variables and selection

EcoDes-DK15 descriptors

The core of the predictor variables is formed by the EcoDes-DK15 rasterised lidar descriptors (Assmann et al. 2022) generated from the 2014/15 national airborne laser scanning campaign conducted by the Danish government.

From the 76 available EcoDes-DK15 layers (incl. auxiliary layers), we removed the date_stamp_xxx, point_count_xxx, point_source and building_proportion layers as we deemed those non-informative for the task of predicting forest conservation value. We kept the sea and water mask layers to try out sub-setting of the training data to make sure only land pixels are included, but discarded the mask layers later in the analysis.

Furthermore, we removed the following descriptors: canopy_openness, point_count, normalized_z_mean, heat_load_index, openness_mean, twi - as the ecological meaning of these was conceptually redundant with other descriptors (vegetation_density, canopy_height, solar_radiation, openness_difference and ground water respectively) and initial model runs indicated that these variables had a low predictive power. We also removed the aspect variable because it was a very weak predictor. This makes sense conceptually as the aspect at 10 m likely has little meaning on whether a forest cell is of high conservation value or not (all cardinal directions would theoretically be expected to be of high conservation value).

Finally, we removed all vegetation_proportion variables. These variables demonstrated a low predictive power by themselves. However, to capture the vertical variability in the lidar point cloud we calculated a foliage height diversity variable.

The final set of used EcoDes-DK15 variables is:

amplitude_mean
amplitude_sd
canopy_height
dtm_10m
normalized_z_sd
openness_difference
slope
solar_radiation
vegetation_density

Foliage height diversity

To capture the vertical variation in the forest canopy we calcualted the “foliage height diversity” (MacArthur and MacArthur 1961) from the EcoDes-DK15 point proportion descriptors We followed the height bins used by Wilson (1974): 0 m – 1.5 m, 1.5 m – 9 m, and >9 m.

foliage_height_diversity

Tree type predictor

As we expected that most common tree type (broadleaf vs. coniferous) would play an important role in determining if and why a forest is of high or low conservation value, we included the tree type projections generated by Bjerreskov et al. (2021).

The authors used a multi-temporal Sentinel 1/2 data fusion (SAR and optical) approach to assign forest types in a binary classification (broadleaf vs. coniferous).

As both types are mutually exclusive we discarded the “is confierous” variable after one-hot encoding of the source data. The source data is currently not publcily avialable, but was kindly shared with us by Thomas Nord-Larsen (senior author on Bjerreskov et al. 2021).

treetype_bjer_dec

Soil predictors

Clay, sand and organic carbon content of soil

Soil type and composition are an important indicator in the key for the paragraph 25 forests. Here we used the following three predictors to account for differences in the soils across Denmark:

Clay_utm32_10m
Sand_utm32_10m
Soc_utm32_10m

These data were obtained from the Soilgrids 2.0 dataset (Poggio et al. 2021). The original data layers were queried using the geodata package (Hijmans, Ghosh, and Mandel 2021) and subset to the extent of Denmark. The original data have a grain size of 250 m and are in a “Interrupted_Goode_Homolosine” projection. We projected them to the EcoDes-DK grid with 10 m grain size (UTM32N) using nearest neighbor resampling.

Note that the nearest neighbour resampling strategy is conservative and makes no assumption about the spatial distribution of the variables during the downsampling of the 250 m dataset. However, the downsampling may give the wrong impression that we have used higher-resolution predictor data than we actually have. Finally, the resampling will inevitably introduce some uncertainties where the downsampled grid and the orignal grid not align.

As a water mask had originally been applied to this data, we had no predictor data in cases where a 250 m x 250 m pixel overlapped with a water body. This became a problem when extrapolating the models to the nationwide extent, as the finer grain size of our maps introduced more detailed shore lines. We therefore had 10 m x 10 m land pixels for which no soil data was available. To address this problem we gap-filled the original 250 m x 250 m soil data. All pixels that were NA and had at least one neighbouring cell that was not NA were filled with the mean of all neighbouring cells that were not NA. The raster was then projected to the EcoDEs-DK grid with 10 m grain size and only used for generating the nationwide forest conservation value maps from the trained models, but not for training of the models themselves. Forest conse predictions close to some shores may therefore contain some error, but we are confident that this error is very small due to the inherently high autocorrelation of the soil variables.

Water availability

To account for the wetness of the forest ground and the water availability to the plants we use the summer near-surface ground water estimates by Koch et al. 2021.

ns_groundwater_summer

Focal variables

To capture the spacial context around a pixel beyond the 10 m grid, we selected four key predictor variables and calculated their mean and variation (sd) for two window sizes of 110 m and 250 m around each pixel. We selected these window sizes as the best candidates based on variograms generated for all variables.

We conducted a collinearity analysis on the focal variables and reduced the variables in a step-wise selection process to the following final four focal variables included in the models:

dtm_10m_sd_110m
canopy_height_sd_110m
vegetation_density_sd_110m
ns_groundwater_summer_sd_110m

Additional documentation of the selection process can be found in the focal variable selection document.

Overview table final predictor data sources

Here is an overview table of the final predictor data sources.

Predictor	Source Dataset	Ecological Meaning
amplitude_mean	EcoDes-DK15	Quality of lidar signal reflected (proxy of biomass).
amplitude_sd	EcoDes-DK15	Variation in quality of lidar signal reflected within 10 m pixel (proxy of variation in biomass).
canopy_height	EcoDes-DK15	Lidar estimator of canopy height (95-percentile of height distribution of all vegetation points in 10 m pixel).
canopy_height_sd_110m	EcoDes-DK15	Variation in lidar estimator of canopy height within 110 m focal window (11 x 11 pixels).
Clay_utm32_10m	Poggio et al. 2021	Estimated percentage clay content of soil (250 m resolution downscaled to 10 m).
dtm_10m	EcoDes-DK15	Terrain height above sea level.
dtm_10m_sd_110m	EcoDes-DK15	Variation in terrain height above sea level within 110 m focal window (11 x 11 pixels).
foliage_height_diversity	EcoDes-DK15	Foliage height diversity MacArthur and MacArthur (1979) based on height bins by Wilson (1974)
normalized_z_sd	EcoDes-DK15	Estimated variation in canopy height within 10 m pixel.
ns_groundwater_summer_sd_110m	Koch et al. 2021	Estimate of depth of near-surface groundwater during an average summer.
ns_groundwater_summer_utm32_10m	Koch et al. 2021	Variation in the estimate of depth of near-surface groundwater during an average summer within a 110 m focal window (11 x 11 pixels).
openness_difference	EcoDes-DK15	Presence of linear features in the terrain (valleys, ridges etc.) based on a 50 m search radius.
Sand_utm32_10m	Poggio et al. 2021	Estimated percentage sand content of soil (250 m resolution downscaled to 10 m).
slope	EcoDes-DK15	Terrain slope at 10 m
Soc_utm32_10m	Poggio et al. 2021	Estimated percentage soil organic carbon content of soil (250 m resolution downscaled to 10 m).
solar_radiation	EcoDes-DK15	Annual incident solar radiation based on terrain model (aspect and slope).
treetype_bjer_dec	Bjerreskov et al. 2021	Decidous or coniferous forest.
vegetation_density	EcoDes-DK15	Denisty of vegetation points in 10 m lidar pixel.
vegetation_density_sd_110m	EcoDes-DK15	Variation of density of vegeation points amongst pixels within 110 m window (11 x 11 pixels).

References

Assmann, Jakob J., Jesper E. Moeslund, Urs A. Treier, and Signe Normand. “EcoDes-DK15: High-resolution ecological descriptors of vegetation and terrain derived from Denmark’s national airborne laser scanning data set.” Earth System Science Data Discussions (2021): 1-32. -Bjerreskov, K. S., Nord-Larsen, T., and Fensholt, R.: Classification of Nemoral Forests with Fusion of Multi-Temporal Sentinel-1 and 2 Data, 13, 950, https://doi.org/10.3390/rs13050950, 2021.
Hijmans, Robert J., Aniruddha Ghosh, and Alex Mandel. 2021. Geodata: Download Geographic Data. https://CRAN.R-project.org/package=geodata. -Koch, J., Gotfredsen, J., Schneider, R., Troldborg, L., Stisen, S., and Henriksen, H. J.: High Resolution Water Table Modeling of the Shallow Groundwater Using a Knowledge-Guided Gradient Boosting Decision Tree Model, 3, 2021.
MacArthur, R. H., & MacArthur, J. W. (1961). On Bird Species Diversity. Ecology, 42(3), 594–598. https://doi.org/10.2307/1932254
Poggio, Laura, Luis M De Sousa, Niels H Batjes, Gerard Heuvelink, Bas Kempen, Eloi Ribeiro, and David Rossiter. 2021. “SoilGrids 2.0: Producing Soil Information for the Globe with Quantified Spatial Uncertainty.” Soil 7 (1): 217–40.
Willson, M. F. (1974). Avian Community Organization and Habitat Structure. Ecology, 55(5), 1017–1029.