The core of the predictor variables is formed by the EcoDes-DK15 rasterised lidar descriptors (Assmann et al. 2022) generated from the 2014/15 national airborne laser scanning campaign conducted by the Danish government.
From the 76 available EcoDes-DK15 layers (including auxiliary layers), we removed the date_stamp_xxx, point_count_xxx, point_source and building_proportion layers, as we deemed those non-informative for the task of predicting forest conservation value. We initially kept the sea and water mask layers to test subsetting the training data to land pixels only, but discarded the mask layers later in the analysis.
Furthermore, we removed the following descriptors: canopy_openness, point_count, normalized_z_mean, heat_load_index, openness_mean and twi. Their ecological meaning was conceptually redundant with other descriptors (vegetation_density, canopy_height, solar_radiation, openness_difference and ground water, respectively), and initial model runs indicated that these variables had low predictive power. We also removed the aspect variable because it was a very weak predictor. This makes sense conceptually, as aspect at a 10 m grain likely says little about whether a forest cell is of high conservation value (forest facing any cardinal direction could in principle be of high conservation value).
Finally, we removed all vegetation_proportion variables, as these showed low predictive power by themselves. However, to capture the vertical variability in the lidar point cloud, we calculated a foliage height diversity variable.
The final set of used EcoDes-DK15 variables is:
To capture the vertical variation in the forest canopy, we calculated the “foliage height diversity” (MacArthur and MacArthur 1961) from the EcoDes-DK15 point proportion descriptors. We followed the height bins used by Wilson (1974): 0 m – 1.5 m, 1.5 m – 9 m, and >9 m.
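As an illustration of the calculation described above, the following is a minimal sketch that computes foliage height diversity as a Shannon index over the proportion of returns in each of the three height bins. It assumes the natural logarithm and that the proportions have already been aggregated to the Wilson (1974) bins; the function name `foliage_height_diversity` and the input layout are hypothetical and not taken from the project code.

```python
import numpy as np

def foliage_height_diversity(proportions):
    """Shannon index over height bins (last axis = bins), per pixel.

    Assumption: natural log, and proportions are renormalised so that
    the bins of each pixel sum to 1.
    """
    p = np.asarray(proportions, dtype=float)
    total = p.sum(axis=-1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        p = p / total
        # Empty bins contribute nothing (0 * ln 0 is treated as 0).
        terms = np.where(p > 0, p * np.log(p), 0.0)
    fhd = -terms.sum(axis=-1)
    # Pixels without any returns in the three bins get NaN.
    return np.where(total[..., 0] > 0, fhd, np.nan)

# Three example pixels: even canopy, single layer, no returns.
example = foliage_height_diversity(
    [[1 / 3, 1 / 3, 1 / 3], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
)
```

With three bins the index ranges from 0 (all returns in one bin) to ln(3) (returns spread evenly across bins).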
As we expected that the most common tree type (broadleaf vs. coniferous) would play an important role in determining if and why a forest is of high or low conservation value, we included the tree type projections generated by Bjerreskov et al. (2021).
The authors used a multi-temporal Sentinel 1/2 data fusion (SAR and optical) approach to assign forest types in a binary classification (broadleaf vs. coniferous).
As both types are mutually exclusive, we discarded the “is coniferous” variable after one-hot encoding of the source data. The source data are currently not publicly available, but were kindly shared with us by Thomas Nord-Larsen (senior author on Bjerreskov et al. 2021).
Soil type and composition are an important indicator in the key for the paragraph 25 forests. Here we used the following three predictors to account for differences in the soils across Denmark:
These data were obtained from the Soilgrids 2.0 dataset (Poggio et al. 2021). The original data layers were queried using the geodata package (Hijmans, Ghosh, and Mandel 2021) and subset to the extent of Denmark. The original data have a grain size of 250 m and are in an “Interrupted_Goode_Homolosine” projection. We projected them to the EcoDes-DK grid with 10 m grain size (UTM32N) using nearest neighbour resampling.
Note that the nearest neighbour resampling strategy is conservative and makes no assumption about the spatial distribution of the variables when the 250 m dataset is resampled to the 10 m grid. However, the finer grid may give the wrong impression that we used higher-resolution predictor data than we actually did. Finally, the resampling will inevitably introduce some uncertainty where the 10 m grid and the original grid do not align.
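The point that nearest neighbour resampling adds no information can be illustrated with a simplified sketch. It assumes the coarse and fine grids are perfectly aligned and share a projection, so that each 250 m cell maps onto exactly 25 x 25 cells of 10 m; the actual workflow also involves a reprojection, which is not shown here. The function name `nearest_neighbour_refine` is hypothetical.

```python
import numpy as np

def nearest_neighbour_refine(coarse, factor):
    """Copy each coarse cell into a factor x factor block of fine cells.

    Assumption: aligned grids, so the nearest coarse cell centre is
    always that of the coarse cell containing the fine cell.
    """
    coarse = np.asarray(coarse)
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

coarse = np.array([[10.0, 20.0], [30.0, 40.0]])  # 2 x 2 cells of 250 m
fine = nearest_neighbour_refine(coarse, 25)      # 50 x 50 cells of 10 m
```

The fine raster contains exactly the same set of values as the coarse one, only repeated in blocks, which is why the 10 m soil layers should not be read as 10 m resolution information.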
As a water mask had originally been applied to these data, we had no predictor data in cases where a 250 m x 250 m pixel overlapped with a water body. This became a problem when extrapolating the models to the nationwide extent, as the finer grain size of our maps introduced more detailed shorelines. We therefore had 10 m x 10 m land pixels for which no soil data were available. To address this problem, we gap-filled the original 250 m x 250 m soil data. All pixels that were NA and had at least one neighbouring cell that was not NA were filled with the mean of all neighbouring cells that were not NA. The raster was then projected to the EcoDes-DK grid with 10 m grain size and only used for generating the nationwide forest conservation value maps from the trained models, but not for training the models themselves. Forest conservation value predictions close to some shores may therefore contain some error, but we are confident that this error is very small due to the inherently high autocorrelation of the soil variables.
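The gap-filling rule described above can be sketched as follows. This is a minimal illustration, assuming a single pass and an eight-cell (queen) neighbourhood; the text does not state which neighbourhood was used, and the function name `gap_fill_once` is hypothetical.

```python
import numpy as np

def gap_fill_once(raster):
    """Fill NaN cells that have at least one non-NaN neighbour with the
    mean of their non-NaN neighbours (assumption: 8-cell neighbourhood)."""
    r = np.asarray(raster, dtype=float)
    rows, cols = r.shape
    padded = np.pad(r, 1, constant_values=np.nan)
    # Stack the eight shifted copies of the raster, one per neighbour.
    neighbours = np.stack([
        padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    ])
    valid = ~np.isnan(neighbours)
    count = valid.sum(axis=0)
    total = np.where(valid, neighbours, 0.0).sum(axis=0)
    mean = np.divide(total, count,
                     out=np.full(r.shape, np.nan), where=count > 0)
    # Only original NaN cells with valid neighbours are changed.
    fill = np.isnan(r) & (count > 0)
    return np.where(fill, mean, r)

soil = np.array([[1.0, np.nan, np.nan, np.nan],
                 [3.0, np.nan, np.nan, np.nan]])
filled = gap_fill_once(soil)
```

Because the means are computed from the original values only, cells filled in this pass do not influence each other, and cells further than one pixel from valid data remain NA.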
To account for the wetness of the forest ground and the water availability to the plants, we used the summer near-surface ground water estimates by Koch et al. (2021).
To capture the spatial context around a pixel beyond the 10 m grid, we selected four key predictor variables and calculated their mean and variation (sd) for two window sizes of 110 m and 250 m around each pixel. We selected these window sizes as the best candidates based on variograms generated for all variables.
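A focal mean and standard deviation of this kind can be sketched as below. On the 10 m grid, a 110 m window corresponds to 11 x 11 pixels and a 250 m window to 25 x 25 pixels. The edge handling (NaN padding, so that border pixels use whatever neighbours exist) and the use of the sample standard deviation are assumptions, and the function name `focal_mean_sd` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def focal_mean_sd(raster, size=11):
    """Mean and sd within a size x size moving window centred on each pixel.

    Assumptions: odd window size, NaN padding at the raster edge,
    sample standard deviation (ddof=1).
    """
    r = np.asarray(raster, dtype=float)
    half = size // 2
    padded = np.pad(r, half, constant_values=np.nan)
    windows = sliding_window_view(padded, (size, size))
    mean = np.nanmean(windows, axis=(-2, -1))
    sd = np.nanstd(windows, axis=(-2, -1), ddof=1)
    return mean, sd

# Toy raster with a west-east gradient: values equal the column index.
gradient = np.tile(np.arange(15.0), (15, 1))
focal_mean, focal_sd = focal_mean_sd(gradient, size=11)
```

This approach materialises every window in memory, so for a nationwide 10 m raster it would need to be applied tile by tile or replaced by a dedicated focal function from a raster library.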
We conducted a collinearity analysis on the focal variables and reduced the variables in a step-wise selection process to the following final four focal variables included in the models:
Additional documentation of the selection process can be found in the focal variable selection document.
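For readers unfamiliar with step-wise collinearity reduction, the following sketch shows one common variant. It is not the procedure documented in the focal variable selection document: the pairwise Pearson correlation criterion, the threshold of 0.7 and the rule for choosing which variable to drop are all assumptions made for illustration, and the function name `stepwise_drop_collinear` is hypothetical.

```python
import numpy as np

def stepwise_drop_collinear(X, names, threshold=0.7):
    """Repeatedly drop one variable from the most correlated pair until
    no absolute pairwise correlation exceeds the threshold.

    Assumptions: Pearson correlation, threshold 0.7, and dropping the
    pair member with the higher mean correlation to all variables.
    """
    X = np.asarray(X, dtype=float)
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        corr = np.abs(np.corrcoef(X[:, keep], rowvar=False))
        np.fill_diagonal(corr, 0.0)
        if corr.max() <= threshold:
            break
        i, j = np.unravel_index(np.argmax(corr), corr.shape)
        drop = i if corr[i].mean() >= corr[j].mean() else j
        del keep[drop]
    return [names[k] for k in keep]

# Synthetic example: "b" is nearly a copy of "a", "c" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.01 * rng.normal(size=500)
c = rng.normal(size=500)
selected = stepwise_drop_collinear(np.column_stack([a, b, c]), ["a", "b", "c"])
```

In the synthetic example, one of the two near-duplicate variables is removed and the independent variable is retained.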
Here is an overview table of the final predictor data sources.
Predictor | Source Dataset | Ecological Meaning |
---|---|---|
amplitude_mean | EcoDes-DK15 | Quality of lidar signal reflected (proxy of biomass). |
amplitude_sd | EcoDes-DK15 | Variation in quality of lidar signal reflected within 10 m pixel (proxy of variation in biomass). |
canopy_height | EcoDes-DK15 | Lidar estimator of canopy height (95-percentile of height distribution of all vegetation points in 10 m pixel). |
canopy_height_sd_110m | EcoDes-DK15 | Variation in lidar estimator of canopy height within 110 m focal window (11 x 11 pixels). |
Clay_utm32_10m | Poggio et al. 2021 | Estimated percentage clay content of soil (250 m resolution downscaled to 10 m). |
dtm_10m | EcoDes-DK15 | Terrain height above sea level. |
dtm_10m_sd_110m | EcoDes-DK15 | Variation in terrain height above sea level within 110 m focal window (11 x 11 pixels). |
foliage_height_diversity | EcoDes-DK15 | Foliage height diversity (MacArthur and MacArthur 1961) based on height bins by Wilson (1974). |
normalized_z_sd | EcoDes-DK15 | Estimated variation in canopy height within 10 m pixel. |
ns_groundwater_summer_sd_110m | Koch et al. 2021 | Variation in the estimate of depth of near-surface groundwater during an average summer within a 110 m focal window (11 x 11 pixels). |
ns_groundwater_summer_utm32_10m | Koch et al. 2021 | Estimate of depth of near-surface groundwater during an average summer. |
openness_difference | EcoDes-DK15 | Presence of linear features in the terrain (valleys, ridges etc.) based on a 50 m search radius. |
Sand_utm32_10m | Poggio et al. 2021 | Estimated percentage sand content of soil (250 m resolution downscaled to 10 m). |
slope | EcoDes-DK15 | Terrain slope at 10 m |
Soc_utm32_10m | Poggio et al. 2021 | Estimated percentage soil organic carbon content of soil (250 m resolution downscaled to 10 m). |
solar_radiation | EcoDes-DK15 | Annual incident solar radiation based on terrain model (aspect and slope). |
treetype_bjer_dec | Bjerreskov et al. 2021 | Deciduous or coniferous forest. |
vegetation_density | EcoDes-DK15 | Density of vegetation points in 10 m lidar pixel. |
vegetation_density_sd_110m | EcoDes-DK15 | Variation of density of vegetation points amongst pixels within 110 m window (11 x 11 pixels). |