Predictor variables and selection

EcoDes-DK15 descriptors

The core of the predictor variables is formed by the EcoDes-DK15 rasterised lidar descriptors (Assmann et al. 2022) generated from the 2014/15 national airborne laser scanning campaign conducted by the Danish government.

From the 76 available EcoDes-DK15 layers (incl. auxiliary layers), we removed the date_stamp_xxx, point_count_xxx, point_source and building_proportion layers, as we deemed them non-informative for the task of predicting forest conservation value. We initially kept the sea and water mask layers to test subsetting the training data to land pixels only, but discarded the mask layers later in the analysis.

Furthermore, we removed the following descriptors: canopy_openness, point_count, normalized_z_mean, heat_load_index, openness_mean and twi. Their ecological meaning was conceptually redundant with other descriptors (vegetation_density, canopy_height, solar_radiation, openness_difference and groundwater, respectively), and initial model runs indicated that these variables had low predictive power. We also removed the aspect variable because it was a very weak predictor. This makes sense conceptually, as the aspect at 10 m likely has little bearing on whether a forest cell is of high conservation value or not (in principle, forests facing any cardinal direction could be of high conservation value).

Finally, we removed all vegetation_proportion variables, as they showed low predictive power by themselves. To still capture the vertical variability in the lidar point cloud, we instead calculated a foliage height diversity variable.

The final set of used EcoDes-DK15 variables is:

  • amplitude_mean
  • amplitude_sd
  • canopy_height
  • dtm_10m
  • normalized_z_sd
  • openness_difference
  • slope
  • solar_radiation
  • vegetation_density

Foliage height diversity

To capture the vertical variation in the forest canopy, we calculated the “foliage height diversity” (MacArthur and MacArthur 1961) from the EcoDes-DK15 point proportion descriptors. We followed the height bins used by Wilson (1974): 0 m – 1.5 m, 1.5 m – 9 m, and >9 m.

  • foliage_height_diversity
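
As an illustration, foliage height diversity is commonly computed as the Shannon index over the proportion of vegetation points in each height bin. The following is a minimal sketch, not the code used in the analysis. It assumes that the EcoDes-DK15 proportion layers have already been aggregated into the three bins above, that the natural logarithm is used, and that pixels without vegetation points receive a value of 0. The function name is hypothetical.

```python
import numpy as np


def foliage_height_diversity(proportions):
    """Shannon index over height-bin proportions.

    proportions: array-like of shape (n_bins,) or (n_bins, rows, cols),
    holding the share of vegetation points per height bin (assumed input).
    """
    p = np.asarray(proportions, dtype=float)
    total = p.sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Renormalise so the bins sum to 1; empty pixels become all zeros.
        p = np.where(total > 0, p / total, 0.0)
        # Treat 0 * log(0) as 0, the usual convention for the Shannon index.
        terms = np.where(p > 0, p * np.log(p), 0.0)
    return -terms.sum(axis=0)


# Hypothetical pixel: 20 % of points below 1.5 m, 30 % at 1.5-9 m, 50 % above 9 m.
fhd = foliage_height_diversity([0.2, 0.3, 0.5])
```

With three bins the index ranges from 0 (all points in one bin) to ln 3 (points spread evenly across the bins).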

Tree type predictor

As we expected that the most common tree type (broadleaf vs. coniferous) would play an important role in determining if and why a forest is of high or low conservation value, we included the tree type projections generated by Bjerreskov et al. (2021).

The authors used a multi-temporal Sentinel 1/2 data fusion (SAR and optical) approach to assign forest types in a binary classification (broadleaf vs. coniferous).

As both types are mutually exclusive, we discarded the “is coniferous” variable after one-hot encoding of the source data. The source data is currently not publicly available, but was kindly shared with us by Thomas Nord-Larsen (senior author on Bjerreskov et al. 2021).

  • treetype_bjer_dec

Soil predictors

Clay, sand and organic carbon content of soil

Soil type and composition are important indicators in the key for the paragraph 25 forests. Here we used the following three predictors to account for differences in the soils across Denmark:

  • Clay_utm32_10m
  • Sand_utm32_10m
  • Soc_utm32_10m

These data were obtained from the SoilGrids 2.0 dataset (Poggio et al. 2021). The original data layers were queried using the geodata package (Hijmans, Ghosh, and Mandel 2021) and subset to the extent of Denmark. The original data have a grain size of 250 m and are in an “Interrupted_Goode_Homolosine” projection. We projected them to the EcoDes-DK grid with 10 m grain size (UTM32N) using nearest neighbour resampling.

Note that the nearest neighbour resampling strategy is conservative and makes no assumption about the spatial distribution of the variables when downscaling the 250 m dataset to the 10 m grid. However, the downscaling may give the wrong impression that we have used higher-resolution predictor data than we actually have. Finally, the resampling will inevitably introduce some uncertainties where the 10 m grid and the original grid do not align.
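
To illustrate why nearest neighbour resampling adds no information, the sketch below refines a coarse grid by simply repeating each cell. It is a simplification under stated assumptions: the two grids are assumed to be perfectly aligned and in the same projection, whereas the actual workflow also involved a reprojection to UTM32N. The function name and the values are hypothetical.

```python
import numpy as np


def nearest_neighbour_refine(coarse, factor):
    """Repeat every coarse cell factor x factor times (aligned grids assumed)."""
    coarse = np.asarray(coarse)
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


# Made-up clay percentages on a 2 x 2 grid of 250 m cells.
coarse = np.array([[10.0, 20.0],
                   [30.0, 40.0]])

# 250 m / 10 m = 25 fine cells per coarse cell along each axis.
fine = nearest_neighbour_refine(coarse, 25)
```

The refined grid has 50 x 50 cells but still contains only the four original values, which is why the 10 m soil layers should be read as 250 m information.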

As a water mask had originally been applied to this data, we had no predictor data in cases where a 250 m x 250 m pixel overlapped with a water body. This became a problem when extrapolating the models to the nationwide extent, as the finer grain size of our maps introduced more detailed shorelines. We therefore had 10 m x 10 m land pixels for which no soil data were available. To address this problem we gap-filled the original 250 m x 250 m soil data. All pixels that were NA and had at least one neighbouring cell that was not NA were filled with the mean of all neighbouring cells that were not NA. The raster was then projected to the EcoDes-DK grid with 10 m grain size and only used for generating the nationwide forest conservation value maps from the trained models, but not for training of the models themselves. Forest conservation value predictions close to some shores may therefore contain some error, but we are confident that this error is very small due to the inherently high autocorrelation of the soil variables.
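
The gap-filling rule described above can be sketched as follows. This is an illustrative reimplementation rather than the code used in the analysis. It assumes that “neighbouring cells” means the eight cells of a 3 x 3 window and that a single pass is applied to the original values. NA is represented as NaN, and the function name is hypothetical.

```python
import numpy as np


def fill_na_from_neighbours(raster):
    """Single pass: a NaN cell with at least one non-NaN cell among its
    eight neighbours (assumed window) gets the mean of those neighbours."""
    r = np.asarray(raster, dtype=float)
    rows, cols = r.shape
    # Pad with NaN so that edge cells simply have fewer valid neighbours.
    padded = np.pad(r, 1, constant_values=np.nan)
    # Stack the eight shifted copies of the raster, one per neighbour position.
    neighbours = np.stack([
        padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    ])
    valid = ~np.isnan(neighbours)
    n_valid = valid.sum(axis=0)
    neighbour_sum = np.where(valid, neighbours, 0.0).sum(axis=0)
    # Only cells that are NaN and have at least one valid neighbour are filled.
    fillable = np.isnan(r) & (n_valid > 0)
    filled = r.copy()
    filled[fillable] = neighbour_sum[fillable] / n_valid[fillable]
    return filled


# Made-up 250 m soil values; the two right-hand columns are masked as water.
soil = np.array([[1.0, np.nan, np.nan],
                 [3.0, np.nan, np.nan],
                 [5.0, np.nan, np.nan]])
filled = fill_na_from_neighbours(soil)
```

In this example only the middle column is filled, because the cells in the right-hand column have no non-NaN neighbour in the original raster.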

Water availability

To account for the wetness of the forest ground and the water availability to the plants, we used the summer near-surface groundwater estimates by Koch et al. (2021).

  • ns_groundwater_summer

Focal variables

To capture the spatial context around a pixel beyond the 10 m grid, we selected four key predictor variables and calculated their mean and variation (sd) for two window sizes of 110 m and 250 m around each pixel. We selected these window sizes as the best candidates based on variograms generated for all variables.
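
A focal mean and standard deviation of this kind can be sketched as below. This is a minimal illustration and not the implementation used in the analysis. It assumes a square window centred on each pixel, NaN padding at the raster edges so that edge windows use only the available cells, and the population standard deviation (ddof = 0). The function name is hypothetical, and the approach is memory-hungry, so it is meant for small arrays only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def focal_mean_sd(raster, window=11):
    """Mean and sd in a square moving window of `window` pixels (odd number).

    On a 10 m grid, window=11 corresponds to a 110 m focal window.
    """
    r = np.asarray(raster, dtype=float)
    half = window // 2
    # NaN padding: windows at the edge are computed from fewer cells.
    padded = np.pad(r, half, constant_values=np.nan)
    # Shape (rows, cols, window, window): one window per original pixel.
    windows = sliding_window_view(padded, (window, window))
    mean = np.nanmean(windows, axis=(-2, -1))
    sd = np.nanstd(windows, axis=(-2, -1))
    return mean, sd


# Made-up canopy heights on a tiny grid, using a 3 x 3 window for readability.
heights = np.arange(9, dtype=float).reshape(3, 3)
focal_mean, focal_sd = focal_mean_sd(heights, window=3)
```

For the 250 m window an odd pixel count would be needed to keep the window centred (for example 25 pixels); which count was used is not stated here.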

We conducted a collinearity analysis on the focal variables and reduced the variables in a step-wise selection process to the following final four focal variables included in the models:

  • dtm_10m_sd_110m
  • canopy_height_sd_110m
  • vegetation_density_sd_110m
  • ns_groundwater_summer_sd_110m

Additional documentation of the selection process can be found in the focal variable selection document.

Overview table final predictor data sources

Here is an overview table of the final predictor data sources.

Predictor | Source dataset | Ecological meaning
amplitude_mean | EcoDes-DK15 | Quality of lidar signal reflected (proxy of biomass).
amplitude_sd | EcoDes-DK15 | Variation in quality of lidar signal reflected within 10 m pixel (proxy of variation in biomass).
canopy_height | EcoDes-DK15 | Lidar estimator of canopy height (95th percentile of height distribution of all vegetation points in 10 m pixel).
canopy_height_sd_110m | EcoDes-DK15 | Variation in lidar estimator of canopy height within 110 m focal window (11 x 11 pixels).
Clay_utm32_10m | Poggio et al. 2021 | Estimated percentage clay content of soil (250 m resolution downscaled to 10 m).
dtm_10m | EcoDes-DK15 | Terrain height above sea level.
dtm_10m_sd_110m | EcoDes-DK15 | Variation in terrain height above sea level within 110 m focal window (11 x 11 pixels).
foliage_height_diversity | EcoDes-DK15 | Foliage height diversity (MacArthur and MacArthur 1961) based on height bins by Wilson (1974).
normalized_z_sd | EcoDes-DK15 | Estimated variation in canopy height within 10 m pixel.
ns_groundwater_summer_sd_110m | Koch et al. 2021 | Variation in the estimate of depth of near-surface groundwater during an average summer within a 110 m focal window (11 x 11 pixels).
ns_groundwater_summer_utm32_10m | Koch et al. 2021 | Estimate of depth of near-surface groundwater during an average summer.
openness_difference | EcoDes-DK15 | Presence of linear features in the terrain (valleys, ridges etc.) based on a 50 m search radius.
Sand_utm32_10m | Poggio et al. 2021 | Estimated percentage sand content of soil (250 m resolution downscaled to 10 m).
slope | EcoDes-DK15 | Terrain slope at 10 m.
Soc_utm32_10m | Poggio et al. 2021 | Estimated percentage organic carbon content of soil (250 m resolution downscaled to 10 m).
solar_radiation | EcoDes-DK15 | Annual incident solar radiation based on terrain model (aspect and slope).
treetype_bjer_dec | Bjerreskov et al. 2021 | Deciduous or coniferous forest.
vegetation_density | EcoDes-DK15 | Density of vegetation points in 10 m lidar pixel.
vegetation_density_sd_110m | EcoDes-DK15 | Variation in density of vegetation points amongst pixels within 110 m focal window (11 x 11 pixels).

References