The inversion of arid-coastal cultivated soil salinity using explainable machine learning and Sentinel-2

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Pingping Jia
  • Junhua Zhang
  • Yanning Liang
  • Sheng Zhang
  • Keli Jia
  • Xiaoning Zhao

External Research Organisations

  • Ningxia University
  • Nanjing University of Information Science and Technology
  • Chinese Academy of Sciences (CAS)

Details

Original language: English
Article number: 112364
Number of pages: 14
Journal: Ecological Indicators
Volume: 166
Early online date: 29 Jul 2024
Publication status: Published - Sept 2024

Abstract

The escalating salinization of cultivated soil poses a significant threat to the ecological environment. It is imperative to establish a monitoring system and to mitigate the spread of salinization in arid and coastal areas through remote sensing, incorporating high-precision cross-regional models for soil salt content inversion. This study focuses on typical saline-alkali soils in arid and coastal regions of China. Sentinel-2 data (including 6 bands and 27 spectral indices), along with soil texture, moisture content, temperature, precipitation, and digital elevation model (DEM) data, were used to establish an arid-coastal salinity inversion model. Variable selection methods such as Pearson correlation coefficient (PCC), variable importance in projection (VIP), gray relational analysis (GRA), and gradient boosting machine (GBM) were combined with 9 models, including adaptive boosting (Adaboost), extremely randomized trees (ERT), and light gradient boosting machine (LightGBM). The best model was further interpreted using the Shapley additive explanations (SHAP) method. Results indicate that the sensitive characteristic variables common to arid and coastal areas were spectral indices and soil properties under PCC, spectral bands and indices under VIP, and all variables under GRA and GBM. The best inversion model in arid areas, GBM-ERT (R² = 0.91, RMSE = 1.06), was more accurate than the best inversion model in coastal areas, GBM-Adaboost (R² = 0.77, RMSE = 1.74). The arid-coastal inversion model PCC-LightGBM demonstrated the best inversion performance (R² = 0.64, RMSE = 2.29) and simulation performance in arid (R² = 0.67) and coastal areas (R² = 0.63). The dead fuel index (DFI) had the most significant impact on model prediction (0.89), and the second ratio index (RI2) contributed the highest relative importance (18%) to the model. Our analysis indicates that the arid-coastal PCC-LightGBM model, established using common characteristic variables, can effectively monitor large-scale soil salinity.
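
The PCC screening step and the two accuracy metrics quoted in the abstract (R² and RMSE) can be illustrated with a minimal sketch. This is not the authors' pipeline: the variable names `index_a` and `index_b`, the toy values, the 0.3 correlation threshold, and the stand-in "prediction" are all hypothetical, and no LightGBM model is fitted here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_by_pcc(features, target, threshold=0.3):
    """Keep variables whose |r| with the target reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    scores = {name: pearson(vals, target) for name, vals in features.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean squared error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Made-up toy data: five samples of soil salt content and two candidate variables.
salt = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "index_a": [2.1, 3.9, 6.2, 8.0, 9.8],  # strongly related to salt
    "index_b": [3.0, 5.0, 1.0, 5.0, 3.0],  # essentially unrelated to salt
}

selected = select_by_pcc(features, salt)

# Stand-in for a fitted model: a fixed rescaling of the selected variable.
predicted = [v / 2 for v in features["index_a"]]
model_r2 = r2_score(salt, predicted)
model_rmse = rmse(salt, predicted)
```

In a real workflow the screened variables would be passed to a trained regressor, and R² and RMSE would be computed on held-out samples rather than on the data used for screening.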

Keywords

    Arid-coastal area, Environment variables, Remote sensing, Soil health, Sustainable land use

Cite this

The inversion of arid-coastal cultivated soil salinity using explainable machine learning and Sentinel-2. / Jia, Pingping; Zhang, Junhua; Liang, Yanning et al.
In: Ecological Indicators, Vol. 166, 112364, 09.2024.

Jia P, Zhang J, Liang Y, Zhang S, Jia K, Zhao X. The inversion of arid-coastal cultivated soil salinity using explainable machine learning and Sentinel-2. Ecological Indicators. 2024 Sept;166:112364. Epub 2024 Jul 29. doi: 10.1016/j.ecolind.2024.112364
@article{db08313a82264b78b7f0186449644568,
title = "The inversion of arid-coastal cultivated soil salinity using explainable machine learning and {Sentinel-2}",
abstract = "The escalating salinization of cultivated soil poses a significant threat to the ecological environment. It is imperative to establish a monitoring system and to mitigate the spread of salinization in arid and coastal areas through remote sensing, incorporating high-precision cross-regional models for soil salt content inversion. This study focuses on typical saline-alkali soils in arid and coastal regions of China. Sentinel-2 data (including 6 bands and 27 spectral indices), along with soil texture, moisture content, temperature, precipitation, and digital elevation model (DEM) data, were used to establish an arid-coastal salinity inversion model. Variable selection methods such as Pearson correlation coefficient (PCC), variable importance in projection (VIP), gray relational analysis (GRA), and gradient boosting machine (GBM) were combined with 9 models, including adaptive boosting (Adaboost), extremely randomized trees (ERT), and light gradient boosting machine (LightGBM). The best model was further interpreted using the Shapley additive explanations (SHAP) method. Results indicate that the sensitive characteristic variables common to arid and coastal areas were spectral indices and soil properties under PCC, spectral bands and indices under VIP, and all variables under GRA and GBM. The best inversion model in arid areas, GBM-ERT (R$^{2}$ = 0.91, RMSE = 1.06), was more accurate than the best inversion model in coastal areas, GBM-Adaboost (R$^{2}$ = 0.77, RMSE = 1.74). The arid-coastal inversion model PCC-LightGBM demonstrated the best inversion performance (R$^{2}$ = 0.64, RMSE = 2.29) and simulation performance in arid (R$^{2}$ = 0.67) and coastal areas (R$^{2}$ = 0.63). The dead fuel index (DFI) had the most significant impact on model prediction (0.89), and the second ratio index (RI2) contributed the highest relative importance (18\%) to the model. Our analysis indicates that the arid-coastal PCC-LightGBM model, established using common characteristic variables, can effectively monitor large-scale soil salinity.",
keywords = "Arid-coastal area, Environment variables, Remote sensing, Soil health, Sustainable land use",
author = "Pingping Jia and Junhua Zhang and Yanning Liang and Sheng Zhang and Keli Jia and Xiaoning Zhao",
note = "Publisher Copyright: {\textcopyright} 2024 The Author(s)",
year = "2024",
month = sep,
doi = "10.1016/j.ecolind.2024.112364",
language = "English",
volume = "166",
journal = "Ecological Indicators",
issn = "1470-160X",
publisher = "Elsevier",
pages = "112364",
}

TY - JOUR

T1 - The inversion of arid-coastal cultivated soil salinity using explainable machine learning and Sentinel-2

AU - Jia, Pingping

AU - Zhang, Junhua

AU - Liang, Yanning

AU - Zhang, Sheng

AU - Jia, Keli

AU - Zhao, Xiaoning

N1 - Publisher Copyright: © 2024 The Author(s)

PY - 2024/9

Y1 - 2024/9

N2 - The escalating salinization of cultivated soil poses a significant threat to the ecological environment. It is imperative to establish a monitoring system and to mitigate the spread of salinization in arid and coastal areas through remote sensing, incorporating high-precision cross-regional models for soil salt content inversion. This study focuses on typical saline-alkali soils in arid and coastal regions of China. Sentinel-2 data (including 6 bands and 27 spectral indices), along with soil texture, moisture content, temperature, precipitation, and digital elevation model (DEM) data, were used to establish an arid-coastal salinity inversion model. Variable selection methods such as Pearson correlation coefficient (PCC), variable importance in projection (VIP), gray relational analysis (GRA), and gradient boosting machine (GBM) were combined with 9 models, including adaptive boosting (Adaboost), extremely randomized trees (ERT), and light gradient boosting machine (LightGBM). The best model was further interpreted using the Shapley additive explanations (SHAP) method. Results indicate that the sensitive characteristic variables common to arid and coastal areas were spectral indices and soil properties under PCC, spectral bands and indices under VIP, and all variables under GRA and GBM. The best inversion model in arid areas, GBM-ERT (R² = 0.91, RMSE = 1.06), was more accurate than the best inversion model in coastal areas, GBM-Adaboost (R² = 0.77, RMSE = 1.74). The arid-coastal inversion model PCC-LightGBM demonstrated the best inversion performance (R² = 0.64, RMSE = 2.29) and simulation performance in arid (R² = 0.67) and coastal areas (R² = 0.63). The dead fuel index (DFI) had the most significant impact on model prediction (0.89), and the second ratio index (RI2) contributed the highest relative importance (18%) to the model. Our analysis indicates that the arid-coastal PCC-LightGBM model, established using common characteristic variables, can effectively monitor large-scale soil salinity.

KW - Arid-coastal area

KW - Environment variables

KW - Remote sensing

KW - Soil health

KW - Sustainable land use

UR - http://www.scopus.com/inward/record.url?scp=85199858141&partnerID=8YFLogxK

U2 - 10.1016/j.ecolind.2024.112364

DO - 10.1016/j.ecolind.2024.112364

M3 - Article

AN - SCOPUS:85199858141

VL - 166

JO - Ecological Indicators

JF - Ecological Indicators

SN - 1470-160X

M1 - 112364

ER -