With the rising prevalence of pre-sarcopenia in metabolic dysfunction-associated steatotic liver disease (MASLD), this study aimed to develop and validate a machine learning-based model to identify pre-sarcopenia in the MASLD population.
Materials and Methods
A total of 571 MASLD subjects were screened from the National Health and Nutrition Examination Survey (NHANES) 2017–2018. This cohort was randomly divided into a training set and an internal testing set at a ratio of 7:3. Sixty-six MASLD subjects from our institution served as the external validation set. Four binary classifiers, namely Random Forest (RF), support vector machine, extreme gradient boosting and logistic regression, were fitted to identify pre-sarcopenia. The best-performing model was further validated in the external validation set. Model performance was assessed in terms of discrimination and calibration. Shapley Additive explanations were used for model interpretability.
Results
The pre-sarcopenia rate was 17.51% and 15.16% in the NHANES cohort and the external validation set, respectively. RF outperformed the other models, with an area under the receiver operating characteristic curve (AUROC) of 0.819 (95% CI: 0.749, 0.889). When the six top-ranking features by variable importance were retained (weight-adjusted waist index, sex, race, creatinine, education and alkaline phosphatase), the final RF model reached an AUROC of 0.824 (95% CI: 0.737, 0.910) and 0.732 (95% CI: 0.529, 0.936) in the internal and external validation sets, respectively. Model robustness was supported by sensitivity analysis. The calibration curve and decision curve analysis indicated good calibration and clinical utility.
Conclusions
This study proposed a user-friendly model using an explainable machine learning algorithm to predict pre-sarcopenia in the MASLD population. A web-based tool was provided to screen for pre-sarcopenia in community and hospital settings.
Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease (CLD) worldwide and is associated with higher morbidity and mortality than previously estimated [1]. In 2023, metabolic dysfunction-associated steatotic liver disease (MASLD) was proposed to replace NAFLD as a subtype of steatotic liver disease (SLD) [2]. A high level of agreement between NAFLD and MASLD individuals has been reported, and findings from NAFLD studies are expected to hold under the new MASLD definition [3,4].
Muscle loss and dysfunction share common mechanisms with the early stage of MASLD, such as physical inactivity, insulin resistance, dyslipidaemia and chronic systemic inflammation [5,6]. As the disease progresses, advanced fibrosis or cirrhosis becomes a major predisposing condition for the development of sarcopenia [7]. Sarcopenia, as a disease entity, is a progressive and generalized skeletal muscle disorder associated with a higher risk of decompensation and mortality [8]. However, skeletal muscle loss and muscle strength reduction often occur and develop asynchronously. Therefore, the European Working Group on Sarcopenia in Older People defined “pre-sarcopenia” as low skeletal muscle mass [9]. This term is also recommended as a phenotypic representation in the cirrhotic population by the American Association for the Study of Liver Diseases [7].
The onset and development of pre-sarcopenia is progressive and silent due to the co-existence of obesity, which is observed in most cases with MASLD [10]. Current evidence confirmed that sarcopenia is a driving force for MASLD progression [11-13]. Furthermore, the concurrence of sarcopenia and MASLD could lead to advanced fibrosis and higher mortality [14,15].
Computed Tomography (CT) and Dual-energy X-ray Absorptiometry (DXA) are utilized for screening pre-sarcopenia. Nevertheless, their widespread use remains a challenge due to the potential risks of overt screening, radiation exposure and high cost. Moreover, specialized software and laborious image processing are required. In addition, few existing tools so far deliver satisfactory performance in predicting sarcopenia, likely because sarcopenia is multifactorial [16]. A non-invasive model was previously proposed to predict sarcopenia in the diabetes mellitus (DM) population [17], but studies in the MASLD population remain sparse. Hence, a well-performing model to identify early pre-sarcopenia in MASLD is desirable, especially as a rule-out strategy.
This exploratory study aimed to develop and validate a model to predict pre-sarcopenia in the MASLD population, using a selected but limited number of non-invasive variables and a machine learning (ML) approach, without resorting to medical imaging. ML enables an integrated analysis of multidimensional data and can detect nonlinear interactions among diverse variables in large datasets, which may make prediction more precise. Shapley Additive explanations (SHAP) were used to interpret the ML model. Through this model, subjects at high risk of pre-sarcopenia may benefit from a more intensive surveillance strategy and preventive intervention.
2 Materials and methods
2.1 Study participants
The NHANES database provides nationally representative data on the U.S. population, collected with a complex, multistage, stratified sampling design. It is maintained by the Centers for Disease Control and Prevention and is publicly available (https://www.n.cdc.gov/nchs/nhanes/Default.aspx). The survey was approved by the institutional review board of the National Center for Health Statistics (https://www.cdc.gov/nchs/nhanes/irba98.htm), and written informed consent was obtained from all participants. For the cohort from our institution, written informed consent was waived due to the retrospective nature of the study, and the study protocol conforms to the ethical guidelines of the 1975 Declaration of Helsinki as reflected in a priori approval by the Ethics Committee.
Subjects diagnosed with MASLD who underwent DXA in the National Health and Nutrition Examination Survey (NHANES) 2017–2018 were included. This cohort was divided into a training set and an internal validation set at a ratio of 7:3.
MASLD was defined as the presence of hepatic steatosis with at least one cardiometabolic risk factor (CMRF) and low alcohol consumption. The CMRFs are detailed in the supplementary materials. As proposed previously, a controlled attenuation parameter (CAP) of ≥274 dB/m on liver ultrasound transient elastography (LUTE) was considered suggestive of MASLD status, with 90% sensitivity in detecting all degrees of liver steatosis [18]. After exclusion of HBV, HCV, autoimmune liver diseases, hepatocellular carcinoma and excessive alcohol consumption, individuals with CAP ≥274 dB/m, irrespective of liver fibrosis, were eligible for the study. DXA and LUTE measurements are detailed in the Supplemental material.
Given that DXA in the NHANES database was limited to adults younger than 60 years, MASLD subjects aged 18–59 years, diagnosed by liver biopsy or non-invasive criteria between 2020 and 2023 at our institution, were retrospectively collected as the external validation set. The medical records of eligible MASLD subjects were reviewed, and their available CT scans within 6 months of diagnosis were analyzed. Inclusion and exclusion criteria are detailed in the Supplemental material.
2.2 Outcome
In the NHANES survey, pre-sarcopenia was defined according to the Foundation for the National Institutes of Health definition. Arm and leg skeletal muscle mass (kg) measured by DXA were summed as appendicular skeletal muscle mass (ASM) [19]. The cut-off points of ASM normalized by BMI (kg/(kg/m2)), <0.512 for females and <0.789 for males, were used to identify pre-sarcopenia [20]. In the external validation set, the available CT within 6 months of the diagnosis date was analyzed by two radiologists (S.W.Y. and Q.Y.C., with 5 years of radiological experience), who were blinded to all information, using commercial software (slice-O-matic, version 4.2; Tomovision Inc., Montreal, Quebec, Canada). Any discrepancies were resolved through discussion. This software performs tissue demarcation by established Hounsfield unit thresholds [21], and a representative figure is shown in Figure S1. The cross-sectional area of skeletal muscle was measured on three consecutive plain CT slices at the level of the 3rd lumbar vertebra. The averaged value (cm2) normalized by height squared (m2) was defined as the skeletal muscle index (SMI). The SMI value was converted to the ASM index (ASMI), defined as ASM/height2, as follows: ASMI_CT = 0.11 × SMI + 1.17 [22]. Considering that the pre-sarcopenia cut-points in NHANES are derived from North American cohorts, pre-sarcopenia in the external validation set was identified using the Asia-specific cut-off values (ASM/height2, kg/m2): <7.0 kg/m2 in men and <5.4 kg/m2 in women [23]. For sensitivity analysis, the ethnicity-specific cut-points of SMI, 42 cm2/m2 for men and 38 cm2/m2 for women, were taken [24].
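The two outcome definitions can be sketched in a few lines of Python. This is only an illustration of the arithmetic quoted above, not the authors' code: the function names, argument names and the example measurements are hypothetical, while the conversion formula and the cut-offs are those stated in the text.

```python
def smi_from_areas(areas_cm2, height_m):
    """Mean L3 muscle area over the slices (cm2) divided by height squared (m2)."""
    mean_area = sum(areas_cm2) / len(areas_cm2)
    return mean_area / height_m ** 2


def asmi_from_smi(smi):
    """Conversion quoted in the text: ASMI_CT = 0.11 * SMI + 1.17."""
    return 0.11 * smi + 1.17


def _check_sex(sex):
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")


def presarcopenia_ct(smi, sex):
    """Asia-specific ASMI cut-offs from the text: <7.0 (men), <5.4 (women) kg/m2."""
    _check_sex(sex)
    cutoff = 7.0 if sex == "male" else 5.4
    return asmi_from_smi(smi) < cutoff


def presarcopenia_dxa(arm_kg, leg_kg, bmi, sex):
    """ASM/BMI cut-offs from the text: <0.789 (males), <0.512 (females)."""
    _check_sex(sex)
    cutoff = 0.789 if sex == "male" else 0.512
    return (arm_kg + leg_kg) / bmi < cutoff


# Hypothetical subject: three consecutive L3 slices, height 1.70 m.
smi = smi_from_areas([140.0, 142.0, 144.0], 1.70)
print(round(smi, 2), round(asmi_from_smi(smi), 2), presarcopenia_ct(smi, "male"))
```

Keeping the conversion and the threshold in separate functions makes it easy to swap in the SMI-based cut-points used for the sensitivity analysis.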
2.3 Data collection
According to prior studies exploring the association between muscle mass and MASLD [12,25], all available features were collected from the NHANES database 2017–2018.
In the training and internal validation sets, a total of 25 features, including demographic, anthropometric, clinical and laboratory data, physical examination findings and self-report questionnaires, were collected (detailed in the supplemental material). Any variable with less than 10% missing values was imputed by multiple imputation using predictive mean matching; otherwise, it was excluded from this study. The features in the external validation set were collected as required by the trained model.
2.4 Definitions
In NHANES, excessive alcohol consumption refers to an average alcohol intake of >3 drinks/day in males and >2 drinks/day in females [26].
Fibrosis grade was determined by liver stiffness measurement with cutoff values of 8.2, 9.7, and 13.6 kPa for fibrosis grades ≥F2, ≥F3, and F4, respectively, as assessed by LUTE. Significant fibrosis was defined as ≥F2 [18]. DM was identified based on any one of the following criteria: glycohemoglobin (HbA1C) ≥6.5%, fasting plasma glucose ≥126 mg/dL (7.0 mmol/L), having been told by a doctor to have diabetes, or taking insulin as indicated in the questionnaire. If only HbA1C ≥5.7% was met, prediabetes was diagnosed. Hypertension (HT) was defined as systolic blood pressure ≥130 mmHg or diastolic blood pressure ≥80 mmHg. Smoking status was classified as current smoker, past smoker and non-smoker. Waist circumference (WC) was assessed by placing a measuring tape in a horizontal plane around the abdomen at the level of the iliac crest. Weight-adjusted waist index (WWI) was calculated as WC (cm) divided by the square root of body weight (kg). Education level was categorized as low (high school or lower) or high.
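As a rough illustration of how these definitions translate into derived variables, the sketch below computes WWI, a fibrosis grade and a glycaemic status. The function and argument names are hypothetical; mapping the ≥F2/≥F3/F4 cutoffs to single grades, and the "F0-F1" label for values below 8.2 kPa, are assumptions made here for the example.

```python
import math


def wwi(waist_cm, weight_kg):
    """Weight-adjusted waist index: WC (cm) / sqrt(body weight (kg))."""
    return waist_cm / math.sqrt(weight_kg)


def fibrosis_grade(lsm_kpa):
    """Grade from liver stiffness using the 8.2 / 9.7 / 13.6 kPa cutoffs."""
    if lsm_kpa >= 13.6:
        return "F4"
    if lsm_kpa >= 9.7:
        return "F3"
    if lsm_kpa >= 8.2:
        return "F2"
    return "F0-F1"  # label assumed for values below the >=F2 cutoff


def significant_fibrosis(lsm_kpa):
    """Significant fibrosis is >=F2, i.e. stiffness of at least 8.2 kPa."""
    return lsm_kpa >= 8.2


def glycaemic_status(hba1c_pct, fpg_mg_dl=None, told_diabetes=False, on_insulin=False):
    """DM if any criterion is met; Pre-DM if only HbA1C >= 5.7% is met."""
    high_fpg = fpg_mg_dl is not None and fpg_mg_dl >= 126
    if hba1c_pct >= 6.5 or high_fpg or told_diabetes or on_insulin:
        return "DM"
    if hba1c_pct >= 5.7:
        return "Pre-DM"
    return "Non-DM"


# Hypothetical subject: WC 100 cm, weight 81 kg, stiffness 9.0 kPa, HbA1C 5.9%.
print(round(wwi(100.0, 81.0), 2), fibrosis_grade(9.0), glycaemic_status(5.9))
```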
According to the recommended ethnic-specific cut-off values, BMI categories were classified as normal weight (BMI, 18.5–22.9 kg/m2 for Asian Americans; BMI, 18.5–24.9 kg/m2 for non-Asian Americans) and overweight/obesity (BMI, ≥23 kg/m2 for Asian Americans; BMI, ≥25 kg/m2 for non-Asian Americans) [27]. Lean MASLD was defined as the MASLD subgroup without overweight or obesity [28]. Additional definitions are provided in the supplemental material.
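The ethnic-specific BMI rule above can be expressed as a small helper. This is a minimal sketch with hypothetical names; values below 18.5 kg/m2 are not classified in the text, so the helper returns None for them as an explicit assumption.

```python
def bmi_category(bmi, asian):
    """Classify BMI with the ethnic-specific cut-offs quoted in the text."""
    overweight_from = 23.0 if asian else 25.0
    if bmi >= overweight_from:
        return "overweight/obesity"
    if bmi >= 18.5:
        return "normal weight"
    return None  # below 18.5 kg/m2: no category is given in the text


def is_lean(bmi, asian):
    """Lean MASLD subgroup: MASLD without overweight or obesity."""
    return bmi_category(bmi, asian) != "overweight/obesity"


# The same BMI falls into different categories depending on the cut-off used.
print(bmi_category(24.0, asian=True), "|", bmi_category(24.0, asian=False))
```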
2.5 Feature selection and model construction
Feature selection and model building were performed in the training set and the internal validation set. Five-fold cross-validation (CV) was used for hyperparameter tuning. The external validation set was employed to test model reliability.
Features with P-value < 0.1 between the outcome-positive and outcome-negative groups were carried forward into model construction. For two features with an absolute intervariable correlation coefficient (ICC) >0.5, the one with the lower P-value was retained. Redundant features, i.e., features that were either statistically nonsignificant or highly correlated (absolute ICC >0.5), were excluded.
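The filter described above can be sketched as follows. The greedy order (visiting candidates from the lowest P-value upward) is one plausible reading of the pairwise rule, not necessarily the procedure the authors used, and the feature names, P-values and correlation in the example are hypothetical.

```python
def select_features(p_values, corr, p_threshold=0.1, corr_threshold=0.5):
    """Keep features with P < p_threshold, then drop correlated duplicates.

    p_values: dict mapping feature name to P-value.
    corr: dict mapping frozenset({a, b}) to the correlation between a and b;
          pairs that are absent are treated as uncorrelated.
    """
    candidates = [f for f, p in p_values.items() if p < p_threshold]
    # Visit the strongest candidates first so that, within a correlated pair,
    # the feature with the lower P-value is the one that is retained.
    candidates.sort(key=lambda f: p_values[f])
    kept = []
    for feature in candidates:
        clashes = any(
            abs(corr.get(frozenset((feature, other)), 0.0)) > corr_threshold
            for other in kept
        )
        if not clashes:
            kept.append(feature)
    return kept


# Hypothetical inputs for illustration only.
p_values = {"WWI": 0.0001, "waist": 0.001, "ALT": 0.734, "ALP": 0.0005}
corr = {frozenset(("WWI", "waist")): 0.8}
print(select_features(p_values, corr))
```

In this toy input, "ALT" fails the P-value filter and "waist" is dropped because it is highly correlated with "WWI", which has the lower P-value.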
Four binary classifiers, namely extreme gradient boosting (XGBoost), Random Forest (RF), Support Vector Machine (SVM) and Logistic regression (LR), were fitted to the data. The hyperparameters with the best average area under the receiver operating characteristic curve (AUROC) over five-fold CV were determined in the training set. Afterwards, the optimal model was determined by comparing these models in the internal validation set. For feature reduction, all included features were ranked by variable importance and selected in that order. The optimal cutoff was estimated using the Youden index. To avoid overfitting and improve practicability, the final model was chosen as the one with the fewest features and a stable AUROC, and was further tested in the external validation set. Model performance was evaluated with AUROC, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score. The calibration curve and Brier score were used to assess model calibration. Decision curve analysis (DCA) was used to evaluate the clinical net benefit of the model at different threshold probabilities. The hyperparameter tuning process and feature reduction are described in the supplemental material.
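For readers less familiar with these metrics, the following self-contained sketch shows how the Youden index cutoff, the threshold-based metrics, the Brier score and the DCA net benefit can be computed from predicted probabilities. It uses the standard textbook definitions and a hypothetical toy dataset; it is not the authors' R pipeline.

```python
def confusion(y_true, y_prob, threshold):
    """Counts (tp, fp, tn, fn) when predicting positive for p >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        positive = p >= threshold
        if positive and y == 1:
            tp += 1
        elif positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_threshold(y_true, y_prob):
    """Threshold maximizing J = sensitivity + specificity - 1.

    Assumes both classes are present in y_true.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(y_prob)):
        tp, fp, tn, fn = confusion(y_true, y_prob, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def metrics(tp, fp, tn, fn):
    """Threshold-based metrics; assumes no denominator is zero."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, pt):
    """DCA net benefit at threshold probability pt (0 < pt < 1)."""
    tp, fp, _, _ = confusion(y_true, y_prob, pt)
    n = len(y_true)
    return tp / n - fp / n * pt / (1 - pt)


# Hypothetical toy data: 1 = pre-sarcopenia, 0 = no pre-sarcopenia.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]

cutoff, j = youden_threshold(y_true, y_prob)
print(cutoff, metrics(*confusion(y_true, y_prob, cutoff)))
print(brier_score(y_true, y_prob), net_benefit(y_true, y_prob, cutoff))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of threshold probabilities and comparing it with the treat-all and treat-none strategies.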
Moreover, to improve the interpretability of the black-box ML model, a quantitative model interpretation method, SHAP, was deployed to present the contribution of each feature to decision-making.
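The idea behind SHAP is the Shapley value: a feature's contribution is its average marginal effect on the prediction over all orders in which features can be added. The sketch below computes exact Shapley values for a toy two-feature risk function against a single baseline. It is a didactic illustration only; the toy model, its coefficients and the feature names are hypothetical, and SHAP implementations for tree models rely on faster, model-specific algorithms.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    predict: function taking a dict of feature values and returning a number.
    x, baseline: dicts with the same feature names.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = x[name]  # switch this feature to its actual value
            value = predict(current)
            totals[name] += value - previous  # marginal contribution
            previous = value
    return {name: total / len(orders) for name, total in totals.items()}


def toy_risk(f):
    """Hypothetical risk function with an interaction term."""
    return (0.1 + 0.2 * f["wwi_high"] + 0.1 * f["low_education"]
            + 0.1 * f["wwi_high"] * f["low_education"])


x = {"wwi_high": 1, "low_education": 1}
baseline = {"wwi_high": 0, "low_education": 0}
phi = shapley_values(toy_risk, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction at `x` and at the baseline, which is the additivity property that SHAP plots rely on.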
2.6 Statistical analyses
Continuous variables were expressed as mean ± standard deviation or median (interquartile range) and compared by Student's t-test or the Mann–Whitney U test, according to the results of the Shapiro normality test; categorical variables were expressed as number (percentage) and compared by the χ2 test or Fisher's exact test. The DeLong test was used to compare the AUROCs of different models. Goodness-of-fit of the model was evaluated using the Hosmer-Lemeshow test. Multiple imputation of missing data was performed using the “MICE” package. ML analyses were performed with the “tidyverse” package in R (version 4.3.2). A P-value < 0.05 was considered indicative of a significant difference.
2.7 Ethical statement
Our institution's ethical review board approved the present study. The written informed consent for treatment was waived due to the retrospective nature of the study.
3 Results
3.1 Participant characteristics
A total of 571 subjects from NHANES were included in this study. Pre-sarcopenia was found in 17.51% (100/571) of subjects. The cohort was randomly split into a training set of 399 subjects and an internal validation set of 172 subjects (supplemental Table 1). The external validation set included 66 MASLD subjects (supplemental Table 2), with a pre-sarcopenia prevalence of 15.16% (10/66). The overview of the study design and the patient selection flowchart are shown in Fig. 1.
The characteristics of the NHANES subjects are described in Table 1. Subjects with DM, overweight/obesity and never-smoker status were predominant. DM/pre-DM, significant fibrosis and overweight/obesity were more frequent in the pre-sarcopenia group than in the non-pre-sarcopenia group. Subjects in the pre-sarcopenia group were significantly older, more often had a lower education level, and had higher WWI and alkaline phosphatase (ALP), along with lower high-density lipoprotein (HDL) and creatinine, than those without pre-sarcopenia. In the external validation set, subjects with pre-sarcopenia had higher WWI and lower albumin and creatinine.
Table 1. Characteristics of included participants in NHANES 2017–2018.
Characteristic | Overall (N = 571) | Non-pre-sarcopenia (N = 471) | Pre-sarcopenia (N = 100) | P-value |
---|---|---|---|---|
Age | 45.00 (34.00–53.50) | 44.00 (34.00–52.00) | 51.00 (38.00–55.00) | <0.001 |
Sex | | | | 0.576 |
Female | 280 (49.04%) | 234 (49.68%) | 46 (46.00%) | |
Male | 291 (50.96%) | 237 (50.32%) | 54 (54.00%) | |
Race | | | | <0.001 |
Mexican American | 101 (17.69%) | 68 (14.44%) | 33 (33.00%) | |
Non-Hispanic Asian | 127 (22.24%) | 110 (23.35%) | 17 (17.00%) | |
Non-Hispanic Black | 104 (18.21%) | 97 (20.59%) | 7 (7.00%) | |
Non-Hispanic White | 156 (27.32%) | 133 (28.24%) | 23 (23.00%) | |
Other Hispanic | 53 (9.28%) | 38 (8.07%) | 15 (15.00%) | |
Other/multiracial | 30 (5.25%) | 25 (5.31%) | 5 (5.00%) | |
HT† | | | | 0.466 |
No | 283 (49.56%) | 229 (50.78%) | 54 (56.84%) | |
Yes | 263 (46.06%) | 222 (49.22%) | 41 (43.16%) | |
Diabetes mellitus | | | | 0.026 |
Non-DM | 261 (45.71%) | 227 (48.20%) | 34 (34.00%) | |
Pre-DM | 181 (31.70%) | 145 (30.79%) | 36 (36.00%) | |
DM | 129 (22.59%) | 99 (21.02%) | 30 (30.00%) | |
BMI status | | | | 0.05 |
Normal weight | 35 (6.13%) | 32 (6.79%) | 3 (3.00%) | |
Overweight/Obesity | 536 (93.87%) | 439 (93.21%) | 97 (97.00%) | |
Significant fibrosis | | | | 0.005 |
No | 497 (87.04%) | 419 (88.96%) | 78 (78.00%) | |
Yes | 74 (12.96%) | 52 (11.04%) | 22 (22.00%) | |
WWI | 11.24±0.71 | 11.12±0.69 | 11.81±0.54 | <0.001 |
ALT | 24.00 (16.00–33.00) | 24.00 (16.00–33.00) | 23.50 (16.00–33.25) | 0.734 |
ALB | 41.00 (38.00–43.00) | 41.00 (39.00–43.00) | 41.00 (38.00–42.00) | 0.076 |
ALP | 78.00 (66.00–93.00) | 77.00 (64.00–92.00) | 86.50 (71.75–101.75) | <0.001 |
AST | 20.00 (16.00–26.00) | 20.00 (16.00–26.00) | 20.00 (16.00–25.00) | 0.574 |
BUN | 4.64 (3.93–5.71) | 4.64 (3.93–5.71) | 4.64 (3.93–5.71) | 0.817 |
Creatinine | 72.49 (59.23–84.86) | 74.26 (60.11–86.63) | 65.41 (51.94–78.68) | <0.001 |
GGT | 26.00 (18.00–42.00) | 26.00 (18.00–40.00) | 28.00 (19.75–51.00) | 0.061 |
TBIL | 6.84 (5.13–8.55) | 6.84 (5.13–8.55) | 6.84 (5.13–8.55) | 0.52 |
HDL | 1.16 (1.01–1.35) | 1.16 (1.01–1.37) | 1.09 (0.98–1.29) | 0.047 |
Total Cholesterol | 5.01±0.95 | 5.00±0.94 | 5.06±0.99 | 0.579 |
Vitamin D | 53.40 (35.20–69.85) | 52.60 (35.15–70.85) | 55.50 (38.70–66.20) | 0.893 |
HSCRP | 2.86 (1.27–5.75) | 2.75 (1.23–5.71) | 3.25 (1.42–6.68) | 0.097 |
WBC | 7.40 (6.20–8.90) | 7.40 (6.20–8.80) | 7.25 (6.18–9.25) | 0.913 |
PLT | 248.00 (213.50–285.50) | 248.00 (214.00–286.50) | 239.50 (208.00–278.75) | 0.395 |
HGB | 14.20 (13.20–15.30) | 14.20 (13.10–15.30) | 14.40 (13.38–15.60) | 0.396 |
Education level | | | | <0.001 |
High | 360 (63.05%) | 320 (67.94%) | 40 (40.00%) | |
Low | 211 (36.95%) | 151 (32.06%) | 60 (60.00%) | |
Smoking Group | | | | 0.803 |
Current smoker | 70 (12.26%) | 56 (11.89%) | 14 (14.00%) | |
Former smoker | 93 (16.29%) | 76 (16.14%) | 17 (17.00%) | |
Never smoker | 408 (71.45%) | 339 (71.97%) | 69 (69.00%) | |
NOTE: Results for continuous data are expressed as means ± standard deviations or medians (interquartile ranges) and for categorical data as N (%).
† indicates presence of missing data.
Abbreviations: HT, hypertension (mmHg); DM, diabetes mellitus; BMI, body mass index (kg/m2); ALT, alanine aminotransferase (U/L); ALB, albumin (g/L); ALP, alkaline phosphatase (IU/L); AST, aspartate aminotransferase (U/L); BUN, blood urea nitrogen (mmol/L); Cr, creatinine (μmol/L); GGT, gamma-glutamyl transferase (IU/L); TBIL, total bilirubin (μmol/L); HDL, high-density lipoprotein (mmol/L); HSCRP, high-sensitivity C-reactive protein (mg/L); WBC, white blood cell count (1000 cells/μL); PLT, platelet count (1000 cells/μL); HGB, hemoglobin (g/dL).
After preprocessing, a total of 23 features were included in the initial model fitted to the training data. Differences between groups at the prespecified threshold (P < 0.1) were noted in 13 features, including age, race, DM, BMI status, significant fibrosis, WWI, albumin, ALP, creatinine, gamma-glutamyl transferase, HDL, high-sensitivity C-reactive protein and education level (detailed in the supplemental material).
3.3 Model building and comparison
Four binary classifiers, namely XGBoost, SVM, RF and LR, were used for model building. RF outperformed the other models in the internal validation set, with an AUROC of 0.819 (95% CI: 0.749, 0.889). The metrics of the four models are shown in Fig. 2 and Table 2. The best hyperparameter combinations are provided in Supplemental Table 3.
Table 2. All metrics of the four models in the internal validation set and of the final RF model in the internal and external validation sets.
Abbreviations: ROC_AUC, area under the receiver operating characteristic curve; PR_AUC, area under the precision-recall curve; PPV, positive predictive value; NPV, negative predictive value; DXA, Dual-energy X-ray Absorptiometry; CT, Computed Tomography; RF, Random Forest; XGBoost, extreme gradient boosting; LR, logistic regression; SVM, support vector machine.
The model performance during feature reduction in the internal validation set is plotted in Fig. 3. When the six top-ranking features were retained, several metrics of RF stabilized. Accordingly, a final model incorporating 6 features (WWI, sex, race, creatinine, education level and ALP) was determined. At the optimal cut-off of 0.251, it achieved AUROCs of 0.917 (95% CI: 0.885, 0.949) and 0.824 (95% CI: 0.737, 0.910) in the training and internal validation sets, respectively (Fig. 4). In the internal validation set, the calibration curve showed a moderate goodness-of-fit, and DCA showed a net benefit at threshold probabilities between approximately 10% and 75% (Fig. 5).
The final model had an AUROC of 0.732 (95% CI: 0.529, 0.936) in the external validation set; other metrics are shown in Table 2. Calibration and DCA curves are plotted in Figure S2.
3.5 Feature contribution assessed by SHAP
The contribution of each feature in the final RF model was visualized using SHAP (Fig. 6). The wider the distribution of SHAP values for a feature, the greater its contribution. The redder a dot, the higher the risk for that case. In the final model, WWI ranked first, followed by sex, race, creatinine, education level and ALP. Higher WWI and ALP, lower education level and creatinine, and male sex were predictive of pre-sarcopenia. As for race, Mexican American and other Hispanic subjects were at higher risk of pre-sarcopenia.
Fig. 6. SHAP visualization for the model used to identify pre-sarcopenia in the MASLD population. The width of the distribution of SHAP values of a feature along the horizontal axis indicates the level of influence this feature has on the model's decision-making. The redder a dot, the higher the pre-sarcopenia risk of that case.
In the external validation set, when the CT-based SMI definition was used, pre-sarcopenia was observed in 12 (18.18%) subjects. The final model achieved an AUROC of 0.745 (95% CI: 0.575, 0.914). The ROC curves obtained with the DXA and CT-based definitions are compared in Fig. 7 (P > 0.05). Other metrics are shown in Table 2. The calibration and DCA curves are shown in Figure S3. Among the 35 lean MASLD subjects, the final model exhibited an AUROC of 0.912 (0.757–1.000) and a good performance in the DCA curve (Figure S4).
3.7 A web-based prediction tool
The final model is available at: https://riskofpresarcopeniainmasld.shinyapps.io/shiny_RiskofPresarcopeiaInNAFLD/.
4 Discussion
In this study, an ML-based model was proposed for the identification of pre-sarcopenia in the MASLD population. The model, which uses features commonly available in clinical and community settings, demonstrated good prediction performance, indicating a promising prospect for broad application. Of note, the high NPV and specificity values suggested that this model could serve as a reliable rule-out strategy in screening settings. A public web-based tool was provided.
Due to improved living standards and the popularity of high-glucose and high-salt diets, MASLD and subsequent fibrosis are projected to become more prevalent. Because it is masked by the high BMI and young age of the MASLD population, pre-sarcopenia is not easy to identify, yet it potentially advances hepatic steatosis and fibrosis and increases mortality [29]. To improve the practicability of the model, all included features should be readily available, stable in the short term and easy to obtain. The proposed web-based calculator offers a convenient evaluation of the development of and recovery from sarcopenia. Consequently, individuals with suspected pre-sarcopenia could be screened and monitored in real-life practice, with nutrition consultation and clinical intervention provided as necessary.
The prevalence of pre-sarcopenia in MASLD determined by DXA in this study was consistent with previous data, which ranged from 8.7% to 35% [30-32]. The differences in prevalence among these studies could be explained by different measurement modalities and cut-off points.
SHAP analysis revealed that Mexican American/other Hispanic male subjects with higher WWI and ALP, along with lower creatinine and education level were at higher risk of pre-sarcopenia.
WWI was initially proposed to predict cardiometabolic morbidity and mortality in a large-scale study [33]. The present study identified WWI as the most important factor contributing to a higher risk of pre-sarcopenia. Several community or population-based studies showed that WWI was inversely associated with abdominal muscle mass and positively associated with abdominal fat mass [34,35]. A higher WC is commonly found in subjects with abdominal obesity, which is characterized by abundant visceral adipose tissue and a reduced ratio of muscle mass to weight. Normalizing WC by weight makes WC values comparable across subjects.
Loss of skeletal muscle mass has primarily been regarded as an age-related change. In this study, age was not included in the final model, likely because all eligible subjects were less than 60 years old and age was treated as a continuous variable in model construction. Additionally, education level is related to income level. The impact of socioeconomic status on pre-sarcopenia is complex and may act through health awareness, diet and exercise habits, and quality of medical care.
This study identified male sex as predictive of pre-sarcopenia in the MASLD population, which was consistent with some prior findings [17,32,36]. However, this conclusion remains open to debate, as mixed results have been reported in cohorts from different regions [12]. Notably, sex is the only feature in the final model without a statistically significant difference between the pre-sarcopenia and non-pre-sarcopenia groups. The risk of sarcopenia on metabolic syndrome varies significantly after stratification by sex and WC [37]. It is speculated that the role of sex may be altered by potential interactions with other included features.
It is understandable that lower creatinine, which is influenced by muscle metabolic status, was related to a higher risk of pre-sarcopenia [38,39]. Creatinine is a convenient and readily available indicator that reflects muscle mass depletion to some extent in populations with normal kidney function.
The model performance in the external validation set was lower than that in the developmental sets. Several points could explain this. Baseline differences in some features between the NHANES data and the external validation set, especially race, may be responsible for the lower model performance. However, race was retained in order to improve model versatility and generalizability across diverse populations worldwide. Of note, the proposed model exhibited good prediction capacity and net clinical benefit in the lean MASLD population; however, considering the small sample size and limited outcome events, these results need further validation in future studies. Caution is therefore warranted until prospective studies supporting these findings are available.
Because subjects without pre-sarcopenia predominate in most MASLD cohorts, accuracy is not a suitable performance metric and may lead to overestimation of model performance. Hence, AUROC, F1 score and Brier score were selected as the main metrics for model evaluation.
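The effect of class imbalance on accuracy can be checked with a quick calculation. The sketch below uses the NHANES cohort counts reported in this study (100 subjects with and 471 without pre-sarcopenia) and a hypothetical classifier that labels every subject as negative; the helper name is an illustrative choice.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 score; F1 is defined as 0 when there are no true positives."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(y_true, y_pred) if y == p)
    accuracy = correct / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp > 0 else 0.0
    return accuracy, f1


# Cohort composition from the text: 100 positives, 471 negatives.
y_true = [1] * 100 + [0] * 471
# Hypothetical uninformative classifier: everyone is predicted negative.
y_pred = [0] * len(y_true)

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(round(accuracy, 3), f1)
```

Such a classifier reaches an accuracy of 471/571 (about 82.5%) while detecting no case at all, which the F1 score of 0 makes visible.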
While a promising model was proposed, it is important to acknowledge its limitations. Firstly, since participants in the external validation set were screened from hospitalized patients, muscle mass was measured on CT. Although a validated conversion formula and a sensitivity analysis were used, outcome bias from the different imaging modalities and from baseline differences between the developmental and external validation sets is inevitable. Nevertheless, the external validation provides additional support for the generalization ability of the model. Secondly, the model was trained and validated in a population aged less than 60 years; therefore, the efficacy of the model in the elderly requires further validation. Thirdly, as many available features as possible were utilized to build a comprehensive model, at the cost of a reduced sample size; yet the homeostatic model assessment of insulin resistance was not included in model construction because of its high proportion of missing values (≥50%). Lastly, muscle quality is crucial in evaluating sarcopenia. Due to limited available data and the retrospective nature of the study, the term “pre-sarcopenia” was used in this study. More research is needed to investigate both muscle quantity and quality in the MASLD population.
5 Conclusions
In conclusion, this study developed and validated a noninvasive ML-based model for the identification of pre-sarcopenia in the MASLD population, derived from the NHANES database and a real-world cohort; a web-based tool is freely available.
Funding statement
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Data availability statement
Data that support the findings of this study are available from the authors upon request.
Declaration of generative AI and AI-assisted technologies in the writing process
No AI or AI-assisted technologies were used during the preparation of this work.