Annals of Hepatology
Vol. 29. Issue 2.
(March - April 2024)
A data-driven approach to decode metabolic dysfunction-associated steatotic liver disease
Maria Jimenez Ramos (a), Timothy J. Kendall (a, b), Ignat Drozdov (c), Jonathan A. Fallowfield (a, *)
* Corresponding author: Jonathan.Fallowfield@ed.ac.uk
a Centre for Inflammation Research, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh BioQuarter, 4-5 Little France Drive, Edinburgh EH16 4UU, UK
b Edinburgh Pathology, University of Edinburgh, 51 Little France Crescent, Old Dalkeith Rd, Edinburgh EH16 4SA, UK
c Bering Limited, 54 Portland Place, London, W1B 1DY, UK
Abstract

Metabolic dysfunction-associated steatotic liver disease (MASLD), defined by the presence of liver steatosis together with at least one of five cardiometabolic factors, is the most common cause of chronic liver disease worldwide, affecting around one in three people. Yet the clinical presentation of MASLD and the risk of progression to cirrhosis and adverse clinical outcomes are highly variable. MASLD therefore represents both a global public health threat and a precision medicine challenge. Artificial intelligence (AI) is being investigated in MASLD to develop reproducible, quantitative, and automated methods to enhance patient stratification and to discover new biomarkers and therapeutic targets. This review details the different applications of AI and machine learning algorithms in MASLD, particularly in analyzing electronic health record, digital pathology, and imaging data. It also describes how specific MASLD consortia are leveraging multimodal data sources to spark research breakthroughs in the field. Using a new national-level ‘data commons’ (SteatoSITE) as an exemplar, the opportunities, as well as the technical challenges, of large-scale databases in MASLD research are highlighted.

Keywords:
NAFLD
MASLD
Big data
Artificial intelligence
Machine Learning
Precision medicine
Abbreviations:
AI
AUROC
CT
DL
EHR(s)
HCC
ICD (-9/-10)
LITMUS
MASLD
MASH
ML
MRI
MRE
NAFLD
NAS
NASH
NASH-CRN
NIMBLE
OPCS-4
SAF
SHG
TE
TPE
1. Introduction

Metabolic dysfunction-associated steatotic liver disease (MASLD), previously termed non-alcoholic fatty liver disease (NAFLD), is characterized by the presence of liver steatosis and at least one of the five cardiometabolic criteria proposed in a multi-society Delphi consensus statement [1]. Importantly, other causes of steatosis, including increased alcohol intake, must be absent. Metabolic dysfunction-associated steatohepatitis (MASH), previously termed non-alcoholic steatohepatitis (NASH), is the progressive stage of the disease distinguished by the presence of lobular inflammation, hepatocyte ballooning, and an increased risk of liver fibrosis. In some instances, fibrosis progression can lead to cirrhosis and the development of hepatocellular carcinoma (HCC). The presence of certain genetic variants, such as single nucleotide polymorphisms in patatin-like phospholipase domain-containing protein 3 (PNPLA3), hydroxysteroid 17β dehydrogenase 13 (HSD17B13), or transmembrane 6 superfamily member 2 (TM6SF2) genes has also been associated with an increased risk of MASLD development, progression, and unfavorable prognosis [2–4]. Currently, MASLD represents the main cause of chronic liver disease and the leading indication for liver transplantation, affecting ∼30% of the global population [5]. Epidemiological modeling predicts a substantial increase in prevalence, clinical burden, and socioeconomic costs in the coming years – a public health threat that no country appears well prepared to address [6].

Crucially, the severity of fibrosis in MASLD is strongly associated with an increased risk of overall and disease-specific morbidity and mortality [7]. The most common cause of death in people with MASLD is cardiovascular disease, followed by extra-hepatic malignancy, then liver-related mortality [8,9]. These findings reflect the range of comorbidities in MASLD and highlight the need for a multidisciplinary approach to the disease [1].

Despite substantial advances in our understanding of disease pathogenesis, there are still no approved therapies for MASLD, and many drugs have shown limited efficacy in clinical trials, especially in patients with cirrhosis [10]. Given the complexity of disease pathogenesis, combination drug therapy may be required to improve patient outcomes [11], but the optimal combinations or treatment regimens are unknown. Additionally, there is an unmet need for non-invasive biomarkers to accurately diagnose, stage, and monitor the progression of MASLD and reduce or obviate the necessity for a liver biopsy in clinical practice and pharmacological studies. Moreover, the heterogeneity in progression and prognosis of MASLD calls for novel approaches to disease stratification and prediction of individual risk of clinical outcomes; this may require a re-evaluation of MASLD, viewed through the prism of the new nomenclature, with integration of multimodal information including demographic, pathological, genetic/multi-omic, environmental, and electronic health record (EHR) data to understand patient trajectories and define discrete subphenotypes to enable precision medicine in MASLD [12].

In this review, we discuss how large-scale patient data and emerging artificial intelligence (AI) approaches are increasingly being leveraged in the MASLD field, in the quest for new diagnostic biomarkers, efficacious drug targets, and improved patient stratification and prognostication methods. A national-level multimodal database – SteatoSITE [13] – is used as an exemplar to demonstrate the utility and scope of an integrated data-driven approach, to highlight the technical challenges, and to illustrate possible future directions.

2. Big data classes and their utility in MASLD research

AI is a large and rapidly growing field that uses computer software to mimic human cognitive abilities and perform complex tasks. Machine Learning (ML) is an application of AI that enables computers to learn and recognize patterns from data to make decisions and predictions (Fig. 1). The two broad categories of ML algorithms are supervised (the computer learns from both input data and corresponding correct answers) and unsupervised (the computer only processes input data). Their main advantage is that they can recognize unique data patterns and include multiple components to create new disease classifications and predictive models through linkage to outcomes [14]. The use of AI/ML in liver disease research has increased in recent years, including in studies of MASLD, to address the challenges of pathophysiological complexity and heterogeneity of presentation and patient outcomes.
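
The distinction between the two categories can be illustrated with a minimal sketch. The data, the single feature, and the function names below are hypothetical and chosen only for illustration; a nearest-centroid classifier stands in for supervised learning and a two-cluster k-means for unsupervised learning.

```python
from statistics import mean

# Hypothetical toy data: one feature per patient (e.g. a liver fat fraction, %).
values = [2.0, 3.0, 4.0, 18.0, 20.0, 22.0]
labels = [0, 0, 0, 1, 1, 1]  # known answers, used only in the supervised case


def fit_nearest_centroid(xs, ys):
    """Supervised: learn one centroid per labelled class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def two_means(xs, iterations=10):
    """Unsupervised: split xs into two clusters without using any labels."""
    lo, hi = min(xs), max(xs)  # initialise the two centroids at the extremes
    for _ in range(iterations):
        assign = [0 if abs(x - lo) <= abs(x - hi) else 1 for x in xs]
        lo = mean(x for x, a in zip(xs, assign) if a == 0)
        hi = mean(x for x, a in zip(xs, assign) if a == 1)
    return assign


centroids = fit_nearest_centroid(values, labels)
print(predict(centroids, 5.0))  # class learned from labelled examples
print(two_means(values))        # groups discovered from the data alone
```

In the supervised case the labels define what the groups mean; in the unsupervised case the groups emerge from the data and must be interpreted afterwards.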

Fig. 1.

Schematic representation of the relationship between Artificial Intelligence and Machine Learning (ML), with ML algorithm categories and their applications.

2.1. Electronic health record data

Electronic Health Records (EHRs) are digital repositories of comprehensive patient health information, stored in standardized formats for efficient retrieval and sharing among healthcare providers. In both the United States (US) and the European Union, the adoption of EHRs has become nearly ubiquitous in both acute hospital and primary care settings [15]. EHR systems typically encompass administrative and healthcare utilization data, demographic details, diagnostic and procedural codes, laboratory results, pathology assessments, and prescribed medications.

The increased accessibility of EHRs for research has opened new avenues for large-scale observational studies and the application of AI/ML in MASLD, especially for predicting the risk of MASLD development or refining its diagnosis [16–21]. For example, Fialoke et al. used one of the largest US-based EHR resources (from Optum Analytics), which integrates healthcare data from 50 provider organizations treating more than 80 million patients, for a supervised ML classification of MASLD patients to predict the health status of the patient cohort. The inclusion of time-stamped data also facilitates longitudinal profiling of candidate biomarkers and the identification of potential predictor variables associated with clinical outcomes. Typically, EHR data is characterized by noisy, sparse, and irregularly timed observations, which poses a challenge for phenotype discovery in clinical data, although computational ingenuity can overcome this [22–24]. To date, there are very few AI/ML-based studies in MASLD that have leveraged temporally defined EHR data to gain insights into disease progression or prognosis. Vandrome et al. [25] used data mining techniques to search for MASLD subtypes in a hospital database cohort of 13,290 patients, identified using electronic signatures of the disease. Using hierarchical clustering, they identified five distinct subtypes of patients. Notably, two of the major groups exhibited fewer comorbidities and favorable outcomes, whereas a minority within the three smaller subtypes displayed more severe comorbidities and poorer outcomes.
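
As a rough illustration of hierarchical clustering for subtype discovery, the sketch below merges the closest groups of patients step by step. It is not the published pipeline: the linkage rule (average linkage), the two features, and the patient values are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between members of two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (a < b) and merge b into a.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical, pre-scaled features per patient (e.g. comorbidity burden, enzyme level).
patients = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (1.0, 0.9), (0.5, 3.0)]
print(agglomerative(patients, n_clusters=3))
```

Choosing where to stop merging (here, three clusters) is itself a modelling decision and would normally be guided by cluster validity measures and clinical interpretability.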

2.2. ‘Omics data

While EHR-based studies involve a substantial number of patients, none have integrated 'omics data to identify potential disease signatures for patient prediction and stratification. Despite this gap, several smaller studies have made efforts to address the issue.

Utilizing datasets from the Gene Expression Omnibus (GEO) [26], some researchers have conducted differential gene expression analyses, followed by network analysis and the application of ML algorithms. This methodology has enabled the identification of parsimonious gene signatures with a good Area Under Receiver Operating Characteristic Curve (AUROC) for the diagnosis of MASLD [27,28]. Sen et al. [29] employed transcriptomics of whole liver tissue and serum metabolomics from a cohort based on genome-scale metabolic models to identify dysregulated glycosphingolipid pathways across the disease spectrum. In the study by Luo et al. [30] the focus was on identifying serum biomarkers associated with liver fibrosis in patients with MASH. Although they identified key proteins linked to fibrosis and liver injury, they were unable to establish a protein panel capable of distinguishing between early and late fibrosis.
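
Because AUROC is the performance measure quoted throughout this review, a short sketch of how it can be computed may be useful. The function below uses the rank-based definition (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half); the scores and labels are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Compare every positive with every negative; a tie counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical gene-signature scores for three cases (1) and three controls (0).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(round(auroc(scores, labels), 3))
```

An AUROC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.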

The investigation of interactions between MASH and other diseases has yielded notable findings. Qian et al. [31] defined a 20-gene signature predicting fibrosis progression in MASLD and HCV patients over five years, validated with an AUROC of 0.86. They also identified potential antifibrotic drug candidates and BCL2 as a therapeutic target. Additionally, Fujiwara et al. [32] developed a 133-gene signature for MASLD patients developing HCC, validated in a separate HCC cohort, and converted into a four-parameter blood-based panel (comprising XCL1, GRN, ANGPT2, and MET).

More advanced models have also been explored. Conway et al. [33] utilized ML on clinical trial data (STELLAR 3 and 4) to establish a prognostic five-gene signature predicting progression to cirrhosis and liver-related events in MASH patients, correlating with histological features. Deep learning (DL) was also investigated, outperforming other algorithms with an AUROC >0.80 in identifying genes associated with MASL to MASH progression [34]. Among the final 39 candidates identified, 11 were linked to HCC and survival rate.

2.3. Imaging data

Non-invasive imaging techniques have been employed in MASLD research and clinical settings. Advanced magnetic resonance imaging (MRI), including proton-density fat fraction (PDFF) and MR elastography (MRE), facilitates accurate quantification of steatosis and fibrosis for MASH assessment [35]. Recent applications of supervised ML and DL in medical imaging enhance automation, enabling more precise diagnosis. Training these models can unveil abnormal patterns beyond human perception, enhancing the efficiency of non-invasive diagnostic procedures. Studies have utilized ML to predict MRE liver stiffness, achieving an AUROC of 0.84 when combined with clinical data [36]. In a study by Schawkat et al. [37] MRI was employed to explore the viability of assessing liver scarring by integrating texture analysis, a method for extracting information from grey-level intensity within an image, with a supervised ML algorithm. Their results demonstrated a classification accuracy of 87.7%, equivalent to the performance level of MRE.
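
To make the idea of texture analysis concrete, the sketch below computes one classical texture feature, the contrast of a grey-level co-occurrence matrix (GLCM). This is an assumption for illustration only: the text above does not state which texture features were used in the cited study, and the tiny images are invented.

```python
def glcm_contrast(image, levels):
    """Contrast of the grey-level co-occurrence matrix for horizontal neighbours."""
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
            counts[right][left] += 1  # symmetric: count both directions
    total = sum(map(sum, counts))
    # Contrast weights each co-occurrence by the squared grey-level difference.
    return sum((i - j) ** 2 * counts[i][j]
               for i in range(levels) for j in range(levels)) / total


# Hypothetical 4-level images: a uniform region and a coarsely alternating one.
smooth = [[1, 1, 1, 1], [1, 1, 1, 1]]
coarse = [[0, 3, 0, 3], [3, 0, 3, 0]]
print(glcm_contrast(smooth, levels=4), glcm_contrast(coarse, levels=4))
```

Features of this kind, computed over a region of interest, can then serve as inputs to a supervised classifier.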

AI has also been utilized in the analysis of computerized tomography (CT) scans, which can measure liver fat content. Currently, there are no standardized approaches for manually delineating the region of interest (ROI), although some proposals exist, as outlined in Starekova et al. [38]. AI can facilitate automated liver segmentation, contributing to standardized CT analysis methods for MASLD patients. Several studies have already achieved this, demonstrating a robust and significant correlation [39–41]. Notably, a semi-automated DL-augmented method has been used on MRI-acquired 3D liver images to facilitate modeling of resectional surgery for liver cancer [42].

Liver ultrasound scans are a standard non-invasive diagnostic tool for chronic liver diseases, including MASLD, but are influenced by examiner subjectivity and exhibit reduced sensitivity when the liver contains less than 20–30% fat [43]. Limited studies on AI's application for predicting and classifying MASLD patients indicate promising results with excellent AUROC scores [44–46]. Additionally, ML algorithms integrated with transient elastography (TE) have been employed to predict liver fibrosis and MASLD in large clinical trial/cohort studies [47–49].

2.4. Digital pathology data

Despite these promising results, the gold standard for diagnosis of MASLD and MASH requires a liver biopsy where steatosis, inflammation, hepatocyte ballooning, and fibrosis are assessed. Whilst a clinical histopathological diagnosis is made by a pathologist integrating all histological features, in a research setting there are two main systems for ordinal scoring of the cardinal histological features. Features of disease activity can be evaluated with the NAFLD Activity Score (NAS) and the stage scored using the NASH Clinical Research Network (CRN) system [50], or disease activity assessed using the SAF (steatosis, activity, and fibrosis) system that scores ballooning and inflammation using different criteria but incorporates the same NASH-CRN stage. The architects of the NAS system explicitly state that a NAS score should not be used to define a diagnosis of steatohepatitis, although a NAS≥4 is often erroneously used for such a purpose. A system based upon score assignment by an observer is inherently subjective with inter- and intra-observer variation. To make the assignment of disease activity or stage scores more reproducible, AI methodologies are being developed to automate feature scoring.

HistoIndex (https://www.histoindex.com/) uses second harmonic generation (SHG) and two-photon excitation (TPE) microscopy with AI analysis to undertake histological assessment of unstained tissue sections [51]. Computationally derived qFIBS scores [52], which are analogous to the pathologist-assigned NAS components and NASH-CRN stage, can be generated, and this tool was used in an international multicentre study to assess lobular inflammation, steatosis, fibrosis, and hepatocyte ballooning. qFIBS correlated strongly with each component of NAS (P < 0.001) and had an AUROC between 0.82 and 0.986 for each component.

PathAI (https://www.pathai.com) has developed an ML model that uses digital images of biopsies for automated, quantitative assessment of disease. The team used DL to predict NAS and fibrosis across three different clinical trials of advanced MASH [53]. The outputs of their ML model correlated significantly with the NAS and fibrosis scores. They also developed a new metric, the Deep Learning Treatment Assessment (DELTA) Liver Fibrosis Score, designed to capture the change in fibrosis patterns from before to after treatment.

MorphoQuant™ (https://biocellvia.com/) also uses standard stained sections from biopsies to quantify collagen fibres, as well as the perivascular and septal percentages of collagen. It is AI-based, relies on morphometric recognition, and requires no training. MorphoQuant™ successfully quantified macrosteatosis, inflammation, and fibrosis in an automated manner in a mouse MASH model [54].

While the models described above used hematoxylin and eosin (H&E)-stained sections or unstained slides, PharmaNest (https://www.fibronest.com) developed the FibroNest AI algorithm, which can analyse many different tinctorial stains to automatically quantify fibrosis and inflammation. Specifically, it can quantify the amount and structure of collagen and the morphometric traits of its fibres, thereby providing a complete evaluation of fibrosis. They successfully predicted the development of HCC from MASLD through histopathology imaging studies [55]. They also used their AI tool to assess fibrosis in a mouse study evaluating semaglutide [56]. Despite not observing a significant change in total fibrosis, their AI-based system revealed an amelioration of the collagen network architecture after treatment. While the total area of collagen remained unaltered, the treatment prevented its further accumulation.

In addition to tools to computationally replicate subjective ordinal feature scoring, methods have been developed to quantify features with continuous metrics. The earliest application of digital pathology in this area was the quantification of scarring in stained sections using simple colour thresholding [57], and AI-based classifiers have subsequently been developed to undertake the same task and provide a metric that complements the ordinal scar staging. Such classifiers are relatively easy to develop using open-source tools and have therefore been developed and used in a study-specific manner [13] that limits their generalizability and widespread application.
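
A minimal sketch of the colour-thresholding idea is shown below: each tissue pixel is classed as scar if it is strongly red, and the result is reported as a percentage of tissue area. The threshold values, the RGB colours, and the assumption of a red collagen stain are all hypothetical and would need calibration for any real staining and scanning protocol.

```python
def collagen_proportionate_area(pixels, red_margin=60, background=230):
    """Percentage of tissue pixels classified as collagen by a colour threshold."""
    tissue = collagen = 0
    for row in pixels:
        for r, g, b in row:
            if min(r, g, b) >= background:
                continue  # near-white pixel: slide background, not tissue
            tissue += 1
            if r - max(g, b) >= red_margin:
                collagen += 1  # strongly red pixel: treated as stained collagen
    return 100.0 * collagen / tissue if tissue else 0.0


# Hypothetical RGB values for background, stained collagen, and parenchyma.
WHITE, RED, PALE = (250, 250, 250), (200, 40, 60), (220, 200, 120)
image = [[WHITE, RED, PALE],
         [WHITE, PALE, PALE]]
print(collagen_proportionate_area(image))
```

Unlike an ordinal stage, this yields a continuous measure, which is why the two are described above as complementary.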

These studies show the importance of AI in enabling the stratification and automated quantification of key histopathological parameters in the diagnosis of MASLD. However, to maximize value it is important that such data is integrated with other diverse data sources, including EHRs, laboratory results, and genomic (and other ‘omics) profiles. Several initiatives are currently creating resources that store and analyze multimodal, multiscale information to elucidate new patient subphenotypes and to identify new biomarkers and therapeutic targets.

3. Academia-industry research consortia in MASLD

AI-based approaches to understanding complex diseases are enabled by accessible large-scale multimodal datasets. The Foundation for the National Institutes of Health (FNIH) initiative Non-invasive Biomarkers of Metabolic Liver Disease (NIMBLE) is a multi-stakeholder project to support regulatory approval of MASH-related biomarkers [58]. The diagnostic performance of five blood-based panels was evaluated in an observational cohort (n = 1073) covering the full spectrum of MASLD [59]. Multiple biomarkers met prespecified performance metrics. NIS4® had an AUROC of 0.81 for ‘at-risk’ MASH (steatohepatitis and fibrosis stage ≥F2). The AUROCs of the ELF™ test, PROC3, and FibroMeter VCTE™ for clinically significant fibrosis (≥F2), advanced fibrosis (≥F3), or cirrhosis (F4), respectively, were all ≥0.8.

The Liver Investigation: Testing Marker Utility in Steatohepatitis (LITMUS) consortium, supported by the European NAFLD registry [60], aims to develop, validate, and progress biomarkers for diagnosing, risk stratifying, and monitoring MASLD/MASH progression and fibrosis stage. The initiative involves a collaborative effort among end-users (clinicians and the pharmaceutical industry), independent academics specializing in medical test evaluation, and biomarker researchers and developers from academic or commercial backgrounds. By leveraging large-scale patient cohorts, bioresources, and multi-omics datasets, the goal is to establish a definitive and impartial evaluation platform for these biomarkers. The LITMUS investigators developed prediction models using supervised ML techniques that improved the detection of MASH and at-risk MASH [61]. They also created a proteo-transcriptomic map of MASLD signatures and generated a composite model comprising four proteins (ADAMTSL2, AKR1B10, CFHR4 and TREM2), body mass index, and type 2 diabetes mellitus status to identify at-risk steatohepatitis [62]. LITMUS has recently added an imaging study that will evaluate different MRI and elastography modalities against liver histology in MASLD [63].

TARGET-NASH, a longitudinal observational study, tracks patients under usual clinical care for MASLD/MASH in both academic and community settings [64]. The dataset is essential for establishing a baseline and assessing the impact of current practice guidelines, management, and new therapies on patients with various medical outcomes. The study's unique design, involving three years of retrospective analysis of MASH patients followed by at least five years of prospective enrolment, enables a comprehensive understanding of the disease's natural history. The TARGET-NASH cohort has allowed the validation of a clinical risk-based classification system [65], among other studies [66–68].

4. SteatoSITE

The aforementioned consortia have compiled large multicentric prospective datasets. However, this presents potential disadvantages, including selection bias, loss to follow-up, and long duration to accumulate clinical outcomes. In contrast, SteatoSITE (https://www.steatosite.com) is a retrospective, multimodal MASLD database (Fig. 2) [13]. SteatoSITE includes curated whole-slide images of H&E and picro-sirius red-stained liver sections, accompanying histological assessments (NAS, SAF, NASH-CRN, collagen % area), bulk hepatic RNA-sequencing (RNA-seq), and rich EHR data from a cohort of n = 940 adult patients who had previously undergone either needle biopsy (n = 659), explant (n = 56) or liver resection (n = 225) between January 2000 and October 2019. Cases across the whole MASLD spectrum were identified from three of the four NHS Scotland Biorepositories (Lothian, Greater Glasgow & Clyde, and Grampian), representing 12 of the 14 territorial Health Boards. Covering a span of ten years before the tissue sampling date until May 2020, the dataset encompasses over 5.67 million days (∼15,547 years) of comprehensive routine clinical information derived from EHRs (including demographic data, International Classification of Diseases (ICD)-9/10 and OPCS Classification of Interventions and Procedures version 4 (OPCS-4) codes, laboratory results, and medication history).

Fig. 2.

SteatoSITE Data Commons overview. The right panel includes a schematic diagram in which horizontal lines represent individual patient timelines decorated with a variable amount of multimodal data preceding or following the date of liver tissue sampling (time zero, indicated by the vertical yellow line). MASLD, metabolic dysfunction-associated steatotic liver disease; H&E, hematoxylin and eosin; PSR, picro-sirius red; NASH-CRN, Non-alcoholic steatohepatitis-Clinical Research Network; SAF, Steatosis, Activity, Fibrosis; FFPE, formalin-fixed paraffin-embedded; RNA-seq, RNA-sequencing.


SteatoSITE is a resource that can support multiple facets of MASLD research [13] and fulfils the FAIR attributes (Findability, Accessibility, Interoperability, and Reuse of digital assets) that underpin a ‘data commons’ [69]. One research avenue is the use of the extensive histopathological dataset linked to patient outcomes to develop new AI-augmented digital pathology tools for MASLD/MASH. Using training and validation sets derived from the SteatoSITE cohort, new risk prediction indices derived from SHG/TPE imaging features were shown to predict all-cause mortality, decompensation events, and HCC, outperforming both NASH-CRN and qFibrosis ordinal staging [70].

Additionally, analysis of the SteatoSITE bulk RNA-seq data has enabled the discovery of molecular features linked to outcomes. In Kendall et al. [13], a 15-gene transcriptional risk score (TRS) was associated with a higher risk of developing decompensation events in advanced MASLD. Moreover, six of the 15 genes are predicted by bioinformatics to translate into secretome markers. The TRS was also used to investigate transcriptional regulatory networks in MASLD. Three regulons (gene networks controlled by AE binding protein 1 (AEBP1), thyroid hormone receptor beta (THRB), and basonuclin zinc finger protein 2 (BNC2)) exhibited significantly higher counts of TRS genes than anticipated by chance. This suggests that these three networks might play a crucial role in the progression of MASLD. Of particular interest, given recent encouraging data on the THRB agonist resmetirom [71], THRB regulon activity not only decreased with advancing fibrosis stage but also predicted future hepatic decompensation (beyond standard fibrosis scoring).
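
One common way to ask whether a regulon contains more signature genes than expected by chance is a hypergeometric (one-sided Fisher) test. The sketch below is offered under that assumption; the text above does not specify which test was used, and all the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(universe, regulon, signature, overlap):
    """P(X >= overlap) when `signature` genes are drawn at random from `universe`."""
    # Sum the hypergeometric probabilities of every overlap at least as large.
    favourable = sum(comb(regulon, k) * comb(universe - regulon, signature - k)
                     for k in range(overlap, min(regulon, signature) + 1))
    return favourable / comb(universe, signature)


# Hypothetical counts: 20,000 expressed genes, a 200-gene regulon,
# a 15-gene signature, and 3 signature genes falling inside the regulon.
p_value = hypergeom_upper_tail(universe=20000, regulon=200, signature=15, overlap=3)
print(f"{p_value:.2e}")
```

When several regulons are tested, the resulting p-values would normally be corrected for multiple comparisons.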

SteatoSITE was also used to perform deconvolution of the bulk RNA-seq using a published single-cell RNA-seq (scRNA-seq) reference dataset from healthy and cirrhotic patients [72], to estimate cell proportions in MASLD and correlate specific cell subpopulations with clinical outcomes. Interestingly, hepatic scar-associated macrophages (SAMacs) strongly correlated with fibrosis severity and were predictive for all-cause mortality and hepatic decompensation events. Conversely, more homeostatic liver resident cell types such as liver sinusoidal endothelial cells and vascular smooth muscle cells were protective against future mortality or hepatic decompensation.
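
The principle behind reference-based deconvolution can be sketched as a non-negative least-squares problem: find non-negative cell-type weights whose mixture of reference profiles best reproduces the bulk profile. The example below solves it with projected gradient descent. It is a generic illustration, not the algorithm used in the study, and the signature matrix, step size, and bulk values are assumptions.

```python
def deconvolve(signature, bulk, steps=2000, lr=0.1):
    """Estimate cell-type proportions by non-negative least squares.

    signature[g][t] is the reference expression of gene g in cell type t.
    The step size `lr` must be small relative to the scale of `signature`.
    """
    n_genes, n_types = len(signature), len(signature[0])
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual between the reconstructed mixture and the observed bulk profile.
        residual = [sum(signature[g][t] * weights[t] for t in range(n_types)) - bulk[g]
                    for g in range(n_genes)]
        grad = [sum(signature[g][t] * residual[g] for g in range(n_genes))
                for t in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - lr * d) for w, d in zip(weights, grad)]
    total = sum(weights)
    return [w / total for w in weights] if total else weights


# Hypothetical reference profiles: 3 marker genes x 2 cell types.
reference = [[1.0, 0.0],
             [0.0, 1.0],
             [0.5, 0.5]]
bulk_sample = [0.7, 0.3, 0.5]  # constructed as a 70% / 30% mixture
print([round(p, 3) for p in deconvolve(reference, bulk_sample)])
```

The estimated proportions can then be related to fibrosis stage or clinical outcomes in the same way as any other per-patient variable.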

Derived from a Scottish population with a high prevalence of MASLD and liver-related deaths [73], SteatoSITE is outcome-rich, but also has some specific limitations including inherent spectrum bias and a lack of ethnic diversity. Therefore, compared to other cohorts, SteatoSITE may be less suitable for modeling the population-level natural history of MASLD, and caution is advised about the generalizability of findings to other geographical areas and ethnic populations. Nevertheless, SteatoSITE is currently a unique resource for broad research efforts in MASLD including patient stratification, digital pathology methods, biomarker [74] and drug target discovery.

5. Technical challenges of using big data in MASLD research and practice

The main technical challenges can be categorized into two domains: those arising from using EHRs and those related specifically to AI. Prominent EHR challenges include interoperability and usability. Globally, EHR systems have different clinical terminologies and technical specifications [75], which can create barriers when exchanging and using the data, as both aspects need to be addressed to achieve true interoperability. Additional factors hampering the use of EHRs for research purposes include human error (e.g., incorrect data entry, typographical errors, sample mislabelling), difficulties with data standardization, errors arising from different delimiters or encoding in input files, issues related to data formatting, and instances of data duplication, missing data or incompleteness.
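
Some of these data-quality problems can be screened for with simple checks before any modelling. The sketch below parses a small extract with an explicitly stated delimiter, drops exact duplicate rows, and counts missing values; the column names and records are hypothetical.

```python
import csv
import io

# Hypothetical semicolon-delimited extract with a duplicate row and a missing value.
RAW = "patient_id;test;value\n001;ALT;54\n001;ALT;54\n002;ALT;\n003;AST;41\n"


def load_records(text, delimiter):
    """Parse rows, drop exact duplicates, and count rows with a missing value."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    unique, seen = [], set()
    for row in rows:
        key = tuple(row.items())
        if key not in seen:  # keep only the first copy of each identical row
            seen.add(key)
            unique.append(row)
    missing = sum(1 for row in unique if not row["value"].strip())
    return unique, len(rows) - len(unique), missing


records, n_duplicates, n_missing = load_records(RAW, delimiter=";")
print(len(records), n_duplicates, n_missing)
```

Stating the delimiter and encoding explicitly for each source, rather than guessing them, avoids one of the silent failure modes listed above.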

Despite the promise of AI/ML approaches in many aspects of MASLD research and clinical practice, certain technical challenges and limitations should be acknowledged. ML algorithms must undergo training to effectively identify patterns in the data. This process is hindered by the notoriously large dimensionality of features in medical datasets, referred to as the “curse of dimensionality”, often resulting in suboptimal algorithm performance in independent studies and failure to generalize to clinical scenarios. Additionally, it is easy to ignore that all input data are generated within a non-stationary environment with shifting patient populations that “drift” away from original training data. This phenomenon adversely affects algorithm performance and should be monitored and mitigated during live deployment. Furthermore, clinicians and pathologists, with differing expertise, contribute to the input data, which may therefore exhibit discrepancies in features/data for the model. Variability in obtaining input data, influenced by factors such as tissue quality, experimental locations and equipment, can contaminate feature selection and ground truthing and adversely impact model performance. AI systems, acting as black boxes (with internal workings that are invisible to the researchers/users), can perpetuate biases that are challenging to detect, such as hidden stratifications [76]. Transparency is therefore crucial in publishing AI models for reliability, reproducibility, and diagnostic use. Additionally, although somewhat theoretical at present, AI algorithms are susceptible to the risk of adversarial attack, which describes an otherwise effective model that can be manipulated by the provision of inputs explicitly designed to fool it and to purposefully generate an incorrect prediction [77]. Finally, standardization and regulatory approval would be essential for future clinical utilization of these diverse algorithms and models in disease diagnosis and assessment.
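
Drift of the kind described above can be monitored by comparing the distribution of each input feature in live data against the training data. The sketch below uses the population stability index (PSI) as one possible summary; the choice of metric, the bin edges, and the example ages are assumptions made for illustration.

```python
from math import log


def population_stability_index(reference, current, edges, eps=1e-6):
    """PSI between two samples of one feature, binned at shared `edges` (sorted)."""
    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v >= e for e in edges)] += 1  # index of the bin containing v
        # Floor empty bins at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = proportions(reference), proportions(current)
    return sum((c - r) * log(c / r) for r, c in zip(ref, cur))


# Hypothetical patient ages at training time and during deployment.
training_ages = [25, 30, 35, 40, 45, 50]
deployed_ages = [45, 50, 55, 60, 65, 70]
age_bins = [30, 40, 50, 60]
print(population_stability_index(training_ages, training_ages, age_bins))
print(population_stability_index(training_ages, deployed_ages, age_bins))
```

A value of zero indicates identical binned distributions; larger values indicate a greater shift and can be used to trigger review or retraining.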

6. Future directions

The incorporation of AI/ML into MASLD research is swiftly advancing. By leveraging appropriate tools and methodologies, such as dimensionality reduction [78] and feature selection, data scientists can extract valuable insights from the growing complexity of accessible datasets. The assessment of liver histology using AI-augmented digital pathology tools is being assimilated into MASLD interventional trials, where digital analyses might provide better reproducibility and greater insights into drug efficacy and mechanism of action than standard scoring methods [79]. Moreover, the integration of AI-digital pathology with spatially resolved ‘omics data and clinical outcomes could drive the development of new histopathological-based metrics and refined categorizations for the stratification and prognostication of MASLD.

Finally, in the longer-term, AI might be applied in various ways to enhance clinical trials in MASLD. For example, AI algorithms could analyze EHRs to pinpoint eligible patients for clinical trials, improving patient recruitment efficiency, or be used to predict patient responses to treatment, helping in the selection of appropriate candidates for specific interventions. In addition, AI algorithms could continuously monitor patient data in real-time for early detection of adverse events, enhancing participant safety during the trial.

Funding

The creation and initial analysis of SteatoSITE were funded by Innovate UK (Precision medicine: impacting through innovative technology; Reference: TS/R017581/1; J.A.F. and T.J.K.), Innovate UK Eureka (Reference: 105976; J.A.F., T.J.K. and I.D.), a Guts UK Development Grant (Reference: DGO2019_16; J.A.F. and T.J.K.), an industrial Collaborative Awards in Science and Engineering (iCASE) PhD studentship funded by the Medical Research Council Precision Medicine Doctoral Training Programme (Reference: MR/R01566X/1; M.J.R.), and Galecto Biotech (M.J.R.).

Author contributions

Maria Jimenez Ramos contributed to conceptualization, writing, review, and editing. Timothy J. Kendall and Ignat Drozdov contributed to writing, review, and editing. Jonathan A. Fallowfield contributed to writing, review, and editing and provided supervision.

References
[1]
M.E. Rinella, J.V. Lazarus, V. Ratziu, S.M. Francque, A.J. Sanyal, F. Kanwal, et al.
A multisociety Delphi consensus statement on new fatty liver disease nomenclature.
[2]
S. Romeo, J. Kozlitina, C. Xing, A. Pertsemlidis, D. Cox, L.A. Pennacchio, et al.
Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease.
Nat Genet, 40 (2008), pp. 1461-1465
[3]
J. Kozlitina, E. Smagris, S. Stender, B.G. Nordestgaard, H.H. Zhou, A. Tybjærg-Hansen, et al.
Exome-wide association study identifies a TM6SF2 variant that confers susceptibility to nonalcoholic fatty liver disease.
Nat Genet, 46 (2014), pp. 352-356
[4]
W. Su, Y. Wang, X. Jia, W. Wu, L. Li, X. Tian, et al.
Comparative proteomic study reveals 17β-HSD13 as a pathogenic protein in nonalcoholic fatty liver disease.
Proc Natl Acad Sci, 111 (2014), pp. 11437-11442
[5]
Z.M. Younossi, G. Wong, Q.M. Anstee, L. Henry.
The global burden of liver disease.
Clin Gastroenterol Hepatol, 21 (2023), pp. 1978-1991
[6]
J.V. Lazarus, H.E. Mark, M. Villota-Rivas, A. Palayew, P. Carrieri, M. Colombo, et al.
The global NAFLD policy review and preparedness index: are countries ready to address this silent public health challenge?
J Hepatol, 76 (2022), pp. 771-780
[7]
Z.M. Younossi, L. Henry.
Epidemiology of non-alcoholic fatty liver disease and hepatocellular carcinoma.
JHEP Rep Innov Hepatol, 3 (2021),
[8]
S. Ciardullo, E. Bianconi, R. Cannistraci, P. Parmeggiani, E.M. Marone, G. Perseghin.
Peripheral artery disease and all-cause and cardiovascular mortality in patients with NAFLD.
J Endocrinol Investig, 45 (2022), pp. 1547-1553
[9]
G. Targher, C.P. Day, E. Bonora.
Risk of cardiovascular disease in patients with nonalcoholic fatty liver disease.
N Engl J Med, 363 (2010), pp. 1341-1350
[10]
S.A. Harrison, A.M. Allen, J. Dubourg, M. Noureddin, N. Alkhouri.
Challenges and opportunities in NASH drug development.
Nat Med, 29 (2023), pp. 562-573
[11]
J.F. Dufour, C. Caussy, R. Loomba.
Combination therapy for non-alcoholic steatohepatitis: rationale, opportunities and challenges.
[12]
S.L. Friedman, M. Pinzani.
Hepatic fibrosis 2022: Unmet needs and a blueprint for the future.
Hepatology, 75 (2022), pp. 473-488
[13]
T.J. Kendall, M. Jimenez-Ramos, F. Turner, P. Ramachandran, J. Minnier, M.D. McColgan, et al.
An integrated gene-to-outcome multimodal database for metabolic dysfunction-associated steatotic liver disease.
[14]
L. Zhang, Y. Mao.
Artificial intelligence in NAFLD: will liver biopsy still be necessary in the future?
Healthcare, 11 (2023), pp. 117
[15]
Charles D., Gabriel M., Searcy T. Office of the National Coordinator for Health Information Technology (ONC) Data Brief No. 23. Adoption of electronic health record systems among U.S. non-federal acute care hospitals: 2008-2014. 2015 [accessed Nov 24 2023]. Available from: https://www.healthit.gov/data/data-briefs/adoption-electronic-health-record-systems-among-us-non-federal-acute-care-1
[16]
J.M. Schattenberg, M.M. Balp, B. Reinhart, A. Tietz, S.A. Regnier, G. Capkun, et al.
NASHmap: clinical utility of a machine learning model to identify patients at risk of NASH in real-world settings.
[17]
S. Fialoke, A. Malarstig, M.R. Miller, A. Dumitriu.
Application of machine learning methods to predict non-alcoholic steatohepatitis (NASH) in non-alcoholic fatty liver (NAFL) patients.
Proceedings of the AMIA annual symposium proceedings AMIA symposium, pp. 430-439
[18]
S. Perveen, M. Shahbaz, K. Keshavjee, A. Guergachi.
A systematic machine learning based approach for the diagnosis of non-alcoholic fatty liver disease risk and progression.
[19]
T.C.F. Yip, A.J. Ma, V.W.S. Wong, Y.K. Tse, H.L.Y. Chan, P.C. Yuen, et al.
Laboratory parameter-based machine learning model for excluding non-alcoholic fatty liver disease (NAFLD) in the general population.
Aliment Pharmacol Ther, 46 (2017), pp. 447-456
[20]
A.K. Loomis, S. Kabadi, D. Preiss, C. Hyde, V. Bonato, M. St. Louis, et al.
Body mass index and risk of nonalcoholic fatty liver disease: two electronic health record prospective studies.
J Clin Endocrinol Metab, 101 (2016), pp. 945-952
[21]
K.E. Corey, U. Kartoun, H. Zheng, R.T. Chung, S.Y. Shaw.
Using an electronic medical records database to identify non-traditional cardiovascular risk factors in nonalcoholic fatty liver disease.
Off J Am Coll Gastroenterol ACG, 111 (2016), pp. 671
[22]
J. Gronsbell, J. Minnier, S. Yu, K. Liao, T. Cai.
Automated feature selection of predictors in electronic medical records data.
Biometrics, 75 (2019), pp. 268-277
[23]
I.E. Nogues, J. Wen, Y. Lin, M. Liu, S.K. Tedeschi, A. Geva, et al.
Weakly semi-supervised phenotyping using electronic health records.
J Biomed Inform, 134 (2022),
[24]
T.A. Lasko, J.C. Denny, M.A. Levy.
Computational phenotype discovery using unsupervised feature learning over noisy, sparse, and irregular clinical data.
[25]
M. Vandromme, T. Jun, P. Perumalswami, J.T. Dudley, A. Branch, L. Li.
Automated phenotyping of patients with non-alcoholic fatty liver disease reveals clinically relevant disease subtypes.
Biocomputing 2020, World Scientific, (2019), pp. 91-102
[26]
R. Edgar, M. Domrachev, A.E. Lash.
Gene Expression Omnibus: NCBI gene expression and hybridization array data repository.
Nucleic Acids Res, 30 (2002), pp. 207-210
[27]
N. Han, J. He, L. Shi, M. Zhang, J. Zheng, Y. Fan.
Identification of biomarkers in nonalcoholic fatty liver disease: a machine learning method and experimental study.
[28]
Z. Zhang, S. Wang, Z. Zhu, B. Nie.
Identification of potential feature genes in non-alcoholic fatty liver disease using bioinformatics analysis and machine learning strategies.
[29]
P. Sen, O. Govaere, T. Sinioja, A. McGlinchey, D. Geng, V. Ratziu, et al.
Quantitative modeling of human liver reveals dysregulation of glycosphingolipid pathways in nonalcoholic fatty liver disease.
[30]
Y. Luo, S. Wadhawan, A. Greenfield, B.E. Decato, A.M. Oseini, R. Collen, et al.
SOMAscan proteomics identifies serum biomarkers associated with liver fibrosis in patients with NASH.
Hepatol Commun, 5 (2021), pp. 760
[31]
T. Qian, N. Fujiwara, B. Koneru, A. Ono, N. Kubota, A.K. Jajoriya, et al.
Molecular signature predictive of long-term liver fibrosis progression to inform antifibrotic drug development.
Gastroenterology, 162 (2022), pp. 1210-1225
[32]
N. Fujiwara, N. Kubota, E. Crouchet, B. Koneru, C.A. Marquez, A.K. Jajoriya, et al.
Molecular signatures of long-term hepatocellular carcinoma risk in nonalcoholic fatty liver disease.
Sci Transl Med, 14 (2022), pp. eabo4474
[33]
J. Conway, M. Pouryahya, Y. Gindin, D.Z. Pan, O.M. Carrasco-Zevallos, V. Mountain, et al.
Integration of deep learning-based histopathology and transcriptomics reveals key genes associated with fibrogenesis in patients with advanced NASH.
[34]
I. Park, N. Kim, S. Lee, K. Park, M.Y. Son, H.S. Cho, et al.
Characterization of signature trends across the spectrum of non-alcoholic fatty liver disease using deep learning method.
[35]
P.S. Dulai, C.B. Sirlin, R. Loomba.
MRI and MRE for non-invasive quantitative assessment of hepatic steatosis and fibrosis in NAFLD and NASH: Clinical trials to clinical practice.
J Hepatol, 65 (2016), pp. 1006-1016
[36]
L. He, H. Li, J.A. Dudley, T.C. Maloney, S.L. Brady, E. Somasundaram, et al.
Machine learning prediction of liver stiffness using clinical and T2-weighted MRI radiomic data.
Am J Roentgenol, 213 (2019), pp. 592-601
[37]
K. Schawkat, A. Ciritsis, S. von Ulmenstein, H. Honcharova-Biletska, C. Jüngst, A. Weber, et al.
Diagnostic accuracy of texture analysis and machine learning for quantification of liver fibrosis in MRI: correlation with MR elastography and histopathology.
Eur Radiol, 30 (2020), pp. 4675-4685
[38]
J. Starekova, D. Hernando, P.J. Pickhardt, S.B. Reeder.
Quantification of liver fat content with CT and MRI: state of the art.
Radiology, 301 (2021), pp. 250-262
[39]
P.M. Graffy, V. Sandfort, R.M. Summers, P.J. Pickhardt.
Automated liver fat quantification at nonenhanced abdominal CT for population-based steatosis assessment.
Radiology, 293 (2019), pp. 334-342
[40]
Y. Huo, J.G. Terry, J. Wang, S. Nair, T.A. Lasko, B.I. Freedman, et al.
Fully automatic liver attenuation estimation combing CNN segmentation and morphological operations.
Med Phys, 46 (2019), pp. 3508-3519
[41]
K.J. Choi, J.K. Jang, S.S. Lee, Y.S. Sung, W.H. Shim, H.S. Kim, et al.
Development and validation of a deep learning system for staging liver fibrosis by using contrast agent–enhanced CT images in the liver.
Radiology, 289 (2018), pp. 688-697
[42]
D.J. Mole, J.A. Fallowfield, A.E. Sherif, T. Kendall, S. Semple, M. Kelly, et al.
Quantitative magnetic resonance imaging predicts individual future liver performance after liver resection for cancer.
[43]
Y. Li, X. Wang, J. Zhang, S. Zhang, J. Jiao.
Applications of artificial intelligence (AI) in researches on non-alcoholic fatty liver disease(NAFLD) : a systematic review.
Rev Endocr Metab Disord, 23 (2022), pp. 387-400
[44]
C.C. Wu, W.C. Yeh, W.D. Hsu, M.M. Islam, P.A.A. Nguyen, T.N. Poly, et al.
Prediction of fatty liver disease using machine learning algorithms.
Comput Methods Progr Biomed, 170 (2019), pp. 23-29
[45]
A. Tahmasebi, S. Wang, C.E. Wessner, T. Vu, J.B. Liu, F. Forsberg, et al.
Ultrasound-based machine learning approach for detection of nonalcoholic fatty liver disease.
J Ultrasound Med, 42 (2023), pp. 1747-1756
[46]
Y.X. Liu, X. Liu, C. Cen, X. Li, J.M. Liu, Z.Y. Ming, et al.
Comparison and development of advanced machine learning tools to predict nonalcoholic fatty liver disease: An extended study.
Hepatobiliary Pancreat Dis Int, 20 (2021), pp. 409-415
[47]
A.J. Sanyal, Q.M. Anstee, M. Trauner, E.J. Lawitz, M.F. Abdelmalek, D. Ding, et al.
Cirrhosis regression is associated with improved clinical outcomes in patients with nonalcoholic steatohepatitis.
Hepatology, 75 (2022), pp. 1235
[48]
M. Noureddin, F. Ntanios, D. Malhotra, K. Hoover, B. Emir, E. McLeod, et al.
Predicting NAFLD prevalence in the United States using National Health and Nutrition Examination Survey 2017–2018 transient elastography data and application of machine learning.
Hepatol Commun, 6 (2022), pp. 1537
[49]
B. Mamandipoor, S. Wernly, G. Semmler, M. Flamm, C. Jung, E. Aigner, et al.
Machine learning models predict liver steatosis but not liver fibrosis in a prospective cohort study.
Clin Res Hepatol Gastroenterol, 47 (2023),
[50]
D.E. Kleiner, E.M. Brunt, M. Van Natta, C. Behling, M.J. Contos, O.W. Cummings, et al.
Design and validation of a histological scoring system for nonalcoholic fatty liver disease.
Hepatology, 41 (2005), pp. 1313-1321
[51]
M. Strupler, M. Hernest, C. Fligny, J.L. Martin, P.L. Tharaux, M.C. Schanne-Klein.
Second harmonic microscopy to quantify renal interstitial fibrosis and arterial remodeling.
J Biomed Opt, 13 (2008),
[52]
F. Liu, G.B.B. Goh, D. Tiniakos, A. Wee, W.Q. Leow, J.M. Zhao, et al.
qFIBS: an automated technique for quantitative evaluation of fibrosis, inflammation, ballooning, and steatosis in patients with nonalcoholic steatohepatitis.
Hepatology, 71 (2020), pp. 1953-1966
[53]
A.M. Dinani, K.V. Kowdley, M. Noureddin.
Application of artificial intelligence for diagnosis and risk stratification in NAFLD and NASH: the state of the art.
Hepatology, 74 (2021), pp. 2233
[54]
Automated computerized image analysis for the user-independent evaluation of disease severity in preclinical models of NAFLD/NASH.
Lab Investig, 100 (2020), pp. 147-160
[55]
Y. Nakamura, H. Miyaaki, S. Miuma, Y. Akazawa, M. Fukusima, R. Sasaki, et al.
Automated fibrosis phenotyping of liver tissue from non-tumor lesions of patients with and without hepatocellular carcinoma after liver transplantation for non-alcoholic fatty liver disease.
Hepatol Int, 16 (2022), pp. 555-561
[56]
J.A. Inia, G. Stokman, M.C. Morrison, N. Worms, L. Verschuren, M.P.M. Caspers, et al.
Semaglutide has beneficial effects on non-alcoholic steatohepatitis in Ldlr-/-.Leiden mice.
Int J Mol Sci, 24 (2023), pp. 8494
[57]
V. Calvaruso, A.K. Burroughs, R. Standish, P. Manousou, F. Grillo, G. Leandro, et al.
Computer-assisted image analysis of liver collagen: relationship to Ishak scoring and hepatic venous pressure gradient.
Hepatology, 49 (2009), pp. 1236-1244
[58]
A.J. Sanyal, S.S. Shankar, R.A. Calle, A.E. Samir, C.B. Sirlin, S.P. Sherlock, et al.
Non-invasive biomarkers of nonalcoholic steatohepatitis: the FNIH NIMBLE project.
Nat Med, 28 (2022), pp. 430-432
[59]
A.J. Sanyal, S.S. Shankar, K.P. Yates, J. Bolognese, E. Daly, C.A. Dehn, et al.
Diagnostic performance of circulating biomarkers for non-alcoholic steatohepatitis.
Nat Med, 29 (2023), pp. 2656-2664
[60]
T. Hardy, K. Wonders, R. Younes, G.P. Aithal, R. Aller, M. Allison, et al.
The European NAFLD Registry: a real-world longitudinal cohort study of nonalcoholic fatty liver disease.
Contemp Clin Trials, 98 (2020),
[61]
J. Lee, M. Westphal, Y. Vali, J. Boursier, S. Petta, R. Ostroff, et al.
Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH: A development and validation study.
[62]
O. Govaere, M. Hasoon, L. Alexander, S. Cockell, D. Tiniakos, M. Ekstedt, et al.
A proteo-transcriptomic map of non-alcoholic fatty liver disease signatures.
Nat Metab, 5 (2023), pp. 572-578
[63]
M. Pavlides, F.E. Mózes, S. Akhtar, K. Wonders, J. Cobbold, E.M. Tunnicliffe, et al.
Liver investigation: testing marker utility in steatohepatitis (LITMUS): assessment & validation of imaging modality performance across the NAFLD spectrum in a prospectively recruited cohort study (the LITMUS imaging study): Study protocol.
Contemp Clin Trials, 134 (2023),
[64]
A.S. Barritt, N. Gitlin, S. Klein, A.S. Lok, R. Loomba, L. Malahias, et al.
Design and rationale for a real-world observational cohort of patients with nonalcoholic fatty liver disease: The TARGET-NASH study.
Contemp Clin Trials, 61 (2017), pp. 33-38
[65]
A.J. Sanyal, B. Munoz, K. Cusi, A.S. Barritt, M. Muthiah, A.R. Mospan, et al.
Validation of a clinical risk-based classification system in a large nonalcoholic fatty liver disease real-world cohort.
Clin Gastroenterol Hepatol, 21 (2023), pp. 2889-2900.e10
[66]
H.P. Kim, M.O. Idowu, A.R. Mospan, A.G. Allmon, M. Roden, P. Newsome, et al.
Liver biopsy in the real world—reporting, expert concordance and correlation with a pragmatic clinical diagnosis.
Aliment Pharmacol Ther, 54 (2021), pp. 1472-1480
[67]
M.J. Thomson, M. Serper, V. Khungar, L.M. Weiss, H. Trinh, R. Firpi-Morell, et al.
Prevalence and factors associated with statin use among patients with nonalcoholic fatty liver disease in the TARGET-NASH study.
Clin Gastroenterol Hepatol, 20 (2022), pp. 458-460.e4
[68]
A.S. Barritt, S. Watkins, N. Gitlin, S. Klein, A.S. Lok, R. Loomba, et al.
Patient determinants for histologic diagnosis of NAFLD in the real world: a TARGET-NASH study.
Hepatol Commun, 5 (2021), pp. 938-946
[69]
R. Asiimwe, S. Lam, S. Leung, S. Wang, R. Wan, A. Tinker, et al.
From biobank and data silos into a data commons: convergence to support translational medicine.
J Transl Med, 19 (2021), pp. 493
[70]
T. Kendall, D. Tai, G. Ho, Y. Ren, E. Chng, J. Fallowfield.
Digital pathology using stain-free imaging indices allows direct prediction of all-cause mortality, hepatic decompensation and hepatocellular carcinoma development in patients with non-alcoholic fatty liver disease.
J Hepatol, 78 (2023), pp. S70-S71
[71]
S.A. Harrison, M. Bashir, S.E. Moussa, K. McCarty, J. Pablo Frias, R. Taub, et al.
Effects of resmetirom on noninvasive endpoints in a 36-week phase 2 active treatment extension study in patients with NASH.
Hepatol Commun, 5 (2021), pp. 573
[72]
P. Ramachandran, R. Dobie, J.R. Wilson-Kanamori, E.F. Dora, B.E.P. Henderson, N.T. Luu, et al.
Resolving the fibrotic niche of human liver cirrhosis at single-cell level.
Nature, 575 (2019), pp. 512-518
[73]
The Scottish Public Health Observatory. Chronic liver disease: international comparisons. 2023 [accessed Nov 21 2023]. Available from: https://www.scotpho.org.uk/health-conditions/chronic-liver-disease/data/international-comparisons/
[74]
R. Carlessi, E. Denisenko, E. Boslem, J. Köhn-Gaone, N. Main, N.D.B. Abu Bakar, et al.
Single-nucleus RNA sequencing of pre-malignant liver reveals disease-associated hepatocyte state with HCC prognostic potential.
[75]
M. Reisman.
EHRs: the challenge of making electronic data usable and interoperable.
Pharmacol Ther, 42 (2017), pp. 572-575
[76]
L. Oakden-Rayner, J. Dunnmon, G. Carneiro, C. Ré.
Hidden stratification causes clinically meaningful failures in machine learning for medical imaging.
Proceedings of the ACM conference on health, inference, and learning, pp. 151-159 http://dx.doi.org/10.1145/3368555.3384468
[77]
S.G. Finlayson, J.D. Bowers, J. Ito, J.L. Zittrain, A.L. Beam, I.S. Kohane.
Adversarial attacks on medical machine learning.
Science, 363 (2019), pp. 1287-1289
[78]
B. Szubert, J.E. Cole, C. Monaco, I. Drozdov.
Structure-preserving visualisation of high dimensional single-cell datasets.
[79]
N.V. Naoumov, D. Brees, J. Loeffler, E. Chng, Y. Ren, P. Lopez, et al.
Digital pathology with artificial intelligence analyses provides greater insights into treatment-induced fibrosis regression in NASH.
J Hepatol, 77 (2022), pp. 1399-1409
Copyright © 2023. Fundación Clínica Médica Sur, A.C.