Thyroid cytology: The reality before and after the introduction of ultrasound classification systems for thyroid nodules

Lopes, Sara Campos; Shah, Bijal; Eloy, Catarina

doi:10.1016/j.endinu.2022.06.008

Abstract

Background

Several ultrasound-based systems for classification of thyroid nodules are available. They allow for a better triage of the nodules that require cytological assessment, and lead to standardized recommendations. Our aim was to compare patients and nodules referred to fine-needle aspiration (FNA) before and after the introduction of these systems.

Methods

A retrospective study comparing two cohorts of patients referred for FNA was performed (386 patients and 463 nodules in 2015; 220 patients and 263 nodules in 2021).

Results

The sex distribution (89.1% vs 85.9% females, p=0.243), number of nodules referred to FNA per patient (median of 1), and the distribution of the Bethesda categories (p=0.082) were similar in both years. In 2021, patients were older (53.4±14.5 years vs 57.8±13.2 years, p<0.001) and nodules over one centimetre were larger (median 17.0mm vs 19.0mm, p=0.002), especially the ones categorized as Bethesda III (median size 11mm vs 23mm, p=0.043). In 2021, at least 23.1% of the nodules referred to FNA did not meet any criteria for FNA, and 38.8% of the nodules were not categorized by any system.

Conclusion

This analysis draws attention to the importance of systematically applying ultrasound-based classification systems. It seems that, by not being focused mainly on size thresholds, they allow for longer surveillance periods, without aggravating the cytology results when FNA becomes indicated. Nevertheless, greater efforts are needed to ensure more standardized reports, and to increase adherence to the resulting recommendations to reduce clinical uncertainty, unnecessary FNA, and overtreatment.

Keywords:

Thyroid nodule

Fine-needle

Biopsy

Ultrasonography

Resumen

Antecedentes

Existen varios sistemas basados en la ecografía para la clasificación de los nódulos tiroideos. Permiten un mejor triaje de los nódulos que requieren una evaluación citológica y conducen a recomendaciones estandarizadas. Nuestro objetivo fue comparar los pacientes y los nódulos remitidos para punción aspiración con aguja fina (PAAF) antes y después de la introducción de estos sistemas.

Métodos

Se realizó un estudio retrospectivo comparando 2 cohortes de pacientes remitidos para PAAF (386 pacientes y 463 nódulos en 2015; 220 pacientes y 263 nódulos en 2021).

Resultados

La distribución por género (89,1% vs. 85,9% mujeres, p=0,243), el número de nódulos remitidos para PAAF por paciente (mediana: 1) y la distribución de las categorías Bethesda (p=0,082) fue similar en ambos años. En 2021 los pacientes eran de mayor edad (53,4±14,5años vs. 57,8±13,2años, p<0,001) y los nódulos supracentimétricos eran mayores (mediana 17,0mm vs. 19,0mm, p=0,002), especialmente los Bethesda III (mediana 11mm vs. 23mm, p=0,043). En 2021 al menos el 23,1% de los nódulos remitidos a PAAF no tenían criterios, y el 38,8% de los nódulos no fueron categorizados por ningún sistema.

Conclusión

Este análisis llama la atención sobre la importancia de aplicar estos sistemas de clasificación. Parece que, al no estar centrados principalmente en el tamaño, permiten períodos de vigilancia más prolongados, sin agravar la citología cuando se indica la PAAF. No obstante, es necesario realizar mayores esfuerzos para garantizar informes más estandarizados y aumentar la adherencia a las recomendaciones resultantes, con el fin de reducir la incertidumbre clínica y las PAAF innecesarias.

Palabras clave:

Nódulo tiroideo

Aguja fina

Biopsia

Ultrasonografía

Full Text

Introduction

Thyroid nodules are common, with a prevalence as high as 68% in some ultrasonography series.1,2 The incidence of nodular thyroid disease and thyroid cancer has grown considerably in the last two decades, mainly due to the widespread use of imaging techniques leading to the detection of thyroid incidentalomas.3 Nevertheless, only a minority of these nodules are malignant (7–15%).4

Thyroid ultrasonography (US) is the gold standard for identifying and evaluating thyroid nodules via imaging.4 US helps to differentiate the nodules that need to be investigated with fine-needle aspiration (FNA) biopsy from the ones that only require surveillance. In order to improve diagnostic accuracy, decrease unwarranted FNA biopsies, and prevent overdiagnosis and overtreatment, several risk stratification systems based on the US features of thyroid nodules have been developed worldwide in the last few years.4–9 In general, these stratification systems focus on similar ultrasonographic features suggestive of malignancy, namely, hypoechogenicity, ill-defined margins, solid appearance, and the presence of calcifications.10

These updated and commonly used US classification systems include the system proposed in the American Thyroid Association (ATA) guidelines from 2015 (published in January 2016),4 the Thyroid Imaging Reporting and Data System (TIRADS) developed by the European Thyroid Association in 2017 (EU-TIRADS),6 and the TIRADS proposed by the American College of Radiology in 2017 (ACR-TIRADS).8 The impact of introducing these stratification US systems on the thyroid nodules referred for FNA and their cytological results is unknown.

The main aim of our study was to compare the patients and thyroid nodules referred to FNA and their cytological result before and after the introduction of these well-established US classification systems.

Materials and methods

Study design and patient selection

A retrospective study was conducted at the Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal. The study included all the patients that underwent US-guided thyroid FNA in this institution from 1st October to 10th November 2015 and from 1st October to 10th November 2021 by the same experienced pathologist (CE). The same pathologist performed the cytology report for all the nodules. The system used for the classification of the cytopathological findings was the Bethesda system.11

IPATIMUP is a reference centre for thyroid FNA with an active contract with the Administração Regional de Saúde do Norte (ARS Norte), where primary care providers can send patients for thyroid FNA biopsy and cytology. It has a pathology laboratory that is double accredited by the College of American Pathologists and by NP EN ISO 15189. The remaining patients are referred to IPATIMUP mainly from private practice, and occasionally from public hospitals after specific situations (consecutive Bethesda category I results and/or technically difficult to perform FNA biopsy). To undergo FNA biopsy, the patient needs to have an US report of the thyroid performed within the previous six months.

We collected the most recent data available, from 2021, and data from the same calendar period in 2015. Matching the periods was intended to avoid confounding factors, such as a referral bias arising from seasonal variation in seeking medical care, that could otherwise account for differences between the two samples. We chose 2015 for the comparison with 2021 because it was the last year before the new ATA guidelines4 were proposed (published in January 2016), followed by the EU-TIRADS6 and ACR-TIRADS8 scoring systems.

The data for characterizing, and then comparing, the patients and nodules from 2015 and 2021 were obtained from the available clinical information provided both by the US report and by the clinician that referred the patient to the IPATIMUP. The variables collected were: sex, age, number of nodules for which FNA was requested per patient, the nodules’ size, the US classification system used and the number of nodules with US criteria to perform FNA. The latter variable was decided based on the guidelines that were in force at the time.

In 2015, we considered that the nodule had indication for FNA if it met the criteria presented in the 2009 ATA guidelines, as they were the ones most widely accepted at the time12 and as other US classification systems with scoring systems, such as TIRADS (Horvath et al., 2009),5 were not commonly used in Portugal. To do this assessment, we only had access to the data provided by the clinician as, at the time, the US report was not added to the patients’ file as it is now in 2021. The most common information given was the nodule size and, sporadically, the echogenicity and the nodule composition (solid, cyst, mixed). There was no reference in any of the patients from 2015 of the use of any US classification system, while in the information given by the clinician in 2021, the EU-TIRADS/ACR-TIRADS/ATA classification was frequently reported. Nevertheless, it is relevant to take into consideration that the clinical information given by the clinicians was scarce both in 2015 and in 2021. However, in 2021, we had access to more data, as we had an US report available for analysis for each patient from the previous six months. This was due to the COVID-19 pandemic. Since the first lockdown in Portugal in 2020, in IPATIMUP, the US reports started being scanned and attached to the clinical file of the patients to prevent additional physical contact. The nodule size and the US classification described in the report were collected and used to assess if the nodule had criteria for FNA. The US reports that did not have any reference to an US classification system were labelled as such and considered “missing data” in terms of whether these nodules had criteria for FNA or not. Additionally, the origin of the referral was collected (primary care providers vs other specialties).

As mentioned above, our main aim was to compare the cohort of patients and thyroid nodules referred to FNA and their cytological result from 2015 with the 2021 cohort from the same period. Our secondary aim was to characterize and compare the subset of patients and thyroid nodules with an US classification described in the US report (group 1) with the remaining 2021 cohort (group 2) and with the 2015 cohort. When significant differences between group 1 and 2 were not detected on the comparative analysis, we considered the 2021 cohort as a homogenous cohort for that specific variable and did not describe the comparative analysis of group 1 versus the 2015 cohort.

Data analysis and statistics

Data were retrieved from the SISPAT® software and were exported to Microsoft Office Excel® 2013. The statistical analysis was conducted on Statistical Package for the Social Sciences (IBM® SPSS Statistics, version 26.0).

Categorical variables were presented as absolute value (n) and relative frequency (%) and summarized into tables.

To assess the normality of the distribution of the continuous variables of each sample, we performed the Shapiro–Wilk test. If normality was assumed, values were presented as mean±standard deviation. If normality was not assumed, values were presented as median and 25–75th percentile (P25–P75).
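The median and P25–P75 summary described above can be illustrated with a minimal Python sketch. The nodule sizes are hypothetical values chosen for illustration, not study data, and the function name `summarize_sizes` is an assumption. The "inclusive" quantile method is only one of several conventions and may differ from the one applied by SPSS.

```python
import statistics

def summarize_sizes(sizes_mm):
    """Return (median, P25, P75) for a list of nodule sizes in mm."""
    # With n=4, quantiles() returns the three quartile cut points
    p25, p50, p75 = statistics.quantiles(sizes_mm, n=4, method="inclusive")
    return p50, p25, p75

# Hypothetical sizes, for illustration only (not taken from the study)
example_sizes = [8, 12, 14, 16, 17, 19, 22, 24, 30]
median, p25, p75 = summarize_sizes(example_sizes)
print(f"median {median} mm (P25-P75: {p25}-{p75} mm)")
```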

Pearson's chi-square test was used to assess the associations between categorical variables. Student's t-test was used to compare means among normally distributed variables and the Mann–Whitney U test was used to analyze non-normally distributed variables.
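As an illustration of the chi-square comparison, the sketch below applies Pearson's test without continuity correction to the sex counts reported in Table 1. It is a hedged reconstruction using only the Python standard library, not the SPSS procedure used in the study; the function name `chi_square_2x2` is an assumption. Under these assumptions the result is close to the reported p=0.243.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from Table 1: male and female patients in 2015, then in 2021
chi2_sex, p_sex = chi_square_2x2(42, 344, 31, 189)
print(f"sex: chi2={chi2_sex:.3f}, p={p_sex:.3f}")
```

The same function can be applied to the source-of-referral counts in Table 1 (344 and 42 in 2015; 212 and 8 in 2021).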

A p-value below 0.05 was considered statistically significant.

Results

Population characteristics

In this study, 606 adult patients were included (n=386 in 2015; n=220 in 2021). Within the 2021 cohort, group 1 included n=133 patients (60.5%) and group 2 included n=87 patients (39.5%). The number of patients undergoing FNA at IPATIMUP was higher in 2015 in comparison with 2021 due to the physician's availability to schedule the patients (57.0% more patients in 2015). Table 1 shows the general characteristics of the population.

Table 1.

General characteristics of the patients from 2015 and from 2021.

Variable	2015 (n=386 patients)	2021 (n=220 patients)	p
Gender, n (%)
Male	42 (10.9)	31 (14.1)	0.243
Female	344 (89.1)	189 (85.9)

Age (years), mean±SD	53.4±14.5	57.8±13.2	<0.001

Source of referral, n (%)
Primary care provider	344 (89.1)	212 (96.4)	0.002
Other specialties	42 (10.9)	8 (3.6)

Patients submitted to FNA, n (%)
1 nodule	316 (81.9)	180 (81.8)	0.993
2 nodules	63 (16.3)	37 (16.8)
3 nodules	7 (1.8)	3 (1.4)

FNA, fine needle aspiration; SD, standard deviation.

In both years, the majority of patients were female (89.1% in 2015 vs 85.9% in 2021, p=0.243). The mean age was significantly lower in 2015 than in 2021 (53.4±14.5 years vs 57.8±13.2 years, p<0.001). The main source of referral was primary care providers, and this source was significantly more frequent in 2021 than in 2015 (96.4% vs 89.1%; p=0.002).
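The age comparison can be approximately re-derived from the summary statistics in Table 1. The sketch below computes a pooled-variance Student's t statistic; the function name is an assumption, and the p-value uses a normal approximation (reasonable only because the degrees of freedom are large), so it is not the exact value SPSS would report.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic (pooled variance) from group means, SDs and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean2 - mean1) / se
    # Assumption: with several hundred degrees of freedom the t distribution
    # is close to the standard normal, so the two-sided p is approximated.
    p_approx = math.erfc(abs(t) / math.sqrt(2))
    return t, df, p_approx

# Mean age, SD and number of patients in 2015 and in 2021 (Table 1)
t_age, df_age, p_age = pooled_t_from_summary(53.4, 14.5, 386, 57.8, 13.2, 220)
print(f"age: t={t_age:.2f}, df={df_age}, approximate p={p_age:.5f}")
```

Under these assumptions the approximate p-value falls below 0.001, in line with the reported result.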

No differences were observed between group 1 and group 2 regarding sex distribution (p=0.138), mean age (p=0.096) and source of referral (p=0.904).

Characteristics of the thyroid nodules

A total of 471 nodules in 2015 and 273 nodules in 2021 were analyzed. Within the 2021 cohort, group 1 included 61.2% of the nodules (n=167) and group 2 included 38.8% of the nodules (n=106). These totals included all the nodules undergoing FNA biopsy (n=463 in 2015 and n=263 in 2021), plus the additional n=8 nodules in 2015 and n=10 nodules in 2021 for which FNA was requested but not performed. These 18 nodules were not biopsied because they were small (below 10mm), lacked suspicious US features, and were part of a multinodular disease.

The median nodule size in 2015 was 16.0mm (P25–P75: 12.0–22.0mm, ranging between 4mm and 66mm), while the median nodule size in 2021 was 17.0mm (P25–P75: 14.0–24.0mm, ranging between 5mm and 84mm). This difference in the size of the nodules was statistically significant (p=0.022) and seemed to be mainly due to an increase in the size of nodules 10mm or larger in 2021 (17.0mm vs 19.0mm, respectively, p=0.002). The nodules under 10mm had similar medians (8.0mm vs 8.05mm, p=0.873). When comparing group 1 with the 2015 cohort, identical findings were obtained regarding median nodule size [17mm (P25–P75: 14–24mm) vs 16.0mm (P25–P75: 12–22mm), p=0.016], also due to the nodules larger than one centimetre [20mm (P25–P75: 15–27mm) vs 17mm (P25–P75: 13–22mm), p<0.001], and not the ones under one centimetre in size (8.0mm vs 8.05mm, p=0.873).

Regarding the 2021 cohort, no significant differences regarding the median nodule size were detected between group 1 and group 2 [19.0mm (P25–P75: 14.0–26.0mm) vs 16.7mm (P25–P75: 14.0–23.0mm), p=0.326]; however, when dividing the nodules into under 10mm and 10mm or larger, similar results to those previously mentioned were detected. There were no differences regarding the nodules under one centimetre in size (8.0mm vs 9.0mm, p=0.498), but the nodules over one centimetre in size were significantly bigger in group 1 (20mm vs 17mm, p=0.043).
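The nodule size comparisons above rely on the Mann–Whitney U test. Since the individual measurements are not available here, the following is only a sketch on hypothetical sizes, using the standard library and a normal approximation without tie correction; the function name and the example values are assumptions, and the output is not a reproduction of any study result.

```python
import math

def mann_whitney_u(x, y):
    """Mann-Whitney U for two samples, with a normal-approximation p-value.

    Ties receive average ranks. The variance omits the tie correction,
    so the p-value is only a rough approximation when ties are frequent.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (min(u1, u2) - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, p

# Hypothetical nodule sizes in mm, for illustration only
sizes_a = [12, 14, 16, 17, 22]
sizes_b = [14, 17, 19, 24, 26]
print(mann_whitney_u(sizes_a, sizes_b))
```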

At least 3.2% of the nodules were under 10mm in 2015 (n=15), as opposed to in 2021 which was almost double (6.6%; n=18), although this was not a statistically significant result (p=0.155). The size of the nodules was not recorded in 23.1% (n=109) of the nodules in 2015 and in 1.5% (n=4) of the nodules in 2021 (n=3 within group 1 – with an associated US categorization).

The median number of nodules for which FNA was requested per patient was one in both 2015 and 2021 (range in 2015: 1–5, range in 2021: 1–4). There was information regarding the number of nodules for which FNA biopsy was specifically requested in 428 nodules in 2015 (90.9%), and 271 nodules in 2021 (99.3%). There was no specific request in n=30 patients from 2015 (7.8%) and in n=2 patients from 2021 (0.9%) that were excluded from this analysis.

Of the 744 nodules analyzed, there were data available to assess criteria for undergoing FNA in 34.6% of the nodules from 2015 (n=163) and in 60.8% of the nodules from 2021 (n=166). Therefore, the availability of data enabling the criteria for FNA to be assessed was significantly better in 2021 (p<0.001). Within the 2021 cohort, there was also a significant difference between group 1 and group 2 regarding the available information to make this decision (p<0.001); the vast majority of the nodules where we could assess the criteria for FNA were from group 1 (n=163; 98.2%). The median number of nodules with an indication for FNA biopsy per patient was 1 in both years (range in 2015: 0–3, range in 2021: 0–4). In 2021, FNA was requested for at least 23.1% (n=63) of all the nodules even though they did not meet US criteria; nodules meeting the criteria numbered n=158 (33.5%) in 2015 and n=103 (37.7%) in 2021. In both 2015 and 2021, none of the nodules reported to be under 10mm and for which FNA was requested had criteria for FNA biopsy (n=15 and n=18, respectively).

The majority of the patients in both years underwent FNA of a single nodule (81.9% vs 81.8%, p=0.993). Therefore, the median number of nodules undergoing FNA per patient was one in both years (range in both years: 1–3).

US classification systems

None of the nodules from 2015 had any clinical information referring to the use of any risk stratification system. In the 2021 cohort, 60.5% of the patients had an US report with data regarding an US classification system (61.2% of the nodules; n=167); these formed group 1. More than one-third of the nodules from 2021 were not categorized using an US classification system (38.8%, n=106). The most frequently used US classification system in the sample from 2021 was the EU-TIRADS (39.2%, n=107) (Table 2). The majority of the nodules with an US classification were ranked as class 4 (46.5%, n=74) and 3 (35.6%, n=57). None were classified as class 1. Grading by US classification system used is reported in Table 2. There was a positive and significant correlation between having criteria to undergo FNA and the US classification of the 2021 nodules (p<0.001).

Table 2.

Distribution of the 273 nodules from 2021 as per the US classification system used and corresponding category.

US classification system used n (%)	EU-TIRADS	ACR-TIRADS	ATA	Not referred
	107 (39.2)	58 (21.2)	2 (0.7)	106 (38.8)

US classification system category	EU-TIRADS (n=107)	ACR-TIRADS (n=58)	ATA (n=2)	Not known (n=106)
Class 1, n (%)	0 (0)	0 (0)	0 (0)	NA
Class 2, n (%)	3 (2.8)	3 (5.2)	1 (50)	NA
Class 3, n (%)	29 (27.1)	27 (46.6)	1 (50)	NA
Class 4, n (%)	52 (48.6)	22 (37.9)	0 (0)	NA
Class 5, n (%)	16 (15.0)	5 (8.6)	0 (0)	NA
Not referred, n (%)	7 (6.5)	1 (1.7)	0 (0)	106 (100)

ACR, American College of Radiology; ATA, American Thyroid Association; EU, European; NA, not applicable; TIRADS, Thyroid Imaging Reporting and Data System; US, ultrasound.
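The relative frequencies in the first part of Table 2 follow directly from the nodule counts. A short sketch of that arithmetic is given below; the dictionary layout and the function name are assumptions made for illustration.

```python
# Nodule counts per US classification system in 2021, taken from Table 2
table2_counts = {"EU-TIRADS": 107, "ACR-TIRADS": 58, "ATA": 2, "Not referred": 106}

def relative_frequencies(counts):
    """Return the total and each count as a percentage of it (one decimal)."""
    total = sum(counts.values())
    return total, {k: round(100 * v / total, 1) for k, v in counts.items()}

total_nodules, percentages = relative_frequencies(table2_counts)
classified = total_nodules - table2_counts["Not referred"]
print(total_nodules, percentages)
print(f"classified: {classified} ({round(100 * classified / total_nodules, 1)}%)")
```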

Cytology report

Table 3 shows the distribution of the 726 nodules submitted to FNA, as per the cytology report, according to the Bethesda classification11 (n=463 nodules in 2015; n=263 nodules in 2021). There were no differences in terms of distribution of the Bethesda categories between 2015 and 2021 (p=0.082). The same occurred when performing the analysis separately for the nodules that had criteria for FNA (p=0.185) and the nodules that did not have criteria for FNA (p=0.974). Within the 2021 cohort, there were no differences in the overall distribution of Bethesda categories (p=0.469). When performing the analysis for the nodules without an indication for FNA, no differences were detected (p=0.988). However, when evaluating exclusively nodules with indication for FNA, a significant difference was observed for the Bethesda III category nodules that was not taken into consideration due to the small sample size (n=5; n=5). In line with that, considering this small sample size effect for many of the Bethesda categories apart from Bethesda II in group 1, definite conclusions should not be drawn from comparisons between group 1 and the 2015 cohort for the Bethesda I and III to VI categories. When comparing the median nodule sizes for the Bethesda II category nodules, there were significant differences between groups (17mm vs 19mm, p=0.029).

Table 3.

Distribution of the nodules undergoing FNA, according to the Bethesda classification.

Bethesda classification	2015 (n=463)	2021 (n=263)	2021, group 1 (n=102)	2021, group 2 (n=161)
I – Non diagnostic, n (%)	58 (12.5)	24 (9.1)	7 (29.2)	17 (70.8)
II – Benign, n (%)	366 (79.1)	215 (81.7)	87 (40.5)	128 (59.5)
III – AUS/FLUS, n (%)	9 (1.9)	10 (3.8)	5 (50)	5 (50)
IV – Follicular tumour, n (%)	20 (4.3)	10 (3.8)	3 (30)	7 (70)
V – Suspicious for malignancy, n (%)	1 (0.2)	3 (1.1)	0 (0)	3 (100)
VI – Malignant, n (%)	9 (1.9)	1 (0.4)	0 (0)	1 (100)

AUS/FLUS: Atypia of Undetermined Significance or Follicular Lesion of Undetermined Significance.

Regarding the size of the thyroid nodules, when performing the analysis separately for the nodules that were under 10mm and the nodules that were 10mm or larger, there were no differences in terms of Bethesda classification between 2015 and 2021 (p=0.586 and p=0.091, respectively). The most common Bethesda categories in any of these subgroups were Bethesda I and II. However, when comparing the median nodule sizes per Bethesda category, there were significant differences between 2015 and 2021, specifically for the Bethesda II and Bethesda III nodules (Table 4). The number of Bethesda V and VI nodules was small [n=4 (n=1 in 2015, n=3 in 2021) and n=10 (n=9 in 2015, n=1 in 2021), respectively], with a wide range of nodule sizes (range 6.8–74mm and 6.5–22mm, respectively). Within the 2021 cohort, there were no differences between group 1 and group 2 regarding the median size of the nodules per Bethesda category, even when performing the analysis separately for the nodules under 10mm and 10mm or larger (p>0.05). Table 5 summarizes the distribution of the nodules from 2021 that underwent FNA by US classification system category and the corresponding Bethesda category. In 41.4% of the 2021 nodules undergoing FNA (n=109), the US classification system categorization was not applied.

Table 4.

Comparison of the median nodule size per Bethesda category between 2015 and 2021.

Bethesda classification	2015 (n=463) median size, mm	2021 (n=263) median size, mm	p
I – Non diagnostic	14	14	0.642
II – Benign	17	19	0.036
III – AUS/FLUS	11	23	0.043
IV – Follicular tumour	17	14.9	0.352
V – Suspicious for malignancy	15	30	0.655
VI – Malignant	11	22	0.377

AUS/FLUS: Atypia of Undetermined Significance or Follicular Lesion of Undetermined Significance.

Table 5.

Distribution of the nodules from 2021 undergoing FNA, as per their US classification system category and Bethesda classification (n=263).

US classification system category	Bethesda I, n (%)	Bethesda II, n (%)	Bethesda III, n (%)	Bethesda IV, n (%)	Bethesda V, n (%)	Bethesda VI, n (%)
EU-TIRADS 2 (n=3)	0 (0)	3 (1.4)	0 (0)	0 (0)	0 (0)	0 (0)
EU-TIRADS 3 (n=29)	2 (8.3)	24 (11.2)	1 (10)	1 (10)	1 (33.3)	0 (0)
EU-TIRADS 4 (n=52)	5 (20.8)	42 (19.5)	1 (10)	3 (30)	1 (33.3)	0 (0)
EU-TIRADS 5 (n=16)	2 (8.3)	13 (6.0)	0 (0)	1 (10)	0 (0)	0 (0)
ACR-TIRADS 2 (n=2)	1 (4.2)	1 (0.5)	0 (0)	0 (0)	0 (0)	0 (0)
ACR-TIRADS 3 (n=26)	3 (12.5)	21 (9.8)	2 (20)	0 (0)	0 (0)	0 (0)
ACR-TIRADS 4 (n=21)	3 (12.5)	17 (7.9)	0 (0)	1 (10)	0 (0)	0 (0)
ACR-TIRADS 5 (n=4)	1 (4.2)	2 (0.9)	0 (0)	1 (10)	0 (0)	0 (0)
ATA 3 (n=1)	0 (0)	1 (0.5)	0 (0)	0 (0)	0 (0)	0 (0)
Unknown (n=109)	7 (29.2)	91 (42.3)	6 (60)	3 (30)	1 (33.3)	1 (100)

TOTAL	24 (100)	215 (100)	10 (100)	10 (100)	3 (100)	1 (100)

ACR, American College of Radiology; ATA, American Thyroid Association; EU, European; NA, Not applicable; TIRADS, Thyroid Imaging Reporting and Data System; US, Ultrasound.

Discussion

It is crucial to accurately distinguish between benign and malignant nodules, to avoid unnecessary surgical procedures, as well as to alleviate potential patients’ psychological distress and reduce healthcare costs.

US is the recommended imaging exam for guiding the initial management of thyroid nodules and several US-based classification systems for risk stratification are currently available.4–9 These systems allow for a standardized risk-stratification assessment of thyroid nodules, leading to more consistent clinical decisions and better agreement between clinicians.4–9,13–16 They are intended to reduce the number of nodules with an indication for cytological assessment, and consequently, decrease the workload and costs associated with the workup of nodules that would be either benign or malignancies unlikely to cause harm during the patient's lifespan.14,15,17 These systems include those used in this study: EU-TIRADS, ACR-TIRADS and ATA.4,6,8

Our results suggest that a change in the management of thyroid nodules in the North region of Portugal might have happened since the US-based classification systems for thyroid nodules were published.4,6,8 In comparison with 2015, there was a significant adherence to US classification systems in 2021, with most of the nodules being categorized according to one of these systems (61.2%), mainly the EU-TIRADS (39.2%) and ACR-TIRADS systems. Although developed by an American society, the ACR-TIRADS was used in 21% of the available US reports. This might be due to its well-validated performance and associated higher specificity in triaging thyroid nodules when compared to the EU-TIRADS.15 The ATA4 classification was rarely used (only described in one US report), possibly due to its pattern-based system, instead of the points-based system of ACR-TIRADS and EU-TIRADS, leading to some nodules not being categorized.

Nevertheless, some features of the population and respective nodules submitted to FNA biopsy were similar between 2015 and 2021, namely the sex distribution (mainly females, as expected according to the literature18,19), the number of nodules for which FNA was requested per patient, the number of nodules undergoing FNA and the Bethesda classification of the nodules.

The most significant differences recorded between the two years were as follows: the patients in 2021 were 4.4 years older than in 2015, the nodules were significantly larger in 2021 than in 2015, mainly due to nodules larger than 10mm, and the percentage of patients referred from primary care providers was higher in 2021 (96.4% vs 89.1% in 2015; p=0.002). These differences may have been due to at least two potential factors. First, the implementation of the US classification systems potentially allowed for extended surveillance periods, leading to older patients and larger nodules. This observation is further corroborated by the comparison of group 1 with group 2 and with the 2015 cohort. The nodules categorized with an US system (group 1) were significantly larger, apparently driven by the nodules larger than one centimetre (3mm larger in diameter when compared to the nodules from group 2 and the 2015 cohort). Second, the differences may also reflect the impact of the COVID-19 pandemic, which delayed access to health care services and created financial and social difficulties, leading to fewer patients resorting to private practice and a higher referral rate from primary care providers.

Comparing the nodule size per Bethesda category, both Bethesda II and III category nodules were significantly larger in 2021 (17mm vs 19mm, p=0.036; 11mm vs 23mm, p=0.043, respectively). Bethesda V and VI nodules should not be compared regarding their dimensions due to the small sample size. The size difference between the Bethesda III nodules (12mm) is clinically relevant, since performing FNA in larger nodules is usually easier, more successful, and more sensitive.20,21 However, it is important to mention that the sample size of the Bethesda III nodules was small. It seems that the US classification systems allow FNA to be delayed without aggravating the cytology results when FNA becomes indicated, leading to larger nodules undergoing FNA in 2021. Even the detected size differences of 1 or 2mm, although not clinically relevant, might be indicative of this positive trend. As the Bethesda IV–VI categories have small sample sizes, the absence of detectable differences in nodule dimensions and in Bethesda category distribution might be due to the low statistical power associated with small populations. According to Tables 3 and 4, we can perceive an apparent tendency towards bigger nodules in 2021 across the six Bethesda categories and a slight change in the distribution of the nodules throughout these categories from 2015 to 2021. Regarding the Bethesda category distribution, we could not make correlations with the US risk stratifying systems, as an US classification system categorization was not applied in 41.4% of the 2021 nodules undergoing FNA. However, the most reported US category was EU-TIRADS 4. We reported n=16 EU-TIRADS 5 nodules and n=5 ACR-TIRADS 5 nodules; however, they did not correspond to the Bethesda V and VI nodules of the 2021 cohort. As shown in Table 5, none of these Bethesda V–VI nodules were classified as EU-TIRADS 5 or ACR-TIRADS 5. The majority of the EU-TIRADS 5 and ACR-TIRADS 5 nodules undergoing FNA were Bethesda II. Nonetheless, in n=1 Bethesda V and n=1 Bethesda VI nodule, no US classification system was applied, so we cannot exclude that they would have been classified as EU-TIRADS 5 or ACR-TIRADS 5. We acknowledge a very low percentage of Bethesda V and VI nodules in our cohorts. Possible explanations for the scarcity of higher risk Bethesda categories are the high percentage of nodules referred for FNA that did not have criteria for it, and the fact that the US examinations were performed by multiple radiologists with different degrees of experience in thyroid US (the majority not being exclusively dedicated to this type of US). Another important factor to consider is a potential referral bias associated with IPATIMUP itself, where the majority of the FNAs are requested by primary care providers, as opposed to the hospital setting, which has a higher percentage of referrals from hospital specialties and of nodules diagnosed in more complex contexts. Additionally, some more complicated situations are sent to IPATIMUP from private practice, and occasionally from public hospitals, namely, consecutive Bethesda category I or III results and/or technically difficult to perform FNA biopsies. These nodules might not have initially had an indication for FNA. As the low rate of Bethesda V and VI was observed both in 2015 and 2021, a potential referral bias and a high percentage of referred nodules without an indication for FNA are the most plausible justifications. A further topic of research at IPATIMUP will be to assess, in a larger population, the potential variables responsible for this atypically low percentage of Bethesda V and VI categories in our sample and to have the statistical power to correlate the Bethesda categories with the US classification system categories.

The most recent US-based classification systems4,6,8 (and the ones included in this study) are more conservative than the guidelines in force in 20155,12 regarding nodule size thresholds that indicate cytological assessment. Nodules under one centimetre are no longer an indication for FNA, even if high-risk factors are present, as the course of papillary microcarcinoma is usually indolent.4 Consequently, this leads to fewer FNA. However, in 2021, nodules under 10mm were still being referred to FNA. At least 6.6% of nodules referred in 2021 were under 10mm, as opposed to at least 3.2% in 2015. Nevertheless, in 2015 we did not have access to the nodule size in 33.5% of the nodules, and therefore a direct comparison between the two years should be interpreted with caution. Even with several risk-stratification systems available in 2021, nearly 40% of the nodules were still not being stratified with an ultrasound classification system, preventing the clinician from assessing the indication for FNA in virtually all of those nodules. This is in line with a recent study reporting inconsistent and incomplete descriptions of thyroid nodules, with some US features only being reported in as few as 9% of the nodules.22 Consequently, nearly all the 2021 nodules where we were able to assess the criteria for FNA were nodules classified with an US risk stratifying system (98.2%), and the n=3 nodules without this categorization for which we could assess the indication for FNA were assessable because of their large dimensions (more than 20mm in diameter). Additionally, at least 38% of the nodules referred to FNA in 2021 did not meet the criteria for it, showing that clinicians do not always follow the recommendations associated with each risk category. This might partly explain why the distribution of the Bethesda categories did not significantly change from 2015 to 2021.

The ongoing referral for biopsy of nodules under one centimetre and of other nodules without an indication for cytological assessment also shows that other factors are at play, independently of these classification systems. We speculate that contributing factors include patients’ expectations, as research shows that satisfaction with office consultations suffers and health-related anxiety increases when desired diagnostic interventions are not obtained,23,24 and, somewhat relatedly, the rise of defensive medicine in Europe.25

It is pertinent to mention two important caveats concerning our data. We had a significant percentage of missing data regarding nodule size in 2015 (33.5% of the nodules) and regarding information on US criteria to undergo FNA in both 2015 and 2021 (65.4% vs 39.2%, respectively). To assess the indication for FNA of these nodules in 2015, we had to rely on that limited data (mainly the nodule size). This leads to a potential information bias, as we could not take into consideration specific risk factors such as a history of thyroid cancer in first-degree relatives, a history of external beam radiation as a child, previous hemithyroidectomy with discovery of thyroid cancer, and 18FDG avidity on PET scans.12 These factors are no longer considered in the updated systems,4,6,8 facilitating their application. This study has other limitations. Thyroid US was performed by multiple radiologists, introducing operator-driven variability that could not be controlled. However, this can also be seen as a strength of the study, since the wide diversity of examiners with different levels of expertise led to a sample more representative of daily clinical practice. Another limitation was the retrospective design of this study, which led to gaps in information. The higher number of patients who underwent FNA in 2015 (386 vs 220 patients, about 75% more) was due to institutional constraints on the availability of the pathologist performing this technique in the previous months, which led to a longer waiting list in October 2015.

Conclusion

This is the first study to compare the reality of thyroid FNA and cytology in Portugal before and after the publication of the most commonly used US-based classification systems for thyroid nodules. To the best of our knowledge, it is also the first published study worldwide to make this comparison and assessment. This analysis draws attention to the importance of systematically applying US-based classification systems for thyroid nodules in daily clinical practice. It seems that these systems, by not being focused mainly on size thresholds, allow for extended surveillance periods, without aggravating the cytology results when FNA becomes indicated. FNA potentially becomes safer as nodules are allowed to reach larger dimensions. Nevertheless, greater efforts are needed to ensure more standardized US reports by systematically using these risk-stratification systems and, when they are applied, to increase adherence to the resulting recommendations. This will allow more consistent clinical decisions, with a decrease in clinical uncertainty, unnecessary FNA biopsies, and potential overtreatment.

Ethical considerations

All procedures were in accordance with the ethical standards of the institutional ethics committee and in line with the principles of the Declaration of Helsinki.

Funding sources

This research did not receive any specific grant from funding agencies.

Conflict of interest

The authors declare no conflicts of interest in connection with this article.

Acknowledgments

Not applicable.

References

[1]

S. Guth, U. Theune, J. Aberle, A. Galach, C.M. Bamberger.

Very high prevalence of thyroid nodules detected by high frequency (13MHz) ultrasound examination.

Eur J Clin Invest, 39 (2009), pp. 699-706

http://dx.doi.org/10.1111/j.1365-2362.2009.02162.x

[2]

D.S. Dean, H. Gharib.

Epidemiology of thyroid nodules.

Best Pract Res Clin Endocrinol Metab, 22 (2008), pp. 901-911

http://dx.doi.org/10.1016/j.beem.2008.09.019

[3]

S. Vaccarella, L. Dal Maso, M. Laversanne, F. Bray, M. Plummer, S. Franceschi.

The impact of diagnostic changes on the rise in thyroid cancer incidence: a population-based study in selected high-resource countries.

Thyroid, 25 (2015), pp. 1127-1136

http://dx.doi.org/10.1089/thy.2015.0116

[4]

B.R. Haugen, E.K. Alexander, K.C. Bible, G.M. Doherty, S.J. Mandel, Y.E. Nikiforov, et al.

2015 American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer.

Thyroid, 26 (2016), pp. 1-133

http://dx.doi.org/10.1089/thy.2015.0020

[5]

E. Horvath, S. Majlis, R. Rossi, C. Franco, J.P. Niedmann, A. Castro, et al.

An ultrasonogram reporting system for thyroid nodules stratifying cancer risk for clinical management.

J Clin Endocrinol Metab, 94 (2009), pp. 1748-1751

http://dx.doi.org/10.1210/jc.2008-1724

[6]

G. Russ, S.J. Bonnema, M.F. Erdogan, C. Durante, R. Ngu, L. Leenhardt.

European thyroid association guidelines for ultrasound malignancy risk stratification of thyroid nodules in adults: the EU-TIRADS.

Eur Thyroid J, 6 (2017), pp. 225-237

http://dx.doi.org/10.1159/000478927

[7]

H. Gharib, E. Papini, J.R. Garber, D.S. Duick, R.M. Harrell, L. Hegedüs, et al.

AACE/ACE/AME Task Force on Thyroid Nodules. American Association of Clinical Endocrinologists, American College of Endocrinology, and Associazione Medici Endocrinologi medical guidelines for clinical practice for the diagnosis and management of thyroid nodules – 2016 update.

Endocr Pract, 22 (2016), pp. 622-639

http://dx.doi.org/10.4158/EP161208

[8]

F.N. Tessler, W.D. Middleton, E.G. Grant, J.K. Hoang, L.L. Berland, S.A. Teefey, et al.

ACR Thyroid Imaging, Reporting and Data System (TI-RADS): white paper of the ACR TI-RADS committee.

J Am Coll Radiol, 14 (2017), pp. 587-595

http://dx.doi.org/10.1016/j.jacr.2017.01.046

[9]

E.J. Ha, S.R. Chung, D.G. Na, H.S. Ahn, J. Chung, J.Y. Lee, et al.

2021 Korean thyroid imaging reporting and data system and imaging-based management of thyroid nodules: Korean Society of Thyroid Radiology Consensus Statement and Recommendations.

Korean J Radiol, 22 (2021), pp. 2094-2123

http://dx.doi.org/10.3348/kjr.2021.0713

[10]

L.R. Remonti, C.K. Kramer, C.B. Leitão, L.C. Pinto, J.L. Gross.

Thyroid ultrasound features and risk of carcinoma: a systematic review and meta-analysis of observational studies.

Thyroid, 25 (2015), pp. 538-550

http://dx.doi.org/10.1089/thy.2014.0353

[11]

E.S. Cibas, S.Z. Ali.

The 2017 Bethesda system for reporting thyroid cytopathology.

Thyroid, 27 (2017), pp. 1341-1346

http://dx.doi.org/10.1089/thy.2017.0500

[12]

D.S. Cooper, G.M. Doherty, B.R. Haugen, R.T. Kloos, S.L. Lee, S.J. Mandel, et al.

American Thyroid Association (ATA) Guidelines Taskforce on Thyroid Nodules and Differentiated Thyroid Cancer. Revised American Thyroid Association management guidelines for patients with thyroid nodules and differentiated thyroid cancer.

Thyroid, 19 (2009), pp. 1167-1214. Erratum in: Thyroid, 20 (2010), pp. 674-675

http://dx.doi.org/10.1089/thy.2009.0110

[13]

G. Grani, L. Lamartina, V. Cantisani, M. Maranghi, P. Lucia, C. Durante.

Interobserver agreement of various thyroid imaging reporting and data systems.

Endocr Connect, 7 (2018), pp. 1-7

http://dx.doi.org/10.1530/EC-17-0336

[14]

Z. Pang, M. Margolis, R.J. Menezes, H. Maan, S. Ghai.

Diagnostic performance of 2015 American Thyroid Association guidelines and inter-observer variability in assigning risk category.

Eur J Radiol Open, 6 (2019), pp. 122-127

http://dx.doi.org/10.1016/j.ejro.2019.03.002

[15]

D. Seminati, G. Capitoli, D. Leni, D. Fior, F. Vacirca, C. Di Bella, et al.

Use of diagnostic criteria from ACR and EU-TIRADS systems to improve the performance of cytology in thyroid nodule triage.

Cancers (Basel), 13 (2021), pp. 5439

http://dx.doi.org/10.3390/cancers13215439

[16]

J.K. Hoang, W.D. Middleton, A.E. Farjat, S.A. Teefey, N. Abinanti, F.J. Boschini, et al.

Interobserver variability of sonographic features used in the american college of radiology thyroid imaging reporting and data system.

AJR Am J Roentgenol, 211 (2018), pp. 162-167

http://dx.doi.org/10.2214/AJR.17.19192

[17]

A. Persichetti, E. Di Stasio, R. Guglielmi, G. Bizzarri, S. Taccogna, I. Misischi, et al.

Predictive value of malignancy of thyroid nodule ultrasound classification systems: a prospective study.

J Clin Endocrinol Metab, 103 (2018), pp. 1359-1368

http://dx.doi.org/10.1210/jc.2017-01708

[18]

S. Ezzat, D.A. Sarti, D.R. Cain, G.D. Braunstein.

Thyroid incidentalomas. Prevalence by palpation and ultrasonography.

Arch Intern Med, 154 (1994), pp. 1838-1840

http://dx.doi.org/10.1001/archinte.154.16.1838

[19]

L. Xu, F. Zeng, Y. Wang, Y. Bai, X. Shan, L. Kong.

Prevalence and associated metabolic factors for thyroid nodules: a cross-sectional study in Southwest of China with more than 120 thousand populations.

BMC Endocr Disord, 21 (2021), pp. 175

http://dx.doi.org/10.1186/s12902-021-00842-2

[20]

M. Shrestha, B.A. Crothers, H.B. Burch.

The impact of thyroid nodule size on the risk of malignancy and accuracy of fine-needle aspiration: a 10-year study from a single institution.

Thyroid, 22 (2012), pp. 1251-1256

http://dx.doi.org/10.1089/thy.2012.0265

[21]

D.W. Kim, E.J. Lee, S.H. Kim, T.H. Kim, S.H. Lee, D.H. Kim, et al.

Ultrasound-guided fine-needle aspiration biopsy of thyroid nodules: comparison in efficacy according to nodule size.

Thyroid, 19 (2009), pp. 27-31

http://dx.doi.org/10.1089/thy.2008.0106