Reliable assessment of individuals with Parkinson's disease (PD) is essential for providing adequate treatment. Clinical assessment is a complex and time-consuming task, especially for bradykinesia, since its evaluation can be influenced by the examiner's degree of experience, patient collaboration and individual bias. The clinical evaluation can be improved by considering assessments from several professionals. However, this is only true when inter- and intra-rater agreement is high. During the COVID-19 pandemic, the Movement Disorder Society highlighted the need to develop and validate technologies for the remote assessment of the motor status of people with PD. Thus, this study introduces an objective strategy for the remote evaluation of bradykinesia using multi-specialist analysis.
Methods
Twelve volunteers with PD were asked to execute finger tapping, hand opening/closing and pronation/supination movements. For each patient, each task was recorded and rated by fourteen PD health experts. The scores were assessed on an individual basis. Intra- and inter-rater agreement and correlation were estimated.
Results
Agreement and correlation between experienced examiners were high, with low variability. In addition, group analysis showed the potential to resolve individual inconsistency bias.
Conclusion
This study demonstrated the need for a group with prior training and experience, and indicated the importance of developing a clinical protocol that uses telemedicine for the evaluation of individuals with PD, including a specialized mediating group. In addition, this research contributes to the development of a valid remote assessment of bradykinesia.
The analysis of agreement among examiners is present in several scientific studies in the health field,1 since evaluating neurological patients in clinical practice is a complex and time-consuming task.2 In particular, Parkinson's disease, with its high incidence3 and wide range of signs and symptoms, requires good knowledge and specialist analysis for a correct diagnosis.2 Different statistical methods are used to perform an agreement analysis; they compare examiners and reveal the variations between them.1 The most widely used methods are Cohen's Kappa, the intraclass correlation coefficient (ICC)2,4,5 and the Bland–Altman method.5
Usually, in clinical practice, the assessment of signs and symptoms of PD is performed by a single professional, which may result in limited reliability. A multi-examiner approach may improve the quality of the clinical evaluation, contributing to better management of the disorder. Remote evaluation of patients can ease the participation of national and international specialists in the assessment of individuals with PD. In this sense, telemedicine can be proposed for remote assessment.5
In 2020, the Movement Disorder Society highlighted the urgent need for the development and validation of technologies for remote assessment of PD, especially during the COVID-19 pandemic.6 In this context, telemedicine has the potential to meet the social demands of the current scenario. There is therefore a trend towards changes in the care of patients with PD and other movement disorders, indicating acceptance of evaluation via telemedicine,7 either as an initial assessment for diagnosis or as continuous monitoring over the course of the disease.
The current global COVID-19 emergency has driven a rapid reorganization of healthcare systems towards telemedicine, with safety as a priority, allowing the diagnostic–therapeutic process to continue for patients.8 Telemedicine assistance is essential for facilitating outpatient consultations while decreasing costs. To this end, the MDS telemedicine group created a guide for the remote assessment of PD, which underlines the importance of telemedicine.9
However, even under such critical circumstances, telemedicine interaction still needs changes and adjustments. In their study, Spear et al.10 pointed out some difficulties with real-time telemedicine interaction with Parkinson's disease patients: (i) the lack of clinical interaction when performing a physical examination; (ii) the absence of closeness with the clinician, since a close relationship could not be developed; (iii) internet problems at the time of the virtual visit; and (iv) interest and access, both of which are related and limited to demographic, socioeconomic, and cultural conditions, as most participants were white and had a high level of education. These factors show that many problems still exist in the application of telemedicine.
In general, PD patients need continuous treatment and medication adjustments. This necessity becomes a worldwide problem when patients do not have access to specialists for periodic diagnoses or follow-up assessments of PD. This fact highlights important issues: (i) the absence of a quality service, especially as professionals become overloaded; (ii) the lack of a sufficient number of specialized professionals, which limits diagnosis and treatment to a single opinion; (iii) assessments conducted in unfavourable environments, showing the need to promote evaluations in familiar environments that avoid stress or discomfort.2 In light of the above, the question arises of how this scenario can be avoided or reversed while increasing the implementation of telemedicine, since it provides urgent and continuous health care.11 Moreover, even with all the efforts implemented, it remains unclear whether the current telemedicine service meets all the demands placed on it.
Stimulated by factors such as the global emergency caused by the pandemic and the need to provide access to specialists and improve health services via telemedicine, this work proposes an objective and humanized strategy for the remote clinical evaluation of bradykinesia in PD patients. To this end, several specialists evaluated video-recorded hand movements of patients and then employed the MDS-UPDRS (Movement Disorder Society – Unified Parkinson's Disease Rating Scale) to score the degree of severity of bradykinesia. Intra- and inter-rater agreements were estimated to verify the variability and reliability of the results.
Methodology
Experimental protocol
This research follows Resolution 466/2012 of the National Health Council. The study was conducted at the Centre for Innovation and Technology Assessment in Health of the Federal University of Uberlandia (UFU), Brazil. The experimental protocol was approved by the Human Research Ethics Committee (CEP-UFU), CAAE Number: 65165416.4.0000.5152. The participants were informed about the data collection procedures and signed a consent form before data collection.
Twelve volunteers participated in this study, 8 males and 4 females, with an average age of 69±7.5 years. The participants had been diagnosed with Parkinson's disease for 5.3±5.5 years. They had been in the OFF state of medication for 14±4h at the time of data collection. Table 1 shows additional data on the volunteers with PD who participated in this research.
Participants were asked to execute finger tapping, hand opening/closing and pronation and supination movements. These are the MDS-UPDRS tasks (Part III – items 3.4, 3.5 and 3.6)12 performed for the evaluation of bradykinesia. The tasks were executed with the most affected limb. The participants were asked to perform the tasks as quickly and accurately as possible.
For every patient, each task was recorded and later blindly scored by fourteen health professionals with experience in research and application of MDS-UPDRS. The experience of the examiners was defined according to Tables 2 and 3.
The face-to-face evaluation of bradykinesia was carried out, supervised and guided by a mediation group with experience in applying MDS-UPDRS. This group has contact with the patients and is part of their treatment process at a Parkinson's Association (a familiar environment). Fig. 1 presents a scheme of the evaluation scenario.
After the recording, special care was given to the video editing. Following on from the editing work, the material was compiled and sent for analysis to the examiners, enabling a remote and practical evaluation, where this could be performed in their available time.
The final videos contained ten repetitions of each task for the assessment of bradykinesia, following the MDS-UPDRS recommendations, which are the gold standard in PD evaluation. The specialized examiners were subsequently consulted on the practicality and accessibility of the recorded videos.
Statistical analysis
Statistical analysis was applied to assess the intra- and inter-rater evaluation of bradykinesia. To verify the paired correlation and agreement between the evaluators, the following methods were used: Kendall's tau (Kτ), ICC, Cohen's Kappa and the Bland–Altman method. To facilitate the interpretation of the Bland–Altman method, the amplitude of the limits of agreement was employed as a measure of variability (1):

$$A = UL - LL \qquad (1)$$

where A is the amplitude, UL the upper limit and LL the lower limit.
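As an illustration of the amplitude measure, the following minimal Python sketch computes Bland–Altman limits of agreement for paired scores and takes the amplitude as the distance between the upper and lower limits. The function name, the 1.96 multiplier (the conventional 95% limits) and the example scores are assumptions made for this sketch; the source does not state how the limits were computed.

```python
from statistics import mean, stdev


def bland_altman_amplitude(first, second, k=1.96):
    """Return (bias, lower limit, upper limit, amplitude) for paired scores.

    k=1.96 gives the conventional 95% limits of agreement (an assumption here).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long score lists with at least 2 items")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)                 # mean difference between the two ratings
    spread = k * stdev(diffs)          # sample standard deviation of the differences
    lower, upper = bias - spread, bias + spread
    return bias, lower, upper, upper - lower


# Hypothetical scores given by one examiner in two sessions (not study data)
session_1 = [2, 3, 1, 4, 2, 3]
session_2 = [2, 2, 1, 4, 3, 3]
bias, ll, ul, amplitude = bland_altman_amplitude(session_1, session_2)
print(f"bias={bias:.2f} LL={ll:.2f} UL={ul:.2f} A={amplitude:.2f}")
```

A smaller amplitude means the two sets of scores differ less from each other, which is how the amplitude values are read in the results.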
Descriptive statistics were used to describe intra-rater analyses. The inter-rater analyses followed three steps:
- (i) The non-parametric Kruskal–Wallis test was used to compare the paired correlation coefficients obtained from Kendall's tau (Kτ), ICC and Cohen's Kappa, as the null hypothesis of normality was rejected (Shapiro–Wilk test, p<0.05).
- (ii) The influence of the evaluator's experience on the results was investigated. Because correlation and concordance coefficients are not additive, they cannot be averaged directly; Fisher's Z transformation was therefore applied before averaging. For this analysis, the experience of the examiners was evaluated using Kendall's coefficient.
- (iii) Kendall's correlation coefficient was used to compare the concordance among the 14 health professionals included in the study and to estimate the overall correlation, taking the first and second assessments into account.
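The averaging of correlation coefficients through Fisher's Z transformation can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example coefficients are hypothetical, and the sketch simply transforms each coefficient, averages in the transformed space, and transforms back.

```python
import math


def fisher_mean(coefficients):
    """Average correlation coefficients via Fisher's Z transformation."""
    if not coefficients:
        raise ValueError("no coefficients given")
    if any(abs(r) >= 1 for r in coefficients):
        raise ValueError("coefficients must lie strictly between -1 and 1")
    z_values = [math.atanh(r) for r in coefficients]  # z = 0.5 * ln((1 + r) / (1 - r))
    return math.tanh(sum(z_values) / len(z_values))   # back-transform the mean z


# Hypothetical pairwise coefficients for one group of examiners (not study data)
pairwise = [0.90, 0.78, 0.70]
print(round(fisher_mean(pairwise), 3))
```

Coefficients equal to exactly 1 or -1 have no finite transform, so the sketch rejects them; how such values were handled in the study is not stated.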
A total of 12 patients were evaluated by 14 examiners. The results were divided into two parts: intra-rater and inter-rater analyses.
The comparisons by the Bland–Altman method are summarized by the amplitude values, as these reflect the mean and standard deviation of the agreement analysis between examiners.
All the examiners were invited to analyze the videos twice with a mean interval of 2 months. Table 4 shows the results of the intra-rater analysis. The supplementary material shows the results for the inter-rater paired analyses.
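For a single examiner, the intra-rater comparison between the two sessions could be computed along these lines. The sketch implements Kendall's tau-b (the tie-corrected variant) and unweighted Cohen's Kappa in plain Python; both choices are assumptions, since the source does not say which variants were used, and the scores are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for tied scores."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue          # tied on both ratings: ignored by tau-b
        elif dx == 0:
            ties_x += 1       # tied only in the first rating
        elif dy == 0:
            ties_y += 1       # tied only in the second rating
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when one rating is constant")
    return (concordant - discordant) / denom


def cohen_kappa(a, b):
    """Unweighted Cohen's Kappa for two equally long lists of categorical scores."""
    n = len(a)
    observed = sum(1 for s, t in zip(a, b) if s == t) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical MDS-UPDRS item scores (0-4) from two sessions, not study data
first = [0, 1, 2, 3, 4, 2, 1, 3]
second = [0, 1, 2, 3, 4, 2, 2, 3]
print(f"tau-b={kendall_tau_b(first, second):.2f} kappa={cohen_kappa(first, second):.2f}")
```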
Table 4. Results for the intra-rater analysis.
Evaluator | Experience score | Kendall's tau | Bland–Altman amplitude | Kappa | ICC |
---|---|---|---|---|---|
1 | A | 0.90 | 4.14 | 0.88 | 0.92 |
2 | C | 0.50 | 10.01 | 0.70 | 0.70 |
3 | A | 0.78 | 6.22 | 0.89 | 0.90 |
4 | A | 0.70 | 5.63 | 0.82 | 0.83 |
5 | B | 0.75 | 6.99 | 0.78 | 0.81 |
6 | C | 0.84 | 4.46 | 0.89 | 0.91 |
7 | C | 0.54 | 9.46 | 0.71 | 0.67 |
8 | C | 0.84 | 5.24 | 0.85 | 0.89 |
9 | B | 0.88 | 5.58 | 0.83 | 0.86 |
10 | B | 0.98 | 3.67 | 0.98 | 0.95 |
11 | A | 0.93 | 2.96 | 0.95 | 0.96 |
12 | B | 0.74 | 5.41 | 0.82 | 0.83 |
13 | B | 0.80 | 4.86 | 0.87 | 0.89 |
14 | B | 0.87 | 7.56 | 0.70 | 0.73 |
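To read the table by experience level, the rows can be grouped by experience score and summarized with a simple descriptive statistic. The sketch below copies the values from the table above and reports the median per group; the choice of the median and the helper name are assumptions made for illustration, not an analysis reported by the authors.

```python
from statistics import median

# (evaluator, experience score, Kendall's tau, Bland-Altman amplitude, Kappa, ICC)
TABLE_4 = [
    (1, "A", 0.90, 4.14, 0.88, 0.92),
    (2, "C", 0.50, 10.01, 0.70, 0.70),
    (3, "A", 0.78, 6.22, 0.89, 0.90),
    (4, "A", 0.70, 5.63, 0.82, 0.83),
    (5, "B", 0.75, 6.99, 0.78, 0.81),
    (6, "C", 0.84, 4.46, 0.89, 0.91),
    (7, "C", 0.54, 9.46, 0.71, 0.67),
    (8, "C", 0.84, 5.24, 0.85, 0.89),
    (9, "B", 0.88, 5.58, 0.83, 0.86),
    (10, "B", 0.98, 3.67, 0.98, 0.95),
    (11, "A", 0.93, 2.96, 0.95, 0.96),
    (12, "B", 0.74, 5.41, 0.82, 0.83),
    (13, "B", 0.80, 4.86, 0.87, 0.89),
    (14, "B", 0.87, 7.56, 0.70, 0.73),
]


def median_by_experience(rows, column):
    """Median of one table column for each experience score."""
    groups = {}
    for row in rows:
        groups.setdefault(row[1], []).append(row[column])
    return {score: median(values) for score, values in sorted(groups.items())}


for name, column in [("Kendall's tau", 2), ("amplitude", 3), ("Kappa", 4), ("ICC", 5)]:
    print(name, median_by_experience(TABLE_4, column))
```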
Fig. 2 depicts the results for the comparison between examiners 11 and 12, which yielded the largest correlation coefficients. In Fig. 2, a linear model fit is shown on the left (A) together with the associated Kendall's tau (Kτ) and its p-value. The Bland–Altman plot is shown on the right (B).
Table 5 shows the mean results of Kendall's tau correlation after Fisher's Z transformation. The best correlation results were obtained by the highly experienced evaluators (score A).
The Kruskal–Wallis test (p=0.000155) confirmed significant differences in the correlation coefficients estimated by the three methods. The Nemenyi post hoc test revealed significant differences between Kendall's tau and Kappa (p=0.0015) and between Kendall's tau and ICC (p=0.0005), but no difference between ICC and Kappa. Fig. 3 depicts a violin plot of the distribution of correlation coefficient values estimated from ICC, Kappa, and Kendall's tau.
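A comparison of three sets of coefficients with the Kruskal–Wallis test can be sketched in plain Python as below. The sketch is restricted to exactly three groups, because with two degrees of freedom the chi-square tail probability reduces to exp(-H/2); it uses the usual tie correction and the asymptotic chi-square approximation. The function names and the example values are hypothetical, and the Nemenyi post hoc step is not covered.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(g1, g2, g3):
    """Return (H, p) for three independent groups, using the chi-square approximation."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h /= correction
    p = math.exp(-h / 2)               # chi-square survival function for 2 degrees of freedom
    return h, p


# Hypothetical coefficient sets for three methods (not study data)
tau_values = [0.55, 0.60, 0.62, 0.58]
kappa_values = [0.80, 0.83, 0.79, 0.85]
icc_values = [0.82, 0.86, 0.81, 0.84]
h, p = kruskal_wallis_three_groups(tau_values, kappa_values, icc_values)
print(f"H={h:.2f}, p={p:.4f}")
```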
Kendall's correlation coefficient was 0.88 for all evaluators in the first assessment and 0.84 in the reassessment.
Discussion
This study used remote monitoring to assess bradykinesia in people with Parkinson's disease. Bradykinesia is a cardinal sign in Parkinson's disease present in all patients.13 Research conducted by Martinez-Manzanera et al.14 and Smith et al.15 highlighted the difficulty in clinically assessing bradykinesia. The authors also stated that, in general, correlation and agreement analyses of bradykinesia yield poorer results when compared to analyses of other motor symptoms of the disorder.
The results and methods presented in this study have several practical implications. To facilitate the discussion, this section is divided into four parts: (i) intra- and inter-rater agreement and correlation; (ii) correlation methods in paired comparison; (iii) mediating group importance; (iv) telemedicine needs in the COVID-19 scenario.
Intra and inter-rater agreement and correlation
Individual scoring may be influenced by a lack of training and the difficulty in using clinical scales consistently.16 The intra-rater correlation and agreement analysis evaluates the consistency of individual evaluations. The results from evaluators 1 (score A), 10 (score B) and 11 (score A) are more consistent, as shown in Table 4, with a high correlation and low amplitude (i.e., excellent agreement). Examiners 2 (score C) and 7 (score C) had relatively low intra-rater correlation and high amplitude (i.e., poor agreement). The intra-rater analysis suggests that a specialized team is required to eliminate outliers and discordances caused by individual bias.
According to Bajpai et al.,1 three factors influence inter-rater evaluation: (1) evaluator training, (2) evaluator experience, and (3) the evaluator's commitment to improving the quality of the clinical assessment. As shown in Table 5, the group with extensive experience has higher correlation and agreement (score A). These findings support one of our hypotheses, that the correlation between experienced examiners is higher, which is consistent with the findings of Maidane et al.,4 who emphasized the importance of evaluator experience.
Another essential factor to consider is the indication for training. The inter- and intra-rater analyses (Table 4 and supplementary material) for evaluator 7 (experience score C) revealed low correlation coefficients and high amplitude variability. These analyses revealed the following issues: (i) inconsistency in reassessment; (ii) low agreement due to inexperience; and (iii) low agreement when compared to the other evaluators.
Although it is possible to find studies that compare the correlation between examiners in bradykinesia assessment,2,14 these studies consider only a small number of evaluators (2 and 4, respectively), which may produce inconsistent results. Stefan Williams et al.17 point out the flaws in clinical evaluations and emphasize that a large number of clinical evaluators, using blind methods and gold-standard evaluation scales, provides a more robust “ground truth” for bradykinesia evaluation. However, although 22 evaluators took part in that study, each patient was evaluated by only 5 different examiners. The research reported in ref. 17 reinforces the findings of our study, emphasizing the importance of incorporating experienced examiners (i.e., more than five years) for a reliable clinical analysis of Parkinson's disease. The importance of a large group is supported by the high correlation in both the test (0.88) and the reassessment (0.84).
Correlation methods in paired comparisons
The use of agreement or correlation methods to evaluate the consistency of clinical evaluations is relevant for improving the diagnosis and follow-up of the motor symptoms of PD2,4,18 and for the development of new technologies (e.g., artificial intelligence, machine learning) that are based on clinical results.14,19 It is also important to mention the potential use of wearable technologies for the objective assessment of motor symptoms in PD, especially bradykinesia, because current findings in the literature suggest that inertial sensors are capable of differentiating bradykinetic movements from normal movements in controlled environments.20
Although it is possible to find in the literature several studies that use different methods for the quantification of similarity between results provided by distinct examiners, the statistical equivalence of the results obtained through these methods is not clear. This was the motivation behind reporting the outcomes of the most traditional methods for agreement and correlation analysis in Table 4 and supplementary material.
The Kruskal–Wallis test, applied to verify the statistical equivalence of the results provided by the methods, indicated that they are not equivalent for the evaluation of bradykinesia (p<0.05).
Fig. 3 shows the differences between the Kendall's tau, Kappa and ICC methods. The distributions of correlations estimated from Kappa and ICC are similar to each other, unlike the distribution obtained with Kendall's tau. This visual interpretation is supported by the post hoc test that followed the Kruskal–Wallis test.
A relevant aspect of the results reported in this research is that they were estimated from a relatively large number of examiners when compared to other studies. In addition, to the authors' knowledge, this is the first such study focused on the evaluation of bradykinesia in PD.
The importance of the mediating group
Notably, the entire methodology of this study was only possible because the mediating group followed the protocol, which included criteria for capturing and editing videos. During a clinical assessment, the importance of the mediating group becomes even more apparent. In trials that follow rules close to those outlined in MDS-UPDRS, patients have been found not to execute the movements consistently; the mediating group was responsible for guaranteeing more consistent data collection, easing the evaluation and re-evaluation of the motor tasks.
While several studies in the literature show that telemedicine services are effective in terms of connecting health systems, eliminating distance barriers, providing clinical medical care to isolated populations and connecting patients to specialized healthcare treatment,21 there is a lack of information concerning how this type of service can be implemented and improved upon with the aid of mediating groups. In the light of the COVID-19 pandemic, Davarpanah et al.22 demonstrated the high demand placed on health services, thus pointing out the importance of telemedicine assessments for assisting in the screening and treatment of respiratory complications. In this context, the authors highlight the importance of a humanitarian teleconsultation service that can detect and show the difficulties in mediation; in this regard, the authors declare the importance of a teleconsultation mediation team.
In this work, we have addressed the problem of inadequate interaction in telemedicine services, as during the clinical tests qualified specialists were able to provide an adequate and humanized approach to patients in a more familiar setting.10,23 The participation of a mediation group was critical to the success of data collection from Parkinson's disease patients.
Telemedicine needs in the COVID-19 scenario
In remote consultations, the nuances found in the implementation of technology are important, particularly in the current emergency scenario of the COVID-19 pandemic. A recent review article discusses all of the advantages of telemedicine implementation during a pandemic, including improved primary care and diagnosis of post-surgery complications, as well as access to specialist treatment for neurological disorders like Parkinson's disease.24
Bhaskar Roy et al.6 described the efficacy and feasibility of tele-neurology during the current COVID-19 pandemic in a research review, which is considered a step forward in medical care. The results of this study reinforce the need to verify and improve standards of best practices for the remote assessment of patients.
Telemedicine has had a positive effect on health emergency triage, rapid deployment of a large number of health providers, and supplying services when hospitals and local health centres are unable to meet demand.24 However, there are still major challenges for the future and applicability of telemedicine,25 especially in the treatment of Parkinson's disease, as shown in the study of Elson et al.26 in which the distance in the interaction between patients and specialists interfered with patient actions that did not follow medical recommendations.
Scott Kruse et al.23 identified the most common barriers to telemedicine use as being related to user perspectives on education level, computer knowledge, low quality of internet service, lack of knowledge about telemedicine service centres, as well as data security, confidentiality, privacy, and legal responsibility. All of these issues were relevant during the COVID-19 pandemic.
Another important point is the choice between real-time interaction and recorded video interaction. When the protocol proposed in this study was defined, some factors that are fundamental to the final result of the evaluation were noted: (i) the volunteers are most commonly elderly individuals, with limited access to technology and difficulty interacting with online services; (ii) the evaluation of bradykinesia using the MDS-UPDRS Part III items follows a defined protocol, so with correct video recording a real-time interaction was not necessary; (iii) it was not possible to bring all PD specialists together at the same time, as these healthcare professionals are extremely overloaded, but telemedicine allowed for analyses from different specialists.
As a result, all these factors contributed to the success of the proposed methodology, as the use of recordings enabled PD patients to participate, the MDS-UPDRS assessment proposed was accurate, and 14 specialized evaluators were able to participate in the bradykinesia evaluation.
Conclusion
Several key points that contributed to the success of this research were identified: (i) a group composed of several examiners can improve clinical assessment; (ii) a mediation group with prior training and experience in Parkinson's disease is necessary; (iii) a remote and reliable assessment of bradykinesia in PD is possible; (iv) it is important to develop a telemedicine protocol for people with PD to be employed in the context of the COVID-19 pandemic.
Other motor symptoms of Parkinson's disease, such as tremor and gait dysfunction, will need to be evaluated remotely in the future. This can be accomplished using the experimental protocol and experience gained by the research team.
Funding
The present work was carried out with the support of the National Council for Scientific and Technological Development (CNPq), Coordination for the Improvement of Higher Education Personnel (CAPES – Programme CAPES/DFATD-88887.159028/2017-00, Programme CAPES/COFECUB-88881.370894/2019-01, CAPES/COFECUB-88887.628121/2021-00) and the Foundation for Research Support of the State of Minas Gerais (FAPEMIG-APQ-00942-17). A.O. Andrade is a fellow of CNPq, Brazil (304818/2018-6).
Conflict of interest
The authors declare that they have no conflict of interest.
The authors are grateful to the health professionals and patients of Associação Parkinson do Triângulo (Brazil) who contributed to this research.