A patient registry is an organised system that uses observational study methods to collect uniform data (clinical and other) to assess specific outcomes for a population defined by a particular disease, condition or exposure, and that serves one or more purposes (scientific, clinical or policy).1 The objectives of a registry may be to describe the natural history of a disease, to determine the clinical effectiveness or cost-effectiveness of health care products and services, to monitor safety or to measure the quality of care. Depending on the target population, registries can be product registries (patients exposed to pharmaceuticals or medical devices), health services registries (patients undergoing the same medical or surgical procedure, such as the National Incisional Hernia Registry [EVEREG]) or disease registries (patients with the same diagnosis or condition, e.g., pancreatic cancer).1
Planning, designing and creating a registry
The usefulness of a registry depends on its initial approach. The fundamentals for maximising its effectiveness are:1,2
1. Definition of the specific objectives (e.g., long-term outcomes of a treatment).
2. Identification of all stakeholders: patients, industry, health authorities, healthcare providers, clinical teams, academic institutions, professional associations and funding agencies.
3. Assessment of feasibility and maintainability. Funding is key, and possible sources include industry, foundations and health systems.
4. Creation of the working group, which should include project coordinators, subject matter experts, research experts (epidemiologists, statisticians), data managers, and legal and quality advisors.
5. Establishment of a management and monitoring plan that allows errors to be detected and changes to be implemented with minimal impact on the integrity of the registry.
6. Consideration of scope and rigour: specify the sample size, the duration of the registry, its scope (local, national, international) and its costs; ensure that data are available for analysis and that rigour is maintained.
7. Definition of the dataset, outcomes and study population.1–3 All variables should be relevant and clearly defined, using internationally unified criteria if they exist. If they require calculation (e.g., age, BMI), this should be done automatically. For quantitative variables, it is useful to set warnings for data outside the usual limits. For qualitative variables, drop-down lists or dichotomous choice boxes should be used to avoid typing errors. Open text fields should be exceptional. Multiple objectives should be avoided, as they overload researchers and are a reason for abandonment.
8. Development of the project plan: specify the objectives; the development schedule; the quality and monitoring plan; the dissemination plan and the persons responsible; and the identification and prevention of risks related to data collection (backups), data quality (audit, periodic data review) and costs.
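The automatic calculations, range warnings and closed option lists described above for the dataset definition can be sketched as follows. This is a minimal illustration, not the design of any particular registry: the field names, the option list and the "usual limits" are all hypothetical and would need to be set by the working group.

```python
from datetime import date

# Hypothetical coded options for a qualitative variable (drop-down list).
ALLOWED_SEX = {"female", "male", "other"}

# Hypothetical "usual limits" for quantitative variables. Values outside
# them produce a warning; they do not block data entry.
USUAL_LIMITS = {"weight_kg": (30.0, 200.0), "height_cm": (120.0, 220.0)}


def age_in_years(birth: date, on: date) -> int:
    """Completed years between the birth date and a reference date."""
    before_birthday = (on.month, on.day) < (birth.month, birth.day)
    return on.year - birth.year - int(before_birthday)


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2, rounded to one decimal place."""
    height_m = height_cm / 100
    return round(weight_kg / height_m ** 2, 1)


def check_entry(entry: dict) -> tuple[dict, list[str]]:
    """Return a copy of the entry with derived fields, plus any warnings."""
    warnings = []
    if entry["sex"] not in ALLOWED_SEX:
        warnings.append(f"sex: {entry['sex']!r} is not a listed option")
    for name, (low, high) in USUAL_LIMITS.items():
        if not low <= entry[name] <= high:
            warnings.append(
                f"{name}: {entry[name]} is outside the usual limits {low}-{high}"
            )
    derived = dict(entry)
    # Derived variables are computed, never typed in by the registrar.
    derived["age"] = age_in_years(entry["birth_date"], entry["surgery_date"])
    derived["bmi"] = bmi(entry["weight_kg"], entry["height_cm"])
    return derived, warnings
```

For a patient born on 15 June 1960 and operated on 14 June 2022, weighing 80 kg and measuring 175 cm, `check_entry` adds an age of 61 and a BMI of 26.1 and returns no warnings. A weight typed as 800 would be kept but flagged for review.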
The selection of the data to be collected requires a balance between their importance for the integrity of the registry and for the analysis of the results, their reliability, their contribution to the registrars' workload and the increased costs associated with their collection.1 Primary data are collected specifically for the registry through an established protocol. Data from secondary sources designed for other purposes (e.g. institutional databases) can also be integrated, but in this case, exact data matching must be ensured.1,2
Data can be collected on paper (data collection forms or DCFs), but electronic collection is now preferred, via a website that registrars access with their own username and password. This facilitates data entry and analysis. These websites must have specific security measures in place to prevent data loss or unauthorised access and must be backed up regularly.1
Data quality is crucial for a registry to be useful. The various types of error (in coding, interpretation or data entry) can be avoided with proper registry design, adequate training of registrars, online help systems, definitions embedded in the registry, automatic warnings for out-of-range values, etc.4,5
Data auditing, key to quality assurance, is the verification and examination of the data entered into the registry.1,5 It can be done at each registry site or from a central base, exhaustively (analysing all data), or limited to a randomly selected sample of patients and/or key variables.
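A sample-based audit of this kind could be organised along the following lines. The sketch assumes that the registry data and the source documents (e.g., medical records re-abstracted by the auditor) are available as dictionaries keyed by case identifier. The function names, the 10% default fraction and the variable names are illustrative assumptions, not a recommended standard.

```python
import random


def select_audit_sample(case_ids, fraction=0.1, seed=None):
    """Randomly select a fraction of cases for audit (at least one if any exist)."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    ids = sorted(case_ids)
    n = min(len(ids), max(1, round(len(ids) * fraction)))
    return sorted(rng.sample(ids, n))


def discrepancy_rate(registry, source, cases, variables):
    """Share of audited values that differ between registry and source."""
    checked = mismatched = 0
    for case_id in cases:
        for variable in variables:
            checked += 1
            if registry[case_id][variable] != source[case_id][variable]:
                mismatched += 1
    return mismatched / checked if checked else 0.0
```

An exhaustive audit corresponds to `fraction=1.0` with all variables; a limited audit passes a smaller fraction and only the key variables. Recording the seed allows the same sample to be reconstructed later.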
Analysis and interpretation of outcomes
The first step is to select the target cases and sort them for export to a statistical package. It is useful to perform periodic cutoffs to allow us to detect errors and to consider changes.1
The main factors influencing outcome analysis are patient-related factors, treatment exposure, treatment outcomes, covariates, time since treatment, potential biases, and loss of data or cases. The more robust the database, the fewer the missing data or lost cases and the longer the follow-up period, the closer the interpretation will be to clinical reality and the higher its scientific quality.1,2,5
Outcome interpretation must be rigorous. In very large samples, the differences found may be statistically significant but not clinically relevant.6 Missing data also cause difficulties in analysis.6 "Data mining" (cross-referencing variables in a registry in search of statistically significant results), which may yield spurious associations, should be avoided.
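Both warnings can be made concrete with a little arithmetic. The sketch below uses a standard two-sided z-test for two proportions (normal approximation) and the usual formula for the chance of at least one false positive across several independent tests. The figures are invented for illustration and the independence of the tests is an assumption.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value of a pooled z-test comparing two proportions."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one false positive in n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Same 10% vs 9% event rates, hypothetical sample sizes.
p_large = two_proportion_p_value(2000, 20000, 1800, 20000)
p_small = two_proportion_p_value(20, 200, 18, 200)
```

With 20,000 patients per group the one-percentage-point difference gives a p-value below 0.001, while with 200 per group the same difference gives a p-value above 0.5. The p-value therefore reflects sample size as much as effect size, and whether one percentage point matters remains a clinical judgement. Similarly, `familywise_error(20)` is about 0.64: cross-referencing 20 unrelated variables at the 0.05 level will more often than not produce at least one "significant" result by chance alone.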
Advantages and drawbacks of registries
The advantages are large sample sizes, prospective data collection, inclusion of all consecutive patients with the characteristic under study (e.g., incisional hernia) and the possibility of long-term follow-up. Registries facilitate comparisons between centres in the same or different countries, and serve as a personal audit for individual clinicians.1,2,6
Disadvantages include the possibility of errors in data collection, poor-quality or missing data, and loss to follow-up.6 Registries that rely on voluntary participation by registrars have a higher risk of data loss, of selection bias (when only patients with certain outcomes are entered) and of data entry being abandoned in the long term.
Data protection, confidentiality and informed consent
Data confidentiality must always be guaranteed.1,2 Regulation (EU) 2016/679 (General Data Protection Regulation) and the Spanish Organic Law 3/2018 of 5 December on the Protection of Personal Data and Guarantee of Digital Rights (LOPDGDD) establish the legal framework.
Each case must have a unique identifier that is not associated with the patient's identity (no names, medical record numbers or ID numbers). Data should be stored on external servers that are not connected to the medical records. The persons in charge of analysing and monitoring the data should not have access to the identity of the patients or to the centre or researcher who entered the cases. If the patient needs to be identified, this can only be done by the professional directly involved in the patient's care who has authorised access to the patient's medical records.1
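One way to meet these requirements is for each centre to keep a private linkage table that maps its own medical record numbers to random registry identifiers, and to strip direct identifiers before anything is sent to the registry. The sketch below is an assumption about how this could be organised, not a description of an existing system. The class, function and field names are hypothetical.

```python
import uuid

# Hypothetical direct identifiers that must never leave the centre.
DIRECT_IDENTIFIERS = {"name", "medical_record_number", "national_id"}


class SiteLinkageTable:
    """Kept only at the treating centre; never exported to the registry."""

    def __init__(self):
        self._by_record_number = {}

    def registry_id_for(self, medical_record_number: str) -> str:
        """Return the patient's registry ID, creating a random one if needed."""
        if medical_record_number not in self._by_record_number:
            # A random ID carries no information about the patient.
            self._by_record_number[medical_record_number] = uuid.uuid4().hex
        return self._by_record_number[medical_record_number]

    def reidentify(self, registry_id: str):
        """Look up the record number; only possible where the table is held."""
        for record_number, known_id in self._by_record_number.items():
            if known_id == registry_id:
                return record_number
        return None


def to_registry_record(local_record: dict, linkage: SiteLinkageTable) -> dict:
    """Build the record sent to the registry, without direct identifiers."""
    registry_id = linkage.registry_id_for(local_record["medical_record_number"])
    payload = {
        key: value
        for key, value in local_record.items()
        if key not in DIRECT_IDENTIFIERS
    }
    return {"registry_id": registry_id, **payload}
```

A random identifier is used here rather than a hash of the record number, because hashes of short, structured numbers can be reversed by trying every possible value. Re-identification then depends entirely on access to the centre's linkage table, which matches the restriction to the treating professional described above.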
The informed consent form should specifically cover the release of the patient's data to the registry, the study protocols and the statistical plans, and should describe the systems used for anonymisation and, if data are to be shared, how this will be done.1,2
Conclusion
Registries and databases are a powerful source of information in routine clinical practice. They allow larger sample sizes, the incorporation of more investigators and centres, and even the participation of patients themselves. Design is critical: the variables chosen must be necessary for the desired objective and well defined. Long-term maintenance is the weak point of registries, as lack of funding and incentives hinders data entry. When the maintenance of a registry depends on the voluntary participation of registrars, the likelihood of its decay over time is very high. This could be remedied with the active participation of health systems, as is the case in the Nordic countries, where registries of various pathologies, such as the Danish Hernia Database, are mandatory,7,8 and therefore provide very reliable information on real-world practice.
Funding
This study received no funding.
Conflict of interests
The authors have no conflict of interests to declare.
Please cite this article as: Hernández-Granados P, Pereira Rodríguez JA, Gimeno López M. Registros y bases de datos: ¿Cómo utilizarlos? Cir Esp. 2022;100:517–519.