The world is undergoing a rapid transition from the pre-AI to the AI era. This shift has ushered in broader technology integration into our lives, including the Internet of Things, and has stimulated new and critical research questions. Scholars have attempted to quantify the effects of technology use (e.g., social media) on social and mental wellbeing. Every human interaction with technology leaves a digital footprint from which key insights into human behavior and varying states of mind can be gleaned, deepening our understanding of mental health.1 Digital phenotyping is a remarkable advance in understanding the human condition, especially timely as our lives become increasingly digital, but quantifying these human–machine interactions remains limited by existing methods.
The APIcalypse is hindering research

A prevailing roadblock to scientific discovery is that Application Programming Interfaces (APIs) are largely closed on social media platforms,2 preventing researchers from studying what people are actually doing on these platforms or from linking their digital footprints to self-reported experiences, including mental health topics (before the Cambridge Analytica scandal this was possible). The so-called "APIcalypse"3 led to an overreliance on self-report to gain insights into technology use. Self-report is more suitable for studying certain phenomena (e.g., opinions and attitudes, subjective well-being), whereas objective measures of technology use can quantify usage patterns and uncover digital footprints carrying key insights into mental states or psychological traits.4,5
How can the APIcalypse be circumnavigated?

Programmers can develop bespoke smartphone tracking technologies that record call behavior, use of certain apps, screen time, and GPS data.6 Such an approach can be combined with ecological momentary assessment, i.e., asking participants about their well-being, depressive symptoms (or prodromal signals), or other variables of interest. Unfortunately, using such tracking technologies still requires specialists to ensure proper tracking of the phones, and the depth of digital phenotyping achievable also depends to a degree on the operating system (OS) installed on the phones and the policies of the companies behind the OS.7 A further problem arising from smartphone tracking technologies is the sandboxing principle: every app can be seen as a sandbox, and although it is possible to track which apps are installed and how long they are used, it is very difficult to observe how people behave within such an app (for instance, to study language use, a digital signal linked to depression). With APIs largely closed and the limitations of smartphone tracking in mind, one way to still gain insights into what people are doing on a given application is via data donation portals. For instance, WhatsApp data can be studied using the "ChatDashboard" tool,8 following privacy-by-design principles. While promising, the data donation solution is also not optimal, because it requires constructing different platform solutions for different applications to study digital footprints.
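The privacy-by-design principle behind such data donation portals can be sketched in a few lines of code. The snippet below is not ChatDashboard's actual pipeline; it is a minimal, hypothetical illustration of the underlying idea: raw identifiers and message text are discarded at preprocessing, and only pseudonyms and derived, non-identifying features are retained for analysis. All names and features here are assumptions made for the example.

```python
import hashlib
from collections import Counter

def pseudonymize(identifier: str, salt: str) -> str:
    """Replace a real identifier with a salted, irreversible pseudonym."""
    return hashlib.sha256((salt + identifier).encode()).hexdigest()[:12]

def extract_features(messages: list[dict], salt: str) -> list[dict]:
    """Keep only non-identifying per-message features; drop raw text and names."""
    return [
        {
            "sender": pseudonymize(m["sender"], salt),
            "word_count": len(m["text"].split()),
            "hour_sent": m["hour"],
        }
        for m in messages
    ]

# Hypothetical donated chat log (invented data for illustration).
raw = [
    {"sender": "Alice", "text": "feeling a bit down today", "hour": 23},
    {"sender": "Bob", "text": "want to talk about it?", "hour": 23},
]
features = extract_features(raw, salt="per-study-secret")
# Raw names and message content are gone; only derived signals remain,
# e.g., how many messages each (pseudonymized) sender contributed.
print(Counter(f["sender"] for f in features))
```

In a real donation pipeline, richer linguistic features (e.g., sentiment scores) would replace the raw text in the same way, so that depression-linked language signals can be studied without the researcher ever storing identifiable message content.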
Other researchers have put forward the idea of studying Google Trends data to understand what queries people are feeding into search engines (or soon, generative AI products). The advantage of this approach is that one can rather easily study what people are currently interested in, and it is also possible to narrow this down to certain areas of the world to derive meaningful insights9,10 – perhaps with these insights serving as proxies for mental states. The problem remains, though, that it is not possible to tie person-level variables, including self-report data, to the online queries. This creates uncertainty about how to interpret Google Trends or similar data correctly. For example, algorithms are not yet sophisticated enough to differentiate between searches on a topic for a project versus attempts to self-diagnose or to seek psychoeducational materials of personal or family relevance. In other words, the motivations behind the queries are not clear at all. The positive aspect of studying Google Trends data compared to smartphone tracking technologies is its much higher degree of privacy, because mobile sensing and digital phenotyping approaches relying on individually tracked digital footprints naturally face ethical challenges.11 Studies based on such principles need to ensure that re-identification of participants is not possible.
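One concrete safeguard against re-identification is checking k-anonymity before releasing individually tracked digital footprints: every combination of quasi-identifying attributes must be shared by at least k participants. The sketch below is a hypothetical illustration; the chosen quasi-identifier columns, the invented records, and the threshold k = 2 are assumptions for the example, not a prescribed standard.

```python
from collections import Counter

def satisfies_k_anonymity(records: list[dict],
                          quasi_identifiers: list[str],
                          k: int) -> bool:
    """True if every combination of quasi-identifier values
    occurs in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Invented digital-phenotyping records for illustration.
records = [
    {"age_band": "20-29", "region": "North", "daily_screen_min": 212},
    {"age_band": "20-29", "region": "North", "daily_screen_min": 187},
    {"age_band": "30-39", "region": "South", "daily_screen_min": 95},
]

# The lone 30-39/South participant could be singled out, so k = 2 fails.
print(satisfies_k_anonymity(records, ["age_band", "region"], k=2))  # → False
```

A dataset failing such a check would need coarser binning (e.g., wider age bands) or suppression of the rare records before sharing, which is exactly the kind of trade-off between research utility and privacy discussed above.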
For an overview of the approaches with their advantages and disadvantages, see Table 1.
Table 1. Advantages and disadvantages of studying digital footprints with different approaches.
We believe that studying digital footprints needs to be enabled for independent scientists, and initiatives such as the EU's Digital Services Act12 will hopefully help compel large platforms to reopen their APIs for the study of relevant questions arising from societies' interactions with these platforms.
Mirroring the EU's Digital Services Act, Asia has pursued similar initiatives to regulate digital platforms and enhance data accessibility for research, with varied outcomes across the region. China, for example, has proactively formulated policies to foster open research data, demonstrated by its national strategies and the creation of data repositories and journals.13 These moves underscore the growing recognition of open data's significance for scientific progress and societal advantage in Asia.
In contrast, Taiwan's attempt to regulate digital platforms through the Digital Intermediary Services Act (DISA), also inspired by the EU's Digital Services Act, aimed at increasing platform accountability and transparency, potentially including mandates for platforms to open their APIs to facilitate societal engagement. However, DISA faced significant pushback from the public and industry over concerns about potential encroachments on free speech and ambiguous legal stipulations. Critics highlighted that specific provisions, such as information restriction orders, could lead to censorship and were not sufficiently defined, posing compliance challenges for platforms and enforcement difficulties for courts. This opposition led to the draft being reconsidered and sent back for further review.14
This regional divergence in outcomes reflects the broader challenges in Asia's rapidly digitizing landscape and the considerations involved in regulating digital platforms while enhancing data accessibility for research. These case studies highlight the delicate balance between regulation, freedom of speech, and the promotion of open data for the collective good.
Of course, research on re-opened platforms (re-opened APIs) needs to be conducted with oversight from IRBs and sound data protection plans (and with assurance that the insights derived are used for the public good). As discussed in this short piece, many research questions exist that might not fall directly within the scope of regulatory initiatives but still hold great potential for tackling global mental health. If platforms stay closed, or if scientists cannot find creative solutions to study digital footprints, innovation in the health sciences, and the power that comes from owning digital data, will remain solely in the hands of those who run these platforms.
Ethical Considerations

Not applicable, as this is not an empirical work.
Funding

None.