Issues with the TikTok Research API and their effect on researchers
By Bruno Coelho, Lexie Barthelemess and Dominique Geissler
For the past year and a half, we at Cybersecurity for Democracy have been working extensively with the TikTok Research API. We’ve discussed our first impressions of it and our open-source library to help researchers interface with the API. While the API has the potential to offer valuable insights, recurring issues have made gathering data for research projects difficult and, for certain time periods, impossible.
The Internal Server Error: A Major Roadblock for Researchers
A major issue with the TikTok Research API is how often it returns internal server errors when we attempt to retrieve historical data. The error affects many requests for data from 2021 through mid-2022, as well as some dates in late 2023. As far as we can tell, the affected dates follow no pattern, and the resulting gaps range from a few days to several months for which the API returns no videos. Instead, the API returns a 500 Internal Server Error. Unlike silent bugs that can distort data invisibly, this error is immediately apparent, alerting researchers that the data they're seeking is currently unavailable. Unfortunately, each failed request still counts against the daily quota even though no data is returned, and retrying the request produces the same error.
For example, the following query requests “snoopy”-related videos from April 2022:
Instead of retrieving the data, the server returns a 500 Internal Server Error response, alongside this error code:
We have not been able to retrieve any data from this or similar queries for at least two months. Based on our observations, the issue affects all hashtag-related queries for dates between March 2021 and July 2022. We have also encountered this error sporadically when querying for October 2023. The API appears to work as expected outside of these dates, although we have not searched exhaustively to determine the full range of affected dates.
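To make the shape of such a request concrete, here is a minimal Python sketch of how a hashtag query over a date range might be assembled. It is an illustration, not a reproduction of any specific request. The body layout (`query`, `start_date`, `end_date`, `max_count`, and an `EQ` condition on `hashtag_name`) follows TikTok's public documentation as we understand it and should be checked against the current version. The helper names and the default 30-day window are assumptions made for this example. The code only builds request bodies and sends nothing.

```python
from datetime import date, timedelta


def split_into_windows(start, end, max_span_days=30):
    """Split [start, end] into consecutive windows of at most
    max_span_days days each (both endpoints inclusive).

    The 30-day default is an assumption about the API's per-request
    date limit; adjust it to whatever the documentation specifies.
    """
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_span_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def build_hashtag_query(hashtag, start, end, max_count=100):
    """Build a JSON-serializable request body for one date window.

    Field names are assumed from the public documentation, not verified here.
    """
    return {
        "query": {
            "and": [
                {
                    "operation": "EQ",
                    "field_name": "hashtag_name",
                    "field_values": [hashtag],
                }
            ]
        },
        "start_date": start.strftime("%Y%m%d"),
        "end_date": end.strftime("%Y%m%d"),
        "max_count": max_count,
    }


# One request body per window covering March 2021 through July 2022.
bodies = [
    build_hashtag_query("snoopy", s, e)
    for s, e in split_into_windows(date(2021, 3, 1), date(2022, 7, 31))
]
```

Splitting a long study period into short windows also makes it easier to see exactly which windows fail, rather than losing an entire range to a single error.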
At Cybersecurity for Democracy, one of our research projects focuses on what type of content is promoted around polarizing topics, including the overturning of Roe v. Wade and the Israel-Hamas war. Both of these events fall within periods in which Internal Server Errors frequently disrupt data access. For researchers conducting longitudinal studies, these unanticipated gaps make it impossible to gather contiguous historical data, which can undermine the integrity of research findings. This is a serious roadblock and can render a project infeasible.
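One way to make such gaps explicit before any analysis is to log the HTTP status returned for each date window and merge the failed windows into contiguous blackout ranges. The following Python sketch assumes a simple log of `(start, end, status)` tuples. The function name and the log format are hypothetical, and the sample values are invented for illustration rather than taken from real API responses.

```python
from datetime import date, timedelta


def find_gaps(window_log, failure_status=500):
    """Merge failed date windows into contiguous (start, end) gaps.

    window_log is a list of (start, end, http_status) tuples, one per
    request. Windows that touch or overlap are merged into one gap.
    """
    failed = sorted(
        (start, end)
        for start, end, status in window_log
        if status == failure_status
    )
    gaps = []
    for start, end in failed:
        if gaps and start <= gaps[-1][1] + timedelta(days=1):
            # Adjacent to or overlapping the previous gap: extend it.
            gaps[-1] = (gaps[-1][0], max(gaps[-1][1], end))
        else:
            gaps.append((start, end))
    return gaps


# Illustrative log only; these status codes are made up for the example.
sample_log = [
    (date(2022, 3, 1), date(2022, 3, 30), 500),
    (date(2022, 3, 31), date(2022, 4, 29), 500),
    (date(2022, 4, 30), date(2022, 5, 29), 200),
    (date(2022, 5, 30), date(2022, 6, 28), 500),
]
sample_gaps = find_gaps(sample_log)
```

Keeping such a log also helps avoid spending quota on windows that are already known to fail, and the merged ranges can be reported alongside results so readers know which periods the data does not cover.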
Lack of Response from TikTok
Although TikTok’s documentation explicitly asks researchers to notify support when a 500 Internal Server Error is returned, our reports have gone unanswered. We filed bug reports through the official support page and by email on October 21 and have yet to receive a reply.
Unfortunately, this lack of communication seems to affect not just us but many in the research community. Our collaborators from LMU Munich submitted an application for access to the TikTok Research API for a joint project on August 5th, 2024. TikTok has received the application, but our collaborators have yet to receive an answer, even after contacting TikTok twice through the support page.
Researchers at Politecnico di Milano and CENTAI have also raised concerns about TikTok’s quota and data quality. Tech Policy Press recently discussed how problems with TikTok’s data access could affect critical research projects, such as those examining online behavior during the 2024 European Elections. Unreliable access to the API can distort data analysis and hold back academic progress in social media research.
Conclusion
We note that our issues are with the main TikTok Research API, and we don’t know whether they affect the “Virtual Compute Environment,” a new, separate system that allows researchers to query available data. We hope this system addresses some of the API’s limitations. We have requested access to this new service but have yet to receive a response.
Despite these challenges, we believe the TikTok Research API could still be a valuable resource for researchers. When functioning correctly, it offers unique access to metadata about videos, comments, and user profiles, allowing researchers to explore trends, analyze engagement, and study the impact of various topics on a global audience. However, in its current state, the API’s issues make it impractical to study topics that require data from the blackout time frames or over a long-term horizon, as research quality would suffer from inconsistent data.
We hope that TikTok will work on improving its communication with researchers. When researchers discover bugs, limitations, and errors, there needs to be an easy and efficient way to communicate with TikTok to maintain a product that remains useful and focuses on protecting data quality. We also hope TikTok addresses these issues soon, improving the API’s stability. In the meantime, we encourage researchers to share their experiences and contribute to our open-source tools to improve the community’s access to this critical data.