Machine-learning-based bibliometric analysis of pancreatic cancer research over the past 25 years

Bibliographic Details
Main Authors: Wang, Kangtao (Author), Herr, Ingrid (Author)
Format: Article (Journal)
Language: English
Published: 28 March 2022
In: Frontiers in oncology
Year: 2022, Volume: 12, Pages: 1-11
ISSN: 2234-943X
DOI: 10.3389/fonc.2022.832385
Online Access: Publisher, free of charge, full text: https://doi.org/10.3389/fonc.2022.832385
Publisher, free of charge, full text: https://www.frontiersin.org/articles/10.3389/fonc.2022.832385/full
Author Notes: Kangtao Wang and Ingrid Herr
Description
Summary: Machine learning and semantic analysis are computer-based methods to evaluate complex relationships and predict future perspectives. We used these technologies to define recent, current and future topics in pancreatic cancer research. Publications indexed under the Medical Subject Headings (MeSH) term 'Pancreatic Neoplasms' from January 1996 to October 2021 were downloaded from PubMed. Using the statistical computing language R and the general-purpose programming language Python, we extracted publication dates, geographic information, and abstracts from each publication's metadata for bibliometric analyses. The generative statistical algorithm "latent Dirichlet allocation" (LDA) was applied to identify specific research topics and trends. The unsupervised "Louvain algorithm" was used to establish a network to identify relationships between single topics. A total of 60,296 publications were identified and analyzed. The publications were derived from 133 countries, mostly from the Northern Hemisphere. For the term "pancreatic cancer research", 12,058 MeSH terms appeared 1,395,060 times. Among them, we identified the four main topics "Clinical Manifestation and Diagnosis", "Review and Management", "Treatment Studies", and "Basic Research". The number of publications has increased rapidly during the past 25 years. Based on the number of publications, the algorithm predicted "Immunotherapy", "Prognostic research", "Protein expression", "Case reports", "Gemcitabine and mechanism", "Clinical study of gemcitabine", "Operation and postoperation", "Chemotherapy and resection", and "Review and management" as current research topics. To our knowledge, this is the first study of this kind in pancreatic cancer research, which has become possible due to the improvement of algorithms and hardware.
Item Description: Viewed on 25 October 2022
Physical Description: Online Resource
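
Illustrative note: The summary names the Louvain algorithm as the method used to relate topics in a network. Louvain works by greedily maximizing a score called modularity. The following is a minimal sketch of that score only, not of Louvain itself and not of the authors' pipeline: it builds a term co-occurrence network from a few made-up records and compares two candidate groupings. The records, the term choices, and the function names are hypothetical and are not taken from the article.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(records):
    """Count how often each unordered pair of terms shares a publication."""
    edges = Counter()
    for terms in records:
        for a, b in combinations(sorted(terms), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, community_of):
    """Weighted Newman modularity Q of a partition of an undirected graph.

    Q = sum over communities c of (L_c / m) - (d_c / (2 m))^2, where m is
    the total edge weight, L_c the edge weight inside c, and d_c the summed
    weighted degree of the nodes in c.
    """
    m = sum(edges.values())
    internal = Counter()  # edge weight with both endpoints in the community
    degree = Counter()    # summed weighted degree per community
    for (a, b), w in edges.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy data: the term set of five imaginary publications.
records = [
    {"Gemcitabine", "Drug Therapy"},
    {"Gemcitabine", "Drug Therapy"},
    {"Immunotherapy", "T-Lymphocytes"},
    {"Immunotherapy", "T-Lymphocytes"},
    {"Drug Therapy", "Immunotherapy"},
]
edges = cooccurrence_edges(records)

# Candidate 1: two groups (chemotherapy terms vs. immunotherapy terms).
split = {"Gemcitabine": 0, "Drug Therapy": 0,
         "Immunotherapy": 1, "T-Lymphocytes": 1}
# Candidate 2: every term in one group.
merged = {term: 0 for term in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(f"two groups: Q = {q_split:.2f}, one group: Q = {q_merged:.2f}")
```

On this toy network the two-group partition scores higher than the single group, which is the kind of comparison Louvain repeats while moving nodes between communities. In practice a maintained graph library would normally be used instead of hand-written code.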