Changes in the absolute numbers and proportions of open access articles from 2000 to 2021 based on the Web of Science Core Collection: a bibliometric study
Abstract
Purpose
The ultimate goal of current open access (OA) initiatives is for library services to use OA resources. This study aimed to assess the infrastructure for OA scholarly information services by tabulating the number and proportion of OA articles in a literature database.
Methods
We measured the absolute numbers and proportions of OA articles at different time points across various disciplines based on the Web of Science (WoS) database.
Results
The number (proportion) of available OA articles between 2000 and 2021 in the WoS database was 12 million (32.4%). The number (proportion) of indexed OA articles in 1 year was 0.15 million (14.6%) in 2000 and 1.5 million (48.0%) in 2021. The proportion of OA by subject categories in the cumulative data was the highest in the multidisciplinary category (2000–2021, 79%; 2021, 89%), high in natural sciences (2000–2021, 21%–46%; 2021, 41%–62%) and health and medicine (2000–2021, 37%–40%; 2021, 52%–60%), and low in social sciences and others (2000–2021, 23%–32%; 2021, 36%–44%), engineering (2000–2021, 17%–33%; 2021, 31%–39%) and humanities and arts (2000–2021, 11%–22%; 2021, 28%–38%).
Conclusion
Our study confirmed that increasingly many OA research papers have been published in the last 20 years, and the recent data show considerable promise for better services in the future. The proportions of OA articles differed among scholarly disciplines, and designing library services necessitates several considerations with regard to the customers’ demands, available OA resources, and strategic approaches to encourage the use of scholarly OA articles.
Introduction
Background
The current status of open access (OA) initiatives has been criticized by a number of stakeholders, one of whom has pointed out that “the current journal market is failing to operate optimally—particularly in relation to journal access and the cost of Gold OA” [1]. Slow progression, increasing costs, and resistance from researchers and publishers are issues related to OA compliance [2,3]. These negative views on OA are based on its production side, where the increased production of OA scholarly articles has been counteracted by the simultaneous increase in subscription-based publications. The increase in scholarly publications, including both subscription-based and OA journals, has resulted in growing subscription costs for libraries and article processing costs for authors aiming to publish OA articles [4].
OA for scholarly information has been a key agenda for knowledge production and exchange since the 2002 Budapest initiative [5]. The international community has worked to spread the OA campaign so that anyone can freely use research results without economic, legal, or technical barriers [6–8]. On November 23, 2021, the United Nations Educational, Scientific and Cultural Organization (UNESCO) Recommendation on Open Science [9] was adopted by 193 member countries, including South Korea. US government policy has also been strengthened by removing the 12-month embargo period, so that federally funded research publications become publicly accessible earlier [10].
To present a professional advisory opinion on establishing a national OA policy in Korea, the authors recently published a report analyzing the current OA status of scholarly publishing and its general principles, focusing on future practical tasks and the roles of various officials by synthesizing discussions about achieving OA [11]. As an extension of our report, this research will demonstrate some practical ways to implement OA services and formulate the further development of the government’s current OA policies [12].
This study investigated aspects of public library services for scholarly information, such as why and for whom these services are necessary and what their requirements are. More specifically, this study focused on the number of available OA articles, which is the starting point of public services providing scholarly OA information. The content quality, subject domains, document types, and users’ preferences have to be considered when assessing the quantity of OA resources. Moreover, different types of OA involve different routes of access, such as gold OA, hybrid gold OA, and access through institutional repositories or public platforms for OA articles, which must be incorporated into the public database search.
Objectives
This study aimed to identify changes in the number and proportion of OA articles from 2000 to 2021 based on the Web of Science (WoS) Core Collection [13]. Specifically, first, changes were analyzed according to document type. Second, changes in the six major categories of research were investigated; and third, annual trends in OA and non-OA documents were traced.
Methods
Ethics statement
Neither approval by an institutional review board nor informed consent was required since this was a literature-based study.
Study design
This was a descriptive study based on a bibliometric analysis of the literature database.
Outcomes
The analysis involved querying the database for the number of documents available as OA at the production level. The absolute number and proportion of OA articles were used as basic parameters. The time of document production was categorized into two data sets: the cumulative data from 2000 to 2021 and 1-year data from 2021. Furthermore, the document type selection and subject domains for analysis were considered as parameters. Different types of OA have different access routes, and production and availability were studied as basic infrastructure for designing search and service systems.
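As a minimal sketch of the basic parameter described above, the OA proportion for a discipline d and a time window t can be written as follows. The symbols N_OA, N_nonOA, d, and t are notation assumed here for illustration and do not appear in the original analysis; t stands for either the cumulative window (2000–2021) or the single year 2021.

```latex
p_{\mathrm{OA}}(d, t) = \frac{N_{\mathrm{OA}}(d, t)}{N_{\mathrm{OA}}(d, t) + N_{\mathrm{nonOA}}(d, t)} \times 100\%
```

Under this notation, the denominator is the number of all indexed documents of the selected document types, so the choice of document types directly affects the reported proportion.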
Data sources/measurement
Database for the collection of scholarly information
We used WoS [13] to collect data on scholarly publications. The documents indexed in this database are considered to have met the selection criteria and been certified as having a representative level of quality. The classification system of the academic disciplines and document types used in the database was applied; no exclusions or additions were made.
Assessment of the types of documents
We checked the number of documents available based on the 43 types of documents in the WoS database. The four main types—articles, proceedings papers, review articles, and letters—were analyzed. We found that a reasonable set of documents consisted of the aforementioned four main types. Thereafter, we sorted these four types by research discipline. The other 39 types were analyzed in a similar way.
Discipline-based quantitation of OA documents
The classification system of academic domains was adopted from that of WoS. A total of 254 disciplines were grouped into 25 subcategories, which were further grouped into six categories: (1) multidisciplinary; (2) health and medicine; (3) natural sciences, including environmental science, agriculture, and geoscience; (4) social sciences and others, including education, law, economics, and management; (5) engineering; and (6) humanities and arts. The numbers and proportions of OA articles were displayed according to the long-term trends of 22 years (2000–2021) and recent statistics of 2021. An additional analysis of the OA status of multidisciplinary science journals was conducted.
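The grouping of disciplines into subcategories and categories can be sketched in Python as below. This is only an illustration under stated assumptions: the names `SUBCATEGORY_COUNTS`, `DISCIPLINE_MAP`, and `oa_share_by_category` are hypothetical, the mapping lists only three of the 254 disciplines, the single subcategory for the multidisciplinary category is inferred from the total of 25, and the sample counts are invented for demonstration rather than taken from the study.

```python
from collections import defaultdict

# Number of subcategories per category; the value for "multidisciplinary"
# is inferred from the stated total of 25 subcategories.
SUBCATEGORY_COUNTS = {
    "multidisciplinary": 1,
    "health and medicine": 4,
    "natural sciences": 7,
    "social sciences and others": 4,
    "engineering": 5,
    "humanities and arts": 4,
}

# Illustrative subset of the discipline -> (category, subcategory) mapping.
DISCIPLINE_MAP = {
    "microbiology": ("natural sciences", "biosciences"),
    "virology": ("natural sciences", "biosciences"),
    "economics": ("social sciences and others", "law, economics, and management"),
}


def oa_share_by_category(records):
    """Aggregate (discipline, oa_count, total_count) rows into OA percentages per category."""
    oa = defaultdict(int)
    total = defaultdict(int)
    for discipline, oa_count, total_count in records:
        category, _subcategory = DISCIPLINE_MAP[discipline]
        oa[category] += oa_count
        total[category] += total_count
    return {category: 100 * oa[category] / total[category] for category in total}


# Hypothetical counts for demonstration only (not data from the study).
sample = [("microbiology", 60, 100), ("virology", 30, 50), ("economics", 20, 100)]
shares = oa_share_by_category(sample)
print(shares)
```

Summing counts before dividing, rather than averaging discipline-level percentages, keeps large disciplines from being underweighted in the category-level proportion.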
Annual trends in OA and non-OA documents
Detailed yearly trend data were shown for selected representative disciplines. The yearly trends were analyzed and discussed.
Bias
There was no bias in selecting the target articles.
Study size
No sample size estimation was required since this study included all target journals in the database.
Statistical methods
Descriptive statistics were used for the data analysis.
Results
Assessment of the types of documents and OA
Articles were the most common document type. In total, 10.6 million articles were available as OA, corresponding to 35.6% of all produced research articles between 2000 and 2021 (Table 1).
The proportion of OA among articles in 2021 was higher, at 49.5%. Proceedings papers, which are often produced in the engineering discipline, were the second most common document type. Slightly fewer than 1 million proceedings papers were OA, and they accounted for 13.9% of all produced papers of this type; however, this percentage increased to 24.9% in 2021.
We selected four types of documents (articles, proceedings papers, review articles, and letters) as the key set for our data retrieval. These document types encompassed 90.1% of all OA documents between 2000 and 2021 and 91.5% of all OA documents in 2021. OA documents of these four types accounted for 32.4% of all publications between 2000 and 2021 and 48.0% in 2021.
The annual trends in OA and non-OA documents among the four major document types (Fig. 1) and among all types of documents (Fig. S1) are shown. The numbers (proportions) of OA documents produced in a single year were 148,642 (14.6%) in 2000 and 1,521,946 (48.0%) in 2021. The total numbers of both non-OA and OA documents increased from 2000 to 2021; however, the growth factor was higher for OA documents (×10.2) than for all documents (×3.1) and non-OA documents (×1.9).
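The growth factors reported above can be checked from the yearly OA counts and proportions. The following Python sketch is an illustration, not the procedure used in the study: the total and non-OA counts are back-calculated from the rounded percentages and are therefore approximate, and the function name `implied_total` is hypothetical.

```python
# OA counts and OA proportions reported in the text for the four major document types.
oa_2000, oa_share_2000 = 148_642, 0.146
oa_2021, oa_share_2021 = 1_521_946, 0.480


def implied_total(oa_count: int, oa_share: float) -> float:
    """Approximate total document count implied by an OA count and its rounded share."""
    return oa_count / oa_share


total_2000 = implied_total(oa_2000, oa_share_2000)
total_2021 = implied_total(oa_2021, oa_share_2021)
non_oa_2000 = total_2000 - oa_2000
non_oa_2021 = total_2021 - oa_2021

# Ratio of the 2021 value to the 2000 value for each group.
growth = {
    "OA": oa_2021 / oa_2000,
    "total": total_2021 / total_2000,
    "non-OA": non_oa_2021 / non_oa_2000,
}

for name, factor in growth.items():
    print(f"{name}: x{factor:.1f}")
```

Rounded to one decimal place, the three ratios agree with the ×10.2, ×3.1, and ×1.9 factors given in the text.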
Discipline-based quantitation of OA documents
The numbers and proportions of OA articles among the total publications are shown for six categories, 25 subcategories, and 254 disciplines. The numbers and proportions of OA are displayed as long-term data between 2000 and 2021 and the recent statistics from 2021. The data and their interpretation are presented for the six categories, and the full list of six categories and 25 subcategories is shown in Table 2.
Multidisciplinary sciences: the numbers and proportions of OA documents
This category corresponds to a single discipline: multidisciplinary sciences. The proportions of OA articles were 79% between 2000 and 2021 and 88.6% in 2021 (Table 2). The annual trends in OA and non-OA articles in the multidisciplinary sciences discipline showed a rapid increase in OA articles from 2008 to 2011 (Fig. 2). To explain this increase, data on the top 20 journals in this discipline (ranked by the number of published articles) are displayed in Table S1. These data include the numbers of total and OA documents and their proportions for 2000–2021 and for 2021, along with the year in which each journal was first indexed in the database. Among the top 20 journals, nine are recently founded OA journals. Four journals (PLoS One, Scientific Reports, Proceedings of the National Academy of Sciences of the United States of America, and Nature Communications) published 74.4% (2000–2021) and 63.4% (2021) of all OA articles in this discipline. A total of 686,485 documents were available as OA articles, and they constituted useful OA resources for every subject domain. The proportion of OA documents in Nature in 2021 was high (62.3% among all types of documents). The proportion of OA documents limited to the “article” type was higher (Fig. S2).
Health and medicine: the numbers and proportions of OA documents in four discipline subcategories
This category comprises many different health and medicine disciplines, classified into four subcategories (Table S2). The proportions of OA articles were 37% to 40% between 2000 and 2021 and 52% to 60% in 2021 (Table 2). Detailed data on four subcategories are shown in Tables S3–S6. The annual trends in OA and non-OA articles in the discipline of infectious diseases showed a rapid increase in OA documents in the most recent 10 years, whereas the number of fee-based articles was stationary (Fig. 3). The recent increase in OA articles is seen in almost every discipline, including oncology (Fig. S3).
Natural sciences: the number and proportion of OA documents in seven subcategories
Seven subcategories in the natural sciences category showed relatively high numbers of total and OA articles (Table S7). The most recent data from 2021 showed proportions of 41% to 62%, and the long-term data between 2000 and 2021 showed proportions ranging from 21% to 46% (Table 2). Two subcategories, biosciences and mathematics, showed high proportions (44%–46%) of OA. Four subcategories (physics, environmental sciences, geoscience, and agriculture/fishery) had OA proportions of 29% to 33%, and the chemistry subcategory showed the lowest proportion (21%). Details of the subcategories are shown in Tables S8–S14.
Particle physics and astronomy/astrophysics are known as fields in which OA research predominates, and the proportions of OA articles in 2021 were 88% and 83%, respectively (Table S8). The annual trends in astronomy/astrophysics showed a rapid shift to OA publishing starting in 2005–2006, and OA publications have thereafter predominated, accounting for 80% or more in recent years (Fig. 4). The annual trends in applied physics show a definitive transformation from fee-based publishing to OA publishing (Fig. S4).
Data on 23 disciplines within the natural sciences showed that the subcategory of biosciences had the highest proportions of OA among the seven subcategories, at 45.6% between 2000 and 2021 and 62.3% in 2021 (Table S9). Two disciplines within the biosciences subcategory, microbiology and virology, had high proportions of OA documents, exceeding 60%. The annual trends in the numbers of OA and non-OA documents in the microbiology discipline (Fig. S5) showed a predominance of OA articles over non-OA articles, and the pattern was similar to that of infectious diseases (Fig. 3). The discipline of biochemistry/molecular biology also showed an increase in OA articles, although the baseline of fee-based documents remained constant (Fig. S6).
The subcategory of chemistry showed a relatively low, but increasing, OA proportion (Table S10). The subcategory of mathematics showed a high OA proportion, similar to that of biosciences (Table S11). The subcategory of environmental sciences also showed a high proportion, but considering the common demands related to environmental issues, the figures were lower than expected (Table S12). However, the analysis of annual trends showed a recent rise in OA articles in the environmental sciences discipline (Fig. S7). The subcategory of geoscience had 30% of resources available as OA, corresponding to the middle of the natural sciences category (Table S13). The agriculture, fishery, and forestry subcategory showed a moderate penetration of OA (Table S14). The annual trends in dairy animal science and food science/technology showed a recent rise in OA articles (Figs. S8, S9). A few subcategories within the same category showed lower proportions; however, the total numbers of articles were small in these subcategories, suggesting selection bias.
Social sciences and others, including education, law, economics, and management: the numbers and proportions of OA documents in four subcategories
This category comprises four subcategories: social sciences; psychology; education; and law, economics, and management (Table S15). These subcategories have the common features of basic and applied social sciences. The proportions of OA articles were 23% to 32% between 2000 and 2021 and 36% to 44% in 2021 (Table 2). Detailed results for the four subcategories are shown in Tables S16–S19. Eight disciplines in the subcategory of law, economics, and management showed similar OA proportions and increasing trends over time (Table S19). The annual trends in OA and non-OA articles in the economics discipline are shown in Fig. 5. The annual trends in OA and non-OA articles in five disciplines—education/educational research, business finance, management, political science, and law—are shown in Figs. S10–S14.
Engineering: the numbers and proportions of OA documents in five subcategories
This category comprises several different domains of engineering, classified into five subcategories (Table S20). The proportions of OA articles were 17% to 33% between 2000 and 2021 and 31% to 39% in 2021 (Table 2). The details of the five subcategories are shown in Tables S21–S25. At least 16 disciplines in the subcategory of major engineering showed similar figures of OA proportions and increasing trends over time (Table S24). The annual trends in OA and non-OA articles in the discipline of electrical/electronic engineering are shown in Fig. 6. The OA trends in nanotechnology/nanoscience are presented in Fig. S15.
Humanities and arts: the numbers and proportions of OA documents in four subcategories
The category of humanities and arts consists of an arbitrary list of disciplines with low OA penetration (Table S26). The most recent data from 2021 showed 29% to 38% penetration, and the long-term data between 2000 and 2021 ranged from 12% to 22% (Table 2). Detailed data are shown in Tables S27–S30.
Among the 10 disciplines in the humanities and arts, the lowest OA was found for literature, with proportions of 11.6% between 2000 and 2021 and 28.8% in 2021 (Table S29). Three subcategories—general humanities, history, and arts—showed proportions of 19% to 22% between 2000 and 2021 and 34% to 38% in 2021. The annual trends in the literature disciplines in a 22-year period are shown in Fig. 7. The number of non-OA articles in literature remained stationary, while that of OA articles has been increasing gradually. The annual trends in linguistics have shown increases in both OA and fee-based articles (Fig. S16).
Annual trends in OA and non-OA documents
Four patterns are recognized in the annual trends of OA and non-OA documents.
Transformative pattern
A rapid increase in OA documents coupled with a rapid decrease in non-OA documents means that fee-based journals have been transformed into OA journals or that authors have chosen OA publishing. Examples of this pattern are astronomy/astrophysics (Fig. 4), electrical/electronic engineering (Fig. 6), applied physics, and education/educational research (Figs. S4, S10).
Rapid increase in OA and a plateau in non-OA documents
A rapid increase in OA publications with a stationary pattern in non-OA documents was the most common pattern in the categories of health and medicine and natural sciences. The expansion of these academic domains is evident, and most new articles are published in newly established OA journals. Multidisciplinary sciences, infectious diseases, and microbiology provide examples of this pattern (Figs. 2, 3, S5). The number of non-OA documents published in traditional journals remained constant in those disciplines. Some disciplines have strong fee-based journals with a high number of publications, so the relative proportion of non-OA documents remains high. Oncology, biochemistry/molecular biology, and dairy animal science are examples (Figs. S3, S6, S8).
Increases in both OA and non-OA documents
Increases in both OA and non-OA documents were observed in domains where the number of publications has recently increased. Examples in the social sciences category include the disciplines of economics, business finance, and law, while examples in the natural sciences are environmental science and food science/technology. Additional examples of this trend include nanoscience/nanotechnology in the engineering category and linguistics in the humanities.
Other patterns
The fourth pattern involved a minimal increase in OA documents, with non-OA documents remaining the predominant type of publishing. Examples are literature (Fig. 7) and management (Fig. S12). Other nonspecific patterns occurred, which were probably related to the small number of published documents.
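The four patterns described above were identified descriptively from the annual trend figures. As a hedged illustration of how such a classification could be expressed as explicit rules, the Python sketch below labels a discipline from its first-year and last-year document counts. The function name `classify_trend` and the thresholds `rapid` and `flat` are assumptions introduced here, not values used in the study, and the example counts are hypothetical.

```python
def classify_trend(oa_start, oa_end, non_oa_start, non_oa_end, rapid=2.0, flat=0.2):
    """Label a trend from yearly counts at the start and end of the period.

    rapid: assumed minimum OA growth ratio that counts as a rapid increase.
    flat:  assumed maximum relative change in non-OA counts that counts as a plateau.
    All start counts are assumed to be positive.
    """
    oa_ratio = oa_end / oa_start
    non_oa_change = (non_oa_end - non_oa_start) / non_oa_start

    if oa_ratio >= rapid and non_oa_change < -flat:
        return "transformative"
    if oa_ratio >= rapid and abs(non_oa_change) <= flat:
        return "rapid OA increase, non-OA plateau"
    if oa_ratio > 1 and non_oa_change > flat:
        return "both increasing"
    if oa_ratio < rapid and oa_end < non_oa_end:
        return "minimal OA increase, non-OA predominant"
    return "nonspecific"


# Hypothetical yearly counts for demonstration only (not data from the study).
examples = {
    "A": (100, 800, 900, 300),
    "B": (100, 600, 500, 520),
    "C": (100, 300, 400, 700),
    "D": (50, 80, 900, 900),
}
labels = {name: classify_trend(*counts) for name, counts in examples.items()}
for name, label in labels.items():
    print(name, label)
```

Comparing only the first and last years ignores the shape of the curve in between, so a rule set of this kind would be a coarse complement to the visual inspection of annual trends, not a replacement for it.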
Discussion
Key results
The growth in the numbers and proportions of OA documents during the study’s approximately 20-year period is notable. The number of OA articles available on WoS between 2000 and 2021 was 12 million, amounting to 32.4% of all articles. In 2021 alone, the number was 1.5 million, amounting to 48.0%. The proportion of OA documents was the highest in the multidisciplinary, natural sciences, and health and medicine categories, in which OA documents comprised 50% or more of the total documents. The categories of social sciences and others, engineering, and humanities and arts had proportions of around 30% to 40%, but these proportions are increasing.
Interpretation
The availability of OA resources can be expressed as a percentage of all documents. Previous research has reported different figures depending on the types of resources and years of publication. In one study, the proportion stood at 45% among articles published in 2015 and 28% in the cumulative data at the same time [14]. Our equivalent figures are 49.5% for 2021 and 35.6% for 2000 to 2021. However, the time of assessment is an important factor. A dramatic change occurred due to advances in the search function, and we believe that the aggregation of metadata on hybrid gold OA and green OA could uncover previously hidden OA documents. Another example of such a difference is a reported proportion of 14% in 2019 [15] versus 48.4% in 2021 (current study) in the physics and astronomy subcategory. This difference is partly explained by the inclusion of only gold OA in the study of Demeter et al. [15], whereas our data include other types of OA. The details of document types were not explained in these earlier reports; thus, we selected only the four types necessary for services and analyzed the absolute numbers and proportions among all resources. We found that an optimal denominator was critical for assessing the proportion, and the document type was one of the important criteria for defining it. Four document types (articles, review articles, proceedings papers, and letters) were chosen for the data pool of our denominator [16]. The annual trends used for evaluation were simplified to 2000–2021 and 2021, and an annual comparison between OA and non-OA documents was added for individual subjects.
The academic discipline was an important parameter for our assessment of the proportion of OA documents, reflecting its importance for user services. The domains of research subjects can be defined in different manners, and the details of the scope, level, and other service-related factors should be critically reviewed.
The most significant result of this study was the high rate of OA documents in the multidisciplinary category. This category did not show OA predominance until 2009, when a rapid surge in OA documents was observed (Fig. 2). Two OA journals, PLoS One and Scientific Reports, were the leading cause of this change. We also found a major change in researchers’ publishing patterns. Domain-specific research has gradually been replaced by interdisciplinary or multidisciplinary research, and authors more often select multidisciplinary journals for publication. An increase in OA documents was also noted in premium scholarly journals, such as Nature (Fig. S2) and Science. We believe that these journals have been influenced by the OA mandate policy in developed countries in the West. The best research papers supported by the governmental research funds of top-ranked countries are published in these top journals as hybrid gold OA. The documents in this multidisciplinary category are a fundamental resource for new OA-based library services.
Significant trends for OA were also observed in the health and medicine category. The public demands for health and well-being reflect this trend. Authors willingly share their research, and health professionals feel satisfaction if they become well-known for their scientific excellence. The high rates of OA in articles on infectious diseases and general healthcare are manifestations of the authors’ willingness to share. The participation of medical and healthcare professionals as volunteers for ancillary services, in addition to OA scholarly library services, is also expected.
Among the seven subcategories in the natural sciences category, biosciences and astronomy/physics had high proportions of OA publishing. Biosciences have a few overlapping features with healthcare and medicine, and basic and applied health researchers, including students, will benefit from OA articles on biosciences. The discipline of astronomy/physics is a distinctive case. Researchers in high-energy physics and astrophysics initiated their own OA projects and, since 2014, have made collective efforts toward OA. The Sponsoring Consortium for Open Access Publishing in Particle Physics (SCOAP3) covers more than 11 journals, books, and repositories [17]. However, chemistry and industry-oriented sciences showed lower proportions of OA.
Documents in the engineering category had relatively low OA proportions, and authors in these disciplines often worry about intellectual property rights when their articles are freely available. Although improving trends were observed during the study period (extending through 2021), the numbers are still lower than those in the categories of natural sciences and health and medicine. Within engineering, the multidisciplinary engineering discipline showed a relatively high proportion of OA documents (57% in 2021). High OA proportions were further observed in marine and ocean engineering, metallurgy, and biomedical engineering. Low OA proportions were noted in chemical, geological, and environmental engineering.
Academic documents in the categories of “social sciences and others” and “humanities and arts” are published more often as monographs than as journal articles. Users favor electronic or analog versions of books, and digital transformation in these subject domains has been slower. It is necessary, however, to keep old literature available online so that it can be searched by those who want to obtain the corresponding knowledge in detail. Wikipedia and other types of online resources are used for these subjects, and OA documents will add more value to these existing knowledge resources.
Library services using OA resources are the ultimate goal of current OA initiatives [18]. The production of OA documents is important, and the visibility of those open documents can be enhanced by the development of research technologies [19] and timely library services for various types of users not limited to academic scholars. This study assessed the fundamental infrastructure for OA scholarly information services.
Limitations
The availability of the full text of those OA documents may differ among users because different types of OA require different routes of access. This study does not guarantee that all users have access to these OA documents. The Unpaywall [20], Google Scholar, and Scholytics services can help users access the full text of those OA documents.
The complexity of OA document production and use is not covered in this article. The value of the conceptual status of journal goods from the viewpoint of Ostrom [21] and library services are summarized in our previous research report [11], Table S31, and Fig. S17.
Notes
Conflict of Interest
No potential conflict of interest relevant to this article was reported.
Funding
The author received no financial support for this article.
Acknowledgements
The author is grateful to Dr. Eun-Hee Hyun (National Assembly Library of Korea, Seoul, Korea), So-Hyeong Kim (National Research Foundation of Korea, Seoul, Korea), and Eunjung Shin (Science and Technology Policy Institute of Korea, Sejong, Korea). They participated in discussions on the national policy of open access of Korea, which is part of the subject of this manuscript.
Supplementary Material
Supplementary files are available from https://doi.org/10.7910/DVN/W4EVZI
Supplementary tables
Supplementary figures