
CHEM 184/284 (Chemical Literature) - Huber - Winter 2025: Lecture 18

A two-credit course in the techniques and tools for effectively searching the literature of chemistry, biochemistry, chemical engineering and related fields.

Open Science - What is it? Why do it?

Open Science

Open Science, sometimes referred to as Open Research or Open Scholarship, is the idea that the products of research, especially publicly funded research, should be made widely available without charge, both to make the results of research, especially in medicine, widely available and to promote new research based on them.

See the National Academies Press (2018) document, Open Science by Design: Realizing a Vision for 21st Century Research  https://nap.nationalacademies.org/catalog/25116/open-science-by-design-realizing-a-vision-for-21st-century

 

The three generally recognized main pillars of open science are:

  • Open access publishing
  • Open data
  • Open software/code.

Other facets sometimes included in open science are:

  • Open protocols/procedures
  • Open instrumentation/materials
  • Open peer review
  • Open educational resources (OER)

For a more detailed discussion of each of these, see the box below.

 

 

The Facets of Open Science

The open access publishing movement grew out of the intersection of two forces: the tremendous growth in subscription prices of scholarly journals in the 1980s and 1990s, and the desire to make scientific discoveries, especially in biomedicine, freely available to doctors, researchers and the general public. The development of the World Wide Web as a platform for scholarly publishing facilitated the movement.

There are two terms that are frequently used synonymously but are really distinct.

  • public access publishing - Publications which are freely available for all to read, without a subscription. Subscription journals are frequently referred to as being "behind the paywall".
  • open access publishing - Open access publications are free of subscription charges AND are either in the public domain or carry Creative Commons licenses allowing the reuse of the publications and the creation of derivative works based on them. The Directory of Open Access Journals (https://doaj.org) lists only journals that are fully open access, that is, with no articles behind a paywall and all with Creative Commons licensing.

Economic Models of Open Access Publishing

In the world of print, scholarly journals were typically supported by subscriptions, either from individuals or from institutions/libraries, with some, notably Nature and Science, also receiving significant revenue from advertisers. Electronic institutional subscriptions led to a decline in individual subscriptions, which in turn decreased the attractiveness of scholarly journals to advertisers. If subscription-free, open access publishing was to work, new economic models would have to be devised. Among those are:

  • article processing charges (APCs) - Instead of the reader paying for access, the author pays for the costs of editing, reviewing, etc. APCs can range from a few hundred dollars to over ten thousand dollars per article. Note that they are not charged at the point of submission, but only after the article has been accepted for publication. This model is often referred to as gold open access. Some journals give authors a choice of whether their articles will be behind a paywall or open access. These are frequently referred to as hybrid journals. Funding agencies or research institutions may provide money to support APCs. APCs are frequently criticized for transferring the burden of supporting publication from readers to authors, especially authors in the Global South.
  • article repositories - Another approach is to place copies of the articles on a freely accessible server. This is often referred to as green open access. Such repositories are usually not affiliated with a particular journal or publisher, but rather with an academic institution, such as UC's eScholarship, or a governmental institution, such as PubMed Central. The repository may be focused on a single subject area or be general purpose.
  • platinum/diamond open access - These terms are sometimes used to refer to journals which function without either subscriptions or APCs, but which are supported by some other funding source. In some cases, they are subsidized by revenue from other publications, or by institutions voluntarily paying the equivalent of subscriptions to support the journal even though its contents are freely available.
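The distinctions among these models can be summarized as a small decision procedure. The following is a minimal sketch in Python, not an official classification scheme: the Journal class, its two fields, and both function names are hypothetical and chosen only for illustration, and real journals often blur these lines.

```python
from dataclasses import dataclass


@dataclass
class Journal:
    """Hypothetical, simplified description of how a journal is funded."""
    subscription: bool  # readers or their libraries pay to read some or all articles
    charges_apc: bool   # authors can or must pay an article processing charge


def publishing_model(journal: Journal) -> str:
    """Map the two funding flags onto the labels used in the list above."""
    if journal.subscription and journal.charges_apc:
        return "hybrid"            # paywalled by default, open access if an APC is paid
    if journal.subscription:
        return "subscription"      # everything behind the paywall
    if journal.charges_apc:
        return "gold"              # no subscriptions, supported by APCs
    return "diamond/platinum"      # neither subscriptions nor APCs


def is_free_to_read(journal: Journal, apc_paid: bool, in_repository: bool) -> bool:
    """Decide whether one article can be read without a subscription."""
    model = publishing_model(journal)
    if model in ("gold", "diamond/platinum"):
        return True
    if model == "hybrid" and apc_paid:
        return True
    # Green open access: a copy deposited in a repository is free to read
    # even when the journal version is behind a paywall.
    return in_repository
```

Note that green open access appears here as a property of the individual article (whether a copy was deposited), not of the journal, which is why it is handled separately from the journal's funding model.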

Open Access Mandates

While many researchers have embraced the principles of open access publishing, many have not. However, they may be required to publish open access, usually in one of two ways.

  • institutional mandates - In 2013, the UC Academic Senate passed a rule requiring all UC faculty to deposit copies of their final submitted manuscripts in eScholarship. Shortly thereafter, the UC Office of the President issued a similar rule applying to all UC researchers. Note that the UC mandate does allow faculty to opt out of the requirement. Many major academic institutions have issued similar requirements.
  • funding mandates - Even more effective, some agencies which fund research now require grant recipients to publish open access. Many European funding agencies have joined together in Plan S, which requires publication of research results in fully open access journals only. In the United States, the National Institutes of Health require grant recipients to deposit copies of their manuscripts in PubMed Central. In 2022, the White House Office of Science and Technology Policy (OSTP) issued a memo requiring ALL federal agencies which fund research to draw up public access plans for their grant recipients by the end of 2025. However, in February 2025, the Trump administration revoked this memo. It is not yet known what effect this will have on agencies which have already drawn up public access plans.

As a result of such mandates, and the general growth of enthusiasm for open access, the fraction of total research published in some form of open access has grown dramatically over the past 20 years. However, it is still not universal, even in the sciences.

 

 

 

Open Data

Following on the OA publishing movement came a push for Open Data. Access to the original data on which research conclusions are based is essential for other researchers to verify research results and to build on them. Support for open data grew most quickly in the discovery sciences, that is, those which uncover what already exists in nature. This was especially true for areas of research that depend on pooling large collections of data. These "big data" fields include genome research, astrophysics and some areas of environmental science. Adoption of open data has been slower in areas of invention science, that is, where researchers are creating new substances or devices, such as pharmaceutical chemistry or most areas of engineering. Not coincidentally, these are areas which often lead to patentable inventions.

In 2016, a group defined the FAIR Data Principles. The acronym stands for:

  • Findable -  Data are made available in a searchable source, with metadata which clearly identify the nature and source of the data.
  • Accessible - Data are retrievable using free, open and standardized protocols.
  • Interoperable - Data are stored using a formal, accessible, and shared language and vocabularies.
  • Reusable - Data are richly described, clearly licensed and have detailed provenance.
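To make the four principles more concrete, the following is a minimal sketch in Python of a checklist that reports which descriptive fields a dataset record is missing. The field names, the grouping of fields under each principle, and the example record (including its placeholder DOI and URL) are all assumptions made for illustration; they are not taken from the FAIR principles themselves or from any particular repository's metadata schema.

```python
# Hypothetical mapping from each FAIR principle to metadata fields that
# would support it. Real repositories define their own schemas.
FAIR_FIELDS = {
    "Findable": ["identifier", "title", "keywords"],
    "Accessible": ["access_url"],
    "Interoperable": ["file_format", "vocabulary"],
    "Reusable": ["license", "provenance"],
}


def fair_gaps(record: dict) -> dict:
    """Return, for each principle, the fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Illustrative record only; the identifier and URL are placeholders.
example = {
    "identifier": "doi:10.0000/example",
    "title": "Crystal structure of an example compound",
    "keywords": ["crystal structure"],
    "access_url": "https://repository.example.org/datasets/1234",
    "file_format": "CIF",
    "vocabulary": "",          # no shared vocabulary declared
    "license": "CC BY 4.0",
    # "provenance" is missing entirely
}

print(fair_gaps(example))
# prints {'Interoperable': ['vocabulary'], 'Reusable': ['provenance']}
```

A check like this only confirms that the fields are filled in. Whether the metadata are actually rich and accurate enough for reuse still requires human judgment.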

 

FAIR data principles are implemented in different ways by different research communities. Data may be stored in institutional repositories, such as those maintained by both the UC and UCSB, or in discipline-specific repositories. Examples of the latter include the Protein Data Bank for three-dimensional structures of proteins and other biological macromolecules, and the Cambridge Structural Database, maintained by the Cambridge Crystallographic Data Centre (CCDC), for crystal structures of organic and organometallic substances. In general, all crystallography journals now require authors to deposit their data in the CCDC database.

Open Software / Open Code

As mentioned in the discussion of copyright in Lecture 6, computer software is an expression of an idea (algorithm) in a fixed form, and is therefore protected by copyright (though some software is also patented). However, the software used to collect and interpret scientific data is an essential part of being able to verify and reproduce that data and any conclusions drawn. So, FAIR data implies the availability of the associated software in open form.

There are various repositories in which software developers may deposit their code, making it available for reuse and for the creation of derivative works, usually under an "attribution" and "share alike" license. These include:
