CHEM 184/284 (Chemical Literature) - Huber - Winter 2022: Lecture 10

A two-credit course in the techniques and tools for effectively searching the literature of chemistry, biochemistry, chemical engineering and related fields.

Lecture 10: Chemical Abstracts Service, Introduction and the Legacy of Print

Chemical Abstracts Service: http://www.cas.org/

  • Chemical Abstracts Service was founded in 1907 as a division of the American Chemical Society (https://www.acs.org).
  • The first volume contained 15,000 abstracts and was distributed free of charge to ACS members. Indexing and abstracting were done by professional chemists acting as volunteers.
  • Today: over 1,000,000 abstracts per year, with indexing done by a team of dozens of professional indexers, most with PhDs in chemistry or related sciences, and hundreds of support staff.
  • More than 38 million abstracts total have been published (as of January, 2014.)

What CAS Does

Other CAS Products and Services

  • Chemical Industry Notes (CIN) -- indexes the literature of chemical business (e.g. Chemical & Engineering News, Chemical Week.)
  • Chemical Abstracts Service Source Index (CASSI)
    • Lists all periodicals ever indexed by CAS
    • Lists many pre-1907 sources, such as those appearing in Beilstein
    • Available on CD-ROM and in the web version above.
    • Listings in the CD-ROM version include language information, starting dates and current volume numbers, cross-references to changed titles or translations and holdings information. The Web version has only title, abbreviation, ISSN and CODEN information.
  • CAS IP Services -- The CAS Search Service
    • Professional scientific searching on a task-by-task basis, by the highly trained information professionals at CAS.
  • CAS Chemical Compliance Index - National Chemical Inventories
    • CAS supplies inventories and lists of regulated chemicals from dozens of nations on the Web.
    • This product is especially useful to manufacturers and shippers of chemicals who need to know the applicable regulations in the countries where they operate.
  • CAS Analytical Methods
    • CAS Analytical Methods is an umbrella title for a variety of research methods collected from the literature by CAS for easy retrieval and comparison.
  • PatentPak
    • PatentPak is a tool for chemical patent specialists.  It provides full-text of patents, enhanced with the ability to jump directly to the section of the patent in which a particular chemical substance is identified.
    • PatentPak is integrated with both SciFinder and STN. UCSB users have access to PatentPak if they are using SciFinder-n.
  • ChemZent
    • ChemZent is a digitized version of Chemisches Zentralblatt, a German index to the chemical literature which was published from 1830 to 1969, offering unique access to the very earliest days of modern chemical science.
    • The CAS implementation is unique, in that it is searchable and displayable in English as well as German.
    • ChemZent is integrated with SciFinder or SciFinder-n; however, it is an added-cost product, and UCSB does not currently have access to ChemZent.
  • Formulus
    • Formulus is a database of formulations, that is, mixtures of chemicals designed for a specific purpose.
    • It is currently aimed primarily at the pharmaceutical and agrochemical industries.
    • It brings together data on the formulations, their chemical components, supplier information, regulatory information and shelf life of the formulation.
  • CAS Custom Services
    • CAS offers a variety of technology, content and knowledge services, building on their huge databases, and expertise in manipulating such databases.
  • and lots more, mainly aimed at the chemical industry...

Importance of Chemical Abstracts

Analyzing Chemical Abstracts in terms of the general properties of indexes:

  • Scope
    • CA attempts to cover chemistry in the broad sense...anything that might be interpreted as new research in chemistry or chemical engineering
    • Chemistry as the "central science". CA's coverage has high overlap with medicine, biology, physics, materials, agriculture, geology, etc., making it important for researchers in those fields as well.
    • Note: since CA focuses on "new research", in earlier times it did not always index all chemical patents, only those deemed to have "new chemistry". Nowadays, however, patents are selected for coverage on the basis of the chemistry-related International Patent Classification codes.
  • Comprehensiveness
    • CA attempts to cover the literature of chemistry worldwide, in any language.
    • It attempts to cover all forms of primary chemical literature.
    • Note that in some cases - technical reports and dissertations - it depends on secondary sources and indexers do not read the original documents.
  • Chronological coverage
    • Print CA began in 1907 and electronic CA in 1967, but the whole CA collection back to 1907 has now been digitized, and CAS has added to the electronic database selected records from 1876 to 1906, including the early issues of JACS and J. Phys. Chem. It has also added records from Chemisches Zentralblatt, an early chemical literature index, which will eventually cover 1830-1906. (Note: This product is an added-cost option in SciFinder; not all SciFinder users have access.) See ChemZent above.
    • Abstracts are updated daily in all the online forms. Basic bibliographic information for the more than 1,500 core journals and nine key patent authorities is online within two days after receipt at CAS. Other types of documents, especially technical reports and dissertations, may have a significantly greater time lag.
    • Online records are first added with bibliographic data and abstracts only; detailed indexing is added as it is completed.
  • Access points
    • Printed volume indexes were indexed by author, subject heading, systematic chemical name, molecular formula and patent number (see below).
    • Electronic forms combine keyword and subject heading approaches, and add roles for chemical substances.
    • In the online form, links to Registry File add enhanced searching of chemical substances, including structure searching.
    • Chemical substance records have been enhanced with chemical property data, some of which is searchable depending on the electronic interface used.
    • The database also has cited references for journal articles, conference papers and patents from key issuing authorities from 1997 on.
  • Constant enhancements: CAS has been in the forefront of computerization of indexing for over 30 years and is always refining its search tools.

Chemical Abstracts in Print

Why Devote a Lecture to Chemical Abstracts in Print?

As of January 2010, Chemical Abstracts Service discontinued the publication of a print edition of Chemical Abstracts. The principal reasons for this were economic. More and more institutional subscribers were dropping subscriptions to print CA in favor of SciFinder or other methods of electronic access. As a result, the per-copy costs of printing volumes of CA kept growing. As worldwide Internet access improved, the need to provide print to even the smallest and most remote institutions was going away. So...a 102-year-old tradition ended.

So, with all the abstracts from 1907-present now available in electronic form, why should we spend time looking at the structure of the print version?

  • Not all institutions/businesses will have ready access to electronic CA. You may have to use the print form, if not here, then elsewhere in your career. It's rare, but possible.
  • More importantly, some of the characteristics of indexing in CA were born in the print medium, and still affect the way you search in the electronic form. For example, searching by chemical name in SciFinder is still affected by the nomenclature rules developed for CA in print. Look for notes labeled: Relevance to online searching in each section below.

Arrangement of Abstracts in Print CA

  • For ease of browsing, abstracts were grouped by subject area.
  • There are 80 subject sections (see http://www.cas.org/content/ca-sections), divided into five broad groups.
  • Biochemistry and Organic Chemistry used to come out in odd-numbered weeks, while Macromolecular Chemistry, Applied Chemistry and Chemical Engineering and Physical, Inorganic and Analytical Chem. used to come in even-numbered weeks. Now, abstracts are added in all sections each week.
  • Cross-references were used where a given abstract might legitimately appear in more than one section.
  • Note that subject sections change with time to reflect current research. Now that the print product has ceased, it is likely that the subject section codes will remain stable henceforth.
  • One volume per year was published until 1962, when they switched to two volumes per year. Collective Indexes were issued every ten years until 1957, and every five years since then.
  • Abstracts have been individually numbered only since 1967. From 1907-1932, pages were numbered, and indexes would refer to a page number, with a superscript denoting the order of the abstract on the page. Example: 321⁶, for the sixth abstract on page 321.
    From 1933-1966, each page had two columns of abstracts which were numbered, with letters running down the center of the page to identify where on the page the abstract fell. Example: 1733h would be near the bottom of page 1733.
    Since 1967, abstract numbers have been of the form 223717w, where the letter is meaningless except as a sort of check digit.
  • Relevance to online searching: CA Section Codes can be used to refine searches in SciFinder-n, and can be searched directly in the STN versions of CAPLUS. The STN version has an online thesaurus of section codes, allowing you to easily track the changes in section codes over the years. CAS abstract numbers frequently show up as references in chemical reference works. You can search by abstract/accession number in SciFinder/STN to find the relevant abstracts and go from there to the original articles and other data.
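
The volume and reference conventions described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a CAS tool: the function names are hypothetical, the caret form 321^6 is an assumed plain-text stand-in for the printed superscript, and the volume arithmetic assumes that volume 1 corresponds to 1907 and that the two-volumes-per-year pattern starts with volumes 56 and 57 in 1962.

```python
import re


def volumes_for_year(year):
    """Return the print CA volume number(s) published in a given year."""
    if not 1907 <= year <= 2009:
        raise ValueError("print CA ran from 1907 through 2009")
    if year < 1962:
        return [year - 1906]            # one volume per year: 1907 -> vol. 1
    first = 56 + 2 * (year - 1962)      # two volumes per year from 1962 on
    return [first, first + 1]


def parse_reference(year, ref):
    """Split a print CA abstract reference according to the era it comes from."""
    if year <= 1932:
        # Page number plus superscript position, written here as "321^6"
        # (the caret notation is an assumption of this sketch).
        m = re.fullmatch(r"(\d+)\^(\d)", ref)
        if not m:
            raise ValueError(f"expected page^position, got {ref!r}")
        return {"style": "page+superscript",
                "page": int(m.group(1)), "position": int(m.group(2))}
    m = re.fullmatch(r"(\d+)([a-z])", ref)
    if not m:
        raise ValueError(f"expected digits followed by a letter, got {ref!r}")
    if year <= 1966:
        # The letter marks how far down the page the abstract falls.
        return {"style": "page+letter",
                "page": int(m.group(1)), "position": m.group(2)}
    # From 1967 on, the letter is only a check character.
    return {"style": "abstract number",
            "number": int(m.group(1)), "check_letter": m.group(2)}


print(volumes_for_year(1962))
print(parse_reference(1950, "1733h"))
print(parse_reference(1990, "223717w"))
```

The sketch does not verify the check letter; it only separates it from the abstract number, which is the part you would actually look up.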

Contents of the Abstract Record

  • All CA records contain:
    • Title of the document
    • Author(s) or inventor(s) for patents
    • Corporate source or patent assignee information
    • Source Information, e.g. journal title, volume, issue, pages or patent numbers
    • Language
    • Abstracts (usually)
  • Authors' names appear as given in the original document.
  • Abstracts for journal articles are usually those written by the author.
  • Patent abstracts may be fleshed out by the indexer.
  • Dissertations and some other documents have no abstracts.
  • Note that in the early days of CA, the abstracts tended to be much longer and more detailed; nowadays, the abstracts are usually the same as those in the published paper.
  • Relevance to online searching: The basic structure of the abstracts has not changed in the electronic forms, though the electronic forms display additional information, such as the CA index terms (subject headings), Registry Numbers for the substances indexed in the record, and, for entries post-1997, cited references.

Abbreviations

  • Journal names are listed using CASSI abbreviations.
  • Corporate names are heavily abbreviated.
  • All abstracts use abbreviations for common chemical terms (see CAS Standard Abbreviations and Acronyms at http://www.cas.org/content/cas-standard-abbreviations.)
  • Relevance to online searching: CAS journal name abbreviations are still heavily used in chemical journal references. SciFinder and STN records now display the full journal titles, though the abbreviations are still searchable. Journal abbreviations are also still used in lists of cited references associated with documents. Abbreviations of terms in the abstracts still appear, but the online search tools can automatically look for the abbreviated forms when you enter a search term which may be abbreviated.

Indexing in Print CA

  • The types of indexing available in CA reflect the constraints of print.
  • The indexing in the Volume and Collective Indexes is more systematic, but still reflects the limitations of print.
  • Volume & Collective Indexes
    • Author
    • Chemical Substance
    • General Subject
    • Molecular Formula
    • Patent

Author Indexing

  • Volume and Collective Indexes
    • First authors get both the abstract number and title of the paper listed under their names.
    • The author name is not necessarily the form used in the article, but may be a standardized form of the name. (Note: in recent years, CAS has largely given up on name standardization and uses the form found in the document.)
    • Other authors are cross-referenced to the first author of the document.
    • Examples:
      • Ford, Peter Campbell
        Quantitative mechanistic studies of the photoreactions of... 148754a
      • Lange, Frederick Fouse
        See Miller, Kelly T.; Sudre, Olivier
        ---; Lam, D.C.C.; Sudre, O.
        Powder processing and densification of ceramics 144196x
    • Even though CA tries to pull all of an author's works under one name, it cannot always distinguish authors with the same initials, so it alphabetizes by last name and initials, even where the full name is spelled out! Examples:
      • Ellis, A.
      • Ellis, Arthur Baron
      • Ellis, A. D.
      • Ellis, Anthony Ewart
      • Ellis, Avery K.
      • Ellis, Andrew Michael
      • Ellis, Albert T.
  • Spelling of Author Names: Be aware of special rules for handling certain names. Names with "Mc" or umlauted letters or transliteration from non-Roman alphabets can be tricky. Example:
    • Mössbauer is listed as Moessbauer
  • Relevance to online searching: CAS no longer attempts to use uniform versions of author names, but rather sticks with the form used in the document itself. Transliteration rules still apply. SciFinder author searching will often (though not always!) find alternative spellings of an author's name(s) for you. SciFinder-n does not use this system, but offers a drop-down autocomplete list of possible author names as you type your entry.
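
The filing order in the Ellis example, and the Mössbauer transliteration, can be reproduced with a short Python sketch. The helper names are hypothetical, and the umlaut table assumes the usual ae/oe/ue convention; only the ö case is confirmed by the example above.

```python
# Assumed transliteration table; only "ö" -> "oe" is taken from the text.
UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}


def transliterate(name):
    """Replace umlauted letters with their two-letter equivalents."""
    return "".join(UMLAUTS.get(ch, ch) for ch in name)


def index_sort_key(name):
    """Sort key for 'Last, Given Names': last name, then initials only."""
    last, _, given = transliterate(name).partition(",")
    # Reduce every given name (spelled out or not) to its first letter.
    initials = tuple(part[0].upper() for part in given.replace(".", " ").split())
    return (last.strip().lower(), initials)


authors = [
    "Ellis, Albert T.",
    "Ellis, Arthur Baron",
    "Ellis, A.",
    "Ellis, Andrew Michael",
    "Ellis, A. D.",
    "Ellis, Avery K.",
    "Ellis, Anthony Ewart",
]
ordered = sorted(authors, key=index_sort_key)
for name in ordered:
    print(name)
print(transliterate("Mössbauer"))
```

Because the key keeps only initials, "Arthur Baron" files as A. B. and so lands between "A." and "A. D.", which is the order shown in the printed index.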

Patent Indexing

  • Chemical Abstracts only indexes the first version of each patent it receives.
  • However, the patent index (arranged by country code and patent number) gives cross-references from later, equivalent patents, that is, the same invention by the same inventor, patented in a different national or international patent office.
  • When searching for an equivalent patent, start at the year of issue of the known patent reference and work forward until you find the equivalent or run out of indexes.
  • Relevance to online searching: Nowadays, CAS uses patent family information from other sources to make all the relevant multi-national patent numbers and application numbers for a given chemical invention searchable and displayable. A separate concordance is no longer necessary. In some cases, CAS now indexes multiple versions of the same patent, reflecting the different amounts of information provided by different patent issuing authorities.
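
The forward search through the printed patent indexes can be pictured as a simple loop. The following Python sketch is purely illustrative: the function name, the dictionary layout and the patent numbers (XX 0000001, YY 0000002) are hypothetical placeholders, not real index data.

```python
def find_equivalent(known_patent, issue_year, patent_indexes):
    """Scan yearly patent indexes forward from the year of issue.

    patent_indexes maps a year to a dict of {patent number: index entry}.
    Returns (year, entry) for the first hit, or None if the indexes run out.
    """
    for year in sorted(y for y in patent_indexes if y >= issue_year):
        entry = patent_indexes[year].get(known_patent)
        if entry is not None:
            return year, entry
    return None


# Placeholder data: the known patent only shows up as a cross-reference in 1981.
demo_indexes = {
    1979: {},
    1980: {},
    1981: {"XX 0000001": "see YY 0000002"},
}
print(find_equivalent("XX 0000001", 1980, demo_indexes))
```

Years before the year of issue are skipped, since an equivalent cannot be cross-referenced before the known patent exists.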

Concept Indexing in Chemical Abstracts

  • Weekly indexes used keyword indexing, which still carries over into the electronic versions.
  • Volume and Collective Indexes used systematic indexing for both general concepts and chemical substances, and this too, carries over into the electronic versions.

Keyword Indexing

  • Keywords are assigned by the indexer based on the body of the document, not just the title or abstract.
  • Terms are often abbreviated, following the standard CA abbreviations.
  • To save space, a keyword is not assigned if it's part of the section heading for the section the abstract appears in, e.g. "Steroids".
  • Additional keywords are listed beneath the main keyword heading to flesh out the concept (like the co-terms in Science Citation Index).
  • Chemical names are listed along with concept terms in the issue indexes. The chemical names are not systematic, but follow the author's nomenclature.
  • Example
    Article title: "Facile preparations of 4-fluororesorcinol"
    • Acetophenone
      methoxy fluorination regiochem
    • Benzene
      fluoro dihydroxy
    • Deacetylation
      demethylation fluorodimethoxyacetophenone
    • Demethylation
      fluorodimethoxybenzene
    • Methoxybenzene
      methoxyacetophenone fluorination regiochem
    • Fluorodihydroxybenzene
    • Fluororesorcinol
    • Resorcinol
      fluoro
  • Relevance to online searching: Keywords like the ones above still appear in the online records as supplementary terms. They are searched whenever you do topic searching (SciFinder) or basic index searching (STN). In STN you may specify this as a searchable field, or as a Filter field in SciFinder-n. The latter can be extremely useful if you are searching for a phrase containing a stopword, such as "in vitro".

Volume and Collective Indexes: General Subject Index

  • The General Subject Index uses standard subject headings in order to better bring related documents together (collation).
  • The standard headings list does get modified and expanded to reflect new areas of research. Major changes are usually done at the beginning of a Collective Index period. Sometimes the changes are minor, sometimes drastic.
  • Prior to 1997, headings were chosen so as to draw related topics into physical proximity in the printed volumes, with electronic searching treated as a secondary aspect of CA. In 1997, headings were changed to be more like natural language for easier electronic searching, with the print version treated as a secondary aspect of CA. However, these new headings in turn proved unpopular, and many changed back in 1999.
  • Note: to help cope with these changes, CAS has developed an electronic thesaurus, called CA Lexicon, which is available as part of the CA Databases on STN. It currently covers the subject headings from 1907-present. It has not yet been fully implemented for SciFinder.
  • However, CAS has used the Lexicon to algorithmically change the subject headings in the electronic files for 1997-1998 to conform with the previous and subsequent versions. Moreover, SciFinder Scholar has some built-in synonym checking which carries out part of the functions of the Lexicon, at least for commonly used synonyms (e.g. cancer, carcinoma, neoplasm.)
  • Broadly speaking, the General Subject Index includes:
    • classes of chemical substances
    • physical and chemical phenomena
    • types of reactions
    • chemical technology
    • industrial processes and equipment
    • scientific names for living organisms
    • biological and medical terminology
  • For extensive subjects, qualifiers were added as part of the main subject heading, such as Blood, analysis. For 1997-98, the qualifiers become part of a single heading: Blood analysis. In 1999, the system reverted to the pre-1997 standard.
  • Pre-1997, classes of substances used to have derivative categories, such as Carboxylic acids, esters. From 1997-98, there is simply a heading for Carboxylic acid esters. In 1999, the pattern reverted.
  • Classes of compounds in both periods have qualifiers, but the specific qualifiers have changed. Sulfonic acids, uses and miscellaneous was the old usage. Now, Sulfonic acids, miscellaneous and Sulfonic acids, uses are separate. This change has not reverted.
  • Note: the following list of substance categories applies to pre-1997 indexes and post-1998 indexes. In the two-year interval, most were replaced by separate headings.
  • Substance Categories
    • For ketones, aldehydes
      - acetals, hydrazones, mercaptals, oximes
    • For acids
      - anhydrides, anhydrosulfides, esters, lactones
    • For alcohols
      - ethers
    • For amines
      - oxides
    • General: compounds, derivatives, polymers
  • Heading Qualifiers (old)
    • For substances and classes of substances
      • analysis
      • biological studies
      • occurrence
      • preparations
      • properties
      • reactions
      • uses and miscellaneous
      • New: all of the above, plus formation (nonpreparative) and processes, as well as separate uses and miscellaneous categories.
    • In the electronic versions of the file, these have evolved into role indicators. Note that the full detail of roles and even some higher level roles (NANO or nanomaterials) have not yet been implemented in SciFinder.
    • For organs and tissues (old)
      • composition
      • disease or disorder
      • metabolism
      • neoplasm
      • toxic chemical or physical damage
    • In 1997-98 subject headings, the disease and neoplasm headings have been combined with their respective organ or tissue to form separate primary headings.
    • For alloys (old)
      • base - applied to the largest single constituent of the alloy.
      • non-base -- applied to other constituents of the alloy.
    • Most alloys are now listed by type not constituent; to search by constituents, use the Chemical Substance Index.
  • Relevance to online searching: All of these rules still affect the assignment of index terms/subject headings in the electronic versions of CA. However, with keyword searching, knowing all the details of the subject rules is much less necessary. Substance qualifiers have been replaced by substance roles. These roles are extremely useful in online searching or refinement, especially on STN and in SciFinder-n.

CA Index Guide

  • The Index Guide was the key printed tool for identifying the correct subject heading for any topic in Chemical Abstracts.
  • Each Index Guide listed the approved headings in use for its period of coverage.
  • An IG was published at the beginning of each Collective Index period, with updates every 18 months until the final one came with the Collective Index itself.
  • Contents of the Index Guide
    • An alphabetical listing of the approved subject headings, with cross-references to related headings and descriptive notes.
    • Many common terms not used as headings are listed, with See references to the correct heading.
    • Many common and/or trade names for chemical substances are listed, giving the correct CA systematic name (and Registry Number!)
    • There are also appendices on the organization and use of the subject indexes; how CA indexers select headings; CA chemical nomenclature; and a hierarchical list of the headings.
    • Like the rest of print CA, the Index Guide is no longer being published. Its content has been incorporated into the CA Lexicon on STN, and some of the Lexicon information has been incorporated in the background to enhance keyword searching in SciFinder.  Some of the appendices are available as PDF files on the CAS website (see example above.)

The Rule of Specificity

  • Usually, CA indexers will assign the most specific subject heading that applies to the document.
  • For example, if a document deals with the synthesis of a specific ester, the indexer will assign that substance to the index, not the general term "Esters".
  • In most indexes, cancer of the lungs will appear as Lung, neoplasm not Lung, disease.
  • From 1997-98, the general term is Lung tumors, with more specific terms for specific types, e.g. Lung adenocarcinomas.
  • Relevance to online searching: This is still relevant in searching in SciFinder. It's not as important in STN, where the CA Lexicon allows you to easily locate and search broader and narrower subject terms based on your starting terms.

Substance Indexing: The Challenge of Nomenclature

  • In order to ensure that each substance has a unique possible name, and to group "like" compounds together, CA has devised their own system of nomenclature (not necessarily IUPAC) and scheme for arranging them in the Chemical Substance Index.
  • Unfortunately, this system can be hideously complex. Here's a hideous example:
    • Dodecahedrane (C20H20) used to be listed as simply dodecahedrane.
    • Then a systematic name was assigned:
      5,2,1,6,3,4-[2,3]Butanylidenedipentaleno [2,1,6-cde:2',1',6'-gha]pentalene, hexadecahydro-
    • Now it's treated as a member of the fullerene family:
      [5]Fullerane-C20-Ih
  • It is important to remember that the CAS nomenclature has changed over time, as in the case of dodecahedrane above. The most important change took place in 1972; nomenclature has been fairly stable since then. But if you are using the older literature, you may have to do some checking to be sure of the correct terminology.

Basic Rules of CAS Nomenclature

  • CAS indexers select the "main" part of the compound to act as the heading parent.
  • Substituents to the parent are listed after it. This is referred to as inverted order.
  • What constitutes a parent compound and how it would be named are not always obvious, even to a chemist.
  • Examples
    • Toluene is
      Benzene, methyl-
    • ortho-Xylene is
      Benzene, 1,2-dimethyl-
    • Benzyl alcohol is
      Benzenemethanol
  • When there are multiple substituents, they are listed in alphabetical order, including the prefixes.
    • Carbon tetrachloride is
      Methane, tetrachloro-
    • CCl2F2 is
      Methane, dichlorodifluoro-
    • CCl3F is
      Methane, fluorotrichloro-
  • Polymers are listed by the monomer(s) or repeating unit, with polymer or homopolymer appended.
    • Teflon is
      Ethene, tetrafluoro-, homopolymer

Alphabetization of Compounds

  • Compounds are listed first by parent compound, with the parent compound itself first (with any qualifiers and categories), then by substituted forms in alphabetical order.
  • Substituents are read from left to right, ignoring numbers and punctuation.
  • Example: Benzene
    • Benzene
    • Benzene, analysis
    • Benzene, uses and miscellaneous
    • Benzene, compounds
    • Benzene, polymers
    • Benzene, azido-
    • Benzene, chloro-
    • Benzene, 1,2-dibutyl-
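
The "ignore numbers and punctuation" rule for filing substituted forms can be sketched as a sort key in Python. The function name is hypothetical, and the sketch handles only the substituted entries; the ordering of the parent's qualifier and category entries (analysis, compounds, polymers and so on) is not modeled.

```python
import re


def substituent_key(entry):
    """Sort key for an inverted-order name such as 'Benzene, 1,2-dibutyl-'."""
    parent, _, substituents = entry.partition(", ")
    # Keep letters only: locants, commas, hyphens and primes are ignored.
    letters = re.sub(r"[^A-Za-z]", "", substituents).lower()
    return (parent.lower(), letters)


entries = ["Benzene, 1,2-dibutyl-", "Benzene, chloro-", "Benzene, azido-"]
ordered = sorted(entries, key=substituent_key)
for name in ordered:
    print(name)
```

Note that the multiplying prefix is kept in the key, so 1,2-dibutyl- files under "d", after chloro-, as in the Benzene list above.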

Special Cases: Salts

  • Salts of organic acids or inorganic oxyacids are named as derivatives of the parent acid.
  • Potassium chloride is
    Potassium chloride
  • But: Potassium sulfate is
    Sulfuric acid, potassium salt (2:1)

Helps for finding CAS Chemical Names

  • In general, it can be very tricky to look at the structure of a complex compound and decide what the CA name will be.
  • However, in many cases, you can use a variety of resources to help find the CA name.
  • Remember that some data collections give the CAS name for compounds: Merck Index, CRC Handbook of Chemistry and Physics, among others.
  • Relevance to online searching: In the electronic environment, where you can search by chemical name synonyms and by structure and substructure, finding the CAS name before you start your search is not as crucial. Moreover, in many cases it is as easy to find the CAS Registry Number and use that in your searches.
    • Remember that there are many sources you can use to find Registry Numbers which have good synonym indexes: Merck Index, Combined Chemical Dictionaries (or the print equivalents), the Aldrich catalog, etc.
    • These days, even Wikipedia is a fairly reliable source for finding CAS Registry numbers of common substances.
    • On the other hand, you should also remember that different sources may give different Registry Numbers for what appears to be the same substance: examples: parent compounds with salts, stereoisomers, polymers.

Molecular Formula Index

  • While most molecular formulas have a large number of possible compounds, it is far easier to look at a possible name and decide whether it matches your compound than to guess at a name.
  • Note that the Molecular Formula Index just gives a list of abstract numbers, not a breakdown by subheadings.

Molecular Formula Index Organization

  • Molecular formulas are listed in Hill order:
    1. If carbon is present, it comes first, followed by hydrogen, then all other elements in alphabetical order.
    2. If not, then all (including H) in alphabetical order.
  • Note that the rules for salts apply to molecular formulas, too.
  • Molecular Formula Examples
    • Benzene is C6H6
    • Teflon is (C2F4)x
    • Ferrocene is C10H10Fe
    • Hydrochloric acid is ClH
    • Benzoic acid is C7H6O2
    • Sodium benzoate is C7H6O2, sodium salt...NOT C7H6NaO2
  • Relevance to online searching: CAS still does molecular formulas in Hill order (as do many reference works.) The rules for special compounds like polymers and salts also still apply. Even in the online environment, searching a molecular formula in the wrong way can lead you astray. Try searching Na2O4S and compare the results to H2O4S.2Na!
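
The two Hill order rules above are easy to express in code. The following Python function is a minimal sketch with a hypothetical name; it takes element counts as a dictionary and does not attempt the special handling of salts or polymers, so a salt would first have to be reduced to its parent acid by hand.

```python
def hill_formula(counts):
    """Build a Hill-order molecular formula from {element symbol: count}."""
    symbols = sorted(counts)                  # plain alphabetical order
    if "C" in counts:
        # Carbon first, then hydrogen, then everything else alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        ordered = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # No carbon: all elements, hydrogen included, alphabetically.
        ordered = symbols
    return "".join(s if counts[s] == 1 else f"{s}{counts[s]}" for s in ordered)


print(hill_formula({"C": 6, "H": 6}))             # benzene
print(hill_formula({"Fe": 1, "C": 10, "H": 10}))  # ferrocene
print(hill_formula({"H": 1, "Cl": 1}))            # hydrochloric acid
print(hill_formula({"O": 2, "H": 6, "C": 7}))     # benzoic acid
print(hill_formula({"H": 2, "S": 1, "O": 4}))     # sulfuric acid
```

For sodium benzoate, the formula to search is the one for benzoic acid (with "sodium salt" as a modifier), not the output of this function applied to the atoms of the salt itself.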

© 2022 Charles F. Huber

Creative Commons License
This work by Charles F. Huber is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Based on a work at guides.library.ucsb.edu


Copyright © 2008-2019 The Regents of the University of California, All Rights Reserved.