
CHEM 184/284 (Chemical Literature) - Huber - Winter 2024: Lecture 4

A two-credit course in the techniques and tools for effectively searching the literature of chemistry, biochemistry, chemical engineering and related fields.

Introduction to Data Collections

Data Collections

Just as libraries can be organized by subject classification (grouping books on the shelf by subject area) or by indexing (creating records describing the book - author, title, subjects, etc. - and indexing the various types of information), so too, the primary scientific literature can be organized in two broad ways: indexing, or selecting data from the published literature and bringing them together in collections.  This lecture will look at the various forms of data collection.

  • These are a form of secondary literature in which an editor selects information from primary sources and arranges it to facilitate a particular type of access.
  • Often, the data are reviewed and evaluated by the editors before inclusion, adding further value.
  • The right data collection can be more useful than searching primary sources, depending on the objective of your search. If you're looking for a specific piece of data, or an introduction/overview of an area you are not yet expert in, data collections can save you a lot of time.

Types of data collections in chemistry

  • Dictionaries
    This includes both classical lists of definitions of terms, and "chemical dictionaries" which have alphabetical lists of compounds, with various kinds of data.
  • Encyclopedias
    Encyclopedias have substantial articles on relevant topics, usually in alphabetical order, usually with a significant bibliography of the source literature.
  • Physical data collections (including spectra collections, crystallography data)
    Physical data collections can take many different forms, depending on the objective of the editor. Some are ordered by compound name or formula, others by the value of the property in question.
  • Reaction and synthesis guides
    These may collect preparations of individual compounds, applications of individual reagents, or general methods, grouped by type of reaction, type of starting material or type of product.
  • Analytical methods guides
    These may deal with specific or general techniques, grouped by analyte, matrix, or method.
  • Health, safety, toxicity guides
  • Comprehensive works
    These are usually ongoing series, attempting to summarize all of a given area of chemistry. Good examples from the past include the Beilstein Handbook of Organic Chemistry and the Gmelin Handbook of Inorganic and Organometallic Chemistry.


Approaching a data collection for the first time: What should you know?

  • What type of data is it attempting to collect?
  • How comprehensive is it?  Is it attempting to collect all instances of, say, the melting point of naphthalene, or just a "best value"?  What years of the literature is it collecting from?  Is it updated periodically?
  • If it's a printed work (or a simple digitization of a print work), how is the data tabulated?  For example, if you have a table of melting points, are they arranged alphabetically by substance name or numerically by melting point?  Different arrangements answer different types of questions!
  • How are tables grouped within the work?
  • Are there indexes?  If so, which are available?  For substance data: substance name (and does it include synonyms or not)? substance class? CAS Registry Number?  For reactions: reactants? products? name reactions? reaction classes?
  • If it's an electronic database, what fields are indexed?  What are the display options?  Can you re-sort data? Can you export data to a spreadsheet?
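
The point about table arrangement can be made concrete with a small sketch. The records, field names and helper functions below are hypothetical, and the melting points are rounded, approximate values chosen only for illustration. The same data sorted two ways answers two different questions:

```python
# Illustrative records only: melting points are rounded, approximate values.
records = [
    {"name": "naphthalene", "mp_c": 80},
    {"name": "acetanilide", "mp_c": 114},
    {"name": "benzoic acid", "mp_c": 122},
    {"name": "phenol", "mp_c": 41},
]

def by_name(rows):
    """Alphabetical arrangement: suits 'what is the melting point of X?'"""
    return sorted(rows, key=lambda r: r["name"])

def by_melting_point(rows):
    """Numerical arrangement: suits 'which substances melt near T?'"""
    return sorted(rows, key=lambda r: r["mp_c"])

def melting_near(rows, target_c, window_c):
    """Return names whose melting point lies within window_c of target_c."""
    return [r["name"] for r in by_melting_point(rows)
            if abs(r["mp_c"] - target_c) <= window_c]

# An unknown solid melting at about 118 C: which candidates are plausible?
candidates = melting_near(records, 118, 5)
```

In a printed table arranged by name, the second question would require reading every row, which is why the arrangement (or the availability of a property index) matters.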

© 2022 Charles F. Huber

Creative Commons License
This work by Charles F. Huber is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Based on a work at


Examples of Data Collections

Examples of Widely Used Data Collections in Chemistry

For greatest ease in visiting online subscription resources, log into the VPN or proxy server before returning to this page.

ChemSpider

  • Created by Antony Williams, and now a product of the Royal Society of Chemistry, ChemSpider is a free website collecting a wide range of information on over 39 million (as of Jan. 2016) chemical substances, drawn from hundreds of data sources.
  • ChemSpider is searchable by names, formulas, properties and structures.  It is divided into three databases: Compounds (for molecular substances), Substances (includes salts, mixtures, formulations) and BioAssays (substance data searchable by keywords from bioassays of the substances).
  • ChemSpider Synthetic Pages is a related site for reaction data.

PubChem

  • PubChem is a free (taxpayer-supported) database created by the National Library of Medicine, containing name, structure, basic physical properties, safety, toxicity, drug and pharmacological information for millions of compounds.
  • It is composed of three separate, but linked, databases: Compound (covering single chemical compounds), Substance (including salts, mixtures and preparations) and BioAssay (for finding substances by terms describing their bioassays).
  • PubChem now includes chemical toxicology information formerly contained in the Hazardous Substances Data Bank (HSDB) database.
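
Beyond the Web interface, PubChem records can also be retrieved programmatically. The following is a minimal sketch, assuming PubChem's PUG REST service and its name-to-property lookup; the helper names `property_url` and `fetch_properties` are hypothetical, and the fetch step requires network access.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL requesting properties for a compound name."""
    return (f"{PUG_REST}/compound/name/{quote(name, safe='')}"
            f"/property/{','.join(properties)}/JSON")

def fetch_properties(name):
    """Fetch and decode the JSON response (requires network access)."""
    with urlopen(property_url(name), timeout=30) as response:
        return json.load(response)

url = property_url("benzoic acid")
```

Calling `fetch_properties("naphthalene")` should return a JSON document describing the matching Compound record(s). For anything beyond occasional lookups, check PubChem's usage policies first.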

CRC Handbook of Chemistry and Physics (print: QD 65 .H3)

  • Familiar source; published annually (currently in the 101st edition), but usually changes little from one year to the next.
  • Variety of useful physical and chemical data, with some references. Tables are grouped in broad subject sections. Arrangement within tables varies.
  • Most frequently used for tables of organic compounds and inorganic compounds, which contain data on melting points, boiling points, density and solubility among others.
  • Note that both tables have synonym indexes following the table.
  • Not very systematic in choice of data, and indexing can be inconsistent.
  • CRC is now publishing a Web version of the Handbook as part of ChemNetBase.
  • The electronic version may be browsed by table of contents, or searched by text term. It may also be searched by physical property values (requires a browser capable of handling Java 2 applets) and allows you to sort tabular information in a variety of ways. To use, select "Substance/Property Search", then pick the property or properties you wish to search by, then pick the appropriate operator (=, >, <, etc.), then enter the values you wish to search.
  • The electronic version may be searched freely by anyone, but display of data requires a subscription. The UC system has a current subscription to this version.
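
The property search described above (pick a property, pick an operator, enter a value, then sort the results) follows a pattern common to most numeric databases. Here is a minimal sketch of that pattern; it is not the ChemNetBase implementation, the function and field names are hypothetical, and the boiling and melting points are rounded, approximate values for illustration only.

```python
import operator

# Operators mirroring the kind offered in a property search form.
OPERATORS = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
             ">=": operator.ge, "<=": operator.le}

# Illustrative rows only: values are rounded and approximate.
rows = [
    {"name": "water", "bp_c": 100, "mp_c": 0},
    {"name": "ethanol", "bp_c": 78, "mp_c": -114},
    {"name": "benzene", "bp_c": 80, "mp_c": 5.5},
    {"name": "acetone", "bp_c": 56, "mp_c": -95},
]

def property_search(table, criteria):
    """Keep rows that satisfy every (property, operator, value) criterion."""
    return [row for row in table
            if all(OPERATORS[op](row[prop], value)
                   for prop, op, value in criteria)]

def sort_by(table, prop, descending=False):
    """Re-sort tabular results by any property column."""
    return sorted(table, key=lambda r: r[prop], reverse=descending)

# Boiling point above 70 C AND melting point below 0 C.
hits = property_search(rows, [("bp_c", ">", 70), ("mp_c", "<", 0)])
```

Note that criteria are combined with AND here; a real interface may also offer OR and ranges.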

Merck Index (print: RS 356 .M4)

  • Published by Merck Pharmaceuticals, with data primarily on organics, strongest on drugs (surprise!).
  • Includes physical data, preparation references, toxicity and uses.
  • Arranged alphabetically by chemical name; well-indexed; updated irregularly.
  • It also contains a small number of other tables, and a section describing over 400 "name reactions" in chemistry, with reaction schemes and references.
  • The Merck Index is now available in a Web version from the Royal Society of Chemistry. The UC system now has a site license for the Web version. Compound information may be searched by chemical name, CAS Registry Number, selected property values and chemical structure. Name reactions may be browsed alphabetically or searched by keyword.
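
When searching by CAS Registry Number, a mistyped number usually returns nothing at all. Registry Numbers carry a built-in check digit, so a typo can often be caught before searching. A minimal sketch follows; the function name is hypothetical.

```python
import re

def is_valid_cas_rn(cas_rn):
    """Check the format and check digit of a CAS Registry Number."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the body digits 1, 2, 3, ... starting from the rightmost digit.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit

# Naphthalene is 91-20-3: (0*1 + 2*2 + 1*3 + 9*4) = 43, and 43 mod 10 = 3.
ok = is_valid_cas_rn("91-20-3")
```

A passing check only shows that the number is well formed, not that it is assigned to the substance you have in mind.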

Millipore-Sigma (aka Sigma-Aldrich) Catalog (print: TP 202 .A48)

  • The primary purpose of the catalog is to facilitate ordering chemicals and other materials from Millipore Sigma. However, it also provides basic physical property data, Safety Data Sheets (SDS) and, in some cases, spectral data on the chemicals they sell.
  • Note that in both print and online versions, a single compound may appear in a number of different product records, usually representing various grades of purity. Note also, physical property data is usually only listed for the highest grade version of a given compound.
  • See also other chemical companies' catalogs, both in print and on the Web. Check Chemistry & Biochemistry: Chemical Suppliers.

Kirk-Othmer Encyclopedia of Chemical Technology

  • Commonly referred to as "Kirk-Othmer" after its early editors.
  • Wide-ranging, authoritative encyclopedia of chemical and process information
  • Very strong on industrially important chemicals.
  • Good subject indexing, cross-references and bibliographies.
  • Other important encyclopedias of chemical engineering include:

Ullmann's Encyclopedia of Industrial Chemistry. UC has a subscription to it as well.

Polymer Science: A Comprehensive Reference

  • A fairly recent ten-volume reference work on a wide range of current topics in polymer science, including volumes on nanostructured polymers and polymers for sustainability and green energy.
  • UCSB has this only in online form.

Encyclopedia of Polymer Science and Engineering

  • Sister publication to Encyclopedia of Chemical Technology above.
  • Covers polymer science in great detail, with thorough indexing, good cross-references and excellent bibliographies.
  • This new 12 volume 3rd edition is organized interestingly: rather than alphabetically listing articles from Volume 1 to Volume 12, the volumes are grouped in three sets of four. For best results, consult the index in Volume 12 to find relevant articles.
  • UCSB has access to the Web version of the 4th edition of this encyclopedia. Note that recent articles may not be fully accessible - we would need to purchase an update to our subscription.

Polymeric Materials Encyclopedia (Ref TP 1110 .P65 1996)

  • Twelve-volume work on polymeric materials; covers both natural and synthetic polymers, both specific compounds and classes of compounds; preparations, reactions and properties; processes and applications.
  • Well referenced and indexed.

Encyclopedia of Materials: Science and Technology (print: TA402 .E53 2001)

Encyclopedia of Materials Characterization

  • This is a comprehensive volume on analytical techniques used in materials science for the characterization of surfaces, interfaces and thin films.
  • UCSB has online access to this work via at

Encyclopedia of Chemical Physics and Physical Chemistry

  • This encyclopedia, published by the Institute of Physics (the primary society for physicists in the UK) in 2001, comprises three volumes on fundamentals, methods and applications in chemical physics and physical chemistry. UCSB has this work only in electronic form, via Knovel.

"Comprehensive Chemistry" Series

(various call numbers, see below)

Encyclopedia of Inorganic and Bioinorganic Chemistry

  • Originally published in print as the Encyclopedia of Inorganic Chemistry, it was updated and expanded for the online version.
  • Covers inorganic, bioinorganic, organometallic and coordination chemistry
  • Alphabetical organization, with thematic list in the foreword, subject index and list of contributors.

Combined Chemical Dictionary

  • CRC Press (a division of Taylor & Francis) publishes a variety of "dictionaries" of compounds (formerly published by Chapman-Hall).
  • They give structure diagrams, basic physical data (on both the compound and significant derivatives), and references for other information.
  • Alphabetical arrangement; well-indexed, including CAS Registry #'s.
  • Current sets include:
    • Dictionary of Organic Compounds, 6th ed.  QD 251 .D5 1996
    • Dictionary of Natural Products  QD 415 .A25 D53 1994
    • Dictionary of Inorganic Compounds  QD 148 .D53 1992
      Note: Unlike the other titles, this one is arranged by molecular formula, with the elements arranged alphabetically.
    • Dictionary of Organometallic Compounds, 2nd ed.  QD 411 .D53 1995
    • Dictionary of Organophosphorus Compounds  QD 412 .P1 E36 1988
    • Dictionary of Antibiotics  RS 431 .A6 D53 1988
    • Dictionary of Drugs  RS 51 .D479 1990
    • Dictionary of Analytical Reagents  QD 77 .D498 1993
  • CRC has a Web version of the combined chemical dictionaries as part of ChemNetBase. From the CHEMnetBASE site, several of the dictionaries above may also be searched individually.
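
To look up a compound in a formula-ordered work such as the Dictionary of Inorganic Compounds, you first need to rewrite the formula in the work's own order. The sketch below illustrates the general idea of alphabetizing element symbols; it is an assumption about the approach, not the dictionary's documented rule, and the function name is hypothetical. It handles only simple formulas without parentheses, charges or hydrate notation.

```python
import re

def alphabetical_formula(formula):
    """Rewrite a simple formula with element symbols in alphabetical order."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    # Sort by element symbol and omit a count of 1, as in written formulas.
    return "".join(symbol + (str(n) if n > 1 else "")
                   for symbol, n in sorted(counts.items()))

sulfuric_acid = alphabetical_formula("H2SO4")
```

Here sulfuric acid, usually written H2SO4, would be sought under H2O4S, and sodium chloride under ClNa. Always check the work's introduction for its actual ordering conventions.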

Science of Synthesis

  • A very comprehensive series on organic methods, with periodic supplements. Originally published in German as Houben-Weyl's Methoden der Organischen Chemie. In recent decades, it switched to English. The title changed to Science of Synthesis. The online version was launched in 2002.
  • Organized by chemical classes.
  • Now publishing specialized sets, e.g. on stereoselective reactions, as well as updating the classic volumes.
  • Stereoselective Synthesis  QD 258 .M4 1952 v.E21 parts 1-10
     This is the electronic version of a handbook of organic synthetic methods, in two parts. Science of Synthesis contains 48 volumes, covering the fields of Organometallics; Hetarenes and Related Ring Systems; Compounds with Four Carbon-Heteroatom Bonds (e.g. Carbonic Acids, Imidic Acids); Compounds with Three Carbon-Heteroatom Bonds (e.g. Nitriles, Isocyanides and Derivatives, Amides and Derivatives, Peptides, Lactams, Thio-, Seleno- and Tellurocarboxylic Acids and Derivatives); Compounds with Two Carbon-Heteroatom Bonds (e.g. Ketones, and Heteroatom Analogues of Aldehydes and Ketones); Compounds with One Saturated Carbon-Heteroatom Bond (e.g. halogens); and Compounds with All-Carbon Functions. It is browsable by the table of contents, and may be searched by chemical name or chemical structure.
  • The Houben-Weyl Archive (1909 to 2004) provides immediate access to 146 000 product specific experimental procedures, 580 000 structures, and 700 000 references in all fields of synthetic organic chemistry - dating back to the early 1800s. It may be browsed by table of contents, or searched for name reactions. Most of the earlier volumes are in German. UCSB has a subscription to this resource.

Organic Reactions

  • Annual publication with review articles on important synthetic methods.
  • Articles are published in no particular order, but the series is well-indexed, with cumulative author and chapter/topic indexes in each volume for all the preceding volumes.
  • UCSB has access to an electronic version on the Web. It currently includes all volumes of the printed work. The articles are not listed by volume, but may be browsed by article title or reaction type, and searched by keyword or structure. Note: Some of the most recently added articles may be inaccessible.

Organic Syntheses

  • Annual publication with tested syntheses of organic and organometallic compounds.
  • Gives detailed descriptions of synthetic techniques, reagents, yields and safety aspects.
  • Well-indexed (authors, compound names, reaction types, molecular formulas)
  • Collective volumes include revised and updated syntheses from annual volumes. There is a cumulative index for the first eight collective volumes.
  • The publishers, in collaboration with Wiley and PerkinElmer, have released a FREE Web version. With a free chemical drawing plug-in available at the Web site, the online version is substructure searchable.
  • Wiley has also released a somewhat more up-to-date subscription version. Note that articles in this Wiley reference work (and many others) are available on a pay-per-view basis to individual users.

Inorganic Syntheses (Ref QD 151 .I5)

  • A less-than-annual publication, similar in format to Organic Syntheses
  • Covers inorganic and organometallic compounds (including boranes, synthetic metals, superconductors)
  • No collective volumes, but the indexes cumulate every five volumes. Wiley has created an online version of Inorganic Syntheses. Chapters are available as PDF files, and may be searched by chapter title, but they are not searchable as a true database like Organic Syntheses (yet). UCSB does not have an online subscription.

Fieser and Fieser's Reagents for Organic Synthesis (Ref QD 262 .F5)

  • Classic series reporting on new reagents and new uses for old reagents.
  • Published less-than-annually.
  • Alphabetical list of reagents, with author and subject index.
  • Cumulative index for Vols. 1-12.
  • Wiley is creating an electronic version of this series. Articles may be browsed by title, or searched by full text keyword. UCSB does not have a subscription to this as yet. It is rumored that Wiley may incorporate "Fieser" in with other synthesis-related series which they publish, but no such product has been released yet.

e-EROS (Encyclopedia of Reagents for Organic Synthesis)

  • Originally published in print as a multi-volume set, listing compounds in alphabetical order
  • Gives physical data and brief, but detailed description of uses
  • Excellent references and indexing (compound name, formula, type of reaction)
  • The electronic version of EROS may be searched using the Wiley search tool, or browsed alphabetically, or by most recently added/updated, or by most cited articles. The online version is the current edition, and is updated periodically, though some articles still date to the first edition.

International Tables for Crystallography

  • The ICT is published by the International Union of Crystallography, in association with Wiley, and is "the definitive reference work for crystallography and structural science."
  • The online version of International Tables for Crystallography provides access to a fully interactive symmetry database and all nine volumes in the series in pdf and richly linked html format. The following content is available online:
  • It can be searched with the Wiley search tool, or browsed by Table of Contents, or by most recently added/updated articles or by most cited articles.

Cambridge Structural Database (WebCSD)

  • Produced by the Cambridge Crystallographic Data Centre, the Cambridge Structural Database is the world's repository of experimentally determined organic and metal-organic crystal structures, with over 500,000 structures. 
  • The WebCSD interface allows for text and numerical searching, substructure searching, similarity searching, and reduced cell searching of the database.  It can display structures in a variety of visualization formats.
  • WebCSD is paid for by the UCSB Department of Chemistry and Biochemistry; students in that department may contact departmental IT support for information on downloading additional CSD software.

Spectra Collections

  • UCSB Library has a variety of collections of spectra, some one volume, some multivolume, including IR, NMR, UV, powder diffraction, etc.
  • Most are located at either QC 435-765 or QD 95-96.
  • Some have general coverage, some deal with specific classes of compounds.
  • In ascending order of size and complexity, the main SEL Ref Area spectra collections are:
  • Note that a wide range of experimental and predicted spectra are available through the substance records in SciFinder/SciFinder-n. These will be discussed in later lectures.

Multi-type Spectra Collections

Integrated Spectral Data Base System (SDBS)

This site, from the National Institute of Materials and Chemical Research in Japan, contains full spectra and, in many cases, peak assignments for about 33,000 compounds, including about 24,000 electron-impact mass spectra, 13,000 13C NMR, 14,700 proton NMR, 51,100 Fourier transform (FT) IR, 3,500 Raman and 2,500 ESR spectra. Peak assignments are provided, where possible, for the NMR spectra. The database is searchable by compound name, CAS Registry Number, molecular formula and NMR or IR peaks. The database is free to the public, but users are asked to download no more than 50 spectra per day without specific permission of the site owners.
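
Searching a spectral database by peaks generally means asking which reference spectra contain every peak you observed, within some tolerance. The sketch below illustrates that idea only; it is not SDBS's actual matching algorithm, the function names are hypothetical, and the library entries and peak positions are made-up placeholder values.

```python
# Made-up IR peak lists (cm^-1) for placeholder entries; not real spectra.
library = {
    "compound A": [1220, 1715, 2960],
    "compound B": [1050, 2950, 3350],
    "compound C": [1600, 1710, 3300],
}

def matches_peaks(spectrum_peaks, query_peaks, tolerance):
    """True if every query peak has a spectrum peak within the tolerance."""
    return all(any(abs(peak - query) <= tolerance for peak in spectrum_peaks)
               for query in query_peaks)

def peak_search(spectra, query_peaks, tolerance=10.0):
    """Return names of library entries that match all query peaks."""
    return [name for name, peaks in spectra.items()
            if matches_peaks(peaks, query_peaks, tolerance)]

# Observed peaks at 1712 and 2955 cm^-1, allowing +/- 10 cm^-1.
hits = peak_search(library, [1712, 2955])
```

Widening the tolerance returns more candidates and narrowing it returns fewer, so it is worth running a peak search at more than one setting.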

NIST Chemistry Webbook

Among other data, NIST Chemistry Webbook has IR spectra for over 16,000 compounds, mass spectra for over 15,000 compounds, UV/visible spectra for over 1,600 compounds and electronic and vibrational spectra for over 5000 compounds, which may be searched in a variety of ways, displayed and printed. Note that the variety of data available here is growing; well worth checking for a wide variety of data. The Webbook may also be searched by keyword, property or chemical name along with a large number of NIST databases at the NIST Data Gateway.


  • Originally published in print as Encyclopedia of Nuclear Magnetic Resonance, the online version is greatly updated and expanded.
  • Not a spectra collection; gives articles on techniques, applications, types of substances on which NMR has been done.
  • Vol. 1 is all on "historical perspectives" on NMR.
  • Excellent references and indexing.

Encyclopedia of Computational Chemistry

  • Somewhat dated electronic version of a five volume work published in 2005, covering all areas of computers in chemistry: structure-activity relationships, molecular modeling, electronic structure, you name it. Unlike some of the Wiley reference works, this encyclopedia is not updated.
  • Good indexing; lots of references.

Encyclopedia of Catalysis

  • Fairly recent (2011) six-volume work with alphabetically-arranged articles on the most significant aspects of homogeneous, heterogeneous, asymmetric, biomimetic, and biological catalysis. This encyclopedia is not updated in the online version.

Encyclopedia of Analytical Chemistry

  • Originally published as a 15-volume set, it devotes its first ten volumes to areas of chemical analysis (e.g., Chemical Weapons Chemical Analysis; Environment: Water and Waste; Peptides and Proteins; Surfaces) and the last five volumes to methods (e.g., Atomic Spectroscopy; Liquid Chromatography; Radiochemical Methods).
  • The online version is based on the 2nd edition, with articles updated regularly. However, we may not have full text access to the most recently added/updated articles.
  • The articles are detailed, by experts in their fields, with good references and cross-referencing.
  • You may search the encyclopedia using the Wiley search tool, or browse by Table of Contents, most recently added/updated, or most cited articles.

Encyclopedia of Analytical Science, 2nd ed.

  • Excellent relatively recent reference for analytical chemistry. Its articles cover:
    • Techniques, like "atomic absorption spectroscopy"
    • Analytes, like "antimony", "asbestos", "carbohydrates"
    • Matrices, like "blood", "ceramics"
    • Classes of analysis, like "bioprocess analysis", "forensic science"
  • Well-indexed and cross-referenced.
  • Note that this edition is not updated online. You may search within the encyclopedia, or browse by title, subject or author.

Current Protocols   https://currentprotocols.onlinelibrary.wiley/

  • The Current Protocols series are laboratory manuals considered a benchmark for scientific research methods. Wiley now treats them as a cluster of journals (see below). With their regular updates, these publications constantly evolve and change to meet the needs of the scientific research community. They include:
    • Step-by-step protocols with annotations that alert you to special considerations, tips, and optional procedures.
    • Alternate and support protocols to accommodate different equipment and desired results.
    • Materials lists for each protocol to ensure you have everything you need before you start work.
    • Detailed recipes for reagents, solutions, and culture media.
    • Expert commentaries filled with scientific insight, including general background, troubleshooting instructions, and planning considerations.
    • Tables and figures to clarify complex procedures.
    • Appendices filled with useful reference material.
  • The University of California has a subscription to most of the Protocols series in online form, including:
    • Current Protocols in Bioinformatics
    • Current Protocols in Cell Biology
    • Current Protocols in Cytometry
    • Current Protocols in Human Genetics
    • Current Protocols in Immunology
    • Current Protocols in Molecular Biology
    • Current Protocols in Neuroscience
    • Current Protocols in Nucleic Acid Chemistry
    • Current Protocols in Pharmacology
    • Current Protocols in Protein Science
    • Current Protocols in Toxicology
  • The protocols are browsable by table of contents, or keyword searchable with stemming (truncation) and a subject thesaurus. Search results may be refined by date, journal source, or author.

Encyclopedia of Biological Chemistry

  • Originally published by Elsevier in 2004, we have online access to this encyclopedia through the Gale e-books collection.
  • Articles are short, but well illustrated and well referenced
  • Note: While there is a table of contents and subject index, they DO NOT link to the corresponding articles. The best way to use this work is to go to the Advanced Search screen, and search your keyword, along with Encyclopedia of Biological Chemistry as the Publication Title.

Wiley Encyclopedia of Chemical Biology

  • Originally published in 2007, it has had limited updating online since then.
  • Articles cover both theory and applications of chemical biology, and are well referenced.
  • You may search the encyclopedia with the Wiley search tool, or browse by topic, most recently updated, most cited, or alphabetically by article title.

Kyoto Encyclopedia of Genes and Genomes (KEGG)

  • The primary objective of KEGG is to computerize the current knowledge of molecular interactions; namely, metabolic pathways, regulatory pathways, and molecular assemblies. At the same time, KEGG maintains gene catalogs for all the organisms that have been sequenced and links each gene product to a component on the pathway. KEGG also organizes a database of all chemical compounds in living cells and links each compound to a pathway component. And finally, KEGG aims at developing new bioinformatics technologies toward functional reconstruction. In addition, the KEGG site has a good collection of links to other biochemical Web resources.

Protein Data Bank

  • The RCSB PDB provides a variety of tools and resources for studying the structures of biological macromolecules and their relationships to sequence, function, and disease. The RCSB is a member of the wwPDB whose mission is to ensure that the PDB archive remains an international resource with uniform data. This site offers tools for browsing, searching, and reporting that utilize the data resulting from ongoing efforts to create a more consistent and comprehensive archive.

Methods in Enzymology Only

  • The name of this immense (645 volumes as of Nov. 2020, and growing) series underestimates the breadth and depth of its coverage. While it contains excellent review articles on methods of all kinds, especially in enzymology, its articles frequently explore the functions of the substrates the enzymes act on.
  • Volumes are not arranged by subject, but are published in semi-random order. Many volumes are individually cataloged in UCSB Library Search, and so can be located by a subject search.
  • You can browse the volumes' Tables of Contents, or search by keyword within the entire series or any individual volume.

BRENDA: The Comprehensive Enzyme Information System

  • BRENDA is the main collection of enzyme functional data available to the scientific community.
  • It is available free of charge for academic, non-profit users via the internet.
  • As available, each enzyme has data on nomenclature, reactions and specificity, structure, isolation and preparation, stability and cross-references to sequence databanks. Although BRENDA gives a representative overview on the characteristics and variability of each enzyme the Handbook is not a compendium. The reader will have to go to the primary literature for more detailed information.

Sax's Dangerous Properties of Industrial Materials, 12th ed.

  • Published by Wiley, our most recent edition is available online from Knovel. 
  • Typical entries for a substance include identification information, basic physical properties, flammability, solubility and toxicity information, and safety profile.

CRC Handbook of Laboratory Safety 5th ed.

  • Everything you always wanted to know about setting up and operating a laboratory safely.
  • This version of the handbook can only be downloaded as a complete PDF, not as individual chapters.

Other Electronic Data Collections

  • Many classic data collections are not available on the Web because their publishers are making good money off the print versions, and they haven't figured out how to best make money off of Web versions.
  • Consider searching UCSB Library Search or checking major collections of reference works (see the Indexes and Databases lists) as CRCnetBASE or
  • However, there are some good collections on the Web, mostly from government sources, academic sources, or commercial firms seeking to demonstrate the usefulness of their products.
  • In addition to sites mentioned above, see also UCSB Library's Chemistry & Biochemistry Help by Subject page for some examples.
  • For references in specific subdisciplines, see also the following specialized guides:
    Analytical Chemistry
    General Chemistry
    Inorganic Chemistry
    Organic Chemistry
    Physical Chemistry
    Chemical Engineering


© 2024 Charles F. Huber

Creative Commons License
This work by Charles F. Huber is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Based on a work at

