CHEM 116BL (Laverman, Spring 2022): SciFinder-n

Guide to using library resources (e.g. SciFinder, Reaxys) for literature research for CHEM 116BL.

SciFinder-n for CHEM 116BL

Registration

  • SciFinder requires advance registration by each user. Each institutional subscriber has its own registration page for SciFinder Web. UCSB students, faculty and staff will find the link to that page at http://proxy.library.ucsb.edu:2048/login?url=http://www.library.ucsb.edu/node/2017. Note that access to this page is restricted to UCSB IP addresses. If you are off-campus, and not already logged into the VPN or proxy server, you will be prompted to log into the proxy server with your UCSBnetID and password. See the Off Campus Access page at https://login.proxy.library.ucsb.edu:9443/login for details. If you log in via the proxy server, from the "Success" page, click on the link for Article Indexes and Databases, then go to the information page for SciFinder and click through to the registration page.
  • Note: descriptions and screen shots in these lectures are for the SciFinder Web version as seen on Firefox 8.0 on a Windows XP machine. The look on other operating systems and browsers may vary slightly.

 

SciFinder registration first screen

  • Above is the first screen you'll see after clicking on the registration link. Click the "Next>>" button to begin registration.
  • On the second screen, you must agree to the license agreement shown. (Note that the license agreement changes from time to time.)  A similar license agreement shows up each time you login to SciFinder as well.

SciFinder license agreement

  • On the screen below, fill in the information required. Notice that some information is optional. You must use a valid "ucsb.edu" e-mail address, such as "smith@umail.ucsb.edu" or "jones@chem.ucsb.edu". E-mail addresses from other providers (e.g. gmail, hotmail, yahoo) will not be accepted. Only one SciFinder username and password may be created per e-mail address, and CAS allows only one username/password per user even if you have multiple valid e-mail addresses.
  • If you have a username/password from another institution, you will have to re-register to be able to use SciFinder at UCSB.
  • After you have completed registration, you will receive an e-mail at the address you entered on the form verifying your registration. You must then follow the instructions in the e-mail to verify that you are the valid user of that e-mail address and complete the registration process.

SciFinder registration, part 1

SciFinder registration, part 2

Logging in to SciFinder-n

Note that this database requires registration (see above.) If you are logging in from off-campus, either use the campus VPN or the Library proxy server. If you use the links in the Library databases lists, our systems will automatically detect whether you are on-campus, already connected to the proxy or VPN, or off-campus and unconnected. If the latter, you will be automatically routed to a proxy server login screen. Once logged in, you will continue on to the SciFinder login page.

SciFinder-n log in screen

Enter your username or the e-mail address you used to register. A "Next" button will appear. Click it. On the next screen, enter your password, then click "Log in". Do NOT select "Keep me logged in" if you are using a public workstation. The system will remember your username/e-mail for future use unless someone else logs in on that workstation.

SciFinder Opening Screen

SciFinder-n opening screen (substances)

Note that the default opening screen is set for substance searching. If you have previously searched in this account, your recent search history will display below the search window.

Breakdown of the Opening Screen

In the upper left is the CAS SciFinder-n logo.

CAS SciFinder-n logo

On the left are three vertically-arranged dots. Clicking on them opens a drop-down menu:

CAS Product Selection Menu

This enables the user to switch freely among CAS products. Note that you or your institution must subscribe to the product in question to access it, and at the moment, UCSB only subscribes to SciFinder-n.

In the upper right are three options:

SciFinder-n opening screen upper right corner detail

  • Clicking on Saved and Alerts takes you to a list of saved searches and search alerts. From there you can manage those searches and alerts, as well as combine saved answer sets with AND, OR or NOT.
  • History takes you to your full search history on this account. From there, you can re-run any previous searches.
  • My Account opens a drop-down menu:
    • My CAS Profile allows you to manipulate details of your SciFinder account.
    • What's New? takes you to a list of the most recent updates to the SciFinder-n interface.
    • Help opens a new tab, with help related to the screen you had been looking at, as well as a table of contents of Help topics, and a search window for the help topics.
    • Log out lets you log out of your current session. (Note: Always be sure to log out when using public workstations!)
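Combining saved answer sets with AND, OR or NOT works like ordinary set arithmetic. The short sketch below only illustrates that idea with Python sets; the answer-set names and reference IDs are made up for the example and do not come from SciFinder-n.

```python
# Hypothetical saved answer sets: each is a set of made-up reference IDs.
oxidation = {"ref1", "ref2", "ref3", "ref4"}
reduction = {"ref3", "ref4", "ref5"}
combustion = {"ref4", "ref6"}

# AND keeps only answers found in both sets.
both = oxidation & reduction

# OR keeps answers found in either set.
either = oxidation | reduction

# NOT removes answers that also appear in the second set.
either_not_combustion = either - combustion

print(sorted(both))
print(sorted(either))
print(sorted(either_not_combustion))
```

AND narrows a result, OR broadens it, and NOT removes unwanted answers, which is the same behavior you should expect when combining saved answer sets.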

SciFinder-n Searching For options

To the left of the screen is the menu for selecting which type of search you wish to do. By default, the highlighted choice will be whichever type of search you did last.

SciFinder-n search window for substances

To the right of the Search Selection menu is the Search Window, which will vary by the type of search selected. For searches involving substances or reactions, the window will include both a search term window, and a Draw icon to open the structure drawing tool (discussed in detail in Lecture 14). All search windows will have the magnifying glass icon. Click on it to begin the search.

Below the search window is a drop-down menu for searching specific fields appropriate to the type of search you are conducting. The "Add Advanced Search field" link lets you add additional fielded searches to combine with your basic search.

Searching by Research Topic

SciFinder-n reference search screen

 

  • Searching for references by keyword in SciFinder-n is similar in many respects to searching in other scholarly databases like Web of Science. For instance:
    • You can combine terms with the Boolean operators AND, OR, NOT and parentheses. Example: (oxidation OR reduction) NOT combustion
    • You can combine terms as a specific phrase using quotation marks, for example "nuclear magnetic resonance"
    • You can use wildcard characters, asterisk * or question mark ? to generalize your searches.  For example:
      • photosynth* finds photosynthesis, photosyntheses, photosynthetic
      • alumin?um finds aluminum, aluminium
  • SciFinder-n is also capable of handling natural language phrases, such as oxidation of secondary alcohols to ketones. SciFinder-n will automatically do truncation and insert necessary connectors. Note that the results of a natural language search may not be identical to those of an equivalent Boolean search.
  • However, there are also things which SciFinder-n does uniquely, such as:
    • In addition to titles and abstracts, keyword searches search indexer-selected keywords, CA concept headings and MeSH subject headings.
    • It automatically searches for CAS abbreviations when it encounters a word which could be abbreviated, and vice versa.
    • It automatically searches for some synonyms of common chemical terms. For example, searching for NMR will also find nuclear magnetic resonance. 
    • Most importantly, when you search a chemical name as a keyword, it will detect that it is a chemical name, and also search the CAS Registry Number for that substance, greatly increasing the comprehensiveness of the search. Each document record has in-depth substance indexing. See examples below, and more about CAS Registry Numbers in Lecture 13.
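The wildcard behavior described above can be imitated with regular expressions. The sketch below only illustrates the matching rules given in this guide and is not how SciFinder-n itself is implemented. The function names are hypothetical, and it assumes that * stands for any number of characters and ? for zero or one character, which is what the aluminum/aluminium example implies.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a search term with * and ? wildcards into a regex.

    Assumption for this sketch: * = any number of word characters
    (including none), ? = zero or one word character.
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w?")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(term: str, word: str) -> bool:
    """Return True if the whole word matches the wildcard term."""
    return wildcard_to_regex(term).fullmatch(word) is not None


for word in ["photosynthesis", "photosynthetic", "photon"]:
    print(word, matches("photosynth*", word))

for word in ["aluminum", "aluminium"]:
    print(word, matches("alumin?um", word))
```

Trying a term against a few words this way shows why truncating too early (for example photo*) would pull in many unrelated terms.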

Keyword Searching Example

  • Let's say I wanted to search for the topic, electrochemistry of nickel or cobalt phthalocyanines.
  • You might enter that exact phrase, but you can also use something like this: electrochem* AND (nickel OR cobalt) AND phthalocyan*, using the Boolean operators and parentheses to group the terms.
  • Note below how, as I start to type the phrase, SciFinder-n provides a drop-down autocomplete list of possible terms.

SciFinder-n autocomplete example

  • The list of terms suggests strongly that I might want to use the asterisk wildcard to get all the terms starting with electrochem*
  • Here's the completed search phrase:

SciFinder-n keyword search example

 

  •  Now I click the Search icon and obtain the initial results set below.

SciFinder-n keyword search results, part 1

SciFinder-n keyword search results, part 2

SciFinder-n keyword search results, part 3

  • Note the following points about the reference answer set display above:
    • The default Sort order is by Relevance.
    • Search terms are highlighted.
    • SciFinder-n has retrieved a large answer set BUT note at the upper left that it has not displayed the full answer set. The "most relevant" answer set is 1,944 answers. If I click on the Load More Results button, it goes to 2,199.
    • SciFinder-n expects that the user will get large answer sets, and use the Relevance sort, and the Filter options to home in on the desired answers.
    • Note that a new Filter option has appeared: Substance Role. This is because we have searched on one or more chemical substance terms. In the drop-down list, only roles pertaining to the substances which we searched on appear, and the numbers reflect how many answers have each role. Note, too, that a substance may be assigned more than one role in a given document record. Note that the substance roles displayed in the left-hand column are broad roles, e.g. Uses. If you click on the View All link, the table of roles will include more specific roles, e.g. Analytical Reagent Use.
    • Below is the Concepts filter table for this answer set. This is a powerful refinement tool for my search.

SciFinder-n keyword search results concepts table

  • Note how, unsurprisingly, there are many "electrochem" terms among the most common concepts.
  • Note, too, how you may find related terms that you might want to consider to broaden or refocus your search, like cyclic voltammetry (an electrochemical technique) and metallophthalocyanines.
  • You may select as many of the concept headings as you like, then click the Apply button to filter your results to just those documents that have those concepts in their indexing. You may use the Search tab to help find relevant concepts when you have a long list. See example below. In this case, I searched electro* and then clicked the Select All on Page box to select them all at once.

SciFinder-n keyword search results search of the concepts table

 

  • After applying the Searched concepts above and the Substance Role Analytical Study, my answer set is narrowed to 261 answers. Below is the full document record for one of the answers.

SciFinder-n reference detail from keyword search, part 1

SciFinder-n reference detail from keyword search, part 2

SciFinder-n reference detail from keyword search, part 3

SciFinder-n reference detail from keyword search, part 4

SciFinder-n reference detail from keyword search, part 5

  • Note that all the keywords searched are highlighted, in title, abstract, keywords, concept headings and substance records.
  • Note that in the substance records, the roles for each substance as they are given in the document appear. Note also that both the general roles that appear in the Filters list (such as Analytical Study) and more specific roles appear. These more specific roles cannot be used to filter in SciFinder-n (yet), though they can be used in STN.
  • Note that Metallophthalocyanines is highlighted in the concept headings even though we didn't search it directly; SciFinder-n's built-in "smarts" searched for it anyway. Also note that, since it is a class of compounds, substance roles are applied to it as well as to specific substances.

 

Searching for Substances

SciFinder-n search for substances opening screen

  • This is the opening screen for substance searching in SciFinder-n. In the CAS databases, chemical substances include: simple organic and inorganic substances, polymers, biomacromolecules (such as proteins and nucleic acids), metals and alloys, mixtures and more. Each substance receives its own CAS Registry Number (about which more below), including isotopically-labeled substances, stereoisomers, salts and ions of differing charges.
  • As indicated, you can use the search box to search by:
    • Chemical Name - This includes trade names (e.g. Teflon), generic drug names (e.g. ibuprofen), common chemical names (acetone), acronyms (EDTA), systematic chemical names, and CAS inverted chemical names.
      • If you search by a single word chemical name, you may also retrieve substances in which your term is a part of the name. If you wish to retrieve only the single substance, put the word in quotes.
      • Chemical name searching may not retrieve all the variations on a substance, such as stereochemical variants, isotopically-labeled substances, the mineral version of a salt, etc.
      • You can use the asterisk wildcard to truncate single word names, and use quotation marks to enclose phrases.
      • You can enter multiple chemical names at the same time to find multiple substances. They must be separated by a space, not by commas or any other punctuation.
    • CAS Registry Numbers - This is the Chemical Abstracts Service ID number for substances. Like a Social Security number, or a UCSB perm number, the number contains no information about its subject. It is purely an identifier.
    • Note: At present, you cannot directly search other chemical identifiers, such as SMILES strings, InChI numbers or InChI keys in SciFinder-n. You can, however, use SMILES or InChI identifiers in the structure drawing tool to generate a starting structure for searching. See Lecture 14.
    • Document identifiers - Patent numbers, DOIs, PubMed IDs and CAS Accession Numbers can be used to retrieve the substances contained in the document identified.
  • You may also enter a DOI for a document or a patent number, and retrieve the substances indexed in that document or patent.
  • To the right of the search box is the Draw button. Clicking it opens the SciFinder-n structure drawing tool, for finding substances by chemical structure. This will be discussed extensively in Lecture 14.
  • Below the search box is the link for Advanced Search, which will be discussed in detail below. You can add multiple advanced search fields if desired. Unlike the advanced search fields in Reference searching, fields in Substance searching are automatically combined with AND.

Searching by Chemical Name; Substance Answer Sets

SciFinder-n substance search using chemical names

  • Above is a substance search using four common names of over-the-counter analgesic and anti-inflammatory drugs.
  • Note that in SciFinder-n, you can use Boolean operators, wildcards and parentheses in Substance name searching. Wildcard searching only truncates the specific term to which it is applied, not the whole of a complex name. However, the terms you enter may be searched within names. Pay close attention to your result sets to determine whether you are retrieving the answers you expect to get from your name search.
  • Below are the results of that search.

SciFinder-n substance answer set, part 1

SciFinder-n substance answer set, part 2

SciFinder-n substance answer set, part 3

  • Looking at the display above, notice:
  • To the immediate right of the Substances header is the number of substances in the answer set (4)
  • Further to the right is the drop-down Sort menu. The default sort is Relevance. Also available are CAS Registry Number (RN), Molecular Formula, Molecular Weight, Number of References (all in ascending or descending order) and Number of Suppliers (descending order only). Sorting by CAS RN is essentially sorting by when the substance was added to the Registry database: the larger the number, the more recent the addition.
  • To the right of that is the Record View drop-down menu. Default is Partial; option is Full.
  • Below that, left to right, are tabs for retrieving References, Reactions or Suppliers associated with selected records or the entire answer set.
  • To the right are the icons for Download, E-mail and Save and Alerts. For Substances, the download options are Excel, PDF, RTF and SDF. SDF stands for Structure-Data File. Note that there are limits on how many substances you can download; for most formats it's 1000 records, for Excel files it's 100 records at a time. If you have a larger answer set, you'll need to break it up into smaller chunks.
  • To the left are the Filter options for Substances. As with References, you may opt to either Filter by or Exclude a given parameter.  Again, only options that are relevant to your answer set appear. As with References, the top five possibilities display. If there are more, click on See More to get up to 10 answers, or a full table.
    • Commercial Availability - whether or not suppliers are available for a substance.
    • Reaction Role - What roles in reactions does a substance play? Product, Reactant, Reagent, Catalyst, Solvent. If a substance appears in a given role in even one reaction, it will be listed here.
    • Reference Role - This is the counterpart to the Substance Role filter for References. If a substance has a given role in at least one reference, it will appear here. To view the table of all Reference Roles for substances in the answer set, click the View all link.
    • Stereochemistry - Is there at least one stereochemical center in the answer structure?
    • Number of Components - Salts, mixtures, copolymers, alloys, etc. will have more than one component.
    • Substance Class - Examples: Organic/Inorganic Small Molecule, Polymer, Biosequence, Mixture, etc.
    • Isotopes - Are there any isotopically-labeled substances in the answer set?
    • Metals - Do any of the substances in the answer set contain metals?
    • Molecular Weight - Lets you specify a range of molecular weights to filter your results.
    • Experimental Property - Lists the experimental properties (not the values of the properties) available for substances in the answer set.
    • Experimental Spectrum -  Lists the types of experimental spectra available for substances in the answer set.
    • Regulatory Data by Country/Region - Shows whether regulatory information is available in the database for substances in the answer set, broken down geographically.
    • Regulatory Data by List - Same as the above, but broken down by list, such as EINECS or NIOSH.
    • Bioactivity Indicator - Lists biological activities that have been studied for substances in the answer set.
    • Target Indicator - Lists biological targets (e.g. enzymes) for which substances in the answer set have been studied.
    • Search Within Results - Lets you open the structure drawing tool to search within the answer set for a particular structure or substructure and require or exclude that structure. Note that with small answer sets (that is, almost anything less than the full Registry file), even small structure fragments can be successfully searched.
  • Below the filter list is Filter Content Report, which generates an Excel spreadsheet of selected filter data for this answer set.
  • On the right are the brief records for the substances in the answer set. Note that right-clicking on links here, as elsewhere in SciFinder-n, will open a new tab or window containing the linked information. This can be handy for moving back and forth between an answer set and individual answers.
    • At the top of each record is the CAS Registry Number (or CAS RN) for the substance. Clicking on the RN will take you to the full record for the substance (see below for examples.)
    • To the right of the RN is an Expand link, which displays a more extensive brief record (see below for the aspirin record.) This view adds key physical property data (where available) and a link to the experimental properties and spectra table for the substance.

SciFinder-n expanded view of aspirin brief record

  • Next you see the 2-D structure of the substance. Stereochemical bonds, if any, are indicated. Note that structures are only displayed if there is a known structure for the substance, and it contains 255 or fewer non-hydrogen atoms. Thus, most biosequences do not have a displayable 2-D structure. If you click on the structure, you get a pop-up "quick view" of the substance record (see below). The quick view includes the CAS RN, a brief name, the structure diagram, and links to Substance Detail (the full substance record), Reactions (all reactions in which the substance is indexed), Synthesize (all reactions in which the substance is a product), Start Retrosynthetic Analysis (see Lecture 15 for a discussion), References (all references in which the substance is indexed), and Suppliers (all suppliers in the database for the substance.) The Edit Structure link opens the structure drawing tool and enters the substance's structure as a starting point for creating a new structure search.

SciFinder-n substance quick view for aspirin

  • Returning to the brief record, below the structure are:
    • Molecular Formula in Hill order. Hill order is carbon, then hydrogen, then all other elements in alphabetical order. If carbon is not present, then all elements, including hydrogen, are in alphabetical order.
    • Substance Name - in this case, the name which was used as a search term.
    • Links to References, Reactions and Suppliers in which the substance appears.
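The Hill order rule described above is easy to express as a small function. The following is a minimal sketch, assuming the element counts are already known and supplied as a dictionary; the function name is hypothetical and it makes no attempt to parse a formula string.

```python
def hill_formula(counts: dict[str, int]) -> str:
    """Build a molecular formula string in Hill order.

    If carbon is present: C first, then H, then the rest alphabetically.
    If carbon is absent: all elements (including H) alphabetically.
    """
    if "C" in counts:
        order = ["C"]
        if "H" in counts:
            order.append("H")
        order += sorted(el for el in counts if el not in ("C", "H"))
    else:
        order = sorted(counts)
    # A count of 1 is conventionally left unwritten.
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "") for el in order)


print(hill_formula({"O": 4, "H": 8, "C": 9}))   # aspirin
print(hill_formula({"H": 2, "O": 4, "S": 1}))   # sulfuric acid
print(hill_formula({"Cl": 1, "H": 1}))          # hydrogen chloride
```

Note how hydrogen chloride comes out as ClH rather than the familiar HCl. This is worth remembering when searching or sorting by molecular formula.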

CAS Registry Numbers

     CAS Registry Numbers were first assigned to substances by Chemical Abstracts Service in the 1960s when they created a computerized database of substances to aid their indexers in determining whether a substance in a document they were indexing had previously appeared in the literature. CAS RNs are of the form: xxx-xx-x where the first number is 2-7 digits long, the second number is always two digits long and the third number is a check digit generated by an algorithm from the previous digits in such a way that most common mistakes in entering an RN would generate an invalid RN, rather than the RN for the wrong substance.  
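As an illustration of how the check digit catches typing mistakes, here is a minimal sketch of the publicly documented CAS check-digit rule: working from right to left over the digits before the check digit, multiply each digit by its position (1, 2, 3, ...), add the products, and take the last digit of the sum. The function names are hypothetical, and the rule is stated here from general knowledge rather than from this guide.

```python
import re


def cas_check_digit(body_digits: str) -> int:
    """Compute the check digit for the digits of an RN without hyphens
    and without the final check digit (e.g. "5078" for 50-78-2)."""
    total = sum(
        position * int(digit)
        for position, digit in enumerate(reversed(body_digits), start=1)
    )
    return total % 10


def is_valid_cas_rn(rn: str) -> bool:
    """Check the xxx-xx-x format and the check digit of a CAS RN."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    return cas_check_digit(body) == int(match.group(3))


print(is_valid_cas_rn("50-78-2"))    # aspirin
print(is_valid_cas_rn("7732-18-5"))  # water
print(is_valid_cas_rn("50-87-2"))    # aspirin's RN with two digits swapped
```

For aspirin, 50-78-2, the sum is 8x1 + 7x2 + 0x3 + 5x4 = 42, whose last digit is 2. Swapping the 7 and the 8 gives a sum of 43, so the mistyped number fails the check instead of silently pointing at another substance.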

     Every unique chemical substance gets its own RN, including stereoisomers, isotopically-labeled substances, mixtures, etc. One exception to this is that polymers which only differ in chain length or molecular weight do not get different RNs, nor do plastics which differ only in how they were processed. This is a long-standing CAS indexing policy, somewhat to the regret of scientists working in the plastics industry.

     Note that CAS RNs are purely identification numbers, and do not convey any information about the structure or properties of the substances they represent. Most RNs are assigned by indexers in the course of indexing documents. Some are assigned at the request of chemical manufacturers or government agencies, and represent substances which have no published references. Note, too, that CAS RNs are the property of Chemical Abstracts Service and are not in the public domain. Reaction to this led to the creation of the InChI system (International Chemical Identifier) as an alternative which would be freely available to anyone.

Substance Detail (Full Substance Records)

Below is the substance detail for aspirin.

SciFinder-n substance detail for aspirin, part 1

 

SciFinder-n substance detail for aspirin, part 2

  • From the top: 
    • Links to References, Reactions, Suppliers for this substance.
    • CAS Registry Number
    • 2-D Structure Diagram (Clicking on the structure gives the same pop-up window as clicking on the structure in the brief record shown above.)
    • Molecular Formula in Hill order
    • CAS Systematic Chemical Name (in inverted order)
    • Key Physical Properties (the properties shown vary depending on the substance. These are fairly typical for a common organic molecule.)
    • Then a series of drop-down lists, beginning with Other Names and Identifiers. These include the canonical SMILES string, where available, and any trade, generic and other chemical names used for the substance. This list can be VERY long - polyethylene has over 1000 names in its list!
    • Experimental Properties - These are given in tabular form, divided into tabbed sections by type of property. These sections will vary depending on what is available for the substance. Each property may or may not give actual numeric values, and may or may not have conditions associated with it (such as pressure for boiling points). All will have a link to the reference from which the property information was obtained.
    • Experimental Spectra - These, too, are listed in tabular form. If a spectrum listed says View, then the spectrum itself is available in SciFinder-n. Click on the link to get the spectrum, with source detail. The spectrum may be scaled up and down in size, or shifted left to right or up and down by clicking and dragging for better viewing. The spectrum may be freely downloaded as a JPG image. The SciFinder-n spectra do not, in general, give peak assignments. If the spectrum does not say View, then it will link to the SciFinder-n record for the source document.
    • Predicted Properties - This table of properties is calculated from the chemical structure with software created by ACDLabs and licensed by CAS. Among the tabbed lists for aspirin, you will see one labeled "Lipinski". This is named for Christopher Lipinski, who, while at Pfizer, described a set of rules (the "rule of five") which could be used to estimate whether a given chemical is likely to be orally active as a drug. These involve molecular weight, hydrogen-bond donors and acceptors, and the relative solubility in water vs. organic solvents.
    • Predicted Spectra - These are also generated by ACDLabs software, and may be downloaded like the experimental spectra mentioned above.
    • Bioactivity Indicators - A hierarchical list of the broad and narrow bioactivities described in the literature for the substance. Each bioactivity has the number of current documents containing the information. Clicking on the name of the bioactivity creates a Reference list of the relevant documents, which may then be manipulated like any other SciFinder-n reference list.
    • Target Indicators - Hierarchical list of the proteins (including enzymes) with which the substance has been shown to interact. Like the bioactivity indicators, the number of papers is shown for each protein target, and clicking on the link generates a Reference list of those papers. Note: For a widely-tested drug like aspirin, this list is VERY LONG!
    • Regulatory Information - Lists the names under which the substance is known in national regulations, the countries which regulate it, and the names of the documents in which the regulation appears. This information is derived from the CAS database, CHEMLIST. Note that SciFinder-n does not contain, or link to, the actual regulatory documents.
    • Additional Details - (not visible in the image above) Includes a list of Document Types in which the substance is referenced, Substance Classes to which it belongs, and Deleted CAS Registry Numbers. Deleted RNs occur when an indexer identified a substance in a document as a new substance, assigned an RN, and it is later found to be the same as a previously known substance. The newer RN is then deleted from the Registry file. However, since it is still attached to the original document(s), SciFinder-n automatically searches all the deleted RNs when you search for references to a substance - so you don't miss out on anything!
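To make the Lipinski criteria mentioned under Predicted Properties more concrete, here is a minimal sketch of a check against the thresholds as they are usually quoted: molecular weight no more than 500, log P no more than 5, no more than 5 hydrogen-bond donors and no more than 10 hydrogen-bond acceptors. The class and function names are hypothetical, the aspirin values are approximate and for illustration only, and the properties listed on the SciFinder-n Lipinski tab may differ from this selection.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    molecular_weight: float
    log_p: float            # octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p: Properties) -> list[str]:
    """Return the list of commonly quoted Lipinski thresholds exceeded."""
    violations = []
    if p.molecular_weight > 500:
        violations.append("molecular weight > 500")
    if p.log_p > 5:
        violations.append("log P > 5")
    if p.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if p.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# Approximate, illustrative values for aspirin (not copied from SciFinder-n).
aspirin = Properties(molecular_weight=180.16, log_p=1.2,
                     h_bond_donors=1, h_bond_acceptors=4)
print(lipinski_violations(aspirin))

# A made-up large, greasy molecule for contrast.
hypothetical = Properties(molecular_weight=812.0, log_p=6.3,
                          h_bond_donors=3, h_bond_acceptors=9)
print(lipinski_violations(hypothetical))
```

A compound with no violations, like aspirin here, is considered more likely to be orally active. The rule is a screening heuristic, not a guarantee.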

Sample Records for Other Classes of Substances

Polymers/Plastics

SciFinder-n substance detail for styrene-butadiene copolymer

SciFinder-n substance detail for styrene-butadiene copolymer, part 2

  • Above is the Substance Detail for a styrene-butadiene copolymer, with the Experimental Properties section expanded.
  • Note how the two monomers are treated as individual components of the polymer. Some polymers are graphically described with the structural repeating unit (SRU) instead.
  • Note the molecular formula gives the two monomers in descending order of molecular formula in Hill order, enclosed in parentheses with an x subscript. Molecular formulas for SRU polymers use an n subscript. Both the n and x indicate an indeterminate length polymer.
  • Note how the systematic name is written. Copolymers use "copolymer with"; homopolymers use "homopolymer". There are also Registry Numbers for "block" and "graft" polymers.
  • Note how the Experimental Properties include categories relevant to plastics, such as Flow and Diffusion and Mechanical.

Biosequence

SciFinder-n substance detail for human insulin, part 1

SciFinder-n substance detail for human insulin, part 2

  • Above is the Substance Detail for a Registry Number for human insulin, with the Sequence Details, a section unique to biosequence records, expanded.
  • Note how there is no structure diagram given. Human insulin has more than 255 non-hydrogen atoms, so the Registry Record cannot record a 2D structure for the molecule. However, see the Sequence Details section.
  • Note that below the name, the record is identified as a Protein/Peptide Sequence and gives the total sequence length (in amino acids) as well as the lengths of the two sub-chains. The protein is identified as Multichain, and there is a link to Related Sequences. Since CAS assigns separate Registry records for each distinct protein or polynucleotide, as well as distinguishing by source organism, there are many other "insulin" records besides the one we retrieved. Clicking on the link creates a Substance answer set of all the related sequences.
  • In Sequence Details, you get the amino acid sequences for each subchain, as well as information on the number, types and locations of modifications to the chains (in this case the Cys-Cys bridges between the two subunits). The sequences are given in standard one-letter codes for each amino acid, familiar to protein chemists. Polynucleotide chains use the standard A, T, C, G and U.
  • Be aware that in SciFinder-n (at present) there is no way to directly search for biosequences, subsequences or similarity, though rumor has it that they may be adding this capability in the near future. The STN version of the Registry database does allow sequence and subsequence searching, with gaps, wildcards and so forth, as well as BLAST similarity searching. BLAST searching is also available in public sequence databanks like those at the National Center for Biotechnology Information (NCBI).
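If the one-letter amino acid codes used in Sequence Details are unfamiliar, a small lookup table makes them readable. The sketch below expands a one-letter sequence into three-letter codes and lists the positions of the cysteine residues, which are the ones that can form Cys-Cys bridges. The function names are hypothetical, and the example sequence is the commonly reported human insulin A chain, quoted from general knowledge rather than copied from the SciFinder-n record shown above.

```python
# Standard one-letter to three-letter codes for the 20 common amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def expand_sequence(seq: str) -> str:
    """Convert a one-letter sequence to hyphen-separated three-letter codes."""
    return "-".join(ONE_TO_THREE[aa] for aa in seq.upper())


def cysteine_positions(seq: str) -> list[int]:
    """Return the 1-based positions of cysteine residues in the sequence."""
    return [i for i, aa in enumerate(seq.upper(), start=1) if aa == "C"]


# Human insulin A chain as commonly reported (an assumption of this sketch).
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"

print(len(A_CHAIN))
print(expand_sequence(A_CHAIN[:5]))
print(cysteine_positions(A_CHAIN))
```

Comparing the cysteine positions with the modification locations listed in Sequence Details is a quick way to see which residues take part in the bridges.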
Alloys and Tabular Inorganics

SciFinder-n substance detail for monel alloy, part 1

SciFinder-n substance detail for monel alloy, part 2

  • Above is the Substance Detail for a Registry record for monel alloy, with the Experimental Properties section expanded.
  • Note the tabular composition display. This is common for metal alloys, as well as some other types of nonstoichiometric inorganic substances, like the high-temperature superconducting perovskites. The first column lists the components (usually elements, though occasionally metal oxides), the second column the molar percentage (or range of percentages) of each component, and the third column the CAS RN for the component.
  • Below that is the molecular formula in Hill order. Notice that there are no subscripts given for the elements. Below that is the number of components, and the CAS systematic name. The element with the highest molar percentage is considered the "base" of the alloy. The percentage ranges for it and the other elements are included in the name.
  • Again, note that the Experimental Properties categories given are ones appropriate for a metal alloy, such as Electrical and Mechanical.

Mixtures

SciFinder-n substance detail for metformin-glipizide mixture

  • Above is the Substance Detail for a mixture of metformin and glipizide (two drugs used for the treatment of type 2 diabetes in humans).
  • Note that in addition to the mixture Registry Number, the structures and Registry Numbers for each component of the mixture are given.
  • The molecular formula gives the Hill order molecular formula of each component in descending alphabetical order. Note that no specific ratios of the two components are given. (CAS now has a database, Formulus, that gives detailed information on formulations, including states of matter, coatings and the like, for drugs and agrochemicals. It is a separate product from SciFinder-n, aimed at industrial users. However, when you retrieve references for substances which are in formulations, you will see a Filter for formulation information appear. This can be useful for identifying which documents have detailed formulation information in their full text.)
  • Typically, for mixtures there is no experimental or predicted property information given, but for mixtures used as drugs or agrochemicals, there is frequently bioactivity indicator data.
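Hill order comes up in every molecular formula SciFinder-n displays, so a short sketch of the rule may help. In the Hill system, carbon-containing formulas list C first, H second, and the remaining elements alphabetically; formulas without carbon list all elements alphabetically, H included. The function name below is hypothetical, and the element counts are the standard molecular formulas for the two drugs in this mixture.

```python
def hill_formula(counts):
    """Build a Hill-order formula string from a dict of element counts."""
    symbols = sorted(counts)
    if "C" in counts:
        # Carbon first, then hydrogen (if present), then the rest alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        ordered = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        ordered = symbols
    return "".join(s + (str(counts[s]) if counts[s] != 1 else "") for s in ordered)


metformin = {"N": 5, "H": 11, "C": 4}
glipizide = {"S": 1, "O": 4, "N": 5, "H": 27, "C": 21}

print(hill_formula(metformin))  # C4H11N5
print(hill_formula(glipizide))  # C21H27N5O4S
```

This is the form to type into a Molecular Formula search. It does not reproduce how SciFinder-n lays out multi-component formulas on screen.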

Advanced Substance Search (Molecular Formulas, Substance Properties, Experimental Spectra)

SciFinder-n advanced substance search, part 1

SciFinder-n advanced substance search, part 2

  • Just below the main keyword search window in Substance Search is a link to the Advanced Substance Search (see image above), including Molecular Formula search, Substance Property Search and Experimental Spectra search.
  • Advanced search fields include:
    • CAS Registry Number
    • Chemical Name
    • Document Identifier - Searching by document identifier will retrieve all substances indexed in the selected document.
    • Patent Identifier - Searching by patent identifier will retrieve all substances indexed in the selected patent.
    • Experimental Spectra
      • Currently, searchable experimental spectra include:
        • Proton NMR
        • Carbon-13 NMR
        • Nitrogen-15 NMR
        • Fluorine-19 NMR
        • Phosphorus-31 NMR
      • You may enter specific peaks in ppm, or ranges of ppm. Examples are given.
    • Biological
      • Bioconcentration Factor (predicted) - specific values or ranges
      • Median Lethal Dose (experimental), in mg/kg - specific values or ranges. Note that you cannot specify the organism.
    • Chemical Properties
      • Koc (predicted)
      • LogD (predicted)
      • LogP (predicted)
      • Mass Intrinsic Solubility (predicted)
      • Mass Solubility (predicted)
      • Molar Intrinsic Solubility (predicted)
      • Molar Solubility (predicted)
      • Molecular Weight
      • pKa (predicted)
      • Vapor Pressure (predicted)
    • Density
      • Density - can search both experimental and predicted values, or experimental values only
      • Molar Volume (predicted)
    • Electrical (experimental only)
      • Electrical conductance
      • Electrical conductivity
      • Electrical resistance
      • Electrical resistivity
    • Lipinski (predicted only)
      • Freely Rotatable Bonds
      • Hydrogen Acceptors
      • Hydrogen Donor/Acceptor Sum
      • Hydrogen Donors
    • Magnetic
      • Magnetic Moment (experimental)
    • Mechanical
      • Tensile Strength (experimental)
    • Optical and Scattering
      • Optical Rotatory Power (experimental)
      • Refractive Index (experimental)
    • Structure Related
      • Molar Surface Area (predicted)
    • Thermal
      • Boiling Point (experimental and predicted, or experimental only). Note that SciFinder-n does not allow you to specify the pressure at which the bp is measured.
      • Enthalpy of Vaporization
      • Flash Point
      • Glass Transition Temperature
      • Melting Point
  • Note that new searchable properties are added from time to time. Also note that for property values, units are specified. SciFinder-n does not have any units conversion facility.
  • Advanced structure search fields may be combined with structure drawing searching. See Lecture 14 for more information.
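Because SciFinder-n does no unit conversion, a value reported in other units has to be converted before you type it into a property field. The sketch below assumes, purely for illustration, that the field you are using expects degrees Celsius; check the unit shown next to each field. The function names are hypothetical.

```python
def fahrenheit_to_celsius(t_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (t_f - 32.0) * 5.0 / 9.0


def kelvin_to_celsius(t_k):
    """Convert a temperature from kelvin to degrees Celsius."""
    return t_k - 273.15


def search_range(center, tolerance):
    """Return a (low, high) pair around a value, for use in a range search."""
    return (center - tolerance, center + tolerance)


# A boiling point reported as 212 degrees F, searched as a range of +/- 2 degrees C.
bp_c = fahrenheit_to_celsius(212.0)
low, high = search_range(bp_c, 2.0)
print(f"Search boiling point from {low:.1f} to {high:.1f}")
```

Searching a small range rather than a single value allows for the scatter you normally see between reported measurements.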

Searching Substances by Structure

  • When you click on the Draw button next to any search window, the SciFinder-n structure drawing tool will open. See below:

SciFinder-n CASDraw structure drawing screen

 

  • Note the drop-down menu at top. SciFinder-n offers two built-in structure editors: CAS Draw and ChemDoodle. This lecture will focus on CAS Draw. You can also use external structure drawing programs, such as ChemDraw, which can export a structure as a MOL file. See Import below. ChemDraw Professional integrates directly with SciFinder-n for easy transfer of structure information.
  •  Top horizontal row of icons
    • New - start drawing a new structure
    • Import - Enter the name of a saved CXF or MOL file that you wish to modify or search in SciFinder-n.
    • Export - Once you have created a structure, you can export it as a CXF or MOL file.
    • Save as Template - You can save the structure in the current structure drawing window as a template to reuse as a starting point for structure building and searching.
    • Center Structure - Moves the structure you have drawn to the center of the drawing field
    • Cut
    • Copy
    • Paste
    • Undo
    • Redo
    • Preferences - Lets you set bond lengths and angles as you prefer. This only affects the way structures display; it has no effect on searching.
    • Keyboard shortcuts - Opens a list of single-keystroke shortcuts that can save time for the experienced user.

SciFinder-n CASDraw keyboard shortcuts table

  • Text template window - You may enter a CAS Registry Number, SMILES string or InChI, then click Enter, and the identified structure will appear in the structure drawing field for you to modify or search.
  • Furthest-left vertical tool bar. Note that when you click on one of these icons, a description of its function appears in the main drawing window.
    • Select objects by clicking on them, or clicking and dragging a line around them.
    • Pencil - Draw or change atoms or bonds 
    • Atoms  - Opens a periodic table from which you may select any atom to draw with the Pencil tool. Selected atoms show up in the window just below the drawing field. When you hover your cursor over an element in the table, it will display the atomic number, element name (in English), atomic weight, and the column and row names to which it belongs, see below. Use the close button when you are done selecting atoms. Note, too, that deuterium and tritium are available at the bottom of the table.

SciFinder-n CASDraw periodic table for selecting atoms

  • Variables - List of available variables: See below.

SciFinder-n CASDraw variables table

  • <blank>
  • Add positive charge - Lets you add one or more positive charges to a structure you've drawn. This does not affect searching.
  • Repeating Groups - If you wish to specify a repeating group in your structure, draw it, then choose this tool and draw a selection box around the group. A window will appear in the horizontal bar above the drawing window where you can specify a range of how many repeats, from 0 to 20.
  • Variable attachment points - Used to allow variation in where a given substituent may be attached to a ring system. Create the ring system structure, then create the substituent separate from the ring(s). Now click on the Variable Attachment tool, then on the substituent, then on each atom on the ring system where you wish to allow the substituent to be attached. Note that this does not mean that multiple copies of the substituent will necessarily be attached. The substituent will be shown as connected by dotted lines to the selected ring atoms. See example below.
  • Map atoms - Designate corresponding atoms in reactants and products
  • Bonds formed/broken - Click on a bond in a reactant or product to indicate that the bond is broken or formed in the reaction.
  • Marquee - Lets you draw a rectangular box to select an atom or group of atoms (e.g. for deletion). Click in the upper-left hand corner of the area you wish to select, then hold and drag to the lower right-hand corner. Release the button to select.
  • Lock Ring - This tool is used for substructure searching. Select it, then click on an atom in the desired ring to prevent answers in which there is a ring fused onto the drawn structure.
  • Rotate fragment - This tool lets you click on an atom in your structure and then use your mouse to rotate the structure around that atom (in a 2D sense). Note that rotating the structure does not affect searching, only display.
  •  
  • Reaction Arrow - Only used for reaction searching. This will be discussed in Lecture 15.
  • Map Atoms - Only used for reaction searching. This will be discussed in Lecture 15.

Second left-hand vertical tool bar:

  • Rectangle selection tool - Click on the upper left-hand corner of the area you wish to enclose and select, then drag to the lower right-hand corner and release, to select all the atoms and bonds within the rectangle.
  • Eraser - Delete atoms and bonds
  • Shortcuts - Opens a table of "shortcut" symbols for commonly used functional groups, such as Me, Et, NO2, etc. Note that if you use a shortcut group in drawing your structure in a substructure search, no further substitution is allowed on the shortcut group. Note that a few shortcuts appear in two different forms. This makes no difference in searching, but can be used to get a better appearance in the structure diagram.

SciFinder-n CASDraw shortcuts table

  • R-groups - These are "build your own" variable groups. You may create up to 20 different R-groups per structure (R1, R2, R3 etc.), selecting from Atoms, Variables and Shortcuts. Note that STN allows you to nest R-groups within other R-groups, and to create structure fragments to act as R-groups. SciFinder-n does not allow this at present.

SciFinder-n r-group definitions

  • Structure Templates - These are a set of pre-defined structures which you can use as starting points for structure drawing. See the full list of structure types below, and an example of one of the types in full. Note that you can create your own user-defined templates using the Save as Template tool mentioned above. Note, too, that you can search all the templates by name to find a particular one. See the opening table, and the expanded Alkaloids example.

SciFinder-n CASDraw templates selection tool

SciFinder-n CASDraw template selection tool

 

  • Add negative charge - See "Add positive charge" above.
  • Chain - Click and drag this tool to create a chain of single-bonded carbon atoms.
  • Substance role in reactions - Click on an atom and designate whether the structure fragment it is in is a reactant, product or other
  • Lock rings - Clicking on this icon displays a menu of several tools. Lock rings: click on an atom in a ring to designate that no additional ring fusion on that ring is allowed. Lock atoms: forbids non-hydrogen substitution on the atom. Rotate fragment and Flip fragment: these last two only affect the appearance of the display, and do not influence searching.
  • Add reaction arrow - Click and drag to create a reaction arrow, with reactants to its left, and products to its right.

Vertical bars on the right-hand side of the drawing window

  • These provide access to commonly used atoms and bond types. Note that some boxes have an arrow in the lower right-hand corner. These indicate that more choices appear when you click on the box.
    • C = Carbon; H = Hydrogen (Click on the arrow for Deuterium and Tritium)
    • O = Oxygen; S = Sulfur
    • N = Nitrogen; P = Phosphorus
    • Cl = Chlorine (Click on the arrow for Fluorine, Bromine, Iodine); Si = Silicon
    • Single bond; Double bond (Click on the arrow for Triple bond and Unspecified bond)
    • Stereo bond up (Click on the arrow for Stereo bond down, Stereo double bond up, Stereo double bond down); Cis-trans double bond
    • Cyclopentane ring; Cyclopentadiene ring
    • Cyclohexane ring; Benzene ring
    • Cycloheptane ring; 3-15 sided ring

Drawing Structures

With the tools above, drawing structures in SciFinder-n is relatively straightforward. SciFinder automatically checks for "normalized" bonds in aromatic structures or tautomers.

Some general tips -- While you can do most functions in any order, I prefer to do the following:

  • Plan out your structure before you start drawing. Think about whether you are doing an exact structure search or substructure search. Are there templates you can use? Think ahead as to what groups can be represented by shortcuts. Note that SciFinder (and the underlying database) do not allow substituents to be attached to most shortcuts.
  • Is there a known substance that you can use as a starting point, rather than drawing from scratch? If so, look up the substance in SciFinder-n and click on its structure diagram, then select "Edit Structure". Alternatively, if you can find its CAS Registry Number, SMILES string or InChI, you can enter it in the CAS Draw search window and pull it up that way.
  • When starting from scratch:
    • Draw any ring structures in the compound first.
    • Then draw/attach chains as needed.
    • Then change any atoms into heteroatoms, variables, shortcuts, R-groups as necessary.
    • Then modify bonds as needed. This is where you can add stereochemistry if desired.
    • Then consider any added restrictions: locking down substitution at some sites or locking out ring fusion.
  • Previewing substructure searches is a good idea to make sure you're getting the kind of results you anticipated.
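If you plan to pull up a starting structure by CAS Registry Number, it is worth checking that you copied the number correctly. A Registry Number ends in a check digit: multiply each preceding digit by its position counted from the right (1 for the rightmost), add up the products, and the last digit of the sum should equal the check digit. The sketch below applies that rule; the function name is hypothetical, and passing the check only shows the number is well-formed, not that it belongs to the substance you want.

```python
import re


def is_valid_cas_rn(rn):
    """Check the format and check digit of a CAS Registry Number string."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight each digit by its position counted from the right, starting at 1.
    total = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit


print(is_valid_cas_rn("7732-18-5"))  # water: True
print(is_valid_cas_rn("7732-18-4"))  # wrong check digit: False
```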

Structure Drawing Example: Ruthenium(2+), tris(2,2'-bipyridine)

As an example, let's look step by step at drawing the structure of Ruthenium(2+), tris(2,2'-bipyridine), abbreviated here as Ru(bipy)3:

Ruthenium(2+), tris(2,2'-bipyridine)

First, use the ring drawing tools (right-hand tool bar) to draw the ring portions of the structure: use the benzene ring tool to draw the six benzene rings. Note that the precise positioning is not important; I've laid them out similar to the final 2-D structure for convenience.

SciFinder-n Drawing Ru(bipy)3, part 1

Now, use the Pencil tool to draw bonds connecting pairs of benzene rings to create three biphenyls. Note that as the Pencil tool touches an existing atom or bond, that atom or bond is highlighted. Hold the mouse button and go to the corresponding atom on the adjacent ring. As you do so, a bond will appear. When you reach the other atom, the bond will link the two rings.

SciFinder-n Drawing Ru(bipy)3, part 2

Continuing with the Pencil tool, select N from the list of common elements on the right-hand tool bar and change one carbon on each ring, in the position ortho to the biphenyl bond, to a nitrogen. Now you have three 2,2'-bipyridyls.

SciFinder-n Drawing Ru(bipy)3, part 3

Now, we can use the Atoms button (the periodic table icon on the left-hand tool bar) to open the periodic table, select Ru, and close the periodic table. Note how the window to the left of the buttons now shows Ru. That is the current atom that will appear when you draw. Now, move the Pencil tool cursor to the center of the circle of bipyridyls and click to place a ruthenium atom there.

SciFinder-n Drawing Ru(bipy)3, part 4

Now use the pencil tool to connect the Ru atom to each of the six nitrogens, one at a time. Now, your structure is drawn. You could use the + tool in the left-hand menu bar to add a +2 charge to the Ru, but charge assignments are not used in searching by SciFinder-n.

SciFinder-n Drawing Ru(bipy)3, part 5
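A quick atom count is a handy sanity check on a finished drawing before you search. The sketch below is not a SciFinder-n feature, just arithmetic: it adds up three 2,2'-bipyridine ligands (C10H8N2 each) and one ruthenium, then computes a molecular weight from standard atomic weights rounded to three decimals. The variable and function names are hypothetical.

```python
from collections import Counter

# Standard atomic weights, rounded; enough for a sanity check.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "Ru": 101.07}

bipyridine = Counter({"C": 10, "H": 8, "N": 2})
ruthenium = Counter({"Ru": 1})


def combine(*parts):
    """Add up the element counts of several fragments."""
    total = Counter()
    for part in parts:
        total += part
    return total


complex_counts = combine(bipyridine, bipyridine, bipyridine, ruthenium)
molecular_weight = sum(ATOMIC_WEIGHT[el] * n for el, n in complex_counts.items())

print(dict(complex_counts))        # {'C': 30, 'H': 24, 'N': 6, 'Ru': 1}
print(round(molecular_weight, 2))  # 569.63
```

If the formula shown for your top "As drawn" answer differs from C30H24N6Ru in the cation, recheck the drawing for a missed nitrogen or an extra bond.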

Now, click OK to return to the search screen.

SciFinder-n, Searching by structure for Ru(bipy)3

 

Note how the structure Edit icon is now highlighted. Click the Search icon to carry out the structure search. You can add additional substance search terms (keywords, property values, etc.) before launching the search.

 

SciFinder-n Ru(bipy)3 structure search results, part 1

SciFinder-n Ru(bipy)3 structure search results, part 2

Searching Structures

When you click the Search button, you get a display of the substance answers in the SciFinder-n database that match your query. The default sort order is Relevance, though you can select other sort orders. Note, too, on the left, that there are three answer sets created by your search:

  • As drawn - Structures containing the drawn structure, but with no additional substitution on the structure itself. However, the answer set can include structures with different stereochemistries, salts (in this case, since the main structure is a cation, you can get salts with a range of different anions), isotopically-labelled compounds, and mixtures containing the basic structure.
  • Substructure - This answer set includes all of the structures in the "As drawn" set, plus all those in which additional substituents are attached to the original structure.
  • Similarity - This answer set uses an algorithm to select structures similar to the one drawn. They may have additional substituents or fewer atoms, or some atoms replaced by different elements (such as a Co in place of the Ru). The similarity algorithm ranks the results by a percent similarity. Similarity searching is frequently used in drug discovery searches to turn up compounds that might have similar biological activity to the starting structure.
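To get a feel for what a "percent similarity" means, here is a minimal sketch using the Tanimoto coefficient, a common measure in cheminformatics: the number of features two structures share, divided by the number of features present in either. This guide does not say which algorithm or which structural features SciFinder-n actually uses, so both the measure and the feature labels below are assumptions for illustration only.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical feature sets; real systems use structural fingerprints.
query = {"bipyridine", "Ru", "N-metal bond", "six-coordinate"}
candidates = {
    "same complex, different salt": {"bipyridine", "Ru", "N-metal bond", "six-coordinate"},
    "Co analog": {"bipyridine", "Co", "N-metal bond", "six-coordinate"},
    "free ligand": {"bipyridine"},
}

# Rank candidates from most to least similar to the query.
ranked = sorted(candidates, key=lambda name: tanimoto(query, candidates[name]), reverse=True)
for name in ranked:
    print(f"{name}: {round(100 * tanimoto(query, candidates[name]))}%")
```

Swapping Ru for Co changes one feature out of four, so the Co analog still scores well above the bare ligand, which matches the kind of ranking you see in a Similarity answer set.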

"Filters" allows you to limit your search results to certain types of substances or to exclude certain types of substances. Note that you can apply multiple filters to the same answer set. Filters can be removed as well.

Filtering Substance Answer Sets

The previous tab, "Substance Searching", lists the various filters available for substance answer sets. Note that a filter category only appears if it is applicable to the answer set in question. If the number of possibilities in a given option is small, they will be listed in order of descending occurrence. If there are more, you'll see a "View All" link, which will open a table of all the possibilities. For example, here's the table of Reference Roles for the Ru(bipy)3 answer set above:

SciFinder-n reference roles for Ru(bipy)3 answer set

Note that the roles can be listed either in descending order by occurrence count, or alphanumerically. If you select more than one option from the table, they are, in effect, "OR"ed together. If you want to "AND" roles together, apply them one at a time.
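The difference between selecting several roles at once and applying them one at a time is easy to see with Python sets. This is a sketch of the logic only: the substance labels and role assignments are made up, and the function names are hypothetical.

```python
# Hypothetical index: which substances in the answer set carry each role.
substances_by_role = {
    "Preparation": {"A", "B", "C"},
    "Uses": {"B", "C", "D"},
    "Properties": {"C", "E"},
}


def filter_together(answer_set, roles):
    """Several roles selected in one filter step: a substance needs ANY of them (OR)."""
    matched = set().union(*(substances_by_role[role] for role in roles))
    return answer_set & matched


def filter_one_at_a_time(answer_set, roles):
    """Roles applied as successive filters: a substance needs ALL of them (AND)."""
    result = set(answer_set)
    for role in roles:
        result = filter_together(result, [role])
    return result


answers = {"A", "B", "C", "D", "E"}
print(sorted(filter_together(answers, ["Preparation", "Uses"])))       # ['A', 'B', 'C', 'D']
print(sorted(filter_one_at_a_time(answers, ["Preparation", "Uses"])))  # ['B', 'C']
```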

Structure Search Within Results -- Why, you might ask, would I want to refine one structure search with a second structure search? Why not just do the structure search you wanted in the beginning? The answer is that some types of structure search are too general to run on the full database. But if you can create a smaller subset of substances to run your desired structure against, you can be successful. The subset can be created by either structure or molecular formula searching, or by using the Get Substances option on a document answer set. Also, by starting with a more general search, you can filter for a particular structure option, then back up and try a different option.

 

 

Copyright

© 2022 Charles F. Huber

Creative Commons License
This work by Charles F. Huber is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Based on a work at guides.library.ucsb.edu

Screenshots of SciFinder-n are copyright © 2022 by Chemical Abstracts Service (CAS), a division of the American Chemical Society, and are used for fair use educational purposes only.


Copyright © 2008-2019 The Regents of the University of California, All Rights Reserved.
UCSB Library (805) 893-2478 • Music Library (805) 893-2641 • UCSB, Santa Barbara, CA 93106-9010