APLIC-International, The Communicator, Spring 1997, Issue #64

US Government Information in Electronic Formats:
The National Library of Medicine

Text of presentation by Sheldon Kotzin, National Library of Medicine

Thank you for inviting me to participate today. While I know a good bit about the POPLINE database, I am by no means an expert on the content that specialists in this field are interested in. Therefore, I am glad I can talk about electronic data, something I do know about, and hopefully it will be of interest and value to you. I want to elaborate on some of the unique activities going on at the National Library of Medicine that take advantage of the Internet. While the more glamorous subjects are in the world of the future, I first want to mention a step the Library is taking into the past, electronically speaking.

Through the National Library of Medicine alone, one's fingertips can reach out and touch, or at least click on, the full text of several journals accessible on the World Wide Web; one million gene sequence records; 52,000 prints, photographs, and other historical images; 800,000 cataloging records for monographs; and more than 9 million references to the biomedical journal literature from 1966 to date. Yet users often have to be reminded that important biomedical information was published prior to 1966. In the health science literature, the detailed descriptions of the natural history of diseases were written decades ago. Investigators seeking ways to treat drug-resistant tuberculosis look to the literature of the first half of the 20th century. MEDLINE and other databases contain virtually no citations on this subject published in the 1970s, 1980s, and early 1990s, because little research was conducted. Recent concern about radiation experiments sent researchers and librarians looking to the literature of the 1940s and 1950s for the same reason.

Therefore, NLM has decided that the retrospective conversion of printed records will be a priority. MEDLINE began in 1966, when NLM was the first government agency to put its data into electronic form. We have added 40 databases, including POPLINE, since then. Not only are they made available for direct searching, but nearly all are leased to commercial companies that also provide access. Last year NLM put the 1964-65 data in machine-readable form. We call it OLDMEDLINE. We intend to keep moving back in time, converting data as quickly as money will allow. In addition, in a cooperative project with the American Association for the History of Medicine, we intend to make available in electronic form the first series of the Library's index catalog, which dates from 1879. At some point, the two extremes may meet, so that every citation will be in electronic form.

I would like to spend the next few minutes speaking with you about National Library of Medicine plans in this age of electronic journals, specifically how they relate to our mission as a national library and our role as an institution that often establishes de facto standards for the health information community.

There are two major areas where electronic journals will play an increasingly important role: the first is in the transition from print to electronic journals in meeting basic user needs; the second is in linking to the full text of electronic journals from online databases such as MEDLINE or maybe POPLINE. The issues, at the moment, seem overwhelming: preservation, fair use, licensing, copyright, and the cost of information.

What are some of the problems in dealing with electronic journals? URLs change and even disappear with regularity. Electronic journals are often not published by large, established publishers, and they reflect their publishers' lack of experience with publishing and with the characteristics of an electronic medium. Publishers aren't always concerned about archival versions, should the electronic version cease. Some electronic journals are free; most are not. The range of pricing and licensing arrangements is astounding. Some publishers establish separate prices for print and electronic versions. Others offer access to electronic versions for an additional charge, sometimes 10% to 20% more. License agreements typically preclude libraries from making "fair use" copies for individuals who are not part of the group covered by the license, i.e., faculty, staff, and students. In some cases, there is no easy way to get print copies of electronic journal articles, and when it is possible, the cost can be as high as $35 per article.

Access, indexing, and preservation are the three primary areas of concern for NLM, which has formulated a set of questions for publishers to answer regarding each.

Access questions include:

Indexing issues:

Preservation issues:

In the best of all worlds, publishing in electronic form should provide more rapid dissemination and better access to current information. It should result in a "level playing field" for researchers in less developed countries who cannot afford to pay for print journals but may have Web access. However, special attention is required to ensure that the benefits of electronic publishing are achieved without limiting access only to the wealthy or technologically advantaged, without increasing the difficulty in identifying and obtaining the information, and without jeopardizing the ability to preserve the scholarly record.

Next, I'd like to address an experimental retrieval system being developed by NLM called PubMed. It takes advantage of the Internet and the availability of electronic journals on publisher home pages. Let me give a little background on how the PubMed project got started.

In the last few years, increasing numbers of publishers have been sending NLM citation and abstract data in SGML or HTML format. The data goes directly into our indexing system and enables the Library to have a new database of citations and abstracts without Medical Subject Headings and other indexer-added information. The citations with abstracts are available to NLM users on a daily basis, much sooner than they appear in MEDLINE. The file, called PREMEDLINE, is not without some errors, but it serves well as an in-process file for new citations. Now back to the PubMed story for another minute. To reiterate, NLM was receiving citation data directly from publishers, but a larger goal was to gain access to the full text. The Library indexes 400,000 articles each year, and it makes no sense to think of storing the full text of this many articles in our databases. What did make sense was to have a link from the databases to those electronic articles. Moreover, to take it one step further, it would be useful for a user retrieving the articles from the publisher's home page to have these references linked back to other MEDLINE citations. This is the PubMed model.
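For readers who think in code, the two-way linking idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of how PubMed is actually built: the `Citation` record, its field names, the two helper functions, and the `publisher.example` addresses are all hypothetical. The point it tries to show is that the database keeps only citation data plus a link out to the publisher's copy, and that an article page can be traced back to related citations.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Citation:
    """A citation record; the full text itself is never stored here."""
    citation_id: str                      # hypothetical identifier
    title: str
    full_text_url: Optional[str] = None   # link out to the publisher's copy, if any
    related_ids: List[str] = field(default_factory=list)  # other linked citations


def full_text_link(db: Dict[str, Citation], citation_id: str) -> Optional[str]:
    """Database to publisher: return the full-text link, or None if there is none."""
    record = db.get(citation_id)
    return record.full_text_url if record else None


def citations_for_article(db: Dict[str, Citation], url: str) -> List[Citation]:
    """Publisher to database: given an article page, return its linked citations."""
    for record in db.values():
        if record.full_text_url == url:
            # Skip any related id that is not present in the database.
            return [db[i] for i in record.related_ids if i in db]
    return []


# Made-up sample data (assumption): two records, only one with full text online.
db = {
    "c1": Citation("c1", "Example article A", "http://publisher.example/a", ["c2"]),
    "c2": Citation("c2", "Example article B"),
}
```

In this sketch, a search that retrieves record `c1` can offer the publisher link, while a reader who starts on the publisher's page for that article can be led back to `c2`. A linear scan is used for the reverse lookup only to keep the example short.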

As opposed to other NLM databases, PubMed accesses data on a server rather than the more expensive mainframe computer. It uses the World Wide Web and requires no special software or telecommunications packages. It is possible that PubMed may become an important method for accessing MEDLINE and other NLM databases in the near future. The URL is http://www.ncbi.nlm.nih.gov

Thanks so much for your attention.
