A Digital Library Progress Report
With a Look to the Future
January 1995 through January 1998
The UC Berkeley Library
February 5, 1998
The Digital Library as a Process
It's tempting to think of the "digital library" as a goal, or as an event that can be reached at some fixed point in time. In this conceptualization, the digital library is achieved when some predefined set of electronic services and collections becomes available to the community at large. In contrast, the UC Berkeley Library prefers to view the emerging digital library more as a process - a road traveled toward some very promising, but vague destination. The vagueness is the result of an inability to predict the shape and scope of future digital library services. Predicting the future is difficult for many reasons, but mostly because it is almost impossible to foresee the impact new technologies will have on digital collections and services. It was only three years ago, in December 1994, that Netscape released its first web browser, Netscape Navigator. Looking back, it's clear that not many people understood the full impact the World Wide Web would have on our society, and there is no reason to think that future innovations will not have similarly revolutionary effects.
Given the vague nature of a future digital library, it is very difficult to predict the straightest path to the vision. Therefore, the most important activity in shaping the digital library becomes making and taking advantage of opportunities presented by new technologies, government policies, funding sources, publishers and the information industry in general. The UC Berkeley Library is committed to a leadership position in the creation of the future digital library by pursuing and realizing strategic opportunities.
The UC Berkeley Library Approach to Creating a Digital Library
In treating the emerging digital library as a process, one overriding goal of the Library has been to provide the best network-based services possible to the UC Berkeley community. The Library's plan can be viewed as a series of short and longer-term efforts.
Short Term
In the shorter term, the Library has concentrated on taking advantage of opportunities in two strategic areas:
Through its own efforts and by working with strategic partners like IS&T, Sun Microsystems, Pacific Bell and OCLC, the Library has built an impressive infrastructure to deliver digital library services. The current infrastructure is made up of over 2,000 network connections, three large network servers and over 800 PC workstations. In the last three years, the Library has created and realized opportunities to enhance this infrastructure in the form of hardware, software and facilities gifts valued at over $430,000.
The remaining infrastructure challenges for the Library are twofold:
1) completing the infrastructure by replacing over 200 terminals with Web-browser-enabled workstations (approx. $575,000 initial investment, plus $165,000 per year in maintenance and replacement costs); and 2) identifying equipment replacement funding sources to maintain and upgrade this infrastructure over time (approx. $665,000 per year for our current installed base).
Deploying Network-Based Library Services
As one would expect, the Library's internal processing services are fully automated. Berkeley library users have been made more self-sufficient through UC systemwide cooperative opportunities via the Melvyl system, which provides access to over 85 databases. In addition, the Library has used technology opportunities to provide access to more than 250 networked CD-ROMs. Just last year, the Library released a new Web-based catalog named Pathfinder. This new service was made possible by an opportunity provided through a strategic partnership with OCLC, which agreed to cosponsor the Berkeley Digital Library SUNsite through a gift of its SiteSearch software. The SUNsite was initially made possible by a hardware gift opportunity from another strategic partner, Sun Microsystems.
In addition to the efforts described above, the Library has been aggressively pursuing grant-funded opportunities to shape the longer-term services that a digital library may provide. The Library has taken a national leadership position in a number of areas, including creating best practices for metadata creation, display and navigation of digital materials, distributed system architecture and community education. Library grant-funded projects that have been active in the past three years represent over $2,000,000 of extramural funding generated to support digital library research and demonstration projects. Details on fifteen of these strategic projects can be found later in this document.
Looking Back to See Forward
By looking back on the digital library as a process, one can see the progress achieved and lessons for guiding the future. The Library's first efforts in the digital environment were to automate its internal procedures. The catalog records produced in this process were then loaded into Gladis, the Berkeley campus online catalog, and finally into the Melvyl union catalog. The next big step for digital library services came with the extension of digital content from catalog records to abstracting and indexing databases. With this addition, it became even easier to discover the existence of published materials, especially journal articles. The library community is now on the verge of the next big step forward: network access to critical masses of full digital content. At the same time, the crisis in scholarly communication is inhibiting our ability to pay for this access. Within this context, new areas the UC Berkeley Library may wish to investigate include:
- Electronic Commerce, License Management, etc.
- Content Building
- Additional Distributed Architecture Issues
- Developing New Web Service Tools
- Technology Renewal
- Business Models to Sustain Operations Built on Extramural Funding
Full participation in the emerging national digital library is a strategic objective for the UC Berkeley Library. Therefore, the Library supports multiple layers of digital library projects that are integrated into current service programs. In addition, the Library is playing a national leadership role in defining best practices that will allow scholars, students and the public better access to distributed digital library collections. In all these efforts, the UC Berkeley Library is fortunate to have strong strategic partners in its digital library projects.
Information Systems and Technology (IST), the Berkeley campus computer center, is an outstanding organization that is deeply involved in many projects that are crucial to the future success of the Library's digital service objectives. Working with IST, the Library has installed over 2,000 network connections in campus libraries, including switched and shared 100 Mbps intranets. IST is also Berkeley's lead department for campus participation in the Internet2 project, and it is deeply involved in the UC effort to create a systemwide authentication and authorization system that is targeted for first use by digital library services.
Another strategic partner for the Library is the new UC organization, the California Digital Library (CDL). The Berkeley Library now has the opportunity to work with its sister campuses to leverage the strength of the UC system through the CDL. The UC system has already accomplished much through the Melvyl system, including access to the UC union catalog and over eighty-five abstracting and indexing databases. The CDL now gives us the opportunity to pursue new resource sharing efforts, most notably access to more full digital content.
Finally, the UC Berkeley Library is a very active participant with its national strategic partners, such as OCLC, RLG and the Digital Library Federation (DLF). By maintaining an aggressive grant program, the Library has been able to sustain a leadership role in shaping the emerging digital library. Berkeley is well known for its lead role in the NEH-funded projects to develop the Encoded Archival Description (EAD), an encoding scheme used to provide intellectual access to primary source materials via digitized "finding aids." The Berkeley Library is also currently the lead institution in a DLF-sponsored project, the Making of America II, which is developing best practices for the creation and encoding of digital library objects and also investigating distributed system architectures to support an integrated national digital library. Another example of Berkeley's national participation is its sponsorship of the UC Berkeley Digital Library SUNsite, which was made possible through software and hardware gifts from cosponsors Sun Microsystems and OCLC. Many more examples of UC Berkeley's national leadership role can be found below.
THE PROGRESS REPORT
No matter how much digital library content is available, it cannot be made reliably available without a robust technical infrastructure to support the discovery and delivery of these materials. The equipment that constitutes this infrastructure falls into three main categories: servers, desktop systems, and network connections. The Library's servers include a Tandem mainframe, three large Unix servers, and a dozen smaller servers (Unix, NT, and Novell) used primarily to provide access to more than 250 networked CD-ROMs. The desktop systems used by staff and patrons to access these and other servers include approximately 800 networked PCs and over 200 dumb terminals. Over 2,000 network connections in our libraries are provided for servers and desktop systems, leaving additional capacity for portable computer docking.
While continuing constraints on the operations budget did not permit us to increase funding as hoped and planned, a number of grants and gifts received over the past three years have allowed the Library to significantly expand its technology infrastructure:
- Sun Microsystems donated a SPARCcenter 2000 server to house the UC Berkeley Digital Library SUNsite, including Pathfinder, the Library's Web/Z39.50 information system.
- OCLC donated its WebZ/SiteSearch software for the development of Pathfinder as a SUNsite project.
- A private donor funded the purchase of high-bandwidth network equipment, which made possible the creation of a system that allows music students to listen over the network to performances placed on course reserve.
- Special funding from the UC Berkeley campus administration made possible the purchase of a new Tandem mainframe that houses Gladis, the Library's primary catalog, catalog maintenance, and circulation systems.
- The Vice Chancellor provided special funding earmarked for digital library development that has been used to increase the number of network-accessible resources, including many more networked CD-ROMs and HarpWeek.
- Various grant-funded inter-institutional projects and donations related to them have permitted the acquisition of some additional hardware and software, including new servers and software for navigating the Library's online collections of archival materials.
Remaining Infrastructure Issues
1) Completion of the Infrastructure
The majority of the digital information resources most used by Library patrons today require graphics-capable PC workstations. In response to this, the Library has added around 40 public service PCs per year over the past three years; but over 200 of the Library's public workstations are still text-based dumb terminals, incapable of providing access to the resources our patrons need. To replace them, however, would require over $575,000 in one-time funding and another $165,000 per year for maintenance and equipment replacement. This has been beyond our means, and as a consequence the only progress on this front has been in units fortunate enough to receive gifts for the purpose.
2) Equipment Replacement
While the development of our technology infrastructure has progressed in spite of the financial constraints of the past several years, there is as yet no solution to the problem of its long-term maintenance. The Library has acquired nearly $3 million worth of computing equipment in support of its efforts to improve services to its patrons and reduce the costs of providing those services. While this machinery must be replaced regularly to remain serviceable, there has been no reliable source of funding for this purpose, making equipment replacement an unfunded mandate of over $665,000 per year.
B) Deploying Network-Based Library Services
The Library's World Wide Web site was largely created, enhanced and refined during the past three years. This new service has given the Library a new format in which to provide its services and share staff expertise with the Berkeley campus community. The Library's Web site has just undergone a complete "renovation" which has made it even more useful and easy to use than before.
Another important value-added service the Library offered its patrons in the past three years was the introduction of Pathfinder, Berkeley's new Web-based catalog. Pathfinder provides significant searching enhancements in an easy-to-use Web browser format. In addition, Pathfinder has allowed for new services such as hyperlinking to other library resources, including finding aids.
The Library has continued to pursue adding the holdings of Berkeley's Affiliated Libraries to Gladis and Pathfinder. Affiliated Libraries do not report to the UC Berkeley Library. To date, the holdings of the Forest Products Library, the Institute of Industrial Relations Library, the Ethnic Studies Library, and the Philosophy Library are included in Gladis and Pathfinder, and plans are being made to add the holdings of the Rossberg Library of international and area studies this calendar year.
Among the recent additions to the Library's collection of network-based services is a repository of UC Berkeley computer science technical reports. This repository is one of over 70 worldwide that can be searched individually or collectively, delivering the full text of research reports in a variety of formats. The Library has also created an online listening reserves service for the School of Music, and is experimenting with a reserves system aimed at making lecture notes, test results, and the like more readily accessible to Berkeley students. In addition, the Library has collaborated with UC Data to mount and provide Web access to numerical data from the 1990 Census.
Finally, the Library Systems Office has spent a great deal of time in the past three years supporting re-engineering projects that have helped the Library absorb its targeted budget cuts. Examples of these projects include: changing the Gladis Technical Services workflow to support shelf-ready books; interfacing Gladis to the Campus Accounts Receivables System (CARS) to reduce billing paperwork; enhancing and refining management statistics; and merging NRLF records into Gladis to eliminate a separate automated system and reduce costs.
Projects Yet to Complete
One project will implement Gladis e-mail notification to speed the return of recalled items and to notify the next borrower when a desired item is available. This project will also provide significant savings in mailing costs if print notices can be eliminated.
Interlibrary Services and the Library Systems Office are collaborating to develop a mechanism that will permit UC Berkeley faculty to request articles via the Library's Web catalog and retrieve digitized copies of those articles from their homes or offices. The paging system is currently ready for deployment, and the delivery system will be tested this semester.
A number of patron self-sufficiency projects such as user-initiated inventories and renewals have been implemented through Gladis and are heavily used by the Library's patrons. A new project in this area, a Pathfinder-based mechanism that will permit UC Berkeley faculty to request delivery of items from the Berkeley collections to their offices, will be deployed in February 1998. Work on a similar mechanism for paging materials from the NRLF will begin later this year, after the integration of the NRLF and Gladis databases is complete.
C) Leadership in Shaping the Emerging Digital Library
The UC Berkeley Library has been pursuing an aggressive, grant-funded plan that has allowed it to take a leadership role in significant areas of national digital library development. The following is a list of research and educational projects that have been active in the Library over the past three years.
The Making of America II Testbed Project (MoA II)
The UC Berkeley Library is the lead institution in the Digital Library Federation's Making of America II project. MoA II is a research and demonstration project that will investigate best practices for distributed system architectures and for the encoding of intellectual, structural, and administrative metadata for primary resources. The MoA II collection will be "Transportation, 1869-1900", particularly the development of the railroads and their relationship to the cultural, economic, and political development of the country.
The California Heritage Digital Image Access Project
California Heritage builds on the development of the Encoded Archival Description (EAD), an SGML-based national best practice created by the Berkeley Library for encoding finding aids. This project links collection-level catalog records to the finding aids, and the finding aids to over 25,000 digitized photographs relating to California history.
The American Heritage Virtual Digital Archive
American Heritage builds on California Heritage and the EAD. The primary goal of this project is to develop a "virtual archive" of digitized finding aids populated by the project's participants. A testbed archive is to be developed to investigate the feasibility of providing access to distributed digital library resources and decentralized production methods. To achieve its goal, the project will explore intellectual, political, technical, and economic concerns.
The UC Encoded Archival Description (EAD) Project
The UC EAD project is a two-year pilot project to develop a UC-wide prototype union database of archival finding aid data encoded using the Encoded Archival Description (EAD). This database will serve as the foundation for the development of a full-scale digital archive for the University of California System (UC), available via the Internet to diverse user communities.
Scholarship from California on the Net (SCAN)
Scholarship from CAlifornia on the Net (SCAN) is designed to facilitate broad scholarly access to humanities journals, books, and related materials through publication on the Internet. SCAN's goal is to develop an economically viable publishing model for humanities scholarship that integrates electronic dissemination, library access, and scholarly use. This project is a collaboration among the University of California Press, the University Libraries at Berkeley, Irvine, and Los Angeles, and the Division of Library Automation of the Office of the President, and has received substantial funding from the Andrew W. Mellon Foundation.
Area Studies Monographs Project
With funding from the Mellon Foundation, the University of California Press in partnership with the Berkeley Library is conducting a project to publish a focused group of scholarly monographs on the Internet. The project has two major goals: to devise a way of publishing book-length scholarship more efficiently and cost-effectively; and to evaluate use patterns for book-length works in electronic format.
The UC Berkeley Digital Library SUNsite
The Berkeley Digital Library SUNsite was developed as a place to build digital collections and services while providing information and support to others doing the same. The SUNsite is sponsored by the Library, UC Berkeley; Sun Microsystems, Inc.; and OCLC.
The Museum Digital Library Collective
The Museum Digital Licensing Collective, Inc. (MDLC) is a non-profit corporation formed to: provide financial assistance for the digitization of original materials in museums and collecting institutions; manage the storage, distribution, and licensing of digitized materials to educational institutions, libraries, museums, commercial companies, and the public; and develop and distribute related technical and computer services. The UC Berkeley Library has submitted a proposal to the MDLC to develop its image delivery system.
UC Berkeley/Columbia Digital Scriptorium
The project aims to select Latin dated and datable manuscripts (to 1550), describe the manuscripts, transcribe portions of the manuscripts, and provide digitized images of the portions of the manuscripts that are of paleographic interest. In addition, there is a significant economic evaluation analysis.
The Social Sciences and Government Data Library
The UC Berkeley Library and UC DATA are collaborating to make US Census and other social science data available via the Internet. We anticipate this site will be rolled out in early February 1998. The initial products are the Summary Tape Files 1 and 3 (also at the Census Bureau); the Subject Summary Tape File 3, Persons of Hispanic Origin in the U.S.; and the Subject Summary Tape File 5, Characteristics of the Asian and Pacific Islander Population in the U.S.
The Networked Computer Science Technical Reports Library (NCSTRL)
NCSTRL, the Networked Computer Science Technical Reports Library (pronounced "ancestral"), is an international collection of computer science technical reports from CS departments and industrial and government research laboratories, made available for non-commercial and educational use. The UC Berkeley Library took over this service from the Computer Science Department as a technology transfer exercise.
The Institute for Digital Library Development
The Institute on Digital Library Development was a five-day workshop to retool librarians, archivists, and museum professionals with the skills they need to use existing tools and proven techniques to place library content on the Internet. The Institute was supported by a U.S. Department of Education Higher Education Act Title II-B grant and by the UC Berkeley Library.
APIS: The Advanced Papyrological Information System
The APIS project constitutes the efforts of papyrologists at a number of American universities to integrate in a "virtual" library the holdings from their collections through digital images and detailed catalog records that will provide information pertaining to the external and the internal characteristics of each papyrus, corrections to previously published papyri, and republications.
CALIPR: Preservation Planning Software
CALIPR has been designed to allow institutions without preservation expertise on their staffs to determine the preservation needs of their collections. The CALIPR manual leads the surveyor through the design of a needs assessment survey using CALIPR software, and its application to a sample of materials drawn from a collection about which estimates of preservation needs are desired. The CALIPR software and manual were developed by the UC Berkeley Library, with the support of the US Department of Education under the provisions of the Library Services and Construction Act, administered in California by the California State Library.
The Jack London Collection
Materials that reflect on the life and influence of one of turn-of-the-century America's most enduring authors. Letters, manuscripts and photographs from the Bancroft Library as well as out-of-print publications, electronic texts of some of London's writings, and original essays and research aids. This collection is cooperatively produced by the UC Berkeley Library and Sonoma State University.
The Emma Goldman Collection
Selected documents and photographs relating to Emma Goldman's life and work, as well as indexes to thousands of other documents and photographs available in collections around the world. This content-building project was a collaboration between the UC Berkeley Library and the Emma Goldman Papers Project.
Copyright © 1998 by the Library, University of California,
Berkeley. All rights reserved.
Document authored by: Bernie Hurley, Library Chief Scientist
Document maintained by: Ann Moen
Last update 2/18/98.