Last week I had the pleasure of attending the Digital Directions workshop, hosted by the Northeast Document Conservation Center. There was a ton of fantastic information, along with perspectives from seasoned professionals in the field, and colleagues who are tackling some of the same challenges we’re facing at my institution.
It will be a little while until I’m finished digesting all of this information, but in the meantime I wanted to post my notes. In addition to the links sprinkled in the notes, check out the free resources for digital preservation on the NEDCC website.
Digital Directions Day 1
Digital Preservation (Introduction)
Digital Preservation – ensuring access across technologies and over time.
Digital Curation – actions people take to maintain and add value to digital information over its lifecycle.
Curatorial actions must serve needs of current and future users.
Content creators are often unaware of how their output might be used, or what value it has, outside their own use.
Why curate? Provide information about: changes, custody, usability, metadata, findability, versions, cultural memory (activities), advance knowledge (research)
Two Main Docs
Trusted Digital Repositories
Tech and procedure suitability
Open Archival Information System Reference Model (OAIS)
Producer -> Submission Information Package -> Archival Information Package -> Dissemination Information Package -> User community
Over past 12 years operating in a merged model of these two.
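The OAIS flow noted above (producer, SIP, AIP, DIP, user community) can be sketched in a few lines of Python. This is only an illustration of the idea, not an implementation of the standard: the `InformationPackage` class, the `ingest` and `disseminate` functions, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationPackage:
    """Hypothetical stand-in for an OAIS information package."""
    kind: str                      # "SIP", "AIP", or "DIP"
    files: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)


def ingest(sip: InformationPackage) -> InformationPackage:
    """Turn a producer's SIP into an AIP by adding a provenance note."""
    if sip.kind != "SIP":
        raise ValueError("ingest expects a SIP")
    metadata = dict(sip.metadata, provenance="received from producer")
    return InformationPackage("AIP", list(sip.files), metadata)


def disseminate(aip: InformationPackage, wanted: List[str]) -> InformationPackage:
    """Build a DIP containing only the files a user asked for."""
    if aip.kind != "AIP":
        raise ValueError("disseminate expects an AIP")
    files = [name for name in aip.files if name in wanted]
    return InformationPackage("DIP", files, dict(aip.metadata))


# Example data (made up for illustration).
sip = InformationPackage("SIP", ["letter.tif", "letter.txt"], {"title": "Letter"})
aip = ingest(sip)
dip = disseminate(aip, ["letter.txt"])
```

The point of the sketch is that the archive keeps the full AIP, while each user request produces a lighter DIP derived from it.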
DCC Lifecycle Model (University of Edinburgh)
Graphical High-level overview of the stages required for successful curation and preservation of data.
Moving forward, away from “Public” vs. “Technical”
New triad: Infrastructure, Content, Services (including helping users create new content, managing rights)
David Lankes on knowledge production (iSchool at Syracuse)
Librarians as facilitators of conversations.
Libraries, Kitchens, and Grocery Stores – Joan Frye Williams (2008)
Where do collecting institutions go after preserving and providing access? People want to do stuff with digital information and artifacts.
Member-facing content creation services
Engage producers to build literacies and skills
Provide content creation, production conversion tools
Offer content hosting & production services
Cultural Institutions in the Evolving Paradigm
Teachers / Instructional Partners
Observers / Anthropologists of information users (members) to study evolving user needs.
Content Producers and Communicators
Organizational Designers (new services, new staffing, etc)
Collaborative Network Creator (partnering with other organizations)
Digital Curation Networks (eScholarship – California, NITLE, Alliance)
Digital Preservation Networks (MetaArchive, Chronopolis, LOCKSS)
Digital Project Planning
Emily Gore, DPLA Director for Content
The Power of Where Your Collections Can Go
Start with reuse.
Create sharable metadata.
Thinking beyond the institutional portal – international data models (broader than institution or local community)
Tell a more complete story by creating virtual collections, complementary collections at other institutions. Linking to other relevant content or context. Reuse/remixing.
What? Or what do you want to select from born-digital items?
Selecting particular collections should be part of core mission and goals for the institution – taking on the commitment.
Value – what is valuable, what fits with institutional mission, what is potential use, what is the cost of NOT digitizing?
Ability – can you? do you have staff etc? RIGHTS RIGHTS RIGHTS and licence, not just on the objects but also the metadata
Legal considerations – DPLA and Europeana are working on standardized actionable statements that could be used across institutions. Searchable collections by copyright: public domain, Creative Commons flavors, etc.
Workflow of what to outsource for example. Can this step be done in-house? Some steps in house and some outsourced, project-by-project basis.
Level of discovery needed. Minimal level metadata means minimal accessibility.
More product/less metadata, or rich metadata and more selective collections?
Crowdsourcing transcription projects (NYPL menu project)
Delivery Expectations – what do you want users to do?
Do you have to place restrictions? If so, be very clear.
Create documented APIs
Look at Rijksstudio in Amsterdam – making money (from prints, postcards, etc) AND making images available for download.
Serendip-o-matic “let your sources surprise you” – run your text through this and discover related content in major collections.
No need to create a portal anymore – just make collections open and allow others (aggregators like DPLA) to build the interface.
Part of larger effort or collaborations?
(GLAM, DPLA Hub, MetaArchive, Chronopolis, DPN)
IIIF (Stanford et al) and Mirador – International Image Interoperability Framework (media ecology project?) – participating institutions have an IIIF plugin running on their collections; Mirador is the interface to search and compare across institutions. Artstor has released an IIIF-compliant viewer.
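Part of what makes IIIF interoperable is that every compliant image server answers the same URL pattern, so a viewer like Mirador can request images from any participating institution. A minimal sketch of building such a request, assuming the Image API 3.0 path layout (identifier/region/size/rotation/quality.format); the helper name, server address, and identifier are made up for illustration:

```python
from urllib.parse import quote


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "max", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Assemble an IIIF Image API request URL (hypothetical helper)."""
    # Identifiers containing slashes must be percent-encoded.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier.
url = iiif_image_url("https://example.org/iiif", "map/001")
```

Older servers running version 2 of the Image API may expect a different value for the full-size request, so check the server's documentation before relying on the defaults here.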
Goals for long-term access, preservation & sustainability?
Essential part of the process; partner with other institutions or outsource. LOCKSS, HathiTrust?
How will you $$$?
See grant guidelines for best ways to plan the process even if you don’t apply for the money. They force you to address each consideration. Take the IMLS national grant application for example. Consider local funding options.
Digital Directions Day 2
Sr. Director, Archives, Special Collections and Digital Curation
Preservation is the preservation of access
Creating durable access:
Sustainability – maintained and accessed over time
Authenticity – digital object is reliably true to the original
Interoperability – standards-based object can be used in a standards-based system
Reusability – can be used in ways not related to original purpose
Parts of a Digital Repository System
Repository (the infrastructure for preservation)
Systems that support the application of policies and activities
Five Attributes of Digital Integrity (RLG)
Digital Integrity (Paul Conway): content, fixity, reference, provenance, context
DCC Curation Lifecycle Model: Integrity + Time + Actions = Preservation
Not just getting stuff in and being able to get it back out again, but maintaining usability over time, via: metadata maintenance, format migration, transforming the original resource to a usable digital object for today (example of a QuarkXPress file). Continual attention to preserve access.
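Of the integrity attributes listed above, fixity is the easiest to show in code: record a checksum when a file enters the repository, then recompute it later to confirm nothing has changed. A minimal sketch, assuming SHA-256 as the checksum; the function names are my own, not from any particular repository system:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded: str) -> bool:
    """True if the file still matches the checksum recorded at ingest."""
    return sha256_of(path) == recorded.lower()
```

In practice the recorded checksum would live in the object's preservation metadata, and a check like `fixity_ok` would run on a schedule rather than only when someone asks for the file.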
Preservation is a value proposition based on purpose & mission, and available resources.
The downside of a complete repository system: it is like having to replace an entire house of plumbing if you want a new kitchen faucet. Keep tools modular and connect them together.
Flickr, DPLA, WordPress, Omeka are the shiny faucets that can reuse your stuff and present it in new ways.
Presentation (Discovery access)
Tools that enable simple or sophisticated user experiences within the control of the repository manager.
Neatline sits on Omeka and takes an object to put on a map. Viewshare – visualizes objects. These only work because the foundation is there and the digital objects are durable.
British Library interactive collections online.
People can use your stuff any way they want.
Systems that leverage repository data without management or ownership responsibilities (except for the rights statement).
If you build good objects and have attribution information in metadata attached to objects, then when people build layers and layers on top or remix items they can always track back to the original source.
Digital repositories provide the structure within which preservation decisions can be made and implemented.
Digital Workflow Roundtable
Trying to create a vetting process – digital project proposal questionnaire?
Examples posted after session? Syracuse University, project proposal and checklist for evaluating questionnaire.
No metadata is bad, just misunderstood – use what you can.
200 DPI greyscale for best OCR experience?
For microfilm newspapers, going from original microfilm is fine if the quality is good enough for purposes and less fragile than paper.
Using a wiki to document procedures and policies – allowing students to comment on points and nominate them for staff review and clarification.
Project management strategies:
Task tracker platforms (web-based and students can post their progress)
Zohoprojects, Basecamp, MS Project
Asana for task management (in addition to Trello)
Rebecca Chandler, AVPreserve
Managing Digital Collections for Preservation and Access
Managing digital collections is the same as managing physical objects (sort of)
Require item-level control
Require intervening technology at every stage
Appraisal in a digital world:
Carrier media & file format (can the carrier medium sustain the information over time?)
What are we trying to save? – the experience of the original digital object, or the information on the carrier?
Original source file characteristics
Normalized information objects
The experience of the original
Given the chance, people have also chosen convenience over durability (of the carrier media)
Given information overload, we go back to assessment: what is important? what is culturally valuable? one grocery list vs. 1,000? fitbit info to tell you something you already know?
Cultural Armageddon: The Digital Attic (versioning, naming, in the digital world you get all the drafts (in paper world the curator might only get one or two drafts and the final version))
Appraisal is hard because there’s more stuff AND it’s harder to view and assess it all.
Paul Conway, Handbook for Digital Projects, NEDCC, 2000
Analog items become digital objects:
Source: condition, container, readability
Purpose: Protect original from handling by making digital surrogate, Represent information rather than the thing, Transform use?
Technology: Does the technology exist? Do you own the equipment, can you afford to outsource it? Can you manage and DELIVER the resulting output?
“Born Digital” content doesn’t work this way:
No such thing as physical arrangement, only intellectual arrangement.
How you define “objects” affects management and access more than arrangement.
Advantage: presentation is very flexible, using metadata, to mix, remix, match rearrange objects and display different kinds of relationships, groups, etc.
OAIS IS rocket science!
Conceptual model of an information object that is self-contained and self-describing.
A set of data elements combined into a package that is internally coherent and can be managed in a digital preservation environment (digital repository).
How do you manage?
Largest/smallest information unit that becomes a unit? (lumper vs. splitter) Creating complex objects from small parts (recombining individually scanned pages back into a browsable book, or lumping them back into a PDF).
Quality decisions about the primary content file and its metadata.
How much context is enough? Do we need John Hancock’s pants to understand why he signed the Declaration?
Management Requires Tools
Managing digital objects requires intervening technology at EVERY stage.
Translate functional needs into application services.
Software Reality (goal is the free movement of content – from depository to discovery to access to remixing)
Let the tools do the work
Create it once, use it often. Central repository manages metadata, archival masters. Metadata is interoperable across multiple schemas, crosswalks are key.
Automate activities as much as possible (let the system create the derivatives at the point of need).
Step 1: scan RR pictures, basic metadata automated with duplicate records (these are all railroad pictures, #1, 2, 3, 4). Users look for caboose, users can describe and add metadata to the system to delineate different kinds of trains.
Archive does metadata on an item level but does NOT describe in detail each item. (These are all part of the trains collection.)
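The note above that "crosswalks are key" is easier to picture with an example. A crosswalk is essentially a mapping table from one schema's field names to another's. The sketch below maps a made-up local schema onto a few Dublin Core element names; the local field names and the sample record are hypothetical.

```python
from typing import Dict

# Hypothetical mapping from local field names to Dublin Core element names.
LOCAL_TO_DC: Dict[str, str] = {
    "item_title": "title",
    "photographer": "creator",
    "date_taken": "date",
    "keywords": "subject",
}


def crosswalk(record: Dict[str, str], mapping: Dict[str, str]) -> Dict[str, str]:
    """Rename fields according to the mapping; unmapped fields are dropped."""
    return {mapping[name]: value for name, value in record.items() if name in mapping}


# Example record (made up for illustration).
local_record = {
    "item_title": "Caboose at the depot",
    "photographer": "Unknown",
    "shelf_location": "Box 12",
}
dc_record = crosswalk(local_record, LOCAL_TO_DC)
```

Dropping unmapped fields is a design choice made here for brevity; a real crosswalk would more likely log them so that nothing is lost silently when records move between systems.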
Case Study: California Audiovisual Preservation Project (CAVPP) (on the Internet Archive)
CALIPR as the assessment for items to be digitized – how to choose what to pick?
CAVPP specifications for vendors to inspect items and treatment if necessary.
Statement of Work includes technical metadata that they ask vendors to capture as well as descriptive metadata fields.
Documentation and templates available from http://calpreservation.org
Pop-up Archive from Berkeley.
Like many institutions of our size (or larger), my library has a large collection of commercially produced VHS tapes. Many of these titles are out of print, yet still heavily used for teaching and research. And as technology is ever-changing, what used to be a collections bragging point has now become an albatross, and for years we haven’t had much of a notion of how to address the issue. However, developments in the archival and fair use circles in the past two years have slowly revealed a possible solution.
I’ve written before about the Code of Best Practices in Fair Use for Academic and Research Libraries, and wondered how this new set of guidelines would support the work and processes already in play. At the time, entities such as the Chronicle of Higher Education predicted that the Code would “solve the problem” of VHS; however, it is not clear that this document has had such a direct impact. Institutions have remained slow to publicly adopt and advocate for a change in practice.
Interestingly, New York University was quietly working on their own, more direct solution to the “VHS problem,” supported by a grant from the Mellon Foundation. In 2012 they published Video at Risk: Strategies for Preserving Commercial Video Collections in Research Libraries. The guidelines [PDF 406KB] provide more specific interpretation on the clauses within the copyright law that allow for archival preservation of copyrighted materials. I had reviewed the project information when I heard about it from CCUMC and other mailing lists last year, and was further interested after hearing Howard Besser speak about the Video at Risk project at the National Media Market last fall.
Although the preservation allowance within copyright law isn’t a complete solution for our VHS problem, the Video at Risk guidelines in combination with the Code present a groundwork I hope we can use to build a collections policy around these items. We’ve already begun to replace VHS titles in other formats when possible. The next step will be weeding the VHS collections to remove items that are no longer meeting current needs (including federal government documents on VHS and films about computer technology from the 1980s). We will also develop a framework for justifying digital preservation of remaining out-of-print VHS titles where damage can be proven. Hopefully, within the next year or so the VHS format will become officially obsolete, which will further allow libraries and archives to justify long-term preservation of these materials in an accessible form.
This month we have two articles on the topic of interlibrary-loan practices for electronic information. While the practice has become common for individual articles or book chapters, libraries have been blocked from circulating eBooks for a number of reasons.
As Jennifer Jenkins mentions in her recent C&RL News article, Last sale? Libraries’ rights in the digital age, libraries are permitted to loan purchased physical copies of works under the “first sale” doctrine, an official part of copyright law. However, this provision does not apply to electronic works, and most publishing contracts actually specify the transaction in terms of a license of the material, rather than a purchase. Under these types of licenses libraries have even less control over their electronic collections. Until recently, no academic institution had been ambitious enough to attempt a method of “lending” entire eBooks to consortial partners.
The gridlock over eBook lending may change if a new pilot program from Texas Tech University and the University of Hawaii-Manoa is successful. As reported in the Chronicle of Higher Education (Library Consortium Tests Interlibrary Loan of e-Books), the Greater Western Library Alliance partners have successfully recruited major academic publisher Springer to allow them to test a newly developed system for loaning eBooks via ILL.
The developers came up with a straightforward, frills-free solution. Using the web-based Occam’s Reader software, a lending library takes a stripped-down version of an e-book and loads it onto a secure web server. (Publisher metadata is removed in the process, Mr. Litsey says, to keep the feel of a print-book loan and—more important from a marketing perspective—as a compromise to preserve the potential sales appeal of publishers’ enhanced versions.)
The fact that universities and publishers are willing to work together to tackle the thorny issue of inter-library eLending is encouraging, since it has been a debacle from the moment the first eBook was created. And the approach is intriguing – as noted, the Occam’s Reader software passes only the raw content of the book to the borrower at the other institution, thus removing tools like bookmarking, note-taking, and citation management interoperability. The “owning” library is able to lend the whole book, and the publisher maintains a proprietary hold on the financial incentives that make eBooks more attractive to purchase (or subscribe to) in the first place.
As a media librarian at an institution that is moving to subscription-based acquisition of online streaming media collections, I have to wonder if a similar model could work for eLoaning videos. Like the Occam’s Reader model, there should be an easy way to serve a stripped-down video stream, without the fancy publisher features like interactive transcripts, clipping tools, playlists and annotations. Would Alexander Street Press, Docuseek2 contributors, or heavyweights like Swank be willing to negotiate contract terms so that we could lend individual films to partner institutions? Given the restrictive terms currently present in these licensing agreements, I have my doubts. But someone has to start the conversation in order for it to gain momentum.
Meanwhile, it will be interesting to see how the Occam’s Reader project influences policies and actions for other institutions and publishers. Ideally, libraries will be able to retain some kind of right-to-lend, even if the context justification happens differently than outlined via First Sale. After all, lending is what we are here for, and without the ability to lend between institutions, collections budgets will not be able to keep up with user needs. As much as voluntary cooperation is useful, without the rule of law to back up our actions, most institutions will be hesitant to break new ground into legal grey areas.