Born Digital Collections:
An Inter-Institutional Model for Stewardship (2009-2011)
What were AIMS’s objectives? Simply put, they sought to:
- Create a framework for stewarding born digital materials.
- Process fourteen collections that were either born digital or hybrid collections of digital and analogue content.
- Foster a community of digital archivists.
The partnership spanned institutions in both the United States and the United Kingdom: the University of Virginia, Stanford University, Yale University, and the University of Hull (UK). This broad partnership allowed a diverse group of practitioners to come together and determine whether a shared methodology was possible. One of the initial challenges we faced was our organizations’ highly disparate approaches to managing archival materials in general and their individual ability to manage born digital materials in particular. Each partner’s infrastructure posed unique problems for a shared strategy for born digital materials. The first takeaway from our project was dramatic: if the four partners could not agree on a single workflow, how could we expect to create a best practices document that would be useful to the international archival community? From that point, we decided to take a broader view of what we were trying to accomplish. Instead of trying to create a single, monolithic, and complex workflow, perhaps we could collectively discover where we were making key decisions and begin documenting those moments. This, as it turns out, was a much more successful approach, and it allowed us to craft a shared framework that could incorporate the idiosyncrasies of local practice as well as highly disparate infrastructures. We broke this work out into four main components:
- Collection Development
- Accessioning
- Arrangement and Description
- Discovery and Access
Each of these sections goes into greater detail, documenting the decision points and issues that the AIMS group encountered. As this article is simply an overview, the entire white paper can be viewed here: http://www2.lib.virginia.edu/aims/white-paper/.
To give a brief breakdown of each section, collection development deals with the actions and policies that make up an organization’s strategy for acquiring materials for its collections. These are the steps needed to accept stewardship for and legal ownership of materials from a donor, seller, etc. Collection development policies help guide an organization to acquire either certain kinds of objects or materials centered on specific subjects. A large part of this process is the early inclusion of a donor survey. There is a detailed sample in the AIMS white paper, and it is important to understand that this process helps clarify what the materials might be and how they are organized. It asks critical questions about: the creator’s work habits; how the potential donor organizes his or her files; what types of digital materials have been created (particularly MIME types); whether the donor possesses any mobile devices or multiple email accounts; and the donor’s general practice when it comes to his or her digital footprint. One important aspect of the donor survey is to establish the ground rules for what content is to be transferred and/or included in a donor’s “collection”. This is perhaps the most important step in the process, as it sets the parameters for everything that follows. It helps the archive and the archivist navigate specific ethical issues such as the stewardship of the physical media. What does that mean exactly? Take this example: a donor is very clear that she only wants specific materials to be part of her archive, say, digital photos, word processing documents, and media files. However, because she donated her computer to an archive that has the technological means to scan her hard drive for other things, such as web browser history, online site passwords, and activities, these too could be programmatically added to the donor’s archive.
With proper documentation, this clear violation of a donor’s intention is avoidable. There are many careful steps that need to be part of the discussions with any potential donor. These interactions can also guide the donor to work more closely with an archive to ensure the proper capture of the appropriate digital materials.
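To make the example concrete, here is a minimal sketch of how a documented donor agreement might be applied when files are pulled from a donated drive. It is an illustration only and is not taken from the AIMS white paper: the function name `split_by_agreement`, the `AGREED_EXTENSIONS` set, and the use of file extensions as a stand-in for "kinds of material" are all assumptions, and the file paths are invented.

```python
from pathlib import PurePosixPath

# Hypothetical: extensions standing in for the material the donor agreed to
# transfer (photos, word processing documents, media files).
AGREED_EXTENSIONS = frozenset({".jpg", ".doc", ".mp3"})


def split_by_agreement(paths, agreed=AGREED_EXTENSIONS):
    """Separate file paths into those covered by the agreement and the rest."""
    in_scope, out_of_scope = [], []
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()
        if suffix in agreed:
            in_scope.append(path)
        else:
            out_of_scope.append(path)
    return in_scope, out_of_scope


# Invented example paths from a donated hard drive.
found = [
    "Pictures/trip.JPG",
    "Documents/novel.doc",
    "Browser/History.sqlite",
    "Music/demo.mp3",
]
keep, hold_back = split_by_agreement(found)
print("Covered by agreement:", keep)
print("Not covered, do not ingest without consent:", hold_back)
```

Anything in the second list would be held back and raised with the donor rather than silently added to the collection. A real implementation would more likely rely on format identification than on file extensions.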
The second component in the AIMS framework is accessioning. Accessioning has always been a central function for archives. Accessioning actions relate to the organization taking physical and legal custody of the materials. The receiving repository also documents this transfer in the manner appropriate for the institution. In other words, these are the processes that establish physical, administrative, and intellectual control over transferred materials. Accessioning can also take into account any assessment and documentation of future needs, and it plays a significant role in the future disposition of the materials. For many institutions, this is also the point where any restrictions that a donor may place on a collection are recorded. This is particularly necessary for planning the future access strategy for the content, and in some cases restrictions can be more stringent than copyright law. This part of the AIMS framework might also consist of pulling the files off physical media and transferring them to a preservation environment pending further processing. This is a critical step if the archive itself has no real means to process the digital materials: at a minimum, the materials themselves have been stabilized.
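One common way to document that files pulled off physical media arrived intact is to record a checksum for each file at transfer time. The sketch below assumes this practice; the article itself does not prescribe checksums, and the function names `sha256_of`, `build_manifest`, and `changed_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(transfer_root: Path) -> dict[str, str]:
    """Map each file's path, relative to the transfer root, to its checksum."""
    return {
        path.relative_to(transfer_root).as_posix(): sha256_of(path)
        for path in sorted(transfer_root.rglob("*"))
        if path.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List paths that went missing or whose checksum differs after copying."""
    return sorted(p for p, digest in before.items() if after.get(p) != digest)
```

A manifest built from the original media can be compared with one built from the copy in the preservation environment; an empty result from `changed_files` suggests the transfer was complete and unaltered.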
The third major component of the AIMS framework is arrangement and description, which can be seen as the processes that establish intellectual control of the materials. It also prepares the content for the appropriate level of access. Arrangement and description seeks to preserve the original context as part of that means of discovery. At this stage, any policies and agreements with donors are implemented so that an end user is positioned to access the content. It is here in the workflow that the deepest understanding of existing infrastructure is required. The activities related to arrangement and description are guided by a processing plan and have to take into account a given institution’s technical abilities. In other words, what is the organization’s strategy for managing and delivering this content? Creating the metadata is, for the most part, fairly straightforward, assuming an adequate ability to read the various file formats. Given the huge range of possible formats, an archivist needs to know whether a file format must undergo transformation in order to be accessed. This transformation then needs to be verified in some manner. Linking content to appropriate rights policies is also part of this stage, as is a clear understanding of how users will interact with the content. This will be discussed further in the discovery and access section. With respect to description, does your institution have a strategy for making hundreds of thousands of emails searchable? Would this content be added to the general search of one’s catalogue or kept separate? How would you manage a collection whose searchable text is spread across millions of files? Do you have the ability to check every file for sensitive information? Arrangement and description of born digital materials pose the greatest challenges to an archivist.
At the time of writing this article, the archives world still lacks a comprehensive software environment to do this work adequately.
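As a rough illustration of the format question raised above, the following sketch inventories a list of file names by guessed MIME type and flags anything the repository could not deliver as-is. It is a simplification under stated assumptions: Python's standard `mimetypes` module guesses from the file extension only and does not inspect file contents, and the `DELIVERABLE_TYPES` set and both function names are hypothetical.

```python
import mimetypes
from collections import Counter

# Hypothetical policy: formats an imaginary repository can deliver unchanged.
DELIVERABLE_TYPES = {"text/plain", "application/pdf", "image/jpeg"}


def inventory(filenames):
    """Count files per guessed MIME type; unknown ones count as 'unidentified'."""
    counts = Counter()
    for name in filenames:
        guessed, _encoding = mimetypes.guess_type(name)
        counts[guessed or "unidentified"] += 1
    return counts


def needs_review(filenames):
    """Return files whose guessed type is not in the deliverable set."""
    return [
        name
        for name in filenames
        if mimetypes.guess_type(name)[0] not in DELIVERABLE_TYPES
    ]
```

Files returned by `needs_review` are candidates for closer identification and possibly transformation. In practice an archive would want a tool that examines file signatures rather than extensions.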
The final component of the AIMS framework is discovery and access, though this is by no means last in importance. In fact, all other stages that lead up to access should keep in mind an institution’s ability to make content available. This stage refers to the systems (hardware and software) and workflows that make collection materials and their metadata available to users. A solid understanding of what this entails will inform most of the processing of collection materials. In other words, what is your institution’s access strategy for born digital content? Does it:
- Create emulation environments that show the digital materials within their original hardware and software environments?
- Transform digital objects for sustainable access? In other words, does it provide the materials in an updated format (e.g. migrate from a WordStar file to an RTF file)? If so, does it still allow for access to the untransformed original?
Does the institution expect to provide all the commensurate functionality for born digital materials itself, or does it expect to take advantage of web services developed outside of its infrastructure? These are all questions related to the relative functionality of digital content. There are an equal number of questions about the rights and intellectual property issues attached to the materials, and these must be part of any access and discovery framework. Can you properly restrict content to the appropriate, authenticated user? Does the content need to reside in a different system or architecture in order to do so? Can the materials only be viewed on site, or can they all be accessed remotely? Do you share basic metadata with users even though the content is restricted? These are all simple questions that have highly complex answers and undoubtedly a profound impact on the infrastructure required to manage the content.
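The access questions above can be read as a small decision rule. The sketch below is one hypothetical way to express it; the class names, field names, and the three outcomes ("content", "metadata", "deny") are assumptions made for illustration and do not describe any partner's actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    restricted: bool = False       # a donor or legal restriction applies
    onsite_only: bool = False      # viewable only on site
    metadata_public: bool = True   # show basic description even if restricted


@dataclass(frozen=True)
class Request:
    authenticated: bool  # the user has been identified and authorized
    onsite: bool         # the request comes from inside the institution


def decide(policy: AccessPolicy, request: Request) -> str:
    """Return 'content', 'metadata', or 'deny' for a given request."""
    allowed = True
    if policy.restricted and not request.authenticated:
        allowed = False
    if policy.onsite_only and not request.onsite:
        allowed = False
    if allowed:
        return "content"
    return "metadata" if policy.metadata_public else "deny"
```

Even this toy version shows why the answers shape infrastructure: each field implies a system that can authenticate users, detect where a request originates, and serve metadata separately from content.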
This brief overview of a series of highly complex issues is explored more deeply, with examples and case studies, in the AIMS white paper. It does underscore the necessity of having a strategy in place before any kind of born digital stewardship can occur. Otherwise there is the risk of data loss, donor displeasure, and other forms of mismanagement. It may not be possible to have the entire framework in place, but executing a few simple steps in advance can avoid many future difficulties.
If you have any questions about the AIMS Project, please contact Bradley Daigle.