Digital Preservation

"When you look at a cupcake, you've got to smile."  - Anne Byrn

Our goal at SWEETS is for the recipes within Cupcakes Galore! to be at the disposal of students, faculty, and staff, as well as the general public, for many years. The documents will remain usable only if they are preserved properly. Ensuring long-term access to usable formats of digitized material will be an ongoing project that requires us to evaluate our policies yearly. Digital preservation includes acquiring digital files, ensuring they are formatted and stored in a way that allows access using available software, and staying aware of new technology and trends within the field.

The first step in preserving these pieces of wisdom is collecting them. The pages of the cupcake books will be scanned and saved as PDF/A files. "PDF/A is a subset of the PDF format suitable for the long-term preservation and archiving of page-oriented text documents" ("Meeting" 1). Though XML was considered as a format, PDF/A is better at "retaining the current look and feel" of the digitized document (Anderson and Hodge, p. 61). TIFF files, being still images, may seem more authentically like the original; however, users cannot search for words within a TIFF. Many archivists see PDF/A as stable, and we feel confident that it will continue to be used as a preservation format. This support from the community will ensure that software will be able to open PDF/A files or that the option to migrate PDF/A files to a usable format will be available. Once the files for Cupcakes Galore! are gathered, they will be stored on our servers. Data will be backed up on secondary servers weekly. At the same time, the files will be checked to ensure none are corrupted.
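The weekly check for corrupted files could be carried out by comparing checksums: a checksum is recorded for each file when it is first stored, and each week the checksum is recomputed and compared against the recorded value. The Python sketch below is only an illustration of that idea, not a description of our actual tooling. The function names, the choice of SHA-256, and the assumption that the PDF/A files sit under one folder with a ".pdf" extension are all hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each .pdf file under root (by relative path) to its checksum.

    Assumption: all preserved files live under a single folder.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*.pdf"))
    }


def find_damaged(root, manifest):
    """List files that are missing or whose checksum no longer matches."""
    root = Path(root)
    damaged = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(name)
    return damaged
```

In this sketch, build_manifest would be run once when the files are first stored, and find_damaged would be run during each weekly backup. Any file it reports could then be restored from the secondary servers.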

There are some drawbacks to committing to this format. The PDF/A file format does not support video files, which could be embedded within blogs. Currently, videos fall beyond the scope of Cupcakes Galore!, which will focus on the text of the recipes and the culture created by them. This and all other aspects of our digital preservation policy will be evaluated one year and again two years after the digital library is created.

The possibility of videos being inserted within blogs is one reason we have opted not to save the files of the blog posts on our servers initially. During the initial phase of the digital collection, SWEETS will subscribe to Archive-It, a subscription-based service that will allow Cupcakes Galore! staff to collect, manage, and browse the chosen cupcake blogs. The blog files will be hosted on Archive-It's servers, but the bibliographic information will be on our servers. Before committing to Archive-It, we will ensure that all posted comments are being captured. The cupcake phenomenon is nationwide and encompasses a community of people. In light of this, the comments in the cupcake blogs are nearly as important to archive as the recipes and pictures themselves. Subscribing to Archive-It (or a similar service) would enable multiple captures of the same blog post, so comments made after the post was initially added to the database would also be included. Another negotiation point with Archive-It is having past blog posts preserved individually. If this is not possible, our staff members will gather full-text versions of these older posts and save them in PDF/A format.

Archive-It will continue gathering and storing materials for at least one year. However, as the collection grows, we will reassess whether some digital files should remain on Archive-It while others are stored and backed up by our own IT staff and hardware. Eventually, we would prefer to have all materials housed on our servers. Delaying that move will give our staff time to become comfortable with procedures and software systems.

Cupcakes Galore! staff will also subscribe to the Library of Congress Digital Preservation Newsletter and similar publications to learn about new technologies and to stay informed about the practices and trials of other libraries.


Resources

Library of Congress Web Archives. Minerva. The Library of Congress, 2008. 6 Mar. 2008.

National Digital Information Infrastructure and Preservation Program. The Library of Congress, 2009. 1 Dec. 2009.

Archive-It.org. Internet Archive, 2009.


Footnote 1: "Meeting the Challenge: Specifications for Digital Formats." Library of Congress Digital Preservation Newsletter. Library of Congress. May 2008.

Footnote 2: Anderson, Nikkia; Hodge, Gail. "Formats for Digital Preservation: A Review of Alternatives and Issues." Information Services and Use. Vol. 27. 2007. 45-63.