Sharing format preservation information and how this will benefit us all
World Digital Preservation Day is all about the global community coming together to share ideas and collaborate. So how can we all work more closely on sharing format preservation information and what is the value of doing this?
The evolution of the Preservation Action Registries (PAR) initiative
Digital Preservation has evolved from the early days of problem definition, specific research and demonstrator prototypes. At the start of the millennium anyone embarking on a Digital Preservation program was likely to start by writing their own software. This was usually done as part of community efforts, often as funded projects, which led to a great deal of information sharing but not much practical preservation activity.
Today the landscape is different. Instead of cutting code you are more likely to buy one of the variety of Digital Preservation products available. Each has its strengths based on functionality, capacity and economic model, and this presents genuine choice to anyone starting up in this area.
There are, however, dangers in this new world, most particularly that Digital Preservation expertise becomes tied up in the proprietary knowledge of individual companies rather than being shared with the community.
Certainly, we at Preservica are uncomfortable with this, and so we have joined forces with Arkivum and Artefactual, as well as JISC and the Open Preservation Foundation, to create the Preservation Action Registries initiative. This will allow us to read each other’s specific file format preservation rules, creating a dialog that will enable us all to move digital preservation best practice forward at a much faster pace.
Calling all digital preservation experts
We are now encouraging more partners to join this early-stage project. We want to be able to access the know-how of digital preservation experts around the world, whether practitioners, academics or subject matter experts, so we can improve the quality of the file format business rules we support and ensure that all our users can benefit from the entire community’s expertise. We hope as many people as possible will take up this invitation.
The benefits of sharing preservation knowledge
Once we are much more confident in the file format business rules, and these have been shared and validated with the community, a lot of opportunities become available. One is to share preservation policy templates for specific purposes, so that users without any format expertise can apply them with confidence. For example, a policy such as “I want to be able to stream as many video formats as possible” can be translated into rules that migrate every video format to streamable MP4, applied to content as it is ingested. An inexperienced user can choose this policy knowing it has community approval and can expect it to just happen.
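To make the streaming example concrete, here is a minimal sketch of how a high-level policy might expand into per-format migration rules. It is only an illustration: the format table, the MigrationRule class and the expand_policy function are hypothetical names invented for this post and are not part of PAR or of any product.

```python
from dataclasses import dataclass

# Hypothetical format table: the names and categories are illustrative only
# and are not taken from PAR or any real registry.
FORMAT_CATEGORIES = {
    "AVI": "video",
    "QuickTime": "video",
    "MPEG-2": "video",
    "MP4": "video",
    "PDF": "document",
}


@dataclass(frozen=True)
class MigrationRule:
    source_format: str
    target_format: str


def expand_policy(category: str, target_format: str,
                  registry: dict[str, str]) -> list[MigrationRule]:
    """Expand one high-level policy into a rule per matching source format."""
    return [
        MigrationRule(fmt, target_format)
        for fmt, cat in sorted(registry.items())
        # Content already in the target format needs no migration.
        if cat == category and fmt != target_format
    ]


# "I want to be able to stream as many video formats as possible"
streaming_rules = expand_policy("video", "MP4", FORMAT_CATEGORIES)
for rule in streaming_rules:
    print(f"{rule.source_format} -> {rule.target_format}")
```

The point of the sketch is that the user only chooses the policy. The list of formats it covers comes from the shared registry, so it can grow as the community adds knowledge.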
More advanced use cases occur when the knowledge of the community changes. Changes to what we know will affect the policies we have created, and our users will expect that these changes can be applied not just going forward to new content, but also retroactively to existing content. For example, if my policy is to use DROID for format identification and the DROID signatures change (which we know they do, frequently), then the system should be able to automatically re-identify those files affected. If the identified format of a particular piece of content changes, it should then be able to trigger extra processes such as property extraction and migration based on the new information.
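A rough sketch of that re-identification step might look like the following. It assumes each stored file records which signature version was used to identify it. The StoredFile class, the reidentify function and the identify callable are all hypothetical, and identify is a stand-in for a real tool such as DROID rather than a call into it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class StoredFile:
    name: str
    identified_format: str
    signature_version: int  # signature release used for the last identification


def reidentify(files: Iterable[StoredFile],
               identify: Callable[[str], str],
               current_version: int) -> list[StoredFile]:
    """Re-run identification on files identified with older signatures.

    Returns the files whose identified format changed, so that follow-on
    processes (property extraction, migration) can be triggered for them.
    """
    changed = []
    for f in files:
        if f.signature_version >= current_version:
            continue  # already identified with the current signatures
        new_format = identify(f.name)
        if new_format != f.identified_format:
            f.identified_format = new_format
            changed.append(f)
        f.signature_version = current_version
    return changed


def fake_identify(name: str) -> str:
    # Stand-in identifier for the example: decides by file extension only.
    return "format-B" if name.endswith(".dat") else "format-A"


files = [
    StoredFile("a.dat", "unknown", 1),
    StoredFile("b.txt", "format-A", 1),
    StoredFile("c.dat", "format-B", 2),
]
needs_follow_up = reidentify(files, fake_identify, current_version=2)
print([f.name for f in needs_follow_up])
```

Only the files whose format actually changed are returned, which keeps the expensive follow-on work limited to the content that needs it.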
The complexity comes in the detail of course. For example, if the rule for documents changes from “always migrate to Open Office” to “always migrate to the new Microsoft format” do I replace the previous Open Office representations or just add new Microsoft ones? And how do I cope with needing to migrate one million video files because the rules for streaming have just changed? And if I decide I no longer want to be able to stream videos do I remove the representations I have already created?
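One way to frame these questions is to make the choice explicit in the policy itself and to process large collections in batches. The sketch below is purely illustrative and does not answer the questions above: OnRuleChange, apply_rule_change and batches are hypothetical names, and the labels "ODT" and "DOCX" simply stand in for the Open Office and Microsoft formats mentioned.

```python
from enum import Enum
from itertools import islice
from typing import Iterable, Iterator


class OnRuleChange(Enum):
    ADD = "add"          # keep earlier representations, add the new one
    REPLACE = "replace"  # drop representations made under the old rule


def apply_rule_change(representations: list[str], old_target: str,
                      new_target: str, mode: OnRuleChange) -> list[str]:
    """Return the representation formats one asset should hold after a rule change."""
    updated = list(representations)
    if mode is OnRuleChange.REPLACE:
        updated = [r for r in updated if r != old_target]
    if new_target not in updated:
        updated.append(new_target)
    return updated


def batches(items: Iterable, size: int) -> Iterator[list]:
    """Yield fixed-size chunks so a very large backlog can be worked through gradually."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


print(apply_rule_change(["original", "ODT"], "ODT", "DOCX", OnRuleChange.ADD))
print(apply_rule_change(["original", "ODT"], "ODT", "DOCX", OnRuleChange.REPLACE))
print(list(batches(range(5), 2)))
```

Whether ADD or REPLACE is the right default is exactly the kind of decision that shared, community-validated policy could inform.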
A vision of the future
The opportunity presented by this automatic and dynamic application of preservation policy should be a sufficient driver to overcome the challenges. Being able to select your policy from a template, modify it as you require, then have your preservation system apply it automatically will be a huge step forward from manually selecting an action for each format, or opening and saving your own files outside the system.
At Preservica we are working with Arkivum and Artefactual to make sure the Preservation Action Registries initiative gets to critical mass. We are also starting to prototype the application of preservation policies created and shared using PAR, to investigate solutions to some of the challenges I mentioned, and we hope to be demonstrating these soon.
I am looking forward to World Digital Preservation Day 2030 when we don’t even know preservation is happening. The sharing of file format preservation policy and its automatic application is a big step in the right direction.
If you’d like to get involved in sharing digital preservation information or contributing to the PAR initiative, then drop me a line at firstname.lastname@example.org