Digital Preservation futures: looking ahead to 2030
Much of the discussion in the digital preservation community focuses on the ‘here and now’. But how do we prepare for the future and the step changes we will undoubtedly see in the next ten years?
After this year’s PASIG conference in vibrant Mexico City, Preservica ran a community panel session to debate the challenges of preserving digital content, casting our minds forward ten years to 2030. We asked where the practice of preservation is heading in the next decade, who its users will be, and what exactly will be preserved.
Here is a summary of just some of the views we garnered from the digital preservation community.
Digital preservation in business
It’s becoming apparent that commercial organizations are waking up to the need for digital preservation, especially as digital transformation and legacy application decommissioning initiatives gather pace. As a result, digital preservation is moving out of its traditional domain of the corporate archive and into the preservation of long-term business records.
The panel discussed a number of forward-thinking organizations that are deploying digital preservation for business purposes, including long-term compliance and value extraction. However, it was also noted that many organizations are still over-reliant on shared network drives for long-term digital content. To increase adoption, the digital preservation community needs to consider how it can automate, simplify and scale the digital preservation tools and products being created today so they work for business users. We will also see a great deal of uptake in the media and entertainment sector, which is concerned with preserving natural language records.
The sheer amount of data being created every day is staggering, and nowhere is this more apparent than in our personal lives. Every day around the world, we all create tons of data on our smartphones, tablets and personal laptops as we scroll through social media, send texts and take pictures on holiday. The recent MySpace data loss and Google+ shutdown show how easily data can be lost, and the consequences of not taking appropriate action to protect digital content and keep it accessible over decades.
New types of data
We’re also creating new types of data that were unimaginable a few decades ago: today we’re seeing more complex data as well as increasing technology refresh rates. Just look at autonomous vehicles, which can generate terabytes of telemetry and navigation data for each day they’re on the road. Your town library would need a few more centuries to preserve everything at this rate!
The rise of AI and Machine Learning
AI will also be a key area of development. Across the business, artificial intelligence and machine learning can be put to use in digital preservation to categorize vast amounts of information such as emails, make it more findable, drive actions instantaneously, or manage security settings. Taught by humans, AI will drive new efficiencies, but we must be aware of unconscious biases creeping into the process.
Software and games preservation
During our session at PASIG, an audience member from EA games shared their challenge of preserving the virtual ‘open worlds’ of their games. These worlds are constantly changing and developing, simply based on how different players approach the same game.
Rather than simply preserving one version to be stored in its archive, EA must store every possible iteration. To do this, it is embracing disruptive technologies like machine learning and AI that will interact with the virtual world to preserve a simulation of the gameplay.
What does all this show us?
For me, it was a clear warning that the objects we will be preserving in 2030 are set to become immensely complex. In the United States, NARA will likely no longer be preserving paper records after 2023, and that will just be the beginning. In ten years’ time, the focus of the digital preservation community will have significantly shifted to adjust to this brave new world.
In addition, with all of this new and complex data being created, how can we decide what we need to preserve and what we can discard? For example, in a huge database of hundreds of thousands of emails, many will contain important information worth keeping, whereas others are unnecessarily taking up space and can be deleted. I think decision-makers need to remember that dealing with ever more material presents its own challenges, not least in terms of cost.
Delivering the future — together
Although the technologies that will drive the future of digital preservation are still very much in development, the shift towards automation and seamless integration as part of the information lifecycle is already beginning to surface. Developments in integration with content management platforms and rapid evolution in AI and machine learning are steering digital preservation towards a more embedded and essentially invisible, yet vital, presence.
The only way to maintain momentum is for the digital preservation community to work together and to make the case for innovative preservation strategies with key stakeholders. I was put in mind of our work within the PAR community, and hopefully we will see similar initiatives thrive both nationally and internationally.
I’m looking forward to continuing the debate on digital preservation futures and the challenges we will all face as we move towards 2030.
In particular, I look forward to discussing these questions with our growing user community at our upcoming face-to-face meetings in Oxford, UK and Austin, TX. This is an amazing group, very much at the forefront of digital preservation. We expect over 200 attendees across the two meetings this year, so I am sure there will be plenty of debate and more great views, ideas and insights shared on the future of digital preservation as we hurtle towards 2030.