Whether you’re a publisher, a library, or a corporate or private client, you almost certainly hold hundreds, if not thousands, of hugely valuable historic archives that need digitising, either to help preserve their content or to build an online platform for viewing them.
Digitising an archive to make works available online sounds easy, but the reality of finding, transporting, scanning, refining, and organising large collections of historical publications is anything but. Delivering a digital archive is full of surprises:
“Whilst working with aged, related archives, it is impossible to truly know what to expect when we start preparing the digitisation. We need to be aware of all eventualities with old media.”
Cheryl-Lee Foulsham – Director Oxford Duplication Centre
1. DON’T GO IT ALONE; TRUSTING YOUR DIGITISATION COMPANY WITH YOUR ARCHIVE
One of the most complex projects we are currently working on is a vast archive of academic books typewritten on onion skin paper. We were entrusted with this very valuable academic archive to digitise it into PDF with optical character recognition, ready to be published online.
This medium is extremely challenging, especially because the books require a non-destructive scan of each page and every page is fragile and easily damaged. Onion skin paper takes its name from its translucency, which means that the text on the pages behind shows through. To combat this, we place a clean white sheet of paper behind each page before scanning. On the digitised scan, we can then apply a guide fix to eliminate the back text and leave the front-page text intact. This has been very successful with our professional machinery and software. We are halfway through the project and aim to complete it in July 2022.
We are extremely proud to work with The London Library, digitising their typed bound volumes. This project is very complex, requiring our technicians to painstakingly separate each page from its book. The text was typed right to the very edges of the pages, leaving no margin, and the pages were stitched in place. Once the pages were removed, we could clean the edges and carefully digitise them before applying optical character recognition to make each book searchable.
Since 2018, we have been converting the Kidlington Historical Society's entire collection of document archives, which dates back to the 1700s and includes newspapers, letters, photographs, magazines and books. This project is still ongoing and we hope to complete it in March 2022. The aim is to give the global research community immediate access to over 50,000 pages of historical information about our famous village of Kidlington, covering a period of 300 years.
2. CHALLENGES OF DIGITISING ARCHIVES
One of the biggest (and most surprising) challenges of digitising any archive is identifying and finding original copies of back-catalogue titles, especially when those back catalogues extend back hundreds of years.
As part of the Palestine Police Association Book Archive (PPPA) project, the association had been bequeathed a complete newsletter archive that was thought to have been destroyed. This was a wonderful and very important find for the historical society, who then contacted our company to provide digitisation and optical character recognition, which has helped preserve these valuable newsletters ready for global recognition and research.
OCR (optical character recognition) converts non-editable scanned images into editable text documents. This lets our clients amend their documents and search for text within the pages.
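As a rough illustration of what searchable text makes possible, the short Python sketch below looks up a phrase across the text of several pages. It is not the software we use: the function name find_pages, the sample_pages data and the page wording are all made up for the example, and it assumes the OCR text for each page is already available as a plain string.

```python
import re
from typing import Dict, List


def find_pages(ocr_pages: Dict[int, str], phrase: str) -> List[int]:
    """Return the page numbers whose OCR text contains the phrase."""

    def normalise(text: str) -> str:
        # Collapse runs of whitespace (including line breaks left by OCR)
        # and ignore case, so a phrase split across two lines still matches.
        return re.sub(r"\s+", " ", text).strip().lower()

    wanted = normalise(phrase)
    if not wanted:
        return []
    return sorted(
        number for number, text in ocr_pages.items() if wanted in normalise(text)
    )


# Hypothetical OCR output for three scanned pages (illustrative only).
sample_pages = {
    1: "Minutes of the annual meeting held in\nthe village hall.",
    2: "Accounts for the year.",
    3: "A letter concerning the Annual\nMeeting and the new road.",
}

print(find_pages(sample_pages, "annual meeting"))
```

A simple phrase match like this only works when the OCR has read the words correctly, which is one reason the quality of the original scan matters so much.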
2. COLLABORATION OF PARTNERSHIPS
3. HANDLE WITH CARE: TRANSPORTING AND SCANNING RARE HISTORIC WORKS
We offer a personal service for the transportation and return of rare media, to ensure our clients feel confident in our project management of the archive digitisation process.
4. ALLOW PLENTY OF TIME (AND EVEN MORE PATIENCE)
We estimate an average of 8-12 weeks to collect, transport, scan and return a project. So, a client requesting the digitisation of 50-60 titles can expect to have them returned roughly two to three months after collection, which calls for a big commitment on our part and a great deal of trust on theirs.
There are also many other technical details that can extend the project timeline. The fragility of older media means that there must always be someone monitoring the machinery to protect archives from accidental damage.
Our philosophy is “if we can’t scan a book without damaging it, we won’t scan it.”
Further challenges arise after the media has been scanned. Some scientific titles were produced on old typewriters, so the ink quality can vary substantially from page to page. Technically intricate work is needed to ‘clean up’ the text and to sharpen or refocus letters, especially on the pages of these older titles, but what you get at the end is a publication that is much easier to read than the original. This attention to detail naturally takes time.
5. BE PREPARED TO INVEST IN THE FUTURE
A digital archive project needs substantial financial commitment to secure the safe transportation and scanning of thousands of titles, and the professionalism of a trusted and knowledgeable digitisation team. We dedicate a team of full-time staff to each project, with roles ranging from project management to bibliographic maintenance, book selection, transport and quality assurance. Cutting corners will only cost you more down the line.
6. IS IT WORTH THE EFFORT?
The most important effect of undertaking and investing in the lengthy, painstaking (and, yes, often frustrating) task of building a digital archive is that it makes important publications, many of which have been lost or forgotten, highly accessible via your website. Whether it’s helping a PhD student in Tokyo or a librarian in Oxford, you’ll be playing a pivotal role in narrowing the gap between readers and research-driven progress.
Contributions from
Emma Warren-Jones