
Digitising Large Archives - And Is It Worth It?

 HOW TO DIGITISE YOUR ARCHIVE 

Whether you’re a publisher, a library, or a corporate or private client, you almost certainly hold hundreds, if not thousands, of hugely valuable historic documents that need digitising, either to preserve their content or to build an online platform for viewing.

Digitising an archive to make works available online sounds easy, but the reality of finding, transporting, scanning, refining, and organising large collections of historical publications is anything but.  Delivering a digital archive is full of surprises: 

Whilst working with aged, related archives, it is impossible to truly know what to expect when we start preparing the digitisation.  We need to be aware of all eventualities with old media. 

Cheryl-Lee Foulsham – Director, Oxford Duplication Centre



1. DON’T GO IT ALONE; TRUSTING YOUR DIGITISATION COMPANY WITH YOUR ARCHIVE 

One of the most complex projects we are currently working on is a vast archive of academic books typewritten on onion skin paper.  Our company was trusted to digitise this very valuable academic archive into PDF with optical character recognition (OCR), ready for publication online.

This medium is extremely challenging: the books require a non-destructive scan of each page, and every page is fragile and easily damaged.  Onion skin paper takes its name from its translucency, which means that the text on the pages behind shows through the page being scanned.  To combat this, we place a clean white sheet of paper behind each page before scanning.  With the digitised scan, we can then apply a guide fix that eliminates the back text and leaves the front-page text intact.  This has been very successful with our professional machinery and software.  We are halfway through the project, with the aim of completing it in July 2022.
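
The idea behind removing show-through can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular scanning software: it assumes a greyscale scan held as rows of values from 0 (black) to 255 (white), and the function name `suppress_show_through` and the threshold of 200 are hypothetical choices made for the example.

```python
def suppress_show_through(page, threshold=200):
    """Return a copy of a greyscale page with faint pixels set to white.

    page: list of rows, each a list of ints from 0 (black) to 255 (white).
    threshold: hypothetical cut-off; pixels at or above it are treated as
    show-through or paper background and replaced with pure white.
    """
    return [[255 if pixel >= threshold else pixel for pixel in row]
            for row in page]


# Made-up sample values: dark front text (30), faint text showing through
# from the page behind (210), and paper background (240).
scan = [
    [240, 30, 210, 240],
    [240, 30, 240, 210],
]
cleaned = suppress_show_through(scan)
```

In practice the cut-off would need tuning for each book, because faded typewriter ink on the front of a page can be almost as pale as the show-through it sits next to.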


We are extremely proud to work with The London Library, digitising their typed bound volumes.  This project is very complex, requiring our technicians to painstakingly separate each page from its book.  The pages had no margins: the text was typed right to the very edges of the pages, which were stitched in place.  Once the pages were removed, we could clean the edges and carefully digitise them before applying optical character recognition to make each book searchable.

Since 2018, we have been converting The Kidlington Historical Society's entire collection of document archives, dating back to the 1700s and comprising newspapers, letters, photographs, magazines and books.  This project is still ongoing, and we hope to complete it in March 2022.  The aim is to give the global research community immediate access to over 50,000 pages of historical information about our famous village of Kidlington, covering a period of 300 years.



2. CHALLENGES OF DIGITISING ARCHIVES 

One of the biggest (and most surprising) challenges of digitising any archive is identifying and finding original copies of back-catalogue titles – especially when those back catalogues extend to hundreds of years.


As part of the Palestine Police Association Book Archive (PPPA) project, the association was bequeathed a complete newsletter archive that had been thought destroyed. This was a wonderful and very important find for the association, who then contacted our company to provide digitisation and optical character recognition, helping to preserve these valuable newsletters and make them ready for global recognition and research.


OCR converts non-editable scanned images into editable, searchable text documents. This allows our clients to amend documents and to find text within the pages.



2. COLLABORATION OF PARTNERSHIPS 

It is important to note that the sheer scale of digitisation requires the collaboration of trusted partners and highly professional technicians to ensure that valuable archives are digitised successfully.  We work closely with our digitisation partner Preservica to ensure that all digitised cultural media are uploaded to a leading platform built specifically for preserving archive data online. This is a very important partnership, one that guarantees important archives are held on a secure platform ready for online viewing and downloading.



3. HANDLE WITH CARE: TRANSPORTING AND SCANNING RARE HISTORIC WORKS 

Transporting archives is the equivalent of moving expensive works of art, so everything from the temperature and humidity of the storage space to strict handling guidelines is followed to the letter. Having the utmost trust in a partner to operate on your behalf is also crucial, so we apply rigorous criteria to the transportation of your archive.  We find that attention to the smallest details is crucial to building trust and delivering a high-quality digital archive.


Such is the confidence in our transportation and handling that many libraries, corporates, and heritage clients allow us to remove archives offsite or out of the country.


We offer a personal service for the transportation and return of rare media, to ensure our clients feel confident in our project management of the archive digitisation process. 




4. ALLOW PLENTY OF TIME (AND EVEN MORE PATIENCE) 

However long you estimate your digital archive project taking, you should always add a sizeable amount of contingency time. This is a process that just can’t be rushed. 


We estimate an average of 8-12 weeks to collect, transport, scan and return a project. So, a client requesting the digitisation of 50-60 titles can expect to have them returned roughly two to three months after collection, which is a big commitment, and a show of trust, on their part.


And there are lots of other technical details that can expand the project timeline. The fragility of older media means that there must always be someone monitoring the machinery to protect archives from accidental damage.


Our philosophy is “if we can’t scan a book without damaging it, we won’t scan it.” 

Further challenges arise after the media has been scanned. Some scientific titles, scanned non-destructively, were typed on old typewriters, so the ink quality can vary substantially from page to page. Technically intricate work is needed to ‘clean up’ the text and to sharpen or refocus letters, especially on the pages of these older titles, but what you get at the end is a publication much easier to read than the original. This attention to detail naturally takes time.

 

5. BE PREPARED TO INVEST IN THE FUTURE 

A digital archive project needs substantial financial commitment to secure the safe transportation and scanning of thousands of titles, and the professionalism of a trusted and knowledgeable digitisation team.  We dedicate a team of full-time staff to each project, with roles ranging from project management to bibliographic maintenance, book selection, transport and quality assurance. Cutting corners will only cost you more down the line. 


 

6. IS IT WORTH THE EFFORT? 

The most important result of undertaking and investing in the lengthy, painstaking (and, yes, often frustrating) task of building a digital archive is that important publications, many of which have been lost or forgotten, become highly accessible via your website. Whether it’s helping a PhD student in Tokyo or a librarian in Oxford, you’ll be playing a pivotal role in narrowing the gap between readers and research-driven progress.



 

Contributions from  

Emma Warren-Jones  
