MU Libraries

Improving Digital Accessibility on Digitized Historical Commencement Programs

The Digital Initiatives team has been working to digitize historical commencement programs over the past year. In one year, 176 programs were scanned. We have uploaded 135 programs and will upload the remaining 41 programs in the next few months. You can find the digitized collection on MOspace: https://mospace.umsystem.edu/xmlui/handle/10355/86901

We are very proud of this project because we not only created high-quality scans, as we always do, but also made an effort to improve the digital accessibility of the PDF files we created for this project. Each PDF file of the commencement programs has corrected OCR and is screen reader friendly.

What is the digitization process like for this project?

A digitization project usually starts with a planning process that defines the scope of the project, evaluates the condition of the physical items, and decides on the equipment and the technical and metadata standards to be used. Then the project is assigned to staff and students for scanning, editing, quality control, and uploading. This project started in September 2023 with the planning process and was then handed over to a team of one student and two staff members to carry out the digitization workflow. Our student employee Evie worked about 12 hours per week on scanning and editing images.

Flowchart showing digitization workflow
Digitization workflow for commencement programs

Why invest time in improving the digital accessibility of the PDFs?

We always take care, when possible, to provide OCR that is generally readable and searchable, but certain items, such as these commencement programs, call for extra attention because they provide important details about Mizzou history and Mizzou alumni. Alumni, family members, and researchers often find commencement programs to be meaningful. Accuracy of the content is crucial for digitized commencement programs because users want to search for and find specific information such as student names, degree programs, awards, and honors.

How did you improve digital accessibility of PDFs?

We first use software that automatically performs OCR (optical character recognition) and then follow up with a few manual steps to ensure digital accessibility, including:

  • reviewing and correcting text (particularly names)
  • correcting the reading order of elements on each page
  • adding alt-text to images when needed
A screenshot of OCR editor software, showing the process of checking for name errors in automatically generated OCR text
Pic1-Checking for name errors in automatically generated OCR text
A screenshot showing before and after correcting the OCR errors caused by unique fonts.
Pic2-Before and after correcting the OCR errors caused by unique fonts.
A screenshot of a PDF page in the OCR editor, showing the machine-suggested reading order of different elements.
Pic3-Before correcting the reading order of text blocks (pay attention to #5, #10-17)
A screenshot of a PDF page in the OCR editor, showing the manually corrected reading order of different elements.
Pic4-After correcting the reading order of text blocks
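
For readers curious why reading order matters on multi-column pages, here is a minimal Python sketch of the idea. It is only an illustration, not the tool or method we used (our corrections were made by hand in an OCR editor). The `Block` type, the `reading_order` function, the coordinates, and the sample names are all hypothetical. It assumes each text block has a known position and that the page has evenly sized columns.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x: float  # left edge of the block, in page units (hypothetical)
    y: float  # top edge of the block, measured down from the top of the page


def reading_order(blocks, page_width, columns=2):
    """Order blocks column by column, top to bottom within each column.

    Assumes evenly sized columns; real pages often need manual correction.
    """
    column_width = page_width / columns

    def key(block):
        # Work out which column the block starts in, then sort by height.
        column = min(int(block.x // column_width), columns - 1)
        return (column, block.y)

    return sorted(blocks, key=key)


# A made-up two-column page. A naive top-to-bottom pass would read across
# both columns and mix the two degree lists together.
page = [
    Block("Bachelor of Arts", x=50, y=100),
    Block("Bachelor of Science", x=350, y=100),
    Block("Jane Doe", x=50, y=130),
    Block("John Roe", x=350, y=130),
]

ordered = [block.text for block in reading_order(page, page_width=600)]
print(ordered)
```

With this ordering, a screen reader would announce the whole left column before moving to the right column, which is the kind of result the manual correction in Pic4 aims for.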

According to the World Wide Web Consortium (W3C), digital accessibility is the inclusive practice of ensuring that websites, tools, and technologies are designed and developed so that people with disabilities can use them. Furthermore, when digital tools are correctly designed, developed, and updated, generally all users have equal access to information and functionality.

The Digital Initiatives team has been interested in digital accessibility for a couple of years. We attended multiple webinars and training sessions and discussed how to put what we learned into practice. The commencement programs project has been a great learning experience for both staff and students, and we hope this digital collection serves all users equally.

Resources:

Web Content Accessibility Guidelines (WCAG): https://www.w3.org/TR/WCAG22/

Library Accessibility Toolkit: https://docs.google.com/document/d/1Z0Pc6cLz1JjTUAysWkm16TKk-dQXDZ03NAOMGSMpoZQ/edit#heading=h.3oa7rh5pxjpe