Art Rhyno

Library project digitizing Indonesian newspaper collection

The Leddy Library is a lead participant in a project to make a large Indonesian newspaper collection available online.

Optical character recognition (OCR) of the newspaper pages collected for the Violent Conflict in Indonesia Study is carried out at night on library workstations, using grid processing techniques.

The study, conducted by the World Bank Conflict and Development team, used local newspaper monitoring to track incidents of violence. More than 1,000,000 newspaper pages are undergoing OCR so that the text captured in the page images becomes searchable and reusable.
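
As a rough illustration of overnight grid processing on shared workstations, the Python sketch below shows one way such a job could be organized. It is not the project's actual software: the window hours, the function names, and the round-robin split are all assumptions made for illustration, and the OCR step itself is left out.

```python
from datetime import time

# Hypothetical overnight window; the article does not state the actual hours.
NIGHT_START = time(22, 0)
NIGHT_END = time(6, 0)


def in_night_window(now, start=NIGHT_START, end=NIGHT_END):
    """Return True if `now` falls inside a window that may wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def split_into_batches(pages, n_workstations):
    """Deal page identifiers round-robin so each workstation gets a near-equal share."""
    return [pages[i::n_workstations] for i in range(n_workstations)]
```

A scheduler built along these lines would check `in_night_window` before handing the next batch from `split_into_batches` to a workstation, so that daytime use of the machines is not disturbed.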