Michel Chammas, a researcher at the Digital Humanities Center at the University of Balamand, together with Abdallah Makhoul (University of Franche-Comté) and Jacques Demerjian (Lebanese University), won first prize at HisFragIR20, a competition on Image Retrieval for Historical Handwritten Fragments, with a state-of-the-art system based on advanced machine learning techniques.
Handwriting recognition methods are pivotal in deciphering historical documents and preserving cultural heritage. The International Conference on Frontiers in Handwriting Recognition (ICFHR), a world-leading scientific conference in the domain of pattern and handwriting recognition, held a competition on Image Retrieval for Historical Handwritten Fragments, HisFragIR20, during a virtual conference from April 5th to June 1st.
The competition tested large-scale retrieval of historical document fragments based on writer recognition. Such work is an arduous analysis typically performed by trained humanists. To simulate fragments, random text patches were extracted from historical document images of diverse origins and genres. The goal was to find similar patches from the same page or manuscript.
Due to fierce competition on both the qualitative and quantitative fronts, only five teams were selected to submit their results.
The evaluation used a leave-one-image-out cross-validation approach: every image of the test set served in turn as a query, against which all other test images were ranked. The overall evaluation used mean average precision (mAP) at two levels:
- At the writer level, i.e. finding fragments by the same writer.
- At the page level, i.e. finding fragments of the same page.
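The evaluation protocol described above can be illustrated with a minimal Python sketch. It is not the organizers' evaluation code: the function names, the distance-matrix input, and the choice to skip queries that have no matching fragment among the other images are all assumptions made for illustration. The same function would serve both levels by passing writer identifiers or page identifiers as the labels.

```python
from typing import Hashable, Sequence


def average_precision(ranked_labels: Sequence[Hashable], query_label: Hashable) -> float:
    """Average precision of one ranked list for one query label."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    distances: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
) -> float:
    """Leave-one-image-out mAP (hypothetical helper).

    distances[q][i] is an assumed dissimilarity between fragments q and i;
    labels[i] is the writer ID (writer level) or page ID (page level).
    """
    n = len(labels)
    ap_values = []
    for q in range(n):
        # Rank every other fragment by increasing distance to the query.
        others = sorted((i for i in range(n) if i != q), key=lambda i: distances[q][i])
        ranked = [labels[i] for i in others]
        # Assumption: a query with no relevant fragment is skipped.
        if labels[q] not in ranked:
            continue
        ap_values.append(average_precision(ranked, labels[q]))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0


if __name__ == "__main__":
    # Toy data: three fragments, two of them by writer "a".
    toy_distances = [
        [0.0, 0.1, 0.2],
        [0.1, 0.0, 0.3],
        [0.2, 0.3, 0.0],
    ]
    writer_ids = ["a", "b", "a"]
    print(mean_average_precision(toy_distances, writer_ids))
```

In practice the distances would come from whatever feature representation a participating system produces; the sketch only shows how a ranking per query is turned into a single mAP score.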
The certificate of achievement will be delivered at the ICFHR 2020 conference, which will be held virtually this year due to the COVID-19 pandemic. The organizing team will also publish an article after the conference, titled “ICFHR 2020 Competition on Image Retrieval for Historical Handwritten Fragments”.
Michel is currently working on a similar project for the historical Arabic manuscripts available at the Digital Humanities Center at UOB. The project addresses an important challenge for researchers, for which the team has developed an adaptive deep learning system that identifies the authorship of unidentified historical Arabic documents.
It is worth noting that the Digital Humanities Center at UOB has a unique database that contains a large number of digitized and transcribed manuscripts. The dataset consists of a large set of historical Arabic documents: more than 50 manuscripts owned by the Center and hundreds imported from different areas of the Middle East, covering a long span of time between the 8th and 18th centuries.