Issue Based OCR Error Prediction in Video Streams

Siegmund, Dirk ; Sacco, Luís Rüger ; Kuijper, Arjan (2020)
Issue Based OCR Error Prediction in Video Streams.
Virtual conference (23-25 September)
doi: 10.23919/SPA50552.2020.9241245
Conference publication, Bibliography

Abstract

This paper increases the reliability of Optical Character Recognition (OCR) systems in natural scenes by proposing a novel Image Quality Assessment (IQA) system. The approach rests on the principle that OCR accuracy is a function of the quality of the input image. Detected text boxes are analyzed with regard to their OCR score and different quality issues, such as blur, lighting, and reflection effects. The novelty of our approach is to model IQA as a classification task, where one class represents high-quality elements and each of the other classes represents a specific quality issue. We demonstrate how this methodology allows the training of IQA systems for complex quality metrics, even when no data labeled with the desired metric is available. Furthermore, a single IQA system outputs the quality score as well as the quality issues for a given image. We build on publicly available databases to generate 60k text boxes for each class and obtain 97.1% classification accuracy on a test set of 24k images. By evaluating on the ICDAR 2003 Robust Word Recognition dataset, we conclude that the learnt quality metric is a valid indicator of common OCR errors.
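
As an illustration of the idea that a single classifier can yield both a quality score and a list of quality issues, here is a minimal sketch. It is not the authors' implementation. The class order, the use of the high-quality class probability as the score, the `issue_threshold` parameter, and the function names are all assumptions made for this example; the abstract does not specify how the score is derived.

```python
import math

# Hypothetical class layout: index 0 is the high-quality class, the remaining
# entries are the quality issues named in the abstract. The order is assumed.
CLASSES = ["high_quality", "blur", "light", "reflection"]


def softmax(logits):
    """Turn raw classifier outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assess(logits, issue_threshold=0.25):
    """Return (quality_score, issues) for one text box.

    Assumption: the quality score is the probability of the high-quality
    class, and an issue is reported when its class probability reaches
    issue_threshold.
    """
    probs = softmax(logits)
    quality_score = probs[0]
    issues = [
        name
        for name, p in zip(CLASSES[1:], probs[1:])
        if p >= issue_threshold
    ]
    return quality_score, issues


# Example: made-up logits for a text box dominated by the blur class.
score, issues = assess([0.2, 2.5, 0.1, -1.0])
print(f"quality score: {score:.3f}, issues: {issues}")
```

In a setting like the one described, a low score could be used to flag a text box whose OCR output is likely unreliable, with the reported issues indicating why.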

Item type: Conference publication
Published: 2020
Author(s): Siegmund, Dirk ; Sacco, Luís Rüger ; Kuijper, Arjan
Type of entry: Bibliography
Title: Issue Based OCR Error Prediction in Video Streams
Language: English
Year of publication: 2020
Publisher: IEEE
Book title: Proceedings of the Signal Processing Conference: Algorithms, Architectures, Arrangements, and Applications (SPA 2020)
Event location: Virtual conference
Event dates: 23-25 September
DOI: 10.23919/SPA50552.2020.9241245
Uncontrolled keywords: Video analysis, Image quality, Machine learning
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Interactive Graphics Systems
20 Department of Computer Science > Mathematical and Applied Visual Computing
Date deposited: 02 Dec 2020 12:28
Last modified: 02 Dec 2020 12:28