CVOCR: Context Vision OCR - RCA Research Repository

Sadeghian, Rasoul, Shahin, Shahrooz and Sareh, Sina, 2024, Journal Article, CVOCR: Context Vision OCR. 2024 20th IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications (MESA). ISSN 2835-902X

Abstract or Description:	Optical Character Recognition (OCR) technologies are crucial for automated information extraction across various domains. However, the intricate layouts and diverse text properties often found on different products can complicate accurate data retrieval and categorization. This paper introduces Context Vision OCR (CVOCR), a versatile framework designed to address these challenges using advanced image processing and text analysis techniques. While CVOCR is applicable to any OCR-related application, this paper focuses on pharmaceutical items as a case study due to the stringent accuracy requirements and the complexity of medicine packaging. The CVOCR algorithm integrates the Fast Super-Resolution Convolutional Neural Network (FSRCNN) for enhanced image clarity, LayoutLMv2 for spatial layout understanding, Tesseract OCR for robust character recognition, and GPT-Neo for advanced contextual analysis. The strategic integration of these components forms a cohesive system that significantly improves text detection and interpretation accuracy. We demonstrate the efficacy of the CVOCR system through testing on various pharmaceutical products, where it consistently outperforms Tesseract OCR.
Subjects:	Other > Engineering > H600 Electronic and Electrical Engineering > H670 Robotics and Cybernetics > H671 Robotics
School or Centre:	Research & Innovation School of Design
Identification Number or DOI:	10.1109/MESA61532.2024.10704827
Date Deposited:	30 Jul 2024 12:25
Last Modified:	11 Aug 2025 04:18
URI:	https://researchonline.rca.ac.uk/id/eprint/5915
