Minesh Mathew
മിനേഷ് മാത്യു
ML/CV Researcher
Bangalore, India
I am a senior machine learning scientist at Wadhwani AI, where I lead the ML team for Oral Reading Fluency (ORF). ORF is deployed in multiple states in India, where millions of students are assessed for reading fluency. I completed my MS and PhD at IIIT Hyderabad and received my B.Tech in Computer Science and Engineering from NIT Warangal.
My PhD thesis deals with machine understanding of document images. Specifically, I worked on problems such as OCR in Indian languages, scene text understanding, and Document Visual Question Answering (DocVQA). I co-created the DocVQA benchmark and task, which is widely used to evaluate whether models understand document layout and content rather than just recognize text. Through open challenges at CVPR and ICDAR and an ongoing challenge series, this work helped shift document AI research toward integrated, purpose-driven reasoning. OCR and scene text recognition models from my master's and PhD research are deployed on Bhashini, India's national language technology platform.
News & Updates
- Dec 2025: Promoted to senior machine learning scientist at Wadhwani AI.
- Jun 2025: Received CVPR 2025 outstanding reviewer award.
- May 2023: Received CVPR 2023 outstanding reviewer award.
- Apr 2023: ICDAR 2023 Challenge on Road text detection, tracking and recognition comes to an end. Results are public now.
- Apr 2023: George's work on VQA for driving videos accepted at ICDAR 2023.
- Jan 2023: Organizing two challenges at ICDAR 2023: RoadText Video Text Detection, Tracking and Recognition; Text-based Video Question Answering on News Videos.
- Sep 2022: Soumya's work on VQA accepted to WACV 2023.
- May 2022: Work comparing CTC-based architectures for Indian language OCR is now on arXiv.
- Mar 2022: "Read while you Drive - Multilingual Text Tracking on the Road" accepted to DAS 2022. Congrats Sergi and George.
- Oct 2021: Attended ICCV 2021 Doctoral Consortium.
- Oct 2021: InfographicVQA paper accepted at WACV 2022.
- Sep 2021: Presented work on QA over handwritten documents (oral) at ICDAR 2021.
- Sep 2021: Organized first edition of DocVQA workshop at ICDAR.
Academic Services
- Reviewer for conferences — CVPR 2022–2026, NeurIPS 2026, ACCV 2022, ECCV 2022, SIGGRAPH 2022, ICDAR 2021, WACV 2021–2023, ICCV 2021, 2023, 2025
- Reviewer for journals — Pattern Recognition, IEEE TNNLS, TPAMI, Visual Computer, Concurrency and Computation, IJCV, TMLR
- [2023] Organizer, NewsVideo QA and Road-text challenges in ICDAR 2023
- [2021] Organizer, Document Visual Question Answering Workshop, ICDAR 2021
- [2021] Organizer, DocVQA competition, ICDAR 2021
- [2020] Organizer, DocVQA competition, CVPR 2020
- [2020] Organizer and Competition Chair, Text and Documents Workshop, CVPR 2020
- [2019] Organizer, Scene Text VQA Competition, ICDAR 2019
Achievements & Recognitions
- [2025] Outstanding reviewer — CVPR 2025
- [2023] Outstanding reviewer — CVPR 2023
- [2021] Outstanding reviewer (top 5%) — ICCV 2021
- [2021] Selected for WACV 2021 and ICCV 2021 doctoral consortiums
- [2017] Best Paper — Int'l Workshop on Arabic Script Analysis and Recognition (ASAR)
- [2015] Runner up — Microsoft Azure ML Hackathon held at IIIT Hyderabad
- [2015] TCS PhD fellowship