Details
Original language | English |
---|---|
Title of host publication | MMPT 2021 |
Subtitle of host publication | Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding |
Pages | 46-54 |
Number of pages | 9 |
ISBN (electronic) | 9781450385305 |
Publication status | Published - 27 Aug 2021 |
Event | 1st International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding, MMPT 2021 - Taipei, Taiwan; Duration: 21 Aug 2021 → … |
Publication series
Name | MMPT 2021 - Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding |
---|---|
Abstract
The recognition of handwritten mathematical expressions in images and video frames is a difficult and still unsolved problem. Deep convolutional neural networks are in principle a promising approach, but typically require a large amount of labeled training data. However, such a large training dataset does not exist for the task of handwritten formula recognition. In this paper, we introduce a system that creates a large set of synthesized training examples of mathematical expressions derived from LaTeX documents. For this purpose, we propose a novel attention-based generative adversarial network to translate rendered equations to handwritten formulas. The datasets generated by this approach contain hundreds of thousands of formulas, making them ideal for pretraining or the design of more complex models. We evaluate our synthesized dataset and the recognition approach on the CROHME 2014 benchmark dataset. Experimental results demonstrate the feasibility of the approach.
Keywords
- datasets, formula recognition, generative adversarial network
ASJC Scopus subject areas
- Computer Science(all)
- Computer Networks and Communications
- Hardware and Architecture
Cite this
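Unsupervised Training Data Generation of Handwritten Formulas using Generative Adversarial Networks with Self-Attention. / Springstein, Matthias; Müller-Budack, Eric; Ewerth, Ralph. MMPT 2021: Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding. 2021. p. 46-54 (MMPT 2021 - Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding).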
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Unsupervised Training Data Generation of Handwritten Formulas using Generative Adversarial Networks with Self-Attention
AU - Springstein, Matthias
AU - Müller-Budack, Eric
AU - Ewerth, Ralph
PY - 2021/8/27
Y1 - 2021/8/27
N2 - The recognition of handwritten mathematical expressions in images and video frames is a difficult and still unsolved problem. Deep convolutional neural networks are in principle a promising approach, but typically require a large amount of labeled training data. However, such a large training dataset does not exist for the task of handwritten formula recognition. In this paper, we introduce a system that creates a large set of synthesized training examples of mathematical expressions derived from LaTeX documents. For this purpose, we propose a novel attention-based generative adversarial network to translate rendered equations to handwritten formulas. The datasets generated by this approach contain hundreds of thousands of formulas, making them ideal for pretraining or the design of more complex models. We evaluate our synthesized dataset and the recognition approach on the CROHME 2014 benchmark dataset. Experimental results demonstrate the feasibility of the approach.
AB - The recognition of handwritten mathematical expressions in images and video frames is a difficult and still unsolved problem. Deep convolutional neural networks are in principle a promising approach, but typically require a large amount of labeled training data. However, such a large training dataset does not exist for the task of handwritten formula recognition. In this paper, we introduce a system that creates a large set of synthesized training examples of mathematical expressions derived from LaTeX documents. For this purpose, we propose a novel attention-based generative adversarial network to translate rendered equations to handwritten formulas. The datasets generated by this approach contain hundreds of thousands of formulas, making them ideal for pretraining or the design of more complex models. We evaluate our synthesized dataset and the recognition approach on the CROHME 2014 benchmark dataset. Experimental results demonstrate the feasibility of the approach.
KW - datasets
KW - formula recognition
KW - generative adversarial network
UR - http://www.scopus.com/inward/record.url?scp=85114808291&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2106.09432
DO - 10.48550/arXiv.2106.09432
M3 - Conference contribution
AN - SCOPUS:85114808291
T3 - MMPT 2021 - Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding
SP - 46
EP - 54
BT - MMPT 2021
T2 - 1st International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding, MMPT 2021
Y2 - 21 August 2021
ER -