YD Woldemariam
A cloud-hosted MapReduce architecture for syntactic parsing
Woldemariam, YD; Pletschacher, S; Clausner, C; Bass, JM
Mr Stefan Pletschacher S.Pletschacher@salford.ac.uk
Mr Christian Clausner C.Clausner@salford.ac.uk
Senior Research Fellow
Prof Julian Bass J.Bass@salford.ac.uk
Professor of Software Engineering
Syntactic parsing is a time-consuming task in natural language processing, particularly where a large number of text files are being processed. Parsing algorithms are conventionally designed to operate on a single machine in a sequential fashion and, as a consequence, fail to benefit from the high-performance and parallel computing resources available on the cloud. We designed and implemented a scalable cloud-based architecture supporting parallel and distributed syntactic parsing for large datasets. The main architecture consists of a syntactic parser (constituency and dependency parsing) and a MapReduce framework running on clusters of machines. The resulting cloud-based MapReduce parsing is able to build a map in which syntactic trees of the same input file have the same key, and to collect them into a single file containing sentences along with their corresponding trees. Our experimental evaluation shows that the architecture scales well with regard to the number of processing nodes and the number of cores per node. In the fastest tested cloud-based setup, the proposed design performs 7 times faster when compared to a local setup. In summary, this study takes an important step toward providing and evaluating a cloud-hosted solution for efficient syntactic parsing of natural language data sets consisting of a large number of files.
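The keying scheme described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the function names, the one-sentence-per-line input format, the per-input-file output, and the toy_parse stand-in for a real constituency or dependency parser are all assumptions made for illustration, and a real deployment would run the map and reduce steps on a MapReduce framework across cluster nodes.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Pair = Tuple[str, Tuple[str, str]]  # (file name, (sentence, tree))


def toy_parse(sentence: str) -> str:
    """Hypothetical stand-in for a real syntactic parser (assumption)."""
    return "(S " + " ".join(f"(X {tok})" for tok in sentence.split()) + ")"


def map_phase(file_name: str, text: str,
              parse: Callable[[str], str] = toy_parse) -> Iterator[Pair]:
    """Emit one pair per sentence, keyed by the input file name.

    Assumes one sentence per non-empty line (not stated in the source).
    """
    for line in text.splitlines():
        sentence = line.strip()
        if sentence:
            yield file_name, (sentence, parse(sentence))


def shuffle(pairs: Iterable[Pair]) -> Dict[str, List[Tuple[str, str]]]:
    """Group values by key, as a MapReduce framework does between phases."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(values: List[Tuple[str, str]]) -> str:
    """Collect all sentences of one file with their trees into one text."""
    return "\n".join(f"{sentence}\t{tree}" for sentence, tree in values)


def run_job(files: Dict[str, str]) -> Dict[str, str]:
    """Run map, shuffle and reduce sequentially over in-memory files."""
    pairs = (pair for name, text in files.items()
             for pair in map_phase(name, text))
    return {name: reduce_phase(values)
            for name, values in shuffle(pairs).items()}
```

Calling run_job({"a.txt": "the dog barks\nit runs"}) returns a dictionary with one entry per input file, each holding tab-separated sentence and tree lines. Because every pair carries its file name as the key, the map calls are independent of one another, which is the property that lets the work be spread over nodes and cores.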
Woldemariam, Y., Pletschacher, S., Clausner, C., & Bass, J. (2019). A cloud-hosted MapReduce architecture for syntactic parsing. In Kallithea, Greece. https://doi.org/10.1109/SEAA.2019.00024
Conference Name | Euromicro Conference on Software Engineering and Advanced Applications |
Start Date | Aug 28, 2019 |
End Date | Aug 30, 2019 |
Acceptance Date | May 7, 2019 |
Online Publication Date | Nov 21, 2019 |
Publication Date | Nov 21, 2019 |
Deposit Date | Jul 3, 2019 |
Publicly Available Date | Jul 3, 2019 |
Publisher | Institute of Electrical and Electronics Engineers |
Book Title | Kallithea, Greece |
DOI | https://doi.org/10.1109/SEAA.2019.00024 |
PID5964649 Camrea Ready.pdf
(688 Kb)