Yan Zhang, Shijie Liu
Shanghai Maritime University (China)
https://doi.org/10.53656/ped2023-5s.12
Abstract. This paper reports on the compilation of a multi-genre maritime domain-specific corpus and the research methods used to analyze it. The corpus was compiled by combining automated web crawling with manual selection of relevant texts. Both qualitative and quantitative methods, such as discourse analysis and statistical analysis, were employed to analyze the corpus. The paper describes the background, significance, text scope, principles, and process of compiling the corpus, and explores its applications, including maritime language curriculum development, standardization of maritime language genres, international maritime discourse analysis, term extraction, development of maritime domain-specific machine translation models, and quantitative linguistic research in the maritime domain. The corpus is intended as a resource for researchers and educators in the maritime domain and as a reference for future studies in this field.
Keywords: maritime communication; corpus; development and application; term extraction; machine translation