Details
长篇语料的平行锚点匹配策略——基于《红高粱家族》中英文本自动对齐实验
Parallel Anchor Matching Strategy for Long Texts: An Experiment of Automatic Alignment of Chinese-English Bilingual Text in Red Sorghum
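Document type: Journal article
Chinese title: 长篇语料的平行锚点匹配策略——基于《红高粱家族》中英文本自动对齐实验
English title: Parallel Anchor Matching Strategy for Long Texts: An Experiment of Automatic Alignment of Chinese-English Bilingual Text in Red Sorghum
Authors: Li Xin (李新) [1]; Sun Run (孙润) [1]
Institution: [1] College of Foreign Languages (外国语学院), Guangdong Ocean University, Zhanjiang, Guangdong 524000
Year: 2024
Volume: 22
Issue: 3
Pages: 105
Chinese journal name: 广东水利电力职业技术学院学报
English journal name: Journal of Guangdong Polytechnic of Water Resources and Electric Engineering
Funding: Guangdong Ocean University university-level general project (C22862).
Language: Chinese
Chinese keywords: 语料对齐 (corpus alignment); 相似度 (similarity); 语料库 (corpus); 《红高粱家族》 (Red Sorghum)
English keywords: corpus alignment; similarity; corpus; Red Sorghum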
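Chinese abstract (translated): Parallel corpus research has become an active area of linguistics in recent years and is increasingly applied in translation studies. Building a parallel corpus requires combining human effort with computer technology, and it provides a basis for studying texts by objectively describing the regularities of different languages. In practice, however, automatic alignment by computer is error-prone; in long bilingual texts in particular, alignment errors commonly propagate. To address this, the paper proposes a parallel anchor matching strategy, based on an automatic alignment experiment on the Chinese and English texts of Red Sorghum (《红高粱家族》). Its core idea is to divide a long text into regions using strong similarity across multiple sentences. The strategy effectively curbs the propagation of alignment errors, reaches an accuracy of 99%, and can serve as a reference for the efficient construction of long English-Chinese parallel corpora and for translation research.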
English abstract: Parallel corpus research has been an active area of linguistics in recent years and is increasingly used in translation studies. A parallel corpus contains text in one language together with its translation in another. Building one requires combining human effort with computer technology; by objectively describing the regularities of different languages, it reduces reliance on subjective judgment and provides a basis for studying texts. Because such corpora are large, purely manual editing is time-consuming, labor-intensive, and error-prone, which makes automatic text alignment by computer all the more important. In practice, however, automatic alignment is itself still prone to errors. To address the common problem of alignment error propagation in the automatic alignment of long bilingual texts, this paper proposes a parallel anchor matching method that effectively curbs it. The core idea is to partition a long text into segments using strong similarity across multiple sentences. The accuracy reaches 99%. Red Sorghum is a representative work of Mo Yan, winner of the Nobel Prize in Literature. The method is of practical value for efficiently building an English-Chinese parallel corpus of Red Sorghum and for translation research based on it.
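The abstract states the core idea (partitioning a long text at points where several consecutive sentences match strongly) but not the algorithm itself. The following Python sketch is one possible reading, not the authors' implementation. It assumes a precomputed similarity matrix `sim[i][j]` between source sentence `i` and target sentence `j`; the function names, the `window` size, and the `threshold` value are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # half-open index range [start, end)


def find_anchors(
    sim: Sequence[Sequence[float]],
    window: int = 3,        # assumed: number of consecutive sentence pairs per anchor
    threshold: float = 0.8, # assumed: minimum similarity that counts as "strong"
) -> List[Tuple[int, int]]:
    """Greedily collect monotonic anchors (i, j) such that the pairs
    (i, j), (i+1, j+1), ..., (i+window-1, j+window-1) all score >= threshold."""
    n_src = len(sim)
    n_tgt = len(sim[0]) if n_src else 0
    anchors: List[Tuple[int, int]] = []
    i, next_j = 0, 0
    while i + window <= n_src:
        hit = None
        # Only look at target positions after the previous anchor,
        # so anchors never cross each other.
        for j in range(next_j, n_tgt - window + 1):
            if all(sim[i + k][j + k] >= threshold for k in range(window)):
                hit = j
                break
        if hit is None:
            i += 1
        else:
            anchors.append((i, hit))
            i += window
            next_j = hit + window
    return anchors


def split_regions(
    anchors: Sequence[Tuple[int, int]], n_src: int, n_tgt: int
) -> List[Tuple[Span, Span]]:
    """Cut both texts at every anchor, giving paired (source, target) regions."""
    regions: List[Tuple[Span, Span]] = []
    prev_i, prev_j = 0, 0
    for i, j in anchors:
        if i > prev_i or j > prev_j:  # skip an empty leading region
            regions.append(((prev_i, i), (prev_j, j)))
        prev_i, prev_j = i, j
    regions.append(((prev_i, n_src), (prev_j, n_tgt)))
    return regions
```

Each region can then be passed to an ordinary sentence aligner on its own, so a misalignment inside one region cannot spill past the next anchor. The greedy search takes the first qualifying target position, which means a spurious strong match far ahead could skip real material; a larger `window` or a bounded search range would reduce that risk.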