The impact of differences in text segmentation on the automated quantitative evaluation of song-lyrics

Friederike Tegge; Katharina Parry

Abstract

The text-evaluation application Coh-Metrix and natural language processing rely on the sentence for text segmentation and analysis and frequently detect sentence limits by means of punctuation. Problems arise when target texts such as pop song lyrics do not follow formal standards of written text composition and lack punctuation in the original. In such cases it is common for human transcribers to prepare texts for analysis, often following unspecified or at least unreported rules of text normalization and relying potentially on an assumed shared understanding of the sentence as a text-structural unit. This study investigated whether the use of different transcribers to insert typographical symbols into song lyrics during the pre-processing of textual data can result in significant differences in sentence delineation. Results indicate that different transcribers (following commonly agreed-upon rules of punctuation based on their extensive experience with language and writing as language professionals) can produce differences in sentence segmentation. This has implications for the analysis results for at least some Coh-Metrix measures and highlights the problem of transcription, with potential consequences for quantification at and above sentence level. It is argued that when analyzing non-traditional written texts or transcripts of spoken language it is not possible to assume uniform text interpretation and segmentation during pre-processing. It is advisable to provide clear rules for text normalization at the pre-processing stage, and to make these explicit in documentation and publication.
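
The effect described in the abstract can be illustrated with a minimal sketch. It assumes a naive splitter that treats ".", "!" and "?" as sentence boundaries; this is not Coh-Metrix's actual segmentation, and the lyric lines and the function names `split_sentences` and `mean_sentence_length` are hypothetical, invented for illustration. The same words, punctuated by two imagined transcribers, give different sentence counts and therefore different mean sentence lengths.

```python
import re


def split_sentences(text):
    """Naive punctuation-based splitter (an assumption, not Coh-Metrix's method).

    A sentence is taken to end at '.', '!' or '?' followed by whitespace.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def mean_sentence_length(text):
    """Mean number of whitespace-separated words per detected sentence."""
    sentences = split_sentences(text)
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)


# Hypothetical lyric lines, identical in wording but punctuated differently.
transcriber_a = "I walk alone tonight. The city sleeps. I keep on walking."
transcriber_b = "I walk alone tonight, the city sleeps, I keep on walking."

for name, text in [("A", transcriber_a), ("B", transcriber_b)]:
    print(name, len(split_sentences(text)), round(mean_sentence_length(text), 2))
```

Any measure computed per sentence inherits this difference, which is why explicit, documented normalization rules matter at the pre-processing stage.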

Bibliographic details

Source: PLoS One, 2020, issue 11, 16 pages
Authors: Friederike Tegge; Katharina Parry
Original format: PDF
Chinese Library Classification: Medicine and health
