Development of Text Data Processing Pipeline for Scientific Systems

Abstract

The aim of this work was to develop a pipeline for processing scientific texts, including articles and abstracts, in order to categorize them, identify patterns, and build recommendations for users of scientific systems. The authors propose several methods for pre-processing texts and a method for cluster and classification analysis of texts, and they developed a recommendation software system for users of scientific publications. To solve the data preprocessing problem, a parametric approach is proposed for extracting a new semantic feature from textual publications: the type of scientific result. Extraction of the scientific result type is based on the user's need for content that has a specific property. To cluster user profiles, an ensemble method with a changing distance metric is proposed. For classification, an entropy-based ensemble method is used. The efficiency of the proposed methods and algorithms was evaluated on the search module of the information system of the "Technologies in Education" International Congress of Conferences. The authors acknowledge support from the MEPhI Academic Excellence Project (Contract No. 02.a03.21.0005).
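The abstract does not say how entropy enters the classification ensemble. As an illustration only, the sketch below shows one common reading: each base classifier outputs a class-probability distribution, and its vote is weighted by how confident that distribution is (one minus its normalized Shannon entropy). The function names, the weighting rule, and the equal-weight fallback are assumptions made for this example and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_vote(distributions):
    """Combine per-classifier class distributions into one distribution.

    Assumed rule (not from the paper): classifier i gets weight
    1 - H(p_i) / log(K), so confident classifiers count for more.
    """
    n_classes = len(distributions[0])
    if n_classes < 2:
        raise ValueError("need at least two classes")
    max_entropy = math.log(n_classes)
    weights = [max(0.0, 1.0 - entropy(d) / max_entropy) for d in distributions]
    total = sum(weights)
    if total <= 1e-12:
        # Every classifier is maximally uncertain: fall back to equal weights.
        weights = [1.0] * len(distributions)
        total = float(len(distributions))
    return [
        sum(w * d[k] for w, d in zip(weights, distributions)) / total
        for k in range(n_classes)
    ]


def predict(distributions):
    """Return the index of the class with the highest combined probability."""
    combined = entropy_weighted_vote(distributions)
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    # Three hypothetical base classifiers scoring one document over three classes.
    outputs = [
        [0.90, 0.05, 0.05],  # confident, so it receives a large weight
        [0.40, 0.35, 0.25],  # nearly uniform, so it receives a small weight
        [0.20, 0.50, 0.30],
    ]
    print(predict(outputs))
```

With this rule a nearly uniform output contributes almost nothing to the combined vote, which is the main difference from plain majority voting.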

Bibliographic details

Source
BICA Society., Meeting | 2020 | xix, 617 pages | 13 pages in total
Authors
Anna I. Guseva; Igor A. Kuznetsov; Pyotr V. Bochkaryov; Stanislav A. Filippov; Vasiliy S. Kireev
Original format: PDF
Chinese Library Classification: TP18-532
Keywords
Text mining; Natural language processing; Machine learning
Date indexed: 2022-08-21 00:04:44

Similar literature

1. Development of ground pipeline system for high-level scientific data products of the Hisaki satellite mission and its application to planetary space weather [J]. Tomoki Kimura, Atsushi Yamazaki, Kazuo Yoshioka, Journal of Space Weather and Space Climate. 2019, No. 1
2. Electronic document processing operating map development for the implementation of the data management system in a scientific organization [J]. Alexey Artamonov, Kristina Ionkina, Evgeny Tretyakov, Procedia Computer Science. 2018, No. 5
3. A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data [J]. Dreisbach Caitlin, Koleck Theresa A., Bourne Philip E., International Journal of Medical Informatics. 2019, May issue
4. Development of Text Data Processing Pipeline for Scientific Systems [C]. Anna I. Guseva, Igor A. Kuznetsov, Pyotr V. Bochkaryov, BICA Society., Meeting. 2020
5. Streamlining Big Data Processing Pipelines via Unix Memory Tools, Persistent Spark Datasets, and the Apache Ignite In-memory File System [D]. Blair, Walter. 2018
6. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper) [O]. Joshua Valdez, Michael Rueschman, Matthew Kim
7. Effect of Activities Prepared by Different Teaching Techniques on Scientific Creativity Levels of Prospective Pre-school Teachers [O]. Nur Akcanca. 2018
8. Text and Illustration Processing System (TIPS) User's Manual. Volume 1. Text Processing System [R]. Brown, C. J., Cox, R. 1981
