Faceted Search and Browsing of Indonesian Text Collection Using Shallow Parsing Techniques.

Abstract

Text search is a very useful way of retrieving document information from a particular website. The public generally prefers internet search engines to local enterprise search engines, because enterprise content is not cross-linked and is not ranked by a page-rank algorithm. Enterprise search engines, on the other hand, use metadata, which lets the user specify conditions that every retrieved document should meet, so searching by metadata is also very useful. This thesis aims to develop an enterprise search engine that uses metadata to provide advanced features such as faceted navigation. The search engine data was extracted from various Indonesian web sources. To support faceted search, each document is tagged with metadata entities: person, organization, location, and sentiment-analytic keywords. A shallow parsing technique, named entity recognition, is used for this purpose. More than 1500 entities were tagged in this process. The documents were converted to XML and indexed with Apache Solr, an open source enterprise search engine with full-text and faceted search capabilities. The entities help users specify conditions and search faster through the large collection of documents. Clicking on a metadata condition is guaranteed to return results. Because the sentiment-analytic keywords are tagged with positive and negative values, social scientists can use the results to check for overlapping or conflicting organizations and ideologies. In addition, this tool is the first of its kind for the Indonesian language. The results are fetched much faster and with better accuracy.
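
The pipeline described in the abstract (tag entities, convert to XML, index in Apache Solr, filter by facet) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the field names `person`, `organization`, `location`, and `sentiment`, the two helper functions, and the sample values are all hypothetical. Only the Solr XML update format (`<add><doc><field>`) and the request parameters `facet`, `facet.field`, `facet.mincount`, and `fq` are standard Solr.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical facet field names: the abstract names the entity types
# but does not give the actual Solr schema.
FACET_FIELDS = ["person", "organization", "location", "sentiment"]


def build_solr_add_xml(doc_id, text, entities):
    """Build a Solr XML update message (<add><doc>...) for one tagged document.

    `entities` maps a facet field name to the list of entity strings
    that a named entity recognizer found in the document.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    ET.SubElement(doc, "field", name="text").text = text
    for field in FACET_FIELDS:
        for value in entities.get(field, []):
            # One <field> element per value, so the field is multi-valued.
            ET.SubElement(doc, "field", name=field).text = value
    return ET.tostring(add, encoding="unicode")


def build_facet_query(keywords, selected=None):
    """Return the query string for a faceted Solr select request.

    `selected` maps a facet field to the value the user clicked on; each
    pair becomes a filter query (fq) that narrows the result set.
    Values containing double quotes are not escaped in this sketch.
    """
    params = [("q", keywords), ("facet", "true"), ("facet.mincount", "1")]
    params += [("facet.field", field) for field in FACET_FIELDS]
    for field, value in (selected or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(build_solr_add_xml(
        "doc-1",
        "Contoh teks berita.",
        {"person": ["Budi"], "location": ["Jakarta", "Bandung"]},
    ))
    print(build_facet_query("pemilu", {"location": "Jakarta"}))
```

With `facet.mincount` set to 1, Solr returns only facet values that occur in at least one matching document, which is one way to get the behavior the abstract describes, where clicking a metadata condition always yields results. The query string would be appended to the select handler URL of whichever Solr core holds the index.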

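Bibliographic record

Author: Sanaka, Srinivasa Raviteja
Author affiliation: Arizona State University
Degree-granting institution: Arizona State University
Discipline: Computer Science
Degree: M.S.
Year: 2010
Pages: 51 p.
Total pages: 51
Original format: PDF
Language: English
Date added to database: 2022-08-17 11:37:09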

Similar literature

1. Integrated Faceted Browser and Direct Search to Enhance Information Retrieval in Text-Based Digital Libraries [J]. Shea-Tinn Yeh, Yan Liu. International Journal of Human-Computer Interaction. 2011, issue 4a6.
2. Rapid Induction of Multiple Taxonomies for Enhanced Faceted Text Browsing [J]. Lawrence Muchemi, Gregory Grefenstette. International Journal of Artificial Intelligence & Applications (IJAIA). 2016, issue 4.
3. Search and browse services for heterogeneous collections with the peer-to-peer network Pepper [J]. Henrik Nottelmann, Gudrun Fischer. Information Processing & Management. 2007, issue 3.
4. MultiFacet: A Faceted Interface for Browsing Large Multimedia Collections [C]. Henry Michael J., Endert Alex, Roberts Ian. IEEE International Symposium on Multimedia. 2013.
5. Faceted searching and browsing over large collections of textual and text-annotated objects [D]. Dakka, Wisam. 2008.
6. Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison [O]. Yaoyun Zhang, Firat Tiryaki, Min Jiang. 2019.
7. Visual Abstraction and Ordering in Faceted Browsing of Text Collections [O]. VinhTuan Thai, Siegfried Handschuh. 2010.
8. Analysis of Free-Form Battlefield Reports with Shallow Parsing Techniques [R]. Hecking, M. 2004.