Graph based re-composition of document fragments for name entity recognition under exploitation of enterprise databases
Abstract
Methods and systems are described that involve recognizing complex entities from text documents with the help of structured data and Natural Language Processing (NLP) techniques. In one embodiment, the method includes receiving a document as input from a set of documents, wherein the document contains text or unstructured data. The method also includes identifying a plurality of text segments from the document via a set of tagging techniques. Further, the method includes matching the identified plurality of text segments against attributes of a set of predefined entities. Lastly, a best matching predefined entity is selected for each text segment from the plurality of text segments.

In one embodiment, the system includes a set of documents, each document containing text or unstructured data. The system also includes a database storage unit that stores a set of predefined entities, wherein each entity contains a set of attributes. Further, the system includes a processor to identify a plurality of text segments from a document via a set of tagging techniques and to match the identified plurality of text segments against the set of attributes.
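
The three steps in the abstract (identify text segments, match them against entity attributes, select the best matching entity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `ENTITIES` dictionary stands in for the database of predefined entities, the capitalized-word regex stands in for the tagging techniques, and character-level similarity from `difflib` stands in for the matching step. The graph-based re-composition named in the title is not modeled here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stand-in for the database of predefined entities;
# each entity carries a set of attributes.
ENTITIES = {
    "cust-001": {"name": "Acme Corporation", "city": "Berlin"},
    "cust-002": {"name": "Globex Industries", "city": "Springfield"},
}


def tag_segments(text):
    """Toy tagger (assumption): runs of capitalized words are candidate segments."""
    return re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", text)


def similarity(a, b):
    """Case-insensitive character similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_entity(segment, entities):
    """Return (entity_id, score) for the entity with the closest attribute value."""
    best_id, best_score = None, 0.0
    for entity_id, attrs in entities.items():
        score = max(similarity(segment, value) for value in attrs.values())
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id, best_score


def recognize(text, entities, threshold=0.6):
    """Map each tagged segment to its best matching entity, dropping weak matches."""
    results = {}
    for segment in tag_segments(text):
        entity_id, score = best_entity(segment, entities)
        if score >= threshold:
            results[segment] = entity_id
    return results


matches = recognize("We shipped the order to Acme Corp in Berlin yesterday.", ENTITIES)
```

The `threshold` parameter is also an assumption: it discards segments such as "We" that resemble no stored attribute, so only segments with a reasonably close attribute value are linked to an entity.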