A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books

Abstract

We created a dataset of syntactic-ngrams (counted dependency-tree fragments) based on a corpus of 3.5 million English books. The dataset includes over 10 billion distinct items covering a wide range of syntactic configurations. It also includes temporal information, facilitating new kinds of research into lexical semantics over time. This paper describes the dataset, the syntactic representation, and the kinds of information provided.

Bibliographic details

Source: Joint Conference on Lexical and Computational Semantics, 2013, pp. 241-247 (7 pages)
Authors: Yoav Goldberg; Jon Orwant
Original format: PDF

