
From Big Data to Argument Analysis and Automated Extraction: A Selective Study of Argument in the Philosophy of Animal Psychology from the Volumes of the Hathi Trust Collection



Abstract

The Digging by Debating (DbyD) project aimed to identify, extract, model, map and visualise philosophical arguments in very large text repositories such as the Hathi Trust. The project has: 1) developed a method for visualizing points of contact between philosophy and the sciences; 2) used topic modeling to identify the volumes, and pages within those volumes, which are 'rich' in a chosen topic; 3) used a semi-formal discourse analysis technique to manually identify key arguments in the selected pages; 4) used the OVA argument mapping tool to represent and map the key identified arguments and provide a framework for comparative analysis; 5) devised and used a novel analysis framework applied to the mapped arguments, covering the role, content and source of propositions, and the importance, context and meaning of arguments; 6) created a prototype tool for identifying propositions, using naive Bayes classifiers, and for identifying argument structure in chosen texts, using propositional similarity; 7) created tools to apply topic modeling to tasks of rating the similarity of papers in the PhilPapers repository.

Methods 1 to 5 above have enabled us to locate and extract the key arguments from each text. It is significant that, in applying the methods, a non-expert with limited or no domain knowledge of philosophy has both identified the volumes of interest from a key 'Big Data Set' (Hathi Trust) and identified key arguments within these texts. This provided several key insights about the nature and form of arguments in historical texts, and is a proof-of-concept design for a tool that will be usable by scholars. We have further created a dataset with which to train and test prototype tools for both proposition and argument extraction. Though at an early stage, these preliminary results are promising given the complexity of the task.

Specifically, we have prototyped a set of tools and methods that allow scholars to move between macro-scale, global views of the distributions of philosophical themes in such repositories, and micro-scale analyses of the arguments appearing on specific pages in texts belonging to the repository. Our approach spans bibliographic analysis, science mapping, and LDA topic modeling conducted at Indiana University, and machine-assisted argument markup into Argument Interchange Format (AIF) using the OVA (Online Visualization of Argument) tool from the University of Dundee. The latter has been used to analyse and represent arguments by the team based at the University of East London, who also performed a detailed empirical analysis of arguments in selected texts. This work has been articulated as a proof-of-concept tool, linked to the repository PhilPapers and designed by members linked to the University of London. This project shows for the first time how big data text processing techniques can be combined with deep structural analysis to provide researchers and students with navigation and interaction tools for engaging with the large and rich resources provided by datasets such as the Hathi Trust and PhilPapers. Ultimately, our efforts show how the computational humanities can bridge the gulf between the "big data" perspective of first-generation digital humanities and the close readings of text that are the "bread and butter" of more traditional scholarship in the humanities.
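The abstract says only that the prototype identifies propositions using naive Bayes classifiers; it gives no features, training data, or parameters. As a rough illustration of that general technique, and not the project's actual tool, the following is a minimal sketch of a multinomial naive Bayes sentence classifier with Laplace smoothing. The bag-of-words features, the labels "proposition" and "other", the class name `NaiveBayes`, and the six training sentences are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words features (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.class_counts = Counter()           # sentences seen per label
        self.word_counts = defaultdict(Counter) # word frequencies per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def log_scores(self, text):
        """Return the unnormalised log posterior for each label."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log(
                    (count + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy training set; not drawn from the project's dataset.
texts = [
    "Animals possess consciousness because they respond to pain",
    "Instinct cannot explain the behaviour of the dog",
    "Therefore the mind of the animal differs in degree from the human mind",
    "See chapter four for the plates and figures",
    "The author thanks the society for its support",
    "Printed in London by the university press",
]
labels = ["proposition"] * 3 + ["other"] * 3

clf = NaiveBayes().fit(texts, labels)
print(clf.predict("The dog possesses a mind because it responds to pain"))
print(clf.predict("Printed for the society by the press"))
```

With so little training data the classifier is only a demonstration of the mechanics: each sentence is scored per label by adding the log prior to the smoothed log likelihood of its known words, and the higher-scoring label wins. A realistic version would need a labelled corpus such as the dataset the abstract mentions, and likely richer features.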
