Conference: ACM/IEEE-CS Joint Conference on Digital Libraries

Building a Search Engine for Computer Science Course Syllabi

Abstract

Syllabi are rich educational resources, but generic search engines do a poor job of finding Computer Science syllabi. Toward our goal of building a syllabus collection, we trained several machine learning classifiers to distinguish Computer Science syllabi from other web pages and to identify, among other things, the discipline each syllabus represents (AI or SE, for instance). We crawled 50 Computer Science departments in the US and gathered 100,000 candidate pages. Our best classifiers are more than 90% accurate at identifying syllabi in real-world data. The syllabus repository we created is live for public use [1] and contains more than 3000 syllabi that our classifiers selected from the crawl data. We present an analysis of the feature selection methods and classifiers used.
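
As a rough illustration of the kind of page classification the abstract describes, the sketch below trains a tiny bag-of-words Naive Bayes model to separate syllabus pages from other pages. The abstract does not name the classifiers or features the authors used, so the choice of Naive Bayes, the function names (`tokenize`, `train_naive_bayes`, `classify`), and the four toy pages are all assumptions made for illustration. Feature selection is omitted entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(labeled_pages):
    """Collect per-label word counts for a multinomial Naive Bayes model."""
    word_counts = {}          # label -> Counter of word frequencies
    page_counts = Counter()   # label -> number of training pages
    for text, label in labeled_pages:
        page_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, page_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, page_counts, vocabulary = model
    total_pages = sum(page_counts.values())
    best_label, best_score = None, -math.inf
    for label in page_counts:
        # Log prior: fraction of training pages carrying this label.
        score = math.log(page_counts[label] / total_pages)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][word] + 1)
                    / (total_words + len(vocabulary))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; the paper's real training set is not shown in the abstract.
TOY_PAGES = [
    ("Course syllabus: grading policy, textbook, office hours, weekly schedule", "syllabus"),
    ("Syllabus for operating systems: prerequisites, exams, homework, grading", "syllabus"),
    ("Faculty profile: research interests, publications, awards", "other"),
    ("Department news: seminar announcement and alumni events", "other"),
]

model = train_naive_bayes(TOY_PAGES)
prediction = classify("Algorithms syllabus with grading policy and exams", model)
print(prediction)
```

With only four training pages this is a toy. A realistic version would need a much larger labeled set drawn from crawl data and a held-out set for measuring accuracy.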