ALPOS: A Machine Learning Approach for Analyzing Microblogging Data

Abstract

With the growth of the Internet, the increasing volume of information posted on microblogging sites such as Twitter calls for efficient information filtering. Conventional text classification assumes that the feature vectors extracted from the available documents are sufficient to learn good classifiers. This assumption is unlikely to hold for Twitter because of the limited number of characters in each tweet. At a higher level, each tweet can be viewed as an abbreviated abstraction of a longer document of which we have only a partial observation. To address the problem caused by partial observations, we introduce a novel domain adaptation/transfer learning approach called Assisted Learning for Partial Observation (ALPOS). The basic idea is to use a large number of multi-labeled examples (the source domain) to improve learning on the partial observations (the target domain). In particular, we learn a hidden, higher-level abstraction space that is meaningful for the multi-labeled examples in the source domain. This is done by simultaneously minimizing the document reconstruction error and the error of a classification model learned in the hidden space using known labels from the source domain. The partial observations in the target domain are then mapped to the same hidden space for recovery and classification. We compare the performance of this method with existing approaches on synthetic data and the well-known Reuters-21578 dataset. We also present experimental results on Twitter classification.
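
The joint objective described in the abstract (reconstruction error plus classification error in a shared hidden space) can be illustrated with a small sketch. The abstract does not specify the model form, so everything below is an assumption made for illustration rather than the authors' formulation: a linear encoder `W`, a linear decoder `D`, a per-label logistic classifier `C`, a trade-off weight `lam`, plain gradient descent, and synthetic data in which the "partial observations" are simulated by randomly masking features.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_loss(X, Y, W, D, C, lam):
    """Mean squared reconstruction error plus lam times multi-label logistic loss."""
    H = X @ W                                   # documents mapped to the hidden space
    recon = np.mean((H @ D - X) ** 2)           # document reconstruction error
    P = np.clip(sigmoid(H @ C), 1e-9, 1 - 1e-9)
    clf = -np.mean(Y * np.log(P) + (1 - Y) * np.log(1 - P))
    return recon + lam * clf

def fit(X, Y, k=5, lam=1.0, lr=0.05, steps=500, seed=0):
    """Gradient descent on the joint loss; returns parameters and the loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = Y.shape[1]
    W = rng.normal(scale=0.1, size=(d, k))      # encoder (hypothetical linear form)
    D = rng.normal(scale=0.1, size=(k, d))      # decoder
    C = rng.normal(scale=0.1, size=(k, m))      # classifier weights in the hidden space
    history = []
    for _ in range(steps):
        history.append(joint_loss(X, Y, W, D, C, lam))
        H = X @ W
        g_rec = 2.0 * (H @ D - X) / (n * d)           # gradient w.r.t. the reconstruction
        g_clf = lam * (sigmoid(H @ C) - Y) / (n * m)  # gradient w.r.t. the logits
        g_H = g_rec @ D.T + g_clf @ C.T               # both terms flow into the hidden space
        grad_W, grad_D, grad_C = X.T @ g_H, H.T @ g_rec, H.T @ g_clf
        W -= lr * grad_W
        D -= lr * grad_D
        C -= lr * grad_C
    history.append(joint_loss(X, Y, W, D, C, lam))
    return W, D, C, history

# Synthetic source domain: fully observed documents with multiple binary labels.
rng = np.random.default_rng(42)
Z = rng.normal(size=(200, 5))
X_src = Z @ rng.normal(size=(5, 30))
X_src /= X_src.std()
Y_src = (Z @ rng.normal(size=(5, 3)) > 0).astype(float)

W, D, C, history = fit(X_src, Y_src)

# Simulated target domain: only about 30% of each document's features are observed.
mask = rng.random(X_src.shape) < 0.3
X_tgt = X_src * mask
H_tgt = X_tgt @ W              # map partial observations to the same hidden space
X_rec = H_tgt @ D              # recovery of the full document
P_tgt = sigmoid(H_tgt @ C)     # per-label probabilities for classification
print(f"joint loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

The single hidden representation `H` receives gradients from both terms, which is what couples reconstruction and classification in this sketch; setting `lam` to zero would reduce it to a plain linear autoencoder.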

Bibliographic details

Source
IEEE International Conference on Data Mining Workshops | 2010 | 8 pages
Authors
Zhang Dan; Liu Yan; Lawrence Richard D.; Chenthamarakshan Vijil
Original format: PDF
Chinese Library Classification: TP274.2-53
Keywords
Assisted Learning for Partial Observation; Text Classification; Transfer learning; Twitter
