Bayesian Nonparametric Models on Big Data

Abstract

This thesis examines the role that investor type and sentiment play in financial markets, using data from social media.

The first paper investigates how the interaction between asset maturity and liquidity restrictions contributes to the "on-the-run" phenomenon, using asset markets with search frictions. In the presence of search frictions, investors prefer not to hold assets with a very short time to maturity, because each time an asset matures they must return to the market and search for a counterparty to buy new assets, incurring a search cost. However, they also prefer not to hold assets with a very long time to maturity because of liquidity considerations. An asset search model is set up to determine the asset choices of investors with different liquidity preferences. The model considers two assets that differ in maturity and two investor types that differ in liquidity preferences. The main finding is that liquidity cost matters in the presence of search frictions: the model predicts a separating equilibrium in which high-type agents choose the long-term asset and low-type agents choose the short-term asset. When the two assets have the same time to maturity, the same separating equilibrium is obtained. The spread of the long-term asset is found to be higher than that of the short-term asset, which is in line with the data. The paper therefore shows that the "on-the-run" phenomenon can be explained by higher search frictions in the off-the-run markets together with investors who have different liquidity preferences.

The second paper predicts intra-day foreign exchange rates from Twitter trending topics, using a sentiment-based topic clustering algorithm. Twitter trending topics are a good source of high-frequency information that could improve short-term or intra-day exchange rate predictions. The project uses an online dataset in which worldwide trending topics have been fetched from Twitter every ten minutes since July 2013. First, each trending topic is assigned a sentiment (negative, positive, or uncertain) using a sentiment lexicon. Then the trending topics are clustered with a continuous Dirichlet process mixture model, regardless of whether they are explicitly related to the currency under consideration. This unique approach makes it possible to capture the general sentiment among users, which implicitly affects the currencies. Finally, the exchange rates are estimated using a linear model that includes the topic-based sentiment series and the lagged values of the currencies, and a VAR model on the topic-based sentiment time series. The main variables of interest are the Euro/USD, GBP/USD, Swiss Franc/USD, and Japanese Yen/USD exchange rates. The linear model with the topic sentiments and the lagged currency values is found to perform better than the benchmark AR(1) model. Incorporating sentiment from tweets also resulted in better prediction of currency values after unexpected events.

The third paper investigates the behavior of users of Reddit's news subreddit and the relationship between their sentiment and exchange rates. Using graphical models and natural language processing, hidden online communities among Reddit users are discovered. The data are a mixture of text and categorical data from a news website: the titles of the news pages, a few user characteristics, and the users' comments. This dataset is an excellent resource for studying user reaction to news, since the comments are directly linked to the webpage contents. The model considered is a hierarchical mixture model, a generative model that detects overlapping networks using the sentiment from the user-generated content. The advantage of this model is that the communities (or groups) are assumed to follow a Chinese restaurant process, so it can detect and cluster the communities automatically. The hidden variables and the hyperparameters of this model can be obtained using Gibbs sampling.
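
The abstract says the communities are assumed to follow a Chinese restaurant process, which is why the number of communities does not have to be fixed in advance. As a rough illustration of that prior only (not the thesis's hierarchical mixture model or its Gibbs sampler), the Python sketch below draws cluster assignments from a Chinese restaurant process. The function name `chinese_restaurant_process` and the concentration parameter `alpha` are assumptions introduced for this example.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=None):
    """Draw table (cluster) assignments from a Chinese restaurant process.

    Hypothetical helper for illustration; `alpha` > 0 is the concentration
    parameter (larger values tend to open more tables).
    """
    rng = random.Random(seed)
    assignments = []   # table index chosen by each customer, in arrival order
    table_sizes = []   # number of customers currently seated at each table

    for _ in range(n_customers):
        # An existing table k is chosen with weight equal to its size;
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)

    return assignments, table_sizes


if __name__ == "__main__":
    assignments, sizes = chinese_restaurant_process(100, alpha=1.0, seed=0)
    print("number of clusters:", len(sizes))
    print("cluster sizes:", sizes)
```

The number of occupied tables grows with the data rather than being set beforehand, which is the property the abstract relies on when it says the model can detect and cluster communities automatically.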

Bibliographic record

  • Author: Ozcan, Fulya.
  • Author affiliation: University of California, Irvine.
  • Degree-granting institution: University of California, Irvine.
  • Subject: Economics; Statistics; Computer science.
  • Degree: Ph.D.
  • Year: 2017
  • Pages: 158 p.
  • Total pages: 158
  • Original format: PDF
  • Language of text: English (eng)
