In October 2014, I successfully defended my dissertation, titled “Analysis of Twitter Messages for Sentiment and Insight for use in Stock Market Decision Making.” This dissertation was the final step in earning my D.Sc. in Information Systems.
In my dissertation, I reported on my research into using natural language processing (NLP) to perform sentiment analysis on Twitter messages. I then analyzed that sentiment to determine whether it can be used for stock market investing decisions, using the idea of the Bear/Bull ratio, which is a quantitative measure of sentiment from Twitter. The research conducted for my dissertation is the baseline for the services on this website.
You can view a video of my dissertation defense below (or click over to Vimeo to watch it there). Additionally, I’ve created a PDF version of my dissertation and listed it for sale on my Trade The Sentiment website for $50 per copy. If you’d like to read up on some of the research that is the basis of this site, you can click here and use PayPal or Stripe to buy a copy. When you purchase a copy, you’ll have access to download the PDF.
Note: If you are an academic researcher or doctoral student, please contact me directly and I’ll share a copy of my dissertation for free.
I have uploaded my Dissertation Defense slides on Slideshare. If you don’t recall, the title of my dissertation is: “Analysis of Twitter Messages for Sentiment and Insight for use in Stock Market Decision Making.”
I am now one step closer to finishing my doctorate. On Friday Oct 31, I defended my dissertation. The video of the presentation during the defense is provided below. I now only have to get a few documents signed and format my dissertation for publishing and I’ll be completely finished.
The title of my dissertation is: “Analysis of Twitter Messages for Sentiment and Insight for use in Stock Market Decision Making.”
The video is a bit over 1 hour and 12 minutes long. I cut out the question and answer session for the sake of brevity.
While working up the data analysis chapter of my dissertation, I came across some interesting tidbits of information and thought I’d share.
Nothing here is earth-shattering and there’s not much I (or you) can do with this…but I thought it interesting and hope someone else out there does too. I’ve shared other findings before – and continue to share my daily Bear/Bull Ratio via my Trade The Sentiment site, which is an outcome of this research.
For the data collection phase of my dissertation, I collected Twitter messages for all stocks in the S&P500 index and the SPY ETF itself. There are many great pieces of knowledge that I’ve gathered from this work – some I’ve shared but most I won’t share because I need something to put into the dissertation. 🙂
So…here’s some data that you might find interesting (or maybe you won’t). Without further ado – and without interpretation, here you go:
SPY and all symbols in S&P 500 Index
Dates: Jan 1 – Dec 30 2012
Number of Twitter Messages Captured: 1,655,962
Number of Symbols: 501 (S&P 500 + SPY)
Number of days messages captured: 361
Number of Twitter users: 224,499
Average Messages per day: 4,587.15
Average Messages per user: 7.38
Date with Highest message volume: December 5 2012
Symbol with most Mentions: AAPL (620,964 messages or 37.5% of messages)
Symbol with most Bearish Mentions: AAPL with 98,402 messages with bearish sentiment
Symbol with most Bullish Mentions: AAPL with 78,353 messages with bullish sentiment
User with most Tweets: SeekingAlpha
Top 10 users account for 128,703 messages or 7.77% of messages
Top 25 users account for 197,878 messages or 11.95% of messages
Top 50 users account for 278,846 messages or 16.84% of messages
50% of messages were sent by 849 Twitter users or 0.38% of users
80% of messages were sent by 14,049 Twitter users or 12.27% of users
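If you want to check the averages and percentages in the list above for yourself, here is a small sketch that recomputes them from the raw counts. The variable names and the pct helper are my own for illustration; only the counts come from the list.

```python
# Raw counts copied from the list above; names are illustrative only.
total_messages = 1_655_962
days_captured = 361
total_users = 224_499
aapl_messages = 620_964
top_user_messages = {10: 128_703, 25: 197_878, 50: 278_846}
users_sending_half = 849


def pct(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


messages_per_day = round(total_messages / days_captured, 2)
messages_per_user = round(total_messages / total_users, 2)
aapl_share = pct(aapl_messages, total_messages)
top_user_share = {n: pct(count, total_messages)
                  for n, count in top_user_messages.items()}
half_sender_share = pct(users_sending_half, total_users)

print("Average messages per day:", messages_per_day)
print("Average messages per user:", messages_per_user)
print("AAPL share of messages (%):", aapl_share)
for n, share in top_user_share.items():
    print(f"Top {n} users share of messages (%):", share)
print("Share of users sending 50% of messages (%):", half_sender_share)
```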
While most of my research on Twitter Sentiment has been for use on larger time-frames (Daily, Weekly, etc), I’ve been very curious about using sentiment for intraday signals.
I finally found some time to hack together a script that would look at sentiment data intraday…and now I’m a bit unhappy that I did…because I’m fascinated with the intraday signals I’m seeing.
On Feb 25, we saw a nice little sell-off. The S&P500 gapped up to open the day and sold off for the remainder of the day for a loss of almost 38 points on the $SPX.
The following day, Feb 26, saw another move down until about mid-day when the markets ‘turned’ and started heading up…and we saw a retrace of about 40 points over the next three days.
Everyone seemed to be looking for a breakdown on Feb 25 and Feb 26 but it didn’t happen.
As luck would have it, over the weekend I had built my script to look at intraday sentiment. It was a quick hack (like most of my stuff) that allowed me to run a quick query to see what sentiment looks like at time “now”. On Feb 25, I was occasionally calling out the intraday sentiment values in @gtotoy’s trading room over at DayTraderBootCamp (you should join if you aren’t a member…some GREAT traders there).
As the day wore on, I was noticing the sentiment was getting much more bearish…to the extent that it was in ‘bearish extreme’ levels by a large margin. At one point, the sentiment for the day was around 1.5 or so (1.0 is neutral, anything over 1.25 is considered a bearish extreme).
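To make those numbers concrete, here is a minimal sketch of how a reading like 1.5 could be produced and checked against the 1.25 level. It assumes the ratio is simply the count of bearish messages divided by the count of bullish messages, which is consistent with 1.0 being neutral and higher values being more bearish, but that exact formula is my assumption here. The function names and the message counts are hypothetical.

```python
NEUTRAL = 1.0           # neutral reading quoted in the post
BEARISH_EXTREME = 1.25  # bearish-extreme level quoted in the post


def bear_bull_ratio(bearish_count, bullish_count):
    """Assumed definition: bearish messages divided by bullish messages."""
    if bullish_count <= 0:
        raise ValueError("need at least one bullish message to form a ratio")
    return bearish_count / bullish_count


def is_bearish_extreme(ratio, threshold=BEARISH_EXTREME):
    """True when the ratio is above the bearish-extreme threshold."""
    return ratio > threshold


# Hypothetical counts for one day, chosen to give a reading of 1.5.
ratio = bear_bull_ratio(bearish_count=150, bullish_count=100)
print("Bear/Bull ratio:", ratio)
print("Bearish extreme:", is_bearish_extreme(ratio))
```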
Twitter users were extremely bearish on the market on Feb 25 and the morning of Feb 26. On Feb 26 at 9:30 Central and again at 11:30 Central, there were two large bearish sentiment spikes…those times marked the same times that we saw the bottom in the market on that day (looking at a 15-minute chart).
At the time, I didn’t have the sentiment loaded up into a platform to view it against price action…but I’ve since fixed that.
Take a look at the following: a chart showing a 15-minute candle chart of SPY price action (in green), intraday sentiment sampled in 15-minute increments (in orange) and the 21-period EMA of the intraday sentiment (in magenta).
You’ll notice that the sentiment chart is fairly noisy…but the 21 EMA is much cleaner and provides a couple of nice ‘signals’ over the last few weeks. On the 21 EMA chart (bottom pane), you’ll see a red oval that highlights a Sell signal with Bullish Extreme readings on Feb 19 and a green oval that highlights a Buy signal with Bearish Extreme readings on Feb 25 and 26.
Note: a few days of data don’t make something useful. It could be pure happenstance that the below signals did what they did…but I’ll be looking at more historical data to see what I can find.
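For anyone curious how a 21-period EMA smooths a noisy series like the 15-minute sentiment readings, here is a rough sketch. It assumes the common smoothing factor of 2 / (period + 1) and seeds the average with the first reading; charting platforms can differ in how they seed it, so treat this as an illustration and not as the exact calculation behind the chart. The sample readings are made up.

```python
def ema(values, period=21):
    """Exponential moving average using alpha = 2 / (period + 1).

    The first value seeds the average (an assumption; platforms vary).
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    alpha = 2 / (period + 1)
    smoothed = []
    for value in values:
        if not smoothed:
            smoothed.append(value)
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical 15-minute Bear/Bull readings, not real data.
readings = [1.00, 1.40, 0.90, 1.60, 1.10, 1.55, 1.20, 1.70]
for raw, smooth in zip(readings, ema(readings)):
    print(f"raw={raw:.2f}  ema21={smooth:.3f}")
```

Because each new reading only gets a weight of about 9% with a 21-period setting, a single spike barely moves the line, which is why the EMA pane looks so much cleaner than the raw sentiment.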
I just finished giving a presentation titled “Will Twitter Make you a better investor?”…and like I always do with these presentations, I recorded one of my rehearsals to share.
In this presentation, I provide an overview of my research into using Twitter sentiment and message volume as inputs into modeling stock price movements. A quick and dirty linear regression model using Twitter Sentiment, the Number of Tweets per day, the VIX Closing price and the VIX Price change delivers a simple model for the S&P 500 SPY ETF that has an accuracy of 57% over 6 months (tested on out-of-sample data). This model was built using data from July 11 2011 to August 11 2011. Note: Accuracy is a measure of predicting the direction of movement. Being accurate and making money from that accuracy are two different things.
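Here is a small sketch of what accuracy as direction of movement can look like in code, and why it is not the same as making money. The functions and the numbers are hypothetical and are not output from the model in the presentation: a prediction counts as a hit when its sign matches the sign of the actual change, and the naive P&L just goes long or short one unit according to the predicted direction.

```python
def sign(x):
    """Return 1 for positive, -1 for negative, 0 for zero."""
    return (x > 0) - (x < 0)


def directional_accuracy(predicted_changes, actual_changes):
    """Fraction of periods where predicted and actual direction agree."""
    if len(predicted_changes) != len(actual_changes):
        raise ValueError("inputs must have the same length")
    if not predicted_changes:
        raise ValueError("inputs must not be empty")
    hits = sum(1 for p, a in zip(predicted_changes, actual_changes)
               if sign(p) == sign(a))
    return hits / len(predicted_changes)


def naive_pnl(predicted_changes, actual_changes):
    """P&L from trading one unit in the predicted direction each period."""
    return sum(sign(p) * a for p, a in zip(predicted_changes, actual_changes))


# Hypothetical daily changes: right on three small moves, wrong on one big one.
predicted = [0.5, 0.5, 0.5, 0.5]
actual = [0.1, 0.2, 0.1, -3.0]
print("Directional accuracy:", directional_accuracy(predicted, actual))
print("Naive P&L:", round(naive_pnl(predicted, actual), 2))
```

In this made-up example the direction is right 75% of the time, yet the one miss is large enough to leave the naive strategy with a loss.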
Update: Please note that the Linear Regression model described in this presentation is far from ideal. When modeling Time Series data, the linear regression model must be used with care due to autocorrelation issues.
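One common way to check a regression for the autocorrelation issue mentioned above is the Durbin-Watson statistic on the model’s residuals. To be clear, this is a standard diagnostic I am sketching for illustration, not something that was part of the presentation, and the residual values below are made up. A value near 2 suggests little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Hypothetical residuals that drift slowly, as autocorrelated errors tend to.
drifting = [0.8, 0.9, 1.0, 0.7, -0.6, -0.9, -1.0, -0.8]
print("Durbin-Watson:", round(durbin_watson(drifting), 3))
```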
If you don’t want to listen to me yammer, you can jump down to the bottom of this post and take a look at the slides.