Reddit Analysis

G. Lemus
2 min read · Feb 12, 2021

(2021-02-16 update: I added a link to the Yahoo Finance charts in the first table; clicking it takes you to the price plot for the stock.)

Using the code published in (Demystifying) Sentiment Analysis in Finance, I developed an automatic script that reads the ‘hot’ posts in r/wallstreetbets, identifies the stock tickers, and then looks for SEC Form 13F filings that also mention them. I do not include the sentiment score, as it uses an off-the-shelf movie-review analyser that is completely incompatible with financial sentiment.

The output of the code also highlights many caveats of automatic analysis: there are many ‘false positives’ (common words wrongly identified as tickers), and some stocks are not picked up at all (have fun looking for the mistakes in the last table).

(You can scroll on the embedded plots to see more rows)

(For an up to date version: click here)
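To make the two failure modes concrete, here is a minimal sketch of ticker extraction from post titles. It is not the script used for this post: the `KNOWN_TICKERS` and `AMBIGUOUS_WORDS` sets are tiny hypothetical stand-ins for a real exchange listing and a real stoplist, and the rule "ambiguous words only count when written with a `$` prefix" is an assumption chosen for illustration.

```python
import re

# Hypothetical reference data: a real run would load a full exchange listing.
KNOWN_TICKERS = {"GME", "AMC", "BB", "NOK", "A", "IT", "ALL"}
# Hypothetical stoplist: valid tickers that are also common English words.
AMBIGUOUS_WORDS = {"A", "IT", "ALL"}

# One to five capital letters as a whole word, optionally prefixed with "$".
TOKEN_RE = re.compile(r"\$?\b[A-Z]{1,5}\b")


def find_tickers(title):
    """Return the tickers mentioned in a post title, in order of appearance."""
    found = []
    for match in TOKEN_RE.finditer(title):
        token = match.group()
        symbol = token.lstrip("$")
        if symbol not in KNOWN_TICKERS:
            continue
        # Assumed rule: ambiguous words only count when written as $TICKER.
        if symbol in AMBIGUOUS_WORDS and not token.startswith("$"):
            continue
        if symbol not in found:
            found.append(symbol)
    return found


print(find_tickers("GME to the moon, IT IS ALL or nothing for $A and AMC"))
print(find_tickers("GameStop earnings next week"))
```

Without the stoplist, "IT" and "ALL" in the first title would be false positives. The second title shows the opposite problem: a post that names the company but never writes the ticker yields nothing.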

Using the above list of stocks, I then search the published SEC Form 13F filings for funds that report a position. Counting the calls and shares as ‘long’ and the puts as ‘short’ gives us an idea of the fund managers' view on the stock. The ‘s/l’ column (ratio of shorts to longs) gives us an idea of how bearish the fund world is on the stock (either shorting it or protecting against a drop); again, with caveats, as the quarterly reports can be filed up to 45 days after the quarter ends:

(For an up to date version: click here — for details month by month click here)
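The long/short aggregation described above can be sketched as follows. The input layout is an assumption: each position is a hypothetical `(ticker, kind, value)` tuple, where `kind` is `"shares"`, `"call"`, or `"put"`, and `value` is whatever size measure is being summed (the reported market value, for example). The actual script may aggregate a different field.

```python
from collections import defaultdict


def short_long_ratio(positions):
    """Aggregate 13F-style positions into an s/l ratio per ticker.

    Shares and calls count as 'long', puts count as 'short'.
    Returns None for a ticker with no long exposure (ratio undefined).
    """
    totals = defaultdict(lambda: {"long": 0.0, "short": 0.0})
    for ticker, kind, value in positions:
        side = "short" if kind == "put" else "long"
        totals[ticker][side] += value

    ratios = {}
    for ticker, sides in totals.items():
        if sides["long"] > 0:
            ratios[ticker] = sides["short"] / sides["long"]
        else:
            ratios[ticker] = None
    return ratios


# Made-up numbers, for illustration only.
sample = [
    ("GME", "shares", 100.0),
    ("GME", "call", 50.0),
    ("GME", "put", 75.0),
    ("AMC", "put", 10.0),
]
print(short_long_ratio(sample))
```

A ratio above 1 means the reported puts outweigh the reported shares and calls combined. Keep in mind that a put can be a hedge on a long position, so a high ratio is not necessarily an outright bearish bet.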

As a sanity check, it is always good to review the raw data: see below the actual list of ‘hot’ titles and the stocks found. The order of the ‘top’ titles is determined by the moderators (they ‘pin’ posts to the top) and by the time decay formula (a 12-hour-old post must have about 10 times the score of a new post to rank level with it; more info at Reddit_for Beginner). The score is the number of upvotes minus downvotes. Take a look at the sid_compound (sentiment analysis score compounding positive and negative…
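For reference, the time decay mentioned above can be sketched in a few lines. This assumes the ‘hot’ formula from Reddit's formerly open-sourced ranking code, which may not match what the site runs today. In that code the decay constant is 45,000 seconds (12.5 hours), which is where the roughly "12 hours per factor of 10" rule comes from.

```python
from math import log10

# Constants as they appeared in Reddit's formerly open-sourced ranking code;
# treat them as an assumption about the current site.
REDDIT_EPOCH = 1134028003      # reference timestamp, in seconds
DECAY_SECONDS = 45000          # 12.5 hours is worth one factor of 10 in score


def hot(score, created_utc):
    """Hot rank for a post with the given score (upvotes minus downvotes)."""
    order = log10(max(abs(score), 1))
    sign = (score > 0) - (score < 0)
    age_term = (created_utc - REDDIT_EPOCH) / DECAY_SECONDS
    return round(sign * order + age_term, 7)


older = hot(1000, REDDIT_EPOCH + 90000)                   # posted earlier, score 1000
newer = hot(100, REDDIT_EPOCH + 90000 + DECAY_SECONDS)    # posted 12.5 h later, score 100
print(older, newer)
```

Under this formula the two posts rank level: the older one needs ten times the score to keep up with a post submitted 12.5 hours after it.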
