Gary Manley Data: twitter

Friday, 26 January 2018

Basic Sentiment Analysis

For some of the work I have been doing recently I have started looking into sentiment analysis. In other words, this is analysing text to see if it is positive or negative. At the moment I am using this to analyse tweets to see if certain subjects are trending in a positive or negative light.

You can see a sample post of some of the analysis here.

The library used for this is NLTK.
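
As a rough illustration of the idea (not the NLTK code used for this post), here is a minimal sketch in plain Python. It scores each tweet against a tiny hand-made word list and labels it positive, negative or neutral. The word lists, the score_text and label function names and the sample tweets are all assumptions made up for the example; a real analyser of the kind NLTK provides uses a far larger lexicon and copes with things like negation.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon, invented for this sketch only.
POSITIVE = {"good", "great", "love", "happy", "helpful"}
NEGATIVE = {"bad", "awful", "hate", "painful", "worse"}


def score_text(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def label(score):
    """Map a numeric score onto a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up sample tweets.
tweets = [
    "Great news, the treatment was really helpful",
    "My hand is painful and getting worse",
    "Appointment booked for Tuesday",
]

labels = [label(score_text(t)) for t in tweets]
summary = Counter(labels)  # how many tweets fall under each label
print(labels)
print(summary)
```

Counting the labels per subject, as summary does here, is one simple way to see whether a topic is leaning positive or negative.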

The code:

Saturday, 20 January 2018

Create a blog post from Twitter results

Following on from yesterday's post on setting up a means to post to Blogger using Python, I have combined the following elements in this post:

Posting to blogger
Scraping Twitter
Connecting to Database
Getting data from database
pandas styles
pandas to html
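
To show how the database and HTML pieces fit together, here is a minimal sketch. It is deliberately self-contained, so it uses Python's built-in sqlite3 module and a hand-rolled table builder as stand-ins for the database connection and pandas-to-HTML steps listed above. The tweets table, its columns, the sample rows and the rows_to_html helper are all assumptions made up for the example.

```python
import sqlite3
from html import escape


def rows_to_html(headers, rows):
    """Build a simple HTML table from column headers and row tuples."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# In-memory database standing in for the real one (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (username TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)",
    [
        ("alice", "Surgery went well"),
        ("bob", "Any advice?"),
        ("alice", "Thanks <3"),
    ],
)

# Get data from the database: tweets per user, busiest first.
cursor = conn.execute(
    "SELECT username, COUNT(*) AS tweet_count FROM tweets "
    "GROUP BY username ORDER BY tweet_count DESC, username"
)
headers = [col[0] for col in cursor.description]
rows = cursor.fetchall()
conn.close()

# The HTML string is what would become the body of the blog post.
post_body = rows_to_html(headers, rows)
print(post_body)
```

Escaping each cell matters here because tweet text can contain characters such as < and & that would otherwise break the generated HTML.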

There is of course scope to add more to this. I can scrape more websites, and I can certainly improve the formatting of the tables. I can do more analytics by looking into the words within the tweets and add some graphs, so there is lots more to be done. I can also use something like schedule to get this to run automatically.

Pretty much everything needed for this post is contained in the code below or in the preceding posts.

The end result is something like this.

The code:

Friday, 12 January 2018

Twitter Reporting

One of the things that I was interested in doing was analysing some data from Twitter. I thought a good place for me to start was with data on Dupuytren's and then on Ledderhose. Why this? Well, because I am a trustee for the British Dupuytren's Society.

In this post I am going to use a variety of tools and libraries. I probably don't do this in the most perfect way, as I was trying to achieve a few different things, and I did it simply because I find Python and data analysis fun!

Twitter Scraping:

The first Python library that I am going to use is twitterscraper. This is a great tool for scraping information from Twitter. I have had a few issues with errors being thrown, but that is probably my fault. Overall it works well; see the link above and my code snippet below.

I did try adding retweets and likes, but the scraper did not cope well with them: the success rate for extraction was 10%, rather than the 70% without them. The biggest issue is that I didn't want to cater for lots of different languages, so anything in Chinese, for example, fails.

The second library is re, Python's regular expression library. I have used regexes before and they are super powerful and cool (if you like that sort of thing). For more information see the following link. Here I only use it for something very simple: removing return characters and tabs from the tweets so that exporting to a text document works.
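
A minimal sketch of that clean-up step, assuming the aim is to replace each run of carriage returns, newlines and tabs with a single space so that every tweet stays on one line of the exported file. The clean_tweet name and the sample text are made up for the example.

```python
import re

# Matches one or more carriage returns, newlines or tabs in a row.
_BREAKS = re.compile(r"[\r\n\t]+")


def clean_tweet(text):
    """Collapse line breaks and tabs into single spaces and trim the ends."""
    return _BREAKS.sub(" ", text).strip()


raw = "Had my hand checked today\r\n\tAll good so far\n"
print(clean_tweet(raw))  # Had my hand checked today All good so far
```

Replacing with a space rather than an empty string keeps words on either side of a line break from being glued together.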

I also use other packages that I have discussed elsewhere:

pandas
Xlwings
pyodbc / SQL Alchemy
SQL scripting

Screenshots of the Output: (Yes, I could have made this look nicer with some of the formatting options I have posted about before, or even just by auto-fitting the columns, but I felt those commands would have gotten lost in this post.)

The Code: