In two of my previous posts (this and this), I performed sentiment analysis on the Twitter airline data set with a classic machine learning technique: Naive Bayes classifiers. For this post I built a classifier with a deep learning approach. This work isn't meant to be seminal; it's just an excuse to play a little with neural networks.
For this work I used TensorFlow and Keras to define the neural network, and the new JupyterLab to write the code (I think it's really cool!). If you'd like, you can find my data science environment, with all of this stuff dockerized, at this link.
Ok, now let's talk about the neural network used in this post. The most interesting layer is the LSTM layer. If you want to know more about LSTMs, I suggest reading this post from Christopher Olah's blog. LSTM layers are widely used for language processing, which is why I chose this kind of layer for my analysis. A schema of the very simple neural network used for this example is the following:
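To give an idea of what such a simple network looks like in Keras, here is a minimal sketch. The hyperparameters (vocabulary size, tweet length, embedding and LSTM sizes) and the binary sigmoid output are illustrative assumptions, not the exact values from my notebook:

```python
# Minimal sketch of an Embedding -> LSTM -> Dense sentiment network.
# VOCAB_SIZE and MAX_LEN are assumed values, not those from the notebook.
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM

VOCAB_SIZE = 10000  # assumed vocabulary size
MAX_LEN = 40        # assumed padded tweet length

model = Sequential([
    Input(shape=(MAX_LEN,)),          # a tweet as a sequence of word ids
    Embedding(VOCAB_SIZE, 64),        # learn a dense vector per word
    LSTM(64),                         # process the tweet as a sequence
    Dense(1, activation="sigmoid"),   # sentiment score in [0, 1]
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```

That's really all it takes to define this kind of network: one recurrent layer sandwiched between an embedding and a dense output.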
The entire notebook used for this analysis is just below and can also be found on my GitHub profile here. Every code block is commented, so I won't bore you with a lot of words; let the code talk…
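One step worth highlighting before the network itself: tweets have to become fixed-length integer sequences before they can enter the embedding layer. This toy tokenizer is only an illustration of that idea, not the code from the notebook:

```python
# Hedged sketch: turn raw tweets into fixed-length integer sequences.
# Ids 0 and 1 are reserved for padding and unknown words.
def build_vocab(texts):
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(text, vocab, max_len):
    ids = [vocab.get(w, 1) for w in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))  # pad to fixed length

vocab = build_vocab(["great flight thanks", "flight delayed again"])
print(encode("great flight", vocab, 5))  # → [2, 3, 0, 0, 0]
```

In practice Keras ships its own text utilities for this job; the point is just that every tweet ends up as a same-length row of word ids.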
I trained this network in a few minutes using my dockerized data science environment on my laptop, without any kind of GPU.
As we can see from the "Training and validation loss" and "Training and validation accuracy" graphs, the 3rd epoch is the best one before the network starts to overfit the data.
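Picking the best epoch amounts to finding where the validation loss bottoms out before it turns upward. A tiny sketch with made-up loss values (not the ones from my training run):

```python
# Hypothetical validation-loss curve: it drops, bottoms out, then rises
# again as the network starts to overfit.
val_loss = [0.52, 0.41, 0.38, 0.43, 0.49]  # illustrative values only

best_epoch = val_loss.index(min(val_loss)) + 1  # epochs are 1-based
print(best_epoch)  # → 3
```

Keras can also do this automatically during training with an EarlyStopping callback that monitors `val_loss`.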
Compared to the previous Naive Bayes classifiers, the prediction accuracy jumped from 86% to 94%, with a very simple network and few epochs. The accuracy on positive tweets increased too. Despite the improvement this kind of network brings, I think the accuracy can still be pushed further, and that is the goal of my next tests.
Please feel free to comment and contact me to discuss this post! 🙂