Friday, 2 March 2018
Analysis on Padmaavat Movie Review
Padmaavat Movie Review
Sangamesh K S
March 2, 2018
Introduction
I performed this analysis a while back but was hesitant to post it. Now that the issue is no longer trending, I want to show how I performed sentiment analysis to understand crowd sentiment.
In this article we will look at people's sentiment towards the movie Padmaavat, one of the most controversial movies in Indian cinema. We will collect real reviews that are available in the public domain and analyse them.
I am a little hesitant to perform a full sentiment analysis as I did on Modi’s Mann Ki Baat, so we will only go as far as a word cloud and will not draw conclusions about the movie.
Let me load the required packages:
library(rvest)
## Loading required package: xml2
library(tm)
## Loading required package: NLP
library(SnowballC)
library(wordcloud)
## Loading required package: RColorBrewer
library(RColorBrewer)
Now let me scrape the web and import the 720 comments from people who reviewed the movie on IMDb:
result <- c()
for (i in c(1, seq(10, 290, 10))) {
  link <- paste0("http://www.imdb.com/title/tt5935704/reviews?ref_=tt_urv", i)
  # NOTE: appending i to ref_ may not actually paginate; check that each
  # iteration returns different reviews.
  page <- read_html(link)
  # Used SelectorGadget to find the CSS selector
  reviews <- page %>%
    html_nodes(".text") %>%
    html_text()
  # Perform data cleaning on user reviews
  reviews <- gsub("\r?\n|\r", " ", reviews)
  reviews <- tolower(gsub("[^[:alnum:] ]", " ", reviews))
  result <- c(result, reviews)
}
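To make the cleaning step concrete, here is a minimal sketch that applies the same two gsub() calls to a single made-up review. The helper name clean_review and the sample text are illustrative assumptions, not part of the scraped data, and the whitespace collapse at the end is an extra step added only to make the result easier to read.

```r
# Hypothetical helper (not part of the original script): applies the same
# cleaning regexes as the scraping loop to one or more review strings.
clean_review <- function(x) {
  x <- gsub("\r?\n|\r", " ", x)                # line breaks -> spaces
  x <- tolower(gsub("[^[:alnum:] ]", " ", x))  # punctuation -> spaces, lower-case
  trimws(gsub(" +", " ", x))                   # assumption: also collapse repeated spaces
}

sample_review <- "Great movie!\r\nRanveer's acting: 10/10."  # made-up text
cleaned <- clean_review(sample_review)
print(cleaned)
# [1] "great movie ranveer s acting 10 10"
```

Notice that the apostrophe leaves a stray "s" behind, which is why single-letter words are removed in the later cleaning step.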
Now let's clean the data:
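movie_data <- result
# Line breaks and punctuation were already handled in the loop; repeating this is harmless
movie_data <- gsub("\r?\n|\r", " ", movie_data)
movie_data <- tolower(gsub("[^[:alnum:] ]", " ", movie_data))
# Drop English stop words
movie_data <- removeWords(movie_data, stopwords())
# Drop stray single letters
movie_data <- gsub(pattern = "\\b[a-z]\\b", "", movie_data)
movie_data <- stripWhitespace(movie_data)
# Build the corpus and the term-document matrix
movie_data <- Corpus(VectorSource(movie_data))
dtm <- TermDocumentMatrix(movie_data)
m <- as.matrix(dtm)
# Total frequency of each term across all reviews
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)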
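To see what the term-document matrix and rowSums() step boils down to, here is a rough base-R sketch on three made-up reviews. It is not the tm implementation, only an illustration of the same idea: split the text into words, count each word across all documents, and sort by frequency. The names toy_reviews and toy_d are assumptions for this example.

```r
# Made-up, already-cleaned reviews (illustrative only)
toy_reviews <- c("great movie great acting",
                 "movie was long",
                 "great visuals")

# Split every review into words and pool them into one vector
words <- unlist(strsplit(toy_reviews, " +"))

# Count occurrences of each word and sort from most to least frequent
freq <- sort(table(words), decreasing = TRUE)

# Same shape as the data frame d built above: one row per word
toy_d <- data.frame(word = names(freq),
                    freq = as.vector(freq),
                    stringsAsFactors = FALSE)
print(head(toy_d, 2))
```

Here "great" appears three times and "movie" twice, so they end up in the first two rows.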
Now let's plot the data:
wordcloud(words = d$word, freq = d$freq, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))
head(d, 10)
##                word freq
## movie         movie 1380
## ranveer     ranveer  510
## bhansali   bhansali  480
## padmaavat padmaavat  450
## film           film  420
## just           just  420
## great         great  390
## like           like  390
## deepika     deepika  360
## shahid       shahid  360
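One thing worth checking: every count above is a multiple of 30, which is also the number of loop iterations. That hints that the same reviews may have been collected more than once. The sketch below uses made-up data, not the real reviews, to show how repeated documents scale word counts and how unique() would remove exact duplicates before counting. The helper count_word is a hypothetical name for this example.

```r
# Made-up "page" of two reviews, repeated 30 times as if scraped in a loop
page <- c("great movie", "bad movie")
scraped <- rep(page, times = 30)

# Hypothetical helper: how often does a word occur across all documents?
count_word <- function(docs, word) {
  sum(unlist(strsplit(docs, " +")) == word)
}

inflated <- count_word(scraped, "movie")          # counted once per repeated copy
deduped  <- count_word(unique(scraped), "movie")  # counted once per distinct review
print(c(inflated = inflated, deduped = deduped))
# inflated  deduped
#       60        2
```

If that is what happened here, the ranking of the words would stay the same, since every count is scaled by the same factor, but the absolute frequencies would be overstated.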
If you are interested in the sentiment analysis of Modi’s Mann Ki Baat, please visit the links below:
PART-1: https://experimentswithdatascience.blogspot.in/2017/08/text-analysis-easy-way-to-web-scrapping.html
PART-2: https://experimentswithdatascience.blogspot.in/2017/10/modis-mann-ki-baat-text-analytics-text.html
PART-3: http://experimentswithdatascience.blogspot.in/2017/11/analysis-of-modis-speech-on-29102017_40.html