
Friday 2 March 2018

Analysis on Padmaavat Movie Review


Padmaavat Movie Review

Introduction

I performed this analysis a while back but was hesitant to post it. Now that the issue is no longer trending, I want to show how I performed sentiment analysis to understand crowd sentiment.

In this article we will look at people's sentiment towards the movie Padmaavat, which was one of the most controversial movies in Indian cinema. We will collect real reviews that are available in the public domain and analyze them.

I am a little hesitant to perform a full sentiment analysis as I did on Modi’s Mann Ki Baat. So we will go only as far as a word cloud and will not draw conclusions about the movie.

Let me load the required packages.

library(rvest)
## Loading required package: xml2
library(tm)
## Loading required package: NLP
library(SnowballC)
library(wordcloud)
## Loading required package: RColorBrewer
library(RColorBrewer)

Now let me scrape the web and import the 720 comments from people who reviewed the movie on IMDb.

result <- c()
for(i in c(1, seq(10, 290, 10))) {
  link <- paste0("http://www.imdb.com/title/tt5935704/reviews?ref_=tt_urv",i)
  review_page <- read_html(link)
  
  # Used SelectorGadget to find the CSS selector
  reviews <- review_page %>% html_nodes(".text") %>%
    html_text()
  
  # perform data cleaning on user reviews:
  # replace line breaks, then any non-alphanumeric character, with a space
  reviews <- gsub("\r?\n|\r", " ", reviews) 
  reviews <- tolower(gsub("[^[:alnum:] ]", " ", reviews))
  result <- c(result, reviews)
}
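
To see what the two gsub() calls inside the loop do, here is a small offline sketch. The sample review string is made up for illustration and is not taken from IMDb.

```r
# Hypothetical sample review; not scraped from IMDb
sample_review <- "Loved it!\r\nRanveer's acting: 10/10."

# Step 1: replace line breaks (\r\n, \n, or \r) with a space
step1 <- gsub("\r?\n|\r", " ", sample_review)

# Step 2: replace everything except letters, digits and spaces with a space,
# then convert to lower case
cleaned <- tolower(gsub("[^[:alnum:] ]", " ", step1))

print(cleaned)
```

Because punctuation is replaced by spaces, "Ranveer's" ends up as "ranveer s". Stray single letters like that "s" are what the single-letter removal in the next cleaning step takes care of.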

Now let's clean the data and build the word-frequency table

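movie_data<-result
# replace line breaks and non-alphanumeric characters with spaces, then lowercase
movie_data <- gsub("\r?\n|\r", " ", movie_data) 
movie_data <- tolower(gsub("[^[:alnum:] ]", " ", movie_data))
# drop English stop words and stray single letters, then collapse whitespace
movie_data<-removeWords(movie_data,stopwords())
movie_data<-gsub(pattern = "\\b[A-Za-z]\\b","",movie_data)
movie_data<-stripWhitespace(movie_data)
# build a term-document matrix and a word-frequency table sorted by count
movie_data <- Corpus(VectorSource(movie_data))
dtm <- TermDocumentMatrix(movie_data)
m <- as.matrix(dtm)
v <- sort(rowSums(m),decreasing=TRUE)
d <- data.frame(word = names(v),freq=v)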
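
As a cross-check on the term-document matrix, the same kind of word counts can be sketched in base R with strsplit() and table(). The toy documents and the name d_check are assumptions for illustration. tm's TermDocumentMatrix by default keeps only terms of at least three characters, so the sketch applies the same filter.

```r
# Toy documents standing in for cleaned reviews (made up for illustration)
docs <- c("great movie great acting", "movie is too long")

# Split on runs of spaces and flatten into one vector of tokens
tokens <- unlist(strsplit(docs, " +"))

# Keep tokens of three or more characters, mirroring tm's default wordLengths
tokens <- tokens[nchar(tokens) >= 3]

# Count each token and sort from most to least frequent
freq <- sort(table(tokens), decreasing = TRUE)
d_check <- data.frame(word = names(freq), freq = as.integer(freq))
print(d_check)
```

If the counts from a sketch like this disagree with d on the real data, the difference usually comes from tokenization or length filtering rather than from the reviews themselves.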

Now let's plot the data

wordcloud(words = d$word, freq = d$freq, min.freq = 1,
          max.words=200, random.order=FALSE, rot.per=0.35, 
          colors=brewer.pal(8, "Dark2"))

head(d, 10)
##                word freq
## movie         movie 1380
## ranveer     ranveer  510
## bhansali   bhansali  480
## padmaavat padmaavat  450
## film           film  420
## just           just  420
## great         great  390
## like           like  390
## deepika     deepika  360
## shahid       shahid  360
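
One thing worth checking before reading too much into these counts: every frequency above is a multiple of 30, and the scraping loop makes 30 requests. That pattern would be consistent with each request returning the same page of reviews, although this is only an inference from the numbers. A minimal sketch of the check, plus de-duplication with unique(), is below. The name result_demo and its contents are made up for illustration.

```r
# Frequencies reported by head(d, 10) above
top_freq <- c(1380, 510, 480, 450, 420, 420, 390, 390, 360, 360)

# Number of iterations in the scraping loop
n_requests <- length(c(1, seq(10, 290, 10)))

# TRUE if every count is an exact multiple of the number of requests
print(all(top_freq %% n_requests == 0))

# Toy stand-in for `result`: three made-up reviews repeated once per request
result_demo <- rep(c("great movie", "too long", "great visuals"),
                   times = n_requests)

# unique() keeps one copy of each distinct review text
unique_reviews <- unique(result_demo)
print(length(unique_reviews))
```

Running unique(result) before the cleaning step would be one way to make sure each review is counted only once. The relative ranking of words would stay the same if all reviews were duplicated equally.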

If you are interested in the sentiment analysis of Modi’s Mann Ki Baat, please visit the links below

PART-1: https://experimentswithdatascience.blogspot.in/2017/08/text-analysis-easy-way-to-web-scrapping.html

PART-2 : https://experimentswithdatascience.blogspot.in/2017/10/modis-mann-ki-baat-text-analytics-text.html

PART-3 : http://experimentswithdatascience.blogspot.in/2017/11/analysis-of-modis-speech-on-29102017_40.html