Sunday 29 October 2017

Modi's Mann Ki Baat: Text Analytics-Text Mining (Part-2)


Analysis of Modi’s Speech on 29/10/2017 Using Text Analysis

This part mainly covers web scraping and text mining of Modi’s speech, which he delivered on 29th October 2017 at 11 AM.

For web scraping we will use a Times of India article and a library called rvest.

library(rvest)
## Warning: package 'rvest' was built under R version 3.4.1
## Loading required package: xml2
## Warning: package 'xml2' was built under R version 3.4.1
article<- read_html("https://timesofindia.indiatimes.com/india/pm-modi-addresses-the-nation-in-mann-ki-baat-highlights/articleshow/61315589.cms")

article<-article %>%
  html_node(".Normal") %>% html_text()
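Note that html_node() returns only the first element that matches the selector, while html_nodes() returns every match. Here is a minimal sketch of the difference, using a small made-up HTML snippet instead of the live page (the snippet and the variable names are assumptions for illustration only):

```r
library(rvest)

# Hypothetical HTML snippet standing in for the downloaded page
page <- read_html('<div class="Normal">First block</div><div class="Normal">Second block</div>')

# html_node() keeps only the first matching element
first_block <- page %>% html_node(".Normal") %>% html_text()

# html_nodes() keeps every matching element, one string per element
all_blocks <- page %>% html_nodes(".Normal") %>% html_text()
```

If a page spreads its text over several elements with the same class, html_nodes() is probably the safer choice.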

Now let us look at the imported data.

article
## [1] "\n\n  * From the days of Khadi for Nation, we came to see Khadi for Fashion, and now the country is moving towards 'Khadi for transformation'\n  \n  * This year, there has been an increase of nearly 90 per cent in sale of handloom and Khadi products over previous year. This will have proved beneficial for the poor craftsmen\n  \n  * People asked on the NM App whether they could somehow send sweets on the occasion of Diwali to the soldiers at the borders\n  \n  * Celebrating Diwali with the jawans at Gurez was an unforgettable experience for me\n  \n  * Not only are our soldiers safeguarding the borders, they are helping keep the peace all across the world. More than 18,000 Indian soldiers have contributed to UN Peacekeeping missions\n  \n  * Currently 7,000 soldiers Indian troops are deployed with UN peacekeeping initiatives, it's third highest in the world\n  \n  * Indian women soldiers have contributed significantly to peacekeeping missions and India was first nation to send a female police unit to peacekeeping mission in Liberia\n  \n  * The United Nations declaration of Human Rights is testimony to India's push for gender equality. Article 01, which began, \"All men are born free and equal in dignity and rights\" was changed to \"All human beings are born free and equal in dignity and rights\" due to the constant efforts of Hansa Mehta\n  \n  * India has always spread the message of peace, unity and goodwill. We believe that everyone should live in harmony and move towards building a better and peaceful tomorrow\n  \n  * After 10 years India won Asia Cup. I congratulate entire team and the support staff\n  \n  * Young Indian players won hearts of fans during the FIFA Under-17 World Cup. Future of football is bright in India\n  \n  * Outdoor activities are a must for children. Elders must encourage children to move out and play in open field\n  \n  * Yoga for Young India! 
Yoga will help our children from lifestyle disorders\n  \n  * A NGO called Ecological Protection Organization launched a cleanliness campaign in Chandrapur Fort. In this campaign lasting for 200 days, people performed task of cleaning fort, non-stop, without any fatigue and with team-work. They sent me photographs with a caption- 'Before and After'! I was overwhelmed on seeing these\n  \n  * We shall celebrate the birth anniversary of Sardar Vallabhbhai Patel ji on the 31st of October. He ensured that millions of Indians were brought under the ambit of one nation & one constitution. 'Run for Unity' will be organised throughout the country on the day\n  "

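Now we will start text mining.

First we preview the text in lower case. Note that the call below only prints the result; it is not assigned back to article, so the later steps still work on the mixed-case text.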

tolower(article)
## [1] "\n\n  * from the days of khadi for nation, we came to see khadi for fashion, and now the country is moving towards 'khadi for transformation'\n  \n  * this year, there has been an increase of nearly 90 per cent in sale of handloom and khadi products over previous year. this will have proved beneficial for the poor craftsmen\n  \n  * people asked on the nm app whether they could somehow send sweets on the occasion of diwali to the soldiers at the borders\n  \n  * celebrating diwali with the jawans at gurez was an unforgettable experience for me\n  \n  * not only are our soldiers safeguarding the borders, they are helping keep the peace all across the world. more than 18,000 indian soldiers have contributed to un peacekeeping missions\n  \n  * currently 7,000 soldiers indian troops are deployed with un peacekeeping initiatives, it's third highest in the world\n  \n  * indian women soldiers have contributed significantly to peacekeeping missions and india was first nation to send a female police unit to peacekeeping mission in liberia\n  \n  * the united nations declaration of human rights is testimony to india's push for gender equality. article 01, which began, \"all men are born free and equal in dignity and rights\" was changed to \"all human beings are born free and equal in dignity and rights\" due to the constant efforts of hansa mehta\n  \n  * india has always spread the message of peace, unity and goodwill. we believe that everyone should live in harmony and move towards building a better and peaceful tomorrow\n  \n  * after 10 years india won asia cup. i congratulate entire team and the support staff\n  \n  * young indian players won hearts of fans during the fifa under-17 world cup. future of football is bright in india\n  \n  * outdoor activities are a must for children. elders must encourage children to move out and play in open field\n  \n  * yoga for young india! 
yoga will help our children from lifestyle disorders\n  \n  * a ngo called ecological protection organization launched a cleanliness campaign in chandrapur fort. in this campaign lasting for 200 days, people performed task of cleaning fort, non-stop, without any fatigue and with team-work. they sent me photographs with a caption- 'before and after'! i was overwhelmed on seeing these\n  \n  * we shall celebrate the birth anniversary of sardar vallabhbhai patel ji on the 31st of october. he ensured that millions of indians were brought under the ambit of one nation & one constitution. 'run for unity' will be organised throughout the country on the day\n  "
article<-gsub(pattern = "\\d","",article)
article<-gsub("\n","",article)
article<-gsub(pattern = "\\W"," ",article)
article
## [1] "    From the days of Khadi for Nation  we came to see Khadi for Fashion  and now the country is moving towards  Khadi for transformation       This year  there has been an increase of nearly  per cent in sale of handloom and Khadi products over previous year  This will have proved beneficial for the poor craftsmen      People asked on the NM App whether they could somehow send sweets on the occasion of Diwali to the soldiers at the borders      Celebrating Diwali with the jawans at Gurez was an unforgettable experience for me      Not only are our soldiers safeguarding the borders  they are helping keep the peace all across the world  More than   Indian soldiers have contributed to UN Peacekeeping missions      Currently   soldiers Indian troops are deployed with UN peacekeeping initiatives  it s third highest in the world      Indian women soldiers have contributed significantly to peacekeeping missions and India was first nation to send a female police unit to peacekeeping mission in Liberia      The United Nations declaration of Human Rights is testimony to India s push for gender equality  Article   which began   All men are born free and equal in dignity and rights  was changed to  All human beings are born free and equal in dignity and rights  due to the constant efforts of Hansa Mehta      India has always spread the message of peace  unity and goodwill  We believe that everyone should live in harmony and move towards building a better and peaceful tomorrow      After  years India won Asia Cup  I congratulate entire team and the support staff      Young Indian players won hearts of fans during the FIFA Under  World Cup  Future of football is bright in India      Outdoor activities are a must for children  Elders must encourage children to move out and play in open field      Yoga for Young India  Yoga will help our children from lifestyle disorders      A NGO called Ecological Protection Organization launched a cleanliness campaign in Chandrapur Fort 
 In this campaign lasting for  days  people performed task of cleaning fort  non stop  without any fatigue and with team work  They sent me photographs with a caption   Before and After   I was overwhelmed on seeing these      We shall celebrate the birth anniversary of Sardar Vallabhbhai Patel ji on the st of October  He ensured that millions of Indians were brought under the ambit of one nation   one constitution   Run for Unity  will be organised throughout the country on the day  "

In this step we have removed all the digits, the newline characters and the punctuation (every non-word character was replaced by a space).
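These cleaning steps can also be wrapped in one small function. The following is only a sketch in base R: the name clean_text is hypothetical, it stores the lower-cased result (unlike the preview above), and it turns newlines into spaces instead of deleting them.

```r
# Hypothetical helper: the name clean_text is not part of the original post
clean_text <- function(x) {
  x <- tolower(x)            # lower-case the text and keep the result
  x <- gsub("\\d", "", x)    # drop digits
  x <- gsub("\\W", " ", x)   # replace every non-word character (incl. newlines) with a space
  x <- gsub("\\s+", " ", x)  # collapse runs of whitespace into one space
  trimws(x)                  # remove leading and trailing spaces
}

cleaned <- clean_text("\n  * After 10 years India won Asia Cup.\n")
cleaned
```

Calling clean_text(article) would apply the same steps to the scraped text in one go.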

The cleaning is not complete yet. Next we will remove stop words, i.e. common words such as "the" and "of" that are not needed for the analysis.

library(tm)
## Warning: package 'tm' was built under R version 3.4.1
## Loading required package: NLP
stopwords()
##   [1] "i"          "me"         "my"         "myself"     "we"        
##   [6] "our"        "ours"       "ourselves"  "you"        "your"      
##  [11] "yours"      "yourself"   "yourselves" "he"         "him"       
##  [16] "his"        "himself"    "she"        "her"        "hers"      
##  [21] "herself"    "it"         "its"        "itself"     "they"      
##  [26] "them"       "their"      "theirs"     "themselves" "what"      
##  [31] "which"      "who"        "whom"       "this"       "that"      
##  [36] "these"      "those"      "am"         "is"         "are"       
##  [41] "was"        "were"       "be"         "been"       "being"     
##  [46] "have"       "has"        "had"        "having"     "do"        
##  [51] "does"       "did"        "doing"      "would"      "should"    
##  [56] "could"      "ought"      "i'm"        "you're"     "he's"      
##  [61] "she's"      "it's"       "we're"      "they're"    "i've"      
##  [66] "you've"     "we've"      "they've"    "i'd"        "you'd"     
##  [71] "he'd"       "she'd"      "we'd"       "they'd"     "i'll"      
##  [76] "you'll"     "he'll"      "she'll"     "we'll"      "they'll"   
##  [81] "isn't"      "aren't"     "wasn't"     "weren't"    "hasn't"    
##  [86] "haven't"    "hadn't"     "doesn't"    "don't"      "didn't"    
##  [91] "won't"      "wouldn't"   "shan't"     "shouldn't"  "can't"     
##  [96] "cannot"     "couldn't"   "mustn't"    "let's"      "that's"    
## [101] "who's"      "what's"     "here's"     "there's"    "when's"    
## [106] "where's"    "why's"      "how's"      "a"          "an"        
## [111] "the"        "and"        "but"        "if"         "or"        
## [116] "because"    "as"         "until"      "while"      "of"        
## [121] "at"         "by"         "for"        "with"       "about"     
## [126] "against"    "between"    "into"       "through"    "during"    
## [131] "before"     "after"      "above"      "below"      "to"        
## [136] "from"       "up"         "down"       "in"         "out"       
## [141] "on"         "off"        "over"       "under"      "again"     
## [146] "further"    "then"       "once"       "here"       "there"     
## [151] "when"       "where"      "why"        "how"        "all"       
## [156] "any"        "both"       "each"       "few"        "more"      
## [161] "most"       "other"      "some"       "such"       "no"        
## [166] "nor"        "not"        "only"       "own"        "same"      
## [171] "so"         "than"       "too"        "very"
article<-removeWords(article,stopwords())
# remove stray single letters; [A-Za-z] avoids the extra symbols that [A-z] also matches
article<-gsub(pattern = "\\b[A-Za-z]\\b","",article)
article<-stripWhitespace(article)
article
## [1] " From days Khadi Nation came see Khadi Fashion now country moving towards Khadi transformation This year increase nearly per cent sale handloom Khadi products previous year This will proved beneficial poor craftsmen People asked NM App whether somehow send sweets occasion Diwali soldiers borders Celebrating Diwali jawans Gurez unforgettable experience Not soldiers safeguarding borders helping keep peace across world More Indian soldiers contributed UN Peacekeeping missions Currently soldiers Indian troops deployed UN peacekeeping initiatives third highest world Indian women soldiers contributed significantly peacekeeping missions India first nation send female police unit peacekeeping mission Liberia The United Nations declaration Human Rights testimony India push gender equality Article began All men born free equal dignity rights changed All human beings born free equal dignity rights due constant efforts Hansa Mehta India always spread message peace unity goodwill We believe everyone live harmony move towards building better peaceful tomorrow After years India won Asia Cup congratulate entire team support staff Young Indian players won hearts fans FIFA Under World Cup Future football bright India Outdoor activities must children Elders must encourage children move play open field Yoga Young India Yoga will help children lifestyle disorders NGO called Ecological Protection Organization launched cleanliness campaign Chandrapur Fort In campaign lasting days people performed task cleaning fort non stop without fatigue team work They sent photographs caption Before After overwhelmed seeing We shall celebrate birth anniversary Sardar Vallabhbhai Patel ji st October He ensured millions Indians brought ambit one nation one constitution Run Unity will organised throughout country day "
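removeWords() matches case-sensitively and the tm stop word list is all lower case, which is why capitalised words such as "This", "The" and "From" are still present in the output above. Here is a sketch of one way around this, lower-casing before removal, shown on a single sentence taken from the article (the variable names are illustrative):

```r
library(tm)

sample_text <- "This will have proved beneficial for the poor craftsmen"

# Lower-case first so that "This" matches the stop word "this"
no_stops <- removeWords(tolower(sample_text), stopwords("en"))

# Collapse the gaps left behind by the removed words
no_stops <- stripWhitespace(no_stops)
no_stops
```

The same idea would apply to the whole text by running article <- tolower(article) before removeWords().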

Now we will require two packages, stringr and wordcloud: the first for data manipulation and for creating the bag of words (textbag), the second for data visualization.

library(stringr)
## Warning: package 'stringr' was built under R version 3.4.1
library(wordcloud)
## Warning: package 'wordcloud' was built under R version 3.4.1
## Loading required package: RColorBrewer
## Warning: package 'RColorBrewer' was built under R version 3.4.1
textbag<-str_split(article,pattern = "\\s+")
textbag<-unlist(textbag)
data.set<-table(textbag)
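Because the cleaned text starts and ends with a space, str_split() also returns empty strings, and table() then counts them as a word. Here is a base R sketch, on a short made-up string, of dropping the empty tokens and sorting the counts (strsplit() stands in for str_split(), and the variable names are illustrative):

```r
sample_text <- " khadi nation khadi fashion khadi peace nation "

# Split on whitespace; the leading space yields an empty first token
tokens <- unlist(strsplit(sample_text, "\\s+"))

# Drop empty tokens so they are not counted as a word
tokens <- tokens[tokens != ""]

# Count each word and sort from most to least frequent
word_freq <- sort(table(tokens), decreasing = TRUE)
word_freq
```

head(word_freq, 10) would then give the ten most frequent words.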

The text preparation is now complete, so let’s start the data visualization by plotting a bar plot of word frequency.

barplot(data.set[order(data.set,decreasing = TRUE)],las=2,space = 1.5,cex.names = .6)

Similarly, we will plot a word cloud.

wordcloud(textbag)

In a later article we will do sentiment analysis using lists of positive and negative words and classify the article as positive or negative.
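To give a rough idea of that approach, here is a minimal base R sketch. The word lists and tokens below are tiny and made up for illustration; a real analysis would use a published sentiment lexicon.

```r
# Hypothetical, tiny word lists; not a real sentiment lexicon
positive_words <- c("peace", "unity", "goodwill", "bright", "better")
negative_words <- c("fatigue", "disorders", "poor")

# Made-up tokens standing in for textbag
tokens <- c("india", "spread", "message", "peace", "unity", "goodwill", "poor")

# Count how many tokens appear in each list
n_pos <- sum(tokens %in% positive_words)
n_neg <- sum(tokens %in% negative_words)

# Classify by whichever count is larger
label <- if (n_pos > n_neg) "positive" else if (n_neg > n_pos) "negative" else "neutral"
label
```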

Please note: the text analysed here is a Times of India article, not the actual speech. You can repeat the same steps on the actual speech.

In addition, you can customize the data visualization and text mining steps.
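As one example of such customization, wordcloud() accepts word counts directly along with a few display options. This is a sketch with made-up counts standing in for data.set; the chosen option values are only suggestions.

```r
library(wordcloud)   # also loads RColorBrewer

# Made-up word counts standing in for data.set
word_freq <- c(khadi = 4, india = 5, soldiers = 4, peacekeeping = 4, children = 3, yoga = 2)

set.seed(42)  # makes the random layout reproducible
wordcloud(words = names(word_freq),
          freq = as.numeric(word_freq),
          min.freq = 2,                     # skip words seen fewer than 2 times
          random.order = FALSE,             # place the most frequent words in the centre
          colors = brewer.pal(8, "Dark2"))  # colour palette from RColorBrewer
```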
