May 18, 2013

Mining the last French presidential debate

After reading this post (thanks to its author), I thought it would be interesting to replicate the analysis, adapted to the specifics of the French language, and to see whether we can get a quick overview of the debate between Sarkozy and Hollande before the second round of the last presidential election.

Keywords: text mining, elections, France, debate, second round

We use the packages qdap (by Tyler Rinker) and tm to perform the text mining analysis, and classic packages such as ggplot2 and RColorBrewer to make our graphics look pretty.
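The word-count step behind a "top words" chart can be sketched in a few lines. This is a base R stand-in, not the qdap or tm calls used for the post; the two sentences and the tiny stop-word list are made up for illustration.

```r
# Hypothetical stand-in for one speaker's turns (not the real transcript).
turns <- c("La justice doit rester la force de la France",
           "Je veux la justice pour la France")

# Tiny illustrative stop-word list; a real analysis would use a full French list.
stop_words <- c("la", "de", "je", "pour")

# Lower-case, split on anything that is not a letter, drop empties and stop words.
# Note: how [[:alpha:]] treats accented letters depends on the locale.
words <- unlist(strsplit(tolower(turns), "[^[:alpha:]]+"))
words <- words[nzchar(words) & !(words %in% stop_words)]

# Frequency table, most frequent words first.
top_words <- sort(table(words), decreasing = TRUE)
head(top_words, 5)
```

The resulting table can be passed to a bar chart or a word cloud, one per speaker.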
For Hollande

Top words from Hollande

For Sarkozy

Top words from Sarkozy
As we can see, I have a problem handling French accented characters. If anybody has an idea...

We can also draw a quick Gantt plot with the qdap package to get some information about who led the debate. No surprise about the winner.
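On the accent problem, one common cause (an assumption here, since the input file is not shown) is text stored as Latin-1 but treated as UTF-8. A minimal sketch of the conversion with base R's iconv:

```r
# Bytes of the word "président" as stored in Latin-1 (0xE9 is the accented e).
latin1_bytes <- as.raw(c(0x70, 0x72, 0xe9, 0x73, 0x69, 0x64, 0x65, 0x6e, 0x74))
word_latin1 <- rawToChar(latin1_bytes)

# Convert explicitly from Latin-1 to UTF-8 before any text mining step.
word_utf8 <- iconv(word_latin1, from = "latin1", to = "UTF-8")

Encoding(word_utf8)  # declared encoding of the converted string
nchar(word_utf8)     # number of characters, not bytes
```

When reading from a file, the same conversion can be requested at read time with something like readLines(file("debate.txt", encoding = "latin1")), where debate.txt is a hypothetical file name.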

May 12, 2013

A new package: Quandl

Quandl is a new database management tool which seeks to become the place to find datasets. That is, each unique indicator is considered an independent data set. This helps them to seem to have a ginormous quantity of data sets. Source: Blog Econometric Simulation.
To find or load datasets, we have to authenticate through the API, as with Twitter. First, we need to create an account to receive the token passed to Quandl.auth:

Quandl.auth("XXXXXXXXXX")  ### Replace with yours
For example, we load the datasets on pollution and GDP and look for a link between them.

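# date, pollution and pib (GDP) come from the loaded Quandl datasets (loading code not shown here)
# Plot pollution in red, then overlay GDP as a dashed purple line
plot(date, pollution, col = "red", type = "o", lwd = 2, ylim = c(70, 150), ylab = "", 
    main = "GDP vs. pollution in Japan, 1994-2006")
lines(date, pib, lty = 2, col = "purple", type = "o")
# Legend labels are listed in the same order as their colours
legend("topright", legend = c("GDP", "Pollution"), col = c("purple", "red"), 
    pch = 15, bty = "n", pt.cex = 2, cex = 0.8, text.col = "black", horiz = TRUE, 
    inset = c(0.1, 0.1))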

plot of chunk unnamed-chunk-3
We fit a model to check whether GDP growth is responsible for the growth in pollution in Japan between 1994 and 2006.
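The post does not show which model was fitted, so the following is only a sketch assuming a simple linear regression of pollution on GDP. The numbers are placeholders, not the Quandl series.

```r
# Placeholder values for illustration only (NOT the data used in the post).
pib <- c(100, 102, 105, 107, 110, 114)
pollution <- c(80, 83, 85, 90, 92, 97)

# Simple linear regression: pollution explained by GDP.
fit <- lm(pollution ~ pib)
summary(fit)

# Slope of the fitted line: change in pollution per unit of GDP.
slope <- coef(fit)[["pib"]]

# Scatter plot with the fitted line on top.
plot(pib, pollution, xlab = "GDP", ylab = "Pollution")
abline(fit, col = "red")
```

With two trending time series, a good fit mostly reflects the shared trend, so a regression like this suggests an association rather than proving that GDP growth causes pollution.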

plot of chunk unnamed-chunk-4

Ecologists are not totally wrong!

The entire code to run this post
Share it!

May 6, 2013

Media monitoring 2

A quick update from our media observatory on Twitter.

Mediapart:

plot of chunk unnamed-chunk-7

Le Monde

plot of chunk unnamed-chunk-9

Le Figaro

plot of chunk unnamed-chunk-11

Le Parisien

plot of chunk unnamed-chunk-13

Overall view

plot of chunk unnamed-chunk-15
The code used to produce this post: