Tue, 08 Mar 2016

gtrendsR 1.3.3

A very nice new update to the gtrendsR package by Philippe and myself is now available via CRAN. I had only blogged about the initial 1.3.0 release, and we have since added a whole slew of new features and fixes. Philippe rewrote a lot of the parsing to make it more robust to different encodings, and to add other features. So in no particular order, we can now sub-group by regions more finely, withstand various misfeatures in returned data sets, generally do better on connections, and more --- and also allow for intra-day, daily and weekly queries!

That last part is pretty fun. Here is the code I ran last Saturday to look at the query for Donald Drumpf, a name brought to us via a beautiful John Oliver episode, well worth watching, which ran about nine days ago. So last Saturday, when we were still within the seven-day window, I ran

library(gtrendsR)
dp <- gtrends("Donald Drumpf", res="7d")
plot(dp) + ggplot2::ggtitle("The Drumpf") + ggplot2::theme(legend.position="none")

which resulted in the following chart

[Figure: Donald Drumpf query]

which highlights another nice feature: the ggplot2 object created by the plotting function is returned, so we can locally modify and tune it. Here we set a title and suppress the default legend.
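
The same pattern can be tried without a Google connection. The following is a minimal sketch, not the package's plotting code: the helper `plot_hits()` and the data frame are hypothetical, and the numbers are made-up placeholders rather than real Google Trends results. It only shows that a function returning a ggplot2 object lets the caller add layers afterwards.

```r
library(ggplot2)

## made-up placeholder values, NOT real Google Trends data
df <- data.frame(day     = as.Date("2016-03-01") + 0:6,
                 hits    = c(10, 35, 100, 60, 40, 25, 15),
                 keyword = "example term")

## hypothetical helper: builds the chart and returns the ggplot2 object
## instead of only drawing it
plot_hits <- function(d) {
    ggplot(d, aes(x = day, y = hits, colour = keyword)) + geom_line()
}

p <- plot_hits(df)

## the returned object can be modified locally: set a title, drop the legend
p2 <- p + ggtitle("Example title") + theme(legend.position = "none")
print(p2)
```

Because `p` itself is left unchanged, several variants of the same chart can be derived from a single query result.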

As I had not blogged about the interim bug-fix releases 1.3.1 and 1.3.2, here is the set of NEWS entries for the last three releases:

gtrendsR 1.3.3

  • A ggplot2 object can now be returned for further customization. Ex.: plot(gtrends("NHL")) + ggtitle("NHL trend") + theme(legend.position="none")

  • Support for hourly and daily data (#67). For example, it is now possible to have hourly data for the last seven days with gtrends("nhl", geo = "CA", res = "7d"). Use ?gtrends for more information about the time resolution supported by the package.

  • Support for categories (#46). Ex.: gtrends("NHL", geo = "US", cat = "0-20") will search only in the sport category.

  • Some countries (ex: Hong Kong) were missing from the list (#69).

  • Various typos and documentation work.

gtrendsR 1.3.2

  • Added support for sub-countries (#25). Ex.: gtrends("NHL", geo = "CA-QC") will return trends data for Québec province in Canada. The list of supported sub-countries can be obtained via data(countries).

  • Data parsing should work for any data returned by Google Trends (i.e. independent of the country).

  • Better support for queries using keywords in different languages (#50, #57). Ex.: gtrends("蘋果", geo = "TW")

  • Now able to specify up to five countries (#53) via gtrends("NHL", geo = c("CA", "US"))

  • Fixing issue #51 allowing UK-based queries via geo = "GB"

gtrendsR 1.3.1

  • Fixing issue #34 where connection verification was not done properly.

  • Now able to use more Latin characters in queries. For example: gtrends("montréal").

  • Can now deal with data returned in languages other than English.

Courtesy of CRANberries, there is also a diffstat report for this release. As always, more detailed information is on the gtrendsR repo where questions, comments etc. should go via the issue tickets system.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Edited 2016-03-08: Corrected code snippet and one grammar instance


Sun, 29 Nov 2015

gtrendsR 1.3.0 now on CRAN: Google Trends in R

Sometime earlier last year, I started to help Philippe Massicotte with his gtrendsR package---which was then still "hiding" in relative obscurity on BitBucket. I was able to assist with a few things related to internal data handling as well as package setup and package builds---but the package is really largely Philippe's. But then we both got busy, and it wasn't until this summer at the excellent useR! 2015 conference that we met and concluded that we really should finish the package. And we both remained busy...

Lo and behold, following a recent transfer to this GitHub repository, we finalised a number of outstanding issues. And Philippe was even kind enough to label me a co-author. And now the package is on CRAN as of yesterday. So install.packages("gtrendsR") away and enjoy!

Here is a quick demo:

## load the package, and if options() are set appropriately, connect
## alternatively, also run   gconnect("someuser", "somepassword")
library(gtrendsR)

## using the default connection, run a query for three terms
res <- gtrends(c("nhl", "nba", "nfl"))

## plot (in default mode) as time series
plot(res)

## plot via googleVis to browser
## highlighting regions (probably countries) and cities
plot(res, type = "region")
plot(res, type = "cities")

The time series (default) plot for this query came out as follows a couple of days ago:

[Figure: Example of gtrendsR query and plot]

One really nice feature of the package is the rather rich data structure. The result set for the query above is actually stored in the package and can be accessed. It contains a number of components:

R> data(sport_trend)
R> names(sport_trend)
[1] "query"     "meta"      "trend"     "regions"   "topmetros"
[6] "cities"    "searches"  "rising"    "headers"  
R>

So not only can one look at trends, but also at regions, metropolitan areas, and cities --- even plot this easily via package googleVis which is accessed via options in the default plot method. Furthermore, related searches and rising queries may give leads to dynamics within the search.
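
As a hedged sketch of how those components can be inspected: the snippet below assumes that the installed gtrendsR version still ships the sport_trend data set with the component names listed above (as 1.3.0 did). Later versions may differ, so the code checks first and only prints a message otherwise.

```r
## Assumption: 'sport_trend' is available in the installed gtrendsR
## (it was in 1.3.0); check before loading rather than fail
has_data <- requireNamespace("gtrendsR", quietly = TRUE) &&
    "sport_trend" %in% data(package = "gtrendsR")$results[, "Item"]

if (has_data) {
    data("sport_trend", package = "gtrendsR")
    ## each component is a named element of the result object
    print(names(sport_trend))
    ## peek at the structure of the time series without printing everything
    str(sport_trend$trend, max.level = 1)
} else {
    message("sport_trend is not available in this gtrendsR installation")
}
```

The other components named above, such as regions or cities, can be reached with the same `$` access.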

Please use the standard GitHub issue system for bug reports, suggestions and the like.
