API, Google Trends, Python
Table Of Contents
- Frequently Asked Questions
- Why use the Google Trends API instead of the Google Trends Web interface?
- What do Google Trends values actually denote?
- What data can you pull with the Google Trends API?
- What parameters can you specify in your queries?
- Are there any limitations to using the Pytrends Google Trends API?
- 1. Search terms and topics are two different things
- 2. Disproportionate results
- 3. Keyword Length Limitations
- 4. All data is relative, not absolute
- 5. The categories are unreliable at best.
- 6. You can only provide five entries per chart.
- What API Methods are available with the Google Trends API?
- autoComplete
- dailyTrends
- interestOverTime
- interestByRegion
- realtimeTrends
- relatedQueries
- relatedTopics
- What to look out for next…
Google Trends is a public platform that you can use to analyze interest over time for a given topic, search term, and even company.
Pytrends is an unofficial Google Trends API for Python that provides methods for downloading reports from Google Trends. The package is useful for automating tasks such as quickly fetching data for later analysis.
In this article, I will share some insights on what you can do with Pytrends and how to do basic data pulls, providing snippets of Python code along the way. I will also answer some FAQs about Google Trends and, most importantly, address the limitations of using the API and the data.
Why use the Google Trends API instead of the Google Trends Web interface?
There is nothing wrong with using the web interface. However, for a large-scale project that requires building a large dataset, it can become very cumbersome.
Manually researching and copying data from the Google Trends site is time-intensive. Using an API cuts this time and effort dramatically.
What do Google Trends values actually denote?
According to Google Trends, the values are calculated on a scale from 0 to 100, where 100 is the location with the most popularity as a fraction of total searches in that location, and a value of 50 indicates a location that is half as popular. A value of 0 indicates a location where there was not enough data for this term.
What data can you pull with the Google Trends API?
Related to a particular keyword you provide to the API, you can pull the following data:
- Interest Over Time
- Historical Hourly Interest
- Interest by Region
- Related Topics
- Related Queries
- Trending Searches
- Top Charts
- Keyword Suggestions
We will explore the methods the API offers for pulling this data shortly, along with the syntax for each of them.
What parameters can you specify in your queries?
There are two objects that you can specify parameters for:
- optionsObject
- callback
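The callback is an optional function, where the first parameter is an error and the second parameter is the result. If no callback is provided, then a promise is returned. This calling convention comes from the JavaScript google-trends-api package:
```javascript
const googleTrends = require('google-trends-api');

// apiMethod stands for any of the methods listed later; the callback is optional
googleTrends.apiMethod(optionsObject, [callback])
```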
The optionsObject is an object with the following option keys:
- keyword (required): target search term(s), a string or array.
- startTime: start of the time period of interest (new Date() object). If startTime is not provided, January 1, 2004 is assumed, as this is the oldest available Google Trends data.
- endTime: end of the time period of interest (new Date() object). If endTime is not provided, the current date is selected.
- geo: location of interest (string, or array if you wish to provide separate locations for each keyword).
- hl: preferred language (string, defaults to English).
- timezone: timezone (number, defaults to the time zone difference, in minutes, from UTC to the current locale, taken from the host system settings).
- category: the category to search within (number, defaults to all categories).
- property: Google property to filter on. Defaults to web search. (Enumerated string: 'images', 'news', 'youtube' or 'froogle', the latter relating to Google Shopping results.)
- resolution: granularity of the geo search (enumerated string: 'COUNTRY', 'REGION', 'CITY', 'DMA'). resolution is specific to the interestByRegion method.
- granularTimeResolution: Boolean that dictates if the results should be given in a finer time resolution (if the span between startTime and endTime is less than one day, this should be set to true).
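The option keys above use the JavaScript package's names. In pytrends, roughly the same settings are passed to TrendReq (hl, tz) and build_payload (kw_list, cat, timeframe, geo, gprop). The sketch below is one way to map them, assuming the 'YYYY-MM-DD YYYY-MM-DD' timeframe format that pytrends documents; the helper names make_timeframe and build_request are illustrative and not part of the library.

```python
from datetime import date


def make_timeframe(start, end):
    """Format two dates as the 'YYYY-MM-DD YYYY-MM-DD' string pytrends expects."""
    if start > end:
        raise ValueError("start must not be after end")
    return f"{start.isoformat()} {end.isoformat()}"


def build_request(keywords, start, end, geo="", category=0, gprop="",
                  hl="en-US", tz=360):
    """Hypothetical helper: create a TrendReq session and load the query."""
    # Imported lazily so the date helper works without pytrends installed.
    from pytrends.request import TrendReq

    pytrend = TrendReq(hl=hl, tz=tz)           # language and timezone offset
    pytrend.build_payload(
        kw_list=keywords,                      # "keyword" in the list above
        cat=category,                          # "category"
        timeframe=make_timeframe(start, end),  # "startTime" and "endTime"
        geo=geo,                               # "geo"
        gprop=gprop,                           # "property", e.g. 'news'
    )
    return pytrend


timeframe = make_timeframe(date(2020, 10, 1), date(2021, 10, 1))
print(timeframe)
```

Calling build_request needs network access, so only the date helper is executed here.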
Are there any limitations to using the Pytrends Google Trends API?
Yes, there are. Before you begin you must be aware of these few things:
1. Search terms and topics are two different things
Search terms and topics are measured differently, so relatedTopics will not work with comparisons that contain both search terms and topics.
The distinction also leads to duplicate entries, which is easy to observe in the Google Trends UI: it sometimes offers several topics for the same phrase.
2. Disproportionate results
When using the interest_by_region method (interestByRegion), a higher value means a higher proportion of all queries, not a higher absolute query count.
So a small country where 80% of the queries are for “Google” will get twice the score of a giant country where only 40% of the queries are for that term.
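The 80% versus 40% example can be worked through in a few lines. This is a simplified sketch of proportional scoring, not Google's actual computation, and the query counts are invented for illustration.

```python
def regional_scores(counts):
    """Score regions by share of queries, scaled so the top share is 100.

    counts maps region -> (queries_for_term, total_queries).
    """
    shares = {region: term / total for region, (term, total) in counts.items()}
    peak = max(shares.values())
    if peak == 0:
        return {region: 0 for region in shares}
    return {region: round(100 * share / peak) for region, share in shares.items()}


# Invented numbers: the giant country has 500x more queries for the term
counts = {
    "Small country": (8_000, 10_000),           # 80% of all queries
    "Giant country": (4_000_000, 10_000_000),   # 40% of all queries
}
scores = regional_scores(counts)
print(scores)  # {'Small country': 100, 'Giant country': 50}
```

The giant country has far more searches for the term in absolute terms, yet it receives half the score.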
3. Keyword Length Limitations
Google returns a response with status code 400 when a keyword is longer than 100 characters.
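One way to avoid that error is to screen keywords before sending them. A minimal sketch, with the 100-character limit taken from the statement above and a hypothetical helper name:

```python
MAX_KEYWORD_LENGTH = 100  # limit as stated above


def split_by_length(keywords, limit=MAX_KEYWORD_LENGTH):
    """Return (usable, too_long) so oversized keywords never reach the API."""
    usable = [kw for kw in keywords if len(kw) <= limit]
    too_long = [kw for kw in keywords if len(kw) > limit]
    return usable, too_long


usable, too_long = split_by_length(["Netflix", "x" * 101, "y" * 100])
print(len(usable), len(too_long))  # 2 1
```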
4. All data is relative, not absolute
The data Google Trends shows you are relative, not absolute. Forbes Baxter Associates explains this neatly:
Look at the chart for searches in 2019. When you see the red line on the chart reaching 100 in about June, it doesn’t mean there were 100 searches for that term in June. It means that was the most popular search in 2019 and that it hit its peak in June.
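To see why absolute counts cannot be read off the chart, consider scaling a series to its own peak. The monthly counts below are made up, and the rounding rule is an assumption rather than Google's documented behavior.

```python
def scale_to_peak(series):
    """Rescale values so the largest becomes 100, as a relative index."""
    peak = max(series.values())
    if peak == 0:
        return {label: 0 for label in series}
    return {label: round(100 * value / peak) for label, value in series.items()}


# Made-up monthly search counts
monthly = {"Apr": 1200, "May": 1800, "Jun": 2400, "Jul": 600}
scaled = scale_to_peak(monthly)
print(scaled)  # {'Apr': 50, 'May': 75, 'Jun': 100, 'Jul': 25}

# Ten times as many searches produce exactly the same chart
ten_times = {label: value * 10 for label, value in monthly.items()}
print(scale_to_peak(ten_times) == scaled)  # True
```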
5. The categories are unreliable at best.
There are some top-level categories, but they are not representative of real interest.
In some cases the category data does not match real-world behavior, which may be due to a lack of understanding on the searcher's part, falsely attributed intent, or an algorithm bug.
Another limitation is that you can only pick one category per query. If you need more than one because the data differs between two categories, consolidating, visualizing, and analyzing the results becomes a challenge.
6. You can only provide five entries per chart.
This can be really annoying. If you are using the API for professional purposes, such as analyzing a particular market, this makes the reporting really challenging.
Most markets have more than five competitors in them. Most topics have more than five keywords in them. Comparisons need context in order to work.
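A common workaround is to split a longer keyword list into batches of five that all share one anchor keyword, then rescale each batch so the anchor lines up across them. The sketch below illustrates that idea with plain Python; it is not a pytrends feature, the function names are hypothetical, and the rescaled values are only approximate because every batch is rounded to integers independently.

```python
def batch_with_anchor(keywords, anchor, max_terms=5):
    """Split keywords into batches of at most max_terms, each led by the anchor."""
    others = [kw for kw in keywords if kw != anchor]
    step = max_terms - 1
    return [[anchor] + others[i:i + step] for i in range(0, len(others), step)]


def rescale_to_reference(batch, anchor, reference_anchor_mean):
    """Scale one batch so its anchor mean matches the reference batch.

    batch maps keyword -> list of interest values from a single request.
    """
    anchor_mean = sum(batch[anchor]) / len(batch[anchor])
    if anchor_mean == 0:
        raise ValueError("anchor has no data in this batch")
    factor = reference_anchor_mean / anchor_mean
    return {kw: [value * factor for value in values]
            for kw, values in batch.items()}


batches = batch_with_anchor(["a", "b", "c", "d", "e", "f", "g"], anchor="a")
print(batches)  # [['a', 'b', 'c', 'd', 'e'], ['a', 'f', 'g']]

# Made-up values: the anchor averaged 25 in the first batch and 50 in this one
second = {"a": [50, 50], "f": [100, 20]}
aligned = rescale_to_reference(second, "a", reference_anchor_mean=25)
print(aligned)  # {'a': [25.0, 25.0], 'f': [50.0, 10.0]}
```

Picking an anchor with steady, mid-range interest tends to work better than a very popular or very rare term, since either extreme pushes the other keywords toward 0 or 100.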
What API Methods are available with the Google Trends API?
The following API methods are available:
autoComplete
Returns the results from the “Add a search term” input box in the Google Trends UI.
```python
# install pytrends (notebook syntax; use "pip install pytrends" in a terminal)
!pip install pytrends

# import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

# Get Google Keyword Suggestions
keywords = pytrend.suggestions(keyword='Facebook')
df = pd.DataFrame(keywords)
df.head(5)
```
dailyTrends
Daily Search Trends highlights searches that jumped significantly in traffic among all searches over the past 24 hours and updates hourly.
These trends highlight the specific queries that were searched and the absolute number of searches made.
20 daily trending search results are returned. A retroactive search going back up to 15 days can also be performed.
```python
# install pytrends (notebook syntax; use "pip install pytrends" in a terminal)
!pip install pytrends

# import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

# get today's trending topics
trendingtoday = pytrend.today_searches(pn='US')
trendingtoday.head(20)
```
You can also get the topics that were trending historically, for instance for a particular year.
```python
# Get Google Top Charts
df = pytrend.top_charts(2020, hl='en-US', tz=300, geo='GLOBAL')
df.head()
```
interestOverTime
Numbers represent search interest relative to the highest point on the chart for the given region and time.
If you use multiple keywords for comparison, the returned data will also contain an average result for each keyword.
You can check the regional interest for multiple search terms.
```python
# import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

# provide your search terms (can also be competitors)
kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

# load the query for the last month
pytrend.build_payload(kw_list, timeframe='today 1-m')

# Interest by Region
regiondf = pytrend.interest_by_region()

# keep only the rows where every keyword has a non-zero value
regiondf = regiondf[(regiondf != 0).all(axis=1)]

# drop all rows that have null values in all columns
regiondf = regiondf.dropna(how='all', axis=0)

# visualise
regiondf.plot(figsize=(20, 12), y=kw_list, kind='bar')
```
You can also get historical interest by specifying a time period.
```python
# historical (hourly) interest
# note: get_historical_interest may not be available in newer pytrends
# releases, so check the version you have installed
historicaldf = pytrend.get_historical_interest(
    kw_list,
    year_start=2020, month_start=10, day_start=1, hour_start=0,
    year_end=2021, month_end=10, day_end=1, hour_end=0,
    cat=0, geo='', gprop='', sleep=0,
)

# visualise
# plot a time series chart
historicaldf.plot(figsize=(20, 12))

# plot separate graphs, one per provided keyword
historicaldf.plot(subplots=True, figsize=(20, 12))
```
This has to be my favorite one as it enables super cool additional projects such as forecasting, calculating the share of search (if using competitors as input) and other cool mini-projects.
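As an illustration of the share-of-search idea, the sketch below divides each keyword's average interest by the total across all keywords. It assumes the data comes from pytrends' interest_over_time(), which returns one column per keyword plus an isPartial flag; the helper names and the sample averages are made up for the example.

```python
def share_of_search(mean_interest):
    """Turn average interest per keyword into each keyword's share of the total."""
    total = sum(mean_interest.values())
    if total == 0:
        return {kw: 0.0 for kw in mean_interest}
    return {kw: value / total for kw, value in mean_interest.items()}


def fetch_mean_interest(kw_list, timeframe="today 12-m"):
    """Hypothetical helper: average interest over time per keyword (needs network)."""
    # Imported lazily so share_of_search works without pytrends installed.
    from pytrends.request import TrendReq

    pytrend = TrendReq()
    pytrend.build_payload(kw_list, timeframe=timeframe)
    df = pytrend.interest_over_time()
    if df.empty:
        return {}
    # isPartial marks incomplete periods and is not a keyword column
    df = df.drop(columns="isPartial", errors="ignore")
    return df.mean().to_dict()


# Made-up averages standing in for fetch_mean_interest(...) output
sample = {"Facebook": 60.0, "Netflix": 30.0, "Amazon": 10.0}
shares = share_of_search(sample)
print(shares)  # {'Facebook': 0.6, 'Netflix': 0.3, 'Amazon': 0.1}
```

Because all keywords in one request are scaled against the same peak, their values are comparable with each other, which is what makes this ratio meaningful within a single batch.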
interestByRegion
This allows examining search term popularity based on location during the specified time frame.
Values are calculated on a scale from 0 to 100, where 100 is the location with the most popularity as a fraction of total searches in that location, a value of 50 indicates a location that is half as popular, and a value of 0 indicates a location where the term was less than 1% as popular as the peak.
```python
# install pytrends (notebook syntax; use "pip install pytrends" in a terminal)
!pip install pytrends

# import the libraries
import pandas as pd
from pytrends.request import TrendReq

# create the request object
pytrend = TrendReq()

# provide your search terms
kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

# get interest by region for your search terms
pytrend.build_payload(kw_list=kw_list)
df = pytrend.interest_by_region()
df.head(10)
```
realtimeTrends
Realtime Search Trends highlight stories that are trending across Google surfaces within the last 24 hours and are updated in real-time.
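```python
# install pytrends (notebook syntax; use "pip install pytrends" in a terminal)
!pip install pytrends

# import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

# Get the latest trending searches for the United States
# (pytrends also provides realtime_trending_searches(pn='US') for realtime stories)
df = pytrend.trending_searches(pn='united_states')
df.head()
```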
relatedQueries
Users searching for your term also searched for these queries. The following metrics are returned:
- Top — The most popular search queries. Scoring is on a relative scale where a value of 100 is the most commonly searched query, 50 is a query searched half as often, and a value of 0 is a query searched for less than 1% as often as the most popular query.
- Rising — Queries with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these queries are new and had few (if any) prior searches.
Check out the full code in the Colab link.
```python
# install pytrends (notebook syntax; use "pip install pytrends" in a terminal)
!pip install pytrends

# import the libraries
import pandas as pd
from pytrends.request import TrendReq
from google.colab import files  # only available inside Google Colab

# create the request object
pytrend = TrendReq()

# provide your search terms
kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']
pytrend.build_payload(kw_list=kw_list)

# get related queries: a dict of {keyword: {'top': df, 'rising': df}}
related_queries = pytrend.related_queries()
related_queries.values()

# take the top and rising tables of the first keyword
top = list(related_queries.values())[0]['top']
rising = list(related_queries.values())[0]['rising']

# convert to dataframes
dftop = pd.DataFrame(top)
dfrising = pd.DataFrame(rising)

# join the two dataframes side by side
joindfs = [dftop, dfrising]
allqueries = pd.concat(joindfs, axis=1)

# make duplicated column names unique (query, value, query.1, value.1)
cols = pd.Series(allqueries.columns)
for dup in allqueries.columns[allqueries.columns.duplicated(keep=False)]:
    cols[allqueries.columns.get_loc(dup)] = (
        [dup + '.' + str(d_idx) if d_idx != 0 else dup
         for d_idx in range(allqueries.columns.get_loc(dup).sum())]
    )
allqueries.columns = cols

# rename to proper names
allqueries.rename({'query': 'top query', 'value': 'top query value',
                   'query.1': 'related query', 'value.1': 'related query value'},
                  axis=1, inplace=True)

# check your dataset
allqueries.head(50)

# save to csv
allqueries.to_csv('allqueries.csv')

# download from Colab
files.download("allqueries.csv")
```
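The code above keeps only the first keyword's tables. If you want every keyword in one long table instead, something like the following could work. It is a sketch that assumes related_queries() returns a dictionary of the form {keyword: {'top': DataFrame or None, 'rising': DataFrame or None}}; flatten_related is a hypothetical helper and the sample data is invented so the example runs offline.

```python
import pandas as pd


def flatten_related(related):
    """Stack the nested top/rising tables of every keyword into one long table."""
    frames = []
    for keyword, tables in related.items():
        for kind in ("top", "rising"):
            table = tables.get(kind)
            # a keyword without enough data has None (or an empty table) here
            if table is None or table.empty:
                continue
            frames.append(table.assign(keyword=keyword, kind=kind))
    if not frames:
        return pd.DataFrame(columns=["query", "value", "keyword", "kind"])
    return pd.concat(frames, ignore_index=True)


# Invented sample shaped like the related_queries() result
sample = {
    "Facebook": {
        "top": pd.DataFrame({"query": ["facebook login"], "value": [100]}),
        "rising": None,
    },
    "Netflix": {
        "top": pd.DataFrame({"query": ["netflix series"], "value": [100]}),
        "rising": pd.DataFrame({"query": ["netflix new"], "value": [250]}),
    },
}
flat = flatten_related(sample)
print(flat)
```

A long table with keyword and kind columns avoids the duplicate column names that the side-by-side join has to repair.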
relatedTopics
Users searching for your term also searched for these topics. The following metrics are returned:
- Top — The most popular topics. Scoring is on a relative scale where a value of 100 is the most commonly searched topic, a value of 50 is a topic searched half as often, and a value of 0 is a topic searched for less than 1% as often as the most popular topic.
- Rising — Related topics with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these topics are new and had few (if any) prior searches.
The syntax here is the same as above; only the two lines that call related_queries change:
```python
# Related Topics, returns a dictionary of dataframes
related_topic = pytrend.related_topics()
related_topic.values()
```
What to look out for next…
Hope you enjoyed this exploration.
You can find all of the code compiled into one Colab notebook below.
My next article will explore and go in-depth into 4 Beginner-Friendly Python Projects You Can Use Google Trends In.
Stay tuned and thanks for reading.
In the meantime, check out these resources created by brilliant people:
- Google Trends API exploration: Google Trends API for Python
- Script as a function for getting daily search data using Pytrends: pytrends/dailydata.py at master · GeneralMills/pytrends
The Ultimate Guide to PyTrends: the Google Trends API (with Python code examples)