This tutorial will show you how to scrape Google Maps data for free. We will explore two major methods to scrape data from Google Maps, making it easier for you to access and use the data you need.
Here are the two methods to scrape Google Maps data:
- Building a web scraper in Python or JavaScript.
- Using the ScrapeHero Cloud, a no-code scraping tool.
We will dive into each of these methods so that you’ll be able to scrape Google Maps for your data needs.
Building a web scraper in Python or JavaScript to extract Google Maps data
In this section, we will guide you on how to scrape data from Google Maps using either Python or JavaScript. We will utilize the browser automation framework called Playwright to emulate browser behavior in our code.
One advantage of this method is its ability to bypass common blocks put in place to prevent scraping. However, a disadvantage is that it requires a deeper understanding of the Playwright API to use it effectively.
You could also use Python Requests, LXML, or Beautiful Soup to build a Google Maps scraper without using a browser or a browser automation library. However, bypassing the anti-scraping mechanisms put in place can be challenging and is beyond the scope of this article.
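To illustrate what the browserless approach involves, here is a minimal, standard-library-only sketch of its parsing half. The HTML string below is a simplified stand-in for a fetched page, and the `title` class is hypothetical; real Google Maps results are rendered with JavaScript, which is a big part of why this approach is hard in practice.

```python
# A stdlib-only sketch of parsing fetched HTML without a browser.
# SAMPLE_HTML and the "title" class are illustrative stand-ins; real
# Google Maps pages render their listings with JavaScript.
from html.parser import HTMLParser

SAMPLE_HTML = """
<div role="article"><span class="title">Joe's Pizza</span></div>
<div role="article"><span class="title">Best Bagels</span></div>
"""

class TitleParser(HTMLParser):
    """Collects the text of <span class="title"> elements."""
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == 'span' and ('class', 'title') in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'span':
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles.append(data.strip())

parser = TitleParser()
parser.feed(SAMPLE_HTML)
print(parser.titles)  # ["Joe's Pizza", 'Best Bagels']
```

With a browser automation library such as Playwright, the page is fully rendered before parsing, which is why the rest of this tutorial takes that route.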
Here are the steps to scrape Google Maps data using Playwright:
- Choose either Python or JavaScript as your programming language.
- Install Playwright for your preferred language:
Python
pip install playwright
# to download the necessary browsers
playwright install
Javascript
npm install playwright@latest
- Write your code to emulate browser behavior and extract the desired data from Google Maps using the Playwright API. You can use the code provided below:
Python (scraper.py)
import asyncio
import json

from playwright.async_api import Playwright, async_playwright
from playwright.async_api import TimeoutError as PlaywrightTimeoutError


async def extract_details(page):
    """
    Extracts the results information from the page

    Args:
        page: Playwright page object

    Returns:
        A list containing details of results as dictionary. The dictionary
        has title, review count, rating, address of various results
    """
    # defining selectors
    result_container_selector = 'div[role="article"]'
    title_selector = '.fontHeadlineSmall span'
    review_text_selector = '.ZkP5Je span'
    address_selector = (
        'div.W4Efsd div.W4Efsd:nth-of-type(1) span[jsinstance="*1"] '
        'span:not([aria-hidden="true"]):not([style*="none"])'
    )
    phone_selector = (
        'div.W4Efsd div.W4Efsd:nth-of-type(2) span[jsinstance="*1"] '
        'span:not([aria-hidden="true"]):not([style*="none"])'
    )

    results_parsed = []
    results = page.locator(result_container_selector)

    # iterating through all the displayed results
    for result_idx in range(await results.count()):
        # extracting individual results
        result_elem = results.nth(result_idx)

        # extracting the title
        title = await result_elem.locator(title_selector).inner_text()

        # extracting and cleaning review details
        review_raw = await result_elem.locator(
            review_text_selector).all_inner_texts()
        rating = float(review_raw[0])
        review_count = review_raw[1].replace('(', '').replace(')', '')

        # extracting address
        try:
            address = await result_elem.locator(address_selector).inner_text()
        except PlaywrightTimeoutError:
            # address may not be available
            address = None

        # extracting phone
        try:
            phone = await result_elem.locator(phone_selector).inner_text()
        except PlaywrightTimeoutError:
            # phone may not be available
            phone = None

        data = {
            'title': title,
            'review_count': review_count,
            'rating': rating,
            'address': address,
            'phone': phone
        }
        results_parsed.append(data)
    return results_parsed


async def run(playwright: Playwright) -> None:
    """
    Main function which launches browser instance and performs browser
    interactions

    Args:
        playwright: Playwright instance
    """
    browser = await playwright.chromium.launch(
        headless=False,
        proxy={'server': 'proxy url here'}
    )
    context = await browser.new_context()

    # overriding default timeout
    context.set_default_timeout(100000)

    search_term = "dentist in New York City, NY, USA"

    # Open new page
    page = await context.new_page()

    # Go to https://www.google.com/maps/
    await page.goto("https://www.google.com/maps/")

    # Click the search box
    await page.locator('[aria-label="Search Google Maps"]').click()

    # Fill in the search term
    await page.locator('input[name="q"]').fill(search_term)

    # Click the search button
    await page.locator('button[id="searchbox-searchbutton"]').click()

    # waiting for results to be displayed on the page
    await page.wait_for_selector('div[role="article"]')

    results = await extract_details(page)

    # saving the data
    with open('restaurant_data.json', 'w') as f:
        json.dump(results, f, indent=2)

    await context.close()
    await browser.close()


async def main() -> None:
    async with async_playwright() as playwright:
        await run(playwright)


asyncio.run(main())
Javascript (scraper.js)
// importing required modules
import fs from 'fs';

// initializing the required browser
import playwright from 'playwright';

/**
 * Open browser, goto given url and collect data
 */
async function run() {
    const browser = await playwright.chromium.launch({
        headless: false
    });
    const context = await browser.newContext({
        proxy: { server: 'http://ProxyIP:Port' }
    });

    // Open new page
    const page = await context.newPage();

    // Go to https://www.google.com/maps
    await page.goto('https://www.google.com/maps', { waitUntil: 'load' });

    // Click on search tab
    await page.locator('[aria-label="Search Google Maps"]').click();

    // Enter the search query
    await page.locator('[aria-label="Search Google Maps"]')
        .type('restaurants near New York, NY, USA', { delay: 500 });

    // Press Enter after entering the query
    await page.locator('[aria-label="Search Google Maps"]').press('Enter');

    let ListingPageData = await extractDetails(page);

    // Save data as JSON
    const jsonData = JSON.stringify(ListingPageData);
    saveJSONFile(jsonData);

    // Closing browser context after use
    await context.close();
    await browser.close();
};

/**
 * Extract data from HTML content
 * @param page - Page object
 * @returns {JSON} - Return collected data in JSON format
 */
async function extractDetails(page) {
    // Wait for results
    let listedProductSelector = 'div[role="article"]';
    await page.waitForSelector(listedProductSelector);
    let results = page.locator(listedProductSelector);

    // Now we need to collect details from HTML content.
    let ListingPageData = [];
    let resultCount = await results.count();

    // All the selectors used to collect data
    let reviewTextSelector = '.ZkP5Je span';
    let addressSelector = 'div.W4Efsd div.W4Efsd:nth-of-type(1) span[jsinstance="*1"] span:not([aria-hidden="true"]):not([style*="none"])';
    let phoneSelector = 'div.W4Efsd div.W4Efsd:nth-of-type(2) span[jsinstance="*1"] span:not([aria-hidden="true"]):not([style*="none"])';
    let titleSelector = '.fontHeadlineSmall span';

    // Iterate through each search result and save data to a list variable
    for (let i = 0; i < resultCount; i++) {
        let resultElem = results.nth(i);
        let title = await resultElem.locator(titleSelector).innerText();
        let reviewRaw = await resultElem.locator(reviewTextSelector).allInnerTexts();
        let rating = reviewRaw[0];
        let reviewCount = reviewRaw[1].replace('(', '').replace(')', '');
        let address = null;
        let phone = null;
        try {
            address = await resultElem.locator(addressSelector).innerText();
        } catch (err) {
            console.log("Address was not found!");
        };
        try {
            phone = await resultElem.locator(phoneSelector).innerText();
        } catch (err) {
            console.log("Phone number was not found!");
        };
        let productData = {
            title: title,
            reviewCount: reviewCount,
            rating: rating,
            address: address,
            phone: phone
        };
        console.log(productData);
        ListingPageData.push(productData);
    };
    return ListingPageData;
};

/**
 * Save JSON data to .json file
 * @param jsonData - Extracted data in JSON format
 */
async function saveJSONFile(jsonData) {
    fs.writeFile("data.json", jsonData, 'utf8', function (err) {
        if (err) {
            console.log("An error occurred while writing JSON Object to File.");
            return console.log(err);
        };
        console.log("JSON file has been saved.");
    });
};

run();
If you don't like or want to code, ScrapeHero Cloud is just right for you!
Skip the hassle of installing software, programming, and maintaining the code. Download this data using ScrapeHero Cloud within seconds.
Get Started for Free
This code shows how to scrape restaurant information from Google Maps using the Playwright library in both Python and JavaScript.
The corresponding scripts have two main functions, namely:
- run function: This function takes a Playwright instance as input and performs the scraping process. The function launches a Chromium browser instance, navigates to Google Maps, fills in a search term, clicks the search button, and waits for the results to be displayed on the page. The extract_details function is then called to extract the restaurant details and store the data in a JSON file.
- extract_details function: This function takes a Playwright page object as input and returns a list of dictionaries containing restaurant details. The details include the title, review count, rating, address, and phone number of each restaurant.
Finally, the main function uses the async_playwright context manager to execute the run function. Running the script creates a JSON file containing the Google Maps listings that were scraped.
The selectors utilized in this tutorial may vary based on the location from which Google Maps is accessed, since Google dynamically renders different page structures for different regions. The selectors used in this tutorial were generated while accessing Google Maps from the United States.
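Once the script has run, the saved JSON can be post-processed like any other data file. The sketch below ranks results by rating; the sample records are illustrative stand-ins that mirror the schema produced by extract_details, and in a real run you would load them from the JSON file the script wrote.

```python
# A sketch of post-processing the scraped results. The sample records
# are illustrative; a real run would load them with json.load() from
# the file the scraper wrote.
import json

sample_json = json.dumps([
    {'title': 'Midtown Dental', 'review_count': '1,203', 'rating': 4.5,
     'address': '5th Ave, New York', 'phone': None},
    {'title': 'Downtown Dental', 'review_count': '86', 'rating': 4.8,
     'address': 'Wall St, New York', 'phone': '(212) 555-0100'},
])
records = json.loads(sample_json)

def review_count(record):
    # review counts are scraped as strings such as "1,203"
    return int(record['review_count'].replace(',', ''))

# rank by rating first, then by number of reviews
ranked = sorted(records, key=lambda r: (r['rating'], review_count(r)),
                reverse=True)
print([r['title'] for r in ranked])  # ['Downtown Dental', 'Midtown Dental']
```

Note that review counts come out of the scraper as strings with thousands separators, so they need cleaning before any numeric comparison.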
- Run your code and collect the scraped data from Google Maps.
Python
python scraper.py
Javascript
node scraper.js
You can view the complete code on GitHub.
Using No-Code Google Maps Scraper by ScrapeHero Cloud
The Google Maps scraper by ScrapeHero Cloud is a convenient solution for scraping search results from Google Maps. It provides an easy, no-code method for scraping data, making it accessible for individuals with limited technical skills.
In this section, we’ll guide you through the steps required to set up and use the Google Maps scraper.
- Sign up or log in to your ScrapeHero Cloud account.
- Go to the Google Maps Search Results scraper by ScrapeHero Cloud in the marketplace.
- Add the scraper to your account. (Don’t forget to verify your email if you haven’t already.)
- If you want to scrape results for just one query, enter it in the field provided and choose the number of pages to scrape.
- If you want to scrape multiple queries, switch to Advanced Mode, and in the Input tab, add the queries to the SearchQuery field and save the settings.
- To start the scraper, click on the Gather Data button.
- The scraper will start fetching data for your queries, and you can track its progress under the Jobs tab.
- Once it is finished, you can view or download the data from the same tab.
- You can also pull Google Maps data into an Excel spreadsheet from here. Just click on Download Data, select "Excel," and open the downloaded file using Microsoft Excel.
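The downloaded data can also be processed programmatically. A minimal sketch, assuming a CSV-style export: the column names below are illustrative, not the exact ScrapeHero export schema.

```python
# Computing a summary statistic from a downloaded export. The sample
# CSV and its columns are illustrative; a real run would open the
# downloaded file instead of an in-memory string.
import csv
import io

sample_csv = (
    "title,rating,review_count\n"
    "Joe's Pizza,4.5,1203\n"
    "Best Bagels,4.2,860\n"
)

rows = list(csv.DictReader(io.StringIO(sample_csv)))
average_rating = sum(float(r['rating']) for r in rows) / len(rows)
print(round(average_rating, 2))  # 4.35
```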
Use cases of Google Maps data
Here are some use cases for the location data from Google Maps:
- Location-based marketing: Location data from Google Maps can be used to target advertising and promotional messages to users based on their location.
- Lead generation: Analyzing business locations, contact information, and other data points can help in generating leads mainly for B2B opportunities based on location.
- Visitor insights: Using the "popular times" data point from the Google Maps scraper, you can generate insights into customer trends for a particular business listing.
- Brand sentiment: Reviews and ratings left by customers on Google Maps business listings can help determine the general sentiment toward a particular business.
- Competitor analysis: Google Maps data can be used to map out competitor locations, analyze competitor reviews and activities, such as hours of operation and new products, and identify gaps in the market.
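The brand-sentiment use case above can be sketched in a few lines: bucketing scraped star ratings into rough sentiment labels. The thresholds and business names here are purely illustrative.

```python
# A toy sketch of the brand-sentiment use case. The 4.0 and 3.0
# thresholds are illustrative assumptions, not an established rule.
def sentiment(rating):
    if rating >= 4.0:
        return 'positive'
    if rating >= 3.0:
        return 'mixed'
    return 'negative'

listings = [('Midtown Dental', 4.6), ('Downtown Dental', 3.4),
            ('Uptown Dental', 2.1)]
labels = {name: sentiment(rating) for name, rating in listings}
print(labels)
```

A fuller analysis would weigh the review text itself, but even rating buckets like these can surface outliers worth a closer look.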
Frequently Asked Questions
What is Google Maps scraping?
Google Maps scraping is the process of extracting data from the listings shown for a particular search query on Google Maps. This involves scraping details like business names, addresses, phone numbers, reviews, popular times, and other data points.
Doesn’t Google Maps have an API? Why use ScrapeHero Cloud instead?
The official Google Maps API is costly and challenging to set up, and the cost becomes significant for commercial and other large-scale projects. The Google Maps scraper by ScrapeHero Cloud provides a cheaper alternative. Also, the official API is limited in its customization options and adds an additional dependency to the project.
What is the subscription fee for the Google Maps Scraper by ScrapeHero?
You can subscribe to ScrapeHero Cloud for as low as $5 for 300 page credits. You can see all pricing plans for our Google Maps scraper here.
What are the data categories you get by scraping Google Maps?
You can scrape:
- Business listings and their details, using the Google Maps Scraper by ScrapeHero Cloud.
- Reviews of businesses, using the Google Reviews Scraper by ScrapeHero Cloud.
Is it legal to scrape Google Maps?
Legality depends entirely on the legal jurisdiction, i.e., laws are specific to the country and locality. Web scraping is generally considered legal if you are scraping publicly available data.
Please refer to our Legal Page to learn more about legality of web scraping.