How to Scrape Booking.com for Hotel Data

How to scrape Booking.com
Install the packages needed for running the Booking Scraper
The Code
Create Selectorlib Template to Scrape Hotel Data from Booking.com Search Results
Running the Web Scraper

This tutorial will show you how to scrape hotel data and pricing from Booking.com using Python and Selectorlib. You can use this for scraping hotel data from Booking.com.

How to scrape Booking.com

Search on Booking.com for Hotels with your conditions like Location, Check In Date, Check Out Date, Room Type, Number of People, etc.
Copy the Search Result URL and pass it to hotel scraper.
In the scraper, we will download this URL using Python Requests
We will then parse this HTML using Selectorlib Template to extract the fields like Name, Location, Room Type etc.
Scraper will then save the data to a CSV file

This hotel data scraper will extract the following data. You may add more fields

Hotel Name
Hotel Location
Type of Room
Price
Price For (eg: 1 night, 2 Adults)
Bed Type
Overall Rating
Rating Tile
Number of Reviews
Link

Read More – Scrape TripAdvisor for hotel data

Install the packages needed for running the Booking Scraper

Follow this guide to setup your computer and install packages:

How To Install Python Packages for Web Scraping in Windows 10

We will need the following Python 3 Packages

Python Requests, to make requests and download the HTML content of the Search Result page from Booking
SelectorLib python package to extract data using the YAML file we created from the webpages we download.

Install them using pip3

pip3 install requests selectorlib

Read More – Which is the largest Hotel Chain in US ?

The Code

All the code used in this tutorial is available for download from Github at Booking.com Web Scraper

Lets create our project folder called booking-hotel-scraper. In the folder, add a Python file called scrape.py

Paste the code below in scrape.py

from selectorlib import Extractor
import requests 
from time import sleep
import csv

# Create an Extractor by reading from the YAML file
e = Extractor.from_yaml_file('booking.yml')

def scrape(url):    
    headers = {
        'Connection': 'keep-alive',
        'Pragma': 'no-cache',
        'Cache-Control': 'no-cache',
        'DNT': '1',
        'Upgrade-Insecure-Requests': '1',
        # You may want to change the user agent if you get blocked
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',

        'Referer': 'https://www.booking.com/index.en-gb.html',
        'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
    }

    # Download the page using requests
    print("Downloading %s"%url)
    r = requests.get(url, headers=headers)
    # Pass the HTML of the page and create 
    return e.extract(r.text,base_url=url)


with open("urls.txt",'r') as urllist, open('data.csv','w') as outfile:
    fieldnames = [
        "name",
        "location",
        "price",
        "price_for",
        "room_type",
        "beds",
        "rating",
        "rating_title",
        "number_of_ratings",
        "url"
    ]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames,quoting=csv.QUOTE_ALL)
    writer.writeheader()
    for url in urllist.readlines():
        data = scrape(url) 
        if data:
            for h in data['hotels']:
                writer.writerow(h)
            # sleep(5)

The code above will

Open a file called urls.txt and download the HTML content for each link in it
Parse the HTML using the Selectorlib Template called booking.yml
Save the output to a CSV file called data.csv

Let’s create the file urls.txt and paste our search result URLs into it, then lets go ahead and create our Selectorlib Template.

Create Selectorlib Template to Scrape Hotel Data from Booking.com Search Results

You will notice that in the code above that we used a file called booking.yml. This file is what makes the code in this tutorial so concise and easy. The magic behind creating this file is a Web Scraper tool called Selectorlib.

Selectorlib is a tool that makes selecting, marking up, and extracting data from web pages visual and easy. The Selectorlib Web Scraper Chrome Extension lets you mark data that you need to extract, and creates the CSS Selectors or XPaths needed to extract that data. Then previews how the data would look like. You can learn more about Selectorlib and how to use it here

If you just need the data we have shown above, you do not need to use Selectorlib. Since we have done that for you already and generated a simple “template” that you can just use. However, if you want to add a new field, you can use Selectorlib to add that field to the template.

Here is how we marked up the fields for the data we need to scrape using Selectorlib Chrome Extension.

Once you have created the template, click on ‘Highlight’ to highlight and preview all of your selectors. Finally, click on ‘Export’ and download the YAML file and that file is the booking.yml file.

Here is how our template – booking.yml looks like

hotels:
    css: div.sr_item
    multiple: true
    type: Text
    children:
        name:
            css: span.sr-hotel__name
            type: Text
        location:
            css: a.bui-link
            type: Text
        price:
            css: div.bui-price-display__value
            type: Text
        price_for:
            css: div.bui-price-display__label
            type: Text
        room_type:
            css: strong
            type: Text
        beds:
            css: div.c-beds-configuration
            type: Text
        rating:
            css: div.bui-review-score__badge
            type: Text
        rating_title:
            css: div.bui-review-score__title
            type: Text
        number_of_ratings:
            css: div.bui-review-score__text
            type: Text
        url:
            css: a.hotel_name_link
            type: Link

Running the Web Scraper

To run the scraper, from the project folder,

Search in Booking.com for Hotels
Copy and add the search result URLS to urls.txt
Run python3 scrape.py
Get data from data.csv

Here is an example data from a search results page

You can parse the address scraped using this tutorial.

We can help with your data or automation needs

Turn the Internet into meaningful, structured and usable data

Please DO NOT contact us for any help with our Tutorials and Code using this form or by calling us, instead please add a comment to the bottom of the tutorial page for help

Disclaimer: Any code provided in our tutorials is for illustration and learning purposes only. We are not responsible for how it is used and assume no liability for any detrimental usage of the source code. The mere presence of this code on our site does not imply that we encourage scraping or scrape the websites referenced in the code and accompanying tutorial. The tutorials only help illustrate the technique of programming web scrapers for popular internet websites. We are not obligated to provide any support for the code, however, if you add your questions in the comments section, we may periodically address them.

Published on: April 29, 2020

Services

How to Scrape Booking.com for Hotel Data

Table of contents

How to scrape Booking.com

Install the packages needed for running the Booking Scraper

The Code

Create Selectorlib Template to Scrape Hotel Data from Booking.com Search Results

Running the Web Scraper

We can help with your data or automation needs

Table of contents

Scrape any website, any format, no sweat.

Ready to turn the internet into meaningful and usable data?

Continue Reading

Programmatic Content Archiving: An Essential Guide

Monitor Instagram Reels on SERP via Python Scraping: A Technical Overview

How to Scrape AI Overviews for Multiple Queries: A Technical Guide