How to Scrape Zillow Real Estate Listings using Python and LXML

Web scraping real estate data is a practical way to keep track of the listings available to sellers and agents. Extracted real estate information can help you adjust the prices of listings on your site or build a database for your business. In this tutorial, we will scrape Zillow data using Python and show you how to extract real estate listings based on zip code.

Here are the steps to scrape Zillow:

  1. Construct the URL of the search results page from Zillow. Example –
  2. Download HTML of the search result page using Python Requests.
  3. Parse the page using LXML. LXML lets you navigate the HTML tree structure using XPaths.
  4. Save the data to a CSV file.
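
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual script: the sample markup, class names, and XPath expressions are hypothetical stand-ins, and the real Zillow HTML is different and changes over time.

```python
import requests
from lxml import html

# Hypothetical markup standing in for a search results page.
# The real Zillow page structure is NOT reproduced here.
SAMPLE_HTML = """
<html><body>
  <article class="listing">
    <a class="listing-link" href="/homedetails/1">12 Example St, Boston, MA 02126</a>
    <span class="price">$450,000</span>
  </article>
  <article class="listing">
    <a class="listing-link" href="/homedetails/2">34 Sample Ave, Boston, MA 02126</a>
    <span class="price">$525,000</span>
  </article>
</body></html>
"""


def download(url):
    """Step 2: fetch the page HTML with Requests."""
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


def parse_listings(page_html):
    """Step 3: walk the HTML tree with XPath and build one dict per listing."""
    tree = html.fromstring(page_html)
    listings = []
    for node in tree.xpath('//article[@class="listing"]'):
        title = node.xpath('.//a[@class="listing-link"]/text()')
        url = node.xpath('.//a[@class="listing-link"]/@href')
        price = node.xpath('.//span[@class="price"]/text()')
        listings.append({
            "title": title[0].strip() if title else None,
            "url": url[0] if url else None,
            "price": price[0].strip() if price else None,
        })
    return listings


# Parse the static sample instead of hitting the network.
rows = parse_listings(SAMPLE_HTML)
```

Each XPath query returns a list, so the sketch checks for an empty result before indexing. That keeps one missing field from aborting the whole page.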

We will be extracting the following data from Zillow:

  1. Title
  2. Street Name
  3. City
  4. State
  5. Zip Code
  6. Price
  7. Facts and Features
  8. Real Estate Provider
  9. URL

Below is a screenshot of some of the data fields we will be extracting from Zillow


Required Tools

Install Python 3 and Pip

Here is a guide to install Python 3 in Linux –

Mac Users can follow this guide –

Windows Users go here –


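For this web scraping tutorial using Python 3, we will need some packages for downloading and parsing the HTML. Below are the package requirements: Python Requests, to download the HTML content of the pages, and Python LXML, to parse the HTML tree structure using XPaths. Both can be installed with Pip.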

The Code

We have to first construct the search result page URL. We’ll have to create this URL manually to scrape results from that page. For example, here is the one for Boston-
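
One way to build such a URL from a zip code is sketched below. The URL templates are assumptions made for illustration; they are not confirmed by this page, and Zillow may change or reject them at any time. The function name `create_url` is also hypothetical.

```python
# Assumed URL templates, one per sort order. Treat these as placeholders
# and verify them against the live site before relying on them.
SEARCH_URLS = {
    "newest": "https://www.zillow.com/homes/for_sale/{zipcode}/0_singlestory/days_sort",
    "cheapest": "https://www.zillow.com/homes/for_sale/{zipcode}/0_singlestory/pricea_sort/",
}
# Assumed fallback when no known sort order is given.
DEFAULT_URL = "https://www.zillow.com/homes/for_sale/{zipcode}_rb/"


def create_url(zipcode, sort=None):
    """Return the search results URL for a zip code and optional sort order."""
    template = SEARCH_URLS.get(sort, DEFAULT_URL)
    # Keep the zip code as a string so leading zeros (e.g. 02126) survive.
    return template.format(zipcode=zipcode)


boston_url = create_url("02126", "newest")
```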

You can download the code from the link here if the embed does not work.

If you would like the code in Python 2.7 to scrape zillow listings, you can check out the link at

Running the Zillow Scraper

When you run the script from a command prompt or terminal with the -h flag, it prints the following usage message:

usage: [-h] zipcode sort

positional arguments:
  zipcode
  sort
                available sort orders are :
                newest : Latest property details
                cheapest : Properties with cheapest price

optional arguments:
  -h, --help  show this help message and exit
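
A command-line interface of this shape can be sketched with argparse. This is a sketch under assumptions rather than the script's actual code: restricting `sort` to two choices and the function name `build_parser` are choices made here for illustration.

```python
import argparse

SORT_HELP = """available sort orders are :
newest : Latest property details
cheapest : Properties with cheapest price"""


def build_parser():
    """Build a parser with two positional arguments: zipcode and sort."""
    # RawTextHelpFormatter keeps the line breaks in SORT_HELP as written.
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawTextHelpFormatter
    )
    # The zip code stays a string so a leading zero (02126) is not lost.
    parser.add_argument("zipcode")
    # metavar keeps the usage line showing "sort" rather than the choices.
    parser.add_argument(
        "sort", choices=["newest", "cheapest"], metavar="sort", help=SORT_HELP
    )
    return parser


# Parse an explicit argument list; a real script would call parse_args()
# with no arguments to read sys.argv.
args = build_parser().parse_args(["02126", "newest"])
```

Reading the zip code as a string rather than an integer matters for zip codes such as Boston's, which start with a zero.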

Run the Zillow scraper with Python, passing the zip code and the sort order as arguments. The sort argument accepts ‘newest’ and ‘cheapest’. As an example, to find the newest properties up for sale in Boston, Massachusetts, we would run the script as:

python3 02126 newest

This will create a CSV file called properties-02126.csv in the same folder as the script. Here is some sample data extracted for the command above.
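
Writing the extracted fields to that CSV file could look roughly like this. The column names are assumptions derived from the field list earlier in this tutorial, the sample row is made up, and `write_csv` is a hypothetical helper rather than a function from the original script.

```python
import csv
import os

# Assumed column names, one per field listed earlier in the tutorial.
FIELDNAMES = [
    "title", "street_name", "city", "state", "zip_code",
    "price", "facts_and_features", "real_estate_provider", "url",
]

# Made-up row for illustration only; not real Zillow data.
SAMPLE_ROW = {
    "title": "House For Sale",
    "street_name": "12 Example St",
    "city": "Boston",
    "state": "MA",
    "zip_code": "02126",
    "price": "$450,000",
    "facts_and_features": "3 bds, 2 ba, 1,500 sqft",
    "real_estate_provider": "Example Realty",
    "url": "https://www.example.com/listing/1",
}


def write_csv(zipcode, rows, directory="."):
    """Write rows to properties-<zipcode>.csv and return the file path."""
    path = os.path.join(directory, "properties-{}.csv".format(zipcode))
    # newline="" lets the csv module control line endings on all platforms.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Calling `write_csv("02126", [SAMPLE_ROW])` would create properties-02126.csv in the current directory with a header row followed by one data row.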


You can download the code at

Read More: How to Scrape Trulia using ScrapeHero Cloud

Known Limitations

This Zillow scraper should be able to scrape real estate listings of most zip codes provided. To learn more on real estate data management you can go through this post – Real Estate and Quality Challenges

If you would like to scrape Zillow listing details from thousands of pages, you should read Scalable do-it-yourself scraping – How to build and run scrapers on a large scale and How to prevent getting blacklisted while scraping.
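
Readers in the comments below report an AttributeError on a None value when Zillow returns a CAPTCHA page instead of listings. A defensive check along these lines would at least fail with a clearer message. It is only a sketch: the "handleCaptcha" marker is taken from one reader's response snippet and may not be a reliable signal, and the names `ScrapeBlockedError` and `check_scrape_result` are hypothetical.

```python
class ScrapeBlockedError(Exception):
    """Raised when the response does not contain usable listing data."""


def check_scrape_result(page_html, raw_json_data):
    """Return raw_json_data, or raise a descriptive error if it is unusable.

    page_html is the downloaded page source; raw_json_data is whatever the
    XPath extraction step produced (None when nothing matched).
    """
    # Assumed marker, based on a reader-reported response snippet.
    if "handleCaptcha" in page_html:
        raise ScrapeBlockedError("The site returned a CAPTCHA page.")
    if raw_json_data is None:
        raise ScrapeBlockedError(
            "No listing data found; the page layout may have changed."
        )
    return raw_json_data
```

Note that a 200 status code alone does not show the request succeeded, since the CAPTCHA page reported by readers was also served with status 200.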

If you need professional help with web scraping real estate data, you can fill out the form below.


Please DO NOT contact us for any help with our tutorials and code using this form or by calling us. Instead, please add a comment at the bottom of the tutorial page for help.

Disclaimer: Any code provided in our tutorials is for illustration and learning purposes only. We are not responsible for how it is used and assume no liability for any detrimental usage of the source code. The mere presence of this code on our site does not imply that we encourage scraping or scrape the websites referenced in the code and accompanying tutorial. The tutorials only help illustrate the technique of programming web scrapers for popular internet websites. We are not obligated to provide any support for the code. However, if you add your questions in the comments section, we may periodically address them.


Matthew Hom July 20, 2019

Anyone else having this issue when running the code?

status code received: 200

Traceback (most recent call last):
  File "C:/Users/matto/PycharmProjects/Real_Estate_Scraping/", line 185, in <module>
    scraped_data = parse(zipcode, sort)
  File "C:/Users/matto/PycharmProjects/Real_Estate_Scraping/", line 129, in parse
    return get_data_from_json(raw_json_data)
  File "C:/Users/matto/PycharmProjects/Real_Estate_Scraping/", line 74, in get_data_from_json
    cleaned_data = clean(raw_json_data).replace('"', "")
AttributeError: 'NoneType' object has no attribute 'replace'


    davidmakovoz July 30, 2019

    I just tried this script. It looks like zillow implemented a Captcha to prevent automated harvesting of their data. Here is a snippet from the response I got:

    response = get_response(url)

    ….function handleCaptcha(response)….


    Chris July 31, 2019

    Yes, I am receiving the same error message. It appears to stem from the variable “raw_json_data” being empty. Maybe a problem with the parser.xpath() call?


Erin October 2, 2019

I ended up installing Tesseract to handle CAPTCHAs and reran the script. Still no luck.


John March 26, 2020

Did anyone figure out how to do this?


diytechy May 9, 2020

Follow the advice of “Xiyu-1 commented on Mar 7” from the git site “”

Here Xiyu describes how the script needs to be modified to return the results and complete creation of the csv file.



