How To Scrape eBay Product Data using Python and LXML

Scraping eBay can support pricing intelligence and other business data collection. In this article, we show how to scrape eBay search results and extract data such as product names and prices across all categories for a given brand. Scraping eBay listings at regular intervals lets you track product details and compare them with listings on competitor sites.

Here are the steps to scrape eBay:

  1. First, construct the URL for the search results to scrape eBay. Example – https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=iphone+7&_blrs=recall_filtering
  2. Download HTML of the search result page using Python Requests.
  3. Parse the page using LXML, which lets you navigate the HTML tree structure using XPath expressions.
  4. Save the scraped eBay product data to a CSV file.
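
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal sketch, not the code from the gist: the helper names download and parse_listings are hypothetical, the User-Agent header is an assumption, and the class names and item URLs in SAMPLE_HTML (and the XPath expressions that match them) are made up for illustration and may not match eBay's actual markup.

```python
from lxml import html

# Hypothetical markup: class names and item URLs are placeholders,
# not eBay's confirmed page structure.
SAMPLE_HTML = """
<ul>
  <li class="s-item">
    <h3><a href="https://www.ebay.com/itm/1">Apple iPhone 7 32GB</a></h3>
    <span class="s-item__price">$199.99</span>
  </li>
  <li class="s-item">
    <h3><a href="https://www.ebay.com/itm/2">Apple iPhone 7 Plus</a></h3>
    <span class="s-item__price">$249.00</span>
  </li>
</ul>
"""


def download(url):
    """Step 2: fetch the search results page (needs network access)."""
    import requests  # imported here so the parsing demo below runs offline

    # The User-Agent value is an assumption; adjust as needed.
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    response.raise_for_status()
    return response.text


def clean(text_nodes):
    """Join XPath text nodes and collapse runs of whitespace."""
    return " ".join(" ".join(text_nodes).split())


def parse_listings(page_source):
    """Step 3: walk the HTML tree with XPath, one dict per product."""
    tree = html.fromstring(page_source)
    products = []
    for item in tree.xpath('//li[contains(@class, "s-item")]'):
        links = item.xpath('.//h3/a/@href')
        products.append({
            "name": clean(item.xpath('.//h3//text()')),
            "price": clean(item.xpath('.//span[contains(@class, "s-item__price")]//text()')),
            "url": links[0] if links else "",
        })
    return products


if __name__ == "__main__":
    for product in parse_listings(SAMPLE_HTML):
        print(product)
```

If parse_listings returns an empty list on a real page, the XPath expressions most likely no longer match the page's markup and need to be updated against the current HTML.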

Below is a screenshot of the data to extract using our eBay scraper.

[Image: scraping-ebay]

You could also scrape details such as the number of products sold or the ratings given by consumers, but for now we will keep this eBay scraper simple and extract only each product’s name, price, and URL.

Requirements

Install Python 3 and Pip

Here is a guide to install Python 3 in Linux – http://docs.python-guide.org/en/latest/starting/install3/linux/

Mac Users can follow this guide – http://docs.python-guide.org/en/latest/starting/install3/osx/

Windows Users go here – https://www.scrapehero.com/how-to-install-python3-in-windows-10/

Packages

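For this web scraping tutorial using Python 3, we need packages for downloading and parsing the HTML and for writing the CSV file. The package requirements are Requests (to download pages), LXML (to parse the HTML tree with XPath), and unicodecsv (imported by the script to write the CSV file). Each can be installed with pip, for example: pip install unicodecsv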
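
Before running the scraper, it can help to confirm that the packages are importable. The sketch below assumes the script relies on requests, lxml, and unicodecsv (inferred from the steps above and from the import error reported in the comments); missing_packages is a hypothetical helper name.

```python
import importlib.util

# Assumed package list, inferred from the article and its comments.
REQUIRED = ["requests", "lxml", "unicodecsv"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```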

The Code

Since we will be monitoring prices by brand, we start from a search results URL. Here is an example for an Apple product (iPhone 7): https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=iphone+7&_blrs=recall_filtering
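
As a rough illustration, a search URL of this shape can be built from a keyword with the standard library. The function name build_search_url is hypothetical, and the fixed query parameters are simply copied from the example URL above; what each of them means to eBay is not covered here.

```python
from urllib.parse import urlencode


def build_search_url(keyword):
    """Build a search results URL for `keyword`, mirroring the example URL."""
    params = {
        "_from": "R40",
        "_sacat": "0",
        "_nkw": keyword,  # the search keyword; spaces are encoded as '+'
        "_blrs": "recall_filtering",
    }
    return "https://www.ebay.com/sch/i.html?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("iphone 7"))
```

Calling build_search_url("iphone 7") reproduces the example URL shown above.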

https://gist.github.com/scrapehero/2a1be61eb28cfa577e379e2b69b31c90

You can download the code from https://gist.github.com/scrapehero/2a1be61eb28cfa577e379e2b69b31c90 if the embed above does not work.

If you would like the code in Python 2.7, you can check out the link at https://gist.github.com/scrapehero/352052515da5e2511d5a31a5eb2786da

Running the eBay Scraper

We have named the script ebay_scraper.py. If you run it from a terminal or command prompt with the -h flag, it prints the following usage message:

usage: ebay_scraper.py [-h] brand

positional arguments:
  brand       Brand Name

optional arguments:
  -h, --help  show this help message and exit

 

The brand argument can be any brand currently listed on eBay, such as Samsung, Canon, or Dell. The argument is required; the script will not run without it. As an example, to find all of the products Apple currently has on eBay, we would run the scraper like this:

python3 ebay_scraper.py apple
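
A command-line interface like the one shown in the usage message can be defined with argparse. The following is a minimal sketch, not the gist's actual code; build_parser is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Create a parser with a single required positional argument, brand."""
    parser = argparse.ArgumentParser(prog="ebay_scraper.py")
    parser.add_argument("brand", help="Brand Name")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("Scraping listings for brand:", args.brand)
```

Because brand is a positional argument, argparse exits with a usage error when it is omitted. On Python 3.10 and later, the help output uses the heading "options:" instead of "optional arguments:".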

In this article we scrape only each product’s name, price, and URL from the first page of results, so a CSV file is enough to hold all the data. If you would like to extract details in bulk, a JSON file is preferable. You can read about choosing your data format, just to be sure.

Running the command above creates a CSV file named apple-ebay-scraped-data.csv in the same folder as the script. Here is some of the data extracted from eBay into that CSV file.

[Image: how-to-scrape-ebay]
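
Writing the scraped rows to a file named after the brand could look roughly like this. The sketch uses Python 3's built-in csv module rather than the unicodecsv package the script imports, and the column names name, price, and url as well as the function name write_products are assumptions.

```python
import csv


def write_products(brand, products):
    """Write product dicts to <brand>-ebay-scraped-data.csv and return the filename."""
    filename = "%s-ebay-scraped-data.csv" % brand
    # newline="" stops the csv module from adding blank lines on Windows.
    with open(filename, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=["name", "price", "url"])
        writer.writeheader()
        writer.writerows(products)
    return filename


if __name__ == "__main__":
    # Placeholder row for illustration only; not real scraped data.
    sample = [{"name": "Apple iPhone 7", "price": "$199.99",
               "url": "https://www.ebay.com/itm/1"}]
    print("Wrote", write_products("apple", sample))
```

In Python 3 the built-in csv module handles Unicode text directly, so this version needs no extra package for writing the file.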

You can download the code at https://gist.github.com/scrapehero/2a1be61eb28cfa577e379e2b69b31c90

We would love to know how this scraper worked for you. Let us know in the comments below.

Known Limitations

This code should be able to scrape eBay prices and details for most available brands. If you want to extract the details of thousands of products per brand and check competitor prices periodically (for example, on an hourly basis), you should read “Scalable do-it-yourself scraping – How to build and run scrapers on a large scale” and “How to prevent getting blacklisted while scraping”.

Disclaimer: Any code provided in our tutorials is for illustration and learning purposes only. We are not responsible for how it is used and assume no liability for any detrimental usage of the source code. The mere presence of this code on our site does not imply that we encourage scraping or scrape the websites referenced in the code and accompanying tutorial. The tutorials only help illustrate the technique of programming web scrapers for popular internet websites. We are not obligated to provide any support for the code; however, if you add your questions in the comments section, we may periodically address them.


Responses

frdscave June 7, 2018

I’m a newbie in Python trying to install it step by step; however, I’m getting an error message like the one below:

/usr/local/lib/python3.6/site-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Traceback (most recent call last):
  File "ebay_scraper.py", line 4, in <module>
    import unicodecsv as csv
ModuleNotFoundError: No module named 'unicodecsv'

grateful if you can give some advice, thanks!


    Anders Henricsson July 18, 2018

    I had a similar problem. It seems that unicodecsv is not installed by default, so you’ll have to install it yourself. For example using:
    git clone https://github.com/jdunck/python-unicodecsv.git
    cd python-unicodecsv/
    pip install .


      stephenl April 30, 2019

      pip install unicodecsv


Jena June 6, 2020

Hello everyone,
I have tried to run the code for Samsung (and a few other companies) and I keep on getting the same error. For Samsung, it says “Found 52,503 results for Samsung for Samsung”, however, it says “No data scraped” right below it and no csv file is made. I tried printing out the product_listings variable and it came up empty so the code never enters the “for product in product_listings:” loop. Does anyone have suggestions of what I am doing wrong?
Thank you.

