
How To Scrape Competitor Prices from eBay.com using Python and LXML

Product prices can fluctuate unpredictably. Monitoring pricing data frequently can help you adjust your own prices and infer which products are popular and in demand. Web scraping is a practical way to collect this information for business and pricing intelligence.

In this article, we will show you how to extract the names and prices of products in all categories for a brand available on eBay. Scraping data from eBay.com at regular intervals can be useful for checking product details and comparing them with those on competitor sites.

Below is a screenshot of the data we will be extracting.

[Screenshot: data to scrape from eBay search results]

You could also scrape details such as the number of products sold or the ratings given by consumers, but for now, we will keep it simple and scrape only the product name, price, and URL.

Scraping Logic

  1. First, construct the URL for the search results from eBay. Since we will be monitoring prices by brand, here is an example for an Apple product, where the _nkw parameter holds the search term (iphone 7): https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=iphone+7&_blrs=recall_filtering
  2. Download HTML of the search result page using Python Requests.
  3. Parse the page using LXML. LXML lets you navigate the HTML tree structure using XPaths. We have predefined the XPaths for the details we need in the code.
  4. Save the data to a CSV file. In this article we are only scraping the product’s name, price, and URL from the first page of results, so a CSV file should be enough to hold all the data. If you would like to extract details in bulk, a JSON file is preferable. You can read about choosing your data format, just to be sure.
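
Steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script from this article: the function names build_search_url, fetch_html, and parse_listings, the sample HTML, and its class names and XPaths are all hypothetical. The URL builder sets only the _nkw parameter and assumes the other parameters in the example URL are optional. eBay's real markup differs and changes over time, so the XPaths would have to be adapted to the live page.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7

from lxml import html


def build_search_url(keyword):
    """Step 1: build a search URL using the _nkw parameter."""
    return "https://www.ebay.com/sch/i.html?" + urlencode({"_nkw": keyword})


def fetch_html(url):
    """Step 2: download the page with Requests (makes a network call)."""
    import requests  # imported here so the parsing demo below runs offline
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_listings(page_source):
    """Step 3: pull name, price, and URL out of each listing with XPath."""
    tree = html.fromstring(page_source)
    products = []
    # Hypothetical XPaths that match SAMPLE_HTML below, not eBay's real page.
    for item in tree.xpath('//li[@class="listing"]'):
        name = item.xpath('.//h3[@class="title"]/text()')
        price = item.xpath('.//span[@class="price"]/text()')
        link = item.xpath('.//a[@class="link"]/@href')
        products.append({
            "name": name[0].strip() if name else None,
            "price": price[0].strip() if price else None,
            "url": link[0] if link else None,
        })
    return products


# Made-up markup standing in for a downloaded search results page.
SAMPLE_HTML = """
<ul>
  <li class="listing">
    <h3 class="title"> Example Phone 32GB </h3>
    <a class="link" href="https://www.example.com/item/1">View item</a>
    <span class="price">$199.99</span>
  </li>
  <li class="listing">
    <h3 class="title">Example Phone 128GB</h3>
    <a class="link" href="https://www.example.com/item/2">View item</a>
    <span class="price">$249.50</span>
  </li>
</ul>
"""

print(build_search_url("iphone 7"))
for product in parse_listings(SAMPLE_HTML):
    print(product)
```

Against the live site, the output of fetch_html(build_search_url(...)) would be passed to parse_listings in place of SAMPLE_HTML, once the XPaths have been updated to match the real page.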

Requirements

For this web scraping tutorial using Python, we will need some packages for downloading and parsing the HTML. Below are the package requirements.

  • Python 2.7 ( https://www.python.org/downloads/ )
  • PIP to install the following packages in Python (https://pip.pypa.io/en/stable/installing/ )
  • Python Requests, to make requests and download the HTML content of the pages ( http://docs.python-requests.org/en/master/user/install/).
  • Python LXML, for parsing the HTML tree structure using XPaths (learn how to install it here: http://lxml.de/installation.html )
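
Once the packages are installed, a quick check like the one below can confirm that they are importable before you run the scraper. It is only a convenience sketch; the helper name find_missing is our own and is not part of the scraper script.

```python
import importlib

# Import names of the two third-party packages listed above.
REQUIRED_MODULES = ["requests", "lxml"]


def find_missing(modules):
    """Return the names in `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages are importable.")
```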

 

The Code

If the embedded code above does not load, you can download it from https://gist.github.com/scrapehero/2a1be61eb28cfa577e379e2b69b31c90

Running the Scraper

We have named the script ebay_scraper.py. If you run the script from a terminal or command prompt with the -h flag, it prints the following usage message:


usage: ebay_scraper.py [-h] brand

positional arguments:
  brand       Brand Name

optional arguments:
  -h, --help  show this help message and exit
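
A usage message of this shape is what Python's standard argparse module generates. The sketch below shows one way such a command-line interface could be defined. It is an assumption about how the script handles its argument, not a copy of it, and build_parser is a name we made up. The exact section headings in the help text vary slightly between Python versions.

```python
import argparse


def build_parser():
    """Create a parser with a single required positional argument."""
    parser = argparse.ArgumentParser(prog="ebay_scraper.py")
    parser.add_argument("brand", help="Brand Name")
    return parser


# Parse an explicit argument list here; a real script would call
# parse_args() with no arguments to read from the command line.
args = build_parser().parse_args(["apple"])
print("Brand to search for: " + args.brand)
```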

 

The brand argument represents any brand available on eBay. You can type in a brand that eBay currently has on its site, such as Samsung, Canon, or Dell. The script must be run with the brand argument. As an example, to find all of the Apple products currently on eBay, we would run the scraper like this:

python ebay_scraper.py apple

This will create a CSV file named apple-ebay-scraped-data.csv that will be in the same folder as the script. Here’s some of the data extracted from eBay in a CSV file from the command above.

[Screenshot: eBay scraper results in a CSV file]
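
The output step can be sketched as follows, assuming Python 3. The file name follows the pattern described above, but the column names, the sample rows, and the save_to_csv helper are hypothetical and only illustrate the idea. The demo writes to a temporary folder rather than the script's folder.

```python
import csv
import os
import tempfile

# Assumed column names; the real script may label its columns differently.
FIELDNAMES = ["name", "price", "url"]


def save_to_csv(brand, products, folder="."):
    """Write product dicts to <brand>-ebay-scraped-data.csv in `folder`."""
    path = os.path.join(folder, "{}-ebay-scraped-data.csv".format(brand))
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(products)
    return path


# Made-up rows standing in for scraped results.
sample_products = [
    {"name": "Example Phone 32GB", "price": "$199.99",
     "url": "https://www.example.com/item/1"},
    {"name": "Example Phone 128GB", "price": "$249.50",
     "url": "https://www.example.com/item/2"},
]

output_dir = tempfile.mkdtemp()
output_path = save_to_csv("apple", sample_products, folder=output_dir)
print(output_path)
```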

You can download the code at https://gist.github.com/scrapehero/2a1be61eb28cfa577e379e2b69b31c90

We would love to know how this scraper worked for you. Let us know in the comments below.

Known Limitations

This code should be able to scrape the details of most brands available on eBay. If you want to scrape the details of thousands of products for each brand and check product prices periodically (for example, on an hourly basis), you should read "Scalable do-it-yourself scraping: How to build and run scrapers on a large scale" and "How to prevent getting blacklisted while scraping".



 
