How To Scrape Amazon Product Details and Pricing using Python


Amazon provides a Product Advertising API, but like most APIs, it doesn’t provide all the information that Amazon shows on a product page.

The only way to get the exact data that you see on a product page is by using a web scraper. Scraping ensures that you can get exactly what you see by visiting the site using a web browser.

Scraping Amazon for data is useful for a lot of things, such as:

  1. Scrape product details that you can’t get with the Product Advertising API
  2. Monitor an item for change in Price, Stock Count/Availability, Rating etc.
  3. Analyze how a particular Brand is being sold on Amazon
  4. Analyze Amazon Marketplace Sellers
  5. Analyze Amazon Product Reviews
  6. Or anything else – the possibilities are endless and only bound by your imagination
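As a tiny illustration of use case 2 (price monitoring), price strings from two scraping runs can be compared after stripping the currency formatting. This is a sketch with our own helper name, not part of the tutorial code:

```python
def parse_price(price_str):
    # Turn a scraped price string like "$1,899.99" into a float
    # (assumes US-style formatting with "$" and "," separators).
    return float(price_str.replace('$', '').replace(',', ''))

# Prices captured on two different runs (illustrative values).
previous, current = "$1,899.99", "$949.95"

if parse_price(current) < parse_price(previous):
    drop = parse_price(previous) - parse_price(current)
    print("Price dropped by $%.2f" % drop)
```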


An easy way to get started with scraping Amazon is by building a crawler in Python that can go to any Amazon product’s page using an ASIN (Amazon Standard Identification Number, the unique identifier Amazon uses to keep track of products in its database).

First, let’s collect a list of products identified by their ASINs.
An ASIN looks like B0046UR4F4, for example.
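Given such a list of ASINs, the product-page URLs can be built directly. A sketch, assuming the common www.amazon.com/dp/&lt;ASIN&gt; URL pattern:

```python
# Build product-page URLs from ASINs
# (the /dp/<ASIN> path is the common Amazon product URL pattern).
asins = ['B0046UR4F4']
urls = ['http://www.amazon.com/dp/' + asin for asin in asins]
print(urls[0])  # http://www.amazon.com/dp/B0046UR4F4
```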


Then we will download the HTML of each product’s page and start to identify the XPaths for the data elements that you need – e.g. Product Title, Price, Description etc. Read more about XPaths here.
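As a minimal sketch of this step, here is how lxml pulls data elements out of downloaded HTML via XPaths. The HTML snippet below is a stand-in for a real product page; its element ids mirror the ones the tutorial's XPaths target:

```python
from lxml import html

# A tiny HTML snippet standing in for a downloaded product page.
page = """
<html>
  <body>
    <h1 id="title"> G-Technology G-SPEED eS PRO 8TB </h1>
    <span id="priceblock_ourprice">$949.95</span>
  </body>
</html>
"""

doc = html.fromstring(page)

# XPaths pick out the data elements; whitespace is normalized afterwards.
name = ' '.join(''.join(doc.xpath('//h1[@id="title"]//text()')).split())
price = ''.join(doc.xpath('//span[contains(@id,"ourprice")]/text()')).strip()

print(name)   # G-Technology G-SPEED eS PRO 8TB
print(price)  # $949.95
```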

The Code


For this tutorial, we will stick to basic Python and a couple of Python packages – requests and lxml. We won’t use a more complicated framework like Scrapy for something this simple.

You will need to install the following:

  • Python 2.7
  • Python Requests (you might need Python pip to install this)
  • Python LXML

We make this process a bit easier for you by providing the actual Python code. The code will help you scrape a few important data elements such as Product Name, Price, Availability and Description.

Feel free to copy and modify it to your needs – that is the best way to learn! You can download the code directly from here.



Modify the code shown below with a list of your own ASINs.

def ReadAsin():
    # Change the list below to the ASINs you want to track.
    AsinList = ['B0046UR4F4']  # add more ASINs here
    extracted_data = []
    for i in AsinList:
        url = "http://www.amazon.com/dp/" + i  # product page URL (common /dp/<ASIN> pattern)
        extracted_data.append(AmzonParser(url))
    # Save the collected data into a JSON file.
    with open('data.json', 'w') as f:
        json.dump(extracted_data, f, indent=4)

and run it from a terminal or command prompt like this (if you name the file, say, amazon_scraper.py):

    python amazon_scraper.py


You’ll get a file called data.json with the data collected for the ASINs you had in AsinList in the code.
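The saved file can then be read back with Python's standard json module for further analysis. A small round-trip sketch (the sample record's values are illustrative, shaped like the scraper's output):

```python
import json

# A record shaped like the scraper's output (illustrative values).
sample = [{"NAME": "Example Product", "SALE_PRICE": "$949.95",
           "AVAILABILITY": "Only 1 left in stock."}]

# The scraper writes data.json the same way; round-trip to show the format.
with open('data.json', 'w') as f:
    json.dump(sample, f, indent=4)

with open('data.json') as f:
    products = json.load(f)

print(products[0]['SALE_PRICE'])  # $949.95
```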

Here is what the JSON output for a couple of ASINs will look like:

    [
        {
            "CATEGORY": "Electronics > Computers & Accessories > Data Storage > External Hard Drives",
            "ORIGINAL_PRICE": "$1,899.99",
            "NAME": "G-Technology G-SPEED eS PRO High-Performance Fail-Safe RAID Solution for HD/2K Production 8TB (0G01873)",
            "URL": "",
            "SALE_PRICE": "$949.95",
            "AVAILABILITY": "Only 1 left in stock."
        },
        {
            "CATEGORY": "Electronics > Computers & Accessories > Data Storage > USB Flash Drives",
            "ORIGINAL_PRICE": "$599.95",
            "NAME": "G-Technology G-RAID USB Removable Dual Drive Storage System 8TB (0G04069)",
            "URL": "",
            "SALE_PRICE": "$599.95",
            "AVAILABILITY": "Only 2 left in stock."
        }
    ]

This should work for small-scale scraping and hobby projects and get you started on your road to building bigger and better scrapers.

However, if you want to scrape websites for thousands of pages there are some important things you should be aware of and you can read about them at Scalable do-it-yourself scraping – How to build and run scrapers on a large scale.

Web scraping is very useful for automating simple tasks like this one, as well as many more complex tasks that computers can easily handle.

Thanks for reading, and if you need help with your complex scraping projects, let us know – we will be glad to help.

EDIT: Nov 25, 2016 – If you want to also scrape Amazon reviews for a product, head over to this new blog post.

If you are looking for a service to collect this data for your business needs, we can help.


Disclaimer: Any code provided in our tutorials is for illustration and learning purposes only. We are not responsible for how it is used and assume no liability for any detrimental usage of the source code. The mere presence of this code on our site does not imply that we encourage scraping or scrape the websites referenced in the code and accompanying tutorial. The tutorials only help illustrate the technique of programming web scrapers for popular internet websites. We are not obligated to provide any support for the code, however, if you add your questions in the comments section, we may periodically address them.

37 thoughts on “How To Scrape Amazon Product Details and Pricing using Python”

  1. I try to get the image URL using this XPath:

    XPATH_IMG = '//div[@class="imgTagWrapper"]/img/@src//text()'

    but the result is null. Can you give me a pointer to achieve this?

  2. Hejsan from Sweden,

    I am a total "dummie" regarding Python. I tried to use this code with Python 3 instead. There you have pip and requests included, as I understand. Anyway, I do not get a data.json file – the provided code is not running, and if I check it through Python, it mentions missing parentheses. I just wonder if the code should work for Python 3 as well, and if not, why? Is it a different language?

    best regards,


    1. Hi Chris,
      Yes, it is almost a new language – v2 code will not work under v3 in most cases, especially with the libraries used.
      Try downloading the code and running it under v2.


    2. Hi Chris,

      I am running the following version of python:
      Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:53:06)
      [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
      Type "help", "copyright", "credits" or "license" for more information.

      I changed the code only a little to fit python 3. Pasted the code below. Let me know if you need any help.

      from lxml import html
      import csv, os, json
      import requests
      #from exceptions import ValueError
      from time import sleep

      def AmzonParser(url):
          headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
          page = requests.get(url, headers=headers)
          while True:
              try:
                  doc = html.fromstring(page.content)
                  XPATH_NAME = '//h1[@id="title"]//text()'
                  XPATH_SALE_PRICE = '//span[contains(@id,"ourprice") or contains(@id,"saleprice")]/text()'
                  XPATH_ORIGINAL_PRICE = '//td[contains(text(),"List Price") or contains(text(),"M.R.P") or contains(text(),"Price")]/following-sibling::td/text()'
                  XPATH_CATEGORY = '//a[@class="a-link-normal a-color-tertiary"]//text()'
                  XPATH_AVAILABILITY = '//div[@id="availability"]//text()'

                  RAW_NAME = doc.xpath(XPATH_NAME)
                  RAW_SALE_PRICE = doc.xpath(XPATH_SALE_PRICE)
                  RAW_ORIGINAL_PRICE = doc.xpath(XPATH_ORIGINAL_PRICE)
                  RAW_CATEGORY = doc.xpath(XPATH_CATEGORY)
                  RAW_AVAILABILITY = doc.xpath(XPATH_AVAILABILITY)

                  NAME = ' '.join(''.join(RAW_NAME).split()) if RAW_NAME else None
                  SALE_PRICE = ' '.join(''.join(RAW_SALE_PRICE).split()).strip() if RAW_SALE_PRICE else None
                  ORIGINAL_PRICE = ''.join(RAW_ORIGINAL_PRICE).strip() if RAW_ORIGINAL_PRICE else None
                  CATEGORY = ' > '.join([i.strip() for i in RAW_CATEGORY]) if RAW_CATEGORY else None
                  AVAILABILITY = ''.join(RAW_AVAILABILITY).strip() if RAW_AVAILABILITY else None

                  if not ORIGINAL_PRICE:
                      ORIGINAL_PRICE = SALE_PRICE

                  if page.status_code != 200:
                      raise ValueError('captcha')
                  data = {
                      'NAME': NAME,
                      'SALE_PRICE': SALE_PRICE,
                      'CATEGORY': CATEGORY,
                      'ORIGINAL_PRICE': ORIGINAL_PRICE,
                      'AVAILABILITY': AVAILABILITY,
                      'URL': url,
                  }
                  return data
              except Exception as e:
                  print(e)
                  break  # give up on parse errors instead of looping forever

      def ReadAsin():
          # AsinList = csv.DictReader(open(os.path.join(os.path.dirname(__file__),"Asinfeed.csv")))
          AsinList = ['B0046UR4F4']
          extracted_data = []
          for i in AsinList:
              url = "http://www.amazon.com/dp/" + i
              print("Processing: " + url)
              extracted_data.append(AmzonParser(url))
              sleep(5)
          with open('data.json', 'w') as f:
              json.dump(extracted_data, f, indent=4)

      if __name__ == "__main__":
          ReadAsin()

  3. Thanks a lot for this very useful script. I’m going on to the next step: Scalable do-it-yourself scraping – How to build and run scrapers on a large scale
