Web Scraping Tutorials


Step-by-step tutorials on web scraping, web crawling, data extraction, headless browsers, and related topics. Our web scraping tutorials are usually written in Python using libraries such as lxml or Beautiful Soup, and occasionally in Node.js. The full source code is available to download or clone using Git.

All Tutorials

How to disable images and CSS in Puppeteer to speed up web scraping

Learn how to disable images and CSS for an entire web page in Google Chrome Headless or Chromium using Puppeteer and Node.js, whether for debugging tests or for web scraping.

How to scrape Prices from any eCommerce website

This step-by-step tutorial shows how to scrape prices from eCommerce websites in the Google Chrome web browser, using a single smart scraper written in JavaScript that can fetch product prices.

How to monitor price difference across multiple sellers on Amazon

This step-by-step tutorial will show you how to build a web scraper using Python and LXML to extract prices and seller information from Amazon’s Offer Listing page, a feature that enables price comparison across multiple sellers and offers customers additional buying options.

How To Make Anonymous Requests using TorRequests and Python

Tor is useful when you need to make requests without revealing your IP address, especially when web scraping. This tutorial uses a Python wrapper, TorRequests, that helps you do so.

How to take screenshots using Puppeteer

Learn how to take screenshots of an entire web page, a specific area, or different viewports in Google Chrome, Chrome Headless, or Chromium using Puppeteer and Node.js, whether for debugging tests or for web scraping.

Web Scraping with Puppeteer and NodeJS

Puppeteer is a Node.js library that provides a powerful but simple API for controlling Google’s Chrome browser. In this tutorial, we will show you how to build a web scraper and control Chrome using Puppeteer and Node.js to scrape the details of hotel listings from booking.com.
