Web Scraping Tutorials


LEARN HOW TO USE WEB SCRAPING TO ENHANCE PRODUCTIVITY AND AUTOMATION

We provide many step-by-step tutorials with source code for web scraping, web crawling, data extraction, headless browsers, etc.

Our web scraping tutorials are usually written in Python, using libraries such as LXML, Beautiful Soup, and Selectorlib, and occasionally in Node.js.

In most cases, the full source code is also available to download or to clone using Git.

We also provide in-depth articles on web scraping tips, techniques, and recent technologies, including the latest anti-bot technologies and the methods used to safely and responsibly gather publicly available data from the Internet.

The community that has coalesced around these tutorials and their comments helps anyone, from a beginner hobbyist to an advanced programmer, solve some of the issues they face with web scraping.

These tutorials are frequently linked to in Stack Overflow answers and discussed on Reddit.

Please feel free to read and participate in the discussions with your comments.

All Tutorials

How To Make Anonymous Requests using TorRequests and Python

Tor is useful when you need to make requests without revealing your IP address, especially when you are web scraping. This tutorial uses a Python wrapper, TorRequests, that helps you do this.

How to take screenshots using Puppeteer

Learn how to take screenshots of an entire web page, a specific area, or different viewports in Google Chrome, headless Chrome, or Chromium using Puppeteer and Node.js, for debugging tests or for web scraping.

Web Scraping with Puppeteer and NodeJS

Puppeteer is a Node.js library that provides a powerful but simple API for controlling Google’s Chrome browser. In this tutorial, we will show you how to build a web scraper and control Chrome using Puppeteer and Node.js to scrape the details of hotel listings from Booking.com.

Web Scraping Tutorial for Beginners – Part 3 – Navigating and Extracting Data

Learn how to build a web scraper to scrape Reddit. Navigate and extract comment data from Reddit using Python 3 and BeautifulSoup.
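
As a rough illustration of what "extracting comment data" involves, here is a minimal sketch that pulls author and body text out of comment markup. It is not the tutorial's code: it uses Python's standard-library `html.parser` as a dependency-free stand-in for BeautifulSoup, and the sample HTML, the class names (`comment`, `author`, `body`), and the `extract_comments` helper are all hypothetical. Real Reddit markup is different and changes over time.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; not Reddit's real HTML.
SAMPLE_HTML = """
<div class="comment"><span class="author">alice</span><p class="body">First comment</p></div>
<div class="comment"><span class="author">bob</span><p class="body">Second <b>comment</b></p></div>
"""


class CommentParser(HTMLParser):
    """Collects {'author', 'body'} dicts from <div class="comment"> blocks."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._current = None    # comment dict being filled, if any
        self._field = None      # 'author' or 'body' while inside that element
        self._field_tag = None  # tag name that opened the current field

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "comment" in classes:
            self._current = {"author": "", "body": ""}
        elif self._current is not None and self._field is None:
            for name in ("author", "body"):
                if name in classes:
                    self._field, self._field_tag = name, tag

    def handle_endtag(self, tag):
        if self._field is not None and tag == self._field_tag:
            # Leaving the author/body element.
            self._field = None
        elif tag == "div" and self._current is not None:
            # Leaving the comment block: store the finished record.
            self.comments.append(self._current)
            self._current = None

    def handle_data(self, data):
        # Text is kept only while inside an author/body element.
        if self._field is not None:
            self._current[self._field] += data


def extract_comments(html):
    """Return a list of comment dicts parsed from an HTML string."""
    parser = CommentParser()
    parser.feed(html)
    return parser.comments


if __name__ == "__main__":
    for comment in extract_comments(SAMPLE_HTML):
        print(comment["author"], "-", comment["body"])
```

With BeautifulSoup the same idea is usually expressed with selector lookups rather than a hand-written parser; the sketch only shows the shape of the task, which is to locate each comment container and then read the fields inside it.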

How to Scrape Coupon Details from a Walmart Store using Python and LXML

A tutorial on building a web scraper that extracts coupon details from Walmart.com, a leading retailer in the U.S., based on a store ID. We will extract details such as store name, address, contact details, and more using Python 3, Python Requests, and LXML.

How to Scrape Store Locations from Walmart.com using Python 3

A tutorial on building a web scraper that extracts store locations and their details from Walmart.com, a leading retailer in the U.S. We will extract details such as store name, address, contact details, and more using Python 3 and Python Requests.
