Web Scraping Tutorials


Step-by-step tutorials for web scraping, web crawling, data extraction, headless browsers, and related topics. Our web scraping tutorials are usually written in Python using libraries such as lxml or Beautiful Soup, and occasionally in Node.js. The full source code is available to download or clone using Git.

All Tutorials

Tutorial: How to Scrape LinkedIn for Public Company Data

We are glad that you came here to learn how to scrape LinkedIn, and we won’t disappoint you. In this tutorial, we will show you how to scrape the data on a LinkedIn company page. For those who stumbled onto this page without a clear understanding of why they wanted to scrape LinkedIn data, here […]

How To Scrape Amazon Product Details and Pricing using Python and SelectorLib

In this tutorial, we will build an Amazon scraper for extracting product details and pricing. We will build this simple web scraper using Python and SelectorLib and run it in a console. But before we start, let’s look at what you can use it for. How to use Amazon Product Data: Scrape Product Details that […]

XPaths and their relevance in Web Scraping

XPath (XML Path Language) is a syntax for addressing parts of an XML document. It is a query language for identifying and selecting nodes or elements in an XML document using a tree-like representation of the document. XPath was defined by the World Wide Web Consortium (W3C). XPaths are one of the few ways […]
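
As a rough illustration of selecting nodes with XPath, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample document, element names, and `id` values are invented for this example and do not come from the tutorial itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
SAMPLE = """
<catalog>
  <product id="p1"><name>Widget</name><price>9.99</price></product>
  <product id="p2"><name>Gadget</name><price>19.50</price></product>
</catalog>
"""

root = ET.fromstring(SAMPLE)

# Select every <name> element that is a child of a <product> anywhere in the tree.
names = [el.text for el in root.findall(".//product/name")]

# Select the <price> of the product whose id attribute equals "p2".
price_p2 = root.find(".//product[@id='p2']/price").text

print(names)
print(price_p2)
```

A library such as lxml accepts full XPath 1.0 expressions; the standard-library module is used here only to keep the sketch free of dependencies.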

Why *not* scrape yourself

Before you get all kinds of ideas about what the title of this article means, please look at the context: we are talking about web scraping here! This post covers the reasons not to do this yourself and to call in a professional instead (wink wink: use ScrapeHero). You […]

Webscraping using Python without using large frameworks like Scrapy

If you need publicly available data from the Internet, check whether it is already available from public data sources or APIs before creating a web scraper. Check the site’s FAQ section or search Google for its API endpoints and public data. Even if API endpoints are available, you have […]

5 tips for scraping big websites

Scraping bigger websites can be a challenge if done the wrong way. Bigger websites have more data, more security, and more pages. We’ve learned a lot from our years of crawling such large, complex websites, and these tips could help solve some of your challenges. 1. Cache pages visited for scraping: When scraping big […]
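
The first tip, caching visited pages, could be sketched roughly as follows. This is one possible approach rather than the method the article describes: the `fetch_cached` helper, the on-disk layout, and the stand-in `fake_fetch` downloader are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def fetch_cached(url, fetch, cache_dir):
    """Return the page body for url, calling fetch(url) only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it becomes a safe, fixed-length file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    return body


# Hypothetical downloader standing in for a real HTTP request.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<html>page for " + url + "</html>"


with tempfile.TemporaryDirectory() as tmp:
    first = fetch_cached("https://example.com/a", fake_fetch, tmp)
    second = fetch_cached("https://example.com/a", fake_fetch, tmp)

print(len(calls))  # the downloader ran once; the second call was served from disk
```

Passing the downloader in as a function keeps the cache independent of any particular HTTP library, so a real request function can be swapped in for `fake_fetch`.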
