Scraping Tips


Tips and articles about web scraping, covering how to use automation successfully to gather data from websites. Data extraction techniques and code are available in our tutorials.

How Web Scraping With Excel Works: A Tutorial

This tutorial explains the Web Query feature in Excel in detail and shows how web scraping with Excel is carried out using Web Query.

Web Scraping With Playwright in Python and JavaScript

Learn Playwright web scraping in Python and JavaScript by building and running web scrapers that drive a browser with Playwright.

Web Scraping using Urllib

Explore URL handling, parsing, quoting, and scraping with urllib. This step-by-step guide covers everything from basic URL operations to building a fully-fledged scraper using urllib.
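
As a small taste of the URL operations mentioned above, here is a minimal sketch using only the standard library's urllib.parse module. The example.com address, the query values, and the helper name build_search_url are assumptions made for illustration; they do not come from the tutorial itself.

```python
from urllib.parse import parse_qs, quote, urlencode, urljoin, urlparse


def build_search_url(base, path, query):
    """Join a base URL and a path, then append a URL-encoded query string.

    Hypothetical helper for illustration only.
    """
    return urljoin(base, path) + "?" + urlencode(query)


# Build a URL from its parts (example.com is a placeholder domain).
url = build_search_url(
    "https://example.com/", "search", {"q": "web scraping", "page": 2}
)

# Split the URL back into components and decode the query string.
parts = urlparse(url)
params = parse_qs(parts.query)

# Percent-encode a string so it is safe to place inside a URL path.
encoded = quote("web scraping & crawling")

print(url)
print(parts.netloc, parts.path, params)
print(encoded)
```

Note that urlencode encodes spaces as "+", which is the usual form for query strings, while quote uses "%20", which is the safer choice for path segments.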

Web Scraping vs. Web Crawling

The terms web scraping and web crawling are often used interchangeably. This article explores what constitutes each of them and how web scraping and web crawling differ.

What is Browser Fingerprinting? How to Bypass it?

Have you ever encountered the term “browser fingerprinting” while surfing the internet? With browser fingerprinting, servers can uniquely identify clients and web scrapers, but there are ways to bypass this. Check out what browser fingerprinting entails and how you can avoid it.

How to fake and rotate User Agents using Python 3

When scraping many pages from a website, sending the same user agent with every request makes the scraper easy to detect. One way to avoid that detection is to fake your user agent and change it with every request you make to the website. In this tutorial, we will show you how to fake user agents and randomize them to avoid getting blocked while scraping websites.
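
The idea can be sketched in a few lines with the standard library. This is a minimal illustration, not the tutorial's own code: the USER_AGENTS values are sample strings that may be outdated, and build_request is a hypothetical helper name. The request is only constructed here, not sent.

```python
import random
import urllib.request

# Sample user agent strings for illustration; a real list should be
# kept current and match browsers that are actually in use.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


def build_request(url, rng=random):
    """Return a Request carrying a randomly chosen User-Agent header.

    Hypothetical helper: call it once per page so that each request
    may present a different user agent.
    """
    headers = {"User-Agent": rng.choice(USER_AGENTS)}
    return urllib.request.Request(url, headers=headers)


# example.com is a placeholder; nothing is fetched in this sketch.
request = build_request("https://example.com/page/1")
print(request.get_header("User-agent"))
```

Passing the result to urllib.request.urlopen would send the request with the chosen header. Picking at random, rather than cycling in order, avoids a predictable sequence of user agents.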

How To Rotate Proxies and change IP Addresses using Python 3

When scraping many pages from a website, sending every request from the same IP address can get you blocked. One way to avoid this is to rotate IP addresses so that your scrapers are not disrupted. In this tutorial, we will show you how to rotate IP addresses to avoid getting blocked while scraping.
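
A minimal sketch of the rotation idea, using only the standard library, might look like the following. The proxy addresses are placeholders from a reserved documentation range, and proxy_rotation and opener_for are hypothetical helper names; the tutorial itself may use a different library or approach.

```python
import itertools
import urllib.request

# Placeholder proxies (203.0.113.0/24 is reserved for documentation).
# Replace them with proxies you are actually allowed to use.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


def proxy_rotation(proxies):
    """Return an endless iterator that cycles through the proxies in order."""
    return itertools.cycle(proxies)


def opener_for(proxy):
    """Build a URL opener that routes HTTP and HTTPS traffic via one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


rotation = proxy_rotation(PROXIES)

# Pair each page with the next proxy in the cycle. Nothing is fetched
# here; calling opener.open(page_url) would send the request.
for page in range(1, 5):
    proxy = next(rotation)
    opener = opener_for(proxy)
    print(f"page {page} would be fetched through {proxy}")
```

A round-robin cycle spreads requests evenly across the pool. In practice you would also want to drop proxies that fail or get blocked, which this sketch does not attempt.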

Get Sales Leads From Google

In this tutorial, we will show you how businesses can get sales leads from Google for free using the Google Maps Crawler and Contact Detail Crawler available on ScrapeHero Cloud.

How do websites detect and block bots using Bot Mitigation Tools

An in-depth analysis of how most bot mitigation tools work and how they distinguish between bots and humans on both the server side and the client side, starting from the fundamentals of the web.

Scalable Large Scale Web Scraping – How to build, maintain and run scrapers

This article goes through each of the high-level steps involved in large-scale web scraping in detail: building scrapers, running web scrapers at scale, getting past anti-scraping techniques, data validation and quality control, and ongoing maintenance.

How To Make Anonymous Requests using TorRequests and Python

Tor is useful when you need to make requests without revealing your IP address, especially when you are web scraping. This tutorial uses a Python wrapper that helps you do exactly that.

How To Install Python Packages for Web Scraping in Windows 10

Web scraping using Python on Windows can be tough. In this tutorial, follow the steps to set up Python 3 and the Python packages you need for web scraping on your Windows 10 computer.
