How To Make Anonymous Requests using TorRequests and Python

Tor is useful when you need to make requests without revealing your IP address, especially when you are web scraping. This tutorial uses TorRequest, a Python wrapper that lets you do exactly that.

What is TOR?

TOR is short for “The Onion Router”, a worldwide network of volunteer-run servers built on technology originally developed at the U.S. Naval Research Laboratory. The network lets people browse the internet anonymously, while the Tor Project, a non-profit organization, handles research and development of online privacy tools.

TOR can mean two things:

  1. The software you install on your computer to run TOR
  2. The network of computers that manages TOR connections

In simple words, TOR routes your web traffic through several other computers so that a third party can’t trace the traffic back to you. Anyone who inspects the traffic sees only seemingly random nodes on the TOR network.

Install TOR

TorRequest has TOR as a dependency. Install TOR first.

The instructions below are for Ubuntu / Debian users. To install on Windows or Mac, check here.

sudo apt-get update
sudo apt-get install tor

Restart the TOR service

sudo /etc/init.d/tor restart

Configure TOR

Let’s hash a new password so that random access to the port by outside agents is prevented.

tor --hash-password <enter your password here>

You will get a long combination of letters and numbers as your new hashed password. Now let’s go to the TOR configuration file (torrc) and make the necessary changes.
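The command can print log lines along with the hash, so it helps to pick out the hash itself before pasting it into torrc. Here is a minimal sketch, assuming the hash has the form "16:" followed by 58 hexadecimal characters. The function name extract_hashed_password is made up for this example and is not part of TOR or TorRequest.

```python
import re

# Assumption: a Tor hashed password looks like "16:" plus 58 hex characters.
HASH_PATTERN = re.compile(r"^16:[0-9A-Fa-f]{58}$")


def extract_hashed_password(output):
    """Return the first line that looks like a hashed password, or None."""
    for line in output.splitlines():
        candidate = line.strip()
        if HASH_PATTERN.match(candidate):
            return candidate
    return None
```

You would pass it the text printed by tor --hash-password. If it returns None, the output did not contain anything in the assumed format.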

Where the torrc file is placed depends on the operating system you use and where you got Tor from. Mine was at /etc/tor/torrc. You can refer to this to know more.

We have three things to do:

  1. Enable the “ControlPort” listener on port 9051. This is the port on which TOR listens for any communication from applications talking to the Tor controller.
  2. Update the hashed password
  3. Implement cookie authentication

You can achieve this by uncommenting and editing the following lines just above the section for location hidden services.

ControlPort 9051
HashedControlPassword <your hashed password obtained earlier here>
CookieAuthentication 1

### This section is just for location-hidden services ###

Save the file, exit, and restart TOR.

sudo /etc/init.d/tor restart

Now TOR is all set! Kudos!
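Before moving on, you may want to confirm that something is actually listening on the control port. Below is a small sketch that uses only the Python standard library. It assumes TOR runs on the same machine (127.0.0.1) with the control port 9051 mentioned above. The helper name is_port_open is made up for this example.

```python
import socket


def is_port_open(host="127.0.0.1", port=9051, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A True result only tells you that a process accepts connections on that port. It does not check that your password is accepted. If it returns False, recheck the torrc edits and restart TOR again.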

What is TorRequest?

TorRequest is a wrapper around requests and stem libraries that allows making requests through TOR. View the project here.

You can install torrequest via PyPI:

pip install torrequest

Let’s try TorRequest. Open your Python interpreter.

from torrequest import TorRequest

Pass your password to Tor

tr = TorRequest(password='your_unhashed_password here')

Let’s check our current IP address

import requests
response = requests.get('http://ipecho.net/plain')
print("My Original IP Address:", response.text)

My response was

My Original IP Address: 45.55.117.170

Let’s try the same through TorRequest

tr.reset_identity()  # Reset Tor
response = tr.get('http://ipecho.net/plain')
print("New IP Address:", response.text)

You will get a different IP address now. Call tr.reset_identity() again whenever you want another new IP address.
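Rather than comparing the two printed addresses by eye, you can let Python do it. This is a rough sketch, not part of TorRequest: the helpers fetch_ip and ip_is_masked are made up for this example. They accept any callable that behaves like requests.get or tr.get, that is, it takes a URL and returns an object with a text attribute.

```python
def fetch_ip(get, url="http://ipecho.net/plain"):
    """Call get(url) and return the response body without extra whitespace."""
    return get(url).text.strip()


def ip_is_masked(direct_get, tor_get, url="http://ipecho.net/plain"):
    """Compare the IP seen directly with the IP seen through Tor.

    Returns a tuple: (True if the two differ, direct IP, Tor IP).
    """
    direct_ip = fetch_ip(direct_get, url)
    tor_ip = fetch_ip(tor_get, url)
    return direct_ip != tor_ip, direct_ip, tor_ip
```

With the objects created earlier, the call would be ip_is_masked(requests.get, tr.get). If the first value in the result is False, your requests are probably not going through TOR.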

Now you can easily mask your IP address in Python using TorRequest.

All the best!



Responses

Sree December 16, 2018

Hi all, how do I solve the Request Throttling error that my web scraping code gets back?

    Enfa February 4, 2019

    There could be many reasons that your request is getting throttled. With respect to this library, I think it might be because you have been requesting too fast from the same IP.

    Here are a few suggestions.

    Go Slow

    Use the time.sleep() function to pause between consecutive requests. Throttling basically limits the number of requests you can make in a given period of time.
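As a rough illustration of the "go slow" advice, here is a minimal sketch that waits a fixed time between requests. The helper name fetch_slowly and the 2-second default are assumptions made for this example. The fetch argument can be any callable that takes a URL, for example tr.get.

```python
import time


def fetch_slowly(urls, fetch, delay_seconds=2.0):
    """Call fetch(url) for each URL, sleeping between consecutive calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            time.sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

How long to wait depends on the site you are requesting from, so treat the default as a starting point and adjust it.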

    Rotate/Change your IP

    Ensure you have rotated your IP. Sometimes simply resetting your tor identity using tr.reset_identity() doesn’t reset your identity. See the issue here. What you can do for the time being is to create a new instance every time you want to reset your IP, until the bug is fixed.

    with TorRequest(password='yourPasswordHere') as tr:
        tr.reset_identity()  # Reset Tor
        response = tr.get('http://ipecho.net/plain')
        print("New Ip Address", response.text)
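Building on the workaround above, the sketch below keeps creating fresh instances until the reported IP differs from the previous one. The function get_new_ip and its make_session parameter are made up for this example. make_session stands for any zero-argument callable that returns an object usable like TorRequest in the snippet above, for instance lambda: TorRequest(password='yourPasswordHere').

```python
def get_new_ip(make_session, previous_ip,
               url="http://ipecho.net/plain", max_attempts=3):
    """Return an IP different from previous_ip, or None if none was found."""
    for _ in range(max_attempts):
        # A fresh session per attempt, as suggested in the workaround.
        with make_session() as session:
            session.reset_identity()
            current_ip = session.get(url).text.strip()
        if current_ip != previous_ip:
            return current_ip
    return None
```

A None result means every attempt reported the old address, so it is worth waiting a little before trying again.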

    If you are still facing issues, check out the blog How to prevent getting blacklisted while scraping

    Hope this helps!

magnusvp June 11, 2019

Hi great guide! Just wanted to point out that in the TOR config file, there’s no need to uncomment ‘SOCKSPort 9050’ since it’s the default. What one needs to do for this guide to work is to uncomment ‘ControlPort 9051’ however.

alfa July 14, 2019

This is not working on Windows 10. What is the problem?
