Public vs Private Data

What can we generally scrape?

We only scrape public data. 

Public Data

Public data is any data, information, or text on a website that you can view without logging in and that the site’s robots.txt file does not prohibit from being accessed.

How to check

The easiest way to verify whether data is public or private is to open the page in a new “Incognito” (Chrome) or “Private” (Firefox, Edge, etc.) browser window. Do NOT log in, and do not grant access to your location or other data if the website prompts you.

[Screenshot: Firefox Private mode]

[Screenshot: Chrome Incognito mode]

If you are presented with a login page, the data is not public.

If you are NOT shown the data that you expected, the data is likewise not public.
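
For readers who script their checks, the two rules above can be roughly approximated in code. The sketch below is only a heuristic and not a replacement for the manual browser check: the function name `looks_public`, the `LOGIN_HINTS` list, and the example.com URLs are all hypothetical, and it assumes you already have the HTTP status code and the final URL from a request made without cookies or credentials.

```python
from urllib.parse import urlparse

# Assumed path fragments that often indicate a login page (illustrative only).
LOGIN_HINTS = ("login", "signin", "sign-in")


def looks_public(status_code: int, requested_url: str, final_url: str) -> bool:
    """Heuristically decide whether an anonymous request reached public data.

    status_code   -- HTTP status of the final response
    requested_url -- the URL that was originally requested
    final_url     -- the URL after any redirects were followed
    """
    # 401/403 mean the server refused an anonymous visitor.
    if status_code in (401, 403):
        return False

    requested_path = urlparse(requested_url).path.lower()
    final_path = urlparse(final_url).path.lower()

    # A redirect that lands on a login-like path suggests the data is private.
    if final_path != requested_path and any(h in final_path for h in LOGIN_HINTS):
        return False

    # Otherwise treat any successful response as (probably) public.
    return 200 <= status_code < 300


if __name__ == "__main__":
    print(looks_public(200, "https://example.com/jobs", "https://example.com/jobs"))
    print(looks_public(200, "https://example.com/profile/1",
                       "https://example.com/login?next=/profile/1"))
```

A successful response can still be missing the data you expected, so a result of True here only means the page was reachable anonymously; the content still needs to be checked by eye.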

Robots.txt

Additionally, if you are comfortable with the technical details, you can also check that the site’s robots.txt file does not disallow crawlers from accessing that page. If you do not know how to do that, we can do it for you.
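
As an illustration of what such a check can look like, here is a minimal sketch using `urllib.robotparser` from Python's standard library. The helper name `is_allowed`, the sample robots.txt rules, and the example.com URLs are assumptions made up for this example; a real check would use the contents of the target site's own /robots.txt file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch page_url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /account/
Allow: /
"""

if __name__ == "__main__":
    # A page outside the disallowed path.
    print(is_allowed(EXAMPLE_ROBOTS, "https://example.com/products/1"))
    # A page under the disallowed /account/ path.
    print(is_allowed(EXAMPLE_ROBOTS, "https://example.com/account/settings"))
```

To check a live site instead, `RobotFileParser` also offers `set_url()` and `read()` to download the file before calling `can_fetch()`. Passing robots.txt must be combined with the login check above: a page can be allowed by robots.txt and still be private.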

Please verify that the data you request is public.

Examples of private data that are frequently requested and that we are unable to scrape

  • Any kind of LinkedIn.com personal data (except public LinkedIn Jobs data)
  • Facebook private profiles
  • Facebook user data
  • Facebook private groups
  • Follower details for many social networks
  • Emails on social media profiles
  • Private data on social media profiles
  • Email addresses from websites that do not display one and only offer a contact form for reaching the business
