A Must-Have Guide to Web Scraping Using Chrome

In today’s knowledge economy, web scraping can give you a competitive edge irrespective of your domain.

You could be an entrepreneur, marketer, researcher, analyst or journalist. To get ahead of everyone in the game, you need data, information, statistics or knowledge of the latest trends, whatever you choose to call it. Whether it is for pricing intelligence, competitor analysis, market research or sentiment analysis, you need to scrape actual data from the web to arrive at a suitable strategy.

Different users can have entirely different needs. The reason why you need web scraping determines everything else. If you are clear about your needs, your budget and the time and energy you can invest, it is easier to work out which tools to use and how to use them.

What is the Web Scraper Chrome extension?

The Web Scraper Chrome extension is one of the most powerful tools for extracting web data. Using the extension, you devise a plan, called a sitemap, for how a particular website of your choice should be navigated. The extension then follows that navigation design and scrapes the data. There are several web scraping extensions available for Chrome; among them, we have selected the most preferred tool. You can install this extension into your Chrome browser using this link.

You may need to extract different types of data such as tables, text, links and images. The Web Scraper Chrome extension lets you scrape multiple types of data with ease, and it can scrape multiple pages as well. Unlike tools that work well with static HTML alone, it can also extract data from dynamic web pages powered by JavaScript and Ajax.
You can get the extracted data in CSV format.

Here are the key features of the Web Scraper Chrome extension:

  • Scrape multiple pages
  • Sitemaps and scraped data are stored in the browser’s local storage or in CouchDB
  • Multiple data selection types
  • Extract data from dynamic pages (JavaScript + AJAX)
  • Browse scraped data
  • Export scraped data as CSV
  • Import and export sitemaps
  • Depends only on the Chrome browser

Why web scraping using Chrome?

As far as low-budget, easy-to-use web scraping tools are concerned, a Chrome extension for web scraping is a great choice!
Here are some more reasons for web scraping using Chrome:

  • First of all, it’s free! Buying web scraping software can burn a hole in your pocket. Available as a free download, the Chrome extension perfectly suits your budget!
  • It’s a point-and-click plugin which works for all kinds of users. Even if you don’t have the technical knowledge for handling software, you can still carry out web scraping using Chrome. You don’t need to write a single line of code. In short, you don’t need to be a programmer.
  • It is also a good tool if data as a service drains your pocket and resources at the moment, e.g. if you are buying a web scraping service from service providers.
  • As a first-time scraper or a beginner in the field, you would not want to go for something too technical, such as dedicated software. You may also want to see how it works out before investing too much money. Thus, in the initial stages of exploring web scraping, the Chrome extension will click for you!
  • If you are learning web scraping, handling web scraping software could be a difficult exercise. For learners, the Chrome extension can turn out to be a beginner’s delight!
  • If you don’t have any bulk work or professional needs which call for hefty investment in web scraping software, web scraping using Chrome works wonders for you!
  • You will get the scraped data in CSV format, so you can easily process it later as per your needs!

9 Must-know Points – Before you start using the Chrome extension

Like everything else, web scraping using Chrome requires ample clarity before you get started!
It will do you a world of good to take some time and ponder over these points before you take the plunge!
First of all, the prerequisites:

  • A PC with a reasonably good configuration
  • Google Chrome browser
  • Uninterrupted Internet connection while the scraper is running
  • The Web Scraper extension for scraping

Apart from these, you need a person constantly present to monitor the job, someone who can start and stop the scraping task!

1. Web Scraping: What’s that?

Before you use the Chrome extension for extracting data, ask yourself, ‘Do I know and understand the term “web scraping”?’ If you are sufficiently clear about it, you are more or less ready to proceed!

2. Knowledge of HTML and CSS

As you will soon find out, when you use the web scraping Chrome plugin, you need to select the element which you want to extract.

Obviously, you will get the data that you want only if you can select the element accurately!
So extraction of data is tied to selectors, which are based on HTML and CSS.

The extension has multiple selectors for different types of data extraction and for different kinds of interaction with a website.

The selectors can be divided into the following groups:

  • Data extraction selectors, which extract the data itself
  • Link selectors, which extract links; these links can be used to navigate within the website
  • Element selectors, which select the elements that separate multiple records

To be precise, HTML gives the structure of a web page, and CSS selectors find HTML elements in that page so that data can be scraped from them. This is why you need to know basic HTML and CSS before you get down to web scraping using Chrome.
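To make the relationship between HTML structure and selectors concrete, here is a minimal Python sketch. It is not how the extension works internally; it only imitates what a simple CSS selector of the form tag.class (for example a.title) would match. The sample markup, the class names and the helper select_text are all hypothetical, invented for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modelled on a product listing page.
SAMPLE_HTML = """
<ul>
  <li class="product"><a class="title" href="/p/1">Black Trainers</a><span class="price">£20.00</span></li>
  <li class="product"><a class="title" href="/p/2">Brown Boots</a><span class="price">£35.00</span></li>
</ul>
"""


class ClassTextCollector(HTMLParser):
    """Collects the text of every <tag class="cls"> element.

    This roughly mirrors the CSS selector `tag.cls`. It does not handle
    nested elements with the same tag name; it is only an illustration.
    """

    def __init__(self, tag, cls):
        super().__init__()
        self.tag = tag
        self.cls = cls
        self.matches = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.cls in classes:
            self._inside = True
            self.matches.append("")

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.matches[-1] += data


def select_text(html, tag, cls):
    """Return the stripped text of all elements matching `tag.cls`."""
    collector = ClassTextCollector(tag, cls)
    collector.feed(html)
    return [text.strip() for text in collector.matches]


titles = select_text(SAMPLE_HTML, "a", "title")     # like the selector a.title
prices = select_text(SAMPLE_HTML, "span", "price")  # like the selector span.price
print(list(zip(titles, prices)))
```

If the class name in the selector does not match the page’s markup exactly, the result is simply empty, which is why accurate element selection matters so much.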

3. Data: Define your need

Guess what?

Tools are secondary; it boils down to the needs of an end user.

If you need to scrape small chunks of data, the Chrome extension is a great solution for web scraping.

But there will be bottlenecks as your need for data intensifies. For instance, if you scrape 1,000 pages a day from a particular website, that website may stop serving your requests for a period of time. In other words, you could be blocked by the admin of that site for requesting too much data!

However, it’s not the end of the world, don’t you worry!

Cloud-based solutions can get around this bottleneck because they request data through high-anonymity (elite) proxies!
Nonetheless, rest assured, web scraping using Chrome works well for smaller portions of data!
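The idea of spacing out requests so that a site is not flooded (the extension exposes a request interval setting for the same reason) can be sketched in a few lines of Python. This is only an illustration under assumptions: fetch_politely is a hypothetical helper, the example.com URLs are placeholders, and the fetch function here is a stand-in that does not touch the network.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `fetch` and `sleep` are injected so the pacing logic can be tried out
    without real network access or real waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results.append(fetch(url))
    return results


# Demo with a fake fetch function; pauses are recorded instead of slept.
recorded_pauses = []
pages = fetch_politely(
    ["https://example.com/page/1", "https://example.com/page/2", "https://example.com/page/3"],
    fetch=lambda url: f"fetched {url}",
    delay_seconds=2.0,
    sleep=recorded_pauses.append,
)
print(pages)
print(recorded_pauses)
```

Pacing requests lowers the load on the target site, but it does not guarantee that you will not be blocked, and it makes a large job take proportionally longer.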

4. Lower Budget and Smaller Scale

No need to panic: most Chrome extensions for web scraping are freely available with all the basic features!

If you are satisfied with smaller chunks of data that you can get through chrome plugin, there is no reason to get worried.

If your company doesn’t want to invest heavily in data at the moment and the data you need is available through Chrome extension, it is the perfect solution!

In the context of these points, you have every reason to opt for chrome extension for web scraping!

5. Investment of Time

Most people opt for tools to get things done URGENTLY and automate the process.

Chrome extension will suit you if you are patient and can afford to wait for the extension to do its work at its pace!

If you are fine with this, there is no harm in opting for web scraping using Chrome!

In case you need to scrape or extract data from web pages quickly, you need to opt for cloud-based scraping!

6. Provision for Storage

Let’s say you opt for the Chrome extension. Keep in mind that cloud storage is not available, so you need to download the scraped data every time a scraping job is finished. To keep that data for historical comparison, you need to maintain versions of it yourself.
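One simple way to maintain versions yourself is to copy each downloaded CSV into an archive folder under a timestamped name. The sketch below assumes nothing about the extension beyond the fact that it produces a CSV file; archive_export, the folder name and the file name are hypothetical, and the demo runs inside a temporary directory.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def archive_export(csv_path, archive_dir, now=None):
    """Copy a downloaded CSV into archive_dir under a timestamped name."""
    source = Path(csv_path)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target_dir = Path(archive_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)  # the original download is left untouched
    return target


# Demo in a temporary directory, with a fixed timestamp for a stable result.
with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp) / "boohoo-shoes.csv"
    export.write_text("title,price\n", encoding="utf-8")
    saved = archive_export(export, Path(tmp) / "archive", now=datetime(2024, 1, 31, 9, 30, 0))
    saved_exists = saved.exists()
    print(saved.name)
```

Running it after every scraping job leaves one file per run, so older exports can be compared with newer ones later.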

Cloud storage can give you many advantages such as usability, disaster recovery and accessibility of data.

If storage is a consideration for you, it would be preferable to go for cloud based scraping. Else you can opt for desktop based or Chrome extension scraping!

7. Human Intervention

Obviously, you need a human being watching over the process of web scraping using Chrome, because he or she will need to select the elements for extracting data. Besides, the screen has to stay on while scraping is in progress.

So just keep in mind that you will need to station somebody to lead and monitor the process of web scraping using Chrome.

The good news is that there is a solution for this. You can use other tools which provide data as a service: you state your need and you get your data.

You can also make use of a cloud-based scraper, in which you merely select the elements and the scraper notifies you via email or other means once scraping is done!

However, you may not want to bother about all this if you can afford to station somebody to watch over the process. In such a scenario, web scraping using Chrome can work well for you!

8. Customer Support!

As you are now aware, most Chrome extensions for web scraping are free. Since they are free, you cannot expect a customer service executive to be ready to do the troubleshooting for you, can you?

Once you configure your task and set up the scraper, you will be on your own! If there are technical issues, remember, there’s no help desk to resolve your technical glitches! Here’s a screenshot from users who got stuck while scraping data using Chrome!

Customer Support

Wondering what to do?

Here’s how you can deal with it:

Before you get started, go through some tutorials and learn the basics of how to use a Chrome extension for web scraping!

If you don’t have the time or energy to do this, the other option is to move to a cloud-based scraper. Cloud-based scrapers provide configuration or setup of the task, and additional support is also available!

Now here’s the moment of reckoning for you: are you comfortable with a tool that offers no technical or additional support? If you think you can handle this, web scraping using Chrome is for you!

9. No API Support

Let’s consider this scenario:

You want to scrape property listing details from a real estate website and conduct some competitor analysis based on the extracted data.

In Chrome, you first need to download the data and build your database from it; only then can you get into your desired task! (This is of course the same for most free Chrome extensions.)

This is manual and may not be so much fun for you!

In case you want an alternative to this, you may consider using a desktop or cloud-based scraper which offers API support so that you can access the data anywhere.

Through the API, you can feed the data directly into your desired task and proceed to the intended outcome.

However, if you can manage without any API support, web scraping using Chrome isn’t that bad after all!
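If you stay with the extension, the “download the data, then build your database” step from the scenario above can be scripted. Here is a minimal sketch that loads a CSV export into SQLite using Python’s standard library. The table name, the column names (address, price) and the sample rows are hypothetical; a real export will have whatever columns your selectors define.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_file, connection, table="listings"):
    """Create `table` from the CSV header and insert every row as text.

    Assumes simple column names without double quotes in them.
    """
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    if not columns:
        return 0  # empty file: nothing to create or insert
    column_sql = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')
    rows = [tuple(row[name] for name in columns) for row in reader]
    connection.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    connection.commit()
    return len(rows)


# Demo with an in-memory database and made-up listing data.
sample = io.StringIO(
    "address,price\n"
    "1 Example Street,250000\n"
    "2 Example Road,310000\n"
)
conn = sqlite3.connect(":memory:")
inserted = load_csv_into_sqlite(sample, conn)
cheapest = conn.execute(
    "SELECT address FROM listings ORDER BY CAST(price AS INTEGER) LIMIT 1"
).fetchone()[0]
print(inserted, cheapest)
```

For a real export, open the downloaded file with open(path, newline="", encoding="utf-8") and pass a file-based connection such as sqlite3.connect("listings.db") instead of the in-memory one.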

Web scraping using Chrome: What exactly can you do?

Oh, it’s simple! You can scrape data from websites!

Wondering what kind of data from websites?

Here’s a sample list:

  • Product details from e-commerce websites
  • Email addresses
  • Netflix movie titles
  • Stock market data
  • Job listings
  • IMDB movie rankings
  • Reddit posts
  • Real estate listings
  • Dating profiles
  • Social media trends

The possibilities are limitless!

However, you may do well to keep in mind that it will all depend on your need for data. If you need data in small portions, web scraping using Chrome will serve as a great solution. But if you need to scrape data in huge volumes, it may not be possible or easy.

How to do web scraping using Chrome

Now that you are aware of what the Web Scraper Chrome extension is and the points to consider before using it, we will carry out a sample task.

To make this a practical example, we will scrape men’s shoes prices from www.boohoo.com. We have taken all the necessary screenshots to make each step easier for you to understand. You can follow the same steps for your own use case.

We have selected this use case because scraping product prices from competitor websites has become necessary to keep your product pricing in line.

In this sample task, we will extract the product Title, Regular Price and Discounted Price of all men’s shoes and shoes sub-categories from this URL: http://www.boohoo.com/mens/shoes/.

We will get the scraped data in CSV format after the scraper is set up and run.

Scraped Data

Alright, then let’s jump right into it!

To simplify things, we have divided the entire process into 6 stages, and each stage consists of a few easy steps to follow.

  • Installing the Web Scraper Chrome extension
  • Creating a scraper task
  • Selection of sub-categories of shoes
  • Selection of titles and prices from the listing page
  • Running the scraper
  • Extracting data

Installing web scraper chrome extension

1) Open Chrome web browser on your PC/laptop

2) Go to this link https://chrome.google.com/webstore/detail/web-scraper/jnhgnonknehpejjnehehllkliplmbmhn?hl=en

3) Click on the green “Add to Chrome” button in the top-right corner.
It will install the extension into Chrome, and you will see the button text change to “Added to Chrome”.

Ad To Chrome

Creating a scraper task

4) In Chrome, open the website from which you want to extract the data.

5) Here, I opened the URL – www.boohoo.com

6) Open the Web Scraper extension

=> by using the shortcut Ctrl+Shift+I and going to the Web Scraper tab
Or
=> go to Chrome browser => More tools => Developer tools => Web Scraper tab

Scrap Tool

You will notice that the browser screen is divided into two parts.

Please refer to the following screenshot to understand the naming conventions.

  • Browser main screen
  • Developer tools screen

Boohoo

7) Once you click “Web Scraper” tab in Developer Tool Screen, you will see the Web Scraper Dashboard.

Dashboard

8) Click on Create New Sitemap and choose Create Sitemap.

Sitemap

9) The following screen will appear. (There is no screenshot of the screen after this point)

New Sitemap

10) Fill in the sitemap name and start URL:
Sitemap name : Give a name to the task
Start URL : Enter the URL of the website from which you want to scrape data

I entered,
Sitemap name : boohoo-shoes
Start URL : http://www.boohoo.com/mens

Click on Create Sitemap.

Fill Sitemap

Selection of sub-categories of shoes

11) We want to scrape all the sub-categories of men’s shoes. So we will need to select all the subcategories using a Selector.

12) “Selector” is the term for the rule used to select an element on a web page. An “element” could be a link, text, an image, etc.

Selector

13) Once you click on Add new selector, it will open a new window like the following image:

Create Sitemap

14) Fill in the form using the following guidelines.

Id – identifies the selector
Here, I entered Id – shoes-subcategories

Type – the type of data you want to select
There will be a dropdown menu containing several items like Text, Link, Popup Link etc.

We need to select Type = Link because we want to click on all sub-categories of SHOES and go to all sub-category listings.

Type Selector

Selector – holds the CSS selector (pointer) that selects an element of the web page

Click on the Select button to select all the sub-categories of SHOES

15) That will give you a selection pointer to select any element of the web page.

16) Now move your mouse cursor to the Browser main screen.
When you hover the mouse cursor over mens, it will show all the menu items.
We want to select the sub-categories of SHOES, so select them by clicking on each sub-category.

Sub Categories

After selecting all sub-categories, click on Done selecting! to complete the selection step.

17) It will automatically fill in the Selector box as shown in the image below.
Tick the Multiple check-box because we want to select and scrape multiple sub-categories.

Selector Text Box

18) Through steps 13 to 17, the required fields to create a Selector are filled. Now save the selector by clicking Save selector.

Save Selector

19) After saving the selector, you will see the following image

Add New Selector

20) So far we have given the Chrome extension the links of the sub-categories to be scraped.

Now we need to tell it about the items by selecting items on any one of the sub-category pages.

To do this, go to ID – shoes-subcategories by clicking on it.

Shoes Subcategories

21) In ID – shoes-subcategories, click add new selector button.

Click New Selector

22) You will find the following image:

Create Sitemap

23) Now go to the mens section on the website => click on any sub-category of SHOES.
Here I clicked on Trainers. (Remember, you have to do this in the Browser main screen.)
(We could have selected any of the other sub-categories, e.g. boots, smart shoes, sandals & sliders. We only need to select one.)
The Browser main screen looks like the following image:

Men's Trainers

24) Now go to the Developer tools screen and fill in the details this way

  • id = item-shoes
  • type = element (Because we want to extract information from every item in the listing.)
  • Selector = Select

After clicking on Select, move the mouse cursor to the browser screen and select an item from the list as given below:

Select List

  • Next, select a second item by clicking on it. Once you have selected the second item, it will select all the other items as given below:
  • After that, click Done selecting! to complete this task.

Done Selecting

25) The selection will be entered automatically in the selector text box.

Selector Text Box

Tick Multiple (because we want data from all the items in the listing).
Now save the selector.

26) After saving the selector, it will appear as shown below:

Selector Saving

27) Through steps 23 to 25 we have instructed the Chrome extension to select all the items (boots or shoes) to be scraped.
Now we need to tell it to select the Titles and Prices of the items. This will be done in the following steps.
Again click on the element selector ID (entered as item-shoes in step 24; the following steps call it elements).

Click On Item

28) Inside ID elements, click Add new selector.

Elements Selectors

29) With this selector, we will extract Title of the item.

Fill boxes as follows:
Id – title
Type – text (Why text? Because I want to extract title name which is in text format)

Move to Select
Select Title of the item by clicking on title as shown below.
And then click Done Selecting to complete the selection task.

Select Title

30) The Selector box will be filled in automatically as shown below.
Next, save the selector.

Next Selector

31) After saving, it will appear as follows:

Selector Saving

32) After extracting the Title of the item, we want to scrape the Discounted Price of the item.
So go to the ID called elements and click Add new selector.

Click New Selector

33) This new selector extracts the Discounted Price of the items.

Fill in the details as follows:
Id – Discounted-price
Type – text

Now go to Selector,
and click on the Discounted Price as shown below:
Now click Done selecting to complete the selection.

Discount Price

34) The Selector box will be filled in automatically.
Next, click the Save selector button.

Fill Select Box

35) After saving the selector, it will appear as follows:

Saved Selector

36) Now, to scrape the Regular Price of the items, follow the same steps we followed for Title and Discounted Price.
Add a new selector in ID – elements

Id Selector

37) This new selector is for extracting the Regular Price of the item.
Fill in the form as follows:
Id – standard-price
Type – text (Because we want to extract the standard price of the item.)
Now go to Selector,
and select the regular price of the item by clicking on it as shown below:
After that, click Done selecting to complete the selection.

Regular Price

38) The Selector box will be filled in automatically.
Then click Save selector.

Fill Save Selector

39) After you save, it will appear as follows:
You can see that there are three elements from which we will get Title, Discounted Price and Regular (Standard) Price of the item.

Regular Selector

40) We can verify the configuration from all the above steps this way:
go to Sitemap => Selector graph

Sitemap Scrap

41) It will give you a tree structure for your scraper task (named test here).
Click on the nodes one by one and they will open up like branches.

Tree Structure

If you see the branches as shown below, you can be sure that the configuration has been done correctly.
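For reference, the same configuration can be written down as data. The Python sketch below builds a dictionary in the shape the extension’s sitemap export is understood to use (keys such as _id, startUrl, selectors and parentSelectors are recalled from that export format and should be checked against an export from your own install). The selector IDs and start URL come from the steps above; the CSS selector strings are hypothetical placeholders, because the article never shows the real ones. The small tree_lines helper is also invented here and prints the same parent-child tree that the Selector graph displays.

```python
import json

# CSS selector strings below are PLACEHOLDERS (hypothetical); replace them
# with the values the extension fills in when you click on page elements.
sitemap = {
    "_id": "boohoo-shoes",
    "startUrl": ["http://www.boohoo.com/mens"],
    "selectors": [
        {"id": "shoes-subcategories", "type": "SelectorLink",
         "parentSelectors": ["_root"], "selector": "a.subcategory-link", "multiple": True},
        {"id": "item-shoes", "type": "SelectorElement",
         "parentSelectors": ["shoes-subcategories"], "selector": "div.product", "multiple": True},
        {"id": "title", "type": "SelectorText",
         "parentSelectors": ["item-shoes"], "selector": "a.title", "multiple": False},
        {"id": "Discounted-price", "type": "SelectorText",
         "parentSelectors": ["item-shoes"], "selector": "span.sale-price", "multiple": False},
        {"id": "standard-price", "type": "SelectorText",
         "parentSelectors": ["item-shoes"], "selector": "span.regular-price", "multiple": False},
    ],
}


def tree_lines(sitemap, parent="_root", depth=0):
    """Return the selector tree as indented lines, children under parents."""
    lines = []
    for selector in sitemap["selectors"]:
        if parent in selector["parentSelectors"]:
            lines.append("  " * depth + selector["id"])
            lines.extend(tree_lines(sitemap, selector["id"], depth + 1))
    return lines


print("\n".join(tree_lines(sitemap)))
sitemap_json = json.dumps(sitemap, indent=2)  # text form, e.g. for keeping a backup
```

A correct configuration shows one link selector at the top, the element selector beneath it, and the three text selectors as its children.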

Running the scraper

43) Now it’s time to run the scraper configured in the steps above.

44) To run the scraper, go to Sitemap => Scrape

Sitemap Scrap

It will open a window like the image given below.
If you want to set the interval between requests when scraping multiple pages, you can set it as per your need.
(However, it is recommended that you do not change it.)

Sitemap Test

45) Now click on Start Scraping button.

Click To Scrap

46) It will automatically open a new window into your Chrome Browser and start scraping.

Chrome Browser

47) When scraping is done, you will see a notification as given below.

Finished Notification

48) You will get a table in your dashboard; see the image below:

Table Dashboard

49) For downloading your data, go to Sitemap => Export data as CSV

50) It will open a window as given below. In that window, click on Download now! link.

Download Button

51) The data will download as a .csv file (the CSV file carries the name of the main task)

File Downloaded

Hurrah, web scraping using Chrome extension is done!
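Once the CSV is on your disk, you can process it however you like. As a minimal sketch, the Python below reads rows and computes the discount percentage for each item. It assumes the export has columns named after the selector IDs used in this tutorial (title, Discounted-price, standard-price) and that prices look like “£20.00”; both are assumptions, and the sample rows are made up. The helpers parse_price and discount_rows are hypothetical names.

```python
import csv
import io
import re


def parse_price(text):
    """Extract the first number from a price string such as '£20.00'."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def discount_rows(csv_file):
    """Yield (title, discount_percent) for rows that have both prices."""
    for row in csv.DictReader(csv_file):
        regular = parse_price(row.get("standard-price") or "")
        discounted = parse_price(row.get("Discounted-price") or "")
        if regular and discounted is not None:
            yield row["title"], round(100 * (regular - discounted) / regular, 1)


# Made-up sample standing in for the downloaded file.
sample_export = io.StringIO(
    "title,Discounted-price,standard-price\n"
    "Example Trainers,£15.00,£20.00\n"
    "Example Boots,,£30.00\n"
)
discounts = list(discount_rows(sample_export))
print(discounts)
```

To run it on the real download, replace the sample with open("boohoo-shoes.csv", newline="", encoding="utf-8"), adjusting the file name to whatever your browser saved.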

Alternatives to web scraping using chrome

Installable Software

  • You need to install web scraping software on your PC. Most of the available software is Windows-based.
  • You can configure the software much like the browser extension.
  • You can get the data in CSV or another downloadable format.
  • You can scrape one or more pages at a time.
  • It is suitable for scraping small to medium amounts of data.

Cloud Based

  • You don’t need to install any software on your PC.
  • You can configure your plan and requirements.
  • You can get the data through an API and in downloadable formats.
  • There is no restriction on the amount of data to be scraped, as it runs on multiple computing environments.