Scrape product details

Almost all eCommerce or online retail websites display products on pages of search results.

With ParseHub you can easily grab details about each product from both the search page and the product's own page, for example to compare products.

Follow this tutorial if you want to scrape a site like Amazon, Etsy, H&M, or any other online retailer.

In this tutorial we will show you how to:

1) Starting from a results page, get the name and url of each product.

2) Get additional information about each product from the search page.

3) Get information about the products that can only be found on the individual item's page.

Getting Started

You can skip steps 1 and 2 if you're building this template after setting up another template, for example a main_template that gets all the categories on an eCommerce site.

1. Open the ParseHub desktop app and click "New Project".

 

2. In the text box, enter http://www.hm.com/us/products/ladies/dresses_jumpsuits (or the URL of any other eCommerce site) and click on "Start project on this url".

Get the name and url of each product with a Select command

3. Using the empty Selection created with this template, click on the name of the first product to select it.

4. Click on the second product name on the page. All of the names should now be selected for you in green. This should automatically extract the name and URL of each product.

5. Rename the Selection to products, or something similar.
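Once the project has run, the products selection can be read back from the exported results. The snippet below is only a sketch: it assumes a JSON export in which the selection appears as a list named products, with a name and a url field for each entry. The sample values and the product_links helper are invented for illustration, so check them against your own export.

```python
import json

# Hypothetical export: a "products" list where each entry carries the
# extracted name and url. The values are invented for illustration.
SAMPLE_EXPORT = """
{
  "products": [
    {"name": "Jersey Dress", "url": "http://www.example.com/product/1"},
    {"name": "Lace Jumpsuit", "url": "http://www.example.com/product/2"}
  ]
}
"""


def product_links(export_text, selection="products"):
    """Return (name, url) pairs from an export shaped like SAMPLE_EXPORT."""
    data = json.loads(export_text)
    # A missing selection yields an empty list instead of an error.
    return [(item["name"], item["url"]) for item in data.get(selection, [])]


links = product_links(SAMPLE_EXPORT)
for name, url in links:
    print(f"{name}: {url}")
```

If you renamed the Selection to something other than products in step 5, pass that name as the selection argument.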

Get additional information on the page with Relative Select commands

6. Click on the plus button beside Begin new entry in products, and add a new Relative Select command.

7. Click on the first product's name, then click on the first product's price below it. The relative selection should automatically select the price for every product. 

On some pages, ParseHub will make the relative selection from every name to one price. To fix this, simply click on the name of the second product, and the price below that, to train ParseHub to understand the data you need.

8. Rename the relative selection to price.

9. You can repeat steps 6–8 for all of the information you need. Make sure to use a separate relative selection for each piece of data.
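With a price attached to every product, a simple comparison becomes possible after the run. The sketch below assumes each price is extracted as text such as "$24.99", and that a comma inside a number is a thousands separator. The sample products and the helper names parse_price and cheapest_first are hypothetical.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Pull the first number out of a price string such as '$24.99'.

    Assumes commas are thousands separators; returns None if no number is found.
    """
    match = re.search(r"\d+(?:[.,]\d+)*", text)
    if match is None:
        return None
    return Decimal(match.group().replace(",", ""))


def cheapest_first(items):
    """Return (price, name) pairs sorted by price, skipping unparsable prices."""
    priced = [(parse_price(item["price"]), item["name"]) for item in items]
    return sorted(pair for pair in priced if pair[0] is not None)


# Invented sample data in the shape assumed above.
products = [
    {"name": "Jersey Dress", "price": "$24.99"},
    {"name": "Lace Jumpsuit", "price": "$1,049.00"},
    {"name": "Wrap Dress", "price": "$17.99"},
    {"name": "Linen Dress", "price": "Sold out"},
]

for price, name in cheapest_first(products):
    print(f"{price:>10} {name}")
```

Decimal is used instead of float so that prices keep their exact cents when compared.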

Get information on the products page with a Click command

10. If there's information that you can't find on the search page, it might be on the product's details page. Click on the plus button beside Begin new entry in products, and add a new Click command.

11. A pop up will appear with options for the click. Make sure to select Create New Template and input a name like product_details.

12. When the new page opens with the new template, click on the description you need. It will be selected with the automatically created empty selection.

13. Rename this selection description.

14. If you need more information, click on the plus button beside the Select page command, and add a new Select command. Then, repeat steps 12 and 13. Make sure not to add multiple items in one selection at this step.
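After the run, the details from each product's page can be combined with the search-page fields into one table. The sketch below assumes that the description from product_details ends up in the same entry as the product's name, url and price. That layout, the sample values and the to_csv helper are assumptions, so compare them with your own results first.

```python
import csv
import io

# Column order for the table; these match the selection names used above.
FIELDS = ["name", "url", "price", "description"]

# Invented sample data. The second product has no description, as can
# happen when a details page lacks the element you selected.
products = [
    {"name": "Jersey Dress", "url": "http://www.example.com/product/1",
     "price": "$24.99", "description": "Short dress in soft jersey."},
    {"name": "Lace Jumpsuit", "url": "http://www.example.com/product/2",
     "price": "$49.99"},
]


def to_csv(items, fields=FIELDS):
    """Write one CSV row per product, leaving missing fields blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        restval="",             # blank cell when a field is missing
        extrasaction="ignore",  # drop fields that are not in `fields`
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


table = to_csv(products)
print(table)
```

Keeping one selection per field, as step 14 recommends, is what makes each value land in its own column here.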

Getting Further Pages of Information

It's likely that you'll have more than one page of search results. In your product listing template, you can follow the steps from the tutorial on pagination to get every search result, not just those on the first page.
