Your ParseHub scraping project consists of a series of templates and commands.
Each template consists of a set of commands applicable to a particular website layout. For each different type of page layout on the website you will create a unique template that commands ParseHub to take particular actions on that layout.
For example, if you are scraping an e-commerce website, you may have one template (in this case named "main_template") that scrapes all of the results on a product listing page, and another template (in this case named "product_details") that scrapes the details shown when you click into a particular product from the results page (e.g. brand, price, SKU, shipping time, etc.).
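To make the relationship between a project, its templates, and their commands concrete, here is a minimal sketch in Python. It is only an illustration of the structure described above, not ParseHub's actual project format: the `Command`, `Template`, and `Project` classes and their fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    kind: str                 # e.g. "Select", "Click", "Extract"
    target: str = ""          # hypothetical label for what the command acts on
    children: list["Command"] = field(default_factory=list)  # nested commands


@dataclass
class Template:
    name: str
    commands: list[Command] = field(default_factory=list)


@dataclass
class Project:
    templates: dict[str, Template] = field(default_factory=dict)

    def add(self, template: Template) -> None:
        self.templates[template.name] = template


project = Project()

# One template per page layout: the listing page...
project.add(Template("main_template", [
    Command("Select", "product", children=[
        Command("Click", "product_details"),  # clicking leads to the other layout
    ]),
]))

# ...and the page you reach after clicking into a product.
project.add(Template("product_details", [
    Command("Select", "brand"),
    Command("Select", "price"),
    Command("Select", "SKU"),
]))

template_names = list(project.templates)
```

In this sketch the Click command on the listing page simply names the template that handles the next layout, which mirrors how "main_template" hands off to "product_details" in the example above.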
Each template has a "Template Options" section, which you can open by clicking on the three dots beside the template name on the right-hand side.
Please note that most projects will never need the more complex of these options; they are necessary only in a minority of cases.
There are 15 commands available in the ParseHub toolbox, each of which instructs ParseHub to take a different action in your project. The command reference lists all of the commands, with links to articles that discuss each one in depth.
For most projects you will only use a small number of commands. Some of the most common commands are:
- Select: this command selects elements on the page. If you click on one element, it will select that single element. If you then click on another similar element, it will automatically select all elements of that type and insert a Begin New Entry command (hidden under the list icon) to ensure each one has its own entry in your data.
- Relative Select: this command is nested under a Select command and links one element to another. After you've selected an item, you can use a Relative Select command to click on that item and link it to another. This is used to associate a date with a headline, a phone number with a name, or a price with a product name, for example.
- Click: this command allows your project to click into an element you've already selected with a Select command.
- Extract: this command allows you to extract data from an element you've already selected with a Select command. For example, if you select a link, it will automatically extract both the name of the link and the URL itself. If you were only interested in the name, you could use the Extract command to extract just the name.
You can also rearrange commands by dragging and dropping them to the right spot on your template.
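As a rough analogy for what Select, Relative Select, and Extract do, the sketch below performs the same steps by hand in Python using the standard library's `xml.etree.ElementTree`. This is not how ParseHub works internally, and the `PAGE` markup, class names, and field names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed stand-in for a product listing page (hypothetical markup).
PAGE = """
<ul>
  <li class="product"><a href="/p/1">Red Mug</a><span class="price">$8</span></li>
  <li class="product"><a href="/p/2">Blue Mug</a><span class="price">$9</span></li>
</ul>
"""

root = ET.fromstring(PAGE)

entries = []
# "Select": pick every element of the same type (all product items).
for product in root.findall(".//li[@class='product']"):
    link = product.find("a")
    # "Relative Select": find the price that belongs to this product.
    price = product.find("span[@class='price']")
    # "Begin New Entry": each product gets its own record.
    # Selecting a link yields both its name and its URL by default.
    entries.append({
        "name": link.text,
        "url": link.get("href"),
        "price": price.text,
    })

# "Extract": keep only the name if that is all you are interested in.
names = [entry["name"] for entry in entries]
```

Each dictionary in `entries` corresponds to one entry in your data, with the price tied to its product the way a Relative Select ties two elements together.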