The click command is used to click on the current element (see Execution flow). You might need to use the click command:
- When content on the page is hidden until you click on a button to load it.
- When you click the "Next" button or page numbers in pagination.
- When you see a "load more" button that loads more elements via AJAX.
- When you click on multiple links on the page to load different pages with new URLs.
The click command shows up in two lines: one showing the action (click) on the elements, and one showing the template that ParseHub visits after executing the click.
How to use the click command:
1. Use the Select tool to make a selection on the page.
2. Click the + button on the selection command. If it is not visible, hold down the SHIFT key and wait for the + button to appear.
3. Choose the click tool from the toolbox.
4. Choose whether the element being clicked is a "next page" button.
5. Specify whether the click should repeat the current template (if it is a next page button) or whether it should stay on the same template or go to another existing or new template (if it is not a next page button).
6. Increase or decrease the number of seconds that ParseHub waits after the click command and before executing any other commands in the template.
Command Options
Loads a new page
This option should be used for clicking a button that will send you to a different web page, and will provide the option of using a new template when the page loads.
Uses AJAX
This option should be used for clicking a button that will not send you to a different web page, such as for buttons that:
- Load more content on a page.
- Open a tab of options.
- Close a popup.
Continue executing current template
You can choose this option if you don't want the Click tool to go back to the start of a template.
Go to template
When you use the Click tool, you can choose this option to run your chosen template from the beginning. The dropdown lets you choose which template in your project to use when it loads.
You will usually select a template within the same project, but you can also select templates from other projects in your account.
Go to Another Project
You also have the option of redirecting to a template on another project saved on your account. This is useful if you have multiple projects on the same website, or you are scraping websites with almost identical HTML structures.
Wait ___ seconds after clicking
This lets you specify how long, in seconds, ParseHub pauses after clicking on the current element.
Number of repeats
Lets you limit the number of times a click command is executed. If your click is a pagination click, setting the number of repeats to 4 will click on the next button a maximum of 4 times (i.e. scrape the first 5 pages).
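The relationship between the repeat limit and the number of pages scraped can be sketched in Python. This is only an illustrative model, not ParseHub's implementation: `scrape_pages`, `pages`, and `max_repeats` are hypothetical names, and the "site" is a plain list of lists.

```python
def scrape_pages(pages, max_repeats):
    """Model a pagination click that is limited to max_repeats clicks.

    pages is a list of per-page result lists (a hypothetical stand-in
    for a paginated site). Returns the results from every visited page.
    """
    results = []
    clicks = 0
    index = 0
    while True:
        results.extend(pages[index])       # extract data from the current page
        has_next = index + 1 < len(pages)
        if not has_next or clicks >= max_repeats:
            break                          # no "next" button, or limit reached
        clicks += 1                        # click the "next" button
        index += 1
    return results


# A made-up site with 10 pages of 2 items each.
site = [[f"item {p}-{i}" for i in range(2)] for p in range(1, 11)]
scraped = scrape_pages(site, max_repeats=4)
print(len(scraped) // 2)  # pages visited: 5
```

The first page is scraped before any click happens, which is why 4 repeats cover 5 pages.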
Pagination and Dynamic Loading
The click tool is incredibly important in pagination, that is, loading successive, individual result pages of data, such as different search pages on Google.
Most websites, such as Reddit, put each set of results on an entirely separate web page. With this kind of pagination, extract all the data you need first, then add a click command with the "Loads a new page" option.
Some websites, on the other hand, have dynamic loading. This means that they bring up new results on the same page, such as by loading more items when you scroll down (like Twitter), or when you click on a button.
You should use the scroll command for websites like Twitter that load more content when you reach the bottom of a page. For dynamic loading websites that require a button to be pressed, use the select command to select the button element, then the click command, and choose the "Uses AJAX" option.
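As a rough mental model of an AJAX "load more" click, the Python sketch below keeps clicking a button on the same page until no more content is left. Everything here is hypothetical: `FakePage`, `load_all`, and `wait_seconds` are made-up names used for illustration, and nothing is fetched from a real website.

```python
import time


class FakePage:
    """Hypothetical stand-in for a page with a "load more" button."""

    def __init__(self, total_items, batch_size):
        self.total_items = total_items
        self.batch_size = batch_size
        self.visible = min(batch_size, total_items)

    def has_load_more_button(self):
        return self.visible < self.total_items

    def click_load_more(self):
        # The URL does not change; more items appear on the same page.
        self.visible = min(self.visible + self.batch_size, self.total_items)


def load_all(page, wait_seconds=0.0):
    """Click "load more" until the button disappears; return the click count."""
    clicks = 0
    while page.has_load_more_button():
        page.click_load_more()
        time.sleep(wait_seconds)  # give the new content time to appear
        clicks += 1
    return clicks


page = FakePage(total_items=25, batch_size=10)
clicks = load_all(page)
print(clicks, page.visible)  # 2 25
```

The pause after each click plays the same role as the "Wait ___ seconds after clicking" option: data extracted too soon after the click may miss items that have not loaded yet.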
Parallelization
Normally, commands are executed in depth-first order. When you are clicking a link, the click command, like the go to template command, is an exception to that rule. ParseHub makes it easy to create projects that span thousands or millions of pages. To ensure that your data is retrieved quickly, the workload needs to be split up among a fleet of servers.
When executing a click command on a link, ParseHub will automatically offload the work to another server, then continue with executing the current template. This means that commands after the click command can be executed while the new template is still running. This sometimes results in unexpected behavior. For example, a list may be in a different order than it appears on the pages.
ParseHub will automatically figure out when it needs to wait for the click command to finish. For example, if the new template sets a property on the scope that is used by a later command, ParseHub will wait for that template to finish before executing the later command.
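The behavior described above resembles submitting work to a pool of workers and only blocking when a result is actually needed. The sketch below uses Python's standard `concurrent.futures` module as an analogy; it is not how ParseHub is implemented, and `run_template`, the page names, and the delays are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_template(url, delay):
    """Hypothetical template run; the delay mimics page load time."""
    time.sleep(delay)
    return f"data from {url}"


# Links in the order they appear on the page, with made-up load times.
links = [("page-a", 0.03), ("page-b", 0.01), ("page-c", 0.02)]
finished_order = []  # order in which the offloaded work completes

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = []
    for url, delay in links:
        # "Click" the link: offload the work and continue immediately.
        future = pool.submit(run_template, url, delay)
        future.add_done_callback(lambda f: finished_order.append(f.result()))
        futures.append(future)

    # A later command that depends on the results has to wait for them.
    in_page_order = [f.result() for f in futures]

print(in_page_order)   # always matches the order of links on the page
print(finished_order)  # may differ, depending on which page finished first
```

Collecting results as they finish is faster but can reorder a list, whereas waiting on each result in turn preserves the page order.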