Select


The select command is used to select elements on the page. Clicking on the + button and choosing Select will create a new select command nested under the current command.

Hovering over an element will highlight it in blue. Clicking on a highlighted element will select it (green). Clicking on further elements on the page will add them to the selection. While the select command is active, a number in brackets to the right indicates how many elements on the page are being selected.

Selection commands are labeled. You can edit these labels by clicking on them in the template details pane. The labels can be used with the Conditional and Jump tools for fine-grained control over execution flow. They also serve as reminders of which elements are being selected by that node.

A related command is Relative Select, which allows you to select elements relative to the current element (see below).

Internal representation

When you select an element, ParseHub figures out a concise pattern by which to represent that element (if you're familiar with XPath, it's kind of like that, but more powerful). When you add to your selection by clicking more yellow-highlighted elements, you are giving the select command additional samples so that ParseHub can make a better decision about the right pattern to use.
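To make the idea of "more samples give a better pattern" concrete, here is a minimal sketch in Python. It is not ParseHub's actual algorithm, which is not described here. It assumes a toy representation in which each sample is a path of (tag, position) steps, and the names `generalize` and `to_xpath` are hypothetical helpers invented for this illustration.

```python
from typing import List, Optional, Tuple

# One step of a path: (tag name, 1-based position among siblings, or None for "any").
Step = Tuple[str, Optional[int]]


def generalize(samples: List[List[Step]]) -> List[Step]:
    """Merge sample paths of equal depth into a single pattern.

    A position is kept only where every sample agrees on it; where the
    samples differ, the position is dropped so the step matches any sibling.
    """
    depth = len(samples[0])
    if any(len(sample) != depth for sample in samples):
        raise ValueError("this toy sketch only handles paths of equal depth")

    pattern: List[Step] = []
    for steps in zip(*samples):
        tags = {tag for tag, _ in steps}
        if len(tags) != 1:
            raise ValueError("samples disagree on a tag name")
        positions = {pos for _, pos in steps}
        agreed = steps[0][1] if len(positions) == 1 else None
        pattern.append((steps[0][0], agreed))
    return pattern


def to_xpath(pattern: List[Step]) -> str:
    """Render a pattern as an XPath-like string."""
    return "/" + "/".join(
        tag if pos is None else f"{tag}[{pos}]" for tag, pos in pattern
    )


# Two clicked links that sit in different list items.
first = [("html", 1), ("body", 1), ("ul", 1), ("li", 2), ("a", 1)]
second = [("html", 1), ("body", 1), ("ul", 1), ("li", 5), ("a", 1)]

print(to_xpath(generalize([first])))          # /html[1]/body[1]/ul[1]/li[2]/a[1]
print(to_xpath(generalize([first, second])))  # /html[1]/body[1]/ul[1]/li/a[1]
```

With one sample, the pattern pins down a single link. With a second sample, the step where the two differ is relaxed, so the pattern now covers the link in every list item.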

The select command is not tied to the particular page that you created the selection on. So when a selection is executed on a different page, ParseHub will automatically figure out which elements are in that selection based on the pattern. You can select extra elements across multiple pages to train the selection!

You may be wondering: what happens if I haven't given enough samples? This is a well-known problem for those who have written crawlers themselves: your code works on the pages that you've tried, but not on some that you haven't. For now, you simply have to find enough samples to train ParseHub. We are working on a way to automatically detect ambiguities so that ParseHub can figure out when it hasn't been trained enough.

Execution flow

You can think of a select command as a loop over the elements it selects. For each element in the selection, it sets that element as the current element, then executes all of the select command's children in order with that current element.

The current element determines what some commands, such as Extract and Click, will read or interact with.

If you nest two selections, the children of the inner selection are executed once for each inner element, for each element of the outer selection. That is, if each selection has 10 elements, the children of the inner selection will be executed 100 (10x10) times.
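The loop model above can be sketched as two nested `for` loops. This is only an illustration in Python, not ParseHub's implementation. The function `run_nested` and its parameters are hypothetical names, and it assumes, as in the example, that the inner selection matches 10 elements for every outer element.

```python
from typing import Callable, Iterable


def run_nested(
    outer_elements: Iterable,
    inner_elements_for: Callable[[object], Iterable],
    child: Callable[[object], None],
) -> int:
    """Run `child` once per inner element, for every outer element.

    Returns the number of times the child command was executed.
    """
    executions = 0
    for outer in outer_elements:                # outer select: sets the current element
        for inner in inner_elements_for(outer): # inner select: sets a new current element
            child(inner)                        # e.g. an Extract or Click command
            executions += 1
    return executions


extracted = []
count = run_nested(
    outer_elements=range(10),
    inner_elements_for=lambda outer: [(outer, i) for i in range(10)],
    child=extracted.append,
)
print(count)  # 100
```

If the outer selection matched no elements, the inner loop would never start and the children would not run at all.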

Modifiers

Ctrl - Zoom

By default, ParseHub restricts the elements that can be selected to those that have text inside them. Sometimes that is not desired (e.g. you'd like to extract the CSS class of an element which has no text). You can hold the Ctrl key and scroll up or down (1 or 2) to "zoom" to the right element.

After zooming to the right level, there may be one or more "potential" highlighted elements. These are the elements that you may now hover over and click on to select them.

Command options

Selection Node

If you press the "Edit" button here, you can choose to manually input a path to elements you want to select on the page using either CSS or XPath.

Use this function only if you understand CSS or XPath, and only when ParseHub's own selection, even with zooming, can't find the element you need.
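For readers new to these two notations, here is a small sketch of what such a path looks like. The markup, class names, and selectors are made up for this example. The XPath is evaluated with Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath, and ParseHub's own selector engine may behave differently.

```python
import xml.etree.ElementTree as ET

# A made-up, well-formed page fragment used only for this illustration.
PAGE = """
<html><body>
  <div class="product"><span class="name">Kettle</span><span class="price">19.99</span></div>
  <div class="product"><span class="name">Toaster</span><span class="price">34.50</span></div>
</body></html>
""".strip()

# XPath: every price span directly inside a product div.
XPATH = ".//div[@class='product']/span[@class='price']"

# A roughly equivalent CSS selector, shown for comparison only (not evaluated
# here). Note that .product matches any element whose class list contains
# "product", while the XPath above requires the attribute to equal it exactly.
CSS = "div.product > span.price"

root = ET.fromstring(PAGE)
prices = [element.text for element in root.findall(XPATH)]
print(prices)  # ['19.99', '34.50']
```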

Wait for elements to appear

Normally, a select command is skipped when its pattern matches no elements on the page (or rather, it loops over 0 elements). With this option, if no elements are matched, the command will keep trying for up to 60 seconds. This is useful if e.g. an element gets created in response to an AJAX call.
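The retry behaviour can be pictured as a polling loop with a deadline. The following Python sketch is an assumption about the general technique, not ParseHub's code. `wait_for_elements`, `find`, and the half-second polling interval are hypothetical; only the 60-second limit comes from the description above.

```python
import time
from typing import Callable, List


def wait_for_elements(
    find: Callable[[], List[str]],
    timeout: float = 60.0,      # upper bound taken from the text above
    interval: float = 0.5,      # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call `find` repeatedly until it returns matches or the timeout passes.

    Returns the matched elements, or an empty list if none appeared in time,
    in which case the command would loop over 0 elements.
    """
    deadline = clock() + timeout
    while True:
        found = find()
        if found or clock() >= deadline:
            return found
        sleep(interval)


# Simulate an element that only appears on the third look at the page,
# e.g. after an AJAX call has finished.
attempts = {"count": 0}


def fake_find() -> List[str]:
    attempts["count"] += 1
    return ["div.results"] if attempts["count"] >= 3 else []


# A no-op sleep keeps the demonstration instant.
result = wait_for_elements(fake_find, sleep=lambda seconds: None)
print(result, attempts["count"])  # ['div.results'] 3
```

Without the option, the equivalent would be a single call to `find`, with the command skipped if nothing matched.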
