Relative Select

The relative select command is used to select related elements on the page.

Unlike the select command, it defines the selection as a relationship from one element to another. This requires two clicks. The first click defines the source element. It will typically be an element that is already selected. The second click defines the destination element.

You would use the relative select tool, instead of the select tool, if you wanted to get data for multiple items on the same page. For example:

  • You need price, publisher, name, url, and image from all products on a search results page.
  • In a list of movies on a site such as IMDB, you want the movie title, rating, and run time without navigating to the details page.
  • You want to scrape a table containing sports stats, weather, grades, prices, and so on.
  • You want to scrape all the comments on a product or article page, along with the comments that they are nested under.
  • Each element you want to scrape has its own "Load More" button that you need to click.

Just like select commands, relative select commands are labeled and can be referenced by other tools. If you click on the wrong element for the source and want to cancel your relative selection, you can press the escape key.

Internal representation

When you create a relative selection, ParseHub figures out a pattern to capture the relationship. Just like with the select command, you can add additional samples to the relative select by clicking on more elements. Unlike the select command, the pattern that ParseHub figures out describes pairs of elements (a source and a destination) rather than single elements.

Just like regular selections, the relative selection is not tied to the particular page where you defined the relationship. When you execute the command on a page, ParseHub will automatically figure out the elements based on the relationship.

Execution flow

When the relative select command is executed, ParseHub will apply the relationship to the current element (source) and will loop over the resulting elements (destinations), setting each in turn as the new current element, and executing all child commands.

The current element affects the relative selection (remember, the relationship is applied to the current element). This means that you can chain relative selections to express exactly how different pieces of the page are related to each other.
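
The loop described above can be sketched in a few lines of Python. This is only an illustrative model, not ParseHub's actual implementation: the Element class, the relative_select function, and the children_with_tag helper are all hypothetical names, and a "relationship" is assumed here to be a plain function from a source element to a list of destination elements.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Element:
    """Hypothetical stand-in for an element on the page."""
    tag: str
    text: str = ""
    children: List["Element"] = field(default_factory=list)


def relative_select(current, relationship, child_commands):
    """Apply the relationship to the current element (source), then run
    every child command with each destination as the new current element."""
    results = []
    for destination in relationship(current):
        for command in child_commands:
            results.append(command(destination))
    return results


def children_with_tag(tag):
    """Assumed toy relationship: direct children of the source with a given tag."""
    return lambda source: [c for c in source.children if c.tag == tag]


def extract_text(element):
    return element.text


page = Element("ul", children=[
    Element("li", children=[Element("span", "Widget"), Element("b", "$5")]),
    Element("li", children=[Element("span", "Gadget"), Element("b", "$7")]),
])


# Chaining: the inner relative select runs once per product, with that
# product as its current element.
def price_of(product):
    return relative_select(product, children_with_tag("b"), [extract_text])


all_prices = relative_select(page, children_with_tag("li"), [price_of])
```

Because the inner selection is evaluated against whichever product is current, each price stays grouped with its own product instead of being collected into one flat list for the whole page.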

Modifiers

Ctrl - Zoom

In a relative select command, the first element must be an orange element. The second element, however, can be any element on the page. By default, ParseHub restricts the elements that can be selected to those that have text inside them. Sometimes that is not desired (e.g. you'd like to extract the CSS class of an element which has no text). In that case, you can hold the Ctrl key and scroll up or down (one or two steps) to "zoom" to the right element.

After zooming to the right level, there may be one or more "potential" highlighted elements. These are the elements that you may now hover over and click on to select them.

Command options

Wait for elements to appear

Normally, a relative select command is skipped when its pattern matches no elements on the page (or rather, it loops over 0 elements). With this option, if no elements are matched, the command will keep trying for up to 60 seconds. This is useful when, for example, an element is created in response to an AJAX call.
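
The retry behavior can be pictured as a simple polling loop. The sketch below is an assumption about the general technique, not ParseHub's code: wait_for_matches and find_matches are hypothetical names, and the half-second polling interval is invented for illustration. Only the 60-second limit comes from the description above.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def wait_for_matches(
    find_matches: Callable[[], List[T]],
    timeout_seconds: float = 60.0,   # limit stated in the text above
    poll_interval: float = 0.5,      # assumed value, for illustration only
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Re-run find_matches until it returns something or the timeout passes.

    An empty list after the timeout means the command loops over 0 elements.
    """
    deadline = clock() + timeout_seconds
    while True:
        matches = find_matches()
        if matches or clock() >= deadline:
            return matches
        sleep(poll_interval)


# Usage: simulate an element that only appears on the third check,
# as if it were created by a late AJAX response.
attempts = {"count": 0}


def late_element():
    attempts["count"] += 1
    return ["comment-box"] if attempts["count"] >= 3 else []


found = wait_for_matches(late_element, timeout_seconds=5, poll_interval=0,
                         sleep=lambda _: None)
```

Without the option, the equivalent behavior would be a single call to find_matches, with an empty result meaning that the command is skipped.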
