Command | Description and Sample Tutorials |
---|---|
Select | Targets one or more elements for a command. Auto-extracts element text and URL when possible. This command is typically used before any other command. Commands applied to a selection run once for every element selected. |
Relative Select | Pairs already-selected elements with corresponding elements on the page and targets those new elements for a command. Auto-extracts element text and URL when possible. Useful when creating a relationship between two types of elements on a webpage, for example linking product names to their prices on a product listing page. |
Click | Clicks an already-selected element on the page. Use it to load a new page, open tabs or drop-downs, or click buttons that load content on the same page. Often used to click through "next" buttons (pagination). |
Input | Types text into an already-selected textbox or input area. Used to fill out search boxes and login forms. |
Hover | Hovers over an already-selected element on the page. Use it to reveal an element that only appears on hover, such as addresses on a map. |
Scroll | Scrolls an already-selected element into view. Used to help you get data from an infinitely scrolling page. Use this command after selecting the main container on a page to scroll down to the bottom. |
Extract | Grabs and stores text, URLs, or other attributes from an already-selected element. Selections automatically extract text and URLs unless some other command is applied to the selection. Extracted data can then be used in if statements to decide what else to scrape. |
Begin New Entry | Creates a new bucket in your data which you can use to store related extractions. ParseHub will usually add this for you automatically. Use it to create a list without any selections when you want to control how your data is organized and collected. Also use it to create a new list after a Loop command that runs over JSON you added in the "Starting value" box in the project's "Settings" tab. |
Go To Template | Makes ParseHub go to a different template. Useful if you want to scrape data from a specific URL or list of URLs. Also useful if you want to reuse a template that has already been created. |
Loop | Goes through the items in a list in your data. A list appears in your data either when you fill in the "Starting value" text box in the "Settings" tab of your project or when you use the Begin New Entry command earlier in your project. Useful if you want ParseHub to go through a list of URLs and visit them one by one to get data. Also used if you want ParseHub to go through thousands of keywords and enter each one into a textbox one by one. |
Conditional | Similar to an if statement in a programming language. It evaluates an expression, and if the expression is truthy, ParseHub executes the commands inside the Conditional command. For example, use it to filter a selection of ratings to those higher than 9. |
Stop | Similar to a break statement in a traditional programming language. Stops execution when ParseHub reaches this command. ParseHub exits to the named selection and continues executing the commands after that selection. If you stop and exit to the page selection, ParseHub stops executing that entire template. Useful if you want ParseHub to stop collecting data after 20 titles that contain similar text. |
Jump | Makes ParseHub go to another command. ParseHub will continue executing the instructions after this command. Usually used after the Click command to get content and data that loads on the same page. Useful for going to the next page when it is loaded dynamically using AJAX. Also used for recursive relationships. |
Wait | Adds extra waiting time to the project. When added under any command, ParseHub waits the specified number of seconds before moving on to other commands. Use it when the web page takes a long time to load in any browser, or to make sure ParseHub does not skip over data when scraping. |
Server Snapshot | Captures a snapshot of exactly what ParseHub's servers see during a run. You can then inspect a snapshot by clicking the camera icon on the command. Used for troubleshooting your project when it's not working correctly. ParseHub also takes server snapshots automatically when something unexpected happens, for example if a page fails to load. |
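
The Loop, Conditional, and Stop rows compare these commands to constructs in a programming language. As a rough illustration of that comparison, here is a plain-Python analogy. It is not ParseHub syntax and does not use any ParseHub API. The function name, field names, and sample data are all hypothetical. The thresholds (ratings higher than 9, stopping after 20 titles) echo the examples in the table.

```python
# Plain-Python analogy only; this is not ParseHub syntax or its API.
# All names and sample data below are hypothetical.

def collect_titles(entries, min_rating=9, max_titles=20):
    """Mimic Loop + Conditional + Stop over a list of scraped entries."""
    kept = []
    for entry in entries:                 # Loop: visit each item in the list
        if len(kept) >= max_titles:       # Stop: leave the loop, like `break`
            break
        if entry["rating"] > min_rating:  # Conditional: nested step runs only if truthy
            kept.append(entry["title"])   # Extract: store a value from the current item
    return kept


sample = [
    {"title": "A", "rating": 9.5},
    {"title": "B", "rating": 8.0},
    {"title": "C", "rating": 9.1},
]
print(collect_titles(sample))  # prints ['A', 'C']
```

In a ParseHub project the same flow would be built from commands in the tool box, not written as code. The sketch only shows the order in which the loop, the condition, and the stop are evaluated.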
Commands in the tool box - Reference