In some instances, you may have a list of usernames and passwords and want to enter each username/password combination into a login page, extract data for that user, and then continue with the next combination.
In this tutorial, we will demonstrate how you can do that using Airbnb as an example.
Building your project
1. Open your ParseHub client, click on "New Project" and input the URL you would like to scrape data from. For this example, we will be using Airbnb. You can type https://www.airbnb.com/ into your project if you would like to follow along. Click on "Start a project on this URL".
2. Click the gear icon in the top left corner and click Settings from the dropdown menu.
3. There are two ways of importing your list of usernames and passwords to ParseHub:
- Import them from a CSV or JSON file using the "Import from CSV/JSON" option on ParseHub
- If you are familiar with JSON, copy your JSON list directly into the Starting Value textbox
If you have a CSV file called "loginList.csv" with two columns (a "username" column and a "password" column), selecting it with the "Import from CSV/JSON" option will import its rows as a list of username and password sets.
You also have the option of pasting in JSON like the list below directly into the "Starting Value" textbox.
{
"loginList":[
{"username":"cool@parsehub.com","password":"password1"},
{"username":"fun@parsehub.com","password":"password2"},
{"username":"awesome@parsehub.com","password":"password3"}
]
}
Note that the passwords will be visible on the app to you or anyone else who is able to view your account (all projects are public on the free plan).
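If you would rather build the JSON yourself from a CSV file before pasting it into the "Starting Value" textbox, a minimal Python sketch is shown here. It assumes the CSV has a header row with "username" and "password" columns, as described above. The function name csv_to_login_json is hypothetical and is not part of ParseHub; this is only one way to produce the same structure as the JSON sample.

```python
import csv
import io
import json


def csv_to_login_json(csv_text: str, list_name: str = "loginList") -> str:
    """Convert CSV text with 'username' and 'password' columns into
    a JSON object holding one list of username/password sets.

    Hypothetical helper for illustration; not a ParseHub function.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        {"username": row["username"], "password": row["password"]}
        for row in reader
    ]
    return json.dumps({list_name: rows}, indent=2)


# Sample data matching the JSON list in this tutorial.
sample_csv = (
    "username,password\n"
    "cool@parsehub.com,password1\n"
    "fun@parsehub.com,password2\n"
    "awesome@parsehub.com,password3\n"
)

starting_value = csv_to_login_json(sample_csv)
print(starting_value)
```

To use a real file, read its contents with open("loginList.csv", newline="") and pass the text to the function. Using json.dumps also avoids small syntax mistakes, such as a trailing comma, that can creep into hand-written JSON.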
4. Go back to the "Commands" tab on your project. Then click on the "+" button next to "Select page" and click on the "Advanced" arrow to show more tools.
5. Choose the "Loop" command. The loop command iterates through a list and is good for repeating commands multiple times.
6. In the text boxes, change "item" to "login" and type "loginList" into the list text box (without quotation marks).
- You can change "item" to anything you want. The item represents one username and password set in your list of usernames and passwords.
- Make sure the list name is the same as your list name in JSON. If you typed in {"loginList":....} make sure to keep the text in the text box as loginList (this is case sensitive).
7. Click on the "+" button next to "For each login in loginList", click on the "Advanced" arrow to show all the commands and select a "Begin New Entry" command. Using this command, the results for each one of the username and password sets will go into a separate row in Excel and a separate scope in JSON. If you don't use this command anywhere in your project, the results scraped for each username and password set will overwrite one another.
8. Rename the "list1" name that appears next to "Begin new entry" to something else like "logins". Make sure not to name the list command the same as the list that holds your username and password sets. The list command should have a unique name.
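If it helps to think of steps 5 to 8 in code, the sketch below is a rough Python analogy, not how ParseHub is implemented. The function scrape_account is a hypothetical stand-in for whatever you extract after logging in. The first loop shows results overwriting one another when no new entry is started; the second starts a fresh entry per login and collects the entries in a list named "logins".

```python
# The same list of username/password sets used in this tutorial.
login_list = [
    {"username": "cool@parsehub.com", "password": "password1"},
    {"username": "fun@parsehub.com", "password": "password2"},
    {"username": "awesome@parsehub.com", "password": "password3"},
]


def scrape_account(username: str) -> dict:
    """Hypothetical stand-in for the data extracted after logging in."""
    return {"account": username}


# Without "Begin New Entry": every iteration writes into the same
# result, so only the last login's data survives.
single_result = {}
for login in login_list:
    single_result.update(scrape_account(login["username"]))

# With "Begin New Entry": each iteration starts a fresh entry, and the
# entries are collected in a separately named list ("logins").
logins = []
for login in login_list:
    entry = {}
    entry.update(scrape_account(login["username"]))
    logins.append(entry)

print(single_result)
print(logins)
```

Note that the output list (logins) and the input list (login_list) have different names, which mirrors the advice in step 8 to give the list command a unique name.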
9. Now, we need to train ParseHub to click on the Login button to bring up the login pop-up. Click on the "+" button next to "Begin new entry in logins" (or "Begin new entry in list1" if you did not rename it in step 8) and choose a Select command.
10. Use your Select command to select the Login button, which will be highlighted in green. You can optionally rename your selection1 to "loginButton" by double-clicking on "selection1" and typing in the new name.
11. Click on the "+" button next to "Select & Extract loginButton" (or "Select & Extract selection1" if you didn't rename your selection in the previous step) and choose a Click command.
12. The pop-up will ask you if this is a "next page" button. Click on "No", which should bring up the "Create New Template" option, where you can input a name such as "login page", since Airbnb redirects us to a new www.airbnb.com/login URL. Click on "Create New Template".
13. By default, ParseHub will automatically skip a page if it has already been visited previously. Because we are going to visit www.airbnb.com/login each time we want to input a different username/password set, we should click on the template's "Options" and deselect "No Duplicates".
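To see why "No Duplicates" matters here, the following Python sketch models the behavior described in step 13 in a simplified way. It is a conceptual model only: the function pages_processed and its no_duplicates parameter are hypothetical names, and ParseHub's actual duplicate detection may work differently.

```python
def pages_processed(urls: list, no_duplicates: bool) -> list:
    """Return the pages that would be processed, in order.

    Simplified model: with no_duplicates on, a URL that was already
    visited is skipped; with it off, every visit is processed.
    """
    visited = set()
    processed = []
    for url in urls:
        if no_duplicates and url in visited:
            continue  # already seen, so the page is skipped
        visited.add(url)
        processed.append(url)
    return processed


# The login page is visited once per username/password set.
visits = ["www.airbnb.com/login"] * 3

print(len(pages_processed(visits, no_duplicates=True)))
print(len(pages_processed(visits, no_duplicates=False)))
```

In this model, leaving the option on means only the first visit to the login page is processed, so only the first username/password set would be entered. Turning it off lets all three visits go through.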
14. In the new template you'll already have a Select command ready for you by default (which should say "Empty selection1"). Select the username field that appears in the login pop-up. ParseHub will automatically create an Input command for you. Instead of typing the actual username, just type in "login.username". This will tell ParseHub to enter the username of the current set from your list of username and password sets. Also, ensure that you select "expression" in the "Input type" drop-down menu so that ParseHub will read the text as an expression instead of just plain text. You can optionally rename your selection from "selection1" to "username".
This login page has a "Continue" button to proceed to the password. If your website has the same button you can add a Select and Click command to continue.
15. Now repeat step 14 for the password field: click on the "+" button next to "Select page" and choose a Select command. Select the password field that appears in the login pop-up. ParseHub will automatically create an Input command for you. Instead of typing the actual password, just type in "login.password". This will tell ParseHub to enter the password of the current set from your list of username and password sets. Also, ensure that you select "expression" in the "Input type" drop-down menu so that ParseHub will read the text as an expression instead of just plain text. You can optionally rename your selection from "selection2" to "password".
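The difference between the "expression" input type and plain text can be illustrated with a small Python sketch. It only models a dotted lookup such as "login.username" against the current loop item; ParseHub's own expression handling is not shown here and may support more. The names resolve and input_value are hypothetical, and the label "text" for the plain-text input type is an assumption made for this example.

```python
def resolve(expression: str, scope: dict):
    """Look up a dotted name such as 'login.username' in a scope."""
    value = scope
    for part in expression.split("."):
        value = value[part]
    return value


def input_value(text: str, input_type: str, scope: dict) -> str:
    """Return what would be typed into the field.

    'expression' evaluates the text against the current loop item;
    any other input type types the text literally.
    """
    if input_type == "expression":
        return resolve(text, scope)
    return text


# The current item of the loop, named "login" as in step 6.
scope = {"login": {"username": "cool@parsehub.com", "password": "password1"}}

print(input_value("login.username", "expression", scope))
print(input_value("login.username", "text", scope))
```

In this sketch the first call yields the actual username for the current set, while the second types the literal characters login.username into the field, which is why selecting "expression" matters.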
16. To click on the "Log in" button, click on the "+" button next to "Select page" and choose a Select command. Select the login button. You can optionally rename your selection from "selection3" to "loginButton".
17. Next to "Select loginButton", click on the "+" button and choose a Click command. You will be prompted to answer the question of whether it is a "next page" button again; click "No". This should give you the option to create a new template which you can call something like "post-login". Click on "Create New Template".
18. On our post-login template we can teach ParseHub what data we would like to extract from the page after we've logged in.
If you are starting from a website similar to Airbnb where the login is in a pop-up, you may need to teach ParseHub to log out on your post-login template using a combination of Select and Click commands.
However, if the website has a dedicated login page (for example, www.examplewebsite.com/login), you can start the project directly from that page. Visiting it again should let you log into a second account without the need to log out first.
If you have any questions regarding your own project, you can always contact us at hello@parsehub.com.