Solving Captcha

Some websites ask you to solve a Captcha before you can access their data. ParseHub can solve only Captchas that show an image whose contents must be typed in as characters.

In this article, we will show you how to add a Captcha solver to your template so that you can scrape Captcha-protected websites.

1. If you are not already in select mode, click the + button to the right of the "Select page" command and choose the "Select" tool from the tool menu.

2. Select the Captcha image. You can rename this selection to "image".

3. Click the + button next to the "Select & Extract image" command, then click Advanced options and choose the "Extract" command.

In the Extract text box, remove the existing $e.prop("src") expression and enter the Captcha solution command:

$e.solveCaptcha() 

This is an internal function that solves the Captcha automatically during the run.

Rename the Extract command to "captcha"; the following steps refer to the solution by this name.

4. Please note that the Captcha solver does not work while you are building the project or doing a test run. It works only once the project runs on ParseHub servers.

Now that you have added the Captcha solver, you can select the answer field and enter the solution via a ParseHub expression.

Click the + button next to the "Select page" command and choose the Select tool. Select the answer field. An "Input" command will be created automatically. Change the Input format to "expression" in the drop-down menu at the bottom of the command, and enter captcha without quotation marks. This value is the solution from the Captcha solver, which was extracted as "captcha" in the previous step.

5. Normally the page has a submit button that you can select to submit the Captcha solution.

Click the + button next to the "Select page" command and choose the Select tool. Select the "Submit" button.

While you are building the project, you must enter the Captcha solution manually. Before adding the next command, switch to "Browse" mode by clicking the green "Select" button at the top of the template. Then enter the Captcha solution manually on the website.

6. Click the + button next to the "Select & Extract submit" command and choose the "Click" command.

The Click command's configuration pop-up will appear. You can either repeat the same template or create a new template if the website loads the results on a different page.
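
To illustrate why step 4 asks you to enter captcha without quotation marks, here is a small Python sketch of how an expression of that kind can be thought of. It is a simplified model for explanation only, not ParseHub's actual implementation; the resolve function and the sample value "X7K9Q" are hypothetical. The assumption is that an unquoted name refers to a previously extracted value, while a quoted name is treated as literal text.

```python
# Illustrative model only; this is NOT how ParseHub is implemented.
# It shows the difference between an unquoted name and a quoted string
# in an expression-style Input field.

def resolve(expression: str, extracted: dict) -> str:
    """Return the text that would be typed for the given expression."""
    expression = expression.strip()
    is_quoted = (
        len(expression) >= 2
        and expression[0] == expression[-1]
        and expression[0] in ('"', "'")
    )
    if is_quoted:
        # A quoted expression is literal text, so the quotes are dropped.
        return expression[1:-1]
    # An unquoted name looks up a value extracted earlier in the run.
    return extracted[expression]

# Hypothetical solution produced by the Extract command named "captcha".
extracted = {"captcha": "X7K9Q"}

typed_without_quotes = resolve("captcha", extracted)    # the solved value
typed_with_quotes = resolve('"captcha"', extracted)     # the literal word

print(typed_without_quotes)
print(typed_with_quotes)
```

In this model, entering the name with quotation marks would type the word captcha into the answer field instead of the solution, which is why the unquoted form is needed.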

If you need more help with your project, please email us at hello@parsehub.com. We would be happy to help you.
