AI for Selenium WebDriver


I recently read about a way to use artificial intelligence to replace regular locators in Selenium WebDriver. I can’t wait to test this solution, so I’m getting to work!

You can find the previous post here: Proces testowy, do którego dążę w projektach, nad którymi pracuję (The test process I aim for in the projects I work on)

Setup of the application to be tested

I’m going to test the app you already know from the series on Cypress. Its setup is described in this post: Environment preparation

But how!

The solution is built on a plugin which you can find at the following address:

As you can see, its original purpose seems to be use with Appium, i.e. on mobile devices, but I will try to hook it up to desktop testing - supposedly that is possible. Admittedly, there are already some tools that advertise the use of AI for test automation, but most of them resemble Selenium IDE on steroids. Finally something for me!

Plugin Setup

The setup is described in the git link above. It looks like there is no Windows version, so here is how I handled it, one step at a time, on a Mac.

I installed all the required dependencies with the command below in the terminal. Of course, you need to have Homebrew (the brew tool) installed already:

brew install pkg-config cairo pango libpng jpeg giflib

I then installed the plugin using the npm tool, also in the terminal:

npm install -g test-ai-classifier

It turned out that the installation failed with an error:

No receipt for '' found at '/'.

No receipt for '' found at '/'.

No receipt for '' found at '/'.

The solution turned out to be reinstalling the Xcode command line tools. Note that the first command removes the directory reported by xcode-select, so use it at your own risk:

sudo rm -rf $(xcode-select -print-path)
xcode-select --install

Then I ran the installation again:

npm install -g test-ai-classifier

It completed successfully! Unfortunately, it would have been too good to be true for it to just work. After running this command in the terminal:


I got the following error:

Error: ENOENT: no such file or directory, open '/Users/marcinstanek/.nvm/versions/node/v12.14.1/lib/node_modules/test-ai-classifier/classifier-proto/classifier.proto'

I don’t know why, but there is no file under the given path. I poked around in the developer repo and simply created the file manually by copying it from this link:

I also opened an issue on GitHub, which you can find at this link. As you can read there, the problem with the missing file has since been fixed:

Well, finally the classification server is working properly!

node-pre-gyp info ai-rpc Classification RPC server started on

Test Setup

I use my boilerplate for automated testing with Selenium WebDriver, Java and Groovy, which you can download here: Selenium WebDriver Java - Groovy BoilerPlate. In the project you will find a README describing how to run it. This is a project I created back in 2018, so you can’t expect too much from it. Still, it is good enough for me to share, and it will let you test the plugin very quickly, which is what we want, right?

Once I have a working classification server and a project with Selenium, we still need to add a client that lets us use the functionality exposed by the server from within the code. Currently there are implementations for 4 different languages - which is very good:

Upon entering the repo, you will notice that the package is not published to Maven Central. Apparently, the developers are not going to make it easy. Fortunately, there is an easy way around this. Go to the address:

In the input at the top of the page, paste the repo address:

This way we have a ready-to-use package. Next, in the project with the tests, we add entries to pom.xml - I mean the one in the root directory. We just have to add an entry to the repositories node:


And to the dependencies node:


And then in the testframework module:


This way, finally, we have all the dependencies. Let’s move on to the test implementation.

First test

I start by creating a classifier field and initializing it in the BaseSpec class:

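    import ai.test.classifier_client.ClassifierClient
    import spock.lang.Specification

    @RetryOnFailure(times = 3)
    class BaseSpec extends Specification {
        // client used to talk to the classification server
        protected ClassifierClient classifier;

        def setup() {
            setupClassifier()
        }

        def cleanup() {
            if (classifier != null) {
                // assumption: shutdown() closes the client connection, as in the client's README
                classifier.shutdown()
            }
        }

        def setupClassifier(){
            // address and port as printed by the classification server
            classifier = new ClassifierClient("", 50051);
        }
    }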


The most important fragment:

classifier = new ClassifierClient("", 50051);

When creating the object, use the same address and port that the server printed in the terminal:

node-pre-gyp info ai-rpc Classification RPC server started on
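Since the tests depend on the server actually listening at that address, a quick check before the run can save some head-scratching. Below is a minimal sketch in plain Java. The ClassifierServerCheck class and its isListening method are my own hypothetical helper, not part of the plugin or its client, and the "localhost" default is only an assumption. The port 50051 is the one used above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of test-ai-classifier or its Java client.
final class ClassifierServerCheck {

    private ClassifierServerCheck() {
    }

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Connection refused, timeout, unresolved host or invalid port.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the host and port printed by the server; "localhost" is only an assumed default.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50051;
        System.out.println("Classifier server listening: " + isListening(host, port, 2000));
    }
}
```

This only proves that something accepts TCP connections on that port, not that it is the classification server, but it is enough to tell a missing server apart from a failing test.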

I am adding a new file named LoginSpec.groovy in the directory:


I am also creating a new test:

import com.marcinstanek.seleniumjavaspockboilerplate.BaseSpec
import io.qameta.allure.Epic
import io.qameta.allure.Feature

class LoginSpec extends BaseSpec {
  def 'Open login page from home page'(){
    when: 'user click Sign in button'
    // ask the classifier for elements that look like a "Sign in" button
    def els = classifier.findElementsMatchingLabel(driver, "Sign in")
    // press the first match
    els.get(0).click()

    then: 'is redirected to /login'
    driver.currentUrl == 'http://localhost:4100/login'
  }
}

The test is supposed to find the Sign in button on the page and press it. The library we installed provides the ClassifierClient class, which in turn gives us the findElementsMatchingLabel(driver, "Sign in") method. We must pass it the WebDriver instance and a string describing the element we are looking for.

Let’s look at this line:

def els = classifier.findElementsMatchingLabel(driver, "Sign in")

The method returns a list of WebElement objects that should look like a Sign in button. We can then work with them like with any other WebElement. I just select the first item in the list and try to press it.
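Taking the first item blindly ends in a rather unhelpful index error when the classifier finds nothing. Below is a minimal sketch of a guard in plain Java. The Matches class and its firstOrFail method are hypothetical names of mine, not part of the client library. It works on any list, so it could also take the list of WebElement objects returned above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the classifier client.
final class Matches {

    private Matches() {
    }

    // Returns the first match, or fails with a message naming the label that was searched for.
    static <T> T firstOrFail(List<T> matches, String label) {
        if (matches == null || matches.isEmpty()) {
            throw new NoSuchElementException(
                    "Classifier returned no element for label '" + label + "'");
        }
        return matches.get(0);
    }

    public static void main(String[] args) {
        // Strings stand in for WebElement objects so the sketch runs without a browser.
        List<String> found = Arrays.asList("first match", "second match");
        System.out.println(firstOrFail(found, "settings"));

        try {
            firstOrFail(Collections.<String>emptyList(), "Sign in");
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a test this would read something like Matches.firstOrFail(els, "Sign in").click(), so a failed search shows up with the label in the error message.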

Here’s how to run the test:

[Video: how to run the test]

As you can see, the test failed. Still, let’s analyze how the library approaches the problem. According to the developers, it works like this. The classifier client, the dependency we added to the project, runs an XPath query to find candidate elements on the page and takes a screenshot of each one. The client then sends these screenshots to the server (the part running in the terminal), which classifies the images and sends the classification strength for each element back to the client. At this point the client has everything it needs to map the results back to the elements found by the XPath query and return those that match the label we are looking for. Unfortunately, as you can see in the launch video, 9 elements were found, but none of them was matched as the correct one.

I’m writing this test off as a loss and will try something else. :)
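To make the last step of that flow more concrete, here is a rough sketch in plain Java of how candidates with a classification strength could be narrowed down to matches. This is not the client's actual code. The Candidate type, the confidence threshold, the strongest-first ordering and the sample data are all my assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; these types are hypothetical, not the client's real classes.
final class ClassificationSketch {

    private ClassificationSketch() {
    }

    // One element found by the XPath query, with the server's verdict for its screenshot.
    static final class Candidate {
        final String xpath;       // where the element was found
        final String label;       // label predicted by the server
        final double confidence;  // classification strength, assumed to be in [0, 1]

        Candidate(String xpath, String label, double confidence) {
            this.xpath = xpath;
            this.label = label;
            this.confidence = confidence;
        }
    }

    // Keeps candidates classified as wantedLabel with at least minConfidence, strongest first.
    static List<Candidate> matching(List<Candidate> candidates, String wantedLabel,
                                    double minConfidence) {
        List<Candidate> result = new ArrayList<>();
        for (Candidate candidate : candidates) {
            if (candidate.label.equals(wantedLabel) && candidate.confidence >= minConfidence) {
                result.add(candidate);
            }
        }
        result.sort(Comparator.comparingDouble((Candidate c) -> c.confidence).reversed());
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the server's response.
        List<Candidate> candidates = Arrays.asList(
                new Candidate("//a[1]", "settings", 0.91),
                new Candidate("//a[2]", "search", 0.88),
                new Candidate("//a[3]", "settings", 0.42));
        for (Candidate match : matching(candidates, "settings", 0.5)) {
            System.out.println(match.xpath + " " + match.confidence);
        }
    }
}
```

Seen this way, the failed test makes sense: elements were found, but if none of them is classified under the requested label, the filtered list stays empty.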

Second test

It turns out you can’t type literally anything as a search term - that would be too good. The list of available keywords can be found here. It is not very impressive so far, so I’ll have to write a test that uses one of them.

def 'User click settings link at home page'(){
    given: 'user is on the login page'
    driver.get(getConfig().url + 'login')
    when: 'email field has been filled'
    and: 'password field has been filled'
    and: 'button clicked'
    and: 'settings link clicked'
    // find the settings link via the classifier and click the first match
    def button = classifier.findElementsMatchingLabel(driver, "settings")
    button.get(0).click()

    then: 'user should be redirected to /settings'
    driver.currentUrl == 'http://localhost:4100/settings'
}

As you can see, I mixed the search styles here in order to log in successfully. For the topic of this post, the interesting line of code is this one:

def button = classifier.findElementsMatchingLabel(driver, "settings")

Let’s check it!


As you can see, the test was successful. The item we searched for using the newly added library was found and the interaction was successful. Excellent!
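Because the classifier only understands a fixed list of labels, it may be worth rejecting unsupported ones before a test ever reaches the server. Below is a minimal sketch in plain Java. The SupportedLabels class is a hypothetical helper of mine, and the set is a placeholder that contains only "settings", the one label this post shows working. The real list would have to be copied from the plugin's keyword list.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard, not part of the classifier client.
final class SupportedLabels {

    // Placeholder set: only "settings" is confirmed in this post; extend it from the plugin's list.
    private static final Set<String> LABELS;

    static {
        Set<String> labels = new HashSet<>();
        labels.add("settings");
        LABELS = Collections.unmodifiableSet(labels);
    }

    private SupportedLabels() {
    }

    // Returns the label unchanged if it is known, otherwise fails with a descriptive message.
    static String require(String label) {
        if (label == null || !LABELS.contains(label)) {
            throw new IllegalArgumentException(
                    "Label '" + label + "' is not in the list of supported labels: " + LABELS);
        }
        return label;
    }

    public static void main(String[] args) {
        System.out.println(require("settings"));
        try {
            require("Sign in");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a guard like this, the first test would have failed immediately with a clear message about "Sign in" instead of after a full classification round trip.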


The new way of searching for elements on the page works, but it’s pretty limited at the moment. Installation is problematic at this stage, which is definitely a downside. However, once everything is set up, using the library is very simple. Migrating existing tests to this way of searching for elements would probably be relatively easy. I think the development of this project is worth watching. Perhaps the long-awaited revolution in the field of testing is just around the corner? :)

The code can be seen here: