Issue
When crawling JavaScript single-page applications (SPAs) with the Web V2 connector, the crawl job fails during validation or startup. Users typically encounter the following error message in the Fusion UI:
Cannot connect to any of the provided start links. Errors: https://stackoverflow.com/questions/51851142/java-selenium-webdriver-failed-to-create-chrome-process
Diagnosis
In Fusion version 5.9.14, the failure is often caused by a library mismatch between the connector and the Selenium WebDriver, resulting in a java.lang.NoSuchMethodError.
In Fusion version 5.9.15, the issue is frequently linked to a NullPointerException (NPE) during the data source validation phase. The connector logs will show a stack trace similar to the following:
Caused by: java.lang.NullPointerException: Cannot invoke "String.split(String)" because "chromeExtraCommandLineArgs" is null
at com.lucidworks.connector.plugins.web.fetcher.webdriver.DefaultWebDriverPool.getChromeOptions(DefaultWebDriverPool.java:218)
This occurs specifically when the "Chrome extra command line options" field is left empty in the data source configuration, even though the field is marked as optional.
Environment
Fusion Version: 5.9.14, 5.9.15
Connector: Web V2 (versions 2.2.1 and below)
Platform: Self-Hosted (Kubernetes/AKS)
Service: Selenium Grid / Remote WebDriver
Cause
The root causes vary by version:
Fusion 5.9.14: Version incompatibility between the built-in Selenium/ChromeDriver libraries and the connector JARs.
Fusion 5.9.15: A code-level defect where the Web V2 connector fails to handle a null value for the "Chrome extra command line options" field, triggering an NPE when the system attempts to split the string for processing.
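The 5.9.15 failure mode can be illustrated with a minimal Java sketch. This is not the connector's source: the class and method names (ChromeArgsSketch, parseUnsafe, parseSafe) are hypothetical, and the whitespace delimiter is an assumption. It only shows why calling split on a null string throws, and how a null check would avoid it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChromeArgsSketch {

    // Mirrors the reported failure: calling split on a null String
    // throws a NullPointerException. The " " delimiter is an assumption.
    static List<String> parseUnsafe(String chromeExtraCommandLineArgs) {
        return Arrays.asList(chromeExtraCommandLineArgs.split(" "));
    }

    // Null-safe variant: a null or blank value means "no extra arguments".
    static List<String> parseSafe(String chromeExtraCommandLineArgs) {
        List<String> args = new ArrayList<>();
        if (chromeExtraCommandLineArgs == null
                || chromeExtraCommandLineArgs.trim().isEmpty()) {
            return args;
        }
        for (String arg : chromeExtraCommandLineArgs.trim().split("\\s+")) {
            args.add(arg);
        }
        return args;
    }

    public static void main(String[] args) {
        try {
            parseUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("Unsafe parse failed: " + e.getMessage());
        }
        System.out.println(parseSafe(null));
        System.out.println(parseSafe("--no-sandbox"));
    }
}
```

Supplying any non-null value in the field, as in the manual workaround, keeps the unsafe path from ever seeing null.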
Resolution
To resolve these issues, follow the steps below based on your current version:
Upgrade Web V2 Connector
The definitive fix for the NullPointerException and WebDriver initialization issues is to upgrade the Web V2 connector to version 2.2.2 or higher.
Download the updated connector package from the Lucidworks Plugin Portal.
Upload the new package to the Fusion Blob Store.
Restart the connector-plugin pods to ensure the new version is loaded.
Manual Workaround (Fusion 5.9.15)
If an immediate upgrade to the connector is not possible, you can bypass the NullPointerException by ensuring the "Chrome extra command line options" field is not null.
Navigate to the Web V2 Data Source configuration.
Locate the JavaScript Evaluation section.
In the Chrome extra command line options field, enter a valid flag such as:
--no-sandbox
Save the configuration and restart the crawl.
Selenium Grid Configuration
For self-hosted environments on Kubernetes, it is highly recommended to use a remote Selenium Grid to decouple browser/driver versions from the Fusion connector pods.
Deploy a Selenium Hub and Chrome nodes within your cluster.
Ensure the following properties are configured in your values.yaml or connector environment variables:
useRemoteWebDriver: true
remoteWebDriverUrl: "http://<selenium-hub-service>:4444/wd/hub"
Timeout Adjustments
If WebDriver sessions are created successfully but the job still fails with a TimeoutException, increase the following values in the Data Source configuration:
Page Load Timeout: 60,000 – 180,000 ms
Script Timeout: 60,000 ms
AJAX Timeout: 180,000 ms
Implicit Wait Timeout: 10,000 ms
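If the job still fails after raising these values, it may help to confirm that the hub named in remoteWebDriverUrl is reachable and reports itself ready. The following is a rough sketch using only the JDK's HTTP client (Java 11+). It assumes the hub serves the WebDriver status endpoint at the configured URL plus /status and that the response JSON contains a "ready" field. The class name GridStatusSketch, its helper methods, and the localhost default URL are illustrative placeholders, not part of Fusion.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.regex.Pattern;

public class GridStatusSketch {

    // Simplified check for "ready": true in the status JSON (no JSON library).
    private static final Pattern READY = Pattern.compile("\"ready\"\\s*:\\s*true");

    // Appends /status to the configured remote WebDriver URL (assumed layout).
    static String statusUrl(String remoteWebDriverUrl) {
        String base = remoteWebDriverUrl.endsWith("/")
                ? remoteWebDriverUrl.substring(0, remoteWebDriverUrl.length() - 1)
                : remoteWebDriverUrl;
        return base + "/status";
    }

    // Returns true when the status body reports the grid as ready.
    static boolean reportsReady(String statusBody) {
        return statusBody != null && READY.matcher(statusBody).find();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder default; pass the real hub URL as the first argument.
        String remoteWebDriverUrl =
                args.length > 0 ? args[0] : "http://localhost:4444/wd/hub";

        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(statusUrl(remoteWebDriverUrl)))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP " + response.statusCode());
        System.out.println(reportsReady(response.body())
                ? "Grid reports ready"
                : "Grid did not report ready");
    }
}
```

Run it from a pod or host that can reach the hub's service address. A connection error or a "not ready" result points to the grid deployment rather than to the timeout settings.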