API Error When Retrieving Job Schedule and Data Source Run Status in Fusion Admin – Lucidworks

Issue

Users are unable to view data source run statuses or load schedulers within the Fusion Admin UI. Attempting to access these views results in an explicit API error during the retrieval of job schedules.

Diagnosis

While the problem is occurring, backend data source jobs continue to run normally on their automated schedules, but their run statuses and schedule configurations cannot be viewed or managed through the Admin UI.

The issue typically affects multiple Fusion applications within the same cluster. Common first troubleshooting steps, such as restarting the job-launcher, api-gateway, or fusion-admin microservices, do not restore visibility. Checking the Solr system collections (such as system_jobs or system_jobs_history) shows that the underlying shards and replicas are healthy. Together, these observations point to an inconsistent, cached scheduler state held in the job coordination layer rather than a storage or gateway routing failure.
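The Solr health check described above can be sketched with the Solr Collections API CLUSTERSTATUS action. This is a minimal illustration, not a Lucidworks-provided tool: the function names are hypothetical, the Solr base URL is an assumption (for example, Solr exposed locally through a port-forward), and the state matching is a simple text scan rather than a full JSON parse.

```shell
#!/usr/bin/env bash
# Sketch: report whether any shard or replica of a Solr collection is not "active".
# Function names are illustrative; the Solr base URL is supplied by the caller.

# Fetch CLUSTERSTATUS for a single collection as JSON.
fetch_cluster_status() {
  local solr_url="$1" collection="$2"
  curl -fsS "${solr_url}/admin/collections?action=CLUSTERSTATUS&collection=${collection}&wt=json"
}

# Read JSON on stdin and print how many "state" entries are not "active".
count_unhealthy_states() {
  grep -Eo '"state" *: *"[A-Za-z_]+"' | grep -vc '"active"' || true
}

# Print a one-line summary; return 0 if healthy, 1 if not, 2 if the request failed.
check_collection() {
  local solr_url="$1" collection="$2" status unhealthy
  status="$(fetch_cluster_status "$solr_url" "$collection")" || return 2
  unhealthy="$(printf '%s' "$status" | count_unhealthy_states)"
  if [ "$unhealthy" -eq 0 ]; then
    echo "${collection}: all shards and replicas active"
  else
    echo "${collection}: ${unhealthy} non-active state entries"
    return 1
  fi
}
```

For example, with Solr assumed to be reachable at http://localhost:8983/solr, running check_collection http://localhost:8983/solr system_jobs and then the same for system_jobs_history covers both collections named above. If both report healthy while the Admin UI still fails, the storage layer is unlikely to be the cause.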

Environment

Fusion Version: 5.9.x
Solr Version: 9.6.x
Kubernetes Version: 1.32
Cloud Platform: AKS / Kubernetes-native

Cause

The job orchestration layer ends up holding an inconsistent scheduler state or a stale metadata cache. This is often triggered by ad-hoc schedule modifications, enabling or disabling tasks, or transient communication failures between microservices. Because the scheduler state is maintained by dedicated internal services, restarting the API gateway or administration pods does not clear the corrupted cache.

Resolution

To resolve the inconsistent scheduler state, clear the internal metadata cache by performing a sequential restart of the core job services via the Kubernetes CLI.

Step 1: Execute a rollout restart of the job-rest-server deployment to flush the REST operational cache.

kubectl rollout restart deployment/job-rest-server

Step 2: Monitor the pods and wait until all job-rest-server instances are Running and Ready.

kubectl get pods -l app.kubernetes.io/name=job-rest-server

Step 3: Once the REST server is completely healthy, trigger a rollout restart of the job-config service to re-synchronize the metadata configurations.

kubectl rollout restart deployment/job-config

Step 4: Verify that all pods for both services are running and ready.

kubectl get pods -l app.kubernetes.io/name=job-rest-server
kubectl get pods -l app.kubernetes.io/name=job-config

Step 5: Log back into the Fusion Admin UI and confirm that the data source run statuses load successfully and that job schedules are retrievable.
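The sequential restart in Steps 1 through 4 can be sketched as a small script that waits for each rollout to finish before moving on. This is an illustration under stated assumptions: the function names, the NAMESPACE variable, and the TIMEOUT value are not from the original procedure, and the deployment names and label selectors are copied from the steps above, so adjust them if your installation names these resources differently.

```shell
#!/usr/bin/env bash
# Sketch of the sequential restart. NAMESPACE and TIMEOUT are assumptions;
# set NAMESPACE to the namespace where Fusion is installed.

NAMESPACE="${NAMESPACE:-default}"
TIMEOUT="${TIMEOUT:-300s}"

# Restart one deployment and block until its rollout completes or times out.
restart_and_wait() {
  local deployment="$1"
  kubectl -n "$NAMESPACE" rollout restart "deployment/${deployment}" || return 1
  kubectl -n "$NAMESPACE" rollout status "deployment/${deployment}" --timeout="$TIMEOUT"
}

# Restart job-rest-server first; continue to job-config only if it became ready.
restart_job_services() {
  restart_and_wait job-rest-server || {
    echo "job-rest-server did not become ready" >&2
    return 1
  }
  restart_and_wait job-config || {
    echo "job-config did not become ready" >&2
    return 1
  }
  # Final check: list the pods of both services.
  kubectl -n "$NAMESPACE" get pods -l app.kubernetes.io/name=job-rest-server
  kubectl -n "$NAMESPACE" get pods -l app.kubernetes.io/name=job-config
}
```

After sourcing the script, running NAMESPACE=fusion restart_job_services (where fusion is a placeholder namespace) performs the restarts in order. Using kubectl rollout status instead of polling kubectl get pods by hand keeps the ordering explicit: job-config is never restarted while job-rest-server is still rolling out.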