Issue
Users are unable to view data source run statuses or load schedulers within the Fusion Admin UI. Attempting to access these views results in an explicit API error during the retrieval of job schedules.
Diagnosis
When the problem occurs, the backend data source jobs continue executing normally via automated schedules, but their real-time statuses and configurations are entirely invisible or unmanageable through the administrative interface.
The issue typically manifests across multiple Fusion applications within the same cluster. Common initial troubleshooting steps, such as restarting job-launcher, api-gateway, or fusion-admin microservices, fail to restore visibility. In addition, checking the Solr system collections (such as system_jobs or system_jobs_history) shows that the underlying storage shards and replicas are completely healthy. This points to an inconsistent, cached scheduler state trapped within the job coordination layer rather than a storage or gateway routing failure.
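The Solr health check described above can be sketched with the Solr Collections API `CLUSTERSTATUS` action. This is a minimal sketch, not a Fusion-provided tool: `SOLR_URL` is an assumed address (for example, one reached through a port-forward), the function names are hypothetical, and any authentication your cluster requires is not shown.

```shell
#!/usr/bin/env bash
# Assumption: SOLR_URL points at a reachable Solr node; adjust for your cluster.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Hypothetical helper: print the CLUSTERSTATUS response for one collection.
collection_status() {
  local collection="$1"
  # CLUSTERSTATUS reports the state of each shard and replica in the collection.
  curl -sS "$SOLR_URL/admin/collections?action=CLUSTERSTATUS&collection=$collection&wt=json"
}

# Hypothetical helper: check both job-related system collections in turn.
check_job_collections() {
  local c
  for c in system_jobs system_jobs_history; do
    echo "== $c =="
    collection_status "$c" || return 1
  done
}
```

Calling `check_job_collections` prints the raw status for each collection. If every replica is reported as `active`, storage is unlikely to be the cause, which matches the diagnosis above.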
Environment
Fusion Version: 5.9.x
Solr Version: 9.6.x
Kubernetes Version: 1.32
Cloud Platform: AKS / Kubernetes-native
Cause
An inconsistent scheduler state or a stale metadata cache is introduced within the job orchestration layer, often triggered during ad-hoc schedule modifications, enabling/disabling tasks, or microservice communication hiccups. Because the scheduler state is maintained by dedicated internal services, simply recycling the main API gateway or administration pods does not clear the corrupted operational cache.
Resolution
To resolve the inconsistent scheduler state, clear the internal metadata cache by performing a sequential restart of the core job services via the Kubernetes CLI.
Step 1: Execute a rollout restart of the job-rest-server deployment to flush the REST operational cache.
kubectl rollout restart deployment/job-rest-server
Step 2: Monitor the pods and ensure that the job-rest-server instances return to a fully functional and healthy state.
kubectl get pods -l app.kubernetes.io/name=job-rest-server
Step 3: Once the REST server is completely healthy, trigger a rollout restart of the job-config service to re-synchronize the metadata configurations.
kubectl rollout restart deployment/job-config
Step 4: Verify that all pods for both services are running and ready.
kubectl get pods -l app.kubernetes.io/name=job-config
Step 5: Log back into the Fusion Admin UI and confirm that the data source run statuses load successfully and that job schedules are retrievable.
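The restart sequence in Steps 1 through 4 can also be sketched as a small script that enforces the ordering. This is an illustrative sketch rather than an official procedure: it uses `kubectl rollout status` to wait for each rollout instead of watching pods manually, and `NAMESPACE`, `TIMEOUT`, and the function names are assumptions that do not come from the article.

```shell
#!/usr/bin/env bash
# Assumptions: NAMESPACE is where Fusion is deployed; TIMEOUT bounds each wait.
NAMESPACE="${NAMESPACE:-default}"
TIMEOUT="${TIMEOUT:-300s}"

# Hypothetical helper: restart one deployment and wait for the rollout to finish.
restart_and_wait() {
  local deployment="$1"
  # Trigger the rolling restart.
  kubectl -n "$NAMESPACE" rollout restart "deployment/$deployment" || return 1
  # Block until the new pods are ready, or fail once the timeout expires.
  kubectl -n "$NAMESPACE" rollout status "deployment/$deployment" --timeout="$TIMEOUT" || return 1
}

# Hypothetical helper: run the two restarts in the order the steps require.
restart_job_services() {
  # job-rest-server goes first; job-config restarts only after it is healthy.
  restart_and_wait job-rest-server || return 1
  restart_and_wait job-config || return 1
}
```

Set `NAMESPACE` to the namespace that hosts Fusion, then call `restart_job_services`. The function stops at the first failure, so job-config is never restarted while job-rest-server is still unhealthy. Step 5, the check in the Fusion Admin UI, remains a manual confirmation.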