Issue
Users may encounter a 504 Gateway Timeout error when sending requests to a query pipeline endpoint in Fusion. The response includes a message similar to the following:
"error": "service-timeout",
"msg": "Did not observe any item or terminal signal within 15000ms in 'circuitBreaker' (and no fallback has been configured)",
"path": "/api/apps/<app-name>/query/<pipeline-name>"
This error typically originates from the proxy (api-gateway) microservice and indicates that a response was not received within the configured timeout period managed by Resilience4j.
Cause
Fusion's proxy service uses Resilience4j for circuit breaker functionality. If the timeoutDuration value for the query instance is too low relative to downstream processing latency, the circuit breaker will trigger a timeout before receiving a response. This results in the service-timeout error and a 504 response code.
By default, the value may be set to 15 seconds (15000ms) on the api-gateways. In some environments, this timeout may be too aggressive for longer-running query requests.
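To confirm which limit a failing request hit, the timeout can be read directly from the error message. The following is a minimal sketch, assuming the response body has been saved or piped as text and that the message keeps the "within <N>ms" wording shown above. The function name extract_timeout_ms is hypothetical and not part of Fusion.

```shell
#!/bin/sh
# Hypothetical helper: read an error response body on stdin and print the
# timeout (in milliseconds) mentioned in the message, if one is present.
extract_timeout_ms() {
  sed -n 's/.*within \([0-9][0-9]*\)ms.*/\1/p' | head -n 1
}

# Example using the message from this article.
extract_timeout_ms <<'EOF'
"error": "service-timeout",
"msg": "Did not observe any item or terminal signal within 15000ms in 'circuitBreaker' (and no fallback has been configured)",
EOF
```

If the printed value matches the configured timeoutDuration, the request was most likely cut off by the proxy rather than by a downstream service.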
Resolution
To increase the timeout and prevent premature circuit breaker termination in the proxy service:
- Identify the api-gateway deployment in your Fusion Kubernetes namespace. This is typically named using the pattern: <namespace>-api-gateway
- Edit the deployment configuration using kubectl:
  kubectl edit deploy <namespace>-api-gateway -n <namespace>
- Locate the args: section under containers: and add or modify the following line:
  --resilience4j.timelimiter.instances.query.timeoutDuration=15s
This example sets the timeout duration for the query instance to 15 seconds, which is the same as the typical default. To allow longer-running queries, replace 15s with a larger value based on your system's expected response times.
- Save and exit the editor to apply the changes.
- Monitor the pod rollout to ensure that the updated configuration is applied.
- Confirm that the new pods are running.
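The last two steps can be sketched as a small script. This is an illustration rather than an official procedure: the function names gateway_deployment and verify_rollout are hypothetical, the deployment name assumes the <namespace>-api-gateway pattern described above, and the commands require kubectl access to the cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the deployment name from the namespace,
# assuming the <namespace>-api-gateway naming pattern.
gateway_deployment() {
  printf '%s-api-gateway\n' "$1"
}

# Hypothetical helper: wait for the rollout, list the gateway pods, and
# print the timeout flag found in the deployment's container args.
verify_rollout() {
  ns="$1"
  deploy="$(gateway_deployment "$ns")"

  # Block until the rollout completes, giving up after 5 minutes.
  kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=5m || return 1

  # Show the api-gateway pods and their current status.
  kubectl get pods -n "$ns" | grep 'api-gateway'

  # Print the timeoutDuration flag if it is present in the container args.
  kubectl get deployment "$deploy" -n "$ns" \
    -o jsonpath='{.spec.template.spec.containers[*].args}' |
    grep -o 'resilience4j\.timelimiter\.instances\.query\.timeoutDuration=[^]", ]*'
}

# Usage (replace <namespace> with your Fusion namespace):
#   verify_rollout <namespace>
```

If the final grep prints nothing, the flag was not found in the deployment, which suggests that the edit was not saved.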
Additional guidance
- Apply configuration changes in a lower environment first to validate expected behavior before deploying to production.
- Avoid setting excessively high timeout durations, as this can delay failure detection and degrade system responsiveness under load.
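One possible way to choose a value is to measure how long representative queries take before changing the timeout. The sketch below is an assumption-laden illustration: the function names, the FUSION_USER and FUSION_PASS variables, and the use of basic authentication are hypothetical, and the URL must be replaced with your own query pipeline endpoint.

```shell
#!/bin/sh
# Hypothetical helper: send the same request several times and print one
# line per request in the form "<http_code> <total_seconds>".
sample_latency() {
  url="$1"
  count="${2:-5}"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code} %{time_total}\n' \
      -u "$FUSION_USER:$FUSION_PASS" "$url"
    i=$((i + 1))
  done
}

# Hypothetical helper: read "<http_code> <total_seconds>" lines on stdin
# and print the largest total time seen.
slowest_seconds() {
  awk '{ t = $2 + 0; if (t > max) max = t } END { print max + 0 }'
}

# Usage (endpoint path taken from the error shown in this article):
#   sample_latency "https://<fusion-host>/api/apps/<app-name>/query/<pipeline-name>?q=<query>" 10 | slowest_seconds
```

Requests that already return 504 are cut off at the current limit, so their times show only that the limit was reached, not how long the query would actually need.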
For persistent or unclear timeouts, contact the Lucidworks Support Team for further assistance.