Issue
When upgrading from the forked Tika parser to the asynchronous Tika parser, users may find that the asynchronous Tika parsing pod is not deployed or running in the Kubernetes cluster. This prevents asynchronous parsing from functioning, potentially resulting in slower indexing performance for large document volumes.
Diagnosis
To verify whether the asynchronous Tika parser pod is running, run the following command:
kubectl get pods | grep async
If the pod is running, the output lists a pod whose name contains async with a STATUS of Running. If the command returns nothing, the pod is not deployed.
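The check above can also be scripted. The following is a minimal sketch, assuming the default kubectl column order (NAME, READY, STATUS, RESTARTS, AGE). The function name async_pod_running is hypothetical and not part of any tool mentioned in this article.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods` output on stdin and succeeds
# only if at least one pod whose name contains "async" has STATUS "Running".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
async_pod_running() {
  awk '$1 ~ /async/ && $3 == "Running" { found = 1 } END { exit(found ? 0 : 1) }'
}

# Example usage against a live cluster:
#   kubectl get pods --no-headers | async_pod_running \
#     && echo "async parsing pod is running" \
#     || echo "async parsing pod is missing or not running"
```

Unlike a plain grep, this also treats a pod stuck in a state such as Pending as a failure, which is usually what matters for this diagnosis.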
To reduce storage requirements, add or adjust the following configuration in the fusion_values.yaml file:
async-parsing:
  volume:
    storage: 10Gi
Apply the updated Helm values to redeploy the services.
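How the values are applied depends on how the cluster was installed. As a rough sketch, assuming a plain Helm-managed release, the helper below prints a helm upgrade command for review without running it. The release, chart, and namespace arguments are placeholders that must come from your own deployment, and the function name print_upgrade_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a helm upgrade command so it can
# be reviewed first. The release, chart, and namespace are assumptions that
# must be replaced with the values used by your own deployment.
print_upgrade_cmd() {
  release=$1
  chart=$2
  namespace=$3
  echo "helm upgrade $release $chart --namespace $namespace --values fusion_values.yaml"
}

# Example usage:
#   print_upgrade_cmd "<release>" "<chart>" "<namespace>"
```

If your deployment was set up with a generated upgrade script, running that script is likely the safer route, since it may pass additional values files.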
Resolve StatefulSet patch errors
Kubernetes permits changes to only a few StatefulSet fields, so applying a new Helm configuration that changes the volume size can fail with the following error:
Error: UPGRADE FAILED: cannot patch "fusion-sandbox-async-parsing" with kind StatefulSet:
StatefulSet.apps "fusion-sandbox-async-parsing" is invalid: spec: Forbidden:
updates to statefulset spec for fields other than 'replicas', 'template',
'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
If this error occurs:
Delete the StatefulSet:
kubectl delete statefulset <statefulset-name>
Delete the associated PersistentVolumeClaim (PVC):
kubectl delete pvc -l <label-key>=<label-value>
Recreate the StatefulSet using the updated YAML file:
kubectl apply -f <statefulset-config-file.yaml>
After redeployment, re-run the command:
kubectl get pods | grep async
Verify that the asynchronous parsing pod is running as expected.
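The recovery steps above can be chained so that a later step never runs after an earlier one fails. This is a sketch only: the function names run and recreate_statefulset and the DRY_RUN variable are hypothetical, and the script prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Deletes the StatefulSet and its PVCs, then reapplies the manifest.
# The && chain stops at the first step that fails.
recreate_statefulset() {
  sts_name=$1      # StatefulSet name, for example the one in the error message
  pvc_selector=$2  # PVC label selector in key=value form
  manifest=$3      # path to the updated StatefulSet YAML file
  run kubectl delete statefulset "$sts_name" &&
    run kubectl delete pvc -l "$pvc_selector" &&
    run kubectl apply -f "$manifest"
}

# Example usage (prints the three commands for review):
#   recreate_statefulset "<statefulset-name>" "<label-key>=<label-value>" "<statefulset-config-file.yaml>"
# To execute them for real:
#   DRY_RUN=0 recreate_statefulset ...
```

Printing by default is a deliberate choice here, because deleting a PVC can remove the data stored on the underlying volume, depending on the volume's reclaim policy.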