Engine Execution Slowness US East 1 Kitewheel Stack
Incident Report for Xponent
Resolved
This incident is fully resolved.
Posted Feb 20, 2023 - 22:12 UTC
Update
We have fully identified the issue and processed the full backlog. All processing should be back to normal now. We will continue to monitor.
Posted Feb 20, 2023 - 20:36 UTC
Monitoring
We are processing older jobs through the queue to catch up on everything except the offending jobs. The offending jobs that caused the issue will be processed and errored out.
Posted Feb 20, 2023 - 19:52 UTC
Update
Processing is still running normally. AWS has not allowed us to make configuration changes, so the graph runs that were delayed remain delayed, but all current processing is proceeding normally.
Posted Feb 20, 2023 - 18:34 UTC
Update
The configuration change has not worked, but we have moved the problematic workloads off to the side, and jobs are currently processing normally.

This has been shown to be related to an underlying platform issue, which has an update pending. We are attempting to apply that update, which could further impact processing.
Posted Feb 20, 2023 - 17:24 UTC
Identified
We have discovered that a configuration limit in our scaling capability has been hit for memory consumption. We are working to reconfigure the boxes for lower memory use per graph deployment.

This will cause a pause in executions, which will temporarily show errors in the hub for deployed graphs and may show some as stopped. After we have updated our configs, this will all be repaired and graphs should be restored to a running state. This will interrupt Graph API listener graphs.

We will update with further information as next steps progress.
Posted Feb 20, 2023 - 16:44 UTC
Update
We are continuing to investigate this issue.
Posted Feb 20, 2023 - 15:37 UTC
Update
We are attempting to reconfigure the engines to handle the current load more smoothly. We will update with progress.
Posted Feb 20, 2023 - 15:36 UTC
Update
We are continuing to investigate this issue.
Posted Feb 20, 2023 - 15:32 UTC
Investigating
We are seeing jobs backed up in the job distribution queue on the KW US East 1 Stack. This is impacting all types of graphs, including API listener graphs, at this time.
Posted Feb 20, 2023 - 15:32 UTC
This incident affected: Kitewheel (US East (N. Virginia) Engine, US East (N. Virginia) Graph API).