Downtime Followup

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Downtime Followup

HTTP_404_NotFound@lemmyonline.com · 1 year ago

Understandably, using longhorn on top of ceph, is not the best decision.

However, given the backups were all performed in longhorn, the backups need to be restored to longhorn. As this is a temporary solution for now, this is taped together, by using replicas=1, which tells it to only keep a single copy of each piece of data. Now, ideally, this should mean the longhorn functions… as a glorified method of handling local storage, but, there are still other issues.

Another issue, recall, I had a 4-node kubernetes cluster, before the failure. Everything is currently condensed into a single VM… and it appears… it might just be too much stuff for a single server. Only… ~200 pods running, but, I am still seeing lots of errors for resource contention, despite having enough ram/cpu.

So… still working on this, to get things back to stable and normal.