I’m sending this to my boss to remind him why monitoring disk space is vital.

  • Pons_Aelius@kbin.social · 51 points · 1 year ago

    $100 says there is a series of emails, sent by a sysadmin/DBA over the past couple of months, warning about this issue and its increasing urgency in explicit detail, all of which have been ignored.

    The person who sent the emails will still get chewed out, because they failed to make the higher-ups realise this is a real problem.

    • ShunkW@lemmy.world · 10 points · 1 year ago

      I used to be a sysadmin; now I'm a software developer. At one of my old jobs at a massive corporation, they decided to consolidate several apps' DB servers onto one host. We only found out after it had already happened, because they had at least set up CNAME records properly, so the move was seamless to us (a quick sketch of that DNS indirection follows below). Some data was lost in the process, though, and with literally billions of records in our DB we didn't notice until it triggered a scream test from our users. We were also running up against data storage limits.

      They ended up undoing the change, which left us with a data-merge nightmare that lasted several full workdays.
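
      Side note on the mechanism mentioned above: a CNAME is just a DNS alias, so applications keep connecting to one stable name while ops repoint it at whatever host actually holds the data. A minimal sketch of checking where such an alias currently lands, assuming a purely hypothetical name like orders-db.internal.example.com:

      ```python
      import socket

      # Hypothetical alias the applications are configured to use; ops can
      # repoint the CNAME at a different database host without touching app config.
      DB_ALIAS = "orders-db.internal.example.com"

      # gethostbyname_ex() follows the alias chain and returns a tuple of
      # (canonical_hostname, alias_list, ip_addresses).
      canonical, aliases, addresses = socket.gethostbyname_ex(DB_ALIAS)
      print(f"{DB_ALIAS} currently resolves to {canonical} ({', '.join(addresses)})")
      ```

      That indirection is exactly why the consolidation was invisible to the application side until data started going missing.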

      • Brkdncr@kbin.social · 2 points · 1 year ago

        The only thing worse than a single database server is several poorly maintained database servers. The idea was right, but maybe the implementation was wrong.

      • Pons_Aelius@kbin.social · 6 points · 1 year ago

        While that story is shitty, I doubt the manufacturing-control DB and the customer-data DB are anywhere near each other.

  • MrPoopyButthole@lemmy.world · 13 points · 1 year ago

    I had a call-recording server crash two days ago, and when I inspected it I found some jackass had partitioned the drive into a 50 GB and a 450 GB partition and then never used the 450 GB one. The root partition had 20 KB of free space remaining. The DB was so far beyond fucked that I had to build a new server.
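
    The kind of thing that catches this before the crash is a dead-simple free-space check on the mount points you actually care about. A minimal sketch using only the Python standard library (the mount points and the 10% floor are placeholders, not anyone's actual config):

    ```python
    import shutil

    # Filesystems to watch; adjust to whatever mount points actually matter.
    MOUNT_POINTS = ["/", "/var"]
    # Complain when free space drops below this fraction of the partition.
    MIN_FREE_FRACTION = 0.10

    for mount in MOUNT_POINTS:
        usage = shutil.disk_usage(mount)  # named tuple: total, used, free (bytes)
        free_fraction = usage.free / usage.total
        status = "OK" if free_fraction >= MIN_FREE_FRACTION else "LOW"
        print(f"{mount}: {usage.free / 2**30:.1f} GiB free ({free_fraction:.0%}) [{status}]")
    ```

    Run it from cron or a monitoring agent and a 50 GB root partition quietly filling up stops being a surprise.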

  • MelodiousFunk@kbin.social · 10 points · 1 year ago

    I bet at one time they had a functional threshold-alerting system. Then someone missed something (because they’re human) and management ordered more alerts “so it doesn’t happen again.” Wash, rinse, repeat over the course of years (combined with VM sprawl and acquiring competitors) until there’s no semblance of sanity left: far past notification fatigue and well into “my job is just checking email and updating tickets now.” But management insists that all of those alerts are needed because Joe Bob missed an email… of which there are now exponentially more… and the dashboard is permanently half red anyway, because the CTO (bless his sociopathic heart) decreed that 80% is the company standard for alerts and a bunch of stuff just lives there happily, so good luck seeing something new (one way out of that trap is sketched below).

    …I was not expecting to process that particular trauma this evening.
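
    For what it’s worth, one way to claw back from that kind of noise is to alert on threshold crossings rather than on every poll, so hosts that permanently sit above the line don’t drown out new problems. A toy sketch of the idea; the 80% figure and host names are just illustrative:

    ```python
    # Toy illustration: fire only when a host newly crosses the threshold,
    # and again when it recovers, instead of re-alerting on every check.
    THRESHOLD = 0.80  # the infamous company-wide 80%

    previous_state: dict[str, bool] = {}  # host -> was it over the threshold last poll?

    def check(host: str, disk_used_fraction: float) -> None:
        over = disk_used_fraction > THRESHOLD
        was_over = previous_state.get(host, False)
        if over and not was_over:
            print(f"ALERT: {host} crossed {THRESHOLD:.0%} ({disk_used_fraction:.0%} used)")
        elif not over and was_over:
            print(f"RESOLVED: {host} back under {THRESHOLD:.0%}")
        previous_state[host] = over

    # Example polls: only the first crossing and the recovery produce output.
    check("db01", 0.85)  # ALERT
    check("db01", 0.87)  # silent, still over
    check("db01", 0.60)  # RESOLVED
    ```

    It doesn’t fix the underlying culture problem, but it keeps the permanently red stuff from burying the genuinely new red stuff.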

  • kowcop@aussie.zone · 2 points (1 downvote) · 1 year ago

    Sack both the capacity-management and the reporting & analytics teams.