

The Evolution of Catastrophic Service Failure at My ISP

OK, based on my experiences working for other ISPs, this is the normal progression of events during a catastrophic or major service failure (news, mail, etc.). This assumes a complete and total failure with no backup machines or services already in place (which there should be).

  1. Problem starts: A few users complain. ISP staff makes note to fix it.
  2. 6 hours later: Lots of users complaining. One sysadmin is working on the problem almost exclusively.
  3. 12 hours later: Problem still there. Support mailbox is getting flooded. At least 2 sysadmins are working overtime and Tim Hortons has been contacted to keep a steady flow of coffee going. Network status page updated. Description and apology entered in the support newsgroup.
  4. 24 hours later: Problem still occurring. Customers are getting *really* ticked. All qualified support staff are working on the problem. Outside contractors/specialists are called. Someone is looking into a complete replacement (mail/news/web/etc server).
  5. 48 hours later: Problem still occurring, replacement server has been purchased, it's on the way. Damage control is occurring in the support newsgroups. Support and Billing have been instructed to offer discounts for each half day of service outage.
  6. 60 hours later: Replacement server has arrived, new server software installed. Backup data being transferred. Patched into the network and testing starts.
  7. 72 hours later: Replacement server is online, service restored to normal. Detailed description and explanation of the problem and its solution posted to the newsgroup and network status page. All users who complain are automatically given a 3 day credit. Problem server / service being analyzed in detail. Once fixed, will be kept as emergency backup machine.

Now, here's what appears to be happening at My ISP during catastrophic failures:

  1. Service fails: User complains. Complaint ignored as tech assumes user doesn't know what they're talking about (irony???)
  2. 6 hours later: Service still down. Users complaining, but at a slow rate, since 310-SURF is so impossible to get through to (and there are so many DUMB levels to go through in the menus!). A trouble ticket is opened, but it gets deleted for the same reasons as 1) above.
  3. 12 hours later: Service still down. Users complaining in droves now. 310-SURF wait times are now measured in geologic ages. A note is forwarded to My ISP's upstream provider's staff that there 'might' be a minor problem somewhere. My ISP's upstream provider ignores it, assuming My ISP doesn't know what they're talking about. (Aha! There's that irony again!)
  4. 24 hours later: Service still down. Users are *ticked*. Complaints now flowing into tor.general, ott.general, etc. Someone at My ISP actually *calls* My ISP's upstream provider and says "Okay, there is a problem." My ISP's upstream provider opens a trouble ticket, but it's deleted for the same reasons as 3) above.
  5. 36 hours later: Service still down. Rogers (major competitor to My ISP) sales line experiences a sudden increase in inquiries. My ISP's upstream provider's staff admits something may be wrong. A junior tech tries to send email to himself as a test.
  6. 48 hours later: Service still down. The Anarchist Cookbook gets lots of downloads and there is a sudden interest in the location of My ISP's customer service centers. My ISP's upstream provider's senior admin finally looks at the problem (she sends herself an email as a test). Starts trying to fix things. 310-SURF told to deny existence of a problem.
  7. 60 hours later: Service still down. My ISP's users start buying new telephones as they throw theirs across the room in frustration. My ISP's upstream provider's staff are trying to re-write a custom version of TCP/IP in an effort to "fix" things. My ISP's support person breaks ranks and whispers to one caller that there might indeed be a problem. 5 minutes later this support person is fired.
  8. 72 hours later: Service definitely still down. Rogers marketing staff are dancing jigs as their market share visibly increases by the hour. My ISP's upstream provider's staff are now trying to re-write their own version of Windows 98 to "fix" the problem. My ISP's tech support are busily instructing people on how to re-format their hard drives and re-install Windows 98. My ISP makes a new TV commercial with a focus on the word "reliability". My ISP's prez laughs diabolically and states "I LOVE it! Go to press!"
  9. 90 hours later: A staff person from My ISP's upstream provider realizes that someone pulled the power cords simultaneously for their routers, UPS AND coffee machines. He plugs in the coffee machine and writes an email asking about the router power cords.
  10. 102 hours later: A tech from My ISP actually drives over to My ISP's upstream provider's office to see what's going on. While waiting for someone to answer a question, he notices a power cord on the ground and plugs it in. Service restored. Tech from My ISP is fired.
  11. My ISP posts nothing to the newsgroups. Tells callers that there was a very small minority of people who had problems, all of whom had unplugged their modems by mistake.
  12. New commercial for My ISP hits the airwaves. It's *very* effective. Unfortunately, they screwed up and put the Rogers sales number instead of their own.

The scary thing is that the above isn't really so unbelievable, is it?

The above is, of course, pure satire and humour. No offense is intended, though a suggestion that the ISP in question could seriously improve their response times is definitely intended :)

Excuse for censoring the names of My ISP (of many, many years ago, and not my current ISP) and their upstream provider: I've removed the name of my ISP and their supposedly arm's-length (ha!) upstream provider, simply because some people don't have a sense of humour and take things far too seriously. Sadly, "My ISP" could be replaced with many of the ISPs out there and this story would be just as accurate :(



© 2012 Marc Bissonnette, Beachburg, Ontario
