Geek Rant dot org

Thu 2009-05-14

When critical systems fail

Filed under: — daniel @ 19:18

There are some interesting things coming out of the bushfires royal commission. The last couple of days have highlighted the limitations of the emergency Triple-0 system: surges in the number of calls outstripped available capacity, and overflow calls were put on hold, got recorded messages or were diverted.

The first half-hour of Jon Faine’s show on 774 is worth a listen for those interested, particularly the section from about 10 minutes in, with Garth Head, a former adviser to the Minister for Police and Emergency Services. For geeks, it’s a reminder that the systems we design, implement and manage are sometimes critically important to those who rely on them.

Mon 2008-12-29

Hold music down (severity level unknown)

Filed under: — daniel @ 14:22

I find it mildly amusing when the web hosting ISP reports through its system status RSS feed that the support phone line hold music has broken down.

But I suppose it shows they’re serious about tech support issues.

Mon 2008-02-11

Just in case you need to know

Filed under: — daniel @ 13:07

My main web provider logs all their problems onto a fault-tracking database, and publishes them onto the Web, including via RSS, to make sure their customers are kept informed, and can work around things where necessary.

Even down to the most trivial thing.

We are currently experiencing issues with the on hold music on our telephone system. This is causing customers to receive silence when placed on hold. Periodic messages are still being played.

This will be rectified tomorrow morning.

Maybe I don’t need to know that, but it’s reassuring to know they’re being open and honest about any faults that occur. If only all companies were this open.

Wed 2007-09-26

Errors at Chadstone

Filed under: — daniel @ 07:23

This stuff never gets old. Seen at Chadstone Shopping Centre on Saturday:

Kiosk error

See also: Highpoint directory

Tue 2006-09-19

Citylink goes down

Filed under: — daniel @ 19:08

A power outage resulted in a shutdown of Melbourne’s CityLink tollway tunnels today around 9am, for several hours. Apart from the obvious electronic signs that rely on power, I assume it affected lighting and exhaust pumps.

According to the Herald-Sun, Citylink spokeswoman Jean Ker Walsh said: “We have rebooted the systems that allow our operators to manage the tunnels safely.” So there you go. They rebooted the tunnel. Ms Ker Walsh also mentioned on the evening TV news that they’d be upgrading their UPS!

Interestingly on the Herald-Sun’s RSS feed, this story came through in the early afternoon. The feed claimed there was an attached picture, but it turned out not to be a picture of gridlocked cars or an empty motorway — rather it seemed to be a picture of cricket players.

The other effect of the shutdown was the Citylink web site also appeared to lose power… or perhaps it was just snowed under by the traffic. Like some other transport providers, they didn’t cope well under stress.

The Vicroads web site kept running under the load, though apart from showing slow traffic in the area, it didn’t contain specific information relevant to motorists who might be caught there. I assume the information for radio reports and the like is gathered by phone, not off the web sites.

Sun 2006-05-28

Too scared to wipe your machine just to improve performance?

Filed under: — josh @ 07:35

If you’re too scared to wipe your machine just to improve performance, follow these instructions for keeping your old installation in a virtual machine.

Seriously, you’ve got to check out the screenshot of this guy’s Start Menu. Don’t believe them when they say size isn’t everything!

Tue 2006-04-11

Outages and response times

Filed under: — daniel @ 20:37

Cam ponders web hosting SLAs and wonders what’s reasonable. For his hosting, they guarantee 99.99% uptime, which works out to about 52 minutes of allowed downtime per year. (His outage was about 9 hours, or about ten years’ worth.)
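For anyone who wants to check those numbers, here’s a rough sketch of the arithmetic. The function name is made up for illustration, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a non-leap year


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year allowed by an uptime guarantee."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


budget = downtime_budget_minutes(99.99)   # the guarantee quoted above
outage_minutes = 9 * 60                   # the roughly 9-hour outage
years_of_budget = outage_minutes / budget

print(f"Allowed downtime: {budget:.1f} minutes per year")
print(f"A 9-hour outage uses about {years_of_budget:.1f} years of budget")
```

That comes out at roughly 52.6 minutes a year, so a 9-hour outage burns a bit over ten years of allowance.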

Bad stuff happens. We all know that, even with the most reliable setup ever. But there are some major factors in determining what’s acceptable:

Frequency: If it’s happening too regularly, then there’s a reliability problem. Whether it’s the hardware, the software or something else, it needs to be fixed. Cam reckons it’s the second time in a few months.

Response: Obviously, you want a quick response, and a quick (and reliable) solution. There are all sorts of monitoring tools out there these days. Typically anything like a full outage should be known about within minutes. A reputable web host will have substitute hardware ready to switch on and go just as soon as that nice recent backup is restored.

Communications: Any third party like this has to keep the customer informed. There’s no excuse for not doing so. SMS alarms, emails, phone calls, whatever. (I wrote about alarms recently for my work blog.)
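To make the response and communications points a bit more concrete, here’s a minimal sketch of the kind of logic a monitoring tool might use: raise an alarm after a few failed checks in a row, and say so again when things recover. This isn’t how any particular host or product does it. The class name, the threshold of three, and the `check` and `notify` callables are all assumptions for illustration.

```python
from typing import Callable, List


class OutageDetector:
    """Alert after a run of consecutive failed health checks (hypothetical sketch)."""

    def __init__(self, check: Callable[[], bool], notify: Callable[[str], None],
                 failures_before_alert: int = 3) -> None:
        self.check = check                    # returns True if the service is up
        self.notify = notify                  # stand-in for SMS, email, etc.
        self.failures_before_alert = failures_before_alert
        self.consecutive_failures = 0
        self.alerted = False

    def poll(self) -> None:
        """Run one health check and notify only when the state changes."""
        if self.check():
            if self.alerted:
                self.notify("Service recovered")
            self.consecutive_failures = 0
            self.alerted = False
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failures_before_alert and not self.alerted:
            self.notify(f"Outage: {self.consecutive_failures} consecutive checks failed")
            self.alerted = True


# Simulated check results: up, four failures in a row, then recovery.
results = iter([True, False, False, False, False, True])
messages: List[str] = []
detector = OutageDetector(check=lambda: next(results), notify=messages.append)
for _ in range(6):
    detector.poll()
print(messages)
```

In practice `poll` would run on a schedule (say, once a minute) and `notify` would send a real message. Waiting for a few failures in a row is a trade-off: fewer false alarms, at the cost of finding out a couple of minutes later.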

BTW, Cam’s also having troubles with his iPod… or more accurately, Apple’s 90 day warranty on replacement units.

I reckon he’s jinxed, myself.
