Some Skype users faced a frustrating weekend after a software bug left many unable to log in to the Internet phone service.
The sign-on problem surfaced early on Thursday afternoon (August 16th), with the company’s blog reporting a software issue. At the same time, Skype temporarily disabled all downloads of its software.
With users all around the world still experiencing problems eight hours later, Skype blogger Villu Arak was reassuring folks that the service had not “crashed or been victim of a cyber attack,” rather nauseatingly adding, “We love our customers too much to let that happen.”
Arak went on to explain that the problem occurred because of a “deficiency in an algorithm within Skype networking software” which controlled the “interaction between the user’s own Skype client and the rest of the Skype network.”
By 11am the next day (Fri 17th), Skype was still wobbling like a large lady on a slimming belt, with no blog updates appearing until midnight, when Arak surfaced again to insist that they’d “commandeered extra supplies of pizza and coffee” to ensure that “the Skype people aren’t going anywhere until they’re happy that everything is back to normal.”
By 11am Sunday, Arak felt he knew his users well enough to address them as “friends,” bringing the glad tidings that all was now well with Skype, with a full explanation promised in the morning.
As for us, the Skype outage couldn’t have come at a worse time, as we were away from our desktop machines all weekend and were relying on the IM+ for Skype app on our Treo to keep us connected.
We’ve always found Skype to be reliable, so when we couldn’t connect we started to suspect the Palm program – thanks to the people at Shape for promptly putting us straight (and apologies for pointing the finger of blame at you guys!)
With an estimated 220 million people worldwide using the Skype service, the weekend’s outage was a timely reminder of how much we’ve grown to depend on the service, so let’s hope the folks at Skype don’t come across any more “algorithm deficiencies” any time soon.
Update: As promised, Arak has posted an explanation for the extended outage on the company’s blog:
“On Thursday, 16th August 2007, the Skype peer-to-peer network became unstable and suffered a critical disruption. The disruption was initiated by a massive restart of our users’ computers across the globe within a very short timeframe as they re-booted after receiving a routine software update.
The abnormally high number of restarts affected Skype’s network resources. This caused a flood of log-in requests, which, combined with the lack of peer-to-peer network resources, prompted a chain reaction that had a critical impact.
Normally Skype’s peer-to-peer network has an inbuilt ability to self-heal, however, this event revealed a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly. Regrettably, as a result of this disruption, Skype was unavailable to the majority of its users for approximately two days.
The issue has now been identified explicitly within Skype. We can confirm categorically that no malicious activities were attributed or that our users’ security was not, at any point, at risk.”
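For the curious, here's a rough back-of-the-envelope sketch of why a mass restart hurts so much. It is only a toy illustration, not Skype's actual code: the function names and the client numbers are made up, and it simply counts how many log-ins land in the busiest minute when the same number of machines come back within ten minutes rather than over ten hours.

```python
from collections import Counter


def peak_logins_per_minute(login_times_s):
    """Return the largest number of log-ins landing in any one-minute bucket."""
    buckets = Counter(int(t // 60) for t in login_times_s)
    return max(buckets.values())


def evenly_spread(num_clients, window_s):
    """Hypothetical clients logging in at evenly spaced times across window_s seconds."""
    return [i * window_s / num_clients for i in range(num_clients)]


# Illustrative numbers only, not Skype's real figures.
clients = 60_000
burst = evenly_spread(clients, 10 * 60)         # everyone back within 10 minutes
normal = evenly_spread(clients, 10 * 60 * 60)   # same log-ins spread over 10 hours

print(peak_logins_per_minute(burst))    # 6000
print(peak_logins_per_minute(normal))   # 100
```

With these invented figures the busiest minute is 60 times busier in the burst case, which gives a feel for how a "flood of log-in requests" can swamp a network that copes fine on an ordinary day.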