A magazine recently ran a ‘Dilbert Quotes’ contest, asking people to submit quotes from their real-life Dilbert-type managers. These were voted the top ten quotes in corporate America:
‘As of tomorrow, employees will only be able to access the building using individual security cards. Pictures will be taken next Wednesday, and employees will receive their cards in two weeks.’
(This was the winning quote from Fred Dales, Microsoft Corp in Redmond, WA)
‘What I need is an exact list of specific unknown problems we might encounter.’
(Lykes Lines Shipping)
‘E-mail is not to be used to pass on information or data. It should be used only for company business.’
(Accounting manager, Electric Boat Company)
‘This project is so important we can’t let things that are more important interfere with it.’
(Advertising/Marketing manager, United Parcel Service)
‘Doing it right is no excuse for not meeting the schedule.’
(Plant Manager, Delco Corporation)
‘No one will believe you solved this problem in one day! We’ve been working on it for months. Now go act busy for a few weeks and I’ll let you know when it’s time to tell them.’
(R&D supervisor, Minnesota Mining and Manufacturing/ 3M Corp)
Quote from the Boss: ‘Teamwork is a lot of people doing what I say.’
(Marketing executive, Citrix Corporation)
My sister passed away and her funeral was scheduled for Monday. When I told my boss, he said she died on purpose so that I would have to miss work on the busiest day of the year. He then asked if we could change her burial to Friday. He said, ‘That would be better for me.’
(Shipping executive, FTD Florists)
‘We know that communication is a problem, but the company is not going to discuss it with the employees.’
(Switching supervisor, AT&T Long Lines Division)
We are a New Zealand company with a large global user base for our web-based project management software, ProWorkflow.com.
All our SaaS servers are located at the LayeredTech/Fastservers data center in Chicago. They have given us 6-7 years of solid performance, with plenty of staff onsite for 24/7 support.
I get asked regularly why I don’t use New Zealand-based hosting providers for our global SaaS company. The reason is simple.
I’m yet to find one in New Zealand that actually takes it seriously, has full onsite 24/7 support and offers a level of transparent communication at least similar to most US data centers.
I want to see:
- Multiple support staff onsite 24/7
- Redundancy that works
- Clear open communication with customers
What I’ve seen often (and lately) is:
- Staff go home at 5:30 and monitor remotely
- Redundancy that fails
- Lack of / confusing communication with customers
So what caused this blog post?
Most NZ techos would be aware that a few days ago a data center had major issues with an infrastructure failure that resulted in about 24 hours of complete downtime.
I believe this affected many thousands of websites and a massive amount of email. A few of our non-related websites and our email are hosted with a well-known New Zealand-based hosting provider.
The outage came at a time when we were dealing with some important issues, and due to the lack of email our company had to pull all-nighters calling customers and monitoring our DB-based backup support system.
We run a 24/7 SaaS model with US customers awake while NZ sleeps, which is why we need to know our servers have onsite support. If they go down, companies literally stop working. We don’t host websites; our SaaS app is a core business system for many thousands of users globally.
To reiterate, the letter below is not from our server provider in the US. It’s from a hosting provider here in NZ that we just use for email and some small websites. During the outage, this company was very non-communicative with its customers. People were freaking out, didn’t know what was happening and were trashing the provider on Twitter and other places.
New Zealand hosting providers I’ve talked to and worked with seem to forget the basic fact that a large number of people and companies actually work late, work through the night or have global user bases. Even if they do know this, a comment in the letter below shows that they seem to put less priority on issues occurring ‘outside of work hours’ in New Zealand.
Read the letter below, which we received AFTER the massive outage, with little communication during the event. In comparison, our other server provider had some issues lately and updated their status page regularly so customers knew what was happening.
The part that really brassed me off was the part that said:
“During more sociably acceptable hours…” they contacted help to source hardware. So thousands of people, companies and websites suffer whilst they have their breakfast waiting for the ‘shop’ to open.
In addition, their phones and support numbers weren’t accessible, the website was down and they weren’t being transparent or communicative on Twitter. Customers were mostly in the dark. At the very least they should have a blog or status website hosted separately from their facility to keep customers updated.
In contrast, the data center company we deal with in the US stores replacement hardware onsite for almost everything needed in case of emergency. They also have a separately hosted status website for updating customers.
Where was the risk analysis?
When designing the data center, did anyone ask the question “What happens if our firewalls fail?”
Take this as a lesson. Kiwi hosting providers and data centers really need to lift their game if they want global software companies to host here.
- Tip 1. Don’t go home at 5:30
- Tip 2. Keep customers informed
- Tip 3. Have a status/blog site independently hosted
- Tip 4. Use ‘people language’ not ‘Geekspeak’ (We’re not all techos)
- Tip 5. Keep replacement hardware/software onsite for key services
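For the techos reading, Tips 2 to 4 don’t take much. Here’s a rough Python sketch of the idea, assuming a small script that runs somewhere outside the data center, checks whether key services answer, and turns the result into a plain-language update for an independently hosted status page. The function names `probe` and `status_message` and the message wording are my own inventions for illustration, not anything a real provider runs.

```python
import socket
from datetime import datetime, timezone


def probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def status_message(results: dict[str, bool], now: datetime) -> str:
    """Turn per-service check results into a 'people language' status line."""
    stamp = now.strftime("%Y-%m-%d %H:%M UTC")
    down = sorted(name for name, ok in results.items() if not ok)
    if not down:
        return f"{stamp}: All services are working normally."
    return (
        f"{stamp}: We are having problems with {', '.join(down)}. "
        "Our team is working on it and we will post another update shortly."
    )
```

In practice you would build the results from real checks, something like `{"email": probe("mail.example.com", 25)}` (a placeholder hostname), and publish the message to a status site that does not share the facility’s network.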
That’s the rant for the day – time to get back to dealing with the email backlog.
PS: I’m not angry, just a little frustrated and being totally honest here about how NZ providers need to lift their game.
Following Tuesday night’s reported outage (21.30-24.00) which was attributed to a core switch intermission failure, last night the same symptoms occurred (commencing 19.30). Clearly this highlighted that the corrective action of the previous night i.e. the replacement of both core switches deferred the issue rather than provided a permanent resolution.
Last night the fault was again identified by our network management software and the team reassembled consisting of the CTO, Sys-Admins and management. The issue was immediately escalated to our external maintenance support teams (CheckPoint firewall provider and hardware provider) as is standard practice for an outage of this significance. This identified that the fault appeared to be within the Checkpoint firewall clustering software (dual redundancy).
With the assistance of Checkpoint engineers the decision was made to split the firewall cluster and run them as individual stand alone units to resurrect the network. This appeared to temporarily solve the issue at 00.15. For context the firewall servers are running at 15-20% whilst not clustered i.e. with very low levels of utilisation for the spec of the equipment.
At 02.45 the network failed again. The team were still on-site monitoring the network. Our firewall maintenance providers were again called who arranged for patches to be downloaded. At 05.10 the patches were installed and the firewall management server reconfigured to accommodate the patch upgrade. This did not provide a permanent fix.
During more sociably acceptable hours we reached out to our friends in XXXX to help source Checkpoint firewall hardware and to provide ‘men on the ground’ to help support our technical team that had worked through the night. In addition to this a decision was taken to move some core applications to the old network (ASA) that was still functioning as it was not reliant on the Checkpoint firewalls. These include XXXXXXX.co.nz, Email (inbound and outbound) and XXXXXXX.co.nz. However the core network was re-established without the need to deploy this second network with the core applications migrated.
Low level analysis with the assistance of Checkpoint engineers in the USA identified high volumes of fragmented packets originating from one of our shared virtual hosting servers to be the root cause of the issue. These packets were flooding the firewalls and causing the outage. The source of these packets was identified and blocked at 13:50. The checkpoint firewalls then returned to normal service which finally brought the network back on line at approximately 14:00 hours.
Like all hosting companies, we do not exercise strict control over the content that customers upload to their websites. It appears that one customer site was compromised, which in turn caused the flood of malformed packets to the firewalls. Our internal network analysis software did not identify these packets as they were not ‘standard’ TCP/IP traffic.
In order to prevent this level of disruption in future we intend to move all shared virtual hosting customers behind a separate firewall that is isolated from the rest of our networks. This will ensure that should there be any re-occurrence the offending server is quarantined, and does not cause the kind of outage we have just experienced.
We do sincerely apologise for this outage. These problems are extraordinarily difficult to diagnose, and we are grateful for the assistance provided by CheckPoint engineers in the USA, and local XXXXX network engineers who have complemented the efforts of our own technical team.
Should you require a more technical update, please contact XXXXXX our CTO or please contact me on my email or directly on my cell (xxx xxxx xxxx).
Once again our apologies for this critical issue and thank you for your continued support.
About The Author:
Julian Stone is the CEO of ProActive Software, developers and creators of the leading web-based project management software http://www.proworkflow.com
On a subject totally unrelated to I.T. work, every hard worker needs a relaxing hobby to keep them sane. RC model gliders are one of mine. I built quite a few RC gliders and planes as a kid, but thought I’d have another crack and see if I could still produce the goods!
This is the 3-channel RC glider I’m building. Getting there. Just a bit more structure and some electrical/mechanical work, then I can cover it! Looking forward to throwing it off a cliff after 6 months’ work!
Below you can see I’m starting to fit the electronics. It’ll be a squish in there but we’ll get it all in hopefully! Thinking of adding a small USB camera ;-)
This is the other plane. It’s a trainer and regularly gets smashed up ;-)
I know a huge number of web designers and creatives, and being an ex-creative myself I can totally relate to this vid. It’s a humorous look at what some web design clients would be like in real life if they took the same approach to paying for other things that they take to paying for design work ;-)
PS: Not all are like this – some are great, but most web design companies have at least a few clients like this. Enjoy!!
I saw a site advertising a new iPhone app to work with Basecamp’s project management tool. The website had some screenshots on the homepage, and one of the screenshot sections is called ‘Comments’.
But look at the sample comment! Haha! Eh? What?
I guess people should be a little more professional when adding product screenshots, and make sure the integrity of the data matches the integrity of the app! This just looks tacky and unprofessional.
(Spot the one man band!)
The comment says:
Look at your man, now back to me, now back to your man, now back to ME. Sadly, he isn’t me, but if he stopped using other Basecamp apps and switched to Headquarters, he could work like he’s me.
Where are you? You’re on a boat with the man you man could work like.
What’s in your hand? Back at me. I have it/ It’s two tickets to the thing you love. Look again. The tickets ARE NOW DIAMONDS.
Seriously, that’s a weird sample comment there…