Archive for the 'Risk Management' Category

The Bloody Project – Another Lesson From the Course

I’m alright and nobody worry ’bout me

Why you got to gimme a fight, can’t you just let me be?

These are the unmistakable opening lines of the theme song to Caddyshack.  As a fan of slapstick comedy and of the game of golf, I have to rate it as one of the classic movies of the ’80s.

One of the  benefits of my recent career changes was to have a little more flexibility in my schedule; a flexibility that would allow me to spend more time with my family.  Last week, I had the chance to exercise that flexibility and booked a round of golf with my oldest daughter and one of her friends.  With my less than spectacular golf skills, I highly suspected there could be a slapstick moment on the course.

It was a typical winter day in Central Texas – sunny, not hot but not cold, not windy – a day my friends in northern climates couldn’t even imagine exists in late January.  After working from the world headquarters of Nice Socks Consulting for the morning, I headed to our home course at Avery Ranch Golf to meet them when they got out of school that afternoon.  I was excited to spend some quality time with her before she heads off to a college yet to be determined later this year.

The course was not busy so we were excited about enjoying a casual round without anyone pressing on us.  As we teed up on the first hole, little did I know that our round would be far from casual.  My daughter’s drive pushed a little right of the fairway, ending up on a slight slope with a small outcropping of limestone just to the front and right of where her ball came to rest. She was about 120 yards from the green and confident she could be on the green in regulation. Unfortunately, the second shot did not go as planned.  Her ball hit the rock outcropping (yes, she let the club face open up) and bounced directly back, striking her in the head.

At first I was not sure what had just happened.  I was watching for the flight of the ball and when I did not see it in the air, I turned around to see her kneeling on the ground.  She had her hand on her forehead and when she moved her hand, I saw the blood.  Lots of blood.  I ran to my cart and grabbed a golf towel to apply pressure and slow down the bleeding. I won’t go into the gory details of the next few hours.  However, I will let you know that after seven stitches expertly applied by a plastic surgeon, she was all good.  No concussion. No life-altering injury.  Just a nasty wound that will heal and hopefully leave nothing but a faint scar.

As I am apt to do, once I knew for certain that this incident was not going to result in long-lasting impact on my daughter’s health and well-being, I started to think about what I could learn from this life event.  At first my mind went to thinking about being prepared for the unexpected. However, the more I thought about it, the more I began to see the potential for a project management lesson to come out of this unfortunate event.  This angle is probably due to the fact that my first consulting engagement since going out on my own is focused on driving a significant solution platform rationalization project.

Most projects start off with a well-thought-out plan with well-defined milestones and details on the steps required to meet those milestones.  The approach to a round of golf is similar.  You know the par on each hole and know in general where you need each shot to go in order to meet or beat par.  But we all know that not everything goes according to plan, on the course or in the office.  Therefore, you have to be able to adjust as the round unfolds; you have to manage the round, just like you have to manage a project.

In the case of my daughter, she had planned for her tee shot to go up the right side of the fairway and land 100-110 yards from the middle of the green. She then planned to hit a nice easy approach shot into the green where she would do no worse than two putt and make par or better.  Instead her tee shot went a little further right than expected and landed in the rough, on uneven ground, near an outcropping of rock, about 10 yards shorter than expected.  Her second shot then proceeded to hit the rock outcropping and end her round prematurely after two strokes.

When assessing what to do after that first shot, she had five options. One option was to play the ball as it lay and go for the green to get back on plan. The second option was to chip out onto the fairway, giving up distance to gain a much better position for her next shot.  The other options (per Rule 28 of the Rules of Golf) involved declaring the ball unplayable and 1) going back to the point of her first shot and hitting again under penalty of stroke and distance per Rule 27-1; 2) taking a one-stroke penalty and dropping a ball behind the point where the ball lay; or 3) taking a one-stroke penalty and dropping a ball within two club-lengths of where the ball lay, but not nearer the hole.

The execution against her project plan for Hole #1 was off-track after her drive.  In this case, she decided to try to get back on plan with one swift action rather than incur an additional stroke. I had seen her make similar shots from similar positions on that very hole before, so in the moment I did not suggest she do otherwise. Sadly, that swift action ended the round and resulted in a trip to the ER.

In hindsight, the safer, more practical play would have been to give up distance and punch it into the fairway to set up her third shot, or perhaps to declare the ball unplayable and take a drop with a penalty stroke.  In either scenario, she would likely have had a ball on the green lying three with a chance to sink a putt for par or, at worst, bogey.

She made a decision to go for the green rather than take the less risky option of accepting an extra stroke on the hole. While the reward for going for the green was large, so were the risks. They say hindsight is 20/20, and in this case I can’t help but second-guess not suggesting she choose another option.

Those same decision points haunt project managers.  No matter how well-managed a project is, issues usually arise that could potentially knock it “off plan.”  Many of those issues are minor and can be identified and addressed quickly without introducing more risk into the project.  But at times the issues appear abruptly and are significant, and they can only be solved in one of two ways: take a bold, risky action that could get your project back on plan in one swift stroke but could also incur further negative impact (i.e. your project ends up in the ER); or take a less risky action that has some short-term negative impact (i.e. you take another stroke on the hole) but sets your project up for long-term success.

In the early days of my career I was usually inclined to “go for the green” when faced with one of those decisions as a manager.  But as I gained experience, I learned that sometimes taking the penalty stroke or just punching out to the fairway is the better course of action.  As a project manager (or any kind of manager for that matter) you have to assess the risks presented to you and make a decision that gives you the best chance of achieving the ultimate objective of the project.  Sometimes that means going for the green and other times it means taking a penalty stroke.  The main thing is to keep yourself and your project out of the ER.

Fore!


Flashback: No Slam Dunks In IT

I was looking back through some of my early, circa 2012, musings and came across this gem. I can happily say that I survived my years of managing data centers without ever having to declare a disaster.  However, even with a constant focus on change management processes, I did see my fair share of self-inflicted outages.

I long ago learned that humans are fallible and that all the procedures in the world can’t prevent every mistake.  However, I still believe that following a structured change management process is critical to running a successful IT Operations function and that the key to a good change management process is communication.

While I am currently taking a break from being responsible for IT Operations, if I ever find myself back in that role, I will for sure subscribe to my Plan –> Communicate –> Execute –> Test –> Communicate framework.

Here are my original thoughts from 2012:

“There are No Slam Dunks in IT.”

That’s a saying I have thrown around for close to 10 years now. But one that I think too many people in technology fail to remember on a daily basis. They get caught up in the urgency of the moment, shortcut change management procedures, and fail to think about the downstream impact of what they see as a minor, isolated change. All too often the mindset of “the easy change,” “the lay-up,” or “the routine lazy fly ball” ends up as an unexpected outage. That breakaway slam dunk clanks off the rim and bounces out of bounds. That easy two points turns into a turnover.

As we kicked off 2012, a network engineer who was relatively new to the company noticed that a top-of-rack server switch had two fiber uplinks but only one was active. Anxious to make a good impression, he wanted to resolve that issue. It was an admirable thing to do. He was taking initiative to make things better. So one night during the first week of the fresh new year, he executed a change to bring up the second uplink. Things did not go well: the change – and I will not go into the gory technical details – brought down the entire data center network. It was after standard business hours – whatever that means in today’s 24×7 business world – but the impact of that 10-minute outage was significant. A classic case of a self-inflicted wound from not following good change management procedures.

It was actually a frustrating incident for me because, as we put together the 2012 Business Plan for Corporate Technology Services, we were asked to list the keys to success for our operations and the actions we needed to take to achieve success.

THE #1 key for success listed was: Avoid self-inflicted outages and issues that take away cycles from the planned efforts and cause unplanned unavailability of our client-facing solutions.

So 30 days prior I had told our CEO, CFO and the rest of the executive management team that our #1 key to success in IT was to avoid such things, yet here I was four days into the new year staring at the carnage of a self-inflicted outage.

Outages are close to a given in the world of technology. Servers will crash, switches will randomly reboot, hard drives will fail, applications will act weird, redundancy will fail, and there will be maintenance efforts that we know will cause outages. Given that, every IT organization must take steps to not be the cause of even more outages. Business leaders know that there will be some level of downtime with technology – have you ever seen a 100% SLA? Rarely. It is usually some 99.xx% number. But outages that are caused by the very people charged with keeping things running drive them nuts, and rightfully so.
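To put those 99.xx% numbers in perspective, here is a quick back-of-the-envelope calculation – a minimal Python sketch, with purely illustrative SLA values – showing how much downtime a given availability target actually leaves room for in a year:

```python
# Back-of-the-envelope downtime budget implied by an availability SLA.
# The SLA percentages below are illustrative, not from any specific contract.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def allowed_downtime_hours(sla_percent: float) -> float:
    """Hours of downtime per year permitted by a given availability SLA."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% availability allows about {allowed_downtime_hours(sla):.1f} hours of downtime per year")
```

At 99.9%, that works out to roughly 8.8 hours a year – which is exactly why an avoidable 10-minute outage burns through a budget nobody wants to spend.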

The morning after that self-inflicted wound, I communicated out the following to every member of the IT organization:

We need to strive to make sure that we are not the cause of any unexpected outages. We must exercise good change management process and follow the five actions listed below. As our solutions and the underlying infrastructure become increasingly intertwined, we must make an extra effort to assess the potential unintended downstream (or upstream) impact as we plan the change.

When making a change we must always follow these steps:

Plan – make sure each change action/project we undertake is well thought out, steps are documented, risks are assessed. If disruption in service is expected, plan the timing of the change to limit the impact of the disruption.

Communicate – communicate each change action/project to the parties potentially impacted prior to executing the change

Execute – flawlessly execute according to the plan developed

Test – test to make sure that the change executed resulted in the expected results and there are no unintended consequences from the change

Communicate – communicate to the potentially impacted parties that the change has been completed and tested
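For teams that like to make the process explicit, those five steps can be attached to every change record as an ordered checklist. The sketch below is one hypothetical way to model it in Python – the ChangeRecord class and its field names are my own illustration, not part of any particular ticketing system:

```python
from dataclasses import dataclass, field

# The five steps of the framework, in the order they must occur.
STEPS = ["Plan", "Communicate (before)", "Execute", "Test", "Communicate (after)"]

@dataclass
class ChangeRecord:
    """Hypothetical change ticket that walks through the five-step framework."""
    description: str
    impacted_parties: list[str]
    completed_steps: list[str] = field(default_factory=list)

    def complete_step(self, step: str) -> None:
        # Enforce the order: only the next step in the sequence may be completed.
        expected = STEPS[len(self.completed_steps)]
        if step != expected:
            raise ValueError(f"Expected step '{expected}', not '{step}'")
        self.completed_steps.append(step)

    @property
    def done(self) -> bool:
        return self.completed_steps == STEPS

# Example: the uplink change, had it gone through the framework.
change = ChangeRecord(
    description="Bring up second fiber uplink on top-of-rack switch",
    impacted_parties=["Network Operations", "Application Support"],
)
for step in STEPS:
    change.complete_step(step)
print(change.done)  # True
```

The point is less about the code and more about the order: the framework fails the moment Execute happens before the first Communicate.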

To keep this goal of avoiding self-inflicted outages top of mind, we implemented an “It’s Been X Days Since Our Last Self-Inflicted Outage” counter – basically taking a page out of the factory accident prevention playbook.
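The counter itself is trivial to produce – here is a hypothetical few-line version, assuming you record the date of the last such outage somewhere (the date below simply matches the January 2012 incident described above):

```python
from datetime import date

# Hypothetical: the date of the most recent self-inflicted outage.
last_self_inflicted_outage = date(2012, 1, 4)

days_since = (date.today() - last_self_inflicted_outage).days
print(f"It's Been {days_since} Days Since Our Last Self-Inflicted Outage")
```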

Hey! You! Should You Be on That Cloud?

I said, Hey! You! Get off of my cloud

Hey! You! Get off of my cloud

Hey! You! Get off of my cloud

Don’t hang around ’cause two’s a crowd on my cloud, baby

 I am fairly certain that Mick and the boys were not singing about today’s clouds – computing clouds that is.  I guess they could have been advocating private clouds, but I highly doubt it.   However, if any of the Stones were current day CIOs, they might be singing “Hey, you! Should you be on THAT cloud?”

It seems these days that everybody in business has a cloud, is on a cloud or wants to be on a cloud.  It feels like every tech article I read, every discussion I have with my technology partners, every request I get from my internal business partners includes the cloud word or one of the “X” as a service phrases – where “X” = software, platform, infrastructure. In fact I just read about a new company that has created a SaaS offering for farmers – not exactly an industry synonymous with leading edge technology.

Now don’t get me wrong, I love this cloud stuff.  I have even gone on record as saying I would love to one day be the CIO of a company that has no data centers and owns no servers.  That day is not today – but it’s getting closer every day.

There is still much to be sorted out with these clouds.  While I meet with company after company that wants to talk about selling me a cloud, not one of them wants to talk about how to manage them.  How do you best evaluate the risks associated with cloud computing?  How do you keep non-IT parts of the business from jumping on every cloud that drifts by them without fully thinking through all the ramifications of doing so? Without strong governance you have cloud chaos.  In my part of the world, out-of-control clouds are called thunderstorms.  While thunderstorms can bring much needed rain, they can also cause a large amount of unpredictable damage.

To avoid the thunderstorms of cloud computing, companies must implement comprehensive cloud management procedures throughout the organization.  The non-IT parts of the business need to include IT in the evaluation, selection and implementation of all cloud-based services.  At the same time, IT cannot be the “department of NO” in an effort to keep everything within the four walls of the corporate data center.  In addition, any ventures into the cloud need to involve those dreaded people in the legal/contracts departments and whatever role(s) within the company are responsible for risk management functions.

An easy place to start with cloud governance is to create and publish a simple “X as a Service” risk assessment form.  The form asks 15-20 basic questions about the proposed cloud offering.  The group within the company that is driving the effort completes the form and submits it to the InfoSec or similar group within the company. The information about the proposed service is reviewed and, where needed, additional information can be requested.  Where needed, the InfoSec team can engage other functions – legal, privacy, finance, etc. – within the company to obtain specific feedback.  It’s not an overly elaborate process, but it gets information about the use of cloud computing flowing through the organization and raises awareness about potential risks associated with such services.  Once risks are identified, efforts can be undertaken to address those risks and begin enjoying the benefits of the cloud.
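There is nothing magic about the form itself. Below is a stripped-down sketch of what the intake might look like – the questions and the Python class are purely illustrative, not a complete InfoSec checklist:

```python
from dataclasses import dataclass

# A few illustrative questions from a hypothetical "X as a Service"
# risk assessment form; a real form would run 15-20 questions.
QUESTIONS = [
    "What business data will be stored or processed by the service?",
    "Does the data include personal or regulated information?",
    "Where are the vendor's data centers located?",
    "How is data encrypted in transit and at rest?",
    "What availability SLA does the vendor commit to?",
    "How do we get our data back if we terminate the contract?",
]

@dataclass
class CloudServiceAssessment:
    """Hypothetical intake record for a proposed 'X as a Service' offering."""
    service_name: str
    sponsoring_group: str
    answers: dict[str, str]  # question -> answer supplied by the sponsoring group

    def missing_answers(self) -> list[str]:
        """Questions still unanswered before the submission goes to InfoSec."""
        return [q for q in QUESTIONS if not self.answers.get(q, "").strip()]

    def ready_for_review(self) -> bool:
        return not self.missing_answers()
```

The value is not in the code; it is in forcing the conversation to happen before the contract is signed rather than after.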

Not all clouds are created equal – so make sure you choose wisely.  If after looking at a cloud offering, there is a level of doubt about it, stop and ask yourself, “what would Mick do?”