In today’s business world, companies have become increasingly reliant on their IT systems for day-to-day functions. These systems serve as a link to their customer base and are vital to business operations. The primary impact of IT systems downtime is lost revenue, both present and future.
As a result, companies of all sizes are now addressing the need for disaster recovery plans. Downtime or any interruption in business operations may result from anything from a natural disaster to human error. No geographical area is safe and no amount of proactive hardware monitoring and user training can guarantee avoidance of potential downtime. In reality, the majority of business outages are due to something occurring within the organization and are statistically more disruptive to smaller businesses than larger corporations.
- Natural or man-made disaster – As previously noted, this may occur anywhere and would include earthquakes, tornadoes, hurricanes, floods, snowstorms, fires, and power failure. Companies in the mid-Atlantic region have all felt the effects of power outages due to snowstorms and hurricanes in recent years.
- Hardware Failure – Critical hardware failure may occur at any given time. Loss of data can cause irreparable damage to any business in the form of legal consequences or financial impact.
- Human Error – Whether it’s maintaining hardware improperly and causing an outage, or performing a series of keystrokes and accidentally deleting data, human error is a real point of consideration for disaster planning.
- Business Availability – Ultimately, this is the primary point for disaster recovery planning. Business unavailability results in the loss of revenue. Customers now have many options with increased competition across all lines of business. If your business is disrupted and you can’t answer the phone or your website is down, your prospective new customer may take their business elsewhere. The same or worse applies to existing customers. They may be less forgiving about an outage that affects their business.
While putting a disaster recovery plan in place may be an intimidating task, Trigon has the expertise and experience to guide you through it. We service the Philadelphia and Central PA area. Contact us for an assessment and we’ll help you implement an effective and valuable disaster recovery plan.
Here at Trigon, we offer two specific backup applications for our customers:
- Trigon Online Backup
- Trigon Replay
We distinguish between the two offerings because our customers do not have uniform backup requirements. Additionally, the term "Disaster Recovery" means different things to different organizations. The technologies we employ allow us to tailor solutions that meet the needs of nearly any size business.
Trigon Online Backup (TOB) is a file level backup application. Its primary objective is to be a cloud-based backup solution, which means there is no requirement for local storage or a local backup device. Data is backed up directly to secure data centers on the East and West Coast of the United States. Trigon Online Backup is VMware, Exchange and SQL aware. It has very granular retention and revision policies and also provides a centralized web portal for easy management. We have had a lot of success in deploying Trigon Online Backup in a variety of both small and medium sized environments. Additionally, TOB serves as a secure backup application for workstations and laptops.
Trigon Replay is a block level backup application, otherwise known as an image-based backup. Trigon Replay is a highly scalable solution for large environments and offers very impressive RPOs (Recovery Point Objectives) and RTOs (Recovery Time Objectives) in even the most demanding scenarios. The Trigon Replay Online component allows the local backups to be replicated to an offsite location and stored within the Trigon Continuity Cloud. The Trigon Continuity Cloud provides the ability for our customers to quickly bring up their servers in a virtual environment in the event of an extended outage or disaster at their main site. Trigon Replay is also cluster and application aware.
As you can see, Trigon has many adaptable solutions to meet your backup needs. Since no two scenarios are exactly alike, we will strategically assess your data recovery requirements and ensure that your business can restore its critical data if the need should arise. Contact us today to talk about your plans and how we can assist you and your organization.
In years past, business continuity and disaster recovery (DR) were not thought about the way they are today. They were always created and managed as two separate initiatives. Businesses now realize that this separation has caused more issues than it has resolved. With the ever growing dependency on technology and a competitive business landscape, most businesses could not survive a prolonged disruption of IT services.
An effective DR plan should be directly linked to the overall business continuity plan. There are a number of critical focus areas that should always be considered when writing a disaster plan:
- Identify key IT services – Define mission-critical services (which will be different for every company) and directly link them to business operations.
- Recovery Time Objective (RTO) – How long can the business afford to be without critical IT services?
- Recovery Point Objective (RPO) – How close to real time recovery is required? Can you lose a full day’s worth of data?
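To make the RTO and RPO questions concrete, here is a minimal sketch that checks a backup setup against those two targets. The service name, the target values, and the `ServiceTargets` and `meets_targets` names are all hypothetical and only for illustration. It assumes the worst-case data loss is roughly one full backup interval.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class ServiceTargets:
    """Recovery targets for one mission-critical service (illustrative values)."""
    name: str
    rto: timedelta  # longest tolerable time without the service
    rpo: timedelta  # largest tolerable window of lost data


def meets_targets(targets, backup_interval, estimated_restore_time):
    """Return (rpo_ok, rto_ok) for one service.

    Assumption: a failure just before the next backup loses about one
    full backup interval of data, so the interval must not exceed the RPO.
    The estimated restore time must not exceed the RTO.
    """
    rpo_ok = backup_interval <= targets.rpo
    rto_ok = estimated_restore_time <= targets.rto
    return rpo_ok, rto_ok


# Hypothetical example: email may be down 4 hours and lose 1 hour of data.
email = ServiceTargets("email", rto=timedelta(hours=4), rpo=timedelta(hours=1))

# A nightly backup with a 2 hour restore misses the RPO but meets the RTO.
print(meets_targets(email, timedelta(hours=24), timedelta(hours=2)))
```

In this made-up case a nightly backup is not frequent enough for a one hour RPO, which is exactly the kind of gap a written plan should surface before a disaster does.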
There are a large number of factors that should be considered when writing an effective DR plan. For example, third party vendor contacts, secondary data center information, remote access, recovery teams, when to initiate, and full DR testing are a few major considerations.
If you are not sure how to proceed or have questions, let Trigon guide you and your team to develop a plan that will allow you to sleep soundly during the next big storm.
Power outages, they happen and they never happen at a convenient time. What will you do when your business loses power? Will you sit and wait hoping that your server will come back up? Or do you have a plan to enact and secure your data, company, and future?
In my years in the Information Technology industry I have seen power outages caused by weather, accidents, water main breaks, and animals. It is the most common disaster event I have worked with. They will happen to you. Are you prepared?
What should you do to be prepared?
Figure out the steps you need to take to preserve your servers, network gear, and workstations.
- Have a plan. A full disaster plan may have a section for power outages that last days, such as getting a generator or relocating. This really isn't feasible for a small business. But having a checklist or plan for what to do during a power outage will help.
- Walk through the plan BEFORE the power goes out. Test it verbally. Check each step to make sure it makes sense and is feasible.
- Follow the plan during the next outage. Try to take notes on what really didn't work and what worked well.
How long can you be without power? What is the financial loss when a power outage lasts 15 minutes, an hour, or a day? Knowing this will help you determine how fast you need power back on and get buy-in for any investments in technology improvements to address this.
Have your power company’s phone number and your account number so you can call and report the outage and find out how long it will be. This will help in your decision process. Some power companies have websites reporting outage information. And yes, they sometimes are set up for smart phone interfaces.
Have an idea who needs to be alerted that the power went out and when to alert them. These are the owners, clients, employees, and service providers. If you rent your space, have the contact information for your landlord as well. Make sure to include cell phones if possible as they may be without power as well.
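A rough way to put numbers on that question is sketched below. The hourly revenue and idle staff cost are made-up figures, the `outage_cost` function is a hypothetical helper, and "a day" is assumed to mean an 8 hour business day. Plug in your own numbers.

```python
def outage_cost(minutes, revenue_per_hour, idle_staff_cost_per_hour=0.0):
    """Estimate the direct cost of an outage of the given length.

    Simplifying assumption: lost revenue and idle staff cost both scale
    linearly with the length of the outage.
    """
    hours = minutes / 60
    return hours * (revenue_per_hour + idle_staff_cost_per_hour)


# Hypothetical business: $500/hour in revenue, $200/hour in idle staff.
scenarios = [("15 minutes", 15), ("1 hour", 60), ("1 business day", 8 * 60)]
for label, minutes in scenarios:
    print(f"{label}: ${outage_cost(minutes, 500.0, 200.0):,.2f}")
```

Even a simple estimate like this gives you something to show the owners when you ask for a UPS or a generator.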
Have flashlights. Your smart phone flashlight will only last so long.
Have your plan handy so you don’t stumble around in the dark trying to find it.
Keep critical systems on an uninterruptible power supply (UPS). At the very least you should have your devices on surge protectors.
BACK UP YOUR DATA. You may lose the data being worked on that was not saved or backed up since the last backup job. But if you lose your server, or just a couple of hard drives, without a well thought out backup schedule, you will lose your company. Having a backup solution will help mitigate this risk.
Hopefully some of these ideas help prepare you for your next power outage. By being prepared, you are investing in your future.
- by David, "Don't Bring That Soda Into The Server Room", Quiram.
So you want to have your own Exchange server and host your own email. You are concerned about your Disaster Recovery plan. You want high availability for your email but can't afford the hardware and licensing to do so. Well, once again Google has come through for your needs! There is a Google service to fill in the gap for the high availability in case a disaster strikes your site or just your Exchange server. The service is called Google Message Continuity.
As quoted from Google:
“Google Message Continuity, powered by Postini, is a cloud-based email continuity solution for organizations running Microsoft Exchange email servers. By providing Gmail as an alternate, synchronized email system, Google Message Continuity increases email availability for your organization and helps users stay productive in the event of a disaster or outage of the Exchange server. By extending the reliability, security, and functionality of Google’s services to Exchange, Google Message Continuity allows you to:
• Develop a complete email continuity and disaster recovery solution for your organization
• Maintain constant email access for users round the clock, even if your Exchange server is not available
• Minimize the risk of data loss due to on-premise server failures
• Protect your email from spam, viruses, phishing, and other email-borne threats with built-in message security features”
That all sounds good, so let’s dig into some details.
Google Message Continuity requires the following Microsoft components:
- Microsoft Exchange Server 2003, 2007, or 2010 as the primary mail server. Either Standard or Enterprise editions.
- Each Windows client machine where you plan to install and run the Continuity Sync Server must have this minimum configuration:
- Microsoft Windows: Windows XP SP3, Windows Server 2003 or Windows 7.
- Microsoft Outlook 2003 with SP3 or later.
- It is recommended that you use the latest patches for both Microsoft Windows and Outlook.
- The Continuity Sync Server must be installed on a computer with a minimum disk space of 10MB per user and a minimum overall disk space of 5GB total.
- Do not install the Continuity Sync Server on the same machine that is running your Exchange Server.
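As a quick sizing aid, the disk requirement above can be sketched as a small calculation. This reads the requirement as "10MB per user, but never less than 5GB in total", which is my interpretation and not something Google spells out here. The function name is hypothetical and 1GB is taken as 1024MB.

```python
MB_PER_USER = 10          # from the stated requirement
MIN_TOTAL_MB = 5 * 1024   # 5GB floor, assuming 1GB = 1024MB


def required_disk_mb(user_count):
    """Return the minimum disk space in MB for the given number of users.

    Assumption: the larger of the per-user figure and the 5GB floor applies.
    """
    if user_count < 0:
        raise ValueError("user_count must not be negative")
    return max(user_count * MB_PER_USER, MIN_TOTAL_MB)


for users in (50, 500, 1000):
    print(f"{users} users: {required_disk_mb(users)} MB")
```

Under this reading the 5GB floor is what matters until you pass roughly 500 users.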
You will also need the Postini service from Google, either on its own or as part of the bundle.
All the requirements seem to be pretty reasonable. From my own experiences, most of the Exchange servers I see out in the wild all meet these requirements. And if you are running anything 2000, well...it is time to upgrade. The cost of the upgrade will be well invested even if there isn't a call to use this service.
The service is managed through a web interface and there are customizable alerts. During an outage in which the service is used, there is 25 GB worth of email storage available per account. Once your Exchange server is back up, the emails will sync from Google Message Continuity back to your server. Those of you who have used the Google Postini service have experienced this during a spooling event due to loss of server or connectivity to your site.
It isn't just your email that syncs up, you can sync up your calendar and contacts as well. The client that is installed on the desktop will keep the Outlook details sync'd up with the Gmail account used for failover.
Once your Exchange service is interrupted, the Gmail accounts will be accessible for the users. There is even mobile connectivity available for those out on the road!
The cost for this is very reasonable. You can provide the Postini filtering service and Message Continuity service for under $1000 for the average medium sized business.
So how does it work?
When you are using the Postini email filtering service, the email is routed through the Postini servers. Once the email is filtered, it is sent on to your Exchange server. The Message Continuity service replicates the filtered email and stores it on a Google Sync Server so it is available when the service is activated.
Google provides a great diagram:
As you can see, the flow of mail is replicated over to the Google Sync Server during the mail flow and the syncing between servers happens along this mail flow. The client side contacts and calendar events sync up through the client that is installed on the workstation or laptop.
For those already using the Google Postini service, this will add very little cost per user and give you access to high availability email services. For those looking at the Google Message Continuity as a Disaster Recovery tool, you get the added benefit of an excellent mail filtering service. Both are Win-Win scenarios.
I think this is such a good disaster recovery tool for the small and medium sized businesses. It is easy for someone with IT experience to set up and maintain. The email syncing with a Gmail account is such a natural step for the Postini service and is based on working technology. Having your email available to the users WHEN the disaster happens and DURING the disaster is awesome. One of the highest priorities in Disaster Recovery plans is communications. By leveraging the Google Message Continuity service you already have one of your recovery priorities secured even before you fully know the extent of the disaster!!! The affordability of the service brings it to the small and medium business, allowing them to be more competitive and resilient in today's troubled markets. If you'd like to know more about Google Apps, be sure to talk to our friends over at Mosaic. We'd be more than happy to help set up your hosted Exchange services here at Trigon.
- by David, "Q-Tips", Quiram
Here are some non-technical common sense things to avoid doing during a full blown disaster recovery, with or without an established plan. Ten things to not do that aren't always covered in plans or even discussions of disaster recovery.
- Panic. No really, don't panic. Things will work out or not. Panicking will just get the client nervous, your staff nervous, and you unfocused.
- In the words of Colin Powell - "Get mad, and then get over it." Don't get angry or if you are, vent and get over it. The emotions will limit your outlook and lead you to bad choices.
- Don't deviate from the plan. Stick to the plan. If you don't have a plan, make one up before you start doing anything. Know the next step you’re taking before you finish the step you’re at.
- Do not wing it….this leads to the bad place. Document your steps you have taken and notes on what you plan to do next. You WILL be interrupted and your train of thought will be broken and you WILL lose that one thought that takes you to the next step. Notes will give you the tool to get focused on the process you were working on.
- Do not work all night. Having you or your staff exhausted will only hinder the process. Take breaks, and space out staffing. Don’t get wrapped up in the drama of the moment. Remember, you are the calm one steering the company through the storm.
- Don't go quiet. Communicate to the stakeholders involved. They want to know what is going on. Control the communication flow so you can control the interruptions to you and your staff. Communicate to your staff. This will calm nerves and focus the process.
- Don't consume beverages with high amounts of caffeine and sugar, the so called energy drinks. As much as the iconic IT staffer is tied to Red Bull, Monster, and Mountain Dew, this will just hinder the process. These will burn you out quicker and break your focus.
- Don't consume alcohol. This should be a no brainer.
- Don't do it all yourself. Delegate to your staff. If you are stuck reach out to others for assistance. There are support contracts in place just for this, use them. You have associates in the IT field you can tap for a sounding board, use them!
- Don't forget to follow up after the recovery is done. The recovery process will have shown shortfalls in the plan, if there was one. It is essential to have a de-brief of the event to understand what happened, how it can be prevented, and how to make the process better. You just went through all that pain, learn from it.
- by David, "Gingerbread", Quiram
On Monday there was an issue with the Google mail system that affected 40,000 users (though there are some reports of up to 500,000 from non-Google sources). The users could not access their email and contacts during the outage. At first it was suspected the data was lost, but Google later reported that was not the case; the data was inaccessible due to a storage software update issue. The data access was restored within 24 hours.
I made the recommendation in a blog back in September of using Gmail as a tool to make your business more resilient to a localized disaster (i.e. loss of building, vandalism, theft) and keeping your email accessible during the recovery. I still recommend it. In fact, in light of how the issue was handled by Google, I recommend it even more.
Google addressed the issue quickly and fixed it. The storage software update was halted as soon as the forums were lit up with questions about why users "lost" their emails and contacts. Think about it, the forums got the news of issues. Then that information was communicated through the company to the Engineers who identified the problem and stopped it from affecting other storage sites. It isn't like these people all work in the same room. Google is huge, with well over 20,000 employees. The restore portion of the event is what took the 24 hours.
That being said I am sure that there are those who are not convinced and see this outage as just another in a line of outages from 2009, two years ago. Yeah, there were several outages in 2009 and one this March. Things happen. No computer system is fail-safe. Nothing is, things fail. It is having a DR plan and protocols to follow during outages and disasters that make up for it.
Let’s look at the pros and cons from this most recent outage:
Cons:
- You are relying on someone else to run your email accounts.
(Well, you would be anyway; someone has to manage it, be it on-site staff or a managed service company. Having Google do it is free, and they will maintain their technology and provide the staff for it.)
- Google is not immune to outages. Frankly, no one is. Even an uptime of 99.99999% still leaves some outages in there, so there will be downtime.
- The responsibility of communication is on Google's side. You can contact them about issues, but there is no recourse of action if they do not communicate back.
Pros:
- Google has shown that they are committed to making the Gmail system work and providing the service. They will get it fixed and they have the right people to do it. This particular issue will not happen again. Ever.
- Google responded to the issue quickly and stopped it from affecting other storage systems. Less than 1% of the total Gmail users were affected.
- Google still has the redundancy of hot sites and physical backup, services which are out of most businesses' price range. This service usage is free!
- Comparatively, the amount of downtime that businesses experience with their own internal email is much longer than the 24 hours experienced by Gmail users.
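To see what an uptime figure really allows, the arithmetic can be sketched as below. The function name is hypothetical and a year is taken as 365 days. This only illustrates what a percentage means and is not a statement about any uptime commitment from Google.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assuming a 365 day year


def allowed_downtime_seconds(uptime_percent):
    """Return the seconds of downtime per year that an uptime percentage permits."""
    return SECONDS_PER_YEAR * (1 - uptime_percent / 100)


for uptime in (99.9, 99.99, 99.99999):
    seconds = allowed_downtime_seconds(uptime)
    print(f"{uptime}% uptime allows about {seconds:,.1f} seconds of downtime a year")
```

At 99.9% that works out to a little under nine hours a year, while 99.99999% leaves only about three seconds. No percentage short of 100 means zero downtime.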
Overall, using Gmail is not a larger risk after the latest outage. There are the same risks as before that are inherent in using cloud resources. Google has shown that they have the capability and drive to recover from outages that just cannot be matched by smaller businesses, and it is free. If you'd like to hear more about specific Disaster Recovery plans, please contact Trigon at your earliest convenience! If you'd like to hear more about Google Apps, contact our friends at Mosaic.
- by David, "And Don't Call Me Shirley", Quiram
Have you documented your disaster recovery plan? I am not talking about listing some tasks you want to do or where you kept your backup CD, but a fully planned out process to get the IT portion of your company running again in a timely and efficient manner.
Most companies have not and some will not. Having gone through a disaster for a company which did not have a disaster recovery plan, I have seen how having a process laid out makes the recovery less stressful for you, the owners, and the staff.
Having the plan figured out before a crisis gives you several advantages:
It gives a level of confidence in the systems, your ability, and the future in case something happens.
It begins the exercise of how would I do this before things go wrong, as opposed to how do I do this with things going wrong.
It gives you a framework to work with. No static plan will ever fully cover a dynamic event. The saying is the first casualty of any battle is the plan. But having steps mapped out, priorities determined, resources identified, and an understanding of what needs to happen will allow you to be more maneuverable to the changes in the crisis.
By documenting your plan or process for recovering your IT resources you will be showing the owners that you are thinking of the future. It isn't doom and gloom thinking; the reality is that things go wrong, things will break, and it is never convenient. Plan ahead to reduce the issues and risks you will encounter. If you have a well thought out and tested plan in place, you will be able to support the company’s Business Continuity plan. Remember the BC plan is different from the DR plan. The DR plan is there to support the BC plan and should be integrated into it. The DR plan provides the tools for the business to move forward after a disaster, but we need to make sure we have the tools ready and available.
I have written up several disaster recovery plans for companies that range from complicated multi server/multi-site to single server at someone's home office. Each time the planning and documentation process has proven to be invaluable. It is a time for discovery, identifying issues that would hamper the recovery process, and gaining a better understanding of what needs to happen when. The plan will also get the owner's buy-in to the process beforehand and set up realistic expectations BEFORE the disaster happens. This is very important. If you can show that the recovery effort will take 24 hours for the core infrastructure to be re-established, it will give you the time and room to do your job without the owner and management worrying what you are doing.
The plan provides the framework to address the crisis. No matter how well I have written a plan, there are unexpected events that occur which can derail the plan. Even in testing things happen: replacement servers fail, key equipment doesn’t show up, staff members are sick, network devices' power supplies fail. All these have happened just in testing. I welcome these issues in testing because they add another stress test to the plan and the staff working with the plan. By identifying the priorities in the recovery, you can react and re-task resources based on that. The advantage of having an established priority in recovery is that if you are questioned on the reasoning, you have a document which was agreed upon beforehand by management and owners that you are following.
The planning of a DR is only the first step in the whole process. But it is a step that should not be taken lightly. Understand that once you start you are going to find issues within the systems you are backing up, and in the thinking and views of management, owners, and staff. By getting all of this addressed and the plan laid out ahead of time, you will save time, reduce frustration, reduce risk, and increase success in the event of a disaster.
Make sure to take the next step and test the plan through a walk through and then an actual recovery with a test machine to simulate the disaster. This will all be time and money well spent when the disaster occurs. If you'd like to have Trigon review or create a disaster recovery plan for your business, be sure to contact us.
Frankly, keeping up with growth has presented more work than our small team was prepared for, with traffic now climbing more than 500M pageviews each month. But we are determined and focused on bringing our infrastructure well ahead of capacity as quickly as possible. We’ve nearly quadrupled our engineering team this month alone, and continue to distribute and enhance our architecture to be more resilient to failures like today’s.
We can’t apologize enough, nor can we thank you enough for putting up with these growing pains. We know how impossibly frustrating it is to see your work offline. But please always know that we truly care about your work as much as you do, and we have an incredibly capable team working incredibly hard to take good care of it.
You could smell the internet tears lasting nearly 24 hours yesterday. If Twitter is what most consider a social forum for posting small 140 character messages of your breakfast, then Tumblr is the forum that lets you write an essay on that breakfast.
Tumblr is an easy to use blogging platform that lets you set up a website in seconds. You can now write your little heart out about cute adorable kittens, or funny pictures of Kim Jong-il looking at things. Sadly, things came to an abrupt end thanks to an error in Tumblr's cluster of servers. The entire network of blogs was down for 24 hours. Anger and hysteria ensued. Some even wrote open letters.
But, what are we owed from a free service? Anything?
Tumblr is a free blogging service and super simple to use. Do we have a right to get angry if we host a site with them and things are down for an entire day? That could be business lost if your goal is to make monies from your page. On the other end of the spectrum, if you're just posting pictures of Kim Jong-il, I'm sure you wouldn't really care that much.
Do hipsters have a right to get upset with Tumblr's lackadaisical communication during the outage, or should they just get along with their lives and possibly shower?
Perhaps Tumblr should have contacted Trigon about the virtualization of their servers. We're able to work with our clients to minimize any downtime, whether it be an off-site disaster recovery center or bringing sandwiches during a server rebuild.
- by David, "The Conscience", Quiram.
So the buzz word these days is cloud computing. You can use it to free your staff from their desks, you can use it to access applications and data anywhere for a reasonable cost, you can use it to save your company. All these are true according to ads and blogs (including one of my own). But can it really? What do we know about the cloud industry? Is the cloud ready for true disaster recovery?
The cloud has been around since the internet, as it is the internet. It’s like the way we name oceans, they are all one big ocean connected, but we segment them for our own understanding. So it isn't some new technology from on high. It is just a new way of using what is there.
While there are opportunities to improve your processes, budgets, and disaster recovery using this technology, the truth of it is that you are just trading the risks you had onsite to the risks inherent in the cloud computing and resources. There are risks that most people don't understand or are even aware of.
No one really questions if the internet could ever fail as a whole, if it did, where would the cloud computing be then? Where is the disaster recovery you were depending on? Where is that spreadsheet you needed for the meeting? You don't know. I don't mean to sound like an alarmist, I like the resources and opportunities of cloud computing, but I am also very aware of its vulnerabilities and the risks I will take using it.
The companies that offer cloud based resources are all working in the wild west. There isn't much regulation to maintain the integrity of the industry. There are major players, but they are feeling out the business as much as the smaller companies; they just have the advantage of being able to make more mistakes due to their budget. The technology is constantly improving and changing, and the needs of the users are still being discovered. There are niches to be filled and plenty of opportunities to grow and create great offers and services. But the business practices are still immature and there really isn’t a standard to dictate best practices. It is the Cambrian explosion for cloud computing: anything goes, and what does not work will die out.
The cloud options are attractive, but it definitely is a ‘buyer beware’ market. Unless you stipulate it in a contract, you will not control where your data rests. There will be third party employees who will have access to the data. Again, contracts are what are going to limit that liability. But what if the company goes under or is bought out? You are at risk of data loss when there wasn't even a site loss.
Like all decisions that involve the disaster recovery of your company's data and assets, you need to weigh the risks, all of them. Do the research; look for the words behind the ads and buzz speak. You are the first step to a successful disaster recovery and you choose the tools you use….choose wisely.
Don't get caught making hasty decisions. If you need some help, be sure to let us know and we can help you along your cloud computing route. Jenkintown or Doylestown; we can help!