Availability Management: When Are You Free?

When infrastructure and applications are more available, the business benefits directly. When IT systems and networks are down or inaccessible, business comes to a standstill. Yet even with the best-designed and best-maintained infrastructure, failures such as memory corruption or a disk system crash are unavoidable. Of course, if you take a course like this one, you will be able to administer better-functioning databases.

Your availability management system needs to anticipate and identify failure events before they happen. Once an availability issue is identified, administrators should be notified so that they can start triage immediately and resolve the problem before more end users are affected. In other words, the fundamental objective of availability management is to make sure that all IT services function correctly and are available when users and customers want to use them, within the framework of the SLA (Service Level Agreement). It is the process of organizing IT assets so that the people who need them have continuous access to them.

Why Implement Availability Management?

Implementing availability management processes brings several benefits:

  1. Potential service availability issues are identified and corrected before they affect services negatively.
  2. Services are provisioned on infrastructure that matches their availability needs. This avoids the unnecessary cost of placing services that can tolerate longer recovery times on more expensive high-availability platforms.
  3. Services are available for use during the time frames specified in the SLA.

Effective availability management starts with understanding the nature of risk. Various occurrences can negatively affect data, systems or sites and reduce the availability that end users experience. These risks are referred to as events. The more vulnerable a system is to risk events, the greater the potential for reduced availability or extended outages, and consequently for lost business productivity.

Responsibilities of Availability Management

In an ideal world, every user would have immediate access to IT assets regardless of how many users were trying to access them. This is not always the case, however, so availability management tries to make the best use of an organization’s assets.

Availability management is one of five service delivery components. Its aim is to define, analyze, plan, measure and improve all aspects of the availability of IT services. It is responsible for ensuring that all IT infrastructure, processes, tools and roles are appropriate for the agreed availability targets.

Availability management addresses the ability of an IT component to perform at agreed levels over time. It allows organizations to sustain IT service availability in support of the business at a justifiable cost. Its higher-level activities are to realize availability requirements, compile an availability plan, monitor availability and monitor maintenance obligations.

The availability management team reviews the availability requirements of business processes and ensures that the most cost-effective contingency plans are put in place and tested regularly, so that the needs of the business are met.

For example, an online application that supports ordering systems might have a recovery requirement of half an hour or less, so it might be provisioned on infrastructure components that provide many levels of redundancy. A less critical, non-customer-facing application that is used by fewer users in small offices and has a five-day recovery period may be provisioned on less expensive infrastructure with limited redundancy. Here is a course entitled Apple OS X Mavericks Server Training- A Guide that will help you set up and use this particular type of server.
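
The provisioning decision in the example above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the tier names and the one-day middle threshold are hypothetical, and the half-hour and five-day values are taken from the example rather than from any standard.

```python
from datetime import timedelta

# Hypothetical tiers, ordered from most redundant (most expensive) to least.
# Each entry is (recovery time the tier is built to achieve, tier name).
TIERS = [
    (timedelta(minutes=30), "high-availability"),
    (timedelta(days=1), "standard-redundancy"),
    (timedelta(days=5), "limited-redundancy"),
]


def choose_tier(required_recovery: timedelta) -> str:
    """Return the cheapest tier that can still recover within the requirement."""
    # Walk from the cheapest tier upwards; a tier qualifies when the recovery
    # time it achieves is no longer than the time the service can tolerate.
    for achievable, name in reversed(TIERS):
        if achievable <= required_recovery:
            return name
    # The requirement is stricter than any tier; use the most redundant one.
    return TIERS[0][1]


ordering_app = choose_tier(timedelta(minutes=30))
small_office_app = choose_tier(timedelta(days=5))
```

Picking the cheapest tier that still meets the recovery requirement is what keeps a five-day service off the expensive high-availability platform.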

Specific tasks performed by the availability management team include:

  • Measuring results and making adjustments, for example to wait times after requests
  • Planning and implementing procedures to improve resource sharing and IT infrastructure
  • Defining and analyzing IT requirements
  • Establishing redundant, high-availability systems to support mission-critical applications
  • Ensuring that proper contingency plans are in place and tested
  • Cataloging business requirements
  • Reviewing business requirements for the availability of business systems
  • Determining the cause of availability failures
  • Ensuring that service availability meets SLAs
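
The last task in this list, checking service availability against the SLA, is commonly done by expressing availability as the share of agreed service time during which the service was actually up. The sketch below assumes that convention; the function names and the sample figures are hypothetical.

```python
def availability_percent(agreed_service_hours: float, downtime_hours: float) -> float:
    """Availability as a percentage of the agreed service time."""
    if agreed_service_hours <= 0:
        raise ValueError("agreed service time must be positive")
    if not 0 <= downtime_hours <= agreed_service_hours:
        raise ValueError("downtime must lie between 0 and the agreed service time")
    uptime_hours = agreed_service_hours - downtime_hours
    return uptime_hours / agreed_service_hours * 100.0


def meets_sla(measured_percent: float, target_percent: float) -> bool:
    """True when measured availability reaches the SLA target."""
    return measured_percent >= target_percent


# Hypothetical month: 720 agreed service hours, 3.6 hours of downtime,
# checked against an assumed SLA target of 99.9 percent.
measured = availability_percent(720, 3.6)
within_sla = meets_sla(measured, 99.9)
```

Only downtime inside the agreed service hours should be counted here, since the SLA defines when the service is expected to be available.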

It is the availability management team’s responsibility to supervise compliance with the Underpinning Contracts (UCs) and Operational Level Agreements (OLAs) agreed with external and internal service providers. Availability management is also responsible for proposing improvements to IT services and infrastructure with a view to increasing availability levels.

Usually, availability management focuses first on maintaining access for the systems and users whose work is critical to the business, and then looks at providing ‘good enough’ access to less critical systems and users. Here is a course entitled Core Solutions of Microsoft SharePoint Server that gets you server-certified.

Key Indicators: Is Your Availability Management Working?

The key indicators on which the process of availability management rests include:

  • Security: A service may have data associated with it. Security refers to the confidentiality, integrity and availability of that data. A clear overview of the system’s end-to-end availability is also given.
  • Resilience: The ability to keep services reliable and free from operational failure. Redundancy is one popular method of achieving resilience.
  • Serviceability: An external supplier’s ability to maintain the availability of a component or function under a third-party contract.
  • Maintainability: An IT component’s ability to remain in, or be restored to, an operational state.
  • Reliability: An IT component’s ability to perform at an agreed level under described conditions.
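
Reliability and maintainability are often quantified as mean time between failures (MTBF) and mean time to restore (MTTR), although the text above does not prescribe these measures. The sketch below assumes that convention and uses the common steady-state relation availability = MTBF / (MTBF + MTTR); the function names and sample figures are hypothetical.

```python
def mean_time_between_failures(uptime_hours: float, failures: int) -> float:
    """Reliability measure: average operating time between failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return uptime_hours / failures


def mean_time_to_restore(repair_hours: float, failures: int) -> float:
    """Maintainability measure: average time to bring the component back."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTTR")
    return repair_hours / failures


def steady_state_availability(mtbf: float, mttr: float) -> float:
    """Fraction of time the component is expected to be operational."""
    return mtbf / (mtbf + mttr)


# Hypothetical component: 990 hours of operation and 30 hours of repair
# spread over 3 failures.
mtbf = mean_time_between_failures(990.0, 3)
mttr = mean_time_to_restore(30.0, 3)
expected = steady_state_availability(mtbf, mttr)
```

Seen this way, availability can be raised either by failing less often (higher MTBF) or by restoring service faster (lower MTTR).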

In a data warehouse, one fundamental metric for measuring productivity is downtime. However, this number alone does not help you much in understanding what drives a system’s availability. Focusing too much on a number at the end of the month can bias you towards a reactive view of availability. Root-cause analysis is important for preventing specific past problems from happening again, but it does not prevent new issues from causing downtime in the future.

How can you shift your perspective to a forward-looking view in which availability needs are provided for on a continuous basis? Availability management is the answer. Proactive approaches to availability apply risk management concepts to lessen the chances of downtime and prolonged outages. By the way, here is an article you might want to check out entitled Creating a SQL Server Maintenance Plan.

Risk events in data warehousing can range from catastrophic, to inconvenient, to barely detected. They can be sorted into three familiar downtime categories based on their type of impact:

  1. Degraded Downtime: low-quality availability in which the system is available but performance is slow and inefficient. Typical causes are capacity exhaustion and poor workload management.
  2. Unplanned Downtime: an unanticipated loss of access to the application, data or system, caused by events such as planned downtime overruns, human error or a utility outage.
  3. Planned Downtime: a scheduled system outage, usually during non-critical or low-usage periods, for activities such as testing, planned maintenance and updates or upgrades.
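
As an illustration of how these three categories could be applied to an event log, the sketch below sorts events by two yes/no questions and totals the hours per category. The `Event` fields and the classification rule are assumptions made for this example, not part of any standard tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Downtime(Enum):
    DEGRADED = "degraded"
    UNPLANNED = "unplanned"
    PLANNED = "planned"


@dataclass(frozen=True)
class Event:
    description: str
    hours: float
    scheduled: bool          # was the outage window agreed in advance?
    system_reachable: bool   # could users still reach the system, even if slowly?


def classify(event: Event) -> Downtime:
    """Map an event to one of the three downtime categories."""
    if event.system_reachable:
        return Downtime.DEGRADED
    if event.scheduled:
        return Downtime.PLANNED
    # Includes overruns of planned windows, which should be logged as
    # separate events with scheduled=False.
    return Downtime.UNPLANNED


def hours_by_category(events: list[Event]) -> dict[Downtime, float]:
    """Total the hours of downtime recorded in each category."""
    totals: dict[Downtime, float] = defaultdict(float)
    for event in events:
        totals[classify(event)] += event.hours
    return dict(totals)
```

Breaking the monthly downtime figure down this way shows whether lost hours come from scheduled work, surprises or poor performance, which a single total cannot reveal.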

Your availability management process will mature and improve over time. As service levels improve, your process will gather momentum and you can aim for even higher standards. Just remember that availability management processes are critical to the quality of service. Each adds value on its own, but running them together and keeping them aligned will not only bring improvements but also reduce repeat downtime, so service uptime increases. Adopting availability management is one important step in moving an IT company from system management to service management as customer satisfaction increases.

Areas Affected by Availability Management

Even though the occurrence of risk events is not under anyone’s control, applying a good availability management framework mitigates their impact. To meet strategic and tactical objectives, all areas that affect the availability of the system must be addressed using real-world, tangible IT assets, tools, people and processes, along with other attributes that can be budgeted, assigned, administered and supervised to support the availability of the system:

  1. Recoverability: strategies and processes to regularly back up and archive data so that data and functionality can be restored in cases of data loss or disaster.
  2. Data protection: product features and processes that minimize or eliminate data loss, corruption and theft. This includes large cliques, hot standby disks, hot standby nodes, fallback and security.
  3. Operations: operational procedures and support, together with the support personnel used in the everyday administration of the system and database.
  4. Technology: each system’s design, remote connectivity and enabled tools and utilities.
  5. Infrastructure: the configuration connecting the IT architecture, network and assets.
  6. Environment: the physical conditions and layout of the equipment within the data center housing the infrastructure including cleanliness, quality of power, airflow and infrastructure.

In practice, availability management is the art of meeting a company’s needs cost-effectively. This involves managing users’ expectations almost as much as the technology, which you will be able to do effectively with this course.