An in-depth analysis of the value of the data center and how to adapt its infrastructure to ensure resilience levels that meet the highest standards.
By Kristian Montevecchi, Tech Expert VEM Sistemi
The data center has undergone major transitions thanks to the continuous evolution of ICT technologies. This progress has introduced needs and complexities completely different from those of the past: today the data center plays a vital role in a company’s business and requires operational continuity 24 hours a day, 365 days a year.
Data centers that are no longer modern often suffer from oversizing or undersizing, which leads to energy inefficiency, limited reliability and management difficulties. Planning their renovation and adaptation, however, comes up against the need to keep the services they provide running at all times, without stoppages or impacts on productivity.
Making infrastructural improvements to an existing site can require extremely invasive interventions, especially if one of the objectives is to achieve a level of resilience that meets the highest construction standards.
The value of a data center is determined not by its power or size, but by the consequences that any downtime would have on the business. To assess redundancy, reliability and availability, the Tier Classification System created by the Uptime Institute is often used. It defines four tiers, from Tier 1 to Tier 4, with the latter representing the highest level of reliability.
What are the four tiers of the data center?
Tier 1 – Basic Site Infrastructure: provides 99.671% data center availability. It consists of a basic infrastructure without redundancy or backup systems. Maintenance, reconfiguration or failure of one of the elements of the infrastructure results in the interruption of services.
Tier 2 – Redundant Components: provides 99.741% data center availability. What makes it more reliable than Tier 1 is the redundancy of critical equipment only, such as uninterruptible power supplies (UPSs), backup generators and cooling systems. Tier 2 therefore has fewer vulnerabilities than Tier 1: maintenance or failure of one of the redundant elements has no impact on services, which are less prone to interruption. However, reconfiguration or routine/extraordinary maintenance of the distribution infrastructure still requires a service interruption.
Tier 3 – Concurrently Maintainable: provides 99.982% data center availability. It is characterised by the presence of an active and a passive path, both of which can distribute the load and guarantee the backup of critical equipment. Dual power sources, dual connectivity to network service providers and the ability to support server hardware with N+1 redundancy on the active branch are required. During maintenance operations that require the main branch to be taken out of service, the passive branch is temporarily activated to ensure business continuity. The failure of one of the infrastructure elements that make up the active branch could cause the data center to go down.
Tier 4 – Fault-tolerant: provides 99.995% data center availability. It requires a fully redundant system: every component of the infrastructure is duplicated, so that the service remains available at all times even if any element fails. Tier 4 requires a highly complex system architecture but guarantees maximum reliability thanks to two active paths, each with the capacity to handle the entire workload. Dual power sources, dual connectivity to network service providers and the ability to support server hardware with 2N redundancy are required.
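To put the availability percentages above into perspective, they can be converted into the maximum downtime they allow over a year. The following is a minimal sketch of that arithmetic, assuming a 365-day year (8,760 hours); the function and variable names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, leap years ignored (assumption)

# Availability figures as quoted in the tier descriptions above.
TIER_AVAILABILITY = {
    "Tier 1": 99.671,
    "Tier 2": 99.741,
    "Tier 3": 99.982,
    "Tier 4": 99.995,
}


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours per year a site may be down at the given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for tier, pct in TIER_AVAILABILITY.items():
        hours = annual_downtime_hours(pct)
        print(f"{tier}: {pct}% availability -> {hours:.2f} hours of downtime per year")
```

On these assumptions, the gap between tiers is substantial: roughly 28.8 hours per year for Tier 1 and 22.7 hours for Tier 2, against about 1.6 hours for Tier 3 and around 26 minutes for Tier 4.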
How to adapt the infrastructure of an existing data center?
A sustainable solution, in these cases, is to carry out on-site analyses to study restructuring and/or adaptation procedures that allow “live” operations on the infrastructure. These are projects whose main constraint is to eliminate or minimise the impact on the data center’s service continuity.
This approach, where equipment is dual-powered, makes it possible to work on all the active elements that make up the electrical and mechanical sections of the infrastructure, and even to change the topological configuration of the architecture. However, if one of the requirements of the refurbishment is to obtain a Tier construction certification guaranteeing its quality, the realistic target is Tier 3, since the strict structural requirements of Tier 4 are difficult to satisfy.
In fact, the main conditions that must be met for a Tier 4 data center concern:
- compartmentalisation: the elements and distribution paths that make up the supporting infrastructure must be physically isolated from each other in order to prevent a single event from impacting both systems;
- the topological configuration of the architecture, which can be changed but, for Tier 4 certification, requires additional elements that affect the physical layout of the data center rooms and, in some cases, the building structure itself;
- the presence of systems within the infrastructure that autonomously detect and isolate any faults, while related events (liquid leaks, fires, etc.) must be contained and confined to the area where they are generated.
It is therefore quite unlikely (if not impossible) that these additional criteria can be met during live renovations, as they would require extremely onerous and highly invasive works. The alternative for achieving construction certification would be to settle for Tier 3, which means moving from a Fault-tolerant to a Concurrently Maintainable configuration.
Having said that, it is important to note that the standard does not provide for any splitting of levels (e.g. Tier 3.5 or Tier Plus). Even achieving Tier 3 may not be sufficient, as it only guarantees the ability to keep the data center active during routine and extraordinary maintenance, but not in the event of an electrical or mechanical failure, unlike Tier 4.
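The all-or-nothing nature of the levels can be illustrated with a short sketch. It is a deliberately simplified model, not the Uptime Institute's certification procedure: the `SiteCapabilities` fields and the `achievable_tier` function are hypothetical names that condense the tier descriptions and the three Tier 4 conditions listed above into yes/no flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCapabilities:
    """Simplified yes/no view of a site (illustrative assumption)."""
    redundant_components: bool        # Tier 2: redundant UPSs, generators, cooling
    concurrently_maintainable: bool   # Tier 3: maintenance without service interruption
    compartmentalised: bool           # Tier 4: physically isolated elements and paths
    fault_tolerant_topology: bool     # Tier 4: two active paths, each able to carry the full load
    autonomous_fault_isolation: bool  # Tier 4: faults detected, isolated and contained automatically


def achievable_tier(site: SiteCapabilities) -> int:
    """Return the highest whole tier whose requirements are all met."""
    if not site.redundant_components:
        return 1
    if not site.concurrently_maintainable:
        return 2
    tier4_met = (
        site.compartmentalised
        and site.fault_tolerant_topology
        and site.autonomous_fault_isolation
    )
    # There is no "Tier 3.5": missing any Tier 4 condition leaves the site at Tier 3.
    return 4 if tier4_met else 3


if __name__ == "__main__":
    # Hypothetical refurbished site: everything in place except compartmentalisation.
    refurbished = SiteCapabilities(True, True, False, True, True)
    print(f"Achievable tier: {achievable_tier(refurbished)}")
```

In this model, a site that satisfies two of the three Tier 4 conditions is still rated Tier 3, which is exactly the gap that makes a purely tier-based target unsatisfying for live renovations.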
In this context, the ideal solution could be the adoption of a hybrid system known as Multi-Tier.
This mixed configuration is mainly based on the Tier 3 and Tier 4 standards, and is highly customised to fit existing data centers, where it is necessary to distinguish between critical and less critical parts of the infrastructure in order to prioritise and determine the level of adaptation.
In order to implement a Multi-Tier system, it is essential to conduct an in-depth analysis of the existing situation to assess the current state of the structure, the reliability and vulnerabilities of the data center, as well as the topology of the facilities, paths, their interaction, etc.
Based on this information, it will be possible to design hybrid infrastructures in which each part of the data center adopts solutions matching a different Tier standard, according to its criticality. These infrastructures will come with dedicated procedures for modifying existing systems and for new implementations, aimed at reducing or eliminating service interruptions during the adaptation work.
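As a rough illustration of the Multi-Tier idea, the outcome of the preliminary analysis can be thought of as an inventory of infrastructure parts, each tagged with its criticality and mapped to a target tier. The sketch below assumes a simple policy (critical parts target Tier 4, less critical parts target Tier 3); the policy, the part names and the `plan_multi_tier` function are hypothetical and would in practice come out of the on-site analysis.

```python
from enum import Enum


class Criticality(Enum):
    CRITICAL = "critical"
    LESS_CRITICAL = "less critical"


# Assumed policy, for illustration only: the real mapping is a design decision.
TARGET_TIER = {
    Criticality.CRITICAL: 4,
    Criticality.LESS_CRITICAL: 3,
}


def plan_multi_tier(inventory: dict[str, Criticality]) -> dict[str, int]:
    """Map each infrastructure part to the tier it should be adapted to."""
    return {part: TARGET_TIER[level] for part, level in inventory.items()}


if __name__ == "__main__":
    # Hypothetical inventory produced by the preliminary analysis.
    inventory = {
        "core server room power distribution": Criticality.CRITICAL,
        "core server room cooling": Criticality.CRITICAL,
        "test and development room power distribution": Criticality.LESS_CRITICAL,
    }
    for part, tier in plan_multi_tier(inventory).items():
        print(f"{part}: adapt to Tier {tier}")
```

Keeping the policy in one table separates the criticality assessment from the design targets, so either can be revised as the analysis progresses.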