DATA CENTER CONSTRUCTION COSTS
By Larry Smith, President, ABR Consulting Group, Inc.
BACKGROUND
One of the most troublesome parts of producing a budget for the design and
relocation of a data center is budgeting for the construction of the computer
room and supporting equipment yard portion of the project. The primary
reasons this area is so difficult to budget are: (1) facilities groups and
IT groups rarely design and build data centers, and (2)
The Uptime Institute® (http://upsite.com/TUIpages/whitepapers/tuitiers.html)
has developed a tiered classification approach to site infrastructure
functionality that addresses the need for a common benchmarking standard. The
Institute’s system has been under development for several years, and includes
measured availability figures ranging from 99.67% to more than 99.99%. It is
important to note that this range of availability is substantially less than the
current Information Technology (IT) expectations for “Five Nines.”
Over the last forty years, data center designs have evolved through at least
four distinct stages, which are captured in the Institute’s classification
system. Tier I first appeared in the early sixties, Tier II in the seventies,
Tier III in the late eighties and early nineties, and Tier IV in 1994 with the
United Parcel Service Windward project, which was the first site to assume the
availability of dual-powered computer equipment. The Uptime Institute®
participated in the development of Tier III concepts and pioneered the creation
of Tier IV.
Invention of Tier IV was made possible by Ken Brill, Executive Director of The
Uptime Institute, who envisioned a future when all computer hardware would come
with dual power inputs. During construction
of the $50 million Windward project, United Parcel Service worked with IBM and
other computer hardware manufacturers to provide dual-powered computer hardware.
Dual power technology requires having at least two completely independent
electrical systems. These dual systems supply power via diverse power paths to
the computer load, which moves the last point of electrical redundancy from
within the Uninterruptible Power System (UPS) down to within the computer
hardware itself. Brill’s intuitive conclusion has since been confirmed by
Uptime Institute research that has determined that 95% of all site
infrastructure failures occur between the UPS and the computer load. Since
completion of the Windward project in 1994, Tier IV electrical
designs have become common and the number of computer hardware products with
dual inputs has grown.
The advent of dual-powered computer hardware in tandem with Tier IV electrical
infrastructure is an example of site infrastructure design and computer hardware
design simultaneously achieving higher availability. With the significant
improvements in computer hardware design currently being made, many data centers
constructed even in the last five years offer only Tier I, II, or III
functionality, falling far behind in their capacity to match the availability
offered by the Information Technology they support.
Defining the Tiers
The tier classification system involves several definitions. A site that can
sustain at least one “unplanned” worst-case site infrastructure failure with
no critical load impact is considered fault tolerant. A site that is able to
perform planned site infrastructure activity without shutting down critical load
is concurrently maintainable (fault tolerance level may be reduced during
concurrent maintenance). It is important to remember that a typical data center
site is composed of at least twenty major mechanical, electrical, fire
protection, security and other systems, each of which has additional subsystems
and components. All of these must be concurrently maintainable and/or fault
tolerant for the entire site to be considered concurrently maintainable and/or
fault tolerant.
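
The "all systems must qualify" rule above can be sketched in a few lines. This is a minimal illustration, not part of the Institute's classification; the `Subsystem` type, the function names, and the example site are hypothetical and chosen only to show that one non-qualifying system disqualifies the whole site.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Subsystem:
    """One major site system (hypothetical model for illustration)."""
    name: str
    concurrently_maintainable: bool
    fault_tolerant: bool


def site_is_concurrently_maintainable(subsystems: Sequence[Subsystem]) -> bool:
    # The site qualifies only if every subsystem qualifies.
    if not subsystems:
        raise ValueError("a site needs at least one subsystem")
    return all(s.concurrently_maintainable for s in subsystems)


def site_is_fault_tolerant(subsystems: Sequence[Subsystem]) -> bool:
    # Same rule for fault tolerance: a single weak system fails the site.
    if not subsystems:
        raise ValueError("a site needs at least one subsystem")
    return all(s.fault_tolerant for s in subsystems)


# Assumed example: fault tolerant electrical plant, basic mechanical plant.
example_site = [
    Subsystem("electrical", concurrently_maintainable=True, fault_tolerant=True),
    Subsystem("mechanical", concurrently_maintainable=False, fault_tolerant=False),
    Subsystem("fire protection", concurrently_maintainable=True, fault_tolerant=True),
]

print(site_is_concurrently_maintainable(example_site))  # False
print(site_is_fault_tolerant(example_site))             # False
```

In practice each of the twenty or more major systems, and its subsystems, would need its own entry before the site-level answer means anything.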
Some sites built with fault tolerant System+System electrical concepts failed to
incorporate the mechanical equivalent, which involves dual mechanical systems. Such
sites are classified Tier IV electrically, but only achieve a Tier II level
mechanically. The following list summarizes the characteristics of each Tier.
+ Tier I
Single path for power and cooling distribution, no redundant components, 99.671%
availability.
+ Tier II
Single path for power and cooling distribution, redundant components, 99.749%
availability.
+ Tier III
Multiple power and cooling distribution paths, but only one path active,
redundant components, concurrently maintainable, 99.982% availability.
+ Tier IV
Multiple active power and cooling distribution paths, redundant components,
fault tolerant, 99.995% availability.
The availability numbers have been drawn from industry benchmarking conducted by
The Uptime Institute and reflect sites at or above the 90th percentile (only 10%
of all sites performed at this level). The quality of human-factors management
is the most significant element separating top sites from all others.
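
To make the availability percentages easier to compare, they can be converted to expected downtime per year. The sketch below assumes an 8,760-hour (non-leap) year and treats each percentage as a simple uptime fraction; the function and constant names are illustrative only.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year

# Availability figures as listed for each Tier above.
TIER_AVAILABILITY_PERCENT = {
    "Tier I": 99.671,
    "Tier II": 99.749,
    "Tier III": 99.982,
    "Tier IV": 99.995,
}


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


for tier, pct in TIER_AVAILABILITY_PERCENT.items():
    print(f"{tier}: {pct}% -> {annual_downtime_hours(pct):.1f} hours/year")
# Tier I: 99.671% -> 28.8 hours/year
# Tier II: 99.749% -> 22.0 hours/year
# Tier III: 99.982% -> 1.6 hours/year
# Tier IV: 99.995% -> 0.4 hours/year
```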
Tier I Data Center
Basic
A Tier I data center is susceptible to disruptions from both planned and
unplanned activity. It has computer power distribution and cooling, but it may
or may not have a raised floor, a UPS, or an engine generator. If it does have
UPS or generators, they are single-module systems and have many single points of
failure. The infrastructure should be completely shut down on an annual basis to
perform preventive maintenance and repair work. Urgent situations may require
more frequent shutdowns. Operation errors or spontaneous failures of site
infrastructure components will cause a data center disruption.
Tier II Data Center
Redundant Components
Tier II facilities with redundant components are slightly less susceptible to
disruptions from both planned and unplanned activity than a basic data center.
They have a raised floor, UPS, and engine generators, but their capacity design
is “Need plus One” (N+1), which has a single-threaded distribution path
throughout. Maintenance of the critical power path and other parts of the site
infrastructure will require a processing shutdown.
Tier III Data Center
Concurrently Maintainable
Tier III level capability allows for any planned site infrastructure activity
without disrupting the computer hardware operation in any way. Planned
activities include preventive and programmable maintenance, repair and
replacement of components, addition or removal of capacity components, testing
of components and systems, and more. For large sites using chilled water, this
means two independent sets of pipes. Sufficient capacity and distribution must
be available to simultaneously carry the load on one path while performing
maintenance or testing on the other path. Unplanned activities such as errors in
operation or spontaneous failures of facility infrastructure components will
still cause a data center disruption. Tier III sites are often designed to be
upgraded to Tier IV when the client’s business case justifies the cost of
additional protection.
Tier IV Data Center
Fault Tolerant
Tier IV provides site infrastructure capacity and capability to permit any
planned activity without disruption to the critical load. Fault-tolerant
functionality also provides the ability of the site infrastructure to sustain at
least one worst-case unplanned failure or event with no critical load impact.
This requires simultaneously active distribution paths, typically in a
System+System configuration. Electrically, this means two separate UPS systems
in which each system has N+1 redundancy. Because of fire and electrical safety
codes, there will still be downtime exposure due to fire alarms or people
initiating an Emergency Power Off (EPO). Tier IV requires all computer hardware
to have dual power inputs as defined by The Uptime Institute’s Fault Tolerant
Power Compliance Specification Version 1.2 (www.uptimeinstitute.org/spec.html).
Tier IV site infrastructures are the most compatible with high availability IT
concepts that employ CPU clustering, RAID DASD, and redundant communications to
achieve reliability, availability, and serviceability. The accompanying chart
shows how these IT ideas relate to site infrastructure concepts.

Solving Incompatible “Five Nines” Expectations
Even a fault-tolerant and concurrently maintainable Tier IV site will not
satisfy an IT requirement of “Five Nines” (99.999%) uptime. The best a Tier
IV site can deliver over time is 99.995%, and this assumes a site outage occurs
only as a result of a fire alarm or EPO, and that such an event occurs no more
than once every five years. Only the top 10 percent of Tier IV sites will
achieve this level of performance. Unless human activity issues are continually
and rigorously addressed, at least one additional failure is likely over five
years. While the site outage is assumed to be instantaneously restored (which
requires 24 x “forever” staffing), it can still require up to four hours for
IT to recover information availability.
Tier IV’s 99.995% uptime is an average over five years. An alternative
calculation using the same underlying data is 100% uptime for four years and
99.954% for the year in which the downtime event occurs.
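
The arithmetic behind these figures can be checked with a short sketch. It assumes an 8,760-hour year and, as a further assumption not stated in the text, that the downtime counted in the event year is the four-hour IT recovery time mentioned above. The function names are illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year


def availability_percent(downtime_hours: float, years: float) -> float:
    """Availability over a period, given total downtime in that period."""
    return 100.0 * (1.0 - downtime_hours / (years * HOURS_PER_YEAR))


def allowed_downtime_hours(availability_pct: float, years: float) -> float:
    """Total downtime a given availability permits over a period."""
    return (100.0 - availability_pct) / 100.0 * years * HOURS_PER_YEAR


# One four-hour event in a single year (assumed duration).
print(f"{availability_percent(4, 1):.3f}")         # 99.954

# Downtime that 99.995% permits over five years.
print(f"{allowed_downtime_hours(99.995, 5):.2f}")  # 2.19

# The same four-hour event averaged over five years.
print(f"{availability_percent(4, 5):.3f}")         # 99.991
```

Under these assumptions the event-year figure of 99.954% is reproduced. The five-year figure of 99.995% corresponds to a little over two hours of downtime rather than four, so the inputs behind the published numbers may differ from this simple model.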
Contact us at www.abrconsulting.com
Phone: 925.872.5523 Fax: 916.478.2814