- Rob Ober, LSI Fellow,
Processor and System Architect, LSI Corporate Strategy Office, says:
The unrelenting growth in the
volume and velocity of data worldwide is spurring innovation in datacenter
infrastructures, and mega datacenters (MDCs) are on the leading edge of these advances.
Although MDCs are relatively new, their exponential growth – driven by this
data deluge – has thrust them into rarefied regions of the global server
market: they now account for about 25 percent of servers shipped.
Rapid innovation is the
watchword at MDCs. It is imperative to their core business and, on a much
larger scale, is forcing a rethinking of IT infrastructures of all sizes. The
pioneering efforts of MDCs in private clouds, compute clusters, data analytics
and other IT applications now provide valuable insights into the future of IT. Any
organization stands to benefit from emulating MDC techniques to improve scalability,
reliability, efficiency and manageability, and to reduce the cost of work done as
it confronts changing business dynamics and rising financial pressures.
The Effects of Scale at MDCs
MDCs and traditional
datacenters are miles apart in scale, though the architects at each face many of
the same challenges. Most notably, both are trying to do more with less by implementing
increasingly sophisticated applications and optimizing the investments needed
to confront the data deluge. The sheer scale of MDCs, however, magnifies even
the smallest inefficiency or problem. Economics force MDCs to view the entire datacenter
as a resource pool to be optimized as it delivers more services and supports
more users.
MDCs like those at Facebook,
Amazon, Google and China’s Tencent use a small set of distinct platforms, each
optimized for a specific task, such as storage, database, analytics, search or
web services. The scale of these MDCs is staggering: Each typically houses
200,000 to 1,000,000 servers, and from 1.5 million to 10 million disk drives. Storage
is their largest cost. The world’s largest MDCs deploy LSI flash cards, flash cache
acceleration, host bus adapters, serial-attached SCSI (SAS) infrastructure and
RAID storage solutions, giving LSI unique insight into the challenges these
organizations face and how they are pioneering architectural
solutions to common problems.
MDCs prefer open source
software for operating systems and other infrastructure, and the applications
are usually self-built. Most MDC improvements have been given back to the open
source community. In many MDCs, even the hardware infrastructure might be
self-built or, at a minimum, self-specified for optimal configurations – options
that might not be available to smaller organizations.
Server virtualization is only
rarely used in MDCs. Instead of using virtual machines to run multiple
applications on a single server, MDCs prefer to run applications across clusters
consisting of hundreds to thousands of server nodes dedicated to a specific
task. For example, the server cluster may contain only boot storage,
RAID-protected storage for database or transactional data, or unprotected
direct-map drives with data replication across facilities depending on the task
or application it is performing. MDC virtualization applications are all open
source. They are used for containerization to simplify the deployment and
replication of images. Because re-imaging or updating virtualization applications
occurs frequently, boot image management is another challenge.
The large clusters at MDCs make
the latency of inter-node communications critical to application performance,
so MDCs make extensive use of 10Gbit Ethernet in servers today and, in some
cases, they even deploy 40Gbit infrastructure as needed. MDCs also optimize
performance by deploying networks with static configurations that minimize
transactional latency. And MDC architects are now deploying at least some software
defined network (SDN) infrastructure to optimize performance, simplify management
at scale and reduce costs.
To some, MDCs seem cheap,
refusing to pay for any value-added functionality from vendors. But that’s a
subtle misunderstanding of their motivations. With as many as 1 million servers,
MDCs require a lights-out infrastructure maintained primarily by automated
scripts and only a few technicians assigned simple maintenance tasks. MDCs also
maintain a ruthless focus on minimizing any unnecessary spending, using the
savings to grow and optimize work performed per dollar spent.
MDCs are very careful to eliminate
features not central to their core applications, even if provided for free, since
they increase operating expenditures. Chips, switches and buttons, lights, cables,
screws and latches, software layers and anything else that does nothing to
improve performance only adds to power and cooling demands and service overhead.
The addition of one unnecessary LED in 200,000 servers, for example, is
considered an excess that consumes 26,000 watts of power and can increase operating
costs by $10,000 per year.
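As a back-of-the-envelope check on those figures, the arithmetic can be reproduced in a few lines of Python with two assumed inputs the article does not state: a draw of roughly 0.13 watts per LED circuit and a bulk electricity rate of about $0.044 per kWh.

# Rough cost of one extra LED per server at MDC scale.
# The per-LED draw and electricity rate are assumptions, not article figures.
servers = 200_000
watts_per_led = 0.13            # assumed draw of the LED and its drive circuitry
rate_per_kwh = 0.044            # assumed negotiated bulk power rate, $/kWh

total_watts = servers * watts_per_led              # ~26,000 W
kwh_per_year = total_watts / 1000 * 24 * 365       # ~228,000 kWh
cost_per_year = kwh_per_year * rate_per_kwh        # ~$10,000

print(f"Continuous draw: {total_watts:,.0f} W")
print(f"Annual energy:   {kwh_per_year:,.0f} kWh")
print(f"Annual cost:     ${cost_per_year:,.0f}")

And that is before counting the cooling needed to remove the extra heat.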
Even minor problems can
become major issues at scale. One of the biggest operational challenges for
MDCs is hard disk drive (HDD) failure rates. Despite the low price of HDDs,
failures can cause costly disruptions in large clusters, where these breakdowns
are routine. Another challenge is managing rarely used archival data that may exceed
petabytes and is now approaching exabytes of online storage, consuming more
space and power while delivering diminishing value. Every organization faces
similar challenges, albeit on a smaller scale.
Lessons Learned from Mega Datacenters
Changing business dynamics
and financial pressures are forcing all organizations to rethink the types of
IT infrastructure and software applications they deploy. The low cost of MDC
cloud services is motivating CFOs to demand more capabilities at lower costs
from their CIOs, who in turn are turning to MDCs to find inspiration and ways
to address these challenges.
The first lesson any
organization can learn from MDCs is to simplify maintenance and management by
deploying a more homogenous infrastructure. Minimizing infrastructure spending
where it matters little and focusing it where it matters most frees capital to
be invested in architectural enhancements that maximize work-per-dollar.
Investing in optimization and efficiency helps reduce infrastructure and
associated management costs, including those for maintenance, power and cooling.
Incorporating more lights-out self-management also pays off, supporting more
capabilities with existing staff.
The second lesson is that
maintaining five-nines (99.999%) reliability drives up costs and becomes increasingly
difficult architecturally as the infrastructure scales. A far more
cost-effective architecture is one that allows subsystems to fail, letting the rest
of the system operate unimpeded and the overall system self-heal. Because all
applications are clustered, a single misbehaving node can degrade the performance
of the entire cluster. MDCs take the offending server offline, enabling all
others to operate at peak performance. The hardware and software needed for such
an architecture are readily available today, enabling any organization to
emulate this approach. And though the expertise needed to effectively deploy a
cluster is still rare, new orchestration layers are emerging to automate cluster
management.
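To make the fail-in-place pattern concrete, the sketch below shows it in miniature, with an assumed latency threshold and a placeholder health probe rather than any real orchestration API: a periodic check quarantines a misbehaving node so the remaining nodes keep running at peak performance.

import time

# Minimal sketch of the fail-in-place pattern described above.
# LATENCY_LIMIT_MS, the node names and probe_latency_ms() are illustrative.
LATENCY_LIMIT_MS = 50.0          # assumed per-node response-time budget
CHECK_INTERVAL_S = 30

active_nodes = {"node-001", "node-002", "node-003"}
quarantined_nodes = set()

def probe_latency_ms(node):
    """Placeholder for a real health probe (heartbeat, canary query, etc.)."""
    return 12.0

def health_check_pass():
    for node in list(active_nodes):
        if probe_latency_ms(node) > LATENCY_LIMIT_MS:
            # Take the offending node offline; the cluster software
            # re-replicates or re-balances its work on the healthy nodes.
            active_nodes.discard(node)
            quarantined_nodes.add(node)
            print("quarantined", node, "-", len(active_nodes), "nodes active")

if __name__ == "__main__":
    for _ in range(3):           # a few passes for illustration
        health_check_pass()
        time.sleep(CHECK_INTERVAL_S)

An actual orchestration layer would add repair, re-imaging and re-admission steps on top of this basic quarantine loop.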
Storage, one of the most critical
infrastructure subsystems, directly impacts application performance and server
utilization. MDCs are leaders in optimizing datacenter
storage efficiency, providing high-availability operation to satisfy requirements
for data retention and disaster recovery. All MDCs rely exclusively on direct-attached
storage (DAS), which carries a much lower purchase cost, is simpler to maintain
and delivers higher performance than a storage area network (SAN) or network-attached
storage (NAS). Although many MDCs minimize
costs by using consumer-grade Serial ATA (SATA) HDDs and solid state drives
(SSDs), they almost always deploy these drives on a SAS infrastructure to maximize
performance and simplify management. More MDCs are now migrating to large-capacity,
enterprise-grade SAS drives for higher reliability and performance, especially
as SAS migrates from 6Gbit/s to 12Gbit/s bandwidth.
When evaluating storage
performance, most organizations focus on I/O operations per second (IOPs) and MBytes/s
throughput metrics. MDCs have discovered, though,
that applications driving IOPs to SSDs quickly reach other limits, often
peaking well below 200,000 IOPs, and that MBytes/s performance has only a modest
impact on work done. A more meaningful metric is I/O latency because it correlates
more directly with application performance and server utilization – the very
reason MDCs are deploying more SSDs or solid state caching (or both) to
minimize I/O latency and increase work-per-dollar.
Typical HDD read/write
latency is on the order of 10 milliseconds. By contrast, typical SSD read and
write latencies are around 200 microseconds and 100 microseconds, respectively
– roughly two orders of magnitude lower. Specialized PCIe® flash cache acceleration
cards can reduce latency another order of magnitude to tens of microseconds. Using
solid state storage to supplement or replace HDDs enables servers and
applications to do four to 10 times more work. Server-based flash caching provides
even greater gains in SAN and NAS environments – up to 30 times.
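One way to see why latency is the better predictor of work done is Little's Law: the I/O rate an application actually achieves is roughly its number of outstanding I/Os divided by its per-I/O latency. The sketch below applies that relationship to the approximate latencies quoted above, assuming a modest queue depth of eight outstanding I/Os per application.

# Little's Law: delivered IOPs ~= outstanding I/Os / per-I/O latency.
# The queue depth is an assumption; the latencies approximate those above.
QUEUE_DEPTH = 8                        # assumed outstanding I/Os per application

device_latency_s = {
    "HDD (~10 ms)":              10e-3,
    "SSD read (~200 us)":        200e-6,
    "PCIe flash cache (~25 us)": 25e-6,
}

for name, latency in device_latency_s.items():
    print(f"{name:<28} {QUEUE_DEPTH / latency:>10,.0f} IOPs")

At that queue depth an HDD delivers on the order of 800 IOPs and an SSD around 40,000, which is why applications peak far below a drive's rated IOPs and why shaving latency, rather than adding headline IOPs, is what increases work-per-dollar.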
Flash cache acceleration
cards deliver the lowest latency when plugged directly into a server’s PCIe
bus. Intelligent caching software continuously and transparently places hot data
(the most frequently accessed or temporally important) in low-latency flash
storage to improve performance. Some flash cache acceleration cards support
multiple terabytes of solid state storage, holding entire databases or working
datasets as hot data. And because there is no intervening network and no risk
of associated congestion, the cached data is accessible quickly and
deterministically under any workload.
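As a rough mental model, not the actual software shipped with these cards, hot-data placement can be pictured as a recency-driven cache in flash sitting in front of the slower HDD tier:

from collections import OrderedDict

# Rough mental model of hot-data placement: a least-recently-used cache in
# flash in front of slower HDD storage. read_from_hdd() and the 4 KB block
# size are placeholders; real caching software judges "hot" far more carefully.

def read_from_hdd(block_id):
    """Placeholder for the slow path to the backing HDD tier."""
    return b"\x00" * 4096

class FlashCache:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()            # block_id -> data, LRU order

    def read(self, block_id):
        if block_id in self.blocks:            # hot data: served from flash
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        data = read_from_hdd(block_id)         # cold data: fetch from HDD
        self.blocks[block_id] = data           # promote into flash
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict least-recently-used block
        return data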
Deploying an all-solid-state
Tier 0 for some applications is also now feasible, and at least one MDC uses
SSDs exclusively. In the enterprise, decisions about using SSDs usually focus
on the storage layer, and cost per GByte or IOPs, pitting HDDs against SSDs
with an emphasis on capital expenditure. MDCs have discovered that SSDs deliver
better price/performance than HDDs by maximizing work-per-dollar investments in
other infrastructure (especially servers and software licenses), and by
reducing overall maintenance costs. Solid state storage is also more reliable,
easier to manage, faster to replicate and rebuild, and more energy-efficient than
HDDs – all advantages to any datacenter.
Pioneering the Datacenter of the Future
MDCs have been driving open
source solutions with proven performance, reliability and scalability. In some
cases, these pioneering efforts have enabled applications to scale far beyond
any commercial product. Examples include Hadoop® for analytics and derivative
applications, and clustered query and database applications like Cassandra™
and Google’s Dremel. The state-of-the-art for these and other applications is
evolving quickly, literally month-by-month. These open source solutions are
seeing increasing adoption and inspiring new commercial solutions.
Two other, relatively new
initiatives are expected to bring MDC advances to the enterprise market, just
as Linux® software did. One is Open Compute, which offers a minimalist,
cost-effective, easy-to-scale hardware infrastructure for compute clusters. Open
Compute could also foster its own innovation, including an open hardware
support services business model similar to the one now used for open source
software. The second initiative is OpenStack® software, which promises a higher
level of automation for managing pools of compute, storage and networking resources,
ultimately leading to the ability to operate a software defined datacenter.
A related MDC initiative
involves disaggregating servers at the rack level. Disaggregation separates the
processor from memory, storage, networking and power, and pools these resources
at the rack level, enabling the lifecycle of each resource to be managed on its
own optimal schedule to help minimize costs while increasing work-per-dollar. Some
architects believe that these initiatives could reduce total cost of ownership
by a staggering 70 percent.
Maximizing
work-per-dollar at the rack and datacenter levels is one of the best ways today
for IT architects in any organization to do more with less. MDCs are masters at
this type of high efficiency as they continue to redefine how datacenters will scale
to meet the formidable challenges of the data deluge.
About the Author
Robert Ober is an LSI Fellow
in Corporate Strategy, driving LSI into new technologies, businesses and
products. He has 30 years of experience in processor and system architecture.
Prior to joining LSI, Rob was a Fellow in the Office of the CTO at AMD, with
responsibility for mobile platforms, embedded platforms and wireless strategy.
He was one of the founding Board members of OLPC (the $100 laptop, laptop.org), was
influential in its technical evolution, and was also
a Board member of OpenSPARC.