- Jack Norris, Vice President, Marketing, MapR Technologies, says:
The demand for Big Data analytics is challenging traditional
data storage and processing methods. Enterprises are capturing large
amounts of data from many new sources, such as the web and social media, and
now need new tools to rapidly process and analyze this mix of structured and
unstructured data. Traditional data warehousing technology was not
designed to handle that volume or that blend of data types. Instead,
organizations are tapping into Hadoop's power to provide the rapid processing
and analytics needed to unlock the informational value of their data.
Hadoop applications are increasingly moving out of the lab and
into production environments. The shift coincides with a push for Big
Data analysis to move from batch-oriented historical analytics to processes
that deliver more immediate results. Enterprises are looking to tap into
their data as quickly as they capture it and use it to add more value to their
products and services. As organizations look to leverage Hadoop for Big
Data analytics, enterprise-level requirements become increasingly important.
Enterprise readiness comprises critical elements such as high
availability, security, data protection, and multi-tenancy capabilities
that allow a cluster to be shared effectively across disparate users
and solutions.
High Availability
When evaluating a Hadoop solution, it is essential to understand where a
cluster is vulnerable to failure and what can cause downtime. There
are alternatives that offer automated failover and the ability to protect
against multiple, simultaneous failures.
Security
With multiple users trying to access and change data on the
cluster, there is a need to ensure proper authentication, authorization, and
isolation at the file-system and job-execution layers. Not all distributions
provide these capabilities.
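As a minimal sketch of what this looks like in practice, the following example
uses the standard Apache Hadoop APIs to authenticate with a Kerberos keytab and
then lock down a project directory at the file-system layer. The principal,
keytab, and path names are illustrative, and the specifics vary by
distribution.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsAction;
    import org.apache.hadoop.fs.permission.FsPermission;
    import org.apache.hadoop.security.UserGroupInformation;

    public class SecureAccessExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Authentication: log in with a Kerberos keytab before touching
            // the cluster. The principal and keytab path are illustrative.
            conf.set("hadoop.security.authentication", "kerberos");
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromKeytab("analyst@EXAMPLE.COM",
                    "/etc/security/keytabs/analyst.keytab");

            // Authorization and isolation: restrict a project directory to its
            // owning user and group (rwxrwx---) so other tenants cannot read it.
            FileSystem fs = FileSystem.get(conf);
            Path projectDir = new Path("/projects/marketing");
            fs.mkdirs(projectDir);
            fs.setPermission(projectDir,
                    new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.NONE));
            fs.close();
        }
    }

Isolation at the job-execution layer is typically handled separately, through
scheduler and queue configuration rather than the file-system API.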
Data Protection
Using point-in-time snapshots to restart a process from a point
before a data error or corruption occurred is something enterprise storage
administrators have come to expect; however, snapshots are not available in all
Hadoop distributions. The same is true of mirroring: for Hadoop to be included
in an enterprise's disaster recovery plan, remote mirroring must be supported.
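As an illustration of the snapshot capability, the following sketch uses the
Apache Hadoop FileSystem API on a distribution that supports HDFS snapshots.
The directory and snapshot names are illustrative, and other distributions
expose equivalent functionality through their own interfaces.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class SnapshotExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path dataDir = new Path("/warehouse/orders");

            // Mark the directory as snapshottable (an administrative step on HDFS).
            if (fs instanceof DistributedFileSystem) {
                ((DistributedFileSystem) fs).allowSnapshot(dataDir);
            }

            // Take a point-in-time snapshot before a risky batch job runs.
            Path snapshot = fs.createSnapshot(dataDir, "before-nightly-load");

            // If the job corrupts the data, it can be restarted against the
            // read-only snapshot path instead of the damaged live directory.
            System.out.println("Snapshot available at: " + snapshot);
            fs.close();
        }
    }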
Multi-tenancy
Multi-tenancy support, so that different users and groups can
effectively share a cluster, is a required feature for most large deployments.
Sharing the cluster allows for efficient utilization of the storage and
computational power of Hadoop. This requires features such as segregated user
environments, quotas, and data placement control, which are available only in
some Hadoop distributions.
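As a minimal sketch of one of these features, the following example sets
per-tenant name and space quotas using the Apache Hadoop HDFS API. The tenant
directory and limits are illustrative, and the corresponding controls differ
across distributions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class TenantQuotaExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            if (fs instanceof DistributedFileSystem) {
                DistributedFileSystem dfs = (DistributedFileSystem) fs;
                Path tenantDir = new Path("/tenants/finance");
                dfs.mkdirs(tenantDir);

                // Cap this tenant at 1 million files/directories and 10 TB of
                // raw space so one group cannot exhaust the shared cluster.
                long nameQuota = 1000000L;
                long spaceQuota = 10L * 1024 * 1024 * 1024 * 1024; // 10 TB in bytes
                dfs.setQuota(tenantDir, nameQuota, spaceQuota);
            }
            fs.close();
        }
    }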
Meeting all of these enterprise-level requirements while
continuing to add new analytical capabilities will ensure Hadoop's strong
position in the Big Data industry. The newly emerging data analytics processes
driven by Hadoop will push well beyond what is currently possible with Big
Data. Hadoop has emerged as a standard platform and will soon be found in the
well-established production data centers of the Fortune 1000.
About the Author
Jack Norris, vice president, marketing, MapR Technologies
Jack has over 20 years of enterprise software marketing
experience. As VP of Marketing at MapR Technologies, Jack leads worldwide
marketing for the industry's most advanced distribution for Hadoop. Jack's
experience ranges from defining new markets for small companies and leading
marketing and business development for an early-stage cloud storage software
provider to increasing sales of new products for large public companies. Jack
has also held senior executive roles with Brio Technology, SQRIBE, EMC,
Rainfinity, and Bain and Company.