“When elephants cross the world's hottest desert…”

Introduction
Anyone who is serious about big data, scale-out applications and cloud infrastructure should want to understand both the benefits of scale-out architecture and the resource elasticity of cloud services. As our understanding of data deepens, we see a need for agile access to an elastic big data platform. Such a platform lets us capture, synthesize and quantify data and turn it into business value.

Enter OpenStack Sahara - the intersection of Hadoop and OpenStack.

As an OpenStack project started by Red Hat, Mirantis and Hortonworks during the OpenStack Havana summit in Portland, Sahara was incubated for the OpenStack Icehouse release and is expected to be integrated for OpenStack Juno by the end of 2014.

Sahara’s mission is to provide a scalable data processing stack and associated management interfaces. Sahara delivers on that mission by providing the ability to rapidly create and manage Apache Hadoop™ clusters and easily run workloads across them, all on OpenStack-managed infrastructure and without having to deal with the details of cluster management.

Sahara covers the full cluster lifecycle (provisioning, scaling and termination) and lets the user select the Hadoop version, the cluster topology and the node hardware details.
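To make those choices concrete, here is a minimal Python sketch of what a cluster request might carry: a Hadoop version, a topology expressed as node groups, and a hardware size per group. This is not Sahara's API. The `NodeGroup` and `ClusterSpec` classes, their fields, and the version, flavor and process names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeGroup:
    """Hypothetical description of one group of identical cluster nodes."""
    name: str                  # role label, e.g. "master" or "worker"
    flavor: str                # node hardware size (illustrative flavor name)
    count: int                 # number of instances in this group
    processes: List[str] = field(default_factory=list)  # Hadoop daemons to run


@dataclass
class ClusterSpec:
    """Hypothetical cluster request: Hadoop version plus topology."""
    name: str
    hadoop_version: str
    node_groups: List[NodeGroup]

    def total_nodes(self) -> int:
        """Sum the instance counts across all node groups."""
        return sum(group.count for group in self.node_groups)

    def scale(self, group_name: str, new_count: int) -> None:
        """Resize one node group in place; reject unknown groups or bad counts."""
        if new_count < 0:
            raise ValueError("new_count must be non-negative")
        for group in self.node_groups:
            if group.name == group_name:
                group.count = new_count
                return
        raise KeyError(group_name)


# Illustrative values only; none of these come from the Sahara project.
spec = ClusterSpec(
    name="demo",
    hadoop_version="2.3.0",
    node_groups=[
        NodeGroup("master", "m1.medium", 1, ["namenode", "resourcemanager"]),
        NodeGroup("worker", "m1.large", 3, ["datanode", "nodemanager"]),
    ],
)

spec.scale("worker", 5)    # scaling: grow the worker group from 3 to 5
print(spec.total_nodes())  # one master plus five workers
```

Provisioning and termination would act on the same description; only scaling is modeled here, as a change to one group's instance count.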

Sahara key features and use cases:

  • Fast and agile Hadoop cluster deployment
  • An extensible framework for management and provisioning components
  • Run Hadoop workloads in a few clicks without expertise in Hadoop operations
  • “Analytics as a Service” utilization of unused compute capacity for ad-hoc or bursty analytic workloads
  • Support for different types of jobs (MapReduce, Hive, Pig and Oozie workflows), with data drawn from various sources (Swift, HDFS, NoSQL and SQL databases) and a choice of provisioning plugins
  • The intersection of two of the largest open source movements: OpenStack provides the foundation and hub of innovation for cleanly managing infrastructure resources, while Apache Hadoop™ serves as the core and innovation driver for storing and processing data


Bringing these two technologies together not only strengthens and catalyzes their ecosystems, but offers an increasing wealth of value to their users.

The OpenStack Sahara project aims to facilitate this combination and enable customers and partners alike to take advantage of a growing big data processing platform on OpenStack.


Our vision is to bring Big Data and OpenStack together, with a broad ecosystem of partners offering interoperability, reliability and choice.

You can use Sahara now in RDO, and as a technology preview in RHEL OSP 5.

Over the next few months, we’ll bring you examples of how to use Sahara in RDO and RHEL OSP, show you how to get involved as a customer or partner, and tell you about the value provided by merging the infrastructure and data processing universes. Look for posts by Keith Basil and Matthew Farrelle.

To learn more and get involved with the Sahara project, please visit the Sahara OpenStack Wiki at: https://wiki.openstack.org/wiki/Sahara

 

