Building the Cloud with Open Source and Open Standards

Open source software and standards are not just beneficial but essential for a heterogeneous, shared and scalable environment such as the “cloud”. What’s more, the community has been quick to ready the tools needed to meet this emerging trend. It is not surprising, then, that evangelists believe that open source has built the cloud…

We certainly do not need to tell you what free/open-source software is, but should probably spend a few minutes clarifying what cloud computing really is. Take any definition of cloud computing, and it sounds much like software-as-a-service, utility computing, and even grid computing. It takes quite a while to figure out the difference, so it is best explained right away; you can then appreciate the cloud computing concept and the open source advantage even better.

Hey diddle diddle…

If the cow jumped over the moon today, what would it see on the clouds below? It would see software-as-a-service (SaaS), platform-as-a-service (PaaS), utility computing, managed service providers, Web services, and cloud integrators. Taken together, that is what cloud computing is. It is not a new technology; it is merely a new concept that integrates many virtualisation and pay-as-you-go models that already existed.

Software-as-a-service enables a user to rent and run software applications from service providers who maintain and manage the application on their servers. Remember Salesforce.com. The user simply has to rent an app, use it over the Web, and pay as per usage.

Utility computing, similarly, enables users to rent infrastructure, such as storage or servers, and use it over the Web. Companies use such services to cater to temporary surges in requirements. Remember IBM, Sun and Amazon.com.

Web services enable developers to connect or fit functional blocks or application programming interfaces (APIs) offered over the Web into their own applications, so as to not reinvent the wheel. Remember Google Maps and Xignite.

On-demand platforms enable users to string together whatever applications they need from a service provider into a seamless solution, and use it over the Internet. This is somewhat like an extension of SaaS, but here the user picks and uses a group of applications, rather than just one. The choice of applications is, however, limited by whatever the service provider has on offer. Remember Force.com, Coghead, and Google AppEngine.

Managed services is also a similar concept; it’s just that the user is offered a service — such as network security or backup — over the Web, instead of just an application. Remember IBM, Symantec and Verizon.

Put all this together, and you have cloud computing, which is a very broad term that covers a range of resources and services offered over the Internet. Some experts opine that any resource that a company uses over the Internet, outside its firewall, is “on the cloud”. So, a company can choose, customise or develop an enterprise-wide solution, manage and maintain it, scale it up or down, or do whatever it wants, completely over the Internet — oh, sorry, we should now be saying “on the cloud”, but it means the same thing anyway. The cloud is nothing but a metaphor for the Internet, if that helps ease the confusion in any way!

…the cat and the fiddle

The biggest advantage of cloud computing is that the end-user or developer does not have to bother about the physical location or configuration of the actual resources. The service providers will worry about all that, abstracting all the dirty details from the developers, who can work completely on a logical plane.

These systems can be easily scaled up or scaled down. You can start with a small server and a little storage during development, and then scale up the volume or features of the system on-the-go, as the usage grows. Or, a company can use cloud resources simply to cater to temporary needs. They could even opt to use a combination of their own and cloud resources. In short, they can use what they want, and pay just for what they use.
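The "pay just for what you use" argument can be made concrete with a little arithmetic. The sketch below compares renting servers by the hour against owning enough servers to cover the busiest day. Every figure in it (the hourly rate, the monthly cost of an owned server, the demand pattern) is a hypothetical number chosen only for illustration, not a real price.

```python
# All figures are hypothetical, chosen only to illustrate the arithmetic.
HOURLY_RATE = 0.10            # assumed price per server-hour on a cloud
OWNED_SERVER_MONTHLY = 60.0   # assumed monthly cost of owning one server
HOURS_PER_DAY = 24


def pay_per_use_cost(servers_needed_per_day):
    """Cost when exactly the servers needed are rented each day."""
    return sum(n * HOURS_PER_DAY * HOURLY_RATE for n in servers_needed_per_day)


def owned_cost(servers_needed_per_day):
    """Cost when enough servers are owned to cover the busiest day."""
    return max(servers_needed_per_day) * OWNED_SERVER_MONTHLY


# A 30-day month: 2 servers on normal days, 10 during a 3-day surge.
demand = [2] * 27 + [10] * 3
cloud = pay_per_use_cost(demand)
owned = owned_cost(demand)
print(f"pay-per-use: {cloud:.2f}, owned for peak: {owned:.2f}")
```

With these assumed numbers, the rented option is cheaper because the surge lasts only three days. With a steadier demand pattern, the comparison could go the other way.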

Almost all literature on this subject compares the cloud to the electricity grid. You consume electricity without bothering about where it is generated, how far away or along which route it travelled to get to you. Nor are you bothered by details such as how many others are using power from that grid. Similarly, you can use software, services and infrastructure from the service providers using a Web interface, without worrying about any back-end details. Examples of cloud-computing platforms include Amazon’s Elastic Compute Cloud (EC2), Salesforce’s Force.com and Microsoft’s Azure.

The abstraction provided by cloud computing is so beneficial, from a user’s perspective, that some large organisations are cloaking even their existing infrastructure as “private clouds”. What this means is that they consolidate all their resources and deliver them to various user units within the company, just as cloud service providers would do to the public. So, the users within the company can use the resources they need without worrying about where they are, or how they are managed. To the company, it offers a two-fold benefit: user convenience, as well as better resource utilisation. No resource gets locked up. It is all on the private cloud, can be managed centrally, and provisioned as needed within the company. Plus, the life of resources also gets extended. An old, slow and steady computer can still be used as long as it can connect to the network, since it can run all the needed applications off the cloud!

“While the benefits of cloud computing are compelling, as with any product or technology, adoption takes its own course. Indian organisations, like many of their counterparts across the world, are in evaluation mode. They have largely moved away from the understanding mode, but adoption is not yet high. I think India presents an immense opportunity to realise the cloud possibilities. In developed countries, we see organisations looking at cloud adoption in stages: first, adopting infrastructure virtualisation; then, setting up a private cloud; this is followed by selective use of the public cloud; before moving a large part of the application portfolio to the cloud (private or public). For Indian organisations, there is a big opportunity for them to leapfrog to the cloud faster, given their under-investment in information technology (IT). So they don’t have to worry about managing legacy investments, creating an expensive migration plan, and then adopting the cloud,” says Dhiraj Sinha, leader of the Applications Technology group, Dell Services. Many of Dell’s solutions have used open source — for example, Dell is working with Canonical to help customers adopt Ubuntu-powered open source clouds, and the high-performance analytics services of Dell use Hadoop, the open source distributed computing and data-storage framework.

‘Open source has built the cloud’

Sharing, the freedom to mix and match, choice, and many other characteristics of cloud computing demonstrate a likeness to the principles of open source software. More than just the similarities, there is also a sheer necessity for open source software and open standards in clouds that comprise heterogeneous, and often proprietary, infrastructure. The availability of source code, the freedom to modify and redistribute, the flexibility and constant evolution, and other open philosophies greatly favour the cloud scheme of things. A cloud is, after all, a fluffy and lovable structure. Would we not hate a boxed cloud?

Undoubtedly, open source lies at the foundation of many of the earliest cloud implementations. “Open source has built the cloud. When we think of the services we consume on the cloud, from Facebook and Google to Amazon, none would have been affordable or scalable using a traditional licensing model,” says Prakash Advani, partner manager — Central Asia, Ubuntu. “Moving that capability from the leading edge of SaaS provision and into the mainstream enterprise is the next big opportunity for open source.”

The fact that most of the public clouds, including Amazon’s Elastic Compute Cloud (EC2), run on Linux-based platforms is just one facet of the story. In fact, that is now taken so much for granted that the focus has shifted to tools and platforms that enable the building of private or hybrid clouds, the integration of legacy infrastructure with the cloud, and so on.

A rather large number of such open source tools are now available, right from platforms and development tools, to management dashboards and automated migration tools for applications. Existing open source platforms are also adapting fast to the needs of cloud computing, and now include features such as intelligent workload management and cloud-aware scaling, to support massive horizontal scalability at all layers of the technology stack. Ubuntu, Red Hat, and almost every other open source platform now have a stable cloud offering.

George Paul, executive vice president, HCL Infosystems, quickly justifies our point with some examples: “An open source software-infrastructure project called Eucalyptus imitates the experience of using Amazon’s EC2, but allows users to run programs on their own resources. The University of Chicago’s Nimbus is another open source cloud-computing project that is widely recognised as having pioneered the field. Today, customers have a large choice of open source applications for the cloud, including Red Hat, Traffic Server, Puppet, Zoho, Cloudera, Enomaly and Joyent.”

He goes on to explain that, apart from new tools, existing open source offerings are being made cloud-enabled through standardisation. “A common standard called the Application Packaging Standard (APS), an open standard with all specifications, has been introduced; it helps in making applications multi-tenant. It consists of over 250 applications, and will continue to grow in the future as well,” he says. APS helps to standardise packaging, automate provisioning and management, and integrate applications with other hosted services.

HCL’s O’zone, a cloud-enabled services suite, combines a variety of open source solutions, including the Proxmox open source virtualisation platform, Red Hat Enterprise Virtualisation, and open source enterprise resource planning, customer relationship management and content management systems.

Open source for the cloud

Virtualisation of infrastructure is at the heart of cloud computing. Open source options, such as the Kernel-based Virtual Machine (KVM) for Linux and the Xen hypervisor, are very competent. Other open source, Linux-native tools like Hadoop, Cassandra, HipHop, CouchDB and Btrfs also assist one in building a first-class data centre, very cost-effectively. These could be seen as the starting point for the coming wave of enterprise-scale open source adoption on the cloud.

Software platforms such as Eucalyptus, which enable the implementation of private and hybrid clouds, are also becoming very popular. Eucalyptus is a modular platform that is capable of working with a variety of interfaces, including Amazon’s EC2 and Simple Storage Service (S3) services. Eucalyptus works with various distros, including Red Hat Enterprise Linux (RHEL), CentOS, SUSE Linux Enterprise Server (SLES), OpenSUSE, Debian and Fedora. It can also host MS Windows images. It is capable of working with many virtualisation technologies such as VMware, Xen and KVM hypervisors, in order to implement the abstraction demanded by a cloud environment.

Solutions such as Cloudera build on the capabilities of popular open source options like Hadoop, to meet an enterprise’s cloud-computing needs. The Cloudera data management platform incorporates the Hadoop Distributed File System (HDFS), Hadoop MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Zookeeper and Hue, and is available free under an Apache licence. The enterprise package includes support, tools and training.

Joyent’s SmartPlatform is an open source, server-side JavaScript-based framework for developing and delivering real-time, asynchronous Web applications to the cloud. It is basically a platform-as-a-service. While hosting is free at the moment, it might become a paid service once SmartPlatform graduates from the beta to a stable release. People are betting big on Joyent’s offering, because of the ubiquity of JavaScript.

Enomaly’s open source cloud-management and provisioning software, the Elastic Computing Platform or ECP, allows an enterprise to create a private cloud inside its own data centres. It can also link the private cloud to a public one, as and when the company suddenly needs more computing resources. The tool-kit can also be used by service providers and telecom firms to quickly set up and deliver infrastructure-as-a-service (IaaS) cloud-computing services to customers. ECP includes security and compliance features, and Enomaly also offers service and support to license holders, in a model very similar to that followed by companies like Red Hat. RightScale, Elastra and 3Tera are other similar offerings.
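The idea of linking a private cloud to a public one when demand spikes (often called "cloud bursting") can be sketched in a few lines. The following is only an illustration of the placement decision, not Enomaly's software or API. The function name, the workload names and the capacity units are all hypothetical.

```python
def place_workloads(workloads, private_capacity):
    """Assign each (name, units) workload to the private cloud while
    capacity remains, and send whatever does not fit to a public cloud."""
    placement = {}
    free = private_capacity
    for name, units in workloads:
        if units <= free:
            placement[name] = "private"
            free -= units
        else:
            # Not enough room in-house: burst this workload out.
            placement[name] = "public"
    return placement


# Hypothetical workloads and capacity, in arbitrary "units" of compute.
jobs = [("billing", 4), ("reports", 3), ("campaign-surge", 6), ("mail", 1)]
plan = place_workloads(jobs, private_capacity=8)
for name, target in plan.items():
    print(f"{name}: {target}")
```

A real provisioning tool would also weigh cost, data location and security policy before moving a workload out. This sketch considers capacity alone.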

Nimbus is another notable tool-kit, maintained by the University of Chicago, which enables you to swiftly convert your cluster into an IaaS cloud. Currently, Nimbus is deployed into a Globus 4.0.x Java container, a system built around the Apache Axis engine. It supports three sets of remote interfaces: the Amazon EC2 interface defined in Web Services Description Language (WSDL), the Amazon EC2 Query API, and the grid community Web Services Resource Framework (WSRF). It also manages the security for these interfaces. The storage implementation is compatible with the Amazon S3 REST API, and virtualisation is based on Xen and KVM. Nimbus is known for being highly configurable and extensible.

George Paul brings to our notice another interesting cloud-computing project, Reservoir, funded by the European Union and coordinated by IBM. Last September, the group released its cloud stack featuring the Claudia Service Manager, a tool for automatic management of service scalability, and the OpenNebula Cloud Toolkit. OpenNebula is an open source cloud-computing tool-kit capable of managing several thousand virtual machines, along with large storage and networks. It supports all common cloud interfaces, and can fit into any existing data centre to help build a private, public or hybrid cloud.

Some of the other notable open source tools that are very useful in a cloud environment include Apache’s Traffic Server (a fast, scalable and extensible HTTP/1.1 compliant caching-proxy server), and Puppet (a configuration management tool written in Ruby and released under the General Public License or GPL).

Interestingly, Microsoft has also been highlighting open source strengths amongst the benefits of Azure, its cloud-computing platform. The Windows Azure software development kit for PHP and the Windows Azure Tools for Eclipse make it easy for programmers to deploy their PHP applications to the Azure cloud, and command-line developers can use their scripting skills to deploy existing PHP applications. Then there is the Windows Azure Companion, which is meant to ease the task of deploying open source community applications such as WordPress, SugarCRM and Drupal onto Windows Azure. Microsoft has also been working with many open source developers to put more and more FOSS programs on Azure.

Licensing issues in a FOSS+cloud world

In March 2007, Linux Magazine published an article that revealed the loopholes associated with delivering free software over a network. The article, “The GPL Has No (Networked) Future”, pointed out that the requirements in free software licenses to release source code are all triggered by the act of distribution, and that Web applications, which are not actually “distributed”, are therefore not bound by these requirements. Other experts continued to point out this loophole.

In November 2007, the Free Software Foundation (FSF) released version 3 of the Affero General Public License (AGPLv3), a variant of GPLv3 intended to close the legal loophole associated with free software designed to be run over a network. The relevant clause reads: “…if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software.”
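In practice, a Web application can make such an offer by putting a link to its own source on the pages it serves. The sketch below shows the mechanics with a tiny WSGI application built on Python's standard library. The `SOURCE_URL` value and the helper `render` are hypothetical, and the example illustrates the idea only; it is not a statement of what the licence requires in any particular case.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical location where the running version's source is published.
SOURCE_URL = "https://example.org/myapp/source.tar.gz"


def app(environ, start_response):
    """Tiny WSGI app: every page it serves carries a link to its source."""
    body = (
        "<p>Hello from a network service.</p>"
        f'<p><a href="{SOURCE_URL}">Download the source of this version</a></p>'
    ).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def render(path="/"):
    """Call the app in-process (no network) and return (status, html)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks).decode("utf-8")


status, html = render()
print(status)
print(html)
```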

Despite the rising number of cloud offerings with licenses like AGPLv3, Apache, BSD and Creative Commons, apart from the lesser-known but equally good ones like the MIT Open Source License, there continues to be a lot of confusion concerning the implementation of FOSS over a cloud. We spoke to free software evangelist and consultant Niyam Bhushan about this…

From the point of view of free/open source software, is cloud computing good or bad?

Cloud computing is amoral. The licenses and privacy policies you choose to implement decide whether you are creating a cloud-based golden cage, or a door that opens to the heavens of freedom.

Are there any real issues with the implementation or deployment of free/open-source software over a cloud? If so, is the problem solely faced by public clouds or also by private ones?

The problems of FOSS implemented over a cloud are manifold, if you love and honour digital freedom.

  1. With desktops, the problem was relatively less acute, as your data was still on your local hard disk. With clouds, both the software and your data are locked away in the cloud. At any time, the powers that be can block or delete your data, or deny you access, by accident or design. A Flickr user was horrified to discover, one fine morning, that his entire archive of photos was deleted by Flickr due to an error on Flickr’s part.
  2. With Web-based software engines, you do not know whether the software obeys you, or another master. There could be hidden features, leaks, insidious code that could damage your business, work, or even your reputation. Every Web cloud and engine starts on the premise that the end-user should trust it. This trust could well be a trap.
  3. The issues of privacy, in the name of collaboration, and the use and abuse of your data to build a profile of you are indeed alarming. There is no control over how that data could be used.
  4. If data and software are Web-hosted and served globally, the question of which legal jurisdiction applies is an old debate that may find new wrinkles.
  5. Data licensing can also be another issue, depending on the type and purpose of the data. For that, a user may wish to have choices from among several Creative Commons licenses.

So, if some people wish to use a free/open-source license for their software, and also wish to deliver it over a cloud, would they face any problems?

None whatsoever, if you use Affero GPL, also known as AGPLv3.

However, if someone wishes to use FOSS but not use AGPLv3, that too is possible. That is the freedom guaranteed by other copyleft licenses, including GPLv2 and GPLv3. However, the spirit of freedom is not truly honoured for the end-user. It is only applicable to the individual or organisation deploying free software enabled as a service. This can be quite tragic for those who love digital freedom.

What are the best free/open-source licenses for the FOSS+cloud combo?

AGPLv3.

The focus on open standards

It is clear that there is plenty of open source software for cloud computing; you just need to make sure you pick the right tools. Remember, it is not just about open source software; you need to ensure that your choice is based on open standards too. That is precisely what the OpenStack project and consortium are all about.

OpenStack is a collection of open source technologies that deliver a highly-scalable cloud operating system, based completely on open standards. OpenStack has two interrelated projects: OpenStack Compute and OpenStack Object Storage. Compute is for provisioning and managing large groups of virtual private servers, while Object Storage is for creating redundant, scalable object storage using clusters of commodity servers. Object Storage can handle even petabytes of data! OpenStack integrates code from NASA’s Nebula platform as well as Rackspace’s Cloud Files platform, and is released under the Apache 2.0 licence. OpenStack’s forte is large-scale computation, of the kind needed for DNA modelling, space research, and the like.
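The phrase "redundant, scalable object storage using clusters of commodity servers" boils down to keeping several copies of each object on different machines, and being able to work out where those copies live. The sketch below shows one simple way to do that by hashing the object's name. It is not OpenStack's actual placement algorithm. The function, the node names and the choice of three replicas are assumptions made for illustration.

```python
import hashlib


def replica_nodes(object_name, nodes, replicas=3):
    """Pick `replicas` distinct nodes for an object, derived
    deterministically from a hash of the object's name."""
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replicas")
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    # Walk the node list from the hashed starting point, wrapping around.
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


# Hypothetical cluster of five commodity servers.
servers = ["node-a", "node-b", "node-c", "node-d", "node-e"]
copies = replica_nodes("photos/cat.jpg", servers)
print(copies)
```

Because placement depends only on the name and the node list, any machine in the cluster can compute where an object's copies are without consulting a central index. A production system also has to handle nodes joining and leaving, which this sketch ignores.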

More than anything else, OpenStack.org is a great example of how industry players are converging to set open standards in place to build clouds. OpenStack is backed by big names such as Rackspace, NASA, Dell, Citrix, Cisco and Canonical, not to forget the large global open source community. The team feels that an open development model is the only way to foster badly-needed cloud standards, remove the fear of proprietary lock-in for cloud customers, and create a large ecosystem that spans cloud providers.

“It is important for customers to have choices and not be tied down to a proprietary platform. Open standards for the cloud will play a key role in enhancing cloud adoption. When I say open standards, I don’t necessarily imply open source, but the fact that customers will have choices and the ability to stitch together solutions that meet their needs. Open standards and specifications would ensure that different products and tools can co-exist to deliver the cloud. Open source will be a key enabler to providing that choice. Such initiatives are already under way — one of them being the OpenStack consortium, in which Dell is a key participant. We believe such initiatives, and the existing open source ecosystem, will evolve cloud computing to the next stage,” says Sinha.

Open source, the best for India

There are loads of open source solutions for cloud computing out there: new ones, old ones, and modified or extended ones. Cloud computing being a nascent space, there is still a lot more for the community to do. Sinha says, “Cloud computing throws up new problems to tackle, and new possibilities to address through the open source community. While some of the existing OSS contributions would be a great natural extension to the cloud world, there are several areas in the cloud-computing puzzle that require newer open source offerings.”

Of course, open source software and the community will continue to do their bit, and more, for cloud computing. “The ongoing impact of open source, the whole concept of SaaS meeting the cloud, is going to be a major trend in India. The open source approach in cloud computing will definitely grow in the future. Different organisations with different requirements can customise the applications according to their own use. New and advanced components, which are more flexible, transparent and cost-effective, can be integrated with the applications at any point of time, enhancing the capabilities of the system,” says Paul.

Advani sums up: “The availability and cost benefit of open source allows India to build cloud solutions for India. Service providers can adopt Ubuntu Enterprise Cloud, for instance, and provide a local, world-class scalable cloud provision for local businesses. The risk of a cloud in the proprietary world is reliance on a single vendor, external to the country, with service-level agreements that are liable to change. Widespread adoption of an open cloud solution is a much better route to avoid this risk. Of course, the open source cloud has all the advantages: it’s cost-effective, there’s no vendor lock-in, and you have access to the code, so you are not worried about getting stuck with the wrong technology!” That, after all, is very important for nascent technologies like the cloud.

Integration: An emerging service

So now, the cloud is raining resources. All we have to do is catch whatever we need in a tub, and start using it (and of course, pay for it too). But who does that? Cloud integration is no mean task, as it involves integrating not just many cloud-based resources, but also integrating legacy non-cloud resources with the cloud-based ones!

Cloud-computing pioneers such as Amazon EC2 and Force.com have their own application platforms, in addition to the computing resources, so developers can easily build their applications on the platform and get going in no time. But that is just one of the simpler options. You can also do a complete custom build, by selecting disparate resources and putting these together, not to forget integrating the company’s pre-cloud resources.

There are many integration tools available, including software-based, appliance-based, and cloud-delivered ones. Developers can choose from amongst these integration tools to build their solutions. Cloud integration is emerging as a specialised genre of development and service. It is also a popular start-up idea!

All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.