Archive for the ‘Cloud Computing’ Category

Getting started with the Cloud

January 25, 2010 Indraneel

There has been so much hoo-ha about the cloud in recent days that it’s really difficult to filter out the noise. I often get this question from people: “So how do we start using the cloud?” With the slew of cloud vendors and underlying technologies available, it’s actually pretty easy to start using the cloud in our day-to-day work. In this post I will talk only about the IaaS cloud, and Amazon’s cloud components in particular, but you can choose any vendor you like.

Rapid provisioning and disposal of engineering environments

 
One of the biggest advantages of the cloud is that you can fire up an instance in the blink of an eye. OK, maybe not in the blink of an eye, but if you are using Amazon EC2, it takes about 3-5 minutes to fire up an instance. That is probably umpteen times faster than your IT department provisioning a server for you. This gives us the opportunity to rapidly provision engineering environments such as testing and staging at will. It also saves a great deal of money if you throw away the instance when you don’t need it. Say you are towards the end of the first sprint of the first release of the product. The product owner needs to see a demo of the working product at the end of the sprint, and would also like to play with the product a little. But you don’t want to hand over your testing environment, because the product owner and the testers would step on each other’s toes. So what do you do? You fire up another instance of the AMI that you saved with all the configurations, prerequisite software and libraries needed to run the product, and deploy the finished product to that instance right before the day of the demo. How much time would it take? Not more than 3-5 minutes for launching the instance, and maybe 10 minutes for deploying the product if you have set up automated deployment.
When the product owner is done playing around, you can just bring down the instance with the click of a button. It is really that simple.
Cloud - Rapid provisioning of engineering environments
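
As a rough illustration of the launch-and-dispose cycle described above, here is a minimal Python sketch. It assumes `ec2` is a client object that exposes `run_instances` and `terminate_instances` in the style of the boto3 EC2 client (for example, one created with `boto3.client("ec2")`). The function names, the AMI ID you pass in and the default instance type are placeholders for illustration, not values from this post.

```python
def launch_demo_instance(ec2, ami_id, instance_type="m1.large"):
    """Start exactly one instance from a saved AMI and return its instance ID.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    response = ec2.run_instances(
        ImageId=ami_id,              # the AMI saved with the product's prerequisites
        InstanceType=instance_type,  # placeholder size; pick what the demo needs
        MinCount=1,
        MaxCount=1,
    )
    return response["Instances"][0]["InstanceId"]


def dispose_instance(ec2, instance_id):
    """Terminate the instance once the demo is over, so you stop paying for it."""
    ec2.terminate_instances(InstanceIds=[instance_id])
```

You would call `launch_demo_instance` shortly before the demo, deploy the build to the new instance, and call `dispose_instance` once the product owner is done.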

Deployment of several versions of products

 
In order to keep development work on groups of features independent of each other for one of the products that we were building, we created branches on which groups of developers would work. This is an excellent strategy if you have short development cycles, because if there is a group of features that needs to go into the product together, you can keep it on a (relatively) long-running branch. However, these branches have to be tested independently. Also, just before the release, you will have to do away with these branches, deploy the trunk after the merge has happened, and test that too. So you will need different versions of the product, residing in separate branches, deployed at the same time. That’s where the cloud comes in handy, because you can provision and dispose of environments in a jiffy.
Cloud - Automated Deployments

Saving application state

 
Have you ever found a huge bug in a product, only for the developers to fail to replicate it and mark the bug as invalid? I bet you have. A lot of the time, even saving the bug as a video, as mentioned in one of my posts, does not help unearth the real problem. That’s because, most of the time, these issues are embedded in the environment or the data. How good would it be if you could freeze the entire state of the application, session, data, environment and all, bundle it up and give it to the development team? Some developers will hate you for that, but I’m sure they’ll thank you later. One of the biggest advantages the cloud gives us is the ability to bundle up the entire environment. If you are using Amazon EC2, you can bundle up the instance, with its data, its environment and any session state persisted to disk, into an AMI and go on your way testing the application again.
Cloud - Saving the application state
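
To make the bundling step a little more concrete, here is a minimal Python sketch. It assumes `ec2` behaves like a boto3 EC2 client, whose `create_image` call creates an AMI from an existing instance. The function name and the naming scheme based on a bug ID are hypothetical. Keep in mind that an image captures what is on the instance’s disks, not what is in memory.

```python
def save_environment_as_ami(ec2, instance_id, bug_id):
    """Capture the instance that shows the bug as an AMI and return the image ID.

    `ec2` is assumed to behave like a boto3 EC2 client; `bug_id` is a
    hypothetical tracker ID used only to build a recognizable image name.
    """
    response = ec2.create_image(
        InstanceId=instance_id,
        Name=f"bug-{bug_id}-repro",
        Description=f"Environment captured for bug {bug_id}",
        NoReboot=False,  # let EC2 reboot the instance for a consistent file system
    )
    return response["ImageId"]
```

The development team can then launch their own instance from the returned image ID and investigate the bug in the same environment and with the same data.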

Cloudification of a part of the application services

 
If you are in an organization where you cannot deploy anything outside the organization’s firewalls due to security policy, or if the product itself is such that using public infrastructure like the Amazon cloud is not a viable option (one example could be IP protection), you can still use the cloud. Let me take the example of an application that indexes huge documents or maybe media, among a lot of other things. Now, indexing media is very computationally intensive. And what is the cloud good for if it can’t take up the burden of computationally intensive stuff? You can keep the entire application inside the firewall and host a web service on the cloud that listens for indexing requests, fires up an instance as soon as it gets one, and processes the request. Better yet, you can enable Amazon CloudWatch and set thresholds so that more instances are fired up when one instance is thrashing. Simple but powerful.

 
Cloudify a part of application services
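
The threshold idea can be sketched in a few lines of Python. This is only a simplified model of the decision, not how Amazon CloudWatch is configured: the function name, the 80% CPU threshold and the cap on instances are assumptions made for illustration.

```python
def instances_to_add(cpu_samples, threshold=80.0, running=1, max_instances=10):
    """Decide how many extra indexing instances to start.

    cpu_samples   -- recent CPU utilization readings, in percent
    threshold     -- average utilization above which we scale out (assumed value)
    running       -- number of instances currently running
    max_instances -- hard cap so a runaway load cannot run up the bill
    """
    if not cpu_samples:
        return 0  # no data, no decision
    average = sum(cpu_samples) / len(cpu_samples)
    if average < threshold or running >= max_instances:
        return 0
    return 1  # scale out one instance at a time


if __name__ == "__main__":
    print(instances_to_add([95.0, 90.0, 88.0]))  # busy instance
    print(instances_to_add([20.0, 30.0]))        # idle instance
```

Adding one instance at a time keeps the sketch simple; a real setup would also need a rule for scaling back in when the load drops.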

As you can see, it’s really easy to start taking advantage of public cloud components. So go ahead and get started today.

The PC Quest article

January 16, 2009 Indraneel

Some of my comments appeared in the January 2009 edition of PC Quest. The online version is available here.

Categories: Cloud Computing

Cloud Computing – Large scale computing for everyone

November 2, 2008 Indraneel

At the beginning of 2008, the New York Times ingested 405,000 very large TIFF images, 3.3 million articles in SGML and 405,000 XML files mapping articles to rectangular regions in the TIFFs, using Amazon Web Services, Hadoop and some custom code. This data was converted to a more web-friendly 810,000 PNG images and 405,000 JavaScript files containing JSON in less than 36 hours.

Why is this a big deal?

NASA has been computing on a far greater scale than this for a long time. NASA’s weather simulation software is computationally a lot more complex than indexing documents.
So why is this a big deal? It’s a big deal because it was neither NASA nor CERN, not even Google. It was a business that did this without buying a single machine. They rented computing power on the fly. They rented slices of a cloud.

What is cloud computing?

A few days ago a journalist said, “There is clear consensus that there is no consensus on what cloud computing is.” I like to think of cloud computing as the commercialization of computing resources such as CPU cycles, storage and memory, much like public utilities such as electricity, water or natural gas. At the very core of the cloud is virtualization, a technique in which software is used to completely simulate or emulate hardware.

Types of clouds

I see two distinct categories of clouds that vendors are selling today:

  • Infrastructure as a Service – IaaS vendors sell raw compute power: CPU cycles, memory, bandwidth etc. IaaS clouds are complex, but with the complexity comes flexibility. Most cloud vendors allow root access to an instance, so specialized knowledge is necessary to handle such flexibility.
  • Platform as a Service – PaaS refers to those clouds which provide frameworks and infrastructure on which users can build applications. PaaS clouds are built on IaaS clouds. Most PaaS clouds are very restrictive. They generally allow users to build applications on a particular set or sets of technologies. For example, Google App Engine allows users to build applications using Python only. Portability is an inherent issue with PaaS clouds because of the lack of standards in this domain. So if you have built an application using Google App Engine and BigTable, you probably won’t be able to port the data to any other cloud without spending a huge amount of time and money.

Inside the cloud

At a very high level clouds are made up of the following layers:

Inside the cloud

  1. At the very bottom is the hardware layer. Many cloud vendors build their clouds out of off-the-shelf server-class hardware. For example, Joyent uses Dell servers with quad-core Intel processors for their cloud. Plumbing refers to the networking elements in the cloud: the fast routers, switches and load balancers connected by fiber-optic cabling. Clusters of ordinary server-class machines make up the skeleton of the cloud.
  2. Storage services refer to the storage provided by the cloud. Most cloud vendors offer SAN or NAS storage. Provisioning is generally on the fly, and users can ask for a virtually unlimited amount of storage.
  3. As mentioned earlier, virtualization is at the very core of the cloud. Virtualization has made creating a software machine as a clone of an existing one super fast. Think of the cluster (mentioned earlier) as one mega machine with one host OS managing all its resources. Creating virtual machines with pre-defined CPU and memory is fast and easy. Many vendors, like Amazon Web Services, use Xen virtualization.
  4. Platform services are a bunch of pre-installed and packaged goodies that a user of the cloud gets whenever an instance of the cloud is brought up. The LAMP stack supported by AWS and Joyent is an example of platform services.
  5. No matter what the vendor says, if it takes more than 10-15 minutes to bring up an instance, then it is not a cloud. The web services layer is the one that enables users to templatize an instance, bring up a new instance from a template, take backups, restore from a backup etc. instantly, as and when needed.

Just in time deployment using the cloud

Deployment of products is a messy business. Not so long ago, fledgling organizations had to first calculate the amount of computing resources needed for a launch, translate that into hardware requirements, call up the hardware vendors or the hosting company, wait till they provisioned the hardware, and then install and configure the software. Provisioning, installation and configuration took several days. It was a lose-lose scenario for everyone. Product success meant another cycle of calls and provisioning, while the users suffered from unresponsive software caused by heavy load. Product failure meant huge losses due to unused hardware.
Not any more, with the advent of the cloud. Now product launches can happen at the click of a button, with just enough computing resources sitting behind a virtual IP. The utilization of the resources is closely monitored. New, templatized instances of the cloud are instantiated whenever the threshold for the monitored utilization is reached. The users of the cloud pay only for what they use at any instant of time. The users of the product never find it unresponsive, since computing resources are always adequate, just in time to meet the users’ needs.

Large scale computing for all

Until now, the ability to do large scale computing was within the reach of an elite club of businesses. Google, Amazon and Yahoo were among the very few in the club. Few businesses had the means to lay their hands on infrastructure of that scale. Cloud computing has changed that. Today, a ‘large’ Amazon EC2 instance with 4 EC2 compute units (roughly equivalent to the capacity of four 1.0-1.2 GHz 2007 Opteron or Xeon processors) and 7.5 GB of memory costs as little as $288 per month. Users can choose from quite a few operating systems and scale up and down on the fly. Application development platforms like JBoss Enterprise Application Platform and Ruby on Rails come built into it. Clouds have opened the doors of large scale computing to virtually everyone.
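
To see what pay-as-you-go means for that figure, here is a small back-of-the-envelope calculation in Python. It assumes a 30-day month and flat per-hour billing, so that $288 per month works out to $0.40 per instance-hour. The constant and function names are made up for this sketch.

```python
HOURS_PER_MONTH = 24 * 30          # assumption: a 30-day month, running non-stop
MONTHLY_PRICE_USD = 288.0          # the monthly figure quoted above
HOURLY_RATE_USD = MONTHLY_PRICE_USD / HOURS_PER_MONTH  # works out to 0.40


def usage_cost(hours, rate=HOURLY_RATE_USD):
    """Cost in USD of running one instance for `hours`, assuming flat hourly billing."""
    return round(hours * rate, 2)


if __name__ == "__main__":
    print(usage_cost(HOURS_PER_MONTH))  # a full month of continuous use
    print(usage_cost(8))                # a single 8-hour working day
```

Under these assumptions, an instance that is brought up only for an 8-hour demo costs a few dollars rather than the full monthly amount, which is the whole point of disposing of instances you no longer need.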