Sunday, May 6, 2012

Helix Nebula: It's a Science Cloud!

As the title suggests, this post covers how cloud computing applies to basic scientific research and development. We are talking about connecting the dots between a technology that offers virtually unlimited computing power and a science community that generates, analyzes and integrates an enormous amount of data. It's a natural fit! But cloud computing also has other benefits that are extremely appealing to researchers in various fields of science. Let's start with Helix Nebula, the latest initiative to build a European cloud for scientific research, and then move on to some more specific science clouds, such as clouds for materials science and for life sciences.

In this blog, you have seen several characteristics of cloud computing. The most attractive one for various industries is on-demand self-service (http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf), which essentially enables businesses to reduce their IT infrastructure cost. But for the scientific community, the resource pooling and rapid elasticity (http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf) aspects of cloud computing are the biggest enablers. Together, these two aspects provide computing power that scales with demand and an efficient, transparent collaboration platform. Watch the following video, which explains why such aspects are attractive to researchers and scientists:


The European Organization for Nuclear Research, famously known as CERN, is the home of the Large Hadron Collider, which powers high-energy physics research. On March 12, 2012, CERN announced (http://press.web.cern.ch/press/pressreleases/releases2012/PR03.12E.html) that it will collaborate with the European Molecular Biology Laboratory (EMBL), the European Space Agency (ESA) and a consortium of leading IT providers to launch "Helix Nebula - the Science Cloud", which will host the massive IT requirements of the European science community and will later be available to industry and various government institutions. Keep in mind that CERN's grid computing facility is already considered capable of handling extremely high volumes of data. Here is a video that explains CERN's current computing capabilities:


So why do they need Helix Nebula? Let's look at some numbers. CERN produces 6 GB of data per second, keeps 150,000 CPUs continuously busy and stores 15 petabytes of data per year (http://gigaom.com/cloud/super-science-cloud-coming-to-europe/). The scientific community is extremely data intensive and has an insatiable hunger for computing power and storage. This enormous volume of data needs to be analyzed in different ways by different data mining processes to extract valuable and logical information. EMBL will use Helix Nebula to simplify the analysis of large genomes, and the European Space Agency will use it to study earthquakes and volcanoes. Companies such as Atos, Capgemini, CloudSigma, Logica and SAP will be the commercial partners in the Helix Nebula project. The official report on the strategic planning process is available here: (http://cdsweb.cern.ch/record/1374172/files/CERN-OPEN-2011-036.pdf)
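To get a feel for how the two data figures above relate, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are mine, not CERN's: decimal units (1 PB = 1,000,000 GB) and data production at 6 GB per second around the clock for a full 365-day year, which certainly overstates real operation. The function and variable names are made up for illustration.

```python
# Back-of-the-envelope comparison of the quoted data rate and yearly storage.
# Assumptions (not from the sources): decimal units, and the detector
# producing data nonstop for a full 365-day year.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000 seconds
GB_PER_PB = 1_000_000                   # decimal petabyte


def yearly_volume_pb(rate_gb_per_s, seconds=SECONDS_PER_YEAR):
    """Convert a sustained rate in GB/s into a yearly volume in PB."""
    return rate_gb_per_s * seconds / GB_PER_PB


produced_pb = yearly_volume_pb(6)       # quoted production rate: 6 GB/s
stored_pb = 15                          # quoted storage: 15 PB per year
retained_fraction = stored_pb / produced_pb

print(f"Upper bound on data produced per year: {produced_pb:.1f} PB")
print(f"Share that would be kept at 15 PB/year: {retained_fraction:.1%}")
```

Under these assumptions, the quoted rate adds up to far more data than is stored each year, so only a small share can be kept. That is one way to see why both the filtering and the analysis of what remains demand so much computing power.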

Researchers at the University of Washington (http://arxiv.org/ftp/arxiv/papers/1110/1110.0543.pdf) are developing a Science Cloud Computing (SCC) platform that will provide simulation capabilities specific to materials science, condensed matter physics and quantum mechanics. This platform can be used on Amazon EC2 and consists of a virtual machine with a UNIX operating system, materials science codes and relevant interfacing tools. This cloud will thus be able to run simulations and solve the structure of novel materials, just like the local computer clusters in use today. But the cloud will have more computing power, will reduce the time needed to solve a structure and will make it easier for scientists to collaborate. Here is a sample simulation picture that provides structural details of a superconductor (http://arxiv.org/ftp/arxiv/papers/1110/1110.0543.pdf).


Similar efforts can be observed in life sciences projects, which are also extremely data intensive. To support work such as sequencing a specific human genome, and to facilitate a virtualized and networked R&D environment, Oracle will be offering a health sciences cloud in the near future. This cloud will be an automated continuum of practices and processes (http://www.oracle.com/us/industries/life-sciences/oracle-health-sciences-cloud-wp-367168.pdf) and would look like the following:

Oracle recently released the following video as well:


IBM's life sciences cloud aims to facilitate bacteria research:


It is evident that cloud computing is extremely relevant for the scientific community. It has the scalability to accommodate a virtually unlimited volume of data, and it can provide faster and more efficient data mining and analysis, which brings down the time needed to finish an experiment. It also facilitates collaboration on a standardized platform that researchers from all over the world can access and use in real time. The possibility that a cloud platform will enable various fields of science to work together on a problem in a virtual world, where distance is no longer an obstacle, is extremely attractive to the science community. I believe that in the future we will see a lot of customized clouds for various fields of scientific research. Cloud computing should prove a great enabler in tackling some of the pressing issues of our era, such as food shortages, the search for renewable alternative energy and the discovery of novel materials for semiconductor production.
