The Large Hadron Collider Project moves research forward with JBoss Fuse

March 12, 2013

"FuseSource’s open source solution offers us the flexibility to build on what we already have and support our enterprise operations. With CERN’s commitment to open science, the open-source model is ideal." -Ian Bird, Project Leader, WLCG, CERN

Geography: EMEA
Business Challenge:

Take advantage of the vast amounts of computing power from over 100,000 machines contributed by more than 140 diverse facilities in 20 different countries.

Solution:

Use Apache ActiveMQ to reliably move operational monitoring data between nodes without imposing technology on users, and FuseSource for enterprise-class support and services.

Software:

FuseSource

Benefits:

CERN was able to improve the quality of the monitoring information, reduce the time needed to detect and fix problems in the infrastructure, and reduce the manpower needed to manage the infrastructure.

Business Challenge:

The cutting edge of science

The European Organization for Nuclear Research, known as CERN, is one of the world’s largest and most respected centers for scientific research. Its business is fundamental physics: finding out what the universe is made of and how it works. At CERN, the most complex scientific instruments, including the world-renowned Large Hadron Collider (LHC), are used to create high-energy conditions similar to those in the first instants of the universe. Results from the LHC, including the potential discovery of the Higgs boson, could revolutionize the way we look at the universe.

The LHC and CERN’s other scientific instruments need the world’s most scalable and reliable infrastructure to support their operational systems, so CERN chose Apache ActiveMQ and partnered with FuseSource Corporation for training and enterprise-class subscriptions. With the best open source software available and help from FuseSource, CERN now relays huge amounts of scientific data between scientific machinery and computational servers deployed all across the world, using an operational grid to help physicists learn about the laws of nature.

CERN is an exciting laboratory, exploring the most fundamental building blocks of our universe. The Large Hadron Collider (LHC) is a particle accelerator that boosts beams of particles to high energies before they are made to collide with each other or with stationary targets, and detectors observe and record the results of the collisions. The Worldwide LHC Computing Grid (WLCG) is a global collaboration of more than 170 computing centers in 34 countries, the 4 LHC experiments, and several national and international grid projects. The mission of the WLCG project is to build and maintain a data storage and analysis infrastructure for the entire high energy physics community that will use the Large Hadron Collider at CERN.

Such a complex distributed infrastructure needs advanced monitoring and operations capabilities. Initially, many components were developed in technology silos with their own interface code to send and receive data and results. Since each component had its own unique development environment (some used Python, some used C++, still others preferred Perl), a shared integration infrastructure seemed impossible. The resulting integration infrastructure was very brittle, not scalable, and required a large operational overhead in terms of staff. Changing any interface was time-consuming and expensive, and service nodes could not be added to or removed from the grid without extreme effort.

CERN knew that their integration infrastructure would become a gating factor in their research if they did not supply a better solution.

Solution:

The move to messaging infrastructure

The diverse computing environments were not the only challenge CERN faced when it looked at improving its operations infrastructure. Because the WLCG project is a collaboration between many institutions, there was no centrally funded organization that could buy, build out, and deploy a centralized solution. CERN could not impose technical restrictions on the participating computing centers, nor could it force them to assume licensing expenses.

The distributed approach of a message-based solution was the obvious choice. A message broker could be deployed at each node and there would be no need for a centralized server, nor would a centralized IT department dictate how each lab would establish its computing environment or internal infrastructure.

Cost was another consideration. It would be difficult to force sites to ante up for licensing costs even though most sites were prepared to pay something in exchange for a better integration solution. CERN began its search looking at open source options.

Apache ActiveMQ – the best tool for the job

CERN began its investigation into message brokers, and Apache ActiveMQ quickly rose to the top of the list. Its impressive list of supported languages and technologies meant that the broker would be easy for sites to adopt, and the lack of licensing costs meant that there were no upfront costs.

CERN also considered how the solution was licensed, and the Apache license used by ActiveMQ fit their model perfectly. CERN would be able to package ActiveMQ into their solution and make it available to any location that wanted it without requiring them to work with a commercial vendor.

Reliability was also a top criterion in their decision. ActiveMQ was the only open source alternative that CERN felt was mature enough for their purposes. Having access to services and support from a large, stable company was also a key advantage. They chose ActiveMQ in 2007 and began development immediately.

Services from the people who know ActiveMQ best

Of the many firms that offer services and support, FuseSource stood apart because the ActiveMQ Project Management Committee (PMC) Chair and many of the key committers are employees of FuseSource, and it was important to CERN that services and subscriptions come from the people who developed and influence the project.

It was also important to CERN that they have access to stable and productized releases. FuseSource takes the Apache projects, performs additional QA tests, certifies them, and releases the products with version numbers. With CERN looking to deploy ActiveMQ at so many sites, having stable, trackable releases was critical.

One of the first things CERN wanted to do with ActiveMQ was to make it work with the STOMP protocol for a few of their users. This required expertise that went beyond the skill set of the grid operations group at CERN, so they brought in expert consulting from FuseSource. Their work resulted in improvements to ActiveMQ that were submitted to Apache, and were included in the next build of the project.
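To make the STOMP requirement more concrete, here is a minimal sketch of what a STOMP frame looks like on the wire: a command line, header lines, a blank line, the body, and a terminating NUL byte. This simple text format is what lets clients in Python, Perl, or C++ talk to the same broker. The destination name and message body below are hypothetical and are not taken from CERN's deployment, and the sketch ignores header escaping and the connection handshake.

```python
def build_stomp_frame(command, headers, body=""):
    """Serialize a STOMP frame: command, headers, blank line, body, NUL byte."""
    lines = [command]
    lines.extend(f"{key}:{value}" for key, value in headers.items())
    return ("\n".join(lines) + "\n\n" + body + "\x00").encode("utf-8")


def parse_stomp_frame(data):
    """Split a serialized frame back into (command, headers, body)."""
    text = data.decode("utf-8").rstrip("\x00")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


# Hypothetical monitoring message; the topic name is an assumption.
frame = build_stomp_frame(
    "SEND",
    {"destination": "/topic/grid.monitoring.example"},
    "site=EXAMPLE status=OK",
)
command, headers, body = parse_stomp_frame(frame)
```

A real client would normally use an existing STOMP library rather than building frames by hand; the sketch only illustrates why the protocol is easy to support from many languages.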

Deploying with ActiveMQ

In 2009, CERN began deploying Fuse Message Broker®, the enterprise version of Apache ActiveMQ, across the grid. Installation and transition went smoothly, and soon service nodes were sending logs and other critical data in support of the various physics and biomedical projects running at CERN. When the LHC particle accelerator came back online at the end of the year, the integration infrastructure began to run at peak capacity.

CERN now sends several million messages a day through the ActiveMQ infrastructure from components deployed across all 34 partner countries. This comes with improved reliability and a reduced operational load.

Benefits:

Beyond ActiveMQ

With the success of ActiveMQ, CERN is beginning to investigate using Fuse Mediation Router®, the enterprise version of Apache Camel for message routing. Apache Camel is a tool for integrating services, applications, and transport protocols using Enterprise Integration Patterns.

Apache Camel is a rule-based routing and transport mediation engine that combines the ease of basic POJO development with the clarity of a standard notation for integration patterns. CERN will be using Camel to route messages to the right place, which will allow them to remove routing logic from applications. This will simplify development at the service nodes and further reduce the footprint of the integration infrastructure.
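The idea of pulling routing logic out of applications can be illustrated with a content-based router, one of the Enterprise Integration Patterns. The sketch below is plain Python and is not Camel's API; the class name, message fields, and queue names are all hypothetical. It shows producers handing messages to a router that decides the destination from rules kept in one place.

```python
from collections import defaultdict


class ContentBasedRouter:
    """Pick a destination for each message from an ordered list of rules."""

    def __init__(self, default_destination):
        self.rules = []
        self.default_destination = default_destination

    def when(self, predicate, destination):
        # Register a rule; returning self allows rules to be chained.
        self.rules.append((predicate, destination))
        return self

    def route(self, message):
        # The first matching rule wins; unmatched messages go to the default.
        for predicate, destination in self.rules:
            if predicate(message):
                return destination
        return self.default_destination


# Hypothetical rules and queue names, for illustration only.
router = (
    ContentBasedRouter(default_destination="queue.unrouted")
    .when(lambda m: m.get("type") == "log", "queue.logs")
    .when(lambda m: m.get("type") == "metric", "queue.metrics")
)

delivered = defaultdict(list)
for message in [{"type": "log"}, {"type": "metric"}, {"type": "other"}]:
    delivered[router.route(message)].append(message)
```

Because the rules live in the router, a service node only needs to publish its message; changing where a message type goes does not require touching the producing application.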

CERN set out to reduce the brittleness of its ad hoc integration infrastructure, and turned to Apache ActiveMQ for its robustness, low cost, and flexible licensing. The project was a success, and CERN was soon sharing data securely and efficiently across the operational grid. As a result, CERN’s projects were able to spend more human and computing resources on advancing science.

FuseSource was an instrumental player in making it possible. It provided the enterprise-class services and subscriptions the grid operations group needed, and is an option for all service node locations that want additional help. Between the leading open source software, world-class support, and CERN, science is taking a great leap forward.
