News In Engineering


Dhabaleswar Panda. Photo Geneva Ringel

Bridging Technologies

Time is of the essence in supercomputing. The complex problems that supercomputers are used for, such as atomic modeling, decoding the human genome or running large e-commerce web sites, can take a network of high performance computers days or weeks to solve. An Ohio State researcher has developed a software program that may help shorten this time.

Several system components contribute to supercomputing computation speed, such as the software used, the networking platform in place and the computer hardware that supports the network. Researchers continuously develop new supercomputing technologies to improve the performance of these components, with the overall goal of reducing computation time. However, end users cannot fully reap the benefits of these new technologies until solutions are created that allow the technologies to work together.

Dhabaleswar Panda, professor of computer and information science, Pete Wyckoff, research scientist at the Ohio Supercomputer Center, and their research team have created a technology bridge between two promising supercomputing technologies. Their solution, MVAPICH, is a software program that provides an interface between MPI, a software standard used in many supercomputing applications, and computer hardware supporting an emerging networking technology called InfiniBand.

Failure to Communicate

MPI and InfiniBand were both developed to provide faster computation speeds to supercomputer applications, but the two technologies were not initially compatible.

Parallel computing, the use of two or more computers to solve a single problem, enables researchers to break large problems into smaller pieces that can be solved in parallel -- a big time saver. The computer network receives its directions on how to break the problem into smaller pieces from parallel computing software programs written by the supercomputer user. MPI, or the message passing interface standard, has become the de facto choice for writing and developing parallel programs for supercomputing applications.
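To make the idea of breaking a problem into pieces concrete, here is a minimal sketch in Python of one common way to divide work, known as block decomposition. It is an illustration only, not code from MVAPICH or from any MPI library: the function names `block_range` and `simulated_parallel_sum` are hypothetical, and the "ranks" (the MPI term for the cooperating processes) are simulated with an ordinary loop rather than run on separate computers.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open [start, stop) slice of work owned by `rank`."""
    base, extra = divmod(n_items, n_ranks)
    # The first `extra` ranks take one additional item, so block sizes
    # never differ by more than one.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def simulated_parallel_sum(values, n_ranks):
    """Sum `values` piece by piece, the way a message passing program would."""
    partial_sums = []
    # Assumption: in a real parallel run each rank would be a separate
    # process on its own node; here the ranks are visited one after another.
    for rank in range(n_ranks):
        start, stop = block_range(len(values), n_ranks, rank)
        partial_sums.append(sum(values[start:stop]))
    # Stand-in for the reduction step, where per-rank results are sent
    # over the network and combined.
    return sum(partial_sums)


data = list(range(1, 101))
print(simulated_parallel_sum(data, n_ranks=4))  # 5050
```

In an actual MPI program, the final combining step is where messages cross the network, which is why the speed of the underlying networking technology matters so much.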

MPI has been developed over the years so that parallel programs written for distributed-memory parallel systems are portable across different types of computer systems, giving supercomputer users more options. Before MPI, supercomputer users had to write a separate program for each specific supercomputing system.

Dhabaleswar Panda examines the network connections to the InfiniBand hardware. Photo Geneva Ringel

As beneficial as MPI has proven to be for parallel computing applications, it cannot deliver performance unless protocols are known for each type of networking technology that can be employed by a supercomputer network.

“Even though basic MPI can work on naïve networking protocols like TCP/IP, it does not deliver performance unless the protocol stack of MPI is redesigned and implemented over the lower-level communication interfaces of a network while trying to take advantage of its features,” said Panda. “Thus, as new networking technologies evolve, it becomes a challenge to redesign and implement the MPI protocol stack to reap the benefits of the new networking technologies, such as InfiniBand.”

InfiniBand, short for infinite bandwidth, is a new generation networking architecture standard that was developed by an industry consortium to provide high performance and scalable networking solutions for next generation high-end computing systems, including supercomputers.

“The InfiniBand standard takes a new approach to the design of high performance networks while trying to address interprocessor communication, I/O, and quality of service (QoS) support in an integrated manner,” explained Panda. “It also provides a range of mechanisms to offload communication tasks from the main processor to the network adapters so that maximum overlap of computation and communication can be achieved, which is fundamental to obtaining high performance parallel processing and scalability.”
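The benefit of overlapping computation and communication can be illustrated with a simple timing model. The Python sketch below is an assumption-laden simplification, not a description of how InfiniBand adapters or MVAPICH work internally: the function `step_time` and the 10 and 4 millisecond figures are hypothetical, chosen only to show the arithmetic.

```python
def step_time(compute_ms, communicate_ms, overlap_fraction):
    """Time for one step when part of the communication is hidden.

    `overlap_fraction` is the share of communication (0.0 to 1.0) that the
    network adapter can carry out while the main processor keeps computing.
    """
    if not 0.0 <= overlap_fraction <= 1.0:
        raise ValueError("overlap_fraction must be between 0 and 1")
    # Communication can only be hidden behind computation that exists,
    # so the hidden part is capped at the compute time.
    hidden_ms = min(compute_ms, communicate_ms * overlap_fraction)
    return compute_ms + communicate_ms - hidden_ms


# Hypothetical step: 10 ms of computation and 4 ms of communication.
print(step_time(10.0, 4.0, overlap_fraction=0.0))  # 14.0, no overlap
print(step_time(10.0, 4.0, overlap_fraction=1.0))  # 10.0, full overlap
```

With full overlap the step takes only as long as the slower of the two activities, which is the effect Panda describes when communication tasks are offloaded to the network adapters.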

InfiniBand supporters claim that this new generation networking architecture can speed parallel computing applications as well as provide limitless scalability.

As InfiniBand technology was rolling out, however, no MPI protocol stack existed for the InfiniBand networking architecture. The existing implementations of MPI were incompatible with the InfiniBand networking architecture. This threatened the implementation of InfiniBand on current generation high performance computing systems.

Bridging the Gap

Noticing the technology gap, Panda and Wyckoff conceptualized a design and implementation of a new interface to MPI that would allow the benefits of MPI and InfiniBand to be simultaneously exploited. If successful, it would also provide an opportunity for supercomputer users to evaluate and take advantage of the new InfiniBand networking architecture.

Panda and Wyckoff set out to learn everything about a technology that was still in development. Their delving was aided by their prior research on MPI for other networking architectures, such as Myrinet and Quadrics, and low-level communication interfaces of these architectures, such as Virtual Interface Architecture or VIA.

After discerning the critical factors, they developed new designs and solutions that outlined the optimum communication paths to be created between MPI and the low-level communication interfaces, called Verbs, provided by InfiniBand. They incorporated their designs and solutions into a previous generation of MPI for VIA called MVICH and created the MVAPICH software.

Computer and information science graduate students Sushmita Kini, Jiuxing Liu and Jiesheng Wu were part of the research group that created MVAPICH. Photo Geneva Ringel

Finding InfiniBand hardware to develop the new software and conduct the associated tests was another matter. Until recently, InfiniBand hardware was not commercially available. Even as hardware became available, it was not always stable, and not all of the anticipated features were present. Finally, Panda established a collaborative partnership with Mellanox, a leading manufacturer of InfiniBand hardware, which provided him with a set of InfiniBand adapters and a switch to design, test and evaluate his software.

The Results

Currently, the MVAPICH software delivers one-way latency (application to application across two nodes) of only 6.8 microseconds and unidirectional bandwidth of up to 827 megabytes/sec on Mellanox’s InfiniBand hardware. These performance numbers continue to improve as Panda’s team incorporates better designs to take advantage of multiple features associated with InfiniBand.
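Latency and bandwidth together determine how long a message takes to arrive. A common first approximation, used here as a simplifying assumption rather than as a measurement from Panda's team, estimates the transfer time as the latency plus the message size divided by the bandwidth. The Python sketch below applies that model to the two figures quoted above; the function names are hypothetical, and one megabyte is assumed to mean 1,000,000 bytes.

```python
LATENCY_S = 6.8e-6      # one-way latency quoted in the article, in seconds
BANDWIDTH_BPS = 827e6   # 827 megabytes/sec, assuming 1 megabyte = 10**6 bytes


def transfer_time(message_bytes):
    """Estimated seconds to deliver one message under the simple model."""
    return LATENCY_S + message_bytes / BANDWIDTH_BPS


def effective_bandwidth(message_bytes):
    """Bytes per second actually achieved once latency is included."""
    return message_bytes / transfer_time(message_bytes)


# Message size at which half of the peak bandwidth is reached.
half_performance_bytes = LATENCY_S * BANDWIDTH_BPS

for size in (4, 1_000, 1_000_000):
    microseconds = transfer_time(size) * 1e6
    megabytes_per_sec = effective_bandwidth(size) / 1e6
    print(f"{size:>9} bytes: {microseconds:10.2f} us, {megabytes_per_sec:7.1f} MB/s")
```

Under this model, small messages are dominated by latency and large messages by bandwidth. The crossover, where a message achieves half of the peak bandwidth, falls at about 5.6 kilobytes for the numbers quoted, and a one-megabyte message would reach roughly 822 megabytes/sec.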

“These performance numbers are far superior compared to other proprietary networking technologies, such as Myrinet and Quadrics, being used in current generation clusters,” stated Panda.

For the standard NAS Benchmarks, a commonly used benchmark suite in the high performance computing community, the MVAPICH software with InfiniBand reduces parallel execution times of some applications by up to 33% when compared to other proprietary networking technologies (such as Myrinet and Quadrics) and the associated MPI implementations. This groundwork opens the door for market entry of InfiniBand hardware into high performance computing.
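A reduction in execution time can also be expressed as a speedup factor. The short Python sketch below does the conversion for the 33% figure; the function name and the 100-second baseline are hypothetical and serve only to illustrate the arithmetic.

```python
def speedup_from_reduction(reduction_fraction):
    """Convert a fractional cut in run time into a speedup factor."""
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    # New time = old time * (1 - reduction), so speedup = old / new.
    return 1.0 / (1.0 - reduction_fraction)


baseline_seconds = 100.0  # hypothetical run time on the older network
new_seconds = baseline_seconds / speedup_from_reduction(0.33)

print(round(speedup_from_reduction(0.33), 2))  # 1.49
print(round(new_seconds, 1))                   # 67.0
```

In other words, a job that finishes 33% sooner is running about one and a half times as fast.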

The first public release of the MVAPICH software was in October 2002, immediately before the International Conference on Supercomputing (SC ’02) in Baltimore, the major international conference for this field. At the SC ’02 exhibition, many organizations - including a joint Ohio State, Mellanox and Sandia National Laboratories demo - used MVAPICH to demonstrate the capability of the InfiniBand network to run their parallel applications on clusters with better performance.

Since its first release, MVAPICH has been downloaded by a large number of organizations, including industry, national labs, and universities from around the world, in order to test and evaluate new parallel systems with InfiniBand and gain the benefits of InfiniBand-based clusters for their MPI applications.

The Los Alamos National Laboratory will be one of the first organizations to install a large-scale (128 node) InfiniBand-based experimental cluster to study systems-level issues (such as routing and connection management) and to conduct scalability studies with system and application size. They plan to use Panda’s MVAPICH software for their experimentation and evaluation.

Panda is continuing his research to provide other critical interfaces in MVAPICH that take advantage of many novel features of InfiniBand. These interfaces include support for collective communication and quality of service. Collaborations have already begun to design the next generation of MPI, called MPI-2, on InfiniBand as well as other programming interfaces. Panda is also designing other kinds of systems for high-end computing, such as distributed shared memory, parallel file systems, web servers and internet data centers, to exploit the benefits of InfiniBand.

It is hoped that the current MVAPICH work and its follow-up research directions will not only improve the computation speed of supercomputer-powered research, but eventually enable society to solve complex problems that demand ever greater computation speed.

Contact: Dhabaleswar Panda, panda.2@osu.edu, or visit http://nowlab.cse.ohio-state.edu/projects/mpi-iba/

This research was supported by Sandia National Laboratories, contract # 30505.