With conventional error-detecting programs ill-equipped to deal with the large and complex data sets of parallel distributed supercomputers, revolutionary debugging software developed by the research team has garnered interest from agencies around the world.
The research team, led by the Lab's Director, Professor David Abramson, recently received funding support from the United States Department of Energy, an agency leading an international supercomputer R&D consortium that includes IBM. The team also has a commercialisation agreement with supercomputer manufacturing giant Cray.
Professor Abramson said the funding support from the US Department of Energy, together with an Australian Research Council Linkage grant with Cray, would enable the research team to "develop debuggers that scale to millions of processors. It will also allow us to leverage state-of-the-art software development environments to improve programmer productivity."
The uniqueness of the research team's expertise lies in a novel approach to the debugging process.
"While traditional debuggers work by comparing program variables with user expectations, our 'relative' debugging operates by comparing data in one program with data in another that is known to be correct. So it works by detecting where the codes differ rather than from the principle of how the code should be," Professor Abramson said.
"The debugging software which we have developed - and which is a commercial application of research we have been conducting for several years - efficiently weeds out glitches in supercomputers through a process that could be described as the technical equivalent of a 'spot the difference' puzzle."
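As an illustration only, the 'spot the difference' idea can be sketched in a few lines of Python. This is not the research team's software: the function name `first_divergence`, the checkpoint names and the sample values are all hypothetical, and a real relative debugger would compare live data across many processors rather than small in-memory lists. The sketch assumes both programs have recorded the same named checkpoints, and it reports the first place where the suspect program's data departs from the reference program's data.

```python
from math import isclose

def first_divergence(reference, suspect, rel_tol=1e-9):
    """Return (checkpoint, index) of the first differing value, or None.

    `reference` and `suspect` map checkpoint names to lists of numbers
    recorded by the known-correct program and the program under test.
    An index of None means the checkpoint is missing or has a different length.
    """
    for name, ref_values in reference.items():
        sus_values = suspect.get(name)
        if sus_values is None or len(sus_values) != len(ref_values):
            return (name, None)
        for i, (r, s) in enumerate(zip(ref_values, sus_values)):
            # Compare with a tolerance, since floating-point results from
            # different platforms are rarely bit-for-bit identical.
            if not isclose(r, s, rel_tol=rel_tol):
                return (name, i)
    return None

# Hypothetical traces: the two programs agree after initialisation
# but differ in the last value after the first computation step.
reference = {"after_init": [1.0, 2.0, 3.0], "after_step": [2.0, 4.0, 6.0]}
suspect = {"after_init": [1.0, 2.0, 3.0], "after_step": [2.0, 4.0, 6.5]}

print(first_divergence(reference, suspect))
```

The point of the sketch is that the programmer never states what the values should be; the reference program supplies the expectation, and the comparison narrows the search to the checkpoint where the two codes first disagree.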
Professor Abramson said that because of the complexity and size of supercomputer programs, previous approaches to debugging - developed with far smaller systems in mind - were ineffective.
"Previous debugging techniques for parallel computing have been effective for smaller-sized computation jobs, but their operating logic makes them ill-suited to supercomputers working with massive data sets. So there has been a growing demand for debugging software fit for supercomputers, because the impact of glitches in these multiprocessor systems is huge and can be very costly to fix," he said.
Professor Abramson said supercomputers were continually growing in size and sophistication, and an increasing number of industries were utilising them in core operations.
"They are involved in everything from simulating and testing materials to designing drugs, so our expertise has implications for many research fields and industries, particularly for groups trying to transfer old codes to new platforms. It is fantastic that Monash is playing such a leading role in the international arena in this critically important field."