The project, called the Center for Large-scale Data Systems Research (CLDS), formally begins operations this fall and will also be home to the ongoing How Much Information? (HMI?) research program, which released a new report this week at the Storage Networking World (SNW) Spring 2011 conference in Santa Clara, Calif.
The latest report by HMI?, a consortium led by UC San Diego and previously based at the university’s School of International Relations and Pacific Studies, analyzes the growth of “big data” in companies. The authors found that the world’s installed base of computer servers processed almost ten million million gigabytes of information in 2008, or nearly 10 to the 22nd power bytes. Full details are available here.
“We are entering an era of data-intensive computing, where all of us - academia, industry, and government - will be faced with organizing, analyzing, and drawing meaningful conclusions from unprecedented amounts of data, and doing so in a cost- and performance-effective manner,” said Michael Norman, SDSC’s director.
SDSC recently announced the startup of two data-intensive computing systems, Dash and Trestles. Those systems will be followed later this year by a significantly larger system called Gordon, which will be the first supercomputer to employ large amounts of flash memory to help speed solutions to computing problems now limited by higher-latency spinning-disk technology. When deployed, Gordon should rank among the top 100 supercomputers in the world, capable of doing latency-bound file reads 10 times faster and more efficiently than any high-performance computing system today.
“It is new technology such as SDSC’s flash memory-based systems that is changing how science and research will be done in the Information Age,” added Norman. “CLDS will serve as a laboratory that will put us on the leading edge of adaptation and integration of technologies such as this, and explore the multi-faceted challenge of working with big data in collaboration with academic and industry partners.”
In addition to serving as the host site for ongoing HMI? research, CLDS will test and evaluate new trends in cloud-based storage systems, examining the cloud computing principles of “on-demand, elasticity, and scalability” in the context of large-scale storage requirements. Research will include exploration of new storage architectures and benchmark development.
“Establishing CLDS at SDSC is a natural fit,” said Chaitan Baru, an SDSC distinguished scientist and director of the new project, adding that the center will be structured as an industry-university consortium. “SDSC is recognized for its expertise in the development of systems for storing, managing, and analyzing ‘big data.’ Our goal here is to understand how new technologies will change the way we work in this data-rich age.”
CLDS will also serve as a resource for strengthening analytical and research relationships, fostering industry partnerships and exchanges through individual or group research projects, and supporting industry forums and other professional education programs.
“Integrating management, economic and technical analysis is what all companies will need in the world of ‘big data’ and even bigger analytics,” said James Short, research director of the HMI? program and lead scientist for the CLDS project. “SDSC offers a rich environment for integrating management analysis with both applied and theoretical computer science for research in large-scale data systems.”