An Extensible Internet for Science Applications and Beyond
ICSI PI Dr. Jay Chen is collaborating on this project with a team of researchers at UC Berkeley led by Prof. Scott Shenker.
There are many science experiments that generate huge amounts of data, ranging from terabytes to petabytes and beyond. Some of these science applications require their data to be processed in complex workflows that involve different functions (data generation, analysis, and storage) and span multiple collaborating sites. While progress has been made in addressing these needs – particularly the Data Transfer Nodes (DTNs) developed by ESnet – more work is needed to provide the detailed management and specific data-handling functionality these complex workflows require.
While one could address these requirements through custom changes in the computational facilities at each site, such an approach creates deployment and integration challenges that will be difficult to overcome. This project investigates a different solution, which involves embedding the necessary functionality within the network. As such, the network becomes a “Data Valet” for science applications. While placing application-level functionality within a network may seem antithetical to the nature of the Internet, this approach follows the precedent of the large private networks being deployed by content and cloud providers. These private networks now offer various forms of in-network processing to the great benefit of their users (e.g., lower latency and better reliability); the functionality proposed here merely applies this approach to science applications.
This project investigates the Data Valet approach within the context of three other efforts:
- ESnet is a high-performance network optimized for large-scale science, interconnecting the National Laboratory System in the United States. This effort includes two collaborators from ESnet whose insights into the needs of science applications and the role of ESnet in meeting those needs are crucial to the success of the project.
- The in-network functions are being designed within the framework of the Extensible Internet, a recent proposal for an Internet architecture that encompasses in-network processing in a backward-compatible and extensible manner.
- FABRIC, a new NSF-funded testbed, is being used for early testing and deployment. Once the operational prototype of the Data Valet is completed, the FABRIC deployment can make the developed functionality available to all users.
At ICSI, PI Jay Chen (in collaboration with PI Shenker and the other PIs) is identifying possible application scenarios on ESnet where the Extensible Internet design could be usefully applied, designing multipoint delivery protocols, and overseeing the programming work to implement them.