Designing an Internet for Content Delivery and Not Communication
by Krishna Gummadi, Max-Planck Institute for Software Systems

Extended Abstract:

(1) A challenge for the future Internet: Delivering bulk content cheaply and efficiently

A vast majority of today's Internet traffic goes towards delivering content rather than supporting interactive communication. A growing fraction of this content delivery traffic is due to bulk content transfers. Examples include movie and music file downloads, software downloads, personal and enterprise data backups, and data sets for experimental sciences such as physics or astronomy. The Internet today is ill-suited for such transfers from both economic and efficiency perspectives: beyond a certain size, it is less efficient and more costly to transfer data over the Internet than to ship it through traditional mail networks on physical storage media such as DVDs or hard disks.

In this talk, I will argue (a) that the challenge of delivering bulk data cheaply and efficiently has many near-term and long-term applications, (b) that it requires a fundamental rethinking of the Internet design, and (c) that experimental investigation of new Internet designs requires certain facilities from research infrastructures.

(2) Why does the challenge require a fundamental rethinking of the Internet design?

The Internet evolved out of attempts to enable computers to communicate. Its design is rooted in the presumed characteristics of computer communication, namely short bursts of interactive traffic at unpredictable times. Packet switching and statistical multiplexing proved to be an excellent fit for this type of traffic. Interestingly, the design also proved highly successful for delivering Web content: the small sizes of typical Web content transfers generate short, bursty traffic that is well suited for delivery over the Internet.
However, as content sizes grow larger, the inadequacies of the Internet infrastructure become obvious. So let's pose a hypothetical question: Would we have designed the Internet differently if our goal had been bulk content delivery, and not communication?

I believe that a network designed for bulk content delivery would be inspired by transportation networks such as rail and truck cargo networks, as opposed to communication networks such as the telephone network. Specifically, in cargo networks, the end hosts hand over the content to the network, which delivers it before a pre-determined deadline using routes and schedules optimized for cost and efficiency. A significant fraction of the delivery schedule consists of wait times at intermediate store-and-forward centers. Similarly, I believe that an Internet designed for bulk content delivery would offer support for "offline" data transfers, where the content could be stored at intermediate points in the network for prolonged periods of time before its delivery.

Why would offline data transfers be more efficient and cheaper than today's online data transfers? By allowing greater flexibility in scheduling delivery times and routes, offline data transfers provide a better opportunity to exploit temporal and spatial multiplexing. For example, different parts of the same content transfer between two end hosts might be routed along different routes for load-balancing or cost reasons, and transfers can be delayed and scheduled during periods of lower network utilization.

Such a network design immediately raises several interesting architectural questions. For instance, how would storage be provided at intermediate points in the network? Should storage be provided at the edges of the network or at every network router? How long should content be stored before the space is reclaimed? What are the resource contention policies for the intermediate stores? How is the reliability of data delivery ensured?
Is it ensured between every two intermediate storage points along the path? How are the end hosts informed of the status of the transfer? Who controls the delivery routes and schedules, and to what extent? Should it be the ISPs, or the end hosts, guided by information provided by the ISPs? Could the data cached at intermediate nodes for one transfer be leveraged for another transfer? If so, how should intermediate storage and data be named and addressed?

(3) What should the experimental infrastructure provide to enable this research?

To enable this research, ideally every intermediate router in the experimental network infrastructure should be capable of acting as an intermediate store for bulk data. This might require the routers to be equipped with large, high-performance storage systems. Given the cost and performance trends of storage systems, I expect that they would not add significant cost to the research infrastructure. However, the resulting research infrastructure could provide a major boost to many other ongoing research projects, such as delay-tolerant networking, logistical networking, and data-oriented transfers.
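
As an illustration of the temporal multiplexing argument in section (2), the following minimal Python sketch places a deadline-bound bulk transfer into the least-utilized hours of a single link. It is only a toy model under stated assumptions, not a design proposed in this abstract: the names `Transfer` and `schedule_offline`, the hourly spare-capacity list, and the greedy "fill the emptiest hours first" policy are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    size_gb: int
    deadline_hour: int  # content must be delivered before this hour (exclusive)


def schedule_offline(transfer, spare_gb_per_hour):
    """Greedily place a bulk transfer into the least-utilized hours before its deadline.

    spare_gb_per_hour[h] is the (hypothetical) unused link capacity in hour h.
    Returns {hour: gb_sent}, or None if the deadline cannot be met.
    On success the list is updated in place, so later transfers see what is left.
    """
    hours = range(min(transfer.deadline_hour, len(spare_gb_per_hour)))
    # Prefer hours with the most spare capacity, i.e. the lowest utilization.
    by_spare = sorted(hours, key=lambda h: spare_gb_per_hour[h], reverse=True)

    remaining = transfer.size_gb
    plan = {}
    for h in by_spare:
        if remaining <= 0:
            break
        sent = min(remaining, spare_gb_per_hour[h])
        if sent > 0:
            plan[h] = sent
            remaining -= sent

    if remaining > 0:
        return None  # not enough spare capacity before the deadline

    # Commit the plan only once we know the whole transfer fits.
    for h, sent in plan.items():
        spare_gb_per_hour[h] -= sent
    return plan


if __name__ == "__main__":
    spare = [1, 5, 2, 8, 3, 0]  # assumed spare capacity (GB) for hours 0..5
    print(schedule_offline(Transfer("backup", 10, deadline_hour=5), spare))
    print(spare)
```

In this sketch the data would wait in intermediate storage until its assigned hours arrive, which is what distinguishes an offline transfer from an online one that must start immediately. A fuller model would also choose among multiple routes (spatial multiplexing) and account for storage limits at the intermediate points.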