Objective

With increasing link bandwidths and router capacities in today's Internet, it becomes increasingly difficult for network operators to be aware of the amount and nature of the traffic they have to manage. This information is needed for several reasons. First, network operators need to learn of network-related problems such as hardware or software failures, misconfigurations, and attacks as soon as possible. Second, traffic volume information is needed both for billing customers and for provisioning the network. Third, network operators need traffic information to actively influence the traffic passing over the network, which is commonly referred to as traffic engineering.

The high bandwidths in use make it very hard to perform such analysis at the packet trace level. Packet counters in routers can be used to infer the total number of packets per time unit passing a link or a router, but they do not provide the information necessary for billing and traffic engineering. To tackle these problems, the notion of network flows, or netflows for short, was introduced and deployed.

A netflow in this context is a unidirectional stream of related packets. Netflows have a start and an end, determined by time gaps during which no related packet has been observed. This concept leads to a drastic reduction in data volume while conserving information such as the endpoints of the flow and the nature of the data in the flow. It is particularly interesting when it comes to traffic engineering.
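The definition above can be illustrated with a minimal sketch. It assumes that "related packets" means packets sharing the same 5-tuple and that a gap longer than `idle_timeout` seconds ends a flow; the `Packet` type, the function name, and the 60-second default are illustrative choices, not taken from the project's software.

```python
from collections import namedtuple

# Hypothetical packet record: timestamp in seconds, 5-tuple fields, size in bytes.
Packet = namedtuple("Packet", "ts src dst sport dport proto size")

def packets_to_flows(packets, idle_timeout=60.0):
    """Group packets into unidirectional flows that are split by idle gaps."""
    active = {}      # flow key -> flow record that is still open
    finished = []    # flows closed because of an idle gap
    for p in sorted(packets, key=lambda p: p.ts):
        # The key is directional: A->B and B->A are different flows.
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        flow = active.get(key)
        if flow is not None and p.ts - flow["end"] > idle_timeout:
            # No related packet was seen for too long: close the old flow.
            finished.append(active.pop(key))
            flow = None
        if flow is None:
            flow = {"key": key, "start": p.ts, "end": p.ts,
                    "packets": 0, "bytes": 0}
            active[key] = flow
        flow["end"] = p.ts
        flow["packets"] += 1
        flow["bytes"] += p.size
    # Flows still open at the end of the trace are reported as well.
    finished.extend(active.values())
    return sorted(finished, key=lambda f: f["start"])
```

Each flow record keeps only the endpoints, start and end times, and packet and byte counts, which is where the reduction in data volume comes from.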

As traffic engineering involves changes to routers' forwarding tables, it is infeasible to do this on a per-connection or per-host basis. Instead, one can define which packets are accounted to which flow and then do traffic engineering on a per-flow basis. Here one wants to concentrate on large and long-lived flows.
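One way to define which packets count toward which flow is to aggregate by destination prefix rather than by individual host. The sketch below assumes this particular flow definition and a /24 default; both are illustrative and not prescribed by the project. The helper names are hypothetical.

```python
import ipaddress
from collections import defaultdict

def prefix_key(dst_ip, prefix_len=24):
    """Map a destination address to the prefix that contains it."""
    # strict=False lets us pass a host address and get its enclosing network.
    return str(ipaddress.ip_network(f"{dst_ip}/{prefix_len}", strict=False))

def aggregate_by_prefix(records, prefix_len=24):
    """Sum byte counts of (dst_ip, nbytes) records per destination prefix."""
    totals = defaultdict(int)
    for dst_ip, nbytes in records:
        totals[prefix_key(dst_ip, prefix_len)] += nbytes
    return dict(totals)
```

A shorter prefix length yields fewer, larger flows, so the same traffic can be viewed at several aggregation levels.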

But what are large flows, and what are long-lived flows? Are large flows long-lived? Are long-lived flows large? These are but a few of the questions that arise when dealing with netflows. Evidently, we lack a basic understanding of the nature of the netflows that can be observed in the Internet. There are a few findings related to the behavior of flows on a very fine timescale. We have also learned that the size of netflows is consistent with Zipf's law. This means that a very small number of flows contribute most of the bytes and packets going over a link. But we have no information on what these few large flows look like, where they come from, and how they behave over time.
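The concentration of traffic in a few flows can be measured directly. The following sketch computes the share of bytes carried by the largest fraction of flows; the function name and the synthetic Zipf-like sizes are assumptions for illustration, not measured data.

```python
def top_share(flow_bytes, fraction=0.01):
    """Return the share of total bytes carried by the largest `fraction` of flows."""
    sizes = sorted(flow_bytes, reverse=True)
    total = sum(sizes)
    if total == 0:
        return 0.0
    # Always consider at least one flow, even for very small fractions.
    k = max(1, int(len(sizes) * fraction))
    return sum(sizes[:k]) / total

# Synthetic example: the size of the flow at rank r is proportional to 1/r.
synthetic_sizes = [1_000_000 // rank for rank in range(1, 1001)]
share_of_top_1_percent = top_share(synthetic_sizes, fraction=0.01)
```

For sizes that follow a Zipf-like rank distribution, this share is far larger than the fraction of flows considered, which is the effect described above.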

It is these questions that we try to find answers to. The answers will allow us to make more efficient use of netflow information for applications like billing and traffic engineering. We also hope to find new aspects of network traffic that allow for new applications and further our understanding of what exactly happens on today's and tomorrow's Internet.

Software

During the project, a software package was developed for processing large amounts of network measurement data in the form of packet-level traces and NetFlow data. It contains tools to convert packet-level traces into NetFlow traces and to normalize, aggregate, and rank NetFlows by rate across time. It also contains tools to convert NetFlow traces between formats such as native wire format and ASCII, as well as a tool to generate time series of packet counts and volume from packet-level traces.
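Two of the operations mentioned above, generating time series from a packet trace and ranking flows by rate across time, can be sketched as follows. This is not the project's software; the input tuples, function names, and the fixed bin width are assumptions made for illustration.

```python
from collections import defaultdict

def timeseries(packets, bin_width=1.0):
    """Bin (timestamp, size_bytes) pairs into {bin_index: (packets, bytes)}."""
    bins = defaultdict(lambda: [0, 0])
    for ts, size in packets:
        b = int(ts // bin_width)
        bins[b][0] += 1
        bins[b][1] += size
    return {b: tuple(v) for b, v in sorted(bins.items())}

def rank_flows_per_bin(packets, bin_width=1.0):
    """Rank flows by byte rate within each time bin.

    `packets` yields (timestamp, flow_key, size_bytes). The result maps each
    bin index to a list of (flow_key, bytes_per_second), highest rate first.
    """
    per_bin = defaultdict(lambda: defaultdict(int))
    for ts, key, size in packets:
        per_bin[int(ts // bin_width)][key] += size
    return {
        b: sorted(((k, v / bin_width) for k, v in flows.items()),
                  key=lambda kv: kv[1], reverse=True)
        for b, flows in sorted(per_bin.items())
    }
```

Comparing the rankings of consecutive bins shows whether the flows that are large in one interval remain large in the next.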

Publications

  1. Jörg Wallerich, Holger Dreger, Anja Feldmann, Balachander Krishnamurthy, Walter Willinger. A methodology for studying persistency aspects of internet flows. SIGCOMM Computer Communications Review (CCR), 35(2):23-36, 2005.
  2. Jörg Wallerich, Anja Feldmann. Capturing the Variability of Internet Flows Across Time. In Proceedings of the INFOCOM 2006. 25th IEEE International Conference on Computer Communications, 9th IEEE Global Internet Symposium, Pages 1-6, April 2006.
  3. Jörg Wallerich. Capturing the Variability of Internet Flows in a Workload Generator for Network Simulators. PhD Thesis Technische Universität München, Munich, Germany, 2007.