Our approach to traffic generation is illustrated in Figure 5.1.
Given a packet header trace collected from some Internet link, we first use the
methods described and evaluated in Chapters 3 and 4
to analyze this trace and describe its content.
This description is a collection of connection vectors. Each vector
describes the source-level behavior of one of the TCP connections in
the trace using either the sequential or the concurrent a-b-t model. In addition,
each vector includes the relative start time of its connection, as well as its
measured round-trip time, TCP receiver window sizes and loss rate.
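To make the structure of these descriptions concrete, the following Python sketch models one sequential a-b-t connection vector. The class and field names are illustrative only; they are not the actual data format used by our analysis tools.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative sketch of a sequential a-b-t connection vector.
# Field names are assumptions for exposition, not the real format.
@dataclass
class ConnectionVector:
    cvid: int                  # unique connection vector id
    start_time: float          # relative start time (seconds)
    rtt: float                 # measured round-trip time (seconds)
    win_initiator: int         # receiver window size, initiator side (bytes)
    win_acceptor: int          # receiver window size, acceptor side (bytes)
    loss_rate: float           # measured loss rate
    a_adus: List[int] = field(default_factory=list)   # initiator ADU sizes
    b_adus: List[int] = field(default_factory=list)   # acceptor ADU sizes
    quiet_times: List[float] = field(default_factory=list)  # inter-ADU gaps

# A minimal request/response connection: one 329-byte request,
# one 403-byte response.
cv = ConnectionVector(cvid=1, start_time=0.5, rtt=0.08,
                      win_initiator=65535, win_acceptor=65535,
                      loss_rate=0.001, a_adus=[329], b_adus=[403])
```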
The basic approach for generating traffic according to this description is to replay
each connection vector. For each connection vector, the replay consists of
starting a TCP connection, carefully preserving its relative start time, and
reproducing its ADUs and inter-ADU quiet times. We call this traffic generation method
source-level trace replay, and
we have implemented it in a network testbed.
Source-level trace replay in our environment implies the need to
first partition the set of connection vectors into disjoint subsets
and then assign each subset to a pair of traffic generators.
Partitioning is necessary in our environment, since the high
throughput and large number of simultaneously active connections in our real traces
prevent us from using a single pair of traffic generators. We provide
further details on our partitioning method in Section 5.1.1.
The goal of a direct source-level trace replay is to reproduce
the source-level characteristics of the traffic in the original link, generating
the traffic in a closed-loop fashion. Closed-loop traffic generation requires
simulating the behavior of applications, using regular network stacks to
translate source-level behavior (the input of the generation) into network traffic
(the output of the generation).
In our implementation, described in Section 5.1.2,
this is accomplished by relying on the standard socket
interface to reproduce the communication in each connection vector. This is a
closed-loop manner of generating traffic in the sense that it preserves the feedback
mechanisms in the TCP layer,
which adapt their behavior to changes in network conditions, such as congestion.
In contrast,
packet-level trace replay, which directly reproduces the segments recorded
in the original trace, is an
open-loop traffic generation method in which TCP and lower layers
are not used, and the traffic does not adapt to network conditions.
A new packet header trace can be obtained from the source-level trace
replay of the original trace. Our validation of the traffic generation method is then
based on analyzing this new trace using the same methods used to transform
the original trace into connection vectors.
We then compare the resulting set of connection vectors
with the original set. In principle, they should be identical, since
the connection vectors represent the
invariant source-level characteristics of the original trace. Section 5.2
studies the results from the source-level trace replay of three traces, assessing
how closely the generated traffic approximates the original.
The generated trace is necessarily different from the original one.
Beyond the stochastic nature of network traffic,
this is because the generated traffic follows a set of connection vectors that is
a simplified description of the source-level behavior and network parameters
in the original trace. It is nevertheless important to understand the difference
between the generated and the original traffic in order to understand to what extent
the connection vectors describe the
original traffic. Chapter 6 is an in-depth study of this question.
The focus of our traffic generation work is the generation of wide-area traffic in a closed-loop manner. This type of generation process requires driving a large number of connections by simulating the behavior of the applications on the endpoints. For example, the experiments presented in the latter part of this chapter involve several million TCP connections, behaving in the manner specified by as many connection vectors. At any given point in time during the generation, tens of thousands of connections are active. Given CPU, memory and bus speed limitations, a single pair of traffic generators cannot handle such loads, so we generate traffic in our experiments in a distributed fashion. Experiments are conducted in the environment illustrated in Figure 5.2. The goal of the experiment is to generate traffic on the link between the two routers. Traffic is generated by 42 traffic generators, 21 on each side of the network. This type of topology is usually known as the ``dumbbell'' topology.
Each pair of traffic generators (one on each side) is responsible for replaying
the source-level behavior of a (disjoint) subset of the connection vectors in
the trace. In our experience, assigning connection vectors to subsets in a round-robin
fashion works well. While the resulting subsets are far from being completely
balanced, this simple partitioning technique results in subsets that can be easily
handled by a pair of traffic generators. We carefully collected statistics on
CPU and memory utilization from our source-level trace replay experiments,
and found that no pair of traffic generators was ever overloaded.
For the results in this dissertation, CPU utilizations were never above 60%,
and usually well below that. The use of network connections involves allocating
and deallocating pieces of memory known as ``mbufs'' for buffering purposes.
No request for this type of memory was ever denied for the experiments reported
in this dissertation. While larger traces than the ones we use in
this dissertation could certainly overload our specific environment,
our approach is fully scalable, in the sense that the set of connection
vectors can be partitioned into
an arbitrary number of subsets. This means that the number of traffic generators
can increase as much as necessary to handle the replay of any trace without running into
resource constraints. This is obviously true as long as no individual connection
requires more resources than those provided by an entire traffic generator end host.
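The round-robin assignment described above can be sketched as follows. This is a minimal illustration; `partition_round_robin` is a name invented here, not part of our tools.

```python
# Sketch of round-robin partitioning of connection vectors across
# traffic generator pairs. Vector i is assigned to subset i mod n_pairs.
def partition_round_robin(vectors, n_pairs):
    """Split `vectors` into n_pairs disjoint subsets, round-robin."""
    subsets = [[] for _ in range(n_pairs)]
    for i, cv in enumerate(vectors):
        subsets[i % n_pairs].append(cv)
    return subsets

# Example: distribute 100 connection vectors across 21 generator pairs.
subsets = partition_round_robin(list(range(100)), 21)
```

The resulting subsets are disjoint, cover every vector, and their sizes differ by at most one, which is sufficient balance for a pair of traffic generators to handle its share.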
We have developed a traffic generation tool,
tmix, which accurately replays the source-level behavior of an input set of connection
vectors using real TCP sockets in a FreeBSD environment.
In addition, we make use of a modified version of dummynet [Riz97]
to apply arbitrary packet delays and packet drop rates to the segments
in each connection. Our version of dummynet,
which we will call usernet in the rest of this text, implements a
user-level interface that can be used by tmix instances to assign
per-connection delays and loss rates read from the input set of connection vectors.
Finally, a single program, treplay, is used to control the setup of the
experimental environment, configure and start tmix
instances (assigning each a subset of the connection vectors and a traffic generation peer),
and collect the results.
Tmix is a user-level program that receives a collection of connection vectors as input, and generates traffic according to their source-level behavior. Figure 5.3 illustrates the relationship between tmix and the network layers in the traffic generation end host in which a tmix instance runs. Tmix instances rely on the standard socket interface to create a connection, send and receive ADUs, and close the connection. The socket interface is an API that enables user-level programs, such as tmix, to communicate with other end hosts using a programming abstraction similar to a file. Calls to the socket interface are translated by the kernel into requests to use the process-to-process communication service provided by the transport layer (TCP). The transport layer itself uses the host-to-host communication service provided by the network layer (IP), and the network layer uses the link layer (Ethernet in our case) to handle the network interface and create physical packets.
Our experiments also require a special simulation service, usernet, a modified version of
dummynet that provides a highly scalable way of imposing per-connection
round-trip times and loss rates. These per-connection round-trip times and loss rates
are directly controlled from the user level by tmix instances.
This requires a direct communication between the tmix instance
and the usernet layer that is not directly supported by the network stack.
In order to overcome this difficulty, we use a covert communication channel: the
source port number of each replayed connection.
By having tmix assign specific source port numbers to each connection,
we can then use ioctl calls to modify a table at the usernet layer
that maps source port numbers to
round-trip times and loss rates.
When a segment is received by usernet (from the higher layer),
usernet can appropriately use the source port number to decide which network
parameters should be applied. Source port numbers are unique for each active connection
in the same end host, and they are always present in TCP segments.
The user-level program, i.e., the tmix instance, must therefore keep track
of the (dynamic) source port number that is used for each new TCP connection it opens.
Using this technique, usernet can determine the delay and loss rate that
should be applied to each segment simply by reading an entry in a table indexed by source
port number, so the lookup time is constant. The number of possible source
port numbers is small (2^16 = 65,536), so this table does not require much kernel memory
(524 KB).
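The per-port table can be sketched as follows, in Python for brevity. The real table lives in the FreeBSD kernel and is updated through ioctl calls; the class and method names here are illustrative assumptions.

```python
# Sketch of usernet's per-connection parameter table, indexed by the
# 16-bit source port so that each lookup is a single array access (O(1)).
NUM_PORTS = 2 ** 16  # 65,536 possible source port numbers

class PortTable:
    def __init__(self):
        # One (delay, loss_rate) slot per possible source port;
        # None means no parameters have been assigned to that port.
        self.table = [None] * NUM_PORTS

    def set_params(self, src_port, delay, loss_rate):
        # In the real system, tmix issues an ioctl to install this entry.
        self.table[src_port] = (delay, loss_rate)

    def lookup(self, src_port):
        # Constant-time: a single array index per segment.
        return self.table[src_port]

t = PortTable()
# Assign a 40 ms delay and 0.1% loss rate to the connection on port 49152.
t.set_params(49152, delay=0.040, loss_rate=0.001)
```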
No special infrastructure was required to accurately replay the receiver window
sizes measured for each connection. This is because these parameters can
be directly modified by tmix instances using a FreeBSD system call.
This approach has worked very well in our experiments.
An alternative solution using traditional dummynet
would be to use the programmable API of ipfw,
which makes it possible to add new dummynet rules from a user-level program.
The idea would be to add a new rule for each connection, again using the source
port number to map delay/loss to individual connections. However,
this would mean an O(n) lookup cost for each segment, where n is the
number of rules, since the current implementation of ipfw searches through
the rules in a sequential fashion. Given the large number of connections that each end host
handles during the experiments, this per-segment lookup is unacceptable.
Another way of introducing per-connection round-trip times was used by Le et al. [LAJS03]. This study used random sampling from a uniform distribution whose parameters were set at the start of the experiment. As seen in Section 4.1.1, the uniform distribution is not a good approximation of real round-trip times. A later refinement enabling sampling from an empirical distribution was rather inflexible, since it required modifying the dummynet source code and recompiling it for each experiment. The use of usernet, which is fully controllable from the user level, is far more convenient.
Two instances of tmix can replay an arbitrary subset of the connection vectors
by establishing one TCP connection for each connection vector in the trace,
with one instance of the program playing the role of the connection initiator and
the other instance playing the role of the connection acceptor.
To begin, the connection initiator opens the connection and performs one or more
socket writes in order to send exactly the number of bytes specified in the
first ADU. The other endpoint accepts the connection and reads as many bytes
as this ADU specifies.
For efficiency, the size of these read and write operations was chosen to be a multiple
of the Maximum Segment Size (MSS) in our Ethernet testbed (1,460 bytes).
We made no attempt to actually measure and reproduce the size of the I/O operations in
the original connections. The impact of this simplification is likely to be small,
given the results in Section 3.4.
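The splitting of an ADU into MSS-multiple writes can be sketched as follows. The chunk size of eight segments is an assumption made here for illustration; the dissertation does not specify the multiple used.

```python
# Sketch of splitting one ADU into socket write sizes that are multiples
# of the Ethernet MSS. The 8-segment chunk size is an illustrative choice.
MSS = 1460            # TCP Maximum Segment Size on our Ethernet testbed
WRITE_SIZE = 8 * MSS  # write in chunks of 8 full segments (11,680 bytes)

def write_sizes(adu_bytes, chunk=WRITE_SIZE):
    """Return the sequence of write() sizes used to send one ADU."""
    sizes = []
    remaining = adu_bytes
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= sizes[-1]
    return sizes

# A 30,000-byte ADU becomes two full chunks plus a final partial write.
sizes = write_sizes(30000)
```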
One important issue is how to synchronize the two endpoints (i.e., instances of tmix)
of the connection so that they replay exactly the same connection vector.
This is accomplished by having the first ADU in each
generated connection include a 32-bit connection vector id
in the ADU's first four bytes.
Connection vector ids are assigned to each connection vector prior to
the traffic generation, and they are unique.
Since this id is part of the content of the first data unit,
the acceptor can unambiguously identify the connection vector that is to be replayed
in this new connection. If the first ADU is less than 4 bytes in length,
the connection initiator opens the connection using a special port
number designated for connections whose id is provided by the connection acceptor.
This approach guarantees that the two tmix instances
always remain properly synchronized (i.e., they agree on the connection vector
they replay within each TCP connection) even if connection establishment
segments are lost or reordered. It also makes it possible to generate traffic without
introducing any control traffic into the experiment, i.e., traffic comes only from the
replay of connection vectors, and not from any need to manage the behavior of the tmix
instances.
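The id embedding can be sketched as follows; the helper names and the use of network byte order are assumptions for illustration, not the actual tmix wire format.

```python
import struct

# Sketch of embedding the 32-bit connection vector id in the first four
# bytes of the first ADU. "!I" packs an unsigned 32-bit integer in
# network (big-endian) byte order; the real encoding may differ.
def make_first_adu(cvid, payload):
    """Prepend the connection vector id to the first ADU's payload."""
    return struct.pack("!I", cvid) + payload

def read_cvid(first_adu):
    """Acceptor side: recover the id from the first four bytes read."""
    (cvid,) = struct.unpack("!I", first_adu[:4])
    return cvid

# A 100-byte first ADU replayed for connection vector 12345.
adu = make_first_adu(12345, b"x" * 100)
```

Because the id travels inside the first data unit itself, no extra control traffic is introduced into the experiment.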
One important design consideration in the implementation of our traffic generation approach
is the assumption of independence among flows. While this is not completely realistic, the
level of aggregation at which we generate traffic makes it a reasonable approach
(see Hohn et al. [HVA02] for a related discussion).
This assumption makes traffic generation fully scalable, since the set of
connection vectors can be partitioned into an arbitrary number of subsets.
As long as there are enough traffic generation hosts, we can replay traffic from
arbitrarily large traces.
We obtain two types of data from each experiment. First, we collect a new packet
header trace, which can be directly compared with the original packet
header trace, and analyzed with our methods to extract a new set of
connection vectors. This new set can be directly compared to the original one.
Second, tmix instances create a number of logs.
Some tmix logs can be used to verify that the traffic generation hosts did
not run out of resources during traffic generation, and that they successfully
replayed their subset of the connection vectors.
Other tmix logs report on the performance of the TCP connections in the
experiments.
This includes connection and epoch response times, and the list of incomplete
connections with a description of their progress at the end of the experiment.
Doctoral Dissertation: Generation and Validation of Empirically-Derived TCP Application Workloads
© 2006 Félix Hernández-Campos