Traffic Traces for TCP Benchmarking

This page provides an initial draft of TCP traffic traces for use in the test suite described in the paper "Towards a common TCP evaluation suite".

The traces are for use with the Tmix [1] traffic generator. The format used is that described in [2], which differs in minor details from that in [1]. (In particular, the global-id and subset-id attributes are present and all times are given in microseconds.)

Here are the traces: traces-20080623.tar.bz2 (168 MB file; 625 MB uncompressed).

These files are based on a 60-minute trace of campus traffic at the University of North Carolina, provided by Jay Aikat. One trace contains connections initiated inside the campus network; the other contains those initiated from external sites.

The traces available here were produced by the following process:

  1. The two original traces, one with received and one with initiated connections, were merged. This reverses the direction of one of the traces, but since the traces are intended to be used in a symmetric configuration this is acceptable.
  2. Connections starting within the last 100 seconds were deleted. This reduces the dip in rate at the end of the original trace, which arises because only fully captured connections are present.
  3. A cyclic permutation of the original trace was chosen such that the average load over the first 100 seconds is approximately the same as the overall average load.
  4. The trace was sorted by RTT and split into nine sub-traces, such that each sub-trace has approximately the same average load.
  5. The sub-traces were truncated to different lengths, from 52 minutes to 60 minutes. This way, experiments can be made longer than the trace duration without repeating the same traffic.
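
Steps 3 and 4 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in tmix_utils.py: the Conn record and the function names are hypothetical, each connection is reduced to a start time, an RTT and a total byte count, and all of a connection's bytes are attributed to its start time.

```python
from typing import NamedTuple


class Conn(NamedTuple):
    """Hypothetical, simplified connection record (not the Tmix format)."""
    start: float  # connection start time, seconds from trace start
    rtt: float    # round-trip time, milliseconds
    nbytes: int   # total bytes carried by the connection


def choose_offset(conns, duration, window=100.0, step=1.0):
    """Find the rotation offset whose first `window` seconds have a load
    closest to the overall average load (brute-force search)."""
    target = sum(c.nbytes for c in conns) / duration
    best, best_err = 0.0, float("inf")
    offset = 0.0
    while offset < duration:
        # Bytes of connections that would start in the first `window`
        # seconds if the trace were rotated to begin at `offset`.
        in_window = sum(c.nbytes for c in conns
                        if (c.start - offset) % duration < window)
        err = abs(in_window / window - target)
        if err < best_err:
            best, best_err = offset, err
        offset += step
    return best


def cyclic_shift(conns, duration, offset):
    """Rotate start times so that `offset` becomes time zero."""
    shifted = [c._replace(start=(c.start - offset) % duration) for c in conns]
    return sorted(shifted, key=lambda c: c.start)


def split_by_rtt(conns, parts):
    """Sort by RTT and cut into `parts` groups of roughly equal total bytes,
    which gives roughly equal average load over a common duration."""
    ordered = sorted(conns, key=lambda c: c.rtt)
    total = sum(c.nbytes for c in ordered) or 1  # avoid division by zero
    groups = [[] for _ in range(parts)]
    running = 0
    for c in ordered:
        # Index of the byte-count slice this connection starts in.
        idx = min(parts - 1, running * parts // total)
        groups[idx].append(c)
        running += c.nbytes
    return groups
```

The brute-force offset search costs one pass over the trace per candidate offset, which is acceptable for a sketch but would likely need a prefix-sum approach for a trace with millions of connections.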

tmix_utils script

Here is a Python script that was used to generate these traces:

tmix_utils.py

It can perform various manipulations of Tmix connection vector traces, and it can also display basic information about a trace.

The block and poisson resampling algorithms are likely buggy, as they have not been tested extensively. In particular, they will create connections with the same global-id and subset-id; since I don't know how Tmix works, I don't know whether this is a serious bug or not.

Run python tmix_utils.py --help for a description of the script parameters, or see the source code for more detailed information.

tmix_utils.py uses the Psyco specializing compiler, if it is available, to significantly speed up computation. The script works without Psyco, but runs much more slowly.
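
An optional dependency of this kind is commonly handled with a guarded import. The following is a minimal sketch of that idiom, assuming the script simply falls back to plain Python when Psyco is missing; the HAVE_PSYCO flag is a hypothetical name, and the actual code in tmix_utils.py may differ.

```python
# Psyco is an optional accelerator that only exists for old Python 2
# interpreters. If the import fails, continue unaccelerated.
try:
    import psyco
    psyco.full()       # ask Psyco to specialize all functions it can
    HAVE_PSYCO = True  # hypothetical flag, for illustration only
except ImportError:
    HAVE_PSYCO = False
```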

There is also a small script, plotrate.py, which generates a listing of the offered rate vs. time. The output of this script is suitable for use with gnuplot. The plotrate.py script is a bit more accurate than tmix_utils.py, as it takes into account the delay times in the connection vectors, whereas these are mostly ignored by tmix_utils.py. (It still does not take into account the time needed to transfer data; in effect, it assumes an infinite transfer rate.)
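
To illustrate what an offered-rate listing under the infinite-transfer-rate assumption looks like, here is a minimal sketch. It is not plotrate.py itself: the (time, bytes) event list, the function names and the 1-second bin width are all assumptions made for the example. Each data unit is counted in full at the instant it is sent.

```python
from collections import defaultdict


def offered_rate(events, bin_s=1.0):
    """Bin (time_s, nbytes) events and return [(bin_start_s, rate_mbps)].

    Every event is charged entirely to the bin containing its send time,
    i.e. data is assumed to transfer instantaneously.
    """
    bins = defaultdict(int)
    for t, nbytes in events:
        bins[int(t // bin_s)] += nbytes
    if not bins:
        return []
    last = max(bins)
    # Emit empty bins too, so the plot shows idle periods as zero rate.
    return [(i * bin_s, bins.get(i, 0) * 8 / bin_s / 1e6)
            for i in range(last + 1)]


def gnuplot_lines(rates):
    """Format the listing as 'time rate' lines, one data point per line."""
    return ["%g %g" % (t, r) for t, r in rates]
```

If the lines are written to a file, a two-column listing like this can be plotted in gnuplot with a command such as plot "rates.dat" with lines, where rates.dat is a placeholder file name.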

Trace statistics

This table shows the RTT range and average load of the nine traces:

| Trace | Min RTT (ms) | Max RTT (ms) | Rate (Mbps) | Rate for first 100 sec (Mbps) | Reverse Rate (Mbps) | Duration (seconds) | Number of connections |
|---|---|---|---|---|---|---|---|
| unc_20080110_1400_1hr-r4s1 | 0.182 | 10.166 | 41.6142 | 26.1552 | 3.18458 | 3499.84 | 455805 |
| unc_20080110_1400_1hr-r4s2 | 10.166 | 13.651 | 42.3947 | 27.0697 | 2.27467 | 3439.83 | 364212 |
| unc_20080110_1400_1hr-r4s3 | 13.651 | 20.271 | 41.4223 | 25.6275 | 3.1699 | 3379.83 | 480683 |
| unc_20080110_1400_1hr-r5s1 | 20.271 | 23.367 | 41.8169 | 102.182 | 3.12442 | 3319.82 | 323561 |
| unc_20080110_1400_1hr-r5s2 | 23.367 | 36.764 | 37.4286 | 31.2716 | 6.31383 | 3259.82 | 544653 |
| unc_20080110_1400_1hr-r5s3 | 36.764 | 52.194 | 38.6125 | 27.2360 | 6.2782 | 3199.84 | 456672 |
| unc_20080110_1400_1hr-r6s1 | 52.194 | 81.953 | 39.8383 | 42.5377 | 5.27905 | 3139.82 | 455387 |
| unc_20080110_1400_1hr-r6s2 | 81.954 | 100.355 | 39.7479 | 45.7803 | 4.84254 | 3079.84 | 445810 |
| unc_20080110_1400_1hr-r6s3 | 100.356 | 29787.7 | 33.6108 | 28.0837 | 10.5373 | 3019.84 | 577518 |
| original trace | 0.182 | 29787.7 | 357.085 | 355.942 | 46.0998 | 3499.84 | 4443454 |

More detailed statistics are included in the trace_info file in the tarball.

References

  1. M. C. Weigle, P. Adurthi, F. Hernandez-Campos, K. Jeffay, F. D. Smith, Tmix: a tool for generating realistic TCP application workloads in ns-2, ACM SIGCOMM Computer Communication Review, vol. 36, no. 3, pp. 65-76, July 2006.
  2. P. Adurthi, Generating Tmix-based TCP application workloads in ns-2 and GTNetS (M.S. thesis), 2006. [Online]. Available: http://www.cs.odu.edu/~mweigle/papers/adurthi-thesis06.pdf.
  3. F. Hernandez-Campos, Generation and validation of empirically-derived TCP application workloads (doctoral dissertation), 2006. [Online]. Available: http://www.cs.unc.edu/~fhernand/diss-html/.

Last updated 2008-06-23 11:28:53 PDT (-0700)