Traces
From this location you can download several traces, including anonymized packet headers (tcpdump/libpcap), NetFlow version 5 data, a labeled dataset for intrusion detection, and Dropbox traffic traces. More information on the data collection and on the anonymization procedures can be found below. When using these traces, please refer to the Acceptable Use policy.
Contents
- 1 A First Look at QNAME Minimization in the Domain Name System
- 2 Rolling with Confidence: Managing the Complexity of DNSSEC Operations
- 3 Passive Observations of a Large DNS Service: 2.5 Years in the Life of Google
- 4 Recursives in the Wild: Engineering Authoritative DNS Servers
- 5 Broad and Load-Aware Anycast Mapping with Verfploeter
- 6 Anycast vs. DDoS: The Nov 2015 Root DNS Event
- 7 Anycast Latency - How Many Sites Are Enough?
- 8 Booters - An analysis of DDoS-as-a-Service Attacks
- 9 DNSSEC and its Potential for DDoS Attacks
- 10 Cloud Storage
- 11 Intrusion Detection
- 12 Pcap Traces
- 13 NetFlow Traces
- 14 Software
- 15 Other Trace Sources
A First Look at QNAME Minimization in the Domain Name System
Here we list the various datasets (and accompanying scripts) that were used to produce the results in the paper titled "A First Look at QNAME Minimization in the Domain Name System", presented at PAM 2019, Puerto Varas, Chile.
Descriptions
Analysis 1
The file analysis_1.tar.xz contains all the scripts and data used to perform the long-term analysis of qname minimization adoption, as well as the algorithm for generating the qmin signatures.
Concretely, it contains the following files:
- analysis_atlas_longterm.ipynb - Jupyter Notebook with scripts to generate the plots based on the long term atlas measurements
- analysis_qnames_are_cool.ipynb - Jupyter Notebook with scripts to generate the signatures based on the "qnames-are-cool.nl" dataset.
- analysis_resolvers_lab.ipynb - Jupyter Notebook with scripts to generate the signatures for the various resolver implementations, from controlled experiments
The necessary datasets are contained in the qname_data folder.
Analysis 2
The file analysis_2.tar.xz contains all the scripts and data used to evaluate the performance of various resolvers.
The top level of the archive holds the resulting data and the result-processing scripts. The folder "pl7" holds all the experimental data:
- The raw files used as target domains for lookup
- The massdns source code and binary used to lookup the target domains against the various resolvers
- The source code and binaries of the various resolvers used
- The raw results produced by the measurement runs
Analysis 3
The file analysis_3.tar.xz contains all the scripts and data used to generate figures 3 and 4, based on Root DNS (letter K), and Entrada (.nl) data.
The scripts fig_3.py and fig_4.py generate figures 3 and 4 respectively.
Figure 3 is based on two data sets that show the number of queries that are sent by the resolvers minimized as well as the total number of queries.
Figure 4 is based on three data sets. The first contains the share of minimized queries to .nl from 2018-03-03 to 2017-08-18. The second and third contain the percentage of queries for each label length to the servers of K-Root during the DITL collection dates.
DnsThought
Some of the data contained in analysis 1 consists of preprocessed RIPE Atlas measurements. If needed, this preprocessing can be performed by downloading the dnsthought.tar.xz file and performing the steps outlined in the README. This data can also be explored online at https://dnsthought.nlnetlabs.nl/.
Datasets
Name | Used for | Link | Size |
---|---|---|---|
Analysis 1 | Fig 1, 2 - Table 2, 3 | [1] | 1G |
Analysis 2 | Table 2, 3 | [2] | 35G |
Analysis 3 | Fig 3, 4 | [3] | 246K |
DnsThought | Fig 1, 2 | [4] | 50K |
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Rolling with Confidence: Managing the Complexity of DNSSEC Operations
Below you find the measurement dataset used to monitor the DNSSEC Algorithm Rollover of the Swedish ccTLD in December 2017. Details about the rollover can be found here. Based on this rollover we developed a methodology to monitor DNSSEC key rollovers and a tool to automate the measurements. The tool can be found here; the related study is currently under review.
This dataset consists of two subsets. The first subset contains measurements carried out by RIPE Atlas; the raw data is publicly available on the Atlas website. It contains DNS queries for the DNSKEY, RRSIG and DS records of .se towards the authoritative name servers of .se, the root and the configured resolvers of the RIPE Atlas probes. In the table below you find the links to the source of the measurement data we used.
The second subset consists of measurements made with the Luminati VPN network. These measurements contain the processed results of resolver states. A resolver can be a validating resolver, a non-validating resolver or a failing resolver. For more details about using Luminati for network measurement, we refer the reader to the study by Chung et al.
RIPE Atlas Dataset
Monitoring Goal | Record | Target | Atlas Msm ID | Link to Atlas |
---|---|---|---|---|
Publication Delay | RRSIG | a.ns.se. | 10318217 | Atlas measurement link |
Publication Delay | RRSIG | b.ns.se. | 10318225 | Atlas measurement link |
Publication Delay | RRSIG | c.ns.se. | 10318233 | Atlas measurement link |
Publication Delay | RRSIG | d.ns.se. | 10318237 | Atlas measurement link |
Publication Delay | RRSIG | e.ns.se. | 10318238 | Atlas measurement link |
Publication Delay | RRSIG | f.ns.se. | 10318245 | Atlas measurement link |
Publication Delay | RRSIG | g.ns.se. | 10318248 | Atlas measurement link |
Publication Delay | RRSIG | i.ns.se. | 10318250 | Atlas measurement link |
Publication Delay | RRSIG | j.ns.se. | 10318254 | Atlas measurement link |
Publication Delay | RRSIG | x.ns.se. | 10318260 | Atlas measurement link |
Publication Delay | DNSKEY | a.ns.se. | 10333536 | Atlas measurement link |
Publication Delay | DNSKEY | b.ns.se. | 10318225 | Atlas measurement link |
Publication Delay | DNSKEY | c.ns.se. | 10333541 | Atlas measurement link |
Publication Delay | DNSKEY | d.ns.se. | 10333543 | Atlas measurement link |
Publication Delay | DNSKEY | e.ns.se. | 10333573 | Atlas measurement link |
Publication Delay | DNSKEY | f.ns.se. | 10333578 | Atlas measurement link |
Publication Delay | DNSKEY | g.ns.se. | 10333580 | Atlas measurement link |
Publication Delay | DNSKEY | i.ns.se. | 10333586 | Atlas measurement link |
Publication Delay | DNSKEY | j.ns.se. | 10333587 | Atlas measurement link |
Publication Delay | DNSKEY | x.ns.se. | 10333593 | Atlas measurement link |
Publication Delay | DS | a.root-servers.net. | 10417878 | Atlas measurement link |
Publication Delay | DS | b.root-servers.net. | 10417976 | Atlas measurement link |
Publication Delay | DS | c.root-servers.net. | 10417987 | Atlas measurement link |
Publication Delay | DS | d.root-servers.net. | 10417988 | Atlas measurement link |
Publication Delay | DS | e.root-servers.net. | 10417989 | Atlas measurement link |
Publication Delay | DS | f.root-servers.net. | 10418008 | Atlas measurement link |
Publication Delay | DS | g.root-servers.net. | 10418017 | Atlas measurement link |
Publication Delay | DS | h.root-servers.net. | 10418022 | Atlas measurement link |
Publication Delay | DS | i.root-servers.net. | 10418026 | Atlas measurement link |
Publication Delay | DS | j.root-servers.net. | 10418032 | Atlas measurement link |
Publication Delay | DS | k.root-servers.net. | 10418033 | Atlas measurement link |
Publication Delay | DS | l.root-servers.net. | 10418042 | Atlas measurement link |
Publication Delay | DS | m.root-servers.net. | 10418043 | Atlas measurement link |
Propagation Delay | RRSIG | Use probe's resolver | 10318273 | Atlas measurement link |
Propagation Delay | DNSKEY | Use probe's resolver | 10333530 | Atlas measurement link |
Propagation Delay | DS | Use probe's resolver | 10416497 | Atlas measurement link |
Trust Chain Bogus | - | bogus.d1a8n1.algorithm-rollover-U2ZKQBZU.se. | 10342213 | Atlas measurement link |
Trust Chain Secure | - | secure.d1a8n1.algorithm-rollover-U2ZKQBZU.se. | 10342233 | Atlas measurement link |
Trust Chain Bogus | - | bogus.d1a8n1.algorithm-rollover-PYO9DCYH.se. | 10342174 | Atlas measurement link |
Trust Chain Secure | - | secure.d1a8n1.algorithm-rollover-PYO9DCYH.se. | 10342181 | Atlas measurement link |
Trust Chain Bogus | - | bogus.d1a8n1.algorithm-rollover-IUHQB8DK.se. | 10342098 | Atlas measurement link |
Trust Chain Secure | - | secure.d1a8n1.algorithm-rollover-IUHQB8DK.se. | 10342129 | Atlas measurement link |
Trust Chain Bogus | - | bogus.d1a8n1.algorithm-rollover-B0U1E9ME.se. | 10342092 | Atlas measurement link |
Trust Chain Secure | - | secure.d1a8n1.algorithm-rollover-B0U1E9ME.se. | 10342095 | Atlas measurement link |
Trust Chain Bogus | - | bogus.d1a8n1.algorithm-rollover-4HSU0A0F.se. | 10340721 | Atlas measurement link |
Trust Chain Secure | - | secure.d1a8n1.algorithm-rollover-4HSU0A0F.se. | 10340727 | Atlas measurement link |
Luminati Dataset
The Luminati data sets consist of a raw JSON file and two processed files. The raw JSON file contains the list of resolvers that we measured and the DNSSEC policy that we identified for each of them; each key indicates the IP address, its AS number, ISP, and country code. The value contains four numbers: the ratios of measurements in which we classified the resolver as (1) not supporting DNSSEC, (2) supporting DNSSEC but not validating, or (3) validating, followed by (4) the total number of measurements to the resolver.
The processed CSVs contain, on an hourly and a daily basis, the number of observed resolvers that were validating, non-validating or did not support DNSSEC. They are obtained by processing the raw JSON file.
Name | Frequency | Start Date | End Date | File Type | Download | File Size |
---|---|---|---|---|---|---|
resolver-policy-hourly-basis.csv | Hourly | 2017-11-29 00:00 | 2017-12-20 20:00 | Processed CSV | link | 14K |
resolver-policy-daily-basis.csv | Daily | 2017-11-29 | 2017-12-20 | Processed CSV | link | 690B |
resolver-policy-all-accuracy.json | - | 2017-11-29 | 2017-12-20 | Raw JSON | link | 5M |
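As an illustration of how the raw JSON file described above might be consumed, the following minimal sketch labels each resolver with its dominant policy. The value order (three ratios followed by the total) follows the description above but has not been checked against the file; the key layout, the sample entry and the helper `dominant_policy` are hypothetical.

```python
import json

# Assumed value order, following the description above (not verified against the file).
POLICIES = ("non-dnssec", "non-validating", "validating")

def dominant_policy(value):
    """Return the policy with the highest ratio for one resolver entry."""
    ratios, total = value[:3], value[3]
    if total == 0:
        return None  # no measurements, so no policy can be assigned
    best = max(range(3), key=lambda i: ratios[i])
    return POLICIES[best]

# Made-up sample in the assumed layout; a real file would be read with json.load().
sample = json.loads('{"192.0.2.1|AS64500|ExampleISP|NL": [0.0, 0.1, 0.9, 20]}')
labels = {key: dominant_policy(value) for key, value in sample.items()}
print(labels)
```

Counting the labels per hour or per day would give figures comparable in spirit to the processed CSVs, although the exact aggregation used for those files is not documented here.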
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Passive Observations of a Large DNS Service: 2.5 Years in the Life of Google
Below you find the measurement dataset used to study the Google Public DNS. These datasets were used in the following paper:
- Passive Observations of a Large DNS Service: 2.5 Years in the Life of Google, by Wouter B. de Vries, Roland van Rijswijk-Deij, Pieter-Tjerk de Boer and Aiko Pras
This dataset is also available at [5].
Dataset
Start | Stop | Duration | Link | Size | Number of files |
---|---|---|---|---|---|
June 2015 | December 2017 | 2.5 Years | [6] | 106G | 1 |
Data format
The file above contains the dataset. Inside, the data is split up by day; each day consists of one or more gzipped CSVs. In total there are approximately 3.5 billion rows.
Field name | Format | Description |
---|---|---|
timestamp | unix timestamp | When query was received |
q_src | ip address | Source of query (this will always contain an IP belonging to Google) |
q_as | as number | Autonomous System Number corresponding to q_src |
q_geoip | country code (2 characters) | Country corresponding to q_src |
ecs_ip | ip address | Source of query according to EDNS0 Client Subnet (not included in dataset, for reference for following fields) |
ecs_ip_as | as number | Autonomous System Number corresponding to ecs_ip |
ecs_ip_geoip | country code (2 characters) | Country corresponding to ecs_ip |
qtype | query type int | Query type of query (e.g. 1 for A) |
google_dc | IATA airport code | Location of Google DC that answered the query |
qname_hash | sha256 | Salted hash of query name, salt is consistent over the dataset |
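The sketch below shows one way to read a daily file with the fields listed above and count queries per query type. It assumes comma-separated rows without a header line, with columns in the order of the table and ecs_ip left out; these are assumptions, so check them against an actual file first. The sample rows are made up.

```python
import csv
import gzip
import io
from collections import Counter

# Assumed column order: the fields of the table above, minus ecs_ip
# (which the table says is not included in the dataset).
FIELDS = ["timestamp", "q_src", "q_as", "q_geoip", "ecs_ip_as",
          "ecs_ip_geoip", "qtype", "google_dc", "qname_hash"]

def count_qtypes(fileobj):
    """Count rows per query type in one gzipped CSV (path or binary file object)."""
    counts = Counter()
    with gzip.open(fileobj, mode="rt", newline="") as handle:
        for row in csv.DictReader(handle, fieldnames=FIELDS):
            counts[int(row["qtype"])] += 1
    return counts

# Made-up two-row sample standing in for one daily file.
raw = ("1433116800,192.0.2.1,64500,US,64501,NL,1,ams,abc\n"
       "1433116801,192.0.2.2,64500,US,64501,NL,28,ams,def\n")
sample = io.BytesIO(gzip.compress(raw.encode()))
print(count_qtypes(sample))
```

Reading row by row keeps memory use flat, which matters with roughly 3.5 billion rows in total.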
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Recursives in the Wild: Engineering Authoritative DNS Servers
Below you find the measurement dataset used to study authoritative selection algorithms of recursives. These datasets were used in the following paper:
- [7], by Moritz Müller, Giovane C. M. Moura, Ricardo de O. Schmidt and John Heidemann. Technical Report ISI-TR-720, May 2017.
Dataset
Id | Measurement method | Start | Duration | Link | Size | Number of files | RIPE Atlas Measurement |
---|---|---|---|---|---|---|---|
2A-GRU | tcpdump at authoritative | 2017-03-23 | 71m | 2A.gru.20170323.121010.583247.pcap.gz | 11M | 1 | 7951948 |
2A-NRT | tcpdump at authoritative | 2017-03-23 | 71m | 2A.nrt.20170323.124427.847510.pcap.gz | 8M | 1 | 7951948 |
2B-DUB | tcpdump at authoritative | 2017-03-24 | 69m | 2B.dub.20170324.071658.103478.pcap.gz | 8M | 1 | 7953390 |
2B-FRA | tcpdump at authoritative | 2017-03-24 | 69m | 2B.fra.20170324.071736.808716.pcap.gz | 11M | 1 | 7953390 |
2C-FRA | tcpdump at authoritative | 2017-03-27 | 71m | 2C.fra.20170327.075729.pcap.gz | 14M | 1 | 7967380 |
2C-SYD | tcpdump at authoritative | 2017-03-27 | 71m | 2C.syd.20170327.075729.pcap.gz | 4M | 1 | 7967380 |
3A-GRU | tcpdump at authoritative | 2017-03-25 | 69m | 3A.gru.20170325.120358.pcap.gz | 8M | 1 | 7961003 |
3A-NRT | tcpdump at authoritative | 2017-03-25 | 69m | 3A.nrt.20170325.120357.pcap.gz | 6M | 1 | 7951948 |
3A-SYD | tcpdump at authoritative | 2017-03-25 | 69m | 3A.syd.20170325.120358.pcap.gz | 5M | 1 | 7961003 |
3B-DUB | tcpdump at authoritative | 2017-03-24 | 78m | 3B.dub.20170324.135442.pcap.gz | 6M | 1 | 7954122 |
3B-FRA | tcpdump at authoritative | 2017-03-24 | 78m | 3B.fra.20170324.135441.pcap.gz | 9M | 1 | 7954122 |
3B-IAD | tcpdump at authoritative | 2017-03-24 | 78m | 3B.iad.20170324.135441.pcap.gz | 5M | 1 | 7954122 |
4A-GRU | tcpdump at authoritative | 2017-03-26 | 138m | 4A.gru.20170326.165403.pcap.gz | 6M | 1 | 7966930 |
4A-NRT | tcpdump at authoritative | 2017-03-26 | 138m | 4A.nrt.20170326.165402.pcap.gz | 7M | 1 | 7966930 |
4A-SYD | tcpdump at authoritative | 2017-03-26 | 138m | 4A.syd.20170326.165403.pcap.gz | 6M | 1 | 7966930 |
4A-DUB | tcpdump at authoritative | 2017-03-26 | 138m | 4A.dub.20170326.165402.pcap.gz | 18M | 1 | 7966930 |
4B-DUB | tcpdump at authoritative | 2017-03-25 | 74m | 4B.dub.20170325.082824.pcap.gz | 4M | 1 | 7960323 |
4B-FRA | tcpdump at authoritative | 2017-03-25 | 74m | 4B.fra.20170325.082107.pcap.gz | 7M | 1 | 7960323 |
4B-IAD | tcpdump at authoritative | 2017-03-25 | 74m | 4B.iad.20170325.082823.pcap.gz | 5M | 1 | 7960323 |
4B-SFO | tcpdump at authoritative | 2017-03-25 | 74m | 4B.sfo.20170325.082824.pcap.gz | 3M | 1 | 7960323 |
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Broad and Load-Aware Anycast Mapping with Verfploeter
Below you find the measurement dataset used to study the catchment of anycast services. These datasets were used in the following paper:
- Verfploeter: Broad and Load-Aware Anycast Mapping, by Wouter B. de Vries, Ricardo de O. Schmidt, Wes Hardaker, John Heidemann, Pieter-Tjerk de Boer and Aiko Pras. Technical Report ISI-TR-719, May 2017.
Dataset
Id | Measurement method | Start | Duration | Link | Size | Number of files |
---|---|---|---|---|---|---|
STV-2-01 | Verfploeter | 2017-02-01 | 10m | icmp_measurement_20170201.csv.gz | 28M | 1 |
STA-2-01 | RIPE Atlas | 2017-02-01 | 10m | atlas_measurement_20170201.csv.gz | 50K | 1 |
STV-3-23 | Verfploeter | 2017-03-23 | 24h | icmp_measurement_20170323.csv.gz | 1.6G | 1 |
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Anycast vs. DDoS: The Nov 2015 Root DNS Event
Below you find the measurement dataset used to study the implications of the DDoS attacks from Nov/Dec 2015 towards the Root DNS. The study is documented in:
- Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event, in ACM Internet Measurement Conference (IMC) 2016, by Giovane C. M. Moura, Ricardo de O. Schmidt, John Heidemann, Wouter B. de Vries, Moritz Müller, Lan Wei and Cristian Hesselman.
- Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event, Technical Report ISI-TR-2016-709, USC/Information Sciences Institute, by Giovane C. M. Moura, Ricardo de O. Schmidt, John Heidemann, Wouter B. de Vries, Moritz Müller, Lan Wei and Cristian Hesselman. May 2016.
Details on the measurement methodology are found in Section 2 of the paper above.
This dataset consists of DNS CHAOS measurements towards all DNS Root Letters. These measurements are originally from RIPE Atlas and the raw data is publicly available on the Atlas website. In the table below you find the links to the source of the measurement data we used, and the processed dataset.
The steps of processing the raw measurement data were:
- Converted from JSON to CSV, removing fields we did not need for our study, and decoding the binary field (answer) to retrieve the hostname.bind for each measurement.
- Filtered out hijacked probes.
- Split each measurement file first per RCODE, then per probe ID, and also per site and server.
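The decoding step in the list above can be sketched as follows. This is not the script used for the dataset; it is a minimal illustration that assumes the binary answer field is a base64-encoded DNS response message and that the hostname.bind value is the first TXT record in the answer section. The helper names and the sample site name are hypothetical.

```python
import base64
import struct

def skip_name(msg, offset):
    """Return the offset just past a (possibly compressed) DNS name."""
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:   # compression pointer: two bytes, ends the name
            return offset + 2
        if length == 0:             # root label ends the name
            return offset + 1
        offset += 1 + length

def first_txt(abuf_b64):
    """Extract the RCODE and first TXT string from a base64-encoded DNS response."""
    msg = base64.b64decode(abuf_b64)
    _id, flags, qdcount, ancount = struct.unpack("!HHHH", msg[:8])
    rcode = flags & 0x000F
    offset = 12                               # fixed DNS header size
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4   # skip QTYPE and QCLASS
    for _ in range(ancount):
        offset = skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[offset:offset + 10])
        offset += 10
        if rtype == 16:                       # TXT record
            txt_len = msg[offset]
            return rcode, msg[offset + 1:offset + 1 + txt_len].decode("ascii", "replace")
        offset += rdlength
    return rcode, None

# Hand-built response for "hostname.bind TXT CH"; the site name is made up.
txt = b"s1.example"
msg = (struct.pack("!HHHHHH", 0x1234, 0x8400, 1, 1, 0, 0)
       + b"\x08hostname\x04bind\x00" + struct.pack("!HH", 16, 3)
       + b"\xc0\x0c" + struct.pack("!HHIH", 16, 3, 0, 1 + len(txt))
       + bytes([len(txt)]) + txt)
print(first_txt(base64.b64encode(msg)))
```

The returned RCODE is the value on which the measurement files are split first; the TXT string identifies the site and server that answered.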
Dataset
Target | Atlas Msm ID | Link to Atlas | Processed CSV file | File size |
---|---|---|---|---|
A-Root | 10309 | Atlas measurement link | A-Root processed (gz) | 14M |
B-Root | 10310 | Atlas measurement link | B-Root processed (gz) | 101M |
C-Root | 10311 | Atlas measurement link | C-Root processed (gz) | 106M |
D-Root | 10312 | Atlas measurement link | D-Root processed (gz) | 111M |
E-Root | 10313 | Atlas measurement link | E-Root processed (gz) | 111M |
F-Root | 10304 | Atlas measurement link | F-Root processed (gz) | 109M |
G-Root | 10314 | Atlas measurement link | G-Root processed (gz) | 104M |
H-Root | 10315 | Atlas measurement link | H-Root processed (gz) | 103M |
I-Root | 10305 | Atlas measurement link | I-Root processed (gz) | 106M |
J-Root | 10316 | Atlas measurement link | J-Root processed (gz) | 108M |
K-Root | 10301 | Atlas measurement link | K-Root processed (gz) | 109M |
L-Root | 10308 | Atlas measurement link | L-Root processed (gz) | 114M |
M-Root | 10306 | Atlas measurement link | M-Root processed (gz) | 105M |
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Anycast Latency - How Many Sites Are Enough?
Below you find the measurement dataset used to study the relationship between IP anycast and latency documented in:
- Anycast Latency: How Many Sites Are Enough?, in Passive and Active Measurements (PAM) conference, by Ricardo de O. Schmidt, John Heidemann and Jan Harm Kuipers. 2017.
- Anycast Latency: How Many Sites Are Enough?, Technical Report ISI-TR-2016-708, USC/Information Sciences Institute, by Ricardo de O. Schmidt, John Heidemann and Jan Harm Kuipers. May 2016.
Details on the measurement methodology are found in Section 2 of the paper above.
The dataset consists of DNS CHAOS queries from Atlas probes to anycast services of C- and K-Root name servers, and of ICMP ECHO requests (ping) to every single instance of the C- and K-Root anycast infrastructures. The whole measurement is divided into multiple batches. Below you find links to directly download JSON files containing the measurement data. Our measurements are also publicly available on the RIPE Atlas platform, and here you find a list of all measurement IDs in plain text.
Dataset
Target | Measurement | Link | Size | Number of files |
---|---|---|---|---|
C-Root | DNS CHAOS query | C-Root-CHAOS.tar.gz | 1.2M | 1 |
C-Root | ICMP request (ping) | C-Root-ping.tar.gz | 33M | 8 |
F-Root | DNS CHAOS query | F-Root-CHAOS.tar.gz | 58M | 1 |
F-Root | ICMP request (ping) | F-Root-ping.tar.gz | 222M | 918 |
K-Root* | DNS CHAOS query | K-Root-CHAOS.tar.gz | 1.2M | 17 |
K-Root* | ICMP request (ping) | K-Root-ping.tar.gz | 368M | 508 |
L-Root | DNS CHAOS query | L-Root-CHAOS.tar.gz | 281K | 17 |
L-Root | ICMP request (ping) | L-Root-ping.tar.gz | 504M | 2074 |
NK-Root** | DNS CHAOS query | NK-Root-CHAOS.tar.gz | 14M | 3 |
NK-Root** | ICMP request (ping) | NK-Root-ping.tar.gz | 31M | 420 |
*Measurement done when K-Root consisted of a mix of global and local anycast sites. **Measurement done when K-Root consisted of 35 global and one local anycast site.
PS #1: Of the traces in the table above, only the CHAOS measurements are available at this moment. The ICMP measurements (ping) will be made available as soon as our paper is guaranteed official publication.
PS #2: Ping datasets are available on request. If you are interested in the dataset, please drop an email to r.schmidt -at- utwente.nl
Acceptable use
The use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper when partially or fully using our dataset.
Booters - An analysis of DDoS-as-a-Service Attacks
Below you can find the data sets presented in:
- "Booters - An analysis of DDoS-as-a-Service Attacks" by José Jair Santanna, Roland van Rijswijk-Deij, Anna Sperotto, Rick Hofstede, Mark Wierbosch, Lisandro Zambenedetti Granville, and Aiko Pras. In Proceedings of 14th IFIP/IEEE Symposium on Integrated Network and Service Management (IM), May 11-15 2015, Ottawa, Canada. (Alternative link here.)
Information about how we performed our measurements and the characteristics of our network infrastructure can be found in Section II (Methodology) of the paper cited above, specifically Sections II-B (Measurements) and II-C (Compensating DDoS attack traffic).
Datasets for Booter attacks
Description | Filename | File size | Attack type | Attack average | Attack sources |
---|---|---|---|---|---|
Booter 1 | anon-Booter1.pcap.gz | 1.6G | DNS-based | 700Mbps | 4486 |
Booter 2 | anon-Booter2.pcap.gz | 818M | DNS-based | 250Mbps | 78 |
Booter 3 | anon-Booter3.pcap.gz | 1.1G | DNS-based | 330Mbps | 54 |
Booter 4 | anon-Booter4.pcap.gz | 5.5G | DNS-based | 1.19Gbps | 2970 |
Booter 5 | anon-Booter5.pcap.gz | 60M | DNS-based | 6Mbps | 8281 |
Booter 6 | anon-Booter6.pcap.gz | 1.4M | DNS-based | 150Mbps | 7379 |
Booter 7 | anon-Booter7.pcap.gz | 2.4M | DNS-based | 320Mbps | 6075 |
Booter 8 | anon-Booter8.pcap.gz | 197M | CharGen-based | 990Mbps | 281 |
Booter 9 | anon-Booter9.pcap.gz | 465M | CharGen-based | 5.48Gbps | 3779 |
Acceptable use
Use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper:
@inproceedings{santannajjIM2015,
  author={Santanna, J.J. and van Rijswijk-Deij, R. and Hofstede, R. and Sperotto, A. and Wierbosch, M. and Zambenedetti Granville, L. and Pras, A.},
  booktitle={IFIP/IEEE International Symposium on Integrated Network Management (IM)},
  title={Booters - An analysis of DDoS-as-a-service attacks},
  year={2015},
  month={May},
  pages={243-251},
  doi={10.1109/INM.2015.7140298}
}
DNSSEC and its Potential for DDoS Attacks
Introduction
Below you can find the data sets presented in:
- "DNSSEC and its Potential for DDoS Attacks" by Roland van Rijswijk-Deij, Anna Sperotto and Aiko Pras. In Proceedings of the 14th ACM Internet Measurement Conference (IMC 2014), November 5-7 2014, Vancouver, BC, Canada.
A technical report describing the data sets and outlining the acceptable use of the data sets can be downloaded here:
- "Large-scale DNS and DNSSEC data sets for network security research" by Roland van Rijswijk-Deij, Anna Sperotto and Aiko Pras. Technical Report, University of Twente, 2014.
Datasets for DNSSEC-signed domains
Top-level domain | Filename | File size | SHA256 hash |
---|---|---|---|
.com | com.dnssec.db.gz | 841M | c69d8bd680825bac272e97b0575a07e90ccbe4ffb2492e13edca4781fb574b7d |
.net | net.dnssec.db.gz | 174M | e6c2b600c895a30b90fb1dc126a0ae55b28d0d6e0378164f4bca12b0b259bfa9 |
.org | org.dnssec.db.gz | 113M | aff93cb57405d9131567e9d4b687dbd86067c8c162ab28528be3b09ca3f11a08 |
.nl | nl.dnssec.db.gz | 3.8G | 740ea3b30c23992ee00a9d595982f0635eb67b12c13c4338b4e39afd72ecfeaa |
.se | se.dnssec.db.gz | 601M | af2d4a8b1503d021f9151136f39eb843809684de26d64ab3e4a47594953fd4df |
.uk | uk.dnssec.db.gz | 25M | dda8f4320d7dd8ce44d52bf4d59c30b2dc043fca01cefec06a8922494acb1e93 |
Datasets for regular domains
Top-level domain | Filename | File size | SHA256 hash |
---|---|---|---|
.com | com.non-dnssec.db.gz | 1023M | ce3b98e524f4f3589ed4e9a746bf88314a5c3e0815193e21ef98f286d5f787fb |
.net | net.non-dnssec.db.gz | 209M | 47679adee3eb0b8228e4fae98596db64b605aef5ee35324c228e8531a86d2f45 |
.org | org.non-dnssec.db.gz | 209M | 0d57968b4734501fba52bca310bbf2e76684a82fe86b393e1dca1a97f4d55758 |
.nl | nl.non-dnssec.db.gz | 1.9G | b07548f8d7ea2d3baf28971d2ec89f5fcf5235c3cee5b3543a159079e2e208a6 |
.se | se.non-dnssec.db.gz | 605M | e2194c2ecedbd962cc5a51411edee9569061228693afc86881f57cda29d300cd |
.uk | uk.non-dnssec.db.gz | 39M | 8809d590c26fb25c903c3ce037c5cc9ec1d28e7a5e6ef90d13b464cd9f6040f4 |
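The SHA256 hashes listed for the DNSSEC-signed and regular domain datasets can be used to check a download for corruption. A minimal sketch using only the Python standard library follows; the function names are illustrative, and the file path is whatever location you saved the archive to.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """Compare a file's SHA256 with the hash copied from the table."""
    return sha256_of(path) == expected_hex.strip().lower()

# Example call, assuming the archive was saved in the current directory:
# matches("uk.dnssec.db.gz",
#         "dda8f4320d7dd8ce44d52bf4d59c30b2dc043fca01cefec06a8922494acb1e93")
```

A result of `False` most likely indicates an incomplete or corrupted download, in which case the file should be fetched again.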
Acceptable use
Use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license".
Please make sure to cite either our IMC 2014 paper:
@inproceedings{IMC2014,
  address = {Vancouver, BC, Canada},
  author = {van Rijswijk-Deij, Roland and Sperotto, Anna and Pras, Aiko},
  booktitle = {Proceedings of the Internet Measurement Conference 2014},
  doi = {"http://dx.doi.org/10.1145/2663716.2663731"},
  publisher = {ACM Press},
  title = { {DNSSEC and its potential for DDoS attacks - a comprehensive measurement study} },
  year = {2014}
}
Or cite the technical report that describes the datasets and accompanies the IMC 2014 paper:
@techreport{dnssec-techrep-2014,
  address = {Enschede, The Netherlands},
  author = {van Rijswijk-Deij, Roland and Sperotto, Anna and Pras, Aiko},
  institution = {University of Twente},
  title = { {Large-scale DNS and DNSSEC data sets for network security research} },
  url = {"http://www.simpleweb.org/w/images/0/04/Techreport.pdf"},
  year = {2014}
}
Cloud Storage
Benchmarks
You can download from this link the software and data presented in:
- "Benchmarking Personal Cloud Storage" by Idilio Drago, Enrico Bocchi, Marco Mellia, Herman Slatman and Aiko Pras. In Proceedings of the 13th ACM Internet Measurement Conference. IMC 2013.
Dropbox User Files
In this experiment we collected basic statistics of what files are stored in Dropbox folders.
Download our datasets:
Name | File Size | Volunteers |
---|---|---|
Crawler Dataset | 219M | 333 |
Some results derived from these data can be found here.
Dropbox Traffic Traces
You can download from this page the flow data used in the following paper:
Check here for more details. Several scripts used to process the data are also available here.
First Data Capture
These datasets were captured from March 24, 2012 to May 5, 2012.
Name | File Size | Flows | Devices |
---|---|---|---|
Campus 1 | 21M | 167,189 | 283 |
Campus 2 | 262M | 1,902,824 | 6,609 |
Home 1 | 181M | 1,438,369 | 3,350 |
Home 2 | 82M | 693,086 | 1,313 |
Second Data Capture
This dataset was captured from June 01, 2012 to July 31, 2012.
Name | File Size | Flows | Devices |
---|---|---|---|
Campus 1 | 32M | 264,131 | 270 |
Intrusion Detection
SSH datasets
The SSH datasets feature a unique combination of flow data (exported using NetFlow) and authentication log files, allowing for validation of any flow-based intrusion detection system. More information on the datasets can be found here. Citing the paper accompanying the SSH datasets is required when using the datasets:
SSH Compromise Detection using NetFlow/IPFIX
Rick Hofstede, Luuk Hendriks, Anna Sperotto, Aiko Pras. In: ACM SIGCOMM Computer Communication Review, Vol. 44, No. 5, 2014, ISSN 0146-4833, pp. 20-26.
Labeled Dataset for Intrusion Detection
In this scenario, a honeypot (running in a virtual machine) ran for 6 days, from Tuesday 23 September 2008 12:40:00 GMT to Monday 29 September 2008 22:40:00 GMT. The honeypot was hosted in the University of Twente network and directly connected to the Internet. The monitoring window covers both working days and weekend days. The data collection resulted in a 24 GB dump file containing 155.2 million packets. The processing of the dumped data and logs, collected over a period of 6 days, resulted in 14.2M flows and 7.6M alerts. More information on the labeling procedure can be found here.
Pcap Traces
These datasets are a collection of anonymized packet headers (tcpdump/libpcap) and NetFlow data collected from various locations in the Netherlands. More information on the data collection and anonymization procedures can be found here. Below you can find a short description of the scenarios in which the datasets were collected.
Trace 1 - Packet Headers
In scenario 1, the 300 Mbit/s (a trunk of 3 x 100 Mbit/s) ethernet link has been measured, which connects a residential network of a university to the core network of this university. On the residential network, about 2000 students are connected, each having a 100 Mbit/s ethernet access link. The residential network itself consists of 100 and 300 Mbit/s links to the various switches, depending on the aggregation level. The measured link has an average load of about 60%. Measurements have taken place in July 2002.
Trace 2 - Packet Headers
In the second scenario, the 1 Gbit/s ethernet link connecting a research institute to the Dutch academic and research network has been measured. There are about 200 researchers and support staff working at this institute. They all have a 100 Mbit/s access link, and the core network of the institute consists of 1 Gbit/s links. The measured link is only mildly loaded, usually around 1%. The measurements are from May - August 2003.
Trace 3 - Packet Headers
This dataset was collected in a large college. Their 1 Gbit/s link (i.e., the link that has been measured) to the Dutch academic and research network carries traffic for over 1000 students and staff concurrently, during busy hours. The access link speed on this network is, in general, 100 Mbit/s. The average load on the 1 Gbit/s link usually is around 10-15%. These measurements have been done from September - December 2003.
Trace 4 - Packet Headers
In scenario 4, the 1 Gbit/s aggregated uplink of an ADSL access network has been monitored. A couple of hundred ADSL customers, mostly student dorms, are connected to this access network. Access link speeds vary from 256 kbit/s (down and up) to 8 Mbit/s (down) and 1 Mbit/s (up). The average load on the aggregated uplink is around 150 Mbit/s. These measurements are from February - July 2004.
Trace 5 - Packet Headers
The dataset Packet Headers 5 was collected in a hosting-provider, i.e. a commercial party that offers floor- and rack-space to clients who want to connect, for example, their WWW-servers to the Internet. At this hosting-provider, these servers are connected at (in most cases) 100 Mbit/s to the core network of the provider. The bandwidth capacity level of this hosting-provider’s uplink (that we have measured) is around 50 Mbit/s. These measurements are from December 2003 - February 2004.
Trace 6 - Packet Headers
In scenario 6, a 100 Mbit/s Ethernet link connecting an educational organization to the Internet has been measured. This is a relatively small organization with around 35 employees and a little over 100 students working and studying at this site (the headquarters of this organization). All workstations at this location (100 in total) have a 100 Mbit/s LAN connection. The core network consists of a 1 Gbit/s connection. The recordings took place between the external optical fiber modem and the first firewall. The measured link was only mildly loaded during this period. These measurements are from May - June 2007.
NetFlow Traces
Trace 7 - NetFlow Data
The NetFlow version 5 data was recorded at the access router connecting a university to its ISP. It contains flow information about most of the university's incoming and outgoing traffic and some internal traffic as well. The traces cover a period of two working days, namely between Wednesday August 1st 2007, 00:00 and Thursday August 2nd 2007, 23:59. The university has a /16 network providing connectivity to the employees and the students in its buildings and on the campus. The university is connected to its ISP through a 10 Gbps optical link with an average load of 650 Mbps and peaks up to 1.0 Gbps.
Please note that this trace consists of NetFlow datagrams, collected between flow exporter and flow collector. As such, to obtain the raw flow data, the trace should be imported into or replayed to a flow collector, such as nfcapd. For more information on how this works, we refer to the following tutorial on flow monitoring.
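To illustrate what a flow collector sees when such a trace is replayed, the sketch below parses the 24-byte header of a single NetFlow version 5 export datagram. It assumes the UDP payload has already been extracted from the capture, and it is not a replacement for a collector such as nfcapd. The function name and the sample values are made up.

```python
import struct

# NetFlow v5 export datagram: 24-byte header followed by `count` 48-byte flow records.
HEADER = struct.Struct("!HHIIIIBBH")
RECORD_SIZE = 48

def parse_v5_header(datagram):
    """Validate one NetFlow v5 datagram and return a few header fields."""
    (version, count, sys_uptime_ms, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id, sampling) = HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version={version})")
    expected = HEADER.size + count * RECORD_SIZE
    if len(datagram) < expected:
        raise ValueError("datagram shorter than its header claims")
    return {"count": count, "unix_secs": unix_secs, "flow_sequence": flow_sequence}

# Made-up datagram: a header announcing two records, followed by zeroed records.
sample = HEADER.pack(5, 2, 1000, 1185926400, 0, 42, 0, 0, 0) + bytes(2 * RECORD_SIZE)
print(parse_v5_header(sample))
```

Checking the version and the announced record count before decoding is a cheap way to skip packets in the capture that are not NetFlow exports.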
Software
Some analysis software is described in this PDF document and can be downloaded from here.
The source code of the application based on the AnonTool API, used to anonymize the NetFlow data, can be found here.