Framework
We collect DNS, URL, and Internet address threat intelligence data from four sources and enrich it with metadata, including data origin tags, to create composite records. Composite records are the basis for our analytics, measurements, and reporting.
We warehouse the source records for each feed that we use in their original format. For each identifier (URL, domain name, IP address, ASN) listed in the source records, we collect additional metadata relevant to our intended analyses and create our composite records from these. Each composite record contains a data origin tag that identifies the source and, with it, the licensing arrangement associated with that source. When we intend to conduct studies or investigations, we process the composite record set to eliminate duplicate, erroneous, or incomplete data and to further enrich the metadata to facilitate analysis. We then apply subject matter expertise (ours and that of the associates with whom we publish reports) to the aggregated data sets that we derive from these processes.
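The processing steps described above can be illustrated with a minimal Python sketch. It is not our production code: the record fields, the feed names, the `tld_metadata` enrichment function, and the choice of deduplication key are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable

# Identifier types named in the text; the string labels are hypothetical.
VALID_TYPES = {"url", "domain", "ip", "asn"}


@dataclass(frozen=True)
class SourceRecord:
    """One entry from a feed (hypothetical field layout)."""
    identifier: str
    id_type: str
    threat: str
    feed: str


@dataclass
class CompositeRecord:
    """Identifier plus collected metadata and a data origin tag."""
    identifier: str
    id_type: str
    threat: str
    origin_tag: str
    metadata: dict = field(default_factory=dict)


def normalize(identifier: str, id_type: str) -> str:
    """Trim whitespace; fold case only for domain names, since URL paths can be case-sensitive."""
    value = identifier.strip()
    if id_type == "domain":
        value = value.lower().rstrip(".")
    return value


def to_composite(src: SourceRecord, enrich: Callable[[str, str], dict]) -> CompositeRecord:
    """Build a composite record, tagging it with the feed it came from."""
    ident = normalize(src.identifier, src.id_type)
    return CompositeRecord(ident, src.id_type, src.threat,
                           origin_tag=src.feed, metadata=enrich(ident, src.id_type))


def clean(records: Iterable[CompositeRecord]) -> list:
    """Drop incomplete records and exact duplicates within the same origin."""
    seen, kept = set(), []
    for rec in records:
        if not rec.identifier or rec.id_type not in VALID_TYPES:
            continue  # incomplete or malformed record
        # Assumption: the origin tag is part of the key, so provenance is never merged away.
        key = (rec.identifier, rec.id_type, rec.threat, rec.origin_tag)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


def tld_metadata(identifier: str, id_type: str) -> dict:
    """Hypothetical enrichment step: record the TLD of a domain name."""
    if id_type == "domain" and "." in identifier:
        return {"tld": identifier.rsplit(".", 1)[-1]}
    return {}


# Made-up sample input, not real feed data.
sample = [
    SourceRecord("Example.COM.", "domain", "phishing", "feed-a"),
    SourceRecord("example.com", "domain", "phishing", "feed-a"),
    SourceRecord("", "url", "malware", "feed-b"),
    SourceRecord("example.com", "domain", "phishing", "feed-b"),
]
composites = clean(to_composite(r, tld_metadata) for r in sample)
```

Keeping the origin tag in the deduplication key means the same identifier reported by two feeds survives as two composite records, so the licensing arrangement for each source remains traceable. A study that counts unique identifiers would collapse these in a later aggregation step.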
The findings and recommendations that we compile from these analyses are published in yearly phishing and malware studies. We now have sufficient historical data to include year-over-year analyses in these studies.
We publish quarterly updates of phishing, malware, and spam activity. These provide key statistics and rankings of the TLDs, gTLD registrars, and hosting networks (ASNs) that are most misused or exploited by cybercriminals.
Aggregate data sets (records) associated with the charts or tables published in these reports are also made publicly available in our repository.
We routinely blog to report events in a timely manner.