My research focuses on empirical systems and network security. I am joining the Stanford Computer Science Department as an assistant professor in Fall 2018. I lead the ZMap Project and am also CTO and co-founder of Censys.
I received my Ph.D. in computer science from the University of Michigan in 2017 and B.S. in mathematics and computer science from the University of Iowa in 2011.
The SMTP protocol is responsible for carrying some of users’ most intimate communication, but like other Internet protocols, authentication and confidentiality were added only as an afterthought. In this work, we present the first report on global adoption rates of SMTP security extensions, including StartTLS, SPF, DKIM, and DMARC. We present data from two perspectives: SMTP server configurations for the Alexa Top Million domains, and over a year of SMTP connections to and from Gmail. We find that the top mail providers (e.g., Gmail, Yahoo, Outlook) all proactively encrypt and authenticate messages. However, these best practices have yet to reach widespread adoption in a long tail of over 700,000 SMTP servers, of which only 35% successfully configure encryption and 1.1% specify a DMARC authentication policy. This security patchwork—paired with SMTP policies that favor failing open to allow gradual deployment—exposes users to attackers who downgrade TLS connections in favor of cleartext and who falsify MX records to reroute messages. We present evidence of such attacks in the wild, highlighting seven countries where more than 20% of inbound Gmail messages arrive in cleartext due to network attackers.
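To make the DMARC measurement concrete: a domain publishes its policy as a DNS TXT record at `_dmarc.<domain>`, written as semicolon-separated `tag=value` pairs. The following is a minimal sketch of reading the policy from such a record. It is not the tooling used in the study; the function names and the example record are hypothetical, and the DNS lookup itself is left out.

```python
def parse_dmarc(txt_record):
    """Split a DMARC TXT record into a {tag: value} dict."""
    tags = {}
    for part in txt_record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


def dmarc_policy(txt_record):
    """Return the requested policy (none, quarantine, or reject), or None if absent."""
    policy = parse_dmarc(txt_record).get("p")
    return policy.lower() if policy is not None else None


# Hypothetical record, as it might be published at _dmarc.example.com
record = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"
policy = dmarc_policy(record)
```

A domain with no such record, or with `p=none`, gives receiving servers no instruction to reject mail that fails authentication.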
Fast Internet-wide scanning has opened new avenues for security research, ranging from uncovering widespread vulnerabilities in random number generators to tracking the evolving impact of Heartbleed. However, this technique still requires significant effort: even simple questions, such as, “What models of embedded devices prefer CBC ciphers?”, require developing an application scanner, manually identifying and tagging devices, negotiating with network administrators, and responding to abuse complaints. In this paper, we introduce Censys, a public search engine and data processing facility backed by data collected from ongoing Internet-wide scans. Designed to help researchers answer security-related questions, Censys supports full-text searches on protocol banners and querying a wide range of derived fields (e.g., 443.https.cipher). It can identify specific vulnerable devices and networks and generate statistical reports on broad usage patterns and trends. Censys returns these results in sub-second time, dramatically reducing the effort of understanding the hosts that comprise the Internet. We present the search engine architecture and experimentally evaluate its performance. We also explore Censys’s applications and show how questions asked in recent studies become simple to answer.
We investigate the security of Diffie-Hellman key exchange as used in popular Internet protocols and find it to be less secure than widely believed. First, we present a novel flaw in TLS that allows a man-in-the-middle to downgrade connections to “export-grade” Diffie-Hellman. To carry out this attack, we implement the number field sieve discrete log algorithm. After a week-long precomputation for a specified 512-bit group, we can compute arbitrary discrete logs in this group in minutes. We find that 82% of vulnerable servers use a single 512-bit group, allowing us to compromise connections to 7% of Alexa Top Million HTTPS sites. In response, major browsers are being changed to reject short groups. We go on to consider Diffie-Hellman with 768- and 1024-bit groups. A small number of fixed or standardized groups are in use by millions of TLS, SSH, and VPN servers. Performing precomputations on a few of these groups would allow a passive eavesdropper to decrypt a large fraction of Internet traffic. In the 1024-bit case, we estimate that such computations are plausible given nation-state resources, and a close reading of published NSA leaks shows that the agency’s attacks on VPNs are consistent with having achieved such a break. We conclude that moving to stronger key exchange methods should be a priority for the Internet community.
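The core risk can be illustrated at toy scale: once discrete logs are feasible in a group, the public values alone reveal the shared secret. The sketch below uses baby-step giant-step, a far simpler algorithm than the number field sieve used in the paper, on a 31-bit prime. The prime, generator, and exponents are illustrative assumptions, not parameters from the study.

```python
from math import isqrt


def discrete_log(g, h, p):
    """Return x with pow(g, x, p) == h using baby-step giant-step, or None."""
    m = isqrt(p - 1) + 1
    # Baby steps: remember the smallest j for each value g^j mod p, 0 <= j < m.
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % p
    # Giant steps: multiply h by g^(-m) until it lands on a stored baby step.
    factor = pow(g, -m, p)  # modular inverse power, Python 3.8+
    gamma = h % p
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None


# Toy parameters (assumed for illustration): p = 2^31 - 1 is prime.
p, g = 2_147_483_647, 7
a, b = 123_456_789, 987_654_321      # private exponents of the two parties
A, B = pow(g, a, p), pow(g, b, p)    # public values seen by an eavesdropper

x = discrete_log(g, A, p)            # eavesdropper recovers a working exponent
recovered_secret = pow(B, x, p)      # same value both parties derive
```

Real 512-bit groups are far beyond this method, which is why the attack relies on a per-group number field sieve precomputation. The structure is the same, though: the expensive work depends only on the group, so a widely shared group amortizes it across every connection.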
On Tuesday, March 3, 2015, researchers announced a new SSL/TLS vulnerability called the FREAK attack. It allows an attacker to intercept HTTPS connections between vulnerable clients and servers and force them to use weakened encryption, which the attacker can break to steal or manipulate sensitive data. We are tracking the impact of the attack and helping users test whether they’re vulnerable.
Probing the Whole Internet for Weak Spots: Rapidly scanning the Internet has become vital to efforts to keep it secure.
19.5% of HTTPS-enabled sites in Alexa's Top 1 Million trigger or will trigger a Chrome security warning because they use the now deprecated SHA-1 signature algorithm to sign their HTTPS certificates. Soon those sites will be flagged by all major browsers as insecure.
The Internet-Wide Scan Data Repository is a public multi-institutional archive of research data collected through active scans of the Internet that I am leading. The repository was founded as a collaboration between the University of Michigan and Rapid7 and currently hosts several terabytes of data including our regular scans of the HTTPS ecosystem, copies of the root HTTP pages, comprehensive reverse DNS lookups, and banner grabs from dozens of other protocols.
We are tracking the patching of the Heartbleed vulnerability via regular comprehensive ZMap scans of the IPv4 address space and by monitoring the Alexa Top 1 Million most popular websites.
A guide to parsing and validating X.509 digital certificates using OpenSSL based on our experiences performing scans of the HTTPS ecosystem.
ZMap is an open-source network scanner that enables researchers to easily perform Internet-wide network studies. With a single machine and a well-provisioned network uplink, ZMap is capable of comprehensively scanning the IPv4 address space in under 45 minutes.
I will be presenting my most recent work at USENIX Security '13 in Washington, D.C. on August 16, 2013. Abstract: Internet-wide network scanning has numerous security applications, including exposing new vulnerabilities and tracking the adoption of defensive mechanisms, but probing the entire public address space with existing tools is both difficult and slow. We introduce ZMap, a modular, open-source network scanner specifically architected to perform Internet-wide scans and capable of surveying the entire IPv4 address space in under 45 minutes from user space on a single machine, approaching the theoretical maximum speed of gigabit Ethernet. We present the scanner architecture, experimentally characterize its performance and accuracy, and explore the security implications of high-speed Internet-scale network surveys, both offensive and defensive. We also discuss best practices for good Internet citizenship when performing Internet-wide surveys, informed by our own experiences conducting a long-term research survey over the past year.
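A back-of-the-envelope calculation shows why 45 minutes is close to the gigabit Ethernet limit. The sketch below assumes one minimum-size Ethernet frame per probed address (64 bytes plus 8 bytes of preamble and a 12-byte inter-frame gap on the wire) and subtracts a short list of well-known reserved IPv4 blocks. That list is an illustrative assumption, not ZMap's actual exclusion list.

```python
import ipaddress

LINK_BPS = 1_000_000_000           # gigabit Ethernet line rate
WIRE_BYTES_PER_PROBE = 64 + 8 + 12  # min frame + preamble/SFD + inter-frame gap

# Well-known reserved or non-routable IPv4 blocks (assumed, partial list).
RESERVED_BLOCKS = [
    "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
    "169.254.0.0/16", "172.16.0.0/12", "192.168.0.0/16",
    "224.0.0.0/4", "240.0.0.0/4",
]


def scan_minutes(num_addresses, link_bps=LINK_BPS, wire_bytes=WIRE_BYTES_PER_PROBE):
    """Minutes needed to send one probe per address at full line rate."""
    packets_per_second = link_bps / (wire_bytes * 8)
    return num_addresses / packets_per_second / 60


reserved = sum(ipaddress.ip_network(block).num_addresses for block in RESERVED_BLOCKS)
public_addresses = 2**32 - reserved

full_minutes = scan_minutes(2**32)
public_minutes = scan_minutes(public_addresses)
```

Under these assumptions the line rate is about 1.49 million probes per second, so all 2^32 addresses would take about 48 minutes and the roughly 3.7 billion non-reserved addresses just over 41 minutes. A scan that finishes in under 45 minutes therefore leaves little headroom.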
A parsed version of the published Microsoft cryptographic Object IDs and code to import into OpenSSL.
We are excited to be releasing our full scientific study on weak keys, "Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices," which will appear at USENIX Security 2012. In the study, we find that at least 5.57% of TLS hosts and 9.60% of SSH hosts use repeated keys in an apparently vulnerable manner. Even more alarmingly, we are able to obtain RSA private keys for 0.50% of TLS hosts and 0.03% of SSH hosts, because their public keys shared nontrivial common factors due to entropy problems, and DSA private keys for 1.03% of SSH hosts, because of insufficient signature randomness.
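The shared-factor weakness is easy to demonstrate: if two RSA moduli share a prime, a single GCD computation factors both, and the private exponent follows directly. The sketch below uses small Mersenne primes as stand-ins for real key material. The function name and all values are illustrative assumptions, and it checks one pair of moduli rather than using the study's method for processing millions of keys.

```python
from math import gcd


def recover_private_exponent(n_target, n_other, e=65537):
    """Recover the RSA private exponent for n_target if it shares a prime with n_other."""
    p = gcd(n_target, n_other)
    if p == 1 or p == n_target:
        return None  # no nontrivial shared factor
    q = n_target // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse, Python 3.8+


# Toy primes (Mersenne primes, assumed for illustration only).
shared_prime = 2**61 - 1
n1 = shared_prime * (2**31 - 1)   # first device's modulus
n2 = shared_prime * (2**89 - 1)   # second device reuses one prime

d = recover_private_exponent(n1, n2)

# Check: a message encrypted to n1 decrypts with the recovered exponent.
message = 42
ciphertext = pow(message, 65537, n1)
decrypted = pow(ciphertext, d, n1)
```

Neither modulus has to be factored by brute force. The GCD takes microseconds even for 1024-bit keys, which is why poor entropy at key-generation time is so damaging.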
You may have seen the preprint posted today by Lenstra et al. about entropy problems in public keys. Nadia Heninger, Eric Wustrow, Alex Halderman, and I have been waiting to talk about some similar results. We will be publishing a full paper after the relevant manufacturers have been notified. Meanwhile, we’d like to give a more complete explanation of what’s really going on. We have been able to remotely compromise about 0.4% of all the public keys used for SSL web site security. The keys we were able to compromise were generated incorrectly, using predictable “random” numbers that were sometimes repeated. There were two kinds of problems: keys that were generated with predictable randomness, and a subset of these, where the lack of randomness allows a remote attacker to efficiently factor the public key and obtain the private key.
A stored procedure for finding object dependencies across multiple databases and servers in Microsoft SQL Server.