I am an assistant professor of computer science at Stanford University, where my research focuses on empirical systems and network security. I’m also co-founder and CTO of Censys.

Prior to joining Stanford, I received my Ph.D. in computer science from the University of Michigan in 2017 and B.S. in mathematics and computer science from the University of Iowa in 2011.

I am best reached at zakir@cs.stanford.edu (key) or (224) 286-4210.


Understaing the Mirai Botnet (SEC'17)

The Mirai botnet, composed primarily of embedded and IoT devices, took the Internet by storm in late 2016 when it overwhelmed several high-profile targets with massive distributed denial-of-service (DDoS) attacks. In this paper, we provide a seven-month retrospective analysis of Mirai’s growth to a peak of 600k infections and a history of its DDoS victims. By combining a variety of measurement perspectives, we analyze how the botnet emerged, what classes of devices were affected, and how Mirai variants evolved and competed for vulnerable hosts. Our measurements serve as a lens into the fragile ecosystem of IoT devices. We argue that Mirai may represent a sea change in the evolutionary development of botnets—the simplicity through which devices were infected and its precipitous growth, demonstrate that novice malicious techniques can compromise enough low-end devices to threaten even some of the best-defended targets. To address this risk, we recommend technical and nontechnical interventions, as well as propose future research directions.

The Security Impact of HTTPS Interception (NDSS'17)

As HTTPS deployment grows, middlebox and antivirus products are increasingly intercepting TLS connections to retain visibility into network traffic. In this work, we present a comprehensive study on the prevalence and impact of HTTPS interception. First, we show that web servers can detect interception by identifying a mismatch between the HTTP User-Agent header and TLS client behavior. We characterize the TLS handshakes of major browsers and popular interception products, which we use to build a set of heuristics to detect interception and identify the responsible product. We deploy these heuristics at three large network providers: (1) Mozilla Firefox update servers, (2) a set of popular e-commerce sites, and (3) the Cloudflare content distribution network. We find more than an order of magnitude more interception than previously estimated and with dramatic impact on connection security. To understand why security suffers, we investigate popular middleboxes and clientside security software, finding that nearly all reduce connection security and many introduce severe vulnerabilities. Drawing on our measurements, we conclude with a discussion on recent proposals to safely monitor HTTPS and recommendations for the security community.

Users Really Do Plug in USB Drives They Find (Oakland'16)

We investigate the anecdotal belief that end users will pick up and plug in USB flash drives they find by completing a controlled experiment in which we drop 297 flash drives on a large university campus. We find that the attack is effective with an estimated success rate of 45–98% and expeditious with the first drive connected in less than six minutes. We analyze the types of drives users connected and survey those users to understand their motivation and security profile. We find that a drive’s appearance does not increase attack success. Instead, users connect the drive with the altruistic intention of finding the owner. These individuals are not technically incompetent, but are rather typical community members who appear to take more recreational risks than their peers. We conclude with lessons learned and discussion on how social engineering attacks—while less technical— continue to be an effective attack vector that our community has yet to successfully address.

Neither Snow Nor Rain Nor MITM… An Empirical Analysis of Email Delivery Security (IMC'15)

The SMTP protocol is responsible for carrying some of users’ most intimate communication, but like other Internet protocols, authentication and confidentiality were added only as an afterthought. In this work, we present the first report on global adoption rates of SMTP security extensions, including StartTLS, SPF, DKIM, and DMARC. We present data from two perspectives: SMTP server configurations for the Alexa Top Million domains, and over a year of SMTP connections to and from Gmail. We find that the top mail providers (e.g., Gmail, Yahoo, Outlook) all proactively encrypt and authenticate messages. However, these best practices have yet to reach widespread adoption in a long tail of over 700,000 SMTP servers, of which only 35% successfully configure encryption and 1.1% specify a DMARC authentication policy. This security patchwork—paired with SMTP policies that favor failing open to allow gradual deployment—exposes users to attackers who downgrade TLS connections in favor of cleartext and who falsify MX records to reroute messages. We present evidence of such attacks in the wild, highlighting seven countries where more than 20% of inbound Gmail messages arrive in cleartext due to network attackers.

Censys: A Search Engine Backed by Internet Wide Scanning (CCS'15)

Fast Internet-wide scanning has opened new avenues for security research, ranging from uncovering widespread vulnerabilities in random number generators to tracking the evolving impact of Heartbleed. However, this technique still requires significant effort: even simple questions, such as, “What models of embedded devices prefer CBC ciphers?”, require developing an application scanner, manually identifying and tagging devices, negotiating with network administrators, and responding to abuse complaints. In this paper, we introduce Censys, a public search engine and data processing facility backed by data collected from ongoing Internet-wide scans. Designed to help researchers answer security-related questions, Censys supports full-text searches on protocol banners and querying a wide range of derived fields (e.g., 443.https.cipher). It can identify specific vulnerable devices and networks and generate statistical reports on broad usage patterns and trends. Censys returns these results in sub-second time, dramatically reducing the effort of understanding the hosts that comprise the Internet. We present the search engine architecture and experimentally evaluate its performance. We also explore Censys’s applications and show how questions asked in recent studies become simple to answer.

Imperfect Forward Secrecy: How Diffie Hellman Fails in Practice (CCS'15)

We investigate the security of Diffie-Hellman key exchange as used in popular Internet protocols and find it to be less secure than widely believed. First, we present a novel flaw in TLS that allows a man-in-the-middle to downgrade connections to “export-grade” Diffie-Hellman. To carry out this attack, we implement the number field sieve discrete log algorithm. After a week-long precomputation for a specified 512-bit group, we can compute arbitrary discrete logs in this group in minutes. We find that 82% of vulnerable servers use a single 512-bit group, allowing us to compromise connections to 7% of Alexa Top Million HTTPS sites. In response, major browsers are being changed to reject short groups. We go on to consider Diffie-Hellman with 768- and 1024-bit groups. A small number of fixed or standardized groups are in use by millions of TLS, SSH, and VPN servers. Performing precomputations on a few of these groups would allow a passive eavesdropper to decrypt a large fraction of Internet traffic. In the 1024-bit case, we estimate that such computations are plausible given nation-state resources, and a close reading of published NSA leaks shows that the agency’s attacks on VPNs are consistent with having achieved such a break. We conclude that moving to stronger key exchange methods should be a priority for the Internet community.

Tracking the FREAK Attack

On Tuesday, March 3, 2015, researchers announced a new SSL/TLS vulnerability called the FREAK attack. It allows an attacker to intercept HTTPS connections between vulnerable clients and servers and force them to use weakened encryption, which the attacker can break to steal or manipulate sensitive data. We are tracking the impact of the attack and helping users test whether they’re vulnerable.

Scanning Highlighted by MIT Tech Review

Probing the Whole Internet for Weak Spots: Rapidly scanning the Internet has become vital to efforts to keep it secure.

19.5% of HTTPS Sites Will Trigger Browser Warnings

19.5% of HTTPS-enabled sites in Alexa's Top 1 Million trigger or will trigger a Chrome security warning because they use the now deprecated SHA-1 signature algorithm to sign their HTTPS certificates. Soon those sites will be flagged by all major browsers as insecure.

Scans.IO: Internet-Wide Scan Data Repository

The Internet-Wide Scan Data Repository is a public multi-institutional archive of research data collected through active scans of the Internet that I am leading. The repository was founded as a collaboration between the University of Michigan and and Rapid7 and currently hosts several terabytes of data including our regular scans of the HTTPS ecosystem, copies of the root HTTP pages, comprehensive reserve DNS lookups, and banner grabs from dozens of other protocols.

Heartbleed Health Report

We are tracking the patching of the Heartbleed vulnerability via regular comprehensive ZMap scans of the IPv4 address space and by monitoring the Alexa Top 1 Million most popular websites.

OpenSSL Certificate Parsing

A guide to parsing and validating X.509 digital certificates using OpenSSL based on our experiences performing scans of the HTTPS ecosystem.

ZMap Scanner Released

ZMap is an open-source network scanner that enables researchers to easily perform Internet-wide network studies. With a single machine and a well provisioned network uplink, ZMap is capable of comprehensively scanning the IPv4 address space in under 45 minutes.

ZMap: Fast Internet Wide Scanning and its Security Applications

I will be presenting my most recent work at USENIX Security '22 in Washington, D.C. on August 16, 2013. Abstract: Internet-wide network scanning has numerous security applications, including exposing new vulnerabilities and tracking the adoption of defensive mechanisms, but probing the entire public address space with existing tools is both difficult and slow. We introduce ZMap, a modular, open-source network scanner specifically architected to perform Internet-wide scans and capable of surveying the entire IPv4 address space in under 45 minutes from user space on a single machine, approaching the theoretical maximum speed of gigabit Ethernet. We present the scanner architecture, experimentally characterize its performance and accuracy, and explore the security implications of high speed Internet-scale network surveys, both offensive and defensive. We also discuss best practices for good Internet citizenship when performing Internet-wide surveys, informed by our own experiences conducting a long-term research survey over the past year.

Microsoft Object Identifiers

A parsed version of the published Microsoft cryptographic Object IDs and code to import into OpenSSL.

Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices

We are excited to be releasing our full scientific study on weak keys, "Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices" which will appear at USENIX Security 2012. In the study we find that at least 5.57% of TLS hosts and 9.60% of SSH hosts use repeated keys in an apparently vulnerable manner and that even more alarmingly, we are able to obtain RSA private keys for 0.50% of TLS hosts and 0.03% of SSH hosts, because their public keys shared nontrivial common factors due to entropy problems, and DSA private keys for 1.03% of SSH hosts, because of insufficient signature randomness.

No need to panic over factorable keys–just mind your Ps and Qs

You may have seen the preprint posted today by Lenstra et al. about entropy problems in public keys. Nadia Heninger, Eric Wustrow, Alex Halderman, and I have been waiting to talk about some similar results. We will be publishing a full paper after the relevant manufacturers have been notified. Meanwhile, we’d like to give a more complete explanation of what’s really going on. We have been able to remotely compromise about 0.4% of all the public keys used for SSL web site security. The keys we were able to compromise were generated incorrectly–using predictable “random” numbers that were sometimes repeated. There were two kinds of problems: keys that were generated with predictable randomness, and a subset of these, where the lack of randomness allows a remote attacker to efficiently factor the public key and obtain the private key.

Finding Cross Database Dependencies

A stored procedure for finding object dependences across multiple databases and servers in Microsoft SQL Server.