Twenty Thousand Emails an Hour

Another email disaster

From early in my sophomore year until I graduated from Purdue, I worked in the IT department as a student software developer on an internal web application. It was a fantastic opportunity, and I learned just as much at that job as I did in my classes. Because that internal web app is still in use to this day, it would be inappropriate for me to share most of the stories from my time there, since they were fairly specific to the work, but there is one that I feel is generic enough to share.

I was tasked with making some change to how authentication on the site worked. That was some of the oldest code we had, so this was a nice opportunity to lightly restructure it for readability and to more closely match the style of the other code. Strangely, I noticed that the existing code made a considerable effort to gracefully handle unauthenticated requests that weren't even made from a browser. Of course, the existing code did not allow such requests to access anything. The code would be simpler if it just assumed that the user was a human sitting at a browser, so that we could unconditionally redirect unauthenticated callers to the university's single sign-on page. I made whatever change I was originally assigned to make, along with that refactoring. The changes worked their way through testing environments, and one morning, a few hours before I clocked in, they went live on the user-facing version of the website.

When I entered the student office that morning to work, the student team lead who had deployed my changes a few hours earlier asked if I had seen all the emails coming in about errors. As a simple way to handle error reporting, our application would send an email with the corresponding application logs to all the students on the team whenever an unhandled error occurred. Everyone on the team, myself included, had email filtering rules set up to send those emails to a dedicated folder to look at as needed, so I hadn't noticed. I asked what the errors were about and he said he wasn't sure, but there were so many that an administrator for the university's email system had noticed and temporarily revoked our application's authorization to send emails. I pulled out my phone to see how many there were. I had received over 20,000 emails in the last hour or so.

Every email I opened described the same error: an attempt to access the home page of our application that failed because our user authentication code no longer gracefully handled requests from an automated system, which, not being a human using a real browser, could not be redirected to the login page. Apparently, someone had set up an automatic application health monitoring process that checked on our website by repeatedly trying to fetch the home page, and the poor little monitoring process responded to the errors it was receiving by stubbornly and rapidly retrying. Once we discovered this, my changes were reverted, and permission for our application to send emails was restored. Later, when I reworked my changes and added back the graceful handling of these requests, I also added some comments to the code explaining why that special handling of unauthenticated requests was so important. In the end, this was a classic example of Chesterton's Fence: I came across something that was clearly set up intentionally, and since I could not see the reason for it, I made the mistake of assuming it served no purpose and could be removed.

Later on, we were asked to implement rate limiting for sending emails in our application, to contain the damage if (really "when") we were flooded by errors again. I had the happy task of adding this rate limiting, and I think a certain aspect of it was amusing. At an extremely high level, I initially changed the application's email sending code to resemble:

public function sendEmail(emailDetails: EmailDetails): void {
    if (we have not hit the rate limit) {
        actuallySendTheEmail(emailDetails);
    }
}

private function actuallySendTheEmail(emailDetails: EmailDetails): void {
    Do the nitty-gritty work of sending an email, unconditionally
}
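
The "we have not hit the rate limit" check above can be filled in many ways. As a minimal sketch, assuming a simple fixed-window counter and written in TypeScript rather than whatever the application actually used, it might look something like the following. The EmailDetails fields, the RateLimitedMailer class, the injected transport and clock, and the limit values are all hypothetical names chosen for illustration; none of them come from the real application.

```typescript
// Hypothetical shape of an email; the real EmailDetails type is not shown in the post.
interface EmailDetails {
  to: string;
  subject: string;
  body: string;
}

// Fixed-window limiter: at most `limit` emails are sent per `windowMs` milliseconds.
// Anything over the limit is dropped rather than queued (an assumption of this sketch).
class RateLimitedMailer {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly transport: (email: EmailDetails) => void;
  private readonly now: () => number;
  private windowStart = 0;
  private sentInWindow = 0;

  constructor(
    limit: number,
    windowMs: number,
    // The function that unconditionally sends an email, injected so the
    // sketch can run without a mail server.
    transport: (email: EmailDetails) => void,
    // Injectable clock, so the window logic can be exercised without waiting.
    now: () => number = () => Date.now(),
  ) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.transport = transport;
    this.now = now;
  }

  // Returns true if the email was sent, false if the limit dropped it.
  sendEmail(email: EmailDetails): boolean {
    const t = this.now();
    if (t - this.windowStart >= this.windowMs) {
      // The previous window has expired, so start a fresh one.
      this.windowStart = t;
      this.sentInWindow = 0;
    }
    if (this.sentInWindow >= this.limit) {
      return false;
    }
    this.sentInWindow += 1;
    this.transport(email);
    return true;
  }
}
```

With this sketch, new RateLimitedMailer(100, 60 * 60 * 1000, transport) would let at most 100 emails through per hour and drop the rest. Whatever the details, the structure is the same as in the outline above: one public function that checks the limit, and one unconditional function that does the actual sending.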

I was not satisfied with this. Since my team was made up of students, no one stayed for more than a few years, so we spent a lot of time thinking about how to make the reasons behind what the code does as obvious as possible. Writing documentation did not solve this problem. After all, we could not trust people to read wikis or the like. We could only know that they would read the code itself. I was worried that someday someone would have to do work on the application that involved sending out an automated email, and they might see this code, not think much about it, and copy-paste the raw email-sending code, defeating the rate limiting. The solution I came up with was the following:

public function sendEmail(emailDetails: EmailDetails): void {
    if (we have not hit the rate limit) {
        actuallySendTheEmail(emailDetails, "If I am changing this 
            code, then I promise that I have read all the notes on
            ticket 1234 and that I have explicit permission from 
            upper management to be changing it!");
    }
}

private function actuallySendTheEmail(emailDetails: EmailDetails, vow: String): void {
    if (vow === "If I am changing this code, then I promise that I 
        have read all the notes on ticket 1234 and that I have explicit
        permission from upper management to be changing it!") {
        Do the nitty-gritty work of sending an email, unconditionally
    } else {
        throw "Error: You almost certainly shouldn't be touching this code!";
    }
}

As silly as it sounds, the only way I could think of to discourage tampering with the rate limiting was to scare the person who was reading the code. We rarely interacted with upper management, so mentioning them was a reliable way to make someone think twice. As the cherry on top of it all, I found out while I was still working there that someone had soon refactored this code in a way that made it possible to once again unthinkingly copy-paste the unconditional email-sending code. They left the scary message in place, though.