Mementomori.social experienced 57 minutes of downtime earlier today. Here's what happened and what we've done about it.
What happened: A large volume of connections from external crawler/bot IP ranges accumulated in a stuck state (CLOSE_WAIT) inside our web server (nginx). These zombie connections exhausted nginx's connection pool and its per-worker connection limit, which caused it to stop accepting new connections entirely, resulting in ERR_CONNECTION_CLOSED for everyone trying to reach the site.
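For anyone who wants to check their own server for the same symptom, here is a rough sketch of how stuck CLOSE_WAIT connections could be counted per remote IP. It is not the tooling we used. It assumes a Linux host, IPv4 only, and the little-endian address encoding found in /proc/net/tcp on typical x86/ARM machines; the function names are made up for illustration.

```python
import socket
from collections import Counter

# Linux kernel TCP state code for CLOSE_WAIT as shown in /proc/net/tcp
CLOSE_WAIT = "08"


def decode_ipv4(hex_addr):
    """Turn '0100007F:0050' into '127.0.0.1' (assumes a little-endian host)."""
    ip_hex, _port = hex_addr.split(":")
    return socket.inet_ntoa(bytes.fromhex(ip_hex)[::-1])


def count_close_wait(lines):
    """Count CLOSE_WAIT sockets per remote IPv4 address.

    `lines` is any iterable of lines in /proc/net/tcp format.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row (which starts with 'sl').
        if len(fields) < 4 or fields[0] == "sl":
            continue
        # fields[2] is the remote address, fields[3] is the state code.
        if fields[3] == CLOSE_WAIT:
            counts[decode_ipv4(fields[2])] += 1
    return counts
```

On a live machine you could pass it an open file, for example `count_close_wait(open("/proc/net/tcp")).most_common(10)`, to list the remote addresses holding the most CLOSE_WAIT sockets. IPv6 connections live in /proc/net/tcp6 and are not handled here.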
W...