Facebook outage increased developer productivity by 32%, according to Haystack report

The global outage of Facebook Inc.’s main services (WhatsApp, Messenger, Instagram, and Facebook itself) on Monday affected many workers and possibly even developer performance. Haystack released a report on Tuesday in which it said that Facebook’s global outage made developers 32% more productive. However, this increase reflects the merging of existing pull requests rather than the development of new features. The report reopens the debate about how much time social networks cost engineering teams.

Haystack analyzes historical Git data to give organizations a clear and precise picture of the health of their teams. On Monday, the Haystack team analyzed the data available to it to see what impact the global Facebook outage was having on developer productivity (measured as the number of pull requests merged) and recorded its results in a report. As a reminder, the outage followed the sudden disappearance of Facebook’s routing prefixes from BGP (Border Gateway Protocol) tables. BGP is the protocol networks use to announce which IP address ranges (prefixes, or “routes”) they can reach, which is what allows devices on one side of the world to reach devices on the other.

As Facebook’s domain registrar and DNS servers are hosted on the firm’s own routing prefixes, when the BGP prefixes were removed from the routing tables, no one could connect to their IP addresses or to the services that depend on them. “In this case, the Internet no longer knows where to find Facebook’s IP addresses. One symptom is that DNS queries are failing. However, this is just the result of Facebook hosting its DNS servers within its own network,” explains Johannes Ullrich of the SANS Technology Institute. Haystack sets out the following facts in its report.

Chronology

According to DownDetector, a website that tracks real-time issues and outages for all kinds of online services, the Facebook outage started at 3:24 p.m. UTC (5:24 p.m. in France). At 10:46 p.m. UTC (12:46 a.m. in France), Facebook’s chief technology officer posted on Twitter that the services were coming back online, but that it could take some time to reach 100%. The incident was largely resolved overnight. For most of the day, the Haystack team found that developer performance continued to follow the baseline (see image).

However, the team said that this changed significantly after 9:00 p.m. UTC (11:00 p.m. in France). According to the team, while it is quite usual to see throughput rise at this time of day on a Monday, the growth was much larger than usual. Between 9:00 p.m. UTC and midnight UTC (11:00 p.m. and 2:00 a.m. in France), the number of pull requests being merged rose to approximately 2.6 times the baseline. For context, midnight UTC is 5:00 p.m. Pacific Time (where many Haystack customers are located).
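To make the comparison concrete, here is a minimal Python sketch of how a "multiple of baseline" figure could be computed from merge timestamps. It is not Haystack's method, which the report does not detail: the function names, the half-open window, and the sample timestamps are all assumptions for illustration, and the date used is that of the outage, Monday, October 4, 2021.

```python
from datetime import datetime, timezone


def count_merged_in_window(merged_at, start, end):
    """Count merge timestamps that fall in the half-open window [start, end)."""
    return sum(1 for t in merged_at if start <= t < end)


def ratio_to_baseline(observed, baseline):
    """Return observed / baseline, i.e. the multiple of the usual volume."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return observed / baseline


# Window discussed in the article: 21:00 UTC to midnight UTC.
START = datetime(2021, 10, 4, 21, 0, tzinfo=timezone.utc)
END = datetime(2021, 10, 5, 0, 0, tzinfo=timezone.utc)

# Made-up timestamps for illustration only; this is NOT Haystack data.
sample_merges = [
    datetime(2021, 10, 4, 20, 30, tzinfo=timezone.utc),  # before the window
    datetime(2021, 10, 4, 21, 0, tzinfo=timezone.utc),   # start is included
    datetime(2021, 10, 4, 22, 15, tzinfo=timezone.utc),
    datetime(2021, 10, 4, 23, 59, tzinfo=timezone.utc),
    datetime(2021, 10, 5, 0, 0, tzinfo=timezone.utc),    # end is excluded
]

observed = count_merged_in_window(sample_merges, START, END)
```

With a hypothetical baseline of 5 merges for the same window on a typical Monday, 13 observed merges would give `ratio_to_baseline(13, 5)`, a multiple of 2.6. Using timezone-aware UTC timestamps avoids confusion between the French and Pacific local times quoted in the text.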

Causes

While developer throughput increased, Haystack found that cycle time (the time between the first commit and the merge of the pull request) also rose dramatically for these pull requests. This indicates that the real reason for the increase is that developers used the extra time at the end of their day to clear out old, long-running pull requests.
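The metric described above (time from first commit to merge) can be sketched in a few lines of Python. This is an illustrative sketch, not Haystack's implementation: the function names, the choice of the median as the summary statistic, and the sample pull requests are assumptions. It shows why merging a few stale pull requests pushes the figure up even though nothing new was written.

```python
from datetime import datetime, timedelta
from statistics import median


def cycle_time(first_commit_at, merged_at):
    """Time elapsed between a pull request's first commit and its merge."""
    if merged_at < first_commit_at:
        raise ValueError("merge cannot precede the first commit")
    return merged_at - first_commit_at


def median_cycle_hours(prs):
    """Median cycle time in hours over (first_commit_at, merged_at) pairs."""
    return median(cycle_time(f, m).total_seconds() / 3600 for f, m in prs)


# Hypothetical data: every pull request is merged at the same moment.
merged = datetime(2021, 10, 4, 22, 0)

# Pull requests opened and merged the same day (4, 8, and 12 hours old).
typical = [(merged - timedelta(hours=h), merged) for h in (4, 8, 12)]

# The same set plus two stale pull requests (10 and 20 days old).
with_stale = typical + [(merged - timedelta(days=d), merged) for d in (10, 20)]

typical_median = median_cycle_hours(typical)
stale_median = median_cycle_hours(with_stale)
```

In this made-up example the median moves from 8 hours to 12 hours once the two old pull requests are merged, and a mean would move far more. A jump in merge count accompanied by a jump in cycle time is therefore consistent with a backlog clean-up rather than a burst of new work.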

In fact, Haystack, as a product, sends development teams alerts about pending pull requests (such as those that have already been reviewed and are waiting to be merged). Rather than a dramatic increase in programming productivity, the team saw developers catching up on housekeeping. “The Facebook outage prompted developers to clean up their gardens,” explains Haystack’s chief technology officer, Kan Yilmaz.

This does not justify micromanagement

“As a developer analytics tool, Haystack takes care not to encourage micromanagement. Unlike our competitors, we don’t compare engineers,” said Haystack. “Research on software engineering teams has consistently shown that micromanagement is detrimental to team effectiveness and that psychological safety is essential for improving productivity, addressing software reliability, and preventing burnout,” the team added.

According to the team, the fact that it did not see a substantial drop in productivity while the incident was being publicized on other social media platforms shows that developers are less inclined to be distracted from their productive work than one might think. Additionally, the team believes the key to fostering sustained developer productivity lies in creating a flow-oriented development experience, where manual processes are replaced by automation and tooling.

“When developers are freed from inefficient processes – bureaucracy and technical debt – work can flow quickly without compromising reliability or causing burnout. That’s why more and more tech organizations are setting up EngProd teams to focus on removing these barriers,” the team concluded.

EngProd is how elite software engineering teams deliver reliable software that serves business results while protecting the well-being of the team. Companies like Google have Engineering Productivity (EngProd) teams, and other companies like Netflix call them Developer Productivity (DevProd) teams.

Source: Haystack