How Meta understands data at scale

Managing and understanding large-scale data ecosystems is a significant challenge for many organizations, requiring innovative solutions to efficiently safeguard user data. Meta’s vast and diverse systems make it particularly challenging to comprehend its structure, meaning, and context at scale. To address these challenges, we made substantial investments in advanced data understanding technologies, as part of our Privacy Aware Infrastructure (PAI). Specifically, we have adopted a “shift-left” approach, integrating data schematization and annotations early in the product development process. We also created a universal privacy taxonomy, a standardized framework providing a common semantic vocabulary for data privacy management across Meta’s products that ensures quality data understanding and provides developers with reusable and efficient compliance tooling. We discovered that a flexible and incremental approach was necessary to onboard the wide variety of systems and languages used in building Meta’s products. Additionally, continuous collaboration between privacy and product teams was essential to unlock the value of data understanding at scale. We embarked on the journey of understanding data across Meta a decade ago with millions of assets in scope ranging from structured and unstructured, processed by millions of flows across many of the Meta App offerings. Over the past 10 years, Meta has cataloged millions of data assets and is classifying them daily, supporting numerous privacy initiatives across our product groups. Additionally, our continuous understanding approach ensures that privacy considerations are embedded at every stage of product development. At Meta, we have a deep responsibility to protect the privacy of our community. We’re upholding that by investing our vast engineering capabilities into building cutting-edge privacy technology. We believe that privacy drives product innovation. This led us to develop our Privacy Aware Infrastructure (PAI), which integrates efficient and reliable privacy tools into Meta’s systems to address needs such as purpose limitation—restricting how data can be used while also unlocking opportunities for product innovation by ensuring transparency in data flows ...

April 28, 2025

How the April 28, 2025 power outage in Portugal and Spain impacted Internet traffic and connectivity

A massive power outage struck significant portions of Portugal and Spain at 10:34 UTC on April 28, grinding transportation to a halt, shutting retail businesses, and otherwise disrupting everyday activities and services. Parts of France were also reportedly impacted by the power outage. Portugal’s electrical grid operator blamed the outage on a "fault in the Spanish electricity grid”, and later stated that "due to extreme temperature variations in the interior of Spain, there were anomalous oscillations in the very high voltage lines (400 kilovolts), a phenomenon known as 'induced atmospheric vibration'" and that "These oscillations caused synchronisation failures between the electrical systems, leading to successive disturbances across the interconnected European network." ...

April 28, 2025

How WhatsApp Handles 40 Billion Messages Per Day

Build to Prod: Secure, Scalable MCP Servers with Docker (Sponsored)The AI agent era is here, but running tools in production with MCP is still a mess—runtime headaches, insecure secrets, and a discoverability black hole. Docker fixes that. Learn how to simplify, secure, and scale your MCP servers using Docker containers, Docker Desktop, and the included MCP gateway. From trusted discovery to sandboxed execution and secrets management, Docker gives you the foundation to run agentic tools at scale—with confidence. ...

April 28, 2025

How the power outage of April 28, 2025, in Portugal and Spain impacted Internet traffic and connectivity

A massive power outage struck significant portions of Portugal and Spain at 10:34 UTC on April 28, grinding transportation to a halt, shutting retail businesses, and otherwise disrupting everyday activities and services. Parts of France were also reportedly impacted by the power outage. Portugal’s electrical grid operator blamed the outage on a "fault in the Spanish electricity grid”, and later stated that "due to extreme temperature variations in the interior of Spain, there were anomalous oscillations in the very high voltage lines (400 kilovolts), a phenomenon known as 'induced atmospheric vibration'" and that "These oscillations caused synchronisation failures between the electrical systems, leading to successive disturbances across the interconnected European network." ...

April 27, 2025

Targeted by 20.5 million DDoS attacks, up 358% year-over-year: Cloudflare’s 2025 Q1 DDoS Threat Report

Welcome to the 21st edition of the Cloudflare DDoS Threat Report. Published quarterly, this report offers a comprehensive analysis of the evolving threat landscape of Distributed Denial of Service (DDoS) attacks based on data from the Cloudflare network. In this edition, we focus on the first quarter of 2025. To view previous reports, visit www.ddosreport.com. While this report primarily focuses on 2025 Q1, it also includes late-breaking data from a hyper-volumetric DDoS campaign observed in April 2025, featuring some of the largest attacks ever publicly disclosed. In a historic surge of activity, we blocked the most intense packet rate attack on record, peaking at 4.8 billion packets per second (Bpps), 52% higher than the previous benchmark, and separately defended against a massive 6.5 terabits-per-second (Tbps) flood, matching the highest bandwidth attacks ever reported. ...

April 27, 2025

EP160: Top 20 System Design Concepts You Should Know

✂️Cut your QA cycles down to minutes with QA Wolf (Sponsored)If slow QA processes bottleneck you or your software engineering team and you’re releasing slower because of it — you need to check out QA Wolf. QA Wolf’s AI-native service supports web and mobile apps, delivering 80% automated test coverage in weeks and helping teams ship 5x faster by reducing QA cycles from hours to minutes. ...

April 26, 2025

How the GitHub CLI can now enable triangular workflows

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"> Most developers are familiar with the standard Git workflow. You create a branch, make changes, and push those changes back to the same branch on the main repository. Git calls this a centralized workflow. It’s straightforward and works well for many projects. However, sometimes you might want to pull changes from a different branch directly into your feature branch to help you keep your branch updated without constantly needing to merge or rebase. However, you’ll still want to push local changes to your own branch. This is where triangular workflows come in. ...

April 25, 2025

PhD Timeline

April 25, 2025

Domain-Driven Design (DDD) Demystified

Most software doesn’t break because of syntax errors or flawed if-else logic.  It breaks because teams lose alignment with the business problem they’re supposed to solve. Systems become tangled with technical assumptions that age poorly. Features get implemented without proper design considerations. And over time, every new requirement creates more issues that keep piling up.  Often, this isn’t a tooling problem. It’s a modeling problem.  Domain-Driven Design (DDD) tries to tackle this problem head-on. At its core, DDD is a way of designing software that keeps the business domain, not the database schema or the latest framework, at the center of decision-making. It insists that engineers collaborate deeply with domain experts during the project lifecycle, not just to gather requirements once and vanish into Jira tickets. It gives teams the vocabulary, patterns, and boundaries to model complex systems without getting buried in accidental complexity. ...

April 24, 2025

Improving brain models with ZAPBench

April 24, 2025