Working Papers

(with Dominik Rehse)
On October 4, 2021, all services provided by Meta Platforms, Inc. (then Facebook, Inc.) unexpectedly became unavailable worldwide for approximately six hours, simultaneously displacing entire user populations. We use high-frequency device tracking data from thousands of individuals in the United States and Spain to estimate effective substitution rates, i.e., the reallocation of usage time per minute of Meta usage lost. Non-Meta social media and messaging services absorb the largest share of reallocated time, but substitution also crosses conventional service category boundaries, and a substantial share of displaced time leaves the device entirely. These aggregate patterns mask pronounced heterogeneity. Multi-homers substitute at rates an order of magnitude larger than single-homers, younger users drive social media substitution, and the composition of within-Meta usage shapes substitution destinations differently across countries. Users with the heaviest pre-outage Meta reliance exhibit persistent reductions in Meta's usage share over the four post-outage weeks, concentrated among the same services that absorbed the most during-outage substitution. Because the outage displaced entire social graphs, our estimates provide revealed-preference evidence on substitution when users must coordinate their reallocation, drawn from a broad demographic cross-section across two countries with different platform ecosystems.
(with Dominik Rehse and Johannes Walter)
Red teaming, where testers attempt to elicit harmful outputs from a large language model (LLM), has become central to AI safety practices. Yet without coordination, testers duplicate effort by probing the same attack vectors or miss novel attack vectors due to insufficient exploration incentives. We test whether real-time novelty incentives can solve this coordination problem in two preregistered experiments (N=1,075) in which participants attempt to elicit harassing outputs from an LLM. Treatment participants earn bonuses based on the product of harassment and novelty scores; control participants earn bonuses based on harassment scores alone. In both experiments, treatment performs significantly worse on our primary outcome, novelty-weighted harassment (NWH). This “backfiring effect” is driven by reduced harassment elicitation with no overall gain in novelty. Filtering low-harassment outputs reverses this pattern, i.e., treatment then achieves higher novelty and NWH, indicating that the backfiring effect is mainly driven by increased production of low-quality outputs. Analysis of attack strategies shows that participants overuse ineffective strategies across conditions, but treatment executes identical strategies less effectively. For red teaming, our results suggest that quality thresholds should accompany novelty incentives and that participant selection matters more than incentive design. More broadly, multi-dimensional incentives can backfire when participants cannot attend to competing objectives.

Work in Progress

Early adoption of generative AI: Users, uses, and behavioral change
(with Chiara Farronato and Dominik Rehse)
Competing for Attention: The Impact of YouTube Shorts on Content Supply and Consumption
(with Camille Urvoy)
LLMs as advisors: An experimental analysis of risky decision-making
(solo-authored)
Adoption of large language models and the demand for human expertise
(with Johannes Walter)

Tools

(with Johannes Walter)
SCALE (Serverless Chat Architecture for LLM Experiments) is an open-source framework for running scalable online experiments involving chat-based interactions with large language models. It combines oTree with AWS Lambda to enable high-throughput, low-latency deployments and is easy to set up without managing backend servers.
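To illustrate the serverless pattern the framework is built on, here is a generic sketch of an AWS Lambda handler that receives a chat message (as an oTree front end might forward it) and returns a JSON reply. This is not SCALE's actual API; the request shape, field names, and the echo reply standing in for the LLM call are all illustrative assumptions.

```python
import json

def lambda_handler(event, context):
    # Parse the JSON body of the incoming request
    # (e.g., a chat message posted by an oTree page).
    body = json.loads(event.get("body", "{}"))
    user_message = body.get("message", "")

    # Placeholder reply; a real deployment would call an
    # LLM API here and return its completion instead.
    reply = f"Echo: {user_message}"

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"reply": reply}),
    }
```

Because each chat turn is handled by a stateless function like this, concurrency scales with the number of simultaneous participants without any backend servers to provision.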