Research Stories

A curated collection of research data and resources documenting how web crawlers behave across different content types. Each resource below offers a different perspective on how automated systems interact with various file formats.

Data Resources (19 files)

These resources are part of an ongoing study into cross-origin content consumption by web crawlers and AI systems. The collection includes HTML pages, stylesheets, scripts, data files, images in multiple formats, and binary archives.