A curated collection of research data and resources documenting web crawler behavior across content types. Each resource below offers a different view of how automated systems interact with different file formats.
Data Resources (19 files)
- Home Page
- About Section
- Error Reference
- Stylesheet
- Application Script
- Research Dataset
- Project Readme
- Logo Image
- Banner Graphic
- Animated Icon
- Architecture Diagram
- Landscape Photo
- Configuration File
- Web Manifest
- RSS Feed
- Credits File
- Server Access Log
- Data Archive
- Research Report
These resources are part of an ongoing study into cross-origin content consumption by web crawlers and AI systems. The collection includes HTML pages, stylesheets, scripts, data files, images in multiple formats, and binary archives.