
New agents.txt file found on DreamHost | K-Squared Ramblings



Sci-fi, comics, humor, photos…it's all fair game.


I host most of my websites on a DreamHost VPS. This morning I discovered that a new file, agents.txt, had been added to the root of each site on May 7.

It was easy to confirm that this is a new default file, similar to the default robots.txt and favicon.ico that DreamHost puts in every new site to get you started. Apparently they retroactively added it to sites that don’t already have one. So it’s a host action, not a hack. That’s good at least.

The contents are simple, and sensible for a new website: discourage LLM training and actions, allow on-the-fly “AI”-generated summaries, and disallow access to some common folders that shouldn’t be used for any of the above.

I am annoyed that they added it retroactively, though, particularly since it includes what looks like an explicit opt-in to retrieval-augmented generation, even if that’s something that’s happening already and less of a problem than a model vacuuming up your entire website for regurgitation. (Guess who’s already in Common Crawl!)

# Data use policy
Allow-Training: no
Allow-RAG: yes
Allow-Actions: no

# Default rules for all agents
[Agent: *]
Allow: /
Disallow: /admin/
Disallow: /config/
Disallow: /tmp/
Disallow: /logs/
Disallow: /backup/
Disallow: /.env
Disallow: /wp-admin/
Disallow: /wp-includes/

Harder to find was what else can go in this file. The first agents.txt spec I found used a completely different syntax and served a completely different purpose. I had to search for the policy directives (in quotation marks) to find the proposal it’s implementing, which turns out to have been renamed to agent-manifest.txt shortly after it was proposed in March. Apparently whoever set this up at DreamHost didn’t get the memo before it rolled out.

Good: sensible defaults for new sites. Bad: rolled out to existing sites without notice, half-baked implementation.
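If you want to check what a file like this actually says about your own sites, here is a rough Python sketch that reads the format as quoted above. The grammar is only my guess from this one file (top-level "Key: value" policy lines, then "[Agent: name]" sections holding Allow/Disallow rules), not anything taken from the agent-manifest.txt proposal, and the function name parse_agents_txt is made up for illustration.

```python
import re

# The file contents as quoted in this post.
SAMPLE = """\
# Data use policy
Allow-Training: no
Allow-RAG: yes
Allow-Actions: no

# Default rules for all agents
[Agent: *]
Allow: /
Disallow: /admin/
Disallow: /config/
Disallow: /tmp/
Disallow: /logs/
Disallow: /backup/
Disallow: /.env
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def parse_agents_txt(text):
    """Split the file into site-wide policy flags and per-agent path rules.

    Assumed grammar (inferred from the sample only): '#' starts a comment,
    '[Agent: name]' opens a section, and everything else is 'Key: value'.
    """
    policy = {}      # e.g. {"Allow-Training": "no"}
    rules = {}       # e.g. {"*": [("Allow", "/"), ("Disallow", "/admin/")]}
    current = None   # agent section we are inside, if any

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue

        section = re.fullmatch(r"\[Agent:\s*(.+?)\]", line)
        if section:
            current = section.group(1)
            rules.setdefault(current, [])
            continue

        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line; skip rather than guess
        key, value = key.strip(), value.strip()

        if current is None:
            policy[key] = value
        else:
            rules[current].append((key, value))

    return policy, rules


if __name__ == "__main__":
    policy, rules = parse_agents_txt(SAMPLE)
    print(policy)
    for agent, entries in rules.items():
        print(agent, entries)
```

Run against each site's agents.txt, this makes it easy to spot which ones carry the Allow-RAG: yes opt-in without opening every file by hand.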
