If someone asked ChatGPT to recommend a personal injury lawyer in your city, would your firm show up? For most law firms, the answer depends on three files that either exist on your website or don't. This guide explains each one in plain English, shows you exactly what they look like, and tells you how to implement them.
You don't need to be technical to understand this. But you do need to understand why these files matter so you can tell your web developer or IT person what to do.
When ChatGPT, Claude, Perplexity, or Gemini wants to learn about a law firm, it sends a "crawler" — an automated program that visits your website and reads the content. This is similar to how Google sends Googlebot to index websites for search results.
But AI crawlers work differently from Google's crawler in important ways. They're looking for clearly structured, factual information they can use in conversation. They need to understand not just what words are on your pages, but what your firm actually does, who works there, where you operate, and why someone should hire you.
Three files give AI this information:
robots.txt is a plain text file that sits at the root of your website (yourdomain.com/robots.txt). Every well-behaved crawler — whether Google's or ChatGPT's — checks this file before accessing your site. It tells them what they're allowed to read and what's off-limits.
Most law firm websites use WordPress with a security plugin (Wordfence, Sucuri) or are behind Cloudflare. These tools often block AI crawlers by default because they look like "bots" — which, technically, they are. But they're helpful bots you want on your site.
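Here is what an AI-friendly robots.txt might look like. The user-agent names below are the crawler identifiers the major platforms have published (GPTBot for OpenAI/ChatGPT, ClaudeBot for Anthropic, PerplexityBot for Perplexity, Google-Extended for Gemini); double-check the current names against each platform's documentation before deploying, and swap in your own domain:

```text
# Welcome the major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# Everyone else: allow the site, keep admin areas private
User-agent: *
Allow: /
Disallow: /wp-admin/

Sitemap: https://yourdomain.com/sitemap.xml
```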
WordPress: Install the Yoast SEO plugin → go to Tools → File Editor → Edit robots.txt. Paste in your robots.txt content (with your own domain in the Sitemap line). If your setup doesn't offer a file editor, you can also create a robots.txt file locally and upload it to your site's root directory via FTP.
Cloudflare users: Go to Security → Bots and make sure Bot Fight Mode isn't blocking verified AI crawlers. Alternatively, you can deploy a Worker on your /robots.txt route that returns a custom file.
llms.txt is a proposed standard (introduced in 2024) that gives AI language models a structured overview of your website. While AI crawlers can read your entire site, llms.txt gives them a curated summary — like an executive brief that helps AI understand your firm quickly and accurately.
Without llms.txt, AI has to piece together information about your firm from scattered web pages, blog posts, and third-party directories. The result is often incomplete or inaccurate. With llms.txt, you control the narrative. You tell AI exactly how to describe your firm.
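A minimal llms.txt follows the proposed format: a markdown file with a title, a short blockquote summary, and sections linking to your most important pages. Everything below (the firm name, URLs, and claims) is a placeholder to show the structure:

```markdown
# Smith & Associates Personal Injury Law

> Smith & Associates is a personal injury law firm serving Austin, Texas,
> with 25 years of experience in car accident, slip-and-fall, and wrongful
> death cases. Free consultations; no fee unless we win.

## Practice Areas
- [Car Accidents](https://yourdomain.com/car-accidents): Representation for collision injury claims
- [Slip and Fall](https://yourdomain.com/slip-and-fall): Premises liability cases

## Attorneys
- [Jane Smith](https://yourdomain.com/attorneys/jane-smith): Founding partner, trial lawyer
```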
Create a plain text file named llms.txt and upload it to your website's root directory (the same place robots.txt lives). Your web developer can do this via FTP, or if you're on WordPress, you can use a plugin that allows custom file uploads to the root.
There's also llms-full.txt — a more comprehensive version that includes your entire site's content in markdown format. Luna Legal AI generates both files for you automatically.
JSON-LD (JavaScript Object Notation for Linked Data) is a way to embed structured data in your web pages that machines can read instantly. It uses the schema.org vocabulary — a standardized set of data types that Google, AI platforms, and other services all understand.
When you add JSON-LD schema to your law firm's website, you're essentially saying: "Here are machine-readable facts about my business — my name, address, practice areas, attorney credentials, FAQs, and more."
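A minimal example using schema.org's LegalService type (the firm details are placeholders). This goes inside a script tag of type "application/ld+json" in your page's head:

```json
{
  "@context": "https://schema.org",
  "@type": "LegalService",
  "name": "Smith & Associates",
  "url": "https://yourdomain.com",
  "telephone": "+1-512-555-0100",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "100 Main St",
    "addressLocality": "Austin",
    "addressRegion": "TX",
    "postalCode": "78701",
    "addressCountry": "US"
  }
}
```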
Schema markup is used in 75% of high-performing Generative Engine Optimization (GEO) pages. AI platforms heavily weight structured data when deciding which firms to recommend because it's unambiguous — there's no interpretation needed. The data is either there or it isn't.
Most law firms that have any schema at all have the basic version auto-generated by WordPress plugins. That typically includes your business name, address, and maybe a phone number. But GEO-optimized schema goes much further:
| Field | Basic Schema | GEO-Optimized |
|---|---|---|
| Business name & address | Yes | Yes |
| Phone & hours | Sometimes | Yes |
| about — detailed firm description | No | Yes, 100+ words |
| audience — who you serve | No | Yes |
| areaServed — geographic coverage | No | Yes, with GeoCircle |
| speakable — content for voice AI | No | Yes |
| alternativeHeadline | No | Yes, 3-5 variants |
| knowsAbout — practice areas | No | Yes |
| FAQPage schema | No | Yes, 8-12 per practice area |
| Person schema for attorneys | No | Yes, with credentials |
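To make the table concrete, here is how several of those GEO fields might look when added to the same LegalService schema. All values are illustrative; the speakable selectors and the geoRadius (in meters, per schema.org) should match your own pages and service area:

```json
{
  "@context": "https://schema.org",
  "@type": "LegalService",
  "name": "Smith & Associates",
  "alternativeHeadline": [
    "Austin Car Accident Lawyers",
    "Personal Injury Attorneys in Austin, TX"
  ],
  "knowsAbout": ["Car Accidents", "Slip and Fall", "Wrongful Death"],
  "audience": {
    "@type": "Audience",
    "audienceType": "Accident victims and their families in Central Texas"
  },
  "areaServed": {
    "@type": "GeoCircle",
    "geoMidpoint": {
      "@type": "GeoCoordinates",
      "latitude": 30.2672,
      "longitude": -97.7431
    },
    "geoRadius": "80000"
  },
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": ["h1", ".firm-summary"]
  }
}
```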
WordPress with Yoast/RankMath: These plugins handle basic schema but can't add GEO fields like speakable, audience, or alternativeHeadline. You'll need to add a custom JSON-LD block to your theme's header.php or use a plugin like "Schema Pro" that allows custom schema types.
Easiest method: Luna Legal AI generates your complete GEO-optimized JSON-LD schema as a deliverable. Copy the code, paste it into your site's <head> tag, done.
Think of it as a three-layer system:
Layer 1: robots.txt opens the door. Without this, AI can't even enter your website. It's the prerequisite for everything else.
Layer 2: llms.txt hands AI a brochure at the door. "Here's who we are, what we do, and why you should recommend us." It gives AI a quick, authoritative overview without having to crawl every page.
Layer 3: JSON-LD schema provides the detailed facts on every page. When AI is deciding between two firms to recommend, the one with structured data wins — because AI can be confident in the accuracy of the information.
All three are necessary. robots.txt without the other two means AI can read your site but has to guess what's important. Schema without robots.txt means AI has the facts but can't access them. They work as a system.
Run a GEO scan and get a smart robots.txt, enhanced llms.txt, and GEO-optimized JSON-LD schema — plus 13 more deliverables including FAQ schema, XML sitemap, security headers, and a complete implementation checklist.
Start Free Trial →