Download the PHP package b13/ai-bots-love-markdown without Composer
On this page you can find all versions of the php package b13/ai-bots-love-markdown. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package ai-bots-love-markdown
Short Description Serve page content as Markdown for AI bots
License GPL-2.0-or-later
Information about the package ai-bots-love-markdown
AI Bots Love Markdown - Serve TYPO3 pages as Markdown for AI crawlers
This TYPO3 extension provides an alternative Markdown representation of your pages for AI bots and crawlers.
This makes content accessible not just for humans and screen readers, but also for AI systems that consume and process web content.
Features
- Converts any TYPO3 page to Markdown on-the-fly via content negotiation
- Works automatically with your existing page templates - no TypoScript configuration needed
- Extracts content from <main> element (or <body> as fallback)
- Automatic <link rel="alternate"> tag for Markdown discovery
- Strips navigation, header, footer, and other non-content elements
- Includes page metadata (title, dates, description, categories) as YAML front matter
Installation
Use composer req b13/ai-bots-love-markdown or install it via TYPO3's Extension Manager.
After installation, add the site set to your site configuration:
Configuration
Site Settings
The extension provides the following settings that can be configured per site:
| Setting | Default | Description |
|---|---|---|
| ai_bots_love_markdown.enableContentNegotiation | true | Enable content negotiation via Accept: text/markdown header |
| ai_bots_love_markdown.enableDiscoveryTag | true | Add <link rel="alternate"> tag to HTML pages for Markdown discovery |
| ai_bots_love_markdown.pageTypeSuffix | ai-bots-love.md | PageType Suffix for Markdown link |
| ai_bots_love_markdown.pageTypeTypeNum | 2026 | PageType TypeNum for Markdown link |
| ai_bots_love_markdown.removeElements | script style nav footer aside form iframe noscript | Space-separated HTML tags stripped from the Markdown output. <header> is intentionally not included, because article-level <header> regions often hold the H1 |
| ai_bots_love_markdown.excludedDoktypes | 3,4,6,7,199,254,255 | Comma-separated page doktypes that never produce a Markdown alternate response. Default covers TYPO3 system doktypes (external link, shortcut, mountpoint, sysfolder, recycler, etc.) |
| ai_bots_love_markdown.cacheable | true | Allow CDN / reverse-proxy caching of Markdown responses. Disable to force every hit through to the origin, which is required when you want to count AI-bot deliveries accurately (e.g. via b13/ai-bot-tracker). When false, Cache-Control: private, no-store is set on Markdown responses |
To override these settings, add them to your site's settings.yaml:
Per-page opt-out
Editors can disable the Markdown alternate for individual pages via the page property
"Disable Markdown version" (TCA field pages.markdown_version, enabled by default and
rendered inverted in the backend so the toggle reads as "disable"). When the toggle is
turned on, the <link rel="alternate"> discovery tag is stripped from the HTML response,
and any request via the .md suffix or Accept: text/markdown returns the regular HTML.
Caching and content negotiation
Markdown responses share the URL of the HTML page when Accept: text/markdown is
used. To prevent reverse proxies from serving a cached HTML response to a Markdown
requester (or vice versa), the middleware adds a Vary: Accept header to every
response that goes through a site with content negotiation enabled — regardless
of whether the current request asked for Markdown.
By default (cacheable: true), TYPO3's page-level Cache-Control is preserved
on the Markdown response, so the CDN may cache. If you need every Markdown
delivery to reach the origin — typically to track bot consumption — set
cacheable: false and the response is sent with Cache-Control: private, no-store.
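The header rules described above can be sketched in plain PHP. This is a minimal illustration on a simple header array, not the extension's middleware code; the function name applyCachingHeaders and its parameters are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Apply the caching rules described above to a map of header name => value.
 * Hypothetical helper for illustration only.
 */
function applyCachingHeaders(array $headers, bool $negotiationEnabled, bool $isMarkdown, bool $cacheable): array
{
    if ($negotiationEnabled) {
        // Vary: Accept goes on every response of the site, Markdown or HTML,
        // so a shared cache keeps the two representations apart.
        $vary = array_filter(array_map('trim', explode(',', $headers['Vary'] ?? '')));
        if (!in_array('accept', array_map('strtolower', $vary), true)) {
            $vary[] = 'Accept';
        }
        $headers['Vary'] = implode(', ', $vary);
    }
    if ($isMarkdown && !$cacheable) {
        // cacheable: false forces every Markdown delivery through to the origin.
        $headers['Cache-Control'] = 'private, no-store';
    }
    return $headers;
}
```

With cacheable left at true, the existing Cache-Control value passes through unchanged, which matches the default behaviour described above.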
Usage
Access Methods
- Accept header: Request any page with an Accept: text/markdown header, e.g. curl -H "Accept: text/markdown" https://example.com/my-page/
- URL suffix: Append /ai-bots-love.md to any page URL
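As a rough sketch of how the two access methods could be told apart on the server side, the following plain-PHP helper checks the Accept header and the URL suffix. The function name wantsMarkdown is hypothetical, and the sketch ignores q-values; the extension's real middleware may decide differently.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a request asks for the Markdown representation.
 * Hypothetical helper, not the extension's actual middleware code.
 */
function wantsMarkdown(string $acceptHeader, string $path, string $suffix = 'ai-bots-love.md'): bool
{
    // Access method 1: the Accept header lists text/markdown among its media types.
    foreach (explode(',', $acceptHeader) as $range) {
        // Drop parameters such as ";q=0.9" and compare case-insensitively.
        $mediaType = strtolower(trim(explode(';', $range)[0]));
        if ($mediaType === 'text/markdown') {
            return true;
        }
    }
    // Access method 2: the URL path ends with the configured suffix.
    return str_ends_with(rtrim($path, '/'), '/' . $suffix);
}
```

The default suffix mirrors the pageTypeSuffix setting, so a site that changes that setting would pass its own value as the third argument.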
Output Format
The extension outputs Markdown with YAML front matter, extracting metadata from HTML meta tags with fallback to TYPO3 page record data:
Metadata Priority:
- title: og:title → <title> tag → page record
- url: canonical link → request URL
- description: og:description → meta description → page record
- image: og:image (if present)
- author: meta author → page record
- date/modified: page record (crdate/tstamp)
- keywords: meta keywords (if present)
- categories: TYPO3 sys_category (from database)
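The fallback chains above amount to "take the first non-empty source". A minimal sketch in plain PHP, assuming the meta tag values have already been collected into an array; the array keys, function names, and the handling of empty values are assumptions made for illustration, not the extension's MetadataService.

```php
<?php
declare(strict_types=1);

/** Return the first candidate that is a non-empty string, or null if none is. */
function firstNonEmpty(?string ...$candidates): ?string
{
    foreach ($candidates as $candidate) {
        if ($candidate !== null && trim($candidate) !== '') {
            return trim($candidate);
        }
    }
    return null;
}

/**
 * Resolve front-matter values following the priority list above.
 * $meta holds values read from the HTML head (hypothetical keys),
 * $page is the page record row.
 */
function resolveFrontMatter(array $meta, array $page, string $requestUrl): array
{
    $data = [
        'title' => firstNonEmpty($meta['og:title'] ?? null, $meta['title'] ?? null, $page['title'] ?? null),
        'url' => firstNonEmpty($meta['canonical'] ?? null, $requestUrl),
        'description' => firstNonEmpty($meta['og:description'] ?? null, $meta['description'] ?? null, $page['description'] ?? null),
        'image' => firstNonEmpty($meta['og:image'] ?? null),
        'author' => firstNonEmpty($meta['author'] ?? null, $page['author'] ?? null),
    ];
    // Entries without any source are left out, matching the "(if present)" notes.
    return array_filter($data, static fn (?string $value): bool => $value !== null);
}
```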
Auto-Discovery
The extension automatically adds a <link> tag to all HTML pages for Markdown discovery:
Opt-in Behavior
Pages are only converted to Markdown if they contain the <link rel="alternate" type="text/markdown"> tag.
This means:
- Pages without the site set enabled won't be converted
- You can disable conversion for specific pages by disabling the discovery tag
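To illustrate the opt-in check, here is a small sketch that looks for the discovery tag in rendered HTML using PHP's DOM extension. The helper name hasMarkdownAlternate is hypothetical, and the extension may detect the tag in a different way.

```php
<?php
declare(strict_types=1);

/**
 * Check whether rendered HTML contains <link rel="alternate" type="text/markdown">.
 * Hypothetical helper for illustration; requires ext-dom.
 */
function hasMarkdownAlternate(string $html): bool
{
    if (trim($html) === '') {
        return false; // DOMDocument::loadHTML() rejects empty input
    }
    // Collect parser warnings (e.g. about HTML5 tags) instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $loaded = $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($loaded === false) {
        return false;
    }
    $xpath = new DOMXPath($document);
    $links = $xpath->query('//link[@rel="alternate"][@type="text/markdown"]');
    return $links !== false && $links->length > 0;
}
```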
Technical Details
How It Works
- A PSR-15 middleware intercepts requests with Accept: text/markdown header or /ai-bots-love.md suffix
- The normal page rendering proceeds (your existing templates, TypoScript, etc.)
- The middleware checks if the response contains the markdown alternate link tag
- If present, it extracts <main> content (or <body> as fallback)
- The HTML is converted to Markdown using league/html-to-markdown
- Navigation, header, footer, and other non-content elements are stripped
- Page metadata is added as YAML front matter
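The extraction and stripping steps above can be sketched with PHP's DOM extension. This is an illustration under assumptions, not the extension's implementation: the function name extractContentHtml is hypothetical, and the actual Markdown conversion (done by league/html-to-markdown's HtmlConverter in the extension) is left out so the block runs without dependencies.

```php
<?php
declare(strict_types=1);

/**
 * Return the inner HTML of <main> (or <body> as fallback) with the given
 * tags removed. Hypothetical helper for illustration; requires ext-dom.
 */
function extractContentHtml(string $html, array $removeTags = []): string
{
    if (trim($html) === '') {
        return '';
    }
    $previous = libxml_use_internal_errors(true); // collect warnings about HTML5 tags
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Prefer the first <main> element, otherwise fall back to <body>.
    $root = $document->getElementsByTagName('main')->item(0)
        ?? $document->getElementsByTagName('body')->item(0);
    if ($root === null) {
        return '';
    }
    foreach ($removeTags as $tag) {
        // Copy the live node list first, because removing nodes changes it.
        foreach (iterator_to_array($root->getElementsByTagName($tag)) as $node) {
            $node->parentNode?->removeChild($node);
        }
    }
    $inner = '';
    foreach ($root->childNodes as $child) {
        $inner .= $document->saveHTML($child);
    }
    return trim($inner);
}
```

The second argument corresponds to the removeElements setting split on spaces; the resulting HTML fragment is what would then be handed to the Markdown converter.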
Response Headers
The Markdown response includes:
- Content-Type: text/markdown; charset=utf-8
- X-Robots-Tag: noindex (to prevent indexing of Markdown version)
Best Practices for Templates
For best results, wrap your main content in a <main> element:
Excluding Content from Markdown
Beyond the <main> selection and the configurable removeElements list, you
can mark template regions explicitly via two bundled Fluid partials. They are
auto-registered through the site set, so no partialRootPaths setup is
required.
Wrap a region you want excluded from the markdown output (teasers, related-article boxes, breadcrumbs, CTAs):
Or explicitly mark the main content region (overrides <main> detection,
useful when <main> is missing or contains too much):
Both partials emit HTML comments (<!-- markdown-start -->,
<!-- markdown-exclude-start -->, …) that survive caching and are stripped
from regular page responses by a frontend middleware before they reach human
visitors. Excludes can be nested.
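How nested exclude regions might be removed can be sketched with a depth counter over the marker comments. The start marker below is taken from the text above; the end marker name is an assumption, since the text elides it, and the function name removeExcludedRegions is hypothetical.

```php
<?php
declare(strict_types=1);

// Start marker as named above; the end marker name is an assumption.
const EXCLUDE_START = '<!-- markdown-exclude-start -->';
const EXCLUDE_END = '<!-- markdown-exclude-end -->'; // hypothetical name

/**
 * Drop everything between exclude markers, honouring nesting.
 * Hypothetical helper for illustration only.
 */
function removeExcludedRegions(string $html): string
{
    $pattern = '/(' . preg_quote(EXCLUDE_START, '/') . '|' . preg_quote(EXCLUDE_END, '/') . ')/';
    // Split on the markers but keep them in the result as separate parts.
    $parts = preg_split($pattern, $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $depth = 0;
    $output = '';
    foreach ($parts as $part) {
        if ($part === EXCLUDE_START) {
            $depth++;
        } elseif ($part === EXCLUDE_END) {
            $depth = max(0, $depth - 1); // tolerate a stray end marker
        } elseif ($depth === 0) {
            $output .= $part; // only text outside every exclude region survives
        }
    }
    return $output;
}
```

Counting depth instead of matching the first end marker is what lets an exclude region contain another one without cutting the outer region short.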
Events
The extension dispatches three PSR-14 events. Listen via the
#[AsEventListener] attribute or by registering a listener in your
extension's Configuration/Services.yaml. All events live under the
B13\AiBotsLoveMarkdown\Event\ namespace.
| Event | Fired | Use case |
|---|---|---|
| BuildHtmlMarkdownConverterEvent | Before HTML → Markdown conversion runs | Add custom node converters, override HtmlConverter options |
| AfterFrontMatterForPageIsCreatedEvent | After the YAML front-matter array is assembled from meta tags and the page record, before serialisation | Add, remove, or replace front-matter entries (e.g. enrich with domain-specific keys) |
| AfterMarkdownConversionEvent | After Markdown content has been built, before the response is returned | Side effects on every Markdown delivery (e.g. b13/ai-bot-tracker writes a tracking row from this event) |
See Extending the YAML front matter below
for a full AfterFrontMatterForPageIsCreatedEvent listener example.
Extending the YAML front matter
The default front matter is built from HTML meta tags and the page record. To
add domain-specific keys (e.g. seminar data, product attributes, event dates),
listen to the AfterFrontMatterForPageIsCreatedEvent. Listeners receive the assembled
data array before it is rendered to YAML and may add, remove, or replace
entries.
The event also exposes $html, $pageInformation, and $request (each
read-only) so listeners can resolve site, language, or routing information
when needed. Array order is preserved in the rendered YAML, so you can
prepend custom keys by recreating the array if order matters.
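Recreating the array to put custom keys first can be done with PHP's array union operator. The sketch below works on plain arrays only; in a real listener the array would be read from and written back to the event, whose accessor names are not shown here. The function name and the seminar_start key are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Return a new front-matter array with the custom entries placed first.
 * Hypothetical helper for illustration only.
 */
function prependFrontMatter(array $data, array $custom): array
{
    // Array union keeps the left-hand keys and their order, then appends
    // the keys of $data that are not already present in $custom.
    return $custom + $data;
}

// Example: the custom key leads, and the custom title replaces the default one.
$frontMatter = prependFrontMatter(
    ['title' => 'Seminar', 'url' => 'https://example.com/seminar/'],
    ['seminar_start' => '2026-03-01', 'title' => 'TYPO3 Seminar']
);
```

Because the union operator prefers the left operand, this both prepends new keys and replaces existing ones; use array_merge($custom, $data) instead if the default values should win.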
If you need to bypass the YAML rendering entirely, the underlying methods
MetadataService::buildFrontMatterData() and MetadataService::renderYaml()
are public and can be used directly.
License
The extension is licensed under GPL v2+, same as the TYPO3 Core.
Background & Authors
This extension was created by b13 GmbH in 2026 to enable AI systems to better consume website content. As AI crawlers become increasingly important for content discovery and processing, providing a clean Markdown representation helps ensure your content is accurately understood and indexed by AI systems.
Huge credits to Dries Buytaert for the inspiration on this topic.
Find more TYPO3 extensions we have developed that help us deliver value in client projects. As part of the way we work, we focus on testing and best practices to ensure long-term performance, reliability, and results in all our code.
All versions of ai-bots-love-markdown with dependencies
typo3/cms-core Version ^13.4 || ^14.0
league/html-to-markdown Version ^5.1