www.crunchbase.com · scanned May 25, 2026 · 09:19 · 1.64s

Public AI visibility report

www.crunchbase.com · AI visibility: Poor

This site is difficult for AI tools to read right now.

Key strengths include structured data, while homepage access and crawler policy need attention.

Recommended next step

Remove AI crawler Disallow: / rules or replace them with narrower path-level restrictions for private content only.

Overall score

17/100

Poor

Overall position: 1,115 out of 1,130

// score breakdown

Points by check

8 checks

Crawlability · 0/20 · FAIL

Robots.txt · 0/15 · FAIL

llms.txt · 0/15 · FAIL

Sitemap · 0/10 · FAIL

Markdown support · 0/15 · FAIL

Semantic HTML · 7.1/10 · WARN

Structured data · 10/10 · PASS

Content signals · 0/5 · FAIL

1 pass · 1 warn · 6 fail
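
The per-check points appear to add up to the overall score. A quick sketch of that arithmetic, assuming the overall score is simply the rounded sum of the points (the report does not state how the scanner aggregates them):

```python
# Points earned and maximum weight per check, copied from the breakdown above.
CHECKS = {
    "Crawlability": (0, 20),
    "Robots.txt": (0, 15),
    "llms.txt": (0, 15),
    "Sitemap": (0, 10),
    "Markdown support": (0, 15),
    "Semantic HTML": (7.1, 10),
    "Structured data": (10, 10),
    "Content signals": (0, 5),
}

earned = sum(points for points, _ in CHECKS.values())
maximum = sum(weight for _, weight in CHECKS.values())

# Assumption: the headline score is the rounded sum of earned points.
overall = round(earned)
print(f"{overall}/{maximum}")
```

Under that assumption the weights total 100 and the earned points round to the reported 17.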

Public link

llmscan.dev/scan/xaWMmvSyDBVcGIN94IOLY

Signals checked

8 AI visibility signals

Fix bundle

4 copy-ready files

Share badge

Poor · 17/100

Add a polished proof badge

A compact badge for footer, press, or trust sections that links visitors to this public report.

Embed code: llmscan.dev/scan/xaWMmvSyDBVcGIN94IOLY

<a href="https://www.llmscan.dev/scan/xaWMmvSyDBVcGIN94IOLY"
  target="_blank"
  rel="noopener"
>
  <img
    src="https://www.llmscan.dev/scan/xaWMmvSyDBVcGIN94IOLY/badge.png"
    alt="LLM Scan AI visibility score badge"
    width="460"
    height="120"
    style="width: 260px; max-width: 100%; height: auto;"
  />
</a>

// signal breakdown

8 signals AI systems depend on

The homepage is reachable, but robots.txt blocks GPTBot, ChatGPT-User, Claude-Web, PerplexityBot, and Google-Extended from crawling the site.

Signal weight

0/20

Fail

Evidence

url: https://crunchbase.com/
finalUrl: https://www.crunchbase.com/
status: 200

Recommendation

Next step: Remove AI crawler Disallow: / rules or replace them with narrower path-level restrictions for private content only.

robots.txt explicitly blocks GPTBot, ChatGPT-User, Claude-Web, PerplexityBot, and Google-Extended from the whole site.

Signal weight

0/15

Fail

Evidence

robotsTxtUrl: https://www.crunchbase.com/robots.txt
exists: true
rawRobotsTxt:
  User-agent: *
  # Allow API and JS paths to be requested by crawlers
  Allow: /v4/md/applications/crunchbase
  Allow: /*.js$
  Disallow: /login
  Disallow: /register
  Disallow: /account
  Disallow: /account/invite
  Disallow: /reset-password
  Disallow: /subscriptions
  Disallow: /contribute
  Disallow: /add-new
  Disallow: /edit
  Disallow: /edit/success
  Disallow: /edit/review
  Disallow: /buy
  Allow: /buy/select-product
  Disallow: /account-setup
  Disallow: /verify
  Disallow: /admin
  Disallow: /v4
  Disallow: /home
  Disallow: /search
  Disallow: /discover
  Disallow: /textsearch
  # AI and LLM Crawling
  User-agent: CCBot
  Disallow: /
  User-agent: ChatGPT-User
  Disallow: /
  User-agent: OAI-SearchBot
  Disallow: /
  User-agent: GPTBot
  Disallow: /
  User-agent: Google-Extended
  Disallow: /
  User-agent: anthropic-ai
  Disallow: /
  User-agent: Omgilibot
  Disallow: /
  User-agent: Omgili
  Disallow: /
  User-agent: FacebookBot
  Disallow: /
  User-agent: Diffbot
  Disallow: /
  User-agent: Bytespider
  Disallow: /
  User-agent: ImagesiftBot
  Disallow: /
  User-agent: cohere-ai
  Disallow: /
  User-agent: Claude-Web
  Disallow: /
  User-agent: PerplexityBot
  Disallow: /
  Sitemap: https://www.crunchbase.com/www-sitemaps/sitemap-index.xml

Recommendation

Next step: Remove AI crawler Disallow: / rules or add narrower Allow/Disallow rules if AI crawlers should be able to discover public content.
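
One way to sanity-check a robots.txt change before deploying it is Python's standard-library `urllib.robotparser`. The sketch below compares an excerpt of the rules quoted in the evidence with a narrowed variant. The narrowed rules and the `allowed` helper are illustrative assumptions, not the scanner's logic, and `robotparser` does not implement wildcard patterns such as `/*.js$`, so the excerpt uses plain path prefixes only.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the rules quoted in the evidence above.
CURRENT_RULES = """\
User-agent: *
Disallow: /login
Disallow: /account

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /
"""

# Hypothetical narrowed variant: private paths stay blocked, the rest opens up.
NARROWED_RULES = """\
User-agent: *
Disallow: /login
Disallow: /account

User-agent: GPTBot
Disallow: /login
Disallow: /account

User-agent: ChatGPT-User
Disallow: /login
Disallow: /account
"""


def allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    home = "https://www.crunchbase.com/"
    for agent in ("GPTBot", "ChatGPT-User"):
        print(agent, allowed(CURRENT_RULES, agent, home), allowed(NARROWED_RULES, agent, home))
```

Running the same check against the full production file, for each crawler you intend to admit, would catch a leftover `Disallow: /` group.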

No llms.txt file was found for this site.

Signal weight

0/15

Fail

Evidence

llmsTxtUrl: https://www.crunchbase.com/llms.txt
present: false
accessible: false

Recommendation

Next step: Publish /llms.txt as text or markdown with more than 200 characters, markdown headings, and at least one absolute URL.
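
The three criteria in that recommendation can be checked locally before publishing. This is a minimal sketch: `check_llms_txt` is a hypothetical helper, the regular expressions are one possible reading of "markdown headings" and "absolute URL", and the draft text is adapted from the generated preview in this report. The scanner's own checks may differ.

```python
import re

# Draft adapted from the generated llms.txt preview in this report.
DRAFT = """\
# Crunchbase

> Use this file to orient AI systems to the site's public content, canonical URLs, and crawler expectations.

This llms.txt file summarizes the public, canonical resources that AI assistants and crawlers should use to understand this site.

## Site Overview

- Canonical URL: https://www.crunchbase.com/
"""


def check_llms_txt(text: str) -> dict:
    """Apply the three criteria named in the recommendation to a draft file."""
    return {
        # "more than 200 characters"
        "long_enough": len(text) > 200,
        # at least one ATX-style markdown heading such as "# Title"
        "has_heading": re.search(r"^#{1,6} \S", text, re.MULTILINE) is not None,
        # at least one absolute http(s) URL
        "has_absolute_url": re.search(r"https?://\S+", text) is not None,
    }


if __name__ == "__main__":
    print(check_llms_txt(DRAFT))
```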

No accessible XML sitemap was found for this site.

Signal weight

0/10

Fail

Evidence

sitemapUrl: https://www.crunchbase.com/sitemap.xml
sitemapUrls: [https://www.crunchbase.com/sitemap.xml, https://www.crunchbase.com/www-sitemaps/sitemap-index.xml]
robotsSitemapUrls: [https://www.crunchbase.com/www-sitemaps/sitemap-index.xml]

Recommendation

Next step: Publish a valid XML sitemap at /sitemap.xml and reference it from robots.txt so crawlers and AI systems can discover important URLs.
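
For reference, a minimal sitemap in the sitemaps.org 0.9 format can be generated with the standard library. This is a sketch only: `build_sitemap` is a hypothetical helper, and the single URL is the homepage from the evidence. A real sitemap would list the site's important public URLs, and the robots.txt quoted earlier already points to an index at /www-sitemaps/sitemap-index.xml that may be the better file to make accessible.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a minimal sitemaps.org 0.9 document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap(["https://www.crunchbase.com/"]))
```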

The homepage returned HTML when requested with Accept: text/markdown, so the server appears to ignore markdown content negotiation.

Signal weight

0/15

Fail

Evidence

url: https://www.crunchbase.com/
acceptHeader: text/markdown
status: 200

Recommendation

Next step: Add content negotiation for Accept: text/markdown on the homepage and return a markdown representation with Content-Type: text/markdown. Keep the HTML response for regular browser requests.
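
The negotiation step itself is small. The sketch below shows one way a server could choose between HTML and markdown from the Accept header. `parse_accept` and `choose_content_type` are hypothetical helpers, and the simplified q-value handling is an assumption rather than a full implementation of HTTP content negotiation.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        items.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose_content_type(accept: str) -> str:
    """Serve markdown only when the client ranks it first; otherwise keep HTML."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type == "text/markdown":
            return "text/markdown; charset=utf-8"
        if media_type in ("text/html", "text/*", "*/*"):
            return "text/html; charset=utf-8"
    return "text/html; charset=utf-8"


if __name__ == "__main__":
    print(choose_content_type("text/markdown"))
    print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Defaulting to HTML keeps regular browser requests unchanged, which is what the recommendation asks for.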

The homepage has some semantic HTML signals, but one or more title, metadata, heading, landmark, content, or link text checks need improvement.

Signal weight

7.1/10

Warn

Evidence

url: https://www.crunchbase.com/
quality: partial
score: 71

Recommendation

Next step: Add a meta description between 50 and 160 characters. Add missing semantic elements: main, article, nav, footer.
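
Both parts of that recommendation can be checked against a page's HTML with the standard-library parser. This is a rough sketch: `SemanticAudit` and `audit` are hypothetical names, and the scanner's full set of semantic checks (title, headings, link text) is not reproduced here.

```python
from html.parser import HTMLParser

# Landmark elements named in the recommendation above.
REQUIRED_TAGS = {"main", "article", "nav", "footer"}


class SemanticAudit(HTMLParser):
    """Record which landmark elements appear and capture the meta description."""

    def __init__(self):
        super().__init__()
        self.seen_tags = set()
        self.meta_description = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in REQUIRED_TAGS:
            self.seen_tags.add(tag)
        if tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.meta_description = attr.get("content") or ""


def audit(html: str) -> dict:
    parser = SemanticAudit()
    parser.feed(html)
    parser.close()
    description = parser.meta_description
    return {
        "missing_tags": sorted(REQUIRED_TAGS - parser.seen_tags),
        # The recommendation asks for 50 to 160 characters.
        "description_ok": description is not None and 50 <= len(description) <= 160,
    }


if __name__ == "__main__":
    print(audit("<html><head></head><body><div>Hello</div></body></html>"))
```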

Valid JSON-LD structured data was found with core Organization or WebSite schema.org types.

Signal weight

10/10

Pass

Evidence

url: https://www.crunchbase.com/
quality: good
hasStructuredData: true
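
To confirm this signal still passes after a template change, the JSON-LD blocks can be pulled out of the page and inspected for the core types. This is a sketch with hypothetical names (`JsonLdExtractor`, `schema_types`). It only collects `@type` values and does not validate the markup against schema.org.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed JSON from <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def schema_types(html: str) -> set:
    """Return every @type string found anywhere in the page's JSON-LD."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    types = set()

    def walk(node):
        if isinstance(node, dict):
            value = node.get("@type")
            if isinstance(value, str):
                types.add(value)
            elif isinstance(value, list):
                types.update(v for v in value if isinstance(v, str))
            for child in node.values():
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)

    for document in extractor.documents:
        walk(document)
    return types
```

A page passes this rough check when the returned set contains `Organization` or `WebSite`.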

Content-Signal directive not detected in headers, HTML metadata, or robots.txt.

Signal weight

0/5

Fail

Evidence

url: https://www.crunchbase.com/
hasContentSignals: false
hasContentSignalHeader: false

Recommendation

Next step: Add the standard directive 'Content-Signal: ai-train=no, search=yes, ai-input=yes' to robots.txt, HTML metadata, or HTTP headers so AI systems can discover content usage preferences.
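
The directive is a comma-separated list of key=value pairs, so it is easy to read back and verify once published. A minimal sketch follows: `parse_content_signal` is a hypothetical helper and assumes the single-line form quoted in the recommendation, with values limited to yes and no.

```python
def parse_content_signal(line: str) -> dict[str, bool]:
    """Parse 'Content-Signal: key=yes, key=no, ...' into a dict of booleans."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal directive")
    signals = {}
    for pair in value.split(","):
        key, sep, flag = pair.partition("=")
        key, flag = key.strip().lower(), flag.strip().lower()
        if not sep or flag not in ("yes", "no"):
            continue  # ignore malformed or unknown entries
        signals[key] = flag == "yes"
    return signals


if __name__ == "__main__":
    print(parse_content_signal("Content-Signal: ai-train=no, search=yes, ai-input=yes"))
```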

// generated fixes

Downloadable fix files

Preview the generated files below. Enter your email to reveal the full fixes, download the bundle, or copy the agent-ready implementation prompt.

llms.txt · Markdown

# Crunchbase

> Use this file to orient AI systems to the site's public content, canonical URLs, and crawler expectations.

This llms.txt file summarizes the public, canonical resources that AI assistants and crawlers should use to understand this site.

## Site Overview

- Canonical URL: https://www.crunchbase.com/
- Site type: web site

robots.txt · TXT

# robots.txt additions
# Copy these blocks into the existing robots.txt file. Keep current rules unless a note calls out a conflicting Disallow.

# AI crawler access
# Add explicit Allow rules for blocked AI crawlers; remove or narrow conflicting Disallow rules if your crawler target requires precedence.
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

schema.json · JSON

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://www.crunchbase.com/#organization",
      "name": "Crunchbase",
      "description": "Website for Crunchbase.",
      "url": "https://www.crunchbase.com/",
      "logo": "https://images.crunchbase.com/image/upload/c_pad,h_25,w_25,f_auto,b_white,q_auto:eco,dpr_1/c2905ebc906840f98955e4acd3d6fbb0?ik-sanitizeSvg=true",

head meta · HTML

# Content-Signal recommendations

Use these directives to make AI-use preferences explicit for compliant crawlers and AI systems. They are advisory signals, so keep them aligned with robots.txt, terms, and access controls.

## Recommended values

- ai-train=no: AI model training, fine-tuning, and dataset creation.
- search=yes: AI search indexing, snippets, and discovery.
- ai-input=yes: AI answer grounding, retrieval, and generated-response context.