Lucas the Spider Bot

Identity

  • Name: Lucas the Spider
  • User-Agent Strings:
    • Mozilla/5.0 (X11; Ubuntu; Linux x86_64) WebKit/18.2 (KHTML, like Gecko) Lucas/v2025.70 (+https://noordigital.com/lucas) Playwright/1.49.0.0
    • Mozilla/5.0 (X11; Ubuntu; Linux x86_64; .NET 8.0.20) Lucas/v2025.70 (+https://noordigital.com/lucas) HttpClient/8.0.0.0
    • Identifier substring: Lucas/v20
  • Owner: Noor Digital Agency AB
  • Contact: [email protected]
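
For site operators who want to recognize Lucas in access logs, the identifier substring above can be matched directly. The following is a minimal sketch, not an official client; the function names `is_lucas` and `lucas_version` and the version-extracting pattern are assumptions made for illustration.

```python
import re
from typing import Optional

# Identifier substring published in the Identity list above.
LUCAS_IDENTIFIER = "Lucas/v20"

# Hypothetical helper pattern: captures the full version token, e.g. "v2025.70".
_VERSION_RE = re.compile(r"Lucas/(v20[\d.]+)")


def is_lucas(user_agent: str) -> bool:
    """Return True if the User-Agent contains the published identifier substring."""
    return LUCAS_IDENTIFIER in user_agent


def lucas_version(user_agent: str) -> Optional[str]:
    """Return the Lucas version token from the User-Agent, or None if absent."""
    match = _VERSION_RE.search(user_agent)
    return match.group(1) if match else None
```

A User-Agent header can be spoofed, so a substring match alone identifies a claimed Lucas request, not a verified one.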

Purpose & Behavior

  • Crawls a list of specified URLs only (not a general spider).
  • Fetches HTML pages only.
  • Purpose: HTTP status health checks and content snapshots.
  • Crawl frequency: Determined by how fresh the last recorded status is, plus on-demand crawl requests.

Respect & Controls

  • Robots.txt: Not applicable (Lucas only crawls explicit URL lists, not whole sites).
  • Meta robots: Not applicable.
  • Opt-out: Exclude the URL from the crawl list.
  • Rate limits: Adaptive, per-domain backoff. General limit: 25 URLs per second unless configured otherwise.
  • Concurrency: Adaptive per domain.
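
To illustrate what adaptive, per-domain backoff can look like, here is a minimal sketch. It is not Lucas's actual implementation: the class `PerDomainLimiter`, the choice of HTTP 429 and 503 as backoff triggers, the doubling and halving steps, the 60-second cap, and applying the 25 URLs per second figure per domain are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict
from urllib.parse import urlsplit

DEFAULT_RATE = 25.0  # URLs per second, taken from the general limit above


@dataclass
class DomainState:
    interval: float            # current minimum gap between requests, in seconds
    next_allowed: float = 0.0  # earliest time the next request may be sent


class PerDomainLimiter:
    """Hypothetical limiter: one independent backoff state per hostname."""

    def __init__(self, rate: float = DEFAULT_RATE, max_interval: float = 60.0):
        self.base_interval = 1.0 / rate
        self.max_interval = max_interval  # assumed cap on the backoff
        self.domains: Dict[str, DomainState] = {}

    def _state(self, url: str) -> DomainState:
        host = urlsplit(url).hostname or ""
        if host not in self.domains:
            self.domains[host] = DomainState(interval=self.base_interval)
        return self.domains[host]

    def wait_time(self, url: str, now: float) -> float:
        """Seconds to wait before the URL's domain may be requested again."""
        return max(0.0, self._state(url).next_allowed - now)

    def record(self, url: str, status: int, now: float) -> None:
        """Update the domain's interval after a response with the given status."""
        state = self._state(url)
        if status in (429, 503):  # assumed overload signals: slow down
            state.interval = min(state.interval * 2, self.max_interval)
        else:                     # otherwise recover toward the base rate
            state.interval = max(state.interval / 2, self.base_interval)
        state.next_allowed = now + state.interval
```

Timestamps are passed in explicitly so the logic can be exercised without sleeping; a caller would typically supply `time.monotonic()`.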

Technical Details

  • Reverse DNS ranges:
    • .lucas-hub.dnscdn.se
    • .lucas1.dnscdn.se through .lucas10.dnscdn.se
  • Headers: Default headless browser headers only.
  • Retry policy: No retries unless the URL reappears in the crawl list.
  • Crawl origin: Hetzner servers, Finland.
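
The reverse DNS entries above allow a stronger check than the User-Agent alone. Below is a minimal sketch of forward-confirmed reverse DNS using Python's standard `socket` module. It assumes the second entry denotes the hosts lucas1 through lucas10; the function names `hostname_matches` and `verify_ip` are hypothetical.

```python
import socket

# Suffixes from the reverse DNS list above. Reading the second entry as the
# range lucas1..lucas10 is an assumption.
ALLOWED_SUFFIXES = (".lucas-hub.dnscdn.se",) + tuple(
    f".lucas{n}.dnscdn.se" for n in range(1, 11)
)


def hostname_matches(hostname: str) -> bool:
    """Return True if the hostname ends with one of the published suffixes."""
    host = hostname.rstrip(".").lower()
    return host.endswith(ALLOWED_SUFFIXES)


def verify_ip(ip: str) -> bool:
    """Forward-confirmed reverse DNS check for an IPv4 address."""
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)  # reverse lookup
    except OSError:
        return False
    if not hostname_matches(hostname):
        return False
    try:
        # Forward lookup (IPv4 only) must resolve back to the same address.
        _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        return False
    return ip in addresses
```

The forward lookup matters because anyone who controls the reverse zone for their own IP address can make it return an arbitrary hostname.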

Access Info