Lucas the Spider Bot
Identity
- Name: Lucas the Spider
- User-Agent Strings:
Mozilla/5.0 (X11; Ubuntu; Linux x86_64) WebKit/18.2 (KHTML, like Gecko) Lucas/v2025.70 (+https://noordigital.com/lucas) Playwright/1.49.0.0
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; .NET 8.0.20) Lucas/v2025.70 (+https://noordigital.com/lucas) HttpClient/8.0.0.0
- Identifier substring:
Lucas/v20
- Owner: Noor Digital Agency AB
- Contact: [email protected]
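Site operators who want to recognize Lucas in request logs can match on the identifier substring listed above. The following is a minimal sketch, not an official client library: the helper names `is_lucas` and `lucas_version` and the version pattern are illustrative assumptions, and only the substring `Lucas/v20` comes from this page.

```python
import re
from typing import Optional

# Identifier substring taken from the Identity section of this page.
LUCAS_IDENTIFIER = "Lucas/v20"

# Assumed pattern for the version token (e.g. "v2025.70"); the exact
# version scheme is not specified on this page.
_VERSION_RE = re.compile(r"Lucas/(v\d+(?:\.\d+)*)")


def is_lucas(user_agent: str) -> bool:
    """Return True if the User-Agent contains the Lucas identifier substring."""
    return LUCAS_IDENTIFIER in user_agent


def lucas_version(user_agent: str) -> Optional[str]:
    """Return the version token such as 'v2025.70', or None if this is not Lucas."""
    if not is_lucas(user_agent):
        return None
    match = _VERSION_RE.search(user_agent)
    return match.group(1) if match else None
```

A User-Agent header is easy to spoof, so a substring match is best treated as a first filter and combined with a reverse DNS check where it matters.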
Purpose & Behavior
- Crawls a list of specified URLs only (not a general spider).
- Fetches HTML pages only.
- Purpose: HTTP status health checks and content snapshots.
- Crawl frequency: Determined by how fresh the last recorded status is, plus on-demand crawl requests.
Respect & Controls
- Robots.txt: Not applicable (Lucas only crawls explicit URL lists, not whole sites).
- Meta robots: Not applicable.
- Opt-out: Exclude the URL from the crawl list.
- Rate limits: Adaptive, with per-domain backoff. General limit: 25 URLs per second unless configured otherwise.
- Concurrency: Adaptive per domain.
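To check observed traffic against the general limit stated above, an operator could count Lucas requests per second in their own access logs. This is a rough sketch under stated assumptions: the function name `seconds_over_limit` and the input shape (pairs of timestamp and User-Agent) are hypothetical, and it assumes the 25 URLs per second figure applies to the requests a single site sees.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Values taken from this page.
GENERAL_LIMIT_PER_SECOND = 25
LUCAS_IDENTIFIER = "Lucas/v20"


def seconds_over_limit(
    entries: Iterable[Tuple[datetime, str]],
    limit: int = GENERAL_LIMIT_PER_SECOND,
) -> Dict[datetime, int]:
    """Return {second: request_count} for each second where Lucas exceeded `limit`.

    `entries` is an iterable of (timestamp, user_agent) pairs, for example
    parsed from a web server access log (parsing is out of scope here).
    """
    per_second = Counter(
        timestamp.replace(microsecond=0)  # bucket requests by whole second
        for timestamp, user_agent in entries
        if LUCAS_IDENTIFIER in user_agent
    )
    return {second: count for second, count in per_second.items() if count > limit}
```

An empty result means no one-second bucket exceeded the limit. Any non-empty result is worth including when writing to the abuse contact.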
Technical Details
- Reverse DNS ranges:
.lucas-hub.dnscdn.se
.lucas1.dnscdn.se → .lucas10.dnscdn.se
- Headers: Default headless browser headers only.
- Retry policy: No retries unless the URL reappears in the crawl list.
- Crawl origin: Hetzner servers, Finland.
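The reverse DNS ranges above can be used to confirm that a request claiming to be Lucas really comes from Lucas infrastructure. The sketch below uses forward-confirmed reverse DNS with Python's standard `socket` module. It assumes the listed entries are hostname suffixes and that the range covers `lucas1` through `lucas10` inclusive; the function names are illustrative, not part of any published tooling.

```python
import ipaddress
import re
import socket

# Hostname suffixes from the "Reverse DNS ranges" entry. Reading the range as
# lucas1..lucas10 inclusive is an assumption.
_LUCAS_HOST_RE = re.compile(
    r"\.(?:lucas-hub|lucas(?:[1-9]|10))\.dnscdn\.se\.?$", re.IGNORECASE
)


def hostname_in_lucas_ranges(hostname: str) -> bool:
    """Return True if the hostname ends with one of the listed Lucas suffixes."""
    return bool(_LUCAS_HOST_RE.search(hostname))


def verify_lucas_ip(ip: str) -> bool:
    """Forward-confirmed reverse DNS: IP -> hostname -> IPs must include the IP."""
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)  # reverse lookup
    except (socket.herror, socket.gaierror):
        return False
    if not hostname_in_lucas_ranges(hostname):
        return False
    try:
        infos = socket.getaddrinfo(hostname, None)  # forward lookup
    except socket.gaierror:
        return False
    target = ipaddress.ip_address(ip)
    for info in infos:
        try:
            if ipaddress.ip_address(info[4][0]) == target:
                return True
        except ValueError:
            continue  # skip addresses that do not parse (e.g. scoped IPv6)
    return False
```

The forward lookup matters because anyone who controls the reverse zone for an IP can publish an arbitrary PTR record. The check only passes when the hostname also resolves back to that IP.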
Legal & Policy
- Data retention: Up to 2 years.
- Removal requests: [email protected]
- Abuse contact: [email protected]
- Attribution: Data processed internally only.
Access Info
- Public info URL: https://noordigital.com/lucas
- Sitemap: Not relevant.
- Logo/Icon: Not provided at this time.