Configure crawl depth, speed limits, user agents, and exclusions to customize how NitroShock audits your website.
When running site audits in NitroShock, you need more than just a "start crawl" button. Configure crawl depth, speed limits, user agents, and exclusions to customize how NitroShock audits your website—ensuring you get accurate data while respecting your server resources and avoiding unnecessary credit usage on irrelevant pages.
Proper crawl configuration prevents common issues like overwhelming your server, auditing duplicate content, or wasting credits on admin pages and PDFs. Whether you're auditing a small business site with 50 pages or an enterprise e-commerce platform with thousands of products, these settings give you precise control over what gets crawled and how.
This guide covers the essential crawl settings available in NitroShock's Site Audit feature and how to configure them for different scenarios.
Crawl depth determines how far NitroShock will follow links from your starting URL. Think of it as the number of "clicks away" from your entry point the crawler will go.
A depth of 0 means only the starting URL gets audited—no links are followed. This is useful when you want to audit a single specific page without burning credits on the entire site.
A depth of 1 audits your starting page plus all pages directly linked from it. For example, if you start at your homepage, depth 1 would include your homepage and all pages in your main navigation.
A depth of 2 goes one level deeper—auditing your starting page, all directly linked pages, and all pages linked from those pages. This typically covers most small to medium websites completely.
Higher depths (3-5) are necessary for larger sites with deep hierarchies, but they exponentially increase the number of pages crawled. A site with 50 links per page could theoretically crawl 50 pages at depth 1, but 2,500 pages at depth 2.
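To make the "clicks away" idea concrete, here is a minimal sketch of a depth-limited, breadth-first crawl. It is not NitroShock's implementation; the start URL is a placeholder and the link extraction is deliberately simplified, but it shows why each extra level of depth multiplies the page count.

```python
from collections import deque
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
import re

def crawl(start_url, max_depth):
    """Breadth-first crawl that stops following links beyond max_depth."""
    seen = {start_url}
    queue = deque([(start_url, 0)])  # (url, clicks away from the start URL)
    while queue:
        url, depth = queue.popleft()
        print(f"auditing (depth {depth}): {url}")
        if depth == max_depth:
            continue  # depth limit reached: audit this page, but don't follow its links
        try:
            html = urlopen(url, timeout=30).read().decode("utf-8", "ignore")
        except OSError:
            continue
        for href in re.findall(r'href="([^"#]+)"', html):
            link = urljoin(url, href)
            # stay on the same site and skip pages already queued
            if urlparse(link).netloc == urlparse(start_url).netloc and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))

# depth 0 audits only the start URL; depth 2 covers most small to medium sites
crawl("https://example.com/", max_depth=2)
```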
For most projects, start with depth 2 as your baseline. This captures the majority of important pages on standard websites without crawling your entire site map every time.
Use depth 0 or 1 when:
Use depth 3 or higher when:
To configure crawl depth:
Remember that each page audited uses credits. Starting with a conservative depth helps you avoid unexpected credit usage on your first crawl.
Speed limits control how aggressively NitroShock crawls your website. While faster crawls complete quickly, they can strain your server resources and trigger security protections.
The requests per second setting determines how many page requests NitroShock sends to your server each second. The available range typically runs from 1 request per second (very slow and polite) to 10+ requests per second (fast but potentially aggressive).
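The effect of this setting is easiest to picture as a pause between fetches: a cap of 2 requests per second means waiting at least half a second before the next request. Here is a minimal sketch of that idea (the URLs are placeholders, and this is the general pattern rather than NitroShock's actual scheduler):

```python
import time
import urllib.error
import urllib.request

REQUESTS_PER_SECOND = 2                  # a conservative, "polite" setting
MIN_INTERVAL = 1 / REQUESTS_PER_SECOND   # minimum seconds between requests

urls = ["https://example.com/", "https://example.com/about", "https://example.com/contact"]

last_request = 0.0
for url in urls:
    # wait until at least MIN_INTERVAL has passed since the previous request
    wait = MIN_INTERVAL - (time.monotonic() - last_request)
    if wait > 0:
        time.sleep(wait)
    last_request = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            print(url, response.status)
    except urllib.error.URLError as err:
        print(url, "failed:", err)
```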
At 1-2 requests per second, the crawler acts extremely conservatively. This is appropriate for:
At 3-5 requests per second, you achieve moderate speed without overwhelming most servers. This works well for:
At 6-10 requests per second, crawls complete quickly but demand significant server resources. Consider this for:
Request timeout determines how long NitroShock waits for a page to respond before giving up. The standard timeout is 30 seconds, but you can adjust this based on your site's performance.
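A timeout is simply how long the crawler is willing to wait before marking a page as failed and moving on. A small sketch of that behavior, using the standard 30-second wait and a placeholder URL:

```python
import socket
import urllib.error
import urllib.request

def fetch(url, timeout=30):
    """Return page HTML, or None if the server doesn't answer in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", "ignore")
    except socket.timeout:
        print(f"timed out after {timeout}s: {url}")
    except urllib.error.URLError as err:
        print(f"failed: {url} ({err.reason})")
    return None

html = fetch("https://example.com/", timeout=30)
```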
Increase timeout (45-60 seconds) for:
Decrease timeout (15-20 seconds) for:
Start conservatively with your first audit of any site. Use 2-3 requests per second and monitor how your server handles the load. Check your server logs or monitoring tools during the crawl.
If your server handles the initial crawl easily (CPU and memory stay in normal ranges), you can increase speed for subsequent audits. If you notice performance degradation or receive alerts, reduce the crawl speed.
For WordPress sites with caching plugins like WP Rocket or W3 Total Cache, you can typically use moderate to fast speeds (4-6 requests per second) since cached pages serve quickly without database queries.
If NitroShock encounters multiple timeouts or errors during a crawl, it automatically reduces speed to prevent overwhelming your server.
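The pattern behind that safeguard is basic backoff: when errors or timeouts pile up, widen the interval between requests. The sketch below illustrates the general idea only; it is not NitroShock's exact algorithm, and the failures here are simulated at random:

```python
import random
import time

delay = 0.5            # seconds between requests (about 2 requests per second)
errors_in_a_row = 0

for page in range(10):
    ok = random.random() > 0.2          # stand-in for "did this fetch succeed?"
    if ok:
        errors_in_a_row = 0
    else:
        errors_in_a_row += 1
        if errors_in_a_row >= 3:
            delay = min(delay * 2, 10)  # widen the gap, capped at 10 seconds
            errors_in_a_row = 0
            print(f"repeated errors - slowing to one request every {delay}s")
    time.sleep(delay)
```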
The user agent string identifies NitroShock's crawler to your web server. Proper user agent configuration ensures you get accurate audit results that reflect real user experiences.
NitroShock uses a clearly identified crawler user agent by default: NitroShock-Bot/1.0. This identifies the crawler in your server logs and allows you to:
You can configure NitroShock to crawl using either desktop or mobile user agents, which is crucial since many sites serve different content or styling based on device type.
Desktop user agent crawling:
Mobile user agent crawling:
For most modern websites, run mobile user agent audits as your primary monitoring tool. Google predominantly uses mobile crawlers for indexing, so your mobile experience matters most for SEO.
Run periodic desktop audits to ensure desktop users still get a quality experience, especially if your analytics show significant desktop traffic.
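A quick way to see whether your site really serves different content by device is to request the same page with a desktop and a mobile User-Agent header and compare the responses. A minimal sketch with a placeholder URL; the user agent strings are ordinary browser strings, not NitroShock's:

```python
import urllib.request

USER_AGENTS = {
    "desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "mobile": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
}

url = "https://example.com/"
for device, ua in USER_AGENTS.items():
    request = urllib.request.Request(url, headers={"User-Agent": ua})
    with urllib.request.urlopen(request, timeout=30) as response:
        html = response.read().decode("utf-8", "ignore")
    # a large size difference often means device-specific markup is being served
    print(f"{device}: {len(html)} bytes of HTML")
```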
For specialized scenarios:
Advanced users can set custom user agent strings to:
To set a custom user agent:
Exclusion rules prevent NitroShock from crawling and auditing pages that don't need analysis, saving you credits and focusing results on pages that matter.
URL pattern exclusions use pattern matching to skip entire categories of pages. Common patterns to exclude include the following (a short sketch of how the matching works appears after these lists):
Admin and system pages:
/wp-admin/*
/wp-login.php
/admin/*
/dashboard/*
User account pages:
/account/*
/login/*
/register/*
/checkout/*
Non-HTML resources:
*.pdf
*.jpg
*.png
*.zip
*.doc
Pagination and filters:
?page=
?sort=
/page/[2-9]/
/page/[0-9][0-9]/
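To see how wildcard patterns like these behave, the sketch below applies a handful of them to sample URLs using Python's fnmatch-style matching. NitroShock's exact matching rules may differ, so treat this as an approximation for planning your exclusion list:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

EXCLUDE_PATTERNS = [
    "/wp-admin/*", "/wp-login.php", "/account/*", "/checkout/*",
    "*.pdf", "*.jpg", "*.png", "*.zip",
]

def is_excluded(url):
    """Return True if the URL's path matches any exclusion pattern."""
    path = urlparse(url).path
    return any(fnmatch(path, pattern) for pattern in EXCLUDE_PATTERNS)

print(is_excluded("https://example.com/wp-admin/options.php"))   # True
print(is_excluded("https://example.com/brochure.pdf"))           # True
print(is_excluded("https://example.com/blog/crawl-settings/"))   # False
```

Query-string patterns such as ?page= and ?sort= are handled separately from path patterns; the parameter exclusions described below cover those cases.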
Query parameters often create duplicate content variations that waste crawl budget. Common parameters to exclude:
utm_source, utm_medium, utm_campaign - Tracking parameters
fbclid, gclid - Platform click IDs
sessionid, PHPSESSID - Session identifiers
sort, filter, color, size - E-commerce filters
page, offset, limit - Pagination parameters

Configure parameter exclusions to treat URLs as identical regardless of these parameters. For example, excluding utm_source means:
example.com/product and example.com/product?utm_source=facebook are treated as the same page, and only one gets crawled.
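Under the hood, parameter exclusion amounts to URL normalization: strip the excluded parameters before deciding whether two URLs are the same page. A minimal sketch of that idea, using placeholder URLs:

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

EXCLUDED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid", "sessionid"}

def normalize(url):
    """Drop excluded query parameters so tracking variants collapse to one URL."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in EXCLUDED_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

a = normalize("https://example.com/product")
b = normalize("https://example.com/product?utm_source=facebook")
print(a == b)   # True - both crawl as the same page
```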
NitroShock can either respect or ignore your robots.txt file during crawls.
Respecting robots.txt (default):
Ignoring robots.txt:
If you're unsure why certain pages aren't being crawled, try running one audit with robots.txt ignored to see if it's blocking the crawler.
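If you suspect robots.txt is the cause, you can also test specific URLs against it yourself before re-running the audit. A small sketch using Python's standard robots.txt parser; it assumes NitroShock-Bot is the user agent token your rules target:

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://example.com/robots.txt")
parser.read()

for url in ["https://example.com/", "https://example.com/wp-admin/options.php"]:
    allowed = parser.can_fetch("NitroShock-Bot", url)
    print("allowed" if allowed else "blocked", url)
```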
To configure exclusions:
Exclusion rules persist across audits, so you only need to configure them once per project unless your site structure changes.
Small business sites (10-100 pages):
E-commerce sites:
News or blog sites:
Enterprise or large sites:
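One way to keep these per-scenario choices consistent across projects is to store them as presets. The setting names and values below are illustrative only, loosely based on the general guidance earlier in this guide (depth 2 as a baseline, 2-3 requests per second to start, a 30-second timeout); they are not NitroShock's field names or official recommendations:

```python
# Hypothetical presets; adjust to your own server capacity and credit budget.
CRAWL_PRESETS = {
    "small_business": {
        "max_depth": 2,
        "requests_per_second": 2,
        "timeout_seconds": 30,
        "exclude": ["/wp-admin/*", "*.pdf"],
    },
    "ecommerce": {
        "max_depth": 3,
        "requests_per_second": 3,
        "timeout_seconds": 30,
        "exclude": ["/checkout/*", "/account/*", "?sort=", "?filter="],
    },
    "enterprise": {
        "max_depth": 4,
        "requests_per_second": 5,
        "timeout_seconds": 30,
        "exclude": ["/admin/*", "/dashboard/*", "*.pdf", "?sessionid="],
    },
}

print(CRAWL_PRESETS["small_business"])
```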
Beyond basic crawl configuration, NitroShock offers advanced settings for specialized audit scenarios and precise control over crawler behavior.
Modern websites often rely heavily on JavaScript to generate content. The JavaScript rendering setting determines whether NitroShock executes JavaScript before analyzing pages.
JavaScript rendering disabled (faster, uses fewer credits):
JavaScript rendering enabled (slower, uses more credits):
Enable JavaScript rendering if:
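If you are unsure whether a page depends on JavaScript for its content, compare what a plain HTTP fetch returns with what a headless browser sees after scripts run. A rough sketch, assuming the playwright package and its browsers are installed; it is independent of NitroShock's own renderer:

```python
import urllib.request
from playwright.sync_api import sync_playwright

url = "https://example.com/"

# 1. Raw HTML, as a non-rendering crawler would see it
raw_html = urllib.request.urlopen(url, timeout=30).read().decode("utf-8", "ignore")

# 2. The DOM after JavaScript has executed
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(url)
    rendered_html = page.content()
    browser.close()

# A large gap suggests the site builds significant content client-side
print(f"raw: {len(raw_html)} bytes, rendered: {len(rendered_html)} bytes")
```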
Some sites display different content based on cookie consent or user preferences. Configure how NitroShock handles cookies:
Accept all cookies:
Reject all cookies:
Custom cookie values:
Add custom HTTP headers to crawl requests for specialized scenarios:
To add custom headers:
By default, NitroShock only crawls internal links within your domain. The follow external links setting changes this behavior:
Disabled (default):
Enabled:
Rather than discovering pages through link crawling, you can direct NitroShock to use your XML sitemap:
Sitemap-based crawling:
Combined approach:
Configure sitemap settings:
Enter your sitemap URL (for example, /sitemap.xml).

Set up automated crawls to monitor your site continuously without manual intervention:
One-time crawl:
Scheduled crawls:
Configure scheduled audits:
Scheduled crawls automatically use the same configuration settings each time. Update your crawl configuration to affect future scheduled crawls.
How many pages will my crawl audit?
The exact number depends on your crawl depth, exclusion rules, and site structure. Before confirming any audit, NitroShock estimates the page count based on your settings and shows the expected credit cost. Start with conservative settings (depth 2, moderate exclusions) for your first crawl, then adjust based on results.
Why isn't NitroShock crawling some of my pages?
Check these common causes: your robots.txt file may be blocking the crawler, your exclusion rules might be too broad, pages may not be linked from other pages within your crawl depth, or your server might be returning errors or timeouts for those pages. Run an audit with robots.txt ignored and minimal exclusions to diagnose the issue.
Can I crawl a staging or password-protected site?
Yes, use custom headers to include authentication credentials. Navigate to Site Audit → Configure Crawl → Advanced Settings and add an Authorization header with your credentials. For HTTP basic authentication, use the format Basic [base64-encoded-credentials]. Alternatively, if your staging site uses IP whitelisting, contact NitroShock support to whitelist the crawler IP addresses.
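For the HTTP basic authentication case, the header value is just username:password encoded in Base64. A quick sketch for generating it, with hypothetical credentials:

```python
import base64

username = "staging-user"   # hypothetical credentials
password = "staging-pass"

token = base64.b64encode(f"{username}:{password}".encode()).decode()
print(f"Authorization: Basic {token}")
# Paste the resulting header name and value into the custom headers field.
```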
Should I run mobile or desktop audits?
Run mobile audits as your primary monitoring tool, since Google predominantly uses mobile-first indexing. Supplement them with periodic desktop audits if your analytics show significant desktop traffic, so desktop users still get a quality experience.