Adding Websites

Website URLs let you index live web pages, making their content available for AI retrieval. This is perfect for existing documentation, knowledge bases, or product pages.

How Website Indexing Works

When you add a URL, Hyperleap:

Fetches the page — Downloads the HTML content
Extracts text — Removes navigation, ads, and scripts
Processes content — Chunks and indexes the text
Stores for retrieval — Makes content searchable by AI
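
The extract and chunk steps above can be illustrated with a minimal sketch. It is not Hyperleap's actual pipeline, which this article does not describe. The skipped tag list, the word-based chunking, and the names `TextExtractor`, `extract_text`, and `chunk_words` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags whose contents this sketch treats as non-content (an assumption,
# not Hyperleap's documented rule set).
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class TextExtractor(HTMLParser):
    """Collects visible text and ignores anything nested inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.parts.append(text)


def extract_text(html):
    """Return the page's visible text as one whitespace-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_words(text, max_words=200):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
```

Running `extract_text` on a page and passing the result to `chunk_words` gives a rough idea of which text is likely to survive extraction. This is a quick way to see why text-heavy pages index better than pages that are mostly navigation or scripts.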

Adding Website URLs

Open Your Source

Navigate to the Source where you want to add websites.

Go to Website URLs Tab

Click the "Website URLs" tab.

Add URL

Click "Add URL" and enter the full URL (including https://).

Configure Options

Choose whether to index just this page or crawl linked pages.

Save and Index

Click "Add" to start indexing the page(s).
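
If you are preparing a list of URLs before adding them, a small check like the one below can catch entries that are missing the scheme. This is a sketch that runs on your side, not part of Hyperleap. The function name `is_full_url` is hypothetical, and whether plain `http://` URLs are accepted is an assumption.

```python
from urllib.parse import urlparse


def is_full_url(url):
    """Return True if url has an http(s) scheme and a host, e.g. https://example.com/faq."""
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError for some malformed hosts (e.g. bad IPv6 brackets).
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

For example, `is_full_url("example.com/faq")` is false because the scheme is missing, while `is_full_url("https://example.com/faq")` is true.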

Tip:

Add specific page URLs rather than just the homepage. For example, add your FAQ page URL directly rather than relying on crawling.

Indexing Options

Single Page

Index just the URL you provide. Best for:

Landing pages: specific marketing pages
Blog posts: individual articles
FAQ pages: frequently asked questions
Product pages: individual product info

Crawl Mode

Index the page and follow links to related pages. Best for:

Documentation sites with multiple pages
Knowledge bases
Help centers

Note:

Crawling can index many pages quickly. Set a reasonable page limit to avoid indexing irrelevant content.
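
To see why a page limit matters, here is a minimal sketch of a breadth-first crawl that stops after a fixed number of pages. It is an illustration under stated assumptions, not Hyperleap's crawler: the same-host rule, the default limit of 25, and the `get_links` callback (a hypothetical function that returns the links found on a page) are all invented for this example.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(start_url, get_links, max_pages=25):
    """Visit pages breadth-first from start_url, staying on one host, up to max_pages."""
    start_host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links and drop "#fragment" so anchors don't count as new pages.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc != start_host:
                continue  # skip links that leave the starting site
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)

    return visited
```

Because the crawl is breadth-first, a low limit keeps the pages closest to the starting URL and drops deeper ones. This is one reason to start a crawl from the section you care about rather than from the homepage.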

URL Status

After adding URLs, you'll see their status:

Pending: queued for indexing
Indexing: currently being processed
Indexed: content is searchable
Failed: error during indexing

Best Practices

URL Selection

Use canonical URLs — The main version of each page
Avoid dynamic URLs — Parameters can cause duplicate content
Prefer HTTPS — Secure pages are more likely to be accessible
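
One way to apply the canonical and dynamic URL advice before adding pages is to normalize your list and drop duplicates. The sketch below runs on your side. The list of tracking parameters and the name `normalize_url` are assumptions for illustration, so adjust them to the parameters your site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters assumed to be tracking noise that does not change page content.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "fbclid",
}


def normalize_url(url):
    """Lowercase scheme and host, drop tracking parameters and fragments, sort the rest."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    query.sort()
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), ""))


def unique_urls(urls):
    """Return the distinct normalized URLs in sorted order."""
    return sorted({normalize_url(url) for url in urls})
```

Two links that differ only in a campaign tag or an anchor collapse to one entry, so the same page is not added twice.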

Content Quality

Text-heavy pages work best — Images and videos aren't indexed
Well-structured content — Headings help organize chunks
Public pages only — Login-protected content can't be indexed

Keeping Content Fresh

Re-index when content changes — Click "Refresh" on updated URLs
Remove outdated URLs — Delete pages that no longer exist
Schedule re-indexing — For frequently updated content
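
If you maintain the pages yourself, one way to decide when a refresh is worthwhile is to compare a fingerprint of each page's text against the one recorded at the last index. This sketch is a local helper, not a Hyperleap feature. The names `content_fingerprint` and `urls_needing_refresh` are hypothetical, and it assumes you already have each page's text available.

```python
import hashlib


def content_fingerprint(text):
    """Hash the text after collapsing whitespace, so layout-only changes are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def urls_needing_refresh(previous, current_pages):
    """Return URLs whose current text differs from the stored fingerprint, or that are new.

    previous: dict mapping URL -> fingerprint recorded at the last index
    current_pages: dict mapping URL -> current page text
    """
    return sorted(
        url
        for url, text in current_pages.items()
        if previous.get(url) != content_fingerprint(text)
    )
```

The URLs it returns are the ones to refresh in the Website URLs tab. Pages whose text is unchanged can be left alone.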

Troubleshooting

Page Won't Index

Check that:

The URL is publicly accessible (not behind a login)
The page allows crawling (check robots.txt)
The URL is correctly formatted
The site isn't blocking automated access
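
The robots.txt check can be done locally with Python's standard `urllib.robotparser`. The sketch below takes the robots.txt text as a string, so it works without network access. The helper name `allowed_by_robots` is hypothetical, and the default user agent `*` is an assumption, since this article does not state which user agent Hyperleap's fetcher sends.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Paste the contents of your site's `/robots.txt` into `robots_txt` and test the URL that failed. A `False` result suggests the page is disallowed for that user agent, though a site can also block automated access in ways robots.txt does not show.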

Content Not Retrieved

If the AI doesn't reference your web content:

Verify the URL is in "Indexed" status
Check that the Source is connected to your chatbot/tool
Try asking questions using exact phrases from the page

Managing Website URLs

In the Website URLs tab, you can:

View: see the indexed content from each URL
Refresh: re-index to get the latest content
Delete: remove a URL from the Source

Next Steps

Learn about Workspaces to collaborate with your team on AI projects.