Adding Websites

Index web pages to make their content searchable by your AI chatbots, tools, and assistants.

Adding a website URL indexes the live page so that its content is available for AI retrieval. This works well for existing documentation, knowledge bases, and product pages.

How Website Indexing Works

When you add a URL, Hyperleap:

  1. Fetches the page — Downloads the HTML content
  2. Extracts text — Removes navigation, ads, and scripts
  3. Processes content — Chunks and indexes the text
  4. Stores for retrieval — Makes content searchable by AI
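To make the four steps concrete, here is a minimal sketch of the extract and chunk stages using only the Python standard library. It is an illustration, not Hyperleap's implementation: the list of skipped tags, the word-based chunking, and the names `extract_text` and `chunk_words` are all assumptions. The fetch step is left out so the example runs without network access.

```python
from html.parser import HTMLParser

# Tags whose contents are discarded. This list is an assumption for the sketch.
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collects visible text while ignoring anything inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.parts.append(text)


def extract_text(html: str) -> str:
    """Step 2: reduce an HTML document to its readable text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_words(text: str, max_words: int = 50) -> list[str]:
    """Step 3: split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


page = (
    "<html><body><nav>Home</nav><h1>FAQ</h1>"
    "<script>var x = 1;</script>"
    "<p>Returns are accepted within 30 days.</p></body></html>"
)
text = extract_text(page)
chunks = chunk_words(text, max_words=4)
```

In this sketch the navigation link and the script body are dropped, and the remaining text is split into small chunks that a search index could store.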

Adding Website URLs

1. Open Your Source

Navigate to the Source where you want to add websites.

2. Go to the Website URLs Tab

Click the "Website URLs" tab.

3. Add URL

Click "Add URL" and enter the full URL (including https://).

4. Configure Options

Choose whether to index just this page or crawl linked pages.

5. Save and Index

Click "Add" to start indexing the page(s).

Tip:
Add specific page URLs rather than just the homepage. For example, add your FAQ page URL directly rather than relying on crawling.
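If you prepare a list of URLs before adding them, a quick check that each one is a full URL (scheme and host included) can catch entries such as `example.com/faq`. The helper below is a hypothetical pre-check written for this guide, not part of Hyperleap; it only looks at the URL's shape and does not contact the site.

```python
from urllib.parse import urlsplit


def is_full_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some malformed inputs, such as a broken IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


candidates = [
    "https://example.com/faq",
    "example.com/faq",       # missing https://
    "ftp://example.com/faq",  # not a web page URL
]
valid = [url for url in candidates if is_full_url(url)]
```

Only the first candidate passes, because the other two lack an `http` or `https` scheme.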

Indexing Options

Single Page

Index just the URL you provide. Best for:

  • Landing pages: specific marketing pages
  • Blog posts: individual articles
  • FAQ pages: frequently asked questions
  • Product pages: individual product info

Crawl Mode

Index the page and follow links to related pages. Best for:

  • Documentation sites with multiple pages
  • Knowledge bases
  • Help centers

Note:
Crawling can index many pages quickly. Set a reasonable page limit to avoid indexing irrelevant content.
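The effect of a page limit can be sketched with a small breadth-first crawl. This is an assumption about how a limited crawl typically behaves, not a description of Hyperleap's crawler: the function `crawl`, the `get_links` callback, and the in-memory `site` map are all hypothetical, and no network requests are made.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(start_url: str,
          get_links: Callable[[str], Iterable[str]],
          page_limit: int) -> list[str]:
    """Visit pages breadth-first from start_url, stopping at page_limit pages."""
    seen = {start_url}
    queue = deque([start_url])
    indexed = []
    while queue and len(indexed) < page_limit:
        url = queue.popleft()
        indexed.append(url)
        for link in get_links(url):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return indexed


# A tiny pretend site: each page maps to the links found on it.
site = {
    "https://docs.example.com/": [
        "https://docs.example.com/setup",
        "https://docs.example.com/faq",
    ],
    "https://docs.example.com/setup": [
        "https://docs.example.com/advanced",
        "https://docs.example.com/",
    ],
    "https://docs.example.com/faq": [],
    "https://docs.example.com/advanced": [],
}
pages = crawl("https://docs.example.com/", lambda u: site.get(u, []), page_limit=3)
```

With a limit of 3, the crawl covers the start page and the two pages it links to directly, and stops before reaching the deeper page. Pages closest to the starting URL are indexed first, which is why choosing a focused start URL matters.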

URL Status

After adding URLs, you'll see their status:

  • Pending: queued for indexing
  • Indexing: currently being processed
  • Indexed: content is searchable
  • Failed: error during indexing
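If you track URLs in your own notes or scripts, the four statuses map naturally onto an enumeration. The sketch below is purely illustrative: `UrlStatus`, `is_searchable`, and `count_by_status` are hypothetical names, and nothing here reflects a documented Hyperleap API.

```python
from collections import Counter
from enum import Enum


class UrlStatus(Enum):
    PENDING = "Pending"    # queued for indexing
    INDEXING = "Indexing"  # currently being processed
    INDEXED = "Indexed"    # content is searchable
    FAILED = "Failed"      # error during indexing


def is_searchable(status: UrlStatus) -> bool:
    """Only indexed URLs contribute content to AI retrieval."""
    return status is UrlStatus.INDEXED


def count_by_status(statuses: list[UrlStatus]) -> dict[str, int]:
    """Summarize how many URLs are in each status."""
    counts = Counter(statuses)
    return {status.value: counts.get(status, 0) for status in UrlStatus}


summary = count_by_status([UrlStatus.INDEXED, UrlStatus.INDEXED, UrlStatus.FAILED])
```

A summary like this makes it easy to spot URLs stuck in "Failed" that need a second look.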

Best Practices

URL Selection

  • Use canonical URLs — The main version of each page
  • Avoid dynamic URLs — Parameters can cause duplicate content
  • Prefer HTTPS — Secure pages are more likely to be accessible
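To see why URL parameters can cause duplicates, consider two links that differ only in tracking parameters but point to the same page. The sketch below normalizes such URLs before you add them. The parameter names in `TRACKING_PARAMS` and the `utm_` prefix are assumptions chosen for illustration; adjust them to match the parameters your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters assumed not to change page content (illustrative list).
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid", "sessionid"}


def normalize_url(url: str) -> str:
    """Drop tracking parameters and the fragment, and sort what remains."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_PARAMS
    ]
    query.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


urls = [
    "https://example.com/faq?utm_source=newsletter&page=2#top",
    "https://example.com/faq?page=2&gclid=abc123",
]
unique = sorted({normalize_url(url) for url in urls})
```

Both inputs reduce to `https://example.com/faq?page=2`, so only one URL needs to be added.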

Content Quality

  • Text-heavy pages work best — Images and videos aren't indexed
  • Well-structured content — Headings help organize chunks
  • Public pages only — Login-protected content can't be indexed

Keeping Content Fresh

  • Re-index when content changes — Click "Refresh" on updated URLs
  • Remove outdated URLs — Delete pages that no longer exist
  • Schedule re-indexing — For frequently updated content

Troubleshooting

Page Won't Index

Check that:

  • The URL is publicly accessible (not behind a login)
  • The page allows crawling (check robots.txt)
  • The URL is correctly formatted
  • The site isn't blocking automated access
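The robots.txt check in the list above can be done with Python's standard `urllib.robotparser`. This sketch parses robots.txt text you have already copied from your site, so it runs offline. The user agent Hyperleap's crawler identifies itself with is not stated here, so the example assumes the generic `*` rules apply.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: everything is crawlable except the /private/ section.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

faq_allowed = allowed_by_robots(robots_txt, "https://example.com/faq")
private_allowed = allowed_by_robots(robots_txt, "https://example.com/private/page")
```

Here the FAQ page is allowed and the page under `/private/` is not. If a page you need is disallowed, the site's robots.txt has to be updated before the page can be crawled.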

Content Not Retrieved

If the AI doesn't reference your web content:

  • Verify the URL is in "Indexed" status
  • Check that the Source is connected to your chatbot/tool
  • Try asking questions using exact phrases from the page

Managing Website URLs

In the Website URLs tab, you can:

  • View: see the indexed content from each URL
  • Refresh: re-index to get the latest content
  • Delete: remove a URL from the Source

Next Steps

Learn about Workspaces to collaborate with your team on AI projects.