This article takes a closer look at training your bot with Web URLs and the points to keep in mind while doing so.
Updated: February 17, 2025
- Steps to training your bot using Web URLs
- URL Crawling Modes
- Uploaded Links Table
Steps to training your bot using Web URLs
- In your account, navigate to Settings > Conversation AI
- Select "Bot Training" from the top menu
- Enter the full URL (including https://) and choose one of the three web crawling modes (explained below)
- Wait for the URLs to be fetched and crawled
- Select the URLs and hit "Train Bot"
- Each URL is trained and added to the Uploaded Links table with its status. (Wait for all URLs to finish training before using the bot)
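Because the URL must be entered in full, it can help to check a list of links before pasting them in. The snippet below is a minimal sketch, not part of the product: the helper name `is_full_url` is hypothetical, and accepting both http:// and https:// is an assumption (this article only mentions https://).

```python
from urllib.parse import urlparse


def is_full_url(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url.strip())
    # urlparse lowercases the scheme, so "Https://" is treated like "https://".
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_full_url("https://example.com/pricing"))  # True
print(is_full_url("example.com/pricing"))          # False (scheme is missing)
```

A link that fails this check most likely needs https:// added in front before it is entered.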
URL Crawling Modes
1. Exact URL
This is the recommended option for precise training. With the Exact URL method, the bot crawls the exact URL provided and trains on it.
Steps:
- Choose the option "Exact URL."
- Enter the URL you want to crawl and hit "Get Data."
- The URL is crawled, the bot is trained on it, and the URL is added to the Uploaded Links table
2. All URLs in this Domain
Train your bot on a broader range of information from a specific domain. The bot crawls all the pages and links on the specified domain and lets you select which URLs to train on.
Steps:
- Choose the option "All URLs in this domain."
- Enter the URL and hit "Get Data."
- Wait for the pages to load; you'll then be presented with a list of available URLs.
- Choose the pages that are relevant to training the bot and hit "Train Bot."
During page selection (Step 4 above), you'll encounter two lists:
- New Pages - URLs that are not yet part of the bot's training data. Selecting them adds them to the "Uploaded Links" table once training is completed.
- Existing Pages - URLs that are already part of the bot's training data and visible in the "Uploaded Links" table below. Selecting them refreshes the selected URLs, retraining the bot on their latest content.
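The split between the two lists can be pictured as a simple set comparison. The sketch below only illustrates the idea and does not show how the product implements it: the function name `split_pages` is hypothetical, and it assumes URLs are compared as exact strings (the product may normalize them differently, for example around trailing slashes).

```python
def split_pages(crawled_urls, trained_urls):
    """Split crawled URLs into (new_pages, existing_pages), keeping crawl order."""
    trained = set(trained_urls)
    # New Pages: crawled URLs the bot has not been trained on yet.
    new_pages = [url for url in crawled_urls if url not in trained]
    # Existing Pages: crawled URLs already in the Uploaded Links table.
    existing_pages = [url for url in crawled_urls if url in trained]
    return new_pages, existing_pages


crawled = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/faq",
]
already_trained = ["https://example.com/pricing"]

new_pages, existing_pages = split_pages(crawled, already_trained)
print(new_pages)       # ['https://example.com/', 'https://example.com/faq']
print(existing_pages)  # ['https://example.com/pricing']
```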
Uploaded Links Table
All the links/URLs that the bot has been trained on are visible in the Uploaded Links table.
Trained URLs can be refreshed (the bot is retrained on the latest information) or deleted (the information is removed from the bot's knowledge base).
Each URL will have one of these 3 statuses:
- Getting Data - The bot is training again on this URL, i.e., the URL's information is being refreshed
- Trained - The bot successfully learned from this URL. The "Last data refreshed at" timestamp is also shown, which helps identify whether a data refresh is required
- Failed - The bot failed to train on this URL. You can either refresh it to try again or delete the URL
Tip: Regularly review and remove old URLs from the Uploaded Links table for better responses.
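If you keep your own record of the table (for example, in a spreadsheet), the review can be described as a small routine. The sketch below is hypothetical throughout: the `UploadedLink` record, the `links_needing_attention` helper, and the 30-day threshold are assumptions made for illustration and are not part of the product; only the three status names come from this article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class LinkStatus(Enum):
    GETTING_DATA = "Getting Data"
    TRAINED = "Trained"
    FAILED = "Failed"


@dataclass
class UploadedLink:
    url: str
    status: LinkStatus
    last_refreshed_at: Optional[datetime] = None  # "Last data refreshed at"


def links_needing_attention(links, now, max_age_days=30):
    """Return (failed, stale): links to retry or delete, and links due a refresh."""
    cutoff = now - timedelta(days=max_age_days)
    failed = [link for link in links if link.status is LinkStatus.FAILED]
    stale = [
        link for link in links
        if link.status is LinkStatus.TRAINED
        and link.last_refreshed_at is not None
        and link.last_refreshed_at < cutoff
    ]
    return failed, stale


links = [
    UploadedLink("https://example.com/faq", LinkStatus.TRAINED, datetime(2025, 1, 1)),
    UploadedLink("https://example.com/pricing", LinkStatus.TRAINED, datetime(2025, 2, 10)),
    UploadedLink("https://example.com/old", LinkStatus.FAILED),
]
failed, stale = links_needing_attention(links, now=datetime(2025, 2, 17))
print([link.url for link in failed])  # ['https://example.com/old']
print([link.url for link in stale])   # ['https://example.com/faq']
```

Links still in "Getting Data" are skipped on purpose, since their refresh has not finished yet.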