Apify Integration: Allow agents and admins to crawl web pages
planned
Lilith Jalalyan
Updated User Story
As a user who wants to scrape data from various websites, I want to browse the marketplace and select a pre-built scraping actor that suits my specific scraping needs. Once I find the right actor, I want to configure it according to my requirements, including providing target URLs, specifying input data (if needed), and setting the scraping depth and number of pages. The scraped data should be learned by the AI.
Acceptance Criteria:
- The user should be able to access the "Pre-Built Scraping Actors Marketplace" from the Scraper section.
- In the marketplace, the user can browse through the available actors and view their descriptions, features, and sample output.
- Each actor in the marketplace should have relevant tags or categories to aid in easy search and identification.
- The user can select a specific actor by clicking on it, which takes them to the configuration page.
- On the configuration page, the user can input the target URL they wish to scrape.
- The user can set up the desired scraping depth and the number of pages they want to scrape.
- Once the configuration is complete, the user can initiate the scraping task by clicking the "Run" button.
- The selected actor will start scraping the specified webpages based on the provided configuration.
- After the scraping task is completed, the URL of each scraped webpage will appear in the "Crawled Content" section under the corresponding actor.
- The scraped data will be automatically learned by the AI Assistant to be able to provide answers from that knowledge.
- The user will not be able to edit the scraped data, but they can delete specific information by clicking the delete button.
- The AI system will automatically forget any deleted information to ensure data privacy and compliance.
- If the user attempts to crawl a webpage URL that has already been crawled, the system will check for existing information and replace it with the new data.
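The last four criteria (learn, no edit, delete-and-forget, replace on re-crawl) can be sketched as a small store keyed by URL. This is only an illustration, not the planned implementation: `CrawledContentStore` and its method names are hypothetical, and keying by URL alone (rather than by actor and URL) is an assumption about how "the same webpage URL" should be interpreted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CrawledContentStore:
    """Hypothetical in-memory stand-in for the knowledge the AI Assistant learns."""

    # Maps url -> (actor_id, scraped_text). There is deliberately no edit method.
    _pages: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def learn(self, actor_id: str, url: str, text: str) -> bool:
        """Store scraped text; return True if an existing entry was replaced."""
        replaced = url in self._pages
        self._pages[url] = (actor_id, text)
        return replaced

    def forget(self, url: str) -> bool:
        """Delete an entry so it can no longer be used; return True if it existed."""
        return self._pages.pop(url, None) is not None

    def crawled_urls(self, actor_id: str) -> List[str]:
        """URLs to list in the "Crawled Content" section under one actor."""
        return [url for url, (owner, _) in self._pages.items() if owner == actor_id]

    def lookup(self, url: str) -> Optional[str]:
        """Return the learned text for a URL, or None if it was never learned or was deleted."""
        entry = self._pages.get(url)
        return entry[1] if entry else None
```

Re-crawling a URL calls `learn` again and overwrites the old text, while `forget` removes the entry entirely so that later lookups return nothing.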
Pouria Tajdivand
under review
API is completed
Pouria Tajdivand
in progress
This post was marked as closed
Pouria Tajdivand
Datasources that we want from Apify:
Website Content Crawler
Instagram Profile Scraper
Facebook Pages Scraper
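
As a rough sketch of how the configuration page could map to an actor run, the snippet below builds the run input for the Website Content Crawler from the target URL, depth, and page count. The actor IDs and the input field names (`startUrls`, `maxCrawlDepth`, `maxCrawlPages`) are assumptions based on the public Apify Store listings and should be checked against each actor's input schema. `build_crawler_input` is a hypothetical helper, and the Instagram and Facebook actors take different inputs that are not covered here.

```python
from typing import Any, Dict

# Assumed Apify Store IDs for the three requested data sources (verify before use).
ACTOR_IDS: Dict[str, str] = {
    "Website Content Crawler": "apify/website-content-crawler",
    "Instagram Profile Scraper": "apify/instagram-profile-scraper",
    "Facebook Pages Scraper": "apify/facebook-pages-scraper",
}


def build_crawler_input(target_url: str, max_depth: int, max_pages: int) -> Dict[str, Any]:
    """Validate the configuration-page values and build a run input dict.

    The field names are assumed to match the Website Content Crawler input schema.
    """
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("target_url must start with http:// or https://")
    if max_depth < 0:
        raise ValueError("max_depth must be zero or greater")
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return {
        "startUrls": [{"url": target_url}],
        "maxCrawlDepth": max_depth,
        "maxCrawlPages": max_pages,
    }


run_input = build_crawler_input("https://example.com", max_depth=2, max_pages=50)
```

With the `apify-client` Python package, a dict like this would typically be passed as `run_input` to `ApifyClient(token).actor(ACTOR_IDS["Website Content Crawler"]).call(...)` when the user clicks "Run".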