Data Extraction

Smart scraping with AVM

Objective: Enable smart scraping and data extraction through LLM-generated scripts executed within AVM’s sandbox.

Delegate web scraping logic to an LLM, execute the generated code safely on AVM nodes, and obtain structured CSV/JSON output without running untrusted scripts on your own machine.

Use Cases

Web2: Shopify Product APIs

Extract product data, pricing, and inventory information from e-commerce platforms.

Web3: CoinGecko, Twitter, Dune

Scrape cryptocurrency data, social sentiment, and blockchain analytics for comprehensive market analysis.

Scenario: Bulk Scraping

Extract account balance pages from a DeFi dashboard in parallel.
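A minimal sketch of the parallel fetch, assuming each dashboard page is reachable with a plain HTTP GET. The names `fetch_page` and `fetch_all` and the `max_workers` default are hypothetical, chosen for illustration; none of this is AVM's API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download one page and return its body as text (assumes UTF-8)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(
    urls: Iterable[str],
    fetcher: Callable[[str], str] = fetch_page,  # injectable for testing
    max_workers: int = 8,  # hypothetical default; tune to the target site
) -> Dict[str, str]:
    """Fetch every URL concurrently and return a {url: html} mapping."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its page.
        pages = list(pool.map(fetcher, url_list))
    return dict(zip(url_list, pages))
```

Threads suit this workload because the time is spent waiting on the network. Passing the fetcher as a parameter keeps the function easy to exercise without real requests.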

Implementation: Two-Stage Extraction

  1. Fetch HTML: Retrieve page content locally.

  2. Parse with LLM: Prompt the LLM to extract table data via BeautifulSoup.

  3. Run in AVM: Execute parsing code with the runPython tool.

  4. Aggregate Results: Combine CSV outputs for all URLs.
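The parse and aggregate steps above can be sketched as follows. In the workflow described, code like this would be generated by the LLM and executed through the runPython tool; here it is written by hand and runs locally. Note that the sketch swaps BeautifulSoup for the standard library's `html.parser` so that it runs without third-party installs. The names `TableParser`, `table_rows`, and `aggregate_csv`, and the `source_url` column, are hypothetical. It also assumes that the first table row on each page is a header and that all pages share the same columns.

```python
import csv
import io
from html.parser import HTMLParser
from typing import Dict, List


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: List[str] = []
        self._cell: List[str] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._row.append("".join(self._cell).strip())
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = []


def table_rows(html: str) -> List[List[str]]:
    """Return all table rows found in the HTML as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def aggregate_csv(pages: Dict[str, str]) -> str:
    """Combine the tables from all pages into one CSV, tagging rows by URL."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    header_written = False
    for url, html in pages.items():
        rows = table_rows(html)
        if not rows:
            continue  # page had no table; skip it
        header, *body = rows  # assumption: first row is the header
        if not header_written:
            writer.writerow(["source_url", *header])
            header_written = True
        for row in body:
            writer.writerow([url, *row])
    return buffer.getvalue()
```

Tagging each row with its source URL keeps the combined CSV traceable back to the page it came from, which helps when one page in a bulk run returns unexpected data.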
