Agent Virtual Machine

Data Extraction

Smart scraping with AVM


Last updated 5 days ago

Objective: Enable smart scraping and data extraction through LLM-generated scripts executed within AVM’s sandbox.

Delegate web scraping logic to an LLM, execute the generated code safely on AVM nodes, and obtain structured CSV/JSON without running untrusted code locally.

Use Cases

Web2: Shopify Product APIs

Extract product data, pricing, and inventory information from e-commerce platforms.

Web3: CoinGecko, Twitter, Dune

Scrape cryptocurrency data, social sentiment, and blockchain analytics for comprehensive market analysis.

Scenario: Bulk Scraping

Extract account-balance pages from a DeFi dashboard in parallel.
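
As a rough illustration of the parallel part of this scenario, the sketch below fetches several pages concurrently with a thread pool from the Python standard library. The helper names `fetch_html` and `fetch_all`, the worker count, and the timeout are assumptions made for this example and are not part of AVM.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download one page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def fetch_all(urls, fetch=fetch_html, max_workers=8):
    """Fetch every URL concurrently.

    Returns a dict mapping each URL to its HTML, or to the exception
    raised while fetching it, so one failing page does not stop the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all downloads first so they run in parallel.
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results
```

Passing the fetch function as a parameter keeps the sketch easy to test offline and leaves room to swap in a different HTTP client.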

Implementation: Two-Stage Extraction

  1. Fetch HTML: Retrieve page content locally.

  2. Parse with LLM: Prompt the LLM to generate BeautifulSoup code that extracts the table data.

  3. Run in AVM: Execute the parsing code with the runPython tool.

  4. Aggregate Results: Combine the CSV outputs for all URLs.
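
The parsing and aggregation steps above might look roughly like the following sketch. It is an assumption-laden stand-in, not AVM's actual code: to stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup, and it does not show the runPython interface. In practice, something like `html_table_to_csv` would be the LLM-generated script executed in the sandbox. All function and class names here are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows
        self._row = None   # row being built, or None outside <tr>
        self._cell = None  # text fragments of the open cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed


def html_table_to_csv(html):
    """Stand-in for the generated parsing script: table HTML in, CSV text out."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(parser.rows)
    return buf.getvalue()


def aggregate_csv(csv_texts):
    """Merge per-URL CSV outputs, keeping the shared header row only once."""
    merged = []
    header = None
    for text in csv_texts:
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue
        if header is None:
            header = rows[0]
            merged.append(header)
        # Skip a repeated header; keep every row if the header differs.
        merged.extend(rows[1:] if rows[0] == header else rows)
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(merged)
    return buf.getvalue()
```

A typical flow would call `html_table_to_csv` once per fetched page and pass the resulting list of CSV strings to `aggregate_csv`.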