Technical advances in document understanding
Practical AI · 2025-12-02 · 49 min
Episode notes
Chris and Daniel unpack how AI-driven document processing has rapidly evolved well beyond traditional OCR, with many technical advances that fly under the radar. They trace the progression from document structure models to language-vision models, all the way to newer innovations like DeepSeek-OCR. The discussion weighs the pros and cons of these approaches, focusing on practical implementation and usage.

Featuring:
- Chris Benson
- Daniel Whitenack

Sponsors:
- Shopify - The commerce platform trusted by millions. From idea to checkout, Shopify gives you everything you need to launch and scale your business, no matter your level of experience. Build beautiful storefronts, market with built-in AI tools, and tap into the platform powering 10% of all U.S. eCommerce. Start your one-dollar trial at shopify.com/practicalai
- Fabi.ai - The all-in-one data analysis platform for modern teams. From ad hoc queries to advanced analytics, Fabi lets you explore data wherever it lives: spreadsheets, Postgres, Snowflake, Airtable and more.