MicrocosmWorks: Innovating and Architecting the Digital Cosmos

Document Intelligence · Published June 18, 2026 · Updated May 25, 2026

AI-Powered Spreadsheet & Document Analysis with Multi-Agent Orchestration and Cross-Document Reference

An enterprise data team needed to analyze, query, and edit large collections of spreadsheets and documents (Excel, CSV, Google Sheets, PDFs, Word docs) using natural language, with the ability to cross-reference data across multiple files and execute multi-step analytical workflows without manual data wrangling.

Domain: Document Intelligence
Technologies: 15
Key Results: 6
Status: Delivered

The Challenge

Working with business documents at scale was riddled with friction:

  • Siloed Data: Critical information was scattered across dozens of spreadsheets, PDFs, and Word documents with no way to query across them
  • Manual Cross-Referencing: Comparing a vendor price list (Excel) against contract terms (PDF) against invoice history (CSV) required hours of manual lookup
  • Formula Limitations: Complex analytical questions couldn't be answered with spreadsheet formulas alone
  • Context Window Limits: Large spreadsheets (50,000+ rows) exceeded LLM context windows, making naive approaches fail
  • No Edit Capabilities: Existing AI tools could analyze documents but couldn't write changes back to the source files
  • Multi-Step Reasoning: Questions requiring sequential analysis across documents needed orchestrated multi-step workflows

Our Solution

We built a multi-agent AI document intelligence platform with vector database-backed retrieval for large documents, specialized agents for different document types, an orchestrator for cross-document reasoning, and write-back capabilities for spreadsheet editing.

Architecture

  • Orchestrator: AI orchestrator agent coordinating multi-step workflows across specialized agents
  • Spreadsheet Agent: Handles Excel/CSV/Google Sheets analysis, formula generation, and cell edits
  • Document Agent: Handles PDF/Word document reading, extraction, and summarization
  • Cross-Reference Agent: Performs joins, comparisons, and reconciliation across document types
  • Vector Database: Milvus for semantic indexing of document chunks and spreadsheet rows
  • LLM Layer: Multi-model approach with function calling
  • Backend: Python/FastAPI for document processing and agent orchestration
  • Frontend: React dashboard with file upload, chat interface, and live spreadsheet preview
  • Storage: S3 for original files, PostgreSQL for metadata and job tracking

Multi-Agent Architecture

Agent Roles

1. Orchestrator Agent

The central coordinator that receives user queries, decomposes them into sub-tasks, and delegates to specialized agents. It analyzes user intent, creates execution plans, manages data flow between agents, aggregates results, and handles error recovery.

2. Spreadsheet Agent

Specialized for tabular data operations including schema understanding, natural language to query translation, aggregations and filtering, formula generation, cell editing and column fills, chart suggestions, and data validation/anomaly detection.

3. Document Agent

Specialized for unstructured and semi-structured documents including OCR and layout-aware text extraction, section identification, key-value extraction from contracts, summarization, semantic clause search, and table extraction from PDFs/Word docs.

4. Cross-Reference Agent

Specialized for multi-document reasoning including entity matching across documents, data reconciliation and discrepancy identification, timeline analysis, dependency resolution for conflicting data, and SQL-like join operations across document types.
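
To make the reconciliation idea concrete, here is a minimal sketch of a join between a price list and invoice lines that flags discrepancies. The field names (`vendor`, `item`, `unit_price`), the `reconcile` helper, and the tolerance are illustrative assumptions, not the platform's actual schema or code.

```python
def reconcile(price_list, invoices, tolerance=0.005):
    """Join invoice lines to agreed prices on (vendor, item) and flag discrepancies."""
    # Build a lookup keyed on the join columns (hypothetical field names).
    agreed = {(p["vendor"], p["item"]): p["unit_price"] for p in price_list}
    issues = []
    for inv in invoices:
        key = (inv["vendor"], inv["item"])
        if key not in agreed:
            # No matching row on the price-list side of the join.
            issues.append({**inv, "issue": "no agreed price"})
        elif abs(inv["unit_price"] - agreed[key]) > tolerance:
            issues.append({**inv, "issue": "price mismatch", "agreed_price": agreed[key]})
    return issues


price_list = [{"vendor": "Acme Ltd", "item": "Steel bolts", "unit_price": 0.42}]
invoices = [
    {"vendor": "Acme Ltd", "item": "Steel bolts", "unit_price": 0.42},
    {"vendor": "Acme Ltd", "item": "Steel bolts", "unit_price": 0.47},
    {"vendor": "Acme Ltd", "item": "Washers", "unit_price": 0.05},
]
issues = reconcile(price_list, invoices)
```

In this sketch the first invoice line matches, the second is reported as a price mismatch, and the third is reported as having no agreed price.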

Vector Database Layer

Why Vector DB for Documents

Large documents and spreadsheets can't fit in a single LLM context window. The vector database enables semantic search across millions of rows and document chunks, retrieval of only relevant portions per query, cross-document entity linking via embedding similarity, and persistent indexing that doesn't need re-processing on every query.

Indexing Strategy

Spreadsheet Indexing:

Each row is converted to a natural language representation by concatenating key column values, then embedded and stored with references back to the original file, sheet, and row index for write-back operations.
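
A minimal sketch of the row-to-text step follows. The `RowRecord` type, the `row_to_record` helper, and the `"column: value"` format are assumptions made for illustration; the embedding call itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowRecord:
    """Text to embed, plus the pointers needed to find the row again for write-back."""
    text: str
    file_id: str
    sheet: str
    row_index: int


def row_to_record(row, key_columns, file_id, sheet, row_index):
    # Concatenate "column: value" pairs for the key columns, skipping empty cells.
    parts = [f"{col}: {row[col]}" for col in key_columns if row.get(col) not in (None, "")]
    return RowRecord(text="; ".join(parts), file_id=file_id, sheet=sheet, row_index=row_index)


record = row_to_record(
    {"vendor": "Acme Ltd", "item": "Steel bolts", "unit_price": 0.42, "notes": ""},
    key_columns=["vendor", "item", "unit_price", "notes"],
    file_id="prices.xlsx",
    sheet="2026",
    row_index=17,
)
print(record.text)  # vendor: Acme Ltd; item: Steel bolts; unit_price: 0.42
```

Keeping the column names in the text gives the embedding model some schema context, and the stored `file_id`, `sheet`, and `row_index` are what let a later edit target the original row.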

Document Indexing:

Documents are extracted with layout awareness, chunked into semantic segments with overlap, embedded, and stored with references to the source file, section, and page number.
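
Chunking with overlap can be sketched as a sliding window. Note that this example uses fixed-size word windows as a simple stand-in; the semantic segmentation described above would choose boundaries by section or meaning, and the `chunk_words` helper and its parameters are hypothetical.

```python
def chunk_words(text, size, overlap):
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_words("a b c d e f g", size=4, overlap=2)
print(chunks)  # ['a b c d', 'c d e f', 'e f g']
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of indexing some text twice.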

Cross-Document Entity Index:

A separate index links entities (vendors, products, people, invoice numbers) across documents, enabling cross-reference queries to quickly find all mentions of an entity regardless of source file.
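
One simple way to picture such an index is an inverted map from a normalized entity name to its mentions. The sketch below is an in-memory illustration under that assumption; the `EntityIndex` and `Mention` names and the locator strings are hypothetical, and a production system would likely need fuzzier matching than case and whitespace normalization.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    file_id: str
    locator: str  # e.g. "sheet=2026,row=17" or "page=4"


class EntityIndex:
    """Inverted index from (entity type, normalized name) to every mention of it."""

    def __init__(self):
        self._mentions = defaultdict(list)

    @staticmethod
    def _normalize(name):
        # Case-fold and collapse whitespace so trivial spelling variants collide.
        return " ".join(name.lower().split())

    def add(self, entity_type, name, mention):
        self._mentions[(entity_type, self._normalize(name))].append(mention)

    def find(self, entity_type, name):
        return list(self._mentions.get((entity_type, self._normalize(name)), []))


index = EntityIndex()
index.add("vendor", "Acme Ltd", Mention("prices.xlsx", "sheet=2026,row=17"))
index.add("vendor", "ACME  ltd", Mention("contract.pdf", "page=4"))
hits = index.find("vendor", "acme ltd")
```

Here `hits` contains both mentions, one from the spreadsheet and one from the PDF, which is the lookup a cross-reference query needs.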

Retrieval Pipeline

When a user asks a cross-document question, the orchestrator identifies which documents and agents are needed, performs vector searches to find relevant data across all sources, delegates to specialized agents for processing, and aggregates results into a coherent response.
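
The vector search step amounts to ranking stored embeddings by similarity to the query embedding. The sketch below does this by brute force in plain Python purely to show the idea; in the platform this work is delegated to the vector database, and the two-dimensional vectors and reference strings are made-up toy values.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, items, k):
    """Return the k (score, ref) pairs most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), ref) for ref, vec in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy index: (source reference, embedding). Real embeddings have hundreds of dimensions.
items = [
    ("invoices.csv#row=3", [0.0, 1.0]),
    ("contract.pdf#page=4", [1.0, 0.0]),
    ("prices.xlsx#row=17", [0.6, 0.8]),
]
results = top_k([1.0, 0.0], items, k=2)
refs = [ref for _, ref in results]
```

Because each hit carries its source reference, the orchestrator can tell from `refs` which files are involved and therefore which specialized agents to call.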

Orchestration Engine

Query Decomposition

The orchestrator breaks complex queries into multi-step execution plans. For example, a question like "Find vendors with late deliveries, check contract penalty clauses, and calculate claimable penalties" would be decomposed into sequential steps: querying delivery data via the Spreadsheet Agent, searching contracts via the Document Agent, and joining results via the Cross-Reference Agent.
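
The example plan above can be written down as a small dependency graph. This is a sketch only: the `Step` type, the step ids, and the agent labels are hypothetical, and in practice the plan would be produced by the LLM rather than hard-coded. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Step:
    id: str
    agent: str
    task: str
    depends_on: list = field(default_factory=list)


plan = [
    Step("claimable", "cross_reference", "Join late deliveries with clauses and compute penalties",
         ["late_deliveries", "penalty_clauses"]),
    Step("late_deliveries", "spreadsheet", "Find vendors with late deliveries"),
    Step("penalty_clauses", "document", "Find penalty clauses for those vendors",
         ["late_deliveries"]),
]


def execution_order(steps):
    """Order step ids so that every step runs after the steps it depends on."""
    graph = {step.id: set(step.depends_on) for step in steps}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(plan)
print(order)  # ['late_deliveries', 'penalty_clauses', 'claimable']
```

Declaring dependencies explicitly, rather than relying on list position, also makes it possible to detect steps that could run in parallel.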

Agent Communication

  • Agents communicate via structured messages with typed payloads
  • The orchestrator maintains execution context with intermediate results
  • Failed steps trigger retry or fallback strategies
  • Partial results are returned if some steps complete but others fail
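
The retry, fallback, and partial-result behavior listed above could look roughly like the following. The `StepResult` message type and the `run_with_retry` and `run_plan` helpers are illustrative assumptions, and the steps are run independently here to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class StepResult:
    """Typed message an agent returns to the orchestrator."""
    step_id: str
    ok: bool
    payload: Any = None
    error: Optional[str] = None


def run_with_retry(step_id, action: Callable[[], Any],
                   fallback: Optional[Callable[[], Any]] = None, attempts: int = 2) -> StepResult:
    last_error = ""
    for _ in range(attempts):
        try:
            return StepResult(step_id, True, action())
        except Exception as exc:  # sketch: real code would catch narrower error types
            last_error = str(exc)
    if fallback is not None:
        try:
            return StepResult(step_id, True, fallback())
        except Exception as exc:
            last_error = str(exc)
    return StepResult(step_id, False, error=last_error)


def run_plan(steps):
    """Run every step and return what completed alongside what failed."""
    results = [run_with_retry(step_id, action) for step_id, action in steps.items()]
    return {
        "completed": {r.step_id: r.payload for r in results if r.ok},
        "failed": {r.step_id: r.error for r in results if not r.ok},
    }


def always_times_out():
    raise RuntimeError("timeout")


outcome = run_plan({"late_deliveries": lambda: ["Acme Ltd"], "penalty_clauses": always_times_out})
```

In this run `outcome["completed"]` holds the delivery result while `outcome["failed"]` records the timeout, so the user still receives the part of the answer that succeeded.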

Spreadsheet Edit & Write-Back

Edit Capabilities

The platform supports cell updates, column fills, row insertion, conditional formatting, new sheet creation, and formula injection, all proposed by AI agents and applied with user approval.

Write-Back Pipeline

  1. Agent determines the edit operation (which cells, what values)
  2. Edit preview shown to user with diff highlighting (old vs. new values)
  3. User approves or modifies the proposed changes
  4. Backend applies changes to the file using appropriate libraries per format
  5. Modified file saved as a new version with edit audit trail
  6. Vector index updated for changed rows
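
Steps 1 to 4 of the pipeline above can be sketched with an in-memory sheet. The sheet is modeled as a plain dict from cell address to value, which is an assumption for illustration; the `CellEdit`, `preview`, and `apply_edits` names are hypothetical, and writing to the real file format is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellEdit:
    cell: str          # e.g. "B2"
    new_value: object


def preview(sheet, edits):
    """Return (cell, old, new) for every edit that would actually change a value."""
    return [(e.cell, sheet.get(e.cell), e.new_value)
            for e in edits if sheet.get(e.cell) != e.new_value]


def apply_edits(sheet, edits, approved):
    """Apply edits to a copy of the sheet, but only if the user approved them."""
    if not approved:
        return sheet
    updated = dict(sheet)  # the original mapping is left untouched
    for e in edits:
        updated[e.cell] = e.new_value
    return updated


sheet = {"B2": 100, "B3": 250}
edits = [CellEdit("B2", 120), CellEdit("B3", 250)]
diff = preview(sheet, edits)
print(diff)  # [('B2', 100, 120)]
new_sheet = apply_edits(sheet, edits, approved=True)
```

Returning a new mapping instead of mutating the input mirrors the requirement that the original file is preserved and each edit produces a new version.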

Version Control

  • Every edit creates a new file version (original preserved)
  • Diff log shows exactly what changed, when, and why
  • Rollback to any previous version with one click
  • Edit attribution: which agent or user made each change
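
A minimal sketch of this versioning model, assuming an append-only history in which a rollback is itself recorded as a new version. The `VersionedSheet` class and its method names are hypothetical, and the sheet is again modeled as a dict of cell values.

```python
import copy
from dataclasses import dataclass


@dataclass
class Version:
    number: int
    data: dict
    author: str   # which agent or user made the change
    reason: str


class VersionedSheet:
    def __init__(self, data, author="upload"):
        self.versions = [Version(1, copy.deepcopy(data), author, "original upload")]

    @property
    def current(self):
        return self.versions[-1]

    def commit(self, data, author, reason):
        version = Version(self.current.number + 1, copy.deepcopy(data), author, reason)
        self.versions.append(version)
        return version

    def diff(self, old, new):
        """Map each changed cell to its (old, new) values between two version numbers."""
        a, b = self.versions[old - 1].data, self.versions[new - 1].data
        return {cell: (a.get(cell), b.get(cell))
                for cell in sorted(a.keys() | b.keys()) if a.get(cell) != b.get(cell)}

    def rollback(self, number, author):
        # Rolling back appends a copy of the old state, so no history is lost.
        return self.commit(self.versions[number - 1].data, author, f"rollback to v{number}")


history = VersionedSheet({"B2": 100})
history.commit({"B2": 120}, author="spreadsheet_agent", reason="apply updated price")
changes = history.diff(1, 2)
history.rollback(1, author="alice")
```

After these calls the history holds three versions: the original, the agent's edit, and a rollback attributed to the user, with `changes` showing that only `B2` moved from 100 to 120.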

Processing Pipeline for New Documents

File Upload Flow

  1. User uploads files (drag-and-drop or API)
  2. File type detected and routed to appropriate processor
  3. Spreadsheets: Parsed, schema inferred, rows embedded and indexed
  4. PDFs: OCR (if scanned) → layout extraction → chunking → embedding → indexing
  5. Word Docs: Text extraction → section parsing → chunking → embedding → indexing
  6. Entity Extraction: NER identifies people, organizations, dates, amounts across all docs
  7. Cross-Document Linking: Entity index updated with new mentions
  8. File metadata stored in PostgreSQL, embeddings in vector DB, originals in S3
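
The routing in steps 2 to 7 can be sketched as a lookup from file extension to an ordered list of stages. The stage names, the `PIPELINES` table, and the `plan_processing` helper are illustrative assumptions; detection by extension is a simplification, and Google Sheets or Google Docs, which arrive through an API rather than as files with an extension, are not covered here.

```python
from pathlib import Path

# Hypothetical stage names mirroring the upload flow described above.
SPREADSHEET = ["parse", "infer_schema", "embed_rows", "index"]
PIPELINES = {
    ".xlsx": SPREADSHEET,
    ".csv": SPREADSHEET,
    ".pdf": ["layout_extraction", "chunking", "embedding", "indexing"],
    ".docx": ["text_extraction", "section_parsing", "chunking", "embedding", "indexing"],
}


def plan_processing(filename, scanned=False):
    """Return the ordered processing stages for an uploaded file."""
    ext = Path(filename).suffix.lower()
    if ext not in PIPELINES:
        raise ValueError(f"unsupported file type: {ext or filename}")
    stages = list(PIPELINES[ext])
    if ext == ".pdf" and scanned:
        stages.insert(0, "ocr")  # OCR runs only for scanned PDFs
    # Entity extraction and cross-document linking run for every file type.
    return stages + ["entity_extraction", "entity_linking"]


stages = plan_processing("Contract_2026.PDF", scanned=True)
print(stages[0])  # ocr
```

Keeping the stage lists as data makes it straightforward to add a new format without touching the routing logic.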

Supported Formats

The platform supports Excel, CSV, and Google Sheets (with full write-back), native and scanned PDFs (read-only), and Word docs and Google Docs (limited write-back).

Key Features

  1. Multi-Agent Architecture: Specialized agents for spreadsheets, documents, and cross-referencing
  2. AI Orchestrator: Decomposes complex queries into multi-step execution plans
  3. Cross-Document Reference: Entity linking and data reconciliation across file types
  4. Vector-Powered Retrieval: Semantic search handles datasets beyond LLM context limits
  5. Spreadsheet Write-Back: AI edits cells, fills columns, and injects formulas with user approval
  6. Large Dataset Support: 50,000+ row spreadsheets indexed and queryable via vector search
  7. Version Control: Every edit versioned with diff log and rollback capability
  8. Natural Language Queries: Ask complex analytical questions in plain English
  9. Multi-Format Support: Excel, CSV, Google Sheets, PDF, Word, Google Docs
  10. Edit Preview: Diff-highlighted preview before any changes are applied

Results

Query Speed: Cross-document questions answered in 10-30 seconds vs. hours of manual lookup
Data Scale: Handled 500+ documents and spreadsheets with 2M+ total rows indexed
Edit Accuracy: AI-proposed spreadsheet edits accepted without modification 85% of the time
Cross-Reference: Entity matching linked data across documents with 92% accuracy
Retrieval Precision: Vector search returned relevant chunks in top-5 results 94% of the time
Time Savings: Reduced multi-document analysis workflows from hours to minutes

Technology Stack

Python, FastAPI, LLM (GPT-4o, Claude), Milvus, OpenAI Embeddings, LangChain, LangGraph, React, PostgreSQL, S3, Job Queue, Redis, OCR, Document Processing Libraries

Frequently Asked Questions

MicrocosmWorks designed a multi-agent architecture where specialized agents handle different aspects of document analysis, such as a table extraction agent for spreadsheets, a text summarization agent for narrative documents, and a cross-reference agent that identifies relationships between data points across multiple files. This division of labor produces more accurate results than a single monolithic LLM call because each agent operates within a focused context window and applies domain-specific prompting strategies.

Yes, MicrocosmWorks built a spreadsheet parsing engine that resolves formula dependencies, expands pivot table summaries, and traces cross-sheet references before passing structured data to the analysis agents. The system converts complex Excel constructs into flattened data representations that LLMs can reason about effectively, and preserves the relational context between sheets so the AI can answer questions like 'which department exceeded its Q3 budget' that require joining data across multiple tabs.

MicrocosmWorks implemented an entity linking pipeline that extracts named entities, numeric identifiers, and date references from all uploaded documents, then builds a knowledge graph connecting related mentions across files. When a user asks a question, the cross-reference agent traverses this graph to pull relevant data from multiple source documents, providing answers that synthesize information in ways that would take a human analyst hours of manual cross-checking.

MicrocosmWorks designed the system to handle document batches of up to 500 files per analysis session, with individual file sizes up to 100MB for spreadsheets and 50MB for PDFs. Large documents are automatically chunked and processed in parallel across multiple agent instances, and the orchestrator maintains a coherent view of the entire document set by aggregating agent outputs into a unified knowledge representation.

MicrocosmWorks develops multi-agent document analysis platforms at rates of $30-$50/hr, with a production-ready system typically requiring 3-5 months of development, including document parsing, agent orchestration, cross-reference detection, and a user-facing query interface. The per-query cost in production depends on document volume and LLM token usage, but a multi-agent architecture can reduce LLM costs by routing only relevant context to each agent rather than stuffing entire document sets into a single prompt.