AI Chat · Published June 17, 2026 · Updated May 25, 2026

Enterprise Multi-Model AI Chat Platform with Credit-Based Billing

An organization needed a unified platform for teams to access multiple AI models (GPT, Claude, Gemini, Grok, Perplexity) with enterprise-grade security, usage tracking, and cost management.

The Challenge

Teams were using multiple AI tools with no centralization or cost control:

Each team member had separate subscriptions to different AI providers
No unified conversation history or knowledge sharing across the organization
No visibility into AI usage costs or per-user consumption
Enterprise security and GDPR compliance requirements couldn't be met with consumer tools
Comparing model outputs required switching between multiple interfaces

Our Solution

We built a production-grade multi-model AI chat platform with credit-based billing, role-based access control, and GDPR compliance.

Architecture

Frontend: React 18 + TypeScript + Vite with Tailwind CSS
Backend: Node.js/Express with TypeScript and Prisma ORM
Database: PostgreSQL (60+ tables) with Redis caching
Auth: AWS Cognito with JWT-based RBAC
Billing: LemonSqueezy with credit-based consumption tracking
Queue: BullMQ for background job processing
Infrastructure: AWS (ECS/Fargate, RDS, ElastiCache, S3, KMS, SES)

AI Integrations

OpenAI GPT models
Anthropic Claude models
Google Gemini models
xAI Grok models
Perplexity for web search
Suno for AI music generation

Key Features

Multi-Model Chat - Switch between AI providers per conversation
Split-Screen Comparison - Side-by-side model output comparison
Workflow Automation - LangGraph-powered step-by-step AI workflows
GPT Marketplace - Discover, create, and share custom GPTs
Artifacts - Sandboxed code/HTML preview within conversations
Credit System - Pay-per-use with automatic refills and admin grants
GDPR Compliance - Automated deletion, data export, AES-256-GCM encryption
Content Moderation - Flagging system with auto-triage for inappropriate content
Group Chat - Multiple AI participants in a single conversation
Web Search - Perplexity integration for grounded, up-to-date responses
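
As a rough illustration of how per-conversation provider switching and split-screen comparison can share one abstraction, the sketch below uses a hypothetical `ChatProvider` interface with stub adapters. None of these names come from the platform itself, and a real adapter would call the vendor API rather than return canned text.

```typescript
// Hypothetical provider abstraction; all names and shapes are illustrative.
type ProviderId = "openai" | "anthropic" | "google" | "xai" | "perplexity";

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface ChatProvider {
  id: ProviderId;
  // A real adapter would call the vendor API; these stubs return canned text.
  complete(history: ChatMessage[]): string;
}

function stubProvider(id: ProviderId): ChatProvider {
  return {
    id,
    complete: (history) => {
      const last = history[history.length - 1];
      return `[${id}] reply to: ${last ? last.content : ""}`;
    },
  };
}

const registry: Record<ProviderId, ChatProvider> = {
  openai: stubProvider("openai"),
  anthropic: stubProvider("anthropic"),
  google: stubProvider("google"),
  xai: stubProvider("xai"),
  perplexity: stubProvider("perplexity"),
};

interface Conversation {
  providerId: ProviderId; // can be changed between turns
  messages: ChatMessage[];
}

// Sends one user turn to whichever provider the conversation currently uses.
function send(conversation: Conversation, text: string): string {
  const provider = registry[conversation.providerId];
  conversation.messages.push({ role: "user", content: text });
  const reply = provider.complete(conversation.messages);
  conversation.messages.push({ role: "assistant", content: reply });
  return reply;
}

// Split-screen comparison: the same prompt, one answer per selected provider.
function compare(
  prompt: string,
  ids: ProviderId[],
): { providerId: ProviderId; reply: string }[] {
  return ids.map((id) => ({
    providerId: id,
    reply: registry[id].complete([{ role: "user", content: prompt }]),
  }));
}
```

Because the conversation stores only a `providerId`, switching models mid-conversation is a single field update, and the shared history is passed to whichever adapter handles the next turn.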

Results

Cost Visibility: Per-user token usage and cost tracking

Security: AES-256-GCM encryption at rest, AWS KMS key rotation, full audit trail

Compliance: GDPR-compliant with automated erasure and data export
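
For readers unfamiliar with AES-256-GCM, the following is a minimal sketch of field-level encryption using Node.js's built-in `crypto` module. It is not the platform's actual code: the `EncryptedRecord` shape and function names are hypothetical, and the sketch assumes the 32-byte key would be supplied by a key management service such as AWS KMS, where a randomly generated key stands in here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage shape for one encrypted field.
interface EncryptedRecord {
  iv: string; // base64, 12-byte nonce, must be unique per encryption
  authTag: string; // base64, 16-byte GCM authentication tag
  ciphertext: string; // base64
}

function encryptField(plaintext: string, key: Buffer): EncryptedRecord {
  if (key.length !== 32) throw new Error("AES-256 requires a 32-byte key");
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  return {
    iv: iv.toString("base64"),
    authTag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

function decryptField(record: EncryptedRecord, key: Buffer): string {
  const decipher = createDecipheriv(
    "aes-256-gcm",
    key,
    Buffer.from(record.iv, "base64"),
  );
  decipher.setAuthTag(Buffer.from(record.authTag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(record.ciphertext, "base64")),
    decipher.final(), // throws if the ciphertext or tag was tampered with
  ]);
  return plaintext.toString("utf8");
}

// Usage: in production the key would come from a KMS, not randomBytes.
const demoKey = randomBytes(32);
const stored = encryptField("conversation text", demoKey);
const restored = decryptField(stored, demoKey);
```

GCM authenticates as well as encrypts, so a modified ciphertext or tag fails at `decipher.final()` instead of silently yielding corrupted plaintext.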

Tech Stack

React, TypeScript, Vite, Node.js, Express, Prisma, PostgreSQL, Redis, BullMQ, AWS Cognito, AWS ECS/Fargate, LemonSqueezy, OpenAI, Anthropic

Frequently Asked Questions

MicrocosmWorks engineered an intelligent routing layer that evaluates incoming prompts based on task type, complexity, and token requirements, then dispatches them to the most appropriate model whether that is GPT-4, Claude, Llama, or a specialized fine-tuned model. This approach optimizes both response quality and cost, since simpler queries can be handled by faster, cheaper models while complex reasoning tasks go to more capable ones.
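
A routing decision of this kind can be sketched as a filter followed by a cost comparison. The example below is an assumption-laden illustration, not the production router: it presumes each prompt has already been classified into a `PromptFeatures` object, and the model tier names, context sizes, prices, and capability scores are all invented for the example.

```typescript
type TaskType = "chat" | "code" | "reasoning" | "search";

// Hypothetical model catalog entry; every value below is illustrative.
interface ModelProfile {
  name: string;
  maxContextTokens: number;
  costPer1kTokens: number;
  capability: 1 | 2 | 3; // 1 = basic, 3 = strongest
  supports: TaskType[];
}

// Assumed to be produced by an upstream classifier (not shown).
interface PromptFeatures {
  taskType: TaskType;
  complexity: 1 | 2 | 3;
  estimatedTokens: number;
}

// Keep only models that can handle the prompt, then pick the cheapest one.
function routePrompt(
  features: PromptFeatures,
  models: ModelProfile[],
): ModelProfile {
  const eligible = models.filter(
    (m) =>
      m.supports.some((t) => t === features.taskType) &&
      m.maxContextTokens >= features.estimatedTokens &&
      m.capability >= features.complexity,
  );
  if (eligible.length === 0) {
    throw new Error("No model satisfies the prompt requirements");
  }
  return eligible.reduce((best, m) =>
    m.costPer1kTokens < best.costPer1kTokens ? m : best,
  );
}

const catalog: ModelProfile[] = [
  { name: "fast-small", maxContextTokens: 16000, costPer1kTokens: 0.2, capability: 1, supports: ["chat", "code"] },
  { name: "balanced", maxContextTokens: 128000, costPer1kTokens: 1.0, capability: 2, supports: ["chat", "code", "search"] },
  { name: "frontier", maxContextTokens: 200000, costPer1kTokens: 5.0, capability: 3, supports: ["chat", "code", "reasoning"] },
];

// A short, simple chat prompt lands on the cheapest eligible tier.
const chosen = routePrompt(
  { taskType: "chat", complexity: 1, estimatedTokens: 500 },
  catalog,
);
```

Picking the cheapest eligible model is one simple policy; a real router might also weigh latency or recent error rates.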

MicrocosmWorks implemented a unified credit system that abstracts away the varying per-token costs of different AI providers into a single internal currency that enterprise customers purchase in bulk. Each model interaction deducts credits proportional to its actual API cost plus a configurable margin, giving administrators a single dashboard to track usage, set department-level budgets, and generate chargeback reports.
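
The arithmetic behind such a credit deduction can be sketched as follows. This is a simplified illustration under stated assumptions: the per-token prices, the credit exchange rate, the margin, and names such as `creditsFor` and `DepartmentLedger` are hypothetical, and rounding up to whole credits is one possible policy, not necessarily the one used.

```typescript
// Hypothetical provider pricing in USD per one million tokens.
interface Pricing {
  inputUsdPer1M: number;
  outputUsdPer1M: number;
}

interface CreditConfig {
  usdPerCredit: number; // value of one internal credit
  marginRate: number; // configurable margin, e.g. 0.2 = 20%
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Credits = (actual API cost + margin), converted and rounded up.
function creditsFor(usage: Usage, pricing: Pricing, cfg: CreditConfig): number {
  const apiCostUsd =
    (usage.inputTokens * pricing.inputUsdPer1M +
      usage.outputTokens * pricing.outputUsdPer1M) /
    1_000_000;
  const billedUsd = apiCostUsd * (1 + cfg.marginRate);
  return Math.ceil(billedUsd / cfg.usdPerCredit);
}

// Minimal department budget: a charge is refused if it would overspend.
class DepartmentLedger {
  private spent = 0;

  constructor(private readonly budgetCredits: number) {}

  charge(credits: number): boolean {
    if (this.spent + credits > this.budgetCredits) return false;
    this.spent += credits;
    return true;
  }

  remaining(): number {
    return this.budgetCredits - this.spent;
  }
}

// 1,000 input and 500 output tokens at illustrative prices:
// API cost $0.0105, plus 20% margin = $0.0126, at $0.001 per credit.
const cost = creditsFor(
  { inputTokens: 1000, outputTokens: 500 },
  { inputUsdPer1M: 3, outputUsdPer1M: 15 },
  { usdPerCredit: 0.001, marginRate: 0.2 },
);
```

In a real system the deduction and budget check would run inside a database transaction so that concurrent requests cannot overspend the same budget.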

Yes, MicrocosmWorks built a centralized governance layer that enforces consistent data handling policies regardless of which underlying LLM processes the query. All conversations are encrypted at rest, role-based access controls determine which teams can access which models, and configurable retention policies automatically purge conversation history according to your compliance requirements.
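
To make the two policy checks concrete, here is a small sketch of per-team model access and retention-based purging. The `TeamPolicy` shape, the team and model names, and the 30-day window are assumptions made for illustration. An actual deployment would load policies from configuration and run the purge as a scheduled background job.

```typescript
// Hypothetical per-team governance policy.
interface TeamPolicy {
  allowedModels: string[];
  retentionDays: number;
}

interface StoredConversation {
  id: string;
  lastActivity: Date;
}

// Access check: a team may only use models listed in its policy.
function canUseModel(policy: TeamPolicy, model: string): boolean {
  return policy.allowedModels.some((m) => m === model);
}

// Retention check: returns conversations older than the policy window.
function selectExpired(
  conversations: StoredConversation[],
  policy: TeamPolicy,
  now: Date,
): StoredConversation[] {
  const cutoff = now.getTime() - policy.retentionDays * 24 * 60 * 60 * 1000;
  return conversations.filter((c) => c.lastActivity.getTime() < cutoff);
}

// Illustrative data: a team limited to one model with 30-day retention.
const legalTeam: TeamPolicy = { allowedModels: ["model-a"], retentionDays: 30 };

const history: StoredConversation[] = [
  { id: "old", lastActivity: new Date("2025-12-15T00:00:00Z") },
  { id: "recent", lastActivity: new Date("2026-01-10T00:00:00Z") },
];

const toPurge = selectExpired(history, legalTeam, new Date("2026-01-31T00:00:00Z"));
```

Passing `now` as a parameter keeps the retention logic deterministic and easy to test.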

MicrocosmWorks optimized the routing layer to add under 50 milliseconds of overhead per request, which is negligible compared to typical LLM response times of 1-10 seconds. The platform uses connection pooling, pre-authenticated sessions with each provider, and async streaming so that tokens begin appearing in the user interface as soon as the selected model starts generating them.
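
The streaming behavior described here can be illustrated with an async iterator that forwards each token as soon as it arrives. This is a minimal sketch, assuming the provider SDK exposes its output as an `AsyncIterable<string>`. The `fakeModelStream` generator and the `relay` function are hypothetical stand-ins, and a real server would write each token to the client over a streaming transport such as server-sent events or a WebSocket.

```typescript
// Stand-in for a provider's streaming response (hypothetical).
async function* fakeModelStream(tokens: string[]): AsyncGenerator<string> {
  for (const token of tokens) {
    yield token;
  }
}

// Forwards each token immediately, without waiting for the full response,
// and returns the assembled text for storage once the stream ends.
async function relay(
  source: AsyncIterable<string>,
  onToken: (token: string) => void,
): Promise<string> {
  let full = "";
  for await (const token of source) {
    onToken(token); // e.g. write to the client connection
    full += token;
  }
  return full;
}

// Usage: collect the tokens in arrival order.
async function demo(): Promise<{ seen: string[]; full: string }> {
  const seen: string[] = [];
  const full = await relay(fakeModelStream(["Hel", "lo", "!"]), (t) => {
    seen.push(t);
  });
  return { seen, full };
}
```

Because `onToken` runs inside the loop, the first token reaches the user after the model's own time to first token plus the relay overhead, not after the entire completion.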

MicrocosmWorks builds enterprise multi-model chat platforms at development rates of $30-$50/hr, which is a fraction of what large consultancies charge for similar AI infrastructure projects. The total scope depends on the number of model integrations, authentication and SSO requirements, and whether you need features like conversation branching, prompt libraries, or fine-tuning pipelines.
