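Challenge
Traditional database queries could not meet the search requirements:
- Full-text search across more than 80 fields per supplier was too slow in SQL
- Priority-based ranking had to account for data completeness and verification status
- Social media presence had to be searchable as a first-class attribute
- Fuzzy matching and typo tolerance were essential for international supplier names
- Category and location hierarchies required faceted search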
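Our Solution
We implemented a custom Elasticsearch integration that powers supplier discovery through priority-based indexing, multi-field search, and intelligent ranking.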
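One way to picture priority-based indexing is a score computed when a supplier document is indexed, so that ranking can use it later. The sketch below is illustrative only: the `SupplierRecord` shape, the `priorityScore` name, and the 70/30 split between completeness and verification are assumptions, not the weighting used in the project.

```typescript
// Hypothetical supplier shape; the real schema has 80+ fields.
interface SupplierRecord {
  name: string;
  verified: boolean;
  fields: Record<string, string | null | undefined>;
}

// Fraction of profile fields holding a non-empty string (0..1).
function completeness(fields: Record<string, string | null | undefined>): number {
  const keys = Object.keys(fields);
  if (keys.length === 0) return 0;
  const filled = keys.filter((k) => {
    const v = fields[k];
    return typeof v === "string" && v.trim() !== "";
  }).length;
  return filled / keys.length;
}

// Assumed weighting: completeness contributes up to 70 points,
// verification adds a flat 30. Result is an integer in 0..100.
function priorityScore(s: SupplierRecord): number {
  const base = Math.round(completeness(s.fields) * 70);
  return base + (s.verified ? 30 : 0);
}
```

Storing the result as a numeric `priority` field (an assumed name) keeps the ranking logic out of the query path: the score changes only when the supplier record changes.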
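Architecture
- Search engine: Elasticsearch with custom mappings for suppliers, categories, and social media
- Data layer: TypeORM/PostgreSQL as the source of truth, synced to Elasticsearch
- API layer: Node.js/Express with the Elasticsearch client
- Frontend: React with real-time search-as-you-type
- Analytics: PostHog for search behavior tracking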
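Search Capabilities
- Multi-field search - queries supplier names, descriptions, brands, and categories at the same time
- Social media filtering - finds suppliers by their presence on specific platforms
- Category facets - drills down through the product category hierarchy
- Location filtering - searches by country, region, or city
- Priority ranking - verified suppliers with complete data rank higher
- Fuzzy matching - handles typos and international name variants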
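Several of the capabilities above can be combined in a single Elasticsearch request body. The following is a minimal sketch, assuming hypothetical field names (`name`, `brands`, `categories`, `description`, `location.country`, `social.<platform>`) and illustrative boosts; it builds a plain Query DSL object that could be passed to the search API.

```typescript
interface SupplierSearchParams {
  text: string;
  country?: string;        // e.g. "DE"
  socialPlatform?: string; // e.g. "instagram"
}

// Builds a search body: fuzzy multi-field match, optional filters,
// and a terms aggregation for category facets.
function buildSupplierQuery(p: SupplierSearchParams) {
  const filter: object[] = [];
  if (p.country) {
    filter.push({ term: { "location.country": p.country } });
  }
  if (p.socialPlatform) {
    // Matches suppliers that have any value stored for this platform.
    filter.push({ exists: { field: `social.${p.socialPlatform}` } });
  }
  return {
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query: p.text,
              // Assumed boosts: name matches count most.
              fields: ["name^3", "brands^2", "categories^2", "description"],
              fuzziness: "AUTO", // typo tolerance scaled to term length
            },
          },
        ],
        filter,
      },
    },
    aggs: {
      by_category: { terms: { field: "categories.keyword" } },
    },
  };
}
```

Filters sit in the `filter` clause rather than `must` so they narrow the result set without affecting relevance scores.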
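Key Features
- Custom index mappings - schemas optimized for supplier, category, and social media data
- Real-time sync - database changes appear in search results within seconds
- Search analytics - tracks popular queries, zero-result searches, and click-through rates
- Bulk indexing - efficient batch indexing for large supplier imports
- Weighted scoring - configurable relevance scoring based on field importance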
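For bulk indexing, Elasticsearch's Bulk API takes newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The sketch below shows how a large import might be split into batches and serialized; the `IndexableSupplier` shape, the helper names, and the batch size are assumptions for illustration.

```typescript
// Hypothetical minimal document shape for the example.
interface IndexableSupplier {
  id: string;
  name: string;
  priority: number;
}

// Serializes documents into the NDJSON body expected by the Bulk API.
function toBulkNdjson(index: string, docs: IndexableSupplier[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    const { id, ...source } = doc;
    lines.push(JSON.stringify({ index: { _index: index, _id: id } }));
    lines.push(JSON.stringify(source));
  }
  // The bulk body must end with a newline.
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}

// Splits an import into fixed-size batches so each request stays small.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

A caller would typically loop over `chunk(suppliers, 1000)` and send each `toBulkNdjson("suppliers", batch)` body in its own request; both the index name and the batch size of 1000 are placeholders.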
Results
Tech Stack
FAQ
MicrocosmWorks configured Elasticsearch with custom analyzers that combine edge n-gram tokenization for partial matching, synonym dictionaries for industry terminology, and a dedicated keyword field for exact part number lookups. This approach returns relevant suppliers even when buyers use different terminology than what appears in the supplier's catalog.
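An index definition along these lines could combine the three pieces described above. It is a sketch under stated assumptions: the analyzer and filter names, the n-gram sizes, the synonym entries, and the `name` and `part_number` fields are all placeholders rather than the project's actual configuration.

```typescript
// Illustrative index body: edge n-grams at index time for partial matching,
// synonyms at search time, and a keyword field for exact part numbers.
const supplierIndexBody = {
  settings: {
    analysis: {
      tokenizer: {
        supplier_edge_ngram: {
          type: "edge_ngram",
          min_gram: 2,
          max_gram: 15,
          token_chars: ["letter", "digit"],
        },
      },
      filter: {
        industry_synonyms: {
          type: "synonym_graph",
          // Placeholder entries; a real dictionary would be domain-specific.
          synonyms: ["pcb, printed circuit board", "ss, stainless steel"],
        },
      },
      analyzer: {
        supplier_partial: {
          type: "custom",
          tokenizer: "supplier_edge_ngram",
          filter: ["lowercase"],
        },
        supplier_search: {
          type: "custom",
          tokenizer: "standard",
          filter: ["lowercase", "industry_synonyms"],
        },
      },
    },
  },
  mappings: {
    properties: {
      name: {
        type: "text",
        analyzer: "supplier_partial",
        search_analyzer: "supplier_search",
      },
      // keyword fields are not analyzed, so lookups match the exact value.
      part_number: { type: "keyword" },
    },
  },
};
```

Using a separate `search_analyzer` keeps the query text from being split into n-grams itself, which would otherwise match far too many documents.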
MicrocosmWorks designed the Elasticsearch cluster with a sharding strategy that distributes supplier documents across multiple nodes based on industry vertical, enabling horizontal scaling without reindexing. The architecture supports cross-cluster search for geographic distribution, maintaining sub-200ms query response times even at millions of supplier records.
Yes, MicrocosmWorks implemented function score queries that dynamically boost supplier rankings based on buyer-defined weights for proximity, MOQ fit, lead time, certification requirements, and past transaction history. Buyers can save their weighting profiles and apply them across searches for consistent sourcing preferences.
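A `function_score` query with per-buyer weights might look roughly like the sketch below. It covers four of the factors mentioned and leaves out transaction history; the field names (`location.geo`, `moq`, `lead_time_days`, `certifications`), the decay scales, and the `BuyerWeights` and `BuyerContext` types are assumptions for illustration.

```typescript
interface BuyerWeights {
  proximity: number;
  moqFit: number;
  leadTime: number;
  certification: number;
}

interface BuyerContext {
  lat: number;
  lon: number;
  orderQuantity: number;
  requiredCertifications: string[];
}

// Wraps a text query in function_score so buyer weights adjust the ranking.
function buildWeightedQuery(text: string, w: BuyerWeights, ctx: BuyerContext) {
  return {
    query: {
      function_score: {
        query: { multi_match: { query: text, fields: ["name", "description"] } },
        functions: [
          // Closer suppliers score higher; decay scale is an assumed value.
          {
            gauss: { "location.geo": { origin: { lat: ctx.lat, lon: ctx.lon }, scale: "500km" } },
            weight: w.proximity,
          },
          // Boost suppliers whose minimum order quantity fits the order size.
          { filter: { range: { moq: { lte: ctx.orderQuantity } } }, weight: w.moqFit },
          // Shorter lead times score higher.
          { gauss: { lead_time_days: { origin: 0, scale: 14 } }, weight: w.leadTime },
          // Boost suppliers holding at least one of the required certifications.
          { filter: { terms: { certifications: ctx.requiredCertifications } }, weight: w.certification },
        ],
        score_mode: "sum",      // add up the weighted function values
        boost_mode: "multiply", // then scale the text relevance score
      },
    },
  };
}
```

A saved weighting profile would then simply be a stored `BuyerWeights` object that is loaded and passed in with each search.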
MicrocosmWorks built a change data capture pipeline using Debezium connected to the PostgreSQL source database, streaming supplier record changes to Elasticsearch in near real-time via Kafka. This ensures search results reflect database updates within seconds rather than waiting for batch reindex cycles.
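On the consuming side, each Debezium change event carries an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) together with `before` and `after` row images. A consumer could map these to index or delete actions as sketched below; the `SupplierRow` shape, the `BulkAction` type, and the function name are hypothetical, and the event is assumed to be already unwrapped from its Kafka message.

```typescript
type DebeziumOp = "c" | "u" | "d" | "r";

// Hypothetical row shape for the suppliers table.
interface SupplierRow {
  id: number;
  name: string;
}

interface ChangeEvent {
  op: DebeziumOp;
  before: SupplierRow | null;
  after: SupplierRow | null;
}

type BulkAction =
  | { kind: "index"; id: string; doc: SupplierRow }
  | { kind: "delete"; id: string };

// Maps one change event to the search-index action it implies.
// Returns null when the event lacks the row image needed to act on it.
function toBulkAction(e: ChangeEvent): BulkAction | null {
  if (e.op === "d") {
    return e.before ? { kind: "delete", id: String(e.before.id) } : null;
  }
  // Creates, updates, and snapshot reads all (re)index the latest row state.
  return e.after ? { kind: "index", id: String(e.after.id), doc: e.after } : null;
}
```

Because indexing by `_id` overwrites the existing document, replaying the same event twice leaves the index in the same state, which matters when Kafka redelivers messages.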
MicrocosmWorks delivers Elasticsearch-powered search solutions at rates of $20-$45/hr, with a full B2B supplier search engine including custom analyzers, relevance tuning, faceted filtering, and CDC pipeline typically requiring 350-550 development hours. The Elasticsearch infrastructure itself runs cost-effectively on three-node clusters starting around $500/month on AWS.
