Programmatic Video Annotation Framework for ML and Content Creation
ML researchers and video content creators needed a flexible, code-driven video annotation tool that could generate annotated videos at scale, for uses ranging from training data preparation to educational overlays.
The Challenge
Existing video annotation tools were either GUI-bound with no programmatic API, or command-line tools with poor visualization.
- ML teams needed bounding boxes, polygons, and labels for training data at scale
- Educators needed animated overlays (arrows, spotlights, text) for instructional videos
- Traditional annotation tools could not handle keyframe interpolation or eased animation
- No desktop-native solution combined OpenCV processing with professional video output
Our Solution
We built a React/Remotion-based video annotation framework with a type-safe annotation system, keyframe interpolation, and a Tauri desktop editor.
Architecture
- Video engine: Remotion 4.0 for programmatic frame-by-frame rendering
- Frontend: React 18 + TypeScript with Vite
- Desktop app: Tauri 2 with OpenCV.js and ONNX Runtime
- Export: FFmpeg for high-quality video output
Annotation Types
- Bounding boxes - rectangular regions with labels and confidence scores
- Circles - point annotations with a configurable radius
- Polygons - outlines of complex regions for irregular shapes
- Text labels - positionable, styled text overlays
- Arrows - directional indicators for showing flow or drawing attention
- Freehand paths - custom-drawn annotations
- Spotlights - highlighted regions with a dimmed background
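The case study does not show the framework's actual type definitions, so the following is only a minimal sketch of how the seven annotation types listed above could be modeled as a TypeScript discriminated union. Every name here (`Annotation`, `kind`, `startFrame`, `dimOpacity`, `describe`, `isVisible`) is an assumption made for illustration.

```typescript
// Hypothetical type names; the framework's real API is not shown in the source.
interface Point { x: number; y: number }

interface BaseAnnotation {
  id: string;
  startFrame: number; // first frame on which the annotation is visible
  endFrame: number;   // last visible frame (inclusive)
}

interface BoundingBox extends BaseAnnotation {
  kind: "boundingBox";
  x: number; y: number; width: number; height: number;
  label: string;
  confidence?: number; // assumed to lie in the range 0..1
}
interface Circle extends BaseAnnotation { kind: "circle"; center: Point; radius: number }
interface Polygon extends BaseAnnotation { kind: "polygon"; points: Point[] }
interface TextLabel extends BaseAnnotation { kind: "textLabel"; position: Point; text: string }
interface Arrow extends BaseAnnotation { kind: "arrow"; from: Point; to: Point }
interface FreehandPath extends BaseAnnotation { kind: "freehandPath"; points: Point[] }
interface Spotlight extends BaseAnnotation {
  kind: "spotlight"; center: Point; radius: number;
  dimOpacity: number; // how dark the area outside the spotlight is drawn
}

type Annotation =
  | BoundingBox | Circle | Polygon | TextLabel | Arrow | FreehandPath | Spotlight;

// The `kind` field lets the compiler narrow the union in each branch.
function describe(a: Annotation): string {
  switch (a.kind) {
    case "boundingBox": {
      const score = a.confidence === undefined ? "" : ` (${Math.round(a.confidence * 100)}%)`;
      return `box "${a.label}"${score}`;
    }
    case "circle": return `circle r=${a.radius}`;
    case "polygon": return `polygon with ${a.points.length} vertices`;
    case "textLabel": return `text "${a.text}"`;
    case "arrow": return `arrow (${a.from.x},${a.from.y}) -> (${a.to.x},${a.to.y})`;
    case "freehandPath": return `freehand path with ${a.points.length} points`;
    case "spotlight": return `spotlight r=${a.radius}`;
    default: {
      // Compile-time exhaustiveness check: adding a new kind without a case fails here.
      const unreachable: never = a;
      return unreachable;
    }
  }
}

// True when the annotation should be drawn on the given frame.
function isVisible(a: Annotation, frame: number): boolean {
  return frame >= a.startFrame && frame <= a.endFrame;
}
```

A union keyed on a literal `kind` field is one common way to get the "type-safe" behavior the text describes: a renderer that forgets to handle a new annotation type stops compiling instead of failing at runtime.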
Animation System
- Keyframe interpolation - smooth transitions between annotation states
- Easing functions - spring, ease-in-out, bounce, and custom curves
- Scene composition - an intro, a timeline that combines annotation layers, and an outro
- Fade effects - fade in/out with configurable duration
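As a rough illustration of the keyframe interpolation and easing described above, the sketch below computes a property value at an arbitrary frame. It is deliberately standalone and does not reproduce the framework's code; `Keyframe`, `valueAtFrame`, `linear`, and `easeInOut` are hypothetical names, and the quadratic ease-in-out curve is just one common choice.

```typescript
// Hypothetical names throughout; this is a sketch, not the framework's API.
type Easing = (t: number) => number; // maps progress 0..1 to eased progress 0..1

const linear: Easing = (t) => t;
// Quadratic ease-in-out: slow start, fast middle, slow end.
const easeInOut: Easing = (t) =>
  t < 0.5 ? 2 * t * t : 1 - Math.pow(-2 * t + 2, 2) / 2;

interface Keyframe {
  frame: number;
  value: number;
  easing?: Easing; // easing applied on the segment that ENDS at this keyframe
}

function valueAtFrame(keyframes: Keyframe[], frame: number): number {
  if (keyframes.length === 0) throw new Error("at least one keyframe is required");
  const sorted = [...keyframes].sort((a, b) => a.frame - b.frame);
  const first = sorted[0];
  const last = sorted[sorted.length - 1];
  // Clamp outside the keyframed range.
  if (frame <= first.frame) return first.value;
  if (frame >= last.frame) return last.value;
  for (let i = 0; i < sorted.length - 1; i++) {
    const from = sorted[i];
    const to = sorted[i + 1];
    // The strict upper bound guarantees to.frame > from.frame, so no division by zero.
    if (frame >= from.frame && frame < to.frame) {
      const t = (frame - from.frame) / (to.frame - from.frame);
      const ease = to.easing ?? linear;
      return from.value + (to.value - from.value) * ease(t);
    }
  }
  return last.value;
}

// A fade in/out expressed as opacity keyframes: 15 frames in, hold, 15 frames out.
const fadeOpacity: Keyframe[] = [
  { frame: 0, value: 0 },
  { frame: 15, value: 1, easing: easeInOut },
  { frame: 75, value: 1 },
  { frame: 90, value: 0, easing: easeInOut },
];
```

Calling `valueAtFrame(fadeOpacity, frame)` once per rendered frame yields the opacity for that frame. In a Remotion project the same job is usually done with the library's own `interpolate()` and `Easing` helpers; the hand-rolled version is shown only so that the logic is visible.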
Key Features
- Type-safe API - comprehensive TypeScript types for every annotation primitive
- Scene system - compose complex videos from scene building blocks
- Keyframe animation - animate any annotation property over time
- Desktop editor - Tauri-based GUI with real-time preview
- Batch export - render annotated videos through FFmpeg
- OpenCV integration - computer vision processing in the desktop app
Results
Tech Stack
Frequently Asked Questions
MicrocosmWorks built this framework for teams that need to generate annotations at scale using code-driven rules rather than human clicking. It supports writing annotation pipelines as Python scripts that apply pre-trained detectors, temporal logic, and spatial rules to automatically generate training data, then exports in COCO, Pascal VOC, or YOLO formats.
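To make the export step concrete, here is a small sketch of how a single pixel-space box maps onto the three formats named above. It relies only on the widely documented box conventions: COCO stores `[x, y, width, height]` in pixels, Pascal VOC stores corner coordinates, and YOLO stores a class index followed by a normalized center and size. The answer above mentions Python scripts; TypeScript is used here to match the rest of the stack, and the function and type names are hypothetical.

```typescript
// Hypothetical helper names; only the three box conventions are standard.
interface PixelBox { x: number; y: number; width: number; height: number } // top-left origin
interface ImageSize { width: number; height: number }

// COCO: [x_min, y_min, width, height] in pixels.
function toCocoBbox(b: PixelBox): [number, number, number, number] {
  return [b.x, b.y, b.width, b.height];
}

// Pascal VOC: corner coordinates in pixels.
function toVocBox(b: PixelBox): { xmin: number; ymin: number; xmax: number; ymax: number } {
  return { xmin: b.x, ymin: b.y, xmax: b.x + b.width, ymax: b.y + b.height };
}

// YOLO: "class x_center y_center width height", all box values normalized to 0..1.
function toYoloLine(b: PixelBox, image: ImageSize, classId: number): string {
  const cx = (b.x + b.width / 2) / image.width;
  const cy = (b.y + b.height / 2) / image.height;
  const w = b.width / image.width;
  const h = b.height / image.height;
  return [classId, cx.toFixed(6), cy.toFixed(6), w.toFixed(6), h.toFixed(6)].join(" ");
}
```

Keeping one internal pixel-space representation and converting only at export time means a new output format needs one additional function rather than changes throughout the pipeline.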
Yes, MicrocosmWorks implemented a temporal annotation model that supports frame ranges, keyframe interpolation, and event-based labels with start/end timestamps. Annotators can define temporal rules like 'label as running when pose estimation detects both feet off ground for more than 3 consecutive frames' to automate action labeling.
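The example rule quoted above ("both feet off ground for more than 3 consecutive frames") can be sketched as a single pass over per-frame pose results. The input shape (`FramePose`) and the function name `labelRunning` are assumptions for illustration; the pose estimator is treated as an upstream step that has already produced the two booleans, and frames are assumed to arrive sorted by frame number.

```typescript
// Hypothetical input and output shapes; pose estimation is assumed to run upstream.
interface FramePose { frame: number; leftFootOnGround: boolean; rightFootOnGround: boolean }
interface LabeledRange { label: string; startFrame: number; endFrame: number }

// Emits a "running" range for every run of MORE THAN `threshold` consecutive
// frames in which both feet are off the ground.
function labelRunning(poses: FramePose[], threshold = 3): LabeledRange[] {
  const ranges: LabeledRange[] = [];
  let start = -1; // first frame of the current airborne run, or -1 when not in a run
  let prev = -1;  // last frame of the current airborne run

  const closeRun = (): void => {
    if (start !== -1 && prev - start + 1 > threshold) {
      ranges.push({ label: "running", startFrame: start, endFrame: prev });
    }
    start = -1;
  };

  for (const p of poses) {
    const airborne = !p.leftFootOnGround && !p.rightFootOnGround;
    const contiguous = start !== -1 && p.frame === prev + 1;
    if (airborne && contiguous) {
      prev = p.frame; // extend the current run
    } else {
      closeRun();     // a grounded frame or a gap in frame numbers ends the run
      if (airborne) {
        start = p.frame;
        prev = p.frame;
      }
    }
  }
  closeRun();
  return ranges;
}
```

The output is a list of start/end frame ranges, which corresponds to the event-based labels with start and end timestamps mentioned in the answer; converting frames to timestamps would only require dividing by the video's frame rate.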
MicrocosmWorks built a validation pipeline that computes agreement scores between programmatic annotations and a human-reviewed golden set, flagging any annotations that fall below a configurable IoU or temporal overlap threshold. The framework also supports active learning workflows that route low-confidence annotations to human reviewers.
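The IoU check described above can be illustrated with a short sketch that compares each programmatic box against its golden-set counterpart and returns the ids that fall below the threshold. The names `Box`, `ReviewPair`, and `flagLowAgreement` and the default threshold of 0.5 are assumptions; the actual validation pipeline is not shown in the source.

```typescript
// Hypothetical names; only the IoU formula itself is standard.
interface Box { x: number; y: number; width: number; height: number }

// Intersection over union of two axis-aligned boxes; 0 when they do not overlap.
function iou(a: Box, b: Box): number {
  const overlapW = Math.max(0, Math.min(a.x + a.width, b.x + b.width) - Math.max(a.x, b.x));
  const overlapH = Math.max(0, Math.min(a.y + a.height, b.y + b.height) - Math.max(a.y, b.y));
  const intersection = overlapW * overlapH;
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}

interface ReviewPair { id: string; predicted: Box; golden: Box }

// Returns the ids of annotations whose agreement with the golden set is below minIou.
function flagLowAgreement(pairs: ReviewPair[], minIou = 0.5): string[] {
  return pairs
    .filter((p) => iou(p.predicted, p.golden) < minIou)
    .map((p) => p.id);
}
```

The flagged ids are the natural input for the human-review routing mentioned in the answer; the same pattern applies to temporal overlap by replacing boxes with frame intervals.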
MicrocosmWorks built the framework on top of FFmpeg and OpenCV, supporting all major container formats including MP4, MKV, AVI, and MOV, with codecs from H.264 to ProRes. The framework processes videos at their native resolution but supports configurable downscaling for the annotation pass to accelerate throughput on large datasets.
MicrocosmWorks delivers ML infrastructure projects at rates of $25-$45/hr, with a programmatic video annotation framework including the rule engine, format exporters, and quality validation pipeline typically requiring 300-500 development hours. The framework pays for itself quickly by reducing manual annotation costs that can run $5-$15 per minute of video.