groundingLMM: GLaMM (Grounding Large Multimodal Model) is an end-to-end trained LMM that generates natural language responses integrated with object segmentation masks, enabling visual grounding and interaction with images at multiple levels of granularity. It introduces the task of Grounded Conversation Generation (GCG), supports downstream applications such as referring expression segmentation and region-level captioning, and is underpinned by the large-scale GranD dataset.; ruflo: Ruflo v3 is an enterprise AI orchestration platform for Claude-based multi-agent swarms, closely related to Claude Flow. It features an inter-agent consensus algorithm, vector database integration for persistent memory, self-learning workflows, and native Claude Code SDK integration. It is designed for deploying autonomous agent pipelines at scale with shared state management.
Interactive visual assistants that understand and respond to user queries about specific image regions.
Running parallel Claude agent swarms for large-scale document processing.