groundingLMM: GLaMM (Grounding Large Multimodal Model) is an end-to-end trained LMM that generates natural language responses integrated with object segmentation masks, enabling visual grounding and interaction with images at multiple levels of granularity. It introduces the task of Grounded Conversation Generation (GCG), supports downstream applications such as referring expression segmentation and region-level captioning, and is underpinned by the large-scale GranD dataset; AIlice: AIlice is a fully autonomous, general-purpose AI agent built on open-source LLMs that uses its IACT architecture to decompose complex tasks. It aims at self-evolution, in which AI agents autonomously build feature expansions and new types of agents.
Interactive visual assistants that understand and respond to user queries about specific image regions.
Automated software development, system administration, and script execution.