groundingLMM: GLaMM (Grounding Large Multimodal Model) is an end-to-end trained LMM that generates natural language responses integrated with object segmentation masks, enabling visual grounding and interaction with images at multiple levels of granularity. It introduces the task of Grounded Conversation Generation (GCG), supports downstream applications such as referring expression segmentation and region-level captioning, and is underpinned by the large-scale GranD dataset.; google_workspace_mcp: This production-ready MCP server enables natural language control of Google Workspace services (Calendar, Drive, Gmail, Docs, Sheets, Slides, Forms, Tasks, and Chat) through MCP clients, AI assistants, and developer tools. It is billed as the most feature-complete Google Workspace MCP server and now supports remote OAuth 2.1 multi-user operation and one-click Claude installation.
Interactive visual assistants that understand and respond to user queries about specific image regions.
Enabling natural language control for Google Workspace services via AI assistants.