Companies in the category 'AI evaluation'
These are companies that provide open-source tools for evaluating the performance, fairness, and robustness of artificial intelligence models.
AI red teaming & LLM testing platform
Giskard is an AI red teaming and testing platform that helps companies detect security vulnerabilities, hallucinations, and quality issues in LLM-based agents before deployment. Its open-source Python library and enterprise hub provide automated scanning, evaluation, and continuous testing capabilities for AI systems ranging from RAG pipelines to traditional machine learning models.
Simulation environments for AI agents
Vibrant Labs builds simulation environments and developer workflows for evaluating and training long-horizon AI agents. The company created Ragas, the leading open-source framework for evaluating LLM applications, and is extending that work into RL-ready environments where agents can safely be benchmarked, trained, and improved before deployment.
COSS Weekly Newsletter
Stay up to date with the latest news, funding rounds, and announcements from the COSS universe.
Check out COSS Weekly on the web
