
giskard-oss

๐Ÿข Open-Source Evaluation & Testing library for LLM Agents


README


Evals, Red Teaming and Test Generation for Agentic Systems

Modular, Lightweight, Dynamic and Async-first


Docs • Website • Community


Important

Giskard v3 is a fresh rewrite designed for dynamic, multi-turn testing of AI agents. This release drops heavy dependencies for better efficiency while introducing a more powerful AI vulnerability scanner and enhanced RAG evaluation capabilities. For now, the vulnerability scanner and RAG evaluation still rely on Giskard v2. Giskard v2 remains available but is no longer actively maintained. Follow progress → Read the v3 Announcement · Roadmap

Install

pip install giskard

Requires Python 3.12+.


Giskard is an open-source Python library for testing and evaluating agentic systems. The v3 architecture is a modular set of focused packages — each carrying only the dependencies it needs — built from scratch to wrap anything: an LLM, a black-box agent, or a multi-step pipeline.

Status | Package | Description
✅ Alpha | giskard-checks | Testing & evaluation — scenario API, built-in checks, LLM-as-judge
🚧 In progress | giskard-scan | Agent vulnerability scanner — red teaming, prompt injection, data leakage (successor of v2 Scan)
📋 Planned | giskard-rag | RAG evaluation & synthetic data generation (successor of v2 RAGET)

Giskard Checks — create and apply evals for testing agents

pip install giskard-checks

Giskard Checks is a lightweight library for creating evaluations (evals) that test LLM-based systems — from simple assertions to LLM-as-judge assessments. Unlike traditional unit tests, evals are designed for non-deterministic outputs, where the same input can produce different valid responses.
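To see why that distinction matters, here is a tiny plain-Python illustration (no Giskard API involved, just hypothetical answer strings) of how an exact-match assertion breaks on non-deterministic outputs while a fact-based check does not:

```python
# Two different but equally valid answers to "What is the capital of France?"
answers = ["Paris.", "The capital of France is Paris."]

# A traditional exact-match unit test accepts only one phrasing:
exact_match = [a == "Paris." for a in answers]
print(exact_match)  # [True, False] -- the second valid answer "fails"

# An eval-style check accepts any response containing the key fact:
contains_fact = ["Paris" in a for a in answers]
print(contains_fact)  # [True, True]
```

Giskard's built-in checks generalize this idea from substring matching up to semantic similarity and LLM-as-judge assessments.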

Use Giskard Checks to:

  • Catch regressions — verify your system still behaves correctly after changes
  • Validate RAG quality — check if answers are grounded in retrieved context
  • Enforce safety rules — ensure outputs conform to your content policies
  • Evaluate multi-turn agents — test full conversations, not just single exchanges

Built-in evals include string matching, comparisons, regex, semantic similarity, and LLM-as-judge checks (Groundedness, Conformity, LLMJudge).

Quickstart

from openai import OpenAI
from giskard.checks import Scenario, Groundedness

client = OpenAI()

def get_answer(inputs: str) -> str:
    response = client.chat.completions.create(
        model="gpt-5-mini",
        messages=[{"role": "user", "content": inputs}],
    )
    return response.choices[0].message.content

scenario = (
    Scenario("test_dynamic_output")
    .interact(
        inputs="What is the capital of France?",
        outputs=get_answer,
    )
    .check(
        Groundedness(
            name="answer is grounded",
            answer_key="trace.last.outputs",
            context="France is a country in Western Europe. Its capital is Paris.",
        )
    )
)

result = await scenario.run()
result.print_report()

The run() method is async: await it directly in a notebook, or wrap it with asyncio.run() in a script. See the full docs for Suites, LLMJudge, multi-turn scenarios, and more.
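The script pattern follows standard asyncio usage. A minimal sketch, with a stand-in coroutine in place of `scenario.run()` (the `run_scenario` name is illustrative, not part of the Giskard API):

```python
import asyncio

async def run_scenario() -> str:
    """Stand-in for `await scenario.run()`; replace with your own Scenario."""
    await asyncio.sleep(0)  # simulate async work (LLM calls, etc.)
    return "report"

# In a plain script there is no running event loop, so start one explicitly:
result = asyncio.run(run_scenario())
print(result)  # report
```

Calling asyncio.run() inside a notebook raises an error because a loop is already running there, which is why the two environments need different entry points.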

Looking for Giskard v2?

Giskard v2 included Scan (automatic vulnerability detection) and RAGET (RAG evaluation test set generation) for both ML models and LLM applications. These features are not yet available in v3.

pip install "giskard[llm]>2,<3"

Scan — automatically detect performance, bias & security issues

Wrap your model and run the scan:

import giskard
import pandas as pd

# Replace my_llm_chain with your actual LLM chain or model inference logic
def model_predict(df: pd.DataFrame):
    """The function takes a DataFrame and must return a list of outputs (one per row)."""
    return [my_llm_chain.run({"query": question}) for question in df["question"]]

giskard_model = giskard.Model(
    model=model_predict,
    model_type="text_generation",
    name="My LLM Application",
    description="A question answering assistant",
    feature_names=["question"],
)

scan_results = giskard.scan(giskard_model)
display(scan_results)  # `display` renders the report in Jupyter; in a script, save it with scan_results.to_html("scan_report.html")
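The contract for the wrapped function is one output per input row, in order. A minimal sketch of that shape using a deterministic stand-in (the `echo_predict` helper below is illustrative, not part of the Giskard API):

```python
import pandas as pd

def echo_predict(df: pd.DataFrame) -> list[str]:
    # Deterministic stand-in for an LLM chain: one output string per row, in order
    return [f"Answer to: {q}" for q in df["question"]]

df = pd.DataFrame({"question": ["What is the capital of France?", "Who wrote Hamlet?"]})
outputs = echo_predict(df)
print(len(outputs) == len(df))  # True: giskard.Model expects one output per row
```

Keeping the prediction function pure and row-aligned like this makes it easy to swap the stand-in for a real chain without changing the giskard.Model wrapper.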

Scan Example

RAGET — generate evaluation datasets for RAG applications

Automatically generate questions, reference answers, and context from your knowledge base:

import pandas as pd
from giskard.rag import generate_testset, KnowledgeBase

# Load your knowledge base documents
df = pd.read_csv("path/to/your/knowledge_base.csv")
knowledge_base = KnowledgeBase.from_pandas(df, columns=["column_1", "column_2"])

testset = generate_testset(
    knowledge_base,
    num_questions=60,
    language='en',
    agent_description="A customer support chatbot for company X",
)

RAGET Example

Full v2 docs

👋 Community

We welcome contributions from the AI community! Read this guide to get started, and join our thriving community on Discord.

Follow the progress and share feedback: v3 Announcement ยท Roadmap

🌟 Leave us a star: it helps the project get discovered by others and keeps us motivated to build awesome open-source tools! 🌟

โค๏ธ If you find our work useful, please consider sponsoring us on GitHub. With a monthly sponsoring, you can get a sponsor badge, display your company in this readme, and get your bug reports prioritized. We also offer one-time sponsoring if you want us to get involved in a consulting project, run a workshop, or give a talk at your company.

💚 Current sponsors

We thank the following companies, which sponsor our project with monthly donations:

Lunary

Lunary logo

Biolevate

Biolevate logo

Release History

giskard-checks/v1.0.2b1 · Urgency: High · 4/10/2026

What's Changed:

  • security(deps): Upgrade pygments to fix CVE-2026-4539 by @kevinmessiaen in https://github.com/Giskard-AI/giskard-oss/pull/2351
  • fix(checks): preserve generator and embedding model when running scenarios by @kevinmessiaen in https://github.com/Giskard-AI/giskard-oss/pull/2293
  • security: upgrade aiohttp to fix multiple CVEs by @kevinmessiaen in https://github.com/Giskard-AI/giskard-oss/pull/2356
  • Add JUnit XML export for SuiteResult by @Mapalo90 in https://github.com/Gisk…


Similar Packages

  • AI-Infra-Guard (v4.1.4): A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.
  • promptfoo (code-scan-action-0.1.5): Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and…
  • trulens (trulens-2.7.2): Evaluation and Tracking for LLM Experiments and AI Agents
  • local-rag-system (main@2026-04-21): 🤖 Build your own local Retrieval-Augmented Generation system for private, offline AI memory without ongoing costs or data privacy concerns.
  • opik (2.0.6): Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.