Changelog

Latest release updates from the Langfuse team. Check out our Roadmap to see what's next.

Experiments as a First-Class Concept

Experiments now live alongside Datasets as their own top-level feature—run them with or without datasets, compare across runs, and track progress over time.

11 Days Ago

Boolean LLM-as-a-Judge Scores

LLM-as-a-Judge evaluators can now return boolean scores for `true` / `false` decisions.

2 Weeks Ago

Updates to Dashboards

Detailed reference for how dashboards behave differently in "Fast Preview" — trace counts, histograms, filters, and more.

March 23, 2026

Categorical LLM-as-a-Judge Scores

LLM-as-a-Judge evaluators can now return categorical scores in addition to numeric ones.

March 20, 2026

Simplify Langfuse for Scale

Langfuse now delivers faster performance at scale across the product. See the overview page for rollout details, access, and migration steps.

March 10, 2026

Langfuse CLI

Use the full Langfuse feature set from the command line. Built for AI agents and power users.

February 17, 2026

Evaluate Individual Operations: Faster, More Precise LLM-as-a-Judge

Observation-level evaluations enable precise operation-specific scoring for production monitoring.

February 13, 2026

Run Experiments on Versioned Datasets

Fetch datasets at specific version timestamps and run experiments on historical dataset versions via UI, API, and SDKs for full reproducibility.

February 11, 2026

Corrected Outputs for Traces and Observations

Capture improved versions of LLM outputs directly in trace views. Build fine-tuning datasets and drive continuous improvement with domain expert feedback.

January 14, 2026

Inline Comments on Observation I/O

Anchor comments to specific text selections within trace and observation input, output, and metadata fields.

January 7, 2026

Filter Observations by Tool Calls and add Tool Calls to Dashboard Widgets

Add filtering, table columns, and dashboard widgets for analyzing tool usage in your LLM applications.

December 22, 2025

v2 Metrics and Observations API (Beta)

New high-performance v2 APIs for metrics and observations with cursor-based pagination, selective field retrieval, and optimized data architecture.
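
Cursor-based pagination means a client follows an opaque cursor from page to page instead of computing offsets. The sketch below shows the consumption pattern only; `fetch_page` is a local stand-in for the real HTTP call, and the `data` / `next_cursor` field names are illustrative, not the documented v2 response shape.

```python
def fetch_page(cursor=None, limit=2):
    """Fake backend returning pages of observation IDs."""
    items = ["obs-1", "obs-2", "obs-3", "obs-4", "obs-5"]
    start = int(cursor) if cursor else 0
    page = items[start:start + limit]
    next_cursor = str(start + limit) if start + limit < len(items) else None
    return {"data": page, "next_cursor": next_cursor}

def iter_all():
    """Follow the cursor until the server signals the last page."""
    cursor = None
    while True:
        resp = fetch_page(cursor)
        yield from resp["data"]
        cursor = resp["next_cursor"]
        if cursor is None:
            break

all_ids = list(iter_all())
print(all_ids)  # all five observations, in order
```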

December 17, 2025

Dataset Item Versioning

Track dataset changes over time with automatic versioning on every addition, update, or deletion of dataset items.

December 15, 2025

OpenAI GPT-5.2 support

Langfuse now supports OpenAI GPT-5.2 with day 1 support across all major features.

December 12, 2025

Batch Add Observations to Datasets

Select multiple observations from the observations table and add them to a new or existing dataset with flexible field mapping.

December 11, 2025

Pricing Tiers for Accurate Model Cost Tracking

Langfuse now supports pricing tiers for models with context-dependent pricing, enabling accurate cost calculation for these models.
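
Tiered pricing means the per-token rate depends on the size of the request, e.g. long-context calls billed at a higher rate. The tier boundaries and prices below are made up for illustration; real rates come from your model definitions in Langfuse.

```python
# Hypothetical tiers: (max input tokens covered by this tier, USD per input token)
TIERS = [
    (128_000, 0.0000025),
    (1_000_000, 0.000005),  # long-context requests cost more per token
]

def input_cost(input_tokens: int) -> float:
    """Pick the first tier whose threshold covers the request size."""
    for max_tokens, price in TIERS:
        if input_tokens <= max_tokens:
            return input_tokens * price
    raise ValueError("request exceeds largest pricing tier")

print(input_cost(100_000))  # billed at the base tier
print(input_cost(200_000))  # billed at the long-context tier
```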

December 2, 2025

Hosted MCP Server for Langfuse Prompt Management

Langfuse now includes a native Model Context Protocol (MCP) server with write capabilities, enabling AI agents to fetch and update prompts directly.

November 20, 2025

OpenAI GPT-5.1 support

Langfuse now supports OpenAI GPT-5.1 with day 1 support for the LLM playground, LLM-as-a-judge evaluations, and comprehensive cost tracking.

November 14, 2025

Launch Week 4 🚀

Organize Your Datasets in Folders

Use slashes in dataset names to create folders for better organization.

November 8, 2025

Launch Week 4 🚀

JSON Schema Enforcement for Dataset Items

Define JSON schemas for your dataset inputs and expected outputs to ensure data quality and consistency across your test datasets.
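
Schema enforcement rejects items whose input or expected output violates the declared schema before they enter the dataset. The snippet below is a minimal stand-in check to illustrate the idea, not Langfuse's validator; the schema and item shapes are hypothetical.

```python
input_schema = {
    "type": "object",
    "required": ["question"],
    "properties": {"question": {"type": "string"}},
}

def conforms(item, schema) -> bool:
    """Check 'required' keys and string-typed properties only."""
    if schema.get("type") == "object" and not isinstance(item, dict):
        return False
    for key in schema.get("required", []):
        if key not in item:
            return False
    for key, spec in schema.get("properties", {}).items():
        if key in item and spec.get("type") == "string" and not isinstance(item[key], str):
            return False
    return True

print(conforms({"question": "What is Langfuse?"}, input_schema))  # True
print(conforms({"q": "typo'd key"}, input_schema))                # False
```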

November 8, 2025

Launch Week 4 🚀

Score Analytics with Multi-Score Comparison

Validate evaluation reliability and uncover insights with comprehensive score analysis. Compare different evaluation methods, track trends over time, and measure agreement between human annotators and LLM judges.

November 7, 2025

Annotation Support in Experiment Compare View

Add human annotations while reviewing experiment results side-by-side. Review experiment outputs, assign scores, and leave comments while viewing full experiment context.

November 6, 2025

Launch Week 4 🚀

Baseline Support in Experiment Compare View

Compare experiment runs side-by-side with baseline designation to systematically identify regressions and improvements.

November 6, 2025

Launch Week 4 🚀

Filters in Compare View

Filter experiment results in the compare view to focus on specific subsets, such as items where evaluator scores dropped below a threshold.

November 6, 2025

Launch Week 4 🚀

Langfuse for Agents

Trace agents with beautifully rendered tool calls and understand their performance through Agent Evals.

November 5, 2025

Launch Week 4 🚀

Amazon Bedrock AgentCore Integration

Trace AI agents built with Amazon Bedrock AgentCore via OpenTelemetry and Langfuse.

November 4, 2025

Launch Week 4 🚀

@Mentions and Reactions in Comments

Tag teammates with @mentions to notify them instantly, and add emoji reactions to comments for quick acknowledgments.

November 4, 2025

Launch Week 4 🚀

IdP-Initiated SSO Support

Langfuse now supports IdP-initiated SSO, allowing users to start authentication directly from their identity provider (e.g., Okta, Azure AD, Keycloak, JumpCloud).

November 4, 2025

Launch Week 4 🚀

Mixpanel integration

We teamed up with Mixpanel to integrate LLM-related product metrics into your existing Mixpanel dashboards.

November 4, 2025

Launch Week 4 🚀

Advanced Filtering for Public Traces and Observations API

The traces endpoint in the public API now supports complex JSON-based filtering.

November 3, 2025

Launch Week 4 🚀

Filter Sidebar for Tables

Quickly filter tables by column values in the filter sidebar with one click.

November 3, 2025

Langchain v1 Support

Langfuse SDKs now support Langchain v1 for both Python and JS/TS. The integration remains stable and backward compatible.

October 26, 2025

LLM-as-a-Judge Execution Tracing & Enhanced Observability

Every LLM-as-a-Judge evaluator execution now creates a trace, allowing you to inspect the exact prompts, responses, and token usage for each evaluation.

October 16, 2025

Spend Alerts

Monitor your organization's cloud spending and receive notifications when costs exceed predefined monetary thresholds.

October 10, 2025

Natural Language Filtering for Traces

Filter your traces using plain English queries. Powered by Amazon Bedrock with zero data retention.

September 30, 2025

Structured Output Support for Prompt Experiments

Enforce JSON schema response formats in prompt experiments to ensure consistent, parseable outputs for evaluation and analysis.
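
Enforcing a response format means every model output in an experiment run parses into the same shape. The format below follows the OpenAI-style `json_schema` convention as an illustration; the schema fields (`score`, `reason`) are hypothetical, not a Langfuse-prescribed shape.

```python
import json

# Hypothetical structured-output format for a grading prompt
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "grade",
        "schema": {
            "type": "object",
            "properties": {
                "score": {"type": "number"},
                "reason": {"type": "string"},
            },
            "required": ["score", "reason"],
        },
    },
}

# A conforming model output parses cleanly into the expected keys:
output = json.loads('{"score": 0.9, "reason": "grounded answer"}')
print(sorted(output))
```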

September 30, 2025

Mutable Score Configs

Score configurations can now be updated after creation. Modify existing configs via API/SDK and UI while keeping all your data safe.

September 29, 2025

Experiment Runner SDK

New high-level SDK abstraction for running experiments on datasets with automatic tracing, concurrent execution, and flexible evaluation.

September 17, 2025

TypeScript SDK v4 (GA)

The new OpenTelemetry-based TypeScript SDK v4 is now generally available with improved DX, modular packages, and seamless integrations.

August 28, 2025

Additional Observation Types for More Meaningful Span Context

New observation types including Agent, Tool, Chain, Retriever, Evaluator, Embedding, and Guardrail provide semantic meaning to your traces.

August 27, 2025

New End-to-End Walkthrough Videos

We've released new comprehensive walkthrough videos covering observability, prompt management, and evaluation to help you get up to speed quickly with Langfuse.

August 26, 2025

Full-Text Search Across Dataset Items

Find dataset items by searching through their actual content with our new full-text search capability.

August 25, 2025

Additional provider options in LLM calls in playground and evals

Set additional provider options in your LLM calls in the Playground and in LLM-as-a-Judge evaluations.

August 14, 2025

Docs now available as Markdown (.md) endpoints

Append .md to any docs URL to fetch the page as Markdown. Built at compile time for fast, reliable access.
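
For example (the docs path below is illustrative; substitute any docs page):

```shell
# Any docs page has a Markdown twin at the same path plus ".md".
url="https://langfuse.com/docs/prompt-management"
md_url="${url}.md"
echo "$md_url"
# Fetch the raw Markdown (network call):
# curl -s "$md_url"
```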

August 7, 2025

OpenAI GPT-5 pricing now available in Langfuse

Day 1 support for OpenAI GPT-5 including tracking token counts and USD spend.

August 7, 2025

Annotation Queue Assignments

Assign users to annotation queues to make it easier for team members to focus on relevant tasks.

August 6, 2025

Slack Integration for Prompt Webhooks

Receive prompt change notifications directly in your Slack channels with our native integration.

July 30, 2025

LLM Playground with Side-by-Side Comparison

The LLM Playground now supports side-by-side prompt comparison with parallel LLM execution.

July 28, 2025

Sessions in Annotation Queues

Annotation Queues now support session-level annotation, making it easier to evaluate multi-turn interactions in your LLM applications.

July 28, 2025

LiveKit Agents Tracing Integration

Trace real-time voice AI agents and multimodal conversations built with LiveKit Agents via OpenTelemetry and Langfuse.

July 25, 2025