Learning LangSmith

张开发

• 2026/5/4 7:37:33 • 15 min read


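1. Overview

LangSmith is LangChain's official platform for the full development lifecycle of LLM applications. It gives applications built on LangChain/LangGraph end-to-end observability, debugging, testing, and evaluation. It targets three core pain points of LLM application development: black-box behavior that is hard to debug, performance that is hard to monitor, and output quality that is hard to evaluate.

In short, it monitors agents and evaluates how well they perform, mainly for the LangChain and LangGraph family, and it can also run batch automated tests against an agent.

When an agent run goes wrong, LangSmith lets you inspect:

1. The input and output of every node
2. The full prompt and response of each LLM call
3. The arguments and results of tool calls
4. The details of each state transition

2. Usage

(1) Get an API key

Website: https://smith.langchain.com

(2) Set environment variables

    LANGSMITH_API_KEY=your-api-key-here
    LANGCHAIN_TRACING_V2=true
    LANGCHAIN_PROJECT=my_langsmith_demo

[LANGSMITH_API_KEY] Your LangSmith key, used for authentication.
[LANGCHAIN_TRACING_V2] Whether tracing is enabled. When it is on, the agent's execution steps are recorded.
[LANGCHAIN_PROJECT] The project name under which your runs are grouped in LangSmith.

(3) Tracing configuration in code

The tags field is used for classifying and filtering runs in LangSmith.
The metadata field holds metadata about the run and is shown on the web page.
The run_name field is the title shown in the LangSmith run list.

    trace_config = {}
    if LANGSMITH_ENABLED:
        # [Key] Tag this run and attach metadata so it is easy to filter and inspect in the web UI
        trace_config = RunnableConfig(
            tags=["simple-demo", "learn-langsmith"],
            metadata={
                "user_question": "How is the market today",
                "run_time": "2026-04-03",
                "model": "qwen-turbo",
                "scene": "stock market query",
                "test_user": "test-001",
                "anything": "free-form text",
            },
            run_name="LangSmith learning demo run",
        )

    if trace_config:
        # [Most important] Passing config to invoke reports every node and LLM call to LangSmith
        result = agent.invoke(init_state, config=trace_config)
    else:
        result = agent.invoke(init_state)

(4) Run the code and view the trace

1. Run the Python code, in my case: python langsmith_demo.py
2. Open the LangSmith website to view the run.

(5) Batch automated testing

This amounts to running batch automated tests against the agent we wrote, with the results synced to LangSmith.

[Core steps]

1. Define several test cases.
2. Create a dataset in LangSmith and load all the test questions into it.
3. Take the questions out of the dataset one by one.
4. Call your LangGraph agent and run the full node flow.
5. Attach the tracing config so that all logs are reported to LangSmith.
6. Compare the agent's actual output with the expected outputs.
7. Compute the scores automatically.

[Automated test example code]

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    """
    Minimal demo: automated agent testing (full version).
    Features: inputs + expected outputs + automatic evaluation + reporting to LangSmith.
    Fixed: "'dict' object has no attribute 'inputs'" error.
    """
    import os
    from datetime import datetime

    from langsmith import Client, RunEvaluator
    from langsmith.evaluation import EvaluationResult, evaluate
    from langsmith.schemas import Example, Run

    # Import the minimal agent
    from langsmith_demo import build_simple_graph, SimpleState

    # Initialize LangSmith
    LANGSMITH_ENABLED = os.getenv("LANGCHAIN_TRACING_V2", "").lower() == "true"
    client = Client()


    # Helpers: safely read inputs and outputs (the core of the fix)
    def _get_example_inputs(example):
        try:
            if hasattr(example, "inputs"):
                return example.inputs if isinstance(example.inputs, dict) else {}
            if isinstance(example, dict):
                inner = example.get("inputs")
                return inner if isinstance(inner, dict) else example
            if hasattr(example, "__dict__"):
                return example.__dict__.get("inputs", {})
        except Exception:
            pass
        return {}


    def _get_example_outputs(example):
        try:
            if hasattr(example, "outputs"):
                return example.outputs if isinstance(example.outputs, dict) else {}
            if isinstance(example, dict):
                return example.get("outputs", {})
            if hasattr(example, "__dict__"):
                return example.__dict__.get("outputs", {})
        except Exception:
            pass
        return {}


    # Test cases: inputs + expected outputs
    TEST_CASES = [
        {
            "inputs": {"user_input": "How is the market today"},
            "outputs": {"should_contain": ["market", "quotes", "trend"], "not_empty": True},
        },
        {
            "inputs": {"user_input": "Are funds worth buying lately"},
            "outputs": {"should_contain": ["fund", "invest", "advice"], "not_empty": True},
        },
        {
            "inputs": {"user_input": "How is the gold price trending"},
            "outputs": {"should_contain": ["gold", "price", "trend"], "not_empty": True},
        },
    ]


    # Test target. evaluate() passes each example's inputs dict to this function,
    # not an Example object, which is what caused the original AttributeError.
    def agent_executor(example: dict) -> dict:
        agent = build_simple_graph()

        # Safely read the input
        inputs = _get_example_inputs(example)
        user_input = inputs.get("user_input", "")

        # Initial state
        init_state: SimpleState = {
            "user_input": user_input,
            "step1_result": "",
            "step2_result": "",
            "final_answer": "",
        }

        # Tracing config
        trace_config = {
            "tags": ["simple-agent", "test"],
            "metadata": {
                "customer_id": "TEST-USER",
                "user_query": user_input,
                "test_time": datetime.now().isoformat(),
            },
            "run_name": f"test-{user_input[:8]}-{datetime.now().strftime('%H%M%S')}",
        }

        # Run the agent
        result = agent.invoke(init_state, config=trace_config)
        return {"final_answer": result.get("final_answer", "")}


    # Evaluators
    class KeywordEvaluator(RunEvaluator):
        def evaluate_run(self, run: Run, example: Example, **kwargs):
            expected = _get_example_outputs(example)
            # run.outputs can be None if the target failed
            answer = (run.outputs or {}).get("final_answer", "").lower()
            keywords = expected.get("should_contain", [])
            found = [k for k in keywords if k.lower() in answer]
            score = len(found) / len(keywords) if keywords else 1.0
            return EvaluationResult(
                key="keyword_match",
                score=score,
                comment=f"matched {len(found)}/{len(keywords)}",
            )


    class NotEmptyEvaluator(RunEvaluator):
        def evaluate_run(self, run: Run, example: Example, **kwargs):
            answer = (run.outputs or {}).get("final_answer", "")
            score = 1.0 if answer.strip() else 0.0
            return EvaluationResult(
                key="not_empty",
                score=score,
                comment="answer is valid" if score else "answer is empty",
            )


    # Main program
    if __name__ == "__main__":
        dataset_name = "simple-agent-test-dataset"

        # 1. Create the dataset and 2. add the test cases (inputs + expected outputs).
        # Only done when the dataset does not exist yet, so reruns do not add duplicate examples.
        if not client.has_dataset(dataset_name=dataset_name):
            client.create_dataset(dataset_name)
            for test in TEST_CASES:
                client.create_example(
                    inputs=test["inputs"],
                    outputs=test["outputs"],
                    dataset_name=dataset_name,
                )

        # 3. Run the evaluation
        print("Starting automated tests...")
        evaluate(
            agent_executor,
            data=dataset_name,
            evaluators=[KeywordEvaluator(), NotEmptyEvaluator()],
            experiment_prefix="simple-agent-test",
            max_concurrency=1,
        )
        print("Tests finished and reported to LangSmith")

3. LangSmith demo code

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    """
    LangSmith minimal learning version.
    Only demonstrates tracing a LangGraph flow; all complex investment-advisory
    logic, extra prompts, and structs have been removed.
    """
    import os
    from typing import TypedDict

    from langchain_community.llms import Tongyi
    from langchain_core.runnables import RunnableConfig
    from langgraph.graph import StateGraph, END

    # LangSmith is used for agent debugging, tracing, and monitoring.
    # Set the following environment variables before use:
    # 1. LANGSMITH_API_KEY: obtained from https://smith.langchain.com
    # 2. LANGCHAIN_TRACING_V2: set to true to enable tracing
    # 3. LANGCHAIN_PROJECT: project name (optional), used to organize traces
    # 4. LANGCHAIN_ENDPOINT: LangSmith API endpoint (optional), defaults to the official endpoint

    # Check whether LangSmith is enabled
    LANGSMITH_ENABLED = os.getenv("LANGCHAIN_TRACING_V2", "").lower() == "true"
    print("LANGSMITH_ENABLED", LANGSMITH_ENABLED)

    # 1. Simple LLM: reuse your existing Tongyi Qianwen (Qwen) setup
    llm = Tongyi(
        model_name="qwen-turbo-latest",
        dashscope_api_key=os.getenv("DASHSCOPE_API_KEY"),
    )


    # 2. Minimal state definition
    class SimpleState(TypedDict):
        user_input: str
        step1_result: str
        step2_result: str
        final_answer: str


    # 3. Minimal nodes that simulate a multi-step agent flow
    def node_step1(state: SimpleState) -> SimpleState:
        """Step 1: briefly interpret the question."""
        res = llm.invoke(f"Briefly interpret this sentence: {state['user_input']}")
        return {**state, "step1_result": res}


    def node_step2(state: SimpleState) -> SimpleState:
        """Step 2: complete the answer."""
        res = llm.invoke(f"Based on this interpretation, complete the answer: {state['step1_result']}")
        return {**state, "step2_result": res}


    def node_end(state: SimpleState) -> SimpleState:
        """Wrap up and output."""
        return {**state, "final_answer": state["step2_result"]}


    # 4. Build the minimal LangGraph flow
    def build_simple_graph():
        graph = StateGraph(SimpleState)
        graph.add_node("step1", node_step1)
        graph.add_node("step2", node_step2)
        graph.add_node("end", node_end)
        graph.set_entry_point("step1")
        graph.add_edge("step1", "step2")
        graph.add_edge("step2", "end")
        graph.add_edge("end", END)
        return graph.compile()


    # 5. Core: LangSmith tracing configuration (the part to study)
    if __name__ == "__main__":
        agent = build_simple_graph()

        # Initial input
        init_state = {
            "user_input": "How is the market doing today",
            "step1_result": "",
            "step2_result": "",
            "final_answer": "",
        }

        # User id
        customer_id = "CUST_001"

        trace_config = {}
        if LANGSMITH_ENABLED:
            # [Key] Tag this run and attach metadata so it is easy to filter and inspect in the web UI
            trace_config = RunnableConfig(
                tags=["simple-demo", "learn-langsmith"],
                metadata={
                    "user_question": "How is the market today",
                    "run_time": "2026-04-03",
                    "model": "qwen-turbo",
                    "scene": "stock market query",
                    "test_user": "test-001",
                    "anything": "free-form text",
                    "customer_id": customer_id,
                },
                run_name="LangSmith learning demo run",
            )

        if trace_config:
            # [Most important] Passing config to invoke reports every node and LLM call to LangSmith
            result = agent.invoke(init_state, config=trace_config)
        else:
            result = agent.invoke(init_state)

        print("Run finished")
        print("Final answer:", result["final_answer"])
        # Traces land in the project named by LANGCHAIN_PROJECT ("default" if unset)
        project = os.getenv("LANGCHAIN_PROJECT", "default")
        print(f"\nOpen https://smith.langchain.com and go to project {project} "
              "to see the full flow graph and the LLM logs for each step")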
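The scoring rules used in the automated test (the fraction of expected keywords found in the answer, plus a non-empty check) can be tried offline before wiring them into LangSmith. The following is a minimal sketch that needs neither LangSmith nor an LLM. The function names `keyword_score` and `not_empty_score` and the sample answer are hypothetical and do not come from the original scripts.

```python
from typing import List


def keyword_score(answer: str, keywords: List[str]) -> float:
    """Fraction of expected keywords found in the answer (case-insensitive).

    Mirrors the keyword-match rule from the automated test; returns 1.0 when
    no keywords are expected, so an empty expectation never fails a case.
    """
    if not keywords:
        return 1.0
    text = answer.lower()
    found = [k for k in keywords if k.lower() in text]
    return len(found) / len(keywords)


def not_empty_score(answer: str) -> float:
    """1.0 if the answer has non-whitespace content, otherwise 0.0."""
    return 1.0 if answer.strip() else 0.0


if __name__ == "__main__":
    # Hypothetical agent answer, used only to exercise the scoring functions
    sample = "The market index rose today and the trend looks stable."
    print(keyword_score(sample, ["market", "trend", "quotes"]))
    print(not_empty_score(sample))
```

Checking the scoring logic in isolation like this makes it easier to tell whether a low score in LangSmith comes from the agent's answer or from the evaluator itself.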
