Changelog: Screenplay AI Parsing Format Compatibility Fix

Date: 2026-02-09
Type: Bug Fix
Scope: Screenplay parsing
Severity: High

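Problem Description

Symptoms

When the API endpoint /api/v1/screenplays/{id}/parse is called to parse a screenplay, the AI often returns a format that does not match the expected one:

It uses scenes instead of locations
Scene objects use a location field instead of name
Storyboards are nested under scenes[].shots instead of the top-level storyboards

The manual test script tests/manual/test_screenplay_full_parsing.py, however, returns the correct format.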

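Root Cause

AI Skill loading failed
- The Celery task tried to load the screenplay_parsing skill from the database and failed
- It fell back to an older hard-coded prompt whose format requirements were inconsistent
AI model behavior is inconsistent
- Different models (GPT-4 vs Gemini) follow the prompt to different degrees
- Gemini tends to return the scenes format (closer to natural-language usage)
Format conversion logic already existed but was not fully utilized
- ScreenplayService._transform_ai_tags_format() already contains conversion logic
- But because the AI Skill failed to load, the wrong prompt was used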

Solution

Fix 1: Improve the AI Skill loading logic

File: server/app/tasks/ai_tasks.py

Changes:

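import logging
import os

logger = logging.getLogger(__name__)

# ✅ Load from the file system first (ensures the latest version is used)
skill_file_path = "app/resources/ai_skills/screenplay_parsing.md"
if os.path.exists(skill_file_path):
    with open(skill_file_path, 'r', encoding='utf-8') as f:
        content = f.read()

    # Extract the prompt-template section between the two headings.
    # The markers are literal headings in the skill file and must match it exactly.
    start_marker = "### 系统角色"  # "System role" heading
    end_marker = "## 使用示例"     # "Usage examples" heading

    start_idx = content.find(start_marker)
    end_idx = content.find(end_marker)

    if start_idx != -1 and end_idx != -1:
        system_prompt = content[start_idx:end_idx].strip()
        logger.info("✅ Loaded AI skill from the file system: screenplay_parsing v1.2.0")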

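Benefits:

Ensures the latest AI Skill file (v1.2.0) is used
Avoids the fallback triggered by a database loading failure
The prompt's format requirements are stricter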
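
The loading order described in Fix 1 (skill file first, hard-coded prompt as the last resort) can be sketched as a small helper. This is a minimal sketch, not the project's actual code: the names `extract_prompt` and `load_skill_prompt` and the `fallback_prompt` parameter are hypothetical, and only the two section markers come from the snippet in this changelog.

```python
import os
from typing import Optional

# Literal headings in the skill file; they must match the file exactly.
START_MARKER = "### 系统角色"  # "System role" heading
END_MARKER = "## 使用示例"     # "Usage examples" heading


def extract_prompt(content: str) -> Optional[str]:
    """Return the text from START_MARKER up to END_MARKER, or None if unusable."""
    start_idx = content.find(START_MARKER)
    end_idx = content.find(END_MARKER)
    # Reject missing markers and markers that appear in the wrong order.
    if start_idx == -1 or end_idx == -1 or end_idx <= start_idx:
        return None
    return content[start_idx:end_idx].strip()


def load_skill_prompt(skill_file_path: str, fallback_prompt: str) -> str:
    """Prefer the skill file; use the fallback prompt when the file is unusable."""
    if os.path.exists(skill_file_path):
        with open(skill_file_path, "r", encoding="utf-8") as f:
            prompt = extract_prompt(f.read())
        if prompt:
            return prompt
    return fallback_prompt
```

Unlike the snippet in this changelog, the sketch also rejects a file whose end marker precedes the start marker, since slicing would then silently yield an empty prompt.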

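Fix 2: Update the hard-coded fallback prompt

File: server/app/tasks/ai_tasks.py

Changes:

Updated the hard-coded prompt to match the format in the AI Skill file
Tightened the format requirements and explicitly forbade scenes, nested structures, and similar formats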

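Key requirements:

**Strict requirement**: The response must contain all 7 of the following top-level keys, with none omitted:
1. **characters** - top-level array of characters
2. **character_tags** - dictionary of character tags
3. **locations** - top-level array of locations (using "scenes" is forbidden)
4. **location_tags** - dictionary of location tags
5. **props** - top-level array of props
6. **prop_tags** - dictionary of prop tags
7. **storyboards** - top-level array of storyboards

**Forbidden formats**:
- Using "scenes" instead of "locations"
- Characters nested inside scenes
- Shots nested inside scenes (using "shots")
- Scene objects using a "location" field instead of "name"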
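
The required keys and forbidden formats listed above can also be checked mechanically. The following is an illustrative sketch only: `find_format_violations` is a hypothetical helper, not part of the codebase, and the violation messages are invented for the example.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = [
    "characters", "character_tags", "locations", "location_tags",
    "props", "prop_tags", "storyboards",
]


def find_format_violations(data: Dict[str, Any]) -> List[str]:
    """Return a de-duplicated list of rule violations found in an AI response."""
    violations: List[str] = []

    def report(message: str) -> None:
        if message not in violations:
            violations.append(message)

    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        report(f"missing top-level keys: {missing}")
    if "scenes" in data:
        report('uses "scenes" instead of "locations"')

    # Scene-like objects may appear under either key.
    scene_like = list(data.get("scenes", [])) + list(data.get("locations", []))
    for scene in scene_like:
        if not isinstance(scene, dict):
            continue
        if "characters" in scene:
            report("characters nested inside a scene")
        if "shots" in scene:
            report('shots nested inside a scene ("shots")')
        if "location" in scene and "name" not in scene:
            report('scene object uses "location" instead of "name"')
    return violations
```

An empty list means the response already satisfies every rule listed above; a non-empty list can be logged as-is.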

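Fix 3: Strengthen the format conversion logic (already in place, no changes needed)

File: server/app/services/screenplay_service.py

Existing functionality:

The _transform_ai_tags_format() method already supports three format conversions
Automatically converts scenes to locations
Automatically extracts scenes[].characters into the top-level characters array
Automatically extracts scenes[].shots into the top-level storyboards array

Conversion flow: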

# Simplified excerpt: parsed_data is the AI response, result is the dict being built
scenes = parsed_data.get('scenes', [])

# Step 1: convert scenes to locations
if 'scenes' in parsed_data:
    locations = []
    for scene in scenes:
        location_name = scene.get('location') or scene.get('name')
        location_obj = {
            'name': location_name,  # ✅ always use the name field
            'location': location_name,
            'description': scene.get('description', ''),
            'meta_data': {...}  # contents elided in this excerpt
        }
        locations.append(location_obj)
    result['locations'] = locations

# Step 1.5: extract characters from scenes[].characters
if not result.get('characters'):
    unique_characters = set()
    for scene in scenes:
        for char_name in scene.get('characters', []):
            unique_characters.add(char_name)
    result['characters'] = [
        {'name': name}  # other fields elided in this excerpt
        for name in unique_characters
    ]

# Step 1.6: extract storyboards from scenes[].shots
if not result.get('storyboards'):
    storyboards = []
    shot_counter = 0
    for scene in scenes:
        scene_location = scene.get('location') or scene.get('name')
        for shot in scene.get('shots', []):
            shot_counter += 1  # shots are numbered across all scenes
            storyboard = {
                'shot_number': shot_counter,
                'title': shot.get('title'),
                'locations': [scene_location],
                # ... other fields elided in this excerpt
            }
            storyboards.append(storyboard)
    result['storyboards'] = storyboards
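
To make the three steps concrete, here is a self-contained sketch of the same conversion applied to a small sample. It is not the real `_transform_ai_tags_format()` method: the function name `normalize_scenes_format` is hypothetical, only the fields shown in the excerpt are carried over, and characters are collected in a list (rather than a set) so the output order is deterministic.

```python
from typing import Any, Dict, List


def normalize_scenes_format(parsed_data: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a nested "scenes" response into top-level lists."""
    result = {k: v for k, v in parsed_data.items() if k != "scenes"}
    scenes = parsed_data.get("scenes", [])
    if not scenes:
        return result

    # Step 1: scenes -> locations, always exposing the label under "name".
    result["locations"] = [
        {
            "name": scene.get("location") or scene.get("name"),
            "description": scene.get("description", ""),
        }
        for scene in scenes
    ]

    # Step 1.5: collect unique character names in first-seen order.
    if not result.get("characters"):
        names: List[str] = []
        for scene in scenes:
            for char_name in scene.get("characters", []):
                if char_name not in names:
                    names.append(char_name)
        result["characters"] = [{"name": name} for name in names]

    # Step 1.6: flatten scenes[].shots into storyboards numbered across scenes.
    if not result.get("storyboards"):
        storyboards: List[Dict[str, Any]] = []
        for scene in scenes:
            scene_location = scene.get("location") or scene.get("name")
            for shot in scene.get("shots", []):
                storyboards.append({
                    "shot_number": len(storyboards) + 1,
                    "title": shot.get("title"),
                    "locations": [scene_location],
                })
        result["storyboards"] = storyboards
    return result
```

Given two scenes with three shots in total, the sketch yields two locations and three storyboards numbered 1 to 3, each pointing back to the location of the scene it came from.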

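Test Verification

Test scenario 1: API call (before the fix)

# Call the API
POST /api/v1/screenplays/{id}/parse

# Format returned by the AI (incorrect)
{
  "scenes": [
    {
      "location": "Seaside",  # ❌ should be name
      "shots": [...]          # ❌ should be in the top-level storyboards
    }
  ]
}

# Result: format conversion failed and the data was stored incorrectly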

Test scenario 2: API call (after the fix)

# Call the API
POST /api/v1/screenplays/{id}/parse

# Format returned by the AI (may still be scenes)
{
  "scenes": [
    {
      "location": "Seaside",
      "shots": [...]
    }
  ]
}

# ✅ Format conversion succeeded
{
  "characters": [...],
  "locations": [{"name": "Seaside", ...}],
  "storyboards": [...]
}

# Result: data is stored correctly in the database

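Test scenario 3: Manual test script (always correct)

# Run the manual test
docker exec jointo-server-app python tests/manual/test_screenplay_full_parsing.py

# Format returned by the AI (correct)
{
  "characters": [...],
  "locations": [...],
  "storyboards": [...]
}

# Result: the format is correct and no conversion is needed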

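Impact Scope

Affected functionality

✅ Screenplay AI parsing API (POST /api/v1/screenplays/{id}/parse)
✅ Celery async task (parse_screenplay_task)
✅ Screenplay element storage (ScreenplayService.store_parsed_elements)

Unaffected functionality

✅ Manual test script (always uses the correct prompt)
✅ Other AI generation features (images, video, voice-over, etc.)
✅ Screenplay CRUD operations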

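Follow-up Recommendations

1. Unify AI Skill management

Problem: AI Skills currently live in both the file system and the database, so the two copies can easily drift out of sync

Recommendations:

Use the file system as the Single Source of Truth
Treat the database only as a cache that is synced from the file system periodically
Or remove database storage entirely and load directly from the file system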
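
One way the "database as cache" option could work is to compare a content hash of the skill file against the cached copy and refresh the cache only when they differ. The sketch below is an assumption-laden illustration: `sync_skill_to_cache` is a hypothetical name, and a plain dict stands in for the database table.

```python
import hashlib
from typing import Dict


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the skill file content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_skill_to_cache(
    name: str,
    file_text: str,
    cache: Dict[str, Dict[str, str]],  # stand-in for the database table
) -> bool:
    """Refresh the cached copy if the file changed. Returns True when updated."""
    digest = content_hash(file_text)
    cached = cache.get(name)
    if cached is not None and cached["hash"] == digest:
        return False  # cache already matches the file
    cache[name] = {"hash": digest, "content": file_text}
    return True
```

Because the file always wins, a stale or missing cache entry can never override the file content; at worst it costs one extra write.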

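2. Improve AI model adherence to the prompt

Problem: Different AI models follow the prompt to different degrees

Recommendations:

Use JSON Schema to constrain the AI output format
Use Function Calling / Structured Output (supported by GPT-4)
Add output format validation with an automatic retry mechanism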
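
The third recommendation (validation plus automatic retry) can be sketched without any model-specific API. In this illustration `call_model` is a hypothetical callable that takes a prompt and returns the raw model text, and `parse_with_retry` and the default of three attempts are assumptions made for the example.

```python
import json
from typing import Any, Callable, Dict

REQUIRED_KEYS = (
    "characters", "character_tags", "locations", "location_tags",
    "props", "prop_tags", "storyboards",
)


def parse_with_retry(
    call_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Call the model until its output is a JSON object with all required keys."""
    last_error = "no attempts made"
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON: {exc}"
            continue
        if not isinstance(data, dict):
            last_error = "top-level value is not an object"
            continue
        missing = [k for k in REQUIRED_KEYS if k not in data]
        if not missing:
            return data
        last_error = f"missing keys: {missing}"
    raise ValueError(
        f"model output invalid after {max_attempts} attempts: {last_error}"
    )
```

A retry costs an extra model call, so in practice this would probably run only after the existing format conversion has failed to repair the response.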

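3. Add format validation middleware

Problem: When the AI returns a malformed response, the error message is not specific enough

Recommendation: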

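import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)

# ValidationError is assumed to be the project's own exception class;
# its import path is not shown in the source.


def validate_screenplay_parse_result(data: Dict[str, Any]) -> None:
    """Validate the format of the AI parsing result."""
    required_keys = ['characters', 'character_tags', 'locations',
                     'location_tags', 'props', 'prop_tags', 'storyboards']

    # Check for forbidden formats first, so the warning is logged even
    # when the required-key check below raises
    if 'scenes' in data:
        logger.warning("AI returned the 'scenes' format; it will be converted to 'locations' automatically")

    missing_keys = [k for k in required_keys if k not in data]
    if missing_keys:
        raise ValidationError(f"AI response is missing required fields: {missing_keys}")

    # Check the format of scene objects
    for loc in data.get('locations', []):
        if 'location' in loc and 'name' not in loc:
            logger.warning("Scene object uses the 'location' field; 'name' is recommended")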

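4. Monitoring and alerting

Recommendations:

Record statistics on the formats the AI returns (correct format vs. format that needed conversion)
Send an alert when the conversion rate exceeds a threshold
Review the effectiveness of the AI Skill prompt regularly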
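
The conversion-rate alert could be tracked with a small in-process counter, as in the sketch below. Everything here is hypothetical: the `FormatStats` class, the 20% threshold, and the minimum sample size of 20 are illustrative values, not figures from this changelog.

```python
from dataclasses import dataclass


@dataclass
class FormatStats:
    """Counts responses that were already correct vs. those needing conversion."""
    correct: int = 0
    converted: int = 0

    def record(self, needed_conversion: bool) -> None:
        if needed_conversion:
            self.converted += 1
        else:
            self.correct += 1

    @property
    def total(self) -> int:
        return self.correct + self.converted

    @property
    def conversion_rate(self) -> float:
        """Fraction of responses that needed conversion (0.0 when empty)."""
        return self.converted / self.total if self.total else 0.0

    def should_alert(self, threshold: float = 0.2, min_samples: int = 20) -> bool:
        """Alert only once enough samples exist and the rate exceeds the threshold."""
        return self.total >= min_samples and self.conversion_rate > threshold
```

The minimum sample size keeps a single converted response right after a deploy from triggering an alert on its own.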

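Summary

This fix ensures the screenplay parsing format is correct at three levels:

Control at the source: improve AI Skill loading so the latest prompt is used
Fallback protection: update the hard-coded prompt so the fallback path still carries the correct format requirements
Fault-tolerant conversion: use the existing format conversion logic to handle the various formats the AI may return

This layered defense keeps the system robust: even when the AI model does not fully follow the prompt, format conversion still produces the correct result.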