# Changelog: Screenplay AI Parsing Format Compatibility Fix

**Date**: 2026-02-09  
**Type**: Bug Fix  
**Scope**: Screenplay parsing  
**Severity**: High

---

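## Problem Description

### Symptoms
When the `/api/v1/screenplays/{id}/parse` API is called to parse a screenplay, the AI response often does not match the expected format:
- It uses `scenes` instead of `locations`
- Scene objects use a `location` field instead of `name`
- Storyboards are nested under `scenes[].shots` instead of the top-level `storyboards` array

The manual test script `tests/manual/test_screenplay_full_parsing.py`, however, returns the correct format.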

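### Root Cause

1. **AI Skill loading fails**
   - The Celery task fails to load the `screenplay_parsing` skill from the database
   - It falls back to an outdated hardcoded prompt whose format requirements are inconsistent with the current ones

2. **AI models behave inconsistently**
   - Different models (GPT-4 vs Gemini) follow the prompt to different degrees
   - Gemini tends to return the `scenes` format (closer to natural-language usage)

3. **Format conversion logic exists but is underused**
   - `ScreenplayService._transform_ai_tags_format()` already contains conversion logic
   - But the failed AI Skill load meant the wrong prompt was used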

---

## Solution

### Fix 1: Improve the AI Skill loading logic

**File**: `server/app/tasks/ai_tasks.py`

**Change**:
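```python
import logging
import os

logger = logging.getLogger(__name__)

# ✅ Load from the file system first (ensures the latest version is used)
skill_file_path = "app/resources/ai_skills/screenplay_parsing.md"
system_prompt = None
if os.path.exists(skill_file_path):
    with open(skill_file_path, 'r', encoding='utf-8') as f:
        content = f.read()

    # Extract the prompt template section between the two headings
    # (the markers must match the headings in the skill file verbatim)
    start_marker = "### 系统角色"
    end_marker = "## 使用示例"

    start_idx = content.find(start_marker)
    # Look for the end marker only after the start marker
    end_idx = content.find(end_marker, start_idx) if start_idx != -1 else -1

    if start_idx != -1 and end_idx != -1:
        system_prompt = content[start_idx:end_idx].strip()
        logger.info("✅ Loaded AI skill from the file system: screenplay_parsing v1.2.0")
```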

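**Benefits**:
- Ensures the latest AI Skill file (v1.2.0) is used
- Avoids the fallback triggered by a failed database load
- The prompt's format requirements are stricter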
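
The loading order can be sketched as a small helper. This is a minimal sketch rather than the project's code: `extract_prompt`, `load_system_prompt`, the `load_from_db` callable, and the exact ordering (skill file, then database, then hardcoded prompt) are assumptions made for illustration. Only the two heading markers are taken from the change itself.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger(__name__)

START_MARKER = "### 系统角色"  # heading that opens the prompt template
END_MARKER = "## 使用示例"     # heading that follows the prompt template


def extract_prompt(content: str) -> Optional[str]:
    """Return the text between the two markers, or None if either is missing."""
    start_idx = content.find(START_MARKER)
    if start_idx == -1:
        return None
    end_idx = content.find(END_MARKER, start_idx)
    if end_idx == -1:
        return None
    return content[start_idx:end_idx].strip()


def load_system_prompt(
    skill_file: Path,
    load_from_db: Callable[[], Optional[str]],  # hypothetical database loader
    fallback_prompt: str,
) -> str:
    """Try the skill file, then the database, then the hardcoded fallback."""
    if skill_file.is_file():
        prompt = extract_prompt(skill_file.read_text(encoding="utf-8"))
        if prompt:
            return prompt
        logger.warning("Prompt markers not found in %s", skill_file)
    try:
        prompt = load_from_db()
        if prompt:
            return prompt
    except Exception:
        logger.exception("Loading the skill from the database failed")
    return fallback_prompt
```

Returning the fallback only as the last step keeps every failure path explicit and logged, which makes a silent downgrade to an old prompt easier to spot.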

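### Fix 2: Update the hardcoded fallback prompt

**File**: `server/app/tasks/ai_tasks.py`

**Change**:
- Updated the hardcoded prompt to match the format required by the AI Skill file
- Tightened the format requirements and explicitly forbade `scenes`, nested structures, and similar variants

**Key requirements**:
```
**Strict requirement**: The response must contain all 7 of the following top-level keys; none may be omitted:
1. **characters** - top-level array of characters
2. **character_tags** - dictionary of character tags
3. **locations** - top-level array of locations (using "scenes" is forbidden)
4. **location_tags** - dictionary of location tags
5. **props** - top-level array of props
6. **prop_tags** - dictionary of prop tags
7. **storyboards** - top-level array of storyboards

**Forbidden formats**:
- Using "scenes" instead of "locations"
- Nesting characters inside scenes
- Nesting shots inside scenes (using "shots")
- Scene objects using a "location" field instead of "name"
```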
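
One way to guard against the hardcoded prompt drifting away from the skill file again is a small check that a prompt text mentions every required key. The sketch below is illustrative only: `keys_missing_from_prompt` is a hypothetical helper, and the substring test is a deliberately simple heuristic.

```python
from typing import List

# The seven top-level keys listed in the prompt requirements
REQUIRED_KEYS = (
    "characters", "character_tags", "locations", "location_tags",
    "props", "prop_tags", "storyboards",
)


def keys_missing_from_prompt(prompt: str) -> List[str]:
    """List the required top-level keys that the prompt text never mentions."""
    return [key for key in REQUIRED_KEYS if key not in prompt]
```

Running this against both the skill file prompt and the hardcoded fallback in a unit test would flag a prompt that silently omits one of the keys.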

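### Fix 3: Rely on the existing format conversion logic (no changes needed)

**File**: `server/app/services/screenplay_service.py`

**Existing behavior**:
- `_transform_ai_tags_format()` already supports three format conversions
- Converts `scenes` to `locations` automatically
- Extracts `scenes[].characters` into the top-level characters array
- Extracts `scenes[].shots` into the top-level `storyboards` array

**Conversion flow**: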
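```python
# Abridged excerpt: `parsed_data` is the raw AI output, `result` is the dict
# being built, and "..." marks fields elided from this changelog.
# Lines tagged "assumed" are not in the original excerpt.
scenes = parsed_data.get('scenes', [])  # assumed

# Step 1: convert scenes to locations
if 'scenes' in parsed_data:
    locations = []
    for scene in scenes:
        location_name = scene.get('location') or scene.get('name')
        location_obj = {
            'name': location_name,  # ✅ always use the name field
            'location': location_name,
            'description': scene.get('description', ''),
            'meta_data': {...}
        }
        locations.append(location_obj)
    result['locations'] = locations

# Step 1.5: extract characters from scenes[].characters
if not result.get('characters'):
    unique_characters = set()
    for scene in scenes:
        for char_name in scene.get('characters', []):
            unique_characters.add(char_name)
    result['characters'] = [{'name': name, ...} for name in unique_characters]

# Step 1.6: extract storyboards from scenes[].shots
if not result.get('storyboards'):
    storyboards = []
    shot_counter = 0  # assumed: shots are numbered from 1 across all scenes
    for scene in scenes:
        scene_location = scene.get('location') or scene.get('name')  # assumed
        for shot in scene.get('shots', []):
            shot_counter += 1  # assumed
            storyboard = {
                'shot_number': shot_counter,
                'title': shot.get('title'),
                'locations': [scene_location],
                ...
            }
            storyboards.append(storyboard)
    result['storyboards'] = storyboards
```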
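
For readers who want to run the conversion end to end, here is a self-contained sketch of the same three steps. It is not the actual `_transform_ai_tags_format()` implementation: the function name `transform_scenes_format`, the reduced field set, dropping the `scenes` key from the result, and numbering shots from 1 are all assumptions.

```python
from typing import Any, Dict, List


def transform_scenes_format(parsed: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a scenes-based payload into top-level locations, characters, storyboards."""
    if "scenes" not in parsed:
        return parsed  # already in the expected shape

    result = {key: value for key, value in parsed.items() if key != "scenes"}
    scenes: List[Dict[str, Any]] = parsed.get("scenes") or []

    # Step 1: one location per scene, always keyed by "name"
    result["locations"] = [
        {
            "name": scene.get("location") or scene.get("name"),
            "description": scene.get("description", ""),
        }
        for scene in scenes
    ]

    # Step 1.5: collect unique character names, preserving first-seen order
    if not result.get("characters"):
        seen: Dict[str, Dict[str, Any]] = {}
        for scene in scenes:
            for char_name in scene.get("characters", []):
                seen.setdefault(char_name, {"name": char_name})
        result["characters"] = list(seen.values())

    # Step 1.6: lift nested shots into a flat storyboards list
    if not result.get("storyboards"):
        storyboards: List[Dict[str, Any]] = []
        for scene in scenes:
            scene_location = scene.get("location") or scene.get("name")
            for shot in scene.get("shots", []):
                storyboards.append({
                    "shot_number": len(storyboards) + 1,
                    "title": shot.get("title"),
                    "locations": [scene_location],
                })
        result["storyboards"] = storyboards

    return result


raw = {"scenes": [{"location": "Seaside", "characters": ["Lin"],
                   "shots": [{"title": "Opening"}]}]}
converted = transform_scenes_format(raw)
```

The sketch collects characters in a dict rather than a set so that the output order is deterministic, which makes the stored result and any test assertions reproducible.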

---

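## Test Verification

### Scenario 1: API call (before the fix)
```bash
# Call the API
POST /api/v1/screenplays/{id}/parse

# AI response format (incorrect)
{
  "scenes": [
    {
      "location": "Seaside",  # ❌ should be name
      "shots": [...]          # ❌ should be in the top-level storyboards
    }
  ]
}

# Result: format conversion fails and the data is stored incorrectly
```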

### Scenario 2: API call (after the fix)
```bash
# Call the API
POST /api/v1/screenplays/{id}/parse

# AI response format (may still use scenes)
{
  "scenes": [
    {
      "location": "Seaside",
      "shots": [...]
    }
  ]
}

# ✅ Format conversion succeeds
{
  "characters": [...],
  "locations": [{"name": "Seaside", ...}],
  "storyboards": [...]
}

# Result: data is stored correctly in the database
```

### Scenario 3: Manual test script (always correct)
```bash
# Run the manual test
docker exec jointo-server-app python tests/manual/test_screenplay_full_parsing.py

# AI response format (correct)
{
  "characters": [...],
  "locations": [...],
  "storyboards": [...]
}

# Result: the format is correct and no conversion is needed
```

---

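## Impact

### Affected features
- ✅ Screenplay AI parsing API (`POST /api/v1/screenplays/{id}/parse`)
- ✅ Celery async task (`parse_screenplay_task`)
- ✅ Screenplay element storage (`ScreenplayService.store_parsed_elements`)

### Unaffected features
- ✅ Manual test script (always uses the correct prompt)
- ✅ Other AI generation features (images, video, voice-over, etc.)
- ✅ Screenplay CRUD operations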

---

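## Follow-up Recommendations

### 1. Unify AI Skill management
**Problem**: AI Skills currently live in both the file system and the database, so the two copies can easily drift out of sync

**Recommendation**:
- Treat the file system as the Single Source of Truth
- Use the database only as a cache, synced periodically from the file system
- Or remove database storage entirely and load directly from the file system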
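
If the database is kept as a cache, a content hash is one simple way to decide when it needs refreshing. The sketch below assumes the cache stores a SHA-256 hash of the skill text alongside the prompt; `content_hash` and `needs_sync` are hypothetical helper names.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """Return a stable SHA-256 fingerprint of the skill file content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_sync(file_content: str, cached_hash: Optional[str]) -> bool:
    """True when the database copy is missing or differs from the file."""
    return cached_hash != content_hash(file_content)
```

A periodic job could call `needs_sync` and overwrite the database row only when it returns `True`, so the file system stays authoritative.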

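### 2. Improve prompt adherence across AI models
**Problem**: Different AI models follow the prompt to different degrees

**Recommendation**:
- Constrain the AI output format with a JSON Schema
- Use Function Calling / Structured Output (supported by GPT-4)
- Add output format validation and an automatic retry mechanism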
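
The validate-and-retry idea can be sketched without tying it to a specific model SDK. In the sketch below, `call_model` and `validate` are hypothetical callables supplied by the caller: the first sends the prompt and returns raw text, the second returns a list of problems (empty when the payload is acceptable). Any format conversion is assumed to happen inside or before `validate`.

```python
import json
from typing import Any, Callable, Dict, List


def parse_with_retry(
    call_model: Callable[[str], str],
    validate: Callable[[Dict[str, Any]], List[str]],
    prompt: str,
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Call the model until its output is a valid JSON object or attempts run out."""
    last_error = "no attempts made"
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON: {exc}"
            continue
        if not isinstance(data, dict):
            last_error = "top-level value is not a JSON object"
            continue
        problems = validate(data)
        if not problems:
            return data
        last_error = "; ".join(problems)
    raise ValueError(
        f"Model output still invalid after {max_attempts} attempts ({last_error})"
    )
```

Keeping the last error in the exception message addresses the same concern as clearer validation errors: a failed parse reports what was wrong with the final attempt instead of a generic failure.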

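### 3. Add format validation middleware
**Problem**: When the AI returns a malformed response, the error message is not specific enough

**Recommendation**: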
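```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


class ValidationError(ValueError):
    """Placeholder; substitute the project's own validation exception."""


def validate_screenplay_parse_result(data: Dict[str, Any]) -> None:
    """Validate the format of the AI parse result."""
    # Check forbidden formats first, so the warning is logged even when
    # missing keys cause an exception below
    if 'scenes' in data:
        logger.warning("AI returned the 'scenes' format; it will be converted to 'locations'")

    required_keys = ['characters', 'character_tags', 'locations',
                     'location_tags', 'props', 'prop_tags', 'storyboards']

    missing_keys = [k for k in required_keys if k not in data]
    if missing_keys:
        raise ValidationError(f"AI response is missing required fields: {missing_keys}")

    # Check the scene object format
    for loc in data.get('locations', []):
        if 'location' in loc and 'name' not in loc:
            logger.warning("Scene object uses the 'location' field; 'name' is preferred")
```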

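### 4. Monitoring and alerting
**Recommendation**:
- Record statistics on AI response formats (correct format vs formats that need conversion)
- Send an alert when the conversion rate exceeds a threshold
- Periodically review how effective the AI Skill prompt is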
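
A conversion-rate counter with a threshold check could look like the sketch below. The class name `FormatStats`, the 20% threshold, and the minimum sample size are illustrative assumptions, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass
class FormatStats:
    """Counts parse results and how many of them needed format conversion."""
    total: int = 0
    converted: int = 0

    def record(self, needed_conversion: bool) -> None:
        self.total += 1
        if needed_conversion:
            self.converted += 1

    @property
    def conversion_rate(self) -> float:
        return self.converted / self.total if self.total else 0.0

    def should_alert(self, threshold: float = 0.2, min_samples: int = 20) -> bool:
        # Require a minimum sample size so a few early conversions do not alert
        return self.total >= min_samples and self.conversion_rate > threshold
```

In practice the counters would live in a metrics backend rather than in process memory; the sketch only shows the rate and threshold logic.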

---

## Related Documents

- AI Skill file: `server/app/resources/ai_skills/screenplay_parsing.md`
- Format conversion logic: `server/app/services/screenplay_service.py:_transform_ai_tags_format()`
- Celery task: `server/app/tasks/ai_tasks.py:parse_screenplay_task()`
- Manual test: `server/tests/manual/test_screenplay_full_parsing.py`

---

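## Summary

This fix ensures the screenplay parsing format is correct at three levels:

1. **Control at the source**: improve AI Skill loading so the latest prompt is always used
2. **Fallback protection**: update the hardcoded prompt so that the fallback path still carries the correct format requirements
3. **Tolerant conversion**: use the existing format conversion logic to accept the various formats the AI may return

This layered defense keeps the system robust: even when the AI model does not fully follow the prompt, format conversion still produces the correct result.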