
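Extract storyboard data from scenes[].shots

Date: 2026-02-09
Type: Bug fix
Scope: Screenplay-parsing AI task

Background

After the Markdown JSON parsing issue was fixed, storyboard data was still empty. Analysis showed that the data structure returned by the AI did not match the structure the code expected.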

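Symptoms

The task logs reported 5 locations but zero characters, props, and storyboards. The log lines are kept verbatim ("剧本解析完成" = screenplay parsing finished; "剧本元素存储成功" = screenplay elements stored; 角色 = characters, 场景 = scenes, 道具 = props, 标签 = tags, 分镜 = storyboards):

剧本解析完成: characters=0, locations=5, props=0, storyboards=0
剧本元素存储成功: 角色=0, 场景=5, 道具=0, 标签=0, 分镜=0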

Data structure returned by the AI

{
  "scenes": [
    {
      "scene_number": 1,
      "location": "海边",
      "time": "晨",
      "description": "...",
      "characters": ["女孩", "渔民", "老警察", "小警察"],
      "shots": [
        {
          "shot_number": 1,
          "shot_size": "特写",
          "camera_movement": "static",
          "description": "...",
          "duration": 5
        }
      ]
    }
  ]
}

Data structure expected by the code

{
  "characters": [...],
  "locations": [...],
  "storyboards": [...]  # top-level storyboard array
}

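Root cause

The AI returns a scenes array, and each scene contains its own shots array.
The code expects a storyboards array at the top level.
The _transform_ai_tags_format method only converted scenes into locations; it never extracted the shots.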
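
The mismatch can be reproduced in isolation. The sketch below is an illustration only: `count_top_level` is a hypothetical helper, not a function from screenplay_service.py, and it assumes the parser counts elements by reading top-level keys. The payload is a trimmed-down copy of the structure shown above.

```python
# Hypothetical helper: counts elements the way a parser that reads only
# top-level keys would (an assumption, not the service's actual code).
def count_top_level(payload: dict) -> dict:
    keys = ("characters", "locations", "props", "storyboards")
    return {key: len(payload.get(key) or []) for key in keys}


# Trimmed-down version of the structure the AI returned.
ai_payload = {
    "scenes": [
        {
            "scene_number": 1,
            "location": "海边",
            "characters": ["女孩", "渔民"],
            "shots": [{"shot_number": 1, "description": "...", "duration": 5}],
        }
    ]
}

counts = count_top_level(ai_payload)

# The shots exist, but only nested inside each scene.
nested_shots = sum(len(scene.get("shots") or []) for scene in ai_payload["scenes"])

print(counts["storyboards"], nested_shots)
```

Every top-level count is zero here because this sketch skips the scenes-to-locations conversion; in the real task that conversion already ran, which is why the logs show locations=5 alongside storyboards=0.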

Solution

Add logic to the _transform_ai_tags_format method that extracts scenes[].shots and converts them into a top-level storyboards array.

Changes

File: server/app/services/screenplay_service.py

Step 1.5: extract scenes[].shots into a top-level storyboards array

# ✅ Step 1.5: extract scenes[].shots into a top-level storyboards array
if not result.get('storyboards'):
    storyboards = []
    shot_counter = 1
    
    for scene in scenes:
        scene_location = scene.get('location') or scene.get('name') or scene.get('title')
        scene_characters = scene.get('characters', [])
        
        for shot in scene.get('shots', []):
            # Build a standard storyboard object
            storyboard = {
                'shot_number': shot_counter,
                'title': shot.get('title') or f"镜头 {shot_counter}",
                'description': shot.get('description', ''),
                'dialogue': shot.get('dialogue', ''),
                'shot_size': shot.get('shot_size'),
                'camera_movement': shot.get('camera_movement'),
                'estimated_duration': shot.get('estimated_duration') or shot.get('duration', 5),
                'characters': shot.get('characters') or scene_characters,
                'character_tags': shot.get('character_tags', {}),
                'locations': [scene_location] if scene_location else [],
                'location_tags': shot.get('location_tags', {}),
                'props': shot.get('props', []),
                'prop_tags': shot.get('prop_tags', {}),
                'meta_data': {
                    'scene_number': scene.get('scene_number'),
                    'scene_location': scene_location,
                    'scene_time': scene.get('time'),
                    **shot.get('meta_data', {})
                }
            }
            storyboards.append(storyboard)
            shot_counter += 1
    
    if storyboards:
        result['storyboards'] = storyboards
        logger.info("✅ 成功从 scenes[].shots 提取 %d 个分镜", len(storyboards))

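Transformation logic

Iterate over all scenes: take each scene from the scenes array.
Read the scene information: get the scene's location and characters.
Iterate over the scene's shots: take each shot from scene.shots.
Build the storyboard object:
- shot_number: global shot number (auto-incremented across all scenes)
- title: shot title (if missing, "镜头 N", i.e. "Shot N", is generated)
- characters: the shot's own character list if present, otherwise the scene's character list
- locations: the scene's location
- meta_data: stores the scene number, scene location, scene time, and similar information
Add to the result: append all storyboards to the top-level storyboards array.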
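
The steps above can be exercised outside the service with a standalone sketch. This is a condensed version, not the code in screenplay_service.py: `extract_storyboards` is a hypothetical name, the tag and prop fields are omitted for brevity, the `or []` / `or {}` guards against null values are an added assumption, and the second scene ("码头") is made-up sample data.

```python
from typing import Any


def extract_storyboards(scenes: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten scenes[].shots into one list with global shot numbers."""
    storyboards: list[dict[str, Any]] = []
    for scene in scenes:
        location = scene.get("location") or scene.get("name") or scene.get("title")
        for shot in scene.get("shots") or []:
            number = len(storyboards) + 1  # global numbering, not per scene
            storyboards.append({
                "shot_number": number,
                "title": shot.get("title") or f"镜头 {number}",
                "description": shot.get("description", ""),
                "estimated_duration": shot.get("estimated_duration") or shot.get("duration", 5),
                # Shot-level characters win; otherwise fall back to the scene's list.
                "characters": shot.get("characters") or scene.get("characters", []),
                "locations": [location] if location else [],
                "meta_data": {
                    "scene_number": scene.get("scene_number"),
                    "scene_location": location,
                    "scene_time": scene.get("time"),
                    **(shot.get("meta_data") or {}),
                },
            })
    return storyboards


# Made-up sample input with two scenes and three shots in total.
scenes = [
    {"scene_number": 1, "location": "海边", "time": "晨", "characters": ["女孩"],
     "shots": [{"shot_number": 1, "duration": 5},
               {"shot_number": 2, "characters": ["渔民"]}]},
    {"scene_number": 2, "location": "码头",
     "shots": [{"shot_number": 1, "title": "开场"}]},
]
boards = extract_storyboards(scenes)
print([board["shot_number"] for board in boards])
```

Per-scene shot numbers (1, 2, 1 in the sample) are replaced by a single running sequence, and the original scene number is preserved under meta_data so the scene grouping can still be recovered.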

Restart the services

docker restart jointo-server-celery-ai jointo-server-app

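Testing and verification

Test results

The log lines below are kept verbatim. In order, they report: 5 scenes converted to the locations format; 12 storyboards extracted from scenes[].shots; storyboard creation finished with a total of 12; screenplay elements stored with characters (角色) = 0, scenes (场景) = 5, props (道具) = 0, tags (标签) = 0, storyboards (分镜) = 12.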

✅ 成功转换 5 个场景为 locations 格式
✅ 成功从 scenes[].shots 提取 12 个分镜
分镜创建完成: screenplay_id=..., 总数=12
剧本元素存储成功: 角色=0, 场景=5, 道具=0, 标签=0, 分镜=12

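Verification steps

✅ Scene data is correct: 5 scenes
✅ Storyboard data is correct: 12 storyboards
✅ scene_count is correctly updated to 5
⚠️ Character data is missing: needs a further fix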

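Follow-up work

Open issues

Character data is missing:
- The AI returns character information inside scenes[].characters
- All unique character names need to be extracted from the scenes
- A top-level characters array needs to be created
Prop data is missing:
- The AI may not have returned any prop information
- Or the prop information may be in scenes[].props or shots[].props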
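
One possible shape for that extraction is sketched below. It is not part of the fix described here: `collect_unique_names` is a hypothetical helper, the sample prop "渔网" is made up, and the sketch returns plain names only because the object shape expected in the top-level characters and props arrays is not shown in this note.

```python
def collect_unique_names(scenes: list[dict], key: str) -> list[str]:
    """Collect unique names under `key` from each scene and its shots, in first-seen order."""
    seen: dict[str, None] = {}  # dict keeps insertion order and drops duplicates
    for scene in scenes:
        # Look at the scene itself, then at every shot nested inside it.
        containers = [scene, *(scene.get("shots") or [])]
        for container in containers:
            for name in container.get(key) or []:
                if isinstance(name, str) and name.strip():
                    seen.setdefault(name.strip(), None)
    return list(seen)


# Made-up sample input.
scenes = [
    {"characters": ["女孩", "渔民"],
     "shots": [{"characters": ["女孩"], "props": ["渔网"]}]},
    {"characters": ["渔民", "老警察"], "shots": []},
]
character_names = collect_unique_names(scenes, "characters")
prop_names = collect_unique_names(scenes, "props")
print(character_names, prop_names)
```

Keeping first-seen order means the resulting list follows the order in which names appear in the screenplay, which is usually easier to review than a sorted set.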

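Possible improvements

Improve the AI Skill Prompt:
- Explicitly require top-level characters, locations, props, and storyboards arrays
- Provide a standard JSON example
Strengthen the data transformation logic:
- Support more AI response formats
- Automatically extract data from nested structures
Add data validation:
- Verify that required fields are present
- Provide more detailed error messages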
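
A minimal sketch of such a validation step follows. The function name, the list of required storyboard fields, and the error wording are all assumptions for illustration; the actual required fields would have to come from the service's data model.

```python
REQUIRED_TOP_LEVEL = ("characters", "locations", "props", "storyboards")
# Assumed minimum per storyboard; adjust to the real schema.
REQUIRED_STORYBOARD_FIELDS = ("shot_number", "description")


def validate_parsed_screenplay(result: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the shape is valid."""
    errors: list[str] = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in result:
            errors.append(f"missing top-level key: {key}")
        elif not isinstance(result[key], list):
            errors.append(f"{key} must be a list, got {type(result[key]).__name__}")

    storyboards = result.get("storyboards")
    if isinstance(storyboards, list):
        for index, board in enumerate(storyboards):
            if not isinstance(board, dict):
                errors.append(f"storyboards[{index}] must be an object")
                continue
            for field in REQUIRED_STORYBOARD_FIELDS:
                if field not in board:
                    errors.append(f"storyboards[{index}] missing field: {field}")
    return errors


problems = validate_parsed_screenplay(
    {"locations": [], "storyboards": [{"shot_number": 1}]}
)
for problem in problems:
    print(problem)
```

Collecting every problem instead of raising on the first one gives a single log entry that describes the whole shape mismatch, which would have made the empty-storyboards case above quicker to diagnose.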

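Technical debt

Extract character data from scenes[].characters
Improve the AI Skill Prompt so that it returns the standard format
Add data format validation and error handling
Support more variants of the AI response format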
