
AI Screenplay Parsing Workflow

Document version: v1.1
Last updated: 2026-01-30


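Table of Contents

  1. Workflow Overview
  2. Complete Flow Diagram
  3. Detailed Steps
  4. AI Output Format Specification
  5. Automatic Data Storage Logic
  6. Automatic Storyboard Association Logic
  7. Error Handling
  8. Sample Code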

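Workflow Overview

The AI screenplay parsing workflow is one of the system's core features. It converts screenplay text into structured data automatically.

Core Goals

  1. Extract screenplay elements: the AI identifies and extracts characters, scenes, and props
  2. Identify tags: the AI identifies multiple tags for each character, scene, or prop (age group, era, condition, and so on)
  3. Break down storyboards: the AI splits the screenplay into a storyboard script
  4. Build associations: storyboards are linked to characters, scenes, and props automatically
  5. Persist data: all recognition results are stored in the database automatically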

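Services Involved

  • AI Service: calls the AI model to perform parsing
  • Screenplay Service: manages screenplays and screenplay elements
  • Screenplay Tag Service: manages tags (new)
  • Storyboard Service: manages storyboards
  • Credit Service: manages credit deduction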

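Database Tables Involved

  • screenplays: screenplay table
  • screenplay_characters: screenplay character table
  • screenplay_scenes: screenplay scene table
  • screenplay_props: screenplay prop table
  • screenplay_element_tags: screenplay element tag table (manages variant tags for characters, scenes, and props in one place)
  • storyboards: storyboard table
  • project_resources: project asset table (linked to tags; stores the denormalized fields element_name and tag_label)
  • ai_jobs: AI job table
  • credit_consumption_logs: credit consumption log table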

Complete Flow Diagram

┌─────────────────────────────────────────────────────────────────┐
│              User uploads or creates a screenplay               │
│              (screenplays table)                                │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              User triggers AI screenplay parsing                │
│              POST /api/v1/screenplays/{id}/parse                │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              1. Check that the user has enough credits          │
│                 (Credit Service)                                │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              2. Pre-deduct credits and create the AI job        │
│                 (credit_consumption_logs + ai_jobs tables)      │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              3. Submit the asynchronous task to Celery          │
│                 (parse_screenplay_task)                         │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              4. Celery worker calls the AI model                │
│                 (GPT-4 / Claude / Gemini / ERNIE Bot)           │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              5. AI returns structured JSON                      │
│                 {characters, scenes, props, tags, storyboards}  │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              6. Store character data                            │
│                 Bulk insert into screenplay_characters          │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              7. Store scene data                                │
│                 Bulk insert into screenplay_scenes              │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              8. Store prop data                                 │
│                 Bulk insert into screenplay_props               │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│              9. Store tag data                                  │
│                 Call ScreenplayTagService.store_tags()          │
│                 Bulk insert into screenplay_element_tags        │
│                 Set has_tags = true on each tagged element      │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│             10. Create storyboard records                       │
│                 Bulk insert into storyboards                    │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│             11. Link storyboards to elements                    │
│                 Match character/scene/prop names per shot       │
│                 (via screenplay_character_ids and similar)      │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│             12. Mark the AI job as completed                    │
│                 Confirm the credit consumption                  │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│             13. Return the parsed result to the user            │
│                 {characters, scenes, props, tags, storyboards}  │
└─────────────────────────────────────────────────────────────────┘

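Detailed Steps

Step 1: User uploads or creates a screenplay

Actions

  • The user uploads a screenplay file through the API or enters the screenplay text directly
  • The system creates a record in the screenplays table

API

POST /api/v1/screenplays

Request body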

{
  "project_id": "019d1234-5678-7abc-def0-987654321fed",
  "name": "第一集剧本",
  "type": "text",
  "content": "剧本内容..."
}

Step 2: User triggers AI parsing

Actions

  • The user clicks the "AI parse screenplay" button
  • The frontend calls the AI parsing API

API

POST /api/v1/screenplays/{screenplay_id}/parse

Request body

{
  "auto_create_elements": true,
  "auto_create_tags": true,
  "auto_create_storyboards": true,
  "model": "gpt-4"
}

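Steps 3-4: Credit check and job creation

Actions

  1. The AI Service calls the Credit Service to check the user's credits
  2. Credits are pre-deducted (for example, 50 credits)
  3. A record is created in the ai_jobs table
  4. A record is created in the credit_consumption_logs table
  5. The asynchronous task is submitted to Celery

Code example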

# AI Service
async def parse_screenplay(
    self,
    user_id: UUID,
    screenplay_id: UUID,
    auto_create_elements: bool = True,
    auto_create_tags: bool = True,
    auto_create_storyboards: bool = True,
    model: str = "gpt-4"
) -> Dict[str, Any]:
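    # 1. Load the screenplay
    screenplay = await self.screenplay_repo.get_by_id(screenplay_id)
    if screenplay is None:
        raise ScreenplayNotFoundError()
    
    # 2. Compute the credits required
    model_config = await self.model_repo.get_by_name(model)
    credits_needed = model_config.credits_per_unit * 10  # assumes parsing a screenplay costs 10 units
    
    # 3. Pre-deduct the credits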
    consumption_log = await self.credit_service.consume_credits(
        user_id=user_id,
        amount=credits_needed,
        feature_type='screenplay_parse',
        task_params={
            'screenplay_id': str(screenplay_id),
            'model': model
        }
    )
    
    # 4. Create the AI job
    job = await self.job_repo.create({
        'job_type': 'text_processing',
        'status': 'pending',
        'user_id': user_id,
        'project_id': screenplay.project_id,
        'model_id': model_config.model_id,
        'model_name': model,
        'consumption_log_id': consumption_log.consumption_id,
        'input_data': {
            'screenplay_id': str(screenplay_id),
            'screenplay_content': screenplay.content,
            'auto_create_elements': auto_create_elements,
            'auto_create_tags': auto_create_tags,
            'auto_create_storyboards': auto_create_storyboards
        }
    })
    
    # 5. Link the consumption log to the job
    consumption_log.ai_job_id = job.ai_job_id
    await self.db.commit()
    
    # 6. Submit the asynchronous task
    task = parse_screenplay_task.delay(
        job_id=str(job.ai_job_id),
        screenplay_id=str(screenplay_id),
        screenplay_content=screenplay.content,
        model=model,
        auto_create_elements=auto_create_elements,
        auto_create_tags=auto_create_tags,
        auto_create_storyboards=auto_create_storyboards
    )
    
    return {
        'job_id': str(job.ai_job_id),
        'task_id': task.id,
        'status': 'pending'
    }

Step 5: Celery worker calls the AI model

Actions

  • The Celery worker receives the task
  • It calls the AI model (GPT-4 / Claude / Gemini, etc.)
  • The AI model analyzes the screenplay and returns structured JSON

Celery Task

import asyncio
import json
from datetime import datetime


@celery_app.task
def parse_screenplay_task(
    job_id: str,
    screenplay_id: str,
    screenplay_content: str,
    model: str,
    auto_create_elements: bool,
    auto_create_tags: bool,
    auto_create_storyboards: bool
):
    async def store_results(parsed_data):
        # The storage helpers are coroutines. A synchronous Celery task cannot
        # await them directly, so they are grouped here and run via asyncio.run().
        if auto_create_elements:
            await store_screenplay_elements(screenplay_id, parsed_data)
        
        if auto_create_tags:
            await store_screenplay_tags(screenplay_id, parsed_data)
        
        if auto_create_storyboards:
            await store_storyboards(screenplay_id, parsed_data)

    try:
        # 1. Mark the job as processing
        ai_service.update_job(job_id, {'status': 'processing', 'started_at': datetime.utcnow()})
        
        # 2. Build the AI prompt
        prompt = build_screenplay_parse_prompt(screenplay_content)
        
        # 3. Call the AI model
        ai_response = call_ai_model(model, prompt)
        
        # 4. Parse the JSON returned by the AI
        parsed_data = json.loads(ai_response)
        
        # 5. Store the data automatically (where enabled)
        asyncio.run(store_results(parsed_data))
        
        # 6. Mark the job as completed
        ai_service.update_job(job_id, {
            'status': 'completed',
            'completed_at': datetime.utcnow(),
            'output_data': parsed_data
        })
        
        # 7. Confirm the credit consumption
        job = ai_service.get_job(job_id)
        credit_service.confirm_consumption(job.consumption_log_id)
        
    except Exception as e:
        # The task failed: record the error and refund the credits
        ai_service.update_job(job_id, {
            'status': 'failed',
            'error_message': str(e)
        })
        job = ai_service.get_job(job_id)
        credit_service.refund_credits(job.consumption_log_id, reason=str(e))

AI Output Format Specification

The AI model must return JSON in the following format:

{
  "characters": [
    {
      "name": "张三",
      "description": "男主角,30岁,程序员",
      "role_type": "main",
      "metadata": {
        "age": 30,
        "gender": "male",
        "occupation": "程序员",
        "personality": "内向、聪明、善良"
      }
    },
    {
      "name": "李四",
      "description": "女主角,28岁,设计师",
      "role_type": "main",
      "metadata": {
        "age": 28,
        "gender": "female",
        "occupation": "设计师",
        "personality": "外向、热情、乐观"
      }
    }
  ],
  "scenes": [
    {
      "scene_number": 1,
      "title": "咖啡厅",
      "location": "市中心星巴克",
      "time_of_day": "afternoon",
      "description": "一个温馨的咖啡厅,阳光透过落地窗洒进来",
      "duration_estimate": 120.0,
      "order_index": 0,
      "metadata": {
        "atmosphere": "温馨、浪漫",
        "weather": "晴天"
      }
    }
  ],
  "props": [
    {
      "name": "笔记本电脑",
      "description": "张三的工作电脑",
      "category": "电子设备",
      "importance": "normal",
      "metadata": {
        "brand": "MacBook Pro",
        "color": "银色"
      }
    },
    {
      "name": "古剑",
      "description": "传说中的宝剑,剧情关键道具",
      "category": "武器",
      "importance": "key",
      "metadata": {
        "material": "玄铁",
        "special_ability": "可以斩断任何物体"
      }
    }
  ],
  "character_tags": [
    {
      "character_name": "张三",
      "tag_key": "youth",
      "tag_label": "少年",
      "description": "15岁的张三,还在上高中",
      "order_index": 0,
      "ai_confidence": 0.95,
      "ai_context": "剧本第3场:回忆杀,张三回忆起高中时代...",
      "metadata": {
        "age_range": "13-17",
        "key_features": ["短发", "校服", "天真"]
      }
    },
    {
      "character_name": "张三",
      "tag_key": "adult",
      "tag_label": "成年",
      "description": "30岁的张三,现在的样子",
      "order_index": 1,
      "ai_confidence": 1.0,
      "ai_context": "剧本主线",
      "metadata": {
        "age_range": "30",
        "key_features": ["成熟", "西装", "眼镜"]
      }
    }
  ],
  "scene_tags": [
    {
      "scene_name": "咖啡厅",
      "tag_key": "era_1990",
      "tag_label": "1990年代",
      "description": "90年代的咖啡厅,复古装修",
      "order_index": 0,
      "ai_confidence": 0.92,
      "ai_context": "剧本第10场:回忆杀,1990年代的咖啡厅...",
      "metadata": {
        "time_period": "1990-1999",
        "key_features": ["霓虹灯", "老式电话亭", "磁带播放器"]
      }
    }
  ],
  "prop_tags": [
    {
      "prop_name": "古剑",
      "tag_key": "damaged",
      "tag_label": "破损",
      "description": "锈迹斑斑的古剑",
      "order_index": 0,
      "ai_confidence": 0.88,
      "ai_context": "剧本第5场:古剑被发现时已经破损...",
      "metadata": {
        "condition": "破损",
        "key_features": ["锈迹", "剑刃卷曲"]
      }
    },
    {
      "prop_name": "古剑",
      "tag_key": "restored",
      "tag_label": "修复",
      "description": "重铸后的古剑,锋利如新",
      "order_index": 1,
      "ai_confidence": 0.90,
      "ai_context": "剧本第15场:古剑被重铸...",
      "metadata": {
        "condition": "完好",
        "key_features": ["锋利", "闪光"]
      }
    }
  ],
  "storyboards": [
    {
      "shot_number": "001",
      "title": "开场",
      "description": "张三坐在咖啡厅里,看着窗外",
      "dialogue": "张三:又是一个平凡的下午...",
      "shot_size": "medium_shot",
      "camera_movement": "static",
      "estimated_duration": 5.5,
      "order_index": 0,
      "start_time": 0.0,
      "end_time": 5.5,
      "metadata": {
        "lighting": "自然光",
        "weather": "晴天",
        "time_of_day": "下午"
      },
      "characters": ["张三"],
      "character_tags": {
        "张三": "adult"
      },
      "scenes": ["咖啡厅"],
      "scene_tags": {
        "咖啡厅": "modern"
      },
      "props": ["笔记本电脑"],
      "prop_tags": {}
    },
    {
      "shot_number": "002",
      "title": "回忆杀",
      "description": "少年张三在学校操场上奔跑",
      "dialogue": null,
      "shot_size": "wide_shot",
      "camera_movement": "tracking",
      "estimated_duration": 8.0,
      "order_index": 1,
      "start_time": 5.5,
      "end_time": 13.5,
      "metadata": {
        "lighting": "阳光",
        "weather": "晴天",
        "time_of_day": "上午"
      },
      "characters": ["张三"],
      "character_tags": {
        "张三": "youth"
      },
      "scenes": ["学校操场"],
      "scene_tags": {
        "学校操场": "1990s"
      },
      "props": [],
      "prop_tags": {}
    }
  ]
}
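
In the sample above, each shot's start_time equals the previous shot's end_time, and end_time minus start_time equals estimated_duration. If the model omits these values or returns inconsistent ones, they can be recomputed from the durations. The following is a minimal sketch of that idea; assign_timeline is a hypothetical helper, not part of the documented services.

```python
from typing import Any, Dict, List


def assign_timeline(storyboards: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the shots with start_time/end_time laid end to end.

    Hypothetical helper: shots are ordered by order_index and each one starts
    where the previous one ended. A missing duration counts as 0 seconds.
    """
    timeline: List[Dict[str, Any]] = []
    cursor = 0.0
    for shot in sorted(storyboards, key=lambda s: s["order_index"]):
        duration = float(shot.get("estimated_duration") or 0.0)
        timeline.append({**shot, "start_time": cursor, "end_time": cursor + duration})
        cursor += duration
    return timeline


# Two shots given out of order, using the durations from the sample above.
shots = [
    {"shot_number": "002", "order_index": 1, "estimated_duration": 8.0},
    {"shot_number": "001", "order_index": 0, "estimated_duration": 5.5},
]
timeline = assign_timeline(shots)
```

The input dictionaries are copied rather than mutated, so the raw AI output can still be stored unchanged in ai_jobs.output_data.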

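Field Descriptions

characters array

  • name: character name (required)
  • description: character description (required)
  • role_type: role type (main/supporting/extra)
  • metadata: additional metadata (optional)

scenes array

  • scene_number: scene number (required)
  • title: scene title (required)
  • location: scene location (optional)
  • time_of_day: time of day (dawn/morning/noon/afternoon/dusk/night)
  • description: scene description (required)
  • duration_estimate: estimated duration (seconds)
  • order_index: sort index (required)
  • metadata: additional metadata (optional)

props array

  • name: prop name (required)
  • description: prop description (required)
  • category: prop category (optional)
  • importance: importance (key/normal/background)
  • metadata: additional metadata (optional)

character_tags array

  • character_name: character name (required, used for association)
  • tag_key: tag identifier (required, e.g. youth/adult/elder)
  • tag_label: tag display name (required)
  • description: tag description (required)
  • order_index: sort index (required)
  • ai_confidence: AI recognition confidence (0.0-1.0)
  • ai_context: context the AI based the tag on (an excerpt from the screenplay)
  • metadata: additional metadata (optional)

scene_tags array

  • Same fields as character_tags, with character_name replaced by scene_name

prop_tags array

  • Same fields as character_tags, with character_name replaced by prop_name

storyboards array

  • shot_number: shot number (auto-generated, e.g. "001")
  • title: storyboard title (required)
  • description: storyboard description (required)
  • dialogue: dialogue (optional)
  • shot_size: shot size (optional)
  • camera_movement: camera movement (optional)
  • estimated_duration: estimated duration (seconds)
  • order_index: sort index (required)
  • start_time: start time (seconds)
  • end_time: end time (seconds)
  • metadata: additional metadata (optional)
  • characters: array of the names of the characters involved (required)
  • character_tags: character tag mapping (optional, format: {"character name": "tag key"})
  • scenes: array of the names of the scenes involved (required)
  • scene_tags: scene tag mapping (optional)
  • props: array of the names of the props involved (optional)
  • prop_tags: prop tag mapping (optional)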
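
Because the model's output is untrusted, it may help to check the required fields before anything is written to the database. The sketch below is one possible pre-storage check, assuming the flat array format described above; validate_parsed_data and REQUIRED_FIELDS are hypothetical names, and the required keys are copied from the field lists in this section.

```python
from typing import Any, Dict, List

# Required keys per section, taken from the field descriptions in this section.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "characters": ["name", "description"],
    "scenes": ["scene_number", "title", "description", "order_index"],
    "props": ["name", "description"],
    "character_tags": ["character_name", "tag_key", "tag_label", "description", "order_index"],
    "scene_tags": ["scene_name", "tag_key", "tag_label", "description", "order_index"],
    "prop_tags": ["prop_name", "tag_key", "tag_label", "description", "order_index"],
    "storyboards": ["title", "description", "order_index", "characters", "scenes"],
}


def validate_parsed_data(parsed: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for section, required in REQUIRED_FIELDS.items():
        items = parsed.get(section, [])
        if not isinstance(items, list):
            errors.append(f"{section}: expected a list")
            continue
        for index, item in enumerate(items):
            if not isinstance(item, dict):
                errors.append(f"{section}[{index}]: expected an object")
                continue
            for key in required:
                if key not in item:
                    errors.append(f"{section}[{index}]: missing '{key}'")
            # ai_confidence is optional, but must lie in [0.0, 1.0] when present.
            confidence = item.get("ai_confidence")
            if confidence is not None and not (
                isinstance(confidence, (int, float)) and 0.0 <= confidence <= 1.0
            ):
                errors.append(f"{section}[{index}]: ai_confidence out of range")
    return errors


sample = {
    "characters": [{"name": "张三"}],
    "character_tags": [{
        "character_name": "张三", "tag_key": "youth", "tag_label": "少年",
        "description": "15岁的张三", "order_index": 0, "ai_confidence": 1.5,
    }],
}
errors = validate_parsed_data(sample)
```

Collecting every problem instead of raising on the first one lets the job record a complete error_message in a single pass.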

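Automatic Data Storage Logic

1. Storing character data

Function: store_screenplay_characters()

Logic

  1. Iterate over the parsed_data['characters'] array
  2. Insert each character into screenplay_characters
  3. Return the character ID mapping (character name → character_id)

Code example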

async def store_screenplay_characters(
    screenplay_id: UUID,
    characters_data: List[Dict[str, Any]]
) -> Dict[str, UUID]:
    """
    Store character data in bulk.
    
    Returns:
        Mapping from character name to character_id
    """
    character_id_map = {}
    
    for char_data in characters_data:
        character = ScreenplayCharacter(
            screenplay_id=screenplay_id,
            name=char_data['name'],
            description=char_data['description'],
            role_type=char_data.get('role_type', 'supporting'),
            metadata=char_data.get('metadata', {})
        )
        
        created_character = await screenplay_repo.create_character(character)
        character_id_map[char_data['name']] = created_character.character_id
    
    return character_id_map

2. Storing scene data

Function: store_screenplay_scenes()

Logic

  1. Iterate over the parsed_data['scenes'] array
  2. Insert each scene into screenplay_scenes
  3. Return the scene ID mapping (scene title → scene_id)

Code example

async def store_screenplay_scenes(
    screenplay_id: UUID,
    scenes_data: List[Dict[str, Any]]
) -> Dict[str, UUID]:
    """
    Store scene data in bulk.
    
    Returns:
        Mapping from scene title to scene_id
    """
    scene_id_map = {}
    
    for scene_data in scenes_data:
        scene = ScreenplayScene(
            screenplay_id=screenplay_id,
            scene_number=scene_data['scene_number'],
            title=scene_data['title'],
            location=scene_data.get('location'),
            time_of_day=scene_data.get('time_of_day'),
            description=scene_data['description'],
            duration_estimate=scene_data.get('duration_estimate'),
            order_index=scene_data['order_index'],
            metadata=scene_data.get('metadata', {})
        )
        
        created_scene = await screenplay_repo.create_scene(scene)
        scene_id_map[scene_data['title']] = created_scene.scene_id
    
    return scene_id_map

3. Storing prop data

Function: store_screenplay_props()

Logic

  1. Iterate over the parsed_data['props'] array
  2. Insert each prop into screenplay_props
  3. Return the prop ID mapping (prop name → prop_id)

Code example

async def store_screenplay_props(
    screenplay_id: UUID,
    props_data: List[Dict[str, Any]]
) -> Dict[str, UUID]:
    """
    Store prop data in bulk.
    
    Returns:
        Mapping from prop name to prop_id
    """
    prop_id_map = {}
    
    for prop_data in props_data:
        prop = ScreenplayProp(
            screenplay_id=screenplay_id,
            name=prop_data['name'],
            description=prop_data['description'],
            category=prop_data.get('category'),
            importance=prop_data.get('importance', 'normal'),
            metadata=prop_data.get('metadata', {})
        )
        
        created_prop = await screenplay_repo.create_prop(prop)
        prop_id_map[prop_data['name']] = created_prop.prop_id
    
    return prop_id_map

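4. Storing tag data

Service: ScreenplayTagService.store_tags()

Logic

  1. Iterate over the parsed_data['character_tags'] dictionary, which is keyed by character name
  2. Look up the character_id for each character_name
  3. Bulk insert into screenplay_element_tags
  4. Set screenplay_characters.has_tags = true
  5. Do the same for scene tags and prop tags
  6. Return the tag ID mappings (used for storyboard association)

Code example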

# Call ScreenplayTagService
from app.services.screenplay_tag_service import ScreenplayTagService
tag_service = ScreenplayTagService(db)

tag_id_maps = await tag_service.store_tags(
    screenplay_id=screenplay_id,
    parsed_data=parsed_data,
    character_id_map=character_id_map,
    scene_id_map=scene_id_map,
    prop_id_map=prop_id_map
)

# Structure of the returned tag_id_maps
{
    'character_tags': {
        '张三-youth': UUID('019d1234-5678-7abc-def0-444444444444'),
        '张三-adult': UUID('019d1234-5678-7abc-def0-555555555555')
    },
    'scene_tags': {
        '花果山-daytime': UUID('019d1234-5678-7abc-def0-666666666666'),
        '花果山-night': UUID('019d1234-5678-7abc-def0-777777777777')
    },
    'prop_tags': {
        '金箍棒-new': UUID('019d1234-5678-7abc-def0-888888888888')
    }
}

ScreenplayTagService.store_tags() implementation

async def store_tags(
    self,
    screenplay_id: UUID,
    parsed_data: Dict[str, Any],
    character_id_map: Dict[str, UUID],
    scene_id_map: Dict[str, UUID],
    prop_id_map: Dict[str, UUID]
) -> Dict[str, Dict[str, UUID]]:
    """Store the tags parsed by the AI."""
    tag_id_maps = {
        'character_tags': {},
        'scene_tags': {},
        'prop_tags': {}
    }
    
    # 1. Store character tags
    for char_name, tags in parsed_data.get('character_tags', {}).items():
        character_id = character_id_map.get(char_name)
        if not character_id:
            continue
        
        for tag_data in tags:
            tag = await self.repository.create(ScreenplayElementTag(
                screenplay_id=screenplay_id,
                element_type=ElementType.CHARACTER,
                element_id=character_id,
                element_name=char_name,
                tag_key=tag_data['tag_key'],
                tag_label=tag_data['tag_label'],
                description=tag_data.get('description'),
                metadata=tag_data.get('metadata', {})
            ))
            
            map_key = f"{char_name}-{tag_data['tag_key']}"
            tag_id_maps['character_tags'][map_key] = tag.tag_id
        
        # Update the character's has_tags flag
        await self._update_element_has_tags(ElementType.CHARACTER, character_id, True)
    
    # 2. Store scene tags (same logic)
    # 3. Store prop tags (same logic)
    
    return tag_id_maps

Tag data structure returned by the AI (grouped by element name)

{
  "character_tags": {
    "张三": [
      {
        "tag_key": "youth",
        "tag_label": "少年",
        "description": "15岁的张三,穿着校服",
        "metadata": {"age": 15, "clothing": "校服"}
      },
      {
        "tag_key": "adult",
        "tag_label": "成年",
        "description": "30岁的张三,身穿西装",
        "metadata": {"age": 30, "clothing": "西装"}
      }
    ]
  },
  "scene_tags": {
    "花果山": [
      {
        "tag_key": "daytime",
        "tag_label": "白天",
        "description": "阳光明媚的花果山"
      },
      {
        "tag_key": "night",
        "tag_label": "夜晚",
        "description": "月光下的花果山"
      }
    ]
  },
  "prop_tags": {
    "金箍棒": [
      {
        "tag_key": "new",
        "tag_label": "崭新",
        "description": "刚打造的金箍棒"
      }
    ]
  }
}
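
The format specification lists tags as flat arrays in which every entry carries character_name, scene_name, or prop_name, whereas store_tags() iterates over tags grouped by element name, as in the structure above. If the model returns the flat form, a small adapter can bridge the two. The sketch below is one way to do that; group_tags_by_element and NAME_FIELDS are hypothetical names, not part of ScreenplayTagService.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Which field holds the element name in each flat tag array.
NAME_FIELDS = {
    "character_tags": "character_name",
    "scene_tags": "scene_name",
    "prop_tags": "prop_name",
}


def group_tags_by_element(parsed: Dict[str, Any]) -> Dict[str, Dict[str, List[Dict[str, Any]]]]:
    """Convert flat tag arrays into {section: {element name: [tags]}}.

    Sections that are already grouped (dicts) are copied through unchanged.
    Flat entries without a name field are skipped.
    """
    grouped: Dict[str, Dict[str, List[Dict[str, Any]]]] = {}
    for section, name_field in NAME_FIELDS.items():
        raw = parsed.get(section) or {}
        if isinstance(raw, dict):
            grouped[section] = {name: list(tags) for name, tags in raw.items()}
            continue
        by_name: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
        for entry in raw:
            tag = dict(entry)  # copy so the AI output is not mutated
            name = tag.pop(name_field, None)
            if name is not None:
                by_name[name].append(tag)
        grouped[section] = dict(by_name)
    return grouped


flat = {
    "character_tags": [
        {"character_name": "张三", "tag_key": "youth", "tag_label": "少年"},
        {"character_name": "张三", "tag_key": "adult", "tag_label": "成年"},
    ],
    "prop_tags": {"金箍棒": [{"tag_key": "new", "tag_label": "崭新"}]},
}
grouped = group_tags_by_element(flat)
```

The result can be merged back into parsed_data before store_tags() is called, so the service itself does not need to know which form the model produced.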

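Automatic Storyboard Association Logic

Core Principle

Storyboard data contains arrays of character, scene, and prop names together with tag mappings. These names are used to look up the corresponding database IDs, which are then stored as associations.

Association Steps

  1. Store the base storyboard data: insert into the storyboards table first to obtain storyboard_id
  2. Resolve character associations: use the storyboard's characters array and character_tags mapping to look up the matching character_id and element_tag_id
  3. Resolve scene associations: use the storyboard's scenes array and scene_tags mapping to look up the matching scene_id and element_tag_id
  4. Resolve prop associations: use the storyboard's props array and prop_tags mapping to look up the matching prop_id and element_tag_id
  5. Update the storyboard record: write the associated IDs into the corresponding fields of the storyboards table

Function Implementation

Function: store_storyboards_with_associations()

Code example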

async def store_storyboards_with_associations(
    screenplay_id: UUID,
    project_id: UUID,
    storyboards_data: List[Dict[str, Any]],
    character_id_map: Dict[str, UUID],
    scene_id_map: Dict[str, UUID],
    prop_id_map: Dict[str, UUID],
    tag_id_maps: Dict[str, Dict[str, UUID]]
) -> List[UUID]:
    """
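    Store storyboard data in bulk and build associations automatically.
    
    Args:
        screenplay_id: screenplay ID
        project_id: project ID
        storyboards_data: array of storyboard data returned by the AI
        character_id_map: character name → character_id mapping
        scene_id_map: scene name → scene_id mapping
        prop_id_map: prop name → prop_id mapping
        tag_id_maps: tag ID mappings
    
    Returns:
        List of the created storyboard IDs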
    """
    storyboard_ids = []
    
    for storyboard_data in storyboards_data:
        # 1. Resolve character associations
        character_ids = []
        character_tag_ids = []
        
        for char_name in storyboard_data.get('characters', []):
            character_id = character_id_map.get(char_name)
            if character_id:
                character_ids.append(character_id)
                
                # Check whether a tag was specified
                tag_key = storyboard_data.get('character_tags', {}).get(char_name)
                if tag_key:
                    map_key = f"{char_name}-{tag_key}"
                    tag_id = tag_id_maps['character_tags'].get(map_key)
                    if tag_id:
                        character_tag_ids.append(tag_id)
        
        # 2. Resolve scene associations
        scene_ids = []
        scene_tag_ids = []
        
        for scene_name in storyboard_data.get('scenes', []):
            scene_id = scene_id_map.get(scene_name)
            if scene_id:
                scene_ids.append(scene_id)
                
                # Check whether a tag was specified
                tag_key = storyboard_data.get('scene_tags', {}).get(scene_name)
                if tag_key:
                    map_key = f"{scene_name}-{tag_key}"
                    tag_id = tag_id_maps['scene_tags'].get(map_key)
                    if tag_id:
                        scene_tag_ids.append(tag_id)
        
        # 3. Resolve prop associations
        prop_ids = []
        prop_tag_ids = []
        
        for prop_name in storyboard_data.get('props', []):
            prop_id = prop_id_map.get(prop_name)
            if prop_id:
                prop_ids.append(prop_id)
                
                # Check whether a tag was specified
                tag_key = storyboard_data.get('prop_tags', {}).get(prop_name)
                if tag_key:
                    map_key = f"{prop_name}-{tag_key}"
                    tag_id = tag_id_maps['prop_tags'].get(map_key)
                    if tag_id:
                        prop_tag_ids.append(tag_id)
        
        # 4. Create the storyboard record
        storyboard = Storyboard(
            project_id=project_id,
            screenplay_id=screenplay_id,
            shot_number=storyboard_data['shot_number'],
            title=storyboard_data['title'],
            description=storyboard_data['description'],
            dialogue=storyboard_data.get('dialogue'),
            shot_size=storyboard_data.get('shot_size'),
            camera_movement=storyboard_data.get('camera_movement'),
            estimated_duration=storyboard_data.get('estimated_duration'),
            order_index=storyboard_data['order_index'],
            start_time=storyboard_data.get('start_time'),
            end_time=storyboard_data.get('end_time'),
            metadata=storyboard_data.get('metadata', {}),
            # Association fields
            screenplay_character_ids=character_ids,
            element_tag_ids=character_tag_ids + scene_tag_ids + prop_tag_ids,
            screenplay_scene_ids=scene_ids,
            screenplay_prop_ids=prop_ids
        )
        
        created_storyboard = await storyboard_repo.create(storyboard)
        storyboard_ids.append(created_storyboard.storyboard_id)
    
    return storyboard_ids

Association Logic Explained

1. Character association

Input

{
  "characters": ["张三", "李四"],
  "character_tags": {
    "张三": "youth",
    "李四": "adult"
  }
}

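Processing logic

  1. Iterate over the characters array
  2. Look up character_id in character_id_map
  3. Check the character_tags mapping for a tag key for that character
  4. If one exists, build the map key: "张三-youth"
  5. Look up element_tag_id in tag_id_maps['character_tags']
  6. Append character_id to the screenplay_character_ids array
  7. Append element_tag_id to the element_tag_ids array

Result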

screenplay_character_ids = [
    UUID('019d1234-5678-7abc-def0-111111111111'),  # 张三
    UUID('019d1234-5678-7abc-def0-222222222222')   # 李四
]
element_tag_ids = [
    UUID('019d1234-5678-7abc-def0-333333333333'),  # 张三-youth tag
    UUID('019d1234-5678-7abc-def0-444444444444')   # 李四-adult tag
]

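2. Scene association

Same logic as character association, but using scenes, scene_tags, scene_id_map, and tag_id_maps['scene_tags'].

3. Prop association

Same logic as character association, but using props, prop_tags, prop_id_map, and tag_id_maps['prop_tags'].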
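
Since the three lookups differ only in which arrays and maps they read, they can be expressed as one function. The sketch below is a minimal in-memory illustration of the steps above; resolve_associations and the sample UUIDs are hypothetical, and names that cannot be resolved are returned separately so the caller can log them.

```python
import uuid
from typing import Dict, List, Tuple


def resolve_associations(
    names: List[str],
    tag_keys: Dict[str, str],
    id_map: Dict[str, uuid.UUID],
    tag_id_map: Dict[str, uuid.UUID],
) -> Tuple[List[uuid.UUID], List[uuid.UUID], List[str]]:
    """Return (element IDs, tag IDs, names that were not found)."""
    element_ids: List[uuid.UUID] = []
    tag_ids: List[uuid.UUID] = []
    missing: List[str] = []
    for name in names:
        element_id = id_map.get(name)
        if element_id is None:
            missing.append(name)  # unknown element: skip it, do not abort
            continue
        element_ids.append(element_id)
        tag_key = tag_keys.get(name)
        if tag_key:
            # Tag IDs are keyed as "<element name>-<tag key>".
            tag_id = tag_id_map.get(f"{name}-{tag_key}")
            if tag_id is not None:
                tag_ids.append(tag_id)
    return element_ids, tag_ids, missing


zhang_id, youth_tag_id = uuid.uuid4(), uuid.uuid4()
ids, tags, missing = resolve_associations(
    names=["张三", "王五"],
    tag_keys={"张三": "youth"},
    id_map={"张三": zhang_id},
    tag_id_map={"张三-youth": youth_tag_id},
)
```

The same call works for scenes and props by passing scene_id_map with tag_id_maps['scene_tags'], or prop_id_map with tag_id_maps['prop_tags'].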

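Database Field Descriptions

Association fields in the storyboards table:

Field name                 Type     Description
screenplay_character_ids   UUID[]   Array of associated character IDs
screenplay_scene_ids       UUID[]   Array of associated scene IDs
screenplay_prop_ids        UUID[]   Array of associated prop IDs
element_tag_ids            UUID[]   Array of associated element tag IDs (character, scene, and prop tags)

Association Query Example

Query the characters and tags associated with each storyboard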

SELECT 
    s.storyboard_id,
    s.shot_number,
    s.title,
    c.character_id,
    c.name AS character_name,
    et.tag_id,
    et.tag_label,
    et.element_type
FROM storyboards s
LEFT JOIN LATERAL unnest(s.screenplay_character_ids) WITH ORDINALITY AS char_id(id, ord) ON true
LEFT JOIN screenplay_characters c ON c.character_id = char_id.id
-- A tag matches when it is listed on the storyboard and belongs to this character.
-- Joining on element_id (the column written by store_tags) avoids one row per
-- unrelated tag.
LEFT JOIN screenplay_element_tags et
    ON et.tag_id = ANY(s.element_tag_ids)
   AND et.element_type = 'character'  -- assumed stored value of ElementType.CHARACTER
   AND et.element_id = c.character_id
WHERE s.screenplay_id = '019d1234-5678-7abc-def0-987654321fed'
ORDER BY s.order_index, char_id.ord;

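Error Handling

1. AI call failure

Scenario: the AI model call times out, returns an error, or returns malformed output

Handling logic

  1. Catch the exception
  2. Set ai_jobs.status = 'failed'
  3. Record the error in ai_jobs.error_message
  4. Call the Credit Service to refund the credits
  5. Return the error to the user

Code example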

try:
    ai_response = await call_ai_model(model, prompt)
    parsed_data = json.loads(ai_response)
except Exception as e:
    # Mark the job as failed
    await ai_service.update_job(job_id, {
        'status': 'failed',
        'error_message': f'AI call failed: {str(e)}',
        'completed_at': datetime.utcnow()
    })
    
    # Refund the credits
    job = await ai_service.get_job(job_id)
    await credit_service.refund_credits(
        consumption_log_id=job.consumption_log_id,
        reason=f'AI call failed: {str(e)}'
    )
    
    raise
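
Models sometimes wrap their JSON in a Markdown code fence or add a sentence before it, which makes a plain json.loads() call fail even though the payload is usable. The sketch below shows one tolerant way to pull the object out before giving up; extract_json_object is a hypothetical helper, not part of the AI Service.

```python
import json
import re
from typing import Any, Dict

# Built from single backticks so this example can itself sit inside a fence.
_FENCE = "`" * 3
_FENCED_BLOCK = re.compile(_FENCE + r"(?:json)?\s*(.*?)" + _FENCE, re.DOTALL)


def extract_json_object(raw: str) -> Dict[str, Any]:
    """Parse the first JSON object found in a model response.

    Raises ValueError (json.JSONDecodeError is a subclass) when no object can
    be parsed, so callers can treat it like any other failed AI call.
    """
    match = _FENCED_BLOCK.search(raw)
    candidate = match.group(1) if match else raw
    # Trim any prose around the outermost braces.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model response")
    return json.loads(candidate[start:end + 1])


wrapped = "Here is the result:\n" + _FENCE + "json\n{\"characters\": []}\n" + _FENCE
parsed = extract_json_object(wrapped)
```

Anything that still fails to parse raises ValueError and falls through to the failure path above, so the credits are refunded as before.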

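2. Data storage failure

Scenario: a database write fails or data validation fails

Handling logic

  1. Use a database transaction
  2. If any step fails, roll back every operation
  3. Set ai_jobs.status = 'failed'
  4. Refund the credits
  5. Return the error to the user

Code example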

async def store_screenplay_elements(
    screenplay_id: UUID,
    parsed_data: Dict[str, Any]
) -> Dict[str, Any]:
    """
    Store screenplay elements inside a single transaction.
    """
    from app.services.screenplay_tag_service import ScreenplayTagService

    try:
        # db.begin() commits when the block exits normally and rolls back when
        # an exception escapes it, so no explicit commit()/rollback() is needed.
        async with db.begin():
            # project_id is required by the storyboards; it is assumed here to
            # come from the screenplay record.
            screenplay = await screenplay_repo.get_by_id(screenplay_id)
            project_id = screenplay.project_id

            # 1. Store characters
            character_id_map = await store_screenplay_characters(
                screenplay_id, 
                parsed_data['characters']
            )
            
            # 2. Store scenes
            scene_id_map = await store_screenplay_scenes(
                screenplay_id, 
                parsed_data['scenes']
            )
            
            # 3. Store props
            prop_id_map = await store_screenplay_props(
                screenplay_id, 
                parsed_data['props']
            )
            
            # 4. Store tags (via ScreenplayTagService)
            tag_service = ScreenplayTagService(db)
            tag_id_maps = await tag_service.store_tags(
                screenplay_id=screenplay_id,
                parsed_data=parsed_data,
                character_id_map=character_id_map,
                scene_id_map=scene_id_map,
                prop_id_map=prop_id_map
            )
            
            # 5. Store storyboards
            storyboard_ids = await store_storyboards_with_associations(
                screenplay_id,
                project_id,
                parsed_data['storyboards'],
                character_id_map,
                scene_id_map,
                prop_id_map,
                tag_id_maps
            )
    except Exception as e:
        # The transaction has already been rolled back at this point
        raise Exception(f'Failed to store data: {str(e)}') from e

    return {
        'character_ids': list(character_id_map.values()),
        'scene_ids': list(scene_id_map.values()),
        'prop_ids': list(prop_id_map.values()),
        'storyboard_ids': storyboard_ids
    }

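3. Association failure

Scenario: a character, scene, or prop name referenced by a storyboard cannot be found in the database

Handling logic

  1. Log a warning
  2. Skip that association (without blocking the rest of the flow)
  3. Record the elements that were not found in storyboards.metadata

Code example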

# While resolving character associations
for char_name in storyboard_data.get('characters', []):
    character_id = character_id_map.get(char_name)
    if character_id:
        character_ids.append(character_id)
    else:
        # Log a warning
        logger.warning(f'Storyboard {storyboard_data["shot_number"]} references unknown character "{char_name}"')
        
        # Record the miss in metadata. setdefault also covers shots that
        # arrive without a metadata key, since that field is optional.
        missing = storyboard_data.setdefault('metadata', {}).setdefault('missing_associations', {})
        missing.setdefault('characters', []).append(char_name)

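4. Insufficient credits

Scenario: the user does not have enough credits to pay for AI parsing

Handling logic

  1. Check during the pre-deduction stage
  2. If credits are insufficient, raise InsufficientCreditsError
  3. Return HTTP 402 Payment Required
  4. Prompt the user to top up

Code example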

# Credit Service
async def consume_credits(
    self,
    user_id: UUID,
    amount: int,
    feature_type: str,
    task_params: Dict[str, Any]
) -> CreditConsumptionLog:
    # Check the credit balance
    user_credits = await self.get_user_credits(user_id)
    
    if user_credits.available_credits < amount:
        raise InsufficientCreditsError(
            f'Insufficient credits. Required: {amount}, currently available: {user_credits.available_credits}'
        )
    
    # Pre-deduct the credits
    # ...
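
Steps 3 and 4 of the handling logic (returning HTTP 402 and prompting a top-up) are not shown above. A framework-neutral sketch of that mapping might look like the following. The response body fields (`error`, `required`, `available`, `action`) and the function name `to_http_response` are hypothetical, and the exception here carries the two amounts as attributes, which is an assumption rather than the signature used by the Credit Service.

```python
from typing import Any, Dict, Tuple


class InsufficientCreditsError(Exception):
    """Stand-in for the Credit Service exception (constructor is an assumption)."""

    def __init__(self, required: int, available: int) -> None:
        super().__init__(
            f"Insufficient credits. Required: {required}, available: {available}"
        )
        self.required = required
        self.available = available


def to_http_response(exc: Exception) -> Tuple[int, Dict[str, Any]]:
    """Translate a raised exception into an (HTTP status, JSON body) pair."""
    if isinstance(exc, InsufficientCreditsError):
        return 402, {
            "error": "insufficient_credits",
            "message": str(exc),
            "required": exc.required,
            "available": exc.available,
            # Hint for the client to show a top-up prompt
            "action": "top_up",
        }
    # Anything else is reported as a generic server error
    return 500, {"error": "internal_error", "message": "Unexpected error"}
```

In a web framework this function would typically be registered as an exception handler, so that route code can simply let the exception propagate.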

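5. Concurrency conflicts

Scenario: AI parsing is triggered more than once for the same screenplay.

Handling logic

  1. Add a parsing_status field to the screenplays table (idle/parsing/completed/failed)
  2. Check the status before starting a parse
  3. If the status is parsing, return an error
  4. Use a database row lock (SELECT FOR UPDATE) to prevent concurrent runs

Code example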

async def parse_screenplay(
    self,
    user_id: UUID,
    screenplay_id: UUID,
    **kwargs
) -> Dict[str, Any]:
    # 1. Load the screenplay under a row lock
    async with db.begin():
        result = await db.execute(
            select(Screenplay)
            .where(Screenplay.screenplay_id == screenplay_id)
            .with_for_update()
        )
        screenplay = result.scalar_one_or_none()
        
        if not screenplay:
            raise ScreenplayNotFoundError()
        
        # 2. Check the parsing status
        if screenplay.parsing_status == 'parsing':
            raise ScreenplayParsingInProgressError('This screenplay is already being parsed. Please try again later.')
        
        # 3. Mark as parsing. Leaving the db.begin() block commits the change
        #    and releases the row lock, so no explicit commit is needed here.
        screenplay.parsing_status = 'parsing'
    
    try:
        # Run the parsing logic
        # ...
        
        # Mark as completed
        await db.execute(
            update(Screenplay)
            .where(Screenplay.screenplay_id == screenplay_id)
            .values(parsing_status='completed')
        )
        await db.commit()
        
    except Exception:
        # Discard any half-finished work, then mark as failed
        await db.rollback()
        await db.execute(
            update(Screenplay)
            .where(Screenplay.screenplay_id == screenplay_id)
            .values(parsing_status='failed')
        )
        await db.commit()
        raise
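
As an alternative to SELECT FOR UPDATE, the status check and the status change can be folded into one conditional UPDATE (a compare-and-set), so that only one caller can move a screenplay into the parsing state. This is a different technique from the row lock described above, shown here only as a sketch. It uses Python's built-in sqlite3 with a simplified one-table schema, and `try_start_parsing` and `finish_parsing` are hypothetical names.

```python
import sqlite3


def try_start_parsing(conn: sqlite3.Connection, screenplay_id: str) -> bool:
    """Atomically claim the screenplay; return False if a parse is already running."""
    cur = conn.execute(
        "UPDATE screenplays SET parsing_status = 'parsing' "
        "WHERE screenplay_id = ? AND parsing_status != 'parsing'",
        (screenplay_id,),
    )
    conn.commit()
    # Exactly one row changes only when the screenplay exists and was not parsing
    return cur.rowcount == 1


def finish_parsing(conn: sqlite3.Connection, screenplay_id: str, succeeded: bool) -> None:
    """Record the final state once the parse has ended."""
    status = "completed" if succeeded else "failed"
    conn.execute(
        "UPDATE screenplays SET parsing_status = ? WHERE screenplay_id = ?",
        (status, screenplay_id),
    )
    conn.commit()
```

A caller that receives `False` from `try_start_parsing` would respond with the same "already being parsed" error as in the locking version. Note that a `False` result does not distinguish a missing screenplay from one that is busy, so the existence check still has to happen separately.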

Example code

Complete end-to-end example

Scenario: a user uploads a screenplay and triggers AI parsing

1. The user uploads a screenplay

Request

POST /api/v1/screenplays
Content-Type: application/json
Authorization: Bearer <token>

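{
  "project_id": "019d1234-5678-7abc-def0-987654321fed",
  "name": "Episode 1 screenplay",
  "type": "text",
  "content": "Scene 1 Coffee shop Afternoon\n\n张三 sits in the coffee shop, looking out the window.\n\n张三: Another ordinary afternoon...\n\n(Flashback)\n\nScene 2 School playground Morning\n\nA teenage 张三 runs across the playground."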
}

Response

{
  "screenplay_id": "019d1234-5678-7abc-def0-111111111111",
  "project_id": "019d1234-5678-7abc-def0-987654321fed",
  "name": "Episode 1 screenplay",
  "type": "text",
  "parsing_status": "idle",
  "created_at": "2026-01-19T10:00:00Z"
}

2. The user triggers AI parsing

Request

POST /api/v1/screenplays/019d1234-5678-7abc-def0-111111111111/parse
Content-Type: application/json
Authorization: Bearer <token>

{
  "auto_create_elements": true,
  "auto_create_tags": true,
  "auto_create_storyboards": true,
  "model": "gpt-4"
}

Response

{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "task_id": "abc123-def456-ghi789",
  "status": "pending",
  "estimated_credits": 50,
  "message": "AI parsing job submitted. Query the job status for the result."
}

3. Query the job status

Request

GET /api/v1/ai/jobs/019d1234-5678-7abc-def0-222222222222
Authorization: Bearer <token>

Response (processing)

{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "job_type": "text_processing",
  "status": "processing",
  "progress": 50,
  "started_at": "2026-01-19T10:00:05Z",
  "message": "Parsing the screenplay..."
}

Response (completed)

{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "job_type": "text_processing",
  "status": "completed",
  "progress": 100,
  "started_at": "2026-01-19T10:00:05Z",
  "completed_at": "2026-01-19T10:00:30Z",
  "credits_consumed": 45,
  "output_data": {
    "characters_count": 2,
    "scenes_count": 2,
    "props_count": 1,
    "tags_count": 2,
    "storyboards_count": 2
  },
  "message": "Parsing completed"
}
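
Because parsing runs asynchronously, a client has to poll the job endpoint until the job reaches a final state. The loop below is a minimal sketch of that. `fetch_job` stands for whatever function performs the GET request above and returns the decoded JSON; it, the function name `wait_for_job`, the default interval and timeout, and the `failed` status value are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict

# "completed" appears in the responses above; "failed" is assumed here
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal status or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish within {timeout_seconds}s")
        # Wait before asking again to avoid hammering the API
        sleep(interval_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real waiting. Once the returned job has status `completed`, the client can go on to query the parsing result.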

4. Query the parsing result

Request

GET /api/v1/screenplays/019d1234-5678-7abc-def0-111111111111/elements
Authorization: Bearer <token>

Response

{
  "screenplay_id": "019d1234-5678-7abc-def0-111111111111",
  "characters": [
    {
      "character_id": "019d1234-5678-7abc-def0-333333333333",
      "name": "张三",
      "description": "Male lead, 30 years old, programmer",
      "role_type": "main",
      "has_tags": true,
      "tags": [
        {
          "tag_id": "019d1234-5678-7abc-def0-444444444444",
          "tag_key": "youth",
          "tag_label": "Youth",
          "description": "张三 at 15, still in high school"
        },
        {
          "tag_id": "019d1234-5678-7abc-def0-555555555555",
          "tag_key": "adult",
          "tag_label": "Adult",
          "description": "张三 at 30, as he is now"
        }
      ]
    }
  ],
  "scenes": [
    {
      "scene_id": "019d1234-5678-7abc-def0-666666666666",
      "scene_number": 1,
      "title": "Coffee shop",
      "location": "Starbucks, city center",
      "time_of_day": "afternoon",
      "description": "A cozy coffee shop with sunlight streaming in through floor-to-ceiling windows"
    },
    {
      "scene_id": "019d1234-5678-7abc-def0-777777777777",
      "scene_number": 2,
      "title": "School playground",
      "location": "A high school playground",
      "time_of_day": "morning",
      "description": "A wide playground where students are having PE class"
    }
  ],
  "props": [
    {
      "prop_id": "019d1234-5678-7abc-def0-888888888888",
      "name": "Laptop",
      "description": "张三's work computer",
      "category": "Electronics",
      "importance": "normal"
    }
  ]
}

5. Query the storyboard list

Request

GET /api/v1/projects/019d1234-5678-7abc-def0-987654321fed/storyboards?screenplay_id=019d1234-5678-7abc-def0-111111111111
Authorization: Bearer <token>

Response

{
  "total": 2,
  "items": [
    {
      "storyboard_id": "019d1234-5678-7abc-def0-999999999999",
      "shot_number": "001",
      "title": "Opening",
      "description": "张三 sits in the coffee shop, looking out the window",
      "dialogue": "张三: Another ordinary afternoon...",
      "shot_size": "medium_shot",
      "camera_movement": "static",
      "estimated_duration": 5.5,
      "characters": [
        {
          "character_id": "019d1234-5678-7abc-def0-333333333333",
          "name": "张三",
          "tag": {
            "tag_id": "019d1234-5678-7abc-def0-555555555555",
            "tag_label": "Adult"
          }
        }
      ],
      "scenes": [
        {
          "scene_id": "019d1234-5678-7abc-def0-666666666666",
          "title": "Coffee shop"
        }
      ],
      "props": [
        {
          "prop_id": "019d1234-5678-7abc-def0-888888888888",
          "name": "Laptop"
        }
      ]
    },
    {
      "storyboard_id": "019d1234-5678-7abc-def0-aaaaaaaaaaaa",
      "shot_number": "002",
      "title": "Flashback",
      "description": "A teenage 张三 runs across the school playground",
      "dialogue": null,
      "shot_size": "wide_shot",
      "camera_movement": "tracking",
      "estimated_duration": 8.0,
      "characters": [
        {
          "character_id": "019d1234-5678-7abc-def0-333333333333",
          "name": "张三",
          "tag": {
            "tag_id": "019d1234-5678-7abc-def0-444444444444",
            "tag_label": "Youth"
          }
        }
      ],
      "scenes": [
        {
          "scene_id": "019d1234-5678-7abc-def0-777777777777",
          "title": "School playground"
        }
      ],
      "props": []
    }
  ]
}

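Summary

Workflow characteristics

  1. Fully automated: from AI parsing to data storage, no manual intervention is needed
  2. Transactional guarantees: database transactions keep the stored data consistent
  3. Asynchronous processing: Celery async tasks keep user requests from blocking
  4. Credit management: pre-deduction followed by confirmation or refund
  5. Automatic association: storyboards are linked to elements by name
  6. Tag support: multi-tag recognition and association for characters, scenes, and props
  7. Error handling: storage failures are rolled back, and missing associations are logged and recorded without aborting the run

Key technical points

  • UUID v7: used as the primary key; sortable by creation time
  • PostgreSQL arrays: UUID[] array columns store many-to-many associations
  • JSON fields: JSONB stores metadata and AI output
  • Database transactions: ensure data consistency
  • Celery async tasks: handle the time-consuming AI calls
  • Credit system: pre-deduction plus a confirm/refund mechanism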
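
The pre-deduction plus confirm/refund mechanism can be sketched in memory as follows. This is an illustration only, not the Credit Service implementation: the class `CreditAccount`, its method names, and the rule that the confirmed amount may not exceed the reserved amount are all assumptions.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterator


class InsufficientCreditsError(Exception):
    """Raised when a reservation exceeds the available balance."""


@dataclass
class CreditAccount:
    available_credits: int
    _holds: Dict[int, int] = field(default_factory=dict, repr=False)
    _ids: Iterator[int] = field(default_factory=itertools.count, repr=False)

    def reserve(self, amount: int) -> int:
        """Pre-deduct the estimated cost and return a hold ID."""
        if self.available_credits < amount:
            raise InsufficientCreditsError(
                f"Required: {amount}, available: {self.available_credits}"
            )
        self.available_credits -= amount
        hold_id = next(self._ids)
        self._holds[hold_id] = amount
        return hold_id

    def confirm(self, hold_id: int, actual_amount: int) -> int:
        """Settle a hold at the actual cost; return the refunded difference."""
        reserved = self._holds[hold_id]
        if not 0 <= actual_amount <= reserved:
            raise ValueError("actual_amount must be between 0 and the reserved amount")
        del self._holds[hold_id]
        refund = reserved - actual_amount
        self.available_credits += refund
        return refund

    def refund(self, hold_id: int) -> None:
        """Release a hold in full, for example after a failed parse."""
        self.available_credits += self._holds.pop(hold_id)
```

With the numbers from the end-to-end example, reserving the estimated 50 credits and confirming the 45 actually consumed returns 5 credits to the user.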

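Future optimization directions

  1. Incremental parsing: parse only part of a screenplay
  2. Manual correction: let users adjust AI recognition results by hand
  3. Batch parsing: parse multiple screenplays in one request
  4. Parsing templates: support custom AI prompt templates
  5. Multi-model comparison: call several AI models at once and compare their results
  6. Parsing history: keep a record of every parse to support version comparison

End of document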