# AI Screenplay Parsing Workflow

> **Document version**: v1.1
> **Last updated**: 2026-01-30

---

## Table of Contents

1. [Workflow Overview](#workflow-overview)
2. [Complete Flow Diagram](#complete-flow-diagram)
3. [Detailed Steps](#detailed-steps)
4. [AI Output Format Specification](#ai-output-format-specification)
5. [Automatic Data Storage Logic](#automatic-data-storage-logic)
6. [Automatic Storyboard Association Logic](#automatic-storyboard-association-logic)
7. [Error Handling](#error-handling)
8. [Example Code](#example-code)

---

## Workflow Overview

The AI screenplay parsing workflow is one of the system's core features: it automatically converts screenplay text into structured data.

### Core goals

1. **Extract screenplay elements**: the AI identifies and extracts characters, scenes, and props
2. **Identify tags**: the AI identifies multiple tags per character/scene/prop (age group, era, condition, and so on)
3. **Break down storyboards**: the AI splits the screenplay into storyboard shots
4. **Build associations**: storyboards are linked to characters/scenes/props automatically
5. **Persist data**: all recognized results are stored in the database automatically

### Services involved

- **AI Service**: calls the AI model to perform the parsing
- **Screenplay Service**: manages screenplays and screenplay elements
- **Screenplay Tag Service**: manages tags (new)
- **Storyboard Service**: manages storyboards
- **Credit Service**: manages credit deduction

### Tables involved

- `screenplays`: screenplays
- `screenplay_characters`: screenplay characters
- `screenplay_scenes`: screenplay scenes
- `screenplay_props`: screenplay props
- `screenplay_element_tags`: screenplay element tags (variant tags for characters/scenes/props, managed in one table)
- `storyboards`: storyboards
- `project_resources`: project assets (linked to tags; stores the denormalized fields `element_name` and `tag_label`)
- `ai_jobs`: AI jobs
- `credit_consumption_logs`: credit consumption records

---

## Complete Flow Diagram

```
┌────────────────────────────────────────────────────────────┐
│  User uploads or creates a screenplay                      │
│  (screenplays table)                                       │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  User triggers AI screenplay parsing                       │
│  POST /api/v1/screenplays/{id}/parse                       │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  1. Check that the user has enough credits                 │
│  (Credit Service)                                          │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  2. Pre-deduct credits and create the AI job               │
│  (credit_consumption_logs + ai_jobs tables)                │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  3. Submit the asynchronous task to Celery                 │
│  (parse_screenplay_task)                                   │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  4. Celery worker calls the AI model                       │
│  (GPT-4 / Claude / Gemini / ERNIE Bot)                     │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  5. AI returns structured JSON data                        │
│  {characters, scenes, props, tags, storyboards}            │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  6. Store character data automatically                     │
│  Bulk insert into screenplay_characters                    │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  7. Store scene data automatically                         │
│  Bulk insert into screenplay_scenes                        │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  8. Store prop data automatically                          │
│  Bulk insert into screenplay_props                         │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  9. Store tag data automatically                           │
│  Call ScreenplayTagService.store_tags()                    │
│  Bulk insert into screenplay_element_tags                  │
│  Set has_tags = true on the affected elements              │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  10. Create storyboard records automatically               │
│  Bulk insert into storyboards                              │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  11. Link storyboards to elements automatically            │
│  Match character/scene/prop names in each storyboard       │
│  (via screenplay_character_ids and similar fields)         │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  12. Mark the AI job as completed                          │
│  Confirm credit consumption                                │
└────────────────────────┬───────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────┐
│  13. Return the parsing result to the user                 │
│  {characters, scenes, props, tags, storyboards}            │
└────────────────────────────────────────────────────────────┘
```

---

## Detailed Steps

### Step 1: User uploads or creates a screenplay

**Action**:
- The user uploads a screenplay file through the API, or enters the screenplay text directly
- The system creates a record in the `screenplays` table

**API**:
```
POST /api/v1/screenplays
```

**Request body**:
```json
{
  "project_id": "019d1234-5678-7abc-def0-987654321fed",
  "name": "Episode 1 screenplay",
  "type": "text",
  "content": "Screenplay content..."
}
```

---

### Step 2: User triggers AI parsing

**Action**:
- The user clicks the "AI parse screenplay" button
- The frontend calls the AI parsing API

**API**:
```
POST /api/v1/screenplays/{screenplay_id}/parse
```

**Request body**:
```json
{
  "auto_create_elements": true,
  "auto_create_tags": true,
  "auto_create_storyboards": true,
  "model": "gpt-4"
}
```

---

### Step 3-4: Credit check and job creation

**Action**:
1. AI Service calls Credit Service to check the user's credits
2. Credits are pre-deducted (for example, 50 credits)
3. A record is created in the `ai_jobs` table
4. A record is created in the `credit_consumption_logs` table
5. The asynchronous task is submitted to Celery

**Code example**:
```python
from typing import Any, Dict
from uuid import UUID

# AI Service
async def parse_screenplay(
    self,
    user_id: UUID,
    screenplay_id: UUID,
    auto_create_elements: bool = True,
    auto_create_tags: bool = True,
    auto_create_storyboards: bool = True,
    model: str = "gpt-4"
) -> Dict[str, Any]:
    # 1. Load the screenplay
    screenplay = await self.screenplay_repo.get_by_id(screenplay_id)

    # 2. Calculate the credits required
    model_config = await self.model_repo.get_by_name(model)
    credits_needed = model_config.credits_per_unit * 10  # assume parsing a screenplay costs 10 units

    # 3. Pre-deduct credits
    consumption_log = await self.credit_service.consume_credits(
        user_id=user_id,
        amount=credits_needed,
        feature_type='screenplay_parse',
        task_params={
            'screenplay_id': str(screenplay_id),
            'model': model
        }
    )

    # 4. Create the AI job
    job = await self.job_repo.create({
        'job_type': 'text_processing',
        'status': 'pending',
        'user_id': user_id,
        'project_id': screenplay.project_id,
        'model_id': model_config.model_id,
        'model_name': model,
        'consumption_log_id': consumption_log.consumption_id,
        'input_data': {
            'screenplay_id': str(screenplay_id),
            'screenplay_content': screenplay.content,
            'auto_create_elements': auto_create_elements,
            'auto_create_tags': auto_create_tags,
            'auto_create_storyboards': auto_create_storyboards
        }
    })

    # 5. Link the consumption log to the job
    consumption_log.ai_job_id = job.ai_job_id
    await self.db.commit()

    # 6. Submit the asynchronous task
    task = parse_screenplay_task.delay(
        job_id=str(job.ai_job_id),
        screenplay_id=str(screenplay_id),
        screenplay_content=screenplay.content,
        model=model,
        auto_create_elements=auto_create_elements,
        auto_create_tags=auto_create_tags,
        auto_create_storyboards=auto_create_storyboards
    )

    return {
        'job_id': str(job.ai_job_id),
        'task_id': task.id,
        'status': 'pending'
    }
```

---

### Step 5: Celery worker calls the AI model

**Action**:
- The Celery worker receives the task
- It calls the AI model (GPT-4 / Claude / Gemini, etc.)
- The AI model analyzes the screenplay and returns structured JSON data

**Celery Task**:
```python
import asyncio
import json
from datetime import datetime

@celery_app.task
def parse_screenplay_task(
    job_id: str,
    screenplay_id: str,
    screenplay_content: str,
    model: str,
    auto_create_elements: bool,
    auto_create_tags: bool,
    auto_create_storyboards: bool
):
    # Celery calls tasks synchronously, so the async pipeline is run to completion here.
    asyncio.run(_run_parse_job(
        job_id, screenplay_id, screenplay_content, model,
        auto_create_elements, auto_create_tags, auto_create_storyboards
    ))


async def _run_parse_job(
    job_id: str,
    screenplay_id: str,
    screenplay_content: str,
    model: str,
    auto_create_elements: bool,
    auto_create_tags: bool,
    auto_create_storyboards: bool
):
    # Assumes the service methods and store_* helpers are coroutines,
    # as in the error-handling examples (the helper name is illustrative).
    try:
        # 1. Mark the job as processing
        await ai_service.update_job(job_id, {'status': 'processing', 'started_at': datetime.utcnow()})

        # 2. Build the AI prompt
        prompt = build_screenplay_parse_prompt(screenplay_content)

        # 3. Call the AI model
        ai_response = await call_ai_model(model, prompt)

        # 4. Parse the JSON returned by the AI
        parsed_data = json.loads(ai_response)

        # 5. Store the data automatically (if enabled)
        if auto_create_elements:
            await store_screenplay_elements(screenplay_id, parsed_data)

        if auto_create_tags:
            await store_screenplay_tags(screenplay_id, parsed_data)

        if auto_create_storyboards:
            await store_storyboards(screenplay_id, parsed_data)

        # 6. Mark the job as completed
        await ai_service.update_job(job_id, {
            'status': 'completed',
            'completed_at': datetime.utcnow(),
            'output_data': parsed_data
        })

        # 7. Confirm credit consumption
        job = await ai_service.get_job(job_id)
        await credit_service.confirm_consumption(job.consumption_log_id)

    except Exception as e:
        # The job failed: record the error and refund the credits
        await ai_service.update_job(job_id, {
            'status': 'failed',
            'error_message': str(e)
        })
        job = await ai_service.get_job(job_id)
        await credit_service.refund_credits(job.consumption_log_id, reason=str(e))
```

---

## AI Output Format Specification

The AI model must return JSON in the following format:

```json
{
  "characters": [
    {
      "name": "张三",
      "description": "Male lead, 30 years old, programmer",
      "role_type": "main",
      "metadata": {
        "age": 30,
        "gender": "male",
        "occupation": "programmer",
        "personality": "introverted, smart, kind"
      }
    },
    {
      "name": "李四",
      "description": "Female lead, 28 years old, designer",
      "role_type": "main",
      "metadata": {
        "age": 28,
        "gender": "female",
        "occupation": "designer",
        "personality": "outgoing, warm, optimistic"
      }
    }
  ],
  "scenes": [
    {
      "scene_number": 1,
      "title": "Cafe",
      "location": "Downtown Starbucks",
      "time_of_day": "afternoon",
      "description": "A cozy cafe with sunlight streaming through the floor-to-ceiling windows",
      "duration_estimate": 120.0,
      "order_index": 0,
      "metadata": { "atmosphere": "cozy, romantic", "weather": "sunny" }
    },
    {
      "scene_number": 2,
      "title": "School playground",
      "location": "A high school playground",
      "time_of_day": "morning",
      "description": "A wide playground where students are having PE class",
      "order_index": 1
    }
  ],
  "props": [
    {
      "name": "Laptop",
      "description": "张三's work laptop",
      "category": "electronics",
      "importance": "normal",
      "metadata": { "brand": "MacBook Pro", "color": "silver" }
    },
    {
      "name": "Ancient sword",
      "description": "A legendary sword, key to the plot",
      "category": "weapon",
      "importance": "key",
      "metadata": { "material": "dark iron", "special_ability": "can cut through anything" }
    }
  ],
  "character_tags": [
    {
      "character_name": "张三",
      "tag_key": "youth",
      "tag_label": "Youth",
      "description": "张三 at 15, still in high school",
      "order_index": 0,
      "ai_confidence": 0.95,
      "ai_context": "Scene 3 of the screenplay: flashback, 张三 recalls his high school days...",
      "metadata": { "age_range": "13-17", "key_features": ["short hair", "school uniform", "innocent"] }
    },
    {
      "character_name": "张三",
      "tag_key": "adult",
      "tag_label": "Adult",
      "description": "张三 at 30, as he is now",
      "order_index": 1,
      "ai_confidence": 1.0,
      "ai_context": "Main storyline of the screenplay",
      "metadata": { "age_range": "30", "key_features": ["mature", "suit", "glasses"] }
    }
  ],
  "scene_tags": [
    {
      "scene_name": "Cafe",
      "tag_key": "era_1990",
      "tag_label": "1990s",
      "description": "A 1990s cafe with retro decor",
      "order_index": 0,
      "ai_confidence": 0.92,
      "ai_context": "Scene 10 of the screenplay: flashback, a cafe in the 1990s...",
      "metadata": { "time_period": "1990-1999", "key_features": ["neon lights", "old-fashioned phone booth", "cassette player"] }
    }
  ],
  "prop_tags": [
    {
      "prop_name": "Ancient sword",
      "tag_key": "damaged",
      "tag_label": "Damaged",
      "description": "A rust-covered ancient sword",
      "order_index": 0,
      "ai_confidence": 0.88,
      "ai_context": "Scene 5 of the screenplay: the ancient sword is already damaged when it is found...",
      "metadata": { "condition": "damaged", "key_features": ["rust", "curled blade"] }
    },
    {
      "prop_name": "Ancient sword",
      "tag_key": "restored",
      "tag_label": "Restored",
      "description": "The reforged ancient sword, sharp as new",
      "order_index": 1,
      "ai_confidence": 0.90,
      "ai_context": "Scene 15 of the screenplay: the ancient sword is reforged...",
      "metadata": { "condition": "intact", "key_features": ["sharp", "gleaming"] }
    }
  ],
  "storyboards": [
    {
      "shot_number": "001",
      "title": "Opening",
      "description": "张三 sits in the cafe, looking out the window",
      "dialogue": "张三: Another ordinary afternoon...",
      "shot_size": "medium_shot",
      "camera_movement": "static",
      "estimated_duration": 5.5,
      "order_index": 0,
      "start_time": 0.0,
      "end_time": 5.5,
      "metadata": { "lighting": "natural light", "weather": "sunny", "time_of_day": "afternoon" },
      "characters": ["张三"],
      "character_tags": { "张三": "adult" },
      "scenes": ["Cafe"],
      "scene_tags": {},
      "props": ["Laptop"],
      "prop_tags": {}
    },
    {
      "shot_number": "002",
      "title": "Flashback",
      "description": "Teenage 张三 runs across the school playground",
      "dialogue": null,
      "shot_size": "wide_shot",
      "camera_movement": "tracking",
      "estimated_duration": 8.0,
      "order_index": 1,
      "start_time": 5.5,
      "end_time": 13.5,
      "metadata": { "lighting": "sunlight", "weather": "sunny", "time_of_day": "morning" },
      "characters": ["张三"],
      "character_tags": { "张三": "youth" },
      "scenes": ["School playground"],
      "scene_tags": {},
      "props": [],
      "prop_tags": {}
    }
  ]
}
```

### Field descriptions

#### `characters` array

- `name`: character name (required)
- `description`: character description (required)
- `role_type`: character type (main/supporting/extra)
- `metadata`: additional metadata (optional)

#### `scenes` array

- `scene_number`: scene number (required)
- `title`: scene title (required)
- `location`: scene location (optional)
- `time_of_day`: time of day (dawn/morning/noon/afternoon/dusk/night)
- `description`: scene description (required)
- `duration_estimate`: estimated duration (seconds)
- `order_index`: sort index (required)
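- `metadata`: additional metadata (optional)

#### `props` array

- `name`: prop name (required)
- `description`: prop description (required)
- `category`: prop category (optional)
- `importance`: importance (key/normal/background)
- `metadata`: additional metadata (optional)

#### `character_tags` array

- `character_name`: character name (required, used for association)
- `tag_key`: tag identifier (required, e.g. youth/adult/elder)
- `tag_label`: tag display name (required)
- `description`: tag description (required)
- `order_index`: sort index (required)
- `ai_confidence`: AI recognition confidence (0.0-1.0)
- `ai_context`: context the AI based the tag on (excerpt from the screenplay)
- `metadata`: additional metadata (optional)

#### `scene_tags` array

- Same fields as `character_tags`, with `scene_name` in place of `character_name`

#### `prop_tags` array

- Same fields as `character_tags`, with `prop_name` in place of `character_name`

#### `storyboards` array

- `shot_number`: shot number (auto-generated, e.g. "001")
- `title`: storyboard title (required)
- `description`: storyboard description (required)
- `dialogue`: dialogue (optional)
- `shot_size`: shot size (optional)
- `camera_movement`: camera movement (optional)
- `estimated_duration`: estimated duration (seconds)
- `order_index`: sort index (required)
- `start_time`: start time (seconds)
- `end_time`: end time (seconds)
- `metadata`: additional metadata (optional)
- `characters`: array of character names involved (required)
- `character_tags`: character tag mapping (optional, format: {"character name": "tag key"})
- `scenes`: array of scene names involved (required)
- `scene_tags`: scene tag mapping (optional)
- `props`: array of prop names involved (optional)
- `prop_tags`: prop tag mapping (optional)

---

## Automatic Data Storage Logic

### 1. Store character data

**Function**: `store_screenplay_characters()`

**Logic**:
1. Iterate over the `parsed_data['characters']` array
2. Insert each character into the `screenplay_characters` table
3. Return the character ID map (character name → character_id)

**Code example**:
```python
from typing import Any, Dict, List
from uuid import UUID

async def store_screenplay_characters(
    screenplay_id: UUID,
    characters_data: List[Dict[str, Any]]
) -> Dict[str, UUID]:
    """
    Store character data in bulk.

    Returns:
        Map from character name to character_id
    """
    character_id_map = {}

    for char_data in characters_data:
        character = ScreenplayCharacter(
            screenplay_id=screenplay_id,
            name=char_data['name'],
            description=char_data['description'],
            role_type=char_data.get('role_type', 'supporting'),
            metadata=char_data.get('metadata', {})
        )

        created_character = await screenplay_repo.create_character(character)
        character_id_map[char_data['name']] = created_character.character_id

    return character_id_map
```

---

### 2. Store scene data

**Function**: `store_screenplay_scenes()`

**Logic**:
1. Iterate over the `parsed_data['scenes']` array
2. Insert each scene into the `screenplay_scenes` table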
3. Return the scene ID map (scene title → scene_id)

**Code example**:
```python
from typing import Any, Dict, List
from uuid import UUID

async def store_screenplay_scenes(
    screenplay_id: UUID,
    scenes_data: List[Dict[str, Any]]
) -> Dict[str, UUID]:
    """
    Store scene data in bulk.

    Returns:
        Map from scene title to scene_id
    """
    scene_id_map = {}

    for scene_data in scenes_data:
        scene = ScreenplayScene(
            screenplay_id=screenplay_id,
            scene_number=scene_data['scene_number'],
            title=scene_data['title'],
            location=scene_data.get('location'),
            time_of_day=scene_data.get('time_of_day'),
            description=scene_data['description'],
            duration_estimate=scene_data.get('duration_estimate'),
            order_index=scene_data['order_index'],
            metadata=scene_data.get('metadata', {})
        )

        created_scene = await screenplay_repo.create_scene(scene)
        # Storyboards refer to scenes by title, so the title is the map key
        scene_id_map[scene_data['title']] = created_scene.scene_id

    return scene_id_map
```

---

### 3. Store prop data

**Function**: `store_screenplay_props()`

**Logic**:
1. Iterate over the `parsed_data['props']` array
2. Insert each prop into the `screenplay_props` table
3. Return the prop ID map (prop name → prop_id)

**Code example**:
```python
from typing import Any, Dict, List
from uuid import UUID

async def store_screenplay_props(
    screenplay_id: UUID,
    props_data: List[Dict[str, Any]]
) -> Dict[str, UUID]:
    """
    Store prop data in bulk.

    Returns:
        Map from prop name to prop_id
    """
    prop_id_map = {}

    for prop_data in props_data:
        prop = ScreenplayProp(
            screenplay_id=screenplay_id,
            name=prop_data['name'],
            description=prop_data['description'],
            category=prop_data.get('category'),
            importance=prop_data.get('importance', 'normal'),
            metadata=prop_data.get('metadata', {})
        )

        created_prop = await screenplay_repo.create_prop(prop)
        prop_id_map[prop_data['name']] = created_prop.prop_id

    return prop_id_map
```

---

### 4. Store tag data

**Service**: `ScreenplayTagService.store_tags()`

**Logic**:
1. Iterate over the `parsed_data['character_tags']` dict (tags grouped by character name)
2. Look up the `character_id` for each `character_name`
3. Bulk insert into the `screenplay_element_tags` table
4. Set `screenplay_characters.has_tags = true` automatically
5. Repeat the same steps for scene tags and prop tags
6. Return the tag ID maps (used for storyboard association)

**Code example**:
```python
# Call ScreenplayTagService
from app.services.screenplay_tag_service import ScreenplayTagService

tag_service = ScreenplayTagService(db)
tag_id_maps = await tag_service.store_tags(
    screenplay_id=screenplay_id,
    parsed_data=parsed_data,
    character_id_map=character_id_map,
    scene_id_map=scene_id_map,
    prop_id_map=prop_id_map
)

# Structure of the returned tag_id_maps
{
    'character_tags': {
        '张三-youth': UUID('019d1234-5678-7abc-def0-444444444444'),
        '张三-adult': UUID('019d1234-5678-7abc-def0-555555555555')
    },
    'scene_tags': {
        '花果山-daytime': UUID('019d1234-5678-7abc-def0-666666666666'),
        '花果山-night': UUID('019d1234-5678-7abc-def0-777777777777')
    },
    'prop_tags': {
        '金箍棒-new': UUID('019d1234-5678-7abc-def0-888888888888')
    }
}
```

**`ScreenplayTagService.store_tags()` implementation**:
```python
async def store_tags(
    self,
    screenplay_id: UUID,
    parsed_data: Dict[str, Any],
    character_id_map: Dict[str, UUID],
    scene_id_map: Dict[str, UUID],
    prop_id_map: Dict[str, UUID]
) -> Dict[str, Dict[str, UUID]]:
    """Store the tags produced by AI parsing."""
    tag_id_maps = {
        'character_tags': {},
        'scene_tags': {},
        'prop_tags': {}
    }

    # 1. Store character tags (expects tags grouped by character name)
    for char_name, tags in parsed_data.get('character_tags', {}).items():
        character_id = character_id_map.get(char_name)
        if not character_id:
            # Unknown character: skip its tags
            continue

        for tag_data in tags:
            tag = await self.repository.create(ScreenplayElementTag(
                screenplay_id=screenplay_id,
                element_type=ElementType.CHARACTER,
                element_id=character_id,
                element_name=char_name,
                tag_key=tag_data['tag_key'],
                tag_label=tag_data['tag_label'],
                description=tag_data.get('description'),
                metadata=tag_data.get('metadata', {})
            ))

            # Key format "<element name>-<tag key>", looked up again during storyboard association
            map_key = f"{char_name}-{tag_data['tag_key']}"
            tag_id_maps['character_tags'][map_key] = tag.tag_id

        # Update the character's has_tags flag
        await self._update_element_has_tags(ElementType.CHARACTER, character_id, True)

    # 2. Store scene tags (same logic)
    # 3. Store prop tags (same logic)

    return tag_id_maps
```

**Tag data structure returned by the AI**:

`store_tags()` as written iterates tags grouped by element name, as shown below. This differs from the flat `character_tags` / `scene_tags` / `prop_tags` arrays in the output format specification, so flat arrays need to be grouped by element name before they are passed in.

```json
{
  "character_tags": {
    "张三": [
      {
        "tag_key": "youth",
        "tag_label": "Youth",
        "description": "张三 at 15, wearing a school uniform",
        "metadata": {"age": 15, "clothing": "school uniform"}
      },
      {
        "tag_key": "adult",
        "tag_label": "Adult",
        "description": "张三 at 30, wearing a suit",
        "metadata": {"age": 30, "clothing": "suit"}
      }
    ]
  },
  "scene_tags": {
    "花果山": [
      {
        "tag_key": "daytime",
        "tag_label": "Daytime",
        "description": "花果山 in bright sunshine"
      },
      {
        "tag_key": "night",
        "tag_label": "Night",
        "description": "花果山 under moonlight"
      }
    ]
  },
  "prop_tags": {
    "金箍棒": [
      {
        "tag_key": "new",
        "tag_label": "Brand new",
        "description": "The freshly forged 金箍棒"
      }
    ]
  }
}
```

---

## Automatic Storyboard Association Logic

### Core principle

Each storyboard carries **name arrays** and **tag mappings** for its characters, scenes, and props. The names are resolved to database IDs, and those IDs are then stored on the storyboard as associations.

### Association steps

1. **Store the base storyboard data**: insert into the `storyboards` table and obtain the `storyboard_id`
2. **Resolve character associations**: use the storyboard's `characters` array and `character_tags` mapping to look up the matching `character_id` and `element_tag_id`
3. **Resolve scene associations**: use the storyboard's `scenes` array and `scene_tags` mapping to look up the matching `scene_id` and `element_tag_id`
4. **Resolve prop associations**: use the storyboard's `props` array and `prop_tags` mapping to look up the matching `prop_id` and `element_tag_id`
5. **Update the storyboard record**: write the associated IDs into the corresponding fields of the `storyboards` table

### Function implementation

**Function**: `store_storyboards_with_associations()`

**Code example**:
```python
from typing import Any, Dict, List
from uuid import UUID

async def store_storyboards_with_associations(
    screenplay_id: UUID,
    project_id: UUID,
    storyboards_data: List[Dict[str, Any]],
    character_id_map: Dict[str, UUID],
    scene_id_map: Dict[str, UUID],
    prop_id_map: Dict[str, UUID],
    tag_id_maps: Dict[str, Dict[str, UUID]]
) -> List[UUID]:
    """
    Store storyboards in bulk and build their associations.

    Args:
        screenplay_id: screenplay ID
        project_id: project ID
        storyboards_data: storyboard array returned by the AI
        character_id_map: character name → character_id
        scene_id_map: scene title → scene_id
        prop_id_map: prop name → prop_id
        tag_id_maps: tag ID maps

    Returns:
        List of created storyboard IDs
    """
    storyboard_ids = []

    for storyboard_data in storyboards_data:
        # 1. Resolve character associations
        character_ids = []
        character_tag_ids = []
        for char_name in storyboard_data.get('characters', []):
            character_id = character_id_map.get(char_name)
            if character_id:
                character_ids.append(character_id)

                # Check whether a tag was specified for this character
                tag_key = storyboard_data.get('character_tags', {}).get(char_name)
                if tag_key:
                    map_key = f"{char_name}-{tag_key}"
                    tag_id = tag_id_maps['character_tags'].get(map_key)
                    if tag_id:
                        character_tag_ids.append(tag_id)

        # 2. Resolve scene associations
        scene_ids = []
        scene_tag_ids = []
        for scene_name in storyboard_data.get('scenes', []):
            scene_id = scene_id_map.get(scene_name)
            if scene_id:
                scene_ids.append(scene_id)

                # Check whether a tag was specified for this scene
                tag_key = storyboard_data.get('scene_tags', {}).get(scene_name)
                if tag_key:
                    map_key = f"{scene_name}-{tag_key}"
                    tag_id = tag_id_maps['scene_tags'].get(map_key)
                    if tag_id:
                        scene_tag_ids.append(tag_id)

        # 3. Resolve prop associations
        prop_ids = []
        prop_tag_ids = []
        for prop_name in storyboard_data.get('props', []):
            prop_id = prop_id_map.get(prop_name)
            if prop_id:
                prop_ids.append(prop_id)

                # Check whether a tag was specified for this prop
                tag_key = storyboard_data.get('prop_tags', {}).get(prop_name)
                if tag_key:
                    map_key = f"{prop_name}-{tag_key}"
                    tag_id = tag_id_maps['prop_tags'].get(map_key)
                    if tag_id:
                        prop_tag_ids.append(tag_id)

        # 4. Create the storyboard record
        storyboard = Storyboard(
            project_id=project_id,
            screenplay_id=screenplay_id,
            shot_number=storyboard_data['shot_number'],
            title=storyboard_data['title'],
            description=storyboard_data['description'],
            dialogue=storyboard_data.get('dialogue'),
            shot_size=storyboard_data.get('shot_size'),
            camera_movement=storyboard_data.get('camera_movement'),
            estimated_duration=storyboard_data.get('estimated_duration'),
            order_index=storyboard_data['order_index'],
            start_time=storyboard_data.get('start_time'),
            end_time=storyboard_data.get('end_time'),
            metadata=storyboard_data.get('metadata', {}),
            # Association fields
            screenplay_character_ids=character_ids,
            element_tag_ids=character_tag_ids + scene_tag_ids + prop_tag_ids,
            screenplay_scene_ids=scene_ids,
            screenplay_prop_ids=prop_ids
        )

        created_storyboard = await storyboard_repo.create(storyboard)
        storyboard_ids.append(created_storyboard.storyboard_id)

    return storyboard_ids
```

### How the association works

#### 1. Character association

**Input**:
```json
{
  "characters": ["张三", "李四"],
  "character_tags": {
    "张三": "youth",
    "李四": "adult"
  }
}
```

**Processing**:
1. Iterate over the `characters` array
2. Look up the `character_id` in `character_id_map`
3. Check the `character_tags` mapping for a tag key for this character
4. If one exists, build the map key: `"张三-youth"`
5. Look up the `element_tag_id` in `tag_id_maps['character_tags']`
6. Append the `character_id` to the `screenplay_character_ids` array
7. Append the `element_tag_id` to the `element_tag_ids` array

**Result**:
```python
screenplay_character_ids = [
    UUID('019d1234-5678-7abc-def0-111111111111'),  # 张三
    UUID('019d1234-5678-7abc-def0-222222222222')   # 李四
]

element_tag_ids = [
    UUID('019d1234-5678-7abc-def0-333333333333'),  # 张三-youth tag
    UUID('019d1234-5678-7abc-def0-444444444444')   # 李四-adult tag
]
```

#### 2. Scene association

The logic mirrors character association, but uses `scenes`, `scene_tags`, `scene_id_map`, and `tag_id_maps['scene_tags']`.
#### 3. Prop association

The logic mirrors character association, but uses `props`, `prop_tags`, `prop_id_map`, and `tag_id_maps['prop_tags']`.

### Database fields

Association fields in the `storyboards` table:

| Field | Type | Description |
|--------|------|------|
| `screenplay_character_ids` | `UUID[]` | IDs of the associated characters |
| `screenplay_scene_ids` | `UUID[]` | IDs of the associated scenes |
| `screenplay_prop_ids` | `UUID[]` | IDs of the associated props |
| `element_tag_ids` | `UUID[]` | IDs of the associated element tags (character, scene, and prop tags together) |

### Association query example

**Query the characters and tags associated with each storyboard**:
```sql
SELECT
    s.storyboard_id,
    s.shot_number,
    s.title,
    c.character_id,
    c.name AS character_name,
    et.tag_id,
    et.tag_label,
    et.element_type
FROM storyboards s
LEFT JOIN LATERAL unnest(s.screenplay_character_ids) WITH ORDINALITY AS char_id(id, ord) ON true
LEFT JOIN screenplay_characters c ON c.character_id = char_id.id
LEFT JOIN LATERAL unnest(s.element_tag_ids) AS tag_id(id) ON true
LEFT JOIN screenplay_element_tags et
    ON et.tag_id = tag_id.id
   AND et.element_id = c.character_id   -- tags point to their owner through element_id
   AND et.element_type = 'character'    -- assumes ElementType.CHARACTER is stored as 'character'
WHERE s.screenplay_id = '019d1234-5678-7abc-def0-987654321fed'
ORDER BY s.order_index, char_id.ord;
```

---

## Error Handling

### 1. AI call failure

**Scenario**: the AI model call times out, returns an error, or returns data in the wrong format

**Handling**:
1. Catch the exception
2. Set `ai_jobs.status = 'failed'`
3. Record the error in `ai_jobs.error_message`
4. Call Credit Service to refund the credits
5. Return the error to the user

**Code example**:
```python
try:
    ai_response = await call_ai_model(model, prompt)
    parsed_data = json.loads(ai_response)
except Exception as e:
    # Mark the job as failed
    await ai_service.update_job(job_id, {
        'status': 'failed',
        'error_message': f'AI call failed: {str(e)}',
        'completed_at': datetime.utcnow()
    })

    # Refund the credits
    job = await ai_service.get_job(job_id)
    await credit_service.refund_credits(
        consumption_log_id=job.consumption_log_id,
        reason=f'AI call failed: {str(e)}'
    )

    raise
```

---

### 2. Data storage failure

**Scenario**: a database write fails, or data validation fails

**Handling**:
1. Use a database transaction
2. If any step fails, roll back all operations
3. Set `ai_jobs.status = 'failed'`
4. Refund the credits
5. Return the error to the user

**Code example**:
```python
from typing import Any, Dict
from uuid import UUID

from app.services.screenplay_tag_service import ScreenplayTagService

async def store_screenplay_elements(
    screenplay_id: UUID,
    parsed_data: Dict[str, Any]
) -> Dict[str, Any]:
    """
    Store screenplay elements inside a single transaction.
    """
    try:
        # db.begin() commits when the block exits normally and rolls back if it raises,
        # so no explicit commit() or rollback() is needed inside it.
        async with db.begin():
            # Storyboards need the project ID; look it up from the screenplay
            screenplay = await screenplay_repo.get_by_id(screenplay_id)
            project_id = screenplay.project_id

            # 1. Store characters
            character_id_map = await store_screenplay_characters(
                screenplay_id,
                parsed_data['characters']
            )

            # 2. Store scenes
            scene_id_map = await store_screenplay_scenes(
                screenplay_id,
                parsed_data['scenes']
            )

            # 3. Store props
            prop_id_map = await store_screenplay_props(
                screenplay_id,
                parsed_data['props']
            )

            # 4. Store tags (via ScreenplayTagService)
            tag_service = ScreenplayTagService(db)
            tag_id_maps = await tag_service.store_tags(
                screenplay_id=screenplay_id,
                parsed_data=parsed_data,
                character_id_map=character_id_map,
                scene_id_map=scene_id_map,
                prop_id_map=prop_id_map
            )

            # 5. Store storyboards
            storyboard_ids = await store_storyboards_with_associations(
                screenplay_id,
                project_id,
                parsed_data['storyboards'],
                character_id_map,
                scene_id_map,
                prop_id_map,
                tag_id_maps
            )
    except Exception as e:
        # The transaction has already been rolled back at this point
        raise Exception(f'Data storage failed: {str(e)}') from e

    return {
        'character_ids': list(character_id_map.values()),
        'scene_ids': list(scene_id_map.values()),
        'prop_ids': list(prop_id_map.values()),
        'storyboard_ids': storyboard_ids
    }
```

---

### 3. Association failure

**Scenario**: a storyboard references a character/scene/prop name that cannot be found in the database

**Handling**:
1. Log a warning
2. Skip that association (without blocking the rest of the flow)
3. Record the missing element in `storyboards.metadata`

**Code example**:
```python
# While resolving character associations
for char_name in storyboard_data.get('characters', []):
    character_id = character_id_map.get(char_name)
    if character_id:
        character_ids.append(character_id)
    else:
        # Log a warning
        logger.warning(
            f'Storyboard {storyboard_data["shot_number"]} references '
            f'character "{char_name}", which was not found'
        )

        # Record it in metadata; setdefault also covers storyboards without a metadata key
        missing = storyboard_data.setdefault('metadata', {}).setdefault('missing_associations', {})
        missing.setdefault('characters', []).append(char_name)
```

---

### 4. Insufficient credits

**Scenario**: the user does not have enough credits to pay for AI parsing

**Handling**:
1. Check during the pre-deduction stage
2. If credits are insufficient, raise `InsufficientCreditsError`
3. Return HTTP 402 Payment Required
4. Prompt the user to top up

**Code example**:
```python
# Credit Service
async def consume_credits(
    self,
    user_id: UUID,
    amount: int,
    feature_type: str,
    task_params: Dict[str, Any]
) -> CreditConsumptionLog:
    # Check the credit balance
    user_credits = await self.get_user_credits(user_id)

    if user_credits.available_credits < amount:
        raise InsufficientCreditsError(
            f'Insufficient credits: {amount} required, {user_credits.available_credits} available'
        )

    # Pre-deduct credits
    # ...
```

---

### 5. Concurrency conflict

**Scenario**: AI parsing is triggered more than once for the same screenplay

**Handling**:
1. Add a `parsing_status` field to the `screenplays` table (idle/parsing/completed/failed)
2. Check the status before starting a parse
3. If the status is `parsing`, return an error
4. Use a database row lock (SELECT FOR UPDATE) to prevent concurrent starts

**Code example**:
```python
from sqlalchemy import select, update

async def parse_screenplay(
    self,
    user_id: UUID,
    screenplay_id: UUID,
    **kwargs
) -> Dict[str, Any]:
    # 1. Load the screenplay under a row lock.
    #    The transaction commits (and releases the lock) when the block exits,
    #    so there is no explicit commit() inside it.
    async with db.begin():
        result = await db.execute(
            select(Screenplay)
            .where(Screenplay.screenplay_id == screenplay_id)
            .with_for_update()
        )
        screenplay = result.scalar_one_or_none()

        if not screenplay:
            raise ScreenplayNotFoundError()

        # 2. Check the parsing status
        if screenplay.parsing_status == 'parsing':
            raise ScreenplayParsingInProgressError(
                'This screenplay is already being parsed; please try again later'
            )

        # 3. Set the status to parsing
        screenplay.parsing_status = 'parsing'

    try:
        # Run the parsing logic
        # ...

        # Set the status to completed
        await db.execute(
            update(Screenplay)
            .where(Screenplay.screenplay_id == screenplay_id)
            .values(parsing_status='completed')
        )
        await db.commit()

    except Exception:
        # Discard any half-finished work before recording the failure
        await db.rollback()
        await db.execute(
            update(Screenplay)
            .where(Screenplay.screenplay_id == screenplay_id)
            .values(parsing_status='failed')
        )
        await db.commit()
        raise
```

---

## Example Code

### Complete end-to-end example

**Scenario**: the user uploads a screenplay and triggers AI parsing
#### 1. User uploads a screenplay

**Request**:
```bash
POST /api/v1/screenplays
Content-Type: application/json
Authorization: Bearer

{
  "project_id": "019d1234-5678-7abc-def0-987654321fed",
  "name": "Episode 1 screenplay",
  "type": "text",
  "content": "Scene 1. Cafe. Afternoon\n\n张三 sits in the cafe, looking out the window.\n\n张三: Another ordinary afternoon...\n\n(Flashback)\n\nScene 2. School playground. Morning\n\nTeenage 张三 runs across the playground."
}
```

**Response**:
```json
{
  "screenplay_id": "019d1234-5678-7abc-def0-111111111111",
  "project_id": "019d1234-5678-7abc-def0-987654321fed",
  "name": "Episode 1 screenplay",
  "type": "text",
  "parsing_status": "idle",
  "created_at": "2026-01-19T10:00:00Z"
}
```

---

#### 2. User triggers AI parsing

**Request**:
```bash
POST /api/v1/screenplays/019d1234-5678-7abc-def0-111111111111/parse
Content-Type: application/json
Authorization: Bearer

{
  "auto_create_elements": true,
  "auto_create_tags": true,
  "auto_create_storyboards": true,
  "model": "gpt-4"
}
```

**Response**:
```json
{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "task_id": "abc123-def456-ghi789",
  "status": "pending",
  "estimated_credits": 50,
  "message": "AI parsing job submitted; query the result later"
}
```

---

#### 3. Query the job status

**Request**:
```bash
GET /api/v1/ai/jobs/019d1234-5678-7abc-def0-222222222222
Authorization: Bearer
```

**Response (processing)**:
```json
{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "job_type": "text_processing",
  "status": "processing",
  "progress": 50,
  "started_at": "2026-01-19T10:00:05Z",
  "message": "Parsing the screenplay..."
}
```

**Response (completed)**:
```json
{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "job_type": "text_processing",
  "status": "completed",
  "progress": 100,
  "started_at": "2026-01-19T10:00:05Z",
  "completed_at": "2026-01-19T10:00:30Z",
  "credits_consumed": 45,
  "output_data": {
    "characters_count": 2,
    "scenes_count": 2,
    "props_count": 1,
    "tags_count": 2,
    "storyboards_count": 2
  },
  "message": "Parsing completed"
}
```

---
#### 4. Query the parsing results

**Request**:
```bash
GET /api/v1/screenplays/019d1234-5678-7abc-def0-111111111111/elements
Authorization: Bearer
```

**Response**:
```json
{
  "screenplay_id": "019d1234-5678-7abc-def0-111111111111",
  "characters": [
    {
      "character_id": "019d1234-5678-7abc-def0-333333333333",
      "name": "张三",
      "description": "Male lead, 30 years old, programmer",
      "role_type": "main",
      "has_tags": true,
      "tags": [
        {
          "tag_id": "019d1234-5678-7abc-def0-444444444444",
          "tag_key": "youth",
          "tag_label": "Youth",
          "description": "张三 at 15, still in high school"
        },
        {
          "tag_id": "019d1234-5678-7abc-def0-555555555555",
          "tag_key": "adult",
          "tag_label": "Adult",
          "description": "张三 at 30, as he is now"
        }
      ]
    }
  ],
  "scenes": [
    {
      "scene_id": "019d1234-5678-7abc-def0-666666666666",
      "scene_number": 1,
      "title": "Cafe",
      "location": "Downtown Starbucks",
      "time_of_day": "afternoon",
      "description": "A cozy cafe with sunlight streaming through the floor-to-ceiling windows"
    },
    {
      "scene_id": "019d1234-5678-7abc-def0-777777777777",
      "scene_number": 2,
      "title": "School playground",
      "location": "A high school playground",
      "time_of_day": "morning",
      "description": "A wide playground where students are having PE class"
    }
  ],
  "props": [
    {
      "prop_id": "019d1234-5678-7abc-def0-888888888888",
      "name": "Laptop",
      "description": "张三's work laptop",
      "category": "electronics",
      "importance": "normal"
    }
  ]
}
```

---
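#### 5. Query the storyboard list

**Request**:
```bash
GET /api/v1/projects/019d1234-5678-7abc-def0-987654321fed/storyboards?screenplay_id=019d1234-5678-7abc-def0-111111111111
Authorization: Bearer
```

**Response**:
```json
{
  "total": 2,
  "items": [
    {
      "storyboard_id": "019d1234-5678-7abc-def0-999999999999",
      "shot_number": "001",
      "title": "Opening",
      "description": "张三 sits in the cafe, looking out the window",
      "dialogue": "张三: Another ordinary afternoon...",
      "shot_size": "medium_shot",
      "camera_movement": "static",
      "estimated_duration": 5.5,
      "characters": [
        {
          "character_id": "019d1234-5678-7abc-def0-333333333333",
          "name": "张三",
          "tag": {
            "tag_id": "019d1234-5678-7abc-def0-555555555555",
            "tag_label": "Adult"
          }
        }
      ],
      "scenes": [
        {
          "scene_id": "019d1234-5678-7abc-def0-666666666666",
          "title": "Cafe"
        }
      ],
      "props": [
        {
          "prop_id": "019d1234-5678-7abc-def0-888888888888",
          "name": "Laptop"
        }
      ]
    },
    {
      "storyboard_id": "019d1234-5678-7abc-def0-aaaaaaaaaaaa",
      "shot_number": "002",
      "title": "Flashback",
      "description": "Teenage 张三 runs across the school playground",
      "dialogue": null,
      "shot_size": "wide_shot",
      "camera_movement": "tracking",
      "estimated_duration": 8.0,
      "characters": [
        {
          "character_id": "019d1234-5678-7abc-def0-333333333333",
          "name": "张三",
          "tag": {
            "tag_id": "019d1234-5678-7abc-def0-444444444444",
            "tag_label": "Youth"
          }
        }
      ],
      "scenes": [
        {
          "scene_id": "019d1234-5678-7abc-def0-777777777777",
          "title": "School playground"
        }
      ],
      "props": []
    }
  ]
}
```

---

## Summary

### Workflow characteristics

1. **Fully automated**: no manual step between AI parsing and data storage
2. **Transactional**: storage runs inside a database transaction to keep data consistent
3. **Asynchronous**: Celery tasks do the work, so user requests are not blocked
4. **Credit management**: credits are pre-deducted, then confirmed on success or refunded on failure
5. **Automatic association**: storyboards are linked to elements by name
6. **Tag support**: multiple tags per character/scene/prop can be recognized and associated
7. **Error handling**: failures mark the job as failed, refund credits, and roll back the storage transaction

### Key technical points

- **UUID v7**: UUID v7 primary keys, which sort by creation time
- **PostgreSQL arrays**: `UUID[]` columns store the many-to-many associations
- **JSON fields**: `JSONB` stores metadata and AI output
- **Database transactions**: keep data consistent
- **Celery asynchronous tasks**: handle the slow AI calls
- **Credit system**: pre-deduction plus confirm/refund

### Future improvements

1. **Incremental parsing**: parse only part of a screenplay
2. **Manual correction**: let users adjust the AI's results by hand
3. **Batch parsing**: parse several screenplays in one request
4. **Parsing templates**: support custom AI prompt templates
5. **Multi-model comparison**: call several AI models at once and compare their results
6. **Parsing history**: keep a record of every parse and support comparing versions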
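
---

## Appendix: Grouping Flat Tag Arrays for `store_tags()`

The output format specification describes `character_tags`, `scene_tags`, and `prop_tags` as flat arrays whose items carry the element name, while `store_tags()` as shown iterates tags grouped by element name. The sketch below is one possible way to bridge the two shapes, not the system's actual implementation. The helper names `group_tags_by_element` and `tag_map_key` and the `NAME_FIELDS` table are hypothetical; the `"<name>-<tag_key>"` key format follows the `tag_id_maps` examples.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hypothetical lookup: which field holds the element name in each flat tag array.
NAME_FIELDS = {
    "character_tags": "character_name",
    "scene_tags": "scene_name",
    "prop_tags": "prop_name",
}


def group_tags_by_element(
    parsed_data: Dict[str, Any]
) -> Dict[str, Dict[str, List[Dict[str, Any]]]]:
    """Convert flat tag arrays into {tag_group: {element_name: [tag, ...]}}."""
    grouped = {}
    for group, name_field in NAME_FIELDS.items():
        by_name = defaultdict(list)
        for tag in parsed_data.get(group, []):
            # Copy every field except the element name, which becomes the dict key.
            by_name[tag[name_field]].append(
                {k: v for k, v in tag.items() if k != name_field}
            )
        # Keep each element's tags in the order given by order_index.
        grouped[group] = {
            name: sorted(tags, key=lambda t: t.get("order_index", 0))
            for name, tags in by_name.items()
        }
    return grouped


def tag_map_key(element_name: str, tag_key: str) -> str:
    """Build the lookup key used in tag_id_maps."""
    return f"{element_name}-{tag_key}"


# Example input in the flat shape from the output format specification.
flat = {
    "character_tags": [
        {"character_name": "张三", "tag_key": "adult", "tag_label": "Adult", "order_index": 1},
        {"character_name": "张三", "tag_key": "youth", "tag_label": "Youth", "order_index": 0},
    ],
    "prop_tags": [
        {"prop_name": "金箍棒", "tag_key": "new", "tag_label": "Brand new", "order_index": 0},
    ],
}

grouped = group_tags_by_element(flat)
print([t["tag_key"] for t in grouped["character_tags"]["张三"]])  # ['youth', 'adult']
print(tag_map_key("张三", "youth"))  # 张三-youth
```

With this approach the grouped result would replace the three tag keys in `parsed_data` before the call to `store_tags()`. The alternative is to change `store_tags()` so that it iterates the flat arrays directly; which side should adapt depends on which shape the AI prompt actually produces.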