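Screenplay Parsing Task Implementation
Date: 2026-02-03
Type: Feature implementation
Scope: Backend AI service - screenplay parsing
Overview
Implements screenplay parsing: an AI model automatically extracts characters, scenes, props, tags, and storyboards from a screenplay. This completes the core functionality of Stage 3.
Implementation Details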
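1. Celery Async Task
File: server/app/tasks/ai_tasks.py
✅ Added parse_screenplay_task
Features:
- Calls the AI Provider to parse the screenplay
- Builds a specialized parsing prompt
- Parses the JSON returned by the AI
- Calls the Screenplay Service to store the data
- Manages task status (pending → processing → completed/failed)
- Updates progress (10% → 30% → 60% → 100%)
- Confirms or refunds credits
- Retries on failure (up to 3 times)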
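The status flow in the list above can be sketched as a small transition table. This is only an illustration: JobStatus, ALLOWED, and can_transition are hypothetical names, not the project's AIJobStatus, and treating failed as terminal is an assumption, since the source does not say which status a retried job reports.

```python
from enum import Enum


class JobStatus(str, Enum):
    """Hypothetical stand-in for the project's AIJobStatus enum."""
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves, mirroring pending -> processing -> completed/failed.
# Assumption: completed and failed are terminal states.
ALLOWED = {
    JobStatus.PENDING: {JobStatus.PROCESSING},
    JobStatus.PROCESSING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if a job may move from `current` to `target`."""
    return target in ALLOWED[current]
```

A status update helper could call can_transition() before writing, so that a late or duplicated worker message cannot move a finished job back to processing.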
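Parsing prompt:
system_prompt = """You are a professional screenplay analysis assistant. Break the input screenplay text down into structured data.
Output JSON containing the following fields:
1. characters: list of characters
2. scenes: list of scenes
3. props: list of props
4. character_tags: character tags (variants)
5. scene_tags: scene tags (variants)
6. prop_tags: prop tags (variants)
7. storyboards: list of storyboards (optional)
"""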
Storage flow:
# 1. Call the AI Provider to parse the screenplay
result = await provider.process_text(
    task_type='screenplay_parse',
    text=screenplay_content,
    output_format='json',
    system_prompt=system_prompt
)
# 2. Extract the parsed result
parsed_data = result.get('result')
# 3. Store it through the Screenplay Service
storage_result = await screenplay_service.store_parsed_elements(
    screenplay_id=UUID(screenplay_id),
    parsed_data=parsed_data,
    auto_create_elements=auto_create_elements,
    auto_create_tags=auto_create_tags,
    auto_create_storyboards=auto_create_storyboards
)
# 4. Update the job status
await _update_job_status(
    job_id,
    AIJobStatus.COMPLETED,
    progress=100,
    output_data={
        'parsed_data': parsed_data,
        'storage_result': storage_result
    }
)
# 5. Confirm the credit consumption
await _confirm_or_refund_credits(
    job_id=job_id,
    consumption_log_id=consumption_log_id,
    success=True
)
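In the flow above, the parsed result goes from step 2 straight into storage in step 3. A structural check between the two could look like the sketch below. check_parsed_payload and the two field tuples are hypothetical names; the field list comes from the parsing prompt, and treating the three tag fields as JSON objects is an assumption about the expected shape.

```python
import json
from typing import Any, List

# Field names taken from the parsing prompt; the expected types are assumptions.
REQUIRED_LIST_FIELDS = ("characters", "scenes", "props")
REQUIRED_DICT_FIELDS = ("character_tags", "scene_tags", "prop_tags")


def check_parsed_payload(raw: str) -> List[str]:
    """Return the problems found in the model's JSON reply (empty if none)."""
    try:
        data: Any = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems: List[str] = []
    for key in REQUIRED_LIST_FIELDS:
        if not isinstance(data.get(key), list):
            problems.append(f"{key} must be a list")
    for key in REQUIRED_DICT_FIELDS:
        if not isinstance(data.get(key), dict):
            problems.append(f"{key} must be an object")
    # storyboards is optional, but must be a list when present.
    if "storyboards" in data and not isinstance(data["storyboards"], list):
        problems.append("storyboards must be a list")
    return problems
```

A non-empty result could be logged and treated as a task failure, which would then trigger the retry and refund paths rather than storing partial data.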
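2. AI Service Method
File: server/app/services/ai_service.py
✅ Added the parse_screenplay() method
Features:
- Validates the user and the screenplay
- Checks the quota
- Loads the model configuration (default: gpt-4)
- Calculates credits (based on character count)
- Pre-deducts credits
- Creates the job record
- Submits the Celery task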
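The pre-deduct step above pairs with a later confirm-or-refund step once the task finishes. A minimal in-memory sketch of that pattern follows. CreditLedger and its methods are hypothetical and are not the project's credit service, which presumably persists this state in the database.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CreditLedger:
    """Hypothetical in-memory ledger illustrating reserve-then-settle."""
    balance: int
    held: Dict[str, int] = field(default_factory=dict)  # log_id -> reserved amount

    def reserve(self, log_id: str, amount: int) -> None:
        """Pre-deduct credits before the task is submitted."""
        if amount > self.balance:
            raise ValueError("insufficient credits")
        self.balance -= amount
        self.held[log_id] = amount

    def settle(self, log_id: str, success: bool) -> None:
        """Confirm the deduction on success, or refund it on failure."""
        amount = self.held.pop(log_id)
        if not success:
            self.balance += amount
```

Reserving up front means a user cannot start more work than the balance covers while earlier tasks are still running.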
Method signature:
async def parse_screenplay(
    self,
    user_id: str,
    screenplay_id: str,
    screenplay_content: str,
    model: Optional[str] = None,
    project_id: Optional[str] = None,
    auto_create_elements: bool = True,
    auto_create_tags: bool = True,
    auto_create_storyboards: bool = True,
    **kwargs
) -> Dict[str, Any]:
    """Parse a screenplay (asynchronously)"""
Return value:
{
  "job_id": "019d1234-5678-7abc-def0-222222222222",
  "task_id": "abc123-def456-ghi789",
  "status": "pending",
  "estimated_credits": 50
}
3. API Route
File: server/app/api/v1/screenplays.py
✅ Added the POST /api/v1/screenplays/{screenplay_id}/parse endpoint
Request body:
{
  "model": "gpt-4",
  "auto_create_elements": true,
  "auto_create_tags": true,
  "auto_create_storyboards": true,
  "temperature": 0.7,
  "max_tokens": 4000
}
Response:
{
  "code": 200,
  "message": "Screenplay parsing task submitted; use job_id to query the task status",
  "data": {
    "job_id": "019d1234-5678-7abc-def0-222222222222",
    "task_id": "abc123-def456-ghi789",
    "status": "pending",
    "estimated_credits": 50
  }
}
Access control:
- Requires edit permission (editor)
- Verifies that the screenplay exists
- Verifies that the screenplay content is not empty
4. Schema Definitions
File: server/app/schemas/screenplay.py
✅ Added schemas:
ScreenplayParseRequest:
from typing import Optional
from pydantic import BaseModel, Field

class ScreenplayParseRequest(BaseModel):
    """Request model for screenplay parsing"""
    model: Optional[str] = Field('gpt-4', description="AI model name")
    auto_create_elements: bool = Field(True, description="Whether to create elements automatically")
    auto_create_tags: bool = Field(True, description="Whether to create tags automatically")
    auto_create_storyboards: bool = Field(True, description="Whether to create storyboards automatically")
    temperature: Optional[float] = Field(0.7, description="Sampling temperature", ge=0.0, le=2.0)
    max_tokens: Optional[int] = Field(4000, description="Maximum number of tokens", ge=100, le=8000)
ScreenplayParseResponse:
class ScreenplayParseResponse(BaseModel):
    """Response model for screenplay parsing"""
    job_id: str = Field(..., description="AI job ID")
    task_id: str = Field(..., description="Celery task ID")
    status: str = Field(..., description="Task status")
    estimated_credits: int = Field(..., description="Estimated credits consumed")
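Technical Specification
✅ Complies with the jointo-tech-stack specification
- ✅ Async operations: all database operations use async/await
- ✅ Event loop management: async code runs through run_async_task()
- ✅ Unified response format: uses success_response()
- ✅ Complete error handling: try-except + exc_info=True
- ✅ %-formatting in logs: logger.error("Error: %s", str(e), exc_info=True)
- ✅ Type hints: complete Python type annotations
- ✅ UUID v7: all ID parameters use the UUID type
- ✅ Retry on failure: up to 3 retries with exponential backoff
- ✅ Credit management: credits are confirmed when the task succeeds and refunded when it fails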
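The retry rule in the list above (up to 3 retries, exponential backoff) can be illustrated without Celery. In the sketch below, backoff_delays and run_with_retries are hypothetical helpers, and the 2-second base delay is an assumption; the source does not state the actual delay values or how the task configures its retries.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 3, base_seconds: float = 2.0) -> List[float]:
    """Delay before each retry: base, 2*base, 4*base, ... (assumed schedule)."""
    return [base_seconds * (2 ** attempt) for attempt in range(max_retries)]


def run_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    base_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func once, then retry up to max_retries times with backoff."""
    delays = backoff_delays(max_retries, base_seconds)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
    raise RuntimeError("unreachable")
```

The sleep parameter is injectable so the schedule can be checked in tests without actually waiting.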
✅ Code quality verification
- ✅ Passes the getDiagnostics check with no syntax errors
- ✅ No import errors
- ✅ No type errors
- ✅ Clear code structure with complete comments
Workflow
1. The user submits a parse request
POST /api/v1/screenplays/{screenplay_id}/parse
Authorization: Bearer <token>
Content-Type: application/json
{
  "model": "gpt-4",
  "auto_create_elements": true,
  "auto_create_tags": true,
  "auto_create_storyboards": true
}
2. The API route handles the request
# 1. Verify that the screenplay exists
screenplay = await repo.get_by_id(screenplay_id)
# 2. Check permissions (edit permission required)
await service._check_project_permission(
    current_user.user_id,
    screenplay.project_id,
    'editor'
)
# 3. Check the screenplay content
if not screenplay.content:
    raise ValidationError("Screenplay content is empty and cannot be parsed")
# 4. Call the AI Service
result = await ai_service.parse_screenplay(...)
# 5. Return the job information
return success_response(data=result)
3. The AI Service processes the request
# 1. Validate the user and the screenplay
await self._validate_user_exists(user_id)
screenplay = await screenplay_repo.get_by_id(screenplay_id)
# 2. Check the quota
await self._check_quota(user_id, 'screenplay_parse')
# 3. Load the model configuration
model_config = await self._get_model(model, AIModelType.TEXT)
# 4. Calculate the credits
credits_needed = await self.credit_service.calculate_credits(...)
# 5. Pre-deduct the credits
consumption_log = await self.credit_service.consume_credits(...)
# 6. Create the job record
job = await self.job_repository.create({...})
# 7. Submit the Celery task
task = parse_screenplay_task.delay(...)
# 8. Return the job information
return {'job_id': ..., 'task_id': ..., 'status': 'pending', 'estimated_credits': ...}
4. The Celery worker executes the task
# 1. Mark the job as processing
await _update_job_status(job_id, AIJobStatus.PROCESSING, progress=10)
# 2. Create the AI Provider
provider = AIProviderFactory.create_provider(model)
# 3. Call the AI to parse the screenplay
result = await provider.process_text(
    task_type='screenplay_parse',
    text=screenplay_content,
    output_format='json',
    system_prompt=system_prompt
)
# 4. Extract the parsed result
parsed_data = result.get('result')
# 5. Store the parsed result in the database
storage_result = await screenplay_service.store_parsed_elements(
    screenplay_id=UUID(screenplay_id),
    parsed_data=parsed_data,
    auto_create_elements=auto_create_elements,
    auto_create_tags=auto_create_tags,
    auto_create_storyboards=auto_create_storyboards
)
# 6. Mark the job as completed
await _update_job_status(
    job_id,
    AIJobStatus.COMPLETED,
    progress=100,
    output_data={
        'parsed_data': parsed_data,
        'storage_result': storage_result
    }
)
# 7. Confirm the credit consumption
await _confirm_or_refund_credits(job_id, consumption_log_id, success=True)
5. Query the job status
GET /api/v1/ai/jobs/{job_id}
Authorization: Bearer <token>
Response:
{
  "code": 200,
  "message": "Success",
  "data": {
    "ai_job_id": "019d1234-5678-7abc-def0-222222222222",
    "status": "completed",
    "progress": 100,
    "output_data": {
      "parsed_data": {
        "characters": [...],
        "scenes": [...],
        "props": [...],
        "character_tags": {...},
        "scene_tags": {...},
        "prop_tags": {...},
        "storyboards": [...]
      },
      "storage_result": {
        "characters_created": 5,
        "scenes_created": 3,
        "props_created": 2,
        "tags_created": 8,
        "storyboards_created": 10
      }
    }
  }
}
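Because parsing runs asynchronously, a client has to poll the job endpoint until the status becomes terminal. The sketch below shows one way to structure that loop. wait_for_job is a hypothetical helper, fetch_job stands in for whatever HTTP call returns the "data" object of the job response, and the interval and poll limit are assumed values.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[], Dict[str, Any]],
    interval_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_job() until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval_seconds)
    raise TimeoutError("job did not finish within the polling budget")
```

Passing the fetch function in keeps the loop independent of any particular HTTP client and makes it easy to test with canned responses.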
AI Output Format
JSON structure returned by the AI model:
{
  "characters": [
    {
      "name": "张三",
      "description": "Male lead, 30 years old, programmer",
      "role_type": "main",
      "metadata": {"age": 30, "gender": "male"}
    }
  ],
  "scenes": [
    {
      "scene_number": 1,
      "title": "Coffee shop",
      "location": "Downtown Starbucks",
      "time_of_day": "afternoon",
      "description": "A cozy coffee shop"
    }
  ],
  "props": [
    {
      "name": "Laptop",
      "description": "张三's work computer",
      "category": "Electronics",
      "importance": "normal"
    }
  ],
  "character_tags": {
    "张三": [
      {
        "tag_key": "youth",
        "tag_label": "Youth",
        "description": "张三 at age 15"
      },
      {
        "tag_key": "adult",
        "tag_label": "Adult",
        "description": "张三 at age 30"
      }
    ]
  },
  "scene_tags": {...},
  "prop_tags": {...},
  "storyboards": [
    {
      "shot_number": "001",
      "title": "Opening",
      "description": "张三 sits in the coffee shop",
      "dialogue": "Just another ordinary afternoon...",
      "shot_size": "medium_shot",
      "camera_movement": "static",
      "estimated_duration": 5.5,
      "characters": ["张三"],
      "character_tags": {"张三": "adult"},
      "scenes": ["Coffee shop"],
      "props": ["Laptop"]
    }
  ]
}
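In the structure above, storyboards refer to characters, scenes, and props by name or title, so a shot can only be linked if the referenced item was declared in the same reply. The sketch below checks this. find_dangling_references is a hypothetical helper, and matching scenes by their title field is an assumption based on the example.

```python
from typing import Any, Dict, List


def find_dangling_references(parsed: Dict[str, Any]) -> List[str]:
    """List storyboard references that match no declared element."""
    known = {
        "characters": {c["name"] for c in parsed.get("characters", [])},
        "scenes": {s["title"] for s in parsed.get("scenes", [])},
        "props": {p["name"] for p in parsed.get("props", [])},
    }
    problems: List[str] = []
    for board in parsed.get("storyboards", []):
        shot = board.get("shot_number", "?")
        for kind, names in known.items():
            for name in board.get(kind, []):
                if name not in names:
                    # kind[:-1] turns "props" into "prop", and so on.
                    problems.append(f"shot {shot}: unknown {kind[:-1]} {name!r}")
    return problems
```

Running such a check before storage makes model inconsistencies visible, for example a shot that mentions a prop the model never listed.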
Storage Logic
1. Store characters
# Batch insert into the screenplay_characters table
character_id_map = {}
for char_data in parsed_data.get('characters', []):
    character = await repo.create_character(...)
    character_id_map[char_data['name']] = character.character_id
2. Store scenes
# Batch insert into the screenplay_scenes table
scene_id_map = {}
for scene_data in parsed_data.get('scenes', []):
    scene = await repo.create_scene(...)
    scene_id_map[scene_data['title']] = scene.scene_id
3. Store props
# Batch insert into the screenplay_props table
prop_id_map = {}
for prop_data in parsed_data.get('props', []):
    prop = await repo.create_prop(...)
    prop_id_map[prop_data['name']] = prop.prop_id
4. Store tags
# Call ScreenplayTagService.store_tags()
tag_id_maps = await tag_service.store_tags(
    screenplay_id=screenplay_id,
    parsed_data=parsed_data,
    character_id_map=character_id_map,
    scene_id_map=scene_id_map,
    prop_id_map=prop_id_map
)
# Structure of the returned tag_id_maps
{
    'character_tags': {
        '张三-youth': UUID('...'),
        '张三-adult': UUID('...')
    },
    'scene_tags': {...},
    'prop_tags': {...}
}
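The keys of the returned map combine the owner's name and the tag key, as in '张三-youth'. A sketch of how such a map could be assembled follows. build_tag_id_map is a hypothetical function, and uuid4() is only a stand-in for the ID the database would return for each stored tag.

```python
from typing import Dict, List
from uuid import UUID, uuid4


def build_tag_id_map(tags_by_owner: Dict[str, List[dict]]) -> Dict[str, UUID]:
    """Map '<owner name>-<tag_key>' to a tag ID."""
    tag_ids: Dict[str, UUID] = {}
    for owner_name, tags in tags_by_owner.items():
        for tag in tags:
            # uuid4() stands in for the ID assigned when the tag row is stored.
            tag_ids[f"{owner_name}-{tag['tag_key']}"] = uuid4()
    return tag_ids
```

One thing to keep in mind with this key format is that a name containing a hyphen could collide with another name and tag key pair; keying on a (name, tag_key) tuple would avoid that if it ever becomes a problem.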
5. Store storyboards
# Batch insert into the storyboards table and create the associations
for storyboard_data in parsed_data.get('storyboards', []):
    # Look up character IDs
    character_ids = [
        character_id_map.get(name)
        for name in storyboard_data.get('characters', [])
    ]
    # Look up tag IDs
    character_tag_ids = [
        tag_id_maps['character_tags'].get(f"{name}-{tag_key}")
        for name, tag_key in storyboard_data.get('character_tags', {}).items()
    ]
    # Create the storyboard
    storyboard = await repo.create_storyboard(
        screenplay_character_ids=character_ids,
        screenplay_character_tag_ids=character_tag_ids,
        ...
    )
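The lookups above use dict.get(), which yields None for any name that is missing from the map. Whether create_storyboard tolerates None entries is not stated, so a variant that separates resolved IDs from unmatched names may be safer. resolve_ids below is a hypothetical helper, not part of the source.

```python
from typing import Dict, Iterable, List, Tuple
from uuid import UUID


def resolve_ids(
    names: Iterable[str], id_map: Dict[str, UUID]
) -> Tuple[List[UUID], List[str]]:
    """Split names into resolved IDs and names with no matching entry."""
    found: List[UUID] = []
    missing: List[str] = []
    for name in names:
        if name in id_map:
            found.append(id_map[name])
        else:
            missing.append(name)
    return found, missing
```

The missing list can be logged per shot, which helps diagnose replies where the model names a character in a storyboard but omits it from the characters list.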
Testing Suggestions
1. Test screenplay parsing
# Run the test inside the Docker container.
# Replace the placeholder IDs with real UUIDs: the task converts screenplay_id with UUID(...).
docker exec jointo-server-app python -c "
from app.tasks.ai_tasks import parse_screenplay_task
screenplay_content = '''
Scene 1: Coffee shop - Day
张三 (30, programmer) sits in the coffee shop with a laptop in front of him.
张三: Just another ordinary afternoon...
'''
result = parse_screenplay_task.delay(
    job_id='test-job-id',
    user_id='test-user-id',
    screenplay_id='test-screenplay-id',
    screenplay_content=screenplay_content,
    model='gpt-4',
    auto_create_elements=True,
    auto_create_tags=True,
    auto_create_storyboards=True
)
print(f'Task ID: {result.id}')
"
2. Test the API endpoints
# Submit a parse request
curl -X POST http://localhost:8000/api/v1/screenplays/{screenplay_id}/parse \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "auto_create_elements": true,
    "auto_create_tags": true,
    "auto_create_storyboards": true
  }'
# Query the job status
curl http://localhost:8000/api/v1/ai/jobs/{job_id} \
  -H "Authorization: Bearer <token>"
3. Check the Celery worker logs
# Follow the AI worker logs
docker logs jointo-server-celery-ai -f
Related Documents
Requirements
- docs/requirements/backend/04-services/ai/ai-service.md: AI generation service requirements
Implementation
- docs/server/changelogs/2026-02-03-ai-api-routes-implementation.md: API route implementation
- docs/server/changelogs/2026-02-03-ai-celery-tasks-verification.md: Celery task verification
- docs/server/changelogs/2026-02-03-ai-services-implementation-summary.md: full implementation summary
Architecture
- docs/architecture/tech-stack.md: tech stack specification
Impact
New features
- ✅ Screenplay parsing Celery task
- ✅ AI Service parse_screenplay() method
- ✅ API endpoint POST /api/v1/screenplays/{screenplay_id}/parse
- ✅ Schema definitions (ScreenplayParseRequest, ScreenplayParseResponse)
Modified files
- server/app/tasks/ai_tasks.py: added parse_screenplay_task
- server/app/services/ai_service.py: added the parse_screenplay() method
- server/app/api/v1/screenplays.py: added the parse endpoint
- server/app/schemas/screenplay.py: added the parsing schemas
Not affected
- Existing API routes
- Database schema
- Frontend code
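Notes
- Access control: parsing a screenplay requires edit permission (editor)
- Screenplay content: the screenplay content must not be empty
- Credit deduction: a parsing task deducts user credits, so the user needs a sufficient balance
- Async execution: parsing runs asynchronously, so clients must poll the job status
- AI model: gpt-4 is the default; another model can be specified
- Storage options: automatic creation of elements, tags, and storyboards can each be turned on or off
- Retry on failure: a failed task is retried automatically, up to 3 times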
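Verification Checklist
- Celery task fully implemented
- AI Service method fully implemented
- API route fully implemented
- Schema definitions complete
- Code passes the getDiagnostics check
- Complies with the jointo-tech-stack specification
- Complete error handling
- Complete logging
- Credit management correct
- Access control correct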
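Summary
Stage 3 (screenplay parsing task implementation) is 100% complete:
✅ Completed:
- Celery async task (parse_screenplay_task)
- AI Service method (parse_screenplay())
- API route (POST /api/v1/screenplays/{screenplay_id}/parse)
- Schema definitions (ScreenplayParseRequest, ScreenplayParseResponse)
- Code quality verification
- Documentation
🎉 AI service development is complete:
- Stage 1: API route layer (22 endpoints) ✅
- Stage 2: Celery async tasks (7 task types) ✅
- Stage 3: Screenplay parsing task ✅
The AI service is now fully implemented and supports all AI generation scenarios, including screenplay parsing.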