# Changelog: Batch Dialogue Voiceover Generation Endpoint

**Date**: 2026-02-13
**Type**: Feature
**Scope**: AI Service, API, Celery Tasks, Storyboard Resources

---

## Summary

Adds the `/api/v1/ai/generate-dialogue-voiceovers` endpoint, which generates AI voiceovers for multiple storyboard dialogues in one batch and writes the results to the `storyboard_voiceovers` table automatically.

---

## Core Features

### 1. Batch voiceover generation

- ✅ Handles up to 50 dialogues per request
- ✅ Reads the text from `storyboard_dialogues` automatically
- ✅ Calls the TTS provider to generate the audio
- ✅ Uploads the audio to MinIO
- ✅ Writes a row to the `storyboard_voiceovers` table

### 2. Full parameter support

| Parameter | Type | Default | Description |
|------|------|--------|------|
| `storyboardId` | `string` | required | Storyboard ID |
| `dialogueIds` | `string[]` | required | List of dialogue IDs (1-50 items) |
| `voiceId` | `string` | required | Voice ID |
| `voiceName` | `string` | optional | Voice name |
| `speed` | `number` | 1.0 | Speaking rate (0.25-4.0) |
| `volume` | `number` | 1.0 | Volume (0.0-2.0) |
| `pitch` | `number` | 1.0 | Pitch (0.5-2.0) |
| `isActive` | `boolean` | false | Whether to mark the new voiceover as the active one |

**Notes**:
- ✅ Parameters use camelCase (API convention)
- ✅ The system selects the best audio model automatically

### 3. Partial-failure tolerance

- ✅ A failure on one dialogue does not affect the others
- ✅ Voiceovers that were generated successfully are kept
- ✅ Failed dialogues are flagged in the result

---

## Detailed Changes

### 1. Data model

**File**: `server/app/models/ai_job.py`

```python
class AIJobType(IntEnum):
    # ... existing types ...
    DIALOGUE_VOICEOVER = 10  # Batch dialogue voiceover generation
```

---

### 2. Request Schema

**File**: `server/app/schemas/ai.py`

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class GenerateDialogueVoiceoversRequest(BaseModel):
    """Request body for batch dialogue voiceover generation."""
    storyboard_id: str = Field(..., alias="storyboardId", description="Storyboard ID")
    dialogue_ids: list[str] = Field(..., min_length=1, max_length=50, alias="dialogueIds")
    voice_id: str = Field(..., alias="voiceId")
    voice_name: Optional[str] = Field(None, alias="voiceName")
    speed: float = Field(1.0, ge=0.25, le=4.0)
    volume: float = Field(1.0, ge=0.0, le=2.0)
    pitch: float = Field(1.0, ge=0.5, le=2.0)
    is_active: bool = Field(False, alias="isActive")

    # Accept both the camelCase aliases and the snake_case field names
    model_config = ConfigDict(populate_by_name=True)
```

---

### 3. API route

**File**: `server/app/api/v1/ai.py`

```python
@router.post("/generate-dialogue-voiceovers", response_model=SuccessResponse[AIJobResponse])
async def generate_dialogue_voiceovers(
    request: GenerateDialogueVoiceoversRequest,
    current_user: User = Depends(get_current_user),
    db: AsyncSession = Depends(get_session)
):
    """Generate AI voiceovers for a batch of dialogues."""
    # ...
```

**Endpoint**: `POST /api/v1/ai/generate-dialogue-voiceovers`

---

### 4. Service method

**File**: `server/app/services/ai_service.py`

```python
async def generate_dialogue_voiceovers(
    self,
    user_id: str,
    dialogue_ids: list[str],
    voice_id: str,
    voice_name: Optional[str] = None,
    speed: float = 1.0,
    volume: float = 1.0,
    pitch: float = 1.0,
    is_active: bool = False,
    model: Optional[str] = None,
    **kwargs
) -> Dict[str, Any]:
    """Generate AI voiceovers for a batch of dialogues."""
    # 1. Verify that the user exists
    # 2. Validate the number of dialogues (1-50)
    # 3. Load and validate all dialogues
    # 4. Verify that all dialogues belong to the same storyboard
    # 5. Verify storyboard permissions
    # 6. Check the quota
    # 7. Load the model configuration
    # 8. Compute the required credits (based on total character count)
    # 9. Pre-deduct the credits
    # 10. Create the AI job
    # 11. Start the Celery task
    # 12. Return the job_id
```

**Key validations**:
- ✅ All dialogues must belong to the same storyboard
- ✅ The user must have at least viewer permission on the storyboard
- ✅ Between 1 and 50 dialogues per request

---

### 5. Celery task

**File**: `server/app/tasks/ai_tasks.py`

```python
@celery_app.task(base=AITask, bind=True, max_retries=3)
def generate_dialogue_voiceovers_task(
    self,
    job_id: str,
    user_id: str,
    dialogue_ids: list[str],
    voice_id: str,
    model: str,
    voice_name: Optional[str] = None,
    speed: float = 1.0,
    volume: float = 1.0,
    pitch: float = 1.0,
    is_active: bool = False,
    **kwargs
):
    """Batch dialogue voiceover generation task (abridged)."""

    # Celery tasks are synchronous, so the awaited steps live in a coroutine.
    # Driving it with asyncio.run is an assumption made for this excerpt.
    async def _run():
        successful_voiceovers = []
        failed_dialogues = []

        # Process the dialogues one at a time
        for idx, dialogue_id in enumerate(dialogue_ids):
            try:
                # 1. Load the dialogue content
                dialogue = await get_dialogue_by_id(dialogue_id)
                text = dialogue.content

                # 2. Call the TTS provider (provider setup elided)
                result = await provider.generate_voice(
                    text=text,
                    voice_type=voice_id,
                    speed=speed,
                    ...
                )

                # 3. Upload the audio to MinIO
                metadata = await file_storage.upload_file(
                    file_content=result['audio_data'],
                    filename=f"dialogue_voice_{dialogue_id}.mp3",
                    category='ai-generated/dialogue-voiceovers',
                    ...
                )

                # 4. Write to the storyboard_voiceovers table
                if is_active:
                    await deactivate_all_voiceovers(dialogue_id)

                voiceover = StoryboardVoiceover(
                    voiceover_id=generate_uuid(),
                    dialogue_id=dialogue_id,
                    # The storyboard shared by all dialogues (lookup elided)
                    storyboard_id=storyboard_id,
                    audio_url=metadata.file_url,
                    status=ResourceStatus.COMPLETED,
                    is_active=is_active,
                    voice_id=voice_id,
                    voice_name=voice_name,
                    speed=speed,
                    volume=volume,
                    pitch=pitch,
                    ...
                )
                await create_voiceover(voiceover)

                successful_voiceovers.append({
                    'dialogue_id': dialogue_id,
                    'voiceover_id': str(voiceover.voiceover_id),
                    'audio_url': metadata.file_url
                })
            except Exception as e:
                # Record the failure and move on to the next dialogue
                failed_dialogues.append({
                    'dialogue_id': dialogue_id,
                    'error': str(e)
                })

        # Mark the job as completed
        await update_job_status(
            job_id,
            AIJobStatus.COMPLETED,
            progress=100,
            output_data={
                'successful_count': len(successful_voiceovers),
                'failed_count': len(failed_dialogues),
                'successful_voiceovers': successful_voiceovers,
                'failed_dialogues': failed_dialogues
            }
        )

    return asyncio.run(_run())
```

**Fault-tolerance strategy**:
- ✅ Processing continues after a partial failure
- ✅ Voiceovers that were generated successfully are kept
- ✅ Credits are not refunded (they are pre-deducted based on the total character count)

---

## Usage Examples

### API request

```bash
curl -X POST "http://localhost:6160/api/v1/ai/generate-dialogue-voiceovers" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "storyboardId": "a0b1c2d3-1234-5678-90ab-cdef12345677",
    "dialogueIds": [
      "d1d2d3d4-1234-5678-90ab-cdef12345678",
      "e2e3e4e5-1234-5678-90ab-cdef12345679",
      "f3f4f5f6-1234-5678-90ab-cdef12345680"
    ],
    "voiceId": "EXAVITQu4vr4xnSDxMaL",
    "voiceName": "Bella",
    "speed": 1.0,
    "volume": 1.0,
    "pitch": 1.0,
    "isActive": true
  }'
```

**Response**:

```json
{
  "code": 200,
  "message": "Batch voiceover generation job created",
  "data": {
    "jobId": "550e8400-e29b-41d4-a716-446655440000",
    "taskId": "celery-task-id",
    "status": "pending",
    "estimatedCredits": 150,
    "dialogueCount": 3
  }
}
```

---

### Query job status

```bash
curl -X GET "http://localhost:6160/api/v1/ai/jobs/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer $TOKEN"
```

**After the job completes**:

```json
{
  "code": 200,
  "message": "Query succeeded",
  "data": {
    "jobId": "550e8400-e29b-41d4-a716-446655440000",
    "jobType": 10,
    "status": 3,
    "progress": 100,
    "outputData": {
      "successful_count": 2,
      "failed_count": 1,
      "successful_voiceovers": [
        {
          "dialogue_id": "d1d2d3d4-1234-5678-90ab-cdef12345678",
          "voiceover_id": "v1v2v3v4-1234-5678-90ab-cdef12345678",
          "audio_url": "https://minio.example.com/ai-generated/dialogue-voiceovers/dialogue_voice_d1d2d3d4.mp3"
        },
        {
          "dialogue_id": "e2e3e4e5-1234-5678-90ab-cdef12345679",
          "voiceover_id": "v2v3v4v5-1234-5678-90ab-cdef12345679",
          "audio_url": "https://minio.example.com/ai-generated/dialogue-voiceovers/dialogue_voice_e2e3e4e5.mp3"
        }
      ],
      "failed_dialogues": [
        {
          "dialogue_id": "f3f4f5f6-1234-5678-90ab-cdef12345680",
          "error": "TTS generation failed: timeout"
        }
      ]
    }
  }
}
```

---

### Frontend integration

```typescript
// 1. Start batch voiceover generation
const res = await fetch('/api/v1/ai/generate-dialogue-voiceovers', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${token}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    storyboardId,
    dialogueIds,
    voiceId: selectedVoiceId,
    voiceName: selectedVoiceName,
    speed: 1.0,
    volume: 1.0,
    pitch: 1.0,
    isActive: true
  })
});
// fetch resolves to a Response; the payload must be parsed explicitly
const { data } = await res.json();

// 2. Poll the job status
const jobId = data.jobId;
const interval = setInterval(async () => {
  const jobRes = await fetch(`/api/v1/ai/jobs/${jobId}`, {
    headers: { Authorization: `Bearer ${token}` }
  });
  const { data: job } = await jobRes.json();

  if (job.status === 3) { // COMPLETED
    clearInterval(interval);
    console.log(`Succeeded: ${job.outputData.successful_count}`);
    console.log(`Failed: ${job.outputData.failed_count}`);

    // Refresh the storyboard dialogue list
    await refreshDialogues();
  }
}, 2000);
```

---

## Comparison with the Existing Endpoint

| Feature | `/generate-voice` | `/generate-dialogue-voiceovers` |
|------|------------------|--------------------------------|
| **Purpose** | General-purpose TTS | Storyboard dialogue voiceovers |
| **Input** | Free text | List of dialogue IDs |
| **Data source** | User input | `storyboard_dialogues` table |
| **Target table** | `ai_generation_results` | `storyboard_voiceovers` |
| **Batch support** | ❌ | ✅ Up to 50 items |
| **Storyboard association** | Optional | Enforced |
| **Permission check** | User existence | Storyboard permission |
| **Failure strategy** | All-or-nothing | Continues after partial failure |

**The existing endpoint is kept**:
- ✅ `/generate-voice` remains available for general TTS (non-storyboard scenarios)
- ✅ Testing voices and previewing results
- ✅ Use by other business modules

---

## Notes

### 1. Credit consumption

- **All credits are pre-deducted**: the deduction is made once, based on the total character count of all dialogues
- **No refund on partial failure**: credits are not returned even if some dialogues fail
- **Recommendation**: test with a few dialogues first, then generate in bulk once the result is confirmed

### 2. Performance

- **Maximum batch size**: 50 dialogues per request
- **Recommended batch size**: 10-20 dialogues
- **Timeout**: at most 60 seconds per dialogue

### 3. Failure handling

- **Partial failure**: voiceovers that were generated successfully are kept
- **Retry strategy**: failed dialogues can be resubmitted on their own
- **Finding the cause**: inspect `outputData.failed_dialogues`

### 4. Activating a voiceover

- **`isActive=true`**: the dialogue's other voiceovers are deactivated automatically
- **`isActive=false`**: the voiceover must be activated manually

---

## Future Improvements

### Short term

- [ ] Parallel generation (better performance)
- [ ] Finer-grained progress (live progress per dialogue)
- [ ] Automatic retry on failure

### Long term

- [ ] Voice presets (a default voice per character)
- [ ] Emotion mapping (adjust parameters based on the emotion field)
- [ ] Real-time preview
- [ ] Batch export

---

## Related Documents

- RFC 145: Batch Dialogue Voiceover Generation Endpoint
- RFC 142: ElevenLabs Integration
- RFC 144: AI Models Capability Config

---

## Testing

```bash
# 1. Create a test dialogue
curl -X POST "http://localhost:6160/api/v1/storyboard-resources/dialogues" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"storyboard_id":"xxx","content":"Test text 1",...}'

# 2. Generate voiceovers in batch
curl -X POST "http://localhost:6160/api/v1/ai/generate-dialogue-voiceovers" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"storyboardId":"xxx","dialogueIds":["d1","d2"],"voiceId":"alloy"}'

# 3. Verify that the voiceovers were written
curl -X GET "http://localhost:6160/api/v1/storyboard-resources/dialogues/d1/voiceovers" \
  -H "Authorization: Bearer $TOKEN"
```

---

## Acceptance Criteria

- [x] API endpoint implemented
- [x] Batch generation supported (1-50 dialogues)
- [x] Voiceovers written to the `storyboard_voiceovers` table automatically
- [x] Partial-failure tolerance supported
- [x] Full permission validation
- [x] Credit pre-deduction and consumption records
- [x] Celery asynchronous task implemented
- [x] Documentation complete (RFC + Changelog)
- [ ] Unit test coverage (to be added)
- [ ] Integration test coverage (to be added)
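
---

## Appendix: Partial-Failure Sketch

As a supplement to the fault-tolerance and retry notes in this changelog, here is a minimal, self-contained sketch of the per-dialogue loop: each item is processed independently, failures are collected rather than raised, and the failed IDs are extracted for resubmission. The names `process_batch` and `fake_tts` are hypothetical stand-ins for illustration and are not the project's actual functions; only the shape of the result dictionary mirrors the `outputData` documented here.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def process_batch(
    dialogue_ids: list[str],
    generate_one: Callable[[str], Awaitable[str]],
) -> dict[str, Any]:
    """Process each dialogue independently and collect failures."""
    successful: list[dict[str, str]] = []
    failed: list[dict[str, str]] = []
    for dialogue_id in dialogue_ids:
        try:
            audio_url = await generate_one(dialogue_id)
            successful.append({"dialogue_id": dialogue_id, "audio_url": audio_url})
        except Exception as exc:
            # One failing dialogue must not abort the rest of the batch
            failed.append({"dialogue_id": dialogue_id, "error": str(exc)})
    return {
        "successful_count": len(successful),
        "failed_count": len(failed),
        "successful_voiceovers": successful,
        "failed_dialogues": failed,
    }


async def fake_tts(dialogue_id: str) -> str:
    """Hypothetical stand-in for TTS plus upload; fails for the ID 'd2'."""
    if dialogue_id == "d2":
        raise TimeoutError("TTS timeout")
    return f"https://minio.example.com/dialogue_voice_{dialogue_id}.mp3"


result = asyncio.run(process_batch(["d1", "d2", "d3"], fake_tts))

# IDs that a client could resubmit in a follow-up request
retry_ids = [item["dialogue_id"] for item in result["failed_dialogues"]]
print(result["successful_count"], result["failed_count"], retry_ids)
```

Catching the exception inside the loop, rather than around it, is what keeps earlier successes intact; a client can then pass `retry_ids` as the `dialogueIds` of a new request.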