
AI Model Adapter Implementation Document

Date: 2026-02-13
Type: Architecture enhancement (based on RFC 144)
Scope of impact: Backend API, AI Service

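📋 Overview

Building on RFC 144 (AI model capability configuration), this change implements the adapter pattern so that the frontend can call different AI models with one unified set of parameters, without needing to know each model's specific parameter format.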

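🎯 Core Value

Problem

Even with the capabilities configuration from RFC 144, the frontend still has to:

Understand how parameter formats differ between models (size vs resolution vs width/height)
Know that Sora uses "1280x720", Veo uses "720P", and DALL-E uses a fixed list of sizes
Assemble parameter strings by hand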

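Solution

Dual-mode design:

Mode	Use case	Parameter format	Conversion
Simple mode	Regular users, quick generation	`resolution: "720p"`, `aspectRatio: "16:9"`	Converted automatically by the adapter
Advanced mode	Professional users, fine-grained control	`width: 1280`, `height: 720`	Passed straight through to the model

The frontend picks a mode according to how complex its UI is; the backend detects the mode automatically and handles it.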

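🏗️ Architecture Design

Overall Flow

Frontend
 ├─ Simple UI: sends {resolution: "720p", aspectRatio: "16:9"}
 └─ Advanced UI: sends {width: 1280, height: 720}
      ↓
API layer (/api/v1/ai/generate/video)
 ├─ Detect parameter type (are resolution + aspectRatio both present?)
 │   ├─ Yes → convert via the adapter (simple mode)
 │   └─ No → use the original parameters as-is (advanced mode)
      ↓
AIService layer
 ├─ Pre-deduct credits
 └─ Create Celery task
      ↓
Provider layer (RFC 144)
 └─ Call the model using the capabilities configuration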
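The mode-detection step in the flow above can be sketched as a small pure function. This is only an illustration: `detect_mode` is a hypothetical helper name, and it assumes the request has already been parsed into a dict with snake_case keys.

```python
from typing import Any, Dict


def detect_mode(params: Dict[str, Any]) -> str:
    """Return "simple" when both unified fields are set, otherwise "advanced".

    Hypothetical helper: the name and the string return values are assumptions
    made for this sketch, not part of the documented API.
    """
    if params.get("resolution") and params.get("aspect_ratio"):
        return "simple"
    return "advanced"


print(detect_mode({"resolution": "720p", "aspect_ratio": "16:9"}))
print(detect_mode({"width": 1280, "height": 720}))
```

Requiring both fields means a request that sets only `resolution` is treated as advanced mode, which matches the "resolution + aspectRatio" check in the diagram.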

Core Components

1. Unified parameter models (`base.py`)

from typing import List, Optional

from pydantic import BaseModel


class UnifiedVideoParams(BaseModel):
    prompt: str
    resolution: str                  # "480p", "720p", "1080p", "2k", "4k"
    aspect_ratio: str                # "1:1", "4:3", "16:9", "9:16"
    duration: int
    reference_image: Optional[str] = None   # Optional, so None is a valid default


class UnifiedImageParams(BaseModel):
    prompt: str
    resolution: str                  # "512", "1024", "2048", "4096"
    aspect_ratio: str
    quality: str = "medium"          # "low", "medium", "high"
    reference_images: Optional[List[str]] = None
    num_images: int = 1

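2. Adapter base class

from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseAIAdapter(ABC):
    @abstractmethod
    def adapt_video_params(self, unified: UnifiedVideoParams) -> Dict[str, Any]:
        """Convert unified parameters into model-specific parameters."""

    # Helper methods (signatures only; bodies omitted here)
    @staticmethod
    def _calculate_dimensions(resolution, aspect_ratio, is_landscape) -> tuple[int, int]:
        ...

    @staticmethod
    def _select_closest_value(target, available) -> str:
        ...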
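The two helpers are only declared above, so the following is a minimal sketch of what they might do, not the actual implementation. Several things are assumptions: the standalone function names, the short-side table (in particular "2k" as 1440), dropping `is_landscape` in favour of reading orientation from the aspect ratio, returning an `int` rather than the `str` in the declared signature, and breaking ties toward the smaller value.

```python
from typing import Sequence, Tuple

# Assumed mapping from resolution tier to the length of the shorter side in pixels.
SHORT_SIDE = {"480p": 480, "720p": 720, "1080p": 1080, "2k": 1440, "4k": 2160}


def calculate_dimensions(resolution: str, aspect_ratio: str) -> Tuple[int, int]:
    """Return (width, height) for a resolution tier and a "W:H" aspect ratio."""
    short = SHORT_SIDE[resolution]
    w_ratio, h_ratio = (int(part) for part in aspect_ratio.split(":"))
    if w_ratio >= h_ratio:
        # Landscape or square: the height is the short side.
        height = short
        width = round(short * w_ratio / h_ratio)
    else:
        # Portrait: the width is the short side.
        width = short
        height = round(short * h_ratio / w_ratio)
    # Bump odd values up to the next even number (an assumption; many video
    # encoders require even dimensions).
    return width + width % 2, height + height % 2


def select_closest_value(target: int, available: Sequence[int]) -> int:
    """Pick the supported value nearest to target; ties go to the smaller value."""
    return min(available, key=lambda v: (abs(v - target), v))


print(calculate_dimensions("720p", "16:9"))
print(select_closest_value(5, [4, 8, 12]))
```

With these assumptions, a 720p 16:9 request resolves to 1280 by 720 and a requested duration of 5 snaps to 4, consistent with the Sora example later in this document.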

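3. Concrete adapters

Adapter	Models	Main conversion logic
`SoraAdapter`	sora-2, sora-2-pro	Fixed resolution mapping + duration selection (4/8/12)
`VeoAdapter`	veo-3.0, veo-3.1	Tier format ("720P") + duration (4/6/8)
`FluxAdapter`	flux-2-pro	Arbitrary dimensions + aspect_ratio
`OpenAIAdapter`	dall-e-3	Fixed size list + quality mapping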
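To make the "fixed resolution mapping + duration selection" row concrete, here is a hedged sketch of a Sora-style adapter. It is not the real `SoraAdapter`: the class name `SoraAdapterSketch`, the two-entry `SIZE_MAP`, and the use of a dataclass in place of the Pydantic model are all assumptions chosen to keep the example self-contained. Only the 4/8/12 duration set and the output keys come from this document.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class UnifiedVideoParams:
    """Stand-in for the Pydantic model in base.py (dataclass used for brevity)."""
    prompt: str
    resolution: str
    aspect_ratio: str
    duration: int
    reference_image: Optional[str] = None


class SoraAdapterSketch:
    # Illustrative table only; the real adapter's mapping is not shown in this document.
    SIZE_MAP = {
        ("720p", "16:9"): (1280, 720),
        ("720p", "9:16"): (720, 1280),
    }
    DURATIONS = (4, 8, 12)  # supported durations listed in the adapter table

    def adapt_video_params(self, unified: UnifiedVideoParams) -> Dict[str, Any]:
        key = (unified.resolution, unified.aspect_ratio)
        if key not in self.SIZE_MAP:
            raise ValueError(f"Unsupported resolution/aspect ratio: {key}")
        width, height = self.SIZE_MAP[key]
        # Snap to the nearest supported duration; ties go to the shorter one (assumption).
        duration = min(self.DURATIONS, key=lambda d: (abs(d - unified.duration), d))
        return {
            "prompt": unified.prompt,
            "width": width,
            "height": height,
            "duration": duration,
        }


params = UnifiedVideoParams("A cat running", "720p", "16:9", 5)
print(SoraAdapterSketch().adapt_video_params(params))
```

Raising `ValueError` for an unmapped combination gives the caller a clear signal it can use to fall back to advanced mode.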

4. Adapter factory

class AIAdapterFactory:
    # Maps a model name to the adapter class that handles it
    _adapters = {
        "sora-2": SoraAdapter,
        "veo-3.0": VeoAdapter,
        "flux-2-pro": FluxAdapter,
        "dall-e-3": OpenAIAdapter,
        # ...
    }

    @classmethod
    def get_adapter(cls, model_name: str) -> BaseAIAdapter:
        ...
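The body of `get_adapter` is not shown, so the sketch below illustrates one plausible registry lookup. The stub classes, the name `AdapterFactorySketch`, and the choice to raise `ValueError` for an unregistered model are assumptions; the real factory may behave differently, for example by returning a default adapter.

```python
from typing import Dict, Type


class BaseAdapterStub:
    """Hypothetical stand-in for BaseAIAdapter."""


class SoraAdapterStub(BaseAdapterStub):
    pass


class VeoAdapterStub(BaseAdapterStub):
    pass


class AdapterFactorySketch:
    _adapters: Dict[str, Type[BaseAdapterStub]] = {
        "sora-2": SoraAdapterStub,
        "veo-3.0": VeoAdapterStub,
    }

    @classmethod
    def get_adapter(cls, model_name: str) -> BaseAdapterStub:
        try:
            adapter_cls = cls._adapters[model_name]
        except KeyError:
            # Assumed behaviour: fail loudly so the caller can decide how to fall back.
            raise ValueError(f"No adapter registered for model {model_name!r}") from None
        return adapter_cls()


print(type(AdapterFactorySketch.get_adapter("sora-2")).__name__)
```

Keeping the registry as a plain dict means supporting a new model is a one-line addition plus the adapter class itself.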

5. API schema enhancement

from typing import Literal, Optional

from pydantic import BaseModel


class GenerateVideoRequest(BaseModel):
    # Advanced-mode parameters (existing)
    prompt: Optional[str]
    duration: int = 5
    fps: int = 30

    # Simple-mode parameters (new), both optional
    resolution: Optional[Literal["480p", "720p", "1080p", "2k", "4k"]] = None
    aspect_ratio: Optional[Literal["1:1", "4:3", "16:9", "9:16"]] = None

6. AIService integration

async def generate_video(..., **kwargs) -> Dict[str, Any]:
    # 🎯 Adapter logic: detect unified parameters
    resolution = kwargs.get('resolution')
    aspect_ratio = kwargs.get('aspect_ratio')

    if resolution and aspect_ratio:
        # Simple mode: convert through the adapter for the requested model
        adapter = AIAdapterFactory.get_adapter(model)
        adapted_params = adapter.adapt_video_params(UnifiedVideoParams(...))
        kwargs.update(adapted_params)

    # Continue with the existing logic (advanced mode reaches here unchanged)
    # ...

📂 File List

New files

File path	Purpose
`server/app/services/ai_adapters/base.py`	Adapter base class + unified parameter models
`server/app/services/ai_adapters/sora_adapter.py`	Sora adapter
`server/app/services/ai_adapters/veo_adapter.py`	Veo adapter
`server/app/services/ai_adapters/flux_adapter.py`	Flux adapter
`server/app/services/ai_adapters/openai_adapter.py`	OpenAI adapter
`server/app/services/ai_adapters/factory.py`	Adapter factory
`server/app/services/ai_adapters/__init__.py`	Package initialization

Modified files

File path	Change
`server/app/schemas/ai.py`	Added optional fields `resolution`, `aspect_ratio`, `quality`
`server/app/services/ai_service.py`	Added adapter detection logic at the start of `generate_video` and `generate_image`

Deleted files

File path	Reason
~~`server/app/schemas/unified_ai.py`~~	No new endpoint is needed; the existing endpoint handles both modes
~~`server/app/api/v1/ai_unified.py`~~	Same as above

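🧪 Usage Examples

Frontend call examples

Simple mode (unified parameters)

// Call the existing endpoint /api/v1/ai/generate/video
const response = await fetch('/api/v1/ai/generate/video', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    videoType: 'text2video',
    prompt: 'A cat running',
    model: 'sora-2',
    // 🎯 Unified parameters (adapted automatically)
    resolution: '720p',
    aspectRatio: '16:9',
    duration: 5
  })
})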

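The backend automatically converts this to the Sora format. The requested duration of 5 is snapped to the closest supported value among 4/8/12, which is 4:

{
    "prompt": "A cat running",
    "width": 1280,
    "height": 720,
    "duration": 4
}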

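Advanced mode (model-specific parameters)

// Pass width/height directly (advanced users)
const response = await fetch('/api/v1/ai/generate/video', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    videoType: 'text2video',
    prompt: 'A cat running',
    model: 'sora-2',
    // Advanced parameters (the adapter is not used)
    width: 1920,
    height: 1080,
    duration: 8
  })
})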

These are passed straight through to the Provider:

{
    "prompt": "A cat running",
    "width": 1920,
    "height": 1080,
    "duration": 8
}

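✅ Technical Advantages

Advantage	Description
Backward compatible	Existing API is unaffected; advanced mode behaves as before
Progressive enhancement	The frontend can migrate to simple mode gradually
Easy to extend	Adding a model only requires a new adapter, with no API changes
Separation of concerns	Adapters handle parameter conversion; the Service handles business logic
Graceful degradation	If the adapter fails, the request automatically falls back to advanced mode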
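The AIService snippet earlier does not show the fallback path, so the following is only a sketch of how graceful degradation might be wired in. `apply_adapter_with_fallback` and its `adapt` callback are hypothetical names, and logging a warning before passing the original parameters through unchanged is an assumed policy.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def apply_adapter_with_fallback(
    params: Dict[str, Any],
    adapt: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run the adapter in simple mode; on any adapter error, pass params through."""
    if not (params.get("resolution") and params.get("aspect_ratio")):
        return dict(params)  # advanced mode: nothing to convert
    try:
        adapted = adapt(params)
    except Exception:
        # Assumed policy: log the failure and degrade to advanced mode.
        logger.warning("Adapter failed; using original parameters", exc_info=True)
        return dict(params)
    # Adapter output overrides any overlapping keys, mirroring kwargs.update().
    return {**params, **adapted}


def failing_adapter(_: Dict[str, Any]) -> Dict[str, Any]:
    raise ValueError("unsupported combination")


request = {"prompt": "A cat running", "resolution": "720p", "aspect_ratio": "16:9"}
print(apply_adapter_with_fallback(request, failing_adapter))
```

One caveat with this policy: after a fallback the parameters still lack `width` and `height`, so the Provider layer would need sensible defaults for the request to succeed.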

📊 Testing Strategy

Unit tests

test_adapters/test_sora_adapter.py
test_adapters/test_veo_adapter.py
test_adapters/test_flux_adapter.py
test_adapters/test_openai_adapter.py
test_adapters/test_factory.py

Integration tests

Simple mode + Sora video generation
Simple mode + Flux image generation
Advanced mode (verify it is unaffected)
Fallback scenario (adapter failure)

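🔄 Future Improvements

Cache adaptation results: avoid repeating the conversion for identical parameters
Extend adapters: support more models (Gemini, Wan series)
Stronger parameter validation: check parameter validity early, in the adapter layer
Monitoring metrics: track simple-mode vs advanced-mode usage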
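The first item above, caching adaptation results, could be prototyped with the standard library's `functools.lru_cache`. This is a sketch under assumptions: `VideoKey`, `adapt_cached`, and the one-entry size table are hypothetical, and the cache size of 256 is arbitrary.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Tuple


@dataclass(frozen=True)
class VideoKey:
    """Hashable cache key; frozen dataclasses get __hash__ generated for them."""
    model: str
    resolution: str
    aspect_ratio: str
    duration: int


_SIZES = {("720p", "16:9"): (1280, 720)}  # illustrative table only
_DURATIONS = (4, 8, 12)


@lru_cache(maxsize=256)
def adapt_cached(params: VideoKey) -> Tuple[int, int, int]:
    """Return (width, height, duration); repeated keys are served from the cache."""
    width, height = _SIZES[(params.resolution, params.aspect_ratio)]
    duration = min(_DURATIONS, key=lambda d: (abs(d - params.duration), d))
    return width, height, duration


adapt_cached(VideoKey("sora-2", "720p", "16:9", 5))
adapt_cached(VideoKey("sora-2", "720p", "16:9", 5))  # equal key, so a cache hit
print(adapt_cached.cache_info())
```

The prompt is deliberately left out of the key, since the conversion shown here does not depend on it; including it would make nearly every request a cache miss.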

📚 Related Documents

Implemented by: Claude
Review status: ✅ Implementation complete
