AI Model Capability Configuration and Adapter Architecture
Version: v1.1
Date: 2026-02-13
Status: ✅ Implemented
Latest update: unified data format + 3 additional adapters (Wan/Jimeng/Gemini)
📋 Table of Contents
🏗️ Overall Architecture
Three-Tier Architecture Design
graph TB
subgraph Frontend["🎨 Frontend Layer (React)"]
SimpleUI["Simple Mode UI<br/>Unified params"]
AdvancedUI["Advanced Mode UI<br/>Model-specific params"]
end
subgraph APILayer["🔌 API Layer (FastAPI)"]
VideoAPI["/api/v1/ai/generate/video"]
ImageAPI["/api/v1/ai/generate/image"]
subgraph ParamDetection["Parameter Detection"]
Detector{"resolution +<br/>aspectRatio<br/>present?"}
end
subgraph AdapterLayer["Adapter Layer"]
Factory["AIAdapterFactory<br/>Adapter factory"]
SoraAdapter["SoraAdapter"]
VeoAdapter["VeoAdapter"]
FluxAdapter["FluxAdapter"]
WanAdapter["WanAdapter"]
JimengAdapter["JimengAdapter"]
GeminiAdapter["GeminiAdapter"]
OpenAIAdapter["OpenAIAdapter"]
end
end
subgraph ServiceLayer["⚙️ Service Layer"]
AIService["AIService<br/>- Pre-deduct credits<br/>- Create task<br/>- Submit Celery task"]
end
subgraph ProviderLayer["🔧 Provider Layer"]
Provider["AIHubMixProvider<br/>- Read capabilities (JSONB)<br/>- capability_transformer conversion<br/>- Call AIHubMix API"]
end
subgraph Database["💾 Database Layer (PostgreSQL)"]
AIModels["ai_models table<br/>capabilities (JSONB)<br/>- reference_image: {supported, num}<br/>- input_reference: {supported, num}<br/>- size, quality, seconds..."]
end
subgraph External["🌐 External Services"]
AIHubMix["AIHubMix API<br/>Third-party AI models"]
end
%% Simple-mode flow
SimpleUI -->|"resolution: '720p'<br/>aspectRatio: '16:9'"| VideoAPI
SimpleUI -->|"resolution: '1024'<br/>aspectRatio: '16:9'"| ImageAPI
%% Advanced-mode flow
AdvancedUI -->|"width: 1280<br/>height: 720"| VideoAPI
AdvancedUI -->|"width: 1024<br/>height: 1024"| ImageAPI
%% API-layer handling
VideoAPI --> Detector
ImageAPI --> Detector
Detector -->|"Yes<br/>(simple mode)"| Factory
Detector -->|"No<br/>(advanced mode)"| AIService
%% Adapter conversion
Factory -->|"Get adapter"| SoraAdapter
Factory --> VeoAdapter
Factory --> FluxAdapter
Factory --> WanAdapter
Factory --> JimengAdapter
Factory --> GeminiAdapter
Factory --> OpenAIAdapter
SoraAdapter -->|"Convert to<br/>width/height/duration"| AIService
VeoAdapter -->|"Convert to<br/>model-specific params"| AIService
FluxAdapter --> AIService
WanAdapter --> AIService
JimengAdapter --> AIService
GeminiAdapter --> AIService
OpenAIAdapter --> AIService
%% Service-layer handling
AIService --> Provider
%% Provider-layer handling
Provider --> AIModels
AIModels -->|"Read capabilities"| Provider
Provider --> AIHubMix
AIHubMix -->|"Return result"| Provider
%% Style definitions
classDef frontendStyle fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
classDef apiStyle fill:#fff9c4,stroke:#f57f17,stroke-width:2px
classDef serviceStyle fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef providerStyle fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
classDef dbStyle fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef externalStyle fill:#f5f5f5,stroke:#616161,stroke-width:2px
classDef adapterStyle fill:#fff3e0,stroke:#e65100,stroke-width:1px
class SimpleUI,AdvancedUI frontendStyle
class VideoAPI,ImageAPI,Detector apiStyle
class Factory,SoraAdapter,VeoAdapter,FluxAdapter,WanAdapter,JimengAdapter,GeminiAdapter,OpenAIAdapter adapterStyle
class AIService serviceStyle
class Provider providerStyle
class AIModels dbStyle
class AIHubMix externalStyle
Architecture notes:
- 🎨 Frontend layer: dual-mode UI
  - Simple mode: unified parameters (resolution, aspectRatio, quality)
  - Advanced mode: model-specific parameters (width, height, duration)
- 🔌 API layer: smart routing and adaptation
  - Unified entry point: /api/v1/ai/generate/{video|image}
  - Automatically detects the parameter type and dispatches to the matching flow
  - The adapter factory manages 7 model adapters
- ⚙️ Service layer: business logic
  - Credit pre-deduction, task creation, async task submission
  - Handles simple- and advanced-mode parameters uniformly after conversion
- 🔧 Provider layer: external service integration
  - Reads model capabilities (JSONB) from the database
  - Applies capability_transformer for a second conversion
  - Calls the AIHubMix API
- 💾 Database layer: configuration storage
  - The ai_models table stores model capability configs
  - reference_image/input_reference use the object format
  - A GIN index supports JSONB queries
🧩 Core Components
1. Database Layer (RFC 144)
Table: ai_models
| Field | Type | Description |
|---|---|---|
| model_id | UUID | Primary key |
| model_name | VARCHAR | Model name (e.g. sora-2) |
| model_type | SMALLINT | 1=text, 2=image, 3=video, 4=audio |
| capabilities | JSONB | Model capability configuration (core field) |
Example capabilities:
{
"size": {
"values": ["720x1280", "1280x720"],
"default": "720x1280"
},
"seconds": {
"values": ["4", "8", "12"],
"default": "4"
},
"input_reference": {
"supported": true,
"num": 1
}
}
Data format specification (v1.1 update):
- Object format (recommended): reference_image and input_reference are represented as objects:
  { "reference_image": {"supported": true, "num": 5}, "input_reference": {"supported": false} }
- Field semantics:
  - supported: whether the capability is available
  - num: maximum supported count (optional; omitted when unsupported)
- Backward compatibility: capability_transformer also accepts the legacy integer format (e.g. "reference_image": 1)
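The dual-format rule above can be captured by a small normalizer. This is a hypothetical sketch mirroring the documented behavior of capability_transformer, not its actual code:

```python
# Hypothetical normalizer: accept both the recommended object format
# and the legacy integer format, returning the object form in all cases.
def normalize_capability(value) -> dict:
    if isinstance(value, dict):
        # Already in the object format (recommended).
        return value
    if isinstance(value, int):
        # Legacy integer format, e.g. "reference_image": 1.
        if value > 0:
            return {"supported": True, "num": value}
        return {"supported": False}
    # Unknown shape: treat as unsupported.
    return {"supported": False}
```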
GIN index: idx_ai_models_capabilities_gin
- Purpose: fast lookup of models with a specific capability
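A typical query this index accelerates might look like the following; the exact containment predicate is an assumption about how callers filter, not a query taken from the codebase:

```sql
-- Find active video models that support input reference images.
-- The @> containment operator on the capabilities column is the kind
-- of predicate a GIN index on JSONB serves efficiently.
SELECT model_name
FROM ai_models
WHERE model_type = 3
  AND capabilities @> '{"input_reference": {"supported": true}}'::jsonb;
```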
2. Adapter Layer (Adapter Pattern)
Class diagram: adapter pattern structure
classDiagram
class BaseAIAdapter {
<<abstract>>
+adapt_video_params(unified: UnifiedVideoParams) Dict~str, Any~*
+adapt_image_params(unified: UnifiedImageParams) Dict~str, Any~*
}
class UnifiedVideoParams {
+prompt: str
+resolution: str
+aspect_ratio: str
+duration: int
+reference_image: str?
}
class UnifiedImageParams {
+prompt: str
+resolution: str
+aspect_ratio: str
+quality: str
+reference_images: List~str~?
+num_images: int
}
class SoraAdapter {
+adapt_video_params() Dict
-_get_sora_size() Tuple
-_get_sora_duration() str
}
class VeoAdapter {
+adapt_video_params() Dict
-_get_veo_size() str
-_get_veo_duration() str
}
class FluxAdapter {
+adapt_image_params() Dict
-_calculate_dimensions() Tuple
-_map_quality() str
}
class OpenAIAdapter {
+adapt_image_params() Dict
-_get_dalle_size() str
-_map_quality() str
}
class WanAdapter {
+adapt_video_params() Dict
-_get_wan_size() str
-_get_wan_duration() str
}
class JimengAdapter {
+is_pro: bool
+adapt_video_params() Dict
-_get_jimeng_size() str
}
class GeminiAdapter {
+adapt_image_params() Dict
-_get_gemini_size() str
-_map_quality() str
}
class AIAdapterFactory {
-_adapters: Dict~str, Type~
-_instances: Dict~str, BaseAIAdapter~
+get_adapter(model_name: str)$ BaseAIAdapter
+register_adapter(model_name: str, adapter: Type)$
}
BaseAIAdapter <|-- SoraAdapter
BaseAIAdapter <|-- VeoAdapter
BaseAIAdapter <|-- FluxAdapter
BaseAIAdapter <|-- OpenAIAdapter
BaseAIAdapter <|-- WanAdapter
BaseAIAdapter <|-- JimengAdapter
BaseAIAdapter <|-- GeminiAdapter
BaseAIAdapter ..> UnifiedVideoParams : uses
BaseAIAdapter ..> UnifiedImageParams : uses
AIAdapterFactory ..> BaseAIAdapter : creates
note for BaseAIAdapter "Abstract base class\nDefines the adapter interface"
note for AIAdapterFactory "Singleton\nManages all adapter instances"
note for UnifiedVideoParams "Internal unified params\nVideo generation"
note for UnifiedImageParams "Internal unified params\nImage generation"
2.1 Base Class
File: server/app/services/ai_adapters/base.py
class UnifiedVideoParams(BaseModel):
    """Unified video parameters (internal use)"""
    prompt: str
    resolution: str  # "480p", "720p", "1080p", "2k", "4k"
    aspect_ratio: str  # "1:1", "4:3", "16:9", "9:16"
    duration: int
    reference_image: Optional[str] = None

class UnifiedImageParams(BaseModel):
    """Unified image parameters (internal use)"""
    prompt: str
    resolution: str  # "512", "1024", "2048", "4096"
    aspect_ratio: str
    quality: str = "medium"  # "low", "medium", "high"
    reference_images: Optional[List[str]] = None

class BaseAIAdapter(ABC):
    @abstractmethod
    def adapt_video_params(self, unified: UnifiedVideoParams) -> Dict[str, Any]:
        """Convert unified parameters into model-specific parameters"""
        pass

    @abstractmethod
    def adapt_image_params(self, unified: UnifiedImageParams) -> Dict[str, Any]:
        """Convert unified parameters into model-specific parameters"""
        pass
2.2 Implemented Adapters
| Adapter | File | Supported Models | Core Conversion Logic |
|---|---|---|---|
| SoraAdapter | sora_adapter.py | sora-2, sora-2-pro | Fixed size mapping + duration selection (4/8/12) |
| VeoAdapter | veo_adapter.py | veo-3.0, veo-3.1, veo-3.1-fast | Tier format ("720P") + duration (4/6/8) |
| FluxAdapter | flux_adapter.py | flux-2-pro, flux-2-flex, FLUX.1-Kontext-pro | Arbitrary sizes + aspect_ratio |
| OpenAIAdapter | openai_adapter.py | dall-e-3, gpt-image-1.5, gpt-image-1, gpt-image-1-mini | Fixed size list + quality mapping |
| WanAdapter | wan_adapter.py | wan2.6-t2v, wan2.6-i2v | Multiple size mappings + duration selection (5/10s) |
| JimengAdapter | jimeng_adapter.py | jimeng-3.0-720p, jimeng-3.0-1080p, jimeng-3.0-pro | Fixed resolutions + 10s support in the Pro variant |
| GeminiAdapter | gemini_adapter.py | gemini-2.5-flash-image, gemini-3-pro-image-preview | K-format sizes + quality mapping |
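The "fixed size mapping + duration selection" logic attributed to SoraAdapter can be sketched as follows. The size table and allowed durations here are illustrative assumptions, not SoraAdapter's actual code:

```python
# Illustrative sketch of a fixed-size mapping plus duration snapping.
# The mapping entries and allowed values are assumptions for the example.
SIZE_MAP = {
    ("720p", "16:9"): (1280, 720),
    ("720p", "9:16"): (720, 1280),
}
ALLOWED_SECONDS = [4, 8, 12]

def adapt_video_params(resolution: str, aspect_ratio: str, duration: int) -> dict:
    # Look up the fixed width/height for this resolution tier and aspect ratio.
    width, height = SIZE_MAP[(resolution, aspect_ratio)]
    # Snap the requested duration to the nearest allowed value.
    snapped = min(ALLOWED_SECONDS, key=lambda s: abs(s - duration))
    return {"width": width, "height": height, "duration": snapped}
```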
2.3 Adapter Factory
File: server/app/services/ai_adapters/factory.py
class AIAdapterFactory:
"""适配器工厂(单例模式)"""
_adapters = {
"sora-2": SoraAdapter,
"sora-2-pro": SoraAdapter,
"veo-3.0-generate-preview": VeoAdapter,
"veo-3.1-generate-preview": VeoAdapter,
"veo-3.1-fast-generate-preview": VeoAdapter,
"flux-2-pro": FluxAdapter,
"flux-2-flex": FluxAdapter,
"FLUX.1-Kontext-pro": FluxAdapter,
"dall-e-3": OpenAIAdapter,
"gpt-image-1.5": OpenAIAdapter,
"gpt-image-1": OpenAIAdapter,
"gpt-image-1-mini": OpenAIAdapter,
"wan2.6-t2v": WanAdapter,
"wan2.6-i2v": WanAdapter,
"jimeng-3.0-720p": JimengAdapter,
"jimeng-3.0-1080p": JimengAdapter,
"jimeng-3.0-pro": lambda: JimengAdapter(is_pro=True),
"gemini-2.5-flash-image": GeminiAdapter,
"gemini-3-pro-image-preview": GeminiAdapter,
}
    @classmethod
    def get_adapter(cls, model_name: str) -> BaseAIAdapter:
        """Return the adapter instance for a model"""
        adapter_class = cls._adapters.get(model_name)
        if not adapter_class:
            raise ValueError(f"Unsupported model: {model_name}")
        return adapter_class()
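Note that jimeng-3.0-pro registers a lambda rather than a class; since get_adapter simply calls whatever is registered, both work. A self-contained sketch of that pattern (names simplified from the real factory):

```python
# Minimal demonstration of the "class or lambda" registration pattern
# used by the factory: both entries are callables that yield an adapter.
class JimengAdapter:
    def __init__(self, is_pro: bool = False):
        self.is_pro = is_pro

_adapters = {
    "jimeng-3.0-720p": JimengAdapter,                      # plain class
    "jimeng-3.0-pro": lambda: JimengAdapter(is_pro=True),  # pre-configured via lambda
}

def get_adapter(model_name: str) -> JimengAdapter:
    factory = _adapters.get(model_name)
    if factory is None:
        raise ValueError(f"Unsupported model: {model_name}")
    # Classes and lambdas are both callable, so one code path covers both.
    return factory()
```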
3. API Schema Layer
File: server/app/schemas/ai.py
class GenerateVideoRequest(BaseModel):
"""视频生成请求(支持双模式)"""
# 高级模式参数(原有)
video_type: Literal['text2video', 'img2video']
prompt: Optional[str]
duration: int = 5
fps: int = 30
    # Simple-mode parameters (new, optional)
resolution: Optional[Literal["480p", "720p", "1080p", "2k", "4k"]] = None
aspect_ratio: Optional[Literal["1:1", "4:3", "16:9", "9:16"]] = None
class Config:
populate_by_name = True
4. Service Layer Integration
File: server/app/services/ai_service.py
async def generate_video(..., **kwargs) -> Dict[str, Any]:
    # 🎯 Adapter logic: detect unified parameters
    resolution = kwargs.get('resolution')
    aspect_ratio = kwargs.get('aspect_ratio')
    if resolution and aspect_ratio:
        # Simple mode: use an adapter
        logger.info("Unified parameters detected, using adapter mode")
        from app.services.ai_adapters.factory import AIAdapterFactory
        from app.services.ai_adapters.base import UnifiedVideoParams
        try:
            unified = UnifiedVideoParams(
                prompt=prompt or "",
                resolution=resolution,
                aspect_ratio=aspect_ratio,
                duration=duration,
                reference_image=image_url
            )
            adapter = AIAdapterFactory.get_adapter(model)
            adapted_params = adapter.adapt_video_params(unified)
            # Override the incoming parameters
            kwargs.update(adapted_params)
            logger.info("Adapter conversion complete: %s", adapted_params)
        except Exception as e:
            logger.warning("Adapter conversion failed: %s, falling back to advanced mode", str(e))
    # Continue with the existing flow...
🔄 Data Flow
Sequence diagram: full request-handling flow
sequenceDiagram
participant FE as Frontend
participant API as API Layer
participant Factory as Adapter Factory
participant Adapter as Model Adapter
participant Service as AIService
participant Provider as AIHubMixProvider
participant DB as PostgreSQL
participant AIHub as AIHubMix API
rect rgb(230, 245, 255)
Note over FE,API: Simple-mode request
FE->>API: POST /ai/generate/video<br/>{model: "sora-2", resolution: "720p", aspectRatio: "16:9"}
API->>API: Detect params: resolution + aspectRatio present
API->>Factory: get_adapter("sora-2")
Factory->>Adapter: Return SoraAdapter instance
Adapter->>Adapter: adapt_video_params()<br/>720p + 16:9 → width:1280, height:720
Adapter-->>API: {width: 1280, height: 720, duration: 4}
end
rect rgb(255, 243, 224)
Note over API,Service: Service-layer processing
API->>Service: generate_video(model="sora-2", width=1280, height=720)
Service->>Service: Pre-deduct user credits
Service->>Service: Create AITask record
end
rect rgb(232, 245, 233)
Note over Service,AIHub: Provider-layer call
Service->>Provider: generate_video_async()
Provider->>DB: SELECT capabilities FROM ai_models WHERE model_name='sora-2'
DB-->>Provider: {"size": {...}, "seconds": {...}, "input_reference": {...}}
Provider->>Provider: capability_transformer.transform()
Provider->>AIHub: POST /v1/video/generate<br/>{size: "1280x720", seconds: "4"}
AIHub-->>Provider: {task_id: "xxx", status: "processing"}
Provider-->>Service: Return task ID
Service-->>API: Return response
API-->>FE: {success: true, task_id: "xxx"}
end
rect rgb(255, 249, 196)
Note over FE,API: Advanced-mode request (adapter skipped)
FE->>API: POST /ai/generate/video<br/>{model: "sora-2", width: 1920, height: 1080}
API->>API: Detect params: no resolution, pass through
API->>Service: generate_video(width=1920, height=1080)
Service->>Provider: Use raw parameters directly
end
Simple Mode Flow
1. The frontend sends unified parameters
POST /api/v1/ai/generate/video
{
"model": "sora-2",
"prompt": "A cat running",
"resolution": "720p",
"aspectRatio": "16:9",
"duration": 5
}
2. AIService detects resolution + aspectRatio
→ calls AIAdapterFactory.get_adapter("sora-2")
→ returns a SoraAdapter instance
3. SoraAdapter converts the parameters
UnifiedVideoParams {
resolution: "720p",
aspect_ratio: "16:9",
duration: 5
}
↓
{
"width": 1280,
"height": 720,
"duration": 4 // snapped to the nearest of 4/8/12
}
4. AIService uses the converted parameters
→ calls AIHubMixProvider
→ Provider reads model.capabilities
→ capability_transformer converts again (if needed)
→ calls the AIHubMix API
Advanced Mode Flow
1. The frontend sends model-specific parameters
POST /api/v1/ai/generate/video
{
"model": "sora-2",
"prompt": "A cat running",
"width": 1920,
"height": 1080,
"duration": 8
}
2. AIService does not detect resolution + aspectRatio
→ the adapter is skipped
→ the raw parameters are used as-is
3. The existing flow continues
→ calls AIHubMixProvider
→ Provider reads model.capabilities
→ calls the AIHubMix API
✅ Implemented Features
1. RFC 144: AI Model Capability Configuration
- ✅ Database schema update (capabilities JSONB field + GIN index)
- ✅ Alembic migration script
- ✅ Data migration script (31 model configurations)
- ✅ API schema enhancement (returns capabilities)
- ✅ Provider-layer integration (capability_transformer)
2. Adapter Pattern
- ✅ Adapter base class (BaseAIAdapter)
- ✅ 7 concrete adapters (Sora, Veo, Flux, OpenAI, Wan, Jimeng, Gemini)
- ✅ Adapter factory (AIAdapterFactory)
- ✅ Dual-mode API schema support
- ✅ Service-layer auto-detection and conversion
3. Data Integrity
- ✅ Image models: 3/3 configured with capabilities
- ✅ Video models: 9/9 configured with capabilities
- ✅ Total: 12 active models, 100% configured
- ✅ Unified data format: all reference_image/input_reference fields upgraded to the object format
🚀 Extension Roadmap
Phase 1: Additional Adapters (✅ completed)
Goal: cover the models already in the database
| Adapter | Supported Models | Status |
|---|---|---|
| WanAdapter | wan2.6-t2v, wan2.6-i2v | ✅ Implemented |
| JimengAdapter | jimeng-3.0-720p, jimeng-3.0-1080p, jimeng-3.0-pro | ✅ Implemented |
| GeminiAdapter | gemini-2.5-flash-image, gemini-3-pro-image-preview | ✅ Implemented |
Implementation record:
# ✅ Adapter files created
server/app/services/ai_adapters/wan_adapter.py # 86 lines
server/app/services/ai_adapters/jimeng_adapter.py # 86 lines
server/app/services/ai_adapters/gemini_adapter.py # 75 lines
# ✅ Registered in the factory
server/app/services/ai_adapters/factory.py
- 7 new model mappings
- lambda-based parameter passing (jimeng-3.0-pro)
# ⚠️ Still to add
server/tests/unit/test_adapters/ # unit tests
Phase 2: More Adapters (priority: medium)
Goal: cover all models configured in the migration script
| Adapter | Supported Models | DB Status | Priority |
|---|---|---|---|
| ImagenAdapter | imagen-4.0-, imagen-3.0- | Not in DB | 🟡 Medium |
| QwenAdapter | qwen-image, qwen-image-edit | Not in DB | 🟡 Medium |
| DoubaoAdapter | doubao-seedream-4-5, doubao-seedream-4-0 | Not in DB | 🟡 Medium |
| IragAdapter | irag-1.0, ernie-irag-edit | Not in DB | 🟢 Low |
| IdeogramAdapter | V3 | Not in DB | 🟢 Low |
Note: These models already have capabilities configured in the migration script, but have not yet been added to the database. Implement when business needs arise.
Phase 3: Enhancements (priority: low)
- Cache optimization
  - Cache adapter conversion results
  - Avoid repeated computation
- Stronger parameter validation
  - Validate parameters early, in the adapter layer
  - Provide friendly error messages
- Monitoring metrics
  - Track simple-mode vs advanced-mode usage
  - Monitor the adapter conversion failure rate
  - Track per-model call frequency
- Dynamic adapter registration
  - Register new adapters at runtime
  - No service restart required
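Dynamic registration could build on the register_adapter classmethod already shown in the class diagram. A minimal self-contained sketch; the factory here is simplified and Veo4Adapter is a hypothetical example:

```python
# Simplified factory illustrating runtime adapter registration;
# the real AIAdapterFactory is assumed to look similar.
class BaseAIAdapter:
    def adapt_video_params(self, unified):
        raise NotImplementedError

class AIAdapterFactory:
    _adapters: dict = {}

    @classmethod
    def register_adapter(cls, model_name: str, adapter: type) -> None:
        # Mutating the class-level mapping takes effect immediately,
        # with no service restart.
        cls._adapters[model_name] = adapter

    @classmethod
    def get_adapter(cls, model_name: str) -> BaseAIAdapter:
        adapter_class = cls._adapters.get(model_name)
        if adapter_class is None:
            raise ValueError(f"Unsupported model: {model_name}")
        return adapter_class()

# Hypothetical new adapter registered at runtime:
class Veo4Adapter(BaseAIAdapter):
    def adapt_video_params(self, unified):
        return {"size": "1080P"}

AIAdapterFactory.register_adapter("veo-4.0", Veo4Adapter)
```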
📖 Operations Manual
Adding a New Model
Scenario: AIHubMix releases a new model, veo-4.0
Steps:
# 1. Add the capabilities config to the migration script
# server/scripts/migrate_model_capabilities.py
MODEL_CAPABILITIES = {
    # ...
    'veo-4.0': {
        'size': {'values': ['720P', '1080P', '4K'], 'default': '1080P'},
        'seconds': {'values': ['4', '6', '8', '10'], 'default': '8'},
        'input_reference': {'supported': True, 'num': 1}  # object format
    },
}
# 2. Run the data migration
docker exec jointo-server-app python scripts/migrate_model_capabilities.py
# 3. Create or reuse an adapter
# If the parameter format matches Veo 3.x, reuse VeoAdapter directly
# 4. Register in the factory
# server/app/services/ai_adapters/factory.py
AIAdapterFactory._adapters.update({
    "veo-4.0": VeoAdapter,  # reuse the existing adapter
})
# 5. Restart services
docker restart jointo-server-app jointo-server-celery-ai
Modifying a Model's Configuration
Scenario: Sora 3.0 adds a new 16s duration option
# 1. Update the migration script
MODEL_CAPABILITIES['sora-3.0']['seconds']['values'].append('16')
# 2. Update the database manually (or re-run the migration)
docker exec jointo-server-postgres psql -U jointoAI -d jointo -c "
UPDATE ai_models
SET capabilities = jsonb_set(
    capabilities,
    '{seconds,values}',
    '[\"4\", \"8\", \"12\", \"16\"]'::jsonb
)
WHERE model_name = 'sora-3.0';
"
# 3. Restart the service (optional, if the config needs reloading)
docker restart jointo-server-app
Debugging an Adapter
Scenario: adapter conversion fails
# 1. Check the logs
docker logs jointo-server-app --tail=100 | grep "adapter"
# Sample output
# 2026-02-13 03:00:00 | INFO | Unified parameters detected (resolution=720p, aspect_ratio=16:9), using adapter mode
# 2026-02-13 03:00:00 | WARNING | Adapter conversion failed: Unsupported model: unknown-model, falling back to advanced mode
# 2. Test the adapter manually
from app.services.ai_adapters.factory import AIAdapterFactory
from app.services.ai_adapters.base import UnifiedVideoParams
unified = UnifiedVideoParams(
    prompt="test",
    resolution="720p",
    aspect_ratio="16:9",
    duration=5
)
adapter = AIAdapterFactory.get_adapter("sora-2")
result = adapter.adapt_video_params(unified)
print(result)  # {'width': 1280, 'height': 720, 'duration': 4}
Performance Tuning
Scenario: heavy request volume makes the adapter a bottleneck
# 1. Add caching (in AIService)
from functools import lru_cache
@lru_cache(maxsize=1000)
def _get_adapter_cached(model_name: str):
    return AIAdapterFactory.get_adapter(model_name)
# 2. Batch conversion (if supported)
def adapt_batch(requests: List[UnifiedVideoParams]):
    return [adapter.adapt_video_params(req) for req in requests]
📊 Architecture Benefits
| Benefit | Description |
|---|---|
| Backward compatible | Existing API untouched; advanced mode unchanged |
| Progressive enhancement | Frontend can migrate to simple mode gradually |
| Easy to extend | New models only need an adapter; no API changes |
| Separation of concerns | Adapters handle parameter conversion; Service handles business logic |
| Graceful degradation | Falls back to advanced mode when an adapter fails |
| Config-driven | Model parameters live in the database and can be updated dynamically |
📚 Related Documents
Maintainer: Claude
Last updated: 2026-02-13
Review status: ✅ Complete