Commit 7bf71620 7bf71620f01eb8ff3bc8ab5cdd8d9832a9780575 by 沈秋雨

Initial commit

0 parents
# Required for qwen
QWEN_API_KEY=your_api_key
QWEN_DASHSCOPE_API_KEY=
QWEN_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
QWEN_MODEL=qwen3-omni-flash
QWEN_TIMEOUT=15
QWEN_LYRICS_TIMEOUT=90
QWEN_MAX_RETRIES=3
MUSIC_ANALYZE_LIGHT_MODE=true
MUSIC_DOWNLOAD_DIR=music
MUSIC_MAPPING_FILE=music/music_file_mapping.csv

# Optional song structure service
SONGFORMER_URL=

# Optional ASR backend for lyrics_only path
MUSIC_LYRICS_ASR_BACKEND=funasr
DASHSCOPE_FUNASR_MODEL=fun-asr
DASHSCOPE_BASE_HTTP_API_URL=https://dashscope.aliyuncs.com/api/v1
DASHSCOPE_ASR_POLL_INTERVAL=1
DASHSCOPE_ASR_POLL_TIMEOUT=120
DASHSCOPE_ASR_SUBMIT_URL=https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription
DASHSCOPE_ASR_MODEL=qwen3-asr-flash-filetrans
DASHSCOPE_TASK_STATUS_BASE_URL=https://dashscope.aliyuncs.com/api/v1/tasks
.DS_Store

# Python cache
__pycache__/
*.py[cod]
*.so
.pytest_cache/
.mypy_cache/

# Virtual env
.venv/
venv/

# Local env
.env

# Logs
logs/
*.log

# Runtime outputs
outputs/
music/
*.checkpoint.json

# Local test/sample data
*.xlsx
*.xls
*.csv

# Keep env template and source files
!.env.example
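# music_analyze_v2

This project is a standalone pipeline that runs audio tag analysis in batch over an Excel file.

Actual main flow:

1. Read the input `xlsx`
2. Take the audio URL from the specified URL column
3. Pass selected metadata through to the music analyzer
4. Call `app.middleware.music_analyze.analyze_music(...)`
5. Shape the result into fixed delivery columns and keep writing it back to the output `xlsx`
6. Support resuming an interrupted run from the existing output file and checkpoint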
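
The flow above can be sketched roughly as follows. This is a minimal illustration over plain dictionaries, not the code in `pipeline/batch_analyze_xlsx.py`: `run_batch`, `fake_analyze`, and the chosen metadata columns are hypothetical names, and the stub stands in for `analyze_music(...)` so that nothing is sent over the network.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

Row = Dict[str, Any]
Analyze = Callable[..., Optional[Dict[str, Any]]]


def run_batch(
    rows: List[Row],
    analyze: Analyze,
    url_column: str = "URL",
    metadata_columns: Sequence[str] = ("歌曲名", "表演者"),
) -> List[Optional[Dict[str, Any]]]:
    """Analyze each row in order; rows with an empty URL yield None."""
    results: List[Optional[Dict[str, Any]]] = []
    for row in rows:
        url = str(row.get(url_column) or "").strip()
        if not url:
            # Empty URL: skip the row without calling the analyzer.
            results.append(None)
            continue
        # Pass through only the metadata columns that have a value.
        metadata = {c: row[c] for c in metadata_columns if row.get(c)}
        results.append(analyze(metadata=metadata, music_url=url))
    return results


def fake_analyze(metadata: Dict[str, Any], music_url: str) -> Dict[str, Any]:
    """Hypothetical stand-in for analyze_music; makes no network call."""
    return {"language": "普通话", "_url": music_url, "_meta": metadata}


rows = [
    {"URL": "https://example.com/a.mp3", "歌曲名": "稻香"},
    {"URL": "", "歌曲名": "empty row"},
]
results = run_batch(rows, fake_analyze)
```

The real script additionally runs rows concurrently and writes results to the output `xlsx` as it goes; both are left out here to keep the sketch short.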

The current batch entry point is [`pipeline/batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py).

## Current status

- Runnable main entry point: [`pipeline/batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py)
- Default analysis chain: `QwenAnalyzer`
- Provider that actually works: `qwen`
- Prompt source: [`app/prompts/step2_music_decode`](app/prompts/step2_music_decode)
- Output format: fixed delivery columns; the original input columns are not all preserved

Notes:

- The CLI still accepts `--provider doubao`, but [`factory.py`](app/middleware/music_analyze/factory.py) only instantiates `qwen`, so passing `doubao` fails at runtime.
- The rest of this README describes what the code currently does, not what was historically planned.

## Installation

```bash
python3.10 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
```

## Environment variables

The minimum required configuration is usually:

```env
QWEN_API_KEY=your_api_key
QWEN_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
QWEN_MODEL=qwen3-omni-flash
QWEN_TIMEOUT=15
QWEN_LYRICS_TIMEOUT=90
QWEN_MAX_RETRIES=3
```

The project also supports the following optional enhancements:

- `QWEN_DASHSCOPE_API_KEY`: used by some DashScope/ASR paths
- `SONGFORMER_URL`: enables extra audio structure features
- `MUSIC_LYRICS_ASR_BACKEND`, `DASHSCOPE_*`: configuration for lyrics extraction
- `OSS_*`: fallback upload through OSS when the audio file is too large

Configuration is defined in [`app/core/config.py`](app/core/config.py).

## Input requirements

The input file must be an `xlsx`.

At least one column of audio URLs is required. The script resolves the URL column in this order:

- The `--url-column` value passed explicitly
- `URL`
- `url`
- `cos访问地址`
- `cos_url`
- `audio_url`
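
A minimal sketch of that lookup order, assuming the first candidate present in the sheet wins. `resolve_url_column` and `URL_COLUMN_FALLBACKS` are hypothetical names for illustration, and raising `ValueError` when nothing matches is an assumption, not confirmed behavior of the script.

```python
from typing import Iterable, Optional

# Fallback column names, in the order listed above.
URL_COLUMN_FALLBACKS = ("URL", "url", "cos访问地址", "cos_url", "audio_url")


def resolve_url_column(columns: Iterable[str], requested: Optional[str] = None) -> str:
    """Return the first candidate column that exists in the sheet."""
    available = list(columns)
    candidates = ([requested] if requested else []) + list(URL_COLUMN_FALLBACKS)
    for name in candidates:
        if name in available:
            return name
    raise ValueError(f"No URL column found; tried {candidates}")


picked = resolve_url_column(["歌曲名", "cos_url"])
explicit = resolve_url_column(["URL", "audio_url"], requested="audio_url")
```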

If the URL of a row is empty:

- No analysis is started
- The row is skipped directly
- The row counts as already processed when resuming

Metadata is optional but recommended. The script recognizes these fields first:

- `歌曲ID` / `song_id` / `id`
- `tmeid` / `tmeID` / `TMEID`
- `歌曲名` / `歌曲名称` / `title`
- `表演者` / `歌手` / `artist`
- `歌曲时长` / `duration`

By default these columns are also passed to the model as metadata:

- `tmeID,歌曲名称,歌曲名,歌手,表演者,版本,词作者,曲作者`

This can be overridden with `--metadata-columns`.

## Quick start

Regular batch run:

```bash
python pipeline/batch_analyze_xlsx.py \
  --input 待分析.xlsx \
  --output outputs/标签交付结果.xlsx \
  --url-column URL \
  --provider qwen \
  --workers 3
```

With lyrics extraction:

```bash
python pipeline/batch_analyze_xlsx.py \
  --input 待分析.xlsx \
  --output outputs/标签交付结果.xlsx \
  --url-column URL \
  --provider qwen \
  --workers 3 \
  --extract-lyrics
```

Rerun from scratch without reusing the previous output or checkpoint:

```bash
python pipeline/batch_analyze_xlsx.py \
  --input 待分析.xlsx \
  --output outputs/标签交付结果.xlsx \
  --provider qwen \
  --no-resume
```

## Command-line arguments

| Argument | Description | Current behavior |
|------|------|-------------|
| `--input` | Input Excel path | Required |
| `--output` | Output Excel path | Required |
| `--checkpoint` | Checkpoint file path | Defaults to `<output>.checkpoint.json` |
| `--url-column` | URL column name | Defaults to `URL`; falls back automatically if it does not exist |
| `--provider` | Analysis provider | Accepts `qwen`/`doubao`; only `qwen` should be used for now |
| `--extract-lyrics` | Whether to extract lyrics | When enabled, the analysis path with lyrics is used |
| `--label-level` | Label level | `0` or `1` |
| `--metadata-columns` | Extra columns passed to the model | Comma-separated |
| `--workers` | Number of worker threads | Defaults to `3` |
| `--checkpoint-every` | Save after every N processed rows | Defaults to `10` |
| `--no-resume` | Disable resuming | Off by default |
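
For orientation, the table could be expressed with `argparse` along these lines. This is a sketch reconstructed from the table, not the script's actual parser: the help text, the default for `--provider`, the default for `--label-level`, and the helper `checkpoint_path` are assumptions.

```python
import argparse

DEFAULT_METADATA_COLUMNS = "tmeID,歌曲名称,歌曲名,歌手,表演者,版本,词作者,曲作者"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Batch audio tag analysis over an xlsx file")
    parser.add_argument("--input", required=True, help="input Excel path")
    parser.add_argument("--output", required=True, help="output Excel path")
    parser.add_argument("--checkpoint", default=None, help="checkpoint file path")
    parser.add_argument("--url-column", default="URL")
    # "doubao" is accepted here but rejected later by the analyzer factory.
    parser.add_argument("--provider", choices=["qwen", "doubao"], default="qwen")
    parser.add_argument("--extract-lyrics", action="store_true")
    parser.add_argument("--label-level", type=int, choices=[0, 1], default=0)
    parser.add_argument("--metadata-columns", default=DEFAULT_METADATA_COLUMNS)
    parser.add_argument("--workers", type=int, default=3)
    parser.add_argument("--checkpoint-every", type=int, default=10)
    parser.add_argument("--no-resume", action="store_true")
    return parser


def checkpoint_path(args: argparse.Namespace) -> str:
    """Fall back to <output>.checkpoint.json when --checkpoint is not given."""
    return args.checkpoint or f"{args.output}.checkpoint.json"


args = build_parser().parse_args(["--input", "in.xlsx", "--output", "outputs/out.xlsx"])
```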

## Output structure

The script writes a fixed delivery table, not a full write-back of "original input columns + analysis columns".

The current output columns are defined by `DEFAULT_OUTPUT_COLUMNS` in [`batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py):

- `tmeid`
- `歌曲ID`
- `歌曲名`
- `表演者`
- `歌曲时长`
- `表演者类型`
- `语种`
- `BPM速度`
- `情绪`
- `网络/抖音歌曲`
- `音乐风格`
- `配器`
- `场景`

Result field mapping rules:

- `表演者类型` <- `performer_type` or `vocal_texture`
- `语种` <- `language`
- `BPM速度` <- `bpm`
- `情绪` <- `emotion`
- `网络/抖音歌曲` <- `douyin_tags`
- `音乐风格` <- `music_style_tags`, otherwise falls back to `genre/sub_genre`
- `配器` <- `instrument_tags`
- `场景` <- `scene`

List-type fields are joined into a single string separated by `、`.
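
The mapping rules above can be sketched as a small function. The names `join_tags` and `to_delivery_fields` are hypothetical, and two details are assumptions rather than confirmed behavior: `performer_type` is taken to win over `vocal_texture`, and the `genre/sub_genre` fallback is taken to mean "join whichever of the two is non-empty".

```python
from typing import Any, Dict


def join_tags(value: Any) -> str:
    """Join list-like values with "、"; pass scalars through as text."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return "、".join(str(item) for item in value if str(item).strip())
    return str(value)


def to_delivery_fields(result: Dict[str, Any]) -> Dict[str, str]:
    """Map an analyzer result dict onto the analysis columns of the delivery table."""
    # Fall back to genre/sub_genre only when music_style_tags is missing or empty.
    style = result.get("music_style_tags") or [
        result.get(key) for key in ("genre", "sub_genre") if result.get(key)
    ]
    return {
        "表演者类型": join_tags(result.get("performer_type") or result.get("vocal_texture")),
        "语种": join_tags(result.get("language")),
        "BPM速度": join_tags(result.get("bpm")),
        "情绪": join_tags(result.get("emotion")),
        "网络/抖音歌曲": join_tags(result.get("douyin_tags")),
        "音乐风格": join_tags(style),
        "配器": join_tags(result.get("instrument_tags")),
        "场景": join_tags(result.get("scene")),
    }


fields = to_delivery_fields(
    {"vocal_texture": "女声", "emotion": ["伤感", "思念"], "genre": "流行", "bpm": 92}
)
```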
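
## Resuming

The current resume logic is more specific than older versions of this README described. The actual behavior is:

- If the output file already exists and its row count matches this run's input: the previous output is reused directly by row index
- If the output file already exists but the row count differs: the script tries to reuse old results by `歌曲ID` or `tmeid`
- If a checkpoint exists: its completion state is merged in, provided the output is aligned by index
- Rows with an empty URL are added straight to the completed set
- During processing, progress is flushed to disk every `--checkpoint-every` rows
- On `Ctrl+C`, the current progress is saved first, then the process is forced to exit so it does not hang on worker threads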
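
The first two rules can be sketched as below. This is an illustration over plain dictionaries under stated assumptions: `plan_reuse` is a hypothetical name, the ID columns are tried in the order `歌曲ID`, then `tmeid`, and the check for whether a previous row actually holds a finished result is omitted.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
ID_COLUMNS = ("歌曲ID", "tmeid")


def plan_reuse(input_rows: List[Row], previous_rows: List[Row]) -> Dict[int, Row]:
    """Return {input_index: previous_row} for rows whose old result can be reused."""
    # Same row count: trust the row order and reuse by index.
    if len(previous_rows) == len(input_rows):
        return dict(enumerate(previous_rows))

    # Different row count: index the old output by (column, id value).
    by_id: Dict[Tuple[str, str], Row] = {}
    for old in previous_rows:
        for column in ID_COLUMNS:
            value = str(old.get(column) or "").strip()
            if value:
                by_id.setdefault((column, value), old)

    reuse: Dict[int, Row] = {}
    for index, row in enumerate(input_rows):
        for column in ID_COLUMNS:
            value = str(row.get(column) or "").strip()
            if value and (column, value) in by_id:
                reuse[index] = by_id[(column, value)]
                break
    return reuse


new_input = [{"歌曲ID": "1"}, {"歌曲ID": "2"}, {"歌曲ID": "3"}]
old_output = [{"歌曲ID": "3", "语种": "粤语"}]
reuse_by_id = plan_reuse(new_input, old_output)
```

Rows that do not appear in the returned mapping would be analyzed again.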

Default checkpoint file name:

```text
<output>.checkpoint.json
```

## Prompts and analysis chain

The batch script does not read prompt files itself; it goes through the unified analysis entry point:

[`pipeline/batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py)
-> [`app/middleware/music_analyze/__init__.py`](app/middleware/music_analyze/__init__.py)
-> [`app/middleware/music_analyze/music_analyzer.py`](app/middleware/music_analyze/music_analyzer.py)
-> [`app/middleware/music_analyze/factory.py`](app/middleware/music_analyze/factory.py)
-> [`app/middleware/music_analyze/qwen_analyzer.py`](app/middleware/music_analyze/qwen_analyzer.py)
-> [`app/middleware/music_analyze/prompts.py`](app/middleware/music_analyze/prompts.py)

The prompt directory is currently fixed and contains:

- [`music_analyze_system_prompt.md`](app/prompts/step2_music_decode/music_analyze_system_prompt.md)
- [`music_analyze_system_prompt_part_a.md`](app/prompts/step2_music_decode/music_analyze_system_prompt_part_a.md)
- [`music_analyze_system_prompt_part_b.md`](app/prompts/step2_music_decode/music_analyze_system_prompt_part_b.md)
- [`music_analyze_user_prompt.md`](app/prompts/step2_music_decode/music_analyze_user_prompt.md)
- [`music_lyrics_only_prompt.md`](app/prompts/step2_music_decode/music_lyrics_only_prompt.md)
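
As a rough illustration of the last step in that chain, the user prompt template carries a `{{METADATA_SECTION}}` placeholder that is replaced with a bullet list built from the metadata. The sketch below mirrors the logic of `build_metadata_section` in `prompts.py`, but uses an inline template string instead of reading the file; `USER_TEMPLATE` and `render_user_prompt` are hypothetical names.

```python
from typing import Any, Dict, Optional

# Shortened stand-in for music_analyze_user_prompt.md (assumption: not the full file).
USER_TEMPLATE = "{{METADATA_SECTION}}\n\n# 任务目标\n请基于音频内容完成聚音标签识别。"


def build_metadata_section(metadata: Optional[Dict[str, Any]] = None) -> str:
    """Render metadata as a Markdown list, skipping private keys and empty values."""
    if not metadata:
        return ""
    lines = ["## 音乐元数据"]
    for key, value in metadata.items():
        if key.startswith("_"):
            continue
        if value and str(value).strip():
            lines.append(f"- {key}: {value}")
    lines.append("")
    return "\n".join(lines)


def render_user_prompt(template: str, metadata: Optional[Dict[str, Any]]) -> str:
    """Fill the metadata placeholder; the rest of the template is left untouched."""
    return template.replace("{{METADATA_SECTION}}", build_metadata_section(metadata))


prompt = render_user_prompt(USER_TEMPLATE, {"title": "稻香", "artist": "", "_row": 7})
```

Keys starting with `_` and empty values never reach the model, so internal bookkeeping fields can travel in the same dictionary.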

## Project structure

```text
music_analyze_v2/
├── app/
│   ├── core/
│   │   └── config.py
│   ├── middleware/
│   │   └── music_analyze/
│   │       ├── __init__.py
│   │       ├── base.py
│   │       ├── factory.py
│   │       ├── music_analyzer.py
│   │       ├── prompts.py
│   │       ├── qwen_analyzer.py
│   │       ├── doubao_analyzer.py
│   │       ├── audio_features.py
│   │       └── bpm_analyzer_tools.py
│   ├── prompts/
│   │   └── step2_music_decode/
│   └── utils/
├── pipeline/
│   └── batch_analyze_xlsx.py
├── outputs/
├── requirements.txt
├── .env
├── .env.example
└── README.md
```

## Dependencies

Base dependencies are listed in [`requirements.txt`](requirements.txt).

It currently includes, explicitly:

- `openai`
- `requests`
- `httpx`
- `python-dotenv`
- `pydantic-settings`
- `numpy`
- `scipy`
- `librosa`
- `soundfile`
- `pandas`
- `openpyxl`

`dashscope` is still commented out in `requirements.txt`. If you want to run the lyrics path that depends on that SDK, install it yourself and verify the corresponding code branch.
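
## FAQ

### Why does it still fail when I pass `--provider doubao`?

The CLI still keeps the `doubao` option, but the analyzer factory only supports `qwen`. This reflects the current state of the code, not a usage mistake.

### Why doesn't the output keep all columns of the original Excel file?

The script only writes `DEFAULT_OUTPUT_COLUMNS` when saving. This is fixed behavior in the code.

### Where do I change the prompts?

Edit the template files under [`app/prompts/step2_music_decode`](app/prompts/step2_music_decode).

### Can I still resume if the row count changed?

Partially. The script tries to match previous output by `歌曲ID` or `tmeid`.

### How do I rerun everything from scratch?

Pass `--no-resume` and delete the old output and old checkpoint; that is the cleanest way.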
1 """Standalone audio analysis package."""
1 from .config import settings
2
3 __all__ = ["settings"]
1 """Minimal settings for standalone audio analysis pipeline."""
2
3 from pydantic_settings import BaseSettings, SettingsConfigDict
4
5
6 class Settings(BaseSettings):
7 model_config = SettingsConfigDict(
8 env_file=".env",
9 env_file_encoding="utf-8",
10 extra="ignore",
11 )
12
13 # Qwen
14 QWEN_API_KEY: str | None = None
15 QWEN_DASHSCOPE_API_KEY: str | None = None
16 QWEN_BASE_URL: str | None = "https://dashscope.aliyuncs.com/compatible-mode/v1"
17 QWEN_MODEL: str | None = "qwen3-omni-flash"
18 QWEN_TIMEOUT: float = 15.0
19 QWEN_LYRICS_TIMEOUT: float = 90.0
20 QWEN_MAX_RETRIES: int = 3
21 MUSIC_ANALYZE_LIGHT_MODE: bool = True
22 MUSIC_DOWNLOAD_DIR: str = "music"
23 MUSIC_MAPPING_FILE: str = "music/music_file_mapping.csv"
24
25 # Optional features
26 SONGFORMER_URL: str | None = None
27
28 # DashScope ASR
29 DASHSCOPE_FUNASR_MODEL: str = "fun-asr"
30 DASHSCOPE_BASE_HTTP_API_URL: str = "https://dashscope.aliyuncs.com/api/v1"
31 DASHSCOPE_ASR_POLL_INTERVAL: float = 1.0
32 DASHSCOPE_ASR_POLL_TIMEOUT: float = 120.0
33 DASHSCOPE_ASR_SUBMIT_URL: str = (
34 "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription"
35 )
36 DASHSCOPE_ASR_MODEL: str = "qwen3-asr-flash-filetrans"
37 DASHSCOPE_TASK_STATUS_BASE_URL: str = "https://dashscope.aliyuncs.com/api/v1/tasks"
38
39 # OSS
40 OSS_ACCESS_KEY_ID: str | None = None
41 OSS_ACCESS_KEY_SECRET: str | None = None
42 OSS_ENDPOINT: str | None = None
43 OSS_BUCKET_NAME: str | None = None
44 OSS_ENDPOINT_INTERNAL: str | None = None
45
46
47 settings = Settings()
1 """
2 自定义异常定义
3
4 所有业务异常都应该继承自 APIException,
5 由全局异常处理器统一处理并返回标准格式的错误响应
6 """
7 from fastapi import HTTPException, status
8 from typing import Optional, Any
9
10
11 class APIException(HTTPException):
12 """
13 API基础异常
14
15 所有业务异常的基类,可以被全局异常处理器捕获和统一处理
16 """
17
18 def __init__(
19 self,
20 status_code: int = status.HTTP_400_BAD_REQUEST,
21 detail: str = None,
22 error_code: str = None,
23 data: Any = None,
24 headers: dict = None,
25 ):
26 super().__init__(status_code=status_code, detail=detail, headers=headers)
27 self.error_code = error_code or "UNKNOWN_ERROR"
28 self.data = data
29
30
31 class UnauthorizedException(APIException):
32 """未授权异常 - 认证失败"""
33
34 def __init__(self, detail: str = "未授权", error_code: str = "UNAUTHORIZED"):
35 super().__init__(
36 status_code=status.HTTP_401_UNAUTHORIZED,
37 detail=detail,
38 error_code=error_code
39 )
40
41
42 class ForbiddenException(APIException):
43 """禁止访问异常 - 权限不足"""
44
45 def __init__(self, detail: str = "禁止访问", error_code: str = "FORBIDDEN"):
46 super().__init__(
47 status_code=status.HTTP_403_FORBIDDEN,
48 detail=detail,
49 error_code=error_code
50 )
51
52
53 class NotFoundException(APIException):
54 """资源不存在异常"""
55
56 def __init__(self, detail: str = "资源不存在", error_code: str = "NOT_FOUND"):
57 super().__init__(
58 status_code=status.HTTP_404_NOT_FOUND,
59 detail=detail,
60 error_code=error_code
61 )
62
63
64 class ConflictException(APIException):
65 """冲突异常 - 资源已存在"""
66
67 def __init__(self, detail: str = "资源已存在", error_code: str = "CONFLICT"):
68 super().__init__(
69 status_code=status.HTTP_409_CONFLICT,
70 detail=detail,
71 error_code=error_code
72 )
73
74
75 class ValidationException(APIException):
76 """验证异常 - 输入验证失败"""
77
78 def __init__(self, detail: str = "验证失败", error_code: str = "VALIDATION_ERROR"):
79 super().__init__(
80 status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
81 detail=detail,
82 error_code=error_code
83 )
84
85
86 class BusinessException(APIException):
87 """业务异常 - 业务规则验证失败"""
88
89 def __init__(
90 self,
91 detail: str = "业务操作失败",
92 error_code: str = "BUSINESS_ERROR",
93 status_code: int = status.HTTP_500_INTERNAL_SERVER_ERROR,
94 ):
95 super().__init__(
96 status_code=status_code,
97 detail=detail,
98 error_code=error_code
99 )
100
101
102 class InternalServerException(APIException):
103 """内部服务器异常"""
104
105 def __init__(
106 self,
107 detail: str = "内部服务器错误",
108 error_code: str = "INTERNAL_SERVER_ERROR",
109 ):
110 super().__init__(
111 status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
112 detail=detail,
113 error_code=error_code
114 )
115
116
117 class DatabaseException(APIException):
118 """数据库异常"""
119
120 def __init__(
121 self,
122 detail: str = "数据库操作失败",
123 error_code: str = "DATABASE_ERROR",
124 ):
125 super().__init__(
126 status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
127 detail=detail,
128 error_code=error_code
129 )
130
131
132 class ExternalServiceException(APIException):
133 """外部服务异常 - 调用第三方服务失败"""
134
135 def __init__(
136 self,
137 detail: str = "外部服务调用失败",
138 error_code: str = "EXTERNAL_SERVICE_ERROR",
139 ):
140 super().__init__(
141 status_code=status.HTTP_502_BAD_GATEWAY,
142 detail=detail,
143 error_code=error_code
144 )
145
146
147 class RateLimitException(APIException):
148 """限流异常 - 请求过于频繁"""
149
150 def __init__(
151 self,
152 detail: str = "请求过于频繁,请稍后再试",
153 error_code: str = "RATE_LIMIT_EXCEEDED",
154 ):
155 super().__init__(
156 status_code=status.HTTP_429_TOO_MANY_REQUESTS,
157 detail=detail,
158 error_code=error_code
159 )
1 """Middleware package."""
"""
Music analysis module.

Provides a unified music tag analysis entry point. The package ships analyzer
classes for Tongyi Qianwen (Qwen) and Volcano Engine Doubao, but
AnalyzerFactory currently only instantiates "qwen".

Main features:
- Music style recognition (aligned with international genre taxonomies)
- Emotion recognition
- Vocal texture recognition
- Language recognition
- Rhythm intensity analysis (1-5, used to guide video editing)
- Climax detection
- Visual concept generation (for MV creation)
- Lyrics recognition (optional)

Providers:
- qwen: Tongyi Qianwen (qwen3-omni-flash)
- doubao: Volcano Engine Doubao (doubao-seed-1-8-251228); not wired into
  AnalyzerFactory at the moment

Usage:
    from app.middleware.music_analyze import analyze_music

    # Basic analysis
    result = analyze_music(
        metadata={"title": "稻香", "artist": "周杰伦"},
        music_url="https://example.com/music.mp3",
        provider="qwen",
    )

    # With lyrics recognition
    result = analyze_music(
        metadata={"title": "稻香"},
        music_url="https://example.com/music.mp3",
        provider="qwen",
        extract_lyrics=True,
    )
"""

# Function exports
from .music_analyzer import (
    analyze_music,
    analyze_music_lyrics_only,
    analyze_music_with_qwen,
    analyze_music_with_doubao,
    get_available_providers,
)

# Class exports
from .base import AudioAnalyzer
from .qwen_analyzer import QwenAnalyzer
from .doubao_analyzer import DoubaoAnalyzer
from .factory import AnalyzerFactory

__version__ = "1.0.0"
# -*- coding: utf-8 -*-
"""
Music analyzer factory.
"""

import os
from typing import Dict, List

from .base import AudioAnalyzer
from .qwen_analyzer import QwenAnalyzer


class AnalyzerFactory:
    """Music analyzer factory."""

    # Cached analyzer instances, keyed by "<provider>_<model>".
    _analyzers: Dict[str, AudioAnalyzer] = {}

    @classmethod
    def get_analyzer(cls, provider: str = "qwen", **kwargs) -> AudioAnalyzer:
        """
        Get an analyzer instance.

        Args:
            provider: provider name (only "qwen" is supported)
            **kwargs: extra configuration (e.g. api_key, model, timeout)

        Returns:
            An AudioAnalyzer instance
        """
        # Only provider and model take part in the cache key, so a cached
        # instance is returned even when other kwargs (api_key, timeout) differ.
        cache_key = f"{provider}_{kwargs.get('model', '')}"

        if cache_key in cls._analyzers:
            return cls._analyzers[cache_key]

        if provider == "qwen":
            analyzer = QwenAnalyzer(**kwargs)
        else:
            raise ValueError(f"Unknown provider: {provider}. Only 'qwen' is supported.")

        cls._analyzers[cache_key] = analyzer
        return analyzer

    @classmethod
    def get_default_analyzer(cls) -> AudioAnalyzer:
        """Get the default analyzer (provider read from DEFAULT_MUSIC_ANALYZER)."""
        provider = os.getenv("DEFAULT_MUSIC_ANALYZER", "qwen")
        return cls.get_analyzer(provider=provider)

    @classmethod
    def list_providers(cls) -> List[str]:
        """List the available providers."""
        return ["qwen"]

    @classmethod
    def clear_cache(cls) -> None:
        """Clear the cached analyzer instances."""
        cls._analyzers.clear()
# -*- coding: utf-8 -*-
"""
Unified entry point for music analysis.
Provides the simplified analyze_music() function.
"""

import os
from typing import Any, Dict, List, Optional

from .factory import AnalyzerFactory


def analyze_music(
    metadata: Dict[str, Any],
    music_url: str,
    provider: Optional[str] = None,
    extract_lyrics: bool = False,
    label_level: int = 0,
) -> Optional[Dict[str, Any]]:
    """
    Unified entry point for music analysis.

    Args:
        metadata: music metadata dict (e.g. title, artist)
        music_url: URL of the music file
        provider: provider name; defaults to the DEFAULT_MUSIC_ANALYZER
            environment variable, then "qwen". AnalyzerFactory currently
            only accepts "qwen"; any other value raises ValueError.
        extract_lyrics: whether to recognize lyrics
        label_level: label level (0: first-level labels, 1: first- and second-level labels)

    Returns:
        Analysis result dict with the following fields:
        - genre: music style (first-level style, e.g. 流行, 摇滚)
        - emotion: list of emotions
        - emotional_intensity: emotional intensity
        - vocal_texture: vocal texture
        - vocal_description: description of the vocal texture
        - visual_concept: visual concept
        - language: language
        - bpm: beats per minute (optional)
        - lyrics: list of lyric lines (optional)
        - _model: name of the model used
        - _token_info: token usage information

    Example:
        >>> result = analyze_music(
        ...     metadata={"title": "稻香", "artist": "周杰伦"},
        ...     music_url="https://example.com/music.mp3",
        ...     provider="qwen",
        ...     extract_lyrics=False,
        ... )
        >>> print(result["genre"])
        流行
    """
    if provider is None:
        provider = os.getenv("DEFAULT_MUSIC_ANALYZER", "qwen")

    analyzer = AnalyzerFactory.get_analyzer(provider=provider)

    return analyzer.analyze(
        metadata=metadata,
        music_url=music_url,
        extract_lyrics=extract_lyrics,
        label_level=label_level,
    )


def analyze_music_with_qwen(
    metadata: Dict[str, Any],
    music_url: str,
    extract_lyrics: bool = False,
    label_level: int = 0,
) -> Optional[Dict[str, Any]]:
    """Analyze music with Tongyi Qianwen (Qwen)."""
    return analyze_music(
        metadata=metadata,
        music_url=music_url,
        provider="qwen",
        extract_lyrics=extract_lyrics,
        label_level=label_level,
    )


def analyze_music_with_doubao(
    metadata: Dict[str, Any],
    music_url: str,
    extract_lyrics: bool = False,
    label_level: int = 0,
) -> Optional[Dict[str, Any]]:
    """
    Analyze music with Volcano Engine Doubao.

    AnalyzerFactory currently rejects "doubao", so this call raises ValueError.
    """
    return analyze_music(
        metadata=metadata,
        music_url=music_url,
        provider="doubao",
        extract_lyrics=extract_lyrics,
        label_level=label_level,
    )


def analyze_music_lyrics_only(
    metadata: Dict[str, Any],
    music_url: str,
    provider: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Recognize lyrics only, without repeating the base tag analysis."""
    if provider is None:
        provider = os.getenv("DEFAULT_MUSIC_ANALYZER", "qwen")

    analyzer = AnalyzerFactory.get_analyzer(provider=provider)
    if hasattr(analyzer, "analyze_lyrics_only"):
        return analyzer.analyze_lyrics_only(metadata=metadata, music_url=music_url)

    # Fallback for providers that do not implement analyze_lyrics_only:
    # run the full analysis and keep only the lyrics-related fields.
    result = analyzer.analyze(
        metadata=metadata,
        music_url=music_url,
        extract_lyrics=True,
        label_level=0,
    )
    if isinstance(result, dict):
        lyrics = result.get("lyrics", [])
        return {
            "lyrics": lyrics if isinstance(lyrics, list) else [],
            "_model": result.get("_model"),
            "_token_info": result.get("_token_info"),
        }
    return None


def get_available_providers() -> List[str]:
    """Return the list of available providers."""
    return AnalyzerFactory.list_providers()
1 # -*- coding: utf-8 -*-
2 """
3 音乐分析提示词模板构建器
4 支持从外部模板文件读取提示词,便于动态修改
5 """
6
7 import os
8 from pathlib import Path
9 from typing import Dict, Any, Optional
10
11
12 # 模板文件路径(已迁移到 app/prompts/step2_music_decode)
13 PROMPTS_DIR = Path(__file__).parent.parent.parent / "prompts" / "step2_music_decode"
14 SYSTEM_PROMPT_FILE = PROMPTS_DIR / "music_analyze_system_prompt.md"
15 SYSTEM_PROMPT_PART_A_FILE = PROMPTS_DIR / "music_analyze_system_prompt_part_a.md"
16 SYSTEM_PROMPT_PART_B_FILE = PROMPTS_DIR / "music_analyze_system_prompt_part_b.md"
17 USER_PROMPT_FILE = PROMPTS_DIR / "music_analyze_user_prompt.md"
18 LYRICS_ONLY_PROMPT_FILE = PROMPTS_DIR / "music_lyrics_only_prompt.md"
19
20
21 def load_template(template_path: Path) -> str:
22 """
23 从文件加载模板
24
25 Args:
26 template_path: 模板文件路径
27
28 Returns:
29 模板内容字符串
30 """
31 if not template_path.exists():
32 raise FileNotFoundError(f"模板文件不存在: {template_path}")
33
34 with open(template_path, "r", encoding="utf-8") as f:
35 content = f.read()
36
37 # 只移除文件顶部的 Markdown 注释(以 # 开头的注释行)
38 # 保留 ## 标题行和正文内容
39 lines = content.split("\n")
40 filtered_lines = []
41 in_header = True
42
43 for line in lines:
44 stripped = line.strip()
45 # 如果是空行,保留
46 if not stripped:
47 filtered_lines.append(line)
48 continue
49
50 # 如果在文件头部且是单行注释(# 但不是 ##),则跳过
51 if in_header and stripped.startswith("#") and not stripped.startswith("##"):
52 continue
53
54 # 遇到 ## 标题或正文内容,不再是头部
55 in_header = False
56 filtered_lines.append(line)
57
58 return "\n".join(filtered_lines)
59
60
61 class PromptBuilder:
62 """音乐分析提示词模板构建器"""
63
64 def __init__(self, label_level: int = 0):
65 """
66 初始化提示词构建器
67
68 Args:
69 label_level: 标签级别(0: 一级标签, 1: 一级+二级标签)
70 """
71 self.label_level = label_level
72
73 def build_system_prompt(self) -> str:
74 """构建系统提示词 - 直接加载静态模板"""
75 return load_template(SYSTEM_PROMPT_FILE)
76
77 def build_system_prompt_part_a(self) -> str:
78 """构建系统提示词A组"""
79 return load_template(SYSTEM_PROMPT_PART_A_FILE)
80
81 def build_system_prompt_part_b(self) -> str:
82 """构建系统提示词B组"""
83 return load_template(SYSTEM_PROMPT_PART_B_FILE)
84
85 def build_metadata_section(self, metadata: Optional[Dict[str, Any]] = None) -> str:
86 """构建元数据部分"""
87 if not metadata:
88 return ""
89
90 sections = ["## 音乐元数据"]
91 for key, value in metadata.items():
92 if key.startswith("_"):
93 continue
94 if value and str(value).strip():
95 sections.append(f"- {key}: {value}")
96 sections.append("")
97 return "\n".join(sections)
98
99 def build_output_format(
100 self,
101 include_lyrics: bool = False,
102 include_bpm: bool = True,
103 ) -> str:
104 """构建输出格式说明"""
105 if include_lyrics and include_bpm:
106 format_spec = """{
107 "genre": "",
108 "sub_genre": "",
109 "language": "",
110 "vocal_type": "",
111 "vocal_description": "",
112 "emotion": [""],
113 "scene": [""],
114 "age": "",
115 "rhythm_intensity": "",
116 "is_sinking": false,
117 "song_description": "",
118 "visual_concept": "",
119 "emotional_intensity": "",
120 "bpm": 0,
121 "lyrics": [{"time": "", "text": ""}]
122 }"""
123 elif include_bpm:
124 format_spec = """{
125 "genre": "",
126 "sub_genre": "",
127 "language": "",
128 "vocal_type": "",
129 "vocal_description": "",
130 "emotion": [""],
131 "scene": [""],
132 "age": "",
133 "rhythm_intensity": "",
134 "is_sinking": false,
135 "song_description": "",
136 "visual_concept": "",
137 "emotional_intensity": "",
138 "bpm": 0
139 }"""
140 elif include_lyrics:
141 format_spec = """{
142 "genre": "",
143 "sub_genre": "",
144 "language": "",
145 "vocal_type": "",
146 "vocal_description": "",
147 "emotion": [""],
148 "scene": [""],
149 "age": "",
150 "rhythm_intensity": "",
151 "is_sinking": false,
152 "song_description": "",
153 "visual_concept": "",
154 "emotional_intensity": "",
155 "lyrics": [{"time": "", "text": ""}]
156 }"""
157 else:
158 format_spec = """{
159 "genre": "",
160 "sub_genre": "",
161 "language": "",
162 "vocal_type": "",
163 "vocal_description": "",
164 "emotion": [""],
165 "scene": [""],
166 "age": "",
167 "rhythm_intensity": "",
168 "is_sinking": false,
169 "song_description": "",
170 "visual_concept": "",
171 "emotional_intensity": ""
172 }"""
173
174 return format_spec
175
176 def build_user_prompt(
177 self,
178 metadata: Optional[Dict[str, Any]] = None,
179 include_lyrics: bool = False,
180 include_bpm: bool = True,
181 ) -> str:
182 """
183 构建完整的用户提示词
184 使用模板文件并替换占位符
185
186 Args:
187 metadata: 音乐元数据字典(可选)
188 include_lyrics: 是否识别歌词(保留参数以兼容现有调用)
189 include_bpm: 是否包含BPM识别(保留参数以兼容现有调用)
190
191 Returns:
192 完整的用户提示词
193 """
194 # 加载模板
195 template = load_template(USER_PROMPT_FILE)
196
197 # 准备替换字典 - 只替换元数据部分
198 # 输出格式已在系统提示词中定义,不需要在用户提示词中重复
199 replacements = {
200 "{{METADATA_SECTION}}": self.build_metadata_section(metadata),
201 }
202
203 # 替换占位符
204 result = template
205 for placeholder, value in replacements.items():
206 result = result.replace(placeholder, value)
207
208 return result
209
210 def build_lyrics_only_prompt(self) -> str:
211 """构建仅识别歌词的提示词"""
212 return load_template(LYRICS_ONLY_PROMPT_FILE)
213
214
215 def build_analyze_prompt(
216 metadata: Optional[Dict[str, Any]] = None,
217 include_lyrics: bool = False,
218 label_level: int = 0,
219 ) -> tuple[str, str]:
220 """
221 构建完整的分析提示词
222
223 Args:
224 metadata: 音乐元数据字典(可选)
225 include_lyrics: 是否识别歌词
226 label_level: 标签级别(0: 一级标签, 1: 一级+二级标签)
227
228 Returns:
229 (system_prompt, user_prompt) 元组
230 """
231 builder = PromptBuilder(label_level=label_level)
232 system_prompt = builder.build_system_prompt()
233 user_prompt = builder.build_user_prompt(
234 metadata=metadata,
235 include_lyrics=include_lyrics,
236 include_bpm=True,
237 )
238 return system_prompt, user_prompt
239
240
241 def build_analyze_prompt_part_a(
242 metadata: Optional[Dict[str, Any]] = None,
243 include_lyrics: bool = False,
244 label_level: int = 0,
245 ) -> tuple[str, str]:
246 """
247 构建A组分析提示词(标签与基础信息)
248 """
249 builder = PromptBuilder(label_level=label_level)
250 system_prompt = builder.build_system_prompt_part_a()
251 user_prompt = builder.build_user_prompt(
252 metadata=metadata,
253 include_lyrics=include_lyrics,
254 include_bpm=True,
255 )
256 return system_prompt, user_prompt
257
258
259 def build_analyze_prompt_part_b(
260 metadata: Optional[Dict[str, Any]] = None,
261 include_lyrics: bool = False,
262 label_level: int = 0,
263 ) -> tuple[str, str]:
264 """
265 构建B组分析提示词(节奏与视觉描述)
266 """
267 builder = PromptBuilder(label_level=label_level)
268 system_prompt = builder.build_system_prompt_part_b()
269 user_prompt = builder.build_user_prompt(
270 metadata=metadata,
271 include_lyrics=include_lyrics,
272 include_bpm=True,
273 )
274 return system_prompt, user_prompt
275
276
277 def build_lyrics_prompt() -> str:
278 """构建仅识别歌词的提示词"""
279 builder = PromptBuilder()
280 return builder.build_lyrics_only_prompt()
281
282
283 # 向后兼容:保留原有的构建函数
284 def build_user_prompt(
285 metadata: Optional[Dict[str, Any]] = None,
286 include_lyrics: bool = False,
287 label_level: int = 0,
288 ) -> str:
289 """构建用户提示词(兼容函数)"""
290 builder = PromptBuilder(label_level=label_level)
291 return builder.build_user_prompt(
292 metadata=metadata,
293 include_lyrics=include_lyrics,
294 include_bpm=True,
295 )
1 # 聚音标签识别助手 - 系统角色定义
2
3 ## 角色定位
4
5 你是音乐内容标签标注助手。
6 你的任务是基于输入的歌曲信息(如歌词、标题、风格描述、音频特征等),严格按照「聚音标签字典」输出标准化标签字段。
7
8 只输出标签结果,不做解释,不做分析,不添加任何多余文本。
9
10 ------
11
12 ## 输出格式
13
14 仅输出 JSON 纯文本,结构如下:
15
16 {
17 "performer_type": "",
18 "language": "",
19 "emotion": [],
20 "douyin_tags": [],
21 "music_style_tags": [],
22 "instrument_tags": [],
23 "scene": []
24 }
25
26 禁止输出任何解释性文字、注释或额外字段。
27
28 ------
29
30 ## 全局约束规则
31
32 1. 所有标签必须严格从下方字典中选择,禁止自造词。
33 2. 不允许基于刻板印象猜测(如仅凭曲风推断情绪)。
34 3. 标签必须基于明确特征:
35 - 歌词内容
36 - 音乐风格特征
37 - 明确出现的配器
38 - 明确使用场景
39 4. 多选字段仅选择高度确定且核心表达的标签,避免过度打标。
40 5. 注意!所有字段至少选择一个标签,不允许留空。
41
42 ------
43
44 # 字段判定标准说明
45
46 ## 一、演唱者类型 performer_type(单选)
47
48 用于标注主要人声类型,仅根据实际听感或明确描述判断:
49
50 - 男声:主要为男性声线
51 - 女声:主要为女性声线
52 - 童声:明显儿童声线
53 - 合唱:多人群体演唱为主(非简单和声)
54
55 不确定时输出 ""。
56
57 ------
58
59 ## 二、情绪 emotion(多选)
60
61 必须基于歌曲整体情绪表达判断,而非个别词语。
62
63 - 喜庆:节日、庆典氛围明显
64 - 浪漫:爱情氛围浓厚
65 - 雄壮:宏大、史诗、气势恢宏
66 - 蛊惑:迷幻、魅惑、暧昧
67 - 宣泄:情绪爆发、释放
68 - 悲壮:悲情但具有力量感
69 - 愤怒:强烈对抗或激烈表达
70 - 庄重:正式、肃穆
71 - 激情:热烈高昂
72 - 沉重:压抑、厚重
73 - 快乐:轻松开心
74 - 励志:奋斗、成长、自我激励
75 - 思念:想念某人或过往
76 - 紧张:悬而未决、焦虑
77 - 恐怖:惊悚氛围
78 - 感动:温情催泪
79 - 恶搞:刻意夸张调侃
80 - 搞笑:明显幽默表达
81 - 期待:盼望未来
82 - 怀念:回忆过去
83 - 甜蜜:恋爱甜感
84 - 孤独:孤单、自我独白
85 - 伤感:悲伤低落
86 - 悬疑:神秘未知感
87 - 祝福:祝愿表达
88 - 佛系:平淡随性
89 - 舒缓:节奏慢、平稳
90 - 悠扬:旋律流畅优美
91 - 温暖:柔和治愈
92 - 忧郁:带有阴郁气质
93
94 避免:
95
96 - 同时选择强烈对立情绪(如 快乐 与 伤感)
97 - 同类标签堆叠(如 伤感 + 忧郁 + 孤独 需明确区分)
98
99 ------
100
101 ## 三、语种 language(单选)
102
103 仅从下列标签中选择一个最主要的演唱语种:
104
105 - 普通话
106 - 粤语
107 - 藏语
108 - 英语
109 - 韩语
110 - 闽南语
111 - 蒙语
112 - 俄语
113 - 其他
114
115 规则:
116
117 - 只输出一个语种标签
118 - 依据实际演唱语言判断,不根据歌手国籍或曲风猜测
119 - 纯音乐或无法判断时输出 ""
120
121 ------
122
123 ## 四、网络/抖音歌曲 douyin_tags(可多选)
124
125 仅当歌曲具备明显网络传播特征或主题风格时选择:
126
127 - 草原:草原文化、民族草原元素
128 - 故乡:思乡主题
129 - 神曲:洗脑旋律、强节奏重复
130 - 文艺:小众表达、诗性表达
131 - 青春:校园或成长主题
132 - 治愈系:温暖疗愈风格
133 - 清新:轻快自然风格
134 - 奇幻:幻想、魔幻元素
135
136 非明显网络属性不要强行标注。
137
138 ------
139
140 ## 五、音乐风格 music_style_tags(多选)
141
142 必须根据音乐结构与风格特征判断,不根据歌词主题判断。
143
144 - 世界音乐
145 - 雷鬼
146 - R&B/Soul
147 - MC喊麦
148 - 另类音乐
149 - 民歌
150 - 戏曲
151 - 古风
152 - 古典音乐
153 - HipHop
154 - Rap
155 - 摇滚
156 - DJ嗨曲
157 - 布鲁斯/蓝调
158 - 拉丁
159 - 舞曲
160 - 爵士
161 - 乡村
162 - 民谣
163 - 流行
164 - 轻音乐
165 - 国风
166 - 儿歌
167
168 规则:
169
170 - 只选核心风格,不叠加相似风格
171 - 不因使用某个乐器就推断整体风格
172 - 无明显风格时可只选“流行”
173
174 ------
175
176 ## 六、配器 instrument_tags(多选)
177
178 仅在明确可识别时选择:
179
180 - 二胡
181 - 竹笛
182 - 琵琶
183 - 音效
184 - 口琴
185 - 电子
186 - 木吉他
187 - 鼓组
188 - 弦乐
189 - 电吉他
190 - 古筝
191 - 钢琴
192
193 规则:
194
195 - 必须为明显主导或突出配器
196 - 不因常规伴奏默认存在而标注
197 - 不确定不要猜
198
199 ------
200
201 ## 七、场景 scene(多选)
202
203 根据歌曲使用场景或明显氛围判断:
204
205 - 餐厅
206 - 汽车
207 - 跳舞
208 - 旅行
209 - 工作
210 - 校园
211 - 夜店
212 - 运动
213 - 休闲
214 - live house
215 - 广场舞
216 - 抖音
217 - 婚礼
218 - 约会
219
220 规则:
221
222 - 仅当歌曲明显适配该场景时标注
223 - 避免泛化场景(如所有慢歌都标“休闲”)
224
225 ------
226
227 ## 最终执行要求
228
229 - 只输出 JSON
230 - 不解释
231 - 不补充说明
232 - 不输出字典内容
233 - 不输出“分析如下”之类文字
234 - 不添加未定义字段
235
236 严格遵守字段范围与空值规则。
237
238 ## 输出格式
239 必须严格输出以下 JSON 结构,字段名不能改:
240
241 ```json
242 {
243 "performer_type": "",
244 "language": "",
245 "emotion": [],
246 "douyin_tags": [],
247 "music_style_tags": [],
248 "instrument_tags": [],
249 "scene": []
250 }
251 ```
1 # 待分析元数据
2 {{METADATA_SECTION}}
3
4 # 任务目标
5 请基于音频内容完成聚音标签识别,仅输出系统要求的标签字段。
6
7 # 约束提醒
8 - 必须基于实际听到的特征,无法确认的标签输出空值。
9 - 严格执行 JSON 纯文本输出,禁止任何 Markdown 格式。
1 # 歌词识别提示词模板
2 # 仅识别歌词内容,不包含其他音乐分析
3 请识别并转录音频中的完整歌词。
4
5 ## 核心任务
6 1. **逐句识别**:按时间顺序输出每一句歌词,每句通过换行进行分隔。
7 2. **字段要求**:每条记录必须包含 `time` (格式 "mm:ss.xxx",无法确定则为 null) 和 `text` (歌词内容)。
8 3. **无语义音节压缩**:对于“啊/呜/哦/嗯/啦”等辅助音节,禁止逐字展示,统一使用 `...` 缩略(例:把“啊啊啊啊”识别为“啊...”)。
9 4. **完整性**:必须转录包括重复段落在内的全曲内容。
10 5. **静默与纯音乐**:若为纯音乐或无歌词,仅返回空数组 `[]`
11 6. 完整识别歌曲所有段落的完整歌词,包括不同段落之间重复了的歌词
12
13 ## 输出格式规范
14 - 严格输出 JSON,不得包含任何 Markdown 转义符(如 ```json)或解释性文字。
15 - 字段统一为: {"lyrics": [{"time": "00:00.000", "text": "内容"}]}
16
17 ## 质量控制
18 - 遇到合唱/重叠时,以主旋律为主。
19 - 严禁自行脑补不存在的歌词。
20 - 不要返回任何其他无关内容
"""
Alibaba Cloud OSS file upload module
"""
import logging
import os
import traceback
import uuid
from datetime import datetime, timedelta

import oss2

from app.core.config import settings

logger = logging.getLogger(__name__)


def _object_url(bucket_name, endpoint, object_name):
    """Build a virtual-hosted-style URL: https://<bucket>.<endpoint>/<object>."""
    host = endpoint.split("://", 1)[-1].rstrip("/")
    return f"https://{bucket_name}.{host}/{object_name}"


class OSSUploader:
    """Alibaba Cloud OSS uploader"""

    def __init__(self):
        """Initialize the OSS client"""
        self.access_key_id = settings.OSS_ACCESS_KEY_ID
        self.access_key_secret = settings.OSS_ACCESS_KEY_SECRET
        self.endpoint = settings.OSS_ENDPOINT
        self.bucket_name = settings.OSS_BUCKET_NAME

        if not all([
            self.access_key_id,
            self.access_key_secret,
            self.endpoint,
            self.bucket_name,
        ]):
            raise ValueError(
                "Incomplete OSS configuration; check OSS_ACCESS_KEY_ID/"
                "OSS_ACCESS_KEY_SECRET/OSS_ENDPOINT/OSS_BUCKET_NAME in .env"
            )

        logger.info(
            "OSS config: endpoint=%s, bucket=%s",
            self.endpoint,
            self.bucket_name,
        )
        # Create the auth object
        self.auth = oss2.Auth(self.access_key_id, self.access_key_secret)

        # Use the public endpoint by default; the internal endpoint tends to
        # fail when accessed from outside the Alibaba Cloud intranet.
        self.bucket = oss2.Bucket(self.auth, self.endpoint, self.bucket_name)

    def upload_file(self, local_file_path, oss_object_name=None):
        """
        Upload a file to OSS

        Args:
            local_file_path: local file path
            oss_object_name: OSS object name; defaults to a random UUID plus
                the original file extension

        Returns:
            tuple: (True, url) on success, (False, error message) on failure
        """
        try:
            if not os.path.exists(local_file_path):
                logger.error(f"Local file does not exist: {local_file_path}")
                return False, "Local file does not exist"

            # Generate an object name when none was given
            if not oss_object_name:
                _, ext = os.path.splitext(local_file_path)
                oss_object_name = f"{uuid.uuid4()}{ext}"

            # Store under a per-day prefix so the cleanup task can expire it
            date = datetime.now().strftime("%Y%m%d")
            oss_object_name = f"temp_ai/{date}/{oss_object_name}"

            # Upload the file
            self.bucket.put_object_from_file(oss_object_name, local_file_path)

            # Build the file URL
            file_url = _object_url(self.bucket_name, self.endpoint, oss_object_name)

            logger.info(f"File uploaded: {local_file_path} -> {file_url}")
            return True, file_url

        except Exception as e:
            logger.error(f"File upload failed: {local_file_path}, error: {e}")
            return False, str(e)

    def upload_data(self, data, oss_object_name):
        """
        Upload data to OSS

        Args:
            data: data to upload (str or bytes)
            oss_object_name: OSS object name

        Returns:
            dict: the upload result
        """
        try:
            # Upload the data
            result = self.bucket.put_object(oss_object_name, data)

            # Build the file URL
            file_url = _object_url(self.bucket_name, self.endpoint, oss_object_name)

            return {
                "success": True,
                "oss_object_name": oss_object_name,
                "file_url": file_url,
                "etag": result.etag,
                "size": len(data) if isinstance(data, (str, bytes)) else 0
            }

        except Exception as e:
            return {"success": False, "error": str(e)}


def get_bucket():
    """Return the Bucket object"""
    if not all([
        settings.OSS_ACCESS_KEY_ID,
        settings.OSS_ACCESS_KEY_SECRET,
        settings.OSS_ENDPOINT,
        settings.OSS_BUCKET_NAME,
    ]):
        raise ValueError(
            "Incomplete OSS configuration; check OSS_ACCESS_KEY_ID/"
            "OSS_ACCESS_KEY_SECRET/OSS_ENDPOINT/OSS_BUCKET_NAME in .env"
        )

    auth = oss2.Auth(settings.OSS_ACCESS_KEY_ID, settings.OSS_ACCESS_KEY_SECRET)
    bucket = oss2.Bucket(auth, settings.OSS_ENDPOINT, settings.OSS_BUCKET_NAME)
    return bucket


def clean_expire_file():
    """Daily cleanup task: delete expired date directories under temp_ai/"""
    print(f"\n[{datetime.now()}] Starting the daily cleanup task...")
    ROOT_PREFIX = 'temp_ai/'
    bucket = get_bucket()

    # 1. Compute the retention cutoff
    now = datetime.now()
    yesterday_date = (now - timedelta(days=1)).date()
    print(f"Retention cutoff: {yesterday_date} (directories dated before {yesterday_date} will be deleted)")

    # 2. Walk the first level under the root prefix
    try:
        for obj in oss2.ObjectIterator(bucket, prefix=ROOT_PREFIX, delimiter='/'):
            path = ""
            is_directory = False

            # Case A: a virtual directory (CommonPrefix)
            if hasattr(obj, 'prefix'):
                path = obj.prefix
                is_directory = True

            # Case B: an actual object (SimplifiedObjectInfo)
            elif hasattr(obj, 'key'):
                path = obj.key
                # A key ending in / is a directory (an explicit folder object
                # or a common prefix); anything else is a regular file
                is_directory = path.endswith('/')

            if not is_directory:
                # A loose file directly under the root prefix: skip it
                continue

            # From here on, path is a directory such as 'temp_ai/20251229/'

            # Defensive check: skip the root prefix itself
            if path == ROOT_PREFIX:
                continue

            # Directory name: the last path segment once the trailing / is stripped
            folder_name_raw = path.strip('/').split('/')[-1]

            try:
                folder_date_obj = datetime.strptime(folder_name_raw, "%Y%m%d").date()

                if folder_date_obj < yesterday_date:
                    print(f"[delete] Expired directory found: {path}")
                    # delete_objects_by_prefix removes every object under the
                    # prefix, including the folder object itself if one exists
                    delete_objects_by_prefix(bucket, path)

            except ValueError:
                print(f"[skip] Directory is not named by date: {path}")

    except Exception as e:
        print(f"[fatal] Cleanup task failed: {e}")
        traceback.print_exc()


def delete_objects_by_prefix(bucket, prefix):
    """Delete every object under the given prefix, in batches of up to 1000"""
    print(f"    -> Cleaning directory: {prefix} ...")
    batch_list = []
    try:
        for obj in oss2.ObjectIterator(bucket, prefix=prefix):
            batch_list.append(obj.key)
            if len(batch_list) >= 1000:
                bucket.batch_delete_objects(batch_list)
                batch_list = []

        if batch_list:
            bucket.batch_delete_objects(batch_list)
        print(f"    -> Directory {prefix} cleaned.")
    except Exception as e:
        print(f"    [error] Deletion failed: {e}")


# Module-level OSS uploader instance
oss_uploader = OSSUploader()

if __name__ == '__main__':
    resp = oss_uploader.upload_file('想-dj-片段.mp3')
    print(resp)
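
The expiry rule inside `clean_expire_file` can be isolated from the OSS calls so it is testable without a bucket. This is a minimal sketch under that assumption: `is_expired_prefix` is a hypothetical helper that does not exist in the module, and it takes `today` as a parameter instead of calling `datetime.now()`.

```python
from datetime import date, datetime, timedelta

ROOT_PREFIX = "temp_ai/"


def is_expired_prefix(prefix: str, today: date) -> bool:
    """Return True when a 'temp_ai/YYYYMMDD/' directory is older than yesterday."""
    if prefix == ROOT_PREFIX:
        return False  # never delete the root prefix itself

    # The directory name is the last segment once the trailing / is stripped
    folder_name = prefix.strip("/").split("/")[-1]
    try:
        folder_date = datetime.strptime(folder_name, "%Y%m%d").date()
    except ValueError:
        return False  # not a date-named directory, leave it alone

    # Same comparison as the cleanup task: keep yesterday and today
    return folder_date < today - timedelta(days=1)
```

With `today = date(2025, 12, 31)`, the directory `temp_ai/20251229/` is expired while `temp_ai/20251230/` is kept, which matches the one-day retention window the task prints at startup.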
import os

# Secrets are read from the environment (for example, values loaded from .env)
# rather than stored in this file.

ENV = 'test'
# ENV = 'local'


DEBUG = True

# Database (dev)
DB_USER = 'root'
DB_PASSWORD = os.getenv('DB_PASSWORD', '')
DB_HOST = 'rm-bp18h64ad9ak4d7h5do.mysql.rds.aliyuncs.com'
DB_DATABASE = 'music_partner'

# Redis
REDIS_HOST = '172.23.209.46'
REDIS_PORT = 6379
REDIS_PSW = os.getenv('REDIS_PSW', '')
REDIS_DB = 0

# 新抖 (NewRank) key
NEW_RANK_KEY = os.getenv('NEW_RANK_KEY', '')

BACK_BASE_URL = 'https://ai-test.hikoon.com/api/partner'

EMAIL_HOST = 'smtp.exmail.qq.com'
EMAIL_PORT = 465
EMAIL_HOST_USER = 'bigmusic@hikoon.com'
EMAIL_HOST_PASSWORD = os.getenv('EMAIL_HOST_PASSWORD', '')
# Email recipient list
EMAIL_RECEIVERS = ['1774507011@qq.com', 'yangsheng@hikoon.com']


# Tag dictionary: tag key -> display label
TAG_DICT = {
    "viral_song": "网络热歌",
    "sad_songs": "伤感老歌",
    "folk_songs": "民谣",
    "catchy_pop": "口水歌",
    "kids_songs": "洗脑儿歌",
    "tk_songs": "抖音热歌",
    "net_songs": "网络歌曲",

    "dj_remix": "DJ嗨曲",
    "Cheesy_EDM": "土嗨/慢摇",
    "car_music": "车载音乐",
    "shout_rap": "喊麦",
    "heavy_metal": "重金属/土摇DJ嗨曲",

    "mandarin_pop": "华语流行",
    "mainstream_pop": "主流Pop",
    "sweet_songs": "甜歌/校园",
    "hip_rock": "嘻哈说唱R&B摇滚",
    "child_songs": "主流儿歌",

    "international_pop": "国外流行",
    "jp_pop": "日韩流行",
    "west_pop": "欧美流行",
    "el_edm": "电音EDM",

    "chinese_style": "国风",
    "opera_vocal": "戏腔/古韵",
    "guochao_EDM": "国潮电子",
    "gufeng_music": "传统器乐古风",

    "soundtrack_instrumental": "影视/纯音",
    "ys_ost": "影视OST",
    "pur_music": "纯音乐",
    "no_lyric": "无词BGM",

    "other_music": "其他",
    "jazz_blue": "爵士/蓝调",
    "voice_book": "有声书",
    "lab_music": "实验音乐",
    "healing": "治愈",
    "melancholy": "伤感",
    "lonely": "孤独",
    "sweet": "甜蜜",
    "inspiring": "励志",
    "missing": "思念",
    "nostalgic": "怀旧",
    "angry": "愤怒",
    "relaxing": "放松",
    "catchy": "魔性洗脑",
    "heroic": "悲壮",
    "calm": "平静",
    "festive": "喜庆",
    "romantic": "浪漫",
    "majestic": "雄壮",
    "bewitching": "蛊惑",
    "cathartic": "宣泄",
    "solemn": "庄重",
    "passionate": "激情",
    "heavy": "沉重",
    "happy": "快乐",
    "tense": "紧张",
    "horror": "恐怖",
    "touching": "感动",
    "spoof": "恶搞",
    "funny": "搞笑",
    "expectation": "期待",
    "remembrance": "怀念",
    "mysterious": "悬疑",
    "blessing": "祝福",
    "zen": "佛系",
    "soothing": "舒缓",
    "melodious": "悠扬",
    "warm": "温暖",
    "depressed": "忧郁",
    "elderly": "老年",
    "middle_aged": "中年",
    "young_adult": "青年",
    "teenager": "少年",
    "life_scene": "生活场景",
    "sports": "运动",
    "driving": "开车",
    "travel": "旅行",
    "sleep": "睡前",
    "study": "学习",
    "cafe": "咖啡厅",
    "bar": "酒吧",
    "douyin": "抖音",
    "restaurant": "餐厅",
    "car_scene": "汽车",
    "dance": "跳舞",
    "work": "工作",
    "nightclub": "夜店",
    "leisure": "休闲",
    "live_house": "live house",
    "square_dance": "广场舞",
    "wedding": "婚礼",
    "dating": "约会",
    "festival_scene": "节日场景",
    "summer": "夏天",
    "winter": "冬天",
    "autumn": "秋天",
    "spring_festival": "春节",
    "christmas": "圣诞",
    "valentine": "情人节",
    "time_scene": "时间场景",
    "morning": "清晨",
    "afternoon": "午后",
    "evening": "夜晚",
    "midnight": "深夜",
    "regional_scene": "地域场景",
    "campus": "校园",
    "city": "城市",
    "grassland": "草原",
    "tibet": "西藏",
    "xinjiang": "新疆",
    "transition_style": "转场类",
    "card_point_switch": "卡点切换画面类",
    "reverse_suspense": "反转悬念类",
    "emotion_contrast": "情绪对比类",
    "mashup_collection": "混剪合集类",
    "emotional_resonance": "情感共鸣向剪辑",
    "scene_adaptation": "场景适配剪辑",
    "highlight_slice": "高光切片剪辑",
    "live_performance": "现场表演类",
    "singer_live": "歌手现场演唱",
    "talent_cover": "达人翻唱表演",
    "audience_interaction": "观众互动表演",
    "card_point_speed": "卡点、变速类",
    "multi_scene_fragment": "多场景碎片化卡点",
    "tech_effect_speed": "技术流特效变速",
    "lyric_concrete": "歌词具象化卡点",
    "loop_speed_brainwash": "循环变速洗脑",
    "ugc_co_creation": "UGC共创类",
    "jianying_template": "剪映模板",
    "ai_singing": "AI唱歌",
    "emotional_quotes": "情感语录类",
    "late_night_emo": "深夜emo类",
    "morning_inspiration": "清晨励志类",
    "memory_destiny": "回忆杀/宿命感类",
    "dynamic_lyrics_visual": "动态歌词可视化",
    "basic_lyrics_effect": "基础歌词动效",
    "creative_visual_enhance": "创意视觉强化",
    "adaptation": "改编",
    "special_effects_interaction": "特效互动类",
    "gesture_magic_effect": "手势魔法特效互动",
    "lip_sync_challenge": "对口型挑战",
    "douyin_effect_show": "抖音特效变装秀",
    # Listening / performance styles
    "singing_montage": "演唱混剪",
    "live_singing": "现场演唱",

    # Visual-impact styles
    "change_transition": "变装转场",
    "hand_dance": "手势舞",
    "addictive_dance": "魔性舞蹈",
    "landscape_account": "风景号",

    # Atmosphere / footage styles
    "cute_pets": "萌宠",
    "movie_anime_edit": "影视剧/动漫混剪",
    "chinese_classical": "古风",
    "mood_post": "图文心情",

    # Emotional-resonance styles
    "animated_lyrics": "动态歌词",
    "storytelling": "故事演绎",
    "beauty_snaps": "颜值随拍"
}


# Model configuration
BASE_MODEL = "/data/qufeng/models--MIT--ast-finetuned-audioset-10-10-0.4593/snapshots/f826b80d28226b62986cc218e5cec390b1096902"
MOE_DIR = "/data/qufeng/moe_outputs"
BASELINE_CHECKPOINT = "/data/qufeng/best_epoch_base.pt"
LABEL_MAPPING = "/data/qufeng/label_mapping.txt"
DEVICE = "cuda"  # options: cuda/mps/cpu; selected automatically when empty
ROUTER_CHECKPOINT = ""  # when empty, inferred from moe_dir/joint_train/joint/router_best.pt
EXPERTS_DIR = ""  # when empty, inferred from moe_dir/experts_train/experts

# Audio processing configuration
CHUNK_SECONDS = 10.24  # length in seconds of each inference chunk
CROP_SECONDS = 204.8  # if the audio is longer than this, keep only the middle segment before chunking
MAX_CHUNKS = 10  # maximum number of chunks per song used for inference
CHUNK_BATCH_SIZE = 8  # batch size for chunk inference
ROUTING_THRESHOLD = 0.6

API_CONFIG = {
    "api_key": os.getenv("QWEN_API_KEY", ""),
    "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "model": "qwen3-omni-flash",
    "audio_mode": "auto",
    "timeout": 15,
    "lyrics_timeout": 60,
    "lyrics_retries": 2,
    "max_retries": 5,
    "retry_delay": 5
}
# API_CONFIG_91 = {
#     "api_key": "",  # supply via the environment
#     "base_url": "https://api.91aopusi.com/v1",
#     "model": "qwen3-omni-flash",
#     "audio_mode": "auto",
#     "timeout": 30,
#     "lyrics_timeout": 60,
#     "max_retries": 5,
#     "retry_delay": 5
# }

DASHSCOPE_API_KEY = os.getenv('DASHSCOPE_API_KEY', '')

OSS_ACCESS_KEY_ID = os.getenv('OSS_ACCESS_KEY_ID', '')
OSS_ACCESS_KEY_SECRET = os.getenv('OSS_ACCESS_KEY_SECRET', '')
OSS_ENDPOINT = 'oss-cn-hangzhou.aliyuncs.com'
OSS_ENDPOINT_INTERNAL = 'oss-cn-hangzhou-internal.aliyuncs.com'
OSS_BUCKET_NAME = 'ai-sound-data-test'
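
The audio settings `CHUNK_SECONDS`, `CROP_SECONDS`, and `MAX_CHUNKS` describe a crop-then-chunk scheme. The sketch below shows one way those three values could interact; it is not the project's inference code. `plan_chunks` is a hypothetical helper, the 16 kHz sample rate is an assumption, and picking evenly spaced chunks when there are more than `max_chunks` is also an assumption, since the config does not say how the subset is chosen. Working in integer sample counts avoids floating-point drift when dividing 204.8 by 10.24.

```python
def plan_chunks(num_samples, sample_rate=16000, chunk_seconds=10.24,
                crop_seconds=204.8, max_chunks=10):
    """Return (start, end) sample windows for chunked inference."""
    chunk = round(chunk_seconds * sample_rate)
    crop = round(crop_seconds * sample_rate)

    # Long audio: keep only the centred crop window
    start, span = 0, num_samples
    if num_samples > crop:
        start = (num_samples - crop) // 2
        span = crop

    n = span // chunk  # number of whole chunks that fit
    if n == 0:
        # Shorter than one chunk: a single window (the caller would pad it)
        return [(start, start + span)] if span > 0 else []

    indices = list(range(n))
    if n > max_chunks:
        # Assumption: spread the picks evenly across the cropped window
        indices = [(i * n) // max_chunks for i in range(max_chunks)]

    return [(start + i * chunk, start + (i + 1) * chunk) for i in indices]
```

For a 300-second clip at the assumed sample rate, the centred crop holds 20 whole chunks, so with the defaults every second chunk is selected.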
import logging
import logging.handlers
import os

from config import DEBUG

log_dir = "./logs"
log_max_bytes = 1024 * 1024 * 10  # 10 MiB per log file
log_backup_count = 5


def get_logger(name, level=None):
    if level is None:
        level = logging.DEBUG if DEBUG else logging.INFO

    # Configure the logger
    logger = logging.getLogger(name)
    logger.setLevel(level)

    # Already configured: adding another handler would duplicate every record
    if logger.handlers:
        return logger

    # Create the log directory if it does not exist
    os.makedirs(log_dir, exist_ok=True)

    # Handler that writes to a rotating log file
    file_handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, f"{name}.log"),
        maxBytes=log_max_bytes,
        backupCount=log_backup_count,
        encoding='utf-8',
    )
    file_handler.setLevel(level)

    # Output format for the handler
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    file_handler.setFormatter(formatter)

    # Attach the handler to the logger
    logger.addHandler(file_handler)
    return logger


# Module-level variable that caches the application logger
_app_logger = None


def get_app_logger():
    global _app_logger
    if _app_logger is None:
        _app_logger = get_logger("app")
    return _app_logger
openai>=1.58.1
requests>=2.31.0
httpx>=0.28.1
python-dotenv>=1.0.1
pydantic-settings>=2.6.1

numpy>=1.24.0
scipy>=1.10.0
librosa>=0.10.2
soundfile>=0.12.1

pandas>=2.2.0
openpyxl>=3.1.2

# Needed only if the OSS upload module is used (it imports oss2)
oss2

# Optional: enable funasr backend in qwen_analyzer
# dashscope>=1.20.0