Commit 7bf71620 7bf71620f01eb8ff3bc8ab5cdd8d9832a9780575 by 沈秋雨

Initial commit

0 parents
# Required for qwen
QWEN_API_KEY=your_api_key
QWEN_DASHSCOPE_API_KEY=
QWEN_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
QWEN_MODEL=qwen3-omni-flash
QWEN_TIMEOUT=15
QWEN_LYRICS_TIMEOUT=90
QWEN_MAX_RETRIES=3
MUSIC_ANALYZE_LIGHT_MODE=true
MUSIC_DOWNLOAD_DIR=music
MUSIC_MAPPING_FILE=music/music_file_mapping.csv
# Optional song structure service
SONGFORMER_URL=
# Optional ASR backend for lyrics_only path
MUSIC_LYRICS_ASR_BACKEND=funasr
DASHSCOPE_FUNASR_MODEL=fun-asr
DASHSCOPE_BASE_HTTP_API_URL=https://dashscope.aliyuncs.com/api/v1
DASHSCOPE_ASR_POLL_INTERVAL=1
DASHSCOPE_ASR_POLL_TIMEOUT=120
DASHSCOPE_ASR_SUBMIT_URL=https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription
DASHSCOPE_ASR_MODEL=qwen3-asr-flash-filetrans
DASHSCOPE_TASK_STATUS_BASE_URL=https://dashscope.aliyuncs.com/api/v1/tasks
.DS_Store
# Python cache
__pycache__/
*.py[cod]
*.so
.pytest_cache/
.mypy_cache/
# Virtual env
.venv/
venv/
# Local env
.env
# Logs
logs/
*.log
# Runtime outputs
outputs/
music/
*.checkpoint.json
# Local test/sample data
*.xlsx
*.xls
*.csv
# Keep env template and source files
!.env.example
# music_analyze_v2
This project is a standalone pipeline that runs audio tag analysis in batches from an Excel file.
Main flow as implemented:
1. Read the input `xlsx`
2. Take the audio address from the configured URL column
3. Pass selected metadata through to the music analyzer
4. Call `app.middleware.music_analyze.analyze_music(...)`
5. Map the result into fixed delivery columns and keep writing it back to the output `xlsx`
6. Resume from an existing output file and checkpoint when available
The batch entry point is [`pipeline/batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py).
## Current status
- Runnable entry point: [`pipeline/batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py)
- Default analysis chain: `QwenAnalyzer`
- Only working provider: `qwen`
- Prompt source: [`app/prompts/step2_music_decode`](app/prompts/step2_music_decode)
- Output format: fixed delivery columns; the original input columns are not all kept
Notes:
- The CLI still accepts `--provider doubao`, but [`factory.py`](app/middleware/music_analyze/factory.py) only instantiates `qwen`, so passing `doubao` fails at runtime.
- The rest of this README describes what the code currently does, not what was historically planned.
## Installation
```bash
python3.10 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
```
## Environment variables
The minimum required configuration is usually:
```env
QWEN_API_KEY=your_api_key
QWEN_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
QWEN_MODEL=qwen3-omni-flash
QWEN_TIMEOUT=15
QWEN_LYRICS_TIMEOUT=90
QWEN_MAX_RETRIES=3
```
The project also supports these optional enhancements:
- `QWEN_DASHSCOPE_API_KEY`: used by some DashScope/ASR paths
- `SONGFORMER_URL`: enables extra audio structure features
- `MUSIC_LYRICS_ASR_BACKEND`, `DASHSCOPE_*`: lyrics extraction settings
- `OSS_*`: fallback upload through OSS when the audio file is too large
Settings are defined in [`app/core/config.py`](app/core/config.py).
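## Input requirements
The input file must be an `xlsx`.
At least one column of audio addresses is required. The script resolves the URL column in this order:
- the `--url-column` passed explicitly
- `URL`
- `url`
- `cos访问地址`
- `cos_url`
- `audio_url`
If a row's URL is empty:
- no analysis is started
- the row is skipped
- the row counts as already processed when resuming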
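The lookup order above can be sketched as follows. This is a minimal illustration, not the code in `pipeline/batch_analyze_xlsx.py`: the helper names `resolve_url_column` and `is_blank_url` are hypothetical, and treating NaN cells as empty is an assumption about how blank Excel cells arrive.

```python
from typing import Optional, Sequence

# Fallback order copied from the list above.
URL_COLUMN_CANDIDATES = ("URL", "url", "cos访问地址", "cos_url", "audio_url")


def resolve_url_column(columns: Sequence[str], explicit: Optional[str] = None) -> Optional[str]:
    """Return the first candidate present in `columns`, trying `explicit` first."""
    candidates = ([explicit] if explicit else []) + list(URL_COLUMN_CANDIDATES)
    for name in candidates:
        if name in columns:
            return name
    return None


def is_blank_url(value: object) -> bool:
    """Treat None, NaN floats, and whitespace-only strings as an empty URL (assumption)."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is the only float unequal to itself
        return True
    return str(value).strip() == ""
```

A row for which `is_blank_url` returns true would be skipped and marked as completed, matching the behavior described above.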
Metadata is optional but recommended. The script recognizes these fields first:
- `歌曲ID` / `song_id` / `id`
- `tmeid` / `tmeID` / `TMEID`
- `歌曲名` / `歌曲名称` / `title`
- `表演者` / `歌手` / `artist`
- `歌曲时长` / `duration`
By default these columns are also passed to the model as metadata:
- `tmeID,歌曲名称,歌曲名,歌手,表演者,版本,词作者,曲作者`
Override the list with `--metadata-columns`.
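As a rough sketch of how a comma-separated `--metadata-columns` value could turn into the metadata dict handed to the analyzer: the function names below are hypothetical, and dropping empty cells is an assumption rather than confirmed script behavior.

```python
from typing import Any, Dict, List, Mapping

# Default list quoted from the README text above.
DEFAULT_METADATA_COLUMNS = "tmeID,歌曲名称,歌曲名,歌手,表演者,版本,词作者,曲作者"


def parse_metadata_columns(raw: str = DEFAULT_METADATA_COLUMNS) -> List[str]:
    """Split the comma-separated option value, ignoring empty entries."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def build_metadata(row: Mapping[str, Any], columns: List[str]) -> Dict[str, Any]:
    """Copy the requested columns from one row, skipping missing or blank cells."""
    metadata: Dict[str, Any] = {}
    for name in columns:
        value = row.get(name)
        if value is None or str(value).strip() == "":
            continue
        metadata[name] = value
    return metadata
```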
## Quick start
Regular batch run:
```bash
python pipeline/batch_analyze_xlsx.py \
--input 待分析.xlsx \
--output outputs/标签交付结果.xlsx \
--url-column URL \
--provider qwen \
--workers 3
```
With lyrics extraction:
```bash
python pipeline/batch_analyze_xlsx.py \
--input 待分析.xlsx \
--output outputs/标签交付结果.xlsx \
--url-column URL \
--provider qwen \
--workers 3 \
--extract-lyrics
```
Rerun from scratch, without reusing previous output or the checkpoint:
```bash
python pipeline/batch_analyze_xlsx.py \
--input 待分析.xlsx \
--output outputs/标签交付结果.xlsx \
--provider qwen \
--no-resume
```
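## Command-line arguments
| Argument | Description | Current behavior |
|------|------|-------------|
| `--input` | Input Excel path | Required |
| `--output` | Output Excel path | Required |
| `--checkpoint` | Checkpoint file path | Defaults to `<output>.checkpoint.json` |
| `--url-column` | URL column name | Defaults to `URL`; falls back automatically if the column is missing |
| `--provider` | Analysis provider | Accepts `qwen`/`doubao`; only `qwen` should be used at present |
| `--extract-lyrics` | Extract lyrics | When set, the lyrics-aware analysis path is used |
| `--label-level` | Label level | `0` or `1` |
| `--metadata-columns` | Extra columns passed to the model | Comma-separated |
| `--workers` | Number of worker threads | Defaults to `3` |
| `--checkpoint-every` | Rows processed between checkpoint saves | Defaults to `10` |
| `--no-resume` | Disable resuming | Off by default |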
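For orientation, the table can be expressed as an `argparse` parser roughly like this. It is a sketch built only from the table, not a copy of the script: the defaults for `--provider` (`qwen`) and `--label-level` (`0`) are assumptions, since the table does not state them, and `build_parser`/`parse_args` are illustrative names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Batch audio tag analysis over an xlsx file")
    parser.add_argument("--input", required=True, help="input Excel path")
    parser.add_argument("--output", required=True, help="output Excel path")
    parser.add_argument("--checkpoint", default=None, help="defaults to <output>.checkpoint.json")
    parser.add_argument("--url-column", default="URL")
    # Assumed default; the table only says both values are accepted.
    parser.add_argument("--provider", choices=["qwen", "doubao"], default="qwen")
    parser.add_argument("--extract-lyrics", action="store_true")
    # Assumed default of 0.
    parser.add_argument("--label-level", type=int, choices=[0, 1], default=0)
    parser.add_argument(
        "--metadata-columns",
        default="tmeID,歌曲名称,歌曲名,歌手,表演者,版本,词作者,曲作者",
    )
    parser.add_argument("--workers", type=int, default=3)
    parser.add_argument("--checkpoint-every", type=int, default=10)
    parser.add_argument("--no-resume", action="store_true")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.checkpoint is None:
        # Derive the checkpoint path from the output path, as the table describes.
        args.checkpoint = f"{args.output}.checkpoint.json"
    return args
```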
## Output structure
The script writes a fixed delivery table, not a full write-back of "original input columns + analysis columns".
The output columns are defined by `DEFAULT_OUTPUT_COLUMNS` in [`batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py):
- `tmeid`
- `歌曲ID`
- `歌曲名`
- `表演者`
- `歌曲时长`
- `表演者类型`
- `语种`
- `BPM速度`
- `情绪`
- `网络/抖音歌曲`
- `音乐风格`
- `配器`
- `场景`
Result field mapping:
- `表演者类型` <- `performer_type` or `vocal_texture`
- `语种` <- `language`
- `BPM速度` <- `bpm`
- `情绪` <- `emotion`
- `网络/抖音歌曲` <- `douyin_tags`
- `音乐风格` <- `music_style_tags`, falling back to `genre/sub_genre`
- `配器` <- `instrument_tags`
- `场景` <- `scene`
List-valued fields are joined into a single string separated by `、`.
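The mapping rules can be sketched as below. This is an illustration under assumptions, not the script's code: `join_tags`, `first_non_empty`, and `map_result` are hypothetical names, `vocal_texture` is assumed to be a fallback used only when `performer_type` is empty, and joining `genre` with `sub_genre` using `、` is a guess at how the `genre/sub_genre` fallback combines the two.

```python
from typing import Any, Dict

SEPARATOR = "、"


def join_tags(value: Any) -> str:
    """Render a scalar or list value as one string; lists are joined with the separator."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return SEPARATOR.join(str(v).strip() for v in value if str(v).strip())
    return str(value).strip()


def first_non_empty(result: Dict[str, Any], *keys: str) -> str:
    """Return the first key whose rendered value is non-empty."""
    for key in keys:
        text = join_tags(result.get(key))
        if text:
            return text
    return ""


def map_result(result: Dict[str, Any]) -> Dict[str, str]:
    style = join_tags(result.get("music_style_tags"))
    if not style:
        # Assumed fallback: combine genre and sub_genre when no style tags are present.
        parts = (join_tags(result.get("genre")), join_tags(result.get("sub_genre")))
        style = SEPARATOR.join(p for p in parts if p)
    return {
        "表演者类型": first_non_empty(result, "performer_type", "vocal_texture"),
        "语种": join_tags(result.get("language")),
        "BPM速度": join_tags(result.get("bpm")),
        "情绪": join_tags(result.get("emotion")),
        "网络/抖音歌曲": join_tags(result.get("douyin_tags")),
        "音乐风格": style,
        "配器": join_tags(result.get("instrument_tags")),
        "场景": join_tags(result.get("scene")),
    }
```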
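## Resuming
The resume logic is more specific than the previous README described. Actual behavior:
- If the output file exists and its row count matches the current input: previous output is reused by row number
- If the output file exists but the row count differs: the script tries to reuse old results by `歌曲ID` or `tmeid`
- If a checkpoint exists: its completion state is merged, provided the output is aligned by index
- Rows with an empty URL go straight into the completed set
- Progress is flushed to disk every `--checkpoint-every` rows
- On `Ctrl+C` the current progress is saved first, then the process force-exits so worker threads cannot hang it
Default checkpoint filename: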
```text
<output>.checkpoint.json
```
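A minimal sketch of the checkpoint side of this logic follows. The JSON layout (`{"completed": [...]}`), the atomic write via a temporary file, and all function names are assumptions made for illustration; the actual checkpoint format used by the script is not documented here.

```python
import json
import os
import tempfile
from typing import Set


def default_checkpoint_path(output_path: str) -> str:
    """Default checkpoint name as documented: <output>.checkpoint.json."""
    return f"{output_path}.checkpoint.json"


def load_completed(checkpoint_path: str) -> Set[int]:
    """Return completed row indexes, or an empty set when no checkpoint exists."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path, "r", encoding="utf-8") as f:
        payload = json.load(f)
    return set(payload.get("completed", []))


def save_completed(checkpoint_path: str, completed: Set[int]) -> None:
    """Write to a temp file and rename it, so an interrupted save cannot corrupt the checkpoint."""
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"completed": sorted(completed)}, f)
    os.replace(tmp_path, checkpoint_path)


def should_save(processed_count: int, every: int = 10) -> bool:
    """True on every `every`-th processed row, mirroring --checkpoint-every."""
    return every > 0 and processed_count % every == 0
```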
## Prompts and analysis chain
The batch script does not read prompt files directly; it goes through the unified analysis entry point:
[`pipeline/batch_analyze_xlsx.py`](pipeline/batch_analyze_xlsx.py)
-> [`app/middleware/music_analyze/__init__.py`](app/middleware/music_analyze/__init__.py)
-> [`app/middleware/music_analyze/music_analyzer.py`](app/middleware/music_analyze/music_analyzer.py)
-> [`app/middleware/music_analyze/factory.py`](app/middleware/music_analyze/factory.py)
-> [`app/middleware/music_analyze/qwen_analyzer.py`](app/middleware/music_analyze/qwen_analyzer.py)
-> [`app/middleware/music_analyze/prompts.py`](app/middleware/music_analyze/prompts.py)
The prompt directory is fixed; it currently contains:
- [`music_analyze_system_prompt.md`](app/prompts/step2_music_decode/music_analyze_system_prompt.md)
- [`music_analyze_system_prompt_part_a.md`](app/prompts/step2_music_decode/music_analyze_system_prompt_part_a.md)
- [`music_analyze_system_prompt_part_b.md`](app/prompts/step2_music_decode/music_analyze_system_prompt_part_b.md)
- [`music_analyze_user_prompt.md`](app/prompts/step2_music_decode/music_analyze_user_prompt.md)
- [`music_lyrics_only_prompt.md`](app/prompts/step2_music_decode/music_lyrics_only_prompt.md)
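The user prompt template carries a `{{METADATA_SECTION}}` placeholder that is filled per song. The sketch below re-implements that substitution in a self-contained form, modeled on `PromptBuilder.build_metadata_section` in `prompts.py`; the function names `render_metadata_section` and `render_user_prompt` are illustrative and not part of the package.

```python
from typing import Any, Dict, Optional

PLACEHOLDER = "{{METADATA_SECTION}}"


def render_metadata_section(metadata: Optional[Dict[str, Any]]) -> str:
    """Render metadata as a Markdown bullet list; private keys and empty values are skipped."""
    if not metadata:
        return ""
    lines = ["## 音乐元数据"]
    for key, value in metadata.items():
        if key.startswith("_"):
            continue
        if value and str(value).strip():
            lines.append(f"- {key}: {value}")
    lines.append("")
    return "\n".join(lines)


def render_user_prompt(template: str, metadata: Optional[Dict[str, Any]] = None) -> str:
    """Replace the placeholder with the rendered metadata section."""
    return template.replace(PLACEHOLDER, render_metadata_section(metadata))
```

When editing a template, keeping the placeholder intact matters: without it the metadata never reaches the model.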
## Project structure
```text
music_analyze_v2/
├── app/
│   ├── core/
│   │   └── config.py
│   ├── middleware/
│   │   └── music_analyze/
│   │       ├── __init__.py
│   │       ├── base.py
│   │       ├── factory.py
│   │       ├── music_analyzer.py
│   │       ├── prompts.py
│   │       ├── qwen_analyzer.py
│   │       ├── doubao_analyzer.py
│   │       ├── audio_features.py
│   │       └── bpm_analyzer_tools.py
│   ├── prompts/
│   │   └── step2_music_decode/
│   └── utils/
├── pipeline/
│   └── batch_analyze_xlsx.py
├── outputs/
├── requirements.txt
├── .env
├── .env.example
└── README.md
```
## Dependencies
Base dependencies are listed in [`requirements.txt`](requirements.txt).
It currently lists explicitly:
- `openai`
- `requests`
- `httpx`
- `python-dotenv`
- `pydantic-settings`
- `numpy`
- `scipy`
- `librosa`
- `soundfile`
- `pandas`
- `openpyxl`
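`dashscope` is still commented out in `requirements.txt`; if you want to run the lyrics path that depends on this SDK, install it yourself and verify the corresponding code branch.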
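## FAQ
### Why does `--provider doubao` still fail?
The CLI still exposes the `doubao` option, but the analyzer factory only supports `qwen`. This is the current state of the code, not a usage error.
### Why doesn't the output keep all columns of the original Excel?
The script only writes `DEFAULT_OUTPUT_COLUMNS` when saving; this is fixed behavior in the code.
### Where do I edit the prompts?
Edit the template files under [`app/prompts/step2_music_decode`](app/prompts/step2_music_decode).
### Can I still resume if the row count changed?
Partially. The script tries to match previous output by `歌曲ID` or `tmeid`.
### How do I rerun everything from scratch?
Pass `--no-resume` and delete the old output and old checkpoint; that is the cleanest approach.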
"""Standalone audio analysis package."""
from .config import settings
__all__ = ["settings"]
"""Minimal settings for standalone audio analysis pipeline."""
from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )
    # Qwen
    QWEN_API_KEY: str | None = None
    QWEN_DASHSCOPE_API_KEY: str | None = None
    QWEN_BASE_URL: str | None = "https://dashscope.aliyuncs.com/compatible-mode/v1"
    QWEN_MODEL: str | None = "qwen3-omni-flash"
    QWEN_TIMEOUT: float = 15.0
    QWEN_LYRICS_TIMEOUT: float = 90.0
    QWEN_MAX_RETRIES: int = 3
    MUSIC_ANALYZE_LIGHT_MODE: bool = True
    MUSIC_DOWNLOAD_DIR: str = "music"
    MUSIC_MAPPING_FILE: str = "music/music_file_mapping.csv"
    # Optional features
    SONGFORMER_URL: str | None = None
    # DashScope ASR
    DASHSCOPE_FUNASR_MODEL: str = "fun-asr"
    DASHSCOPE_BASE_HTTP_API_URL: str = "https://dashscope.aliyuncs.com/api/v1"
    DASHSCOPE_ASR_POLL_INTERVAL: float = 1.0
    DASHSCOPE_ASR_POLL_TIMEOUT: float = 120.0
    DASHSCOPE_ASR_SUBMIT_URL: str = (
        "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription"
    )
    DASHSCOPE_ASR_MODEL: str = "qwen3-asr-flash-filetrans"
    DASHSCOPE_TASK_STATUS_BASE_URL: str = "https://dashscope.aliyuncs.com/api/v1/tasks"
    # OSS
    OSS_ACCESS_KEY_ID: str | None = None
    OSS_ACCESS_KEY_SECRET: str | None = None
    OSS_ENDPOINT: str | None = None
    OSS_BUCKET_NAME: str | None = None
    OSS_ENDPOINT_INTERNAL: str | None = None
settings = Settings()
"""
Custom exception definitions.
Every business exception should inherit from APIException so that the
global exception handler can process it and return a standard error response.
"""
from fastapi import HTTPException, status
from typing import Optional, Any
class APIException(HTTPException):
    """
    Base API exception.
    Base class for all business exceptions; the global exception handler can catch and handle it uniformly.
    """
    def __init__(
        self,
        status_code: int = status.HTTP_400_BAD_REQUEST,
        detail: Optional[str] = None,
        error_code: Optional[str] = None,
        data: Any = None,
        headers: Optional[dict] = None,
    ):
        super().__init__(status_code=status_code, detail=detail, headers=headers)
        self.error_code = error_code or "UNKNOWN_ERROR"
        self.data = data
class UnauthorizedException(APIException):
    """Unauthorized: authentication failed"""
    def __init__(self, detail: str = "未授权", error_code: str = "UNAUTHORIZED"):
        super().__init__(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail=detail,
            error_code=error_code
        )
class ForbiddenException(APIException):
    """Forbidden: insufficient permissions"""
    def __init__(self, detail: str = "禁止访问", error_code: str = "FORBIDDEN"):
        super().__init__(
            status_code=status.HTTP_403_FORBIDDEN,
            detail=detail,
            error_code=error_code
        )
class NotFoundException(APIException):
    """Resource not found"""
    def __init__(self, detail: str = "资源不存在", error_code: str = "NOT_FOUND"):
        super().__init__(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=detail,
            error_code=error_code
        )
class ConflictException(APIException):
    """Conflict: resource already exists"""
    def __init__(self, detail: str = "资源已存在", error_code: str = "CONFLICT"):
        super().__init__(
            status_code=status.HTTP_409_CONFLICT,
            detail=detail,
            error_code=error_code
        )
class ValidationException(APIException):
    """Validation error: input validation failed"""
    def __init__(self, detail: str = "验证失败", error_code: str = "VALIDATION_ERROR"):
        super().__init__(
            status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
            detail=detail,
            error_code=error_code
        )
class BusinessException(APIException):
    """Business error: a business rule check failed"""
    def __init__(
        self,
        detail: str = "业务操作失败",
        error_code: str = "BUSINESS_ERROR",
        status_code: int = status.HTTP_500_INTERNAL_SERVER_ERROR,
    ):
        super().__init__(
            status_code=status_code,
            detail=detail,
            error_code=error_code
        )
class InternalServerException(APIException):
    """Internal server error"""
    def __init__(
        self,
        detail: str = "内部服务器错误",
        error_code: str = "INTERNAL_SERVER_ERROR",
    ):
        super().__init__(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=detail,
            error_code=error_code
        )
class DatabaseException(APIException):
    """Database error"""
    def __init__(
        self,
        detail: str = "数据库操作失败",
        error_code: str = "DATABASE_ERROR",
    ):
        super().__init__(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=detail,
            error_code=error_code
        )
class ExternalServiceException(APIException):
    """External service error: a call to a third-party service failed"""
    def __init__(
        self,
        detail: str = "外部服务调用失败",
        error_code: str = "EXTERNAL_SERVICE_ERROR",
    ):
        super().__init__(
            status_code=status.HTTP_502_BAD_GATEWAY,
            detail=detail,
            error_code=error_code
        )
class RateLimitException(APIException):
    """Rate limit error: too many requests"""
    def __init__(
        self,
        detail: str = "请求过于频繁,请稍后再试",
        error_code: str = "RATE_LIMIT_EXCEEDED",
    ):
        super().__init__(
            status_code=status.HTTP_429_TOO_MANY_REQUESTS,
            detail=detail,
            error_code=error_code
        )
"""Middleware package."""
"""
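Music analysis module.
Provides unified music tag analysis through Tongyi Qianwen (qwen). The Volcano Engine Doubao
entry points are still exported, but AnalyzerFactory currently only instantiates qwen.
Main features:
- Music style recognition (aligned with international music classification systems)
- Emotion recognition
- Vocal texture recognition
- Language recognition
- Rhythm intensity analysis (1-5, used to guide video editing)
- Climax detection
- Visual concept generation (for MV creation)
- Lyrics recognition (optional)
Providers:
- qwen: Tongyi Qianwen (qwen3-omni-flash)
- doubao: Volcano Engine Doubao (doubao-seed-1-8-251228); currently rejected by AnalyzerFactory
Usage example:
    from app.middleware.music_analyze import analyze_music
    # Basic analysis
    result = analyze_music(
        metadata={"title": "稻香", "artist": "周杰伦"},
        music_url="https://example.com/music.mp3",
        provider="qwen",
    )
    # With lyrics recognition
    result = analyze_music(
        metadata={"title": "稻香"},
        music_url="https://example.com/music.mp3",
        provider="qwen",
        extract_lyrics=True,
    )
"""
# Function exports
from .music_analyzer import (
    analyze_music,
    analyze_music_lyrics_only,
    analyze_music_with_qwen,
    analyze_music_with_doubao,
    get_available_providers,
)
# Class exports
from .base import AudioAnalyzer
from .qwen_analyzer import QwenAnalyzer
from .doubao_analyzer import DoubaoAnalyzer
from .factory import AnalyzerFactory
__version__ = "1.0.0"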
# -*- coding: utf-8 -*-
"""
Music analyzer factory
"""
from typing import Dict, Any, Optional
from .base import AudioAnalyzer
from .qwen_analyzer import QwenAnalyzer
class AnalyzerFactory:
    """Music analyzer factory"""
    _analyzers: Dict[str, AudioAnalyzer] = {}
    @classmethod
    def get_analyzer(cls, provider: str = "qwen", **kwargs) -> AudioAnalyzer:
        """
        Get an analyzer instance.
        Args:
            provider: provider name (only qwen is supported)
            **kwargs: extra configuration (e.g. api_key, model, timeout)
        Returns:
            An AudioAnalyzer instance
        """
        # Instances are cached per provider and model; other kwargs do not affect the cache key.
        cache_key = f"{provider}_{kwargs.get('model', '')}"
        if cache_key in cls._analyzers:
            return cls._analyzers[cache_key]
        if provider == "qwen":
            analyzer = QwenAnalyzer(**kwargs)
        else:
            raise ValueError(f"Unknown provider: {provider}. Only 'qwen' is supported.")
        cls._analyzers[cache_key] = analyzer
        return analyzer
    @classmethod
    def get_default_analyzer(cls) -> AudioAnalyzer:
        """Get the default analyzer (provider read from the environment)"""
        import os
        provider = os.getenv("DEFAULT_MUSIC_ANALYZER", "qwen")
        return cls.get_analyzer(provider=provider)
    @classmethod
    def list_providers(cls) -> list:
        """List the available providers"""
        return ["qwen"]
    @classmethod
    def clear_cache(cls):
        """Clear cached analyzer instances"""
        cls._analyzers.clear()
# -*- coding: utf-8 -*-
"""
Unified entry point for music analysis.
Provides the simplified analyze_music() function.
"""
from typing import Dict, Any, Optional
import os
from .factory import AnalyzerFactory
def analyze_music(
    metadata: Dict[str, Any],
    music_url: str,
    provider: Optional[str] = None,
    extract_lyrics: bool = False,
    label_level: int = 0,
) -> Optional[Dict[str, Any]]:
    """
    Unified entry function for music analysis.
    Args:
        metadata: music metadata dict (e.g. title, artist)
        music_url: URL of the music file
        provider: provider name; read from DEFAULT_MUSIC_ANALYZER when omitted
            (the factory currently only supports qwen)
        extract_lyrics: whether to recognize lyrics
        label_level: label level (0: first-level labels, 1: first- and second-level labels)
    Returns:
        Analysis result dict with these fields:
        - genre: music style (first-level style, e.g. 流行, 摇滚)
        - emotion: list of emotions
        - emotional_intensity: emotional intensity
        - vocal_texture: vocal texture
        - vocal_description: description of the vocal texture
        - visual_concept: visual concept
        - language: language
        - bpm: beats per minute (optional)
        - lyrics: list of lyric lines (optional)
        - _model: name of the model used
        - _token_info: token usage information
    Example:
        >>> result = analyze_music(
        ...     metadata={"title": "稻香", "artist": "周杰伦"},
        ...     music_url="https://example.com/music.mp3",
        ...     provider="qwen",
        ...     extract_lyrics=False,
        ... )
        >>> print(result["genre"])
        流行
    """
    if provider is None:
        provider = os.getenv("DEFAULT_MUSIC_ANALYZER", "qwen")
    analyzer = AnalyzerFactory.get_analyzer(provider=provider)
    return analyzer.analyze(
        metadata=metadata,
        music_url=music_url,
        extract_lyrics=extract_lyrics,
        label_level=label_level,
    )
def analyze_music_with_qwen(
    metadata: Dict[str, Any],
    music_url: str,
    extract_lyrics: bool = False,
    label_level: int = 0,
) -> Optional[Dict[str, Any]]:
    """Analyze music with Tongyi Qianwen"""
    return analyze_music(
        metadata=metadata,
        music_url=music_url,
        provider="qwen",
        extract_lyrics=extract_lyrics,
        label_level=label_level,
    )
def analyze_music_with_doubao(
    metadata: Dict[str, Any],
    music_url: str,
    extract_lyrics: bool = False,
    label_level: int = 0,
) -> Optional[Dict[str, Any]]:
    """Analyze music with Volcano Engine Doubao.
    Note: AnalyzerFactory currently only supports qwen, so this call raises ValueError.
    """
    return analyze_music(
        metadata=metadata,
        music_url=music_url,
        provider="doubao",
        extract_lyrics=extract_lyrics,
        label_level=label_level,
    )
def analyze_music_lyrics_only(
    metadata: Dict[str, Any],
    music_url: str,
    provider: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Recognize lyrics only, without repeating the base tag analysis"""
    if provider is None:
        provider = os.getenv("DEFAULT_MUSIC_ANALYZER", "qwen")
    analyzer = AnalyzerFactory.get_analyzer(provider=provider)
    if hasattr(analyzer, "analyze_lyrics_only"):
        return analyzer.analyze_lyrics_only(metadata=metadata, music_url=music_url)
    # Fallback for providers that do not implement lyrics_only
    result = analyzer.analyze(
        metadata=metadata,
        music_url=music_url,
        extract_lyrics=True,
        label_level=0,
    )
    if isinstance(result, dict):
        lyrics = result.get("lyrics", [])
        return {
            "lyrics": lyrics if isinstance(lyrics, list) else [],
            "_model": result.get("_model"),
            "_token_info": result.get("_token_info"),
        }
    return None
def get_available_providers() -> list:
    """Return the list of available providers"""
    return AnalyzerFactory.list_providers()
# -*- coding: utf-8 -*-
"""
Prompt template builder for music analysis.
Prompts are read from external template files so they can be changed without touching code.
"""
import os
from pathlib import Path
from typing import Dict, Any, Optional
# Template file locations (moved to app/prompts/step2_music_decode)
PROMPTS_DIR = Path(__file__).parent.parent.parent / "prompts" / "step2_music_decode"
SYSTEM_PROMPT_FILE = PROMPTS_DIR / "music_analyze_system_prompt.md"
SYSTEM_PROMPT_PART_A_FILE = PROMPTS_DIR / "music_analyze_system_prompt_part_a.md"
SYSTEM_PROMPT_PART_B_FILE = PROMPTS_DIR / "music_analyze_system_prompt_part_b.md"
USER_PROMPT_FILE = PROMPTS_DIR / "music_analyze_user_prompt.md"
LYRICS_ONLY_PROMPT_FILE = PROMPTS_DIR / "music_lyrics_only_prompt.md"
def load_template(template_path: Path) -> str:
    """
    Load a template from a file.
    Args:
        template_path: path to the template file
    Returns:
        The template content as a string
    """
    if not template_path.exists():
        raise FileNotFoundError(f"模板文件不存在: {template_path}")
    with open(template_path, "r", encoding="utf-8") as f:
        content = f.read()
    # Strip only the comment lines at the top of the file (lines starting with a single #).
    # ## headings and body text are kept.
    lines = content.split("\n")
    filtered_lines = []
    in_header = True
    for line in lines:
        stripped = line.strip()
        # Keep blank lines
        if not stripped:
            filtered_lines.append(line)
            continue
        # While in the file header, skip single-# comment lines (# but not ##)
        if in_header and stripped.startswith("#") and not stripped.startswith("##"):
            continue
        # A ## heading or body text ends the header
        in_header = False
        filtered_lines.append(line)
    return "\n".join(filtered_lines)
class PromptBuilder:
    """Prompt template builder for music analysis"""
    def __init__(self, label_level: int = 0):
        """
        Initialize the prompt builder.
        Args:
            label_level: label level (0: first-level labels, 1: first- and second-level labels)
        """
        self.label_level = label_level
    def build_system_prompt(self) -> str:
        """Build the system prompt by loading the static template"""
        return load_template(SYSTEM_PROMPT_FILE)
    def build_system_prompt_part_a(self) -> str:
        """Build system prompt group A"""
        return load_template(SYSTEM_PROMPT_PART_A_FILE)
    def build_system_prompt_part_b(self) -> str:
        """Build system prompt group B"""
        return load_template(SYSTEM_PROMPT_PART_B_FILE)
    def build_metadata_section(self, metadata: Optional[Dict[str, Any]] = None) -> str:
        """Build the metadata section; keys starting with _ and empty values are skipped"""
        if not metadata:
            return ""
        sections = ["## 音乐元数据"]
        for key, value in metadata.items():
            if key.startswith("_"):
                continue
            if value and str(value).strip():
                sections.append(f"- {key}: {value}")
        sections.append("")
        return "\n".join(sections)
    def build_output_format(
        self,
        include_lyrics: bool = False,
        include_bpm: bool = True,
    ) -> str:
        """Build the output format spec; bpm and lyrics are appended only when requested"""
        # Fields shared by every variant of the format spec
        fields = [
            '"genre": ""',
            '"sub_genre": ""',
            '"language": ""',
            '"vocal_type": ""',
            '"vocal_description": ""',
            '"emotion": [""]',
            '"scene": [""]',
            '"age": ""',
            '"rhythm_intensity": ""',
            '"is_sinking": false',
            '"song_description": ""',
            '"visual_concept": ""',
            '"emotional_intensity": ""',
        ]
        if include_bpm:
            fields.append('"bpm": 0')
        if include_lyrics:
            fields.append('"lyrics": [{"time": "", "text": ""}]')
        return "{\n" + ",\n".join(fields) + "\n}"
    def build_user_prompt(
        self,
        metadata: Optional[Dict[str, Any]] = None,
        include_lyrics: bool = False,
        include_bpm: bool = True,
    ) -> str:
        """
        Build the full user prompt from the template file by replacing placeholders.
        Args:
            metadata: music metadata dict (optional)
            include_lyrics: whether to recognize lyrics (unused; kept for compatibility with existing callers)
            include_bpm: whether to include BPM recognition (unused; kept for compatibility with existing callers)
        Returns:
            The full user prompt
        """
        # Load the template
        template = load_template(USER_PROMPT_FILE)
        # Only the metadata section is substituted.
        # The output format is already defined in the system prompt and is not repeated here.
        replacements = {
            "{{METADATA_SECTION}}": self.build_metadata_section(metadata),
        }
        # Replace the placeholders
        result = template
        for placeholder, value in replacements.items():
            result = result.replace(placeholder, value)
        return result
    def build_lyrics_only_prompt(self) -> str:
        """Build the lyrics-only prompt"""
        return load_template(LYRICS_ONLY_PROMPT_FILE)
def build_analyze_prompt(
    metadata: Optional[Dict[str, Any]] = None,
    include_lyrics: bool = False,
    label_level: int = 0,
) -> tuple[str, str]:
    """
    Build the full analysis prompt.
    Args:
        metadata: music metadata dict (optional)
        include_lyrics: whether to recognize lyrics
        label_level: label level (0: first-level labels, 1: first- and second-level labels)
    Returns:
        A (system_prompt, user_prompt) tuple
    """
    builder = PromptBuilder(label_level=label_level)
    system_prompt = builder.build_system_prompt()
    user_prompt = builder.build_user_prompt(
        metadata=metadata,
        include_lyrics=include_lyrics,
        include_bpm=True,
    )
    return system_prompt, user_prompt
def build_analyze_prompt_part_a(
    metadata: Optional[Dict[str, Any]] = None,
    include_lyrics: bool = False,
    label_level: int = 0,
) -> tuple[str, str]:
    """
    Build the group A analysis prompt (labels and basic information)
    """
    builder = PromptBuilder(label_level=label_level)
    system_prompt = builder.build_system_prompt_part_a()
    user_prompt = builder.build_user_prompt(
        metadata=metadata,
        include_lyrics=include_lyrics,
        include_bpm=True,
    )
    return system_prompt, user_prompt
def build_analyze_prompt_part_b(
    metadata: Optional[Dict[str, Any]] = None,
    include_lyrics: bool = False,
    label_level: int = 0,
) -> tuple[str, str]:
    """
    Build the group B analysis prompt (rhythm and visual description)
    """
    builder = PromptBuilder(label_level=label_level)
    system_prompt = builder.build_system_prompt_part_b()
    user_prompt = builder.build_user_prompt(
        metadata=metadata,
        include_lyrics=include_lyrics,
        include_bpm=True,
    )
    return system_prompt, user_prompt
def build_lyrics_prompt() -> str:
    """Build the lyrics-only prompt"""
    builder = PromptBuilder()
    return builder.build_lyrics_only_prompt()
# Backward compatibility: keep the original builder function
def build_user_prompt(
    metadata: Optional[Dict[str, Any]] = None,
    include_lyrics: bool = False,
    label_level: int = 0,
) -> str:
    """Build the user prompt (compatibility wrapper)"""
    builder = PromptBuilder(label_level=label_level)
    return builder.build_user_prompt(
        metadata=metadata,
        include_lyrics=include_lyrics,
        include_bpm=True,
    )
# 聚音标签识别助手 - 系统角色定义
## 角色定位
你是音乐内容标签标注助手。
你的任务是基于输入的歌曲信息(如歌词、标题、风格描述、音频特征等),严格按照「聚音标签字典」输出标准化标签字段。
只输出标签结果,不做解释,不做分析,不添加任何多余文本。
------
## 输出格式
仅输出 JSON 纯文本,结构如下:
{
"performer_type": "",
"language": "",
"emotion": [],
"douyin_tags": [],
"music_style_tags": [],
"instrument_tags": [],
"scene": []
}
禁止输出任何解释性文字、注释或额外字段。
------
## 全局约束规则
1. 所有标签必须严格从下方字典中选择,禁止自造词。
2. 不允许基于刻板印象猜测(如仅凭曲风推断情绪)。
3. 标签必须基于明确特征:
- 歌词内容
- 音乐风格特征
- 明确出现的配器
- 明确使用场景
4. 多选字段仅选择高度确定且核心表达的标签,避免过度打标。
5. 注意!所有字段至少选择一个标签,不允许留空。
------
# 字段判定标准说明
## 一、演唱者类型 performer_type(单选)
用于标注主要人声类型,仅根据实际听感或明确描述判断:
- 男声:主要为男性声线
- 女声:主要为女性声线
- 童声:明显儿童声线
- 合唱:多人群体演唱为主(非简单和声)
不确定时输出 ""。
------
## 二、情绪 emotion(多选)
必须基于歌曲整体情绪表达判断,而非个别词语。
- 喜庆:节日、庆典氛围明显
- 浪漫:爱情氛围浓厚
- 雄壮:宏大、史诗、气势恢宏
- 蛊惑:迷幻、魅惑、暧昧
- 宣泄:情绪爆发、释放
- 悲壮:悲情但具有力量感
- 愤怒:强烈对抗或激烈表达
- 庄重:正式、肃穆
- 激情:热烈高昂
- 沉重:压抑、厚重
- 快乐:轻松开心
- 励志:奋斗、成长、自我激励
- 思念:想念某人或过往
- 紧张:悬而未决、焦虑
- 恐怖:惊悚氛围
- 感动:温情催泪
- 恶搞:刻意夸张调侃
- 搞笑:明显幽默表达
- 期待:盼望未来
- 怀念:回忆过去
- 甜蜜:恋爱甜感
- 孤独:孤单、自我独白
- 伤感:悲伤低落
- 悬疑:神秘未知感
- 祝福:祝愿表达
- 佛系:平淡随性
- 舒缓:节奏慢、平稳
- 悠扬:旋律流畅优美
- 温暖:柔和治愈
- 忧郁:带有阴郁气质
避免:
- 同时选择强烈对立情绪(如 快乐 与 伤感)
- 同类标签堆叠(如 伤感 + 忧郁 + 孤独 需明确区分)
------
## 三、语种 language(单选)
仅从下列标签中选择一个最主要的演唱语种:
- 普通话
- 粤语
- 藏语
- 英语
- 韩语
- 闽南语
- 蒙语
- 俄语
- 其他
规则:
- 只输出一个语种标签
- 依据实际演唱语言判断,不根据歌手国籍或曲风猜测
- 纯音乐或无法判断时输出 ""
------
## 四、网络/抖音歌曲 douyin_tags(可多选)
仅当歌曲具备明显网络传播特征或主题风格时选择:
- 草原:草原文化、民族草原元素
- 故乡:思乡主题
- 神曲:洗脑旋律、强节奏重复
- 文艺:小众表达、诗性表达
- 青春:校园或成长主题
- 治愈系:温暖疗愈风格
- 清新:轻快自然风格
- 奇幻:幻想、魔幻元素
非明显网络属性不要强行标注。
------
## 五、音乐风格 music_style_tags(多选)
必须根据音乐结构与风格特征判断,不根据歌词主题判断。
- 世界音乐
- 雷鬼
- R&B/Soul
- MC喊麦
- 另类音乐
- 民歌
- 戏曲
- 古风
- 古典音乐
- HipHop
- Rap
- 摇滚
- DJ嗨曲
- 布鲁斯/蓝调
- 拉丁
- 舞曲
- 爵士
- 乡村
- 民谣
- 流行
- 轻音乐
- 国风
- 儿歌
规则:
- 只选核心风格,不叠加相似风格
- 不因使用某个乐器就推断整体风格
- 无明显风格时可只选“流行”
------
## 六、配器 instrument_tags(多选)
仅在明确可识别时选择:
- 二胡
- 竹笛
- 琵琶
- 音效
- 口琴
- 电子
- 木吉他
- 鼓组
- 弦乐
- 电吉他
- 古筝
- 钢琴
规则:
- 必须为明显主导或突出配器
- 不因常规伴奏默认存在而标注
- 不确定不要猜
------
## 七、场景 scene(多选)
根据歌曲使用场景或明显氛围判断:
- 餐厅
- 汽车
- 跳舞
- 旅行
- 工作
- 校园
- 夜店
- 运动
- 休闲
- live house
- 广场舞
- 抖音
- 婚礼
- 约会
规则:
- 仅当歌曲明显适配该场景时标注
- 避免泛化场景(如所有慢歌都标“休闲”)
------
## 最终执行要求
- 只输出 JSON
- 不解释
- 不补充说明
- 不输出字典内容
- 不输出“分析如下”之类文字
- 不添加未定义字段
严格遵守字段范围与空值规则。
## 输出格式
必须严格输出以下 JSON 结构,字段名不能改:
```json
{
"performer_type": "",
"language": "",
"emotion": [],
"douyin_tags": [],
"music_style_tags": [],
"instrument_tags": [],
"scene": []
}
```
# 待分析元数据
{{METADATA_SECTION}}
# 任务目标
请基于音频内容完成聚音标签识别,仅输出系统要求的标签字段。
# 约束提醒
- 必须基于实际听到的特征,无法确认的标签输出空值。
- 严格执行 JSON 纯文本输出,禁止任何 Markdown 格式。
# 歌词识别提示词模板
# 仅识别歌词内容,不包含其他音乐分析
请识别并转录音频中的完整歌词。
## 核心任务
1. **逐句识别**:按时间顺序输出每一句歌词,每句通过换行进行分隔。
2. **字段要求**:每条记录必须包含 `time` (格式 "mm:ss.xxx",无法确定则为 null) 和 `text` (歌词内容)。
3. **无语义音节压缩**:对于“啊/呜/哦/嗯/啦”等辅助音节,禁止逐字展示,统一使用 `...` 缩略(例:把“啊啊啊啊”识别为“啊...”)。
4. **完整性**:必须转录包括重复段落在内的全曲内容。
5. **静默与纯音乐**:若为纯音乐或无歌词,仅返回空数组 `[]`
6. 完整识别歌曲所有段落的完整歌词,包括不同段落之间重复了的歌词
## 输出格式规范
- 严格输出 JSON,不得包含任何 Markdown 转义符(如 ```json)或解释性文字。
- 字段统一为: {"lyrics": [{"time": "00:00.000", "text": "内容"}]}
## 质量控制
- 遇到合唱/重叠时,以主旋律为主。
- 严禁自行脑补不存在的歌词。
- 不要返回任何其他无关内容
"""
Alibaba Cloud OSS file upload module
"""
import os
import uuid
import logging
from datetime import datetime, timedelta
import oss2
from app.core.config import settings
logger = logging.getLogger(__name__)
class OSSUploader:
    """Alibaba Cloud OSS uploader"""
    def __init__(self):
        """Initialize the OSS client"""
        self.access_key_id = settings.OSS_ACCESS_KEY_ID
        self.access_key_secret = settings.OSS_ACCESS_KEY_SECRET
        self.endpoint = settings.OSS_ENDPOINT
        self.bucket_name = settings.OSS_BUCKET_NAME
        if not all([
            self.access_key_id,
            self.access_key_secret,
            self.endpoint,
            self.bucket_name,
        ]):
            raise ValueError("OSS配置不完整,请检查 .env 中的 OSS_ACCESS_KEY_ID/OSS_ACCESS_KEY_SECRET/OSS_ENDPOINT/OSS_BUCKET_NAME")
        logger.info(
            "OSS配置: endpoint=%s, bucket=%s",
            self.endpoint,
            self.bucket_name,
        )
        # Create the auth object
        self.auth = oss2.Auth(self.access_key_id, self.access_key_secret)
        # Use the public endpoint by default; the internal endpoint tends to fail outside the Alibaba Cloud intranet.
        self.bucket = oss2.Bucket(self.auth, self.endpoint, self.bucket_name)
    def upload_file(self, local_file_path, oss_object_name=None):
        """
        Upload a file to OSS.
        Args:
            local_file_path: local file path
            oss_object_name: OSS object name; when omitted, a UUID plus the original file extension is used
        Returns:
            tuple: (True, url) on success, (False, error message) on failure
        """
        try:
            if not os.path.exists(local_file_path):
                logger.error(f"本地文件不存在: {local_file_path}")
                return False, "本地文件不存在"
            if not oss_object_name:
                # No object name given: generate one from a UUID plus the original extension
                _, ext = os.path.splitext(local_file_path)
                oss_object_name = f"{uuid.uuid4()}{ext}"
            # Always store under a dated temp_ai/ prefix so clean_expire_file can expire it later
            date = datetime.now().strftime("%Y%m%d")
            oss_object_name = f"temp_ai/{date}/{oss_object_name}"
            # Upload the file
            self.bucket.put_object_from_file(oss_object_name, local_file_path)
            # Build the file URL (assumes OSS_ENDPOINT is a bare host without a scheme)
            file_url = f"https://{self.bucket_name}.{self.endpoint}/{oss_object_name}"
            logger.info(f"文件上传成功: {local_file_path} -> {file_url}")
            return True, file_url
        except Exception as e:
            logger.error(f"文件上传失败: {local_file_path}, 错误: {e}")
            return False, str(e)
    def upload_data(self, data, oss_object_name):
        """
        Upload in-memory data to OSS.
        Args:
            data: the data to upload (str or bytes)
            oss_object_name: OSS object name
        Returns:
            dict: the upload result
        """
        try:
            # Upload the data
            result = self.bucket.put_object(oss_object_name, data)
            # Build the file URL in the same bucket-as-subdomain form as upload_file
            # (assumes OSS_ENDPOINT is a bare host without a scheme)
            file_url = f"https://{self.bucket_name}.{self.endpoint}/{oss_object_name}"
            return {
                "success": True,
                "oss_object_name": oss_object_name,
                "file_url": file_url,
                "etag": result.etag,
                "size": len(data) if isinstance(data, (str, bytes)) else 0
            }
        except Exception as e:
            return {"success": False, "error": str(e)}
def get_bucket():
    """Return a Bucket object built from the OSS settings"""
    if not all([
        settings.OSS_ACCESS_KEY_ID,
        settings.OSS_ACCESS_KEY_SECRET,
        settings.OSS_ENDPOINT,
        settings.OSS_BUCKET_NAME,
    ]):
        raise ValueError("OSS配置不完整,请检查 .env 中的 OSS_ACCESS_KEY_ID/OSS_ACCESS_KEY_SECRET/OSS_ENDPOINT/OSS_BUCKET_NAME")
    auth = oss2.Auth(settings.OSS_ACCESS_KEY_ID, settings.OSS_ACCESS_KEY_SECRET)
    bucket = oss2.Bucket(auth, settings.OSS_ENDPOINT, settings.OSS_BUCKET_NAME)
    return bucket
def clean_expire_file():
    """Core task: delete dated directories under temp_ai/ that are older than yesterday"""
    print(f"\n[{datetime.now()}] 开始执行每日清理任务...")
    ROOT_PREFIX = 'temp_ai/'
    bucket = get_bucket()
    # 1. Compute the time threshold
    now = datetime.now()
    yesterday_date = (now - timedelta(days=1)).date()
    print(f"保留阈值: {yesterday_date} (即 {yesterday_date} 之前的数据将被删除)")
    # 2. Walk the directory
    try:
        for obj in oss2.ObjectIterator(bucket, prefix=ROOT_PREFIX, delimiter='/'):
            path = ""
            is_directory = False
            # --- Unified path lookup ---
            # Case A: a virtual directory (CommonPrefix)
            if hasattr(obj, 'prefix'):
                path = obj.prefix
                is_directory = True
            # Case B: a real object (SimplifiedObjectInfo)
            elif hasattr(obj, 'key'):
                path = obj.key
                # A key ending in / is an explicitly created folder object
                if path.endswith('/'):
                    is_directory = True
                else:
                    is_directory = False  # a regular file
            # --- Branching ---
            if not is_directory:
                # A real file (not a folder object): skip it
                # print(f"[跳过] 散落文件: {path}")
                continue
            # path is now guaranteed to be a directory (e.g. 'temp_ai/20251229/');
            # the date check starts here.
            # Guard against the path being 'temp_ai/' itself
            if path == ROOT_PREFIX:
                continue
            # Directory name: strip the surrounding slashes, then take the last path segment
            folder_name_raw = path.strip('/').split('/')[-1]
            try:
                folder_date_obj = datetime.strptime(folder_name_raw, "%Y%m%d").date()
                if folder_date_obj < yesterday_date:
                    print(f"[删除] 发现过期目录: {path}")
                    # Note: delete_objects_by_prefix deletes every object under the prefix.
                    # If the directory itself is an object, it is deleted as well; no special handling is needed.
                    delete_objects_by_prefix(bucket, path)
                else:
                    # print(f"[跳过] 目录较新: {path}")
                    pass
            except ValueError:
                print(f"[跳过] 非日期命名目录: {path}")
    except Exception as e:
        import traceback
        print(f"[严重错误] 任务执行失败: {e}")
        traceback.print_exc()
def delete_objects_by_prefix(bucket, prefix):
    """Delete every object under the given prefix, in batches of up to 1000 keys"""
    print(f" -> 正在清理目录: {prefix} ...")
    batch_list = []
    try:
        for obj in oss2.ObjectIterator(bucket, prefix=prefix):
            batch_list.append(obj.key)
            if len(batch_list) >= 1000:
                bucket.batch_delete_objects(batch_list)
                batch_list = []
        if batch_list:
            bucket.batch_delete_objects(batch_list)
        print(f" -> 目录 {prefix} 清理完毕。")
    except Exception as e:
        print(f" [错误] 删除过程出错: {e}")
# Create the OSS uploader instance
oss_uploader = OSSUploader()
if __name__ == '__main__':
    resp = oss_uploader.upload_file('想-dj-片段.mp3')
    print(resp)
from dashscope.common.constants import DASHSCOPE_API_KEY_ENV
ENV = 'test'
# ENV = 'local'
DEBUG = True
### Database
#dev
DB_USER = 'root'
DB_PASSWORD = 'Hikoon123!'
DB_HOST = 'rm-bp18h64ad9ak4d7h5do.mysql.rds.aliyuncs.com'
DB_DATABASE = 'music_partner'
#Redis
REDIS_HOST = '172.23.209.46'
REDIS_PORT = 6379
REDIS_PSW = '1bvvpAmKXFhDDJXb'
REDIS_DB = 0
# Xindou (新抖) key
NEW_RANK_KEY = 'vh1gbvynpyegg6gebhgepgvc6'
BACK_BASE_URL = 'https://ai-test.hikoon.com/api/partner'
EMAIL_HOST = 'smtp.exmail.qq.com'
EMAIL_PORT = 465
EMAIL_HOST_USER = 'bigmusic@hikoon.com'
EMAIL_HOST_PASSWORD = 'Music!123'
# Email recipient list
EMAIL_RECEIVERS = ['1774507011@qq.com','yangsheng@hikoon.com']
# Tag dictionary
TAG_DICT = {
"viral_song": "网络热歌",
"sad_songs": "伤感老歌",
"folk_songs": "民谣",
"catchy_pop": "口水歌",
"kids_songs": "洗脑儿歌",
"tk_songs": "抖音热歌",
"net_songs": "网络歌曲",
"dj_remix": "DJ嗨曲",
"Cheesy_EDM": "土嗨/慢摇",
"car_music": "车载音乐",
"shout_rap": "喊麦",
"heavy_metal": "重金属/土摇DJ嗨曲",
"mandarin_pop": "华语流行",
"mainstream_pop": "主流Pop",
"sweet_songs": "甜歌/校园",
"hip_rock": "嘻哈说唱R&B摇滚",
"child_songs": "主流儿歌",
"international_pop": "国外流行",
"jp_pop": "日韩流行",
"west_pop": "欧美流行",
"el_edm": "电音EDM",
"chinese_style": "国风",
"opera_vocal": "戏腔/古韵",
"guochao_EDM": "国潮电子",
"gufeng_music": "传统器乐古风",
"soundtrack_instrumental": "影视/纯音",
"ys_ost": "影视OST",
"pur_music": "纯音乐",
"no_lyric": "无词BGM",
"other_music": "其他",
"jazz_blue": "爵士/蓝调",
"voice_book": "有声书",
"lab_music": "实验音乐",
"healing": "治愈",
"melancholy": "伤感",
"lonely": "孤独",
"sweet": "甜蜜",
"inspiring": "励志",
"missing": "思念",
"nostalgic": "怀旧",
"angry": "愤怒",
"relaxing": "放松",
"catchy": "魔性洗脑",
"heroic": "悲壮",
"calm": "平静",
"festive": "喜庆",
"romantic": "浪漫",
"majestic": "雄壮",
"bewitching": "蛊惑",
"cathartic": "宣泄",
"solemn": "庄重",
"passionate": "激情",
"heavy": "沉重",
"happy": "快乐",
"tense": "紧张",
"horror": "恐怖",
"touching": "感动",
"spoof": "恶搞",
"funny": "搞笑",
"expectation": "期待",
"remembrance": "怀念",
"mysterious": "悬疑",
"blessing": "祝福",
"zen": "佛系",
"soothing": "舒缓",
"melodious": "悠扬",
"warm": "温暖",
"depressed": "忧郁",
"elderly": "老年",
"middle_aged": "中年",
"young_adult": "青年",
"teenager": "少年",
"life_scene": "生活场景",
"sports": "运动",
"driving": "开车",
"travel": "旅行",
"sleep": "睡前",
"study": "学习",
"cafe": "咖啡厅",
"bar": "酒吧",
"douyin":"抖音",
"restaurant": "餐厅",
"car_scene": "汽车",
"dance": "跳舞",
"work": "工作",
"nightclub": "夜店",
"leisure": "休闲",
"live_house": "live house",
"square_dance": "广场舞",
"wedding": "婚礼",
"dating": "约会",
"festival_scene": "节日场景",
"summer": "夏天",
"winter": "冬天",
"autumn": "秋天",
"spring_festival": "春节",
"christmas": "圣诞",
"valentine": "情人节",
"time_scene": "时间场景",
"morning": "清晨",
"afternoon": "午后",
"evening": "夜晚",
"midnight": "深夜",
"regional_scene": "地域场景",
"campus": "校园",
"city": "城市",
"grassland": "草原",
"tibet": "西藏",
"xinjiang": "新疆",
"transition_style": "转场类",
"card_point_switch": "卡点切换画面类",
"reverse_suspense": "反转悬念类",
"emotion_contrast": "情绪对比类",
"mashup_collection": "混剪合集类",
"emotional_resonance": "情感共鸣向剪辑",
"scene_adaptation": "场景适配剪辑",
"highlight_slice": "高光切片剪辑",
"live_performance": "现场表演类",
"singer_live": "歌手现场演唱",
"talent_cover": "达人翻唱表演",
"audience_interaction": "观众互动表演",
"card_point_speed": "卡点、变速类",
"multi_scene_fragment": "多场景碎片化卡点",
"tech_effect_speed": "技术流特效变速",
"lyric_concrete": "歌词具象化卡点",
"loop_speed_brainwash": "循环变速洗脑",
"ugc_co_creation": "UGC共创类",
"jianying_template": "剪映模板",
"ai_singing": "AI唱歌",
"emotional_quotes": "情感语录类",
"late_night_emo": "深夜emo类",
"morning_inspiration": "清晨励志类",
"memory_destiny": "回忆杀/宿命感类",
"dynamic_lyrics_visual": "动态歌词可视化",
"basic_lyrics_effect": "基础歌词动效",
"creative_visual_enhance": "创意视觉强化",
"adaptation": "改编",
"special_effects_interaction": "特效互动类",
"gesture_magic_effect": "手势魔法特效互动",
"lip_sync_challenge": "对口型挑战",
"douyin_effect_show": "抖音特效变装秀",
# Listening and performance style
"singing_montage": "演唱混剪",
"live_singing": "现场演唱",
# Visual impact style
"change_transition": "变装转场",
"hand_dance": "手势舞",
"addictive_dance": "魔性舞蹈",
"landscape_account": "风景号",
# Atmosphere and footage style
"cute_pets": "萌宠",
"movie_anime_edit": "影视剧/动漫混剪",
"chinese_classical": "古风",
"mood_post": "图文心情",
# Emotional resonance style
"animated_lyrics": "动态歌词",
"storytelling": "故事演绎",
"beauty_snaps": "颜值随拍"
}
# Model configuration
BASE_MODEL = "/data/qufeng/models--MIT--ast-finetuned-audioset-10-10-0.4593/snapshots/f826b80d28226b62986cc218e5cec390b1096902"
MOE_DIR = "/data/qufeng/moe_outputs"
BASELINE_CHECKPOINT = "/data/qufeng/best_epoch_base.pt"
LABEL_MAPPING = "/data/qufeng/label_mapping.txt"
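DEVICE = "cuda"  # Options: cuda/mps/cpu; auto-selected when empty
ROUTER_CHECKPOINT = ""  # When empty, inferred from moe_dir/joint_train/joint/router_best.pt
EXPERTS_DIR = ""  # When empty, inferred from moe_dir/experts_train/experts
# Audio processing configuration
CHUNK_SECONDS = 10.24  # Length in seconds of each inference chunk
CROP_SECONDS = 204.8  # If the audio is longer than this, only the middle segment of this length is chunked
MAX_CHUNKS = 10  # Maximum number of chunks per track used for inference
CHUNK_BATCH_SIZE = 8  # Batch size for chunked inference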
ROUTING_THRESHOLD = 0.6
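# How the three audio settings above might interact can be sketched as
# follows. This is an assumption-laden illustration, not the project's
# inference code: plan_chunks is a hypothetical helper, trailing partial
# chunks are dropped, and when more than MAX_CHUNKS chunks are available they
# are picked at evenly spaced positions (the real selection rule is not shown).

```python
CHUNK_SECONDS = 10.24
CROP_SECONDS = 204.8
MAX_CHUNKS = 10


def plan_chunks(duration, chunk=CHUNK_SECONDS, crop=CROP_SECONDS, max_chunks=MAX_CHUNKS):
    """Return a list of (start, end) windows in seconds for one track."""
    if duration > crop:
        # Keep only the middle `crop` seconds of a long track
        offset, usable = (duration - crop) / 2, crop
    else:
        offset, usable = 0.0, duration
    # Number of full chunks; the small epsilon guards against float rounding
    n_full = int(usable / chunk + 1e-9)
    if n_full == 0:
        # Track shorter than one chunk: use it as a single window
        return [(offset, offset + usable)]
    if n_full > max_chunks:
        # Assumed rule: evenly spaced chunk indices across the usable span
        indices = [i * n_full // max_chunks for i in range(max_chunks)]
    else:
        indices = list(range(n_full))
    return [(offset + i * chunk, offset + (i + 1) * chunk) for i in indices]


windows = plan_chunks(300.0)
print(len(windows), [round(start, 2) for start, _ in windows[:3]])
```

# With the defaults, a cropped track yields 20 full chunks, of which 10 are kept.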
API_CONFIG = {
"api_key": "sk-d9b4d3581bde47d887354f9160a509a2",
"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"model": "qwen3-omni-flash",
"audio_mode": "auto",
"timeout": 15,
"lyrics_timeout": 60,
"lyrics_retries": 2,
"max_retries": 5,
"retry_delay": 5
}
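# The max_retries and retry_delay settings suggest a fixed-delay retry loop.
# A minimal sketch follows; call_with_retries is a hypothetical helper, and
# treating max_retries as the total number of attempts is an assumption (the
# analyzer that consumes API_CONFIG is not shown in this file).

```python
import time


def call_with_retries(func, max_retries=5, retry_delay=5, sleep=time.sleep):
    """Call func(); on an exception, wait retry_delay seconds and try again."""
    attempts = max(1, max_retries)  # always try at least once
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                sleep(retry_delay)  # no delay after the final failure
    raise last_error


# Usage with a stubbed sleep so the example runs instantly
state = {"calls": 0}
sleeps = []


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = call_with_retries(flaky, max_retries=5, retry_delay=5, sleep=sleeps.append)
print(result, state["calls"], sleeps)
```

# Injecting `sleep` keeps the helper testable without real waiting.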
# API_CONFIG_91 = {
# "api_key": "sk-E90VNVMyhfk2zDBDoToCXoipzGofD2SobwBqaCzbG3junlob",
# "base_url": "https://api.91aopusi.com/v1",
# "model": "qwen3-omni-flash",
# "audio_mode": "auto",
# "timeout": 30,
# "lyrics_timeout": 60,
# "max_retries": 5,
# "retry_delay": 5
# }
DASHSCOPE_API_KEY = 'sk-d9b4d3581bde47d887354f9160a509a2'
OSS_ACCESS_KEY_ID='LTAI4G7UvaW2e4UTCb3KCNjN'
OSS_ACCESS_KEY_SECRET='ow5hlVMmJAQY9o7nEAtMER6MFkPedm'
OSS_ENDPOINT='oss-cn-hangzhou.aliyuncs.com'
OSS_ENDPOINT_INTERNAL='oss-cn-hangzhou-internal.aliyuncs.com'
OSS_BUCKET_NAME='ai-sound-data-test'
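# Credentials are hard-coded in this module, while the repository's env
# template already defines QWEN_API_KEY, QWEN_BASE_URL, QWEN_MODEL,
# QWEN_TIMEOUT and QWEN_MAX_RETRIES. One possible way to build API_CONFIG from
# the environment instead is sketched below; load_api_config is a hypothetical
# helper and the fallback defaults simply mirror the values above.

```python
import os


def load_api_config(environ=os.environ):
    """Build the API settings from environment variables."""
    api_key = environ.get("QWEN_API_KEY", "")
    if not api_key:
        # Fail early rather than sending requests without a key
        raise RuntimeError("QWEN_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": environ.get(
            "QWEN_BASE_URL", "https://dashscope.aliyuncs.com/compatible-mode/v1"
        ),
        "model": environ.get("QWEN_MODEL", "qwen3-omni-flash"),
        "timeout": int(environ.get("QWEN_TIMEOUT", "15")),
        "max_retries": int(environ.get("QWEN_MAX_RETRIES", "5")),
    }


# Example with an explicit mapping instead of the real process environment
cfg = load_api_config({"QWEN_API_KEY": "example-key", "QWEN_TIMEOUT": "30"})
print(cfg["model"], cfg["timeout"])
```

# The same pattern could cover the database, Redis, email and OSS secrets,
# with variable names chosen by the project.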
\ No newline at end of file
import logging.handlers
import os
from config import DEBUG
log_dir = "./logs"
log_max_bytes = 1024 * 1024 * 10  # 10 MiB per log file
log_backup_count = 5
def get_logger(name, level=None):
    if not level:
        level = logging.DEBUG if DEBUG else logging.INFO
    # Configure the logger
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Reuse the logger if a handler is already attached, so that repeated
    # calls with the same name do not produce duplicate log lines
    if logger.handlers:
        return logger
    # Create the log directory if it does not exist
    os.makedirs(log_dir, exist_ok=True)
    # Create a rotating file handler that writes to the log file
    file_handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, f'{name}.log'), maxBytes=log_max_bytes,
        backupCount=log_backup_count, encoding='utf-8')
    file_handler.setLevel(level)
    # Define the handler's output format
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    file_handler.setFormatter(formatter)
    # Attach the handler to the logger
    logger.addHandler(file_handler)
    return logger
# Module-level variable that caches the application logger instance
_app_logger = None
def get_app_logger():
    global _app_logger
    if _app_logger is None:
        _app_logger = get_logger("app")
    return _app_logger
openai>=1.58.1
requests>=2.31.0
httpx>=0.28.1
python-dotenv>=1.0.1
pydantic-settings>=2.6.1
numpy>=1.24.0
scipy>=1.10.0
librosa>=0.10.2
soundfile>=0.12.1
pandas>=2.2.0
openpyxl>=3.1.2
# Optional: enable funasr backend in qwen_analyzer
# dashscope>=1.20.0