1836 lines
123 KiB
Markdown
1836 lines
123 KiB
Markdown
# sub2api-cn-relay-manager 执行板
|
||
|
||
日期:2026-05-22
|
||
当前 Gate:APPROVED(代码门禁已通过,并且 2026-05-21 已继续收掉 account probe、gateway probe 认证语义和 latest-head `self_service` fresh-host 复验的剩余问题。最新 MiniMax 53hk fresh-host 验收 `artifacts/real-host-acceptance/20260521_191418_remote43_minimax_key_import/21-summary.json`、DeepSeek 2166 `subscription` fresh-host 验收 `artifacts/real-host-acceptance/20260521_201509_remote43_deepseek_key_import/21-summary.json`、以及 latest-head `self_service` 标准 fresh-host 验收 `artifacts/real-host-acceptance/20260521_210403/05-import.json` / `07-access-status.json` 已共同证明:`subscription` 与 `self_service` 主链路都能在真实 fresh host 上闭环到 ready,host `/v1/models` 与 `/v1/chat/completions` 也都真实返回 `HTTP 200`。当前仍存在的 `reconcile=drifted` 只反映共享 fresh-host 环境里的历史残留资源,不阻塞 PRD 首版放行)
|
||
目标:实现独立控制面、零侵入宿主、可导入国产模型并具备可运维的导入/回滚/访问闭环。
|
||
|
||
## 2026-05-22 当前真相
|
||
|
||
- 当前主目录 `artifacts/real-host-acceptance/` 已只保留最终证据;历史调试样本已迁到 `artifacts/real-host-acceptance-archive/`
|
||
- access ready 语义已经收口为:`/v1/models` 命中 `smoke_test_model`,且最小 `POST /v1/chat/completions` smoke 成功;不会再出现 models-only 假 ready
|
||
- `subscription` 主链路已通过 latest fresh-host 复验:
|
||
- MiniMax 53hk:`artifacts/real-host-acceptance/20260521_191418_remote43_minimax_key_import/21-summary.json`
|
||
- DeepSeek 2166:`artifacts/real-host-acceptance/20260521_201509_remote43_deepseek_key_import/21-summary.json`
|
||
- Kimi A7M(local host `v0.1.129`):`artifacts/real-host-acceptance/20260522_122706_local_v0129_kimi_a7m_subscription_freshhost/21-summary.json`
|
||
- `self_service` 主链路已通过 latest-head 标准 fresh-host 复验:
|
||
- `artifacts/real-host-acceptance/20260521_210403/05-import.json`
|
||
- `artifacts/real-host-acceptance/20260521_210403/07-access-status.json`
|
||
- 2026-05-27 已把公网用户入口从 `kimi-portal` 收口为通用多模型 portal:
|
||
- 新正式地址:`https://sub.tksea.top/portal/`
|
||
- 旧地址 `https://sub.tksea.top/kimi-portal/` 当前保留为 `302` 跳转,避免历史分享链接失效
|
||
- 站点资产与 Nginx 路由不再只存在 `/tmp` 临时文件,已收口进仓库:
|
||
- `deploy/tksea-portal/index.html`
|
||
- `deploy/tksea-portal/admin-batch-import.html`
|
||
- `deploy/tksea-portal/nginx.sub.tksea.top.conf.example`
|
||
- `scripts/deploy/deploy_tksea_portal.sh`
|
||
- 新页面已补齐登录态、用户信息、可绑定分组、活跃订阅、历史 key 列表,以及“新创建 key 对应分组/模型”的即时展示
|
||
- 同轮已补最小 batch-import 管理页:
|
||
- 地址:`/portal/admin-batch-import.html`
|
||
- 直接消费 `POST /api/batch-import/runs`
|
||
- 直接消费 `GET /api/batch-import/runs/{run_id}`
|
||
- 直接消费 `GET /api/batch-import/runs/{run_id}/items`
|
||
- 用于验证 `matched_account_state / account_resolution / provision_reused`
|
||
- 2026-05-27 已继续把管理入口收成统一 `/portal/admin/` 体系:
|
||
- `https://sub.tksea.top/portal/admin/`:管理首页
|
||
- `https://sub.tksea.top/portal/admin/providers.html`:provider 目录 / preview-import / import / manifest 草稿页
|
||
- `https://sub.tksea.top/portal/admin/batch-import.html`:结构化 batch-import 入口,当前跳转到 legacy `admin-batch-import.html`
|
||
- Nginx 示例与 deploy 脚本已补同域 CRM 反代 `https://sub.tksea.top/portal-admin-api/`
|
||
- 目的不是绕过鉴权,而是让浏览器可直接操作 remote43 CRM;当前已继续补成“管理员用户名 / 密码登录 + HttpOnly session cookie”,同时保留 Bearer admin token 兼容脚本与紧急兜底
|
||
- 2026-05-27 已继续把 provider manifest 草稿从“只存在浏览器”补成真正的服务端能力:
|
||
- 新增 `POST /api/provider-drafts`
|
||
- 新增 `GET /api/provider-drafts`
|
||
- 新增 `GET /api/provider-drafts/{draft_id}`
|
||
- 新增 `PUT /api/provider-drafts/{draft_id}`
|
||
- 新增 `DELETE /api/provider-drafts/{draft_id}`
|
||
- 数据当前落到 CRM SQLite `provider_drafts` 表
|
||
- `providers.html` 已可直接“保存到服务端”、回看历史草稿、以及更新 / 删除已保存草稿
|
||
- 2026-05-28 已把“草稿一键生成 pack/provider 文件并提交到仓库”的发布链路补齐:
|
||
- 新增 `POST /api/provider-drafts/{draft_id}/publish`
|
||
- 发布动作会把草稿 canonicalize 成完整 `pack.ProviderManifest`
|
||
- 服务端会原子执行:写 `providers/<provider_id>.json`、bump `pack.json` patch 版本、更新 `checksums.txt`、重跑整包校验、`git add` + `git commit`
|
||
- 运行前提新增:`SUB2API_CRM_REPO_ROOT` 必须指向**真实 Git 仓库**
|
||
- remote43 原本的 `/home/ubuntu/sub2api-cn-relay-manager` 只是普通目录,不带 `.git`
|
||
- 2026-05-28 已继续把这条路径收成正式部署约定:
|
||
- CRM 现在应统一指向 `/home/ubuntu/sub2api-cn-relay-manager-git-current`
|
||
- `scripts/deploy/setup_remote43_patched_stack.sh` 会自动生成并刷新该固定 checkout
|
||
- 这样 provider 草稿发布链不再依赖任何一次性的时间戳 repo 目录
|
||
- 公网 `providers.html` 已新增“发布到仓库”按钮与 commit message 输入框
|
||
- remote43 公网真验已通过:
|
||
- `draft_id=draft_remote43_publish_smoke_1779924243`
|
||
- `provider_id=smoke-publish-1779924243`
|
||
- `provider_path=packs/openai-cn-pack/providers/smoke-publish-1779924243.json`
|
||
- `pack_version=1.1.5 -> 1.1.6`
|
||
- `publish_mode=created`
|
||
- `commit_sha=d8d647e`
|
||
- 远端 repo `HEAD` 与 API 返回 `commit_sha` 一致,说明 create -> publish -> git commit 已完整闭环
|
||
- 线上无副作用验收已确认:
|
||
- `GET /portal/` 返回 `200`
|
||
- `GET /kimi-portal/` 返回 `302 -> /portal/`
|
||
- `GET /portal-proxy/api/v1/keys` 在无效 token 下已命中宿主真实 `INVALID_TOKEN`,说明新的同域代理已生效
|
||
- 2026-05-28 已继续把管理态“每次手贴 Bearer token”收口为正式登录流:
|
||
- 新增 `GET /api/admin/session`
|
||
- 新增 `POST /api/admin/session/login`
|
||
- 新增 `POST /api/admin/session/logout`
|
||
- 管理态受保护接口现已同时接受:
|
||
- `Authorization: Bearer <SUB2API_CRM_ADMIN_TOKEN>`
|
||
- 或同域管理员 session cookie
|
||
- `providers.html` 与 `admin-batch-import.html` 现已优先走 session,token 输入框仅保留为兜底
|
||
- 当前部署环境可通过以下变量显式配置管理员账号:
|
||
- `SUB2API_CRM_ADMIN_USERNAME`
|
||
- `SUB2API_CRM_ADMIN_PASSWORD`
|
||
- `SUB2API_CRM_ADMIN_SESSION_TTL`
|
||
- 2026-05-28 已继续把 `providers.html` 的 manifest 草稿表单收口成“按最近成功模板起步”的录入流:
|
||
- 草稿区首次打开且字段为空时,会优先回填最近一次成功发布的模板;没有历史时,回退到当前 pack/provider 目录里的现有 provider,最后才使用静态样例
|
||
- `Provider ID` 现已按 `display_name / base_url / supported_models` 自动生成,并在与现有 provider / draft 冲突时自动补后缀避重
|
||
- 当新填的 `supported_models` 已在现有 provider 或草稿里出现时,页面会直接提示“同模型已存在”,并优先建议复用已有 `provider_id`,避免因为“官方 / 中转”重复新增同一模型定义
|
||
- `GET /api/packs/{pack_id}/providers` 现已补充返回 `base_url / smoke_test_model / supported_models`,用于前端做模板参考和模型冲突提示
|
||
- 2026-05-28 继续把这条规则下沉到服务端:`POST /api/provider-drafts`、`PUT /api/provider-drafts/{draft_id}`、`POST /api/provider-drafts/{draft_id}/publish` 现在都会做 pack 级模型冲突校验;同模型若已被其他 provider / draft 占用,会直接返回 `409 provider_model_conflict`
|
||
- 2026-05-28 已新增独立实验 pack `packs/openai-cn-pack-route-lab/`,用于验证 “同一个 group 下挂多条 GPT 路线” 的方案 B 骨架:
|
||
- 当前实验 provider 为 `gpt-asxs-route-lab` 与 `gpt-codex2api-route-lab`
|
||
- 两者共用 `group_template.name=GPT Shared 路由实验` 与 `plan_template.name=GPT Shared 路由实验套餐`
|
||
- 两者使用不同 `channel_template.name`,并先以不同公开 alias 对外:
|
||
- `gpt-5.4-asxs` / `gpt-5.4-mini-asxs`
|
||
- `gpt-5.4-codex2api` / `gpt-5.4-mini-codex2api`
|
||
- 这轮只解决 “同组多线路” 的第一阶段验证,不代表当前系统已经支持 “同公开模型名双线路 + 显式 route policy”
|
||
- `codex2api` 当前先按 `https://www.codex2api.com/v1` 作为 API 根地址假设,若 preview/import 失败,需先校正真实 API base URL
|
||
- 2026-05-28 remote43 真验结论已落地:
|
||
- `gpt-asxs-route-lab` 可成功导入,artifact:`artifacts/real-host-acceptance/20260528_142205_remote43_gpt-asxs-route-lab_key_import/21-summary.json`
|
||
- 导入时创建资源:`group_id=8`、`channel_id=7`、`plan_id=7`、`account_id=9`
|
||
- upstream `asxs` 侧 `/models` 与 `/chat/completions` 为 `200`;但该轮 managed chat 仍返回 `503`,原因是脚本首轮使用了 canonical model `gpt-5.4` 探测,而当前 route-lab 对外 alias 实际是 `gpt-5.4-asxs`
|
||
- `gpt-codex2api-route-lab` 在尝试复用同一 group 时被宿主直接拒绝,artifact:`artifacts/real-host-acceptance/20260528_142320_remote43_gpt-codex2api-route-lab_key_import/03-import.body.json`
|
||
- 宿主返回 `409 GROUP_ALREADY_IN_CHANNEL`,错误为:`one or more groups already belong to another channel`
|
||
- 因此当前真实结论不是“同组多 channel 可继续验证路由策略”,而是 **stock / patched sub2api 当前结构上不允许同一个 group 绑定到第二个 channel**
|
||
- 2026-05-28 已进一步把宿主侧最小改造方案固化到 `docs/HOST_MULTI_CHANNEL_MINIMAL_RETROFIT.md`:
|
||
- 真实最小改造不只是移除 `channel_groups(group_id)` 唯一索引
|
||
- 还必须给宿主 `account_groups` 引入 `channel_id`,并让 `gateway / scheduler / sticky session / account stats pricing` 全部从 `group -> single channel` 升级到 `group + channel` 维度
|
||
- 否则就算数据库允许同一 group 绑定多个 channel,运行时账号池仍会被按 group 混跑,结构上仍不成立
|
||
- 2026-05-28 已明确 fallback 方案:不修改宿主源码,改由 relay-manager 插件层维护 `logical_group -> route -> shadow_group` 三层抽象,详见 `docs/PLUGIN_ROUTE_STICKY_DESIGN.md`
|
||
- 前端只看到一个逻辑分组
|
||
- 插件层先做 route 级 sticky,再把请求稳定转发到某个宿主 shadow group
|
||
- 宿主继续只做单线路 group 内的 account sticky / 调度
|
||
- 2026-05-29 已基于上述结论新增 canonical shadow pack:
|
||
- `packs/openai-cn-pack-shadow-asxs/`
|
||
- provider:`gpt-asxs-shadow-lab`
|
||
- 当前约束是:
|
||
- 一个 route 对应一个独立宿主 shadow group
|
||
- 宿主 shadow group 只承载 canonical upstream model:`gpt-5.4`、`gpt-5.4-mini`
|
||
- alias/public model 的抽象只保留在插件 `logical_group -> route -> shadow_model` 层,不再下沉到宿主 channel
|
||
- 设计与验收路径已单独沉淀到 `docs/SHADOW_PROVIDER_VALIDATION.md`
|
||
- 2026-05-29 已完成 remote43 真实宿主直连验收:
|
||
- 验收 artifact:`artifacts/real-host-acceptance/20260529_123659_remote43_gpt-asxs-shadow-lab_key_import/21-summary.json`
|
||
- 通过 `gpt-asxs-shadow-lab` 成功导入 canonical shadow provider
|
||
- 导入后创建资源:`subscription_group_id=9`、`import_group_id=9`
|
||
- 真实 managed key 直连宿主结果:
|
||
- `/v1/models` 返回 `200`
|
||
- 模型集包含 canonical model:`gpt-5.4`、`gpt-5.4-mini`
|
||
- `/v1/chat/completions` 返回 `200`
|
||
- upstream `asxs` 侧 `/models` 与 `/chat/completions` 同样返回 `200`
|
||
- 说明 canonical shadow 设计已经绕过旧 `route-lab` 的 alias 下沉问题
|
||
- 2026-05-29 已修复 remote43 导入脚本 pack 路径使用错误:
|
||
- `scripts/acceptance/import_remote43_provider.sh` 新增 `REQUEST_PACK_PATH`
|
||
- 本地 pack 解析仍使用 `PACK_PATH`
|
||
- 发给 remote43 CRM 的导入请求改为使用远端实际可见路径,避免 CRM 在远端错误地 `stat /home/long/...`
|
||
- 2026-05-28 已新增插件整体需求盘点 `docs/PLUGIN_REQUIREMENTS_OVERVIEW_2026-05-28.md`
|
||
- 已把“增加模型、维护逻辑分组、智能路由、供应商帐号导入与停启用、普通用户前端”五大功能域统一收口
|
||
- 并明确区分 `已完成 / 待优化 / 待完成 / 未来规划`
|
||
- 2026-05-28 已继续细化闭环实施规划 `docs/PLUGIN_CLOSED_LOOP_IMPLEMENTATION_PLAN_2026-05-28.md`
|
||
- 明确当前插件数据库仍为 SQLite(`SUB2API_CRM_SQLITE_DSN`)
|
||
- 明确后续继续以 SQLite 作为主状态库,Redis 作为智能路由运行态缓存
|
||
- 明确智能路由日志必须结构化落入插件 SQLite,而不是只放 Redis 或 stdout
|
||
- 2026-05-28 已新增 Phase 1 可开工任务单 `docs/plans/2026-05-28-phase1-logical-routing-foundation-plan.md`
|
||
- 已把 `SQLite migration / logical_group-route repo+API / 路由日志写入器 / Redis sticky 抽象` 拆成可执行任务
|
||
- 已继续细化到任务级 `入场条件 / 产出清单 / 远端验证步骤 / 证据要求 / 回滚原则`
|
||
- 并明确要求:每个闭环功能完成后,都必须提交、推送、部署到 `remote43` 再验证,不能只停留在本地测试
|
||
- 当前 Phase 1 的统一真相是:
|
||
- 主状态库继续使用 SQLite
|
||
- 路由运行态使用 Redis 或 memory backend 抽象
|
||
- 智能路由日志必须最终结构化写回插件 SQLite
|
||
- 2026-05-29 已新增 Phase 2 可开工任务单 `docs/plans/2026-05-29-phase2-intelligent-routing-closure-plan.md`
|
||
- 已把 `管理页入口 / 正式数据面入口 / route 健康视图 / 真实验收矩阵` 拆成可执行任务
|
||
- 已明确当前 Phase 2 不是再证明“路由能跑”,而是把现有能力收敛成产品闭环:
|
||
- 管理员可维护 `logical_group -> route -> shadow_group`
|
||
- 插件存在正式数据面入口,而不只是实验 proxy
|
||
- route 的 sticky / failover / cooldown 状态可被运营查看
|
||
- `remote43` 真验收敛成固定矩阵,而不是零散命令
|
||
- 当前 Phase 2 的建议实施顺序是:
|
||
- `P2-T1 管理页入口`
|
||
- `P2-T2 正式数据面入口`
|
||
- `P2-T3 route 健康视图`
|
||
- `P2-T4 真实验收矩阵`
|
||
- 2026-05-29 已完成 Phase 2 / `P2-T1 管理页入口`
|
||
- 提交:`2e9b4ab9 feat(portal): add logical group admin page`
|
||
- 新增静态页:
|
||
- `deploy/tksea-portal/admin/logical-groups.html`
|
||
- 已完成 admin 导航接线:
|
||
- `deploy/tksea-portal/admin/index.html`
|
||
- `deploy/tksea-portal/admin/providers.html`
|
||
- `deploy/tksea-portal/admin-batch-import.html`
|
||
- 当前页面覆盖的最小运营流:
|
||
- `logical_group` 创建 / 更新 / 删除
|
||
- `public_model` 新增 / 删除
|
||
- `route` 创建 / 更新 / 删除
|
||
- `route model` 新增 / 查看
|
||
- 静态资产与脚本回归已通过:
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- Go 质量门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- portal 已部署到 remote43:
|
||
- 部署脚本:`scripts/deploy/deploy_tksea_portal.sh`
|
||
- 公网新页面:`https://sub.tksea.top/portal/admin/logical-groups.html`
|
||
- 管理首页已可见入口:`https://sub.tksea.top/portal/admin/`
|
||
- 公网页面回读已确认:
|
||
- `logical-groups.html` 已包含 `Logical Group / Public Models / Routes / Route Models`
|
||
- 管理首页已出现 `逻辑分组 / 路由` 导航与入口卡片
|
||
- 公网 admin API 真验已通过:
|
||
- `POST /api/admin/session/login` 建立管理员会话成功
|
||
- `POST /api/logical-groups` 创建 `logical_group_id=p2t1-lg-1780031264`
|
||
- `POST /api/logical-groups/p2t1-lg-1780031264/routes` 创建 `route_id=asxs-ui-1780031264`
|
||
- `GET /api/logical-groups/p2t1-lg-1780031264` 已回读到:
|
||
- `shadow_group_id=9`
|
||
- `shadow_host_id=proxy-real-host-1780026133`
|
||
- `upstream_base_url_hint=https://api.asxs.top/v1`
|
||
- 当前结论:
|
||
- `logical_group -> route -> shadow_group` 已有独立管理页入口
|
||
- 现有 CRM API 已足够支撑首版 UI
|
||
- `P2-T2` 可以直接在这个页面基础上继续对接正式数据面入口
|
||
- 2026-05-29 已完成 Phase 2 / `P2-T2 正式数据面入口`
|
||
- 提交:`ecdeedb1 feat(routing): add formal chat route endpoint`
|
||
- 新增正式入口:
|
||
- `POST /api/routing/chat/completions`
|
||
- 兼容策略:
|
||
- 旧 `POST /api/routing/proxy/chat/completions` 保留,继续作为实验/调试入口
|
||
- 新入口复用既有 `resolve -> sticky -> failover -> managed subscription -> forward -> route_decision_logs` 链路
|
||
- 对外返回改为正式产品语义:
|
||
- `model`
|
||
- `selected_route`
|
||
- `sticky_hit / sticky_action / fallback_used`
|
||
- `forward.upstream_status`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到:
|
||
- `repo HEAD = ecdeedb1`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- 本轮还修正了一个远端部署细节:
|
||
- 18173 活跃实例曾继续跑旧 CRM 二进制
|
||
- 原因是实例目录里的 `sub2api-cn-relay-manager-server` 未被新构建产物覆盖
|
||
- 现已通过定向替换实例二进制并按实际监听 PID 重启收口
|
||
- 公网 admin API 真验已通过:
|
||
- 先创建临时 `logical_group_id=p2t2-check-1780032198`
|
||
- 再创建临时 `route_id=asxs-check-1780032198`
|
||
- route 命中真实 canonical shadow:
|
||
- `shadow_host_id=proxy-real-host-1780026133`
|
||
- `shadow_group_id=9`
|
||
- `shadow_model=gpt-5.4`
|
||
- 调用 `POST /api/routing/chat/completions`:
|
||
- `request_id=req-p2t2-check-1780032198`
|
||
- `backend=redis`
|
||
- `sticky_hit=false`
|
||
- `sticky_action=bind`
|
||
- `selected_route.route_id=asxs-check-1780032198`
|
||
- `selected_route.shadow_group_id=9`
|
||
- `selected_route.shadow_model=gpt-5.4`
|
||
- `forward.ok=true`
|
||
- `forward.upstream_status=200`
|
||
- `forward.effective_gateway_key_source=managed_subscription`
|
||
- `forward.managed_user_id=36`
|
||
- `forward.content_type=text/event-stream`
|
||
- 返回 completion 内容:`content=pong`
|
||
- `GET /api/routing/logs/decisions?request_id=req-p2t2-check-1780032198&limit=5`
|
||
- 已回读到 `2` 条 decision log
|
||
- 最新一条:
|
||
- `selected_route_id=asxs-check-1780032198`
|
||
- `selected_shadow_group_id=9`
|
||
- `upstream_status=200`
|
||
- `fallback_used=false`
|
||
- 当前结论:
|
||
- 正式入口 `POST /api/routing/chat/completions` 已经可用
|
||
- canonical shadow route + managed subscription key + real host `/v1/chat/completions` 已在正式入口下再次验证为 `200`
|
||
- `P2-T3` 可以直接在这一入口之上补 route 健康视图与聚合状态
|
||
- 2026-05-29 已完成 Phase 2 / `P2-T3 route 健康视图`
|
||
- 提交:`2896e620 feat(routing): add route health admin view`
|
||
- 新增聚合状态 API:
|
||
- `GET /api/routing/routes/health`
|
||
- 可选过滤:
|
||
- `logical_group_id`
|
||
- `route_id`
|
||
- `status=healthy|cooldown|failing|disabled`
|
||
- 聚合字段首版已收口:
|
||
- `route_id`
|
||
- `route_name`
|
||
- `logical_group_id`
|
||
- `shadow_host_id`
|
||
- `shadow_group_id`
|
||
- `priority`
|
||
- `runtime_status`
|
||
- `failure_count`
|
||
- `cooldown_until`
|
||
- `cooldown_reason`
|
||
- `last_error_class`
|
||
- `last_selected_at`
|
||
- `last_upstream_status`
|
||
- `recent_failover_count`
|
||
- 新增管理页:
|
||
- `https://sub.tksea.top/portal/admin/route-health.html`
|
||
- 现有管理导航已接线:
|
||
- `admin/index.html`
|
||
- `admin/logical-groups.html`
|
||
- `admin/providers.html`
|
||
- `admin-batch-import.html`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已原位升级到 `repo HEAD = 2896e620`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- portal 已重新发布,公网健康页在线:
|
||
- `GET https://sub.tksea.top/portal/admin/route-health.html -> HTTP 200`
|
||
- 页面标题包含 `Route Health Admin`
|
||
- remote43 真实矩阵已通过:
|
||
- 临时 `logical_group_id = p2t3-health-1780033345`
|
||
- `primary-1780033345` 写入 cooldown:`runtime_status=cooldown`
|
||
- `failing-1780033345` 写入 route failure:`runtime_status=failing`
|
||
- `POST /api/routing/resolve`
|
||
- `request_id=req-p2t3-health-1780033345`
|
||
- 因 `primary-1780033345` 处于 `active_cooldown:degraded`
|
||
- 自动切到 `fallback-1780033345`
|
||
- `resolve.route_id=fallback-1780033345`
|
||
- `fallback_used=true`
|
||
- 再次回读 `GET /api/routing/routes/health?logical_group_id=p2t3-health-1780033345`
|
||
- `primary-1780033345=cooldown`
|
||
- `fallback-1780033345=healthy`
|
||
- `failing-1780033345=failing`
|
||
- `fallback-1780033345.last_selected_at` 已写入
|
||
- `fallback-1780033345.recent_failover_count=1`
|
||
- `GET /api/routing/logs/failovers?request_id=req-p2t3-health-1780033345&limit=5`
|
||
- `from_route_id=primary-1780033345`
|
||
- `to_route_id=fallback-1780033345`
|
||
- `reason=active_cooldown:degraded`
|
||
- 当前结论:
|
||
- route 健康页已经能把 `cooldown / failing / healthy / disabled` 四态聚合出来
|
||
- 健康页与真实 `resolve` / `failover` 日志已经对齐
|
||
- `P2-T4` 可以开始把这些验证步骤收敛成标准化验收矩阵与脚本
|
||
- 2026-05-29 已完成 Phase 2 / `P2-T4 真实验收矩阵`
|
||
- 提交:
|
||
- `94913400 feat(routing): add route acceptance matrix scripts`
|
||
- `5689286f fix(acceptance): align route model list parsing`
|
||
- 新增标准化验收脚本:
|
||
- `scripts/acceptance/route_acceptance_lib.sh`
|
||
- `scripts/acceptance/verify_route_control_plane.sh`
|
||
- `scripts/acceptance/verify_route_health_ui.sh`
|
||
- `scripts/acceptance/verify_route_data_plane.sh`
|
||
- `scripts/acceptance/verify_route_acceptance_matrix.sh`
|
||
- 新增矩阵说明:
|
||
- `docs/ROUTE_ACCEPTANCE_MATRIX.md`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已使用 fixed checkout:
|
||
- `repo HEAD = 5689286f`
|
||
- 本轮仅新增脚本和文档,无需重启 CRM
|
||
- `http://127.0.0.1:18173/healthz` 持续返回 `ok`
|
||
- remote43 已按标准命令完成整套矩阵真验:
|
||
- `CRM_BASE=https://sub.tksea.top/portal-admin-api`
|
||
- `ROUTE_HEALTH_PAGE_URL=https://sub.tksea.top/portal/admin/route-health.html`
|
||
- `SHADOW_HOST_ID=proxy-real-host-1780026133`
|
||
- `SHADOW_GROUP_ID=9`
|
||
- `SUBSCRIPTION_USER_ID=36`
|
||
- 产物目录:
|
||
- `/tmp/phase2-routing-matrix/1780034317_route_matrix`
|
||
- control plane summary:
|
||
- `group_id=p2t4-cp-1780034317`
|
||
- `route_id=primary-1780034317`
|
||
- `public_model=gpt-5.4`
|
||
- `shadow_model=gpt-5.4`
|
||
- 已验证:
|
||
- 创建 group
|
||
- 更新 group
|
||
- 创建 route
|
||
- 更新 route
|
||
- 创建 route model
|
||
- 列出 route model
|
||
- health UI / runtime summary:
|
||
- `group_id=p2t4-health-1780034318`
|
||
- `request_id=req-p2t4-health-1780034318`
|
||
- `resolve_route_id=fallback-1780034318`
|
||
- `resolve_fallback_used=true`
|
||
- 状态矩阵已对齐:
|
||
- `primary-1780034318=cooldown`
|
||
- `fallback-1780034318=healthy`
|
||
- `failing-1780034318=failing`
|
||
- `fallback_recent_failover_count=1`
|
||
- data plane summary:
|
||
- `group_id=p2t4-dp-1780034319`
|
||
- `route_id=primary-1780034319`
|
||
- `request_id=req-p2t4-dp-1780034319`
|
||
- `auth_mode=managed_subscription`
|
||
- `forward_upstream_status=200`
|
||
- `selected_shadow_host_id=proxy-real-host-1780026133`
|
||
- `selected_shadow_group_id=9`
|
||
- `selected_shadow_model=gpt-5.4`
|
||
- `effective_gateway_key_source=managed_subscription`
|
||
- `decision_log_count=2`
|
||
- 当前结论:
|
||
- Phase 2 的控制面、运行态、正式数据面、管理页都已被收敛进固定脚本矩阵
|
||
- 后续任何 Phase 2 回归都可以直接在 `remote43` 上复用同一套脚本,而不再依赖临时串命令
|
||
- `P2` 已具备可持续回归的最小产品闭环
|
||
- 2026-05-29 已完成 Phase 3 / `P3-T1 帐号清单与启停 API`
|
||
- 提交:
|
||
- `b5343452 feat(accounts): add provider account inventory api`
|
||
- 新增 SQLite migration:
|
||
- `internal/store/migrations/0012_provider_accounts.sql`
|
||
- 新增帐号库存 repo / 回填逻辑:
|
||
- `internal/store/sqlite/provider_accounts_repo.go`
|
||
- `internal/store/sqlite/provider_accounts_sync.go`
|
||
- 新增管理 API:
|
||
- `GET /api/provider-accounts`
|
||
- `POST /api/provider-accounts/{account_id}/enable`
|
||
- `POST /api/provider-accounts/{account_id}/disable`
|
||
- `POST /api/provider-accounts/{account_id}/retire`
|
||
- 当前首版设计边界已明确:
|
||
- `provider_accounts` 是插件侧帐号资产库存
|
||
- 列表接口会先按最新 `reconcilable import batch` 做一次库存回填,避免 remote43 上已有导入结果看不到
|
||
- runtime import 完成后会自动把 `managed_resources + import_batch_items` 同步进 `provider_accounts`
|
||
- `enable / disable / retire` 当前只更新插件 SQLite 库存状态,不猜测宿主 `/api/v1/admin/accounts/:id` 的写入 payload
|
||
- 为避免人工禁用状态被回填覆盖,同步层现在会保留人工 `disabled` / 人工 `deprecated`;只允许 `missing_from_latest_batch` 这种自动弃用在新 batch 命中后被恢复
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到:
|
||
- `repo HEAD = b5343452`
|
||
- CRM 实例目录:`/home/ubuntu/sub2api-kimi-patched-auto2-20260525_18169`
|
||
- 活跃监听 PID:`1771174`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 真实 API 验证已通过:
|
||
- `GET /api/provider-accounts`
|
||
- `count=1`
|
||
- 样本帐号:
|
||
- `id=1`
|
||
- `provider_id=gpt-asxs-shadow-lab`
|
||
- `host_account_id=10`
|
||
- 初始 `account_status=active`
|
||
- `POST /api/provider-accounts/1/disable`
|
||
- `reason=p3_t1_remote_verify`
|
||
- 返回 `account_status=disabled`
|
||
- 再次 `GET /api/provider-accounts`
|
||
- 同一条样本仍为 `account_status=disabled`
|
||
- 说明列表回填不会把人工禁用状态刷回 `active`
|
||
- `POST /api/provider-accounts/1/enable`
|
||
- 返回 `account_status=active`
|
||
- 再次 `GET /api/provider-accounts`
|
||
- 同一条样本恢复为 `account_status=active`
|
||
- 当前结论:
|
||
- 插件侧已经具备“帐号库存列表 + 人工启停状态”的最小资产闭环
|
||
- 首版语义仍然是“插件库存状态”,不是“宿主账号状态联动”
|
||
- `P3-T2` 可以直接继续做帐号资产页和 route / shadow group 归属展示
|
||
- 2026-05-29 已完成 Phase 3 / `P3-T2 帐号资产页与归属展示`
|
||
- 提交:
|
||
- `c982c595 feat(accounts): add provider account admin view`
|
||
- `d8d9e6e1 fix(accounts): tolerate ambiguous shadow bindings`
|
||
- 新增管理页:
|
||
- `deploy/tksea-portal/admin/accounts.html`
|
||
- 公网地址:`https://sub.tksea.top/portal/admin/accounts.html`
|
||
- 当前页面能力:
|
||
- 列出 `provider_accounts`
|
||
- 按 `host_id / provider_id / logical_group_id / route_id / shadow_group_id / account_status / q` 过滤
|
||
- 详情区展示:
|
||
- `provider`
|
||
- `host`
|
||
- `logical_group`
|
||
- `route`
|
||
- `shadow_group`
|
||
- `shadow_host`
|
||
- `upstream_base_url_hint`
|
||
- `last_probe_status`
|
||
- 直接触发:
|
||
- `POST /api/provider-accounts/{account_id}/enable`
|
||
- `POST /api/provider-accounts/{account_id}/disable`
|
||
- `POST /api/provider-accounts/{account_id}/retire`
|
||
- 后端归属补齐规则已收口:
|
||
- 若 `shadow_host_id + shadow_group_id` 只命中一条 `logical_group_route`,则自动回填 `route_id`
|
||
- 若命中多条 route,则不再返回 `500`;改为保留 `shadow_group`,并把 `route / logical_group` 显式留空,等待后续运营整理
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已原位升级到:
|
||
- `repo HEAD = d8d9e6e1`
|
||
- CRM 活跃监听 PID:`2015702`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- portal 已重新发布,`https://sub.tksea.top/portal/admin/accounts.html` 返回 `200`
|
||
- remote43 真实验证已通过:
|
||
- 页面回读:
|
||
- `page_title=Provider Accounts Admin`
|
||
- `page_nav=/portal/admin/accounts.html`
|
||
- `GET /api/provider-accounts`
|
||
- 返回样本:
|
||
- `id=1`
|
||
- `provider_id=gpt-asxs-shadow-lab`
|
||
- `account_status=active`
|
||
- `shadow_group_id=9`
|
||
- `shadow_host=proxy-real-host-1780026133`
|
||
- 同一条样本的启停动作:
|
||
- `disable -> list` 保持 `disabled`
|
||
- `enable -> list` 恢复 `active`
|
||
- `retire -> list` 保持 `deprecated`
|
||
- 验证完成后已恢复为 `active`
|
||
- 当前 remote43 真相:
|
||
- 这条现网样本的 `logical_group / route` 当前仍为空
|
||
- 根因不是列表页丢字段,而是 remote43 上有多条 `logical_group_route` 复用同一 `shadow_host_id + shadow_group_id`
|
||
- 当前系统已把这种情况从“500 致命错误”降级成“可读但未归属”的显式状态
|
||
- 当前结论:
|
||
- 管理员已经能通过统一页面查看帐号资产、筛选 `shadow_group`、执行人工启停
|
||
- `route / logical_group` 归属在唯一 shadow binding 下会自动补齐;在歧义 shadow binding 下会显式显示未归属
|
||
- `P3-T3` 可以继续做“帐号资产与 route 的显式整理动作”或更完整的运营看板
|
||
- 2026-05-29 已完成 Phase 3 / `P3-T3 帐号归属显式整理`
|
||
- 提交:`649eb13f feat(accounts): add explicit route binding workflow`
|
||
- 新增后端能力:
|
||
- `GET /api/provider-accounts/{account_id}/binding-candidates`
|
||
- `POST /api/provider-accounts/{account_id}/binding`
|
||
- `GET /api/provider-accounts?binding_state=assigned|unassigned|conflict`
|
||
- 当前绑定语义已收口:
|
||
- `assigned`
|
||
- `provider_accounts.route_id` 已明确绑定到一条 route
|
||
- `unassigned`
|
||
- 当前没有 route 归属,且同一 `shadow_host_id + shadow_group_id` 下候选 route 不超过 1 条
|
||
- `conflict`
|
||
- 当前没有 route 归属,且同一 `shadow_host_id + shadow_group_id` 下候选 route 超过 1 条
|
||
- 新增页面动作:
|
||
- `deploy/tksea-portal/admin/accounts.html`
|
||
- 新增 `binding_state` 过滤
|
||
- 新增“显式整理归属”区块
|
||
- 可查看 route 候选
|
||
- 可手动绑定到一条 route
|
||
- 可清空 route 归属,回到未归属 / 冲突状态
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已原位升级到:
|
||
- `repo HEAD = 649eb13f`
|
||
- CRM 活跃监听 PID:`2709328`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- portal 已重新发布,`https://sub.tksea.top/portal/admin/accounts.html` 返回包含:
|
||
- `Provider Accounts Admin`
|
||
- `显式整理归属`
|
||
- remote43 真实验证已通过:
|
||
- 现网样本:
|
||
- `account_id=1`
|
||
- `shadow_group_id=9`
|
||
- `shadow_host_id=proxy-real-host-1780026133`
|
||
- 为验证冲突整理动作,临时创建:
|
||
- `logical_group_id=p3t3-bind-1780054239`
|
||
- `route_a=primary-1780054239`
|
||
- `route_b=fallback-1780054239`
|
||
- 两条 route 都复用同一 `shadow_host_id + shadow_group_id`
|
||
- 回读结果:
|
||
- `GET /api/provider-accounts?binding_state=conflict&shadow_group_id=9`
|
||
- `conflict_count=1`
|
||
- `GET /api/provider-accounts/1/binding-candidates`
|
||
- `candidate_route_count=2`
|
||
- `POST /api/provider-accounts/1/binding`
|
||
- `route_id=primary-1780054239`
|
||
- `binding_state=assigned`
|
||
- `GET /api/provider-accounts?route_id=primary-1780054239`
|
||
- `assigned_count=1`
|
||
- `POST /api/provider-accounts/1/binding` with `{"clear":true}`
|
||
- 返回 `binding_state=conflict`
|
||
- 再次 `GET /api/provider-accounts?binding_state=conflict&shadow_group_id=9`
|
||
- `after_clear_conflict_count=1`
|
||
- 验证结束后已删除临时 `logical_group / routes`,避免 remote43 再次积累测试噪音
|
||
- 当前结论:
|
||
- 帐号资产已不再只是“看见未归属”,而是可以直接在插件侧完成 route 归属整理
|
||
- `shadow binding` 歧义场景已经从“只能读”升级为“可读 + 可人工收口”
|
||
- `P3` 已具备进入下一步“更完整的运营看板”或直接切到 `P4` 的条件
|
||
- 2026-05-29 已完成 Phase 4 / `P4-T1 portal logical group catalog API`
|
||
- 提交:`97fd72e2 feat(portal): add logical group catalog api`
|
||
- 新增公开聚合 API:
|
||
- `GET /api/portal/logical-groups`
|
||
- `GET /api/portal/logical-groups/{group_id}`
|
||
- `GET /api/portal/logical-groups/{group_id}/models`
|
||
- 当前公开语义已收口:
|
||
- 只暴露插件 `logical_group` 产品层
|
||
- 不把 `shadow_group / shadow_host / route_id` 直接暴露给普通用户前端
|
||
- 仅返回 `status=active` 的 `logical_group`
|
||
- 仅返回 `status=active` 的 `public_model`
|
||
- 每个逻辑分组同时返回:
|
||
- `route_count`
|
||
- `active_route_count`
|
||
- `sticky_mode`
|
||
- `route_policy`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到:
|
||
- `repo HEAD = 97fd72e2`
|
||
- CRM 活跃监听 PID:`2772411`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 真实验证已通过:
|
||
- 通过管理员 API 临时创建:
|
||
- `group_id=p4t1-catalog-1780055254`
|
||
- `route_id=primary-1780055254`
|
||
- `public_model=gpt-5.4`
|
||
- 本机回读:
|
||
- `GET http://127.0.0.1:18173/api/portal/logical-groups`
|
||
- `GET http://127.0.0.1:18173/api/portal/logical-groups/p4t1-catalog-1780055254`
|
||
- `GET http://127.0.0.1:18173/api/portal/logical-groups/p4t1-catalog-1780055254/models`
|
||
- 公网代理回读:
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups`
|
||
- 关键结果:
|
||
- `route_count=1`
|
||
- `active_route_count=1`
|
||
- `models_count=1`
|
||
- `first_model=gpt-5.4`
|
||
- `public_list_contains_group=true`
|
||
- 验证结束后已删除临时 `logical_group / route / model`,未向 remote43 留下测试垃圾
|
||
- 当前结论:
|
||
- 普通用户前端已经有了第一批“逻辑分组产品层”读取接口
|
||
- 下一步可以直接做 `P4-T2`:让 `/portal/` 改吃这些聚合 API,而不是继续硬编码宿主分组目录
|
||
- 2026-05-30 已完成 Phase 4 / `P4-T2 portal logical group catalog frontend`
|
||
- 提交:`ac1d8e27 feat(portal): switch user catalog to logical groups`
|
||
- 本轮前端改造范围:
|
||
- `deploy/tksea-portal/index.html`
|
||
- `scripts/test/test_tksea_portal_assets.sh`
|
||
- 当前用户 Portal 语义已切换为:
|
||
- 主目录优先展示插件 `logical_group`
|
||
- 卡片展示 `display_name / logical_group_id / public_models / route_policy / sticky_mode / route_count`
|
||
- 不再把宿主真实分组当成用户主视角
|
||
- `申请测试 Key` 仍走兼容宿主线路,但必须命中唯一兼容线路才允许直接申请
|
||
- 现有 Key 列表已补充 `逻辑分组` 视角展示,不再只显示宿主分组
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已完成静态门户部署:
|
||
- 公网页面:`https://sub.tksea.top/portal/`
|
||
- `nginx -t` 成功
|
||
- `systemctl reload nginx` 成功
|
||
- remote43 / 公网真实验证已通过:
|
||
- 公网页面回读已确认包含:
|
||
- `逻辑分组目录`
|
||
- `选择逻辑分组`
|
||
- `PORTAL_CATALOG_PREFIX = "/portal-admin-api/api/portal"`
|
||
- 为验证前端所消费的聚合接口,临时创建:
|
||
- `logical_group_id=p4t2-portal-1780100901`
|
||
- `route_id=primary-1780100901`
|
||
- `public_model=gpt-5.4`
|
||
- 公网回读:
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups`
|
||
- 返回样本 `logical_group_id=p4t2-portal-1780100901`
|
||
- `route_count=1`
|
||
- `active_route_count=1`
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups/p4t2-portal-1780100901/models`
|
||
- 返回 `public_model=gpt-5.4`
|
||
- 验证结束后已删除临时 `logical_group / route / model`,未向 remote43 留下测试噪音
|
||
- 当前结论:
|
||
- `/portal/` 已开始真正切到“逻辑分组视角”
|
||
- 兼容宿主分组仍仅保留在测试 Key 申请链路,不再作为主目录抽象
|
||
- 下一步可以直接进入 `P4-T3`:把普通用户的 key / 订阅 / 权限展示进一步收口到逻辑分组产品层
|
||
- 2026-05-30 已完成 Phase 4 / `P4-T3 portal logical entitlement projection`
|
||
- 提交:`542c6823 feat(portal): add logical group entitlement view`
|
||
- 本轮前端改造范围:
|
||
- `deploy/tksea-portal/index.html`
|
||
- `scripts/test/test_tksea_portal_assets.sh`
|
||
- 当前用户 Portal 继续从“宿主兼容线路细节”收口到“逻辑分组产品层”:
|
||
- 新增 `权限与订阅视图`
|
||
- 把 `subscriptions / keys / allowed_groups` 聚合投影回逻辑分组
|
||
- 会话区新增 `逻辑分组权限`
|
||
- 统计卡把第二列从“已开通兼容线路”切到“已激活产品权限”
|
||
- 历史 Key 列表继续保留兼容线路信息,但优先展示逻辑分组归属
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已重新发布门户静态资源:
|
||
- 公网页面:`https://sub.tksea.top/portal/`
|
||
- `nginx -t` 成功
|
||
- `systemctl reload nginx` 成功
|
||
- remote43 / 公网真实验证已通过:
|
||
- 公网页面回读已确认包含:
|
||
- `已激活产品权限`
|
||
- `权限与订阅视图`
|
||
- `逻辑分组权限`
|
||
- 为验证普通用户权限投影,临时创建:
|
||
- `logical_group_id=p4t3-portal-1780107436`
|
||
- `route_id=primary-1780107436`
|
||
- `public_model=gpt-5.4`
|
||
- `temp_user_email=p4t3-portal-1780107436@sub2api.local`
|
||
- `temp_user_id=39`
|
||
- 远端原位回读结果:
|
||
- `GET /api/portal/logical-groups` 包含 `p4t3-portal-1780107436`
|
||
- 该样本 `public_models = ["gpt-5.4"]`
|
||
- 临时普通用户 `balance=10`
|
||
- 临时普通用户 `subscriptions_for_group_4 = ["active"]`
|
||
- 临时普通用户 `keys_group_ids = [4]`
|
||
- 这与前端投影规则一致,对应逻辑分组状态应为 `已开通订阅`
|
||
- 验证结束后已自动删除:
|
||
- 临时 `logical_group / route / model`
|
||
- 临时普通用户、其 `api_keys` 与 `user_subscriptions`
|
||
- 当前 remote43 未留下新的测试噪音
|
||
- 当前结论:
|
||
- 普通用户页已经不只是“看见逻辑分组目录”,而是能把 `Key / 订阅 / 权限` 投影回逻辑分组产品层
|
||
- 宿主兼容线路仍存在,但已退到实现细节和运维视角
|
||
- 下一步可以直接进入更完整的普通用户产品化,例如逻辑分组级套餐、购买、升级或使用建议
|
||
- 2026-05-30 已完成 Phase 4 / `P4-T4 portal logical group usage guidance`
|
||
- 提交:`037e141c feat(portal): add logical group usage guidance`
|
||
- 本轮前端改造范围:
|
||
- `deploy/tksea-portal/index.html`
|
||
- `scripts/test/test_tksea_portal_assets.sh`
|
||
- 当前用户 Portal 新增的产品层信息:
|
||
- 新增 `使用建议与可用模型说明`
|
||
- 按 `logical_group` 聚合展示:
|
||
- `推荐模型`
|
||
- `适用场景`
|
||
- `接入建议`
|
||
- `下一步`
|
||
- `兼容线路`
|
||
- `路由策略 / sticky_mode`
|
||
- 推荐信息优先按公开模型映射到 `MODEL_GUIDANCE`,没有命中时回退到通用接入建议
|
||
- 视图继续复用 `P4-T1` 的公开聚合接口,不把宿主分组重新暴露给普通用户
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已重新发布门户静态资源:
|
||
- 公网页面:`https://sub.tksea.top/portal/`
|
||
- `nginx -t` 成功
|
||
- `systemctl reload nginx` 成功
|
||
- 这轮是纯 Portal 静态资源发布;固定仓库 checkout 仍为 `97fd72e2`
|
||
- remote43 / 公网真实验证已通过:
|
||
- 公网页面回读已确认包含:
|
||
- `使用建议与可用模型说明`
|
||
- `推荐模型`
|
||
- `接入建议`
|
||
- `下一步`
|
||
- `路由策略`
|
||
- 公网聚合接口回读已确认仍可为该视图供数:
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups`
|
||
- 当前包含 `logical_group_id=p4t3-portal-1780107301`
|
||
- `route_count=1`
|
||
- `active_route_count=1`
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups/p4t3-portal-1780107301`
|
||
- `display_name=P4T3 Portal GPT Shared`
|
||
- `sticky_mode=conversation_preferred`
|
||
- `route_policy=priority`
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups/p4t3-portal-1780107301/models`
|
||
- 返回 `public_model=gpt-5.4`
|
||
- 远端实例状态:
|
||
- `GET http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- 本轮临时 `p4t4-guide-*` 验证样本未落库:
|
||
- remote43 实例库回读 `count=0`
|
||
- 未向 remote43 留下新的测试噪音
|
||
- 当前结论:
|
||
- 普通用户页已经开始直接告诉用户“这个逻辑分组适合什么、推荐先用哪个模型、下一步该做什么”
|
||
- 逻辑分组产品层不再只是目录和权限展示,而是开始承接用户接入指引
|
||
- 如果继续产品化,下一步更适合进入逻辑分组级套餐、购买/升级入口或可用性说明的更细粒度运营配置
|
||
- 2026-05-30 已完成 Phase 4 / `P4-T5 logical group guidance config`
|
||
- 提交:`3bfd4cfc feat(portal): add logical group guidance config`
|
||
- 本轮后端与前端改造范围:
|
||
- 新 migration:`internal/store/migrations/0013_logical_group_guidance.sql`
|
||
- `logical_groups` 新增字段:
|
||
- `usage_scenario`
|
||
- `recommendation`
|
||
- `next_step_hint`
|
||
- 管理 API:
|
||
- `POST /api/logical-groups`
|
||
- `PUT /api/logical-groups/{group_id}`
|
||
- `GET /api/logical-groups`
|
||
- `GET /api/logical-groups/{group_id}`
|
||
- 现已同时读写上述 3 个运营配置字段
|
||
- 公网聚合 API:
|
||
- `GET /api/portal/logical-groups`
|
||
- `GET /api/portal/logical-groups/{group_id}`
|
||
- 现已向普通用户侧暴露上述 3 个字段
|
||
- 管理页:
|
||
- `deploy/tksea-portal/admin/logical-groups.html`
|
||
- 新增 `usage_scenario / recommendation / next_step_hint` 配置项
|
||
- 普通用户页:
|
||
- `deploy/tksea-portal/index.html`
|
||
- `renderUsageGuides()` 现已优先消费逻辑分组运营配置,只有空值时才回退到 `LEGACY_MODEL_GUIDANCE`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已完成 CRM + Portal 升级:
|
||
- `repo HEAD = 3bfd4cfc`
|
||
- `GET http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- 18173 当前活跃 CRM PID:`1651657`
|
||
- portal 静态资源已重新发布,`nginx -t` 与 reload 成功
|
||
- 本轮远端部署还顺手修正了一个真实问题:
|
||
- 首次重启时按绝对路径 `pgrep` 未命中实际以 `./sub2api-cn-relay-manager-server` 运行的旧进程
|
||
- 结果是 fixed checkout 和磁盘二进制虽然已更新,但 18173 仍由旧进程持有,表现为 `usage_scenario` 被旧 API 视为未知字段
|
||
- 随后已改为按监听 PID 定向 kill 并重启,新字段真验恢复正常
|
||
- remote43 / 公网真实验证已通过:
|
||
- 管理页源码回读已确认包含:
|
||
- `usage_scenario`
|
||
- `recommendation`
|
||
- `next_step_hint`
|
||
- 普通用户页源码回读已确认包含:
|
||
- `LEGACY_MODEL_GUIDANCE`
|
||
- `renderUsageGuides`
|
||
- `group.usage_scenario`
|
||
- `group.recommendation`
|
||
- `group.next_step_hint`
|
||
- 远端原位临时创建样本:
|
||
- `logical_group_id=p4t4-config-1780108991`
|
||
- `route_id=primary-1780108991`
|
||
- `public_model=gpt-5.4`
|
||
- 创建结果:
|
||
- `POST /api/logical-groups` 返回 `201`
|
||
- 返回字段包含:
|
||
- `usage_scenario=适合高质量推理、复杂编排和统一 GPT 产品入口。`
|
||
- `recommendation=优先使用 gpt-5.4 作为主模型。`
|
||
- `next_step_hint=先创建测试 Key,再按推荐模型发起第一次请求。`
|
||
- 公网聚合回读:
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups/p4t4-config-1780108991`
|
||
- 返回 `usage_scenario / recommendation / next_step_hint`
|
||
- `route_count=1`
|
||
- `active_route_count=1`
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups`
|
||
- 列表包含 `logical_group_id=p4t4-config-1780108991`
|
||
- 验证结束后已删除临时样本:
|
||
- `DELETE /api/logical-groups/p4t4-config-1780108991`
|
||
- 公网列表回读已确认该样本不再出现
|
||
- 当前结论:
|
||
- “逻辑分组级使用建议”已经不再是前端写死常量,而是后端可配置运营数据
|
||
- 普通用户 Portal 现在具备了“目录 -> 权限 -> 接入建议 -> 下一步”的完整逻辑分组产品链路
|
||
- 如果继续产品化,下一步最合适的是把这套运营配置继续扩展到套餐、购买/升级与可见性控制
|
||
- 2026-05-30 已完成 Phase 4 / `P4-T6 logical group packaging and visibility config`
|
||
- 提交:`ef33762d feat(portal): add logical group packaging config`
|
||
- 本轮后端与前端改造范围:
|
||
- 新 migration:`internal/store/migrations/0014_logical_group_packaging.sql`
|
||
- `logical_groups` 新增字段:
|
||
- `visibility_scope`
|
||
- `package_tier`
|
||
- `purchase_cta_label`
|
||
- `purchase_cta_url`
|
||
- 管理 API:
|
||
- `POST /api/logical-groups`
|
||
- `PUT /api/logical-groups/{group_id}`
|
||
- `GET /api/logical-groups`
|
||
- `GET /api/logical-groups/{group_id}`
|
||
- 现已同时读写套餐层级、可见性范围、购买入口配置
|
||
- 公网聚合 API:
|
||
- `GET /api/portal/logical-groups`
|
||
- `GET /api/portal/logical-groups/{group_id}`
|
||
- 现已向普通用户侧暴露 `visibility_scope / package_tier / purchase_cta_label / purchase_cta_url`
|
||
- 管理页:
|
||
- `deploy/tksea-portal/admin/logical-groups.html`
|
||
- 新增 `visibility_scope / package_tier / purchase_cta_label / purchase_cta_url` 配置项
|
||
- 普通用户页:
|
||
- `deploy/tksea-portal/index.html`
|
||
- 新增按逻辑分组可见性过滤
|
||
- 新增套餐层级与可见性徽标
|
||
- 新增按逻辑分组生成的购买 / 升级入口
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- `bash ./scripts/test/test_tksea_portal_assets.sh`
|
||
- `bash ./scripts/test/test_real_host_scripts.sh`
|
||
- remote43 已完成 CRM + Portal 升级:
|
||
- `repo HEAD = ef33762d`
|
||
- `GET http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- 18173 当前活跃 CRM PID:`1700878`
|
||
- portal 静态资源已重新发布,`nginx -t` 与 reload 成功
|
||
- remote43 / 公网真实验证已通过:
|
||
- 管理页源码回读已确认包含:
|
||
- `visibility_scope`
|
||
- `package_tier`
|
||
- `purchase_cta_label`
|
||
- `purchase_cta_url`
|
||
- 普通用户页源码回读已确认包含:
|
||
- `logicalGroupVisibleForViewer`
|
||
- `package_tier`
|
||
- `purchase_cta_label`
|
||
- `purchase_cta_url`
|
||
- `cta-link`
|
||
- 远端原位临时创建样本:
|
||
- `logical_group_id=p4t6-packaging-1780110031`
|
||
- `route_id=asxs-packaging-1780110031`
|
||
- `public_model=gpt-5.4`
|
||
- 创建样本时写入:
|
||
- `visibility_scope=login_required`
|
||
- `package_tier=pro`
|
||
- `purchase_cta_label=升级到 Pro`
|
||
- `purchase_cta_url=https://sub.tksea.top/portal/upgrade/pro`
|
||
- 公网聚合回读:
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups/p4t6-packaging-1780110031`
|
||
- 返回 `visibility_scope / package_tier / purchase_cta_label / purchase_cta_url`
|
||
- 返回 `usage_scenario / recommendation / next_step_hint`
|
||
- `route_count=1`
|
||
- `active_route_count=1`
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups/p4t6-packaging-1780110031/models`
|
||
- 返回 `public_model=gpt-5.4`
|
||
- `GET https://sub.tksea.top/portal-admin-api/api/portal/logical-groups`
|
||
- 列表包含 `logical_group_id=p4t6-packaging-1780110031`
|
||
- 验证结束后已删除临时样本:
|
||
- 公网列表回读已确认 `public_list_contains_group=false`
|
||
- 当前结论:
|
||
- 普通用户 Portal 现在可以按逻辑分组控制“是否可见”
|
||
- 逻辑分组已经具备最小可用的套餐层级语义
|
||
- 购买 / 升级入口已经可以按逻辑分组配置并投影到普通用户页
|
||
- 2026-05-30 已完成最终连续闭环真验 / `新增模型绑定 + 新供应商帐号 + 普通用户真实可用`
|
||
- 本轮不是新增功能提交,而是对现有链路做最终连续真验;主证据:
|
||
- 新导入验收:`artifacts/real-host-acceptance/20260530_111023_remote43_minimax53hk_final_e2e/21-summary.json`
|
||
- 连续闭环验收:`artifacts/real-host-acceptance/1780110840_remote43_final_user_flow_e2e/99-final-summary.json`
|
||
- 本轮真实执行链:
|
||
- 使用 `MINIMAX_API_KEY` 重新运行 `scripts/acceptance/import_remote43_provider.sh`
|
||
- provider:`minimax-53hk`
|
||
- 模型:`MiniMax-M2.7-highspeed`
|
||
- 新导入宿主 group:`5`
|
||
- 新导入宿主 account 资源:`HostResourceID=11`
|
||
- 新导入后由脚本自动准备:
|
||
- 普通用户 `user_id=41`
|
||
- 普通用户 key `api_key_id=43`
|
||
- `user_subscriptions.id=59`
|
||
- 随后额外执行:
|
||
- 创建临时 `logical_group_id=final-e2e-1780110840`
|
||
- 创建临时 `public_model=minimax-m2-7-final-e2e`
|
||
- 创建临时 `route_id=primary-1780110840`
|
||
- 绑定 `provider_account_id=19` 到该 route
|
||
- 用同一把普通用户 key 先直打宿主 `/v1/chat/completions`
|
||
- 再调用插件正式入口 `POST /api/routing/chat/completions`
|
||
- 最后回读 `route_decision_logs`、`provider_accounts` 绑定状态与 host `usage_logs`
|
||
- 关键真实结果:
|
||
- 新供应商帐号导入结果:
|
||
- `provider_id=minimax-53hk`
|
||
- `batch_id=2`
|
||
- `accepted_keys_count=1`
|
||
- `access_status_from_import=subscription_ready`
|
||
- `subscription_group_id=5`
|
||
- 普通用户直打宿主:
|
||
- `GET /v1/models` 返回 `200`
|
||
- `POST /v1/chat/completions` with `model=MiniMax-M2.7-highspeed` 返回 `200`
|
||
- 新逻辑分组绑定后的插件正式数据面:
|
||
- `POST /api/routing/chat/completions`
|
||
- `request_id=req-final-e2e-1780110840`
|
||
- `selected_route.route_id=primary-1780110840`
|
||
- `selected_route.shadow_group_id=5`
|
||
- `selected_route.shadow_model=MiniMax-M2.7-highspeed`
|
||
- `forward.upstream_status=200`
|
||
- `forward.effective_gateway_key_source=requested_probe_api_key`
|
||
- provider account 绑定回读:
|
||
- `provider_account_id=19`
|
||
- `host_account_id=11`
|
||
- `route_id=primary-1780110840`
|
||
- `binding_state=assigned`
|
||
- host usage 真正落库证据:
|
||
- `usage_logs.id=112`
|
||
- `user_id=41`
|
||
- `api_key_id=43`
|
||
- `group_id=5`
|
||
- `subscription_id=59`
|
||
- `model=MiniMax-M2.7-highspeed`
|
||
- `inbound_endpoint=/v1/chat/completions`
|
||
- `channel_id=4`
|
||
- 命中的 host account 元数据:
|
||
- `accounts.id=6`
|
||
- `accounts.name=minimax-53hk-01`
|
||
- `status=active`
|
||
- `schedulable=true`
|
||
- 该证据已把“普通用户真实请求命中新导入帐号”坐实
|
||
- 本轮还顺手确认了一个 host 侧计量事实:
|
||
- `api_keys.usage_5h / usage_1d / usage_7d / quota_used` 在 stock host 下不会稳定反映这类请求
|
||
- 这轮真正可用的 host usage 证据是 `usage_logs`
|
||
- 后续若要判断“普通用户真实请求是否落库”,应优先查 `usage_logs`,不要再把 `api_keys.usage_*` 当唯一证据
|
||
- 清理状态:
|
||
- 临时 `logical_group_id=final-e2e-1780110840` 已删除
|
||
- 临时 route 绑定已清空
|
||
- 新导入的 `minimax-53hk` 帐号保留在 remote43,作为后续可复用 real provider 样本
|
||
- 当前结论:
|
||
- 现在可以把“新增绑定模型和供应商帐号并让普通用户正常使用”表述为:**核心流程完全闭环,且已真验**
|
||
- 仍需注意一处现网噪音:这轮 import 侧 `provider_status_from_import=degraded`、`provider_accounts.last_probe_status=failed`
|
||
- 但这不再能阻断放行,因为普通用户真实 `/v1/models`、真实 `/v1/chat/completions`、插件正式 route 数据面、以及 host `usage_logs` 都已经同时证明主链路可用
|
||
- 2026-05-28 已完成 Phase 1 / `P1-T1 SQLite schema foundation`
|
||
- 提交:`7f75d8a6 feat(routing): add logical group schema foundation`
|
||
- 新 migration:`internal/store/migrations/0010_logical_groups_and_routes.sql`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = 7f75d8a`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 实例 SQLite `/home/ubuntu/sub2api-kimi-patched-auto2-20260525_18169/sub2api-cn-relay-manager.db` 已确认包含:
|
||
- `logical_groups`
|
||
- `logical_group_models`
|
||
- `logical_group_routes`
|
||
- `logical_group_route_models`
|
||
- 这轮远端验证还顺手暴露并修正了一个部署细节:
|
||
- 若只在 `/home/ubuntu` 下直接拉起 CRM,新进程会回退到默认相对 SQLite 路径 `/home/ubuntu/sub2api-cn-relay-manager.db`
|
||
- 当前已改为显式 `cd` 到实例目录并 `source .env.crm` 后再启动,确保 migration 生效在实例库而不是错误的默认库
|
||
- 2026-05-28 已完成 Phase 1 / `P1-T2 logical_group / route repo + admin API`
|
||
- 提交:`28188922 feat(routing): add logical group admin api`
|
||
- 新增 SQLite repo:
|
||
- `logical_groups`
|
||
- `logical_group_models`
|
||
- `logical_group_routes`
|
||
- `logical_group_route_models`
|
||
- 新增管理 API:
|
||
- `POST /api/logical-groups`
|
||
- `GET /api/logical-groups`
|
||
- `GET /api/logical-groups/{group_id}`
|
||
- `PUT /api/logical-groups/{group_id}`
|
||
- `DELETE /api/logical-groups/{group_id}`
|
||
- `POST /api/logical-groups/{group_id}/models`
|
||
- `GET /api/logical-groups/{group_id}/models`
|
||
- `DELETE /api/logical-groups/{group_id}/models/{model}`
|
||
- `POST /api/logical-groups/{group_id}/routes`
|
||
- `GET /api/logical-groups/{group_id}/routes`
|
||
- `PUT /api/logical-groups/{group_id}/routes/{route_id}`
|
||
- `DELETE /api/logical-groups/{group_id}/routes/{route_id}`
|
||
- `POST /api/logical-groups/{group_id}/routes/{route_id}/models`
|
||
- `GET /api/logical-groups/{group_id}/routes/{route_id}/models`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = 2818892`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 真实 API 验证已通过:
|
||
- `POST /api/logical-groups` 创建 `logical_group_id=p1t2-gpt-shared-1779971040`
|
||
- `GET /api/logical-groups` 返回列表,当前计数 `2`
|
||
- `GET /api/logical-groups/p1t2-gpt-shared-1779971040` 返回 `display_name=P1T2 GPT Shared`,建 route 前 `routes_count=0`
|
||
- `POST /api/logical-groups/p1t2-gpt-shared-1779971040/routes` 创建 `route_id=asxs-1779971040`,`shadow_group_id=p1t2-gpt-shared-1779971040__asxs`
|
||
- 2026-05-28 已完成 Phase 1 / `P1-T3 route logging repo + async writer`
|
||
- 提交:`6e0bd59e feat(routing): add route log writer and admin api`
|
||
- 新增 migration:`internal/store/migrations/0011_route_logging.sql`
|
||
- 新增 SQLite repo:
|
||
- `route_decision_logs`
|
||
- `route_failover_events`
|
||
- `route_sticky_audit`
|
||
- 新增异步写入器:
|
||
- `internal/routing/logwriter.go`
|
||
- 默认采用内存 channel + 定时/显式 `Flush()` 批量落 SQLite
|
||
- 队列满时退化为同步写入,避免热路径静默丢日志
|
||
- 新增管理 API:
|
||
- `POST /api/routing/logs/decisions`
|
||
- `GET /api/routing/logs/decisions`
|
||
- `POST /api/routing/logs/failovers`
|
||
- `GET /api/routing/logs/failovers`
|
||
- `POST /api/routing/logs/sticky-audit`
|
||
- `GET /api/routing/logs/sticky-audit`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = 6e0bd59`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 真实公网 API 验证已通过:
|
||
- `POST /api/routing/logs/decisions` 创建 `request_id=req-p1t3-decision-1779976705`,返回 `selected_route_id=asxs-1779971040`
|
||
- `GET /api/routing/logs/decisions?request_id=req-p1t3-decision-1779976705` 回读到同一条 decision log
|
||
- `POST /api/routing/logs/failovers` 创建 `request_id=req-p1t3-failover-1779976748`,返回 `to_route_id=asxs-1779971040-fallback`
|
||
- `GET /api/routing/logs/failovers?request_id=req-p1t3-failover-1779976748` 回读到同一条 failover event
|
||
- `POST /api/routing/logs/sticky-audit` 创建 `sticky_key=sticky-p1t3-1779976750`,返回 `action=bind`
|
||
- `GET /api/routing/logs/sticky-audit?sticky_key=sticky-p1t3-1779976750` 回读到同一条 sticky audit
|
||
- 2026-05-29 已完成 Phase 1 / `P1-T4 Redis sticky store abstraction`
|
||
- 提交:`98bd619e feat(routing): add sticky runtime backends`
|
||
- 新增运行时抽象:
|
||
- `internal/routing/sticky.go`
|
||
- `internal/routing/sticky_memory.go`
|
||
- `internal/routing/sticky_redis.go`
|
||
- 新增统一 key 规则:
|
||
- `sticky:{scope}:{logical_group_id}:{public_model}:{subject_id}`
|
||
- `routefail:{route_id}`
|
||
- `routecool:{route_id}`
|
||
- 新增启动配置:
|
||
- `SUB2API_CRM_ROUTE_RUNTIME_BACKEND`
|
||
- `SUB2API_CRM_REDIS_ADDR`
|
||
- `SUB2API_CRM_REDIS_PASSWORD`
|
||
- `SUB2API_CRM_REDIS_DB`
|
||
- 新增管理 API:
|
||
- `POST /api/routing/sticky/bindings`
|
||
- `GET /api/routing/sticky/bindings`
|
||
- `POST /api/routing/sticky/route-failures`
|
||
- `GET /api/routing/sticky/route-failures`
|
||
- `POST /api/routing/sticky/cooldowns`
|
||
- `GET /api/routing/sticky/cooldowns`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = 98bd619`
|
||
- `http://127.0.0.1:18173/healthz` 在 `memory` / `redis` 两种运行模式下均返回 `ok`
|
||
- remote43 `redis` 运行时目标已确认使用容器地址 `172.24.0.3:6379`,`127.0.0.1:6379` 不可用
|
||
- remote43 真实公网 API 验证已通过:
|
||
- `memory` 模式:
|
||
- `POST /api/routing/sticky/bindings` 创建 `subject_id=conv-p1t4-memory-1780011984`,返回 `backend=memory`
|
||
- `GET /api/routing/sticky/bindings` 回读到同一条绑定,`route_id=asxs-1779971040`,返回 `backend=memory`
|
||
- `POST /api/routing/sticky/route-failures` 创建 `route_id=route-p1t4-memory-1780011984`,返回 `failure_count=2`、`backend=memory`
|
||
- `GET /api/routing/sticky/route-failures` 回读到同一条故障计数,返回 `backend=memory`
|
||
- `POST /api/routing/sticky/cooldowns` 创建 `route_id=route-p1t4-memory-1780011984`,返回 `reason=degraded`、`backend=memory`
|
||
- `GET /api/routing/sticky/cooldowns` 回读到同一条 cooldown,返回 `backend=memory`
|
||
- `redis` 模式:
|
||
- `POST /api/routing/sticky/bindings` 创建 `subject_id=conv-p1t4-redis-1780012047`,返回 `backend=redis`
|
||
- `GET /api/routing/sticky/bindings` 回读到同一条绑定,`route_id=asxs-1779971040`,返回 `backend=redis`
|
||
- `POST /api/routing/sticky/route-failures` 创建 `route_id=route-p1t4-redis-1780012047`,返回 `failure_count=3`、`backend=redis`
|
||
- `GET /api/routing/sticky/route-failures` 回读到同一条故障计数,返回 `backend=redis`
|
||
- `POST /api/routing/sticky/cooldowns` 创建 `route_id=route-p1t4-redis-1780012047`,返回 `reason=cooldown-active`、`backend=redis`
|
||
- `GET /api/routing/sticky/cooldowns` 回读到同一条 cooldown,返回 `backend=redis`
|
||
- 2026-05-29 已完成基础设施闭环补充 / `route resolve + sticky hit`
|
||
- 提交:`66ad319c feat(routing): add sticky-backed route resolver`
|
||
- 新增管理 API:
|
||
- `POST /api/routing/resolve`
|
||
- 行为收口:
|
||
- 首次 resolve 按 `logical_group_routes.priority` 选择可用 route
|
||
- 选择结果会同步写入 sticky store
|
||
- 同步写入 `route_decision_logs` 与 `route_sticky_audit`
|
||
- 后续同 `scope + logical_group_id + public_model + subject_id` 请求会优先命中 sticky
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = 66ad319`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- 远端实例二进制校验:
|
||
- 活跃实例目录:`/home/ubuntu/sub2api-kimi-patched-auto2-20260525_18169`
|
||
- 本地构建与远端实例二进制 `sha256` 已对齐为 `f7b8334cd992e0a3e65d3f129163f0a01f06a1c746071b67f6e8d1f6fe38ad99`
|
||
- 修正了一次“按绝对路径 `pkill` 未命中 `./sub2api-cn-relay-manager-server`”导致旧进程仍在跑的问题,之后已定向 kill 活跃 PID 并重启
|
||
- remote43 真实公网 API 验证已通过(`redis` 运行时):
|
||
- 创建临时逻辑分组 `logical_group_id=p1t5-gpt-shared-1780019458`
|
||
- 创建两条 route:
|
||
- `asxs-1780019458`,`priority=20`
|
||
- `codex2api-1780019458`,`priority=10`
|
||
- 第一次 `POST /api/routing/resolve`:
|
||
- `request_id=req-p1t5-first-1780019458`
|
||
- `subject_id=conv-p1t5-1780019458`
|
||
- 返回 `backend=redis`
|
||
- 返回 `route_id=codex2api-1780019458`
|
||
- 返回 `sticky_hit=false`
|
||
- 返回 `sticky_action=bind`
|
||
- 第二次 `POST /api/routing/resolve`(同 subject):
|
||
- `request_id=req-p1t5-second-1780019458`
|
||
- 返回 `backend=redis`
|
||
- 返回 `route_id=codex2api-1780019458`
|
||
- 返回 `sticky_hit=true`
|
||
- 返回 `sticky_action=hit`
|
||
- `GET /api/routing/logs/decisions?sticky_key=lg:p1t5-gpt-shared-1780019458:m:gpt-5.4:conv:conv-p1t5-1780019458`
|
||
- 回读到两条 decision log
|
||
- 最新一条 `request_id=req-p1t5-second-1780019458`,`sticky_hit=true`
|
||
- 较早一条 `request_id=req-p1t5-first-1780019458`,`sticky_hit=false`
|
||
- `GET /api/routing/logs/sticky-audit?sticky_key=lg:p1t5-gpt-shared-1780019458:m:gpt-5.4:conv:conv-p1t5-1780019458`
|
||
- 回读到两条 sticky audit
|
||
- 最新一条 `action=hit`
|
||
- 较早一条 `action=bind`
|
||
- 2026-05-29 已完成基础设施闭环补充 / `route failure threshold failover`
|
||
- 提交:`eb2242ca feat(routing): add resolver failover fallback`
|
||
- 行为收口:
|
||
- `resolve` 现在会在选路时读取 `route-failures` 与 `cooldowns`
|
||
- 当高优先级 route 的 `failure_count >= logical_group.failover_threshold` 时,会自动跳过并选择下一条可用 route
|
||
- 首次 fallback 会把 `route_decision_logs.fallback_used` 置为 `true`
|
||
- 同时写入 `route_failover_events`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = eb2242c`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- 远端实例二进制已更新为 `sha256=cc177700541d9ab85a638f768e6fba045d1e864c347e6dfd895ea9e05f27c571`
|
||
- remote43 真实公网 API 验证已通过(`redis` 运行时):
|
||
- 创建临时逻辑分组 `logical_group_id=p1t5-failover-1780020305`
|
||
- 创建两条 route:
|
||
- `codex2api-1780020305`,`priority=10`
|
||
- `asxs-1780020305`,`priority=20`
|
||
- `POST /api/routing/sticky/route-failures`
|
||
- 为 `codex2api-1780020305` 写入 `failure_count=2`
|
||
- `last_error_class=timeout`
|
||
- 返回 `backend=redis`
|
||
- 第一次 `POST /api/routing/resolve`
|
||
- `request_id=req-p1t5-failover-first-1780020305`
|
||
- 返回 `route_id=asxs-1780020305`
|
||
- 返回 `sticky_hit=false`
|
||
- 返回 `sticky_action=bind`
|
||
- 说明高优先级 `codex2api-1780020305` 已因超阈值被自动跳过
|
||
- 第二次 `POST /api/routing/resolve`(同 subject)
|
||
- `request_id=req-p1t5-failover-second-1780020305`
|
||
- 返回 `route_id=asxs-1780020305`
|
||
- 返回 `sticky_hit=true`
|
||
- `GET /api/routing/logs/failovers?request_id=req-p1t5-failover-first-1780020305`
|
||
- 回读到一条 failover event
|
||
- `from_route_id=codex2api-1780020305`
|
||
- `to_route_id=asxs-1780020305`
|
||
- `reason=failure_threshold_exceeded:timeout`
|
||
- `failure_count=2`
|
||
- `GET /api/routing/logs/decisions?sticky_key=lg:p1t5-failover-1780020305:m:gpt-5.4:conv:conv-p1t5-failover-1780020305`
|
||
- 第一条 resolve 对应记录 `fallback_used=true`、`sticky_hit=false`
|
||
- 第二条 resolve 对应记录 `fallback_used=false`、`sticky_hit=true`
|
||
- 2026-05-29 已完成基础设施闭环补充 / `cooldown rebinding regression matrix`
|
||
- 代码基线:沿用 `eb2242ca feat(routing): add resolver failover fallback`
|
||
- 验证目标:
|
||
- 人工设置 `cooldown`
|
||
- `resolve` 自动跳过处于 cooldown 的高优先级 route
|
||
- 写入 `route_failover_events`
|
||
- 验证旧 sticky 失效后的重新绑定,以及新 sticky 的后续命中
|
||
- remote43 真实公网 API 验证已通过(`redis` 运行时):
|
||
- 创建临时逻辑分组 `logical_group_id=p1t5-cooldown-1780020999`
|
||
- 创建两条 route:
|
||
- `codex2api-1780020999`,`priority=10`
|
||
- `asxs-1780020999`,`priority=20`
|
||
- 第一次 `POST /api/routing/resolve`
|
||
- `request_id=req-p1t5-cooldown-first-1780020999`
|
||
- 正常命中主 route `codex2api-1780020999`
|
||
- 返回 `sticky_hit=false`
|
||
- 返回 `sticky_action=bind`
|
||
- `POST /api/routing/sticky/cooldowns`
|
||
- 为 `codex2api-1780020999` 写入 `reason=degraded`
|
||
- 返回 `backend=redis`
|
||
- 第二次 `POST /api/routing/resolve`(同 subject)
|
||
- `request_id=req-p1t5-cooldown-second-1780020999`
|
||
- 旧 sticky 指向的主 route 因 `cooldown` 被自动判定失效
|
||
- 返回 `route_id=asxs-1780020999`
|
||
- 返回 `sticky_hit=false`
|
||
- 返回 `sticky_action=bind`
|
||
- 说明已完成 sticky 重绑定到 fallback route
|
||
- 第三次 `POST /api/routing/resolve`(同 subject)
|
||
- `request_id=req-p1t5-cooldown-third-1780020999`
|
||
- 返回 `route_id=asxs-1780020999`
|
||
- 返回 `sticky_hit=true`
|
||
- 说明新的 sticky 已稳定命中 fallback route
|
||
- `GET /api/routing/logs/failovers?request_id=req-p1t5-cooldown-second-1780020999`
|
||
- 回读到一条 failover event
|
||
- `from_route_id=codex2api-1780020999`
|
||
- `to_route_id=asxs-1780020999`
|
||
- `reason=active_cooldown:degraded`
|
||
- `failure_count=0`
|
||
- `GET /api/routing/logs/decisions?sticky_key=lg:p1t5-cooldown-1780020999:m:gpt-5.4:conv:conv-p1t5-cooldown-1780020999`
|
||
- 第一条 resolve:`selected_route_id=codex2api-1780020999`、`fallback_used=false`
|
||
- 第二条 resolve:`selected_route_id=asxs-1780020999`、`fallback_used=true`、`sticky_hit=false`
|
||
- 第三条 resolve:`selected_route_id=asxs-1780020999`、`fallback_used=false`、`sticky_hit=true`
|
||
- `GET /api/routing/logs/sticky-audit?sticky_key=lg:p1t5-cooldown-1780020999:m:gpt-5.4:conv:conv-p1t5-cooldown-1780020999`
|
||
- 依次可回读到三条 sticky audit:
|
||
- `action=bind`,`route_id=codex2api-1780020999`
|
||
- `action=bind`,`route_id=asxs-1780020999`
|
||
- `action=hit`,`route_id=asxs-1780020999`
|
||
- 2026-05-29 已完成基础设施闭环补充 / `resolve + minimal chat proxy bridge`
|
||
- 提交:`9b1c6f43 feat(routing): add minimal chat proxy bridge`
|
||
- 新增管理 API:
|
||
- `POST /api/routing/proxy/chat/completions`
|
||
- 行为收口:
|
||
- 先复用 `POST /api/routing/resolve` 选出 `route_id / shadow_host_id / shadow_group_id / shadow_model`
|
||
- 再从插件 SQLite `hosts` 表读取 `shadow_host_id -> host.base_url`
|
||
- 用调用方显式提供的 `gateway_api_key` 代理转发最小 `POST /v1/chat/completions`
|
||
- 默认在未提供 `messages` 时发送最小 `ping` 消息
|
||
- 转发结果会以同一 `request_id` 追加写回 `route_decision_logs`,补齐 `upstream_status / latency_ms / error_class`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = 9b1c6f43`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 真实服务器 API 验证已通过:
|
||
- 为避免依赖现成 managed gateway key,这轮在 remote43 本机临时起了 loopback stub `shadow host`:
|
||
- `host_id=proxy-shadow-host-1780022254`
|
||
- `base_url=http://127.0.0.1:18095`
|
||
- 只暴露最小 `POST /v1/chat/completions`
|
||
- 要求 `Authorization: Bearer gateway-key`
|
||
- 通过真实 CRM API 创建临时逻辑分组与路由:
|
||
- `logical_group_id=p1t6-proxy-1780022254`
|
||
- `route_id=asxs-1780022254`
|
||
- `shadow_group_id=p1t6-proxy-1780022254__asxs`
|
||
- `shadow_model=gpt-5.4-asxs`
|
||
- 调用 `POST /api/routing/proxy/chat/completions`:
|
||
- `request_id=req-p1t6-proxy-1780022254`
|
||
- `subject_id=conv-p1t6-proxy-1780022254`
|
||
- 返回 `route_id=asxs-1780022254`
|
||
- 返回 `shadow_host_id=proxy-shadow-host-1780022254`
|
||
- 返回 `shadow_model=gpt-5.4-asxs`
|
||
- 返回 `forward.ok=true`
|
||
- 返回 `forward.upstream_status=200`
|
||
- 返回 `forward.response.choices[0].message.content=pong-from-stub`
|
||
- stub 侧回读确认:
|
||
- 实际收到 `Authorization: Bearer gateway-key`
|
||
- 实际收到 `model=gpt-5.4-asxs`
|
||
- `GET /api/routing/logs/decisions?request_id=req-p1t6-proxy-1780022254&limit=5`
|
||
- 共回读到 `2` 条 decision log
|
||
- 最新一条 `upstream_status=200`
|
||
- 说明控制面 `resolve` 与数据面最小转发已被同一 `request_id` 串成闭环
|
||
- 2026-05-29 已完成真实宿主 managed key 供应路径补充 / `route proxy -> managed subscription -> real host chat`
|
||
- 提交:`cffe3332 feat(routing): auto-supply managed proxy keys`
|
||
- `POST /api/routing/proxy/chat/completions` 新增:
|
||
- 当请求未显式提供 `gateway_api_key` 且带 `subscription_user_id` 时,会自动调用目标宿主的 `EnsureSubscriptionAccess(...)`
|
||
- 通过 `shadow_group_id + shadow_host_id` 解析真实宿主 group,再以新签发的 managed subscription key 转发到真实 host `/v1/chat/completions`
|
||
- 转发结果继续以同一 `request_id` 追加写回 `route_decision_logs`
|
||
- 本地门禁已通过:
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `go test ./tests/integration/... -count=1`
|
||
- remote43 已原位升级到 `repo HEAD = cffe3332`
|
||
- `http://127.0.0.1:18173/healthz` 返回 `ok`
|
||
- remote43 真实服务器 API 验证结果:
|
||
- 通过 remote43 本机 `host admin login` 拿到真实宿主管理员 access token
|
||
- 已确认 route-lab 真实 shadow group:
|
||
- `name=GPT Shared 路由实验-subscription`
|
||
- `host_group_id=8`
|
||
- 通过真实 CRM API 创建临时 host / logical group / route:
|
||
- `host_id=proxy-real-host-1780026133`
|
||
- `logical_group_id=p1t7-real-1780026133`
|
||
- `route_id=asxs-real-1780026133`
|
||
- `shadow_group_id=8`
|
||
- `shadow_model=gpt-5.4-asxs`
|
||
- 调用 `POST /api/routing/proxy/chat/completions`:
|
||
- `request_id=req-p1t7-real-1780026133`
|
||
- `subscription_user_id=proxy-managed-1780026133`
|
||
- 返回 `effective_gateway_key_source=managed_subscription`
|
||
- 返回 `managed_user_id=33`
|
||
- 返回 `effective_gateway_key_fingerprint=sha256:f0fa3dec6e94348945431c9470c1faa8258c50fcee1adcb1904dac0fa42a15d6`
|
||
- 返回 `forward.ok=false`
|
||
- 返回 `forward.upstream_status=503`
|
||
- 返回 `forward.response.error.message=Service temporarily unavailable`
|
||
- `GET /api/routing/logs/decisions?request_id=req-p1t7-real-1780026133&limit=5`
|
||
- 共回读到 `2` 条 decision log
|
||
- 最新一条 `upstream_status=503`
|
||
- 当前结论:
|
||
- 插件控制面 `resolve -> shadow_host/shadow_group/shadow_model`
|
||
- managed subscription key 自动供给
|
||
- 真实宿主 `/v1/chat/completions` 转发
|
||
- 插件侧 decision log 回写
|
||
- 以上链路均已真实命中
|
||
- 但该次 upstream 仍返回 `503`,因此当前收口应表述为“真实 managed key 供应路径已打通,真实上游可用性仍待继续压实”
|
||
- 2026-05-29 已完成该 `503` 的只读根因收敛 / `managed key vs gateway vs account test vs host logs`
|
||
- 目标样本仍使用上一轮已真实命中的 managed key:
|
||
- `subscription_user_id=proxy-managed-1780026133`
|
||
- `managed_key_fingerprint=sha256:f0fa3dec6e94348945431c9470c1faa8258c50fcee1adcb1904dac0fa42a15d6`
|
||
- `group_id=8`
|
||
- `group_name=GPT Shared 路由实验-subscription`
|
||
- 同一条 managed key 的宿主 gateway 对照结果:
|
||
- `GET /v1/models` 返回 `200`
|
||
- 返回模型集只包含 alias:`["gpt-5.4-asxs","gpt-5.4-mini-asxs"]`
|
||
- `POST /v1/chat/completions` with `model=gpt-5.4-asxs` 返回 `503`
|
||
- `POST /v1/chat/completions` with `model=gpt-5.4` 也返回 `503`
|
||
- 同一 group 对应账号状态:
|
||
- 仅命中 `account_id=9`
|
||
- `name=gpt-asxs-route-lab-01`
|
||
- `status=active`
|
||
- `schedulable=true`
|
||
- `group_ids=[8]`
|
||
- `credentials.base_url=https://api.asxs.top/v1`
|
||
- `credentials.model_mapping={"gpt-5.4-asxs":"gpt-5.4","gpt-5.4-mini-asxs":"gpt-5.4-mini"}`
|
||
- `GET /api/v1/admin/accounts/9/models` 返回 `200`,模型集包含 `gpt-5.4-asxs` / `gpt-5.4-mini-asxs`
|
||
- `POST /api/v1/admin/accounts/9/test` with `model_id=gpt-5.4` 返回 success
|
||
- `POST /api/v1/admin/accounts/9/test` with `model_id=gpt-5.4-asxs` 也返回 success
|
||
- 同一 group 对应 channel 状态:
|
||
- `channel_id=7`
|
||
- `name=GPT Shared - asxs-subscription`
|
||
- `restrict_models=true`
|
||
- `billing_model_source=channel_mapped`
|
||
- `model_mapping.openai={"gpt-5.4-asxs":"gpt-5.4","gpt-5.4-mini-asxs":"gpt-5.4-mini"}`
|
||
- `model_pricing[0].models=["gpt-5.4-asxs","gpt-5.4-mini-asxs"]`
|
||
- 真实宿主容器日志(`weishaw/sub2api:0.1.129` / container `f2f0490d4b7f`)已明确给出拦截原因:
|
||
- `channel pricing restriction blocked request`
|
||
- `openai_chat_completions.account_select_failed`
|
||
- `error=no available accounts supporting model: gpt-5.4-asxs (channel pricing restriction)`
|
||
- 对 `model=gpt-5.4` 也出现相同 `channel pricing restriction`
|
||
- 本轮判定:
|
||
- 这不是 managed key 失效:同一 key 的 `/v1/models` 已返回 `200`
|
||
- 这不是账号坏掉:`account_id=9` active/schedulable,且 direct `account test` 成功
|
||
- 这不是 pack/provider 基础字段缺失:channel 上 `model_mapping + model_pricing + restrict_models + billing_model_source=channel_mapped` 都已存在
|
||
- 当前 `503` 的直接根因是 **stock host `weishaw/sub2api:0.1.129` 在 gateway 选路阶段把该 shadow group 视为 `channel pricing restriction` 不可用**
|
||
- 工程决策:
|
||
- **优先修 route/provider 配置,不继续押注宿主兼容热修**
|
||
- 插件的 `logical_group/public_model` 仍可继续保留 alias / 路由抽象
|
||
- 但宿主 `shadow_group` 不应继续承载 alias 模型名
|
||
- 后续 shadow group 应改为只承载 canonical upstream model,例如:
|
||
- `gpt-5.4`
|
||
- `gpt-5.4-mini`
|
||
- alias/public model 的抽象只保留在插件 `logical_group -> route -> shadow_model` 这一层;不要再把 alias 下沉到 stock host 的 `channel_mapped + restrict_models` 组合里
|
||
- 2026-05-29 已完成上述根因的修复验证 / `canonical shadow provider -> managed subscription -> real host chat`
|
||
- 提交:
|
||
- `3c061f3d feat(routing): add canonical shadow provider pack`
|
||
- `4a38e95d fix(acceptance): separate request pack path`
|
||
- fixed checkout 已更新到 `repo HEAD = 3c061f3d`
|
||
- 真实宿主影子组改为:
|
||
- `shadow_group_id=9`
|
||
- `provider_id=gpt-asxs-shadow-lab`
|
||
- `shadow_model=gpt-5.4`
|
||
- remote43 本机经真实 CRM API 创建临时路由:
|
||
- `logical_group_id=p1t7-shadow-1780029532`
|
||
- `route_id=asxs-shadow-1780029532`
|
||
- `shadow_host_id=proxy-real-host-1780026133`
|
||
- `subscription_user_id=proxy-shadow-managed-1780029532`
|
||
- 调用 `POST /api/routing/proxy/chat/completions` 的真实结果:
|
||
- `request_id=req-p1t7-shadow-1780029532`
|
||
- `effective_gateway_key_source=managed_subscription`
|
||
- `managed_user_id=35`
|
||
- `forward.ok=true`
|
||
- `forward.upstream_status=200`
|
||
- `forward.shadow_group_id=9`
|
||
- `forward.shadow_model=gpt-5.4`
|
||
- `forward.response.content_type=text/event-stream`
|
||
- 返回内容已回读到正常 completion,`content=pong`
|
||
- `GET /api/routing/logs/decisions?request_id=req-p1t7-shadow-1780029532&limit=5`
|
||
- 共回读到 `2` 条 decision log
|
||
- 最新一条 `upstream_status=200`
|
||
- 当前闭环结论:
|
||
- 旧 `503` 的根因已经固定为 **alias/public model 下沉到 stock host shadow group 后触发 `channel pricing restriction`**
|
||
- 把宿主 shadow group 收回 canonical upstream model 后,真实 managed key `/v1/chat/completions` 已恢复 `200`
|
||
- 插件控制面 `resolve`
|
||
- managed subscription key 自动供给
|
||
- 真实宿主 `/v1/chat/completions`
|
||
- 插件侧 decision log 回写
|
||
- 以上链路现已全部真实跑通
|
||
- 2026-05-26 已把“最终用户 -> 公网域名 -> OpenClaw”这一跳补进正式验证口径:
|
||
- 公网根地址当前统一为 `https://sub.tksea.top`
|
||
- OpenClaw 本地 `MiniMax` 运行时故障已定位为 `pi-ai/openai-node` 未继承系统 `HTTP(S)_PROXY`,不是 allowlist 或模型名大小写问题
|
||
- 操作者本机已新增升级后自检与提醒链路:
|
||
- `~/.openclaw/bin/apply-openclaw-minimax-proxy-fix.sh`
|
||
- `~/.openclaw/bin/openclaw-minimax-post-upgrade-check.sh`
|
||
- `~/.openclaw/bin/openclaw-minimax-proxy-reminder.sh`
|
||
- `~/.openclaw/bin/install-openclaw-minimax-reminder-cron.sh`
|
||
- 当前 OpenClaw 真实验证基线已收口为:
|
||
- `tksea-gpt/gpt-5.4`:PASS
|
||
- `tksea-gpt/gpt-5.4-mini`:PASS
|
||
- `tksea-gpt/gpt-5.5`:当前 upstream `503`
|
||
- `tksea-minimax/MiniMax-M2.5-highspeed`:PASS
|
||
- `tksea-minimax/MiniMax-M2.7-highspeed`:PASS
|
||
- `deepseek-official/deepseek-chat`:PASS(2026-05-27 已补齐本机 auth profile,one-shot 返回 `OK`)
|
||
- 这部分测试用例与执行过程已沉淀到 `docs/OPENCLAW_EXTERNAL_VALIDATION.md`
|
||
- 2026-05-26 remote43 patched host 最新 provider 扩展验收:
|
||
- `openai-zhongzhuan`:`artifacts/real-host-acceptance/20260526_155548_remote43_openai_zhongzhuan_multi_model_rootprep/21-summary.json` 已确认 `batch_status=succeeded`、`provider_status_from_import=active`、`direct_chat_status=200`
|
||
- `minimax-53hk`:`artifacts/real-host-acceptance/20260526_155705_remote43_minimax53hk_multi_model_rootprep/21-summary.json` 已确认 `batch_status=succeeded`、`provider_status_from_import=active`、`direct_chat_status=200`、`upstream_chat_status=200`
|
||
- `deepseek-chat-official`:
|
||
- 旧 artifact `artifacts/real-host-acceptance/20260526_155810_remote43_deepseek_chat_official_multi_model_rootprep/21-summary.json` 停在 `partially_succeeded/degraded`
|
||
- 2026-05-27 rerun `artifacts/real-host-acceptance/20260527_051655_remote43_deepseek-chat-official_key_import/21-summary.json` 已确认本机经 remote43 patched host 的真实数据面恢复:`direct_models_http200=true`、`direct_models=["deepseek-chat"]`、`direct_chat_status=200`、`latest_access_status=subscription_ready`、`upstream_chat_status=200`
|
||
- 剩余 `partially_succeeded/degraded` 的唯一原因已定位为宿主 account probe 返回裸 `API returned 404:`,而后续 gateway `/v1/models` + `/v1/chat/completions` 实际都已通过;HEAD 现已把该 404 视为 advisory,不再阻塞最终状态收敛
|
||
- 同轮还补上 remote43 scripted stack 的真实脚本缺陷:`.env.crm` 里的 SQLite DSN 含 `&_busy_timeout=5000` 时,旧版未做 shell escape,`source .env.crm` 会吞掉 `SUB2API_CRM_SQLITE_DSN`,导致 remote CRM 实际退回默认 DB 路径;`scripts/deploy/remote43_patched_stack_lib.sh` 已修复并有回归测试覆盖
|
||
- latest-head relay-manager 已新增宿主 capability 自愈:
|
||
- 当第三方 OpenAI-compatible upstream 因宿主把 `openai_responses_supported` 误判成 `true` 而导致 host `/v1/chat/completions` 返回 `502 upstream_error` 时,access closure 与后台 reconcile 会自动把相关 account 修正到 raw `/chat/completions` 路径后再重试
|
||
- 但这条控制面自愈当前仍不足以单独收敛本地 stock `weishaw/sub2api:0.1.129` + `kimi-a7m` 场景;`artifacts/real-host-acceptance/20260525_local_v0129_kimi_a7m_scheme_c_stockhost_rerun/21-summary.json` 已再次证明:在不改宿主源码的前提下,managed `/v1/models` 虽然命中 `kimi-k2.6`,`/v1/chat/completions` 仍会落到 `502 upstream_error`,所以该 case 仍需宿主运行时兼容补丁或 shim
|
||
- 2026-05-23 remote43 线上验收脚本已继续收口:
|
||
- `scripts/acceptance/import_remote43_provider.sh` 现已明确拆分 `CRM_HOST_BASE` 与 `REMOTE_HOST_BASE`
|
||
- 远端 Postgres / Redis 容器已改成按目标宿主端口自动解析,不再硬编码落到 `sub2api-relaymgr-pg/redis`
|
||
- 远端 managed probe `/v1/models` 与 `/v1/chat/completions` 已改成只走 `REMOTE_HOST_BASE`
|
||
- provider status / access status / access preview 末尾查询已补 `host_id`,避免本地 CRM 有多宿主历史时被 `provider exists on multiple hosts` 截断
|
||
- 2026-05-25 已把 Hermes 里可复用的 `a7m-kimi` 正式并入主 pack:
|
||
- 新增 `packs/openai-cn-pack/providers/kimi-a7m.json`
|
||
- `openai-cn-pack` 版本现为 `1.1.3`
|
||
- 当前主仓不再需要依赖历史临时 pack `openai-cn-pack-kimi-a7m`
|
||
- `kimi-a7m` provider manifest 现在也开始承载 `host_overlays` 元数据;本地已把 `sub2api v0.1.129` 的 Kimi A7M runtime overlay 说明与 `.patch` 资产纳入 `packs/openai-cn-pack/overlays/`
|
||
- 新增 `go run ./cmd/cli apply-host-overlay` 最小执行器;当前 pack 内命中的 overlay 已可直接生成 patched 宿主构建目录,不再只是 preview/import 阶段的提示信息
|
||
- 2026-05-25 已继续把路线 A 推进到运行态层面:
|
||
- 从 `/tmp/sub2api-clean` 的 clean worktree `HEAD` 导出 stock 源码,再用 `go run ./cmd/cli apply-host-overlay --provider-id kimi-a7m --host-version 0.1.129` 生成全新 patched 源码树
|
||
- 基于该 patched 源码树重建 `localhost/sub2api:patched-overlay-20260525-clean`,并在独立 Podman 网络里启动新的 Postgres / Redis / App fresh-host
|
||
- `artifacts/real-host-acceptance/20260525_local_v0129_kimi_a7m_patched_overlay_image_freshhost_clean/21-summary.json` 已确认:`import_batch_status=succeeded`、`provider_status=active`、`latest_access_status=subscription_ready`、`completion_ok=true`、`completion_status=200`
|
||
- 同目录 `07-access-status.json` 与 patched host 运行日志已共同证明 managed subscription key 真实打通 `/v1/models` 与 `POST /v1/chat/completions`
|
||
- 注意:该 fresh-host 使用的镜像基底仍是 `weishaw/sub2api:0.1.129`,但宿主管理 API 当前自报 `host_version=0.1.126`;后续读 artifact 时应以日期和证据链为准,不要只依赖版本字段
|
||
- 2026-05-25 已把同一条 patched overlay 路线放到 remote43 做线上验收:
|
||
- remote43 侧单独拉起了 `sub2api-kimi-patched-20260525-{app,pg,redis}`,patched host 暴露 `127.0.0.1:18139`
|
||
- 临时 CRM 也切到 remote43 本机运行,再通过 SSH 隧道映射回本地 `127.0.0.1:18143`,避免“本地 CRM 透过隧道探远端 host”导致的 `get host version` 超时噪音
|
||
- `artifacts/real-host-acceptance/20260525_remote43_kimi_a7m_patched_overlay_freshhost_remotecrm/21-summary.json` 已确认:`batch_status=succeeded`、`access_status_from_import=subscription_ready`、`provider_status_from_import=active`、`direct_models_http200=true`、`direct_chat_http200=true`、`upstream_chat_status=200`
|
||
- 同目录 `14-access-status.json` 已确认 `effective_probe_key_source=managed_subscription` 且 `completion_status=200`
|
||
- remote43 宿主日志也已落到真实 `GET /v1/models = 200`、`POST /v1/chat/completions = 200`,说明这条 patched overlay 路线不只在本地 fresh-host 成功,也已在远端真实环境收敛到 ready
|
||
- 这轮还顺手修掉了 `scripts/acceptance/import_remote43_provider.sh` 的一个真实脚本缺陷:查找未分配 `relay-sub-*` 用户时,`NOT EXISTS` 子查询错误引用了无 alias 的 `users.id`,在 PostgreSQL 上会报 `invalid reference to FROM-clause entry for table "users"`
|
||
- 2026-05-25 继续把这套 remote43 patched-host / remote CRM 的启动流程脚本化:
|
||
- 新增 `scripts/deploy/setup_remote43_patched_stack.sh`,把 pack 镜像、二进制上传、remote43 上的 PG/Redis/patched host/临时 CRM 拉起、以及本地 operator env / SSH 隧道提示收口为一个固定入口
|
||
- 新增 `scripts/deploy/remote43_patched_stack_lib.sh`,把远端 host env / CRM env / bootstrap script 渲染逻辑抽成可测试 helper
|
||
- `scripts/test/test_real_host_scripts.sh` 已新增对应回归,避免以后再回退到手工 `/tmp/*.sh` 拼装
|
||
- 脚本首轮真实演练暴露出一个运行态细节:patched `sub2api` 二进制实际监听 `8080`,不能沿用旧临时脚本里的 `127.0.0.1:$HOST_PORT:3000` 端口映射;当前 `setup_remote43_patched_stack.sh` 已新增 `HOST_CONTAINER_PORT=8080` 默认值并完成 remote43 二次实跑验证
|
||
- 用修复后的固定脚本在 remote43 新起的 `sub2api-kimi-patched-auto2-20260525` 栈上,`kimi-a7m` 再次完成真实导入主链路:`artifacts/real-host-acceptance/20260525_remote43_kimi_a7m_patched_overlay_scripted_stack/03-import.body.json` 已确认 `batch_status=succeeded`、`access_status=subscription_ready`、`provider_status=active`,同目录 `10-models.body.json` / `12-chat.body.json` / `18-upstream-models.body.json` / `20-upstream-chat.body.txt` 也已再次证明 managed 与 upstream 双侧都回到 `HTTP 200`
|
||
- 2026-05-26 继续把 scripted stack 的末尾状态查询收口为稳定契约:`scripts/acceptance/import_remote43_provider.sh` 末尾不再只传 `host_id`,而是显式拼上 `pack_id=openai-cn-pack&host_id=<create-host 返回值>`;修复原因是 remote43 上同一个 provider 可能存在多个 pack 版本,缺 `pack_id` 时 `/api/providers/{providerID}/status` 会返回 `400 provider exists in multiple packs; pack_id is required`
|
||
- 修复后,`artifacts/real-host-acceptance/20260526_remote43_kimi_a7m_patched_overlay_scripted_stack_rerun2/13-provider-status.json`、`14-access-status.json`、`21-summary.json` 已全部自动补齐;其中 `21-summary.json` 已再次确认 `batch_status=succeeded`、`provider_status_from_import=active`、`latest_access_status=subscription_ready`、`direct_chat_status=200`、`upstream_chat_status=200`
|
||
- `artifacts/real-host-acceptance/20260525_local_v0129_kimi_a7m_from_hermes/21-summary.json` 已证明:
|
||
- Hermes 本机 `A7M_KIMI_API_KEY` 直探 upstream `/v1/models` 与 `/v1/chat/completions` 均为 `200`
|
||
- latest-head relay-manager + 本地 `weishaw/sub2api:0.1.129` fresh-host 下,import-time gateway `/v1/models` 命中 `kimi-k2.6`
|
||
- 但 completion 仍落到 `502 upstream_error`,手工 managed key 再探 `/v1/chat/completions` 也返回 `503`
|
||
- 管理员 account 视角 `/api/v1/admin/accounts/1/models` 正确,但手工 managed key `/v1/models` 仍会回到 GPT 默认集合,当前应继续归类为宿主运行时 gap / drift,而不是 Hermes key 失效
|
||
- `artifacts/real-host-acceptance/20260525_local_v0129_kimi_a7m_from_hermes/22-patched-host-validation.json` 已证明:
|
||
- 问题根因是宿主把 Kimi A7M 这类 custom upstream 误走到 `Responses` 路径,而不是 Hermes key 或 relay-manager pack 失效
|
||
- 在 `/tmp/sub2api-clean` 的宿主补丁下,旁路容器 `sub2api-patched` 已恢复 `managed key /v1/models=200`、`managed key /v1/chat/completions=200`、`admin accounts/:id/test=true`
|
||
- fallback 日志与账号 `extra.openai_responses_supported=false` 持久化已同时出现,说明这条链路已经从 stock host 的 `partially_succeeded / broken` 收敛到 patched host 的 `ready`
|
||
- `artifacts/real-host-acceptance/20260525_local_v0129_kimi_a7m_scheme_c_stockhost_rerun/21-summary.json` 已证明:
|
||
- 仅启用 relay-manager 侧方案 C(预先 `force_disable_openai_responses_api` + import/access/reconcile capability repair),但保持宿主仍是未打补丁的 stock `weishaw/sub2api:0.1.129`
|
||
- import-time gateway `/v1/models` 仍能命中 `kimi-k2.6`
|
||
- 但 import-time gateway `/v1/chat/completions` 依旧返回 `502 upstream_error`,`access_status` 仍是 `broken`,`provider_status_latest` 仍是 `partially_succeeded`
|
||
- 因此当前最新真相不是“方案 C 已替代宿主补丁”,而是“方案 C 缩小了控制面误判范围,但这条 Kimi A7M / stock v0.1.129 链路仍需要宿主运行时兼容修复”
|
||
- 2026-05-25 已继续补齐方案 C(控制面侧 capability repair):
|
||
- `internal/host/sub2api` 新增 `ClearTempUnschedulable`
|
||
- access / reconcile 的 capability repair 现在会同时写 `extra.openai_responses_supported=false` 并清理账号 `temp_unschedulable`
|
||
- `packs/openai-cn-pack/providers/kimi-a7m.json` 新增 `force_disable_openai_responses_api=true`,导入后会在 gateway closure 前预先把该账号切到 raw `/v1/chat/completions`
|
||
- `artifacts/real-host-acceptance/20260523_144937_remote43_kimi-a7m_key_import` 已证明:
|
||
- 这轮线上 `kimi-a7m` 不再复现“错库取 key 导致统一 401”或“模型列表串成 GPT 默认集合”
|
||
- import 已返回 `gateway.status_code=200`、`models=["kimi-k2.6"]`、`has_expected_model=true`
|
||
- upstream `/models` 与 `/chat/completions` 都是 `200`
|
||
- 未改宿主的真实阻塞已收缩为 host `/v1/chat/completions` 仍返回 `503/502`,不再是插件脚本的数据面问题
|
||
- `artifacts/real-host-acceptance/20260523_145531_remote43_kimi-a7m_key_import` 说明另一类运行时噪音:
|
||
- 当本地 SSH 隧道端口存活但链路已失活时,`POST /api/hosts` 阶段会在 `get host version` 处超时
|
||
- 这类现象应优先解释为 tunnel/runtime 故障,而不是 provider 导入逻辑回退
|
||
- 官方 provider 验证矩阵当前仍保留一条非阻塞事实:
|
||
- `artifacts/real-host-acceptance/20260521_222212_remote43_minimax-m2-7-official_key_import/21-summary.json` 已证明 official MiniMax 模板链路是通的,但该验证 key 当前命中 upstream `429`
|
||
- `reconcile=drifted` 仍可能在 shared fresh-host 上出现,但当前解释是“历史残留资源噪音”,不阻塞 PRD 首版放行
|
||
- 调通细节与诊断经验已沉淀到:
|
||
- `docs/REAL_HOST_ACCEPTANCE_LEARNINGS.md`
|
||
- `docs/REAL_HOST_ARTIFACT_RETENTION.md`
|
||
- 2026-05-24 本地代码门禁修复已继续收口三类非回归点:
|
||
- `go test -race ./... -count=1` 现已再次真实通过;根因不是业务逻辑,而是多个测试包并行 `sqlite.Open()` 时与 `modernc.org/sqlite` 初始化路径的 race 噪音。当前已把 `internal/app`、`internal/provision`、`internal/reconcile` 的测试 SQLite 打开路径收口到串行 helper,关闭这类假红灯,同时保持 sqlite 包内测试不引入导入环。
|
||
- `DELETE /api/hosts/{hostID}` 不再默认放行危险级联删除;`hosts` repo 现在会先统计 `import_batches / managed_resources / reconcile_runs` 三类运行态依赖,有残留时返回 `409 host_in_use`,避免误删状态库里的回滚/对账真相。
|
||
- 控制面 JSON 请求体现在统一受 `MaxBytesReader` 限制;超限请求会明确返回 `413 request_too_large`,不再允许无界 body 直接进入解码路径。
|
||
|
||
## 本轮已完成
|
||
|
||
1. 宿主身份模型统一
|
||
- host 注册时持久化 `auth_type/auth_token`
|
||
- import / reconcile / rollback-provider / access 运行时链路切换为 `host_id` 主键
|
||
- provider status / resources / access status / import-batches 支持 `host_id` 查询维度
|
||
2. managed_resources 宿主维度收口
|
||
- 新增迁移 `0004_host_identity_and_managed_resources.sql`
|
||
- `managed_resources` 唯一键提升为 `(host_id, resource_type, host_resource_id)`
|
||
- 仓储与服务查询切换为 host-scoped 语义
|
||
3. reconcile run 结果按批次收口
|
||
- 新增迁移 `0006_reconcile_runs_batch_scope.sql`
|
||
- `reconcile_runs` 补充 `batch_id`,batch detail 仅返回本批次 reconcile 记录
|
||
4. capability probe 收敛为无副作用探测
|
||
- 不再对真实创建接口发送空 `POST`
|
||
5. rollback-provider 风险收敛
|
||
- 改为优先按已记录批次资源 `RollbackStoredResources()` 回滚
|
||
- 缺少已记录资源时拒绝危险删除
|
||
6. 文档真相同步
|
||
- 新增 `docs/2026-05-18-PRODUCTION_REMEDIATION_TASK_BOARD.md`
|
||
- 下调 `DEPLOYMENT.md` 中未实现的 `/metrics` / 限流 / 监控承诺
|
||
7. current-code remote43 导入链路已补齐 tunnel-aware 验证能力
|
||
- `scripts/acceptance/import_remote43_provider.sh` 新增 `CRM_HOST_BASE`,允许把“operator 访问 host 地址”和“CRM 进程访问 host 地址”分离
|
||
- 历史 live model-mapping 关键证据保留在:`artifacts/real-host-acceptance/20260520_222713_crm18100_live_model_mapping_validation`
|
||
8. current-code remote43 access gate 根因修正已落地
|
||
- subscription access 改为宿主侧闭环:CRM 不再依赖外部预先给定的宿主普通用户 key,而是按 `subscription_users` selector 在宿主创建/查找托管普通用户、登录创建托管 key、回写 allowed_groups / balance、再执行订阅分配
|
||
- account 创建请求现在同步写入 `credentials.model_mapping`,修正 `/v1/models` 读取 account model whitelist 时回退到 GPT 默认集合的问题
|
||
- 新增/更新测试覆盖:`internal/access`、`internal/provision`、`internal/host/sub2api`
|
||
9. current-code access ready 语义已提升到 completion 层
|
||
- `/v1/models` 不再单独决定 `subscription_ready/self_service_ready`
|
||
- 只有 `/v1/models` 命中 `smoke_test_model` 且 `/v1/chat/completions` smoke 成功,控制面才会把 access 状态记成 ready
|
||
- access closure / import runtime artifact / reconcile rerun payload 都会持久化 `completion_ok/completion_status/completion_type/completion_preview`
|
||
10. current-code remote43 验收脚本已补 upstream API 证据层
|
||
|
||
- `scripts/acceptance/import_remote43_provider.sh` 会直探 provider `base_url` 对应的 upstream `/models` 与 `/chat/completions`
|
||
- 新增 `21-summary.json`,用于把 completion 失败自动分流成 `host_compatibility_gap` 或 `upstream_key_quota_issue`
|
||
- 2026-05-27 已把 V2 batch-import reuse runtime 真正接到 live action:
|
||
- `internal/app/batch_runtime.go` 现已接入 `InspectReuse`
|
||
- runtime reuse 查询优先命中既有 `import_run_items`,再回退到 legacy `import_batches / import_batch_items / managed_resources / providers`
|
||
- 兼容 V2 短指纹与 legacy 完整 sha256 指纹
|
||
- live run 现在可真实产出 `matched_account_state / account_resolution / provision_reused`
|
||
- 2026-05-27 继续用 `/portal/admin-batch-import.html` 做真实页面操作验证,抓到了一个 live reuse 兼容缺口并已在本地修正:
|
||
- real remote43 样本 `https://api.53hk.cn/v1 + sk-4175...d776 + host=remote43-kimi-patched-auto2-18169` 首轮返回 `TOKEN_EXPIRED`,根因是 CRM 中持久化的宿主 bearer 已过期;刷新 host auth 后,item 已能恢复到 `access_status=active`
|
||
- 旧版 runtime 仍把同一条历史账号判成 `matched_account_state=none / account_resolution=created`,根因是 live runtime 的 normalized `provider_id`(如 `api-53hk-42797c06`)与 legacy pack provider id(如 `minimax-53hk`)不一致时,legacy reuse fallback 只按 `provider_id` 精确匹配
|
||
- 当前已补 `base_url` fallback + `ProviderMatched` 策略信号:legacy lookup 会补查相同 `base_url` 的 provider,且“同 base_url + 同 key + family covered”现在可以真实收敛到 `reused/reactivated`
|
||
- 定向回归已通过:`go test ./internal/app -run 'TestBatchImportHTTP/(create run action reuses matched legacy account|create run action reuses legacy account when pack provider id differs from normalized runtime id)$' -count=1`、`go test ./internal/batch -run TestDecideReuse -count=1`、`go test ./internal/store/sqlite -run 'TestProvidersRepoListBy(BaseURL|BaseURLEmpty)$' -count=1`
|
||
- remote43 二次复验现已补证:更新后的 CRM 二进制已替换到 `18173` 控制面,真实 rerun `run_1779882868037300268` 已确认 item 从 `account_resolution=created` 收敛为 `account_resolution=reused`,并且 `provision_reused=true`、`access_status=active`
|
||
- 当前剩余的细节是:该 rerun item 的 `matched_account_state` 仍为 `none`,说明“reuse 命中后是否补出 active/disabled/deprecated state badge”仍可继续优化;但这不影响本轮要验证的 `created -> reused` 结果成立
|
||
|
||
11. patched CRM external validation 已完成
|
||
|
||
- patched CRM 实例下,DeepSeek 与 MiniMax 都已验证“completion smoke 通过时能落成 succeeded/active,失败时不会误记成 ready”
|
||
- `20260521_191418_remote43_minimax_key_import` 与 `20260521_201509_remote43_deepseek_key_import` 已同时证明当前 `subscription` provider 链路可真实闭环
|
||
- `20260521_210403` 已证明 latest-head `self_service` 标准 fresh-host 验收也可闭环到 `self_service_ready / fully_ready`
|
||
|
||
12. artifact 保留策略已收口
|
||
|
||
- 主目录 `artifacts/real-host-acceptance/` 当前只保留最终证据
|
||
- 历史失败/半成功/试错样本已迁到 `artifacts/real-host-acceptance-archive/`
|
||
- 分类规则见:`docs/REAL_HOST_ARTIFACT_RETENTION.md`
|
||
|
||
13. relay-manager latest-head 已收口 Kimi A7M 两段竞态
|
||
|
||
- account test 首次 `403 Forbidden` 已降级为 advisory warning;只要 `/models` 已命中 `smoke_test_model`,不会再把 batch 误判为 blocking failure
|
||
- access closure 对导入后瞬时 `503 / no available accounts` 增加短暂 completion retry,避免宿主异步 probe / account warm-up 窗口把真实可用链路误记成 `broken`
|
||
- `20260522_122706_local_v0129_kimi_a7m_subscription_freshhost` 已证明:在修复后的 relay-manager + patched host 组合下,`kimi-a7m / kimi-k2.6` 可落到 `batch_status=succeeded`、`provider_status=active`、`latest_access_status=subscription_ready`
|
||
|
||
14. relay-manager latest-head 已补宿主升级后的 capability 自愈
|
||
|
||
- 对 `API returned 403: Forbidden` 这类 `/responses` 误判 advisory,控制面现在会在 access closure 与 reconcile rerun 中把目标 account 的 `openai_responses_supported` 修正为 `false`,随后重试 gateway `/v1/chat/completions`
|
||
- 这样即使宿主升级或异步 probe 把 capability 标记覆写错,控制面也能在“安装后确认”与“后台持续对账”两个环节重新拉回可用状态
|
||
|
||
15. 2026-05-24 本地质量门禁补丁已完成
|
||
|
||
- 新增 repo 级删除保护:`internal/store/sqlite/hosts_repo.go` 引入 `RuntimeDependencyCountsByHostID` 与 `HostDeleteBlocker`
|
||
- 新增回归测试:`TestHostsRepoDeleteByHostIDBlocksWhenRuntimeStateExists`、`TestBatchImportRejectsOversizedJSONBody`、`TestDecodeJSON/rejects oversized request body`
|
||
- `internal/app/http_api.go` 现已统一限制 JSON request body 大小,并把 host 删除占用态映射为 `host_in_use`
|
||
- `internal/app` / `internal/provision` / `internal/reconcile` 测试 SQLite 打开路径已改为串行 helper,当前 `go test -race ./... -count=1` 重新恢复为绿
|
||
|
||
## 已验证门禁
|
||
|
||
- `gofmt -l .` ✅ 空输出
|
||
- `go vet ./...` ✅
|
||
- `go test ./...` ✅
|
||
- `go test -race ./...` ✅
|
||
- `go test -cover ./internal/...` ✅
|
||
- `internal/access`: `80.5%`
|
||
- `internal/host/sub2api`: `78.1%`
|
||
- `internal/pack`: `73.9%`
|
||
- `internal/provision`: `79.4%`
|
||
- `internal/store/sqlite`: `77.9%`
|
||
- `go test ./tests/integration/... -count=1` ✅
|
||
- `bash ./scripts/test/test_real_host_scripts.sh` ✅
|
||
|
||
## 当前保留的最终证据
|
||
|
||
1. `artifacts/real-host-acceptance/20260520_222713_crm18100_live_model_mapping_validation`
|
||
- 证明 account `credentials.model_mapping` 与 live runtime 对齐
|
||
|
||
2. `artifacts/real-host-acceptance/20260521_142211_crm18100_deepseek_completion_split`
|
||
- 证明 host completion 失败与 upstream completion 成功可以分离
|
||
- 是 completion 分流逻辑的关键根因证据
|
||
|
||
3. `artifacts/real-host-acceptance/20260521_191418_remote43_minimax_key_import`
|
||
- MiniMax 53hk `subscription` 最终成功样本
|
||
- `21-summary.json` 已到 `batch_status=succeeded`、`provider_status=active`
|
||
|
||
4. `artifacts/real-host-acceptance/20260521_201509_remote43_deepseek_key_import`
|
||
- DeepSeek 2166 `subscription` 最终成功样本
|
||
- `21-summary.json` 已到 `batch_status=succeeded`、`provider_status=active`
|
||
|
||
5. `artifacts/real-host-acceptance/20260521_210403`
|
||
- latest-head `self_service` 标准 fresh-host 验收最终成功样本
|
||
- `05-import.json` = `succeeded/self_service_ready/active`
|
||
- `07-access-status.json` = `latest_access_status=fully_ready`
|
||
|
||
6. `artifacts/real-host-acceptance/20260521_222212_remote43_minimax-m2-7-official_key_import`
|
||
- official MiniMax 模板 live 样本
|
||
- 模板链路打通,但当前验证 key 命中 upstream `429`
|
||
|
||
7. `artifacts/real-host-acceptance/20260522_122706_local_v0129_kimi_a7m_subscription_freshhost`
|
||
- latest-head relay-manager 对 patched host `v0.1.129` 的 Kimi A7M `subscription` 最终成功样本
|
||
- `21-summary.json` 已到 `batch_status=succeeded`、`provider_status=active`
|
||
- `account_probe_summary` 明确记录 `probe_advisory=true`、`validation_status=warning`,证明 403 probe race 已被 relay-manager 正确降级
|
||
|
||
8. `artifacts/real-host-acceptance/20260523_144937_remote43_kimi-a7m_key_import`
|
||
- remote43 未改宿主 + 修正后的 latest-head 验收脚本样本
|
||
- 已证明脚本层的“错库取 key / 错地址 / 多 host 历史查询”问题被收掉
|
||
- 仍保留的真实阻塞是宿主 completion 路径 `502/503`
|
||
|
||
9. `artifacts/real-host-acceptance/20260525_local_v0129_kimi_a7m_from_hermes`
|
||
- 当前主 pack `1.1.1` 正式纳入 `kimi-a7m` 后的本地 fresh-host 验收样本
|
||
- `21-summary.json` 保留了 stock host `v0.1.129` 的原始失败快照,证明 Hermes A7M upstream key 当前在线可用,阻塞不在 key 本身
|
||
- `22-patched-host-validation.json` 与 `23-sub2api-host-patch-notes.md` 已补齐 patched host 的真实收敛证据:问题是宿主 runtime 的 `Responses -> raw chat` 兼容缺口,补丁后链路已回到 ready
|
||
|
||
## 剩余项(P2 / 运营前置,不阻塞按 PRD 首版范围上线)
|
||
|
||
1. 运营前置
|
||
- 真实宿主初始化不会自动创建普通用户;上线前必须显式创建普通用户并留存可复用凭据
|
||
- `self_service` 需要普通用户 key 绑定目标标准 group,且通常还需要可用余额
|
||
- `subscription` 需要 subscription 类型 group + 普通用户订阅分配 + key/group 绑定
|
||
- 若启用持续后台 reconcile,SQLite 状态库将持久化最新 access probe 元数据,部署时必须按 secret 级别保护数据库文件
|
||
|
||
2. 部署与环境限制
|
||
- 标准多阶段 Dockerfile 在受限网络环境下仍不稳
|
||
- 当前推荐 `scripts/deploy/build_local_image.sh` + `Dockerfile.local`
|
||
|
||
3. official provider 验证矩阵
|
||
- official MiniMax 当前 live 样本已证明模板链路可用,但验证 key 命中 upstream `429`
|
||
- Qwen / GLM / Kimi / Step 等官方 provider 是否通过 live 验收,仍取决于后续官方 key 与 quota
|
||
|
||
## 当前最短后续路径
|
||
|
||
1. 若继续扩大 provider 覆盖面,优先按 `docs/PROVIDER_VALIDATION_MATRIX.md` 补官方 key,再做 official live 验收
|
||
2. 若继续优化 shared fresh-host 信噪比,对历史残留资源做一次环境清理,降低 `reconcile=drifted` 噪音
|
||
3. 若继续产品化,优先扩大 official provider live 验收覆盖面,并基于新 create-run 入口补充真实宿主 acceptance artifact
|
||
|
||
## v2 规划:Batch Auto-Import(URL + Key)
|
||
|
||
**当前阶段**:✅ 已按基线计划恢复实现(T1~T13 已落地,create-run entry wiring 已补齐,最新全量验证通过)
|
||
|
||
**文档**:`docs/2026-05-21-BATCH_AUTO_IMPORT_SPEC.md`(需求规格)
|
||
**TDD 计划**:`docs/2026-05-21-BATCH_AUTO_IMPORT_TDD_PLAN.md`(实现路径,已确认开放问题)
|
||
**技术架构**:`docs/2026-05-22-BATCH_AUTO_IMPORT_V2_ARCHITECTURE.md`(运行态状态库、结果页、API、页面字段布局)
|
||
**Migration 草案**:`docs/2026-05-22-BATCH_AUTO_IMPORT_V2_MIGRATION_DRAFT.md`(SQLite 新表、索引、lease/retry 字段、legacy link)
|
||
**API Schema 细稿**:`docs/2026-05-22-BATCH_AUTO_IMPORT_V2_API_SCHEMAS.md`(run/item 响应结构、筛选参数、badge 文案、错误语义)
|
||
|
||
**本轮设计收敛**:
|
||
|
||
- 已把真实验收中的三类高频问题写入 v2 方案:
|
||
- 添加模型时的模型名归一化与纠错
|
||
- 第三方国产模型的兼容能力画像(`/responses`、`/chat/completions`、Anthropic compatible、stream/tools)
|
||
- 添加账号后的异步确认窗口(首次 `403` probe race、首次 `503 no available accounts` warm-up)
|
||
- 已补充两类产品化能力到 v2:
|
||
- run / item 状态持久化、retry 轨迹、控制面重启后的历史结果查看
|
||
- 批次列表页 / 批次详情页,用于查看模型纠错结果、账号状态、provider warning 与最终 access 状态
|
||
- 当前 v2 的目标已从“同步导入成功”升级为“导入 + 异步确认 + 最终闭环验真”
|
||
- 已按 review 收口最关键的 4 个冲突:
|
||
- 统一 canonical contract:`run_id/item_id/provider_id/run.state/confirmation_status/access_status`
|
||
- 补齐 `subscription` / `self_service` 的输入契约
|
||
- 明确 V2 唯一状态源为 `import_runs/import_run_items/import_run_item_events`
|
||
- 明确 `ConfirmationWorker + lease + next_retry_at` 的异步确认执行机制
|
||
- 其余 review 问题也已同步收口:
|
||
- capability 从 upstream 总画像升级为 transport + model profiles
|
||
- 结果页字段、状态库存储字段、retry/event trail 已统一
|
||
- run 级请求上下文已持久化到 `import_runs`,控制面重启后 validate 能继续使用 `host_id / subscription_users / subscription_days / probe_api_key`
|
||
- OpenAPI 已补齐 `/api/batch-import/runs*`,legacy `/api/import-batches/*` 降级为 v1/legacy
|
||
- run/item 列表 API 已补齐 `cursor/next_cursor`;run 列表 `q` 可命中 `run_id / provider_id / base_url`
|
||
- 已补充重复导入自动复用策略:按 `provider_id + api_key_fingerprint + canonical_model_family` 判断 `reused / patch_only / replace`
|
||
- 已补充同模型别名归一化契约:例如 `kimi 2.6 / kimi-2.6 / kimi-k2.6` 可归并到同一模型家族并快速复用
|
||
- 已补充多账号重复导入与弃用账号再启用策略:active 账号提示“重复已启用”,disabled/deprecated 账号显示原状态并走 `reactivated` 快速启用路径
|
||
- 已修正 access closure artifact 的 probe key 语义漂移:`subscription` 现在持久化 `requested_probe_api_key`、`effective_probe_key_source`、`effective_probe_key_fingerprint`,不再把外部 raw key 伪装成 `probe_api_key`;`probe_api_key` 仅继续用于 `self_service` 向后兼容
|
||
- 最新干净本地 fresh-host 验收 `artifacts/real-host-acceptance/20260523_local_clean_minimax_subscription_probe_semantics` 已验证:
|
||
- `subscription` closure 会正确区分 `requested_probe_api_key` 与 `managed_subscription` 实际 probe 来源
|
||
- 同一轮 raw key 直打宿主仍返回 `403 not assigned to any group`
|
||
- provider completion 仍受 MiniMax 官方 upstream `429 rate_limit_error` 影响,但这已不再会被 artifact 误读成 raw key 可用
|
||
- 同一 fresh-host 上继续补的 MiniMax `M2.5` 缩圈验证已证明:
|
||
- `artifacts/real-host-acceptance/20260523_local_clean_minimax_m25_only_probe`:单独只打 `M2.5` 时,宿主会选中真实账号并命中 upstream `429`
|
||
- `artifacts/real-host-acceptance/20260523_local_clean_minimax_m25_repeated_probe`:连续第 1 次 `M2.5` 是 `429`,后续第 2/3 次退化成 `503 Service temporarily unavailable`
|
||
- 对应宿主日志中,第一次有 `account_id=1` 和 `upstream_status=429`,后续只有 `account_select_failed error=\"no available accounts\"`
|
||
- 当前 MiniMax live 阻断要按两层解释:第一次是 upstream quota/rate-limit,后续 `503` 是唯一账号进入临时不可调度窗口后的宿主侧结果
|
||
|
||
**本轮实现状态(T1 ~ T13)**:
|
||
|
||
- [x] `internal/batch` canonical types / reuse policy / service / confirmation / validation / projection 已落地
|
||
- [x] `internal/probe` models / alias / capability / smoke completion 已落地
|
||
- [x] `internal/store/sqlite` run/item/event runtime state repo 与 migration 已落地
|
||
- [x] `/api/batch-import/runs*` 路由、projection 读取、CLI `batch-import` 命令、集成测试与设计还原审计已落地
|
||
- [x] `go test ./... -count=1`
|
||
- [x] `go test ./tests/integration/... -count=1`
|
||
- [x] `go test -cover ./internal/... -count=1`
|
||
- [x] `go vet ./...`
|
||
- [x] `gofmt -l .`
|
||
|
||
**T13 审计结论**:
|
||
|
||
- `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_RESTORATION_CHECKLIST.md` 已完成
|
||
- latest-head 已补齐 `internal/app/http_batch_import.go` -> `internal/app/batch_runtime.go` 的 create-run 入口 wiring
|
||
- API 与 CLI create-run 现在都会真实驱动 `BatchImportService + ConfirmationWorker + ValidationService`
|
||
- 控制面 server 启动后会自动运行 batch-import background scheduler,`running` run 在重启后可继续推进
|
||
- 最新一轮验证结果保持全绿:`go test ./... -count=1`、`go test ./tests/integration/... -count=1`、`go test -cover ./internal/... -count=1`、`go vet ./...`、`gofmt -l .`
|
||
|
||
**真实 Gate**:✅ 文档、状态机、投影、测试、审计与 create-run 入口已经对齐,**V2 设计已按基线计划交付**
|
||
|
||
---
|
||
|
||
## 禁止错误结论
|
||
|
||
- ❌ 历史失败 artifact ≠ 当前 latest-head 仍失败
|
||
- ❌ capability probe 无副作用 ≠ 所有宿主版本都已真实兼容
|
||
- ❌ rollback-provider 已改安全路径 ≠ 历史脏资源自动消失
|
||
- ❌ `HTTP 200` ≠ 宿主初始化会自动准备普通用户/订阅/余额;这些仍是显式运营前置
|
||
|
||
## 2026-05-30 已完成 false-negative 状态语义收口
|
||
|
||
**代码提交**:`15b7437e feat(status): suppress false negative provider readiness`
|
||
|
||
**目标**:把 `provider_status / provider_accounts.last_probe_status` 的判定语义收口到真实用户数据面,不再把“探测失败但真实 access 已 ready”的样本继续标成 `degraded` / `broken`
|
||
|
||
**本地改动**:
|
||
|
||
- `internal/provision/import_service.go`
|
||
- partial import 现在只保留 `batch_status=partially_succeeded`
|
||
- 只要 gateway access closure 已 ready,就保持 `provider_status=active`
|
||
- 只有真实 `GatewayAccessReady=false` 时才降为 `provider_status=degraded`
|
||
- `internal/provision/provider_status_service.go`
|
||
- `deriveProviderStatus()` 现在会把 `partial + access ready` 直接提升为 `active`
|
||
- 不再要求额外出现 `reconcile_status=active` 才恢复 provider 级状态
|
||
- `internal/store/sqlite/provider_accounts_sync.go`
|
||
- 解析 `probe_summary_json.validation_status / probe_advisory / smoke_model_seen`
|
||
- 单帐号、`access_status in {subscription_ready,self_service_ready,fully_ready}`、且 `smoke_model_seen=true` 的 false-negative 场景下:
|
||
- `provider_accounts.account_status` 提升为 `active`
|
||
- `provider_accounts.last_probe_status` 规范为 `gateway_ready`
|
||
|
||
**新增回归测试**:
|
||
|
||
- `TestImportServiceKeepsProviderActiveWhenGatewayReadyDespiteSingleAccountProbeFailure`
|
||
- `TestProviderStatusServicePromotesReadyPartialBatchWithoutReconcile`
|
||
- `TestSyncProviderAccountsFromImportBatchPromotesSingleReadyGatewayAccount`
|
||
|
||
**本地质量门禁**:
|
||
|
||
- `gofmt -l .`
|
||
- `go vet ./...`
|
||
- `go test -cover ./internal/...`
|
||
- `internal/provision` = `80.4%`
|
||
- `internal/access` = `84.3%`
|
||
- `internal/pack` = `75.7%`
|
||
- `go test ./tests/integration/... -count=1`
|
||
|
||
**remote43 部署**:
|
||
|
||
- fixed checkout: `/home/ubuntu/sub2api-cn-relay-manager-git-current`
|
||
- app dir: `/home/ubuntu/sub2api-kimi-patched-auto2-20260525_18169`
|
||
- running repo head: `15b7437e`
|
||
- `http://127.0.0.1:18173/healthz` => `ok`
|
||
|
||
**remote43 真实回读验证**:
|
||
|
||
- provider status:
|
||
- `GET /api/providers/minimax-53hk/status?pack_id=openai-cn-pack&host_id=proxy-real-host-1780026133`
|
||
- 返回:
|
||
- `batch.batch_status=partially_succeeded`
|
||
- `latest_access_status=subscription_ready`
|
||
- `provider_status=active`
|
||
- provider account inventory:
|
||
- `GET /api/provider-accounts?provider_id=minimax-53hk`
|
||
- 样本 `provider_account.id=19`
|
||
- 返回:
|
||
- `host_account_id=11`
|
||
- `account_status=active`
|
||
- `last_probe_status=gateway_ready`
|
||
- `binding_state=unassigned`
|
||
|
||
**结论**:
|
||
|
||
- 这轮已经收掉“真实用户路径可用,但 provider/account 仍显示失败”的核心 false-negative 噪音
|
||
- 当前 `minimax-53hk` 样本现在能同时满足:
|
||
- import batch 仍保留真实 `partially_succeeded`
|
||
- provider 级状态与真实 access closure 一致,显示 `active`
|
||
- account inventory 不再错误显示 `broken/failed`
|
||
|
||
## 2026-05-30 已沉淀项目技能模板
|
||
|
||
**目标**:把本项目后半程最有效的做法沉淀成可复用 skills,减少后续任务中的虚假完成、控制面/数据面割裂和状态误报问题
|
||
|
||
**新增项目内 skills**:
|
||
|
||
- `.agent/skills/remote-truth-closure`
|
||
- 约束 `本地门禁 -> 提交推送 -> 远端部署 -> 真实验证 -> 执行板落盘` 全链路闭环
|
||
- `.agent/skills/routing-data-plane-e2e`
|
||
- 约束 `logical_group / route / shadow_group / managed key / host usage_logs` 的控制面 + 数据面验收
|
||
- `.agent/skills/false-negative-status-triage`
|
||
- 约束 `provider_status / account_status / real data plane` 信号冲突时的分层诊断与语义修复
|
||
|
||
**配套总结文档**:
|
||
|
||
- `docs/2026-05-30-PROJECT_LEARNINGS_AND_LOCAL_SKILLS.md`
|
||
|
||
**本次沉淀固定的关键经验**:
|
||
|
||
- 本地测试通过不能代表任务完成,必须以远端真实验证收口
|
||
- public alias 尽量留在插件层,stock host 的 shadow provider 优先承载 canonical model
|
||
- 最终证据优先级固定为:真实数据面 > host usage logs > route logs > provider/inventory 投影 > probe-only 信号
|
||
- false-negative 应通过修正状态语义解决,而不是简单隐藏失败
|