docs(review): finalize remediation closure confirmation
@@ -0,0 +1,795 @@
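
# 2026-04-20 Remediation Task List from Real-Environment Validation

## Purpose

Merge the following two real-environment validation reports into one actionable remediation task list, broken down by `P0 / P1 / P2` priority, module owner, dependency order, verification commands, and completion criteria, so that each task can be executed directly:

- [REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md](/home/long/project/立交桥/review/REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md)
- [API_MATRIX_VALIDATION_REPORT_2026-04-20.md](/home/long/project/立交桥/review/API_MATRIX_VALIDATION_REPORT_2026-04-20.md)

This task list covers **confirmed defects only**. The following are not remediation targets:

- `gateway` endpoints that currently pass on the main path
- The `supply-api` alerting path
- The fail-closed behavior in which withdrawal returns `503` because SMS is not ready
- Issues outside the repository's business code, such as whitespace formatting problems in temporary seed data
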
## Execution Closure Status

As of the close of this remediation round on 2026-04-20, all `13` tasks listed in this task list have been completed and committed individually to the current branch.

The corresponding commits are:

- `414ecbb` `P0-TR-01`
- `9dba094` `P0-SA-01`
- `50f0cc8` `P0-SA-02`
- `00ff636` `P0-SA-03`
- `1c088e2` `P0-SA-04`
- `319d9e1` `P1-AUD-01`
- `5661696` `P1-IAM-01`
- `a109a68` `P1-IAM-02`
- `79d9b87` `P1-IAM-03`
- `a1555c0` `P1-IAM-04`
- `eab029a` `P2-API-01`
- `b879906` `P2-QA-01`

Final acceptance was actually run and passed:

- `bash scripts/ci/repo_integrity_check.sh`
- `bash scripts/ci/supply_domain_stability_check.sh 20`
- `git diff --check`

Notes:

- This document keeps the original "task breakdown" structure as the baseline for remediation execution.
- Whether an item is currently fixed is determined by the commits above and the final acceptance results.

## 2026-04-20 Re-Review Confirmation

As of the end of this full re-review on 2026-04-20, the `13` confirmed issues covered by this task list have been verified again against the "whole-repo baseline + focused regression + repository integration" standard. No unfixed items remain.

Key commands actually executed and passed in this re-review:

- `bash scripts/ci/repo_integrity_check.sh`
- `bash scripts/ci/supply_domain_stability_check.sh 20`
- `cd "/home/long/project/立交桥/platform-token-runtime" && go test -count=1 ./internal/auth/service -run 'Test(PostgresRuntimeStore_SavePreservesExistingFingerprintWhenAccessTokenMissing|InMemoryTokenRuntimeWithPostgresStore_RefreshAndRevokePersistLifecycle)$' -v`
- `cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/httpapi -run 'TestSupplyAPI_(ActivateAccount_ConcurrencyConflict|PublishPackage_ConcurrencyConflict|ClonePackage_UnexpectedCreateFailureReturnsInternalServerError|CancelSettlement_ConcurrencyConflict|ActivateAccount_NotFound|PublishPackage_NotFound|ClonePackage_NotFound|CancelSettlement_NotFound)$' -v`
- `cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/iam/... -run 'Test.*(AssignRole|RevokeRole|ListRoles|GetUserRoles|UpdateRole)' -v`
- `cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/domain ./internal/httpapi -run 'Test.*(Activate|Suspend|Delete|Publish|Pause|Unlist|Clone|Cancel)' -v`
- `cd "/home/long/project/立交桥/supply-api" && bash scripts/run_integration_tests.sh ./internal/iam/repository`
- `cd "/home/long/project/立交桥/supply-api" && bash scripts/run_integration_tests.sh ./internal/audit/...`

Re-review conclusions:

- All `13` issues under `P0 / P1 / P2` in this task list are genuinely resolved.
- The PostgreSQL-backed `refresh/revoke` failure in `platform-token-runtime` no longer reproduces.
- The `supply-api` defects in the idempotency lock, package creation, account/package state transitions, audit contract, IAM DDL, and DB-backed IAM no longer reproduce.
- `supply-api/internal/domain` passed `20` consecutive uncached runs; the earlier one-off instability signal did not reproduce in this round.

Items outside this task list that still need separate governance but are **not part of the 13 items closed here**:

- Structured logging is not yet fully unified across the three service entry points. `supply-api` already uses a structured JSON logging entry point, see [main.go](/home/long/project/立交桥/supply-api/cmd/supply-api/main.go); but [main.go](/home/long/project/立交桥/gateway/cmd/gateway/main.go) and [main.go](/home/long/project/立交桥/platform-token-runtime/cmd/platform-token-runtime/main.go) still use the standard library `log`. This is a new governance item and must not be recorded as "the 13 items in this task list are still open".

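## Scope Conclusion

At the time this task list was generated, the issues confirmed by real-environment validation fall into 13 items:

### P0

1. `platform-token-runtime` PostgreSQL-backed `refresh` / `revoke` fails
2. The `supply-api` idempotency lock `ON CONFLICT` does not match the table structure
3. The `supply-api` package creation SQL has the wrong number of placeholders
4. `supply-api` account activate / suspend has a deterministic optimistic-locking error
5. `supply-api` package publish / pause / unlist has a deterministic optimistic-locking error
6. `supply-api` package reads map fields incorrectly, which amplifies subsequent update failures

### P1

7. The `supply-api` audit repository is inconsistent with the `audit_events` table structure, so both reads and writes fail
8. The original `IAM` DDL cannot be applied to a clean database
9. `IAM` role listing and user-role queries scan nullable fields unsafely
10. `IAM` role update writes an empty string into an `INET` column
11. `IAM` role assignment does not pass `granted_by`, triggering a foreign-key failure

### P2

12. Several `supply-api` handlers return the wrong HTTP semantics for internal errors, so real conflicts are wrapped as `404`
13. `supply-api/internal/domain` showed one unreproduced instability signal that needs a dedicated re-check and stress run to close out
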
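## Coverage Check

The mapping below ensures that **every confirmed defect** in the two original reports has landed in a remediation task, leaving no gap where something is "written in a report but not broken out in the task list":

| Confirmed defect | Task |
| --- | --- |
| `platform-token-runtime` PostgreSQL-backed `refresh` / `revoke` fails | `P0-TR-01` |
| `supply-api` account creation is blocked by the idempotency lock DDL/SQL contract | `P0-SA-01` |
| `supply-api` package create / clone is blocked by an `INSERT` column-count mismatch | `P0-SA-02` |
| `supply-api` account `activate` / `suspend` optimistic-locking error | `P0-SA-03` |
| `supply-api` package `publish` / `pause` / `unlist` optimistic-locking error | `P0-SA-04` |
| `supply-api` package read field-mapping error | `P0-SA-04` |
| `supply-api` audit writes and queries are affected by the `audit_events` contract mismatch | `P1-AUD-01` |
| The original `IAM` DDL cannot initialize a clean database | `P1-IAM-01` |
| `IAM` role list / user-role query null scan | `P1-IAM-02` |
| `IAM` role update writes an empty string into `INET` | `P1-IAM-03` |
| `IAM` role assignment lacks `granted_by`, triggering a foreign-key failure | `P1-IAM-04` |
| Several `supply-api` handlers return wrong error semantics | `P2-API-01` |
| One-off instability signal in `supply-api/internal/domain` | `P2-QA-01` |

Notes:

- No other confirmed `gateway` defects from the two reports are in remediation scope; `gateway` currently acts as the integration party for final acceptance.
- `gateway` has no direct remediation task, but it must be included in final acceptance.
- `platform-token-runtime` has no new defects on the query path; remediation focuses on the mutation path only.
- The withdrawal `503` is a conditional closure, not a defect to fix, so it is not broken out as a task.
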
## Owner View

| Owner | Modules covered | Responsibilities |
| --- | --- | --- |
| `Token Runtime Owner` | `platform-token-runtime` | Token lifecycle write path, PostgreSQL store, consistency regression |
| `Supply Domain/Repository Owner` | `supply-api` | Idempotency, accounts, packages, settlement reads/writes, repository and domain consistency |
| `Audit/IAM/SQL Owner` | `supply-api/sql`, `internal/audit`, `internal/iam` | DDL, audit table contract, IAM schema, DB-backed IAM |
| `QA/CI Owner` | Verification scripts, matrix regression | Real reproduction, uncached regression, environment acceptance, instability re-check |
| `Gateway Owner` | `gateway` | Final integration acceptance; no direct defect-fix task |

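## Execution Rules

1. No `P1` code fix starts until every `P0` task is complete.
2. Every task must first write or extend a test that reproduces the defect reliably, and only then fix the code.
3. Every task must run its focused regression and at least one cross-module regression.
4. All Go tests use `go test -count=1` by default.
5. Any issue involving a PostgreSQL contract must fix all of the following together:
   - Repository code
   - Baseline schema
   - Clean-database initialization path
   - Upgrade path for existing databases
6. After any endpoint-level defect is fixed, add a real HTTP or integration test so the fix does not merely "look right" at the unit level.
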
## Dependency Order

1. `P0-TR-01` Fix token runtime `refresh/revoke`
2. `P0-SA-01` Fix the idempotency DDL/repository contract
3. `P0-SA-02` Fix the package create SQL
4. `P0-SA-03` Fix account lifecycle optimistic locking
5. `P0-SA-04` Fix package lifecycle optimistic locking and field mapping
6. `P1-AUD-01` Unify the `audit_events` contract
7. `P1-IAM-01` Fix IAM DDL initialization
8. `P1-IAM-02` Fix IAM null scan
9. `P1-IAM-03` Fix the IAM update-role `INET` write
10. `P1-IAM-04` Fix IAM assign-role `granted_by`
11. `P2-API-01` Unify handler error semantics
12. `P2-QA-01` Handle the `internal/domain` instability signal
13. Run final acceptance

---

## P0 Tasks

### P0-TR-01 Fix PostgreSQL `refresh` / `revoke` in `platform-token-runtime`

**Owner:** `Token Runtime Owner`

**Goals:**

- Make `refresh` and `revoke` return `200` in PostgreSQL-backed mode
- Persist token state changes to the database
- Make `gateway` reject a token after it has been revoked

**Root cause:**

- [postgres_runtime_store.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/postgres_runtime_store.go#L73) uses `NULLIF($2, '')` in the `INSERT` stage
- [inmemory_runtime.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/inmemory_runtime.go#L164) and [inmemory_runtime.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/inmemory_runtime.go#L192) call `Save` on refresh/revoke without passing the access token again

**Suggested changes:**

1. First add focused tests for `postgres_runtime_store`:
   - On refresh, the old `token_fingerprint` is preserved even though the access token is not re-sent
   - On revoke, persistence succeeds even though the access token is not re-sent
2. Pick one of two options, and keep only one set of semantics:
   - Option A: on the PostgreSQL path, `Save` first queries the existing record and fills in `token_fingerprint`
   - Option B: the `Refresh` / `Revoke` paths explicitly pass back the original token or the original fingerprint
3. Add HTTP integration tests:
   - `issue -> refresh -> introspect`
   - `issue -> revoke -> introspect`
   - `issue -> revoke -> gateway /v1/chat/completions` should return `401`

**Files involved:**

- `platform-token-runtime/internal/auth/service/postgres_runtime_store.go`
- `platform-token-runtime/internal/auth/service/postgres_runtime_store_test.go`
- `platform-token-runtime/internal/auth/service/inmemory_runtime.go`
- `platform-token-runtime/internal/httpapi/token_api_test.go`
- If necessary: `platform-token-runtime/internal/app/bootstrap_test.go`

**Completion criteria:**

- `/api/v1/platform/tokens/{token_id}/refresh` returns `200`
- `/api/v1/platform/tokens/{token_id}/revoke` returns `200`
- Token state actually changes in PostgreSQL
- `gateway` denies access for revoked tokens

**Verification commands:**

```bash
cd "/home/long/project/立交桥/platform-token-runtime" && go test -count=1 ./internal/auth/service ./internal/httpapi ./internal/app -v
cd "/home/long/project/立交桥/platform-token-runtime" && go test -count=1 ./...
```

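
Option A above can be sketched with an in-memory stand-in for the store. This is a minimal illustration, not the repository's implementation: `TokenRecord`, `Store`, and the SHA-256 fingerprint are assumptions made for the example, and only the rule "an empty access token must not erase the stored fingerprint" comes from the task.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// TokenRecord is a hypothetical, simplified token row.
type TokenRecord struct {
	TokenID     string
	Status      string
	Fingerprint string // stands in for the token_fingerprint column
}

// Store is an in-memory stand-in for the PostgreSQL-backed store.
type Store struct {
	rows map[string]TokenRecord
}

func NewStore() *Store { return &Store{rows: map[string]TokenRecord{}} }

// fingerprint hashes the access token; the hash choice is an assumption.
func fingerprint(accessToken string) string {
	sum := sha256.Sum256([]byte(accessToken))
	return hex.EncodeToString(sum[:])
}

// Save upserts a record. When accessToken is empty (as on refresh/revoke),
// it keeps the fingerprint already stored instead of replacing it.
func (s *Store) Save(tokenID, status, accessToken string) error {
	prev, exists := s.rows[tokenID]
	var fp string
	switch {
	case accessToken != "":
		fp = fingerprint(accessToken)
	case exists:
		fp = prev.Fingerprint
	default:
		return fmt.Errorf("token %s: no access token and no stored fingerprint", tokenID)
	}
	s.rows[tokenID] = TokenRecord{TokenID: tokenID, Status: status, Fingerprint: fp}
	return nil
}

func main() {
	s := NewStore()
	_ = s.Save("tok-1", "active", "secret-access-token")
	before := s.rows["tok-1"].Fingerprint
	_ = s.Save("tok-1", "revoked", "") // revoke without re-sending the token
	after := s.rows["tok-1"]
	fmt.Println(after.Status, after.Fingerprint == before)
}
```

Option B would instead keep `Save` strict and make the callers responsible for supplying the fingerprint; the sketch only covers the store-side variant.
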
### P0-SA-01 Fix the `supply-api` idempotency lock DDL/repository contract

**Owner:** `Supply Domain/Repository Owner`

**Goals:**

- Make `POST /api/v1/supply/accounts` stop returning `500` because idempotency lock initialization fails

**Root cause:**

- [idempotency.go](/home/long/project/立交桥/supply-api/internal/repository/idempotency.go#L217) relies on `ON CONFLICT` over `(tenant_id, operator_id, api_path, idempotency_key)`
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L110) currently has no matching unique constraint

**Suggested changes:**

1. First add repository integration tests:
   - The first `AcquireLock` succeeds
   - A duplicate key that has not expired follows conflict semantics
   - The lock can be re-acquired after expiry
2. Complete the database contract, including at least:
   - A unique index / unique constraint in the baseline schema
   - An upgrade script for existing databases
   - A compatible implementation for the partitioned case
3. Correct the SQL in `AcquireLock` and `SaveResponse`, and confirm that idempotency records are not invalidated by partitioning or expiry logic
4. Add an HTTP integration test:
   - Create an account twice in a row with the same `Idempotency-Key`

**Files involved:**

- `supply-api/internal/repository/idempotency.go`
- `supply-api/internal/repository/idempotency_test.go`
- `supply-api/sql/postgresql/partition_strategy_v1.sql`
- A new forward-only upgrade script, placed in the current SQL baseline directory
- If needed: `supply-api/internal/middleware/idempotency_middleware_test.go`

**Completion criteria:**

- Account creation no longer reports `IDEMPOTENCY_LOCK_FAILED`
- The idempotency lock works on both a clean database and an upgraded database

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/repository -run 'Test.*Idempotency' -v
cd "/home/long/project/立交桥/supply-api" && bash scripts/run_integration_tests.sh ./internal/repository
```

### P0-SA-02 Fix the `supply-api` package creation SQL

**Owner:** `Supply Domain/Repository Owner`

**Goals:**

- Restore `POST /api/v1/supply/packages/draft`
- Restore the internal create path that clone depends on

**Root cause:**

- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L27) lists 29 target columns
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L37) has only 28 placeholders

**Suggested changes:**

1. First add integration tests for `PackageRepository.Create`:
   - A direct create succeeds
   - A clone that goes through create indirectly also succeeds
2. Fix the column count, placeholders, and argument order
3. Re-check that the persisted columns for `request_id`, `audit_trace_id`, and `created_at/updated_at` match the query layer
4. Add HTTP tests:
   - `POST /api/v1/supply/packages/draft`
   - `POST /api/v1/supply/packages/{package_id}/clone`

**Files involved:**

- `supply-api/internal/repository/package.go`
- `supply-api/internal/repository/package_test.go`
- `supply-api/internal/domain/package_test.go`
- `supply-api/internal/httpapi/supply_api_test.go`

**Completion criteria:**

- Draft creation returns `201`
- Clone returns `201`

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/repository ./internal/domain ./internal/httpapi -run 'Test.*Package' -v
```

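
One way to keep the column and placeholder counts from drifting apart again is to derive the placeholders from the column list. The helpers below (`buildInsert`, `checkArgs`) are hypothetical names for illustration and the column list is a three-column subset, not the real 29-column statement.

```go
package main

import (
	"fmt"
	"strings"
)

// buildInsert derives the placeholder list from the column list, so the
// two counts cannot diverge.
func buildInsert(table string, columns []string) string {
	placeholders := make([]string, len(columns))
	for i := range columns {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		table, strings.Join(columns, ", "), strings.Join(placeholders, ", "))
}

// checkArgs is a call-site guard: the argument count must match the
// column count before the statement is executed.
func checkArgs(columns []string, args []any) error {
	if len(columns) != len(args) {
		return fmt.Errorf("column/argument mismatch: %d columns, %d args", len(columns), len(args))
	}
	return nil
}

func main() {
	cols := []string{"id", "supply_account_id", "user_id"} // illustrative subset
	fmt.Println(buildInsert("supply_packages", cols))
	fmt.Println(checkArgs(cols, []any{1, 2})) // one argument short
}
```

This does not replace the integration test in step 1; it only removes one class of hand-counting mistakes from a hand-written statement.
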
### P0-SA-03 Fix the optimistic-locking error in `supply-api` account state transitions

**Owner:** `Supply Domain/Repository Owner`

**Goals:**

- Make `activate` / `suspend` return `200` for valid samples

**Root cause:**

- The domain layer does `Version++` first at [account.go](/home/long/project/立交桥/supply-api/internal/domain/account.go#L221)
- The DB adapter then passes the already-incremented version as `expectedVersion` at [adapter.go](/home/long/project/立交桥/supply-api/internal/adapter/adapter.go#L150)
- The repository then applies `expectedVersion + 1` again at [account.go](/home/long/project/立交桥/supply-api/internal/repository/account.go#L143)

**Suggested changes:**

1. Define who owns the version and keep exactly one increment:
   - Recommended: the domain layer only expresses the state change and does not modify `Version` directly
   - The repository is responsible for `expectedVersion -> newVersion`
2. Add integration tests for the DB-backed AccountStore:
   - Activate pending/suspended
   - Suspend active
   - A real conflict still returns a genuine concurrency error
3. Add HTTP integration tests:
   - `POST /activate`
   - `POST /suspend`

**Files involved:**

- `supply-api/internal/domain/account.go`
- `supply-api/internal/domain/account_test.go`
- `supply-api/internal/adapter/adapter.go`
- `supply-api/internal/repository/account.go`
- `supply-api/internal/httpapi/supply_api_test.go`

**Completion criteria:**

- Normal state transitions return `200`
- Conflicts are no longer falsely reported on the normal path

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/domain ./internal/repository ./internal/httpapi -run 'Test.*Account' -v
```

### P0-SA-04 Fix `supply-api` package state transitions and field mapping

**Owner:** `Supply Domain/Repository Owner`

**Goals:**

- Make `publish` / `pause` / `unlist` return `200`
- Correct the field mapping in `PackageRepository.GetByID`

**Root cause:**

- The same double `Version++` as in the account flow
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L73) selects the columns `supply_account_id, user_id`
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L88) scans them into `pkg.SupplierID, pkg.AccountID`, so the two are swapped

**Suggested changes:**

1. First add a `GetByID` repository test that pins down:
   - `SupplierID == user_id`
   - `AccountID == supply_account_id`
2. Remove the double version increment between the domain layer and the adapter
3. Add state transition tests:
   - draft -> publish
   - active -> pause
   - paused -> unlist

**Files involved:**

- `supply-api/internal/repository/package.go`
- `supply-api/internal/repository/package_test.go`
- `supply-api/internal/domain/package.go`
- `supply-api/internal/domain/package_test.go`
- `supply-api/internal/adapter/adapter.go`
- `supply-api/internal/httpapi/supply_api_test.go`

**Completion criteria:**

- All three state transition endpoints return `200`
- The domain object returned by `GetByID` has semantically correct fields

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/domain ./internal/repository ./internal/httpapi -run 'Test.*Package' -v
```

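
The field swap comes from positional scanning: destinations must follow the column order of the `SELECT`. A minimal sketch, assuming a trimmed-down `Package` struct with `int64` IDs and a hypothetical `scan` helper that imitates a row scan:

```go
package main

import "fmt"

// Package is a hypothetical, trimmed-down domain object.
type Package struct {
	ID         int64
	SupplierID int64 // should hold user_id
	AccountID  int64 // should hold supply_account_id
}

// scan copies column values into destinations by position, the way a
// SQL row scan does.
func scan(values []int64, dest ...*int64) error {
	if len(values) != len(dest) {
		return fmt.Errorf("expected %d destinations, got %d", len(values), len(dest))
	}
	for i, v := range values {
		*dest[i] = v
	}
	return nil
}

func main() {
	// Column order: id, supply_account_id, user_id.
	rowValues := []int64{10, 200, 3000}

	var pkg Package
	// Destinations follow the column order, so AccountID precedes SupplierID.
	if err := scan(rowValues, &pkg.ID, &pkg.AccountID, &pkg.SupplierID); err != nil {
		panic(err)
	}
	fmt.Println(pkg.AccountID == 200, pkg.SupplierID == 3000)
}
```
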

---

## P1 Tasks

### P1-AUD-01 Unify the `audit_events` table structure and the audit repository contract

**Owner:** `Audit/IAM/SQL Owner`

**Goals:**

- Fix audit write failures
- Fix audit event query failures
- Fix account audit log query failures

**Root cause:**

- The repository reads and writes the full field set shown at [audit_repository.go](/home/long/project/立交桥/supply-api/internal/audit/repository/audit_repository.go#L333)
- The `audit_events` parent table in the current [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L7) defines only a minimal set of columns

**Decision required:**

Before this task starts, make one explicit decision between exactly two options:

1. **Extend the schema to match the repository contract**
2. **Shrink the repository's read/write fields to match the minimal schema**

Continuing in a state where "neither side is authoritative" is not allowed.

**Recommended direction:**

- Extend the schema to match the repository contract. The code already reads and writes the additional fields on several paths, so shrinking the repository would touch more code.

**Suggested changes:**

1. List every column that `audit_repository.go` actually depends on
2. Update the `audit_events` parent table definition
3. Update the partition creation function so new partitions inherit the full column set
4. Add a migration script for existing databases
5. Add two kinds of tests:
   - Audit write succeeds
   - `GetByEventID` / `QueryWithTotal` succeed

**Files involved:**

- `supply-api/internal/audit/repository/audit_repository.go`
- `supply-api/internal/audit/repository/audit_repository_test.go`
- `supply-api/internal/audit/postgres_audit_store.go`
- `supply-api/sql/postgresql/partition_strategy_v1.sql`
- A new forward-only migration script
- If needed: `supply-api/internal/httpapi/supply_api_test.go`

**Completion criteria:**

- Audit writes no longer report a missing `trace_id` column
- `GET /api/v1/audit/events/{event_id}` returns `200`
- `GET /api/v1/supply/accounts/{account_id}/audit-logs` returns `200`

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/audit/... ./internal/httpapi -run 'Test.*Audit' -v
```

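
Step 1 (listing the columns the repository depends on) pairs naturally with a mechanical comparison against the schema. The sketch below is illustrative only: `missingColumns` is a hypothetical helper and the two column lists are examples, with `trace_id` and `span_id` taken from the validation report and the rest assumed.

```go
package main

import (
	"fmt"
	"sort"
)

// missingColumns returns the columns the repository needs that the
// schema does not define, sorted for stable output.
func missingColumns(required, schema []string) []string {
	have := make(map[string]bool, len(schema))
	for _, c := range schema {
		have[c] = true
	}
	var missing []string
	for _, c := range required {
		if !have[c] {
			missing = append(missing, c)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Illustrative lists; the real ones come from audit_repository.go
	// and the audit_events DDL.
	required := []string{"event_id", "tenant_id", "trace_id", "span_id"}
	schema := []string{"event_id", "tenant_id"}
	fmt.Println(missingColumns(required, schema))
}
```

In a real check the schema list would be read from the database catalog of a freshly initialized instance, so the same test covers both the baseline DDL and the migration script.
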
### P1-IAM-01 Fix the initialization failure of the original IAM DDL

**Owner:** `Audit/IAM/SQL Owner`

**Goals:**

- Make [iam_schema_v1.sql](/home/long/project/立交桥/sql/postgresql/iam_schema_v1.sql) apply directly to a clean database

**Root cause:**

- The default scope `'*'` is incompatible with `chk_scope_code_format`

**Suggested changes:**

1. Decide whether `'*'` should remain a reserved legal value
2. Pick one of two options:
   - Relax the constraint to allow `'*'`
   - Replace the default seed value with a compliant code, for example `all`
3. Update in sync:
   - The schema
   - The seed data
   - Related tests and documentation

**Files involved:**

- `sql/postgresql/iam_schema_v1.sql`
- Related IAM schema tests
- Related documentation

**Completion criteria:**

- `iam_schema_v1.sql` runs successfully on a clean database

**Verification commands:**

```bash
psql "<test_dsn>" -v ON_ERROR_STOP=1 -f "/home/long/project/立交桥/sql/postgresql/iam_schema_v1.sql"
```

### P1-IAM-02 Fix the IAM null scan problem

**Owner:** `Audit/IAM/SQL Owner`

**Goals:**

Make the following endpoints return normally with real data:

- `GET /api/v1/iam/roles`
- `GET /api/v1/iam/users/{user_id}/roles`

**Root cause:**

- Nullable fields such as `request_id` and `granted_by` are scanned directly into non-nullable basic types

**Suggested changes:**

1. Change repository scanning to use `sql.Null*`, pointers, or intermediate variables
2. Decide which model fields should allow null values
3. Add integration tests:
   - A role record with `request_id IS NULL`
   - A user-role record with `granted_by IS NULL`

**Files involved:**

- `supply-api/internal/iam/repository/iam_repository.go`
- `supply-api/internal/iam/repository/iam_repository_test.go`
- If needed: `supply-api/internal/iam/model/*.go`

**Completion criteria:**

- Both the role list and the user-role list return `200`

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/iam/... -run 'Test.*(ListRoles|GetUserRoles)' -v
```

### P1-IAM-03 Fix the `INET` write error in IAM role update

**Owner:** `Audit/IAM/SQL Owner`

**Goals:**

- Make `PUT /api/v1/iam/roles/{role_code}` return `200`

**Root cause:**

- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L158) writes an empty string directly into `updated_ip`

**Suggested changes:**

1. Convert `UpdatedIP` to `NULL` instead of `""` in the repository layer
2. Handle nullability of `CreatedIP / UpdatedIP` consistently
3. Add update integration tests:
   - No update IP provided
   - A valid IP provided

**Files involved:**

- `supply-api/internal/iam/repository/iam_repository.go`
- `supply-api/internal/iam/repository/iam_repository_test.go`
- `supply-api/internal/iam/model/role.go`

**Completion criteria:**

- Role update succeeds
- `invalid input syntax for type inet: ""` no longer occurs

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/iam/... -run 'Test.*UpdateRole' -v
```

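
Step 1 amounts to a small conversion at the repository boundary. The helper below, `inetArg`, is a hypothetical name; it assumes the SQL driver treats a `nil` argument as SQL `NULL`, which is the usual `database/sql` behavior, and it validates non-empty input with the standard `net/netip` package.

```go
package main

import (
	"fmt"
	"net/netip"
)

// inetArg converts an optional IP string into a query argument:
// nil (SQL NULL) for an empty string, the canonical address text otherwise.
func inetArg(ip string) (any, error) {
	if ip == "" {
		return nil, nil
	}
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return nil, fmt.Errorf("invalid IP %q: %w", ip, err)
	}
	return addr.String(), nil
}

func main() {
	for _, in := range []string{"", "192.0.2.10", "not-an-ip"} {
		arg, err := inetArg(in)
		fmt.Printf("%q -> %v (err: %v)\n", in, arg, err)
	}
}
```

Rejecting malformed input here also turns a database-level `500` into something the handler can report as a validation error.
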
### P1-IAM-04 Fix the `granted_by` foreign-key error in IAM role assignment

**Owner:** `Audit/IAM/SQL Owner`

**Goals:**

- Make `POST /api/v1/iam/users/{user_id}/roles` return `201`
- Make the corresponding revoke flow usable

**Root cause:**

- The handler does not pass `GrantedBy`
- The service persists `0` as-is
- The foreign key rejects it

**Suggested changes:**

1. Decide the semantics first:
   - `granted_by` is a mandatory audit field
   - Or `NULL` is allowed
2. Recommended fix:
   - Read the operator's userID from the authentication context
   - Have the handler set `GrantedBy` when it builds `AssignRoleRequest`
   - If the context is missing, return an explicit `401/400` instead of writing `0`
3. Only if the product allows system-initiated assignment, consider making `granted_by` nullable and update the model and DDL in sync
4. Add two HTTP integration tests:
   - assign success
   - revoke success

**Files involved:**

- `supply-api/internal/iam/handler/iam_handler.go`
- `supply-api/internal/iam/service/iam_service_db.go`
- `supply-api/internal/iam/repository/iam_repository.go`
- If needed: `sql/postgresql/iam_schema_v1.sql`
- `supply-api/internal/iam/handler/iam_handler_real_test.go`

**Completion criteria:**

- assign returns `201`
- revoke returns `200` after a successful assign

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./internal/iam/... -run 'Test.*(AssignRole|RevokeRole)' -v
```


---

## P2 Tasks

### P2-API-01 Unify `supply-api` handler error semantics

**Owner:** `Supply Domain/Repository Owner`

**Goals:**

- Stop internal conflicts, SQL errors, and create failures from being wrapped as `404`

**Wrong error semantics observed in this round:**

- An internal concurrency conflict in account state transitions surfaces as `404`
- An internal concurrency conflict in package state transitions surfaces as `404`
- An internal create failure in clone surfaces as `404`

**Note:**

This task does not replace the `P0` root-cause fixes, but it must follow them, so that the situation "the logic is fixed but the error semantics still mislead" does not recur.

**Suggested changes:**

1. Unify domain/repository error types and stop relying on string `contains` checks
2. Have handlers map by error category:
   - `ErrNotFound -> 404`
   - `ErrConcurrencyConflict / business conflict -> 409`
   - `validation -> 400/422`
   - `unexpected storage error -> 500`
3. Add handler-level tests that pin down the status codes

**Files involved:**

- `supply-api/internal/httpapi/supply_api.go`
- `supply-api/internal/httpapi/supply_api_test.go`
- If needed: `supply-api/internal/repository/errors.go`, `internal/domain/errors.go`

**Completion criteria:**

- All state transition and clone endpoints return the correct status code on error

### P2-QA-01 Re-check the instability signal in `supply-api/internal/domain`

**Owner:** `QA/CI Owner`

**Goals:**

- Classify the domain test failure that appeared once and did not reproduce afterwards as either:
  - Gone
  - Or reproducible and fixed

**Suggested changes:**

1. Pin down the set of tests covered by the previous failure
2. Run them uncached in a loop at least `20` times
3. Record whether any of the following exist:
   - Timing issues
   - Shared-state pollution
   - Data-ordering dependencies
4. If it still cannot be reproduced, keep the re-check record and add a high-frequency rerun job to CI

**Files involved:**

- `supply-api/internal/domain/*_test.go`
- Optional: CI scripts

**Completion criteria:**

- A clear conclusion of "stable / reproduced and fixed / still under observation"

**Verification commands:**

```bash
cd "/home/long/project/立交桥/supply-api" && GOCACHE=/tmp/lijiaoqiao-go-cache-flake go test -count=20 ./internal/domain -v
```

---

## Final Acceptance

This task list may be closed only when every gate below passes:

### 1. Per-module regression

```bash
cd "/home/long/project/立交桥/gateway" && go test -count=1 ./...
cd "/home/long/project/立交桥/platform-token-runtime" && go test -count=1 ./...
cd "/home/long/project/立交桥/supply-api" && go test -count=1 ./...
cd "/home/long/project/立交桥/supply-api" && bash scripts/run_integration_tests.sh ./internal/repository
```

### 2. Clean database initialization

The following must apply successfully to a brand-new PostgreSQL instance:

- The current baseline in `supply-api/sql/postgresql/*.sql`
- `sql/postgresql/iam_schema_v1.sql`
- `sql/postgresql/token_runtime_schema_v1.sql`

### 3. Real API matrix rerun

Rerun at least the following endpoints, all of which must meet expectations:

- `gateway`
  - `/v1/models`
  - `/v1/chat/completions`
  - `/v1/completions`
- `platform-token-runtime`
  - `issue`
  - `introspect`
  - `audit-events`
  - `refresh`
  - `revoke`
- `supply-api`
  - `accounts.verify`
  - `accounts.create`
  - `accounts.activate`
  - `accounts.suspend`
  - `accounts.delete`
  - `accounts.audit-logs`
  - `packages.draft`
  - `packages.batch-price`
  - `packages.publish`
  - `packages.pause`
  - `packages.unlist`
  - `packages.clone`
  - `billing`
  - `settlements.cancel`
  - `settlements.statement`
  - `earnings.records`
  - `audit.events.get`
  - `audit.alerts` full CRUD
  - `iam.roles` list/create/get/update/delete
  - `iam.users.roles` get/assign/revoke
  - `iam.scopes`
  - `iam.check-scope`

### 4. Closure criteria

The task list may be closed if and only if all of the following hold:

1. Every confirmed defect in the two original reports has a corresponding commit and verification record
2. No known gap of the form "real endpoint fails while unit tests are all green" remains
3. In the new API matrix run, only the "conditionally closed" withdrawal item may remain
4. `gateway`, `platform-token-runtime`, and `supply-api` can all complete main-path integration against a real PostgreSQL

## Suggested Execution Batches

### Batch A: must be completed first

- `P0-TR-01`
- `P0-SA-01`
- `P0-SA-02`
- `P0-SA-03`
- `P0-SA-04`

### Batch B: required before claiming that DB-backed capabilities are closed out

- `P1-AUD-01`
- `P1-IAM-01`
- `P1-IAM-02`
- `P1-IAM-03`
- `P1-IAM-04`

### Batch C: wrap-up and regression hardening

- `P2-API-01`
- `P2-QA-01`

355
review/API_MATRIX_VALIDATION_REPORT_2026-04-20.md
Normal file
@@ -0,0 +1,355 @@
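
# API Matrix Real-Environment Validation Report

Generated: 2026-04-20
Repository path: `/home/long/project/立交桥`
Validation method: real startup + real PostgreSQL + real HTTP requests + large-sample seed data
Business code changes: none

Status note: this document records the API matrix baseline snapshot of 2026-04-20 and does not represent the current state after remediation. For remediation execution and fix status, refer to [2026-04-20-remediation-tasklist-from-real-validation.md](/home/long/project/立交桥/docs/plans/2026-04-20-remediation-tasklist-from-real-validation.md) and subsequent commits.

Follow-up status: as of the end of the re-review on 2026-04-20, the `13` confirmed issues from this document that were broken out into the remediation task list have been fixed and have passed re-review. The remaining structured-logging unification issue is outside the scope of this day's API defect matrix and should be tracked as an independent governance item, not backfilled as "matrix defect not closed".

Related summary report:

- [REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md](/home/long/project/立交桥/review/REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md)

Raw matrix artifact:

- `/tmp/lijiaoqiao-api-matrix/api_matrix_raw_20260420_101015.md`
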
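## 1. Execution Conclusion

This round of API matrix validation covered the core endpoints that are actually mounted on the current main startup path. The conclusions are:

- `gateway`: `8/8` endpoints pass; the basic gateway path is healthy.
- `platform-token-runtime`: `4` of `6` endpoints pass; the two mutation endpoints `refresh` / `revoke` fail.
- `supply-api`: `39` validation items were recorded, of which `22` pass, `16` fail, and `1` is conditionally closed by design.
- The `supply-api` alerting endpoints pass end to end when called with the correct request contract.
- The `supply-api` IAM endpoints are not "all failing". Rather, list / user roles / update / assign have definite implementation defects, while create / get / delete / scope list are usable.

A more accurate description of the project state is:

- Basic read paths and some management paths are usable.
- Several write paths, state transition paths, and the DB-backed audit and IAM paths are still unreliable.
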
## 2. Validation Environment and Criteria

### 2.1 Startup environment

- An isolated Podman PostgreSQL 15 container on port `15441`
- Services started:
  - `gateway` `:18080`
  - `platform-token-runtime` `:18081`
  - `supply-api` `:18082`
  - Mock upstream `:19090`

### 2.2 Data volume

This round reuses the real large-sample dataset:

- `120` tenants
- `2,120` IAM users
- `6,000` supply accounts
- `18,000` packages
- `50,000` usage records
- `30,000` audit events
- `2,500` alerts
- `20,000` platform tokens
- `20,000` token audit events

### 2.3 Verdict criteria

- `Pass`: the endpoint returns the expected successful result under a real request
- `Fail`: the endpoint returns an error under a correct request contract and a reasonable sample, and the error is confirmed to be an implementation defect or a DDL defect
- `Conditionally closed`: the endpoint is explicitly gated off, which is by-design behavior and is not counted as a bug

## 3. Key New Findings

Compared with the previous summary report, this API matrix run added or further confirmed the following issues.

### 3.1 P0: supply-api account state transition endpoints have a deterministic optimistic-locking error

Symptoms:

- `POST /api/v1/supply/accounts/{account_id}/activate`
- `POST /api/v1/supply/accounts/{account_id}/suspend`

Both endpoints return `404` for valid samples, but the error body is not "resource does not exist"; it is:

- `concurrency conflict: resource was modified by another transaction`

Root cause:

- The domain service does `Version++` first
- The DB adapter then uses the already-incremented `Version` as `expectedVersion`
- The repository updates to `expectedVersion + 1` again and matches on `WHERE version = expectedVersion`
- The result is `RowsAffected() == 0`

Evidence:

- [account.go](/home/long/project/立交桥/supply-api/internal/domain/account.go#L219)
- [account.go](/home/long/project/立交桥/supply-api/internal/domain/account.go#L221)
- [adapter.go](/home/long/project/立交桥/supply-api/internal/adapter/adapter.go#L150)
- [account.go](/home/long/project/立交桥/supply-api/internal/repository/account.go#L129)
- [account.go](/home/long/project/立交桥/supply-api/internal/repository/account.go#L143)
- [account.go](/home/long/project/立交桥/supply-api/internal/repository/account.go#L160)

Impact:

- Update-type endpoints such as account activate and suspend are unreliable in the DB-backed runtime

### 3.2 P0: supply-api package state transition endpoints have two separate problems

Symptoms:

- `POST /api/v1/supply/packages/{package_id}/publish`
- `POST /api/v1/supply/packages/{package_id}/pause`
- `POST /api/v1/supply/packages/{package_id}/unlist`

All three return concurrency-conflict errors for valid samples.

Root cause 1:

- As with account state transitions, the domain layer does `Version++` first and the adapter passes the incremented value to the repository as `expectedVersion`, so the update condition can never match

Root cause 2:

- `GetByID` selects columns in the order `id, supply_account_id, user_id`
- But the scan targets are `pkg.ID, pkg.SupplierID, pkg.AccountID`
- As a result `SupplierID` and `AccountID` are swapped, and the subsequent update with `WHERE user_id = pkg.SupplierID` further increases the chance of failure

Evidence:

- [package.go](/home/long/project/立交桥/supply-api/internal/domain/package.go#L192)
- [package.go](/home/long/project/立交桥/supply-api/internal/domain/package.go#L194)
- [package.go](/home/long/project/立交桥/supply-api/internal/domain/package.go#L221)
- [package.go](/home/long/project/立交桥/supply-api/internal/domain/package.go#L223)
- [package.go](/home/long/project/立交桥/supply-api/internal/domain/package.go#L246)
- [package.go](/home/long/project/立交桥/supply-api/internal/domain/package.go#L248)
- [adapter.go](/home/long/project/立交桥/supply-api/internal/adapter/adapter.go#L176)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L73)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L88)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L116)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L131)

Impact:

- The publish, pause, and unlist state transition paths are unusable in the current DB-backed implementation

### 3.3 P0: supply-api package create / clone paths are blocked by the same SQL defect

Symptoms:

- `POST /api/v1/supply/packages/draft` returns `422`
- `POST /api/v1/supply/packages/{package_id}/clone` returns a wrapped `404`, but the inner error is still the create failure

Root cause:

- The `INSERT` into `supply_packages` has more target columns than placeholders

Evidence:

- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L27)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L37)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L60)

### 3.4 P1: The audit event read path is also broken by the table structure mismatch

Symptoms:

- `GET /api/v1/supply/accounts/{account_id}/audit-logs` returns `500`
- `GET /api/v1/audit/events/{event_id}` returns `500`

Root cause:

- Both queries and writes in the audit repository assume that the `audit_events` table has `trace_id` and `span_id`
- The current partitioned schema does not have these two columns

Evidence:

- [audit_repository.go](/home/long/project/立交桥/supply-api/internal/audit/repository/audit_repository.go#L333)
- [audit_repository.go](/home/long/project/立交桥/supply-api/internal/audit/repository/audit_repository.go#L339)
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L7)
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L15)

Note:

- This does not affect audit writes only; the read path is broken as well

### 3.5 P1: The IAM update endpoint writes an empty string into an INET column

Symptoms:

- `PUT /api/v1/iam/roles/{role_code}` returns `500`
- PostgreSQL reports: `invalid input syntax for type inet: ""`

Root cause:

- The repository layer writes `role.UpdatedIP` directly into `updated_ip`
- The model field is currently a string holding its zero value, so the update passes an empty string instead of `NULL`

Evidence:

- [role.go](/home/long/project/立交桥/supply-api/internal/iam/model/role.go#L77)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L155)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L162)

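The empty-string-versus-`NULL` problem described in 3.5 can be illustrated with a minimal sketch. It is not the repository's actual fix: the helper name `nullableIP` is hypothetical, and the sketch assumes the database driver honors `database/sql`'s `driver.Valuer` types such as `sql.NullString`.

```go
package main

import (
	"database/sql"
	"strings"
)

// nullableIP (hypothetical helper) converts an optional IP string into a value
// that is sent to the database as SQL NULL when the string is empty, instead of
// the invalid INET literal "".
func nullableIP(ip string) sql.NullString {
	trimmed := strings.TrimSpace(ip)
	if trimmed == "" {
		// Valid == false, so the driver writes NULL.
		return sql.NullString{}
	}
	return sql.NullString{String: trimmed, Valid: true}
}
```

Under these assumptions, the repository would pass `nullableIP(role.UpdatedIP)` as the query argument for `updated_ip` rather than the raw string.
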
### 3.6 P1: The IAM role assignment endpoint leaves `granted_by` unset and triggers a foreign key error

Symptoms:

- `POST /api/v1/iam/users/{user_id}/roles` returns `500`
- The database error points to `iam_user_roles_granted_by_fkey`

Root cause:

- The HTTP handler builds `AssignRoleRequest` with only `UserID / RoleCode / TenantID`
- The service layer writes `GrantedBy=0` into `iam_user_roles` unchanged
- The PostgreSQL foreign key rejects that value

Evidence:

- [iam_handler.go](/home/long/project/立交桥/supply-api/internal/iam/handler/iam_handler.go#L330)
- [iam_service_db.go](/home/long/project/立交桥/supply-api/internal/iam/service/iam_service_db.go#L188)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L416)

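One way to prevent the zero-value `GrantedBy` described in 3.6 is to make the operator ID a required input when the request is built. The sketch below is illustrative only: the field types, the constructor `newAssignRoleRequest`, and the error value are assumptions, and only the field names `UserID`, `RoleCode`, `TenantID`, and `GrantedBy` come from the report.

```go
package main

import "errors"

// AssignRoleRequest mirrors the field names mentioned in the report.
// The field types are assumed for illustration.
type AssignRoleRequest struct {
	UserID    int64
	RoleCode  string
	TenantID  int64
	GrantedBy int64
}

// errMissingOperator is a hypothetical error for a request without an operator.
var errMissingOperator = errors.New("assign role: authenticated operator id is required")

// newAssignRoleRequest (hypothetical constructor) refuses to build a request
// whose GrantedBy would be the zero value, so the foreign key on granted_by
// is never handed an ID that cannot exist.
func newAssignRoleRequest(userID int64, roleCode string, tenantID, operatorID int64) (AssignRoleRequest, error) {
	if operatorID <= 0 {
		return AssignRoleRequest{}, errMissingOperator
	}
	return AssignRoleRequest{
		UserID:    userID,
		RoleCode:  roleCode,
		TenantID:  tenantID,
		GrantedBy: operatorID,
	}, nil
}
```

With a guard of this kind, a missing operator surfaces as a client-side validation error in the handler instead of a `500` from the database.
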
### 3.7 P1: The NULL-scan problem in IAM role listing and user-role queries has been reproduced at the real API layer

Symptoms:

- `GET /api/v1/iam/roles` returns `500`
- `GET /api/v1/iam/users/{user_id}/roles` returns `500`

Root cause:

- Nullable columns are scanned directly into non-nullable primitive types

Evidence:

- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L131)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L483)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L513)

## 4. Per-service summary

| Service | Total | Pass | Fail | Conditionally closed | Notes |
| --- | ---: | ---: | ---: | ---: | --- |
| `gateway` | 8 | 8 | 0 | 0 | Main startup path works |
| `platform-token-runtime` | 6 | 4 | 2 | 0 | Query paths work, mutation paths fail |
| `supply-api` | 39 | 22 | 16 | 1 | Includes 1 failed DDL prerequisite |

## 5. API matrix

### 5.1 gateway

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `GET` | `/health` | Pass | `200` | Health check passes |
| `GET` | `/healthz` | Pass | `200` | Health check passes |
| `GET` | `/readyz` | Pass | `200` | Health check passes |
| `GET` | `/v1/models` | Pass | `200` | Returns `5` models |
| `POST` | `/v1/chat/completions` | Pass | `200` | Real upstream integration succeeds |
| `POST` | `/api/v1/chat/completions` | Pass | `200` | Alias route works |
| `POST` | `/v1/completions` | Pass | `200` | Real upstream integration succeeds |
| `POST` | `/api/v1/completions` | Pass | `200` | Alias route works |

### 5.2 platform-token-runtime

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `POST` | `/api/v1/platform/tokens/issue` | Pass | `201` | Issues a real token |
| `GET` | `/actuator/health` | Pass | `200` | Health check works |
| `POST` | `/api/v1/platform/tokens/introspect` | Pass | `200` | Returns `active/admin` |
| `GET` | `/api/v1/platform/tokens/audit-events` | Pass | `200` | Returns audit events |
| `POST` | `/api/v1/platform/tokens/{token_id}/refresh` | Fail | `422` | `BUSINESS_ERROR` |
| `POST` | `/api/v1/platform/tokens/{token_id}/revoke` | Fail | `422` | `BUSINESS_ERROR` |

### 5.3 supply-api

#### DDL prerequisite

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `DDL` | `/sql/postgresql/iam_schema_v1.sql` | Fail | `psql` | `'*'` violates `chk_scope_code_format` |

#### Health and basic queries

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `GET` | `/actuator/health` | Pass | `200` | Database and cache both report `ok` |
| `GET` | `/actuator/health/ready` | Pass | `200` | Readiness works |
| `GET` | `/actuator/health/live` | Pass | `200` | Liveness works |
| `GET` | `/api/v1/supply/billing` | Pass | `200` | Billing summary endpoint works |
| `GET` | `/api/v1/supplier/billing` | Pass | `200` | Compatibility alias works |
| `GET` | `/api/v1/supply/earnings/records` | Pass | `200` | Returns a paginated total of `417` |

#### Account endpoints

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `POST` | `/api/v1/supply/accounts/verify` | Pass | `200` | Verification endpoint works |
| `POST` | `/api/v1/supply/accounts` | Fail | `500` | `IDEMPOTENCY_LOCK_FAILED` |
| `POST` | `/api/v1/supply/accounts/{account_id}/activate` | Fail | `404` | Returns 404 externally; the underlying error is a concurrency conflict |
| `POST` | `/api/v1/supply/accounts/{account_id}/suspend` | Fail | `404` | Returns 404 externally; the underlying error is a concurrency conflict |
| `DELETE` | `/api/v1/supply/accounts/{account_id}/delete` | Pass | `204` | Delete path works, but the audit write reports an error |
| `GET` | `/api/v1/supply/accounts/{account_id}/audit-logs` | Fail | `500` | `trace_id` column does not exist |

#### Package endpoints

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `POST` | `/api/v1/supply/packages/draft` | Fail | `422` | `SUP_HTTP_5002` |
| `POST` | `/api/v1/supply/packages/batch-price` | Pass | `200` | Batch price update works |
| `POST` | `/api/v1/supply/packages/{package_id}/publish` | Fail | `404` | Returns 404 externally; the underlying error is a concurrency conflict |
| `POST` | `/api/v1/supply/packages/{package_id}/pause` | Fail | `404` | Returns 404 externally; the underlying error is a concurrency conflict |
| `POST` | `/api/v1/supply/packages/{package_id}/unlist` | Fail | `404` | Returns 404 externally; the underlying error is a concurrency conflict |
| `POST` | `/api/v1/supply/packages/{package_id}/clone` | Fail | `404` | The underlying error is the create SQL failure |

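Several rows in the account and package tables show a `404` whose underlying cause is a concurrency conflict or an unexpected SQL failure. The sketch below shows one possible way to keep these cases apart in a handler. It is an illustration, not the project's implementation: the sentinel errors, the `wrap` helper, and the choice of `409 Conflict` for optimistic-lock conflicts are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinel errors; the real domain errors may be named differently.
var (
	errNotFound            = errors.New("resource not found")
	errConcurrencyConflict = errors.New("optimistic lock conflict")
	// errCreateFailed stands in for a raw, unexpected SQL failure.
	errCreateFailed = errors.New("insert into supply_packages failed")
)

// wrap adds operation context while keeping the cause visible to errors.Is.
func wrap(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// statusForError maps a (possibly wrapped) error to an HTTP status code.
// Only a genuine "not found" becomes 404; conflicts become 409 (an assumed
// choice), and anything unrecognized becomes 500.
func statusForError(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	case errors.Is(err, errConcurrencyConflict):
		return http.StatusConflict
	default:
		return http.StatusInternalServerError
	}
}
```

Because the mapping relies on `errors.Is`, it only works if every layer wraps errors with `%w` rather than replacing them with a generic not-found error.
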
#### Settlement and audit endpoints

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `POST` | `/api/v1/supply/settlements/withdraw` | Conditionally closed | `503` | SMS is not ready; this is the designed fail-closed behavior |
| `POST` | `/api/v1/supply/settlements/{settlement_id}/cancel` | Pass | `200` | The business action succeeds, but the audit write reports an error |
| `GET` | `/api/v1/supply/settlements/{settlement_id}/statement` | Pass | `200` | Statement download URL is returned correctly |
| `GET` | `/api/v1/audit/events/{event_id}` | Fail | `500` | Audit repository query columns do not match the table schema |

#### Alert endpoints

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `POST` | `/api/v1/audit/alerts` | Pass | `201` | Created `ALT-e70a7615` |
| `GET` | `/api/v1/audit/alerts` | Pass | `200` | List query works |
| `GET` | `/api/v1/audit/alerts/{alert_id}` | Pass | `200` | Detail query works |
| `PUT` | `/api/v1/audit/alerts/{alert_id}` | Pass | `200` | Status updated to `acknowledged` |
| `POST` | `/api/v1/audit/alerts/{alert_id}/resolve` | Pass | `200` | Status updated to `resolved` |
| `DELETE` | `/api/v1/audit/alerts/{alert_id}` | Pass | `204` | Delete works |

#### IAM endpoints

| Method | Path | Result | HTTP | Notes |
| --- | --- | --- | --- | --- |
| `GET` | `/api/v1/iam/roles` | Fail | `500` | NULL scan failure |
| `POST` | `/api/v1/iam/roles` | Pass | `201` | `matrix_role` created |
| `GET` | `/api/v1/iam/roles/{role_code}` | Pass | `200` | Single-role query works |
| `PUT` | `/api/v1/iam/roles/{role_code}` | Fail | `500` | Empty string written into an `INET` column |
| `GET` | `/api/v1/iam/scopes` | Pass | `200` | Returns `29` scopes |
| `GET` | `/api/v1/iam/users/{user_id}/roles` | Fail | `500` | NULL scan failure |
| `POST` | `/api/v1/iam/users/{user_id}/roles` | Fail | `500` | `granted_by` foreign key failure |
| `DELETE` | `/api/v1/iam/users/{user_id}/roles/{role_code}` | Fail | `404` | Cascading result of the failed assignment in the previous step |
| `DELETE` | `/api/v1/iam/roles/{role_code}` | Pass | `200` | Role deletion works |
| `GET` | `/api/v1/iam/check-scope` | Pass | `200` | Check endpoint works |

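## 6. Final assessment

The final assessment after this round of matrix validation is:

- `gateway` can be regarded as the most stable delivered module at present.
- `platform-token-runtime` still cannot be declared "lifecycle complete and usable", because the refresh and revoke paths fail in PostgreSQL-backed mode.
- `supply-api` cannot be declared "fully functional", because its state transitions, create paths, audit read/write consistency, and parts of its DB-backed IAM capability all have deterministic defects.

Ranked by delivery maturity:

- Tier 1: `gateway`
- Tier 2: the query capability of `platform-token-runtime`, and the read paths and alert module of `supply-api`
- Needs priority remediation: the `platform-token-runtime` mutation paths, the `supply-api` write paths, the audit repository, and the DB-backed IAM paths
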
366
review/REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md
Normal file
@@ -0,0 +1,366 @@
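# Real-Environment Review and Full Validation Report

Generated: 2026-04-20
Repository path: `/home/long/project/立交桥`
Method: read-only review + real runtime validation
Constraints: no business code in the repository was modified during this run; validation used only temporary scripts, temporary helper programs, and temporary processes under `/tmp`, plus an isolated Podman PostgreSQL container.

Status note: this document records the pre-remediation validation baseline snapshot taken on 2026-04-20. The defects in it have been broken out into [2026-04-20-remediation-tasklist-from-real-validation.md](/home/long/project/立交桥/docs/plans/2026-04-20-remediation-tasklist-from-real-validation.md); the subsequent fix status should be taken from the task list's commits and the final acceptance results.

Status update: as of the end of the re-review on 2026-04-20, the `13` confirmed issues from this document that were broken out into the remediation task list have been fixed and have passed re-review. The entry-point logs of `gateway` and `platform-token-runtime` have not yet been unified as structured JSON. That is a new governance item outside the task list and is not counter-evidence that a confirmed defect in this document remains unfixed.
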
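## 1. Executive summary

This round of validation covered the three core backend services, `gateway`, `platform-token-runtime`, and `supply-api`. The validation methods included:

- Rerunning the baseline tests without cache
- Persisting data to a real PostgreSQL instance
- Constructing tens of thousands of business, audit, and token records
- Starting the real services
- Exercising real HTTP endpoints
- Validating conditional capabilities and failure paths

Conclusions:

- All three backend services in the repository start successfully in a real environment.
- The core baseline tests shipped with the repository pass on rerun.
- The read paths and some management paths of the system are usable.
- The system cannot be judged as "all functions working".
- 6 deterministic defects have been confirmed, involving:
  - The PostgreSQL refresh/revoke path of `platform-token-runtime`
  - The idempotency lock write path of `supply-api`
  - The package creation SQL of `supply-api`
  - The `IAM` initialization DDL
  - NULL scanning in DB-backed `IAM` queries
  - The mismatch between the `audit_events` table schema and the audit repository implementation
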
## 2. Validation scope

### 2.1 Core services

- `gateway`
- `platform-token-runtime`
- `supply-api`

### 2.2 Baseline validation

- [repo_integrity_check.sh](/home/long/project/立交桥/scripts/ci/repo_integrity_check.sh)
- `gateway`: uncached `go test -count=1 ./...`
- `platform-token-runtime`: uncached `go test -count=1 ./...`
- `supply-api`: uncached `go test -count=1 ./...`
- `supply-api` repository integration tests
- `supply-api` `e2e` tests

### 2.3 Real runtime validation

- Real PostgreSQL container
- Real schema initialization
- Real seed data writes
- Real service startup
- Real API request validation
- Validation of happy paths, permission checks, conditional closure, and failure paths

## 3. Validation environment

### 3.1 Database environment

Port `5432` on the local machine had a listener, but the PostgreSQL health check against it failed, so the local database was not used.

This run used Podman to start an isolated PostgreSQL 15 container:

- Container name: `lijiaoqiao-review-pg`
- Mapped port: `15441`
- Validation databases:
  - `supply_review`
  - `token_runtime_review`

### 3.2 Frontend and backend startup notes

No project-owned frontend source entry point was found in this repository. Only an archived competitor frontend directory was found:

- [package.json](/home/long/project/立交桥/llm-gateway-competitors/sub2api-tar/frontend/package.json)

This round could therefore only start and exercise the project's backend; frontend capabilities that do not exist must not be reported as "validated".

### 3.3 Service startup results

The following services started successfully:

- `gateway`: `127.0.0.1:18080`
- `platform-token-runtime`: `127.0.0.1:18081`
- `supply-api`: `127.0.0.1:18082`
- Mock upstream service: `127.0.0.1:19090`

Health check results:

- `platform-token-runtime`: `UP`
- `supply-api`: database and cache checks report `ok`

## 4. Data volume and role coverage

### 4.1 Data volume

The real test data constructed and written in this run:

- Tenants: `120`
- IAM users: `2,120`
- Supply accounts: `6,000`
- Packages: `18,000`
- Usage records: `50,000`
- Audit events: `30,000`
- Alerts: `2,500`
- Platform tokens: `20,000`
- Token audit events: `20,000`

### 4.2 Roles and callers

The real callers covered in this run:

- Anonymous requester
- Gateway Bearer Token caller
- Platform token management side
- Supply-side organization administrator
- Settlement-side tenant administrator

### 4.3 Model coverage

Validation used only the models actually registered in the system; no nonexistent models were invented. The models exercised were:

- `gpt-4o`
- `gpt-4.1`
- `gpt-4.1-mini`
- `claude-3-7-sonnet`
- `deepseek-chat`

## 5. Passed items

### 5.1 Baseline tests

The following checks passed on rerun:

- `bash scripts/ci/repo_integrity_check.sh`
- `cd supply-api && bash scripts/run_integration_tests.sh ./internal/repository`
- `cd supply-api && go test -count=1 -tags=e2e ./e2e`

Notes:

- On the first run of the unified check, `supply-api/internal/domain` failed once; the failure did not reproduce.
- The subsequent isolated rerun, repeated uncached reruns, and whole-repository rerun all passed.
- The more reasonable reading for now is "a stability risk signal exists, but it has not yet become a reproducible deterministic defect".

### 5.2 gateway

Confirmed passing:

- `/v1/models`
- `/v1/chat/completions`
- `/v1/completions`
- `401` response when the Bearer Token is missing

### 5.3 platform-token-runtime

Confirmed passing:

- Token issuance
- Token introspection
- Audit events query

### 5.4 supply-api

Confirmed passing:

- Health checks
- Billing query
- Earnings records query
- Alert creation
- Alert retrieval
- Alert listing
- Alert update
- Alert resolution
- Alert deletion
- Settlement statement download
- Conditional closure of withdrawals

The withdrawal closure is a designed outcome, not a bug. The corresponding logic is at:

- [runtime.go](/home/long/project/立交桥/supply-api/internal/app/runtime.go#L407)

## 6. Confirmed defects

### 6.1 P0: The PostgreSQL refresh and revoke paths of platform-token-runtime are broken

Symptoms:

- `/api/v1/platform/tokens/refresh` returns a database error
- `/api/v1/platform/tokens/revoke` returns a database error
- The token status does not change to `revoked`
- `gateway` still accepts the original token

Root cause:

- The save logic of the PostgreSQL store evaluates `NULLIF($2, '')` first, during the `INSERT` phase
- The refresh/revoke paths call `Save` without passing back a valid access token
- `token_fingerprint` therefore becomes `NULL` and violates the `NOT NULL` constraint
- The request fails before it ever reaches `ON CONFLICT DO UPDATE`

Evidence:

- [postgres_runtime_store.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/postgres_runtime_store.go#L73)
- [postgres_runtime_store.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/postgres_runtime_store.go#L91)
- [inmemory_runtime.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/inmemory_runtime.go#L164)
- [inmemory_runtime.go](/home/long/project/立交桥/platform-token-runtime/internal/auth/service/inmemory_runtime.go#L190)

Impact:

- The PostgreSQL-backed token runtime has no reliable refresh/revoke capability
- When the gateway relies on this runtime for remote introspection, the revocation path is at risk of not taking effect

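The root cause in 6.1 comes down to which fingerprint a save should use when the caller has no access token to hand back. A minimal sketch of the "preserve the stored fingerprint" rule follows. The function names are hypothetical, and using hex-encoded SHA-256 as the fingerprint is an assumption made for illustration; the report does not state how the real store derives `token_fingerprint`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
)

// errNoFingerprint is a hypothetical error for the case where neither a new
// access token nor a previously stored fingerprint is available.
var errNoFingerprint = errors.New("token fingerprint unavailable: no access token and no stored fingerprint")

// fingerprintOf derives a fingerprint from an access token.
// Assumption: hex-encoded SHA-256; the real algorithm may differ.
func fingerprintOf(accessToken string) string {
	sum := sha256.Sum256([]byte(accessToken))
	return hex.EncodeToString(sum[:])
}

// resolveFingerprint decides which fingerprint a save should persist:
// a fresh one when an access token is supplied, otherwise the stored one.
// It fails explicitly instead of letting a NULL reach a NOT NULL column.
func resolveFingerprint(stored, accessToken string) (string, error) {
	if accessToken != "" {
		return fingerprintOf(accessToken), nil
	}
	if stored != "" {
		return stored, nil
	}
	return "", errNoFingerprint
}
```

In SQL terms, the same rule would mean that refresh and revoke never insert a `NULL` fingerprint for an existing row; whether that is done in Go or inside the upsert statement is a design choice this sketch does not settle.
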
### 6.2 P0: The supply-api idempotency lock implementation conflicts with the table schema, blocking account creation

Symptoms:

- The account creation endpoint returns `IDEMPOTENCY_LOCK_FAILED`
- PostgreSQL reports explicitly: `there is no unique or exclusion constraint matching the ON CONFLICT specification`

Root cause:

- The repository layer uses `ON CONFLICT` on `(tenant_id, operator_id, api_path, idempotency_key)`
- The table definition has no such unique constraint

Evidence:

- [idempotency.go](/home/long/project/立交桥/supply-api/internal/repository/idempotency.go#L196)
- [idempotency.go](/home/long/project/立交桥/supply-api/internal/repository/idempotency.go#L217)
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L110)
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L124)

Impact:

- Every DB-backed write endpoint that depends on this idempotency lock may be blocked

### 6.3 P0: The supply-api package creation SQL has the wrong number of placeholders

Symptoms:

- The package creation endpoint returns a SQL-level error: `INSERT has more target columns than expressions`

Root cause:

- `INSERT INTO supply_packages` lists 29 target columns
- `VALUES` supplies only 28 placeholders

Evidence:

- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L27)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L37)
- [package.go](/home/long/project/立交桥/supply-api/internal/repository/package.go#L60)

Impact:

- The package creation path is unusable against a real database

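A column/placeholder mismatch like the one in 6.3 is easy to introduce when both lists are maintained by hand. One possible guard, sketched below, is to generate the placeholder list from the column list so the two counts cannot drift apart. The helper `buildInsert` is hypothetical and is not how the repository currently builds its SQL.

```go
package main

import (
	"fmt"
	"strings"
)

// buildInsert (hypothetical helper) renders an INSERT statement whose
// placeholders ($1..$N) are derived from the column list, so the number of
// target columns always equals the number of expressions.
// The table and column names must be trusted constants, never user input.
func buildInsert(table string, columns []string) string {
	placeholders := make([]string, len(columns))
	for i := range columns {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
	}
	return fmt.Sprintf(
		"INSERT INTO %s (%s) VALUES (%s)",
		table,
		strings.Join(columns, ", "),
		strings.Join(placeholders, ", "),
	)
}
```

A hand-written statement can be kept instead, as long as a unit test asserts that the column count and the placeholder count match.
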
### 6.4 P1: The IAM schema initialization script fails on a fresh database

Symptoms:

- Running [iam_schema_v1.sql](/home/long/project/立交桥/sql/postgresql/iam_schema_v1.sql) fails
- The `'*'` entry in the default scope data violates the script's own format constraint
- The whole transaction is rolled back

Evidence:

- [iam_schema_v1.sql](/home/long/project/立交桥/sql/postgresql/iam_schema_v1.sql#L60)

Impact:

- The IAM schema shipped with the repository cannot be used as-is to initialize a clean environment

### 6.5 P1: DB-backed IAM queries are not NULL-safe

Symptoms:

- The role list query may fail with `cannot scan NULL into *string`
- The user-role query may fail with `cannot scan NULL into *int64`

Root cause:

- Nullable columns are scanned directly into non-nullable primitive types

Evidence:

- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L131)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L483)
- [iam_repository.go](/home/long/project/立交桥/supply-api/internal/iam/repository/iam_repository.go#L513)

Impact:

- DB-backed IAM is unstable against real data

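The scan errors quoted in 6.5 can be avoided by scanning nullable columns into nullable holder types and converting afterwards. The sketch below uses `sql.NullString` and `sql.NullInt64` from the standard library. The struct names and the choice of which fields are nullable (`Description`, `GrantedBy`) are assumptions for illustration; the report does not list the affected columns.

```go
package main

import "database/sql"

// roleRow is a hypothetical scan target. Nullable columns use the sql.Null*
// holder types, so a NULL from the database does not cause a scan error.
type roleRow struct {
	Code        string
	Description sql.NullString
	GrantedBy   sql.NullInt64
}

// Role is a hypothetical domain model. A missing description becomes the
// empty string, and a missing grantor becomes a nil pointer.
type Role struct {
	Code        string
	Description string
	GrantedBy   *int64
}

// toRole converts the scanned row into the domain model.
func (r roleRow) toRole() Role {
	role := Role{Code: r.Code}
	if r.Description.Valid {
		role.Description = r.Description.String
	}
	if r.GrantedBy.Valid {
		id := r.GrantedBy.Int64
		role.GrantedBy = &id
	}
	return role
}
```

In a repository method, the call would look like `rows.Scan(&row.Code, &row.Description, &row.GrantedBy)` followed by `row.toRole()`. Using `COALESCE` in the query is an alternative when a default value is acceptable.
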
### 6.6 P1: The audit repository implementation does not match the audit_events table schema

Symptoms:

- The settlement cancel endpoint returns business success
- The service log, however, records that the `trace_id` column does not exist

Root cause:

- The audit repository inserts `trace_id` and `span_id`
- The current partitioned table definition does not have these two columns

Evidence:

- [audit_repository.go](/home/long/project/立交桥/supply-api/internal/audit/repository/audit_repository.go#L105)
- [audit_repository.go](/home/long/project/立交桥/supply-api/internal/audit/repository/audit_repository.go#L109)
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L7)
- [partition_strategy_v1.sql](/home/long/project/立交桥/supply-api/sql/postgresql/partition_strategy_v1.sql#L15)

Impact:

- A business operation may succeed while its audit record silently fails to persist

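## 7. Current status assessment

Based on the real runtime results:

- `gateway`: basic capabilities are usable
- `platform-token-runtime`: query paths are usable; the refresh/revoke paths cannot be judged usable
- `supply-api`: read paths and some management paths are usable; key write paths have confirmed defects

A more accurate description of the project status:

- It is not "unable to run"
- It is not "fully functional" either
- It is "basic paths run, but key mutation paths still have confirmed blocking defects"
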
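## 8. Conclusions and recommendations

### 8.1 Conclusions

This real validation confirms:

- The project can currently start for real and supports partial real business integration
- There is currently no basis for concluding that "all functions work"
- The highest-priority fixes are the write paths, the PostgreSQL-backed token mutation paths, IAM schema initialization, and audit persistence consistency

### 8.2 Recommended priorities

- `P0`
  - Fix PostgreSQL `refresh/revoke` in `platform-token-runtime`
  - Fix the mismatch between the `supply-api` idempotency lock unique constraint and the repository implementation
  - Fix the placeholder error in the `supply-api` package creation SQL
- `P1`
  - Fix the IAM schema initialization script
  - Fix how the IAM repository scans nullable columns
  - Align the audit repository with the `audit_events` table schema
- `P2`
  - Keep tracking the one-off unstable failure in `supply-api/internal/domain`
  - Rerun the full API matrix validation after the defects are fixed
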
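## 9. Report boundaries

This report records only facts confirmed in this round and contains no unverified speculation. The following items are not included in the "passed" conclusions:

- A project-owned frontend, which does not exist in the current repository
- Endpoints that were not actually mounted or not actually exercised
- Intermittent anomalies that did not reproduce
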
100
review/REMEDIATION_CLOSURE_CONFIRMATION_2026-04-20.md
Normal file
@@ -0,0 +1,100 @@
# Remediation Closure Confirmation

Generated: 2026-04-20
Repository path: `/home/long/project/立交桥`
Subject: the `2026-04-20` real-environment validation report, the API matrix report, and the remediation task list derived from them

Related documents:

- [REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md](/home/long/project/立交桥/review/REAL_ENV_REVIEW_AND_VALIDATION_REPORT_2026-04-20.md)
- [API_MATRIX_VALIDATION_REPORT_2026-04-20.md](/home/long/project/立交桥/review/API_MATRIX_VALIDATION_REPORT_2026-04-20.md)
- [2026-04-20-remediation-tasklist-from-real-validation.md](/home/long/project/立交桥/docs/plans/2026-04-20-remediation-tasklist-from-real-validation.md)

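## 1. Closure conclusion

As of the completion of this re-review on 2026-04-20, all `13` confirmed issues from the three documents above that were broken out into the remediation task list have been fixed and verified through the whole-repository baseline, focused regression, repository integration tests, and stability reruns.

The formal conclusions of this confirmation are:

- All `13` issues in this round's remediation task list are closed.
- There is currently no evidence that these issues still exist on the current branch.
- The two `review` documents remain "pre-remediation baseline snapshots" and must no longer be cited directly as a list of currently unfixed defects.
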
## 2. Re-review commands

The following are the core commands that were actually executed and passed for this closure confirmation:

```bash
bash scripts/ci/repo_integrity_check.sh
bash scripts/ci/supply_domain_stability_check.sh 20

cd "/home/long/project/立交桥/platform-token-runtime" && \
  go test -count=1 ./internal/auth/service \
  -run 'Test(PostgresRuntimeStore_SavePreservesExistingFingerprintWhenAccessTokenMissing|InMemoryTokenRuntimeWithPostgresStore_RefreshAndRevokePersistLifecycle)$' -v

cd "/home/long/project/立交桥/supply-api" && \
  go test -count=1 ./internal/httpapi \
  -run 'TestSupplyAPI_(ActivateAccount_ConcurrencyConflict|PublishPackage_ConcurrencyConflict|ClonePackage_UnexpectedCreateFailureReturnsInternalServerError|CancelSettlement_ConcurrencyConflict|ActivateAccount_NotFound|PublishPackage_NotFound|ClonePackage_NotFound|CancelSettlement_NotFound)$' -v

cd "/home/long/project/立交桥/supply-api" && \
  go test -count=1 ./internal/iam/... \
  -run 'Test.*(AssignRole|RevokeRole|ListRoles|GetUserRoles|UpdateRole)' -v

cd "/home/long/project/立交桥/supply-api" && \
  go test -count=1 ./internal/domain ./internal/httpapi \
  -run 'Test.*(Activate|Suspend|Delete|Publish|Pause|Unlist|Clone|Cancel)' -v

cd "/home/long/project/立交桥/supply-api" && \
  bash scripts/run_integration_tests.sh ./internal/iam/repository

cd "/home/long/project/立交桥/supply-api" && \
  bash scripts/run_integration_tests.sh ./internal/audit/...
```

## 3. Issue closure mapping

| Issue category | Current status | Re-review basis |
| --- | --- | --- |
| `platform-token-runtime` PostgreSQL `refresh/revoke` failure | Closed | Focused store/runtime tests pass |
| `supply-api` idempotency lock DDL/repository contract mismatch | Closed | Repository integration and whole-repository checks pass |
| `supply-api` package creation SQL placeholder error | Closed | Handler/domain focused regression passes |
| `supply-api` account state transition optimistic lock error | Closed | Handler/domain focused regression passes |
| `supply-api` package state transition optimistic lock error | Closed | Handler/domain focused regression passes |
| `supply-api` package read field mapping error | Closed | Lifecycle focused regression passes |
| `supply-api` audit repository and `audit_events` contract mismatch | Closed | `./internal/audit/...` integration passes |
| `IAM` DDL cannot be applied to a clean database | Closed | `TestIAMSchemaV1_AppliesOnCleanSchema` passes |
| `IAM` role list / user roles null scan | Closed | `./internal/iam/...` focused regression passes |
| `IAM` update writes an empty string into `INET` | Closed | `UpdateRole` focused regression passes |
| `IAM` role assignment `granted_by` foreign key failure | Closed | `AssignRole` focused regression passes |
| Handler conflict / internal error semantics are wrong | Closed | HTTP focused regression passes |
| `supply-api/internal/domain` instability signal | Closed | Not reproduced in `20` rounds of the stability check |

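## 4. Scope boundaries

The following items must not be confused with this "13-item remediation closure":

1. Withdrawals are still gated by SMS readiness. A closed gate is designed behavior, not an unfixed defect.
2. The two `review` reports are pre-remediation baseline snapshots. They explain where the issues came from and do not represent the current system state.
3. Structured logging has not yet been fully unified across the entry points of the three services. This is a **governance item outside the task list** and is not one of the 13 defects.

Current state of structured logging:

- [main.go](/home/long/project/立交桥/supply-api/cmd/supply-api/main.go) already uses a structured JSON logger.
- [main.go](/home/long/project/立交桥/gateway/cmd/gateway/main.go) still uses the standard library `log`.
- [main.go](/home/long/project/立交桥/platform-token-runtime/cmd/platform-token-runtime/main.go) still uses the standard library `log`.

Therefore, if the governance goal "unify structured logging" is tracked on its own, the current conclusion should be:

- `supply-api`: done
- `gateway`: not done
- `platform-token-runtime`: not done

This must not, however, be written back as "the 2026-04-20 real validation defects are still open".
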
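## 5. Final confirmation

The formal confirmation after this re-review is:

- All `13` confirmed issues broken out of the real validation report and the API matrix report have been genuinely resolved.
- No residual reproduction of these issues was found on the current branch.
- If further work is needed, "unify structured logging" can be opened as a new governance task rather than by reopening this remediation list.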