feat(runtime): harden daily pipeline audit and verification

Tighten real-ingestion success rules, separate scheduled reports from historical rebuilds, and persist source-level runtime audit across daily pipeline runs. Also add the Phase 5 CI workflow contract plus verification updates and supporting docs so the full uncommitted change set can be validated together.
2026-05-14 16:17:39 +08:00
parent 618dff33da
commit a8999abcb0
17 changed files with 880 additions and 45 deletions
--- a/docs/plans/2026-05-14-runtime-trust-gap-remediation-plan.md
+++ b/docs/plans/2026-05-14-runtime-trust-gap-remediation-plan.md
@@ -0,0 +1,189 @@
+# Runtime Trust Gap Remediation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
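+**Goal:** Systematically close three gaps in the daily-report and ingestion pipeline that undermine authenticity and long-term trust, so that every "scheduled daily output" comes from real ingestion, an auditable run, and a pipeline that covers multiple data sources.
+
+**Architecture:** Keep the existing Phase 1/Phase 2 design and reinforce only the run semantics and the audit layer. Split three questions into independent states: whether ingestion truly succeeded, whether a run is an official daily output or a historical rebuild, and whether multi-source data entered the scheduled pipeline. `run_daily.sh`, the daily report generator, the verification scripts, and the database records should all share one set of run semantics. Fix the lenient success checks first, since they are the most likely to mask real failures; then separate the audit paths; finally bring multi-source ingestion into automatic scheduling.
+
+**Tech Stack:** Bash, Go 1.22, PostgreSQL, cron, html/template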
+
+---
+
+### Task 1: Tighten the "ingestion succeeded" check so mock runs and failed database writes cannot pass as success
+
+**Files:**
+- Modify: `scripts/fetch_openrouter.go`
+- Modify: `scripts/run_daily.sh`
+- Modify: `scripts/run_real_pipeline.sh`
+- Modify: `scripts/verify_phase3.sh`
+- Test: `scripts/fetch_openrouter_test.go`
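+- Test: shell verification for `scripts/run_daily` (the existing verify scripts can be used at first)
+
+**Step 1: Write failing tests**
+
+Add three failure scenarios:
+- Without `OPENROUTER_API_KEY`, the scheduled pipeline must not be treated as a successful real ingestion
+- When the `summarizeDB` database write fails, `fetch_openrouter` should return a non-zero exit code in "real mode"
+- `run_daily.sh` must not pass the quality check merely because the database already holds old data
+
+**Step 2: Run the tests to confirm current behavior is too lenient**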
+
+Run:
+- `go test -tags llm_script scripts/fetch_openrouter.go scripts/fetch_openrouter_test.go`
+- `bash scripts/verify_phase3.sh`
+
+Expected:
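+- The tests expose the risk that mock data, degraded runs, or old data mask real failures
+
+**Step 3: Minimal implementation**
+
+Suggested tightening in two layers (the fetcher and the pipeline scripts):
+- `fetch_openrouter.go`: add a strict mode or an explicit run mode; real scheduled runs require a successful database write by default and otherwise exit non-zero
+- `run_daily.sh`: make the quality check require that this run produced write traces dated today, rather than looking only at historical totals
+- `run_real_pipeline.sh`: count only "real ingestion + real database write + real report generation" as success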
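
A minimal sketch of how the three rules above could combine into a single verdict. The function name `ingestion_verdict` and its three arguments are hypothetical and do not come from the repository; a real `run_daily.sh` would have to derive them from the fetcher's exit code and a database query for rows written today.

```shell
# Hypothetical helper: decide whether this run counts as a real success.
# Arguments (all assumptions for illustration):
#   $1 mode               "real" or "mock"
#   $2 db_write_ok        "1" if the database write succeeded, else "0"
#   $3 rows_written_today number of rows this run wrote with today's date
ingestion_verdict() {
  local mode db_write_ok rows_today
  mode="$1"
  db_write_ok="$2"
  rows_today="$3"

  # Mock or degraded runs never count as real ingestion.
  if [ "$mode" != "real" ]; then
    echo "not_real"
    return 1
  fi
  # A JSON-only run without a database write is a failure.
  if [ "$db_write_ok" != "1" ]; then
    echo "db_write_failed"
    return 1
  fi
  # Old rows do not help: this run must have written something today.
  if [ "$rows_today" -le 0 ]; then
    echo "no_rows_today"
    return 1
  fi
  echo "ok"
}
```

A caller might use it as `ingestion_verdict real 1 "$rows" || exit 1`, so that any of the three masked-failure cases stops the pipeline with a non-zero exit code.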
+
+**Step 4: Re-run verification**
+
+Run:
+- `bash scripts/run_daily.sh`
+- `bash scripts/run_real_pipeline.sh`
+- `bash scripts/verify_phase3.sh`
+
+Expected:
+- Real failures actually fail
+- Mock data, JSON-only output, and old data no longer pass as completed
+
+**Step 5: Commit**
+
+```bash
+git add scripts/fetch_openrouter.go scripts/run_daily.sh scripts/run_real_pipeline.sh scripts/verify_phase3.sh scripts/fetch_openrouter_test.go
+git commit -m "fix(runtime): harden daily ingestion success checks"
+```
+
+### Task 2: Route official daily reports and historical rebuilds through different run semantics to fix mixed audit records
+
+**Files:**
+- Modify: `scripts/generate_daily_report.go`
+- Modify: `scripts/rebuild_historical_report.sh`
+- Modify: `scripts/report_utils.sh`
+- Modify: `scripts/run_daily.sh`
+- Modify: `scripts/run_real_pipeline.sh`
+- Modify: `scripts/verify_phase3.sh`
+- Test: `scripts/generate_daily_report_test.go`
+
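+**Step 1: Write failing tests**
+
+Add tests verifying that:
+- Official daily output and historical rebuilds write different run kinds
+- A historical rebuild must not pass as a "scheduled daily output"
+- `fetchLatestReport` and the frontend's latest-report read still target official output only
+
+**Step 2: Run the tests to confirm the current mixing**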
+
+Run:
+- `go test -tags llm_script scripts/generate_daily_report.go scripts/generate_daily_report_test.go`
+
+Expected:
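+- The run semantics of `daily_report` / `report_runs` do not yet distinguish official runs from rebuilds
+
+**Step 3: Minimal implementation**
+
+Suggested new semantic fields, used consistently:
+- `run_kind`: `scheduled` / `historical_rebuild` / `manual`
+- `trigger_source`: `cron` / `cli` / `rebuild_script`
+- `is_official_daily`: whether the run is the day's official scheduled output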
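
One way the three fields could relate, as a sketch only. The mapping below (cron runs are the official scheduled output, rebuild-script runs are historical rebuilds, CLI runs are manual) is an assumption for illustration; the plan lists the allowed values but does not fix how they pair up, and `run_semantics` is a hypothetical function name.

```shell
# Hypothetical mapping from trigger_source to run_kind and is_official_daily.
# Prints "run_kind=<kind> is_official_daily=<true|false>" on stdout.
run_semantics() {
  case "$1" in
    cron)
      # Assumption: only cron-triggered runs are the official daily output.
      echo "run_kind=scheduled is_official_daily=true"
      ;;
    rebuild_script)
      # A rebuild must never be marked as the official daily output.
      echo "run_kind=historical_rebuild is_official_daily=false"
      ;;
    cli)
      echo "run_kind=manual is_official_daily=false"
      ;;
    *)
      echo "unknown trigger_source: $1" >&2
      return 2
      ;;
  esac
}
```

Rejecting unknown trigger sources with a non-zero status keeps an unlabeled run from being recorded under a default kind.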
+
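+Where to apply them:
+- Database writes in `generate_daily_report.go` carry the run kind
+- `rebuild_historical_report.sh` always marks runs with historical-rebuild semantics
+- The frontend and API read only official output as the "latest report" by default
+
+**Step 4: Re-run verification**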
+
+Run:
+- `go test ./...`
+- `bash scripts/rebuild_historical_report.sh 2025-08-07`
+- `bash scripts/run_daily.sh`
+
+Expected:
+- Historical rebuilds and daily output can coexist without being lumped together in the audit layer
+
+**Step 5: Commit**
+
+```bash
+git add scripts/generate_daily_report.go scripts/rebuild_historical_report.sh scripts/report_utils.sh scripts/run_daily.sh scripts/run_real_pipeline.sh scripts/verify_phase3.sh scripts/generate_daily_report_test.go
+git commit -m "feat(audit): separate scheduled and rebuild report runs"
+```
+
+### Task 3: Bring multi-source data into the same automated daily schedule
+
+**Files:**
+- Modify: `scripts/run_daily.sh`
+- Modify: `scripts/run_real_pipeline.sh`
+- Modify: `scripts/fetch_multi_source.go`
+- Create or Modify: `scripts/fetch_multi_source_test.go`
+- Modify: `scripts/verify_phase3.sh`
+- Modify: `scripts/verify_phase5.sh`
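+- Modify as needed: `scripts/import_phase2_data.go`, `scripts/import_zhipu_data.go`, `scripts/import_bytedance_data.go`
+
+**Step 1: Write failing tests**
+
+Add tests verifying that:
+- The scheduled pipeline knows exactly which sources took part in the day's sync
+- At minimum, the daily syncs for OpenRouter, domestic vendors, and aggregator platforms are visible at the verification layer
+
+**Step 2: Design a minimal scheduling orchestration**
+
+Split the daily schedule into enumerable stages: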
+- `openrouter`
+- `multi_source`
+- `official_imports`
+- `daily_report`
+
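+Define a failure policy for each stage:
+- If any required source fails, mark the daily report as degraded or failed; it must not pass as fully successful
+- Some official imports may continue after a single-source failure, but the run record must keep a source-level trace of the failure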
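
The failure policy above can be sketched as a small summarizer. Everything here is an assumption made for illustration: the `summarize_run` name, the `stage:required|optional:exit_code` argument format, and the choice to report a required failure as `failed` and an optional failure as `degraded`. The plan only requires that neither case pass as full success and that failed sources leave a trace.

```shell
# Hypothetical summarizer for one daily run.
# Each argument has the form "stage:required|optional:exit_code".
# Prints "status=<success|degraded|failed> failed_sources=<list|none>".
# Returns non-zero only when a required stage failed.
summarize_run() {
  local status failed entry stage rest need code
  status="success"
  failed=""
  for entry in "$@"; do
    # Split "stage:need:code" with POSIX parameter expansion.
    stage="${entry%%:*}"
    rest="${entry#*:}"
    need="${rest%%:*}"
    code="${rest#*:}"
    if [ "$code" -ne 0 ]; then
      # Keep a source-level trace of every failure.
      failed="${failed:+$failed,}$stage"
      if [ "$need" = "required" ]; then
        status="failed"
      elif [ "$status" = "success" ]; then
        status="degraded"
      fi
    fi
  done
  echo "status=$status failed_sources=${failed:-none}"
  [ "$status" != "failed" ]
}
```

For example, `summarize_run openrouter:required:0 multi_source:required:0 official_imports:optional:1` reports a degraded run with `official_imports` listed as the failed source, while still returning zero so the report stage can continue.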
+
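+**Step 3: Minimal implementation**
+
+Suggested priority:
+- First wire `fetch_multi_source.go` into the daily schedule
+- Then wire the existing official import scripts into an optional daily supplementary sync stage
+- Finally unify the audit output so `report_runs` can show the set of sources triggered in this run and the set that failed
+
+**Step 4: Re-run verification**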
+
+Run:
+- `go test -tags llm_script scripts/fetch_multi_source.go scripts/fetch_multi_source_test.go`
+- `bash scripts/run_daily.sh`
+- `bash scripts/verify_phase3.sh`
+- `bash scripts/verify_phase5.sh`
+
+Expected:
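+- The daily schedule no longer proves only that OpenRouter refreshed on its own
+- Multi-source sync is recognizable at both the scheduling and acceptance layers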
+
+**Step 5: Commit**
+
+```bash
+git add scripts/run_daily.sh scripts/run_real_pipeline.sh scripts/fetch_multi_source.go scripts/fetch_multi_source_test.go scripts/verify_phase3.sh scripts/verify_phase5.sh
+git commit -m "feat(runtime): fold multi-source sync into daily pipeline"
+```
+
+---
+
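+### Suggested execution order
+
+1. Do **Task 1** first: it is the problem most likely to let a false success pass as a real one, so it carries the highest risk.
+2. Then do **Task 2** to separate the audit boundary between official daily reports and historical rebuilds.
+3. Do **Task 3** last to bring multi-source sync fully into the daily schedule.
+
+### Suggested acceptance order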
+
+1. `bash scripts/run_daily.sh`
+2. `bash scripts/rebuild_historical_report.sh <date>`
+3. `bash scripts/verify_phase3.sh`
+4. `bash scripts/verify_phase5.sh`
+5. `go test ./...`
+