Complete batch import v2 runtime and host capability recovery

2026-05-23 09:18:02 +08:00
parent e50c292c7f
commit cfa1eaa904
60 changed files with 3718 additions and 530 deletions
--- a/docs/2026-05-18-PRODUCTION_READINESS_REVIEW.md
+++ b/docs/2026-05-18-PRODUCTION_READINESS_REVIEW.md
@@ -58,7 +58,8 @@
   - Capability probe: [capability_probe.go](/home/long/project/sub2api-cn-relay-manager/internal/host/sub2api/capability_probe.go:1)
   - Import runtime: [runtime_import_service.go](/home/long/project/sub2api-cn-relay-manager/internal/provision/runtime_import_service.go:1)
   - Rollback: [rollback_service.go](/home/long/project/sub2api-cn-relay-manager/internal/provision/rollback_service.go:1)
-   - Reconcile: [batch_detail_and_reconcile_service.go](/home/long/project/sub2api-cn-relay-manager/internal/provision/batch_detail_and_reconcile_service.go:1)
+   - Reconcile: [service.go](/home/long/project/sub2api-cn-relay-manager/internal/reconcile/service.go:1)
+   - Batch detail: [batch_detail_service.go](/home/long/project/sub2api-cn-relay-manager/internal/provision/batch_detail_service.go:1)
   - State store: [db.go](/home/long/project/sub2api-cn-relay-manager/internal/store/sqlite/db.go:1)
   - Resource records: [managed_resources_repo.go](/home/long/project/sub2api-cn-relay-manager/internal/store/sqlite/managed_resources_repo.go:1)

@@ -217,8 +218,8 @@

 Evidence:

- The structures expected by the implementation plan, such as `internal/reconcile/*`, `access/planner.go`, and `worker/scheduler.go`, have still not landed, [implementation-plan.md](/home/long/project/sub2api-cn-relay-manager/docs/plans/2026-05-12-sub2api-cn-relay-manager-implementation-plan.md:69)
- The current logic is still concentrated mainly in `internal/provision/*` and `internal/access/closure.go`.
+- At the time of this review, the structures expected by the implementation plan, such as `internal/reconcile/*`, `access/planner.go`, and `worker/scheduler.go`, had not yet landed, [implementation-plan.md](/home/long/project/sub2api-cn-relay-manager/docs/plans/2026-05-12-sub2api-cn-relay-manager-implementation-plan.md:69)
+- As of 2026-05-22, these structures have landed in `internal/reconcile/*`, `internal/access/{planner,subscription,self_service,validation}.go`, and `internal/worker/*` respectively.

 Impact:

--- a/docs/2026-05-22-BATCH_AUTO_IMPORT_V2_RESTORATION_CHECKLIST.md
+++ b/docs/2026-05-22-BATCH_AUTO_IMPORT_V2_RESTORATION_CHECKLIST.md
@@ -30,6 +30,7 @@

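 - [x] The single source of state is `import_runs / import_run_items / import_run_item_events`
 - [x] The migration has landed and is protected by integration tests
+- [x] Run-level request context (`host_id / subscription_users / subscription_days / probe_api_key`) is persisted, so validate can resume after a restart
 - [x] `/api/batch-import/runs*` is wired to the V2 projection
 - [x] CLI `batch-import` enters the real pipeline through `ActionSet.CreateBatchImportRun`
 - [x] The result page and result API do not fall back to the legacy table structure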
@@ -54,14 +55,14 @@

 - Probe / alias / capability: `internal/probe/models.go`, `internal/probe/aliases.go`, `internal/probe/capability.go`, `internal/probe/completion.go`
 - Reuse / orchestration / worker / validation: `internal/batch/provider_id.go`, `internal/batch/reuse_policy.go`, `internal/batch/service.go`, `internal/batch/confirmation.go`, `internal/batch/validation.go`
- State store: `internal/store/sqlite/import_runs_repo.go`, `internal/store/sqlite/import_run_items_repo.go`, `internal/store/sqlite/import_run_item_events_repo.go`
- Projection / API / CLI: `internal/batch/status_projection.go`, `internal/app/http_batch_import.go`, `internal/app/batch_runtime.go`, `internal/app/http_batch_runs.go`, `cmd/cli/batch_import.go`
+- State store: `internal/store/sqlite/import_runs_repo.go`, `internal/store/sqlite/import_run_items_repo.go`, `internal/store/sqlite/import_run_item_events_repo.go`, `internal/store/migrations/0009_batch_import_run_request_context.sql`
+- Projection / API / CLI: `internal/batch/status_projection.go`, `internal/app/http_batch_import.go`, `internal/app/batch_runtime.go`, `internal/app/batch_runtime_background.go`, `internal/app/http_batch_runs.go`, `cmd/cli/batch_import.go`

 ### Test File Mapping

 - Unit tests: `internal/batch/types_test.go`, `internal/probe/models_test.go`, `internal/probe/aliases_test.go`, `internal/probe/capability_test.go`, `internal/probe/completion_test.go`
 - State machine: `internal/batch/provider_id_test.go`, `internal/batch/reuse_policy_test.go`, `internal/batch/service_test.go`, `internal/batch/confirmation_test.go`, `internal/batch/validation_test.go`, `internal/batch/status_projection_test.go`
- API / CLI: `internal/app/http_batch_import_test.go`, `internal/app/http_batch_runs_test.go`, `cmd/cli/batch_import_test.go`
+- API / CLI: `internal/app/http_batch_import_test.go`, `internal/app/http_batch_runs_test.go`, `internal/app/batch_runtime_background_test.go`, `cmd/cli/batch_import_test.go`
 - Integration: `tests/integration/batch_import_v2_test.go`

 ### API Route Mapping
@@ -78,7 +79,7 @@
 - `go test ./tests/integration/... -count=1`: PASS
 - `go test -cover ./internal/... -count=1`: PASS
  - `internal/access` 76.7%
-  - `internal/batch` 75.4%
+  - `internal/batch` 72.9%
  - `internal/probe` 78.2%
  - `internal/provision` 76.4%
  - `internal/pack` 73.9%
@@ -89,6 +90,9 @@

 - `buildCreateBatchImportRunAction` in `internal/app/http_batch_import.go` now resolves the registered host first, then delegates to `batchImportRuntimeRunner.execute`
 - `internal/app/batch_runtime.go` now chains `BatchImportService + ConfirmationWorker + ValidationService` into the synchronous entry-point driver for create-run
+- `internal/app/batch_runtime_background.go` adds a background runtime scheduler; a `running` run continues to be picked up and advanced after a control-plane restart
+- `internal/store/sqlite/import_run_items_repo.go` adds atomic lease acquisition; the confirmer is no longer dispatched twice concurrently before the lease is persisted
+- `internal/app/http_batch_import.go` / `internal/app/http_batch_runs.go` add `cursor/next_cursor`, and the run list `q` parameter matches `run_id / provider_id / base_url`
 - `cmd/cli/batch_import.go` continues to reuse `ActionSet.CreateBatchImportRun`, so CLI create-run also enters the real pipeline automatically with the entry-point fix
 - `internal/app/http_batch_import_test.go` adds a real-stub regression test that directly verifies create-run ultimately advances the item to `current_stage=done` with `access_status=active`

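Reviewer sketch for the atomic lease acquisition in `import_run_items_repo.go`: a minimal in-memory illustration of why the expiry check and the lease write have to happen as one step. All names here (`leaseStore`, `TryAcquire`, the integer timestamps) are hypothetical and are not taken from the repository; the real implementation persists leases in SQLite, where the equivalent would presumably be a single conditional update rather than a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// lease records who holds an item and until when (Unix seconds).
// Hypothetical shape, not the project's schema.
type lease struct {
	owner     string
	expiresAt int64
}

// leaseStore is an in-memory stand-in for the item repository.
type leaseStore struct {
	mu     sync.Mutex
	leases map[string]lease
}

func newLeaseStore() *leaseStore {
	return &leaseStore{leases: make(map[string]lease)}
}

// TryAcquire grants the lease only if the item has no unexpired lease.
// The check and the write share one critical section, so concurrent
// callers cannot both observe "free" and both proceed.
func (s *leaseStore) TryAcquire(itemID, owner string, now, ttl int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur, ok := s.leases[itemID]; ok && now < cur.expiresAt {
		return false
	}
	s.leases[itemID] = lease{owner: owner, expiresAt: now + ttl}
	return true
}

func main() {
	store := newLeaseStore()
	var wg sync.WaitGroup
	var mu sync.Mutex
	winners := 0
	// Eight workers race for the same item at the same logical time.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if store.TryAcquire("item-1", fmt.Sprintf("worker-%d", id), 100, 30) {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("workers that acquired the lease:", winners)
}
```

Only one of the racing workers can win, which is the property that prevents the confirmer from being dispatched twice for the same item.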
--- a/docs/EXECUTION_BOARD.md
+++ b/docs/EXECUTION_BOARD.md
@@ -15,6 +15,9 @@
 - The `self_service` main path has passed the latest-head standard fresh-host re-verification:
  - `artifacts/real-host-acceptance/20260521_210403/05-import.json`
  - `artifacts/real-host-acceptance/20260521_210403/07-access-status.json`
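+- latest-head relay-manager adds host capability self-healing:
+  - When a third-party OpenAI-compatible upstream causes host `/v1/chat/completions` to return `502 upstream_error` because the host misjudged `openai_responses_supported` as `true`, access closure and background reconcile automatically correct the affected account to the raw `/chat/completions` path and then retry
+  - This correction no longer depends on the host retaining a long-lived patch; after a host upgrade, the next import/access/reconcile trigger is enough to converge back to the correct capability
 - The official provider verification matrix still retains one non-blocking fact:
  - `artifacts/real-host-acceptance/20260521_222212_remote43_minimax-m2-7-official_key_import/21-summary.json` has shown that the official MiniMax template path works, but the verification key currently hits upstream `429`
 - `reconcile=drifted` may still appear on a shared fresh host, but the current interpretation is "noise from historical leftover resources", which does not block the first PRD release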
@@ -69,6 +72,9 @@
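   - The first `403 Forbidden` from account test is downgraded to an advisory warning; as long as `/models` already matches `smoke_test_model`, the batch is no longer misjudged as a blocking failure
   - Access closure adds a short completion retry for transient `503 / no available accounts` after import, so that the host's asynchronous probe / account warm-up window does not record a genuinely usable path as `broken`
   - `20260522_122706_local_v0129_kimi_a7m_subscription_freshhost` has shown that, with the fixed relay-manager + patched host combination, `kimi-a7m / kimi-k2.6` can reach `batch_status=succeeded`, `provider_status=active`, and `latest_access_status=subscription_ready`
+14. relay-manager latest-head adds capability self-healing after host upgrades
+   - For `/responses` misjudgment advisories such as `API returned 403: Forbidden`, the control plane now corrects the target account's `openai_responses_supported` to `false` during access closure and reconcile rerun, then retries gateway `/v1/chat/completions`
+   - This way, even if a host upgrade or an asynchronous probe overwrites the capability flag incorrectly, the control plane can pull the account back to a usable state at both the post-install confirmation and the continuous background reconcile stages

 ## Verified Gates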

@@ -122,16 +128,13 @@
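   - Real host initialization does not automatically create ordinary users; before going live, ordinary users must be created explicitly and reusable credentials retained
   - `self_service` requires an ordinary user key bound to the target standard group, and usually an available balance as well
   - `subscription` requires a subscription-type group + an ordinary user subscription assignment + key/group binding
+   - If continuous background reconcile is enabled, the SQLite state store persists the latest access probe metadata; in deployment, the database file must be protected as a secret

-2. Structural debt
-   - access / reconcile are still not fully split into separate submodules per the implementation plan
-   - There is still no built-in scheduler/jobs
-
-3. Deployment and environment limits
+2. Deployment and environment limits
   - The standard multi-stage Dockerfile is still unstable in restricted network environments
   - `scripts/build_local_image.sh` + `Dockerfile.local` is currently recommended

-4. official provider verification matrix
+3. official provider verification matrix
   - The current official MiniMax live sample shows the template path works, but the verification key hits upstream `429`
   - Whether official providers such as Qwen / GLM / Kimi / Step pass live acceptance still depends on future official keys and quota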

@@ -168,7 +171,9 @@
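 - The remaining review issues have also been closed out:
  - capability upgraded from an overall upstream profile to transport + model profiles
  - Result page fields, state store fields, and the retry/event trail are unified
+  - Run-level request context is persisted to `import_runs`; after a control-plane restart, validate can keep using `host_id / subscription_users / subscription_days / probe_api_key`
  - OpenAPI now covers `/api/batch-import/runs*`; legacy `/api/import-batches/*` is downgraded to v1/legacy
+  - The run/item list APIs now support `cursor/next_cursor`; the run list `q` parameter matches `run_id / provider_id / base_url`
  - Added an automatic reuse policy for duplicate imports: `reused / patch_only / replace` is decided by `provider_id + api_key_fingerprint + canonical_model_family`
  - Added an alias normalization contract for the same model: for example, `kimi 2.6 / kimi-2.6 / kimi-k2.6` can be merged into one model family and reused quickly
  - Added a policy for multi-account duplicate imports and re-enabling deprecated accounts: active accounts show "duplicate, already enabled", while disabled/deprecated accounts show their original state and take the `reactivated` fast-enable path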
@@ -188,6 +193,7 @@
 - `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_RESTORATION_CHECKLIST.md` is complete
 - latest-head completes the create-run entry-point wiring from `internal/app/http_batch_import.go` -> `internal/app/batch_runtime.go`
 - API and CLI create-run now both actually drive `BatchImportService + ConfirmationWorker + ValidationService`
+- After the control-plane server starts, it automatically runs the batch-import background scheduler, so a `running` run can continue to advance after a restart
 - The latest verification round stays all green: `go test ./... -count=1`, `go test ./tests/integration/... -count=1`, `go test -cover ./internal/... -count=1`, `go vet ./...`, `gofmt -l .`

 **Real Gate**: ✅ Docs, state machine, projection, tests, audit, and the create-run entry point are aligned; **the V2 design has been delivered per the baseline plan**
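Reviewer sketch for the capability self-healing described in this board: a minimal model of "retry once after correcting the flag". The `account` struct, the `completeFunc` signature, and `fakeGateway` are hypothetical stand-ins and not the project's types; only the `openai_responses_supported` flag, the `502` status, and the retry against `/v1/chat/completions` come from the text above.

```go
package main

import "fmt"

// account carries only the capability flag discussed in the board.
// The struct itself is hypothetical.
type account struct {
	Name                     string
	OpenAIResponsesSupported bool
}

// completeFunc stands in for one gateway /v1/chat/completions call on
// behalf of the account and returns the HTTP status code.
type completeFunc func(a account) int

// completeWithSelfHeal retries once after clearing the flag when the
// first attempt returns 502 while the flag is still true.
func completeWithSelfHeal(a *account, complete completeFunc) (status int, healed bool) {
	status = complete(*a)
	if status != 502 || !a.OpenAIResponsesSupported {
		return status, false
	}
	a.OpenAIResponsesSupported = false // correct the misdetected capability
	return complete(*a), true
}

// fakeGateway simulates an upstream that only serves raw chat
// completions: routing through /responses fails with 502.
func fakeGateway(a account) int {
	if a.OpenAIResponsesSupported {
		return 502
	}
	return 200
}

func main() {
	acct := &account{Name: "relay-a", OpenAIResponsesSupported: true}
	status, healed := completeWithSelfHeal(acct, fakeGateway)
	fmt.Println(status, healed, acct.OpenAIResponsesSupported)
}
```

Because the correction is applied on the failing path itself, the same routine can run from both access closure and a background reconcile pass, which is what makes the fix survive a host upgrade that resets the flag.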
--- a/docs/KNOWN_LIMITATIONS.md
+++ b/docs/KNOWN_LIMITATIONS.md
@@ -4,10 +4,11 @@ This document covers known limitations that operators should be aware of before

 ## Core Limitations

-### 1. No Automated Reconcile Scheduler (P2)
- Reconciliation must be triggered manually via `POST /api/providers/{providerID}/reconcile` or CLI.
- No cron/scheduler service is bundled.
- Workaround: set up a cron job on the host OS calling the HTTP API periodically.
+### 1. Automated Reconcile Is Available, but Disabled by Default (P2)
+- A built-in background reconcile worker is now available in the control plane server.
+- It is gated by `SUB2API_CRM_RECONCILE_WORKER_ENABLED=true` and uses `SUB2API_CRM_RECONCILE_POLL_INTERVAL` for cadence.
+- The current scheduler model is still a simple polling runner rather than a full generic jobs platform.
+- Manual `POST /api/providers/{providerID}/reconcile` and CLI reconcile remain supported.

 ### 2. Real sub2api Compatibility Is Verified on a Fresh Host, but Requires Explicit Operator Preparation
 - Real-host validation has now been executed against a fresh redeployed sub2api host.
@@ -16,28 +17,19 @@ This document covers known limitations that operators should be aware of before
 - However, host initialization alone is not enough: operators must explicitly create ordinary users, keep reusable credentials, bind keys to the correct group, and satisfy the billing/subscription prerequisites documented in `docs/REAL_HOST_ACCEPTANCE_RUNBOOK.md`.
 - This is therefore no longer a code-compatibility blocker; it is an explicit operational prerequisite.

-### 3. Access Module Not Fully Structured per Implementation Plan
- The `access` package contains only `closure.go` (the combined close/validate logic).
- `planner.go`, `subscription_service.go`, `self_service_checker.go` are not separately extracted.
- All access logic is functional in `closure.go` but not split per the planned directory structure.
-
-### 4. Reconcile Logic Inline in Provision Package
- Reconcile lives in `internal/provision/batch_detail_and_reconcile_service.go` rather than a separate `internal/reconcile/*` package.
- Functionally complete but structural gap vs implementation plan.
-
-### 5. Standard Multi-stage Docker Build Still Depends on Outbound Module Download
+### 3. Standard Multi-stage Docker Build Still Depends on Outbound Module Download
 - `Dockerfile.local` has been validated as the recommended proxy-safe build path.
 - `scripts/build_local_image.sh` now prebuilds the Linux binary on the host and produces `sub2api-cn-relay-manager:local` reliably in this environment.
 - The standard multi-stage `Dockerfile` still requires outbound Go module download from inside the container build context; in restricted networks, prefer the local-image path.

 ## Accepted Design Trade-offs

-### 6. CLI Run Functions Not Unit-Tested
+### 4. CLI Run Functions Not Unit-Tested
 - `runInstallPack`, `runImportProvider`, `runPreviewProvider`, `runRollbackProvider`, `runReconcileProvider`, `findProvider` connect to real SQLite/sub2api — these are 0% covered in unit tests.
 - The `execute()` dispatch and all `parse*` functions are fully tested.
 - In an integration/E2E context these functions are exercised through the host stub.

-### 7. No Web UI
+### 5. No Web UI
 - Administration is through CLI and HTTP API only.
 - Consistent with MVP scope defined in PRD.

@@ -45,7 +37,9 @@ This document covers known limitations that operators should be aware of before

 ### Token Security
 - `SUB2API_CRM_ADMIN_TOKEN` must be at least 20 characters, rotated outside source control.
- API keys imported via `--access-api-key` are used for gateway probe calls — they are not stored in control-plane state (only key fingerprint/hash is stored).
+- To support continuous background reconcile, the latest access closure now persists probe metadata in control-plane state:
+  `self_service` stores the probe API key, and `subscription` stores the subscription user selector metadata.
+- Operators should therefore treat the SQLite database as secret-bearing state and protect it accordingly.

 ### Database
 - SQLite is the only supported database backend for v0.1.
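Reviewer sketch for the opt-in reconcile worker: a minimal example of gating a polling loop on the two environment variables named above. The value formats are assumptions of this sketch, not documented behavior: the enabled flag is parsed with `strconv.ParseBool`, the interval is read as a Go duration string such as `30s`, and the one-minute default is invented for illustration. `workerConfig`, `parseWorkerConfig`, and `runPolling` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"strconv"
	"time"
)

// workerConfig is a hypothetical holder for the two settings.
type workerConfig struct {
	Enabled  bool
	Interval time.Duration
}

// parseWorkerConfig interprets the raw environment values.
// Assumed formats: ParseBool for the flag, a Go duration for the
// interval, and a one-minute default when the interval is unset.
func parseWorkerConfig(enabledRaw, intervalRaw string) (workerConfig, error) {
	cfg := workerConfig{Interval: time.Minute}
	if enabledRaw != "" {
		enabled, err := strconv.ParseBool(enabledRaw)
		if err != nil {
			return cfg, fmt.Errorf("parse enabled flag: %w", err)
		}
		cfg.Enabled = enabled
	}
	if intervalRaw != "" {
		d, err := time.ParseDuration(intervalRaw)
		if err != nil {
			return cfg, fmt.Errorf("parse poll interval: %w", err)
		}
		if d <= 0 {
			return cfg, fmt.Errorf("poll interval must be positive, got %s", d)
		}
		cfg.Interval = d
	}
	return cfg, nil
}

// runPolling calls job once per interval until ctx is cancelled.
func runPolling(ctx context.Context, interval time.Duration, job func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			job()
		}
	}
}

func main() {
	cfg, err := parseWorkerConfig(
		os.Getenv("SUB2API_CRM_RECONCILE_WORKER_ENABLED"),
		os.Getenv("SUB2API_CRM_RECONCILE_POLL_INTERVAL"),
	)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	if !cfg.Enabled {
		fmt.Println("reconcile worker disabled")
		return
	}
	// Run for three intervals only, so the demo terminates.
	ctx, cancel := context.WithTimeout(context.Background(), 3*cfg.Interval)
	defer cancel()
	runPolling(ctx, cfg.Interval, func() { fmt.Println("reconcile tick") })
}
```

With neither variable set, the sketch stays disabled, matching the "disabled by default" behavior stated above.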
--- a/docs/PRODUCTION_CLOSURE_BOARD.md
+++ b/docs/PRODUCTION_CLOSURE_BOARD.md
@@ -86,9 +86,7 @@
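 - `subscription` requires a subscription-type group + an ordinary user subscription assignment + key/group binding

 ### P2 Accepted Technical Debt
- The access module is still not split into `planner.go / subscription_service.go / self_service_checker.go` per the implementation plan
- reconcile is still inlined in `internal/provision/` and not split into `internal/reconcile/*`
- No built-in scheduler/jobs; currently compensated by manual reconcile + external cron
+- `internal/worker` now provides an extracted generic polling runner, and both the batch-import runtime and background reconcile are wired to it; the scheduling model is still fixed-interval polling rather than a full jobs/reconcile platform
 - The CLI `run*` real-path functions have no systematic mocked unit tests
 - The standard multi-stage `Dockerfile` still depends on in-container network access to pull Go modules in restricted networks; local deployment defaults to `scripts/build_local_image.sh`
 - The `subscription` provider matrix has passed; what remains is the latest-head `self_service` fresh-host re-verification, not further replacement of provider keys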
--- a/docs/openapi.yaml
+++ b/docs/openapi.yaml
@@ -173,6 +173,12 @@ paths:
    get:
      security:
        - bearerAuth: []
+      parameters:
+        - $ref: '#/components/parameters/BatchImportRunStateQuery'
+        - $ref: '#/components/parameters/BatchImportAccessModeQuery'
+        - $ref: '#/components/parameters/BatchImportQuery'
+        - $ref: '#/components/parameters/CursorQuery'
+        - $ref: '#/components/parameters/LimitQuery'
      responses:
        '200':
          description: list batch import runs
@@ -205,6 +211,16 @@ paths:
        - bearerAuth: []
      parameters:
        - $ref: '#/components/parameters/RunID'
+        - $ref: '#/components/parameters/BatchImportCurrentStageQuery'
+        - $ref: '#/components/parameters/BatchImportConfirmationStatusQuery'
+        - $ref: '#/components/parameters/BatchImportAccessStatusQuery'
+        - $ref: '#/components/parameters/BatchImportHasWarningQuery'
+        - $ref: '#/components/parameters/BatchImportProviderIDQuery'
+        - $ref: '#/components/parameters/BatchImportMatchedAccountStateQuery'
+        - $ref: '#/components/parameters/BatchImportAccountResolutionQuery'
+        - $ref: '#/components/parameters/BatchImportQuery'
+        - $ref: '#/components/parameters/CursorQuery'
+        - $ref: '#/components/parameters/LimitQuery'
      responses:
        '200':
          description: batch import run items
@@ -471,6 +487,86 @@ components:
      required: false
      schema:
        type: string
+    BatchImportRunStateQuery:
+      name: state
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [running, completed, completed_with_warnings, failed, cancelled]
+    BatchImportAccessModeQuery:
+      name: access_mode
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [subscription, self_service]
+    BatchImportQuery:
+      name: q
+      in: query
+      required: false
+      schema:
+        type: string
+    CursorQuery:
+      name: cursor
+      in: query
+      required: false
+      schema:
+        type: string
+    LimitQuery:
+      name: limit
+      in: query
+      required: false
+      schema:
+        type: integer
+        minimum: 1
+    BatchImportCurrentStageQuery:
+      name: current_stage
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [probe, provision, confirm, validate, done]
+    BatchImportConfirmationStatusQuery:
+      name: confirmation_status
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [pending, confirmed, advisory, failed]
+    BatchImportAccessStatusQuery:
+      name: access_status
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [unknown, active, degraded, broken]
+    BatchImportHasWarningQuery:
+      name: has_warning
+      in: query
+      required: false
+      schema:
+        type: boolean
+    BatchImportProviderIDQuery:
+      name: provider_id
+      in: query
+      required: false
+      schema:
+        type: string
+    BatchImportMatchedAccountStateQuery:
+      name: matched_account_state
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [none, active, disabled, deprecated, broken]
+    BatchImportAccountResolutionQuery:
+      name: account_resolution
+      in: query
+      required: false
+      schema:
+        type: string
+        enum: [created, reused, reactivated, replaced]
  responses:
    Unauthorized:
      description: missing or invalid admin token
@@ -710,6 +806,9 @@ components:
          type: array
          items:
            $ref: '#/components/schemas/BatchImportRunSummary'
+        next_cursor:
+          type: string
+          nullable: true
    BatchImportCapabilityTransportProfile:
      type: object
      properties:
@@ -886,6 +985,9 @@ components:
          type: array
          items:
            $ref: '#/components/schemas/BatchImportRunItemSummary'
+        next_cursor:
+          type: string
+          nullable: true
    ImportBatchInfo:
      type: object
      properties:
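Reviewer sketch for the `cursor` / `limit` / `next_cursor` contract added in this spec: a minimal client loop that keeps requesting pages until `next_cursor` is null. The `page` type, `fetchPage` signature, and `fakeServer` are hypothetical; in particular, using a decimal offset as the cursor is only a convenience of this sketch, and a real client should treat the cursor as opaque.

```go
package main

import (
	"fmt"
	"strconv"
)

// page is a hypothetical client-side view of one list response.
// NextCursor is nil when the server returns next_cursor: null.
type page struct {
	Items      []string
	NextCursor *string
}

// fetchPage stands in for one GET with cursor and limit query parameters.
type fetchPage func(cursor string, limit int) (page, error)

// listAll follows next_cursor until it is absent, collecting every item.
func listAll(fetch fetchPage, limit int) ([]string, error) {
	var all []string
	cursor := ""
	for {
		p, err := fetch(cursor, limit)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Items...)
		if p.NextCursor == nil || *p.NextCursor == "" {
			return all, nil
		}
		cursor = *p.NextCursor
	}
}

// fakeServer pages over a fixed slice. The offset-as-cursor encoding
// is an assumption of this sketch only.
func fakeServer(data []string) fetchPage {
	return func(cursor string, limit int) (page, error) {
		if limit < 1 {
			return page{}, fmt.Errorf("limit must be >= 1, got %d", limit)
		}
		start := 0
		if cursor != "" {
			n, err := strconv.Atoi(cursor)
			if err != nil || n < 0 || n > len(data) {
				return page{}, fmt.Errorf("invalid cursor %q", cursor)
			}
			start = n
		}
		end := start + limit
		if end > len(data) {
			end = len(data)
		}
		p := page{Items: data[start:end]}
		if end < len(data) {
			next := strconv.Itoa(end)
			p.NextCursor = &next
		}
		return p, nil
	}
}

func main() {
	runs := []string{"run-1", "run-2", "run-3", "run-4", "run-5"}
	all, err := listAll(fakeServer(runs), 2)
	fmt.Println(all, err)
}
```

Modeling `next_cursor` as a pointer keeps the "no more pages" case distinct from an empty string, which mirrors the `nullable: true` declaration in the schema.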
--- a/docs/plans/2026-05-22-batch-auto-import-v2-implementation-plan.md
+++ b/docs/plans/2026-05-22-batch-auto-import-v2-implementation-plan.md
@@ -0,0 +1,784 @@
+# Batch Auto-Import V2 Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
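+**Goal:** Implement V2 batch import from URL + key, covering model discovery, same-model alias merging, duplicate-import reuse, asynchronous confirmation, final gateway validation, and the state projection needed by the result API and result page.
+
+**Architecture:** A layered architecture of `BatchImportService + ConfirmationWorker + ValidationService + RunStateStore + ResultProjection`. V2 treats only `import_runs / import_run_items / import_run_item_events` as the runtime source of truth; the old `import_batches/*` keeps only legacy linkage. Duplicate-import decisions are based on `provider_id + api_key_fingerprint + canonical_model_family`, and final availability is judged only by the host's real `/v1/chat/completions`.
+
+**Tech Stack:** Go 1.22.2, `database/sql` + SQLite, Chi, OpenAPI 3.1, Go `testing`, `httptest`, the existing `internal/host/sub2api` adapter layer, and the `tests/integration` integration test suite.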
+
+---
+
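+## 0. Implementation Constraints
+
+- Work only through the host HTTP API; never write to the host database directly.
+- All state enums, field names, and API routes must follow the current canonical contract.
+- For every task, write the failing test first, then the minimal implementation, then run verification.
+- Commit each task separately; avoid large, mixed commits.
+- Any UI/API display may read only the V2 canonical state and must not fall back to stitching logs together.
+
+## 1. Task Overview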
+
+```text
+T1  Canonical types and enums
+T2  Probe models + alias normalization + canonical family
+T3  Capability profile + smoke completion routing
+T4  Provider ID + reuse policy
+T5  Run/item/event state store repositories
+T6  BatchImportService: Stage 0~2
+T7  ConfirmationWorker + retry + lease
+T8  ValidationService + access status
+T9  ResultProjection
+T10 HTTP API: runs/items
+T11 CLI: batch-import
+T12 Integration + contract verification
+T13 Design restoration audit
+```
+
+## 2. Design Restoration Verification Matrix
+
+### 2.1 Goal Coverage Matrix
+
+| Design goal | Tasks | Verification |
+|---|---|---|
+| Automatic model discovery from URL + key | T2, T6, T12 | `/v1/models` fetch, integration tests |
+| Model correction and alias normalization | T2, T4, T9, T12 | unit + item detail projection |
+| Fast cross-relay recognition of the same model | T2, T4, T12 | `canonical_model_family` tests |
+| Automatic reuse on duplicate import | T4, T6, T9, T12 | reuse decision + projection |
+| Direct reuse of already-enabled duplicate accounts | T4, T6, T9, T12 | `matched_account_state=active` |
+| Fast re-enable of disabled/deprecated accounts | T4, T6, T7, T9, T12 | `account_resolution=reactivated` |
+| transport + model capability profile | T3, T9, T10, T12 | profile persistence + API schema |
+| channel/account evolution | T6, T12 | patch contract + host stub |
+| Asynchronous confirmation and retry | T7, T12 | lease/retry/event trail |
+| Final verdict by gateway completion | T8, T12 | single writer of `access_status` |
+| Result API and result page data source | T5, T9, T10, T12 | run/item/event projection |
+| Single source of state | T5, T7, T8, T9 | reads only `import_runs/*` |
+
+### 2.2 Contract Coverage Matrix
+
+| Contract | Tasks |
+|---|---|
+| `run_id / item_id / provider_id` | T1, T4, T5 |
+| `run.state` | T1, T5, T9 |
+| `current_stage / confirmation_status / access_status` | T1, T5, T7, T8 |
+| `matched_account_state / account_resolution` | T4, T5, T6, T9, T10 |
+| `api_key_fingerprint` | T4, T5, T6 |
+| `canonical_model_families` | T2, T4, T5, T9, T10 |
+| `provision_reused / reused_from_*` | T4, T5, T6, T9, T10 |
+| `/api/batch-import/runs*` | T10, T12 |
+
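+If T1~T12 are all completed and verified, T13 must be able to show that every entry in the matrices above is "covered"; otherwise V2 must not be declared implementable as designed.
+
+## 3. Implementation Tasks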
+
+### Task 1: Canonical Types And Enums
+
+**Files:**
+- Create: `internal/batch/types.go`
+- Test: `internal/batch/types_test.go`
+- Reference: `docs/2026-05-21-BATCH_AUTO_IMPORT_SPEC.md`
+
+**Step 1: Write the failing test**
+
+Write failing tests for the following enums:
+- `RunState`
+- `ItemStage`
+- `ConfirmationStatus`
+- `AccessStatus`
+- `MatchedAccountState`
+- `AccountResolution`
+
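+At minimum, cover:
+- Whether the constant values match the documentation
+- Whether invalid strings are rejected by the later parsing layer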
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestRunStateConstants|TestItemStateConstants' -count=1
+```
+
+Expected: FAIL, reporting that the types or constants do not exist.
+
+**Step 3: Write minimal implementation**
+
+Define the types and constants above in `internal/batch/types.go`; do not introduce helpers before they are needed.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestRunStateConstants|TestItemStateConstants' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/batch/types.go internal/batch/types_test.go
+git commit -m "feat(batch): add canonical v2 state enums"
+```
+
+### Task 2: Probe Models, Alias Normalization, Canonical Family
+
+**Files:**
+- Create: `internal/probe/models.go`
+- Create: `internal/probe/aliases.go`
+- Test: `internal/probe/models_test.go`
+- Test: `internal/probe/aliases_test.go`
+- Reference: `docs/2026-05-21-BATCH_AUTO_IMPORT_TDD_PLAN.md`
+
+**Step 1: Write the failing test**
+
+Cover:
+- Parsing the OpenAI-format `/v1/models` response
+- Empty model list
+- Authentication failure
+- `kimi 2.6 / kimi-2.6 / kimi-k2.6` merge into the same `canonical_model_family`
+- Vendor-prefix normalization for `deepseek-ai/DeepSeek-V4-Pro`
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/probe -run 'TestProviderModels|TestCanonicalModelFamily' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- `ProviderModels`
+- `NormalizeModelID`
+- `CanonicalModelID`
+- `CanonicalModelFamily`
+- `BuildAliasTable`
+- `ResolveRequestedModel`
+- `RecommendModels`
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/probe -run 'TestProviderModels|TestCanonicalModelFamily' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/probe/models.go internal/probe/aliases.go internal/probe/models_test.go internal/probe/aliases_test.go
+git commit -m "feat(probe): add model discovery and canonical family normalization"
+```
+
+### Task 3: Capability Profile And Smoke Completion Routing
+
+**Files:**
+- Create: `internal/probe/capability.go`
+- Create: `internal/probe/completion.go`
+- Test: `internal/probe/capability_test.go`
+- Test: `internal/probe/completion_test.go`
+- Reference: `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_ARCHITECTURE.md`
+
+**Step 1: Write the failing test**
+
+Cover:
+- `responses` unsupported while `chat/completions` works
+- Advisory recording in the transport profile
+- Per-model profile recording
+- `ResolveSmokeModel` selects the smoke model based on aliases and capabilities
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/probe -run 'TestProbeCapabilities|TestResolveSmokeModel|TestSmokeCompletion' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- `TransportProfile`
+- `ModelCapabilityProfile`
+- `CapabilityProfile`
+- `ProbeCapabilities`
+- `CompletionResult`
+- `ResolveSmokeModel`
+- `SmokeCompletion`
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/probe -run 'TestProbeCapabilities|TestResolveSmokeModel|TestSmokeCompletion' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/probe/capability.go internal/probe/completion.go internal/probe/capability_test.go internal/probe/completion_test.go
+git commit -m "feat(probe): add capability profile and smoke completion routing"
+```
+
+### Task 4: Provider ID And Reuse Policy
+
+**Files:**
+- Create: `internal/batch/provider_id.go`
+- Create: `internal/batch/reuse_policy.go`
+- Test: `internal/batch/provider_id_test.go`
+- Test: `internal/batch/reuse_policy_test.go`
+- Reference: `docs/2026-05-21-BATCH_AUTO_IMPORT_SPEC.md:336`
+
+**Step 1: Write the failing test**
+
+Cover:
+- Same host with different paths yields different `provider_id` values
+- An existing active provider whose family is already covered -> `reused`
+- An existing active account -> `matched_account_state=active`, `account_resolution=reused`
+- `disabled/deprecated` account -> `reactivated`
+- `broken` provider/account -> `replace`
+- Same family with a different alias -> treated as already covered
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestNormalizeProviderID|TestDecideReuse' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- `NormalizeProviderID`
+- `ReuseDecision`
+- `DecideReuse`
+
+Do not modify the service directly in this step.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestNormalizeProviderID|TestDecideReuse' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/batch/provider_id.go internal/batch/reuse_policy.go internal/batch/provider_id_test.go internal/batch/reuse_policy_test.go
+git commit -m "feat(batch): add provider id and reuse policy"
+```
+
+### Task 5: Run/Item/Event State Store Repositories
+
+**Files:**
+- Modify: `internal/store/migrations/0007_batch_import_runs.sql`
+- Modify: `internal/store/migrations/0008_batch_import_run_events.sql`
+- Modify: `internal/store/sqlite/import_runs_repo.go`
+- Create: `internal/store/sqlite/import_run_items_repo.go`
+- Create: `internal/store/sqlite/import_run_item_events_repo.go`
+- Modify: `internal/store/sqlite/db.go`
+- Test: `internal/store/sqlite/import_runs_repo_test.go`
+- Test: `tests/integration/store_init_test.go`
+
+**Step 1: Write the failing test**
+
+Cover:
+- run create/update
+- item upsert persists `api_key_fingerprint / canonical_model_families / matched_account_state / account_resolution / provision_reused`
+- event append/list
+- lease field persistence
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/store/sqlite/... ./tests/integration/... -run 'TestRunStateStore|TestStoreAppliesLatestMigration' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Complete the repos and migrations, making sure the schema matches the documentation exactly.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/store/sqlite/... ./tests/integration/... -run 'TestRunStateStore|TestStoreAppliesLatestMigration' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/store/migrations/0007_batch_import_runs.sql internal/store/migrations/0008_batch_import_run_events.sql internal/store/sqlite/import_runs_repo.go internal/store/sqlite/import_run_items_repo.go internal/store/sqlite/import_run_item_events_repo.go internal/store/sqlite/db.go internal/store/sqlite/import_runs_repo_test.go tests/integration/store_init_test.go
+git commit -m "feat(store): complete v2 runtime state repositories"
+```
+
+### Task 6: BatchImportService Stage 0~2
+
+**Files:**
+- Create: `internal/batch/service.go`
+- Create: `internal/batch/capability_profile.go`
+- Create: `internal/batch/channel_evolution.go`
+- Test: `internal/batch/service_test.go`
+- Test: `internal/batch/channel_evolution_test.go`
+- Reference: `internal/provision/import_service.go`
+
+**Step 1: Write the failing test**
+
+Cover:
+- Creating a run + items
+- reuse preflight skips duplicate provision
+- Duplicate import of an active account -> reused
+- Duplicate import of a deprecated account -> reactivated
+- patch-only for a new alias
+- Writing back the legacy batch/provider link
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestBatchImport_StartRun|TestModelMappingDelta' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- `BatchImportService.StartRun`
+- `ImportRoutingStrategy`
+- `BuildImportRoutingStrategy`
+- `ChannelPatchContract`
+- `ModelMappingDelta`
+
+Wire to the existing `provision.ImportService` first; do not extend the UI/API ahead of time.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestBatchImport_StartRun|TestModelMappingDelta' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/batch/service.go internal/batch/capability_profile.go internal/batch/channel_evolution.go internal/batch/service_test.go internal/batch/channel_evolution_test.go
+git commit -m "feat(batch): implement v2 run setup and provision stages"
+```
+
+### Task 7: ConfirmationWorker, Lease And Retry
+
+**Files:**
+- Create: `internal/batch/confirmation.go`
+- Test: `internal/batch/confirmation_test.go`
+- Reference: `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_ARCHITECTURE.md:398`
+
+**Step 1: Write the failing test**
+
+Cover:
+- Picks up only `confirm + pending + retry_due + lease_expired`
+- `403` probe race -> advisory
+- Initial `503 no available accounts` -> retry -> success
+- Lease mutual exclusion across multiple workers
+- Correct reactivated projection after a `disabled/deprecated` match
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestConfirmationWorker' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- `ConfirmationWorker.Tick`
+- `ConfirmationWorker.ConfirmItem`
+- retry schedule
+- lease lifecycle
+- advisory event writes
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestConfirmationWorker' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/batch/confirmation.go internal/batch/confirmation_test.go
+git commit -m "feat(batch): add confirmation worker and retry handling"
+```
+
+### Task 8: ValidationService And Final Access Status
+
+**Files:**
+- Create: `internal/batch/validation.go`
+- Test: `internal/batch/validation_test.go`
+- Reference: `internal/access/closure.go`
+
+**Step 1: Write the failing test**
+
+Cover:
+- `confirmed/advisory + chat 200 -> active`
+- exhausted transient -> `degraded`
+- definitive invalid path -> `broken`
+- Only ValidationService may write `access_status`
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestValidationService' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- `ValidationService.ValidateItem`
+- `access_status` mapping
+- Minimal update to the run summary
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestValidationService' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/batch/validation.go internal/batch/validation_test.go
+git commit -m "feat(batch): add validation service for final access status"
+```
+
+### Task 9: ResultProjection
+
+**Files:**
+- Create: `internal/batch/status_projection.go`
+- Test: `internal/batch/status_projection_test.go`
+- Reference: `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_API_SCHEMAS.md`
+
+**Step 1: Write the failing test**
+
+Cover:
+- run summary aggregation
+- item summary/detail projection
+- warning message templates
+- `provision_reused` badge
+- `matched_account_state / account_resolution` message text and badges
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestStatusProjection' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- run list projection
+- item list projection
+- item detail projection
+- warning/badge mapping
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/batch -run 'TestStatusProjection' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/batch/status_projection.go internal/batch/status_projection_test.go
+git commit -m "feat(batch): add result projection for v2 runs and items"
+```
+
+### Task 10: HTTP API For Runs And Items
+
+**Files:**
+- Create: `internal/app/http_batch_import.go`
+- Create: `internal/app/http_batch_runs.go`
+- Modify: `internal/app/http_api.go`
+- Test: `internal/app/http_batch_import_test.go`
+- Test: `internal/app/http_batch_runs_test.go`
+- Reference: `docs/openapi.yaml`
+
+**Step 1: Write the failing test**
+
+Cover:
+- `POST /api/batch-import/runs`
+- `GET /api/batch-import/runs`
+- `GET /api/batch-import/runs/{run_id}`
+- `GET /api/batch-import/runs/{run_id}/items`
+- `GET /api/batch-import/runs/{run_id}/items/{item_id}`
+- Conditionally required fields for `subscription/self_service`
+- List filtering by `matched_account_state / account_resolution`
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./internal/app -run 'TestBatchImportHTTP|TestBatchRunsHTTP' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Emit only the projection defined by the OpenAPI spec; do not leak the legacy table structure.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./internal/app -run 'TestBatchImportHTTP|TestBatchRunsHTTP' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add internal/app/http_batch_import.go internal/app/http_batch_runs.go internal/app/http_api.go internal/app/http_batch_import_test.go internal/app/http_batch_runs_test.go
+git commit -m "feat(api): add batch import v2 endpoints"
+```
+
+### Task 11: CLI Entry For Batch Import
+
+**Files:**
+- Modify: `cmd/cli/main.go`
+- Create: `cmd/cli/batch_import.go`
+- Test: `cmd/cli/batch_import_test.go`
+
+**Step 1: Write the failing test**
+
+Cover:
+- Argument parsing
+- `subscription` requires the subscription parameters
+- `self_service` requires `probe_api_key`
+- `--confirm-timeout`
+- Output includes `run_id/result_page`
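+
+The conditional-required rules can be sketched as one validation function that the CLI test drives. This is a sketch under assumptions: `Request`, `Validate`, and `SubscriptionRef` are hypothetical names (the plan does not list the individual subscription parameters); only the mode values `subscription` and `self_service` and the `probe_api_key` requirement come from the plan.
+
+```go
+package main
+
+import (
+	"errors"
+	"fmt"
+)
+
+// Request holds the access-mode inputs. Field names are hypothetical.
+type Request struct {
+	AccessMode      string
+	SubscriptionRef string // placeholder for the subscription parameters
+	ProbeAPIKey     string
+}
+
+// Validate enforces the conditionally required fields for each mode.
+// Modes other than the two named in the plan are not checked here.
+func Validate(r Request) error {
+	switch r.AccessMode {
+	case "subscription":
+		if r.SubscriptionRef == "" {
+			return errors.New("subscription mode requires subscription parameters")
+		}
+	case "self_service":
+		if r.ProbeAPIKey == "" {
+			return errors.New("self_service mode requires probe_api_key")
+		}
+	}
+	return nil
+}
+
+func main() {
+	fmt.Println(Validate(Request{AccessMode: "self_service"}))
+	fmt.Println(Validate(Request{AccessMode: "self_service", ProbeAPIKey: "k"}))
+}
+```
+
+Placing this check in the service layer rather than in the CLI would let the HTTP handler reuse it, which matches the plan's intent of not duplicating business logic in the CLI.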
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./cmd/cli -run 'TestBatchImportCLI' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Implement the CLI entry point into the V2 API/service; do not duplicate business logic in the CLI layer.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./cmd/cli -run 'TestBatchImportCLI' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add cmd/cli/main.go cmd/cli/batch_import.go cmd/cli/batch_import_test.go
+git commit -m "feat(cli): add v2 batch import command"
+```
+
+### Task 12: Integration And End-To-End Verification
+
+**Files:**
+- Create: `tests/integration/batch_import_v2_test.go`
+- Modify: `tests/integration/host_stub_test.go` (if the stub needs extending)
+
+**Step 1: Write the failing test**
+
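+Cover at least 6 real business flows:
+- Discover models and normalize them
+- Re-import of an active account -> reused
+- deprecated account -> reactivated
+- Same family, different alias -> patch_only
+- probe race + warmup retry -> advisory + active
+- run/item/event details can be read entirely from the new V2 tables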
+
+**Step 2: Run test to verify it fails**
+
+Run:
+```bash
+go test ./tests/integration/... -run 'TestBatchImportV2' -count=1
+```
+
+Expected: FAIL
+
+**Step 3: Write minimal implementation**
+
+Fill in the host stub, fake adapter, and seed data so that every flow is reproducible.
+
+**Step 4: Run test to verify it passes**
+
+Run:
+```bash
+go test ./tests/integration/... -run 'TestBatchImportV2' -count=1
+```
+
+Expected: PASS
+
+**Step 5: Commit**
+
+```bash
+git add tests/integration/batch_import_v2_test.go tests/integration/host_stub_test.go
+git commit -m "test(integration): cover batch import v2 flows"
+```
+
+### Task 13: Design Restoration Audit
+
+**Files:**
+- Create: `docs/2026-05-22-BATCH_AUTO_IMPORT_V2_RESTORATION_CHECKLIST.md`
+- Modify: `docs/EXECUTION_BOARD.md`
+
+**Step 1: Write the failing audit checklist**
+
+List the design restoration items that must each be checked off:
+- the 8 Objectives
+- canonical contract
+- result API
+- migration
+- worker/retry/lease
+- reuse/reactivation
+
+**Step 2: Run verification to identify gaps**
+
+Run:
+```bash
+go test ./... -count=1
+go test ./tests/integration/... -count=1
+go test -cover ./internal/... -count=1
+go vet ./...
+gofmt -l .   # must print nothing: gofmt -l exits 0 even when it lists unformatted files
+```
+
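+Expected: before the implementation is complete, this step is used to find remaining design gaps; at final completion, everything must be green.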
+
+**Step 3: Write the audit artifact**
+
+Map every design requirement to:
+- code files
+- test files
+- API routes
+- state fields
+
+**Step 4: Update board with true gate**
+
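+State explicitly on the execution board:
+- which tasks are complete
+- which design requirements have been restored
+- whether it can be claimed that "the V2 design has been fully implemented"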
+
+**Step 5: Commit**
+
+```bash
+git add docs/2026-05-22-BATCH_AUTO_IMPORT_V2_RESTORATION_CHECKLIST.md docs/EXECUTION_BOARD.md
+git commit -m "docs(v2): add restoration checklist and completion gate"
+```
+
+## 4. Global Verification Gate
+
+After T1~T13 are complete, all of the following must pass in a single run:
+
+```bash
+gofmt -l .   # must print nothing: gofmt -l exits 0 even when it lists unformatted files
+go vet ./...
+go test ./... -count=1
+go test ./tests/integration/... -count=1
+go test -cover ./internal/... -count=1
+```
+
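+Additional checks:
+
+- `docs/openapi.yaml` matches the handler response fields
+- `import_runs/*` is sufficient to back the result page without joining legacy tables
+- `matched_account_state / account_resolution / provision_reused` can be read directly from item detail
+- `canonical_model_family` classifies aliases of the same model into the same family
+
+## 5. Plan Completeness Conclusion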
+
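+This plan counts as "tasks that can fully restore the planned design" only when all of the following hold:
+
+1. T1~T12 are implemented and all pass verification
+2. The T13 restoration checklist contains no unmapped design item
+3. Every Objective can be traced to at least one:
+   - implementation task
+   - automated test
+   - API or state-field evidence
+4. The result page/API needs no additional unplanned fields to explain the final status
+
+If the T13 audit finds any design requirement that cannot be mapped to a task or test, this plan must be sent back for revision and must not proceed directly to implementation.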
+
+## 6. Recommended Commit Order
+
+Commit in small steps, in the following order:
+
+1. `feat(batch): add canonical v2 state enums`
+2. `feat(probe): add model discovery and canonical family normalization`
+3. `feat(probe): add capability profile and smoke completion routing`
+4. `feat(batch): add provider id and reuse policy`
+5. `feat(store): complete v2 runtime state repositories`
+6. `feat(batch): implement v2 run setup and provision stages`
+7. `feat(batch): add confirmation worker and retry handling`
+8. `feat(batch): add validation service for final access status`
+9. `feat(batch): add result projection for v2 runs and items`
+10. `feat(api): add batch import v2 endpoints`
+11. `feat(cli): add v2 batch import command`
+12. `test(integration): cover batch import v2 flows`
+13. `docs(v2): add restoration checklist and completion gate`
+
+Plan complete and saved to `docs/plans/2026-05-22-batch-auto-import-v2-implementation-plan.md`.