🛠 Platform Repair Plan
This plan prioritises fixes by production impact. Security violations come first, then compliance and business-critical functionality, then the full feature set. Total estimated time to production readiness: 7-9 weeks for a focused team.
Phase 0 - Emergency Security Fixes
Before any other deployment - ~3 days
5 violations must be patched before any production traffic. All are small, targeted code changes.
COPPA sec 45.3: raw audio must not be stored. Migration: drop both columns, add audio_processed_at DATETIME only. Update submission handler to never write audio reference to DB.
Effort: 2h
Replace mysql.createPool({host:'172.17.0.1'...}) with HTTP call to learner-bot GET /api/v1/learner/:id via X-Internal-Key header. Reader must never touch another service's DB directly.
Effort: 3h
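A minimal sketch of the replacement call, assuming the reader is configured with a base URL and a shared key for learner-bot. The names `LearnerProfile`, `FetchLike`, `buildLearnerRequest`, and `fetchLearner` are hypothetical, and the response shape is not specified in this plan. The fetch function is injected so the sketch does not depend on a particular HTTP client.

```typescript
// Hypothetical response shape: only `id` is assumed here.
interface LearnerProfile {
  id: string;
  [key: string]: unknown;
}

// Minimal fetch-like signature so any HTTP client can be plugged in.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Builds the request the reader sends instead of opening a MySQL pool.
function buildLearnerRequest(baseUrl: string, learnerId: string, internalKey: string) {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${root}/api/v1/learner/${encodeURIComponent(learnerId)}`,
    init: { method: "GET", headers: { "X-Internal-Key": internalKey } },
  };
}

async function fetchLearner(
  fetchImpl: FetchLike,
  baseUrl: string,
  learnerId: string,
  internalKey: string,
): Promise<LearnerProfile> {
  const { url, init } = buildLearnerRequest(baseUrl, learnerId, internalKey);
  const res = await fetchImpl(url, init);
  if (!res.ok) {
    throw new Error(`learner-bot responded with ${res.status}`);
  }
  return (await res.json()) as LearnerProfile;
}
```

Keeping the request builder separate from the network call makes the header and path easy to unit-test without a running learner-bot.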
In gdpr-v1.ts: remove db.delete(gdprAuditLog)... Replace with UPDATE SET pii_stripped=true, stripped_at=NOW(). SOC2 audit trail is immutable and must never be deleted.
Effort: 1h
One-line change each file. Existing password hashes remain valid (bcrypt auto-detects cost on verify). New passwords use cost 12 immediately.
Effort: 30min
Add users.email_verified BOOLEAN DEFAULT false. Migration sets true for all existing users. New registrations get false and are blocked from login until verified. Full email flow built in Sprint 1.
Effort: 2h
Sprint 1 - Identity and Auth Foundation
Gate 1 completion - ~1.5 weeks
Completes Gate 1. Nothing else is safe to ship without a working auth foundation.
1A - Schema migrations
Default pending_verification for new registrations. Migration sets active for all existing users. Also add: suspension_reason VARCHAR, email_verified BOOLEAN, failed_attempts INT, locked_until DATETIME.
Effort: 3h
email_verification_tokens: id, user_id FK, token UUID, expires_at, used_at. password_reset_tokens: same shape. audit_log: id, user_id, action VARCHAR, ip, user_agent, metadata JSON, created_at.
Effort: 2h
1B - Email verification flow
Add env var EMAIL_API_KEY. Create lib/email.ts with sendTransactional(to, template, vars). Start with 3 templates: VERIFY_EMAIL, PASSWORD_RESET, WELCOME_TEACHER.
Effort: 4h
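One possible shape for lib/email.ts, as a sketch only. The template copy below is placeholder text, the `{{name}}` placeholder syntax is an assumption, and the fourth `transport` parameter is a hypothetical addition (the plan names only `to`, `template`, and `vars`) so the provider call that would use EMAIL_API_KEY can be swapped out in tests.

```typescript
type TemplateName = "VERIFY_EMAIL" | "PASSWORD_RESET" | "WELCOME_TEACHER";

interface OutgoingEmail {
  to: string;
  subject: string;
  body: string;
}

// Hypothetical transport: the real one would call the email provider.
type EmailTransport = (email: OutgoingEmail) => Promise<void>;

// Placeholder copy; actual subjects and bodies are not defined in this plan.
const EMAIL_TEMPLATES: Record<TemplateName, { subject: string; body: string }> = {
  VERIFY_EMAIL: { subject: "Verify your email", body: "Open {{link}} to verify your account." },
  PASSWORD_RESET: { subject: "Reset your password", body: "Open {{link}} within one hour to reset." },
  WELCOME_TEACHER: { subject: "Welcome", body: "Your trial ends on {{trialEnds}}." },
};

// Fills {{placeholders}} and fails loudly if a variable is missing.
function renderTemplate(template: TemplateName, vars: Record<string, string>) {
  const fill = (text: string): string =>
    text.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string): string => {
      const value = vars[key];
      if (value === undefined) {
        throw new Error(`missing template variable: ${key}`);
      }
      return value;
    });
  const source = EMAIL_TEMPLATES[template];
  return { subject: fill(source.subject), body: fill(source.body) };
}

async function sendTransactional(
  to: string,
  template: TemplateName,
  vars: Record<string, string>,
  transport: EmailTransport,
): Promise<void> {
  const { subject, body } = renderTemplate(template, vars);
  await transport({ to, subject, body });
}
```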
Registration sends VERIFY_EMAIL immediately after insert. Login checks state != pending_verification and blocks with 403 + helpful message if not verified.
Effort: 4h
1C - Password reset
Token TTL 1 hour, one-time use only. Sends PASSWORD_RESET email. Writes to audit_log on successful use. Existing admin brute-force reset endpoint remains for emergencies.
Effort: 3h
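The TTL and one-time-use rules can be sketched as pure functions over a token row. The field names mirror the password_reset_tokens shape from 1A in camelCase; `issueResetToken` and `consumeResetToken` are hypothetical names, and token generation and persistence are left out.

```typescript
interface ResetTokenRow {
  token: string;
  userId: number;
  expiresAt: Date;
  usedAt: Date | null;
}

const RESET_TTL_MS = 60 * 60 * 1000; // 1 hour, per the plan

function issueResetToken(userId: number, token: string, now: Date): ResetTokenRow {
  return { token, userId, expiresAt: new Date(now.getTime() + RESET_TTL_MS), usedAt: null };
}

type ConsumeResult =
  | { ok: true; userId: number }
  | { ok: false; reason: "expired" | "already_used" };

// Marks the token as used on first successful consumption.
// In the database this would be an UPDATE guarded by used_at IS NULL.
function consumeResetToken(row: ResetTokenRow, now: Date): ConsumeResult {
  if (row.usedAt !== null) {
    return { ok: false, reason: "already_used" };
  }
  if (now.getTime() > row.expiresAt.getTime()) {
    return { ok: false, reason: "expired" };
  }
  row.usedAt = now;
  return { ok: true, userId: row.userId };
}
```

A successful result is the point at which the handler would write the audit_log entry.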
1D - Account lockout
Increment failed_attempts on each bad password attempt. Set locked_until = NOW() + 15min after 5 consecutive failures. Reset counter on successful login. Return 429 with Retry-After header.
Effort: 2h
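A sketch of the lockout rule as a pure state transition, assuming the handler loads `failed_attempts` and `locked_until` before checking the password. `applyLoginAttempt` is a hypothetical name, and treating an expired lock as a fresh count is an assumption the plan does not state.

```typescript
interface LockoutState {
  failedAttempts: number;
  lockedUntil: Date | null;
}

const MAX_FAILED_ATTEMPTS = 5;
const LOCK_MS = 15 * 60 * 1000;

type LoginOutcome =
  | { status: 200; state: LockoutState }
  | { status: 401; state: LockoutState }
  | { status: 429; state: LockoutState; retryAfterSeconds: number };

function applyLoginAttempt(state: LockoutState, passwordOk: boolean, now: Date): LoginOutcome {
  const nowMs = now.getTime();
  const lockedUntilMs = state.lockedUntil === null ? 0 : state.lockedUntil.getTime();

  // While locked, reject even correct passwords and report the wait time.
  if (lockedUntilMs > nowMs) {
    return { status: 429, state, retryAfterSeconds: Math.ceil((lockedUntilMs - nowMs) / 1000) };
  }
  if (passwordOk) {
    return { status: 200, state: { failedAttempts: 0, lockedUntil: null } };
  }
  // Assumption: once a lock has expired, counting starts again from zero.
  const previous = state.lockedUntil === null ? state.failedAttempts : 0;
  const failedAttempts = previous + 1;
  if (failedAttempts >= MAX_FAILED_ATTEMPTS) {
    const lockedUntil = new Date(nowMs + LOCK_MS);
    return { status: 429, state: { failedAttempts, lockedUntil }, retryAfterSeconds: LOCK_MS / 1000 };
  }
  return { status: 401, state: { failedAttempts, lockedUntil: null } };
}
```

The `retryAfterSeconds` value maps directly onto the Retry-After response header.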
1E - Admin suspend and reactivate
Each writes to audit_log and sets users.state. Fix GET /api/admin/users to return all users with optional ?role= filter (currently returns only admin-role users, breaking admin UX).
Effort: 3h
1F - Teacher-portal auth federation
SSO path /api/sso/login already exists and calls Account Center. Remove local teachers email+password signup/login mutations (auth.ts). Teachers must register at Account Center first. Migration creates Account Center users records for all existing teachers.
Effort: 1 day
Gate 1 deliverables
- Register then verify email then active account works end-to-end
- Forgot password then reset link then new password works
- Suspended account gets blocked on login
- Teacher portal auth flows through Account Center only
- All auth events written to audit_log
Sprint 2 - Billing and Entitlement
Gate 2 - ~2 weeks
Zero billing = zero revenue. Highest business-impact sprint. Do not defer.
2A - Database schema
subscriptions: user_id, stripe_customer_id, stripe_subscription_id, plan, state ENUM(trialing/active/past_due/cancelled), trial_ends_at, current_period_end, grace_period_end. entitlement_grants: user_id, tier, granted_by, granted_at, expires_at, reason. payments: subscription_id, stripe_payment_intent_id, amount_cents, currency, state, created_at.
Effort: 3h
2B - checkEntitlement() the core function
Logic: (1) check entitlement_grants for manual override, (2) check subscriptions.state IN (trialing, active), (3) default to tier:free. In-process Map cache with 60s TTL. Expose via GET /api/billing/entitlement. Add entitlement_tier field to GET /api/auth/session response. Add POST /api/internal/invalidate-entitlement for webhook-triggered cache busting.
Effort: 1 day
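The three-step lookup and the 60s cache could look roughly like this. It is a sketch under assumptions: the tier granted to a trialing or active subscriber is taken to be the subscription's `plan`, an expired grant is ignored, and `resolveTier` and `EntitlementCache` are hypothetical names. Database reads are omitted, so the rows are passed in directly.

```typescript
interface GrantRow {
  tier: string;
  expiresAt: Date | null; // null means no expiry
}

interface SubscriptionSnapshot {
  plan: string;
  state: "trialing" | "active" | "past_due" | "cancelled";
}

const ENTITLEMENT_TTL_MS = 60 * 1000;

// (1) manual grant, (2) trialing/active subscription, (3) free.
function resolveTier(grant: GrantRow | null, sub: SubscriptionSnapshot | null, now: Date): string {
  if (grant !== null && (grant.expiresAt === null || grant.expiresAt.getTime() > now.getTime())) {
    return grant.tier;
  }
  if (sub !== null && (sub.state === "trialing" || sub.state === "active")) {
    return sub.plan;
  }
  return "free";
}

// In-process cache keyed by user id; entries expire after 60 seconds.
class EntitlementCache {
  private entries = new Map<number, { tier: string; expiresAtMs: number }>();

  get(userId: number, nowMs: number): string | undefined {
    const hit = this.entries.get(userId);
    if (hit === undefined) return undefined;
    if (hit.expiresAtMs <= nowMs) {
      this.entries.delete(userId);
      return undefined;
    }
    return hit.tier;
  }

  set(userId: number, tier: string, nowMs: number): void {
    this.entries.set(userId, { tier, expiresAtMs: nowMs + ENTITLEMENT_TTL_MS });
  }

  // Called from the invalidate-entitlement endpoint.
  invalidate(userId: number): void {
    this.entries.delete(userId);
  }
}
```

Because the cache is in-process, each service instance holds its own copy, which is why webhook-triggered invalidation matters when more than one instance is running.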
2C - Stripe integration
POST /api/billing/checkout creates a Stripe Checkout Session and returns its URL. POST /api/billing/webhook handles: invoice.paid (set active), invoice.payment_failed (set past_due), customer.subscription.deleted (set cancelled), customer.subscription.trial_will_end (send warning email). Each state change calls invalidate-entitlement. POST /api/billing/portal returns a Stripe Customer Portal session URL.
Effort: 2 days
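The webhook's event-to-action mapping can be sketched as a single switch. Stripe's full event type for the trial warning is `customer.subscription.trial_will_end`. The `WebhookAction` type and `actionForStripeEvent` name are hypothetical, and using SUBSCRIPTION_EXPIRING as the warning template is an assumption, since the plan does not name one. Signature verification of the webhook payload is out of scope here.

```typescript
type SubscriptionState = "trialing" | "active" | "past_due" | "cancelled";

type WebhookAction =
  | { kind: "set_state"; state: SubscriptionState }
  | { kind: "send_email"; template: string }
  | { kind: "ignore" };

// Maps a Stripe event type to what the handler should do.
function actionForStripeEvent(eventType: string): WebhookAction {
  switch (eventType) {
    case "invoice.paid":
      return { kind: "set_state", state: "active" };
    case "invoice.payment_failed":
      return { kind: "set_state", state: "past_due" };
    case "customer.subscription.deleted":
      return { kind: "set_state", state: "cancelled" };
    case "customer.subscription.trial_will_end":
      // Assumed template; the plan only says "send warning email".
      return { kind: "send_email", template: "SUBSCRIPTION_EXPIRING" };
    default:
      // Unhandled events are acknowledged without side effects.
      return { kind: "ignore" };
  }
}
```

Only the `set_state` actions need to be followed by a call to invalidate-entitlement.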
2D - Trial auto-provision
On successful register: insert subscriptions row with state=trialing, trial_ends_at=NOW()+30 days. Send WELCOME_TEACHER email including trial expiry date and upgrade link.
Effort: 2h
2E - Wire entitlement checks to gated handlers
Priority order: reader (book access), teacher-portal (class creation and assignments), learner-bot (nightly run). Return HTTP 402 with body {error: 'entitlement_required', tier: 'free'} when not entitled.
Effort: 1 day
2F - Subscription lifecycle cron
Find active subscriptions where current_period_end is in the past: set past_due. Find past_due subscriptions where grace_period_end is in the past: set cancelled. Each transition invalidates entitlement cache and triggers the corresponding email notification.
Effort: 3h
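The two cron rules can be expressed as a function that plans transitions without applying them, which keeps the rule testable apart from the database. This is a sketch: `SubscriptionRow`, `LifecycleTransition`, and `planLifecycleTransitions` are hypothetical names, and a row moves at most one step per run.

```typescript
type SubState = "trialing" | "active" | "past_due" | "cancelled";

interface SubscriptionRow {
  id: number;
  state: SubState;
  currentPeriodEnd: Date;
  gracePeriodEnd: Date | null;
}

interface LifecycleTransition {
  id: number;
  from: SubState;
  to: SubState;
}

// active + period ended   -> past_due
// past_due + grace ended  -> cancelled
function planLifecycleTransitions(rows: SubscriptionRow[], now: Date): LifecycleTransition[] {
  const nowMs = now.getTime();
  const transitions: LifecycleTransition[] = [];
  for (const row of rows) {
    if (row.state === "active" && row.currentPeriodEnd.getTime() < nowMs) {
      transitions.push({ id: row.id, from: "active", to: "past_due" });
    } else if (
      row.state === "past_due" &&
      row.gracePeriodEnd !== null &&
      row.gracePeriodEnd.getTime() < nowMs
    ) {
      transitions.push({ id: row.id, from: "past_due", to: "cancelled" });
    }
  }
  return transitions;
}
```

Each returned transition is where the cron job would update the row, invalidate the entitlement cache, and queue the matching email.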
2G - Admin entitlement grants
Allows manual tier override for pilot schools, charities, or support escalations. Writes to audit_log on every grant and revoke.
Effort: 2h
Sprint 3 - Roster, Compliance and Notifications
Gates 3 + 6 - ~2 weeks
Covers GDPR compliance, full roster feature set, email notifications, and parent linking. Required before school onboarding.
3A - GDPR delete rewrite
Add gdpr_requests table with state machine (pending, processing, completed). Deletion is async 30-day purge: PII fields nulled, skeleton row retained for audit continuity. Audit log rows are NEVER deleted - only PII fields are nulled. Add GET /api/gdpr/v1/status/:request_id for progress checks.
Effort: 1 day
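A minimal sketch of the request state machine and of stripping PII from an audit row instead of deleting it. The audit columns follow the audit_log shape from 1A; treating `ip` and `user_agent` as the PII fields is an assumption, and `advanceGdprState` and `stripAuditPii` are hypothetical names.

```typescript
type GdprRequestState = "pending" | "processing" | "completed";

// Allowed forward transitions; completed is terminal.
const NEXT_GDPR_STATE: Record<GdprRequestState, GdprRequestState | null> = {
  pending: "processing",
  processing: "completed",
  completed: null,
};

function advanceGdprState(current: GdprRequestState): GdprRequestState {
  const next = NEXT_GDPR_STATE[current];
  if (next === null) {
    throw new Error(`request is already ${current}`);
  }
  return next;
}

interface AuditLogRow {
  id: number;
  user_id: number | null;
  action: string;
  ip: string | null;
  user_agent: string | null;
  pii_stripped: boolean;
  stripped_at: Date | null;
}

// Returns a copy with PII nulled; the row itself is never removed.
function stripAuditPii(row: AuditLogRow, now: Date): AuditLogRow {
  return { ...row, ip: null, user_agent: null, pii_stripped: true, stripped_at: now };
}
```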
3B - Student soft-delete
Add students.state ENUM(active|archived|deleted) and students.deleted_at DATETIME. removeStudent sets state=deleted, deleted_at=NOW() instead of running DELETE query. All list queries must filter WHERE state != 'deleted'. Fix book_assignments cascade to also soft-delete.
Effort: 3h
3C - Roster missing fields
students: add uuid VARCHAR(36), school_id INT FK, language VARCHAR, placement_test_completed BOOLEAN, failed_attempts INT, locked BOOLEAN, miles DECIMAL, tokens INT, parent_count INT. classes: add state ENUM(active|archived), archived_at DATETIME, curriculum_territory VARCHAR, school_id INT FK. Enforce 33-student hard cap on addStudent and importRoster endpoints.
Effort: 4h
3D - PIN reveal tokens and login card PDFs
pin_reveal_tokens: token UUID PK, student_id FK, expires_at DATETIME (10min TTL), used_at DATETIME. Add GET /api/v1/pin/:token endpoint (one-time reveal, marks used_at on first access). Add POST /api/v1/classes/:id/login-cards: Puppeteer generates A4 PDF with one login card per student showing username and QR code linking to the PIN reveal URL.
Effort: 1 day
3E - Parent linking flow
Add parent_claims table: parent_id, student_id, state ENUM(pending/approved/rejected), invite_token UUID, expires_at. Add students_parents join table: student_id, parent_id, approved_at. New endpoints: POST /api/v1/parent/find-child, POST /api/v1/parent/claim-child, POST /api/v1/parent-claims/:id/approve, POST /api/v1/parent-claims/:id/reject. On approve: send PARENT_CLAIM_APPROVED email to parent, increment students.parent_count, enforce max-2-parents check before approving.
Effort: 2 days
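The approval step, including the max-2-parents check, might be sketched as below. The types and the `approveClaim` name are hypothetical, the error codes are illustrative, and the function returns the updated records rather than writing to the database.

```typescript
type ClaimState = "pending" | "approved" | "rejected";

interface ParentClaim {
  id: number;
  parentId: number;
  studentId: number;
  state: ClaimState;
}

interface StudentParentCount {
  studentId: number;
  parentCount: number;
}

const MAX_PARENTS = 2;

type ApproveResult =
  | {
      ok: true;
      claim: ParentClaim;
      student: StudentParentCount;
      email: { template: "PARENT_CLAIM_APPROVED"; parentId: number };
    }
  | { ok: false; error: "claim_not_pending" | "max_parents_reached" };

function approveClaim(claim: ParentClaim, student: StudentParentCount): ApproveResult {
  if (claim.state !== "pending") {
    return { ok: false, error: "claim_not_pending" };
  }
  // Enforce the cap before any state changes.
  if (student.parentCount >= MAX_PARENTS) {
    return { ok: false, error: "max_parents_reached" };
  }
  return {
    ok: true,
    claim: { ...claim, state: "approved" },
    student: { ...student, parentCount: student.parentCount + 1 },
    email: { template: "PARENT_CLAIM_APPROVED", parentId: claim.parentId },
  };
}
```

In a real handler the cap check and the increment would need to run in one transaction so that two simultaneous approvals cannot both pass the check.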
3F - Tenant isolation
Every WHERE clause on students, classes, and teachers tables must include AND school_id = req.user.school_id. Add memberships table to track teacher-to-school assignments. Propagate school_id to all newly created records. Add schools table to user-center (move from teacher-portal).
Effort: 1 day
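One way to make the school_id filter hard to forget is to route every read through a helper that always appends it. This is a sketch with hypothetical names (`ScopedQuery`, `selectForSchool`) that produces parameterised SQL text only; it assumes `?` placeholders as used by MySQL drivers.

```typescript
interface ScopedQuery {
  sql: string;
  params: unknown[];
}

// Tables that must never be read without a tenant filter.
const TENANT_TABLES = ["students", "classes", "teachers"] as const;
type TenantTable = (typeof TENANT_TABLES)[number];

function selectForSchool(
  table: TenantTable,
  schoolId: number,
  where?: { clause: string; params: unknown[] },
): ScopedQuery {
  const clauses: string[] = [];
  const params: unknown[] = [];
  if (where !== undefined) {
    // Parenthesise the caller's clause so an OR cannot escape the tenant filter.
    clauses.push(`(${where.clause})`);
    params.push(...where.params);
  }
  clauses.push("school_id = ?");
  params.push(schoolId);
  return { sql: `SELECT * FROM ${table} WHERE ${clauses.join(" AND ")}`, params };
}
```

For example, `selectForSchool("students", req.user.school_id, { clause: "state != ?", params: ["deleted"] })` combines the soft-delete filter from 3B with the tenant filter.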
3G - Full email notification system (21 types)
Sprint 1 wires 3 types. The remaining 18 include: INVITE_EMAIL, PASSWORD_CHANGED, ACCOUNT_LOCKED_ALERT, RENEWAL_FAILED, SUBSCRIPTION_EXPIRING, SUBSCRIPTION_EXPIRED, SCHOOL_SUBSCRIPTION_EXPIRED, PAYMENT_RECEIPT, PARENT_CLAIM_APPROVED, PARENT_CLAIM_REJECTED, DATA_EXPORT_READY, ACCOUNT_SUSPENDED, WELCOME_PARENT, WEEKLY_PARENT_DIGEST, CHILD_LOCKED_PIN (in-app only). Add email_log table. Add weekly digest cron to learner-bot (Sunday 08:00 UTC).
Effort: 2 days
3H - addStudent parity with importRoster
importRoster already generates username + bcrypt PIN + creates Account Center learner record + creates learner-bot profile + writes pin_reveal_token. addStudent does none of this. Both paths must produce the same result.
Effort: 3h
Sprint 4 - Admin, Reading Loop and Polish
Gates 4 + 5 + 7 - ~2 weeks
Completes admin layer, closes remaining reading loop gaps, and wires the intelligence layer for full Learner Bot operation.
4A - Admin panel (Gate 7)
Add impersonation_sessions table: admin_id, target_user_id, reason TEXT, started_at, ended_at. POST /api/admin/impersonate/:user_id requires valid TOTP code from admin's authenticator app. POST /api/admin/impersonate/end closes the session. All actions taken during impersonation write to audit_log with impersonated_by field populated.
Effort: 2 days
Cursor-based pagination. Filter params: user_id, action, date_from, date_to. Required for SOC2 Type II access reviews and incident investigations.
Effort: 4h
GET /api/admin/schools (list all schools with pagination). GET /api/admin/schools/:id. PATCH /api/admin/schools/:id. Move schools table from teacher-portal to user-center. Add admin_user_id FK, state ENUM, auto_approve_parent_claims BOOLEAN.
Effort: 4h
4B - Reading loop gaps (Gate 4)
Phase 0 removed the direct DB connection. This sprint completes the replacement: child login issues uc_session cookie via Account Center. Remove reader's own independent users table. All session validation goes through GET /api/auth/session on Account Center.
Effort: 1 day
Add students.placement_test_completed BOOLEAN to teacher-portal schema. Reader calls POST /api/v1/bot/:learner_id/placement-complete after test finishes. Learner-bot updates teacher-portal via internal API call. Teacher dashboard shows the placement_test_completed flag per student.
Effort: 4h
user-center learnerProfiles table already stores milesTotal and tokensTotal. Teacher-portal fetches these via API call (never direct DB access) and displays in the student list.
Effort: 3h
4C - Telemetry loop gaps (Gate 5)
Currently only POST /api/session/end triggers the bot. When event_type=session_ended arrives via the batch events API, also fire to learner-bot. This closes the gap where half of telemetry does not trigger the intelligence loop.
Effort: 2h
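The batch-path fix amounts to picking out `session_ended` events and forwarding each to learner-bot. In this sketch the `learner_id` and `session_id` fields are assumed names for the event payload, `sessionsToTrigger` is hypothetical, and de-duplicating by session is an added assumption to avoid double-firing when the same event is retried in a batch.

```typescript
interface TelemetryEvent {
  event_type: string;
  learner_id: string;
  session_id: string;
}

interface BotTrigger {
  learner_id: string;
  session_id: string;
}

// Returns one trigger per ended session found in the batch.
function sessionsToTrigger(batch: TelemetryEvent[]): BotTrigger[] {
  const seen = new Set<string>();
  const triggers: BotTrigger[] = [];
  for (const evt of batch) {
    if (evt.event_type !== "session_ended") continue;
    if (seen.has(evt.session_id)) continue; // same session reported twice
    seen.add(evt.session_id);
    triggers.push({ learner_id: evt.learner_id, session_id: evt.session_id });
  }
  return triggers;
}
```

The batch handler would then call the same learner-bot notification that POST /api/session/end already uses, once per returned trigger.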
Rename existing botRunLog to cron_runs. Add job_name VARCHAR, state ENUM(running/ok/failed), error_log JSON columns to match Wiki spec. Add device_sessions table and telemetry_errors table. Add session force-abandon cron to run every 6 hours.
Effort: 4h
Before processing each learner in the nightly loop, call GET /api/billing/entitlement. Skip learners on cancelled subscriptions (do not generate reports for non-paying accounts). Log skipped count per run in cron_runs.
Effort: 3h
Add cron at Sunday 08:00 UTC to generate per-child digest and dispatch WEEKLY_PARENT_DIGEST via notification service. Add retry cron at 05:00 UTC that re-runs the nightly bot for any learner whose run failed at 04:00 UTC.
Effort: 4h
4D - API consistency
All REST error responses must return {error: 'snake_case_code', message: 'human readable', details: {}}. All list endpoints returning more than 20 rows must support cursor pagination with nextCursor and hasMore fields. All mutating endpoints must handle Idempotency-Key header to prevent duplicate submissions.
Effort: 1 day
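The error envelope and cursor contract can be sketched with two small helpers. This assumes rows carry a monotonically increasing numeric `id` that doubles as the cursor, which the plan does not specify; `apiError`, `CursorPage`, and `paginate` are hypothetical names, and the in-memory filter stands in for a `WHERE id > ? ORDER BY id LIMIT ?` query.

```typescript
interface ApiError {
  error: string; // snake_case code
  message: string; // human readable
  details: Record<string, unknown>;
}

function apiError(code: string, message: string, details: Record<string, unknown> = {}): ApiError {
  return { error: code, message, details };
}

interface CursorPage<T> {
  items: T[];
  nextCursor: string | null;
  hasMore: boolean;
}

// Fetches limit + 1 rows to learn whether another page exists.
function paginate<T extends { id: number }>(
  rows: T[],
  cursor: string | null,
  limit: number,
): CursorPage<T> {
  const afterId = cursor === null ? 0 : Number(cursor);
  const fetched = rows
    .filter((row) => row.id > afterId)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit + 1);
  const hasMore = fetched.length > limit;
  const items = hasMore ? fetched.slice(0, limit) : fetched;
  const last = items[items.length - 1];
  return {
    items,
    nextCursor: hasMore && last !== undefined ? String(last.id) : null,
    hasMore,
  };
}
```

A client keeps requesting with the returned `nextCursor` until `hasMore` is false.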
4E - CI/CD and staging environment
Each service gets a staging Docker Compose target with separate .env.staging. GitHub Actions pipeline runs: lint then test then build then deploy to staging on PR merge to main. Manual approval step required before promoting staging build to production.
Effort: 2 days
Sprint Summary and Effort Estimates
| Phase | Focus | Key Deliverables | Effort | Gates |
|---|---|---|---|---|
| Phase 0 | Security Fixes | 5 violations patched - COPPA + SOC2 compliant | ~3 days | Pre-requisite |
| Sprint 1 | Identity and Auth | Email verify, password reset, account states, teacher SSO, audit log | ~1.5 wks | Gate 1 done |
| Sprint 2 | Billing | Stripe, checkEntitlement, subscriptions, lifecycle cron | ~2 wks | Gate 2 done |
| Sprint 3 | Roster + Compliance + Notifications | GDPR fix, soft-delete, parent linking, PIN cards, 21 email types, tenant isolation | ~2 wks | Gates 3+6 done |
| Sprint 4 | Admin + Reading Loop + Polish | Impersonation + 2FA, reading loop, telemetry wiring, API consistency, CI/CD | ~2 wks | Gates 4+5+7 done |
| Total | | | ~7-9 weeks | All 7 gates complete |
Acceptance Criteria After Repair
| AC | Name | Fixed In | Expected Result |
|---|---|---|---|
| AC-01 | Teacher Signup | Sprint 1 | PASS |
| AC-02 | Class Creation | Sprint 3 | PASS |
| AC-03 | Child Login | Sprint 4 | PASS |
| AC-04 | Reading Session | Sprint 4 | PASS |
| AC-05 | Telemetry Capture | Sprint 4 | PASS |
| AC-06 | Nightly Report | Sprint 4 | PASS |
| AC-07 | Entitlement Enforcement | Sprint 2 | PASS |
| AC-08 | Billing Failure Flow | Sprint 2 | PASS |
| AC-09 | Admin Impersonate | Sprint 4 | PASS |
| AC-10 | Admin Grant Entitlement | Sprint 2 | PASS |
| AC-11 | GDPR Deletion | Phase 0 + Sprint 3 | PASS |
| AC-12 | Offline Reading Replay | Sprint 4 | PASS |
| AC-13 | First Child Login + Placement | Sprint 4 | PASS |
All 13 acceptance criteria are expected to pass after Sprint 4 completion.