# Imports

**Status:** in production
**Lives at:** `/import` + `/import/resolve/[id]` ·
`app/(app)/import/page.tsx`, `app/(app)/import/actions.ts`,
`app/(app)/import/resolve/page.tsx`,
`app/(app)/import/resolve/actions.ts` · `lib/clockify.ts` ·
`scripts/backfill_time_tracking.py` · uses the `pending_imports`
table.

## Summary
A two-step CSV import flow. Step 1: upload + parse + stash into
`pending_imports`. Step 2: a resolver UI lets the user decide what
to do with each operator / client / project name that isn't already
an entity (create new vs map to existing). Step 3: confirm → writes
to `clockify_imports` + `time_entries` with resolved entity IDs.

Sources supported: **Clockify** (Adi's monthly export) and **Toggl**
(Rian's history) — picked via a radio on the upload form, which
sets the expected column set and date/time defaults.

## Why
2026-05-20 — "Adi exports a Clockify list of time logs … the converted
entries that are not marked as 'transferred' go into 'to transfer'."
2026-05-21 — "For imports, I'd like to keep track of different import
batches. I'd like to be able to delete an import batch unless an item in
that import has been transferred already."

## Behavior

### Step 1 — Upload (`/import`)
- `/import` shows a CSV upload form and a list of all existing batches.
- The form requires:
  - **Source** radio: Clockify or Toggl (changes expected column set
    + date/time defaults). See decisions.md #031.
  - **Batch name** (e.g. "April 2026").
  - **CSV file** (≤ 25 MB).
  - **Date format** (MM/DD/YYYY / DD/MM/YYYY / YYYY-MM-DD).
  - **Time format** (12h / 24h).
- On submit:
  1. The CSV is parsed server-side via papaparse + the helpers in
     `lib/clockify.ts`. Each row is normalized into a parsed-entry
     shape (source fields, durations in seconds, naive start/end
     timestamps).
  2. The parsed entries are stashed in `pending_imports(id, org_id,
     uploaded_by, filename, source, parsed_rows JSONB, expires_at)`
     with `expires_at = now() + 24h`.
  3. The user is redirected to `/import/resolve/[pending_id]`.

### Step 2 — Resolve (`/import/resolve/[id]`)
- The resolver introspects the parsed rows and lists every distinct
  `(operator, client, project)` triple it found. For each entity name
  the UI shows:
  - "Already exists in this org" — the row resolves to the existing
    entity by default.
  - "Not found" — the user picks **Create new** or **Map to existing
    <foo>**. Required before confirming.
- The resolver also lists every distinct source_user_email and
  flags any that don't match a `team_members` row in the importer's
  team. The user is warned but can still proceed (those rows import
  with `team_member_id = NULL` and become "Blocked" on /entries
  until the missing team member is added + the entry re-resolved).
- **Confirm** writes:
  - One `clockify_imports` row (`name`, `filename`, `row_count`,
    `imported_by = effective_user_id`, `source`, `notes`).
  - Resolved entity rows (operators / clients / projects) for any
    "Create new" choices, chained correctly.
  - One `time_entries` row per parsed row, with:
    - `team_member_id` set if the source email matches; otherwise NULL.
    - `converted_user` = team_member's `consolidate_as` or
      `display_name` (NULL if no match).
    - `converted_duration_seconds` = `duration × rate_proportion`
      (NULL if no match). Kept on the row for the Toggl-export
      legacy consumer; not used by cost / billout. See
      decisions.md #034.
    - `billout_cost_usd` = `duration_seconds × cost_rate_usd`
      (NULL if either is missing).
    - `billout_amount_usd` = `duration_seconds ×
      effective_billout_rate(team_member, project)` accounting for
      any project rate overrides.
    - `project_id` set to the resolved entity.
    - `invoice_id` left NULL (entries aren't applied to an invoice
      at import time).
  - The `pending_imports` row is deleted on success.
- After insert, redirects to `/import?import_id=<id>` with a
  summary card: total imported, matched, unmatched, list of unknown
  emails.

### Batch management (`/import`)
- Each batch row has: name, filename, imported_at, total entries,
  applied-to-invoice count, notes. Clicking the name jumps to
  `/entries?batch=<id>` filtered to that batch.
- Batches with zero entries attached to an invoice can be deleted
  (cascades to their entries). Batches with any attached entry show
  "Locked".
- The "Initial 2026 migration" batch was created by migration
  `20260521120003_batch_and_admin.sql`'s backfill block — every
  entry that came in via `scripts/backfill_time_tracking.py` (before
  the batch concept existed) belongs to that batch.

## Constraints & edge cases
- File size capped at 25 MB. The Next.js Server Actions body-size limit
  is bumped to 25 MB in `next.config.mjs`.
- Date and time format must be set correctly on upload. Auto-detect is
  not implemented. Clockify's defaults are MM/DD/YYYY + 12h; Toggl's
  defaults are YYYY-MM-DD + 24h. Verify on the first row.
- `pending_imports` rows expire after 24h. Cleanup is currently
  manual / app-level — a Supabase cron job is a follow-up. Stale
  rows are harmless but accumulate.
- A pending upload doesn't reserve names or hold a transaction —
  two uploads of the same CSV resolve independently. No
  deduplication at parse time (see Open considerations).
- Unknown emails do NOT block import. Rows insert with
  `team_member_id = NULL` and become "Blocked" on /entries until
  the missing team member is added and the entry is re-resolved
  (per-row 🔄 button).
- Insert is chunked at 500 rows per request. If a chunk fails the
  partial import is rolled back (entries and the batch row both
  deleted; the `pending_imports` row is kept so the user can retry
  via the resolver).
- Required Clockify columns: Project, Client, Description, Email,
  Billable, Start Date, Start Time, End Date, End Time, Duration (h).
  Toggl columns vary — see `lib/clockify.ts`'s header-alias map.
  Missing any required column fails parse, before any DB write.
- Dates and times are stored naive (no TZ conversion). See
  decisions.md #003.
- The resolver respects view-as: entities + team-member matches are
  looked up under the effective user's org / team. Confirming in
  view-as attributes the batch to the effective user, not the real
  user. See decisions.md #033.

## Permissions
- Managers and Owners (and super admins not in view-as) can upload and
  delete batches.

## Open considerations
- No way to **re-process** an old batch with newly-added team members —
  you'd delete the batch (if untransferred) and re-import, or use
  per-entry Re-resolve.
- No deduplication. Importing the same CSV twice creates two batches
  with duplicate entries.
- No Clockify API integration. Eventually pull directly instead of
  manual CSV.
- Date auto-detect would be nice — try both M/D and D/M and pick the
  one that doesn't produce out-of-range months.
- File is parsed entirely in memory. Reasonable up to ~10 MB; beyond
  that we'd want streaming.

## Tests
- ✅ CLK-001 … CLK-020 — date/time/duration parsing, CSV parsing,
  conversion math.
- 🟡 IMP-001 Uploading a Clockify CSV creates a batch + N entries.
- 🟡 IMP-002 Importing rows with unknown emails leaves
  `team_member_id = NULL` on those rows.
- 🟡 IMP-003 Cannot delete an import batch if any entry in it is
  transferred (server-side check).

## Changelog
- **2026-05-20** — Initial CSV import flow.
- **2026-05-21** — Batches became first-class: added `clockify_imports.name`
  column, batch delete with transferred-guard, "Initial 2026 migration"
  backfill applied to existing entries.
- **2026-05-23** — Toggl CSV is now a first-class source. `lib/clockify.ts`
  accepts Toggl header variants (`User`/`Member`, `End Date`/`Stop Date`,
  `End Time`/`Stop Time`); `billable` defaults to true when the column is
  absent. Each `time_entries` row carries a `source` column
  (`clockify` | `toggl`). Imports are scoped per uploader via RLS — see
  decisions.md #027. A one-shot `scripts/import_toggl_csv.py` brought in
  Rian's 2026 Toggl history attributed to him.
- **2026-05-23** — Upload form now has a **Source: Clockify / Toggl**
  radio. Picking Toggl flips the date/time defaults to YYYY-MM-DD +
  24h (was MM/DD/YYYY + 12h for Clockify), and the picked source is
  written into `clockify_imports.source`. Header note updates to list
  the expected column set per source. See decisions.md #031.
- **2026-05-23** — **Entity-chain resolution on import.** The import
  resolver looks up (or creates) the matching `operators` /
  `clients` / `projects` rows and stamps `time_entries.project_id`
  on every imported row. The text columns (`operator` / `client` /
  `project`) are kept in sync as a denormalized cache. See
  decisions.md #035.
- **2026-05-24** — **Two-step import flow.** Upload now parses into
  `pending_imports` (24h TTL) and redirects to `/import/resolve/[id]`
  where the user explicitly maps each unknown operator / client /
  project to "Create new" or an existing entity. The flow replaced
  the previous implicit "create on the fly" behavior so accidental
  typos in CSV names no longer silently create stub entities. See
  decisions.md #041.
- **2026-05-25** — Cost + billout stamps on imported rows now come
  from per-source-hour `team_members.cost_rate_usd` and
  `billout_rate_usd` (post `effective_billout_rate` precedence). The
  legacy `team.base_rate × proportion` formula is gone.
  `converted_duration_seconds` is still computed (= source ×
  proportion) on the row for the Toggl-export legacy consumer. See
  decisions.md #034.
- **2026-05-27** — **Resolver project candidates reactive to client
  mapping.** Previously the project pickers filtered candidates by
  the row's server-render-time `clientEntityId`, which was always
  `null` for any new-client row. So if you mapped a CSV's
  "Second Wind" client to the existing "Second Wind Consultants",
  the project pickers UNDER that client still said "(no candidates
  at this scope)" because they kept looking at the unresolved
  client. New `effectiveClientId` lookup (useMemo over clDec)
  merges in the user's in-flight mapping decisions so each
  project picker shows the candidates under the EFFECTIVE
  client. The "(new client — will be created)" hint also flips
  off once a mapping is chosen. Server-side commit already
  handled the resolved chain correctly; bug was purely UI.
- **2026-05-27** — **Resolve-page permission gate fixed.** The
  `canSubmit` check on `/import/resolve/[id]` was comparing
  `pending.uploaded_by` to `eu.real_user_id` and OR'ing with
  `is_super_admin` / `org_role === 'owner'` — which excluded
  managers (Adi) entirely and broke view-as. Upload stamps
  `uploaded_by = effective_user_id` (per ADR #033), so the
  comparison was against the wrong field too. New check:
  `canManageImports(eu) && (pending.uploaded_by ===
  eu.effective_user_id || canTransfer(eu))`. Managers can now
  resolve their own pending imports; owners (and super admins
  not in view-as) can resolve anyone's. Side effect: the
  "Create new / Map to existing" picker forms no longer
  silently hide when the user is a manager.
- **2026-05-27** — **Auto-detect date + time formats.** The parser
  no longer relies on the upload form's format selectors as the
  primary signal:
  - **Time format**: `parseTimeOnly` checks for an AM/PM suffix on
    each value. If present → 12h interpretation; otherwise → 24h.
    The Time format selector was removed from the upload form. A
    plain `14:22:07` now parses correctly even when the form
    historically defaulted to 12h.
  - **Date format**: new `detectDateFormat(dates)` helper scans all
    rows' Start Date column up front. ISO `YYYY-MM-DD` is
    identified by the 4-digit-year prefix; numeric formats are
    disambiguated by `>12` in either slot. The Date format
    selector is kept as a tiebreaker for CSVs where every date is
    fully ambiguous (every slot ≤ 12) but otherwise ignored.
  - Verified end-to-end against Adi's Clockify CSV
    (DD/MM/YYYY + 24h) parsed against the form's wrong defaults
    (MM/DD/YYYY + 12h): 37/37 entries succeeded under auto-detect.