netbird-gitops/docs/plans/2026-03-06-reconciler-poc-implementation.md
2026-03-06 13:21:08 +02:00

Reconciler PoC Validation — Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Validate the reconciler end-to-end on a fresh, isolated NetBird instance (VPS-A) with state export and a GITEA_ENABLED feature flag.

Architecture: Deploy NetBird + Caddy + Gitea + Reconciler on VPS-A via Ansible. Add GITEA_ENABLED flag to make Gitea optional. Add state export tool (GET /export + --export CLI). Test reconcile, enrollment detection, and state round-trip.

Tech Stack: Deno 2.x / TypeScript, Zod, Docker Compose, Ansible, Caddy, NetBird self-hosted, Gitea

Design doc: docs/plans/2026-03-06-reconciler-poc-validation.md


Task 1: GITEA_ENABLED Feature Flag — Config

Make Gitea env vars conditionally required based on a new GITEA_ENABLED env var.

Files:

  • Modify: src/config.ts
  • Test: src/config.test.ts (create)

Step 1: Write config tests

Create src/config.test.ts:

import { assertEquals, assertThrows } from "@std/assert";

// We can't easily test loadConfig() directly because it reads Deno.env.
// Instead, test the schema validation logic by extracting it.
// For now, integration-level tests via the main module are sufficient.
// The key behavioral test is: does the service start without GITEA_* vars
// when GITEA_ENABLED=false?

// This is tested implicitly via integration.test.ts updates in Task 3.

On reflection, loadConfig() reads directly from Deno.env, which makes unit tests awkward, so the feature flag is better tested at the integration level. Skip creating the unit test file and cover the flag via the integration tests in Task 3.
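If unit-level coverage is wanted later, one option is to pass the environment reader in as a parameter instead of calling Deno.env.get inside the function. The sketch below is dependency-free and purely illustrative: EnvReader, MiniConfig, requireVar, loadMiniConfig, and fakeEnv are hypothetical names, not part of the reconciler, and the real loadConfig() would keep its Zod schemas.

```typescript
// Hypothetical, simplified stand-in for loadConfig(); all names are illustrative.
type EnvReader = (name: string) => string | undefined;

interface MiniConfig {
  netbirdApiUrl: string;
  giteaEnabled: boolean;
  giteaUrl?: string;
}

// Reads one variable and fails loudly if it is unset or empty.
function requireVar(getEnv: EnvReader, name: string): string {
  const value = getEnv(name);
  if (value === undefined || value === "") {
    throw new Error(`missing required env var: ${name}`);
  }
  return value;
}

function loadMiniConfig(getEnv: EnvReader): MiniConfig {
  const netbirdApiUrl = requireVar(getEnv, "NETBIRD_API_URL");
  // Unset means enabled (backward compatible); only the string "false" disables.
  const giteaEnabled =
    (getEnv("GITEA_ENABLED") ?? "true").toLowerCase() !== "false";
  if (!giteaEnabled) return { netbirdApiUrl, giteaEnabled };
  // Gitea vars are only required when the flag is on.
  return {
    netbirdApiUrl,
    giteaEnabled,
    giteaUrl: requireVar(getEnv, "GITEA_URL"),
  };
}

// A test can supply a plain object instead of touching the process env.
const fakeEnv = (vars: Record<string, string>): EnvReader => (name) =>
  vars[name];
```

In production the caller would pass something like (name) => Deno.env.get(name), so behavior stays the same while tests get a seam.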

Step 2: Modify src/config.ts

Current code has a single ConfigSchema that requires all Gitea fields. Split into base + conditional:

import { z } from "zod";

const BaseConfigSchema = z.object({
  netbirdApiUrl: z.string().url(),
  netbirdApiToken: z.string().min(1),
  reconcilerToken: z.string().min(1),
  // Parse the flag from its string form. z.coerce.boolean() applies
  // Boolean(value), so the non-empty string "false" would become true.
  giteaEnabled: z
    .string()
    .default("true")
    .transform((v) => v.trim().toLowerCase() !== "false"),
  pollIntervalSeconds: z.coerce.number().int().positive().default(30),
  port: z.coerce.number().int().positive().default(8080),
  dataDir: z.string().default("/data"),
});

const GiteaConfigSchema = z.object({
  giteaUrl: z.string().url(),
  giteaToken: z.string().min(1),
  giteaRepo: z.string().regex(/^[^/]+\/[^/]+$/),
});

// When giteaEnabled=true, Gitea fields are present.
// When false, they are undefined.
export type Config = z.infer<typeof BaseConfigSchema> & {
  giteaUrl?: string;
  giteaToken?: string;
  giteaRepo?: string;
};

export function loadConfig(): Config {
  const raw = {
    netbirdApiUrl: Deno.env.get("NETBIRD_API_URL"),
    netbirdApiToken: Deno.env.get("NETBIRD_API_TOKEN"),
    reconcilerToken: Deno.env.get("RECONCILER_TOKEN"),
    giteaEnabled: Deno.env.get("GITEA_ENABLED"),
    pollIntervalSeconds: Deno.env.get("POLL_INTERVAL_SECONDS"),
    port: Deno.env.get("PORT"),
    dataDir: Deno.env.get("DATA_DIR"),
  };

  const base = BaseConfigSchema.parse(raw);

  if (base.giteaEnabled) {
    const giteaRaw = {
      giteaUrl: Deno.env.get("GITEA_URL"),
      giteaToken: Deno.env.get("GITEA_TOKEN"),
      giteaRepo: Deno.env.get("GITEA_REPO"),
    };
    const gitea = GiteaConfigSchema.parse(giteaRaw);
    return { ...base, ...gitea };
  }

  return base;
}

Key behavior: GITEA_ENABLED defaults to true (backward compat). When it is false, GITEA_URL, GITEA_TOKEN, and GITEA_REPO are not required and not validated. The flag is parsed from its string form rather than with z.coerce.boolean(), which applies Boolean(value) and would turn the string "false" into true. Any value other than "false" (case-insensitive) leaves Gitea enabled.
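A pitfall worth checking for this flag: JavaScript's Boolean() only tests truthiness, and Zod documents z.coerce.boolean() as applying that same coercion, so an env value of "false" does not come out as false. The dependency-free sketch below shows the behavior and a stricter explicit parser. parseEnvFlag is a hypothetical helper name, and accepting "1"/"0" is an illustrative choice rather than something the plan specifies.

```typescript
// Boolean() only checks truthiness, so any non-empty string is true.
const coerced = Boolean("false");

// Hypothetical helper: parse an env-style flag explicitly instead.
function parseEnvFlag(raw: string | undefined, fallback: boolean): boolean {
  // Unset or blank falls back to the default (true for GITEA_ENABLED).
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = raw.trim().toLowerCase();
  if (value === "true" || value === "1") return true;
  if (value === "false" || value === "0") return false;
  // Reject typos instead of silently picking a mode.
  throw new Error(`invalid boolean flag: ${raw}`);
}
```

Lowercasing matters in this setup because Ansible renders a YAML boolean as "True" or "False" unless the template lowercases it.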

Step 3: Run type checker

Run: deno task check
Expected: Type errors in main.ts and server.ts because Config fields are now optional. Fix in next tasks.

Step 4: Commit

feat: make Gitea env vars conditional via GITEA_ENABLED flag

Task 2: GITEA_ENABLED — Wire Into Main and Server

Update main.ts, server.ts, and poller/loop.ts to handle giteaEnabled=false.

Files:

  • Modify: src/main.ts
  • Modify: src/server.ts
  • Modify: src/poller/loop.ts

Step 1: Update src/main.ts

Conditionally create Gitea client and poller:

import { ZodError } from "zod";
import { loadConfig } from "./config.ts";
import { NetbirdClient } from "./netbird/client.ts";
import { GiteaClient } from "./gitea/client.ts";
import { createHandler } from "./server.ts";
import { startPollerLoop } from "./poller/loop.ts";

let config;
try {
  config = loadConfig();
} catch (err) {
  if (err instanceof ZodError) {
    console.error(
      JSON.stringify({ msg: "invalid config", issues: err.issues }),
    );
    Deno.exit(1);
  }
  throw err;
}

const netbird = new NetbirdClient(config.netbirdApiUrl, config.netbirdApiToken);

const gitea = config.giteaEnabled
  ? new GiteaClient(config.giteaUrl!, config.giteaToken!, config.giteaRepo!)
  : null;

const reconcileInProgress = { value: false };

// Start background poller
const pollerAbort = startPollerLoop({
  config,
  netbird,
  gitea,
  reconcileInProgress,
});

// Start HTTP server
const handler = createHandler({ config, netbird, gitea, reconcileInProgress });
console.log(JSON.stringify({
  msg: "starting",
  port: config.port,
  gitea_enabled: config.giteaEnabled,
}));
Deno.serve({ port: config.port, handler });

// Graceful shutdown
Deno.addSignalListener("SIGTERM", () => {
  console.log(JSON.stringify({ msg: "shutting_down" }));
  pollerAbort.abort();
  Deno.exit(0);
});
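The non-null assertions (config.giteaUrl!) are needed because Config declares the Gitea fields as optional regardless of the flag. An alternative, sketched below with hypothetical type names, is a discriminated union keyed on giteaEnabled, which lets the compiler narrow the fields without the ! operator. This is an option to consider, not what the plan's config.ts does.

```typescript
// Hypothetical types; the field list is trimmed for brevity.
interface BaseFields {
  netbirdApiUrl: string;
  port: number;
}

type GiteaOn = BaseFields & {
  giteaEnabled: true;
  giteaUrl: string;
  giteaToken: string;
  giteaRepo: string;
};

type GiteaOff = BaseFields & { giteaEnabled: false };

type UnionConfig = GiteaOn | GiteaOff;

// Returns the Gitea connection settings, or null in standalone mode.
function giteaSettings(
  config: UnionConfig,
): { url: string; repo: string } | null {
  if (config.giteaEnabled === true) {
    // Narrowed to GiteaOn here: giteaUrl and giteaRepo are plain strings.
    return { url: config.giteaUrl, repo: config.giteaRepo };
  }
  return null;
}
```

The trade-off is that loadConfig() would have to return the two variants explicitly, which is slightly more code than spreading the parsed objects together.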

Step 2: Update ServerContext in src/server.ts

Change gitea field to nullable:

export interface ServerContext {
  config: Config;
  netbird: NetbirdClient;
  gitea: GiteaClient | null;
  reconcileInProgress: { value: boolean };
}

The handleReconcile function does NOT use gitea — it receives desired state from the request body. No changes needed there.

handleSyncEvents uses gitea via pollOnce. No guard is needed in the handler itself, because pollOnce handles a null client; pass it through:

async function handleSyncEvents(ctx: ServerContext): Promise<Response> {
  // ctx.gitea may be null: sync-events is still allowed, and the poller
  // runs without the Gitea commit-back.
  const pollerCtx: PollerContext = {
    config: ctx.config,
    netbird: ctx.netbird,
    gitea: ctx.gitea,
    reconcileInProgress: { value: false },
  };
  // ... rest unchanged
}

Step 3: Update PollerContext in src/poller/loop.ts

Make gitea nullable. Guard all Gitea calls:

export interface PollerContext {
  config: Config;
  netbird: NetbirdClient;
  gitea: GiteaClient | null;
  reconcileInProgress: { value: boolean };
}

In pollOnce():

The poller needs to know which setup keys exist in order to detect enrollments. Without Gitea it cannot read netbird.json from the repo, so when gitea is null the poller still runs but only detects enrollment events and renames peers. It does NOT read desired state or commit back. There are two options for obtaining the setup key names in that mode:

  1. Read from a local file at {dataDir}/netbird.json.
  2. Fetch setup keys directly from the NetBird API (they are already there after a reconcile).

Option 2 is cleaner, because the reconciler has just created these keys in NetBird:

export async function pollOnce(ctx: PollerContext): Promise<void> {
  const { config, netbird, gitea, reconcileInProgress } = ctx;

  if (reconcileInProgress.value) {
    console.log(
      JSON.stringify({ msg: "poll_skipped", reason: "reconcile_in_progress" }),
    );
    return;
  }

  const pollerState = await loadPollerState(config.dataDir);

  // Determine unenrolled setup keys
  let unenrolledKeys: Set<string>;
  let desired: DesiredState | null = null;
  let fileSha: string | null = null;

  if (gitea) {
    // Full Gitea mode: read netbird.json from repo
    const file = await gitea.getFileContent("netbird.json", "main");
    desired = DesiredStateSchema.parse(JSON.parse(file.content));
    fileSha = file.sha;
    unenrolledKeys = new Set<string>();
    for (
      const [name, key] of Object.entries(desired.setup_keys) as [
        string,
        SetupKeyConfig,
      ][]
    ) {
      if (!key.enrolled) unenrolledKeys.add(name);
    }
  } else {
    // Standalone mode: get setup key names from NetBird API directly
    const keys = await netbird.listSetupKeys();
    unenrolledKeys = new Set<string>();
    for (const key of keys) {
      // Consider unused keys as "unenrolled"
      if (key.valid && !key.revoked && key.used_times < key.usage_limit) {
        unenrolledKeys.add(key.name);
      }
    }
  }

  if (unenrolledKeys.size === 0) {
    console.log(JSON.stringify({ msg: "poll_no_unenrolled_keys" }));
    return;
  }

  const events = await netbird.listEvents();
  const enrollments = processEnrollmentEvents(
    events,
    unenrolledKeys,
    pollerState.lastEventTimestamp,
  );

  if (enrollments.length === 0) return;

  console.log(JSON.stringify({
    msg: "poll_enrollments_detected",
    count: enrollments.length,
  }));

  let latestTimestamp = pollerState.lastEventTimestamp;

  for (const enrollment of enrollments) {
    if (gitea && desired && fileSha) {
      // Full mode: rename peer + commit enrolled:true
      await processEnrollment(
        ctx,
        enrollment,
        desired,
        fileSha,
        (newSha, newDesired) => {
          fileSha = newSha;
          desired = newDesired;
        },
      );
    } else {
      // Standalone mode: rename peer only, log enrollment
      await processEnrollmentStandalone(netbird, enrollment);
    }

    if (!latestTimestamp || enrollment.timestamp > latestTimestamp) {
      latestTimestamp = enrollment.timestamp;
    }
  }

  await savePollerState(config.dataDir, {
    lastEventTimestamp: latestTimestamp,
  });
}
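The standalone branch above boils down to a pure filter over the setup keys returned by the API, which makes it easy to test in isolation. The sketch below mirrors the same predicate. SetupKeyLike and unenrolledKeyNames are hypothetical names, and the interface lists only the fields the filter reads, not the full NetBird setup key shape.

```typescript
// Only the fields the filter needs; the real setup key type has more.
interface SetupKeyLike {
  name: string;
  valid: boolean;
  revoked: boolean;
  used_times: number;
  usage_limit: number;
}

// Names of keys that are still usable, treated as "unenrolled" by the poller.
function unenrolledKeyNames(keys: readonly SetupKeyLike[]): Set<string> {
  const names = new Set<string>();
  for (const key of keys) {
    // Same predicate as the standalone branch of pollOnce().
    if (key.valid && !key.revoked && key.used_times < key.usage_limit) {
      names.add(key.name);
    }
  }
  return names;
}
```

One consequence of this predicate is that a key with usage_limit greater than 1 stays in the set after its first enrollment, so every peer enrolled with it would be renamed to the same key name. For the one-off keys used in this PoC that does not arise.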

Add a new standalone enrollment processor (same file, after processEnrollment):

async function processEnrollmentStandalone(
  netbird: NetbirdClient,
  enrollment: EnrollmentDetection,
): Promise<void> {
  const { setupKeyName, peerId, peerHostname } = enrollment;

  try {
    await netbird.updatePeer(peerId, { name: setupKeyName });
    console.log(JSON.stringify({
      msg: "peer_renamed",
      peer_id: peerId,
      from: peerHostname,
      to: setupKeyName,
      mode: "standalone",
    }));
  } catch (err) {
    console.error(JSON.stringify({
      msg: "peer_rename_failed",
      peer_id: peerId,
      error: err instanceof Error ? err.message : String(err),
    }));
  }

  console.log(JSON.stringify({
    msg: "enrollment_detected",
    setup_key: setupKeyName,
    peer_id: peerId,
    mode: "standalone",
    note: "gitea commit-back skipped (GITEA_ENABLED=false)",
  }));
}

Step 4: Run type checker and tests

Run: deno task check && deno task test
Expected: Check passes. Tests that build a PollerContext with a Gitea mock should pass unchanged, since a non-null mock still satisfies GiteaClient | null. Update any that do not.

Step 5: Commit

feat: wire GITEA_ENABLED flag into main, server, and poller

When GITEA_ENABLED=false, the reconciler starts without a Gitea client.
The poller detects enrollments by reading setup keys directly from the
NetBird API and renames peers, but skips the commit-back of enrolled:true.

Task 3: Update Integration Tests for GITEA_ENABLED

Ensure existing tests pass and add a test for standalone (no-Gitea) mode.

Files:

  • Modify: src/integration.test.ts

Step 1: Read the existing integration test

File: src/integration.test.ts — understand the current test setup and how it creates the server handler.

Step 2: Update tests

The existing tests create a ServerContext with a mock gitea. They should continue working since gitea is now GiteaClient | null and the mocks are non-null.

Add one new test:

Deno.test("reconcile works with gitea=null (standalone mode)", async () => {
  const ctx: ServerContext = {
    config: { ...baseConfig, giteaEnabled: false },
    netbird: mockNetbird,
    gitea: null,
    reconcileInProgress: { value: false },
  };
  const handler = createHandler(ctx);

  const resp = await handler(
    new Request("http://localhost/reconcile?dry_run=true", {
      method: "POST",
      headers: { "Authorization": `Bearer ${baseConfig.reconcilerToken}` },
      body: JSON.stringify({
        groups: { "test-group": { peers: [] } },
        setup_keys: {},
      }),
    }),
  );

  assertEquals(resp.status, 200);
  const body = await resp.json();
  assertEquals(body.status, "planned");
});

Exact mock setup depends on the existing test file — read it first to match patterns.

Step 3: Run tests

Run: deno task test
Expected: All tests pass, including the new one.

Step 4: Commit

test: add integration test for standalone (no-Gitea) reconcile mode

Task 4: State Export — src/export.ts

New module that transforms ActualState into a valid netbird.json.

Files:

  • Create: src/export.ts
  • Create: src/export.test.ts

Step 1: Write the export test

import { assertEquals } from "@std/assert";
import { exportState } from "./export.ts";
import type { ActualState } from "./state/actual.ts";

Deno.test("exportState produces valid DesiredState from actual", () => {
  const actual: ActualState = {
    groups: [
      {
        id: "g1",
        name: "pilots",
        peers_count: 1,
        peers: [{ id: "p1", name: "Pilot-Hawk-1" }],
        issued: "api",
      },
      {
        id: "g2",
        name: "ground-stations",
        peers_count: 0,
        peers: [],
        issued: "api",
      },
      {
        id: "g-all",
        name: "All",
        peers_count: 1,
        peers: [{ id: "p1", name: "Pilot-Hawk-1" }],
        issued: "integration",
      },
    ],
    groupsByName: new Map(),
    groupsById: new Map(),
    setupKeys: [
      {
        id: 1,
        name: "Pilot-Hawk-1",
        type: "one-off",
        key: "secret",
        expires: "2026-03-13T00:00:00Z",
        valid: true,
        revoked: false,
        used_times: 1,
        state: "overused",
        auto_groups: ["g1"],
        usage_limit: 1,
      },
      {
        id: 2,
        name: "GS-Hawk-1",
        type: "one-off",
        key: "secret2",
        expires: "2026-03-13T00:00:00Z",
        valid: true,
        revoked: false,
        used_times: 0,
        state: "valid",
        auto_groups: ["g2"],
        usage_limit: 1,
      },
    ],
    setupKeysByName: new Map(),
    peers: [{
      id: "p1",
      name: "Pilot-Hawk-1",
      ip: "100.64.0.1",
      connected: true,
      hostname: "laptop",
      os: "linux",
      version: "0.35.0",
      groups: [{ id: "g1", name: "pilots" }],
      last_seen: "2026-03-06T10:00:00Z",
      dns_label: "pilot-hawk-1",
      login_expiration_enabled: false,
      ssh_enabled: false,
      inactivity_expiration_enabled: false,
    }],
    peersByName: new Map(),
    peersById: new Map(),
    policies: [
      {
        id: "pol1",
        name: "pilots-to-gs",
        description: "Allow pilots to reach GS",
        enabled: true,
        rules: [{
          name: "pilots-to-gs",
          description: "",
          enabled: true,
          action: "accept",
          bidirectional: true,
          protocol: "all",
          sources: [{ id: "g1", name: "pilots" }],
          destinations: [{ id: "g2", name: "ground-stations" }],
        }],
      },
    ],
    policiesByName: new Map(),
    routes: [],
    routesByNetworkId: new Map(),
    dns: [],
    dnsByName: new Map(),
  };

  const exported = exportState(actual);

  // Groups: "All" and system groups should be excluded
  assertEquals(Object.keys(exported.groups).sort(), [
    "ground-stations",
    "pilots",
  ]);
  assertEquals(exported.groups["pilots"].peers, ["Pilot-Hawk-1"]);
  assertEquals(exported.groups["ground-stations"].peers, []);

  // Setup keys
  assertEquals(Object.keys(exported.setup_keys).sort(), [
    "GS-Hawk-1",
    "Pilot-Hawk-1",
  ]);
  assertEquals(exported.setup_keys["Pilot-Hawk-1"].enrolled, true); // used_times >= usage_limit
  assertEquals(exported.setup_keys["GS-Hawk-1"].enrolled, false); // not yet used

  // Policies
  assertEquals(Object.keys(exported.policies), ["pilots-to-gs"]);
  assertEquals(exported.policies["pilots-to-gs"].sources, ["pilots"]);
  assertEquals(exported.policies["pilots-to-gs"].destinations, [
    "ground-stations",
  ]);
  assertEquals(exported.policies["pilots-to-gs"].bidirectional, true);

  // Routes and DNS should be empty
  assertEquals(exported.routes, {});
  assertEquals(exported.dns.nameserver_groups, {});
});

Deno.test("exportState handles empty state", () => {
  const actual: ActualState = {
    groups: [{
      id: "g-all",
      name: "All",
      peers_count: 0,
      peers: [],
      issued: "integration",
    }],
    groupsByName: new Map(),
    groupsById: new Map(),
    setupKeys: [],
    setupKeysByName: new Map(),
    peers: [],
    peersByName: new Map(),
    peersById: new Map(),
    policies: [],
    policiesByName: new Map(),
    routes: [],
    routesByNetworkId: new Map(),
    dns: [],
    dnsByName: new Map(),
  };

  const exported = exportState(actual);
  assertEquals(exported.groups, {});
  assertEquals(exported.setup_keys, {});
  assertEquals(exported.policies, {});
  assertEquals(exported.routes, {});
});

Deno.test("exportState maps auto_groups IDs to group names", () => {
  const actual: ActualState = {
    groups: [
      { id: "g1", name: "pilots", peers_count: 0, peers: [], issued: "api" },
    ],
    groupsByName: new Map([["pilots", {
      id: "g1",
      name: "pilots",
      peers_count: 0,
      peers: [],
      issued: "api" as const,
    }]]),
    groupsById: new Map([["g1", {
      id: "g1",
      name: "pilots",
      peers_count: 0,
      peers: [],
      issued: "api" as const,
    }]]),
    setupKeys: [
      {
        id: 1,
        name: "Test-Key",
        type: "one-off" as const,
        key: "k",
        expires: "",
        valid: true,
        revoked: false,
        used_times: 0,
        state: "valid" as const,
        auto_groups: ["g1"],
        usage_limit: 1,
      },
    ],
    setupKeysByName: new Map(),
    peers: [],
    peersByName: new Map(),
    peersById: new Map(),
    policies: [],
    policiesByName: new Map(),
    routes: [],
    routesByNetworkId: new Map(),
    dns: [],
    dnsByName: new Map(),
  };

  const exported = exportState(actual);
  // auto_groups should contain group names, not IDs
  assertEquals(exported.setup_keys["Test-Key"].auto_groups, ["pilots"]);
});

Step 2: Run test to verify it fails

Run: deno test src/export.test.ts
Expected: FAIL (exportState not found).

Step 3: Implement src/export.ts

import type { ActualState } from "./state/actual.ts";
import type { DesiredState } from "./state/schema.ts";

/** Groups that are auto-created by NetBird and should not be exported. */
const SYSTEM_GROUP_NAMES = new Set(["All"]);
const SYSTEM_GROUP_ISSUERS = new Set(["integration", "jwt"]);

/**
 * Transforms live ActualState (fetched from NetBird API) into a valid
 * DesiredState object (netbird.json format). Maps all IDs to names.
 *
 * Skips system-generated groups (All, JWT-issued, integration-issued).
 * Setup keys with used_times >= usage_limit are marked enrolled:true.
 */
export function exportState(actual: ActualState): DesiredState {
  // Build ID->name lookup for groups (needed for auto_groups, policies, routes, DNS)
  const groupIdToName = new Map<string, string>();
  for (const g of actual.groups) {
    groupIdToName.set(g.id, g.name);
  }

  // Build setup key name set for peer->key name mapping in groups
  const setupKeyNames = new Set(actual.setupKeys.map((k) => k.name));

  // --- Groups ---
  const groups: DesiredState["groups"] = {};
  for (const g of actual.groups) {
    if (SYSTEM_GROUP_NAMES.has(g.name) || SYSTEM_GROUP_ISSUERS.has(g.issued)) {
      continue;
    }
    // Map peer names — only include peers whose name matches a setup key
    // (the reconciler convention is that peer names equal setup key names)
    const peers = g.peers
      .map((p) => p.name)
      .filter((name) => setupKeyNames.has(name));
    groups[g.name] = { peers };
  }

  // --- Setup Keys ---
  const setup_keys: DesiredState["setup_keys"] = {};
  for (const k of actual.setupKeys) {
    // Map auto_groups from IDs to names
    const autoGroupNames = k.auto_groups
      .map((id) => groupIdToName.get(id))
      .filter((name): name is string => name !== undefined);

    setup_keys[k.name] = {
      type: k.type,
      expires_in: 604800, // Default 7 days — NetBird API doesn't return the original expires_in
      usage_limit: k.usage_limit,
      auto_groups: autoGroupNames,
      enrolled: k.used_times >= k.usage_limit && k.usage_limit > 0,
    };
  }

  // --- Policies ---
  const policies: DesiredState["policies"] = {};
  for (const p of actual.policies) {
    if (p.rules.length === 0) continue;
    const rule = p.rules[0]; // Reconciler creates single-rule policies

    const sources = rule.sources.map((s) =>
      typeof s === "string" ? (groupIdToName.get(s) ?? s) : s.name
    );
    const destinations = rule.destinations.map((d) =>
      typeof d === "string" ? (groupIdToName.get(d) ?? d) : d.name
    );

    policies[p.name] = {
      description: p.description,
      enabled: p.enabled,
      sources,
      destinations,
      bidirectional: rule.bidirectional,
      protocol: rule.protocol,
      action: rule.action,
      ...(rule.ports && rule.ports.length > 0 ? { ports: rule.ports } : {}),
    };
  }

  // --- Routes ---
  const routes: DesiredState["routes"] = {};
  for (const r of actual.routes) {
    const peerGroups = (r.peer_groups ?? [])
      .map((id) => groupIdToName.get(id))
      .filter((name): name is string => name !== undefined);
    const distributionGroups = r.groups
      .map((id) => groupIdToName.get(id))
      .filter((name): name is string => name !== undefined);

    routes[r.network_id] = {
      description: r.description,
      ...(r.network ? { network: r.network } : {}),
      ...(r.domains && r.domains.length > 0 ? { domains: r.domains } : {}),
      peer_groups: peerGroups,
      metric: r.metric,
      masquerade: r.masquerade,
      distribution_groups: distributionGroups,
      enabled: r.enabled,
      keep_route: r.keep_route,
    };
  }

  // --- DNS ---
  const nameserver_groups: DesiredState["dns"]["nameserver_groups"] = {};
  for (const ns of actual.dns) {
    const nsGroups = ns.groups
      .map((id) => groupIdToName.get(id))
      .filter((name): name is string => name !== undefined);

    nameserver_groups[ns.name] = {
      description: ns.description,
      nameservers: ns.nameservers.map((n) => ({
        ip: n.ip,
        ns_type: n.ns_type,
        port: n.port,
      })),
      enabled: ns.enabled,
      groups: nsGroups,
      primary: ns.primary,
      domains: ns.domains,
      search_domains_enabled: ns.search_domains_enabled,
    };
  }

  return {
    groups,
    setup_keys,
    policies,
    routes,
    dns: { nameserver_groups },
  };
}
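The ID-to-name mapping appears four times in exportState (auto_groups, route peer groups, distribution groups, DNS groups). A small helper could express it once. The sketch below is a possible refactor under that assumption; idsToNames is a hypothetical name that does not exist in the plan's code.

```typescript
// Maps group IDs to names, dropping IDs that are not in the lookup.
function idsToNames(
  ids: readonly string[],
  idToName: ReadonlyMap<string, string>,
): string[] {
  return ids
    .map((id) => idToName.get(id))
    // Type guard so the result is string[] rather than (string | undefined)[].
    .filter((name): name is string => name !== undefined);
}

const lookup = new Map([
  ["g1", "pilots"],
  ["g2", "ground-stations"],
]);

// "missing" has no entry, so it is silently dropped from the result.
const names = idsToNames(["g1", "missing", "g2"], lookup);
```

Note that unknown IDs vanish without a trace here, exactly as in the inline version. If a dangling reference should be visible in the export, logging or collecting the dropped IDs would be the place to add it.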

Step 4: Run tests

Run: deno test src/export.test.ts
Expected: All 3 tests pass.

Step 5: Commit

feat: add state export module (ActualState -> netbird.json)

Task 5: State Export — HTTP Endpoint and CLI Flag

Wire the export function into the server (GET /export) and add --export CLI mode.

Files:

  • Modify: src/server.ts
  • Modify: src/main.ts
  • Modify: src/integration.test.ts

Step 1: Add GET /export to src/server.ts

In createHandler(), add before the 404 fallback:

if (url.pathname === "/export" && req.method === "GET") {
  return handleExport(ctx);
}

Add the handler function:

async function handleExport(ctx: ServerContext): Promise<Response> {
  try {
    const actual = await fetchActualState(ctx.netbird);
    const state = exportState(actual);
    return Response.json({
      status: "ok",
      state,
      meta: {
        exported_at: new Date().toISOString(),
        source_url: ctx.config.netbirdApiUrl,
        groups_count: Object.keys(state.groups).length,
        setup_keys_count: Object.keys(state.setup_keys).length,
        policies_count: Object.keys(state.policies).length,
        routes_count: Object.keys(state.routes).length,
        dns_count: Object.keys(state.dns.nameserver_groups).length,
      },
    });
  } catch (err) {
    console.error(JSON.stringify({
      msg: "export_error",
      error: err instanceof Error ? err.message : String(err),
    }));
    return Response.json(
      {
        status: "error",
        error: err instanceof Error ? err.message : String(err),
      },
      { status: 500 },
    );
  }
}

Add import at top of server.ts:

import { exportState } from "./export.ts";

The /export endpoint requires bearer auth (same as other endpoints — it exposes internal state).
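Because the new route exposes internal state, it only stays protected if it is matched after the existing auth check in createHandler(). How server.ts implements that check is not shown in this plan; the sketch below is a generic bearer-token comparison with a hypothetical function name, included only to make the expected behavior concrete.

```typescript
// Hypothetical helper; takes the raw Authorization header value (or null).
function isAuthorized(
  authorization: string | null,
  expectedToken: string,
): boolean {
  if (authorization === null) return false;
  const prefix = "Bearer ";
  // Reject other schemes such as "Basic ...".
  if (!authorization.startsWith(prefix)) return false;
  // Plain comparison: adequate for a PoC sketch, but not constant-time.
  return authorization.slice(prefix.length) === expectedToken;
}
```

A request to GET /export without the header, or with the wrong token, should be rejected before handleExport is ever called.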

Step 2: Add --export CLI mode to src/main.ts

Before the config loading block, check for CLI args:

// CLI mode: --export dumps state to stdout and exits
if (Deno.args.includes("--export")) {
  const apiUrl = getCliArg("--netbird-api-url") ??
    Deno.env.get("NETBIRD_API_URL");
  const apiToken = getCliArg("--netbird-api-token") ??
    Deno.env.get("NETBIRD_API_TOKEN");

  if (!apiUrl || !apiToken) {
    console.error(
      "Usage: --export --netbird-api-url <url> --netbird-api-token <token>",
    );
    console.error("Or set NETBIRD_API_URL and NETBIRD_API_TOKEN env vars.");
    Deno.exit(1);
  }

  const client = new NetbirdClient(apiUrl, apiToken);
  const actual = await fetchActualState(client);
  const state = exportState(actual);
  console.log(JSON.stringify(state, null, 2));
  Deno.exit(0);
}

function getCliArg(name: string): string | undefined {
  const idx = Deno.args.indexOf(name);
  if (idx === -1 || idx + 1 >= Deno.args.length) return undefined;
  return Deno.args[idx + 1];
}

Add imports to main.ts:

import { fetchActualState } from "./state/actual.ts";
import { exportState } from "./export.ts";

This block must come BEFORE the loadConfig() call, since --export mode doesn't need the full config.
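One edge case in getCliArg: it returns whatever follows the flag, so an invocation like --netbird-api-url --netbird-api-token abc would yield "--netbird-api-token" as the URL. The variant below is a sketch that takes the argument list as a parameter (which also makes it testable without Deno.args) and treats a following flag as a missing value. getArgValue is a hypothetical name.

```typescript
// Returns the value following `name`, or undefined if it is absent.
function getArgValue(
  args: readonly string[],
  name: string,
): string | undefined {
  const idx = args.indexOf(name);
  if (idx === -1 || idx + 1 >= args.length) return undefined;
  const value = args[idx + 1];
  // A following flag means no value was supplied for this one.
  return value.startsWith("--") ? undefined : value;
}
```

In main.ts the call would pass Deno.args as the first argument. With this variant the usage message is printed when a value is forgotten, instead of the client being built with a flag name as its URL.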

Step 3: Add integration test for /export

Deno.test("GET /export returns exported state", async () => {
  // ... create handler with mock netbird that returns test data
  const resp = await handler(
    new Request("http://localhost/export", {
      method: "GET",
      headers: { "Authorization": `Bearer ${baseConfig.reconcilerToken}` },
    }),
  );

  assertEquals(resp.status, 200);
  const body = await resp.json();
  assertEquals(body.status, "ok");
  assertEquals(typeof body.state, "object");
  assertEquals(typeof body.meta.exported_at, "string");
});

Step 4: Run tests

Run: deno task check && deno task test
Expected: All pass.

Step 5: Commit

feat: add GET /export endpoint and --export CLI mode for state export

Task 6: Update .env.example and deno.json

Add GITEA_ENABLED to the env example and add an export task to deno.json. The Dockerfile needs no change, since it already passes env vars through.

Files:

  • Modify: deploy/.env.example
  • Modify: deno.json
  • Dockerfile: no change needed (already passes env vars)

Step 1: Update deploy/.env.example

Add after NETBIRD_API_TOKEN:

# Set to false to run without Gitea integration (standalone mode)
GITEA_ENABLED=true

Step 2: Add a deno.json task for export

Add to deno.json tasks:

"export": "deno run --allow-net --allow-env src/main.ts --export"

Step 3: Commit

chore: add GITEA_ENABLED to .env.example, add export task to deno.json

Task 7: PoC Ansible — Inventory and Variables

Create the Ansible structure for deploying the full stack to VPS-A.

Files:

  • Create: poc/ansible/inventory.yml
  • Create: poc/ansible/group_vars/all/vars.yml
  • Create: poc/ansible/group_vars/all/vault.yml.example
  • Create: poc/ansible/.gitignore

Step 1: Create inventory

---
all:
  children:
    poc_servers:
      hosts:
        netbird-poc-a:
          ansible_host: 46.225.220.61
          ansible_ssh_private_key_file: ~/.ssh/id_ed25519
          ansible_user: root

Step 2: Create vars.yml

---
# PoC Reconciler Validation — Non-secret configuration

# Domain
netbird_domain: vps-a.networkmonitor.cc

# Versions
netbird_version: "0.63.0"
dashboard_version: "v2.27.1"
caddy_version: "2.10.2"
coturn_version: "4.8.0-r0"

# Reconciler
reconciler_image: "ghcr.io/blastpilot/netbird-reconciler:latest"
reconciler_port: 8080
gitea_enabled: false # Standalone mode for initial testing

# Gitea
gitea_version: "1.23"
gitea_http_port: 3000
gitea_ssh_port: 2222
gitea_admin_user: "blastpilot"
gitea_org_name: "BlastPilot"
gitea_repo_name: "netbird-gitops"

# Directories
base_dir: /opt/netbird-poc

Step 3: Create vault.yml.example

---
# Copy to vault.yml and fill in values.
# This file is gitignored — do NOT commit real secrets.

# Auto-generated at deploy time (leave empty, playbook generates them):
vault_encryption_key: ""
vault_turn_password: ""
vault_relay_secret: ""

# Reconciler auth token (generate with: openssl rand -hex 32)
vault_reconciler_token: ""

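# Gitea admin password (for initial setup)
vault_gitea_admin_password: ""

# NetBird API token (created manually in the dashboard after the first deploy)
vault_netbird_api_token: ""

# Gitea API token (only used when gitea_enabled is true)
vault_gitea_token: ""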

Step 4: Create .gitignore

group_vars/all/vault.yml

Step 5: Commit

feat: add PoC ansible inventory and variable files

Task 8: PoC Ansible — Templates

Create all Docker Compose and config templates. Adapt from the existing PoC templates in PoC/netbird-routing-docs-poc/ansible/netbird/templates/.

Files:

  • Create: poc/ansible/templates/docker-compose.yml.j2
  • Create: poc/ansible/templates/management.json.j2
  • Create: poc/ansible/templates/Caddyfile.j2
  • Create: poc/ansible/templates/dashboard.env.j2
  • Create: poc/ansible/templates/relay.env.j2
  • Create: poc/ansible/templates/turnserver.conf.j2
  • Create: poc/ansible/templates/reconciler.env.j2

Step 1: Create docker-compose.yml.j2

Extend the existing PoC template with Gitea and Reconciler services. All on a shared netbird network.

Services:

  • caddy — reverse proxy (same as existing PoC)
  • dashboard — NetBird dashboard
  • signal — NetBird signal server
  • relay — NetBird relay
  • management — NetBird management API
  • coturn — STUN/TURN
  • gitea — Gitea (SQLite, minimal config)
  • reconciler — our reconciler service

Reconciler image: there is no release yet, so for the PoC the playbook builds the Docker image on VPS-A as part of the deployment (clone the repo, then docker build). The Dockerfile already works. Copying a compiled binary or mounting a pre-compiled Deno artifact would also work, but building on the host is the simplest option.

Step 2: Create all other templates

Copy from existing PoC (management.json.j2, Caddyfile.j2, dashboard.env.j2, relay.env.j2, turnserver.conf.j2) and adapt variable names.

Add to Caddyfile.j2 — route for reconciler:

# Reconciler API
handle_path /reconciler/* {
    reverse_proxy reconciler:{{ reconciler_port }}
}

Add to Caddyfile.j2 — route for Gitea:

# Gitea
handle_path /gitea/* {
    reverse_proxy gitea:{{ gitea_http_port }}
}

Alternatively, skip the Caddy route for Gitea. Serving Gitea behind a subpath requires a matching ROOT_URL setting, so for a PoC it is simplest to expose Gitea directly on port 3000 (not through Caddy).

Create reconciler.env.j2:

NETBIRD_API_URL=https://{{ netbird_domain }}/api
NETBIRD_API_TOKEN={{ vault_netbird_api_token }}
RECONCILER_TOKEN={{ vault_reconciler_token }}
GITEA_ENABLED={{ gitea_enabled | bool | lower }}
{% if gitea_enabled | bool %}
GITEA_URL=http://gitea:{{ gitea_http_port }}
GITEA_TOKEN={{ vault_gitea_token }}
GITEA_REPO={{ gitea_org_name }}/{{ gitea_repo_name }}
{% endif %}
POLL_INTERVAL_SECONDS=30
PORT={{ reconciler_port }}
DATA_DIR=/data

Step 3: Commit

feat: add PoC ansible templates for full stack deployment

Task 9: PoC Ansible — Playbook

The main playbook that orchestrates deployment.

Files:

  • Create: poc/ansible/playbook.yml
  • Create: poc/ansible/files/netbird-seed.json

Step 1: Create seed state file

poc/ansible/files/netbird-seed.json:

{
  "groups": {
    "ground-stations": { "peers": [] },
    "pilots": { "peers": [] }
  },
  "setup_keys": {
    "GS-TestHawk-1": {
      "type": "one-off",
      "expires_in": 604800,
      "usage_limit": 1,
      "auto_groups": ["ground-stations"],
      "enrolled": false
    },
    "Pilot-TestHawk-1": {
      "type": "one-off",
      "expires_in": 604800,
      "usage_limit": 1,
      "auto_groups": ["pilots"],
      "enrolled": false
    }
  },
  "policies": {
    "pilots-to-gs": {
      "enabled": true,
      "sources": ["pilots"],
      "destinations": ["ground-stations"],
      "bidirectional": true
    }
  },
  "routes": {},
  "dns": { "nameserver_groups": {} }
}
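Every group named in auto_groups, sources, and destinations of the seed must also appear under groups, otherwise the first reconcile would reference a group that does not exist. Whether DesiredStateSchema already enforces this is not stated in the plan, so the check below is a standalone sketch with hypothetical names (SeedState, findUnknownGroupRefs) and a trimmed copy of the seed.

```typescript
// Trimmed shape: only the fields needed for the reference check.
interface SeedState {
  groups: Record<string, { peers: string[] }>;
  setup_keys: Record<string, { auto_groups: string[] }>;
  policies: Record<string, { sources: string[]; destinations: string[] }>;
}

// Returns one message per reference to a group missing from `groups`.
function findUnknownGroupRefs(state: SeedState): string[] {
  const known = new Set(Object.keys(state.groups));
  const problems: string[] = [];
  for (const [keyName, key] of Object.entries(state.setup_keys)) {
    for (const group of key.auto_groups) {
      if (!known.has(group)) {
        problems.push(`setup_keys.${keyName}.auto_groups: ${group}`);
      }
    }
  }
  for (const [policyName, policy] of Object.entries(state.policies)) {
    for (const group of [...policy.sources, ...policy.destinations]) {
      if (!known.has(group)) problems.push(`policies.${policyName}: ${group}`);
    }
  }
  return problems;
}

// The seed from this step, reduced to the fields checked above.
const seed: SeedState = {
  groups: { "ground-stations": { peers: [] }, pilots: { peers: [] } },
  setup_keys: {
    "GS-TestHawk-1": { auto_groups: ["ground-stations"] },
    "Pilot-TestHawk-1": { auto_groups: ["pilots"] },
  },
  policies: {
    "pilots-to-gs": { sources: ["pilots"], destinations: ["ground-stations"] },
  },
};
```

For the seed as written, the function finds no dangling references. A check like this could run in CI before a changed netbird.json is sent to /reconcile.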

Step 2: Create playbook

poc/ansible/playbook.yml — Task groups:

  1. Generate secrets: encryption key, TURN password, relay secret, reconciler token (if not in vault.yml)
  2. Install Docker: apt repo, docker-ce, docker-compose-plugin
  3. Create directories: {{ base_dir }}/{config,data,reconciler-data,gitea-data}
  4. Template configs: all .j2 templates → {{ base_dir }}/config/
  5. Build reconciler image: clone repo (or copy Dockerfile + src), docker build
  6. Start stack: docker compose up -d
  7. Wait for health: poll management + reconciler health endpoints
  8. Create NetBird admin: first login to embedded IdP creates admin
  9. Generate NetBird API token: via management API (may need manual step)
  10. (If Gitea enabled) Create admin user, org, repo, seed netbird.json
  11. Print summary: URLs, credentials, next steps

Note: generating a NetBird API token programmatically requires authenticating via the embedded IdP OAuth2 flow, which is non-trivial to automate. For the PoC, this will likely be a manual step:

  1. Open dashboard, create admin account
  2. Go to Settings → API Tokens → generate token
  3. Put token in vault.yml as vault_netbird_api_token
  4. Re-run playbook (or just restart reconciler container)

The playbook should handle this gracefully — start the reconciler last, after the token is known.

Step 3: Commit

feat: add PoC ansible playbook and seed state file

Task 10: PoC Ansible — README

Document how to use the PoC.

Files:

  • Create: poc/README.md

Step 1: Write README

Cover:

  • Prerequisites (SSH access to VPS-A, DNS records)
  • Setup steps (copy vault.yml.example, fill secrets)
  • Deploy command
  • Post-deploy steps (create NetBird admin, generate API token)
  • Testing steps (dry-run reconcile, apply, enrollment)
  • Teardown

Step 2: Commit

docs: add PoC README with setup and testing instructions

Task 11: Run All Tests and Type Check

Final verification before deployment.

Files: None (verification only)

Step 1: Run full test suite

Run: deno task check && deno task test && deno task lint
Expected: All pass.

Step 2: Run format check

Run: deno task fmt --check
Expected: No formatting issues (or fix them).

Step 3: Commit any fixes

chore: fix lint/format issues

Summary of Tasks

| #  | Task                                 | Type        | Est.   |
|----|--------------------------------------|-------------|--------|
| 1  | GITEA_ENABLED feature flag: config   | Code        | 10 min |
| 2  | Wire flag into main, server, poller  | Code        | 20 min |
| 3  | Update integration tests             | Test        | 10 min |
| 4  | State export module                  | Code + Test | 20 min |
| 5  | Export endpoint + CLI flag           | Code + Test | 15 min |
| 6  | Update .env.example and deno.json    | Config      | 5 min  |
| 7  | PoC Ansible: inventory + vars        | Ansible     | 10 min |
| 8  | PoC Ansible: templates               | Ansible     | 30 min |
| 9  | PoC Ansible: playbook + seed         | Ansible     | 30 min |
| 10 | PoC README                           | Docs        | 10 min |
| 11 | Final verification                   | Test        | 5 min  |

Total estimated: ~2.75 hours (165 min)

Tasks 1-6 are the reconciler code changes (feature flag + export). Tasks 7-10 are the Ansible deployment. Task 11 is final verification.

Tasks 1-6 must be done sequentially (each builds on the previous). Tasks 7-10 can be done in parallel with each other but after Tasks 1-6 (the Ansible templates reference the feature flag and export endpoint).