netbird-gitops/docs/plans/2026-03-06-reconciler-poc-implementation.md
2026-03-06 13:21:08 +02:00

# Reconciler PoC Validation — Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to
> implement this plan task-by-task.
**Goal:** Validate the reconciler end-to-end on a fresh, isolated NetBird
instance (VPS-A) with state export and a `GITEA_ENABLED` feature flag.
**Architecture:** Deploy NetBird + Caddy + Gitea + Reconciler on VPS-A via
Ansible. Add `GITEA_ENABLED` flag to make Gitea optional. Add state export tool
(`GET /export` + `--export` CLI). Test reconcile, enrollment detection, and
state round-trip.
**Tech Stack:** Deno 2.x / TypeScript, Zod, Docker Compose, Ansible, Caddy,
NetBird self-hosted, Gitea
**Design doc:** `docs/plans/2026-03-06-reconciler-poc-validation.md`
---
## Task 1: `GITEA_ENABLED` Feature Flag — Config
Make Gitea env vars conditionally required based on a new `GITEA_ENABLED` env
var.
**Files:**
- Modify: `src/config.ts`
- Test: `src/config.test.ts` (create)
**Step 1: Write config tests**
Create `src/config.test.ts`:
```typescript
import { assertEquals, assertThrows } from "@std/assert";
// We can't easily test loadConfig() directly because it reads Deno.env.
// Instead, test the schema validation logic by extracting it.
// For now, integration-level tests via the main module are sufficient.
// The key behavioral test is: does the service start without GITEA_* vars
// when GITEA_ENABLED=false?
// This is tested implicitly via integration.test.ts updates in Task 3.
```
`loadConfig()` reads directly from `Deno.env`, which makes it awkward to
unit-test, so the feature flag is better covered at integration level. Do not
create `src/config.test.ts`; test the flag via the integration tests in Task 3.
**Step 2: Modify `src/config.ts`**
Current code has a single `ConfigSchema` that requires all Gitea fields. Split
into base + conditional:
```typescript
import { z } from "zod";

const BaseConfigSchema = z.object({
  netbirdApiUrl: z.string().url(),
  netbirdApiToken: z.string().min(1),
  reconcilerToken: z.string().min(1),
  // Env vars are strings, and z.coerce.boolean() would turn "false" into true
  // (any non-empty string is truthy), so parse the flag explicitly.
  giteaEnabled: z
    .string()
    .default("true")
    .transform((v) => v.trim().toLowerCase())
    .refine((v) => v === "true" || v === "false", {
      message: 'GITEA_ENABLED must be "true" or "false"',
    })
    .transform((v) => v === "true"),
  pollIntervalSeconds: z.coerce.number().int().positive().default(30),
  port: z.coerce.number().int().positive().default(8080),
  dataDir: z.string().default("/data"),
});

const GiteaConfigSchema = z.object({
  giteaUrl: z.string().url(),
  giteaToken: z.string().min(1),
  giteaRepo: z.string().regex(/^[^/]+\/[^/]+$/),
});

// When giteaEnabled=true, Gitea fields are present.
// When false, they are undefined.
export type Config = z.infer<typeof BaseConfigSchema> & {
  giteaUrl?: string;
  giteaToken?: string;
  giteaRepo?: string;
};

export function loadConfig(): Config {
  const raw = {
    netbirdApiUrl: Deno.env.get("NETBIRD_API_URL"),
    netbirdApiToken: Deno.env.get("NETBIRD_API_TOKEN"),
    reconcilerToken: Deno.env.get("RECONCILER_TOKEN"),
    giteaEnabled: Deno.env.get("GITEA_ENABLED"),
    pollIntervalSeconds: Deno.env.get("POLL_INTERVAL_SECONDS"),
    port: Deno.env.get("PORT"),
    dataDir: Deno.env.get("DATA_DIR"),
  };
  const base = BaseConfigSchema.parse(raw);
  if (base.giteaEnabled) {
    // Gitea variables are only read and validated when the flag is on.
    const giteaRaw = {
      giteaUrl: Deno.env.get("GITEA_URL"),
      giteaToken: Deno.env.get("GITEA_TOKEN"),
      giteaRepo: Deno.env.get("GITEA_REPO"),
    };
    const gitea = GiteaConfigSchema.parse(giteaRaw);
    return { ...base, ...gitea };
  }
  return base;
}
```
Key behavior: `GITEA_ENABLED` defaults to `true` for backward compatibility.
When it is `false`, `GITEA_URL`, `GITEA_TOKEN`, and `GITEA_REPO` are neither
required nor validated. The flag is parsed as a case-insensitive
`"true"`/`"false"` string rather than with `z.coerce.boolean()`, which applies
JavaScript truthiness and would turn the string `"false"` into `true`.
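Boolean environment flags are easy to get wrong because every environment
value is a string, and any non-empty string is truthy in JavaScript. The
following is a minimal, dependency-free sketch of strict flag parsing;
`parseEnvBoolean` is a hypothetical helper used only for illustration and is
not part of the reconciler codebase.

```typescript
// Hypothetical helper: strict, case-insensitive parsing of a boolean env value.
function parseEnvBoolean(
  raw: string | undefined,
  defaultValue: boolean,
): boolean {
  if (raw === undefined) return defaultValue;
  const normalized = raw.trim().toLowerCase();
  if (normalized === "true") return true;
  if (normalized === "false") return false;
  throw new Error(`expected "true" or "false", got ${JSON.stringify(raw)}`);
}

// Truthiness-based coercion gets the common case wrong:
console.log(Boolean("false")); // true, because any non-empty string is truthy
console.log(parseEnvBoolean("false", true)); // false
console.log(parseEnvBoolean("False", true)); // false (Jinja renders booleans capitalized)
console.log(parseEnvBoolean(undefined, true)); // true (falls back to the default)
```

Rejecting unknown values such as `"yes"` or `"0"` is a design choice in this
sketch: a typo in the flag then fails at startup instead of silently enabling
or disabling Gitea.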
**Step 3: Run type checker**
Run: `deno task check` Expected: Type errors in `main.ts` and `server.ts`
because `Config` fields are now optional. Fix in next tasks.
**Step 4: Commit**
```
feat: make Gitea env vars conditional via GITEA_ENABLED flag
```
---
## Task 2: `GITEA_ENABLED` — Wire Into Main and Server
Update `main.ts`, `server.ts`, and `poller/loop.ts` to handle
`giteaEnabled=false`.
**Files:**
- Modify: `src/main.ts`
- Modify: `src/server.ts`
- Modify: `src/poller/loop.ts`
**Step 1: Update `src/main.ts`**
Conditionally create Gitea client and poller:
```typescript
import { ZodError } from "zod";
import { loadConfig } from "./config.ts";
import { NetbirdClient } from "./netbird/client.ts";
import { GiteaClient } from "./gitea/client.ts";
import { createHandler } from "./server.ts";
import { startPollerLoop } from "./poller/loop.ts";
let config;
try {
config = loadConfig();
} catch (err) {
if (err instanceof ZodError) {
console.error(
JSON.stringify({ msg: "invalid config", issues: err.issues }),
);
Deno.exit(1);
}
throw err;
}
const netbird = new NetbirdClient(config.netbirdApiUrl, config.netbirdApiToken);
const gitea = config.giteaEnabled
? new GiteaClient(config.giteaUrl!, config.giteaToken!, config.giteaRepo!)
: null;
const reconcileInProgress = { value: false };
// Start background poller
const pollerAbort = startPollerLoop({
config,
netbird,
gitea,
reconcileInProgress,
});
// Start HTTP server
const handler = createHandler({ config, netbird, gitea, reconcileInProgress });
console.log(JSON.stringify({
msg: "starting",
port: config.port,
gitea_enabled: config.giteaEnabled,
}));
Deno.serve({ port: config.port, handler });
// Graceful shutdown
Deno.addSignalListener("SIGTERM", () => {
console.log(JSON.stringify({ msg: "shutting_down" }));
pollerAbort.abort();
Deno.exit(0);
});
```
**Step 2: Update `ServerContext` in `src/server.ts`**
Change `gitea` field to nullable:
```typescript
export interface ServerContext {
config: Config;
netbird: NetbirdClient;
gitea: GiteaClient | null;
reconcileInProgress: { value: boolean };
}
```
The `handleReconcile` function does NOT use `gitea` — it receives desired state
from the request body. No changes needed there.
`handleSyncEvents` uses `gitea` via `pollOnce`. It needs no guard of its own;
pass the nullable client through and let `pollOnce` handle the null case:
```typescript
async function handleSyncEvents(ctx: ServerContext): Promise<Response> {
  // sync-events stays available in standalone mode: pollOnce() skips the
  // Gitea commit-back when ctx.gitea is null.
  const pollerCtx: PollerContext = {
    config: ctx.config,
    netbird: ctx.netbird,
    gitea: ctx.gitea,
    reconcileInProgress: { value: false },
  };
  // ... rest unchanged
}
```
**Step 3: Update `PollerContext` in `src/poller/loop.ts`**
Make `gitea` nullable. Guard all Gitea calls:
```typescript
export interface PollerContext {
config: Config;
netbird: NetbirdClient;
gitea: GiteaClient | null;
reconcileInProgress: { value: boolean };
}
```
In `pollOnce()`, the poller still runs when `gitea` is null, but it only
detects enrollment events and renames peers. It does not read desired state
from the repo and does not commit back.
To detect enrollments, the poller needs to know which setup keys exist. Without
Gitea it cannot read `netbird.json` from the repo, which leaves two options:
1. Read from a local file at `{dataDir}/netbird.json` (and skip polling if no
local file exists).
2. Fetch setup keys directly from the NetBird API (they are already there after
a reconcile).
Option 2 is cleaner, because the reconciler has just created these keys in
NetBird:
```typescript
export async function pollOnce(ctx: PollerContext): Promise<void> {
  const { config, netbird, gitea, reconcileInProgress } = ctx;
  if (reconcileInProgress.value) {
    console.log(
      JSON.stringify({ msg: "poll_skipped", reason: "reconcile_in_progress" }),
    );
    return;
  }
  const pollerState = await loadPollerState(config.dataDir);

  // Determine which setup key names to watch for enrollment events
  const unenrolledKeys = new Set<string>();
  let desired: DesiredState | null = null;
  let fileSha: string | null = null;
  if (gitea) {
    // Full Gitea mode: read netbird.json from repo
    const file = await gitea.getFileContent("netbird.json", "main");
    desired = DesiredStateSchema.parse(JSON.parse(file.content));
    fileSha = file.sha;
    for (
      const [name, key] of Object.entries(desired.setup_keys) as [
        string,
        SetupKeyConfig,
      ][]
    ) {
      if (!key.enrolled) unenrolledKeys.add(name);
    }
  } else {
    // Standalone mode: get setup key names from the NetBird API directly.
    // Do not filter on used_times < usage_limit here: a one-off key is
    // already used up by the time its enrollment event is polled, so that
    // filter would hide the enrollment. lastEventTimestamp keeps events that
    // were already handled from being processed again.
    const keys = await netbird.listSetupKeys();
    for (const key of keys) {
      if (!key.revoked) unenrolledKeys.add(key.name);
    }
  }
  if (unenrolledKeys.size === 0) {
    console.log(JSON.stringify({ msg: "poll_no_unenrolled_keys" }));
    return;
  }

  const events = await netbird.listEvents();
  const enrollments = processEnrollmentEvents(
    events,
    unenrolledKeys,
    pollerState.lastEventTimestamp,
  );
  if (enrollments.length === 0) return;
  console.log(JSON.stringify({
    msg: "poll_enrollments_detected",
    count: enrollments.length,
  }));

  let latestTimestamp = pollerState.lastEventTimestamp;
  for (const enrollment of enrollments) {
    if (gitea && desired && fileSha) {
      // Full mode: rename peer + commit enrolled:true
      await processEnrollment(
        ctx,
        enrollment,
        desired,
        fileSha,
        (newSha, newDesired) => {
          fileSha = newSha;
          desired = newDesired;
        },
      );
    } else {
      // Standalone mode: rename peer only, log enrollment
      await processEnrollmentStandalone(netbird, enrollment);
    }
    if (!latestTimestamp || enrollment.timestamp > latestTimestamp) {
      latestTimestamp = enrollment.timestamp;
    }
  }
  await savePollerState(config.dataDir, {
    lastEventTimestamp: latestTimestamp,
  });
}
```
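A point to check in standalone mode: by the time the poller runs, a one-off key
that a peer has just used already reports `used_times >= usage_limit`. A filter
based on remaining uses would therefore drop exactly the key whose enrollment
event the poller is looking for. Below is a minimal sketch of a filter that
keeps every non-revoked key and leaves deduplication to `lastEventTimestamp`.
`SetupKeyLike` and `watchableKeyNames` are hypothetical names, and the field
names are assumed to match the setup key objects used in this plan.

```typescript
// Hypothetical minimal shape; the real setup key type has more fields.
interface SetupKeyLike {
  name: string;
  revoked: boolean;
  used_times: number;
  usage_limit: number;
}

// Returns the names of all keys whose enrollment events should be considered.
// Used-up keys stay in the set on purpose; only revoked keys are dropped.
function watchableKeyNames(keys: SetupKeyLike[]): Set<string> {
  const names = new Set<string>();
  for (const key of keys) {
    if (!key.revoked) names.add(key.name);
  }
  return names;
}

const sampleKeys: SetupKeyLike[] = [
  // Just consumed by an enrolling peer: must still be watched.
  { name: "Pilot-TestHawk-1", revoked: false, used_times: 1, usage_limit: 1 },
  // Not used yet.
  { name: "GS-TestHawk-1", revoked: false, used_times: 0, usage_limit: 1 },
  // Revoked: ignored.
  { name: "Old-Key", revoked: true, used_times: 0, usage_limit: 1 },
];
console.log(Array.from(watchableKeyNames(sampleKeys)).sort());
```

Keeping the selection in a pure function like this also makes it testable
without a NetBird instance.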
Add a new standalone enrollment processor (same file, after
`processEnrollment`):
```typescript
async function processEnrollmentStandalone(
netbird: NetbirdClient,
enrollment: EnrollmentDetection,
): Promise<void> {
const { setupKeyName, peerId, peerHostname } = enrollment;
try {
await netbird.updatePeer(peerId, { name: setupKeyName });
console.log(JSON.stringify({
msg: "peer_renamed",
peer_id: peerId,
from: peerHostname,
to: setupKeyName,
mode: "standalone",
}));
} catch (err) {
console.error(JSON.stringify({
msg: "peer_rename_failed",
peer_id: peerId,
error: err instanceof Error ? err.message : String(err),
}));
}
console.log(JSON.stringify({
msg: "enrollment_detected",
setup_key: setupKeyName,
peer_id: peerId,
mode: "standalone",
note: "gitea commit-back skipped (GITEA_ENABLED=false)",
}));
}
```
**Step 4: Run type checker and tests**
Run: `deno task check && deno task test`
Expected: Check passes. Existing tests that build a `PollerContext` pass a
non-null Gitea mock, so they should keep passing without changes.
**Step 5: Commit**
```
feat: wire GITEA_ENABLED flag into main, server, and poller
When GITEA_ENABLED=false, the reconciler starts without a Gitea client.
The poller detects enrollments by reading setup keys directly from the
NetBird API and renames peers, but skips the commit-back of enrolled:true.
```
---
## Task 3: Update Integration Tests for GITEA_ENABLED
Ensure existing tests pass and add a test for standalone (no-Gitea) mode.
**Files:**
- Modify: `src/integration.test.ts`
**Step 1: Read the existing integration test**
File: `src/integration.test.ts` — understand the current test setup and how it
creates the server handler.
**Step 2: Update tests**
The existing tests create a `ServerContext` with a mock `gitea`. They should
continue working since `gitea` is now `GiteaClient | null` and the mocks are
non-null.
Add one new test:
```typescript
Deno.test("reconcile works with gitea=null (standalone mode)", async () => {
const ctx: ServerContext = {
config: { ...baseConfig, giteaEnabled: false },
netbird: mockNetbird,
gitea: null,
reconcileInProgress: { value: false },
};
const handler = createHandler(ctx);
const resp = await handler(
new Request("http://localhost/reconcile?dry_run=true", {
method: "POST",
headers: { "Authorization": `Bearer ${baseConfig.reconcilerToken}` },
body: JSON.stringify({
groups: { "test-group": { peers: [] } },
setup_keys: {},
}),
}),
);
assertEquals(resp.status, 200);
const body = await resp.json();
assertEquals(body.status, "planned");
});
```
Exact mock setup depends on the existing test file — read it first to match
patterns.
**Step 3: Run tests**
Run: `deno task test` Expected: All tests pass, including the new one.
**Step 4: Commit**
```
test: add integration test for standalone (no-Gitea) reconcile mode
```
---
## Task 4: State Export — `src/export.ts`
New module that transforms `ActualState` into a valid `netbird.json`.
**Files:**
- Create: `src/export.ts`
- Create: `src/export.test.ts`
**Step 1: Write the export test**
```typescript
import { assertEquals } from "@std/assert";
import { exportState } from "./export.ts";
import type { ActualState } from "./state/actual.ts";
Deno.test("exportState produces valid DesiredState from actual", () => {
const actual: ActualState = {
groups: [
{
id: "g1",
name: "pilots",
peers_count: 1,
peers: [{ id: "p1", name: "Pilot-Hawk-1" }],
issued: "api",
},
{
id: "g2",
name: "ground-stations",
peers_count: 0,
peers: [],
issued: "api",
},
{
id: "g-all",
name: "All",
peers_count: 1,
peers: [{ id: "p1", name: "Pilot-Hawk-1" }],
issued: "integration",
},
],
groupsByName: new Map(),
groupsById: new Map(),
setupKeys: [
{
id: 1,
name: "Pilot-Hawk-1",
type: "one-off",
key: "secret",
expires: "2026-03-13T00:00:00Z",
valid: true,
revoked: false,
used_times: 1,
state: "overused",
auto_groups: ["g1"],
usage_limit: 1,
},
{
id: 2,
name: "GS-Hawk-1",
type: "one-off",
key: "secret2",
expires: "2026-03-13T00:00:00Z",
valid: true,
revoked: false,
used_times: 0,
state: "valid",
auto_groups: ["g2"],
usage_limit: 1,
},
],
setupKeysByName: new Map(),
peers: [{
id: "p1",
name: "Pilot-Hawk-1",
ip: "100.64.0.1",
connected: true,
hostname: "laptop",
os: "linux",
version: "0.35.0",
groups: [{ id: "g1", name: "pilots" }],
last_seen: "2026-03-06T10:00:00Z",
dns_label: "pilot-hawk-1",
login_expiration_enabled: false,
ssh_enabled: false,
inactivity_expiration_enabled: false,
}],
peersByName: new Map(),
peersById: new Map(),
policies: [
{
id: "pol1",
name: "pilots-to-gs",
description: "Allow pilots to reach GS",
enabled: true,
rules: [{
name: "pilots-to-gs",
description: "",
enabled: true,
action: "accept",
bidirectional: true,
protocol: "all",
sources: [{ id: "g1", name: "pilots" }],
destinations: [{ id: "g2", name: "ground-stations" }],
}],
},
],
policiesByName: new Map(),
routes: [],
routesByNetworkId: new Map(),
dns: [],
dnsByName: new Map(),
};
const exported = exportState(actual);
// Groups: "All" and system groups should be excluded
assertEquals(Object.keys(exported.groups).sort(), [
"ground-stations",
"pilots",
]);
assertEquals(exported.groups["pilots"].peers, ["Pilot-Hawk-1"]);
assertEquals(exported.groups["ground-stations"].peers, []);
// Setup keys
assertEquals(Object.keys(exported.setup_keys).sort(), [
"GS-Hawk-1",
"Pilot-Hawk-1",
]);
assertEquals(exported.setup_keys["Pilot-Hawk-1"].enrolled, true); // used_times >= usage_limit
assertEquals(exported.setup_keys["GS-Hawk-1"].enrolled, false); // not yet used
// Policies
assertEquals(Object.keys(exported.policies), ["pilots-to-gs"]);
assertEquals(exported.policies["pilots-to-gs"].sources, ["pilots"]);
assertEquals(exported.policies["pilots-to-gs"].destinations, [
"ground-stations",
]);
assertEquals(exported.policies["pilots-to-gs"].bidirectional, true);
// Routes and DNS should be empty
assertEquals(exported.routes, {});
assertEquals(exported.dns.nameserver_groups, {});
});
Deno.test("exportState handles empty state", () => {
const actual: ActualState = {
groups: [{
id: "g-all",
name: "All",
peers_count: 0,
peers: [],
issued: "integration",
}],
groupsByName: new Map(),
groupsById: new Map(),
setupKeys: [],
setupKeysByName: new Map(),
peers: [],
peersByName: new Map(),
peersById: new Map(),
policies: [],
policiesByName: new Map(),
routes: [],
routesByNetworkId: new Map(),
dns: [],
dnsByName: new Map(),
};
const exported = exportState(actual);
assertEquals(exported.groups, {});
assertEquals(exported.setup_keys, {});
assertEquals(exported.policies, {});
assertEquals(exported.routes, {});
});
Deno.test("exportState maps auto_groups IDs to group names", () => {
const actual: ActualState = {
groups: [
{ id: "g1", name: "pilots", peers_count: 0, peers: [], issued: "api" },
],
groupsByName: new Map([["pilots", {
id: "g1",
name: "pilots",
peers_count: 0,
peers: [],
issued: "api" as const,
}]]),
groupsById: new Map([["g1", {
id: "g1",
name: "pilots",
peers_count: 0,
peers: [],
issued: "api" as const,
}]]),
setupKeys: [
{
id: 1,
name: "Test-Key",
type: "one-off" as const,
key: "k",
expires: "",
valid: true,
revoked: false,
used_times: 0,
state: "valid" as const,
auto_groups: ["g1"],
usage_limit: 1,
},
],
setupKeysByName: new Map(),
peers: [],
peersByName: new Map(),
peersById: new Map(),
policies: [],
policiesByName: new Map(),
routes: [],
routesByNetworkId: new Map(),
dns: [],
dnsByName: new Map(),
};
const exported = exportState(actual);
// auto_groups should contain group names, not IDs
assertEquals(exported.setup_keys["Test-Key"].auto_groups, ["pilots"]);
});
```
**Step 2: Run test to verify it fails**
Run: `deno test src/export.test.ts` Expected: FAIL — `exportState` not found.
**Step 3: Implement `src/export.ts`**
```typescript
import type { ActualState } from "./state/actual.ts";
import type { DesiredState } from "./state/schema.ts";
/** Groups that are auto-created by NetBird and should not be exported. */
const SYSTEM_GROUP_NAMES = new Set(["All"]);
const SYSTEM_GROUP_ISSUERS = new Set(["integration", "jwt"]);
/**
* Transforms live ActualState (fetched from NetBird API) into a valid
* DesiredState object (netbird.json format). Maps all IDs to names.
*
* Skips system-generated groups (All, JWT-issued, integration-issued).
* Setup keys with used_times >= usage_limit are marked enrolled:true.
*/
export function exportState(actual: ActualState): DesiredState {
// Build ID->name lookup for groups (needed for auto_groups, policies, routes, DNS)
const groupIdToName = new Map<string, string>();
for (const g of actual.groups) {
groupIdToName.set(g.id, g.name);
}
// Build setup key name set for peer->key name mapping in groups
const setupKeyNames = new Set(actual.setupKeys.map((k) => k.name));
// --- Groups ---
const groups: DesiredState["groups"] = {};
for (const g of actual.groups) {
if (SYSTEM_GROUP_NAMES.has(g.name) || SYSTEM_GROUP_ISSUERS.has(g.issued)) {
continue;
}
// Map peer names — only include peers whose name matches a setup key
// (the reconciler convention is that peer names equal setup key names)
const peers = g.peers
.map((p) => p.name)
.filter((name) => setupKeyNames.has(name));
groups[g.name] = { peers };
}
// --- Setup Keys ---
const setup_keys: DesiredState["setup_keys"] = {};
for (const k of actual.setupKeys) {
// Map auto_groups from IDs to names
const autoGroupNames = k.auto_groups
.map((id) => groupIdToName.get(id))
.filter((name): name is string => name !== undefined);
setup_keys[k.name] = {
type: k.type,
expires_in: 604800, // Default 7 days — NetBird API doesn't return the original expires_in
usage_limit: k.usage_limit,
auto_groups: autoGroupNames,
enrolled: k.used_times >= k.usage_limit && k.usage_limit > 0,
};
}
// --- Policies ---
const policies: DesiredState["policies"] = {};
for (const p of actual.policies) {
if (p.rules.length === 0) continue;
const rule = p.rules[0]; // Reconciler creates single-rule policies
const sources = rule.sources.map((s) =>
typeof s === "string" ? (groupIdToName.get(s) ?? s) : s.name
);
const destinations = rule.destinations.map((d) =>
typeof d === "string" ? (groupIdToName.get(d) ?? d) : d.name
);
policies[p.name] = {
description: p.description,
enabled: p.enabled,
sources,
destinations,
bidirectional: rule.bidirectional,
protocol: rule.protocol,
action: rule.action,
...(rule.ports && rule.ports.length > 0 ? { ports: rule.ports } : {}),
};
}
// --- Routes ---
const routes: DesiredState["routes"] = {};
for (const r of actual.routes) {
const peerGroups = (r.peer_groups ?? [])
.map((id) => groupIdToName.get(id))
.filter((name): name is string => name !== undefined);
const distributionGroups = r.groups
.map((id) => groupIdToName.get(id))
.filter((name): name is string => name !== undefined);
routes[r.network_id] = {
description: r.description,
...(r.network ? { network: r.network } : {}),
...(r.domains && r.domains.length > 0 ? { domains: r.domains } : {}),
peer_groups: peerGroups,
metric: r.metric,
masquerade: r.masquerade,
distribution_groups: distributionGroups,
enabled: r.enabled,
keep_route: r.keep_route,
};
}
// --- DNS ---
const nameserver_groups: DesiredState["dns"]["nameserver_groups"] = {};
for (const ns of actual.dns) {
const nsGroups = ns.groups
.map((id) => groupIdToName.get(id))
.filter((name): name is string => name !== undefined);
nameserver_groups[ns.name] = {
description: ns.description,
nameservers: ns.nameservers.map((n) => ({
ip: n.ip,
ns_type: n.ns_type,
port: n.port,
})),
enabled: ns.enabled,
groups: nsGroups,
primary: ns.primary,
domains: ns.domains,
search_domains_enabled: ns.search_domains_enabled,
};
}
return {
groups,
setup_keys,
policies,
routes,
dns: { nameserver_groups },
};
}
```
**Step 4: Run tests**
Run: `deno test src/export.test.ts` Expected: All 3 tests pass.
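For the state round-trip check named in the goal (seed state in, exported
state out), a plain `JSON.stringify` comparison is sensitive to key order. The
following is a minimal sketch of an order-insensitive comparison. `canonicalize`
and `sameState` are hypothetical helpers, not part of the reconciler, and fields
that the export cannot recover (such as the original `expires_in`) would still
need to be normalized before comparing.

```typescript
type Json =
  | null
  | boolean
  | number
  | string
  | Json[]
  | { [key: string]: Json };

// Recursively sorts object keys so that equal content serializes identically.
// Array order is preserved, so lists are still compared position by position.
function canonicalize(value: Json): Json {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const sorted: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

function sameState(a: Json, b: Json): boolean {
  return JSON.stringify(canonicalize(a)) === JSON.stringify(canonicalize(b));
}

const seedState: Json = { groups: { pilots: { peers: [] } }, setup_keys: {} };
const exportedState: Json = { setup_keys: {}, groups: { pilots: { peers: [] } } };
console.log(sameState(seedState, exportedState)); // true: same content, different key order
```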
**Step 5: Commit**
```
feat: add state export module (ActualState -> netbird.json)
```
---
## Task 5: State Export — HTTP Endpoint and CLI Flag
Wire the export function into the server (`GET /export`) and add `--export` CLI
mode.
**Files:**
- Modify: `src/server.ts`
- Modify: `src/main.ts`
- Modify: `src/integration.test.ts`
**Step 1: Add `GET /export` to `src/server.ts`**
In `createHandler()`, add before the 404 fallback:
```typescript
if (url.pathname === "/export" && req.method === "GET") {
return handleExport(ctx);
}
```
Add the handler function:
```typescript
async function handleExport(ctx: ServerContext): Promise<Response> {
try {
const actual = await fetchActualState(ctx.netbird);
const state = exportState(actual);
return Response.json({
status: "ok",
state,
meta: {
exported_at: new Date().toISOString(),
source_url: ctx.config.netbirdApiUrl,
groups_count: Object.keys(state.groups).length,
setup_keys_count: Object.keys(state.setup_keys).length,
policies_count: Object.keys(state.policies).length,
routes_count: Object.keys(state.routes).length,
dns_count: Object.keys(state.dns.nameserver_groups).length,
},
});
} catch (err) {
console.error(JSON.stringify({
msg: "export_error",
error: err instanceof Error ? err.message : String(err),
}));
return Response.json(
{
status: "error",
error: err instanceof Error ? err.message : String(err),
},
{ status: 500 },
);
}
}
```
Add import at top of `server.ts`:
```typescript
import { exportState } from "./export.ts";
```
The `/export` endpoint must require bearer auth like the other endpoints,
because it exposes internal state. Make sure the new route sits behind the
existing token check rather than in front of it.
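How the existing handler checks the token is not shown in this plan. As an
illustration only, a bearer check can be written as a small pure function;
`isAuthorized` is a hypothetical name, and the real `createHandler()` may
already do the equivalent.

```typescript
// Hypothetical helper: returns true only for the header "Bearer <expectedToken>".
function isAuthorized(
  authorizationHeader: string | null,
  expectedToken: string,
): boolean {
  if (authorizationHeader === null) return false;
  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) return false;
  const presented = authorizationHeader.slice(prefix.length);
  if (presented.length !== expectedToken.length) return false;
  // Compare every character instead of returning at the first mismatch, so
  // the time taken does not reveal how many leading characters were correct.
  let diff = 0;
  for (let i = 0; i < presented.length; i++) {
    diff |= presented.charCodeAt(i) ^ expectedToken.charCodeAt(i);
  }
  return diff === 0;
}

console.log(isAuthorized("Bearer s3cret", "s3cret")); // true
console.log(isAuthorized("Bearer wrong!", "s3cret")); // false
console.log(isAuthorized(null, "s3cret")); // false
```

In a handler, a call along the lines of
`isAuthorized(req.headers.get("Authorization"), ctx.config.reconcilerToken)`
would run before any route, including `/export`, is dispatched.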
**Step 2: Add `--export` CLI mode to `src/main.ts`**
Before the config loading block, check for CLI args:
```typescript
// CLI mode: --export dumps state to stdout and exits
if (Deno.args.includes("--export")) {
const apiUrl = getCliArg("--netbird-api-url") ??
Deno.env.get("NETBIRD_API_URL");
const apiToken = getCliArg("--netbird-api-token") ??
Deno.env.get("NETBIRD_API_TOKEN");
if (!apiUrl || !apiToken) {
console.error(
"Usage: --export --netbird-api-url <url> --netbird-api-token <token>",
);
console.error("Or set NETBIRD_API_URL and NETBIRD_API_TOKEN env vars.");
Deno.exit(1);
}
const client = new NetbirdClient(apiUrl, apiToken);
const actual = await fetchActualState(client);
const state = exportState(actual);
console.log(JSON.stringify(state, null, 2));
Deno.exit(0);
}
function getCliArg(name: string): string | undefined {
const idx = Deno.args.indexOf(name);
if (idx === -1 || idx + 1 >= Deno.args.length) return undefined;
return Deno.args[idx + 1];
}
```
Add imports to `main.ts`:
```typescript
import { fetchActualState } from "./state/actual.ts";
import { exportState } from "./export.ts";
```
This block must come BEFORE the `loadConfig()` call, since `--export` mode
doesn't need the full config.
**Step 3: Add integration test for `/export`**
```typescript
Deno.test("GET /export returns exported state", async () => {
// ... create handler with mock netbird that returns test data
const resp = await handler(
new Request("http://localhost/export", {
method: "GET",
headers: { "Authorization": `Bearer ${baseConfig.reconcilerToken}` },
}),
);
assertEquals(resp.status, 200);
const body = await resp.json();
assertEquals(body.status, "ok");
assertEquals(typeof body.state, "object");
assertEquals(typeof body.meta.exported_at, "string");
});
```
**Step 4: Run tests**
Run: `deno task check && deno task test` Expected: All pass.
**Step 5: Commit**
```
feat: add GET /export endpoint and --export CLI mode for state export
```
---
## Task 6: Update `.env.example` and Dockerfile
Add `GITEA_ENABLED` to the env example and add a `deno.json` task for the
export mode. The Dockerfile needs no change.
**Files:**
- Modify: `deploy/.env.example`
- Modify: `deno.json`
- `Dockerfile`: no change needed (it already passes env vars)
**Step 1: Update `deploy/.env.example`**
Add after `NETBIRD_API_TOKEN`:
```
# Set to false to run without Gitea integration (standalone mode)
GITEA_ENABLED=true
```
**Step 2: Add a `deno.json` task for export**
Add to `deno.json` tasks:
```json
"export": "deno run --allow-net --allow-env src/main.ts --export"
```
**Step 3: Commit**
```
chore: add GITEA_ENABLED to .env.example, add export task to deno.json
```
---
## Task 7: PoC Ansible — Inventory and Variables
Create the Ansible structure for deploying the full stack to VPS-A.
**Files:**
- Create: `poc/ansible/inventory.yml`
- Create: `poc/ansible/group_vars/all/vars.yml`
- Create: `poc/ansible/group_vars/all/vault.yml.example`
- Create: `poc/ansible/.gitignore`
**Step 1: Create inventory**
```yaml
---
all:
  children:
    poc_servers:
      hosts:
        netbird-poc-a:
          ansible_host: 46.225.220.61
          ansible_ssh_private_key_file: ~/.ssh/id_ed25519
          ansible_user: root
```
**Step 2: Create `vars.yml`**
```yaml
---
# PoC Reconciler Validation — Non-secret configuration
# Domain
netbird_domain: vps-a.networkmonitor.cc
# Versions
netbird_version: "0.63.0"
dashboard_version: "v2.27.1"
caddy_version: "2.10.2"
coturn_version: "4.8.0-r0"
# Reconciler
reconciler_image: "ghcr.io/blastpilot/netbird-reconciler:latest"
reconciler_port: 8080
gitea_enabled: false # Standalone mode for initial testing
# Gitea
gitea_version: "1.23"
gitea_http_port: 3000
gitea_ssh_port: 2222
gitea_admin_user: "blastpilot"
gitea_org_name: "BlastPilot"
gitea_repo_name: "netbird-gitops"
# Directories
base_dir: /opt/netbird-poc
```
**Step 3: Create `vault.yml.example`**
```yaml
---
# Copy to vault.yml and fill in values.
# This file is gitignored — do NOT commit real secrets.
# Auto-generated at deploy time (leave empty, playbook generates them):
vault_encryption_key: ""
vault_turn_password: ""
vault_relay_secret: ""
# Reconciler auth token (generate with: openssl rand -hex 32)
vault_reconciler_token: ""
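# Gitea admin password (for initial setup)
vault_gitea_admin_password: ""
# NetBird API token, created manually in the dashboard after the first deploy
# (referenced by reconciler.env.j2)
vault_netbird_api_token: ""
# Gitea API token for the reconciler (only needed when gitea_enabled is true)
vault_gitea_token: ""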
```
**Step 4: Create `.gitignore`**
```
group_vars/all/vault.yml
```
**Step 5: Commit**
```
feat: add PoC ansible inventory and variable files
```
---
## Task 8: PoC Ansible — Templates
Create all Docker Compose and config templates. Adapt from the existing PoC
templates in `PoC/netbird-routing-docs-poc/ansible/netbird/templates/`.
**Files:**
- Create: `poc/ansible/templates/docker-compose.yml.j2`
- Create: `poc/ansible/templates/management.json.j2`
- Create: `poc/ansible/templates/Caddyfile.j2`
- Create: `poc/ansible/templates/dashboard.env.j2`
- Create: `poc/ansible/templates/relay.env.j2`
- Create: `poc/ansible/templates/turnserver.conf.j2`
- Create: `poc/ansible/templates/reconciler.env.j2`
**Step 1: Create `docker-compose.yml.j2`**
Extend the existing PoC template with Gitea and Reconciler services. All on a
shared `netbird` network.
Services:
- `caddy` — reverse proxy (same as existing PoC)
- `dashboard` — NetBird dashboard
- `signal` — NetBird signal server
- `relay` — NetBird relay
- `management` — NetBird management API
- `coturn` — STUN/TURN
- `gitea` — Gitea (SQLite, minimal config)
- `reconciler` — our reconciler service
The reconciler has no released image yet. For the PoC, the playbook builds the
Docker image on VPS-A: clone the repo and run `docker build`. The existing
Dockerfile already works. Copying a compiled binary or mounting a pre-compiled
Deno artifact were considered, but building on the host is the simplest option.
**Step 2: Create all other templates**
Copy from existing PoC (`management.json.j2`, `Caddyfile.j2`,
`dashboard.env.j2`, `relay.env.j2`, `turnserver.conf.j2`) and adapt variable
names.
Add to `Caddyfile.j2` — route for reconciler:
```
# Reconciler API
handle_path /reconciler/* {
reverse_proxy reconciler:{{ reconciler_port }}
}
```
A Caddy route for Gitea would look like this:
```
# Gitea
handle_path /gitea/* {
    reverse_proxy gitea:{{ gitea_http_port }}
}
```
However, Gitea behind a subpath requires additional `ROOT_URL` configuration.
For the PoC, skip this route and expose Gitea directly on port 3000 instead of
going through Caddy.
Create `reconciler.env.j2`:
```
NETBIRD_API_URL=https://{{ netbird_domain }}/api
NETBIRD_API_TOKEN={{ vault_netbird_api_token }}
RECONCILER_TOKEN={{ vault_reconciler_token }}
GITEA_ENABLED={{ gitea_enabled | bool | lower }}
{% if gitea_enabled | bool %}
GITEA_URL=http://gitea:{{ gitea_http_port }}
GITEA_TOKEN={{ vault_gitea_token }}
GITEA_REPO={{ gitea_org_name }}/{{ gitea_repo_name }}
{% endif %}
POLL_INTERVAL_SECONDS=30
PORT={{ reconciler_port }}
DATA_DIR=/data
```
**Step 3: Commit**
```
feat: add PoC ansible templates for full stack deployment
```
---
## Task 9: PoC Ansible — Playbook
The main playbook that orchestrates deployment.
**Files:**
- Create: `poc/ansible/playbook.yml`
- Create: `poc/ansible/files/netbird-seed.json`
**Step 1: Create seed state file**
`poc/ansible/files/netbird-seed.json`:
```json
{
"groups": {
"ground-stations": { "peers": [] },
"pilots": { "peers": [] }
},
"setup_keys": {
"GS-TestHawk-1": {
"type": "one-off",
"expires_in": 604800,
"usage_limit": 1,
"auto_groups": ["ground-stations"],
"enrolled": false
},
"Pilot-TestHawk-1": {
"type": "one-off",
"expires_in": 604800,
"usage_limit": 1,
"auto_groups": ["pilots"],
"enrolled": false
}
},
"policies": {
"pilots-to-gs": {
"enabled": true,
"sources": ["pilots"],
"destinations": ["ground-stations"],
"bidirectional": true
}
},
"routes": {},
"dns": { "nameserver_groups": {} }
}
```
**Step 2: Create playbook**
`poc/ansible/playbook.yml` — Task groups:
1. **Generate secrets** — encryption key, TURN password, relay secret,
reconciler token (if not in vault.yml)
2. **Install Docker** — apt repo, docker-ce, docker-compose-plugin
3. **Create directories** — `{{ base_dir }}/{config,data,reconciler-data,gitea-data}`
4. **Template configs** — all `.j2` templates → `{{ base_dir }}/config/`
5. **Build reconciler image** — clone repo (or copy Dockerfile + src),
`docker build`
6. **Start stack** — `docker compose up -d`
7. **Wait for health** — poll management + reconciler health endpoints
8. **Create NetBird admin** — first login to embedded IdP creates admin
9. **Generate NetBird API token** — via management API (may need manual step)
10. **(If Gitea enabled)** — Create admin user, org, repo, seed `netbird.json`
11. **Print summary** — URLs, credentials, next steps
Note: generating a NetBird API token programmatically requires authenticating
via the embedded IdP OAuth2 flow, which is non-trivial to automate. For the PoC,
this will likely be a manual step:
1. Open dashboard, create admin account
2. Go to Settings → API Tokens → generate token
3. Put token in `vault.yml` as `vault_netbird_api_token`
4. Re-run playbook (or just restart reconciler container)
The playbook should handle this gracefully — start the reconciler last, after
the token is known.
**Step 3: Commit**
```
feat: add PoC ansible playbook and seed state file
```
---
## Task 10: PoC Ansible — README
Document how to use the PoC.
**Files:**
- Create: `poc/README.md`
**Step 1: Write README**
Cover:
- Prerequisites (SSH access to VPS-A, DNS records)
- Setup steps (copy vault.yml.example, fill secrets)
- Deploy command
- Post-deploy steps (create NetBird admin, generate API token)
- Testing steps (dry-run reconcile, apply, enrollment)
- Teardown
**Step 2: Commit**
```
docs: add PoC README with setup and testing instructions
```
---
## Task 11: Run All Tests and Type Check
Final verification before deployment.
**Files:** None (verification only)
**Step 1: Run full test suite**
Run: `deno task check && deno task test && deno task lint` Expected: All pass.
**Step 2: Run format check**
Run: `deno task fmt --check` Expected: No formatting issues (or fix them).
**Step 3: Commit any fixes**
```
chore: fix lint/format issues
```
---
## Summary of Tasks
| # | Task | Type | Est. |
| -- | ------------------------------------- | ----------- | ------ |
| 1 | `GITEA_ENABLED` feature flag — config | Code | 10 min |
| 2 | Wire flag into main, server, poller | Code | 20 min |
| 3 | Update integration tests | Test | 10 min |
| 4 | State export module | Code + Test | 20 min |
| 5 | Export endpoint + CLI flag | Code + Test | 15 min |
| 6 | Update .env.example and deno.json | Config | 5 min |
| 7 | PoC Ansible — inventory + vars | Ansible | 10 min |
| 8 | PoC Ansible — templates | Ansible | 30 min |
| 9 | PoC Ansible — playbook + seed | Ansible | 30 min |
| 10 | PoC README | Docs | 10 min |
| 11 | Final verification | Test | 5 min |
**Total estimated: ~2.75 hours (165 min)**
Tasks 1-6 are the reconciler code changes (feature flag + export). Tasks 7-10
are the Ansible deployment. Task 11 is final verification.
Tasks 1-6 must be done sequentially (each builds on the previous). Tasks 7-10
can be done in parallel with each other but after Tasks 1-6 (the Ansible
templates reference the feature flag and export endpoint).