netbird-gitops/docs/plans/2026-03-06-schema-expansion.md
2026-03-06 16:28:01 +02:00

Schema Expansion: Full NetBird State Coverage

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Expand the reconciler schema and export to cover all NetBird resource types: posture checks, networks (with resources and routers), peers, users, and resource-backed policies.

Architecture: Each new resource type follows the existing pattern: add NB types → add schema → add to ActualState → add client methods → add diff logic → add executor handlers → add export → add tests. Policies are extended to support destination_resource as an alternative to destinations. The "All" group gets hardcoded exclusion from deletion.

Tech Stack: Deno 2.x, TypeScript, Zod, injectable fetch for testing.


Task 1: Fix "All" group hardcoded exclusion + policy null-safety

Files:

  • Modify: src/reconcile/diff.ts:66-70 (add "All" name check)
  • Modify: src/reconcile/diff.ts:138-145 (null-safety for destinations)
  • Modify: src/reconcile/diff.test.ts (add test for "All" exclusion with issued: "api")

diffGroups already limits deletions to groups with issued === "api", but the built-in "All" group also reports issued: "api" in real environments, so it must be excluded explicitly by name. Also guard against null destinations in policy rules, which occur on resource-backed policies.

Changes to src/reconcile/diff.ts:

In diffGroups, line 67, change:

if (!desiredNames.has(group.name) && group.issued === "api") {

to:

if (!desiredNames.has(group.name) && group.issued === "api" && group.name !== "All") {

In diffPolicies, around line 143, wrap destinations extraction:

const actualDests = extractGroupNames(
  existing.rules.flatMap((r) => r.destinations ?? []),
  actual,
).sort();
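As an illustration of the null-safety fix, here is a minimal sketch of how `?? []` keeps `flatMap` from failing on resource-backed rules. `RuleLike` and `destinationNames` are hypothetical names used only for this example; the real code goes through `extractGroupNames` and the full `NbPolicyRule` type.

```typescript
// Hypothetical, trimmed-down rule shape; the real NbPolicyRule has more fields.
interface RuleLike {
  destinations?: Array<{ id: string; name: string }> | null;
}

/** Collects destination group names across rules, treating null/undefined as empty. */
function destinationNames(rules: RuleLike[]): string[] {
  return rules
    .flatMap((rule) => rule.destinations ?? []) // resource-backed rules contribute nothing
    .map((group) => group.name)
    .sort();
}
```

Without the `?? []`, a rule whose `destinations` is null would put a null into the flattened array, and the later `.name` access would throw.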

Add test: computeDiff does not delete "All" group even when issued is "api".
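The exclusion rule can be checked in isolation with a sketch like the following. `GroupLike` and `groupsToDelete` are hypothetical stand-ins for the relevant part of diffGroups, not the actual function signatures.

```typescript
// Hypothetical minimal group shape; only the fields the deletion rule reads.
interface GroupLike {
  name: string;
  issued: string;
}

/** Returns names of groups that would be deleted: API-issued, undesired, and not "All". */
function groupsToDelete(desiredNames: Set<string>, actual: GroupLike[]): string[] {
  return actual
    .filter((group) =>
      !desiredNames.has(group.name) &&
      group.issued === "api" &&
      group.name !== "All"
    )
    .map((group) => group.name);
}
```

A test built on this shape would pass an "All" group with issued: "api" that is absent from the desired set and assert that no delete operation is produced for it.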

Run: deno task test


Task 2: Add posture check and network types to src/netbird/types.ts

Files:

  • Modify: src/netbird/types.ts

Add these interfaces after the existing types:

/** Posture check as returned by GET /api/posture-checks */
export interface NbPostureCheck {
  id: string;
  name: string;
  description: string;
  checks: Record<string, unknown>;
}

/** Network as returned by GET /api/networks */
export interface NbNetwork {
  id: string;
  name: string;
  description: string;
  resources: string[];
  routers: string[];
  policies: string[];
  routing_peers_count: number;
}

/** Network resource as returned by GET /api/networks/{id}/resources */
export interface NbNetworkResource {
  id: string;
  name: string;
  description: string;
  type: "host" | "subnet" | "domain";
  address: string;
  enabled: boolean;
  groups: Array<
    { id: string; name: string; peers_count: number; resources_count: number }
  >;
}

/** Network router as returned by GET /api/networks/{id}/routers */
export interface NbNetworkRouter {
  id: string;
  peer: string | null;
  peer_groups: string[] | null;
  metric: number;
  masquerade: boolean;
  enabled: boolean;
}

/** User as returned by GET /api/users */
export interface NbUser {
  id: string;
  name: string;
  email: string;
  role: "owner" | "admin" | "user";
  status: "active" | "invited" | "blocked";
  auto_groups: string[];
  is_service_user: boolean;
}

Also add source_posture_checks to NbPolicy and destinationResource to NbPolicyRule:

export interface NbPolicy {
  id: string;
  name: string;
  description: string;
  enabled: boolean;
  rules: NbPolicyRule[];
  source_posture_checks: string[]; // posture check IDs
}

And add to NbPolicyRule:

export interface NbPolicyRule {
  // ... existing fields ...
  destinationResource?: { id: string; type: string } | null;
}

Run: deno task check


Task 3: Add client methods for new resource types

Files:

  • Modify: src/netbird/client.ts

Add sections for:

Posture Checks:

listPostureChecks(): Promise<NbPostureCheck[]>
createPostureCheck(data: Omit<NbPostureCheck, "id">): Promise<NbPostureCheck>
updatePostureCheck(id: string, data: Omit<NbPostureCheck, "id">): Promise<NbPostureCheck>
deletePostureCheck(id: string): Promise<void>

Networks:

listNetworks(): Promise<NbNetwork[]>
createNetwork(data: { name: string; description?: string }): Promise<NbNetwork>
updateNetwork(id: string, data: { name: string; description?: string }): Promise<NbNetwork>
deleteNetwork(id: string): Promise<void>

Network Resources (nested under network):

listNetworkResources(networkId: string): Promise<NbNetworkResource[]>
createNetworkResource(networkId: string, data: { name: string; description?: string; address: string; enabled: boolean; groups: string[] }): Promise<NbNetworkResource>
updateNetworkResource(networkId: string, resourceId: string, data: { name: string; description?: string; address: string; enabled: boolean; groups: string[] }): Promise<NbNetworkResource>
deleteNetworkResource(networkId: string, resourceId: string): Promise<void>

Network Routers:

listNetworkRouters(networkId: string): Promise<NbNetworkRouter[]>
createNetworkRouter(networkId: string, data: Omit<NbNetworkRouter, "id">): Promise<NbNetworkRouter>
updateNetworkRouter(networkId: string, routerId: string, data: Omit<NbNetworkRouter, "id">): Promise<NbNetworkRouter>
deleteNetworkRouter(networkId: string, routerId: string): Promise<void>

Users:

listUsers(): Promise<NbUser[]>
createUser(data: { email: string; name?: string; role: string; auto_groups: string[]; is_service_user: boolean }): Promise<NbUser>
updateUser(id: string, data: { name?: string; role?: string; auto_groups?: string[] }): Promise<NbUser>
deleteUser(id: string): Promise<void>
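The method lists above can follow the injectable-fetch pattern from the tech stack. The sketch below is one possible shape, not the existing client: `createSketchClient`, `FetchLike`, `ResponseLike`, and `RequestInitLike` are hypothetical names, and the `Authorization: Token ...` header is an assumption that should be checked against the existing client code.

```typescript
// Minimal structural types so the sketch does not depend on DOM or Deno typings.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body?: string;
}
type FetchLike = (url: string, init: RequestInitLike) => Promise<ResponseLike>;

interface PostureCheckLike {
  id: string;
  name: string;
  description: string;
  checks: Record<string, unknown>;
}

/** Builds a tiny client; baseUrl is expected to end in /api, as in the export commands. */
function createSketchClient(baseUrl: string, token: string, fetchFn: FetchLike) {
  async function request<T>(method: string, path: string, body?: unknown): Promise<T> {
    const init: RequestInitLike = {
      method,
      headers: {
        "Authorization": `Token ${token}`, // assumption: token-style auth header
        "Content-Type": "application/json",
      },
    };
    if (body !== undefined) init.body = JSON.stringify(body);
    const res = await fetchFn(`${baseUrl}${path}`, init);
    if (!res.ok) throw new Error(`${method} ${path} failed with status ${res.status}`);
    return (await res.json()) as T;
  }

  return {
    listPostureChecks: () => request<PostureCheckLike[]>("GET", "/posture-checks"),
    createPostureCheck: (data: Omit<PostureCheckLike, "id">) =>
      request<PostureCheckLike>("POST", "/posture-checks", data),
  };
}
```

In tests, `fetchFn` can be a plain async function that records the requested URL and returns canned JSON, so no network access is needed.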

Run: deno task check


Task 4: Expand ActualState with new resource collections

Files:

  • Modify: src/state/actual.ts

Add to ActualState interface:

postureChecks: NbPostureCheck[];
postureChecksByName: Map<string, NbPostureCheck>;
networks: NbNetwork[];
networksByName: Map<string, NbNetwork>;
networkResources: Map<string, NbNetworkResource[]>;  // networkId -> resources
networkRouters: Map<string, NbNetworkRouter[]>;       // networkId -> routers
users: NbUser[];
usersByEmail: Map<string, NbUser>;

Expand ClientLike to include:

| "listPostureChecks"
| "listNetworks"
| "listNetworkResources"
| "listNetworkRouters"
| "listUsers"

In fetchActualState: fetch posture checks, networks, users in the initial Promise.all. Then for each network, fetch its resources and routers in a second parallel batch.
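The two-phase fetch described above might look roughly like this. It is a sketch restricted to networks: `NetworkClientLike` and `fetchNetworkState` are hypothetical names, and in the real fetchActualState the first phase would share a Promise.all with the other top-level list calls.

```typescript
interface NetworkLike {
  id: string;
  name: string;
}

// Hypothetical slice of the client; R and T stand in for the resource and router types.
interface NetworkClientLike<R, T> {
  listNetworks(): Promise<NetworkLike[]>;
  listNetworkResources(networkId: string): Promise<R[]>;
  listNetworkRouters(networkId: string): Promise<T[]>;
}

async function fetchNetworkState<R, T>(client: NetworkClientLike<R, T>) {
  // Phase 1: top-level collection.
  const networks = await client.listNetworks();

  // Phase 2: resources and routers for every network, all requests in parallel.
  const perNetwork = await Promise.all(
    networks.map(async (network) => {
      const [resources, routers] = await Promise.all([
        client.listNetworkResources(network.id),
        client.listNetworkRouters(network.id),
      ]);
      return { id: network.id, resources, routers };
    }),
  );

  const networksByName = new Map<string, NetworkLike>();
  for (const network of networks) networksByName.set(network.name, network);

  const networkResources = new Map<string, R[]>(); // networkId -> resources
  const networkRouters = new Map<string, T[]>(); // networkId -> routers
  for (const entry of perNetwork) {
    networkResources.set(entry.id, entry.resources);
    networkRouters.set(entry.id, entry.routers);
  }
  return { networks, networksByName, networkResources, networkRouters };
}
```

The second phase cannot be folded into the first because the nested endpoints need network IDs that are only known after listNetworks returns.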

Run: deno task check


Task 5: Expand the Zod schema with new resource types

Files:

  • Modify: src/state/schema.ts

Add schemas:

export const PostureCheckSchema = z.object({
  description: z.string().default(""),
  checks: z.record(z.string(), z.unknown()),
});

export const NetworkResourceSchema = z.object({
  name: z.string(),
  description: z.string().default(""),
  type: z.enum(["host", "subnet", "domain"]),
  address: z.string(),
  enabled: z.boolean().default(true),
  groups: z.array(z.string()),
});

export const NetworkRouterSchema = z.object({
  peer: z.string().optional(),
  peer_groups: z.array(z.string()).optional(),
  metric: z.number().int().min(1).max(9999).default(9999),
  masquerade: z.boolean().default(true),
  enabled: z.boolean().default(true),
});

export const NetworkSchema = z.object({
  description: z.string().default(""),
  resources: z.array(NetworkResourceSchema).default([]),
  routers: z.array(NetworkRouterSchema).default([]),
});

export const PeerSchema = z.object({
  groups: z.array(z.string()),
  login_expiration_enabled: z.boolean().default(false),
  inactivity_expiration_enabled: z.boolean().default(false),
  ssh_enabled: z.boolean().default(false),
});

export const UserSchema = z.object({
  name: z.string(),
  role: z.enum(["owner", "admin", "user"]),
  auto_groups: z.array(z.string()).default([]),
});

Extend PolicySchema to support destination_resource:

export const DestinationResourceSchema = z.object({
  id: z.string(), // resource name, resolved at reconcile time
  type: z.string(),
});

export const PolicySchema = z.object({
  description: z.string().default(""),
  enabled: z.boolean(),
  sources: z.array(z.string()),
  destinations: z.array(z.string()).default([]),
  destination_resource: DestinationResourceSchema.optional(),
  bidirectional: z.boolean(),
  protocol: z.enum(["tcp", "udp", "icmp", "all"]).default("all"),
  action: z.enum(["accept", "drop"]).default("accept"),
  ports: z.array(z.string()).optional(),
  source_posture_checks: z.array(z.string()).default([]),
});

Add to DesiredStateSchema:

export const DesiredStateSchema = z.object({
  groups: z.record(z.string(), GroupSchema),
  setup_keys: z.record(z.string(), SetupKeySchema),
  policies: z.record(z.string(), PolicySchema).default({}),
  posture_checks: z.record(z.string(), PostureCheckSchema).default({}),
  networks: z.record(z.string(), NetworkSchema).default({}),
  peers: z.record(z.string(), PeerSchema).default({}),
  users: z.record(z.string(), UserSchema).default({}),
  routes: z.record(z.string(), RouteSchema).default({}),
  dns: z.object({
    nameserver_groups: z.record(z.string(), DnsNameserverGroupSchema).default(
      {},
    ),
  }).default({ nameserver_groups: {} }),
});

Update validateCrossReferences to also check:

  • Peer groups reference existing groups
  • User auto_groups reference existing groups
  • Network resource groups reference existing groups
  • Policy source_posture_checks reference existing posture checks
  • Policy destination_resource.id references an existing network resource name
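The checks listed above could be sketched as a single pass over the parsed state. `DesiredLike` and `findDanglingReferences` are hypothetical names; the shape mirrors only the fields the checks read, and the real validateCrossReferences may report errors differently.

```typescript
// Hypothetical subset of the desired state: only the fields the checks read.
interface DesiredLike {
  groups: Record<string, unknown>;
  posture_checks: Record<string, unknown>;
  networks: Record<string, { resources: Array<{ name: string; groups: string[] }> }>;
  peers: Record<string, { groups: string[] }>;
  users: Record<string, { auto_groups: string[] }>;
  policies: Record<string, {
    source_posture_checks: string[];
    destination_resource?: { id: string; type: string };
  }>;
}

/** Returns one message per reference that points at something not declared in the state. */
function findDanglingReferences(state: DesiredLike): string[] {
  const errors: string[] = [];
  const groupNames = new Set(Object.keys(state.groups));
  const checkNames = new Set(Object.keys(state.posture_checks));
  const resourceNames = new Set<string>();

  const requireGroups = (where: string, names: string[]) => {
    for (const name of names) {
      if (!groupNames.has(name)) errors.push(`${where}: unknown group "${name}"`);
    }
  };

  for (const [netName, net] of Object.entries(state.networks)) {
    for (const res of net.resources) {
      resourceNames.add(res.name);
      requireGroups(`network "${netName}" resource "${res.name}"`, res.groups);
    }
  }
  for (const [peerName, peer] of Object.entries(state.peers)) {
    requireGroups(`peer "${peerName}"`, peer.groups);
  }
  for (const [email, user] of Object.entries(state.users)) {
    requireGroups(`user "${email}"`, user.auto_groups);
  }
  for (const [policyName, policy] of Object.entries(state.policies)) {
    for (const check of policy.source_posture_checks) {
      if (!checkNames.has(check)) {
        errors.push(`policy "${policyName}": unknown posture check "${check}"`);
      }
    }
    const target = policy.destination_resource;
    if (target && !resourceNames.has(target.id)) {
      errors.push(`policy "${policyName}": unknown network resource "${target.id}"`);
    }
  }
  return errors;
}
```

Network resources are collected before policies are inspected, so a destination_resource can reference a resource from any network in the file.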

Run: deno task check


Task 6: Add operations for new resource types

Files:

  • Modify: src/reconcile/operations.ts

Add to OperationType:

| "create_posture_check" | "update_posture_check" | "delete_posture_check"
| "create_network" | "update_network" | "delete_network"
| "create_network_resource" | "update_network_resource" | "delete_network_resource"
| "create_network_router" | "update_network_router" | "delete_network_router"
| "create_user" | "update_user" | "delete_user"
| "update_peer"

Update EXECUTION_ORDER — networks must be created before resources/routers, posture checks before policies that reference them:

export const EXECUTION_ORDER: OperationType[] = [
  "create_posture_check",
  "update_posture_check",
  "create_group",
  "update_group",
  "create_setup_key",
  "rename_peer",
  "update_peer_groups",
  "update_peer",
  "create_network",
  "update_network",
  "create_network_resource",
  "update_network_resource",
  "create_network_router",
  "update_network_router",
  "create_user",
  "update_user",
  "create_policy",
  "update_policy",
  "create_route",
  "update_route",
  "create_dns",
  "update_dns",
  // Deletions in reverse dependency order
  "delete_dns",
  "delete_route",
  "delete_policy",
  "delete_user",
  "delete_network_router",
  "delete_network_resource",
  "delete_network",
  "delete_peer",
  "delete_setup_key",
  "delete_posture_check",
  "delete_group",
];
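To show how an order list like this is typically applied, here is a minimal sketch that sorts a batch of operations by their position in the list while keeping the original order within one type. `sortByExecutionOrder` and `sketchOrder` are hypothetical; how the existing reconciler consumes EXECUTION_ORDER is not shown in this plan.

```typescript
/** Sorts operations by the position of their type in `order`; ties keep input order. */
function sortByExecutionOrder<O extends { type: string }>(
  ops: O[],
  order: readonly string[],
): O[] {
  const rank = new Map<string, number>();
  order.forEach((type, index) => rank.set(type, index));
  const last = order.length; // unknown types sort to the end

  return ops
    .map((op, index) => ({ op, index }))
    .sort((a, b) => {
      const byType = (rank.get(a.op.type) ?? last) - (rank.get(b.op.type) ?? last);
      return byType !== 0 ? byType : a.index - b.index;
    })
    .map((entry) => entry.op);
}

// A shortened order list, in the same relative order as the plan's EXECUTION_ORDER.
const sketchOrder = [
  "create_group",
  "create_network",
  "create_network_resource",
  "create_policy",
  "delete_policy",
  "delete_network_resource",
  "delete_network",
  "delete_group",
];
```

With this ordering a network always exists before its resources are created, and a policy that references a resource is removed before the resource itself.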

Run: deno task check


Task 7: Add diff logic for new resource types

Files:

  • Modify: src/reconcile/diff.ts

Add diffPostureChecks, diffNetworks, diffPeers, diffUsers functions and call them from computeDiff.

Posture checks: Compare by name. Create if missing. Update if checks object or description changed (deep JSON compare). Delete if not in desired.
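A plain JSON.stringify comparison is sensitive to key order, so the "deep JSON compare" may need a key-sorted serialization. The sketch below shows one way to do that; `stableStringify`, `diffPostureChecksSketch`, and the operation shape are hypothetical and not taken from the existing diff code.

```typescript
/** Serializes a value with object keys sorted, so key order does not affect equality. */
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .filter((key) => obj[key] !== undefined) // match JSON.stringify, which drops undefined
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${parts.join(",")}}`;
  }
  return value === undefined ? "null" : JSON.stringify(value);
}

interface CheckDesired {
  description: string;
  checks: Record<string, unknown>;
}
interface CheckActual extends CheckDesired {
  id: string;
  name: string;
}
interface CheckOp {
  type: "create_posture_check" | "update_posture_check" | "delete_posture_check";
  name: string;
}

function diffPostureChecksSketch(
  desired: Record<string, CheckDesired>,
  actual: CheckActual[],
): CheckOp[] {
  const ops: CheckOp[] = [];
  const actualByName = new Map<string, CheckActual>();
  for (const check of actual) actualByName.set(check.name, check);

  for (const [name, want] of Object.entries(desired)) {
    const have = actualByName.get(name);
    if (!have) {
      ops.push({ type: "create_posture_check", name });
    } else if (
      have.description !== want.description ||
      stableStringify(have.checks) !== stableStringify(want.checks)
    ) {
      ops.push({ type: "update_posture_check", name });
    }
  }
  const desiredNames = new Set(Object.keys(desired));
  for (const check of actual) {
    if (!desiredNames.has(check.name)) ops.push({ type: "delete_posture_check", name: check.name });
  }
  return ops;
}
```

Array order is still treated as significant here; whether that is right for every field inside checks is an open question for the implementation.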

Networks: Compare by name. Create network if missing. For each network, diff resources and routers:

  • Resources: match by name within the network. Create/update/delete.
  • Routers: match by peer name (or peer_groups). Create/update/delete.

Peers: Compare by name. Only update operations (never create/delete). Compare groups (excluding "All"), login_expiration_enabled, inactivity_expiration_enabled, ssh_enabled.
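The peer rule can be sketched as follows. It assumes that actual peer groups have already been resolved to names, and that group changes map to update_peer_groups while setting changes map to update_peer; that split, along with `PeerSettings` and `diffPeersSketch`, is an assumption for illustration only.

```typescript
interface PeerSettings {
  groups: string[]; // group names
  login_expiration_enabled: boolean;
  inactivity_expiration_enabled: boolean;
  ssh_enabled: boolean;
}
interface PeerOp {
  type: "update_peer_groups" | "update_peer";
  name: string;
}

/** Canonical form of a group list: "All" removed, sorted, serialized. */
function groupKey(groups: string[]): string {
  return JSON.stringify(groups.filter((group) => group !== "All").sort());
}

function diffPeersSketch(
  desired: Record<string, PeerSettings>,
  actualByName: Map<string, PeerSettings>,
): PeerOp[] {
  const ops: PeerOp[] = [];
  for (const [name, want] of Object.entries(desired)) {
    const have = actualByName.get(name);
    if (!have) continue; // peers are never created by the reconciler
    if (groupKey(want.groups) !== groupKey(have.groups)) {
      ops.push({ type: "update_peer_groups", name });
    }
    if (
      want.login_expiration_enabled !== have.login_expiration_enabled ||
      want.inactivity_expiration_enabled !== have.inactivity_expiration_enabled ||
      want.ssh_enabled !== have.ssh_enabled
    ) {
      ops.push({ type: "update_peer", name });
    }
  }
  return ops; // peers present only in actual state are left alone (no delete)
}
```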

Users: Compare by email. Create if missing. Update if role or auto_groups changed. Delete if not in desired (but never delete a user with the "owner" role).
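A minimal sketch of the user rule, including the owner guard, is shown here. `diffUsersSketch` and its types are hypothetical, and auto_groups on the actual side are assumed to be resolved to names before comparison.

```typescript
type Role = "owner" | "admin" | "user";
interface UserDesired {
  name: string;
  role: Role;
  auto_groups: string[];
}
interface UserActual {
  email: string;
  role: Role;
  auto_groups: string[]; // assumed already resolved to group names
}
interface UserOp {
  type: "create_user" | "update_user" | "delete_user";
  email: string;
}

function sameMembers(a: string[], b: string[]): boolean {
  return JSON.stringify(a.slice().sort()) === JSON.stringify(b.slice().sort());
}

function diffUsersSketch(
  desired: Record<string, UserDesired>,
  actual: UserActual[],
): UserOp[] {
  const ops: UserOp[] = [];
  const actualByEmail = new Map<string, UserActual>();
  for (const user of actual) actualByEmail.set(user.email, user);

  for (const [email, want] of Object.entries(desired)) {
    const have = actualByEmail.get(email);
    if (!have) {
      ops.push({ type: "create_user", email });
    } else if (have.role !== want.role || !sameMembers(have.auto_groups, want.auto_groups)) {
      ops.push({ type: "update_user", email });
    }
  }
  const desiredEmails = new Set(Object.keys(desired));
  for (const user of actual) {
    // Owners are never deleted, even when missing from the desired state.
    if (!desiredEmails.has(user.email) && user.role !== "owner") {
      ops.push({ type: "delete_user", email: user.email });
    }
  }
  return ops;
}
```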

Policies update: Handle destination_resource — when present, skip group-based destination comparison. Handle source_posture_checks.

Run: deno task check


Task 8: Add executor handlers for new operations

Files:

  • Modify: src/reconcile/executor.ts

Add case handlers in executeSingle for all new operation types. Network operations need special handling: resources and routers reference the network ID, which may be newly created. Track createdNetworkIds similar to createdGroupIds.

Posture check operations: create/update/delete via client methods. Track createdPostureCheckIds.

User operations: resolve auto_groups names to IDs.

Network resource operations: resolve groups names to IDs.

Network router operations: resolve peer name to peer ID, or peer_groups names to group IDs.
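The name-to-ID resolution used by these handlers could be sketched as a small helper that prefers IDs created earlier in the same run and falls back to IDs from the actual state. `resolveIds` is a hypothetical helper, not an existing function in the executor.

```typescript
/**
 * Resolves names to IDs. IDs created during the current run take precedence over
 * IDs that already existed; an unresolvable name is an error.
 */
function resolveIds(
  kind: string,
  names: string[],
  existingByName: Map<string, { id: string }>,
  createdIds: Map<string, string>,
): string[] {
  return names.map((name) => {
    const id = createdIds.get(name) ?? existingByName.get(name)?.id;
    if (id === undefined) throw new Error(`cannot resolve ${kind} "${name}" to an ID`);
    return id;
  });
}
```

The same helper shape would serve groups (createdGroupIds), networks (createdNetworkIds), and posture checks (createdPostureCheckIds), which is why the created-ID maps have to be filled in as the create operations run.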

Update ExecutorClient type to include all new client methods.

Run: deno task check


Task 9: Update export to cover new resource types

Files:

  • Modify: src/export.ts

Add exportPostureChecks, exportNetworks, exportPeers, exportUsers functions.

Posture checks: Keyed by name. Pass through checks object as-is. Include description.

Networks: Keyed by name. For each network, fetch resources and routers from ActualState maps. Resources: resolve group IDs to names. Routers: resolve peer ID to peer name (via actual.peersById), resolve peer_group IDs to group names.
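The network export can be pictured as the reverse of the executor's resolution: IDs in, names out. The sketch below handles a single network; `exportNetworkSketch`, the input shapes, and the lookup maps `peerNamesById` and `groupNamesById` are hypothetical, and throwing on an unresolvable ID is a choice made for this example only.

```typescript
interface ResourceActual {
  name: string;
  description: string;
  type: string;
  address: string;
  enabled: boolean;
  groups: Array<{ id: string; name: string }>;
}
interface RouterActual {
  peer: string | null; // peer ID
  peer_groups: string[] | null; // group IDs
  metric: number;
  masquerade: boolean;
  enabled: boolean;
}
interface RouterExport {
  peer?: string; // peer name
  peer_groups?: string[]; // group names
  metric: number;
  masquerade: boolean;
  enabled: boolean;
}

function nameOf(kind: string, id: string, namesById: Map<string, string>): string {
  const name = namesById.get(id);
  if (name === undefined) throw new Error(`no ${kind} with ID "${id}"`);
  return name;
}

function exportNetworkSketch(
  description: string,
  resources: ResourceActual[],
  routers: RouterActual[],
  peerNamesById: Map<string, string>,
  groupNamesById: Map<string, string>,
) {
  return {
    description,
    resources: resources.map((res) => ({
      name: res.name,
      description: res.description,
      type: res.type,
      address: res.address,
      enabled: res.enabled,
      groups: res.groups.map((group) => group.name), // API objects already carry names
    })),
    routers: routers.map((router) => {
      const out: RouterExport = {
        metric: router.metric,
        masquerade: router.masquerade,
        enabled: router.enabled,
      };
      if (router.peer) out.peer = nameOf("peer", router.peer, peerNamesById);
      if (router.peer_groups && router.peer_groups.length > 0) {
        out.peer_groups = router.peer_groups.map((id) => nameOf("group", id, groupNamesById));
      }
      return out;
    }),
  };
}
```

Omitting peer and peer_groups when they are null keeps the exported object compatible with the optional fields in NetworkRouterSchema.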

Peers: Keyed by peer name. Include groups (resolved to names, excluding "All"), login_expiration_enabled, inactivity_expiration_enabled, ssh_enabled.

Users: Keyed by email. Include name, role, auto_groups (resolved to names).

Policies: Handle destinationResource — resolve resource ID to resource name. Include source_posture_checks resolved to posture check names.

Update the exportState return to include all new sections.

Run: deno task check


Task 10: Export the three environments to state/*.json

Run the export against all three NetBird instances (dev, prod, ext):

mkdir -p state
deno task export -- --netbird-api-url https://dev.netbird.achilles-rnd.cc/api --netbird-api-token <DEV_TOKEN> > state/dev.json
deno task export -- --netbird-api-url https://achilles-rnd.cc/api --netbird-api-token <PROD_TOKEN> > state/prod.json
deno task export -- --netbird-api-url https://ext.netbird.achilles-rnd.cc/api --netbird-api-token <EXT_TOKEN> > state/ext.json

Verify each file parses with the updated schema. Visually inspect for completeness against dashboards.


Task 11: Update tests

Files:

  • Modify: src/reconcile/diff.test.ts — tests for new diff functions
  • Modify: src/reconcile/executor.test.ts — tests for new executor cases
  • Modify: src/export.test.ts — tests for new export functions
  • Modify: src/state/schema.test.ts — tests for new schema validation
  • Modify: src/state/actual.test.ts — tests for expanded fetchActualState
  • Modify: src/integration.test.ts — update mock data to include new resource types

All existing tests must continue to pass. New tests should cover:

  • Posture check CRUD diff/execute
  • Network with resources and routers diff/execute
  • Peer update diff (group changes, setting changes)
  • User CRUD diff/execute
  • Policy with destination_resource (export and diff)
  • Policy with source_posture_checks (export and diff)
  • Export of all new resource types

Run: deno task test — all tests must pass.


Task 12: Final verification

Run full quality gate:

deno task check    # type check
deno fmt --check   # formatting
deno task test     # all tests

All must pass.