Cloud Asset Discovery

/cloud discovers your infrastructure at every cloud provider for which you add an account. On each discovery run, the hub normalizes every resource to one universal data model, so all features (topology, security findings, agent linking, cost estimate) work provider-agnostically.

Nine providers are supported: AWS, Azure, GCP, Hetzner, Proxmox, DigitalOcean, Scaleway, OVH, IONOS.

Pipeline

Provider API ──► CloudResource (normalized)
                   │
                   ▼  upsert in cloud_resources (UNIQUE per account+resource_id)
                   │
                   ▼  agent matching on private_ip / public_ip → has_agent flag
                   │
                   ▼  agentless SecurityChecks per resource_type
                   │
                   ▼  optional: auto-create topology_node for VMs
                   │
                   ▼  resources not in this run → is_active=false
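
The upsert, agent-matching and deactivation steps above can be sketched in memory as follows. This is a minimal illustration, not the hub's actual code: the CloudResource fields, the reconcile function and the map standing in for the cloud_resources table are all assumptions made for the example.

```go
package main

import "fmt"

// CloudResource is a hypothetical, trimmed-down version of the normalized model.
type CloudResource struct {
	AccountID  string
	ResourceID string
	PrivateIP  string
	PublicIP   string
	HasAgent   bool
	IsActive   bool
}

// key mirrors the UNIQUE (account, resource_id) constraint on cloud_resources.
func key(r CloudResource) string { return r.AccountID + "/" + r.ResourceID }

// reconcile merges one discovery run of one account into the stored set:
// seen resources are upserted and matched against known agent IPs, and
// stored resources of that account missing from the run become inactive.
func reconcile(stored map[string]CloudResource, accountID string, run []CloudResource, agentIPs map[string]bool) {
	seen := make(map[string]bool, len(run))
	for _, r := range run {
		r.AccountID = accountID
		r.IsActive = true
		r.HasAgent = (r.PrivateIP != "" && agentIPs[r.PrivateIP]) ||
			(r.PublicIP != "" && agentIPs[r.PublicIP])
		stored[key(r)] = r
		seen[key(r)] = true
	}
	for k, r := range stored {
		if r.AccountID == accountID && !seen[k] {
			r.IsActive = false
			stored[k] = r
		}
	}
}

func main() {
	stored := map[string]CloudResource{
		"acc1/vm-old": {AccountID: "acc1", ResourceID: "vm-old", IsActive: true},
	}
	run := []CloudResource{{ResourceID: "vm-1", PrivateIP: "10.0.0.5"}}
	reconcile(stored, "acc1", run, map[string]bool{"10.0.0.5": true})
	fmt.Println(stored["acc1/vm-1"].HasAgent, stored["acc1/vm-old"].IsActive)
}
```

Deactivation is scoped to the account being scanned, so a run for one account never touches the resources of another.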

The worker ticks every 15 minutes, but each cloud_account has its own discovery_interval_mins (default 240 min, i.e. 4 hours), so an account is only scanned once its interval has elapsed. A manual POST /api/v1/cloud/accounts/:id/discover triggers an immediate run within seconds.
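
A minimal sketch of the per-account due check, assuming a hypothetical Account struct whose fields mirror last_sync_at and discovery_interval_mins. The fallback to 240 minutes follows the default stated above; treating a never-synced account as due is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Account is a hypothetical view of one cloud_accounts row.
type Account struct {
	ID                    string
	LastSyncAt            time.Time // zero value: never synced
	DiscoveryIntervalMins int       // 0 falls back to the default
}

const defaultIntervalMins = 240

// isDue reports whether the account's interval has elapsed at time now.
func isDue(a Account, now time.Time) bool {
	if a.LastSyncAt.IsZero() {
		return true // assumption: a new account is scanned on the next tick
	}
	mins := a.DiscoveryIntervalMins
	if mins <= 0 {
		mins = defaultIntervalMins
	}
	next := a.LastSyncAt.Add(time.Duration(mins) * time.Minute)
	return !now.Before(next)
}

func main() {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	a := Account{ID: "acc1", LastSyncAt: now.Add(-3 * time.Hour)}
	fmt.Println(isDue(a, now)) // false: the default 240 min have not elapsed
	a.DiscoveryIntervalMins = 60
	fmt.Println(isDue(a, now)) // true: 60 min elapsed long ago
}
```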

Data Model

Table	What it stores
`cloud_accounts`	provider-credentials (AES-256-GCM encrypted), discovery-config
`cloud_resources`	each discovered resource — VMs, storage, network, managed services
`cloud_security_findings`	result of agentless checks per resource × check_id
`cloud_cost_daily`	optional daily cost rollup (Cost Explorer / Billing Export)
`cloud_discovery_runs`	append-only log of each run with stats + provider-errors

All tables have Postgres row-level security (RLS) enabled with the tenant_isolation policy, so rows stay isolated per tenant even for direct DB queries.

Credentials encryption

Each provider secret (AWS secret key, Hetzner API token, GCP service-account JSON, OVH consumer key, …) is encrypted with AES-256-GCM and stored in BYTEA columns. The master key is read from the env var CLOUD_ENCRYPTION_KEY (32 bytes, hex-encoded, i.e. 64 hex characters). Without that variable the Cloud Discovery worker does not run and the handlers return 503, so a secret is never accidentally stored as plaintext.

Encrypted format on disk:

12-byte nonce  ‖  ciphertext + 16-byte GCM tag

Decryption happens per discovery-run, in-memory; the plaintext never reaches a log or a DB-row.
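
The on-disk layout (nonce, then ciphertext with the GCM tag appended) matches what Go's standard crypto/cipher package produces when the nonce is used as the destination prefix of Seal. The sketch below assumes that layout and a hex-encoded 32-byte key; the function names loadKey, encrypt and decrypt are illustrative, and the hard-coded demo key stands in for CLOUD_ENCRYPTION_KEY.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// loadKey decodes a hex string and insists on 32 bytes (AES-256).
func loadKey(hexKey string) ([]byte, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil {
		return nil, err
	}
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes (64 hex characters)")
	}
	return key, nil
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext || tag with a fresh random 12-byte nonce.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends ciphertext and tag to its first argument, here the nonce.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and verifies the tag while decrypting.
func decrypt(key, blob []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize()+gcm.Overhead() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	// Demo key only; a real deployment would read CLOUD_ENCRYPTION_KEY.
	key, err := loadKey("0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef")
	if err != nil {
		panic(err)
	}
	blob, err := encrypt(key, []byte("hetzner-api-token"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(blob), string(plain))
}
```

Because GCM is authenticated, a tampered blob or a wrong key makes decrypt return an error instead of garbage plaintext.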

Provider details

AWS (`aws`)

Auth: IAM Role (recommended, arn:aws:iam::…:role/MonsysReadOnly with optional External ID) or IAM User access key.
Discovery: EC2 Instances, VPCs, Subnets, Security Groups, Load Balancers, RDS Instances, S3 Buckets.
Security checks: S3 Public Access Block, S3 encryption, RDS encryption, RDS public access, RDS backup retention, RDS deletion protection, SG open SSH/RDP/All-traffic, EC2 IMDSv2-required, EC2 public IP.
Multi-region: yes, list of AWS regions per account.
Required IAM permissions: ec2:Describe*, rds:Describe*, s3:List* + s3:GetBucketPublicAccessBlock + s3:GetBucketEncryption, elasticloadbalancing:Describe*.
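
The open SSH/RDP/all-traffic check recurs for most providers, so it is worth sketching once against a normalized rule. The Rule struct, the "all" protocol marker and the check IDs below are assumptions for illustration, not the hub's real identifiers.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Rule is a hypothetical normalized ingress rule.
type Rule struct {
	Protocol string // "tcp", "udp" or "all" (assumed normalization)
	FromPort int
	ToPort   int
	CIDR     string
}

// worldOpen is true for 0.0.0.0/0 and ::/0, i.e. any prefix of length 0.
func worldOpen(cidr string) bool {
	p, err := netip.ParsePrefix(cidr)
	return err == nil && p.Bits() == 0
}

func covers(r Rule, port int) bool { return r.FromPort <= port && port <= r.ToPort }

// checkRule returns the IDs of the checks that the rule violates.
func checkRule(r Rule) []string {
	if !worldOpen(r.CIDR) {
		return nil
	}
	if r.Protocol == "all" {
		return []string{"open-all-traffic"}
	}
	if r.Protocol != "tcp" {
		return nil
	}
	var out []string
	if covers(r, 22) {
		out = append(out, "open-ssh")
	}
	if covers(r, 3389) {
		out = append(out, "open-rdp")
	}
	return out
}

func main() {
	fmt.Println(checkRule(Rule{Protocol: "tcp", FromPort: 22, ToPort: 22, CIDR: "0.0.0.0/0"}))
	fmt.Println(checkRule(Rule{Protocol: "tcp", FromPort: 22, ToPort: 22, CIDR: "10.0.0.0/8"}))
}
```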

Azure (`azure`)

Auth: Service Principal — subscription_id, tenant_id, client_id, client_secret. Reader-role on the subscription.
Discovery: Virtual Machines, VNets + Subnets, Network Security Groups (NSGs), SQL Servers, Storage Accounts.
Security checks: NSG open SSH/RDP, NSG allow-all, SQL Public Network Access, Storage AllowBlobPublicAccess, Storage HTTPS-only.
VM-state: read from InstanceView.Statuses (PowerState/running etc.).
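
Azure reports several status codes per VM in the instance view, such as ProvisioningState/succeeded next to PowerState/running. A minimal sketch of picking out the power state, assuming the codes have already been extracted into a plain string slice (the function name and the "unknown" fallback are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// powerState returns the part after "PowerState/" from the first matching
// status code, or "unknown" when the instance view carries none.
func powerState(codes []string) string {
	const prefix = "PowerState/"
	for _, c := range codes {
		if strings.HasPrefix(c, prefix) {
			return strings.TrimPrefix(c, prefix)
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(powerState([]string{"ProvisioningState/succeeded", "PowerState/running"}))
	fmt.Println(powerState([]string{"ProvisioningState/succeeded"}))
}
```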

GCP (`gcp`)

Auth: Service Account JSON key. Requires roles/viewer on the project (or finer-grained: compute.viewer + storage.objectViewer + cloudsql.viewer).
Discovery: GCE Instances (cross-zone via AggregatedList), VPCs, Subnetworks, Firewalls (global), Cloud SQL, GCS Buckets.
Security checks: Firewall open SSH/RDP/all-traffic from 0.0.0.0/0, GCS Public Access Prevention not enforced, GCS no Uniform Bucket-Level Access, Cloud SQL with Ipv4Enabled=true.

Hetzner Cloud (`hetzner`)

Auth: read-only API token from Project → Security → API Tokens.
Discovery: Servers (paginated), Networks, Firewalls, Volumes.
Security checks: open SSH/RDP firewall-rules, public IPv4 on servers.
Client: hand-rolled REST client against api.hetzner.cloud, without an SDK dependency.
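
A sketch of how paginated server discovery against the Hetzner Cloud API could look. It relies on the API's meta.pagination.next_page field and Bearer-token auth; the type and function names are illustrative, and only the id and name fields are decoded. The page source is injected as a function so the loop can be exercised without network access.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// serverPage decodes only the fields this sketch needs.
type serverPage struct {
	Servers []struct {
		ID   int64  `json:"id"`
		Name string `json:"name"`
	} `json:"servers"`
	Meta struct {
		Pagination struct {
			NextPage *int `json:"next_page"` // null on the last page
		} `json:"pagination"`
	} `json:"meta"`
}

// httpFetcher returns a function that downloads one page of servers.
func httpFetcher(token string) func(page int) ([]byte, error) {
	return func(page int) ([]byte, error) {
		endpoint := fmt.Sprintf("https://api.hetzner.cloud/v1/servers?page=%d&per_page=50", page)
		req, err := http.NewRequest(http.MethodGet, endpoint, nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("Authorization", "Bearer "+token)
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("hetzner: unexpected status %s", resp.Status)
		}
		return io.ReadAll(resp.Body)
	}
}

// listAllServers follows next_page until the API reports no further page.
func listAllServers(fetch func(page int) ([]byte, error)) ([]string, error) {
	var names []string
	for page := 1; page > 0; {
		body, err := fetch(page)
		if err != nil {
			return nil, err
		}
		var p serverPage
		if err := json.Unmarshal(body, &p); err != nil {
			return nil, err
		}
		for _, s := range p.Servers {
			names = append(names, s.Name)
		}
		next := p.Meta.Pagination.NextPage
		if next == nil || *next <= page {
			break // last page, or a guard against a non-advancing cursor
		}
		page = *next
	}
	return names, nil
}

func main() {
	// Canned pages stand in for the API; use httpFetcher(token) for real calls.
	pages := map[int]string{
		1: `{"servers":[{"id":1,"name":"web-1"}],"meta":{"pagination":{"next_page":2}}}`,
		2: `{"servers":[{"id":2,"name":"db-1"}],"meta":{"pagination":{"next_page":null}}}`,
	}
	names, err := listAllServers(func(page int) ([]byte, error) {
		return []byte(pages[page]), nil
	})
	fmt.Println(names, err)
}
```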

Proxmox VE (`proxmox`)

Auth: PVEAPIToken in format user@realm!tokenid=secret. PVEAuditor role on / is sufficient.
Discovery: Qemu VMs, LXC containers, nodes, storage. Guest-IP via qemu-guest-agent if available.
Security checks: publicly routable IPs (RFC1918-aware).
SSL: optional (verify_ssl=false for self-signed clusters).
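
One way to implement an RFC1918-aware check is Go's net/netip package, whose Addr.IsPrivate covers the RFC 1918 ranges (and IPv6 unique local addresses). The sketch below is an assumption about how such a check might look, not the hub's code; note that it does not special-case carrier-grade NAT space (100.64.0.0/10).

```go
package main

import (
	"fmt"
	"net/netip"
)

// isPubliclyRoutable reports whether ip parses and falls outside the
// private, loopback, link-local, multicast and unspecified ranges.
func isPubliclyRoutable(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat ::ffff:10.0.0.1 like 10.0.0.1
	return !(addr.IsPrivate() ||
		addr.IsLoopback() ||
		addr.IsLinkLocalUnicast() ||
		addr.IsMulticast() ||
		addr.IsUnspecified())
}

func main() {
	for _, ip := range []string{"10.1.2.3", "172.32.0.1", "8.8.8.8"} {
		fmt.Println(ip, isPubliclyRoutable(ip))
	}
}
```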

DigitalOcean (`digitalocean`)

Auth: Personal Access Token (read scope sufficient).
Discovery: Droplets, Volumes, VPCs, Firewalls, Managed Databases, Load Balancers.
Security checks: Firewall open SSH/RDP, DB without Trusted Sources, DB SSL optional, Droplet public IP.

Scaleway (`scaleway`)

Auth: Access Key + Secret Key + Organization/Project ID + default zone (default fr-par-1).
Discovery: Instances, Volumes, Security Groups, VPCs.
Security checks: open SSH/RDP firewall-rules, public IPv4 on instances.

OVH (`ovh`)

Auth: Consumer Key and Secret.
Discovery: Instances, Volumes, Firewalls, VPCs.
Security checks: open SSH/RDP firewall-rules, public IPv4 on instances.

IONOS (`ionos`)

Auth: API Token.
Discovery: Instances, Volumes, Firewalls, VPCs.
Security checks: open SSH/RDP firewall-rules, public IPv4 on instances.

API

Endpoint	Description
`GET /api/v1/cloud/accounts`	list all cloud-accounts
`POST /api/v1/cloud/accounts`	create an account (live credential validation + AES encryption)
`DELETE /api/v1/cloud/accounts/:id`	delete an account (cascades to its resources)
`POST /api/v1/cloud/accounts/:id/discover`	trigger an immediate discovery run
`GET /api/v1/cloud/resources?account_id=&type=&has_agent=&is_public=`	resources with filters
`GET /api/v1/cloud/resources/:id`	single resource with findings + raw data
`GET /api/v1/cloud/resources/:id/install`	agent install-commands (curl / SSM / PowerShell)
`GET /api/v1/cloud/summary`	aggregations per account
`GET /api/v1/cloud/findings?severity=&status=`	all security findings tenant-wide
`GET /api/v1/cloud/runs?account_id=`	discovery-run history
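
A small client-side sketch of two of these endpoints: building a filtered resource query and preparing a discover request. The hub base URL, the Bearer-token header and the "true"/"false" encoding of the boolean filters are assumptions; only the paths and query parameter names come from the table above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// ResourceFilter holds the optional query parameters; nil pointers and
// empty strings are omitted from the URL.
type ResourceFilter struct {
	AccountID string
	Type      string
	HasAgent  *bool
	IsPublic  *bool
}

// resourcesURL builds the GET /api/v1/cloud/resources URL for a filter.
func resourcesURL(base string, f ResourceFilter) string {
	q := url.Values{}
	if f.AccountID != "" {
		q.Set("account_id", f.AccountID)
	}
	if f.Type != "" {
		q.Set("type", f.Type)
	}
	if f.HasAgent != nil {
		q.Set("has_agent", strconv.FormatBool(*f.HasAgent))
	}
	if f.IsPublic != nil {
		q.Set("is_public", strconv.FormatBool(*f.IsPublic))
	}
	u := base + "/api/v1/cloud/resources"
	if enc := q.Encode(); enc != "" {
		u += "?" + enc
	}
	return u
}

// discoverRequest prepares the POST that triggers an immediate run.
func discoverRequest(base, accountID, token string) (*http.Request, error) {
	endpoint := base + "/api/v1/cloud/accounts/" + url.PathEscape(accountID) + "/discover"
	req, err := http.NewRequest(http.MethodPost, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token) // assumed auth scheme
	return req, nil
}

func main() {
	noAgent, public := false, true
	fmt.Println(resourcesURL("https://hub.example.com",
		ResourceFilter{HasAgent: &noAgent, IsPublic: &public}))

	req, err := discoverRequest("https://hub.example.com", "42", "demo-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The first query lists public resources without an agent, which is typically the most interesting slice for a security review.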

Worker-cadence

Worker	Tick	Goal
`CloudDiscoveryWorker`	every 15 min	scans `cloud_accounts` where `last_sync_at + interval` has passed and spawns a goroutine per due account

Failures are recorded in cloud_discovery_runs.provider_errors (JSON array). A run in which only some passes succeed (for example 2 of 3) gets the status partial, so you can tell “no access to RDS” apart from a complete failure.
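
How a run status could be derived from the individual passes, as a minimal sketch. Only the status partial is named in this document; the values "success" and "failed", the PassResult type and the providerError helper are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerError is a hypothetical error type for a failed provider call.
type providerError string

func (e providerError) Error() string { return string(e) }

// PassResult is the outcome of one discovery pass (e.g. EC2, S3, RDS).
type PassResult struct {
	Name string
	Err  error
}

// summarize collects the error messages and derives the run status:
// no errors -> success, all passes failed -> failed, otherwise partial.
func summarize(passes []PassResult) (status string, providerErrors []string) {
	providerErrors = []string{} // encodes as [] rather than null
	for _, p := range passes {
		if p.Err != nil {
			providerErrors = append(providerErrors, p.Name+": "+p.Err.Error())
		}
	}
	switch {
	case len(providerErrors) == 0:
		status = "success"
	case len(providerErrors) == len(passes):
		status = "failed"
	default:
		status = "partial"
	}
	return status, providerErrors
}

func main() {
	status, errs := summarize([]PassResult{
		{Name: "ec2"},
		{Name: "s3"},
		{Name: "rds", Err: providerError("AccessDenied")},
	})
	encoded, _ := json.Marshal(errs) // shape of the provider_errors column
	fmt.Println(status, string(encoded))
}
```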

Limitations

No real-time costs — estimated hourly price per instance type, not via Cost Explorer / Cost Management / Billing Export. For exact billing: enable the provider’s own billing-export pipeline.
Kubernetes nodes: AKS / EKS / GKE worker-nodes are discovered as generic VMs but not as Kubernetes-resources. Pod-discovery comes later.
Object Storage depth: we show buckets and their public flag — not individual objects or their ACLs (too expensive on real buckets).
No automatic agent install: we show the install commands and you run them yourself. A future version of the hub may roll out the agent directly via SSM / Azure Run Command / GCP Startup Scripts.

Related

Topology: auto-created cloud nodes appear here.
Diagrams: use the topology graph as the basis for PNG/SVG/PDF export.
Security findings: carry framework_refs (CIS, ISO 27001, GDPR) and feed the compliance-evidence pipeline.
