DigitalOcean Kubernetes Challenge 2021 - Challenge Complete!
Howdy, here we are with my follow-up post on the DigitalOcean Kubernetes Challenge 2021, in which I’ll be creating a Kubernetes cluster on DigitalOcean and then deploying Falco to it. I’ll be using Terraform (TF) to do as much as possible, because Infrastructure as Code is my jam.
First steps
I created a repository on GitHub to hold the relevant code; if you go look now, you should see the completed project. Noice.
Accepting the challenge and signing up for a DigitalOcean (DO) account was pretty simple. There’s a little asterisk on the billing dashboard mentioning that it’s updated daily, so don’t be surprised when you apply a coupon and the Remaining account credits field doesn’t update straight away. Despite the warning, I found this a little disappointing. I got the message saying the credits were applied, so why can’t the dashboard reflect that?
As is pretty standard for me these days, one of the first things I did was add a VS Code devcontainer configuration so that I have a nice, portable, contained1 environment to work in. I can’t tell you how nice it is being able to move from one computer to another, pull the repo and basically be ready to go. That said, it’s a bit overkill for this project because I only actually needed the terraform binary. You’ll see why soon.
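For the curious, a devcontainer for a project like this can be as small as a single devcontainer.json. The below is an illustrative sketch rather than my exact config (the base image and the published Terraform feature are assumptions on my part):
{
  "name": "do-k8s-2021",
  "image": "mcr.microsoft.com/devcontainers/base:debian",
  "features": {
    "ghcr.io/devcontainers/features/terraform:1": {}
  }
}
VS Code (or any devcontainer-aware tool) builds the container and drops you into a shell with terraform already on the PATH.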
Finally, I took a bit of a peek at the TF documentation for the DO provider. I saw the provider needs an API token to be able to interact with DO on my behalf, which makes total sense, so I created a Personal Access Token called do-k8s-2021 from the API page. I’m only using the token with this project, so I can confidently delete it in future and know what will be impacted. Anyway, I copied down the token temporarily as it’s only shown once.
To make things a little more interesting, I decided to use Terraform Cloud (TFC) for the first time2. This involved creating a new workspace named do-k8s-2021 to keep things consistent, and selecting the CLI-driven workflow option. I was provided with instructions and sample TF code to get started, and everything just worked3. You might be asking why, at this point.
Well, I’m glad you asked.
- It’s something that I wanted to learn more about and get practical experience using.
- It makes development even more portable.
- Sure, I could store the terraform state in the repo using the local backend, but this is a public project so anyone could see it4, and I feel like it’s a bad practice anyway. The solution? Use a remote state backend.
- Generally I’m using AWS, so I use S3 and DynamoDB for state and state-locking respectively, but this project isn’t on AWS. It’s on DigitalOcean. What to do? 🤔
- Terraform Cloud provides remote state for free. Even better, I can securely store the DO token here as well. That means I don’t have to worry about how to distribute the token between development machines. Noice.
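As an aside, hooking the local CLI up to TFC is essentially a one-time login plus the backend block you’ll see in main.tf shortly:
$ terraform login   # creates an API token on app.terraform.io and stores it in your local credentials file
After that, plans and applies run remotely in the workspace.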
DigitalOcean
So, what did I have at this point?
- A DO account
- A TFC account and workspace, configured with my DO Personal Access Token exported as a sensitive Workspace variable called DIGITALOCEAN_TOKEN
- A nearly empty main.tf
First things first, I needed to set up the DO provider in TF so I updated the main.tf.
# main.tf
terraform {
  backend "remote" {
    organization = "ljones"

    workspaces {
      name = "do-k8s-2021"
    }
  }

  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "~> 2.16.0"
    }
  }
}

provider "digitalocean" {}
The ~> means that the DO provider can update to any 2.16.x version. Side note: it’s important to run terraform init after updating providers so that new providers are downloaded (if running locally) and the .terraform.lock.hcl file is created. The TL;DR is that this file helps keep things consistent, and it should be committed to version control with the rest of the code.
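In practice that just looks like the below (the commit message is purely illustrative):
$ terraform init                  # downloads the digitalocean provider and writes .terraform.lock.hcl
$ git add .terraform.lock.hcl
$ git commit -m "Pin provider versions"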
Before I went slapping more TF code in there, I had a look around the DO console. Featured quite prominently was first-project. Huh, indeed. What’s that?
Projects let you organize your DigitalOcean resources (like Droplets, Spaces, and load balancers) into groups that fit the way you work. You can create projects that align with the applications, environments, and clients that you host on DigitalOcean.
Yep, cool, I definitely want one of those to house all the resources associated with this, uh, project. Back to the documentation again and lo, there is indeed a resource for managing DO projects. I added the below to my main.tf and ran a plan operation.
Note, I usually have terraform aliased to tf for convenience, so don’t be surprised when you see tf being used in the terminal!
resource "digitalocean_project" "do-k8s-2021" {
name = "do-k8s-2021"
description = "DigitalOcean Kubernetes Challenge 2021"
purpose = "Deploy a security and compliance system"
}
$ tf plan
Running plan in the remote backend. Output will stream here. Pressing Ctrl-C
will stop streaming the logs, but will not stop the plan running remotely.
Preparing the remote plan...
To view this run in a browser, visit:
https://app.terraform.io/app/ljones/do-k8s-2021/runs/run-bVsGoY2hdCxXkpDY
Waiting for the plan to start...
Terraform v1.0.11
on linux_amd64
Configuring remote state backend...
Initializing Terraform configuration...
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_project.do-k8s-2021 will be created
+ resource "digitalocean_project" "do-k8s-2021" {
+ created_at = (known after apply)
+ description = "DigitalOcean Kubernetes Challenge 2021"
+ environment = "Development"
+ id = (known after apply)
+ is_default = (known after apply)
+ name = "do-k8s-2021"
+ owner_id = (known after apply)
+ owner_uuid = (known after apply)
+ purpose = "Deploy a security and compliance system"
+ resources = (known after apply)
+ updated_at = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
Plan looked good, time for a run.
Another neat thing about using TFC is that my TF plan|apply actions were shown live in the web UI as well as in my terminal. It also keeps a history of these actions. This is maybe less important for a 1-person project, but I see real value for teams.
$ tf apply
Running apply in the remote backend. Output will stream here. Pressing Ctrl-C
will cancel the remote apply if it's still pending. If the apply started it
will stop streaming the logs, but will not stop the apply running remotely.
Preparing the remote apply...
To view this run in a browser, visit:
https://app.terraform.io/app/ljones/do-k8s-2021/runs/run-QnJoQ2wVpHnqt1iv
Waiting for the plan to start...
Terraform v1.0.11
on linux_amd64
Configuring remote state backend...
Initializing Terraform configuration...
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_project.do-k8s-2021 will be created
+ resource "digitalocean_project" "do-k8s-2021" {
+ created_at = (known after apply)
+ description = "DigitalOcean Kubernetes Challenge 2021"
+ environment = "Development"
+ id = (known after apply)
+ is_default = (known after apply)
+ name = "do-k8s-2021"
+ owner_id = (known after apply)
+ owner_uuid = (known after apply)
+ purpose = "Deploy a security and compliance system"
+ resources = (known after apply)
+ updated_at = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions in workspace "do-k8s-2021"?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
digitalocean_project.do-k8s-2021: Creating...
digitalocean_project.do-k8s-2021: Creation complete after 2s [id=3401883b-cbc0-4889-ad45-a8d30051cd28]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
I’m not going to show the plan/apply output every time, by the way. I figure it might be helpful for those that haven’t used it before though.
I checked in the DO console and… drum roll… success! Terraform hadn’t been lying to me after all.
Kubernetes (k8s)
It was about this point that I found out Kubernetes on DO is called DigitalOcean Kubernetes or DOKS, and this entire exercise should have been called doks-challenge-2021 or something like that. Oh well.
At this point I had:
- DO project ready
- TFC operational
- A mostly empty main.tf
terraform {
  backend "remote" {
    organization = "ljones"

    workspaces {
      name = "do-k8s-2021"
    }
  }

  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "~> 2.16.0"
    }
  }
}

provider "digitalocean" {}

resource "digitalocean_project" "do-k8s-2021" {
  name        = "do-k8s-2021"
  description = "DigitalOcean Kubernetes Challenge 2021"
  purpose     = "Deploy a security and compliance system"
}
The next step on my journey was actually deploying a Kubernetes cluster. At this point I just started skimming through the DO documentation on k8s, particularly cluster creation. I saw there was an HA cluster option but nah, this isn’t that serious. I read up a little on DO VPCs, regions, block storage and load balancers. Surprisingly5, they don’t have an Australian region, but they do have Singapore, and as I’m on the western coast of Australia, that’s pretty much as good as the eastern coast where all the cloud providers host their datacentres.
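If you’d rather not click around the console for this, doctl can list what’s available (I didn’t strictly need it, but it’s a handy cross-check):
$ doctl kubernetes options regions
$ doctl kubernetes options sizes
$ doctl kubernetes options versions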
Then it was back to the TF DO documentation, eventually leading to these updates to main.tf.
resource "digitalocean_project" "do-k8s-2021" {
name = "do-k8s-2021"
description = "DigitalOcean Kubernetes Challenge 2021"
purpose = "Deploy a security and compliance system"
resources = [
"do:kubernetes:${digitalocean_kubernetes_cluster.doks.id}" # this is a bit tedious
]
}
resource "digitalocean_vpc" "do-k8s-2021" {
name = "do-k8s-2021"
region = "sgp1"
}
data "digitalocean_kubernetes_versions" "doks" {
version_prefix = "1.21."
}
resource "digitalocean_kubernetes_cluster" "doks" {
name = "do-k8s-2021"
region = "sgp1"
auto_upgrade = true
version = data.digitalocean_kubernetes_versions.doks.latest_version
vpc_uuid = digitalocean_vpc.do-k8s-2021.id
maintenance_policy {
start_time = "16:00"
day = "friday"
}
node_pool {
name = "autoscale-worker-pool"
size = "s-1vcpu-2gb" # 1 cpu, 2gb ram (1gb useable)
auto_scale = true
min_nodes = 1
max_nodes = 3 # this is the maximum allowed nodes on my new account without requesting an increase
}
}
Applying, which creates the cluster, took about 6 minutes for me, though that may vary depending on region and node size. It worked without a hitch. Noice.
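As an aside, if you want to poke at the cluster with kubectl directly, the cluster resource exposes its kubeconfig. Something like the below (not part of my final config) pulls it out of Terraform:
output "kubeconfig" {
  value     = digitalocean_kubernetes_cluster.doks.kube_config[0].raw_config
  sensitive = true
}
Then tf output -raw kubeconfig > kubeconfig.yaml and point KUBECONFIG at it.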
The last thing I wanted to do was get the new k8s cluster into TF so I could deploy manifests or helm charts. More reading of documentation, though this is actually something I’m already doing over in AWS EKS land, so there wasn’t much to learn.
- Added the kubernetes tf provider
- Added the helm tf provider
- Initialised tf again
- Tested creating/destroying a namespace via tf just to make sure everything was 👌 (see the sketch after the provider config below)
required_providers {
  ...
  helm = {
    source  = "hashicorp/helm"
    version = "~> 2.4.0"
  }
  kubernetes = {
    source  = "hashicorp/kubernetes"
    version = "~> 2.6.0"
  }
}
...
provider "kubernetes" {
  host                   = digitalocean_kubernetes_cluster.doks.kube_config[0].host
  token                  = digitalocean_kubernetes_cluster.doks.kube_config[0].token
  cluster_ca_certificate = base64decode(digitalocean_kubernetes_cluster.doks.kube_config[0].cluster_ca_certificate)
}

provider "helm" {
  kubernetes {
    host                   = digitalocean_kubernetes_cluster.doks.kube_config[0].host
    token                  = digitalocean_kubernetes_cluster.doks.kube_config[0].token
    cluster_ca_certificate = base64decode(digitalocean_kubernetes_cluster.doks.kube_config[0].cluster_ca_certificate)
  }
}
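The namespace smoke test mentioned above was nothing fancy; a throwaway resource along these lines (the name is arbitrary), applied and then destroyed again:
resource "kubernetes_namespace" "smoke_test" {
  metadata {
    name = "tf-smoke-test" # hypothetical name, removed again straight after
  }
}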
Falco
Here we are, the final chapter. It was time to install Falco. At this point I had:
- DOKS operational and ready to be interacted with via tf
You may have noticed that I’m still only using a single file, main.tf. I’m a fan of keeping things together until splitting them out actually makes my life easier, but this is maybe a style no-no for terraform. Sometimes empty files are used just to indicate the absence of that type of resource, for example, an empty outputs.tf indicating a module doesn’t have outputs. For a small greenfield project where I’m the only one working on it, it really doesn’t matter though.
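If I did split it, a conventional layout (purely hypothetical here) would look something like:
versions.tf   # terraform {} block: backend and required_providers
providers.tf  # digitalocean, kubernetes and helm provider configs
cluster.tf    # project, VPC and DOKS cluster
falco.tf      # the Falco helm_release
outputs.tf    # outputs, or left empty to signal there are none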
I went digging around and found the Falco helm repository and accompanying documentation. It looked like it should run a basic install without any tweaking of the chart values, so I left everything at the defaults.
resource "helm_release" "falco" {
name = "falco"
repository = "https://falcosecurity.github.io/charts"
chart = "falco"
version = "1.16.2"
namespace = "falco"
create_namespace = true
}
The tf apply was successful, and accessing the k8s dashboard via the DO console, I could see Falco had installed and was operational!
* Setting up /usr/src links from host
* Running falco-driver-loader for: falco version=0.30.0, driver version=3aa7a83bf7b9e6229a3824e3fd1f4452d1e95cb4
* Running falco-driver-loader with: driver=module, compile=yes, download=yes
* Unloading falco module, if present
* Trying to load a system falco module, if present
* Looking for a falco module locally (kernel 4.19.0-17-cloud-amd64)
* Trying to download a prebuilt falco module from https://download.falco.org/driver/3aa7a83bf7b9e6229a3824e3fd1f4452d1e95cb4/falco_debian_4.19.0-17-cloud-amd64_1.ko
* Download succeeded
* Success: falco module found and inserted
Mon Nov 22 11:33:41 2021: Falco version 0.30.0 (driver version 3aa7a83bf7b9e6229a3824e3fd1f4452d1e95cb4)
Mon Nov 22 11:33:41 2021: Falco initialized with configuration file /etc/falco/falco.yaml
Mon Nov 22 11:33:41 2021: Loading rules from file /etc/falco/falco_rules.yaml:
Mon Nov 22 11:33:41 2021: Loading rules from file /etc/falco/falco_rules.local.yaml:
Mon Nov 22 11:33:42 2021: Starting internal webserver, listening on port 8765
11:33:42.054459000: Notice Container with sensitive mount started (user=daemon user_loginuid=0 command=container:9f7394213f5d k8s.ns=kube-system k8s.pod=do-node-agent-tf8t9 container=9f7394213f5d image=docker.io/digitalocean/do-agent:3 mounts=/proc:/host/proc::false:private,/sys:/host/sys::false:private,/:/host/root::false:rslave,/var/lib/kubelet/pods/a8eac910-fb2e-40ea-bc13-2cfd181d8750/volumes/kubernetes.io~projected/kube-api-access-ctq56:/var/run/secrets/kubernetes.io/serviceaccount::false:private,/var/lib/kubelet/pods/a8eac910-fb2e-40ea-bc13-2cfd181d8750/etc-hosts:/etc/hosts::true:private,/var/lib/kubelet/pods/a8eac910-fb2e-40ea-bc13-2cfd181d8750/containers/do-node-agent/21e37c0b:/dev/termination-log::true:private) k8s.ns=kube-system k8s.pod=do-node-agent-tf8t9 container=9f7394213f5d
...
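For the record, you don’t need the dashboard for this. Assuming the chart’s daemonset is named after the release (falco here), the same logs can be tailed with kubectl:
$ kubectl -n falco get pods
$ kubectl -n falco logs daemonset/falco -f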
Thing is, nothing much was going on really. I had an essentially empty cluster at this point, so that made sense, but still! Happily, the Falco helm chart actually has the ability to generate fake events; to enable it, I added the below.
resource "helm_release" "falco" {
...
set {
name = "fakeEventGenerator.enabled"
value = "true"
}
}
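Equivalently (and an approach I didn’t use here), the helm provider’s values argument accepts YAML, which scales better once you’re overriding more than a couple of settings:
resource "helm_release" "falco" {
  ...
  values = [
    <<-EOT
    fakeEventGenerator:
      enabled: true
    EOT
  ]
}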
Tailing the logs once again, there was plenty going on now.
11:49:55.808634583: Warning Sensitive file opened for reading by trusted program after startup (user=root user_loginuid=-1 command=httpd --loglevel info run ^syscall.ReadSensitiveFileUntrusted$ --sleep 6s parent=httpd file=/etc/shadow parent=httpd gparent=event-generator container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:55.911658311: Error File below a known binary directory opened for writing (user=root user_loginuid=-1 command=event-generator run --loop ^syscall file=/bin/created-by-event-generator parent=event-generator pcmdline=event-generator run --loop ^syscall gparent=<NA> container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:55.911718964: Error File below known binary directory renamed/removed (user=root user_loginuid=-1 command=event-generator run --loop ^syscall pcmdline=event-generator run --loop ^syscall operation=unlinkat file=<NA> res=0 dirfd=-100(AT_FDCWD) name=/bin/created-by-event-generator flags=0 container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:56.013072532: Notice System user ran an interactive command (user=bin user_loginuid=-1 command=login container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:56.294084849: Debug Shell spawned by untrusted binary (user=root user_loginuid=-1 shell=bash parent=httpd cmdline=bash -c ls > /dev/null pcmdline=httpd --loglevel info run ^helper.RunShell$ gparent=httpd ggparent=event-generator aname[4]=event-generator aname[5]=<NA> aname[6]= aname[7]= container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:56.501997391: Error File below /etc opened for writing (user=root user_loginuid=-1 command=event-generator run --loop ^syscall parent=event-generator pcmdline=event-generator run --loop ^syscall file=/etc/created-by-event-generator program=event-generator gparent=<NA> ggparent= gggparent= container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:56.602813830: Error File created below /dev by untrusted program (user=root user_loginuid=-1 command=event-generator run --loop ^syscall file=/dev/created-by-event-generator container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:56.703339933: Warning Sensitive file opened for reading by non-trusted program (user=root user_loginuid=-1 program=event-generator command=event-generator run --loop ^syscall file=/etc/shadow parent=event-generator gparent=<NA> ggparent= gggparent= container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:56.963228274: Notice Database-related program spawned process other than itself (user=root user_loginuid=-1 program=ls parent=mysqld container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
11:49:57.233889340: Notice Known system binary sent/received network traffic (user=root user_loginuid=-1 command=sha1sum --loglevel info run ^helper.NetworkActivity$ connection=10.244.0.83:49851->10.2.3.4:8192 container_id=64b9a21e112e image=docker.io/falcosecurity/event-generator) k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e k8s.ns=falco k8s.pod=falco-event-generator-84d64cb8fb-j9tpl container=64b9a21e112e
Tailing the actual event generator pod, I saw corresponding logs.
INFO sleep for 100ms action=syscall.WriteBelowEtc
INFO writing to /etc/created-by-event-generator action=syscall.WriteBelowEtc
INFO sleep for 100ms action=syscall.SystemUserInteractive
INFO run command as another user action=syscall.SystemUserInteractive cmdArgs="[]" cmdName=/bin/login user=daemon
INFO sleep for 100ms action=syscall.SystemProcsNetworkActivity
INFO spawn as "sha1sum" action=syscall.SystemProcsNetworkActivity args="^helper.NetworkActivity$"
INFO sleep for 100ms action=helper.NetworkActivity as=sha1sum
INFO action executed action=helper.NetworkActivity as=sha1sum
INFO sleep for 100ms action=syscall.ReadSensitiveFileTrustedAfterStartup
INFO spawn as "httpd" action=syscall.ReadSensitiveFileTrustedAfterStartup args="^syscall.ReadSensitiveFileUntrusted$ --sleep 6s"
INFO sleep for 6s action=syscall.ReadSensitiveFileUntrusted as=httpd
INFO action executed action=syscall.ReadSensitiveFileUntrusted as=httpd
INFO sleep for 100ms action=syscall.WriteBelowBinaryDir
INFO writing to /bin/created-by-event-generator action=syscall.WriteBelowBinaryDir
I didn’t actually need fake events running forever, so I reverted the helm chart back to defaults and applied again.
Job done
That’s actually the challenge complete. As you can see, Terraform and managed Kubernetes providers like DigitalOcean make it pretty easy to get up and running quickly.
Considering I’ve still got a month or so to go, I’ll probably try some of the other challenges, or extend this one a bit further. Getting Loki and Grafana going with Falco sounds cool 🤔