CONFIGURE AND USE CLUSTER DNS
Cluster DNS allows our applications to use hostnames to connect to other services inside the cluster. This is handled by a component called “KubeDNS”.
kubectl get pods -n kube-system --> to see the DNS pods
kubectl get pods
kubectl get services
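kubectl exec -it <pod-name> -- nslookup <hostname> --> to resolve a hostname from inside a pod
(The resolved name shows the .default namespace and the .local cluster domain)
kubectl exec -it busybox -- nslookup dns-target --> to look up the DNS target from inside the cluster
kubectl expose deployment dns-target --> to expose the “dns-target” service
kubectl logs -n <namespace> <pod-name> --> to see the logs of a pod
kubectl get svc -n kube-system --> to see the running DNS service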
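The names that nslookup resolves follow a predictable pattern. As a rough sketch (the helper name service_dns_names is hypothetical, and cluster.local is assumed as the cluster domain), the forms under which a service can be addressed can be listed like this:

```python
def service_dns_names(service, namespace="default", cluster_domain="cluster.local"):
    """List the DNS names for a Service, from shortest to fully qualified."""
    return [
        service,                                        # resolves from pods in the same namespace
        f"{service}.{namespace}",                       # resolves from any namespace
        f"{service}.{namespace}.svc",
        f"{service}.{namespace}.svc.{cluster_domain}",  # fully qualified name
    ]


for name in service_dns_names("dns-target"):
    print(name)
```

The short forms work because each pod's resolver is configured with search domains; the fully qualified form does not depend on them.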
CONTAINER NETWORK INTERFACE (CNI)
CNI
• All pods can communicate with all other pods.
• Each pod has its own IP address
• No need for mapping container ports
• Backward compatible model with VMs:
- Port allocation
- Naming
- Service Discovery
- Load balancing
- Application Configuration
- Migration
KUBERNETES NETWORK MODEL
Dynamic Port Allocation Problems:
• Every application must be configured to know which ports, etc.
• API services must inject dynamic port numbers into containers
• Services must be able to find one another or be configured to do so.
The Kubernetes Way:
• All containers can communicate with each other without NAT
• All nodes can communicate with all containers (& vice versa) without NAT
• The IP of a container is the same regardless of which container views it
• K8s applies IP addresses at the pod level
• “IP-per-Pod” – Containers in a pod share a single IP address, like processes in a VM
BUT HOW?
CNI – Container Network Interface
• Must be implemented as an executable invoked by the container management system (in our case, Kubernetes)
• Plugin is responsible for
- Inserting the network interface into the container network namespace
- Making necessary changes to the host
- Assigning an IP address to the interface
- Setting up routes consistent with IP address management
KUBELET
• Default network plugin
• Default cluster-wide network
• Probes for network plugins on startup
FLANNEL
• Simple and easy Layer 3 network fabric
• flanneld runs on each host (via a DaemonSet)
• Allocates a subnet lease to each host
• Stores network configuration, allocated subnets, and other data (in etcd or through the Kubernetes API)
• Packets forwarded using VXLAN
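The idea of a per-host subnet lease can be sketched with Python's ipaddress module. This is only an illustration, not flannel's code: the function name, the node names, and the 10.244.0.0/16 cluster network with /24 leases are all assumptions.

```python
import ipaddress


def allocate_subnet_leases(cluster_cidr, hosts, prefix_len=24):
    """Give each host the next free fixed-size subnet of the cluster network."""
    subnets = ipaddress.ip_network(cluster_cidr).subnets(new_prefix=prefix_len)
    leases = dict(zip(hosts, subnets))
    if len(leases) < len(hosts):
        raise ValueError("cluster network is too small for the number of hosts")
    return leases


# Hypothetical cluster network and node names.
leases = allocate_subnet_leases("10.244.0.0/16", ["node-1", "node-2", "node-3"])
for host, subnet in leases.items():
    print(host, subnet)
```

Pods on a host then get their addresses from that host's subnet, which is what lets every pod have a cluster-unique IP.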
CALICO
• Free and Open Source
• Simplified networking model
• Scalable, distributed control plane
• Policy-driven network security
• Uses overlay networks sparingly
• Widely deployed
• Can be run in policy enforcement mode
Others Worth Mentioning
- Cilium
- Contiv
- Contrail
- Multus
- NSX-T
- Nuage Networks VCS
- OpenVSwitch
- OVN
- Romana
- Weave Net
- CNI-Genie
Kubernetes requires its networking model to be implemented by a third-party plugin, called a CNI (Container Network Interface).
Different CNIs feature support for different hardware, software, overlay networks, policies, and features.
Administrators must select a CNI appropriate to their environment.
PERSISTENT VOLUMES
• Native Pod Storage is Ephemeral – Like a Pod
• What happens when a container crashes:
• Kubelet restarts it (a replacement pod may be scheduled on another node)
• File system is re-created from image
• Ephemeral files are gone
DOCKER VOLUMES
• Directory on disk
• Possibly in another container
• New Volume Drivers
KUBERNETES VOLUMES
• Same lifetime as its pod
• Data preserved across container restarts
• Pod goes away -> Volume goes away
• Directory with data
• Accessible to containers in a pod
• Implementation details determined by volume types
USING VOLUMES
• Pod spec indicates which volumes to provide for the pod (spec.volumes)
• Pod spec indicates where to mount these volumes in containers (spec.containers.volumeMounts)
• Seen from the container’s perspective as the file system
• Volumes cannot mount onto other volumes
• No hard links to other volumes
• Each pod must specify where each volume is mounted
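The relationship between spec.volumes and volumeMounts can be checked mechanically. The sketch below models a pod as a plain Python dictionary and reports mounts that reference a volume the pod never declares; the function name undeclared_mounts is hypothetical and this is not how Kubernetes itself validates a pod.

```python
def undeclared_mounts(pod):
    """Return (container, volume) pairs whose volumeMount has no spec.volumes entry."""
    declared = {volume["name"] for volume in pod["spec"].get("volumes", [])}
    missing = []
    for container in pod["spec"]["containers"]:
        for mount in container.get("volumeMounts", []):
            if mount["name"] not in declared:
                missing.append((container["name"], mount["name"]))
    return missing


pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "test-pd"},
    "spec": {
        "containers": [{
            "name": "test-container",
            "image": "k8s.gcr.io/test-webserver",
            "volumeMounts": [{"name": "cache-volume", "mountPath": "/cache"}],
        }],
        "volumes": [{"name": "cache-volume", "emptyDir": {}}],
    },
}
print(undeclared_mounts(pod))
```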
AWSELASTICBLOCKSTORE
• Mounts an AWS EBS volume to a pod
• EBS volume is preserved when unmounted
• Must be created prior to use
• Nodes must be AWS EC2 instances in the same region and availability zone as the EBS volume
• Single instance mounting only
Created via a command like:
aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2
AWS ELASTIC BLOCKSTORE EXAMPLE YAML
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: test-volume
  volumes:
  - name: test-volume
    # This AWS EBS volume must
    # already exist.
    awsElasticBlockStore:
      volumeID:
      fsType: ext4
AZUREDISK AND AZUREFILE
• An azureDisk is used to mount a Microsoft Azure Data Disk into a pod. An azureFile is used to mount a Microsoft Azure File Volume (SMB 2.1 and 3.0) into a pod
CEPHFS
● Allows mounting a CephFS volume to a pod
● Contents of volume are preserved when unmounted
● Must have a Ceph server running
● Share must be exported
CSI
● Container Storage Interface
● In-tree CSI volume plugin for volumes on the same node
● Kubernetes 1.9+
--feature-gates=CSIPersistentVolume=true
● Metadata fields specify what is used and how
● Driver fields specify the name of the driver
● volumeHandle identifies volume name
● readOnly is supported
DOWNWARDAPI
● Mounts a directory and writes data in plain text files
EMPTYDIR
● Created when a pod is assigned to a node
● Exists while pod runs on a particular node
● Initially empty
● Multiple containers can read/write same volume
● Volume can be mounted per container – same or different mount points
● Pod removed -> volume removed
● Stored on node’s local medium
● Optional - set emptyDir.medium = Memory for RAM based tmpfs
EMPTYDIR SAMPLE YAML
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}
FC (FIBRE CHANNEL)
● Allows an existing Fibre Channel volume to be mounted to a pod
● Single or Multiple Target using targetWWNs
● FC SAN Zoning must be allocated
● LUNs must be masked to target WWN
FLOCKER
● Open source clustered container data volume manager
● Management and orchestration of volumes
● Allows Flocker dataset to be mounted into a pod
● Must be created prior to mounting
● Can be transferred between pods
● Must have Flocker
GCEPERSISTENTDISK
● Mounts a GCE persistent disk to pod
● Data preserved if volume unmounted
● Nodes must be on GCE VMs
● Same project and zone as the persistent disk(s)
● Multiple concurrent mounts allowed, but read-only
● Create with a command like:
gcloud compute disks create --size=500GB --zone=us-central1-a my-k8s-disk
GCEPERSISTENTDISK SAMPLE YAML
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    # This GCE PD must already
    # exist.
    gcePersistentDisk:
      pdName: my-data-disk
      fsType: ext4
GITREPO
● Mounts emptyDir and clones a git repository
● YAML Sample:
…
volumes:
- name: git-volume
  gitRepo:
    repository: "git@somewhere:me/my-git-repository.git"
    revision: "22f1d8406d464b0c0874075539c1f2e96c253775"
GLUSTERFS
● Allows GlusterFS volume to be mounted to a pod
● Volume data preserved if volume is unmounted
● Multiple concurrent mounts – read/write – are allowed
● Must have GlusterFS
HOSTPATH
● Mounts file or directory from host node’s filesystem to a pod
● Field type - Empty string (for backward compatibility) performs no checks
● DirectoryOrCreate – Created if not present
● Directory – Directory must exist
● FileOrCreate – Created if not present
● File – File must exist
● Socket – Socket must exist
● CharDevice – Character device must exist
● BlockDevice – Block device must exist
HOSTPATH, WARNINGS!
● hostPath might behave differently on different nodes regardless of pod configuration
● Files and directories created on the host are only writable by root
hostPath Sample YAML
…
volumes:
- name: test-volume
  hostPath:
    # directory location on host
    path: /data
    # this field is optional
    type: Directory
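The checks behind the hostPath type field can be pictured with the following sketch. It mimics the listed behaviors on the local machine using Python's os and stat modules; the function name check_host_path is hypothetical, and the code is an illustration rather than the kubelet's implementation.

```python
import os
import stat
import tempfile


def check_host_path(path, path_type=""):
    """Return True if `path` satisfies the hostPath type, creating it when the type allows."""
    if path_type == "":
        return True                                   # empty string: no checks are performed
    if path_type == "DirectoryOrCreate":
        if not os.path.exists(path):
            os.makedirs(path, mode=0o755)
        return os.path.isdir(path)
    if path_type == "FileOrCreate":
        if not os.path.exists(path):
            open(path, "a").close()                   # create an empty file
        return os.path.isfile(path)
    checks = {
        "Directory": stat.S_ISDIR,
        "File": stat.S_ISREG,
        "Socket": stat.S_ISSOCK,
        "CharDevice": stat.S_ISCHR,
        "BlockDevice": stat.S_ISBLK,
    }
    if path_type not in checks:
        raise ValueError(f"unknown hostPath type: {path_type}")
    if not os.path.exists(path):
        return False                                  # these types require an existing path
    return checks[path_type](os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as base:
    data_dir = os.path.join(base, "data")
    print(check_host_path(data_dir, "Directory"))          # path does not exist yet
    print(check_host_path(data_dir, "DirectoryOrCreate"))  # created on demand
    print(check_host_path(data_dir, "Directory"))          # now it exists
```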
ISCSI
● Allows an existing iSCSI volume to be mounted to a pod
● Volume data preserved if volume is unmounted
● Must have an iSCSI provider
● Multiple read only concurrent connections allowed
● Only one writer at a time
LOCAL
● Alpha, requires “PersistentLocalVolumes” and “VolumeScheduling” feature gates
● Allows local mounted storage to be mounted to a pod
● Statically created PersistentVolume
● Kubernetes is aware of the volume’s node constraints
● Must be on the node
● Not suitable for all applications
● 1.9+ - Volume binding can be delayed until pod scheduling
NFS
● Allows existing NFS share to be mounted to a pod
● Volume data preserved if volume unmounted
● Must have an NFS server
PERSISTENT VOLUME CLAIM (PVC)
● Used to mount a PersistentVolume into a pod
● Users can stake a claim to durable storage without knowing implementation
PROJECTED VOLUMES: SOURCE TYPES THAT CAN CURRENTLY BE PROJECTED (SUBJECT TO EXPANSION!):
- secret
- downwardAPI
- configMap
PROJECTED SAMPLE YAML
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret
          items:
          - key: username
            path: my-group/my-username
      - downwardAPI:
          items:
          - path: "labels"
            fieldRef:
              fieldPath: metadata.labels
          - path: "cpu_limit"
            resourceFieldRef:
              containerName: container-test
              resource: limits.cpu
      - configMap:
          name: myconfigmap
          items:
          - key: config
            path: my-group/my-config
PORTWORXVOLUME
● Elastic block storage layer
● Storage on a server
● Capabilities tiers
● Aggregates capacity
● Runs in-guest in VMs or on bare metal Linux nodes
● Can be created dynamically or pre-provisioned
RBD
● Allows Rados Block Device volume to be mounted to a pod
● Volume data preserved when volume is unmounted
● Ceph cluster is required
● Multiple concurrent read-only connections allowed
● Single writer only
SCALEIO
● Software-based storage platform
● Allows ScaleIO volumes to be mounted to pods
● Must have existing ScaleIO cluster
● Volumes must be pre-created
SECRET
● Used to pass sensitive information to pods
● Stored using the Kubernetes API
● Mount secrets as files for use by pods
● Volumes are backed by tmpfs so secrets are never written to non-volatile storage
● Secrets must be created in Kubernetes API prior to use
STORAGEOS
● Allows existing StorageOS volume to be mounted to a pod
● Runs as a container in the K8s environment
● Data can be replicated
● Provisioning and compression can improve utilization and reduce cost
● Provides block storage to containers via file system
● Requires 64-bit Linux
● Free Developer License available!
● StorageOS container must run on each node that accesses StorageOS volumes and contributes capacity
VSPHEREVOLUME
● Kubernetes with vSphere Cloud Provider must be configured
● Used to mount vSphere VMDK to a pod
● Volume data is preserved when volume is unmounted
● Supports both VMFS and VSAN
● Can share subvolumes
FLEXVOLUME PLUGIN
● For when storage vendors create custom plugins without adding it to the K8s repo
● Enables users to mount vendor volumes to a pod
● Vendor plugin implemented using a driver
● Drivers must be installed in correct path on each node
MOUNT PROPAGATION
● Alpha feature as of Kubernetes 1.8
● Allows for sharing volumes mounted by one container with other containers in the same pod, or with other pods on the same node
● --feature-gates MountPropagation=true
● mountPropagation subfield:
- HostToContainer - Container gets subsequent mounts to this volume (default)
- Bidirectional - HostToContainer, plus host sees subsequent mounts made by container
● Might need this on a pod using FlexVolume driver
● Can be dangerous! (privileged containers only)
● Familiarity with Linux Kernel Behavior strongly recommended!
● Any volume mounts created by containers in pods must be unmounted by the containers upon termination.
VOLUMES AND THEIR ACCESS MODES
• PersistentVolume – API for users that abstracts implementation details of storage
• PersistentVolumeClaim – Method for users to claim durable storage regardless of implementation details
PERSISTENTVOLUME (PV)
• Provisioned storage in the cluster
• Cluster resource
• PVs are volume plugins (as discussed in the previous lesson) but have a lifecycle independent of any pod
• Volumes share the lifecycle of the pod; PersistentVolumes do not
• API object (YAML) details the implementation
PERSISTENTVOLUMECLAIM (PVC)
• Request for storage
• Pods consume node resources; PVCs consume PV resources.
• Pods can request specific CPU and memory; PVCs can request specific size and access modes.
PVs and PVCs
• Users and applications do not share identical requirements
• Administrators should offer a variety of PVs without users worrying about the implementation details
• PVs are cluster resources, PVCs are requests for the cluster resource
• PVCs also act as a “claim check” on a resource
• PVs and PVCs have a set lifecycle
• Provision
• Bind
• Reclaim
PROVISIONING
• Static
• Creates PVs
• In the K8S API and available for consumption
• Dynamic
• Used when none of the static PVs match the PVC
• Based on StorageClasses
• PVC must request a created and configured storage class
• Claims that request the empty class name ("") disable dynamic provisioning for themselves
To enable dynamic storage provisioning, DefaultStorageClass admission controller on the API server must be enabled.
BINDING
● User creates PVC
● Master watches for new PVCs and matches them to PVs
● Master binds PVC to PV
● The bound volume may be larger than what was requested
● Binds are exclusive
● PVC -> PV mapping is always 1:1
● Claims not matched will remain unbound indefinitely
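The matching step can be sketched as follows. This is a simplified model, not the actual controller: the dictionary keys, the helper name find_matching_pv, and the sample volumes are assumptions, sizes are plain integers in Gi, and storage classes and selectors are ignored.

```python
def find_matching_pv(claim, volumes):
    """Pick the smallest unbound PV that satisfies the claim, or None if nothing fits."""
    candidates = [
        pv for pv in volumes
        if pv["claim"] is None                                     # binds are exclusive
        and pv["capacity_gi"] >= claim["request_gi"]               # volume may exceed the request
        and set(claim["access_modes"]) <= set(pv["access_modes"])  # requested modes are supported
    ]
    return min(candidates, key=lambda pv: pv["capacity_gi"], default=None)


volumes = [
    {"name": "pv-small", "capacity_gi": 1, "access_modes": ["ReadWriteOnce"], "claim": None},
    {"name": "pv-nfs", "capacity_gi": 5,
     "access_modes": ["ReadWriteOnce", "ReadWriteMany"], "claim": None},
]
claim = {"name": "nfs-pvc", "request_gi": 1, "access_modes": ["ReadWriteMany"]}

pv = find_matching_pv(claim, volumes)
if pv is not None:
    pv["claim"] = claim["name"]          # record the 1:1 binding
print(pv["name"] if pv else "claim stays unbound")
```

Here the 1Gi claim ends up on the 5Gi volume because it is the only one offering ReadWriteMany, which illustrates why a bound volume can be larger than the request.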
POD USES VOLUME
● Pods treat PVCs as volumes
● Cluster checks claim, mounts appropriate volume to pod
PERSISTENT VOLUME CLAIM PROTECTION
● Alpha feature as of K8s 1.9
● Ensures PVCs actively in use do not get removed from the system
● PVC considered active when:
- The pod status is Pending and the pod is assigned to a node
- The pod status is Running
● If a user deletes a PVC in use, removal is postponed until PVC is not in use by any pod
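The "in use" rule above can be written down as a small predicate. The pod records here are simplified dictionaries with hypothetical keys (phase, node, claims), so treat this as a sketch of the rule rather than the real protection logic.

```python
def pvc_in_use(pvc_name, pods):
    """Return True if any active pod references the claim."""
    for pod in pods:
        active = pod["phase"] == "Running" or (
            pod["phase"] == "Pending" and pod.get("node") is not None
        )
        if active and pvc_name in pod.get("claims", []):
            return True
    return False


pods = [
    {"name": "nfs-pod", "phase": "Running", "node": "node-1", "claims": ["nfs-pvc"]},
    {"name": "waiting", "phase": "Pending", "node": None, "claims": ["other-pvc"]},
]
print(pvc_in_use("nfs-pvc", pods))    # referenced by a running pod
print(pvc_in_use("other-pvc", pods))  # only referenced by an unscheduled pod
```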
RECLAIMING
● User can delete PVC objects
● Reclaim policy for a PV tells the cluster what to do with the volume
● Policies:
- Retain
- Recycle
- Delete
● The Retain policy allows for manual reclamation of a resource:
- PVC deleted
- PV still exists; volume is “released”
- Not yet available because the data is still present
- Admin can manually reclaim the volume
RECLAIMING
● Administrator can configure custom recycler pod template
● Must contain a volumes specification
● For volume plugins that support the Delete reclaim policy, deletion:
- Removes the PersistentVolume object from K8s
- Removes the associated storage asset in the external infrastructure
● Dynamically provisioned volumes inherit the reclaim policy of their StorageClass
● Administrator should configure StorageClass according to expectations
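What happens to a volume once its claim is deleted can be summarized per policy. The sketch below uses a plain dictionary for the PV; the function name reclaim and the keys are hypothetical, and the real cluster performs these steps asynchronously.

```python
def reclaim(pv):
    """Apply the PV's reclaim policy after its claim is deleted; return the PV or None."""
    policy = pv["reclaim_policy"]
    if policy == "Retain":
        return {**pv, "phase": "Released"}                 # data kept; admin reclaims manually
    if policy == "Recycle":
        return {**pv, "phase": "Available", "data": None}  # scrubbed, offered to new claims
    if policy == "Delete":
        return None                                        # PV object and backing storage removed
    raise ValueError(f"unknown reclaim policy: {policy}")


pv = {"name": "lapv", "reclaim_policy": "Recycle", "phase": "Bound", "data": "old files"}
print(reclaim(pv))
```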
CAPACITY
● PVs have a specific storage capacity
● Set using “capacity” attribute
● Storage size is the only resource that can be set or requested
● Future attribute plans:
- IOPS
- Throughput
● Prior to K8s v1.9, default behavior for all volume plugins was to create a filesystem
● As of 1.9, however, user can specify volumeMode
- Raw Block Devices – “Block”
- File systems – “Filesystem” (default)
ACCESS MODES
● Must be supported by storage resource provider
● ReadWriteOnce – Can be mounted as read/write by one node only (RWO)
● ReadOnlyMany – Can be mounted read-only by many nodes (ROX)
● ReadWriteMany – Can be mounted read/write by many nodes (RWX)
A volume can only be mounted using one access mode at a time, regardless of the modes that are supported.
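The three modes can be read as a rule about which nodes may mount a volume. A minimal sketch, assuming the set of nodes that already mount the volume is known (the function name can_mount and its parameters are hypothetical):

```python
def can_mount(mode, mounted_nodes, node, read_only=False):
    """Decide whether `node` may mount a volume given the access mode in use."""
    if mode == "ReadWriteOnce":
        return not mounted_nodes or mounted_nodes == {node}  # one node only
    if mode == "ReadOnlyMany":
        return read_only                                     # many nodes, read-only mounts
    if mode == "ReadWriteMany":
        return True                                          # many nodes, read/write
    raise ValueError(f"unknown access mode: {mode}")


print(can_mount("ReadWriteOnce", {"node-1"}, "node-2"))
print(can_mount("ReadWriteMany", {"node-1"}, "node-2"))
```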
CONCLUSION
● PVC vs PV
● Storage object lifecycle
● Access modes
APPLICATIONS AND PERSISTENT STORAGE
Connect to server – Install nfs-kernel
$ sudo apt update
$ sudo apt install nfs-kernel-server
$ sudo mkdir /var/nfs/general -p
$ ls -la /var/nfs/general
$ sudo chown nobody:nogroup /var/nfs/general
$ ls -la /var/nfs/general
$ sudo vi /etc/exports --> add an export entry for /var/nfs/general listing the client IPs
$ sudo systemctl restart nfs-kernel-server
(Repeat the same steps on both the master node and the worker node)
Sample persistent volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: lapv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /var/nfs/general
    server:
    readOnly: false
$ kubectl create -f PV.yaml
$ kubectl get pv --> gives the info about the PV
PVC.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
$ kubectl create -f PVC.yaml
$ kubectl get pvc
nfs-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-pod
  labels:
    name: nfs-pod
spec:
  containers:
  - name: nfs-ctn
    image: busybox
    command:
    - sleep
    - "3600"
    volumeMounts:
    - name: nfsvol
      mountPath: /tmp
  restartPolicy: Always
  securityContext:
    fsGroup: 65534
    runAsUser: 65534
  volumes:
  - name: nfsvol
    persistentVolumeClaim:
      claimName: nfs-pvc
$ kubectl create -f nfs-pod.yaml
$ kubectl get pods
AUTHENTICATION AND AUTHORIZATION
• Transport Layer Security established
• Authentication (authenticator modules)
• Authorization
- ABAC
- RBAC
- Webhook
• Admission control modules
KUBELET AUTHENTICATION AND AUTHORIZATION
• The Kubelet’s endpoint exposes APIs which give access to data of varying sensitivity
How to authenticate and authorize access:
• By default, requests that are not rejected by another configured authentication method are treated as anonymous requests
• Once a request is authenticated, it is then authorized:
• Default is AlwaysAllow
• Might want to subdivide access because:
- Anonymous auth enabled, but anonymous users should be limited
- Bearer token auth enabled, but some service accounts should be limited
- Client certificate auth enabled, but only some that are signed should be allowed
KUBERNETES NETWORK POLICIES
• Specification of how groups of pods may communicate
• Use labels to select pods and define rules
• Implemented by the network plugin
• Pods are non-isolated by default
• Pods are isolated when a Network Policy selects them
KUBERNETES NETWORK POLICIES EXAMPLE
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - ipBlock:
        cidr: 172.17.0.0/16
        except:
        - 172.17.1.0/24
    - namespaceSelector:
        matchLabels:
          project: myproject
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/24
    ports:
    - protocol: TCP
      port: 5978
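To see how the ingress rule of this example combines its parts, here is a small evaluation sketch. It hard-codes the values from test-network-policy; the function names and the shape of the source dictionary are hypothetical, and real enforcement is done by the network plugin, not by code like this.

```python
import ipaddress


def labels_match(selector, labels):
    """True when every key/value in the selector's matchLabels is present in `labels`."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(source, port, protocol="TCP"):
    """Evaluate the ingress rule of test-network-policy for one incoming connection."""
    if (protocol, port) != ("TCP", 6379):
        return False                                     # only TCP 6379 is opened
    ip = ipaddress.ip_address(source["ip"])
    in_block = (ip in ipaddress.ip_network("172.17.0.0/16")
                and ip not in ipaddress.ip_network("172.17.1.0/24"))
    from_namespace = labels_match({"project": "myproject"},
                                  source.get("namespace_labels", {}))
    from_pod = (source.get("namespace") == "default"     # bare podSelector: policy's namespace
                and labels_match({"role": "frontend"}, source.get("pod_labels", {})))
    return in_block or from_namespace or from_pod        # the "from" entries are OR'ed


print(ingress_allowed({"ip": "172.17.0.5"}, 6379))
print(ingress_allowed({"ip": "172.17.1.5"}, 6379))
```

The entries under from are alternatives, while the ports list must match in addition to one of them.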
EXAMPLE DEFAULT ISOLATION POLICY
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EXPLICITLY ALLOW ALL TRAFFIC SAMPLE YAML
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all
spec:
  podSelector: {}
  ingress:
  - {}
EXPLICITLY DENY OUTGOING TRAFFIC SAMPLE YAML
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Egress
EXPLICITLY ALLOW ALL OUTGOING TRAFFIC SAMPLE YAML
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all
spec:
  podSelector: {}
  egress:
  - {}
EXPLICITLY STOP ALL TRAFFIC SAMPLE YAML
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
CONCLUSION
• Specification of how groups of pods may communicate
• Use labels to select pods and define rules
• Implemented by the network plugin
• Pods are non-isolated by default
• Pods are isolated when a Network Policy selects them
TLS CERTIFICATES FOR CLUSTER COMPONENTS
Nodes require client certificates in order to maintain ...
CERTIFICATE CREATION
$ cd k8s/easy-rsa-master/easyrsa3/
$ ./easyrsa init-pki --> initialize the tool
$ ./easyrsa --batch "--req-cn=${MASTER_IP}@`date +%s`" build-ca nopass
(generate the certificate authority)
Next, list all the names the API server is reached by and build the server certificate:
$ ./easyrsa --subject-alt-name="IP:${MASTER_IP},"\
"DNS:kubernetes,"\
"DNS:kubernetes.default,"\
"DNS:kubernetes.default.svc,"\
"DNS:kubernetes.default.svc.cluster,"\
"DNS:kubernetes.default.svc.cluster.local" \
--days=10000 \
build-server-full server nopass
To check whether MASTER_IP is set --> echo $MASTER_IP
$ cd pki
$ ls -la --> we can see the ca.crt file
$ ps aux | grep apiserver
(to see the API server process and its settings)
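The value passed to --subject-alt-name is just a comma-separated list, so it can be assembled programmatically. A small sketch, assuming cluster.local as the cluster domain; the function name and the 10.0.0.1 address are hypothetical placeholders:

```python
def subject_alt_names(master_ip, cluster_domain="cluster.local"):
    """Build the comma-separated SAN list for the API server certificate."""
    name = "kubernetes.default.svc"
    names = [f"IP:{master_ip}", "DNS:kubernetes", "DNS:kubernetes.default", f"DNS:{name}"]
    for label in cluster_domain.split("."):
        name = f"{name}.{label}"          # extend the name one domain label at a time
        names.append(f"DNS:{name}")
    return ",".join(names)


# 10.0.0.1 stands in for the real master IP.
print(subject_alt_names("10.0.0.1"))
```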
SECURING IMAGES
• User containers could contain vulnerabilities
• Continuous security vulnerability scanning
• Check for outdated containers
• Known vulnerabilities
• New vulnerabilities are published every day!
• Regularly apply security updates to your environment
• Don’t run tools like apt-get update in containers
• Use rolling updates! (See previous lesson!)
• Ensure only authorized images are used in your environment
• Use private registries to store approved images
• CI Pipeline should ensure that only vetted code is used for building images
DEFINING SECURITY CONTEXTS
sample security.yaml
apiVersion: v1
kind: Pod
metadata:
  name: security-context-pod
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  volumes:
  - name: sam-vol
    emptyDir: {}
  containers:
  - name: sample-container
    image: gcr.io/google-samples/node-hello:1.0
    volumeMounts:
    - name: sam-vol
      mountPath: /data/demo
    securityContext:
      allowPrivilegeEscalation: false
$ kubectl create -f security.yaml
$ kubectl get pods --> to check the pod
$ kubectl exec -it security-context-pod -- sh
$ ps aux --> see the processes running in the container
$ cd /data --> go to the /data directory
$ cd demo
$ echo LinuxAcademy > test --> create the test file
$ kubectl delete -f security.yaml
TROUBLESHOOTING IN KUBERNETES
$ kubectl get nodes --> info about nodes
$ kubectl get nodes -o wide --> more info about nodes
$ kubectl describe node --> complete description of a node
$ kubectl get pods --> info about pods
$ kubectl logs counter --> to see the logs of the pod named counter