rh container event (cloud-native roadshow)

ops

lab

  • navigate to https://redhat.qwiklab.com/focuses/191 and login
  • select 'My Learning' then 'OpenShift for Ops Test Drive'
  • Click 'Start Lab' in the top right. Once the lab has been spun up the connection details will appear in the left pane.
  • The lab guide URL will also be shown.

presentation

oshift overview

  • hybrid scaling
    • from on-prem to cloud in mins
  • jenkins pipeline
    • servicenow rest api to 'tick box' before continuing
  • kubernetes
  • oci compatible container runtime (docker)
  • internal container repo in oshift (certified by rh)
  • up to 10x workload density compared to vms --??
  • ownership boundaries
    • dev
      • container
        • app
        • os dependencies
    • ops
      • container host
      • infra
  • container image layers
    • immutable images (kill and redeploy)
  • base image patching
    • oshift rebuilds all containers using image stream
      • source to image build
  • lightweight, oci-compliant container runtime (cri-o --??)
    • rhel on node (host) and container
      • pod = collection of containers
        • smallest unit of management in oshift
    • only oci-compliant are supported
  • masters (3x)
    • can lose all w/out affecting live traffic
    • rest api (servicenow to do oshift activities)
    • datastore
      • desired / current state
      • etcd db
        • one per master
        • sync'd across masters
        • ansible playbook bundles instead of backup (infra as code)
    • orchestration and scheduling
      • placement by policy
    • health/scaling - autoscaling pods
      • endpoints put in by devs
      • readiness probe
      • liveness probe
  • infra nodes
    • integrated container registry
  • persistent storage
    • glusterfs
  • service layer
  • routing layer
    • expose services externally
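  • sketch: pod autoscaling for the 'health/scaling' bullet above (a minimal example; the dc name 'mapit' and the thresholds are assumptions, not lab values)
      oc autoscale dc/mapit --min=1 --max=5 --cpu-percent=80
      oc get hpa    # shows the resulting HorizontalPodAutoscaler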

container storage

  • oshift persistent storage framework
    • PersistentVolumeClaim
      • submitted by dev
    • StorageClass
      • set up by ops
    • Storage Backend
    • PersistentVolume
      • mounted by pod
      • bound to PersistentVolumeClaim
  • glusterfs

    • (app) node labelled as container native storage
    • underlying storage: das, jbod
    • scale-out linearly
    • replicate sync and async
    • heketi - restful glusterfs management
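  • sketch: the persistent storage framework above end-to-end, assuming an ops-defined StorageClass named 'glusterfs-storage' and an app dc called 'myapp' (both names are assumptions)
      oc get storageclass                 # ops side: classes available to claim from
      oc volume dc/myapp --add --name=myapp-storage -t pvc --claim-class=glusterfs-storage \
        --claim-mode=ReadWriteMany --claim-size=1Gi --claim-name=myapp-storage --mount-path=/data
      oc get pvc myapp-storage            # oshift binds a PersistentVolume to the new claim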
  • subscription licensing

    • not required for master/infra
    • only for 'worker' nodes (app nodes)
    • based on number of vms or socket pairs
    • spotfleets??
    • cloudforms to manage subscriptions?

lab

  • environment
    • master x1
    • infra x1
    • app x6
    • idm x1 (ldap auth)
  • ssh into master node
  • using ansible playbooks for installing oshift
    • part of the openshift-ansible pkg
  • installer's config is /etc/ansible/hosts (docs)
    • general settings under [OSEv3:vars]
  • top level runbook triggers install of cluster
    • /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
  • requires 'super admin' account
  • cmds
  • web_console
  • prometheus
    • cluster infra monitoring and alerting
  • verify storage cluster
      export HEKETI_CLI_SERVER=http://heketi-storage-storage.apps.674462327352.aws.testdrive.openshift.com
      export HEKETI_CLI_USER=admin
      export HEKETI_CLI_KEY=myS3cr3tpassw0rd
      heketi-cli cluster list    # shows internal uuid of the cns cluster
      heketi-cli topology info
  • application management
    • create new project (bucket)
    • deploy new app (automatically created service)
    • view service yaml
    • scale app
    • delete pod
      • oshift redeploys in less than 10secs!
    • create route (expose service)
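    • sketch: the steps above as commands (the image path is an assumption; the 'mapit' app name matches the probe steps below)
        oc new-project app-management
        oc new-app docker.io/<repo>/mapit --name=mapit    # hypothetical image; creates a dc, pods and a service
        oc get service mapit -o yaml
        oc scale dc/mapit --replicas=2
        oc delete pod <pod-name>                          # the replicationcontroller re-creates it within seconds
        oc expose service mapit                           # creates the route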
  • application probes
    • liveness probe
    • readiness probe
    • check endpoint health curl mapit-app-management.apps.674462327352.aws.testdrive.openshift.com/health
    • probe endpoint for liveness (set probe) oc set probe dc/mapit --liveness --get-url=http://:8080/health --initial-delay-seconds=30
    • probe endpoint for readiness (set probe) oc set probe dc/mapit --readiness --get-url=http://:8080/health --initial-delay-seconds=30
    • confirm oc describe dc mapit
      • 'Containers' section
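    • sketch: another way to confirm both probes landed on the dc (the grep pattern is just illustrative)
        oc get dc/mapit -o yaml | grep -A6 -E 'livenessProbe|readinessProbe'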
  • add storage to app oc volume dc/mapit --add --name=mapit-storage -t pvc --claim-mode=ReadWriteMany --claim-size=1Gi --claim-name=mapit-storage --mount-path=/app-storage
    • storage now available at /app-storage inside the pod (log on via rsh to verify)
  • project request template, quota, limits
    • view default template
    • modify template cat /opt/lab/support/project_request_template.yaml
    • install new template
    • modify 'master-config.yaml' section 'projectRequestTemplate' sudo vim /etc/origin/master/master-config.yaml
    • restart master sudo systemctl restart atomic-openshift-master-api atomic-openshift-master-controllers
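    • sketch: the three template steps above as commands (the file path is arbitrary)
        oc adm create-bootstrap-project-template -o yaml > /tmp/project_request_template.yaml
        # add default ResourceQuota / LimitRange objects to the template, then install it
        oc create -f /tmp/project_request_template.yaml -n default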
  • groups
    • external auth providers
    • role based access control
    • login as normal user
      • no projects
    • login as 'fancyuser'
      • projects are shown
    • create 3x new projects (lifecycle)
      • 'ose-teamed-app' group: edit on dev and test, view on prod
      • 'ose-fancy-dev' group: edit on prod
    • login as teamed user to see 3x projects
      • create app in prod - fails!
    • prometheus
      • login as fancyuser1
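    • sketch: how the group/role bindings above are typically granted (the lifecycle project names are assumptions)
        oc adm policy add-role-to-group edit ose-teamed-app -n lifecycle-dev
        oc adm policy add-role-to-group edit ose-teamed-app -n lifecycle-test
        oc adm policy add-role-to-group view ose-teamed-app -n lifecycle-prod
        oc adm policy add-role-to-group edit ose-fancy-dev -n lifecycle-prod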
  • infrastructure management, metrics and logging
    • extending cluster
      • view app nodes
      • uncomment '#scaleup_' in '/etc/ansible/hosts'
      • use ansible to verify nodes are online ansible new_nodes -m ping
      • run playbook to extend cluster ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-node/scaleup.yml
    • multi master ha setup docs
    • container-native storage for infra
      • required by registry, logging, metrics
      • configure installer sudo sed -i 's/#cnsinfra_//g' /etc/ansible/hosts
      • install cns cluster for infra ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/config.yml
      • regular file storage service (glusterfs) not supported for logging/metrics
        • must use block storage (glusterblock)
      • metrics
        • based on hawkular in a cassandra db
        • configure installer
            sudo sed -i 's/#metrics_//g' /etc/ansible/hosts
            sudo sed -i '/openshift_metrics_install_metrics=false/d' /etc/ansible/hosts
        • run playbook to install metrics ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-metrics/config.yml
      • logging
        • using efk
          • elasticsearch (central store)
          • fluentd (log collection/consolidation)
          • kibana (ui)
        • configure installer
            sudo sed -i 's/#logging_//g' /etc/ansible/hosts
            sudo sed -i '/openshift_logging_install_logging=false/d' /etc/ansible/hosts
        • run playbook to install logging ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml
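      • sketch: quick check that both stacks came up (project names follow ocp 3.x defaults, so treat them as an assumption)
          oc get pods -n openshift-infra    # metrics: hawkular, cassandra, heapster
          oc get pods -n logging            # logging: elasticsearch, fluentd, kibana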
    • multitenant networking
      • sdn based on open vswitch
      • execute creation script bash /opt/lab/support/net-proj.sh
      • get ip of pod b bash /opt/lab/support/podbip.sh
      • export pod b ip export POD_B_IP=$(bash /opt/lab/support/podbip.sh)
      • get name of pod in netproj-a project and export as var
          oc get pods -n netproj-a
          export POD_A_NAME=ose-1-zccsx
      • execute ping in pod a try to reach pod b oc exec -n netproj-a $POD_A_NAME -- ping -c1 -W1 $POD_B_IP
        • fails because networks aren't connected
      • join networks
          oc get netnamespace
          oc adm pod-network join-projects netproj-a --to=netproj-b
          oc get netnamespace
        • network ids of two projs now the same
      • retest connectivity oc exec -n netproj-a $POD_A_NAME -- ping -c1 -W1 $POD_B_IP
      • isolate (unjoin) projects oc adm pod-network isolate-projects netproj-a
      • use 'NetworkPolicy' for finer grain
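      • sketch: confirm the isolation took effect (reuses the env vars exported above)
          oc get netnamespace netproj-a netproj-b                        # netids differ again
          oc exec -n netproj-a $POD_A_NAME -- ping -c1 -W1 $POD_B_IP     # expected to fail once isolated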
    • node maintenance
      • mark node as 'non-schedulable' then drain all pods on node
        • mark node02 as 'non-schedulable' oc adm manage-node node02.internal.aws.testdrive.openshift.com --schedulable=false
          • does not impact running pods
        • drain pods on node02 (dryrun first)
        • node now ready for maintenance (reboot etc)
        • add node back into oshift oc adm manage-node node02.internal.aws.testdrive.openshift.com --schedulable=true
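        • sketch: the drain step as commands (the ocp 3.x 'manage-node --evacuate' form; flags are an assumption, check 'oc adm manage-node -h')
            oc adm manage-node node02.internal.aws.testdrive.openshift.com --evacuate --dry-run
            oc adm manage-node node02.internal.aws.testdrive.openshift.com --evacuate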
    • oshift registry with cns
      • uses ephemeral storage in its pod
        • restarts or redeployments cause container images to be lost
      • add cns to registry
        • add volume
            oc volume dc/docker-registry --add --name=registry-storage -t pvc \
              --claim-mode=ReadWriteMany --claim-size=5Gi \
              --claim-name=registry-storage --claim-class=glusterfs-registry --overwrite
        • verify deploymentconfig oc get dc/docker-registry
        • scale registry oc scale dc/docker-registry --replicas=3
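      • sketch: confirm the scaled registry pods share the new volume (assumes pods from a dc carry the deploymentconfig=<name> label)
          oc get pods -n default -o wide -l deploymentconfig=docker-registry
          oc volume dc/docker-registry -n default    # lists registry-storage backed by the pvc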
  • container-native storage concepts
    • login as super admin in 'storage' oc login -u system:admin -n storage
    • view pods oc get pods -n storage -o wide
    • check service and route oc get service,route
    • perform health check on endpoint curl -w "\n" http://heketi-storage-storage.apps.674462327352.aws.testdrive.openshift.com/hello
    • login as 'fancyuser1' oc login -u fancyuser1 -p openshift
      • create new app oc new-project my-database-app
      • view template oc get template/rails-pgsql-persistent -n openshift
      • view pvc in template oc get template/rails-pgsql-persistent -n openshift -o yaml | grep PersistentVolumeClaim -A8
      • specify storage size oc new-app rails-pgsql-persistent -p VOLUME_CAPACITY=5Gi
      • get route oc get route
    • explore underlying cns
      • login as system admin
      • select 'my-database-app' proj oc project my-database-app
      • view pvc
      • export pvc name as var export PGSQL_PV_NAME=$(oc get pvc/postgresql -o jsonpath="{.spec.volumeName}" -n my-database-app)
      • describe pvc oc describe pv $PGSQL_PV_NAME
      • export glusterfs volume name export PGSQL_GLUSTER_VOLUME=$(oc get pv $PGSQL_PV_NAME -o jsonpath='{.spec.glusterfs.path}')
      • switch to storage project oc project storage
      • view glusterfs pods oc get pods -o wide -l glusterfs=storage-pod
      • store first glusterfs pod name and ip as vars
          export FIRST_GLUSTER_POD=$(oc get pods -o jsonpath='{.items[0].metadata.name}' -l glusterfs=storage-pod)
          export FIRST_GLUSTER_IP=$(oc get pods -o jsonpath='{.items[0].status.podIP}' -l glusterfs=storage-pod)
          echo $FIRST_GLUSTER_POD
          echo $FIRST_GLUSTER_IP
      • query gluster pod for volumes (rsh) oc rsh $FIRST_GLUSTER_POD gluster volume list
      • query for topology oc rsh $FIRST_GLUSTER_POD gluster volume info $PGSQL_GLUSTER_VOLUME
      • export brick dir path
          export PGSQL_GLUSTER_BRICK=$(echo -n $(oc rsh $FIRST_GLUSTER_POD gluster vol info $PGSQL_GLUSTER_VOLUME | grep $FIRST_GLUSTER_IP) | cut -d ':' -f 3 | tr -d $'\r')
          echo $PGSQL_GLUSTER_BRICK
      • look at brick dir oc rsh $FIRST_GLUSTER_POD ls -ahl $PGSQL_GLUSTER_BRICK
    • provide scalable, shared storage w/ cns
      • deploy file uploader app
          oc login -u fancyuser1 -p openshift
          oc new-project my-shared-storage
          oc new-app openshift/php:7.0~https://github.com/christianh814/openshift-php-upload-demo --name=file-uploader
        • view logs to wait for app to be deployed oc logs -f bc/file-uploader
      • expose app via route oc expose svc/file-uploader
      • scale up for ha oc scale --replicas=3 dc/file-uploader
      • upload file to app
      • view pods to find where file is located
          oc rsh file-uploader-1-k2v0d ls -hl uploaded
          oc rsh file-uploader-1-sz49r ls -hl uploaded
          oc rsh file-uploader-1-xjg9f ls -hl uploaded
      • create pvc
          oc volume dc/file-uploader --add --name=my-shared-storage \
            -t pvc --claim-mode=ReadWriteMany --claim-size=1Gi \
            --claim-name=my-shared-storage --mount-path=/opt/app-root/src/uploaded
      • refresh app (new nodes)
      • upload new file
      • view file across all nodes
      • increase vol capacity
        • fill up current cap
            oc rsh file-uploader-2-jd22b dd if=/dev/zero of=uploaded/bigfile bs=1M count=1000
            oc rsh file-uploader-2-jd22b df -h /opt/app-root/src/uploaded
        • edit pvc oc edit pvc my-shared-storage
          • edit storage size
          • oshift updates on exit from vi
        • confirm cap oc rsh file-uploader-2-jd22b df -h /opt/app-root/src/uploaded
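        • sketch: non-interactive alternative to 'oc edit' (assumes the storage class permits volume expansion; the 2Gi target is illustrative)
            oc patch pvc my-shared-storage -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'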
    • providing block storage with cns
      • block storage = iscsi lun
      • view host running elasticsearch oc get pod -l component=es -n logging -o wide
      • view running iscsi session over ssh ssh node05.internal.aws.testdrive.openshift.com sudo iscsiadm -m session
  • exposed services
    • look at 3scale for protection

oc commands

command                                           description
oc login -u system:admin                          login to oshift
oc get nodes                                      list of nodes
oc project <proj-name>                            change projects
oc describe statefulset prometheus                describe the prometheus 'StatefulSet'*
oc describe daemonset prometheus-node-exporter    describe the 'node-exporter' 'DaemonSet'*
oc get routes                                     show routes
oc new-project <proj-name>                        create project
oc new-app docker.io/repo/image                   deploy app

*'StatefulSet' is a special kubernetes resource that deals with containers that have various startup and other dependencies. A 'DaemonSet' is another special kubernetes resource; it makes sure that specified containers are running on certain nodes.

  • show pods oc get pods
  • pod information oc describe pod <pod-name>
  • show yaml output for pod oc get pod <pod-name> -o yaml
  • view pods on node oc adm manage-node <node-name> --list-pods
  • show services oc get services
  • service information oc describe service <service-name>
  • show yaml output for service oc get service <service-name> -o yaml
  • show deploymentconfig oc get dc
  • show replicationcontroller oc get rc
  • scale pods oc scale --replicas=2 dc/<label>
  • show endpoints for label oc get endpoints <label>
  • show router oc describe dc router -n default
  • set liveness probe oc set probe dc/<label> --liveness --get-url=http://:8080/health --initial-delay-seconds=30
  • set readiness probe oc set probe dc/<label> --readiness --get-url=http://:8080/health --initial-delay-seconds=30
  • log on to pod (remote shell) oc rsh <pod-name>
  • view default project request template oc adm create-bootstrap-project-template
  • create new default template oc create -f /file/path/to/template.yaml -n default
  • show quota for project oc get quota -n <proj-name>
  • show limitrange for project oc get limitrange -n <proj-name>
  • show groups (auth) oc get groups
  • execute groupsync sudo oc adm groups sync --sync-config=/path/to/config.yaml --confirm
  • show users oc get user
    • users created when first logged in
  • show nodes oc get nodes
  • view nodes by label (i.e. app nodes) oc get nodes -l region=apps
  • show storageclass oc get sc
  • show persistentvolumeclaim oc get pvc
  • show network namespaces oc get netnamespaces

ref