rh container event (cloud-native roadshow)
ops
- wifi:
- Ballsbridge Hotel
- beourguest2018
- https://redhat.qwiklab.com/focuses/191
- lab details
- host master.674462327352.aws.testdrive.openshift.com
- user cloud-user
- pass qwikLABS-L64-23179
- docs http://support.674462327352.aws.testdrive.openshift.com/
- feedback
additional credits: testdrivetokens@redhat.com
idea: run the cloudforms/oshift web console on oshift (app nodes), with a reverse proxy for the bastion
lab
- navigate to https://redhat.qwiklab.com/focuses/191 and login
- user: dudley.burrows@ward.ie
- pass: reachfortheclouds
- select 'My Learning' then 'OpenShift for Ops Test Drive'
- Click 'Start Lab' in the top right. Once the lab has been spun up the connection details will appear in the left pane.
- The lab guide URL will also be shown.
presentation
oshift overview
- hybrid scaling
- from on-prem to cloud in mins
- jenkins pipeline
- servicenow rest api to 'tick box' before continuing
- kubernetes
- oci compatible container runtime (docker)
- internal container repo in oshift (certified by rh)
- 10x workload density compared to vms --??
- ownership boundaries
- dev
- container
- app
- os dependencies
- ops
- container host
- infra
- container image layers
- immutable images (kill and redeploy)
- base image patching
- oshift rebuilds all containers using image stream
- source to image build
- oshift rebuilds all containers using image stream
- lightweight, oci-compliant container runtime (cri-o --??)
- rhel on node (host) and container
- pod = collection of containers
- smallest unit of management in oshift
- only oci-compliant are supported
- masters (3x)
- can lose all w/out affecting live traffic
- rest api (servicenow to do oshift activities)
- datastore
- desired / current state
- etcd db
- one per master
- sync'd across masters
- ansible playbook bundles instead of backup (infra as code)
- orchestration and scheduling
- placement by policy
- health/scaling - autoscaling pods
- endpoints put in by devs
- readiness probe
- liveness probe
- infra nodes
- integrated container registry
- persistent storage
- glusterfs
- service layer
- routing layer
- expose services externally
container storage
- oshift persistent storage framework
- PersistentVolumeClaim
- submitted by dev
- StorageClass
- set up by ops
- Storage Backend
- PersistentVolume
- mounted by pod
- bound to PersistentVolumeClaim
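- a minimal sketch of the flow above, assuming a hypothetical claim name (my-claim) and storage class name (glusterfs-storage):
cat <<EOF | oc create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim                      # submitted by the dev
spec:
  storageClassName: glusterfs-storage # StorageClass set up by ops
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
oc get pvc my-claim                   # shows the bound PersistentVolume provisioned from the storage backend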
glusterfs
- (app) node labelled as container native storage
- underlying storage: das, jbod
- scale-out linearly
- replicate sync and async
- heketi - restful glusterfs management
subscription licensing
- not required for master/infra
- only for 'worker' nodes (app nodes)
- based on number of vms or socket pairs
- spotfleets??
- cloudforms to manage subscriptions?
lab
- environment
- master x1
- infra x1
- app x6
- idm x1 (ldap auth)
- ssh into master node
- using ansible playbooks for installing oshift
- part of the openshift-ansible pkg
- installer's config: /etc/ansible/hosts (docs)
- general settings under [OSEv3:vars]
- top-level playbook triggers install of cluster
/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
- requires 'super admin' account
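- a minimal sketch of driving the installer, assuming the lab inventory at /etc/ansible/hosts:
grep -A3 '\[OSEv3:vars\]' /etc/ansible/hosts   # peek at the general settings section
ansible-playbook -i /etc/ansible/hosts \
  /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml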
- cmds
- web_console
prometheus
- cluster infra monitoring and alerting
- verify storage cluster
export HEKETI_CLI_SERVER=http://heketi-storage-storage.apps.674462327352.aws.testdrive.openshift.com
export HEKETI_CLI_USER=admin
export HEKETI_CLI_KEY=myS3cr3tpassw0rd
heketi-cli cluster list   # shows internal uuid of cns cluster
heketi-cli topology info
- application management
- create new project (bucket)
- deploy new app (automatically created service)
- view service yaml
- scale app
- delete pod
- oshift redeploys in less than 10secs!
- create route (expose service)
application probes
- liveness probe
- readiness probe
- check endpoint health
curl mapit-app-management.apps.674462327352.aws.testdrive.openshift.com/health
- probe endpoint for liveness (set probe)
oc set probe dc/mapit --liveness --get-url=http://:8080/health --initial-delay-seconds=30
- probe endpoint for readiness (set probe)
oc set probe dc/mapit --readiness --get-url=http://:8080/health --initial-delay-seconds=30
- confirm
oc describe dc mapit
- 'Containers' section
- add storage to app
oc volume dc/mapit --add --name=mapit-storage -t pvc --claim-mode=ReadWriteMany --claim-size=1Gi --claim-name=mapit-storage --mount-path=/app-storage
- storage now available at /app-storage inside the pod (log on via oc rsh) - see the verification sketch below
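- quick verification sketch (assumes the default 'deploymentconfig=mapit' pod label):
MAPIT_POD=$(oc get pods -l deploymentconfig=mapit -o jsonpath='{.items[0].metadata.name}')
oc rsh $MAPIT_POD df -h /app-storage   # the gluster volume should be mounted here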
- project request template, quota, limits
- view default template
- modify template
cat /opt/lab/support/project_request_template.yaml
- new sections:
- install new template
- modify 'master-config.yaml' section 'projectRequestTemplate'
sudo vim /etc/origin/master/master-config.yaml
- restart master
sudo systemctl restart atomic-openshift-master-api atomic-openshift-master-controllers
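- sketch of wiring in the modified template (template name 'project-request' and namespace 'default' are the usual defaults, assumed here):
oc create -f /opt/lab/support/project_request_template.yaml -n default
# then point the master at it in /etc/origin/master/master-config.yaml:
#   projectConfig:
#     projectRequestTemplate: "default/project-request"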
- groups
- external auth providers
- role based access control
- login as normal user
- no projects
- login as 'fancyuser'
- projects are shown
- create 3x new projects (lifecycle)
- ose-teamed-app edit dev and test, view prod
- ose-fancy-dev edit prod
- log in as teamed user to see 3x projects
- create app in prod - fails!
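- the grants above could be created roughly like this (sketch; project names dev/test/prod assumed from the 'lifecycle' projects):
oc adm policy add-role-to-group edit ose-teamed-app -n dev
oc adm policy add-role-to-group edit ose-teamed-app -n test
oc adm policy add-role-to-group view ose-teamed-app -n prod
oc adm policy add-role-to-group edit ose-fancy-dev -n prod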
- prometheus
- login as fancyuser1
- infrastructure management, metrics and logging
- extending cluster
- view app nodes
- uncomment '#scaleup_' in '/etc/ansible/hosts'
- use ansible to verify nodes are online
ansible new_nodes -m ping
- run playbook to extend cluster
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-node/scaleup.yml
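- confirm the new nodes joined (sketch; 'region=apps' is the node label used elsewhere in this lab):
oc get nodes -l region=apps
ansible new_nodes -m shell -a 'systemctl is-active atomic-openshift-node'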
- multi master ha setup docs
- container-native storage for infra
- required by registry, logging, metrics
- configure installer
sudo sed -i 's/#cnsinfra_//g' /etc/ansible/hosts
- install cns cluster for infra
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/config.yml
- regular file storage service (glusterfs) not supported for logging/metrics
- must use block storage (glusterblock)
- metrics
- based on hawkular in a cassandra db
- configure installer
sudo sed -i 's/#metrics_//g' /etc/ansible/hosts
sudo sed -i '/openshift_metrics_install_metrics=false/d' /etc/ansible/hosts
- run playbook to install metrics
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-metrics/config.yml
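- verify the metrics pods came up (sketch; in oshift 3.x the metrics components run in the openshift-infra project):
oc get pods -n openshift-infra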
- logging
- using efk
- elasticsearch (central log store)
- fluentd (log collection/consolidation)
- kibana (ui)
- configure installer
sudo sed -i 's/#logging_//g' /etc/ansible/hosts
sudo sed -i '/openshift_logging_install_logging=false/d' /etc/ansible/hosts
- run playbook to install logging
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml
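- verify the efk stack (sketch; the playbook deploys the components into the 'logging' project):
oc get pods -n logging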
- multitenant networking
- sdn based on open vswitch
- execute creation script
bash /opt/lab/support/net-proj.sh
- get ip of pod b
bash /opt/lab/support/podbip.sh
- export pod b ip
export POD_B_IP=$(bash /opt/lab/support/podbip.sh)
- get name of pod in netproj-a project and export as var
oc get pods -n netproj-a
export POD_A_NAME=ose-1-zccsx
- execute ping in pod a to try to reach pod b
oc exec -n netproj-a $POD_A_NAME -- ping -c1 -W1 $POD_B_IP
- fails because networks aren't connected
- join networks
oc get netnamespace
oc adm pod-network join-projects netproj-a --to=netproj-b
oc get netnamespace
- network ids of two projs now the same
- retest connectivity
oc exec -n netproj-a $POD_A_NAME -- ping -c1 -W1 $POD_B_IP
- isolate (unjoin) projects
oc adm pod-network isolate-projects netproj-a
- use 'NetworkPolicy' for finer grain
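- sketch of a 'NetworkPolicy' allowing the same pod-a -> pod-b traffic without joining whole project networks (assumes the netproj-a namespace carries a 'name=netproj-a' label):
cat <<EOF | oc create -n netproj-b -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-netproj-a
spec:
  podSelector: {}                  # applies to all pods in netproj-b
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: netproj-a
EOF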
- node maintenance
- mark node as 'non-schedulable' then drain all pods on node
- mark node02 as 'non-schedulable'
oc adm manage-node node02.internal.aws.testdrive.openshift.com --schedulable=false
- does not impact running pods
- drain pods on node02 (dryrun first)
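- a possible drain invocation (sketch; flags follow the standard oc adm drain syntax):
oc adm drain node02.internal.aws.testdrive.openshift.com --ignore-daemonsets --delete-local-data --dry-run
oc adm drain node02.internal.aws.testdrive.openshift.com --ignore-daemonsets --delete-local-data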
- node now ready for maintenance (reboot etc)
- add node back into oshift
oc adm manage-node node02.internal.aws.testdrive.openshift.com --schedulable=true
- oshift registry with cns
- uses ephemeral storage in its pod
- restarts or redeployments cause container images lost
- add cns to registry
- add volume
oc volume dc/docker-registry --add --name=registry-storage -t pvc \
  --claim-mode=ReadWriteMany --claim-size=5Gi \
  --claim-name=registry-storage --claim-class=glusterfs-registry --overwrite
- verify deploymentconfig
oc get dc/docker-registry
- scale registry
oc scale dc/docker-registry --replicas=3
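- verify the replicas and the attached volume (sketch):
oc get pods -n default -l deploymentconfig=docker-registry -o wide
oc volume dc/docker-registry -n default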
- container-native storage concepts
- login as super admin in 'storage'
oc login -u system:admin -n storage
- view pods
oc get pods -n storage -o wide
- check service and route
oc get service,route
- perform health check on endpoint
curl -w "\n" http://heketi-storage-storage.apps.674462327352.aws.testdrive.openshift.com/hello
- login as 'fancyuser1'
oc login -u fancyuser1 -p openshift
- create new app
oc new-project my-database-app
- view template
oc get template/rails-pgsql-persistent -n openshift
- view pvc in template
oc get template/rails-pgsql-persistent -n openshift -o yaml | grep PersistentVolumeClaim -A8
- specify storage size
oc new-app rails-pgsql-persistent -p VOLUME_CAPACITY=5Gi
- get route
oc get route
- create new app
- explore underlying cns
- login as system admin
- select 'my-database-app' proj
oc project my-database-app
- view pvc
- export pvc name as var
export PGSQL_PV_NAME=$(oc get pvc/postgresql -o jsonpath="{.spec.volumeName}" -n my-database-app)
- describe pvc
oc describe pv $PGSQL_PV_NAME
- export glusterfs volume name
export PGSQL_GLUSTER_VOLUME=$(oc get pv $PGSQL_PV_NAME -o jsonpath='{.spec.glusterfs.path}')
- switch to storage project
oc project storage
- view glusterfs pods
oc get pods -o wide -l glusterfs=storage-pod
- store first glusterfs pod name and ip as vars
export FIRST_GLUSTER_POD=$(oc get pods -o jsonpath='{.items[0].metadata.name}' -l glusterfs=storage-pod)
export FIRST_GLUSTER_IP=$(oc get pods -o jsonpath='{.items[0].status.podIP}' -l glusterfs=storage-pod)
echo $FIRST_GLUSTER_POD
echo $FIRST_GLUSTER_IP
- query gluster pod for volumes (rsh)
oc rsh $FIRST_GLUSTER_POD gluster volume list
- query for topology
oc rsh $FIRST_GLUSTER_POD gluster volume info $PGSQL_GLUSTER_VOLUME
- export brick dir path
export PGSQL_GLUSTER_BRICK=$(echo -n $(oc rsh $FIRST_GLUSTER_POD gluster vol info $PGSQL_GLUSTER_VOLUME | grep $FIRST_GLUSTER_IP) | cut -d ':' -f 3 | tr -d $'\r')
echo $PGSQL_GLUSTER_BRICK
- look at brick dir
oc rsh $FIRST_GLUSTER_POD ls -ahl $PGSQL_GLUSTER_BRICK
- provide scalable, shared storage w/ cns
- deploy file uploader app
oc login -u fancyuser1 -p openshift
oc new-project my-shared-storage
oc new-app openshift/php:7.0~https://github.com/christianh814/openshift-php-upload-demo --name=file-uploader
- view logs to wait for app to be deployed
oc logs -f bc/file-uploader
- expose app via route
oc expose svc/file-uploader
- scale up for ha
oc scale --replicas=3 dc/file-uploader
- upload file to app
- view pods to find where file is located
oc rsh file-uploader-1-k2v0d ls -hl uploaded
oc rsh file-uploader-1-sz49r ls -hl uploaded
oc rsh file-uploader-1-xjg9f ls -hl uploaded
- create pvc
oc volume dc/file-uploader --add --name=my-shared-storage \
  -t pvc --claim-mode=ReadWriteMany --claim-size=1Gi \
  --claim-name=my-shared-storage --mount-path=/opt/app-root/src/uploaded
- refresh app (new pods are rolled out)
- upload new file
- view file from all pods
- increase vol capacity
- fill up current cap
oc rsh file-uploader-2-jd22b dd if=/dev/zero of=uploaded/bigfile bs=1M count=1000
oc rsh file-uploader-2-jd22b df -h /opt/app-root/src/uploaded
- edit pvc
oc edit pvc my-shared-storage
- edit storage size
- oshift updates on exit from vi
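- non-interactive alternative to the vi edit (sketch; assumes the storage class permits volume expansion):
oc patch pvc my-shared-storage -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'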
- confirm cap
oc rsh file-uploader-2-jd22b df -h /opt/app-root/src/uploaded
- providing block storage with cns
- block storage = iscsi lun
- view host running elasticsearch
oc get pod -l component=es -n logging -o wide
- view running iscsi session over ssh
ssh node05.internal.aws.testdrive.openshift.com sudo iscsiadm -m session
- exposed services
- look at 3scale for protection
oc commands
command | description
---|---
oc login -u system:admin | login to oshift
oc get nodes | list of nodes
oc project <proj-name> | change projects
oc describe statefulset prometheus | describe 'StatefulSet'*
oc describe daemonset prometheus-node-exporter | 'node-exporter' 'daemonset'*
oc get routes | show routes
oc new-project <proj-name> | create project
oc new-app docker.io/repo/image | deploy app
* 'StatefulSet' is a special kubernetes resource that deals with containers that have various startup and other dependencies; a 'DaemonSet' is another special kubernetes resource that makes sure specified containers are running on certain nodes.
- show pods
oc get pods
- pod information
oc describe pod <pod-name>
- show yaml output for pod
oc get pod <pod-name> -o yaml
- view pods on node
oc adm manage-node <node-name> --list-pods
- show services
oc get services
- service information
oc describe service <service-name>
- show yaml output for service
oc get service <service-name> -o yaml
- show deploymentconfig
oc get dc
- show replicationcontroller
oc get rc
- scale pods
oc scale --replicas=2 dc/<label>
- show endpoints for label
oc get endpoints <label>
- show router
oc describe dc router -n default
- set liveness probe
oc set probe dc/<label> --liveness --get-url=http://:8080/health --initial-delay-seconds=30
- set readiness probe
oc set probe dc/<label> --readiness --get-url=http://:8080/health --initial-delay-seconds=30
- log on to pod (remote shell)
oc rsh <pod-name>
- view default project request template
oc adm create-bootstrap-project-template
- create new default template
oc create -f /file/path/to/template.yaml -n default
- show quota for project
oc get quota -n <proj-name>
- show limitrange for project
oc get limitrange -n <proj-name>
- show groups (auth)
oc get groups
- execute groupsync
sudo oc adm groups sync --sync-config=/path/to/config.yaml --confirm
- show users
oc get user
- users created when first logged in
- show nodes
oc get nodes
- view nodes by label (i.e. app nodes)
oc get nodes -l region=apps
- show storageclass
oc get sc
- show persistentvolumeclaim
oc get pvc
- show network namespaces
oc get netnamespaces
ref
- :1: https://github.com/heketi/heketi
- :2: https://docs.openshift.com/container-platform/3.9/install_config/install/advanced_install.html#configuring-ansible
- :3: https://docs.openshift.com/container-platform/3.9/install_config/cluster_metrics.html#openshift-prometheus
- :4: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
- :5: https://docs.openshift.com/container-platform/3.9/dev_guide/daemonsets.html
- :6: https://docs.openshift.com/container-platform/3.9/architecture/core_concepts/pods_and_services.html#services
- :7: https://docs.openshift.com/container-platform/latest/dev_guide/application_health.html
- :8: https://docs.openshift.com/container-platform/3.9/admin_guide/quota.html
- :9: https://docs.openshift.com/container-platform/3.9/admin_guide/limits.html
- :10: https://docs.openshift.com/container-platform/3.9/install_config/configuring_authentication.html#LDAPPasswordIdentityProvider
- :11: https://docs.openshift.com/container-platform/3.9/admin_guide/manage_rbac.html#admin-guide-manage-rbac
- :12: https://docs.openshift.com/container-platform/3.9/architecture/infrastructure_components/kubernetes_infrastructure.html#high-availability-masters
- :13: http://www.hawkular.org/
- :14: https://docs.openshift.com/container-platform/3.9/architecture/networking/sdn.html
- :15: https://docs.openshift.com/container-platform/3.9/admin_guide/managing_networking.html#admin-guide-networking-networkpolicy
- :16: https://www.3scale.net/