Cloud Foundry Advisory Board Meeting, July 2016: Time to Migrate to Diego

by Roger Strukhoff, July 14, 2016
Call participants covered numerous technical issues, reminded people of the Frankfurt summit in September, and discussed having smaller events in Asia.

CAB calls are back

It’s time to migrate to the Cloud Foundry Diego container management system, according to a report at the monthly Cloud Foundry Community Advisory Board (CAB) call on Wednesday, July 13.

Dr. Max

The call was the first in two months (a call in June was skipped as it would have followed all the excitement of the Cloud Foundry Summit by just a few days) and was again led by Michael Maximilien (Dr. Max).

The Diego team, headed by Eric Malm, is currently working on a DEA/Warden end-of-life (EOL) announcement, so the time to transition to Diego is now or very soon, according to a report during the CAB call.

For users and organizations who may be new to Cloud Foundry, or need to be updated, it’s important to know that Diego features significant architectural and operational changes and upgrades. Two core resources can be found, respectively, at the Cloud Foundry Foundation website and at GitHub.

 

Willkommen in Frankfurt (Germany)

Chip Childers, Cloud Foundry Foundation

Chip Childers and Stormy Peters from the CF Foundation presented an update on this year’s summits. The recent, main summit in Santa Clara was well-attended and vibrant, as we noted in a series of daily reports.

The European summit is scheduled to be held September 26–28 in Frankfurt. Registration is now open, and early bird registration expires August 2. It was also noted that more than 200 session proposals have already been submitted, so it is high time to submit any other bright ideas you may have.

Stormy Peters

There will not be an Asia summit per se this year, but Chip said the CF Foundation team is looking at options for having “Cloud Foundry Days” and other smaller events, which would be supported by a single sponsor or a few sponsors.

As part of the CF Foundation’s effort to spread the word about Cloud Foundry, Stormy mentioned a need for more technical blog posts describing project updates, new features, customer stories, technical issues, and other topics of interest to the community. She promised some swag for published authors.

 

Here’s the techie stuff

Several people had submitted technical backlog reviews and updates on GitHub on July 12, during a project management committee (PMC) meeting, as follows.

 
CLI (Dies Koper, Fujitsu)

  • Released CF CLI 6.20.0 addressing some regressions (plug-ins, the .cfignore pattern, loggregator panic) and allowing for binding of route services to routes with paths.

  • Restarted work on adding support for defining TCP routes and HTTP routes with paths in app manifests. The first story on that has taken much longer than expected as cf push lacked unit tests and needed a lot of refactoring before the actual enhancements could be implemented. This story has now been delivered and we’re hoping the remaining stories to complete the epic will go more smoothly.
  • Shared our proposal for an improved CF help UX on the CF Dev mailing list. The implementation is scheduled to start right after the app manifest epic is completed and released.
  • Updating docs.cloudfoundry.org pages so that references to CF CLI commands link to the command reference guide.
  • Looking into creating a policy on which CF releases a CF CLI release supports. “We want people to be able to install one version of the CLI (the latest) for use with all their Cloud Foundry deployments: potentially multi-region production environments as well as local bosh-lite environments of different versions. At the same time, we would like to remove code complexity dealing with the split of domains into private and shared domains and what was there before, and remove the loggregator_consumer library and only rely on the noaa library to retrieve logs and stats. I’m working on a proposal to share with the CF Dev mailing list soon.”

 
Garden-Linux (Will Pragnell, Pivotal)

  • Migrated Garden repos to the ‘cloudfoundry’ GitHub org and to ‘code.cloudfoundry.org’ import paths.
  • Shipped v0.339.0 to fix a bug where two containers could end up with the same IP address.

 
Garden-runC

  • Shipped v0.4.0 to fix a bug where two containers could end up with the same IP address.
  • Fixed logging for NetOut rules.
  • Unprivileged containers now use seccomp and AppArmor.
  • Global host info in /proc is now hidden from within containers.
  • Security “parity” work finished!

 
Diego (Eric Malm, Pivotal)

  • Diego v0.1480.0 officially supports relational data stores.

  • Documented some data store options and the behavior of data migration from etcd.
  • Updated performance protocol for end-to-end tests with 250K apps.
  • Raw results and metric dashboard from BBS benchmarks at 200K apps publicly visible.
  • Migrated Diego repos to the ‘cloudfoundry’ GitHub org and to ‘code.cloudfoundry.org’ import paths.
  • Continued to backfill BBS API documentation.
  • Have access to SoftLayer environment for performance experiments.
  • About to run updated end-to-end performance experiments at 20 cells to validate metric collection and log analysis and to work out operational details.
  • Starting development of officially supported version of veritas CLI tool.

 
Infrastructure (Amit Gupta, Pivotal)

  • Continued work on Consul “scrapers” in service of Elastic Clusters, but temporarily put that work on pause.
  • Continuing work on zero-downtime etcd TLS upgrade—work is done, now building out acceptance test suites.
  • Started work to support zero-downtime upgrade of Consul from BOSH 1.0 to BOSH 2.0 manifests—work is done, now building out acceptance test suites.
  • Postgres job no longer runs any long-running processes as root.

 
Release integration (Amit Gupta, Pivotal)

  • Continued work on CATS-as-a-Concourse-task—validating input for better operator experience.
  • Worked with various teams to pin down and fix an issue in which Diego and WebDAV interactions timed out while pulling down buildpacks.
  • Continued work on BOSH 2.0 manifests for Cloud Foundry—pipeline should be deploying all jobs by end of day today, next step is to run CATS.
  • Paying down technical debt around our usage of DataDog in AWS acceptance environment.

 
Runtime OG (Michael Fraenkel, IBM)

  • Investigating memory leak (unable to reproduce).
  • Reduce HM9K etcd usage even further, querying all nodes for each app instance on stats.
  • Bumping our NATS client to the latest.
  • Investigating the correct configuration for a NATS client to prevent route erasure.

 
Loggregator (Jim Campbell, Pivotal)

  • Moved BOSH HM Forwarder into OS Loggregator.
  • Variety of work on noaa reliability: timeouts, token refresh, nozzles under load.
  • Building tooling to speed up iteration cycles of load testing M->D TCP.

 

UAA (Sree Tummidi, Pivotal)

  • UAA SQL Injection CVE.
  • UAA 3.5.0 release this week.
  • Updated Tomcat version to 8.0.0—fixes for Low/Med CVEs.
  • Deprecated properties for UAA being removed in v3.5.0.
  • UTF-8 support in UI Templates.
  • Securing refresh tokens with explicit user scope/permission (offline access).
  • UAA performance benchmarking in progress for OAuth end-points.
  • Upcoming features: Account Chooser for Identity Provider Discovery.

 
CAPI (Nicholas Calugar, Pivotal)

  • Default to unprivileged containers on Diego. Added a configuration option for operators who want to continue running privileged containers.
  • WebDAV performance improvement: disable gzip.
  • Fixed user-provided service creation logging credentials field.
  • Fixed monit getting stuck when NFS goes away.
  • Finished first story in v3 migration epic: v2 apps become v3 processes.
  • Finished story for running CAPI processes as vcap instead of root.

 
PERSI (Ted Young, Pivotal)

  • Met with Docker, agreed to partner on Plugin API improvements.
  • Creating a local driver/broker that works in BOSH-Lite.
  • Defined new v2.10 Volume Services API.
  • Deployed WordPress on CephFS.

 

 

Container networking (Usha Ramachandran, Pivotal)

  • Completed basic forwarding over overlay.
  • Completed enforcement of policy for containers on the same cell.
  • Working on enforcement of policy for containers on different cells.
  • Working on CLI plug-in for configuring container networking policies.

 

 

Flintstone / Bits-Service (Simon Moser, IBM)

  • Delivered the bits-service cloud controller integration pull request to the CAPI team, waiting on merge.
  • Continuing work on signed URLs in the bits-service itself.
  • Incepted on bits-service to CLI integration.

 

 

Buildpacks / Stacks (Danny Rosen, Pivotal)

  • New Go buildpack with glide support.

 

 

 

 

Abacus (Dr. Max and Jean-Sebastien Delfino, IBM)

  • Initial stories for adding default UI for onboarding services config into Abacus (SAP India).
  • In discussions with Pivotal about a trial for their products.
  • Abacus now deployed on all Bluemix environments.
  • Various bug fixes.

 

The next CAB

Some concern was also expressed during the PMC meeting that teams were falling behind in updating external dependencies such as NATS and etcd.

This topic merited some discussion during the CAB call as well. It is being addressed by a review of how current the dependencies in BOSH releases are, and there will be discussion of automating activities to address the situation.

There was also a question as to whether there was a plan to move away from etcd entirely. Amit Gupta reported no immediate plan to do so. He noted that Diego is planning to move away from it, but other components, such as Loggregator, still use etcd.

The next call is scheduled to be held on August 10, 8 a.m. Pacific Time.