Comcast Deploys 1,000+ Times per Month with Pivotal Cloud Foundry

A self-service platform enables developers at the telecom giant to deliver prototypes much faster and fix bugs without downtime, while running thousands of apps.
Why read this?
Use Case for Cloud Foundry:

Telecom: Pivotal Cloud Foundry (PCF) is used to create a self-service development environment that enables teams to build, test, and deploy apps faster.

Business or Technical Result:

As of April 2019, some 19,000 applications and 40,070 app instances were running on the self-service platform. In February 2018, over 2,100 software developers were working with more than 10 PCF foundations (across both private and public clouds). Now, engineers can make 1,100 deploys per month, fix bugs without downtime, and deliver app prototypes within four weeks. The time spent on testing the environment has decreased from 4–6 weeks to 10–15 minutes.

Lessons learned:

Careful coordination among developers, architects, and operations engineers is essential. Holding monthly technical workshops (webinars and on-site visits) with development teams is a good practice for establishing a convenient engineering process. Regular briefings that keep customers informed of platform activities, updates, and metrics build trust. Using GitHub and automated pipelines can help to unify crowd-sourced documentation, addressing inconsistency (such as docs scattered across wiki pages, SharePoint, etc.).

CF deployed to:

OpenStack

What else is in the stack?

Pivotal CF, Kubernetes, Java, Spring, Docker, Puppet, Couchbase, InfluxDB, GoCD, Concourse, Gradle, Vault, Nagios, Grafana, Alerta, Kapacitor, Kafka

Company Description:

Comcast is the largest cable provider and home ISP in the US, with annual revenues exceeding $94.5 billion (as of 2018). The company was founded in 1963 (as American Cable Systems) and employs about 184,000 specialists. The major divisions include Xfinity (Comcast Cable), NBCUniversal, Sky, and Comcast Business.

Cool fact about the company:

In 2013, the Comcast network was handling up to 6 terabytes of Internet traffic per second each day.

The need for a self-service platform

Comcast is the largest cable provider and home ISP in the US, serving 40 states from its headquarters in Philadelphia. The company was founded in 1963 (as American Cable Systems), and its annual revenues exceeded $94.5 billion in 2018.

In late 2013, Comcast realized that the company’s back-office applications were outdated and monolithic, which adversely affected the business. Their billing, customer management, and order entry were running on 30-year-old mainframes. The infrastructure was too large and rigid, with no agility or elasticity to scale. The code base was outdated, and many processes were manual and required automation.

“We had a lot of manual people-based processes, so everything was a ticket. You know, I want to get something done—get a ticket. I want to change a firewall rule—get a ticket. I want to deploy some code—get a ticket. And it really slowed us down,” said Nick Beenham, Senior Principal Engineer at Comcast, during the SpringOne conference a couple of years ago.

“So, one of the things that we wanted to do was to speed up and move faster.” —Nick Beenham, Comcast

To solve these problems, the company migrated from monolithic architectures to microservices, built self-service development environments, and adopted CI/CD practices. The transformation started in 2014, when Comcast’s team decided to use the Pivotal Cloud Foundry platform as the core of this change. As a result, they achieved a level of scalability, agility, and deployment velocity that the company had never had before.

“We placed a bet on Cloud Foundry. We get features in days, not weeks, and scale takes minutes, not months.” —Greg Otto, Comcast

Greg Otto at Cloud Foundry Summit 2017 (Image credit: Altoros)

Automating numerous processes

By 2015, the company’s development and operations teams had successfully integrated Pivotal Cloud Foundry, Docker, and OpenStack for a pilot project. Working with multiple Cloud Foundry instances, they relied on DevOps tools like Puppet while making custom URLs on demand for customers.

The adoption of Cloud Foundry has caused a shift in thinking and changed the mindset of engineering. Old questions associated with traditional software development were replaced by new questions that focused on how quickly VMs could be deployed and how processes could be automated. Then, the question became how to focus on end-user services rather than simply deploying the VMs.

At that time, the Xfinity division also had a large Enterprise Service Platform at the heart of Comcast’s back office. According to Nick Beenham, the monolithic platform was processing 250 million transactions per day and was hosted on 500+ servers. Fifteen software engineering teams were working on 75 services for more than 150 consumers. However, the system was outdated: some pieces of the code were more than 8 years old.

“We had a data model and an access pattern that even Stephen King would probably be horrified…where multiple owners of the data would all access it for read and write. So, some of the things that we had to do was to split up that data and to give it a definite owner and only those people, only that owner, could change it.” —Nick Beenham, Comcast

To enable further scalability, Oracle’s WebLogic was replaced with Pivotal Cloud Foundry. The system was also redesigned around a microservices architecture.

Migrating a monolith app to a multi-region PCF deployment (Image credit)

“We had 30-year old mainframe technology that our billing applications were running on. And then all the peripheral applications for things like provisioning, activation, and order entry were all related and very tightly coupled…So, we took several of our really big monolithic applications and broke them into microservices and then moved them onto the platform.” —Greg Otto, Comcast

To solve the issues with data management, the company moved from the old Oracle databases to Couchbase. The engineers decomposed databases into smaller stores and strapped data-driven microservices on top of them. As a result, the team improved the system’s consistency, addressed the conflicts associated with data, and fixed user permissions.
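The ownership rule described above (every data set gets one definite owner, and only that owner may change it) can be illustrated with a minimal sketch. This is not Comcast's implementation: the `OwnedStore` class, the `NotOwnerError` exception, and the service names are hypothetical, and an in-memory dictionary stands in for a real database such as Couchbase.

```python
from dataclasses import dataclass, field


class NotOwnerError(PermissionError):
    """Raised when a service tries to change data it does not own."""


@dataclass
class OwnedStore:
    """A small data store with exactly one owning service (hypothetical)."""
    owner: str
    _records: dict = field(default_factory=dict)

    def read(self, key):
        # Any service may read; reads never change state.
        return self._records.get(key)

    def write(self, caller, key, value):
        # Only the owning service is allowed to modify the data.
        if caller != self.owner:
            raise NotOwnerError(f"{caller} cannot write to a store owned by {self.owner}")
        self._records[key] = value


# Hypothetical service names, used only for illustration.
billing = OwnedStore(owner="billing-service")
billing.write("billing-service", "account-1", {"balance": 42})
print(billing.read("account-1"))

try:
    billing.write("order-entry-service", "account-1", {"balance": 0})
except NotOwnerError as err:
    print("rejected:", err)
```

Splitting one shared database into several stores like this trades some convenience for a clear answer to the question of which service may change a given record.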

Today, Comcast developers also rely on a variety of logging metrics, gathering KPIs that help them understand the health of their huge infrastructure and thousands of apps. The comprehensive monitoring system covers the OS, infrastructure, the platform (Pivotal Cloud Foundry), and apps with the metrics needed to detect problems the moment they arise.

The full-stack monitoring solution at Comcast (Image credit)

All the metrics are aggregated in InfluxDB, while the Kapacitor framework is used for real-time log stream processing. At the same time, Nagios is responsible for alerting about the issues.

The full-stack monitoring system was implemented with a modular architecture in mind, according to Tim Leong of Comcast. As a result, the developers got the flexibility to make changes, experiment with new technologies, and scale the deployment easily.

“We don’t want a single enterprise solution. We want replaceable components.” —Greg Otto, Comcast
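As a rough illustration of the "replaceable components" idea, the sketch below wires together a metric store, a stream rule, and an alerter so that each part can be swapped independently. It is an assumption-laden toy, not the Comcast system: `InMemoryStore` merely stands in for a time-series database, `threshold_rule` for a stream-processing rule, and a plain list for an alerting tool. None of these names correspond to the InfluxDB, Kapacitor, or Nagios APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Metric:
    layer: str    # e.g. "os", "infrastructure", "platform", or "app"
    name: str
    value: float


class InMemoryStore:
    """Stand-in for a time-series database (hypothetical, not a real client)."""

    def __init__(self) -> None:
        self.points: List[Metric] = []

    def write(self, metric: Metric) -> None:
        self.points.append(metric)


def threshold_rule(name: str, limit: float) -> Callable[[Metric], bool]:
    """Build a rule that fires when the named metric exceeds the limit."""
    return lambda m: m.name == name and m.value > limit


class Pipeline:
    """Connects store, rules, and alerter; each part can be replaced."""

    def __init__(self, store, rules, alert) -> None:
        self.store, self.rules, self.alert = store, rules, alert

    def ingest(self, metric: Metric) -> None:
        self.store.write(metric)                      # keep every data point
        if any(rule(metric) for rule in self.rules):
            self.alert(metric)                        # notify on a rule match


alerts: List[Metric] = []
pipeline = Pipeline(InMemoryStore(), [threshold_rule("cpu_percent", 90.0)], alerts.append)
pipeline.ingest(Metric("os", "cpu_percent", 45.0))
pipeline.ingest(Metric("app", "cpu_percent", 97.5))
print(len(pipeline.store.points), "points stored,", len(alerts), "alert(s) raised")
```

Because `Pipeline` only depends on a `write` method, a list of callables, and an alert callable, any one of the three parts can be exchanged without touching the others.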

Comcast also developed a stream processing system based on Kafka, Spring, and Java to monitor its outside plant. According to Mike Graham of Comcast, the aim of this system is to protect Xfinity’s services from things like severe weather by enabling real-time monitoring, analysis, and a quick response to problems.

Monitoring Comcast’s “outside plant” with Kafka and Cloud Foundry (Image credit)

 

Deploying multiple times an hour

Along the way, Comcast has built a self-service, digital assembly line that decreased the time between release cycles.

“It took a week to deploy the entire stack—a long time…With Pivotal Cloud Foundry, we deploy in minutes and we can deploy as often as we want. Sometimes, we’ll deploy multiple times in an hour. It has given us a great deal of velocity that we’ve never had before. ” —Nick Beenham, Comcast

According to Colby Johnston, 19,000 apps and 40,070 app instances were running on the Cloud Foundry Application Runtime at Comcast as of April 2019. “One of our biggest challenges is just to simply keep up with the demand for this platform,” he said at Cloud Foundry Summit 2019.

Brief update: In a comment to Altoros (August 2019), Greg Otto revealed that there are “at least tens of thousands of deployments per month, with 46,000 instances” already. This means that Comcast’s deployment keeps growing quickly: nearly 6,000 app instances were added between April and August 2019, or roughly 1,500 to 2,000 per month.
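The growth rate can be checked with simple arithmetic on the two instance counts quoted above. The sketch below is only a back-of-the-envelope estimate: the exact measurement dates are not given, so the length of the window (three or four months) is an assumption, and `monthly_growth` is a helper introduced here for illustration.

```python
def monthly_growth(start: int, end: int, months: float) -> float:
    """Average number of app instances added per month over a period."""
    return (end - start) / months


instances_april = 40_070    # figure reported for April 2019
instances_august = 46_000   # figure from the August 2019 comment

added = instances_august - instances_april
print("instances added:", added)

# The window length is an assumption, so both plausible values are shown.
for months in (3, 4):
    rate = monthly_growth(instances_april, instances_august, months)
    print(f"{months} months: about {rate:.0f} new instances per month")
```

Depending on the window chosen, the average lands somewhere between roughly 1,500 and 2,000 new app instances per month.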

Pivotal Cloud Foundry in production at Comcast in 2017 (Image credit)

“With Cloud Foundry, we’ve introduced about 60% improvement in our availability, which has dramatically enhanced the ability of our customers to interface with self-service and our front-line employees to interface with our customers.” —Christopher Tretina, Comcast

“The key to agility is careful coordination among developers, architects and operations engineers, offering a holistic service model and service offering,” noted Sam Guerrero of Comcast. Everyone becomes more engaged in this approach as members of the Comcast team insert themselves further along the assembly line. “If we make our factory better, everything else can improve.”

The company is also experiencing digital transformation as a whole, like many other enterprises that adopted Cloud Foundry, investing heavily in IT and innovation. “A lot of the product advancements you’ve seen come out in the last couple years is a result of a huge investment in becoming a product and technology company, not just an operational company,” explained Rick Rioboli, SVP and CIO at Comcast.

 


About the experts

Nick Beenham is Senior Principal Engineer at Comcast. He works within the Application and Platform Services (APS) organization. Nick is mostly responsible for Continuous Delivery and Automation. Currently, he is working on systems and processes that will enable developers to automate product delivery to the market. He studied software engineering, human-computer interaction, and distributed systems at Edinburgh Napier University.

 
Tim Leong is Principal Engineer of Cloud Services Engineering at Comcast. He is an experienced solutions architect and a software developer focusing on cloud implementation and automation. Currently, Tim is acting as a technical lead for architecture and engineering across various cloud initiatives, such as using Cloud Foundry, Kubernetes, DevOps (CI/CD, Concourse, etc.), and cloud-native transition. He has a Bachelor of Science degree in Computer Engineering from the University of Pittsburgh.

 
Greg Otto has been Executive Director for Cloud Services at Comcast since 2014. At Comcast, Greg helps to transform the product delivery experience. He is working hard on improving the lives of Comcast software engineering teams. Greg has become a driving force in migrating to the cloud. He has put a lot of time into partnering with the Pivotal Cloud Foundry (PCF) team since 2014. Greg and the team have delivered a robust PCF environment across hybrid cloud environments, allowing critical app refactoring efforts and rapid adoption across the enterprise.

 
Charles (Mike) Graham has been building spatial information systems since 1987 on a variety of technology stacks. He has architected dynamic, extensible platforms for ExxonMobil, Chevron, IHS Markit, and other large enterprises. Most recently, Mike has been architecting a Cloud Foundry–deployed Java / Spring streaming platform based on Kafka for monitoring and analyzing Comcast’s entire “outside plant”—including 40+ million in-home devices and 2 million miles of coax and fiber. Mike has seen the introduction of significant technologies in his career and believes that container-deployed streaming apps are a game changer.


The post was written by Alex Khizhniak and Diana Maltseva.