The search for a cloud-native database

Cedrick Lunven (@clunven) and Jeff Carpenter (@jscarp) of K8ssandra discuss the search fora cloud-native database.

The concept of “cloud-native” has come to stand for a collection of best practices for application logic and infrastructure, including databases. However, many of the databases supporting our applications have been around for decades, before the cloud or cloud-native was a thing. The data gravity associated with these legacy solutions has limited our ability to move applications and workloads. As we move to the cloud, how do we evolve our data storage approach? Do we need a cloud-native database? What would it even mean for a database to be cloud-native? Let’s take a look at these questions.

What is Cloud-Native?

It’s helpful to start by defining terms. In unpacking “cloud-native”, let’s start with the word “native”. For individuals, the word may evoke thoughts of your first language, or your country or origin – things that feel natural to you. Or in nature itself, we might consider the native habitats inhabited by wildlife, and how each species is adapted to its environment. We can use this as a basis to understand the meaning of cloud-native.

Here’s how the Cloud Native Computing Foundation (CNCF) defines the term:

“Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds: Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”

This is a rich definition, but it can be a challenge to use this to define what a cloud-native database is, as evidenced by the Database section of the CNCF Landscape Map:

Database section of the CNCF Landscape Map

Databases are just a small portion of a crowded cloud computing landscape.

Look closely, and you’ll notice a wide range of offerings: both traditional relational databases and NoSQL databases, supporting a variety of different data models including key/value, document, and graph. You’ll also find technologies that layer clustering, querying or schema management capabilities on top of existing databases. And this doesn’t even consider related categories in the CNCF landscape such as Streaming and Messaging for data movement, or Cloud Native Storage for persistence.

Which of these databases are cloud-native? Only those that are designed for the cloud, should we include those that can be adapted to work in the cloud? Bill Wilder provides an interesting perspective in his 2012 book, “Cloud Architecture Patterns”, defining “cloud-native” as:

Any application that was architected to take full advantage of cloud platforms”

By this definition, cloud-native databases are those that have been architected to take full advantage of underlying cloud infrastructure. Obvious? Maybe. Contentious? Probably…

Why should I care if my database is cloud-native?

Or to ask a different way, what are the advantages of a cloud-native database? Consider the two main factors driving the popularity of the cloud: cost and time-to-market.

  • Cost – the ability to pay-as-you-go has been vital in increasing cloud adoption. (But that doesn’t mean that cloud is cheap or that cost management is always straightforward.)
  • Time-to-market – the ability to quickly spin up infrastructure to prototype, develop, test, and deliver new applications and features. (But that doesn’t mean that cloud development and operations are easy.)

These goals apply to your database selection, just as they do to any other part of your stack.

What are the characteristics of a cloud-native database?

Now we can revisit the CNCF definition and extract characteristics of a cloud-native database that will help achieve our cost and time-to-market goals:

  • Scalability – the system must be able to add capacity dynamically to absorb additional workload
  • Elasticity – it must also be able to scale back down, so that you only pay for the resources you need
  • Resiliency – the system must survive failures without losing your data
  • Observability – tracking your activity, but also health checking and handling failovers
  • Automation – implementing operations tasks as repeatable logic to reduce the possibility of error. This characteristic is the most difficult to achieve, but is essential to achieve a high delivery tempo at scale

Cloud-native databases are designed to embody these characteristics, which distinguish them from “cloud-ready” databases, that is, those that can be deployed to the cloud with some adaptation.

What’s a good example of a cloud-native database?

Let’s test this definition of a cloud-native database by applying it to Apache Cassandra™ as an example. While the term “cloud-native” was not yet widespread when Cassandra was developed, it bears many of the same architectural influences, since it was inspired by public cloud infrastructure such as Amazon’s Dynamo Paper and Google’s BigTable. Because of this lineage, Cassandra embodies the principles outlined above:

  • Cassandra demonstrates horizontal scalability through adding nodes, and can be scaled down elastically to free resources outside of peak load periods
  • By default, Cassandra is an AP system, that is, it prioritizes availability and partition tolerance over consistency, as described in the CAP theorem. Cassandra’s built in replication, shared-nothing architecture and self-healing features help guarantee resiliency.
  • Cassandra nodes expose logging, metrics, and query tracing, which enable observability
  • Automation is the most challenging aspect for Cassandra, as typical for databases.

While automating the initial deployment of a Cassandra cluster is a relatively simple task, other tasks such as scaling up and down or upgrading can be time-consuming and difficult to automate. After all, even single-node database operations can be challenging, as many a DBA can testify. Fortunately, the K8ssandra project provides best practices for deploying Cassandra on Kubernetes, including major strides forward in automating “day 2” operations.

Does a cloud-native database have to run on Kubernetes?

Speaking of Kubernetes… When we talk about databases in the cloud, we’re really talking about stateful workloads requiring some kind of storage. But in the cloud world, stateful is painful. Data gravity is a real challenge – data may be hard to move due to regulations and laws, and the cost can get quite expensive. This results in a premium on keeping applications close to their data.

The challenges only increase when we begin deploying containerized applications using Kubernetes, since it was not originally designed for stateful workloads. There’s an emerging push toward deploying databases to run on Kubernetes as well, in order to maximize development and operational efficiencies by running the entire stack on a single platform. What additional requirements does Kubernetes put on a cloud-native database?


First, the database must run in containers. This may sound obvious, but some work is required. Storage must be externalized, the memory and other computing resources must be tuned appropriately, and the application logs and metrics must be made available to infrastructure for monitoring and log aggregation.


Next, we need to map the database’s storage needs onto Kubernetes constructs. At a minimum, each database node will make a persistent volume claim that Kubernetes can use to allocate a storage volume with appropriate capacity and I/O characteristics. Databases are typically deployed using Kubernetes Stateful Sets, which help manage the mapping of storage volumes to pods and maintain consistent, predictable, identity.

Automated Operations

Finally, we need tooling to manage and automate database operations, including installation and maintenance. This is typically implemented via the Kubernetes operator pattern. Operators are basically control loops that observe the state of Kubernetes resources and take actions to help achieve a desired state. In this way they are similar to Kubernetes built-in controllers, but with the key difference that they understand domain-specific state and thus help Kubernetes make better decisions.

For example, the K8ssandra project uses cass-operator, which defines a Kubernetes custom resource (CRD) called “CassandraDatacenter” to describe the desired state of each top-level failure domain of a Cassandra cluster. This provides a level of abstraction higher than dealing with Stateful Sets or individual pods.

Kubernetes database operators typically help to answer questions like:

  • What happens during failovers? (pods, disks, networks)
  • What happens when you scale out? (pod rescheduling)
  • How are backups performed?
  • How do we effectively detect and prevent failure?
  • How is software upgraded? (rolling restarts)

Conclusion and what’s next

A cloud-native database is one that is designed with cloud-native principles in mind, including scalability, elasticity, resiliency, observability, and automation. As we’ve seen with Cassandra, automation is often the final milestone to be achieved, but running databases in Kubernetes can actually help us progress toward this goal of automation.

What’s next in the maturation of cloud-native databases? We’d love to hear your input as we continue to invent the future of this technology together.

This blog post originally appeared on K8ssandra and is based on Cedrick’s presentation “Databases in the Cloud-Native Era” from BluePrint London, March 11, 2021 (registration required).


Why do developers adopt or reject cloud technologies?

In the nearly fifteen years since Amazon AWS cracked open the cloud market by releasing S3 – and changed the world by doing so – there has been huge growth in the variety of cloud solutions available for developers to use. We examine the different reasons that developers give for adopting or rejecting cloud technologies. The findings shared in this post are based on the Developer Economics survey 19th edition which ran during June-August 2020 and reached more than 17,000 developers in 159 countries.

Of the new cloud technologies which have appeared over the last fifteen years, containers have arguably had the greatest impact. With 60% of developers using this technology, the benefits are clearly widely recognised. However, with just under 30% of developers using container orchestration tools and management platforms, there is still room for this technology to develop.

In second position, with 45% of cloud developers using this technology, Database-as-a-Service (DBaaS) is also very widely used, and data storage and retrieval will continue to be an important issue, albeit in a much more sophisticated form than S3 originally offered. Cloud Platform-as-a-Service (PaaS) sits in a distant third place. A third of backend developers are using PaaS, putting this technology slightly ahead of the other ones we ask about – between 21-27% of developers use them.

Why do developers adopt or reject cloud technologies - containers is the cloud technology that is most widely used by backend developers.

Abstraction and simplification are two of the main drivers for the mass adoption of cloud technologies, but we can’t overlook the role that flexibility plays. Spinning up instances to cope with variable demand, creating temporary testing environments, and adding storage as required is immensely powerful. But one often-overlooked aspect of this flexibility is that developers and organisations have the flexibility to choose. They are not restricted to the expensive, bare metal they bought ten years ago, and they are less constrained by monolithic purchasing processes, because, to put it simply, these decisions matter less. In a world where infrastructure can be provisioned and destroyed at will, and where data and server configurations can be transferred easily between homogeneous systems, cloud providers have to find other areas of differentiation in order to compete. Vendor lock-in is much less of an issue for users than it once was, and the rise of the developer as a decision-maker has put even more power into their hands. Note the adoption and rejection reasons for DBaaS and orchestration platforms come from our previous survey, fielded in Q1 2020.

“Pricing and support/documentation are most important to developers”

For every cloud technology, with the exception of orchestration tools, pricing and support/documentation are the two most important factors that developers consider when adopting that technology. For the most part, these two factors switch between first and second place, however, pricing drops to fifth place for developers considering adopting an orchestration tool, whereas support/documentation remains at the top by a large margin. Around three in ten of these developers selected ease and speed of development (32%), integration with other systems (31%), community (30%), and pricing (29%) as reasons for adoption, with pricing being around 15 percentage points lower for orchestration tools than for other technologies. On the other hand, community and scalability are generally more important for developers selecting an orchestration tool.

Much of this distinction is driven by the dominance of Kubernetes. With 57% of backend developers who are using an orchestration tool choosing Kubernetes, it is the single most popular orchestration tool, and importantly, it’s free and open source. It stands to reason, therefore, that pricing is simply not an issue for developers using Kubernetes, instead they value the community support that helps them master such a complex tool.

Indeed, as well as pricing being much less important for these developers, the learning curve is also less important. It seems that these developers understand that they are dealing with high levels of complexity and abstraction and accept that there is a lot to learn in this space. But for those developers that want the abstraction and simplicity offered by a commercial container management system, many paid options exist, and pricing is still an important factor in this space.

Why do developers adopt or reject cloud technologies - Ranking of reasons for adoption

Taking developers’ reasons for rejection into account let us view the decision-making process from the other side. Immediately, we see that pricing is the dominant factor when rejecting every technology. Taking a closer look at the data shows the true extent of this – for DBaaS and Infrastructure-as-a-Service (IaaS), developers were more than twice as likely to select pricing as a rejection reason than the second- and third-placed reasons of support/documentation, and the learning curve, respectively. Amongst the remaining technologies, the smallest difference was 8 percentage points, for developers rejecting orchestration tools.

Further down the list, there is a lot of variability between the different technologies. For example, the learning curve was the second most popular rejection reason for developers choosing IaaS, with a quarter of them doing so. This suggests that the learning curve for IaaS is quite steep and that this is a barrier for many developers. This is not the case for DBaaS however, where only 15% of developers stated this as a reason for rejection.

“Suitability, feature set and performance are hygiene factors”

Suitability and feature set has middling importance for developers choosing to adopt a technology, but for many technologies, it is a more important reason for rejection. This shows that suitability and feature set is a hygiene factor – there are relatively few cases where this is of paramount importance, but many where a technology does not meet the needs and is therefore rejected.

Finally, performance sits very low in the hierarchy for developers adopting and rejecting cloud solutions. This indicates that, for the vast majority of uses cases, the range of performance options provided by vendors is sufficient. This suggests that many cloud computing products are, to some extent, homogenous, and that developers are more concerned with the ‘soft’ features, such as support/documentation, community, or learning curve. These features make for a fulfilling development experience, and in the age of the developer as a decision-maker, experience is everything.

Why do developers adopt or reject cloud technologies - ranking of reasons for rejection.

What are your reasons for adopting or rejecting cloud technologies? You can let us know your reasons here!


Choosing the right Containers-as-a-Service (CaaS) – or not

The emergence of cloud native development and containers has redefined how software is developed. But not all organizations have the resources or expertise to set up the required infrastructure to support a containerized application. Luckily, cloud vendors offer Containers-as-a-Service to help developers to capitalize on the benefits of cloud native development. 

All three leading cloud providers have CaaS products but choosing the right one can be a challenge. While everyone has different requirements, it is always beneficial to understand what solutions others are using and why, to help inform decisions. 

Based on research from /Data’s recent Developer Economics survey, we have discovered that there are a few factors that drive developers to choose one CaaS over another: 

  • familiarity with tools and languages
  • integration with other systems
  • support and documentation 
  • ease and speed of development. 

While there were other reasons for developers to consider when adopting a platform, the percentage of developers that considered these four factors important is noticeably different across the three leaders. Some of the other sixteen factors that are tracked in the research  include:

  • pricing
  • community
  • learning curve
  • suitability and feature set
  • performance 
  • scalability


Not all CaaS platforms are selected for the same reasons though. Developers that chose AWS Elastic Container Service were more likely to choose it because of its integration with other systems. This is a reason to choose AWS ECS for 34% of developers using it, compared to 29% and 28% for Azure and Google. Amazon not only has a vast array of tools and services, they also have a robust partner network. The options are so great they have their own service marketplace and have even released a Cloud Map service to help developers discover and manage it all. 

Developers tend to favor Google Container Engine (GCE) because it is easy to use and well documented. Forty five percent of GCE developers chose it in part because of the support and documentation and 36% because of ease and speed of development. We tend to find that developers are consistently happy with the support and documentation that Google provides to their developer community. This satisfaction is an important reason for Google Container Engine users to choose the platform. 

For Azure Container Service, developers like the fact that they can use the tools and languages that they are familiar with. Azure developers are 7 and 12 percentage points more likely to choose Azure Container Service for this reason than Amazon and Google respectively. Our research shows that Microsoft developers are relatively brand loyal so Azure has made it easy for developers to use Microsoft tools for container development and management. Azure has enabled developers to develop using Docker containers and Visual Studio, tools to deploy code to Azure Container Service with a simple command. They have also made it possible to deploy Docker containers to Windows Servers. Finally integration with Active Directory enables loyal Microsoft developers to use existing authentication policies and technologies .

At the end of the day, most developers are looking for a platform that is easy to use and fits with their current strategy and infrastructure, whether it is though integrations, support, or the ability to use the tools that they are comfortable with. 

Containers: Is it really a choice of either or?

While each solution has unique benefits, our analysis also found that many developers were using more than one leading CaaS and in some cases three. Seven percent of developers using a CaaS were using all three of the leading platforms while 46% were using two.


Our data verifies what you may already suspect based on your own experiences: more than half of backend developers are pursuing a multi cloud strategy, choosing not to commit to a single provider. 

There are a number of benefits to a multi-cloud solution that are driving this trend. IT organizations can avoid vendor lockin if teams develop for a multi cloud environment. This approach forces developers to build without relying on vendor specific services, reducing switching costs. Multi-cloud approaches also enables organizations to optimize their infrastructure. Developers and operations pro’s can leverage the strengths of each cloud depending on the requirements of various workloads and applications. Greater resilience is also a key benefit to consider. This is especially important in denial of service attacks where compute resources can be overwhelmed with fake requests. With a backup cloud ready and waiting, workloads can just shift to the backup cloud.

The choice of either one CaaS or the other will become even less relevant in the future as leading vendors are all standardizing on Kubernetes. Amazon and Azure are promoting Kubenetes-specific CaaS offerings to focus more on Kubernetes as the underlying orchestration engine. Azure is actually migrating all its users to the Kubernetes service. With Kubernetes the standard orchestration engine, migrating apps and container across cloud providers becomes much easier.

We are also seeing Amazon and Azure working to make it more convenient to develop using containers and Kubernetes. Both firms are offering clusterless or serverless Kubernetes services such as Fargate from AWS and Azure Container Instance. These solutions enable developers to just deploy containers without having to worry about servers or clusters. This approach will make it easier for developers but the additional level of abstraction also reduces flexibility and increases switching costs. 

Amazon’s open sourcing of Firecracker, the micro VM that supports the serverless platform Lambda and Fargate, will be another interesting development to watch. This may prove to be Amazon’s response to Kubernetes but for the serverless market. While still a ways off this could lead to a serverless ecosystem that is just as flexible as the container landscape.

What do you think?

Do you feel strongly about the container solution you are using? 

Or perhaps you feel poorly about certain other containers. 

Let us know about it and have your voice heard by taking the Developer Economics survey.