Bi-weekly podcast explores the fast-growing world of data in the cloud, including database platforms, cloud services, analytics, and business use cases. Hosted by tech journalist and analyst John Foley with expert guests from across the database market.
Hi everyone! This is an update to my recent blog post on the final days of the legacy data warehouse (link below).
The topic of legacy data warehouses slowly fading away struck a chord with many readers. Now we have updates from Snowflake and Teradata.
On Aug 24, the same day I published “The Final Days of the Legacy Data Warehouse,” Snowflake announced its earnings for Q2 FY2023. Not surprisingly, a question about legacy systems came up during Snowflake’s earnings call. One financial analyst asked Snowflake CEO Frank Slootman about the level of activity of customers migrating from on-premises systems to Snowflake’s data cloud.
Slootman: “In the last week, I've heard two very, very iconic names in two different industries that were staunch on-premises people, who would never ever go cloud, and that are now going [cloud]. So I just feel that the resistance is completely breaking….A lot of this is that they’re going to get left behind. You can’t take advantage of innovations that are only available on the cloud. We’re going to see acceleration out of this.”
Is he right? I have no doubt that he is.
According to Ocient, 59% of respondents to its survey are actively looking to switch data warehouse providers. They specifically named IBM, Cloudera, and Teradata as the top 3 legacy environments that data managers want to move away from.
Their reasons:
· 40% want to modernize their legacy platforms
· 42% feel their existing system isn’t comprehensive enough, and
· 36% say it’s not flexible enough
This explains why Snowflake, with its data cloud and data marketplace, has become such a tour de force. Other disruptors are Databricks, Firebolt, SingleStore, TileDB, Yellowbrick, and of course AWS, Google, and Microsoft.
I would include Ocient as well, with its hyperscale data warehouse platform, which is capable of analyzing trillions of records.
The old guard responds
Where does that leave traditional data warehouse providers—companies like IBM and Teradata? They know that their customers want newer, cloud-native platforms. And they’re taking steps to modernize their offerings.
That brings me back to Teradata, which recently made a product announcement that is relevant to this whole discussion.
Teradata is synonymous with the older data warehouses that many organizations are looking to replace. But Teradata is fighting back, as SVP Ashish Yajnik described to me in an earlier Cloud Database Report podcast conversation (link below).
Teradata’s new cloud-native architecture
Now, Teradata has just introduced VantageCloud Lake, a new and improved cloud data warehouse that is based on a cloud-native architecture. It offers modern capabilities like object storage in the cloud, auto scaling, and self-service in AWS, and will soon be available in other clouds.
So the decision to move to a cloud data warehouse is getting easier, but also harder in some respects.
Easier because that’s the inevitable direction the industry is heading. For CIOs and CTOs the question is when, not if.
Harder because incumbent vendors like Teradata are not standing by while Snowflake and Databricks pick off their installed base. They’re responding with cloud-native platforms of their own.
Who will be the next leaders in this fast-changing market? We’ll have to wait a while longer for the query results on that question.
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit clouddb.substack.com
Has the database market attempted to solve data complexity — only to create even more complexity?
That’s the argument of Raj Verma, CEO of SingleStore, who thinks he has the answer to the plethora of databases found in many of today’s IT environments: one database that can handle operational data, analytics, and many different data types in a single, unified platform.
It’s not a new idea: the database industry went down the path of “universal databases” back in the 1990s (e.g., Illustra, Informix), and SingleStore isn’t the only vendor with an all-purpose DBMS. But the company is establishing its database as a viable solution among the many that are out there. For that reason, I added SingleStore to the Cloud Database Report’s Top 20 list earlier this year (see below).
You may remember SingleStore by its former name, MemSQL. The company was rebranded in 2020, and has been growing, expanding, and building its database for modern applications.
On the latest episode of the Cloud Database Report podcast, I talked to CEO Raj Verma about the rebranding of SingleStore, multi-model databases, the competitive landscape — and Verma’s ambitious goal of being on the short list of preferred database providers for large organizations.
“We feel that enterprises will spend 95% of their database dollars on probably three companies in the future,” Verma says. “And we want to be one of them.”
Recent moves
SingleStore has hired two Microsoft veterans to lead engineering and product development. Shireesh Thota joins as SVP of engineering to oversee development of the company’s multi-model SQL database. And Yatharth Gupta heads up product management and design as VP of product management.
In an expanded partnership, IBM has agreed to license and support the SingleStore database. SingleStore was already available via IBM’s Cloud Pak for Data and in the Red Hat Marketplace. IBM has also become an investor in SingleStore.
Last September, SingleStore announced $80 million in Series F funding. Investors include Dell, HPE, and Google Ventures, among others.
As you can see, SingleStore is associating itself with some of the biggest names in enterprise tech. While that doesn’t assure success, it certainly lends credibility to its unified database proposition and strategic direction.
All of which serves as the backdrop for my conversation with Raj Verma.
Key topics from the interview include:
The rebranding of SingleStore
How 'Database 3.0' is different from earlier eras
What is data intensity?
All-purpose databases vs. purpose-built DBMSs
What organizations can do to simplify database sprawl
Rethinking the post-pandemic workplace
What’s next for SingleStore
Quotes from the podcast:
“Our mission is very simple. It is to unify and simplify modern data.”
“The volume, variety, and velocity of data just inundated enterprise organizations.”
“We feel the future will belong to a database that can combine a vast majority of workloads in a hybrid, multi-cloud environment.”
“The personality of data is ever evolving.”
“The shelf life of data is going down dramatically, and the volume is increasing. So without speed, you're going to be done, you know what I mean?”
“This convergence of databases is a foregone conclusion, in my opinion....I am fairly confident that there will be a massive consolidation in the database space.”
This audio article was originally published by the Cloud Database Report on March 2, 2022.
Gartner’s Magic Quadrant has long served as a proof point of a vendor’s relevance in its respective market. But what about those that don’t make it into the quadrant? Here are my observations about six key players—DataStax, Micro Focus, MongoDB, Neo4j, Yellowbrick Data, and Yugabyte—that were not included in Gartner’s Cloud Database MQ for 2021.
You can listen here, or read the full story below.
Many of Teradata's customers continue to manage enterprise data warehouses on premises, while transitioning to cloud services over months or years. Yajnik is responsible for Teradata’s product transformation to the cloud, which is a high priority as the company repositions its data warehouse platform for use in hybrid and multi-cloud environments.
Over the past few months, Teradata has struck industry partnerships with AWS and Microsoft Azure. Recent customer announcements include Telefonica, Volkswagen, and Tesco.
2021 was a busy year for cloud databases, with startups like Cockroach Labs, DataStax, and SingleStore challenging larger, established vendors like Oracle, IBM, and SAP, and of course the Big 3 cloud providers: Microsoft, AWS, and Google Cloud.
There’s a lot of momentum carrying into 2022. A few observations on products and platforms.
A few comments about the competitive landscape. I see 3 major trends.
For more on the latest trends in the cloud database market, register for Acceleration Economy's Cloud Database Battleground on January 27, 2022. The digital event will be hosted by John Foley, editor of the Cloud Database Report and database analyst with Acceleration Economy. Registration is free.
Participating companies include Couchbase, Cockroach Labs, DataStax, Redis, SingleStore, and Yugabyte. Each vendor will answer the same five questions:
Ocient is a software startup that specializes in complex analysis of the world's largest datasets. Early adopters are hyperscale web companies and enterprises that need to analyze data sets of billions or trillions of records.
Prior to Ocient, Gladwin was the founder of object storage vendor Cleversafe, acquired by IBM in 2015. That experience with mega-size data storage carried over to Ocient, whose software is optimized to run on NVMe solid state storage, industry standard CPUs, and 100 GB networking.
John Foley is editor of the Cloud Database Report and senior analyst with Acceleration Economy.
Yellowbrick Data is a 7-year-old startup that continues to grow in the highly competitive cloud data warehouse market. Yellowbrick recently raised $75 million in its latest round of capital funding as it expands into a variety of industries, including telecom, healthcare, retail, and manufacturing.
Yellowbrick describes itself as a cloud-native data warehouse. It is available for deployment on premises and in hybrid cloud and multi-cloud environments.
With a PhD in Computer Science and Engineering from the Hong Kong University of Science and Technology, Papadopoulos worked as a research scientist at Massachusetts Institute of Technology and Intel Labs prior to launching TileDB. As he explains in this interview, the idea for TileDB originated in that research work in emerging big data systems and the hardware requirements to support those workloads.
Universal databases are not new, but they are re-emerging as an alternative to the single-purpose databases that have become popular in the tech industry.
Ranganathan discusses the design considerations that influenced development of YugabyteDB, including the learnings gleaned from the engineering team’s previous work at Facebook. YugabyteDB can be deployed on premises or as a cloud service. With built-in replication, YugabyteDB can be used to distribute data across geographic regions in support of data localization requirements and for high availability.
In this episode of the Cloud Database Report Podcast, editor and host John Foley talks with Ciaran Dynes, Chief Product Officer of Matillion, about the process of integrating and preparing data for cloud data warehouses. Ciaran is responsible for product strategy and incorporating customer requirements into Matillion’s products, which include software tools for data integration and ETL/ELT.
The adoption of cloud databases is accelerating, driven by business transformation and the need for database modernization.
In this episode of the Cloud Database Report Podcast, founding editor John Foley talks with Andi Gutmans, Google Cloud's GM and VP of Engineering for Databases, about the platforms and technologies that organizations are using to build and manage these new data environments.
Gutmans is responsible for development of Google Cloud's databases and related technologies, including Bigtable, Cloud SQL, Spanner, and Firestore. In this conversation, he discusses the three steps of cloud database adoption: migration, modernization, and transformation. "We're definitely seeing a tremendous acceleration," he says.
Gutmans talks about the different types of database migrations, from "homogenous" migrations that are relatively fast and simple to more complex ones that involve different database sources and target platforms. He reviews the tools and services available to help with the process, including Google Cloud's Database Migration Service and Datastream for change data capture.
Gutmans provides an overview of the "data cloud" model as a comprehensive data environment that connects multiple databases and reduces the need for organizations to build their own plumbing. Data clouds can "democratize" data while providing security and governance.
Looking ahead, Google Cloud will continue to focus on database migrations, developing new enterprise capabilities, and providing a better experience for developers.
Alexa Weber Morales has years of experience writing about the developer community, cloud infrastructure, and database tools. She had a long career in tech journalism, including as Editor in Chief of Software Development magazine, prior to joining Oracle as a writer, editor, and content strategist.
In this podcast, John Foley, Editor of the Cloud Database Report, talks to Alexa about cloud-native database development, digital transformation, online education, and more. The conversation ranges from Kubernetes to Java development to building applications with Oracle's Apex low-code development platform. Alexa also talks about what motivates and inspires developers.
An interesting note about Alexa: she is also a Grammy Award-winning singer, songwriter, and musician known for her work in salsa jazz. In this podcast, Alexa talks about using online learning to write her first symphony.
Pinecone Systems' new vector database provides similarity search as a cloud service. Use cases include recommendations, personalization, image search, and deduplication of records.
A vector, or vector embedding, is a string of numbers that represents documents, images, or other data. Vectors are used in the development of machine learning applications. A vector database stores, searches, and retrieves the representations by similarity or by relevance.
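To make the idea of retrieval by similarity concrete, here is a minimal sketch in plain Python. It is not Pinecone's API or implementation. The document IDs, the tiny three-dimensional vectors, and the function names cosine_similarity and top_k are all made up for illustration, and the brute-force scan stands in for whatever indexing a real vector database would use at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings; real ones typically have hundreds of dimensions.
index = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.8, 0.3, 0.1],
    "doc-taxes": [0.0, 0.2, 0.9],
}

query = [1.0, 0.2, 0.0]  # assumed embedding of the search query
results = top_k(query, index, k=2)

for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

In this toy data the two pet-related vectors point in nearly the same direction as the query, so they rank ahead of the unrelated one. Scanning every record works for a handful of items; the value of a dedicated vector database is doing the same kind of lookup efficiently across very large collections.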
Pinecone’s vector database is accessed through an API. Early adopters range from startups to large companies with machine learning initiatives that need to scale.
Pinecone Systems’ lead investor was also an early investor in Snowflake, and the similarities don’t stop there.