Database management is much more complicated now that Big Data has arrived on the scene. In addition to traditional, structured data like business contacts and product intelligence, we now have semi-structured and unstructured data coming at us fast and furious from all directions.
The biggest source of this hard-to-analyze information is the mobile web. The flow of data just doesn’t slow down as more and more people around the world access the Internet and use social media on mobile devices. And most organizations struggle to collect, organize, store and analyze all of it on their own.
Enter the Cloud: a viable option for companies that don’t have a lot of money for capital investments in equipment or the budget to maintain an IT department of the size needed to manage Big Data in house.
Experts expect database-as-a-service (DBAAS), just like all the other “as-a-service” options out there, to eventually become the standard solution for all but the most highly sensitive and mission-critical data.
A variety of cloud database management systems are available to store and analyze both relational (SQL) and non-relational (NoSQL) types of data. Here’s a list of some of the leading service providers and their solutions:
Cloud database management options
- Microsoft Azure / SQL Database – A “full featured relational database-as-a-service,” with “Tables” that offer NoSQL capabilities for storing large amounts of unstructured data, and “Blobs” (Binary Large Objects) for storing large amounts of unstructured text, video, audio and images.
- Amazon Web Services / DynamoDB / Relational Database Service – Amazon’s offerings include NoSQL, MySQL, Oracle and MS SQL Server solutions. SimpleDB is Amazon’s “highly available and flexible non-relational data store that [takes on] the work of database administration.”
- Xeround – A fully managed MySQL DBAAS that the vendor calls a “drop-in solution” because it “automates all configuration and ongoing DB operations.”
- Google Cloud SQL / Google App Engine Datastore – Google’s solutions for storing structured and unstructured data.
- ClearDB – This MySQL DBAAS boasts 100% uptime due to its “multi-regional read/write mirroring.”
- Database.com – A native cloud database service developed in house at Salesforce.com that became generally available in 2011. The vendor’s website says it was “built with the needs of a social and mobile world at its core, not as an afterthought.”
The unique features of cloud databases (namely the ability to distribute data across wide geographical areas and among different servers in one physical data center) are based on cloud computing technology made possible by virtualization, something relational database management systems (RDBMS) were not designed for.
To get around this limitation, leading DBAAS companies including Microsoft and Amazon offer their own RDBMS applications or software optimized for the cloud computing environment.
Leaving a legacy RDBMS behind
Moving to the Cloud can be straightforward or quite complex, depending on the application. In a recent white paper, DataStax Corporation talks about the benefits of moving data to a NoSQL database in the Cloud when an organization outgrows its legacy RDBMS.
NoSQL is a class of non-relational database management systems designed specifically to store and retrieve large quantities of data without defined relationships (i.e., Big Data). Data stored in a NoSQL database can still be structured, however. For those readers well-schooled in database terminology, NoSQL systems typically:
- Do not use SQL as their query language,
- Guarantee eventual consistency only (not ACID), and
- Have a distributed, fault-tolerant architecture.
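The second property is the one that most often surprises people coming from an RDBMS. As a rough illustration only, the toy Python sketch below models eventual consistency with a hypothetical in-memory store. The class and method names (`ToyReplicatedStore`, `write`, `read`, `sync`) are invented for this example and do not correspond to any vendor's API, and the last-write-wins rule is just one of several ways real systems reconcile replicas.

```python
class ToyReplicatedStore:
    """Hypothetical in-memory model of an eventually consistent key-value store."""

    def __init__(self, replica_count=3):
        # Each replica is an independent dict mapping key -> (version, value).
        self.replicas = [{} for _ in range(replica_count)]
        self.clock = 0  # simple counter used as a version stamp

    def write(self, key, value, replica=0):
        # Accept the write on a single replica; the others are not updated yet.
        self.clock += 1
        self.replicas[replica][key] = (self.clock, value)

    def read(self, key, replica=0):
        # Return whatever this replica currently holds, which may be stale.
        entry = self.replicas[replica].get(key)
        return entry[1] if entry else None

    def sync(self):
        # Reconciliation pass: every replica adopts the newest version of each
        # key (last-write-wins, an assumption made for this sketch).
        newest = {}
        for replica in self.replicas:
            for key, entry in replica.items():
                if key not in newest or entry[0] > newest[key][0]:
                    newest[key] = entry
        for replica in self.replicas:
            replica.update(newest)


store = ToyReplicatedStore()
store.write("cart:42", ["book"], replica=0)
stale = store.read("cart:42", replica=2)  # replica 2 has not seen the write yet
store.sync()
fresh = store.read("cart:42", replica=2)  # after sync, all replicas agree
```

Between the write and the call to `sync()`, a reader that happens to hit a different replica sees old data. That window is the trade-off behind "eventual consistency only (not ACID)": an application has to tolerate brief staleness in exchange for a distributed, fault-tolerant design.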
DataStax is a leading provider of enterprise-level cloud database products and services that are based on the open-source NoSQL database Apache Cassandra. In the paper, the authors argue that moving data to the Cloud requires more consideration than one might think because an RDBMS is not designed to run on a virtualized platform. DataStax contends that moving an existing RDBMS from an on-premises server to a cloud platform “in no way maximizes the capabilities of a cloud computing environment.”
If your company is outgrowing its legacy RDBMS and you think it’s time to move your data to the Cloud, there are eight defining characteristics of NoSQL cloud databases that you should be aware of, according to DataStax:
- Elasticity
  - Elasticity is the ability to “add and subtract nodes (defined as actual physical machines or virtual machines) when the underlying application and business demands it.”
  - Nodes are added and removed on the fly in response to demand, so no downtime occurs.
  - When it comes to elasticity, an RDBMS makes “elastic expansion and contraction tricky and complex to manage.”
- Scalability
  - Elasticity makes it possible to scale out in a linear fashion so that database performance increases when necessary.
  - For example, “if two nodes are able to handle the throughput of 200,000 transactions, then four nodes should be able to manage 400,000” if demand spikes.
  - The ability to scale out also means that large volumes of data can be processed in about the same time as small volumes, so the response times laid out in Service Level Agreements can be met even when demand fluctuates.
- High availability
  - High availability or “uptime” is critical to businesses that can lose tens of thousands or even millions of dollars per minute of downtime, depending on the industry.
  - Cloud databases claim high availability because they “piggyback off of a cloud provider’s infrastructure,” which is designed to provide easy data distribution and redundancy.
- Easy data distribution
  - Because cloud providers have the ability to “distribute [computing] resources and data across different geographies or ‘zones’…the underlying database of a cloud application…[can] read and write from any node that makes up the cloud database.”
- Data redundancy
  - Redundant copies of data are important so that if “the primary copy is destroyed, another copy is available for use.”
  - Redundant copies can be stored over a wide geographic area or within the same data center on different physical server racks.
  - Distributed, redundant copies of data are what make high availability possible.
- Support for all datatypes
  - Cloud-based NoSQL databases “offer flexible and dynamic schema that accepts all key data formats” including structured, semi-structured, and unstructured.
  - A traditional RDBMS, by contrast, is built around structured data with a fixed schema.
- Easier manageability
  - The vendor provides a tool or set of tools for carrying out “routine administrative operations.” Usually these tools are accessed via a web browser.
- Lower cost
  - The elasticity and scalability of cloud databases are what make them less costly, because the pricing model for cloud computing is pay-as-you-go.
  - A traditional RDBMS can be just as expensive to run in the Cloud as it is on premises, because it is complex and does not scale out well.
  - DataStax advises: “When looking to implement a database in the cloud, IT professionals should seek a cost structure that is friendly to scaling out horizontally, regardless of machine size or the data volume being managed.”
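The linear scale-out example in the list above is simple arithmetic, and the short Python sketch below works it through. It assumes perfectly linear scaling, which is the idealized case DataStax describes rather than a guarantee for any real cluster. The function names and the 650,000-transaction spike are hypothetical; the per-node figure is derived from the quoted "two nodes, 200,000 transactions" example.

```python
import math


def cluster_throughput(nodes, per_node_throughput):
    """Idealized linear scaling: capacity grows in proportion to node count."""
    return nodes * per_node_throughput


def nodes_needed(target_throughput, per_node_throughput):
    """Smallest node count whose idealized capacity meets the target."""
    return math.ceil(target_throughput / per_node_throughput)


# Derived from the quoted example: two nodes handle 200,000 transactions.
PER_NODE = 200_000 // 2

four_node_capacity = cluster_throughput(4, PER_NODE)  # the quoted four-node case
spike_nodes = nodes_needed(650_000, PER_NODE)         # hypothetical demand spike
```

Under a pay-as-you-go pricing model, the same calculation run in reverse tells you how many nodes can be released once the spike passes, which is where the cost argument for elasticity comes from.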
These eight characteristics of NoSQL cloud databases are helpful background information that IT managers can use when contemplating moving company data to a cloud platform. DataStax maintains that an organization will be disappointed if it expects to gain flexibility and realize cost savings just by moving a legacy RDBMS to the Cloud.