10 Big Data Cloud Services

  • 10 Big Data Cloud Services

    These leading big data analytics services can help enterprises derive valuable insights from large quantities of data.

  • 1. Amazon Redshift

    Redshift is the Amazon Web Services (AWS) data warehouse offering. The company touts it as a cost-effective way to house big data for analysis with traditional business intelligence (BI) tools. According to the website, "With Amazon Redshift, you can start small for just $0.25 per hour with no commitments and scale out to petabytes of data for $1,000 per terabyte per year, less than a tenth the cost of traditional solutions."

    Key features of the service include fast queries, SQL support, built-in encryption and support for Redshift Spectrum, a tool that allows users to run SQL queries on unstructured data stored in Amazon S3, as well as structured data stored in Redshift.

    In its 2017 Magic Quadrant for Data Management Solutions for Analytics, Gartner named Amazon a "Leader" due to services like Redshift and EMR (see next slide).
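
To put the Redshift pricing quoted above in annual terms, here is a minimal arithmetic sketch. It uses only the two figures from the AWS quote ($0.25 per hour and $1,000 per terabyte per year). The function names, the 365-day year and the decimal conversion of 1 petabyte to 1,000 terabytes are assumptions made for illustration, not AWS's billing rules.

```python
# Rough annualization of the Redshift prices quoted in the article.
# Assumptions (not from AWS): a 365-day year, 1 PB = 1,000 TB,
# and continuous use of the entry-level hourly rate.

HOURS_PER_YEAR = 24 * 365


def annual_cost_hourly(rate_per_hour: float) -> float:
    """Cost of running nonstop for one year at an hourly rate."""
    return rate_per_hour * HOURS_PER_YEAR


def annual_cost_storage(terabytes: float, rate_per_tb_year: float = 1000.0) -> float:
    """Yearly cost at a flat per-terabyte rate."""
    return terabytes * rate_per_tb_year


if __name__ == "__main__":
    print(f"Entry level, run all year: ${annual_cost_hourly(0.25):,.2f}")
    print(f"One petabyte for a year:   ${annual_cost_storage(1000):,.2f}")
```

Under these assumptions, the entry-level rate comes to about $2,190 a year, and a petabyte at the quoted per-terabyte rate comes to about $1 million a year.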

  • 2. Amazon EMR

    AWS also offers a Hadoop-based big data cloud service called Amazon EMR. (EMR originally stood for Elastic MapReduce, but AWS now uses only the acronym.) In addition to Hadoop, it supports other popular open source big data tools, including Apache Spark, HBase, Presto and Flink, and it can also integrate with other AWS services like storage and databases. According to the website, "Amazon EMR securely and reliably handles a broad set of big data use cases, including log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation, and bioinformatics."

  • 3. HDInsight

    Microsoft Azure's competitor to EMR is called HDInsight. It supports a wide variety of open source big data tools, such as Hadoop, Spark, Hive, LLAP, Kafka, Storm, HBase and R. Well-known companies that use the service include AccuWeather, Toyota, LG, Schneider Electric and GE.

    Last month, Microsoft announced that it was reducing prices for HDInsight by 52 percent, and that it was dropping the price for R Server for Azure HDInsight by a full 80 percent. It also rolled out new features, including an enterprise security package, integration with Azure Log Analytics and Power BI direct query, and plug-ins for IntelliJ and Eclipse.

  • 4. Data Lake Analytics

    Microsoft also offers a separate but similar big data cloud service called Data Lake Analytics. It is based on YARN, the same technology that is at the core of Hadoop, but it's a little different from most cloud-based Hadoop services. The focus here is on ease of use: developers need to know only one language, U-SQL (a mash-up of SQL and C#), rather than the several languages they might need for many other Hadoop services. It integrates with other Microsoft tools like Visual Studio and Azure Active Directory, and the platform handles most infrastructure management tasks automatically. Another distinctive feature is that pricing is based on the job rather than the hour, which can make it easier to calculate costs.

  • 5. Google BigQuery

    Like Data Lake Analytics, Google BigQuery aims to be extremely easy to use. It's a "serverless" cloud service, which means that users don't need to worry about configuring or managing infrastructure in any way; BigQuery takes care of all of that for them. It also has a reputation for being extremely fast, having performed well in industry benchmark tests. And it's affordable, with a free tier that includes 1TB of analyzed data and 10GB of stored data. It's a data warehouse, meant for use with structured data, and it can also integrate with Google Cloud Platform's machine learning and artificial intelligence (AI) tools.

    In the recent Forrester Wave report on Insight Platforms-as-a-Service for Q3 '17, Google was the only company to be listed as a "Leader."

  • 6. Cloud Dataflow

    Google's lineup of big data cloud services also includes Cloud Dataflow, a fully managed service for transforming and enriching data. For many organizations, getting big data ready for processing with analytics tools is a complex task that consumes a great deal of time and energy. Cloud Dataflow is a "serverless" service that greatly simplifies this task. It works with both batch and streaming data, and in some benchmark tests, it has performed faster than Apache Spark. It integrates with other Google Cloud big data tools, including BigQuery.

  • 7. IBM Watson Data Platform

    IBM Watson Data Platform isn't just a single big data cloud service. Instead, it's a collection of services for preparing, storing and analyzing big data, as well as for creating applications with embedded analytics and intelligence. It includes Hadoop- and Spark-based tools, as well as machine learning tools and much more. It also integrates with other IBM Cloud services, including storage, Watson cognitive computing services and developer tools.

    In the Forrester Wave report on Insight Platforms-as-a-Service for Q3 '17, IBM was second only to Google and was named a "strong performer."

  • 8. Oracle Big Data Cloud Service

    Oracle Big Data Cloud Service is an automated data science and analytics service with built-in Hadoop and Spark engines. It integrates with Oracle Database and other Oracle applications, as well as with the company's other big data cloud services, including Big Data SQL Cloud Service and the Big Data Cloud Machine for private and hybrid cloud capabilities. Key capabilities include spatial and graph analysis, R support, comprehensive security and on-demand provisioning.

  • 9. Cloudera Altus

    Cloudera offers one of the best-known enterprise distributions of Hadoop, and its Altus service delivers some of those capabilities as a cloud PaaS. It includes Altus Data Engineering for data preparation and the Altus Analytic DB, a self-service analytics and BI service that is currently a closed beta offering. Note, however, that Cloudera does not manage its own cloud infrastructure; Altus runs on AWS.

    The company also offers Cloudera Director, a tool for running and managing Cloudera Hadoop clusters on AWS, Microsoft Azure or Google Cloud Platform.

  • 10. Qubole Data Service

    Qubole describes its data service as "the first autonomous big data platform." It uses analytics capabilities to analyze and optimize your big data platform, helping to improve reliability, performance and cost. In other words, it uses analytics to improve your analytics. It runs on AWS, Microsoft Azure or Oracle Cloud. It includes Hadoop, Spark, Presto, Hive and other open source big data tools, as well as comprehensive security capabilities. According to the company, it costs 30 to 50 percent less than other big data cloud services and up to 80 percent less than big data solutions deployed on-premises.

Choosing a big data cloud service is a critical task for businesses. For many organizations, 2018 will be the year that they migrate their big data, and especially their big data analytics, to the public cloud.

Forrester analyst Brian Hopkins has said that enterprises that want to remain competitive absolutely must move to the cloud. "Enterprise architects must recognize that the combination of big data and public cloud is not just a trend; it is an extinction-level event for digital dinosaurs," he wrote. "Digital predators who get there first will exploit the accelerating cycle of big data innovation in the public cloud, becoming more customer obsessed. Digital dinosaurs will recognize they are too late, will scramble to win back customers, and eventually die off."

So if enterprises don't want to become one of those extinct "digital dinosaurs," which big data cloud service should they use?

Organizations have literally dozens of big data cloud services they could choose from. Many of those are based on the open source Hadoop framework. But that doesn't mean they are all the same.

"As enterprises strategize to 'future-proof' their big data analytics investments across on-premise and multi-cloud Data Lakes, they need to take into consideration the fact that not every data platform is born equal," said Dave Mariani, CEO of vendor AtScale, which offers a platform to improve the performance of other analytics and BI solutions.

Mariani added that organizations might also have reasons to choose a big data cloud service that isn't based on Hadoop. He said, "Hadoop is a data processing platform — not a database — which makes it more flexible for a variety of workloads. But if you're looking for a data warehouse, Amazon RedShift and Google BigQuery are great out-of-the-box choices for running a data warehouse in the Cloud without the overhead of managing the infrastructure."

The following slideshow highlights ten big data cloud services that enterprises might want to consider.
