Continuing Datamation’s series on big data, Internet of Things (IoT) and artificial intelligence offerings from major cloud providers, it’s time to switch gears from Microsoft Azure to Google Cloud Platform. And given the vast amounts of data that power the search giant’s services, it’s only fitting to start with big data and analytics.
Google Cloud Platform Big Data Portfolio
Cloud-based big data services are increasingly seen as a way for enterprises to gain a competitive edge. There are several benefits to letting big cloud providers handle an organization’s big data, including simplified provisioning, lower IT management requirements, reduced costs and the ability to access data from practically anywhere, to name a few.
Businesses considering placing their big data and analytics workloads on Google Cloud Platform will find a robust portfolio of services that takes a serverless approach to building and delivering data-driven applications. Of course, that raises the question, what is serverless?
Despite the name, serverless computing does indeed require servers and their processors to keep applications up and running. The difference is that serverless cloud solutions automate the provisioning and configuration of those servers, freeing developers to create and update apps while worrying less about setting up and managing the servers required to run them.
Google uses this approach in BigQuery, an enterprise big data warehouse for low-cost business analytics on petabyte-scale datasets. Carrying on the serverless theme, Google boasts that “there is no infrastructure to manage, no need to guess the needed capacity or overprovision, and you don’t need a database administrator.”
Google’s NoSQL big data database service is called Cloud Bigtable. If there is any doubt about its ability to handle massive workloads, the company reminds customers that it is the same database that powers Google Search, Gmail and Maps, among many more services.
For developers who are familiar with Apache’s open source big data technologies, Google offers Cloud Dataproc. The managed service for Apache Hadoop and Spark offers automated cluster management, per-second billing and scalable clusters that quickly settle into their new sizes.
The company’s batch and stream data processing offering is called, fittingly enough, Cloud Dataflow. It can be used for batch computation, ETL (extract, transform, load) and streaming analytics. Again, developers who are well-versed in the Apache ecosystem will feel right at home.
“Cloud Dataflow supports fast, simplified pipeline development via expressive Java and Python APIs in the Apache Beam SDK, which provides a rich set of windowing and session analysis primitives as well as an ecosystem of source and sink connectors,” states Google. “Plus, Beam’s unique, unified development model lets you reuse more code across streaming and batch pipelines.”
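To make the idea of windowing more concrete, here is a minimal sketch in plain Python. It does not use the Apache Beam SDK or any Google API; the function name `fixed_window_counts` and the sample events are hypothetical and exist only for illustration. The sketch groups timestamped events into fixed, non-overlapping 60-second windows and counts them per device, which is the general kind of aggregation a streaming pipeline performs.

```python
from collections import defaultdict


def fixed_window_counts(events, window_seconds=60):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs.
    This is a conceptual illustration, not Beam's windowing implementation.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each timestamp to the start of the window it falls in.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical readings: (seconds since start, device id).
sample_events = [
    (5, "sensor-a"),
    (42, "sensor-a"),
    (61, "sensor-b"),
    (75, "sensor-a"),
    (130, "sensor-b"),
]
window_counts = fixed_window_counts(sample_events)
```

In a real pipeline the same logic would need to cope with unbounded, late or out-of-order data, which is the sort of problem windowing primitives in a framework such as Beam are meant to address.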
For exploring big data and deriving insights from it, Google offers Cloud Datalab. It is based on Jupyter, an open-source data science platform that allows users to create and share interactive documents that contain visualizations, text and live code.
Cloud Dataprep, as its name suggests, allows users to clean up their structured and unstructured data in preparation for analysis. Currently in beta, the service is operated by Trifacta but integrates seamlessly with Google’s ecosystem, down to the company’s licensing scheme and the user experience.
Finally, to turn big data insights into charts and dashboards that business executives and typical office workers can understand, Google offers Data Studio. Also in beta, the service can be used to create shareable visualizations and reports that can help workforces make data-driven decisions.
Google Cloud Platform IoT Portfolio
Today and in the near future, a good portion of the big data generated by businesses is likely to come from IoT devices. Google is ready for that, too.
The Google Cloud IoT portfolio overlaps somewhat with the company’s suite of big data solutions (more on that later). But for the “core” functionality involved in connecting and managing a vast fleet of IoT devices, there’s Cloud IoT Core.
Launched in September 2017 and currently in beta, Cloud IoT Core enables businesses to securely link their IoT devices to Google’s analytics and AI services.
“With Cloud IoT Core, you can easily connect and centrally manage millions of globally dispersed IoT devices,” wrote Indranil Chakraborty, product manager of Google Cloud, in the beta launch announcement. “When used as part of the broader Google Cloud IoT solution, you can ingest all your IoT data and connect to our state-of-the-art analytics services including Google Cloud Pub/Sub, Google Cloud Dataflow, Google Cloud Bigtable, Google BigQuery, and Google Cloud Machine Learning Engine to gain actionable insights.”
Google Cloud Pub/Sub is used to ingest IoT event streams, enabling event-driven computing and stream analytics. Attentive readers will notice that a number of big data services from Google, namely Cloud Dataflow, Bigtable and BigQuery, can be harnessed to gather and act upon the real-time data produced by IoT deployments.
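For readers unfamiliar with the publish/subscribe pattern behind event-driven ingestion, the following is a toy in-memory sketch in plain Python. It is not the Cloud Pub/Sub client library, and the `InMemoryBroker` class, the `telemetry` topic and the device readings are all hypothetical. It only shows the core idea: devices publish events to a topic, and any number of independent subscribers react to each event.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy publish/subscribe broker (illustration only, not Cloud Pub/Sub)."""

    def __init__(self):
        # Maps a topic name to the list of callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber of the topic.
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


def alert_if_hot(message, alerts, threshold_c=30):
    # Hypothetical rule: flag readings above the threshold.
    if message["temp_c"] > threshold_c:
        alerts.append(message)


broker = InMemoryBroker()
archive = []  # stands in for long-term storage
alerts = []   # stands in for a real-time alerting consumer

broker.subscribe("telemetry", archive.append)
broker.subscribe("telemetry", lambda message: alert_if_hot(message, alerts))

broker.publish("telemetry", {"device": "thermostat-1", "temp_c": 22})
broker.publish("telemetry", {"device": "thermostat-2", "temp_c": 35})
```

Because the publisher knows nothing about its subscribers, a new consumer (a dashboard, a storage sink, a machine learning job) can be attached without touching the devices, which is the property that makes the pattern attractive for IoT event streams.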
For businesses that want to take a crack at building their own IoT devices, Google is running a developer preview of its Android Things program. Built on the company’s mobile and IoT device platform, the program is intended to help organizations rapidly prototype their ideas and turn them into commercial products.
Google Cloud Platform AI Portfolio
Developers seeking to build AI-enabled apps have a range of Google APIs at their disposal.
Google Cloud Vision API can be used to create applications that can detect and identify objects and faces in an image. It can also be used to extract text from pictures or even keep objectionable material at bay.
Google Cloud Video Intelligence turns videos into searchable content. Using the company’s library of 20,000 labels, it automatically analyzes video and can identify objects and when they appear.
Google Cloud Speech API can be used to turn speech into text and help turn voice commands issued in over 110 languages and variations into action. The Cloud Translation API can be used to break down language barriers with real-time, neural network-based translation services.
Meanwhile, the Google Natural Language API can be used to read between the lines, revealing a user’s intent based on a text chat or determining the sentiment surrounding a brand or product based on social media posts.
Businesses can build chatbots with DialogFlow Enterprise Edition. Google offers more than 30 pre-built virtual agents that can be used as templates for conversational experiences in websites, messaging apps, and more.
Google Cloud Machine Learning Engine uses the company’s own TensorFlow framework, which is used in Google Photos and other products, to help users build machine learning models. Models produced by the service can support terabytes of data and up to thousands of users.
Businesses with more specialized needs can use Google Cloud AutoML to train their custom machine learning models. Currently in “alpha,” or pre-beta, it uses the company’s transfer learning and neural architecture search technologies to help users quickly train, evaluate and deploy models based on their own data. Interested parties can request access from Google.
In a very targeted example of putting Google’s AI to work, the company launched a private beta of its Cloud Jobs Discovery service (Cloud Jobs API) that can help staffing agencies and applicant tracking systems connect job seekers and employers.
All told, Google’s track record of maintaining its own ecosystem of crowd-pleasing services, like Google Home and Gmail, makes a compelling case for entrusting the company with an enterprise’s big data, IoT and AI needs.
Pedro Hernandez is a contributing editor at Datamation. Follow him on Twitter @ecoINSITE.