Hands-On Artificial Intelligence on Google Cloud Platform
Hands-On Artificial Intelligence
on Google Cloud Platform
Build intelligent applications powered by TensorFlow, Cloud
AutoML, BigQuery, and Dialogflow
Anand Deshpande
Manish Kumar
Vikram Chaudhari
BIRMINGHAM - MUMBAI
Hands-On Artificial Intelligence on Google
Cloud Platform
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, without the prior written permission of the publisher, except in the case of brief quotations
embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented.
However, the information contained in this book is sold without warranty, either express or implied. Neither the
authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to
have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products
mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy
of this information.
Acquisition Editor: Devika Battike
Content Development Editor: Nazia Shaikh
Senior Editor: Ayaan Hoda
Technical Editor: Utkarsha S. Kadam
Copy Editor: Safis Editing
Language Support Editor: Storm Mann
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Joshua Misquitta
First published: March 2020
Production reference: 1050320
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78953-846-5
www.packt.com
About the authors
Anand Deshpande has over 19 years' experience with IT services and product
development. He is currently working as Vice President of Advanced Analytics and
Product Development at VSquare Systems Pvt. Ltd. (VSquare). He has developed a special
interest in data science and an algorithmic approach to data management and analytics and
co-authored a book entitled Artificial Intelligence for Big Data in May 2018.
This book, and anything worthwhile in my life, is possible only with the blessings of my
spiritual guru, my parents, and my in-laws, and the unconditional love and support of my wife,
Mugdha, and my daughters, Devyani and Sharvari. Special thanks to my friends and co-authors of this book, Manish Kumar and Vikram Chaudhari. Lastly, I would like to
thank Mr. Sunil Kakade and Mr. Ajay Deshmukh for their mentoring and support.
Manish Kumar works as Director of Technology and Architecture at VSquare. He has over
13 years' experience in providing technology solutions to complex business problems. He
has worked extensively on web application development, IoT, big data, cloud technologies,
and blockchain. Aside from this book, Manish has co-authored three books (Mastering
Hadoop 3, Artificial Intelligence for Big Data, and Building Streaming Applications with Apache
Kafka).
I would like to thank my parents, Dr. N.K. Singh and Dr. Rambha Singh, for their
blessings. Writing this book has taken precious time away from my wife, Mrs.
Swati Singh, and my adorable son, Lakshya Singh, and I would like to thank them for their
support throughout. Lastly, I would like to thank my friends and co-authors,
Anand Deshpande and Vikram Chaudhari.
Vikram Chaudhari works as Director of Data and Advanced Analytics at VSquare. He has
over 10 years' IT experience. He is a certified AWS and Google Cloud Architect and has
completed multiple implementations of data pipelines with Amazon Web Services and
Google Cloud Platform. With implementation experience on multiple data pipelines across
platforms, Vikram is instrumental in creating reusable components and accelerators that
reduce costs and implementation time.
I would like to thank my mother, Mrs. Jyoti Chaudhari, for her encouragement and
blessings; my wife, Amruta Chaudhari, for her continuous support and love; and my precious
little gem, Aahana, who had to compromise on some of her weekend plans while I was
busy writing the book. I would also like to thank my mentor, Anand Deshpande, who
has always guided me and helped me get to the next level in my career, and Mr. Manish
Kumar, my co-author and one of the best architects I have worked with. Lastly, I
want to thank the wonderful Packt team for their support throughout the
project.
About the reviewers
Arvind Ravulavaru is a full stack architect and consultant with over 11 years' experience in
software development and 2 years' experience in hardware and product development. For
the last 5 years, he has been working extensively on JavaScript, both on the server side and
the client side and, for the last few years, in IoT, AI, ML, and big data.
Alexey Bokov is an experienced Azure architect and has been a Microsoft technical
evangelist since 2011. He works closely with Microsoft's top-tier customers all around the
world to develop applications based on the Azure cloud platform. Building cloud-based
applications in challenging scenarios is his passion, along with helping the development
community to upskill and learn new things through hands-on exercises and hacking. He's a
long-time contributor to, and co-author and reviewer of, many Azure books, and, from time
to time, speaks at Kubernetes events.
Judy T Raj is a Google Certified Professional Cloud Architect with extensive experience
across the three leading cloud platforms: AWS, Azure, and GCP. She has co-authored
a book on Google Cloud Platform with Packt Publishing. She has also worked with a wide
range of technologies, such as data science, deep learning, computer vision, blockchain, big
data, and IoT and has published many relevant academic papers. Currently employed as a
Data and AI Engineer, she is a passionate coder, a machine learning practitioner, and a
computer vision enthusiast.
Packt is searching for authors like you
If you're interested in becoming an author for Packt, please visit authors.packtpub.com
and apply today. We have worked with thousands of developers and tech professionals,
just like you, to help them share their insight with the global tech community. You can
make a general application, apply for a specific hot topic that we are recruiting an author
for, or submit your own idea.
Table of Contents
Preface
Section 1: Basics of Google Cloud Platform
Chapter 1: Overview of AI and GCP
Understanding the Cloud First strategy for advanced data analytics
Advantages of a Cloud First strategy
Anti-patterns of the Cloud First strategy
Google data centers
Overview of GCP
AI building blocks
Data
Storage
Processing
Actions
Natural language processing
Speech recognition
Machine vision
Information processing and reasoning
Planning and exploring
Handling and control
Navigation and movement
Speech generation
Image generation
AI tools available on GCP
Sight
Language
Conversation
Summary
Chapter 2: Computing and Processing Using GCP Components
Understanding the compute options
Compute Engine
Compute Engine and AI applications
App Engine
App Engine and AI applications
Cloud Functions
Cloud Functions and AI applications
Kubernetes Engine
Kubernetes Engine and AI applications
Diving into the storage options
Cloud Storage
Cloud Storage and AI applications
Cloud Bigtable
Cloud Bigtable and AI applications
Cloud Datastore
Cloud Datastore and AI applications
Cloud Firestore
Cloud Firestore and AI applications
Cloud SQL
Cloud SQL and AI applications
Cloud Spanner
Cloud Spanner and AI applications
Cloud Memorystore
Cloud Memorystore and AI applications
Cloud Filestore
Cloud Filestore and AI applications
Understanding the processing options
BigQuery
BigQuery and AI applications
Cloud Dataproc
Cloud Dataproc and AI applications
Cloud Dataflow
Cloud Dataflow and AI applications
Building an ML pipeline
Understanding the flow design
Loading data into Cloud Storage
Loading data to BigQuery
Training the model
Evaluating the model
Testing the model
Summary
Section 2: Artificial Intelligence with Google Cloud Platform
Chapter 3: Machine Learning Applications with XGBoost
Overview of the XGBoost library
Ensemble learning
How does ensemble learning decide on the optimal predictive model?
Reducible errors – bias
Reducible errors – variance
Irreducible errors
Total error
Gradient boosting
eXtreme Gradient Boosting (XGBoost)
Training and storing XGBoost machine learning models
Using XGBoost trained models
Building a recommendation system using the XGBoost library
Creating and testing the XGBoost recommendation system model
Summary
Chapter 4: Using Cloud AutoML
Overview of Cloud AutoML
The workings of AutoML
AutoML API overview
REST source – pointing to model locations
REST source – for evaluating the model
REST source – the operations API
Document classification using AutoML Natural Language
The traditional machine learning approach for document classification
Document classification with AutoML
Navigating to the AutoML Natural Language interface
Creating the dataset
Labeling the training data
Training the model
Evaluating the model
The command line
Python
Java
Node.js
Using the model for predictions
The web interface
A REST API for model predictions
Python code for model predictions
Image classification using AutoML Vision APIs
Image classification steps with AutoML Vision
Collecting training images
Creating a dataset
Labeling and uploading training images
Training the model
Evaluating the model
The command-line interface
Python code
Testing the model
Python code
Performing speech-to-text conversion using the Speech-to-Text API
Synchronous requests
Asynchronous requests
Streaming requests
Sentiment analysis using AutoML Natural Language APIs
Summary
Chapter 5: Building a Big Data Cloud Machine Learning Engine
Understanding ML
Understanding how to use Cloud Machine Learning Engine
Google Cloud AI Platform Notebooks
Google AI Platform deep learning images
Creating Google Platform AI Notebooks
Using Google Platform AI Notebooks
Automating AI Notebooks execution
Overview of the Keras framework
Training your model using the Keras framework
Training your model using Google AI Platform
Asynchronous batch prediction using Cloud Machine Learning Engine
Real-time prediction using Cloud Machine Learning Engine
Summary
Chapter 6: Smart Conversational Applications Using DialogFlow
Introduction to DialogFlow
Understanding the building blocks of DialogFlow
Building a DialogFlow agent
Use cases supported by DialogFlow
Performing audio sentiment analysis using DialogFlow
Summary
Section 3: TensorFlow on Google Cloud Platform
Chapter 7: Understanding Cloud TPUs
Introducing Cloud TPUs and their organization
Advantages of using TPUs
Mapping of software and hardware architecture
Available TPU versions
Performance benefits of TPU v3 over TPU v2
Available TPU configurations
Software architecture
Best practices of model development using TPUs
Guiding principles for model development on a TPU
Training your model using TPUEstimator
Standard TensorFlow Estimator API
TPUEstimator programming model
TPUEstimator concepts
Converting from TensorFlow Estimator to TPUEstimator
Setting up TensorBoard for analyzing TPU performance
Performance guide
XLA compiler performance
Consequences of tiling
Fusion
Understanding preemptible TPUs
Steps for creating a preemptible TPU from the console
Preemptible TPU pricing
Preemptible TPU detection
Summary
Chapter 8: Implementing TensorFlow Models Using Cloud ML Engine
Understanding the components of Cloud ML Engine
Training service
Using the built-in algorithms
Using a custom training application
Prediction service
Notebooks
Data Labeling Service
Deep learning containers
Steps involved in training and utilizing a TensorFlow model
Prerequisites
Creating a TensorFlow application and running it locally
Project structure recommendation
Training data
Packaging and deploying your training application in Cloud ML Engine
Choosing the right compute options for your training job
Choosing the hyperparameters for the training job
Monitoring your TensorFlow training model jobs
Summary
Chapter 9: Building Prediction Applications
Overview of machine-based intelligent predictions
Understanding the prediction process
Maintaining models and their versions
Taking a deep dive into saved models
SignatureDef in the TensorFlow SavedModel
TensorFlow SavedModel APIs
Deploying the models on GCP
Uploading saved models to a Google Cloud Storage bucket
Testing machine learning models
Deploying models and their version
Model training example
Performing prediction with service endpoints
Summary
Section 4: Building Applications and Upcoming Features
Chapter 10: Building an AI Application
A step-by-step approach to developing AI applications
Problem classification
Classification
Regression
Clustering
Optimization
Anomaly detection
Ranking
Data preparation
Data acquisition
Data processing
Problem modeling
Validation and execution
Holdout
Cross-validation
Model evaluation parameters (metrics)
Classification metrics
Model deployment
Overview of the use case – automated invoice processing (AIP)
Designing AIP with AI platform tools on GCP
Performing optical character recognition using the Vision API
Storing the invoice with Cloud SQL
Creating a Cloud SQL instance
Setting up the database and tables
Enabling the Cloud SQL API
Enabling the Cloud Functions API
Creating a Cloud Function
Providing the Cloud SQL Admin role
Validating the invoice with Cloud Functions
Scheduling the invoice for the payment queue (pub/sub)
Notifying the vendor and AP team about the payment completion
Creating conversational interface for AIP
Upcoming features
Summary
Other Books You May Enjoy
Index
Preface
We are at an interesting juncture in the journey of computation; the majority of enterprise
workloads (databases, computation, and analytics) are moving over to the cloud. Cloud
computing is making it feasible and easy for anyone to have access to high-end computing
power. As a result, we are seeing a fundamental shift in the way computational resources
are utilized by individuals and organizations of all sizes. The providers of cloud computing
infrastructure have also enabled the Software as a Service (SaaS) paradigm. With this,
various services related to storage, compute, machine learning, and visualization are
available without having to perform any server management or administration. The
fundamental shift we are seeing based on our experience is toward a serverless
architecture.
At this juncture, Google Cloud Platform (GCP) is a complete platform provided by Google
based on years of innovation and research that contributed toward building the most
powerful search engine available to date. The same technology stack is now made
commercially available by Google and as a result, everyone has access to massive
computational power. With the tools available on GCP, it is very easy to build complex
data management pipelines as well as machine learning/AI workflows. This
democratization brings enormous power to the development community in building
cutting-edge, innovative, intelligent applications that complement human intelligence and
increase human capabilities many times.
This book is an attempt to make it easy to get started with building intelligent AI systems
using GCP. We have taken a hands-on approach to explain various concepts and
components available on GCP that facilitate AI development. We hope that this book will
be a good starting point to begin exploring an exciting and ever-expanding world of
computation to build AI-enabled applications on GCP.
Who this book is for
This book is for software developers, technical leads, and architects who are planning to
build AI applications with GCP. In addition to that, students and anyone who has a great
idea for an AI application and wants to understand the available tools and techniques to
quickly build prototypes and eventually production-grade applications will benefit from
this book. This book is also useful for business analysts and anyone who understands the
data landscape from the business perspective. Without a great deal of prior hands-on
experience, it is possible to follow this book to build AI-enabled applications based on
domain knowledge. We have attempted to provide the instructions in a step-by-step
manner so the reader finds them easy to follow and implement.
What this book covers
We have divided this book into four sections based on the logical grouping of the content
covered. Section 1 provides the fundamental introduction to GCP and introduces the reader
to various tools available on GCP.
Chapter 1, Overview of AI and GCP, sets the context for serverless computation on the cloud
and introduces the reader to GCP.
Chapter 2, Computing and Processing Using GCP Components, introduces the reader to
various tools and applications available on GCP for end-to-end data management, which is
essential for building AI applications on GCP.
Chapter 3, Machine Learning Applications with XGBoost, shows how one of the most popular
machine learning algorithms, XGBoost, is utilized on GCP. The idea is to enable readers to
understand that machine learning algorithms can be used on GCP without having to worry
about the underlying infrastructure and computation resources.
Chapter 4, Using Cloud AutoML, will help us take our first step toward the democratization
of machine learning. AutoML intends to provide machine learning as a service and makes it
easy for anyone with a limited understanding of machine learning models and core
implementation details to build applications with machine learning models. We will be
introduced to natural language and vision interfaces using AutoML.
Chapter 5, Building a Big Data Cloud Machine Learning Engine, will explore some of the
fundamentals of machine learning in the cloud. There is a paradigm shift with regard to
machine learning models on the cloud. We need to understand various concepts of cloud
computing and how storage and compute are leveraged to build and deploy models. This is
an essential chapter to understand if we want to optimize the costs of training and running
machine learning models on the cloud. We will take a look at various frameworks, such as
Keras, and we'll see how to use them on GCP.
Chapter 6, Smart Conversational Applications Using DialogFlow, discusses conversational
interfaces to machine intelligence, which is an essential component of overall AI
capabilities. In this chapter, we will understand how to use DialogFlow to build
conversational applications. DialogFlow provides an easy web and API interface for
quickly building a conversational application. It is possible to provide human-like verbal
communication once the model is trained for a large number of conversation paths.
Chapter 7, Understanding Cloud TPUs, discusses Tensor Processing Units (TPUs), which are
the fundamental building blocks behind the machine learning models on GCP. In this
chapter, we will introduce readers to TPUs and discuss their organization and significance
for accelerating machine learning workflows on GCP. If we want to optimize performance
and increase speed, it is imperative to utilize the strength of TPUs.
Chapter 8, Implementing TensorFlow Models Using Cloud ML Engine, further explores ML
Engine and shows how to build TensorFlow models on it. We will take a stepwise
approach to training and deploying machine learning models on GCP, and we will
look at recommended best practices for building a machine learning pipeline on
GCP.
Chapter 9, Building Prediction Applications, explains the process of building prediction
applications on GCP. We begin by discussing the basics of the prediction process and take a
step-by-step, hands-on approach to building a prediction application with GCP. We will
train and deploy a model on the platform and utilize the API layer to interface with the
deployed model.
Chapter 10, Building an AI Application, will utilize various components of GCP to build an
end-to-end AI application. We will illustrate this with an example use case: automating an
invoice processing workflow using the tools on GCP.