Analytics for the Internet of Things (IoT)

Intelligent Analytics for Your Intelligent Devices

Andrew Minteer

BIRMINGHAM - MUMBAI

Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2017

Production reference: 1210717

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78712-073-0

www.packtpub.com

Credits

Author: Andrew Minteer
Copy Editor: Yesha Gangani
Reviewer: Ruben Oliva Ramos
Project Coordinator: Judie Jose
Commissioning Editor: Kartikey Pandey
Proofreader: Safis Editing
Acquisition Editor: Namrata Patil
Indexer: Aishwarya Gangawane
Content Development Editor: Abhishek Jadhav
Graphics: Kirk D'Penha
Technical Editor: Prachi Sawant
Production Coordinator: Aparna Bhagat

About the Author

Andrew Minteer is currently the senior director of data science and research at a leading global retail company. Prior to that, he served as the director of IoT analytics and machine learning at a Fortune 500 manufacturing company.

He has an MBA from Indiana University and a background in statistics, software development, database design, and cloud architecture, and he has led analytics teams for over 10 years.

He first taught himself to program on an Atari 800 computer at the age of 11 and fondly remembers the frustration of waiting through 20 minutes of beeps and static to load a 100-line program. He now thoroughly enjoys launching a 1 TB GPU-backed cloud instance in a few minutes and getting right to work.

Andrew is a private pilot who looks forward to spending some time in the air sometime soon. He enjoys kayaking, camping, traveling the world, and playing around with his six-year-old son and three-year-old daughter.

I would like to thank my ever-patient wife, Julie, for her constant support and tolerance of so many nights and weekends spent working on this technical book. I also thank her for credibly convincing me that this book was not actually a sleep aid; she was just tired from watching the kids. I also want to thank my energetic little princess-dress-wearing daughter, Olivia, and my intelligent Lego-wielding son, Max, for inspiring me to keep at it. Thank you to my family for your constant support and encouragement, especially my father, who I suspect is more excited about this book than I am.

While I am thanking everyone, I want to give a shout-out to all the fantastic people I have worked with over the years, both bosses and colleagues. I have learned far more from them than they have from me. I have been truly lucky to work with such talented people.

Last but not least, I want to thank all my editors and reviewers for their comments and insights in developing this book.

I hope you, the reader, not only learn a lot about analytics for IoT but also enjoy the experience.

About the Reviewer

Ruben Oliva Ramos is a computer systems engineer from Tecnologico of León Institute, with a master's degree in computer and electronic systems engineering, with a specialization in teleinformatics and networking, from the University of Salle Bajio in Leon, Guanajuato, Mexico. He has more than 5 years of experience in developing web applications to control and monitor devices connected with Arduino and Raspberry Pi, using web frameworks and cloud services to build Internet of Things applications.

He is a mechatronics teacher at the University of Salle Bajio and teaches students on the master's degree in design and engineering of mechatronics systems. He also works at Centro de Bachillerato Tecnologico Industrial 225 in Leon, Guanajuato, Mexico, teaching subjects such as electronics, robotics and control, automation, and microcontrollers in the Mechatronics Technician program. He is a consultant and developer on projects in areas such as monitoring systems and datalogger data, using technologies such as Android, iOS, Windows Phone, HTML5, PHP, CSS, Ajax, JavaScript, Angular, and ASP.NET; the databases SQLite, MongoDB, and MySQL; the web servers Node.js and IIS; and hardware programming with Arduino, Raspberry Pi, Ethernet Shield, GPS and GSM/GPRS, and ESP8266, for control and monitoring systems for data acquisition and programming.

I would like to thank my savior and lord, Jesus Christ, for giving me strength and courage to pursue this project; my dearest wife, Mayte; our two lovely sons, Ruben and Dario; and my father (Ruben), my dearest mom (Rosalia), my brother (Juan Tomas), and my sister (Rosalia), whom I love, for all their support while reviewing this book, for allowing me to pursue my dream, and for tolerating not being with them after my busy day job.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1787120732.

If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

Chapter 1: Defining IoT Analytics and Challenges
The situation
Defining IoT analytics
Defining analytics
Defining the Internet of Things
The concept of constrained
IoT analytics challenges
The data volume
Problems with time
Problems with space
Data quality
Analytics challenges
Business value concerns
Summary

Chapter 2: IoT Devices and Networking Protocols
IoT devices
The wild world of IoT devices
Healthcare
Manufacturing
Transportation and logistics
Retail
Oil and gas
Home automation or monitoring
Wearables
Sensor types
Networking basics
IoT networking connectivity protocols
Connectivity protocols (when the available power is limited)
Bluetooth Low Energy (also called Bluetooth Smart)
6LoWPAN
ZigBee
Advantages of ZigBee
Disadvantages of ZigBee
Common use cases
NFC
Common use cases
Sigfox
Connectivity protocols (when power is not a problem)
Wi-Fi
Common use cases
Cellular (4G/LTE)
Common use cases
IoT networking data messaging protocols
Message Queue Telemetry Transport (MQTT)
Topics
Advantages to MQTT
Disadvantages to MQTT
QoS levels
QoS 0
QoS 1
QoS 2
Last Will and Testament (LWT)
Tips for analytics
Common use cases
Hyper-Text Transport Protocol (HTTP)
Representational State Transfer (REST) principles
HTTP and IoT
Advantages to HTTP
Disadvantages to HTTP
Constrained Application Protocol (CoAP)
Advantages to CoAP
Disadvantages to CoAP
Message reliability
Common use cases
Data Distribution Service (DDS)
Common use cases
Analyzing data to infer protocol and device characteristics
Summary

Chapter 3: IoT Analytics for the Cloud
Building elastic analytics
What is cloud infrastructure?
Elastic analytics concepts
Design with the endgame in mind
Designing for scale
Decouple key components
Encapsulate analytics
Decoupling with message queues
Distributed computing
Avoid containing analytics to one server
When to use distributed and when to use one server
Assuming that change is constant
Leverage managed services
Use Application Programming Interfaces (API)
Cloud security and analytics
Public/private keys
Public versus private subnets
Access restrictions
Securing customer data
The AWS overview
AWS key concepts
Regions
Availability Zones
Subnet
Security groups
AWS key core services
Virtual Private Cloud (VPC)
Identity and Access Management (IAM)
Elastic Compute (EC2)
Simple Storage Service (S3)
AWS key services for IoT analytics
Amazon Simple Queue Service (SQS)
Amazon Elastic Map Reduce (EMR)
AWS machine learning
Amazon Relational Database Service (RDS)
Amazon Redshift
Microsoft Azure overview
Azure Data Lake Store
Azure Analysis Services
HDInsight
The R server option
The ThingWorx overview
ThingWorx Core
ThingWorx Connection Services
ThingWorx Edge
ThingWorx concepts
Thing templates
Things
Properties
Services
Events
Thing shapes
Data shapes
Entities
Summary

Chapter 4: Creating an AWS Cloud Analytics Environment
The AWS CloudFormation overview
The AWS Virtual Private Cloud (VPC) setup walk-through
Creating a key pair for the NAT and bastion instances
Creating an S3 bucket to store data
Creating a VPC for IoT Analytics
What is a NAT gateway?
What is a bastion host?
Your VPC architecture
The VPC Creation walk-through
How to terminate and clean up the environment
Summary

Chapter 5: Collecting All That Data - Strategies and Techniques
Designing data processing for analytics
Amazon Kinesis
AWS Lambda
AWS Athena
The AWS IoT platform
Microsoft Azure IoT Hub
Applying big data technology to storage
Hadoop
Hadoop cluster architectures
What is a Node?
Node types
Hadoop Distributed File System
Parquet
Avro
Hive
Serialization/Deserialization (SerDe)
Hadoop MapReduce
Yet Another Resource Negotiator (YARN)
HBase
Amazon DynamoDB
Amazon S3
Apache Spark for data processing
What is Apache Spark?
Spark and big data analytics
Thinking about a single machine versus a cluster of machines
Using Spark for IoT data processing
To stream or not to stream
Lambda architectures
Handling change
Summary

Chapter 6: Getting to Know Your Data - Exploring IoT Data
Exploring and visualizing data
The Tableau overview
Techniques to understand data quality
Look at your data - au naturel
Data completeness
Data validity
Assessing Information Lag
Representativeness
Basic time series analysis
What is meant by time series?
Applying time series analysis
Get to know categories in the data
Bring in geography
Look for attributes that might have predictive value
R (the pirate's language...if he was a statistician)
Installing R and RStudio
Using R for statistical analysis
Summing it all up
Solving industry-specific analysis problems
Manufacturing
Healthcare
Retail
Summary

Chapter 7: Decorating Your Data - Adding External Datasets to Innovate
Adding internal datasets
Which ones and why?
Customer information
Production data
Field services
Financial
Adding external datasets
External datasets - geography
Elevation
SRTM elevation
National Elevation Dataset (NED)
Weather
Geographical features
Planet.osm
Google Maps API
USGS national transportation datasets
External datasets - demographic
The U.S. Census Bureau
CIA World Factbook
External datasets - economic
Organization for Economic Cooperation and Development (OECD)
Federal Reserve Economic Data (FRED)
Summary

Chapter 8: Communicating with Others - Visualization and Dashboarding
Common mistakes when designing visuals
The Hierarchy of Questions method
The Hierarchy of Questions method overview
Developing question trees
Pulling together the data
Aligning views with question flows
Designing visual analysis for IoT data
Using layout positioning to convey importance
Use color to highlight important data
The impact of using a single color to communicate importance
Be consistent across visuals
Make charts easy to interpret
Creating a dashboard with Tableau
The dashboard walk-through
Hierarchy of Questions example
Aligning visuals to the thought process
Creating individual views
Assembling views into a dashboard
Creating and visualizing alerts
Alert principles
Organizing alerts using a Tableau dashboard
Summary

Chapter 9: Applying Geospatial Analytics to IoT Data
Why do you need geospatial analytics for IoT?
The basics of geospatial analysis
Welcome to Null Island
Coordinate Reference Systems
The Earth is not a ball
Vector-based methods
The bounding box
Contains
Buffer
Dilation and erosion
Simplify
Vector summary
Raster-based methods
Storing geospatial data
File formats
Spatial extensions for relational databases
Storing geospatial data in HDFS
Spatial indexing
R-tree
Processing geospatial data
Geospatial analysis software
ArcGIS
QGIS
ogr2ogr
PostGIS spatial functions
Geospatial analysis in the big data world
Solving the pollution reporting problem
Summary

Chapter 10: Data Science for IoT Analytics
Machine learning (ML)
What is machine learning?
Representation
Evaluation
Optimization
Generalization
Feature engineering with IoT data
Dealing with missing values
Centering and scaling
Time series handling
Validation methods
Cross-validation
Test set
Precision, recall, and specificity
Understanding the bias–variance tradeoff
Bias
Variance
