ANHO Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, EMC Education Services
1) Read Chapter 1 of Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, EMC Education Services (Editor), ISBN: 978-1-118-87613-8, January 2015.
2) Read the Forbes article "Ten Ways Big Data Is Revolutionizing Marketing And Sales":
https://www.forbes.com/sites/louiscolumbus/2016/05/09/ten-ways-big-data-is-revolutionizing-marketing-and-sales/#276bba5321cf
3) Read Module 1: Explore in “R for Data Science” (https://r4ds.had.co.nz/) and do the exercises.
4) Read up on the Amazon Web Services, Microsoft Azure and Google Cloud Platform cost comparison:
https://www.techrepublic.com/article/amazon-aws-mi…
https://www.upwork.com/hiring/for-clients/aws-vs-a…
School of Computer & Information Sciences
ITS836 Data Science and Big Data Analytics
ITS 836
HW 01
Exercise 1: Compare the costs of AWS, Azure and GCP
Exercise 2: Install R and RStudio
Exercise 3: Do Module 1 of R for Data Science
Exercise 1: Big Data Cost Comparison
Amazon Web Services, Microsoft Azure and Google Cloud Platform cost comparison
Use the following references (or any others):
https://www.techrepublic.com/article/amazon-aws-microsoft-azure-and-google-cloud-platform-comparing-prices-for-basic-services/
https://www.upwork.com/hiring/for-clients/aws-vs-azure-vs-google-cloud-platform-comparison/
Exercise 2: Install R, RStudio and Packages
Chapter 1: Download and Install R, RStudio and Packages
https://r4ds.had.co.nz/introduction.html
https://cloud.r-project.org/
Download and install R (copy a screenshot onto your PowerPoint slide).
Precompiled binary distributions of the base system and contributed packages. Windows and Mac users most likely want one of these versions of R:
Download R for Linux
Download R for (Mac) OS X
Download R for Windows
www.r-project.org/
RStudio
Exercise 3: Module 1 Explore
R for Data Science
Chapter 3 Data Visualization (3.2.4, 3.3.1, 3.5.1, 3.6.1, 3.7.1, 3.8.1, 3.9.1)
https://r4ds.had.co.nz/data-visualisation.html
Chapter 4 Workflow Basics (4.4)
https://r4ds.had.co.nz/workflow-basics.html
Chapter 5 Data Transformation (5.2.4, 5.3.1, 5.4.1, 5.5.2, 5.6.7)
https://r4ds.had.co.nz/transform.html
Chapter 6 Workflow: scripts (6.3)
https://r4ds.had.co.nz/workflow-scripts.html
Chapter 7 Exploratory Data Analysis (7.3.4, 7.4.1, 7.5.1.1, 7.5.2.1, 7.5.3.1)
https://r4ds.had.co.nz/exploratory-data-analysis.html
Chapter 8 Workflow: projects
R for Data Science: 5 Modules
I Explore
II Wrangle
III Program
IV Model
V Communicate
R for Data Science, Garrett Grolemund & Hadley Wickham
https://r4ds.had.co.nz/index.html
Assignment Response Format
Do the exercise
Share a screenshot in your PowerPoint response
Share the code and the plots in PowerPoint format (use the existing ppt slide format)
Put your name and ID number
Upload
Questions?
School of Computer &
Information Sciences
ITS836 Data Science and Big Data Analytics
Data, data everywhere
1 Zettabyte = 1000 EB
1 Exabyte = 1000 PB
1 Petabyte = 1000 TB
1 TB = 1000 GB
Figure: Data produced each year (logarithmic scale), with axis marks at 1 Petabyte, 1 Exabyte and 1 Zettabyte and years 2002, 2006, 2009, 2011, 2015, 2023. Labeled values: 5 EB, 161 EB, 800 EB, 1.8 ZB, 8.0 ZB, 163 ZB. Reference points: 14 PB, 60 PB, 120 PB, human brain's capacity, 100 years of HD video + audio.
Amazon Revenue Growth
AWS Cloud Revenue (USD billions)
2014: $4.6
2015: $7.9
2016: $12.2
2017: $17.5
Amazon dominates in Cloud Services
AWS rolled out quietly in 2006, roughly 6 years ahead of its closest competition.
Amazon's head start allowed it to invest in the infrastructure.
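As a rough illustration of the growth behind these AWS revenue figures, the year-over-year percentage change can be computed as below. The four values are the ones on the slide; treating them as USD billions is an assumption, and the helper name `yoy_growth` is hypothetical.

```python
# AWS revenue by year as shown on the slide (assumed to be USD billions).
revenue = {2014: 4.6, 2015: 7.9, 2016: 12.2, 2017: 17.5}

def yoy_growth(series):
    """Return {year: percent growth over the previous year}, rounded to 0.1."""
    years = sorted(series)
    return {
        year: round((series[year] / series[prev] - 1) * 100, 1)
        for prev, year in zip(years, years[1:])
    }

for year, pct in yoy_growth(revenue).items():
    print(f"{year}: {pct:+.1f}%")
```

The percentage growth shrinks each year even though the absolute increase keeps rising, which is typical of a business growing from a small base.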
How Do the Web Players Make Money?
Data is key for:
Advertising revenue: Facebook 97%, Alphabet 97%
Digital services: Amazon and Microsoft
All rely on Big Data and AI
1.1 Big Data Overview
Industries that gather and exploit data
Credit card companies monitor purchases
Good at identifying fraudulent purchases
Mobile phone companies analyze calling patterns, e.g., even on rival networks
Look for customers who might switch providers
For social networks, data is the primary product
Intrinsic value increases as data grows
Attributes Defining Big Data Characteristics
Huge volume of data
Not just thousands/millions, but billions of items
Complexity of data types and structures
Variety of sources, formats, structures
Speed of new data creation and growth
High velocity, rapid ingestion, fast analysis
Example: Genotyping from 23andme.com
1.1.1 Data Structures:
Characteristics of Big Data
Structured: defined data type, format, structure
Transactional data, OLAP cubes, RDBMS, CSV files, spreadsheets
Semi-structured
Text data with discernible patterns, e.g., XML data
Quasi-structured
Text data with erratic data formats, e.g., clickstream data
Unstructured
Data with no inherent structure, e.g., text docs, PDFs, images, video
Structured Data vs Semi-structured Data
Example of Quasi-Structured Data
Visiting 3 websites adds 3 URLs to the user's log files
Example of Unstructured Data
Video about Antarctica Expedition
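To make the quasi-structured example concrete, here is a minimal sketch of how URLs in a clickstream log might be turned into structured records using Python's standard `urllib.parse` module. The three URLs and the `to_record` helper are hypothetical; real log formats vary and usually need extra cleaning.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical log entries: three visited URLs, as in the slide's example.
clickstream = [
    "https://www.example.com/search?q=big+data&page=2",
    "https://shop.example.org/cart/view?item=42",
    "http://news.example.net/2015/01/analytics",
]

def to_record(url):
    """Turn one raw URL into a structured record: host, path, query parameters."""
    parts = urlparse(url)
    return {
        "host": parts.netloc,
        "path": parts.path,
        "params": parse_qs(parts.query),  # empty dict when there is no query
    }

records = [to_record(u) for u in clickstream]
for record in records:
    print(record)
```

The fields are recoverable but inconsistent from entry to entry (some URLs carry query parameters, some do not), which is what separates quasi-structured data from data with a fixed schema.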
1.1.2 Types of Data Repositories
from an Analyst Perspective
Business Drivers for Advanced Analytics
Forbes Article: Ten Ways Big Data Is Revolutionizing Marketing And Sales
https://www.forbes.com/sites/louiscolumbus/2016/05/09/ten-ways-big-data-is-revolutionizing-marketing-and-sales/#276bba5321cf
Big Data Drives the Business
1.2 State of the Practice in Analytics
Business Intelligence (BI) versus Data Science
Current Analytical Architecture
Drivers of Big Data
Emerging Big Data Ecosystem and a New Approach to Analytics
1.2.2 Current Analytical Architecture
Typical Analytic Architecture
Data sources must be well understood
EDW: Enterprise Data Warehouse
From the EDW, data is read by applications
Data scientists get data for downstream analytics processing
Sources of Big Data Deluge
Mobile sensors: GPS, accelerometer, etc.
Social media: 700 Facebook updates/sec in 2012
Video surveillance: street cameras, stores, etc.
Video rendering: processing video for display
Smart grids: gather and act on information
Geophysical exploration: oil, gas, etc.
Medical imaging: reveals internal body structures
Gene sequencing: more prevalent, less expensive; healthcare would like to predict personal illnesses
1.2.3 Drivers of Big Data
Data Evolution & Rise of Big Data Sources
1.2.4 Emerging Big Data Ecosystem
New Approach to Analytics
Four main groups of players
Data devices: games, smartphones, computers, etc.
Data collectors: phone and TV companies, Internet, Govt, etc.
Data aggregators (make sense of data): websites, credit bureaus, media archives, etc.
Data users and buyers: banks, law enforcement, marketers, employers, etc.
1.3 Key Roles for the New Big Data Ecosystem
1. Deep analytical talent: advanced training in quantitative disciplines, e.g., math, statistics, machine learning
2. Data savvy professionals: savvy but less technical than group 1
3. Technology and data enablers: support people, e.g., DB admins, programmers, etc.
Three Key Roles of the New Big Data Ecosystem
Three Recurring Data Scientist Activities
1. Reframe business challenges as analytics challenges
2. Design, implement, and deploy statistical models and data mining techniques on Big Data
3. Develop insights that lead to actionable recommendations
https://datajobs.com/what-is-data-science
Profile of Data Scientist
Five Main Sets of Skills
Quantitative skill: e.g., math, statistics
Technical aptitude: e.g., software engineering, programming
Skeptical mindset and critical thinking: ability to examine work critically
Curious and creative: passionate about data and finding creative solutions
Communicative and collaborative: can articulate ideas, can work with others
1.4 Examples of Big Data Analytics
Retailer Target
Uses life events: marriage, divorce, pregnancy
Apache Hadoop
Open source Big Data infrastructure innovation
MapReduce paradigm, ideal for many projects
Social Media Company LinkedIn
Social network for working professionals
Can graph a user's professional network
250 million users in 2014
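The MapReduce paradigm mentioned under Apache Hadoop can be illustrated with the classic word-count example. The sketch below runs in a single Python process and only mimics the map, shuffle and reduce stages; it is not Hadoop's API, and the function names and toy input are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}

documents = ["big data big insight", "data drives insight"]  # toy input
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

In a real cluster the map and reduce steps run in parallel on different machines, and the framework performs the shuffle over the network. The per-key independence shown here is what makes that parallelism possible.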
Demand for Data Scientists Grows
Focus shifts to Machine Learning
Big Data vs Machine Learning
https://trends.google.com/trends/explore?date=today%205-y&q=big%20data,machine%20learning#TIMESERIES
Figure: Social Network Using InMaps
Summary
Big Data comes from myriad sources
Social media, sensors, IoT, video surveillance, and sources only recently considered
Companies are finding creative and novel ways to use Big Data
Exploiting Big Data opportunities requires
New data architectures
New machine learning algorithms, ways of working
People with new skill sets
Always Review Chapter Exercises
Focus of Course
1. Explain Big Data Analytics and its importance to today's organizations.
2. Understand the Big Data analytics lifecycle.
3. Explore basic data analytic methods using R.
4. Examine clustering analysis methods.
5. Survey association rules.
6. Show how to implement regression analytics.
7. Employ classification analysis methods.
8. Explore time series analysis methods.
9. Understand text analysis.
10. Survey analytics technology and tools.
11. Examine in-database analysis techniques.
12. Understand how to apply analysis techniques in real-life situations.
Questions?