Final Project: NYC Parking Tickets

Final Project Big Data Technologies 201A

The full data set and a smaller sample file are available on Kaggle:

https://www.kaggle.com/new-york-city/nyc-parking-tickets/data

This is a cool map that I couldn't get to work. I may attempt it again in the next course:

http://www.bigendiandata.com/2017-06-27-Mapping_in_Jupyter/

This is one way to split CSV files on a Windows machine:

https://www.addictivetips.com/windows-tips/csv-splitter-for-windows/
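
If you would rather script the split than install a tool, the same idea can be sketched in a few lines of Scala. This is only a rough sketch and not what the linked tool does internally: the helper name splitCsv, the part-file naming, and the rows-per-part value are all made up for illustration, and it assumes no quoted field contains a line break.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

// Hypothetical helper: writes the data rows of a CSV into part files of at most
// `rowsPerPart` rows each, repeating the header line at the top of every part.
// Assumption: every record sits on a single line (no embedded line breaks).
def splitCsv(inputPath: String, outputDir: String, rowsPerPart: Int): Seq[String] = {
  require(rowsPerPart > 0, "rowsPerPart must be positive")
  val source = Source.fromFile(inputPath, "UTF-8")
  try {
    val lines = source.getLines()
    if (!lines.hasNext) Seq.empty[String]
    else {
      val header = lines.next()
      val baseName = new File(inputPath).getName.stripSuffix(".csv")
      new File(outputDir).mkdirs()
      // grouped() reads the file lazily, so only one part is held in memory at a time
      lines.grouped(rowsPerPart).zipWithIndex.map { case (rows, i) =>
        val outFile = new File(outputDir, "%s_part%03d.csv".format(baseName, i + 1))
        val writer = new PrintWriter(outFile, "UTF-8")
        try {
          writer.println(header)
          rows.foreach(row => writer.println(row))
        } finally writer.close()
        outFile.getPath
      }.toList // force all parts to be written before the source is closed
    }
  } finally source.close()
}
```

A call such as splitCsv("C:/data/DateDim.csv", "C:/data/parts", 100000) would return the paths of the part files it wrote; both paths and the row count here are placeholders.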

Moving a local CSV file from the local drive to the VM (Docker container)

Command Prompt (Windows)
Copy the file to the Linux server with pscp
pscp -i c:\BigDataTechnolgies\ServerKey\tjpauley_azure.ppk C:\data\DateDim.csv tjpauley@IPAddress:

Note: I used the FAQ on PuTTY's website.

Linux Command
Copy to sandbox data folder

sudo cp DateDim.csv /data

Zeppelin Notebook

Create data frame

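// Intended columns of DateDim.csv (for reference; the read below infers its own schema)
case class DateDim(DateNo: String, Weekday: String, Year: Integer, Month: Integer, Day: Integer)

// Read the CSV, treating the first row as a header and inferring column types
val DateDims = spark.read.option("inferSchema", "true").option("header", "true").csv("file:///data/DateDim.csv")
// Register the data frame so it can be queried by name (replaces the deprecated registerTempTable)
DateDims.createOrReplaceTempView("DateDim")
DateDims.show(1)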
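
Once the DateDim temp table is registered, it can be queried with Spark SQL from the same notebook. The paragraph below is only a sketch: it assumes it runs in Zeppelin (or any environment where a Spark session already exists) and that the CSV header really contains the Year and Weekday columns listed in the case class. The names session and daysPerWeekday are made up for the example.

```scala
import org.apache.spark.sql.SparkSession

// In Zeppelin a session already exists, so getOrCreate() just returns it.
val session = SparkSession.builder().getOrCreate()

// Assumption: a temp table named "DateDim" is registered and has
// Year and Weekday columns, as the DateDim case class suggests.
val daysPerWeekday = session.sql(
  """SELECT Year, Weekday, COUNT(*) AS Days
    |FROM DateDim
    |GROUP BY Year, Weekday
    |ORDER BY Year, Weekday""".stripMargin)

daysPerWeekday.show(10)
```

The same SELECT statement can also be pasted on its own into a %sql paragraph, which lets Zeppelin render the result as a table or chart.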

Zeppelin Notebook Commands

%md = Markdown
%sql = SQL
http://IPAddress:9995 Note: Open port 9995

Jupyter Notebook Commands

http://IPAddress:9999 Note: Open port 9999
