PBA Web - Databases - Exercise
ZiBaT => Peter Levinsky => PBA-Database => exercise
Big Data #1
Updated : 2018-04-06

Big Data (Hadoop / Udemy) #1

Background


BIG, page 12m-16b + 21-28b
https://www.udemy.com/hadoopstarterkit/ Section 1+2, lesson 1-3

https://www.udemy.com/hadoopstarterkit/ Section 3, lesson 4+6

https://www.udemy.com/hadoopstarterkit/ Section 4, lesson 7+8

https://www.udemy.com/hadoopstarterkit/ Section 3, lesson 5

 

Assignment 1 Big Data Quiz

After viewing course content section 1 (lesson 1) + 2 (lesson 2+3):

Take the Quiz in section 2 lesson 3 "Quiz 1: Test your understanding of Big Data"

 

Assignment 2 HDFS

After viewing course content section 3 (lesson 4 + 5 + 6):

  1. Take the Quiz in section 3 Lesson 6 "Quiz 2: Test your understanding of HDFS"
     
  2. Steps to get cluster access:
    1. Click on link in Udemy course section 3, lesson 5
    2. On the web page that opens, click the big yellow box "Give me access to Hadoop Cluster"
    3. Fill in the form, and you will get an email with further instructions
    4. In the email you receive, click the link cluster-key.zip (download) to download a zip file, then unpack it
       
  3. Initial setup: get PuTTY to access the Udemy Hadoop sandbox (cluster)
    1. Download and install PuTTY (exe file) from http://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html
    2. Startup PuTTY and change the following settings:
      1. Category ”Session” : Host name / IP address – type in the IP address received in your email
      2. Category ”SSH” + ”Auth”: click on Browse button and select the .ppk file from the unzipped cluster-key.zip
      3. Category ”Session”: Type a name (e.g. Hadoop Cluster) in the field called ”Saved Session” and click the Save button
    3. Start PuTTY client by clicking on the Open button
    4. In the window that opens, enter the user name (copy it from the email you received, then right-click in the PuTTY window to paste it) and press Return
    5. You now have access to the AWS cluster
       
  4. After the initial setup: normal start of PuTTY
    1. Startup PuTTY
    2. Category ”Session” : select saved session (”Hadoop Cluster”) & press the Load button
    3. Click the Open button
    4. In the window that opens, enter the user name (copy it from the email you received, then right-click in the PuTTY window to paste it) and press Return
    5. You now have access to the AWS cluster
       
  5. Work with HDFS
    1. Try the different file commands from the Section 3 Lesson 5 "Working With Hdfs"

Assignment 3 MapReduce

After viewing course content section 4 (lesson 7 + 8):

  1. Take the Quiz in section 4 Lesson 10 "Test your understanding of MapReduce"

 

 

Extra 3 MapReduce advanced (includes Java programming)

After viewing course content section 4 (lesson 9 + 10):
  1. Use your Hadoop cluster access
    1. Max Close Price Reducer
    2. Max Close Price Mapper
    3. Max Close Price
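
The three items above name a mapper, a reducer and (presumably) the job driver for finding the maximum close price per stock symbol. As a rough illustration of the idea only, the sketch below reproduces the map, group and reduce steps in plain Java, without the Hadoop API, so it can be run on any machine. The class name MaxClosePriceSketch, the column positions and the sample lines are assumptions made for this example; they are not the course's actual files or data set. A real job would extend Hadoop's Mapper and Reducer classes and read its input from HDFS.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the max-close-price MapReduce logic (no Hadoop dependency).
class MaxClosePriceSketch {

    // Assumed (hypothetical) column layout of each comma-separated input line:
    // exchange,symbol,date,open,high,low,close,volume,adjClose
    static final int SYMBOL_COLUMN = 1;
    static final int CLOSE_COLUMN = 6;

    // Map step: turn one input line into a (symbol, close price) pair.
    static Map.Entry<String, Double> map(String line) {
        String[] fields = line.split(",");
        double close = Double.parseDouble(fields[CLOSE_COLUMN]);
        return new AbstractMap.SimpleImmutableEntry<>(fields[SYMBOL_COLUMN], close);
    }

    // Reduce step: keep the largest close price seen for one symbol.
    static double reduce(List<Double> closePrices) {
        double max = Double.NEGATIVE_INFINITY;
        for (double price : closePrices) {
            max = Math.max(max, price);
        }
        return max;
    }

    // Driver: map every line, group the values by key (the "shuffle"),
    // then reduce each group to a single value.
    static Map<String, Double> run(List<String> lines) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Double> pair = map(line);
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        List<String> sample = Arrays.asList(
            "EXCH,AAA,2010-01-04,10.0,10.5,9.8,10.2,1000,10.2",
            "EXCH,AAA,2010-01-05,10.2,11.0,10.1,10.9,1200,10.9",
            "EXCH,BBB,2010-01-04,20.0,20.4,19.9,20.1,500,20.1",
            "EXCH,BBB,2010-01-05,20.1,20.2,19.5,19.7,700,19.7");
        System.out.println(run(sample)); // prints {AAA=10.9, BBB=20.1}
    }
}
```

On the cluster, Hadoop performs the grouping step itself between the map and reduce phases; here it is done by hand with a TreeMap so that the whole flow is visible in one file.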