ZiBaT => Peter Levinsky => Big Data => exercise
EXTRA Investigation of Hadoop
Updated : 2017-02-23


EXTRA Investigation of Hadoop/Hive

Idea: Get Hadoop (the light version) up and running
Background: You have a Hadoop instance running (see installingHadoop)
The book: Fronter: Big Data Forensics – Learning Hadoop Investigations, Joe Sremack, Packt Publishing

Create your own table

In Hive (from Ambari)
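
As an illustration, the statements you type into the Hive view could look like the ones printed by the sketch below. It is written in Java only so that the strings can be reused from a Java client later in this exercise. The table name 'sensor', the column names and the sample values are assumptions based on the row layout (temp int, light int, time string); adjust them to your own table.

```java
// Sketch only: the table name "sensor", the column names and the sample
// values are assumptions, not something the exercise prescribes.
class SensorTableDdl {
    static final String TABLE = "sensor";

    // Builds the HiveQL that creates the table. `time` is back-quoted
    // in case the Hive version treats it as a reserved word.
    static String createTable() {
        return "CREATE TABLE IF NOT EXISTS " + TABLE
             + " (temp INT, light INT, `time` STRING)";
    }

    // Builds an INSERT for one reading. The values are inlined into the
    // statement, which is only acceptable for trusted test data.
    static String insertRow(int temp, int light, String time) {
        return "INSERT INTO TABLE " + TABLE + " VALUES ("
             + temp + ", " + light + ", '" + time.replace("'", "\\'") + "')";
    }

    public static void main(String[] args) {
        // Print the statements so they can be pasted into the Hive view.
        System.out.println(createTable() + ";");
        System.out.println(insertRow(21, 300, "2017-02-23 10:00:00") + ";");
    }
}
```

Running the class prints one CREATE TABLE and one INSERT statement that can be pasted into the Hive query editor.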

Create a programming interface to this sensor table

You can use C#

You are going to use an ODBC driver and Visual Studio

  1. Download an ODBC driver for Hortonworks Hive
    https://hortonworks.com/downloads/ -- choose 32-bit (or 64-bit) and install the MSI file

  2. Open 'ODBC Data Sources' in Windows and add a data source that uses the new driver
    The tutorial below is written for Windows 7. Skip its download part (you did that in step 1) and follow it from the middle, where it shows how to set up the ODBC data source:
    https://github.com/hortonworks/hadoop-tutorials/blob/master/Sandbox/T07_Installing_the_Hortonworks_ODBC_Driver_on_Windows_7.md
    Both the user name and the password are 'maria_dev'

  3. Create a C# project in Visual Studio

code example ------------------------------:

using System;
using System.Data.Odbc;

class Program
{
    static void Main()
    {
        // 'MyHive' is the name given to the ODBC data source in step 2
        using (var conn = new OdbcConnection("DSN=MyHive"))
        {
            conn.Open();

            // Replace the placeholder with your own SELECT statement
            using (var sql = new OdbcCommand("-- some select statement -- ", conn))
            {
                Console.WriteLine("Result");

                using (var reader = sql.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        /* Example if the table row is (temp int, light int, time string)
                        int temp = reader.GetInt32(0);
                        int light = reader.GetInt32(1);
                        string time = reader.GetString(2);

                        Console.WriteLine($"t={temp} l={light} timestamp={time}");
                        */
                    }
                }
            }
        }

        Console.WriteLine("End");
        Console.ReadLine(); // keep the console window open
    }
}

------------------------------------------------------

You can use Java

Write the program preferably in Java (or in C#, although I am not sure that will work directly). You can, for example, use NetBeans to develop the program.
You can move the resulting *.jar file to Linux and run it as follows: java -jar *.jar

To access Hive, look at this example: https://cwiki.apache.org/confluence/display/Hive/HiveClient (it has some examples for Python and PHP as well)
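
A minimal sketch of what such a Java client might look like, mirroring the C# example. It makes several assumptions: Hive is reached through HiveServer2 with a 'jdbc:hive2://' URL (the linked page may show a different driver and URL form), the host 'localhost', port 10000 and database 'default' are placeholders, the 'maria_dev' login from the ODBC setup also works here, and the table is called 'sensor' with columns (temp, light, time). The Hive JDBC driver jar must be on the classpath; it is not part of the JDK.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HiveSensorClient {
    // Builds a HiveServer2 JDBC URL; host, port and database are placeholders.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Formats one row the same way as the C# example.
    static String formatRow(int temp, int light, String time) {
        return "t=" + temp + " l=" + light + " timestamp=" + time;
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", 10000, "default"); // assumed endpoint
        System.out.println("Result");

        // try-with-resources closes the result set, statement and connection.
        try (Connection conn = DriverManager.getConnection(url, "maria_dev", "maria_dev");
             Statement stmt = conn.createStatement();
             // Assumed table and column names; replace with your own SELECT.
             ResultSet rs = stmt.executeQuery("SELECT temp, light, `time` FROM sensor")) {
            while (rs.next()) {
                System.out.println(formatRow(rs.getInt(1), rs.getInt(2), rs.getString(3)));
            }
        } catch (SQLException e) {
            // Also reached when no Hive JDBC driver is found on the classpath.
            System.err.println("Could not query Hive: " + e.getMessage());
        }

        System.out.println("End");
    }
}
```

If you package this as a jar and run it with java -jar, the driver has to be either bundled into the jar or listed in the Class-Path entry of the jar manifest.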

 

Set up and use these sample files (instead of the NYSE data from the tutorial):

http://stackoverflow.com/questions/10843892/download-large-data-for-hadoop