# Binary Tree and Tree Traversals using Java

The root node of a tree is the node with no parent. A rooted tree has at most one root node. A leaf node has no children.
The depth of a node n is the length of the path from the root to the node. The set of all nodes at a given depth is sometimes called a level of the tree. The root node is at depth zero.
The height of a tree is the length of the path from the root to the deepest node in the tree. A (rooted) tree with only one node (the root) has a height of zero.
Siblings are nodes that share the same parent node.
A node p is an ancestor of a node q if it exists on the path from q to the root. The node q is then termed a descendant of p.
The size of a node is the number of descendants it has including itself.
In-degree of a node is the number of edges arriving at that node.
Out-degree of a node is the number of edges leaving that node.
The root is the only node in the tree with In-degree = 0.
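
The height and size definitions above translate almost directly into recursion. Here is a minimal sketch, assuming a throwaway `Node` type and a hypothetical `TreeMetrics` class (neither is one of the classes used later in this post):

```java
public class TreeMetrics {

    // Minimal node type for this sketch only (hypothetical).
    static final class Node {
        final int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Height counts edges on the longest downward path.
    // An empty subtree is -1 so that a single node has height 0.
    static int height(Node n) {
        if (n == null) return -1;
        return 1 + Math.max(height(n.left), height(n.right));
    }

    // Size counts the node itself plus all of its descendants.
    static int size(Node n) {
        if (n == null) return 0;
        return 1 + size(n.left) + size(n.right);
    }

    // Builds the sample tree 8 -> (3 -> (1, 6), 10).
    static Node sample() {
        Node root = new Node(8);
        root.left = new Node(3);
        root.right = new Node(10);
        root.left.left = new Node(1);
        root.left.right = new Node(6);
        return root;
    }

    public static void main(String[] args) {
        Node root = sample();
        System.out.println("height = " + height(root));               // 2
        System.out.println("size = " + size(root));                   // 5
        System.out.println("size of left child = " + size(root.left)); // 3
    }
}
```

Returning -1 for an empty subtree is just one convention; it is chosen here so that the result agrees with "a tree with only one node has a height of zero".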

TreeNode Class

```
public class TreeNode{

TreeNode leftNode;
TreeNode rightNode;
int data;

public TreeNode(int value){
data = value;
leftNode = rightNode = null;
}

public void insert(int value){
if (value < data){
if (leftNode == null)
leftNode = new TreeNode(value);
else
leftNode.insert(value);
}
else if (value > data){
if (rightNode == null)
rightNode = new TreeNode(value);
else
rightNode.insert(value);
}
}

}
```

Preorder Traversal

```
preorder(node)
    if node = null then return
    print node.value
    preorder(node.left)
    preorder(node.right)
```

Postorder Traversal

```
postorder(node)
    if node = null then return
    postorder(node.left)
    postorder(node.right)
    print node.value
```

Inorder Traversal

```
inorder(node)
    if node = null then return
    inorder(node.left)
    print node.value
    inorder(node.right)
```

Queue based Level order Traversal

```
levelorder(root)
    q = empty queue
    q.enqueue(root)
    while not q.empty do
        node := q.dequeue()
        visit(node)
        if node.left ≠ null
            q.enqueue(node.left)
        if node.right ≠ null
            q.enqueue(node.right)
```
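
The queue-based pseudocode maps fairly directly onto `java.util.ArrayDeque`. Here is a minimal iterative sketch, assuming a throwaway `Node` type and a hypothetical `LevelOrderSketch` class (these are not the `TreeNode` and `Tree` classes of this post):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class LevelOrderSketch {

    // Minimal node type for this sketch only (hypothetical).
    static final class Node {
        final int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Binary-search-tree insert; duplicates are ignored.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.data) root.left = insert(root.left, value);
        else if (value > root.data) root.right = insert(root.right, value);
        return root;
    }

    // Visits nodes level by level, left to right, and returns the values in visit order.
    static List<Integer> levelOrder(Node root) {
        List<Integer> visited = new ArrayList<>();
        if (root == null) return visited;
        Queue<Node> queue = new ArrayDeque<>();
        queue.offer(root);
        while (!queue.isEmpty()) {
            Node node = queue.poll();
            visited.add(node.data);
            if (node.left != null) queue.offer(node.left);
            if (node.right != null) queue.offer(node.right);
        }
        return visited;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int v : new int[] {50, 30, 70, 20, 40}) root = insert(root, v);
        System.out.println(levelOrder(root)); // [50, 30, 70, 20, 40]
    }
}
```

Unlike a recursive version, the loop keeps only the queue on the heap, so a very deep tree cannot overflow the call stack.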

Tree Class

```
public class Tree {

private TreeNode root;

Queue queue = new Queue();

public Tree(){
root = null;
}

public void insertNode(int value){

if ( root == null)
root = new TreeNode(value);
else
root.insert(value);
}

public void preorderTraversal(){
preorder(root);
}

public void inorderTraversal(){
inorder(root);
}

public void postorderTraversal(){
postorder(root);
}

public void levelorderTraversal(){
levelorder(root);
}

public void levelorder(TreeNode node){
if (node == null)
return;

if (node == root)
queue.offer(node);

if (node.leftNode != null)
queue.offer(node.leftNode);
if (node.rightNode != null)
queue.offer(node.rightNode);

System.out.printf("%d ", queue.poll().data);

levelorder(queue.peek());
}

public void preorder(TreeNode node){

if (node == null)
return;

System.out.printf("%d ", node.data);
preorder(node.leftNode);
preorder(node.rightNode);

}

public void postorder(TreeNode node){

if (node == null)
return;

postorder(node.leftNode);
postorder(node.rightNode);
System.out.printf("%d ", node.data);
}

public void inorder(TreeNode node){

if (node == null)
return;

inorder(node.leftNode);
System.out.printf("%d ", node.data);
inorder(node.rightNode);
}

}

```

TreeTest Class

```
import java.util.Random;

public class TreeTest {

public static void main(String[] args){
Tree tree = new Tree();
int value;
Random random = new Random();

System.out.println("Inserting values");
for (int i=1; i<=10;i++){
value=random.nextInt(100);
System.out.printf("%d ", value);
tree.insertNode(value);
}
System.out.println();
System.out.println("Preorder");
tree.preorderTraversal();
System.out.println();
System.out.println("Postorder");
tree.postorderTraversal();
System.out.println();
System.out.println("Inorder");
tree.inorderTraversal();
System.out.println();
System.out.println("Levelorder");
tree.levelorderTraversal();
}
}

```

Of course, the level order traversal needs a queue data structure. The Tree class uses this small Queue class:

```
import java.util.ArrayList;
import java.util.List;

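public class Queue {

    private List<TreeNode> qlist;

    public Queue(){
        qlist = new ArrayList<TreeNode>();
    }

    // Add a node at the tail of the queue.
    public void offer(TreeNode o){
        qlist.add(o);
    }

    // Remove and return the head, or null if the queue is empty.
    public TreeNode poll(){
        if (qlist.isEmpty())
            return null;
        else
            return qlist.remove(0);
    }

    // Return the head without removing it, or null if the queue is empty.
    public TreeNode peek(){
        if (qlist.isEmpty())
            return null;
        else
            return qlist.get(0);
    }
}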

```

And finally the output

```
Inserting values
87 77 81 89 4 26 23 27 57 1
Preorder
87 77 4 1 26 23 27 57 81 89
Postorder
1 23 57 27 26 4 81 77 89 87
Inorder
1 4 23 26 27 57 77 81 87 89
Levelorder
87 77 89 4 81 1 26 23 27 57
```
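
Notice that the inorder line comes out sorted. That is expected: insert() always puts smaller values in the left subtree and larger ones in the right. As a rough cross-check, here is a minimal sketch that rebuilds the same tree from the inserted values and walks it inorder with an explicit stack instead of recursion. The `Node` type and the `InorderSketch` class are hypothetical stand-ins, not the classes above:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class InorderSketch {

    // Minimal node type for this sketch only (hypothetical).
    static final class Node {
        final int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Same insertion rule as TreeNode.insert: smaller goes left, larger goes right, duplicates are dropped.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.data) root.left = insert(root.left, value);
        else if (value > root.data) root.right = insert(root.right, value);
        return root;
    }

    // Iterative inorder: walk left as far as possible, visit, then step right.
    static List<Integer> inorder(Node root) {
        List<Integer> out = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        Node cur = root;
        while (cur != null || !stack.isEmpty()) {
            while (cur != null) {
                stack.push(cur);
                cur = cur.left;
            }
            cur = stack.pop();
            out.add(cur.data);
            cur = cur.right;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {87, 77, 81, 89, 4, 26, 23, 27, 57, 1};
        Node root = null;
        for (int v : values) root = insert(root, v);
        System.out.println(inorder(root)); // [1, 4, 23, 26, 27, 57, 77, 81, 87, 89]
    }
}
```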

# HBase administration using the Java API, using code examples

I have not given a formal introduction to HBase here; this post is for those who already have a working HBase installation. It covers the administrative work that can be done on HBase using the Java API. The API is vast and easy to use. I have explained the code wherever I found it necessary, but this post is by no means complete. As usual, I have provided the full code at the end. Cheers. 🙂

If you want to follow along, you'd better import all of this; if you are using an IDE like Eclipse, you’ll follow along just fine, as it fixes up your imports automatically. The only thing you need to do is set the class path to include the jar files from the Hadoop and/or HBase installation, especially hadoop-0.*.*-core.jar and the jar files inside the lib folder. I’ll put in another post on that later.

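```
import java.io.IOException;
import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
```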

1. Creating a table in HBase

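```
public void createTable (String tablename, String familyname) throws IOException {

    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);

    // Describe the table and its column family, then create it.
    HTableDescriptor tabledescriptor = new HTableDescriptor(Bytes.toBytes(tablename));
    tabledescriptor.addFamily(new HColumnDescriptor(familyname));
    administer.createTable(tabledescriptor);
}
```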

2. Adding a column to an existing table

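```
public void addColumn (String tablename, String columnname) throws IOException{

    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    administer.addColumn(tablename, new HColumnDescriptor(columnname));
    System.out.println("Added column : " + columnname + " to table " + tablename);
}
```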

3. Deleting a column from an existing table

```
public void delColumn (String tablename, String columnname) throws IOException{

    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    administer.deleteColumn(tablename, columnname);
    System.out.println("Deleted column : " + columnname + " from table " + tablename);
}
```

4. Check if your hbase cluster is running properly

```
public static void checkIfRunning() throws MasterNotRunningException, ZooKeeperConnectionException{
    //Create the required configuration.
    Configuration conf = HBaseConfiguration.create();
    //Check if HBase is running
    try{
        HBaseAdmin.checkHBaseAvailable(conf);
    }catch(Exception e){
        System.err.println("Exception at " + e);
        System.exit(1);
    }
}
```

5. Major compaction

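```
public void majorCompact (String mytable) throws IOException{

    //Create the required configuration.
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    //Instantiate a new client.
    HTable table = new HTable(conf,mytable);

    String tablename = Bytes.toString(table.getTableName());
    try{
        // The request is asynchronous: it returns before the compaction finishes.
        administer.majorCompact(tablename);
        System.out.println("Major compaction requested for " + tablename);
    }catch(Exception e){
        System.out.println(e);
    }
}
```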

6. Minor compaction

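```
public void minorcompact(String trname) throws IOException, InterruptedException{
    //Create the required configuration.
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    // trname is a table name or a region name.
    administer.compact(trname);
}
```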

7. Print out the cluster status.

```
public ClusterStatus getclusterstatus () throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    return administer.getClusterStatus();
}
```

8. Get all the cluster details.

```
public void printClusterDetails() throws IOException{
ClusterStatus status = getclusterstatus();

Collection<HServerInfo> serverinfo = status.getServerInfo();
for (HServerInfo s : serverinfo){
System.out.println("Servername " + s.getServerName());
System.out.println("Hostname " + s.getHostname());
System.out.println("Hostname:Port " + s.getHostnamePort());
System.out.println("Info port" + s.getInfoPort());
System.out.println();
}

String version = status.getHBaseVersion();
System.out.println("Version " + version);

int regioncounts = status.getRegionsCount();
System.out.println("Regioncounts :" + regioncounts);

int servers = status.getServers();
System.out.println("Servers :" + servers);

for (String s : Servernames  ){
}
}
```

9. Disable a table.

```
public void disabletable(String tablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    administer.disableTable(tablename);
}
```

10. Enable a table

```
public void enabletable(String tablename) throws IOException{
    //Create the required configuration.
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    administer.enableTable(tablename);
}
```

11. Delete a table.

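```
public void deletetable(String tablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    // The table has to be disabled before it can be deleted.
    administer.deleteTable(tablename);
}
```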

12. Check if table is available

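```
public void isTableAvailable(String tablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    boolean result = administer.isTableAvailable(tablename);
    System.out.println("Table " + tablename + " available ? " + result);
}
```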

13. Check if table is enabled

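```
public void isTableEnabled(String tablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    boolean result = administer.isTableEnabled(tablename);
    System.out.println("Table " + tablename + " enabled ? " + result);
}
```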

14. Check if table is disabled

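```
public void isTableDisabled(String tablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    boolean result = administer.isTableDisabled(tablename);
    System.out.println("Table " + tablename + " disabled ? " + result);
}
```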

15. Check if table exists.

```
public void tableExists(String tablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    boolean result = administer.tableExists(tablename);
    System.out.println("Table " + tablename + " exists ? " + result);
}
```

16. List all tables

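```
public void listTables () throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    // Print the name of every table known to the cluster.
    for (HTableDescriptor t : administer.listTables()){
        System.out.println(t.getNameAsString());
    }
}
```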

17. Flush tables.

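```
public void flush(String trname) throws IOException, InterruptedException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    // trname is a table name or a region name.
    administer.flush(trname);
}
```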

18. Shutdown hbase.

```
public void shutdown() throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    System.out.println("Shutting down..");
    administer.shutdown();
}
```

19. Modify column for a table.

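```
@SuppressWarnings("deprecation")
public void modifyColumn(String tablename, String columnname, String descriptor) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    // Replace the definition of columnname with a descriptor built from the given name.
    administer.modifyColumn(tablename, columnname, new HColumnDescriptor(descriptor));
}
```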

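20. Modify an existing table.

```
public void modifyTable(String tablename, String newtablename) throws IOException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    // Apply a descriptor built from newtablename to the existing table.
    administer.modifyTable(Bytes.toBytes(tablename), new HTableDescriptor(newtablename));
}
```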

21. Split based on tablename.

```
public void split(String tablename) throws IOException, InterruptedException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    administer.split(tablename);
}
```

22. Check if master is running.

```
public void isMasterRunning() throws MasterNotRunningException, ZooKeeperConnectionException{
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin administer = new HBaseAdmin(conf);
    System.out.println( "Master running ? "+ administer.isMasterRunning());
}
```

There is a lot more; you can check the Java API for HBase and build on these. These are the ones I found necessary, and some just for fun.

The full listing of the code:

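```
/*
 * HBase administration examples.
 * The class name below is a placeholder.
 */

import java.io.IOException;
import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HServerInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseAdministration {

    public void addColumn (String tablename, String columnname) throws IOException{

        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.addColumn(tablename, new HColumnDescriptor(columnname));
        System.out.println("Added column : " + columnname + " to table " + tablename);
    }

    public void delColumn (String tablename, String columnname) throws IOException{

        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.deleteColumn(tablename, columnname);
        System.out.println("Deleted column : " + columnname + " from table " + tablename);
    }

    public void createTable (String tablename, String familyname) throws IOException {

        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);

        // Describe the table and its column family, then create it.
        HTableDescriptor tabledescriptor = new HTableDescriptor(Bytes.toBytes(tablename));
        tabledescriptor.addFamily(new HColumnDescriptor(familyname));
        administer.createTable(tabledescriptor);
    }

    public void majorCompact (String mytable) throws IOException{

        //Create the required configuration.
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        //Instantiate a new client.
        HTable table = new HTable(conf,mytable);

        String tablename = Bytes.toString(table.getTableName());
        try{
            // The request is asynchronous: it returns before the compaction finishes.
            administer.majorCompact(tablename);
            System.out.println("Major compaction requested for " + tablename);
        }catch(Exception e){
            System.out.println(e);
        }
    }

    public static void checkIfRunning() throws MasterNotRunningException, ZooKeeperConnectionException{
        //Create the required configuration.
        Configuration conf = HBaseConfiguration.create();
        //Check if HBase is running
        try{
            HBaseAdmin.checkHBaseAvailable(conf);
        }catch(Exception e){
            System.err.println("Exception at " + e);
            System.exit(1);
        }
    }

    public void minorcompact(String trname) throws IOException, InterruptedException{
        //Create the required configuration.
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.compact(trname);
    }

    public void deletetable(String tablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.deleteTable(tablename);
    }

    public void disabletable(String tablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.disableTable(tablename);
    }

    public void enabletable(String tablename) throws IOException{
        //Create the required configuration.
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.enableTable(tablename);
    }

    public void flush(String trname) throws IOException, InterruptedException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.flush(trname);
    }

    public ClusterStatus getclusterstatus () throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        return administer.getClusterStatus();
    }

    public void printClusterDetails() throws IOException{
        ClusterStatus status = getclusterstatus();

        Collection<HServerInfo> serverinfo = status.getServerInfo();
        for (HServerInfo s : serverinfo){
            System.out.println("Servername " + s.getServerName());
            System.out.println("Hostname " + s.getHostname());
            System.out.println("Hostname:Port " + s.getHostnamePort());
            System.out.println("Info port" + s.getInfoPort());
            System.out.println();
        }

        String version = status.getHBaseVersion();
        System.out.println("Version " + version);

        int regioncounts = status.getRegionsCount();
        System.out.println("Regioncounts :" + regioncounts);

        int servers = status.getServers();
        System.out.println("Servers :" + servers);
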
for (String s : Servernames  ){
}

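    }

    public void isTableAvailable(String tablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        boolean result = administer.isTableAvailable(tablename);
        System.out.println("Table " + tablename + " available ? " + result);
    }

    public void isTableEnabled(String tablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        boolean result = administer.isTableEnabled(tablename);
        System.out.println("Table " + tablename + " enabled ? " + result);
    }

    public void isTableDisabled(String tablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        boolean result = administer.isTableDisabled(tablename);
        System.out.println("Table " + tablename + " disabled ? " + result);
    }

    public void tableExists(String tablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        boolean result = administer.tableExists(tablename);
        System.out.println("Table " + tablename + " exists ? " + result);
    }

    public void shutdown() throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        System.out.println("Shutting down..");
        administer.shutdown();
    }

    public void listTables () throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        for (HTableDescriptor t : administer.listTables()){
            System.out.println(t.getNameAsString());
        }
    }

    @SuppressWarnings("deprecation")
    public void modifyColumn(String tablename, String columnname, String descriptor) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.modifyColumn(tablename, columnname, new HColumnDescriptor(descriptor));
    }

    public void modifyTable(String tablename, String newtablename) throws IOException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.modifyTable(Bytes.toBytes(tablename), new HTableDescriptor(newtablename));
    }

    public void split(String tablename) throws IOException, InterruptedException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        administer.split(tablename);
    }

    public void isMasterRunning() throws MasterNotRunningException, ZooKeeperConnectionException{
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin administer = new HBaseAdmin(conf);
        System.out.println( "Master running ? "+ administer.isMasterRunning());
    }

    public static void main (String[] args) throws IOException{

        //Check if HBase is running properly
        checkIfRunning();

        //other functions based on arguments.

    }
}

```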

And hey, I’ll update this post soon with more details especially about compaction and stuff! Till then Cheers! 🙂

# Shell scripting: Arithmetic using expr, bc and dc tools

You will definitely need to do some math on the shell sooner or later. ‘expr’ has long been the most popular tool for evaluating arithmetic expressions there. I was looking at other options as well when I came across the bc and dc tools. I will explain each of them in this post.

expr

This is by far the best-known way to do math on the shell. It handles two kinds of expressions: string expressions and the usual numeric ones. I will be writing about the latter.

```
expr 40 % 5
0
expr 40 / 5
8
expr 40 / 5 / 8
1
expr 40 / 5 / 8 + 1
2
expr 40 / 5 / 8 + 1 * 10
expr: syntax error
```

The error occurs because the shell expands the unquoted * as a filename wildcard before expr ever sees it. So for multiplication you need to escape it with a backslash ‘\’. And thus,

```
expr 40 / 5 / 8 + 1 \* 10
11
```

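The usual precedence rules apply here too: multiplication and division bind tighter than addition and subtraction, and brackets override that (they have to be escaped as well, as \( and \)). Now let's look at the others.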

bc

bc is a language that supports arbitrary-precision numbers with interactive execution of statements. It starts by processing code from all the files listed on the command line, in the order listed, and then reads from standard input. So a neat way to calculate stuff is to pipe an expression into it:

```
echo "2*30/3" | bc
20
echo "20 + 5 * 3" | bc
35
```

Again this follows the basic BODMAS rules.

dc

dc stands for desk calculator. It's an interactive calculator on the shell that uses postfix (reverse Polish) notation: it supports basic arithmetic with the standard + - / * symbols, but the operator is entered after the numbers. Each number is pushed onto a stack, and an operator replaces the top two values with the result. Entering ‘p’ prints the top of the stack, similar to the ‘=’ key on a calculator. And you can keep going.

```
dc
98
9
*
p
882
10
/
p
88
```
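
To make the stack behaviour concrete, here is a minimal sketch of the same postfix idea in Java. It is only an illustration of the model, not how dc itself is implemented; the `RpnSketch` class and its methods are hypothetical, and it handles just whole numbers, the four operators and ‘p’:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RpnSketch {

    // Runs whitespace-separated tokens and returns every value that "p" would print.
    static List<Long> run(String program) {
        Deque<Long> stack = new ArrayDeque<>();
        List<Long> printed = new ArrayList<>();
        for (String tok : program.trim().split("\\s+")) {
            switch (tok) {
                case "p":
                    // Report the top of the stack without removing it.
                    printed.add(stack.peek());
                    break;
                case "+": case "-": case "*": case "/": {
                    // The value pushed last is the right-hand operand.
                    long right = stack.pop();
                    long left = stack.pop();
                    stack.push(apply(tok.charAt(0), left, right));
                    break;
                }
                default:
                    stack.push(Long.parseLong(tok));
            }
        }
        return printed;
    }

    // Integer arithmetic, so 882 / 10 gives 88 as in the dc session.
    static long apply(char op, long a, long b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b;
        }
    }

    public static void main(String[] args) {
        System.out.println(run("98 9 * p 10 / p")); // [882, 88]
    }
}
```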

If I find more useful tools, I’ll update this post. If you have better ideas to implement this, feel free to suggest!

# Shell scripting: awk tutorial

This is one of the best Linux line-processing utilities I have come across. It can do all sorts of stuff: add, replace, find and index text, and much more. I’ll just get to the examples.

The Basics:

Suppose we have a variable like:

```
echo $somefile
this_is_something_interesting

echo $somefile | awk -F '_' '{print toupper($3)}'
SOMETHING
```

Here, -F sets the field delimiter, and awk splits the contents of the variable or file on it, which in our case is ‘_’. print is the standard function to print out stuff. $3 contains the third field, which in this case is ‘something’, and toupper() converts it to uppercase. Duh!

This example shows how awk converts a string into an array.

```
echo $time
10:20:30

hms=`echo $time | awk '{split($0,a,":"); print a[1], a[2], a[3]}'`
echo $hms
10 20 30
```

Here the delimiter is ‘:’. split() stores the pieces in the variable a, which acts like an array: a[1] contains 10, a[2] contains the second piece 20, and so on.

Okay, so how do we get the second-to-last or last piece of a string?

```
c=`echo $i | awk 'BEGIN{FS="_"}{for (i=1; i<=NF; i++) if (i==NF-1) print $i}'` # NF holds the number of fields; c gets the second-to-last one
```

This is the basic syntax. You start with BEGIN, where you set the field separator FS. NF is a special variable that contains the number of fields. So we loop over the fields until we reach NF-1, the second-to-last one, which we check using the ‘if’ condition. Then just print it out of course! The same field is also available directly as $(NF-1), and the last one as $NF.

Substitution using awk:
This is done using ‘sub’. The first, and only the FIRST, occurrence of ‘shower’ is replaced by ‘stream’. ‘$0’ means the entire line.

```
text=`echo $text | awk '{sub("shower","stream"); print $0}'` # substitutes only the first occurrence
```

Global substitution using awk:
Using ‘gsub’, every match in the line is replaced, not just the first. Here a[a-z] matches an ‘a’ followed by any one lowercase letter, and each such pair of characters is replaced by x.

```
text1=`echo $text | awk '{gsub("a[a-z]","x"); print $0}'` # global: every 'a' plus the following lowercase letter becomes x
```

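Similarly, the pattern a*d matches zero or more ‘a’ characters followed by a ‘d’, and the first such match is replaced with ‘tt’. To match everything from an ‘a’ through to a ‘d’, the pattern would be a.*d instead.

```
text2=`echo $text | awk '{sub("a*d","tt"); print $0}'` # first run of zero or more a's followed by d becomes tt
```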

Similarly for numbers:

```
cat=`echo $name | awk '{sub("[0-9]+",""); print $0}'` # removes the first run of digits
```

Another example:

```
short=`echo $name | awk '{gsub("[b-z]",""); print $0}'` # global: removes every letter from b to z
```

Substring using awk:
Suppose we want just a part of the string. (12,8) means go to the 12th character and get up to 8 characters starting there. Only 7 characters remain from that position in this string, so that is what we get.

```
echo $caption
thisislinuxjunkies

object=`echo $caption | awk '{print substr($0,12,8)}'` # substring
echo $object
junkies
```

Reading a particular set of lines:
‘NR’ is a special awk variable that holds the number of lines read so far, so the condition NR < 3 is true only for the first two lines.

```
cat $myfile
a
b
c
d
awk 'NR < 3' $myfile # print only the first two lines of the file
a
b
```

This is not a complete set of things you get to do with awk. I’ll update this post if I find more neat tricks! Questions appreciated! 🙂

# Shell scripting: alias tutorial

This is a neat little utility on the bash shell. Aliases allow a string to be substituted for a word when it is used as the first word of a simple command.

```
alias today='date +"%A, %B %-d, %Y"' #Tuesday, October 18, 2011
```

Now typing today on the terminal returns “Tuesday, October 18, 2011”.

And how do you undo this alias? Simple.

```
unalias today
```

These are some neat little tricks I think are required on any linux box.

Use alias to fix missing space typos.

```
alias cd..='cd ..'
alias ..='cd ..'
```

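Display the working directory. Careful with this one: in bash, ‘.’ is also the builtin that sources a file, and the alias shadows it in interactive shells, so you may prefer another name.

```
alias .='echo $PWD'
```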

Prevent accidental deletions by making rm interactive.

```
alias rm='rm -i'
```

I hope this will get you started! 🙂

# A HDFSClient for Hadoop using the native JAVA API, a tutorial

I’d like to talk about doing some day-to-day administrative tasks on the Hadoop system. Although the hadoop fs <commands> can do most of these things, it's still worthwhile to explore the rich Java API for Hadoop. This post is by no means complete, but it can get you started well.

The most basic step is to create an object of this class.

```
HDFSClient client = new HDFSClient();
```

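Of course, you need to import a bunch of stuff. If you are using an IDE like Eclipse, it can add these for you. These imports should work fine for the entire code.

```
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
```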

1. Copying from Local file system to HDFS.
Copies a local file onto HDFS. You do have the hadoop file system command to do the same.

```
hadoop fs -copyFromLocal <local fs> <hadoop fs>
```

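I am not explaining much here as the comments are quite helpful. Of course, when loading the configuration files, make sure to point them to your Hadoop installation's location. It looks something like this:

```
Configuration conf = new Configuration();
// Placeholder paths: use the conf directory of your own Hadoop installation.
conf.addResource(new Path("/path/to/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/path/to/hadoop/conf/hdfs-site.xml"));

FileSystem fileSystem = FileSystem.get(conf);
```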

This is what the Java API looks like:

```
public void copyFromLocal (String source, String dest) throws IOException {

    Configuration conf = new Configuration();

    FileSystem fileSystem = FileSystem.get(conf);
    Path srcPath = new Path(source);

    Path dstPath = new Path(dest);
    // Check that the destination exists on HDFS
    if (!(fileSystem.exists(dstPath))) {
        System.out.println("No such destination " + dstPath);
        return;
    }

    // Get the filename out of the file path
    String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

    try{
        fileSystem.copyFromLocalFile(srcPath, dstPath);
        System.out.println("File " + filename + " copied to " + dest);
    }catch(Exception e){
        System.err.println("Exception caught! :" + e);
        System.exit(1);
    }finally{
        fileSystem.close();
    }
}
```

2.Copying files from HDFS to the local file system.

The hadoop fs command is the following.

```
hadoop fs -copyToLocal <hadoop fs> <local fs>
```

Alternatively,

```
hadoop fs -get <hadoop fs> <local fs>
```

```
public void copyFromHdfs (String source, String dest) throws IOException {

    Configuration conf = new Configuration();

    FileSystem fileSystem = FileSystem.get(conf);
    Path srcPath = new Path(source);

    Path dstPath = new Path(dest);
    // Check that the source exists on HDFS
    if (!(fileSystem.exists(srcPath))) {
        System.out.println("No such source " + srcPath);
        return;
    }

    // Get the filename out of the file path
    String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

    try{
        fileSystem.copyToLocalFile(srcPath, dstPath);
        System.out.println("File " + filename + " copied to " + dest);
    }catch(Exception e){
        System.err.println("Exception caught! :" + e);
        System.exit(1);
    }finally{
        fileSystem.close();
    }
}
```

3.Renaming a file in HDFS.

You can use the mv command in this context.

```
hadoop fs -mv <this name> <new name>
```

```
public void renameFile (String fromthis, String tothis) throws IOException{
    Configuration conf = new Configuration();

    FileSystem fileSystem = FileSystem.get(conf);
    Path fromPath = new Path(fromthis);
    Path toPath = new Path(tothis);

    // The file to rename must exist
    if (!(fileSystem.exists(fromPath))) {
        System.out.println("No such source " + fromPath);
        return;
    }

    // Do not overwrite an existing file
    if (fileSystem.exists(toPath)) {
        System.out.println("Already exists! " + toPath);
        return;
    }

    try{
        boolean isRenamed = fileSystem.rename(fromPath, toPath);
        if(isRenamed){
            System.out.println("Renamed from " + fromthis + " to " + tothis);
        }
    }catch(Exception e){
        System.out.println("Exception :" + e);
        System.exit(1);
    }finally{
        fileSystem.close();
    }

}
```

4.Add a file to HDFS.

```
public void addFile(String source, String dest) throws IOException {

// Conf object will read the HDFS configuration parameters
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

// Get the filename out of the file path
String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

// Create the destination path including the filename.
if (dest.charAt(dest.length() - 1) != '/') {
dest = dest + "/" + filename;
} else {
dest = dest + filename;
}

// Check if the file already exists
Path path = new Path(dest);
if (fileSystem.exists(path)) {
System.out.println("File " + dest + " already exists");
return;
}

// Create a new file and write data to it.
FSDataOutputStream out = fileSystem.create(path);
InputStream in = new BufferedInputStream(new FileInputStream(
new File(source)));

byte[] b = new byte[1024];
int numBytes = 0;
while ((numBytes = in.read(b)) > 0) {
out.write(b, 0, numBytes);
}

// Close all the file descriptors
in.close();
out.close();
fileSystem.close();
}

```
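
The only fiddly part of addFile is building the destination path. As a rough illustration, that step can be pulled out into plain string helpers that need no Hadoop classes at all. This is a sketch only; the `DestPathSketch` class and its method names are hypothetical and not part of HDFSClient:

```java
public class DestPathSketch {

    // Last component of a path, e.g. "/tmp/data.txt" -> "data.txt".
    // A name without any '/' is returned unchanged.
    static String fileName(String source) {
        return source.substring(source.lastIndexOf('/') + 1);
    }

    // Joins the destination directory and the source file name,
    // adding a "/" only when the directory does not already end with one.
    static String destinationFor(String source, String destDir) {
        String name = fileName(source);
        return destDir.endsWith("/") ? destDir + name : destDir + "/" + name;
    }

    public static void main(String[] args) {
        System.out.println(destinationFor("/tmp/data.txt", "/user/hadoop"));  // /user/hadoop/data.txt
        System.out.println(destinationFor("/tmp/data.txt", "/user/hadoop/")); // /user/hadoop/data.txt
    }
}
```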

5.Delete a file from HDFS.

You can use the following:

For removing a directory or a file:

```
hadoop fs -rmr <hdfs path>
```

If you want to skip the trash also, use:

```
hadoop fs -rmr -skipTrash <hdfs path>
```

```
public void deleteFile(String file) throws IOException {
    Configuration conf = new Configuration();

    FileSystem fileSystem = FileSystem.get(conf);

    Path path = new Path(file);
    if (!fileSystem.exists(path)) {
        System.out.println("File " + file + " does not exist");
        return;
    }

    // true = delete recursively, so this also removes directories
    fileSystem.delete(path, true);

    fileSystem.close();
}
```

6.Get modification time of a file in HDFS.

If you have any idea on this let me know. 🙂

```
public void getModificationTime(String source) throws IOException{

    Configuration conf = new Configuration();

    FileSystem fileSystem = FileSystem.get(conf);
    Path srcPath = new Path(source);

    // Check that the file exists
    if (!(fileSystem.exists(srcPath))) {
        System.out.println("No such destination " + srcPath);
        return;
    }
    // Get the filename out of the file path
    String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

    FileStatus fileStatus = fileSystem.getFileStatus(srcPath);
    // Modification time in milliseconds since the epoch
    long modificationTime = fileStatus.getModificationTime();

    System.out.format("File %s; Modification time : %d %n",filename,modificationTime);

}
```
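
The value returned by getModificationTime() is a count of milliseconds since January 1, 1970 UTC, which is not very readable when printed as a number. A minimal sketch of turning it into a timestamp with java.time follows. It needs Java 8 or later, and the `ModificationTimeSketch` class and the chosen pattern are my own assumptions, not part of HDFSClient:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class ModificationTimeSketch {

    // Formats in UTC so the output does not depend on the machine's time zone.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Converts milliseconds since the epoch into a readable UTC timestamp.
    static String describe(long modificationTimeMillis) {
        return FORMAT.format(Instant.ofEpochMilli(modificationTimeMillis));
    }

    public static void main(String[] args) {
        System.out.println(describe(0L));             // 1970-01-01 00:00:00
        System.out.println(describe(1318896000000L)); // 2011-10-18 00:00:00
    }
}
```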

7.Get the block locations of a file in HDFS.

```
public void getBlockLocations(String source) throws IOException{

    Configuration conf = new Configuration();

    FileSystem fileSystem = FileSystem.get(conf);
    Path srcPath = new Path(source);

    // Check that the file exists
    if (!(ifExists(srcPath))) {
        System.out.println("No such destination " + srcPath);
        return;
    }
    // Get the filename out of the file path
    String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

    FileStatus fileStatus = fileSystem.getFileStatus(srcPath);

    BlockLocation[] blkLocations = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
    int blkCount = blkLocations.length;

    System.out.println("File :" + filename + " stored at:");
    for (int i=0; i < blkCount; i++) {
        // Each block is replicated on several hosts
        String[] hosts = blkLocations[i].getHosts();
        System.out.format("Block %d: %s %n", i, java.util.Arrays.toString(hosts));
    }

}
```

8.List all the datanodes in terms of hostnames.
This is a neat way rather than looking up the /etc/hosts file in the namenode.

```
public void getHostnames () throws IOException{
Configuration config = new Configuration();

FileSystem fs = FileSystem.get(config);
DistributedFileSystem hdfs = (DistributedFileSystem) fs;
DatanodeInfo[] dataNodeStats = hdfs.getDataNodeStats();

String[] names = new String[dataNodeStats.length];
for (int i = 0; i < dataNodeStats.length; i++) {
names[i] = dataNodeStats[i].getHostName();
System.out.println((dataNodeStats[i].getHostName()));
}
}
```

9.Create a new directory in HDFS.
Creating a directory will be done as:

```
hadoop fs -mkdir <hadoop fs path>
```

```
public void mkdir(String dir) throws IOException {
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

Path path = new Path(dir);
if (fileSystem.exists(path)) {
System.out.println("Dir " + dir + " already exists!");
return;
}

fileSystem.mkdirs(path);

fileSystem.close();
}
```

10. Read a file from HDFS

```
public void readFile(String file) throws IOException {
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

Path path = new Path(file);
if (!fileSystem.exists(path)) {
System.out.println("File " + file + " does not exist");
return;
}

FSDataInputStream in = fileSystem.open(path);

String filename = file.substring(file.lastIndexOf('/') + 1,
file.length());

OutputStream out = new BufferedOutputStream(new FileOutputStream(
new File(filename)));

byte[] b = new byte[1024];
int numBytes = 0;
while ((numBytes = in.read(b)) > 0) {
out.write(b, 0, numBytes);
}

in.close();
out.close();
fileSystem.close();
}
```

11.Checking if a file exists in HDFS

```
public boolean ifExists (Path source) throws IOException{

Configuration config = new Configuration();

FileSystem hdfs = FileSystem.get(config);
boolean isExists = hdfs.exists(source);
return isExists;
}

```

I know this is in no way complete, and it is already a rather long post. I hope it is useful. Responses appreciated!
And here is the complete code for HDFSClient.java. Happy Hadooping! 🙂

```
/*
Feel free to use, copy and distribute this program in any form.
HDFSClient.java
https://linuxjunkies.wordpress.com/
2011
*/

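import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;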

public class HDFSClient {

public HDFSClient() {

}

public static void printUsage(){
System.out.println("Usage: hdfsclient add" + " <local_path> <hdfs_path>");
System.out.println("Usage: hdfsclient delete" + " <hdfs_path>");
System.out.println("Usage: hdfsclient mkdir" + " <hdfs_path>");
System.out.println("Usage: hdfsclient copyfromlocal" + " <local_path> <hdfs_path>");
System.out.println("Usage: hdfsclient copytolocal" + " <hdfs_path> <local_path>");
System.out.println("Usage: hdfsclient modificationtime" + " <hdfs_path>");
System.out.println("Usage: hdfsclient getblocklocations" + " <hdfs_path>");
System.out.println("Usage: hdfsclient gethostnames");
}

public boolean ifExists (Path source) throws IOException{

Configuration config = new Configuration();

FileSystem hdfs = FileSystem.get(config);
boolean isExists = hdfs.exists(source);
return isExists;
}

public void getHostnames () throws IOException{
Configuration config = new Configuration();

FileSystem fs = FileSystem.get(config);
DistributedFileSystem hdfs = (DistributedFileSystem) fs;
DatanodeInfo[] dataNodeStats = hdfs.getDataNodeStats();

String[] names = new String[dataNodeStats.length];
for (int i = 0; i < dataNodeStats.length; i++) {
names[i] = dataNodeStats[i].getHostName();
System.out.println((dataNodeStats[i].getHostName()));
}
}

public void getBlockLocations(String source) throws IOException{

Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);
Path srcPath = new Path(source);

// Check if the file already exists
if (!(ifExists(srcPath))) {
System.out.println("No such file " + srcPath);
return;
}
// Get the filename out of the file path
String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

FileStatus fileStatus = fileSystem.getFileStatus(srcPath);

BlockLocation[] blkLocations = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
int blkCount = blkLocations.length;

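System.out.println("File: " + filename + " stored at:");
for (int i=0; i < blkCount; i++) {
String[] hosts = blkLocations[i].getHosts();
// hosts is an array (one entry per replica), so convert it to text before printing
System.out.format("Block %d: %s %n", i, java.util.Arrays.toString(hosts));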
}

}

public void getModificationTime(String source) throws IOException{

Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);
Path srcPath = new Path(source);

// Check if the file already exists
if (!(fileSystem.exists(srcPath))) {
System.out.println("No such file " + srcPath);
return;
}
// Get the filename out of the file path
String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

FileStatus fileStatus = fileSystem.getFileStatus(srcPath);
long modificationTime = fileStatus.getModificationTime();

// getModificationTime() returns a long (milliseconds since the epoch), so use %d, not %f
System.out.format("File %s; Modification time : %d %n",filename,modificationTime);

}

public void copyFromLocal (String source, String dest) throws IOException {

Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);
Path srcPath = new Path(source);

Path dstPath = new Path(dest);
// Check if the file already exists
if (!(fileSystem.exists(dstPath))) {
System.out.println("No such destination " + dstPath);
return;
}

// Get the filename out of the file path
String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

try{
fileSystem.copyFromLocalFile(srcPath, dstPath);
System.out.println("File " + filename + " copied to " + dest);
}catch(Exception e){
System.err.println("Exception caught! :" + e);
System.exit(1);
}finally{
fileSystem.close();
}
}

public void copyToLocal (String source, String dest) throws IOException {

Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);
Path srcPath = new Path(source);

Path dstPath = new Path(dest);
// Check if the file already exists
if (!(fileSystem.exists(srcPath))) {
System.out.println("No such file " + srcPath);
return;
}

// Get the filename out of the file path
String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

try{
fileSystem.copyToLocalFile(srcPath, dstPath);
System.out.println("File " + filename + " copied to " + dest);
}catch(Exception e){
System.err.println("Exception caught! :" + e);
System.exit(1);
}finally{
fileSystem.close();
}
}

public void renameFile (String fromthis, String tothis) throws IOException{
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);
Path fromPath = new Path(fromthis);
Path toPath = new Path(tothis);

if (!(fileSystem.exists(fromPath))) {
System.out.println("No such file " + fromPath);
return;
}

if (fileSystem.exists(toPath)) {
System.out.println("Destination " + toPath + " already exists");
return;
}

try{
boolean isRenamed = fileSystem.rename(fromPath, toPath);
if(isRenamed){
System.out.println("Renamed from " + fromthis + " to " + tothis);
}
}catch(Exception e){
System.out.println("Exception :" + e);
System.exit(1);
}finally{
fileSystem.close();
}

}

public void addFile(String source, String dest) throws IOException {

// Conf object will read the HDFS configuration parameters
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

// Get the filename out of the file path
String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

// Create the destination path including the filename.
if (dest.charAt(dest.length() - 1) != '/') {
dest = dest + "/" + filename;
} else {
dest = dest + filename;
}

// Check if the file already exists
Path path = new Path(dest);
if (fileSystem.exists(path)) {
System.out.println("File " + dest + " already exists");
return;
}

// Create a new file and write data to it.
FSDataOutputStream out = fileSystem.create(path);
InputStream in = new BufferedInputStream(new FileInputStream(
new File(source)));

byte[] b = new byte[1024];
int numBytes = 0;
while ((numBytes = in.read(b)) > 0) {
out.write(b, 0, numBytes);
}

// Close all the file descriptors
in.close();
out.close();
fileSystem.close();
}

public void readFile(String file) throws IOException {
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

Path path = new Path(file);
if (!fileSystem.exists(path)) {
System.out.println("File " + file + " does not exist");
return;
}

FSDataInputStream in = fileSystem.open(path);

String filename = file.substring(file.lastIndexOf('/') + 1,
file.length());

OutputStream out = new BufferedOutputStream(new FileOutputStream(
new File(filename)));

byte[] b = new byte[1024];
int numBytes = 0;
while ((numBytes = in.read(b)) > 0) {
out.write(b, 0, numBytes);
}

in.close();
out.close();
fileSystem.close();
}

public void deleteFile(String file) throws IOException {
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

Path path = new Path(file);
if (!fileSystem.exists(path)) {
System.out.println("File " + file + " does not exist");
return;
}

fileSystem.delete(new Path(file), true);

fileSystem.close();
}

public void mkdir(String dir) throws IOException {
Configuration conf = new Configuration();

FileSystem fileSystem = FileSystem.get(conf);

Path path = new Path(dir);
if (fileSystem.exists(path)) {
System.out.println("Dir " + dir + " already exists!");
return;
}

fileSystem.mkdirs(path);

fileSystem.close();
}

public static void main(String[] args) throws IOException {

if (args.length < 1) {
printUsage();
System.exit(1);
}

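HDFSClient client = new HDFSClient();
if (args[0].equals("add")) {
if (args.length < 3) {
System.out.println("Usage: hdfsclient add <local_path> " + "<hdfs_path>");
System.exit(1);
}

client.addFile(args[1], args[2]);
} else if (args[0].equals("read")) {
if (args.length < 2) {
System.out.println("Usage: hdfsclient read <hdfs_path>");
System.exit(1);
}

client.readFile(args[1]);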
} else if (args[0].equals("delete")) {
if (args.length < 2) {
System.out.println("Usage: hdfsclient delete <hdfs_path>");
System.exit(1);
}

client.deleteFile(args[1]);
} else if (args[0].equals("mkdir")) {
if (args.length < 2) {
System.out.println("Usage: hdfsclient mkdir <hdfs_path>");
System.exit(1);
}

client.mkdir(args[1]);
}else if (args[0].equals("copyfromlocal")) {
if (args.length < 3) {
System.out.println("Usage: hdfsclient copyfromlocal <from_local_path> <to_hdfs_path>");
System.exit(1);
}

client.copyFromLocal(args[1], args[2]);
} else if (args[0].equals("rename")) {
if (args.length < 3) {
System.out.println("Usage: hdfsclient rename <old_hdfs_path> <new_hdfs_path>");
System.exit(1);
}

client.renameFile(args[1], args[2]);
}else if (args[0].equals("copytolocal")) {
if (args.length < 3) {
System.out.println("Usage: hdfsclient copytolocal <from_hdfs_path> <to_local_path>");
System.exit(1);
}

client.copyToLocal(args[1], args[2]);
}else if (args[0].equals("modificationtime")) {
if (args.length < 2) {
System.out.println("Usage: hdfsclient modificationtime <hdfs_path>");
System.exit(1);
}

client.getModificationTime(args[1]);
}else if (args[0].equals("getblocklocations")) {
if (args.length < 2) {
System.out.println("Usage: hdfsclient getblocklocations <hdfs_path>");
System.exit(1);
}

client.getBlockLocations(args[1]);
} else if (args[0].equals("gethostnames")) {

client.getHostnames();
}else {

printUsage();
System.exit(1);
}

System.out.println("Done!");
}
}
```
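Several methods in HDFSClient.java pull the file name out of a path with `substring` and `lastIndexOf('/')`, and `addFile` then appends that name to the destination directory. Below is a small sketch of just that string handling, runnable without Hadoop. The class `PathNames` and its two methods are hypothetical names I made up for this example, not something from the Hadoop API. It uses `endsWith("/")` instead of `charAt(dest.length() - 1)`, so an empty destination string does not throw.

```java
public class PathNames {

    // Returns the part of the path after the last '/'.
    // If the path has no '/', lastIndexOf returns -1 and the whole string comes back.
    public static String fileName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Appends the source file name to a destination directory,
    // adding a '/' separator only when the directory does not already end with one.
    public static String destinationFor(String source, String destDir) {
        String name = fileName(source);
        return destDir.endsWith("/") ? destDir + name : destDir + "/" + name;
    }

    public static void main(String[] args) {
        System.out.println(fileName("/home/user/data.txt"));               // data.txt
        System.out.println(destinationFor("/home/user/data.txt", "/in"));  // /in/data.txt
        System.out.println(destinationFor("data.txt", "/in/"));            // /in/data.txt
    }
}
```

Keeping this logic in one place would also remove the repeated `substring` line from the copy, rename and block-location methods.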