Wednesday, July 29, 2015

Hadoop Admin Commands quick reference

Reference: http://www.thegeekstuff.com/
 
Hadoop filesystem commands


hadoop fs -mkdir /dir
hadoop fs -ls
hadoop fs -cat <filename>
hadoop fs -rm <filename>
hadoop fs -mv file:///data/datafile /user/hduser/data
hadoop fs -touchz <filename>   -- create an empty file
hadoop fs -stat <filename>
hadoop fs -expunge             -- empty the trash on HDFS
ram@ram:/etc/init.d$ hadoop fs -du /user
50270  /user/1.log
0      /user/hive

hadoop fs -copyFromLocal <source> <destination>   -- copy from the local filesystem to HDFS
hadoop fs -copyToLocal <source> <destination>     -- copy from HDFS to the local filesystem
hadoop fs -put <source> <destination>             -- copy from the local filesystem to HDFS (like copyFromLocal)
hadoop fs -get <source> <destination>             -- copy from HDFS to the local filesystem (like copyToLocal)
hadoop distcp hdfs://192.168.0.8:8020/input hdfs://192.168.0.8:8020/output
-- Copy data from one cluster to another using the cluster URLs
hadoop fs -setrep -w 3 file1
-- Set the replication factor of file1 to 3 and wait for it to complete
hadoop fs -getmerge mydir bigfile
-- Merge the files in the mydir directory and download them as one big file
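
As a quick end-to-end illustration of the commands above (the path /user/hduser/demo and the file sample.txt are just example names, not from the original reference):

hadoop fs -mkdir /user/hduser/demo               -- create a working directory on HDFS
hadoop fs -put sample.txt /user/hduser/demo/     -- upload a local file
hadoop fs -ls /user/hduser/demo                  -- confirm the file is there
hadoop fs -cat /user/hduser/demo/sample.txt      -- print its contents
hadoop fs -get /user/hduser/demo/sample.txt .    -- download it back to the local filesystem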




Hadoop Job Commands

hadoop job -submit <job-file>
hadoop job -status <job-id>
hadoop job -history <job-output-dir>
hadoop job -kill-task <task-id>
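
The hadoop job script is deprecated in Hadoop 2.x (see the DEPRECATED warnings in the output below); the equivalent operations can be run through the mapred script, for example:

mapred job -list
mapred job -status <job-id>
mapred job -kill <job-id>
mapred job -kill-task <task-id>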


ram@ram:/etc/init.d$ hadoop job -list all
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

15/07/29 21:03:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/07/29 21:03:51 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Total jobs:0
                  JobId         State         StartTime        UserName           Queue      Priority     UsedContainers     RsvdContainers     UsedMem RsvdMem     NeededMem       AM info


ram@ram:/etc/init.d$ hadoop job -list-active-trackers
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

15/07/29 21:04:24 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
tracker_ram:49874



Hadoop Namenode commands

hadoop namenode -format
hadoop namenode -upgrade
hadoop namenode -recover -force   -- recover namenode metadata after a cluster failure (may lose data)
hadoop fsck / -delete             -- delete corrupted files
hadoop fsck / -move               -- move corrupted files to the /lost+found folder
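
Before deleting or moving anything with fsck, it can be safer to list the affected files first with a read-only check (flag as in Hadoop 2.x):

hadoop fsck / -list-corruptfileblocks   -- print files with missing or corrupt blocks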

ram@ram:/etc/init.d$ stop-dfs.sh
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode

ram@ram:/etc/init.d$ stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
no proxyserver to stop


ram@ram:/etc/init.d$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-ram-namenode-ram.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-ram-datanode-ram.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-ram-secondarynamenode-ram.out


ram@ram:/etc/init.d$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-ram-resourcemanager-ram.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-ram-nodemanager-ram.out
ram@ram:/etc/init.d$


ram@ram:/etc/init.d$ jps
6330 NodeManager
6192 ResourceManager
5827 DataNode
6649 Jps
6028 SecondaryNameNode
5664 NameNode



ram@ram:/etc/init.d$ hadoop fsck /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Connecting to namenode via http://localhost:50070/fsck?ugi=ram&path=%2F
FSCK started by ram (auth:SIMPLE) from /127.0.0.1 for path / at Wed Jul 29 20:56:55 IST 2015
.
/user/1.log:  Under replicated BP-393036986-127.0.1.1-1437358619878:blk_1073741825_1001. Target Replicas is 3 but found 1 replica(s).
Status: HEALTHY
 Total size:    50270 B
 Total dirs:    7
 Total files:    1
 Total symlinks:        0
 Total blocks (validated):    1 (avg. block size 50270 B)
 Minimally replicated blocks:    1 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    1 (100.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    1.0
 Corrupt blocks:        0
 Missing replicas:        2 (66.666664 %)
 Number of data-nodes:        1
 Number of racks:        1
FSCK ended at Wed Jul 29 20:56:55 IST 2015 in 3 milliseconds


The filesystem under path '/' is HEALTHY



ram@ram:/etc/init.d$ hadoop fsck / -files -blocks -locations -racks
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.


Connecting to namenode via http://localhost:50070/fsck?ugi=ram&files=1&blocks=1&locations=1&racks=1&path=%2F
FSCK started by ram (auth:SIMPLE) from /127.0.0.1 for path / at Wed Jul 29 20:58:22 IST 2015
/ <dir>
/tmp <dir>
/tmp/hive <dir>
/tmp/hive/ram <dir>
/user <dir>
/user/1.log 50270 bytes, 1 block(s):  Under replicated BP-393036986-127.0.1.1-1437358619878:blk_1073741825_1001. Target Replicas is 3 but found 1 replica(s).
0. BP-393036986-127.0.1.1-1437358619878:blk_1073741825_1001 len=50270 repl=1 [/default-rack/127.0.0.1:50010]

/user/hive <dir>
/user/hive/warehouse <dir>
Status: HEALTHY
 Total size:    50270 B
 Total dirs:    7
 Total files:    1
 Total symlinks:        0
 Total blocks (validated):    1 (avg. block size 50270 B)
 Minimally replicated blocks:    1 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    1 (100.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    1.0
 Corrupt blocks:        0
 Missing replicas:        2 (66.666664 %)
 Number of data-nodes:        1
 Number of racks:        1
FSCK ended at Wed Jul 29 20:58:22 IST 2015 in 3 milliseconds


The filesystem under path '/' is HEALTHY
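
Since running fsck through the hadoop script is deprecated, the same check can be issued through the hdfs script in Hadoop 2.x, for example:

hdfs fsck / -files -blocks -locations -racks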



Hadoop dfsadmin commands

ram@ram:/etc/init.d$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Configured Capacity: 98496679936 (91.73 GB)
Present Capacity: 80164052992 (74.66 GB)
DFS Remaining: 80163958784 (74.66 GB)
DFS Used: 94208 (92 KB)
DFS Used%: 0.00%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: ram
Decommission Status : Normal
Configured Capacity: 98496679936 (91.73 GB)
DFS Used: 94208 (92 KB)
Non DFS Used: 18332626944 (17.07 GB)
DFS Remaining: 80163958784 (74.66 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.39%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Jul 29 21:06:41 IST 2015



ram@ram:/etc/init.d$ hadoop dfsadmin -setQuota 10 /user
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.



ram@ram:/etc/init.d$ hadoop fs -count -q /user
          10               6            none             inf            3            1              50270 /user
ram@ram:/etc/init.d$
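
The columns printed by hadoop fs -count -q are, in order: name quota, remaining name quota, space quota, remaining space quota, directory count, file count, content size, and path. So the output above shows a name quota of 10 with 6 remaining (3 directories plus 1 file consume 4 names) and no space quota set. If needed, the quota can be removed again with:

hadoop dfsadmin -clrQuota /user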

ram@ram:/etc/init.d$ hadoop dfsadmin -safemode enter
Safe mode is ON


-- Back up the metadata (fsimage & edits). Put the cluster in safe mode before running this command.
ram@ram:/etc/init.d$ hadoop dfsadmin -saveNamespace
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Save namespace successful


ram@ram:/etc/init.d$ hadoop dfsadmin -safemode get
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Safe mode is ON



ram@ram:/etc/init.d$ hadoop dfsadmin -safemode leave
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Safe mode is OFF
ram@ram:/etc/init.d$



Hadoop yarn commands
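
A few commonly used YARN commands (Hadoop 2.x), sketched here as a starting point:

yarn application -list                      -- list running applications
yarn application -kill <application-id>     -- kill a running application
yarn node -list                             -- list node managers known to the ResourceManager
yarn logs -applicationId <application-id>   -- fetch aggregated logs of a finished application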



Hadoop Balancer commands

ram@ram:/etc/init.d$ start-balancer.sh
starting balancer, logging to /usr/local/hadoop/logs/hadoop-ram-balancer-ram.out

hadoop dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>
ram@ram:/etc/init.d$ hadoop balancer -threshold 20
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

15/07/29 21:16:22 INFO balancer.Balancer: Using a threshold of 20.0
15/07/29 21:16:22 INFO balancer.Balancer: namenodes  = [hdfs://localhost:9000]
15/07/29 21:16:22 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=20.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
15/07/29 21:16:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/07/29 21:16:24 INFO net.NetworkTopology: Adding a new node: /default-rack/127.0.0.1:50010
15/07/29 21:16:24 INFO balancer.Balancer: 0 over-utilized: []
15/07/29 21:16:24 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
29 Jul, 2015 9:16:24 PM           0                  0 B                 0 B               -1 B
29 Jul, 2015 9:16:24 PM  Balancing took 2.217 seconds
ram@ram:/etc/init.d$
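
A typical sequence is to cap the balancer bandwidth per datanode before starting it. The value is in bytes per second; 10485760 (about 10 MB/s) below is only an illustrative figure:

hadoop dfsadmin -setBalancerBandwidth 10485760
start-balancer.sh -threshold 10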




