Modules of Hadoop¶
- Hadoop Distributed File System (HDFS): A reliable, high-bandwidth, low-cost data storage cluster that facilitates the management of related files across machines.
- Hadoop MapReduce: A high-performance parallel/distributed data-processing implementation of the MapReduce programming model.
- Hadoop YARN: A framework for job scheduling and cluster resource management.
- Hadoop Common: The common utilities that support the other Hadoop modules.
HDFS¶
- HDFS is a highly reliable distributed storage system, designed to keep data safe even when individual servers fail.
- It is a data service that offers a unique set of capabilities needed when data volumes and velocity are high.
- The service consists of a “NameNode” and multiple “DataNodes”.
NameNodes¶
- HDFS works by breaking large files into smaller pieces called blocks. The blocks are stored on DataNodes, and it is the NameNode’s responsibility to know which blocks on which DataNodes make up the complete file.
- DataNodes are not very smart, but the NameNode is. Each DataNode continually sends heartbeats to the NameNode, asking whether there is anything for it to do. This continuous behavior also tells the NameNode which DataNodes are out there and how busy they are (see the report sketch below).
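A quick way to observe this bookkeeping is the standard dfsadmin report (it may require HDFS superuser privileges, depending on the cluster's configuration), which lists each live DataNode with its capacity, usage, and last heartbeat:

# Print cluster-wide capacity plus one section per live DataNode,
# including the time of its last heartbeat to the NameNode
hdfs dfsadmin -report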
DataNodes¶
- Within the HDFS cluster, data blocks are replicated across multiple DataNodes, and access is managed by the NameNode.
Under the covers of HDFS¶
- Big data brings the big challenges of volume, velocity, and variety.
- HDFS addresses these challenges by breaking files into a related collection of smaller blocks. These blocks are distributed among the data nodes in the HDFS cluster and are managed by the NameNode.
- Block sizes are configurable and are usually 128 MB or 256 MB, meaning that a 1 GB file consumes eight 128 MB blocks for its basic storage needs (see the sketch after this list).
- HDFS is resilient: these blocks are replicated throughout the cluster so that a server failure does not lose data. How does HDFS keep track of all these pieces? The short answer is file-system metadata.
- HDFS metadata is stored in the NameNode, and while the cluster is operating, all the metadata is loaded into the physical memory of the NameNode server.
- DataNodes are deliberately simple: they are servers that store the blocks for a given set of files.
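As a quick sanity check of the block-size arithmetic above, the configured block size can be read back from the cluster, and it can be overridden per file at write time. A minimal sketch (bigfile.dat is a hypothetical local file):

# Show the configured block size in bytes (134217728 bytes = 128 MB)
hdfs getconf -confKey dfs.blocksize

# A 1 GB file then needs ceil(1024 MB / 128 MB) = 8 blocks.
# The block size can be overridden per file at write time
# via the generic -D option (here: 256 MB):
hadoop fs -D dfs.blocksize=268435456 -put bigfile.dat /user/yanfei/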
Data storage of HDFS¶
- Multiple copies of each block are stored across the cluster on different nodes.
- This is data replication. By default, the HDFS replication factor is 3 (and it can be changed per path, as shown after this list).
- It provides fault tolerance, reliability, and high availability.
- A large file is split into $n$ small blocks. These blocks are stored on different nodes in the cluster in a distributed manner, and each block is in turn replicated across different nodes.
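Replication can be inspected and adjusted per path; for instance (using the example paths from this lecture):

# The second column of -ls output is the replication factor of each file
hadoop fs -ls /user/yanfei

# Raise the replication factor of an existing file to 3;
# -w waits until the extra replicas have actually been created
hadoop fs -setrep -w 3 /user/yanfei/hadoop.ipynb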
In [6]:
hadoop fs
Usage: hadoop fs [generic options]
	[-appendToFile <localsrc> ... <dst>]
	[-cat [-ignoreCrc] <src> ...]
	[-checksum <src> ...]
	[-chgrp [-R] GROUP PATH...]
	[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
	[-chown [-R] [OWNER][:[GROUP]] PATH...]
	[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
	[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
	[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
	[-createSnapshot <snapshotDir> [<snapshotName>]]
	[-deleteSnapshot <snapshotDir> <snapshotName>]
	[-df [-h] [<path> ...]]
	[-du [-s] [-h] [-v] [-x] <path> ...]
	[-expunge [-immediate]]
	[-find <path> ... <expression> ...]
	[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-getfacl [-R] <path>]
	[-getfattr [-R] {-n name | -d} [-e en] <path>]
	[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
	[-head <file>]
	[-help [cmd ...]]
	[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
	[-mkdir [-p] <path> ...]
	[-moveFromLocal <localsrc> ... <dst>]
	[-moveToLocal <src> <localdst>]
	[-mv <src> ... <dst>]
	[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
	[-renameSnapshot <snapshotDir> <oldName> <newName>]
	[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
	[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
	[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
	[-setfattr {-n name [-v value] | -x name} <path>]
	[-setrep [-R] [-w] <rep> <path> ...]
	[-stat [format] <path> ...]
	[-tail [-f] [-s <sleep interval>] <file>]
	[-test -[defswrz] <path>]
	[-text [-ignoreCrc] <src> ...]
	[-touch [-a] [-m] [-t TIMESTAMP ] [-c] <path> ...]
	[-touchz <path> ...]
	[-truncate [-w] <length> <path> ...]
	[-usage [cmd ...]]

Generic options supported are:
-conf <configuration file>          specify an application configuration file
-D <property=value>                 define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <file1,...>                  specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>                 specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>            specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
HDFS commands¶
hadoop fs                           # list all available file system commands
hadoop fs -help                     # detailed help for every command
hadoop fs -ls /                     # list the HDFS root directory
hadoop fs -ls /user/yanfei          # list a specific user's directory
hadoop fs -mv LICENSE license.txt   # rename (move) a file within HDFS
hadoop fs -mkdir yourNAME           # create a directory (relative paths resolve to the user's home)
In [8]:
hadoop fs -ls /
Found 4 items
drwxrwxrwx   - spark  hadoop          0 2023-10-17 08:58 /spark-history
drwxrwxrwx   - hdfs   hadoop          0 2023-10-16 16:33 /tmp
drwxrwxrwx   - hdfs   hadoop          0 2023-10-16 16:33 /user
drwxr-x--x   - hadoop hadoop          0 2023-10-16 16:33 /yarn
In [4]:
hadoop fs -ls /user/
Found 3 items
drwxr-x--x   - hive   hadoop          0 2023-10-16 16:33 /user/hive
drwxr-x--x   - spark  hadoop          0 2023-10-16 16:33 /user/spark
drwxr-x--x   - yanfei hadoop          0 2023-10-31 19:22 /user/yanfei
In [12]:
hadoop fs -mkdir /user/yanfei
In [13]:
hadoop fs -ls /user
Found 5 items
drwxr-x--x   - hive    hadoop          0 2025-02-24 11:40 /user/hive
drwxr-x--x   - root    hadoop          0 2025-02-24 18:08 /user/root
drwxr-x--x   - spark   hadoop          0 2025-02-24 11:40 /user/spark
drwxr-x--x   - teacher hadoop          0 2025-02-24 22:06 /user/teacher
drwxr-x--x   - yanfei  hadoop          0 2025-03-03 23:14 /user/yanfei
In [14]:
hadoop fs -put /home/yanfei/lectures/BDE-hadoop.ipynb .
In [15]:
hadoop fs -ls /user/yanfei
Found 1 items
-rw-r-----   2 yanfei hadoop      46055 2025-03-03 23:14 /user/yanfei/BDE-hadoop.ipynb
In [16]:
hadoop fs -mv BDE-hadoop.ipynb hadoop.ipynb
In [17]:
hadoop fs -ls /user/yanfei
Found 1 items
-rw-r-----   2 yanfei hadoop      46055 2025-03-03 23:14 /user/yanfei/hadoop.ipynb
In [18]:
hdfs fsck /user/yanfei/hadoop.ipynb -files -blocks -locations
Connecting to namenode via http://master-1-1.c-90a0b99c49464ed0.cn-hangzhou.emr.aliyuncs.com:9870/fsck?ugi=yanfei&files=1&blocks=1&locations=1&path=%2Fuser%2Fyanfei%2Fhadoop.ipynb
FSCK started by yanfei (auth:SIMPLE) from /10.0.0.123 for path /user/yanfei/hadoop.ipynb at Mon Mar 03 23:15:05 CST 2025

/user/yanfei/hadoop.ipynb 46055 bytes, replicated: replication=2, 1 block(s):  OK
0. BP-465277494-10.0.0.123-1740368374401:blk_1073741842_1018 len=46055 Live_repl=2  [DatanodeInfoWithStorage[10.0.0.122:9866,DS-c9551a7c-917a-4aba-b747-408a0749cf43,DISK], DatanodeInfoWithStorage[10.0.0.124:9866,DS-d24ad1ec-6b54-4150-b31c-88d0f39083bb,DISK]]

Status: HEALTHY
 Number of data-nodes:	2
 Number of racks:	1
 Total dirs:	0
 Total symlinks:	0

Replicated Blocks:
 Total size:	46055 B
 Total files:	1
 Total blocks (validated):	1 (avg. block size 46055 B)
 Minimally replicated blocks:	1 (100.0 %)
 Over-replicated blocks:	0 (0.0 %)
 Under-replicated blocks:	0 (0.0 %)
 Mis-replicated blocks:	0 (0.0 %)
 Default replication factor:	2
 Average block replication:	2.0
 Missing blocks:	0
 Corrupt blocks:	0
 Missing replicas:	0 (0.0 %)

Erasure Coded Block Groups:
 Total size:	0 B
 Total files:	0
 Total block groups (validated):	0
 Minimally erasure-coded block groups:	0
 Over-erasure-coded block groups:	0
 Under-erasure-coded block groups:	0
 Unsatisfactory placement block groups:	0
 Average block group size:	0.0
 Missing block groups:	0
 Corrupt block groups:	0
 Missing internal blocks:	0
FSCK ended at Mon Mar 03 23:15:05 CST 2025 in 4 milliseconds

The filesystem under path '/user/yanfei/hadoop.ipynb' is HEALTHY
HDFS path¶
/user/yanfei/hadoop.ipynb
refers to the logical path within HDFS that is used to access and manage the file. This logical path is stored in the metadata managed by the NameNode.
- The physical data blocks of the file, however, are distributed across the DataNodes.
- When you access a file in HDFS, the logical path is used to retrieve metadata from the NameNode, including the locations of the file's blocks. The client then retrieves the actual data blocks from the respective DataNodes.
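In other words, a client only ever names the logical path; block lookup and retrieval happen behind the scenes. For example, using the file stored above:

# Stream the file back through its logical path; the NameNode resolves
# the path to block locations and the bytes come from the DataNodes
hadoop fs -cat /user/yanfei/hadoop.ipynb | head

# Or copy it back to the local filesystem
hadoop fs -get /user/yanfei/hadoop.ipynb /tmp/hadoop.ipynb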