Big Data Essentials¶

L2: Using Linux as a Data Scientist¶





Yanfei Kang
yanfeikang@buaa.edu.cn
School of Economics and Management
Beihang University
http://yanfei.site

Why Linux?¶

  • Linux is a free, open-source operating system.

  • Almost all distributed computing software is available for Linux distributions.

How much do we need to know about Linux?¶

  • Log in to a Linux server

  • Navigation with basic Linux shell commands

  • File Manipulation

  • Use an editor within a Linux server

  • Knowing your environment

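Log in to the Aliyun server¶

  • Mac: open Terminal and type ssh username@ip, replacing username and ip with your account name and the server's IP address.
  • Windows: get Windows Terminal from the Microsoft Store, then run the same ssh command there.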

The command line¶

  • A command line, or terminal, is a text-based interface to the system.
  • You enter commands by typing them on the keyboard, and feedback is returned to you as text.

Shell¶

  • A program that takes commands from the keyboard and gives them to the operating system to perform.
  • Bash: one kind of shell.


Basic Navigation¶

In [ ]:
# Who am I?
whoami

# Where are we?
pwd
yanfei
/home/yanfei/lectures
In [2]:
# What's in our current location?

ls
BDE-L0-intro.ipynb                  BDE-L4-pymodeling.slides.html
BDE-L0-intro.slides.html            BDE-L5-scraping.ipynb
BDE-L10-sparkintro.ipynb            BDE-L5-scraping.slides.html
BDE-L10-sparkintro.slides.html      BDE-L5-selenium.html
BDE-L11-sparkdp.ipynb               BDE-L5-selenium.ipynb
BDE-L11-sparkdp.slides.html         BDE-L5-selenium.slides.html
BDE-L12-sparkml.ipynb               BDE-L6-parallel.ipynb
BDE-L12-sparkml.slides.html         BDE-L6-parallel.slides.html
BDE-L13-sparktm.ipynb               BDE-L7-hadoop.ipynb
BDE-L13-sparktm.slides.html         BDE-L7-hadoop.slides.html
BDE-L14-SparkR-example.slides.html  BDE-L8-streaming.ipynb
BDE-L1-bigdata.ipynb                BDE-L8-streaming.slides.html
BDE-L1-bigdata.slides.html          BDE-L9-hive.ipynb
BDE-L2-linux.ipynb                  BDE-L9-hive.slides.html
BDE-L2-linux.slides.html            code
BDE-L3-git.ipynb                    data
BDE-L3-git.slides.html              figs
BDE-L3-pyintro.ipynb                Makefile
BDE-L3-pyintro.slides.html          myspark
BDE-L4-pymodeling.ipynb             setup_rise.py

Paths¶

  • The file system under Linux is a hierarchical structure.
  • At the very top of the structure is what's called the root directory. It is denoted by a single slash ( / ).

Absolute and relative paths¶

  • Absolute paths specify a location (file or directory) in relation to the root directory. You can identify them easily as they always begin with a forward slash ( / ).
  • Relative paths specify a location (file or directory) in relation to where we currently are in the system. They will not begin with a slash.
In [1]:
ls
BDE-L0-intro.ipynb                  BDE-L4-pymodeling.slides.html
BDE-L0-intro.slides.html            BDE-L5-scraping.ipynb
BDE-L10-sparkintro.ipynb            BDE-L5-scraping.slides.html
BDE-L10-sparkintro.slides.html      BDE-L5-selenium.html
BDE-L11-sparkdp.ipynb               BDE-L5-selenium.ipynb
BDE-L11-sparkdp.slides.html         BDE-L5-selenium.slides.html
BDE-L12-sparkml.ipynb               BDE-L6-parallel.ipynb
BDE-L12-sparkml.slides.html         BDE-L6-parallel.slides.html
BDE-L13-sparktm.ipynb               BDE-L7-hadoop.ipynb
BDE-L13-sparktm.slides.html         BDE-L7-hadoop.slides.html
BDE-L14-SparkR-example.slides.html  BDE-L8-streaming.ipynb
BDE-L1-bigdata.ipynb                BDE-L8-streaming.slides.html
BDE-L1-bigdata.slides.html          BDE-L9-hive.ipynb
BDE-L2-linux.ipynb                  BDE-L9-hive.slides.html
BDE-L2-linux.slides.html            code
BDE-L3-git.ipynb                    data
BDE-L3-git.slides.html              figs
BDE-L3-pyintro.ipynb                Makefile
BDE-L3-pyintro.slides.html          myspark
BDE-L4-pymodeling.ipynb             setup_rise.py
In [18]:
ls /home/yanfei/lectures
BDE-L0-intro.ipynb		BDE-L5-scraping.ipynb
BDE-L0-intro.slides.html	BDE-L5-scraping.slides.html
BDE-L10-sparkintro.ipynb	BDE-L6-parallel.ipynb
BDE-L10-sparkintro.slides.html	BDE-L6-parallel.slides.html
BDE-L11-sparkdp.ipynb		BDE-L7-hadoop.ipynb
BDE-L11-sparkdp.slides.html	BDE-L7-hadoop.slides.html
BDE-L12-sparkml.ipynb		BDE-L8-streaming.ipynb
BDE-L12-sparkml.slides.html	BDE-L8-streaming.slides.html
BDE-L13-sparktm.ipynb		BDE-L9-hive.ipynb
BDE-L13-sparktm.slides.html	BDE-L9-hive.slides.html
BDE-L14-SparkR-example.ipynb	code
BDE-L1-bigdata.ipynb		data
BDE-L1-bigdata.slides.html	figs
BDE-L2-linux.ipynb		Makefile
BDE-L2-linux.slides.html	myspark
BDE-L3-pyintro.ipynb		output
BDE-L3-pyintro.slides.html	setup_rise.py
BDE-L4-pymodeling.ipynb		spark-warehouse
BDE-L4-pymodeling.slides.html

More on paths¶

  • ~ (tilde) - A shortcut for your home directory. For example, /home/yanfei/lectures can be written as ~/lectures.
  • . (dot) - A reference to your current directory. For example, ls ./ lists the current directory.
  • .. (dotdot) - A reference to the parent directory. For example, ls ../ lists the parent directory.

Let's move around a bit¶

  • In order to move around in the system we use a command called cd which stands for change directory.
  • Use Tab completion: press Tab and the shell completes a partially typed command or path for you.
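
The path shortcuts and cd can be tried out safely in a throwaway directory. The following is a minimal sketch, assuming a Bash-like shell; the sandbox and the directory names project and data are made up for illustration and are not part of the lecture files.

```shell
# Build a small sandbox so the example does not depend on any existing files.
# The directory names below are hypothetical.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project/data"

cd "$sandbox/project/data"   # absolute path: starts from the root directory /
pwd

cd ..                        # relative path: go up to the parent directory (project)
pwd

cd ./data                    # relative path: "." is the current directory
cd ../..                     # two levels up: back to the sandbox
here=$(pwd)

cd ~                         # "~" expands to your home directory
cd - > /dev/null             # "cd -" returns to the directory you were in before
back=$(pwd)
echo "$back"
```

Running cd with no argument at all also takes you to your home directory.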

Need help with a command?¶

  • The manual pages are a set of pages that explain every command available on your system including what they do, the specifics of how you run them and what command line arguments they accept.
  • Try man ls.
  • Try ls --help.
In [6]:
ls -lstah
total 12M
4.0K drwxrwxr-x  8 yanfei yanfei 4.0K Oct 18 10:52 .
 44K -rw-rw-r--  1 yanfei yanfei  44K Oct 18 10:52 BDE-L2-linux.ipynb
4.0K drwxr--r-x 12 yanfei yanfei 4.0K Oct 18 10:49 ..
4.0K drwxrwxr-x  8 yanfei yanfei 4.0K Oct 17 17:03 .git
324K -rw-rw-r--  1 yanfei yanfei 324K Oct 17 17:02 BDE-L2-linux.slides.html
4.0K drwxrwxr-x  2 yanfei yanfei 4.0K Oct 17 11:16 myspark
4.0K -rwxrwxr-x  1 yanfei yanfei  898 Oct 17 11:16 setup_rise.py
4.0K drwxrwxr-x  2 yanfei yanfei 4.0K Oct 17 11:16 figs
4.0K drwxrwxr-x  6 yanfei yanfei 4.0K Oct 17 11:16 data
4.0K drwxrwxr-x  6 yanfei yanfei 4.0K Oct 17 11:16 code
592K -rw-rw-r--  1 yanfei yanfei 590K Oct 17 11:16 BDE-L9-hive.slides.html
4.0K -rw-rw-r--  1 yanfei yanfei  241 Oct 17 11:16 Makefile
 16K -rw-rw-r--  1 yanfei yanfei  16K Oct 17 11:16 BDE-L8-streaming.ipynb
596K -rw-rw-r--  1 yanfei yanfei 593K Oct 17 11:16 BDE-L8-streaming.slides.html
 16K -rw-rw-r--  1 yanfei yanfei  14K Oct 17 11:16 BDE-L9-hive.ipynb
 72K -rw-rw-r--  1 yanfei yanfei  72K Oct 17 11:16 BDE-L7-hadoop.ipynb
652K -rw-rw-r--  1 yanfei yanfei 649K Oct 17 11:16 BDE-L7-hadoop.slides.html
608K -rw-rw-r--  1 yanfei yanfei 607K Oct 17 11:16 BDE-L6-parallel.slides.html
580K -rw-rw-r--  1 yanfei yanfei 580K Oct 17 11:16 BDE-L5-selenium.slides.html
 24K -rw-rw-r--  1 yanfei yanfei  21K Oct 17 11:16 BDE-L6-parallel.ipynb
276K -rw-rw-r--  1 yanfei yanfei 276K Oct 17 11:16 BDE-L5-selenium.html
8.0K -rw-rw-r--  1 yanfei yanfei 6.7K Oct 17 11:16 BDE-L5-selenium.ipynb
828K -rw-rw-r--  1 yanfei yanfei 827K Oct 17 11:16 BDE-L5-scraping.slides.html
240K -rwxrwxr-x  1 yanfei yanfei 238K Oct 17 11:16 BDE-L5-scraping.ipynb
864K -rw-rw-r--  1 yanfei yanfei 862K Oct 17 11:16 BDE-L4-pymodeling.slides.html
656K -rw-rw-r--  1 yanfei yanfei 655K Oct 17 11:16 BDE-L3-pyintro.slides.html
264K -rw-rw-r--  1 yanfei yanfei 264K Oct 17 11:16 BDE-L4-pymodeling.ipynb
584K -rw-rw-r--  1 yanfei yanfei 581K Oct 17 11:16 BDE-L3-git.slides.html
 40K -rwxrwxr-x  1 yanfei yanfei  37K Oct 17 11:16 BDE-L3-pyintro.ipynb
584K -rw-rw-r--  1 yanfei yanfei 583K Oct 17 11:16 BDE-L14-SparkR-example.slides.html
 12K -rw-rw-r--  1 yanfei yanfei 8.8K Oct 17 11:16 BDE-L3-git.ipynb
740K -rw-rw-r--  1 yanfei yanfei 738K Oct 17 11:16 BDE-L13-sparktm.slides.html
660K -rw-rw-r--  1 yanfei yanfei 659K Oct 17 11:16 BDE-L12-sparkml.slides.html
 96K -rw-rw-r--  1 yanfei yanfei  96K Oct 17 11:16 BDE-L13-sparktm.ipynb
 48K -rw-rw-r--  1 yanfei yanfei  46K Oct 17 11:16 BDE-L12-sparkml.ipynb
 76K -rw-rw-r--  1 yanfei yanfei  76K Oct 17 11:16 BDE-L11-sparkdp.ipynb
684K -rw-rw-r--  1 yanfei yanfei 681K Oct 17 11:16 BDE-L11-sparkdp.slides.html
584K -rw-rw-r--  1 yanfei yanfei 584K Oct 17 11:16 BDE-L10-sparkintro.slides.html
 16K -rw-rw-r--  1 yanfei yanfei  13K Oct 17 11:16 BDE-L10-sparkintro.ipynb
572K -rw-rw-r--  1 yanfei yanfei 572K Oct 17 11:16 BDE-L1-bigdata.slides.html
4.0K -rw-rw-r--  1 yanfei yanfei 3.5K Oct 17 11:16 BDE-L0-intro.ipynb
572K -rw-rw-r--  1 yanfei yanfei 570K Oct 17 11:16 BDE-L0-intro.slides.html
4.0K -rw-rw-r--  1 yanfei yanfei 3.7K Oct 17 11:16 BDE-L1-bigdata.ipynb
4.0K drwxrwxr-x  2 yanfei yanfei 4.0K Oct 17 11:16 .ipynb_checkpoints
 16K -rw-rw-r--  1 yanfei yanfei  13K Oct 17 11:16 .DS_Store
4.0K -rw-rw-r--  1 yanfei yanfei   50 Oct 17 11:16 .gitignore
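
The listing above combines several ls options in one go. As a rough sketch of what each letter in -lstah contributes, the commands below add them one at a time in a temporary directory; the file names visible.txt and .hidden are assumptions made for this example only.

```shell
# Throwaway directory with one normal and one hidden file (hypothetical names).
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch visible.txt .hidden

ls            # hidden files (names starting with a dot) are not shown
ls -a         # -a: include hidden entries, plus . and ..
ls -l         # -l: long format with permissions, owner, size and modification time
ls -lt        # -t: sort by modification time, newest first
ls -lsah      # -s: allocated size of each file; -h: human-readable sizes such as 4.0K

plain=$(ls)
all=$(ls -a)
```

Single-letter options can be written together after one dash, which is why ls -l -s -t -a -h and ls -lstah are equivalent.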

Lab¶

  • Use the commands cd and ls to explore what directories are on your system and what's in them. Make sure you use a variety of relative and absolute paths.
  • Now go to your home directory using 4 different methods.
  • Make sure you are using Tab Completion when typing out your paths too. Why do anything you can get the computer to do for you?


File manipulation¶

  • Making a directory: mkdir
  • Removing a directory: rmdir
  • Creating a blank file: touch
  • Copying a file or directory: cp
  • Moving a file or directory: mv
  • Renaming files or directories: mv
  • Removing files: rm (use rmdir for empty directories)
  • How do we remove non-empty directories? rm -r
  • Note: there is no undo.
  • etc.
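
The commands in this list can be sketched end to end as follows. This is an illustrative example only: it works inside a temporary directory, and every file and directory name in it (reports, notes.txt, archive and so on) is hypothetical.

```shell
# Work inside a throwaway directory so nothing important can be deleted.
playground=$(mktemp -d)
cd "$playground"

mkdir reports                      # make a directory
touch notes.txt                    # create a blank file
cp notes.txt reports/copy.txt      # copy a file
mv notes.txt draft.txt             # rename: move to a new name in the same place
mv draft.txt reports/              # move a file into a directory

mkdir empty
rmdir empty                        # rmdir only removes empty directories

mkdir -p archive/2020              # -p also creates missing parent directories
touch archive/2020/old.txt
cp -r archive backup               # -r copies a directory and everything in it
rm -r archive                      # -r removes a non-empty directory; there is no undo

ls -R                              # list what is left, recursively
```

Because rm has no undo, it is worth running ls on the same path first to check what a command is about to remove.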

Lab¶

  • Start by creating a directory in your home directory in which to experiment.
  • In that directory, create a series of files and directories (and files and directories in those directories).
  • Now rename a few of those files and directories.
  • Delete one of the directories that has other files and directories in it.
  • Move back to your home directory and from there copy a file from one of your subdirectories into the initial directory you created.
  • Now move that file back into another directory.
  • Rename a few files
  • Next, move a file and rename it in the process.
  • Finally, have a look at the existing directories in your home directory.

Upload local files¶

  • From the command line: rsync
  • Or use FileZilla

A command line editor¶

  • Many text editors are available: nano, vim, emacs.
  • Vim has two main modes: Insert (or Input) mode and Edit mode (called Normal mode in Vim's own documentation).
  • In input mode you may input or enter content into the file.
  • In edit mode you can move around the file, perform actions such as deleting, copying, search and replace, saving etc.
  • A common mistake is to start entering commands without first going back into edit mode or to start typing input without first going into insert mode.

First file¶

  • Start with vim firstfile.
  • You always start off in edit mode, so the first thing we are going to do is switch to insert mode by pressing i. You can tell you are in insert mode because the bottom-left corner of the screen normally shows -- INSERT --.

Saving and exiting¶

  • :q! - discard all changes since the last save and exit
  • :w - save the file but don't exit
  • :wq - save and exit

Other ways to view files¶

  • Try cat firstfile.
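  • For larger files, less is better suited: it shows the file one screen at a time (press q to quit).
  • head and tail print the first and last lines of a file (10 lines by default).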
In [4]:
head BDE-L0-intro.slides.html
<!DOCTYPE html>
<html>
<head>

<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="chrome=1" />

<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />

Navigating a file in Vim¶

Now let's go back into the file we just created and enter some more content. In insert mode you may use the arrow keys to move the cursor around. Enter two more paragraphs of content then hit Esc to go back to edit mode.

  • Arrow keys - move the cursor around
  • j, k, h, l - move the cursor down, up, left and right (similar to the arrow keys)
  • ^ (caret) - move cursor to the first non-blank character of the current line (0 moves to the very beginning of the line)
  • $ - move cursor to end of the current line
  • nG - move to the nth line (eg 5G moves to 5th line)
  • G - move to the last line
  • w - move to the beginning of the next word
  • nw - move forward n words (eg 2w moves two words forwards)
  • b - move to the beginning of the previous word
  • nb - move back n words
  • { - move backward one paragraph
  • } - move forward one paragraph

Deleting content¶

  • x - delete a single character
  • nx - delete n characters (eg 5x deletes five characters)
  • dd - delete the current line
  • d{movement} - d followed by a movement command deletes up to where that movement would have taken you (eg d5w deletes 5 words)

Undoing¶

  • u - Undo the last action (you may keep pressing u to keep undoing)
  • U (Note: capital) - Undo all changes to the current line

Lab¶

  • Start by creating a file and putting some content into it.
  • Save the file and view it in both cat and less
  • Go back into the file in Vim and enter some more content.
  • Move around the content using at least 6 different movement commands.
  • Play about with several of the delete commands, especially the ones that incorporate a movement command. Remember you may undo your changes so you don't have to keep putting new content in.


Wildcards¶

What are they?¶

  • * - represents zero or more characters
  • ? - represents a single character
  • [] - represents any one character from the set or range inside the brackets (eg [0-9] matches a single digit)
In [24]:
# Examples

ls B*
BDE-L0-intro.ipynb		BDE-L3-pyintro.ipynb
BDE-L0-intro.slides.html	BDE-L3-pyintro.slides.html
BDE-L10-sparkintro.ipynb	BDE-L4-pymodeling.ipynb
BDE-L10-sparkintro.slides.html	BDE-L4-pymodeling.slides.html
BDE-L11-sparkdp.ipynb		BDE-L5-scraping.ipynb
BDE-L11-sparkdp.slides.html	BDE-L5-scraping.slides.html
BDE-L12-sparkml.ipynb		BDE-L6-parallel.ipynb
BDE-L12-sparkml.slides.html	BDE-L6-parallel.slides.html
BDE-L13-sparktm.ipynb		BDE-L7-hadoop.ipynb
BDE-L13-sparktm.slides.html	BDE-L7-hadoop.slides.html
BDE-L14-SparkR-example.ipynb	BDE-L8-streaming.ipynb
BDE-L1-bigdata.ipynb		BDE-L8-streaming.slides.html
BDE-L1-bigdata.slides.html	BDE-L9-hive.ipynb
BDE-L2-linux.ipynb		BDE-L9-hive.slides.html
BDE-L2-linux.slides.html
In [23]:
ls *.????b
BDE-L0-intro.ipynb	  BDE-L14-SparkR-example.ipynb	BDE-L5-scraping.ipynb
BDE-L10-sparkintro.ipynb  BDE-L1-bigdata.ipynb		BDE-L6-parallel.ipynb
BDE-L11-sparkdp.ipynb	  BDE-L2-linux.ipynb		BDE-L7-hadoop.ipynb
BDE-L12-sparkml.ipynb	  BDE-L3-pyintro.ipynb		BDE-L8-streaming.ipynb
BDE-L13-sparktm.ipynb	  BDE-L4-pymodeling.ipynb	BDE-L9-hive.ipynb
In [22]:
ls *[0-1]*
BDE-L0-intro.ipynb		BDE-L12-sparkml.slides.html
BDE-L0-intro.slides.html	BDE-L13-sparktm.ipynb
BDE-L10-sparkintro.ipynb	BDE-L13-sparktm.slides.html
BDE-L10-sparkintro.slides.html	BDE-L14-SparkR-example.ipynb
BDE-L11-sparkdp.ipynb		BDE-L1-bigdata.ipynb
BDE-L11-sparkdp.slides.html	BDE-L1-bigdata.slides.html
BDE-L12-sparkml.ipynb
In [21]:
ls */*.png
figs/bigdata.png		   figs/mapreduce.png
figs/hdfs.png			   figs/ml-Pipeline.png
figs/king-analogy-viz.png	   figs/queen-woman-girl-embeddings.png
figs/king-man-woman-embedding.png  figs/spark-cluster-overview.png
figs/king.png			   figs/spark-runs-everywhere.png
figs/lmcd2d.png			   figs/traditional.png
figs/lmcd3d.png			   figs/wordcount.png
In [20]:
ls -lhsa /home/*/.bash_history
4.0K -rw------- 1 yanfei yanfei 2.3K Nov 22 23:25 /home/yanfei/.bash_history
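
The examples above depend on the files in the lecture directory. The same patterns can be tried on a small, known set of files, as in this sketch; the file names are made up for illustration. It also uses echo to show that the shell expands a pattern into matching names before the command runs.

```shell
# Self-contained sandbox with hypothetical file names.
demo=$(mktemp -d)
cd "$demo"
touch a1.txt a2.txt b1.txt b10.txt notes.md README

ls *.txt            # every name ending in .txt
ls a?.txt           # a, exactly one character, then .txt
ls [ab]1.txt        # a or b, then 1.txt
ls *[0-9][0-9]*     # names containing two consecutive digits

# echo prints whatever the shell expanded the pattern into.
one_char=$(echo a?.txt)
set_match=$(echo [ab]1.txt)
two_digits=$(echo *[0-9][0-9]*)
echo "$one_char"
```

Since the expansion is done by the shell, wildcards work with any command that accepts file names, not only ls.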

Lab¶

  • A good directory to play with is /etc which is a directory containing config files for the system. As a normal user you may view the files but you can't make any changes so we can't do any harm. Do a listing of that directory to see what's there. Then pick various subsets of files and see if you can create a pattern to select only those files.
  • Do a listing of /etc with only files that contain an extension.
  • What about only a 3 letter extension?
  • How about files whose name contains an uppercase letter? (hint: [[:upper:]] may be useful here)
  • Can you list files whose name is 4 characters long?


Piping and redirection¶

  • Piping and redirection help create powerful workflows that will automate your work, saving you time and effort.
  • Commands such as head, tail and wc act as filters that manipulate data for us. How can we join them together to do more powerful data manipulation?

Redirecting to a File¶

In [12]:
ls > output
cat output
BDE-L0-intro.ipynb
BDE-L0-intro.slides.html
BDE-L10-sparkintro.ipynb
BDE-L10-sparkintro.slides.html
BDE-L11-sparkdp.ipynb
BDE-L11-sparkdp.slides.html
BDE-L12-sparkml.ipynb
BDE-L12-sparkml.slides.html
BDE-L13-sparktm.ipynb
BDE-L13-sparktm.slides.html
BDE-L14-SparkR-example.ipynb
BDE-L1-bigdata.ipynb
BDE-L1-bigdata.slides.html
BDE-L2-linux.ipynb
BDE-L2-linux.slides.html
BDE-L3-pyintro.ipynb
BDE-L3-pyintro.slides.html
BDE-L4-pymodeling.ipynb
BDE-L4-pymodeling.slides.html
BDE-L5-scraping.ipynb
BDE-L5-scraping.slides.html
BDE-L6-parallel.ipynb
BDE-L6-parallel.slides.html
BDE-L7-hadoop.ipynb
BDE-L7-hadoop.slides.html
BDE-L8-streaming.ipynb
BDE-L8-streaming.slides.html
BDE-L9-hive.ipynb
BDE-L9-hive.slides.html
code
data
figs
myspark
output
setup_rise.py
spark-warehouse

Saving to an existing file¶

In [11]:
wc -l output
wc -l output >> output
cat output
36 output
BDE-L0-intro.ipynb
BDE-L0-intro.slides.html
BDE-L10-sparkintro.ipynb
BDE-L10-sparkintro.slides.html
BDE-L11-sparkdp.ipynb
BDE-L11-sparkdp.slides.html
BDE-L12-sparkml.ipynb
BDE-L12-sparkml.slides.html
BDE-L13-sparktm.ipynb
BDE-L13-sparktm.slides.html
BDE-L14-SparkR-example.ipynb
BDE-L1-bigdata.ipynb
BDE-L1-bigdata.slides.html
BDE-L2-linux.ipynb
BDE-L2-linux.slides.html
BDE-L3-pyintro.ipynb
BDE-L3-pyintro.slides.html
BDE-L4-pymodeling.ipynb
BDE-L4-pymodeling.slides.html
BDE-L5-scraping.ipynb
BDE-L5-scraping.slides.html
BDE-L6-parallel.ipynb
BDE-L6-parallel.slides.html
BDE-L7-hadoop.ipynb
BDE-L7-hadoop.slides.html
BDE-L8-streaming.ipynb
BDE-L8-streaming.slides.html
BDE-L9-hive.ipynb
BDE-L9-hive.slides.html
code
data
figs
myspark
output
setup_rise.py
spark-warehouse
36 output
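
The difference between > and >> is easiest to see on a tiny file. The following is a minimal sketch using a temporary scratch file, which is an assumption of this example rather than a file from the lecture.

```shell
log=$(mktemp)            # hypothetical scratch file

echo "first"  > "$log"   # > creates the file, or overwrites it if it already exists
echo "second" > "$log"   # the earlier content is gone
echo "third" >> "$log"   # >> appends to the end instead

cat "$log"
contents=$(cat "$log")
count=$(wc -l < "$log")
```

Because the shell creates the target file before the command starts, a listing redirected into a new file in the same directory includes that file's own name.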

Piping¶

  • Now we'll take a look at a mechanism for sending data from one program to another.
  • It's called piping.
  • The operator we use is ( | ).
  • command1 | command2: the pipe sends the output of command1 to command2 as its input.
In [6]:
ls | head -3
BDE-L0-intro.ipynb
BDE-L0-intro.slides.html
BDE-L10-sparkintro.ipynb
In [1]:
cat BDE-L0-intro.ipynb | wc -l
141
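
Pipes become more useful when several commands are chained. As an illustrative sketch, the pipeline below counts how often each word occurs in a small input and shows the most frequent ones; the input words and the temporary file are made up for this example.

```shell
# Sample input: one word per line (hypothetical data).
words=$(mktemp)
printf 'spark\nhive\nspark\nhadoop\nspark\nhive\n' > "$words"

sort "$words" |   # group identical lines together
  uniq -c |       # collapse each group, prefixed with its count
  sort -rn |      # order by count, largest first
  head -n 2       # keep the two most frequent words

# The same pipeline, reduced to the single most frequent word.
top_word=$(sort "$words" | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "$top_word"
```

Each command only reads its input and writes its output, so stages can be added or removed without changing the others.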


Knowing your environment¶

  • top: shows you the processes running on your machine, ordered by resource consumption.
  • free: displays a simple readout of how much memory is used and available on your system.
  • du: short for disk usage. It is extremely useful for estimating the size of directories.
In [31]:
du -h
4.0K	./spark-warehouse
140K	./.git/objects/8e
28K	./.git/objects/5e
12K	./.git/objects/bc
8.0K	./.git/objects/0f
128K	./.git/objects/f5
8.0K	./.git/objects/0b
692K	./.git/objects/f3
8.0K	./.git/objects/20
184K	./.git/objects/98
8.0K	./.git/objects/4b
72K	./.git/objects/ce
568K	./.git/objects/c7
12K	./.git/objects/f6
20K	./.git/objects/8a
368K	./.git/objects/cd
12K	./.git/objects/35
12K	./.git/objects/1a
8.0K	./.git/objects/b4
8.0K	./.git/objects/d7
8.0K	./.git/objects/c2 2
8.0K	./.git/objects/84 2
72K	./.git/objects/9c
176K	./.git/objects/33
12K	./.git/objects/b2
8.0K	./.git/objects/ad
176K	./.git/objects/53
552K	./.git/objects/32
8.0K	./.git/objects/60
12K	./.git/objects/65
16K	./.git/objects/ac
8.0K	./.git/objects/f7
204K	./.git/objects/db
8.0K	./.git/objects/90 2
12K	./.git/objects/1c
16K	./.git/objects/3a
160K	./.git/objects/a8
12K	./.git/objects/73 2
12K	./.git/objects/09 2
20K	./.git/objects/95
48K	./.git/objects/12
8.0K	./.git/objects/28
8.0K	./.git/objects/d3
76K	./.git/objects/f2
164K	./.git/objects/03
12K	./.git/objects/66
624K	./.git/objects/3f
8.0K	./.git/objects/7a
8.0K	./.git/objects/ea
16K	./.git/objects/d0
80K	./.git/objects/23
12K	./.git/objects/70
84K	./.git/objects/1f
92K	./.git/objects/fd
8.0K	./.git/objects/7e 2
12K	./.git/objects/3d
8.0K	./.git/objects/74
12K	./.git/objects/f1
4.0K	./.git/objects/info
8.0K	./.git/objects/50
64K	./.git/objects/90
16K	./.git/objects/00 2
12K	./.git/objects/f8
8.0K	./.git/objects/86
8.0K	./.git/objects/22
8.0K	./.git/objects/6d
20K	./.git/objects/cb
12K	./.git/objects/48
12K	./.git/objects/a0
12K	./.git/objects/8f
12K	./.git/objects/c9
20K	./.git/objects/07 2
96K	./.git/objects/44
76K	./.git/objects/5d
12K	./.git/objects/09
60K	./.git/objects/7f
12K	./.git/objects/a1 2
80K	./.git/objects/97
4.0K	./.git/objects/pack
16K	./.git/objects/0a
12K	./.git/objects/9e
12K	./.git/objects/dc 2
12K	./.git/objects/17
8.0K	./.git/objects/a5
32K	./.git/objects/31
140K	./.git/objects/91
12K	./.git/objects/43
144K	./.git/objects/2d
276K	./.git/objects/04
44K	./.git/objects/e9
8.0K	./.git/objects/d6
136K	./.git/objects/e8
12K	./.git/objects/8b
68K	./.git/objects/64
8.0K	./.git/objects/79 2
8.0K	./.git/objects/5c
20K	./.git/objects/2c
8.0K	./.git/objects/fa
12K	./.git/objects/4f
12K	./.git/objects/78 2
12K	./.git/objects/c8
12K	./.git/objects/c4
8.0K	./.git/objects/41
8.0K	./.git/objects/7e
8.0K	./.git/objects/fa 2
84K	./.git/objects/e3
12K	./.git/objects/15
8.0K	./.git/objects/05
12K	./.git/objects/a4
132K	./.git/objects/f9
16K	./.git/objects/bd
16K	./.git/objects/83
24K	./.git/objects/4e
8.0K	./.git/objects/d8
12K	./.git/objects/27
12K	./.git/objects/24
16K	./.git/objects/ba
8.0K	./.git/objects/8b 2
8.0K	./.git/objects/ec
12K	./.git/objects/e6
16K	./.git/objects/b8
12K	./.git/objects/63
20K	./.git/objects/07
128K	./.git/objects/36
552K	./.git/objects/94
12K	./.git/objects/58
8.0K	./.git/objects/6a
12K	./.git/objects/8f 2
124K	./.git/objects/68
8.0K	./.git/objects/c6
12K	./.git/objects/da
16K	./.git/objects/cc
116K	./.git/objects/55
8.0K	./.git/objects/34
24K	./.git/objects/08
12K	./.git/objects/7b
108K	./.git/objects/71
12K	./.git/objects/a0 2
80K	./.git/objects/6b
8.0K	./.git/objects/3b
56K	./.git/objects/7c
8.0K	./.git/objects/ca
192K	./.git/objects/fb
12K	./.git/objects/79
316K	./.git/objects/0d
72K	./.git/objects/b5
256K	./.git/objects/5a
1.1M	./.git/objects/16
556K	./.git/objects/ab
24K	./.git/objects/45
172K	./.git/objects/af
12K	./.git/objects/1b
16K	./.git/objects/e0
12K	./.git/objects/be
16K	./.git/objects/46
8.0K	./.git/objects/88
12K	./.git/objects/73
32K	./.git/objects/1e
24K	./.git/objects/c1
20K	./.git/objects/51
8.0K	./.git/objects/f0
8.0K	./.git/objects/52
12K	./.git/objects/a2
8.0K	./.git/objects/a3
8.0K	./.git/objects/49
8.0K	./.git/objects/3c
16K	./.git/objects/77
240K	./.git/objects/cf
12K	./.git/objects/de
68K	./.git/objects/38
284K	./.git/objects/e7
8.0K	./.git/objects/10
248K	./.git/objects/3e
12K	./.git/objects/b9
76K	./.git/objects/e5
8.0K	./.git/objects/02
72K	./.git/objects/2e
140K	./.git/objects/21
80K	./.git/objects/19
124K	./.git/objects/d9
20K	./.git/objects/40
8.0K	./.git/objects/61
12K	./.git/objects/ff
24K	./.git/objects/6e
92K	./.git/objects/e1
4.8M	./.git/objects/8d
8.0K	./.git/objects/19 2
12K	./.git/objects/96
8.0K	./.git/objects/d1
20K	./.git/objects/69
80K	./.git/objects/a1
156K	./.git/objects/82
8.0K	./.git/objects/29
16K	./.git/objects/80
64K	./.git/objects/75
12K	./.git/objects/63 2
12K	./.git/objects/80 2
8.0K	./.git/objects/26
8.0K	./.git/objects/a9 2
12K	./.git/objects/e4
140K	./.git/objects/4d
132K	./.git/objects/2f
12K	./.git/objects/dc
8.0K	./.git/objects/28 2
168K	./.git/objects/9b
8.0K	./.git/objects/06
64K	./.git/objects/76
44K	./.git/objects/59
88K	./.git/objects/25
28K	./.git/objects/fe
132K	./.git/objects/b3
8.0K	./.git/objects/1d
8.0K	./.git/objects/62
128K	./.git/objects/b1
8.0K	./.git/objects/d5
12K	./.git/objects/6c
16K	./.git/objects/d4
76K	./.git/objects/df
8.0K	./.git/objects/06 2
476K	./.git/objects/bf
12K	./.git/objects/13
8.0K	./.git/objects/2b
8.0K	./.git/objects/a9
12K	./.git/objects/8e 2
8.0K	./.git/objects/0e
220K	./.git/objects/30
8.0K	./.git/objects/56
16K	./.git/objects/cd 2
8.0K	./.git/objects/0c
16K	./.git/objects/ee
8.0K	./.git/objects/84
16K	./.git/objects/9d
8.0K	./.git/objects/85
16K	./.git/objects/e0 2
8.0K	./.git/objects/01
16K	./.git/objects/ae
8.0K	./.git/objects/c2
184K	./.git/objects/4c
8.0K	./.git/objects/a6 2
12K	./.git/objects/a6
548K	./.git/objects/b0
132K	./.git/objects/b6
16K	./.git/objects/2a
124K	./.git/objects/c0
12K	./.git/objects/78
68K	./.git/objects/8c
16K	./.git/objects/14
12K	./.git/objects/8a 2
20K	./.git/objects/00
8.0K	./.git/objects/a7
12K	./.git/objects/e2
12K	./.git/objects/4c 2
76K	./.git/objects/87
12K	./.git/objects/6b 2
24K	./.git/objects/92
21M	./.git/objects
104K	./.git/hooks
8.0K	./.git/refs/remotes/origin
12K	./.git/refs/remotes
4.0K	./.git/refs/tags
12K	./.git/refs/heads
32K	./.git/refs
12K	./.git/info
8.0K	./.git/logs/refs/remotes/origin
12K	./.git/logs/refs/remotes
12K	./.git/logs/refs/heads
28K	./.git/logs/refs
36K	./.git/logs
22M	./.git
712K	./.ipynb_checkpoints
1.8M	./figs
16K	./myspark
16M	./data/ml-100k
404K	./data/ebola/liberia_data
92K	./data/ebola/guinea_data
416K	./data/ebola/sl_data
920K	./data/ebola
8.0K	./data/.ipynb_checkpoints
420K	./data/microbiome
23M	./data
12K	./code/L5
12K	./code/L3
8.0K	./code/L9/wordcount
12K	./code/L9/movie
4.0K	./code/L9/stocks/.ipynb_checkpoints
12K	./code/L9/stocks
36K	./code/L9
4.0K	./code/L8/patent-avg/.ipynb_checkpoints
20K	./code/L8/patent-avg
4.0K	./code/L8/patent-max/.ipynb_checkpoints
16K	./code/L8/patent-max
16K	./code/L8/.airdelay-R
4.0K	./code/L8/.ipynb_checkpoints
4.0K	./code/L8/wordcount-py/.ipynb_checkpoints
36K	./code/L8/wordcount-py
12K	./code/L8/wordcount-sh
108K	./code/L8
4.0K	./code/.ipynb_checkpoints
12K	./code/L10
196K	./code
57M	.
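
A plain du -h prints one line for every subdirectory, which gets long quickly. A shorter summary can be sketched as below, assuming a du that supports the -s and -d options (GNU and BSD versions do); the sandbox and its file names are made up for illustration.

```shell
# Small sandbox with hypothetical names so the example runs anywhere.
root=$(mktemp -d)
mkdir -p "$root/data" "$root/code"
seq 1 20000 > "$root/data/numbers.txt"
echo 'echo hello' > "$root/code/hello.sh"

du -sh "$root"                         # -s: a single summary line for the whole tree
du -k -d 1 "$root" | sort -rn          # -d 1: one level deep; sizes in KiB, largest first

summary=$(du -sh "$root")
levels=$(du -k -d 1 "$root" | wc -l)   # the root plus its two subdirectories
```

Using -k keeps the sizes as plain numbers so that sort -rn can order them; with GNU sort, du -h can be combined with sort -h instead.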
