How to Install Hadoop on Windows

Hello everyone, today let's talk about how to install Hadoop on Windows. This is going to be a very simplified approach compared to the previous option I had covered using the Hortonworks Data Platform, or HDP. Before moving on, please also make a note to check the links below for the full list of Hadoop books along with reviews from other people. So let's get started.

Coming to the requirements for setting up Hadoop: you need to have the latest JDK 1.8 (I'm going to use 1.8 Update 171 for this demo), and then we need to download the latest Hadoop package. I'm going to share the links from where you can download them. So let's dive in.

First, let's go to the Oracle website and download the JDK; I have the URL here as you can see. My system is running the 64-bit version of Windows, so I am going to download JDK 8 Update 171 for Windows; depending on your system, you can pick the 32-bit or the 64-bit version. Go ahead, accept the license agreement, and download it onto your system. If you have not done so already, make sure to download it.
The second thing is to download the Hadoop software. Go to hadoop.apache.org; this is where you can find the source and binary packages for Hadoop. Here on the left menu you can see there is a Releases link; go ahead and click on that, and as you can see, the various versions are available. You can go ahead and pick up the latest one as well, not a problem, but just make sure to download the binary package instead of the source package. The binary is already pre-built, so you don't have to do any work there; if you download the source, you have to compile all of it, which is going to take a lot of time. For this demonstration I have already downloaded the 2.7.6 binary package, so go ahead and download that as well. As you can see, we have already downloaded it, so let's go ahead and start the installations one by one.
The first one is the JDK package, so let's install it: double-click on it, and as you can see, the installer window has popped up. Click Next. The first thing is that I'm not going to install it into the default location, which is C:\Program Files\Java\jdk; I'm going to put it into C:\Java\jdk instead, so let's go ahead and change that location. You can change the JRE location during the installation as well, so make sure to change that too; I set the JRE location under C:\Java the same way. Click Next, and we are done with the installation.

The next step is to add the Java home to the environment variables. Go to the Windows settings, open the environment variables, and add a new variable called JAVA_HOME whose value is the path where Java is installed, which is essentially under C:\Java. Also add the Java bin directory to the Path, like this. Now open the command prompt and type java -version; as you can see, we have installed 1.8.0_171.
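If you prefer doing this from the command line instead of the settings dialog, here is a rough sketch of the same setup. It assumes the JDK landed in C:\Java\jdk1.8.0_171, so adjust the path to match your install; note that setx writes user-level variables that only take effect in newly opened command prompts, and editing the Path through the settings dialog, as in the video, is the cleaner approach.

    rem Point JAVA_HOME at the JDK install directory
    setx JAVA_HOME "C:\Java\jdk1.8.0_171"
    rem Append the JDK bin directory to the user Path
    setx PATH "%PATH%;C:\Java\jdk1.8.0_171\bin"
    rem Open a NEW command prompt, then verify
    java -version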
The next step is to install Hadoop. Here I've already extracted the downloaded 2.7.6 package, and these are the files that you get after the extraction. One thing to remember: when you are extracting this package, which is a .tar.gz file, you will see some errors being thrown for a few of the files. That is okay; you can safely ignore them.

The next step is to copy this extracted Hadoop folder: go into your C:\ directory and copy the folder there directly. Once that is done, copy its path.
of it open your window settings and again go to deliver on much of a
difference for your accountant then go back and add a new variable
called Hadoop and the variable value is the location
of Hadoop so also make sure to add this to the
pact so edit the path of the user so
similarly you need to add one more location which is the s-pen of Hadoop so
basically this location contains all the important executables for your Linux and
command files for your windows we are going to talk about that in a while
so just go ahead and add this now the variable through the path has been so
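Again, as a command-line sketch of the same thing, with the same setx caveats as above; it assumes the extracted folder ended up at C:\hadoop-2.7.6, so substitute your actual folder name:

    rem Point HADOOP_HOME at the extracted Hadoop folder
    setx HADOOP_HOME "C:\hadoop-2.7.6"
    rem Add both bin and sbin to the user Path
    setx PATH "%PATH%;C:\hadoop-2.7.6\bin;C:\hadoop-2.7.6\sbin"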
Close your Windows settings, go back to the Hadoop folder you copied into C:\, and go into etc\hadoop. Here comes the important step: we are going to modify four different XML files where we need to add specific properties, and we are going ahead with the minimal set of properties that is required.

So let's open core-site.xml. Go to the configuration section and add a property; add the name, and similarly the value. The property name is going to be fs.defaultFS, and the value is going to be hdfs://localhost:9000. That is the only property required for now; it is basically used by your name node and the data node.
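For reference, this is roughly what the configuration section of core-site.xml should contain after the change, based on the values used in this demo:

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>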
The next configuration file is mapred-site.xml. You will not find it here; the only file you will find is a template, mapred-site.xml.template. Go ahead and create a copy of that, and change the name to mapred-site.xml. Let me just copy the property block from core-site.xml to avoid typing it again, and then we'll change it. The property here is mapreduce.framework.name, and the value is going to be yarn, the resource negotiator; YARN is the new framework which will execute MapReduce along with various other frameworks.
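Again for reference, the configuration section of mapred-site.xml ends up looking like this:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>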
Now minimize your files and go back to your Hadoop location in C:\. Under the Hadoop folder, go ahead and create a data directory; go into the data directory and create two locations, one for the data node and the other for the name node. This namenode location in the data directory will be used by the name node to store the edit logs and the FS image; similarly, for any files that we push into HDFS, the blocks will be stored in this particular datanode location. So copy this location and go back to etc\hadoop.
Let's open hdfs-site.xml, copy the property block over from mapred-site.xml as a starting point, and then modify it. We are going to configure three properties in hdfs-site.xml. The first one is the replication factor: how many times you want to replicate your data. Because we are running a pseudo-distributed system, which means everything is on one system, we just want one copy of it. Next are the paths for the datanode and the namenode directories. The first is dfs.namenode.name.dir; just make sure you get the property names right, otherwise it is going to be difficult to figure out what really went wrong when you start your services. Finally, change the property name of the last one to dfs.datanode.data.dir and give it the path to the datanode location. So we are done with the third file modification as well.
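Here is a sketch of the resulting hdfs-site.xml, assuming the data directories were created under C:\hadoop-2.7.6\data as described above. (Some Hadoop versions are picky about Windows paths in these values; if the namenode format step later complains about a syntax error in the URI, as a few readers report in the comments below, try the forward-slash form such as /C:/hadoop-2.7.6/data/namenode.)

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>C:\hadoop-2.7.6\data\namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>C:\hadoop-2.7.6\data\datanode</value>
      </property>
    </configuration>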
Now let's go back to the remaining file, which is yarn-site.xml. Here we are going ahead with two different properties. The first is yarn.nodemanager.aux-services (note the hyphen here), and we are going to specify mapreduce_shuffle as the value. Similarly, let's add the other property here, yarn.nodemanager.aux-services.mapreduce.shuffle.class, which means we need to specify the actual MapReduce class which can be used for shuffling data. This is a class which is already there in the Apache codebase: org.apache.hadoop.mapred.ShuffleHandler.
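And the corresponding sketch of yarn-site.xml with both properties in place:

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
    </configuration>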
Now just go back and check whether you have all the right properties with the right values: core-site.xml with just one property pointing to your localhost HDFS location; similarly mapred-site.xml with the framework name, which is yarn; then hdfs-site.xml with the replication factor and the data directories. Here I noticed a mistake I had made, name where it should be data: dfs.namenode.name.dir must point to the namenode data location, and dfs.datanode.data.dir must point to the datanode location. And finally yarn-site.xml. So far so good.
And finally, let's open hadoop-env.cmd; here we are going to specify our Java path. Just go and check the line that says set JAVA_HOME and point it to the right location of Java. One more tip here: if you have not installed the JDK in the C:\Java location and instead installed it under C:\Program Files, you have to put the entire path in double quotes, because otherwise it is going to break on the space. Make sure to save the file.
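The relevant line in etc\hadoop\hadoop-env.cmd looks roughly like this for the locations used in this demo:

    @rem Point this at your JDK install
    set JAVA_HOME=C:\Java\jdk1.8.0_171
    @rem If Java lives under Program Files, quote the path because of the space,
    @rem e.g. set JAVA_HOME="C:\Program Files\Java\jdk1.8.0_171",
    @rem or use the 8.3 short name: set JAVA_HOME=C:\PROGRA~1\Java\jdk1.8.0_171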
So we are done with all the changes, the installation of Hadoop, and the installation of Java as well. Now the final thing is to go back to your Hadoop bin, the one copied into C:\. As you can see here, the bin location does not have many executable files. What I've done is created some files and also figured out the other missing files which are required in this bin location; these are the Windows-specific binaries (winutils.exe and related files) that the Apache binary distribution does not ship. So just copy all of those, go back to the bin under your C:\ Hadoop location, and replace everything. I know you guys don't have these, so if you need these bin files, please feel free to ping me on my channel and I'll be glad to share them.
So we are done with all the basic setup; now let's go back to our command line. The first step is to execute the format option of Hadoop: hdfs namenode -format. I am able to access this command from anywhere in the system because we added both the sbin and the bin of Hadoop to the Path. As you can see here, the datanode and the namenode data locations are cleaned up, or formatted.

Now let's go back and execute the start-all command. You will be able to find this in your C:\ Hadoop location under sbin; here you will have start-all.cmd. This script is going to execute all the required processes, namely the name node (which is the master), the data node, the resource manager, and the node manager. Or you can go and individually execute start-dfs to start the distributed file system, start-yarn to start YARN, start balancers, and so on. So you can either copy this location, if you have not put it into the Path, and execute the start-all command from there, or if you have already added it as I explained in this video, you can directly execute the command from anywhere.
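In short, the two commands run here, as a sketch (run them from a fresh command prompt so the new Path takes effect):

    rem One-time format of the namenode metadata (clears the name/data directories)
    hdfs namenode -format
    rem Start the namenode, datanode, resource manager and node manager
    start-all.cmd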
This command is going to start four different processes, as I said, and you will see four different command prompt windows. As you can see here, the name node is up and running, so we don't have any issues there. Similarly, the data node is up and running, which means our configuration is fine and it is ready to accept blocks. Here is our resource manager, which is working fine, and finally our node manager. Now let's go back to a browser and see the UI of the resource manager and the name node. The resource manager will always be there on its dedicated port, 8088, and as you can see here, this is the UI where all the MapReduce programs executed on the system will be visible. Let's go to our name node UI, which should be there on localhost port 50070.
As you can see, this gives an overall summary of our cluster: the date when it was started, the version of Hadoop, the unique cluster ID, the block pool ID of the block storage, and so on. And as you can see, the edit logs and the FS image will be stored in the namenode data directory which we had created earlier.

Now let's go back and push some sample files into HDFS. First let's create an input directory into which we will push a sample file. I already have a sample file here, input.txt, in which I have random unique numbers; I'm going to push this file. Now let's see if our file has actually been added into HDFS. We can also do a cat on this file and see what data lies in there. So we have successfully copied our file into HDFS. Let's also try a couple more commands: how to leave safe mode, and how to remove the directory that I created. Here we have switched off the safe mode; now let's remove the file we put into the input directory, and then delete that directory from HDFS. As you can see, we have deleted the file from HDFS.
So that's it for today, folks. I hope you enjoyed this video and this simplified process of installing Hadoop on Windows. If you want to follow along with more videos like this, please be sure to subscribe to my channel. Thank you.

34 comments

  1. Hi sir, when I tried hdfs namenode -format i am getting below error :

    ERROR common.Util: Syntax error in URI C:\hadoop-3.1.3\data\namenode. Please check hdfs configuration.

    please help

  2. Hello sir I got an error of :
    C:\Users\***********>hadoop version

    The system cannot find the path specified.

    Error: JAVA_HOME is incorrectly set.

    Please update C:\hadoop-3.1.0\etc\hadoop\hadoop-env.cmd

    '-Xmx512m' is not recognized as an internal or external command,

    operable program or batch file.

    please tell me how to remove this????????

  3. i have this kind of error while final in step, please help me i don't understand what i do.
    C:\Users\Dhanoj Kumar Paswan>start-all.cmd

    This script is Deprecated. Instead use start-dfs.cmd and start-yarn.cmd

    The system cannot find the path specified.

    Error: JAVA_HOME is incorrectly set.

    Please update C:\hadoop-2.10.0\etc\hadoop\hadoop-env.cmd

  4. Facing This Issue After Doing All Configuration As Per You Sir.

    C:\Users\Shubham>hdfs namenode -format

    Error: Could not find or load main class Vishwakarma

    Caused by: java.lang.ClassNotFoundException: Vishwakarma

  5. Nice tutorial for the Hadoop installation on Windows; it works very nicely, and you have demonstrated it very clearly in this video. Thank you so much for this video. Great work, it is very useful.

  6. Hi, I was getting all the namenode, nodemanger, yarn up and running in the cmd, and was able to access 8088 and 50070 but the commands was showing in the cmd, it shows unknown command

  7. well Explained and really helpful to the point tutorial. Expect more such tutorial from you.
    Please let me know how to connect with you.

  8. Thank you for such awsome tuto.
    I get error in the last step while running the hdfs-start:
    ssh: connect to host localhost port 22: Connection refused
    Have any one idea about it? Should i install some ssh or i just missed something ?

  9. After following up all your instructions, only resource manager is successfully running. Data node , name node,node manager throws an error saying "java.io.FileNotFoundException"

  10. when I am doing I am getting below mentioned errors.
    hdfs namenode-format

    Error: Could not find or load main class Patel

    Also could you please provide extra files?

  11. hdfs namenode -format
    : (While executing this command I'm getting an error like this!? What to do??
    Error: Could not find or load main class D

    Caused by: java.lang.ClassNotFoundException: D

  12. As soon as I enter start-all.cmd they all start but only name node runs properly. data node nodemanager and resource manager shut down. Can you tell me
    how to solve this issue?

  13. @BigData 101 can u please help me with this error "Error: Could not find or load main class" when i try to format namenode or even try to check for hadoop version.

  14. Guys, I have figure out the ClassNotFoundException
    Go to your etc\hadoop\hadoop-env file
    in the last line change
    HADOOP_IDENT_STRING=%USERNAME%
    to
    HADOOP_IDENT_STRING=myuser
    Note:(username must be without space)
    it will work fine. 🙂 Enjoy

  15. I have an issue, can you please share what files did you copy from configuration folder of yours?
    Also, I get an error 'This site can't be reached, localhost refused to connect.' when I try to connect to localhost.
    Any tips or help?
    Thank you 🙂
    btw: AWESOME VIDEO 🙂

  16. I'm getting this error…

    Error: JAVA_HOME is not set.

    '-Dhadoop.security.logger' is not recognized as an internal or external command,

    operable program or batch file.

    My Java home path is => export JAVA_HOME="C:\Program Files\Java\jdk1.8.0_192"

    Can you please help to resolve that problem?

  17. Hello, BigData 101, I'm getting this error: Syntax error in URI C:\hadoop-3.1.2\hadoop-3.1.2\data\namenode, illegal character in opaque part at index 2. While installing hadoop, two directories got created by default and hence I gave that path in the xml file. There is no such error in the data node logs, only in the name node. But the name node is running while giving jps. Could you let me know ways to solve this, or would it work properly with this error?

  18. Excellent tutorial.

    It has helped me to successfully installed hadoop on my system.

    Thank you BigData 101
