Newby: How to actually update software?!

# 1  

Hi All -

1) I work with big data for a living and use lots of neat software: SAS, SQL Server, etc. I know how to get my data, analyze it, etc...
2) I use UNIX at work (Solaris mostly) and can easily navigate around Unix and get the job done (vi and sas -nodms are about my fav, plus some Python and shell here and there).
3) What I am not is a Unix System Admin. Sure I get the OS, and why it is so great, i.e. / is the bees knees, /home/me, etc....
4) So I have now joined the club! I have three Linux machines at home, all connected and talking, and everything is great! I even installed Hadoop (well, specifically: hadoop-1.2.1 at /usr/local/hadoop) by getting it all untarred, set up, and nice and pretty by doing this:
sudo mv hadoop-1.2.0/ hadoop

5) So life is great! My main node and slaves are all happy! Woohoo!
6) But now change comes: Apache released the stable release hadoop-2.2.0 three days ago!?
8) I can't just download the new version and merely do this while in /usr/local:
sudo mv hadoop-2.2.0/ hadoop

I am sure the sky would fall, baby dolphins would die and other ill effects of such awful OS understanding.

9) So I come to you all for a very basic, best practice way of doing upgrades?
10) I think I "just" follow the upgrade guide at Apache re updating Hadoop, but you know, there really isn't something there that says, "Well buddy, just set up /usr/local/hadoop-2.2.0, then move all the config files and whatnot from the old install (/usr/local/hadoop) to the new one (/usr/local/hadoop-2.2.0), and when you are comfortable that it's all running swell, after following our guide, well, just DELETE the whole folder: /usr/local/hadoop and then do this:
sudo mv hadoop-2.2.0/ hadoop

and life will be lived happily ever after......"

You folks get what I mean?

I look at the update/install guide and think that a fundamental OS/software update process is not connecting with me. I know this isn't Windows and there is no "Update Hadoop 1.2.0 to Hadoop 2.2.0" button. But I am a little gray on the software lifecycle on Unix. I know that I have to install a full new binary at /usr/local/hadoop-2.2.0, and I know how to get the clusters all set up again; I just don't know how to clean up the mess when I am done, or what the best practice for updating software is.

Any intel?

I have these Hadoop clusters at home to LEARN on, if I do something wrong, oh well, just fix it all and start over. I just got sooo far and now I am feeling a little lost on how to manage my Linux clusters with updates correctly. I just want to learn, this isn't for work or anything.

Thanks in advance for your help!
/*have a great weekend!!*/
# 2  
Oh well...

What you have so verbosely told us about is the reason why "by untarring it" is considered the least preferable way to install software among SysAdmins.

Most software comes in "packages". Packages are bundles of files, but not only bundles of files. There are several packaging systems (you mentioned you use Linux: Linux has two competing systems and almost every Linux distribution uses one of these two; other Unix systems have yet other package systems) and all offer slightly different opportunities, so i will concentrate on the general aspects. You might want to find out how that translates to your specific system later, once you have a grasp of the basics.

Installing software is not merely done by putting some files somewhere. The process of installing software usually consists of several steps:

- putting some files somewhere
- doing some changes in the system (creating users, setting up startup procedures for the software, ...)
- doing some basic configuration of the software

Then there are some additional aspects:

Software often depends on other software to work, so the installation has to check that this other software is there.

Software changes. Maybe an older version is installed and has to be updated while preserving its configuration as well as possible. Also, these "versions" should be tracked somehow.

It should be possible to remove software. Ideally installing a package and then uninstalling it should leave your system in exactly the same state as before.

All these things are taken care of by software packages. In most cases there is a special piece of software, called a "package manager", which allows you to install, remove and update/change software. It takes the information it works with from the packages it deals with. This package manager is called "apt" in Debian and Debian-like Linuxes (Ubuntu and others), in RedHat-like systems (RedHat, Fedora, CentOS and others) it is "rpm", in IBM's AIX it is "installp", and so on.
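To see which of the tools named above a given machine actually has, something like the following can be used. It is only a rough sketch: the function name detect_pkg_manager is made up for this example, and it merely checks whether the command exists, nothing more. The apt-get commands in the comments are the usual ones on Debian-like systems.

```shell
#!/bin/sh
# Print the first of the package tools mentioned above that is found on this
# machine, or "none" if none of them is installed.
# (detect_pkg_manager is a hypothetical helper name, not a standard command.)
detect_pkg_manager() {
    for tool in apt-get rpm installp; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "none"
}

echo "package tool found: $(detect_pkg_manager)"

# Typical day-to-day commands on a Debian-like system (as root or via sudo):
#   apt-get update             refresh the package lists from the repositories
#   apt-get install PACKAGE    install PACKAGE and whatever it depends on
#   apt-get upgrade            update installed packages to newer versions
#   apt-get remove PACKAGE     deinstall PACKAGE again
```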

The packages themselves carry - along with the files they contain - some meta-information (version numbers, language codes for packages in various languages, dependencies of all sorts, ...) which is read and used by the package manager, and scripts to be run at various events. Events could be: before installation, after installation, before deinstallation, after deinstallation and so on.

Suppose we want to install "ssh", secure-shell, which you probably use to connect to your Linux boxes.

First, we need some version information. Let us say our ssh binaries are version 2.0. Now we know that an installed version below 2.0 - 1.9, 1.5, 0.9, ... - is older than this package (-> do an update) and an installed version above - i.e. 2.1 - is newer (-> do not install, or do so only if some "force"-option is used). So this places the software in some version context.
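The version decision described above can be sketched in a few lines of shell. This is a toy illustration, not how any particular package manager does it: the names version_lt and decide are made up, and it assumes a "sort" that understands -V (version sort), as GNU coreutils does. On Debian the real tool for this is "dpkg --compare-versions".

```shell
#!/bin/sh
# version_lt A B: succeeds when version A sorts strictly before version B.
# Assumes "sort -V" (version sort) is available, e.g. from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# decide INSTALLED CANDIDATE: say what an installer should do when INSTALLED
# is already on the system and CANDIDATE is the version inside the package.
decide() {
    if [ "$1" = "$2" ]; then
        echo "same version: nothing to do"
    elif version_lt "$1" "$2"; then
        echo "update"
    else
        echo "downgrade: refuse unless forced"
    fi
}

decide 1.9 2.0     # installed version is older than the package
decide 2.1 2.0     # installed version is newer than the package
decide 1.9 1.10    # 1.10 is newer than 1.9, although "1.10" < "1.9" as text
```

The last call shows why versions cannot simply be compared as strings or as decimal numbers.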

Second, we need some files. This is the "untar it somewhere"-part of the installation and the smallest concern in package managers.

Third, the daemon part of "ssh", the "sshd" has to be started during system startup, so some files ("/etc/inittab", "/etc/rc.*/...", ...) have to be altered after installation. For this there has to be a "post-install" script, but also a "prior-deinstall"-script to undo these changes. For some packages certain administrative users have to be created (and deleted at deinstallation). Maybe filesystems have to be checked for enough free space (prior installation). You can imagine what the various pre- and post- installation, - deinstallation, -other-event-scripts are for.
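To give an idea of what such scripts look like, here is a dry-run sketch modelled loosely on Debian's maintainer scripts, which are called with an argument such as "configure", "remove" or "purge". The package name "exampled" is invented, and the helper run only prints each command instead of executing it, so nothing on the system is changed. Real scripts are generated or written by the packager and handle more cases.

```shell
#!/bin/sh
# Dry-run helper: only print what would be executed.
run() { echo "would run: $*"; }

# After the files are unpacked: create the service user, register the
# init script for system startup, start the daemon.
postinst() {
    case "$1" in
        configure)
            run adduser --system --group exampled
            run update-rc.d exampled defaults
            run invoke-rc.d exampled start
            ;;
    esac
}

# Before the files are removed: stop the daemon.
prerm() {
    case "$1" in
        remove)
            run invoke-rc.d exampled stop
            ;;
    esac
}

# After removal, when the package is purged: undo what postinst set up.
postrm() {
    case "$1" in
        purge)
            run update-rc.d exampled remove
            run deluser --system exampled
            ;;
    esac
}

postinst configure
prerm remove
postrm purge
```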

Finally, "ssh" won't work on its own. It needs some SSL-library to be installed. If there is only one such library, this is the simplest form of dependency. Matters can be more complicated if one of several possible libraries has to be installed (either "OpenSSL" or "whateverSSL" but at least one of these two), if one of several packages has to be installed alongside (so-called corequisites), if a certain version (or version range) is needed, and so on. All these requirements are written into the meta-information of the package and dealt with by the package manager.
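The "at least one of these" kind of dependency can be imitated with a small shell function. This is only a toy: require_any is a made-up name, it looks for commands on the PATH rather than for installed packages, and "whateverssl" is as fictional here as it is above. A real package manager reads such alternatives from the package meta-information instead (Debian, for instance, separates alternatives with "|" in the Depends field).

```shell
#!/bin/sh
# require_any CMD...: succeed if at least one of the given commands exists,
# otherwise print an error and fail. Crude stand-in for an "A or B" dependency.
require_any() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "dependency satisfied by: $candidate"
            return 0
        fi
    done
    echo "missing dependency, need one of: $*" >&2
    return 1
}

# Either "openssl" or the fictional "whateverssl" would do.
if require_any openssl whateverssl; then
    echo "ok to install"
else
    echo "refusing to install"
fi
```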

The last point is so-called "repositories": you don't want to hunt for packages on the net, put them in some nondescript place and install them. Chances are, if you have many servers to administer, you will forget where you placed these packages unless you organize and document their whereabouts cleanly. This is why all these package managers can deal with repositories: places where lots of packages are kept and can be installed from. Prerequisites can be satisfied from there automatically, different platforms (some packages are available in special builds for different systems) can automatically be served the package suitable for them, and so on.

I hope i have convinced you that you should never NEVER EVER "untar something somewhere", but always use a package to install and deinstall software. At work i even go so far as to - when installing non-packaged software - first create a package from the bunch of files and only then install that, even on a test system, let alone on my production systems.

I hope this helps.

# 3  
Bakunin - Yes, this is very helpful! I've 3 Debian-wheezy-based Linux machines, would something like this work?

Debian Synaptic

Thanks for your help.

I'm a BigData guy, not really an OS admin, and really-really appreciate your aid.

Just swung by the library too, but they didn't have any Debian books, just a generic Linux+ cert book. Will start reading up, though the index says nothing about package management.
# 4  
Have a google on apt-get, the package management software for Debian-based systems.

If you need to create a package that doesn't exist in the repos (they are configured in /etc/apt/sources.list; search them with something like: apt-cache search PACKAGENAME), have a look at:

Hope this helps

NOTE: The syntax *should* apply, but I am used to RedHat-based systems.
# 5  
Originally Posted by sas
Bakunin - Yes, this is very helpful! I've 3 Debian-wheezy-based Linux machines, would something like this work?

Debian Synaptic
Yes. Synaptic is a graphical frontend for the package manager (apt). Instead of typing "apt-cache search <packagename>" or something such, you can list all the available packages, filter for a certain one (or some), download and install them, satisfying any dependencies automatically in the process.

Deinstall what you "untarred somewhere" and search for apt-packages for Hadoop.

I hope this helps.

# 6  
Thanks a bunch!

I love the Synaptic software (or shd I say "package"), thanks for leading me down the path to a package manager, Bakunin!

What if a package isn't available using apt-get? Then what? Untar and pray tar doesn't get all over the OS (ok bad joke...)...?
# 7  
It may be available through an alternate server.

Worst case, if you do have to build it yourself, keep a list of everything it installs should you need to rm it at a later date.
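For a plain tarball, keeping such a list could look roughly like this. It is a sketch under assumptions: the function names record_manifest and remove_from_manifest are made up, the archive is assumed to be a gzip-compressed tar, and only files that came out of the tarball are covered, so anything the software creates later (logs, data, edited configs) is not on the list.

```shell
#!/bin/sh
# record_manifest ARCHIVE MANIFEST: save the list of paths inside a tarball.
record_manifest() {
    tar -tzf "$1" > "$2"
}

# remove_from_manifest PREFIX MANIFEST: delete the listed files below PREFIX,
# then remove the listed directories that are left empty, deepest first.
remove_from_manifest() {
    prefix=$1
    manifest=$2
    while IFS= read -r path; do
        if [ -n "$path" ] && { [ -f "$prefix/$path" ] || [ -L "$prefix/$path" ]; }; then
            rm -f -- "$prefix/$path"
        fi
    done < "$manifest"
    # Reverse byte order puts "dir/sub/" before "dir/", so children go first.
    LC_ALL=C sort -r "$manifest" | while IFS= read -r path; do
        if [ -n "$path" ] && [ -d "$prefix/$path" ]; then
            rmdir -- "$prefix/$path" 2>/dev/null || true
        fi
    done
    return 0
}

# Usage (paths are examples only):
#   record_manifest some-software.tar.gz /root/some-software.manifest
#   tar -xzf some-software.tar.gz -C /usr/local
#   ...later, to get rid of it again:
#   remove_from_manifest /usr/local /root/some-software.manifest
```

Directories that still contain other files are left alone, because rmdir only removes empty ones.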