[Opinion] A Public Answer To Rob McNelly


 
# 1  
Old 06-22-2016

Why Do We Need Root on the HMC?

In this article in IBMSystems Magazine Rob McNelly asked the question

Why Don't We Have Root on the HMC?

and he goes on to justify why we indeed shouldn't have root - kinda. I think his arguments are not as valid as he perhaps thinks they are and what's more i think he deserves an answer as public as his statement. I will paraphrase some of his statements as i understand them, but you should read his linked article yourself to finally judge if i have misrepresented or misunderstood him.

First, Mister McNelly says it is "in the nature" of Sysadmins to believe they need root everywhere. This might be the case for some immature hacker kids. Fact is, i - and certainly every other responsible sysadmin i know - only switch to root if i really need to do it, not because it is my "habit" to do so. It is just the nature of my work which calls for the power of the superuser: otherwise i wouldn't know how to increase filesystem sizes, unlock user accounts or start up/shut down systems - these are the most common requests i face every day. But my "normal" work, which doesn't require these extraordinary powers - writing scripts, working out procedures, ... - i do with my ordinary user account. The only group i carry is "staff" and the only thing different from any other user account is the size of my HOME directory (~200MB) because i generate reports and lists using UNIX text filters rather than these abominable "office" suites. (As a rule of thumb: data that really matters is not stored within an Excel sheet.)

The second reason Mister McNelly cites is that an (arbitrarily) administrated system (as opposed to an appliance) is a support nightmare. Now i can appreciate this argument! But guess what: any system with a variable configuration is more difficult to support than a system with a fixed config. Maybe IBM should lock out all users from all their AIX systems as this would make supporting the OS much easier, no?

And why does the HMC have to be a separate system anyways? Let's face it: it is basically an (acceptably but not outstandingly well designed) web application and a supplemental set of commands to do on command line what can be done within the web application. Can't that be an application which can be installed? What needs a separate system here?

For instance, i have installed the "EMC solutions enabler" on an AIX LPAR to administrate my array of VMax storage systems. It is a set of executables i just use within scripts of my own and it writes plain log files i can read. I'll give you that, to use non-standard SCSI commands to communicate with the VMax which requires "gatekeeper devices" to be created is probably a pretty bad idea - there was a thing invented for that kind of service, i believe it was called "networks". But save for that the management software for the system is a normal application. Why can't that be done for the HMC software?

Yes, i can understand Mister McNelly's point that installing "everything and the kitchen sink" on the HMC can create problems - just like cramming several applications onto any other single system will likely cause problems and is a very bad design decision. But i wouldn't do that, just like i wouldn't design any other system that poorly. Still i could make my work easier by storing some really necessary files on the HMC without being forbidden to organize my HOME by that ridiculous restricted shell. I mean: does it really make support easier when i am forced to have 50 files in my home instead of having them organized in neat subdirectories (which i can't create)? Who is helped by the fact that i cannot pipe the output of, say, lssyscfg, into a grep? I might even want to use the same shell i use throughout my whole AIX installation - Korn Shell - instead of being forced to use bash solely on the HMC.

So, do i want root on the HMC, as McNelly finally asks? No, most of the time a decent user account with a normal, not-restricted shell would suffice. But to manage this account - in the same responsible way i manage the rest of my 350 LPARs - i'd like to become root now and then to do whatever administrators do. Of course i know how to jailbreak the HMC (like perhaps every halfway capable admin does), but why do i need to "break into" a system i have set up, a system i run and for which i (well, actually my company) have paid good money?

If IBM would put the effort they put into making it harder to become root into further development of the HMC software itself - wouldn't it help people (outside their support staff)? It reminds me somewhat of the situation with iPhones, Android phones, CyanogenMod and that awful decision to make the replacement of batteries impossible. I understand that it helps protect the cashflow because this way it is easier to gain money from customers without doing more.

But on one hand: i may have to bear it, but i do not have to like it. And on the other hand: we are not talking about some mobile phone for 69.99. We are talking about the two HMCs i use to manage one and a half dozen p780s and p880s, about 2 million dollars apiece. Do you think it is necessary to squeeze out some minimal additional benefit by pestering me with a restricted shell for my daily work? And if you really think i couldn't handle the responsibility for such a vital system: don't you think i should be removed from the position where i manage the LPARs running the corporate SAP systems too?

Just my 2 cents for the whole HMC discussion.

bakunin

Last edited by Scrutinizer; 06-22-2016 at 11:26 PM.. Reason: Corrected url
# 2  
Old 06-22-2016
I'll chip in... discussion is very welcome.

A large number of Power sysadmin are simply not able, or capable, of doing their jobs.

When they get their hands on an HMC (let alone large corporate bank staff hacking up the ODM and asking IBM how to fix their mess - on two year old code - without a reboot - in an AIX LPAR) they "think" (relative term) they see linux and just simply go MAD.

How can IBM support any of that?

The HMC is an important box so needs to be treated with respect, as you said it's vital to support an expensive estate.

In the early days, to some extent it still is, HMC users / admin "think" they can do all sorts of "things" on "their" "linux box"...

How could that be supported without making the HMC a black box and simply not letting it happen?

Do you use vio commands on a vio server, or just oem_setup_env to save time - it'll catch you out sooner or later...

Do you hack the ODM if a command doesn't work the way you want - it'll catch you out sooner or later...

Do it by the book or suffer the consequences.

Just my experience, hope it helps somebody or some poor system with an irresponsible admin ;0)

If you have a beef about HMC, AIX, admin rights, Etc. raise a PMR and if IBM say "not supported" they'll give you the process to raise a DCR. If the DCR (design change request) is rejected at least they'll let you know why they think your idea is not possible or plausible.
These 3 Users Gave Thanks to dukessd For This Post:
# 3  
Old 06-23-2016
Quote:
Originally Posted by dukessd
A large number of Power sysadmin are simply not able, or capable, of doing their jobs.
Amen. You are right, but i think you are missing the point: first, a determined "non-expert" (to avoid words more to the point) will be able to mess up anything. As i said, when locking him out of the HMC helps, why not lock him out of any other LPAR too?

Second: this is digging into a much larger area so i'll try to keep it short. The reason that so few capable admins for AIX are there is because IBM did (and, IMHO, still does) a very bad job at educating them. If i am a Linux admin and want to hone my skills i get myself a PC for $300 and start hacking. I will perhaps make it go FUBAR a few times but all this will teach me valuable lessons and i will be all the more capable once i work on really productive systems professionally. If i am an AIX admin i do - what? Buy myself a system for ~ $20k only to find out i can't even create an LPAR because i need to shell out another $50k in various licenses for one thing or the other? This might be OK for a bank, but is beyond my financial reach.

In addition the IBM documentation once used to be exhaustive. It isn't any more. In fact it is quite incomplete, bookmarks to the documentation tend to be invalidated within hours so that you start over searching for the same pages (which sometimes are not to be found again, however), and even if you find what you search for, the information is often incomplete and leaves many questions open.

Of course there are courses: EUR 5k for 4 days of class and what they tell you is basically: "use SMIT and you are on the safe side". I don't care how to do something, i want to understand what i do. I found out over and over again that the people holding the course knew even less than me.

Quote:
Originally Posted by dukessd
How could that be supported without making the HMC a black box and simply not letting it happen?
As i said: by making what is the HMC today into an application as did EMC, as did IBM with their PSSP, as do many other developers of all sorts of management software. There is no reason that it has to be so complicated that it needs the "black box" to run smoothly.

Quote:
Originally Posted by dukessd
Do you use vio commands on a vio server, or just oem_setup_env to save time - it'll catch you out sooner or later...

Do you hack the ODM if a command doesn't work the way you want - it'll catch you out sooner or later...

Do it by the book or suffer the consequences.
Well, i do all that on occasion, especially when the "by-the-book" methods didn't work out. And i was only so much amused when i was finally allowed to throw everything i learned on AIX out of the window and had to learn a second, completely different set of commands to do on a VIOS the same things i do on an LPAR.

For the book by which i should do it: if it doesn't tell me what i need to know to do it right it is simply incomplete and/or badly written. Don't hold me accountable for IBM delivering bad/wrong/incomplete/misleading documentation.

Quote:
Originally Posted by dukessd
If you have a beef about HMC, AIX, admin rights, Etc. raise a PMR and if IBM say "not supported" they'll give you the process to raise a DCR. If the DCR (design change request) is rejected at least they'll let you know why they think your idea is not possible or plausible.
Having used AIX since version 3.2 i know this process. It is just my opinion that IBM took some wrong (design) decisions and even though i cannot help it i do not have to appreciate it either. And i do not have to take Rob McNelly's apologetic stance towards this without objection.

Finally, on a more philosophical point about systems design in general: if you design a system to cater to the dumbest possible administrator you are likely to get the dumbest possible system which even the smartest possible admin can't make any more intelligent.

bakunin
# 4  
Old 06-23-2016
Before I chuck my couple of cents worth into the bucket here, a quick précis on me and what I’m doing at the moment.

I’m nearing retirement, I’ve worked on a huge range of equipment – for a long list of names, pretty much all gone now. Probably worked on more than 20 flavours of *NIX for companies like Data General, Sun, Olivetti, Norsk Data, Wordplex, Motorola, Intergraph and a number of others.

For the last 15 years I have been a “Data Centre Migration Specialist”, whatever one of those is. At the moment I am subcontracted to a client by IBM. At this point I should say that I am not permanently employed by IBM, but this is the fourth time that I’ve been contracted out by IBM. The current job is to move the data centre of a major player in the UK utility market into a new headquarters building, a project expected to last at least another 18 months.

The IBM estate is pretty mixed and aged, I have a number of P770’s, P740’s, P570’s and RS6000’s running a number of levels of AIX from 4.3 to 7.1 – with 7.2 about to go on the floor in the form of a number of S824’s – there are a total of four HMC’s. I have also got quite a number of Linux (200) and Sun (350) servers to move, the end client has hardware support from Oracle, IBM and HP-CDS and OS support from IBM and Oracle.

So now my 2¢ worth:

I can agree with most of what has been said above, I can understand IBM wanting to lock the HMC appliance down as much as possible and I understand the sysadmin desire to have full control of any machine on the network as Bakunin says – if there’s not a competency issue. In truth, my main reason for coming down on the restricted side of this argument is exactly that – competency! I have a number of systems that have been up and running for longer than many of my support contacts have been systems admins, I don’t actually have privileged access to many of the systems – I have elevated access or “root” access on none of the systems. Should I need root access, it has to be requested, approved and I am issued with a one-time password.

I find it to be a total pain, but that is the implemented system. On investigation, the reason for the system being implemented was, you guessed it, competency! Cited examples? Well, I could give you any number. But one that I think sums it up quite well was easy to recover from, yet could have been catastrophic had it been a customer-facing system with, say, five or six thousand users instead of a development system with just a couple of hundred developers. The “root” user executed a recursive delete command with a space in it from the root directory and effectively deleted the full contents of the server – mostly source code and development tools.
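
For anyone who has not been bitten by this yet, the mechanics can be sketched safely, without deleting anything. This is only an illustration: the path /srv/dev/old and the helper count_args are made up, and the exact command from the incident is not known. It shows how a single stray space turns one path into two separate arguments, the first of which is the root directory.

```shell
# Hypothetical helper: reports how many arguments the shell passed in.
count_args() {
    echo "$#"
}

# Intended target: one argument, a single subdirectory.
count_args /srv/dev/old      # prints 1

# Stray space after the slash: two arguments, "/" and "srv/dev/old".
# A recursive delete given these would start at the root directory.
count_args / srv/dev/old     # prints 2
```

Putting echo in front of a destructive command first, to see the arguments exactly as the shell will pass them, is a cheap habit that catches this kind of slip.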

I have worked in the *NIX world since 1981, over that time I have watched the skill level of the sysadmin degrade, a lot of it revolves around training – my first “Sysadmin I” course was five weeks long and I never actually saw a machine. It was all spent sitting at a Wyse 30 terminal, with a number of other trainees. Now I see sysadmins working for major vendors, with no training whatsoever.

I am in many respects happy that these administrative and management appliances have been made idiot proof as much as possible, but also very wary – just when you find that you have secured the systems against Idiot V1.0, you’ll find that the management will upgrade to Idiot V2.0.

IMHO only training and experience makes for a competent sysadmin, but unfortunately these things come with a high price tag. Inexperienced resource is easy to find and cheap to run, moving the support off shore can exacerbate the problem – through language, not competency although my personal experience has been that you have the same ratio of competent/incompetent people evenly distributed around the world.

I have tried to keep myself current with as much as I can, even attending further training – here I definitely agree with Bakunin. When I’m doing stuff “I want to know what I’m actually doing”. After many years of AIX – and using “smit” on both AIX and Solaris (for information, smit was ported to Solaris by a major financial company in the UK) – I knew about pressing F6 to see what was going to be run by the system. The standard of knowledge of the instructor made it obvious that he had almost no experience, as he couldn’t answer some of the simplest questions and answered others incorrectly – at which point I actually asked to see the manager of the training facility to request reimbursement.

So when I see the standard of people moving into the sysadmin world, I can understand the move towards making things safe through idiot proofing. My approach would be to weed out the idiots and provide better training, but unfortunately that costs more.

Gull04
# 5  
Old 06-23-2016
Quote:
I have worked in the *NIX world since 1981, over that time I have watched the skill level of the sysadmin degrade, a lot of it revolves around training – my first “Sysadmin I” course was five weeks long and I never actually saw a machine. It was all spent sitting at a Wyse 30 terminal, with a number of other trainees. Now I see sysadmins working for major vendors, with no training whatsoever.
Oh, you missed 11 exciting years :)
There is a strong belief that a new style of IT (cloud, virtualization, orchestration, automation, auto-scaling, self-healing, ...) will make traditional system administration obsolete. Instead management-by-click will emerge.
Just order your desired IT-functions on your smartphone, and voila - your new company can go!
# 6  
Old 06-23-2016
Sounds like we all agree then.
"management-by-click" - of course that'll all just work as expected...
Where is the sad face emoticon?
# 7  
Old 06-24-2016
Hi Guys,

As to "management-by-click", well maybe. But just in case, my Pig is outside, saddled and ready to fly.

Gull04